How Does Local Storage Offer High Availability?
-
@dafyre said:
@scottalanmiller said:
I'm always talking about the resulting system reliability.
Right.
So you have one server that is highly reliable. It has no unplanned down time until one day, the RAID controller in the system shorts out and takes out three hard drives. You spend 2 hours down, waiting on parts, and an additional 4 hours restoring from backups.
Now I have two servers that are plain reliable and replicated with failover, etc, etc. Neither of them has any unplanned down time until one day, the RAID controller burns out on one of the nodes. I spend 2 hours waiting on parts, and 1 hour re-installing my OS, and getting the system set back up for replication. I suffer from zero down time.
Which system is more reliable? Yours, of course. A single system would win at reliability.
Which one looks more reliable (it's all about the appearance). Mine would, because as far as my end-users are concerned the system (as a whole, all moving parts involved) did not go down.
I see. I guess it doesn't look reliable to people who know. If you say to an average person on the street "I have two servers and he has a mainframe, which is more reliable?" I bet they'd say the mainframe, because that's how people think. It's rare that people are used to the idea that "two cheap things" can be better than "one good thing."
Like "I'll give you two cheap Bics in exchange for your $100 pen"; most people would be like "you're nuts, this will last forever."
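To put numbers on the two scenarios above, here is a rough sketch of the downtime arithmetic. It assumes a one-year window with exactly one incident per setup, which is an assumption for illustration and not something stated in the thread; the function and variable names are made up for the example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years (assumed measurement window)


def availability(downtime_hours: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Fraction of the period during which users could reach the service."""
    return 1.0 - downtime_hours / period_hours


# Single server: 2 hours waiting on parts + 4 hours restoring from backups.
single_server_downtime = 2 + 4

# Replicated pair: the 2 + 1 hours of repair work happen while the surviving
# node keeps serving, so the users see no outage at all.
replicated_pair_downtime = 0

print(f"single server:   {availability(single_server_downtime):.5%}")
print(f"replicated pair: {availability(replicated_pair_downtime):.5%}")
```

This only measures what the end users experienced in that one year. It says nothing about how likely each failure was in the first place, which is the other half of the argument.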
-
@scottalanmiller said:
Using the dictionary definition of engineering redundancy, how would we define something like tightly coupled controllers?
Can they fail over and keep working? Yes, they can.
Are they likely to do so? No.
Is the additional risk introduced by having two objects to possibly fail and to likely cause their peer to fail offset by the possibility of failover sometimes? No.
So this is how I see it:
English Redundancy: Yes, there are two items.
Engineering Redundancy: Yes, they can failover.
Increased Reliability: No, the resulting system has become more fragile.
^ Now I understand your thought process in this.
Even engineering redundancy can lead to fragility if we don't look at the system holistically.
If the building of redundancy leads to fragility, then something is wrong, IMHO.
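One way to see how two tightly coupled controllers can end up more fragile than one is a toy probability model. This is only a sketch under stated assumptions: each controller fails on its own with probability `p` over some period, and a failing controller takes its peer down with probability `c`. Both numbers and the function name are hypothetical, not measurements from any real product.

```python
def coupled_pair_outage_chance(p: float, c: float) -> float:
    """Chance that a pair of coupled controllers leaves the system down.

    p: chance that one controller fails on its own during the period (assumed).
    c: chance that a failing controller also kills its peer (assumed).
    """
    both_fail_alone = p * p
    exactly_one_fails = 2 * p * (1 - p)
    # The pair goes down if both die independently, or if one dies and
    # drags its peer down instead of failing over cleanly.
    return both_fail_alone + exactly_one_fails * c


single_controller = 0.05  # hypothetical failure chance for one controller
print("single controller:      ", single_controller)
print("pair, loosely coupled:  ", coupled_pair_outage_chance(0.05, 0.1))
print("pair, tightly coupled:  ", coupled_pair_outage_chance(0.05, 0.8))
```

In this model the pair is worse than a single controller whenever `c` is above one half, because there are now two things that can start the failure and most of the time the peer does not survive it.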
-
Semantics?
-
@dafyre said:
If the building of redundancy leads to fragility, then something is wrong, IMHO.
That's where it is debatable. Because the goal of an engineer would be reliability. But the goal of a salesman is sales. If the customer demands redundancy and not reliability, then the cheapest path to redundancy is the right one. But put on the business empathy cap and it gets murky. Giving the customer what they want is never wrong, right?
-
@scottalanmiller said:
@dafyre said:
@scottalanmiller said:
I'm always talking about the resulting system reliability.
Right.
So you have one server that is highly reliable. It has no unplanned down time until one day, the RAID controller in the system shorts out and takes out three hard drives. You spend 2 hours down, waiting on parts, and an additional 4 hours restoring from backups.
Now I have two servers that are plain reliable and replicated with failover, etc, etc. Neither of them has any unplanned down time until one day, the RAID controller burns out on one of the nodes. I spend 2 hours waiting on parts, and 1 hour re-installing my OS, and getting the system set back up for replication. I suffer from zero down time.
Which system is more reliable? Yours, of course. A single system would win at reliability.
Which one looks more reliable (it's all about the appearance). Mine would, because as far as my end-users are concerned the system (as a whole, all moving parts involved) did not go down.
I see. I guess it doesn't look reliable to people who know. If you say to an average person on the street "I have two servers and he has a mainframe, which is more reliable?" I bet they'd say the mainframe, because that's how people think. It's rare that people are used to the idea that "two cheap things" can be better than "one good thing."
Like "I'll give you two cheap Bics in exchange for your $100 pen"; most people would be like "you're nuts, this will last forever."
I'd take a few packs of cheap Bics. You can have my $100 pen. It might burst and start leaking tomorrow. The chances of all 20 or 30 Bic pens leaking and bursting tomorrow are slim.
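The pen argument is just multiplication, assuming the pens fail independently of each other. A quick sketch, with failure chances that are invented purely for illustration:

```python
def chance_all_fail(per_item_failure_chance: float, item_count: int) -> float:
    """Chance that every item fails, assuming independent failures."""
    return per_item_failure_chance ** item_count


# Hypothetical numbers: the $100 pen is far less likely to leak than any one Bic.
expensive_pen = chance_all_fail(0.001, 1)
bic_pack = chance_all_fail(0.05, 20)

print(f"$100 pen leaks tomorrow:    {expensive_pen:.3g}")
print(f"all 20 Bics leak tomorrow:  {bic_pack:.3g}")
```

The independence assumption is doing all the work here. If the pens share a cause of failure (same hot car, same bag), the real number is nowhere near this small.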
-
@wirestyle22 said:
Semantics?
Semantics are one of the most important things in IT. This isn't a theoretical experiment in language, this is a real problem that plagues SMB IT every day. Go on Spiceworks and the average conversation around storage is someone being hoodwinked by this very bit of semantics. They request the wrong thing, they get what they ask for and they end up paying a lot and getting something negative.
-
@Dashrender said:
What good is redundancy if it's not reliable?
That's what this whole thread is trying to convey. That IT Pros should never be asking for redundancy as a goal. It is always resulting reliability. Always, no exceptions.
The issue is not that people are building reliability where it isn't useful, it is that people are demanding redundancy without reason. Given that redundancy is only a means to an end (or a proximate goal rather than a real goal) no one should request it, they need reliability. If redundancy provides that reliability, no problem. If magic fairy dust does, that's fine too.
-
@scottalanmiller said:
@dafyre said:
If the building of redundancy leads to fragility, then something is wrong, IMHO.
That's where it is debatable. Because the goal of an engineer would be reliability. But the goal of a salesman is sales. If the customer demands redundancy and not reliability, then the cheapest path to redundancy is the right one. But put on the business empathy cap and it gets murky. Giving the customer what they want is never wrong, right?
When the customer knows what they are getting themselves into, then yes, by all means give them what they want. Up until my experience with an almost fully virtualized infrastructure, I would rather have reliable servers.
However, after my experience with virtualized infrastructure, my mindset changed.
-
@dafyre said:
When the customer knows what they are getting themselves into, then yes, by all means give them what they want.
Why do we care if they know what they want, and why is it up to the vendor to judge their desires?
-
@scottalanmiller said:
@wirestyle22 said:
Semantics?
Semantics are one of the most important things in IT. This isn't a theoretical experiment in language, this is a real problem that plagues SMB IT every day. Go on Spiceworks and the average conversation around storage is someone being hoodwinked by this very bit of semantics. They request the wrong thing, they get what they ask for and they end up paying a lot and getting something negative.
I understand everything that you guys have said here but you both agree. That is the confusing part of it for me.
Redundancy doesn't mean reliability.
Reliability doesn't mean Redundancy.
I would rather have my users complain up and down, calling me the worst SysAdmin ever yet have a better system overall. I think complication with no real reward is a huge problem in IT from what I have read and experienced.
Take my opinion with a grain of salt though. I have never made incredible claims about my knowledge. I can only speak of my experiences.
-
@dafyre said:
Up until my experience with an almost fully virtualized infrastructure, I would rather have reliable servers.
However, after my experience with virtualized infrastructure, my mindset changed.
It should not change. Resultant reliability is the only value.
-
@scottalanmiller said:
If redundancy provides that reliability, no problem. If magic fairy dust does, that's fine too.
Where can I find 3 boxes of Magic fairy dust? My supplies are starting to run low, lol.
That's kinda been my whole point though. If redundancy doesn't provide a better perception of reliability, then why bother with it?
If I knew that redundancy wasn't going to help improve the perception of reliability, I'd much rather work on a single server that I knew was going to fail and restore it from backup when the failure happens.
I've been on both sides of that road.
-
@dafyre said:
@scottalanmiller said:
If redundancy provides that reliability, no problem. If magic fairy dust does, that's fine too.
Where can I find 3 boxes of Magic fairy dust? My supplies are starting to run low, lol.
That's kinda been my whole point though. If redundancy doesn't provide a better perception of reliability, then why bother with it?
If I knew that redundancy wasn't going to help improve the perception of reliability, I'd much rather work on a single server that I knew was going to fail and restore it from backup when the failure happens.
I've been on both sides of that road.
Reality > Perception
I say this working at a place where uneducated perception is leaps and bounds the most annoying part of my job. I could write a book on it.
-
@scottalanmiller said:
@dafyre said:
I would suggest more to meet the definition of redundancy, RAID 1 would be a better suggestion. In RAID 1, a single disk failure becomes a need to replace the disk, but no lost data, and no down time (assuming hot swappable drives).
That's defining failover, not redundancy. And you are referring to the data, not the drives. If I have a RAID 0 and one drive dies, I still have a working drive. It's redundant at the drive level by either definition of redundant.
I guess the term independent in RAID is what drives @scottalanmiller's point the most. Redundant Array of Independent Disks = at a drive-only level, the drives are Independent, the drives are Redundant, and they are in an Array.
Wow - I've never looked at it this way before.
-
@scottalanmiller said:
@dafyre said:
Up until my experience with an almost fully virtualized infrastructure, I would rather have reliable servers.
However, after my experience with virtualized infrastructure, my mindset changed.
It should not change. Resultant reliability is the only value.
Right. Mine changed because the reliability of the single systems we had (on the budget that we had to work with) resulted in systems not being as reliable as they should have been.
The resultant reliability of having two VMware servers with replicated storage was increased, because the perception was that the system was more reliable because things did not go down near as often as was happening otherwise.
-
@dafyre said:
@scottalanmiller said:
If redundancy provides that reliability, no problem. If magic fairy dust does, that's fine too.
Where can I find 3 boxes of Magic fairy dust? My supplies are starting to run low, lol.
That's kinda been my whole point though. If redundancy doesn't provide a better perception of reliability, then why bother with it?
If I knew that redundancy wasn't going to help improve the perception of reliability, I'd much rather work on a single server that I knew was going to fail and restore it from backup when the failure happens.
I've been on both sides of that road.
You keep using the term perception. What does perception have to do with anything?
-
@dafyre said:
@scottalanmiller said:
@dafyre said:
Up until my experience with an almost fully virtualized infrastructure, I would rather have reliable servers.
However, after my experience with virtualized infrastructure, my mindset changed.
It should not change. Resultant reliability is the only value.
Right. Mine changed because the reliability of the single systems we had (on the budget that we had to work with) resulted in systems not being as reliable as they should have been.
The resultant reliability of having two VMware servers with replicated storage was increased, because the perception was that the system was more reliable because things did not go down near as often as was happening otherwise.
That's not perception - that's reality. You found one option, an option through redundancy that provided you with reliability.
The lack of redundancy does not mean lack of reliability. Your continued stance on perception seems to imply that not having redundancy would mean you would have less or no reliability.
I'd argue, in the case of virtualization, redundancy is often a major player in reliability, but not a sole requirement.
-
@dafyre said:
@scottalanmiller said:
@dafyre said:
Up until my experience with an almost fully virtualized infrastructure, I would rather have reliable servers.
However, after my experience with virtualized infrastructure, my mindset changed.
It should not change. Resultant reliability is the only value.
Right. Mine changed because the reliability of the single systems we had (on the budget that we had to work with) resulted in systems not being as reliable as they should have been.
The resultant reliability of having two VMware servers with replicated storage was increased, because the perception was that the system was more reliable because things did not go down near as often as was happening otherwise.
I'm confused, though. Sure, you improved reliability (I'm confused about the perception bit too) but why did this make you change your mindset versus a single reliable server? Since you didn't use a single reliable server for comparison, what changed the mindset?
-
@Dashrender said:
I'd argue, in the case of virtualization, redundancy is often a major player in reliability, but not a sole requirement.
I'd argue that virtualization is a red herring. It's good and we should always have it, and high availability systems always have (going back to the 1960s). But it's not a factor here.
Redundancy is the most common means of getting reliability, but it is definitely not the sole means.
-
@Dashrender said:
I guess the term independent in RAID is what drives @scottalanmiller's point the most. Redundant Array of Independent Disks = at a drive-only level, the drives are Independent, the drives are Redundant, and they are in an Array.
Wow - I've never looked at it this way before.
I think the "it must mean the data" perception probably comes from the fact that many people state that RAID is about improving reliability. But it isn't. That's a big reason that people choose it, but RAID is about increasing speed, capacity and/or reliability by using cheap Winchester drives rather than using some other drive type. It's one of the three.
So when we look at it that way, RAID 0 has both redundancy (meaning more than one disk) AND redundancy (meaning something can fail and something else takes over) in two of three instances.
If we need a cache with increased speed over a single drive and we have a five disk RAID 0, then one fails, we just go down to a four disk RAID 0. Not as fast as before, but still faster than a single drive.
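A back-of-the-envelope sketch of that cache example, assuming throughput scales linearly with the number of striped disks and that the cache contents are disposable, so the stripe set can simply be recreated across the surviving disks. The 150 MB/s per-disk figure and the function name are hypothetical.

```python
def raid0_throughput(disk_count: int, per_disk_mb_s: float) -> float:
    """Idealized RAID 0 throughput: every disk contributes its full speed."""
    if disk_count < 1:
        raise ValueError("need at least one disk")
    return disk_count * per_disk_mb_s


PER_DISK_MB_S = 150.0  # hypothetical sequential speed of one drive

five_disk = raid0_throughput(5, PER_DISK_MB_S)
four_disk = raid0_throughput(4, PER_DISK_MB_S)   # after one drive fails
single = raid0_throughput(1, PER_DISK_MB_S)

print(f"five-disk RAID 0: {five_disk:.0f} MB/s")
print(f"four-disk RAID 0: {four_disk:.0f} MB/s")
print(f"single drive:     {single:.0f} MB/s")
```

Real arrays rarely scale perfectly linearly, but the ordering is the point: the degraded cache is slower than before and still faster than one drive.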