How to Market RAID 6 When Customers Need Safety
-
@Dashrender said in How to Market RAID 6 When Customers Need Safety:
@scottalanmiller said in How to Market RAID 6 When Customers Need Safety:
@Dashrender said in How to Market RAID 6 When Customers Need Safety:
and second - write hole in ZFS?
ZFS uses variable stripe widths to overcome the write hole. Why no one else has implemented this, I am not sure (backward compatibility concerns, perhaps?). It's been a decade since Sun solved the write hole problem, but still today no one has solved it except the ZFS implementation of parity RAID. Now, most people avoid it by having batteries, flash cache or insane UPS systems, so it does not come up that often. But the risk is real.
But what is a write hole?
From Sun's 2005 paper addressing it: "RAID-5 (and other data/parity schemes such as RAID-4, RAID-6, even-odd, and Row Diagonal Parity) never quite delivered on the RAID promise -- and can't -- due to a fatal flaw known as the RAID-5 write hole. Whenever you update the data in a RAID stripe you must also update the parity, so that all disks XOR to zero -- it's that equation that allows you to reconstruct data when a disk fails. The problem is that there's no way to update two or more disks atomically, so RAID stripes can become damaged during a crash or power outage."
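To make the "XOR to zero" part concrete, here is a minimal Python sketch of a toy stripe. It is only an illustration under assumptions: three data blocks plus one parity block, four bytes each, with a made-up helper called `xor_blocks`. It is not how any real controller or MD RAID lays out data. It shows a crash landing between the data write and the parity write, followed later by a disk failure.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Toy stripe (hypothetical layout): three data blocks and one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# While the stripe is consistent, data and parity together XOR to zero,
# so any single missing block can be rebuilt from the others.
assert xor_blocks(data + [parity]) == bytes(4)
rebuilt_ok = xor_blocks([data[0], data[2], parity])  # pretend disk 1 died
assert rebuilt_ok == b"BBBB"

# The write hole: new data reaches disk 0, then power is lost before
# the matching parity update is written. The parity is now stale.
data[0] = b"ZZZZ"
stripe_consistent = xor_blocks(data + [parity]) == bytes(4)
print("stripe consistent after crash:", stripe_consistent)

# Later, disk 1 fails. Rebuilding it from the survivors plus the stale
# parity returns garbage, even though disk 1 was never being written.
rebuilt_bad = xor_blocks([data[0], data[2], parity])
print("rebuilt block matches original:", rebuilt_bad == b"BBBB")
```

The point to notice is that the damaged block is one that was not part of the interrupted write, and nothing flags the stripe as bad until a rebuild is needed.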
-
@coliver said in How to Market RAID 6 When Customers Need Safety:
@Dashrender said in How to Market RAID 6 When Customers Need Safety:
@coliver said in How to Market RAID 6 When Customers Need Safety:
It's when two disks, in a RAID6, don't match the other members of the array. RAID1 and RAID5 have this issue as well but with a single drive.
If that happens in RAID 1/10 as well, then how is it solved?
From my understanding it doesn't happen on RAID1 often. Only when there is a drive/array misconfiguration. However it is common on RAID5/6. I'm not sure of the exact mechanism but it has something to do with built-in drive caching.
Its full name is the RAID 5 Write Hole. It does not exist in mirrored RAID; it is a parity-RAID-only risk.
-
@scottalanmiller said in How to Market RAID 6 When Customers Need Safety:
Its full name is the RAID 5 Write Hole. It does not exist in mirrored RAID; it is a parity-RAID-only risk.
That's good to know. So it has to do with the parity bit in parity RAID devices. I'll have to look at it more.
-
So the RAID 5 Write Hole is active on all parity arrays?
Which means any parity array should be avoided at all cost... doesn't it?
-
@coliver said in How to Market RAID 6 When Customers Need Safety:
That's good to know. So it has to do with the parity bit in parity RAID devices. I'll have to look at it more.
Yeah, has to do with the way that it writes.
-
@DustinB3403 said in How to Market RAID 6 When Customers Need Safety:
So the RAID 5 Write Hole is active on all parity arrays?
Which means any parity array should be avoided at all cost... doesn't it?
No, because, like losing multiple disks in RAID 10, it's just not a real world risk. I've been involved in an awful lot of array failures over the years and never once was it because of the write hole. Write holes are rare even when the circumstances allow them to happen, and almost no enterprise system allows that. Any enterprise-class hardware RAID protects against the write hole; that's why we have battery-backed cache and NVRAM caches on them. ZFS protects against this in the Solaris, FreeBSD and OpenIndiana worlds.
The risk really only exists with Linux MD RAID, non-ZFS RAID on BSD, Windows Software RAID, FakeRAID controllers and similar situations. The big enterprise software RAID vendors have stated that they assume that you will maintain power to your system, in which case the write hole cannot happen. If you want to use software parity RAID without ZFS, then you need to either accept the write hole risk or ensure continuous power to the box, the same as the battery does for a hardware RAID cache.
-
I once asked a vendor who were pitching an appliance that supported RAID0+1 and RAID1+0, "what would you recommend between the two, to a potential customer?" They said it didn't matter as they are both the same thing.
We didn't go with that vendor.
-
@BBigford said in How to Market RAID 6 When Customers Need Safety:
I once asked a vendor who were pitching an appliance that supported RAID0+1 and RAID1+0, "what would you recommend between the two, to a potential customer?" They said it didn't matter as they are both the same thing.
We didn't go with that vendor.
Amazing. Now that's just stupid. Losing a sale over not knowing your own product is ridiculous.
-
@BBigford said in How to Market RAID 6 When Customers Need Safety:
I once asked a vendor who were pitching an appliance that supported RAID0+1 and RAID1+0, "what would you recommend between the two, to a potential customer?" They said it didn't matter as they are both the same thing.
We didn't go with that vendor.
-
@DustinB3403 said in How to Market RAID 6 When Customers Need Safety:
@BBigford said in How to Market RAID 6 When Customers Need Safety:
I once asked a vendor who were pitching an appliance that supported RAID0+1 and RAID1+0, "what would you recommend between the two, to a potential customer?" They said it didn't matter as they are both the same thing.
We didn't go with that vendor.
Or, you know...
http://www.smbitjournal.com/2014/07/comparing-raid-10-and-raid-01/
-
@scottalanmiller said in How to Market RAID 6 When Customers Need Safety:
Or, you know...
http://www.smbitjournal.com/2014/07/comparing-raid-10-and-raid-01/
TL;DR: pictures are prettier