Large or small Raid 5 with SSD
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
also, with larger drive count SSD arrays, is there a point at which I should be looking at raid 6?
Yes
is there a rule of thumb for this point?
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller it seems like the trade off becomes something like this:
Larger disks mean fewer drive bays used and less risk because there are fewer disks, but a higher cost per effective TB and a higher cost of keeping a cold spare on the shelf.
Smaller disks mean more bays used (at some point this becomes important) and more risk because there are more failure sources, but a lower effective cost per TB and cheaper cold spares?
The spares might be cheaper, but you consume them more often. Probably not cheaper overall.
interesting point
-
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
also, with larger drive count SSD arrays, is there a point at which I should be looking at raid 6?
Yes
is there a rule of thumb for this point?
Not really, it's a decently difficult calculation based on the value of uptime, data loss, cost of the extra drive, performance offsets, etc. Very hard to produce a RoT for that.
Because it really comes down to market prices, you tend to build out a RAID 5 and then just run the numbers to see the difference.
Of course you always do a RAID 6 before you consider a spare of any kind.
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
also, with larger drive count SSD arrays, is there a point at which I should be looking at raid 6?
Yes
The number of drives can play a factor? not just the amount of storage? and if so, what is that number, and how is it determined?
-
@scottalanmiller said in Large or small Raid 5 with SSD:
Of course you always do a RAID 6 before you consider a spare of any kind.
Really? The RAID 6 penalty isn't high enough to warrant keeping a hot spare?
-
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
Of course you always do a RAID 6 before you consider a spare of any kind.
Really? The RAID 6 penalty isn't high enough to warrant keeping a hot spare?
Scott, I assume that not having the drive bay for a spare is the exception?
-
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
Of course you always do a RAID 6 before you consider a spare of any kind.
Really? The RAID 6 penalty isn't high enough to warrant keeping a hot spare?
The difference in reliability is huge. The difference in write performance is trivial, especially in modern systems buffered by cache. Yes, there is write expansion to think about, but modern systems using parity RAID are not concerned with IOPS. If they were, you'd be on NVMe, and you'd need RAID handled a completely different way.
-
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
also, with larger drive count SSD arrays, is there a point at which I should be looking at raid 6?
Yes
The number of drives can play a factor? not just the amount of storage? and if so, what is that number, and how is it determined?
The number of drives is the primary factor in whether or not a device will fail. More drives = more risk.
Amount of storage is the primary factor in how long it will take for an array to recover.
-
@Donahue said in Large or small Raid 5 with SSD:
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
Of course you always do a RAID 6 before you consider a spare of any kind.
Really? The RAID 6 penalty isn't high enough to warrant keeping a hot spare?
Scott, I assume that not having the drive bay for a spare is the exception?
Correct, but that's a super rare case. It can happen, but in that case you would normally consider larger drives.
-
So in general, an 8 drive raid 5 is more risky than a 4 drive raid 5, but how much so? I want to know how to calculate the tipping point between safety and cost.
-
@Donahue said in Large or small Raid 5 with SSD:
So in general, an 8 drive raid 5 is more risky than a 4 drive raid 5, but how much so? I want to know how to calculate the tipping point between safety and cost.
It's pretty close, but not exactly, twice as likely to lose a drive. For loose calculations, just use double. If the four drive array is going to lose a drive once every five years, the eight drive array will lose two.
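A rough sketch of why the factor is close to, but not exactly, two. It assumes every drive fails independently with the same probability over the period; the 5% figure and the function name are made up for illustration, not measured values.

```python
def p_any_failure(n_drives: int, p_single: float) -> float:
    """Probability that at least one of n independent drives fails."""
    return 1.0 - (1.0 - p_single) ** n_drives


# Assumed per-drive failure probability over the period (illustrative only).
p = 0.05

four = p_any_failure(4, p)
eight = p_any_failure(8, p)
ratio = eight / four

# The expected number of failed drives is exactly double (8 * p vs 4 * p),
# but the chance of seeing at least one failure is a little under double.
print(f"4 drives: {four:.3f}  8 drives: {eight:.3f}  ratio: {ratio:.2f}")
```

The lower the per-drive failure probability, the closer the ratio gets to two, which is why doubling is fine for loose calculations.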
-
Let me know if I am thinking about this correctly. If each drive in the 4 drive array was twice the price of the smaller drives, then the cost per year is basically the same. But what is different is you are twice as likely to have a second loss (with 8 drives) during a rebuild because there are twice as many primary failures, correct? So would this make a 4 drive raid 5 and an 8 drive raid 6 be similar in reliability?
-
and rebuild times are more dependent on capacity, not on drive count? So with equal capacity, the rebuild should take the same amount of time?
-
@Donahue said in Large or small Raid 5 with SSD:
Let me know if I am thinking about this correctly. If each drive in the 4 drive array was twice the price of the smaller drives, then the cost per year is basically the same. But what is different is you are twice as likely to have a second loss (with 8 drives) during a rebuild because there are twice as many primary failures, correct? So would this make a 4 drive raid 5 and an 8 drive raid 6 be similar in reliability?
Your first theory is correct: drive failure during rebuild would make the primary failure mode roughly equal during that tiny window, assuming rebuilds are automated and instantaneous, which they are not.
The result, though, is incorrect. They would not be even remotely close in reliability. They would be orders of magnitude different. The eight drive RAID 6 would be thousands of times more reliable than the four drive RAID 5.
You can never isolate a failure mode, such as failure during a rebuild, and look at it in a vacuum to approximate a total failure rate. RAID reliability is the result of the interplay between several different failure modes.
Think of it like an equation: x * y * z = resulting reliability. You can't isolate y and have any idea what the result will be; if x and z skyrocket when y drops, the result might still be much higher.
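To put rough numbers on that gap, here is a sketch using the common textbook mean-time-to-data-loss (MTTDL) approximations for single and double parity. This is a simplified model, not the full interplay of failure modes described above: it assumes independent drive failures and ignores unrecoverable read errors, correlated failures, controller faults, and human error. The MTTF and MTTR values are assumptions picked for illustration.

```python
def mttdl_raid5(n: int, mttf_h: float, mttr_h: float) -> float:
    """Approximate MTTDL in hours for single parity: a second drive
    must fail while the first is still being rebuilt."""
    return mttf_h ** 2 / (n * (n - 1) * mttr_h)


def mttdl_raid6(n: int, mttf_h: float, mttr_h: float) -> float:
    """Approximate MTTDL in hours for double parity: a third drive
    must fail while two rebuilds overlap."""
    return mttf_h ** 3 / (n * (n - 1) * (n - 2) * mttr_h ** 2)


MTTF_HOURS = 1_000_000  # assumed per-drive mean time to failure
MTTR_HOURS = 24         # assumed time to replace and rebuild a drive

r5_four = mttdl_raid5(4, MTTF_HOURS, MTTR_HOURS)
r6_eight = mttdl_raid6(8, MTTF_HOURS, MTTR_HOURS)
advantage = r6_eight / r5_four

print(f"4-drive RAID 5: {r5_four:.3e} h")
print(f"8-drive RAID 6: {r6_eight:.3e} h")
print(f"RAID 6 advantage: roughly {advantage:.0f}x")
```

Under these assumed inputs the eight drive RAID 6 comes out on the order of a thousand times better, even though it has twice as many drives to lose. Shorter rebuild times widen the gap further.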
-
@Donahue said in Large or small Raid 5 with SSD:
and rebuild times are more dependent on capacity, not on drive count? So with equal capacity, the rebuild should take the same amount of time?
Correct, capacity is nearly all of what matters, combined with drive speed. A large array will have a slight rebuild advantage, until the RAID subsystem is saturated, then it would be slower.
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
So in general, an 8 drive raid 5 is more risky than a 4 drive raid 5, but how much so? I want to know how to calculate the tipping point between safety and cost.
It's pretty close, but not exactly, twice as likely to lose a drive. For loose calculations, just use double. If the four drive array is going to lose a drive once every five years, the eight drive array will lose two.
Or might not lose any in five years, perhaps, if you aren't nearing the DWPD rating, have a good RAID card with caching, NVMe caching, use RAM caching, etc... Have an SSD mirror JUST for caching, whatever. There are ways to extend life.
How big are the drives? The more drives, the more performance you get, but more likely for one to fail. The smaller the drives, the faster the "rebuilds" depending on how you look at it and what RAID level.
If you have a high number of drives and they are pretty large, and are SSD, then a RAID6 is fine. How much performance do you actually need?
-
@Donahue said in Large or small Raid 5 with SSD:
So would this make a 4 drive raid 5 and an 8 drive raid 6 be similar in reliability?
You'd have to define reliability here. You are twice as likely to experience a drive failure on the 8-drive array. For data loss you are about the same - if you don't replace the failed drive.
In real life I feel it comes down to practical things. Like how big your budget is and how much storage you need. 4TB SSD is pretty standard so if you need 24 TB SSD then you need to use more drives. In almost no case would it be a good idea to use many small drives.
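The drive-count arithmetic behind that can be sketched as follows. The function name and the parity table are illustrative, and it assumes parity costs one drive's worth of capacity for RAID 5 and two for RAID 6.

```python
import math

# Drives' worth of capacity consumed by parity at each level.
PARITY_DRIVES = {"raid5": 1, "raid6": 2}


def drives_needed(usable_tb: float, drive_tb: float, level: str) -> int:
    """Number of drives required to reach the usable capacity."""
    data_drives = math.ceil(usable_tb / drive_tb)
    return data_drives + PARITY_DRIVES[level]


for size_tb in (4, 1):
    r5 = drives_needed(24, size_tb, "raid5")
    r6 = drives_needed(24, size_tb, "raid6")
    print(f"{size_tb} TB drives for 24 TB usable: RAID 5 = {r5}, RAID 6 = {r6}")
```

With 4 TB drives, 24 TB usable takes 7 drives in RAID 5 or 8 in RAID 6. With 1 TB drives it takes 25 or 26, which is where bay count and failure sources start to dominate.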
-
@Pete-S said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
So would this make a 4 drive raid 5 and an 8 drive raid 6 be similar in reliability?
You'd have to define reliability here. You are twice as likely to experience a drive failure on the 8-drive array. For data loss you are about the same - if you don't replace the failed drive.
In real life I feel it comes down to practical things. Like how big your budget is and how much storage you need. 4TB SSD is pretty standard so if you need 24 TB SSD then you need to use more drives. In almost no case would it be a good idea to use many small drives.
Many small drives will typically overrun the controller, too, so the performance gains you expect to get are all lost.
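A back-of-the-envelope way to see where that happens is to divide the controller's throughput ceiling by the per-drive throughput. All figures below are assumptions for illustration: roughly 7.9 GB/s is the theoretical limit of a PCIe 3.0 x8 link, and 0.55 GB/s and 1.1 GB/s are typical sequential rates for SATA and 12Gb/s SAS SSDs. Real controllers usually saturate earlier because of their onboard processor, especially when calculating parity.

```python
import math


def drives_to_saturate(controller_gb_s: float, per_drive_gb_s: float) -> int:
    """Smallest drive count whose combined throughput meets or exceeds
    the controller's ceiling."""
    return math.ceil(controller_gb_s / per_drive_gb_s)


CONTROLLER_GB_S = 7.9  # assumed ceiling: theoretical PCIe 3.0 x8 link
SATA_SSD_GB_S = 0.55   # assumed sequential rate of a SATA SSD
SAS_SSD_GB_S = 1.1     # assumed sequential rate of a 12Gb/s SAS SSD

sata_limit = drives_to_saturate(CONTROLLER_GB_S, SATA_SSD_GB_S)
sas_limit = drives_to_saturate(CONTROLLER_GB_S, SAS_SSD_GB_S)

print(f"SATA SSDs to saturate the link: {sata_limit}")
print(f"SAS SSDs to saturate the link: {sas_limit}")
```

Beyond that count, extra drives add capacity and failure sources but no throughput.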
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Pete-S said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
So would this make a 4 drive raid 5 and an 8 drive raid 6 be similar in reliability?
You'd have to define reliability here. You are twice as likely to experience a drive failure on the 8-drive array. For data loss you are about the same - if you don't replace the failed drive.
In real life I feel it comes down to practical things. Like how big your budget is and how much storage you need. 4TB SSD is pretty standard so if you need 24 TB SSD then you need to use more drives. In almost no case would it be a good idea to use many small drives.
Many small drives will typically overrun the controller, too, so the performance gains you expect to get are all lost.
Yes and as you mentioned above NVMe is where it's at when it comes to performance. SATA and SAS SSDs are for legacy applications - as Intel says.
-
How do you figure out where a RAID card will bottleneck when you are using an SSD RAID, say a RAID10 of some model of SSDs with a Dell H740p up to the maximum number of drives and channels?
How many would it take to have to realistically consider the RAID card itself being the bottleneck?