Large or small Raid 5 with SSD
-
@Donahue said in Large or small Raid 5 with SSD:
@Obsolesce said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
Live optics says I've got ~3k IOPS peak between all my existing hosts, and ~800 at 95%.
@Donahue said in Large or small Raid 5 with SSD:
A simple raid 5 with like 4x3.84TB SSD's is appealing
That'll be just dandy. Depends on the SSDs, but that's at least 11k IOPS. Still a 16TB RAID5 though, and rebuild performance is 30% by default.
what do you mean by ... and rebuild performance is 30% by default...?
On Dells, when a drive rebuilds, it does it at 30% capabilities by default. I assume to prevent production services coming to a crawl. You can change it though.
-
@Obsolesce said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
@Obsolesce said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
Live optics says I've got ~3k IOPS peak between all my existing hosts, and ~800 at 95%.
@Donahue said in Large or small Raid 5 with SSD:
A simple raid 5 with like 4x3.84TB SSD's is appealing
That'll be just dandy. Depends on the SSDs, but that's at least 11k IOPS. Still a 16TB RAID5 though, and rebuild performance is 30% by default.
what do you mean by ... and rebuild performance is 30% by default...?
On Dells, when a drive rebuilds, it does it at 30% capabilities by default. I assume to prevent production services coming to a crawl. You can change it though.
30% of capabilities, not 30% speed, though. So it is difficult to calculate.
-
Ok, let's add a layer to this. Let's assume the raid 5 will lose a disk. Do I run with no spare of any kind, and when it fails, then buy a replacement and switch it out? Is the URE risk primarily during rebuild, or anytime it is in a degraded state? I know that SSDs are generally an order of magnitude (or two) safer in this regard, but I want to have this planned out ahead of time.
-
also, am I right to assume that network contention can influence IOPS?
-
@Donahue said in Large or small Raid 5 with SSD:
Is the URE risk primarily during rebuild, or anytime it is in a degraded state?
URE is quite nominal on SSDs typically. Not zero, but not like you are used to, either.
-
@Donahue said in Large or small Raid 5 with SSD:
also, am I right to assume that network contention can influence IOPS?
Resulting IOPS to a third party service, but not IOPS themselves.
-
But I know that you don't have a SAN, so in your case the answer is no.
-
@Donahue said in Large or small Raid 5 with SSD:
Ok, let's add a layer to this. Let's assume the raid 5 will lose a disk. Do I run with no spare of any kind, and when it fails, then buy a replacement and switch it out?
You can; lots of places with four-hour SLA hardware replacement plans do that. I wouldn't do that without a warranty to cover the replacements, though.
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
Is the URE risk primarily during rebuild, or anytime it is in a degraded state?
URE is quite nominal on SSDs typically. Not zero, but not like you are used to, either.
but is the risk only present once I initiate a rebuild? As in, if a primary failure occurs, do I have time to assess my options before starting? I am basically trying to figure out if I should buy 4 or 5 drives. I know you said earlier that with raid 5, you may as well add that 5th drive to the array and make it a raid 6 as opposed to sitting on the shelf.
-
I am probably looking at more like next-day replacement.
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
also, am I right to assume that network contention can influence IOPS?
Resulting IOPS to a third party service, but not IOPS themselves.
It will certainly improve latency. That Synology is averaging 14.6 ms reads, with spikes over 280 ms. Writes are averaging 4.5 ms, with spikes over 200 ms.
-
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
Is the URE risk primarily during rebuild, or anytime it is in a degraded state?
URE is quite nominal on SSDs typically. Not zero, but not like you are used to, either.
but is the risk only present once I initiate a rebuild? As in, if a primary failure occurs, do I have time to assess my options before starting? I am basically trying to figure out if I should buy 4 or 5 drives. I know you said earlier that with raid 5, you may as well add that 5th drive to the array and make it a raid 6 as opposed to sitting on the shelf.
Yes, but if you are waiting, that's when you create the risk of a second drive failing, because your time exposure goes from a few hours to potentially days. That's a lot of expansion.
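To put rough numbers on that exposure window, here is a minimal sketch. It assumes independent drives with a constant failure rate, and the 1% annual failure rate is a made-up placeholder, not a figure from any spec sheet. Drives from the same batch tend to fail in a correlated way, so treat the result as a lower bound on how much the risk grows. The function name `p_second_failure` is hypothetical.

```python
import math

HOURS_PER_YEAR = 8760


def p_second_failure(surviving_drives: int, afr: float, window_hours: float) -> float:
    """Chance that at least one surviving drive fails inside the window.

    Assumes independent drives and a constant (exponential) failure rate
    derived from the annual failure rate `afr`.
    """
    rate_per_hour = -math.log1p(-afr) / HOURS_PER_YEAR
    return -math.expm1(-surviving_drives * rate_per_hour * window_hours)


ASSUMED_AFR = 0.01  # hypothetical 1% annual failure rate per drive

# 4-drive RAID 5 with one drive dead leaves 3 drives exposed.
p_4h = p_second_failure(3, ASSUMED_AFR, window_hours=4)    # spare on hand
p_72h = p_second_failure(3, ASSUMED_AFR, window_hours=72)  # waiting on shipping

print(f"4 h exposure:  {p_4h:.6%}")
print(f"72 h exposure: {p_72h:.6%}")
print(f"risk multiplier: {p_72h / p_4h:.1f}x")
```

Whatever failure rate you plug in, the multiplier tracks the length of the window almost linearly, which is the point being made above about hours versus days.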
-
just to clarify, we are talking about two different risks, with two different triggers, correct? The risk of a second disk failure while degraded, which is triggered the moment the first disk dies. The second risk (and less so for SSD) is URE, but my question is does this risk only trigger once you initiate a rebuild? Because it is the rebuild itself that is trying to read the unreadable block during its parity calculation?
-
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
Is the URE risk primarily during rebuild, or anytime it is in a degraded state?
URE is quite nominal on SSDs typically. Not zero, but not like you are used to, either.
but is the risk only present once I initiate a rebuild? As in, if a primary failure occurs, do I have time to assess my options before starting? I am basically trying to figure out if I should buy 4 or 5 drives. I know you said earlier that with raid 5, you may as well add that 5th drive to the array and make it a raid 6 as opposed to sitting on the shelf.
I never do hot spare. If you are going to have it plugged in, use it. Make it a RAID 6.
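A quick sketch of why the hot spare buys nothing here, using the 3.84TB drive size from earlier in the thread. The `Layout` class and its field names are illustrative only. It just counts data drives versus parity drives and ignores formatting overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    drives_in_array: int
    parity_drives: int
    drive_tb: float

    @property
    def usable_tb(self) -> float:
        # Usable space is whatever is left after parity takes its share.
        return (self.drives_in_array - self.parity_drives) * self.drive_tb

    @property
    def failures_survived(self) -> int:
        # Simultaneous drive losses the array tolerates before any rebuild.
        return self.parity_drives


DRIVE_TB = 3.84

# Five drives purchased in both cases.
raid5_with_spare = Layout("RAID 5 (4 drives) + hot spare", 4, 1, DRIVE_TB)
raid6 = Layout("RAID 6 (5 drives)", 5, 2, DRIVE_TB)

for layout in (raid5_with_spare, raid6):
    print(f"{layout.name}: {layout.usable_tb:.2f} TB usable, "
          f"survives {layout.failures_survived} failure(s)")
```

Same five drives and the same usable space either way, but the RAID 6 survives a second failure without first having to finish a rebuild.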
-
@Donahue said in Large or small Raid 5 with SSD:
just to clarify, we are talking about two different risks, with two different triggers, correct? The risk of a second disk failure while degraded, which is triggered the moment the first disk dies. The second risk (and less so for SSD) is URE, but my question is does this risk only trigger once you initiate a rebuild? Because it is the rebuild itself that is trying to read the unreadable block during its parity calculation?
The URE risk only triggers once you trigger a rebuild, but the shift risk happens the moment you delay replacing the disk. You can't win through that thought process.
-
@Donahue said in Large or small Raid 5 with SSD:
I know you said earlier that with raid 5, you may as well add that 5th drive to the array and make it a raid 6 as opposed to sitting on the shelf.
Not "might as well", but "had better make sure you do." Difference in risk is astronomic. If you are even thinking hot spare is an option, we've not explained adequately how it works.
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
just to clarify, we are talking about two different risks, with two different triggers, correct? The risk of a second disk failure while degraded, which is triggered the moment the first disk dies. The second risk (and less so for SSD) is URE, but my question is does this risk only trigger once you initiate a rebuild? Because it is the rebuild itself that is trying to read the unreadable block during its parity calculation?
The URE risk only triggers once you trigger a rebuild, but the shift risk happens the moment you delay replacing the disk. You can't win through that thought process.
How is the URE not a risk the instant the first drive fails? Can't a URE happen during normal disk operation? i.e. you're in degraded status - and while reading before starting the rebuild, hit a URE?
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
I know you said earlier that with raid 5, you may as well add that 5th drive to the array and make it a raid 6 as opposed to sitting on the shelf.
Not "might as well", but "had better make sure you do." Difference in risk is astronomic. If you are even thinking hot spare is an option, we've not explained adequately how it works.
Assuming the performance hit is as low as Scott claims (and I'm sure he's right), there would be no reason not to put the protection in place now: you have the drive, so just use it. Sitting it on the shelf introduces risk you don't need to take: the time it takes for you to be notified and then act on that notification before a second drive fails. Why take on that risk when you don't have to?
-
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
just to clarify, we are talking about two different risks, with two different triggers, correct? The risk of a second disk failure while degraded, which is triggered the moment the first disk dies. The second risk (and less so for SSD) is URE, but my question is does this risk only trigger once you initiate a rebuild? Because it is the rebuild itself that is trying to read the unreadable block during its parity calculation?
The URE risk only triggers once you trigger a rebuild, but the shift risk happens the moment you delay replacing the disk. You can't win through that thought process.
How is the URE not a risk the instant the first drive fails?
It is a risk, but because we're talking SSD specifically, the chance of a URE failure is far smaller than on an HDD. The flash will normally fail before you hit a URE. Not impossible, just much less likely than other failures.
Can't a URE happen during normal disk operation? i.e. you're in degraded status - and while reading before starting the rebuild, hit a URE?
Normal operation of the RAID would correct the issue. Degraded status depends on the type of RAID, e.g. a degraded RAID 6 should function like a RAID 5, so a URE doesn't become a problem until the 2nd drive fails.
Again, URE is not an expected failure point for SSD drives. Not that it can't happen, it's just very unlikely.
-
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
just to clarify, we are talking about two different risks, with two different triggers, correct? The risk of a second disk failure while degraded, which is triggered the moment the first disk dies. The second risk (and less so for SSD) is URE, but my question is does this risk only trigger once you initiate a rebuild? Because it is the rebuild itself that is trying to read the unreadable block during its parity calculation?
The URE risk only triggers once you trigger a rebuild, but the shift risk happens the moment you delay replacing the disk. You can't win through that thought process.
How is the URE not a risk the instant the first drive fails? Can't a URE happen during normal disk operation? i.e. you're in degraded status - and while reading before starting the rebuild, hit a URE?
It's not an array risk outside of a rebuild. It's rebuilding that causes the cascade of the URE to make the whole array unreadable.
That a URE can happen is not the fear. That a URE can happen during a rebuild is the fear, because the array is effectively a "single file" being rewritten during a rebuild, and if that write operation cannot complete, everything is lost.
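For a feel of how much smaller the rebuild-time URE exposure is on SSD, here is a rough sketch for the 4x3.84TB RAID 5 discussed above. The two error rates are assumptions in the style of typical spec-sheet ratings (one unrecoverable error per 1e14 bits for a consumer-class HDD, one per 1e17 bits for an enterprise-class SSD), not numbers for any specific drive. The model also treats every bit as independent, which real media does not do, so read the output as order-of-magnitude only. The function name `p_ure_during_rebuild` is hypothetical.

```python
import math


def p_ure_during_rebuild(surviving_drives: int, drive_tb: float,
                         ure_per_bit: float) -> float:
    """Chance of at least one URE while reading every surviving drive in full.

    A RAID 5 rebuild has to read all remaining drives end to end, so the
    number of bits read is surviving_drives * drive size.
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - p)^bits, computed in log space to avoid rounding to zero.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


ASSUMED_HDD_URE = 1e-14  # assumed rating: 1 error per 1e14 bits read
ASSUMED_SSD_URE = 1e-17  # assumed rating: 1 error per 1e17 bits read

p_hdd = p_ure_during_rebuild(3, 3.84, ASSUMED_HDD_URE)
p_ssd = p_ure_during_rebuild(3, 3.84, ASSUMED_SSD_URE)

print(f"HDD-class rating: {p_hdd:.1%} chance of a URE during rebuild")
print(f"SSD-class rating: {p_ssd:.3%} chance of a URE during rebuild")
```

Spec-sheet ratings are worst-case bounds, so real drives usually do better than either figure. The gap between the two is what matters for the RAID 5 decision.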