Large or small Raid 5 with SSD
-
@Donahue said in Large or small Raid 5 with SSD:
am I wrong to think that the probability of two drives failing is much less than the probability of just one drive failing?
You are correct, but no one is disagreeing with that. It's how you are using this info that is incorrect.
-
@Donahue said in Large or small Raid 5 with SSD:
And while say a 24-48 hour decision window plus rebuild time is a lot more exposure than an instant rebuild time, it is still quite low?
So that the first drive has failed is unrelated. Once we hit this window, it is, say, 48 hours of "decision" and say 8 hours of rebuilding. During which time, there is no protection.
- Why would you add 48 hours of exposure with NO RAID at all, for no reason?
- There is only one possible outcome of the decision, to replace the drive. There is no condition under which you would not replace the drive, so why introduce a two day risk window without potential benefit?
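To put rough numbers on that window, here is a minimal sketch. It assumes independent drives with a constant failure rate; the 1% annual failure rate and the 4-drive array are illustrative assumptions, not figures from this thread. Only the 48 hour and 8 hour values come from the post above.

```python
import math

def p_failure_in_window(afr, drives, hours):
    """Probability that at least one of `drives` fails within `hours`,
    assuming independent drives with a constant annual failure rate `afr`."""
    rate_per_hour = -math.log(1.0 - afr) / (365 * 24)
    return 1.0 - math.exp(-rate_per_hour * drives * hours)

AFR = 0.01      # assumed 1% annual failure rate, illustrative only
SURVIVORS = 3   # assumed 4-drive RAID 5 with one drive already failed

rebuild_only = p_failure_in_window(AFR, SURVIVORS, 8)
with_delay = p_failure_in_window(AFR, SURVIVORS, 48 + 8)

print(f"8 h rebuild only:        {rebuild_only:.6%}")
print(f"48 h wait + 8 h rebuild: {with_delay:.6%}")
print(f"ratio: {with_delay / rebuild_only:.2f}x")
```

Under these assumptions both probabilities are small, but waiting makes the unprotected window, and so the risk, roughly seven times larger for nothing in return.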
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
And while say a 24-48 hour decision window plus rebuild time is a lot more exposure than an instant rebuild time, it is still quite low?
So that the first drive has failed is unrelated. Once we hit this window, it is, say, 48 hours of "decision" and say 8 hours of rebuilding. During which time, there is no protection.
- Why would you add 48 hours of exposure with NO RAID at all, for no reason?
- There is only one possible outcome of the decision, to replace the drive. There is no condition under which you would not replace the drive, so why introduce a two day risk window without potential benefit?
perhaps that comes from what I have read, and perhaps what I have read would have made sense with spinners and initializing the rebuild inducing the second drive failure. Presumably that extra time would be to make sure all my ducks are in a row with fresh backups and such, but perhaps that is where my error is, and I should know my ducks were in a row long before the first failure.
-
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
And while say a 24-48 hour decision window plus rebuild time is a lot more exposure than an instant rebuild time, it is still quite low?
So that the first drive has failed is unrelated. Once we hit this window, it is, say, 48 hours of "decision" and say 8 hours of rebuilding. During which time, there is no protection.
- Why would you add 48 hours of exposure with NO RAID at all, for no reason?
- There is only one possible outcome of the decision, to replace the drive. There is no condition under which you would not replace the drive, so why introduce a two day risk window without potential benefit?
perhaps that comes from what I have read, and perhaps what I have read would have made sense with spinners and initializing the rebuild inducing the second drive failure. Presumably that extra time would be to make sure all my ducks are in a row with fresh backups and such, but perhaps that is where my error is, and I should know my ducks were in a row long before the first failure.
With spinners, resilvering can take weeks or months of time, rather than hours, and generally has 6TB+ to resilver with high URE rates. SSDs take hours to resilver, with generally under 1TB of capacity, with low URE rates. So the factors of one apply poorly to the other.
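A rough way to see the URE difference is the usual back-of-the-envelope calculation below. It is only a sketch: it treats the datasheet URE figure as an independent per-bit probability, which is a simplification, and the rates (1 in 10^14 bits for a spinner, 1 in 10^17 for an SSD) and the 4-drive arrays are assumed typical values, not numbers from this thread.

```python
import math

TB = 10**12  # decimal terabyte in bytes

def p_ure_during_read(bytes_read, ure_per_bit):
    """Chance of at least one unrecoverable read error while reading
    `bytes_read` bytes, treating each bit as an independent trial."""
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-ure_per_bit))

# Assumed arrays: 4 drives each, so 3 surviving drives are read in full.
hdd = p_ure_during_read(3 * 6 * TB, 1e-14)  # 6TB spinners, assumed 1e-14 URE rate
ssd = p_ure_during_read(3 * 1 * TB, 1e-17)  # 1TB SSDs, assumed 1e-17 URE rate

print(f"spinner array: {hdd:.1%}")
print(f"SSD array:     {ssd:.4%}")
```

With these assumed inputs the spinner array is more likely than not to hit a URE during the resilver, while the SSD array's chance is a small fraction of a percent.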
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
And while say a 24-48 hour decision window plus rebuild time is a lot more exposure than an instant rebuild time, it is still quite low?
So that the first drive has failed is unrelated. Once we hit this window, it is, say, 48 hours of "decision" and say 8 hours of rebuilding. During which time, there is no protection.
- Why would you add 48 hours of exposure with NO RAID at all, for no reason?
- There is only one possible outcome of the decision, to replace the drive. There is no condition under which you would not replace the drive, so why introduce a two day risk window without potential benefit?
perhaps that comes from what I have read, and perhaps what I have read would have made sense with spinners and initializing the rebuild inducing the second drive failure. Presumably that extra time would be to make sure all my ducks are in a row with fresh backups and such, but perhaps that is where my error is, and I should know my ducks were in a row long before the first failure.
With spinners, resilvering can take weeks or months of time, rather than hours, and generally has 6TB+ to resilver with high URE rates. SSDs take hours to resilver, with generally under 1TB of capacity, with low URE rates. So the factors of one apply poorly to the other.
why only 1 TB of capacity?
-
With spinners, you take a backup first because your resilver is often expected to fail. Or the risk is super high, at least.
The backup might take two hours, while the rebuild might take two weeks.
With SSD, the backup might take longer than the rebuild. So the factors of that alone change a lot, too.
-
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
And while say a 24-48 hour decision window plus rebuild time is a lot more exposure than an instant rebuild time, it is still quite low?
So that the first drive has failed is unrelated. Once we hit this window, it is, say, 48 hours of "decision" and say 8 hours of rebuilding. During which time, there is no protection.
- Why would you add 48 hours of exposure with NO RAID at all, for no reason?
- There is only one possible outcome of the decision, to replace the drive. There is no condition under which you would not replace the drive, so why introduce a two day risk window without potential benefit?
perhaps that comes from what I have read, and perhaps what I have read would have made sense with spinners and initializing the rebuild inducing the second drive failure. Presumably that extra time would be to make sure all my ducks are in a row with fresh backups and such, but perhaps that is where my error is, and I should know my ducks were in a row long before the first failure.
With spinners, resilvering can take weeks or months of time, rather than hours, and generally has 6TB+ to resilver with high URE rates. SSDs take hours to resilver, with generally under 1TB of capacity, with low URE rates. So the factors of one apply poorly to the other.
why only 1 TB of capacity?
How big do you expect SSDs to be when you have many in an array realistically?
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
And while say a 24-48 hour decision window plus rebuild time is a lot more exposure than an instant rebuild time, it is still quite low?
So that the first drive has failed is unrelated. Once we hit this window, it is, say, 48 hours of "decision" and say 8 hours of rebuilding. During which time, there is no protection.
- Why would you add 48 hours of exposure with NO RAID at all, for no reason?
- There is only one possible outcome of the decision, to replace the drive. There is no condition under which you would not replace the drive, so why introduce a two day risk window without potential benefit?
perhaps that comes from what I have read, and perhaps what I have read would have made sense with spinners and initializing the rebuild inducing the second drive failure. Presumably that extra time would be to make sure all my ducks are in a row with fresh backups and such, but perhaps that is where my error is, and I should know my ducks were in a row long before the first failure.
With spinners, resilvering can take weeks or months of time, rather than hours, and generally has 6TB+ to resilver with high URE rates. SSDs take hours to resilver, with generally under 1TB of capacity, with low URE rates. So the factors of one apply poorly to the other.
why only 1 TB of capacity?
How big do you expect SSDs to be when you have many in an array realistically?
so you're talking about the single drive, not the array. Got it.
Though when resilvering, you still read the entire array worth.
-
For the sake of this thread, I am probably going to use 3.84TB SSDs, but the point remains.
-
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
And while say a 24-48 hour decision window plus rebuild time is a lot more exposure than an instant rebuild time, it is still quite low?
So that the first drive has failed is unrelated. Once we hit this window, it is, say, 48 hours of "decision" and say 8 hours of rebuilding. During which time, there is no protection.
- Why would you add 48 hours of exposure with NO RAID at all, for no reason?
- There is only one possible outcome of the decision, to replace the drive. There is no condition under which you would not replace the drive, so why introduce a two day risk window without potential benefit?
perhaps that comes from what I have read, and perhaps what I have read would have made sense with spinners and initializing the rebuild inducing the second drive failure. Presumably that extra time would be to make sure all my ducks are in a row with fresh backups and such, but perhaps that is where my error is, and I should know my ducks were in a row long before the first failure.
With spinners, resilvering can take weeks or months of time, rather than hours, and generally has 6TB+ to resilver with high URE rates. SSDs take hours to resilver, with generally under 1TB of capacity, with low URE rates. So the factors of one apply poorly to the other.
why only 1 TB of capacity?
How big do you expect SSDs to be when you have many in an array realistically?
so you're talking about the single drive, not the array. Got it.
Though when resilvering, you still read the entire array worth.
Correct, the time to resilver is primarily based on the size of the drive being rebuilt. That's the bottleneck, the time to write data back to the one drive.
So if 4x 10TB drives take 2 days to replace a drive,
8x 5TB drives would take 1 day to replace a drive. It's not exact, but it is really close.
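The scaling above can be sketched as follows, assuming the rebuild is limited purely by sustained writes to the replacement drive. The 58 MB/s effective rate is an assumption picked so that a 10TB drive takes about two days; real rebuild rates depend on the controller and on array load.

```python
def rebuild_hours(drive_tb, write_mb_per_s):
    """Hours to rewrite one replacement drive end to end at a sustained rate."""
    bytes_total = drive_tb * 10**12
    seconds = bytes_total / (write_mb_per_s * 10**6)
    return seconds / 3600

RATE = 58  # MB/s, assumed effective rebuild rate (illustrative only)

for drives, size_tb in [(4, 10), (8, 5)]:
    hours = rebuild_hours(size_tb, RATE)
    print(f"{drives}x {size_tb}TB: about {hours:.0f} hours to replace one drive")
```

The drive count never enters the calculation, which is the point: halving the drive size halves the rebuild time regardless of how many drives are in the array.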
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
And while say a 24-48 hour decision window plus rebuild time is a lot more exposure than an instant rebuild time, it is still quite low?
So that the first drive has failed is unrelated. Once we hit this window, it is, say, 48 hours of "decision" and say 8 hours of rebuilding. During which time, there is no protection.
- Why would you add 48 hours of exposure with NO RAID at all, for no reason?
- There is only one possible outcome of the decision, to replace the drive. There is no condition under which you would not replace the drive, so why introduce a two day risk window without potential benefit?
perhaps that comes from what I have read, and perhaps what I have read would have made sense with spinners and initializing the rebuild inducing the second drive failure. Presumably that extra time would be to make sure all my ducks are in a row with fresh backups and such, but perhaps that is where my error is, and I should know my ducks were in a row long before the first failure.
With spinners, resilvering can take weeks or months of time, rather than hours, and generally has 6TB+ to resilver with high URE rates. SSDs take hours to resilver, with generally under 1TB of capacity, with low URE rates. So the factors of one apply poorly to the other.
why only 1 TB of capacity?
How big do you expect SSDs to be when you have many in an array realistically?
so you're talking about the single drive, not the array. Got it.
Though when resilvering, you still read the entire array worth.
Correct, the time to resilver is primarily based on the size of the drive being rebuilt. That's the bottleneck, the time to write data back to the one drive.
So if 4x 10TB drives take 2 days to replace a drive,
8x 5TB drives would take 1 day to replace a drive. It's not exact, but it is really close.
but with twice the chance of having to rebuild.
-
TANSTAAFL
-
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Dashrender said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
And while say a 24-48 hour decision window plus rebuild time is a lot more exposure than an instant rebuild time, it is still quite low?
So that the first drive has failed is unrelated. Once we hit this window, it is, say, 48 hours of "decision" and say 8 hours of rebuilding. During which time, there is no protection.
- Why would you add 48 hours of exposure with NO RAID at all, for no reason?
- There is only one possible outcome of the decision, to replace the drive. There is no condition under which you would not replace the drive, so why introduce a two day risk window without potential benefit?
perhaps that comes from what I have read, and perhaps what I have read would have made sense with spinners and initializing the rebuild inducing the second drive failure. Presumably that extra time would be to make sure all my ducks are in a row with fresh backups and such, but perhaps that is where my error is, and I should know my ducks were in a row long before the first failure.
With spinners, resilvering can take weeks or months of time, rather than hours, and generally has 6TB+ to resilver with high URE rates. SSDs take hours to resilver, with generally under 1TB of capacity, with low URE rates. So the factors of one apply poorly to the other.
why only 1 TB of capacity?
How big do you expect SSDs to be when you have many in an array realistically?
so you're talking about the single drive, not the array. Got it.
Though when resilvering, you still read the entire array worth.
Correct, the time to resilver is primarily based on the size of the drive being rebuilt. That's the bottleneck, the time to write data back to the one drive.
So if 4x 10TB drives take 2 days to replace a drive,
8x 5TB drives would take 1 day to replace a drive. It's not exact, but it is really close.
but with twice the chance of having to rebuild.
Correct, you would need to rebuild roughly twice as often.
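Putting the two effects together, a minimal sketch of the trade-off: with twice as many drives you rebuild about twice as often, but each rebuild takes about half as long. The 1% annual failure rate is an assumed, illustrative figure; the drive counts and rebuild times are the ones from the example above.

```python
def yearly_degraded_days(drives, afr, rebuild_days):
    """Expected days per year spent rebuilding: expected failures x rebuild time."""
    expected_failures = drives * afr
    return expected_failures * rebuild_days

AFR = 0.01  # assumed annual failure rate per drive, illustrative only

big = yearly_degraded_days(4, AFR, 2.0)    # 4x 10TB, 2 days per rebuild
small = yearly_degraded_days(8, AFR, 1.0)  # 8x 5TB, 1 day per rebuild

print(f"4x 10TB: {big:.2f} expected degraded days per year")
print(f"8x 5TB:  {small:.2f} expected degraded days per year")
```

Under these simple assumptions the expected time spent degraded comes out the same for both layouts, which fits the TANSTAAFL remark earlier in the thread.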
-
So I got a quote from xbyte, but it includes this. I had been expecting that I would just load the hypervisor on the raid 5 array and that having a separate R1 array for the OS was an old way of approaching this. Thoughts?
-
@Donahue said in Large or small Raid 5 with SSD:
So I got a quote from xbyte, but it includes this. I had been expecting that I would just load the hypervisor on the raid 5 array and that having a separate R1 array for the OS was an old way of approaching this. Thoughts?
How much money does that add?
-
@Donahue said in Large or small Raid 5 with SSD:
So I got a quote from xbyte, but it includes this. I had been expecting that I would just load the hypervisor on the raid 5 array and that having a separate R1 array for the OS was an old way of approaching this. Thoughts?
Cost is the big factor. As a device, I like it.
-
@Obsolesce said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
So I got a quote from xbyte, but it includes this. I had been expecting that I would just load the hypervisor on the raid 5 array and that having a separate R1 array for the OS was an old way of approaching this. Thoughts?
How much money does that add?
Exactly.
-
$411
-
@scottalanmiller said in Large or small Raid 5 with SSD:
@Donahue said in Large or small Raid 5 with SSD:
$411
Good for what it is. Still a bit of money.
I was under the impression that OBR was the way this should be done.