How can we recover data from Hard Drives were on RAID 10 without controller?
-
@scottalanmiller said in How can we recover data from Hard Drives were on RAID 10 without controller?:
RAID 6 requires non-straight reads from all drives in the array. So that's N-1 stress with a minimum being three drives (we don't count the one being written to) under more stressful load. So the chances of killing another drive from stress ranges from around 350% to easily 3,000% depending on the size of the array - assuming otherwise idle.
That is incorrect, you must be thinking about something else.
Below is the stripe layout on a RAID-6. P and Q are parity; 1 to 3 are actual data.
The first row, A, is one stripe, so those blocks belong together. Then B, C, etc. So to rebuild disk 1, stripe A on disks 2, 3, 4 and 5 is read at the same time, and as soon as the data is in, the missing data is calculated (XOR) and written to disk 1. While that happens, the next stripe, B, is read. And so on and so forth.
So it's a pure sequential read on all drives in the array and a sequential write on the drive being rebuilt. Starting from the outside of the disk platters and moving in.
Only thing that would interrupt this sequential operation is other I/O operations on the array, but that is true for all types of raid arrays.
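As a rough illustration of the stripe-by-stripe rebuild described above, here is a minimal Python sketch. It models only the single-parity (P) relation, where the missing block is the XOR of the surviving data blocks and P; the Q parity is deliberately left out. The function names and the in-memory "stripes" are hypothetical, not taken from any controller or from the md driver.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(stripes, missing):
    """Recompute the failed disk's block for every stripe, in order.

    stripes: list of stripes; each stripe is a list of per-disk blocks
             (data blocks plus P), with None at the failed disk's index.
    missing: index of the failed disk.
    """
    rebuilt = []
    for stripe in stripes:  # stripe A, then B, then C, ...
        survivors = [blk for i, blk in enumerate(stripe) if i != missing]
        rebuilt.append(xor_blocks(survivors))  # missing = XOR of the rest
    return rebuilt

# Usage: three data blocks plus P, with disk 0 lost.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)
print(rebuild_missing([[None, data[1], data[2], parity]], missing=0))
```

Each stripe is handled in the same order it sits on disk, which is why the reads on the surviving drives stay sequential in this simplified picture.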
PS. The astute reader will notice that there are three errors in the image above from Seagate.
-
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
So it's a pure sequential read on all drives in the array and a sequential write on the drive being rebuilt. Starting from the outside of the disk platters and moving in.
Can't be sequential because it has to skip over the parity, it's nearly sequential, but not quite. And if there is any production traffic, all sequential is gone.
-
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
Only thing that would interrupt this sequential operation is other I/O operations on the array, but that is true for all types of raid arrays.
Yes, that affects all types. But as mirror recreation is so much faster typically (from hours to months faster depending on drives and controllers and activity) you often get to rebuild mirrors with low or no load, and parity almost always gets hit with production load.
-
@scottalanmiller said in How can we recover data from Hard Drives were on RAID 10 without controller?:
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
Only thing that would interrupt this sequential operation is other I/O operations on the array, but that is true for all types of raid arrays.
Yes, that affects all types. But as mirror recreation is so much faster typically (from hours to months faster depending on drives and controllers and activity) you often get to rebuild mirrors with low or no load, and parity almost always gets hit with production load.
Yup, I did a RAID 5 rebuild that took well over a month. Couldn't help it, it had to stay in production because it contained too much data to keep it down long enough to rebuild at 100%, as that still would have taken weeks. I tried to rebuild another RAID 5 too, but gave up after a few weeks. It was still faster at that point to set up something else and migrate/sync the data over and start over.
Again, couldn't imagine them being a RAID 6. There's no way they'd have survived a rebuild.
-
@Obsolesce said in How can we recover data from Hard Drives were on RAID 10 without controller?:
@scottalanmiller said in How can we recover data from Hard Drives were on RAID 10 without controller?:
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
Only thing that would interrupt this sequential operation is other I/O operations on the array, but that is true for all types of raid arrays.
Yes, that affects all types. But as mirror recreation is so much faster typically (from hours to months faster depending on drives and controllers and activity) you often get to rebuild mirrors with low or no load, and parity almost always gets hit with production load.
Yup, I did a RAID 5 rebuild that took well over a month. Couldn't help it, it had to stay in production because it contained too much data to keep it down long enough to rebuild at 100%, as that still would have taken weeks. I tried to rebuild another RAID 5 too, but gave up after a few weeks. It was still faster at that point to set up something else and migrate/sync the data over and start over.
Again, couldn't imagine them being a RAID 6. There's no way they'd have survived a rebuild.
I've had clients on RAID 6 top two months. Once you have a rebuild go over 48 hours, it's a very, very rare shop that can both justify attempting a rebuild and doesn't have to keep it in production while doing so.
In a lab environment where you don't have real world time constraints, the results are much closer. And if you have software RAID and can throw loads of CPU at it, it speeds up. But I've never seen real world conditions where they can do that to a workload that also needs to rebuild.
And then it adds all those other failures, too. Disk failure isn't the main one on spinners, there is just so much to go wrong.
It's such a big problem that we used to see shops routinely refuse disk replacements until the weekend because they knew the rebuild would take too long and that it would be impactful. So on top of other concerns, it also turned what should have been a 15-minute mean time to drive replacement into a 50-hour mean time to drive replacement. Which obviously takes the chances of failure through the roof.
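To put a rough number on how the longer wait widens the exposure window, here is a small Python sketch. It assumes independent drive failures at a constant rate, which ignores the extra stress of running degraded, so treat it as a lower bound at best. The 2% annual failure rate and the 7 remaining drives are hypothetical inputs, not figures from this thread.

```python
HOURS_PER_YEAR = 8760

def second_failure_probability(remaining_drives, hours_exposed, annual_failure_rate):
    """Chance that at least one remaining drive fails during the window.

    Assumes independent failures at a constant rate (a simplification).
    """
    hourly_survival = (1 - annual_failure_rate) ** (1 / HOURS_PER_YEAR)
    return 1 - hourly_survival ** (remaining_drives * hours_exposed)

# Usage: compare a 15-minute swap with a 50-hour wait (hypothetical inputs).
quick = second_failure_probability(7, 0.25, 0.02)
slow = second_failure_probability(7, 50, 0.02)
print(f"15 min: {quick:.6%}  50 h: {slow:.6%}  ratio: {slow / quick:.0f}x")
```

Even under these optimistic assumptions, the risk scales almost linearly with the time the array sits degraded before the rebuild even starts.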
-
@scottalanmiller said in How can we recover data from Hard Drives were on RAID 10 without controller?:
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
So it's a pure sequential read on all drives in the array and a sequential write on the drive being rebuilt. Starting from the outside of the disk platters and moving in.
Can't be sequential because it has to skip over the parity, it's nearly sequential, but not quite. And if there is any production traffic, all sequential is gone.
It makes sense to believe that, since RAID-6 only needs data from N-2 drives to recreate what's missing.
However, it reads from all drives because it's much faster to calculate the missing data when you have N-1. It's a math thing. If you check the source code for the md driver in the kernel you'll see it mentioned several times.
The other factor is that it is head movement that destroys sequential performance. The drive holding the data you could calculate instead of read is just spinning. It doesn't do anything else and can't do anything by itself. So you don't lose anything even if you skip one stripe segment on one drive. It wouldn't lower the performance of the rebuild, other than increasing the amount of data that needs to be calculated.
-
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
However it reads from all drives because it's much faster to calculate the missing data when you have N-1. It's a math thing. If you check the source code for the md driver in the kernel you'll see it mentioned several times.
Oh right, because it's a different calc each time. One time it's p, one time it's q, one time it's 1 and so forth.
-
@scottalanmiller said in How can we recover data from Hard Drives were on RAID 10 without controller?:
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
However it reads from all drives because it's much faster to calculate the missing data when you have N-1. It's a math thing. If you check the source code for the md driver in the kernel you'll see it mentioned several times.
Oh right, because it's a different calc each time. One time it's p, one time it's q, one time it's 1 and so forth.
Yes, and it's the double parity (Q) that is very costly in CPU to calculate. The actual math for the double parity is advanced stuff, way beyond me. But if you have one failed drive you only have to do the double parity calculation every N stripes to be able to rebuild the drive. If you have two failed drives you have to do it every stripe. That's why it's less CPU/energy/heat consuming to just read all the drives when rebuilding.
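To show why the Q path costs more than the P path, here is a minimal Python sketch working on one byte per disk. It assumes the commonly described RAID-6 construction: P is a plain XOR, and Q is a weighted sum in GF(2^8) with generator 2 and polynomial 0x11D. That choice is an assumption for illustration, and all function names are hypothetical; real implementations use lookup tables or SIMD rather than loops like these.

```python
GF_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed field polynomial)

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return result

def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8)."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def compute_pq(data):
    """Return (P, Q) for one byte from each data disk."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d                        # P: plain XOR
        q ^= gf_mul(gf_pow(2, i), d)  # Q: XOR of 2^i * d in the field
    return p, q

def recover_from_p(data, missing, p):
    """Cheap path: XOR P with the surviving data bytes."""
    value = p
    for i, d in enumerate(data):
        if i != missing:
            value ^= d
    return value

def recover_from_q(data, missing, q):
    """Costly path: strip the known terms from Q, then divide by 2^missing."""
    partial = q
    for i, d in enumerate(data):
        if i != missing:
            partial ^= gf_mul(gf_pow(2, i), d)
    inverse = gf_pow(gf_pow(2, missing), 254)  # a^254 is 1/a in GF(2^8)
    return gf_mul(partial, inverse)

# Usage: lose data disk 1 and recover it both ways.
data = [0x12, 0x34, 0x56]
p, q = compute_pq(data)
damaged = [0x12, None, 0x56]
print(hex(recover_from_p(damaged, 1, p)), hex(recover_from_q(damaged, 1, q)))
```

Both paths give the same byte back, but the P path is a handful of XORs while the Q path needs field multiplications and an inverse, which is the extra work avoided by reading all the drives.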
In general I think that people who have problems rebuilding RAID-6 arrays have two problems.
- They just pop in the replacement drive and wait. Not knowing that they need to adjust the rebuild priority unless they want to wait forever.
- They made a design mistake, i.e. the wrong type or size of array for the job in question. And now they're paying the price.
RAID-6 arrays also tend to be large arrays with large drives, exacerbating the problem.
And I think people are over-consolidating in their excitement to consolidate everything. Basically ending up with all the eggs in one basket.
-
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
Yes, and it's the double parity (Q) that is very costly in CPU to calculate. The actual math for the double parity is advanced stuff, way beyond me. But if you have one failed drive you only have to do the double parity calculation every N stripes to be able to rebuild the drive.
Yeah, I was trying to allude to that in what I had said. Only when replacing the P, I think, it needs that.
-
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
They just pop in the replacement drive and wait. Not knowing that they need to adjust the rebuild priority unless they want to wait forever.
Or.... they should have their rebuild priorities adjusted as a standard based on the assumed workload needs and only adjust later if something special has happened.
-
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
RAID-6 arrays also tend to be large arrays with large drives, exacerbating the problem.
This is complex logic. It's something like...
Large arrays tend to use spinners. Spinners tend to use RAID 6 or 10. Large arrays tend to be costly for RAID 10. etc.
But a huge selling point for RAID 10 is that recovery time is flat. Keep adding drives, the recovery time doesn't change. RAID 6, every additional drive can add a bit of time.
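A back-of-the-envelope Python sketch of that scaling, counting only how much data has to be read to rebuild one failed drive. It assumes a RAID 10 rebuild copies from the single surviving mirror partner and a RAID 6 rebuild reads every surviving drive in full, as described earlier in the thread. It says nothing about actual hours, which depend on controller, CPU and production load. The function name is hypothetical.

```python
def rebuild_read_tb(level, drives, drive_tb):
    """Terabytes that must be read to rebuild one failed drive."""
    if level == "raid10":
        return drive_tb                 # only the mirror partner is read
    if level == "raid6":
        return (drives - 1) * drive_tb  # every surviving drive is read
    raise ValueError(f"unsupported level: {level}")

# Usage: 10TB drives, growing array size.
for n in (4, 8, 16, 24):
    print(n, rebuild_read_tb("raid10", n, 10), rebuild_read_tb("raid6", n, 10))
```

The RAID 10 figure stays constant as drives are added, while the RAID 6 figure grows with every drive, and with it the work the controller has to push through alongside production traffic.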
-
@Pete-S said in How can we recover data from Hard Drives were on RAID 10 without controller?:
And I think people are over-consolidating in their excitement to consolidate everything. Basically ending up with all the eggs in one basket.
We see this a lot. There is certainly a desire for "one pool of storage" and it's so easy now that 10TB drives are so cheap. Heck, I bought one for my kids' video games. 10TB Helium 6Gb/s SATA drive with 256MB cache on my children's video game machine!
-
Have you checked the SMART values to make sure the drives are actually degraded or bad? If the drives are good, a simple chkdsk may resolve your issues.
-
I recently arranged two 8TB hard drives to clone the drives. Already done with cloning from Disk 2 and Disk 4.
Now I'm reconstructing the RAID 0 and trying to recover the data. I tried a couple of tools that claimed to be free, but after reconstructing the RAID they turned out to be evaluation versions when it came to actually recovering the data.
Is there any open source or completely free software for this requirement?
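For context, what "reconstructing RAID 0" from two clones means at its simplest can be sketched in Python as below. This is only an illustration of de-striping, not a recovery tool: it assumes the chunk size and member order are already known, that the data starts at offset 0 on each image, and that there is no controller metadata in the way. The 64 KiB default and the file names are hypothetical; run anything like this only against clones, never the original disks.

```python
from contextlib import ExitStack

def destripe_raid0(image_paths, out_path, chunk_size=64 * 1024):
    """Interleave fixed-size chunks from member images into one flat image.

    image_paths must be in the array's member order (an assumption here).
    """
    with ExitStack() as stack:
        members = [stack.enter_context(open(p, "rb")) for p in image_paths]
        out = stack.enter_context(open(out_path, "wb"))
        while True:
            wrote_any = False
            for member in members:  # chunk from member 0, then member 1, ...
                chunk = member.read(chunk_size)
                if chunk:
                    out.write(chunk)
                    wrote_any = True
            if not wrote_any:       # every member image is exhausted
                break

# Usage (hypothetical file names):
# destripe_raid0(["disk2.img", "disk4.img"], "array.img")
```

Getting the chunk size, order or starting offset wrong produces a scrambled image, which is the part the dedicated reconstruction tools try to detect for you.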
-
@openit said in How can we recover data from Hard Drives were on RAID 10 without controller?:
I recently arranged two 8TB hard drives to clone the drives. Already done with cloning from Disk 2 and Disk 4.
Now I'm reconstructing the RAID 0 and trying to recover the data. I tried a couple of tools that claimed to be free, but after reconstructing the RAID they turned out to be evaluation versions when it came to actually recovering the data.
Is there any open source or completely free software for this requirement?
None that I know of.
-
@openit said in How can we recover data from Hard Drives were on RAID 10 without controller?:
I recently arranged two 8TB hard drives to clone the drives. Already done with cloning from Disk 2 and Disk 4.
Now I'm reconstructing the RAID 0 and trying to recover the data. I tried a couple of tools that claimed to be free, but after reconstructing the RAID they turned out to be evaluation versions when it came to actually recovering the data.
Is there any open source or completely free software for this requirement?
NO. If your data is important, pay for the software. Otherwise, why bother?
-
Okay, got it. Can you name some paid software for RAID data recovery that is known to work, or that you have experience with?
-
@openit www.runtime.org
GetDataBack for NTFS with RAID Reconstructor.
We've had excellent success with their product.
-
@openit said in How can we recover data from Hard Drives were on RAID 10 without controller?:
Okay, got it. Can you name some paid software for RAID data recovery that is known to work, or that you have experience with?
@CCWTech does this every day.