    RAID rebuild times 16TB drive

    IT Discussion
    Tags: raid rebuild, raid, hdd, md raid
    21 Posts 6 Posters 10.5k Views
    • scottalanmiller @biggen

      @biggen said in RAID rebuild times 16TB drive:

      Just so I understand what you are saying, you are indicating that it doesn't matter the array size or RAID type, that it will simply take ~24hrs to fully write to those 16TB drives 100% if the system isn't doing anything other than a rebuild?

      Right, as long as the system can dedicate enough resources to the process so that the only bottleneck is the write process. This has always been the case and always been known. What people almost never observe is a situation where you have a system fast enough to do this. Typically the parity reconstruction uses so much CPU and cache that there are bottlenecks. That's why on a typical NAS, for example, we see reconstruction taking weeks or months. Both because the system is bottlenecked heavily, and because it is still in use a little.
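To make the "parity reconstruction" cost concrete, here is a minimal sketch of single-parity (RAID 5 style) XOR reconstruction. It is an illustration under assumptions, not the code any particular controller or md driver runs: the 4-byte blocks and the helper name `xor_blocks` are made up for the example, and RAID 6 adds a second, more expensive syndrome that is not shown here.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks from one stripe (toy 4-byte blocks, assumed for illustration).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)

# Simulate losing the second drive and rebuilding its block from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Every stripe on the replacement drive needs this kind of computation over all surviving members, which is why CPU and cache can become the limit before the target drive's write speed does.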

      • scottalanmiller @biggen

        @biggen said in RAID rebuild times 16TB drive:

        So, for example, a RAID 1 16TB mirror would have the same rebuild time as a RAID 6 32TB array (4 x 16TB) or a RAID 10 32TB array (4 x 16TB)? I must be misunderstanding.

        Under the situation where the only bottleneck in question is the write speed of the receiving drive, you'd absolutely expect the write time to be the same, because the source is taken out of the equation (by the phrase "no other bottlenecks") and the absolute only factor in play is the write speed of the same drive. So of course it will be identical.
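The arithmetic behind the ~24 hour figure can be sketched as below. The 185 MB/s average sustained write rate is an assumed number picked to show the calculation, not a spec for any particular 16TB drive, and `rebuild_hours` is a hypothetical helper.

```python
def rebuild_hours(capacity_tb, write_mb_per_s):
    """Hours needed to write a whole drive when write speed is the only limit."""
    capacity_bytes = capacity_tb * 1e12            # drive vendors use decimal TB
    seconds = capacity_bytes / (write_mb_per_s * 1e6)
    return seconds / 3600

# Assumed average sustained write speed for a large hard drive.
hours = rebuild_hours(16, 185)
print(f"{hours:.1f} hours")
```

Note that neither the RAID level nor the array size appears in the function: under the "no other bottlenecks" condition, only the replacement drive's capacity and write speed matter.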

        • biggen @scottalanmiller

          @scottalanmiller @Pete-S

          Excellent. Thanks for that explanation guys and that nifty diagram Pete!

          I guess I was skeptical that I had understood @Pete-S correctly, because I've seen so many reports that it's taken days/weeks to rebuild [insert whatever size] TB RAID 6 arrays in the past. But I guess that was because those systems weren't just idle. There were still IOPS on those arrays AND a possible CPU/cache bottleneck.

          • 1337 @biggen

            @biggen said in RAID rebuild times 16TB drive:

            @scottalanmiller @Pete-S

            Excellent. Thanks for that explanation guys and that nifty diagram Pete!

            I guess I was skeptical that I had understood @Pete-S correctly, because I've seen so many reports that it's taken days/weeks to rebuild [insert whatever size] TB RAID 6 arrays in the past. But I guess that was because those systems weren't just idle. There were still IOPS on those arrays AND a possible CPU/cache bottleneck.

            We don't see any bottlenecks on our software RAID-6 arrays but they run bare metal on standard servers. That might be atypical, I don't know.

            But I think regular I/O has a much bigger effect than any bottleneck. I can see how MB/sec takes a nose dive when rebuilding and there is activity on the drive array.

            If you think about it, when the drive only does rebuilding it's just doing sequential reads and writes, and hard drives are up to 50% as fast as SATA SSDs at this. But when other I/O comes in, it becomes a question of IOPS. And hard drives are really bad at this and only have about 1% of the IOPS of an SSD.
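A rough way to picture that nose dive is the toy model below. It assumes every foreground random I/O costs the drive a fixed slice of time (1 / max IOPS) and that only the leftover time goes to sequential rebuild traffic. The 200 MB/s and 150 IOPS figures are assumed ballpark values for a hard drive, not measurements, and real drives tend to do worse because the heads also have to seek back to the rebuild position.

```python
def effective_rebuild_mb_s(seq_mb_s, max_random_iops, foreground_iops):
    """Rebuild throughput left over after foreground random I/O takes its share."""
    busy_fraction = min(foreground_iops / max_random_iops, 1.0)
    return seq_mb_s * (1.0 - busy_fraction)

# Assumed hard drive: 200 MB/s sequential, about 150 random IOPS.
for load in (0, 30, 75, 120, 150):
    rate = effective_rebuild_mb_s(200, 150, load)
    print(f"{load:>3} foreground IOPS -> {rate:6.1f} MB/s for the rebuild")
```

Even a modest 75 IOPS of production traffic halves the rebuild rate in this model, which on an SSD would be a rounding error.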

            • scottalanmiller @biggen

              @biggen said in RAID rebuild times 16TB drive:

              I guess I was skeptical that I had understood @Pete-S correctly, because I've seen so many reports that it's taken days/weeks to rebuild [insert whatever size] TB RAID 6 arrays in the past. But I guess that was because those systems weren't just idle. There were still IOPS on those arrays AND a possible CPU/cache bottleneck.

              Well, you are still skipping the one key phrase: "no other bottlenecks." According to most reports, there are normally extreme bottlenecks (because of computational time, systems not being completely idle, or both), so the information you are getting in no way contradicts the reports you've already heard.

              You are responding as if you feel that this is somehow different, but it is not.

              It's a bit like hearing that a Chevy Sonic can go 200 mph when dropped out of an airplane, and then saying that most people say that they never get it over 90 mph, and ignoring the obvious key fact that it's being dropped out of an airplane that lets it go so fast.

              • scottalanmiller @1337

                @Pete-S said in RAID rebuild times 16TB drive:

                We don't see any bottlenecks on our software RAID-6 arrays but they run bare metal on standard servers. That might be atypical, I don't know.

                Even on bare metal, we normally see a lot of bottlenecks. But normally because almost no one can make their arrays go idle during a rebuild cycle. If they could, they'd not need the rebuild in the first place, typically.

                • StorageNinja Vendor @scottalanmiller

                  @scottalanmiller said in RAID rebuild times 16TB drive:

                  It's a system, not an IO, bottleneck typically. Especially with RAID 6. It's math that runs on a single thread.

                  Distributed storage systems with per-object RAID FTW here. If I have every VMDK running its own rebuild process (vSAN) or every individual LUN/CPG (how Compellent or 3PAR do it), then a given drive failing is a giant party across all of the drives in the cluster/system. (Also how the fancy erasure code array systems run this.)
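As a back-of-the-envelope illustration of why spreading the rebuild helps, the sketch below assumes the lost data is split evenly across the participating drives and that each one writes its share in parallel at the same rate. That even split, the 185 MB/s rate, and the 20-drive cluster are all assumptions for the example, not how vSAN, Compellent, or 3PAR actually schedule the work.

```python
def distributed_rebuild_hours(data_tb, participating_drives, per_drive_mb_s):
    """Hours to rebuild when each drive absorbs an equal share in parallel."""
    share_bytes = data_tb * 1e12 / participating_drives
    return share_bytes / (per_drive_mb_s * 1e6) / 3600

single = distributed_rebuild_hours(16, 1, 185)    # classic RAID: one target drive
spread = distributed_rebuild_hours(16, 20, 185)   # assumed 20-drive cluster
print(f"one target drive: {single:.1f} h, spread over 20 drives: {spread:.1f} h")
```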

                  • StorageNinja Vendor @scottalanmiller

                    @scottalanmiller said in RAID rebuild times 16TB drive:

                    Even on bare metal, we normally see a lot of bottlenecks. But normally because almost no one can make their arrays go idle during a rebuild cycle. If they could, they'd not need the rebuild in the first place, typically.

                    Our engineers put in a default "reserve 80% of max throughput for production" IO scheduler QoS system, so at saturation rebuilds only get 20% and they don't murder production IO. (Note: rebuilds can use 100% if the bandwidth is there for the taking.)
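The reservation rule described above can be sketched as follows. This is a guess at the policy's arithmetic based only on the description in this post, not the actual scheduler; `rebuild_bandwidth` and its parameters are hypothetical names.

```python
def rebuild_bandwidth(max_mb_s, production_demand_mb_s, production_reserve=0.8):
    """Bandwidth left for rebuild traffic under a production reservation.

    Production is served first, but only up to its reserved share when the
    rebuild is competing for bandwidth; the rebuild takes whatever remains.
    """
    production_served = min(production_demand_mb_s, max_mb_s * production_reserve)
    return max_mb_s - production_served

# Assumed device with 1000 MB/s of total throughput.
for demand in (0, 500, 800, 1000):
    print(f"production wants {demand:>4} MB/s -> rebuild gets "
          f"{rebuild_bandwidth(1000, demand):.0f} MB/s")
```

With no production load the rebuild gets everything, and at saturation it is held to the remaining 20%.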

                    • StorageNinja Vendor @biggen

                      @biggen said in RAID rebuild times 16TB drive:

                      I guess I was skeptical that I had understood @Pete-S correctly, because I've seen so many reports that it's taken days/weeks to rebuild [insert whatever size] TB RAID 6 arrays in the past. But I guess that was because those systems weren't just idle. There were still IOPS on those arrays AND a possible CPU/cache bottleneck.

                      Was the drive full? Smarter new RAID rebuild systems don't rebuild empty LBAs. Every enterprise storage array system has done this with rebuilds for the last 20 years...
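The idea of skipping empty LBAs can be pictured with an allocation map, as in the sketch below. The eight-block map and the helper name `blocks_to_rebuild` are made up for illustration; real arrays track allocation in their own metadata formats.

```python
def blocks_to_rebuild(allocation_map):
    """Return the indexes of blocks that actually hold data."""
    return [index for index, in_use in enumerate(allocation_map) if in_use]

# Toy allocation map: True means the block has been written to.
allocation_map = [True, True, False, False, True, False, False, False]

todo = blocks_to_rebuild(allocation_map)
fraction = len(todo) / len(allocation_map)
print(f"rebuilding blocks {todo}, {fraction:.0%} of the drive")
```

A rebuild that only touches allocated blocks scales with how full the drive is, so a mostly empty 16TB drive finishes in a fraction of the full-drive write time.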

                      • biggen @StorageNinja

                        @StorageNinja No personal experience with it. I've only ever run RAID 1 or 10. Just the reading I've done over the years from people reporting how long it took to rebuild larger RAID 6 arrays.

                        BTW, are you the same person who is/was over at Spiceworks? I always enjoyed reading your posts on storage. I respect both you and @scottalanmiller in this arena immensely.

                        • JaredBusch @biggen

                          @biggen said in RAID rebuild times 16TB drive:

                          BTW, are you the same person who is/was over at Spiceworks?

                          Yes he is.

                          • scottalanmiller @biggen

                            @biggen said in RAID rebuild times 16TB drive:

                            BTW, are you the same person who is/was over at Spiceworks? I always enjoyed reading your posts on storage. I respect both you and @scottalanmiller in this arena immensely.

                            Yup, he's one of the "Day Zero" founders over here.

                            • scottalanmiller @StorageNinja

                              @StorageNinja said in RAID rebuild times 16TB drive:

                              @scottalanmiller said in RAID rebuild times 16TB drive:

                              It's a system, not an IO, bottleneck typically. Especially with RAID 6. It's math that runs on a single thread.

                              Distributed storage systems with per-object RAID FTW here. If I have every VMDK running its own rebuild process (vSAN) or every individual LUN/CPG (how Compellent or 3PAR do it), then a given drive failing is a giant party across all of the drives in the cluster/system. (Also how the fancy erasure code array systems run this.)

                              Yeah, that's RAIN and that basically solves everything 🙂
