    Hot Swap vs. Blind Swap

    Tags: storage, raid, hot swap, blind swap, cold swap
    • J
      Jason Banned @BRRABill
      last edited by

      @BRRABill said:

      P.S. If anyone can read that, and it DOESN'T say good luck, please don't let me know. 🙂

      @JaredBusch might know.

      • BRRABillB
        BRRABill
        last edited by

        @Jason

        Yeah, that's the server whose RAID 5 array died on me Tuesday.

        Ironic.

        • BRRABillB
          BRRABill @scottalanmiller
          last edited by

          @scottalanmiller said:

          RAID 5 induces other failures when you go to rebuild. It's extremely common and just an artifact of that RAID level. Doesn't mean that it will always do it or even normally do it, but it is very common. Once you do a drive swap it immediately increases the load on the drives and makes them more likely to fail.

          Is it just RAID 5 that induces failures? I mean, theoretically couldn't a RAID 10 array do the same thing?

          • scottalanmillerS
            scottalanmiller @BRRABill
            last edited by

            @BRRABill said:

            Is it just RAID 5 that induces failures? I mean, theoretically couldn't a RAID 10 array do the same thing?

            Parity RAID induces it on resilver; mirrored RAID really does not. It does a little, but only a little, and only to a single drive, not all drives. So the impact of parity rebuilds is always at least double that of any mirrored RAID and often many, many times more.
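As a rough sketch of that comparison, the snippet below counts how many surviving drives have to be read in full to rebuild one replaced drive. The function name and the whole-drive read model are assumptions made for illustration; real controllers differ in the details.

```python
def drives_read_for_rebuild(level: str, total_drives: int) -> int:
    """Surviving drives read in full to rebuild one failed drive.

    Simplified model (an assumption, not controller documentation):
    parity RAID reads every surviving member, while mirrored RAID
    reads only the failed drive's mirror partner.
    """
    if level == "raid5":
        if total_drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return total_drives - 1
    if level in ("raid1", "raid10"):
        if total_drives < 2:
            raise ValueError("mirrored RAID needs at least 2 drives")
        return 1
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    # Smallest possible RAID 5 (3 drives) vs a 4-drive RAID 10,
    # then a 4-drive RAID 5 like the one discussed in this thread.
    for level, n in (("raid5", 3), ("raid10", 4), ("raid5", 4)):
        print(level, n, "drives ->", drives_read_for_rebuild(level, n), "read in full")
```

Under this model even the smallest RAID 5 stresses two drives where a mirror stresses one, which is where the "at least double" figure comes from; wider parity arrays only widen the gap.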

            • BRRABillB
              BRRABill
              last edited by

              My drive failed almost immediately. I mean, whatever happened rebooted the server.

              • scottalanmillerS
                scottalanmiller @BRRABill
                last edited by

                @BRRABill said:

                My drive failed almost immediately. I mean, whatever happened rebooted the server.

                With RAID 5 that can be almost anything: a secondary drive failing naturally, a resilver-induced failure, a URE (unrecoverable read error), etc. RAID 5 has abundant failure modes that could have happened there.
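To give a feel for the URE failure mode, here is a minimal sketch that estimates the chance of hitting at least one unrecoverable read error while reading every surviving drive during a rebuild. The error rate of one per 1e14 bits, the 2 TB drive size, and the assumption that bit errors are independent are all illustrative assumptions, not figures from this thread; only the count of three surviving drives comes from the 4-drive array described here.

```python
import math


def p_ure_during_rebuild(surviving_drives: int, drive_tb: float,
                         bits_per_ure: float = 1e14) -> float:
    """Probability of at least one URE while reading all surviving drives.

    Assumes (hypothetically) independent bit errors at a rate of one
    per `bits_per_ure` bits read, and decimal terabytes (1 TB = 1e12 bytes).
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - 1/rate)^bits, computed via log1p/expm1 to keep precision
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_ure))


if __name__ == "__main__":
    # Three surviving drives of an assumed 2 TB each
    print(f"{p_ure_during_rebuild(3, 2.0):.2f}")
```

With those assumed inputs the estimate comes out at roughly 0.38. Real drives often behave better than their quoted error rate, so treat this as an upper-end illustration rather than a prediction.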

                • brianlittlejohnB
                  brianlittlejohn @BRRABill
                  last edited by

                  @BRRABill It's possible that the drive had a loose connection and replacing the other knocked it offline.

                  • scottalanmillerS
                    scottalanmiller
                    last edited by

                    That too, could be as simple as physical vibration.

                    • drewlanderD
                      drewlander @BRRABill
                      last edited by

                      @BRRABill That Chinese character means "Spring".

                      • BRRABillB
                        BRRABill
                        last edited by

                        It was firmly plugged in. I think it just gave up the ghost.

                        I've seen that kind of stuff happen with a surge, but that seems unlikely in a hotplug backplane.

                        • BRRABillB
                          BRRABill @drewlander
                          last edited by

                          @drewlander said:

                          @BRRABill That Chinese character means "Spring".

                          Maybe she gave it to me in the Spring.

                          THOUGH...if all my servers die tonight, I am blaming you. 😉

                          • drewlanderD
                            drewlander @BRRABill
                            last edited by

                            @BRRABill said:

                            My drive failed almost immediately. I mean, whatever happened rebooted the server.

                            Go right ahead. Did that drive fail after replacement while it was in a degraded state? I'd say your controller is failing if that happened.

                            On a side note, I pretty much only use a RAID 1 mirror with 1 hot spare (3 disks total) these days in what I do. The apps I deal with and code for are (mostly) OLTP with tons of tiny write transactions. Using a small stripe size and only two disks, this setup benchmarks 13x faster write speeds for me than a RAID 5 array with 4 disks, all day, according to AS SSD. The way we coded our software and designed the database, everything uses GUIDs for PKs. GoDaddy premium DNS provides round-robin load balancing (I don't manage that part). In ProLiant servers (a DL360 G7, for example) I like to install both backplane kits and split the RAID 1 mirror between backplanes. This is just to show, as an example, that there's really not a one-size-fits-all solution for server configurations and redundancy. The software I develop (or run) dictates what I am able to do with the hardware.

                            • scottalanmillerS
                              scottalanmiller @drewlander
                              last edited by

                              @drewlander said:

                              On a side note, I pretty much only use a RAID 1 mirror with 1 hot spare (3 disks total) these days in what I do.

                              Never use a hot spare with RAID 1 unless your controller really lacks basic functionality. Instead go to a triple mirrored RAID 1. This is far safer than RAID 1 with a hot spare because, instead of needing to rebuild while lacking mirroring, the data is always hot and ready, AND you get a 50% read performance boost for the life of the array. So faster and safer, no downsides.
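The 50% figure follows from simple counting, sketched below. This assumes an idealised controller that spreads reads evenly over every active mirror member and a hot spare that serves no reads until it has been rebuilt into the array; the function names are made up for illustration.

```python
def ideal_read_streams(active_mirror_members: int) -> int:
    """Reads a mirror can serve in parallel, one per active member (idealised)."""
    if active_mirror_members < 1:
        raise ValueError("a mirror needs at least one active member")
    return active_mirror_members


def read_gain(members_before: int, members_after: int) -> float:
    """Fractional change in ideal read capacity when the member count changes."""
    return ideal_read_streams(members_after) / ideal_read_streams(members_before) - 1.0


if __name__ == "__main__":
    # RAID 1 + hot spare has 2 active members; a triple mirror has 3.
    print(f"{read_gain(2, 3):.0%}")
```

Both layouts occupy three bays; the only difference in this model is whether the third drive is idle or serving reads.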

                              • JaredBuschJ
                                JaredBusch @Jason
                                last edited by

                                @Jason said:

                                @BRRABill said:

                                P.S. If anyone can read that, and it DOESN'T say good luck, please don't let me know. 🙂

                                @JaredBusch might know.

                                The Japanese meaning for that is spring when used by itself. Compounded with other kanji, the meaning could change.

                                Chinese reads the kanji differently. No idea on that.

                                • BRRABillB
                                  BRRABill @drewlander
                                  last edited by

                                  @drewlander said:

                                  Go right ahead. Did that drive fail after replacement while it was in a degraded state? I'd say your controller is failing if that happened.

                                  There were 4 drives.

                                  1 2 3 4

                                  2 was degraded/failed. I took it out, and put in a fresh one. The server then rebooted, and both 1 and 2 showed up as failed when it came back up.

                                  • scottalanmillerS
                                    scottalanmiller @BRRABill
                                    last edited by

                                    @BRRABill said:

                                    @drewlander said:

                                    Go right ahead. Did that drive fail after replacement while it was in a degraded state? I'd say your controller is failing if that happened.

                                    There were 4 drives.

                                    1 2 3 4

                                    2 was degraded/failed. I took it out, and put in a fresh one. The server then rebooted, and both 1 and 2 showed up as failed when it came back up.

                                    It would have rebooted because the other drive failed or else it means that the server had failed on its own.

                                    • BRRABillB
                                      BRRABill
                                      last edited by

                                      No matter. I am on a new machine now with a new drive. Neither server grade, but all temporary. Probably safer.

                                      All to be written up some day soon. I had to go into work today on my day off (with the two kids in tow who LOVED IT (for real)) for non-IT stuff.

                                      I'm now having beer and watching the Jets/Bills game.

                                      • drewlanderD
                                        drewlander @scottalanmiller
                                        last edited by

                                        @scottalanmiller said:

                                        @drewlander said:

                                        On a side note, I pretty much only use a RAID 1 mirror with 1 hot spare (3 disks total) these days in what I do.

                                        Never use a hot spare with RAID 1 unless your controller really lacks basic functionality. Instead go to a triple mirrored RAID 1. This is far safer than RAID 1 with a hot spare because, instead of needing to rebuild while lacking mirroring, the data is always hot and ready, AND you get a 50% read performance boost for the life of the array. So faster and safer, no downsides.

                                        That's a pretty strong leading sentence. I want that spare inactive because the servers run SSDs. Also, I am not sure the gains on reads would be worth the hit on writes in an OLTP app that processes high-volume micro transactions. We both know HDDs read faster than they write, and reads are not generally where people suffer with disk I/O issues (at least not in what I do). I'd be happy to try it and compare random writes on a RAID 1 3-way mirror vs. a RAID 1 2-disk mirror, but I don't think I even need to do that to know 3x random writes take longer than 2x random writes. Rebuild in degraded mode would be slower, but I would sooner have generally faster transactions with a day of slow rebuilding than a generally slower application from day to day.

                                        😜

                                        -d

                                        • scottalanmillerS
                                          scottalanmiller @drewlander
                                          last edited by

                                          @drewlander said:

                                          That's a pretty strong leading sentence.

                                          It should be. Hot spares use all of the electrical and HVAC and incur all of the cost of having a full mirror, but they give up the performance boost and carry a resilver time that is unnecessary. It's two different uses of the same resources. It's the same rule, more or less, as to why you always do RAID 6 rather than RAID 5 plus hot spare. Faster, safer, same cost and same capacity.
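The "same cost and same capacity" part of that rule can be checked with a little arithmetic. The sketch below assumes equal-sized drives and a hot spare that holds no data until a rebuild onto it completes; the function and key names are hypothetical, and the "faster" claim is not modelled here.

```python
def compare_layouts(bays: int) -> dict:
    """Usable data drives and simultaneous failures tolerated, per layout.

    Simplified model (an assumption): every bay holds one equal-sized
    drive, and the hot spare contributes nothing until it is rebuilt.
    """
    if bays < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return {
        "raid6": {
            "data_drives": bays - 2,            # two drives' worth of parity
            "failures_tolerated_at_once": 2,
        },
        "raid5_plus_hot_spare": {
            "data_drives": (bays - 1) - 1,      # one idle spare, one parity
            "failures_tolerated_at_once": 1,
        },
    }


if __name__ == "__main__":
    for name, stats in compare_layouts(6).items():
        print(name, stats)
```

For any bay count the two layouts yield the same usable capacity, but only RAID 6 survives a second failure that arrives before a rebuild has finished.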

                                          • scottalanmillerS
                                            scottalanmiller @drewlander
                                            last edited by

                                            @drewlander said:

                                            Also, I am not sure the gains on reads would be worth the hit on writes in an OLTP app that processes high-volume micro transactions. We both know HDDs read faster than they write, and reads are not generally where people suffer with disk I/O issues (at least not in what I do).

                                            Regardless of where they suffer, the reads are 50% faster. This means both that any read operation is simply that much faster (unless you are saying that you literally have no bottleneck on your storage at all, which seems very unlikely) and read operations take less time leaving more available time for write transactions so that there is less opportunity for contention.

                                            It is certainly not as beneficial as if writes were faster too, but it is a non-trivial boost in read performance while getting better safety as well, and read performance aids write performance in any case where both need to be performant, which, because of how writes work, is nearly always the case to some extent.
