Dell PERC Question (Server Down)
-
@todd-at-xByte said in Dell PERC Question (Server Down):
Basically the tech said that while he has never seen this exact issue, he has seen issues that are almost identical, and that they have a special model of the drive I have (I have the 3X) to fix these issues.
We (xByte) have not seen or tested the Edge 3X drives, and before feeling comfortable with this solution we would need to fully test the drives in a variety of Dell servers and PERC controllers. In the interest of getting this fully resolved as soon as possible, we are going to replace the Edge drives with Dell-branded Enterprise SSDs.
Wow - awesome.
So what are you doing for future customers who want SSDs in Dell servers? Is the only option now the Dell SSDs?
-
@Dashrender I think both xByte and EDGE would likely look to do more testing before completely ending the relationship.
There is already a relationship there...
-
I imagine they will also look at my drives when I send them back and try to figure out what went wrong.
-
@BRRABill said in Dell PERC Question (Server Down):
I imagine they will also look at my drives when I send them back and try to figure out what went wrong.
The drives will indeed go back to Edge. They have updated their firmware based on other cases like this in the past.
-
@todd-at-xByte these were a new line of drives to begin with, right?
I think you just recently switched, right?
-
@BRRABill said in Dell PERC Question (Server Down):
@todd-at-xByte these were a new line of drives to begin with, right?
I think you just recently switched, right?
Not exactly. Edge rebranded the 960 Boost Pro Plus drives we were selling to E3 to consolidate their line. Edge stated that the 960 E3s were exactly the same as their 960 Boost Pro Plus.
-
@DustinB3403 said in Dell PERC Question (Server Down):
@Dashrender I think both xByte and EDGE would likely look to do more testing before completely ending the relationship.
There is already a relationship there...
Awww - well, I meant they would suggest only Dell drives as long as they hadn't tested and resolved any issues with the new E3s.
-
@todd-at-xByte said in Dell PERC Question (Server Down):
@BRRABill said in Dell PERC Question (Server Down):
@todd-at-xByte these were a new line of drives to begin with, right?
I think you just recently switched, right?
Not exactly. Edge rebranded the 960 Boost Pro Plus drives we were selling to E3 to consolidate their line. Edge stated that the 960 E3s were exactly the same as their 960 Boost Pro Plus.
How long had/have you been selling the Boost Pro Plus drives? And were there ever any issues with them?
-
So, the array went bye-bye again. Actually, it was just drive 1:0. (I'm still wondering if there is just something off with that drive.)
Anyway, I said, the hell with this, time to get these new DELL drives in there.
So the array came up degraded (just running on 1:1). I rebooted and went into the PERC config. I unplugged 1:0 (which was missing anyway), and plugged the DELL drive into 1:2. It instantly powered up, and the PERC config saw it. I added it as a hot spare, and it instantly started rebuilding. AWESOME!
So I rebooted the server. As soon as the server rebooted, the LED on the DELL drive started blinking. Hmm, that's odd, I think. Of course an error comes up, saying drives are missing. I look at the DELL drive, no LEDs. WTF.
I'll cut through the 2+ hours on support with DELL, trying everything. They basically said, the array is toast. Great.
I have 2 more of these DELL SSDs, so I think, WTH, let me try one of them. I plug it in, and reboot a few times with it outside the array. Comes back. So the big test, try it with the array. I do the same steps. But this time when it reboots, the array stays up.
AWESOME AWESOME AWESOME!
It is still currently rebuilding, so we shall see where we get with this. I wonder if the one drive was just a lemon. DELL says no, but I think the results say otherwise.
-
You rebooted after it started a rebuild? That wasn't wise.
-
@Dashrender said in Dell PERC Question (Server Down):
You rebooted after it started a rebuild? That wasn't wise.
That's what I have seen all the DELL techs do.
I double-checked tonight, and that is definitely the supported way to go.
-
It may be allowable, but it seems like an unwise thing to do. For example, I would never do that on a RAID 5 array. With all that math, one little bit gets messed up and the array is lost. Kind of like what happened to you.
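To show what I mean by "all that math", here is a minimal sketch, assuming plain single-parity XOR of the kind RAID 5 is built on. It is not how a PERC controller actually implements a rebuild, and the block contents and the `xor_blocks` helper are made up for illustration. A lost block is rebuilt by XORing the surviving blocks with the parity block, so a single flipped bit in any survivor ends up in the rebuilt data.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks from one stripe, plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] died: rebuild it from the survivors and parity.
rebuilt = xor_blocks([data[0], data[2], parity])

# Now flip a single bit in one surviving block before the rebuild reads it.
corrupted = bytes([data[0][0] ^ 0x01]) + data[0][1:]
bad_rebuilt = xor_blocks([corrupted, data[2], parity])

print(rebuilt == data[1])      # the clean rebuild matches the lost block
print(bad_rebuilt == data[1])  # the rebuild from a corrupted survivor does not
```

The rebuild has nothing to check the result against while the array is degraded, which is why I would rather not add any avoidable risk during it.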
-
@Dashrender said in Dell PERC Question (Server Down):
It may be allowable, but it seems like an unwise thing to do. For example, I would never do that on a RAID 5 array. With all that math, one little bit gets messed up and the array is lost. Kind of like what happened to you.
Just on a few drives. For other drives, it worked fine.
If you had to wait the whole time for it to rebuild, it would take forever. (Think of the really long RAID 5 rebuild times @scottalanmiller has mentioned.)
-
Of course, but normally you would have the OS up and running during that time, so the users don't see that long downtime.
-
The end of the night had me trying to add the third DELL disk in, which also failed in the same way as the first.
So I now have the array fully functional with the EDGE drive that didn't ever fail, and the 1 DELL drive that worked.
Ugh.
Please stay up tonight, gentle array.
-
@BRRABill said in Dell PERC Question (Server Down):
@Dashrender said in Dell PERC Question (Server Down):
you rebooted after it started a rebuild? That wasn't wise.
That's what I have seen all the DELL techs do.
I double-checked tonight, and that is definitely the supported way to go.
I've been a Dell tech. They are not trained and are just random people grabbed from third party staffing firms at the last second. Never use "Dell techs do it" as a guide to anything. It's the same as saying "random out of work guy did this".
-
@scottalanmiller said
I've been a Dell tech. They are not trained and are just random people grabbed from third party staffing firms at the last second. Never use "Dell techs do it" as a guide to anything. It's the same as saying "random out of work guy did this".
Sorry, I didn't mean the techs who come out to your location. I mean their US-based phone technical support.
Is that who you mean? Is technical support safe to trust?
-
Also, what is your take? Safe to reboot on a rebuilding array if you have to configure it from the controller config in BIOS?
-
@BRRABill said in Dell PERC Question (Server Down):
Is that who you mean? Is technical support safe to trust?
Probably. But... that was a very reckless decision on their part. So... no.
-
@BRRABill said in Dell PERC Question (Server Down):
Also, what is your take? Safe to reboot on a rebuilding array if you have to configure it from the controller config in BIOS?
No, it's reckless and crazy. You don't induce a failure risk during a repair operation.