Dell PERC Question (Server Down)
-
@BRRABill
I'm coming late into this thread and I'm having problems discerning exactly what the issue is right now. Please contact your xByte rep Brad and he will get a support request going. Our techs can assist directly and can get Edge officially involved instead of trying to rely on ML posts.
--Todd
-
@Dashrender said in Dell PERC Question (Server Down):
@Lyndsie_xByte said in Dell PERC Question (Server Down):
@BRRABill Thank you for reaching out! Even though I have set up an email notification for this post, I haven't been receiving them. Much appreciate you keeping me in the loop. This is beyond my basic IT knowledge (marketing gal here), but I will alert one of our engineers to see if they can chime in.
Email notices run out after about 5 days or less. I think ML is working on it, but it's a cost issue.
Not exactly. It is a choice to use a service with limits over sending directly and updating records.
-
@todd-at-xByte said
@BRRABill
I'm coming late into this thread and I'm having problems discerning exactly what the issue is right now. Please contact your xByte rep Brad and he will get a support request going. Our techs can assist directly and can get Edge officially involved instead of trying to rely on ML posts.
--Todd
Todd:
I reached out to Brad yesterday to open a case with your tech support, though we already kind of went through them and they sent us to EDGE. I've been having problems with EDGE responding to me, which is why I reached back out to Lyndsie, who set that up the first time.
-
@JaredBusch said in Dell PERC Question (Server Down):
@Dashrender said in Dell PERC Question (Server Down):
@Lyndsie_xByte said in Dell PERC Question (Server Down):
@BRRABill Thank you for reaching out! Even though I have set up an email notification for this post, I haven't been receiving them. Much appreciate you keeping me in the loop. This is beyond my basic IT knowledge (marketing gal here), but I will alert one of our engineers to see if they can chime in.
Email notices run out after about 5 days or less. I think ML is working on it, but it's a cost issue.
Not exactly. It is a choice to use a service with limits over sending directly and updating records.
We tried sending directly and were blacklisted. We could try to get that to work but have tried this in the past and not had luck. We couldn't get even test messages to go out locally. If we switched to local, email would just stop for nearly everyone, all the time, completely.
-
@todd-at-xByte said in Dell PERC Question (Server Down):
@BRRABill
I'm coming late into this thread and I'm having problems discerning exactly what the issue is right now. Please contact your xByte rep Brad and he will get a support request going. Our techs can assist directly and can get Edge officially involved instead of trying to rely on ML posts.
--Todd
Why not get Edge responding here?
-
@todd-at-xByte said in Dell PERC Question (Server Down):
I'm coming late into this thread and I'm having problems discerning exactly what the issue is right now.
From what I could tell, the issue is that Edge does not respond.
-
@StrongBad said
From what I could tell, the issue is that Edge does not respond.
Yes, the tech who was working with me has not responded.
Now, in the past few weeks I have dealt with people on vacation, and people who were sick, and everything else. So I always like to give them the benefit of the doubt as to why they are not responding.
-
@BRRABill Any update to share?
-
This was the latest e-mail from earlier this afternoon:
"That information is good. I was hoping that your iDRAC log would shine some light on what the actual fault error was being recorded when the drive array is actually going down. I'm working on this now with one of our SSD engineers and I am hoping to have some additional information or potential resolutions about this issue today."
-
Checking in again.
-
Have not heard from them today, sadly.
Let me go rattle the cage...
-
The cage rattling did nothing.
In other news, the RAID array crashed again this morning. Management is starting to ask questions, so I think I am just going to go back to the old DELL spinning rust drives. I don't think I have an option at this point.
This time (this is the fourth time this has happened in a month) was similar to times 1 and 2. In both those instances the entire virtual disk disappeared, as did the physical disks. If you boot into the PERC config, you will see under the FOREIGN tab that the VD and the PD are both there. "Simply" reimport the config, and you're all set.
The third time, the array was still there; it was the disk in 0:0 that was missing. So we cleared the foreign config off of that.
This fourth time, I took more notice of what happened when the array came back up. Sure enough, it was 0:0 that was degraded. But I don't know if I can trust that it is just that drive.
Here are some pictures of the PERC screens...
-
It's possible that the PERC is bad, I suppose.
-
@scottalanmiller said
It's possible that the PERC is bad, I suppose.
Why would you think that as opposed to it being the disks?
Or are you just spitballing ideas?
Of course DELL would say it was the disks, but the first thing that @BradfromxByte said the first time was that it had to be the disk. It was the guys from EDGE that said no.
-
@scottalanmiller said in Dell PERC Question (Server Down):
It's possible that the PERC is bad, I suppose.
BTW: this PERC did not come with the server. But it was purchased new from DELL.
-
@BRRABill said in Dell PERC Question (Server Down):
@scottalanmiller said
It's possible that the PERC is bad, I suppose.
Why would you think that as opposed to it being the disks?
Or are you just spitballing ideas?
Of course DELL would say it was the disks, but the first thing that @BradfromxByte said the first time was that it had to be the disk. It was the guys from EDGE that said no.
Just saying that it is one of the potential culprits. The disks are not failing per se, they are going foreign. While it is certainly possible that the disks are at fault, doesn't it seem more likely to be the controller? What likely failure condition on a disk would result in the controller thinking that it is a foreign device?
-
@scottalanmiller said
Just saying that it is one of the potential culprits. The disks are not failing per se, they are going foreign. While it is certainly possible that the disks are at fault, doesn't it seem more likely to be the controller? What likely failure condition on a disk would result in the controller thinking that it is a foreign device?
What I have heard from DELL, xByte, and here on ML is that if the firmware on the EDGE disks has some sort of issue communicating, the PERC will set it to foreign.
That's why they instantly think it is the drive.
-
That's not unreasonable. Of course, if the same error happens on the PERC side, it will think that the issue is from the Edge side and behave the same way.
-
@scottalanmiller said in Dell PERC Question (Server Down):
@BRRABill said in Dell PERC Question (Server Down):
@scottalanmiller said
It's possible that the PERC is bad, I suppose.
Why would you think that as opposed to it being the disks?
Or are you just spitballing ideas?
Of course DELL would say it was the disks, but the first thing that @BradfromxByte said the first time was that it had to be the disk. It was the guys from EDGE that said no.
Just saying that it is one of the potential culprits. The disks are not failing per se, they are going foreign. While it is certainly possible that the disks are at fault, doesn't it seem more likely to be the controller? What likely failure condition on a disk would result in the controller thinking that it is a foreign device?
I'm tending to agree with Scott here - it seems really odd for the controller to consider the drives and their config to be completely foreign. Of course this is the problem that we've been talking about ever since one of my first postings with Scott on SW well over 5 years ago - when you're mixing vendors, at what point will someone concede that it's their stuff that's broken?
Scott's been saying since that time, oh so long ago, that Dell (in this case) can't refuse to provide warranty/support on their equipment just because you're using someone else's drives. This looks like a good time to prove that.
Also, you could be having two problems - the drive (0:0) and the PERC controller could both be bad. In this case, perhaps when the drive freaks out for a moment it's causing the PERC to freak out too.
Of course I suppose the drive could be freaking out so far outside of Dell's expectations as to cause the PERC to lose its mind, but that seems ... you pick a word.
-
Yes, Dell has a responsibility to support the PERC, but only if the PERC is at fault. What makes this a tough situation is that we don't know where the fault lies.