    Solved PCI bus error

    IT Discussion
    dell poweredge poweredge 2850 lspci bmc ipmi
    • JaredBuschJ
      JaredBusch
      last edited by

      So an old-ass Dell PowerEdge 2850 had a network failure. It responds to ping but nothing else.

      I got into the BMC (not even iDRAC on this ancient thing...).

       .\ipmish.exe -ip 192.168.10.102 -u root -p WTFPWD sel get
      

      913949cb-8f24-4ec1-b946-5eae519f56cc-image.png

      lspci leads me to think that this is the NIC?
      1811d4b2-bb05-4048-911f-8945210ff28b-image.png
      and it is a 2 port NIC
      7de3ac87-2cf7-4b8f-a8e1-80060fccb07a-image.png
      from Intel
      8710d6a6-15ea-4f74-82b3-87fc10650a2b-image.png

      Am I correct that the NIC is failing?

      • 1
        1337
        last edited by 1337

        No, it's not the NIC.

        It says PCIe error bus 0, device 6, function 0.

        That's 00:06.0, which is the E7520 PCI bridge. I think that is connected directly to the chipset. I can't remember exactly what was on the CPU and what was on the chipset back in those days.

        Either way the motherboard/CPU is done.

        Or I guess technically speaking a driver error caused by drive corruption could have caused the same error. After all it's the OS that gives the error message here.

        • JaredBuschJ
          JaredBusch
          last edited by

          Just a little old..
          b9092a6c-34f4-442f-98fe-04c044f108bd-image.png

          It is running RHEL 4
          a47eeea1-df22-4176-8967-232ec0d621d8-image.png

          • travisdh1T
            travisdh1 @JaredBusch
            last edited by

            @JaredBusch Most likely a failing NIC, yes.

            Also, talk about a blast from the past. That's the same era of hardware as a Vostro 220 I pulled out of service yesterday.

            • DustinB3403D
              DustinB3403 @travisdh1
              last edited by

              @travisdh1 said in PCI bus error:

              @JaredBusch Most likely a failing NIC, yes.

              Also, talk about a blast from the past. That's the same era of hardware as a Vostro 220 I pulled out of service yesterday.

              Jared seems to be trying to keep this unit in service, unless of course there is a good reason to replace it.

              • travisdh1T
                travisdh1 @DustinB3403
                last edited by

                @DustinB3403 said in PCI bus error:

                @travisdh1 said in PCI bus error:

                @JaredBusch Most likely a failing NIC, yes.

                Also, talk about a blast from the past. That's the same era of hardware as a Vostro 220 I pulled out of service yesterday.

                Jared seems to be trying to keep this unit in service, unless of course there is a good reason to replace it.

                Yeah, this is @JaredBusch tho, I think we all know he's not keeping it in service by his choice.

                • JaredBuschJ
                  JaredBusch
                  last edited by

                  So, no one can help me confirm that the NIC is failing?

                  Also, anything I can look at to see if this is the onboard NIC or the NIC on a separate card?

                  I'll be on site on Tuesday.

                  • ObsolesceO
                    Obsolesce @JaredBusch
                    last edited by

                    @JaredBusch said in PCI bus error:

                    So, no one can help me confirm that the NIC is failing?

                    Also, anything I can look at to see if this is the onboard NIC or the NIC on a separate card?

                    I'll be on site on Tuesday.

                    I did not see any info indicating NIC failure. No idea what device 6 is.

                    Looking up the Intel NIC you linked shows this:

                    Screenshot_20201213-113521_Edge.jpg

                      • JaredBuschJ
                        JaredBusch @1337
                        last edited by

                        @Pete-S said in PCI bus error:

                        After all it's the OS that gives the error message here.

                        No, the error message is from the BMC (predecessor to iDRAC).

                        • JaredBuschJ
                          JaredBusch @1337
                          last edited by

                          @Pete-S said in PCI bus error:

                          No, it's not the NIC.
                          It says PCIe error bus 0, device 6, function 0.

                          That is why I wanted others to look. The way I read the man page it seemed that the bus was omitted when using lspci.

                          • 1
                            1337 @JaredBusch
                            last edited by

                            @JaredBusch said in PCI bus error:

                            @Pete-S said in PCI bus error:

                            No, it's not the NIC.
                            

                            It says PCIe error bus 0, device 6, function 0.

                            That is why I wanted others to look. The way I read the man page it seemed that the bus was omitted when using lspci.

                            No, it's <bus>:<device>.<func>

                            But it's a bit confusing nowadays compared to how it was in the old days, when all the devices were on the same bus.
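
To make the notation concrete, here is a minimal Python sketch that converts between the bus/device/function numbers reported in an event log and the address string lspci prints. The names `PciAddress`, `parse_bdf`, and `format_bdf` are made up for illustration. It assumes the usual lspci output, where the fields are hexadecimal and a four-digit domain may be prefixed.

```python
import re
from typing import NamedTuple


class PciAddress(NamedTuple):
    """Hypothetical container for a PCI bus/device/function triple."""
    bus: int
    device: int
    function: int


# Optional 4-hex-digit domain, then <bus>:<device>.<func>, all hexadecimal.
_BDF = re.compile(
    r"^(?:[0-9a-fA-F]{4}:)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_bdf(text: str) -> PciAddress:
    """Parse an lspci-style address such as '00:06.0' (domain is ignored)."""
    match = _BDF.match(text.strip())
    if match is None:
        raise ValueError(f"not a <bus>:<device>.<func> address: {text!r}")
    bus, device, function = (int(part, 16) for part in match.groups())
    return PciAddress(bus, device, function)


def format_bdf(bus: int, device: int, function: int) -> str:
    """Format numeric bus/device/function the way lspci prints them."""
    return f"{bus:02x}:{device:02x}.{function:x}"


if __name__ == "__main__":
    # The event in this thread: bus 0, device 6, function 0.
    print(format_bdf(0, 6, 0))
    print(parse_bdf("00:06.0"))
```

So the event for bus 0, device 6, function 0 corresponds to the lspci line that starts with `00:06.0`, not to a line whose device number merely looks similar.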

                            • JaredBuschJ
                              JaredBusch
                              last edited by JaredBusch

                              So the customer asked me to spec out a replacement server.

                              This is what I am thinking to recommend.

                              Dell PowerEdge R6515 – Chassis with 8x 2.5” drives
                              AMD EPYC 7262 or 7302P 
                              1x 16GB RDIMM 3200MT/s
                              PERC H730P
                              3x 480GB SSD SATA Mix Use Hot plug
                              Dual hot plug power supply
                              Riser Config 1 1x16LP PCIe slot
                              iDRAC 9 Express
                              BOSS controller card with 2 M.2 240GB RAID 1
                              

                              Comments?

                              • scottalanmillerS
                                scottalanmiller
                                last edited by

                                Seems like anything will work in this scenario given how old the original was. What's the workload?

                                • JaredBuschJ
                                  JaredBusch @scottalanmiller
                                  last edited by JaredBusch

                                  @scottalanmiller said in PCI bus error:

                                  Seems like anything will work in this scenario given how old the original was. What's the workload?

                                  A proprietary system from TopTech

                                  Server load is nothing normally. The system is catching up from a planned maintenance window at the moment.
                                  a6ac8ef6-a47b-4f05-8623-f3abee52d37c-image.png
                                  c911289b-d704-489b-a415-0c47e443006c-image.png

                                  I'll get another snapshot once it is caught up.

                                  • JaredBuschJ
                                    JaredBusch @scottalanmiller
                                    last edited by

                                    @scottalanmiller said in PCI bus error:

                                    Seems like anything will work in this scenario given how old the original was.

                                    I am future planning. The system will get replaced by a new version.

                                    But that requires infrastructure updates at the terminals also.

                                    • scottalanmillerS
                                      scottalanmiller @JaredBusch
                                      last edited by

                                      @JaredBusch said in PCI bus error:

                                      @scottalanmiller said in PCI bus error:

                                      Seems like anything will work in this scenario given how old the original was.

                                      I am future planning. The system will get replaced by a new version.

                                      But that requires infrastructure updates at the terminals also.

                                      Well sure, but even the smallest modern system will be orders of magnitude faster. Hard to believe anything wouldn't have the "oomph" for the task unless the workload isn't just updated, but overhauled.

                                      • JaredBuschJ
                                        JaredBusch @scottalanmiller
                                        last edited by

                                        @scottalanmiller said in PCI bus error:

                                        Well sure, but even the smallest modern system will be orders of magnitude faster. Hard to believe anything wouldn't have the "oomph" for the task unless the workload isn't just updated, but overhauled.

                                        Right, the workload will not change. That is pretty consistent. The specs for the new version are higher. But still, yes, anything modern will power it.

                                        • JaredBuschJ
                                          JaredBusch
                                          last edited by

                                          Yeah it sleeps all day long..

                                          8799298e-ab9d-4111-89c2-31a82d3737cb-image.png

                                          • DashrenderD
                                            Dashrender @JaredBusch
                                            last edited by

                                            @JaredBusch said in PCI bus error:

                                            So the customer asked me to spec out a replacement server.

                                            This is what I am thinking to recommend.

                                            Dell PowerEdge R6515 – Chassis with 8x 2.5” drives
                                            AMD EPYC 7262 or 7302P 
                                            1x 16GB RDIMM 3200MT/s
                                            PERC H730P
                                            3x 480GB SSD SATA Mix Use Hot plug
                                            Dual hot plug power supply
                                            Riser Config 1 1x16LP PCIe slot
                                            iDRAC 9 Express
                                            BOSS controller card with 2 M.2 240GB RAID 1
                                            

                                            Comments?

                                            Does iDRAC 9 Express allow remote access to the console?

                                            • 1
                                              1337 @JaredBusch
                                              last edited by 1337

                                              @JaredBusch said in PCI bus error:

                                              So the customer asked me to spec out a replacement server.

                                              This is what I am thinking to recommend.

                                              Dell PowerEdge R6515 – Chassis with 8x 2.5” drives
                                              AMD EPYC 7262 or 7302P 
                                              1x 16GB RDIMM 3200MT/s
                                              PERC H730P
                                              3x 480GB SSD SATA Mix Use Hot plug
                                              Dual hot plug power supply
                                              Riser Config 1 1x16LP PCIe slot
                                              iDRAC 9 Express
                                              BOSS controller card with 2 M.2 240GB RAID 1
                                              

                                              Comments?

                                              Yeah, I assume this is a low budget spec.

                                              Pick the cheapest EPYC Rome unless you expect the server to handle a lot more in the future. The 7232P is the cheapest.
                                              Also skip the BOSS card and pick 2x 960GB read-intensive drives in RAID 1. Since you have the H730P and its cache, RAID 1 should be more than fine.

                                              I mean, compared to the old machine, you could also use the H330 card. You don't get the cache, but the SSDs have cache, and RAID 1/10 doesn't require any parity calculations, so the H330 will get the job done.
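
The "no parity calculations" point can be illustrated with a toy Python sketch. RAID 5 style parity is an XOR across the data blocks of a stripe, which has to be computed on every write, while a mirror simply writes the same block to each drive. The function names are hypothetical, and this only illustrates the arithmetic; it is not how a PERC controller is implemented.

```python
from functools import reduce
from operator import xor


def raid5_parity(blocks: list[bytes]) -> bytes:
    """XOR equal-length data blocks byte by byte to get the parity block."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks in a stripe must have the same length")
    return bytes(reduce(xor, column) for column in zip(*blocks))


def raid1_write(block: bytes, mirrors: int = 2) -> list[bytes]:
    """A mirror needs no computation: each drive gets an identical copy."""
    return [block] * mirrors


if __name__ == "__main__":
    d0, d1 = b"\x0f\xf0", b"\xff\x00"
    parity = raid5_parity([d0, d1])
    # XOR-ing the parity with the surviving block rebuilds the lost one.
    rebuilt_d0 = raid5_parity([parity, d1])
    print(parity.hex(), rebuilt_d0 == d0)
    print(raid1_write(d0))
```

With mirroring there is nothing to compute per write, which is the reasoning behind a cacheless controller being adequate for a light RAID 1 workload.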
