    XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!

    Category: IT Discussion
    Tags: xenserver, xenserver 6.2, iscsi, san
    243 Posts 10 Posters 48.1k Views
    • CitrixNewbJD @seal

      @seal

      Frank, can we speak on the phone for a minute so I can be sure I can intelligently talk to the guy when demanding his money?

      • scottalanmiller @CitrixNewbJD

        @CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

        @momurda

        Having been through this once before, and learning the hard way, I do normally have a physical DC.

        This is absolutely the wrong response. You should never have a physical DC, ever. There are zero issues here with virtualization. There are two problems:

        • Zero AD redundancy
        • An inverted pyramid of doom (single storage for all systems)

        Fixing either of those anti-practices would have saved you. Physical would have zero benefit and is the polar opposite of the reaction that you should have.

        • scottalanmiller @CitrixNewbJD

          @CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

          @momurda

          I've been using the root authentication for everything.

          So we are safe there.

          • scottalanmiller

            More on the IPOD: http://www.smbitjournal.com/2013/06/the-inverted-pyramid-of-doom/

            And in video form from MangoCon:

            [Embedded YouTube video]

            • CitrixNewbJD

              So, when looking for places to turn off AD integration, I see this...

              0_1482873847039_Screenshot 2016-12-27 15.23.59.png

              • scottalanmiller

                It's not pool integration that is the issue, it's SAN integration. Check the SAN (PowerVault) interface instead.

                • CitrixNewbJD

                  @seal Just came across these items on the SAN interface. Dental_Data and SpindleMedia are critical, and it looks like those VDs failed.

                  PROFILE FOR STORAGE ARRAY: MDS-Spindle01 (12/27/16 3:28:58 PM) 
                   
                   
                  STANDARD VIRTUAL DISKS------------------------------ 
                   
                  SUMMARY 
                   
                     Number of standard virtual disks: 3 
                   
                     See other Virtual Disks sub-tabs for premium feature information. 
                   
                   
                     NAME          STATUS  CAPACITY  RAID LEVEL  DISK GROUP  DRIVE TYPE   
                     Dental_Data   Failed  1.495 TB  5           0           SAS          
                     SpindleMedia  Failed  2.862 TB  5           0           SAS          
                     Virtual       Failed  1.367 TB  5           0           SAS          
                   
                  DETAILS 
                   
                   
                     Virtual Disk name:                      Dental_Data                                       
                                                                                                               
                        Virtual Disk status:                 Failed                                            
                                                                                                               
                        Capacity:                            1.495 TB                                          
                        Virtual Disk world-wide identifier:  60:02:4e:80:00:7b:78:6a:00:00:04:13:4a:96:70:f3   
                        Subsystem ID (SSID):                 1                                                 
                        Associated disk group:               0                                                 
                        RAID level:                          5                                                 
                                                                                                               
                        Physical Disk type:                  Serial Attached SCSI (SAS)                        
                        Enclosure loss protection:           No                                                
                                                                                                               
                        Preferred owner:                     RAID Controller Module in slot 1                  
                        Current owner:                       RAID Controller Module in slot 1                  
                   
                   
                        Segment size:                                       128 KB     
                        Capacity reserved for future segment size changes:  Yes        
                        Maximum future segment size:                        2,048 KB   
                        Modification priority:                              High       
                   
                   
                        Read cache:                            Enabled    
                        Write cache:                           Enabled    
                           Write cache without batteries:      Disabled   
                           Write cache with mirroring:         Enabled    
                        Flush write cache after (in seconds):  10.00      
                        Dynamic cache read prefetch:           Enabled    
                                                                          
                        Enable background media scan:          Enabled    
                        Media scan with consistency check:     Enabled    
                                                                          
                        Pre-Read consistency check:            Disabled   
                   
                   
                     Virtual Disk name:                      SpindleMedia                                      
                                                                                                               
                        Virtual Disk status:                 Failed                                            
                                                                                                               
                        Capacity:                            2.862 TB                                          
                        Virtual Disk world-wide identifier:  60:02:4e:80:00:70:ed:06:00:00:07:f5:4d:ba:7b:fb   
                        Subsystem ID (SSID):                 2                                                 
                        Associated disk group:               0                                                 
                        RAID level:                          5                                                 
                                                                                                               
                        Physical Disk type:                  Serial Attached SCSI (SAS)                        
                        Enclosure loss protection:           No                                                
                                                                                                               
                        Preferred owner:                     RAID Controller Module in slot 0                  
                        Current owner:                       RAID Controller Module in slot 1                  
                   
                   
                        Segment size:                                       128 KB     
                        Capacity reserved for future segment size changes:  Yes        
                        Maximum future segment size:                        2,048 KB   
                        Modification priority:                              High       
                   
                   
                        Read cache:                            Enabled    
                        Write cache:                           Enabled    
                           Write cache without batteries:      Disabled   
                           Write cache with mirroring:         Enabled    
                        Flush write cache after (in seconds):  10.00      
                        Dynamic cache read prefetch:           Enabled    
                                                                          
                        Enable background media scan:          Enabled    
                        Media scan with consistency check:     Enabled    
                                                                          
                        Pre-Read consistency check:            Disabled   
                   
                   
                     Virtual Disk name:                      Virtual                                           
                                                                                                               
                        Virtual Disk status:                 Failed                                            
                                                                                                               
                        Capacity:                            1.367 TB                                          
                        Virtual Disk world-wide identifier:  60:02:4e:80:00:70:ed:06:00:00:04:31:4a:96:73:09   
                        Subsystem ID (SSID):                 0                                                 
                        Associated disk group:               0                                                 
                        RAID level:                          5                                                 
                                                                                                               
                        Physical Disk type:                  Serial Attached SCSI (SAS)                        
                        Enclosure loss protection:           No                                                
                                                                                                               
                        Preferred owner:                     RAID Controller Module in slot 0                  
                        Current owner:                       RAID Controller Module in slot 1                  
                   
                   
                        Segment size:                                       128 KB     
                        Capacity reserved for future segment size changes:  Yes        
                        Maximum future segment size:                        2,048 KB   
                        Modification priority:                              High       
                   
                   
                        Read cache:                            Enabled    
                        Write cache:                           Enabled    
                           Write cache without batteries:      Disabled   
                           Write cache with mirroring:         Enabled    
                        Flush write cache after (in seconds):  10.00      
                        Dynamic cache read prefetch:           Enabled    
                                                                          
                        Enable background media scan:          Enabled    
                        Media scan with consistency check:     Enabled    
                                                                          
                        Pre-Read consistency check:            Disabled   
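
For anyone who wants to pull the key facts out of a profile dump like this one, here is a minimal sketch. It assumes the SUMMARY rows keep the column order shown above. The `SUMMARY` and `OWNERS` values are copied by hand from this profile, and the helper name `parse_summary` is made up for illustration. It is not part of any Dell or Citrix tool.

```python
import re

# Rows copied from the SUMMARY table of the profile above.
SUMMARY = """\
NAME          STATUS  CAPACITY  RAID LEVEL  DISK GROUP  DRIVE TYPE
Dental_Data   Failed  1.495 TB  5           0           SAS
SpindleMedia  Failed  2.862 TB  5           0           SAS
Virtual       Failed  1.367 TB  5           0           SAS
"""

# (preferred, current) RAID controller slot per virtual disk, from DETAILS.
OWNERS = {
    "Dental_Data": (1, 1),
    "SpindleMedia": (0, 1),
    "Virtual": (0, 1),
}

# One data row: name, status, capacity in TB, RAID level, disk group, drive type.
ROW = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<status>\S+)\s+(?P<tb>[\d.]+) TB\s+"
    r"(?P<raid>\d+)\s+(?P<group>\d+)\s+(?P<drive>\S+)\s*$"
)


def parse_summary(text):
    """Return one dict per virtual disk row; the header line does not match ROW."""
    disks = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            disks.append({
                "name": match["name"],
                "status": match["status"],
                "capacity_tb": float(match["tb"]),
                "raid_level": int(match["raid"]),
                "disk_group": int(match["group"]),
            })
    return disks


disks = parse_summary(SUMMARY)
failed = [d["name"] for d in disks if d["status"] == "Failed"]
groups = {d["disk_group"] for d in disks}
total_tb = sum(d["capacity_tb"] for d in disks)
moved = sorted(name for name, (pref, cur) in OWNERS.items() if pref != cur)

print("Failed virtual disks:", failed)
print("Disk groups involved:", sorted(groups))
print(f"Capacity affected: {total_tb:.3f} TB")
print("Not on preferred controller:", moved)
```

The point of the sketch: all three virtual disks sit on the same disk group, so one array failure is enough to account for every failed VD. Two of them are also owned by the controller in slot 1 instead of their preferred slot 0. The profile shows that, but it does not say why.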
                  

                  0_1482874726865_Screenshot 2016-12-27 15.38.36.png

                  • scottalanmiller

                    Oh look, on top of everything else, they left you with RAID 5, too. Figures. Whoever set this up really set you up for failure.

                    • scottalanmiller

                      Your predecessor definitely pulled this on you: https://mangolassi.it/topic/11852/why-it-builds-a-house-of-cards

                      • scottalanmiller

                        Looks like, on top of other problems, the SAN has died. It's hard to tell from this, but it looks like those are the LUNs that hold all of your VMs?

                        • momurda

                          So 2 drives failed at once? You should be able to go into the server room and see some sort of blinky light pattern that indicates what/how many drives are gone.
                          Did you lose a RAID Controller?

                          • NerdyDad

                            Dear God I pray that you have backups outside of the environment. Please tell me that you do. Another NAS, tapes, diskettes, something?

                            • scottalanmiller @momurda

                              @momurda said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

                              So 2 drives failed at once? You should be able to go into the server room and see some sort of blinky light pattern that indicates what/how many drives are gone.
                              Did you lose a RAID Controller?

                              It's a dual controller device. So in theory it should fail over. But in reality, they rarely do.

                              • scottalanmiller @NerdyDad

                                @NerdyDad said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

                                Dear God I pray that you have backups outside of the environment. Please tell me that you do. Another NAS, tapes, diskettes, something?

                                At this point, recovering from backup to a new cluster might be the best way to go. The SAN is worthless if the arrays have failed. And the local servers probably don't have the necessary storage to run without it. If the array is really lost, the old hardware has probably dropped to a zero value level. Time to get something new in and recover to that ASAP.

                                • scottalanmiller @scottalanmiller

                                  @scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

                                  @momurda said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

                                  So 2 drives failed at once? You should be able to go into the server room and see some sort of blinky light pattern that indicates what/how many drives are gone.
                                  Did you lose a RAID Controller?

                                  It's a dual controller device. So in theory it should fail over. But in reality, they rarely do.

                                  But if drives are lost, that won't help.

                                  • seal

                                    Isn't this saying the virtual drives for each failed? This should be different than a physical drive failure, right? Or am I reading something wrong?

                                    • Dashrender @scottalanmiller

                                      @scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

                                      @CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

                                      @momurda

                                      Having been through this once before, and learning the hard way, I do normally have a physical DC.

                                      This is absolutely the wrong response. You should never have a physical DC, ever. There are zero issues here with virtualization. There are two problems:

                                      • Zero AD redundancy
                                      • An inverted pyramid of doom (single storage for all systems)

                                      Fixing either of those anti-practices would have saved you. Physical would have zero benefit and is the polar opposite of the reaction that you should have.

                                      Having a physical DC in this situation would probably have saved him. That said, I agree it's not the solution. If you really wanted to have a DC outside this cluster, fine, but you still virtualize that third server, then install a DC on that.

                                      • scottalanmiller @seal

                                        @seal said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

                                        Isn't this saying the virtual drives for each failed? This should be different than a physical drive failure, right? Or am I reading something wrong?

                                        Well, yes and no. You are correct: the warning is that the LDs (the virtual disks in the profile) have failed. But LDs fail when their underlying array fails, and that array is built on physical drives. So for the LDs to fail, either the array that they share has failed, which on RAID 5 means more than one of its drives has failed, or both controllers have failed. In this case, since two utility LUNs are still hanging around, we are guessing that the controller(s) are intact and only the array has failed.
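
A toy model of that dependency chain, in case it helps. It assumes standard parity behavior: RAID 5 survives one failed member drive and RAID 6 survives two. The `DiskGroup` class and the status labels are made up for illustration and are not taken from the PowerVault software. The count of two failed drives is a guess for the sake of the example, since the profile does not say how many drives were lost.

```python
from dataclasses import dataclass, field

# Member-drive failures a single group tolerates before the array is lost.
# RAID 5 uses single parity; RAID 6 uses double parity.
TOLERATED_FAILURES = {5: 1, 6: 2}


@dataclass
class DiskGroup:
    raid_level: int
    failed_drives: int = 0
    virtual_disks: list = field(default_factory=list)

    def status(self):
        """Array state derived only from how many member drives are gone."""
        if self.failed_drives == 0:
            return "Optimal"
        if self.failed_drives <= TOLERATED_FAILURES[self.raid_level]:
            return "Degraded"
        return "Failed"

    def virtual_disk_status(self):
        """Every virtual disk carved from the group inherits the group's state."""
        return {name: self.status() for name in self.virtual_disks}


# Disk group 0 from the profile, with a hypothetical two failed drives.
group = DiskGroup(
    raid_level=5,
    failed_drives=2,
    virtual_disks=["Dental_Data", "SpindleMedia", "Virtual"],
)
print(group.virtual_disk_status())
```

With one failed drive the same group would report degraded but still serve data. The second failure is what takes every virtual disk down together.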

                                        • scottalanmiller @Dashrender

                                          @Dashrender said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

                                          having a physical in this situation would have probably saved him.

                                          Don't feed the crazy. Physical can never save you. You are mixing assumptions to come to the wrong conclusion. Physical will never help. What helps is separate storage.

                                          Physical with shared storage = fail just the same.
                                          Physical with separate storage = just fine.
                                          Virtual with shared storage = fail just the same.
                                          Virtual with separate storage = just fine.

                                          As you can see, physical vs virtual is unrelated. It's all about the storage separation and nothing else.
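
The four cases above can be written as a tiny truth table. This is only a sketch of the argument. The function name `dc_survives` is hypothetical, and the single assumption baked in is that the shared storage is the thing that failed, as it did in this thread.

```python
from itertools import product


def dc_survives(platform, storage, shared_storage_failed=True):
    """A DC stays up unless it depends on the storage that failed.

    `platform` is accepted but never consulted: that is the whole point.
    """
    depends_on_failed_storage = storage == "shared" and shared_storage_failed
    return not depends_on_failed_storage


outcomes = {
    (platform, storage): dc_survives(platform, storage)
    for platform, storage in product(("physical", "virtual"), ("shared", "separate"))
}

for (platform, storage), ok in outcomes.items():
    print(f"{platform:8} with {storage:8} storage = {'just fine' if ok else 'fail'}")
```

Flipping `platform` never changes a row. Only `storage` does.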

                                          • scottalanmiller @Dashrender

                                            @Dashrender said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:

                                            That said, I agree it's not the solution. If you really wanted to have a DC outside this cluster, fine, but you still virtualize that third server, then install a DC on that.

                                            While I generally agree that "outside the cluster" is good in extreme cases where you have extreme levels of AD dependencies, that's not necessary. Same cluster with different storage is all that is needed. Same scenario on a Scale cluster, for example, would not have a problem even being on a single cluster. Having "inter-cluster" protection is good, but a whole level beyond what is needed here.
