
    Hyper-V Failover Cluster FAILURE(S)

      scottalanmiller @Kyle

      @kyle said in Hyper-V Failover Cluster FAILURE(S):

      We rolled out Class B networking, then 48 hours later made the IP changes on the DFS farm, and 2 hours after that we ended up with an identical Event ID 5120, where the cluster lost connection to the SAN.

      But that never happened before?

      An issue here is that changing the networking means a lot of things were changed, not just the subnet mask size.

        Kyle @scottalanmiller

        @scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):

        @kyle said in Hyper-V Failover Cluster FAILURE(S):

        We rolled out Class B networking, then 48 hours later made the IP changes on the DFS farm, and 2 hours after that we ended up with an identical Event ID 5120, where the cluster lost connection to the SAN.

        But that never happened before?

        An issue here is that changing the networking means a lot of things were changed, not just the subnet mask size.

        That's another issue too. Some things that received the new /16 addresses still carry 255.255.255.0 instead of 255.255.0.0, and they said it didn't matter when I questioned them about it.

          scottalanmiller @Kyle

          @kyle said in Hyper-V Failover Cluster FAILURE(S):

          Some things that received the new /16 addresses still carry 255.255.255.0 instead of 255.255.0.0, and they said it didn't matter when I questioned them about it.

          Um, that means they are NOT /16. 255.255.0.0 and /16 are the same thing, just two different ways to write it. It means that they didn't do the /16 as they said, and they knew it, and they lied about not needing to do it. It's true that at times you can have half-broken smaller networks inside of larger ones, but they are broken and not all of your networking will work when it needs to.

          So don't say that they received /16 addressing, because they did not. They are /24 on a /24 network that is broken and can only communicate with a small fraction of the /16.

          That's just broken, so that might easily be the issue.
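
To make the mask point concrete, here is a minimal Python sketch using the standard `ipaddress` module. The 10.10.x.x addresses and the `on_link` helper are made up for illustration; the thread never states the real range. It shows that 255.255.0.0 and /16 describe the same network, and that a host left on 255.255.255.0 only treats its own /24 slice as directly reachable.

```python
import ipaddress

# Hypothetical addresses for illustration only; the real range is not given in the thread.
DOTTED = ipaddress.ip_network("10.10.0.0/255.255.0.0")  # dotted-mask notation
PREFIX = ipaddress.ip_network("10.10.0.0/16")           # prefix-length notation


def on_link(host_ip: str, host_mask: str, peer_ip: str) -> bool:
    """Return True if the host, using its own mask, sees the peer as directly reachable."""
    iface = ipaddress.ip_interface(f"{host_ip}/{host_mask}")
    return ipaddress.ip_address(peer_ip) in iface.network


# Same network, two ways to write it.
print(DOTTED == PREFIX)

# A host left on the /24 mask reaches its own slice...
print(on_link("10.10.5.20", "255.255.255.0", "10.10.5.99"))
# ...but treats the rest of the /16 as off-link and needs a gateway to get there.
print(on_link("10.10.5.20", "255.255.255.0", "10.10.9.99"))
# With the correct mask the same peer is on-link.
print(on_link("10.10.5.20", "255.255.0.0", "10.10.9.99"))
```

Whether the off-link traffic still gets through depends on whether a gateway for it exists, which is why this kind of setup can appear to half work.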

            Kyle @scottalanmiller

            @scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):

            @kyle said in Hyper-V Failover Cluster FAILURE(S):

            Some things that received the new /16 addresses still carry 255.255.255.0 instead of 255.255.0.0, and they said it didn't matter when I questioned them about it.

            Um, that means they are NOT /16. 255.255.0.0 and /16 are the same thing, just two different ways to write it. It means that they didn't do the /16 as they said, and they knew it, and they lied about not needing to do it. It's true that at times you can have half-broken smaller networks inside of larger ones, but they are broken and not all of your networking will work when it needs to.

            So don't say that they received /16 addressing, because they did not. They are /24 on a /24 network that is broken and can only communicate with a small fraction of the /16.

            That's just broken, so that might easily be the issue.

            I know. But being the FNG, I'm not allowed to make changes to anything. I'm only allowed to view and make suggestions that have to be approved. The SAN also points to 192.168.x.x DNS addresses, which I believe could be causing issues as well.

              Kyle

              @scottalanmiller The SAN's two 10G connections, should those be on the same subnet for best practices?

                Dashrender

                The logs say the switches aren’t saturated, but I wonder if network broadcast issues could be a problem here with the new network size.
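
For a rough sense of scale, this small Python sketch compares how many host addresses fit in one broadcast domain at /24 versus /16. The 10.10.0.0 network and the `usable_hosts` helper are placeholders, not the real addressing. It only counts address space; actual broadcast load depends on how many devices are really attached.

```python
import ipaddress


def usable_hosts(cidr: str) -> int:
    """Count host addresses in an IPv4 network, excluding network and broadcast addresses."""
    net = ipaddress.ip_network(cidr)
    return net.num_addresses - 2


# Hypothetical network used only to compare the two prefix lengths.
print(usable_hosts("10.10.0.0/24"))  # 254
print(usable_hosts("10.10.0.0/16"))  # 65534
```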

                  Kyle @Dashrender

                  @dashrender said in Hyper-V Failover Cluster FAILURE(S):

                  The logs say the switches aren’t saturated, but I wonder if network broadcast issues could be a problem here with the new network size.

                  I'm going to have to verify the network settings again tomorrow. Since the 5 NICs associated with the nodes are all over the place, I'm going to guess it has something to do with that.

                  They said they had this exact same issue a few months ago, but the logs do not go back that far, so I cannot compare the events. But having the cluster fail twice in 2 days isn't sitting right with me, since it started occurring just days after switching IP ranges.

                  I've dumped all the logs and documented everything I have found that looks out of place.

                    Obsolesce @Kyle

                    @kyle said in Hyper-V Failover Cluster FAILURE(S):

                    @dashrender said in Hyper-V Failover Cluster FAILURE(S):

                    The logs say the switches aren’t saturated, but I wonder if network broadcast issues could be a problem here with the new network size.

                    I'm going to have to verify the network settings again tomorrow. Since the 5 NICs associated with the nodes are all over the place, I'm going to guess it has something to do with that.

                    They said they had this exact same issue a few months ago, but the logs do not go back that far, so I cannot compare the events. But having the cluster fail twice in 2 days isn't sitting right with me, since it started occurring just days after switching IP ranges.

                    I've dumped all the logs and documented everything I have found that looks out of place.

                    Isn't the SAN network isolated both physically and logically from everything else?

                    Can you show a screenshot of this window: 0574.1.jpg

                    Hard to find a good one on Google, but highlight the network you use for your SAN and show the settings if you can.

                      DustinB3403

                      Half of this thread has been deleted.

                        Dashrender @Obsolesce

                        @tim_g said in Hyper-V Failover Cluster FAILURE(S):

                        @kyle said in Hyper-V Failover Cluster FAILURE(S):

                        @dashrender said in Hyper-V Failover Cluster FAILURE(S):

                        The logs say the switches aren’t saturated, but I wonder if network broadcast issues could be a problem here with the new network size.

                        I'm going to have to verify the network settings again tomorrow. Since the 5 NICs associated with the nodes are all over the place, I'm going to guess it has something to do with that.

                        They said they had this exact same issue a few months ago, but the logs do not go back that far, so I cannot compare the events. But having the cluster fail twice in 2 days isn't sitting right with me, since it started occurring just days after switching IP ranges.

                        I've dumped all the logs and documented everything I have found that looks out of place.

                        Isn't the SAN network isolated both physically and logically from everything else?

                        Nope he said they are not.

                          Kyle @Obsolesce

                          @tim_g said in Hyper-V Failover Cluster FAILURE(S):

                          @kyle said in Hyper-V Failover Cluster FAILURE(S):

                          @dashrender said in Hyper-V Failover Cluster FAILURE(S):

                          The logs say the switches aren’t saturated, but I wonder if network broadcast issues could be a problem here with the new network size.

                          I'm going to have to verify the network settings again tomorrow. Since the 5 NICs associated with the nodes are all over the place, I'm going to guess it has something to do with that.

                          They said they had this exact same issue a few months ago, but the logs do not go back that far, so I cannot compare the events. But having the cluster fail twice in 2 days isn't sitting right with me, since it started occurring just days after switching IP ranges.

                          I've dumped all the logs and documented everything I have found that looks out of place.

                          Isn't the SAN network isolated both physically and logically from everything else?

                          Can you show a screenshot of this window: 0574.1.jpg

                          Hard to find a good one on Google, but highlight the network you use for your SAN and show the settings if you can.

                          @Tim_G, they are not separated. I can access the SAN and the nodes from the same network and vice versa. There are also some places in the config where a /16 network address is paired with a 255.255.255.0 subnet mask.

                          I know this needs to be changed, but as the FNG at the company I am currently restricted to submitting my recommendations in writing to the Director, and then the Sr. Admin and DBA are asked what they think. Le Sigh.......
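
For a written recommendation, a short script can list every interface whose mask does not match the intended /16. This is only a sketch: the device names, addresses, and the `find_mismatched_masks` helper are hypothetical, and the input would have to come from however the configs are actually exported.

```python
import ipaddress

EXPECTED_PREFIX = 16  # the prefix length the new network is supposed to use


def find_mismatched_masks(entries):
    """Return (name, ip, actual_prefix) for entries whose mask is not EXPECTED_PREFIX."""
    bad = []
    for name, ip, mask in entries:
        prefix = ipaddress.ip_interface(f"{ip}/{mask}").network.prefixlen
        if prefix != EXPECTED_PREFIX:
            bad.append((name, ip, prefix))
    return bad


# Hypothetical inventory; names and addresses are placeholders, not from the thread.
inventory = [
    ("node1-mgmt", "10.10.1.11", "255.255.0.0"),
    ("node1-storage", "10.10.2.11", "255.255.255.0"),
    ("san-ctrl-a", "10.10.2.50", "255.255.255.0"),
]

for name, ip, prefix in find_mismatched_masks(inventory):
    print(f"{name}: {ip} is configured as /{prefix}, expected /{EXPECTED_PREFIX}")
```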

                            Kyle

                            @Tim_G, I can get screen grabs tomorrow and some more info. I've been trying to wrap my head around why some of the things were done the way they were.

                              Reid Cooper @Dashrender

                              @dashrender said in Hyper-V Failover Cluster FAILURE(S):

                              @tim_g said in Hyper-V Failover Cluster FAILURE(S):

                              @kyle said in Hyper-V Failover Cluster FAILURE(S):

                              @dashrender said in Hyper-V Failover Cluster FAILURE(S):

                              The logs say the switches aren’t saturated, but I wonder if network broadcast issues could be a problem here with the new network size.

                              I'm going to have to verify the network settings again tomorrow. Since the 5 NICs associated with the nodes are all over the place, I'm going to guess it has something to do with that.

                              They said they had this exact same issue a few months ago, but the logs do not go back that far, so I cannot compare the events. But having the cluster fail twice in 2 days isn't sitting right with me, since it started occurring just days after switching IP ranges.

                              I've dumped all the logs and documented everything I have found that looks out of place.

                              Isn't the SAN network isolated both physically and logically from everything else?

                              Nope he said they are not.

                              That's not good. So it isn't really a SAN, just a normal network with SAN traffic dumped onto it?
