Hyper-V Failover Cluster FAILURE(S)
-
Twice in as many days we have had a complete failure where all nodes and VMs go critical and then hard down.
Setup is:
6 nodes configured with Hyper-V Failover Cluster (can't get a straight answer as to who set it up).
All nodes are connected to a Tegile HA2300 with a single 7TB LUN for 42 VMs with 2 10G iSCSI subnetted NICs. Twice in as many days the cluster has thrown a server Event ID 5120 (STATUS_DEVICE_BUSY) and lists the SAN Volume.
The day we swapped from a Class C to a Class B network, this occurred 2 hours later and again 36 hours later.
I have found some MSDN articles that point to subnetting being the issue here and the 4 reference links at the bottom of the page:
My question is what do you guys think?
-
Is your SAN on your production network? i.e. not a separate switch away from the normal network?
-
@dashrender said in Hyper-V Failover Cluster FAILURE(S):
Is your SAN on your production network? i.e. not a separate switch away from the normal network?
Yes. The SAN is connected via iSCSI to a switch that converts it to 10G CAT 6 and connected to the Hyper-V Cluster. The nodes have separate NICs for different tasks: 2 for failover to the SAN, 1 for migration, and 2 for failover to the network.
-
@kyle said in Hyper-V Failover Cluster FAILURE(S):
@dashrender said in Hyper-V Failover Cluster FAILURE(S):
Is your SAN on your production network? i.e. not a separate switch away from the normal network?
Yes. The SAN is connected via iSCSI to a switch that converts it to 10G CAT 6 and connected to the Hyper-V Cluster. The nodes have separate NICs for different tasks: 2 for failover to the SAN, 1 for migration, and 2 for failover to the network.
So all those connections go into one switch?
-
@kyle said in Hyper-V Failover Cluster FAILURE(S):
I have found some MSDN articles that point to subnetting being the issue here and the 4 reference links at the bottom of the page:
All of those are about issues with multi-subnetting. But you are not doing that here, right? So those would not be applicable.
-
@scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):
@kyle said in Hyper-V Failover Cluster FAILURE(S):
I have found some MSDN articles that point to subnetting being the issue here and the 4 reference links at the bottom of the page:
All of those are about issues with multi-subnetting. But you are not doing that here, right? So those would not be applicable.
All IPs on those NICs are multi-subnetted.
-
@kyle said in Hyper-V Failover Cluster FAILURE(S):
@scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):
@kyle said in Hyper-V Failover Cluster FAILURE(S):
I have found some MSDN articles that point to subnetting being the issue here and the 4 reference links at the bottom of the page:
All of those are about issues with multi-subnetting. But you are not doing that here, right? So those would not be applicable.
All IPs on those NICs are multi-subnetted.
Huh? Why? What does "multi-subnetted IP" even mean? An IP cannot be on more than one subnet.
-
@scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):
@kyle said in Hyper-V Failover Cluster FAILURE(S):
@scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):
@kyle said in Hyper-V Failover Cluster FAILURE(S):
I have found some MSDN articles that point to subnetting being the issue here and the 4 reference links at the bottom of the page:
All of those are about issues with multi-subnetting. But you are not doing that here, right? So those would not be applicable.
All IPs on those NICs are multi-subnetted.
Huh? Why? What does "multi-subnetted IP" even mean? An IP cannot be on more than one subnet.
I was wondering the same.
-
I try to hold back being overly pedantic about semantics, contrary to popular belief, but I have a feeling that this might be a case where not pointing out incorrect semantics might have helped lead to other misconceptions.
In the OP, it was mentioned that a Class C was moved to a Class B. There are no classes in IPs. There have not been for decades. Not since the introduction of CIDR in 1993, predating basically all of us in networking. The idea of classes was made legacy prior to the explosive use of the Internet or IT as a field. It's nearly a quarter of a century now (24 years.) None of us active in the community are old enough to have realistically seen the Class based IP world. That it gets mentioned still indicates some weird teaching somewhere that is getting repeated. I'm just old enough that some material in the 1990s was still teaching it.
Class based teaching could lead to a misunderstanding of terms like subnetting and multi-subnet, as these don't mean what you might think based on class based networking. Subnetting meant something very different before 1993 than it has since then. It shouldn't be called subnetting at all, but just netting, but keeping the name is clearer, as some terms, like subnet mask, have remained in place.
If we were more pedantic about Class C and Class B being impossible terms, I think it might have cleared up some other confusions as well. As is often the case, knowing the right names for things and using the right names often leads to a better understanding of them.
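To make the class versus CIDR point concrete, here is a minimal sketch using Python's standard `ipaddress` module. The `legacy_class` helper and the sample address are my own illustration, not anything from the OP's environment. It shows that the old class label only ever depended on the first octet, while under CIDR the prefix length is chosen independently of it.

```python
import ipaddress

def legacy_class(ip: str) -> str:
    """Return the pre-1993 classful label implied by the first octet (illustrative helper)."""
    first_octet = int(ipaddress.IPv4Address(ip)) >> 24
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    return "D/E"

# Hypothetical address: it sits in the old "Class C" range,
# yet nothing stops us from giving it a /16 mask under CIDR.
addr = "192.168.10.5"
print(legacy_class(addr))                             # C
print(ipaddress.ip_interface(addr + "/16").network)   # 192.168.0.0/16
```

So "moving from a Class C to a Class B" really just means changing the mask from /24 to /16; the address range itself does not change class.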
-
Also, why has using /16 networking come up twice this weekend? I've gone years without hearing of someone trying something like this and suddenly, twice in a weekend?
Why is the SAN bigger than a /26? Why so many addresses for something that should have so few?
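For a sense of scale, a quick sketch of how many host addresses each prefix length provides. The 10.0.0.0 base network and the `usable_hosts` helper are arbitrary choices for illustration; only the prefix lengths matter here.

```python
import ipaddress

def usable_hosts(prefix: int) -> int:
    """Usable host addresses in an IPv4 network of the given prefix length
    (total minus the network and broadcast addresses)."""
    net = ipaddress.ip_network(f"10.0.0.0/{prefix}")  # arbitrary example base
    return net.num_addresses - 2

for prefix in (26, 24, 16):
    print(f"/{prefix}: {usable_hosts(prefix)} usable host addresses")
# /26: 62 usable host addresses
# /24: 254 usable host addresses
# /16: 65534 usable host addresses
```

Six nodes with two iSCSI NICs each plus the array's ports would fit in a /26 many times over.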
-
For those wondering... multi-subnet as a term refers to having machines in different subnets. A single IP can never be in more than one subnet, by definition. There is no working way to have it differently. It's an odd term; generally we would just refer to it as being a system with nodes in different networks, but sometimes different subnets is used.
As this is a /16 SAN, multi-subnetting would imply that there is more than one /16 network involved in the cluster.
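A rough sketch of what that means in practice, again with Python's `ipaddress` module. The node addresses and the `same_subnet` helper are made up for illustration. Each address plus mask resolves to exactly one network, and a cluster is "multi-subnet" only when its nodes resolve to different ones.

```python
import ipaddress

def same_subnet(a: str, b: str) -> bool:
    """True if two interface addresses (in CIDR notation) resolve to the same network."""
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network

# Hypothetical cluster node addresses, all with a /16 mask.
node1 = "10.10.1.21/16"
node2 = "10.10.200.22/16"
node3 = "10.20.1.23/16"

print(ipaddress.ip_interface(node1).network)  # 10.10.0.0/16 (one IP, one network)
print(same_subnet(node1, node2))              # True: single-subnet
print(same_subnet(node1, node3))              # False: this pair would be multi-subnet
```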
-
MS was still using Class networks in 1997 in their Networking Essentials MCSE courses.
-
@dashrender said in Hyper-V Failover Cluster FAILURE(S):
MS was still using Class networks in 1997 in their Networking Essentials MCSE courses.
Yes, that's the one I am aware of. It was only four years out of date at that point. And knowing that it existed historically is useful, so it is good that they taught it. But somehow it entered the popular consciousness as something that still existed.
-
@scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):
@dashrender said in Hyper-V Failover Cluster FAILURE(S):
MS was still using Class networks in 1997 in their Networking Essentials MCSE courses.
Yes, that's the one I am aware of. It was only four years out of date at that point. And knowing that it existed historically is useful, so it is good that they taught it. But somehow it entered the popular consciousness as something that still existed.
Do you still think that the knowledge of Classes is confusing to people in the use of subnets like /24, /16, etc.?
At least the other thread that mentioned it had a reason for wanting /16: since his in-use subnets were so far apart, a /16 was (to him) the simplest way to get both subnets into a single one.
-
@scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):
Also, why has using /16 networking come up twice this weekend? I've gone years without hearing of someone trying something like this and suddenly, twice in a weekend?
Why is the SAN bigger than a /26? Why so many addresses for something that should have so few?
The move from a /24 to a /16 was due to an "MSP" claiming that flattening out the network would solve VLAN issues that were occurring.
-
@kyle okay, that's crazy. Why is your iSCSI going to different networks? Why is there more than one SAN?
-
@scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):
@kyle okay, that's crazy. Why is your iSCSI going to different networks? Why is there more than one SAN?
There is more than 1 SAN but those point to the same SAN, that Tegile HA2300.
-
@kyle said in Hyper-V Failover Cluster FAILURE(S):
@scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):
Also, why has using /16 networking come up twice this weekend? I've gone years without hearing of someone trying something like this and suddenly, twice in a weekend?
Why is the SAN bigger than a /26? Why so many addresses for something that should have so few?
The move from a /24 to a /16 was due to an "MSP" claiming that flattening out the network would solve VLAN issues that were occurring.
A /16 is worlds beyond flattening. Flattening is a /22, maybe a /21. But what you are showing isn't in the scope of that flattening; these networks are all over the place and can't be covered by a /16.
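One way to sanity check a "flattening" claim is to compute the smallest single network that would contain all of the existing ones. This is only a sketch: the subnets below are hypothetical stand-ins, not the addresses from this cluster, and `smallest_common_supernet` is an illustrative helper that assumes IPv4 input.

```python
import ipaddress

def smallest_common_supernet(cidrs):
    """Widen the first network one bit at a time until it contains all the others.
    Assumes every entry is IPv4; 0.0.0.0/0 is the worst-case result."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    candidate = nets[0]
    while not all(n.subnet_of(candidate) for n in nets):
        candidate = candidate.supernet()
    return candidate

# Neighbouring /24s flatten into something modest.
print(smallest_common_supernet(["192.168.1.0/24", "192.168.3.0/24"]))  # 192.168.0.0/22

# Networks that are far apart need something much wider than a /16.
far_apart = smallest_common_supernet(["10.1.5.0/24", "10.200.9.0/24"])
print(far_apart)                  # 10.0.0.0/8
print(far_apart.prefixlen >= 16)  # False: a single /16 cannot cover both
```

If the result has a prefix shorter than 16, then no /16 can hold all of those networks, whatever the mask on the NICs says.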
-
@kyle said in Hyper-V Failover Cluster FAILURE(S):
@scottalanmiller said in Hyper-V Failover Cluster FAILURE(S):
@kyle okay, that's crazy. Why is your iSCSI going to different networks? Why is there more than one SAN?
There is more than 1 SAN but those point to the same SAN, that Tegile HA2300.
I know, but why is there more than one SAN? A single storage device, like the Tegile, should be on only a single SAN.