hyper-v bad physical NIC? - vswitch or NIC teaming?
-
I have a server about to go into production, but the NIC keeps failing after 12-24 hours. I thought it might be the physical network switch, so I swapped out the switch, and it's still just the NIC on the server going down (all the other client network connections stay up). The server has two NICs, so I figured I would just use the other physical NIC and see if it stays up.
The server is running Hyper-V core 2016. When the NIC goes down I lose access to the host as well.
Given that, would it make the most sense to create a new vSwitch with the second NIC and add that to the VM and see if it stays up like that, or would NIC teaming be a better way to go?
-
If you have a known bad NIC can you get it replaced on warranty?
-
@coliver yes. The server is brand new. The issue, however, is that the server is in a secure environment and remote control isn't allowed. I'm not even sure a tech can be escorted in. If the workaround doesn't work, it might mean picking up a new server or adding a NIC.
-
@coliver I really want to troubleshoot to see if both NICs are bad, or what the deal is.
-
@mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
@coliver yes. The server is brand new. The issue, however, is that the server is in a secure environment and remote control isn't allowed. I'm not even sure a tech can be escorted in. If the workaround doesn't work, it might mean picking up a new server or adding a NIC.
I don't know if I would deploy something with known bad hardware into an environment that a tech would have trouble accessing.
That being said I don't know if I would use NIC teaming unless you had two known good NICs.
-
@coliver said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
I don't know if I would deploy something with known bad hardware into an environment that a tech would have trouble accessing.
I don't want to either. I need to be absolutely sure what exactly is wrong. I shouldn't have said "workaround" in the sense that it would be permanent. It's more a matter of cutting over to the second NIC to make sure it's not a software bug.
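A minimal sketch of that cut-over test on Hyper-V Server 2016, assuming the second physical NIC is named "Ethernet 2". The switch name "vSwitch-Test" and the VM name "TestVM" are placeholders, not names from this server:

```powershell
# Assumption: "Ethernet 2" is the second physical NIC (confirm with Get-NetAdapter).
# "vSwitch-Test" and "TestVM" are hypothetical names used only for illustration.

# Create an external vSwitch on the second NIC and keep host management on it,
# so the host stays reachable through this NIC as well.
New-VMSwitch -Name "vSwitch-Test" -NetAdapterName "Ethernet 2" -AllowManagementOS $true

# Move the VM's network adapter over to the new switch.
Connect-VMNetworkAdapter -VMName "TestVM" -SwitchName "vSwitch-Test"

# Confirm which switch the VM is attached to now.
Get-VMNetworkAdapter -VMName "TestVM" | Format-Table VMName, SwitchName, Status
```

If the link stays up for longer than the usual 12-24 hours on this NIC, that points toward the first port rather than the driver or host configuration, although it would not prove it.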
-
@mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
@coliver said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
I don't know if I would deploy something with known bad hardware into an environment that a tech would have trouble accessing.
I don't want to either. I need to be absolutely sure what exactly is wrong. I shouldn't have said "workaround" in the sense that it would be permanent. It's more a matter of cutting over to the second NIC to make sure it's not a software bug.
That's a good way to test. But I wouldn't deploy it like that.
-
Are the NIC drivers up to date?
-
@black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
Are the NIC drivers up to date?
First thing I did was update the firmware for everything and all the drivers.
-
The other troubleshooting I did was to crank up iperf and run a few GB through it. It didn't miss a beat.
-
@mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
@black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
Are the NIC drivers up to date?
First thing I did was update the firmware for everything and all the drivers.
It's possible Virtual Machine Queue (VMQ) could be causing the issue. Intel or Broadcom?
-
The only other thing I can think of is disabling the power management settings on the NIC, but I'm pretty sure I did that with the command:
Disable-NetAdapterPowerManagement -Name Ethernet
The command
Get-NetAdapterPowerManagement -Name Ethernet
doesn't seem to show whether power management is enabled or disabled, only the Wake on LAN settings. Does anyone know if there is another command to check the current status?
-
@mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
The only other thing I can think of is disabling the power management settings on the NIC, but I'm pretty sure I did that with the:
Disable-NetAdapterPowerManagement -Name Ethernet
command. The command
Get-NetAdapterPowerManagement -Name Ethernet
doesn't seem to show if it has been enabled or disabled, only the Wake on LAN. Does anyone know if there is another command to check the current status?
PS C:\Windows\system32> powercfg /list
Existing Power Schemes (* Active)
-----------------------------------
Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
Power Scheme GUID: 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c  (High performance) *
Power Scheme GUID: a1841308-3541-4fab-bc81-f71556f20b4a  (Power saver)
-
@black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
It's possible Virtual Machine Queue (VMQ) could be causing the issue. Intel or Broadcom?
I'm not sure. I'll have to be onsite to find out. What are the symptoms of that? Would VMQ knock the host offline as well?
-
@black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
@mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
The only other thing I can think of is disabling the power management settings on the NIC, but I'm pretty sure I did that with the:
Disable-NetAdapterPowerManagement -Name Ethernet
command. The command
Get-NetAdapterPowerManagement -Name Ethernet
doesn't seem to show if it has been enabled or disabled, only the Wake on LAN. Does anyone know if there is another command to check the current status?
PS C:\Windows\system32> powercfg /list
Existing Power Schemes (* Active)
-----------------------------------
Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
Power Scheme GUID: 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c  (High performance) *
Power Scheme GUID: a1841308-3541-4fab-bc81-f71556f20b4a  (Power saver)
That command may work for some stuff on the host, but I tested it on another server and it doesn't change the power settings for the individual NICs. Cool command, though.
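For the per-NIC setting specifically, one thing worth trying is dumping every property that Get-NetAdapterPowerManagement returns instead of relying on its default view. This is a sketch that assumes the adapter is named "Ethernet" and that the driver reports the AllowComputerToTurnOffDevice property:

```powershell
# Assumption: the adapter is named "Ethernet"; adjust to match Get-NetAdapter.

# Show every power management property, not just the default columns.
Get-NetAdapterPowerManagement -Name "Ethernet" | Format-List *

# Narrow the output to the values discussed here. AllowComputerToTurnOffDevice
# should correspond to the "Allow the computer to turn off this device to save
# power" checkbox, if the driver exposes it.
Get-NetAdapterPowerManagement -Name "Ethernet" |
    Select-Object Name, AllowComputerToTurnOffDevice, WakeOnMagicPacket, WakeOnPattern
```

Some drivers report "Unsupported" for individual properties, so a missing or unsupported value does not necessarily mean the setting is off.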
-
@mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
@black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
@mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
The only other thing I can think of is disabling the power management settings on the NIC, but I'm pretty sure I did that with the:
Disable-NetAdapterPowerManagement -Name Ethernet
command. The command
Get-NetAdapterPowerManagement -Name Ethernet
doesn't seem to show if it has been enabled or disabled, only the Wake on LAN. Does anyone know if there is another command to check the current status?
PS C:\Windows\system32> powercfg /list
Existing Power Schemes (* Active)
-----------------------------------
Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
Power Scheme GUID: 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c  (High performance) *
Power Scheme GUID: a1841308-3541-4fab-bc81-f71556f20b4a  (Power saver)
That command may work for some stuff on the host, but I tested it on another server and it doesn't change the power settings for the individual NICs. Cool command, though.
Which power scheme is active? I always make sure mine is set to High performance.
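A quick way to check and set this on a Core install is sketched here. The GUID is the High performance scheme from the powercfg /list output quoted in this thread, and the sketch assumes the target host has the same scheme available:

```powershell
# Print only the currently active power scheme.
powercfg /getactivescheme

# Switch to High performance. The GUID comes from the powercfg /list output
# quoted in this thread; confirm it appears in the target host's own list first.
powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c
```

This changes the host-wide power plan only and is separate from the per-adapter power management setting.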
-
@mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
@black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
It's possible Virtual Machine Queue (VMQ) could be causing the issue. Intel or Broadcom?
I'm not sure. I'll have to be onsite to find out. What are the symptoms of that? Would VMQ knock the host offline as well?
Yes. This is a well-known issue with older firmware on some 1gig NICs, such as Broadcoms. Updating the firmware resolves the issue, as does turning off VMQ, which you can do via PowerShell on Hyper-V Server 2016.
VMQ is meant for 10gig+ NICs, but the firmware is messed up and has it on by default on some 1gig NICs.
-
@black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
Which power scheme is active? I always make sure mine is set to High performance.
High performance. On the host I checked, it's a GUI install, so I went to the NIC properties, and I can still see the Power Management tab and the box for "Allow the computer to turn off this device to save power" is still checked.
When you run Disable-NetAdapterPowerManagement -Name Ethernet, it removes that tab.
-
@tim_g said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
@mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
@black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:
It's possible Virtual Machine Queue (VMQ) could be causing the issue. Intel or Broadcom?
I'm not sure. I'll have to be onsite to find out. What are the symptoms of that? Would VMQ knock the host offline as well?
Yes. This is a well-known issue with older firmware on some 1gig NICs, such as Broadcoms. Updating the firmware resolves the issue, as does turning off VMQ, which you can do via PowerShell on Hyper-V Server 2016.
VMQ is meant for 10gig NICs, but the firmware is messed up and has it on by default on some 1gig NICs.
Here's some PS to get you started:
Get-NetAdapter
Get-NetAdapterAdvancedProperty NIC1
Get-NetAdapterAdvancedProperty * -DisplayName "Virtual Machine Queues"
Set-NetAdapterAdvancedProperty * -DisplayName "Virtual Machine Queues" -DisplayValue Disabled
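The NetAdapter module also has dedicated VMQ cmdlets that don't depend on the driver's display name for the setting. A sketch, assuming the physical adapter is named "Ethernet" and that its driver reports VMQ support at all:

```powershell
# Assumption: the physical adapter is named "Ethernet"; check Get-NetAdapter.

# List the adapters that report VMQ capability, with their current state.
Get-NetAdapterVmq

# Turn VMQ off on one physical adapter.
Disable-NetAdapterVmq -Name "Ethernet"

# Verify the change.
Get-NetAdapterVmq -Name "Ethernet" | Format-List Name, Enabled
```

If the first command doesn't list the adapter, the driver may simply not expose VMQ for it, and there would be nothing to disable.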
-
@tim_g Thanks for the commands. If I get the output below, does that mean my NICs don't have the option for VMQ?
PS C:\> Get-NetAdapter

Name                      InterfaceDescription                    ifIndex Status
----                      --------------------                    ------- -----
vEthernet (vSwitch02)     Hyper-V Virtual Ethernet Adapter #3          29 Up
vEthernet (Broadcom BC... Hyper-V Virtual Ethernet Adapter #2          18 Up
Ethernet 2                HP NC373i Multifunction Gigabit S...#40      13 Up
Ethernet                  HP NC373i Multifunction Gigabit S...#39      12 Di...

PS C:\> Get-NetAdapterAdvancedProperty Ethernet

Name      DisplayName                    DisplayValue
----      -----------                    ------------
Ethernet  Flow Control                   Auto
Ethernet  Interrupt Moderation           Enabled
Ethernet  Jumbo Packet                   1514
Ethernet  Large Send Offload V2 (IPv4)   Enabled
Ethernet  Maximum Number of RSS Queues   2
Ethernet  Priority & VLAN                Priority & VLAN ena...
Ethernet  Receive Buffers (0=Auto)       0
Ethernet  Receive Side Scaling           Enabled
Ethernet  Speed & Duplex                 Auto Negotiation
Ethernet  TCP Connection Offload (IPv4)  Disabled
Ethernet  TCP/UDP Checksum Offload (I... Rx & Tx Enabled
Ethernet  Transmit Buffers (0=Auto)      0
Ethernet  Wake On Magic Packet           Disabled
Ethernet  Wake On Pattern Match          Disabled
Ethernet  Locally Administered Address   --
Ethernet  VLAN ID                        0
Ethernet  Ethernet@WireSpeed             Enabled

PS C:\> Get-NetAdapterAdvancedProperty * -DisplayName "Virtual Machine Queues"
Get-NetAdapterAdvancedProperty : No matching MSFT_NetAdapterAdvancedPropertySettingData objects found by CIM query for instances of the ROOT/StandardCimv2/MSFT_NetAdapterAdvancedPropertySettingData class on the CIM server: SELECT * FROM MSFT_NetAdapterAdvancedPropertySettingData WHERE ((Name LIKE '%')) AND ((DisplayName LIKE 'Virtual Machine Queues')). Verify query parameters and retry.
At line:1 char:1
+ Get-NetAdapterAdvancedProperty * -DisplayName "Virtual Machine Queues"
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (MSFT_NetAdapter...ertySettingData:String) [Get-NetAdapterAdvancedProperty], CimJobException
    + FullyQualifiedErrorId : CmdletizationQuery_NotFound,Get-NetAdapterAdvancedProperty

PS C:\>
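The error only says that no adapter has a property with that exact display name, and drivers don't all label the setting the same way. A looser check might look like the sketch below, which assumes the adapter name "Ethernet" from the output above and that the driver, if it supports VMQ, uses the standardized `*VMQ` registry keyword:

```powershell
# Assumption: the adapter is named "Ethernet", as in the output above.

# Wildcard search on the display name, in case the driver labels it differently.
Get-NetAdapterAdvancedProperty -Name "Ethernet" -DisplayName "*Queue*" -ErrorAction SilentlyContinue

# Search by registry keyword instead of display name. "*VMQ" is the standardized
# keyword; the leading "*" may be treated as a wildcard here, which still matches it.
Get-NetAdapterAdvancedProperty -Name "Ethernet" -RegistryKeyword "*VMQ" -ErrorAction SilentlyContinue

# Ask the dedicated VMQ cmdlet whether the adapter reports VMQ at all.
Get-NetAdapterVmq -Name "Ethernet" -ErrorAction SilentlyContinue
```

If all three come back empty, that would suggest this driver doesn't expose VMQ on the adapter, which would make VMQ a less likely cause of the drops.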