Server 2019 randomly DNS stops
-
If nothing on the network is working, that makes me think a bad switch.
Or possibly a bad NIC taking down your switch.
-
@Dashrender Draytek 2860 (being replaced this week) - 3 * unifi switches (fibre linked) and a few rando APs dotted around. It's quite a simple setup, the APs are a bit sus. I could pull them all out...
-
@choppy_sea said in Server 2019 randomly DNS stops:
@Dashrender Draytek 2860 (being replaced this week) - 3 * unifi switches (fibre linked) and a few rando APs dotted around. It's quite a simple setup, the APs are a bit sus. I could pull them all out...
Why are the APs sus?
-
@DustinB3403 they're cheapo TP-Link routers acting as APs (they have an AP mode). I think they are sus because technically they are capable of DNS and DHCP themselves... could be one going rogue
-
@choppy_sea said in Server 2019 randomly DNS stops:
@DustinB3403 they're cheapo TP-Link routers acting as APs (they have an AP mode). I think they are sus because technically they are capable of DNS and DHCP themselves... could be one going rogue
Oh! Sorry, I misread. I thought you said Unifi APs, but you said Unifi switches.
-
@choppy_sea said in Server 2019 randomly DNS stops:
@Obsolesce DNS logs show one interesting entry, linked here: the log says that it has transferred the master role from itself to itself https://imgur.com/a/4I75qnB If you mean somewhere else, I apologise!
@Dashrender It happens for every device on the network!
@JasGot Yes the AD server does DNS and DHCP too, yes the Host on the domain
@notverypunny When I ping a known good IP i.e. 8.8.8.8 I get "...unreachable" rather than the "Ping request could not..."
OK, so if you can't even get out by IP, then strictly speaking DNS isn't the issue. Lower level TCP/IP or something else in the network is a problem before DNS even comes into play. Even if your DNS is completely offline you should be able to ping 8.8.8.8 or 1.1.1.1
I'd set up a standalone machine on the network with a static IP and have it pointed to external DNS. If it stops working when everything else does, then you know that it's something in your LAN > WAN setup. If it keeps working when everything else goes sideways, then you're looking at the possibility of something wrong along the lines of the rogue DHCP that you've alluded to, or other LAN-side gremlins. Don't rule out the possibility of a user having connected something that's doing all kinds of fun DHCP garbage... Users can be... "special"
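For anyone wanting to script that check on the standalone machine, here is a rough sketch in Python that tests raw IP reachability separately from name resolution. The names `ip_reachable`, `dns_resolves` and `diagnose` are made up for illustration, `example.com` is just a placeholder hostname, and a TCP connection to port 53 stands in for ping because sending ICMP usually needs elevated privileges. Treat it as a starting point, not a definitive test.

```python
import socket


def ip_reachable(ip="8.8.8.8", port=53, timeout=2.0):
    """Try a TCP connection to a public resolver by IP address (no DNS lookup involved).

    Assumption: the target accepts TCP on port 53, which public resolvers
    such as 8.8.8.8 and 1.1.1.1 normally do.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def dns_resolves(name="example.com"):
    """Ask the system resolver for a name (placeholder hostname).

    Note: a local cache may answer even when the DNS server is down.
    """
    try:
        socket.gethostbyname(name)
        return True
    except OSError:
        return False


def diagnose(ip_ok, dns_ok):
    """Turn the two probe results into a hint about where to look."""
    if not ip_ok:
        return "No IP connectivity: look below DNS (switch, NIC, gateway, routing)"
    if not dns_ok:
        return "IP works but names do not resolve: look at DNS"
    return "IP and DNS both working"
```

Running `print(diagnose(ip_reachable(), dns_resolves()))` on a schedule and logging the result with a timestamp would show whether the outage hits IP connectivity or only name resolution.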
-
@notverypunny said in Server 2019 randomly DNS stops:
@choppy_sea said in Server 2019 randomly DNS stops:
@Obsolesce DNS logs show one interesting entry, linked here: the log says that it has transferred the master role from itself to itself https://imgur.com/a/4I75qnB If you mean somewhere else, I apologise!
@Dashrender It happens for every device on the network!
@JasGot Yes the AD server does DNS and DHCP too, yes the Host on the domain
@notverypunny When I ping a known good IP i.e. 8.8.8.8 I get "...unreachable" rather than the "Ping request could not..."
OK, so if you can't even get out by IP, then strictly speaking DNS isn't the issue. Lower level TCP/IP or something else in the network is a problem before DNS even comes into play. Even if your DNS is completely offline you should be able to ping 8.8.8.8 or 1.1.1.1
I'd set up a standalone machine on the network with a static IP and have it pointed to external DNS. If it stops working when everything else does, then you know that it's something in your LAN > WAN setup. If it keeps working when everything else goes sideways, then you're looking at the possibility of something wrong along the lines of the rogue DHCP that you've alluded to, or other LAN-side gremlins. Don't rule out the possibility of a user having connected something that's doing all kinds of fun DHCP garbage... Users can be... "special"
Rogue DHCP won't cause a universal issue like this all at once unless all the leases came up for renewal at the same time. On top of that, unless the rogue is on the VM host, he hasn't indicated he's done anything that would remove it from the network, like rebooting all the APs/switches. The only reboot mentioned is the VM host.
I'm wondering if you have a bad NIC causing the switch that connects to your firewall to overload? Or the switch itself is bad and flaky, but again, there's no mention of rebooting the switch to make things work again.
-
Next time you take the server down, before rebooting it, unplug its network cables and use another device to try to ping 8.8.8.8
-
To back up @notverypunny: when I have seen this behavior on our network, 99% of the time it's a rogue piece of equipment that starts handing out its own DHCP, conflicting with ours. Check that users are NOT plugging in or connecting non-approved devices, like travel routers or other router devices. We install a lot of equipment for customers that our techs are supposed to plug into "isolated" networks, and usually they, um, "forget"
-
@Dashrender @notverypunny Good ideas! Thanks!
-
You can find rogue DHCP with Wireshark. All kinds of strange issues, actually.
https://www.wireshark.org/
You just need to know how the protocol works.
If you have two DHCP servers competing, you'll see two DHCP server offers from two different IPs.
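To make that concrete, here is a small Python sketch of what you would be looking for in the capture. It assumes you have already pulled the raw DHCP messages (the UDP payloads) out of a capture by some other means, and it only reads two standard options: 53 (message type, where 2 means OFFER) and 54 (server identifier). The function names and the sample IPs are invented for illustration.

```python
import socket

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options area
OPTIONS_OFFSET = 240                # 236-byte BOOTP header + 4-byte cookie


def parse_dhcp_options(payload):
    """Return {option_code: raw_value} for one DHCP message (UDP payload)."""
    if len(payload) < OPTIONS_OFFSET or payload[236:240] != MAGIC_COOKIE:
        return {}
    options = {}
    i = OPTIONS_OFFSET
    while i < len(payload):
        code = payload[i]
        if code == 255:      # end option
            break
        if code == 0:        # pad option, has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break            # truncated option, stop parsing
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length
    return options


def offering_servers(payloads):
    """Collect the server identifiers (option 54) seen in DHCP OFFER messages."""
    servers = set()
    for payload in payloads:
        opts = parse_dhcp_options(payload)
        if opts.get(53) == b"\x02" and len(opts.get(54, b"")) == 4:
            servers.add(socket.inet_ntoa(opts[54]))
    return servers


def make_offer(server_ip):
    """Hypothetical helper: build a minimal fake OFFER for demonstration only."""
    header = bytes(236)  # zeroed BOOTP header, enough for this parser
    options = b"\x35\x01\x02" + b"\x36\x04" + socket.inet_aton(server_ip) + b"\xff"
    return header + MAGIC_COOKIE + options
```

If `offering_servers(...)` comes back with more than one address and you only run one DHCP server, the extra address is the one to go hunting for.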
-
Any developments on this? Curious to see what the end result is...
-
UPDATE: We have replaced the router and wifi APs, and enabled DHCP forwarding... it hasn't gone down since! *Really don't want to jinx it* Thanks all for your help, I really appreciate it