Solved Elastix: phones lose registration
-
Looking into a multi-site client PBX system where phones at one site keep dropping registration.
Timeline:
14 Aug 2015: Firewall replaced
15-16 Aug 2015: Issues with the new FW update, since corrected
18 Aug 2015: Physical onsite visit. Switch power cycled by removing power.
- Checked phones IP status
- Phones reachable via local and remote browser
- Phones are pingable via Win box and PBX host
- Phones: mainly Yealink, plus a few Polycoms
- Switch is a Juniper series PoE with two vLANs
When the switch was power cycled, all the phones connected to the PBX. Phone use / access has been up and down since.
Phones have been factory reset and reconfigured. They stay registered for some period of time (x), then drop and fail to register again. However, the switch was remotely power cycled this morning and those phones are showing in the PBX as active ($ sip show peers).
PBX iptables rules have been cleared, phones reconfigured, and tcpdump captures taken. A phone that registers now may lose that registration some random (x) minutes later, and rebooting the phone does not reconnect it.
We've had a number of people investigate this, but at this point we are no closer to resolving it than when I was on site last week.
There is some speculation that it is an issue with the switch, the new firewall, or the service provider (Comcast). However, some phones work and some don't, and it does seem to be random.
Tags:
@scottalanmiller; @Minion-Queen; @Mike-Ralston; @JaredBusch
-
What kind of VPN connects the two sites?
-
@scottalanmiller said:
What kind of VPN connects the two sites?
The new Firewall is a Fortigate unit. We connect via the Fortigate client software.
-
That's the first clue that something is wrong. Fortigate are very problematic and undocumented.
-
sip show peers at 08:00(ish) EDT:
2200       (Unspecified)  D N A 0     UNKNOWN
2201/2201  xxx.xx.xx.123  D N A 5062  OK (127 ms)
2202/2202  xxx.xx.xx.106  D N A 5062  OK (67 ms)
2203/2203  xxx.xx.xx.124  D N A 5062  OK (135 ms)
2210/2210  xxx.xx.xx.121  D N A 5062  OK (59 ms)
2211/2211  xxx.xx.xx.115  D N A 5062  OK (79 ms)
2212/2212  xxx.xx.xx.114  D N A 5062  OK (59 ms)
2213/2213  xxx.xx.xx.113  D N A 5062  OK (68 ms)
2214/2214  xxx.xx.xx.105  D N A 5062  OK (69 ms)
2215/2215  xxx.xx.xx.122  D N A 5062  OK (74 ms)
2216/2216  xxx.xx.xx.109  D N A 5062  OK (62 ms)
2217/2217  xxx.xx.xx.101  D N A 5062  OK (63 ms)
2218/2218  xxx.xx.xx.117  D N A 5062  OK (62 ms)
2219/2219  xxx.xx.xx.110  D N A 5062  OK (71 ms)
2220/2220  xxx.xx.xx.116  D N A 5062  OK (63 ms)
2221/2221  xxx.xx.xx.119  D N A 5062  OK (67 ms)
2223/2223  xxx.xx.xx.102  D N A 5060  OK (57 ms)
2224       (Unspecified)  D N A 0     UNKNOWN
2226/2226  xxx.xx.xx.111  D N A 5062  OK (62 ms)
2227/2227  xxx.xx.xx.120  D N A 5062  OK (59 ms)
2228/2228  xxx.xx.xx.107  D N A 5062  OK (68 ms)
2230/2230  xxx.xx.xx.103  D N A 5062  OK (75 ms)
2233/2233  xxx.xx.xx.108  D N A 5062  OK (67 ms)
2235/2235  xxx.xx.xx.100  D N A 5060  OK (56 ms)
2236       (Unspecified)  D N A 0     UNKNOWN
2238/2238  xxx.xx.xx.112  D N A 5062  OK (60 ms)
2239/2239  xxx.xx.xx.118  D N A 5062  OK (61 ms)
2240/2240  xxx.xx.xx.104  D N A 5060  OK (59 ms)
9999/9999  (Unspecified)  D N A 0     UNKNOWN
sip show peers now, 09:00 EDT:
2200       (Unspecified)  D N A 0     UNKNOWN
2201/2201  (Unspecified)  D N A 0     UNKNOWN
2202/2202  xxx.xx.xx.106  D N A 5062  OK (63 ms)
2203/2203  (Unspecified)  D N A 0     UNKNOWN
2210/2210  (Unspecified)  D N A 0     UNKNOWN
2211/2211  xxx.xx.xx.115  D N A 5062  OK (78 ms)
2212/2212  xxx.xx.xx.114  D N A 5062  OK (62 ms)
2213/2213  (Unspecified)  D N A 0     UNKNOWN
2214/2214  (Unspecified)  D N A 0     UNKNOWN
2215/2215  (Unspecified)  D N A 0     UNKNOWN
2216/2216  xxx.xx.xx.109  D N A 5062  OK (67 ms)
2217/2217  (Unspecified)  D N A 0     UNKNOWN
2218/2218  (Unspecified)  D N A 0     UNKNOWN
2219/2219  xxx.xx.xx.110  D N A 5062  OK (73 ms)
2220/2220  (Unspecified)  D N A 0     UNKNOWN
2221/2221  xxx.xx.xx.119  D N A 5062  OK (89 ms)
2223/2223  xxx.xx.xx.102  D N A 5060  OK (57 ms)
2224       (Unspecified)  D N A 0     UNKNOWN
2226/2226  (Unspecified)  D N A 0     UNKNOWN
2227/2227  (Unspecified)  D N A 0     UNKNOWN
2228/2228  (Unspecified)  D N A 0     UNKNOWN
2230/2230  xxx.xx.xx.103  D N A 5062  OK (67 ms)
2233/2233  (Unspecified)  D N A 0     UNKNOWN
2235/2235  xxx.xx.xx.100  D N A 5060  OK (53 ms)
2236       (Unspecified)  D N A 0     UNKNOWN
2238/2238  (Unspecified)  D N A 0     UNKNOWN
2239/2239  (Unspecified)  D N A 0     UNKNOWN
2240/2240  xxx.xx.xx.104  D N A 5060  OK (57 ms)
9999/9999  (Unspecified)  D N A 0     UNKNOWN
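To see at a glance which extensions dropped between the two snapshots, something along these lines could diff them. This is only a rough sketch: the function names and the regex are my own, the sample strings are a short excerpt of the output above, and it assumes every row carries the same "D N A" flags as in this capture.

```python
import re

# One peer row: name, host, flags "D N A", port, status.
# Tolerates "(Unspecified" with a missing closing paren, as seen in one row.
PEER_RE = re.compile(
    r"(?<!\S)(?P<name>\d+(?:/\d+)?)\s+"
    r"(?P<host>\(Unspecified\)?|\S+)\s+"
    r"D\s+N\s+A\s+(?P<port>\d+)\s+"
    r"(?P<status>OK \(\d+ ms\)|UNKNOWN)"
)


def parse_peers(text):
    """Map peer name -> True if its status is OK, False if UNKNOWN."""
    return {m["name"]: m["status"].startswith("OK") for m in PEER_RE.finditer(text)}


def dropped_peers(before, after):
    """Peers that were OK in `before` but are not OK (or missing) in `after`."""
    old, new = parse_peers(before), parse_peers(after)
    return sorted(name for name, ok in old.items() if ok and not new.get(name, False))


# Excerpt of the 08:00 and 09:00 snapshots (works on one-line or multi-line paste).
BEFORE = (
    "2201/2201 xxx.xx.xx.123 D N A 5062 OK (127 ms) "
    "2202/2202 xxx.xx.xx.106 D N A 5062 OK (67 ms) "
    "2220/2220 xxx.xx.xx.116 D N A 5062 OK (63 ms) "
    "2223/2223 xxx.xx.xx.102 D N A 5060 OK (57 ms) "
    "2224 (Unspecified D N A 0 UNKNOWN"
)
AFTER = (
    "2201/2201 (Unspecified) D N A 0 UNKNOWN "
    "2202/2202 xxx.xx.xx.106 D N A 5062 OK (63 ms) "
    "2220/2220 (Unspecified) D N A 0 UNKNOWN "
    "2223/2223 xxx.xx.xx.102 D N A 5060 OK (57 ms) "
    "2224 (Unspecified) D N A 0 UNKNOWN"
)

if __name__ == "__main__":
    for name in dropped_peers(BEFORE, AFTER):
        print("dropped:", name)
```

Pasting the full output of both runs into BEFORE and AFTER would give the complete list of extensions that lost registration in that hour.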
-
You said this was over a VPN? When they disconnect, are you able to ping the phones from the PBX? How are the phones configured? Do you reference the PBX by domain name or by IP address?
-
Is the PBX on site or hosted? That was not clear.
-
@JaredBusch said:
Is the PBX on site or hosted? That was not clear.
PBX is not hosted by a third party, but it is hosted by another division of the same company. So it acts like hosted to the site with the registration problems. It is an "on premises" PBX across the country with the phones registering over the VPN.
-
@coliver said:
You said this was over a VPN? When they disconnect, are you able to ping the phones from the PBX? How are the phones configured? Do you reference the PBX by domain name or by IP address?
Phones are configured to use the DNS name for the PBX in the SIP server field. However, both name and IP have been tried.
-
According to the "hosting site", their network tooling says that there is packet loss over their site to site connection.
-
Using the web GUI on one phone, I just pushed a reboot.
Registration failed after restart.
-
@scottalanmiller said:
According to the "hosting site", their network tooling says that there is packet loss over their site to site connection.
So it seems to be network related then. This isn't really a PBX or SIP registration problem; something funky is going on over the wire.
-
Ping from PBX to phone:
ping xxx.xx.xx.116
PING xxx.xx.xx.116 (xxx.xx.xx.116) 56(84) bytes of data.
64 bytes from xxx.xx.xx.116: icmp_seq=1 ttl=60 time=51.8 ms
64 bytes from xxx.xx.xx.116: icmp_seq=2 ttl=60 time=52.2 ms
64 bytes from xxx.xx.xx.116: icmp_seq=3 ttl=60 time=53.7 ms
64 bytes from xxx.xx.xx.116: icmp_seq=4 ttl=60 time=51.2 ms
64 bytes from xxx.xx.xx.116: icmp_seq=5 ttl=60 time=51.4 ms
64 bytes from xxx.xx.xx.116: icmp_seq=6 ttl=60 time=56.3 ms
-
@scottalanmiller said:
According to the "hosting site", their network tooling says that there is packet loss over their site to site connection.
@coliver said:
So it seems to be network related then. This isn't really a PBX or SIP registration problem; something funky is going on over the wire.
I call this done. Not the problem of the PBX.
-
@g.jacobse said:
Ping from PBX to phone:
ping xxx.xx.xx.116
PING xxx.xx.xx.116 (xxx.xx.xx.116) 56(84) bytes of data.
64 bytes from xxx.xx.xx.116: icmp_seq=1 ttl=60 time=51.8 ms
64 bytes from xxx.xx.xx.116: icmp_seq=2 ttl=60 time=52.2 ms
64 bytes from xxx.xx.xx.116: icmp_seq=3 ttl=60 time=53.7 ms
64 bytes from xxx.xx.xx.116: icmp_seq=4 ttl=60 time=51.2 ms
64 bytes from xxx.xx.xx.116: icmp_seq=5 ttl=60 time=51.4 ms
64 bytes from xxx.xx.xx.116: icmp_seq=6 ttl=60 time=56.3 ms
Ok, it is unregistered but still passing packets? Can you run this for a longer period and see what the loss rate is? Something like 100 or 500 packets?
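If it helps, a longer run could be scripted roughly like this. It is only a sketch, assuming a Linux-style ping on the PBX host (where -c sets the packet count) and the usual "N packets transmitted, N received, X% packet loss" summary line; the function names are made up for illustration.

```python
import re
import subprocess

# Matches the summary line printed at the end of a Linux-style ping run.
# The lazy ".*?" allows for an optional "+N errors," field before the loss figure.
SUMMARY_RE = re.compile(
    r"(?P<sent>\d+) packets transmitted, (?P<received>\d+) received,"
    r".*?(?P<loss>[\d.]+)% packet loss"
)


def parse_ping_summary(output):
    """Return (sent, received, loss_percent) from ping output text."""
    m = SUMMARY_RE.search(output)
    if m is None:
        raise ValueError("no ping summary line found")
    return int(m["sent"]), int(m["received"]), float(m["loss"])


def measure_loss(host, count=500):
    """Run ping against `host` for `count` packets and parse the summary."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
    )
    return parse_ping_summary(result.stdout)


if __name__ == "__main__":
    # Replace with the real phone address; the thread shows it masked.
    sent, received, loss = measure_loss("xxx.xx.xx.116", count=100)
    print(f"{sent} sent, {received} received, {loss}% loss")
```

Keep in mind this only measures ICMP; clean ping results would not by themselves show that the SIP traffic is getting through.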
-
--- xxx.xx.xx.116 ping statistics ---
152 packets transmitted, 152 received, 0% packet loss, time 151239ms
rtt min/avg/max/mdev = 50.673/53.947/109.152/7.745 ms
-
@g.jacobse said:
--- xxx.xx.xx.116 ping statistics ---
152 packets transmitted, 152 received, 0% packet loss, time 151239ms
rtt min/avg/max/mdev = 50.673/53.947/109.152/7.745 ms
What is your registration timeout set at?
-
UDP Keep-alive: 30
Login Expire: 3600
SIP Registration retry: 30
-
@g.jacobse said:
UDP Keep-alive: 30
Login Expire: 3600
SIP Registration retry: 30
Is there a registrationattempts entry other than 0?
-