Help troubleshooting L2TP over IPSEC VPN connections.
-
So we have the VPN setup and it is working currently for 3 out of 4 users. I have been dealing with the problematic connection but can't figure out how to solve the issue. I'd really appreciate any help you guys can provide.
L2TP over IPSEC VPN
VPN Server: EdgeRouter PoE 5 v1.10.5
Client: Windows 10 v1709 build 16299.579
Windows Side
Client is properly reaching the VPN server even though the Windows error says the server is unreachable (logs below). Don't really think the problem lies on the Windows side but still, I have checked the Windows setup and everything is set according to documentation and the same as the other working clients. The machine has been rebooted (several times) and I have even uninstalled and reinstalled the WAN Miniport interfaces.
Edge Router Side
Full log - sudo swanctl --log while trying to connect.
06[NET] received packet: from USER_PUBLIC_IP[500] to EDGE_ROUTER_IP[500] (408 bytes)
06[ENC] parsed ID_PROT request 0 [ SA V V V V V V V V ]
06[ENC] received unknown vendor ID: 01:52:8b:bb:c0:06:96:12:18:49:ab:9a:1c:5b:2a:51:00:00:00:01
06[IKE] received MS NT5 ISAKMPOAKLEY vendor ID
06[IKE] received NAT-T (RFC 3947) vendor ID
06[IKE] received draft-ietf-ipsec-nat-t-ike-02\n vendor ID
06[IKE] received FRAGMENTATION vendor ID
06[ENC] received unknown vendor ID: fb:1d:e3:cd:f3:41:b7:ea:16:b7:e5:be:08:55:f1:20
06[ENC] received unknown vendor ID: 26:24:4d:38:ed:db:61:b3:17:2a:36:e3:d0:cf:b8:19
06[ENC] received unknown vendor ID: e3:a5:96:6a:76:37:9f:e7:07:22:82:31:e5:ce:86:52
06[IKE] USER_PUBLIC_IP is initiating a Main Mode IKE_SA
06[ENC] generating ID_PROT response 0 [ SA V V V ]
06[NET] sending packet: from EDGE_ROUTER_IP[500] to USER_PUBLIC_IP[500] (136 bytes)
01[NET] received packet: from USER_PUBLIC_IP[500] to EDGE_ROUTER_IP[500] (228 bytes)
01[ENC] parsed ID_PROT request 0 [ KE No NAT-D NAT-D ]
01[IKE] remote host is behind NAT
01[ENC] generating ID_PROT response 0 [ KE No NAT-D NAT-D ]
01[NET] sending packet: from EDGE_ROUTER_IP[500] to USER_PUBLIC_IP[500] (212 bytes)
05[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (76 bytes)
05[ENC] parsed ID_PROT request 0 [ ID HASH ]
05[CFG] looking for pre-shared key peer configs matching EDGE_ROUTER_IP...USER_PUBLIC_IP[192.168.0.16]
05[CFG] selected peer config "remote-access"
05[IKE] IKE_SA remote-access[63] established between EDGE_ROUTER_IP[EDGE_ROUTER_IP]...USER_PUBLIC_IP[192.168.0.16]
05[IKE] DPD not supported by peer, disabled
05[ENC] generating ID_PROT response 0 [ ID HASH ]
05[NET] sending packet: from EDGE_ROUTER_IP[4500] to USER_PUBLIC_IP[4500] (76 bytes)
09[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (444 bytes)
09[ENC] parsed QUICK_MODE request 1 [ HASH SA No ID ID NAT-OA NAT-OA ]
09[IKE] received 3600s lifetime, configured 0s
09[IKE] received 250000000 lifebytes, configured 0
09[ENC] generating QUICK_MODE response 1 [ HASH SA No ID ID NAT-OA NAT-OA ]
09[NET] sending packet: from EDGE_ROUTER_IP[4500] to USER_PUBLIC_IP[4500] (204 bytes)
13[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (60 bytes)
13[ENC] parsed QUICK_MODE request 1 [ HASH ]
13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
13[IKE] unable to install IPsec policies (SPD) in kernel
13[KNL] deleting policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out failed, not found
13[KNL] deleting policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in failed, not found
13[KNL] deleting policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out failed, not found
13[KNL] deleting policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in failed, not found
13[IKE] sending DELETE for ESP CHILD_SA with SPI 740d890e
13[ENC] generating INFORMATIONAL_V1 request 3087336472 [ HASH D ]
13[NET] sending packet: from EDGE_ROUTER_IP[4500] to USER_PUBLIC_IP[4500] (76 bytes)
14[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (76 bytes)
14[ENC] parsed INFORMATIONAL_V1 request 2912129370 [ HASH D ]
14[IKE] received DELETE for ESP CHILD_SA with SPI 740d890e
14[IKE] CHILD_SA not found, ignored
04[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (92 bytes)
04[ENC] parsed INFORMATIONAL_V1 request 1035896583 [ HASH D ]
04[IKE] received DELETE for IKE_SA remote-access[63]
04[IKE] deleting IKE_SA remote-access[63] between EDGE_ROUTER_IP[EDGE_ROUTER_IP]...USER_PUBLIC_IP[192.168.0.16]
Checking the logs, I can see everything is working properly until these messages start to appear.
13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
It can't install the policy for reqid 35 because there is an existing reqid (14) which has the same policy.
Indeed there is: remote-access #14 is a child of remote-access #28.
remote-access: #28, ESTABLISHED, IKEv1, 2dba0e93f1dc2f3c:4a212e556a07f9b7
  local 'EDGE_ROUTER_IP' @ EDGE_ROUTER_IP
  remote '192.168.0.8' @ USER_PUBLIC_IP
  AES_CBC-256/HMAC_SHA1_96/PRF_HMAC_SHA1/ECP_384
  established 75540s ago
  remote-access: #14, INSTALLED, TRANSPORT-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96
    installed 75207 ago
    in c9a20ab8, 2965565 bytes, 32775 packets, 8314s ago
    out 8fadd716, 44934358 bytes, 50838 packets, 8268s ago
    local EDGE_ROUTER_IP/32[udp/l2f]
    remote USER_PUBLIC_IP/32[udp/l2f]
This leads me to believe the user may already be connected via another machine, but the user doesn't show as online when using:
show vpn remote-access
Any idea how to fix the conflict with the duplicate policies and why it is happening?
The only thing I haven't done is reboot the edge router, since other users are working fine and I don't want to cause a disruption for them.
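The reqid conflict described above can also be picked out of a saved log mechanically. The following is a minimal Python sketch, assuming the log has been captured as text and that the message wording matches the lines quoted in this post; the names `CONFLICT_RE` and `find_reqid_conflicts` are hypothetical helpers, not part of strongSwan or EdgeOS.

```python
import re

# Pattern based on the "unable to install policy" lines quoted in this thread.
# The wording may differ in other strongSwan versions (assumption).
CONFLICT_RE = re.compile(
    r"unable to install policy (?P<policy>.+?) (?P<direction>in|out) "
    r"\(mark [^)]*\) for reqid (?P<new>\d+), "
    r"the same policy for reqid (?P<old>\d+) exists"
)


def find_reqid_conflicts(log_text):
    """Return unique (policy, direction, new_reqid, existing_reqid) tuples."""
    conflicts = []
    for match in CONFLICT_RE.finditer(log_text):
        entry = (
            match["policy"],
            match["direction"],
            int(match["new"]),
            int(match["old"]),
        )
        # The daemon logs each failure twice, so skip repeats.
        if entry not in conflicts:
            conflicts.append(entry)
    return conflicts


# Sample taken from the log in this post (each line appears twice there).
SAMPLE_LOG = "\n".join([
    "13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists",
    "13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists",
    "13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists",
    "13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists",
])

if __name__ == "__main__":
    for policy, direction, new_reqid, old_reqid in find_reqid_conflicts(SAMPLE_LOG):
        print(f"{direction}: {policy} (new reqid {new_reqid} blocked by reqid {old_reqid})")
```

The existing reqid it reports is the one to look for in the list of established SAs on the router.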
-
Can you sign in from your system using the users VPN credentials?
This will give you another point to test from to see if it is a router issue.
-
@gjacobse Will try that next
-
@gjacobse I can connect without a problem from a different public ip
-
@JaredBusch @scottalanmiller Any idea?
-
If this is what I think it is,.. it's something we have gone round and round with for nearly a year. It's something we can't seem to nail down as either an ERL or OS issue.
-
@romo said in Help troubleshooting L2TP over IPSEC VPN connections.:
@JaredBusch @scottalanmiller Any idea?
Is this user trying to connect from the same IP as another user?
-
@jaredbusch said in Help troubleshooting L2TP over IPSEC VPN connections.:
@romo said in Help troubleshooting L2TP over IPSEC VPN connections.:
@JaredBusch @scottalanmiller Any idea?
Is this user trying to connect from the same IP as another user?
Generally not.. They are remote / home users.
-
@jaredbusch said in Help troubleshooting L2TP over IPSEC VPN connections.:
@romo said in Help troubleshooting L2TP over IPSEC VPN connections.:
@JaredBusch @scottalanmiller Any idea?
Is this user trying to connect from the same IP as another user?
No, a single user trying to connect from home. She connected Wednesday without a problem, but Thursday she tries to connect again and it is not possible.
Logs show
13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
New connection can't be made because a policy with the same details is already present. If we vpn from any place that has a different public ip than the one from her home, we can establish the vpn connection without a problem.
-
@romo said in Help troubleshooting L2TP over IPSEC VPN connections.:
New connection can't be made because a policy with the same details is already present. If we vpn from any place that has a different public ip than the one from her home, we can establish the vpn connection without a problem.
What VPN client are you using, default to Windows?
-
@dbeato Yes
-
@romo said in Help troubleshooting L2TP over IPSEC VPN connections.:
@dbeato Yes
Okay, was looking at that error on other OpenVPN clients that had issues on older versions.
-
@dbeato said in Help troubleshooting L2TP over IPSEC VPN connections.:
@romo said in Help troubleshooting L2TP over IPSEC VPN connections.:
@dbeato Yes
Okay, was looking at that error on other OpenVPN clients that had issues on older versions.
It looks like it is a strongSwan issue. As a temporary fix, it should be resolved by manually restarting the IPsec VPN (restart vpn). Unfortunately, that seems too disruptive to do during working hours for users who are properly connected, at least until we have tested what the restart does to them.
The strange thing is the connection is acting as if two computers were trying to access the VPN server behind the same NAT when according to the user it is only a single device.
-
Here is our issue https://wiki.strongswan.org/issues/431, it was fixed 3 years ago when version 5.3 of strongSwan came out.
I had not found what strongSwan version we were using, I just assumed we were using something newer. Then I found that our edge router is using strongSwan 5.2.2.
Here is our version.
Status of IKE charon daemon (strongSwan 5.2.2, Linux 3.10.107-UBNT, mips64):
  uptime: 3 days, since Aug 06 22:12:40 2018
  malloc: sbrk 376832, mmap 0, used 295456, free 81376
  worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled:
From here https://community.ubnt.com/t5/EdgeMAX-Feature-Requests/Upgrade-to-strongswan-5-6-x/idi-p/1507341 we see that a change to strongSwan 5.5.x has been accepted, but we don't know when it will be available.
strongSwan 5.3+ can now handle identical policies by reusing the same reqid. This allows identical CHILD_SAs to the same host.
So that probably means multiple machines behind NAT could also work when the fix is implemented.
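To check other routers for the same exposure, the version can be read from the first line of the charon status output shown above. This is a minimal Python sketch, assuming the header keeps the "strongSwan X.Y.Z" form; `strongswan_version` and `has_reqid_reuse` are hypothetical helper names, and the 5.3 threshold is taken from this post, not verified independently.

```python
import re


def strongswan_version(status_line):
    """Extract the strongSwan version as a tuple of ints, or None if absent."""
    match = re.search(r"strongSwan (\d+(?:\.\d+)*)", status_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def has_reqid_reuse(version):
    """True if the version is 5.3 or newer (threshold as stated in this thread)."""
    return version >= (5, 3)


# Header line copied from the status output in this post.
STATUS = "Status of IKE charon daemon (strongSwan 5.2.2, Linux 3.10.107-UBNT, mips64):"

if __name__ == "__main__":
    version = strongswan_version(STATUS)
    print(version, "handles identical policies:", has_reqid_reuse(version))
```

Comparing tuples of integers avoids the string-comparison trap where "5.10" would sort before "5.3".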
-
jeeze,.. that is a sad state to think that we have been fighting this for that long,...
@JaredBusch @scottalanmiller
Can a cron be set to restart the ipsec every 24 hours?
-
@romo said in Help troubleshooting L2TP over IPSEC VPN connections.:
Here is our issue https://wiki.strongswan.org/issues/431, it was fixed 3 years ago when version 5.3 of strongSwan came out.
Yeah, that is what I found and was referring to. I just did not post it here.
-
@gjacobse said in Help troubleshooting L2TP over IPSEC VPN connections.:
jeeze,.. that is a sad state to think that we have been fighting this for that long,...
@JaredBusch @scottalanmiller
Can a cron be set to restart the ipsec every 24 hours?
Yes.