Mailing List Archive

[lvs-users] Unexpected source IP selection in IPVS TUN
Hello,

I am looking for some help with an intermittent issue with IPVS source IP
selection in TUNNEL mode. We run LVS directly in a Kubernetes cluster,
potentially on the same machine as backend workload so in many cases the LVS
Director is also a real server. Keepalived runs in active/passive on a pair
of VMs, working to maintain a VIP on the public interface for some level of
HA. We periodically notice a strange selection of source IP for IPinIP
tunnelled traffic coming out of the LVS when using source hashing (sh)
scheduling.

This problem has been noticed in the following environment:
- Virtual Machines running Red Hat Enterprise Linux
- Kernel version 3.10.0-1062.18.1.el7.x86_64
- IPVS version 1.2.1 with TUN and source hash (sh) scheduling.
- OpenShift 4.3
- OpenShift 3.11

An example from a 3 node Kubernetes cluster:
- 10.221.95.10
- 10.221.95.2
- 10.221.95.5

with Linux director running on 10.221.95.2 and virtual service directing
traffic to:
- localhost
- remote node 10.221.95.5

The local endpoint/real server uses Direct Routing (Route). The remote real
server uses Tunnel.

LVS Configuration:
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  169.46.4.90:80 sh
  -> 10.221.95.5:80               Tunnel  1      0          0
  -> 127.0.0.11:80                Route   1      1          0
/ # ipvsadm -Lnc
IPVS connection entries
pro expire state       source               virtual        destination
TCP 14:44  ESTABLISHED 128.92.120.147:56212 169.46.4.90:80 127.0.0.11:80

Traffic that moves through the LVS and stays on the same node establishes
connections successfully. The problem arises when traffic from a particular
source IP is selected to transit by tunnel to a remote real server.

EXPECTED BEHAVIOR: IPVS encapsulates the traffic with IPinIP using the IP
address from the private interface of the VM (10.X.X.X). Example traffic
successfully balanced from LVS director VM 10.221.95.2 to remote real server
10.221.95.5:

# tcpdump -n -i eth0 host 10.221.95.2 and proto 4
13:58:28.151571 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
169.46.4.90.80: Flags [S], seq 180302151, win 65535, options [mss
1460,sackOK,TS val 590414746 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
13:58:28.152447 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
169.46.4.90.80: Flags [.], ack 2964164084, win 128, options [nop,nop,TS val
590414747 ecr 89050127], length 0 (ipip-proto-4)
13:58:28.152467 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
169.46.4.90.80: Flags [P.], seq 0:75, ack 1, win 128, options [nop,nop,TS
val 590414747 ecr 89050127], length 75: HTTP: GET / HTTP/1.1 (ipip-proto-4)
13:58:28.154037 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
169.46.4.90.80: Flags [.], ack 723, win 131, options [nop,nop,TS val
590414749 ecr 89050129], length 0 (ipip-proto-4)

NOTE: The above trace was grabbed after finding a way around the issue (see
below) and depicts only inbound traffic from the LVS. DSR carries the
response back to the client out eth1.

OBSERVED BEHAVIOR: IPVS mysteriously encapsulates traffic with source IP
from 127.X.255.255. Running tcpdump from the remote real server
(10.221.95.5):

# tcpdump -n -i eth0 net 127.0.0.0/8 and proto 4
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:43:34.065782 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >
169.46.4.90.80: Flags [S], seq 146570019, win 65535, options [mss
1460,sackOK,TS val 539120382 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
23:43:35.065967 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >
169.46.4.90.80: Flags [S], seq 146570019, win 65535, options [mss
1460,sackOK,TS val 539121383 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
23:43:37.082042 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >
169.46.4.90.80: Flags [S], seq 146570019, win 65535, options [mss
1460,sackOK,TS val 539123399 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
23:43:41.306020 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >
169.46.4.90.80: Flags [S], seq 146570019, win 65535, options [mss
1460,sackOK,TS val 539127623 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)

One can see that the arriving IPinIP tunneled traffic has a source IP of
127.138.255.255. This is NOT expected. This is accompanied by kernel logs
like:

kernel: IPv4: martian source 10.X.X.X from 127.X.255.255, on dev eth0

Why is IPVS selecting this source IP for tunnelled traffic instead of the IP
from the private interface (eth0) of the VM?
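
To spot affected flows in a longer capture, the outer source address of each
IPinIP line can be checked against 127.0.0.0/8. The following is only a
minimal Python sketch, assuming `tcpdump -n` output in the format shown above;
the function name loopback_outer_sources is hypothetical and not part of any
existing tool.

```python
import ipaddress
import re

# Matches the outer header of a "tcpdump -n" IPinIP line, for example:
# "23:43:34.065782 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >"
OUTER = re.compile(r"^\S+ IP (\d+\.\d+\.\d+\.\d+) > (\d+\.\d+\.\d+\.\d+): IP ")


def loopback_outer_sources(lines):
    """Return (outer_src, outer_dst) pairs whose outer source is in 127.0.0.0/8."""
    hits = []
    for line in lines:
        m = OUTER.match(line)
        if m and ipaddress.ip_address(m.group(1)).is_loopback:
            hits.append((m.group(1), m.group(2)))
    return hits


# First lines of two packets taken from the captures in this message.
sample = [
    "23:43:34.065782 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >",
    "13:58:28.151571 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >",
]
print(loopback_outer_sources(sample))
```

Only the first packet is reported, since its outer source falls in the
loopback range; the second uses the director's private address as expected.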

We have noticed that the problem can be resolved by the following:
- trigger LB failover (keepalived moves VIP to new node and IPVS needs
reprogramming)
- create a new LB (again IPVS needs to be programmed to include the new
virtual service)

I understand that several factors are at play here, and I will keep trying to
isolate the cause, but any insight into how IPVS selects the source IP would
be much appreciated.

Best,
Calvin
_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@LinuxVirtualServer.org
Send requests to lvs-users-request@LinuxVirtualServer.org
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
Re: [lvs-users] Unexpected source IP selection in IPVS TUN
Hello,

On Mon, 6 Apr 2020, Calvin Zachman wrote:

> EXPECTED BEHAVIOR: IPVS encapsulates the traffic with IPinIP using the IP
> address from the private interface of the VM (10.X.X.X). Example traffic
> successfully balanced from LVS director VM 10.221.95.2 to remote real server
> 10.221.95.5:
>
> # tcpdump -n -i eth0 host 10.221.95.2 and proto 4
> 13:58:28.151571 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
> 169.46.4.90.80: Flags [S], seq 180302151, win 65535, options [mss
> 1460,sackOK,TS val 590414746 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
> 13:58:28.152447 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
> 169.46.4.90.80: Flags [.], ack 2964164084, win 128, options [nop,nop,TS val
> 590414747 ecr 89050127], length 0 (ipip-proto-4)
> 13:58:28.152467 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
> 169.46.4.90.80: Flags [P.], seq 0:75, ack 1, win 128, options [nop,nop,TS
> val 590414747 ecr 89050127], length 75: HTTP: GET / HTTP/1.1 (ipip-proto-4)
> 13:58:28.154037 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
> 169.46.4.90.80: Flags [.], ack 723, win 131, options [nop,nop,TS val
> 590414749 ecr 89050129], length 0 (ipip-proto-4)
>
> NOTE: The above trace was grabbed after finding a way around the issue (see
> below) and depicts only inbound traffic from the LVS. DSR carries the
> response back to the client out eth1.
>
> OBSERVED BEHAVIOR: IPVS mysteriously encapsulates traffic with source IP
> from 127.X.255.255. Running tcpdump from the remote real server
> (10.221.95.5):
>
> # tcpdump -n -i eth0 net 127.0.0.0/8 and proto 4
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
> 23:43:34.065782 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >
> 169.46.4.90.80: Flags [S], seq 146570019, win 65535, options [mss

Looking at the archives, I found a thread that may help you:

https://marc.info/?t=153556562900003&r=1&w=2

Check if your kernel has this line removed from
do_output_route4():

fl4.saddr = (rt_mode & IP_VS_RT_MODE_CONNECT) ? *saddr : 0;

It is probably still present in your kernel.
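
One way to run this check against an unpacked kernel source tree is to search
the file that defines do_output_route4() for the assignment. A rough Python
sketch follows; the helper name has_saddr_assignment is hypothetical, and the
path net/netfilter/ipvs/ip_vs_xmit.c (where mainline keeps this function) is
an assumption about your particular tree.

```python
import re

# The assignment quoted above, with all whitespace removed so that
# indentation or line wrapping in the source file does not matter.
NEEDLE = "fl4.saddr=(rt_mode&IP_VS_RT_MODE_CONNECT)?*saddr:0;"


def has_saddr_assignment(source_text):
    """Return True if the quoted fl4.saddr assignment appears in source_text."""
    return NEEDLE in re.sub(r"\s+", "", source_text)


# Demo on inline strings; for a real check, pass the contents of the file,
# e.g. open("net/netfilter/ipvs/ip_vs_xmit.c").read() (path is an assumption).
with_line = "\tfl4.saddr = (rt_mode & IP_VS_RT_MODE_CONNECT) ?\n\t\t*saddr : 0;\n"
without_line = "/* assignment not present */\n"
print(has_saddr_assignment(with_line))
print(has_saddr_assignment(without_line))
```

If the function reports the line as present, the tree most likely predates
the change discussed in the linked thread.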

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: [lvs-users] Unexpected source IP selection in IPVS TUN
Hi Julian,

That is the exact issue we are seeing. Thank you for the guidance. I will
look into whether we can update to the latest RHEL with newer kernel
version.
I know it's probably a long shot, but are you aware of any workaround
without updating?

Thanks,
Calvin

Re: [lvs-users] Unexpected source IP selection in IPVS TUN
Hello,

On Tue, 7 Apr 2020, Calvin Zachman wrote:

> Hi Julian,
>
> That is the exact issue we are seeing. Thank you for the guidance. I will
> look into whether we can update to the latest RHEL with newer kernel
> version.
> I know it's probably a long shot, but are you aware of any workaround
> without updating?

Sorry for the delay. I'm not aware of any workaround. Maybe SNAT
translation could help: if the IPVS box cannot do it, some next hop might.

Regards

--
Julian Anastasov <ja@ssi.bg>
