Mailing List Archive

[lvs-users] Understanding granularity, timeouts and unexpected balance of traffic on reals
Hi everyone,

I'm investigating a typical configuration for an L4 TCP load balancer using
ipvs+keepalived. Settings:

persistence_timeout: 120 seconds (LVS persistence timeout)
/sbin/ipvsadm --set 1800 120 300 (30-minute timeout for established TCP)
persistence_granularity: "48" for IPv6
lb_algo: rr (round-robin)
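
For reference, the equivalent raw ipvsadm setup would look roughly like
this (a sketch only; the fwmark value 100 and the 2001:db8:: real-server
addresses are placeholders, not my actual config):

# 30-minute timeout for established TCP (order: tcp tcpfin udp)
/sbin/ipvsadm --set 1800 120 300
# IPv6 virtual service on a fwmark: round-robin, 120 s persistence,
# /48 source-address granularity
ipvsadm -A -f 100 -6 -s rr -p 120 -M 48
# Real servers (direct routing); ports are 0 for fwmark services
ipvsadm -a -f 100 -6 -r [2001:db8::1]:0 -g
ipvsadm -a -f 100 -6 -r [2001:db8::2]:0 -g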

My expectation is that all IPs from the same /48 IPv6 subnet should always
reach the same real_server because of the granularity setting (at least
for connections created in the last 120 seconds).

However, I can see that established connections from the same /48 v6 subnet
are spread across multiple reals, even for recently established
connections.

# Same /48 going to different reals (very recent connections):
1. Grep on only the first 3 hextets (16-bit groups) to see how a single
/48 is being distributed.
2. " | grep ESTAB | grep 29: | head -n 100" to see only the first 100
established connections created in the last 60 seconds, since my TCP
timeout is set to 30:00 (1800 seconds).
3. The 6th column is the real server IP.
The same /48 is getting distributed across multiple different reals
(it should be a single real because persistence_granularity is set to 48).
$ sudo ipvsadm -lnc | grep "xxxx:xxxx:xxxx" | grep ESTAB | grep 29: | head
-n 100 | awk '{print $6}' | sort | uniq -c
2 [V6IP_REDACTED:9222]:443
9 [V6IP_REDACTED:9223]:443
7 [V6IP_REDACTED:9224]:443
13 [V6IP_REDACTED:9225]:443
1 [V6IP_REDACTED:9226]:443
............
............ output redacted


- Why are recent connections going to different reals?
- For recent connections, shouldn't they always end up on the same real?
- For older connections, I guess the persistence_timeout expiring causes
the traffic to be rebalanced across other reals via round-robin.


Thanks in advance!
--
Cheers,
Abhijeet (https://abhi.host)
_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@LinuxVirtualServer.org
Send requests to lvs-users-request@LinuxVirtualServer.org
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
Re: [lvs-users] Understanding granularity, timeouts and unexpected balance of traffic on reals
Hi everyone,

Bump. Can anyone help with why IPVS is not respecting the subnet
persistence granularity?

Thanks,
Abhijeet



--
Cheers,
Abhijeet (https://abhi.host)
Re: [lvs-users] Understanding granularity, timeouts and unexpected balance of traffic on reals
Hello,

On Thu, 15 Aug 2019, Abhijeet Rastogi wrote:

> Hi everyone,
>
> I'm investigating a typical configuration for an L4 TCP load balancer using
> ipvs+keepalived. Settings:
>
> persistence_timeout: 120 seconds (LVS persistence timeout)
> /sbin/ipvsadm --set 1800 120 300 (30-minute timeout for established TCP)
> persistence_granularity: "48" for IPv6
> lb_algo: rr (round-robin)
>
> My expectation is that all IPs from the same /48 IPv6 subnet should always
> reach the same real_server because of the granularity setting (at least
> for connections created in the last 120 seconds).

Yes, if they are for the same virtual service. Otherwise,
it would be a bug if connections from the same subnet went to different
real servers.

> However, I can see that established connections from the same /48 v6 subnet
> are spread across multiple reals, even for recently established
> connections.
>
> # Same /48 going to different reals (very recent connections):
> 1. Grep on only the first 3 hextets (16-bit groups) to see how a single
> /48 is being distributed.
> 2. " | grep ESTAB | grep 29: | head -n 100" to see only the first 100
> established connections created in the last 60 seconds, since my TCP
> timeout is set to 30:00 (1800 seconds).
> 3. The 6th column is the real server IP.
> The same /48 is getting distributed across multiple different reals
> (it should be a single real because persistence_granularity is set to 48).
> $ sudo ipvsadm -lnc | grep "xxxx:xxxx:xxxx" | grep ESTAB | grep 29: | head
> -n 100 | awk '{print $6}' | sort | uniq -c
> 2 [V6IP_REDACTED:9222]:443
> 9 [V6IP_REDACTED:9223]:443
> 7 [V6IP_REDACTED:9224]:443
> 13 [V6IP_REDACTED:9225]:443
> 1 [V6IP_REDACTED:9226]:443
> ............
> ............ output redacted

I'm not sure how many virtual servers you are using.
Please list your configuration, even with scrambled IPs:

ipvsadm -Ln

> - Why are recent connections going to different reals?
> - For recent connections, shouldn't they always end up on the same real?
> - For older connections, I guess the persistence_timeout expiring causes
> the traffic to be rebalanced across other reals via round-robin.

The persistence timeout (-p N) is used as a minimum time.
During this period other connections can come and expire.

When the timer expires, it can be extended by a new 60 seconds
each time if there are existing connections (even ones no longer in ESTAB
state) that refer to the persistence template (a connection entry with
zeros after the 48th bit) created to remember which real server is used.
So the persistence template can live a very long time if the subnet is
very active. You should see one such template for every subnet.
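
A quick way to look for these templates (a sketch; substitute the real /48
prefix for the placeholder): with a /48 mask, the template's client
address is zeroed beyond the 48th bit, typically with port 0, so something
like this should find it:

sudo ipvsadm -lnc | grep "xxxx:xxxx:xxxx::"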

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: [lvs-users] Understanding granularity, timeouts and unexpected balance of traffic on reals
Hi Julian,

Thanks for taking a look.

> Yes, if they are for the same virtual service. Otherwise,
> it would be a bug if connections from the same subnet went to different
> real servers.

I can confirm that traffic is coming to only one virtual service. Column 5
of "sudo ipvsadm -lnc" is the configured destination virtual service; this
is the output:

[arastogi@esv5-app02 ~]$ sudo ipvsadm -lnc | grep <source_ip_/48_prefix> |
grep ESTAB | grep 29: | head -n 100 | awk '{print $5}' | sort | uniq -c |
sort -nk 1
100 [VIP_configured]:443

I see only the VIP as the destination for currently active ESTABLISHED
connections created <60 seconds ago.

> I'm not sure how many virtual servers you are using.
> Please list your configuration, even with scrambled IPs:
>
> ipvsadm -Ln

Now that you've said it would be a bug if this weren't happening, it looks
like I missed a key section in the ipvsadm output.

FWM 97284778 IPv6 rr persistent 120
-> [v6_reals:9222]:0 Route 1 0 0
-> [v6_reals:9223]:0 Route 1 0 0
-> [v6_reals:9224]:0 Route 1 0 0

There is no mask mentioned in the service table info (line 1). That should
mean the mask is 128, per the ipvsadm code:

if (se->af == AF_INET6)
        if (se->netmask != 128)
                printf(" mask %i", se->netmask);
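
So if the /48 granularity were actually programmed, that printf suggests
the first line of the listing should instead have read something like:

FWM  97284778 IPv6 rr persistent 120 mask 48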

Thanks, Julian.

Cheers,
Abhijeet



--
Cheers,
Abhijeet (https://abhi.host)
Re: [lvs-users] Understanding granularity, timeouts and unexpected balance of traffic on reals
Hi Julian,

I'd still like to understand this behavior better. You mentioned:

```
When the timer expires, it can be extended by a new 60 seconds each time
if there are existing connections (even ones no longer in ESTAB state)
that refer to the persistence template (a connection entry with zeros
after the 48th bit) created to remember which real server is used. So the
persistence template can live a very long time if the subnet is very
active. You should see one such template for every subnet.
```

Where is this documented? Is this 60 seconds configurable? Can it be
disabled? Is it related to this code?
https://sourcegraph.com/github.com/torvalds/linux@master/-/blob/net/netfilter/ipvs/ip_vs_conn.c#L892



--
Cheers,
Abhijeet (https://abhi.host)
Re: [lvs-users] Understanding granularity, timeouts and unexpected balance of traffic on reals
Hello,

On Mon, 26 Aug 2019, Abhijeet Rastogi wrote:

> Hi Julian,
>
> I still wanted to understand this behavior more. You mentioned:-
>
> ```
> When the timer expires, it can be extended by a new 60 seconds each time
> if there are existing connections (even ones no longer in ESTAB state)
> that refer to the persistence template (a connection entry with zeros
> after the 48th bit) created to remember which real server is used. So the
> persistence template can live a very long time if the subnet is very
> active. You should see one such template for every subnet.
> ```
>
> Where is this documented? Is this 60 seconds configurable? Can it be
> disabled? Is it related to this code?
> https://sourcegraph.com/github.com/torvalds/linux@master/-/blob/net/netfilter/ipvs/ip_vs_conn.c#L892

Yes, it is hardcoded in ip_vs_conn_expire() and cannot be
disabled, because we know only the count of connections that hold a
pointer to the template (n_control); the template has no list of the
connections from its subnet. So the template just waits for all traffic
to stop.
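
One way to observe this (a sketch; substitute the real /48 prefix for the
placeholder) is to watch the template entry and see its expire timer being
refreshed while the subnet stays active:

watch -n 1 'sudo ipvsadm -lnc | grep "xxxx:xxxx:xxxx::"'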

> On Mon, Aug 26, 2019 at 2:05 PM Abhijeet Rastogi <abhijeet.1989@gmail.com>
> wrote:

> > Now that you've said it would be a bug if this weren't happening, it
> > looks like I missed a key section in the ipvsadm output.
> >
> > FWM 97284778 IPv6 rr persistent 120
> > -> [v6_reals:9222]:0 Route 1 0 0
> > -> [v6_reals:9223]:0 Route 1 0 0
> > -> [v6_reals:9224]:0 Route 1 0 0
> >
> > There is no mask mentioned in the service table info (line 1). That
> > should mean the mask is 128, per the ipvsadm code:
> >
> > if (se->af == AF_INET6)
> >         if (se->netmask != 128)
> >                 printf(" mask %i", se->netmask);

Make sure -6 is specified immediately after -f FWMARK.

For example:

ipvsadm -A -f FWMARK -6 -s rr -p 120 -M 48

Does it work this way?

http://kb.linuxvirtualserver.org/wiki/IPv6_load_balancing

Maybe the ipvsadm man page should have an example for -6.
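
After re-adding the service that way, the printf in the ipvsadm source
quoted above suggests the mask should then appear in the listing (a
hypothetical check, with your own fwmark):

sudo ipvsadm -Ln
# FWM  97284778 IPv6 rr persistent 120 mask 48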

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: [lvs-users] Understanding granularity, timeouts and unexpected balance of traffic on reals
Hi Julian,

Thanks for the information. I was using keepalived-1.3.5-8.el7_6.x86_64 to
manage the ipvsadm settings, and it does not seem to be applying this
config correctly.

From the keepalived.conf manpage:

# LVS granularity mask (-M in ipvsadm)
persistence_granularity <NETMASK>

For now I have a good understanding of the problem; you were really
helpful.

Thanks,
Abhijeet



--
Cheers,
Abhijeet (https://abhi.host)