Mailing List Archive

[lvs-users] SYN storm with DR, not your average ARP issue
Hello,

I've been using LVS for ages (as my posts here 9 years ago show ^o^), and
consider myself well versed (and happy except for SH and quiescent) with
it.

Facts first:
Debian Stretch, kernel 4.9, ipvsadm 1.28.
Network is bonded (CLAG) to 2 Arctica switches, tagged ports, actual
interface is a VLAN (bond1.284).

2 servers, pacemaker, ldirectord, 1 having the LB and public VIP as well
as the service (LDAP), the other being "just" an LDAP server by default.
Again, not the first time I'm doing LVS by a long shot (though first time with
LDAP and bonded VLANs) and everything worked as expected.

However once in a while I'm seeing a SYN storm between the two LDAP nodes,
supposedly coming from a client node (the busiest one).
And at that time "ipvsadm -Lcn" will indeed show one connection from that
client in SYN state.
However:

1. The packets are not originating from the client at all.
2. Other connections from that client (and the rest) work fine.
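
For reference, entries in a SYN state can be picked out of the connection table with a small filter. This is a minimal sketch, assuming the usual "ipvsadm -Lcn" column layout (protocol, expire, state, source, virtual, destination). The helper name syn_entries is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only connection entries whose state column
# (3rd field) starts with SYN, e.g. SYN_RECV. Reads ipvsadm -Lcn output
# on stdin and prints matching lines unchanged.
syn_entries() {
    awk '$3 ~ /^SYN/'
}

# Usage (as root, on the director):
#   ipvsadm -Lcn | syn_entries
```

The header lines of the listing fall through the filter because their third field does not start with SYN.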

The failure is clearly related to the "slave" LDAP server; this never
happens on the one actually running LVS and having the public VIP.
Bringing the lo: interface with the VIP down and up on the slave fixes
things, until it happens again a day or so later.

Unfortunately I didn't have time to do a complete analysis the last time
on the "master" server, but I definitely can say the SYN packets were
local to the 2 servers and maybe the switches.
tcpdump on the slave showed that while they had the IP address of the
client the MAC was that of the master (LVS node).
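
A capture along these lines would show both the source IP and the source MAC of the looping SYNs. It is only a sketch: the interface name is the one mentioned above, port 389 is assumed from the ":ldap" service name, and syn_filter is a made-up helper that just builds the pcap filter string.

```shell
#!/bin/sh
# Values taken from the post or assumed; adjust to the actual setup.
IFACE=bond1.284   # VLAN interface named in the post
PORT=389          # assumption: LDAP on its standard port

# Hypothetical helper: build a pcap filter matching TCP segments on the
# given port that have SYN set and ACK clear (initial SYNs only).
syn_filter() {
    printf 'tcp port %s and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn\n' "$1"
}

# Run as root. -e prints the Ethernet header (source and destination MAC),
# -n disables name resolution:
#   tcpdump -e -n -i "$IFACE" "$(syn_filter "$PORT")"
syn_filter "$PORT"
```

With -e, a SYN carrying the client's IP address but the master's MAC as its Ethernet source stands out immediately.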

I'm wondering if this is a load issue or a corner case, as the rate of LDAP
connections is quite high (can peak to 500/s per server).
OTOH, on exactly the same HW but with another bonded (but no VLAN)
interface pair I'm also running another LVS setup for POP/IMAP for a
dovecot proxy that can see 50 connections per second per server.

Typical, normal state of the LDAP LVS (07 is the local one running LVS, 08
the "slave"):
---
# ipvsadm -L
IP Virtual Server version 1.2.1 (size=1048576)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  in-lbldap2:ldap rr
  -> inside-pp08:ldap             Route   1      292        3473
  -> inside-pp07:ldap             Route   1      35         3745
---

Anybody seen this before?
Any other data needed?

Christian
--
Christian Balzer Network/Systems Engineer
chibi@gol.com Rakuten Communications

_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@LinuxVirtualServer.org
Send requests to lvs-users-request@LinuxVirtualServer.org
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
Re: [lvs-users] SYN storm with DR, not your average ARP issue
Hello,

On Thu, 16 Nov 2017, Christian Balzer wrote:

> I've been using LVS for ages (as my posts here 9 years ago show ^o^), and
> consider myself well versed (and happy except for SH and quiescent) with
> it.
>
> Facts first:
> Debian Stretch, kernel 4.9, ipvsadm 1.28.
> Network is bonded (CLAG) to 2 Arctica switches, tagged ports, actual
> interface is a VLAN (bond1.284).
>
> 2 servers, pacemaker, ldirectord, 1 having the LB and public VIP as well
> as the service (LDAP), the other being "just" an LDAP server by default.
> Again, not the first I'm doing LVS by a long shot (though first time with
> LDAP and bonded VLANs) and everything worked as expected.
>
> However once in a while I'm seeing a SYN storm between the two LDAP nodes,
> supposedly coming from a client node (the busiest one).
> And at that time "ipvsadm -Lcn" will indeed show one connection from that
> client in SYN state.
> However:
>
> 1. The packets are not originating from the client at all.
> 2. Other connections from that client (and the rest) work fine.
>
> The failure clearly is related to the "slave" LDAP server, this never
> happens on the one actually running LVS and having the public VIP.
> Bringing the lo: interface with the VIP down and up on the slave fixes
> things, until it happens again a day or so later.
>
> Unfortunately I didn't have time to do a complete analysis the last time
> on the "master" server, but I definitely can say the SYN packets were
> local to the 2 servers and maybe the switches.
> tcpdump on the slave showed that while they had the IP address of the
> client the MAC was that of the master (LVS node).
>
> I'm wondering if this a load issue, corner case, as the rate of LDAP
> connections is quite high (can peak to 500/s per server).
> OTOH, on exactly the same HW but with another bonded (but no VLAN)
> interface pair I'm also running another LVS setup for POP/IMAP for a
> dovecot proxy that can see 50 connections per second per server.
>
> Typical, normal state of the LDAP LVS (07 is the local one running LVS, 08
> the "slave":
> ---
> # ipvsadm -L
> IP Virtual Server version 1.2.1 (size=1048576)
> Prot LocalAddress:Port Scheduler Flags
> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
> TCP in-lbldap2:ldap rr
> -> inside-pp08:ldap Route 1 292 3473
> -> inside-pp07:ldap Route 1 35 3745
> ---
>
> Anybody seen this before?

Yes, and we know of two solutions. One is setting
the sysctl var "backup_only" to 1 on all directors that can take
the role of backup server. Here is a recent thread that has more
info:

https://marc.info/?l=linux-virtual-server&m=148621038304357&w=2
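
As a rough illustration of the first solution, the sketch below prints the persistent setting and shows, in comments, how it could be applied. It assumes the ip_vs module is loaded so that the key exists under /proc/sys/net/ipv4/vs/. The helper name and the sysctl.d file name are hypothetical. According to the kernel's IPVS sysctl documentation, backup_only disables the director function while the server is in backup mode, to avoid packet loops with the DR/TUN methods. Whether the setup described here runs the sync daemon in backup mode is not stated in the thread.

```shell
#!/bin/sh
# Sketch only. Assumes ip_vs is loaded, so that this key exists.
KEY=net.ipv4.vs.backup_only

# Hypothetical helper: print the line for a persistent sysctl.d entry
# (value defaults to 1, i.e. enabled).
backup_only_conf() {
    printf '%s = %s\n' "$KEY" "${1:-1}"
}

# To apply, run as root on every director that can become backup:
#   sysctl -w net.ipv4.vs.backup_only=1
#   backup_only_conf > /etc/sysctl.d/90-ipvs-backup-only.conf  # hypothetical file name
backup_only_conf
```

The current value can be read back with "sysctl net.ipv4.vs.backup_only".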

> Any other data needed?

Regards

--
Julian Anastasov <ja@ssi.bg>
