Mailing List Archive

[lvs-users] source hashing some times land on wrong server (with FTP)
Hello!
We have FTP set up on its own VIP, map all ports (:0), and use
source hashing. Sometimes when the FTP client opens the data channel it
lands on the wrong real server, causing a reset. I stress "sometimes"
because FTP mostly works, but we do see requests landing on the wrong
server.

The FTP client connects to the VIP on the FTP port and is asked to open a
data channel to the VIP on an alternate port. The client sends a SYN, but
that packet doesn't land on the correct real FTP server, so the connection
is reset. That SYN likely came through a different IPVS server, but that
server should have the synced connection state by this time.

Example of our config:

-A -t x.y.z.220:0 -s sh -p 600 -b sh-fallback
-a -t x.y.z.220:0 -r a.b.c.4:0 -i -w 1
-a -t x.y.z.220:0 -r a.b.c.5:0 -i -w 1
-a -t x.y.z.220:0 -r a.b.c.6:0 -i -w 1
-a -t x.y.z.220:0 -r a.b.c.7:0 -i -w 1

3.10.0-1062.1.1.el7.x86_64

We have this config running on multiple active IPVS servers, all running
active/backup sync processes.

We've also tried a weight other than 1 (1000) to see whether the overload
logic was kicking in and sending requests to an alternate server, but that
did not seem to be it.

Is there any reason why subsequent connections from the same source IP
would land on a different server?

Thanks,
Phillip Moore
_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@LinuxVirtualServer.org
Send requests to lvs-users-request@LinuxVirtualServer.org
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
Re: [lvs-users] source hashing some times land on wrong server (with FTP)
Phillip,

Are you getting any health check failures?
It's not a consistent hash - so any config changes would throw it.

Why not just use the standard persistence table?
The stick table for that is completely reliable, as long as you have
enough memory.



Regards,

Malcolm Turnbull

Loadbalancer.org Ltd.

www.loadbalancer.org
+44 (0)330 380 1064


Re: [lvs-users] source hashing some times land on wrong server (with FTP)
On Fri, Nov 1, 2019 at 12:39 PM Malcolm Turnbull <malcolm@loadbalancer.org>
wrote:

> Are you getting any health check failures?
> It's not a consistent hash - so any config changes would throw it


There do not seem to be any health check failures that correspond with the
incident. That was my thought too, especially because of the sh-fallback
option in use.

> Why not just use the standard persistence table?
> The stick table for that is completely reliable, as long as you have
> enough memory.

Can you elaborate on what you mean by this? I was using source hash (sh)
on :0 to ensure that when a client connects back on the data port, it is
hashed to the same real server it was sent to for the FTP control port. This
seems to work most of the time, since most FTP sessions succeed. It is
certainly better than chance: we have 4 FTP servers behind the VIP, so if
the hashing were not working I would expect a data connection to reach the
right server only about 25% of the time.

So I'm not sure how this would work or what you mean by the standard
persistence table.

Thank you for the suggestions.

Phillip


Re: [lvs-users] source hashing some times land on wrong server (with FTP)
Phillip,

I'm not sure why the SH scheduler is not working for you,
but I'm pretty certain that if you just remove it and use WLC with
persistence then you should get the desired result.

i.e.

-A -t x.y.z.220:0 -s wlc -p 600
-a -t x.y.z.220:0 -r a.b.c.4:0 -i -w 1
-a -t x.y.z.220:0 -r a.b.c.5:0 -i -w 1
-a -t x.y.z.220:0 -r a.b.c.6:0 -i -w 1
-a -t x.y.z.220:0 -r a.b.c.7:0 -i -w 1

Otherwise, the SH scheduler overrides the default persistence table
(which you don't really need to do).
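
To illustrate what the persistence table does, here is a minimal Python
sketch of a single director. It is a simplified model, not IPVS code: the
class name, the round-robin stand-in for WLC, and the refresh-on-use expiry
are all assumptions made for illustration. The point is only that once a
template exists for a client IP, every new connection from that IP (control
or data, any port) follows it until the timeout expires.

```python
import itertools


class PersistentDirector:
    """Toy model of one director with a -p style persistence timeout."""

    def __init__(self, servers, timeout):
        self.timeout = timeout
        # Round-robin is a stand-in for the real scheduler (e.g. wlc).
        self._scheduler = itertools.cycle(servers)
        # client_ip -> (real_server, expires_at)
        self.templates = {}

    def schedule(self, client_ip, now):
        """Return the real server for a new connection arriving at `now`."""
        entry = self.templates.get(client_ip)
        if entry is not None and entry[1] > now:
            server = entry[0]                # template still valid: reuse it
        else:
            server = next(self._scheduler)   # no template: ask the scheduler
        # Assumption of this sketch: each new connection refreshes the expiry.
        self.templates[client_ip] = (server, now + self.timeout)
        return server


director = PersistentDirector(
    ["a.b.c.4", "a.b.c.5", "a.b.c.6", "a.b.c.7"], timeout=600
)
control = director.schedule("192.0.2.9", now=0)     # FTP control connection
other = director.schedule("198.51.100.7", now=1)    # unrelated client
data = director.schedule("192.0.2.9", now=5)        # FTP data connection
print(control, other, data)
```

In this model the data connection follows the control connection even though
another client was scheduled in between. The guarantee is per director: with
several active directors, each one needs the template before the data
connection reaches it.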

Obviously I may be missing something important - but I don't think so.




Regards,

Malcolm Turnbull

Loadbalancer.org Ltd.
www.loadbalancer.org
+44 (0)330 380 1064

Re: [lvs-users] source hashing some times land on wrong server (with FTP)
Hello,

On Fri, 1 Nov 2019, Phillip Moore wrote:

> Is there any reason why subsequent connections from the same source IP
> would land on a different server?

Try setting the backup_only sysctl var to 1 on all directors
that are backup servers and that can also be used as real servers.
The flag can stay at 1 even while the director runs as master. For the
rare setups that run both the master and backup functions at the same time,
this flag should not be used.

As a result, when the backup function is active, any traffic received on
backup servers is delivered locally; it is not rescheduled to
other real servers. The backup_only flag is useful for DR/TUN setups to
avoid packet loops or, as in your case, to avoid rescheduling to a different
real server. Why does this happen? Maybe because sync messages are delayed,
sometimes by up to 2 seconds.
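
A toy timeline makes the sync-delay explanation concrete. The Python sketch
below is a model under stated assumptions, not IPVS behaviour: `Director`,
`default_server` (whatever that director's scheduler would pick on its own)
and `open_control` are hypothetical names, and the fixed 2-second delay is
just the worst case mentioned above.

```python
SYNC_DELAY = 2.0  # seconds; the worst case quoted above, used as a fixed delay


class Director:
    def __init__(self, name, default_server):
        self.name = name
        # Hypothetical: the server this director's scheduler picks unaided.
        self.default_server = default_server
        # client_ip -> (real_server, time from which the entry is usable)
        self.state = {}

    def learn(self, client_ip, server, usable_from):
        self.state[client_ip] = (server, usable_from)

    def route(self, client_ip, now):
        entry = self.state.get(client_ip)
        if entry is not None and entry[1] <= now:
            return entry[0]          # synced state has already arrived
        return self.default_server   # no state yet: schedule independently


def open_control(first, peer, client_ip, now):
    """Control connection hits `first`; `peer` learns of it after the delay."""
    server = first.route(client_ip, now)
    first.learn(client_ip, server, now)
    peer.learn(client_ip, server, now + SYNC_DELAY)
    return server


dir_a = Director("A", default_server="a.b.c.4")
dir_b = Director("B", default_server="a.b.c.6")

control = open_control(dir_a, dir_b, "192.0.2.9", now=0.0)
early_data = dir_b.route("192.0.2.9", now=0.5)   # before the sync message lands
late_data = dir_b.route("192.0.2.9", now=2.5)    # after it lands
print(control, early_data, late_data)
```

In this model a data connection that reaches the second director within the
delay window is scheduled from scratch and can end up on a different real
server than the control connection.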

Also, make sure the real servers are listed (added) in the same order
on all directors that use the SH scheduler. If not, the scheduler can select
a different real server on each director. There is no such requirement for
the MH scheduler, which is more advanced but was only added in more recent
kernels (4.18+).
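
The ordering requirement can be seen in a small Python sketch of a
table-based source hash. This is a simplified model rather than the kernel
code: the 256-bucket table, the way weights are expanded, and the
multiplicative hash are assumptions chosen for illustration, and
`build_table` and `pick` are hypothetical names.

```python
import ipaddress

TABLE_SIZE = 256  # bucket count assumed for this sketch


def build_table(servers):
    """Assign buckets by walking the server list in the order it was added.

    `servers` is a list of (address, weight); a server with weight w is
    assumed to take w consecutive turns per pass over the list.
    """
    turns = [addr for addr, weight in servers for _ in range(weight)]
    return [turns[i % len(turns)] for i in range(TABLE_SIZE)]


def pick(table, client_ip):
    """Hash the client address to a bucket (stand-in hash, not the kernel's)."""
    key = int(ipaddress.ip_address(client_ip))
    bucket = (key * 2654435761) % TABLE_SIZE
    return table[bucket]


# Same four real servers, but the first two were added in a different order.
director_a = build_table(
    [("a.b.c.4", 1), ("a.b.c.5", 1), ("a.b.c.6", 1), ("a.b.c.7", 1)]
)
director_b = build_table(
    [("a.b.c.5", 1), ("a.b.c.4", 1), ("a.b.c.6", 1), ("a.b.c.7", 1)]
)

client = "192.0.2.9"
print(pick(director_a, client), pick(director_b, client))
```

Both directors hash the client to the same bucket, but the bucket holds a
different real server on each, so a control connection through one director
and a data connection through the other would not meet. Adding the servers
in one agreed order (for example sorted by address) on every director
removes that difference in this model.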

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: [lvs-users] source hashing some times land on wrong server (with FTP)
Julian,

Thank you for the response. I'm not quite sure I understand though or maybe
I did not express our setup clearly.

We have a set of director hosts that are all active (running master and
backup at the same time), BGP/anycasting their VIPs. They use TUN mode to
load balance across separate real servers running the FTP server.

It sounds to me from the documentation that backup_only is for when the real
server is the same host as the director. But we aren't doing that.

We had previously used WLC instead of SH as the LB algorithm, but we had
customers facing similar problems where connections to the data channel would
land on a different FTP server (one unprepared to handle them). SH seemed a
good fit since it only looks at the source IP, so seemingly requests from
the same client (regardless of src or dst port) would land on the same real
FTP server for the data port.

Thank you for your input, and I would appreciate it if you could expand a
little on whether backup_only would help in this case.


Regards,
Phillip Moore




Re: [lvs-users] source hashing some times land on wrong server (with FTP)
Hello,

On Mon, 4 Nov 2019, Phillip Moore wrote:

> We have a set of director hosts that are all active (running master and
> backup at the same time), BGP/anycasting their VIPs. They use TUN mode to
> load balance across separate real servers running the FTP server.
>
> It sounds to me from the documentation that backup_only is for when the real
> server is the same host as the director. But we aren't doing that.
>
> We had previously used WLC instead of SH as the LB algorithm, but we had
> customers facing similar problems where connections to the data channel would
> land on a different FTP server (one unprepared to handle them). SH seemed a
> good fit since it only looks at the source IP, so seemingly requests from
> the same client (regardless of src or dst port) would land on the same real
> FTP server for the data port.
>
> Thank you for your input, and I would appreciate it if you could expand a
> little on whether backup_only would help in this case.

No, this flag helps the backup server to run as a real server
by skipping the steps that IPVS performs to look up and create connections.
The packets are simply passed to the local stack without any inspection.

Note that if sh-fallback detects an unavailable real server (even one with
weight 0), the fallback is temporary: when this real server becomes
available again the traffic is switched back immediately, so at such
switching moments two connections can land on different servers.
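
The switch-back effect can be sketched in a few lines of Python. This is an
illustrative model only: `pick_with_fallback` is a hypothetical helper,
"next available server in the list" is a simplification of how a fallback
might be chosen, and the preferred index for the client is assumed rather
than computed.

```python
SERVERS = ["a.b.c.4", "a.b.c.5", "a.b.c.6", "a.b.c.7"]


def pick_with_fallback(preferred_index, available):
    """Return the preferred server, or the next available one after it."""
    for step in range(len(SERVERS)):
        candidate = SERVERS[(preferred_index + step) % len(SERVERS)]
        if candidate in available:
            return candidate
    return None  # nothing is available


preferred = 1  # hypothetical: this client's hash maps to a.b.c.5

# The control connection arrives while a.b.c.5 is unavailable.
control = pick_with_fallback(
    preferred, available={"a.b.c.4", "a.b.c.6", "a.b.c.7"}
)
# a.b.c.5 comes back; the data connection is hashed to it straight away.
data = pick_with_fallback(preferred, available=set(SERVERS))
print(control, data)
```

Here the control connection falls back to a.b.c.6, and the data connection,
opened a moment after a.b.c.5 returns, goes to a.b.c.5. A brief availability
flap between the two connections is enough to split one FTP session across
two real servers.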

Regards

--
Julian Anastasov <ja@ssi.bg>
