Mailing List Archive

[lvs-users] SYN spiraling between master and slave IPVS balancers
Hello,

We have run into quite a troublesome situation which causes an internal SYN
storm.

The simplified version of the configuration consists of 2 servers - A
and B, both running Linux kernel 3.7.4-20.

Both have the IPVS software enabled, A is acting as the active load
balancer, B as a backup.
Both servers act as real servers also.

At some point, there is an incoming TCP connection from IP pair
(address:port) I.
The load balancer A decides to process it locally. The connection is
established, and the connection state is distributed to server B via the
sync broadcast.

The client closes the connection, and again the state is updated on B via
the broadcast - the connection is now in the "TCP_WAIT" state.

Pretty soon (within 10 seconds) the client opens a new TCP connection
using the same IP pair I.
It is not good TCP practice, but nevertheless, some clients work this way.

This time the load balancer A decides that the connection is to be
handled on server B (persistence is switched off).
The SYN packet is relayed to server B, which finds an existing (synced)
connection record for that pair I.
And that record (in the CLOSE state) points to server A, so the
SYN packet is relayed there.

Server A processes it again, directs it to server B again, and
the loop spirals, since server B does not have the new connection
table entry for I synced.

We can send packet dumps illustrating the problem.
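
For anyone reproducing the capture, something like the following tcpdump
invocation (the interface and client address are placeholders) shows the
same SYN bouncing between the two directors:

  # only SYN segments to/from the client pair I
  tcpdump -ni eth0 'tcp[tcpflags] & (tcp-syn) != 0 and host 192.0.2.10'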

If our analysis is correct, what are the available workarounds?
a) we can always use the "persistent" option with a timeout larger than
the CLOSE (TIME_WAIT?) state timeout.
b) on server B we can remove the iptables rules that mark incoming
packets with the fwmark that IPVS uses.
We can insert those iptables rule(s) only when server B becomes the
main load balancer (a sketch follows below). But will that stop IPVS from
running all incoming packets through its (synced) connection table?
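
A rough sketch of workaround (b), assuming the usual fwmark-based setup
(the interface, port and mark value below are placeholders, not the real
configuration); the MARK rule exists only while this node is the active
director:

  # while acting as the active director: mark client traffic for IPVS
  iptables -t mangle -A PREROUTING -i eth0 -p tcp --dport 80 \
           -j MARK --set-mark 1000

  # when dropping back to backup: remove the rule, so packets relayed by
  # the other director are served locally instead of being re-balanced
  iptables -t mangle -D PREROUTING -i eth0 -p tcp --dport 80 \
           -j MARK --set-mark 1000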

--
Best regards,
Dmitry Akindinov

_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@LinuxVirtualServer.org
Send requests to lvs-users-request@LinuxVirtualServer.org
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Dear Dmitry,

On 5 February 2013 16:45, Dmitry Akindinov <dimak@stalker.com> wrote:

> Hello,
>
> We have met a quite troublesome situation which causes an internal SYN
> storm.
>

So have we. We see intermittent loops of traffic, at various "levels", i.e.
15 Mbps if only one incoming packet gets duplicated, 30 Mbps if two are,
etc. These storms die down on their own in our case; we do not yet know why
they appear and why they disappear again.


> The simplified version of the configuration consists of 2 servers - A
> and B, both running Linux kernel 3.7.4-20.
>

Our kernels are currently 3.3.18 and 3.6.11, in a bridged LXC container
set-up. All Gentoo (hosts and containers).


> Both have the IPVS software enabled, A is acting as the active load
> balancer, B as a backup.
> Both servers act as real servers also.
>

Same here. The set-ups where we have pure load balancers do not exhibit
this problem at all.


> At some point, there is an incoming TCP connection from IPpair
> (address:port) I.
> The load balancer A decides to process it locally. Connection is
> established, and the balancer status is distributed to server B via
> syncing broadcast.
>
> The client closes connection, and again the status is updated on B via
> the broadcast - the connection is now in the "TCP_WAIT" state.
>
> Pretty soon (within 10 seconds) the client opens the new TCP connection
> using the same IP pair I.
> It is not a good TCP practice, but nevertheless, some clients work this
> way.
>
> This time the load balancer A decides that the connection is to be
> handled on the server B (persistence is switched off).
> The SYN packet is relayed to the server B, which finds an existing
> routing record for that pair I.
> And that record (in the CLOSE state) - points to the server A, and the
> SYN packet is relayed there.
>
> The server A processes it again, directs it to the server B again, and
> the loop spirals, since the server B does not have the new connection
> table element I synced.
>

Incredibly interesting information. Have you tweaked any TCP settings on
the servers at all (in desperation perhaps, settings that are now forgotten
but still active)?

Based on your description, my idea would be that a pair of timeouts clash -
i.e. that IPVS forgets a connection that the routing layer keeps hold of -
and that maybe these timeouts need to be in sync (or it needs to be ensured
that IPVS keeps a connection longer than the routing layer). To be honest,
I have no idea what I'm talking about here, though.

We'd be happy to share information and findings so that we can get rid of
this problem - it is annoying at best.

Best regards
Jan Frydenbo-Bruvoll
Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Hi

On Tue, 2013-02-05 at 20:45 +0400, Dmitry Akindinov wrote:
> We have met a quite troublesome situation which causes an internal SYN
> storm.

I documented this some time ago on this list, but will offer the
solution that I came up with at the time - which coincidentally I gave
to you as a possible solution to a separate problem you were having last
year :)

As you say, in a system where there is a multi-director setup (with or
without connection table synchronisation) it is possible for a packet to
hit one director and then "ping-pong" between two (or more) directors
causing a network storm.

My solution to this was to use the iptables MARK module to apply an
fwmark value to incoming traffic on the directors which is NOT from the
MAC address of the other director(s) in the system, and then set up the
LVS using the ipvsadm -f parameter to match those packets.

This way the incoming packets from the upstream router are marked, but
those being sent from the other director are not. In turn, those from
the upstream router are then handled using LVS; those from the other
director are not.

It may not be terribly elegant, and it may not scale easily across more
than three directors - but it does work.

http://archive.linuxvirtualserver.org/html/lvs-users/2012-08/msg00014.html
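
For illustration, a hedged sketch of that arrangement on one director (the
MAC address, mark value and service details are placeholders, not Graeme's
actual rules):

  # mark only traffic that did NOT arrive from the other director's NIC
  iptables -t mangle -A PREROUTING -p tcp --dport 80 \
           -m mac ! --mac-source 00:11:22:33:44:55 \
           -j MARK --set-mark 1

  # build the virtual service on the fwmark instead of on VIP:port
  ipvsadm -A -f 1 -s rr
  ipvsadm -a -f 1 -r 10.0.0.1:0 -g -w 100   # this director as real server
  ipvsadm -a -f 1 -r 10.0.0.2:0 -g -w 100   # the other director

Packets relayed by the other director arrive with its source MAC, stay
unmarked, never match the fwmark service, and are therefore handled by the
local real server - which is what breaks the loop.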

Graeme


Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Dear Graeme,

On 6 February 2013 10:34, Graeme Fowler <graeme@graemef.net> wrote:

> My solution to this was to use the iptables MARK module to apply an
> fwmark value to incoming traffic on the directors which is NOT from the
> MAC address of the other director(s) in the system, and then setup the
> LVS using the ipvsadm -f parameter to match those packets.
>
> This way the incoming packets from the upstream router are marked, but
> those being sent from the other director are not. In turn, those from
> the upstream router are then handled using LVS; those from the other
> director are not.
>

We have this in place already, and in our case it does not work. It seems
we have spurious packets somewhere in the system that trigger the packet
flood. Note - the flood does not escalate - it just keeps bouncing the same
packet back and forth, and at some stage that ping-ponging also stops.

I am wondering whether in our case this is related to the bridge set-up;
however, I have not been able to find out how to track this down yet.

Best regards
Jan
Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Hello,

On Tue, 5 Feb 2013, Dmitry Akindinov wrote:

> Hello,
>
> We have met a quite troublesome situation which causes an internal SYN
> storm.
>
> The simplified version of the configuration consists of 2 servers - A
> and B, both running Linux kernel 3.7.4-20.
>
> Both have the IPVS software enabled, A is acting as the active load
> balancer, B as a backup.
> Both servers act as real servers also.
>
> At some point, there is an incoming TCP connection from IPpair
> (address:port) I.
> The load balancer A decides to process it locally. Connection is
> established, and the balancer status is distributed to server B via
> syncing broadcast.
>
> The client closes connection, and again the status is updated on B via
> the broadcast - the connection is now in the "TCP_WAIT" state.
>
> Pretty soon (within 10 seconds) the client opens the new TCP connection
> using the same IP pair I.
> It is not a good TCP practice, but nevertheless, some clients work this way.
>
> This time the load balancer A decides that the connection is to be
> handled on the server B (persistence is switched off).

If the connection still exists in balancer A, it is
not going to select a new real server. Perhaps only if
expire_nodest_conn is set and the current real server
becomes unavailable can a new real server be selected
for the next packets (the 2nd SYN).
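
For reference, expire_nodest_conn is a run-time IPVS sysctl, e.g.:

  sysctl -w net.ipv4.vs.expire_nodest_conn=1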

> The SYN packet is relayed to the server B, which finds an existing
> routing record for that pair I.
> And that record (in the CLOSE state) - points to the server A, and the
> SYN packet is relayed there.
>
> The server A processes it again, directs it to the server B again, and
> the loop spirals, since the server B does not have the new connection
> table element I synced.

More likely the SYN comes shortly after the conn
in server A has expired but the synced conn in server B
has not expired yet. This can happen often because
the sync protocol is not perfect; conns in the backup
tend to expire later.

> We can send packet dumps illustrating the problem.
>
> If our analysis is correct, what are the available workarounds?

I see that we discussed this problem in August 2012.
I assume this is the DR method and all IPVS rules are present
on the backup? Are you using the old "sync_threshold" algorithm
or the new one with sync_refresh_period=10?

> a) we can always use "persistent" option with time larger than CLOSE
> (TIME_WAIT?) state time.

Maybe the problem will just move from the normal conns
to the persistent conn templates. The simplest solution is
to use:

	if (ipvs->sync_state & IP_VS_STATE_BACKUP)
		return NF_ACCEPT;

in all hooks. This will stop all traffic.
The problem is that we do not know if the backup
function is used for only part of the virtual services;
other virtual services can be in normal mode,
possibly using IP_VS_STATE_MASTER. I assume that
was the reason for allowing the master and backup functions
to run together, with different sync_id values.
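
For context, running both functions side by side is configured from user
space roughly like this (the syncids and interface are only illustrative):

  ipvsadm --start-daemon master --mcast-interface eth0 --syncid 1
  ipvsadm --start-daemon backup --mcast-interface eth0 --syncid 2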

I'm not sure what is more appropriate
to apply here; perhaps some sysctl var that will stop
the forwarding mode while the backup function is
enabled. This solution will work better if we
don't want to change the tools that manage IPVS
rules, and it is easier to implement, e.g.
"backup_only=1" to activate such a mode. The
result would be:

	if (ipvs->sync_state & IP_VS_STATE_BACKUP &&
	    ipvs->backup_only)
		return NF_ACCEPT;	/* Try local server */

Another solution would be to add an
optional syncid attribute to the virtual
server. This way we would know whether a received
packet matches the backup_syncid (we are being
used as a real server by a director) or whether it is
directed from a client to our virtual server. In the
second case, if the packet matches master_syncid (as an
additional check to the IP_VS_STATE_MASTER flag), a sync
message would be sent.

> b) on server B we can remove the iptables rules that mark incoming
> packets with the fwmark that IPVS uses.
> We can insert those iptable rule(s) only when the server B becomes the
> main load balancer. But will it stop IPVS from running all incoming
> packets via its (synced) connections table?

What is the case now, do you have IPVS rules
on server B while the backup function is enabled?

> --
> Best regards,
> Dmitry Akindinov

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Hello,

On 2013-02-07 01:13, Julian Anastasov wrote:
>
> Hello,
>
> On Tue, 5 Feb 2013, Dmitry Akindinov wrote:
>
>> Hello,
>>
>> We have met a quite troublesome situation which causes an internal SYN
>> storm.
>>
>> The simplified version of the configuration consists of 2 servers - A
>> and B, both running Linux kernel 3.7.4-20.
>>
>> Both have the IPVS software enabled, A is acting as the active load
>> balancer, B as a backup.
>> Both servers act as real servers also.
>>
>> At some point, there is an incoming TCP connection from IPpair
>> (address:port) I.
>> The load balancer A decides to process it locally. Connection is
>> established, and the balancer status is distributed to server B via
>> syncing broadcast.
>>
>> The client closes connection, and again the status is updated on B via
>> the broadcast - the connection is now in the "TCP_WAIT" state.
>>
>> Pretty soon (within 10 seconds) the client opens the new TCP connection
>> using the same IP pair I.
>> It is not a good TCP practice, but nevertheless, some clients work this way.
>>
>> This time the load balancer A decides that the connection is to be
>> handled on the server B (persistence is switched off).
>
> If connection still exists in balancer A it is
> not going to select new real server. May be only if
> expire_nodest_conn is set and when current real server
> becomes unavailable a new real server can be selected
> for next packets (2nd SYN).
>
>> The SYN packet is relayed to the server B, which finds an existing
>> routing record for that pair I.
>> And that record (in the CLOSE state) - points to the server A, and the
>> SYN packet is relayed there.
>>
>> The server A processes it again, directs it to the server B again, and
>> the loop spirals, since the server B does not have the new connection
>> table element I synced.
>
> More likely the SYN comes short after the conn
> in server A is expired but the synced conn in server B
> is not expired yet. This can happen often because
> the sync protocol is not perfect, conns in backup
> tend to expire later.

We tried to run with persistence, but this made the situation even worse:
after successfully accepting a connection from the client IP, a new
connection (now from a different source port) should be handled by the
same server (the master), but its SYN packet is redirected...

>> We can send packet dumps illustrating the problem.
>>
>> If our analysis is correct, what are the available workarounds?
>
> I see that we discussed this problem August 2012.
> I assume this is DR method and all IPVS rules are present
> in backup? Are you using the old "sync_threshold" algorithm
> or the new one with sync_refresh_period=10?

Yes, we use the DR method and rules are set up on all members. We use as
few configuration options as possible, e.g.:

[root@fm1 ~]# ipvsadm-save
-A -f 1000 -s rr -O
-a -f 1000 -r fm1.service.com:0 -g -w 75
-a -f 1000 -r fm2.service.com:0 -g -w 100
-A -f 1061 -s wlc
-a -f 1061 -r fm1.service.com:0 -g -w 1
-A -f 1062 -s wlc
-a -f 1062 -r fm2.service.com:0 -g -w 1

and the daemons are started without any additional parameters:

ipvsadm --start-daemon [backup|master] --mcast-interface $IPINTF --syncid 0
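
The mangle rules that set those fwmarks are not shown above; they would
typically look something like this (the addresses and protocol match are
only an assumption, not the actual production rules):

  iptables -t mangle -A PREROUTING -d 192.0.2.100 -p tcp -j MARK --set-mark 1000
  iptables -t mangle -A PREROUTING -d 192.0.2.101 -p tcp -j MARK --set-mark 1061
  iptables -t mangle -A PREROUTING -d 192.0.2.102 -p tcp -j MARK --set-mark 1062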

>> a) we can always use "persistent" option with time larger than CLOSE
>> (TIME_WAIT?) state time.
>
> May be the problem will move from the normal conns
> to the persistent conn templates.

Just starting the daemons with the -p flag made things worse: now any new
connection from a client IP for which a live connection already exists
would start a SYN storm.

> The simplest solution is
> to use:
>
> if (ipvs->sync_state & IP_VS_STATE_BACKUP)
> return NF_ACCEPT;
>
> in all hooks. This will stop all traffic.

Currently we use no hooks.

> The problem is that we do not know if the backup
> function is used for part of the virtual services,
> other virtual services can be in normal mode,
> possibly using IP_VS_STATE_MASTER. I assume that
> was the reason the master and backup functions to
> be able to run together, with different sync_id.
>
> I'm not sure what is more appropriate
> to apply here, some sysctl var that will stop
> the forwarding mode while backup function is
> enabled. This solution will work better if we
> don't want to change the tools that manage IPVS
> rules and it is easier to implement, eg.
> "backup_only=1" to activate such mode. The
> result would be:
>
> if (ipvs->sync_state & IP_VS_STATE_BACKUP &&
> ipvs->backup_only)
> return NF_ACCEPT; /* Try local server */
>
> Another solution would be to add
> optional syncid attribute to the virtual
> server. By this way we will know that received
> packet matches the backup_syncid (we are
> used as real server from director) or else it is
> directed from client to our virtual server. In the
> second case if the packet matches master_syncid (as
> additional check to the IP_VS_STATE_MASTER flag) a sync
> message would be sent.

Hmm. Do you mean the workaround is to customize the kernel?
We would rather avoid that...

>> b) on server B we can remove the iptables rules that mark incoming
>> packets with the fwmark that IPVS uses.
>> We can insert those iptable rule(s) only when the server B becomes the
>> main load balancer. But will it stop IPVS from running all incoming
>> packets via its (synced) connections table?
>
> What is the case now, do you have IPVS rules
> on server B while the backup function is enabled?

Yes.

--
Best regards,
Dmitry Akindinov



Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Hello,

I'd like to return to the discussion (some excerpts are below) to get
some ideas on how we can plan development of our product with regard to
IPVS development.

Can we expect the synchronization protocol to change to avoid this
SYN spiraling?

Or, regarding the suggested changes to the kernel - I doubt we can afford
to maintain a custom version of the kernel for our customers -
so the question is whether these changes are planned for a commit to
the main development branch of the kernel? The idea of a kernel
variable, so that the behavior can be controlled at run time, looks good.

Thank you.

On 2013-02-07 01:13, Julian Anastasov wrote:
>
> Hello,
>
> On Tue, 5 Feb 2013, Dmitry Akindinov wrote:
>
>> Hello,
>>
>> We have met a quite troublesome situation which causes an internal SYN
>> storm.
>>

[]

>> The server A processes it again, directs it to the server B again, and
>> the loop spirals, since the server B does not have the new connection
>> table element I synced.
>
> More likely the SYN comes short after the conn
> in server A is expired but the synced conn in server B
> is not expired yet. This can happen often because
> the sync protocol is not perfect, conns in backup
> tend to expire later.
>
[]

> May be the problem will move from the normal conns
> to the persistent conn templates. The simplest solution is
> to use:
>
> if (ipvs->sync_state & IP_VS_STATE_BACKUP)
> return NF_ACCEPT;
>
> in all hooks. This will stop all traffic.
> The problem is that we do not know if the backup
> function is used for part of the virtual services,
> other virtual services can be in normal mode,
> possibly using IP_VS_STATE_MASTER. I assume that
> was the reason the master and backup functions to
> be able to run together, with different sync_id.
>
> I'm not sure what is more appropriate
> to apply here, some sysctl var that will stop
> the forwarding mode while backup function is
> enabled. This solution will work better if we
> don't want to change the tools that manage IPVS
> rules and it is easier to implement, eg.
> "backup_only=1" to activate such mode. The
> result would be:
>
> if (ipvs->sync_state & IP_VS_STATE_BACKUP &&
> ipvs->backup_only)
> return NF_ACCEPT; /* Try local server */
>
> Another solution would be to add
> optional syncid attribute to the virtual
> server. By this way we will know that received
> packet matches the backup_syncid (we are
> used as real server from director) or else it is
> directed from client to our virtual server. In the
> second case if the packet matches master_syncid (as
> additional check to the IP_VS_STATE_MASTER flag) a sync
> message would be sent.

[]

--
Best regards,
Dmitry Akindinov



Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Hello,

On Mon, 18 Feb 2013, Dmitry Akindinov wrote:

> Hello,
>
> I'd like to return to the discussion (some excerpts are below) to get some
> ideas on how we can plan development of our product in regards to IPVS
> development.
>
> Can we expect that the synchronization protocol changes to avoid this SYNC
> spiraling?
>
> Or, the suggested changes to the kernel - I doubt we can afford ourselves
> maintaining a custom version of the kernel for our customers, - so the
> question is whether these changes are planned for a commit to the main
> development branch of the kernel? The idea with the kernel variable so a
> control over the behavior is possible in run-time looks good.

That is what I have implemented. I hope you will
find a way to test it, and after your feedback and comments
from others we will post this patch for kernel inclusion.
This is the link to the patch; I'll post it in a separate
thread too:

http://www.ssi.bg/~ja/tmp/0001-ipvs-add-backup_only-flag-to-avoid-loops.txt
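
Assuming the patch exposes the flag under the usual IPVS sysctl tree (the
name below is taken from the patch title, so check the patch itself), a
backup director would enable it with:

  sysctl -w net.ipv4.vs.backup_only=1
  # equivalently: echo 1 > /proc/sys/net/ipv4/vs/backup_only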

> Thank you.
>
> On 2013-02-07 01:13, Julian Anastasov wrote:
> >
> > Hello,
> >
> > On Tue, 5 Feb 2013, Dmitry Akindinov wrote:
> >
> > > Hello,
> > >
> > > We have met a quite troublesome situation which causes an internal SYN
> > > storm.
> > >
>
> []
>
> > > The server A processes it again, directs it to the server B again, and
> > > the loop spirals, since the server B does not have the new connection
> > > table element I synced.
> >
> > More likely the SYN comes short after the conn
> > in server A is expired but the synced conn in server B
> > is not expired yet. This can happen often because
> > the sync protocol is not perfect, conns in backup
> > tend to expire later.
> >
> []
>
> > May be the problem will move from the normal conns
> > to the persistent conn templates. The simplest solution is
> > to use:
> >
> > if (ipvs->sync_state & IP_VS_STATE_BACKUP)
> > return NF_ACCEPT;
> >
> > in all hooks. This will stop all traffic.
> > The problem is that we do not know if the backup
> > function is used for part of the virtual services,
> > other virtual services can be in normal mode,
> > possibly using IP_VS_STATE_MASTER. I assume that
> > was the reason the master and backup functions to
> > be able to run together, with different sync_id.
> >
> > I'm not sure what is more appropriate
> > to apply here, some sysctl var that will stop
> > the forwarding mode while backup function is
> > enabled. This solution will work better if we
> > don't want to change the tools that manage IPVS
> > rules and it is easier to implement, eg.
> > "backup_only=1" to activate such mode. The
> > result would be:
> >
> > if (ipvs->sync_state & IP_VS_STATE_BACKUP &&
> > ipvs->backup_only)
> > return NF_ACCEPT; /* Try local server */
> >
> > Another solution would be to add
> > optional syncid attribute to the virtual
> > server. By this way we will know that received
> > packet matches the backup_syncid (we are
> > used as real server from director) or else it is
> > directed from client to our virtual server. In the
> > second case if the packet matches master_syncid (as
> > additional check to the IP_VS_STATE_MASTER flag) a sync
> > message would be sent.
>
> []
>
> --
> Best regards,
> Dmitry Akindinov

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Hello,

On 2013-02-21 13:10, Julian Anastasov wrote:
>
> Hello,
>
> On Mon, 18 Feb 2013, Dmitry Akindinov wrote:
>
>> Hello,
>>
>> I'd like to return to the discussion (some excerpts are below) to get some
>> ideas on how we can plan development of our product in regards to IPVS
>> development.
>>
>> Can we expect that the synchronization protocol changes to avoid this SYNC
>> spiraling?
>>
>> Or, the suggested changes to the kernel - I doubt we can afford ourselves
>> maintaining a custom version of the kernel for our customers, - so the
>> question is whether these changes are planned for a commit to the main
>> development branch of the kernel? The idea with the kernel variable so a
>> control over the behavior is possible in run-time looks good.
>
> That is what I have implemented. I hope you will
> find a way to test it and after your feedback and comments
> from others we will post this patch for kernel inclusion.
> This is the link of the patch, I'll post it in separate
> thread too:
>
> http://www.ssi.bg/~ja/tmp/0001-ipvs-add-backup_only-flag-to-avoid-loops.txt

Thank you very much!

It's been about ten days since we started using kernel 3.8.0-28
(now 3.8.1-30) with your patch applied on our servers. So far, there are no
signs of SYN spiraling or other problems with using IPVS balancer nodes
on our multiprotocol real servers. It would be very nice to have this
feature in the standard kernels.

Thank you!

>> Thank you.
>>
>> On 2013-02-07 01:13, Julian Anastasov wrote:
>>>
>>> Hello,
>>>
>>> On Tue, 5 Feb 2013, Dmitry Akindinov wrote:
>>>
>>>> Hello,
>>>>
>>>> We have met a quite troublesome situation which causes an internal SYN
>>>> storm.
>>>>
>>
>> []
>>
>>>> The server A processes it again, directs it to the server B again, and
>>>> the loop spirals, since the server B does not have the new connection
>>>> table element I synced.
>>>
>>> More likely the SYN comes short after the conn
>>> in server A is expired but the synced conn in server B
>>> is not expired yet. This can happen often because
>>> the sync protocol is not perfect, conns in backup
>>> tend to expire later.
>>>
>> []
>>
>>> May be the problem will move from the normal conns
>>> to the persistent conn templates. The simplest solution is
>>> to use:
>>>
>>> if (ipvs->sync_state & IP_VS_STATE_BACKUP)
>>> return NF_ACCEPT;
>>>
>>> in all hooks. This will stop all traffic.
>>> The problem is that we do not know if the backup
>>> function is used for part of the virtual services,
>>> other virtual services can be in normal mode,
>>> possibly using IP_VS_STATE_MASTER. I assume that
>>> was the reason the master and backup functions to
>>> be able to run together, with different sync_id.
>>>
>>> I'm not sure what is more appropriate
>>> to apply here, some sysctl var that will stop
>>> the forwarding mode while backup function is
>>> enabled. This solution will work better if we
>>> don't want to change the tools that manage IPVS
>>> rules and it is easier to implement, eg.
>>> "backup_only=1" to activate such mode. The
>>> result would be:
>>>
>>> if (ipvs->sync_state & IP_VS_STATE_BACKUP &&
>>> ipvs->backup_only)
>>> return NF_ACCEPT; /* Try local server */
>>>
>>> Another solution would be to add
>>> optional syncid attribute to the virtual
>>> server. By this way we will know that received
>>> packet matches the backup_syncid (we are
>>> used as real server from director) or else it is
>>> directed from client to our virtual server. In the
>>> second case if the packet matches master_syncid (as
>>> additional check to the IP_VS_STATE_MASTER flag) a sync
>>> message would be sent.
>>
>> []
>>
>> --
>> Best regards,
>> Dmitry Akindinov
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
>

--
Best regards,
Dmitry Akindinov



Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Hello,

On Mon, 4 Mar 2013, Dmitry Akindinov wrote:

> > http://www.ssi.bg/~ja/tmp/0001-ipvs-add-backup_only-flag-to-avoid-loops.txt
>
> Thank you very much!
>
> It's about ten days since we started to use on our servers kernel 3.8.0-28
> (now 3.8.1-30) with your patch applied. So far, there are no signs of SYN
> spiraling or other problems with using ipvs balancer nodes on our
> multiprotocol real servers. It would be ver nice to have this feature in the
> standard kernels.

Yes, I'll do it in the next few days.

So, can I add your Tested-by to the commit message?
And I assume your tests were with backup_only=1, right?

> --
> Best regards,
> Dmitry Akindinov

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Hello,

On 2013-03-05 00:39, Julian Anastasov wrote:
>
> Hello,
>
> On Mon, 4 Mar 2013, Dmitry Akindinov wrote:
>
>>> http://www.ssi.bg/~ja/tmp/0001-ipvs-add-backup_only-flag-to-avoid-loops.txt
>>
>> Thank you very much!
>>
>> It's about ten days since we started to use on our servers kernel 3.8.0-28
>> (now 3.8.1-30) with your patch applied. So far, there are no signs of SYN
>> spiraling or other problems with using ipvs balancer nodes on our
>> multiprotocol real servers. It would be ver nice to have this feature in the
>> standard kernels.
>
> Yes, I'll do it in the next days.
>
> So, can I add your Tested-by in commit message?

It would be more appropriate to credit German Myzovsky <lawyer@sipnet.ru>
as the tester. I'm just the coordinator.

> And I assume your tests were with backup_only=1, right?

Yes, slave balancers set this flag to 1. In the case of a cluster
fail-over the newly elected master balancer clears that flag.
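
A minimal sketch of that fail-over handling, assuming the
net.ipv4.vs.backup_only sysctl from the patch (daemon management details
will differ per setup):

  # on every backup/slave director
  sysctl -w net.ipv4.vs.backup_only=1

  # on the node promoted to master during fail-over
  sysctl -w net.ipv4.vs.backup_only=0
  ipvsadm --stop-daemon backup
  ipvsadm --start-daemon master --mcast-interface eth0 --syncid 0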

>> --
>> Best regards,
>> Dmitry Akindinov
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
>

--
Best regards,
Dmitry Akindinov



Re: [lvs-users] SYN spiraling between master and slave IPVS balancers
Hello,

On Tue, 5 Mar 2013, Dmitry Akindinov wrote:

> On 2013-03-05 00:39, Julian Anastasov wrote:
> >
> > Hello,
> >
> > On Mon, 4 Mar 2013, Dmitry Akindinov wrote:
> >
> > > > http://www.ssi.bg/~ja/tmp/0001-ipvs-add-backup_only-flag-to-avoid-loops.txt
> > >
> > > Thank you very much!
> > >
> > > It's about ten days since we started to use on our servers kernel 3.8.0-28
> > > (now 3.8.1-30) with your patch applied. So far, there are no signs of SYN
> > > spiraling or other problems with using ipvs balancer nodes on our
> > > multiprotocol real servers. It would be ver nice to have this feature in
> > > the
> > > standard kernels.
> >
> > Yes, I'll do it in the next days.
> >
> > So, can I add your Tested-by in commit message?
>
> It would be more right to add German Myzovsky <lawyer@sipnet.ru> as the
> tester. I'm just the coordinator.

ok

> > And I assume your tests were with backup_only=1, right?
>
> Yes, slave balancers set this flag to 1. In the case of a cluster fail-over
> the newly elected master balancer clears that flag.

Very good. Note that this flag is ignored if
the backup function is stopped, so there is no need
to clear the flag when switching to master mode.

Regards

--
Julian Anastasov <ja@ssi.bg>
