Mailing List Archive: [lvs-users] ldirectord does not transfer connections when a real server dies

[lvs-users] ldirectord does not transfer connections when a real server dies

Apr 30, 2013, 2:30 AM

Post #1 of 6 (2212 views)

Hello LVS users,

I am using ldirectord to load balance two IIS servers. The
ldirectord.cglooks like this:

autoreload = yes
quiescent = yes
checkinterval = 1
negotiatetimeout = 2
emailalertfreq = 60
emailalert = Konstantin.Boyanov@mysite.com
failurecount = 1

virtual = 172.22.9.100:80
checktimeout = 1
checktype = negotiate
protocol = tcp
real = 172.22.1.133:80 masq 2048
real = 172.22.1.134:80 masq 2048
request = "alive.htm"
receive = "I am not a zombie"
scheduler = wrr

The load balancing is working fine, the real servers are visible etc.
Nevertheless I am encountering a problem with a simple test:

1. I open some connections from a client browser (IE 8) to the sites that
are hosted on the real servers
2. I cange the weight of the real server which server the above connections
to 0 and leave only the other real server alive
3. I reload the pages to regenerate the connections

What I am seeing with ipvsadm -Ln is that the connections are still on the
"dead" server. I have to wait up to one minute (I suppose some TCP timeout
from the browser-side) for them to transfer to the "living" server. And If
in this one minute I continue pressing the reload button the connections
stay at the "dead" server and their TCP timeout counter gets restarted.

So my question is: Is there a way to tell the load balancer in NAT mode to
terminate / redirect existing connections to a dead server *immediately*
(or close to immediately)?

It seems to me a blunder that a reload on the client-side can make a
connection become a "zombie", e.g. be bound to a dead real server although
persistance is not used and the other server is ready and available.

The only thing that I found affecting this timeout is changing the
keepAliveTimeout in the Windows machine running the IE8 which I use for the
tests. When I cahnged it from the dafault value of 60 seconds to 30 seconds
the connections could be transferred after 30 seconds. It seems to me very
odd that a client setting can affect the operation of a network component
as the load balancer.

And another thing - what is the colum named "Inactive Conenctions" in the
output from ipvsadm used for? Which connections are considered inactive?

And also in the output of ipvsadm i see a couple of connections with the
state TIME_WAIT. What are these for?

Any insight and suggestions are highly appreciated !

Cheers,
Konstantin

P.S: Here is some more information about the configuration:

# uname -a
Linux 3.0.58-0.6.2-default #1 SMP Fri Jan 25 08:31:01 UTC 2013 x86_64
x86_64 x86_64 GNU/Linux

# ipvsadm -L
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP lb-mysite.com wrr
-> spwfe001.mysite.com:h Masq 10 0 0
-> spwfe002.mysite.com:h Masq 10 0 0

# iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target prot opt source destination

Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
SNAT all -- anywhere anywhere
to:172.22.9.100
SNAT all -- anywhere anywhere
to:172.22.1.130

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
inet 127.0.0.2/8 brd 127.255.255.255 scope host secondary lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UNKNOWN qlen 1000
link/ether 00:50:56:a5:77:ae brd ff:ff:ff:ff:ff:ff
inet 192.168.8.216/22 brd 192.168.11.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UNKNOWN qlen 1000
link/ether 00:50:56:a5:77:af brd ff:ff:ff:ff:ff:ff
inet 172.22.9.100/22 brd 172.22.11.255 scope global eth1:1
inet 172.22.8.213/22 brd 172.22.11.255 scope global secondary eth1
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UNKNOWN qlen 1000
link/ether 00:50:56:a5:77:b0 brd ff:ff:ff:ff:ff:ff
inet 172.22.1.130/24 brd 172.22.1.255 scope global eth2

# cat /proc/sys/net/ipv4/ip_forward
1
# cat /proc/sys/net/ipv4/vs/conntrack
1
# cat /proc/sys/net/ipv4/vs/expire_nodest_conn
1
# cat /proc/sys/net/ipv4/vs/expire_quiescent_template
1
_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@LinuxVirtualServer.org
Send requests to lvs-users-request@LinuxVirtualServer.org
or go to http://lists.graemef.net/mailman/listinfo/lvs-users