
[lvs-users] LVS & Question on Implementing an Active/Active Load Balancer
I am looking to build an active/active load balancer for a client, supporting ~5000 machines.

Previously we looked at implementing this solution using pacemaker/corosync/haproxy on RHEL 6.4, but that solution does not support syslog on 514/UDP.

Looking at LVS, there is support for UDP as well as for maintaining TCP state, which is the other major requirement.

My question is: since we are only load balancing UDP/TCP rsyslog and HTTPS (posting to a single URL), would there be a requirement to install keepalived on top of LVS? From my point of view it appears to add another layer of complexity for no gain that I can see.

I have been searching for a configuration for an active/active LVS solution, and while there are people stating that they are using one, I haven't been able to find any information on how to configure an active/active load balancer.

--

Barry

Banpen Fugyou - 10,000 Changes, No surprises


Re: [lvs-users] LVS & Question on Implementing an Active/Active Load Balancer
On 15.02.2015, Barry Haycock wrote:
> I am looking to build an active/active load balancer for a client, supporting ~5000 machines.

Your question isn't something I'd expect to find in any FAQ; it is more of an advanced configuration, and it does raise multiple concerns.

Active/Active: is there a strong reason for doing so?

Why I'm asking: active/active can result in oversubscription, and one needs to keep that very carefully in mind, along with various other parameters.

Otherwise, you may be running two load balancers at 45% of capacity each, and
discover in the very second one of the two balancers fails that the combined
load of 90% on the survivor exceeds the typical 70% threshold where things
start stalling.

Regarding 5000 machines: unless you're looking into tunneling (which does have its own merits), IPVS/LVS requires all of your load balancers and real servers to be on the same layer 2 network(s).
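For illustration, the forwarding method is selected per real server in ipvsadm, so this is where the tunneling decision shows up. A minimal sketch (the VIP 192.0.2.10 and real server 10.0.0.11 are made-up example addresses):

    # UDP virtual service for syslog, round-robin scheduling
    ipvsadm -A -u 192.0.2.10:514 -s rr

    # -g (direct routing): real server must share an L2 segment with the balancer
    ipvsadm -a -u 192.0.2.10:514 -r 10.0.0.11:514 -g

    # -i (IPIP tunneling): real server may sit in a different L2/L3 network
    #ipvsadm -a -u 192.0.2.10:514 -r 10.0.0.11:514 -i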

A single broken or misbehaving network card can flood the whole L2 network. Such floods may be hard to track down, but will surely result in downtime for your entire server farm - that's just one of the many reasons why just about every network engineer tries to stay well below thousands of servers per layer 2 domain. One can attempt to work around this by attaching multiple VLANs to the same load balancer and distributing the real servers across those VLANs, but that doesn't really decrease complexity.

Maybe your network design needs some more thought, or maybe I'm just worrying because of missing information.

IPVS/LVS "out of the box" (default kernel configuration) uses a connection
table size of CONFIG_IP_VS_TAB_BITS=12, which equals 4096 records; with 5000 machines, that's most likely something you'll want to tune (the help text for CONFIG_IP_VS_TAB_BITS tells you how).
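A minimal sketch of that tuning, assuming ip_vs is built as a module (the value 20 is just an example; the valid range was 8..20 at the time):

    # /etc/modprobe.d/ip_vs.conf
    options ip_vs conn_tab_bits 20

    # after reloading the module, the table size shows up in the ipvsadm header:
    ipvsadm -L | head -1
    # IP Virtual Server version 1.2.1 (size=1048576)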

However, I wouldn't want to have more than e.g. 50 machines behind a single load balancer pair: typically, the traffic and bandwidth involved saturate the balancer's capacity first. Technically, I'd expect IPVS to handle hundreds, possibly thousands of real servers; I just can't come up with a box able to handle the resulting traffic.

> Previously we looked at implementing this solution using pacemaker/corosync/haproxy on RHEL 6.4, but that solution does not support syslog on 514/UDP.

UDP-based protocols assume that traffic may be dropped silently, or that the application ensures retransmission. syslog doesn't care about retransmits, and so effectively chooses to have its traffic dropped. If you're logging remotely for auditing, intrusion detection or anything else that cares about the logged information, you probably want to avoid UDP-based remote syslog entirely.

Just about any syslog implementation (rsyslog, syslog-ng, ...) updated during
the past decade implements syslog on arbitrary TCP ports, which may save you
from this hassle. rsyslog also offers its own protocol (RELP) for reliable
transmission.
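For illustration, a minimal rsyslog sketch (the hostname and port 10514 are made-up examples); '@@' selects TCP forwarding, a single '@' would be UDP:

    # client side, rsyslog.conf: forward everything via TCP
    *.* @@loghost.example.com:10514

    # server side, rsyslog.conf (legacy directive style, as shipped with RHEL 6):
    $ModLoad imtcp
    $InputTCPServerRun 10514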

> Looking at LVS, there is support for UDP as well as for maintaining TCP state, which is the other major requirement.
>
> My question is: since we are only load balancing UDP/TCP rsyslog and HTTPS (posting to a single URL), would there be a requirement to install keepalived on top of LVS? From my point of view it appears to add another layer of complexity for no gain that I can see.

keepalived configures IPVS, the same kernel component used by LVS;
so it's not another layer of complexity, it's exactly the same layer,
you're just using different tools, depending on what you're expecting :)
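For illustration, the same virtual service expressed both ways; the addresses and the check script are made-up examples, not a recommendation:

    # plain ipvsadm, as in the classic LVS HOWTOs:
    ipvsadm -A -u 192.0.2.10:514 -s rr
    ipvsadm -a -u 192.0.2.10:514 -r 10.0.0.11:514 -g

    # the equivalent keepalived.conf stanza, with a health check on top:
    virtual_server 192.0.2.10 514 {
        protocol UDP
        lb_algo rr
        lb_kind DR
        real_server 10.0.0.11 514 {
            MISC_CHECK {
                misc_path "/usr/local/bin/check_syslog.sh"  # hypothetical check script
                misc_timeout 5
            }
        }
    }

The health checking and the automatic failover via VRRP are exactly the gain keepalived buys you over managing IPVS by hand with ipvsadm.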


> I have been searching for a configuration for an active/active LVS solution, and while there are people stating that they are using one, I haven't been able to find any information on how to configure an active/active load balancer.



There are multiple approaches regarding "active/active".

keepalived's high availability concept relies on VRRP, which at first glance is a pure active/passive technology (there's just one active master, and everyone else is a passive backup). However, you can turn this into active/active:

- your service uses multiple IP addresses:
  load balancer A handles IP X,
  load balancer B handles IP Y;
  if either of them fails, the other one takes over the failed IP as well.

This behaviour should be achievable by using multiple VRRP instances
(optionally tied together in sync groups) with opposite states per balancer:
instance C is MASTER on node A and BACKUP on node B, while instance D is
BACKUP on node A and MASTER on node B (a configuration sketch follows after
the DNS records below).

Of course, your DNS record needs to show both IPs.
service.example.tld. IN A X
service.example.tld. IN A Y
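A minimal keepalived.conf sketch of that idea for node A (node B would be the mirror image, with states and priorities swapped); the interface name, password, router IDs and addresses are made-up examples:

    vrrp_instance VI_X {           # carries service IP X
        state MASTER
        interface eth0
        virtual_router_id 51
        priority 150
        authentication {
            auth_type PASS
            auth_pass secret42
        }
        virtual_ipaddress {
            192.0.2.10/24
        }
    }

    vrrp_instance VI_Y {           # carries service IP Y
        state BACKUP
        interface eth0
        virtual_router_id 52
        priority 100
        authentication {
            auth_type PASS
            auth_pass secret42
        }
        virtual_ipaddress {
            192.0.2.11/24
        }
    }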

Downside: your traffic may not be split very evenly. Over time and with
many different clients it roughly averages out to an even split (50:50),
but at any given moment it may be 45:55 or even 40:60. It also depends on
client behaviour, so this may not be exactly what you're asking for.

- you're using some other way than ARP to attract incoming network traffic,
e.g. by using specific routing protocols (like BGP, OSPF, ...).

Such routing protocols can be used to perform equal-cost multipathing
on the network layer: your load balancers announce the balanced IPs
to your network router. The router in turn inspects e.g. the source IP
address of every incoming packet, and distributes the "odd" source
addresses to one load balancer node and the "even" ones to a different
node. So from some perspective, this introduces another layer of
"load balancing". Depending on the equipment and configuration used,
you could have e.g. 16 load balancer nodes announcing the same IP address
to your network, all of them active at the same time.
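A minimal sketch of the announcement side, using ExaBGP as one possible route injector (the ASNs and addresses are made-up examples, and the router itself still needs ECMP enabled):

    # exabgp.conf on each load balancer node
    neighbor 192.0.2.1 {               # the upstream router
        router-id 192.0.2.10;          # this balancer node
        local-address 192.0.2.10;
        local-as 64512;
        peer-as 64512;

        static {
            # announce the balanced service IP as a /32, next hop ourselves
            route 198.51.100.80/32 next-hop self;
        }
    }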

http://sysadvent.blogspot.com/2014/12/day-11-turning-off-pacemaker-load.html
has a nice introduction to this topic.



Anders
--
1&1 Internet AG Expert Systems Architect (IT Operations)
Brauerstrasse 50 v://49.721.91374.0
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstand: Frank Einhellinger, Robert Hoffmann, Markus Huhn,
Hans-Henning Kettler, Uwe Lamnek
Aufsichtsratsvorsitzender: Michael Scheeren
