Mailing List Archive

Julian Anastasov wrote:

>5. NAT is not the only used method. The DR and TUN methods don't allow

>the director's checks properly to check the real services: the real >service listens to the same VIP and it is hard to generate packets >in the director with daddr=VIP that will avoid the routing and will >reach the real server. They don't leave the director. What means this: >we can't check exactly the VIP:VPORT in the real service, may be only >RIP:VPORT ? This problem does not exist when the checks are performed >from the real service, for example the L4 check can be simple bind() >to VIP:VPORT. Port busy means L4 succeeds. No problems to perform >L7 checks. Sometimes httpd can listen to many virtual domains with >bind to 0.0.0.0. Why we need to perform checks for all these VIPs >when we can simply check on of them. Many, many optimizations, User >defined. > But it is nice to be able to have the ability to configure this (the VIP/RIP and PORT combination) since we don't want to assume the only configuration is multiple HTTP daemons (for example) bound to 0.0.0.0. (Even if we are local on DR or TUN server). In Apache http.conf we can specify a LISTEN port and run a separate daemon for HTTPS on port 443 for example. If this https daemon or daemons dies, or fails to start (because we have it configured to prompt for our security certificate password at startup) we wouldn't want to make assumptions about the health of the daemons listening on port 80 right? Also, Julian does your comment about FWMARK mean you think keepalived will not work with FWMARKing Directors? Many thanks to Alexandre Cassen for the great contribution... I plan to test it further in the lab ASAP. --K

Get your FREE download of MSN Explorer at http://explorer.msn.com"]http://explorer.msn.com

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Dec 25, 2000, 3:12 AM

Post #3 of 12 (2423 views)

Alexandre.Cassen at wanadoo

Hello,

On Sun, 24 Dec 2000, Lorn Kay wrote:

> >5. NAT is not the only used method. The DR and TUN methods don't allow
>
> >the director's checks properly to check the real services: the real
>
> >service listens to the same VIP and it is hard to generate packets
> >in the director with daddr=VIP that will avoid the routing and will
> >reach the real server. They don't leave the director. What means this:
> >we can't check exactly the VIP:VPORT in the real service, may be only
> >RIP:VPORT ? This problem does not exist when the checks are performed
> >from the real service, for example the L4 check can be simple bind()
> >to VIP:VPORT. Port busy means L4 succeeds. No problems to perform
> >L7 checks. Sometimes httpd can listen to many virtual domains with
> >bind to 0.0.0.0. Why we need to perform checks for all these VIPs
> >when we can simply check on of them. Many, many optimizations, User
> >defined.
> >
>
> But it is nice to be able to have the ability to configure this (the
> VIP/RIP and PORT combination) since we don't want to assume the only
> configuration is multiple HTTP daemons (for example) bound to 0.0.0.0.
> (Even if we are local on DR or TUN server).

Agreed.

> In Apache http.conf we can specify a LISTEN port and run a separate
> daemon for HTTPS on port 443 for example. If this https daemon or daemons
> dies, or fails to start (because we have it configured to prompt for our
> security certificate password at startup) we wouldn't want to make
> assumptions about the health of the daemons listening on port 80 right?

Yes, even when we have one httpd for two domains may be we want to
check different cgi or database calls with L7 HTTP checks. But the L4
check can be one, of course, configured from the user: bind to 0.0.0.0:80.

> Also, Julian does your comment about FWMARK mean you think keepalived
> will not work with FWMARKing Directors?

It will. I think, all we want to see it implemented. But such
virtual services are not setup only with LVS setsockopts, we need to define
some ipchains rules, etc. These settings can be added to the configuration
file(s): chain name (input, vip), many vproto:vip:vport, fwmark value, etc.

> Many thanks to Alexandre Cassen for the great contribution... I plan to
> test it further in the lab ASAP.
>
> --K

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Dec 25, 2000, 3:19 PM

Post #4 of 12 (2421 views)

Alexandre.Cassen at wanadoo

Hi,

Just to tell you about keepalived futur dev :

1. I will clean the entire code
2. I will clean the tcpcheck half open connection fonction to be really
full fonctionnal
3. I will add a ipchain kernel wrapper for adding support to FWMARK, NAT, ...
4. I will add HTTPS_GET fonction to check secure urls
5. More documentation and case study

good week,

Alexandre

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Dec 26, 2000, 3:11 PM

Post #5 of 12 (2418 views)

Hello,

> > the project homepage is : http://keepalived.sourceforge.net
>
> Some thoughts on this topic:
>
>1. If the checks are moved from the director to the real servers you
>can relax the CPU activity in the director. If this does not sounds as
>a problem for small clusters consider a setups with many virtual services
>and with many real servers. The director is going to waste CPU cycles
>only for checks. Not fatal for LVS (the network always has CPU cycles)
>but for the checks.

Sure, but in my mind people who run big LVS infrastructure can run the
whole solution on a director with appropriate CPU.
Big director solution are chip today. But it can weaken the network
performances it is true (multiple tests like SSL checks can act that way
too with CPU...).
So we can imagine a solution where the director solution is a cluster of
two server :
1. One server for the VS gestion using the ipvs kernel module
2. The second for performing the keepalived checks triggers. This server
will communic via socket with the first server to pass add/remove
realserver from the pool.

listen for he ipvs director.
like hearthbeat moves the ipvs director functionnality on the keepalived
server.

I am using Arrowpoint loadbalancer at work (CS50), and they perform
triggers checks like this on each loadbalancer. For administrators, i think
it is a good design to
locate the keepalived functionnality. If the CPU is not so strong, we can
also create, using LVS, a virtual server with a cluster of keepalived
server. This can be a good
design too i think.

>2. Don't copy the same work from the hardware solutions, most of them
>can't run agents in the real servers and implement checks in different
>...................
>director to set the weight based on expression from these parameters:
>one expression for FTP, another for HTTP, etc

That you discribe here is the way like BMC BEST/1 or PATROL or other
monitoring platform work. For me adding an agent on each server
multiplicate administration task and introduce security vulerabilities (i
probably mistake... :) ).

If we do not want to depend on the plateform the realserver service run we
need to centralize the check triggers to the loabalancer or a single point
check. A monitoring environnement based on a couple
of collector/monitoring console are extremly OS dependent. In a really
first realse of keepalived I had used monitoring agent based on a simple
protocole frame to communicate with a centralized monitoring tools. But my
environnement is really eterogeneous (Oracle OAS, IIS, Netscape, Apache in
the same realserver pool), so to factorise a limite the OS dependent dev I
have emplemented a design centralized to a single point using network
scanning technic to perform check.

>3. Of course, there are other ways to set the weights - when they
>are evaluated in the director. This can include decisions based on
>response times (from the L7/4/3 checks), etc. Not sure how well they
>are working. I've never implemented such risky tricks.

Yes !!! :) response time and the ability to check application performance
is a great and VERY interresting functionnality that we can add to such
daemon. We can use a dynamic structure registering statistics about each
server response time... if the response time decrease or change, we can
modify the cluster performance and made him fully dynamic on hte
applications performance. We can so define here a "weighted performance"
variable like the LVS weight. We can also use some great fairequeing
functionnality that is present in the advanced routing functionnality to
adjust ip stream using kernel call to QOS framework.... really a good think
to do here :)

>4. User defined checks: talk with the real service and analyze these
>talks

A macro language definition ... a small language to define checks and use
hardcoded primitives (tcpcheck, httpget, ...) to define action on result...

>5. NAT is not the only used method. The DR and TUN methods don't allow
>the director's checks properly to check the real services: the real
>service listens to the same VIP and it is hard to generate packets
>in the director with daddr=VIP that will avoid the routing and will
>reach the real server. They don't leave the director. What means this:
>we can't check exactly the VIP:VPORT in the real service, may be only
>RIP:VPORT ? This problem does not exist when the checks are performed
>from the real service, for example the L4 check can be simple bind()
>to VIP:VPORT. Port busy means L4 succeeds. No problems to perform
>L7 checks. Sometimes httpd can listen to many virtual domains with
>bind to 0.0.0.0. Why we need to perform checks for all these VIPs
>when we can simply check on of them. Many, many optimizations, User
>defined.
>
>6. Some utilities or ioctls can be included in the game: ip,
>ipchains or the ioctls they use. This allows complex virtual services
>to be created and to support the fwmark-based virtual services.

Yes it is in my focus : adding multiple kernel functionnality wrappers...
for fwmark, qos, ...

>7. Redundancy: keepalive probes to other directors, failover times,
>takeover decisions

> This can be a very very long discussion :)

Of course yes !!! :))) I think many interresting things .... do not give an
8. point otherwise i will not stop coding !!!! :))

regards,

Alexandre

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Dec 27, 2000, 7:38 AM

Post #6 of 12 (2418 views)

Hello,

On Tue, 26 Dec 2000, Alexandre Cassen wrote:

> > This can be a very very long discussion :)
>
> Of course yes !!! :))) I think many interresting things .... do not give an
> 8. point otherwise i will not stop coding !!!! :))

:) No problem, we have different ideas. I'm in process of
implementing my ideas, here are some old links:

http://marc.theaimsgroup.com/?l=linux-virtual-server&m=96432992117737&w=2
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=96252502113930&w=2
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=96268604328970&w=2
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=95959865123051&w=2

The main things in my TODO list:

- the director registers for resource information from the real servers

- the real servers report resource information, real service state

- the admin selects expression from load parameters for the different
virtual services and real hosts (OS differences, differences in the
hardware, etc)

- L4 and L7 checks in the real servers

- L3 and other L4 and L7 checks in another place

- alerts and notifications

- resource information for Linux, FreeBSD, Solaris, HP/UX, WinNT 4,
SCO, Unixware. I still don't have load parameters for Win2K.

- sockopt interface to LVS, etc.

- redundancy: primary and backup directors, different role for different
clusters

- export interface for external reporting tools

- transport protocols: user defined (TCP or UDP)

- libraries: link the code in other cluster tools

- console to manage the clusters, other interfaces for management

These are the essentials.

I hope in the next weeks to have an usable version, at least
the agent ported on Linux. When Wensong is ready with the new CVS
repository I'm planning to put this GPL-ed work there. I'll appreciate
help on sticking this software with other programs to make it more
usable, for many admins.

> regards,
>
> Alexandre

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Jan 2, 2001, 9:49 AM

Post #7 of 12 (2414 views)

Hi,

> > In Apache http.conf we can specify a LISTEN port and run a separate
> > daemon for HTTPS on port 443 for example. If this https daemon or daemons
> > dies, or fails to start (because we have it configured to prompt for our
> > security certificate password at startup) we wouldn't want to make
> > assumptions about the health of the daemons listening on port 80 right?

IMHO you have three possibilities to overcome the INADDR_ANY bind problem:

1.) configure the application to listen to localhost as normal and to a VIP
only for healthchecking.
2.) ipchains is your friend, man! Do an ipchains -A input -j REDIRECT for
packets
coming from the DIP with destination VIP. You redirect it to the loopback
and get your response. You may even first mark the incoming packet and
redirect it accordingly.
3.) Write a user space daemon maybe even with tcpd support that listens to a
unused port and does the check locally and sends 0 if ok and 1 if nok.

pros: It's working and it's cool.
cons: the solutions are not 100% cross-compatible. f.e [1] will work on all
nodes, [2] only on unices that either have support for ipchains or ipfw
and [3] finally needs some coder and is the hardest to maintain.

> Yes, even when we have one httpd for two domains may be we want to
> check different cgi or database calls with L7 HTTP checks. But the L4
> check can be one, of course, configured from the user: bind to 0.0.0.0:80.

The healthcheck is based on the VIP and not on the RIP, so as long as we
don't have L7 support in LVS this is not an issue since for every new service
needs a new VIP.

> > Many thanks to Alexandre Cassen for the great contribution... I plan to
> > test it further in the lab ASAP.

Me too. I hope Julian and Alexandre can merge their work.

Regards,
Roberto Nibali, ratz

--
mailto: `echo NrOatSz@tPacA.cMh | sed 's/[NOSPAM]//g'`

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Jan 2, 2001, 9:55 AM

Post #8 of 12 (2422 views)

Hi Julian,

> - resource information for Linux, FreeBSD, Solaris, HP/UX, WinNT 4,
> SCO, Unixware. I still don't have load parameters for Win2K.

What about SNMP? Do they have now some kind of intelligent snmpd?

> - redundancy: primary and backup directors, different role for different
> clusters

I've been at the CCC Congress in Berlin lately and I met one of the
netfilter core developers and he actually is already working on the
problem for kernel 2.4 and iptables. I'll contact him and ask about
the status of his work. As far as I spoke to him he's also trying to
keep the iptables structures in synchronisation with a backup node
and I reckon this will be pretty much the same code for LVS.

> These are the essentials.
>
> I hope in the next weeks to have an usable version, at least
> the agent ported on Linux. When Wensong is ready with the new CVS
> repository I'm planning to put this GPL-ed work there. I'll appreciate
> help on sticking this software with other programs to make it more
> usable, for many admins.

Looking forward,
Roberto Nibali, ratz

--
mailto: `echo NrOatSz@tPacA.cMh | sed 's/[NOSPAM]//g'`

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Jan 2, 2001, 5:20 PM

Post #9 of 12 (2416 views)

Hello,

On Tue, 2 Jan 2001, ratz wrote:

> Hi,
>
> > > In Apache http.conf we can specify a LISTEN port and run a separate
> > > daemon for HTTPS on port 443 for example. If this https daemon or daemons
> > > dies, or fails to start (because we have it configured to prompt for our
> > > security certificate password at startup) we wouldn't want to make
> > > assumptions about the health of the daemons listening on port 80 right?
>
> IMHO you have three possibilities to overcome the INADDR_ANY bind problem:
>
> 1.) configure the application to listen to localhost as normal and to a VIP
> only for healthchecking.
> 2.) ipchains is your friend, man! Do an ipchains -A input -j REDIRECT for
> packets
> coming from the DIP with destination VIP. You redirect it to the loopback
> and get your response. You may even first mark the incoming packet and
> redirect it accordingly.

Hm, is there an easy way to send packets with saddr=DIP and
daddr=VIP and they to leave the director ignoring the ip rule:
0: from all lookup local

and the route from the local table:
local VIP dev eth0 proto kernel scope host src VIP

May be with VIP defined not in "local" table using
ip rule add prio 100 iif eth0 table 100
ip route add table 100 local VIP dev lo

But I'm not sure if these rules will help. May be there is no way,
such packets are always delivered locally when daddr=local_address.
At least, the replies from the real server will be treated as source
martians in the director.

Of course, there is no problem in the real server to accept
and reply to packets with daddr=VIP and saddr=DIP

> 3.) Write a user space daemon maybe even with tcpd support that listens to a
> unused port and does the check locally and sends 0 if ok and 1 if nok.

Something like this, get the weight from RS:

#! /bin/sh

jiffies=100

function cpuidle() {
sed -n "s/^cpu .* $[0-9]*$.*/\1/p" /proc/stat
}

a="`cpuidle`"
sleep 1
b="`cpuidle`"

c="$[($b-$a)*100/$jiffies]"
[ $c -gt 100 ] && c="100"

echo $c

Don't use this in production. The returned value is not correct
when the RS is loaded.

>
> pros: It's working and it's cool.
> cons: the solutions are not 100% cross-compatible. f.e [1] will work on all
> nodes, [2] only on unices that either have support for ipchains or ipfw
> and [3] finally needs some coder and is the hardest to maintain.
>
> > Yes, even when we have one httpd for two domains may be we want to
> > check different cgi or database calls with L7 HTTP checks. But the L4
> > check can be one, of course, configured from the user: bind to 0.0.0.0:80.
>
> The healthcheck is based on the VIP and not on the RIP, so as long as we
> don't have L7 support in LVS this is not an issue since for every new service
> needs a new VIP.
>
> > > Many thanks to Alexandre Cassen for the great contribution... I plan to
> > > test it further in the lab ASAP.
>
> Me too. I hope Julian and Alexandre can merge their work.

I have some things ready for CVS:

- resource information for Linux 2.2/2.4

- indepenent lvs.o module to access LVS via setsockopt, for 2.2,
soon for 2.4 too

- expression processor, needs to be extended to generate pseudo code
for faster calcs

> Regards,
> Roberto Nibali, ratz

Regards

--
Julian Anastasov <ja@ssi.bg>

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Jan 3, 2001, 2:49 AM

Post #10 of 12 (2420 views)

Hi Alexandre,

Alexandre CASSEN wrote:
> >What about SNMP? Do they have now some kind of intelligent snmpd?
>
> SNMP is pletty of security hole. Why not create a simple tcp protocole
> that integrate
> security issues ? (using SSL to push to the director the info) We can
> use an abstract
> similar to the ASN.1.

Oh, ok, you like to speak about security? Good, we can have a long and intense
talk about the security of your tools if you like to. The old snmpd (ucd-snmp
had some big security issues agreed, but the new one with MD5 and DES encryption
is encrypted. If you still think that the md5 and des encryption of the stream
is not enough then you can put a tcp-wrapper entry for the snmpd and if you
still
think this is not secure enough, you ssh-tunnel it and put another tcp-wrapper
on
it. So and now tell me how your half-open tcpcheck will be secured against Seq-
Number attacks, if it even doesn't check them? ;)

> I can suggest an integration with keepalived. But i thinks (due to the
> keepalived announce feedback), that
> we must have the two possibilities :
>
> 1. A simple standalone keepalived working all in the director (all
> checks are performed on the director). This
> solution can be usefull for small/medium LVS topology.

Maybe you even want to have a separate machine to have the checks done.
Sometimes
the director is quite busy and imagine doing all those checks on the same
machine.
I've done it in production and I have to say that if you do all the healthchecks
you want to do and interact with LVS the machine must be extremely powerfull. So
I could even imagine having a separate dedicated probe that does the
healthchecking
and reporting to the director which in turn does a setsockopt or a netlink
socket
to inform the kernel to change the LVS-parameters.

> 2. An advanced keepalived daemon working with a listener on the
> director. All the servers push information to this
> listener. Finally the listener send action to LVS via setsockopt.

For me this is just another set of healthchecks but remote healthchecks. You
need
them for example to monitor CPU, RAM unless you take snmp.

> Do you agree ?

Basically yes, I think the concrete design is just not yet clear but lets hear
other opinions on this.

Best regards,
Roberto Nibali, ratz

BTW: do a cc to the lvs-user mailing list.

--
mailto: `echo NrOatSz@tPacA.cMh | sed 's/[NOSPAM]//g'`

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Jan 3, 2001, 6:50 AM

Post #11 of 12 (2421 views)

Hi Alexandre,

> :) I think MD5 encryption is enought, and in such a developpement, we need
> to integrate this problematic.
> >it. So and now tell me how your half-open tcpcheck will be secured against
> Seq-
> >Number attacks, if it even doesn't check them? ;)
>
> :) Ok, In the new release (0.2.3) I wanted to post yesterday (sourceforge
> down :/) and will post today, The tcp check implement :

I'll have a look at it this weekend.

> 1. Create a random (simple not hard random) TCP sequence number.

Why do you generate an extra random ISN?

> 2. Send to the remote host a TCP/IP packet based on that sequence flagged
> to SYN
> 3. Using a two level timouted handled function : tcpcheck wait for SYN|ACK
> reply

hmm, so you set the timeout for the SYN|ACK return? You just have to pay
attention that if someone sets the SYN|ACK timeout higher than in the value
in the proc-fs your socket might be closed before you want it to be.

> 4. When a SYN|ACK packet is received, it looks for the ack seq number
> expected, for @IP src & port
> src.

Good, now tell me, how many parallel checks can you perform? I mean you have,
let's say in a productionary environment, 50 VIPs and 30 of them need tcpcheck
for their realserver. How do you intend to handle the parallelism? I mean, as
soon as you have a non sane result from the check you take the server/service
out and if the check is back ok, you like to insert the server/service back into
the cluster configuration. You also have to handle the case where the last
server
of a VIP will be taken out. This is a special case. Tell me your ideas.

> 5. If packet headers mistmatch 4. tcpcheck wait for 2 scdes to received a
> good answer.

You shouldn't hardcore this. This must be settable. You may want to check a
service every 5 seconds and leave 2 secs of response delay but you also might
check a not so important service every 30 seconds and have the timeout to 5
seconds. BTW: I saw that you use in your code somewhere following structure:

int recvfrom_to(int s, char *buf, int len, struct sockaddr *saddr, int timo){
struct timeval to;
to.tv_sec = timo/1000;
to.tv_usec = 0;
[...]
nfound = select(s+1,&readset,&writeset,NULL,&to);
[...]

Just recall that to.tv_sec is a long! So your granularity should be split up
between to.tv_sec and to.tv_usec.

> 6. If good SYN|ACK is not received within 2 scdes, the packet is probably
> loosed -or network congestion) so tcpcheck perform a 3 times retry.

Hmm, this must also be selectable. You should never hardcore this. If you
have access to the Alteon load balancer GUI or the ServerIron load balancer
GUI, you can see if you dig deep into the configuration that you can set the
amount of repeating tests a healthcheck performs before it will return a
EFAILED.

> => Finally if no answer is received, we assume that the packet is not
> received, so the check is false.

I agreee :)

> If I had time I probably add a MAC address check.

Wouldn't do that since poking around with MAC addresses within a LVS-cluster
is strong tobacco.

> >I could even imagine having a separate dedicated probe that does the
> >healthchecking
> >and reporting to the director which in turn does a setsockopt or a netlink
> >socket to inform the kernel to change the LVS-parameters.
>
> I my mind this is a good design, we limit the communication exchange. The
> advantage is that the checks implementations doesn't depend on the OS type.

That would be nice of course. (although we do all like Linux, don't we:)

> >
> >> 2. An advanced keepalived daemon working with a listener on the
> >> director. All the servers push information to this
> >> listener. Finally the listener send action to LVS via setsockopt.
> >
> >For me this is just another set of healthchecks but remote healthchecks.
> You
> >need them for example to monitor CPU, RAM unless you take snmp.
>
> Yes, we can have a design where we use 2 servers. One for LVS, the other
> for healthcheck. The second can run a snmp engine using SNMP TRAP. This

For heaven's sake, please don't use snmptrap!! That's like all the shit HP
Openview or BMC Patrol or Winblows NetBIOS or TXE shit does. It fills up your
network with crap packets which 95% of the time get lost somewhere and waste
bandwith. If you want information, you get it, if not, keep quiet. We like
to have control over the internal physical net of the LVS-cluster. I tell
you, if you once had to tcpdump in a heterogenic Win$loth environment to
find out why the cluster doesn't work you have to use very long regex syntax
to tcpdump to filter out all the mentioned waste-traffic.

> server can be connected to the LVS server using a secure or dedicated
> connection.

Nope, the LVS-server connects. We have to maintain a security hierarchy.
The securest box should be the load balancer. If you hack it, bye bye load
balancing anyway. It's like if you design a database server for your
firewall logs. You would never send them to the machine, but the machine
would connect to the server and get the logs. So you need no listener on
the logserver and if you establish a connection you use an unpriviledged
port.

Regards,
Roberto Nibali, ratz

--
mailto: `echo NrOatSz@tPacA.cMh | sed 's/[NOSPAM]//g'`

Re: keepalived (was Re: News contrib to LVS) [ In reply to ]

Jan 3, 2001, 2:58 PM

Post #12 of 12 (2418 views)