Hi Ren,
Let me share some information about our setup:
Basics:
~22 relays, thousands of clients worldwide, 3 worker nodes, Linux Virtual
Server managed by heartbeat + ldirectord + a home-grown script for LVS UDP
monitoring, 3 service_ip x 3 service_port x 2 (tcp+udp) = 18 balancing
services
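To make the service count concrete, here is a minimal sketch that enumerates the IP x port x protocol combinations. Only 10.0.0.10:1514 appears in this mail; the other addresses and ports are placeholders invented for illustration.

```python
from itertools import product

# Hypothetical values: only the counts (3 IPs, 3 ports, 2 protocols) and
# 10.0.0.10:1514 come from this mail; the rest are placeholders.
SERVICE_IPS = ["10.0.0.10", "10.0.0.11", "10.0.0.12"]
SERVICE_PORTS = [514, 1514, 2514]
PROTOCOLS = ["tcp", "udp"]


def balancing_services(ips, ports, protocols):
    """Return one entry per (ip, port, protocol) combination."""
    return [
        f"{proto}://{ip}:{port}"
        for ip, port, proto in product(ips, ports, protocols)
    ]


services = balancing_services(SERVICE_IPS, SERVICE_PORTS, PROTOCOLS)
print(len(services))
```

Each entry corresponds to one virtual= block in ldirectord.cf.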
Limitations:
- HB supports two-node clusters only; the third node is just a worker (we
will upgrade to a 3-node cluster with Pacemaker soon)
- one LVS instance for all service IPs and instance types (we will separate
service IP balancing as part of the Pacemaker upgrade)
- the UDP listener needs to use the same port number as the TCP listener for
the LVS UDP monitoring script to work
Example:
ldirectord.cf snippet for the TCP and UDP listeners (they are opened by the
same rsyslog instance on the worker nodes):
virtual=10.0.0.10:1514
        real=10.0.0.1:1514 gate 4
        real=10.0.0.2:1514 gate 4
        real=10.0.0.3:1514 gate 4
        service=none
        scheduler=lc
        protocol=tcp
        checktype=connect

virtual=10.0.0.10:1514
        real=10.0.0.1:1514 gate
        real=10.0.0.2:1514 gate
        real=10.0.0.3:1514 gate
        service=none
        scheduler=rr
        protocol=udp
        checktype=external-perl
        checkcommand=/usr/local/sbin/ldirector_port_check
The script is attached. Feel free to reuse it; the code could use a little
cleanup, but it works fine.
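For readers without the attachment, the following is a rough sketch of what an external port check could look like. It is not the attached script. It assumes that the check probes the TCP listener on the real server and treats the UDP listener on the same port as healthy when TCP answers (one plausible reading of the same-port limitation above), and that ldirectord passes four arguments in the order virtual IP, virtual port, real IP, real port. Being Python rather than Perl, it would need checktype=external instead of external-perl.

```python
import socket
import sys


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def main(argv):
    """Return the exit status: 0 = healthy, 1 = down, 2 = usage error."""
    # Assumed argument order: virtual IP, virtual port, real IP, real port.
    if len(argv) != 5:
        return 2
    try:
        real_ip, real_port = argv[3], int(argv[4])
    except ValueError:
        return 2
    # Assumption: rsyslog opens TCP and UDP on the same port, so a TCP
    # connect is used as a stand-in for the UDP listener's health.
    return 0 if tcp_port_open(real_ip, real_port) else 1
```

To use it as a check command, end the file with sys.exit(main(sys.argv)) so that the return value becomes the exit status.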
Our setup is a combination of the heartbeat + ldirectord setup
http://www.linuxvirtualserver.org/docs/ha/heartbeat_ldirectord.html
with direct routing (DR)
http://www.linuxvirtualserver.org/VS-DRouting.html
and we also use the workers as LB nodes, which needs to be handled properly.
I would also recommend reading about NFTLB, as most modern Linux
distributions already use nftables as the default packet handling framework:
https://www.zevenet.com/knowledge-base/nftlb/what-is-nftlb/
https://github.com/zevenet/nftlb
We plan to focus on the NFTLB setup in the near future.
Do not forget to enable TCP keepalive on both sides. We do not use
rebindinterval, but it might be a good option.
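As an illustration of what TCP keepalive means at the socket level, here is a minimal sketch. rsyslog sets this through its own configuration, which is not shown here; the timing values below are arbitrary examples, not our settings.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive for sock and tune it where the OS allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Tuning options are platform specific (present on Linux); any that the
    # platform does not define are skipped.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before probing
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # failed probes before dropping
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

Since the server side sends no application data, keepalive probes are what let each side notice that the peer or the balancer has silently dropped the connection.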
Keep in mind that the server does not send any data to the client except TCP
ACKs. For plain TCP the data flow is one way only, so plan the balancing
design accordingly. This makes syslog communication different from HTTP.
Good luck!
Peter
On Thu, Jul 9, 2020 at 8:25 AM David Lang via rsyslog <
rsyslog@lists.adiscon.com> wrote:
> yep, one thing to keep in mind when making your logs HA is that the fewer
> dependencies you have, the less likely you are to have your logs fail when
> you need them to troubleshoot why things aren't working :-)
>
> keepalived and corosync/pacemaker both do the job by managing a virtual IP
> on the syslog servers themselves, so nothing else (other than the switch)
> needs to be working, and they can both be set to very fast failovers. They
> also give you the ability to get a health check on rsyslog itself (have
> some source of logs that is frequent and watch for them to stop arriving).
> Keepalived is simpler; Corosync/Pacemaker allows for load sharing and
> multi-site clusters (only send alerts from one system, even across
> datacenters, for example).
>
> external load balancers (HAProxy, F5, etc.) have a much harder time dealing
> with health checks for non-webservers.
>
> David Lang
>
> On Thu, 9 Jul 2020, Benoit DOLEZ via rsyslog wrote:
>
> > HAProxy is a good solution for TCP/RELP LB, but it does not handle the HA
> > part by itself, nor UDP LB. For HA, it is easier to use keepalived: add a
> > check script that looks for the rsyslog process. For UDP LB, you can use LVS.
> >
> > Benoit
> >
> > On 09/07/2020 at 02:46, David Lang via rsyslog wrote:
> >> on the sending side, enable rebindinterval so that the sender disconnects
> >> periodically to give the load balancer a chance to do its job.
> >>
> >> also be aware that tcp syslog can lose data in a failover (see
> >> https://rainer.gerhards.net/2008/04/on-unreliability-of-plain-tcp-syslog.html
> >> ). rsyslog supports the RELP protocol to be reliable in the face of
> >> network failures (relp has one known failure mode that can lose a log,
> >> but so far nobody has cared enough to sponsor a fix for it)
> >>
> >> David Lang
> >> _______________________________________________
> >> rsyslog mailing list
> >> https://lists.adiscon.net/mailman/listinfo/rsyslog
> >> http://www.rsyslog.com/professional-services/
> >> What's up with rsyslog? Follow https://twitter.com/rgerhards
> >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> >> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> >> DON'T LIKE THAT.
> >
> > --
> > Benoit DOLEZ
> > GSM: +33 6 21 05 91 69 mailto:bdolez@ant-computing.com