Mailing List Archive

Probable "sub-optimal" behavior in ucast
Hi,

I have a 10-node system with each system having 2 interfaces, and
therefore each ha.cf file has 18 ucast lines in it (9 peers times 2
interfaces).

If I read the code correctly, I think each heartbeat packet is then
being received 18 times and sent to the master control process - where
each is then uncompressed and 17 of them are thrown away...

Could someone else offer your thoughts on this?

It looks to be a 2 or 3 line fix in ucast.c to throw away ucast packets
that aren't from the address we expect - which would cut us down to only
one of them being sent from each of the interfaces - a 9 to 1 reduction
in work on the master control process. And I don't have to uncompress
them to throw them away - I can just look at the source IP address...

What do you think?

--
Alan Robertson <alanr@unix.sh> - @OSSAlanR

"Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
Re: Probable "sub-optimal" behavior in ucast
On Mon, Jul 30, 2012 at 02:02:38PM -0600, Alan Robertson wrote:
> Hi,
>
> I have a 10-node system with each system having 2 interfaces, and
> therefore each ha.cf file has 18 ucast lines in it.
>
> If I read the code correctly, I think each heartbeat packet is then
> being received 18 times and sent to the master control process -
> where each is then uncompressed and 17 of them are thrown away...
>
> Could someone else offer your thoughts on this?
>
> It looks to be a 2 or 3 line fix in ucast.c to throw away ucast
> packets that aren't from the address we expect - which would cut us
> down to only one of them being sent from each of the interfaces - a
> 9 to 1 reduction in work on the master control process. And I don't
> have to uncompress them to throw them away - I can just look at the
> source IP address...
>
> What do you think?

Leaving aside that a ten-node cluster is likely to break the 64k message
size limit, even after compression...

You probably should re-organize the code so that you only have
one receiving ucast socket per nic/ip/port.

But I think that a single UDP packet will be delivered to
a single socket, even though you have 18 receiving sockets
bound to the same port (which is possible only because of SO_REUSEPORT).

If, as I think, we receive on just one of them - with which one
determined by the kernel, not us - then your suggested ingress filter on
"expected" source IP would break communications.

Do you have evidence for the assumption that you receive incoming
packets on all sockets, and not on just one of them?


--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
Re: Probable "sub-optimal" behavior in ucast
On 07/31/2012 01:58 AM, Lars Ellenberg wrote:
> Besides that a ten node cluster is likely to break the 64k message size
> limit, even after compression...
The CIB is about 20K before compression... So I think we're not in as
bad a shape as I would have guessed.
>
> You probably should re-organize the code so that you only have
> one receiving ucast socket per nic/ip/port.
That would be a big change, or so it seems to me. Right now, the parent
code doesn't look at the parameters given to its children...
>
> But I think that a single UDP packet will be delivered to
> a single socket, even though you have 18 receiving sockets
> bound to the same port (possible because of SO_REUSEPORT, only).
I was having various troubles with the system and wasn't sure debugging
was actually taking effect. But your explanation may be the right one.
I will get some more time on one of the systems in the next few days and
verify that.
> If we, as I think we do, receive on just one of them, where which one is
> determined by the kernel, not us, your suggested ingress filter on
> "expected" source IP would break communications.
Good point.
>
> Do you have evidence for the assumption that you receive incoming
> packets on all sockets, and not on just one of them?
I wasn't sure, actually - because of the troubles mentioned above. I'll
check back in and let you know...

I saw the IPC (!) having troubles on one of the systems - the CIB was
trying to send packets that were getting lost, and eventually the CIB
lost its connection to Heartbeat. I could not imagine what could cause
that - so this was my theory. We had a resource that we were trying to
restart, but because of some disk problem it wouldn't actually restart.
About this time, on a different machine (the DC), we saw this IPC issue.

If you have an idea what could cause IPC to behave this way I'd be happy
to know what it was...

-- Alan Robertson
alanr@unix.sh
Re: Probable "sub-optimal" behavior in ucast
On 07/31/2012 06:08 AM, Alan Robertson wrote:
>
> I wasn't sure, actually - because of the troubles mentioned above. I'll
> check back in and let you know...
Only two of the read processes are accumulating any CPU - it's the last
one on each interface.

You hit it spot on.

Thanks Lars!


--
Alan Robertson <alanr@unix.sh> - @OSSAlanR

"Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce