Using "/etc/ha.d/nodeinfo"--supported?

Sep 16, 2012, 2:52 PM

Hi all.

I'm investigating an HA environment with a simple active/standby
configuration, just two nodes in a cluster with DRBD to provide a shared
partition (using standard Red Hat EL 6.2 packages, such as heartbeat
2.1.2 and DRBD 8.4).

These servers have multiple network interfaces: an internal private
network, over which DRBD, heartbeat, etc. are run, and also a separate
set of interfaces to provide "customer" access. The internal interfaces
have static IP addresses and unchanging hostnames. The external
interfaces do NOT: those interfaces are owned by the "customer", not by
the cluster. They might use DHCP to get IP addresses, and it's
important that the hostnames of the systems, when the customer runs
"uname" etc., be _their_ hostname and not the internal hostname of the
cluster. Users of the system must be free to change these values.

I'm frustrated trying to get this to work robustly with heartbeat.
Explaining why the entire system must be brought down and restarted
merely to change the hostname is somewhat embarrassing as well. If I
could get heartbeat to use my internal, forever-constant names rather
than the results of "uname -n" my system would work so much more
smoothly and reliably, provide more uptime, and require a lot less
effort from me. Because this is a working environment, moving to
completely different technology like corosync is not really feasible.

I found a thread from 2004 discussing the (then?) undocumented support
for the "/etc/ha.d/nodeinfo" file with heartbeat. This seems like the
obviously correct solution. I can't find any information on this
subject more current than that thread, though. Is this feature still
available/supported? Does it work with DRBD as well? Is it something I
can rely on going forward, insofar as heartbeat is still supported?

I must confess myself somewhat taken aback to read in that 2004 thread a
robust defense of the idea that "uname -n" would be the sole true
infallible identifier for a node. Hostnames may be relied upon to be
unique _at any given moment_, yes, but they are a long way from
being _constants_. They do change. While it's useful for status
output, logs, etc. to utilize hostnames as user-readable identifiers, a
design using an internal (constant) identifier for nodes in the cluster
seems to me to be far more reliable and straightforward to manage. I'm
no HA guru however; is there a technical reason why this is difficult or
sub-optimal?

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

Re: Using "/etc/ha.d/nodeinfo"--supported?

paul at mad-scientist

Sep 20, 2012, 5:59 AM

Hi; anyone have any thoughts about the "nodeinfo" file in modern
heartbeat implementations?

Thanks!


Re: Using "/etc/ha.d/nodeinfo"--supported?

lars.ellenberg at linbit

Sep 24, 2012, 3:12 AM

On Thu, Sep 20, 2012 at 08:59:36AM -0400, Paul Smith wrote:
> Hi; anyone have any thoughts about the "nodeinfo" file in modern
> heartbeat implementations?

Subject: "/etc/ha.d/nodeinfo"--supported?
Short answer: probably not.

Longer answer:

It is still in the code, and it has not been changed.
I doubt it has ever been much used or tested.

I guess you just have to try it to find out whether it works for you.

It will pretend that whatever is read from that nodeinfo file (up to the
first newline, if any) is the local node name, and it will export the
HA_CURHOST environment variable for any child processes.
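
A rough sketch of the behaviour described above, written in Python for
illustration only (heartbeat itself is C, and this is not its code). The
function names and the fallback to "uname -n" when the file is missing
or empty are assumptions; only the file path, the first-newline rule and
the HA_CURHOST variable come from the description above.

```python
import os

NODEINFO_PATH = "/etc/ha.d/nodeinfo"  # path named in this thread


def local_node_name(nodeinfo_path=NODEINFO_PATH):
    """Return the nodeinfo override if present, else the uname -n value."""
    try:
        with open(nodeinfo_path, "r") as fh:
            # Only the text up to the first newline is used.
            name = fh.readline().rstrip("\n")
    except OSError:
        name = ""
    # Assumption: a missing or empty file falls back to uname -n.
    return name or os.uname().nodename


def child_environment(nodeinfo_path=NODEINFO_PATH):
    """Build an environment for child processes with HA_CURHOST set."""
    env = dict(os.environ)
    env["HA_CURHOST"] = local_node_name(nodeinfo_path)
    return env
```

Anything that reads the hostname directly instead of HA_CURHOST would
not see the override, which is the gap described in the lines that
follow.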

AFAIK, Pacemaker is not aware of this at all.

Awareness is probably missing from some client libs as well,
so it may or may not work for "some of them".

Most external scripts will not be aware of it either.

DRBD certainly is not.

From a quick glance at the heartbeat source, at least "heartbeat -s" is
not aware of it either, so it would print the "true" uname in its
message, though still the correct status.

Why fake the uname for the cluster?
Why not fake it for "that other application",
which thinks it needs to depend on it?
Or maybe even just add some entry into /etc/hosts,
so the reverse lookup for "that other application"
returns whatever is expected?
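
To illustrate the /etc/hosts idea: with a files-based resolver, a
reverse lookup returns the first name listed after the matching address.
The sketch below mimics that rule in Python on an in-memory string; the
addresses and hostnames are made up, and the real lookup is done by the
system resolver, not by code like this.

```python
def reverse_lookup(hosts_text, address):
    """Return the first name listed for address, or None if not found."""
    for raw in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Format: address, canonical name, then optional aliases.
        if fields[0] == address and len(fields) > 1:
            return fields[1]
    return None


# Hypothetical entries: the customer-facing name is listed first,
# the internal cluster name is kept as an alias.
EXAMPLE_HOSTS = (
    "127.0.0.1 localhost\n"
    "192.0.2.10 customer-name.example.com node-a  # external interface\n"
)
```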

You could also look into using LD_PRELOAD to override the uname() call...

Oh, and you certainly do NOT want to use 2.1.2.

Well, if you want haresources mode, maybe you could.
But you should still use the latest heartbeat.

If you want to use crm mode, use pacemaker.

Hth,

Lars

> Thanks!
>
> On Sun, 2012-09-16 at 17:52 -0400, Paul Smith wrote:
> > Hi all.
> >
> > I'm investigating an HA environment with a simple active/standby
> > configuration, just two nodes in a cluster with DRBD to provide a shared
> > partition (using standard Red Hat EL 6.2 packages, such as heartbeat
> > 2.1.2 and DRBD 8.4).

Uhm, those are "standard RHEL 6.2 packages"?
Really?

> > These servers have multiple network interfaces: an internal private
> > network, over which DRBD, heartbeat, etc. are run, and also a separate
> > set of interfaces to provide "customer" access. The internal interfaces
> > have static IP addresses and unchanging hostnames. The external
> > interfaces do NOT: those interfaces are owned by the "customer", not by
> > the cluster. They might use DHCP to get IP addresses, and it's
> > important that the hostnames of the systems, when the customer runs
> > "uname" etc., be _their_ hostname and not the internal hostname of the
> > cluster. Users of the system must be free to change these values.

Why?

> >
> > I'm frustrated trying to get this to work robustly with heartbeat.
> > Explaining why the entire system must be brought down and restarted
> > merely to change the hostname is somewhat embarrassing as well. If I
> > could get heartbeat to use my internal, forever-constant names rather

BTW, heartbeat *does* use UUIDs internally,
so it would even be able to detect uname changes.

But many parts of the whole cluster stack still rely on the uname being
constant across at least their process lifetime, and pacemaker requires
special treatment when renaming nodes as well.
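
As an illustration only (this is not heartbeat's implementation), keying
node state on a UUID makes a rename show up as a changed label on a
known identity. The class and method names below are hypothetical.

```python
import uuid


class NodeTable:
    """Track the last uname seen for each node UUID."""

    def __init__(self):
        self._names = {}

    def observe(self, node_uuid, uname):
        """Record uname for node_uuid.

        Returns the previous uname if it differs (a rename was detected),
        otherwise None.
        """
        previous = self._names.get(node_uuid)
        self._names[node_uuid] = uname
        if previous is not None and previous != uname:
            return previous
        return None


table = NodeTable()
node_id = uuid.uuid4()                        # stable identity of one node
table.observe(node_id, "node-a")              # first sighting, no rename
renamed_from = table.observe(node_id, "customer-host")  # rename detected
```

Detecting the change is the easy part; the components that cached the
old name for their process lifetime are what make a live rename hard.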

> > than the results of "uname -n" my system would work so much more
> > smoothly and reliably, provide more uptime, and require a lot less
> > effort from me. Because this is a working environment, moving to
> > completely different technology like corosync is not really feasible.

I don't think moving to corosync would change *this* particular problem.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

Re: Using "/etc/ha.d/nodeinfo"--supported?

paul at mad-scientist

Sep 30, 2012, 9:42 PM

Thank you for replying Lars!

On Mon, 2012-09-24 at 12:12 +0200, Lars Ellenberg wrote:
> On Thu, Sep 20, 2012 at 08:59:36AM -0400, Paul Smith wrote:
> > Hi; anyone have any thoughts about the "nodeinfo" file in modern
> > heartbeat implementations?
>
> Subject: "/etc/ha.d/nodeinfo"--supported?
> Short answer: probably not.

Based on this information I agree, this is not sufficient for me. I'll
simply have to continue to script ways to automatically bring down the
cluster, edit the various configuration files to rewrite the hostnames,
etc. and try to make this as robust as possible whenever any hostname
needs to be changed.
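
A minimal sketch of one piece of such a script, assuming the node names
live in the "node" directives of ha.cf. It only rewrites text; stopping
and restarting the cluster, and updating other files such as
haresources or the DRBD configuration, are left out. The function name
and hostnames are hypothetical.

```python
def rename_node(ha_cf_text, old_name, new_name):
    """Return ha.cf text with old_name replaced in 'node' directives only."""
    out = []
    for line in ha_cf_text.splitlines():
        fields = line.split()
        # Touch only lines of the form: node <name> [<name> ...]
        if fields and fields[0] == "node":
            names = [new_name if f == old_name else f for f in fields[1:]]
            line = " ".join(["node"] + names)
        out.append(line)
    return "\n".join(out) + "\n"


EXAMPLE_HA_CF = (
    "keepalive 2\n"
    "# node old-host is documented here\n"
    "node old-host\n"
    "node peer-host\n"
)
updated = rename_node(EXAMPLE_HA_CF, "old-host", "new-host")
```

Matching whole fields rather than substrings avoids rewriting a name
that merely contains the old hostname.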

> Why fake the uname for the cluster?
> Why not fake it for "that other application",
> which thinks it needs to depend on it?
> Or maybe even just add some entry into /etc/hosts,
> so the reverse lookup for "that other application"
> returns whatever is expected?

Well, the thing to be clear about is that I'm not creating a cluster
primarily for the purpose of running heartbeat etc. The HA software is
just there as a necessary infrastructure for delivering the services
that my customers actually want; the only way they care about HA is that
their important services are always available. As long as that is true
they'd prefer to never even know there's HA software running. They
certainly don't want to have it take over a critical piece of
identifying information for their hardware. They care about things like
SNMP, SMTP, web services, database services, etc., all of which make the
hostname of the system visible in one way or another, and so all of
which are impacted by this requirement.

Critical system infrastructure such as HA node referencing requires
unique and immutable identifiers. Hostnames are intended for a
completely different purpose and are designed to be changeable at any
time. Building one on the other is so clearly a mismatch that I'm not
sure what to say to convince the developers of this if it's not already
obvious.

> Oh, and you certainly do NOT want to use 2.1.2.

Unfortunately as I mentioned I'm using the standard software that comes
with Red Hat 6.3. Adding my own customized builds of a different
version would be a significant amount of red tape in terms of support,
etc. that I'm not interested in taking on, unless there are specific
issues with the Red Hat version that will be even more difficult to deal
with.

Thanks again for your response!


Re: Using "/etc/ha.d/nodeinfo"--supported?

lars.ellenberg at linbit

Oct 1, 2012, 8:35 AM

On Mon, Oct 01, 2012 at 12:42:08AM -0400, Paul Smith wrote:
> Thank you for replying Lars!
>
> On Mon, 2012-09-24 at 12:12 +0200, Lars Ellenberg wrote:
> > On Thu, Sep 20, 2012 at 08:59:36AM -0400, Paul Smith wrote:
> > > Hi; anyone have any thoughts about the "nodeinfo" file in modern
> > > heartbeat implementations?
> >
> > Subject: "/etc/ha.d/nodeinfo"--supported?
> > Short answer: probably not.
>
> Based on this information I agree, this is not sufficient for me. I'll
> simply have to continue to script ways to automatically bring down the
> cluster, edit the various configuration files to rewrite the hostnames,
> etc. and try to make this as robust as possible whenever any hostname
> needs to be changed.
>
> > Why fake the uname for the cluster?
> > Why not fake it for "that other application",
> > which thinks it needs to depend on it?
> > Or maybe even just add some entry into /etc/hosts,
> > so the reverse lookup for "that other application"
> > returns whatever is expected?
>

[.... snipped a perfectly obvious answer ...
to a different question ;-) ... ]

BTW, I cannot remember when I last changed a hostname without a reboot,
or even a complete reinstall.

Of course not just because the hostname changed.

But because whatever made me change the hostname, also changed quite a
number of other things outside of that specific host ...

Use cases differ.

Note that I am NOT saying that relying on constant and unique hostnames
was the best design choice.
Obviously there are use cases where it clearly is not.
But the same goes for anything else that depends on the hostname having
a specific special value.

> > Oh, and you certainly do NOT want to use 2.1.2.
>
> Unfortunately as I mentioned I'm using the standard software that comes
> with Red Hat 6.3.

I was not aware that RHEL 6 shipped heartbeat?

> Adding my own customized builds of a different
> version would be a significant amount of red tape in terms of support,
> etc. that I'm not interested in taking on, unless there are specific
> issues with the Red Hat version that will be even more difficult to deal
> with.

heartbeat 2.1.2 in haresources mode should be just as good as any older
version ;-)

For the haresources mode, probably the only interesting changes in later
versions of heartbeat are the improved behaviour of the messaging layer
under moderate to heavy packet loss, or when experiencing sporadic very
high communication latency.
If you think you won't be affected, fine.

Heartbeat 2.x.y in crm mode (cib xml configuration stuff):
Don't even bother. Too much to list here.

Pacemaker is there for a reason, and it even ships with RHEL 6.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com