Mailing List Archive

[LinuxFailSafe] Re: Re: STONITH implementations
Alan Robertson wrote:
>
> Horms wrote:
> >
> > On Thu, Apr 27, 2000 at 10:42:07AM -0600, Alan Robertson wrote:
> > > Folks,
> > >
> > > Does anyone know of any existing code for operating one or more kinds of
> > > remote power-off/reset devices suitable for a STONITH/STOMITH approach?
> > >
> > > STONITH/STOMITH:
> > >
> > > Shoot
> > > The
> > > Other
> > > Node/Machine
> > > In
> > > The
> > > Head
> >
> > VACM http://vacm.sourceforge.net/ has support for controlling
> > Baytek power strips.
>
> Thanks Horms!
>
> It's good information, but it looks like it'll take a little tweaking
> for use with heartbeat or FailSafe. It has lots of tie-ins to the VA
> clustering infrastructure, and it is only set up to work with serial
> communication. Serial isn't suitable for STONITH, because all nodes
> need to be able to power each other off independently.
>
> However, the code clearly shows how to operate the switches, and the
> Baytek hardware seems reasonably nice. They have models that support
> having each machine be on its own UPS, and some models provide telnet
> support. Unfortunately, they seem to cost around $150/port.
>
> One nice thing about telnet: if they only support one caller at a time,
> then this eliminates each machine shooting the other in the head
> simultaneously. Unfortunately, it also means that when the hub goes
> out, STONITH won't work. This could be a real problem unless you have
> redundant heartbeat mechanisms to minimize the possibility of a split
> cluster occurring for this reason... Hmmm...

This is easily resolved:

Don't take over any resources unless you are actually able to
successfully power cycle the errant node. If the hub/switch or
power control unit failed, then by definition, this won't happen.
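
A rough sketch of that rule in Python, for illustration only. The names
take_over_if_fenced, power_cycle, and acquire are hypothetical and are not
part of heartbeat, FailSafe, or VACM. The assumption is that power_cycle
returns True only when the power switch confirms the cycle, and raises
OSError (or a subclass) when the switch or the network path is unreachable.

```python
from typing import Callable, Iterable


def take_over_if_fenced(
    node: str,
    power_cycle: Callable[[str], bool],
    acquire: Callable[[str], None],
    resources: Iterable[str],
) -> bool:
    """Acquire resources only after the errant node was power cycled.

    power_cycle and acquire are hypothetical callables supplied by the
    caller; this function only encodes the ordering rule.
    """
    try:
        fenced = power_cycle(node)
    except OSError:
        # Hub, switch, or power control unit unreachable: we cannot
        # prove the other node is down, so treat it as not fenced.
        fenced = False

    if not fenced:
        # No confirmed power cycle, no takeover.
        return False

    for resource in resources:
        acquire(resource)
    return True
```

If the hub is down, power_cycle fails on both sides of the split, so
neither node acquires anything and the resources stay where they are.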

Sorry for the red herring.

-- Alan Robertson
alanr@suse.com