Mailing List Archive

Recovery from a partitioned cluster
Hi,

I just added basic support in the CVS version of heartbeat for recognizing and
recovering from a split partition.

It does the following things:

Recognizes the split partition

Each side requests its resources back from the other side.
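
As an illustration only (this is not heartbeat's code), the two steps above can be sketched as a toy model. The `Node` class, its method names, and the 192.0.2.x addresses are all hypothetical; it assumes a two-node cluster where each node has a fixed set of resources it is configured to own.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy cluster member; every name here is hypothetical."""
    name: str
    owned: set                                   # resources configured as ours
    active: set = field(default_factory=set)     # resources we serve right now
    peer_believed_dead: bool = False

    def peer_lost(self, peer):
        """Peer's heartbeats stopped: assume it died and take over."""
        self.peer_believed_dead = True
        self.active |= peer.owned

    def heard_from(self, peer):
        """Return True if this heartbeat reveals a split partition,
        i.e. a peer we had declared dead turns out to be alive."""
        if not self.peer_believed_dead:
            return False
        self.peer_believed_dead = False
        return True

    def release(self, resources):
        """Stop serving whichever of `resources` we currently hold."""
        given = self.active & resources
        self.active -= given
        return given

    def request_resources_back(self, peer):
        """Ask the peer to give up what is configured as ours."""
        released = peer.release(self.owned)
        self.active |= self.owned
        return released


def demo():
    a = Node("a", owned={"192.0.2.10"}, active={"192.0.2.10"})
    b = Node("b", owned={"192.0.2.11"}, active={"192.0.2.11"})

    # The link fails: each side thinks the other died and takes over.
    a.peer_lost(b)
    b.peer_lost(a)

    # The link returns: both sides recognize the split partition,
    # then each side requests its own resources back.
    if a.heard_from(b):
        a.request_resources_back(b)
    if b.heard_from(a):
        b.request_resources_back(a)
    return a, b
```

After `demo()` each node serves only its own address again. Nothing in the model repairs state that was damaged while both sides served the same resource.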

This will NOT unscramble bad things that happened because both sides were
serving the same resources at the same time. [Like unscrambling a filesystem,
etc].

However, it will allow it to go on and do semi-reasonable things for the cases
where something semi-reasonable can be done.

For the other cases, you need Stonith support (which is also on CVS), and also
quorum (which is not in CVS yet).

Looking at how simple the result was to accomplish, I just wonder what took me
so long... Sorry!

-- Alan Robertson
alanr@suse.com
Recovery from a partitioned cluster
On Thu, Aug 10, 2000 at 06:44:49PM -0600, Alan Robertson wrote:
> Hi,
>
> I just added basic support in the CVS version of heartbeat for recognizing and
> recovering from a split partition.
>
> It does the following things:
>
> Recognizes the split partition
>
> Each side requests its resources back from the other side.
>
> This will NOT unscramble bad things that happened because both sides were
> serving the same resources at the same time. [Like unscrambling a filesystem,
> etc].
>
> However, it will allow it to go on and do semi-reasonable things for the cases
> where something semi-reasonable can be done.
>
> For the other cases, you need Stonith support (which is also on CVS), and also
> quorum (which is not in CVS yet).
>
> Looking at how simple the result was to accomplish, I just wonder what took me
> so long... Sorry!

So this would be ideal for a situation where the resource is just an
IP address and the cluster has become partitioned and then unpartitioned?

--
Horms
Recovery from a partitioned cluster
Horms wrote:
>
> On Thu, Aug 10, 2000 at 06:44:49PM -0600, Alan Robertson wrote:
> > Hi,
> >
> > I just added basic support in the CVS version of heartbeat for recognizing and
> > recovering from a split partition.
> >
> > It does the following things:
> >
> > Recognizes the split partition
> >
> > Each side requests its resources back from the other side.
> >
> > This will NOT unscramble bad things that happened because both sides were
> > serving the same resources at the same time. [Like unscrambling a filesystem,
> > etc].
> >
> > However, it will allow it to go on and do semi-reasonable things for the cases
> > where something semi-reasonable can be done.
> >
> > For the other cases, you need Stonith support (which is also on CVS), and also
> > quorum (which is not in CVS yet).
> >
> > Looking at how simple the result was to accomplish, I just wonder what took me
> > so long... Sorry!
>
> So this would be ideal for a situation where the resource is just an
> IP address and the cluster has become partitioned and then unpartitioned?

Yes. I haven't tested it extensively, but it seems to basically work.

-- Alan Robertson
alanr@suse.com
Recovery from a partitioned cluster
On Thu, Aug 10, 2000 at 07:23:13PM -0600, Alan Robertson wrote:
> > So this would be ideal for a situation where the resource is just an
> > IP address and the cluster has become partitioned and then unpartitioned?
>
> Yes. I haven't tested it extensively, but it seems to basically work.

Excellent

--
Horms
Recovery from a partitioned cluster
Horms wrote:
>
> On Thu, Aug 10, 2000 at 07:23:13PM -0600, Alan Robertson wrote:
> > > So this would be ideal for a situation where the resource is just an
> > > IP address and the cluster has become partitioned and then unpartitioned?
> >
> > Yes. I haven't tested it extensively, but it seems to basically work.
>
> Excellent

Things like web sites with basically static content, etc. would work this way.

And BEST OF ALL... This is how EVERYONE tests heartbeat -- and they decide it
isn't working when this doesn't do what they expect it to do ;-)

-- Alan Robertson
alanr@suse.com