TODO/WANNADO list...
Hi!

I'm rewriting the heartbeat starting process under nice_failback. I'm
also adding some ideas that may help those who are looking for
statistics or info about the cluster.
The main ideas are:

1. To get the starting process correct, I'll create a variable called
CLUSTER where the bits have the following meaning when set:

             ME  | OTHER
         ________|________
CLUSTER: [7|6|5|4|3|2|1|0]

Bits Meaning
0,4 - Host is alive
1,5 - Host is primary
2,6 - Host has the resources
3,7 - Host is starting
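In C, the packed byte could look something like this (just a sketch;
the macro names are mine, not from the actual patch):

    /* CLUSTER status byte: high nibble = ME, low nibble = OTHER.
     * All names here are illustrative, not the real heartbeat code. */
    #define HB_ALIVE      0x01   /* host is alive */
    #define HB_PRIMARY    0x02   /* host is primary */
    #define HB_HOLDS_RSC  0x04   /* host has the resources */
    #define HB_STARTING   0x08   /* host is starting */

    #define ME(flag)      ((flag) << 4)   /* bits 4-7: this node */
    #define OTHER(flag)   (flag)          /* bits 0-3: the peer */

    static unsigned char cluster = 0;

    /* e.g. mark ourselves alive and starting:
     *     cluster |= ME(HB_ALIVE) | ME(HB_STARTING);
     * e.g. test whether the peer holds resources:
     *     if (cluster & OTHER(HB_HOLDS_RSC)) ...
     */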

2. The *simplified* starting algorithm will be (under nice_failback):

1- Host A is starting. It sends out a starting message.

2- Is host B alive?
2.1- No. Turn nice_failback off.
Take over the resources.

3- Yes. Is host B starting?
3.1- Yes. Turn nice_failback off.
Wait until req_our_resources does the usual stuff
(that means: the primary will take over the resources).

4- No. Is host B holding any resource?
4.1- Yes. Act as a secondary.

5- No. Turn nice_failback off.
Take over the resources.
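Roughly, in C, the decision chain would look like this (reusing the
bit macros sketched above; the helper functions and the nice_failback
flag here are placeholders, not the real heartbeat API):

    extern int nice_failback;
    void send_starting_msg(void);
    void takeover_resources(void);
    void act_as_secondary(void);

    void on_startup(unsigned char cluster)
    {
            send_starting_msg();                        /* step 1 */

            if (!(cluster & OTHER(HB_ALIVE))) {         /* step 2 */
                    nice_failback = 0;
                    takeover_resources();               /* step 2.1 */
            } else if (cluster & OTHER(HB_STARTING)) {  /* step 3 */
                    nice_failback = 0;
                    /* step 3.1: wait; req_our_resources does the
                     * usual stuff, so the primary takes over */
            } else if (cluster & OTHER(HB_HOLDS_RSC)) { /* step 4 */
                    act_as_secondary();                 /* step 4.1 */
            } else {                                    /* step 5 */
                    nice_failback = 0;
                    takeover_resources();
            }
    }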

3. Maybe a variable called FAULTS (4 bits for me and 4 bits for the
other node) that counts how many times each host went out for any
reason. If any host went out more than X (up to 15) times, report it
to the supervisor.
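A possible sketch of the two packed counters (again, all names are
illustrative; the report threshold "X" is left configurable):

    /* FAULTS byte: two saturating 4-bit counters.
     * High nibble counts my failures, low nibble the other node's. */
    static unsigned char faults = 0;
    static unsigned char fault_limit = 10;   /* "X", at most 15 */

    void report_to_supervisor(int is_me);    /* hypothetical hook */

    void count_fault(int is_me)
    {
            unsigned char n = is_me ? (faults >> 4) : (faults & 0x0F);

            if (n < 15)     /* a nibble can only count to 15 */
                    n++;
            faults = is_me ? (unsigned char)((n << 4) | (faults & 0x0F))
                           : (unsigned char)((faults & 0xF0) | n);

            if (n > fault_limit)
                    report_to_supervisor(is_me);
    }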

Ideas?

Well, let's code. :)

[ Luis Claudio R. Goncalves lclaudio@conectiva.com.br ]
[. BSc in Computer Science -- MSc coming soon -- Gospel User -- Linuxer ]
[. Fault Tolerance - Real-Time - Distributed Systems - IECLB - IS 40:31 ]
[. LateNite Programmer -- Jesus Is The Solid Rock On Which I Stand -- ]
TODO/WANNADO list...
On Thu, Apr 13, 2000 at 04:06:23PM -0300, Luis Claudio R. Goncalves wrote:
>
> Hi!
>
> I'm rewriting the heartbeat starting process under nice_failback. I'm
> also adding some ideas that may help those who are looking for
> statistics or info about the cluster.
> The main ideas are:
>
> 1. To get the starting process correct, I'll create a variable called
> CLUSTER where the bits have the following meaning when set:
>
>              ME  | OTHER
>          ________|________
> CLUSTER: [7|6|5|4|3|2|1|0]
>
> Bits Meaning
> 0,4 - Host is alive
> 1,5 - Host is primary
> 2,6 - Host has the resources
> 3,7 - Host is starting
>
> 2. The *simplified* starting algorithm will be (under nice_failback):
>
> 1- Host A is starting. It sends out a starting message.
>
> 2- Is host B alive?
> 2.1- No. Turn nice_failback off.
> Take over the resources.
>
> 3- Yes. Is host B starting?
> 3.1- Yes. Turn nice_failback off.
> Wait until req_our_resources does the usual stuff
> (that means: the primary will take over the resources).
>
> 4- No. Is host B holding any resource?
> 4.1- Yes. Act as a secondary.
>
> 5- No. Turn nice_failback off.
> Take over the resources.
>
> 3. Maybe a variable called FAULTS (4 bits for me and 4 bits for the
> other node) that counts how many times each host went out for any
> reason. If any host went out more than X (up to 15) times, report it
> to the supervisor.
>
> Ideas?

Question: is it intended to handle only two nodes? If not, is this
going to work with arbitrary numbers of nodes?

Also, is there any reason to use such a tightly packed representation?

Curious

-dg


--
David Gould dgould@suse.com
If simplicity worked, the world would be overrun with insects.
TODO/WANNADO list...
Hi!

> Question, is it intended to only handle two nodes? If not, is this
> going to work with arbitrary numbers of nodes?

For now, I'm looking for a two-node solution... before we get
into the N-node stuff there is some more discussion to be had. In an
N-node environment you will need to know everything about who is
running what, you must take care of the services (and not only
of the nodes), and there is more stuff that isn't needed for a two-node
solution.
...
Anyway, some of the ideas can be used for an N-node solution. You
can put a memory struct in the place where you have the CLUSTER
variable. There are some people looking at the N-node situation.
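Something like this, maybe (field names invented just for
illustration):

    /* One status record per node instead of packed nibbles. */
    struct node_status {
            char         name[64];           /* node name from ha.cf */
            unsigned int alive:1;
            unsigned int primary:1;
            unsigned int holds_resources:1;
            unsigned int starting:1;
    };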

> Also, is there any reason to use such a tightly packed representation?

The code seems cleaner... I was used to coding in Z80 assembly, so
this is very natural for me. :) And all the options packed into CLUSTER
are binary ones.

Hugs!

Luis

[ Luis Claudio R. Goncalves lclaudio@conectiva.com.br ]
[. BSc in Computer Science -- MSc coming soon -- Gospel User -- Linuxer ]
[. Fault Tolerance - Real-Time - Distributed Systems - IECLB - IS 40:31 ]
[. LateNite Programmer -- Jesus Is The Solid Rock On Which I Stand -- ]
TODO/WANNADO list...
Hi Luis,

Sorry for the delay in replying, but I was travelling...

"Luis Claudio R. Goncalves" wrote:
>
> Hi!
>
> I'm rewriting the heartbeat starting process under nice_failback. I'm
> also adding some ideas that may help those who are looking for
> statistics or info about the cluster.
> The main ideas are:
>
> 1. To get the starting process correct, I'll create a variable called
> CLUSTER where the bits have the following meaning when set:
>
>              ME  | OTHER
>          ________|________
> CLUSTER: [7|6|5|4|3|2|1|0]
>
> Bits Meaning
> 0,4 - Host is alive
> 1,5 - Host is primary
> 2,6 - Host has the resources
> 3,7 - Host is starting

Whoa! From this plan, it looks like you plan on nailing this down to
two hosts and one set of resources. I would be *very* uncomfortable
with such a change. The heartbeat code currently doesn't have either
assumption, and I'm not keen to see it get added.


>
> 2. The *simplified* starting algorithm will be (under nice_failback):
>
> 1- Host A is starting. It sends out a starting message.
>
> 2- Is host B alive?
> 2.1- No. Turn nice_failback off.
> Take over the resources.
>
> 3- Yes. Is host B starting?
> 3.1- Yes. Turn nice_failback off.
> Wait until req_our_resources does the usual stuff
> (that means: the primary will take over the resources).
>
> 4- No. Is host B holding any resource?
> 4.1- Yes. Act as a secondary.
>
> 5- No. Turn nice_failback off.
> Take over the resources.
>
> 3. Maybe a variable called FAULTS (4 bits for me and 4 bits for the
> other node) that counts how many times each host went out for any
> reason. If any host went out more than X (up to 15) times, report it
> to the supervisor.


-- Alan Robertson
alanr@suse.com
TODO/WANNADO list...
"Luis Claudio R. Goncalves" wrote:
>
> Hi!
>
> On Mon, 17 Apr 2000, Alan Robertson wrote:
> ...
> > > I'm rewriting the heartbeat starting process under nice_failback. I'm
> > > also adding some ideas that may help those who are looking for
> > > statistics or info about the cluster.
> > > The main ideas are:
> > >
> > > 1. To get the starting process correct, I'll create a variable called
> > > CLUSTER where the bits have the following meaning when set:
> > >
> > >              ME  | OTHER
> > >          ________|________
> > > CLUSTER: [7|6|5|4|3|2|1|0]
> > >
> > > Bits Meaning
> > > 0,4 - Host is alive
> > > 1,5 - Host is primary
> > > 2,6 - Host has the resources
> > > 3,7 - Host is starting
> >
> > Whoa! From this plan, it looks like you plan on nailing this down to
> > two hosts and one set of resources. I would be *very* uncomfortable
> > with such a change. The heartbeat code currently doesn't have either
> > assumption, and I'm not keen to see it get added.
>
> There are two points here... :)
> The first one is that this kind of situation wouldn't occur, at
> least this way, in a multinode setup. If you have N nodes you surely
> will have load balancing or at least the service active on all nodes.

Not necessarily. And, if your code demands that you only have two
nodes, then it won't ever happen. It may be the case that heartbeat
isn't managing resources for larger clusters, just acting as heartbeat.

> In
> a two-node setup you may have a situation in which both hosts are
> up and no one has the resources... and, of course, if it could occur in
> a multinode scene, it'd be *more* painful to solve.

Yes, but the proposed implementation puts these assumptions in the main
part of the heartbeat code. It makes the current code less functional
in some respects.

> Second, this stuff addresses the starting protocol... it'd be used
> only when a host starts heartbeat. After startup this code has no
> further use. Anyway, I can put all this stuff in a memory struct like
> resources_held and expand it to work with N nodes.

But it doesn't look like the proposed design will function at all in the
context of more than 2 nodes. It's hard wired for two -- period. Right
now, the initial takeover of resources is handled by external programs
(in particular, shell scripts). I could easily see putting the logic
being discussed into an external program (C or script), and doing
whatever you want there. The current external logic has that
assumption. If you want to keep that assumption out there, that's OK,
but I don't want it in the main part of the code.

> Another cool detail is that when you have two nodes and only one of
> them is up, it surely has all the resources.

It surely *should* have all the resource groups, and you should
guarantee that no race conditions occur with regard to startup.

> So the "has resources"
> variable can be used as one bit. The resources_held struct, the one
> that lists the resources the host holds, is dynamic and can handle as
> many resources as you can put in haresources.

You can have an arbitrary number of resource groups per machine. It's
clearly not limited to just one group. That isn't required by the
current code in any way. I'm sure there are people out there that have
more than one resource group per machine.

> Could you please read the code again and let me know whether I
> should stop working on it or not? :)

I don't mind the approach in general, but not if it's part of the main
heartbeat code. I'll look over your patch again. Again, I'm sorry I
didn't get back to you sooner.

I understand and appreciate the need you're trying to address. With a
little more effort, I'm sure that you'll come up with a design that
doesn't limit the use of heartbeat in other contexts.

Thanks for all your efforts!

-- Alan Robertson
alanr@suse.com
TODO/WANNADO list...
"Luis Claudio R. Goncalves" wrote:
>
> On Mon, 17 Apr 2000, Alan Robertson wrote:
>
> > > There are two points here... :)
> > > The first one is that this kind of situation wouldn't occur, at
> > > least this way, in a multinode setup. If you have N nodes you surely
> > > will have load balancing or at least the service active on all nodes.
> >
> > Not necessarily. And, if your code demands that you only have two
> > nodes, then it won't ever happen. It may be the case that heartbeat
> > isn't managing resources for larger clusters, just acting as heartbeat.
>
> It is possible that you have nice_failback on and use N
> hosts.

Not with the code you defined. It was strictly "me and the other guy".

> When you start up a host, it will look for someone alive and
> holding resources. On finding this guy, it simply doesn't take the
> resources. Otherwise, it will take the resources it is configured to hold.
> I think this is the main problem here: I don't want (or someone
> may not want) the host that has the resources defined in haresources
> to take them back every time it starts up - if someone already holds the
> resources. It means, among other things, that when this host (which I
> would call Master/Primary, using the terminology I adopted for the
> messages) starts heartbeat, and the resources are held by someone
> else, all the negotiations in progress may stop. That is not good for what
> we're looking for... this is why all this nice_failback(tm) stuff came
> to life.

Yes. I think negotiation is the way to go. Perhaps simply have the far
end guy say "no" when he is asked to give up the resources. It'll be a
little complicated to distinguish this case from the "no response" case
given the current scripts, but not that complicated.
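Something along these lines, perhaps (the message names and helpers
are made up for illustration; they're not the current heartbeat API):

    enum reply { REPLY_YES, REPLY_NO, REPLY_TIMEOUT };

    void send_msg(const char *msgtype);
    enum reply wait_for_reply(const char *msgtype, int timeout_secs);
    void takeover_resources(void);
    void act_as_secondary(void);

    void request_resources(void)
    {
            send_msg("req_resources");

            switch (wait_for_reply("resource_reply", 10)) {
            case REPLY_NO:        /* peer is alive and keeping them */
                    act_as_secondary();
                    break;
            case REPLY_TIMEOUT:   /* no response: peer presumed dead */
            case REPLY_YES:       /* peer agreed to release them */
                    takeover_resources();
                    break;
            }
    }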

>
> > > a two-node setup you may have a situation in which both hosts are
> > > up and no one has the resources... and, of course, if it could occur in
> > > a multinode scene, it'd be *more* painful to solve.
> >
> > Yes, but the proposed implementation puts these assumptions in the main
> > part of the heartbeat code. It makes the current code less functional
> > in some respects.
>
> Maybe in the conceptual world. :) But I can't see this loss of
> functionality, mainly because you can easily turn it off. It seems to
> me like special-case treatment.
> Well, as it is a need for me, I'll start writing some scripts.
> (Where's my bash wizards book???)

Write it in C if you like, just put it in a separate binary.

> > But it doesn't look like the proposed design will function at all in the
> > context of more than 2 nodes. It's hard wired for two -- period.
>
> It's sufficient that one host is up and holding resources to satisfy
> the nice_failback requests. It doesn't matter which resources or how
> many. And it may be used in an N-host environment only if you
> use heartbeat as a heartbeat, not as a cluster manager.
> If heartbeat is your cluster manager, which makes more sense for
> N-host environments, turn off nice_failback.
> This is just my humble opinion. Anyway, I don't mind rewriting this
> stuff in the scripts. I'd only ask you to pray for me... :)

And for Heartbeat ;-) But, feel free to write a little "C" program if
you'd rather. It would be simpler in some ways...

> > Right
> > now, the initial takeover of resources is handled by external programs
> > (in particular, shell scripts). I could easily see putting the logic
> > being discussed into an external program (C or script), and doing
> > whatever you want there.
>
> Going to the meta-world, or the conceptual world, it makes the main
> code more beautiful and clean... but pollutes everything around it.
> IMHO heartbeat should handle more than two states. ON or OFF in the
> core code plus one or two more abstractions done by the scripts isn't
> a good architectural view.

I'm afraid I didn't follow this. I claim that heartbeat simply tells
you when nodes come and go. Everything else is Somebody Else's
Problem(tm). This is very simple architecturally. It might mean
rewriting the takeover script, but it's not very complicated. It
doesn't have to be a bash script.

> > > Another cool detail is that when you have two nodes and only one of
> > > them is up, it surely has all the resources.
> >
> > It surely *should* have all the resource groups, and you should
> > guarantee that no race conditions occur with regard to startup.
>
> Ooooops...
> How/Where/Why me? :)

The possibility seems to exist. Let's see if I can give an idea...
When one side starts up, it asks the other to give up its resources.
It is coming up too, and it asks you to give up your resources. Since
neither side has any, both think they can start them up. Now, because
of the way the scripts are written, and the fact that we have a
"natural" master, this may not happen, but it should be walked through
to make sure that it can't happen for either the normal or "nice
failback" case.

> > You can have an arbitrary number of resource groups per machine. It's
> > clearly not limited to just one group. That isn't required by the
> > current code in any way. I'm sure there are people out there that have
> > more than one resource group per machine.
>
> But for the starting stuff, if someone holds at least one of the
> resources it may hold... that's good enough for us. The resources_held
> structure (which I used only to count how many resources I have, and
> to list each of them to help Horms) actually lists all the resource
> groups a node handles. :)
>
> > I don't mind the approach in general, but not if it's part of the main
> > heartbeat code. I'll look over your patch again. Again, I'm sorry I
> > didn't get back to you sooner.
>
> I'm feeling more comfortable about putting this stuff in the
> scripts now. But I still think that two states in the core are too few.

Still don't understand this. The core code doesn't know or care about
the resources. It just tracks nodes... *That's* the key distinction.

> Anyway, let's code. :)
>
> > I understand and appreciate the need you're trying to address. With a
> > little more effort, I'm sure that you'll come up with a design that
> > doesn't limit the use of heartbeat in other contexts.
>
> The main problem I had is that I'm looking for a two-host
> solution. There are things that are easy to solve in a two-node
> fashion.

Understood. I just don't want to break it for multi-nodes when you're
just tracking nodes. This makes heartbeat useful in other contexts -
like possibly in LinuxFailSafe :-)

-- Alan Robertson
alanr@suse.com
TODO/WANNADO list...
Hi!

On Mon, 17 Apr 2000, Alan Robertson wrote:
...
> > I'm rewriting the heartbeat starting process under nice_failback. I'm
> > also adding some ideas that may help those who are looking for
> > statistics or info about the cluster.
> > The main ideas are:
> >
> > 1. To get the starting process correct, I'll create a variable called
> > CLUSTER where the bits have the following meaning when set:
> >
> >              ME  | OTHER
> >          ________|________
> > CLUSTER: [7|6|5|4|3|2|1|0]
> >
> > Bits Meaning
> > 0,4 - Host is alive
> > 1,5 - Host is primary
> > 2,6 - Host has the resources
> > 3,7 - Host is starting
>
> Whoa! From this plan, it looks like you plan on nailing this down to
> two hosts and one set of resources. I would be *very* uncomfortable
> with such a change. The heartbeat code currently doesn't have either
> assumption, and I'm not keen to see it get added.

There are two points here... :)
The first one is that this kind of situation wouldn't occur, at
least this way, in a multinode setup. If you have N nodes you surely
will have load balancing or at least the service active on all nodes. In
a two-node setup you may have a situation in which both hosts are
up and no one has the resources... and, of course, if it could occur in
a multinode scene, it'd be *more* painful to solve.
Second, this stuff addresses the starting protocol... it'd be used
only when a host starts heartbeat. After startup this code has no
further use. Anyway, I can put all this stuff in a memory struct like
resources_held and expand it to work with N nodes.
Another cool detail is that when you have two nodes and only one of
them is up, it surely has all the resources. So the "has resources"
variable can be used as one bit. The resources_held struct, the one
that lists the resources the host holds, is dynamic and can handle as
many resources as you can put in haresources.

Could you please read the code again and let me know whether I
should stop working on it or not? :)

Hugs!

Luis
[ Luis Claudio R. Goncalves lclaudio@conectiva.com.br ]
[. BSc in Computer Science -- MSc coming soon -- Gospel User -- Linuxer ]
[. Fault Tolerance - Real-Time - Distributed Systems - IECLB - IS 40:31 ]
[. LateNite Programmer -- Jesus Is The Solid Rock On Which I Stand -- ]
TODO/WANNADO list...
On Mon, 17 Apr 2000, Alan Robertson wrote:

> > There are two points here... :)
> > The first one is that this kind of situation wouldn't occur, at
> > least this way, in a multinode setup. If you have N nodes you surely
> > will have load balancing or at least the service active on all nodes.
>
> Not necessarily. And, if your code demands that you only have two
> nodes, then it won't ever happen. It may be the case that heartbeat
> isn't managing resources for larger clusters, just acting as heartbeat.

It is possible that you have nice_failback on and use N
hosts. When you start up a host, it will look for someone alive and
holding resources. On finding this guy, it simply doesn't take the
resources. Otherwise, it will take the resources it is configured to hold.
I think this is the main problem here: I don't want (or someone
may not want) the host that has the resources defined in haresources
to take them back every time it starts up - if someone already holds the
resources. It means, among other things, that when this host (which I
would call Master/Primary, using the terminology I adopted for the
messages) starts heartbeat, and the resources are held by someone
else, all the negotiations in progress may stop. That is not good for what
we're looking for... this is why all this nice_failback(tm) stuff came
to life.

> > In
> > a two-node setup you may have a situation in which both hosts are
> > up and no one has the resources... and, of course, if it could occur in
> > a multinode scene, it'd be *more* painful to solve.
>
> Yes, but the proposed implementation puts these assumptions in the main
> part of the heartbeat code. It makes the current code less functional
> in some respects.

Maybe in the conceptual world. :) But I can't see this loss of
functionality, mainly because you can easily turn it off. It seems to
me like special-case treatment.
Well, as it is a need for me, I'll start writing some scripts.
(Where's my bash wizards book???)

> But it doesn't look like the proposed design will function at all in the
> context of more than 2 nodes. It's hard wired for two -- period.

It's sufficient that one host is up and holding resources to satisfy
the nice_failback requests. It doesn't matter which resources or how
many. And it may be used in an N-host environment only if you
use heartbeat as a heartbeat, not as a cluster manager.
If heartbeat is your cluster manager, which makes more sense for
N-host environments, turn off nice_failback.
This is just my humble opinion. Anyway, I don't mind rewriting this
stuff in the scripts. I'd only ask you to pray for me... :)

> Right
> now, the initial takeover of resources is handled by external programs
> (in particular, shell scripts). I could easily see putting the logic
> being discussed into an external program (C or script), and doing
> whatever you want there.

Going to the meta-world, or the conceptual world, it makes the main
code more beautiful and clean... but pollutes everything around it.
IMHO heartbeat should handle more than two states. ON or OFF in the
core code plus one or two more abstractions done by the scripts isn't
a good architectural view.

> > Another cool detail is that when you have two nodes and only one of
> > them is up, it surely has all the resources.
>
> It surely *should* have all the resource groups, and you should
> guarantee that no race conditions occur with regard to startup.

Ooooops...
How/Where/Why me? :)

> You can have an arbitrary number of resource groups per machine. It's
> clearly not limited to just one group. That isn't required by the
> current code in any way. I'm sure there are people out there that have
> more than one resource group per machine.

But for the starting stuff, if someone holds at least one of the
resources it may hold... that's good enough for us. The resources_held
structure (which I used only to count how many resources I have, and
to list each of them to help Horms) actually lists all the resource
groups a node handles. :)

> I don't mind the approach in general, but not if it's part of the main
> heartbeat code. I'll look over your patch again. Again, I'm sorry I
> didn't get back to you sooner.

I'm feeling more comfortable about putting this stuff in the
scripts now. But I still think that two states in the core are too few.
Anyway, let's code. :)

> I understand and appreciate the need you're trying to address. With a
> little more effort, I'm sure that you'll come up with a design that
> doesn't limit the use of heartbeat in other contexts.

The main problem I had is that I'm looking for a two-host
solution. There are things that are easy to solve in a two-node
fashion.

> Thanks for all your efforts!

:]

Luis

[ Luis Claudio R. Goncalves lclaudio@conectiva.com.br ]
[. BSc in Computer Science -- MSc coming soon -- Gospel User -- Linuxer ]
[. Fault Tolerance - Real-Time - Distributed Systems - IECLB - IS 40:31 ]
[. LateNite Programmer -- Jesus Is The Solid Rock On Which I Stand -- ]