This is a multi-part message in MIME format.
--------------DC9CFC494943C9EF537941A6
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Hi Marcelo,
David Brower and others have expressed interest in this, so I've
decided to send this to the linux-ha-dev list, and let others offer
their comments too. Because of David's request, I decided to CC a
couple of FailSafe people too.
I've attached my latest version of the API header file.
Marcelo Tosatti wrote:
>
> Alan,
>
> How do you plan to do the interprocess communication? Signals and FIFO's?
> I'm asking that because i'm interested in starting implementing it.
Actually, I had been planning on implementing it... But I probably
don't have time...
Let's go ahead and discuss it - and decide who will implement it as we
go along...
If you wind up doing it, I'll want to review it very carefully.
This is an item that I would put in blue on the TODO list...
Expect me to be cranky and nit-picky if you do it. I reserve the right
to be unreasonable... ;-)
Let's put out a release with the current fixes in it before we commit
any of these changes to CVS.
Horms: Are you ready for this new release now?
Here's my plan:
Write the requests to the common FIFO /var/run/heartbeat-fifo/
Make a well-known client FIFO directory for clients to make FIFOs in
pid == FIFO name... Probably /var/run/heartbeat-clients
The messages in the FIFOs would be the famous "ha_msg" messages... ;-)
Add special message types to handle the queries from clients.
Add a well-known field type maybe "orig_pid" which specifies the PID
of the process making the request (hence the FIFO name)
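To make the pid == FIFO name idea concrete, here is a minimal sketch of how a client might build the path of its reply FIFO. The helper name client_fifo_path and the macro HB_CLIENT_FIFO_DIR are made up for illustration, and the directory is only the "probably" location mentioned above, not a settled choice.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Directory floated in the plan above; the final location is not settled. */
#define HB_CLIENT_FIFO_DIR "/var/run/heartbeat-clients"

/*
 * client_fifo_path (hypothetical helper): build the reply-FIFO path for
 * the client with the given pid, i.e. "<dir>/<pid>".
 * Returns the length of the path on success, or -1 if buf is too small.
 */
int
client_fifo_path(char *buf, size_t buflen, pid_t pid)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, buflen, "%s/%ld", HB_CLIENT_FIFO_DIR, (long)pid);
	if (n < 0 || (size_t)n >= buflen)
		return -1;	/* output error or truncated path */
	return n;
}
```

A client would presumably call this with getpid() and hand the result to mkfifo(3) before sending its first request; heartbeat could rebuild the same path from the orig_pid field.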
Locally handled requests should probably have some kind of convention
in their types like "lr-" or something... Then you could make
sure they don't accidentally get written to the cluster, and
whine about them in the logs, and return an automatic
response reporting failure for unimplemented requests.
Make sure you handle dead clients or clients whose reply FIFOs might
be full...
Replies to messages should have types that match the request, but end
in "-resp"
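The two naming conventions above ("lr-" for locally handled requests, "-resp" for replies) could be checked with something like the sketch below. The function names and the request type "lr-nodelist" in the usage note are invented for illustration; only the prefix and suffix strings come from the plan, and even the prefix is just "or something" at this point.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LOCAL_REQ_PREFIX "lr-"		/* convention floated above, not final */
#define REPLY_SUFFIX     "-resp"

/* Return 1 if the message type names a locally handled request, else 0. */
int
is_local_request(const char *type)
{
	assert(type != NULL);
	return strncmp(type, LOCAL_REQ_PREFIX, strlen(LOCAL_REQ_PREFIX)) == 0;
}

/*
 * Build the reply type for a request: the request type plus "-resp".
 * Returns 0 on success, -1 if buf is too small to hold the result.
 */
int
reply_type(char *buf, size_t buflen, const char *reqtype)
{
	int n;

	assert(buf != NULL && reqtype != NULL);
	n = snprintf(buf, buflen, "%s%s", reqtype, REPLY_SUFFIX);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With these, a made-up request type "lr-nodelist" would be recognized as local and answered with "lr-nodelist-resp"; anything for which is_local_request() is true should never be written to the cluster.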
In the case of the list of interfaces, I was planning on the return
message being a comma-separated list of interfaces - that
way every request, remote ones included, gets exactly one
reply message. It should be OK to limit the total size of
the interface names to no more than about 1K bytes
per host... ;-)
Otherwise, we need to implement guaranteed packet delivery order
which I don't want to put in the way of implementing this API.
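On the client side, walking a comma-separated interface list could look roughly like this. It is only a sketch: next_iface is a hypothetical helper, and the interface names in the usage note are examples, not anything heartbeat is known to return.

```c
#include <assert.h>
#include <string.h>

/*
 * next_iface (hypothetical helper): copy the next name from a
 * comma-separated list into buf and return a pointer to the rest of
 * the list.  Returns NULL once the list is exhausted.
 * Names longer than buflen - 1 bytes are silently truncated.
 */
const char *
next_iface(const char *list, char *buf, size_t buflen)
{
	size_t namelen;
	size_t copylen;

	assert(buf != NULL && buflen > 0);
	if (list == NULL || *list == '\0')
		return NULL;

	namelen = strcspn(list, ",");	/* length up to next comma or end */
	copylen = (namelen < buflen) ? namelen : buflen - 1;
	memcpy(buf, list, copylen);
	buf[copylen] = '\0';

	list += namelen;
	if (*list == ',')
		list++;			/* skip the separator */
	return list;
}
```

A caller would loop with `while ((p = next_iface(p, buf, sizeof(buf))) != NULL)` and use buf inside the loop; for a list such as "eth0,eth1,/dev/ttyS0" that yields the three names in order.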
We should have a version of the API in the header which goes
into each request, like this:
#define API_COMM_VERS 1
and then put api_vers (or something) into each request from the
clients. Make it a simple number, not a dotted number so that
we can easily compare less than, greater than, or equal.
Changing the meaning and format, or number of fields for a
request requires upping the number. Adding new requests
doesn't. Unimplemented requests are easily detectable.
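As a sketch of why a plain integer is convenient, the check on the heartbeat side could be as small as the following. The function name api_vers_ok is made up, and the policy of accepting any version from 1 up to our own is an assumption for illustration, not something decided above.

```c
#include <assert.h>
#include <stdlib.h>

#define API_COMM_VERS 1

/*
 * api_vers_ok (hypothetical helper): decide whether the api_vers field
 * of a client request is one we can serve.  The field must be a plain
 * decimal number; dotted strings like "1.0" are rejected.
 * Assumed policy: accept versions 1 through API_COMM_VERS.
 * Returns 1 if acceptable, 0 otherwise.
 */
int
api_vers_ok(const char *api_vers)
{
	char *end;
	long vers;

	if (api_vers == NULL || *api_vers == '\0')
		return 0;		/* field missing or empty */
	vers = strtol(api_vers, &end, 10);
	if (*end != '\0')
		return 0;		/* trailing junk, e.g. "1.0" */
	return vers >= 1 && vers <= API_COMM_VERS;
}
```

A request that fails this check would get the same kind of automatic failure reply as an unimplemented request.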
I planned on replicating all the messages to all the attached clients,
except for replies that have orig_pid in them (but see the
debugging mode below).
Debugging needs a promiscuous mode so that a process can sit and
monitor the traffic separately from whatever applications are using
the system. It might even be nice to have such a process be able
to make it "really promiscuous", and then see *all*
heartbeats from all machines - including those normally
filtered out.
David Brower made the very reasonable request to make this match
corresponding FailSafe APIs. This makes sense, but I haven't looked at
them enough yet to comment. I'm back home now, so I should be able to
do this soon.
I suspect that the big deal will be the communications protocol between
heartbeat and client, not the exact format of the APIs. So, if we have
to tweak them, or implement a failsafe compatibility layer for the APIs,
it should be pretty easy, once the comm stuff is designed and
implemented.
I guess I should get serious about checking the failsafe docs to
minimize the rework, or maybe even get better APIs... ;-)
Comments?
-- Alan Robertson
alanr@suse.com
--------------DC9CFC494943C9EF537941A6
Content-Type: text/plain; charset=us-ascii;
name="hb_api.h"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="hb_api.h"
#include <ha_msg.h>
/*
* Low-level clustering API to heartbeat.
*/
typedef void (*llc_msg_callback_t) (const struct ha_msg* msg
, void* private_data);
typedef void (*llc_nstatus_callback_t) (const char *node, const char * status
, void* private_data);
typedef void (*llc_ifstatus_callback_t) (const char *node
, const char * interface, const char * status
, void* private_data);
struct llc_ops {
/*
*************************************************************************
* Status Update Callbacks
*************************************************************************
*/
/*
* set_msg_callback: Define callback for the given message type
*
* msgtype: Type of message being handled. NULL for default case.
* Note that the default case is not reached for node
* status messages handled by nstatus_callback,
* or ifstatus messages handled by ifstatus_callback,
* in addition to those explicitly handled by
* "msg_handler" cases.
*
* callback: callback function.
*
* p: private data - later passed to callback.
*/
int (*set_msg_callback) (const char * msgtype
, llc_msg_callback_t callback, void * p);
/*
* set_nstatus_callback: Define callback for node status messages
* This is a message of type "st"
*
* cbf: callback function.
*
* p: private data - later passed to callback.
*/
int (*set_nstatus_callback) (llc_nstatus_callback_t cbf
, void * p);
/*
* set_ifstatus_callback: Define callback for interface status messages
* This is a message of type "???"
* These messages are issued whenever an interface goes
* dead or becomes active again.
*
* cbf: callback function.
*
* node: the name of the node to get the interface updates for
* If node is NULL, it will receive notification for all
* nodes.
*
* iface: The name of the interface to receive updates for. If
* iface is NULL, it will receive notification for all
* interfaces.
*
* If NULL is passed for both "node" and "iface", then "cbf" would
* be called for interface status change against any node in
* the cluster.
*
* p: private data - later passed to callback.
*/
int (*set_ifstatus_callback) (llc_ifstatus_callback_t cbf,
const char * node, const char * iface, void * p);
/*
*************************************************************************
* Getting Current Information
*************************************************************************
*/
/*
* init_nodewalk: Initialize walk through list of known nodes
*/
int (*init_nodewalk)(void);
/*
* nextnode: Return next node in the list of known nodes
*/
const char * (*nextnode)(void);
/*
* end_nodewalk: End walk through the list of known nodes
*/
int (*end_nodewalk)(void);
/*
* node_status: Return most recent heartbeat status of the given node
*/
int (*node_status)(const char * nodename);
/*
* init_ifwalk: Initialize walk through list of known interfaces on node
*/
int (*init_ifwalk)(const char * node);
/*
* nextif: Return next interface in the list of known interfaces on node
*/
const char * (*nextif)(void);
/*
* end_ifwalk: End walk through the list of known interfaces
*/
int (*end_ifwalk)(void);
/*
* if_status: Return current status of the given interface
*/
int (*if_status)(const char * nodename, const char *iface);
/*
*************************************************************************
* Intracluster messaging
*************************************************************************
*/
/*
* sendclustermsg: Send the given message to all cluster members
*/
int (*sendclustermsg)(const struct ha_msg* msg);
/*
* sendnodemsg: Send the given message to the given node in cluster.
*/
int (*sendnodemsg)(const struct ha_msg* msg
, const char * nodename);
/*
* inputfd: Return fd which can be given to select(2) or poll(2)
* for determining when messages are ready to be read.
* Only to be used in select() or poll(), please...
*/
int (*inputfd)(void);
/*
* msgready: Returns TRUE (1) when a message is ready to be read.
*/
int (*msgready)(void);
/*
* setmsgsignal: Associates the given signal with the "message waiting"
* condition.
*/
int (*setmsgsignal)(int signo);
/*
* rcvmsg: Cause the next message to be read - activating callbacks for
* processing the message.
*/
int (*rcvmsg)(int blocking);
/*
* Read the next message without any silly callbacks.
* (at least the next one not intercepted by another callback).
* NOTE: you must dispose of this message by calling ha_msg_del().
*/
struct ha_msg* (*readmsg)(int blocking);
/*
*************************************************************************
* Debugging
*************************************************************************
*
* setfmode: Set filter mode. Analogous to promiscuous mode on a
* network interface.
*
* LLC_FILTER_DEFAULT (default)
* In this mode, all messages destined for this pid
* are received, along with all that don't go to specific pids.
*
* LLC_FILTER_PMODE See all messages, but filter heart beats
* that don't tell us anything new.
*
* LLC_FILTER_ALLHB See all heartbeats, including those that
* don't change status.
*
* LLC_FILTER_RAW See all packets, from all interfaces, even
* dups. Pkts with auth errors are still ignored.
*
*/
# define LLC_FILTER_DEFAULT 0
# define LLC_FILTER_PMODE 1
/* Do we need these higher levels ? */
# define LLC_FILTER_ALLHB 2
# define LLC_FILTER_RAW 3
int (*setfmode)(int mode);
};
struct ll_cluster {
void * ll_cluster_private;
struct llc_ops* llc_ops;
};
--------------DC9CFC494943C9EF537941A6--