Mailing List Archive: Additional changes made via DHCPD review process

Dec 6, 2011, 6:59 AM

Post #1 of 5 (1390 views)

Hi Everyone,

I would like to thank Florian, Andreas and Dejan for making
suggestions and pointing out some additional changes I should make. At
this point the following additional changes have been made:

- A test case in the validation function for ocf_is_probe has been
reversed to ! ocf_is_probe, and the "test"/"[ ]" wrappers removed, to
ensure the validation does not occur if the partition is not mounted or
the action is a probe.
- An extraneous return code has been removed from the "else" clause of
the probe test, to ensure the rest of the validation can finish.
- The call to the DHCPD daemon itself during the start phase has been
wrapped with the ocf_run helper function, to ensure that it is somewhat
standardized.
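
For readers following along, the three changes above might look roughly like the sketch below. This is not the actual agent: the function names dhcpd_validate_all and dhcpd_start, the parameter names and their default paths are assumptions for illustration, and the ocf_is_probe and ocf_run stubs only approximate the real helpers, which a real agent would get by sourcing ocf-shellfuncs.

```shell
#!/bin/sh
# Sketch only. In a real agent these helpers come from ocf-shellfuncs;
# the stubs below are stand-ins so the snippet runs on its own.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_INSTALLED=5

# Stub: approximates the real helper (monitor action with interval 0).
ocf_is_probe() {
    [ "${__OCF_ACTION:-}" = "monitor" ] &&
        [ "${OCF_RESKEY_CRM_meta_interval:-0}" = "0" ]
}
# Stub: the real ocf_run also logs the command's output.
ocf_run() { "$@"; }

# Hypothetical parameters and defaults, for illustration only.
: "${OCF_RESKEY_chrooted_path=/var/lib/dhcp}"
: "${OCF_RESKEY_binary=/usr/sbin/dhcpd}"

dhcpd_validate_all() {
    # Call the helper directly (no test/[ ] wrapper) and negate it:
    # the directory check is skipped during a probe, when the
    # partition may legitimately not be mounted on this node.
    if ! ocf_is_probe; then
        if [ ! -d "$OCF_RESKEY_chrooted_path" ]; then
            return $OCF_ERR_INSTALLED
        fi
    fi
    # No early return in an else branch: the remaining checks still run.
    [ -x "$OCF_RESKEY_binary" ] || return $OCF_ERR_INSTALLED
    return $OCF_SUCCESS
}

dhcpd_start() {
    # Wrap the daemon invocation in ocf_run instead of calling it bare.
    ocf_run "$OCF_RESKEY_binary" -q || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}
```

The point of the negated call is that a probe on the passive node returns from the directory check without an error, which is what previously surfaced as "Not installed".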

The first two changes corrected the "Failed Action... Not installed"
issue on the secondary node, as well as the fail-over itself. I've been
able to fail over to secondary and primary nodes multiple times and the
service follows the rest of the grouped services.

There are a few things I'd like to add to the script, now that the main
issues/code changes have been addressed, and they are as follows:

- Add a means of copying /etc/dhcpd.conf from node1 to node2...nodeX
from within the script. The logic behind this is as follows:

1. It is possible for an admin to use a 3rd party management tool to
add/remove/update addresses in the /etc/dhcpd.conf file while the
cluster is live. There needs to be a means of detecting those updates,
and ensuring they are propagated to the remaining nodes.

2. While a user may be using drbd to handle the [chrooted_path]
partition to propagate lease information across nodes, there's no
guarantee they are using drbd to manage more than just that area. For
instance, I am not using drbd to manage the /etc/ path, but simply
/var/lib/dhcp.

The script already ensures the /etc/dhcpd.conf file is copied into the
chrooted environment, as is the standard for the current DHCPD init
scripts already used on many Linux distributions in a non-clustered
environment.

- I need to find a means to add additional monitoring to the script to
do more than simply test if the daemon is live. I've had cases where the
dhcpd daemon was live but not handing out IPs, and it would be nice to
fence the node out if I could find a way to validate that it's not
responding to IP requests, in addition to a daemon failure. The issue is
that dhcpcd, using the -T parameter, cannot run on the same Ethernet
interface (for single-NIC nodes) as the dhcpd process is running on, as
it will never get a response.

Is it possible to have another node execute this, and restrict that part
of the test to only the passive node(s) ?
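
To make the question concrete, here is a rough sketch of where a deeper check could hook into the monitor action. It does not solve the single-NIC problem described above. OCF_CHECK_LEVEL and the return codes follow the usual OCF conventions; the function name dhcpd_monitor and the DHCPD_PIDFILE and DHCPD_PROBE_CMD variables are hypothetical, and the probe command is deliberately left empty because how to run it is exactly the open question.

```shell
#!/bin/sh
# Sketch only: a two-level monitor. Names other than OCF_CHECK_LEVEL
# and the OCF return codes are hypothetical.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical: location of the daemon's PID file.
: "${DHCPD_PIDFILE=/var/run/dhcpd.pid}"
# Hypothetical: a command that exits 0 only if a DHCP request was
# answered. Empty by default; the thread does not settle how to probe.
: "${DHCPD_PROBE_CMD=}"

dhcpd_monitor() {
    # Level 0: is the process alive?
    [ -r "$DHCPD_PIDFILE" ] || return $OCF_NOT_RUNNING
    pid=$(cat "$DHCPD_PIDFILE")
    kill -0 "$pid" 2>/dev/null || return $OCF_NOT_RUNNING

    # Level 10 and above: additionally require an answered request.
    if [ "${OCF_CHECK_LEVEL:-0}" -ge 10 ] && [ -n "$DHCPD_PROBE_CMD" ]; then
        # Unquoted on purpose so the command may carry arguments.
        $DHCPD_PROBE_CMD || return $OCF_ERR_GENERIC
    fi
    return $OCF_SUCCESS
}
```

With a layout like this, the cheap liveness check still runs on every monitor, and the request/response test only runs for monitor operations configured with a higher check level.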
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

Re: Additional changes made via DHCPD review process [ In reply to ]

dejan at suse

Dec 6, 2011, 7:44 AM

Post #2 of 5 (1367 views)

Hi,

On Tue, Dec 06, 2011 at 10:59:20AM -0400, Chris Bowlby wrote:
> - Add a means of copying /etc/dhcpd.conf from node1 to node2...nodeX
> from within the script. The logic behind this is as follows:

I'd say that this is the admin's responsibility. There are tools such
as csync2 which can deal with that. Doing it from the RA is
possible, but definitely very error prone and I'd be very
reluctant to do that. Note that we have many RAs which keep
additional configuration in a file and none of them tries to keep
the copies of that configuration in sync itself.

Thanks,

Dejan


Re: Additional changes made via DHCPD review process [ In reply to ]

florian at hastexo

Dec 6, 2011, 8:07 AM

Post #3 of 5 (1362 views)

On Tue, Dec 6, 2011 at 4:44 PM, Dejan Muhamedagic <dejan@suse.de> wrote:
> Hi,
>
> On Tue, Dec 06, 2011 at 10:59:20AM -0400, Chris Bowlby wrote:
>> - Add a means of copying /etc/dhcpd.conf from node1 to node2...nodeX
>> from within the script. The logic behind this is as follows:
>
> I'd say that this is the admin's responsibility. There are tools such
> as csync2 which can deal with that. Doing it from the RA is
> possible, but definitely very error prone and I'd be very
> reluctant to do that. Note that we have many RAs which keep
> additional configuration in a file and none of them tries to keep
> the copies of that configuration in sync itself.

Seconded. Whatever configuration doesn't live _in_ the CIB proper is
not Pacemaker's job to replicate. The admin gets to either sync files
manually across the nodes (csync2 greatly simplifies this; no need to
reinvent the wheel), or put the config files on storage that's
available to all cluster nodes.
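
For sites that sync by hand rather than with csync2, the change detection can live in a small admin-side script outside the RA. The sketch below is only an illustration of that idea: the function names conf_needs_sync and conf_mark_synced and the notion of a "last synced copy" file are assumptions, not part of any existing tool.

```shell
#!/bin/sh
# Sketch only: detect that a config file differs from the copy that
# was last pushed to the peers. All names here are hypothetical.

# conf_needs_sync FILE LAST_SYNCED_COPY
# Exit status 0 if FILE should be pushed again, 1 if it is unchanged.
conf_needs_sync() {
    file=$1
    last=$2
    [ -f "$last" ] || return 0      # never synced before
    if cmp -s "$file" "$last"; then
        return 1                    # identical: nothing to do
    fi
    return 0
}

# conf_mark_synced FILE LAST_SYNCED_COPY
# Record the current contents after a successful push.
conf_mark_synced() {
    cp -p "$1" "$2"
}
```

A cron job could call conf_needs_sync on /etc/dhcpd.conf, push the file with whatever transfer method the site already uses, and then call conf_mark_synced. None of this needs to run from within the cluster stack.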

Cheers,
Florian

Re: Additional changes made via DHCPD review process [ In reply to ]

dejan at suse

Dec 9, 2011, 7:33 AM

Post #4 of 5 (1355 views)

Hi Florian,

On Fri, Dec 09, 2011 at 03:43:43PM +0100, Florian Haas wrote:
> On Fri, Dec 9, 2011 at 6:30 AM, Dejan Muhamedagic <dejan@suse.de> wrote:
> > Hi,
> >
> > On Tue, Dec 06, 2011 at 01:39:04PM -0400, Chris Bowlby wrote:
> >> Hi All,
> >>
> >> Ok, I'll look into csync2, and will concede the point on the RA syncing
> >> the configuration file that lives outside the chroot.
> >>
> >> I still need to find a means to monitor the DHCP responses however, as
> >> that will just improve the reliability of the cluster itself, as well as
> >> the service.
> >
> > I'm really not sure how to do that.
> >
> > Didn't review the agent, but on a cursory look, perhaps you could
> > provide the default for chrooted_path (/var/lib/dhcp).
> >
> > BTW, did you think of adding an ocft test case?
>
> Please, cut the new guy some slack. :) Evidently this is Chris' first

Oh, sorry if I was pushy; that was certainly not my intention. It was
just a suggestion, for it is very comforting to be able to run
the tests and be confident that the latest modification didn't
(thoroughly) break the RA.

Cheers,

Dejan

> contributed RA, and he has been enormously responsive to our
> suggestions, and has drastically improved his agent from his first
> submission. I'm sure he'll get to ocft in due course.
>
> Cheers,
> Florian

Re: Additional changes made via DHCPD review process [ In reply to ]

cbowlby at tenthpowertech

Jan 9, 2012, 10:07 AM

Post #5 of 5 (1319 views)

Hi Everyone,

I apologize for my lack of response until now; December was a bit hectic
for me. I have, however, read over your points of discussion and agree
that a reasonable default, and subsequent removal of the "required"
flag on the parameter, is most likely the preferred method for the
chrooted_path variable.

I should be able to make the tweak this week and get that pushed to my
git fork.
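
Roughly, the tweak might look like the sketch below. It is an illustration under assumptions, not the final code: the default value is the /var/lib/dhcp path mentioned earlier in this thread, the helper name dhcpd_param_metadata and the description strings are made up for the example, and the real agent's meta-data may differ.

```shell
#!/bin/sh
# Sketch only: give chrooted_path a default instead of requiring it.
# The default value is an assumption taken from this thread.
OCF_RESKEY_chrooted_path_default="/var/lib/dhcp"

# Fall back to the default when the parameter is unset or empty.
: "${OCF_RESKEY_chrooted_path:=${OCF_RESKEY_chrooted_path_default}}"

# Emit the matching meta-data fragment: required="0" plus a default.
# (Hypothetical helper; description text is illustrative.)
dhcpd_param_metadata() {
    cat <<EOF
<parameter name="chrooted_path" unique="1" required="0">
<longdesc lang="en">Path of the chroot directory used by dhcpd.</longdesc>
<shortdesc lang="en">Chroot path</shortdesc>
<content type="string" default="${OCF_RESKEY_chrooted_path_default}"/>
</parameter>
EOF
}
```

The ":=" form is used so that an empty value is treated the same as a missing one, which matches the suggestion that entering "nothing" should select the default.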

On 12/09/2011 04:00 PM, Rasto Levrinc wrote:
> On Fri, Dec 9, 2011 at 4:40 PM, Dejan Muhamedagic<dejan@suse.de> wrote:
>> Hi Chris,
>>
>> On Fri, Dec 09, 2011 at 10:33:05AM -0400, Chris Bowlby wrote:
>>> Hi Dejan,
>>>
>>> It has been recommended that required options should not have default
>>> values.
>> Definitely they cannot have defaults. Sorry, I should've been
>> more precise.
>>
>>> The initial version of the script had a default for that
>>> variable, but chrooted_path was not required. During the revisions
>>> suggested by Andreas and Florian, chrooted_path was converted into a
>>> required, and unique, variable, with no default.
> I am with Dejan on this one. People sometimes have their own ideas what
> "required" and "default" means in this case and make these arguments
> almost unusable.
>
> If /var/lib/dhcpd is what most people would have to type in, don't make it
> required and make it a default. If someone enters "nothing", the default
> value should be used. So, somewhat confusingly, even if it is a required
> parameter for your RA, it is a non-required OCF parameter. :)
>
> Rasto
>
>> I don't care much either way, it's just that I think that many of
>> our RAs just make use of the defaults coming from the standard
>> installation and used by init scripts. Of course, if it makes
>> sense, perhaps in this case it doesn't, I've never really looked
>> into dhcpd configuration. The only point is that most
>> configurations should work with as little effort as possible and
>> I guess that people would usually run a single instance of dhcpd.
