Mailing List Archive

Another problem with the heartbeat init script...
Earlier I mentioned the problem with portreserve (which was apparently
ignored?)

Now I have run into another problem. When you set LRM parameters in
/etc/sysconfig, the code assumes that the LRM will start within 20
seconds of starting heartbeat. That is not the case. If you have
initdead set to 120 (for example) then it can be 120 seconds before it
starts. If you also have autojoin any, then it will _always_ take >=
120 seconds before it starts.
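
To make the timing concrete, here is a rough sketch of that kind of bounded wait. It is not the actual init script code: wait_for, lrm_is_up, and the LRM_READY_FLAG path are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a bounded startup wait; not the real init script.

# wait_for TIMEOUT CMD [ARGS...]
# Poll CMD once per second until it succeeds or TIMEOUT seconds have passed.
# Returns 0 if CMD succeeded in time, 1 otherwise.
wait_for() {
    timeout=$1
    shift
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if "$@"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Stand-in for "is the LRM answering yet?". A real check would query lrmd;
# this one only tests for a flag file whose path is an assumption.
lrm_is_up() {
    [ -e "${LRM_READY_FLAG:-/nonexistent/lrm-ready}" ]
}
```

With a bound of 20 and initdead at 120, a call like `wait_for 20 lrm_is_up` would give up about 100 seconds before the LRM could be available, so the parameters would never be applied.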

Delaying the startup of other services on the system while we wait for
the initdead to expire is not a good idea.

I suppose I should put together a patch on these items...

--
Alan Robertson <alanr@unix.sh> - @OSSAlanR

"Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
Re: Another problem with the heartbeat init script...
On Mon, Mar 26, 2012 at 02:28:42PM -0600, Alan Robertson wrote:
> Earlier I mentioned the problem with portreserve (which was apparently
> ignored?)

No.
But I caught a cold.
And you did not send a patch, did you?

> Now I have run into another problem. When you set LRM parameters in
> /etc/sysconfig, the code assumes that the LRM will start within 20
> seconds of starting heartbeat. That is not the case.

lrmd was changed to getenv() its max-children meanwhile.

You need to cherry pick that patch, or update glue.
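
A minimal sketch of why reading the value from the environment avoids the timing issue: the setting is exported before the daemon starts, so nothing has to wait for lrmd and configure it afterwards. The variable name LRMD_MAX_CHILDREN and the function load_lrm_settings are assumptions for illustration; check your glue version for the name it actually reads.

```shell
#!/bin/sh
# Sketch only: LRMD_MAX_CHILDREN and load_lrm_settings are assumed names.

# load_lrm_settings FILE
# Source FILE if it is readable, then export the setting so that any daemon
# started afterwards inherits it and can read it with getenv().
# FILE should contain a slash; otherwise "." searches PATH for it.
load_lrm_settings() {
    if [ -r "$1" ]; then
        . "$1"
    fi
    if [ -n "${LRMD_MAX_CHILDREN:-}" ]; then
        export LRMD_MAX_CHILDREN
    fi
}
```

An init script would call something like `load_lrm_settings /etc/sysconfig/heartbeat` before starting heartbeat; the exact file name depends on the distribution.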

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________________
Re: Another problem with the heartbeat init script...
On 03/30/2012 04:58 PM, Lars Ellenberg wrote:
> On Mon, Mar 26, 2012 at 02:28:42PM -0600, Alan Robertson wrote:
>> Earlier I mentioned the problem with portreserve (which was apparently
>> ignored?)
> No.
> But I caught a cold.
Sorry to hear that :-(. Hope you're feeling better now. You have my
full sympathies - since I had bronchitis that lasted over two weeks.
> And you did not send a patch, did you?
Good point. Sorry to be a whiner... I was hoping for a little more
conversation in any case.
>> Now I have run into another problem. When you set LRM parameters in
>> /etc/sysconfig, the code assumes that the LRM will start within 20
>> seconds of starting heartbeat. That is not the case.
>
> lrmd was changed to getenv() its max-children meanwhile.
>
> You need to cherry pick that patch, or update glue.
Good to hear that's changed. I put a patch of my own into my local copy
- just extending the loop and making it not print those annoying '.'s or
delay startup while waiting. So, I'm good locally - and there's no need
for a patch for the future.
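
Roughly, a change of that kind could look like the sketch below. It is not the actual local patch: apply_when_ready and its arguments are hypothetical, and the check and apply commands are passed in as strings rather than hard-coded.

```shell
#!/bin/sh
# Sketch only: apply_when_ready is an illustrative name, not the real patch.

# apply_when_ready TIMEOUT CHECK_CMD APPLY_CMD
# In a background subshell, poll CHECK_CMD once per second without printing
# progress dots, and run APPLY_CMD as soon as CHECK_CMD succeeds.
# Gives up silently after TIMEOUT seconds. The function itself returns
# immediately, so the caller (and the rest of system startup) is not delayed.
apply_when_ready() {
    (
        elapsed=0
        while [ "$elapsed" -lt "$1" ]; do
            if eval "$2" >/dev/null 2>&1; then
                eval "$3"
                exit $?
            fi
            sleep 1
            elapsed=$((elapsed + 1))
        done
        exit 1
    ) &
}
```

The timeout would need to be comfortably larger than initdead, for example 180 when initdead is 120.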

I guess I need to make a workspace so I can submit patches properly (as
you noted above).

On the other hand, the good news is that by raising the max-children
limit to 16 and switching from a group to explicit dependencies, the
failover time was cut from about 60 seconds to about 18 seconds - so I'm
happy.


--
Alan Robertson <alanr@unix.sh> - @OSSAlanR

"Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce