>>> Lars Marowsky-Bree <lmb@suse.com> wrote on 05.02.2014 at 12:36 in message
<20140205113649.GN13514@suse.de>:
> On 2014-02-05T12:24:00, Ulrich Windl <Ulrich.Windl@rz.uni-regensburg.de> wrote:
>
>> I had a problem where "O2CB stop" fenced the node that was shut down:
>> I had updated the kernel, and then rebooted. As part of shutdown, the
>> cluster stack was stopped. In turn, the "O2CB" resource was stopped.
>> Unfortunately this caused an error like (SLES11 SP3):
>>
>> ---
>> modprobe: FATAL: Could not load /lib/modules/3.0.101-0.8-xen/modules.dep: No such file or directory
>> o2cb(prm_O2CB)[19908]: ERROR: Unable to unload module: ocfs2
>> ---
>>
>> This in turn caused a node fence, which ruined the clean reboot.
>>
>> So why is the RA messing with the kernel module on stop?
>
> Because customers complained about the new module not being picked up if
> they upgrade ocfs2-kmp and restarted the cluster stack on a node. It's
> incredibly hard to please everyone, alas ...
I think the proper way would be this: stop your OCFS2 resources, rmmod the
module, [modprobe the module to re-insert the new version], then start your
OCFS2 resources again.
I guess a full kernel update is more common than an update of just ocfs2-kmp.
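
To make the order of that manual procedure explicit, here is a minimal sketch in
Python. It only builds the list of commands and executes nothing. The function
name and the resource IDs are hypothetical, I am assuming the resources are
managed through the crm shell ("crm resource stop/start"), and I am assuming
that unloading only the "ocfs2" module is enough, which may not hold if
dependent modules are loaded as well.

```python
from typing import List


def ocfs2_module_reload_plan(
    resources: List[str],
    module: str = "ocfs2",
    reinsert: bool = True,
) -> List[List[str]]:
    """Return the commands for a manual module swap, in order.

    `resources` is given in start order; they are stopped in reverse.
    Nothing is executed here, the caller decides what to do with the plan.
    """
    # 1. Stop the OCFS2 resources (reverse of start order).
    plan = [["crm", "resource", "stop", r] for r in reversed(resources)]
    # 2. Remove the old module.
    plan.append(["rmmod", module])
    # 3. Optionally re-insert the module so the new version is picked up.
    if reinsert:
        plan.append(["modprobe", module])
    # 4. Start the OCFS2 resources again.
    plan += [["crm", "resource", "start", r] for r in resources]
    return plan


if __name__ == "__main__":
    # "cln_ocfs2_fs" is a made-up resource ID, used for illustration only.
    for cmd in ocfs2_module_reload_plan(["cln_ocfs2_fs"]):
        print(" ".join(cmd))
```

The point of keeping this as an explicit admin action is that a plain "stop" of
the cluster stack would then not have to touch the module at all.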
>
> The right way to update a cluster node is anyway this one:
>
> 1. Stop the cluster stack
> 2. Update/upgrade/reboot as needed
> 3. Restart the cluster stack
>
> This would avoid this error too. Or keeping multiple kernel versions in
> parallel (which also helps if a kernel update no longer boots for some
> reason). Removing the running kernel package is usually not a great
> idea; I prefer to remove them after having successfully rebooted only,
> because you *never* know if you may have to reload a module.
There's another way (as HP-UX learned to do it): defer changes to the
running kernel until shutdown/reboot.
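
Short of that, a stop action could at least check whether the module tree of
the running kernel is still present before it tries to unload anything. A
minimal sketch in Python, based only on the modules.dep path from the error
message above. The function names are hypothetical and this is not how the
o2cb RA works today; whether "skip the unload" is the right reaction is a
separate question.

```python
import os
import platform
from typing import Optional


def modules_dep_path(release: Optional[str] = None, root: str = "/lib/modules") -> str:
    """Path of modules.dep for the given (default: running) kernel release."""
    release = release or platform.release()
    return os.path.join(root, release, "modules.dep")


def can_manage_modules(release: Optional[str] = None, root: str = "/lib/modules") -> bool:
    """True if modules.dep for that kernel release still exists on disk.

    If the running kernel's package was removed during an update, this file
    is gone and modprobe fails with the FATAL message quoted above.
    """
    return os.path.isfile(modules_dep_path(release, root))


if __name__ == "__main__":
    if can_manage_modules():
        print("module tree for running kernel present")
    else:
        print("module tree for running kernel missing, skip module unload")
```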
>
>
> Regards,
> Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems