Mailing List Archive

Strange UEFI boot behaviour
Hello list,

One of my machines uses bootctl to offer a choice of kernel to boot (I don't
use anything else from systemd); it has these files in /boot/loader/entries:

08-gentoo-5.15.32-r1-rescue.conf
09-gentoo-5.15.32-r1-rescue.nonet.conf
30-gentoo-5.18.10.conf
32-gentoo-5.18.10.nox.conf
34-gentoo-5.18.10.nonet.conf
40-gentoo-5.15.41.conf
42-gentoo-5.15.41.nox.conf
44-gentoo-5.15.41.nonet.conf

Until a few days ago, the system offered the kernels cited in those .conf files
in the same order as I've listed them, which is also ascending numerical
order, as expected.

Now, though, they're offered in precisely the opposite order (with the two
other usual options below them as before: Windows and Enter UEFI setup).

What might have caused this reversal?
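
To make the two orderings concrete, here is a minimal Python sketch (an illustration, not something from the original post). It sorts the entry file names listed above in ascending order and then reverses them. The function name menu_orders is made up for this example, and the sketch does not claim to reproduce how the boot menu itself decides its order.

```python
# Entry file names as listed in /boot/loader/entries in the post above.
ENTRIES = [
    "08-gentoo-5.15.32-r1-rescue.conf",
    "09-gentoo-5.15.32-r1-rescue.nonet.conf",
    "30-gentoo-5.18.10.conf",
    "32-gentoo-5.18.10.nox.conf",
    "34-gentoo-5.18.10.nonet.conf",
    "40-gentoo-5.15.41.conf",
    "42-gentoo-5.15.41.nox.conf",
    "44-gentoo-5.15.41.nonet.conf",
]


def menu_orders(names):
    """Return (ascending, descending) orderings of the entry file names.

    Hypothetical helper: plain lexicographic sorting, which matches
    numerical order here only because every prefix has two digits.
    """
    ascending = sorted(names)
    descending = list(reversed(ascending))
    return ascending, descending


if __name__ == "__main__":
    expected, observed = menu_orders(ENTRIES)
    print("previously shown first:", expected[0])
    print("now shown first:       ", observed[0])
```

The ascending list is the order the menu used to show; the descending list is the order reported now.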

$ cat /boot/loader/entries/30*f
title Gentoo 5.18.10
version 5.18.10-gentoo
linux vmlinuz-5.18.10-gentoo
initrd intel-uc.img
options root=/dev/nvme0n1p5 net.ifnames=0 raid=noautodetect pcie_aspm=off

$ cat /boot/loader/loader.conf
timeout 5
default 30-gentoo-5.18.10

$ ls /boot/vmlinuz-5.18.10-gentoo
/boot/vmlinuz-5.18.10-gentoo

$ efibootmgr
BootCurrent: 0001
Timeout: 1 seconds
BootOrder: 0001,0007,0011,0008,0000
Boot0000* Windows Boot Manager
Boot0001* Gentoo Linux
Boot0007* UEFI OS
Boot0008* Hard Drive
Boot0011* CD/DVD Drive
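
For readers following along, the efibootmgr listing can be read as "try these firmware entries in this order". The Python sketch below is an illustration only, with a made-up function name boot_order_labels. It parses text shaped like the output above and resolves BootOrder into labels. Note that these are the firmware's own boot entries; the kernel choices in the menu come from the files in /boot/loader/entries, not from this list.

```python
import re

# Sample text copied from the efibootmgr output in the post above.
SAMPLE = """\
BootCurrent: 0001
Timeout: 1 seconds
BootOrder: 0001,0007,0011,0008,0000
Boot0000* Windows Boot Manager
Boot0001* Gentoo Linux
Boot0007* UEFI OS
Boot0008* Hard Drive
Boot0011* CD/DVD Drive
"""


def boot_order_labels(text):
    """Return entry labels in the order given by the BootOrder line.

    Assumes the simple layout shown above; real output may append a
    device path after the label, which this sketch would keep as part
    of the label.
    """
    labels = {}
    order = []
    for line in text.splitlines():
        if line.startswith("BootOrder:"):
            order = [n.strip() for n in line.split(":", 1)[1].split(",")]
            continue
        # Entry lines look like "Boot0001* Gentoo Linux"; the asterisk
        # marks an active entry and may be absent.
        match = re.match(r"Boot([0-9A-Fa-f]{4})\*?\s+(.*)", line)
        if match:
            labels[match.group(1)] = match.group(2).strip()
    return [labels.get(number, "<missing>") for number in order]


if __name__ == "__main__":
    for position, label in enumerate(boot_order_labels(SAMPLE), start=1):
        print(position, label)
```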

--
Regards,
Peter.
Re: Strange UEFI boot behaviour
On Sunday, 10 July 2022 15:19:08 BST Peter Humphrey wrote:
> Hello list,
>
> One of my machines uses bootctl to offer a choice of kernel to boot (I don't
> use anything else from systemd); it has these files in
> /boot/loader/entries:
>
> 08-gentoo-5.15.32-r1-rescue.conf
> 09-gentoo-5.15.32-r1-rescue.nonet.conf
> 30-gentoo-5.18.10.conf
> 32-gentoo-5.18.10.nox.conf
> 34-gentoo-5.18.10.nonet.conf
> 40-gentoo-5.15.41.conf
> 42-gentoo-5.15.41.nox.conf
> 44-gentoo-5.15.41.nonet.conf
>
> Until a few days ago, the system offered the kernels cited in those .conf
> files in the same order as I've listed them, which is also ascending
> numerical order, as expected.
>
> Now, though, they're offered in precisely the opposite order (with the two
> other usual options below them as before: Windows and Enter UEFI setup).
>
> What might have caused this reversal?
>
> $ cat /boot/loader/entries/30*f
> title Gentoo 5.18.10
> version 5.18.10-gentoo
> linux vmlinuz-5.18.10-gentoo
> initrd intel-uc.img
> options root=/dev/nvme0n1p5 net.ifnames=0 raid=noautodetect pcie_aspm=off
>
> $ cat /boot/loader/loader.conf
> timeout 5
> default 30-gentoo-5.18.10
>
> $ ls /boot/vmlinuz-5.18.10-gentoo
> /boot/vmlinuz-5.18.10-gentoo
>
> $ efibootmgr
> BootCurrent: 0001
> Timeout: 1 seconds
> BootOrder: 0001,0007,0011,0008,0000
> Boot0000* Windows Boot Manager
> Boot0001* Gentoo Linux
> Boot0007* UEFI OS
> Boot0008* Hard Drive
> Boot0011* CD/DVD Drive

This can happen if the EFI firmware has, for some reason, re-scanned the
attached block devices to find bootable UEFI images. I've seen something as
simple as rebooting with, then without, a bootable USB drive cause this.
Since the image boot order is editable, in your case via bootctl, it
should be a fixable problem.
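
If it were the firmware's own BootOrder that had changed, it could be rewritten with efibootmgr's --bootorder option. The Python sketch below is a hedged illustration: the helper names promote_entry and bootorder_command are invented here, and the command is only built, never run. Whether the firmware order has anything to do with the order of the kernel menu is not established in this thread.

```python
def promote_entry(boot_order, entry):
    """Return a BootOrder string with `entry` moved to the front.

    `boot_order` is a comma-separated list such as the BootOrder value
    printed by efibootmgr; `entry` is one four-digit number from it.
    """
    numbers = [n.strip() for n in boot_order.split(",") if n.strip()]
    if entry not in numbers:
        raise ValueError(f"{entry} is not in the boot order")
    reordered = [entry] + [n for n in numbers if n != entry]
    return ",".join(reordered)


def bootorder_command(boot_order):
    """Build (but do not execute) an efibootmgr call that sets the order."""
    return ["efibootmgr", "--bootorder", boot_order]


if __name__ == "__main__":
    # BootOrder value taken from the original post; 0000 is Windows there.
    new_order = promote_entry("0001,0007,0011,0008,0000", "0000")
    print(" ".join(bootorder_command(new_order)))
```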
Re: Strange UEFI boot behaviour
On Sunday, 10 July 2022 17:37:00 BST Michael wrote:

> This can happen if the EFI firmware has, for some reason, re-scanned the
> attached block devices to find bootable UEFI images. I've seen something as
> simple as rebooting with, then without, a bootable USB drive cause this.
> Since the image boot order is editable, in your case via bootctl, it
> should be a fixable problem.

But, as I said, the order is unchanged, yet the BIOS displays them in reverse
order. I think the BIOS is not long for this world, as you will see...

This machine shows bizarre behaviour in booting as well. Often, as soon as the
POST is finished and the BIOS asks which kernel image to hand over to, I have
no keyboard or mouse - except for CTRL-ALT-DEL, which does reboot.

The thing that got me exercised today was Gentoo complaining that it couldn't
mount /boot - wrong FS type or...etc. So I had to do something. So today
(well, yesterday now) I told the BIOS to load its standard optimised defaults,
then rebooted, then told it to load my tuned set and rebooted again. Then I
booted a SystemRescueCD (because the USB version showed that same
no-keyboard problem), formatted /boot with FAT32, zapped / then recovered a
week-old backup. Then, still in RescCD, a sync and world-update brought the
system back.

Even then, running bootctl remove; bootctl install; replace
/boot/loader/loader.conf; bootctl update - still left no UEFI boot option for
the Gentoo system, though it usually does create one. I had to use efibootmgr
to create a boot option, then do the bootctl dance again.
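
The post does not show the efibootmgr invocation that was used. As a hedged sketch only, the Python below builds (without running) one plausible command line. The disk /dev/nvme0n1 is inferred from the root= option earlier in the thread, the partition number 2 is purely hypothetical, and the loader path is where systemd-boot is normally installed on an x86-64 ESP. The helper name efibootmgr_create_args is made up for this example.

```python
import shlex

# Usual location of systemd-boot on an x86-64 ESP (an assumption here;
# the thread does not state the path).
DEFAULT_LOADER = r"\EFI\systemd\systemd-bootx64.efi"


def efibootmgr_create_args(disk, part, label="Gentoo Linux",
                           loader=DEFAULT_LOADER):
    """Build the argument list for creating a firmware boot entry.

    Nothing is executed; the caller decides whether to run it.
    """
    return [
        "efibootmgr",
        "--create",
        "--disk", disk,        # whole disk, not a partition device
        "--part", str(part),   # partition number of the ESP
        "--label", label,
        "--loader", loader,
    ]


if __name__ == "__main__":
    # Disk and partition number are illustrative, not taken from the post.
    args = efibootmgr_create_args("/dev/nvme0n1", 2)
    print(shlex.join(args))
```

Printing the command first, rather than executing it, leaves room to check the disk and partition against the real layout before touching NVRAM.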

Finally, a bootable, running system.

Oh, one other thing. This machine has a small unformatted partition before
/boot, and gparted on the rescue CD showed me that it had lost its bios_grub
flag. Could that account for the wrong FS type error?

Should I consider re-flashing the BIOS? It's getting on for 10 years old. I did
that to another machine once, thereby killing it stone dead.

--
Regards,
Peter.
Re: Strange UEFI boot behaviour
On 11/07/2022 01:25, Peter Humphrey wrote:
> Should I consider re-flashing the BIOS? It's getting on for 10 years old. I did
> that to another machine once, thereby killing it stone dead.

I've flashed BIOSes and done stuff like that. Yes, it's scary, knowing
you can kill the machine. No, if you're careful it's just fine.

It's when you try and install XP, and THAT kills the bios, that you
start really worrying ...

Cheers,
Wol
Re: Strange UEFI boot behaviour
On Monday, 11 July 2022 01:25:00 BST Peter Humphrey wrote:
> On Sunday, 10 July 2022 17:37:00 BST Michael wrote:
> > This can happen if the EFI firmware has, for some reason, re-scanned the
> > attached block devices to find bootable UEFI images. I've seen something
> > as simple as rebooting with, then without, a bootable USB drive cause
> > this. Since the image boot order is editable, in your case via bootctl,
> > it should be a fixable problem.
>
> But, as I said, the order is unchanged, yet the BIOS displays them in
> reverse order. I think the BIOS is not long for this world, as you will
> see...
>
> This machine shows bizarre behaviour in booting as well. Often, as soon as
> the POST is finished and the BIOS asks which kernel image to hand over to,
> I have no keyboard or mouse - except for CTRL-ALT-DEL, which does reboot.

Check your BIOS* firmware settings for USB and enable xHCI. Perhaps this
setting was toggled to auto, which may not work reliably.

* I use the word "BIOS" to describe the UEFI firmware menu on a modern MoBo,
rather than the legacy CMOS stored BIOS.


> The thing that got me exercised today was Gentoo complaining that it
> couldn't mount /boot - wrong FS type or...etc. So I had to do something.

You could have checked with fsck.fat to see what the /boot/EFI partition
reported, but since you reformatted the ESP it's all new and in working order
now.
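
fsck.fat is the proper tool for this. Purely to illustrate what a "wrong FS type" complaint is about, the Python sketch below checks two well-known markers in a FAT32 boot sector: the 0x55 0xAA signature at bytes 510-511 and the "FAT32   " type string at byte 82. The type string is informational only, so this is a rough sanity check under those assumptions, not a replacement for fsck.fat. The function names are invented for the example, and it works on bytes already in memory rather than on a real device.

```python
def looks_like_fat32(boot_sector):
    """Rough check that a 512-byte boot sector resembles FAT32."""
    if len(boot_sector) < 512:
        return False
    has_signature = boot_sector[510:512] == b"\x55\xaa"
    # Informational file-system type field used by FAT32 boot sectors.
    has_type_string = boot_sector[82:90] == b"FAT32   "
    return has_signature and has_type_string


def make_fake_fat32_sector():
    """Build a synthetic sector with only the two markers filled in."""
    sector = bytearray(512)
    sector[82:90] = b"FAT32   "
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    print(looks_like_fat32(make_fake_fat32_sector()))  # synthetic FAT32
    print(looks_like_fat32(bytes(512)))                # all zeros
```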


> So
> today (well, yesterday now) I told the BIOS to load its standard optimised
> defaults, then rebooted, then told it to load my tuned set and rebooted
> again. Then I booted a SystemRescueCD (because the USB version showed that
> same no-keyboard problem), formatted /boot with FAT32, zapped / then
> recovered a week-old backup. Then, still in RescCD, a sync and
> world-update brought the system back.
>
> Even then, running bootctl remove; bootctl install; replace
> /boot/loader/loader.conf; bootctl update - still left no UEFI boot option
> for the Gentoo system, though it usually does create one. I had to use
> efibootmgr to create a boot option, then do the bootctl dance again.
>
> Finally, a bootable, running system.
>
> Oh, one other thing. This machine has a small unformatted partition before
> /boot, and gparted on the rescue CD showed me that it had lost its bios_grub
> flag. Could that account for the wrong FS type error?

Yes, it is probable you mixed up legacy BIOS (CSM) and UEFI booting. You need
to make sure that when you boot with live media you boot in UEFI mode.
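
A quick way to tell which mode a running Linux system was booted in is to look for the /sys/firmware/efi directory, which the kernel exposes only after a UEFI boot. A minimal Python sketch follows; the function name booted_in_uefi_mode and its path parameter are just for illustration.

```python
import os


def booted_in_uefi_mode(efi_dir="/sys/firmware/efi"):
    """Return True if the EFI directory exists under sysfs.

    The path is a parameter only so the check can be exercised on a
    machine that was not booted via UEFI.
    """
    return os.path.isdir(efi_dir)


if __name__ == "__main__":
    mode = "UEFI" if booted_in_uefi_mode() else "legacy BIOS (CSM)"
    print("This system appears to have booted in", mode, "mode")
```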

The EFI firmware can be set up to emulate a legacy BIOS configuration by
enabling its Compatibility Support Module (CSM). This setting allows legacy
OSs to boot with a conventional MBR boot loader from a GPT disk. The problem
which arises on a GPT formatted disk is where to store GRUB's 2nd Stage image.
Normally, on a disk with an MBR partition table, the space immediately after
the MBR in sector 0 contains GRUB's 2nd Stage image. On a GPT disk the sectors
immediately after sector 0 are used to store the GPT header and partition
table, and therefore GRUB's 2nd Stage image has to be stored somewhere else:
in the partition marked bios_grub.
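
The layout difference can be sketched in Python. The example below is an illustration under stated assumptions: 512-byte sectors and synthetic data instead of a real disk. It looks at the first two sectors and reports "gpt" when sector 0 holds a protective MBR (partition type 0xEE) and sector 1 starts with the "EFI PART" signature. The function names are invented for this example.

```python
SECTOR = 512  # assumed logical sector size


def describe_partition_table(first_two_sectors):
    """Classify the start of a disk as 'gpt', 'mbr' or 'unknown'."""
    if len(first_two_sectors) < 2 * SECTOR:
        raise ValueError("need at least two sectors")
    mbr = first_two_sectors[:SECTOR]
    lba1 = first_two_sectors[SECTOR:2 * SECTOR]
    has_mbr_signature = mbr[510:512] == b"\x55\xaa"
    # First MBR partition entry starts at byte 446; its type is byte 4.
    is_protective = mbr[446 + 4] == 0xEE
    has_gpt_header = lba1[:8] == b"EFI PART"
    if has_gpt_header and is_protective:
        return "gpt"
    if has_mbr_signature:
        return "mbr"
    return "unknown"


def make_sample(kind):
    """Build synthetic sectors for the two layouts (illustration only)."""
    data = bytearray(2 * SECTOR)
    data[510:512] = b"\x55\xaa"
    if kind == "gpt":
        data[446 + 4] = 0xEE
        data[SECTOR:SECTOR + 8] = b"EFI PART"
    else:
        data[446 + 4] = 0x83  # a typical Linux partition type
    return bytes(data)


if __name__ == "__main__":
    print(describe_partition_table(make_sample("gpt")))
    print(describe_partition_table(make_sample("mbr")))
```

On the "gpt" sample the sector right after the MBR is taken by the GPT header, which is why a legacy-mode GRUB needs the separate bios_grub partition.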

An EFI MoBo which boots an OS installed in UEFI mode, on a GPT formatted disk,
does not require a CSM or a bios_grub flagged partition. I assume you've
installed your OSs in UEFI mode and you do not intend to run WinXP on bare
metal. In this case, disable CSM.


> Should I consider re-flashing the BIOS? It's getting on for 10 years old. I
> did that to another machine once, thereby killing it stone dead.

As you attest, some folk have had bad experiences with flashing new firmware on
their MoBos. I first check if the new firmware is meant to address any issues
which affect my OS and peripherals and, if it does, I go ahead and flash it.
If the release offers fixes irrelevant to my kit and OS, I leave it alone. I
have not yet had a single MoBo fail on me, even after multiple flash
operations. As long as the flash operation is not interrupted and the image is
the correct image for the hardware, I would think the flash operation should
complete successfully without having to JTAG the chipset. On a 10-year-old
MoBo I would consider replacing the NVRAM battery prior to (re)flashing.
Re: Strange UEFI boot behaviour
On Monday, 11 July 2022 17:19:50 BST Michael wrote:

> Check your BIOS* firmware settings for USB and enable xHCI. Perhaps this
> setting was toggled to auto, which may not work reliably.

That was a good idea, Michael. Indeed, some Auto settings had crept in. I
haven't yet tried booting other media though: that's for later today.

--->8

> An EFI MoBo which boots an OS installed in UEFI mode, on a GPT formatted
> disk, does not require a CSM or a bios_grub flagged partition.

I didn't know that, but CSM was disabled anyway.

--->8

> On a 10 year old MoBo I would consider replacing the
> NVRAM battery prior to (re)flashing.

That's a good idea - thanks.

--
Regards,
Peter.
Re: Strange UEFI boot behaviour
On Tuesday, 12 July 2022 10:02:31 BST Peter Humphrey wrote:
> On Monday, 11 July 2022 17:19:50 BST Michael wrote:
> > Check your BIOS* firmware settings for USB and enable xHCI. Perhaps this
> > setting was toggled to auto, which may not work reliably.
>
> That was a good idea, Michael. Indeed, some Auto settings had crept in. I
> haven't yet tried booting other media though: that's for later today.

...and I can now boot a USB SysRescCD. Thanks again.


--
Regards,
Peter.