Mailing List Archive

Weird harddisk problem: AHCI disks sometimes not found
Hi there,

I have a weird harddisk detection problem which raises the question: what does
the gentoo-kernel do differently from the Ubuntu kernel?

The system in question has two identical SSDs (Kingston SV300S3 60GB) and two
identical HDDs (older Maxtor7V300F0 300GB), all connected to SATA/AHCI ports;
the HDDs are combined into an LVM RAID1 volume. The SATA controller is an onboard
SB7x on an Asus M3A78 mainboard in AHCI mode.

Only one of the two SSDs is attached to the system at any time; the other
one is disconnected. One contains a Gentoo installation (just updated
yesterday), the other an Ubuntu 20.04 LTS. This allows dual-boot by
switching connection cables.

When I connect the Gentoo SSD and boot it, the BIOS finds all HDDs and the SSD
and starts booting, but Gentoo does not recognize at least one of the HDDs
(/dev/sdc is missing, and dmesg shows link down on the SATA interface). Going
back to the BIOS shows that even the BIOS does not recognize the disk anymore.
A full power cycle is needed (pressing the reset button is not sufficient) to
make the BIOS recognize the disks again.

Doing the same with the Ubuntu disk works absolutely fine: all HDDs are
recognized and the RAID works, and not a single time has one of the
disks gone unrecognized.

Without the Ubuntu observation I'd say it's a hardware problem and the old HDDs
are simply beyond their age, but why do they work in Ubuntu and not in
Gentoo? And what is the kernel doing to the BIOS/harddisk that even the BIOS
does not find the disk anymore? I need a full power cycle to make the BIOS find
it again. This indicates a Gentoo kernel problem, and I have no idea where to
start looking; AFAIK there's not much to configure for a SATA/AHCI drive.

Any ideas?

Thanks
Alex

PS:
sys-kernel/gentoo-kernel-5.4.97, default configuration
Hardware:
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780/RS880 PCI to PCI bridge (int gfx)
00:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780 PCI to PCI bridge (PCIE port 2)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 3a)
00:14.1 IDE interface: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller
00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RS780 [Radeon HD 3200]
01:05.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] RS780 HDMI Audio [Radeon 3000/3100 / HD 3200/3300]
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 02)
Re: Weird harddisk problem: AHCI disks sometimes not found
On Thu, Mar 11, 2021 at 12:39 PM Alexander Puchmayr <
alexander.puchmayr@linznet.at> wrote:
>
> Hi there,
<SNIP>
> Any ideas?
>

I'm going to assume that you built your Gentoo kernel and have the config
file.

Ubuntu ships the config file along with whatever kernel you are running,
which you can view with

less /boot/config-$(uname -r)

Ubuntu tends to ship nearly everything as a module, whereas in your Gentoo
kernel you may be building things directly into the kernel. You should be
able to do a diff on the two config files as a starting point, assuming you
are using the same kernel version.

lsmod should give you an idea what modules are loaded for each kernel.

HTH,
Mark
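
A minimal sketch of the config comparison suggested above, assuming both
config files have been copied onto one machine. The function name
`config_diff` is made up for illustration; only grep, sort and diff are used.

```shell
#!/bin/sh
# Print the options that differ between two kernel config files.
# Only lines that set an option (CONFIG_FOO=...) or explicitly unset one
# ("# CONFIG_FOO is not set") are compared; sorting makes ordering irrelevant.
config_diff() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    pattern='^(CONFIG_[A-Za-z0-9_]+=|# CONFIG_[A-Za-z0-9_]+ is not set)'
    grep -E "$pattern" "$1" | sort > "$a"
    grep -E "$pattern" "$2" | sort > "$b"
    # Lines starting with "<" come from the first file, ">" from the second.
    diff "$a" "$b" | grep -E '^[<>]'
    rm -f "$a" "$b"
}
```

With hypothetical file names, `config_diff ubuntu.config gentoo.config | grep -E 'SATA|AHCI'`
would narrow the output down to the disk-controller options. The result is
only a starting point: it will also list every driver that one side builds
as a module and the other builds in.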
Re: Weird harddisk problem: AHCI disks sometimes not found
On 3/11/21 12:39 PM, Alexander Puchmayr wrote:
> Hi there,

Hi,

> I have a weird harddisk detection problem which raises the question:
> what does the gentoo-kernel do differently from the Ubuntu kernel?

Probably multiple things. They probably have configurations that are at
least slightly different. I wouldn't be surprised if there are slightly
different levels of patching too.

My understanding is that gentoo-kernel differs slightly from a vanilla
kernel source.

> Without the Ubuntu observation I'd say it's a hardware problem

I'd still be inclined to question hardware. But I agree that difference
in behavior based on different software is suspicious. I wonder if the
Gentoo kernel is tickling a bug in the drive's firmware.

> and the old HDDs are simply beyond their age, but why are they working
> in ubuntu and not in gentoo?

I don't think that older drives would fail in the way that you are
describing.

> And what is it doing with BIOS/Harddisk that even Bios does not find
> it anymore?

That sounds to me like the drive itself is misbehaving and not
responding the way the BIOS expects.

> I need a full powercycle to make bios find it again.

That really sounds like the drive is having a problem. Or that the
Gentoo kernel is inducing the drive into a state that is a problem.

What happens if you unplug power and data cables from the drive and then
reconnect them? Does the BIOS then see the drive?

I'm wondering if it's the drive and / or controller that's getting wedged.

> This indicates a gentoo kernel problem, and I have no idea where
> to start looking, and AFAIK there's nothing much to configure a
> SATA/AHCI drive.

As Mark indicated, you should be able to compare kernel configs.

I don't remember hearing about such a bug. I wonder if the Gentoo
kernel is trying to do something slightly different and tickling a
subtle bug that is causing the drive and / or controller to lock up.

I'd think that it would be easy to remove power and data cables from the
drive while the computer is powered on to see if that also revives the
drive.

> Any ideas?

Not really. Just threads to chase.



--
Grant. . . .
unix || die
Re: Weird harddisk problem: AHCI disks sometimes not found
On 11/03/2021 19:39, Alexander Puchmayr wrote:
> Only one of the two SSDs is attached at the same time to the system, the other
> one is disconnected. One contains a gentoo installation (just updated
> yesterday), the other one an Ubuntu LTS 20.04. This allows dual-boot by
> switching connection cables.

By switching cables. Is that moving the cables from one drive to the
other? Or by disconnecting one drive from the mobo, and plugging in the
other? Or what?

A pretty recent mobo I've got says that certain ports are incompatible,
so for example if I plug in a video card, certain sata ports disappear,
or if I use NVMe, something else goes ...

Could it be that you have a collision like that, if your two SSDs don't end
up plugged into the exact same SATA port (or whatever it is)?

Cheers,
Wol
Re: Weird harddisk problem: AHCI disks sometimes not found
On Thu, Mar 11, 2021 at 12:39 PM Alexander Puchmayr <
alexander.puchmayr@linznet.at> wrote:
>
> Hi there,
<SNIP>
> Any ideas?
>

One other point that I'd make on this subject is that even if you had the
same kernel config file, there could be differences in the tool chain that
are causing your problem. I suspect researching that sort of cause would use
huge amounts of time and likely never lead to a real understanding.

You can, of course, build your own kernel on Ubuntu using the Gentoo source
code & config file and see whether your new Ubuntu kernel shows the problem
or acts like the provided kernel. That might actually produce more forward
progress should you not find a simpler solution. At least that would
presumably produce systems with the same things built in vs. modules.

I would probably build vanilla-sources on the Gentoo side to see if it's
Gentoo patching.

Good luck,
Mark
Re: Weird harddisk problem: AHCI disks sometimes not found
Hi there,

Thanks for all suggestions and answers so far.

I'm pretty sure it is not a hardware problem, because
* Exchanging SATA cables does not affect the problem
* Using different SATA slots on the mainboard does not affect the problem
* Using different SATA power connectors does not affect the problem

I continued to experiment with different kernel versions and configs:
* Ubuntu-5.4.0-48-generic: WORKS
* sys-kernel/gentoo-sources-5.4.60 [self-compiled and configured for a similar
machine some time ago]: WORKS
* sys-kernel/gentoo-kernel-5.4.97 [default config]: FAILS
* sys-kernel/gentoo-kernel-bin-5.4.97: FAILS
* sys-kernel/vanilla-sources-5.4.102 [same config as with 5.4.60]: WORKS
* sys-kernel/gentoo-kernel-5.10.20 [default config]: FAILS
* sys-kernel/gentoo-sources-5.10.20 [same config as with 5.4.60]: WORKS

The common thing seems to be that my self-configured kernels work and the
default dist-kernels fail. I checked the differences in the configs
(/usr/src/linux/.config) related to SATA or AHCI, and one candidate was
CONFIG_SATA_MOBILE_LPM_POLICY, which was set to 3 (medium power save) in the
dist-kernel's config and 0 (keep settings from firmware) in my self-compiled
kernels.

SOLUTION:
Adding CONFIG_SATA_MOBILE_LPM_POLICY=0 to a file in /etc/kernel/config.d and
recompiling the gentoo-kernel actually solved the problem.
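
A sketch of that change as shell commands. The fragment file name
`sata-lpm.config` and the function name `write_lpm_fragment` are
hypothetical; the assumption here is that the dist-kernel build picks up
files ending in `.config` from that directory.

```shell
#!/bin/sh
# Write a one-line config fragment overriding the dist-kernel default.
# $1: directory for fragments (on the real system: /etc/kernel/config.d)
# The file name "sata-lpm.config" is a hypothetical choice.
write_lpm_fragment() {
    dir=$1
    mkdir -p "$dir" || return 1
    printf '%s\n' 'CONFIG_SATA_MOBILE_LPM_POLICY=0' > "$dir/sata-lpm.config"
}

# On the real system (as root), followed by a rebuild of the kernel package:
#   write_lpm_fragment /etc/kernel/config.d
#   emerge --ask --oneshot sys-kernel/gentoo-kernel
```

After the rebuild, grepping the installed kernel's config for
CONFIG_SATA_MOBILE_LPM_POLICY should confirm that the override was applied.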

I assume the reason is an incompatibility between the link power management
mode (mode 3) and the drives, which makes the link appear to be down.
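
To see which policy a running kernel has actually applied, the per-host
setting can be read from sysfs. A minimal sketch, assuming the kernel exposes
`link_power_management_policy` under /sys/class/scsi_host (AHCI hosts
normally do); the function name `show_lpm_policy` is made up for illustration.

```shell
#!/bin/sh
# Print the link power management policy of every SATA host.
# $1: sysfs directory to scan (defaults to /sys/class/scsi_host)
show_lpm_policy() {
    base=${1:-/sys/class/scsi_host}
    for f in "$base"/host*/link_power_management_policy; do
        # Skip the unexpanded pattern when no host exposes the attribute.
        [ -r "$f" ] || continue
        host=${f%/link_power_management_policy}
        printf '%s: %s\n' "${host##*/}" "$(cat "$f")"
    done
}
```

Calling `show_lpm_policy` without arguments on the failing and on the working
kernel would show whether the two really differ; the values are policy names
such as `max_performance` or `med_power_with_dipm`.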

Alex

On Thursday, 11 March 2021, 20:39:04 CET, Alexander Puchmayr wrote:
> Hi there,
<SNIP>
> Any ideas?