Mailing List Archive

Issues with amdgpu driver: Compositor hangs, sysfs not working
Hello everybody,

I installed an AMD Radeon RX 7900 XTX today, switching from Nvidia. But
once I enable FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y to have a tty once
the driver is up, the following happens:

1) My Wayland compositor (Hyprland) takes very long to start.

2) reading from sysfs (e.g. running "cat
/sys/class/drm/card0/device/gpu_busy_percent") does not work and causes
a hang.
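
A read that hangs like the one in item 2 also ties up the shell it was started from. Here is a minimal sketch, assuming Python 3 is available, that performs the read in a daemon thread and gives up after a timeout. The helper name read_with_timeout and the 5 second default are made up for illustration and are not part of any amdgpu tooling. It does not fix the hang: if the kernel never answers, the worker thread simply stays blocked in the background.

```python
import threading


def read_with_timeout(path, timeout=5.0):
    """Return the stripped contents of `path`, or None if the read
    has not finished after `timeout` seconds."""
    result = {}

    def worker():
        # Runs in a separate thread so a blocked read cannot stall the caller.
        try:
            with open(path) as f:
                result["value"] = f.read().strip()
        except OSError as exc:
            result["error"] = exc

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)

    if t.is_alive():
        return None  # the read is still blocked
    if "error" in result:
        raise result["error"]
    return result["value"]


# Example call (path taken from the report above):
# read_with_timeout("/sys/class/drm/card0/device/gpu_busy_percent")
```

A return value of None then means "the driver did not answer in time", which is easier to log than a frozen terminal.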

Once I set FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=n, I have no issues
with the startup speed of the compositor at all and the mentioned
command works. But this leads to a black tty.

The only two error messages from amdgpu I find in dmesg are:

[   66.757500] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your
previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
[   66.757502] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!

and

[  870.087856] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your
previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
[  870.087858] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics
table!
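
When comparing such messages across boots, it can help to pull the timestamp, PCI address and register values out of the log text. The following is a rough sketch, assuming Python 3 and assuming each message sits on one line, as dmesg prints it (the mail client wrapped the lines above). The function name parse_amdgpu_lines is hypothetical, and the sketch only extracts the values; it does not interpret what the registers mean.

```python
import re

# "[   66.757500] amdgpu 0000:03:00.0: amdgpu: <message>"
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu\s+(?P<pci>\S+):\s+amdgpu:\s+(?P<msg>.*)"
)
# "SMN_C2PMSG_66:0x00000029"
REG_RE = re.compile(r"(SMN_C2PMSG_\d+):(0x[0-9a-fA-F]+)")


def parse_amdgpu_lines(text):
    """Return one dict per amdgpu log line found in `text`."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not an amdgpu message in the expected shape
        message = m.group("msg")
        registers = {
            name: int(value, 16) for name, value in REG_RE.findall(message)
        }
        entries.append(
            {
                "time": float(m.group("ts")),
                "pci": m.group("pci"),
                "message": message,
                "registers": registers,
            }
        )
    return entries
```

Feeding it the output of "dmesg" as a string gives a list that can be filtered, for example for entries whose "registers" dict is not empty.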

Did I forget anything or is this a bug?
Re: Issues with amdgpu driver: Compositor hangs, sysfs not working
On Saturday, 17 February 2024 19:34:37 GMT Paul Sopka wrote:
> Hello everybody,
>
> I installed an AMD Radeon RX 7900 XTX today, switching from Nvidia. But
> once I enable FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y to have a tty once
> the driver is up, the following happens:
>
> 1) My Wayland compositor (Hyprland) takes very long to start.
>
> 2) reading from sysfs (e.g. running "cat
> /sys/class/drm/card0/device/gpu_busy_percent") does not work and causes
> a hang.
>
> Once I set FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=n, I have no issues
> with the starting speed of the compositors at all and the mentioned
> command works. But this leads to a black tty.

You'd normally need this enabled to get a fb display on the console, but I
don't know if this would be provided by proprietary drivers instead for your
card - see below.


> The only two error messages from amdgpu I find in dmesg are:
>
> [ 66.757500] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your
> previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> [ 66.757502] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!
>
> and
>
> [ 870.087856] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your
> previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> [ 870.087858] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics
> table!
>
> Did I forget anything or is this a bug?

It could be both. I don't think there's any Linux firmware released yet for
this card - but I don't follow the latest & greatest so I could be wrong.
You'd need the AMD amdgpu-pro on top of the amdgpu driver, to bring in the
proprietary OpenGL, OpenCL, Vulkan and AMF components:

https://wiki.gentoo.org/wiki/AMDGPU-PRO

This is what's in portage today:

~ $ eix -l amdgpu-pro
* dev-libs/amdgpu-pro-opencl
Available versions:
~ 20.40.1147286 ^fmsd [ABI_X86="32 64"] ["|| ( abi_x86_32
abi_x86_64 )"]
Homepage: https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-40
Description: Proprietary OpenCL implementation for AMD GPUs

* media-libs/amdgpu-pro-vulkan
Available versions:
~ 21.50.2.1384496-r1 ^md [ABI_X86="32 64"
VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
~ 22.10.4.1452060-r1 ^md [ABI_X86="32 64"
VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
~ 22.20.5.1511376-r1 ^md [ABI_X86="32 64"
VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
~ 22.40.6.1580631-r1 ^md [ABI_X86="32 64"
VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
~ 23.10.3.1620044-r1 ^md [ABI_X86="32 64"
VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
~ 23.20.0.1654522-r1 ^md [ABI_X86="32 64"
VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
Homepage: https://www.amd.com/en/support
Description: AMD's closed source vulkan driver, from Radeon
Software for Linux

* media-video/amdgpu-pro-amf
Available versions:
~ 1.4.24.1452059 ^md
~ 1.4.26.1511376 ^md
~ 1.4.29.1580631 ^md
~ 1.4.30.1620044 ^md
~ 1.4.31.1654522 (0/31)^md
Homepage: https://www.amd.com/en/support
Description: AMD's closed source Advanced Media Framework (AMF)
driver

Found 3 matches
Re: Issues with amdgpu driver: Compositor hangs, sysfs not working
Thank you for your reply.

>> Hello everybody,
>>
>> I installed an AMD Radeon RX 7900 XTX today, switching from Nvidia. But
>> once I enable FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y to have a tty once
>> the driver is up, the following happens:
>>
>> 1) My Wayland compositor (Hyprland) takes very long to start.
>>
>> 2) reading from sysfs (e.g. running "cat
>> /sys/class/drm/card0/device/gpu_busy_percent") does not work and causes
>> a hang.
>>
>> Once I set FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=n, I have no issues
>> with the starting speed of the compositors at all and the mentioned
>> command works. But this leads to a black tty.
> You'd normally need this enabled to get a fb display on the console, but I
> don't know if this would be provided by proprietary drivers instead for your
> card - see below.
I made a mistake here, sorry. The setting that causes the issue is
DRM_FBDEV_EMULATION=y, which by itself works with the open source
driver, but causes problems as soon as I start Hyprland.
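
Since two similarly named options got mixed up here, it may be worth checking what the kernel config actually contains. This is a small sketch, assuming Python 3, that reports the state of both options. The function names are made up for illustration. /proc/config.gz only exists if the kernel was built with CONFIG_IKCONFIG_PROC; the fallback path /usr/src/linux/.config is an assumption and describes the configured sources, which need not match the running kernel.

```python
import gzip
import re
from pathlib import Path

OPTIONS = ("DRM_FBDEV_EMULATION", "FRAMEBUFFER_CONSOLE_DETECT_PRIMARY")


def option_states(config_text, options=OPTIONS):
    """Map each option name to its value ('y', 'm', ...) or 'n' if it is
    commented out ("# CONFIG_X is not set") or missing."""
    states = {}
    for name in options:
        pattern = rf"^CONFIG_{re.escape(name)}=(.*)$"
        m = re.search(pattern, config_text, re.MULTILINE)
        states[name] = m.group(1) if m else "n"
    return states


def load_kernel_config():
    """Return the kernel config as text from /proc/config.gz if present,
    otherwise from /usr/src/linux/.config (assumed location)."""
    proc_config = Path("/proc/config.gz")
    if proc_config.exists():
        return gzip.decompress(proc_config.read_bytes()).decode()
    return Path("/usr/src/linux/.config").read_text()


# Example call:
# print(option_states(load_kernel_config()))
```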

> It could be both. I don't think there's any Linux firmware released yet for
> this card - but I don't follow the latest & greatest so I could be wrong.
> You'd need the AMD amdgpu-pro on top of the amdgpu driver, to bring in the
> proprietary OpenGL, OpenCL, Vulkan and AMF components:
>
> https://wiki.gentoo.org/wiki/AMDGPU-PRO
>
> This is what's in portage today:
>
> ~ $ eix -l amdgpu-pro
> * dev-libs/amdgpu-pro-opencl
> Available versions:
> ~ 20.40.1147286 ^fmsd [ABI_X86="32 64"] ["|| ( abi_x86_32
> abi_x86_64 )"]
> Homepage: https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-40
> Description: Proprietary OpenCL implementation for AMD GPUs
>
> * media-libs/amdgpu-pro-vulkan
> Available versions:
> ~ 21.50.2.1384496-r1 ^md [ABI_X86="32 64"
> VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
> ~ 22.10.4.1452060-r1 ^md [ABI_X86="32 64"
> VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
> ~ 22.20.5.1511376-r1 ^md [ABI_X86="32 64"
> VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
> ~ 22.40.6.1580631-r1 ^md [ABI_X86="32 64"
> VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
> ~ 23.10.3.1620044-r1 ^md [ABI_X86="32 64"
> VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
> ~ 23.20.0.1654522-r1 ^md [ABI_X86="32 64"
> VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
> Homepage: https://www.amd.com/en/support
> Description: AMD's closed source vulkan driver, from Radeon
> Software for Linux
>
> * media-video/amdgpu-pro-amf
> Available versions:
> ~ 1.4.24.1452059 ^md
> ~ 1.4.26.1511376 ^md
> ~ 1.4.29.1580631 ^md
> ~ 1.4.30.1620044 ^md
> ~ 1.4.31.1654522 (0/31)^md
> Homepage: https://www.amd.com/en/support
> Description: AMD's closed source Advanced Media Framework (AMF)
> driver
>
> Found 3 matches
The firmware seems fine, since it loads without errors. "dmesg | grep
amdgpu | grep firmware" returns:
[   16.905914] Loading firmware: amdgpu/psp_13_0_0_sos.bin
[   16.905916] Loading firmware: amdgpu/psp_13_0_0_ta.bin
[   16.905917] Loading firmware: amdgpu/smu_13_0_0.bin
[   16.905917] Loading firmware: amdgpu/dcn_3_2_0_dmcub.bin
[   16.905918] Loading firmware: amdgpu/gc_11_0_0_pfp.bin
[   16.905919] Loading firmware: amdgpu/gc_11_0_0_me.bin
[   16.905919] Loading firmware: amdgpu/gc_11_0_0_rlc.bin
[   16.905920] Loading firmware: amdgpu/gc_11_0_0_mec.bin
[   16.905921] Loading firmware: amdgpu/gc_11_0_0_imu.bin
[   16.905922] Loading firmware: amdgpu/sdma_6_0_0.bin
[   16.905923] Loading firmware: amdgpu/vcn_4_0_0.bin
[   16.906095] Loading firmware: amdgpu/gc_11_0_0_mes_2.bin
[   16.906096] Loading firmware: amdgpu/gc_11_0_0_mes1.bin
[   16.906496] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN
firmware
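
To cross-check a list like this against what is installed, the requested file names can be extracted from the log and looked up on disk. A sketch, assuming Python 3; the function names are hypothetical. It assumes the blobs live under /lib/firmware and may be stored with an .xz or .zst suffix if the firmware package was installed compressed. It only checks that the files exist, not that they are recent enough for the card.

```python
import re
from pathlib import Path

# "[   16.905914] Loading firmware: amdgpu/psp_13_0_0_sos.bin"
FW_RE = re.compile(r"Loading firmware:\s+(\S+)")


def requested_firmware(dmesg_text):
    """Return the firmware paths named in 'Loading firmware:' lines."""
    return FW_RE.findall(dmesg_text)


def missing_firmware(names, root="/lib/firmware"):
    """Return the names for which no file exists below `root`,
    either plain or with an .xz/.zst suffix (assumed layouts)."""
    root = Path(root)
    missing = []
    for name in names:
        candidates = [root / name, root / (name + ".xz"), root / (name + ".zst")]
        if not any(p.exists() for p in candidates):
            missing.append(name)
    return missing


# Example call, with `log` holding the dmesg output as a string:
# print(missing_firmware(requested_firmware(log)))
```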

The Mesa libraries also work just fine: if I set
DRM_FBDEV_EMULATION=n, I just get a black tty, but Hyprland starts and I
can play games with the expected performance.
Re: Issues with amdgpu driver: Compositor hangs, sysfs not working
On Sunday, 18 February 2024 09:17:13 GMT Paul Sopka wrote:
> Thank you for your reply.
>
> >> Hello everybody,
> >>
> >> I installed an AMD Radeon RX 7900 XTX today, switching from Nvidia. But
> >> once I enable FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y to have a tty once
> >> the driver is up, the following happens:
> >>
> >> 1) My Wayland compositor (Hyprland) takes very long to start.
> >>
> >> 2) reading from sysfs (e.g. running "cat
> >> /sys/class/drm/card0/device/gpu_busy_percent") does not work and causes
> >> a hang.
> >>
> >> Once I set FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=n, I have no issues
> >> with the starting speed of the compositors at all and the mentioned
> >> command works. But this leads to a black tty.
> >
> > You'd normally need this enabled to get a fb display on the console, but I
> > don't know if this would be provided by proprietary drivers instead for
> > your card - see below.
>
> I made a mistake here, sorry. The setting that causes the issue is
> DRM_FBDEV_EMULATION=y, which by itself works with the open source
> driver, but causes problems as soon as I start Hyprland.

I have an older AMD card here, using amdgpu only. If I disable
DRM_FBDEV_EMULATION I lose my framebuffer and end up with a black screen.
When the sddm display manager starts I have a GUI again to log in to Plasma.
This is to be expected in my case, because I rely on the KMS driver
(KMS FB helpers) to provide a framebuffer device. Unless an AMD proprietary
driver is available via amdgpu-pro to substitute for the KMS FB emulation,
you won't get a framebuffer device to render your tty console.


> > It could be both. I don't think there's any Linux firmware released yet
> > for this card - but I don't follow the latest & greatest so I could be
> > wrong. You'd need the AMD amdgpu-pro on top of the amdgpu driver, to
> > bring in the proprietary OpenGL, OpenCL, Vulkan and AMF components:
> >
> > https://wiki.gentoo.org/wiki/AMDGPU-PRO
> >
> > This is what's in portage today:
[snip ...]

> The firmware seems fine, since it loads without errors. "dmesg | grep
> amdgpu | grep firmware" returns:
> [ 16.905914] Loading firmware: amdgpu/psp_13_0_0_sos.bin
> [ 16.905916] Loading firmware: amdgpu/psp_13_0_0_ta.bin
> [ 16.905917] Loading firmware: amdgpu/smu_13_0_0.bin
> [ 16.905917] Loading firmware: amdgpu/dcn_3_2_0_dmcub.bin
> [ 16.905918] Loading firmware: amdgpu/gc_11_0_0_pfp.bin
> [ 16.905919] Loading firmware: amdgpu/gc_11_0_0_me.bin
> [ 16.905919] Loading firmware: amdgpu/gc_11_0_0_rlc.bin
> [ 16.905920] Loading firmware: amdgpu/gc_11_0_0_mec.bin
> [ 16.905921] Loading firmware: amdgpu/gc_11_0_0_imu.bin
> [ 16.905922] Loading firmware: amdgpu/sdma_6_0_0.bin
> [ 16.905923] Loading firmware: amdgpu/vcn_4_0_0.bin
> [ 16.906095] Loading firmware: amdgpu/gc_11_0_0_mes_2.bin
> [ 16.906096] Loading firmware: amdgpu/gc_11_0_0_mes1.bin
> [ 16.906496] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN
> firmware

These are for the amdgpu driver. I expect the amdgpu-pro proprietary driver
contains additional firmware.


> Also the mesa libraries work just fine,

Mesa is the open source implementation of OpenGL, Vulkan, et al. graphics API
specifications. If you are using proprietary AMD drivers then I understand
all the graphics API instructions will go through these proprietary drivers,
instead of being translated by Mesa.


> if I set
> DRM_FBDEV_EMULATION=n, I just get a black tty, but Hyprland starts and I
> can play games with the expected performance.

I am not sure how the fbdev emulation in the kernel works with amdgpu-pro
when combined with Hyprland. Have you tried a different compositor to see how
it compares? If your problem is caused by a Hyprland bug, you'd soon know.
Re: Issues with amdgpu driver: Compositor hangs, sysfs not working
AMDGPU-PRO is not a driver, but a set of libraries containing
OpenCL, Vulkan and the Advanced Media Framework (AMF). It operates on top
of amdgpu.

> Mesa is the open source implementation of OpenGL, Vulkan, et al. graphics API
> specifications. If you are using proprietary AMD drivers then I understand
> all the graphics API instructions will go through these proprietary drivers,
> instead of being translated by Mesa.
As you said, it could be seen as an alternative to Mesa, but will not
change anything about the firmware or the amdgpu driver. It therefore
will not change anything about the framebuffer, since this is handled by
the driver.

That said, I do not have the described problems when starting gamescope
from a tty, so my guess now is that it's a Hyprland issue. I filed a bug
there. I will also try to start Hyprland using amdgpu-pro instead of Mesa.

Thank you for your time