gentoo-sources 5.15.151 breaks amdgpu support?
I upgraded gentoo-sources from 5.15.147 to 5.15.151 this morning and
amdgpu support is now borked on my system with an AMD Ryzen 5 3400G
with Radeon Vega Graphics.

Everything worked fine with 5.15.147, but when 5.15.151 (built with
the same .config via "make oldconfig") boots there's always a kernel
oops, and video output goes blank.

I suppose maybe it's time to see if 6.1 works...

[ 2.888353] [drm] amdgpu kernel modesetting enabled.
[ 2.896597] amdgpu: Topology: Add APU node [0x0:0x0]
[ 2.896736] [drm] initializing kernel modesetting (RAVEN 0x1002:0x15D8 0x1462:0x7C02 0xC8).
[ 2.896742] amdgpu 0000:2a:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled
[ 2.896751] [drm] register mmio base: 0xFCA00000
[ 2.896752] [drm] register mmio size: 524288
[ 2.896762] [drm] add ip block number 0 <soc15_common>
[ 2.896764] [drm] add ip block number 1 <gmc_v9_0>
[ 2.896766] [drm] add ip block number 2 <vega10_ih>
[ 2.896767] [drm] add ip block number 3 <psp>
[ 2.896769] [drm] add ip block number 4 <gfx_v9_0>
[ 2.896770] [drm] add ip block number 5 <sdma_v4_0>
[ 2.896772] [drm] add ip block number 6 <powerplay>
[ 2.896774] [drm] add ip block number 7 <dm>
[ 2.896775] [drm] add ip block number 8 <vcn_v1_0>
[ 2.896779] Loading firmware: amdgpu/picasso_gpu_info.bin
[ 2.919851] [drm] BIOS signature incorrect 5b 7
[ 2.919872] amdgpu 0000:2a:00.0: amdgpu: Fetched VBIOS from ROM BAR
[ 2.919874] amdgpu: ATOM BIOS: 113-PICASSO-115
[ 2.919883] Loading firmware: amdgpu/picasso_sdma.bin
[ 2.920090] [drm] VCN decode is enabled in VM mode
[ 2.920091] [drm] VCN encode is enabled in VM mode
[ 2.920092] [drm] JPEG decode is enabled in VM mode
[ 2.920111] amdgpu 0000:2a:00.0: vgaarb: deactivate vga console
[ 2.920841] Console: switching to colour dummy device 80x25
[ 2.920885] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[ 2.920892] amdgpu 0000:2a:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
[ 2.920895] amdgpu 0000:2a:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[ 2.920898] amdgpu 0000:2a:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[ 2.920904] [drm] Detected VRAM RAM=2048M, BAR=2048M
[ 2.920906] [drm] RAM width 128bits DDR4
[ 2.921005] [drm] amdgpu: 2048M of VRAM memory ready
[ 2.921008] [drm] amdgpu: 3072M of GTT memory ready.
[ 2.921012] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 2.921136] [drm] PCIE GART of 1024M enabled.
[ 2.921137] [drm] PTB located at 0x000000F400900000
[ 2.921208] Loading firmware: amdgpu/picasso_asd.bin
[ 2.921688] Loading firmware: amdgpu/picasso_ta.bin
[ 2.921897] amdgpu 0000:2a:00.0: amdgpu: PSP runtime database doesn't exist
[ 2.921939] Loading firmware: amdgpu/picasso_pfp.bin
[ 2.922059] Loading firmware: amdgpu/picasso_me.bin
[ 2.922246] Loading firmware: amdgpu/picasso_ce.bin
[ 2.922344] Loading firmware: amdgpu/picasso_rlc_am4.bin
[ 2.922846] Loading firmware: amdgpu/picasso_mec.bin
[ 2.923079] Loading firmware: amdgpu/picasso_mec2.bin
[ 2.924074] amdgpu: hwmgr_sw_init smu backed is smu10_smu
[ 2.924081] Loading firmware: amdgpu/raven_dmcu.bin
[ 2.924221] Loading firmware: amdgpu/picasso_vcn.bin
[ 2.924504] [drm] Found VCN firmware Version ENC: 1.15 DEC: 3 VEP: 0 Revision: 0
[ 2.924510] amdgpu 0000:2a:00.0: amdgpu: Will use PSP to load VCN firmware
[ 2.945168] [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR
[ 3.004533] amdgpu 0000:2a:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 3.009017] amdgpu 0000:2a:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 3.011810] [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
[ 3.011919] [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
[ 3.011922] amdgpu 0000:2a:00.0: amdgpu: Secure display: Generic Failure.
[ 3.011924] amdgpu 0000:2a:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
[ 3.013330] [drm] kiq ring mec 2 pipe 1 q 0
[ 3.014477] [drm] DM_PPLIB: values for F clock
[ 3.014478] [drm] DM_PPLIB: 1600000 in kHz, 4399 in mV
[ 3.014481] [drm] DM_PPLIB: values for DCF clock
[ 3.014482] [drm] DM_PPLIB: 300000 in kHz, 3099 in mV
[ 3.014484] [drm] DM_PPLIB: 600000 in kHz, 3574 in mV
[ 3.014485] [drm] DM_PPLIB: 626000 in kHz, 4250 in mV
[ 3.014487] [drm] DM_PPLIB: 654000 in kHz, 4399 in mV
[ 3.014716] [drm] Display Core initialized with v3.2.149!
[ 3.100173] [drm] VCN decode and encode initialized successfully(under SPG Mode).
[ 3.101219] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[ 3.101437] amdgpu: Topology: Add APU node [0x15d8:0x1002]
[ 3.101440] kfd kfd: amdgpu: added device 1002:15d8
[ 3.101447] kfd kfd: amdgpu: Failed to resume IOMMU for device 1002:15d8
[ 3.101449] amdgpu 0000:2a:00.0: amdgpu: amdgpu_device_ip_init failed
[ 3.101452] amdgpu 0000:2a:00.0: amdgpu: Fatal error during GPU init
[ 3.101455] amdgpu 0000:2a:00.0: amdgpu: amdgpu: finishing device.
[ 3.110369] BUG: kernel NULL pointer dereference, address: 0000000000000134
[ 3.110376] #PF: supervisor read access in kernel mode
[ 3.110380] #PF: error_code(0x0000) - not-present page
[ 3.110384] PGD 0 P4D 0
[ 3.110389] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 3.110393] CPU: 1 PID: 1147 Comm: (udev-worker) Not tainted 5.15.151-gentoo #1
[ 3.110400] Hardware name: Micro-Star International Co., Ltd MS-7C02/B450 TOMAHAWK MAX (MS-7C02), BIOS 3.70 06/09/2020
[ 3.110406] RIP: 0010:amdgpu_dm_fini+0xe7/0x180 [amdgpu]
[ 3.110555] Code: 01 00 48 85 ff 74 10 e8 87 3f 16 00 48 c7 83 d8 4d 01 00 00 00 00 00 48 8b 93 30 4e 01 00 48 85 d2 74 62 48 8b 8b 38 3d 01 00 <8b> 81 34 01 00 00 85 c0 74 3e 31 ed 48 63 c5 4c 8d 24 40 4a 8b 3c
[ 3.110564] RSP: 0018:ffffc9000120fc28 EFLAGS: 00010286
[ 3.110569] RAX: ffff88810943b480 RBX: ffff888108380000 RCX: 0000000000000000
[ 3.110574] RDX: ffff88810642a6c0 RSI: ffff88810943b480 RDI: ffff88810943b490
[ 3.110579] RBP: ffff888108380070 R08: 0000000000000001 R09: 0000000000000001
[ 3.110583] R10: 000000000002d9a4 R11: ffffffff8264e800 R12: 0000000000000007
[ 3.110588] R13: ffff888108380000 R14: ffff888108380010 R15: ffff888100c510d0
[ 3.110592] FS: 00007f26994ac540(0000) GS:ffff888410e40000(0000) knlGS:0000000000000000
[ 3.110598] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.110603] CR2: 0000000000000134 CR3: 00000001046ea000 CR4: 00000000003506e0
[ 3.110607] Call Trace:
[ 3.110611] <TASK>
[ 3.110614] ? __die_body.cold+0x1a/0x1f
[ 3.110621] ? page_fault_oops+0xb5/0x280
[ 3.110627] ? srso_return_thunk+0x5/0x10
[ 3.110632] ? do_user_addr_fault+0x75/0x670
[ 3.110636] ? srso_return_thunk+0x5/0x10
[ 3.110641] ? exc_page_fault+0x71/0x140
[ 3.110647] ? asm_exc_page_fault+0x22/0x30
[ 3.110654] ? amdgpu_dm_fini+0xe7/0x180 [amdgpu]
[ 3.110790] ? amdgpu_dm_fini+0xc9/0x180 [amdgpu]
[ 3.110924] dm_hw_fini+0x19/0x30 [amdgpu]
[ 3.111056] amdgpu_device_fini_hw+0x205/0x2f3 [amdgpu]
[ 3.111189] amdgpu_driver_load_kms.cold+0x5f/0x78 [amdgpu]
[ 3.111320] amdgpu_pci_probe+0x174/0x250 [amdgpu]
[ 3.111425] pci_device_probe+0xbb/0x130
[ 3.111431] really_probe.part.0+0xaf/0x290
[ 3.111436] driver_probe_device+0x28/0x100
[ 3.111441] __driver_attach+0x9b/0x190
[ 3.111445] ? __device_attach_driver+0x110/0x110
[ 3.111449] bus_for_each_dev+0x74/0xc0
[ 3.111454] ? _raw_spin_lock+0xe/0x30
[ 3.111459] bus_add_driver+0x143/0x200
[ 3.111464] ? srso_return_thunk+0x5/0x10
[ 3.111468] driver_register+0x84/0xe0
[ 3.111473] ? 0xffffffffa060a000
[ 3.111476] do_one_initcall+0x41/0x200
[ 3.111482] ? srso_return_thunk+0x5/0x10
[ 3.111486] ? kmem_cache_alloc_trace+0x4a/0x1d0
[ 3.111492] do_init_module+0x45/0x220
[ 3.111498] __do_sys_finit_module+0xbf/0x120
[ 3.111506] do_syscall_64+0x3b/0x90
[ 3.111511] entry_SYSCALL_64_after_hwframe+0x62/0xcc
[ 3.111515] RIP: 0033:0x7f26996a7919
[ 3.111520] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d df 14 0c 00 f7 d8 64 89 01 48
[ 3.111529] RSP: 002b:00007ffdb88e1848 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 3.111535] RAX: ffffffffffffffda RBX: 000055bb43e04260 RCX: 00007f26996a7919
[ 3.111540] RDX: 0000000000000000 RSI: 00007f2699870bd5 RDI: 0000000000000010
[ 3.111545] RBP: 0000000000000000 R08: 0000000000000001 R09: 000055bb43dc4290
[ 3.111549] R10: 00007f2699769ac0 R11: 0000000000000246 R12: 00007f2699870bd5
[ 3.111554] R13: 0000000000020000 R14: 000055bb43e06820 R15: 0000000000000000
[ 3.111561] </TASK>
[ 3.111564] Modules linked in: amdgpu(+) drm_ttm_helper ttm mfd_core gpu_sched
[ 3.111574] CR2: 0000000000000134
[ 3.111578] ---[ end trace d19abfb8d26f7bd8 ]---
[ 3.111581] RIP: 0010:amdgpu_dm_fini+0xe7/0x180 [amdgpu]
[ 3.111715] Code: 01 00 48 85 ff 74 10 e8 87 3f 16 00 48 c7 83 d8 4d 01 00 00 00 00 00 48 8b 93 30 4e 01 00 48 85 d2 74 62 48 8b 8b 38 3d 01 00 <8b> 81 34 01 00 00 85 c0 74 3e 31 ed 48 63 c5 4c 8d 24 40 4a 8b 3c
[ 3.111725] RSP: 0018:ffffc9000120fc28 EFLAGS: 00010286
[ 3.111729] RAX: ffff88810943b480 RBX: ffff888108380000 RCX: 0000000000000000
[ 3.111734] RDX: ffff88810642a6c0 RSI: ffff88810943b480 RDI: ffff88810943b490
[ 3.111738] RBP: ffff888108380070 R08: 0000000000000001 R09: 0000000000000001
[ 3.111743] R10: 000000000002d9a4 R11: ffffffff8264e800 R12: 0000000000000007
[ 3.111747] R13: ffff888108380000 R14: ffff888108380010 R15: ffff888100c510d0
[ 3.111752] FS: 00007f26994ac540(0000) GS:ffff888410e40000(0000) knlGS:0000000000000000
[ 3.111758] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.111762] CR2: 0000000000000134 CR3: 00000001046ea000 CR4: 00000000003506e0
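
Reading the log above, GPU init appears to give up right after the
"Failed to resume IOMMU" message, and the oops then happens during
teardown in amdgpu_dm_fini. Below is a minimal Python sketch for pulling
those two facts out of a saved dmesg capture. It assumes the log format
shown here; the helper name summarize_oops and the "Failed to" heuristic
are illustrative only, not part of any kernel tooling.

```python
import re

# Strip the leading "[ 3.101447] " style timestamp from a dmesg line.
TS = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# Kernel-mode RIP line, e.g. "RIP: 0010:amdgpu_dm_fini+0xe7/0x180 [amdgpu]".
RIP = re.compile(r"RIP: 0010:(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?")


def summarize_oops(log_text):
    """Return the first 'Failed to' line and the first faulting kernel symbol.

    Hypothetical helper: 'Failed to' is only a heuristic for spotting the
    first hard failure and will miss messages worded differently.
    """
    first_failure = None
    symbol = None
    module = None
    for raw in log_text.splitlines():
        line = TS.sub("", raw.strip())
        if first_failure is None and "Failed to" in line:
            first_failure = line
        match = RIP.search(line)
        if match and symbol is None:
            symbol, module = match.group(1), match.group(2)
    return {"first_failure": first_failure, "symbol": symbol, "module": module}


# A few lines copied from the log above.
sample = """\
[ 3.101447] kfd kfd: amdgpu: Failed to resume IOMMU for device 1002:15d8
[ 3.101449] amdgpu 0000:2a:00.0: amdgpu: amdgpu_device_ip_init failed
[ 3.110369] BUG: kernel NULL pointer dereference, address: 0000000000000134
[ 3.110555] RIP: 0010:amdgpu_dm_fini+0xe7/0x180 [amdgpu]
[ 3.111515] RIP: 0033:0x7f26996a7919
"""

summary = summarize_oops(sample)
print(summary)
```

The user-space RIP line (segment 0033) is skipped on purpose, since only
the kernel-mode one names the faulting function.
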
Re: gentoo-sources 5.15.151 breaks amdgpu support?
On 2024-03-11, Grant Edwards <grant.b.edwards@gmail.com> wrote:

> I upgraded gentoo-sources from 5.15.147 to 5.15.151 this morning and
> amdgpu support is now borked on my system with an AMD Ryzen 5 3400G
> with Radeon Vega Graphics.
>
> Everything worked fine with 5.15.147, but when 5.15.151 (built with
> same .config via "make oldconfig") boots there's always a kernel oops,
> and video output goes blank.
>
> I suppose maybe it's time to see if 6.1 works...

Gentoo-sources 6.1 seems to work fine. I had masked it when it first
went stable because it wouldn't boot back then. Whatever was wrong
with it then seems to have been fixed.

Hopefully it won't catch whatever malady is affecting 5.15.151.

--
Grant