
Host freezing after "fixing" recursive fault starting in multicalls.c
Hi,

I'm running Xen 4.11.2 on Fedora 30 with kernel versions 5.4.7 and 5.4.10 on multiple HP servers.

The workflow I'm trying to achieve looks like the following:

- a VM is resumed from a snapshot by a Python script using the libvirt API,
- it runs for a few minutes,
- it is paused and finally destroyed, for testing purposes.
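
For reference, the cycle above can be sketched roughly as follows. This is a minimal sketch and not the actual script: it assumes the "snapshot" is a saved-state file restored with `virConnect.restore()`, and the connection URI, file path, domain name and run time in `main()` are hypothetical placeholders.

```python
import time


def run_cycle(conn, saved_state_path, domain_name, run_seconds, sleep=time.sleep):
    """Restore a VM, let it run, then pause and destroy it.

    `conn` is an open libvirt connection (anything exposing restore() and
    lookupByName() works, which keeps the function easy to test).
    """
    # Assumption: the snapshot is a saved-state file created by a domain save.
    conn.restore(saved_state_path)
    dom = conn.lookupByName(domain_name)

    # Let the guest run for a while.
    sleep(run_seconds)

    # Pause the guest, then tear it down.
    dom.suspend()
    dom.destroy()


def main():
    # Imported lazily so run_cycle() can be exercised without libvirt installed.
    import libvirt

    # Hypothetical values, not taken from the real setup.
    conn = libvirt.open("xen:///system")
    try:
        run_cycle(conn, "/var/lib/xen/save/test-vm.save", "test-vm", 180)
    finally:
        conn.close()
```

Calling `main()` in a loop would reproduce the repeated restore/pause/destroy pattern described above.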

At some point, a huge stack trace starting with an error in `arch/x86/xen/multicalls.c` appears in the kernel logs, ending with the message 'Fixing recursive fault but reboot is needed!'. The timing does not seem to be deterministic: sometimes it happens directly after boot, sometimes only after several hours.
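
To locate the relevant entries in a saved kernel log, something like the following can help. It is only a sketch: the two marker strings are the ones quoted above, while the function name and the way the log lines are obtained are assumptions.

```python
# Marker strings taken from the kernel log messages quoted in this mail.
MARKERS = (
    "arch/x86/xen/multicalls.c",
    "Fixing recursive fault but reboot is needed!",
)


def find_fault_lines(lines):
    """Return (line_number, text) pairs for log lines containing a marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The input can be any iterable of lines, for example an open file holding the output of `dmesg` or `journalctl -k`.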

After some time the system freezes completely and needs a hard reset, because it is no longer possible to log in via SSH.
The freeze is not deterministic either, but there are no other critical errors in the logs, so the two seem to be related.

Because the full stack trace is roughly 370 lines long, I have put it in a GitHub Gist:

https://gist.github.com/baez90/135c3985cbb6fd4b4204269fb384221a

I'm a little confused as to what else to try and I have no idea what the problem might be.

Any hints/ideas/proposals?

Kind regards and thanks in advance
_______________________________________________
Xen-users mailing list
Xen-users@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-users