Mailing List Archive

Xen 4.11 - Unable to start X server: Nvidia out of memory error
Hi,

I have Ubuntu 20.04 installed, and I'm using the provided Xen package: xen-hypervisor-4.11-amd64

Xen boots as expected, but I can't get my Gnome desktop, since the X server fails to start.

Looking at the journalctl logs, I can see the errors related to GNOME:
/usr/lib/gdm3/gdm-x-session[1962]: Unable to run X server

And a few lines before, an error related to Nvidia:
/usr/lib/gdm3/gdm-x-session[1966]: (WW) NVIDIA: Failed to bind sideband socket to
/usr/lib/gdm3/gdm-x-session[1966]: (WW) NVIDIA: '/var/run/nvidia-xdriver-66ed5ce3' Permission denied
/usr/lib/gdm3/gdm-x-session[1966]: (II) NVIDIA: Using 24576.00 MB of virtual memory for indirect memory
/usr/lib/gdm3/gdm-x-session[1966]: (II) NVIDIA: access.
/usr/lib/gdm3/gdm-x-session[1966]: (EE) NVIDIA(0): Failed to allocate software rendering cache surface: out of
/usr/lib/gdm3/gdm-x-session[1966]: (EE) NVIDIA(0): memory.
/usr/lib/gdm3/gdm-x-session[1966]: (EE) NVIDIA(0): *** Aborting ***
/usr/lib/gdm3/gdm-x-session[1966]: (EE)
/usr/lib/gdm3/gdm-x-session[1966]: Fatal server error:
/usr/lib/gdm3/gdm-x-session[1966]: (EE) NVIDIA: A GPU exception occurred during X server initialization

I can also see a GPU MMU fault in dmesg:
kernel: NVRM: Xid (PCI:0000:01:00): 31, pid=285, Ch 00000008, intr 00000000. MMU Fault: ENGINE GRAPHICS HUBCLIENT_FE faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_WRITE
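
For anyone comparing several such dmesg lines, here is a minimal Python sketch that splits the Xid line above into its fields. The regular expression is an assumption derived only from the single line quoted here; other driver versions may format Xid messages differently, and the names `XID_RE` and `parse_xid` are purely illustrative.

```python
import re

# Pattern inferred from the one NVRM Xid line quoted above (assumption:
# other driver versions may print these lines in a different layout).
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), Ch (?P<channel>[0-9a-fA-F]+), "
    r"intr (?P<intr>[0-9a-fA-F]+)\. (?P<detail>.*)"
)


def parse_xid(line):
    """Return a dict of fields from an NVRM Xid line, or None if no match."""
    match = XID_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The Xid number and the pid are decimal; the other fields stay as text.
    fields["xid"] = int(fields["xid"])
    fields["pid"] = int(fields["pid"])
    return fields


line = (
    "kernel: NVRM: Xid (PCI:0000:01:00): 31, pid=285, Ch 00000008, "
    "intr 00000000. MMU Fault: ENGINE GRAPHICS HUBCLIENT_FE faulted @ "
    "0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_WRITE"
)
info = parse_xid(line)
print(info["pci"], info["xid"], info["pid"], info["detail"])
```

Here the Xid number is 31, and the remainder of the line (the MMU fault description) ends up in `detail`.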

In a normal boot I don't have these errors (with nouveau or nvidia driver),
so it's safe to assume that Xen is doing something that the drivers were not prepared for.

I'm using the latest drivers available: nvidia-440

Have you seen this error before?
Any ideas on how to fix this issue?
Do you have contacts at Nvidia who could help?

Best regards,
Mathieu Tarral
Re: Xen 4.11 - Unable to start X server: Nvidia out of memory error [ In reply to ]
Is it a Xen problem or Nvidia?

On Fri, Jun 19, 2020 at 12:25 AM, Mathieu Tarral <mathieu.tarral@protonmail.com> wrote:
> [...]
Re : Re: Xen 4.11 - Unable to start X server: Nvidia out of memory error [ In reply to ]
Hi Jason,

> Is it a Xen problem or Nvidia?

Considering that in a normal boot I don't have any issue with Nvidia,
I would say it is related to Xen.

Re: Xen 4.11 - Unable to start X server: Nvidia out of memory error [ In reply to ]
> On 20.06.2020 at 17:12, Mathieu Tarral <mathieu.tarral@protonmail.com> wrote:
>
> Considering that in a normal boot I don't have any issue with Nvidia,
> I would say it is related to Xen.

As a Xen user, I remember that years ago NVIDIA's proprietary graphics drivers for Linux also caused problems when run under Xen. If I remember correctly, the cause was some "improper" handling of memory and resource access/management in the driver implementation, which only affected certain virtualization features that Xen relies on. The open-source nv driver variant (without the "binary blob" for extended features such as some 3D acceleration) worked.

At that time it was difficult to get it solved, as there was no real effort from NVIDIA...

But AFAIK NVIDIA fixed it some months later (I did not test it again, because I had already switched to another graphics card by then).

So the cause lies less "clearly" on the Xen side than the first impression might suggest...

And if I remember right, a workaround was to use an older NVIDIA driver version (but in my case that version did not support my kernel...).


regards,


niels.




Niels Dettenbach
https://www.syndicat.com
https://www.syndicat.com/pub_key.asc
Re : Re: Xen 4.11 - Unable to start X server: Nvidia out of memory error [ In reply to ]
On Saturday, 20 June 2020 at 17:31, Niels Dettenbach <niels@dettenbach.de> wrote:

> > On 20.06.2020 at 17:12, Mathieu Tarral <mathieu.tarral@protonmail.com> wrote:
> > Considering that in a normal boot I don't have any issue with Nvidia,
> > I would say it is related to Xen.
>
> As a Xen user, I remember that years ago NVIDIA's proprietary graphics drivers for Linux also caused problems when run under Xen. If I remember correctly, the cause was some "improper" handling of memory and resource access/management in the driver implementation, which only affected certain virtualization features that Xen relies on. The open-source nv driver variant (without the "binary blob" for extended features such as some 3D acceleration) worked.
>
> At that time it was difficult to get it solved, as there was no real effort from NVIDIA...
Does Nvidia have a better relationship with open source nowadays?
It seems that for consumer GPUs, you are still on your own.

> But AFAIK NVIDIA fixed it some months later (I did not test it again, because I had already switched to another graphics card by then).
Do you have a link to share regarding this bug fix, maybe?

> So the cause lies less "clearly" on the Xen side than the first impression might suggest...
Meaning that we are not sure where the responsibility lies for this issue?


> And if I remember right, a workaround was to use an older NVIDIA driver version (but in my case that version did not support my kernel...).
I will try both downgrading my Nvidia driver and using the nouveau driver.
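
To keep the test runs comparable, something like the following Python sketch could record which hypervisor and which NVIDIA kernel module each boot is using. It assumes that `/sys/hypervisor/type` is exposed when running under Xen and that `/proc/driver/nvidia/version` exists when the proprietary module is loaded; both are read defensively, and the helper names `read_first_line` and `boot_snapshot` are only illustrative.

```python
def read_first_line(path):
    """Return the first line of a file, or None if it cannot be read."""
    try:
        with open(path, "r", errors="replace") as handle:
            return handle.readline().strip()
    except OSError:
        return None


def boot_snapshot(hypervisor_path="/sys/hypervisor/type",
                  nvidia_path="/proc/driver/nvidia/version"):
    """Collect the hypervisor type and NVIDIA module version line, if any.

    Both default paths are assumptions about the running system; a missing
    file simply yields None (e.g. bare-metal boot, or nouveau in use).
    """
    return {
        "hypervisor": read_first_line(hypervisor_path),
        "nvidia_version": read_first_line(nvidia_path),
    }


print(boot_snapshot())
```

Running it once per boot (normal, Xen, and after each driver change) gives a small record to attach to each set of logs.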

Thanks!