Mailing List Archive

[RFC] "xs_read(): uuid get error" of qemu-dm
Hi all,

Since c/s 11840, the qemu-dm process becomes <defunct> on guest reboot, and
the qemu log says "xs_read(): uuid get error".
I think this happens because vncpasswd is not yet readable when qemu-dm
tries to read it from xenstore.
(xend spawns qemu-dm before writing vncpasswd to xenstore.)

I think the following actions are necessary. What do you think?

- Swap the call order of vm.initDomain() and vm.storeVmDetails()
  in create()@XendDomainInfo.py.
  It looks safe. Is there any problem with it?
  I tried this, but the error still occurs sometimes.
- Make writeVm() guarantee that its writes to xenstore have completed.
  Is that possible?
- As a temporary fix, make qemu-dm wait a few seconds until the value
  can be read.
  I will try this.

Please comment.

Regards,
Masami


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On 7/11/06 3:46 am, "Masami Watanabe" <masami.watanabe@jp.fujitsu.com>
wrote:

> Since c/s 11840, the qemu-dm process becomes <defunct> on guest reboot, and
> the qemu log says "xs_read(): uuid get error".
> I think this happens because vncpasswd is not yet readable when qemu-dm
> tries to read it from xenstore.
> (xend spawns qemu-dm before writing vncpasswd to xenstore.)

This was supposed to be fixed by c/s 12187.

If it hasn't, we need to fix xend to write the passwd before starting qemu,
and/or qemu needs to treat failure of the xs_read() as an indication that
there is no authentication.
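
The second option (treat a failed read as "no authentication") can be sketched as follows. This is a minimal illustration in Python, not qemu-dm's actual C code: the dict stands in for xenstore, and the function name and the example path are hypothetical.

```python
from typing import Mapping, Optional


def vnc_password_or_none(store: Mapping[str, str], path: str) -> Optional[str]:
    """Return the password stored at `path`, or None if it cannot be read.

    `store` is a hypothetical stand-in for xenstore; a missing key plays
    the role of a failed xs_read().
    """
    try:
        return store[path]
    except KeyError:
        # Policy from the thread: a failed read means "no authentication",
        # so the caller continues without a password instead of exiting.
        return None


# Hypothetical example path, for illustration only.
PATH = "/vm/uuid-1/vncpasswd"

store = {}
no_auth = vnc_password_or_none(store, PATH)    # node missing: no password

store[PATH] = "secret"
with_auth = vnc_password_or_none(store, PATH)  # node present: password used
```

With this policy a missing node no longer kills the process, but a password that arrives late would be silently ignored, so the ordering in xend would still matter.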

What do you think is the problem? Is the passwd getting written after qemu
is started and hence racing the xs_read() in xenstored?

We don't want to work around this with timeouts.

-- Keir



Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On 7/11/06 3:46 am, "Masami Watanabe" <masami.watanabe@jp.fujitsu.com>
wrote:

> - Swap the call order of vm.initDomain() and vm.storeVmDetails()
>   in create()@XendDomainInfo.py.
>   It looks safe. Is there any problem with it?
>   I tried this, but the error still occurs sometimes.

I'm not sure. Does this change the ordering of the writeVM versus creation
of the qemu-dm process?

> - Make writeVm() guarantee that its writes to xenstore have completed.
>   Is that possible?

You mean storeVm? Yes, it should complete synchronously, unless it's part of
a larger transaction, in which case it completes synchronously when the
transaction commits (if the transaction commits successfully). Writes aren't
buffered or delayed apart from as part of a transaction.
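
The visibility rule described here can be illustrated with a toy in-memory store. This is only a sketch: ToyStore and ToyTransaction are hypothetical stand-ins, not the classes used by xend or xenstored, and the paths are made up.

```python
from typing import Dict, Optional


class ToyStore:
    """Hypothetical in-memory stand-in for xenstore."""

    def __init__(self) -> None:
        self.committed: Dict[str, str] = {}

    def read(self, path: str) -> Optional[str]:
        return self.committed.get(path)

    def write(self, path: str, value: str) -> None:
        # A write outside a transaction is visible as soon as it returns.
        self.committed[path] = value

    def transaction(self) -> "ToyTransaction":
        return ToyTransaction(self)


class ToyTransaction:
    """Collects writes and publishes them all at commit time."""

    def __init__(self, store: ToyStore) -> None:
        self.store = store
        self.pending: Dict[str, str] = {}

    def write(self, path: str, value: str) -> None:
        # Held back: other readers cannot see this yet.
        self.pending[path] = value

    def commit(self) -> None:
        self.store.committed.update(self.pending)
        self.pending.clear()


store = ToyStore()

store.write("/vm/uuid-1/name", "guest")
direct = store.read("/vm/uuid-1/name")              # visible immediately

t = store.transaction()
t.write("/vm/uuid-1/vncpasswd", "secret")
before_commit = store.read("/vm/uuid-1/vncpasswd")  # not visible yet
t.commit()
after_commit = store.read("/vm/uuid-1/vncpasswd")   # visible after commit
```

In this model a reader that runs between t.write() and t.commit() sees nothing, so the point at which the transaction commits is what has to come before qemu-dm starts.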

-- Keir



Re: [RFC] "xs_read(): uuid get error" of qemu-dm
Hi Keir,

My explanation was insufficient.

"xs_read(): uuid get error" is logged when the uuid cannot be read from
xenstore in xenstore_read_vncpasswd@tools/ioemu/xenstore.c.

c/s 12187 avoided this problem on guest reboot in many environments.
In my environment, too, that change fixed the problem.

However, the following problem has kept happening since then,
and I think it needs to be fixed.

[Xen-devel] VMX status report 12254:f8ffeb540ec1
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00288.html
[Xen-devel] VMX status report 12217:20204db0891b
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00183.html
> IA32/PAE/IA32E: Windows and Linux VMX domains may fail to be
> created, the qemu-dm process is <defunct>, and the qemu log says
> "xs_read(): uuid get error."


I investigated it.
I was able to reproduce the problem in an environment where two or more
CPUs are allocated to Dom0.
The results are as follows.
- The uuid cannot be read in xenstore_read_vncpasswd() in qemu-dm.
- The uuid can often be read after swapping the order of vm.initDomain()
  and vm.storeVmDetails() in create()@XendDomainInfo.py.
- When the read in qemu-dm is delayed, the uuid can always be read.

From the above, I concluded that this is a timing problem between one
process writing to xenstore and another process reading from it.


> Is the passwd getting written after qemu
> is started and hence racing the xs_read() in xenstored?

Yes, maybe. As I understand it, xend processes things in the following order.
Am I misunderstanding?

create()@XendDomainInfo.py+135
  start()
    _initDomain()
      _createDevices()
        createDeviceModel(self)@image.py
          os.spawnve()      ==============> start qemu-dm process
    _storeVmDetails()
      _writeVm()            ==============> write to xenstore
      _setVmPermissions()
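
The effect of this ordering can be replayed in a small model. It is a sketch under stated assumptions: the dict stands in for xenstore, NODE is a hypothetical key standing for whatever qemu-dm reads at startup, and the child is assumed to read the moment it is spawned (the worst case).

```python
from typing import Dict, List, Optional

# Hypothetical key; stands for whatever node qemu-dm reads at startup.
NODE = "/example/node"


def replay(order: List[str]) -> Optional[str]:
    """Run the named steps in order and return what qemu-dm saw at startup."""
    store: Dict[str, str] = {}
    seen: Dict[str, Optional[str]] = {}

    def spawn_qemu_dm() -> None:
        # Worst case: the child reads the node as soon as it is spawned.
        seen["value"] = store.get(NODE)

    def store_vm_details() -> None:
        store[NODE] = "written-by-xend"

    steps = {
        "spawn_qemu_dm": spawn_qemu_dm,
        "store_vm_details": store_vm_details,
    }
    for name in order:
        steps[name]()
    return seen.get("value")


# Order shown in the trace: spawn first, write afterwards.
spawn_first = replay(["spawn_qemu_dm", "store_vm_details"])

# Swapped order: write first, spawn afterwards.
write_first = replay(["store_vm_details", "spawn_qemu_dm"])
```

In the real system the two processes run concurrently, so the spawn-first order would fail only some of the time. That would be consistent with the report that the error shows up when Dom0 has two or more CPUs.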


Masami







Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On 8/11/06 6:13 am, "Masami Watanabe" <masami.watanabe@jp.fujitsu.com>
wrote:

> Yes, maybe. As I understand it, xend processes things in the following order.
> Am I misunderstanding?

Actually it's not the read of the 'vncpasswd' field that is failing but the
read of the 'vm' field. Maybe that happens later still.

I can't repro this right now as I can't reboot any guests at the moment. I
think it's been broken by the xen-api changes.

-- Keir



Re: Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On Wed, Nov 08, 2006 at 03:13:43PM +0900, Masami Watanabe wrote:

> Hi Keir,
>
> My explanation was insufficient.
>
> "xs_read(): uuid get error" is logged when the uuid cannot be read from
> xenstore in xenstore_read_vncpasswd@tools/ioemu/xenstore.c.
>
> c/s 12187 avoided this problem on guest reboot in many environments.
> In my environment, too, that change fixed the problem.
>
> However, the following problem has kept happening since then,
> and I think it needs to be fixed.
>
> [Xen-devel] VMX status report 12254:f8ffeb540ec1
> http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00288.html
> [Xen-devel] VMX status report 12217:20204db0891b
> http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00183.html
> > IA32/PAE/IA32E: Windows and Linux VMX domains may fail to be
> > created, the qemu-dm process is <defunct>, and the qemu log says
> > "xs_read(): uuid get error."
>
>
> I investigated it.
> I was able to reproduce the problem in an environment where two or more
> CPUs are allocated to Dom0.
> The results are as follows.
> - The uuid cannot be read in xenstore_read_vncpasswd() in qemu-dm.
> - The uuid can often be read after swapping the order of vm.initDomain()
>   and vm.storeVmDetails() in create()@XendDomainInfo.py.
> - When the read in qemu-dm is delayed, the uuid can always be read.
>
> From the above, I concluded that this is a timing problem between one
> process writing to xenstore and another process reading from it.
>
>
> > Is the passwd getting written after qemu
> > is started and hence racing the xs_read() in xenstored?
>
> Yes, maybe. As I understand it, xend processes things in the following order.
> Am I misunderstanding?
>
> create()@XendDomainInfo.py+135
>   start()
>     _initDomain()
>       _createDevices()
>         createDeviceModel(self)@image.py
>           os.spawnve()      ==============> start qemu-dm process
>     _storeVmDetails()
>       _writeVm()            ==============> write to xenstore
>       _setVmPermissions()

I've just put a patch in that ought to help. We can't reproduce this race
here, but perhaps you could give it a try for me.

diff -r 9a43cc89ae0a tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py Wed Nov 08 18:27:31 2006 +0000
+++ b/tools/python/xen/xend/XendDomainInfo.py Wed Nov 08 18:08:28 2006 +0000
@@ -678,6 +678,7 @@ class XendDomainInfo:
         t.remove()
         t.mkdir()
         t.set_permissions({ 'dom' : self.domid })
+        t.write('vm', self.vmpath)
 
     def _storeDomDetails(self):
         to_store = {


The /vm/<uuid>/vncpasswd node is written before the call to createDeviceModel,
in configVNC, but you need the /local/domain/<domid>/vm node to be present
too, and it's this one that isn't written until after qemu-dm is started.
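
The two-step lookup described here can be sketched in Python. This is an illustration, not the C code in tools/ioemu/xenstore.c: the dict stands in for xenstore, and the domid, uuid and password values are made up.

```python
from typing import Dict, Optional


def read_vncpasswd(store: Dict[str, str], domid: int) -> Optional[str]:
    """Follow /local/domain/<domid>/vm to the VM path, then read vncpasswd."""
    vm_path = store.get(f"/local/domain/{domid}/vm")
    if vm_path is None:
        # Step 1 failed: the link from the domain to its /vm/<uuid> path
        # is missing, so the password node cannot be located at all.
        raise LookupError("xs_read(): uuid get error")
    return store.get(f"{vm_path}/vncpasswd")


# State when qemu-dm starts without the patch (assumed values):
# the password node exists, but the domain's 'vm' node does not yet.
store = {"/vm/uuid-1/vncpasswd": "secret"}

try:
    read_vncpasswd(store, 1)
    early_error = None
except LookupError as exc:
    early_error = str(exc)

# What the added t.write('vm', self.vmpath) provides ahead of the spawn.
store["/local/domain/1/vm"] = "/vm/uuid-1"
password = read_vncpasswd(store, 1)
```

The first call fails even though the password is already stored, which is why writing vncpasswd early was not enough on its own.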

HTH,

Ewan.

Re: Re: [RFC] "xs_read(): uuid get error" of qemu-dm
Hi Ewan,

> I've just put a patch in that ought to help. We can't reproduce this
> race here, but perhaps you could give it a try for me.

Thank you very much for your patch.
It solved the "xs_read(): uuid get error" that occurred on c/s 12307
and earlier.

When will this patch be committed?

Masami





Re: Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On Thu, Nov 09, 2006 at 04:44:43PM +0900, Masami Watanabe wrote:

> Hi Ewan,
>
> > I've just put a patch in that ought to help. We can't reproduce this
> > race here, but perhaps you could give it a try for me.
>
> Thank you very much for your patch.
> It solved the "xs_read(): uuid get error" that occurred on c/s 12307
> and earlier.
>
> When will this patch be committed?

Great, thank you. It's on its way (I forgot to push it last night ;-)

Ewan.
