Mailing List Archive

PoD issue
George,

before diving deeply into the PoD code, I hope you have some idea that
might ease the debugging that's apparently going to be needed.

Following the comment immediately before p2m_pod_set_mem_target(),
there's an apparent inconsistency with the accounting: While the guest
in question properly balloons down to its intended setting (1G, with a
maxmem setting of 2G), the combination of the equations

d->arch.p2m->pod.entry_count == B - P
d->tot_pages == P + d->arch.p2m->pod.count

doesn't hold (provided I interpreted the meaning of B correctly - I
took this from the guest balloon driver's "Current allocation" report,
converted to pages); there's a difference of over 13000 pages.
Obviously, as soon as the guest uses up enough of its memory, it
will get crashed by the PoD code.

In two runs I did, the difference (and hence the number of entries
reported in the eventual crash message) was identical, implying to
me that this is not a simple race, but rather a systematic problem.

Even on the initial dump taken (when the guest was sitting at the
boot manager screen), there already appears to be a difference of
800 pages (it's my understanding that at this point the difference
between entries and cache should equal the difference between
maxmem and mem).
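
For reference, in 4 KiB pages (numbers derived from the 1G/2G settings
above, not taken from the attached log): maxmem - mem = 1 GiB = 262,144
pages, which is what entry_count - pod.count ought to come to at that
point, whereas the later difference of over 13,000 pages corresponds to
more than 52,000K (upwards of 50 MB) of guest memory left without
backing in the PoD cache.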

Does this ring any bells? Any hints on how to debug this? In any case,
I'm attaching the full log in case you want to look at it.

Jan
Re: PoD issue [ In reply to ]
What seems likely to me is that Xen (setting the PoD target) and the
balloon driver (allocating memory) have a different way of calculating
the amount of guest memory. So the balloon driver thinks it's done
handing memory back to Xen when there are still more outstanding PoD
entries than there are entries in the PoD memory pool. What balloon
driver are you using? Can you let me know max_mem, target, and what the
balloon driver has reached before calling it quits? (Although 13,000
pages is an awful lot to be off by: 54 MB...)

Re what "B" means, below is a rather long-winded explanation that will,
hopefully, be clear. :-)

Hmm, I'm not sure what the guest balloon driver's "Current allocation"
means either. :-) Does it mean, "Size of the current balloon" (i.e.,
starts at 0 and grows as the balloon driver allocates guest pages and
hands them back to Xen)? Or does it mean, "Amount of memory guest
currently has allocated to it" (i.e., starts at static_max and goes down
as the balloon driver allocates guest pages and hands them back to Xen)?

In the comment, B does *not* mean "the size of the balloon" (i.e., the
number of pages allocated from the guest OS by the balloon driver).
Rather, B means "Amount of memory the guest currently thinks it has
allocated to it." B starts at M at boot. The balloon driver will try
to make B=T by inflating the size of the balloon to M-T. Clear as mud?

Let's take a concrete example. Let's say static max is 400,000K
(100,000 pages).

M=100,000 and doesn't change. Let's say that T is 50,000.

At boot:
B == M == 100,000
P == 0
tot_pages == pod.count == 50,000
entry_count == 100,000

Thus the invariants hold:
* 0 <= P (0) <= T (50,000) <= B (100,000) <= M (100,000)
* entry_count (100,000) == B (100,000) - P (0)
* tot_pages (50,000) == P (0) + pod.count (50,000)

As the guest boots, pages will be populated from the cache; P increases,
but entry_count and pod.count decrease. Let's say that 25,000 pages get
allocated just before the balloon driver runs:

* 0 <= P (25,000) <= T (50,000) <= B (100,000) <= M (100,000)
* entry_count (75,000) == B (100,000) - P (25,000)
* tot_pages (50,000) == P (25,000) + pod.count (25,000)

Then the balloon driver runs. It should try to allocate 50,000 pages
total (M - T). For simplicity, let's say that the balloon driver only
allocates un-allocated pages. When it's halfway there, having allocated
25,000 pages, things look like this:

* 0 <= P (25,000) <= T (50,000) <= B (75,000) <= M (100,000)
* entry_count (50,000) == B (75,000) - P (25,000)
* tot_pages (50,000) == P (25,000) + pod.count (25,000)

Eventually the balloon driver should reach its new target of 50,000,
having allocated 50,000 pages:

* 0 <= P (25,000) <= T (50,000) <= B (50,000) <= M (100,000)
* entry_count (25,000) == B (50,000) - P (25,000)
* tot_pages (50,000) == P (25,000) + pod.count (25,000)
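
If it helps to replay the arithmetic, here is a small self-contained C
program encoding the four snapshots above together with the two
identities from the comment. It is only a sketch of the accounting
model - plain C, nothing Xen-specific - and the numbers are exactly the
ones used in this example:

#include <assert.h>
#include <stdio.h>

/* Accounting model from the comment before p2m_pod_set_mem_target():
 *   0 <= P <= T <= B <= M
 *   entry_count == B - P
 *   tot_pages   == P + pod.count
 * All values are in pages; the figures are the ones from the example. */
struct snapshot {
    const char *when;
    unsigned long M, T, B, P, entry_count, pod_count, tot_pages;
};

int main(void)
{
    const struct snapshot s[] = {
        { "at boot",           100000, 50000, 100000,     0, 100000, 50000, 50000 },
        { "before ballooning", 100000, 50000, 100000, 25000,  75000, 25000, 50000 },
        { "balloon halfway",   100000, 50000,  75000, 25000,  50000, 25000, 50000 },
        { "balloon at target", 100000, 50000,  50000, 25000,  25000, 25000, 50000 },
    };

    for (unsigned int i = 0; i < sizeof(s) / sizeof(s[0]); i++) {
        assert(s[i].P <= s[i].T && s[i].T <= s[i].B && s[i].B <= s[i].M);
        assert(s[i].entry_count == s[i].B - s[i].P);
        assert(s[i].tot_pages == s[i].P + s[i].pod_count);
        printf("%-18s: invariants hold\n", s[i].when);
    }
    return 0;
}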

The reason for the logic is so that we can do the Right Thing if, after
the balloon driver has ballooned halfway (to 75,000 pages), the target
is changed. If you're not changing the target before the balloon driver
has reached it, the distinction doesn't really matter in practice.

-George

Jan Beulich wrote:
> George,
>
> before diving deeply into the PoD code, I hope you have some idea that
> might ease the debugging that's apparently going to be needed.
>
> Following the comment immediately before p2m_pod_set_mem_target(),
> there's an apparent inconsistency with the accounting: While the guest
> in question properly balloons down to its intended setting (1G, with a
> maxmem setting of 2G), the combination of the equations
>
> d->arch.p2m->pod.entry_count == B - P
> d->tot_pages == P + d->arch.p2m->pod.count
>
> doesn't hold (provided I interpreted the meaning of B correctly - I
> took this from the guest balloon driver's "Current allocation" report,
> converted to pages); there's a difference of over 13000 pages.
> Obviously, as soon as the guest uses up enough of its memory, it
> will get crashed by the PoD code.
>
> In two runs I did, the difference (and hence the number of entries
> reported in the eventual crash message) was identical, implying to
> me that this is not a simple race, but rather a systematical problem.
>
> Even on the initial dump taken (when the guest was sitting at the
> boot manager screen), there already appears to be a difference of
> 800 pages (it's my understanding that at this point the difference
> between entries and cache should equal the difference between
> maxmem and mem).
>
> Does this ring any bells? Any hints how to debug this? In any case
> I'm attaching the full log in case you want to look at it.
>
> Jan
>


Re: PoD issue [ In reply to ]
>>> George Dunlap <george.dunlap@eu.citrix.com> 29.01.10 17:01 >>>
>What seems likely to me is that Xen (setting the PoD target) and the
>balloon driver (allocating memory) have a different way of calculating
>the amount of guest memory. So the balloon driver thinks it's done
>handing memory back to Xen when there are still more outstanding PoD
>entries than there are entries in the PoD memory pool. What balloon
>driver are you using?

The one from our forward-ported 2.6.32.x tree. I would suppose there
are no significant differences from the one in 2.6.18, but I wonder
how precise the totalram_pages value is that the driver (also in 2.6.18)
uses to initialize bs.current_pages. Given that with PoD it is now crucial
for the guest to balloon out enough memory, using an imprecise start
value is not acceptable anymore. The question, however, is what more
reliable data source one could use (given that any non-exported
kernel object is out of the question). And I wonder how this works
reliably for others...

>Can you let me know max_mem, target, and what the
>balloon driver has reached before calling it quits? (Although 13,000
>pages is an awful lot to be off by: 54 MB...)

The balloon driver reports the expected state: target and allocation
are 1G. But yes - how did I not pay attention to this - the balloon is
*far* from being 1G in size (and in fact the difference probably
matches those 54M quite closely).

Thanks a lot!

Jan


Re: PoD issue [ In reply to ]
PoD is not critical to balloon out guest memory. You can boot with mem
== maxmem and then balloon down afterwards just as you could before,
without involving PoD. (Or at least, you should be able to; if you
can't then it's a bug.) It's just that with PoD you can do something
you've always wanted to do but never knew it: boot with 1GiB with the
option of expanding up to 2GiB later. :-)

With the 54 megabyte difference: It's not like a GiB vs GB thing, is
it? (i.e., 2^30 vs 10^9?) The difference between 1GiB (2^30) and 1 GB
(10^9) is about 74 megs, or 18,000 pages.
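
(Spelled out: 2^30 - 10^9 = 73,741,824 bytes, i.e. about 74 MB, and
73,741,824 / 4096 comes to just over 18,000 pages of 4 KiB.)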

I guess that is a weakness of PoD in general: we can't control the guest
balloon driver, but we rely on it to have the same model of how to
translate "target" into # pages in the balloon as the PoD code.

-George

Jan Beulich wrote:
>>>> George Dunlap <george.dunlap@eu.citrix.com> 29.01.10 17:01 >>>
>>>>
>> What seems likely to me is that Xen (setting the PoD target) and the
>> balloon driver (allocating memory) have a different way of calculating
>> the amount of guest memory. So the balloon driver thinks it's done
>> handing memory back to Xen when there are still more outstanding PoD
>> entries than there are entries in the PoD memory pool. What balloon
>> driver are you using?
>>
>
> The one from our forward ported 2.6.32.x tree. I would suppose there
> are no significant differences here to the one in 2.6.18, but I wonder
> how precise the totalram_pages value is that the driver (also in 2.6.18)
> uses to initialize bs.current_pages. Given that with PoD it is now crucial
> for the guest to balloon out enough memory, using an imprecise start
> value is not acceptable anymore. The question however is what more
> reliable data source one could use (given that any non-exported
> kernel object is out of question). And I wonder how this works reliably
> for others...
>
>
>> Can you let me know max_mem, target, and what the
>> balloon driver has reached before calling it quits? (Although 13,000
>> pages is an awful lot to be off by: 54 MB...)
>>
>
> The balloon driver reports the expected state: target and allocation
> are 1G. But yes - how did I not pay attention to this - the balloon is
> *far* from being 1G in size (and in fact the difference is probably
> matching quite closely those 54M).
>
> Thanks a lot!
>
> Jan
>
>


Re: PoD issue [ In reply to ]
>>> George Dunlap 01/29/10 7:30 PM >>>
>PoD is not critical to balloon out guest memory. You can boot with mem
>== maxmem and then balloon down afterwards just as you could before,
>without involving PoD. (Or at least, you should be able to; if you
>can't then it's a bug.) It's just that with PoD you can do something
>you've always wanted to do but never knew it: boot with 1GiB with the
>option of expanding up to 2GiB later. :-)

Oh, no, that's not what I meant. What I really wanted to say is that
with PoD, a properly functioning balloon driver in the guest is crucial
for it to stay alive long enough.

>With the 54 megabyte difference: It's not like a GiB vs GB thing, is
>it? (i.e., 2^30 vs 10^9?) The difference between 1GiB (2^30) and 1 GB
>(10^9) is about 74 megs, or 18,000 pages.

No, that's not the problem. As I understand it now, the problem is
that totalram_pages (which the balloon driver bases its calculations
on) reflects all memory available after all bootmem allocations were
done (i.e. includes neither the static kernel image nor any memory
allocated before or from the bootmem allocator).

>I guess that is a weakness of PoD in general: we can't control the guest
>balloon driver, but we rely on it to have the same model of how to
>translate "target" into # pages in the balloon as the PoD code.

I think this isn't a weakness of PoD, but a design issue in the balloon
driver's xenstore interface: While a target value shown in or obtained
from the /proc and /sys interfaces naturally can be based on (and
reflect) any internal kernel state, the xenstore interface should only
use numbers in terms of the full memory amount given to the guest.
Hence a target value read from the memory/target node should be
adjusted before being put in relation to totalram_pages. And I think
this is a general misconception in the current implementation (i.e. it
should be corrected not only for the HVM case, but for the pv one
as well).
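
To illustrate the kind of adjustment meant here, below is a minimal
sketch. It is not actual balloon driver code; totalram_bias,
current_pages and the helper names are made up for the example, and
where the initial allocation figure would come from is deliberately
left open:

#include <stdio.h>

/* Sketch only - all names are illustrative; quantities are 4 KiB pages. */
static unsigned long totalram_bias;   /* pages invisible to totalram_pages */
static unsigned long current_pages;   /* driver's view of its allocation   */

/* At driver load, totalram_pages undercounts the guest's allocation by
 * the kernel image plus everything handed out before/by the bootmem
 * allocator; record that bias once instead of trusting totalram_pages
 * as the start value. */
static void balloon_init_sketch(unsigned long initial_allocation,
                                unsigned long totalram)
{
    totalram_bias = initial_allocation - totalram;
    current_pages = initial_allocation;
}

/* memory/target is written by the toolstack in "full memory given to
 * the guest" terms, so compare it against an allocation expressed the
 * same way; comparing it against totalram_pages directly makes the
 * driver stop ballooning totalram_bias pages too early. */
static long pages_still_to_balloon_out(unsigned long xenstore_target)
{
    return (long)current_pages - (long)xenstore_target;
}

int main(void)
{
    /* Figures in the spirit of this thread: maxmem = 2 GiB (524,288
     * pages), target = 1 GiB (262,144 pages), and "over 13000" pages
     * hidden from totalram_pages (the exact value from the log is not
     * reproduced here). */
    unsigned long maxmem = 524288, target = 262144, hidden = 13000;

    balloon_init_sketch(maxmem, maxmem - hidden);
    printf("with the bias recorded: %ld pages still to balloon out\n",
           pages_still_to_balloon_out(target));
    printf("ignoring the bias would stop %lu pages short\n", hidden);
    return 0;
}

Those 13,000-odd pages that never get ballooned out are exactly the
excess of PoD entries over the PoD cache that eventually gets the
guest crashed.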

The bad aspect of this is that it will require a fixed balloon driver
in any HVM guest that has maxmem>mem when the underlying Xen
gets updated to a version that supports PoD. I cannot, however,
see an OS- and OS-version-independent alternative (i.e. something
to be done in the PoD code or the tools).

Jan


Re: PoD issue [ In reply to ]
So did you track down where the math error is? Do we have a plan to fix
this going forward?
-George

Jan Beulich wrote:
>>>> George Dunlap 01/29/10 7:30 PM >>>
>>>>
>> PoD is not critical to balloon out guest memory. You can boot with mem
>> == maxmem and then balloon down afterwards just as you could before,
>> without involving PoD. (Or at least, you should be able to; if you
>> can't then it's a bug.) It's just that with PoD you can do something
>> you've always wanted to do but never knew it: boot with 1GiB with the
>> option of expanding up to 2GiB later. :-)
>>
>
> Oh, no, that's not what I meant. What I really wanted to say is that
> with PoD, a properly functioning balloon driver in the guest is crucial
> for it to stay alive long enough.
>
>
>> With the 54 megabyte difference: It's not like a GiB vs GB thing, is
>> it? (i.e., 2^30 vs 10^9?) The difference between 1GiB (2^30) and 1 GB
>> (10^9) is about 74 megs, or 18,000 pages.
>>
>
> No, that's not the problem. As I understand it now, the problem is
> that totalram_pages (which the balloon driver bases its calculations
> on) reflects all memory available after all bootmem allocations were
> done (i.e. includes neither the static kernel image nor any memory
> allocated before or from the bootmem allocator).
>
>
>> I guess that is a weakness of PoD in general: we can't control the guest
>> balloon driver, but we rely on it to have the same model of how to
>> translate "target" into # pages in the balloon as the PoD code.
>>
>
> I think this isn't a weakness of PoD, but a design issue in the balloon
> driver's xenstore interface: While a target value shown in or obtained
> from the /proc and /sys interfaces naturally can be based on (and
> reflect) any internal kernel state, the xenstore interface should only
> use numbers in terms of full memory amount given to the guest.
> Hence a target value read from the memory/target node should be
> adjusted before put in relation to totalram_pages. And I think this
> is a general misconception in the current implementation (i.e. it
> should be corrected not only for the HVM case, but for the pv one
> as well).
>
> The bad aspect of this is that it will require a fixed balloon driver
> in any HVM guest that has maxmem>mem when the underlying Xen
> gets updated to a version that supports PoD. I cannot, however,
> see an OS and OS-version independent alternative (i.e. something
> to be done in the PoD code or the tools).
>
> Jan
>
>


Re: PoD issue [ In reply to ]
It was in the balloon driver's interaction with xenstore - see 2.6.18 c/s
989.

I have to admit that I cannot see how this issue could have escaped
attention when the PoD code was introduced - any guest with PoD in use
and an unfixed balloon driver is bound to crash sooner or later
(implying the unfortunate effect of requiring an update of the pv
drivers in HVM guests when upgrading Xen from a PoD-incapable to a
PoD-capable version).

Jan

>>> George Dunlap <george.dunlap@eu.citrix.com> 03.02.10 19:42 >>>
So did you track down where the math error is? Do we have a plan to fix
this going forward?
-George

Jan Beulich wrote:
>>>> George Dunlap 01/29/10 7:30 PM >>>
>>>>
>> PoD is not critical to balloon out guest memory. You can boot with mem
>> == maxmem and then balloon down afterwards just as you could before,
>> without involving PoD. (Or at least, you should be able to; if you
>> can't then it's a bug.) It's just that with PoD you can do something
>> you've always wanted to do but never knew it: boot with 1GiB with the
>> option of expanding up to 2GiB later. :-)
>>
>
> Oh, no, that's not what I meant. What I really wanted to say is that
> with PoD, a properly functioning balloon driver in the guest is crucial
> for it to stay alive long enough.
>
>
>> With the 54 megabyte difference: It's not like a GiB vs GB thing, is
>> it? (i.e., 2^30 vs 10^9?) The difference between 1GiB (2^30) and 1 GB
>> (10^9) is about 74 megs, or 18,000 pages.
>>
>
> No, that's not the problem. As I understand it now, the problem is
> that totalram_pages (which the balloon driver bases its calculations
> on) reflects all memory available after all bootmem allocations were
> done (i.e. includes neither the static kernel image nor any memory
> allocated before or from the bootmem allocator).
>
>
>> I guess that is a weakness of PoD in general: we can't control the guest
>> balloon driver, but we rely on it to have the same model of how to
>> translate "target" into # pages in the balloon as the PoD code.
>>
>
> I think this isn't a weakness of PoD, but a design issue in the balloon
> driver's xenstore interface: While a target value shown in or obtained
> from the /proc and /sys interfaces naturally can be based on (and
> reflect) any internal kernel state, the xenstore interface should only
> use numbers in terms of full memory amount given to the guest.
> Hence a target value read from the memory/target node should be
> adjusted before put in relation to totalram_pages. And I think this
> is a general misconception in the current implementation (i.e. it
> should be corrected not only for the HVM case, but for the pv one
> as well).
>
> The bad aspect of this is that it will require a fixed balloon driver
> in any HVM guest that has maxmem>mem when the underlying Xen
> gets updated to a version that supports PoD. I cannot, however,
> see an OS and OS-version independent alternative (i.e. something
> to be done in the PoD code or the tools).
>
> Jan
>
>



Re: Re: PoD issue [ In reply to ]
Yeah, the OSS tree doesn't get the kind of regression testing it
really needs at the moment. I was using the OSS balloon drivers when
I implemented and submitted the PoD code last year. I didn't have any
trouble then, and I was definitely using up all of the memory. But I
haven't done any testing on OSS since then, basically.

-George

On Thu, Feb 4, 2010 at 12:17 AM, Jan Beulich <JBeulich@novell.com> wrote:
> It was in the balloon driver's interaction with xenstore - see 2.6.18 c/s
> 989.
>
> I have to admit that I cannot see how this issue could slip attention
> when the PoD code was introduced - any guest with PoD in use and
> an unfixed balloon driver is set to crash sooner or later (implying the
> unfortunate effect of requiring an update of the pv drivers in HVM
> guests when upgrading Xen from a PoD-incapable to a PoD-capable
> version).
>
> Jan
>
>>>> George Dunlap <george.dunlap@eu.citrix.com> 03.02.10 19:42 >>>
> So did you track down where the math error is?  Do we have a plan to fix
> this going forward?
>  -George
>
> Jan Beulich wrote:
>>>>> George Dunlap  01/29/10 7:30 PM >>>
>>>>>
>>> PoD is not critical to balloon out guest memory.  You can boot with mem
>>> == maxmem and then balloon down afterwards just as you could before,
>>> without involving PoD.  (Or at least, you should be able to; if you
>>> can't then it's a bug.)  It's just that with PoD you can do something
>>> you've always wanted to do but never knew it: boot with 1GiB with the
>>> option of expanding up to 2GiB later. :-)
>>>
>>
>> Oh, no, that's not what I meant. What I really wanted to say is that
>> with PoD, a properly functioning balloon driver in the guest is crucial
>> for it to stay alive long enough.
>>
>>
>>> With the 54 megabyte difference: It's not like a GiB vs GB thing, is
>>> it?  (i.e., 2^30 vs 10^9?)  The difference between 1GiB (2^30) and 1 GB
>>> (10^9) is about 74 megs, or 18,000 pages.
>>>
>>
>> No, that's not the problem. As I understand it now, the problem is
>> that totalram_pages (which the balloon driver bases its calculations
>> on) reflects all memory available after all bootmem allocations were
>> done (i.e. includes neither the static kernel image nor any memory
>> allocated before or from the bootmem allocator).
>>
>>
>>> I guess that is a weakness of PoD in general: we can't control the guest
>>> balloon driver, but we rely on it to have the same model of how to
>>> translate "target" into # pages in the balloon as the PoD code.
>>>
>>
>> I think this isn't a weakness of PoD, but a design issue in the balloon
>> driver's xenstore interface: While a target value shown in or obtained
>> from the /proc and /sys interfaces naturally can be based on (and
>> reflect) any internal kernel state, the xenstore interface should only
>> use numbers in terms of full memory amount given to the guest.
>> Hence a target value read from the memory/target node should be
>> adjusted before put in relation to totalram_pages. And I think this
>> is a general misconception in the current implementation (i.e. it
>> should be corrected not only for the HVM case, but for the pv one
>> as well).
>>
>> The bad aspect of this is that it will require a fixed balloon driver
>> in any HVM guest that has maxmem>mem when the underlying Xen
>> gets updated to a version that supports PoD. I cannot, however,
>> see an OS and OS-version independent alternative (i.e. something
>> to be done in the PoD code or the tools).
>>
>> Jan
>>
>>
>
>
>

Re: Re: PoD issue [ In reply to ]
On Thu, Feb 4, 2010 at 2:12 PM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> Yeah, the OSS tree doesn't get the kind of regression testing it
> really needs at the moment.  I was using the OSS balloon drivers when
> I implemented and submitted the PoD code last year.  I didn't have any
> trouble then, and I was definitely using up all of the memory.  But I
> haven't done any testing on OSS since then, basically.
>

Is it expected that booting HVM guests with maxmem > memory is
unstable? In testing 3.4.3-rc2 (kernel 2.6.18 c/s 993) I can easily
crash the guest and occasionally the entire server.


Keith Coleman

RE: Re: PoD issue [ In reply to ]
> On Thu, Feb 4, 2010 at 2:12 PM, George Dunlap
> <George.Dunlap@eu.citrix.com> wrote:
> > Yeah, the OSS tree doesn't get the kind of regression testing it
> > really needs at the moment.  I was using the OSS balloon drivers when
> > I implemented and submitted the PoD code last year.  I didn't have any
> > trouble then, and I was definitely using up all of the memory.  But I
> > haven't done any testing on OSS since then, basically.
> >
>
> Is it expected that booting HVM guests with maxmem > memory is
> unstable? In testing 3.4.3-rc2 (kernel 2.6.18 c/s 993) I can easily
> crash the guest and occasionally the entire server.

Obviously the platform should never crash, and that's very concerning.

Are you running a balloon driver in the guest? It's essential that you
do, because it needs to get in fairly early in the guest boot and
allocate the difference between maxmem and target memory. The
populate-on-demand code exists just to cope with things like the
memory scrubber running ahead of the balloon driver. If you're not
running a balloon driver, the guest is doomed to crash as soon as it
tries using more than target memory.

All of this requires coordination between the toolstack, PoD code, and
PV drivers so that sufficient memory gets ballooned out. I expect the
combination that has had the most testing is the XCP toolstack and the
Citrix PV Windows drivers.

Ian

Re: Re: PoD issue [ In reply to ]
>>> Keith Coleman <list.keith@scaltro.com> 19.02.10 01:03 >>>
>On Thu, Feb 4, 2010 at 2:12 PM, George Dunlap
><George.Dunlap@eu.citrix.com> wrote:
>> Yeah, the OSS tree doesn't get the kind of regression testing it
>> really needs at the moment. I was using the OSS balloon drivers when
>> I implemented and submitted the PoD code last year. I didn't have any
>> trouble then, and I was definitely using up all of the memory. But I
>> haven't done any testing on OSS since then, basically.
>>
>
>Is it expected that booting HVM guests with maxmem > memory is
>unstable? In testing 3.4.3-rc2 (kernel 2.6.18 c/s 993) I can easily
>crash the guest and occasionally the entire server.

Crashing the guest is expected if the guest doesn't have a fixed
balloon driver (i.e. the mentioned c/s would need to be in the
sources the pv drivers for the guest were built from).

Crashing the host is certainly unacceptable - please provide logs
thereof.

Jan


Re: Re: PoD issue [ In reply to ]
On Fri, Feb 19, 2010 at 1:53 AM, Ian Pratt <Ian.Pratt@eu.citrix.com> wrote:
>> On Thu, Feb 4, 2010 at 2:12 PM, George Dunlap
>> <George.Dunlap@eu.citrix.com> wrote:
>> > Yeah, the OSS tree doesn't get the kind of regression testing it
>> > really needs at the moment.  I was using the OSS balloon drivers when
>> > I implemented and submitted the PoD code last year.  I didn't have any
>> > trouble then, and I was definitely using up all of the memory.  But I
>> > haven't done any testing on OSS since then, basically.
>> >
>>
>> Is it expected that booting HVM guests with maxmem > memory is
>> unstable? In testing 3.4.3-rc2 (kernel 2.6.18 c/s 993) I can easily
>> crash the guest and occasionally the entire server.
>
> Obviously the platform should never crash, and that's very concerning.
>
> Are you running a balloon driver in the guest? It's essential that you do, because it needs to get in fairly early in the guest boot and allocate the difference between maxmem and target memory. The populate-on-demand code exists just to cope with things like the memory scrubber running ahead of the balloon driver. If you're not running a balloon driver the guest is doomed to crash as soon as it tries using more than target memory.
>
> All of this requires coordination between the tool stack, PoD code, and PV drivers so that sufficient memory gets ballooned out. I expect the combination that has had most testing is the XCP toolstack and Citrix PV windows drivers.
>

Initially I was using the XCP 0.1.1 WinPV drivers (Windows Server 2003
SP2) and the guest crashed when I tried to install software via the
emulated CD-ROM. Nothing about the crash was reported in the qemu log
file, and xend.log wasn't very helpful either, but here's the relevant
portion:
[2010-02-17 20:42:49 4253] DEBUG (DevController:139) Waiting for devices vtpm.
[2010-02-17 20:42:49 4253] INFO (XendDomain:1182) Domain win2 (30) unpaused.
[2010-02-17 20:48:05 4253] WARNING (XendDomainInfo:1888) Domain has
crashed: name=win2 id=30.
[2010-02-17 20:48:06 4253] DEBUG (XendDomainInfo:2734)
XendDomainInfo.destroy: domid=30
[2010-02-17 20:48:06 4253] DEBUG (XendDomainInfo:2209) Destroying device model

I unsuccessfully attempted the install several more times, then tried
copying files from the emulated CD, which also crashed the guest each
time. I wasn't even thinking about the fact that I had set maxmem/PoD,
so I blamed the XCP WinPV drivers and switched to GPLPV (0.10.0.138).
Same crashes with GPLPV. At this point I hadn't checked 'xm dmesg',
which is the only place the PoD/p2m error is reported, so I changed to
pure HVM mode and tried to copy the files from the emulated CD.
That's when the real trouble started.

The RDP and VNC connections to the guest froze, as did the SSH session
to the dom0. This server was also hosting 7 Linux PV guests. I could
ping the guests and partially load some of their websites, but couldn't
log in via SSH. I suspected that the HDDs were overloaded, causing disk
I/O to block the guests. I was on site, so I went to check the server
and was shocked to find no disk activity. The monitor output was blank
and I couldn't wake it up. Maybe the USB keyboard couldn't be
enumerated, because I couldn't even toggle Num Lock etc. after several
reconnections.

I power-cycled the host and checked the logs, but there was no evidence
of a crash other than one of the software RAID devices being unclean
on startup. Perhaps there was interesting data logged to 'xm dmesg' or
waiting to be written to disk at the time of the crash. I'm afraid
this server/motherboard is incapable of logging data to the serial
port; I've attempted to do so several times, both before and after
this crash.

Of course the simple fix is to remove maxmem from the domU config file
for the time being. Eventually people will use PoD on production
systems. Relying on the guest to have a solid balloon driver is
unacceptable: a guest could accidentally (or otherwise) remove the PV
drivers and bring down an entire host.
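
For anyone reproducing this, the combination that brings PoD into play
is simply memory < maxmem in the HVM domU config; the values below are
illustrative, along the lines of this thread, not taken from my actual
file:

memory = 1024    # boot/target memory in MiB
maxmem = 2048    # static maximum in MiB; memory < maxmem means the guest
                 # boots populate-on-demand and must run a working
                 # balloon driver early on

# Workaround for now: drop maxmem (or set maxmem = memory) so PoD is
# never used.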

When I can free up a server with serial logging for testing I will try
to reproduce this crash.


Keith Coleman

Re: Re: PoD issue [ In reply to ]
On Fri, Feb 19, 2010 at 08:19:15AM +0000, Jan Beulich wrote:
> >>> Keith Coleman <list.keith@scaltro.com> 19.02.10 01:03 >>>
> >On Thu, Feb 4, 2010 at 2:12 PM, George Dunlap
> ><George.Dunlap@eu.citrix.com> wrote:
> >> Yeah, the OSS tree doesn't get the kind of regression testing it
> >> really needs at the moment. I was using the OSS balloon drivers when
> >> I implemented and submitted the PoD code last year. I didn't have any
> >> trouble then, and I was definitely using up all of the memory. But I
> >> haven't done any testing on OSS since then, basically.
> >>
> >
> >Is it expected that booting HVM guests with maxmem > memory is
> >unstable? In testing 3.4.3-rc2 (kernel 2.6.18 c/s 993) I can easily
> >crash the guest and occasionally the entire server.
>
> Crashing the guest is expected if the guest doesn't have a fixed
> balloon driver (i.e. the mentioned c/s would need to be in the
> sources the pv drivers for the guest were built from).
>
> Crashing the host is certainly unacceptable - please provide logs
> thereof.
>

Was this resolved? Someone was complaining recently that maxmem != memory
crashes his Xen host..

-- Pasi


Re: Re: PoD issue [ In reply to ]
>>> Pasi Kärkkäinen 06/04/10 5:03 PM >>>
>On Fri, Feb 19, 2010 at 08:19:15AM +0000, Jan Beulich wrote:
>> >>> Keith Coleman 19.02.10 01:03 >>>
>> >On Thu, Feb 4, 2010 at 2:12 PM, George Dunlap
>> > wrote:
>> >> Yeah, the OSS tree doesn't get the kind of regression testing it
>> >> really needs at the moment. I was using the OSS balloon drivers when
>> >> I implemented and submitted the PoD code last year. I didn't have any
>> >> trouble then, and I was definitely using up all of the memory. But I
>> >> haven't done any testing on OSS since then, basically.
>> >>
>> >
>> >Is it expected that booting HVM guests with maxmem > memory is
>> >unstable? In testing 3.4.3-rc2 (kernel 2.6.18 c/s 993) I can easily
>> >crash the guest and occasionally the entire server.
>>
>> Crashing the guest is expected if the guest doesn't have a fixed
>> balloon driver (i.e. the mentioned c/s would need to be in the
>> sources the pv drivers for the guest were built from).
>>
>> Crashing the host is certainly unacceptable - please provide logs
>> thereof.
>>
>
>Was this resolved? Someone was complaining recently that maxmem != memory
>crashes his Xen host..

I don't recall ever having seen logs of a host crash of this sort,
so if this ever was the case and no-one else fixed it, I would
believe it still to be an issue.

Jan


Re: Re: PoD issue [ In reply to ]
Jan Beulich wrote:
>> Was this resolved? Someone was complaining recently that maxmem != memory
>> crashes his Xen host..
>>
>
> I don't recall ever having seen logs of a host crash of this sort,
> so if this ever was the case and no-one else fixed it, I would
> believe it still to be an issue.
>
>
There have been a number of fixes to the PoD code, so it's possible that
it has been fixed. I'll see if our testing team has time to add "Boot
memory < maxmem w/o balloon driver" to our testing matrix and see if we
can get a host crash.

-George

Re: Re: PoD issue [ In reply to ]
On Mon, Jun 07, 2010 at 10:28:11AM +0100, George Dunlap wrote:
> Jan Beulich wrote:
>>> Was this resolved? Someone was complaining recently that maxmem != memory
>>> crashes his Xen host..
>>>
>>
>> I don't recall ever having seen logs of a host crash of this sort,
>> so if this ever was the case and no-one else fixed it, I would
>> believe it still to be an issue.
>>
>>
> There have been a number of fixes to the PoD code, so it's possible that
> it has been fixed. I'll see if our testing team has time to add "Boot
> memory < maxmem w/o balloon driver" to our testing matrix and see if we
> can get a host crash.
>

OK, good. There have been many queries/problems about PoD lately,
so it's good to get that tested.

-- Pasi

