Mailing List Archive

[PATCH 2 of 5] xenpaging: map gfn before nomination
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1323189147 -3600
# Node ID 96d3292797d861592a7d2d3840f371ec719775a9
# Parent b733498b351a8650b2d952aa56725f63d49c1889
xenpaging: map gfn before nomination

If the gfn is mapped before nomination, all special cases in do_mmu_update()
for paged gfns can be removed. If a gfn is actually in any of the paging
states the caller has to try again.

Bump interface age.

Signed-off-by: Olaf Hering <olaf@aepfle.de>

diff -r b733498b351a -r 96d3292797d8 tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -576,7 +576,7 @@ static int xenpaging_evict_page(xenpagin

DECLARE_DOMCTL;

- /* Map page */
+ /* Map page to get a handle */
gfn = victim->gfn;
ret = -EFAULT;
page = xc_map_foreign_pages(xch, paging->mem_event.domain_id,
@@ -587,16 +587,21 @@ static int xenpaging_evict_page(xenpagin
goto out;
}

+ /* Nominate the page */
+ ret = xc_mem_paging_nominate(xch, paging->mem_event.domain_id, gfn);
+ if ( ret != 0 )
+ goto out;
+
/* Copy page */
ret = write_page(fd, page, i);
if ( ret != 0 )
{
PERROR("Error copying page %lx", victim->gfn);
- munmap(page, PAGE_SIZE);
goto out;
}

munmap(page, PAGE_SIZE);
+ page = NULL;

/* Tell Xen to evict page */
ret = xc_mem_paging_evict(xch, paging->mem_event.domain_id,
@@ -615,6 +620,8 @@ static int xenpaging_evict_page(xenpagin
paging->num_paged_out++;

out:
+ if (page)
+ munmap(page, PAGE_SIZE);
return ret;
}

@@ -738,14 +745,11 @@ static int evict_victim(xenpaging_t *pag
ret = -EINTR;
goto out;
}
- ret = xc_mem_paging_nominate(xch, paging->mem_event.domain_id, victim->gfn);
- if ( ret == 0 )
- ret = xenpaging_evict_page(paging, victim, fd, i);
- else
+ ret = xenpaging_evict_page(paging, victim, fd, i);
+ if ( ret && j++ % 1000 == 0 )
{
- if ( j++ % 1000 == 0 )
- if ( xenpaging_mem_paging_flush_ioemu_cache(paging) )
- PERROR("Error flushing ioemu cache");
+ if ( xenpaging_mem_paging_flush_ioemu_cache(paging) )
+ PERROR("Error flushing ioemu cache");
}
}
while ( ret );
diff -r b733498b351a -r 96d3292797d8 xen/arch/x86/mm/p2m.c
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -723,7 +723,7 @@ set_shared_p2m_entry(struct domain *d, u
* - the gfn is backed by a mfn
* - the p2mt of the gfn is pageable
* - the mfn is not used for IO
- * - the mfn has exactly one user and has no special meaning
+ * - the mfn has exactly two users (guest+pager) and has no special meaning
*
* Once the p2mt is changed the page is readonly for the guest. On success the
* pager can write the page contents to disk and later evict the page.
@@ -758,7 +758,7 @@ int p2m_mem_paging_nominate(struct domai
/* Check page count and type */
page = mfn_to_page(mfn);
if ( (page->count_info & (PGC_count_mask | PGC_allocated)) !=
- (1 | PGC_allocated) )
+ (2 | PGC_allocated) )
goto out;

if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_none )
@@ -785,7 +785,7 @@ int p2m_mem_paging_nominate(struct domai
* freed:
* - the gfn is backed by a mfn
* - the gfn was nominated
- * - the mfn has still exactly one user and has no special meaning
+ * - the mfn has still exactly one user (the guest) and has no special meaning
*
* After successful nomination some other process could have mapped the page. In
* this case eviction can not be done. If the gfn was populated before the pager
diff -r b733498b351a -r 96d3292797d8 xen/include/public/mem_event.h
--- a/xen/include/public/mem_event.h
+++ b/xen/include/public/mem_event.h
@@ -49,7 +49,7 @@
#define MEM_EVENT_REASON_INT3 5 /* int3 was hit: gla/gfn are RIP */
#define MEM_EVENT_REASON_SINGLESTEP 6 /* single step was invoked: gla/gfn are RIP */

-#define MEM_EVENT_PAGING_AGE 1UL /* Number distinguish the mem_paging <-> pager interface */
+#define MEM_EVENT_PAGING_AGE 2UL /* Number distinguish the mem_paging <-> pager interface */

typedef struct mem_event_shared_page {
uint32_t port;

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination [ In reply to ]
> Date: Tue, 06 Dec 2011 18:07:03 +0100
> From: Olaf Hering <olaf@aepfle.de>
> To: xen-devel@lists.xensource.com
> Subject: [Xen-devel] [PATCH 2 of 5] xenpaging: map gfn before
> nomination
> Message-ID: <96d3292797d861592a7d.1323191223@probook.site>
> Content-Type: text/plain; charset="us-ascii"
>
> # HG changeset patch
> # User Olaf Hering <olaf@aepfle.de>
> # Date 1323189147 -3600
> # Node ID 96d3292797d861592a7d2d3840f371ec719775a9
> # Parent b733498b351a8650b2d952aa56725f63d49c1889
> xenpaging: map gfn before nomination
>
> If the gfn is mapped before nomination, all special cases in
> do_mmu_update()
> for paged gfns can be removed. If a gfn is actually in any of the paging
> states the caller has to try again.
>
> Bump interface age.

Ouch! You're absolutely tying the user space pager with the underlying xen
paging code. Most of your patches change the tools and the hypervisor in
lockstep.

These things are independent.

Let me clarify this: I'm working on a pager. I rely on the current
interface. I'm not opposed to changes that optimize the paging code in
Xen, but I am not in favor of changes that change the interface in a
rather gratuitous manner.

Patch 4 in your series is one such case. Short-cutting the state machine:
great. But what is the gain for the hypervisor in *not* sending the
EVICT_FAIL event. It's a good thing. It keeps the same interface to
user-space. Xenpaging may not need it, but the Xen paging code does not
exist solely for xenpaging.

Patch 5 also ties xenpaging and xen code together. I don't mind this
patch, but it should definitely be split in two.

Patch 1 is also a problem. I don't buy that we can number interfaces, and
then only support the tip ("Sorry, ENOEXEC, dig out an older hypervisor").
Either we bite the bullet of some level of backwards compatibility with
all its messiness, or we just keep it in flux until there is convergence
on a major "barrier".

Patch 3 makes a lot of sense.

This current patch 2 is also rather gratuitous. The need to map before
nominate feels superfluous. If Xen cannot deal with nomination requests on
pages that have not been mapped, then we've broken Xen.

Anyway, I hope my message gets across. I'm not trying to organize a
blockade on your code. I'm trying to get to the best possible hypervisor
paging support we can.

Thanks,
Andres

>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
>
> diff -r b733498b351a -r 96d3292797d8 tools/xenpaging/xenpaging.c
> --- a/tools/xenpaging/xenpaging.c
> +++ b/tools/xenpaging/xenpaging.c
> @@ -576,7 +576,7 @@ static int xenpaging_evict_page(xenpagin
>
> DECLARE_DOMCTL;
>
> - /* Map page */
> + /* Map page to get a handle */
> gfn = victim->gfn;
> ret = -EFAULT;
> page = xc_map_foreign_pages(xch, paging->mem_event.domain_id,
> @@ -587,16 +587,21 @@ static int xenpaging_evict_page(xenpagin
> goto out;
> }
>
> + /* Nominate the page */
> + ret = xc_mem_paging_nominate(xch, paging->mem_event.domain_id, gfn);
> + if ( ret != 0 )
> + goto out;
> +
> /* Copy page */
> ret = write_page(fd, page, i);
> if ( ret != 0 )
> {
> PERROR("Error copying page %lx", victim->gfn);
> - munmap(page, PAGE_SIZE);
> goto out;
> }
>
> munmap(page, PAGE_SIZE);
> + page = NULL;
>
> /* Tell Xen to evict page */
> ret = xc_mem_paging_evict(xch, paging->mem_event.domain_id,
> @@ -615,6 +620,8 @@ static int xenpaging_evict_page(xenpagin
> paging->num_paged_out++;
>
> out:
> + if (page)
> + munmap(page, PAGE_SIZE);
> return ret;
> }
>
> @@ -738,14 +745,11 @@ static int evict_victim(xenpaging_t *pag
> ret = -EINTR;
> goto out;
> }
> - ret = xc_mem_paging_nominate(xch, paging->mem_event.domain_id,
> victim->gfn);
> - if ( ret == 0 )
> - ret = xenpaging_evict_page(paging, victim, fd, i);
> - else
> + ret = xenpaging_evict_page(paging, victim, fd, i);
> + if ( ret && j++ % 1000 == 0 )
> {
> - if ( j++ % 1000 == 0 )
> - if ( xenpaging_mem_paging_flush_ioemu_cache(paging) )
> - PERROR("Error flushing ioemu cache");
> + if ( xenpaging_mem_paging_flush_ioemu_cache(paging) )
> + PERROR("Error flushing ioemu cache");
> }
> }
> while ( ret );
> diff -r b733498b351a -r 96d3292797d8 xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -723,7 +723,7 @@ set_shared_p2m_entry(struct domain *d, u
> * - the gfn is backed by a mfn
> * - the p2mt of the gfn is pageable
> * - the mfn is not used for IO
> - * - the mfn has exactly one user and has no special meaning
> + * - the mfn has exactly two users (guest+pager) and has no special
> meaning
> *
> * Once the p2mt is changed the page is readonly for the guest. On
> success the
> * pager can write the page contents to disk and later evict the page.
> @@ -758,7 +758,7 @@ int p2m_mem_paging_nominate(struct domai
> /* Check page count and type */
> page = mfn_to_page(mfn);
> if ( (page->count_info & (PGC_count_mask | PGC_allocated)) !=
> - (1 | PGC_allocated) )
> + (2 | PGC_allocated) )
> goto out;
>
> if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_none )
> @@ -785,7 +785,7 @@ int p2m_mem_paging_nominate(struct domai
> * freed:
> * - the gfn is backed by a mfn
> * - the gfn was nominated
> - * - the mfn has still exactly one user and has no special meaning
> + * - the mfn has still exactly one user (the guest) and has no special
> meaning
> *
> * After successful nomination some other process could have mapped the
> page. In
> * this case eviction can not be done. If the gfn was populated before
> the pager
> diff -r b733498b351a -r 96d3292797d8 xen/include/public/mem_event.h
> --- a/xen/include/public/mem_event.h
> +++ b/xen/include/public/mem_event.h
> @@ -49,7 +49,7 @@
> #define MEM_EVENT_REASON_INT3 5 /* int3 was hit: gla/gfn are
> RIP */
> #define MEM_EVENT_REASON_SINGLESTEP 6 /* single step was invoked:
> gla/gfn are RIP */
>
> -#define MEM_EVENT_PAGING_AGE 1UL /* Number distinguish the
> mem_paging <-> pager interface */
> +#define MEM_EVENT_PAGING_AGE 2UL /* Number distinguish the
> mem_paging <-> pager interface */
>
> typedef struct mem_event_shared_page {
> uint32_t port;
>
>
>
> ------------------------------
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
>
> End of Xen-devel Digest, Vol 82, Issue 93
> *****************************************
>



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination [ In reply to ]
On Tue, Dec 06, Andres Lagar-Cavilla wrote:

> Ouch! You're absolutely tying the user space pager with the underlying xen
> paging code. Most of your patches change the tools and the hypervisor in
> lockstep.

Yes, pager and hypervisor are bound closely together.

> Patch 4 in your series is one such case. Short-cutting the state machine:
> great. But what is the gain for the hypervisor in *not* sending the
> EVICT_FAIL event. It's a good thing. It keeps the same interface to
> user-space. Xenpaging may not need it, but the Xen paging code does not
> exist solely for xenpaging.

What IS the need to send yet another request? It adds just overhead for
no obvious need. Please show the code that will benefit from the extra
EVICT_FAIL message.

> Patch 1 is also a problem. I don't buy that we can number interfaces, and
> then only support the tip ("Sorry, ENOEXEC, dig out an older hypervisor").
> Either we bite the bullet of some level of backwards compatibility with
> all its messiness, or we just keep it in flux until there is convergence
> on a major "barrier".

An out-of-tree pager can very well try to support a number of
hypervisors, perhaps patch #1 can be extended to report the age in some
way. But why would the upstream pager care about (development!) state
from the past?


> This current patch 2 is also rather gratuitous. The need to map before
> nominate feels superfluous. If Xen cannot deal with nomination requests on
> pages that have not been mapped, then we've broken Xen.

Why? It was broken (or rather: not optimal) in the first place.
Any attempt to map a gfn in any paging state has to return -ENOENT
right away. The pager is certainly a special case, and thanks to this
change, and your change for the page-in part, the special handling can now
get removed.

> Anyway, I hope my message gets across. I'm not trying to organize a
> blockade on your code. I'm trying to get to the best possible hypervisor
> paging support we can.

Great.
Then please improve tools/xenpaging/, thats the cure for NIH.

Olaf

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination [ In reply to ]
At 10:27 +0100 on 07 Dec (1323253657), Olaf Hering wrote:
> On Tue, Dec 06, Andres Lagar-Cavilla wrote:
>
> > Ouch! You're absolutely tying the user space pager with the underlying xen
> > paging code. Most of your patches change the tools and the hypervisor in
> > lockstep.
>
> Yes, pager and hypervisor are bound closely together.
>
> > Patch 4 in your series is one such case. Short-cutting the state machine:
> > great. But what is the gain for the hypervisor in *not* sending the
> > EVICT_FAIL event. It's a good thing. It keeps the same interface to
> > user-space. Xenpaging may not need it, but the Xen paging code does not
> > exist solely for xenpaging.
>
> What IS the need to send yet another request? It adds just overhead for
> no obvious need. Please show the code that will benefit from the extra
> EVICT_FAIL message.

While in general it's good to keep backward compatibility, I don't think
this interface has ever had a stable, usable release (that didn't
involve the consumer at least carrying hypervisor patches of their own)
so I think it's OK to make fairly large changes in it.

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination [ In reply to ]
On Wed, Dec 07, Tim Deegan wrote:

> While in general it's good to keep backward compatibility, I don't think
> this interface has ever had a stable, usable release (that didn't
> involve the consumer at least carrying hypervisor patches of their own)
> so I think it's OK to make fairly large changes in it.

Yes, that what I forgot to mention. 4.0 worked somehow, 4.1 was
unfortunately broken.
Hopefully everything will be ready by 4.2-rc1.

Olaf

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination [ In reply to ]
> On Tue, Dec 06, Andres Lagar-Cavilla wrote:
>
>> Ouch! You're absolutely tying the user space pager with the underlying
>> xen
>> paging code. Most of your patches change the tools and the hypervisor in
>> lockstep.
>
> Yes, pager and hypervisor are bound closely together.
>
>> Patch 4 in your series is one such case. Short-cutting the state
>> machine:
>> great. But what is the gain for the hypervisor in *not* sending the
>> EVICT_FAIL event. It's a good thing. It keeps the same interface to
>> user-space. Xenpaging may not need it, but the Xen paging code does not
>> exist solely for xenpaging.
>
> What IS the need to send yet another request? It adds just overhead for
> no obvious need. Please show the code that will benefit from the extra
> EVICT_FAIL message.

I have mentioned the reasoning in a previous email.

We need to look at the big picture when it comes to overhead. Sending an
event is, what, nanoseconds? -- particularly when guest events are
guaranteed to succeed and not perhaps go to sleep on a wait queue.

The I/O involved is an overhead orders of magnitude much larger. When you
have hundreds (thousands?) of VMs pounding your storage fabric, it becomes
germane to avoid unnecessary I/O.

A pager that performs a batch of nominations, I/O, and evictions, is not
at all unreasonable. It will benefit greatly from early "do not evict"
notices.

The cost/benefit equation here is drastically skewed in favor of one
event, versus one I/O.
>
>> Patch 1 is also a problem. I don't buy that we can number interfaces,
>> and
>> then only support the tip ("Sorry, ENOEXEC, dig out an older
>> hypervisor").
>> Either we bite the bullet of some level of backwards compatibility with
>> all its messiness, or we just keep it in flux until there is convergence
>> on a major "barrier".
>
> An out-of-tree pager can very well try to support a number of
> hypervisors, perhaps patch #1 can be extended to report the age in some
> way. But why would the upstream pager care about (development!) state
> from the past?
>
>
>> This current patch 2 is also rather gratuitous. The need to map before
>> nominate feels superfluous. If Xen cannot deal with nomination requests
>> on
>> pages that have not been mapped, then we've broken Xen.
>
> Why? It was broken (or rather: not optimal) in the first place.
> Any attempt to map a gfn in any paging state has to return -ENOENT
> right away. The pager is certainly a special case, and thanks to this
> change, and your change for the page-in part, the special handling can now
> get removed.

If you're pushing for this change because you want to get rid from a few
lines of (correct) error checking in the hypervisor, that's not a good
reason imho.

If you have a more compelling reason, sure.

>
>> Anyway, I hope my message gets across. I'm not trying to organize a
>> blockade on your code. I'm trying to get to the best possible hypervisor
>> paging support we can.
>
> Great.
> Then please improve tools/xenpaging/, thats the cure for NIH.

Woaah. Please read my email carefully. This is not about NIH. This about
changes that bring no tangible benefits and break other people's code.

Andres
>
> Olaf
>



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination [ In reply to ]
> At 10:27 +0100 on 07 Dec (1323253657), Olaf Hering wrote:
>> On Tue, Dec 06, Andres Lagar-Cavilla wrote:
>>
>> > Ouch! You're absolutely tying the user space pager with the underlying
>> xen
>> > paging code. Most of your patches change the tools and the hypervisor
>> in
>> > lockstep.
>>
>> Yes, pager and hypervisor are bound closely together.
>>
>> > Patch 4 in your series is one such case. Short-cutting the state
>> machine:
>> > great. But what is the gain for the hypervisor in *not* sending the
>> > EVICT_FAIL event. It's a good thing. It keeps the same interface to
>> > user-space. Xenpaging may not need it, but the Xen paging code does
>> not
>> > exist solely for xenpaging.
>>
>> What IS the need to send yet another request? It adds just overhead for
>> no obvious need. Please show the code that will benefit from the extra
>> EVICT_FAIL message.
>
> While in general it's good to keep backward compatibility, I don't think
> this interface has ever had a stable, usable release (that didn't
> involve the consumer at least carrying hypervisor patches of their own)
> so I think it's OK to make fairly large changes in it.
Would you agree that targeting 4.2 as a reasonable long-term interface is
a. feasible? b. worthy?
Andres
>
> Tim.
>



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination [ In reply to ]
At 08:23 -0800 on 07 Dec (1323246231), Andres Lagar-Cavilla wrote:
> > At 10:27 +0100 on 07 Dec (1323253657), Olaf Hering wrote:
> >> On Tue, Dec 06, Andres Lagar-Cavilla wrote:
> >>
> >> > Ouch! You're absolutely tying the user space pager with the underlying
> >> xen
> >> > paging code. Most of your patches change the tools and the hypervisor
> >> in
> >> > lockstep.
> >>
> >> Yes, pager and hypervisor are bound closely together.
> >>
> >> > Patch 4 in your series is one such case. Short-cutting the state
> >> machine:
> >> > great. But what is the gain for the hypervisor in *not* sending the
> >> > EVICT_FAIL event. It's a good thing. It keeps the same interface to
> >> > user-space. Xenpaging may not need it, but the Xen paging code does
> >> not
> >> > exist solely for xenpaging.
> >>
> >> What IS the need to send yet another request? It adds just overhead for
> >> no obvious need. Please show the code that will benefit from the extra
> >> EVICT_FAIL message.
> >
> > While in general it's good to keep backward compatibility, I don't think
> > this interface has ever had a stable, usable release (that didn't
> > involve the consumer at least carrying hypervisor patches of their own)
> > so I think it's OK to make fairly large changes in it.
> Would you agree that targeting 4.2 as a reasonable long-term interface is
> a. feasible? b. worthy?

Yes, and yes.

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel