Mailing List Archive

[PATCH] Fix >4G i386 PAE grant table interface
It has been discovered that i386 boxes with more than 4G of RAM would
randomly crash. It was traced to the interface of blktap using
gnttab_set_map_op.

It would pass in the 64 bit pte entry, but the gnttab_set_map_op would
only take a 32 bit (on i386) unsigned long as a parameter. So we lose
the top 32bits.

Luckily! The kernel/HV ABI used a uint64_t as the variable to pass the
address. So this does *NOT* break the current kernel/HV ABI.

But after the HV grabs the 64 bit address from the guest, it too calls a
function that uses a unsigned long (32bits on i386) to pass that address
with. So the HV side also chops off the top 64 bits of the variable.


This patch updates both the linux-2.6-sparse tree and the xen HV to use
uint64_t instead of unsigned long for those particular functions. This
patch has been tested on RHEL5 Beta on a box with 12G i386.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
On 2/11/06 16:53, "Steven Rostedt" <srostedt@redhat.com> wrote:

> This patch updates both the linux-2.6-sparse tree and the xen HV to use
> uint64_t instead of unsigned long for those particular functions. This
> patch has been tested on RHEL5 Beta on a box with 12G i386.
>
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>

Nasty bug. At least it affects only blktap. Thanks.

-- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
>>> Steven Rostedt <srostedt@redhat.com> 02.11.06 17:53 >>>
>It has been discovered that i386 boxes with more than 4G of RAM would
>randomly crash. It was traced to the interface of blktap using
>gnttab_set_map_op.
>
>It would pass in the 64 bit pte entry, but the gnttab_set_map_op would
>only take a 32 bit (on i386) unsigned long as a parameter. So we lose
>the top 32bits.

Could you use maddr_t here rather than uint64_t? For non-PAE i386
Linux, especially when using CONFIG_REGPARM, adding a useless
argument slot seems wasteful...

Keir - will this go into 3.0.3?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 02.11.06 19:16 >>>
>On 2/11/06 16:53, "Steven Rostedt" <srostedt@redhat.com> wrote:
>
>> This patch updates both the linux-2.6-sparse tree and the xen HV to use
>> uint64_t instead of unsigned long for those particular functions. This
>> patch has been tested on RHEL5 Beta on a box with 12G i386.
>>
>> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
>
>Nasty bug. At least it affects only blktap. Thanks.

Looking at this it would seem to me that the second call to
gnttab_set_unmap_op in blkltap.c is missing the
GNTMAP_contains_pte flag, affecting auto_translated_physmap
guests.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: head-2006-10-30/drivers/xen/blktap/blktap.c
===================================================================
--- head-2006-10-30.orig/drivers/xen/blktap/blktap.c 2006-10-26 12:10:54.000000000 +0200
+++ head-2006-10-30/drivers/xen/blktap/blktap.c 2006-11-03 10:00:44.000000000 +0100
@@ -908,8 +908,10 @@ static void fast_flush_area(pending_req_
return;
}

- gnttab_set_unmap_op(&unmap[invcount],
- ptep, GNTMAP_host_map,
+ gnttab_set_unmap_op(&unmap[invcount], ptep,
+ GNTMAP_host_map
+ | GNTMAP_application_map
+ | GNTMAP_contains_pte,
khandle->user);
invcount++;
}


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
>>> Steven Rostedt <srostedt@redhat.com> 02.11.06 17:53 >>>
>It has been discovered that i386 boxes with more than 4G of RAM would
>randomly crash. It was traced to the interface of blktap using
>gnttab_set_map_op.
>
>It would pass in the 64 bit pte entry, but the gnttab_set_map_op would
>only take a 32 bit (on i386) unsigned long as a parameter. So we lose
>the top 32bits.
>
>Luckily! The kernel/HV ABI used a uint64_t as the variable to pass the
>address. So this does *NOT* break the current kernel/HV ABI.
>
>But after the HV grabs the 64 bit address from the guest, it too calls a
>function that uses a unsigned long (32bits on i386) to pass that address
>with. So the HV side also chops off the top 64 bits of the variable.
>
>
>This patch updates both the linux-2.6-sparse tree and the xen HV to use
>uint64_t instead of unsigned long for those particular functions. This
>patch has been tested on RHEL5 Beta on a box with 12G i386.
>
>Signed-off-by: Steven Rostedt <srostedt@redhat.com>

After integrating the kernel part of this patch in my tree, I found that
almost the whole kernel got rebuilt.
include/asm-{i386,x86_64}/mach-xen/asm/fixmap.h were needlessly
including include/xen/gnttab.h. Removing this made necessary explicit
inclusion of that header in tpm_xen.c, the build of which should not
have succeeded on non-x86 architectures before.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: head-2006-10-30/drivers/char/tpm/tpm_xen.c
===================================================================
--- head-2006-10-30.orig/drivers/char/tpm/tpm_xen.c 2006-10-30 13:36:39.000000000 +0100
+++ head-2006-10-30/drivers/char/tpm/tpm_xen.c 2006-11-03 10:41:32.000000000 +0100
@@ -41,6 +41,7 @@
#include <xen/evtchn.h>
#include <xen/interface/grant_table.h>
#include <xen/interface/io/tpmif.h>
+#include <xen/gnttab.h>
#include <xen/xenbus.h>
#include "tpm.h"
#include "tpm_vtpm.h"
Index: head-2006-10-30/include/asm-i386/mach-xen/asm/fixmap.h
===================================================================
--- head-2006-10-30.orig/include/asm-i386/mach-xen/asm/fixmap.h 2006-08-28 10:57:30.000000000 +0200
+++ head-2006-10-30/include/asm-i386/mach-xen/asm/fixmap.h 2006-11-03 10:30:39.000000000 +0100
@@ -27,7 +27,6 @@ extern unsigned long __FIXADDR_TOP;
#include <asm/acpi.h>
#include <asm/apicdef.h>
#include <asm/page.h>
-#include <xen/gnttab.h>
#ifdef CONFIG_HIGHMEM
#include <linux/threads.h>
#include <asm/kmap_types.h>
Index: head-2006-10-30/include/asm-x86_64/mach-xen/asm/fixmap.h
===================================================================
--- head-2006-10-30.orig/include/asm-x86_64/mach-xen/asm/fixmap.h 2006-08-28 10:57:30.000000000 +0200
+++ head-2006-10-30/include/asm-x86_64/mach-xen/asm/fixmap.h 2006-11-03 10:30:50.000000000 +0100
@@ -14,7 +14,6 @@
#include <linux/config.h>
#include <linux/kernel.h>
#include <asm/apicdef.h>
-#include <xen/gnttab.h>
#include <asm/page.h>
#include <asm/vsyscall.h>
#include <asm/vsyscall32.h>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
Jan Beulich wrote:
>>>> Steven Rostedt <srostedt@redhat.com> 02.11.06 17:53 >>>
>> It has been discovered that i386 boxes with more than 4G of RAM would
>> randomly crash. It was traced to the interface of blktap using
>> gnttab_set_map_op.
>>
>> It would pass in the 64 bit pte entry, but the gnttab_set_map_op would
>> only take a 32 bit (on i386) unsigned long as a parameter. So we lose
>> the top 32bits.
>
> Could you use maddr_t here rather than uint64_t? For non-PAE i386
> Linux, especially when using CONFIG_REGPARM, adding a useless
> argument slot seems wasteful...
>

Actually, it makes no difference to me. In fact uint64_t was my third
incarnation, since I wasn't sure what the best would be. I started with
unsigned long long, then switched to u64, and then noticed that since
host_addr is uint64_t, that seemed the proper thing to use.

So a maddr_t would work too.

Do you want to do the patch, or would you like me to send another patch
that would do this change?

-- Steve


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
>Do you want to do the patch, or would you like me to send another patch
>that would do this change?

Unless this was already committed, maybe Keir could do this on the fly?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
On 3/11/06 14:27, "Jan Beulich" <jbeulich@novell.com> wrote:

>> Do you want to do the patch, or would you like me to send another patch
>> that would do this change?
>
> Unless this was already committed, maybe Keir could do this on the fly?

Already committed, but I'll consider patches.

-- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
Jan Beulich wrote:
>>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 02.11.06 19:16 >>>
>> On 2/11/06 16:53, "Steven Rostedt" <srostedt@redhat.com> wrote:
>>
>>> This patch updates both the linux-2.6-sparse tree and the xen HV to use
>>> uint64_t instead of unsigned long for those particular functions. This
>>> patch has been tested on RHEL5 Beta on a box with 12G i386.
>>>
>>> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
>> Nasty bug. At least it affects only blktap. Thanks.
>
> Looking at this it would seem to me that the second call to
> gnttab_set_unmap_op in blkltap.c is missing the
> GNTMAP_contains_pte flag, affecting auto_translated_physmap
> guests.

It's much worst than that!

Without this patch, my patch is still broken. In the hypervisor this
flag is checked, and will go down the wrong path when it is missing:


int destroy_grant_host_mapping(
u64 addr, unsigned long frame, unsigned int flags)
{
if ( flags & GNTMAP_contains_pte )
return destroy_grant_pte_mapping(addr, frame, current->domain);
return destroy_grant_va_mapping(addr, frame, current);
}


You can see here that we call the va_mapping instead of the pte mapping.
Which means that we once again truncate the top 32 bits of the address,
since a va addr is expected to be "unsigned long" in size.

Also, this can be bad, if the va addr is really virtual, and doesn't
match the actual offset shift (like handling high memory).

>
> Signed-off-by: Jan Beulich <jbeulich@novell.com>
>

please apply.

Acked-by: Steven Rostedt <srostedt@redhat.com>

-- Steve

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
On 3/11/06 10:46 pm, "Steven Rostedt" <srostedt@redhat.com> wrote:

> It's much worst than that!
>
> Without this patch, my patch is still broken. In the hypervisor this
> flag is checked, and will go down the wrong path when it is missing:

The hypervisor remembers the flags for itself, so the kernel can only
confuse itself. Still, it was a bug.

-- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
Keir Fraser wrote:
> On 3/11/06 10:46 pm, "Steven Rostedt" <srostedt@redhat.com> wrote:
>
>> It's much worst than that!
>>
>> Without this patch, my patch is still broken. In the hypervisor this
>> flag is checked, and will go down the wrong path when it is missing:
>
> The hypervisor remembers the flags for itself, so the kernel can only
> confuse itself. Still, it was a bug.
>

Well the flag for the application map isn't the problem. But I do see
missing the flag for contains pte is a problem.

>

int create_grant_host_mapping(
u64 addr, unsigned long frame, unsigned int flags)
{
l1_pgentry_t pte = l1e_from_pfn(frame, GRANT_PTE_FLAGS);

if ( (flags & GNTMAP_application_map) )
l1e_add_flags(pte,_PAGE_USER);
if ( !(flags & GNTMAP_readonly) )
l1e_add_flags(pte,_PAGE_RW);

if ( flags & GNTMAP_contains_pte )
return create_grant_pte_mapping(addr, pte, current);
return create_grant_va_mapping(addr, pte, current);
}

int destroy_grant_host_mapping(
u64 addr, unsigned long frame, unsigned int flags)
{
if ( flags & GNTMAP_contains_pte )
return destroy_grant_pte_mapping(addr, frame, current->domain);
return destroy_grant_va_mapping(addr, frame, current);
}

So is there a difference between create_grant_pte_mapping and
create_grant_va_mapping. As well as destroy_grant_pte_mapping and
destroy_grant_va_mapping. So calling pte create, and then va destroy on
the same mapping is not a bug?

-- Steve

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
On 4/11/06 2:25 pm, "Steven Rostedt" <srostedt@redhat.com> wrote:

> So is there a difference between create_grant_pte_mapping and
> create_grant_va_mapping. As well as destroy_grant_pte_mapping and
> destroy_grant_va_mapping. So calling pte create, and then va destroy on
> the same mapping is not a bug?

That would be a bug, if it were possible, which it's not. 'flags' is not a
parameter to the gnttab_unmap operation. Xen remembers the flags from the
original map operation.

-- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
Keir Fraser wrote:
>
>
> On 4/11/06 2:25 pm, "Steven Rostedt" <srostedt@redhat.com> wrote:
>
>> So is there a difference between create_grant_pte_mapping and
>> create_grant_va_mapping. As well as destroy_grant_pte_mapping and
>> destroy_grant_va_mapping. So calling pte create, and then va destroy on
>> the same mapping is not a bug?
>
> That would be a bug, if it were possible, which it's not. 'flags' is not a
> parameter to the gnttab_unmap operation. Xen remembers the flags from the
> original map operation.
>

OK, took me some time to find what you mean:

__gnttab_unmap_grant_ref(
struct gnttab_unmap_grant_ref *op)
{
[...]

map = &ld->grant_table->maptrack[op->handle];

[...]

dom = map->domid;
ref = map->ref;
flags = map->flags;

[...]

if ( (op->host_addr != 0) && (flags & GNTMAP_host_map) )
{
if ( (rc = destroy_grant_host_mapping(op->host_addr,
frame, flags)) < 0 )
goto unmap_out;



OK, but it can be a problem on the kernel side because of the Xen auto
translate physmap feature. But not as bad as I thought.

But it's still good to be consistent.

-- Steve

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH] Fix >4G i386 PAE grant table interface [ In reply to ]
>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 03.11.06 21:43 >>>
>On 3/11/06 14:27, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>>> Do you want to do the patch, or would you like me to send another patch
>>> that would do this change?
>>
>> Unless this was already committed, maybe Keir could do this on the fly?
>
>Already committed, but I'll consider patches.

Don't use a 64-bit addr parameter to gnttab_set_{,un}map_op() when the
upper 32 bits will never be used (i.e. i386 non-PAE).

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: head-2006-10-30/include/xen/gnttab.h
===================================================================
--- head-2006-10-30.orig/include/xen/gnttab.h 2006-09-21 09:57:46.000000000 +0200
+++ head-2006-10-30/include/xen/gnttab.h 2006-11-03 09:56:58.000000000 +0100
@@ -118,7 +118,7 @@ int gnttab_suspend(void);
int gnttab_resume(void);

static inline void
-gnttab_set_map_op(struct gnttab_map_grant_ref *map, uint64_t addr,
+gnttab_set_map_op(struct gnttab_map_grant_ref *map, maddr_t addr,
uint32_t flags, grant_ref_t ref, domid_t domid)
{
if (flags & GNTMAP_contains_pte)
@@ -134,7 +134,7 @@ gnttab_set_map_op(struct gnttab_map_gran
}

static inline void
-gnttab_set_unmap_op(struct gnttab_unmap_grant_ref *unmap, uint64_t addr,
+gnttab_set_unmap_op(struct gnttab_unmap_grant_ref *unmap, maddr_t addr,
uint32_t flags, grant_handle_t handle)
{
if (flags & GNTMAP_contains_pte)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel