[PATCH] Don't trigger unnecessary shadow scans on p2m entry update
xen/arch/x86/mm/shadow/common.c | 6 ++----
xen/arch/x86/mm/shadow/multi.c | 2 +-
xen/arch/x86/mm/shadow/private.h | 6 ++++++
3 files changed, 9 insertions(+), 5 deletions(-)


When updating a p2m entry, the hypervisor scans all shadow PTEs to find
mappings of that gfn and tears them down. The scan can be skipped when the
page reference count shows there are no additional mappings. However, the
current test ignores the PGC_allocated flag and its effect on the count: a
page with PGC_allocated set always holds one general reference, so its count
never drops to zero and the scan runs even when no shadow mappings exist.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Adin Scannell <adin@scannell.ca>

diff -r 5c339eb31d34 -r a264fb979b0c xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2464,7 +2464,6 @@ int sh_remove_write_access_from_sl1p(str
int sh_remove_all_mappings(struct vcpu *v, mfn_t gmfn)
{
struct page_info *page = mfn_to_page(gmfn);
- int expected_count;

/* Dispatch table for getting per-type functions */
static const hash_callback_t callbacks[SH_type_unused] = {
@@ -2501,7 +2500,7 @@ int sh_remove_all_mappings(struct vcpu *
;

perfc_incr(shadow_mappings);
- if ( (page->count_info & PGC_count_mask) == 0 )
+ if ( __check_page_no_refs(page) )
return 0;

/* Although this is an externally visible function, we do not know
@@ -2517,8 +2516,7 @@ int sh_remove_all_mappings(struct vcpu *
hash_foreach(v, callback_mask, callbacks, gmfn);

/* If that didn't catch the mapping, something is very wrong */
- expected_count = (page->count_info & PGC_allocated) ? 1 : 0;
- if ( (page->count_info & PGC_count_mask) != expected_count )
+ if ( !__check_page_no_refs(page) )
{
/* Don't complain if we're in HVM and there are some extra mappings:
* The qemu helper process has an untyped mapping of this dom's RAM
diff -r 5c339eb31d34 -r a264fb979b0c xen/arch/x86/mm/shadow/multi.c
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4591,7 +4591,7 @@ int sh_rm_mappings_from_l1(struct vcpu *
{
(void) shadow_set_l1e(v, sl1e, shadow_l1e_empty(),
p2m_invalid, sl1mfn);
- if ( (mfn_to_page(target_mfn)->count_info & PGC_count_mask) == 0 )
+ if ( __check_page_no_refs(mfn_to_page(target_mfn)) )
/* This breaks us cleanly out of the FOREACH macro */
done = 1;
}
diff -r 5c339eb31d34 -r a264fb979b0c xen/arch/x86/mm/shadow/private.h
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -815,6 +815,12 @@ static inline unsigned long vtlb_lookup(
}
#endif /* (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB) */

+static inline int __check_page_no_refs(struct page_info *page)
+{
+ unsigned long __count = page->count_info;
+ return ( (__count & PGC_count_mask) ==
+ ((__count & PGC_allocated) ? 1 : 0) );
+}

#endif /* _XEN_SHADOW_PRIVATE_H */


Re: [PATCH] Don't trigger unnecessary shadow scans on p2m entry update
On 24/11/2011 17:58, "Andres Lagar-Cavilla" <andres@lagarcavilla.org> wrote:

> [snip]
> @@ -815,6 +815,12 @@ static inline unsigned long vtlb_lookup(
> }
> #endif /* (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB) */
>
> +static inline int __check_page_no_refs(struct page_info *page)
> +{
> + unsigned long __count = page->count_info;
> + return ( (__count & PGC_count_mask) ==
> + ((__count & PGC_allocated) ? 1 : 0) );

It's a fussy point, but it might be better to use
atomic_read_ulong(&page->count_info) here, to ensure that the compiler reads
the field once only. It's cheap (compiles to a single mov instruction), but
atomic_read_ulong doesn't exist so you'd have to add its (obvious)
definition to asm-x86/atomic.h
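
For illustration, the "obvious" definition might be a sketch like this
(assuming it sits alongside the existing atomic helpers in asm-x86/atomic.h;
atomic_read_ulong is the proposed name, not an existing API):

    /* Read an unsigned long exactly once, via a volatile access. */
    static inline unsigned long atomic_read_ulong(volatile unsigned long *p)
    {
        return *p;
    }

The volatile access keeps the compiler from re-reading page->count_info in
each sub-expression of the comparison.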

-- Keir




Re: [PATCH] Don't trigger unnecessary shadow scans on p2m entry update
>>> On 24.11.11 at 19:31, Keir Fraser <keir.xen@gmail.com> wrote:
> On 24/11/2011 17:58, "Andres Lagar-Cavilla" <andres@lagarcavilla.org> wrote:
>> @@ -815,6 +815,12 @@ static inline unsigned long vtlb_lookup(
>> }
>> #endif /* (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB) */
>>
>> +static inline int __check_page_no_refs(struct page_info *page)
>> +{
>> + unsigned long __count = page->count_info;
>> + return ( (__count & PGC_count_mask) ==
>> + ((__count & PGC_allocated) ? 1 : 0) );
>
> It's a fussy point, but it might be better to use
> atomic_read_ulong(&page->count_info) here, to ensure that the compiler reads
> the field once only. It's cheap (compiles to a single mov instruction), but
> atomic_read_ulong doesn't exist so you'd have to add its (obvious)
> definition to asm-x86/atomic.h

I think Tim suggested anyway matching

(__count & (PGC_count_mask|PGC_allocated)) against
(1|PGC_allocated) here, which would eliminate the multiple-read
potential when used in a switch statement.

Also, for clarity the function should probably return bool_t.
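
For illustration, a sketch of that shape, folding in the bool_t suggestion
(an illustration of the proposal, not committed code):

    static inline bool_t __check_page_no_refs(struct page_info *page)
    {
        unsigned long count = page->count_info;
        return (count & (PGC_count_mask | PGC_allocated))
               == (PGC_allocated | 1);
    }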

Jan


Re: [PATCH] Don't trigger unnecessary shadow scans on p2m entry update
On 25/11/2011 08:45, "Jan Beulich" <JBeulich@suse.com> wrote:

>>>> On 24.11.11 at 19:31, Keir Fraser <keir.xen@gmail.com> wrote:
>> On 24/11/2011 17:58, "Andres Lagar-Cavilla" <andres@lagarcavilla.org> wrote:
>>> @@ -815,6 +815,12 @@ static inline unsigned long vtlb_lookup(
>>> }
>>> #endif /* (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB) */
>>>
>>> +static inline int __check_page_no_refs(struct page_info *page)
>>> +{
>>> + unsigned long __count = page->count_info;
>>> + return ( (__count & PGC_count_mask) ==
>>> + ((__count & PGC_allocated) ? 1 : 0) );
>>
>> It's a fussy point, but it might be better to use
>> atomic_read_ulong(&page->count_info) here, to ensure that the compiler reads
>> the field once only. It's cheap (compiles to a single mov instruction), but
>> atomic_read_ulong doesn't exist so you'd have to add its (obvious)
>> definition to asm-x86/atomic.h
>
> I think Tim suggested anyway to use
>
> (__count & (PGC_count_mask|PGC_allocated)) matched against
> (1|PGC_allocated) here, which would eliminate the multiple read
> potential if used in a switch statement.

Yes, that would be better.

-- Keir

> Also, for clarity the function should probably return bool_t.
>
> Jan
>



Re: [PATCH] Don't trigger unnecessary shadow scans on p2m entry update
Not sure about that. We need to check for either of two conditions on page->count_info: that it's zero, or that it's (PGC_allocated | 1).

Matching (__count & (PGC_count_mask|PGC_allocated)) against (1|PGC_allocated) alone would lose the zero case.
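
For illustration, a single-read check covering both cases might look like
the following sketch (an illustration only, not necessarily the resubmitted
code):

    static inline int __check_page_no_refs(struct page_info *page)
    {
        unsigned long count = page->count_info &
                              (PGC_count_mask | PGC_allocated);
        /* Either no references at all, or only the PGC_allocated one. */
        return (count == 0) || (count == (PGC_allocated | 1));
    }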

Resubmitting shortly.
Andres

On Nov 25, 2011, at 4:18 AM, Keir Fraser wrote:

> [snip]

