Mailing List Archive

context switch and hypervisor
Hi,

I have been reading some papers from Xen and other sources, there are just a
couple of questions that I found hard to understand.

Why is Xen hypervisor better than a traditional hypervisor? With a
traditional hypervisor, during a context switch, the hypervisor stores the
states of a guest OS then goes to the next OS, upon coming back to the first
OS it restores the hardware states then passes it on to the first OS. Does
Xen pretty much do the same thing except it provides an API to the OS, and
the reason/benefit of having such an API is to reduce the time for a TLB
flush?

Can someone please explain this to me in detail?

Cheers

Chris
RE: context switch and hypervisor
________________________________

From: xen-devel-bounces@lists.xensource.com
[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Chris Zhang
Sent: 15 September 2006 07:11
To: xen-devel@lists.xensource.com
Subject: [Xen-devel] context switch and hypervisor


> Hi,
>
> I have been reading some papers from Xen and other sources,
> there are just a couple of questions that I found hard to understand.
>
> Why is Xen hypervisor better than a traditional hypervisor? With
> a traditional hypervisor, during a context switch, the hypervisor stores
> the states of a guest OS then goes to the next OS, upon coming back to
> the first OS it restores the hardware states then passes it on to the
> first OS. Does Xen pretty much do the same thing except it provides an
> API to the OS, and the reason/benefit of having such an API is to reduce
> the time for a TLB flush?



I may be wrong here, but I think the "reduce the time for TLB flush" and
Context Switching are not strictly related.

There are generally two ways to implement a Hypervisor (aka Virtual
Machine Monitor/VMM):
- Para-virtualization, like Xen in its traditional shape, where the OS
source-code is modified to interact directly with the hypervisor.
- Full virtualization: No changes to the OS source code. Xen can do this
too, with HVM domains.

One of the advantages with Para-virtualization is that the para-virtual
domain can give direct and "full" information to the Hypervisor. For
example: If a call in the user-mode app does "char *p = malloc(4 * 1024
* 1024);", then the OS will have to write 1024 page-table-entries
(possibly plus a couple for creating a new page-table entry at the
higher level(s)). Since only the hypervisor knows the ACTUAL page-table
layout (since it's the only instance that knows the ACTUAL memory
layout), the page-tables in the guest are write-protected and when a
write happens, it's trapped. But for big blocks like this, a guest whose
code knows it is mapping a big block of memory in one place can just
make a single call to the hypervisor to say "map me these 1024 pages".
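
A rough sketch of the arithmetic behind the "1024 page-table-entries"
figure above. The page size (4 KiB) and the number of entries per page
table (1024, as on 32-bit x86 without PAE) are assumptions for
illustration, as is the function name; the region is assumed to start on
a page-table boundary, which is why an unaligned allocation can need "a
couple" more entries at the higher level.

```python
import math

PAGE_SIZE = 4 * 1024       # assumed: 4 KiB pages
ENTRIES_PER_TABLE = 1024   # assumed: 32-bit x86 without PAE


def page_table_cost(alloc_bytes, page_size=PAGE_SIZE,
                    entries_per_table=ENTRIES_PER_TABLE):
    """Return (leaf_entries, leaf_tables) needed to map alloc_bytes.

    Assumes the region starts on a page-table boundary.
    """
    leaf_entries = math.ceil(alloc_bytes / page_size)
    leaf_tables = math.ceil(leaf_entries / entries_per_table)
    return leaf_entries, leaf_tables


entries, tables = page_table_cost(4 * 1024 * 1024)
print(f"{entries} page-table entries in {tables} page table(s)")
```

Under these assumptions, the 4 MiB allocation needs 1024 leaf entries,
which is exactly one full page table.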

In the full virtualization case, the hypervisor can't KNOW what the guest
is trying to do, so every single page-table write causes an intercept to
the hypervisor, which then emulates the write. Of course, each page-table
write has a risk of incurring a TLB-flush too... But in the para-virtual
case above, we only need one TLB-flush for each call to the "map many
pages" function.
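
The difference can be sketched with a toy model that only counts
hypervisor entries and TLB flushes. This is not Xen code: the class and
method names are hypothetical, and it assumes the worst case mentioned
above, where every trapped write also costs a TLB flush.

```python
class ToyHypervisor:
    """Counts hypervisor entries and TLB flushes (illustrative model only)."""

    def __init__(self):
        self.page_table = {}   # virtual page number -> frame number
        self.entries = 0       # how many times the guest entered the hypervisor
        self.tlb_flushes = 0

    def trapped_write(self, vpn, frame):
        # Full virtualization: one intercept per page-table write.
        # Worst-case assumption: each write also forces a TLB flush.
        self.entries += 1
        self.page_table[vpn] = frame
        self.tlb_flushes += 1

    def map_many(self, updates):
        # Para-virtualization: one explicit call carries the whole batch,
        # followed by a single TLB flush.
        self.entries += 1
        for vpn, frame in updates:
            self.page_table[vpn] = frame
        self.tlb_flushes += 1


# 1024 mappings, as in the 4 MiB example; frame numbers are made up.
updates = [(vpn, 0x1000 + vpn) for vpn in range(1024)]

full = ToyHypervisor()
for vpn, frame in updates:
    full.trapped_write(vpn, frame)

para = ToyHypervisor()
para.map_many(updates)

print("trap per write:", full.entries, "entries,", full.tlb_flushes, "flushes")
print("batched call:  ", para.entries, "entries,", para.tlb_flushes, "flushes")
```

Both runs end with the same page table; only the number of hypervisor
entries and flushes differs.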

Obviously, when switching between OS's, it's necessary to save the
current context and set up the next context, and eventually switch
back to the original context. Whether you do this via a call inside the
OS source code (like in Para-virtual Xen domains) or by identifying a
"block" some way through external means (such as the intercept of a HALT
instruction to indicate that the guest is blocked waiting for an event
of some sort) isn't going to make a big difference. The difference is in
the fact that the OS knows what it's trying to achieve (map one page or
many pages), and can help the hypervisor by giving additional
information that a "full" virtualization hypervisor can't know of.
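
A minimal sketch of that point, assuming a toy round-robin scheduler in
which the "hardware" state is just a dictionary of registers. All names
here are hypothetical. The save/restore step is the same code whether the
switch was requested by an explicit call from a para-virtual guest or
triggered by intercepting a HALT in a fully virtualized one.

```python
from dataclasses import dataclass, field


@dataclass
class VCpu:
    name: str
    saved: dict = field(default_factory=dict)  # register state while descheduled


class ToyScheduler:
    def __init__(self, vcpus):
        self.vcpus = list(vcpus)
        self.current = 0
        # "Hardware" registers start out holding the first guest's state.
        self.hw = dict(self.vcpus[0].saved)

    def _switch(self):
        prev = self.vcpus[self.current]
        prev.saved = dict(self.hw)        # save the outgoing context
        self.current = (self.current + 1) % len(self.vcpus)
        nxt = self.vcpus[self.current]
        self.hw = dict(nxt.saved)         # restore the incoming context
        return nxt.name

    def hypercall_block(self):
        # Para-virtual guest: the OS source calls the hypervisor explicitly.
        return self._switch()

    def intercept_hlt(self):
        # Fully virtualized guest: the hypervisor traps a HALT instruction.
        return self._switch()


a = VCpu("A", {"ip": 100})
b = VCpu("B", {"ip": 200})
sched = ToyScheduler([a, b])

sched.hw["ip"] = 150                      # guest A runs for a while
first_switch = sched.hypercall_block()    # A blocks via an explicit call
sched.hw["ip"] = 250                      # guest B runs for a while
second_switch = sched.intercept_hlt()     # B halts, hypervisor intercepts
print(first_switch, second_switch, sched.hw)
```

Either entry point ends up in the same `_switch` routine, which is why
the context switch itself is not where the two approaches differ.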

I'm sure that if I've got this all wrong, someone will correct me...

--
Mats


Re: context switch and hypervisor
Hi Mats,

Thanks for your answer. It took me some time to think it through but I fully
understand what you are saying.

In the paper 'Xen and the Art of Virtualization' by Paul Barham et al., the
authors write that 'Guest OSes may batch update requests to amortize the
overhead of entering the hypervisor'. I think this is what you were trying
to say.

Thanks for your time

Re: context switch and hypervisor
> I have been reading some papers from Xen and other sources, there are just
> a couple of questions that I found hard to understand.

Nice one - the papers provide good background.

Other projects you might want to read about:
* Denali (paravirtualisation for lightweight guests)
* uDenali [micro Denali] (Denali with enhanced virtual device stuff,
migration, etc)
* Disco and Cellular Disco (more traditional VMMs for large systems)
* VMware have a few papers on their architecture

> Why is Xen hypervisor better than a traditional hypervisor? With a
> traditional hypervisor, during a context switch, the hypervisor stores the
> states of a guest OS then goes to the next OS, upon coming back to the
> first OS it restores the hardware states then passes it on to the first OS.
> Does Xen pretty much do the same thing except it provides an API to the OS,
> and the reason/benefit of having such an API is to reduce the time for a
> TLB flush?
>
> Can someone please explain this to me in detail?

Having the explicit API avoids having to emulate the hardware precisely. Not
having to do this enables the hypervisor to be simpler, which is good
(although we can emulate platform hardware anyhow for fully-virtualised
guests). Importantly it also enables us to avoid certain performance
difficulties. For instance, a few significant ones:

* Guests can access the real page tables (with appropriate security
restrictions), which avoids the overheads in memory and CPU time of
maintaining "shadow page tables" in the hypervisor
* x86 includes some really annoying features for virtualisation. For instance
there are some instructions that *need* to be trapped and emulated in order
to provide correct semantics, but they aren't capable of producing a trap, so
there's not a straightforward way of catching them.
* We don't have to emulate real hardware devices; we can use idealised devices
which are optimised for the virtualised environment and provide good
performance (NB: other solutions such as VMware also provide the option of
using optimised drivers in some situations, e.g. for fast network)
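
A rough sketch of the first point in the list above, contrasting a guest
that updates the real page table under a hypervisor check with a guest
whose page table is shadowed. This is a toy model, not Xen's
implementation: the class and function names are hypothetical, and the
only "security restriction" modelled is that a guest may map just the
machine frames it owns. Real checks are more involved.

```python
class ToyDomain:
    """Para-virtual guest: one page table, the one the hardware walks."""

    def __init__(self, owned_frames):
        self.owned_frames = set(owned_frames)  # machine frames given to the guest
        self.page_table = {}                   # virtual page -> machine frame


def validated_update(domain, vpn, machine_frame):
    """Hypervisor-side check before a guest page-table update takes effect."""
    if machine_frame not in domain.owned_frames:
        raise PermissionError(f"frame {machine_frame} not owned by this guest")
    domain.page_table[vpn] = machine_frame


class ShadowedGuest:
    """Unmodified guest: the hypervisor keeps a second, shadow table."""

    def __init__(self, p2m):
        self.p2m = dict(p2m)     # guest "physical" frame -> machine frame
        self.guest_table = {}    # what the guest believes it wrote
        self.shadow_table = {}   # what the hardware actually uses

    def guest_write(self, vpn, guest_frame):
        # Every guest write must be mirrored and translated into the shadow
        # table, costing extra memory and CPU time.
        self.guest_table[vpn] = guest_frame
        self.shadow_table[vpn] = self.p2m[guest_frame]


pv = ToyDomain(owned_frames={10, 11})
validated_update(pv, vpn=0, machine_frame=10)      # allowed
try:
    validated_update(pv, vpn=1, machine_frame=99)  # not owned: rejected
    rejected = False
except PermissionError:
    rejected = True

shadowed = ShadowedGuest(p2m={0: 10})
shadowed.guest_write(vpn=5, guest_frame=0)
print(pv.page_table, rejected, shadowed.guest_table, shadowed.shadow_table)
```

In this model the para-virtual guest has a single table holding machine
frame numbers, while the shadowed guest needs two tables kept in sync.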

Cheers,
Mark

--
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel