Mailing List Archive

HYBRID: PV in HVM container
Hi guys,

Cheers!! I got PV in HVM container prototype working with single VCPU
(pinned to a cpu). Basically, I create a VMX container just like for
HVM guest (with some differences that I'll share soon when I clean up
the code). The PV guest starts in Protected mode with the usual
entry point startup_xen().

0. Guest kernel runs in ring 0, CS:0x10.

1. I use xen for all pt management just like a PV guest. So at present
all faults are going to xen, and when fixup_page_fault() fails, they
are injected into the container for the guest to handle it.

2. The guest manages the GDT, LDT, TR, in the container.

3. The guest installs the trap table in the vmx container instead of
do_set_trap_table().

4. Events/INTs are delivered via HVMIRQ_callback_vector.

5. MSR_GS_BASE is managed by the guest in the container itself.

6. Currently, I'm managing cr4 in the container, but going to xen
for cr0. I need to revisit that.

7. Currently, VPID is disabled, I need to figure it out, and revisit.

8. Currently, VM_ENTRY_LOAD_GUEST_PAT is disabled, I need to look at
that.

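(A minimal sketch of how the callback vector in point 4 can be registered
from the guest side, assuming the standard PVonHVM interfaces
(HVMOP_set_param / HVM_PARAM_CALLBACK_IRQ); the vector number and function
name below are illustrative, not the actual patch.)

/* Sketch: ask Xen to deliver event-channel upcalls as a plain IDT
 * vector, the way PVonHVM guests do, instead of the PV callback. */
#include <xen/interface/xen.h>
#include <xen/interface/hvm/hvm_op.h>
#include <xen/interface/hvm/params.h>

#define HYBRID_EVENT_VECTOR 0xf3        /* any otherwise unused vector */

static int register_callback_vector(void)
{
        struct xen_hvm_param a = {
                .domid = DOMID_SELF,
                .index = HVM_PARAM_CALLBACK_IRQ,
                /* top byte 2 == "deliver via vector", low bits = vector */
                .value = (2ULL << 56) | HYBRID_EVENT_VECTOR,
        };

        /* The guest must also install HYBRID_EVENT_VECTOR in its IDT and
         * have the handler run the usual xen_evtchn_do_upcall() path. */
        return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
}
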
These are the salient points I can think of at the moment. Next, I am
going to run LMBench and figure out the gains. After that, make sure
SMP works, and things are stable, and look at any enhancements. I need
to look at a couple of unrelated bugs at the moment, but hope to return
to this very soon.

thanks,
Mukesh


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: HYBRID: PV in HVM container
On 27/06/2011 20:24, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

>
> Hi guys,
>
> Cheers!! I got PV in HVM container prototype working with single VCPU
> (pinned to a cpu). Basically, I create a VMX container just like for
> HVM guest (with some differences that I'll share soon when I clean up
> the code). The PV guest starts in Protected mode with the usual
> entry point startup_xen().
>
> 0. Guest kernel runs in ring 0, CS:0x10.
>
> 1. I use xen for all pt management just like a PV guest. So at present
> all faults are going to xen, and when fixup_page_fault() fails, they
> are injected into the container for the guest to handle it.
>
> 2. The guest manages the GDT, LDT, TR, in the container.
>
> 3. The guest installs the trap table in the vmx container instead of
> do_set_trap_table().

To be clear, you intend for this to work with unmodified PV guests, right?
All of this translation can easily be done in Xen, avoiding multiple paths
needed in the guest kernel (not really tenable for upstreaming).

-- Keir

> 4. Events/INTs are delivered via HVMIRQ_callback_vector.
>
> 5. MSR_GS_BASE is managed by the guest in the container itself.
>
> 6. Currently, I'm managing cr4 in the container, but going to xen
> for cr0. I need to revisit that.
>
> 7. Currently, VPID is disabled, I need to figure it out, and revisit.
>
> 8. Currently, VM_ENTRY_LOAD_GUEST_PAT is disabled, I need to look at
> that.
>
> These are the salient points I can think of at the moment. Next, I am
> going to run LMBench and figure out the gains. After that, make sure
> SMP works, and things are stable, and look at any enhancements. I need
> to look at a couple of unrelated bugs at the moment, but hope to return
> to this very soon.
>
> thanks,
> Mukesh
>
>

Re: HYBRID: PV in HVM container
On Mon, 27 Jun 2011 20:36:18 +0100
Keir Fraser <keir.xen@gmail.com> wrote:

> On 27/06/2011 20:24, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>
> >
> > Hi guys,
> >
> > Cheers!! I got PV in HVM container prototype working with single
> > VCPU (pinned to a cpu). Basically, I create a VMX container just
> > like for HVM guest (with some differences that I'll share soon when
> > I clean up the code). The PV guest starts in Protected mode with
> > the usual entry point startup_xen().
> >
> > 0. Guest kernel runs in ring 0, CS:0x10.
> >
> > 1. I use xen for all pt management just like a PV guest. So at
> > present all faults are going to xen, and when fixup_page_fault()
> > fails, they are injected into the container for the guest to handle
> > it.
> >
> > 2. The guest manages the GDT, LDT, TR, in the container.
> >
> > 3. The guest installs the trap table in the vmx container instead of
> > do_set_trap_table().
>
> To be clear, you intend for this to work with unmodified PV guests,
> right? All of this translation can easily be done in Xen, avoiding
> multiple paths needed in the guest kernel (not really tenable for
> upstreaming).
>
> -- Keir

Hi Keir,

Actually, I modified the PVops guest. The changes in the pvops are
minimal and mostly confined to xen-specific files. So I think it has
a fair shot of being upstreamed, at least, worth a shot. I will run
them by Jeremy/Konrad and get their opinions.

thanks
Mukesh


Re: HYBRID: PV in HVM container
On 28/06/2011 02:51, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

>> To be clear, you intend for this to work with unmodified PV guests,
>> right? All of this translation can easily be done in Xen, avoiding
>> multiple paths needed in the guest kernel (not really tenable for
>> upstreaming).
>>
>> -- Keir
>
> Hi Keir,
>
> Actually, I modified the PVops guest. The changes in the pvops are
> minimal and mostly confined to xen-specific files. So I think it has
> a fair shot of being upstreamed, at least, worth a shot. I will run
> them by Jeremy/Konrad and get their opinions.

Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
not sure that adding explicitly HVM-PV guests as well isn't just a bloody
mess.

-- Keir

> thanks
> Mukesh
>



Re: HYBRID: PV in HVM container
On Tue, 2011-06-28 at 08:46 +0100, Keir Fraser wrote:
> On 28/06/2011 02:51, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>
> >> To be clear, you intend for this to work with unmodified PV guests,
> >> right? All of this translation can easily be done in Xen, avoiding
> >> multiple paths needed in the guest kernel (not really tenable for
> >> upstreaming).
> >>
> >> -- Keir
> >
> > Hi Keir,
> >
> > Actually, I modified the PVops guest. The changes in the pvops are
> > minimal and mostly confined to xen-specific files. So I think it has
> > a fair shot of being upstreamed, at least, worth a shot. I will run
> > them by Jeremy/Konrad and get their opinions.
>
> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
> not sure that adding explicitly HVM-PV guests as well isn't just a bloody
> mess.

Ideally this container could be used to accelerate existing 64 bit
guests (e.g. older distros running classic-Xen) unmodified (or at least
only with latent bugs fixed) too.

Getting something working with a modified guest seems like a useful
first step (to get to a working baseline) but I'm not sure it should be
the end goal.

Ian.


Re: HYBRID: PV in HVM container
On Mon, 2011-06-27 at 20:24 +0100, Mukesh Rathor wrote:
> Hi guys,
>
> Cheers!! I got PV in HVM container prototype working with single VCPU
> (pinned to a cpu). Basically, I create a VMX container just like for
> HVM guest (with some differences that I'll share soon when I clean up
> the code). The PV guest starts in Protected mode with the usual
> entry point startup_xen().

Great stuff! I've been eagerly awaiting this functionality ;-)

Do you have any timeline for when you think you might post the code?

I presume you managed to avoid bouncing through the hypervisor for
syscalls?

Cheers,
Ian.

>
> 0. Guest kernel runs in ring 0, CS:0x10.
>
> 1. I use xen for all pt management just like a PV guest. So at present
> all faults are going to xen, and when fixup_page_fault() fails, they
> are injected into the container for the guest to handle it.
>
> 2. The guest manages the GDT, LDT, TR, in the container.
>
> 3. The guest installs the trap table in the vmx container instead of
> do_set_trap_table().
>
> 4. Events/INTs are delivered via HVMIRQ_callback_vector.
>
> 5. MSR_GS_BASE is managed by the guest in the container itself.
>
> 6. Currently, I'm managing cr4 in the container, but going to xen
> for cr0. I need to revisit that.
>
> 7. Currently, VPID is disabled, I need to figure it out, and revisit.
>
> 8. Currently, VM_ENTRY_LOAD_GUEST_PAT is disabled, I need to look at
> that.
>
> These are the salient points I can think of at the moment. Next, I am
> going to run LMBench and figure out the gains. After that, make sure
> SMP works, and things are stable, and look at any enhancements. I need
> to look at a couple of unrelated bugs at the moment, but hope to return
> to this very soon.
>
> thanks,
> Mukesh
>
>

Re: HYBRID: PV in HVM container
On 28/06/2011 09:30, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:

> On Tue, 2011-06-28 at 08:46 +0100, Keir Fraser wrote:
>> On 28/06/2011 02:51, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>>
>> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
>> not sure that adding explicitly HVM-PV guests as well isn't just a bloody
>> mess.
>
> Ideally this container could be used to accelerate existing 64 bit
> guests (e.g. older distros running classic-Xen) unmodified (or at least
> only with latent bugs fixed) too.

There was a question mark over whether unmodified PV guests would tolerate
running in ring 0, rather than entirely in ring 3. I believe we're confident
it should work, and thus supporting classic-Xen guests should certainly be
the aim.

> Getting something working with a modified guest seems like a useful
> first step (to get to a working baseline) but I'm not sure it should be
> the end goal.

I certainly don't think we should commit such a thing without careful
thought.

-- Keir

> Ian.
>



Re: HYBRID: PV in HVM container
On Tue, 2011-06-28 at 09:35 +0100, Keir Fraser wrote:
> On 28/06/2011 09:30, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:
>
> > On Tue, 2011-06-28 at 08:46 +0100, Keir Fraser wrote:
> >> On 28/06/2011 02:51, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
> >>
> >> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
> >> not sure that adding explicitly HVM-PV guests as well isn't just a bloody
> >> mess.
> >
> > Ideally this container could be used to accelerate existing 64 bit
> > guests (e.g. older distros running classic-Xen) unmodified (or at least
> > only with latent bugs fixed) too.
>
> There was a question mark over whether unmodified PV guests would tolerate
> running in ring 0, rather than entirely in ring 3. I believe we're confident
> it should work, and thus supporting classic-Xen guests should certainly be
> the aim.

A guest which does XENFEAT_supervisor_mode_kernel (and perhaps one or
two other XENFEATs) should work, but that was the primary source of the
latent bugs I was thinking of...

In particular the pvops kernel probably doesn't do all the right things
for XENFEAT_supervisor_mode_kernel, since it has never been run that
way; but since it also doesn't advertise the feature via
XEN_ELFNOTE_FEATURES, the builder side can at least detect when it is
safe to enable the container.

> > Getting something working with a modified guest seems like a useful
> > first step (to get to a working baseline) but I'm not sure it should be
> > the end goal.
>
> I certainly don't think we should commit such a thing without careful
> thought.

Absolutely.

Ian.



Re: HYBRID: PV in HVM container
On Tue, 28 Jun 2011, Keir Fraser wrote:
> On 28/06/2011 02:51, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>
> >> To be clear, you intend for this to work with unmodified PV guests,
> >> right? All of this translation can easily be done in Xen, avoiding
> >> multiple paths needed in the guest kernel (not really tenable for
> >> upstreaming).
> >>
> >> -- Keir
> >
> > Hi Keir,
> >
> > Actually, I modified the PVops guest. The changes in the pvops are
> > minimal and mostly confined to xen-specific files. So I think it has
> > a fair shot of being upstreamed, at least, worth a shot. I will run
> > them by Jeremy/Konrad and get their opinions.
>
> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
> not sure that adding explicitly HVM-PV guests as well isn't just a bloody
> mess.

I very much agree on this point.

However it could still be useful at the very least to run a 64-bit hvm
dom0 (assuming there is a significant performance improvement in doing
so, compared to a traditional 64-bit dom0).

Re: HYBRID: PV in HVM container
On Tue, 2011-06-28 at 11:46 +0100, Stefano Stabellini wrote:
> On Tue, 28 Jun 2011, Keir Fraser wrote:
> > On 28/06/2011 02:51, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
> >
> > >> To be clear, you intend for this to work with unmodified PV guests,
> > >> right? All of this translation can easily be done in Xen, avoiding
> > >> multiple paths needed in the guest kernel (not really tenable for
> > >> upstreaming).
> > >>
> > >> -- Keir
> > >
> > > Hi Keir,
> > >
> > > Actually, I modified the PVops guest. The changes in the pvops are
> > > minimal and mostly confined to xen-specific files. So I think it has
> > > a fair shot of being upstreamed, at least, worth a shot. I will run
> > > them by Jeremy/Konrad and get their opinions.
> >
> > Well, maybe. But we now have HVM guests, PV guests, and PV-HVM guests. I'm
> > not sure that adding explicitly HVM-PV guests as well isn't just a bloody
> > mess.
>
> I very much agree on this point.
>
> However it could still be useful at the very least to run a 64-bit hvm
> dom0 (assuming there is a significant performance improvement in doing
> so, compared to a traditional 64-bit dom0).

That case is no different to the guest case in this respect, so we
should still be aiming for not needing to modify the kernel. We
certainly don't want a new special-case HVM-PV for dom0 only!

Ian.



Re: HYBRID: PV in HVM container
On Tue, 28 Jun 2011 09:31:57 +0100
Ian Campbell <Ian.Campbell@citrix.com> wrote:

> On Mon, 2011-06-27 at 20:24 +0100, Mukesh Rathor wrote:
> > Hi guys,
> >
> > Cheers!! I got PV in HVM container prototype working with single
> > VCPU (pinned to a cpu). Basically, I create a VMX container just
> > like for HVM guest (with some differences that I'll share soon when
> > I clean up the code). The PV guest starts in Protected mode with
> > the usual entry point startup_xen().
>
> Great stuff! I've been eagerly awaiting this functionality ;-)
>
> Do you have any timeline for when you think you might post the code?
>
> I presume you managed to avoid bouncing through the hypervisor for
> syscalls?

Yup, that was the primary goal. DB benchmarks suffered quite a bit
because of syscall overhead.
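
(For context, a rough sketch of the two paths, under the standard
interfaces: a 64-bit PV guest has to register its SYSCALL entry point
with Xen and take the bounce through the hypervisor, while a ring-0
guest in the container can own MSR_LSTAR itself. The function names are
illustrative, not the actual patch.)

extern void xen_syscall_target(void);
extern void hybrid_syscall_target(void);   /* illustrative name */

/* Classic 64-bit PV: every SYSCALL exits to Xen, which bounces it back
 * into the guest at the registered callback. */
static void pv_register_syscall(void)
{
        struct callback_register cs = {
                .type    = CALLBACKTYPE_syscall,
                .address = (unsigned long)xen_syscall_target,
        };

        HYPERVISOR_callback_op(CALLBACKOP_register, &cs);
}

/* Ring-0 guest in the HVM container: program the MSR directly, so
 * SYSCALL lands straight in the guest kernel with no hypervisor hop. */
static void hybrid_register_syscall(void)
{
        wrmsrl(MSR_LSTAR, (unsigned long)hybrid_syscall_target);
}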

thanks,
Mukesh

Re: HYBRID: PV in HVM container
On Tue, 28 Jun 2011 08:46:08 +0100
Keir Fraser <keir.xen@gmail.com> wrote:

> On 28/06/2011 02:51, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
> > Hi Keir,
> >
> > Actually, I modified the PVops guest. The changes in the pvops are
> > minimal and mostly confined to xen-specific files. So I think it has
> > a fair shot of being upstreamed, at least, worth a shot. I will run
> > them by Jeremy/Konrad and get their opinions.
>
> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM
> guests. I'm not sure that adding explicitly HVM-PV guests as well
> isn't just a bloody mess.

Could we perhaps define a HYBRID type that has characteristics
like: it runs in an HVM container, it doesn't use EPT, it uses the HVM
callback, etc. We can then modify it without defining any new types in
future, say if we find it works better with EPT under certain
circumstances. What do you think?
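
(Purely illustrative of the idea - none of these names exist in Xen: the
"HYBRID type" could be expressed as a feature bitmap negotiated at
domain-build time rather than as yet another guest type, so individual
bits can be flipped later.)

/* Hypothetical sketch only - every name here is made up. */
#define XEN_HYBRID_HVM_CONTAINER   (1u << 0)  /* PV kernel inside VMX/SVM */
#define XEN_HYBRID_PV_PAGETABLES   (1u << 1)  /* no EPT, PV mmu hypercalls */
#define XEN_HYBRID_HVM_CALLBACK    (1u << 2)  /* events via callback vector */
#define XEN_HYBRID_EPT             (1u << 3)  /* could be switched on later */

struct xen_hybrid_config {
    uint32_t flags;     /* OR of XEN_HYBRID_* bits requested by the builder */
};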

thanks,
Mukesh


Re: HYBRID: PV in HVM container
On 28/06/2011 19:32, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

>> Well, maybe. But we now have HVM guests, PV guests, and PV-HVM
>> guests. I'm not sure that adding explicitly HVM-PV guests as well
>> isn't just a bloody mess.
>
> Could we perhaps define a HYBRID type that will have characteristics
> like, this runs in HVM container, it doesn't use EPT, it uses HVM
> callback, etc. We can then modify it without defining any new types in
> future, say we find it works better with EPT under certain
> circumstances etc.. What do you think?

Yes, I don't mind the idea of some HVM extensions for performance, and that
will probably become increasingly important. I just think we should support
unmodified PV as a baseline, with best performance possible (i.e., the basic
HVM container approach should support it).

-- Keir



Re: HYBRID: PV in HVM container
On Mon, 27 Jun 2011 12:24:04 -0700
Mukesh Rathor <mukesh.rathor@oracle.com> wrote:

>
> Hi guys,
>
> Cheers!! I got PV in HVM container prototype working with single VCPU
> (pinned to a cpu). Basically, I create a VMX container just like for
> HVM guest (with some differences that I'll share soon when I clean up
> the code). The PV guest starts in Protected mode with the usual
> entry point startup_xen().
>
> 0. Guest kernel runs in ring 0, CS:0x10.


JFYI.. as expected, running in ring 0 and not bouncing syscalls thru
xen, syscalls do very well. fork/execs are slow, probably because VPIDs are
turned off right now. I'm trying to figure VPIDs out, and hopefully that
would help. BTW, don't compare to anything else, both kernels
below are unoptimized debug kernels.
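
(A rough sketch of what turning VPID on would involve on the Xen side,
using the standard VMX control names from vmcs.h; the per-vcpu
bookkeeping and the INVVPID flushes on address-space switches are
omitted, so treat this as an assumption about the eventual change
rather than the change itself.)

/* Sketch: tag the container's TLB entries with a non-zero VPID so they
 * can survive VM entry/exit instead of being flushed every time. */
static void hybrid_enable_vpid(struct vcpu *v)
{
    /* Assumes VPID is supported and the vcpu's VMCS is loaded. */
    uint32_t sec_ctl = vmx_secondary_exec_control | SECONDARY_EXEC_ENABLE_VPID;

    __vmwrite(SECONDARY_VM_EXEC_CONTROL, sec_ctl);
    __vmwrite(VIRTUAL_PROCESSOR_ID, v->vcpu_id + 1);   /* any non-zero tag */
}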

LMbench:
Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host OS Mhz null null open selct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
STOCK Linux 2.6.39+ 2771 0.68 0.91 2.13 4.45 4.251 0.82 3.87 433. 1134 3145
HYBRID Linux 2.6.39m 2745 0.13 0.22 0.88 2.04 3.287 0.28 1.11 526. 1393 3923


thanks,
Mukesh




Re: HYBRID: PV in HVM container
> JFYI.. as expected, running in ring 0 and not bouncing syscalls thru
> xen, syscalls do very well. fork/execs are slow, probably because VPIDs are
> turned off right now. I'm trying to figure VPIDs out, and hopefully
> that would help. BTW, don't compare to anything else, both kernels
> below are unoptimized debug kernels.
>
> LMbench:
> Processor, Processes - times in microseconds - smaller is better
> ----------------------------------------------------------------
> Host OS Mhz null null open selct sig sig fork exec sh
> call I/O stat clos TCP inst hndl proc proc proc
> --------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
> STOCK Linux 2.6.39+ 2771 0.68 0.91 2.13 4.45 4.251 0.82 3.87 433. 1134 3145
> HYBRID Linux 2.6.39m 2745 0.13 0.22 0.88 2.04 3.287 0.28 1.11 526. 1393 3923
>

JFYI again, I seem to have caught up with pure PV on almost all of the
numbers with some optimizations:

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host OS Mhz null null open selct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
STOCK: Linux 2.6.39+ 2771 0.68 0.91 2.13 4.45 4.251 0.82 3.87 433. 1134 3145
N4 Linux 2.6.39m 2745 0.13 0.21 0.86 2.03 3.279 0.28 1.18 479. 1275 3502
N5 Linux 2.6.39m 2752 0.13 0.21 0.91 2.07 3.284 0.28 1.14 439. 1168 3155

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
STOCK: Linux 2.6.39+ 5.800 6.2400 6.8700 6.6700 8.4600 7.13000 8.63000
N4 Linux 2.6.39m 6.420 6.9300 8.0100 7.2600 8.7600 7.97000 9.25000
N5 Linux 2.6.39m 6.650 7.0000 7.8400 7.3900 8.8000 7.90000 9.06000

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host OS 2p/0K Pipe AF UDP RPC/ TCP RPC/ TCP
ctxsw UNIX UDP TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
STOCK: Linux 2.6.39+ 5.800 18.9 22.3 28.7 32.8 34.9 44.6 89.8
N4 Linux 2.6.39m 6.420 17.1 18.1 26.9 28.7 34.2 40.1 76.3
N5 Linux 2.6.39m 6.650 18.1 17.7 24.4 33.4 33.9 40.7 76.7

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page
Create Delete Create Delete Latency Fault Fault
--------- ------------- ------ ------ ------ ------ ------- ----- -----
STOCK: Linux 2.6.39+ 3264.0 0.828 3.00000
N4 Linux 2.6.39m 3990.0 1.351 4.00000
N5 Linux 2.6.39m 3362.0 0.235 4.00000


where the only difference between N4 and N5 is that in N5 I've enabled
vmexits only for page faults on write protection, i.e., error code 0x3.
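
(For reference, the N5 setting corresponds to the VMX page-fault
error-code filter; a sketch of the relevant VMCS writes, assuming
TRAP_page_fault is already intercepted in the exception bitmap and the
vcpu's VMCS is loaded - not necessarily the exact change.)

/* A page fault only exits to Xen if (error_code & MASK) == MATCH.
 * 0x3 = present + write, i.e. only write-protection faults vmexit;
 * every other guest fault is handled inside the container. */
static void hybrid_set_pf_filter(void)
{
    uint32_t pfec = PFEC_page_present | PFEC_write_access;   /* 0x3 */

    __vmwrite(PAGE_FAULT_ERROR_CODE_MASK,  pfec);
    __vmwrite(PAGE_FAULT_ERROR_CODE_MATCH, pfec);
}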

I'm trying to figure out how the vtlb implementation relates to SDM 28.3.5.
Glancing at the code, it seems in xen the vtlb is mostly for shadows, which
I am not worrying about for now (I've totally ignored migration for now).
Any thoughts, anybody?

Also, at present I am not using vtsc - is it worth looking into? Some of
the tsc stuff makes my head spin just like the shadow code does :)...

thanks,
Mukesh


Re: HYBRID: PV in HVM container
On 09/07/2011 02:53, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> where the only difference between N4 and N5 is that in N5 I've enabled
> vmexits only for page faults on write protection, ie, err code 0x3.
>
> I'm trying to figure out how the vtlb implementation relates to SDM 28.3.5.
> Glancing at the code, it seems in xen the vtlb is mostly for shadows, which
> I am not worrying about for now (I've totally ignored migration for now).
> Any thoughts, anybody?

You don't have to understand it very much. Trapping on write faults from
supervisor mode only is fine for normal operation, and you'll have to fault
on everything during live migration (since shadow page tables are built up
via read & write demand faults).

> Also, at present I am not using vtsc - is it worth looking into? Some of
> the tsc stuff makes my head spin just like the shadow code does :)...

You have to understand that even less. For pure PV CR4.TSD gets set
appropriately for the VTSC mode. You can hook off that, or duplicate that,
to enable/disable RDTSC exiting instead. You don't have to actually *do* any
vtsc work, as no doubt you jump out of your wrapper code into the proper PV
paths for actually getting any real hypervisor work done (like vtsc).
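
(A minimal sketch of the hook described here, assuming the usual field
names in the tree - the domain's vtsc flag and the VMX execution
control; the function itself is illustrative.)

/* Mirror the PV vtsc decision into the VMX controls: when the domain is
 * in vtsc mode (where PV would set CR4.TSD), make RDTSC vmexit so the
 * existing PV emulation path runs; otherwise let RDTSC run natively.
 * Assumes the vcpu's VMCS is loaded. */
static void hybrid_update_rdtsc_exiting(struct vcpu *v)
{
    if ( v->domain->arch.vtsc )
        v->arch.hvm_vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
    else
        v->arch.hvm_vmx.exec_control &= ~CPU_BASED_RDTSC_EXITING;

    __vmwrite(CPU_BASED_VM_EXEC_CONTROL, v->arch.hvm_vmx.exec_control);
}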

-- Keir



Re: HYBRID: PV in HVM container
Hi folks,

Well, I did some benchmarking and found interesting results. Following
runs are on a westmere with 2 sockets and 10GB RAM. Xen was booted
with maxcpus=2 and entire RAM. All guests were started with 1vcpu and 2GB
RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB
and 1 cpu. HVM guest has EPT enabled. HT is on.

So, unless the NUMA'ness interfered with results (using some memory on
remote socket), it appears HVM does very well. To the point that it
seems a hybrid is not going to be worth it. I am currently running
tests on a single socket system just to be sure.

I am attaching my diffs in case anyone wants to see what I did. I used
xen 4.0.2 and linux 2.6.39.

thanks,
Mukesh

L M B E N C H 3 . 0 S U M M A R Y

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
PV Linux 2.6.39f 2639 0.65 0.88 2.14 4.59 3.77 0.79 3.62 535. 1294 3308
Hybrid Linux 2.6.39f 2639 0.13 0.21 0.89 1.96 3.08 0.24 1.10 529. 1294 3246
HVM Linux 2.6.39f 2639 0.12 0.21 0.64 1.76 3.04 0.24 3.37 113. 354. 1324
Baremetal Linux 2.6.39+ 2649 0.13 0.23 0.74 1.93 3.46 0.28 1.58 127. 386. 1434

Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host OS intgr intgr intgr intgr intgr
bit add mul div mod
--------- ------------- ------ ------ ------ ------ ------
PV Linux 2.6.39f 0.3800 0.0100 0.1700 9.1000 9.0400
Hybrid Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0300
HVM Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0600
Baremetal Linux 2.6.39+ 0.3800 0.0100 0.1700 9.0600 8.9800

Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host OS float float float float
add mul div bogo
--------- ------------- ------ ------ ------ ------
PV Linux 2.6.39f 1.1300 1.5200 5.6200 5.2900
Hybrid Linux 2.6.39f 1.1300 1.5200 5.6300 5.2900
HVM Linux 2.6.39f 1.1400 1.5200 5.6300 5.3000
Baremetal Linux 2.6.39+ 1.1300 1.5100 5.6000 5.2700

Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host OS double double double double
add mul div bogo
--------- ------------- ------ ------ ------ ------
PV Linux 2.6.39f 1.1300 1.9000 8.6400 8.3200
Hybrid Linux 2.6.39f 1.1400 1.9000 8.6600 8.3200
HVM Linux 2.6.39f 1.1400 1.9000 8.6600 8.3300
Baremetal Linux 2.6.39+ 1.1300 1.8900 8.6100 8.2800

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
PV Linux 2.6.39f 5.2800 5.7600 6.3600 6.3200 7.3600 6.69000 7.46000
Hybrid Linux 2.6.39f 4.9200 4.9300 5.2200 5.7600 6.9600 6.12000 7.31000
HVM Linux 2.6.39f 1.3100 1.2200 1.6200 1.9200 3.2600 2.23000 3.48000
Baremetal Linux 2.6.39+ 1.5500 1.4100 2.0600 2.2500 3.3900 2.44000 3.38000

*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host OS 2p/0K Pipe AF UDP RPC/ TCP RPC/ TCP
ctxsw UNIX UDP TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
PV Linux 2.6.39f 5.280 16.6 21.3 25.9 33.7 34.7 41.8 87.
Hybrid Linux 2.6.39f 4.920 11.2 14.4 19.6 26.1 27.5 32.9 71.
HVM Linux 2.6.39f 1.310 4.416 6.15 9.386 14.8 15.8 20.1 45.
Baremetal Linux 2.6.39+ 1.550 4.625 7.34 14.3 19.8 21.4 26.4 66.

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page 100fd
Create Delete Create Delete Latency Fault Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
PV Linux 2.6.39f 24.0K 0.746 3.55870 2.184
Hybrid Linux 2.6.39f 24.6K 0.238 4.00100 1.480
HVM Linux 2.6.39f 4716.0 0.202 0.96600 1.468
Baremetal Linux 2.6.39+ 6898.0 0.325 0.93610 1.620

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host OS Pipe AF TCP File Mmap Bcopy Bcopy Mem Mem
UNIX reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
PV Linux 2.6.39f 1661 2081 1041 3293.3 5528.3 3106.6 2800.0 4472 5633.
Hybrid Linux 2.6.39f 1974 2450 1183 3481.5 5529.6 3114.9 2786.6 4470 5672.
HVM Linux 2.6.39f 3232 2929 1622 3541.3 5527.5 3077.1 2765.6 4453 5634.
Baremetal Linux 2.6.39+ 3320 2800 1666 3523.6 5578.9 3147.0 2841.6 4541 5752.

Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)
------------------------------------------------------------------------------
Host OS Mhz L1 $ L2 $ Main mem Rand mem Guesses
--------- ------------- --- ---- ---- -------- -------- -------
PV Linux 2.6.39f 2639 1.5160 5.9170 29.7 97.5
Hybrid Linux 2.6.39f 2639 1.5170 7.5000 29.7 97.4
HVM Linux 2.6.39f 2639 1.5190 4.0210 29.8 105.4
Baremetal Linux 2.6.39+ 2649 1.5090 3.8370 29.2 78.0

Re: HYBRID: PV in HVM container
On Thu, 28 Jul 2011, Mukesh Rathor wrote:
> Hi folks,
>
> Well, I did some benchmarking and found interesting results. Following
> runs are on a westmere with 2 sockets and 10GB RAM. Xen was booted
> with maxcpus=2 and entire RAM. All guests were started with 1vcpu and 2GB
> RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB
> and 1 cpu. HVM guest has EPT enabled. HT is on.
>
> So, unless the NUMA'ness interfered with results (using some memory on
> remote socket), it appears HVM does very well. To the point that it
> seems a hybrid is not going to be worth it. I am currently running
> tests on a single socket system just to be sure.
>

The high level benchmarks I run to compare PV and PV on HVM guests show
a very similar scenario.

It is still worth having HYBRID guests (running with EPT?) in order to
support dom0 in an HVM container one day not too far from now.

Re: HYBRID: PV in HVM container
On Wed, Jul 27, 2011 at 06:58:28PM -0700, Mukesh Rathor wrote:
> Hi folks,
>
> Well, I did some benchmarking and found interesting results. Following
> runs are on a westmere with 2 sockets and 10GB RAM. Xen was booted
> with maxcpus=2 and entire RAM. All guests were started with 1vcpu and 2GB
> RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB
> and 1 cpu. HVM guest has EPT enabled. HT is on.

Is this PVonHVM? Or is it real HVM without _any_ PV enablement? Ah, the
.config tells me it is PVonHVM - so IRQ callbacks, and timers are PV
actually.

>
> So, unless the NUMA'ness interfered with results (using some memory on
> remote socket), it appears HVM does very well. To the point that it
> seems a hybrid is not going to be worth it. I am currently running
> tests on a single socket system just to be sure.

The xm has some NUMA capability while xl does not. Did you use xm or xl to
run this?

>
> I am attaching my diffs in case anyone wants to see what I did. I used
> xen 4.0.2 and linux 2.6.39.

Wow. That is a surprisingly compact set of changes to the Linux kernel.
Good job.
>
> thanks,
> Mukesh
>
> L M B E N C H 3 . 0 S U M M A R Y
>
> Processor, Processes - times in microseconds - smaller is better
> ------------------------------------------------------------------------------
> Host OS Mhz null null open slct sig sig fork exec sh
> call I/O stat clos TCP inst hndl proc proc proc
> --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
> PV Linux 2.6.39f 2639 0.65 0.88 2.14 4.59 3.77 0.79 3.62 535. 1294 3308
> Hybrid Linux 2.6.39f 2639 0.13 0.21 0.89 1.96 3.08 0.24 1.10 529. 1294 3246

Hm, so it follows baremetal until fork/exec/sh, at which point it is as
bad as PV.

> HVM Linux 2.6.39f 2639 0.12 0.21 0.64 1.76 3.04 0.24 3.37 113. 354. 1324

<blinks> So HVM is better than baremetal?

> Baremetal Linux 2.6.39+ 2649 0.13 0.23 0.74 1.93 3.46 0.28 1.58 127. 386. 1434
>
> Basic integer operations - times in nanoseconds - smaller is better
> -------------------------------------------------------------------
> Host OS intgr intgr intgr intgr intgr
> bit add mul div mod
> --------- ------------- ------ ------ ------ ------ ------
> PV Linux 2.6.39f 0.3800 0.0100 0.1700 9.1000 9.0400
> Hybrid Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0300
> HVM Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0600
> Baremetal Linux 2.6.39+ 0.3800 0.0100 0.1700 9.0600 8.9800
>
> Basic float operations - times in nanoseconds - smaller is better
> -----------------------------------------------------------------
> Host OS float float float float
> add mul div bogo
> --------- ------------- ------ ------ ------ ------
> PV Linux 2.6.39f 1.1300 1.5200 5.6200 5.2900
> Hybrid Linux 2.6.39f 1.1300 1.5200 5.6300 5.2900
> HVM Linux 2.6.39f 1.1400 1.5200 5.6300 5.3000
> Baremetal Linux 2.6.39+ 1.1300 1.5100 5.6000 5.2700
>
> Basic double operations - times in nanoseconds - smaller is better
> ------------------------------------------------------------------
> Host OS double double double double
> add mul div bogo
> --------- ------------- ------ ------ ------ ------
> PV Linux 2.6.39f 1.1300 1.9000 8.6400 8.3200
> Hybrid Linux 2.6.39f 1.1400 1.9000 8.6600 8.3200
> HVM Linux 2.6.39f 1.1400 1.9000 8.6600 8.3300
> Baremetal Linux 2.6.39+ 1.1300 1.8900 8.6100 8.2800
>
> Context switching - times in microseconds - smaller is better
> -------------------------------------------------------------------------
> Host OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
> ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw
> --------- ------------- ------ ------ ------ ------ ------ ------- -------
> PV Linux 2.6.39f 5.2800 5.7600 6.3600 6.3200 7.3600 6.69000 7.46000
> Hybrid Linux 2.6.39f 4.9200 4.9300 5.2200 5.7600 6.9600 6.12000 7.31000

So the diff between PV and Hybrid looks to be 8%..

And then ~50% difference between Hybrid and baremetal. So syscall is
only causing 8% drop in performance - what is the other 42%?

> HVM Linux 2.6.39f 1.3100 1.2200 1.6200 1.9200 3.2600 2.23000 3.48000

This is really bizarre. HVM kicks baremetal butt?
> Baremetal Linux 2.6.39+ 1.5500 1.4100 2.0600 2.2500 3.3900 2.44000 3.38000
>
> *Local* Communication latencies in microseconds - smaller is better
> ---------------------------------------------------------------------
> Host OS 2p/0K Pipe AF UDP RPC/ TCP RPC/ TCP
> ctxsw UNIX UDP TCP conn
> --------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
> PV Linux 2.6.39f 5.280 16.6 21.3 25.9 33.7 34.7 41.8 87.
> Hybrid Linux 2.6.39f 4.920 11.2 14.4 19.6 26.1 27.5 32.9 71.
> HVM Linux 2.6.39f 1.310 4.416 6.15 9.386 14.8 15.8 20.1 45.
> Baremetal Linux 2.6.39+ 1.550 4.625 7.34 14.3 19.8 21.4 26.4 66.
>
> File & VM system latencies in microseconds - smaller is better
> -------------------------------------------------------------------------------
> Host OS 0K File 10K File Mmap Prot Page 100fd
> Create Delete Create Delete Latency Fault Fault selct
> --------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
> PV Linux 2.6.39f 24.0K 0.746 3.55870 2.184
> Hybrid Linux 2.6.39f 24.6K 0.238 4.00100 1.480

Could the mmap and the pagetable creations be the fault (ha! a pun!) of sucky
performance? Perhaps running with autotranslate pagetables would eliminate this?

Is the mmap doing small little 4K runs or something much bigger?


> HVM Linux 2.6.39f 4716.0 0.202 0.96600 1.468
> Baremetal Linux 2.6.39+ 6898.0 0.325 0.93610 1.620
>
> *Local* Communication bandwidths in MB/s - bigger is better
> -----------------------------------------------------------------------------
> Host OS Pipe AF TCP File Mmap Bcopy Bcopy Mem Mem
> UNIX reread reread (libc) (hand) read write
> --------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
> PV Linux 2.6.39f 1661 2081 1041 3293.3 5528.3 3106.6 2800.0 4472 5633.
> Hybrid Linux 2.6.39f 1974 2450 1183 3481.5 5529.6 3114.9 2786.6 4470 5672.
> HVM Linux 2.6.39f 3232 2929 1622 3541.3 5527.5 3077.1 2765.6 4453 5634.
> Baremetal Linux 2.6.39+ 3320 2800 1666 3523.6 5578.9 3147.0 2841.6 4541 5752.
>
> Memory latencies in nanoseconds - smaller is better
> (WARNING - may not be correct, check graphs)
> ------------------------------------------------------------------------------
> Host OS Mhz L1 $ L2 $ Main mem Rand mem Guesses
> --------- ------------- --- ---- ---- -------- -------- -------
> PV Linux 2.6.39f 2639 1.5160 5.9170 29.7 97.5
> Hybrid Linux 2.6.39f 2639 1.5170 7.5000 29.7 97.4
> HVM Linux 2.6.39f 2639 1.5190 4.0210 29.8 105.4
> Baremetal Linux 2.6.39+ 2649 1.5090 3.8370 29.2 78.0


OK, so once you have access to the memory, using it under PV is actually OK.

Re: HYBRID: PV in HVM container
On Thu, Jul 28, 2011 at 12:34:42PM +0100, Stefano Stabellini wrote:
> On Thu, 28 Jul 2011, Mukesh Rathor wrote:
> > Hi folks,
> >
> > Well, I did some benchmarking and found interesting results. Following
> > runs are on a westmere with 2 sockets and 10GB RAM. Xen was booted
> > with maxcpus=2 and entire RAM. All guests were started with 1vcpu and 2GB
> > RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB
> > and 1 cpu. HVM guest has EPT enabled. HT is on.
> >
> > So, unless the NUMA'ness interfered with results (using some memory on
> > remote socket), it appears HVM does very well. To the point that it
> > seems a hybrid is not going to be worth it. I am currently running
> > tests on a single socket system just to be sure.
> >
>
> The high level benchmarks I run to compare PV and PV on HVM guests show
> a very similar scenario.
>
> It is still worth having HYBRID guests (running with EPT?) in order to
> support dom0 in an HVM container one day not too far from now.

I am just wondering how much dom0 cares about this? I mean if you use
blkback, netback - etc - they are all in the kernel. The device drivers
are also in the kernel.

Based on Mukesh's results, the PVonHVM work you did really paid
off for guests. The numbers are even better than baremetal, which
I am a bit surprised by - maybe the collection of the data (gettimeofday)
is not the best?

Re: HYBRID: PV in HVM container
On Fri, 29 Jul 2011, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 28, 2011 at 12:34:42PM +0100, Stefano Stabellini wrote:
> > On Thu, 28 Jul 2011, Mukesh Rathor wrote:
> > > Hi folks,
> > >
> > > Well, I did some benchmarking and found interesting results. Following
> > > runs are on a westmere with 2 sockets and 10GB RAM. Xen was booted
> > > with maxcpus=2 and entire RAM. All guests were started with 1vcpu and 2GB
> > > RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB
> > > and 1 cpu. HVM guest has EPT enabled. HT is on.
> > >
> > > So, unless the NUMA'ness interfered with results (using some memory on
> > > remote socket), it appears HVM does very well. To the point that it
> > > seems a hybrid is not going to be worth it. I am currently running
> > > tests on a single socket system just to be sure.
> > >
> >
> > The high level benchmarks I run to compare PV and PV on HVM guests show
> > a very similar scenario.
> >
> > It is still worth having HYBRID guests (running with EPT?) in order to
> > support dom0 in an HVM container one day not too far from now.
>
> I am just wondering how much dom0 cares about this? I mean if you use
> blkback, netback - etc - they are all in the kernel. The device drivers
> are also in the kernel.

There are always going to be some userspace processes, even with
stubdoms.
Besides if we have HVM dom0, we can enable
XENFEAT_auto_translated_physmap and EPT and have the same level of
performance as a PV on HVM guest. Moreover, since we wouldn't be using
the mmu pvops anymore we could drop them completely: that would greatly
simplify the Xen maintenance in the Linux kernel as well as gain back
some love from the x86 maintainers :)

The way I see it, normal Linux guests would be PV on HVM guests, but we
still need to do something about dom0.
This work would make dom0 exactly like PV on HVM guests apart from
the boot sequence: dom0 would still boot from xen_start_kernel,
everything else would be pretty much the same.

I would ask you to run some benchmarks using
XENFEAT_auto_translated_physmap but I am afraid it bitrotted over the
years so it would need some work to get it working.
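
(For reference, a minimal sketch of how a guest probes for the feature
today, via the standard XENVER_get_features interface; this is the sort
of check that would gate choosing the auto-translated path.)

/* Query Xen's feature bitmap and test for auto-translated physmap
 * support before deciding which mmu strategy to use. */
#include <linux/types.h>
#include <xen/interface/version.h>
#include <xen/interface/features.h>

static bool xen_has_auto_translated_physmap(void)
{
        struct xen_feature_info fi = { .submap_idx = 0 };

        if (HYPERVISOR_xen_version(XENVER_get_features, &fi) < 0)
                return false;

        return fi.submap & (1 << XENFEAT_auto_translated_physmap);
}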

Re: HYBRID: PV in HVM container
> > I am just wondering how much dom0 cares about this? I mean if you use
> > blkback, netback - etc - they are all in the kernel. The device drivers
> > are also in the kernel.
>
> There are always going to be some userspace processes, even with
> stubdoms.

Stubdomains? Linux HVMs now have PVonHVM - and for Windows there is a
multitude of PV drivers available. But sure, there are some processes - like
snort or other packet-filtering userland software.

> Besides if we have HVM dom0, we can enable
> XENFEAT_auto_translated_physmap and EPT and have the same level of
> performance as a PV on HVM guest. Moreover, since we wouldn't be using
> the mmu pvops anymore we could drop them completely: that would greatly

Sure. It also means you MUST have an IOMMU in the box.

> simplify the Xen maintenance in the Linux kernel as well as gain back
> some love from the x86 maintainers :)
>
> The way I see it, normal Linux guests would be PV on HVM guests, but we
> still need to do something about dom0.
> This work would make dom0 exactly like PV on HVM guests apart from
> the boot sequence: dom0 would still boot from xen_start_kernel,
> everything else would be pretty much the same.

Ah, so not HVM exactly (you would only use the EPT/NPT/RV1/HAP for
pagetables).. and PV for startup, spinlock, timers, debug, CPU, and
backends. Though sticking in the HVM container in PV that Mukesh
made work would also benefit.

Or just come back to the idea of "real" HVM device driver domains
and have the PV dom0 be a light one loading the rest. But the setup of
it is just so complex.. And the PV dom0 needs to deal with the PCI backend
xenstore, and be able to comprehend ACPI _PRT... and then launch the "device
driver" Dom0, which at its simplest form would have all of the devices
passed in to it.

So four payloads: PV dom0, PV dom0 initrd, HVM dom0, HVM dom0 initrd :-)
Ok, that is too cumbersome. Maybe ingest the PV dom0+initrd in the Xen
hypervisor binary.. I should stop here.

Re: HYBRID: PV in HVM container
On Fri, 29 Jul 2011, Konrad Rzeszutek Wilk wrote:
> > Besides if we have HVM dom0, we can enable
> > XENFEAT_auto_translated_physmap and EPT and have the same level of
> > performance as a PV on HVM guest. Moreover, since we wouldn't be using
> > the mmu pvops anymore we could drop them completely: that would greatly
>
> Sure. It also means you MUST have an IOMMU in the box.

Why?
We can still remap interrupts into event channels.
Maybe you mean VMX?


> > simplify the Xen maintenance in the Linux kernel as well as gain back
> > some love from the x86 maintainers :)
> >
> > The way I see it, normal Linux guests would be PV on HVM guests, but we
> > still need to do something about dom0.
> > This work would make dom0 exactly like PV on HVM guests apart from
> > the boot sequence: dom0 would still boot from xen_start_kernel,
> > everything else would be pretty much the same.
>
> Ah, so not HVM exactly (you would only use the EPT/NPT/RV1/HAP for
> pagetables).. and PV for startup, spinlock, timers, debug, CPU, and
> > backends. Though sticking in the HVM container in PV that Mukesh
> made work would also benefit.

Yes for startup, spinlock, timers and backends. I would use HVM for cpu
operations too (no need for pv_cpu_ops.write_gdt_entry anymore for
example).
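
(To illustrate the point: this is roughly the shape of the pvop that
could go away - simplified from the Linux xen_write_gdt_entry() hook,
not quoted verbatim. Under PV the GDT frame is read-only to the guest,
so every descriptor update is a hypercall; in an HVM container the
kernel is in ring 0 on its own GDT and the native write works.)

/* Simplified: the PV hook has to ask Xen to update the descriptor. */
static void xen_write_gdt_entry(struct desc_struct *dt, int entry,
                                const void *desc, int type)
{
        xmaddr_t maddr = arbitrary_virt_to_machine(&dt[entry]);

        if (HYPERVISOR_update_descriptor(maddr.maddr, *(u64 *)desc))
                BUG();
}
/* In the HVM container, pv_cpu_ops.write_gdt_entry can simply remain
 * native_write_gdt_entry(). */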


> Or just come back to the idea of "real" HVM device driver domains
> and have the PV dom0 be a light one loading the rest. But the setup of
> it is just so complex.. And the PV dom0 needs to deal with the PCI backend
> xenstore, and be able to comprehend ACPI _PRT... and then launch the "device
> driver" Dom0, which at its simplest form would have all of the devices
> passed in to it.
>
> So four payloads: PV dom0, PV dom0 initrd, HVM dom0, HVM dom0 initrd :-)
> Ok, that is too cumbersome. Maybe ingest the PV dom0+initrd in the Xen
> hypervisor binary.. I should stop here.

The goal of splitting up dom0 into multiple management domain is surely
a worthy goal, no matter is the domains are PV or HVM or PV on HVM, but
yeah the setup is hard. I hope that the we'll be able to simplify it in
the near future, maybe after the switchover to the new qemu and seabios
is completed.

Re: HYBRID: PV in HVM container
On Fri, Jul 29, 2011 at 07:00:07PM +0100, Stefano Stabellini wrote:
> On Fri, 29 Jul 2011, Konrad Rzeszutek Wilk wrote:
> > > Besides if we have HVM dom0, we can enable
> > > XENFEAT_auto_translated_physmap and EPT and have the same level of
> > > performance as a PV on HVM guest. Moreover, since we wouldn't be using
> > > the mmu pvops anymore we could drop them completely: that would greatly
> >
> > Sure. It also means you MUST have an IOMMU in the box.
>
> Why?

For HVM dom0s.. But I think when you say HVM here, you mean using
PV with the hypervisor's code that is used for managing page-tables - EPT/NPT/HAP.

So PV+HAP = Stefano's HVM :-)

> We can still remap interrupts into event channels.
> Maybe you mean VMX?
>
>
> > > simplify the Xen maintenance in the Linux kernel as well as gain back
> > > some love from the x86 maintainers :)
> > >
> > > The way I see it, normal Linux guests would be PV on HVM guests, but we
> > > still need to do something about dom0.
> > > This work would make dom0 exactly like PV on HVM guests apart from
> > > the boot sequence: dom0 would still boot from xen_start_kernel,
> > > everything else would be pretty much the same.
> >
> > Ah, so not HVM exactly (you would only use the EPT/NPT/RV1/HAP for
> > pagetables).. and PV for startup, spinlock, timers, debug, CPU, and
> > backends. Though sticking in the HVM container in PV that Mukesh
> > made work would also benefit.
>
> Yes for startup, spinlock, timers and backends. I would use HVM for cpu
> operations too (no need for pv_cpu_ops.write_gdt_entry anymore for
> example).

OK, so a SVM/VMX setup is required.
>
>
> > Or just come back to the idea of "real" HVM device driver domains
> > and have the PV dom0 be a light one loading the rest. But the setup of
> > it is just so complex.. And the PV dom0 needs to deal with the PCI backend
> > xenstore, and be able to comprehend ACPI _PRT... and then launch the "device
> > driver" Dom0, which at its simplest form would have all of the devices
> > passed in to it.
> >
> > So four payloads: PV dom0, PV dom0 initrd, HVM dom0, HVM dom0 initrd :-)
> > Ok, that is too cumbersome. Maybe ingest the PV dom0+initrd in the Xen
> > hypervisor binary.. I should stop here.
>
> The goal of splitting up dom0 into multiple management domains is surely
> a worthy one, no matter whether the domains are PV or HVM or PV on HVM, but
> yeah, the setup is hard. I hope that we'll be able to simplify it in
> the near future, maybe after the switchover to the new qemu and seabios
> is completed.

Re: HYBRID: PV in HVM container
On Fri, 29 Jul 2011, Konrad Rzeszutek Wilk wrote:
> On Fri, Jul 29, 2011 at 07:00:07PM +0100, Stefano Stabellini wrote:
> > On Fri, 29 Jul 2011, Konrad Rzeszutek Wilk wrote:
> > > > Besides if we have HVM dom0, we can enable
> > > > XENFEAT_auto_translated_physmap and EPT and have the same level of
> > > > performance as a PV on HVM guest. Moreover, since we wouldn't be using
> > > > the mmu pvops anymore we could drop them completely: that would greatly
> > >
> > > Sure. It also means you MUST have an IOMMU in the box.
> >
> > Why?
>
> For HVM dom0s.. But I think when you say HVM here, you mean using
> PV with the hypervisor's code that is used for managing page-tables - EPT/NPT/HAP.
>
> So PV+HAP = Stefano's HVM :-)

:-)

> > Yes for startup, spinlock, timers and backends. I would use HVM for cpu
> > operations too (no need for pv_cpu_ops.write_gdt_entry anymore for
> > example).
>
> OK, so a SVM/VMX setup is required.

Yes, I would actually require SVM/VMX for performance and to simplify
the setup and maintenance of the code in general.


1 2  View All