Mailing List Archive: Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization

anup at brainfault

May 5, 2012, 7:24 AM

Post #1 of 26 (8415 views)

Hi PMM,

I agree we cannot predict real-world performance based on performance on
ARM Fast Models, but if system A performs better than system B on an ARM
Fast Model or QEMU, then in the real world system A will also perform better
than system B. Of course, in the real world the scale of the difference in
performance between system A and system B will differ.

The previous announcement only proves that Xvisor ARM is relatively better
than KVM ARM.

Regards,
--Anup

On Sat, May 5, 2012 at 3:36 PM, Peter Maydell <peter.maydell@linaro.org> wrote:

> 2012/5/5 Anup Patel <anup@brainfault.org>:
> > This announcement is to show an apple to apple performance comparison
> > between Xvisor ARM and KVM ARM running on VExpress-A15 Fast Model.
>
> I would strongly caution against trying to do any performance/timing
> type tests if you're still running on the ARM Fast Model -- they are
> not representative of performance characteristics on hardware
> and you really can't draw any conclusions about real world
> performance by timing things on a model. It's quite easy to get
> into a situation where all you're measuring is "does my code happen
> to do a lot of some perfectly reasonable operation which happens
> to be hard and slow to implement for the model?".
>
> (Also, KVM for ARM is still under development and we haven't
> yet made several of the obvious performance improvements like
> in-kernel irqchip and timer support, so it's not really a very
> useful thing to compare against yet.)
>
> -- PMM
>

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

peter.maydell at linaro

May 5, 2012, 3:06 AM

Post #2 of 26 (8257 views)

2012/5/5 Anup Patel <anup@brainfault.org>:
> This announcement is to show an apple to apple performance comparison
> between Xvisor ARM and KVM ARM running on VExpress-A15 Fast Model.

I would strongly caution against trying to do any performance/timing
type tests if you're still running on the ARM Fast Model -- they are
not representative of performance characteristics on hardware
and you really can't draw any conclusions about real world
performance by timing things on a model. It's quite easy to get
into a situation where all you're measuring is "does my code happen
to do a lot of some perfectly reasonable operation which happens
to be hard and slow to implement for the model?".

(Also, KVM for ARM is still under development and we haven't
yet made several of the obvious performance improvements like
in-kernel irqchip and timer support, so it's not really a very
useful thing to compare against yet.)

-- PMM

_______________________________________________
Xen-arm mailing list
Xen-arm@lists.xen.org
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-arm

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

anup at brainfault

May 5, 2012, 7:31 AM

Post #5 of 26 (8311 views)

Hi PMM,

Also, in my view, even if we have in-kernel emulation of the irqchip and timer,
Xvisor ARM will still perform better than KVM ARM because the amount of
code path traversed in KVM ARM will always be greater.

(Please note that my view about in-kernel emulation is based entirely on a code
flow comparison of Xvisor ARM emulation and possible KVM ARM in-kernel
emulation.)

Regards,
--Anup

On Sat, May 5, 2012 at 7:54 PM, Anup Patel <anup@brainfault.org> wrote:

> Hi PMM,
>
> I agree we cannot predict real-world performance based on performance on
> ARM Fast Models, but if system A performs better than system B on an ARM
> Fast Model or QEMU, then in the real world system A will also perform better
> than system B. Of course, in the real world the scale of the difference in
> performance between system A and system B will differ.
>
> The previous announcement only proves that Xvisor ARM is relatively better
> than KVM ARM.
>
> Regards,
> --Anup
>
>
> On Sat, May 5, 2012 at 3:36 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
>
>> 2012/5/5 Anup Patel <anup@brainfault.org>:
>> > This announcement is to show an apple to apple performance comparison
>> > between Xvisor ARM and KVM ARM running on VExpress-A15 Fast Model.
>>
>> I would strongly caution against trying to do any performance/timing
>> type tests if you're still running on the ARM Fast Model -- they are
>> not representative of performance characteristics on hardware
>> and you really can't draw any conclusions about real world
>> performance by timing things on a model. It's quite easy to get
>> into a situation where all you're measuring is "does my code happen
>> to do a lot of some perfectly reasonable operation which happens
>> to be hard and slow to implement for the model?".
>>
>> (Also, KVM for ARM is still under development and we haven't
>> yet made several of the obvious performance improvements like
>> in-kernel irqchip and timer support, so it's not really a very
>> useful thing to compare against yet.)
>>
>> -- PMM
>>
>
>

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

marc.zyngier at arm

May 5, 2012, 8:11 AM

Post #7 of 26 (8247 views)

On Sat, 5 May 2012 15:24:26 +0100
Anup Patel <anup@brainfault.org> wrote:

> Hi PMM,
>
> I agree we cannot predict real-world performance based on performance on ARM Fast Models, but if system A performs better than system B on an ARM Fast Model or QEMU, then in the real world system A will also perform better than system B. Of course, in the real world the scale of the difference in performance between system A and system B will differ.
>

You may want to re-read Peter's email, and consider that the model
doesn't represent the micro-architecture. A code sequence X can be
faster than a sequence Y on the model, and the opposite on real
hardware. The same is equally valid on two different implementations of
the same architecture (Cortex-A7 vs Cortex-A15, for example).

> The previous announcement only proves that Xvisor ARM is relatively better than KVM ARM.

On the Fast Model.

M.

> On Sat, May 5, 2012 at 3:36 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> 2012/5/5 Anup Patel <anup@brainfault.org>:
> > This announcement is to show an apple to apple performance comparison
> > between Xvisor ARM and KVM ARM running on VExpress-A15 Fast Model.
>
> I would strongly caution against trying to do any performance/timing
> type tests if you're still running on the ARM Fast Model -- they are
> not representative of performance characteristics on hardware
> and you really can't draw any conclusions about real world
> performance by timing things on a model. It's quite easy to get
> into a situation where all you're measuring is "does my code happen
> to do a lot of some perfectly reasonable operation which happens
> to be hard and slow to implement for the model?".
>
> (Also, KVM for ARM is still under development and we haven't
> yet made several of the obvious performance improvements like
> in-kernel irqchip and timer support, so it's not really a very
> useful thing to compare against yet.)
>
> -- PMM
>

--
I'm the slime oozin' out from your TV set...

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

marc.zyngier at arm

May 5, 2012, 8:14 AM

Post #9 of 26 (8239 views)

On Sat, 5 May 2012 15:31:36 +0100
Anup Patel <anup@brainfault.org> wrote:

> Hi PMM,
>
> Also, in my view, even if we have in-kernel emulation of the irqchip and timer, Xvisor ARM will still perform better than KVM ARM because the amount of code path traversed in KVM ARM will always be greater.
>
> (Please note that my view about in-kernel emulation is based entirely on a code flow comparison of Xvisor ARM emulation and possible KVM ARM in-kernel emulation.)
>

Sweet. Can I borrow your crystal ball?

M.

> On Sat, May 5, 2012 at 7:54 PM, Anup Patel <anup@brainfault.org> wrote:
> Hi PMM,
>
> I agree we cannot predict real-world performance based on performance on ARM Fast Models, but if system A performs better than system B on an ARM Fast Model or QEMU, then in the real world system A will also perform better than system B. Of course, in the real world the scale of the difference in performance between system A and system B will differ.
>
> The previous announcement only proves that Xvisor ARM is relatively better than KVM ARM.
>
> Regards,
> --Anup
>
>
> On Sat, May 5, 2012 at 3:36 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> 2012/5/5 Anup Patel <anup@brainfault.org>:
> > This announcement is to show an apple to apple performance comparison
> > between Xvisor ARM and KVM ARM running on VExpress-A15 Fast Model.
>
> I would strongly caution against trying to do any performance/timing
> type tests if you're still running on the ARM Fast Model -- they are
> not representative of performance characteristics on hardware
> and you really can't draw any conclusions about real world
> performance by timing things on a model. It's quite easy to get
> into a situation where all you're measuring is "does my code happen
> to do a lot of some perfectly reasonable operation which happens
> to be hard and slow to implement for the model?".
>
> (Also, KVM for ARM is still under development and we haven't
> yet made several of the obvious performance improvements like
> in-kernel irqchip and timer support, so it's not really a very
> useful thing to compare against yet.)
>
> -- PMM
>
>

--
I'm the slime oozin' out from your TV set...

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

anup at brainfault

May 5, 2012, 9:22 PM

Post #11 of 26 (8321 views)

Hi Marc,

We must understand that the hypervisor claimed to be the best performing today
is a completely monolithic hypervisor (i.e. VMware ESX Server). The Xvisor vision
is to have a GPLv2 monolithic hypervisor. Our point is that the KVM approach to
virtualization is not the optimal one, and you will end up putting more and more
things in-kernel.

Also, can you give an example of a code sequence which is faster on the model
and slower in the real world? As far as I know, ARM Fast Models are internally
TLM-based models, and if a TLM-based model is emulating a timer chip with an X
clock, then it is quite precisely an X clock. Of course, CPU emulation and
computation power will be less compared to the real world. To see this
behaviour, try to boot Linux on a Fast Model or QEMU, leave it for hours, and
come back to see the time elapsed; you will definitely see the same amount of
time elapsed as in the real world.

The results in the announcement are not baseless; we have quite a number of
reasons to believe Xvisor ARM will perform better than KVM ARM in the
real world too.

Regards,
Anup Patel

On Sat, May 5, 2012 at 8:44 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:

> On Sat, 5 May 2012 15:31:36 +0100
> Anup Patel <anup@brainfault.org> wrote:
>
> > Hi PMM,
> >
> > Also, in my view, even if we have in-kernel emulation of the irqchip and
> > timer, Xvisor ARM will still perform better than KVM ARM because the amount
> > of code path traversed in KVM ARM will always be greater.
> >
> > (Please note that my view about in-kernel emulation is based entirely on a
> > code flow comparison of Xvisor ARM emulation and possible KVM ARM in-kernel
> > emulation.)
> >
>
> Sweet. Can I borrow your crystal ball?
>
> M.
>
> > On Sat, May 5, 2012 at 7:54 PM, Anup Patel <anup@brainfault.org> wrote:
> > Hi PMM,
> >
> > I agree we cannot predict real-world performance based on performance on
> > ARM Fast Models, but if system A performs better than system B on an ARM
> > Fast Model or QEMU, then in the real world system A will also perform better
> > than system B. Of course, in the real world the scale of the difference in
> > performance between system A and system B will differ.
> >
> > The previous announcement only proves that Xvisor ARM is relatively
> > better than KVM ARM.
> >
> > Regards,
> > --Anup
> >
> >
> > On Sat, May 5, 2012 at 3:36 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> > 2012/5/5 Anup Patel <anup@brainfault.org>:
> > > This announcement is to show an apple to apple performance comparison
> > > between Xvisor ARM and KVM ARM running on VExpress-A15 Fast Model.
> >
> > I would strongly caution against trying to do any performance/timing
> > type tests if you're still running on the ARM Fast Model -- they are
> > not representative of performance characteristics on hardware
> > and you really can't draw any conclusions about real world
> > performance by timing things on a model. It's quite easy to get
> > into a situation where all you're measuring is "does my code happen
> > to do a lot of some perfectly reasonable operation which happens
> > to be hard and slow to implement for the model?".
> >
> > (Also, KVM for ARM is still under development and we haven't
> > yet made several of the obvious performance improvements like
> > in-kernel irqchip and timer support, so it's not really a very
> > useful thing to compare against yet.)
> >
> > -- PMM
> >
> >
>
>
>
> --
> I'm the slime oozin' out from your TV set...
>
>

Re: [xvisor-devel] Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

hschauhan at nulltrace

May 5, 2012, 10:47 PM

Post #13 of 26 (8266 views)

> > The previous announcement only proves that Xvisor ARM is relatively
> > better than KVM ARM.
>
> On the Fast Model.
>
To quote from ARM site:

"the Fast Models can help developers debug, analyze, and __optimize__ their
applications throughout the development cycle."

If Fast Models are not as-is (I understand they don't represent the
microarchitecture), why care about optimizations on a Fast Model then?

I believe that if Xvisor wins on the Fast Model, for whatever reason, it WILL
win on hardware as well. We must accept this.

= Himanshu =

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

peter.maydell at linaro

May 6, 2012, 1:51 AM

Post #14 of 26 (8248 views)

On 6 May 2012 05:22, Anup Patel <anup@brainfault.org> wrote:
> Also, can you give an example of a code sequence which is faster on the model
> and slower in the real world? As far as I know, ARM Fast Models are internally
> TLM-based models, and if a TLM-based model is emulating a timer chip with an X
> clock, then it is quite precisely an X clock.

Support for TLM does not require that the underlying model is cycle
accurate (you can have 'loosely timed' behaviour).

You might want to read the Fast Models documentation, which tries
to be clear about what the models do and don't provide. In particular:
http://infocenter.arm.com/help/topic/com.arm.doc.dui0423l/ch02s01s02.html
"Fast models cannot be used to:
* model cycle counting
* model software performance
"

> Of course, CPU emulation and computation power will be less compared to the
> real world. To see this behaviour, try to boot Linux on a Fast Model or QEMU,
> leave it for hours, and come back to see the time elapsed; you will
> definitely see the same amount of time elapsed as in the real world.

Nobody's arguing that the models are faster than hardware!
Let's try a simple example with some numbers representing
relative speeds:

operation A: h/w 1 ; model 5
operation B: h/w 3 ; model 30

Where we're comparing two equivalent code sequences "A A A A" vs "B".
On hardware "B" will be faster. On the model the "A A A A" beats "B".
(both sequences are slower on the model than on the hardware, obviously.)
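
The arithmetic in this example can be checked with a short Python sketch. This is an illustration added alongside the message, not code from the thread: the cost table copies the relative speeds given above, and the names COST, sequence_cost, and faster_sequence are hypothetical.

```python
# Relative costs from the example above (arbitrary units, lower is faster).
COST = {
    "hardware": {"A": 1, "B": 3},
    "model": {"A": 5, "B": 30},
}

SEQ_A = ["A", "A", "A", "A"]  # four cheap operations
SEQ_B = ["B"]                 # one equivalent, more expensive operation


def sequence_cost(platform, ops):
    """Total cost of running the listed operations back to back."""
    return sum(COST[platform][op] for op in ops)


def faster_sequence(platform, seq1, seq2):
    """Return whichever of the two sequences is cheaper on the platform."""
    if sequence_cost(platform, seq1) < sequence_cost(platform, seq2):
        return seq1
    return seq2


for platform in COST:
    winner = faster_sequence(platform, SEQ_A, SEQ_B)
    print(platform,
          "A A A A =", sequence_cost(platform, SEQ_A),
          "B =", sequence_cost(platform, SEQ_B),
          "faster:", " ".join(winner))
```

On hardware the totals are 4 versus 3, so "B" wins; on the model they are 20 versus 30, so "A A A A" wins. The ranking flips because the two operations slow down by different factors (5x and 10x), not because either platform is faster overall.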

The point is that some operations will be vastly vastly slower
on the model, and some operations merely moderately slower. Which
of any two code sequences is fastest depends at least as much on
whether it's using operations that are disproportionally worse
on the model. A trivial example of this is VFP -- certainly QEMU
has to do complex software emulation of the floating point ops to
maintain bit-for-bit accuracy, which makes them very slow to the
point where a hand-optimised-integer-assembly codec is likely to
be faster on the model than a Neon/VFP-using codec, even though
of course the Neon codec will be faster on hardware.

[NB: this is itself a big simplification: model performance will
depend on a lot of interacting things and is not purely a
same-every-time slowdown per operation. Some operations effectively
slow down what happens after them, for instance on QEMU if you do
something that makes us flush our cache of translated code. And
if for instance you have a periodic timer then the fact the model
is generally slower means you execute proportionally more insns in
the timer interrupt, so inefficiency or slowness in that code path
has disproportionately more effect on overall speed than it does
on hardware. There are other complications too...]
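
The periodic-timer effect mentioned in the note above can be sketched numerically. This is a rough illustration only, not something from the thread: it assumes the timer fires at a fixed wall-clock rate on both platforms and that the handler executes a fixed number of instructions per tick. Every number and the function name handler_share are hypothetical.

```python
def handler_share(insns_per_second, tick_hz, handler_insns):
    """Fraction of all executed instructions spent in the timer handler.

    Assumes (hypothetically) a timer firing tick_hz times per wall-clock
    second and a handler costing handler_insns instructions per tick.
    """
    insns_per_tick = insns_per_second / tick_hz
    return min(1.0, handler_insns / insns_per_tick)


# Hypothetical figures: the hardware retires 100x more instructions per
# second than the model; the timer rate and handler length are identical.
TICK_HZ = 100
HANDLER_INSNS = 2000

hw_share = handler_share(1e9, TICK_HZ, HANDLER_INSNS)
model_share = handler_share(1e7, TICK_HZ, HANDLER_INSNS)

print(f"hardware: {hw_share:.4%} of instructions in the timer handler")
print(f"model:    {model_share:.4%} of instructions in the timer handler")
```

Under these assumptions the handler's share of executed instructions grows by the same factor as the overall slowdown, so any inefficiency in that path weighs far more heavily on the model than on hardware.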

> The results in the announcement are not baseless; we have quite a number of
> reasons to believe Xvisor ARM will perform better than KVM ARM in the
> real world too.

I'm not stating a position on whether KVM will be better or worse
than Xvisor. I'm just pointing out that you can't base an argument
on the faulty assumption that performance inside a model can tell
you anything useful about performance on hardware.

-- PMM

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

peter.maydell at linaro

May 6, 2012, 1:51 AM

Post #15 of 26 (8240 views)

On 6 May 2012 05:22, Anup Patel <anup@brainfault.org> wrote:
> Also, can you give an example of a code sequence which is faster on the model
> and slower in the real world? As far as I know, ARM Fast Models are internally
> TLM-based models, and if a TLM-based model is emulating a timer chip with an X
> clock, then it is quite precisely an X clock.

Support for TLM does not require that the underlying model is cycle
accurate (you can have 'loosely timed' behaviour).

You might want to read the Fast Models documentation, which tries
to be clear about what the models do and don't provide. In particular:
http://infocenter.arm.com/help/topic/com.arm.doc.dui0423l/ch02s01s02.html
"Fast models cannot be used to:
* model cycle counting
* model software performance
"

> Of course, CPU emulation and computation
> power will be less compared to the real world. To see this behaviour, try to
> boot Linux on a Fast Model or QEMU, leave it for hours, then come back and
> see the time elapsed; you will definitely see the same amount of time
> elapsed as in the real world.

Nobody's arguing that the models are faster than hardware!
Let's try a simple example with some numbers representing
relative execution times (lower is faster):

operation A: h/w 1 ; model 5
operation B: h/w 3 ; model 30

Where we're comparing two equivalent code sequences "A A A A" vs "B".
On hardware "B" will be faster. On the model the "A A A A" beats "B".
(both sequences are slower on the model than on the hardware, obviously.)
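
The arithmetic above can be checked with a minimal Python sketch. It treats the numbers as relative time costs (lower is faster); the cost table is copied from the example, while the `COSTS`, `sequence_cost` and `faster` names are illustrative only and not part of any tool mentioned in this thread.

```python
# Relative time cost per operation (lower is faster), copied from the example.
COSTS = {
    "hw": {"A": 1, "B": 3},
    "model": {"A": 5, "B": 30},
}


def sequence_cost(platform, ops):
    """Total relative cost of running every operation in `ops` on `platform`."""
    return sum(COSTS[platform][op] for op in ops)


def faster(platform, seq1, seq2):
    """Return whichever of two equivalent sequences is cheaper on `platform`."""
    return min((seq1, seq2), key=lambda seq: sequence_cost(platform, seq))


for platform in ("hw", "model"):
    aaaa = sequence_cost(platform, "AAAA")
    b = sequence_cost(platform, "B")
    print(platform, "AAAA =", aaaa, "B =", b,
          "faster:", faster(platform, "AAAA", "B"))
```

The ranking flips because B is slowed down by a factor of 10 on the model while A is slowed down only by a factor of 5, so a comparison made on the model says nothing about the ordering on hardware.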

The point is that some operations will be vastly vastly slower
on the model, and some operations merely moderately slower. Which
of any two code sequences is fastest depends at least as much on
whether it's using operations that are disproportionally worse
on the model. A trivial example of this is VFP -- certainly QEMU
has to do complex software emulation of the floating point ops to
maintain bit-for-bit accuracy, which makes them very slow to the
point where a hand-optimised-integer-assembly codec is likely to
be faster on the model than a Neon/VFP-using codec, even though
of course the Neon codec will be faster on hardware.

[NB: this is itself a big simplification: model performance will
depend on a lot of interacting things and is not purely a
same-every-time slowdown per operation. Some operations effectively
slow down what happens after them, for instance on QEMU if you do
something that makes us flush our cache of translated code. And
if for instance you have a periodic timer then the fact the model
is generally slower means you execute proportionally more insns in
the timer interrupt, so inefficiency or slowness in that code path
has disproportionately more effect on overall speed than it does
on hardware. There are other complications too...]
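
The periodic-timer effect can be put into rough numbers. The sketch below assumes that the timer fires at a fixed wall-clock rate regardless of how fast the guest executes, and that each interrupt costs a fixed number of instructions. Every figure in it (the 100 Hz tick, the 20,000-instruction handler, the two instruction rates) is hypothetical and chosen only for illustration; none of them is a measurement of QEMU, a Fast Model, or real hardware.

```python
def timer_fraction(tick_hz, handler_insns, insns_per_second):
    """Fraction of all executed instructions spent in the timer handler.

    Assumes the tick rate is tied to wall-clock time, so a slower machine
    takes the same number of interrupts per second but has fewer
    instructions available between them.
    """
    return min(1.0, tick_hz * handler_insns / insns_per_second)


TICK_HZ = 100            # hypothetical periodic timer rate
HANDLER_INSNS = 20_000   # hypothetical cost of one timer interrupt

# Hypothetical execution rates: 1e9 insns/s on hardware, 1e7 insns/s on a model.
hardware = timer_fraction(TICK_HZ, HANDLER_INSNS, 1_000_000_000)
model = timer_fraction(TICK_HZ, HANDLER_INSNS, 10_000_000)

print(f"hardware: {hardware:.1%} of instructions in the timer handler")
print(f"model:    {model:.1%} of instructions in the timer handler")
```

With these made-up figures the handler accounts for 0.2% of executed instructions on hardware but 20% on the model, so any inefficiency in that path weighs a hundred times more heavily in a model-based benchmark.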

> The results in the announcement are not baseless; we have quite a number
> of reasons to believe Xvisor ARM will perform better than KVM ARM in the
> real world too.

I'm not stating a position on whether KVM will be better or worse
than Xvisor. I'm just pointing out that you can't base an argument
on the faulty assumption that performance inside a model can tell
you anything useful about performance on hardware.

-- PMM

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

anup at brainfault

May 6, 2012, 4:53 AM

Post #16 of 26 (8310 views)

Hi PMM,

Whether to use a model for measuring performance is a matter of opinion.
A number of Tier 1 conferences accept simulation numbers as evidence for
a better approach, provided the simulation platform is widely accepted.

Regarding code sequences, both Xvisor ARM and KVM ARM have the same set of
emulators and drivers. In fact, almost all of the emulation code has been
adopted from QEMU, and many of the crucial drivers are adopted from Linux
ARM. Unlike KVM ARM, Xvisor ARM does no unnecessary switching between host
mode and guest mode, and the amount of code traversed in handling any fault
is also much smaller. Hence Xvisor ARM will execute much less code than KVM
ARM.

In Xvisor development, we have observed that the results of any CPU
performance test on QEMU or the Fast Model scale up naturally on real
hardware. At least, we have never come across any scenario or test that
performed better on QEMU or the Fast Model than in the real world (this
holds for tests running on native Linux and on Linux running as a guest on
Xvisor ARM).

We strongly believe that monolithic approaches always perform better than
micro-kernelized approaches (or approaches somewhere in between micro-kernel
and monolithic). Hence Xvisor ARM will always perform better than KVM ARM in
theory, in simulation and in the real world.

Best Regards,
Anup Patel

On Sun, May 6, 2012 at 2:21 PM, Peter Maydell <peter.maydell@linaro.org>wrote:

> On 6 May 2012 05:22, Anup Patel <anup@brainfault.org> wrote:
> > Also can you give example of a code sequence which is faster on model and
> > slower in real world. As far as I know ARM fast models are internally TLM
> > based models and If a TLM based model is emulating a timer chip of X
> clock
> > then it is quite precise X clock.
>
> Support for TLM does not require that the underlying model is cycle
> accurate (you can have 'loosely timed' behaviour).
>
> You might want to read the Fast Models documentation, which tries
> to be clear about what the models do and don't provide. In particular:
> http://infocenter.arm.com/help/topic/com.arm.doc.dui0423l/ch02s01s02.html
> "Fast models cannot be used to:
> * model cycle counting
> * model software performance
> "
>
> > Ofcourse CPU emulation and computation
> > power will be less compared to real world. To see this behaviour try to
> boot
> > linux on Fast model or QEMU and leave it for hours and come back see the
> > time elapsed, you will definitely see same amount of time elapsed as real
> > world.
>
> Nobody's arguing that the models are faster than hardware!
> Let's try a simple example with some numbers representing
> relative speeds:
>
> operation A: h/w: 1 ; model 5
> operation B: h/w 3 ; model 30
>
> Where we're comparing two equivalent code sequences "A A A A" vs "B".
> On hardware "B" will be faster. On the model the "A A A A" beats "B".
> (both sequences are slower on the model than on the hardware, obviously.)
>
> The point is that some operations will be vastly vastly slower
> on the model, and some operations merely moderately slower. Which
> of any two code sequences is fastest depends at least as much on
> whether it's using operations that are disproportionally worse
> on the model. A trivial example of this is VFP -- certainly QEMU
> has to do complex software emulation of the floating point ops to
> maintain bit-for-bit accuracy, which makes them very slow to the
> point where a hand-optimised-integer-assembly codec is likely to
> be faster on the model than a Neon/VFP-using codec, even though
> of course the Neon codec will be faster on hardware.
>
> [.NB: this is itself a big simplification: model performance will
> depend on a lot of interacting things and is not purely a
> same-every-time slowdown per operation. Some operations effectively
> slow down what happens after them, for instance on QEMU if you do
> something that makes us flush our cache of translated code. And
> if for instance you have a periodic timer then the fact the model
> is generally slower means you execute proportionally more insns in
> the timer interrupt, so inefficiency or slowness in that code path
> has disproportionately more effect on overall speed than it does
> on hardware. There are other complications too...]
>
> > The results in the announcemnt are not baseless we have quite amount
> reasons
> > to believe Xvisor ARM will perform better than KVM ARM in real-world too.
>
> I'm not stating a position on whether KVM will be better or worse
> than Xvisor. I'm just pointing out that you can't base an argument
> on the faulty assumption that performance inside a model can tell
> you anything useful about performance on hardware.
>
> -- PMM
>

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

May 8, 2012, 3:59 PM

Post #18 of 26 (8233 views)

Anup,

Thanks for providing info on your Xvisor project. However, this is a
mailing list for the development of KVM/ARM and not a scientific forum to
establish in theory which hypervisor "will always perform better than"
which other hypervisor.

If you want to establish that your code base will always perform better
than all other hosted hypervisors, I strongly encourage you to submit a
paper about this to a peer-reviewed conference. Personally, I would find
it extremely interesting to see such facts established in a scientific
way.

I understand that you wish to argue Xvisor's superiority in comparison to
KVM, but I disagree with your conclusions. The code path taken in KVM can
be optimized to be extremely short, and all logic could be placed within the
KVM module. There are numerous other advantages to using KVM (existing
driver base, upstream kernel integration, compatibility with existing user
space tools, co-existence with native host processes, etc.), and with
server-grade hardware I see the reliance on the Linux kernel for scheduling,
memory management, etc. as a great advantage and not a drawback. On the
other hand, when Xvisor matures, feature requests will only increase its
code size and complexity as well.

-Christoffer

On May 6, 2012, at 7:53 AM, Anup Patel <anup@brainfault.org> wrote:

Hi PMM,

Whether to consider model for measuring performance is one's own opinion.
There are number of Tier1 conferences which accept simulation numbers for
proving better approaches provided the simulation platform is well accepted
by everyone.

Talking about code sequences both Xvisor ARM and KVM ARM have same set of
emulators and drivers. In fact, almost all emulation code has been adopted
from QEMU. Many of the crucial drivers are adopted from Linux ARM. Unlike
KVM ARM, in Xvisor ARM there no unnecessary switching between host mode to
guest mode and amount of code traversed in handling any fault is also very
less hence Xvisor-ARM will have much less code executed compared to KVM ARM.

In Xvisor developement, we have observed that results of any CPU
performance test on QEMU or Fast Model naturally scales up on
real-hardware. Atleast we have never come across any scenario or test
performing better on QEMU or Fast Model compared real-world (this is true
for test running on Native Linux or Linux running as guest on Xvisor ARM).

In our opinion we strongly believe monolithic approaches are always better
performing over micro-kernelized approaches (or approaches somewhere in
between micro-kernel and monolithic). Hence Xvisor ARM will always perform
better than KVM ARM in theory, simulation and real-world.

Best Regards,
Anup Patel

On Sun, May 6, 2012 at 2:21 PM, Peter Maydell <peter.maydell@linaro.org>wrote:

> On 6 May 2012 05:22, Anup Patel <anup@brainfault.org> wrote:
> > Also can you give example of a code sequence which is faster on model and
> > slower in real world. As far as I know ARM fast models are internally TLM
> > based models and If a TLM based model is emulating a timer chip of X
> clock
> > then it is quite precise X clock.
>
> Support for TLM does not require that the underlying model is cycle
> accurate (you can have 'loosely timed' behaviour).
>
> You might want to read the Fast Models documentation, which tries
> to be clear about what the models do and don't provide. In particular:
> http://infocenter.arm.com/help/topic/com.arm.doc.dui0423l/ch02s01s02.html
> "Fast models cannot be used to:
> * model cycle counting
> * model software performance
> "
>
> > Ofcourse CPU emulation and computation
> > power will be less compared to real world. To see this behaviour try to
> boot
> > linux on Fast model or QEMU and leave it for hours and come back see the
> > time elapsed, you will definitely see same amount of time elapsed as real
> > world.
>
> Nobody's arguing that the models are faster than hardware!
> Let's try a simple example with some numbers representing
> relative speeds:
>
> operation A: h/w: 1 ; model 5
> operation B: h/w 3 ; model 30
>
> Where we're comparing two equivalent code sequences "A A A A" vs "B".
> On hardware "B" will be faster. On the model the "A A A A" beats "B".
> (both sequences are slower on the model than on the hardware, obviously.)
>
> The point is that some operations will be vastly vastly slower
> on the model, and some operations merely moderately slower. Which
> of any two code sequences is fastest depends at least as much on
> whether it's using operations that are disproportionally worse
> on the model. A trivial example of this is VFP -- certainly QEMU
> has to do complex software emulation of the floating point ops to
> maintain bit-for-bit accuracy, which makes them very slow to the
> point where a hand-optimised-integer-assembly codec is likely to
> be faster on the model than a Neon/VFP-using codec, even though
> of course the Neon codec will be faster on hardware.
>
> [.NB: this is itself a big simplification: model performance will
> depend on a lot of interacting things and is not purely a
> same-every-time slowdown per operation. Some operations effectively
> slow down what happens after them, for instance on QEMU if you do
> something that makes us flush our cache of translated code. And
> if for instance you have a periodic timer then the fact the model
> is generally slower means you execute proportionally more insns in
> the timer interrupt, so inefficiency or slowness in that code path
> has disproportionately more effect on overall speed than it does
> on hardware. There are other complications too...]
>
> > The results in the announcemnt are not baseless we have quite amount
> reasons
> > to believe Xvisor ARM will perform better than KVM ARM in real-world too.
>
> I'm not stating a position on whether KVM will be better or worse
> than Xvisor. I'm just pointing out that you can't base an argument
> on the faulty assumption that performance inside a model can tell
> you anything useful about performance on hardware.
>
> -- PMM
>

_______________________________________________
Android-virt mailing list
Android-virt@lists.cs.columbia.edu
https://lists.cs.columbia.edu/cucslists/listinfo/android-virt

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

anup at brainfault

May 8, 2012, 11:11 PM

Post #20 of 26 (8320 views)

Christoffer,

The intention behind the announcement was to inform people interested in
virtualization about Xvisor. The announcement was early information about
what Xvisor ARM has achieved (for now, compared to KVM ARM). We are
certainly planning to write a scientific paper on Xvisor.

Also, I do agree that KVM ARM can be further optimized, but as I mentioned
in my previous replies, "KVM ARM will end up putting more and more stuff
in-kernel". For now you can think of Xvisor ARM as KVM ARM doing everything
in-kernel. Xvisor ARM is being optimized too, so it will also improve
further as time passes. It is common wisdom that "no hypervisor in the
world can be better than native performance". Xvisor ARM is already very
close to native performance, and KVM ARM will come close to native
performance only by increasing its monolithic nature (i.e. doing more
things in-kernel). If monolithic hypervisors perform so well, then why not
have a monolithic hypervisor made for virtualization purposes only? That
was the motivation behind writing Xvisor.

Apart from high performance, Xvisor has many interesting features, such as:
*Ability to work without hardware virtualization support* - Xvisor ARM is
able to boot multiple unmodified Linux guests even on hosts which do not
implement the virtualization extensions. In contrast, KVM ARM does not
work without the virtualization extensions. Xvisor ARM can therefore
potentially support many more kinds of host hardware than KVM ARM can; in
fact, it can run on old ARMv5 processors too.
*Tree-based configuration* - To create a guest we just have to describe the
guest in the form of a device tree (possible even at runtime). In contrast,
for KVM one needs to add the support in QEMU and recompile the binaries.
*Pass-through hardware access* - Hardware that is not accessed or
virtualized by Xvisor can be used in pass-through mode. Providing a guest
with a pass-through device is just a matter of adding a tree node and
configuring the irq routing information in the guest tree. This is not
limited to PCI devices; we can make any kind of device pass-through
accessible (Note: if the device has built-in DMA then it should have an
IOMMU or SysMMU, otherwise it would be a security breach). We have already
tried out a serial port and a NIC as pass-through devices.

We can compare the KVM advantages with Xvisor as follows:
*Scheduler* - The Linux kernel scheduler is a very mature and proven OS
scheduler, but a hypervisor scheduler can be quite different. Scheduling
processes and scheduling VMs can be very different problems. In the case of
VMs we can use information such as the amount of emulated IO done, the
amount of time spent waiting for an irq, etc. to improve the quality of
server consolidation.
*Driver base* - Xvisor has, and will have, driver framework APIs similar
to those of Linux, so porting a driver will be just a one-to-one
replacement of APIs in most cases.
*User space tools* - To start with, Xvisor will use libvirt tools (or a
similar open source initiative) for remote management.
*Co-existence with host processes* - Xvisor is not an OS. It is made for
virtualization only, so it has no processes. Of course, Xvisor has an
internal threading framework, but most of the time these background threads
are sleeping, doing nothing. All the management commands are provided by
the management terminal daemon (which is a background thread in Xvisor).

--Anup

On Wed, May 9, 2012 at 4:29 AM, Christoffer Dall <cdall@cs.columbia.edu>wrote:

> Anup,
>
> Thanks for providing info on your Xvisor project. However, this is a
> mailing list for the development of KVM/ARM and not a scientific forum to
> establish in theory which hypervisor "will always perform better than"
> which other hypervisor.
>
> If you want to establish that your code base will always perform better
> than all other hosted hypervisors, I strongly encourage you to submit a
> paper about this in a peer reviewed conference. Personally I would find
> that establishing such facts in a scientific way to be extremely
> interesting.
>
> I understand that you wish to argue Xvisor's superiority in comparison to
> KVM, but I disagree with your conclusions. The code path taken in KVM can
> be optimized to be extremely short and all logic could be placed within the
> KVM module. There are numerous other advantages to using KVM (existing
> driver base, upstream kernel integration, compatibility with existing user
> space tools, co-existence with native host processes, etc.) and with server
> grade hardware I see the reliance on the Linux kernel for scheduling,
> memory management etc. to be a great advantage - and not a drawback. On the
> other hand, when Xvisor matures, feature requests will only increase its
> code size and complexity as well.
>
> -Christoffer
>
> On May 6, 2012, at 7:53 AM, Anup Patel <anup@brainfault.org> wrote:
>
> Hi PMM,
>
> Whether to consider model for measuring performance is one's own opinion.
> There are number of Tier1 conferences which accept simulation numbers for
> proving better approaches provided the simulation platform is well accepted
> by everyone.
>
> Talking about code sequences both Xvisor ARM and KVM ARM have same set of
> emulators and drivers. In fact, almost all emulation code has been adopted
> from QEMU. Many of the crucial drivers are adopted from Linux ARM. Unlike
> KVM ARM, in Xvisor ARM there no unnecessary switching between host mode to
> guest mode and amount of code traversed in handling any fault is also very
> less hence Xvisor-ARM will have much less code executed compared to KVM ARM.
>
> In Xvisor developement, we have observed that results of any CPU
> performance test on QEMU or Fast Model naturally scales up on
> real-hardware. Atleast we have never come across any scenario or test
> performing better on QEMU or Fast Model compared real-world (this is true
> for test running on Native Linux or Linux running as guest on Xvisor ARM).
>
> In our opinion we strongly believe monolithic approaches are always better
> performing over micro-kernelized approaches (or approaches somewhere in
> between micro-kernel and monolithic). Hence Xvisor ARM will always perform
> better than KVM ARM in theory, simulation and real-world.
>
> Best Regards,
> Anup Patel
>
> On Sun, May 6, 2012 at 2:21 PM, Peter Maydell <peter.maydell@linaro.org>wrote:
>
>> On 6 May 2012 05:22, Anup Patel <anup@brainfault.org> wrote:
>> > Also can you give example of a code sequence which is faster on model
>> and
>> > slower in real world. As far as I know ARM fast models are internally
>> TLM
>> > based models and If a TLM based model is emulating a timer chip of X
>> clock
>> > then it is quite precise X clock.
>>
>> Support for TLM does not require that the underlying model is cycle
>> accurate (you can have 'loosely timed' behaviour).
>>
>> You might want to read the Fast Models documentation, which tries
>> to be clear about what the models do and don't provide. In particular:
>>
>> http://infocenter.arm.com/help/topic/com.arm.doc.dui0423l/ch02s01s02.html
>> "Fast models cannot be used to:
>> * model cycle counting
>> * model software performance
>> "
>>
>> > Of course, CPU emulation and computation power will be less compared to
>> > the real world. To see this behaviour, try to boot Linux on a Fast Model
>> > or QEMU, leave it for hours, and come back to see the time elapsed; you
>> > will definitely see the same amount of time elapsed as in the real world.
>>
>> Nobody's arguing that the models are faster than hardware!
>> Let's try a simple example with some numbers representing
>> relative speeds:
>>
>> operation A: h/w 1 ; model 5
>> operation B: h/w 3 ; model 30
>>
>> Where we're comparing two equivalent code sequences "A A A A" vs "B".
>> On hardware "B" will be faster. On the model the "A A A A" beats "B".
>> (both sequences are slower on the model than on the hardware, obviously.)
>>
>> The point is that some operations will be vastly vastly slower
>> on the model, and some operations merely moderately slower. Which
>> of any two code sequences is fastest depends at least as much on
>> whether it's using operations that are disproportionally worse
>> on the model. A trivial example of this is VFP -- certainly QEMU
>> has to do complex software emulation of the floating point ops to
>> maintain bit-for-bit accuracy, which makes them very slow to the
>> point where a hand-optimised-integer-assembly codec is likely to
>> be faster on the model than a Neon/VFP-using codec, even though
>> of course the Neon codec will be faster on hardware.
>>
>> [NB: this is itself a big simplification: model performance will
>> depend on a lot of interacting things and is not purely a
>> same-every-time slowdown per operation. Some operations effectively
>> slow down what happens after them, for instance on QEMU if you do
>> something that makes us flush our cache of translated code. And
>> if for instance you have a periodic timer then the fact the model
>> is generally slower means you execute proportionally more insns in
>> the timer interrupt, so inefficiency or slowness in that code path
>> has disproportionately more effect on overall speed than it does
>> on hardware. There are other complications too...]
>>
>> > The results in the announcement are not baseless; we have quite a number
>> > of reasons to believe Xvisor ARM will perform better than KVM ARM in the
>> > real world too.
>>
>> I'm not stating a position on whether KVM will be better or worse
>> than Xvisor. I'm just pointing out that you can't base an argument
>> on the faulty assumption that performance inside a model can tell
>> you anything useful about performance on hardware.
>>
>> -- PMM
>>
>
> _______________________________________________
> Android-virt mailing list
> Android-virt@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/android-virt
>
>
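
The relative-speed example quoted above (operation A: h/w 1, model 5;
operation B: h/w 3, model 30) can be checked with a few lines of Python.
This is only a minimal sketch of that arithmetic: the names COSTS,
sequence_cost and faster_sequence are made up for illustration, and the
numbers are the illustrative relative costs from the quoted message, not
measurements of any real hardware or model.

```python
# Relative costs taken from the quoted example (arbitrary units, lower is faster).
# These are illustrative numbers only, not measurements.
COSTS = {
    "A": {"hardware": 1, "model": 5},
    "B": {"hardware": 3, "model": 30},
}


def sequence_cost(ops, platform):
    """Total cost of running the operations in `ops` on `platform`."""
    return sum(COSTS[op][platform] for op in ops)


def faster_sequence(seq_x, seq_y, platform):
    """Return whichever sequence has the lower total cost on `platform`."""
    return min((seq_x, seq_y), key=lambda seq: sequence_cost(seq, platform))


if __name__ == "__main__":
    for platform in ("hardware", "model"):
        aaaa = sequence_cost("AAAA", platform)
        b = sequence_cost("B", platform)
        winner = faster_sequence("AAAA", "B", platform)
        print(f"{platform}: AAAA={aaaa} B={b} faster={winner}")
```

With these numbers "A A A A" costs 4 on hardware and 20 on the model,
while "B" costs 3 on hardware and 30 on the model, so the ranking of the
two sequences flips between the two platforms even though both are
slower on the model.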

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

anup at brainfault

May 8, 2012, 11:11 PM

Post #21 of 26 (8315 views)

Christoffer,

The intention behind the announcement was to inform people interested in
virtualization about Xvisor. The announcement was early information about
the achievements of Xvisor ARM (for now, compared to KVM ARM). We are
certainly planning a scientific paper on Xvisor.

Also, I do agree that KVM ARM can be further optimized, but as I mentioned
in my previous replies, "KVM ARM will end up putting more and more stuff
in-kernel". For now you can think of Xvisor ARM = KVM ARM doing everything
in-kernel. Xvisor ARM is also being optimized, so as time passes it is
going to improve further too. It is common wisdom that "no hypervisor in
the world can be better than native performance". Xvisor ARM is already
very close to native performance, and KVM ARM will come close to native
performance only by increasing its monolithic nature (i.e. doing more
things in-kernel). If monolithic hypervisors perform so well, then why
not have a monolithic hypervisor made for virtualization only? That was
the motivation behind writing Xvisor.

Apart from high performance, Xvisor has many interesting features, such as:
*Ability to work without hardware virtualization support* - Xvisor ARM is
able to boot multiple unmodified Linux guests even on hosts which do not
implement the virtualization extensions. In contrast, KVM ARM does not
work without the virtualization extensions. Xvisor ARM can therefore
support a much wider range of host hardware than KVM ARM can. Xvisor ARM
can in fact run on old ARMv5 processors too.
*Tree-based configuration* - To create a guest we just have to describe the
guest in the form of a device tree (possible even at runtime). In contrast,
for KVM one needs to add the support in QEMU and recompile the binaries.
*Pass-through hardware access* - Hardware not accessed or virtualized by
Xvisor can be used in pass-through mode. Providing a guest with a
pass-through device is just a matter of adding a tree node and configuring
the irq routing information in the guest tree. It's not just PCI devices;
we can provide any kind of device as pass-through (Note: if the device has
built-in DMA then it should sit behind an IOMMU or SysMMU, otherwise it
would be a security breach). We have already tried out a serial port and a
NIC as pass-through devices.

We can compare the KVM advantages with Xvisor as follows:
*Scheduler* - The Linux kernel scheduler is a very mature and proven OS
scheduler, but a hypervisor scheduler can be quite different. Scheduling
processes and scheduling VMs can be very different problems. In the case of
VMs we can use information such as the amount of emulated IO done, the
amount of time spent waiting for irqs, etc. to improve the quality of
server consolidation.
*Driver base* - Xvisor has and will have driver framework APIs similar to
Linux, and driver porting will be just a one-to-one replacement of APIs in
most cases.
*User space tools* - For a start, Xvisor will use libvirt tools (or a
similar open source initiative) for remote management.
*Co-existence with host processes* - Xvisor is not an OS. It's made for
virtualization only, so there are no processes. Of course, Xvisor has an
internal threading framework, but most of the time these background threads
are sleeping, doing nothing. All the management commands are provided by
the management terminal daemon (which is a background thread in Xvisor).

--Anup


Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

May 9, 2012, 4:09 AM

Post #22 of 26 (8313 views)

On May 9, 2012, at 2:11 AM, Anup Patel <anup@brainfault.org> wrote:

The intention behind the announcement was to inform people interested in
virtualization about Xvisor.

consider the members of this list informed.

The announcement was early information about the achievements of Xvisor
ARM (for now, compared to KVM ARM). We are certainly planning a scientific
paper on Xvisor.

Also, I do agree that KVM ARM can be further optimized, but as I mentioned
in my previous replies, "KVM ARM will end up putting more and more stuff
in-kernel". For now you can think of Xvisor ARM = KVM ARM doing everything
in-kernel. Xvisor ARM is also being optimized, so as time passes it is
going to improve further too. It is common wisdom that "no hypervisor in
the world can be better than native performance". Xvisor ARM is already
very close to native performance, and KVM ARM will come close to native
performance only by increasing its monolithic nature (i.e. doing more
things in-kernel). If monolithic hypervisors perform so well, then why
not have a monolithic hypervisor made for virtualization only? That was
the motivation behind writing Xvisor.

Apart from high performance, Xvisor has many interesting features, such as:
*Ability to work without hardware virtualization support* - Xvisor ARM is
able to boot multiple unmodified Linux guests even on hosts which do not
implement the virtualization extensions. In contrast, KVM ARM does not
work without the virtualization extensions. Xvisor ARM can therefore
support a much wider range of host hardware than KVM ARM can. Xvisor ARM
can in fact run on old ARMv5 processors too.
*Tree-based configuration* - To create a guest we just have to describe the
guest in the form of a device tree (possible even at runtime). In contrast,
for KVM one needs to add the support in QEMU and recompile the binaries.
*Pass-through hardware access* - Hardware not accessed or virtualized by
Xvisor can be used in pass-through mode. Providing a guest with a
pass-through device is just a matter of adding a tree node and configuring
the irq routing information in the guest tree. It's not just PCI devices;
we can provide any kind of device as pass-through (Note: if the device has
built-in DMA then it should sit behind an IOMMU or SysMMU, otherwise it
would be a security breach). We have already tried out a serial port and a
NIC as pass-through devices.

We can compare the KVM advantages with Xvisor as follows:
*Scheduler* - The Linux kernel scheduler is a very mature and proven OS
scheduler, but a hypervisor scheduler can be quite different. Scheduling
processes and scheduling VMs can be very different problems. In the case of
VMs we can use information such as the amount of emulated IO done, the
amount of time spent waiting for irqs, etc. to improve the quality of
server consolidation.
*Driver base* - Xvisor has and will have driver framework APIs similar to
Linux, and driver porting will be just a one-to-one replacement of APIs in
most cases.
*User space tools* - For a start, Xvisor will use libvirt tools (or a
similar open source initiative) for remote management.
*Co-existence with host processes* - Xvisor is not an OS. It's made for
virtualization only, so there are no processes. Of course, Xvisor has an
internal threading framework, but most of the time these background threads
are sleeping, doing nothing. All the management commands are provided by

--Anup


Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

May 9, 2012, 4:09 AM

Post #23 of 26 (8243 views)

On May 9, 2012, at 2:11 AM, Anup Patel <anup@brainfault.org> wrote:

The intention behind the announcement was to inform people interested in
virtualization about Xvisor.

consider the members og this list informed.

The announcement was an early info about achievements of Xvisor ARM (for
now compared to KVM ARM). Certainly we are planning to have scientific
paper for Xvisor.

Also, I do agree that KVM ARM can be further optimized but as I mentioned
in my previous replies "KVM ARM will end-up putting more and more stuff
in-kernel". For now you can think of Xvisor ARM = KVM ARM doing everything
in-kernel. Even Xvisor ARM is being optimized so, as time passes Xvisor ARM
is also going to improve further. Its a common wisdom that "No hypervisor
in the world can be better than Native performance". Xvisor ARM is already
very very close to native performance and KVM ARM will come close to native
performance only by increasing its monolithic nature (i.e. doing more
things in-kernel). If monolithic hypervisors are so well performing then
why not to have a monolithic hypervisor made for virtualization purpose
only. The motivation behind writing Xvisor was the same.

Apart from high performance Xvisor has many interesting features such as:
*Ability to work without hardware virtualization support* - Xvisor ARM is
able to boot multiple unmodified Linux guest even on hosts which do not
have virtualization extensions implemented. In contrast, KVM ARM does not
work without virtualization extension. The potential number of host
hardware that Xvisor ARM can support is much more than KVM ARM can support.
Xvisor ARM can in-fact run on old ARMv5 processors too.
*Tree-based configuration* - To create a guest we have to just describe the
guest in form of a device tree (possible even in runtime). In contrast, for
KVM one needs to add the support in QEMU and recompile the binaries.
*Pass-through hardware access* - For hardware not accessed or virtualized
by Xvisor can be used in pass-through mode. Providing the guests a
pass-through accessible device is just matter of adding a tree node and
configure irq routing information in guest tree. Its not just PCI devices,
we can provide any kind of device as pass-through accessible (Note: if
device has in-built DMA then it should have IOMMU or SysMMU otherwise it
would be security breach). We have already tried out Serial Port and NIC as
pass-through devices.

We can compare KVM advantages with Xvisor as follows:
*Scheduler* - The linux kernel scheduler is very mature and proven OS
scheduler but Hypervisor scheduler can be quite different. Scheduling
processes and scheduling VMs can be very different problems. In case of VMs
we can use info such as: amount of emulated IO done, amount of time spend
in waiting for irq, etc for improving the quality of server consolidation.
*Driver base* - Xvisor has and will have all driver framework APIs similar
to Linux and driver porting will be just one-on-one replacement of APIs in
most cases.
*User space tools* - For starter Xvisor will use libvirt tools (or similar
open source initiative) for remote management.
*Co-existence host processes* - Xvisor is not an OS. Its made for
virtualization only so no process. Ofcourse, Xvisor has internal threading
framework but most of the time this background threads are sleeping doing
nothing. All the management commands are provided by

--Anup

On Wed, May 9, 2012 at 4:29 AM, Christoffer Dall <cdall@cs.columbia.edu>wrote:

> Anup,
>
> Thanks for providing info on your Xvisor project. However, this is a
> mailing list for the development of KVM/ARM and not a scientific forum to
> establish in theory which hypervisor "will always perform better than"
> which other hypervisor.
>
> If you want to establish that your code base will always perform better
> than all other hosted hypervisors, I strongly encourage you to submit a
> paper about this in a peer reviewed conference. Personally I would find
> that establishing such facts in a scientific way to be extremely
> interesting.
>
> I understand that you wish to argue Xvisor's superiority in comparison to
> KVM, but I disagree with your conclusions. The code path taken in KVM can
> be optimized to be extremely short and all logic could be placed within the
> KVM module. There are numerous other advantages to using KVM (existing
> driver base, upstream kernel integration, compatibility with existing user
> space tools, co-existence with native host processes, etc.) and with server
> grade hardware I see the reliance on the Linux kernel for scheduling,
> memory management etc. to be a great advantage - and not a drawback. On the
> other hand, when Xvisor matures, feature requests will only increase its
> code size and complexity as well.
>
> -Christoffer
>
> On May 6, 2012, at 7:53 AM, Anup Patel <anup@brainfault.org> wrote:
>
> Hi PMM,
>
> Whether to consider model for measuring performance is one's own opinion.
> There are number of Tier1 conferences which accept simulation numbers for
> proving better approaches provided the simulation platform is well accepted
> by everyone.
>
> Talking about code sequences both Xvisor ARM and KVM ARM have same set of
> emulators and drivers. In fact, almost all emulation code has been adopted
> from QEMU. Many of the crucial drivers are adopted from Linux ARM. Unlike
> KVM ARM, in Xvisor ARM there no unnecessary switching between host mode to
> guest mode and amount of code traversed in handling any fault is also very
> less hence Xvisor-ARM will have much less code executed compared to KVM ARM.
>
> In Xvisor developement, we have observed that results of any CPU
> performance test on QEMU or Fast Model naturally scales up on
> real-hardware. Atleast we have never come across any scenario or test
> performing better on QEMU or Fast Model compared real-world (this is true
> for test running on Native Linux or Linux running as guest on Xvisor ARM).
>
> In our opinion we strongly believe monolithic approaches are always better
> performing over micro-kernelized approaches (or approaches somewhere in
> between micro-kernel and monolithic). Hence Xvisor ARM will always perform
> better than KVM ARM in theory, simulation and real-world.
>
> Best Regards,
> Anup Patel
>
> On Sun, May 6, 2012 at 2:21 PM, Peter Maydell <peter.maydell@linaro.org>wrote:
>
>> On 6 May 2012 05:22, Anup Patel <anup@brainfault.org> wrote:
>> > Also can you give example of a code sequence which is faster on model
>> and
>> > slower in real world. As far as I know ARM fast models are internally
>> TLM
>> > based models and If a TLM based model is emulating a timer chip of X
>> clock
>> > then it is quite precise X clock.
>>
>> Support for TLM does not require that the underlying model is cycle
>> accurate (you can have 'loosely timed' behaviour).
>>
>> You might want to read the Fast Models documentation, which tries
>> to be clear about what the models do and don't provide. In particular:
>>
>> http://infocenter.arm.com/help/topic/com.arm.doc.dui0423l/ch02s01s02.html
>> "Fast models cannot be used to:
>> * model cycle counting
>> * model software performance
>> "
>>
>> > Ofcourse CPU emulation and computation
>> > power will be less compared to real world. To see this behaviour try to
>> boot
>> > linux on Fast model or QEMU and leave it for hours and come back see the
>> > time elapsed, you will definitely see same amount of time elapsed as
>> real
>> > world.
>>
>> Nobody's arguing that the models are faster than hardware!
>> Let's try a simple example with some numbers representing
>> relative speeds:
>>
>> operation A: h/w: 1 ; model 5
>> operation B: h/w 3 ; model 30
>>
>> Where we're comparing two equivalent code sequences "A A A A" vs "B".
>> On hardware "B" will be faster. On the model the "A A A A" beats "B".
>> (both sequences are slower on the model than on the hardware, obviously.)
>>
>> The point is that some operations will be vastly vastly slower
>> on the model, and some operations merely moderately slower. Which
>> of any two code sequences is fastest depends at least as much on
>> whether it's using operations that are disproportionally worse
>> on the model. A trivial example of this is VFP -- certainly QEMU
>> has to do complex software emulation of the floating point ops to
>> maintain bit-for-bit accuracy, which makes them very slow to the
>> point where a hand-optimised-integer-assembly codec is likely to
>> be faster on the model than a Neon/VFP-using codec, even though
>> of course the Neon codec will be faster on hardware.
>>
>> [.NB: this is itself a big simplification: model performance will
>> depend on a lot of interacting things and is not purely a
>> same-every-time slowdown per operation. Some operations effectively
>> slow down what happens after them, for instance on QEMU if you do
>> something that makes us flush our cache of translated code. And
>> if for instance you have a periodic timer then the fact the model
>> is generally slower means you execute proportionally more insns in
>> the timer interrupt, so inefficiency or slowness in that code path
>> has disproportionately more effect on overall speed than it does
>> on hardware. There are other complications too...]
>>
>> > The results in the announcement are not baseless; we have quite a few
>> > reasons to believe Xvisor ARM will perform better than KVM ARM in the
>> > real world too.
>>
>> I'm not stating a position on whether KVM will be better or worse
>> than Xvisor. I'm just pointing out that you can't base an argument
>> on the faulty assumption that performance inside a model can tell
>> you anything useful about performance on hardware.
>>
>> -- PMM
>>
>
> _______________________________________________
> Android-virt mailing list
> Android-virt@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/android-virt
>
>
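
The relative-speed example quoted above ("A A A A" vs "B") can be checked with a few lines of arithmetic. This is only a sketch of that toy example: the cost table uses the illustrative numbers from the quoted message, and the function names are made up here for illustration. It says nothing about any real workload.

```python
# Relative costs from the quoted toy example (arbitrary units, lower is faster).
COST = {
    "hardware": {"A": 1, "B": 3},
    "model": {"A": 5, "B": 30},
}


def sequence_cost(platform, ops):
    """Sum the per-operation cost of a sequence of operations on a platform."""
    return sum(COST[platform][op] for op in ops)


def faster_sequence(platform, seq1, seq2):
    """Return whichever of the two sequences is cheaper on the platform."""
    return min((seq1, seq2), key=lambda ops: sequence_cost(platform, ops))


for platform in COST:
    a4 = sequence_cost(platform, "AAAA")
    b = sequence_cost(platform, "B")
    winner = faster_sequence(platform, "AAAA", "B")
    print(f"{platform}: AAAA={a4} B={b} faster={winner}")
```

With these numbers "B" wins on hardware (3 against 4) while "A A A A" wins on the model (20 against 30), so the ranking of the two sequences flips even though both are slower on the model.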

Re: [Android-virt] [ANNOUNCE] Xvisor ARM better than KVM ARM in CPU virtualization [ In reply to ]

May 9, 2012, 4:16 AM

Post #24 of 26 (8234 views)

On May 9, 2012, at 2:11 AM, Anup Patel <anup@brainfault.org> wrote:

The intention behind the announcement was to inform people interested in
virtualization about Xvisor. The announcement gave early information about
the achievements of Xvisor ARM (for now, compared to KVM ARM). We are
certainly planning to write a scientific paper on Xvisor.

Consider the members of this list informed.

Also, I do agree that KVM ARM can be further optimized, but as I mentioned
in my previous replies, "KVM ARM will end up putting more and more stuff
in-kernel". For now you can think of Xvisor ARM as KVM ARM doing everything
in-kernel. Xvisor ARM is also being optimized, so as time passes it will
improve further. It is common wisdom that "no hypervisor in the world can
be better than native performance". Xvisor ARM is already very close to
native performance, and KVM ARM will come close to native performance only
by increasing its monolithic nature (i.e. doing more things in-kernel). If
monolithic hypervisors perform so well, then why not have a monolithic
hypervisor made for virtualization only? That was the motivation behind
writing Xvisor.

Apart from high performance, Xvisor has many interesting features, such as:
*Ability to work without hardware virtualization support* - Xvisor ARM is
able to boot multiple unmodified Linux guests even on hosts which do not
implement the virtualization extensions. In contrast, KVM ARM does not
work without the virtualization extensions. Xvisor ARM can therefore support
far more host hardware than KVM ARM can; it can in fact run on old ARMv5
processors too.
*Tree-based configuration* - To create a guest we just have to describe the
guest in the form of a device tree (possible even at runtime). In contrast,
for KVM one needs to add the support in QEMU and recompile the binaries.
*Pass-through hardware access* - Hardware not accessed or virtualized by
Xvisor can be used in pass-through mode. Giving a guest a pass-through
device is just a matter of adding a tree node and configuring the IRQ
routing information in the guest tree. This is not limited to PCI devices;
we can provide any kind of device as pass-through (note: if the device has
built-in DMA then it should sit behind an IOMMU or SysMMU, otherwise it
would be a security breach). We have already tried out a serial port and a
NIC as pass-through devices.

We can compare the KVM advantages with Xvisor as follows:
*Scheduler* - The Linux kernel scheduler is a very mature and proven OS
scheduler, but a hypervisor scheduler can be quite different. Scheduling
processes and scheduling VMs can be very different problems. In the case of
VMs we can use information such as the amount of emulated IO done, the
amount of time spent waiting for an IRQ, etc. to improve the quality of
server consolidation.
*Driver base* - Xvisor has, and will have, driver framework APIs similar to
those of Linux, so driver porting will be just a one-to-one replacement of
APIs in most cases.
*User space tools* - For starters, Xvisor will use libvirt tools (or a
similar open source initiative) for remote management.
*Co-existence with host processes* - Xvisor is not an OS. It is made for
virtualization only, so it has no processes. Of course, Xvisor has an
internal threading framework, but most of the time these background threads
are sleeping, doing nothing. All the management commands are provided by the
management terminal daemon (which is a background thread in Xvisor).

This all sounds fantastic!

But as I said, this list is for the development of KVM/ARM and not a place
for arguing how fantastic Xvisor is as opposed to everything else. Please
keep that in mind.

I am looking forward to your paper.

On Wed, May 9, 2012 at 4:29 AM, Christoffer Dall <cdall@cs.columbia.edu> wrote:

> Anup,
>
> Thanks for providing info on your Xvisor project. However, this is a
> mailing list for the development of KVM/ARM and not a scientific forum to
> establish in theory which hypervisor "will always perform better than"
> which other hypervisor.
>
> If you want to establish that your code base will always perform better
> than all other hosted hypervisors, I strongly encourage you to submit a
> paper about this in a peer reviewed conference. Personally I would find
> that establishing such facts in a scientific way to be extremely
> interesting.
>
> I understand that you wish to argue Xvisor's superiority in comparison to
> KVM, but I disagree with your conclusions. The code path taken in KVM can
> be optimized to be extremely short and all logic could be placed within the
> KVM module. There are numerous other advantages to using KVM (existing
> driver base, upstream kernel integration, compatibility with existing user
> space tools, co-existence with native host processes, etc.) and with server
> grade hardware I see the reliance on the Linux kernel for scheduling,
> memory management etc. to be a great advantage - and not a drawback. On the
> other hand, when Xvisor matures, feature requests will only increase its
> code size and complexity as well.
>
> -Christoffer
>
> On May 6, 2012, at 7:53 AM, Anup Patel <anup@brainfault.org> wrote:
>
> Hi PMM,
>
> Whether to use a model for measuring performance is a matter of opinion.
> There are a number of Tier 1 conferences which accept simulation numbers as
> evidence for better approaches, provided the simulation platform is well
> accepted by everyone.
>
> Talking about code sequences, both Xvisor ARM and KVM ARM have the same set
> of emulators and drivers. In fact, almost all emulation code has been adopted
> from QEMU. Many of the crucial drivers are adopted from Linux ARM. Unlike
> KVM ARM, in Xvisor ARM there is no unnecessary switching between host mode
> and guest mode, and the amount of code traversed in handling any fault is
> also very small; hence Xvisor ARM will execute much less code than KVM ARM.
>
> In Xvisor development, we have observed that the results of any CPU
> performance test on QEMU or a Fast Model naturally scale up on real
> hardware. At least, we have never come across any scenario or test
> performing better on QEMU or a Fast Model than in the real world (this is
> true for tests running on native Linux or on Linux running as a guest on
> Xvisor ARM).
>
> We strongly believe monolithic approaches always perform better than
> micro-kernelized approaches (or approaches somewhere in between micro-kernel
> and monolithic). Hence Xvisor ARM will always perform better than KVM ARM in
> theory, simulation and the real world.
>
> Best Regards,
> Anup Patel
>
> On Sun, May 6, 2012 at 2:21 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
>
>> On 6 May 2012 05:22, Anup Patel <anup@brainfault.org> wrote:
>> > Also can you give an example of a code sequence which is faster on the
>> > model and slower in the real world? As far as I know, ARM Fast Models are
>> > internally TLM-based models, and if a TLM-based model is emulating a timer
>> > chip of X clock then it is quite precisely X clock.
>>
>> Support for TLM does not require that the underlying model is cycle
>> accurate (you can have 'loosely timed' behaviour).
>>
>> You might want to read the Fast Models documentation, which tries
>> to be clear about what the models do and don't provide. In particular:
>>
>> http://infocenter.arm.com/help/topic/com.arm.doc.dui0423l/ch02s01s02.html
>> "Fast models cannot be used to:
>> * model cycle counting
>> * model software performance
>> "
>>
>> > Of course, CPU emulation and computation power will be less than in the
>> > real world. To see this behaviour, try to boot Linux on a Fast Model or
>> > QEMU, leave it for hours, and come back to check the time elapsed; you
>> > will definitely see the same amount of time elapsed as in the real
>> > world.
>>
>> Nobody's arguing that the models are faster than hardware!
>> Let's try a simple example with some numbers representing
>> relative speeds:
>>
>> operation A: h/w: 1 ; model 5
>> operation B: h/w 3 ; model 30
>>
>> Where we're comparing two equivalent code sequences "A A A A" vs "B".
>> On hardware "B" will be faster. On the model the "A A A A" beats "B".
>> (both sequences are slower on the model than on the hardware, obviously.)
>>
>> The point is that some operations will be vastly vastly slower
>> on the model, and some operations merely moderately slower. Which
>> of any two code sequences is fastest depends at least as much on
>> whether it's using operations that are disproportionally worse
>> on the model. A trivial example of this is VFP -- certainly QEMU
>> has to do complex software emulation of the floating point ops to
>> maintain bit-for-bit accuracy, which makes them very slow to the
>> point where a hand-optimised-integer-assembly codec is likely to
>> be faster on the model than a Neon/VFP-using codec, even though
>> of course the Neon codec will be faster on hardware.
>>
>> [NB: this is itself a big simplification: model performance will
>> depend on a lot of interacting things and is not purely a
>> same-every-time slowdown per operation. Some operations effectively
>> slow down what happens after them, for instance on QEMU if you do
>> something that makes us flush our cache of translated code. And
>> if for instance you have a periodic timer then the fact the model
>> is generally slower means you execute proportionally more insns in
>> the timer interrupt, so inefficiency or slowness in that code path
>> has disproportionately more effect on overall speed than it does
>> on hardware. There are other complications too...]
>>
>> > The results in the announcement are not baseless; we have quite a few
>> > reasons to believe Xvisor ARM will perform better than KVM ARM in the
>> > real world too.
>>
>> I'm not stating a position on whether KVM will be better or worse
>> than Xvisor. I'm just pointing out that you can't base an argument
>> on the faulty assumption that performance inside a model can tell
>> you anything useful about performance on hardware.
>>
>> -- PMM
>>
>
> _______________________________________________
> Android-virt mailing list
> Android-virt@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/android-virt
>
>
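
The periodic-timer remark in the quoted NB can be illustrated with a rough back-of-the-envelope model. This is a sketch under stated assumptions, not a measurement: it assumes the timer fires at a fixed rate in wall-clock time regardless of how fast instructions execute, and every figure below (instruction rates, tick rate, handler length) is hypothetical and chosen only to show the proportion changing.

```python
def timer_share(insns_per_second, timer_hz, handler_insns):
    """Fraction of all executed instructions spent in the timer handler.

    Assumes the timer fires timer_hz times per wall-clock second and each
    tick costs handler_insns instructions, whatever the platform speed.
    """
    timer_insns_per_second = timer_hz * handler_insns
    return min(1.0, timer_insns_per_second / insns_per_second)


# Hypothetical figures, for illustration only.
PLATFORMS = {
    "hardware": 1_000_000_000,  # instructions executed per wall-clock second
    "model": 10_000_000,        # same code, 100x slower in simulation
}
TIMER_HZ = 100          # timer ticks per wall-clock second
HANDLER_INSNS = 10_000  # instructions per timer interrupt

for name, ips in PLATFORMS.items():
    share = timer_share(ips, TIMER_HZ, HANDLER_INSNS)
    print(f"{name}: {share:.1%} of instructions spent in the timer handler")
```

Under these assumed figures the handler accounts for about 0.1% of executed instructions on hardware and about 10% on the model, which is why slowness in that code path weighs far more heavily on the model.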
