Mailing List Archive

Re: [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
On Tue, 4 Dec 2012, Will Deacon wrote:

> Hi Nicolas,
>
> On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
> > On the topic of a para-virtualised machine, I think that it should
> > simply implement the PSCI calls to bring up CPUs _without_ any holding
> > pen nor spinning tables. You issue the appropriate PSCI call with the
> > physical address for secondary_startup() as argument and you're done.
> > The host intercepts that call and frees a new CPU instance in response.
> > That's all.
>
> I'd be happy to go with this suggestion if it wasn't for one thing:
> platforms that do not implement a secure mode. For these platforms, smc will
> be an undefined instruction at the exception level where it is executed and
> therefore cannot be trapped by the hypervisor.

Really? I thought the hypervisor could virtualize SMC calls. Or is
that considered a security hazard?

I don't remember all the PSCI spec details, but I think there was some
provision for this case i.e. the SMC call could be a HYP call instead.
And if that's not in the spec, then it probably should be added and
implemented as if it was.

> If that situation requires a pen, I see no benefit from having two boot
> schemes where one of them would work in every case.

We always have the choice between several schemes in device drivers for
example, depending on the hardware generation. Yet we always implement
the better scheme for the newest hardware for performance reasons, even
if an older one could work in all cases.

A holding pen is a rather stupid scheme. Please let's try to do without
it if possible.


Nicolas

_______________________________________________
Xen-arm mailing list
Xen-arm@lists.xen.org
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-arm
Re: [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
Hi Rob,

[fixing Arnd's address, as I apparently can't spell]

On Mon, Dec 03, 2012 at 09:54:09PM +0000, Rob Herring wrote:
> On 12/03/2012 11:52 AM, Will Deacon wrote:
> > When running Linux on a para-virtualised platform (that is, one where
> > the guest is aware that it is dealing with virtual devices sitting on
> > things like virtio or xenbus) we require very little in the way of
> > platform code and piggy-backing on top of an existing platform can
> > require a lot of device emulation for very little gain.
> >
> > These two patches introduce mach-virt: a very simple, DT-based machine
> > which can be used with kvmtool in conjunction with virtio-based devices.
> > It's not hard to imagine the same machine being targeted by Xen, which
> > currently emulates a minimal variant of the vexpress platform.
> >
> > Note that this patch series depends on the timer rework from Mark
> > Rutland, posted on Friday:
> >
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2012-November/135651.html
> >
> > All feedback welcome. We suspect that most controversy will be around
> > the name of the thing :)
>
> We've discussed this before at conferences. I don't know that we
> concluded this wasn't needed, but it certainly leaned that direction.

I too leaned that direction before I started looking at kvm in detail and,
since then, I've changed my mind when it comes to para-virtualisation.

The reason for this is that there is absolutely no reason to emulate some
components of a real platform and then bolt virtual devices onto it once
you've got enough to get it going. It leads to a right royal mess in
userspace, where you have to write a load of non-reusable emulation code and
it leads to churn in the kernel because you're constantly at odds with
people trying to develop the platform code based on the actual hardware
they have.

With a virtualisation-capable ARMv7 system, all you *need* to boot SMP
Debian is:

- A v7 CPU with virt extensions
- vGIC
- architected timers

*everything* else can be described using virtio devices in the device-tree,
essentially allowing you to generate platforms based on the above and boot
the same kernel on them.

> So what has changed? You're not going to save code space because we're
> building multiple platforms together. You'll save some boot time, but a
> stripped down dtb with only the minimal peripherals would probably save
> nearly as much time.

It's really got nothing to do with code space or boot speed. What it *is*
about is avoiding the tight coupling with a real platform and suffering as a
result. Yes, you can strip down the DT for a real platform but you'll likely
still have to emulate things like the SP804 in order to boot. That's not to
mention any platform-specific system register interfaces which are required
early on.

We can't even re-use the socfpga code (which is incredibly minimal) without
emulating the dw_apb_timer.

> However, I do have concerns with using VExpress as
> the guest. For example, you can't support a non-PAE guest with 4GB of
> RAM on VExpress (maybe if the vexpress code gets all memory map info
> from DT).

Yes, vexpress is even less suitable for this.

> Is this really complete? Will we need reset, poweroff, hotplug, and
> suspend/resume support for example? Unlike most initial platform
> submissions which are minimal, I think seeing full support would be
> useful here. Then we can better gauge how much we are really saving.

The code is complete in the sense that you can boot an SMP guest running
Debian with console, network, block etc. etc. but you're right to point out
the absence of power-management support.

However, power-management in KVM guests is a *much* larger problem and not
one that has been solved adequately as of yet. There are suggestions that it
should be handled entirely in firmware, with the guest making smc calls to
request power-management operations but this is yet to materialise and, as
such, we can't yet use it here.

We could look at building a virtio-based power controller but that's going
to come up too late for SMP booting (although will give us hotplug, reset
etc).

Will

Re: [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
On 12/04/2012 06:30 AM, Will Deacon wrote:
> Hi Rob,
>
> [fixing Arnd's address, as I apparently can't spell]
>
> On Mon, Dec 03, 2012 at 09:54:09PM +0000, Rob Herring wrote:
>> On 12/03/2012 11:52 AM, Will Deacon wrote:
>>> When running Linux on a para-virtualised platform (that is, one where
>>> the guest is aware that it is dealing with virtual devices sitting on
>>> things like virtio or xenbus) we require very little in the way of
>>> platform code and piggy-backing on top of an existing platform can
>>> require a lot of device emulation for very little gain.
>>>
>>> These two patches introduce mach-virt: a very simple, DT-based machine
>>> which can be used with kvmtool in conjunction with virtio-based devices.
>>> It's not hard to imagine the same machine being targeted by Xen, which
>>> currently emulates a minimal variant of the vexpress platform.
>>>
>>> Note that this patch series depends on the timer rework from Mark
>>> Rutland, posted on Friday:
>>>
>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2012-November/135651.html
>>>
>>> All feedback welcome. We suspect that most controversy will be around
>>> the name of the thing :)
>>
>> We've discussed this before at conferences. I don't know that we
>> concluded this wasn't needed, but it certainly leaned that direction.
>
> I too leaned that direction before I started looking at kvm in detail and,
> since then, I've changed my mind when it comes to para-virtualisation.
>
> The reason for this is that there is absolutely no reason to emulate some
> components of a real platform and then bolt virtual devices onto it once
> you've got enough to get it going. It leads to a right royal mess in
> userspace, where you have to write a load of non-reusable emulation code and
> it leads to churn in the kernel because you're constantly at odds with
> people trying to develop the platform code based on the actual hardware
> they have.
>
> With a virtualisation-capable ARMv7 system, all you *need* to boot SMP
> Debian is:
>
> - A v7 CPU with virt extensions
> - vGIC
> - architected timers
>
> *everything* else can be described using virtio devices in the device-tree,
> essentially allowing you to generate platforms based on the above and boot
> the same kernel on them.
>
>> So what has changed? You're not going to save code space because we're
>> building multiple platforms together. You'll save some boot time, but a
>> stripped down dtb with only the minimal peripherals would probably save
>> nearly as much time.
>
> It's really got nothing to do with code space or boot speed. What it *is*
> about is avoiding the tight coupling with a real platform and suffering as a
> result. Yes, you can strip down the DT for a real platform but you'll likely
> still have to emulate things like the SP804 in order to boot. That's not to
> mention any platform-specific system register interfaces which are required
> early on.
>
> We can't even re-use the socfpga code (which is incredibly minimal) without
> emulating the dw_apb_timer.

That to me is highlighting where we need to do more work on DT driving
the initialization. The platforms are still aware of what kind of timers
and interrupt controllers are present. They should not be. There's work
in progress for both of those.

Lorenzo's DT MPIDR patches should trim down smp code some. The DT spin
table code could probably be common. I think I could use it on highbank
as well. If we decide the pen code stays, then it should be common
rather than creating yet another copy.

Rob
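
For readers unfamiliar with the holding pen mentioned above, here is a
minimal sketch of the idea, not the kernel's actual code. Threads stand in
for secondary CPUs, and the names HoldingPen, pen_release and
boot_secondaries are assumptions made for illustration. Each parked CPU
polls a shared word until the boot CPU writes that CPU's id into it.

```python
import threading
import time


class HoldingPen:
    """Toy model: threads stand in for secondary CPUs parked in a pen."""

    def __init__(self):
        self.pen_release = -1          # -1: no CPU is currently released
        self.lock = threading.Lock()
        self.booted = []               # CPU ids, in the order they left the pen

    def secondary(self, cpu_id):
        """Secondary CPU: poll until the boot CPU writes our id."""
        while True:
            with self.lock:
                if self.pen_release == cpu_id:
                    self.booted.append(cpu_id)
                    self.pen_release = -1   # acknowledge the release
                    return
            time.sleep(0.001)               # stand-in for a polling loop

    def release(self, cpu_id, timeout=2.0):
        """Boot CPU: release one CPU, then wait for its acknowledgement."""
        with self.lock:
            self.pen_release = cpu_id
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with self.lock:
                if self.pen_release == -1:
                    return True
            time.sleep(0.001)
        return False


def boot_secondaries(cpu_ids):
    """Park every secondary, then release them one at a time."""
    pen = HoldingPen()
    threads = [threading.Thread(target=pen.secondary, args=(c,), daemon=True)
               for c in cpu_ids]
    for t in threads:
        t.start()
    ok = [pen.release(c) for c in cpu_ids]
    for t in threads:
        t.join(timeout=2.0)
    return ok, pen.booted
```

Every parked CPU keeps polling until it is released, which is the cost the
pen imposes compared with a CPU that is simply held in reset.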


Re: [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
On Tue, 4 Dec 2012, Rob Herring wrote:

> That to me is highlighting where we need to do more work on DT driving
> the initialization. The platforms are still aware of what kind of timers
> and interrupt controllers are present. They should not be. There's work
> in progress for both of those.
>
> Lorenzo's DT MPIDR patches should trim down smp code some. The DT spin
> table code could probably be common. I think I could use it on highbank
> as well. If we decide the pen code stays, then it should be common
> rather than creating yet another copy.

I don't want to rain on the "everything should be common" parade here.
However, for the best part of last year I've been working on kernel
support for big.LITTLE systems, and the handling of CPU hotplug
(including SMP secondary boot) is far from being a trivial task.
Managing the simple bringing up or down of a CPU in such an environment
required hundreds of new lines of code. That is far from a simple
holding pen or spinning table to say the least.

[ For the curious, I'll post this code here soon for review. ]

So my point of view is: if you do not need a holding pen because you can
hold individual CPUs in reset, then don't. Many platforms with support
in the kernel can do that, yet they copied the holding pen code just
because it is there. And that is total crap.

On the topic of a para-virtualised machine, I think that it should
simply implement the PSCI calls to bring up CPUs _without_ any holding
pen nor spinning tables. You issue the appropriate PSCI call with the
physical address for secondary_startup() as argument and you're done.
The host intercepts that call and frees a new CPU instance in response.
That's all.


Nicolas
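
The PSCI-based bring-up described above can be sketched as a toy model.
This is an illustration under stated assumptions, not the PSCI interface
itself: the Host class, the trap() method, the numeric function id and
return codes, and the example entry address are all placeholders.

```python
# Placeholder values for illustration only; a real guest would take
# function ids and return codes from the PSCI specification or the DT.
CPU_ON = 1
SUCCESS = 0
NOT_SUPPORTED = -1
INVALID_PARAMETERS = -2
ALREADY_ON = -4


class Host:
    """Hypothetical hypervisor side: one entry per running virtual CPU."""

    def __init__(self, max_cpus):
        self.max_cpus = max_cpus
        self.vcpus = {0: None}   # the boot CPU is already running

    def trap(self, function_id, target_cpu, entry_point):
        """Handle a call from the guest that the host has intercepted."""
        if function_id != CPU_ON:
            return NOT_SUPPORTED
        if not 0 <= target_cpu < self.max_cpus:
            return INVALID_PARAMETERS
        if target_cpu in self.vcpus:
            return ALREADY_ON
        # Create the new CPU instance, starting at the guest's entry point.
        self.vcpus[target_cpu] = entry_point
        return SUCCESS


def bring_up_secondaries(host, ncpus, secondary_startup_pa):
    """Guest side: one call per secondary CPU, no pen and no spin table."""
    return [host.trap(CPU_ON, cpu, secondary_startup_pa)
            for cpu in range(1, ncpus)]
```

For example, bring_up_secondaries(Host(max_cpus=4), 4, 0x80008000) issues
three calls, one per secondary, and the guest never has to park a CPU
itself.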

Re: [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
Hi Nicolas,

On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
> On Tue, 4 Dec 2012, Rob Herring wrote:
>
> > That to me is highlighting where we need to do more work on DT driving
> > the initialization. The platforms are still aware of what kind of timers
> > and interrupt controllers are present. They should not be. There's work
> > in progress for both of those.
> >
> > Lorenzo's DT MPIDR patches should trim down smp code some. The DT spin
> > table code could probably be common. I think I could use it on highbank
> > as well. If we decide the pen code stays, then it should be common
> > rather than creating yet another copy.
>
> I don't want to rain on the "everything should be common" parade here.
> However, for the best part of last year I've been working on kernel
> support for big.LITTLE systems, and the handling of CPU hotplug
> (including SMP secondary boot) is far from being a trivial task.
> Managing the simple bringing up or down of a CPU in such an environment
> required hundreds of new lines of code. That is far from a simple
> holding pen or spinning table to say the least.
>
> [ For the curious, I'll post this code here soon for review. ]
>
> So my point of view is: if you do not need a holding pen because you can
> hold individual CPUs in reset, then don't. Many platforms with support
> in the kernel can do that, yet they copied the holding pen code just
> because it is there. And that is total crap.

Agreed, but it's also total crap forcing emulation of a made-up power
controller on the host in the case of a virtual platform.

> On the topic of a para-virtualised machine, I think that it should
> simply implement the PSCI calls to bring up CPUs _without_ any holding
> pen nor spinning tables. You issue the appropriate PSCI call with the
> physical address for secondary_startup() as argument and you're done.
> The host intercepts that call and frees a new CPU instance in response.
> That's all.

I'd be happy to go with this suggestion if it wasn't for one thing:
platforms that do not implement a secure mode. For these platforms, smc will
be an undefined instruction at the exception level where it is executed and
therefore cannot be trapped by the hypervisor.

If that situation requires a pen, I see no benefit from having two boot
schemes where one of them would work in every case.

Will

Re: [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
On Tue, Dec 04, 2012 at 06:02:13PM +0000, Nicolas Pitre wrote:
> On Tue, 4 Dec 2012, Will Deacon wrote:
>
> > Hi Nicolas,
> >
> > On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
> > > On the topic of a para-virtualised machine, I think that it should
> > > simply implement the PSCI calls to bring up CPUs _without_ any holding
> > > pen nor spinning tables. You issue the appropriate PSCI call with the
> > > physical address for secondary_startup() as argument and you're done.
> > > The host intercepts that call and frees a new CPU instance in response.
> > > That's all.
> >
> > I'd be happy to go with this suggestion if it wasn't for one thing:
> > platforms that do not implement a secure mode. For these platforms, smc will
> > be an undefined instruction at the exception level where it is executed and
> > therefore cannot be trapped by the hypervisor.
>
> Really? I thought the hypervisor could virtualize SMC calls. Or is
> that considered a security hazard?

If the security extensions aren't implemented, the hypervisor can't trap the
smc instruction.

> I don't remember all the PSCI spec details, but I think there was some
> provision for this case i.e. the SMC call could be a HYP call instead.
> And if that's not in the spec, then it probably should be added and
> implemented as if it was.

Well, this depends on the guest taking an undefined instruction exception on
the smc, then deciding to issue an hvc instead and *then* having the
hypervisor somehow translate that into a PSCI invocation. It could work, but
it sounds easy to mess up and relies on the PSCI firmware co-existing with
things like kvm.

> > If that situation requires a pen, I see no benefit from having two boot
> > schemes where one of them would work in every case.
>
> We always have the choice between several schemes in device drivers for
> example, depending on the hardware generation. Yet we always implement
> the better scheme for the newest hardware for performance reasons, even
> if an older one could work in all cases.

Again, I totally agree when it comes to things like poweroff and hotplug but
for booting I don't think we gain much from having multiple implementations
for a single platform. Hopefully this is moot -- see below.

> A holding pen is a rather stupid scheme. Please let's try to do without
> it if possible.

I've just hacked up Rob's suggestion and it seems to be working, so I'll
post a pen-less v2 tomorrow. The hotplug/reboot code can come later when we
have something host-side that we can use (could be PSCI).

Will
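
The smc-then-hvc fallback discussed in this message can be modelled roughly
as follows. It is only a sketch: the Machine class, the UndefinedInstruction
exception and the string-based requests are assumptions for illustration
and do not correspond to any real kernel or hypervisor interface.

```python
class UndefinedInstruction(Exception):
    """Stand-in for the undefined instruction exception."""


class Machine:
    """Hypothetical guest view of the platform."""

    def __init__(self, has_security_extensions):
        self.has_security_extensions = has_security_extensions
        self.log = []

    def smc(self, request):
        if not self.has_security_extensions:
            # Without the security extensions the smc cannot be trapped.
            self.log.append("smc: undef")
            raise UndefinedInstruction("smc")
        self.log.append("smc: trapped")
        return "psci:" + request

    def hvc(self, request):
        # The hypervisor has to translate the hvc into a PSCI request.
        self.log.append("hvc: trapped")
        return "psci:" + request


def psci_invoke(machine, request):
    """Try smc first; on an undefined instruction, retry with hvc."""
    try:
        return machine.smc(request)
    except UndefinedInstruction:
        return machine.hvc(request)
```

The fallback path needs three cooperating pieces (the guest's undef
handling, the retry, and the hypervisor's translation), which illustrates
why it is described above as easy to mess up.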

Re: [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
On 4 December 2012 18:14, Will Deacon <will.deacon@arm.com> wrote:
> On Tue, Dec 04, 2012 at 06:02:13PM +0000, Nicolas Pitre wrote:
>> On Tue, 4 Dec 2012, Will Deacon wrote:
>> > On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
>> > > On the topic of a para-virtualised machine, I think that it should
>> > > simply implement the PSCI calls to bring up CPUs _without_ any holding
>> > > pen nor spinning tables. You issue the appropriate PSCI call with the
>> > > physical address for secondary_startup() as argument and you're done.
>> > > The host intercepts that call and frees a new CPU instance in response.
>> > > That's all.
>> >
>> > I'd be happy to go with this suggestion if it wasn't for one thing:
>> > platforms that do not implement a secure mode. For these platforms, smc will
>> > be an undefined instruction at the exception level where it is executed and
>> > therefore cannot be trapped by the hypervisor.
>>
>> Really? I thought the hypervisor could virtualize SMC calls. Or is
>> that considered a security hazard?
>
> If the security extensions aren't implemented, the hypervisor can't trap the
> smc instruction.
>
>> I don't remember all the PSCI spec details, but I think there was some
>> provision for this case i.e. the SMC call could be a HYP call instead.
>> And if that's not in the spec, then it probably should be added and
>> implemented as if it was.
>
> Well, this depends on the guest taking an undefined instruction exception on
> the smc, then deciding to issue an hvc instead and *then* having the
> hypervisor somehow translate that into a PSCI invocation. It could work, but
> it sounds easy to mess up and relies on the PSCI firmware co-existing with
> things like kvm.

We can have enable-method DT entries independent of the SoC and one of
them can be psci-hvc.

Just for clarification, AArch32 with virtualisation mandates the
security extensions, so the SMC can be trapped. On AArch64 it is a bit
tricky since the presence of EL3 is not mandated, in which case SMC
would undef (don't ask why ;). That's where we can have different
enable methods specified via the DT.

--
Catalin
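
One way to picture SoC-independent enable-method entries is a lookup table
keyed by the DT string. The sketch below is hypothetical: only "psci-hvc"
comes from this message, while "psci-smc", the handler functions and the
dict-shaped CPU node are assumptions made for illustration.

```python
def boot_via_psci_smc(cpu):
    """Placeholder handler: would issue the PSCI call using smc."""
    return ("smc", cpu)


def boot_via_psci_hvc(cpu):
    """Placeholder handler: would issue the PSCI call using hvc."""
    return ("hvc", cpu)


# Maps the enable-method string found in a CPU node to a handler.
# "psci-smc" is a made-up name used here as the counterpart of "psci-hvc".
ENABLE_METHODS = {
    "psci-smc": boot_via_psci_smc,
    "psci-hvc": boot_via_psci_hvc,
}


def boot_cpu(cpu_node):
    """Pick the bring-up handler named by the node's enable-method."""
    method = cpu_node["enable-method"]
    try:
        handler = ENABLE_METHODS[method]
    except KeyError:
        raise ValueError("unknown enable-method: " + method)
    return handler(cpu_node["reg"])
```

With this shape, a platform without EL3 only needs a different string in
its device tree; no platform-specific boot code is involved.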

Re: [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
On Wed, 5 Dec 2012, Catalin Marinas wrote:

> On 4 December 2012 18:14, Will Deacon <will.deacon@arm.com> wrote:
> > On Tue, Dec 04, 2012 at 06:02:13PM +0000, Nicolas Pitre wrote:
> >> On Tue, 4 Dec 2012, Will Deacon wrote:
> >> > On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
> >> > > On the topic of a para-virtualised machine, I think that it should
> >> > > simply implement the PSCI calls to bring up CPUs _without_ any holding
> >> > > pen nor spinning tables. You issue the appropriate PSCI call with the
> >> > > physical address for secondary_startup() as argument and you're done.
> >> > > The host intercepts that call and frees a new CPU instance in response.
> >> > > That's all.
> >> >
> >> > I'd be happy to go with this suggestion if it wasn't for one thing:
> >> > platforms that do not implement a secure mode. For these platforms, smc will
> >> > be an undefined instruction at the exception level where it is executed and
> >> > therefore cannot be trapped by the hypervisor.
> >>
> >> Really? I thought the hypervisor could virtualize SMC calls. Or is
> >> that considered a security hazard?
> >
> > If the security extensions aren't implemented, the hypervisor can't trap the
> > smc instruction.
> >
> >> I don't remember all the PSCI spec details, but I think there was some
> >> provision for this case i.e. the SMC call could be a HYP call instead.
> >> And if that's not in the spec, then it probably should be added and
> >> implemented as if it was.
> >
> > Well, this depends on the guest taking an undefined instruction exception on
> > the smc, then deciding to issue an hvc instead and *then* having the
> > hypervisor somehow translate that into a PSCI invocation. It could work, but
> > it sounds easy to mess up and relies on the PSCI firmware co-existing with
> > things like kvm.
>
> We can have enable-method DT entries independent of the SoC and one of
> them can be psci-hvc.
>
> Just for clarification, AArch32 with virtualisation mandates the
> security extensions, so the SMC can be trapped.

Good. Therefore this one is settled.

> On AArch64 it is a bit
> tricky since the presence of EL3 is not mandated, in which case SMC
> would undef (don't ask why ;). That's where we can have different
> enable methods specified via the DT.

In that case, sure. But do you expect such a configuration to be
common? Especially with secure booting and the like being enforced
across the board? I bet it won't be.

So it is probably best to presume PSCI by default, and have a DT
specified method only when it is necessary to override the default.


Nicolas
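
The "PSCI by default, DT override only when needed" policy suggested above
could look roughly like this. The sketch assumes CPU nodes represented as
plain dicts and a default method named "psci"; both are illustrative
choices, not an existing binding.

```python
DEFAULT_METHOD = "psci"   # assumed name for the default bring-up method


def enable_method_for(cpu_node):
    """Use the node's enable-method if present, otherwise the default."""
    return cpu_node.get("enable-method", DEFAULT_METHOD)


def plan_boot(cpu_nodes):
    """Map each CPU id (its "reg" value) to the method that will boot it."""
    return {node["reg"]: enable_method_for(node) for node in cpu_nodes}
```

For instance, plan_boot([{"reg": 1}, {"reg": 2, "enable-method":
"psci-hvc"}]) yields the default for CPU 1 and the override for CPU 2, so
the common case needs no property at all.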

Re: [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
On Wed, Dec 05, 2012 at 03:07:05PM +0000, Nicolas Pitre wrote:
> On Wed, 5 Dec 2012, Catalin Marinas wrote:
> > On 4 December 2012 18:14, Will Deacon <will.deacon@arm.com> wrote:
> > > Well, this depends on the guest taking an undefined instruction exception on
> > > the smc, then deciding to issue an hvc instead and *then* having the
> > > hypervisor somehow translate that into a PSCI invocation. It could work, but
> > > it sounds easy to mess up and relies on the PSCI firmware co-existing with
> > > things like kvm.
> >
> > We can have enable-method DT entries independent of the SoC and one of
> > them can be psci-hvc.
> >
> > Just for clarification, AArch32 with virtualisation mandates the
> > security extensions, so the SMC can be trapped.
>
> Good. Therefore this one is settled.

Looks like we replied at the same time! Please see my other mail...

Will
