
Virtio in Xen on Arm (based on IOREQ concept)
Hello all.

We would like to resume Virtio in Xen on Arm activities. You can find some
background at [1] and Virtio specification at [2].

*A few words about importance:*
There is increasing interest, I would even say a requirement, to have a
flexible, generic and standardized cross-hypervisor solution for I/O
virtualization in the automotive and embedded areas. The target is quite
clear here. By providing a standardized interface and device models for
device para-virtualization in hypervisor environments, Virtio allows us
to move guest domains among different hypervisor systems without further
modification on the guest side. What is more, Virtio support is available
in Linux, Android and many other operating systems, and there are a lot
of existing Virtio drivers (frontends) which could simply be reused
without reinventing the wheel. Many organisations are pushing Virtio as a
common interface. To summarize, Virtio support would be a great feature
in Xen on Arm, in addition to the traditional Xen PV drivers, so that the
user is able to choose which one to use.

*A few words about the solution:*
As was mentioned at [1], in order to implement virtio-mmio, Xen on Arm
requires support for forwarding guest MMIO accesses to a device model. As
it turned out, Xen on x86 already contains most of the pieces needed to
use that transport (via the existing IOREQ concept). Julien has already
done a large amount of work in his PoC (xen/arm: Add support for Guest IO
forwarding to a device emulator). Using that code as a base we managed to
create a completely functional PoC with a DomU running on a virtio block
device instead of a traditional Xen PV driver, without modifications to
the DomU Linux. Our work is mostly about rebasing Julien's code on the
current codebase (Xen 4.14-rc4), various tweaks to be able to run the
emulator (virtio-disk backend) in a domain other than Dom0 (in our system
we have a thin Dom0 and keep all backends in a driver domain), misc fixes
for our use-cases and tool support for the configuration. Unfortunately,
Julien doesn't have much time to allocate to this work anymore, so we
would like to step in and continue.
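
For illustration, here is a minimal sketch of the emulator-side handling
that such IOREQ forwarding enables: the device model receives the trapped
MMIO access and completes it. The ioreq_t layout and state constants come
from the public Xen headers; handle_mmio_read()/handle_mmio_write() are
hypothetical device-model callbacks, and the IOREQ server setup and
event-channel signalling are omitted, so this is a sketch rather than the
actual series code.

    /*
     * Sketch only: ioreq_t, its state values and IOREQ_TYPE_COPY come
     * from the public header xen/hvm/ioreq.h; handle_mmio_read() and
     * handle_mmio_write() are hypothetical device-model callbacks.
     */
    #include <stdint.h>
    #include <xen/hvm/ioreq.h>

    extern uint64_t handle_mmio_read(uint64_t addr, unsigned int size);
    extern void handle_mmio_write(uint64_t addr, unsigned int size,
                                  uint64_t data);

    static void dispatch_ioreq(ioreq_t *ioreq)
    {
        /* Only plain MMIO accesses with inline data are emulated here. */
        if (ioreq->state != STATE_IOREQ_READY ||
            ioreq->type != IOREQ_TYPE_COPY || ioreq->data_is_ptr)
            return;

        if (ioreq->dir == IOREQ_READ)
            ioreq->data = handle_mmio_read(ioreq->addr, ioreq->size);
        else
            handle_mmio_write(ioreq->addr, ioreq->size, ioreq->data);

        /* Mark the request as completed; Xen then resumes the vCPU. */
        ioreq->state = STATE_IORESP_READY;
    }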

*A few words about the Xen code:*
You can find the whole Xen series at [5]. The patches are in RFC state
because some parts of the series should be reconsidered and implemented
properly. Before submitting the final code for review, the first IOREQ
patch (which is quite big) will be split into x86, Arm and common parts.
Please note that the x86 part hasn't even been build-tested so far and
could be broken by that series. Also, before going to the mailing list
the series probably wants splitting into adding IOREQ on Arm (which
should be the focus first) and tools support for the configuration of
virtio-disk (which is going to be the first Virtio driver).

What I would like to add here is that the IOREQ feature on Arm could be
used not only for implementing Virtio, but also for other use-cases which
require some emulator entity outside Xen, such as a custom
(non-ECAM-compatible) PCI emulator, for example.

*A few words about the backend(s):*
One of the main problems with Virtio in Xen on Arm is the absence of
“ready-to-use” and “out-of-Qemu” Virtio backends (at least I am not aware
of any). We managed to create a virtio-disk backend based on demu [3] and
kvmtool [4] using that series. It is worth mentioning that although
Xenbus/Xenstore is not supposed to be used with native Virtio, that
interface was chosen just to pass configuration from the toolstack to the
backend and to notify it about guest domain creation/destruction (I think
this is acceptable, since backends are usually tied to the hypervisor and
can use the services it provides). The most important point here is that
the whole Virtio subsystem in the guest was left unmodified. The backend
needs some cleanup and, probably, refactoring. We plan to publish it in a
while.
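
To make the toolstack/backend interaction more concrete, here is a
minimal sketch of how a backend in a driver domain could watch Xenstore
for guest creation/destruction and re-read its configuration. The
libxenstore calls are real, but the paths used below are purely
hypothetical placeholders, since our backend and its Xenstore layout have
not been published yet.

    /*
     * Sketch only: xs_open/xs_watch/xs_read_watch/xs_read are the real
     * libxenstore API, but the paths below are hypothetical placeholders
     * for whatever layout the toolstack ends up writing.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <xenstore.h>

    int main(void)
    {
        struct xs_handle *xs = xs_open(0);

        if (!xs)
            return 1;

        /* Fires whenever the toolstack adds/removes a virtio-disk entry. */
        xs_watch(xs, "/local/domain/1/backend/virtio_disk", "virtio-disk");

        for (;;) {
            unsigned int num, len;
            char **ev = xs_read_watch(xs, &num); /* blocks for the next event */
            char *disk;

            if (!ev)
                break;

            /* Re-read the (hypothetical) per-guest configuration key. */
            disk = xs_read(xs, XBT_NULL,
                           "/local/domain/1/backend/virtio_disk/2/0/filename",
                           &len);
            if (disk) {
                printf("serving image %s\n", disk);
                free(disk);
            }
            free(ev);
        }

        xs_close(xs);
        return 0;
    }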

Our next plan is to start preparing the series for review. Any feedback
would be highly appreciated.

[1]
https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01746.html
[2]
https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html
[3] https://xenbits.xen.org/gitweb/?p=people/pauldu/demu.git;a=summary
[4] https://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git/
[5] https://github.com/xen-troops/xen/commits/ioreq_4.14_ml

--
Regards,

Oleksandr Tyshchenko
Re: Virtio in Xen on Arm (based on IOREQ concept)
Hello,

I'm very happy to see this proposal, as I think having proper (1st
class) VirtIO support on Xen is crucial to our survival. Almost all
OSes have VirtIO frontends, while the same can't be said about Xen PV
frontends. It would also allow us to piggyback on any new VirtIO
devices without having to re-invent the wheel by creating a clone Xen
PV device.

On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
> Hello all.
>
> We would like to resume Virtio in Xen on Arm activities. You can find some
> background at [1] and Virtio specification at [2].
>
> *A few words about importance:*
> There is an increasing interest, I would even say, the requirement to have
> flexible, generic and standardized cross-hypervisor solution for I/O
> virtualization
> in the automotive and embedded areas. The target is quite clear here.
> Providing a standardized interface and device models for device
> para-virtualization
> in hypervisor environments, Virtio interface allows us to move Guest domains
> among different hypervisor systems without further modification at the
> Guest side.
> What is more that Virtio support is available in Linux, Android and many
> other
> operating systems and there are a lot of existing Virtio drivers (frontends)
> which could be just reused without reinventing the wheel. Many
> organisations push
> Virtio direction as a common interface. To summarize, Virtio support would
> be
> the great feature in Xen on Arm in addition to traditional Xen PV drivers
> for
> the user to be able to choose which one to use.

I think most of the above also applies to x86, and fully agree.

>
> *A few word about solution:*
> As it was mentioned at [1], in order to implement virtio-mmio Xen on Arm

Any plans for virtio-pci? Arm seems to be moving to the PCI bus, and
it would be very interesting from an x86 PoV, as I don't think
virtio-mmio is something that you can easily use on x86 (or even use
at all).

> requires
> some implementation to forward guest MMIO access to a device model. And as
> it
> turned out the Xen on x86 contains most of the pieces to be able to use that
> transport (via existing IOREQ concept). Julien has already done a big amount
> of work in his PoC (xen/arm: Add support for Guest IO forwarding to a
> device emulator).
> Using that code as a base we managed to create a completely functional PoC
> with DomU
> running on virtio block device instead of a traditional Xen PV driver
> without
> modifications to DomU Linux. Our work is mostly about rebasing Julien's
> code on the actual
> codebase (Xen 4.14-rc4), various tweeks to be able to run emulator
> (virtio-disk backend)
> in other than Dom0 domain (in our system we have thin Dom0 and keep all
> backends
> in driver domain),

How do you handle this use-case? Are you using grants in the VirtIO
ring, or rather allowing the driver domain to map all the guest memory
and then placing gfn on the ring like it's commonly done with VirtIO?

Do you have any plans to try to upstream a modification to the VirtIO
spec so that grants (ie: abstract references to memory addresses) can
be used on the VirtIO ring?

> misc fixes for our use-cases and tool support for the
> configuration.
> Unfortunately, Julien doesn’t have much time to allocate on the work
> anymore,
> so we would like to step in and continue.
>
> *A few word about the Xen code:*
> You can find the whole Xen series at [5]. The patches are in RFC state
> because
> some actions in the series should be reconsidered and implemented properly.
> Before submitting the final code for the review the first IOREQ patch
> (which is quite
> big) will be split into x86, Arm and common parts. Please note, x86 part
> wasn’t
> even build-tested so far and could be broken with that series. Also the
> series probably
> wants splitting into adding IOREQ on Arm (should be focused first) and
> tools support
> for the virtio-disk (which is going to be the first Virtio driver)
> configuration before going
> into the mailing list.

Sending first a patch series to enable IOREQs on Arm seems perfectly
fine, and it doesn't have to come with the VirtIO backend. In fact I
would recommend that you send that ASAP, so that you don't spend time
working on the backend that would likely need to be modified
according to the review received on the IOREQ series.

>
> What I would like to add here, the IOREQ feature on Arm could be used not
> only
> for implementing Virtio, but for other use-cases which require some
> emulator entity
> outside Xen such as custom PCI emulator (non-ECAM compatible) for example.
>
> *A few word about the backend(s):*
> One of the main problems with Virtio in Xen on Arm is the absence of
> “ready-to-use” and “out-of-Qemu” Virtio backends (I least am not aware of).
> We managed to create virtio-disk backend based on demu [3] and kvmtool [4]
> using
> that series. It is worth mentioning that although Xenbus/Xenstore is not
> supposed
> to be used with native Virtio, that interface was chosen to just pass
> configuration from toolstack
> to the backend and notify it about creating/destroying Guest domain (I
> think it is

I would prefer if a single instance was launched to handle each
backend, and that the configuration was passed on the command line.
Killing the user-space backend from the toolstack is fine I think,
there's no need to notify the backend using xenstore or any other
out-of-band methods.

xenstore has proven to be a bottleneck in terms of performance, and it
would be better if we can avoid using it when possible, especially here
where you have to do this from scratch anyway.

Thanks, Roger.
Re: Virtio in Xen on Arm (based on IOREQ concept)
On 17.07.20 18:00, Roger Pau Monné wrote:
> Hello,

Hello Roger


> I'm very happy to see this proposal, as I think having proper (1st
> class) VirtIO support on Xen is crucial to our survival. Almost all
> OSes have VirtIO frontends, while the same can't be said about Xen PV
> frontends. It would also allow us to piggyback on any new VirtIO
> devices without having to re-invent the wheel by creating a clone Xen
> PV device.

Thank you.


>
> On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
>> Hello all.
>>
>> We would like to resume Virtio in Xen on Arm activities. You can find some
>> background at [1] and Virtio specification at [2].
>>
>> *A few words about importance:*
>> There is an increasing interest, I would even say, the requirement to have
>> flexible, generic and standardized cross-hypervisor solution for I/O
>> virtualization
>> in the automotive and embedded areas. The target is quite clear here.
>> Providing a standardized interface and device models for device
>> para-virtualization
>> in hypervisor environments, Virtio interface allows us to move Guest domains
>> among different hypervisor systems without further modification at the
>> Guest side.
>> What is more that Virtio support is available in Linux, Android and many
>> other
>> operating systems and there are a lot of existing Virtio drivers (frontends)
>> which could be just reused without reinventing the wheel. Many
>> organisations push
>> Virtio direction as a common interface. To summarize, Virtio support would
>> be
>> the great feature in Xen on Arm in addition to traditional Xen PV drivers
>> for
>> the user to be able to choose which one to use.
> I think most of the above also applies to x86, and fully agree.
>
>> *A few word about solution:*
>> As it was mentioned at [1], in order to implement virtio-mmio Xen on Arm
> Any plans for virtio-pci? Arm seems to be moving to the PCI bus, and
> it would be very interesting from a x86 PoV, as I don't think
> virtio-mmio is something that you can easily use on x86 (or even use
> at all).

To be honest, I didn't consider virtio-pci so far. Julien's PoC (which we
are based on) provides support for the virtio-mmio transport, which is
enough to start working with Virtio and is not as complex as virtio-pci.
But that doesn't mean there is no way to support virtio-pci in Xen.

I think this could be added in the next steps. But the nearest target is
the virtio-mmio approach (of course, if the community agrees on that).


>> requires
>> some implementation to forward guest MMIO access to a device model. And as
>> it
>> turned out the Xen on x86 contains most of the pieces to be able to use that
>> transport (via existing IOREQ concept). Julien has already done a big amount
>> of work in his PoC (xen/arm: Add support for Guest IO forwarding to a
>> device emulator).
>> Using that code as a base we managed to create a completely functional PoC
>> with DomU
>> running on virtio block device instead of a traditional Xen PV driver
>> without
>> modifications to DomU Linux. Our work is mostly about rebasing Julien's
>> code on the actual
>> codebase (Xen 4.14-rc4), various tweeks to be able to run emulator
>> (virtio-disk backend)
>> in other than Dom0 domain (in our system we have thin Dom0 and keep all
>> backends
>> in driver domain),
> How do you handle this use-case? Are you using grants in the VirtIO
> ring, or rather allowing the driver domain to map all the guest memory
> and then placing gfn on the ring like it's commonly done with VirtIO?

The second option. Xen grants are not used at all, nor are event channels
and Xenbus. That allows us to keep the guest *unmodified*, which is one of
the main goals. Yes, this may sound (or even is) non-secure, but the
backend, which runs in the driver domain, is allowed to map all guest
memory.

In the current backend implementation a part of guest memory is mapped
just to process a guest request and then unmapped again; there are no
mappings set up in advance. The xenforeignmemory_map call is used for
that purpose. As an experiment I tried to map all guest memory in advance
and just calculate pointers at runtime; of course that logic performed
better.

I was also thinking about static guest memory regions and forcing the
guest to allocate descriptors from them (in order to map only a
predefined region instead of all guest memory). But that implies
modifying the guest...
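
For reference, a minimal sketch of the per-request mapping described
above, using the real libxenforeignmemory API. The fmem handle would come
from xenforeignmemory_open(NULL, 0), the domid/gfn/offset values would
come from the virtio descriptors, and error handling is reduced to the
bare minimum.

    /*
     * Sketch of mapping a single guest page to serve one request and
     * unmapping it right after, as the current backend does. The API is
     * real libxenforeignmemory; all values are placeholders.
     */
    #include <stdint.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <xenforeignmemory.h>

    static int copy_from_guest(xenforeignmemory_handle *fmem, uint32_t domid,
                               xen_pfn_t gfn, unsigned int offset,
                               void *dst, size_t len)
    {
        int err = 0;
        /* Map the guest page that backs the virtio buffer... */
        void *p = xenforeignmemory_map(fmem, domid, PROT_READ, 1, &gfn, &err);

        if (!p || err)
            return -1;

        memcpy(dst, (const char *)p + offset, len);

        /* ...and drop the mapping once the request has been processed. */
        xenforeignmemory_unmap(fmem, p, 1);
        return 0;
    }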


>
> Do you have any plans to try to upstream a modification to the VirtIO
> spec so that grants (ie: abstract references to memory addresses) can
> be used on the VirtIO ring?

But the VirtIO spec hasn't been modified, nor has the VirtIO
infrastructure in the guest. Nothing to upstream :)


>
>> misc fixes for our use-cases and tool support for the
>> configuration.
>> Unfortunately, Julien doesn’t have much time to allocate on the work
>> anymore,
>> so we would like to step in and continue.
>>
>> *A few word about the Xen code:*
>> You can find the whole Xen series at [5]. The patches are in RFC state
>> because
>> some actions in the series should be reconsidered and implemented properly.
>> Before submitting the final code for the review the first IOREQ patch
>> (which is quite
>> big) will be split into x86, Arm and common parts. Please note, x86 part
>> wasn’t
>> even build-tested so far and could be broken with that series. Also the
>> series probably
>> wants splitting into adding IOREQ on Arm (should be focused first) and
>> tools support
>> for the virtio-disk (which is going to be the first Virtio driver)
>> configuration before going
>> into the mailing list.
> Sending first a patch series to enable IOREQs on Arm seems perfectly
> fine, and it doesn't have to come with the VirtIO backend. In fact I
> would recommend that you send that ASAP, so that you don't spend time
> working on the backend that would likely need to be modified
> according to the review received on the IOREQ series.

Completely agree with you, I will send it after splitting the IOREQ patch
and performing some cleanup.

However, it is going to take some time to do that properly, taking into
account that personally I won't be able to test on x86.


>
>> What I would like to add here, the IOREQ feature on Arm could be used not
>> only
>> for implementing Virtio, but for other use-cases which require some
>> emulator entity
>> outside Xen such as custom PCI emulator (non-ECAM compatible) for example.
>>
>> *A few word about the backend(s):*
>> One of the main problems with Virtio in Xen on Arm is the absence of
>> “ready-to-use” and “out-of-Qemu” Virtio backends (I least am not aware of).
>> We managed to create virtio-disk backend based on demu [3] and kvmtool [4]
>> using
>> that series. It is worth mentioning that although Xenbus/Xenstore is not
>> supposed
>> to be used with native Virtio, that interface was chosen to just pass
>> configuration from toolstack
>> to the backend and notify it about creating/destroying Guest domain (I
>> think it is
> I would prefer if a single instance was launched to handle each
> backend, and that the configuration was passed on the command line.
> Killing the user-space backend from the toolstack is fine I think,
> there's no need to notify the backend using xenstore or any other
> out-of-band methods.
>
> xenstore has proven to be a bottleneck in terms of performance, and it
> would be better if we can avoid using it when possible, specially here
> that you have to do this from scratch anyway.

Let me elaborate a bit more on this.

In the current backend implementation, Xenstore is *not* used for
communication between the backend (VirtIO device) and the frontend
(VirtIO driver); the frontend knows nothing about it.

Xenstore was chosen as an interface in order to be able to pass
configuration from the toolstack in Dom0 to a backend which may reside in
a domain other than Dom0 (DomD in our case). Also, by looking at the
Xenstore entries the backend always knows when the intended guest has
been created or destroyed.

I may be mistaken, but I don't think we can avoid using Xenstore (or
another interface provided by the toolstack), for several reasons.

Besides the virtio-disk configuration (the disk to be assigned to the
guest, R/O mode, etc.), for each virtio-mmio device instance a pair (MMIO
range + IRQ) is allocated by the toolstack at guest construction time and
inserted into the virtio-mmio node in the guest device tree. For the
backend to operate properly, these variable parameters are also passed to
it via Xenstore.

The other reasons are:

1. Automation. With the current backend implementation we don't need to
pause the guest right after creating it, then go to the driver domain to
spawn the backend, and after that go back to Dom0 to unpause the guest.

2. The ability to detect when a guest with an involved frontend has gone
away and properly release resources (guest destroy/reboot).

3. The ability to (re)connect to a newly created guest with an involved
frontend (guest create/reboot).

4. What is more, with Xenstore support the backend is able to detect the
dom_id it runs in and the guest dom_id, so there is no need to pass them
via the command line (see the sketch below).
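
As a small illustration of point 4, a domain can typically read its own
dom_id from the relative Xenstore path "domid"; a minimal sketch,
assuming libxenstore is available in the driver domain:

    /*
     * Sketch only: reads the calling domain's own domid via the relative
     * Xenstore path "domid", typically written by the toolstack for each
     * domain.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <xenstore.h>

    int main(void)
    {
        struct xs_handle *xs = xs_open(0);
        unsigned int len;
        char *val;

        if (!xs)
            return 1;

        val = xs_read(xs, XBT_NULL, "domid", &len);
        if (val) {
            printf("backend runs in dom%s\n", val);
            free(val);
        }

        xs_close(xs);
        return 0;
    }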


I will be happy to explain in detail after publishing the backend code :)


--
Regards,

Oleksandr Tyshchenko
Re: Virtio in Xen on Arm (based on IOREQ concept)
On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
> On 17.07.20 18:00, Roger Pau Monné wrote:
> > On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
> > > requires
> > > some implementation to forward guest MMIO access to a device model. And as
> > > it
> > > turned out the Xen on x86 contains most of the pieces to be able to use that
> > > transport (via existing IOREQ concept). Julien has already done a big amount
> > > of work in his PoC (xen/arm: Add support for Guest IO forwarding to a
> > > device emulator).
> > > Using that code as a base we managed to create a completely functional PoC
> > > with DomU
> > > running on virtio block device instead of a traditional Xen PV driver
> > > without
> > > modifications to DomU Linux. Our work is mostly about rebasing Julien's
> > > code on the actual
> > > codebase (Xen 4.14-rc4), various tweeks to be able to run emulator
> > > (virtio-disk backend)
> > > in other than Dom0 domain (in our system we have thin Dom0 and keep all
> > > backends
> > > in driver domain),
> > How do you handle this use-case? Are you using grants in the VirtIO
> > ring, or rather allowing the driver domain to map all the guest memory
> > and then placing gfn on the ring like it's commonly done with VirtIO?
>
> Second option. Xen grants are not used at all as well as event channel and
> Xenbus. That allows us to have guest
>
> *unmodified* which one of the main goals. Yes, this may sound (or even
> sounds) non-secure, but backend which runs in driver domain is allowed to
> map all guest memory.

Supporting unmodified guests is certainly a fine goal, but I don't
think it's incompatible with also trying to expand the spec in
parallel in order to support grants in a negotiated way (see below).

That way you could (long term) regain some of the lost security.

> > Do you have any plans to try to upstream a modification to the VirtIO
> > spec so that grants (ie: abstract references to memory addresses) can
> > be used on the VirtIO ring?
>
> But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
> guest. Nothing to upsteam)

OK, so there's no intention to add grants (or a similar interface) to
the spec?

I understand that you want to support unmodified VirtIO frontends, but
I also think that long term frontends could negotiate with backends on
the usage of grants in the shared ring, like any other VirtIO feature
negotiated between the frontend and the backend.

This of course needs to be on the spec first before we can start
implementing it, and hence my question whether a modification to the
spec in order to add grants has been considered.

It's fine to say that you don't have any plans in this regard.

>
> >
> > > misc fixes for our use-cases and tool support for the
> > > configuration.
> > > Unfortunately, Julien doesn’t have much time to allocate on the work
> > > anymore,
> > > so we would like to step in and continue.
> > >
> > > *A few word about the Xen code:*
> > > You can find the whole Xen series at [5]. The patches are in RFC state
> > > because
> > > some actions in the series should be reconsidered and implemented properly.
> > > Before submitting the final code for the review the first IOREQ patch
> > > (which is quite
> > > big) will be split into x86, Arm and common parts. Please note, x86 part
> > > wasn’t
> > > even build-tested so far and could be broken with that series. Also the
> > > series probably
> > > wants splitting into adding IOREQ on Arm (should be focused first) and
> > > tools support
> > > for the virtio-disk (which is going to be the first Virtio driver)
> > > configuration before going
> > > into the mailing list.
> > Sending first a patch series to enable IOREQs on Arm seems perfectly
> > fine, and it doesn't have to come with the VirtIO backend. In fact I
> > would recommend that you send that ASAP, so that you don't spend time
> > working on the backend that would likely need to be modified
> > according to the review received on the IOREQ series.
>
> Completely agree with you, I will send it after splitting IOREQ patch and
> performing some cleanup.
>
> However, it is going to take some time to make it properly taking into the
> account
>
> that personally I won't be able to test on x86.

We have gitlab and the osstest CI loop (plus all the reviewers) so we
should be able to spot any regressions. Build testing on x86 would be
nice so that you don't need to resend to fix build issues.

>
> >
> > > What I would like to add here, the IOREQ feature on Arm could be used not
> > > only
> > > for implementing Virtio, but for other use-cases which require some
> > > emulator entity
> > > outside Xen such as custom PCI emulator (non-ECAM compatible) for example.
> > >
> > > *A few word about the backend(s):*
> > > One of the main problems with Virtio in Xen on Arm is the absence of
> > > “ready-to-use” and “out-of-Qemu” Virtio backends (I least am not aware of).
> > > We managed to create virtio-disk backend based on demu [3] and kvmtool [4]
> > > using
> > > that series. It is worth mentioning that although Xenbus/Xenstore is not
> > > supposed
> > > to be used with native Virtio, that interface was chosen to just pass
> > > configuration from toolstack
> > > to the backend and notify it about creating/destroying Guest domain (I
> > > think it is
> > I would prefer if a single instance was launched to handle each
> > backend, and that the configuration was passed on the command line.
> > Killing the user-space backend from the toolstack is fine I think,
> > there's no need to notify the backend using xenstore or any other
> > out-of-band methods.
> >
> > xenstore has proven to be a bottleneck in terms of performance, and it
> > would be better if we can avoid using it when possible, specially here
> > that you have to do this from scratch anyway.
>
> Let me elaborate a bit more on this.
>
> In current backend implementation, the Xenstore is *not* used for
> communication between backend (VirtIO device) and frontend (VirtIO driver),
> frontend knows nothing about it.
>
> Xenstore was chosen as an interface in order to be able to pass
> configuration from toolstack in Dom0 to backend which may reside in other
> than Dom0 domain (DomD in our case),

There's 'xl devd' which can be used on the driver domain to spawn
backends, maybe you could add the logic there so that 'xl devd' calls
the backend executable with the required command line parameters, so
that the backend itself doesn't need to interact with xenstore in any
way?

That way in the future we could use something else instead of
xenstore, like Argo for instance in order to pass the backend data
from the control domain to the driver domain.

> also looking into the Xenstore entries backend always knows when the
> intended guest is been created/destroyed.

xl devd should also do the killing of backends anyway when a domain is
destroyed, or else malfunctioning user-space backends could keep
running after the domain they are serving is destroyed.

> I may mistake, but I don't think we can avoid using Xenstore (or other
> interface provided by toolstack) for the several reasons.
>
> Besides a virtio-disk configuration (a disk to be assigned to the guest, R/O
> mode, etc), for each virtio-mmio device instance
>
> a pair (mmio range + IRQ) are allocated by toolstack at the guest
> construction time and inserted into virtio-mmio device tree node
>
> in the guest device tree. And for the backend to properly operate these
> variable parameters are also passed to the backend via Xenstore.

I think you could pass all these parameters as command line arguments
to the backend?

> The other reasons are:
>
> 1. Automation. With current backend implementation we don't need to pause
> guest right after creating it, then go to the driver domain and spawn
> backend and
>
> after that go back to the dom0 and unpause the guest.

xl devd should be capable of handling this for you on the driver
domain.

> 2. Ability to detect when guest with involved frontend has gone away and
> properly release resource (guest destroy/reboot).
>
> 3. Ability to (re)connect to the newly created guest with involved frontend
> (guest create/reboot).
>
> 4. What is more that having Xenstore support the backend is able to detect
> the dom_id it runs into and the guest dom_id, there is no need pass them via
> command line.
>
>
> I will be happy to explain in details after publishing backend code).

As I'm not the one doing the work I certainly won't stop you from
using xenstore on the backend. I would certainly prefer if the backend
gets all the information it needs from the command line so that the
configuration data is completely agnostic to the transport layer used
to convey it.

Thanks, Roger.
Re: Virtio in Xen on Arm (based on IOREQ concept)
On 20/07/2020 10:17, Roger Pau Monné wrote:
> On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
>> On 17.07.20 18:00, Roger Pau Monné wrote:
>>> On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
>>>> requires
>>>> some implementation to forward guest MMIO access to a device model. And as
>>>> it
>>>> turned out the Xen on x86 contains most of the pieces to be able to use that
>>>> transport (via existing IOREQ concept). Julien has already done a big amount
>>>> of work in his PoC (xen/arm: Add support for Guest IO forwarding to a
>>>> device emulator).
>>>> Using that code as a base we managed to create a completely functional PoC
>>>> with DomU
>>>> running on virtio block device instead of a traditional Xen PV driver
>>>> without
>>>> modifications to DomU Linux. Our work is mostly about rebasing Julien's
>>>> code on the actual
>>>> codebase (Xen 4.14-rc4), various tweeks to be able to run emulator
>>>> (virtio-disk backend)
>>>> in other than Dom0 domain (in our system we have thin Dom0 and keep all
>>>> backends
>>>> in driver domain),
>>> How do you handle this use-case? Are you using grants in the VirtIO
>>> ring, or rather allowing the driver domain to map all the guest memory
>>> and then placing gfn on the ring like it's commonly done with VirtIO?
>>
>> Second option. Xen grants are not used at all as well as event channel and
>> Xenbus. That allows us to have guest
>>
>> *unmodified* which one of the main goals. Yes, this may sound (or even
>> sounds) non-secure, but backend which runs in driver domain is allowed to
>> map all guest memory.
>
> Supporting unmodified guests is certainly a fine goal, but I don't
> think it's incompatible with also trying to expand the spec in
> parallel in order to support grants in a negotiated way (see below).
>
> That way you could (long term) regain some of the lost security.

FWIW, Xen is not the only hypervisor/community interested in creating
"less privileged" backends.

>
>>> Do you have any plans to try to upstream a modification to the VirtIO
>>> spec so that grants (ie: abstract references to memory addresses) can
>>> be used on the VirtIO ring?
>>
>> But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
>> guest. Nothing to upsteam)
>
> OK, so there's no intention to add grants (or a similar interface) to
> the spec?
>
> I understand that you want to support unmodified VirtIO frontends, but
> I also think that long term frontends could negotiate with backends on
> the usage of grants in the shared ring, like any other VirtIO feature
> negotiated between the frontend and the backend.
>
> This of course needs to be on the spec first before we can start
> implementing it, and hence my question whether a modification to the
> spec in order to add grants has been considered.
The problem is not really the specification but the adoption in the
ecosystem. A protocol based on grant tables would mostly only be used by
Xen, therefore:
- It may be difficult to convince a proprietary OS vendor to invest
resources in implementing the protocol
- It would be more difficult to move in/out of the Xen ecosystem.

Both may slow the adoption of Xen in some areas.

If one is interested in security, then it would be better to work with
the other interested parties. I think it would be possible to use a
virtual IOMMU for this purpose.

Cheers,

--
Julien Grall
Re: Virtio in Xen on Arm (based on IOREQ concept)
On Mon, Jul 20, 2020 at 10:40:40AM +0100, Julien Grall wrote:
>
>
> On 20/07/2020 10:17, Roger Pau Monné wrote:
> > On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
> > > On 17.07.20 18:00, Roger Pau Monné wrote:
> > > > On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
> > > > Do you have any plans to try to upstream a modification to the VirtIO
> > > > spec so that grants (ie: abstract references to memory addresses) can
> > > > be used on the VirtIO ring?
> > >
> > > But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
> > > guest. Nothing to upsteam)
> >
> > OK, so there's no intention to add grants (or a similar interface) to
> > the spec?
> >
> > I understand that you want to support unmodified VirtIO frontends, but
> > I also think that long term frontends could negotiate with backends on
> > the usage of grants in the shared ring, like any other VirtIO feature
> > negotiated between the frontend and the backend.
> >
> > This of course needs to be on the spec first before we can start
> > implementing it, and hence my question whether a modification to the
> > spec in order to add grants has been considered.
> The problem is not really the specification but the adoption in the
> ecosystem. A protocol based on grant-tables would mostly only be used by Xen
> therefore:
> - It may be difficult to convince a proprietary OS vendor to invest
> resource on implementing the protocol
> - It would be more difficult to move in/out of Xen ecosystem.
>
> Both, may slow the adoption of Xen in some areas.

Right, just to be clear my suggestion wasn't to force the usage of
grants, but to ask whether adding something along these lines was on the
roadmap, see below.

> If one is interested in security, then it would be better to work with the
> other interested parties. I think it would be possible to use a virtual
> IOMMU for this purpose.

Yes, I've also heard rumors about using the (I assume VirtIO) IOMMU in
order to protect what backends can map. This seems like a fine idea,
and would allow us to gain the lost security without having to do the
whole work ourselves.

Do you know if there's anything published about this? I'm curious
about how and where in the system the VirtIO IOMMU is/should be
implemented.

Thanks, Roger.
Re: Virtio in Xen on Arm (based on IOREQ concept)
On 20.07.20 12:17, Roger Pau Monné wrote:

Hello Roger

> On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
>> On 17.07.20 18:00, Roger Pau Monné wrote:
>>> On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
>>>> requires
>>>> some implementation to forward guest MMIO access to a device model. And as
>>>> it
>>>> turned out the Xen on x86 contains most of the pieces to be able to use that
>>>> transport (via existing IOREQ concept). Julien has already done a big amount
>>>> of work in his PoC (xen/arm: Add support for Guest IO forwarding to a
>>>> device emulator).
>>>> Using that code as a base we managed to create a completely functional PoC
>>>> with DomU
>>>> running on virtio block device instead of a traditional Xen PV driver
>>>> without
>>>> modifications to DomU Linux. Our work is mostly about rebasing Julien's
>>>> code on the actual
>>>> codebase (Xen 4.14-rc4), various tweeks to be able to run emulator
>>>> (virtio-disk backend)
>>>> in other than Dom0 domain (in our system we have thin Dom0 and keep all
>>>> backends
>>>> in driver domain),
>>> How do you handle this use-case? Are you using grants in the VirtIO
>>> ring, or rather allowing the driver domain to map all the guest memory
>>> and then placing gfn on the ring like it's commonly done with VirtIO?
>> Second option. Xen grants are not used at all as well as event channel and
>> Xenbus. That allows us to have guest
>>
>> *unmodified* which one of the main goals. Yes, this may sound (or even
>> sounds) non-secure, but backend which runs in driver domain is allowed to
>> map all guest memory.
> Supporting unmodified guests is certainly a fine goal, but I don't
> think it's incompatible with also trying to expand the spec in
> parallel in order to support grants in a negotiated way (see below).
>
> That way you could (long term) regain some of the lost security.
>
>>> Do you have any plans to try to upstream a modification to the VirtIO
>>> spec so that grants (ie: abstract references to memory addresses) can
>>> be used on the VirtIO ring?
>> But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
>> guest. Nothing to upsteam)
> OK, so there's no intention to add grants (or a similar interface) to
> the spec?
>
> I understand that you want to support unmodified VirtIO frontends, but
> I also think that long term frontends could negotiate with backends on
> the usage of grants in the shared ring, like any other VirtIO feature
> negotiated between the frontend and the backend.
>
> This of course needs to be on the spec first before we can start
> implementing it, and hence my question whether a modification to the
> spec in order to add grants has been considered.
>
> It's fine to say that you don't have any plans in this regard.
Adding grants (or a similar interface) to the spec hasn't been
considered so far.

But I understand and completely agree that some solution should be found
in order not to reduce security.


>>>> misc fixes for our use-cases and tool support for the
>>>> configuration.
>>>> Unfortunately, Julien doesn’t have much time to allocate on the work
>>>> anymore,
>>>> so we would like to step in and continue.
>>>>
>>>> *A few word about the Xen code:*
>>>> You can find the whole Xen series at [5]. The patches are in RFC state
>>>> because
>>>> some actions in the series should be reconsidered and implemented properly.
>>>> Before submitting the final code for the review the first IOREQ patch
>>>> (which is quite
>>>> big) will be split into x86, Arm and common parts. Please note, x86 part
>>>> wasn’t
>>>> even build-tested so far and could be broken with that series. Also the
>>>> series probably
>>>> wants splitting into adding IOREQ on Arm (should be focused first) and
>>>> tools support
>>>> for the virtio-disk (which is going to be the first Virtio driver)
>>>> configuration before going
>>>> into the mailing list.
>>> Sending first a patch series to enable IOREQs on Arm seems perfectly
>>> fine, and it doesn't have to come with the VirtIO backend. In fact I
>>> would recommend that you send that ASAP, so that you don't spend time
>>> working on the backend that would likely need to be modified
>>> according to the review received on the IOREQ series.
>> Completely agree with you, I will send it after splitting IOREQ patch and
>> performing some cleanup.
>>
>> However, it is going to take some time to make it properly taking into the
>> account
>>
>> that personally I won't be able to test on x86.
> We have gitlab and the osstest CI loop (plus all the reviewers) so we
> should be able to spot any regressions. Build testing on x86 would be
> nice so that you don't need to resend to fix build issues.

Of course, before sending the series to the ML I will definitely perform
a build test on x86.


>>>> What I would like to add here, the IOREQ feature on Arm could be used not
>>>> only
>>>> for implementing Virtio, but for other use-cases which require some
>>>> emulator entity
>>>> outside Xen such as custom PCI emulator (non-ECAM compatible) for example.
>>>>
>>>> *A few word about the backend(s):*
>>>> One of the main problems with Virtio in Xen on Arm is the absence of
>>>> “ready-to-use” and “out-of-Qemu” Virtio backends (I least am not aware of).
>>>> We managed to create virtio-disk backend based on demu [3] and kvmtool [4]
>>>> using
>>>> that series. It is worth mentioning that although Xenbus/Xenstore is not
>>>> supposed
>>>> to be used with native Virtio, that interface was chosen to just pass
>>>> configuration from toolstack
>>>> to the backend and notify it about creating/destroying Guest domain (I
>>>> think it is
>>> I would prefer if a single instance was launched to handle each
>>> backend, and that the configuration was passed on the command line.
>>> Killing the user-space backend from the toolstack is fine I think,
>>> there's no need to notify the backend using xenstore or any other
>>> out-of-band methods.
>>>
>>> xenstore has proven to be a bottleneck in terms of performance, and it
>>> would be better if we can avoid using it when possible, specially here
>>> that you have to do this from scratch anyway.
>> Let me elaborate a bit more on this.
>>
>> In current backend implementation, the Xenstore is *not* used for
>> communication between backend (VirtIO device) and frontend (VirtIO driver),
>> frontend knows nothing about it.
>>
>> Xenstore was chosen as an interface in order to be able to pass
>> configuration from toolstack in Dom0 to backend which may reside in other
>> than Dom0 domain (DomD in our case),
> There's 'xl devd' which can be used on the driver domain to spawn
> backends, maybe you could add the logic there so that 'xl devd' calls
> the backend executable with the required command line parameters, so
> that the backend itself doesn't need to interact with xenstore in any
> way?
>
> That way in the future we could use something else instead of
> xenstore, like Argo for instance in order to pass the backend data
> from the control domain to the driver domain.
>
>> also looking into the Xenstore entries backend always knows when the
>> intended guest is been created/destroyed.
> xl devd should also do the killing of backends anyway when a domain is
> destroyed, or else malfunctioning user-space backends could keep
> running after the domain they are serving is destroyed.
>
>> I may mistake, but I don't think we can avoid using Xenstore (or other
>> interface provided by toolstack) for the several reasons.
>>
>> Besides a virtio-disk configuration (a disk to be assigned to the guest, R/O
>> mode, etc), for each virtio-mmio device instance
>>
>> a pair (mmio range + IRQ) are allocated by toolstack at the guest
>> construction time and inserted into virtio-mmio device tree node
>>
>> in the guest device tree. And for the backend to properly operate these
>> variable parameters are also passed to the backend via Xenstore.
> I think you could pass all these parameters as command line arguments
> to the backend?
>
>> The other reasons are:
>>
>> 1. Automation. With current backend implementation we don't need to pause
>> guest right after creating it, then go to the driver domain and spawn
>> backend and
>>
>> after that go back to the dom0 and unpause the guest.
> xl devd should be capable of handling this for you on the driver
> domain.
>
>> 2. Ability to detect when guest with involved frontend has gone away and
>> properly release resource (guest destroy/reboot).
>>
>> 3. Ability to (re)connect to the newly created guest with involved frontend
>> (guest create/reboot).
>>
>> 4. What is more that having Xenstore support the backend is able to detect
>> the dom_id it runs into and the guest dom_id, there is no need pass them via
>> command line.
>>
>>
>> I will be happy to explain in details after publishing backend code).
> As I'm not the one doing the work I certainly won't stop you from
> using xenstore on the backend. I would certainly prefer if the backend
> gets all the information it needs from the command line so that the
> configuration data is completely agnostic to the transport layer used
> to convey it.
>
> Thanks, Roger.

Thank you for pointing out another possible way. I feel I need to
investigate what "xl devd" (+ Argo?) is and how it works. If it is able
to provide the backend with the support/information it needs, and
Xenstore is not welcome, then I would be absolutely OK with considering
another solution.

I propose to get back to that discussion after I prepare and send out the
proper IOREQ series.


--
Regards,

Oleksandr Tyshchenko
Re: Virtio in Xen on Arm (based on IOREQ concept)
On Mon, Jul 20, 2020 at 01:56:51PM +0300, Oleksandr wrote:
> On 20.07.20 12:17, Roger Pau Monné wrote:
> > On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
> > > On 17.07.20 18:00, Roger Pau Monné wrote:
> > > > On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
> > > The other reasons are:
> > >
> > > 1. Automation. With current backend implementation we don't need to pause
> > > guest right after creating it, then go to the driver domain and spawn
> > > backend and
> > >
> > > after that go back to the dom0 and unpause the guest.
> > xl devd should be capable of handling this for you on the driver
> > domain.
> >
> > > 2. Ability to detect when guest with involved frontend has gone away and
> > > properly release resource (guest destroy/reboot).
> > >
> > > 3. Ability to (re)connect to the newly created guest with involved frontend
> > > (guest create/reboot).
> > >
> > > 4. What is more that having Xenstore support the backend is able to detect
> > > the dom_id it runs into and the guest dom_id, there is no need pass them via
> > > command line.
> > >
> > >
> > > I will be happy to explain in details after publishing backend code).
> > As I'm not the one doing the work I certainly won't stop you from
> > using xenstore on the backend. I would certainly prefer if the backend
> > gets all the information it needs from the command line so that the
> > configuration data is completely agnostic to the transport layer used
> > to convey it.
> >
> > Thanks, Roger.
>
> Thank you for pointing another possible way. I feel I need to investigate
> what is the "xl devd" (+ Argo?) and how it works. If it is able to provide
> backend with

That's what x86 at least uses to manage backends on driver domains: xl
devd will for example launch the QEMU instance required to handle a
Xen PV disk backend in user-space.

Note that there's currently no support for Argo or any communication
channel different than xenstore, but I think it would be cleaner to
place the fetching of data from xenstore in xl devd and just pass
those as command line arguments to the VirtIO backend if possible. I
would prefer the VirtIO backend to be fully decoupled from xenstore.

Note that for a backend running on dom0 there would be no need to
pass any data on xenstore, as the backend would be launched directly
from xl with the appropriate command line arguments.

> the support/information it needs and xenstore is not welcome then I would be
> absolutely ok to consider using other solution.
>
> I propose to get back to that discussion after I prepare and send out the
> proper IOREQ series.

Sure, that's fine.

Roger.
Re: Virtio in Xen on Arm (based on IOREQ concept)
On Mon, 20 Jul 2020, Roger Pau Monné wrote:
> On Mon, Jul 20, 2020 at 10:40:40AM +0100, Julien Grall wrote:
> >
> >
> > On 20/07/2020 10:17, Roger Pau Monné wrote:
> > > On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
> > > > On 17.07.20 18:00, Roger Pau Monné wrote:
> > > > > On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
> > > > > Do you have any plans to try to upstream a modification to the VirtIO
> > > > > spec so that grants (ie: abstract references to memory addresses) can
> > > > > be used on the VirtIO ring?
> > > >
> > > > But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
> > > > guest. Nothing to upsteam)
> > >
> > > OK, so there's no intention to add grants (or a similar interface) to
> > > the spec?
> > >
> > > I understand that you want to support unmodified VirtIO frontends, but
> > > I also think that long term frontends could negotiate with backends on
> > > the usage of grants in the shared ring, like any other VirtIO feature
> > > negotiated between the frontend and the backend.
> > >
> > > This of course needs to be on the spec first before we can start
> > > implementing it, and hence my question whether a modification to the
> > > spec in order to add grants has been considered.
> > The problem is not really the specification but the adoption in the
> > ecosystem. A protocol based on grant-tables would mostly only be used by Xen
> > therefore:
> > - It may be difficult to convince a proprietary OS vendor to invest
> > resource on implementing the protocol
> > - It would be more difficult to move in/out of Xen ecosystem.
> >
> > Both, may slow the adoption of Xen in some areas.
>
> Right, just to be clear my suggestion wasn't to force the usage of
> grants, but whether adding something along this lines was in the
> roadmap, see below.
>
> > If one is interested in security, then it would be better to work with the
> > other interested parties. I think it would be possible to use a virtual
> > IOMMU for this purpose.
>
> Yes, I've also heard rumors about using the (I assume VirtIO) IOMMU in
> order to protect what backends can map. This seems like a fine idea,
> and would allow us to gain the lost security without having to do the
> whole work ourselves.
>
> Do you know if there's anything published about this? I'm curious
> about how and where in the system the VirtIO IOMMU is/should be
> implemented.

Not yet (as far as I know), but we have just started some discussions on
this topic within Linaro.


You should also be aware that there is another proposal based on
pre-shared-memory and memcpys to solve the virtio security issue:

https://marc.info/?l=linux-kernel&m=158807398403549

It would be certainly slower than the "virtio IOMMU" solution but it
would take far less time to develop and could work as a short-term
stop-gap. (In my view the "virtio IOMMU" is the only clean solution
to the problem long term.)
Re: Virtio in Xen on Arm (based on IOREQ concept)
On Fri, 17 Jul 2020, Oleksandr wrote:
> > > *A few word about solution:*
> > > As it was mentioned at [1], in order to implement virtio-mmio Xen on Arm
> > Any plans for virtio-pci? Arm seems to be moving to the PCI bus, and
> > it would be very interesting from a x86 PoV, as I don't think
> > virtio-mmio is something that you can easily use on x86 (or even use
> > at all).
>
> Being honest I didn't consider virtio-pci so far. Julien's PoC (we are based
> on) provides support for the virtio-mmio transport
>
> which is enough to start working around VirtIO and is not as complex as
> virtio-pci. But it doesn't mean there is no way for virtio-pci in Xen.
>
> I think, this could be added in next steps. But the nearest target is
> virtio-mmio approach (of course if the community agrees on that).

Hi Julien, Oleksandr,

Aside from complexity and ease of development, are there any other
architectural reasons for using virtio-mmio?

I am not asking because I intend to suggest doing something different
(virtio-mmio is fine as far as I can tell). I am asking because there was
a virtio-pci/virtio-mmio discussion in Linaro recently and I would like
to understand if there are any implications from a Xen point of view that
I don't yet know about.

For instance, what's your take on notifications with virtio-mmio? How
are they modelled today? Are they good enough or do we need MSIs?
Re: Virtio in Xen on Arm (based on IOREQ concept)
On Mon, 20 Jul 2020, Roger Pau Monné wrote:
> On Mon, Jul 20, 2020 at 01:56:51PM +0300, Oleksandr wrote:
> > On 20.07.20 12:17, Roger Pau Monné wrote:
> > > On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
> > > > On 17.07.20 18:00, Roger Pau Monné wrote:
> > > > > On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
> > > > The other reasons are:
> > > >
> > > > 1. Automation. With current backend implementation we don't need to pause
> > > > guest right after creating it, then go to the driver domain and spawn
> > > > backend and
> > > >
> > > > after that go back to the dom0 and unpause the guest.
> > > xl devd should be capable of handling this for you on the driver
> > > domain.
> > >
> > > > 2. Ability to detect when guest with involved frontend has gone away and
> > > > properly release resource (guest destroy/reboot).
> > > >
> > > > 3. Ability to (re)connect to the newly created guest with involved frontend
> > > > (guest create/reboot).
> > > >
> > > > 4. What is more that having Xenstore support the backend is able to detect
> > > > the dom_id it runs into and the guest dom_id, there is no need pass them via
> > > > command line.
> > > >
> > > >
> > > > I will be happy to explain in details after publishing backend code).
> > > As I'm not the one doing the work I certainly won't stop you from
> > > using xenstore on the backend. I would certainly prefer if the backend
> > > gets all the information it needs from the command line so that the
> > > configuration data is completely agnostic to the transport layer used
> > > to convey it.
> > >
> > > Thanks, Roger.
> >
> > Thank you for pointing another possible way. I feel I need to investigate
> > what is the "xl devd" (+ Argo?) and how it works. If it is able to provide
> > backend with
>
> That's what x86 at least uses to manage backends on driver domains: xl
> devd will for example launch the QEMU instance required to handle a
> Xen PV disk backend in user-space.
>
> Note that there's currently no support for Argo or any communication
> channel different than xenstore, but I think it would be cleaner to
> place the fetching of data from xenstore in xl devd and just pass
> those as command line arguments to the VirtIO backend if possible. I
> would prefer the VirtIO backend to be fully decoupled from xenstore.
>
> Note that for a backend running on dom0 there would be no need to
> pass any data on xenstore, as the backend would be launched directly
> from xl with the appropriate command line arguments.

If I can paraphrase Roger's point, I think we all agree that xenstore is
very convenient to use and great to get something up and running
quickly. But it has several limitations, so it would be fantastic if we
could kill two birds with one stone and find a way to deploy the system
without xenstore, given that with virtio it is not actually needed except
for very limited initial configuration. It would certainly be a big
win. However, it is fair to say that the xenstore alternative, whatever
that might be, needs work.
Re: Virtio in Xen on Arm (based on IOREQ concept)
On 20.07.20 23:38, Stefano Stabellini wrote:

Hello Stefano

> On Fri, 17 Jul 2020, Oleksandr wrote:
>>>> *A few word about solution:*
>>>> As it was mentioned at [1], in order to implement virtio-mmio Xen on Arm
>>> Any plans for virtio-pci? Arm seems to be moving to the PCI bus, and
>>> it would be very interesting from a x86 PoV, as I don't think
>>> virtio-mmio is something that you can easily use on x86 (or even use
>>> at all).
>> Being honest I didn't consider virtio-pci so far. Julien's PoC (we are based
>> on) provides support for the virtio-mmio transport
>>
>> which is enough to start working around VirtIO and is not as complex as
>> virtio-pci. But it doesn't mean there is no way for virtio-pci in Xen.
>>
>> I think, this could be added in next steps. But the nearest target is
>> virtio-mmio approach (of course if the community agrees on that).
> Hi Julien, Oleksandr,
>
> Aside from complexity and easy-of-development, are there any other
> architectural reasons for using virtio-mmio?
>
> I am not asking because I intend to suggest to do something different
> (virtio-mmio is fine as far as I can tell.) I am asking because recently
> there was a virtio-pci/virtio-mmio discussion recently in Linaro and I
> would like to understand if there are any implications from a Xen point
> of view that I don't yet know.
Unfortunately, I can't say anything regarding virtio-pci/MSI. Could
virtio-pci work in a virtual environment without PCI support (as on
various embedded platforms)?

It feels to me that both transports (the easy and lightweight virtio-mmio
and the complex and powerful virtio-pci) will have their consumer demand
and are worth implementing in Xen.


>
> For instance, what's your take on notifications with virtio-mmio? How
> are they modelled today? Are they good enough or do we need MSIs?

Notifications are sent from the device (backend) to the driver (frontend)
using interrupts. An additional DM function,
xendevicemodel_set_irq_level(), was introduced for that purpose; it
results in a vgic_inject_irq() call.

Currently, if the device wants to notify the driver, it should trigger
the interrupt by calling that function twice (high level first, then low
level), as sketched below.
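
A minimal sketch of that notification path from the backend's point of
view; xendevicemodel_set_irq_level() is the call added by the series, and
the domid/IRQ values are placeholders that would come from the guest
configuration.

    /*
     * Sketch only: assert and then deassert the guest SPI so the frontend
     * sees an edge. The dmod handle would come from
     * xendevicemodel_open(NULL, 0); domid and virq are placeholders.
     */
    #include <stdint.h>
    #include <xendevicemodel.h>

    static void notify_guest(xendevicemodel_handle *dmod, domid_t domid,
                             uint32_t virq)
    {
        xendevicemodel_set_irq_level(dmod, domid, virq, 1); /* raise */
        xendevicemodel_set_irq_level(dmod, domid, virq, 0); /* lower */
    }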


--
Regards,

Oleksandr Tyshchenko
Re: Virtio in Xen on Arm (based on IOREQ concept)
Hi Stefano,

On 20/07/2020 21:37, Stefano Stabellini wrote:
> On Mon, 20 Jul 2020, Roger Pau Monné wrote:
>> On Mon, Jul 20, 2020 at 10:40:40AM +0100, Julien Grall wrote:
>>>
>>>
>>> On 20/07/2020 10:17, Roger Pau Monné wrote:
>>>> On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
>>>>> On 17.07.20 18:00, Roger Pau Monné wrote:
>>>>>> On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
>>>>>> Do you have any plans to try to upstream a modification to the VirtIO
>>>>>> spec so that grants (ie: abstract references to memory addresses) can
>>>>>> be used on the VirtIO ring?
>>>>>
>>>>> But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
>>>>> guest. Nothing to upsteam)
>>>>
>>>> OK, so there's no intention to add grants (or a similar interface) to
>>>> the spec?
>>>>
>>>> I understand that you want to support unmodified VirtIO frontends, but
>>>> I also think that long term frontends could negotiate with backends on
>>>> the usage of grants in the shared ring, like any other VirtIO feature
>>>> negotiated between the frontend and the backend.
>>>>
>>>> This of course needs to be on the spec first before we can start
>>>> implementing it, and hence my question whether a modification to the
>>>> spec in order to add grants has been considered.
>>> The problem is not really the specification but the adoption in the
>>> ecosystem. A protocol based on grant-tables would mostly only be used by Xen
>>> therefore:
>>> - It may be difficult to convince a proprietary OS vendor to invest
>>> resource on implementing the protocol
>>> - It would be more difficult to move in/out of Xen ecosystem.
>>>
>>> Both, may slow the adoption of Xen in some areas.
>>
>> Right, just to be clear my suggestion wasn't to force the usage of
>> grants, but whether adding something along this lines was in the
>> roadmap, see below.
>>
>>> If one is interested in security, then it would be better to work with the
>>> other interested parties. I think it would be possible to use a virtual
>>> IOMMU for this purpose.
>>
>> Yes, I've also heard rumors about using the (I assume VirtIO) IOMMU in
>> order to protect what backends can map. This seems like a fine idea,
>> and would allow us to gain the lost security without having to do the
>> whole work ourselves.
>>
>> Do you know if there's anything published about this? I'm curious
>> about how and where in the system the VirtIO IOMMU is/should be
>> implemented.
>
> Not yet (as far as I know), but we have just started some discussons on
> this topic within Linaro.
>
>
> You should also be aware that there is another proposal based on
> pre-shared-memory and memcpys to solve the virtio security issue:
>
> https://marc.info/?l=linux-kernel&m=158807398403549
>
> It would be certainly slower than the "virtio IOMMU" solution but it
> would take far less time to develop and could work as a short-term
> stop-gap.

I don't think I agree with this blanket statement. In the case of
"virtio IOMMU", you would potentially need to map/unmap pages on every
request, which would result in a lot of back and forth to the
hypervisor.

So it may turn out that pre-shared memory is faster on some setups.

Cheers,

--
Julien Grall
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
On 20.07.20 23:40, Stefano Stabellini wrote:

Hello Stefano

> On Mon, 20 Jul 2020, Roger Pau Monné wrote:
>> On Mon, Jul 20, 2020 at 01:56:51PM +0300, Oleksandr wrote:
>>> On 20.07.20 12:17, Roger Pau Monné wrote:
>>>> On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
>>>>> On 17.07.20 18:00, Roger Pau Monné wrote:
>>>>>> On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
>>>>> The other reasons are:
>>>>>
>>>>> 1. Automation. With current backend implementation we don't need to pause
>>>>> guest right after creating it, then go to the driver domain and spawn
>>>>> backend and
>>>>>
>>>>> after that go back to the dom0 and unpause the guest.
>>>> xl devd should be capable of handling this for you on the driver
>>>> domain.
>>>>
>>>>> 2. Ability to detect when guest with involved frontend has gone away and
>>>>> properly release resource (guest destroy/reboot).
>>>>>
>>>>> 3. Ability to (re)connect to the newly created guest with involved frontend
>>>>> (guest create/reboot).
>>>>>
>>>>> 4. What is more that having Xenstore support the backend is able to detect
>>>>> the dom_id it runs into and the guest dom_id, there is no need pass them via
>>>>> command line.
>>>>>
>>>>>
>>>>> I will be happy to explain in details after publishing backend code).
>>>> As I'm not the one doing the work I certainly won't stop you from
>>>> using xenstore on the backend. I would certainly prefer if the backend
>>>> gets all the information it needs from the command line so that the
>>>> configuration data is completely agnostic to the transport layer used
>>>> to convey it.
>>>>
>>>> Thanks, Roger.
>>> Thank you for pointing another possible way. I feel I need to investigate
>>> what is the "xl devd" (+ Argo?) and how it works. If it is able to provide
>>> backend with
>> That's what x86 at least uses to manage backends on driver domains: xl
>> devd will for example launch the QEMU instance required to handle a
>> Xen PV disk backend in user-space.
>>
>> Note that there's currently no support for Argo or any communication
>> channel different than xenstore, but I think it would be cleaner to
>> place the fetching of data from xenstore in xl devd and just pass
>> those as command line arguments to the VirtIO backend if possible. I
>> would prefer the VirtIO backend to be fully decoupled from xenstore.
>>
>> Note that for a backend running on dom0 there would be no need to
>> pass any data on xenstore, as the backend would be launched directly
>> from xl with the appropriate command line arguments.
> If I can paraphrase Roger's point, I think we all agree that xenstore is
> very convenient to use and great to get something up and running
> quickly. But it has several limitations, so it would be fantastic if we
> could kill two birds with one stone and find a way to deploy the system
> without xenstore, given that with virtio it is not actually needed if not
> for very limited initial configurations. It would certainly be a big
> win. However, it is fair to say that the xenstore alternative, whatever
> that might be, needs work.

Well, why not, actually?

For example, the idea "to place the fetching of data from xenstore in xl
devd and just pass those as command line arguments to the VirtIO backend
if possible" sounds fine to me. But this needs additional investigation.

--
Regards,

Oleksandr Tyshchenko
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
(+ Andre for the vGIC).

Hi Stefano,

On 20/07/2020 21:38, Stefano Stabellini wrote:
> On Fri, 17 Jul 2020, Oleksandr wrote:
>>>> *A few word about solution:*
>>>> As it was mentioned at [1], in order to implement virtio-mmio Xen on Arm
>>> Any plans for virtio-pci? Arm seems to be moving to the PCI bus, and
>>> it would be very interesting from a x86 PoV, as I don't think
>>> virtio-mmio is something that you can easily use on x86 (or even use
>>> at all).
>>
>> Being honest I didn't consider virtio-pci so far. Julien's PoC (we are based
>> on) provides support for the virtio-mmio transport
>>
>> which is enough to start working around VirtIO and is not as complex as
>> virtio-pci. But it doesn't mean there is no way for virtio-pci in Xen.
>>
>> I think, this could be added in next steps. But the nearest target is
>> virtio-mmio approach (of course if the community agrees on that).

> Aside from complexity and easy-of-development, are there any other
> architectural reasons for using virtio-mmio?

From the hypervisor PoV, the main/only difference between virtio-mmio
and virtio-pci is that with the latter we need to forward PCI config
space accesses to the device emulator. IOW, we would need to add support
for vPCI. This shouldn't require much more work, but I didn't want to
invest in it for the PoC.

Long term, I don't think we should tie Xen to any particular virtio
protocol. We just need to offer facilities so users can easily build
virtio backends for Xen.

>
> I am not asking because I intend to suggest to do something different
> (virtio-mmio is fine as far as I can tell.) I am asking because recently
> there was a virtio-pci/virtio-mmio discussion recently in Linaro and I
> would like to understand if there are any implications from a Xen point
> of view that I don't yet know.

virtio-mmio is going to require more work in the toolstack because we
would need to do the memory/interrupt allocation ourselves. In the case
of virtio-pci, we only need to pass a range of memory/interrupts to the
guest and let it decide the allocation.

Regarding virtio-pci vs virtio-mmio:
   - flexibility: virtio-mmio is a good fit when you know all your
devices at boot. If you want to hotplug disks/network devices, then
virtio-pci is going to be a better fit.
   - interrupts: I would expect each virtio-mmio device to have its own
SPI. In the case of virtio-pci, legacy interrupts would be shared
between all the PCI devices on the same host controller. This could
possibly lead to performance issues if you have many devices, so for
virtio-pci we should consider MSIs.

>
> For instance, what's your take on notifications with virtio-mmio? How
> are they modelled today?

The backend will notify the frontend using an SPI. The other way around
(frontend -> backend) is based on an MMIO write.
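
As a rough illustration of the two directions (the register macro is
from the Linux virtio-mmio driver; the Xen-side helper is just a
placeholder for the set-irq-level hypercall discussed elsewhere in the
thread):

    /* Frontend -> backend: the guest driver kicks the device with an
     * MMIO write, which Xen forwards to the emulator as an IOREQ. */
    writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);

    /* Backend -> frontend: the emulator asks Xen to assert the
     * device's SPI (placeholder helper wrapping the hypercall). */
    inject_device_spi(guest_domid, device_spi);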

We have an interface to allow the backend to control the interrupt
level (i.e. low or high). However, the "old" vGIC doesn't handle level
interrupts properly, so we would end up treating level interrupts as
edge.

Technically, the problem already exists with HW interrupts, but the HW
should fire the interrupt again if the line is still asserted. Another
issue is that an interrupt may fire even if the interrupt line was
deasserted (IIRC this caused some interesting problems with the Arch
timer).

I am a bit concerned that the issue will be more prominent for virtual
interrupts. I know that we have some gross hack in the vpl011 to handle
level interrupts. So maybe it is time to switch to the new vGIC?

> Are they good enough or do we need MSIs?

I am not sure whether virtio-mmio supports MSIs. However, for
virtio-pci, MSIs are going to be useful to improve performance. This may
mean exposing an ITS, so we would need to add guest support for it.

Cheers,

--
Julien Grall
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
On Tue, Jul 21, 2020 at 01:31:48PM +0100, Julien Grall wrote:
> Hi Stefano,
>
> On 20/07/2020 21:37, Stefano Stabellini wrote:
> > On Mon, 20 Jul 2020, Roger Pau Monné wrote:
> > > On Mon, Jul 20, 2020 at 10:40:40AM +0100, Julien Grall wrote:
> > > >
> > > >
> > > > On 20/07/2020 10:17, Roger Pau Monné wrote:
> > > > > On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
> > > > > > On 17.07.20 18:00, Roger Pau Monné wrote:
> > > > > > > On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
> > > > > > > Do you have any plans to try to upstream a modification to the VirtIO
> > > > > > > spec so that grants (ie: abstract references to memory addresses) can
> > > > > > > be used on the VirtIO ring?
> > > > > >
> > > > > > But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
> > > > > > guest. Nothing to upsteam)
> > > > >
> > > > > OK, so there's no intention to add grants (or a similar interface) to
> > > > > the spec?
> > > > >
> > > > > I understand that you want to support unmodified VirtIO frontends, but
> > > > > I also think that long term frontends could negotiate with backends on
> > > > > the usage of grants in the shared ring, like any other VirtIO feature
> > > > > negotiated between the frontend and the backend.
> > > > >
> > > > > This of course needs to be on the spec first before we can start
> > > > > implementing it, and hence my question whether a modification to the
> > > > > spec in order to add grants has been considered.
> > > > The problem is not really the specification but the adoption in the
> > > > ecosystem. A protocol based on grant-tables would mostly only be used by Xen
> > > > therefore:
> > > > - It may be difficult to convince a proprietary OS vendor to invest
> > > > resource on implementing the protocol
> > > > - It would be more difficult to move in/out of Xen ecosystem.
> > > >
> > > > Both, may slow the adoption of Xen in some areas.
> > >
> > > Right, just to be clear my suggestion wasn't to force the usage of
> > > grants, but whether adding something along this lines was in the
> > > roadmap, see below.
> > >
> > > > If one is interested in security, then it would be better to work with the
> > > > other interested parties. I think it would be possible to use a virtual
> > > > IOMMU for this purpose.
> > >
> > > Yes, I've also heard rumors about using the (I assume VirtIO) IOMMU in
> > > order to protect what backends can map. This seems like a fine idea,
> > > and would allow us to gain the lost security without having to do the
> > > whole work ourselves.
> > >
> > > Do you know if there's anything published about this? I'm curious
> > > about how and where in the system the VirtIO IOMMU is/should be
> > > implemented.
> >
> > Not yet (as far as I know), but we have just started some discussons on
> > this topic within Linaro.
> >
> >
> > You should also be aware that there is another proposal based on
> > pre-shared-memory and memcpys to solve the virtio security issue:
> >
> > https://marc.info/?l=linux-kernel&m=158807398403549
> >
> > It would be certainly slower than the "virtio IOMMU" solution but it
> > would take far less time to develop and could work as a short-term
> > stop-gap.
>
> I don't think I agree with this blank statement. In the case of "virtio
> IOMMU", you would need to potentially map/unmap pages every request which
> would result to a lot of back and forth to the hypervisor.
>
> So it may turn out that pre-shared-memory may be faster on some setup.

AFAICT you could achieve the same with an IOMMU: pre-share (ie: add to
the device IOMMU page tables) a bunch of pages and keep bouncing data
to/from them in order to interact with the device, that way you could
avoid the map and unmaps (and is effectively how persistent grants
work in the blkif protocol).

The thread referenced by Stefano seems to point out this shared memory
model is targeted for very limited hypervisors that don't have the
capacity to trap, decode and emulate accesses to memory?

I certainly don't know much about it.

Roger.
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
Hi Roger,

On 21/07/2020 14:25, Roger Pau Monné wrote:
> On Tue, Jul 21, 2020 at 01:31:48PM +0100, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 20/07/2020 21:37, Stefano Stabellini wrote:
>>> On Mon, 20 Jul 2020, Roger Pau Monné wrote:
>>>> On Mon, Jul 20, 2020 at 10:40:40AM +0100, Julien Grall wrote:
>>>>>
>>>>>
>>>>> On 20/07/2020 10:17, Roger Pau Monné wrote:
>>>>>> On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
>>>>>>> On 17.07.20 18:00, Roger Pau Monné wrote:
>>>>>>>> On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
>>>>>>>> Do you have any plans to try to upstream a modification to the VirtIO
>>>>>>>> spec so that grants (ie: abstract references to memory addresses) can
>>>>>>>> be used on the VirtIO ring?
>>>>>>>
>>>>>>> But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
>>>>>>> guest. Nothing to upsteam)
>>>>>>
>>>>>> OK, so there's no intention to add grants (or a similar interface) to
>>>>>> the spec?
>>>>>>
>>>>>> I understand that you want to support unmodified VirtIO frontends, but
>>>>>> I also think that long term frontends could negotiate with backends on
>>>>>> the usage of grants in the shared ring, like any other VirtIO feature
>>>>>> negotiated between the frontend and the backend.
>>>>>>
>>>>>> This of course needs to be on the spec first before we can start
>>>>>> implementing it, and hence my question whether a modification to the
>>>>>> spec in order to add grants has been considered.
>>>>> The problem is not really the specification but the adoption in the
>>>>> ecosystem. A protocol based on grant-tables would mostly only be used by Xen
>>>>> therefore:
>>>>> - It may be difficult to convince a proprietary OS vendor to invest
>>>>> resource on implementing the protocol
>>>>> - It would be more difficult to move in/out of Xen ecosystem.
>>>>>
>>>>> Both, may slow the adoption of Xen in some areas.
>>>>
>>>> Right, just to be clear my suggestion wasn't to force the usage of
>>>> grants, but whether adding something along this lines was in the
>>>> roadmap, see below.
>>>>
>>>>> If one is interested in security, then it would be better to work with the
>>>>> other interested parties. I think it would be possible to use a virtual
>>>>> IOMMU for this purpose.
>>>>
>>>> Yes, I've also heard rumors about using the (I assume VirtIO) IOMMU in
>>>> order to protect what backends can map. This seems like a fine idea,
>>>> and would allow us to gain the lost security without having to do the
>>>> whole work ourselves.
>>>>
>>>> Do you know if there's anything published about this? I'm curious
>>>> about how and where in the system the VirtIO IOMMU is/should be
>>>> implemented.
>>>
>>> Not yet (as far as I know), but we have just started some discussons on
>>> this topic within Linaro.
>>>
>>>
>>> You should also be aware that there is another proposal based on
>>> pre-shared-memory and memcpys to solve the virtio security issue:
>>>
>>> https://marc.info/?l=linux-kernel&m=158807398403549
>>>
>>> It would be certainly slower than the "virtio IOMMU" solution but it
>>> would take far less time to develop and could work as a short-term
>>> stop-gap.
>>
>> I don't think I agree with this blank statement. In the case of "virtio
>> IOMMU", you would need to potentially map/unmap pages every request which
>> would result to a lot of back and forth to the hypervisor.
>>
>> So it may turn out that pre-shared-memory may be faster on some setup.
>
> AFAICT you could achieve the same with an IOMMU: pre-share (ie: add to
> the device IOMMU page tables) a bunch of pages and keep bouncing data
> to/from them in order to interact with the device, that way you could
> avoid the map and unmaps (and is effectively how persistent grants
> work in the blkif protocol).

Yes, it is possible to do the same with the virtio IOMMU. I was arguing
more against the statement that pre-shared memory is going to be slower
than the IOMMU case.

>
> The thread referenced by Stefano seems to point out this shared memory
> model is targeted for very limited hypervisors that don't have the
> capacity to trap, decode and emulate accesses to memory?

Technically we are in the same situation for Xen on Arm, as we don't
have IOREQ support yet. But I think IOREQ is worthwhile as it would
enable existing, unmodified Linux guests with virtio drivers to boot on
Xen.

Cheers,

--
Julien Grall
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
On Tue, Jul 21, 2020 at 02:32:38PM +0100, Julien Grall wrote:
> Hi Roger,
>
> On 21/07/2020 14:25, Roger Pau Monné wrote:
> > On Tue, Jul 21, 2020 at 01:31:48PM +0100, Julien Grall wrote:
> > > Hi Stefano,
> > >
> > > On 20/07/2020 21:37, Stefano Stabellini wrote:
> > > > On Mon, 20 Jul 2020, Roger Pau Monné wrote:
> > > > > On Mon, Jul 20, 2020 at 10:40:40AM +0100, Julien Grall wrote:
> > > > > >
> > > > > >
> > > > > > On 20/07/2020 10:17, Roger Pau Monné wrote:
> > > > > > > On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
> > > > > > > > On 17.07.20 18:00, Roger Pau Monné wrote:
> > > > > > > > > On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
> > > > > > > > > Do you have any plans to try to upstream a modification to the VirtIO
> > > > > > > > > spec so that grants (ie: abstract references to memory addresses) can
> > > > > > > > > be used on the VirtIO ring?
> > > > > > > >
> > > > > > > > But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
> > > > > > > > guest. Nothing to upsteam)
> > > > > > >
> > > > > > > OK, so there's no intention to add grants (or a similar interface) to
> > > > > > > the spec?
> > > > > > >
> > > > > > > I understand that you want to support unmodified VirtIO frontends, but
> > > > > > > I also think that long term frontends could negotiate with backends on
> > > > > > > the usage of grants in the shared ring, like any other VirtIO feature
> > > > > > > negotiated between the frontend and the backend.
> > > > > > >
> > > > > > > This of course needs to be on the spec first before we can start
> > > > > > > implementing it, and hence my question whether a modification to the
> > > > > > > spec in order to add grants has been considered.
> > > > > > The problem is not really the specification but the adoption in the
> > > > > > ecosystem. A protocol based on grant-tables would mostly only be used by Xen
> > > > > > therefore:
> > > > > > - It may be difficult to convince a proprietary OS vendor to invest
> > > > > > resource on implementing the protocol
> > > > > > - It would be more difficult to move in/out of Xen ecosystem.
> > > > > >
> > > > > > Both, may slow the adoption of Xen in some areas.
> > > > >
> > > > > Right, just to be clear my suggestion wasn't to force the usage of
> > > > > grants, but whether adding something along this lines was in the
> > > > > roadmap, see below.
> > > > >
> > > > > > If one is interested in security, then it would be better to work with the
> > > > > > other interested parties. I think it would be possible to use a virtual
> > > > > > IOMMU for this purpose.
> > > > >
> > > > > Yes, I've also heard rumors about using the (I assume VirtIO) IOMMU in
> > > > > order to protect what backends can map. This seems like a fine idea,
> > > > > and would allow us to gain the lost security without having to do the
> > > > > whole work ourselves.
> > > > >
> > > > > Do you know if there's anything published about this? I'm curious
> > > > > about how and where in the system the VirtIO IOMMU is/should be
> > > > > implemented.
> > > >
> > > > Not yet (as far as I know), but we have just started some discussons on
> > > > this topic within Linaro.
> > > >
> > > >
> > > > You should also be aware that there is another proposal based on
> > > > pre-shared-memory and memcpys to solve the virtio security issue:
> > > >
> > > > https://marc.info/?l=linux-kernel&m=158807398403549
> > > >
> > > > It would be certainly slower than the "virtio IOMMU" solution but it
> > > > would take far less time to develop and could work as a short-term
> > > > stop-gap.
> > >
> > > I don't think I agree with this blank statement. In the case of "virtio
> > > IOMMU", you would need to potentially map/unmap pages every request which
> > > would result to a lot of back and forth to the hypervisor.
> > >
> > > So it may turn out that pre-shared-memory may be faster on some setup.
> >
> > AFAICT you could achieve the same with an IOMMU: pre-share (ie: add to
> > the device IOMMU page tables) a bunch of pages and keep bouncing data
> > to/from them in order to interact with the device, that way you could
> > avoid the map and unmaps (and is effectively how persistent grants
> > work in the blkif protocol).
>
> Yes it is possible to do the same with the virtio IOMMU. I was more arguing
> on the statement that pre-shared-memory is going to be slower than the IOMMU
> case.
>
> >
> > The thread referenced by Stefano seems to point out this shared memory
> > model is targeted for very limited hypervisors that don't have the
> > capacity to trap, decode and emulate accesses to memory?
>
> Technically we are in the same case for Xen on Arm as we don't have the
> IOREQ support yet. But I think IOREQ is worthwhile as it would enable
> existing unmodified Linux with virtio driver to boot on Xen.

Yes, I fully agree.

Roger.
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
(+ Andre)

Hi Oleksandr,

On 21/07/2020 13:26, Oleksandr wrote:
> On 20.07.20 23:38, Stefano Stabellini wrote:
>> For instance, what's your take on notifications with virtio-mmio? How
>> are they modelled today? Are they good enough or do we need MSIs?
>
> Notifications are sent from device (backend) to the driver (frontend)
> using interrupts. Additional DM function was introduced for that purpose
> xendevicemodel_set_irq_level() which results in vgic_inject_irq() call.
>
> Currently, if device wants to notify a driver it should trigger the
> interrupt by calling that function twice (high level at first, then low
> level).

This doesn't look right to me. Assuming the interrupt is triggered when
the line is high, the backend should only issue the hypercall once to
set the level to high. Once the guest has finished processing all the
notifications, the backend would then call the hypercall to lower the
interrupt line.

This means the interrupt should keep firing as long as the interrupt
line is high.

It is quite possible that I took some shortcuts when implementing the
hypercall, so this should be corrected before anyone starts to rely on it.

Cheers,

--
Julien Grall
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
Julien Grall <julien@xen.org> writes:

> Hi Stefano,
>
> On 20/07/2020 21:37, Stefano Stabellini wrote:
>> On Mon, 20 Jul 2020, Roger Pau Monné wrote:
>>> On Mon, Jul 20, 2020 at 10:40:40AM +0100, Julien Grall wrote:
>>>>
>>>>
>>>> On 20/07/2020 10:17, Roger Pau Monné wrote:
>>>>> On Fri, Jul 17, 2020 at 09:34:14PM +0300, Oleksandr wrote:
>>>>>> On 17.07.20 18:00, Roger Pau Monné wrote:
>>>>>>> On Fri, Jul 17, 2020 at 05:11:02PM +0300, Oleksandr Tyshchenko wrote:
>>>>>>> Do you have any plans to try to upstream a modification to the VirtIO
>>>>>>> spec so that grants (ie: abstract references to memory addresses) can
>>>>>>> be used on the VirtIO ring?
>>>>>>
>>>>>> But VirtIO spec hasn't been modified as well as VirtIO infrastructure in the
>>>>>> guest. Nothing to upsteam)
>>>>>
>>>>> OK, so there's no intention to add grants (or a similar interface) to
>>>>> the spec?
>>>>>
>>>>> I understand that you want to support unmodified VirtIO frontends, but
>>>>> I also think that long term frontends could negotiate with backends on
>>>>> the usage of grants in the shared ring, like any other VirtIO feature
>>>>> negotiated between the frontend and the backend.
>>>>>
>>>>> This of course needs to be on the spec first before we can start
>>>>> implementing it, and hence my question whether a modification to the
>>>>> spec in order to add grants has been considered.
>>>> The problem is not really the specification but the adoption in the
>>>> ecosystem. A protocol based on grant-tables would mostly only be used by Xen
>>>> therefore:
>>>> - It may be difficult to convince a proprietary OS vendor to invest
>>>> resource on implementing the protocol
>>>> - It would be more difficult to move in/out of Xen ecosystem.
>>>>
>>>> Both, may slow the adoption of Xen in some areas.
>>>
>>> Right, just to be clear my suggestion wasn't to force the usage of
>>> grants, but whether adding something along this lines was in the
>>> roadmap, see below.
>>>
>>>> If one is interested in security, then it would be better to work with the
>>>> other interested parties. I think it would be possible to use a virtual
>>>> IOMMU for this purpose.
>>>
>>> Yes, I've also heard rumors about using the (I assume VirtIO) IOMMU in
>>> order to protect what backends can map. This seems like a fine idea,
>>> and would allow us to gain the lost security without having to do the
>>> whole work ourselves.
>>>
>>> Do you know if there's anything published about this? I'm curious
>>> about how and where in the system the VirtIO IOMMU is/should be
>>> implemented.
>>
>> Not yet (as far as I know), but we have just started some discussons on
>> this topic within Linaro.
>>
>>
>> You should also be aware that there is another proposal based on
>> pre-shared-memory and memcpys to solve the virtio security issue:
>>
>> https://marc.info/?l=linux-kernel&m=158807398403549
>>
>> It would be certainly slower than the "virtio IOMMU" solution but it
>> would take far less time to develop and could work as a short-term
>> stop-gap.
>
> I don't think I agree with this blank statement. In the case of "virtio
> IOMMU", you would need to potentially map/unmap pages every request
> which would result to a lot of back and forth to the hypervisor.

Can a virtio-iommu just set bounds when a device is initialised as to
where memory will be in the kernel address space?

> So it may turn out that pre-shared-memory may be faster on some setup.

Certainly having to update the page permissions on every transaction is
going to be too slow for something that wants to avoid the performance
penalty of a bounce buffer.

--
Alex Bennée
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
Julien Grall <julien@xen.org> writes:

> (+ Andree for the vGIC).
>
> Hi Stefano,
>
> On 20/07/2020 21:38, Stefano Stabellini wrote:
>> On Fri, 17 Jul 2020, Oleksandr wrote:
>>>>> *A few word about solution:*
>>>>> As it was mentioned at [1], in order to implement virtio-mmio Xen on Arm
>>>> Any plans for virtio-pci? Arm seems to be moving to the PCI bus, and
>>>> it would be very interesting from a x86 PoV, as I don't think
>>>> virtio-mmio is something that you can easily use on x86 (or even use
>>>> at all).
>>>
>>> Being honest I didn't consider virtio-pci so far. Julien's PoC (we are based
>>> on) provides support for the virtio-mmio transport
>>>
>>> which is enough to start working around VirtIO and is not as complex as
>>> virtio-pci. But it doesn't mean there is no way for virtio-pci in Xen.
>>>
>>> I think, this could be added in next steps. But the nearest target is
>>> virtio-mmio approach (of course if the community agrees on that).
>
>> Aside from complexity and easy-of-development, are there any other
>> architectural reasons for using virtio-mmio?
>
<snip>
>>
>> For instance, what's your take on notifications with virtio-mmio? How
>> are they modelled today?
>
> The backend will notify the frontend using an SPI. The other way around
> (frontend -> backend) is based on an MMIO write.
>
> We have an interface to allow the backend to control whether the
> interrupt level (i.e. low, high). However, the "old" vGIC doesn't handle
> properly level interrupts. So we would end up to treat level interrupts
> as edge.
>
> Technically, the problem is already existing with HW interrupts, but the
> HW should fire it again if the interrupt line is still asserted. Another
> issue is the interrupt may fire even if the interrupt line was
> deasserted (IIRC this caused some interesting problem with the Arch timer).
>
> I am a bit concerned that the issue will be more proeminent for virtual
> interrupts. I know that we have some gross hack in the vpl011 to handle
> a level interrupts. So maybe it is time to switch to the new vGIC?
>
>> Are they good enough or do we need MSIs?
>
> I am not sure whether virtio-mmio supports MSIs. However for virtio-pci,
> MSIs is going to be useful to improve performance. This may mean to
> expose an ITS, so we would need to add support for guest.

virtio-mmio doesn't support MSIs at the moment, although there have been
proposals to update the spec to allow them. As it stands, the cost of
reading the ISR value and then writing an ack in vm_interrupt:

/* Read and acknowledge interrupts */
status = readl(vm_dev->base + VIRTIO_MMIO_INTERRUPT_STATUS);
writel(status, vm_dev->base + VIRTIO_MMIO_INTERRUPT_ACK);

puts an extra vmexit cost on trapping and emulating each access.
Getting an MSI via an exitless access to the GIC would be better, I
think. I'm not quite sure what the path to IRQs from Xen is.


>
> Cheers,


--
Alex Bennée
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
Hi,

On 17/07/2020 19:34, Oleksandr wrote:
>
> On 17.07.20 18:00, Roger Pau Monné wrote:
>>> requires
>>> some implementation to forward guest MMIO access to a device model.
>>> And as
>>> it
>>> turned out the Xen on x86 contains most of the pieces to be able to
>>> use that
>>> transport (via existing IOREQ concept). Julien has already done a big
>>> amount
>>> of work in his PoC (xen/arm: Add support for Guest IO forwarding to a
>>> device emulator).
>>> Using that code as a base we managed to create a completely
>>> functional PoC
>>> with DomU
>>> running on virtio block device instead of a traditional Xen PV driver
>>> without
>>> modifications to DomU Linux. Our work is mostly about rebasing Julien's
>>> code on the actual
>>> codebase (Xen 4.14-rc4), various tweeks to be able to run emulator
>>> (virtio-disk backend)
>>> in other than Dom0 domain (in our system we have thin Dom0 and keep all
>>> backends
>>> in driver domain),
>> How do you handle this use-case? Are you using grants in the VirtIO
>> ring, or rather allowing the driver domain to map all the guest memory
>> and then placing gfn on the ring like it's commonly done with VirtIO?
>
> Second option. Xen grants are not used at all as well as event channel
> and Xenbus. That allows us to have guest
>
> *unmodified* which one of the main goals. Yes, this may sound (or even
> sounds) non-secure, but backend which runs in driver domain is allowed
> to map all guest memory.
>
> In current backend implementation a part of guest memory is mapped just
> to process guest request then unmapped back, there is no mappings in
> advance. The xenforeignmemory_map
>
> call is used for that purpose. For experiment I tried to map all guest
> memory in advance and just calculated pointer at runtime. Of course that
> logic performed better.

That works well for a PoC; however, I am not sure you can rely on it
long term, as a guest is free to modify its memory layout. For instance,
Linux may balloon memory in/out. You probably want to consider something
similar to the mapcache in QEMU.
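
For reference, a minimal sketch of the per-request mapping flow
described in the quoted text, using the upstream libxenforeignmemory
calls (the request layout and gfn list here are purely illustrative):

    #include <sys/mman.h>
    #include <xenforeignmemory.h>

    /* Map the guest pages backing one request, process them, then drop
     * the mapping again. A real backend would want a mapcache-like
     * layer instead of a map/unmap per request. */
    static int handle_request(xenforeignmemory_handle *fmem, uint32_t domid,
                              const xen_pfn_t *gfns, size_t nr_pages)
    {
        int err[nr_pages];
        void *va = xenforeignmemory_map(fmem, domid, PROT_READ | PROT_WRITE,
                                        nr_pages, gfns, err);
        if (!va)
            return -1;

        /* ... parse the virtio descriptors / copy the data via 'va' ... */

        xenforeignmemory_unmap(fmem, va, nr_pages);
        return 0;
    }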

On a similar topic, I am a bit surprised you didn't encounter memory
exhaustion when trying to use virtio. Because of how Linux currently
works (see XSA-300), the backend domain has to have at least as much RAM
as the domains it serves. For instance, if you serve two domains with
1GB of RAM each, then your backend would need at least 2GB plus some for
its own purposes.

This probably wants to be resolved by allowing foreign mappings to be
"paged" out as you would for memory assigned to userspace.

> I was thinking about guest static memory regions and forcing guest to
> allocate descriptors from them (in order not to map all guest memory,
> but a predefined region). But that implies modifying guest...

[...]

>>> misc fixes for our use-cases and tool support for the
>>> configuration.
>>> Unfortunately, Julien doesn’t have much time to allocate on the work
>>> anymore,
>>> so we would like to step in and continue.
>>>
>>> *A few word about the Xen code:*
>>> You can find the whole Xen series at [5]. The patches are in RFC state
>>> because
>>> some actions in the series should be reconsidered and implemented
>>> properly.
>>> Before submitting the final code for the review the first IOREQ patch
>>> (which is quite
>>> big) will be split into x86, Arm and common parts. Please note, x86 part
>>> wasn’t
>>> even build-tested so far and could be broken with that series. Also the
>>> series probably
>>> wants splitting into adding IOREQ on Arm (should be focused first) and
>>> tools support
>>> for the virtio-disk (which is going to be the first Virtio driver)
>>> configuration before going
>>> into the mailing list.
>> Sending first a patch series to enable IOREQs on Arm seems perfectly
>> fine, and it doesn't have to come with the VirtIO backend. In fact I
>> would recommend that you send that ASAP, so that you don't spend time
>> working on the backend that would likely need to be modified
>> according to the review received on the IOREQ series.
>
> Completely agree with you, I will send it after splitting IOREQ patch
> and performing some cleanup.
>
> However, it is going to take some time to make it properly taking into
> the account
>
> that personally I won't be able to test on x86.
I think other members of the community should be able to help here.
However, nowadays testing Xen on x86 is pretty easy with QEMU :).

Cheers,

--
Julien Grall
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
On 21/07/2020 14:43, Julien Grall wrote:
> (+ Andre)
>
> Hi Oleksandr,
>
> On 21/07/2020 13:26, Oleksandr wrote:
>> On 20.07.20 23:38, Stefano Stabellini wrote:
>>> For instance, what's your take on notifications with virtio-mmio? How
>>> are they modelled today? Are they good enough or do we need MSIs?
>>
>> Notifications are sent from device (backend) to the driver (frontend)
>> using interrupts. Additional DM function was introduced for that
>> purpose xendevicemodel_set_irq_level() which results in
>> vgic_inject_irq() call.
>>
>> Currently, if device wants to notify a driver it should trigger the
>> interrupt by calling that function twice (high level at first, then
>> low level).
>
> This doesn't look right to me. Assuming the interrupt is trigger when
> the line is high-level, the backend should only issue the hypercall once
> to set the level to high. Once the guest has finish to process all the
> notifications the backend would then call the hypercall to lower the
> interrupt line.
>
> This means the interrupts should keep firing as long as the interrupt
> line is high.
>
> It is quite possible that I took some shortcut when implementing the
> hypercall, so this should be corrected before anyone start to rely on it.

So I think the key question is: are virtio interrupts level or edge
triggered? Both QEMU and kvmtool advertise virtio-mmio interrupts as
edge-triggered.
From skimming through the virtio spec I can't find any explicit mention
of the IRQ type, but the usage of MSIs indeed hints at an edge property.
Apparently reading the PCI ISR status register clears it, which again
sounds like edge. For virtio-mmio the driver needs to explicitly clear
the interrupt status register, which again says edge (as it's not the
device clearing the status).

So the device should just notify the driver once, which would cause one
vgic_inject_irq() call. It would then be up to the driver to clear that
status, by reading the PCI ISR status or writing to virtio-mmio's
interrupt-acknowledge register.
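
If that is the model, the backend side of a notification would reduce
to a single status update plus one injection, roughly as below
(VIRTIO_MMIO_INT_VRING is the standard "used buffer" status bit; the
other names are placeholders for the emulator's own state and helpers):

    /* Edge-style notification sketch: latch the reason in the emulated
     * interrupt status register, then inject the SPI once. The driver
     * clears the status itself via the interrupt-acknowledge register,
     * so no second call from the backend is needed. */
    vdev->interrupt_status |= VIRTIO_MMIO_INT_VRING;
    inject_device_spi(guest_domid, vdev->spi);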

Does that make sense?

Cheers,
Andre
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
Hi Alex,

Thank you for your feedback!

On 21/07/2020 15:15, Alex Bennée wrote:
> Julien Grall <julien@xen.org> writes:
>
>> (+ Andree for the vGIC).
>>
>> Hi Stefano,
>>
>> On 20/07/2020 21:38, Stefano Stabellini wrote:
>>> On Fri, 17 Jul 2020, Oleksandr wrote:
>>>>>> *A few word about solution:*
>>>>>> As it was mentioned at [1], in order to implement virtio-mmio Xen on Arm
>>>>> Any plans for virtio-pci? Arm seems to be moving to the PCI bus, and
>>>>> it would be very interesting from a x86 PoV, as I don't think
>>>>> virtio-mmio is something that you can easily use on x86 (or even use
>>>>> at all).
>>>>
>>>> Being honest I didn't consider virtio-pci so far. Julien's PoC (we are based
>>>> on) provides support for the virtio-mmio transport
>>>>
>>>> which is enough to start working around VirtIO and is not as complex as
>>>> virtio-pci. But it doesn't mean there is no way for virtio-pci in Xen.
>>>>
>>>> I think, this could be added in next steps. But the nearest target is
>>>> virtio-mmio approach (of course if the community agrees on that).
>>
>>> Aside from complexity and easy-of-development, are there any other
>>> architectural reasons for using virtio-mmio?
>>
> <snip>
>>>
>>> For instance, what's your take on notifications with virtio-mmio? How
>>> are they modelled today?
>>
>> The backend will notify the frontend using an SPI. The other way around
>> (frontend -> backend) is based on an MMIO write.
>>
>> We have an interface to allow the backend to control whether the
>> interrupt level (i.e. low, high). However, the "old" vGIC doesn't handle
>> properly level interrupts. So we would end up to treat level interrupts
>> as edge.
>>
>> Technically, the problem is already existing with HW interrupts, but the
>> HW should fire it again if the interrupt line is still asserted. Another
>> issue is the interrupt may fire even if the interrupt line was
>> deasserted (IIRC this caused some interesting problem with the Arch timer).
>>
>> I am a bit concerned that the issue will be more proeminent for virtual
>> interrupts. I know that we have some gross hack in the vpl011 to handle
>> a level interrupts. So maybe it is time to switch to the new vGIC?
>>
>>> Are they good enough or do we need MSIs?
>>
>> I am not sure whether virtio-mmio supports MSIs. However for virtio-pci,
>> MSIs is going to be useful to improve performance. This may mean to
>> expose an ITS, so we would need to add support for guest.
>
> virtio-mmio doesn't support MSI's at the moment although there have been
> proposals to update the spec to allow them. At the moment the cost of
> reading the ISR value and then writing an ack in vm_interrupt:
>
> /* Read and acknowledge interrupts */
> status = readl(vm_dev->base + VIRTIO_MMIO_INTERRUPT_STATUS);
> writel(status, vm_dev->base + VIRTIO_MMIO_INTERRUPT_ACK);
>

Hmmmm, the current way to handle MMIO is the following:
 * Pause the vCPU
 * Forward the access to the backend domain
 * Schedule the backend domain
 * Wait for the access to be handled
 * Unpause the vCPU

So the sequence is going to be fairly expensive on Xen.
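
For context, a rough sketch of the device-model side of that round
trip, based on the public ioreq interface (struct ioreq in
xen/include/public/hvm/ioreq.h); event-channel and shared-page setup
are omitted and the emulate_*/notify helpers are placeholders:

    /* Service one forwarded MMIO access from the shared ioreq page. */
    static void handle_one_ioreq(ioreq_t *ioreq)
    {
        if (ioreq->state != STATE_IOREQ_READY)
            return;

        if (ioreq->dir == IOREQ_READ)
            ioreq->data = emulate_mmio_read(ioreq->addr, ioreq->size);
        else /* IOREQ_WRITE */
            emulate_mmio_write(ioreq->addr, ioreq->size, ioreq->data);

        /* Tell Xen the access has been handled so the vCPU can be
         * unpaused, then kick the event channel. */
        ioreq->state = STATE_IORESP_READY;
        notify_via_evtchn(ioreq->vp_eport); /* placeholder */
    }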

> puts an extra vmexit cost to trap an emulate each exit. Getting an MSI
> via an exitless access to the GIC would be better I think.
> I'm not quite
> sure what the path to IRQs from Xen is.

vmexit on Xen on Arm is pretty cheap compared to KVM as we don't save a
lot of state. In this situation, handling an extra trap for the
interrupt is likely to be negligible compared to the sequence above.

I am assuming the sequence is also going to be used for MSIs, right?

It feels to me that it would be worth spending time investigating the
cost of that sequence. It might be possible to optimize the ACK and
avoid waiting for the backend to handle the access.

Cheers,

--
Julien Grall
Re: Virtio in Xen on Arm (based on IOREQ concept) [ In reply to ]
On 21.07.20 17:32, André Przywara wrote:
> On 21/07/2020 14:43, Julien Grall wrote:

Hello Andre, Julien


>> (+ Andre)
>>
>> Hi Oleksandr,
>>
>> On 21/07/2020 13:26, Oleksandr wrote:
>>> On 20.07.20 23:38, Stefano Stabellini wrote:
>>>> For instance, what's your take on notifications with virtio-mmio? How
>>>> are they modelled today? Are they good enough or do we need MSIs?
>>> Notifications are sent from device (backend) to the driver (frontend)
>>> using interrupts. Additional DM function was introduced for that
>>> purpose xendevicemodel_set_irq_level() which results in
>>> vgic_inject_irq() call.
>>>
>>> Currently, if device wants to notify a driver it should trigger the
>>> interrupt by calling that function twice (high level at first, then
>>> low level).
>> This doesn't look right to me. Assuming the interrupt is trigger when
>> the line is high-level, the backend should only issue the hypercall once
>> to set the level to high. Once the guest has finish to process all the
>> notifications the backend would then call the hypercall to lower the
>> interrupt line.
>>
>> This means the interrupts should keep firing as long as the interrupt
>> line is high.
>>
>> It is quite possible that I took some shortcut when implementing the
>> hypercall, so this should be corrected before anyone start to rely on it.
> So I think the key question is: are virtio interrupts level or edge
> triggered? Both QEMU and kvmtool advertise virtio-mmio interrupts as
> edge-triggered.
> From skimming through the virtio spec I can't find any explicit
> mentioning of the type of IRQ, but the usage of MSIs indeed hints at
> using an edge property. Apparently reading the PCI ISR status register
> clears it, which again sounds like edge. For virtio-mmio the driver
> needs to explicitly clear the interrupt status register, which again
> says: edge (as it's not the device clearing the status).
>
> So the device should just notify the driver once, which would cause one
> vgic_inject_irq() call. It would be then up to the driver to clear up
> that status, by reading PCI ISR status or writing to virtio-mmio's
> interrupt-acknowledge register.
>
> Does that make sense?
When implementing the Xen backend I didn't have an already working
example, so I could only guess. I looked at how kvmtool behaves when
actually triggering the interrupt on Arm [1].

Taking into account that the Xen PoC on Arm advertises [2] the same IRQ
type (TYPE_EDGE_RISING) as kvmtool [3], I decided to follow its model of
triggering the interrupt. Could you please explain whether this is
wrong?


[1]
https://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git/tree/arm/gic.c#n418

[2]
https://github.com/xen-troops/xen/blob/ioreq_4.14_ml/tools/libxl/libxl_arm.c#L727

[3]
https://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git/tree/virtio/mmio.c#n270

--
Regards,

Oleksandr Tyshchenko
