Mailing List Archive

XSM and the idle domain
Hi,

A while ago there was a quick chat on IRC about how XSM interacts with
the idle domain. The conversation did not reach any clear conclusions
so it might be a good idea to summarise the questions in an email.

Basically there were two questions in that conversation:

1. In its current state, are security modules able to limit what the
idle domain can do?
2. Should security modules be able to restrict the idle domain?

The first question came up during ongoing work in LiveUpdate. After an
LU, the next Xen needs to restore all domains. To do that, some
hypercalls need to be issued from the idle domain context and
apparently XSM does not like it. We need to introduce hacks in the
dummy module to leave the idle domain alone. Our work is not compiled
with CONFIG_XSM at all, but with CONFIG_XSM, are we able to enforce
security policies against the idle domain? Of course, without any LU
work this does not make any difference because the idle domain does not
do any useful work to be restricted anyway.

Also, should idle domain be restricted? IMO the idle domain is Xen
itself which mostly bootstraps the system and performs limited work
when switched to, and is not something a user (either dom0 or domU)
directly interacts with. I doubt XSM was designed to include the idle
domain (although there is an ID allocated for it in the code), so I
would say just exclude idle in all security policy checks.

I may have missed some points in that discussion, so please feel free
to add.

Hongyan
Re: XSM and the idle domain
On Wed, Oct 21, 2020 at 10:35 AM Hongyan Xia <hx242@xen.org> wrote:
>
> Hi,

Hi, Hongyan.

I'm familiar with Flask but not particularly familiar with other XSMs
or CONFIG_XSM=n.

> A while ago there was a quick chat on IRC about how XSM interacts with
> the idle domain. The conversation did not reach any clear conclusions
> so it might be a good idea to summarise the questions in an email.
>
> Basically there were two questions in that conversation:
>
> 1. In its current state, are security modules able to limit what the
> idle domain can do?
> 2. Should security modules be able to restrict the idle domain?
>
> The first question came up during ongoing work in LiveUpdate. After an
> LU, the next Xen needs to restore all domains. To do that, some
> hypercalls need to be issued from the idle domain context and
> apparently XSM does not like it. We need to introduce hacks in the
> dummy module to leave the idle domain alone.

Is this modifying xsm_default_action() to add an is_idle_domain()
check which always succeeds?

> Our work is not compiled
> with CONFIG_XSM at all, but with CONFIG_XSM, are we able to enforce
> security policies against the idle domain?

It's not clear to me if you want to use CONFIG_XSM, or just don't want
to break it.

> Of course, without any LU
> work this does not make any difference because the idle domain does not
> do any useful work to be restricted anyway.

I think this last sentence is the main point. It's always been
labeled xen_t, but since it doesn't go through any of the hook points,
it hasn't needed any restrictions. Actually, reviewing the Flask
policy there is:
# Domain destruction can result in some access checks for actions performed by
# the hypervisor. These should always be allowed.
allow xen_t resource_type : resource { remove_irq remove_ioport remove_iomem };

> Also, should idle domain be restricted? IMO the idle domain is Xen
> itself which mostly bootstraps the system and performs limited work
> when switched to, and is not something a user (either dom0 or domU)
> directly interacts with. I doubt XSM was designed to include the idle
> domain (although there is an ID allocated for it in the code), so I
> would say just exclude idle in all security policy checks.

I think it makes sense to label xen_t, even if it doesn't do anything.
As you say, it is a distinct entity from dom0 and domU. Yes, it can
circumvent the policy, but it's not actively hurting anything. And it
can be good to catch when it does start doing something, as you found.

Might it make sense to create a LU domain instead of using the idle
domain for Live Update? Another approach could be to run the
idle_domain as "dom0" during Live Update, and then transition to the
regular idle_domain when it completes? You are re-creating dom0, but
you could flip is_privileged on during live update and then remove it
once complete.

Regards,
Jason
Re: XSM and the idle domain
On 10/21/20 10:34 AM, Hongyan Xia wrote:
> Hi,
>
> A while ago there was a quick chat on IRC about how XSM interacts with
> the idle domain. The conversation did not reach any clear conclusions
> so it might be a good idea to summarise the questions in an email.
>
> Basically there were two questions in that conversation:
>
> 1. In its current state, are security modules able to limit what the
> idle domain can do?

Yes, in the sense that the idle domain has a type and you can constrain
which actions that type is allowed. In reality, though, the idle domain
is given the same type as the hypervisor itself and thus must be able to
perform certain actions.

> 2. Should security modules be able to restrict the idle domain?

IMHO this question should be reversed to ask whether the actions the
idle domain is being used for are appropriate from a security point of
view. AIUI the idle domain is a mechanism for the scheduler to use as a
place to schedule an idle vcpu. And yes, I understand that some limited
work is done there, e.g. memory scrubbing, but 1.) there is a difference
between light/limited work that can be done within the confines of a
domain and work requiring hypercalls, and 2.) this precedent may have
been due to limitations rather than being the necessarily correct
approach.

> The first question came up during ongoing work in LiveUpdate. After an
> LU, the next Xen needs to restore all domains. To do that, some
> hypercalls need to be issued from the idle domain context and
> apparently XSM does not like it. We need to introduce hacks in the
> dummy module to leave the idle domain alone. Our work is not compiled
> with CONFIG_XSM at all, but with CONFIG_XSM, are we able to enforce
> security policies against the idle domain? Of course, without any LU
> work this does not make any difference because the idle domain does not
> do any useful work to be restricted anyway.

Why do they "need to be issued from the idle domain"? As was suggested
by Jason, why isn't this done from a construction domain context? I will
interject here that with DomB that is what we will be doing and it
sounds like LiveUpdate is very similar to the relaunch concept that DomB
is being constructed to support.

Yes, XSM did not like it, because what is being done is analogous to
trying to make a system call from inside an OS kernel. Again, AIUI the
idle domain is not a real domain but an internal construct for the
scheduler to manage idle vcpus, and attempting to make hypercalls from it
is in fact attempting to turn it into a full-fledged domain.

From a security perspective, if hacks to the XSM hooks are necessary to
make something work then it is highly recommended to take a step back
and ask why and whether you are doing something that is not safe from a
security perspective.

> Also, should idle domain be restricted? IMO the idle domain is Xen
> itself which mostly bootstraps the system and performs limited work
> when switched to, and is not something a user (either dom0 or domU)
> directly interacts with. I doubt XSM was designed to include the idle
> domain (although there is an ID allocated for it in the code), so I
> would say just exclude idle in all security policy checks.

The idle domain is a limited, internal construct within the hypervisor
and should be constrained as part of the hypervisor, which is why its
domain id gets labeled with the same label as the hypervisor. For this
reason I would wholeheartedly disagree with exempting the idle domain id
from XSM hooks as that would effectively be saying the core hypervisor
should not be constrained. The purpose of the XSM hooks is to control
the flow of information in the system in a non-bypassable way. Codifying
bypasses completely subverts the security model behind XSM, on which the
Flask security server depends.

> I may have missed some points in that discussion, so please feel free
> to add.
>
> Hongyan
>
>

V/r,
DPS
Re: XSM and the idle domain
On 22.10.2020 03:23, Daniel P. Smith wrote:
> On 10/21/20 10:34 AM, Hongyan Xia wrote:
>> Also, should idle domain be restricted? IMO the idle domain is Xen
>> itself which mostly bootstraps the system and performs limited work
>> when switched to, and is not something a user (either dom0 or domU)
>> directly interacts with. I doubt XSM was designed to include the idle
>> domain (although there is an ID allocated for it in the code), so I
>> would say just exclude idle in all security policy checks.
>
> The idle domain is a limited, internal construct within the hypervisor
> and should be constrained as part of the hypervisor, which is why its
> domain id gets labeled with the same label as the hypervisor. For this
> reason I would wholeheartedly disagree with exempting the idle domain id
> from XSM hooks as that would effectively be saying the core hypervisor
> should not be constrained. The purpose of the XSM hooks is to control
> the flow of information in the system in a non-bypassable way. Codifying
> bypasses completely subverts the security model behind XSM for which the
> flask security server is dependent upon.

While what you say may in general make sense, I have two questions:
1) When the idle domain is purely an internal construct of Xen, why
does it need limiting in any way? In fact, if restricting it in a
bad way, aren't you risking to prevent the system from functioning
correctly?
2) LU is merely restoring the prior state of the system. This prior
state was reached with security auditing as per the system's
policy at the time. Why should there be anything denied in the
process of re-establishing this same state? IOW can't XSM checking
be globally disabled until the system is ready to be run normally
again?
Please forgive if this sounds like rubbish to you - I may not have a
good enough understanding of the abstract constraints involved here.

Jan
Re: XSM and the idle domain
(also replying to others in this thread.)

On Wed, 2020-10-21 at 12:21 -0400, Jason Andryuk wrote:
> On Wed, Oct 21, 2020 at 10:35 AM Hongyan Xia <hx242@xen.org> wrote:
> >
> > Hi,
>
> ...
> >
> > The first question came up during ongoing work in LiveUpdate. After
> > an
> > LU, the next Xen needs to restore all domains. To do that, some
> > hypercalls need to be issued from the idle domain context and
> > apparently XSM does not like it. We need to introduce hacks in the
> > dummy module to leave the idle domain alone.
>
> Is this modifying xsm_default_action() to add an is_idle_domain()
> check which always succeeds?

Yes. We had to do exactly that to avoid LU actions being denied by XSM.
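
For readers following along, the workaround confirmed above can be
modelled roughly as below. This is only a minimal, self-contained sketch
and not the LU patch or Xen's real dummy module: struct domain, the
is_idle field, the reduced set of default actions and the function name
default_action() are all simplified stand-ins invented for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for Xen's types (assumption: the real ones differ). */
enum xsm_default { XSM_HOOK, XSM_TARGET, XSM_PRIV };

struct domain {
    bool is_privileged;   /* control domain, dom0-like */
    bool is_idle;         /* stand-in for Xen's is_idle_domain() test */
};

/* Dummy-module style check: no policy, only built-in privilege rules. */
int default_action(enum xsm_default action,
                   const struct domain *src, const struct domain *target)
{
    /* The workaround under discussion: the idle domain always passes. */
    if ( src->is_idle )
        return 0;

    switch ( action )
    {
    case XSM_HOOK:
        return 0;                 /* anyone may do this */
    case XSM_TARGET:
        if ( src == target )
            return 0;             /* a domain may act on itself */
        /* fall through */
    case XSM_PRIV:
        return src->is_privileged ? 0 : -EPERM;
    }

    return -EPERM;                /* unknown action: deny */
}
```

In this model an unprivileged guest is still refused XSM_PRIV actions,
while any caller flagged as idle is waved through before the action is
even looked at, which is the kind of blanket exemption the rest of the
thread argues about.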

> > Our work is not compiled
> > with CONFIG_XSM at all, but with CONFIG_XSM, are we able to enforce
> > security policies against the idle domain?
>
> It's not clear to me if you want to use CONFIG_XSM, or just don't
> want
> to break it.

We don't (and won't) enable XSM in our build, but still we need a hack
to work around it, so I am just curious about what happens when people
use both LU and XSM at the same time.

> > Of course, without any LU
> > work this does not make any difference because the idle domain does
> > not
> > do any useful work to be restricted anyway.
>
> I think this last sentence is the main point. It's always been
> labeled xen_t, but since it doesn't go through any of the hook
> points,
> it hasn't needed any restrictions. Actually, reviewing the Flask
> policy there is:
> # Domain destruction can result in some access checks for actions
> performed by
> # the hypervisor. These should always be allowed.
> allow xen_t resource_type : resource { remove_irq remove_ioport
> remove_iomem };
>
> > Also, should idle domain be restricted? IMO the idle domain is Xen
> > itself which mostly bootstraps the system and performs limited work
> > when switched to, and is not something a user (either dom0 or domU)
> > directly interacts with. I doubt XSM was designed to include the
> > idle
> > domain (although there is an ID allocated for it in the code), so I
> > would say just exclude idle in all security policy checks.
>
> I think it makes sense to label xen_t, even if it doesn't do
> anything.
> As you say, it is a distinct entity from dom0 and domU. Yes, it can
> circumvent the policy, but it's not actively hurting anything. And
> it
> can be good to catch when it does start doing something, as you
> found.
>
> Might it make sense to create a LU domain instead of using the idle
> domain for Live Update? Another approach could be to run the
> idle_domain as "dom0" during Live Update, and then transition to the
> regular idle_domain when it completes? You are re-creating dom0, but
> you could flip is_privileged on during live update and then remove it
> once complete.

Actually I think your suggestion and what Daniel suggested make sense.
We could just have a domLU that does all the restore work which has its
own security policies. That sounds like a clean solution to me.
However, one top priority of LU is to minimise the down time so that
domains won't feel a thing and every millisecond counts. I don't know
how much overhead this adds (maybe negligible if we just let domLU sit
in idle domain's page tables so switching and passing the LU save
stream to it is painless), but it is something we need to keep in mind.

But this still sidesteps the question of whether the idle domain should
be subject to security policies. From another reply it sounds like the
idle domain should not be exempt from XSM. Although, to me restrictions
on idle domain are more like a debugging feature than a security
policy, since it prevents, e.g., accidentally issuing hypercalls from
it, but if the idle domain really wants to do something then there is
nothing to stop it. This is different from enforcing policies on a real
domain which guarantees things won't happen and the domain simply has
no mechanism to circumvent it (hopefully).

My experience with XSM is only the idle domain hack for LU so what I
said about it here may not make sense.

Hongyan
Re: XSM and the idle domain
On 21/10/2020 15:34, Hongyan Xia wrote:
> The first question came up during ongoing work in LiveUpdate. After an
> LU, the next Xen needs to restore all domains. To do that, some
> hypercalls need to be issued from the idle domain context and
> apparently XSM does not like it.

There is no such thing as issuing hypercalls from the idle domain
(context or otherwise), because the idle domain does not have enough
associated guest state for anything to make the requisite
SYSCALL/INT80/VMCALL/VMMCALL invocation.

I presume from this comment that what you mean is that you're calling
the plain hypercall functions, context checks and everything, from the
idle context?

If so, this is buggy for more reasons than just XSM objecting to its
calling context, and XSM is merely the first thing to explode.
Therefore, I don't think modifications to XSM are applicable to solving
the problem.

(Of course, this is all speculation because there's no concrete
implementation to look at.)

~Andrew
Re: XSM and the idle domain
---- On Thu, 22 Oct 2020 04:13:53 -0400 Jan Beulich <jbeulich@suse.com> wrote ----

> On 22.10.2020 03:23, Daniel P. Smith wrote:
> > On 10/21/20 10:34 AM, Hongyan Xia wrote:
> >> Also, should idle domain be restricted? IMO the idle domain is Xen
> >> itself which mostly bootstraps the system and performs limited work
> >> when switched to, and is not something a user (either dom0 or domU)
> >> directly interacts with. I doubt XSM was designed to include the idle
> >> domain (although there is an ID allocated for it in the code), so I
> >> would say just exclude idle in all security policy checks.
> >
> > The idle domain is a limited, internal construct within the hypervisor
> > and should be constrained as part of the hypervisor, which is why its
> > domain id gets labeled with the same label as the hypervisor. For this
> > reason I would wholeheartedly disagree with exempting the idle domain id
> > from XSM hooks as that would effectively be saying the core hypervisor
> > should not be constrained. The purpose of the XSM hooks is to control
> > the flow of information in the system in a non-bypassable way. Codifying
> > bypasses completely subverts the security model behind XSM for which the
> > flask security server is dependent upon.
>
> While what you say may in general make sense, I have two questions:

[.Apologies for any poor formatting, responding from webmail interface ( ._.)]

Hey Jan, these are very legitimate questions.

> 1) When the idle domain is purely an internal construct of Xen, why
> does it need limiting in any way? In fact, if restricting it in a
> bad way, aren't you risking to prevent the system from functioning
> correctly?

Think in terms of least privilege: do you want the idle domain, and by
extension the hypervisor, to have the additional privilege of imposing
state on the system, as opposed to processing state changes? I am not
saying it is the wrong technical approach (though I do believe that, at a
minimum, the implementation approach is flawed). I am just asking whether
it is wise from a privilege delegation standpoint, and whether it could
be done differently from a technical standpoint. The underlying concern
is that once you grant the privilege, the hypervisor will forever have
it, and it can be used for good (LU) and bad (corruption). Take, for
instance, what is being attempted with DomB: in this approach the
privilege to impose state (configure domains) is delegated to the Boot
Domain, but the privilege to create state (domain creation) is not. As I
mentioned before, this is what Jason was suggesting: another domain type
that is allowed to impose the state, and that is transitioned to from the
idle domain to conduct the action.

Whether or not the idle domain is allowed to make hypercalls is not
necessarily a concern of the XSM hooks. If it is decided that this is the
desired path, then what is of concern is that the corrective action does
not weaken/break the hooks. If this ends up being the desired approach,
then IMHO the correct action is to update the dummy policy, Flask policy,
and SILO (if it applies) to allow the privilege/access to occur, rather
than putting bypasses into the security hooks.

> 2) LU is merely restoring the prior state of the system. This prior
> state was reached with security auditing as per the system's
> policy at the time. Why should there be anything denied in the
> process of re-establishing this same state? IOW can't XSM checking
> be globally disabled until the system is ready to be run normally
> again?

You are making an assumption there that is easy to overlook: that it is
the same state. It is important to understand what assumptions are being
made and, when possible, to impose those assumptions through policy
rather than with code. Not everyone will want to make the same
assumptions, and some may want a better controlled path for that state to
flow.

No, you don't want to globally disable XSM checking, as that means you
have lost all control over the system: any and all policy violations
could occur without any auditing. This would open a huge hole for a
malicious actor to exploit in an attack against the system.

To reiterate: if this is decided to be the desired approach, then IMHO
the correct implementation is to encode the access in policy, not in
bypasses to the XSM hooks.
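
To make the "policy, not bypasses" distinction concrete, here is a small
table-driven sketch. It is not Flask and does not use Flask's real
classes or permission names: the types, the PERM_* bits, the table
contents and policy_check() are all hypothetical, chosen only to show an
access being granted by a policy entry that can be audited and changed,
instead of by an early return in the hook.

```c
#include <errno.h>

/* Hypothetical security types; xen_t is the label mentioned in the thread. */
enum sec_type { XEN_T, DOM0_T, DOMU_T, NUM_TYPES };

/* Hypothetical permission bits, not Flask's real access vectors. */
#define PERM_CREATE      (1u << 0)   /* create state (domain creation) */
#define PERM_SET_CONTEXT (1u << 1)   /* impose state on an existing domain */
#define PERM_DESTROY     (1u << 2)

/* policy[source][target] holds the permissions source has over target. */
static const unsigned int policy[NUM_TYPES][NUM_TYPES] = {
    [DOM0_T] = { [DOMU_T] = PERM_CREATE | PERM_SET_CONTEXT | PERM_DESTROY },
    /* The grant is an explicit, reviewable row rather than a code bypass. */
    [XEN_T]  = { [DOMU_T] = PERM_SET_CONTEXT },
};

/* Returns 0 if every requested permission is granted, -EACCES otherwise. */
int policy_check(enum sec_type src, enum sec_type tgt, unsigned int perms)
{
    return (policy[src][tgt] & perms) == perms ? 0 : -EACCES;
}
```

With this table, xen_t may impose state on a guest but may not create
one, and anything not listed is denied by default. Widening or
narrowing what xen_t can do is then a policy edit, and the check itself
stays non-bypassable.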

> Please forgive if this sounds like rubbish to you - I may not have a
> good enough understanding of the abstract constraints involved here.

No worries, it is always better to question when in doubt than to make an assumption. Hopefully I helped in providing a better explanation.

> Jan
>
Re: XSM and the idle domain
On Thu, 2020-10-22 at 13:51 +0100, Andrew Cooper wrote:
> On 21/10/2020 15:34, Hongyan Xia wrote:
> > The first question came up during ongoing work in LiveUpdate. After
> > an
> > LU, the next Xen needs to restore all domains. To do that, some
> > hypercalls need to be issued from the idle domain context and
> > apparently XSM does not like it.
>
> There is no such thing as issuing hypercalls from the idle domain
> (context or otherwise), because the idle domain does not have enough
> associated guest state for anything to make the requisite
> SYSCALL/INT80/VMCALL/VMMCALL invocation.
>
> I presume from this comment that what you mean is that you're calling
> the plain hypercall functions, context checks and everything, from
> the
> idle context?

Yep, the restore code just calls the hypercall functions from idle
context.

> If so, this is buggy for more reasons than just XSM objecting to its
> calling context, and that XSM is merely the first thing to explode.
> Therefore, I don't think modifications to XSM are applicable to
> solving
> the problem.
>
> (Of course, this is all speculation because there's no concrete
> implementation to look at.)

Another explosion is the inability to create hypercall preemption,
which for now is disabled when the calling context is the idle domain.
Apart from XSM and preemption, the LU prototype works fine. We only
reuse a limited number of hypercall functions and are not trying to be
able to call all possible hypercalls from idle.

Having a dedicated domLU just like domB (or reusing domB) sounds like a
viable option. If the overhead can be made low enough then we won't
need to work around XSM and hypercall preemption.

The original question, though, was whether XSM should interact with the
idle domain. With a good design, LU should be able to sidestep this.

Hongyan
Re: XSM and the idle domain
On Thu, Oct 22, 2020 at 1:01 PM Hongyan Xia <hx242@xen.org> wrote:
>
> On Thu, 2020-10-22 at 13:51 +0100, Andrew Cooper wrote:
> > On 21/10/2020 15:34, Hongyan Xia wrote:
> > > The first question came up during ongoing work in LiveUpdate. After
> > > an
> > > LU, the next Xen needs to restore all domains. To do that, some
> > > hypercalls need to be issued from the idle domain context and
> > > apparently XSM does not like it.
> >
> > There is no such thing as issuing hypercalls from the idle domain
> > (context or otherwise), because the idle domain does not have enough
> > associated guest state for anything to make the requisite
> > SYSCALL/INT80/VMCALL/VMMCALL invocation.
> >
> > I presume from this comment that what you mean is that you're calling
> > the plain hypercall functions, context checks and everything, from
> > the
> > idle context?
>
> Yep, the restore code just calls the hypercall functions from idle
> context.
>
> > If so, this is buggy for more reasons than just XSM objecting to its
> > calling context, and that XSM is merely the first thing to explode.
> > Therefore, I don't think modifications to XSM are applicable to
> > solving
> > the problem.
> >
> > (Of course, this is all speculation because there's no concrete
> > implementation to look at.)
>
> Another explosion is the inability to create hypercall preemption,
> which for now is disabled when the calling context is the idle domain.
> Apart from XSM and preemption, the LU prototype works fine. We only
> reuse a limited number of hypercall functions and are not trying to be
> able to call all possible hypercalls from idle.

I wonder if for domain_create, it wouldn't be better to move
xsm_domain_create() out to the domctl (hypercall entry) and check it
there. That would side-step xsm in domain_create. Flask would need
to be modified for that. I've an untested patch doing the
rearranging, which I'll send as a follow up.

What other hypercalls are you having issues with? Those could also be
refactored so the hypercall entry checks permissions, and the actual
work is done in a directly callable function.
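
The shape of that refactoring might look roughly like the sketch below.
It is not the untested patch mentioned above and does not reproduce
Xen's real signatures: struct domain, check_domain_create() (standing in
for xsm_domain_create()), do_domain_create() and domctl_create_domain()
are hypothetical names used only to show the permission check living at
the hypercall entry while the worker stays directly callable.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-in for the calling domain (assumption, not Xen's type). */
struct domain {
    bool is_privileged;
};

/* Hypothetical permission hook, standing in for xsm_domain_create(). */
int check_domain_create(const struct domain *caller)
{
    return caller->is_privileged ? 0 : -EPERM;
}

/*
 * Directly callable worker: performs the operation with no permission
 * check, so in-hypervisor code could use it without a calling domain.
 */
int do_domain_create(unsigned int domid, unsigned int *created)
{
    *created = domid;   /* placeholder for the real construction work */
    return 0;
}

/* Hypercall entry point: check the caller first, then delegate. */
int domctl_create_domain(const struct domain *caller,
                         unsigned int domid, unsigned int *created)
{
    int rc = check_domain_create(caller);

    if ( rc )
        return rc;

    return do_domain_create(domid, created);
}
```

The trade-off is visible even in the toy version: every guest-reachable
path still goes through the check, but anything calling the worker
directly is trusted by construction, so the set of such callers needs to
stay small and reviewable.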

> Having a dedicated domLU just like domB (or reusing domB) sounds like a
> viable option. If the overhead can be made low enough then we won't
> need to work around XSM and hypercall preemption.
>
> Although the question was whether XSM should interact with the idle
> domain. With a good design LU should be able to sidestep this though.

Circling back to the main topic, is the idle domain Xen, or is it
distinct? It runs in the context of Xen, so Xen isn't really in a
place to enforce policy on itself. Hongyan, as you said earlier,
applying XSM is more of a debugging feature at that point than a
security feature. And as Jan pointed out, you can have problems if
XSM prevents the hypervisor from performing an action it doesn't
expect to fail.

Regards,
Jason
Re: XSM and the idle domain
On 26/10/2020 13:37, Jason Andryuk wrote:
> On Thu, Oct 22, 2020 at 1:01 PM Hongyan Xia <hx242@xen.org> wrote:
>> On Thu, 2020-10-22 at 13:51 +0100, Andrew Cooper wrote:
>>> On 21/10/2020 15:34, Hongyan Xia wrote:
>>>> The first question came up during ongoing work in LiveUpdate. After
>>>> an
>>>> LU, the next Xen needs to restore all domains. To do that, some
>>>> hypercalls need to be issued from the idle domain context and
>>>> apparently XSM does not like it.
>>> There is no such thing as issuing hypercalls from the idle domain
>>> (context or otherwise), because the idle domain does not have enough
>>> associated guest state for anything to make the requisite
>>> SYSCALL/INT80/VMCALL/VMMCALL invocation.
>>>
>>> I presume from this comment that what you mean is that you're calling
>>> the plain hypercall functions, context checks and everything, from
>>> the
>>> idle context?
>> Yep, the restore code just calls the hypercall functions from idle
>> context.
>>
>>> If so, this is buggy for more reasons than just XSM objecting to its
>>> calling context, and that XSM is merely the first thing to explode.
>>> Therefore, I don't think modifications to XSM are applicable to
>>> solving
>>> the problem.
>>>
>>> (Of course, this is all speculation because there's no concrete
>>> implementation to look at.)
>> Another explosion is the inability to create hypercall preemption,
>> which for now is disabled when the calling context is the idle domain.
>> Apart from XSM and preemption, the LU prototype works fine. We only
>> reuse a limited number of hypercall functions and are not trying to be
>> able to call all possible hypercalls from idle.
> I wonder if for domain_create, it wouldn't be better to move
> xsm_domain_create() out to the domctl (hypercall entry) and check it
> there. That would side-step xsm in domain_create. Flask would need
> to be modified for that. I've an untested patch doing the
> rearranging, which I'll send as a follow up.
>
> What other hypercalls are you having issues with? Those could also be
> refactored so the hypercall entry checks permissions, and the actual
> work is done in a directly callable function.
>
>> Having a dedicated domLU just like domB (or reusing domB) sounds like a
>> viable option. If the overhead can be made low enough then we won't
>> need to work around XSM and hypercall preemption.
>>
>> Although the question was whether XSM should interact with the idle
>> domain. With a good design LU should be able to sidestep this though.
> Circling back to the main topic, is the idle domain Xen, or is it
> distinct?

It "is" Xen, IMO.

> It runs in the context of Xen, so Xen isn't really in a
> place to enforce policy on itself. Hongyan, as you said earlier,
> applying XSM is more of a debugging feature at that point than a
> security feature. And as Jan pointed out, you can have problems if
> XSM prevents the hypervisor from performing an action it doesn't
> expect to fail.

We have several system DOMIDs: SELF, IO, XEN, COW, INVALID and
IDLE.

SELF is a magic constant expected to be used in most hypercalls on
oneself, to simplify callers.  INVALID is also a magic constant.

The others all have struct domains allocated for them, and are concrete
objects as far as Xen is concerned.  IO/XEN/COW all exist for the
purpose of fitting into the memory/device ownership models, while IDLE
exists for the purpose of encapsulating the idle vcpus in the scheduling
model.

None of them have any kind of outside-Xen state associated with them. 
"scheduling" an idle vCPU runs the idle loop, but it is all code within
the hypervisor.
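
As a quick reference, the split described above can be written down as
follows. The numeric values are the ones I believe Xen's public header
(xen/include/public/xen.h) uses and should be checked against the tree;
the enum domid_kind and classify_domid() are hypothetical helpers that
exist only for this illustration.

```c
#include <stdint.h>

typedef uint16_t domid_t;

/* Values as recalled from Xen's public header; verify against the tree. */
#define DOMID_SELF    0x7FF0U
#define DOMID_IO      0x7FF1U
#define DOMID_XEN     0x7FF2U
#define DOMID_COW     0x7FF3U
#define DOMID_INVALID 0x7FF4U
#define DOMID_IDLE    0x7FFFU

/* Hypothetical classification mirroring the description in the text. */
enum domid_kind {
    DOMID_KIND_OTHER,          /* ordinary guests and anything not listed */
    DOMID_KIND_MAGIC,          /* constant only, no struct domain */
    DOMID_KIND_SYSTEM_DOMAIN   /* has a struct domain inside Xen */
};

enum domid_kind classify_domid(domid_t d)
{
    switch ( d )
    {
    case DOMID_SELF:
    case DOMID_INVALID:
        return DOMID_KIND_MAGIC;
    case DOMID_IO:      /* memory/device ownership model */
    case DOMID_XEN:
    case DOMID_COW:
    case DOMID_IDLE:    /* holds the idle vcpus for the scheduler */
        return DOMID_KIND_SYSTEM_DOMAIN;
    default:
        return DOMID_KIND_OTHER;
    }
}
```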

The problem here is that idle context is also used in certain "normal"
cases in Xen (startup, shutdown, possibly also for softirq/tasklet
context), all of which we (currently) expect not to be making hypercalls
from.

~Andrew