Mailing List Archive

Re: Linux guest kernel threat model for Confidential Computing [ In reply to ]
On Wed, Feb 08 2023 at 18:02, David Alan Gilbert wrote:
> * Greg Kroah-Hartman (gregkh@linuxfoundation.org) wrote:
>> Anyway, you all are just spinning in circles now. I'll just mute this
>> thread until I see an actual code change as it seems to be full of
>> people not actually sending anything we can actually do anything with.

There have been random patches posted which finally caused this
discussion to start. Wrong order obviously :)

> I think the challenge will be to come up with non-intrusive, minimal
> changes; obviously you don't want stuff shotgunned everywhere.

That has been tried by doing random surgery, e.g. caching some
particular PCI config value. While that might not look intrusive at
first glance, these kinds of piecemeal changes are the beginning of a
whack-a-mole game and will end up in an uncoordinated maze of tiny
mitigations that makes the code harder to maintain.

The real challenge is to come up with threat classes and mechanisms
which squash the whole class. Done right, e.g. caching a range of config
space values (or all of it) might give a benefit even for the bare metal
or general virtualization case.
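
As a rough sketch of what such a class-wide mechanism could look like
(hypothetical structure and helper names, not existing kernel code):
snapshot a device's standard config space once at enumeration time and
serve later reads from the copy, so the host cannot change values
underneath the guest.

#include <linux/pci.h>

#define CFG_SNAPSHOT_SIZE 256	/* standard config space */

struct pci_cfg_snapshot {
	u32 dwords[CFG_SNAPSHOT_SIZE / 4];
};

static int pci_cfg_snapshot_init(struct pci_dev *pdev,
				 struct pci_cfg_snapshot *snap)
{
	int pos, ret;

	/* One pass over the device; everything after this uses the copy. */
	for (pos = 0; pos < CFG_SNAPSHOT_SIZE; pos += 4) {
		ret = pci_read_config_dword(pdev, pos, &snap->dwords[pos / 4]);
		if (ret)
			return ret;
	}
	return 0;
}

static u32 pci_cfg_snapshot_read(const struct pci_cfg_snapshot *snap, int pos)
{
	return snap->dwords[pos / 4];	/* no device access after init */
}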

That's quite some work, but it's much more palatable than a trickle of
"fixes" when yet another source of trouble has been detected by a tool
or human inspection.

It's also more future-proof, because with the current approach of
scratching the itch of the day, the probability that the just-"mitigated"
issue comes back due to unrelated changes is very close to 100%.

It's no different from any other threat-class problem.

Thanks,

tglx
RE: Linux guest kernel threat model for Confidential Computing [ In reply to ]
> On Wed, Feb 08, 2023 at 10:16:14AM +0000, Reshetova, Elena wrote:
> > > No relation other than it would be nice to have a solution that does not
> > > require the kernel command line and that prevents __init()s.
> >
> > For __init()s, see below. For the command line, it is pretty straightforward
> > to measure it and attest its integrity later: we need to do that for other
> > parts anyhow, such as ACPI tables, so I don't see why we need to do anything
> > special about it. In any case, it is indeed very different from the driver
> > discussion and goes into the "what should be covered by attestation for a CC
> > guest" topic.
> >
> > > A more pressing concern than wasted memory, which may be unimportant, is
> > > the issue of what those driver init functions are doing. For example, as
> > > part of device setup, MMIO regs may be involved, which we cannot trust. It's
> > > a lot more code to worry about from a CoCo perspective.
> >
> > Yes, we have seen such cases in the kernel where drivers or modules access
> > MMIO or PCI config space already in their __init() functions.
> > Some concrete examples from modules and drivers (there are more):
> >
> > intel_iommu_init() -> init_dmars() -> check_tylersburg_isoch()
>
> An iommu driver. So maybe you want to use virtio iommu then?
>
> > skx_init() -> get_all_munits()
> > skx_init() -> skx_register_mci() -> skx_get_dimm_config()
>
> A memory controller driver, right? And you need it in a VM? why?
>
> > intel_rng_mod_init() -> intel_init_hw_struct()
>
> And virtio iommu?
>
> > i10nm_exit()->enable_retry_rd_err_log ->__enable_retry_rd_err_log()
>
> Another memory controller driver? Can we decide on a single one?

We don't need any of the above in a CC guest. The point was to indicate that,
with the current device filter design, we will not necessarily prevent the
__init() functions of drivers from running in a CC guest, and that we have
seen code paths in the Linux codebase that may potentially execute and consume
malicious host input already in __init() functions (luckily, most drivers do
this in probe). However, the argument I give below is why we think such
__init() functions are not that big a security problem in our case.
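
To make the class of problem concrete, here is a contrived sketch (not taken
from any real driver) of why module __init code is harder to filter than probe
code: it runs as soon as the module loads, before any device match happens, so
a probe-time device filter never sees it.

#include <linux/module.h>
#include <linux/pci.h>

static int __init contrived_init(void)
{
	struct pci_dev *pdev = NULL;

	/* Walks the bus and touches config space without any probe(). */
	while ((pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev))) {
		u32 val;

		/*
		 * Vendor-specific offset; on a malicious host this
		 * value is attacker-controlled.
		 */
		pci_read_config_dword(pdev, 0x40, &val);
	}
	return 0;
}
module_init(contrived_init);

MODULE_LICENSE("GPL");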


>
> > However, this is how we address this from a security point of view:
> >
> > 1. In order for an MMIO read to obtain data from an untrusted host, the memory
> > range must be shared with the host to begin with. We enforce that
> > all MMIO mappings are private by default to the CC guest unless they are
> > explicitly shared (and we do automatically share for the authorized devices
> > and their drivers from the allow list). This removes the problem of an
> > "unexpected MMIO region interaction" (modulo ACPI AML operation regions,
> > which we unfortunately also have to share, but ACPI is a whole different,
> > difficult case on its own).
>
> How does it remove the problem? You basically get trash from the host, no?
> But it seems that whether said trash is exploitable will really depend
> on how it's used, e.g. if it's an 8-bit value the host can just scan all
> options in a couple of hundred attempts. What did I miss?

No, it won't work like that. Guest code will never be able to consume the
garbage data written into its private memory by the host: we will get a memory
integrity violation, and the guest is killed for safety reasons. The
confidentiality and integrity of private memory are guaranteed by the CC
technology itself.
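
To illustrate the default-private MMIO policy from point 1 above, a conceptual
sketch (dev_authorized() is a made-up stand-in for the allow-list check; the
real TDX/SEV shared-bit handling lives in the arch ioremap code):

#include <linux/io.h>
#include <linux/pci.h>

/* Hypothetical allow-list check; real code consults the device filter. */
static bool dev_authorized(struct device *dev)
{
	return false;	/* default-deny */
}

static void __iomem *cc_ioremap(struct pci_dev *pdev,
				resource_size_t phys, unsigned long size)
{
	if (!dev_authorized(&pdev->dev)) {
		/*
		 * Non-authorized device: keep the mapping private
		 * (encrypted).  A host write to this range is not
		 * silently consumed; it trips the memory integrity
		 * check instead.
		 */
		return ioremap_encrypted(phys, size);
	}

	/* Authorized device: explicitly shared with the host. */
	return ioremap(phys, size);
}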

>
>
> > 2. For PCI config space, we limit any interaction with PCI config
> > space to authorized devices and their drivers only (those on the allow list).
> > As a result, device drivers outside of the allow list are not able to access
> > PCI config space even in their __init routines. It is done by setting
> > to_pci_dev(dev)->error_state = pci_channel_io_perm_failure for
> > non-authorized devices.
>
> This seems to be assuming drivers check the return code from PCI config
> space accesses, right? I doubt all drivers do, though. Even if they do,
> that's unlikely to be a well-tested path, right?

This is a good thing to double-check, thank you for pointing it out!
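
For reference, a minimal sketch of both halves of the issue (the accessor
behaviour is real kernel behaviour, the surrounding functions are made up):
setting error_state makes pci_read_config_*() fail fast with
PCIBIOS_DEVICE_NOT_FOUND and the value forced to all-ones, but only callers
that check the return value (or reject all-ones) actually benefit.

#include <linux/pci.h>

static void cc_block_unauthorized(struct pci_dev *pdev)
{
	/* All further config space accesses on this device now fail. */
	pdev->error_state = pci_channel_io_perm_failure;
}

static int careful_init(struct pci_dev *pdev)
{
	u16 vendor;
	int ret;

	ret = pci_read_config_word(pdev, PCI_VENDOR_ID, &vendor);
	if (ret)	/* notices the failure and bails out */
		return -ENODEV;

	/* A careless driver would continue here with vendor == 0xffff. */
	return 0;
}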

Best Regards,
Elena.
Re: Linux guest kernel threat model for Confidential Computing [ In reply to ]
* Thomas Gleixner (tglx@linutronix.de) wrote:
> On Wed, Feb 08 2023 at 18:02, David Alan Gilbert wrote:
> > * Greg Kroah-Hartman (gregkh@linuxfoundation.org) wrote:
> >> Anyway, you all are just spinning in circles now. I'll just mute this
> >> thread until I see an actual code change as it seems to be full of
> >> people not actually sending anything we can actually do anything with.
>
> There have been random patches posted which finally caused this
> discussion to start. Wrong order obviously :)
>
> > I think the challenge will be to come up with non-intrusive, minimal
> > changes; obviously you don't want stuff shotgunned everywhere.
>
> That has been tried by doing random surgery, e.g. caching some
> particular PCI config value. While that might not look intrusive at
> first glance, these kinds of piecemeal changes are the beginning of a
> whack-a-mole game and will end up in an uncoordinated maze of tiny
> mitigations that makes the code harder to maintain.
>
> The real challenge is to come up with threat classes and mechanisms
> which squash the whole class. Done right, e.g. caching a range of config
> space values (or all of it) might give a benefit even for the bare metal
> or general virtualization case.

Yeh, reasonable.

> That's quite some work, but it's much more palatable than a trickle of
> "fixes" when yet another source of trouble has been detected by a tool
> or human inspection.
>
> It's also more future-proof, because with the current approach of
> scratching the itch of the day, the probability that the just-"mitigated"
> issue comes back due to unrelated changes is very close to 100%.
>
> It's no different from any other threat-class problem.

I wonder if grouping/categorising the output of Intel's tool would let us
find common problematic patterns, and then try to come up with more concrete
fixes for whole classes of issues.

Dave

> Thanks,
>
> tglx
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
