Mailing List Archive

1 2  View All
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Tue, Apr 23, 2024 at 08:33:12AM +0300, Mika Westerberg wrote:
> I think what you are looking for is that anything behind a PCIe tunnel
> should not have this applied. IIRC the AMD GPU or some code there were
> going to add identification of "virtual" links to the bandwidth
> calculation functionality.

I guess I could resurrect my correlation patch:

https://lore.kernel.org/all/f53ea40a7487e145aa1a62c347cef1814072e140.1536517047.git.lukas@wunner.de/

The last time I forward-ported it was for v5.13. I still have that code
running on my development machine.

The problem is that it only allows lookup from tb_port to pci_dev.
I'd have to add a pointer to struct pci_dev to allow lookups in the
inverse direction. Though I think we have such PCI companion devices
for CXL as well, so such a pointer could be useful in general.

I'm knee-deep in PCI device authentication code but could probably
dedicate a weekend to the correlation patch if there's interest?

Once we have correlation, we can expose more precise bandwidth
for virtual PCI links in sysfs.

Thanks,

Lukas
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Tue, Apr 23, 2024 at 10:31:30AM +0200, Lukas Wunner wrote:
> On Tue, Apr 23, 2024 at 08:33:12AM +0300, Mika Westerberg wrote:
> > I think what you are looking for is that anything behind a PCIe tunnel
> > should not have this applied. IIRC the AMD GPU or some code there were
> > going to add identification of "virtual" links to the bandwidth
> > calculation functionality.
>
> I guess I could resurrect my correlation patch:
>
> https://lore.kernel.org/all/f53ea40a7487e145aa1a62c347cef1814072e140.1536517047.git.lukas@wunner.de/
>
> The last time I forward-ported it was for v5.13. I still have that code
> running on my development machine.
>
> The problem is that it only allows lookup from tb_port to pci_dev.
> I'd have to add a pointer to struct pci_dev to allow lookups in the
> inverse direction. Though I think we have such PCI companion devices
> for CXL as well, so such a pointer could be useful in general.
>
> I'm knee-deep in PCI device authentication code but could probably
> dedicate a weekend to the correlation patch if there's interest?
>
> Once we have correlation, we can expose more precise bandwidth
> for virtual PCI links in sysfs.

Sounds good to me :) There are also some additions in USB4 spec that
allows discovery of mapping between PCIe adapters and the corresponding
PCIe downstream/root port. Perhaps these can be added there too?
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On 4/23/2024 00:33, Mika Westerberg wrote:
> On Mon, Apr 22, 2024 at 02:21:18PM -0500, Mario Limonciello wrote:
>> On 4/22/2024 14:17, Esther Shimanovich wrote:
>>> Thanks for the explanation! I still don't fully understand how that
>>> would work for my use case.
>>>
>>> Perhaps it would be better for me to describe the case I am trying to
>>> protect against.
>>>
>>> To rehash, this quirk was written for devices with discrete
>>> Thunderbolt controllers.
>>>
>>> For example,
>>> CometLake_CPU -> AlpineRidge_Chip -> USB-C Port
>>> This device has the ExternalFacingPort property in ACPI.
>>> My quirk relabels the Alpine Ridge chip as "fixed" and
>>> external-facing, so that devices attached to the USB-C port could be
>>> labeled as "removable"
>>>
>>> Let's say we have a TigerLake CPU, which has integrated
>>> Thunderbolt/USB4 capabilities:
>>>
>>> TigerLake_ThunderboltCPU -> USB-C Port
>>> This device also has the ExternalFacingPort property in ACPI and lacks
>>> the usb4-host-interface property in the ACPI.
>>>
>>> My worry is that someone could take an Alpine Ridge Chip Thunderbolt
>>> Dock and attach it to the TigerLake CPU
>>>
>>> TigerLake_ThunderboltCPU -> USB-C Port -> AlpineRidge_Dock
>>>
>>> If that were to happen, this quirk would incorrectly label the Alpine
>>> Ridge Dock as "fixed" instead of "removable".
>>>
>>> My thinking was that we could prevent this scenario from occurring if
>>> we filtered this quirk not to apply on CPU's like Tiger Lake, with
>>> integrated Thunderbolt/USB4 capabilities.
>>>
>>> ExternalFacingPort is found both on the Comet Lake ACPI and also on
>>> the Tiger Lake ACPI. So I can't use that to distinguish between CPUs
>>> which don't have integrated Thunderbolt, like Comet Lake, and CPUs
>>> with integrated Thunderbolt, like Tiger Lake.
>>>
>>> I am looking for something that can tell me if the device's Root Port
>>> has the Thunderbolt controller upstream to it or not.
>>> Is there anything like that?
>>> Or perhaps should I add a check which compares the name of the
>>> device's CPU with a list of CPUs that this quirk can be applied to?
>>> Or is there some way I can identify the Thunderbolt controller, then
>>> determine if it's upstream or downstream from the root port?
>>> Or are Alpine Ridge docks not something to worry about at all?
>>
>> My thought was once you have a device as untrusted, everything else
>> connected to it should "also" be untrusted.
>
> I think what you are looking for is that anything behind a PCIe tunnel
> should not have this applied. IIRC the AMD GPU or some code there were
> going to add identification of "virtual" links to the bandwidth
> calculation functionality.
>
> @Mario, do you remember if this was done already and if that could maybe
> be re-used here?

Yeah there was a series that I worked on a few spins a while back
specifically in the context of eGPUs to identify virtual links and take
them out of bandwidth calculations.

It didn't get merged, I recall it got stalled on various feedback and I
didn't dust it off because the series also did prompt discussions about
the reasoning that amdgpu was doing this in the first place. It turned
out to be a bad assumption in the code and I instead made a change to
amdgpu to not look at the whole topology but just the link partner
(466a7d115326ece682c2b60d1c77d1d0b9010b4f if anyone is curious).

>
> The other way I think is something like this:
>
> - If it does not have "usb4-host-interface" property (or behind a port
> that has that). These are all tunneled (e.g virtual).
>
> - It is directly connected to a PCIe root port with
> "ExternalFacingPort" and it has sibling device that is "Thunderbolt
> NHI". This is because you can only have "NHI" on a host router
> according to the USB4 spec.
>
> I may be forgetting something though.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Tue, Apr 23, 2024 at 11:59:36AM -0500, Mario Limonciello wrote:
> On 4/23/2024 00:33, Mika Westerberg wrote:
> > On Mon, Apr 22, 2024 at 02:21:18PM -0500, Mario Limonciello wrote:
> > > On 4/22/2024 14:17, Esther Shimanovich wrote:
> > > > Thanks for the explanation! I still don't fully understand how that
> > > > would work for my use case.
> > > >
> > > > Perhaps it would be better for me to describe the case I am trying to
> > > > protect against.
> > > >
> > > > To rehash, this quirk was written for devices with discrete
> > > > Thunderbolt controllers.
> > > >
> > > > For example,
> > > > CometLake_CPU -> AlpineRidge_Chip -> USB-C Port
> > > > This device has the ExternalFacingPort property in ACPI.
> > > > My quirk relabels the Alpine Ridge chip as "fixed" and
> > > > external-facing, so that devices attached to the USB-C port could be
> > > > labeled as "removable"
> > > >
> > > > Let's say we have a TigerLake CPU, which has integrated
> > > > Thunderbolt/USB4 capabilities:
> > > >
> > > > TigerLake_ThunderboltCPU -> USB-C Port
> > > > This device also has the ExternalFacingPort property in ACPI and lacks
> > > > the usb4-host-interface property in the ACPI.
> > > >
> > > > My worry is that someone could take an Alpine Ridge Chip Thunderbolt
> > > > Dock and attach it to the TigerLake CPU
> > > >
> > > > TigerLake_ThunderboltCPU -> USB-C Port -> AlpineRidge_Dock
> > > >
> > > > If that were to happen, this quirk would incorrectly label the Alpine
> > > > Ridge Dock as "fixed" instead of "removable".
> > > >
> > > > My thinking was that we could prevent this scenario from occurring if
> > > > we filtered this quirk not to apply on CPU's like Tiger Lake, with
> > > > integrated Thunderbolt/USB4 capabilities.
> > > >
> > > > ExternalFacingPort is found both on the Comet Lake ACPI and also on
> > > > the Tiger Lake ACPI. So I can't use that to distinguish between CPUs
> > > > which don't have integrated Thunderbolt, like Comet Lake, and CPUs
> > > > with integrated Thunderbolt, like Tiger Lake.
> > > >
> > > > I am looking for something that can tell me if the device's Root Port
> > > > has the Thunderbolt controller upstream to it or not.
> > > > Is there anything like that?
> > > > Or perhaps should I add a check which compares the name of the
> > > > device's CPU with a list of CPUs that this quirk can be applied to?
> > > > Or is there some way I can identify the Thunderbolt controller, then
> > > > determine if it's upstream or downstream from the root port?
> > > > Or are Alpine Ridge docks not something to worry about at all?
> > >
> > > My thought was once you have a device as untrusted, everything else
> > > connected to it should "also" be untrusted.
> >
> > I think what you are looking for is that anything behind a PCIe tunnel
> > should not have this applied. IIRC the AMD GPU or some code there were
> > going to add identification of "virtual" links to the bandwidth
> > calculation functionality.
> >
> > @Mario, do you remember if this was done already and if that could maybe
> > be re-used here?
>
> Yeah there was a series that I worked on a few spins a while back
> specifically in the context of eGPUs to identify virtual links and take them
> out of bandwidth calculations.
>
> It didn't get merged, I recall it got stalled on various feedback and I
> didn't dust it off because the series also did prompt discussions about the
> reasoning that amdgpu was doing this in the first place. It turned out to
> be a bad assumption in the code and I instead made a change to amdgpu to not
> look at the whole topology but just the link partner
> (466a7d115326ece682c2b60d1c77d1d0b9010b4f if anyone is curious).

Okay that makes sense. Thanks!
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Thank you for all your help!

On Tue, Apr 23, 2024 at 1:33?AM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
>
> The other way I think is something like this:
>
> - If it does not have "usb4-host-interface" property (or behind a port
> that has that). These are all tunneled (e.g virtual).
>
> - It is directly connected to a PCIe root port with
> "ExternalFacingPort" and it has sibling device that is "Thunderbolt
> NHI". This is because you can only have "NHI" on a host router
> according to the USB4 spec.
>
I did find one example of a docking station that uses the DSL6540
chip, which has PCI IDs defined in include/linux/pci_ids.h:
#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_NHI 0x1577
#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_BRIDGE 0x1578
It seems like it has an NHI, despite being in an external, removable
docking station. This appears to contradict what you say about only
having "NHI" on a host router. I am assuming that by host router, you
mean the fixed discrete, fixed thunderbolt chip, or the thunderbolt
controller upstream to the root port. Please correct me if I got
anything wrong!

Looking at 18-241_ThunderboltController_Brief_HI.pdf, it seems like
these Alpine Ridge chips can be used either on a computer or a
peripheral. (Expected usage: Computer or peripheral)
So I'm not sure if finding an NHI would guarantee that the device is
not a peripheral. My original question was how to distinguish a
Thunderbolt controller that is on a removable peripheral, like a
docking station-- from one that is a discrete chip fixed to a computer
or upstream to the root port.

So unless I am misunderstanding something, it appears that my only
option is waiting for Lukas's patches. Please correct me if that is
not the case!
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Hi,

On Thu, Apr 25, 2024 at 05:16:24PM -0400, Esther Shimanovich wrote:
> Thank you for all your help!
>
> On Tue, Apr 23, 2024 at 1:33?AM Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
> >
> > The other way I think is something like this:
> >
> > - If it does not have "usb4-host-interface" property (or behind a port
> > that has that). These are all tunneled (e.g virtual).
> >
> > - It is directly connected to a PCIe root port with
> > "ExternalFacingPort" and it has sibling device that is "Thunderbolt
> > NHI". This is because you can only have "NHI" on a host router
> > according to the USB4 spec.
> >
> I did find one example of a docking station that uses the DSL6540
> chip, which has PCI IDs defined in include/linux/pci_ids.h:
> #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_NHI 0x1577
> #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_BRIDGE 0x1578
> It seems like it has an NHI, despite being in an external, removable
> docking station. This appears to contradict what you say about only
> having "NHI" on a host router. I am assuming that by host router, you
> mean the fixed discrete, fixed thunderbolt chip, or the thunderbolt
> controller upstream to the root port. Please correct me if I got
> anything wrong!

So it goes same way with other discrete chips from Intel at least. It is
the same silicon but the NHI is disabled on device routers.

That said, it is entirely possible for a "malicious" device to pretend
to have one so we need to be careful.

> Looking at 18-241_ThunderboltController_Brief_HI.pdf, it seems like
> these Alpine Ridge chips can be used either on a computer or a
> peripheral. (Expected usage: Computer or peripheral)

Yes as above. Most our chips are such.

> So I'm not sure if finding an NHI would guarantee that the device is
> not a peripheral. My original question was how to distinguish a
> Thunderbolt controller that is on a removable peripheral, like a
> docking station-- from one that is a discrete chip fixed to a computer
> or upstream to the root port.

Yes that's the problem. We can figure that out from full USB4 system by
looking at the "usb4-host-interface" ACPI _DSD properties of the
tunneled PCIe Root/Downstream ports. But this does not work with the
pre-USB4 hosts so that requires some sort of heuristics, unfortunately.

> So unless I am misunderstanding something, it appears that my only
> option is waiting for Lukas's patches. Please correct me if that is
> not the case!

I think with Lukas' patches (Lukas please correct me if I got that
wrong) you can find out the PCIe ports which have their link going over
a tunnel. His series works with pre-USB4 devices so that should cover
your case. (In addition to "usb4-host-interface" ACPI _DSD property there
is a special router operation that allows extracting the same PCIe
downstream port mapping that Lukas' patches are doing from DROM so this
should also allow identify all tunneled links.)

This way you can identify the xHCI (well and NHI) that are not behind
PCIe tunnel so that should mean they are really part of the host system
(being soldered or plugged to the PCIe slot or so). If I understood
right this is what you were looking for, correct?
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
> This way you can identify the xHCI (well and NHI) that are not behind
> PCIe tunnel so that should mean they are really part of the host system
> (being soldered or plugged to the PCIe slot or so). If I understood
> right this is what you were looking for, correct?

Yes! Thank you for the explanation.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Fri, Apr 26, 2024 at 07:52:07AM +0300, Mika Westerberg wrote:
> On Thu, Apr 25, 2024 at 05:16:24PM -0400, Esther Shimanovich wrote:
> > I did find one example of a docking station that uses the DSL6540
> > chip, which has PCI IDs defined in include/linux/pci_ids.h:
> > #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_NHI 0x1577
> > #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_BRIDGE 0x1578
> > It seems like it has an NHI, despite being in an external, removable
> > docking station. This appears to contradict what you say about only
> > having "NHI" on a host router. I am assuming that by host router, you
> > mean the fixed discrete, fixed thunderbolt chip, or the thunderbolt
> > controller upstream to the root port. Please correct me if I got
> > anything wrong!
>
> So it goes same way with other discrete chips from Intel at least. It is
> the same silicon but the NHI is disabled on device routers.
>
> That said, it is entirely possible for a "malicious" device to pretend
> to have one so we need to be careful.

If a device (accidentally or maliciously) exposes an NHI, the thunderbolt
driver will try to bind to it.

Do we take any precautions to prevent that?

AFAICS we'd be allocating a duplicate root_switch with route 0.
Seems dangerous if two driver instances talk to the same Root Switch.

This looks like something that needs to be checked by Intel Validation
Engineering folks. Have they been testing this? (+cc Gil)

Thanks,

Lukas
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Thu, Apr 25, 2024 at 05:16:24PM -0400, Esther Shimanovich wrote:
> So unless I am misunderstanding something, it appears that my only
> option is waiting for Lukas's patches.

I've just pushed this branch:

https://github.com/l1k/linux/commits/thunderbolt_associate_v1

But it's not even compile-tested yet. 0-day is crunching through it now
and I'll force-push if/when fixing bugs. It also doesn't contain support
yet for the "Get PCIe Downstream Entry" functionality that's apparently
needed for USB4.

The branch marks PCIe Downstream Adapters on a Root Switch as fixed and
external_facing.

However that doesn't appear to be sufficient: I notice that in your
patch, you're also clearing the external_facing bit on the Root Port
above the discrete host controller.

Additionally you're marking the bridges leading to the NHI and XHCI
as fixed. You're also marking the NHI and XHCI themselves as fixed
and external_facing.

I just want to confirm whether all of this is actually necessary.
At least marking the NHI and XHCI as external_facing seems nonsensical.
I need to know the *minimal* set of attribute changes to fix the problem.

I also need to know exactly what the user-visible issue is and how
it comes about. Your commit message merely says "the build-in USB-C
ports stop working at all". Does that mean that no driver is bound
to the NHI or XHCI?

The solution implemented by the above-linked branch hinges on the
NHI driver fixing up the device attributes. But if the NHI driver
is not bound, it can't fix up the attributes.

I notice that commit 86eaf4a5b431 ("thunderbolt: Make iommu_dma_protection
more accurate") bemoans the inability to determine external_facing ports
clearly. The above-linked branch may solve that, so it seems there may
be overlap between the issue discussed here and the one that the commit
sought to solve.

Thanks,

Lukas
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Sat, Apr 27, 2024 at 07:35:33AM +0200, Lukas Wunner wrote:
> On Fri, Apr 26, 2024 at 07:52:07AM +0300, Mika Westerberg wrote:
> > On Thu, Apr 25, 2024 at 05:16:24PM -0400, Esther Shimanovich wrote:
> > > I did find one example of a docking station that uses the DSL6540
> > > chip, which has PCI IDs defined in include/linux/pci_ids.h:
> > > #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_NHI 0x1577
> > > #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_BRIDGE 0x1578
> > > It seems like it has an NHI, despite being in an external, removable
> > > docking station. This appears to contradict what you say about only
> > > having "NHI" on a host router. I am assuming that by host router, you
> > > mean the fixed discrete, fixed thunderbolt chip, or the thunderbolt
> > > controller upstream to the root port. Please correct me if I got
> > > anything wrong!
> >
> > So it goes same way with other discrete chips from Intel at least. It is
> > the same silicon but the NHI is disabled on device routers.
> >
> > That said, it is entirely possible for a "malicious" device to pretend
> > to have one so we need to be careful.
>
> If a device (accidentally or maliciously) exposes an NHI, the thunderbolt
> driver will try to bind to it.
>
> Do we take any precautions to prevent that?

Not at the moment but it will be behind the IOMMU so it cannot access
any other memory that what is reserved for it.

> AFAICS we'd be allocating a duplicate root_switch with route 0.
> Seems dangerous if two driver instances talk to the same Root Switch.

I don't think it even works because it cannot have topology ID of 0 if
it is a device router which it is in this case since it is being first
enumerated by the "real" host.

That said we have not tested this either so can be 100% sure.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Thu, Apr 25, 2024 at 05:16:24PM -0400, Esther Shimanovich wrote:
> I did find one example of a docking station that uses the DSL6540
> chip, which has PCI IDs defined in include/linux/pci_ids.h:
> #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_NHI 0x1577
> #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_BRIDGE 0x1578
> It seems like it has an NHI, despite being in an external, removable
> docking station.

Could you provide full output of dmesg and lspci -vvv of a machine
with this docking station attached?

Perhaps open a bug at bugzilla.kernel.org and attach it there?

Could you then try the below patch and see if it prevents the
Thunderbolt driver from binding to the hot-plugged device?

Thanks!

-- >8 --
From a10a294a650232c16447a43c2b591f34d3cb399f Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas@wunner.de>
Date: Sat, 27 Apr 2024 16:24:18 +0200
Subject: [PATCH] thunderbolt: Do not bind to NHI exposed by a hot-plugged
device

An NHI should only be exposed by Thunderbolt host controllers, not by
hot-plugged devices.

Avoid binding to an NHI erroneously or maliciously exposed by a device
by looking at the upstream port of its switch:

On a host controller, the upstream port is of type TB_TYPE_NHI, whereas
on hot-plugged devices it is of type TB_TYPE_PORT.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org
---
drivers/thunderbolt/tb.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
index e00e62de53f3..d95ff9ed4d96 100644
--- a/drivers/thunderbolt/tb.c
+++ b/drivers/thunderbolt/tb.c
@@ -2786,6 +2786,13 @@ static int tb_start(struct tb *tb, bool reset)
if (IS_ERR(tb->root_switch))
return PTR_ERR(tb->root_switch);

+ /* NHI erroneously exposed by a hot-plugged device? */
+ if (!tb_port_is_nhi(tb_upstream_port(tb->root_switch))) {
+ tb_err(tb, "not a host controller\n");
+ tb_switch_put(tb->root_switch);
+ return -ENODEV;
+ }
+
/*
* ICM firmware upgrade needs running firmware and in native
* mode that is not available so disable firmware upgrade of the
--
2.43.0
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Sat, Apr 27, 2024 at 3:17?AM Lukas Wunner <lukas@wunner.de> wrote:
>
> However that doesn't appear to be sufficient: I notice that in your
> patch, you're also clearing the external_facing bit on the Root Port
> above the discrete host controller.

Rajat (rajatja@google.com) in an internal review had suggested I add
that, and leave it up to kernel maintainers to decide if it's strictly
necessary.

> Additionally you're marking the bridges leading to the NHI and XHCI
> as fixed. You're also marking the NHI and XHCI themselves as fixed
> and external_facing.
>
> I just want to confirm whether all of this is actually necessary.
> At least marking the NHI and XHCI as external_facing seems nonsensical.
> I need to know the *minimal* set of attribute changes to fix the problem.

Labeling the NHI and XHCI external-facing was my mistake as I got the
ASCII diagram wrong in the beginning. (NHI and XHCI should not be
shown as connected to the USB-C port). If you look at my latest draft
of the patch (included in one of my messages in this email chain), you
will see that I ended up removing that mistake.
https://lore.kernel.org/lkml/CA+Y6NJGz6f8hE4kRDUTGgCj+jddUyHeD_8ocFBkARr7w90jmBw@mail.gmail.com/

I labeled the bridges leading to the NHI and XHCI as fixed because
they are technically part of the discrete chip, and fixed. I do
understand that if they were labeled as “removable”, that *might* not
affect anything even though that is not technically true. Happy to
leave that decision up to what you think makes more sense.

> I also need to know exactly what the user-visible issue is and how
> it comes about. Your commit message merely says "the build-in USB-C
> ports stop working at all". Does that mean that no driver is bound
> to the NHI or XHCI?

That is correct, when the user-visible issue occurs, no driver is
bound to the NHI and XHCI. The discrete JHL chip is not permitted to
attach to the external-facing root port because of the security
policy, so the NHI and XHCI are not seen by the computer.

When the user connects a non-thunderbolt peripheral to the USB-C port
that is downstream from the JHL chip, nothing happens in the logs, and
the computer does not see the peripheral. However, power chargers
still are able to charge the device through that port.

> The solution implemented by the above-linked branch hinges on the
> NHI driver fixing up the device attributes. But if the NHI driver
> is not bound, it can't fix up the attributes.

I could try to experiment with your patch, see what happens, and if I
can get around that.

On Sat, Apr 27, 2024 at 11:09?AM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Thu, Apr 25, 2024 at 05:16:24PM -0400, Esther Shimanovich wrote:
> > I did find one example of a docking station that uses the DSL6540
> > chip, which has PCI IDs defined in include/linux/pci_ids.h:
> > #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_NHI 0x1577
> > #define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_4C_BRIDGE 0x1578
> > It seems like it has an NHI, despite being in an external, removable
> > docking station.
>
> Could you provide full output of dmesg and lspci -vvv of a machine
> with this docking station attached?
>
> Perhaps open a bug at bugzilla.kernel.org and attach it there?
>

I don’t have this device available at my office. I just saw that
StarTech sells a universal laptop docking station with chipset-id
Intel - Alpine Ridge DSL6540. Then I looked up the device, and found
it here: https://linux-hardware.org/?id=pci:8086-1577-8086-0000

Therefore, I concluded that the DSL6540 has an NHI component.

If these logs are important, I could probably make a case to purchase
that docking station and get the info that you need. Please let me
know!

> Could you then try the below patch and see if it prevents the
> Thunderbolt driver from binding to the hot-plugged device?

I could give it a try. Thank you!
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Hi,

On Wed, May 01, 2024 at 06:23:28PM -0400, Esther Shimanovich wrote:
> I don’t have this device available at my office. I just saw that
> StarTech sells a universal laptop docking station with chipset-id
> Intel - Alpine Ridge DSL6540. Then I looked up the device, and found
> it here: https://linux-hardware.org/?id=pci:8086-1577-8086-0000
>
> Therefore, I concluded that the DSL6540 has an NHI component.

Okay understood. Yes Alpine Ridge can be both host and device router. In
device configuration such as the above it does not expose NHI. If it is
host as in the above list you shared then it includes one.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On 5/1/2024 23:38, Mika Westerberg wrote:
> Hi,
>
> On Wed, May 01, 2024 at 06:23:28PM -0400, Esther Shimanovich wrote:
>> I don’t have this device available at my office. I just saw that
>> StarTech sells a universal laptop docking station with chipset-id
>> Intel - Alpine Ridge DSL6540. Then I looked up the device, and found
>> it here: https://linux-hardware.org/?id=pci:8086-1577-8086-0000
>>
>> Therefore, I concluded that the DSL6540 has an NHI component.
>
> Okay understood. Yes Alpine Ridge can be both host and device router. In
> device configuration such as the above it does not expose NHI. If it is
> host as in the above list you shared then it includes one.

There are different PCI IDs for AR for host vs device though, right?
But I guess that could technically be spoofed.
Is there a fixed PCI ID for the RP used to tunnel for host AR?
If so you could special case that anything connected to that PCI ID for
RP used to tunnel isn't trusted.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Thu, May 02, 2024 at 04:54:56AM -0500, Mario Limonciello wrote:
> On 5/1/2024 23:38, Mika Westerberg wrote:
> > Hi,
> >
> > On Wed, May 01, 2024 at 06:23:28PM -0400, Esther Shimanovich wrote:
> > > I don’t have this device available at my office. I just saw that
> > > StarTech sells a universal laptop docking station with chipset-id
> > > Intel - Alpine Ridge DSL6540. Then I looked up the device, and found
> > > it here: https://linux-hardware.org/?id=pci:8086-1577-8086-0000
> > >
> > > Therefore, I concluded that the DSL6540 has an NHI component.
> >
> > Okay understood. Yes Alpine Ridge can be both host and device router. In
> > device configuration such as the above it does not expose NHI. If it is
> > host as in the above list you shared then it includes one.
>
> There are different PCI IDs for AR for host vs device though, right?

No they are the same. Well I think Titan Ridge has different for device.

Here's my system with AR host and AR dock:

06:00.0 0604: 8086:15d3 (rev 02)
07:00.0 0604: 8086:15d3 (rev 02)
07:01.0 0604: 8086:15d3 (rev 02)
07:02.0 0604: 8086:15d3 (rev 02)
07:04.0 0604: 8086:15d3 (rev 02)
08:00.0 0880: 8086:15d2 (rev 02)

09:00.0 0604: 8086:15d3 (rev 02)
0a:00.0 0604: 8086:15d3 (rev 02)
0a:01.0 0604: 8086:15d3 (rev 02)
0a:02.0 0604: 8086:15d3 (rev 02)
0a:03.0 0604: 8086:15d3 (rev 02)
0a:04.0 0604: 8086:15d3 (rev 02)

> But I guess that could technically be spoofed.
> Is there a fixed PCI ID for the RP used to tunnel for host AR?

No, I don't think so. The Root Port can be anything even non-Intel.

> If so you could special case that anything connected to that PCI ID for RP
> used to tunnel isn't trusted.

None of the PCIe tunnels are trusted. We have IOMMU already in place
that blocks accesses outside. Also it is not possible to have anoter
host router in a domain because:

* The connection manager in the real host router needs to enumerate the
device router before any tunnels can be established.

* Once it does that it has TopologyID > 0 so it is not a host router and
cannot receive packets with CM bit set in the route string.

* If it still exposes a PCIe endpoint with NHI that is behind a PCIe
tunnel from the host router PCIe downstream port so it is behind the full IOMMU
mappings (as these ports are "external_facing").

Yes if there is such PCIe device the Thunderbolt driver may attach to it
but I don't see what the harm would be since it cannot really affect the
topology (except maybe trying to crash the driver by sending unexpected
replies, or so).
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Wed, May 01, 2024 at 06:23:28PM -0400, Esther Shimanovich wrote:
> On Sat, Apr 27, 2024 at 3:17AM Lukas Wunner <lukas@wunner.de> wrote:
> That is correct, when the user-visible issue occurs, no driver is
> bound to the NHI and XHCI. The discrete JHL chip is not permitted to
> attach to the external-facing root port because of the security
> policy, so the NHI and XHCI are not seen by the computer.

Could you rework your patch to only rectify the NHI's and XHCI's
device properties and leave the bridges untouched?

The thunderbolt driver will then rectify the bridge's properties
using the patches on this branch (particularly the one named
"thunderbolt: Mark PCIe Adapters on Root Switch as non-removable"):

https://github.com/l1k/linux/commits/thunderbolt_associate_v1

This approach keeps most of the code in the thunderbolt driver
(which has a very clear picture which PCI bridges belong to the
Host Router and which to Device Routers). The footprint in the
PCI core is thus kept minimal, which increases upstream
acceptability of your patch.

You can match the NHI using DECLARE_PCI_FIXUP_CLASS_FINAL():

* Search for PCI_CLASS_SERIAL_USB_USB4 with class shift 0
to match a USB4 Host Interface from any vendor.
* Seach for PCI_CLASS_SYSTEM_OTHER with class shift 8
to match a Thunderbolt 1 to 3 Host Interface.
I recommend checking the is_thunderbolt bit on those devices
to avoid matching a non-NHI.

Then fixup the device properties of the NHI so that it can bind.

To also rectify the properties of the XHCI, you'd have to use
pci_upstream_bridge() to find the Downstream Port above, check
whether that's non-NULL. The bus on which the Downstream Port
resides is pdev->bus. On all Host Routers I know, the XHCI is
below slot 02.0 on that bus, so you could use pci_get_slot()
to find that Downstream Port, then use pci_get_slot() again
to find slot 00.0 on that bridge's subordinate bus. If that
device has class PCI_CLASS_SERIAL_USB_XHCI, you've found the
XHCI and can rectify its properties. Device references acquired
with pci_get_*() need to be returned with pci_dev_put().

The quirk should be #ifdef'ed to CONFIG_ACPI. Alternatively,
it could be declared in pci-acpi.c near pci_acpi_set_external_facing().


> > However that doesn't appear to be sufficient: I notice that in your
> > patch, you're also clearing the external_facing bit on the Root Port
> > above the discrete host controller.
>
> Rajat (rajatja@google.com) in an internal review had suggested I add
> that, and leave it up to kernel maintainers to decide if it's strictly
> necessary.

I'd recommend to leave the Root Port's properties untouched
unless that's necessary.


> I don???t have this device available at my office. I just saw that
> StarTech sells a universal laptop docking station with chipset-id
> Intel - Alpine Ridge DSL6540. Then I looked up the device, and found
> it here: https://linux-hardware.org/?id=pci:8086-1577-8086-0000
>
> Therefore, I concluded that the DSL6540 has an NHI component.
>
> If these logs are important, I could probably make a case to purchase
> that docking station and get the info that you need. Please let me
> know!

Never mind, this scenario is being tested internally at Intel
and the above-linked branch contains a commit to avoid binding
to a Host Interface exposed by a Device Router.

Thanks,

Lukas

1 2  View All