Mailing List Archive

[PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8
On Lenovo X1 Carbon Gen 7/8 devices, when a platform enables a policy to
distrust removable PCI devices, the build-in USB-C ports stop working at
all.
This happens because these X1 Carbon models have a unique feature; a
Thunderbolt controller that is discrete from the SoC. The software sees
this controller, and incorrectly assumes it is a removable PCI device,
even though it is fixed to the computer and is wired to the computer's
own USB-C ports.

Relabel all the components of the JHL6540 controller as DEVICE_FIXED,
and where applicable, external_facing.

Ensure that the security policy to distrust external PCI devices works
as intended, and that the device's USB-C ports are able to enumerate
even when the policy is enabled.

Signed-off-by: Esther Shimanovich <eshimanovich@chromium.org>
---
Changes in v4:
- replaced a dmi check in the rootport quirk with a subsystem vendor and
device check.
- Link to v3: https://lore.kernel.org/r/20231220-thunderbolt-pci-patch-4-v3-1-056fd1717d06@chromium.org

Changes in v3:
- removed redundant dmi check, as the subsystem vendor check is
sufficient
- switched to PCI_VENDOR_ID_LENOVO instead of hex code
- Link to v2: https://lore.kernel.org/r/20231219-thunderbolt-pci-patch-4-v2-1-ec2d7af45a9b@chromium.org

Changes in v2:
- nothing new, v1 was just a test run to see if the ASCII diagram would
be rendered properly in mutt and k-9
- for folks using gmail, make sure to select "show original" on the top
right, as otherwise the diagram will be garbled by the standard
non-monospace font
- Link to v1: https://lore.kernel.org/r/20231219-thunderbolt-pci-patch-4-v1-1-4e8e3773f0a9@chromium.org
---
drivers/pci/quirks.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 112 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ea476252280a..34e43323ff14 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3873,6 +3873,118 @@ DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL,
quirk_apple_poweroff_thunderbolt);
#endif

+/*
+ * On most ThinkPad Carbon 7/8s, JHL6540 Thunderbolt 3 bridges are set
+ * incorrectly as DEVICE_REMOVABLE despite being built into the device.
+ * This is the side effect of a unique hardware configuration.
+ *
+ * Normally, Thunderbolt functionality is integrated to the SoC and
+ * its root ports.
+ *
+ * Most devices:
+ * root port --> USB-C port
+ *
+ * But X1 Carbon Gen 7/8 uses Whiskey Lake and Comet Lake SoC, which
+ * don't come with Thunderbolt functionality. Therefore, a discrete
+ * Thunderbolt Host PCI controller was added between the root port and
+ * the USB-C port.
+ *
+ * Thinkpad Carbon 7/8s
+ * (w/ Whiskey Lake and Comet Lake SoC):
+ * root port --> JHL6540 --> USB-C port
+ *
+ * Because the root port is labeled by FW as "ExternalFacingPort", as
+ * required by the DMAR ACPI spec, the JHL6540 chip is inaccurately
+ * labeled as DEVICE_REMOVABLE by the kernel pci driver.
+ * Therefore, the built-in USB-C ports do not enumerate when policies
+ * forbidding external pci devices are enforced.
+ *
+ * This fix relabels the pci components in the built-in JHL6540 chip as
+ * DEVICE_FIXED, ensuring that the built-in USB-C ports always enumerate
+ * properly as intended.
+ *
+ * This fix also labels the external facing components of the JHL6540 as
+ * external_facing, so that the pci attach policy works as intended.
+ *
+ * The ASCII diagram below describes the pci layout of the JHL6540 chip.
+ *
+ * Root Port
+ * [8086:02b4] or [8086:9db4]
+ * |
+ * JHL6540 Chip
+ * __________________________________________________
+ * | Bridge |
+ * | PCI ID -> [8086:15d3] |
+ * | DEVFN -> (00) |
+ * | _________________|__________________ |
+ * | | | | | |
+ * | Bridge Bridge Bridge Bridge |
+ * | [8086:15d3] [8086:15d3] [8086:15d3] [8086:15d3] |
+ * | (00) (08) (10) (20) |
+ * | | | | | |
+ * | NHI | USB Controller | |
+ * | [8086:15d2] | [8086:15d4] | |
+ * | (00) | (00) | |
+ * | |___________| |___________| |
+ * |____________|________________________|____________|
+ * | |
+ * USB-C Port USB-C Port
+ *
+ *
+ * Based on what a JHL6549 pci component's pci id, subsystem device id
+ * and devfn are, we can infer if it is fixed and if it faces a usb port;
+ * which would mean it is external facing.
+ * This quirk uses these values to identify the pci components and set the
+ * properties accordingly.
+ */
+static void carbon_X1_fixup_relabel_alpine_ridge(struct pci_dev *dev)
+{
+ /* Is this JHL6540 PCI component embedded in a Lenovo device? */
+ if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
+ return;
+
+ /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
+ if (dev->subsystem_device != 0x22be && // Gen 8
+ dev->subsystem_device != 0x2292) { // Gen 7
+ return;
+ }
+
+ dev_set_removable(&dev->dev, DEVICE_FIXED);
+
+ /* Not all 0x15d3 components are external facing */
+ if (dev->device == 0x15d3 &&
+ dev->devfn != 0x08 &&
+ dev->devfn != 0x20) {
+ return;
+ }
+
+ dev->external_facing = true;
+}
+
+/*
+ * We also need to relabel the root port as a consequence of changing
+ * the JHL6540's PCIE hierarchy.
+ */
+static void carbon_X1_fixup_rootport_not_removable(struct pci_dev *dev)
+{
+ /* Is this JHL6540 PCI component embedded in a Lenovo device? */
+ if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
+ return;
+
+ /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
+ if (dev->subsystem_device != 0x22be && // Gen 8
+ dev->subsystem_device != 0x2292) { // Gen 7
+ return;
+ }
+
+ dev->external_facing = false;
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d3, carbon_X1_fixup_relabel_alpine_ridge);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d2, carbon_X1_fixup_relabel_alpine_ridge);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d4, carbon_X1_fixup_relabel_alpine_ridge);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x02b4, carbon_X1_fixup_rootport_not_removable);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x9db4, carbon_X1_fixup_rootport_not_removable);
+
/*
* Following are device-specific reset methods which can be used to
* reset a single function if other methods (e.g. FLR, PM D0->D3) are

---
base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
change-id: 20231219-thunderbolt-pci-patch-4-ede71cb833c4

Best regards,
--
Esther Shimanovich <eshimanovich@chromium.org>
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Thu, Dec 21, 2023 at 03:53:42PM -0500, Esther Shimanovich wrote:
> +static void carbon_X1_fixup_relabel_alpine_ridge(struct pci_dev *dev)
> +{
> + /* Is this JHL6540 PCI component embedded in a Lenovo device? */
> + if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
> + return;
> +
> + /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
> + if (dev->subsystem_device != 0x22be && // Gen 8
> + dev->subsystem_device != 0x2292) { // Gen 7
> + return;
> + }
> +
> + dev_set_removable(&dev->dev, DEVICE_FIXED);
> +
> + /* Not all 0x15d3 components are external facing */
> + if (dev->device == 0x15d3 &&
> + dev->devfn != 0x08 &&
> + dev->devfn != 0x20) {
> + return;
> + }
> +
> + dev->external_facing = true;
> +}
> +
> +/*
> + * We also need to relabel the root port as a consequence of changing
> + * the JHL6540's PCIE hierarchy.
> + */
> +static void carbon_X1_fixup_rootport_not_removable(struct pci_dev *dev)
> +{
> + /* Is this JHL6540 PCI component embedded in a Lenovo device? */
> + if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
> + return;
> +
> + /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
> + if (dev->subsystem_device != 0x22be && // Gen 8
> + dev->subsystem_device != 0x2292) { // Gen 7
> + return;
> + }

This ventures into the realm of nitpicking, but this can be factored out
into a helper.

I'll shut up now and let PCI folks handle this.

Thanks.

--
Dmitry
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Thu, Dec 21, 2023 at 03:53:42PM -0500, Esther Shimanovich wrote:
> On Lenovo X1 Carbon Gen 7/8 devices, when a platform enables a policy to
> distrust removable PCI devices, the build-in USB-C ports stop working at
> all.
> This happens because these X1 Carbon models have a unique feature; a
> Thunderbolt controller that is discrete from the SoC. The software sees
> this controller, and incorrectly assumes it is a removable PCI device,
> even though it is fixed to the computer and is wired to the computer's
> own USB-C ports.
>
> Relabel all the components of the JHL6540 controller as DEVICE_FIXED,
> and where applicable, external_facing.
>
> Ensure that the security policy to distrust external PCI devices works
> as intended, and that the device's USB-C ports are able to enumerate
> even when the policy is enabled.

Thanks for all your work here.

This is going to be a maintenance problem. We typically use quirks to
work around hardware defects, e.g., a device that advertises a feature
that doesn't work per spec.

But I don't see where the defect is here. And I doubt that this is
really a unique situation. So it's likely that this will happen on
other systems, and we don't want to have to add quirks every time
another one shows up.

If this is a firmware defect, e.g., if this platform is using
"ExternalFacingPort" incorrectly, we can add a quirk to work around
that, too. But I'm not sure that's the case.

> Signed-off-by: Esther Shimanovich <eshimanovich@chromium.org>
> ---
> Changes in v4:
> - replaced a dmi check in the rootport quirk with a subsystem vendor and
> device check.
> - Link to v3: https://lore.kernel.org/r/20231220-thunderbolt-pci-patch-4-v3-1-056fd1717d06@chromium.org
>
> Changes in v3:
> - removed redundant dmi check, as the subsystem vendor check is
> sufficient
> - switched to PCI_VENDOR_ID_LENOVO instead of hex code
> - Link to v2: https://lore.kernel.org/r/20231219-thunderbolt-pci-patch-4-v2-1-ec2d7af45a9b@chromium.org
>
> Changes in v2:
> - nothing new, v1 was just a test run to see if the ASCII diagram would
> be rendered properly in mutt and k-9
> - for folks using gmail, make sure to select "show original" on the top
> right, as otherwise the diagram will be garbled by the standard
> non-monospace font
> - Link to v1: https://lore.kernel.org/r/20231219-thunderbolt-pci-patch-4-v1-1-4e8e3773f0a9@chromium.org
> ---
> drivers/pci/quirks.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 112 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index ea476252280a..34e43323ff14 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3873,6 +3873,118 @@ DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL,
> quirk_apple_poweroff_thunderbolt);
> #endif
>
> +/*
> + * On most ThinkPad Carbon 7/8s, JHL6540 Thunderbolt 3 bridges are set
> + * incorrectly as DEVICE_REMOVABLE despite being built into the device.
> + * This is the side effect of a unique hardware configuration.
> + *
> + * Normally, Thunderbolt functionality is integrated to the SoC and
> + * its root ports.
> + *
> + * Most devices:
> + * root port --> USB-C port
> + *
> + * But X1 Carbon Gen 7/8 uses Whiskey Lake and Comet Lake SoC, which
> + * don't come with Thunderbolt functionality. Therefore, a discrete
> + * Thunderbolt Host PCI controller was added between the root port and
> + * the USB-C port.
> + *
> + * Thinkpad Carbon 7/8s
> + * (w/ Whiskey Lake and Comet Lake SoC):
> + * root port --> JHL6540 --> USB-C port
> + *
> + * Because the root port is labeled by FW as "ExternalFacingPort", as
> + * required by the DMAR ACPI spec, the JHL6540 chip is inaccurately

Can you include a citation (spec name, revision, section) for this
DMAR requirement?

> + * labeled as DEVICE_REMOVABLE by the kernel pci driver.
> + * Therefore, the built-in USB-C ports do not enumerate when policies
> + * forbidding external pci devices are enforced.
> + *
> + * This fix relabels the pci components in the built-in JHL6540 chip as
> + * DEVICE_FIXED, ensuring that the built-in USB-C ports always enumerate
> + * properly as intended.
> + *
> + * This fix also labels the external facing components of the JHL6540 as
> + * external_facing, so that the pci attach policy works as intended.
> + *
> + * The ASCII diagram below describes the pci layout of the JHL6540 chip.
> + *
> + * Root Port
> + * [8086:02b4] or [8086:9db4]
> + * |
> + * JHL6540 Chip
> + * __________________________________________________
> + * | Bridge |
> + * | PCI ID -> [8086:15d3] |
> + * | DEVFN -> (00) |
> + * | _________________|__________________ |
> + * | | | | | |
> + * | Bridge Bridge Bridge Bridge |
> + * | [8086:15d3] [8086:15d3] [8086:15d3] [8086:15d3] |
> + * | (00) (08) (10) (20) |
> + * | | | | | |
> + * | NHI | USB Controller | |
> + * | [8086:15d2] | [8086:15d4] | |
> + * | (00) | (00) | |
> + * | |___________| |___________| |
> + * |____________|________________________|____________|
> + * | |
> + * USB-C Port USB-C Port
> + *
> + *
> + * Based on what a JHL6549 pci component's pci id, subsystem device id
> + * and devfn are, we can infer if it is fixed and if it faces a usb port;
> + * which would mean it is external facing.
> + * This quirk uses these values to identify the pci components and set the
> + * properties accordingly.

Random nits: Capitalize spec terms like "PCI", "USB", "Root Port",
etc., consistently in English text. Rewrap to fill 78 columns or so.
Add blank lines between paragraphs. This applies to the commit log
(which should be wrapped to 75 to allow for "git log" indent) as well
as comments.

But honestly, I hope we can figure out a solution that doesn't require
a big comment like this. Checking for random device IDs to deduce the
topology is not the way PCI is supposed to work. PCI is designed to
be enumerable, so software can learn what it needs to know by
interrogating the hardware directly.

For cases where that's impossible, ACPI is supposed to fill the gaps,
e.g., with "ExternalFacingPort". If this patch covers over a gap that
firmware doesn't handle yet, maybe we need to add some new firmware
interface. If that's the case, we can add quirks for platforms that
don't have the new interface. But we at least need a plan that
doesn't require quirks indefinitely.

> + */
> +static void carbon_X1_fixup_relabel_alpine_ridge(struct pci_dev *dev)
> +{
> + /* Is this JHL6540 PCI component embedded in a Lenovo device? */
> + if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
> + return;
> +
> + /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
> + if (dev->subsystem_device != 0x22be && // Gen 8
> + dev->subsystem_device != 0x2292) { // Gen 7
> + return;
> + }
> +
> + dev_set_removable(&dev->dev, DEVICE_FIXED);
> +
> + /* Not all 0x15d3 components are external facing */
> + if (dev->device == 0x15d3 &&
> + dev->devfn != 0x08 &&
> + dev->devfn != 0x20) {
> + return;
> + }
> +
> + dev->external_facing = true;
> +}
> +
> +/*
> + * We also need to relabel the root port as a consequence of changing
> + * the JHL6540's PCIE hierarchy.
> + */
> +static void carbon_X1_fixup_rootport_not_removable(struct pci_dev *dev)
> +{
> + /* Is this JHL6540 PCI component embedded in a Lenovo device? */
> + if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
> + return;
> +
> + /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
> + if (dev->subsystem_device != 0x22be && // Gen 8
> + dev->subsystem_device != 0x2292) { // Gen 7
> + return;
> + }
> +
> + dev->external_facing = false;
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d3, carbon_X1_fixup_relabel_alpine_ridge);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d2, carbon_X1_fixup_relabel_alpine_ridge);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d4, carbon_X1_fixup_relabel_alpine_ridge);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x02b4, carbon_X1_fixup_rootport_not_removable);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x9db4, carbon_X1_fixup_rootport_not_removable);
> +
> /*
> * Following are device-specific reset methods which can be used to
> * reset a single function if other methods (e.g. FLR, PM D0->D3) are
>
> ---
> base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
> change-id: 20231219-thunderbolt-pci-patch-4-ede71cb833c4
>
> Best regards,
> --
> Esther Shimanovich <eshimanovich@chromium.org>
>
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
[.adding Mika and Mario to the list of recipients, original patch is here:
https://lore.kernel.org/all/20231221-thunderbolt-pci-patch-4-v4-1-2e136e57c9bc@chromium.org/
a lot of folks are on vacation, replies might not be sent before Jan 8;
full quote without further comments below]

On Thu, Dec 21, 2023 at 03:53:42PM -0500, Esther Shimanovich wrote:
> On Lenovo X1 Carbon Gen 7/8 devices, when a platform enables a policy to
> distrust removable PCI devices, the build-in USB-C ports stop working at
> all.
> This happens because these X1 Carbon models have a unique feature; a
> Thunderbolt controller that is discrete from the SoC. The software sees
> this controller, and incorrectly assumes it is a removable PCI device,
> even though it is fixed to the computer and is wired to the computer's
> own USB-C ports.
>
> Relabel all the components of the JHL6540 controller as DEVICE_FIXED,
> and where applicable, external_facing.
>
> Ensure that the security policy to distrust external PCI devices works
> as intended, and that the device's USB-C ports are able to enumerate
> even when the policy is enabled.
>
> Signed-off-by: Esther Shimanovich <eshimanovich@chromium.org>
> ---
> Changes in v4:
> - replaced a dmi check in the rootport quirk with a subsystem vendor and
> device check.
> - Link to v3: https://lore.kernel.org/r/20231220-thunderbolt-pci-patch-4-v3-1-056fd1717d06@chromium.org
>
> Changes in v3:
> - removed redundant dmi check, as the subsystem vendor check is
> sufficient
> - switched to PCI_VENDOR_ID_LENOVO instead of hex code
> - Link to v2: https://lore.kernel.org/r/20231219-thunderbolt-pci-patch-4-v2-1-ec2d7af45a9b@chromium.org
>
> Changes in v2:
> - nothing new, v1 was just a test run to see if the ASCII diagram would
> be rendered properly in mutt and k-9
> - for folks using gmail, make sure to select "show original" on the top
> right, as otherwise the diagram will be garbled by the standard
> non-monospace font
> - Link to v1: https://lore.kernel.org/r/20231219-thunderbolt-pci-patch-4-v1-1-4e8e3773f0a9@chromium.org
> ---
> drivers/pci/quirks.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 112 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index ea476252280a..34e43323ff14 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3873,6 +3873,118 @@ DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL,
> quirk_apple_poweroff_thunderbolt);
> #endif
>
> +/*
> + * On most ThinkPad Carbon 7/8s, JHL6540 Thunderbolt 3 bridges are set
> + * incorrectly as DEVICE_REMOVABLE despite being built into the device.
> + * This is the side effect of a unique hardware configuration.
> + *
> + * Normally, Thunderbolt functionality is integrated to the SoC and
> + * its root ports.
> + *
> + * Most devices:
> + * root port --> USB-C port
> + *
> + * But X1 Carbon Gen 7/8 uses Whiskey Lake and Comet Lake SoC, which
> + * don't come with Thunderbolt functionality. Therefore, a discrete
> + * Thunderbolt Host PCI controller was added between the root port and
> + * the USB-C port.
> + *
> + * Thinkpad Carbon 7/8s
> + * (w/ Whiskey Lake and Comet Lake SoC):
> + * root port --> JHL6540 --> USB-C port
> + *
> + * Because the root port is labeled by FW as "ExternalFacingPort", as
> + * required by the DMAR ACPI spec, the JHL6540 chip is inaccurately
> + * labeled as DEVICE_REMOVABLE by the kernel pci driver.
> + * Therefore, the built-in USB-C ports do not enumerate when policies
> + * forbidding external pci devices are enforced.
> + *
> + * This fix relabels the pci components in the built-in JHL6540 chip as
> + * DEVICE_FIXED, ensuring that the built-in USB-C ports always enumerate
> + * properly as intended.
> + *
> + * This fix also labels the external facing components of the JHL6540 as
> + * external_facing, so that the pci attach policy works as intended.
> + *
> + * The ASCII diagram below describes the pci layout of the JHL6540 chip.
> + *
> + * Root Port
> + * [8086:02b4] or [8086:9db4]
> + * |
> + * JHL6540 Chip
> + * __________________________________________________
> + * | Bridge |
> + * | PCI ID -> [8086:15d3] |
> + * | DEVFN -> (00) |
> + * | _________________|__________________ |
> + * | | | | | |
> + * | Bridge Bridge Bridge Bridge |
> + * | [8086:15d3] [8086:15d3] [8086:15d3] [8086:15d3] |
> + * | (00) (08) (10) (20) |
> + * | | | | | |
> + * | NHI | USB Controller | |
> + * | [8086:15d2] | [8086:15d4] | |
> + * | (00) | (00) | |
> + * | |___________| |___________| |
> + * |____________|________________________|____________|
> + * | |
> + * USB-C Port USB-C Port
> + *
> + *
> + * Based on what a JHL6549 pci component's pci id, subsystem device id
> + * and devfn are, we can infer if it is fixed and if it faces a usb port;
> + * which would mean it is external facing.
> + * This quirk uses these values to identify the pci components and set the
> + * properties accordingly.
> + */
> +static void carbon_X1_fixup_relabel_alpine_ridge(struct pci_dev *dev)
> +{
> + /* Is this JHL6540 PCI component embedded in a Lenovo device? */
> + if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
> + return;
> +
> + /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
> + if (dev->subsystem_device != 0x22be && // Gen 8
> + dev->subsystem_device != 0x2292) { // Gen 7
> + return;
> + }
> +
> + dev_set_removable(&dev->dev, DEVICE_FIXED);
> +
> + /* Not all 0x15d3 components are external facing */
> + if (dev->device == 0x15d3 &&
> + dev->devfn != 0x08 &&
> + dev->devfn != 0x20) {
> + return;
> + }
> +
> + dev->external_facing = true;
> +}
> +
> +/*
> + * We also need to relabel the root port as a consequence of changing
> + * the JHL6540's PCIE hierarchy.
> + */
> +static void carbon_X1_fixup_rootport_not_removable(struct pci_dev *dev)
> +{
> + /* Is this JHL6540 PCI component embedded in a Lenovo device? */
> + if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
> + return;
> +
> + /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
> + if (dev->subsystem_device != 0x22be && // Gen 8
> + dev->subsystem_device != 0x2292) { // Gen 7
> + return;
> + }
> +
> + dev->external_facing = false;
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d3, carbon_X1_fixup_relabel_alpine_ridge);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d2, carbon_X1_fixup_relabel_alpine_ridge);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d4, carbon_X1_fixup_relabel_alpine_ridge);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x02b4, carbon_X1_fixup_rootport_not_removable);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x9db4, carbon_X1_fixup_rootport_not_removable);
> +
> /*
> * Following are device-specific reset methods which can be used to
> * reset a single function if other methods (e.g. FLR, PM D0->D3) are
>
> ---
> base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
> change-id: 20231219-thunderbolt-pci-patch-4-ede71cb833c4
>
> Best regards,
> --
> Esther Shimanovich <eshimanovich@chromium.org>
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Thanks Lukas for including me.

On Thu, Dec 28, 2023 at 02:25:17PM +0100, Lukas Wunner wrote:
> [.adding Mika and Mario to the list of recipients, original patch is here:
> https://lore.kernel.org/all/20231221-thunderbolt-pci-patch-4-v4-1-2e136e57c9bc@chromium.org/
> a lot of folks are on vacation, replies might not be sent before Jan 8;
> full quote without further comments below]
>
> On Thu, Dec 21, 2023 at 03:53:42PM -0500, Esther Shimanovich wrote:
> > On Lenovo X1 Carbon Gen 7/8 devices, when a platform enables a policy to
> > distrust removable PCI devices, the build-in USB-C ports stop working at
> > all.
> > This happens because these X1 Carbon models have a unique feature; a
> > Thunderbolt controller that is discrete from the SoC. The software sees
> > this controller, and incorrectly assumes it is a removable PCI device,
> > even though it is fixed to the computer and is wired to the computer's
> > own USB-C ports.

Yes, the ExternalFacingPort applies to root ports so that includes
everything below that.

> > Relabel all the components of the JHL6540 controller as DEVICE_FIXED,
> > and where applicable, external_facing.
> >
> > Ensure that the security policy to distrust external PCI devices works
> > as intended, and that the device's USB-C ports are able to enumerate
> > even when the policy is enabled.

This has been working just fine so far and as far as I can tell there is
no such "policy" in place in the mainline kernel.

> >
> > Signed-off-by: Esther Shimanovich <eshimanovich@chromium.org>
> > ---
> > Changes in v4:
> > - replaced a dmi check in the rootport quirk with a subsystem vendor and
> > device check.
> > - Link to v3: https://lore.kernel.org/r/20231220-thunderbolt-pci-patch-4-v3-1-056fd1717d06@chromium.org
> >
> > Changes in v3:
> > - removed redundant dmi check, as the subsystem vendor check is
> > sufficient
> > - switched to PCI_VENDOR_ID_LENOVO instead of hex code
> > - Link to v2: https://lore.kernel.org/r/20231219-thunderbolt-pci-patch-4-v2-1-ec2d7af45a9b@chromium.org
> >
> > Changes in v2:
> > - nothing new, v1 was just a test run to see if the ASCII diagram would
> > be rendered properly in mutt and k-9
> > - for folks using gmail, make sure to select "show original" on the top
> > right, as otherwise the diagram will be garbled by the standard
> > non-monospace font
> > - Link to v1: https://lore.kernel.org/r/20231219-thunderbolt-pci-patch-4-v1-1-4e8e3773f0a9@chromium.org
> > ---
> > drivers/pci/quirks.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 112 insertions(+)
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index ea476252280a..34e43323ff14 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -3873,6 +3873,118 @@ DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL,
> > quirk_apple_poweroff_thunderbolt);
> > #endif
> >
> > +/*
> > + * On most ThinkPad Carbon 7/8s, JHL6540 Thunderbolt 3 bridges are set
> > + * incorrectly as DEVICE_REMOVABLE despite being built into the device.
> > + * This is the side effect of a unique hardware configuration.

There is really nothing "unique" here. It's exactly as specified by
this:

https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports

and being used in many many system already out there and those have been
working just fine.

> > + *
> > + * Normally, Thunderbolt functionality is integrated to the SoC and
> > + * its root ports.
> > + *
> > + * Most devices:
> > + * root port --> USB-C port
> > + *
> > + * But X1 Carbon Gen 7/8 uses Whiskey Lake and Comet Lake SoC, which
> > + * don't come with Thunderbolt functionality. Therefore, a discrete
> > + * Thunderbolt Host PCI controller was added between the root port and
> > + * the USB-C port.
> > + *
> > + * Thinkpad Carbon 7/8s
> > + * (w/ Whiskey Lake and Comet Lake SoC):
> > + * root port --> JHL6540 --> USB-C port
> > + *
> > + * Because the root port is labeled by FW as "ExternalFacingPort", as
> > + * required by the DMAR ACPI spec, the JHL6540 chip is inaccurately
> > + * labeled as DEVICE_REMOVABLE by the kernel pci driver.
> > + * Therefore, the built-in USB-C ports do not enumerate when policies
> > + * forbidding external pci devices are enforced.
> > + *
> > + * This fix relabels the pci components in the built-in JHL6540 chip as
> > + * DEVICE_FIXED, ensuring that the built-in USB-C ports always enumerate
> > + * properly as intended.
> > + *
> > + * This fix also labels the external facing components of the JHL6540 as
> > + * external_facing, so that the pci attach policy works as intended.
> > + *
> > + * The ASCII diagram below describes the pci layout of the JHL6540 chip.
> > + *
> > + * Root Port
> > + * [8086:02b4] or [8086:9db4]
> > + * |
> > + * JHL6540 Chip
> > + * __________________________________________________
> > + * | Bridge |
> > + * | PCI ID -> [8086:15d3] |
> > + * | DEVFN -> (00) |
> > + * | _________________|__________________ |
> > + * | | | | | |
> > + * | Bridge Bridge Bridge Bridge |
> > + * | [8086:15d3] [8086:15d3] [8086:15d3] [8086:15d3] |
> > + * | (00) (08) (10) (20) |
> > + * | | | | | |
> > + * | NHI | USB Controller | |
> > + * | [8086:15d2] | [8086:15d4] | |
> > + * | (00) | (00) | |
> > + * | |___________| |___________| |
> > + * |____________|________________________|____________|
> > + * | |
> > + * USB-C Port USB-C Port
> > + *
> > + *
> > + * Based on what a JHL6549 pci component's pci id, subsystem device id
> > + * and devfn are, we can infer if it is fixed and if it faces a usb port;
> > + * which would mean it is external facing.
> > + * This quirk uses these values to identify the pci components and set the
> > + * properties accordingly.
> > + */
> > +static void carbon_X1_fixup_relabel_alpine_ridge(struct pci_dev *dev)
> > +{
> > + /* Is this JHL6540 PCI component embedded in a Lenovo device? */
> > + if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
> > + return;
> > +
> > + /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
> > + if (dev->subsystem_device != 0x22be && // Gen 8
> > + dev->subsystem_device != 0x2292) { // Gen 7
> > + return;
> > + }
> > +
> > + dev_set_removable(&dev->dev, DEVICE_FIXED);
> > +
> > + /* Not all 0x15d3 components are external facing */
> > + if (dev->device == 0x15d3 &&
> > + dev->devfn != 0x08 &&
> > + dev->devfn != 0x20) {
> > + return;
> > + }
> > +
> > + dev->external_facing = true;
> > +}
> > +
> > +/*
> > + * We also need to relabel the root port as a consequence of changing
> > + * the JHL6540's PCIE hierarchy.
> > + */
> > +static void carbon_X1_fixup_rootport_not_removable(struct pci_dev *dev)
> > +{
> > + /* Is this JHL6540 PCI component embedded in a Lenovo device? */
> > + if (dev->subsystem_vendor != PCI_VENDOR_ID_LENOVO)
> > + return;
> > +
> > + /* Is this JHL6540 PCI component embedded in an X1 Carbon Gen 7/8? */
> > + if (dev->subsystem_device != 0x22be && // Gen 8
> > + dev->subsystem_device != 0x2292) { // Gen 7
> > + return;
> > + }
> > +
> > + dev->external_facing = false;
> > +}
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d3, carbon_X1_fixup_relabel_alpine_ridge);
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d2, carbon_X1_fixup_relabel_alpine_ridge);
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15d4, carbon_X1_fixup_relabel_alpine_ridge);
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x02b4, carbon_X1_fixup_rootport_not_removable);
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x9db4, carbon_X1_fixup_rootport_not_removable);

This is not scalable at all. You would need to include lots of systems
here. And there should be no issue at all anyways.

Can you elaborate what the issue is and which mainline kernel you are
using to reproduce this?
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Thank you for all your comments! I really appreciate all your help
with this. I will address the style feedback once we reach a decision
on how we will fix this bug.
I first will respond to your comments, and then I will list out the
possible solutions to this bug, in a way that takes into account all
of your insights.

On Tue, Dec 26, 2023 at 7:15?PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> Can you include a citation (spec name, revision, section) for this
> DMAR requirement?
>
This was my mistake–I misinterpreted what a firmware developer told
me. This is a firmware ACPI requirement from windows, which is not in
the DMAR spec. Windows uses it to identify externally exposed PCIE
root ports.
https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports

> But I don't see where the defect is here. And I doubt that this is
> really a unique situation. So it's likely that this will happen on
> other systems, and we don't want to have to add quirks every time
> another one shows up.
..
> don't have the new interface. But we at least need a plan that
> doesn't require quirks indefinitely.
..
On Thu, Dec 28, 2023 at 8:41?AM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
> This is not scalable at all. You would need to include lots of systems
> here. And there should be no issue at all anyways.
My team tests hundreds of different devices, and this is the only one
which exhibited this issue that we’ve seen so far.
No other devices we’ve seen so far have a discrete internal
Thunderbolt controller which is treated as a removable device.
Therefore, we don’t expect that a large number of devices will need
this quirk.

> There is really nothing "unique" here. It's exactly as specified by
> this:
>
> https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports
>
> and being used in many many system already out there and those have been
> working just fine.
I don’t know how many computers have a discrete Thunderbolt chip that
is separate from their CPU, but this doesn’t seem to be a common
occurrence.
These devices were made during a narrow window of time when CPUs
didn’t have Thunderbolt features yet, so a separate JHL6540 chip had
to be added so that Lenovo could include Thunderbolt on X1 Carbon Gen
7/8.

As you said, these devices do indeed work fine in cases where you
don’t care if a PCI Thunderbolt device is internal or external, which
is most cases.
Problems happen only whenever someone adds a security policy, or some
other feature that cares about the distinction between a fixed or
removable PCI device.

> This has been working just fine so far and as far as I can tell there is
> no such "policy" in place in the mainline kernel.
Correct, there is no such policy in the mainline kernel as of now. The
bug is that the linux kernel’s “removable” property is inaccurate for
this device.

> Can you elaborate what the issue is and which mainline kernel you are
> using to reproduce this?
Thanks for this question! On a Lenovo Thinkpad Gen 7/Gen 8 computer
with the linux kernel installed, when you look at the properties of
the JHL6540 Thunderbolt controller, you see that it is incorrectly
labeled as removable. I have replicated this bug on the b85ea95d0864
Linux 6.7-rc1 kernel.

Before my patch, you see that the JHL6540 controller is inaccurately
labeled “removable”:
$ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
{removable} -e {device} -e {vendor} -e looking
looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
ATTR{device}=="0x15d3"
ATTR{removable}=="removable"
ATTR{vendor}=="0x8086"
looking at parent device '/devices/pci0000:00/0000:00:1d.4':
ATTRS{device}=="0x02b4"
ATTRS{vendor}=="0x8086"
looking at parent device '/devices/pci0000:00':

After applying the patch in this ticket, we see the JHL6540 controller
is now labeled as “fixed”:
$ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
{removable} -e {device} -e {vendor} -e looking
looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
ATTR{device}=="0x15d3"
ATTR{removable}=="fixed"
ATTR{vendor}=="0x8086"
looking at parent device '/devices/pci0000:00/0000:00:1d.4':
ATTRS{device}=="0x02b4"
ATTRS{vendor}=="0x8086"
looking at parent device '/devices/pci0000:00':

OK so here is the part where I share what I’ve developed as a result
of your comments:

The two options I see to resolve this are as follows:
1) Either we fix this by adding a new firmware interface as Bjorn
Helgaas brought up.
2) Alternatively we may address this through a cleaned-up version of this patch

If the solution is to add a firmware interface, how would I go about
that process? Could you put me in touch with someone with that
know-how?
Would we have a temporary software quirk in place while the firmware
spec is being updated?
I am deferring to your expertise and knowledge in solving this bug.
Thank you for all your help.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Wed, Jan 17, 2024 at 04:21:18PM -0500, Esther Shimanovich wrote:
> Thank you for all your comments! I really appreciate all your help
> with this. I will address the style feedback once we reach a decision
> on how we will fix this bug.
> I first will respond to your comments, and then I will list out the
> possible solutions to this bug, in a way that takes into account all
> of your insights.
>
> On Tue, Dec 26, 2023 at 7:15?PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > Can you include a citation (spec name, revision, section) for this
> > DMAR requirement?
> >
> This was my mistake–I misinterpreted what a firmware developer told
> me. This is a firmware ACPI requirement from windows, which is not in
> the DMAR spec. Windows uses it to identify externally exposed PCIE
> root ports.
> https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports
>
> > But I don't see where the defect is here. And I doubt that this is
> > really a unique situation. So it's likely that this will happen on
> > other systems, and we don't want to have to add quirks every time
> > another one shows up.
> ...
> > don't have the new interface. But we at least need a plan that
> > doesn't require quirks indefinitely.
> ...
> On Thu, Dec 28, 2023 at 8:41?AM Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
> > This is not scalable at all. You would need to include lots of systems
> > here. And there should be no issue at all anyways.
> My team tests hundreds of different devices, and this is the only one
> which exhibited this issue that we’ve seen so far.
> No other devices we’ve seen so far have a discrete internal
> Thunderbolt controller which is treated as a removable device.
> Therefore, we don’t expect that a large number of devices will need
> this quirk.

Well that's pretty much all Intel Titan Ridge and Maple Ridge based
systems. Some early ones did not use IOMMU but all the rest do.

> > There is really nothing "unique" here. It's exactly as specified by
> > this:
> >
> > https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports
> >
> > and being used in many many system already out there and those have been
> > working just fine.
> I don’t know how many computers have a discrete Thunderbolt chip that
> is separate from their CPU, but this doesn’t seem to be a common
> occurrence.
> These devices were made during a narrow window of time when CPUs
> didn’t have Thunderbolt features yet, so a separate JHL6540 chip had
> to be added so that Lenovo could include Thunderbolt on X1 Carbon Gen
> 7/8.

Before Intel Ice Lake it was all discrete and it is still discrete with
the Barlow Ridge controller who will have exact same ExternalFacing port
as the previous models.

> As you said, these devices do indeed work fine in cases where you
> don’t care if a PCI Thunderbolt device is internal or external, which
> is most cases.
> Problems happen only whenever someone adds a security policy, or some
> other feature that cares about the distinction between a fixed or
> removable PCI device.

Do we have such security policy in the mainline kernel?

> > This has been working just fine so far and as far as I can tell there is
> > no such "policy" in place in the mainline kernel.
> Correct, there is no such policy in the mainline kernel as of now. The
> bug is that the linux kernel’s “removable” property is inaccurate for
> this device.

Or perhaps the "policy" should know this better? IIRC there were some
"exceptions" in the Chrome kernel that allowed to put these devices into
"allowlist" is this not the case anymore?

> > Can you elaborate what the issue is and which mainline kernel you are
> > using to reproduce this?
> Thanks for this question! On a Lenovo Thinkpad Gen 7/Gen 8 computer
> with the linux kernel installed, when you look at the properties of
> the JHL6540 Thunderbolt controller, you see that it is incorrectly
> labeled as removable. I have replicated this bug on the b85ea95d0864
> Linux 6.7-rc1 kernel.
>
> Before my patch, you see that the JHL6540 controller is inaccurately
> labeled “removable”:
> $ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
> {removable} -e {device} -e {vendor} -e looking
> looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
> ATTR{device}=="0x15d3"
> ATTR{removable}=="removable"
> ATTR{vendor}=="0x8086"

This is actually accurate. The Thunderbolt controller is itself
hot-removable and that BTW happens to be hot-removed when fwupd applies
firmware upgrades to the device.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On 1/18/2024 00:00, Mika Westerberg wrote:
>> Before my patch, you see that the JHL6540 controller is inaccurately
>> labeled “removable”:
>> $ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
>> {removable} -e {device} -e {vendor} -e looking
>> looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
>> ATTR{device}=="0x15d3"
>> ATTR{removable}=="removable"
>> ATTR{vendor}=="0x8086"
>
> This is actually accurate. The Thunderbolt controller is itself
> hot-removable and that BTW happens to be hot-removed when fwupd applies
> firmware upgrades to the device.

Depending on the consumers of this removable attribute I wonder if we
need to a new ATTR of "external" instead of overloading "removable".
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Thu, Jan 18, 2024 at 09:47:07AM -0600, Mario Limonciello wrote:
> On 1/18/2024 00:00, Mika Westerberg wrote:
> > > Before my patch, you see that the JHL6540 controller is inaccurately
> > > labeled “removable”:
> > > $ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
> > > {removable} -e {device} -e {vendor} -e looking
> > > looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
> > > ATTR{device}=="0x15d3"
> > > ATTR{removable}=="removable"
> > > ATTR{vendor}=="0x8086"
> >
> > This is actually accurate. The Thunderbolt controller is itself
> > hot-removable and that BTW happens to be hot-removed when fwupd applies
> > firmware upgrades to the device.

This is quite interesting take. Does fwupd rip the controller out of the
box to update it? By that account your touchpad is also removable as it
may stop functioning when its firmware gets updated.

>
> Depending on the consumers of this removable attribute I wonder if we need
> to a new ATTR of "external" instead of overloading "removable".

Isn't this the same thing? From
Documentation/ABI/testing/sysfs-devices-removable:

What: /sys/devices/.../removable
Date: May 2021
Contact: Rajat Jain <rajatxjain@gmail.com>
Description:
Information about whether a given device can be removed from the
platform by the user. This is determined by its subsystem in a
bus / platform-specific way. This attribute is only present for
devices that can support determining such information:

=========== ===================================================
"removable" device can be removed from the platform by the user
"fixed" device is fixed to the platform / cannot be removed
by the user.

Note this "by the user". Maybe we should add word "physically" here to
qualify the meaning completely, but that is what it is. Not that it
disappears from the bus or stops operating for some time because of
firmware updates, but it can be physically detached from the
platform/system.

Thanks.

--
Dmitry
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Thu, Jan 18, 2024 at 08:12:56AM -0800, Dmitry Torokhov wrote:
> On Thu, Jan 18, 2024 at 09:47:07AM -0600, Mario Limonciello wrote:
> > On 1/18/2024 00:00, Mika Westerberg wrote:
> > > > Before my patch, you see that the JHL6540 controller is inaccurately
> > > > labeled “removable”:
> > > > $ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
> > > > {removable} -e {device} -e {vendor} -e looking
> > > > looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
> > > > ATTR{device}=="0x15d3"
> > > > ATTR{removable}=="removable"
> > > > ATTR{vendor}=="0x8086"
> > >
> > > This is actually accurate. The Thunderbolt controller is itself
> > > hot-removable and that BTW happens to be hot-removed when fwupd applies
> > > firmware upgrades to the device.
>
> This is quite interesting take. Does fwupd rip the controller out of the
> box to update it? By that account your touchpad is also removable as it
> may stop functioning when its firmware gets updated.
>
> >
> > Depending on the consumers of this removable attribute I wonder if we need
> > to a new ATTR of "external" instead of overloading "removable".
>
> Isn't this the same thing? From
> Documentation/ABI/testing/sysfs-devices-removable:
>
> What: /sys/devices/.../removable
> Date: May 2021
> Contact: Rajat Jain <rajatxjain@gmail.com>
> Description:
> Information about whether a given device can be removed from the
> platform by the user. This is determined by its subsystem in a
> bus / platform-specific way. This attribute is only present for
> devices that can support determining such information:
>
> =========== ===================================================
> "removable" device can be removed from the platform by the user
> "fixed" device is fixed to the platform / cannot be removed
> by the user.
>
> Note this "by the user". Maybe we should add word "physically" here to
> qualify the meaning completely, but that is what it is. Not that it
> disappears from the bus or stops operating for some time because of
> firmware updates, but it can be physically detached from the
> platform/system.

FTR this is where Rajat and Bjorn have agreed what the attribute means
when it was moved from USB-specific to general device property:

https://lore.kernel.org/all/20210511230228.GA2429744@bjorn-Precision-5520/

Thanks.

--
Dmitry
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Thu, Jan 18, 2024 at 08:12:56AM -0800, Dmitry Torokhov wrote:
> On Thu, Jan 18, 2024 at 09:47:07AM -0600, Mario Limonciello wrote:
> > On 1/18/2024 00:00, Mika Westerberg wrote:
> > > > Before my patch, you see that the JHL6540 controller is inaccurately
> > > > labeled “removable”:
> > > > $ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
> > > > {removable} -e {device} -e {vendor} -e looking
> > > > looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
> > > > ATTR{device}=="0x15d3"
> > > > ATTR{removable}=="removable"
> > > > ATTR{vendor}=="0x8086"
> > >
> > > This is actually accurate. The Thunderbolt controller is itself
> > > hot-removable and that BTW happens to be hot-removed when fwupd applies
> > > firmware upgrades to the device.
>
> This is quite interesting take. Does fwupd rip the controller out of the
> box to update it? By that account your touchpad is also removable as it
> may stop functioning when its firmware gets updated.

The Thunderbolt controller is connected to a hotpluggable PCIe root port
so it will be dissappear from the userspace so that "removable" in that
sense is accurate.

> > Depending on the consumers of this removable attribute I wonder if we need
> > to a new ATTR of "external" instead of overloading "removable".
>
> Isn't this the same thing? From
> Documentation/ABI/testing/sysfs-devices-removable:
>
> What: /sys/devices/.../removable
> Date: May 2021
> Contact: Rajat Jain <rajatxjain@gmail.com>
> Description:
> Information about whether a given device can be removed from the
> platform by the user. This is determined by its subsystem in a
> bus / platform-specific way. This attribute is only present for
> devices that can support determining such information:
>
> =========== ===================================================
> "removable" device can be removed from the platform by the user
> "fixed" device is fixed to the platform / cannot be removed
> by the user.
>
> Note this "by the user". Maybe we should add word "physically" here to
> qualify the meaning completely, but that is what it is. Not that it
> disappears from the bus or stops operating for some time because of
> firmware updates, but it can be physically detached from the
> platform/system.

That would be good to fully qualify what this means. But I get you it is
intented to identify devices that can be physically unplugged from the
system.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Fri, Jan 19, 2024 at 07:37:56AM +0200, Mika Westerberg wrote:
> On Thu, Jan 18, 2024 at 08:12:56AM -0800, Dmitry Torokhov wrote:
> > On Thu, Jan 18, 2024 at 09:47:07AM -0600, Mario Limonciello wrote:
> > > On 1/18/2024 00:00, Mika Westerberg wrote:
> > > > > Before my patch, you see that the JHL6540 controller is inaccurately
> > > > > labeled “removable”:
> > > > > $ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
> > > > > {removable} -e {device} -e {vendor} -e looking
> > > > > looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
> > > > > ATTR{device}=="0x15d3"
> > > > > ATTR{removable}=="removable"
> > > > > ATTR{vendor}=="0x8086"
> > > >
> > > > This is actually accurate. The Thunderbolt controller is itself
> > > > hot-removable and that BTW happens to be hot-removed when fwupd applies
> > > > firmware upgrades to the device.
> >
> > This is quite interesting take. Does fwupd rip the controller out of the
> > box to update it? By that account your touchpad is also removable as it
> > may stop functioning when its firmware gets updated.
>
> The Thunderbolt controller is connected to a hotpluggable PCIe root port
> so it will be dissappear from the userspace so that "removable" in that
> sense is accurate.

There are systems as well where the Thunderbolt (and/or xHCI) controller
only appears if there is anything plugged to the physical Type-C ports
and it gets removed pretty soon after the physical device gets
unplugged. These are also the same Alpine Ridge and Titan Ridge
controllers that this patch is dealing with.

I tried to think about some sort of more generic heuristic how to figure
out that the controller is actually inside the physical system but there
is a problem that the same controller can appear on the bus as well, eg.
you plug in Thunderbolt dock and that one has xHCI controller too. That
device should definitely be "removable". With the "software CM" systems
we have a couple of additional hints in the ACPI tables that can be used
to identify the "tunneled" ports but this does not apply to the older
systems I'm afraid.

Now if I understand the reason behind this patch is actually not about
"removability" that much than about identifying a trusted vs. untrusted
device and attaching a driver to those. I was under impression that
there is already a solution to this in ChromeOS kernel. It has an
allowlist of drivers that are allowed to attach these devices and that
includes the PCIe port drivers, xhci_hcd and the thunderbolt driver,
possibly something else too. Is this not working for your case?
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Fri, Jan 19, 2024 at 09:48:29AM +0200, Mika Westerberg wrote:
> On Fri, Jan 19, 2024 at 07:37:56AM +0200, Mika Westerberg wrote:
> > On Thu, Jan 18, 2024 at 08:12:56AM -0800, Dmitry Torokhov wrote:
> > > On Thu, Jan 18, 2024 at 09:47:07AM -0600, Mario Limonciello wrote:
> > > > On 1/18/2024 00:00, Mika Westerberg wrote:
> > > > > > Before my patch, you see that the JHL6540 controller is inaccurately
> > > > > > labeled “removable”:
> > > > > > $ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
> > > > > > {removable} -e {device} -e {vendor} -e looking
> > > > > > looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
> > > > > > ATTR{device}=="0x15d3"
> > > > > > ATTR{removable}=="removable"
> > > > > > ATTR{vendor}=="0x8086"
> > > > >
> > > > > This is actually accurate. The Thunderbolt controller is itself
> > > > > hot-removable and that BTW happens to be hot-removed when fwupd applies
> > > > > firmware upgrades to the device.
> > >
> > > This is quite interesting take. Does fwupd rip the controller out of the
> > > box to update it? By that account your touchpad is also removable as it
> > > may stop functioning when its firmware gets updated.
> >
> > The Thunderbolt controller is connected to a hotpluggable PCIe root port
> > so it will be dissappear from the userspace so that "removable" in that
> > sense is accurate.
>
> There are systems as well where the Thunderbolt (and/or xHCI) controller
> only appears if there is anything plugged to the physical Type-C ports
> and it gets removed pretty soon after the physical device gets
> unplugged. These are also the same Alpine Ridge and Titan Ridge
> controllers that this patch is dealing with.
>
> I tried to think about some sort of more generic heuristic how to figure
> out that the controller is actually inside the physical system but there
> is a problem that the same controller can appear on the bus as well, eg.
> you plug in Thunderbolt dock and that one has xHCI controller too. That
> device should definitely be "removable". With the "software CM" systems
> we have a couple of additional hints in the ACPI tables that can be used
> to identify the "tunneled" ports but this does not apply to the older
> systems I'm afraid.

The below "might" work:

1. A device that is directly behind a PCIe root or downstream port that
has ->external_facing == 1.

2. It is a PCIe endpoint.

3. It is a sibling to or has any of the below PCI IDs (see
drivers/thunderbolt/nhi.h for the definitions):

PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_4C_NHI
PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_2C_NHI
PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_USBONLY_NHI
PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_USBONLY_NHI
PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_USBONLY_NHI
PCI_DEVICE_ID_INTEL_TITAN_RIDGE_2C_NHI
PCI_DEVICE_ID_INTEL_TITAN_RIDGE_4C_NHI

And for all USB4 we can use the PCI class:

PCI_CLASS_SERIAL_USB_USB4

(4. With USB4 systems we could also add the check that the device is not
below the tunneled PCIe ports but that's kind of redundant).

I think this covers the existing devices as well as future discrete USB4
controllers and marks both the NHI and the xHCI as "fixed" which we
could also use to lift the bounce buffering restriction from them.

@Mario, did I miss anything?
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Thu, Jan 18, 2024 at 1:01?AM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
>
> Well that's pretty much all Intel Titan Ridge and Maple Ridge based
> systems. Some early ones did not use IOMMU but all the rest do.
..
> Before Intel Ice Lake it was all discrete and it is still discrete with
> the Barlow Ridge controller who will have exact same ExternalFacing port
> as the previous models.

Next week I'll try those devices in our inventory to see if I can find
another one with this bug. I'll get back to you on that!

On Fri, Jan 19, 2024 at 2:58?AM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
>
> Now if I understand the reason behind this patch is actually not about
> "removability" that much than about identifying a trusted vs. untrusted
> device and attaching a driver to those. I was under impression that
> there is already a solution to this in ChromeOS kernel. It has an
> allowlist of drivers that are allowed to attach these devices and that
> includes the PCIe port drivers, xhci_hcd and the thunderbolt driver,
> possibly something else too. Is this not working for your case?

This device shouldn’t be treated as a removable thunderbolt device
that is enabled by policy because it is an internal device that should
be trusted in the first place.
Even so, while learning about this problem I tried modifying the
ChromeOS policy but it ended up not fixing the issue because it seems
like there is an expectation for it to see an existing “fixed”
thunderbolt port before it loads anything else. But the fixed
thunderbolt port is prevented from enumerating during bootup, before
the policy has a chance to work.

On Fri, Jan 19, 2024 at 5:23?AM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
>
> The below "might" work:
>
> 1. A device that is directly behind a PCIe root or downstream port that
> has ->external_facing == 1.
>
> 2. It is a PCIe endpoint.
>
> 3. It is a sibling to or has any of the below PCI IDs (see
> drivers/thunderbolt/nhi.h for the definitions):

External pci devices seem to have the same kinds of chips in them. So
this wouldn’t distinguish the “fixed but discrete” embedded pci
devices from the “removable” pcie through usb devices. My monitor with
thunderbolt capabilities has the JHL7540 chip in it. From the kernel's
perspective, I have only found that the subsystem id is what
distinguishes these devices.

That is, unless I am missing something in your proposal that would
distinguish a fixed JHL6540 chip from an external JHL6540 chip. Please
correct me on any assumptions I get wrong!
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Fri, Jan 19, 2024 at 11:03:18AM -0500, Esther Shimanovich wrote:
> On Thu, Jan 18, 2024 at 1:01?AM Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
> >
> > Well that's pretty much all Intel Titan Ridge and Maple Ridge based
> > systems. Some early ones did not use IOMMU but all the rest do.
> ...
> > Before Intel Ice Lake it was all discrete and it is still discrete with
> > the Barlow Ridge controller who will have exact same ExternalFacing port
> > as the previous models.
>
> Next week I'll try those devices in our inventory to see if I can find
> another one with this bug. I'll get back to you on that!
>
> On Fri, Jan 19, 2024 at 2:58?AM Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
> >
> > Now if I understand the reason behind this patch is actually not about
> > "removability" that much than about identifying a trusted vs. untrusted
> > device and attaching a driver to those. I was under impression that
> > there is already a solution to this in ChromeOS kernel. It has an
> > allowlist of drivers that are allowed to attach these devices and that
> > includes the PCIe port drivers, xhci_hcd and the thunderbolt driver,
> > possibly something else too. Is this not working for your case?
>
> This device shouldn’t be treated as a removable thunderbolt device
> that is enabled by policy because it is an internal device that should
> be trusted in the first place.
> Even so, while learning about this problem I tried modifying the
> ChromeOS policy but it ended up not fixing the issue because it seems
> like there is an expectation for it to see an existing “fixed”
> thunderbolt port before it loads anything else. But the fixed
> thunderbolt port is prevented from enumerating during bootup, before
> the policy has a chance to work.

Have you contacted the Chrome kernel folks about this? Perhaps they
have thought about this already?

> On Fri, Jan 19, 2024 at 5:23?AM Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
> >
> > The below "might" work:
> >
> > 1. A device that is directly behind a PCIe root or downstream port that
> > has ->external_facing == 1.
> >
> > 2. It is a PCIe endpoint.
> >
> > 3. It is a sibling to or has any of the below PCI IDs (see
> > drivers/thunderbolt/nhi.h for the definitions):
>
> External pci devices seem to have the same kinds of chips in them. So
> this wouldn’t distinguish the “fixed but discrete” embedded pci
> devices from the “removable” pcie through usb devices. My monitor with
> thunderbolt capabilities has the JHL7540 chip in it. From the kernel's
> perspective, I have only found that the subsystem id is what
> distinguishes these devices.
>
> That is, unless I am missing something in your proposal that would
> distinguish a fixed JHL6540 chip from an external JHL6540 chip. Please
> correct me on any assumptions I get wrong!

Yes, you are missing the 1. that it needs to be directly behind the PCIe
root or downstream port that is marked as ->external_facing, and the
fact that there can't be NHI's (that's the host controller with the IDs
I listed in 3.) anywhere else except starting the topology according the
USB4 spec (and the same applies to Thunderbolt 1-3).
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On 1/19/2024 04:22, Mika Westerberg wrote:
> On Fri, Jan 19, 2024 at 09:48:29AM +0200, Mika Westerberg wrote:
>> On Fri, Jan 19, 2024 at 07:37:56AM +0200, Mika Westerberg wrote:
>>> On Thu, Jan 18, 2024 at 08:12:56AM -0800, Dmitry Torokhov wrote:
>>>> On Thu, Jan 18, 2024 at 09:47:07AM -0600, Mario Limonciello wrote:
>>>>> On 1/18/2024 00:00, Mika Westerberg wrote:
>>>>>>> Before my patch, you see that the JHL6540 controller is inaccurately
>>>>>>> labeled “removable”:
>>>>>>> $ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
>>>>>>> {removable} -e {device} -e {vendor} -e looking
>>>>>>> looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
>>>>>>> ATTR{device}=="0x15d3"
>>>>>>> ATTR{removable}=="removable"
>>>>>>> ATTR{vendor}=="0x8086"
>>>>>>
>>>>>> This is actually accurate. The Thunderbolt controller is itself
>>>>>> hot-removable and that BTW happens to be hot-removed when fwupd applies
>>>>>> firmware upgrades to the device.
>>>>
>>>> This is quite interesting take. Does fwupd rip the controller out of the
>>>> box to update it? By that account your touchpad is also removable as it
>>>> may stop functioning when its firmware gets updated.
>>>
>>> The Thunderbolt controller is connected to a hotpluggable PCIe root port
>>> so it will be dissappear from the userspace so that "removable" in that
>>> sense is accurate.
>>
>> There are systems as well where the Thunderbolt (and/or xHCI) controller
>> only appears if there is anything plugged to the physical Type-C ports
>> and it gets removed pretty soon after the physical device gets
>> unplugged. These are also the same Alpine Ridge and Titan Ridge
>> controllers that this patch is dealing with.
>>
>> I tried to think about some sort of more generic heuristic how to figure
>> out that the controller is actually inside the physical system but there
>> is a problem that the same controller can appear on the bus as well, eg.
>> you plug in Thunderbolt dock and that one has xHCI controller too. That
>> device should definitely be "removable". With the "software CM" systems
>> we have a couple of additional hints in the ACPI tables that can be used
>> to identify the "tunneled" ports but this does not apply to the older
>> systems I'm afraid.
>
> The below "might" work:
>
> 1. A device that is directly behind a PCIe root or downstream port that
> has ->external_facing == 1.
>
> 2. It is a PCIe endpoint.
>
> 3. It is a sibling to or has any of the below PCI IDs (see
> drivers/thunderbolt/nhi.h for the definitions):
>
> PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_4C_NHI
> PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_2C_NHI
> PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_USBONLY_NHI
> PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_USBONLY_NHI
> PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_USBONLY_NHI
> PCI_DEVICE_ID_INTEL_TITAN_RIDGE_2C_NHI
> PCI_DEVICE_ID_INTEL_TITAN_RIDGE_4C_NHI
>
> And for all USB4 we can use the PCI class:
>
> PCI_CLASS_SERIAL_USB_USB4
>
> (4. With USB4 systems we could also add the check that the device is not
> below the tunneled PCIe ports but that's kind of redundant).
>
> I think this covers the existing devices as well as future discrete USB4
> controllers and marks both the NHI and the xHCI as "fixed" which we
> could also use to lift the bounce buffering restriction from them.
>
> @Mario, did I miss anything?

The bounce buffering restriction is only for unaligned DMA isn't it?
Does that tend to happen a lot?

But otherwise I think this does a good job. It will cover external
enclosure cases too because of having to check it's directly behind a
root port.

But it should also include comments about why it's not needed on newer
systems (IE the ACPI hints for someone with no prior knowledge looking
at this to find).
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Mon, Jan 22, 2024 at 05:50:24PM -0600, Mario Limonciello wrote:
> On 1/19/2024 04:22, Mika Westerberg wrote:
> > On Fri, Jan 19, 2024 at 09:48:29AM +0200, Mika Westerberg wrote:
> > > On Fri, Jan 19, 2024 at 07:37:56AM +0200, Mika Westerberg wrote:
> > > > On Thu, Jan 18, 2024 at 08:12:56AM -0800, Dmitry Torokhov wrote:
> > > > > On Thu, Jan 18, 2024 at 09:47:07AM -0600, Mario Limonciello wrote:
> > > > > > On 1/18/2024 00:00, Mika Westerberg wrote:
> > > > > > > > Before my patch, you see that the JHL6540 controller is inaccurately
> > > > > > > > labeled “removable”:
> > > > > > > > $ udevadm info -a -p /sys/bus/pci/devices/0000:05:00.0 | grep -e
> > > > > > > > {removable} -e {device} -e {vendor} -e looking
> > > > > > > > looking at device '/devices/pci0000:00/0000:00:1d.4/0000:05:00.0':
> > > > > > > > ATTR{device}=="0x15d3"
> > > > > > > > ATTR{removable}=="removable"
> > > > > > > > ATTR{vendor}=="0x8086"
> > > > > > >
> > > > > > > This is actually accurate. The Thunderbolt controller is itself
> > > > > > > hot-removable and that BTW happens to be hot-removed when fwupd applies
> > > > > > > firmware upgrades to the device.
> > > > >
> > > > > This is quite interesting take. Does fwupd rip the controller out of the
> > > > > box to update it? By that account your touchpad is also removable as it
> > > > > may stop functioning when its firmware gets updated.
> > > >
> > > > The Thunderbolt controller is connected to a hotpluggable PCIe root port
> > > > so it will be dissappear from the userspace so that "removable" in that
> > > > sense is accurate.
> > >
> > > There are systems as well where the Thunderbolt (and/or xHCI) controller
> > > only appears if there is anything plugged to the physical Type-C ports
> > > and it gets removed pretty soon after the physical device gets
> > > unplugged. These are also the same Alpine Ridge and Titan Ridge
> > > controllers that this patch is dealing with.
> > >
> > > I tried to think about some sort of more generic heuristic how to figure
> > > out that the controller is actually inside the physical system but there
> > > is a problem that the same controller can appear on the bus as well, eg.
> > > you plug in Thunderbolt dock and that one has xHCI controller too. That
> > > device should definitely be "removable". With the "software CM" systems
> > > we have a couple of additional hints in the ACPI tables that can be used
> > > to identify the "tunneled" ports but this does not apply to the older
> > > systems I'm afraid.
> >
> > The below "might" work:
> >
> > 1. A device that is directly behind a PCIe root or downstream port that
> > has ->external_facing == 1.
> >
> > 2. It is a PCIe endpoint.
> >
> > 3. It is a sibling to or has any of the below PCI IDs (see
> > drivers/thunderbolt/nhi.h for the definitions):
> >
> > PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_4C_NHI
> > PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_2C_NHI
> > PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_USBONLY_NHI
> > PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_USBONLY_NHI
> > PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_USBONLY_NHI
> > PCI_DEVICE_ID_INTEL_TITAN_RIDGE_2C_NHI
> > PCI_DEVICE_ID_INTEL_TITAN_RIDGE_4C_NHI
> >
> > And for all USB4 we can use the PCI class:
> >
> > PCI_CLASS_SERIAL_USB_USB4
> >
> > (4. With USB4 systems we could also add the check that the device is not
> > below the tunneled PCIe ports but that's kind of redundant).
> >
> > I think this covers the existing devices as well as future discrete USB4
> > controllers and marks both the NHI and the xHCI as "fixed" which we
> > could also use to lift the bounce buffering restriction from them.
> >
> > @Mario, did I miss anything?
>
> The bounce buffering restriction is only for unaligned DMA isn't it? Does
> that tend to happen a lot?

AFAICT no but this would allow to use IOMMU identity mappings instead of
full mappings with these devices.

> But otherwise I think this does a good job. It will cover external
> enclosure cases too because of having to check it's directly behind a root
> port.
>
> But it should also include comments about why it's not needed on newer
> systems (IE the ACPI hints for someone with no prior knowledge looking at
> this to find).

Agree.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Fri, Jan 19, 2024 at 11:03?AM Esther Shimanovich
<eshimanovich@chromium.org> wrote:
> Next week I'll try those devices in our inventory to see if I can find
> another one with this bug. I'll get back to you on that!

I ended up finding 11 additional models with this bug, from various
manufacturers such as HP, Dell and Lenovo, that use the following
thunderbolt controllers: JHL6240, JHL6340, JHL6540, JHL7540. So making
this fix apply to all affected devices would make sense.

On Mon, Jan 22, 2024 at 1:10?AM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
> Yes, you are missing the 1. that it needs to be directly behind the PCIe
> root or downstream port that is marked as ->external_facing, and the
> fact that there can't be NHI's (that's the host controller with the IDs
> I listed in 3.) anywhere else except starting the topology according the
> USB4 spec (and the same applies to Thunderbolt 1-3).

Thanks for the explanation. I'll write up a patch that implements this
and takes into account all the feedback. Then I'll test it on multiple
models, and then send it your way. Let me know if it makes sense to
add you as a co-author.
Very much appreciate your insights.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Hey!
Asking for some help on implementation.
So I implemented most of this, and successfully tested the quirk on 6
different devices with various types of discrete fixed JHL Thunderbolt
chips.

However I want to add an additional check. A device wouldn't need this
quirk if it already has Thunderbolt functionality built in within its
Root Port.

I tried to use "is_thunderbolt" as an identifier for that type of
device--- but when I tested on a device with a thunderbolt root port,
"is_thunderbolt" was false for all the Thunderbolt PCI components
listed below.

~ # lspci -nn | grep Thunderbolt
00:07.0 PCI bridge [0604]: Intel Corporation Tiger Lake-LP Thunderbolt
4 PCI Express Root Port #1 [8086:9a25] (rev 01)
00:07.2 PCI bridge [0604]: Intel Corporation Tiger Lake-LP Thunderbolt
4 PCI Express Root Port #2 [8086:9a27] (rev 01)
00:0d.0 USB controller [0c03]: Intel Corporation Tiger Lake-LP
Thunderbolt 4 USB Controller [8086:9a13] (rev 01)
00:0d.2 USB controller [0c03]: Intel Corporation Tiger Lake-LP
Thunderbolt 4 NHI #0 [8086:9a1b] (rev 01)
00:0d.3 USB controller [0c03]: Intel Corporation Tiger Lake-LP
Thunderbolt 4 NHI #1 [8086:9a1d] (rev 01)

The device I tested was the Lenovo carbon X1 Gen 9. When
set_pcie_thunderbolt is run, the devices listed above return false on
the pci_find_next_ext_capability check.

I have spent some time trying to see if there are any ACPI values or
any alternative indicators to reliably know if a root port has
thunderbolt capabilities. I do see that ADBG is often set to TBT but
that looks like a debugging tool in ACPI.

I also looked through lspci -vvv and compared the output with a device
that has no Thunderbolt built into its CPU, which gave me nothing.

I also tried looking through values in /sys/bus and searching through
the kernel, looking through logs etc. The only option I see now is
figuring out how to get the string value returned by the lspci and
parsing it, but I'm not 100% sure if all Thunderbolt CPUs would have
Root ports specifically labeled as "Thunderbolt Root Port". I'm also
not sure if that value is supposed to be used in that way.

I was hoping that someone may have a suggestion here.


Just for reference, this is the fix I have so far. I don't want to
submit it as a new patch, until I resolve the above question.

+static bool get_pci_exp_upstream_port(struct pci_dev *dev,
+ struct pci_dev **upstream_port)
+{
+ struct pci_dev *parent = dev;
+
+ // If the dev is an upstream port, return itself
+ if (pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM) {
+ *upstream_port = dev;
+ return true;
+ }
+
+ // Iterate through the dev's parents to find its upstream port
+ while ((parent = pci_upstream_bridge(parent))) {
+ if (pci_pcie_type(parent) == PCI_EXP_TYPE_UPSTREAM) {
+ *upstream_port = parent;
+ return true;
+ }
+ }
+ return false;
+}
+
+static void relabel_internal_thunderbolt_chip(struct pci_dev *dev)
+{
+ struct pci_dev *upstream_port;
+ struct pci_dev *upstream_ports_parent;
+
+ // Get the upstream port closest to the dev
+ if (!(get_pci_exp_upstream_port(dev, &upstream_port)))
+ return;
+
+ // Next, we confirm if the upstream port is directly behind a PCIe root
+ // port.
+ if (!(upstream_ports_parent == pci_upstream_bridge(upstream_port)))
+ return;
+ if (!(pci_pcie_type(upstream_ports_parent) == PCI_EXP_TYPE_ROOT_PORT))
+ return; // quirk does not apply
+
+ // If a device's root port already has thunderbolt capabilities, then
+ // it shouldn't be using this quirk.
+ // TODO: THIS CHECK DOES NOT WORK
+ // I ALSO TRIED USING pci_is_thunderbolt_attached WHICH DIDN'T
WORK EITHER
+ if (!(upstream_ports_parent->is_thunderbolt))
+ return; // thunderbolt functionality is built into root port
+
+ // Apply quirk
+ dev_set_removable(&dev->dev, DEVICE_FIXED);
+
+ // External facing bridges must be marked as such so that devices
+ // below them can correctly be labeled as REMOVABLE
+ if ((pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)
+ && (dev->devfn == 0x08 | dev->devfn == 0x20))
+ dev->external_facing = true;
+}
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Hi,

On Mon, Apr 15, 2024 at 06:34:00PM -0400, Esther Shimanovich wrote:
> Hey!
> Asking for some help on implementation.
> So I implemented most of this, and successfully tested the quirk on 6
> different devices with various types of discrete fixed JHL Thunderbolt
> chips.
>
> However I want to add an additional check. A device wouldn't need this
> quirk if it already has Thunderbolt functionality built in within its
> Root Port.

There is really no "Thunderbolt functionality built in within its Root
Port".

More accurate is that the Thunderbolt/USB4 controller is integrated into
the CPU and the PCIe tunnels go out through the CPU PCIe Root Ports.

> I tried to use "is_thunderbolt" as an identifier for that type of
> device--- but when I tested on a device with a thunderbolt root port,
> "is_thunderbolt" was false for all the Thunderbolt PCI components
> listed below.

You should not use is_thunderbolt for anything else than with the legacy
Apple systems.

> ~ # lspci -nn | grep Thunderbolt
> 00:07.0 PCI bridge [0604]: Intel Corporation Tiger Lake-LP Thunderbolt
> 4 PCI Express Root Port #1 [8086:9a25] (rev 01)
> 00:07.2 PCI bridge [0604]: Intel Corporation Tiger Lake-LP Thunderbolt
> 4 PCI Express Root Port #2 [8086:9a27] (rev 01)
> 00:0d.0 USB controller [0c03]: Intel Corporation Tiger Lake-LP
> Thunderbolt 4 USB Controller [8086:9a13] (rev 01)
> 00:0d.2 USB controller [0c03]: Intel Corporation Tiger Lake-LP
> Thunderbolt 4 NHI #0 [8086:9a1b] (rev 01)
> 00:0d.3 USB controller [0c03]: Intel Corporation Tiger Lake-LP
> Thunderbolt 4 NHI #1 [8086:9a1d] (rev 01)

Okay this is integrated Thunderbolt/USB4 controller.

> The device I tested was the Lenovo carbon X1 Gen 9. When
> set_pcie_thunderbolt is run, the devices listed above return false on
> the pci_find_next_ext_capability check.
>
> I have spent some time trying to see if there are any ACPI values or
> any alternative indicators to reliably know if a root port has
> thunderbolt capabilities. I do see that ADBG is often set to TBT but
> that looks like a debugging tool in ACPI.
>
> I also looked through lspci -vvv and compared the output with a device
> that has no Thunderbolt built into its CPU, which gave me nothing.

For integrated and most recent systems (that's with the software
connection manager) the tunneled PCIe ports (Root or Downstream) can be
identified by looking at "usb4-host-interface" ACPI _DSD property. In
addition to this there is the "External Facing Port" ACPI _DSD property
attached to them too. Maybe these help?

With the firmware connection manager there is only the "External Facing
Port" _DSD though.

The Microsoft documentation for this that we also use in Linux is here:

https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Thank you for your response! It is very much appreciated.

On the Tiger Lake device I was testing on, the usb4-host-interface
value is NOT listed in its ACPI.

I then decided to query the ACPI values collected from devices in my
office, to see if this issue is limited to my device.
Ice Lake - 4 devices, none had "usb4-host-interface"
Tiger Lake - 31 devices, none have "usb4-host-interface"
Alder Lake - 32 devices, I see that 15 of them have
"usb4-host-interface" in their ACPI
Raptor Lake - 1 device, does not have "usb4-host-interface"

It looks like only Alder Lake has usb4-host-interface listed in its
ACPI for whatever reason.

It seems like I cannot use usb4-host-interface as a determinant
whether the CPU has Thunderbolt capabilities (thus not needing a
discrete Thunderbolt chip).
ExternalFacingPort is listed in devices that don't have CPUs with
Thunderbolts, so that can't be a determinant.

Am I missing something?
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Hi,

On Thu, Apr 18, 2024 at 03:43:08PM -0400, Esther Shimanovich wrote:
> Thank you for your response! It is very much appreciated.
>
> On the Tiger Lake device I was testing on, the usb4-host-interface
> value is NOT listed in its ACPI.
>
> I then decided to query the ACPI values collected from devices in my
> office, to see if this issue is limited to my device.
> Ice Lake - 4 devices, none had "usb4-host-interface"
> Tiger Lake - 31 devices, none have "usb4-host-interface"
> Alder Lake - 32 devices, I see that 15 of them have
> "usb4-host-interface" in their ACPI
> Raptor Lake - 1 device, does not have "usb4-host-interface"
>
> It looks like only Alder Lake has usb4-host-interface listed in its
> ACPI for whatever reason.

The reason is that it is there only for "software connection manager"
systems. That's Chrome TGL and then all ADL and beyond (with integrated
Thunderbolt/USB4).

> It seems like I cannot use usb4-host-interface as a determinant
> whether the CPU has Thunderbolt capabilities (thus not needing a
> discrete Thunderbolt chip).
> ExternalFacingPort is listed in devices that don't have CPUs with
> Thunderbolts, so that can't be a determinant.
>
> Am I missing something?

"ExternalFacingPort" is there for all firmware and software connection
manager systems but only if they use IOMMU for DMA security (also
/sys/bus/thunderbolt/devices/domainX/iommu_dma_protection == 1). So this
is all integrated (Ice Lake+), then Alpine Ridge, Titan Ridge and Maple
Ridge based recent systems.

That leaves out still older systems with firmware connection manager
with the "legacy" security settings. This is pretty much Alpine and
Titan Ridge and I think those you can identify using their PCI IDs
directly as the list will not be too big. Something like this taken from
the drivers/thunderbolt/nhi.h:

#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_LP_BRIDGE 0x15c0
#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_4C_BRIDGE 0x15d3
#define PCI_DEVICE_ID_INTEL_ALPINE_RIDGE_C_2C_BRIDGE 0x15da
#define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_2C_BRIDGE 0x15e7
#define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_4C_BRIDGE 0x15ea
#define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_DD_BRIDGE 0x15ef

In addition to this some of the systems used the BIOS "assisted"
enumeration which means that the controller is actually only present if
you have either USB or Thunderbolt device connected. Othertimes it is
not present at all (not sure if you want to label these differently
though).

In summary if you want to find out if the CPU has an integrated
Thunderbolt/USB4 controller (I think this is what you ask above) then
the best bet would be to use "ExternalFacingPort" because that is
present in all those systems.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
Thanks for the explanation! I still don't fully understand how that
would work for my use case.

Perhaps it would be better for me to describe the case I am trying to
protect against.

To rehash, this quirk was written for devices with discrete
Thunderbolt controllers.

For example,
CometLake_CPU -> AlpineRidge_Chip -> USB-C Port
This device has the ExternalFacingPort property in ACPI.
My quirk relabels the Alpine Ridge chip as "fixed" and
external-facing, so that devices attached to the USB-C port could be
labeled as "removable"

Let's say we have a TigerLake CPU, which has integrated
Thunderbolt/USB4 capabilities:

TigerLake_ThunderboltCPU -> USB-C Port
This device also has the ExternalFacingPort property in ACPI and lacks
the usb4-host-interface property in the ACPI.

My worry is that someone could take an Alpine Ridge Chip Thunderbolt
Dock and attach it to the TigerLake CPU

TigerLake_ThunderboltCPU -> USB-C Port -> AlpineRidge_Dock

If that were to happen, this quirk would incorrectly label the Alpine
Ridge Dock as "fixed" instead of "removable".

My thinking was that we could prevent this scenario from occurring if
we filtered this quirk not to apply on CPU's like Tiger Lake, with
integrated Thunderbolt/USB4 capabilities.

ExternalFacingPort is found both on the Comet Lake ACPI and also on
the Tiger Lake ACPI. So I can't use that to distinguish between CPUs
which don't have integrated Thunderbolt, like Comet Lake, and CPUs
with integrated Thunderbolt, like Tiger Lake.

I am looking for something that can tell me if the device's Root Port
has the Thunderbolt controller upstream to it or not.
Is there anything like that?
Or perhaps should I add a check which compares the name of the
device's CPU with a list of CPUs that this quirk can be applied to?
Or is there some way I can identify the Thunderbolt controller, then
determine if it's upstream or downstream from the root port?
Or are Alpine Ridge docks not something to worry about at all?
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On 4/22/2024 14:17, Esther Shimanovich wrote:
> Thanks for the explanation! I still don't fully understand how that
> would work for my use case.
>
> Perhaps it would be better for me to describe the case I am trying to
> protect against.
>
> To rehash, this quirk was written for devices with discrete
> Thunderbolt controllers.
>
> For example,
> CometLake_CPU -> AlpineRidge_Chip -> USB-C Port
> This device has the ExternalFacingPort property in ACPI.
> My quirk relabels the Alpine Ridge chip as "fixed" and
> external-facing, so that devices attached to the USB-C port could be
> labeled as "removable"
>
> Let's say we have a TigerLake CPU, which has integrated
> Thunderbolt/USB4 capabilities:
>
> TigerLake_ThunderboltCPU -> USB-C Port
> This device also has the ExternalFacingPort property in ACPI and lacks
> the usb4-host-interface property in the ACPI.
>
> My worry is that someone could take an Alpine Ridge Chip Thunderbolt
> Dock and attach it to the TigerLake CPU
>
> TigerLake_ThunderboltCPU -> USB-C Port -> AlpineRidge_Dock
>
> If that were to happen, this quirk would incorrectly label the Alpine
> Ridge Dock as "fixed" instead of "removable".
>
> My thinking was that we could prevent this scenario from occurring if
> we filtered this quirk not to apply on CPU's like Tiger Lake, with
> integrated Thunderbolt/USB4 capabilities.
>
> ExternalFacingPort is found both on the Comet Lake ACPI and also on
> the Tiger Lake ACPI. So I can't use that to distinguish between CPUs
> which don't have integrated Thunderbolt, like Comet Lake, and CPUs
> with integrated Thunderbolt, like Tiger Lake.
>
> I am looking for something that can tell me if the device's Root Port
> has the Thunderbolt controller upstream to it or not.
> Is there anything like that?
> Or perhaps should I add a check which compares the name of the
> device's CPU with a list of CPUs that this quirk can be applied to?
> Or is there some way I can identify the Thunderbolt controller, then
> determine if it's upstream or downstream from the root port?
> Or are Alpine Ridge docks not something to worry about at all?

My thought was once you have a device as untrusted, everything else
connected to it should "also" be untrusted.
Re: [PATCH v4] PCI: Relabel JHL6540 on Lenovo X1 Carbon 7,8 [ In reply to ]
On Mon, Apr 22, 2024 at 02:21:18PM -0500, Mario Limonciello wrote:
> On 4/22/2024 14:17, Esther Shimanovich wrote:
> > Thanks for the explanation! I still don't fully understand how that
> > would work for my use case.
> >
> > Perhaps it would be better for me to describe the case I am trying to
> > protect against.
> >
> > To rehash, this quirk was written for devices with discrete
> > Thunderbolt controllers.
> >
> > For example,
> > CometLake_CPU -> AlpineRidge_Chip -> USB-C Port
> > This device has the ExternalFacingPort property in ACPI.
> > My quirk relabels the Alpine Ridge chip as "fixed" and
> > external-facing, so that devices attached to the USB-C port could be
> > labeled as "removable"
> >
> > Let's say we have a TigerLake CPU, which has integrated
> > Thunderbolt/USB4 capabilities:
> >
> > TigerLake_ThunderboltCPU -> USB-C Port
> > This device also has the ExternalFacingPort property in ACPI and lacks
> > the usb4-host-interface property in the ACPI.
> >
> > My worry is that someone could take an Alpine Ridge Chip Thunderbolt
> > Dock and attach it to the TigerLake CPU
> >
> > TigerLake_ThunderboltCPU -> USB-C Port -> AlpineRidge_Dock
> >
> > If that were to happen, this quirk would incorrectly label the Alpine
> > Ridge Dock as "fixed" instead of "removable".
> >
> > My thinking was that we could prevent this scenario from occurring if
> > we filtered this quirk not to apply on CPU's like Tiger Lake, with
> > integrated Thunderbolt/USB4 capabilities.
> >
> > ExternalFacingPort is found both on the Comet Lake ACPI and also on
> > the Tiger Lake ACPI. So I can't use that to distinguish between CPUs
> > which don't have integrated Thunderbolt, like Comet Lake, and CPUs
> > with integrated Thunderbolt, like Tiger Lake.
> >
> > I am looking for something that can tell me if the device's Root Port
> > has the Thunderbolt controller upstream to it or not.
> > Is there anything like that?
> > Or perhaps should I add a check which compares the name of the
> > device's CPU with a list of CPUs that this quirk can be applied to?
> > Or is there some way I can identify the Thunderbolt controller, then
> > determine if it's upstream or downstream from the root port?
> > Or are Alpine Ridge docks not something to worry about at all?
>
> My thought was once you have a device as untrusted, everything else
> connected to it should "also" be untrusted.

I think what you are looking for is that anything behind a PCIe tunnel
should not have this applied. IIRC the AMD GPU or some code there were
going to add identification of "virtual" links to the bandwidth
calculation functionality.

@Mario, do you remember if this was done already and if that could maybe
be re-used here?

The other way I think is something like this:

- If it does not have "usb4-host-interface" property (or behind a port
that has that). These are all tunneled (e.g virtual).

- It is directly connected to a PCIe root port with
"ExternalFacingPort" and it has sibling device that is "Thunderbolt
NHI". This is because you can only have "NHI" on a host router
according to the USB4 spec.

I may be forgetting something though.

1 2  View All