From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Some IDT switches behave erratically when ACS Source Validation is enabled.
For example, they incorrectly flag an ACS Source Validation error on
completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
says that completions are never affected by ACS Source Validation.
Even though IDT suggests working around this issue by issuing a config
write before the first config read, so that the device caches the bus and
device number. But it would still be fragile since the device could loose
the IDs after the reset and any further access may trigger ACS SV
violation.
Hence, to properly fix the issue, the respective capability needs to be
disabled. Since the ACS Capabilities are RO values, and are cached in the
'pci_dev::acs_capabilities' field, remove the cached broken capabilities by
calling pci_disable_broken_acs_cap() from pci_acs_init(). This will allow
pci_enable_acs() helper to disable the relevant ACS ctrls.
With this, the previous workaround can now be safely removed.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/pci.c | 1 +
drivers/pci/pci.h | 3 ++-
drivers/pci/probe.c | 12 -----------
drivers/pci/quirks.c | 61 ++++++++++++----------------------------------------
4 files changed, 17 insertions(+), 60 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d89b04451aea..e16229e7ff6f 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3652,6 +3652,7 @@ void pci_acs_init(struct pci_dev *dev)
return;
pci_read_config_word(dev, pos + PCI_ACS_CAP, &dev->acs_capabilities);
+ pci_disable_broken_acs_cap(dev);
}
/**
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4592ede0ebcc..5fe5d6e84c95 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -432,7 +432,6 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int rrs_timeout);
bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int rrs_timeout);
-int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *pl, int rrs_timeout);
int pci_setup_device(struct pci_dev *dev);
void __pci_size_stdbars(struct pci_dev *dev, int count,
@@ -944,6 +943,7 @@ void pci_enable_acs(struct pci_dev *dev);
int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
int pci_dev_specific_enable_acs(struct pci_dev *dev);
int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
+void pci_disable_broken_acs_cap(struct pci_dev *pdev);
int pcie_failed_link_retrain(struct pci_dev *dev);
#else
static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
@@ -959,6 +959,7 @@ static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
{
return -ENOTTY;
}
+static inline void pci_disable_broken_acs_cap(struct pci_dev *dev) { }
static inline int pcie_failed_link_retrain(struct pci_dev *dev)
{
return -ENOTTY;
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 41183aed8f5d..c7304ac5afc2 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2547,18 +2547,6 @@ bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
int timeout)
{
-#ifdef CONFIG_PCI_QUIRKS
- struct pci_dev *bridge = bus->self;
-
- /*
- * Certain IDT switches have an issue where they improperly trigger
- * ACS Source Validation errors on completions for config reads.
- */
- if (bridge && bridge->vendor == PCI_VENDOR_ID_IDT &&
- bridge->device == 0x80b5)
- return pci_idt_bus_quirk(bus, devfn, l, timeout);
-#endif
-
return pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
}
EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b9c252aa6fe0..1571a2ef331e 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5778,58 +5778,25 @@ DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
PCI_BASE_CLASS_DISPLAY, 16, quirk_nvidia_hda);
/*
- * Some IDT switches incorrectly flag an ACS Source Validation error on
- * completions for config read requests even though PCIe r4.0, sec
- * 6.12.1.1, says that completions are never affected by ACS Source
- * Validation. Here's the text of IDT 89H32H8G3-YC, erratum #36:
+ * Some IDT switches behave erratically when ACS Source Validation is enabled.
+ * For example, they incorrectly flag an ACS Source Validation error on
+ * completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
+ * says that completions are never affected by ACS Source Validation.
*
- * Item #36 - Downstream port applies ACS Source Validation to Completions
- * Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that
- * completions are never affected by ACS Source Validation. However,
- * completions received by a downstream port of the PCIe switch from a
- * device that has not yet captured a PCIe bus number are incorrectly
- * dropped by ACS Source Validation by the switch downstream port.
+ * Even though IDT suggests working around this issue by issuing a config write
+ * before the first config read, so that the switch caches the bus and device
+ * number, it would still be fragile since the device could loose the IDs after
+ * the reset.
*
- * The workaround suggested by IDT is to issue a config write to the
- * downstream device before issuing the first config read. This allows the
- * downstream device to capture its bus and device numbers (see PCIe r4.0,
- * sec 2.2.9), thus avoiding the ACS error on the completion.
- *
- * However, we don't know when the device is ready to accept the config
- * write, so we do config reads until we receive a non-Config Request Retry
- * Status, then do the config write.
- *
- * To avoid hitting the erratum when doing the config reads, we disable ACS
- * SV around this process.
+ * Hence, a reliable fix would be to assume that these switches don't support
+ * ACS SV.
*/
-int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *l, int timeout)
+void pci_disable_broken_acs_cap(struct pci_dev *pdev)
{
- int pos;
- u16 ctrl = 0;
- bool found;
- struct pci_dev *bridge = bus->self;
-
- pos = bridge->acs_cap;
-
- /* Disable ACS SV before initial config reads */
- if (pos) {
- pci_read_config_word(bridge, pos + PCI_ACS_CTRL, &ctrl);
- if (ctrl & PCI_ACS_SV)
- pci_write_config_word(bridge, pos + PCI_ACS_CTRL,
- ctrl & ~PCI_ACS_SV);
+ if (pdev->vendor == PCI_VENDOR_ID_IDT && pdev->device == 0x80b5) {
+ pci_info(pdev, "Disabling broken ACS SV\n");
+ pdev->acs_capabilities &= ~PCI_ACS_SV;
}
-
- found = pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
-
- /* Write Vendor ID (read-only) so the endpoint latches its bus/dev */
- if (found)
- pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);
-
- /* Re-enable ACS_SV if it was previously enabled */
- if (ctrl & PCI_ACS_SV)
- pci_write_config_word(bridge, pos + PCI_ACS_CTRL, ctrl);
-
- return found;
}
/*
--
2.48.1
[+cc Alex, James (author of aa667c6408d2)]
On Fri, Jan 02, 2026 at 09:04:49PM +0530, Manivannan Sadhasivam wrote:
> Some IDT switches behave erratically when ACS Source Validation is enabled.
> For example, they incorrectly flag an ACS Source Validation error on
> completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
> says that completions are never affected by ACS Source Validation.
>
> Even though IDT suggests working around this issue by issuing a config
> write before the first config read, so that the device caches the bus and
> device number. But it would still be fragile since the device could loose
> the IDs after the reset and any further access may trigger ACS SV
> violation.
>
> Hence, to properly fix the issue, the respective capability needs to be
> disabled. Since the ACS Capabilities are RO values, and are cached in the
> 'pci_dev::acs_capabilities' field, remove the cached broken capabilities by
> calling pci_disable_broken_acs_cap() from pci_acs_init(). This will allow
> pci_enable_acs() helper to disable the relevant ACS ctrls.
>
> With this, the previous workaround can now be safely removed.
James added the existing IDT workaround for the IDT 0x80b5 switch,
which was merged as aa667c6408d2 ("PCI: Workaround IDT switch ACS
Source Validation erratum").
I guess these last three patches:
PCI: Cache ACS capabilities
PCI: Disable ACS SV capability for the broken IDT switches
PCI: Extend the pci_disable_broken_acs_cap() quirk for one more IDT switch
are basically to fix the same issue on the IDT 0x8090 switch?
The obvious approach would be to extend the test in
pci_bus_read_dev_vendor_id() to check for 0x8090 in addition to
0x80b5, which might be worth doing first, before changing the whole
approach.
I guess your point here is that the existing workaround isn't enough
even for 0x80b5 because it doesn't handle the reset case? The patch
changelogs after v9 of the original patch (see below) do mention that
FLR and hot reset were tested and didn't seem to require the
workaround. But you've probably seen problems after reset?
I think it's worth a note about why the reset case can't be fixed
similarly to the enumeration case.
This patch gives up on ACS SV completely for this switch instead of
the current approach of:
- disabling ACS SV
- doing a config write
- reading dev/vendor ID
- re-enabling ACS SV
It'd be worth expanding on this and what the effect of avoiding ACS SV
is. Does this change which devices can be safely passed through to
virtual guests? Does it give up isolation that users expect?
I also couldn't remember why we can't just do a config write before
reading the vendor/dev ID of a device below the switch, which is
addressed in the discussion of v3 (below). Basically if the device
isn't yet ready to accept config accesses, it may not latch the bus
number, and we can't really tell when it *is* ready.
I collected up all the history I could find about the original change:
v3 2017-06-09 https://lore.kernel.org/all/20170609231617.20243-1-yinghai@kernel.org/
doing config write is hard because we don't know when the device
is Configuration-Ready and can capture the bus number:
https://lore.kernel.org/all/20170613221451.GC7012@bhelgaas-glaptop.roam.corp.google.com/
v4 2017-06-27 https://lore.kernel.org/all/5952D144.8060609@oracle.com/
only do workaround for IDT switches
v5 2017-07-06 https://lore.kernel.org/all/595E610A.5000900@oracle.com/
skip devices that don't advertise ACS SV support
v7 2017-09-18 https://lore.kernel.org/all/59C02719.5050103@oracle.com/
add https://bugzilla.kernel.org/show_bug.cgi?id=196979 with
details about how Windows deals with this
v8 2018-03-06 https://lore.kernel.org/all/ce8e9a3e-474d-961c-eb0a-c021077f2dca@oracle.com/
v9 ? (couldn't find this)
tested FLR and hot reset scenarios; didn't see issues where
workaround was required
v11 2018-04-11 https://lore.kernel.org/all/e6a55432-8c56-9c99-ce99-4ae80a0ed3ba@oracle.com/ (no version label)
2018-04-19 https://lore.kernel.org/all/b6439e86-2284-5b3c-3656-a0c3c2568317@oracle.com/ (no version label)
v12 2018-04-24 https://lore.kernel.org/all/7e9f5905-34c5-f24c-8b13-df36b8bf3621@oracle.com/
v13 2018-04-30 https://lore.kernel.org/all/4bd28931-8e8d-6bfe-a98c-49f8e2778c6e@oracle.com/ (labeled v2)
v14 2018-07-09 https://lore.kernel.org/all/4117c119-c754-bf4e-544c-19cb6e239f38@oracle.com/
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> ---
> drivers/pci/pci.c | 1 +
> drivers/pci/pci.h | 3 ++-
> drivers/pci/probe.c | 12 -----------
> drivers/pci/quirks.c | 61 ++++++++++++----------------------------------------
> 4 files changed, 17 insertions(+), 60 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index d89b04451aea..e16229e7ff6f 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3652,6 +3652,7 @@ void pci_acs_init(struct pci_dev *dev)
> return;
>
> pci_read_config_word(dev, pos + PCI_ACS_CAP, &dev->acs_capabilities);
> + pci_disable_broken_acs_cap(dev);
> }
>
> /**
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 4592ede0ebcc..5fe5d6e84c95 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -432,7 +432,6 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
> int rrs_timeout);
> bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
> int rrs_timeout);
> -int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *pl, int rrs_timeout);
>
> int pci_setup_device(struct pci_dev *dev);
> void __pci_size_stdbars(struct pci_dev *dev, int count,
> @@ -944,6 +943,7 @@ void pci_enable_acs(struct pci_dev *dev);
> int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
> int pci_dev_specific_enable_acs(struct pci_dev *dev);
> int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
> +void pci_disable_broken_acs_cap(struct pci_dev *pdev);
> int pcie_failed_link_retrain(struct pci_dev *dev);
> #else
> static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
> @@ -959,6 +959,7 @@ static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
> {
> return -ENOTTY;
> }
> +static inline void pci_disable_broken_acs_cap(struct pci_dev *dev) { }
> static inline int pcie_failed_link_retrain(struct pci_dev *dev)
> {
> return -ENOTTY;
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 41183aed8f5d..c7304ac5afc2 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2547,18 +2547,6 @@ bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> int timeout)
> {
> -#ifdef CONFIG_PCI_QUIRKS
> - struct pci_dev *bridge = bus->self;
> -
> - /*
> - * Certain IDT switches have an issue where they improperly trigger
> - * ACS Source Validation errors on completions for config reads.
> - */
> - if (bridge && bridge->vendor == PCI_VENDOR_ID_IDT &&
> - bridge->device == 0x80b5)
> - return pci_idt_bus_quirk(bus, devfn, l, timeout);
> -#endif
> -
> return pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
> }
> EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index b9c252aa6fe0..1571a2ef331e 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5778,58 +5778,25 @@ DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
> PCI_BASE_CLASS_DISPLAY, 16, quirk_nvidia_hda);
>
> /*
> - * Some IDT switches incorrectly flag an ACS Source Validation error on
> - * completions for config read requests even though PCIe r4.0, sec
> - * 6.12.1.1, says that completions are never affected by ACS Source
> - * Validation. Here's the text of IDT 89H32H8G3-YC, erratum #36:
> + * Some IDT switches behave erratically when ACS Source Validation is enabled.
> + * For example, they incorrectly flag an ACS Source Validation error on
> + * completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
> + * says that completions are never affected by ACS Source Validation.
> *
> - * Item #36 - Downstream port applies ACS Source Validation to Completions
> - * Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that
> - * completions are never affected by ACS Source Validation. However,
> - * completions received by a downstream port of the PCIe switch from a
> - * device that has not yet captured a PCIe bus number are incorrectly
> - * dropped by ACS Source Validation by the switch downstream port.
> + * Even though IDT suggests working around this issue by issuing a config write
> + * before the first config read, so that the switch caches the bus and device
> + * number, it would still be fragile since the device could loose the IDs after
> + * the reset.
> *
> - * The workaround suggested by IDT is to issue a config write to the
> - * downstream device before issuing the first config read. This allows the
> - * downstream device to capture its bus and device numbers (see PCIe r4.0,
> - * sec 2.2.9), thus avoiding the ACS error on the completion.
> - *
> - * However, we don't know when the device is ready to accept the config
> - * write, so we do config reads until we receive a non-Config Request Retry
> - * Status, then do the config write.
> - *
> - * To avoid hitting the erratum when doing the config reads, we disable ACS
> - * SV around this process.
> + * Hence, a reliable fix would be to assume that these switches don't support
> + * ACS SV.
> */
> -int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *l, int timeout)
> +void pci_disable_broken_acs_cap(struct pci_dev *pdev)
> {
> - int pos;
> - u16 ctrl = 0;
> - bool found;
> - struct pci_dev *bridge = bus->self;
> -
> - pos = bridge->acs_cap;
> -
> - /* Disable ACS SV before initial config reads */
> - if (pos) {
> - pci_read_config_word(bridge, pos + PCI_ACS_CTRL, &ctrl);
> - if (ctrl & PCI_ACS_SV)
> - pci_write_config_word(bridge, pos + PCI_ACS_CTRL,
> - ctrl & ~PCI_ACS_SV);
> + if (pdev->vendor == PCI_VENDOR_ID_IDT && pdev->device == 0x80b5) {
> + pci_info(pdev, "Disabling broken ACS SV\n");
> + pdev->acs_capabilities &= ~PCI_ACS_SV;
> }
> -
> - found = pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
> -
> - /* Write Vendor ID (read-only) so the endpoint latches its bus/dev */
> - if (found)
> - pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);
> -
> - /* Re-enable ACS_SV if it was previously enabled */
> - if (ctrl & PCI_ACS_SV)
> - pci_write_config_word(bridge, pos + PCI_ACS_CTRL, ctrl);
> -
> - return found;
> }
>
> /*
>
> --
> 2.48.1
>
On Thu, Feb 05, 2026 at 05:39:20PM -0600, Bjorn Helgaas wrote:
> [+cc Alex, James (author of aa667c6408d2)]
>
> On Fri, Jan 02, 2026 at 09:04:49PM +0530, Manivannan Sadhasivam wrote:
> > Some IDT switches behave erratically when ACS Source Validation is enabled.
> > For example, they incorrectly flag an ACS Source Validation error on
> > completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
> > says that completions are never affected by ACS Source Validation.
> >
> > Even though IDT suggests working around this issue by issuing a config
> > write before the first config read, so that the device caches the bus and
> > device number. But it would still be fragile since the device could loose
> > the IDs after the reset and any further access may trigger ACS SV
> > violation.
> >
> > Hence, to properly fix the issue, the respective capability needs to be
> > disabled. Since the ACS Capabilities are RO values, and are cached in the
> > 'pci_dev::acs_capabilities' field, remove the cached broken capabilities by
> > calling pci_disable_broken_acs_cap() from pci_acs_init(). This will allow
> > pci_enable_acs() helper to disable the relevant ACS ctrls.
> >
> > With this, the previous workaround can now be safely removed.
>
> James added the existing IDT workaround for the IDT 0x80b5 switch,
> which was merged as aa667c6408d2 ("PCI: Workaround IDT switch ACS
> Source Validation erratum").
>
> I guess these last three patches:
>
> PCI: Cache ACS capabilities
> PCI: Disable ACS SV capability for the broken IDT switches
> PCI: Extend the pci_disable_broken_acs_cap() quirk for one more IDT switch
>
> are basically to fix the same issue on the IDT 0x8090 switch?
>
Yes!
> The obvious approach would be to extend the test in
> pci_bus_read_dev_vendor_id() to check for 0x8090 in addition to
> 0x80b5, which might be worth doing first, before changing the whole
> approach.
>
I've mentioned the rationale for not using this approach in the cover letter.
But I'll mention it here also. I did try extending the existing quirk in the
first version of this series [1], but Jason proposed to disable ACS completely
for these switches as doing a dummy config write before reading the vendor ID
would not help if the switch gets reset during runtime. The cached bus number in
the device would get lost, triggering ACS SV again.
> I guess your point here is that the existing workaround isn't enough
> even for 0x80b5 because it doesn't handle the reset case? The patch
> changelogs after v9 of the original patch (see below) do mention that
> FLR and hot reset were tested and didn't seem to require the
> workaround. But you've probably seen problems after reset?
>
I haven't seen the problem myself, but Jason's point was that there is no
spec mandate for the switch to retain its bus number after FLR/hot reset [2]
> I think it's worth a note about why the reset case can't be fixed
> similarly to the enumeration case.
>
If the device looses its cached bus number, then further config reads from the
host would trigger ACS SV. And we do not want to do a dummy write from all
possible locations in PCI core.
> This patch gives up on ACS SV completely for this switch instead of
> the current approach of:
>
> - disabling ACS SV
> - doing a config write
> - reading dev/vendor ID
> - re-enabling ACS SV
>
> It'd be worth expanding on this and what the effect of avoiding ACS SV
> is. Does this change which devices can be safely passed through to
> virtual guests? Does it give up isolation that users expect?
>
IMO, ACS SV is somewhat broken on this switch. But we can still passthrough the
downstream devices to the guests. There won't be ACS SV apparently, but that's
what users will get with broken hw.
> I also couldn't remember why we can't just do a config write before
> reading the vendor/dev ID of a device below the switch, which is
> addressed in the discussion of v3 (below). Basically if the device
> isn't yet ready to accept config accesses, it may not latch the bus
> number, and we can't really tell when it *is* ready.
>
I think that is a separate discussion (detecting device readiness). But doing a
dummy config write to woraround the hardware bug looks fragile to me too.
- Mani
[1] https://lore.kernel.org/linux-pci/20250910-pci-acs-v1-0-fe9adb65ad7d@oss.qualcomm.com/
[2] https://lore.kernel.org/linux-pci/20250924125555.GJ2547959@ziepe.ca/
> I collected up all the history I could find about the original change:
>
> v3 2017-06-09 https://lore.kernel.org/all/20170609231617.20243-1-yinghai@kernel.org/
> doing config write is hard because we don't know when the device
> is Configuration-Ready and can capture the bus number:
> https://lore.kernel.org/all/20170613221451.GC7012@bhelgaas-glaptop.roam.corp.google.com/
> v4 2017-06-27 https://lore.kernel.org/all/5952D144.8060609@oracle.com/
> only do workaround for IDT switches
> v5 2017-07-06 https://lore.kernel.org/all/595E610A.5000900@oracle.com/
> skip devices that don't advertise ACS SV support
> v7 2017-09-18 https://lore.kernel.org/all/59C02719.5050103@oracle.com/
> add https://bugzilla.kernel.org/show_bug.cgi?id=196979 with
> details about how Windows deals with this
> v8 2018-03-06 https://lore.kernel.org/all/ce8e9a3e-474d-961c-eb0a-c021077f2dca@oracle.com/
> v9 ? (couldn't find this)
> tested FLR and hot reset scenarios; didn't see issues where
> workaround was required
> v11 2018-04-11 https://lore.kernel.org/all/e6a55432-8c56-9c99-ce99-4ae80a0ed3ba@oracle.com/ (no version label)
> 2018-04-19 https://lore.kernel.org/all/b6439e86-2284-5b3c-3656-a0c3c2568317@oracle.com/ (no version label)
> v12 2018-04-24 https://lore.kernel.org/all/7e9f5905-34c5-f24c-8b13-df36b8bf3621@oracle.com/
> v13 2018-04-30 https://lore.kernel.org/all/4bd28931-8e8d-6bfe-a98c-49f8e2778c6e@oracle.com/ (labeled v2)
> v14 2018-07-09 https://lore.kernel.org/all/4117c119-c754-bf4e-544c-19cb6e239f38@oracle.com/
>
> > Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> > ---
> > drivers/pci/pci.c | 1 +
> > drivers/pci/pci.h | 3 ++-
> > drivers/pci/probe.c | 12 -----------
> > drivers/pci/quirks.c | 61 ++++++++++++----------------------------------------
> > 4 files changed, 17 insertions(+), 60 deletions(-)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index d89b04451aea..e16229e7ff6f 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3652,6 +3652,7 @@ void pci_acs_init(struct pci_dev *dev)
> > return;
> >
> > pci_read_config_word(dev, pos + PCI_ACS_CAP, &dev->acs_capabilities);
> > + pci_disable_broken_acs_cap(dev);
> > }
> >
> > /**
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index 4592ede0ebcc..5fe5d6e84c95 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -432,7 +432,6 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
> > int rrs_timeout);
> > bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
> > int rrs_timeout);
> > -int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *pl, int rrs_timeout);
> >
> > int pci_setup_device(struct pci_dev *dev);
> > void __pci_size_stdbars(struct pci_dev *dev, int count,
> > @@ -944,6 +943,7 @@ void pci_enable_acs(struct pci_dev *dev);
> > int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
> > int pci_dev_specific_enable_acs(struct pci_dev *dev);
> > int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
> > +void pci_disable_broken_acs_cap(struct pci_dev *pdev);
> > int pcie_failed_link_retrain(struct pci_dev *dev);
> > #else
> > static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
> > @@ -959,6 +959,7 @@ static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
> > {
> > return -ENOTTY;
> > }
> > +static inline void pci_disable_broken_acs_cap(struct pci_dev *dev) { }
> > static inline int pcie_failed_link_retrain(struct pci_dev *dev)
> > {
> > return -ENOTTY;
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 41183aed8f5d..c7304ac5afc2 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -2547,18 +2547,6 @@ bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> > bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> > int timeout)
> > {
> > -#ifdef CONFIG_PCI_QUIRKS
> > - struct pci_dev *bridge = bus->self;
> > -
> > - /*
> > - * Certain IDT switches have an issue where they improperly trigger
> > - * ACS Source Validation errors on completions for config reads.
> > - */
> > - if (bridge && bridge->vendor == PCI_VENDOR_ID_IDT &&
> > - bridge->device == 0x80b5)
> > - return pci_idt_bus_quirk(bus, devfn, l, timeout);
> > -#endif
> > -
> > return pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
> > }
> > EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index b9c252aa6fe0..1571a2ef331e 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -5778,58 +5778,25 @@ DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
> > PCI_BASE_CLASS_DISPLAY, 16, quirk_nvidia_hda);
> >
> > /*
> > - * Some IDT switches incorrectly flag an ACS Source Validation error on
> > - * completions for config read requests even though PCIe r4.0, sec
> > - * 6.12.1.1, says that completions are never affected by ACS Source
> > - * Validation. Here's the text of IDT 89H32H8G3-YC, erratum #36:
> > + * Some IDT switches behave erratically when ACS Source Validation is enabled.
> > + * For example, they incorrectly flag an ACS Source Validation error on
> > + * completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
> > + * says that completions are never affected by ACS Source Validation.
> > *
> > - * Item #36 - Downstream port applies ACS Source Validation to Completions
> > - * Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that
> > - * completions are never affected by ACS Source Validation. However,
> > - * completions received by a downstream port of the PCIe switch from a
> > - * device that has not yet captured a PCIe bus number are incorrectly
> > - * dropped by ACS Source Validation by the switch downstream port.
> > + * Even though IDT suggests working around this issue by issuing a config write
> > + * before the first config read, so that the switch caches the bus and device
> > + * number, it would still be fragile since the device could loose the IDs after
> > + * the reset.
> > *
> > - * The workaround suggested by IDT is to issue a config write to the
> > - * downstream device before issuing the first config read. This allows the
> > - * downstream device to capture its bus and device numbers (see PCIe r4.0,
> > - * sec 2.2.9), thus avoiding the ACS error on the completion.
> > - *
> > - * However, we don't know when the device is ready to accept the config
> > - * write, so we do config reads until we receive a non-Config Request Retry
> > - * Status, then do the config write.
> > - *
> > - * To avoid hitting the erratum when doing the config reads, we disable ACS
> > - * SV around this process.
> > + * Hence, a reliable fix would be to assume that these switches don't support
> > + * ACS SV.
> > */
> > -int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *l, int timeout)
> > +void pci_disable_broken_acs_cap(struct pci_dev *pdev)
> > {
> > - int pos;
> > - u16 ctrl = 0;
> > - bool found;
> > - struct pci_dev *bridge = bus->self;
> > -
> > - pos = bridge->acs_cap;
> > -
> > - /* Disable ACS SV before initial config reads */
> > - if (pos) {
> > - pci_read_config_word(bridge, pos + PCI_ACS_CTRL, &ctrl);
> > - if (ctrl & PCI_ACS_SV)
> > - pci_write_config_word(bridge, pos + PCI_ACS_CTRL,
> > - ctrl & ~PCI_ACS_SV);
> > + if (pdev->vendor == PCI_VENDOR_ID_IDT && pdev->device == 0x80b5) {
> > + pci_info(pdev, "Disabling broken ACS SV\n");
> > + pdev->acs_capabilities &= ~PCI_ACS_SV;
> > }
> > -
> > - found = pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout);
> > -
> > - /* Write Vendor ID (read-only) so the endpoint latches its bus/dev */
> > - if (found)
> > - pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0);
> > -
> > - /* Re-enable ACS_SV if it was previously enabled */
> > - if (ctrl & PCI_ACS_SV)
> > - pci_write_config_word(bridge, pos + PCI_ACS_CTRL, ctrl);
> > -
> > - return found;
> > }
> >
> > /*
> >
> > --
> > 2.48.1
> >
--
மணிவண்ணன் சதாசிவம்
On Fri, Feb 06, 2026 at 02:41:36PM +0530, Manivannan Sadhasivam wrote: > > It'd be worth expanding on this and what the effect of avoiding ACS SV > > is. Does this change which devices can be safely passed through to > > virtual guests? Does it give up isolation that users expect? > > > > IMO, ACS SV is somewhat broken on this switch. But we can still passthrough the > downstream devices to the guests. There won't be ACS SV apparently, but that's > what users will get with broken hw. I agree with this, the HW is very broken, let's have it at least work properly in Linux on bare metal out of the box. If someone really insists they need virtualization with narrow groups on this HW then they need to come with a more complete fix. Using VFIO is going to open up the reset flows that are problematic with the current solution, so it isn't like that is already working fully. Somehow I suspect nobody would use this switch for virtualization :) Jason
On Fri, Feb 06, 2026 at 10:30:14AM -0400, Jason Gunthorpe wrote: > On Fri, Feb 06, 2026 at 02:41:36PM +0530, Manivannan Sadhasivam wrote: > > > It'd be worth expanding on this and what the effect of avoiding > > > ACS SV is. Does this change which devices can be safely passed > > > through to virtual guests? Does it give up isolation that users > > > expect? > > > > IMO, ACS SV is somewhat broken on this switch. But we can still > > passthrough the downstream devices to the guests. There won't be > > ACS SV apparently, but that's what users will get with broken hw. > > I agree with this, the HW is very broken, let's have it at least > work properly in Linux on bare metal out of the box. I'm assuming the bare metal part could be done by something like this: @@ -2555,7 +2555,7 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l, * ACS Source Validation errors on completions for config reads. */ if (bridge && bridge->vendor == PCI_VENDOR_ID_IDT && - bridge->device == 0x80b5) + (bridge->device == 0x80b5 || bridge->device == 0x8090) return pci_idt_bus_quirk(bus, devfn, l, timeout); > If someone really insists they need virtualization with narrow > groups on this HW then they need to come with a more complete fix. > Using VFIO is going to open up the reset flows that are problematic > with the current solution, so it isn't like that is already working > fully. IIUC the current situation is that for these IDT switches, ACS SV is enabled when downstream devices are passed through to guests, but after these patches, it will no longer be enabled. So my question is whether users are giving up some isolation. If so, should we even allow devices to be passed through to guests? If we do allow that, do users have any indication that they're not getting what they expect?
On Fri, Feb 06, 2026 at 08:46:51AM -0600, Bjorn Helgaas wrote: > IIUC the current situation is that for these IDT switches, ACS SV is > enabled when downstream devices are passed through to guests, but > after these patches, it will no longer be enabled. ACS SV is enabled at boot time if an IOMMU driver is present regardless if guests or virtualization is in use. Linux doesn't change ACS flags dynamically. > So my question is whether users are giving up some isolation. If so, > should we even allow devices to be passed through to guests? If we do > allow that, do users have any indication that they're not getting what > they expect? iommu_groups will correctly describe the system limitations with the ACS quirk path and so all of the above concerns are taken care of. Robin is saying the Juno SMMU forces a large iommu_group covering the switch anyhow today, so at least that platform is not affected. Jason
On Fri, Feb 06, 2026 at 10:52:54AM -0400, Jason Gunthorpe wrote: > On Fri, Feb 06, 2026 at 08:46:51AM -0600, Bjorn Helgaas wrote: > > > IIUC the current situation is that for these IDT switches, ACS SV is > > enabled when downstream devices are passed through to guests, but > > after these patches, it will no longer be enabled. > > ACS SV is enabled at boot time if an IOMMU driver is present > regardless if guests or virtualization is in use. > > Linux doesn't change ACS flags dynamically. Right, it's just that this series effectively un-advertises ACS SV for the IDE switches so it will never be enabled for them, whereas today, I think we *do* enable ACS SV for them (but temporarily disable it during enumeration). > > So my question is whether users are giving up some isolation. If so, > > should we even allow devices to be passed through to guests? If we do > > allow that, do users have any indication that they're not getting what > > they expect? > > iommu_groups will correctly describe the system limitations with the > ACS quirk path and so all of the above concerns are taken care > of. Robin is saying the Juno SMMU forces a large iommu_group covering > the switch anyhow today, so at least that platform is not affected. I guess REQ_ACS_FLAGS is what iommu_groups looks for? I looked for such a thing earlier but must have missed it. Thanks!
On Fri, Feb 06, 2026 at 09:05:05AM -0600, Bjorn Helgaas wrote: > On Fri, Feb 06, 2026 at 10:52:54AM -0400, Jason Gunthorpe wrote: > > On Fri, Feb 06, 2026 at 08:46:51AM -0600, Bjorn Helgaas wrote: > > > > > IIUC the current situation is that for these IDT switches, ACS SV is > > > enabled when downstream devices are passed through to guests, but > > > after these patches, it will no longer be enabled. > > > > ACS SV is enabled at boot time if an IOMMU driver is present > > regardless if guests or virtualization is in use. > > > > Linux doesn't change ACS flags dynamically. > > Right, it's just that this series effectively un-advertises ACS SV for > the IDE switches so it will never be enabled for them Yes > > iommu_groups will correctly describe the system limitations with the > > ACS quirk path and so all of the above concerns are taken care > > of. Robin is saying the Juno SMMU forces a large iommu_group covering > > the switch anyhow today, so at least that platform is not affected. > > I guess REQ_ACS_FLAGS is what iommu_groups looks for? I looked for > such a thing earlier but must have missed it. Thanks! Yes Jason
On 2026-02-06 2:30 pm, Jason Gunthorpe wrote: > On Fri, Feb 06, 2026 at 02:41:36PM +0530, Manivannan Sadhasivam wrote: >>> It'd be worth expanding on this and what the effect of avoiding ACS SV >>> is. Does this change which devices can be safely passed through to >>> virtual guests? Does it give up isolation that users expect? >>> >> >> IMO, ACS SV is somewhat broken on this switch. But we can still passthrough the >> downstream devices to the guests. There won't be ACS SV apparently, but that's >> what users will get with broken hw. > > I agree with this, the HW is very broken, let's have it at least work > properly in Linux on bare metal out of the box. > > If someone really insists they need virtualization with narrow groups > on this HW then they need to come with a more complete fix. Using VFIO > is going to open up the reset flows that are problematic with the > current solution, so it isn't like that is already working fully. > > Somehow I suspect nobody would use this switch for virtualization :) And it's fine on Juno, since due to the (lack of) SMMU StreamIDs you can only assign the entire root complex as a single iommu_group anyway :D Cheers, Robin.
© 2016 - 2026 Red Hat, Inc.