Controlled by the IOMMU driver, ATS is usually enabled "on demand" when a
device requests a translation service from its associated IOMMU HW running
on the channel of a given PASID. This is working even when a device has no
translation on its RID (i.e., the RID is IOMMU bypassed).
However, certain PCIe devices require non-PASID ATS on their RID even when
the RID is IOMMU bypassed. Call this "always on".
For instance, the CXL spec notes in "3.2.5.13 Memory Type on CXL.cache":
"To source requests on CXL.cache, devices need to get the Host Physical
Address (HPA) from the Host by means of an ATS request on CXL.io."
In other words, the CXL.cache capability requires ATS; otherwise, it can't
access host physical memory.
Introduce a new pci_ats_always_on() helper for the IOMMU driver to scan a
PCI device and shift ATS policies between "on demand" and "always on".
Add the support for CXL.cache devices first. Pre-CXL devices will be added
in quirks.c file.
Note that pci_ats_always_on() validates against pci_ats_supported(), so we
ensure that untrusted devices (e.g. external ports) will not be always on.
This maintains the existing ATS security policy regarding potential side-
channel attacks via ATS.
Suggested-by: Vikram Sethi <vsethi@nvidia.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
include/linux/pci-ats.h | 3 +++
include/uapi/linux/pci_regs.h | 1 +
drivers/pci/ats.c | 44 +++++++++++++++++++++++++++++++++++
3 files changed, 48 insertions(+)
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index 75c6c86cf09dc..d14ba727d38b3 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -12,6 +12,7 @@ int pci_prepare_ats(struct pci_dev *dev, int ps);
void pci_disable_ats(struct pci_dev *dev);
int pci_ats_queue_depth(struct pci_dev *dev);
int pci_ats_page_aligned(struct pci_dev *dev);
+bool pci_ats_always_on(struct pci_dev *dev);
#else /* CONFIG_PCI_ATS */
static inline bool pci_ats_supported(struct pci_dev *d)
{ return false; }
@@ -24,6 +25,8 @@ static inline int pci_ats_queue_depth(struct pci_dev *d)
{ return -ENODEV; }
static inline int pci_ats_page_aligned(struct pci_dev *dev)
{ return 0; }
+static inline bool pci_ats_always_on(struct pci_dev *dev)
+{ return false; }
#endif /* CONFIG_PCI_ATS */
#ifdef CONFIG_PCI_PRI
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index ec1c54b5a3101..ef061c0313ce6 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -1349,6 +1349,7 @@
/* CXL r4.0, 8.1.3: PCIe DVSEC for CXL Device */
#define PCI_DVSEC_CXL_DEVICE 0
#define PCI_DVSEC_CXL_CAP 0xA
+#define PCI_DVSEC_CXL_CACHE_CAPABLE _BITUL(0)
#define PCI_DVSEC_CXL_MEM_CAPABLE _BITUL(2)
#define PCI_DVSEC_CXL_HDM_COUNT __GENMASK(5, 4)
#define PCI_DVSEC_CXL_CTRL 0xC
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index ec6c8dbdc5e9c..93060fdc0d3c0 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -205,6 +205,50 @@ int pci_ats_page_aligned(struct pci_dev *pdev)
return 0;
}
+/*
+ * CXL r4.0, sec 3.2.5.13 Memory Type on CXL.cache notes: to source requests on
+ * CXL.cache, devices need to get the Host Physical Address (HPA) from the Host
+ * by means of an ATS request on CXL.io.
+ *
+ * In other world, CXL.cache devices cannot access physical memory without ATS.
+ */
+static bool pci_cxl_ats_always_on(struct pci_dev *pdev)
+{
+ int offset;
+ u16 cap;
+
+ offset = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
+ PCI_DVSEC_CXL_DEVICE);
+ if (!offset)
+ return false;
+
+ pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap);
+ if (cap & PCI_DVSEC_CXL_CACHE_CAPABLE)
+ return true;
+
+ return false;
+}
+
+/**
+ * pci_ats_always_on - Whether the PCI device requires ATS to be always enabled
+ * @pdev: the PCI device
+ *
+ * Returns true, if the PCI device requires non-PASID ATS function on an IOMMU
+ * bypassed configuration.
+ */
+bool pci_ats_always_on(struct pci_dev *pdev)
+{
+ if (pci_ats_disabled() || !pci_ats_supported(pdev))
+ return false;
+
+ /* A VF inherits its PF's requirement for ATS function */
+ if (pdev->is_virtfn)
+ pdev = pci_physfn(pdev);
+
+ return pci_cxl_ats_always_on(pdev);
+}
+EXPORT_SYMBOL_GPL(pci_ats_always_on);
+
#ifdef CONFIG_PCI_PRI
void pci_pri_init(struct pci_dev *pdev)
{
--
2.43.0
On 2/24/26 06:52, Nicolin Chen wrote:
> +/**
> + * pci_ats_always_on - Whether the PCI device requires ATS to be always enabled
> + * @pdev: the PCI device
> + *
> + * Returns true, if the PCI device requires non-PASID ATS function on an IOMMU
> + * bypassed configuration.
Including iommu-specific policies and configurations here might cause
confusion. How about making it simply as: "Returns true if the PCI
device requires ATS to be enabled for functional operation"?
> + */
> +bool pci_ats_always_on(struct pci_dev *pdev)
> +{
> + if (pci_ats_disabled() || !pci_ats_supported(pdev))
> + return false;
> +
> + /* A VF inherits its PF's requirement for ATS function */
> + if (pdev->is_virtfn)
> + pdev = pci_physfn(pdev);
> +
> + return pci_cxl_ats_always_on(pdev);
> +}
> +EXPORT_SYMBOL_GPL(pci_ats_always_on);
Thanks,
baolu
On Tue, Mar 03, 2026 at 11:18:10AM +0800, Baolu Lu wrote: > On 2/24/26 06:52, Nicolin Chen wrote: > > +/** > > + * pci_ats_always_on - Whether the PCI device requires ATS to be always enabled > > + * @pdev: the PCI device > > + * > > + * Returns true, if the PCI device requires non-PASID ATS function on an IOMMU > > + * bypassed configuration. > > Including iommu-specific policies and configurations here might cause > confusion. How about making it simply as: "Returns true if the PCI > device requires ATS to be enabled for functional operation"? Yea, that sounds good to me. Thanks! Nicolin
On Mon, 23 Feb 2026 14:52:20 -0800
Nicolin Chen <nicolinc@nvidia.com> wrote:
> Controlled by the IOMMU driver, ATS is usually enabled "on demand" when a
> device requests a translation service from its associated IOMMU HW running
> on the channel of a given PASID. This is working even when a device has no
> translation on its RID (i.e., the RID is IOMMU bypassed).
>
> However, certain PCIe devices require non-PASID ATS on their RID even when
> the RID is IOMMU bypassed. Call this "always on".
>
> For instance, the CXL spec notes in "3.2.5.13 Memory Type on CXL.cache":
> "To source requests on CXL.cache, devices need to get the Host Physical
> Address (HPA) from the Host by means of an ATS request on CXL.io."
>
> In other words, the CXL.cache capability requires ATS; otherwise, it can't
> access host physical memory.
>
> Introduce a new pci_ats_always_on() helper for the IOMMU driver to scan a
> PCI device and shift ATS policies between "on demand" and "always on".
>
> Add the support for CXL.cache devices first. Pre-CXL devices will be added
> in quirks.c file.
>
> Note that pci_ats_always_on() validates against pci_ats_supported(), so we
> ensure that untrusted devices (e.g. external ports) will not be always on.
> This maintains the existing ATS security policy regarding potential side-
> channel attacks via ATS.
>
> Suggested-by: Vikram Sethi <vsethi@nvidia.com>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
+CC Linux-cxl
> ---
> include/linux/pci-ats.h | 3 +++
> include/uapi/linux/pci_regs.h | 1 +
> drivers/pci/ats.c | 44 +++++++++++++++++++++++++++++++++++
> 3 files changed, 48 insertions(+)
>
> diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
> index 75c6c86cf09dc..d14ba727d38b3 100644
> --- a/include/linux/pci-ats.h
> +++ b/include/linux/pci-ats.h
> @@ -12,6 +12,7 @@ int pci_prepare_ats(struct pci_dev *dev, int ps);
> void pci_disable_ats(struct pci_dev *dev);
> int pci_ats_queue_depth(struct pci_dev *dev);
> int pci_ats_page_aligned(struct pci_dev *dev);
> +bool pci_ats_always_on(struct pci_dev *dev);
> #else /* CONFIG_PCI_ATS */
> static inline bool pci_ats_supported(struct pci_dev *d)
> { return false; }
> @@ -24,6 +25,8 @@ static inline int pci_ats_queue_depth(struct pci_dev *d)
> { return -ENODEV; }
> static inline int pci_ats_page_aligned(struct pci_dev *dev)
> { return 0; }
> +static inline bool pci_ats_always_on(struct pci_dev *dev)
> +{ return false; }
> #endif /* CONFIG_PCI_ATS */
>
> #ifdef CONFIG_PCI_PRI
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index ec1c54b5a3101..ef061c0313ce6 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -1349,6 +1349,7 @@
> /* CXL r4.0, 8.1.3: PCIe DVSEC for CXL Device */
> #define PCI_DVSEC_CXL_DEVICE 0
> #define PCI_DVSEC_CXL_CAP 0xA
> +#define PCI_DVSEC_CXL_CACHE_CAPABLE _BITUL(0)
> #define PCI_DVSEC_CXL_MEM_CAPABLE _BITUL(2)
> #define PCI_DVSEC_CXL_HDM_COUNT __GENMASK(5, 4)
> #define PCI_DVSEC_CXL_CTRL 0xC
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index ec6c8dbdc5e9c..93060fdc0d3c0 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -205,6 +205,50 @@ int pci_ats_page_aligned(struct pci_dev *pdev)
> return 0;
> }
>
> +/*
> + * CXL r4.0, sec 3.2.5.13 Memory Type on CXL.cache notes: to source requests on
> + * CXL.cache, devices need to get the Host Physical Address (HPA) from the Host
> + * by means of an ATS request on CXL.io.
> + *
> + * In other world, CXL.cache devices cannot access physical memory without ATS.
Maybe tweak that to "host physical memory"
There are too many physical memories in CXL land...
> + */
> +static bool pci_cxl_ats_always_on(struct pci_dev *pdev)
> +{
> + int offset;
> + u16 cap;
> +
> + offset = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
> + PCI_DVSEC_CXL_DEVICE);
> + if (!offset)
> + return false;
> +
> + pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap);
> + if (cap & PCI_DVSEC_CXL_CACHE_CAPABLE)
> + return true;
Could just do
return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
Unless the thinking is there may be other stuff that comes here.
> +
> + return false;
> +}
> +
> +/**
> + * pci_ats_always_on - Whether the PCI device requires ATS to be always enabled
> + * @pdev: the PCI device
> + *
> + * Returns true, if the PCI device requires non-PASID ATS function on an IOMMU
> + * bypassed configuration.
> + */
> +bool pci_ats_always_on(struct pci_dev *pdev)
> +{
> + if (pci_ats_disabled() || !pci_ats_supported(pdev))
> + return false;
> +
> + /* A VF inherits its PF's requirement for ATS function */
> + if (pdev->is_virtfn)
> + pdev = pci_physfn(pdev);
> +
> + return pci_cxl_ats_always_on(pdev);
> +}
> +EXPORT_SYMBOL_GPL(pci_ats_always_on);
> +
> #ifdef CONFIG_PCI_PRI
> void pci_pri_init(struct pci_dev *pdev)
> {
On Tue, Feb 24, 2026 at 11:55:34AM +0000, Jonathan Cameron wrote:
> On Mon, 23 Feb 2026 14:52:20 -0800
> Nicolin Chen <nicolinc@nvidia.com> wrote:
> > +/*
> > + * CXL r4.0, sec 3.2.5.13 Memory Type on CXL.cache notes: to source requests on
> > + * CXL.cache, devices need to get the Host Physical Address (HPA) from the Host
> > + * by means of an ATS request on CXL.io.
> > + *
> > + * In other world, CXL.cache devices cannot access physical memory without ATS.
>
> Maybe tweak that to "host physical memory"
>
> There are too many physical memories in CXL land...
>
> > + */
> > +static bool pci_cxl_ats_always_on(struct pci_dev *pdev)
> > +{
> > + int offset;
> > + u16 cap;
> > +
> > + offset = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
> > + PCI_DVSEC_CXL_DEVICE);
> > + if (!offset)
> > + return false;
> > +
> > + pci_read_config_word(pdev, offset + PCI_DVSEC_CXL_CAP, &cap);
> > + if (cap & PCI_DVSEC_CXL_CACHE_CAPABLE)
> > + return true;
>
> Could just do
>
> return cap & PCI_DVSEC_CXL_CACHE_CAPABLE;
>
> Unless the thinking is there may be other stuff that comes here.
I will fix both. Thanks for the review!
Nicolin
© 2016 - 2026 Red Hat, Inc.