Xen Security Advisory 445 v3 (CVE-2023-46835) - x86/AMD: mismatch in IOMMU quarantine page table levels

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

            Xen Security Advisory CVE-2023-46835 / XSA-445
                               version 3

        x86/AMD: mismatch in IOMMU quarantine page table levels

UPDATES IN VERSION 3
====================

Public release.

ISSUE DESCRIPTION
=================

The current setup of the quarantine page tables assumes that the
quarantine domain (dom_io) has been initialized with an address width
of DEFAULT_DOMAIN_ADDRESS_WIDTH (48) and hence 4 page-table levels.

However, dom_io, being a PV domain, gets its AMD-Vi IOMMU page-table
levels set based on the maximum (hot-pluggable) RAM address.  Each
level translates 9 address bits on top of the 12-bit page offset, so 3
levels cover 39 bits (512GB) while 4 levels cover the full 48-bit
width.  Hence on systems with no RAM above the 512GB mark only 3
page-table levels are configured in the IOMMU.

On systems without RAM above the 512GB boundary,
amd_iommu_quarantine_init() will set up page tables for the scratch
page with 4 levels, while the IOMMU will be configured to use only 3
levels.  The entries of the last page directory are consequently
interpreted as page table entries (PTEs) rather than page directory
entries (PDEs), and hence a device in quarantine mode gains write
access to the page destined to hold PDEs.

Due to this page-table level mismatch, the page that effectively acts
as the sink page for the device is no longer cleared between device
assignments, possibly leading to data leaks.
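
To make the mechanism concrete, below is a minimal, self-contained C
model of the walk (toy types for illustration only, not Xen's actual
union amd_iommu_pte or its real walk code): software builds a 4-level
structure whose leaf points at the scratch page, but a walk configured
for only 3 levels stops one level early, on the level-1 table page.

#include <stdio.h>
#include <stdlib.h>

#define ENTRIES 512 /* 9 address bits are translated per level on AMD-Vi */

/* Toy stand-in for a page-table page: each slot points at the
 * next-level table, or at the data (sink) page at the last level. */
typedef struct table {
    void *next[ENTRIES];
} table_t;

/* Build 'levels' levels of tables.  Only slot 0 is populated here;
 * the real quarantine tables point every slot at the scratch page. */
static void *build(unsigned int levels, void *sink)
{
    table_t *t;

    if ( levels == 0 )
        return sink;
    t = calloc(1, sizeof(*t));
    if ( !t )
        exit(EXIT_FAILURE);
    t->next[0] = build(levels - 1, sink);
    return t;
}

/* Walk the way the hardware does: one dereference per configured level. */
static void *walk(void *root, unsigned int hw_levels)
{
    while ( hw_levels-- )
        root = ((table_t *)root)->next[0];
    return root;
}

int main(void)
{
    static char sink_page[4096];      /* the scratch ("sink") page    */
    void *root = build(4, sink_page); /* software builds 4 levels ... */

    /* ... but the IOMMU of a sub-512GB host is programmed for 3, so
     * the walk lands on the level-1 table page instead of the sink
     * page, and only the sink page is cleared between assignments. */
    printf("4-level walk -> %p (scratch page)\n", walk(root, 4));
    printf("3-level walk -> %p (level-1 table page)\n", walk(root, 3));
    return 0;
}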

IMPACT
======

A device in quarantine mode can access data left over from previous
uses of the quarantine page tables, possibly leaking data of previous
domains that also had the device assigned.

VULNERABLE SYSTEMS
==================

All Xen versions supporting PCI passthrough are affected.

Only x86 AMD systems with IOMMU hardware are vulnerable.

Only x86 guests that have physical devices passed through to them can
leverage the vulnerability.

MITIGATION
==========

Not passing through physical devices to guests will avoid the
vulnerability.

Not using quarantine scratch-page mode will avoid the vulnerability,
but could result in other issues.
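
(For reference, and assuming the syntax documented for Xen's command
line: scratch-page mode is the variant selected by the "scratch-page"
flavour of the "quarantine" sub-option, i.e. booting with

  iommu=quarantine=scratch-page

while the plain boolean form, "iommu=quarantine", keeps quarantining
without the sink page.  Check the command-line documentation of the
Xen version in use before relying on this.)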

CREDITS
=======

This issue was discovered by Roger Pau Monné of XenServer.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

Note that patches for released versions are generally prepared to
apply to the stable branches, and may not apply cleanly to the most
recent release tarball.  Downstreams are encouraged to update to the
tip of the stable branch before applying these patches.

xsa445.patch           xen-unstable
xsa445-4.17.patch      Xen 4.17.x
xsa445-4.16.patch      Xen 4.16.x
xsa445-4.15.patch      Xen 4.15.x

$ sha256sum xsa445*
751892f1a603dbee7ecb82d046aee6d87bf10398f365d3880a7f7d32eb3d73c1  xsa445.patch
9ae729410504961578e679ba19931646802b213d026b6587fb1abb43b2629186  xsa445-4.15.patch
55fe5925741b650fe2583a1e9855ea66c4fe0212de4fe93535fd592188fa64d4  xsa445-4.16.patch
7c4478d348dad0d9c71685a8c402df78d74c6b4d3c3e1627115b91967e54d94a  xsa445-4.17.patch
$
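
As an illustrative workflow only (assuming a git clone of xen.git and
using the 4.17 branch as an example; the path to the patch file is a
placeholder):

$ git clone https://xenbits.xen.org/git-http/xen.git
$ cd xen
$ git checkout stable-4.17
$ sha256sum /path/to/xsa445-4.17.patch   # compare with the sums above
$ git am /path/to/xsa445-4.17.patch

Each patch is a standard mbox-format patch, so "git am" preserves the
original author and commit message.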

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations should contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decision-making.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAmVTfRsMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZJdUIAJOmkQjl9EbYfiuBclmQJgOik6dYwYfFRNr+Q7g0
mWWQRF9BRSZkkzKipBeFWgBkQcx/3qo5HFBfElp9Atq4JpwXlcn9iBDR9fj5Zojl
lUxKHbppKZ9lG6izHjZNVgOOmYkLBxi8STWlB4aXrxhqbgxEnv4MESC809qUuzsy
lXl8AZERW7f/L8aW5IlpQqVKskc3NXUtvrhwyegrzL5SQfeGxIl3EPChA0UGq3PC
McBQWtyMBZHmwOQco8o8QenflWpRmgO4nYHdy2CAJ5XfCqa5bgNs61AR12BAUSaS
5MLSRtCIn2VYxrfsHrE2aCYJHLvzRzWnR09N0p8DKW+4AXY=
=gjG7
-----END PGP SIGNATURE-----
From 8b26fadfe9ad626571cab185d51d3fd27a339efa Mon Sep 17 00:00:00 2001
From: Roger Pau Monne <roger.pau@citrix.com>
Date: Wed, 11 Oct 2023 13:14:21 +0200
Subject: [PATCH] iommu/amd-vi: use correct level for quarantine domain page
 tables
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The current setup of the quarantine page tables assumes that the quarantine
domain (dom_io) has been initialized with an address width of
DEFAULT_DOMAIN_ADDRESS_WIDTH (48).

However dom_io being a PV domain gets the AMD-Vi IOMMU page tables levels based
on the maximum (hot pluggable) RAM address, and hence on systems with no RAM
above the 512GB mark only 3 page-table levels are configured in the IOMMU.

On systems without RAM above the 512GB boundary amd_iommu_quarantine_init()
will setup page tables for the scratch page with 4 levels, while the IOMMU will
be configured to use 3 levels only.  The page destined to be used as level 1,
and to contain a directory of PTEs ends up being the address in a PTE itself,
and thus level 1 page becomes the leaf page.  Without the level mismatch it's
level 0 page that should be the leaf page instead.

The level 1 page won't be used as such, and hence it's not possible to use it
to gain access to other memory on the system.  However that page is not cleared
in amd_iommu_quarantine_init() as part of re-initialization of the device
quarantine page tables, and hence data on the level 1 page can be leaked
between device usages.

Fix this by making sure the paging levels setup by amd_iommu_quarantine_init()
match the number configured on the IOMMUs.

Note that IVMD regions are not affected by this issue, as those areas are
mapped taking the configured paging levels into account.

This is XSA-445 / CVE-2023-46835

Fixes: ea38867831da ('x86 / iommu: set up a scratch page in the quarantine domain')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/amd/iommu_map.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index daa24a485891..e0f4fe736a8d 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -837,9 +837,7 @@ static int fill_qpt(union amd_iommu_pte *this, unsigned int level,
 int cf_check amd_iommu_quarantine_init(struct pci_dev *pdev, bool scratch_page)
 {
     struct domain_iommu *hd = dom_iommu(dom_io);
-    unsigned long end_gfn =
-        1UL << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT);
-    unsigned int level = amd_iommu_get_paging_mode(end_gfn);
+    unsigned int level = hd->arch.amd.paging_mode;
     unsigned int req_id = get_dma_requestor_id(pdev->seg, pdev->sbdf.bdf);
     const struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(pdev->seg);
     int rc;
-- 
2.42.0

From 9877bb3af60ef2b543742835c49de7d0108cdca9 Mon Sep 17 00:00:00 2001
From: Roger Pau Monne <roger.pau@citrix.com>
Date: Wed, 11 Oct 2023 13:14:21 +0200
Subject: [PATCH] iommu/amd-vi: use correct level for quarantine domain page
 tables
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The current setup of the quarantine page tables assumes that the quarantine
domain (dom_io) has been initialized with an address width of
DEFAULT_DOMAIN_ADDRESS_WIDTH (48).

However dom_io being a PV domain gets the AMD-Vi IOMMU page tables levels based
on the maximum (hot pluggable) RAM address, and hence on systems with no RAM
above the 512GB mark only 3 page-table levels are configured in the IOMMU.

On systems without RAM above the 512GB boundary amd_iommu_quarantine_init()
will setup page tables for the scratch page with 4 levels, while the IOMMU will
be configured to use 3 levels only.  The page destined to be used as level 1,
and to contain a directory of PTEs ends up being the address in a PTE itself,
and thus level 1 page becomes the leaf page.  Without the level mismatch it's
level 0 page that should be the leaf page instead.

The level 1 page won't be used as such, and hence it's not possible to use it
to gain access to other memory on the system.  However that page is not cleared
in amd_iommu_quarantine_init() as part of re-initialization of the device
quarantine page tables, and hence data on the level 1 page can be leaked
between device usages.

Fix this by making sure the paging levels setup by amd_iommu_quarantine_init()
match the number configured on the IOMMUs.

Note that IVMD regions are not affected by this issue, as those areas are
mapped taking the configured paging levels into account.

This is XSA-445 / CVE-2023-46835

Fixes: ea38867831da ('x86 / iommu: set up a scratch page in the quarantine domain')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/amd/iommu_map.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index b4c182449131..3473db4c1efc 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -584,9 +584,7 @@ static int fill_qpt(union amd_iommu_pte *this, unsigned int level,
 int amd_iommu_quarantine_init(struct pci_dev *pdev)
 {
     struct domain_iommu *hd = dom_iommu(dom_io);
-    unsigned long end_gfn =
-        1ul << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT);
-    unsigned int level = amd_iommu_get_paging_mode(end_gfn);
+    unsigned int level = hd->arch.amd.paging_mode;
     unsigned int req_id = get_dma_requestor_id(pdev->seg, pdev->sbdf.bdf);
     const struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(pdev->seg);
     int rc;

base-commit: 4a4daf6bddbe8a741329df5cc8768f7dec664aed
-- 
2.30.2

From 88fa5b0db062a8f2ccac4ba05ef75768b2b03e5a Mon Sep 17 00:00:00 2001
From: Roger Pau Monne <roger.pau@citrix.com>
Date: Wed, 11 Oct 2023 13:14:21 +0200
Subject: [PATCH] iommu/amd-vi: use correct level for quarantine domain page
 tables
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The current setup of the quarantine page tables assumes that the quarantine
domain (dom_io) has been initialized with an address width of
DEFAULT_DOMAIN_ADDRESS_WIDTH (48).

However dom_io being a PV domain gets the AMD-Vi IOMMU page tables levels based
on the maximum (hot pluggable) RAM address, and hence on systems with no RAM
above the 512GB mark only 3 page-table levels are configured in the IOMMU.

On systems without RAM above the 512GB boundary amd_iommu_quarantine_init()
will setup page tables for the scratch page with 4 levels, while the IOMMU will
be configured to use 3 levels only.  The page destined to be used as level 1,
and to contain a directory of PTEs ends up being the address in a PTE itself,
and thus level 1 page becomes the leaf page.  Without the level mismatch it's
level 0 page that should be the leaf page instead.

The level 1 page won't be used as such, and hence it's not possible to use it
to gain access to other memory on the system.  However that page is not cleared
in amd_iommu_quarantine_init() as part of re-initialization of the device
quarantine page tables, and hence data on the level 1 page can be leaked
between device usages.

Fix this by making sure the paging levels setup by amd_iommu_quarantine_init()
match the number configured on the IOMMUs.

Note that IVMD regions are not affected by this issue, as those areas are
mapped taking the configured paging levels into account.

This is XSA-445 / CVE-2023-46835

Fixes: ea38867831da ('x86 / iommu: set up a scratch page in the quarantine domain')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/amd/iommu_map.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index cf6f01b633e4..1b414a413b89 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -654,9 +654,7 @@ static int fill_qpt(union amd_iommu_pte *this, unsigned int level,
 int amd_iommu_quarantine_init(struct pci_dev *pdev, bool scratch_page)
 {
     struct domain_iommu *hd = dom_iommu(dom_io);
-    unsigned long end_gfn =
-        1ul << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT);
-    unsigned int level = amd_iommu_get_paging_mode(end_gfn);
+    unsigned int level = hd->arch.amd.paging_mode;
     unsigned int req_id = get_dma_requestor_id(pdev->seg, pdev->sbdf.bdf);
     const struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(pdev->seg);
     int rc;

base-commit: 29efce0f8f10e381417a61f2f9988b40d4f6bcf0
-- 
2.30.2

From a43127d4f1f9a364334fe16b6239c211b35fd238 Mon Sep 17 00:00:00 2001
From: Roger Pau Monne <roger.pau@citrix.com>
Date: Wed, 11 Oct 2023 13:14:21 +0200
Subject: [PATCH] iommu/amd-vi: use correct level for quarantine domain page
 tables
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The current setup of the quarantine page tables assumes that the quarantine
domain (dom_io) has been initialized with an address width of
DEFAULT_DOMAIN_ADDRESS_WIDTH (48).

However dom_io being a PV domain gets the AMD-Vi IOMMU page tables levels based
on the maximum (hot pluggable) RAM address, and hence on systems with no RAM
above the 512GB mark only 3 page-table levels are configured in the IOMMU.

On systems without RAM above the 512GB boundary amd_iommu_quarantine_init()
will setup page tables for the scratch page with 4 levels, while the IOMMU will
be configured to use 3 levels only.  The page destined to be used as level 1,
and to contain a directory of PTEs ends up being the address in a PTE itself,
and thus level 1 page becomes the leaf page.  Without the level mismatch it's
level 0 page that should be the leaf page instead.

The level 1 page won't be used as such, and hence it's not possible to use it
to gain access to other memory on the system.  However that page is not cleared
in amd_iommu_quarantine_init() as part of re-initialization of the device
quarantine page tables, and hence data on the level 1 page can be leaked
between device usages.

Fix this by making sure the paging levels setup by amd_iommu_quarantine_init()
match the number configured on the IOMMUs.

Note that IVMD regions are not affected by this issue, as those areas are
mapped taking the configured paging levels into account.

This is XSA-445 / CVE-2023-46835

Fixes: ea38867831da ('x86 / iommu: set up a scratch page in the quarantine domain')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/amd/iommu_map.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 993bac6f8878..e0f4fe736a8d 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -837,9 +837,7 @@ static int fill_qpt(union amd_iommu_pte *this, unsigned int level,
 int cf_check amd_iommu_quarantine_init(struct pci_dev *pdev, bool scratch_page)
 {
     struct domain_iommu *hd = dom_iommu(dom_io);
-    unsigned long end_gfn =
-        1ul << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT);
-    unsigned int level = amd_iommu_get_paging_mode(end_gfn);
+    unsigned int level = hd->arch.amd.paging_mode;
     unsigned int req_id = get_dma_requestor_id(pdev->seg, pdev->sbdf.bdf);
     const struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(pdev->seg);
     int rc;
-- 
2.42.0