-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

            Xen Security Advisory CVE-2023-46837 / XSA-447
                               version 2

  arm32: The cache may not be properly cleaned/invalidated (take two)

UPDATES IN VERSION 2
====================

Public release.

ISSUE DESCRIPTION
=================

Arm provides multiple helpers to clean & invalidate the cache
for a given region.  This is, for instance, used when allocating
guest memory to ensure any writes (such as the ones during scrubbing)
have reached memory before handing over the page to a guest.

Unfortunately, the arithmetic in the helpers can overflow, which would
result in the cache cleaning/invalidation being skipped.  Therefore
there is no guarantee as to when all the writes will reach memory.

This undefined behavior was meant to be addressed by XSA-437, but the
approach was not sufficient.
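
To make the failure mode concrete, below is a minimal, self-contained C
sketch (not Xen code) modeling the 32-bit arithmetic with uint32_t.
The 64-byte line size and the 0xfffff000 start address are taken from
the scenario described in the attached patches; the unsigned wrap shown
here is well-defined in C, whereas the equivalent increment of a real
pointer, as performed by the affected helpers, is undefined behavior.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint32_t line = 64;      /* 64-byte dcache line */
    uint32_t va = 0xfffff000u;     /* top 4KiB page of a 32-bit VA space */
    uint32_t size = 4096;

    /* Buggy pattern: the pointer itself is advanced every iteration, so
       after processing the last line it is incremented once more, past
       0xffffffff. */
    for ( ; size >= line; va += line, size -= line )
        ;                          /* clean/invalidate the line at 'va' */
    printf("final va = 0x%08x\n", (unsigned)va);   /* prints 0x00000000 */

    /* Fixed pattern (as in the patches below): keep the base constant
       and advance a separate index, so no wrapped pointer value is ever
       formed. */
    const uint32_t base = 0xfffff000u;
    uint32_t idx = 0;
    for ( size = 4096; size >= line; idx += line, size -= line )
        ;                          /* clean/invalidate 'base + idx' */
    printf("last line at 0x%08x\n", (unsigned)(base + idx - line));

    /* The patches also reject regions that themselves wrap around, via
       ASSERT(((uintptr_t)p + size - 1) >= (uintptr_t)p), modeled here: */
    assert(base + 4096 - 1 >= base);   /* 0xffffffff >= 0xfffff000: OK */

    return 0;
}

Tracking an index instead of advancing the pointer means the only value
incremented past the end of the region is a plain unsigned integer, and
the pointer expression 'base + idx' is only ever evaluated for lines
inside the region.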

IMPACT
======

A malicious guest may be able to read sensitive data from memory that
previously belonged to another guest.

VULNERABLE SYSTEMS
==================

Systems running all versions of Xen are affected.

Only systems running Xen on Arm 32-bit are vulnerable.  Xen on Arm 64-bit
is not affected.

MITIGATION
==========

There is no known mitigation.

CREDITS
=======

This issue was discovered by Michal Orzel from AMD.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

Note that patches for released versions are generally prepared to
apply to the stable branches, and may not apply cleanly to the most
recent release tarball.  Downstreams are encouraged to update to the
tip of the stable branch before applying these patches.

xsa447/xsa447.patch           xen-unstable - Xen 4.17.x
xsa447/xsa447-4.16.patch      Xen 4.16.x - Xen 4.15.x

$ sha256sum xsa447* xsa447*/*
639f3a30124fd0f45b6b68768c02a5b5aa2e78c6c1f28bbf1ea5fb9be1f874af  xsa447.meta
0816717ab6e9c2250975ed1100bb2943830dc10e9a52aed7dd5cbe1884a15918  xsa447/xsa447.patch
f325543852b28af3fb2a2ca501a70fc59d3b35432334d52f734b2071c8a9667f  xsa447/xsa447-4.16.patch
$

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decision-making.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAmV4SxMMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZvnUIAIG4NNqHQCeBV0VOLtdZLNgaBDt9Vguc4FLUYlI5
aBc4/IWrsGYYRuBzLAPGoKYP9/F+OjiHcE0ClFnxkQJ+bFKl4SQLxmSksHkvPtpo
6yL53IbyraIbA+TulYquTr27v7ZnTI9LQA3VurD6sMgiWIo8+C/kSb6g/1TAsm4R
qzHDRLhTd4H+yU7KV327qIUk1D4S0eGP1yWpudpd0A/05RBgI9m4gp01VFeJn8w+
UbYba/4LpcAKG/iyvxqk5o3fyO60zhZEc5BBHhcz7DJ+UvLrLf7TDLrkaI6lorye
m6etZ+kWU9ESL1Qy+lHEk9HqUOg25xQb5gPDrIP3TOMSsUU=
=mrfT
-----END PGP SIGNATURE-----
{
  "XSA": 447,
  "SupportedVersions": [
    "master",
    "4.18",
    "4.17",
    "4.16",
    "4.15"
  ],
  "Trees": [
    "xen"
  ],
  "Recipes": {
    "4.15": {
      "Recipes": {
        "xen": {
          "StableRef": "b918c4cdc7ab2c1c9e9a9b54fa9d9c595913e028",
          "Prereqs": [],
          "Patches": [
            "xsa447/xsa447-4.16.patch"
          ]
        }
      }
    },
    "4.16": {
      "Recipes": {
        "xen": {
          "StableRef": "4dfe95177b948d1f3ed27a801f603ed7f1bc36e8",
          "Prereqs": [],
          "Patches": [
            "xsa447/xsa447-4.16.patch"
          ]
        }
      }
    },
    "4.17": {
      "Recipes": {
        "xen": {
          "StableRef": "e1f9cb16e2efbb202f2f8a9aa7c5ff1d392ece33",
          "Prereqs": [],
          "Patches": [
            "xsa447/xsa447.patch"
          ]
        }
      }
    },
    "4.18": {
      "Recipes": {
        "xen": {
          "StableRef": "3f9390fea5c51a6d64596d295902d28931eeca4c",
          "Prereqs": [],
          "Patches": [
            "xsa447/xsa447.patch"
          ]
        }
      }
    },
    "master": {
      "Recipes": {
        "xen": {
          "StableRef": "80c153c48b255bae61948827241c26671207cf4e",
          "Prereqs": [],
          "Patches": [
            "xsa447/xsa447.patch"
          ]
        }
      }
    }
  }
}
From 084c7312fa6c1d4a7fa343efa1d7d73693dafff4 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Thu, 23 Nov 2023 15:53:02 +0100
Subject: [PATCH] xen/arm: page: Avoid pointer overflow on cache clean &
 invalidate

On Arm32, after cleaning and invalidating the last dcache line of the top
domheap page i.e. VA = 0xfffff000 (as a result of flushing the page to
RAM), we end up adding the value of a dcache line size to the pointer
once again, which results in a pointer arithmetic overflow (with 64B line
size, operation 0xffffffc0 + 0x40 overflows to 0x0). Such behavior is
undefined and, given the wide range of compiler versions we support, it is
difficult to determine what could happen in such a scenario.

Modify clean_and_invalidate_dcache_va_range() as well as
clean_dcache_va_range() and invalidate_dcache_va_range() (which are
handled similarly) to prevent pointer arithmetic overflow. Modify the
loops to use an additional variable to store the index of the next
cacheline. Add an assert to prevent passing a region that wraps around,
which is illegal and would end up in a page fault anyway (the region
0-2MB is unmapped). Lastly, return early if the size passed is 0.

Note that on Arm64, we don't have this problem given that the max VA
space we support is 48 bits.

This is XSA-447 / CVE-2023-46837.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
 xen/arch/arm/include/asm/page.h | 35 ++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/include/asm/page.h b/xen/arch/arm/include/asm/page.h
index ebaf5964f114..69f817d1e68a 100644
--- a/xen/arch/arm/include/asm/page.h
+++ b/xen/arch/arm/include/asm/page.h
@@ -162,6 +162,13 @@ static inline size_t read_dcache_line_bytes(void)
 static inline int invalidate_dcache_va_range(const void *p, unsigned long size)
 {
     size_t cacheline_mask = dcache_line_bytes - 1;
+    unsigned long idx = 0;
+
+    if ( !size )
+        return 0;
+
+    /* Passing a region that wraps around is illegal */
+    ASSERT(((uintptr_t)p + size - 1) >= (uintptr_t)p);
 
     dsb(sy);           /* So the CPU issues all writes to the range */
 
@@ -174,11 +181,11 @@ static inline int invalidate_dcache_va_range(const void *p, unsigned long size)
     }
 
     for ( ; size >= dcache_line_bytes;
-            p += dcache_line_bytes, size -= dcache_line_bytes )
-        asm volatile (__invalidate_dcache_one(0) : : "r" (p));
+            idx += dcache_line_bytes, size -= dcache_line_bytes )
+        asm volatile (__invalidate_dcache_one(0) : : "r" (p + idx));
 
     if ( size > 0 )
-        asm volatile (__clean_and_invalidate_dcache_one(0) : : "r" (p));
+        asm volatile (__clean_and_invalidate_dcache_one(0) : : "r" (p + idx));
 
     dsb(sy);           /* So we know the flushes happen before continuing */
 
@@ -188,14 +195,21 @@ static inline int invalidate_dcache_va_range(const void *p, unsigned long size)
 static inline int clean_dcache_va_range(const void *p, unsigned long size)
 {
     size_t cacheline_mask = dcache_line_bytes - 1;
+    unsigned long idx = 0;
+
+    if ( !size )
+        return 0;
+
+    /* Passing a region that wraps around is illegal */
+    ASSERT(((uintptr_t)p + size - 1) >= (uintptr_t)p);
 
     dsb(sy);           /* So the CPU issues all writes to the range */
     size += (uintptr_t)p & cacheline_mask;
     size = (size + cacheline_mask) & ~cacheline_mask;
     p = (void *)((uintptr_t)p & ~cacheline_mask);
     for ( ; size >= dcache_line_bytes;
-            p += dcache_line_bytes, size -= dcache_line_bytes )
-        asm volatile (__clean_dcache_one(0) : : "r" (p));
+            idx += dcache_line_bytes, size -= dcache_line_bytes )
+        asm volatile (__clean_dcache_one(0) : : "r" (p + idx));
     dsb(sy);           /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */
     return 0;
@@ -205,14 +219,21 @@ static inline int clean_and_invalidate_dcache_va_range
     (const void *p, unsigned long size)
 {
     size_t cacheline_mask = dcache_line_bytes - 1;
+    unsigned long idx = 0;
+
+    if ( !size )
+        return 0;
+
+    /* Passing a region that wraps around is illegal */
+    ASSERT(((uintptr_t)p + size - 1) >= (uintptr_t)p);
 
     dsb(sy);         /* So the CPU issues all writes to the range */
     size += (uintptr_t)p & cacheline_mask;
     size = (size + cacheline_mask) & ~cacheline_mask;
     p = (void *)((uintptr_t)p & ~cacheline_mask);
     for ( ; size >= dcache_line_bytes;
-            p += dcache_line_bytes, size -= dcache_line_bytes )
-        asm volatile (__clean_and_invalidate_dcache_one(0) : : "r" (p));
+            idx += dcache_line_bytes, size -= dcache_line_bytes )
+        asm volatile (__clean_and_invalidate_dcache_one(0) : : "r" (p + idx));
     dsb(sy);         /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */
     return 0;
-- 
2.40.1

From 2bc72b8be4149c1ac8733e5551d937d3d12cb8ed Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Thu, 23 Nov 2023 15:53:02 +0100
Subject: [PATCH] xen/arm: page: Avoid pointer overflow on cache clean &
 invalidate

On Arm32, after cleaning and invalidating the last dcache line of the top
domheap page i.e. VA = 0xfffff000 (as a result of flushing the page to
RAM), we end up adding the value of a dcache line size to the pointer
once again, which results in a pointer arithmetic overflow (with 64B line
size, operation 0xffffffc0 + 0x40 overflows to 0x0). Such behavior is
undefined and, given the wide range of compiler versions we support, it is
difficult to determine what could happen in such a scenario.

Modify clean_and_invalidate_dcache_va_range() as well as
clean_dcache_va_range() and invalidate_dcache_va_range() (which are
handled similarly) to prevent pointer arithmetic overflow. Modify the
loops to use an additional variable to store the index of the next
cacheline. Add an assert to prevent passing a region that wraps around,
which is illegal and would end up in a page fault anyway (the region
0-2MB is unmapped). Lastly, return early if the size passed is 0.

Note that on Arm64, we don't have this problem given that the max VA
space we support is 48 bits.

This is XSA-447 / CVE-2023-46837.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
 xen/include/asm-arm/page.h | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index eff5883ef87b..b6784417ed34 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -153,6 +153,13 @@ static inline size_t read_dcache_line_bytes(void)
 static inline int invalidate_dcache_va_range(const void *p, unsigned long size)
 {
     size_t cacheline_mask = dcache_line_bytes - 1;
+    unsigned long idx = 0;
+
+    if ( !size )
+        return 0;
+
+    /* Passing a region that wraps around is illegal */
+    ASSERT(((uintptr_t)p + size - 1) >= (uintptr_t)p);
 
     dsb(sy);           /* So the CPU issues all writes to the range */
 
@@ -165,11 +172,11 @@ static inline int invalidate_dcache_va_range(const void *p, unsigned long size)
     }
 
     for ( ; size >= dcache_line_bytes;
-            p += dcache_line_bytes, size -= dcache_line_bytes )
-        asm volatile (__invalidate_dcache_one(0) : : "r" (p));
+            idx += dcache_line_bytes, size -= dcache_line_bytes )
+        asm volatile (__invalidate_dcache_one(0) : : "r" (p + idx));
 
     if ( size > 0 )
-        asm volatile (__clean_and_invalidate_dcache_one(0) : : "r" (p));
+        asm volatile (__clean_and_invalidate_dcache_one(0) : : "r" (p + idx));
 
     dsb(sy);           /* So we know the flushes happen before continuing */
 
@@ -179,14 +186,21 @@ static inline int invalidate_dcache_va_range(const void *p, unsigned long size)
 static inline int clean_dcache_va_range(const void *p, unsigned long size)
 {
     size_t cacheline_mask = dcache_line_bytes - 1;
+    unsigned long idx = 0;
+
+    if ( !size )
+        return 0;
+
+    /* Passing a region that wraps around is illegal */
+    ASSERT(((uintptr_t)p + size - 1) >= (uintptr_t)p);
 
     dsb(sy);           /* So the CPU issues all writes to the range */
     size += (uintptr_t)p & cacheline_mask;
     size = (size + cacheline_mask) & ~cacheline_mask;
     p = (void *)((uintptr_t)p & ~cacheline_mask);
     for ( ; size >= dcache_line_bytes;
-            p += dcache_line_bytes, size -= dcache_line_bytes )
-        asm volatile (__clean_dcache_one(0) : : "r" (p));
+            idx += dcache_line_bytes, size -= dcache_line_bytes )
+        asm volatile (__clean_dcache_one(0) : : "r" (p + idx));
     dsb(sy);           /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */
     return 0;
@@ -196,14 +210,21 @@ static inline int clean_and_invalidate_dcache_va_range
     (const void *p, unsigned long size)
 {
     size_t cacheline_mask = dcache_line_bytes - 1;
+    unsigned long idx = 0;
+
+    if ( !size )
+        return 0;
+
+    /* Passing a region that wraps around is illegal */
+    ASSERT(((uintptr_t)p + size - 1) >= (uintptr_t)p);
 
     dsb(sy);         /* So the CPU issues all writes to the range */
     size += (uintptr_t)p & cacheline_mask;
     size = (size + cacheline_mask) & ~cacheline_mask;
     p = (void *)((uintptr_t)p & ~cacheline_mask);
     for ( ; size >= dcache_line_bytes;
-            p += dcache_line_bytes, size -= dcache_line_bytes )
-        asm volatile (__clean_and_invalidate_dcache_one(0) : : "r" (p));
+            idx += dcache_line_bytes, size -= dcache_line_bytes )
+        asm volatile (__clean_and_invalidate_dcache_one(0) : : "r" (p + idx));
     dsb(sy);         /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */
     return 0;
-- 
2.40.1