Xen Security Advisory 411 v3 (CVE-2022-33748) - lock order inversion in transitive grant copy handling

Xen.org security team posted 1 patch 1 year, 6 months ago
Failed in applying to current master (apply log)
Xen Security Advisory 411 v3 (CVE-2022-33748) - lock order inversion in transitive grant copy handling
Posted by Xen.org security team 1 year, 6 months ago
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

            Xen Security Advisory CVE-2022-33748 / XSA-411
                               version 3

        lock order inversion in transitive grant copy handling

UPDATES IN VERSION 3
====================

Public release.

ISSUE DESCRIPTION
=================

As part of XSA-226 a missing cleanup call was inserted on an error
handling path.  While doing so, locking requirements were not paid
attention to.  As a result two cooperating guests granting each
other transitive grants can cause locks to be acquired nested within
one another, but in respectively opposite order.  With suitable
timing between the involved grant copy operations this may result in
the locking up of a CPU.

IMPACT
======

Malicious or buggy guest kernels may be able to mount a Denial of
Service (DoS) attack affecting the entire system.

VULNERABLE SYSTEMS
==================

Xen versions 4.0 and newer are vulnerable.  Xen versions 3.4 and older
are not vulnerable.

Only guests with access to transitive grants can exploit the
vulnerability.  In particular, this means that:

 * ARM systems which have taken the XSA-268 fix are not vulnerable, as
   Grant Table v2 was disabled for other security reasons.

 * All systems with the XSA-226 fixes, and booted with
   `gnttab=max-ver:1` or `gnttab=no-transitive` are not vulnerable.

 * From Xen 4.16, the maximum grant table version can be controlled on a
   per-domain basis.  For the xl toolstack, the vulnerability does not
   manifest if either:

   1) Every guest has `max_grant_version=1` in their configuration file,
      or

   2) The global xl.conf has `max_grant_version=1`, and no guests have
      the default overridden by selecting `max_grant_version=2`.

Only multiple cooperating guests can exploit the vulnerability.

MITIGATION
==========

Disallowing the use of transitive grants either via the
`gnttab=no-transitive` Xen command line option, or by disabling grant
interface version 2 altogether via the `gnttab=max-ver:1` Xen command
line option or the xl controls as mentioned above will avoid the
vulnerability.

CREDITS
=======

This issue was discovered by Jan Beulich of SUSE.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

Note that patches for released versions are generally prepared to
apply to the stable branches, and may not apply cleanly to the most
recent release tarball.  Downstreams are encouraged to update to the
tip of the stable branch before applying these patches.

xsa411.patch           xen-unstable - Xen 4.15.x
xsa411-4.14.patch      Xen 4.14.x - 4.13.x

$ sha256sum xsa411*
0802e2e4e9d03c82429a710bbb783cee2fded52d29b1d969b97c680d30c3ac57  xsa411.patch
8473f2ee34562298c5174f0a5b3c64c561a945333aab675845093ad23250d1cf  xsa411-4.14.patch
$

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches described above (or others which are
substantially similar) is permitted during the embargo, even on
public-facing systems with untrusted guest users and administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

HOWEVER, deployment of the mitigations is NOT permitted (except where
all the affected systems and VMs are administered and used only by
organisations which are members of the Xen Project Security Issues
Predisclosure List).  Specifically, deployment on public cloud systems
is NOT permitted.

This is because it is a guest visible change which will draw attention
to the issue.
-----BEGIN PGP SIGNATURE-----

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAmNFTAAMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZPsQH/1JCqscbx49QygGVEnq43C97HQpcoZcUNJGwGjBJ
Li0SXejxd3iWsYsFlMAgmacHIjevEGv318JJLSM21hBULGe85cc6QatpWS0VWrBc
tQVbDIgqNRv42gJCtf1dLF0TnlTZ6p3wiqfsxEYBn1zlEhe2ZEMpY8an4707O32d
nQ90JFh44QJXx6HMZD3pEw2g1+4pMDu9yDUp/Yc3YmxYnXmPW6KE7iMmGkLLGigI
GfiTI4FA/BDVIZkjPErwG7pyXmp2sdtVkv5o/cg7YTOrLzeBmegdyUvzuXkizJ2F
PQnc1rgS/vXPkC62cy6fmLkeAf0dQhq6KBuxW3N8s2fXRXk=
=/bRo
-----END PGP SIGNATURE-----
From: Jan Beulich <jbeulich@suse.com>
Subject: gnttab: correct locking on transitive grant copy error path

While the comment next to the lock dropping in preparation of
recursively calling acquire_grant_for_copy() mistakenly talks about the
rd == td case (excluded a few lines further up), the same concerns apply
to the calling of release_grant_for_copy() on a subsequent error path.

This is CVE-2022-33748 / XSA-411.

Fixes: ad48fb963dbf ("gnttab: fix transitive grant handling")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Extend code comment.

--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -2622,9 +2622,8 @@ acquire_grant_for_copy(
                      trans_domid);
 
         /*
-         * acquire_grant_for_copy() could take the lock on the
-         * remote table (if rd == td), so we have to drop the lock
-         * here and reacquire.
+         * acquire_grant_for_copy() will take the lock on the remote table,
+         * so we have to drop the lock here and reacquire.
          */
         active_entry_release(act);
         grant_read_unlock(rgt);
@@ -2661,11 +2660,25 @@ acquire_grant_for_copy(
                           act->trans_gref != trans_gref ||
                           !act->is_sub_page)) )
         {
+            /*
+             * Like above for acquire_grant_for_copy() we need to drop and then
+             * re-acquire the locks here to prevent lock order inversion issues.
+             * Unlike for acquire_grant_for_copy() we don't need to re-check
+             * anything, as release_grant_for_copy() doesn't depend on the grant
+             * table entry: It only updates internal state and the status flags.
+             */
+            active_entry_release(act);
+            grant_read_unlock(rgt);
+
             release_grant_for_copy(td, trans_gref, readonly);
             rcu_unlock_domain(td);
+
+            grant_read_lock(rgt);
+            act = active_entry_acquire(rgt, gref);
             reduce_status_for_pin(rd, act, status, readonly);
             active_entry_release(act);
             grant_read_unlock(rgt);
+
             put_page(*page);
             *page = NULL;
             return ERESTART;
From: Jan Beulich <jbeulich@suse.com>
Subject: gnttab: correct locking on transitive grant copy error path

While the comment next to the lock dropping in preparation of
recursively calling acquire_grant_for_copy() mistakenly talks about the
rd == td case (excluded a few lines further up), the same concerns apply
to the calling of release_grant_for_copy() on a subsequent error path.

This is CVE-2022-33748 / XSA-411.

Fixes: ad48fb963dbf ("gnttab: fix transitive grant handling")
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -2617,9 +2617,8 @@ acquire_grant_for_copy(
                      trans_domid);
 
         /*
-         * acquire_grant_for_copy() could take the lock on the
-         * remote table (if rd == td), so we have to drop the lock
-         * here and reacquire.
+         * acquire_grant_for_copy() will take the lock on the remote table,
+         * so we have to drop the lock here and reacquire.
          */
         active_entry_release(act);
         grant_read_unlock(rgt);
@@ -2656,11 +2655,25 @@ acquire_grant_for_copy(
                           act->trans_gref != trans_gref ||
                           !act->is_sub_page)) )
         {
+            /*
+             * Like above for acquire_grant_for_copy() we need to drop and then
+             * re-acquire the locks here to prevent lock order inversion issues.
+             * Unlike for acquire_grant_for_copy() we don't need to re-check
+             * anything, as release_grant_for_copy() doesn't depend on the grant
+             * table entry: It only updates internal state and the status flags.
+             */
+            active_entry_release(act);
+            grant_read_unlock(rgt);
+
             release_grant_for_copy(td, trans_gref, readonly);
-            fixup_status_for_copy_pin(rd, act, status);
             rcu_unlock_domain(td);
+
+            grant_read_lock(rgt);
+            act = active_entry_acquire(rgt, gref);
+            fixup_status_for_copy_pin(rd, act, status);
             active_entry_release(act);
             grant_read_unlock(rgt);
+
             put_page(*page);
             *page = NULL;
             return ERESTART;