Xen Security Advisory 421 v2 (CVE-2022-42325,CVE-2022-42326) - Xenstore: Guests can create arbitrary number of nodes via transactions

Xen.org security team posted 1 patch 1 year, 6 months ago
Failed in applying to current master (apply log)
Xen Security Advisory 421 v2 (CVE-2022-42325,CVE-2022-42326) - Xenstore: Guests can create arbitrary number of nodes via transactions
Posted by Xen.org security team 1 year, 6 months ago
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

     Xen Security Advisory CVE-2022-42325,CVE-2022-42326 / XSA-421
                               version 2

 Xenstore: Guests can create arbitrary number of nodes via transactions

UPDATES IN VERSION 2
====================

Fix typo in title.

Public release.

ISSUE DESCRIPTION
=================

In case a node has been created in a transaction and it is later
deleted in the same transaction, the transaction will be terminated
with an error.

As this error is encountered only when handling the deleted node at
transaction finalization, the transaction will have been performed
partially and without updating the accounting information.  This will
enable a malicious guest to create arbitrary number of nodes.

IMPACT
======

A malicious guest can cause memory shortage in xenstored, resulting in
a Denial of Service (DoS) of xenstored.

This will inhibit creating new guests and changing the configuration
of already running guests.

VULNERABLE SYSTEMS
==================

All systems running Xen version 4.9 and newer are affected.

Only systems running the C variant of Xenstore (xenstored or xenstore-
stubdom) are vulnerable.

Systems using the Ocaml variant of Xenstore (oxenstored) are not vulnerable.

MITIGATION
==========

Running oxenstored instead of xenstored will avoid the vulnerability.

CREDITS
=======

This issue was discovered by Julien Grall of Amazon.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

Note that patches for released versions are generally prepared to
apply to the stable branches, and may not apply cleanly to the most
recent release tarball.  Downstreams are encouraged to update to the
tip of the stable branch before applying these patches.

xsa421/xsa421-??.patch           xen-unstable, Xen 4.16.x
xsa421/xsa421-4.15-??.patch      Xen 4.15.x - 4.13.x

$ sha256sum xsa421* xsa421*/*
c2184bfb9f84220c648531e1ba13a1db0533019c999622e605a6000393e97e65  xsa421.meta
eb2c5ef828e75c79a5f2eb3274a191d3b5d13107db792b8ba2b664ef335a738e  xsa421/xsa421-01.patch
50532ad32975fdaa2674e454da125d5d44d5b471f3cf7c91f24d4128e2e4d090  xsa421/xsa421-02.patch
7ea5a47c293fd2379ec99ef88e29d4a19f03221aa731a600da510f61ff702be9  xsa421/xsa421-4.15-01.patch
8198a41789ed2c63f79f64ea491d9ebbf6d31b78a47e0ff0bbf3db8257fc5f39  xsa421/xsa421-4.15-02.patch
$

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.


(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAmNg+7IMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZgUUH/19VNMAsM8ROQ/MWuba28+8Y7iwwi/+fg5byAefj
vsQp+VfTODpvfQDngfqt43LhzHZ2YnUJqvsvteDiQKRrLtqakR5xrfAN5pNnzv8Q
PJQfIlsaxyVbeUWdsc2BPuQIdPi9hGGxVjpxTfLNSpbIk0E7pXzeztQKW7buxERv
vFLh358t2FBXXwpMD9qFHcTZX+tz9nVg9/0/POoiBb/7LKrmNQRJ3FmvqmgKwsyu
qzZli4eDkHouq/ay5RZKnhurbRxVe80yJ8yTE26AHgZayZUMkLRbTezKaUfkCDD1
Fb2wFmhOj0nfEl4taql2P4du5emFYezMVWy1JKP4y+4i0DQ=
=nNY0
-----END PGP SIGNATURE-----
{
  "XSA": 421,
  "SupportedVersions": [
    "master",
    "4.16",
    "4.15",
    "4.14",
    "4.13"
  ],
  "Trees": [
    "xen"
  ],
  "Recipes": {
    "4.13": {
      "Recipes": {
        "xen": {
          "StableRef": "0be63c2615b268001f7cc9b72ce25eed952737dc",
          "Prereqs": [
            414,
            415,
            326,
            416,
            417,
            418,
            419,
            420
          ],
          "Patches": [
            "xsa421/xsa421-4.15-??.patch"
          ]
        }
      }
    },
    "4.14": {
      "Recipes": {
        "xen": {
          "StableRef": "016de62747b26ead5a5c763b640fe8e205cd182b",
          "Prereqs": [
            414,
            415,
            326,
            416,
            417,
            418,
            419,
            420
          ],
          "Patches": [
            "xsa421/xsa421-4.15-??.patch"
          ]
        }
      }
    },
    "4.15": {
      "Recipes": {
        "xen": {
          "StableRef": "816580afdd1730d4f85f64477a242a439af1cdf8",
          "Prereqs": [
            414,
            415,
            326,
            416,
            417,
            418,
            419,
            420
          ],
          "Patches": [
            "xsa421/xsa421-4.15-??.patch"
          ]
        }
      }
    },
    "4.16": {
      "Recipes": {
        "xen": {
          "StableRef": "1bce7fb1f702da4f7a749c6f1457ecb20bf74fca",
          "Prereqs": [
            412,
            414,
            415,
            326,
            416,
            417,
            418,
            419,
            420
          ],
          "Patches": [
            "xsa421/xsa421-??.patch"
          ]
        }
      }
    },
    "master": {
      "Recipes": {
        "xen": {
          "StableRef": "cc4747be8ba157a3b310921e9ee07fb8545aa206",
          "Prereqs": [
            412,
            414,
            415,
            326,
            416,
            417,
            418,
            419,
            420
          ],
          "Patches": [
            "xsa421/xsa421-??.patch"
          ]
        }
      }
    }
  }
}From 9166869e7e6530befddfd8bb46ff37436a38efc1 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Tue, 13 Sep 2022 07:35:13 +0200
Subject: tools/xenstore: fix deleting node in transaction

In case a node has been created in a transaction and it is later
deleted in the same transaction, the transaction will be terminated
with an error.

As this error is encountered only when handling the deleted node at
transaction finalization, the transaction will have been performed
partially and without updating the accounting information. This will
enable a malicious guest to create arbitrary number of nodes.

This is part of XSA-421 / CVE-2022-42325.

Reported-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>

diff --git a/tools/xenstore/xenstored_transaction.c b/tools/xenstore/xenstored_transaction.c
index 3e3eb47326cc..7ffe21bb5285 100644
--- a/tools/xenstore/xenstored_transaction.c
+++ b/tools/xenstore/xenstored_transaction.c
@@ -418,7 +418,13 @@ static int finalize_transaction(struct connection *conn,
 						   true);
 				talloc_free(data.dptr);
 			} else {
-				ret = do_tdb_delete(conn, &key, NULL);
+				/*
+				 * A node having been created and later deleted
+				 * in this transaction will have no generation
+				 * information stored.
+				 */
+				ret = (i->generation == NO_GENERATION)
+				      ? 0 : do_tdb_delete(conn, &key, NULL);
 			}
 			if (ret)
 				goto err;
From 09fc22fea8a4689c5e563ba4a2fa959282071792 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Tue, 13 Sep 2022 07:35:14 +0200
Subject: tools/xenstore: harden transaction finalization against errors

When finalizing a transaction, any error occurring after checking for
conflicts will result in the transaction being performed only
partially today. Additionally accounting data will not be updated at
the end of the transaction, which might result in further problems
later.

Avoid those problems by multiple modifications:

- free any transaction specific nodes which don't need to be committed
  as they haven't been written during the transaction as soon as their
  generation count has been verified, this will reduce the risk of
  out-of-memory situations

- store the transaction specific node name in struct accessed_node in
  order to avoid the need to allocate additional memory for it when
  finalizing the transaction

- don't stop the transaction finalization when hitting an error
  condition, but try to continue to handle all modified nodes

- in case of a detected error do the accounting update as needed and
  call the data base checking only after that

- if writing a node in a transaction is failing (e.g. due to a failed
  quota check), fail the transaction, as prior changes to struct
  accessed_node can't easily be undone in that case

This is part of XSA-421 / CVE-2022-42326.

Reported-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Tested-by: Julien Grall <jgrall@amazon.com>

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 36fb4a832834..476d5c6d51bd 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -723,8 +723,7 @@ struct node *read_node(struct connection *conn, const void *ctx,
 		return NULL;
 	}
 
-	if (transaction_prepend(conn, name, &key))
-		return NULL;
+	transaction_prepend(conn, name, &key);
 
 	data = tdb_fetch(tdb_ctx, key);
 
@@ -842,10 +841,21 @@ int write_node_raw(struct connection *conn, TDB_DATA *key, struct node *node,
 static int write_node(struct connection *conn, struct node *node,
 		      bool no_quota_check)
 {
+	int ret;
+
 	if (access_node(conn, node, NODE_ACCESS_WRITE, &node->key))
 		return errno;
 
-	return write_node_raw(conn, &node->key, node, no_quota_check);
+	ret = write_node_raw(conn, &node->key, node, no_quota_check);
+	if (ret && conn && conn->transaction) {
+		/*
+		 * Reverting access_node() is hard, so just fail the
+		 * transaction.
+		 */
+		fail_transaction(conn->transaction);
+	}
+
+	return ret;
 }
 
 unsigned int perm_for_conn(struct connection *conn,
diff --git a/tools/xenstore/xenstored_transaction.c b/tools/xenstore/xenstored_transaction.c
index 7ffe21bb5285..ac854197cadb 100644
--- a/tools/xenstore/xenstored_transaction.c
+++ b/tools/xenstore/xenstored_transaction.c
@@ -114,7 +114,8 @@ struct accessed_node
 	struct list_head list;
 
 	/* The name of the node. */
-	char *node;
+	char *trans_name;	/* Transaction specific name. */
+	char *node;		/* Main data base name. */
 
 	/* Generation count (or NO_GENERATION) for conflict checking. */
 	uint64_t generation;
@@ -199,25 +200,20 @@ static char *transaction_get_node_name(void *ctx, struct transaction *trans,
  * Prepend the transaction to name if node has been modified in the current
  * transaction.
  */
-int transaction_prepend(struct connection *conn, const char *name,
-			TDB_DATA *key)
+void transaction_prepend(struct connection *conn, const char *name,
+			 TDB_DATA *key)
 {
-	char *tdb_name;
+	struct accessed_node *i;
 
-	if (!conn || !conn->transaction ||
-	    !find_accessed_node(conn->transaction, name)) {
-		set_tdb_key(name, key);
-		return 0;
+	if (conn && conn->transaction) {
+		i = find_accessed_node(conn->transaction, name);
+		if (i) {
+			set_tdb_key(i->trans_name, key);
+			return;
+		}
 	}
 
-	tdb_name = transaction_get_node_name(conn->transaction,
-					     conn->transaction, name);
-	if (!tdb_name)
-		return errno;
-
-	set_tdb_key(tdb_name, key);
-
-	return 0;
+	set_tdb_key(name, key);
 }
 
 /*
@@ -240,7 +236,6 @@ int access_node(struct connection *conn, struct node *node,
 	struct accessed_node *i = NULL;
 	struct transaction *trans;
 	TDB_DATA local_key;
-	const char *trans_name = NULL;
 	int ret;
 	bool introduce = false;
 
@@ -259,10 +254,6 @@ int access_node(struct connection *conn, struct node *node,
 
 	trans = conn->transaction;
 
-	trans_name = transaction_get_node_name(node, trans, node->name);
-	if (!trans_name)
-		goto nomem;
-
 	i = find_accessed_node(trans, node->name);
 	if (!i) {
 		if (trans->nodes >= quota_trans_nodes &&
@@ -273,9 +264,10 @@ int access_node(struct connection *conn, struct node *node,
 		i = talloc_zero(trans, struct accessed_node);
 		if (!i)
 			goto nomem;
-		i->node = talloc_strdup(i, node->name);
-		if (!i->node)
+		i->trans_name = transaction_get_node_name(i, trans, node->name);
+		if (!i->trans_name)
 			goto nomem;
+		i->node = strchr(i->trans_name, '/') + 1;
 		if (node->generation != NO_GENERATION && node->perms.num) {
 			i->perms.p = talloc_array(i, struct xs_permissions,
 						  node->perms.num);
@@ -302,7 +294,7 @@ int access_node(struct connection *conn, struct node *node,
 			i->generation = node->generation;
 			i->check_gen = true;
 			if (node->generation != NO_GENERATION) {
-				set_tdb_key(trans_name, &local_key);
+				set_tdb_key(i->trans_name, &local_key);
 				ret = write_node_raw(conn, &local_key, node, true);
 				if (ret)
 					goto err;
@@ -321,7 +313,7 @@ int access_node(struct connection *conn, struct node *node,
 		return -1;
 
 	if (key) {
-		set_tdb_key(trans_name, key);
+		set_tdb_key(i->trans_name, key);
 		if (type == NODE_ACCESS_WRITE)
 			i->ta_node = true;
 		if (type == NODE_ACCESS_DELETE)
@@ -333,7 +325,6 @@ int access_node(struct connection *conn, struct node *node,
 nomem:
 	ret = ENOMEM;
 err:
-	talloc_free((void *)trans_name);
 	talloc_free(i);
 	trans->fail = true;
 	errno = ret;
@@ -371,100 +362,90 @@ void queue_watches(struct connection *conn, const char *name, bool watch_exact)
  * base.
  */
 static int finalize_transaction(struct connection *conn,
-				struct transaction *trans)
+				struct transaction *trans, bool *is_corrupt)
 {
-	struct accessed_node *i;
+	struct accessed_node *i, *n;
 	TDB_DATA key, ta_key, data;
 	struct xs_tdb_record_hdr *hdr;
 	uint64_t gen;
-	char *trans_name;
-	int ret;
 
-	list_for_each_entry(i, &trans->accessed, list) {
-		if (!i->check_gen)
-			continue;
+	list_for_each_entry_safe(i, n, &trans->accessed, list) {
+		if (i->check_gen) {
+			set_tdb_key(i->node, &key);
+			data = tdb_fetch(tdb_ctx, key);
+			hdr = (void *)data.dptr;
+			if (!data.dptr) {
+				if (tdb_error(tdb_ctx) != TDB_ERR_NOEXIST)
+					return EIO;
+				gen = NO_GENERATION;
+			} else
+				gen = hdr->generation;
+			talloc_free(data.dptr);
+			if (i->generation != gen)
+				return EAGAIN;
+		}
 
-		set_tdb_key(i->node, &key);
-		data = tdb_fetch(tdb_ctx, key);
-		hdr = (void *)data.dptr;
-		if (!data.dptr) {
-			if (tdb_error(tdb_ctx) != TDB_ERR_NOEXIST)
-				return EIO;
-			gen = NO_GENERATION;
-		} else
-			gen = hdr->generation;
-		talloc_free(data.dptr);
-		if (i->generation != gen)
-			return EAGAIN;
+		/* Entries for unmodified nodes can be removed early. */
+		if (!i->modified) {
+			if (i->ta_node) {
+				set_tdb_key(i->trans_name, &ta_key);
+				if (do_tdb_delete(conn, &ta_key, NULL))
+					return EIO;
+			}
+			list_del(&i->list);
+			talloc_free(i);
+		}
 	}
 
 	while ((i = list_top(&trans->accessed, struct accessed_node, list))) {
-		trans_name = transaction_get_node_name(i, trans, i->node);
-		if (!trans_name)
-			/* We are doomed: the transaction is only partial. */
-			goto err;
-
-		set_tdb_key(trans_name, &ta_key);
-
-		if (i->modified) {
-			set_tdb_key(i->node, &key);
-			if (i->ta_node) {
-				data = tdb_fetch(tdb_ctx, ta_key);
-				if (!data.dptr)
-					goto err;
+		set_tdb_key(i->node, &key);
+		if (i->ta_node) {
+			set_tdb_key(i->trans_name, &ta_key);
+			data = tdb_fetch(tdb_ctx, ta_key);
+			if (data.dptr) {
 				hdr = (void *)data.dptr;
 				hdr->generation = ++generation;
-				ret = do_tdb_write(conn, &key, &data, NULL,
-						   true);
+				*is_corrupt |= do_tdb_write(conn, &key, &data,
+							    NULL, true);
 				talloc_free(data.dptr);
+				if (do_tdb_delete(conn, &ta_key, NULL))
+					*is_corrupt = true;
 			} else {
-				/*
-				 * A node having been created and later deleted
-				 * in this transaction will have no generation
-				 * information stored.
-				 */
-				ret = (i->generation == NO_GENERATION)
-				      ? 0 : do_tdb_delete(conn, &key, NULL);
-			}
-			if (ret)
-				goto err;
-			if (i->fire_watch) {
-				fire_watches(conn, trans, i->node, NULL,
-					     i->watch_exact,
-					     i->perms.p ? &i->perms : NULL);
+				*is_corrupt = true;
 			}
+		} else {
+			/*
+			 * A node having been created and later deleted
+			 * in this transaction will have no generation
+			 * information stored.
+			 */
+			*is_corrupt |= (i->generation == NO_GENERATION)
+				       ? false
+				       : do_tdb_delete(conn, &key, NULL);
 		}
+		if (i->fire_watch)
+			fire_watches(conn, trans, i->node, NULL, i->watch_exact,
+				     i->perms.p ? &i->perms : NULL);
 
-		if (i->ta_node && do_tdb_delete(conn, &ta_key, NULL))
-			goto err;
 		list_del(&i->list);
 		talloc_free(i);
 	}
 
 	return 0;
-
-err:
-	corrupt(conn, "Partial transaction");
-	return EIO;
 }
 
 static int destroy_transaction(void *_transaction)
 {
 	struct transaction *trans = _transaction;
 	struct accessed_node *i;
-	char *trans_name;
 	TDB_DATA key;
 
 	wrl_ntransactions--;
 	trace_destroy(trans, "transaction");
 	while ((i = list_top(&trans->accessed, struct accessed_node, list))) {
 		if (i->ta_node) {
-			trans_name = transaction_get_node_name(i, trans,
-							       i->node);
-			if (trans_name) {
-				set_tdb_key(trans_name, &key);
-				do_tdb_delete(trans->conn, &key, NULL);
-			}
+			set_tdb_key(i->trans_name, &key);
+			do_tdb_delete(trans->conn, &key, NULL);
 		}
 		list_del(&i->list);
 		talloc_free(i);
@@ -556,6 +537,7 @@ int do_transaction_end(const void *ctx, struct connection *conn,
 {
 	const char *arg = onearg(in);
 	struct transaction *trans;
+	bool is_corrupt = false;
 	int ret;
 
 	if (!arg || (!streq(arg, "T") && !streq(arg, "F")))
@@ -579,13 +561,17 @@ int do_transaction_end(const void *ctx, struct connection *conn,
 		ret = transaction_fix_domains(trans, false);
 		if (ret)
 			return ret;
-		if (finalize_transaction(conn, trans))
-			return EAGAIN;
+		ret = finalize_transaction(conn, trans, &is_corrupt);
+		if (ret)
+			return ret;
 
 		wrl_apply_debit_trans_commit(conn);
 
 		/* fix domain entry for each changed domain */
 		transaction_fix_domains(trans, true);
+
+		if (is_corrupt)
+			corrupt(conn, "transaction inconsistency");
 	}
 	send_ack(conn, XS_TRANSACTION_END);
 
@@ -660,7 +646,7 @@ int check_transactions(struct hashtable *hash)
 	struct connection *conn;
 	struct transaction *trans;
 	struct accessed_node *i;
-	char *tname, *tnode;
+	char *tname;
 
 	list_for_each_entry(conn, &connections, list) {
 		list_for_each_entry(trans, &conn->transaction_list, list) {
@@ -672,11 +658,8 @@ int check_transactions(struct hashtable *hash)
 			list_for_each_entry(i, &trans->accessed, list) {
 				if (!i->ta_node)
 					continue;
-				tnode = transaction_get_node_name(tname, trans,
-								  i->node);
-				if (!tnode || !remember_string(hash, tnode))
+				if (!remember_string(hash, i->trans_name))
 					goto nomem;
-				talloc_free(tnode);
 			}
 
 			talloc_free(tname);
diff --git a/tools/xenstore/xenstored_transaction.h b/tools/xenstore/xenstored_transaction.h
index 39d7f81c5127..3417303f9427 100644
--- a/tools/xenstore/xenstored_transaction.h
+++ b/tools/xenstore/xenstored_transaction.h
@@ -48,8 +48,8 @@ int __must_check access_node(struct connection *conn, struct node *node,
 void queue_watches(struct connection *conn, const char *name, bool watch_exact);
 
 /* Prepend the transaction to name if appropriate. */
-int transaction_prepend(struct connection *conn, const char *name,
-                        TDB_DATA *key);
+void transaction_prepend(struct connection *conn, const char *name,
+                         TDB_DATA *key);
 
 /* Mark the transaction as failed. This will prevent it to be committed. */
 void fail_transaction(struct transaction *trans);
From 246d8db540f08470c2f8789a8440173028c85b38 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Tue, 13 Sep 2022 07:35:13 +0200
Subject: tools/xenstore: fix deleting node in transaction

In case a node has been created in a transaction and it is later
deleted in the same transaction, the transaction will be terminated
with an error.

As this error is encountered only when handling the deleted node at
transaction finalization, the transaction will have been performed
partially and without updating the accounting information. This will
enable a malicious guest to create arbitrary number of nodes.

This is part of XSA-421 / CVE-2022-42325.

Reported-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>

diff --git a/tools/xenstore/xenstored_transaction.c b/tools/xenstore/xenstored_transaction.c
index 3e3eb47326cc..7ffe21bb5285 100644
--- a/tools/xenstore/xenstored_transaction.c
+++ b/tools/xenstore/xenstored_transaction.c
@@ -418,7 +418,13 @@ static int finalize_transaction(struct connection *conn,
 						   true);
 				talloc_free(data.dptr);
 			} else {
-				ret = do_tdb_delete(conn, &key, NULL);
+				/*
+				 * A node having been created and later deleted
+				 * in this transaction will have no generation
+				 * information stored.
+				 */
+				ret = (i->generation == NO_GENERATION)
+				      ? 0 : do_tdb_delete(conn, &key, NULL);
 			}
 			if (ret)
 				goto err;
From bb45167132b9988ef7282eef96beaa5c9845f9ad Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Tue, 13 Sep 2022 07:35:14 +0200
Subject: tools/xenstore: harden transaction finalization against errors

When finalizing a transaction, any error occurring after checking for
conflicts will result in the transaction being performed only
partially today. Additionally accounting data will not be updated at
the end of the transaction, which might result in further problems
later.

Avoid those problems by multiple modifications:

- free any transaction specific nodes which don't need to be committed
  as they haven't been written during the transaction as soon as their
  generation count has been verified, this will reduce the risk of
  out-of-memory situations

- store the transaction specific node name in struct accessed_node in
  order to avoid the need to allocate additional memory for it when
  finalizing the transaction

- don't stop the transaction finalization when hitting an error
  condition, but try to continue to handle all modified nodes

- in case of a detected error do the accounting update as needed and
  call the data base checking only after that

- if writing a node in a transaction is failing (e.g. due to a failed
  quota check), fail the transaction, as prior changes to struct
  accessed_node can't easily be undone in that case

This is part of XSA-421 / CVE-2022-42326.

Reported-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Tested-by: Julien Grall <jgrall@amazon.com>

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 9ddbd934f794..3c008c8cd455 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -692,8 +692,7 @@ struct node *read_node(struct connection *conn, const void *ctx,
 		return NULL;
 	}
 
-	if (transaction_prepend(conn, name, &key))
-		return NULL;
+	transaction_prepend(conn, name, &key);
 
 	data = tdb_fetch(tdb_ctx, key);
 
@@ -811,10 +810,21 @@ int write_node_raw(struct connection *conn, TDB_DATA *key, struct node *node,
 static int write_node(struct connection *conn, struct node *node,
 		      bool no_quota_check)
 {
+	int ret;
+
 	if (access_node(conn, node, NODE_ACCESS_WRITE, &node->key))
 		return errno;
 
-	return write_node_raw(conn, &node->key, node, no_quota_check);
+	ret = write_node_raw(conn, &node->key, node, no_quota_check);
+	if (ret && conn && conn->transaction) {
+		/*
+		 * Reverting access_node() is hard, so just fail the
+		 * transaction.
+		 */
+		fail_transaction(conn->transaction);
+	}
+
+	return ret;
 }
 
 enum xs_perm_type perm_for_conn(struct connection *conn,
diff --git a/tools/xenstore/xenstored_transaction.c b/tools/xenstore/xenstored_transaction.c
index 7ffe21bb5285..ac854197cadb 100644
--- a/tools/xenstore/xenstored_transaction.c
+++ b/tools/xenstore/xenstored_transaction.c
@@ -114,7 +114,8 @@ struct accessed_node
 	struct list_head list;
 
 	/* The name of the node. */
-	char *node;
+	char *trans_name;	/* Transaction specific name. */
+	char *node;		/* Main data base name. */
 
 	/* Generation count (or NO_GENERATION) for conflict checking. */
 	uint64_t generation;
@@ -199,25 +200,20 @@ static char *transaction_get_node_name(void *ctx, struct transaction *trans,
  * Prepend the transaction to name if node has been modified in the current
  * transaction.
  */
-int transaction_prepend(struct connection *conn, const char *name,
-			TDB_DATA *key)
+void transaction_prepend(struct connection *conn, const char *name,
+			 TDB_DATA *key)
 {
-	char *tdb_name;
+	struct accessed_node *i;
 
-	if (!conn || !conn->transaction ||
-	    !find_accessed_node(conn->transaction, name)) {
-		set_tdb_key(name, key);
-		return 0;
+	if (conn && conn->transaction) {
+		i = find_accessed_node(conn->transaction, name);
+		if (i) {
+			set_tdb_key(i->trans_name, key);
+			return;
+		}
 	}
 
-	tdb_name = transaction_get_node_name(conn->transaction,
-					     conn->transaction, name);
-	if (!tdb_name)
-		return errno;
-
-	set_tdb_key(tdb_name, key);
-
-	return 0;
+	set_tdb_key(name, key);
 }
 
 /*
@@ -240,7 +236,6 @@ int access_node(struct connection *conn, struct node *node,
 	struct accessed_node *i = NULL;
 	struct transaction *trans;
 	TDB_DATA local_key;
-	const char *trans_name = NULL;
 	int ret;
 	bool introduce = false;
 
@@ -259,10 +254,6 @@ int access_node(struct connection *conn, struct node *node,
 
 	trans = conn->transaction;
 
-	trans_name = transaction_get_node_name(node, trans, node->name);
-	if (!trans_name)
-		goto nomem;
-
 	i = find_accessed_node(trans, node->name);
 	if (!i) {
 		if (trans->nodes >= quota_trans_nodes &&
@@ -273,9 +264,10 @@ int access_node(struct connection *conn, struct node *node,
 		i = talloc_zero(trans, struct accessed_node);
 		if (!i)
 			goto nomem;
-		i->node = talloc_strdup(i, node->name);
-		if (!i->node)
+		i->trans_name = transaction_get_node_name(i, trans, node->name);
+		if (!i->trans_name)
 			goto nomem;
+		i->node = strchr(i->trans_name, '/') + 1;
 		if (node->generation != NO_GENERATION && node->perms.num) {
 			i->perms.p = talloc_array(i, struct xs_permissions,
 						  node->perms.num);
@@ -302,7 +294,7 @@ int access_node(struct connection *conn, struct node *node,
 			i->generation = node->generation;
 			i->check_gen = true;
 			if (node->generation != NO_GENERATION) {
-				set_tdb_key(trans_name, &local_key);
+				set_tdb_key(i->trans_name, &local_key);
 				ret = write_node_raw(conn, &local_key, node, true);
 				if (ret)
 					goto err;
@@ -321,7 +313,7 @@ int access_node(struct connection *conn, struct node *node,
 		return -1;
 
 	if (key) {
-		set_tdb_key(trans_name, key);
+		set_tdb_key(i->trans_name, key);
 		if (type == NODE_ACCESS_WRITE)
 			i->ta_node = true;
 		if (type == NODE_ACCESS_DELETE)
@@ -333,7 +325,6 @@ int access_node(struct connection *conn, struct node *node,
 nomem:
 	ret = ENOMEM;
 err:
-	talloc_free((void *)trans_name);
 	talloc_free(i);
 	trans->fail = true;
 	errno = ret;
@@ -371,100 +362,90 @@ void queue_watches(struct connection *conn, const char *name, bool watch_exact)
  * base.
  */
 static int finalize_transaction(struct connection *conn,
-				struct transaction *trans)
+				struct transaction *trans, bool *is_corrupt)
 {
-	struct accessed_node *i;
+	struct accessed_node *i, *n;
 	TDB_DATA key, ta_key, data;
 	struct xs_tdb_record_hdr *hdr;
 	uint64_t gen;
-	char *trans_name;
-	int ret;
 
-	list_for_each_entry(i, &trans->accessed, list) {
-		if (!i->check_gen)
-			continue;
+	list_for_each_entry_safe(i, n, &trans->accessed, list) {
+		if (i->check_gen) {
+			set_tdb_key(i->node, &key);
+			data = tdb_fetch(tdb_ctx, key);
+			hdr = (void *)data.dptr;
+			if (!data.dptr) {
+				if (tdb_error(tdb_ctx) != TDB_ERR_NOEXIST)
+					return EIO;
+				gen = NO_GENERATION;
+			} else
+				gen = hdr->generation;
+			talloc_free(data.dptr);
+			if (i->generation != gen)
+				return EAGAIN;
+		}
 
-		set_tdb_key(i->node, &key);
-		data = tdb_fetch(tdb_ctx, key);
-		hdr = (void *)data.dptr;
-		if (!data.dptr) {
-			if (tdb_error(tdb_ctx) != TDB_ERR_NOEXIST)
-				return EIO;
-			gen = NO_GENERATION;
-		} else
-			gen = hdr->generation;
-		talloc_free(data.dptr);
-		if (i->generation != gen)
-			return EAGAIN;
+		/* Entries for unmodified nodes can be removed early. */
+		if (!i->modified) {
+			if (i->ta_node) {
+				set_tdb_key(i->trans_name, &ta_key);
+				if (do_tdb_delete(conn, &ta_key, NULL))
+					return EIO;
+			}
+			list_del(&i->list);
+			talloc_free(i);
+		}
 	}
 
 	while ((i = list_top(&trans->accessed, struct accessed_node, list))) {
-		trans_name = transaction_get_node_name(i, trans, i->node);
-		if (!trans_name)
-			/* We are doomed: the transaction is only partial. */
-			goto err;
-
-		set_tdb_key(trans_name, &ta_key);
-
-		if (i->modified) {
-			set_tdb_key(i->node, &key);
-			if (i->ta_node) {
-				data = tdb_fetch(tdb_ctx, ta_key);
-				if (!data.dptr)
-					goto err;
+		set_tdb_key(i->node, &key);
+		if (i->ta_node) {
+			set_tdb_key(i->trans_name, &ta_key);
+			data = tdb_fetch(tdb_ctx, ta_key);
+			if (data.dptr) {
 				hdr = (void *)data.dptr;
 				hdr->generation = ++generation;
-				ret = do_tdb_write(conn, &key, &data, NULL,
-						   true);
+				*is_corrupt |= do_tdb_write(conn, &key, &data,
+							    NULL, true);
 				talloc_free(data.dptr);
+				if (do_tdb_delete(conn, &ta_key, NULL))
+					*is_corrupt = true;
 			} else {
-				/*
-				 * A node having been created and later deleted
-				 * in this transaction will have no generation
-				 * information stored.
-				 */
-				ret = (i->generation == NO_GENERATION)
-				      ? 0 : do_tdb_delete(conn, &key, NULL);
-			}
-			if (ret)
-				goto err;
-			if (i->fire_watch) {
-				fire_watches(conn, trans, i->node, NULL,
-					     i->watch_exact,
-					     i->perms.p ? &i->perms : NULL);
+				*is_corrupt = true;
 			}
+		} else {
+			/*
+			 * A node having been created and later deleted
+			 * in this transaction will have no generation
+			 * information stored.
+			 */
+			*is_corrupt |= (i->generation == NO_GENERATION)
+				       ? false
+				       : do_tdb_delete(conn, &key, NULL);
 		}
+		if (i->fire_watch)
+			fire_watches(conn, trans, i->node, NULL, i->watch_exact,
+				     i->perms.p ? &i->perms : NULL);
 
-		if (i->ta_node && do_tdb_delete(conn, &ta_key, NULL))
-			goto err;
 		list_del(&i->list);
 		talloc_free(i);
 	}
 
 	return 0;
-
-err:
-	corrupt(conn, "Partial transaction");
-	return EIO;
 }
 
 static int destroy_transaction(void *_transaction)
 {
 	struct transaction *trans = _transaction;
 	struct accessed_node *i;
-	char *trans_name;
 	TDB_DATA key;
 
 	wrl_ntransactions--;
 	trace_destroy(trans, "transaction");
 	while ((i = list_top(&trans->accessed, struct accessed_node, list))) {
 		if (i->ta_node) {
-			trans_name = transaction_get_node_name(i, trans,
-							       i->node);
-			if (trans_name) {
-				set_tdb_key(trans_name, &key);
-				do_tdb_delete(trans->conn, &key, NULL);
-			}
+			set_tdb_key(i->trans_name, &key);
+			do_tdb_delete(trans->conn, &key, NULL);
 		}
 		list_del(&i->list);
 		talloc_free(i);
@@ -556,6 +537,7 @@ int do_transaction_end(const void *ctx, struct connection *conn,
 {
 	const char *arg = onearg(in);
 	struct transaction *trans;
+	bool is_corrupt = false;
 	int ret;
 
 	if (!arg || (!streq(arg, "T") && !streq(arg, "F")))
@@ -579,13 +561,17 @@ int do_transaction_end(const void *ctx, struct connection *conn,
 		ret = transaction_fix_domains(trans, false);
 		if (ret)
 			return ret;
-		if (finalize_transaction(conn, trans))
-			return EAGAIN;
+		ret = finalize_transaction(conn, trans, &is_corrupt);
+		if (ret)
+			return ret;
 
 		wrl_apply_debit_trans_commit(conn);
 
 		/* fix domain entry for each changed domain */
 		transaction_fix_domains(trans, true);
+
+		if (is_corrupt)
+			corrupt(conn, "transaction inconsistency");
 	}
 	send_ack(conn, XS_TRANSACTION_END);
 
@@ -660,7 +646,7 @@ int check_transactions(struct hashtable *hash)
 	struct connection *conn;
 	struct transaction *trans;
 	struct accessed_node *i;
-	char *tname, *tnode;
+	char *tname;
 
 	list_for_each_entry(conn, &connections, list) {
 		list_for_each_entry(trans, &conn->transaction_list, list) {
@@ -672,11 +658,8 @@ int check_transactions(struct hashtable *hash)
 			list_for_each_entry(i, &trans->accessed, list) {
 				if (!i->ta_node)
 					continue;
-				tnode = transaction_get_node_name(tname, trans,
-								  i->node);
-				if (!tnode || !remember_string(hash, tnode))
+				if (!remember_string(hash, i->trans_name))
 					goto nomem;
-				talloc_free(tnode);
 			}
 
 			talloc_free(tname);
diff --git a/tools/xenstore/xenstored_transaction.h b/tools/xenstore/xenstored_transaction.h
index 39d7f81c5127..3417303f9427 100644
--- a/tools/xenstore/xenstored_transaction.h
+++ b/tools/xenstore/xenstored_transaction.h
@@ -48,8 +48,8 @@ int __must_check access_node(struct connection *conn, struct node *node,
 void queue_watches(struct connection *conn, const char *name, bool watch_exact);
 
 /* Prepend the transaction to name if appropriate. */
-int transaction_prepend(struct connection *conn, const char *name,
-                        TDB_DATA *key);
+void transaction_prepend(struct connection *conn, const char *name,
+                         TDB_DATA *key);
 
 /* Mark the transaction as failed. This will prevent it to be committed. */
 void fail_transaction(struct transaction *trans);