From nobody Sun Nov 24 09:19:48 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9D1AD18FC75; Wed, 6 Nov 2024 02:10:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730859047; cv=none; b=WDpz4dSi78YFrNznrzx5pWHtAaJn13b8MFG2e1jOILQfzvIoLqAR3MAMLjCa4HaptTRfjC5ImoOaHMQPhEacomZsXtmGWY+YdBPUDmdthzcDV0M6i82bZ6tUnLFgc54vWUzK4RVsIUDjGxVvc8PiknseIXPawFwhsfblCPyGaZ8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730859047; c=relaxed/simple; bh=BF/3rx8ylxCkAH0LZYctZzCK7MF4sAsutyGFA5v7yUE=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version; b=AyCer3MBFWNyc0uzZ7QW+8p2LjK4wPmGLSeYOvzaZreTPiAN5G7eS59knccXI7sBuEF2Basi9eNXgKlnGKVamsdjDqjz+WxqN1hbmv7RXURYkFUo+UebGxdBuMvP8VaflOgLDLjY05SnG+Eb251WE4vyMPwS7Imd79JGdY7Efq8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=S0KmcA9w; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="S0KmcA9w" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 958C1C4CECF; Wed, 6 Nov 2024 02:10:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730859047; bh=BF/3rx8ylxCkAH0LZYctZzCK7MF4sAsutyGFA5v7yUE=; h=From:To:Cc:Subject:Date:From; b=S0KmcA9waN59BJqkZEDJd0Z3+Sur1yO2GmkjgFMXxzv2VVhhSDzjbhCegCCIOn77K CAXYJCplK3XR5j8OnK10SkvmHkv9cUzA7mO7jC7s2ZEoGyiNvjabdHuzOPlAWWXDN9 M3ol7E7JPzRrgUAem7f4Yx/ndLzBdFBQA0y84BSO/pRAiaBhb9IAm/4aUNwWakwU/P pZCTN+FvpSkhYvpU1RvJlfZ3lsiCpZsb7mpRB7DqXfXUSrPCSlvbYcu2Wg9h+FmmoI vgmJea9neyw41t3RFBVmRQL86o6fYeXTGWhwoAZlF1yoEc/wvkds7BB2aHulAD/y5u 08/o2oidq17OQ== From: Sasha Levin To: stable@vger.kernel.org, dan.j.williams@intel.com Cc: Jonathan Cameron , Greg Kroah-Hartman , Davidlohr Bueso , Dave Jiang , Alison Schofield , Ira Weiny , Zijun Hu , linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org Subject: FAILED: Patch "cxl/port: Fix use-after-free, permit out-of-order decoder shutdown" failed to apply to v6.1-stable tree Date: Tue, 5 Nov 2024 21:10:42 -0500 Message-ID: <20241106021044.181771-1-sashal@kernel.org> X-Mailer: git-send-email 2.43.0 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Hint: ignore X-stable: review Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The patch below does not apply to the v6.1-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to . Thanks, Sasha Reviewed-by: Ira Weiny Reviewed-by: Jonathan Cameron ------------------ original commit in Linus's tree ------------------ From 101c268bd2f37e965a5468353e62d154db38838e Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Tue, 22 Oct 2024 18:43:49 -0700 Subject: [PATCH] cxl/port: Fix use-after-free, permit out-of-order decoder shutdown In support of investigating an initialization failure report [1], cxl_test was updated to register mock memory-devices after the mock root-port/bus device had been registered. That led to cxl_test crashing with a use-after-free bug with the following signature: cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0= add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1 cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0= add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1 cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0]= =3D cxl_switch_dport.0 for mem0:decoder7.0 @ 0 1) cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1]= =3D cxl_switch_dport.4 for mem4:decoder14.0 @ 1 [..] cxld_unregister: cxl decoder14.0: cxl_region_decode_reset: cxl_region region3: mock_decoder_reset: cxl_port port3: decoder3.0 reset 2) mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, exp= ected decoder3.1 cxl_endpoint_decoder_release: cxl decoder14.0: [..] cxld_unregister: cxl decoder7.0: 3) cxl_region_decode_reset: cxl_region region3: Oops: general protection fault, probably for non-canonical address 0x6b= 6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI [..] RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core] [..] Call Trace: cxl_region_decode_reset+0x69/0x190 [cxl_core] cxl_region_detach+0xe8/0x210 [cxl_core] cxl_decoder_kill_region+0x27/0x40 [cxl_core] cxld_unregister+0x5d/0x60 [cxl_core] At 1) a region has been established with 2 endpoint decoders (7.0 and 14.0). Those endpoints share a common switch-decoder in the topology (3.0). At teardown, 2), decoder14.0 is the first to be removed and hits the "out of order reset case" in the switch decoder. The effect though is that region3 cleanup is aborted leaving it in-tact and referencing decoder14.0. At 3) the second attempt to teardown region3 trips over the stale decoder14.0 object which has long since been deleted. The fix here is to recognize that the CXL specification places no mandate on in-order shutdown of switch-decoders, the driver enforces in-order allocation, and hardware enforces in-order commit. So, rather than fail and leave objects dangling, always remove them. In support of making cxl_region_decode_reset() always succeed, cxl_region_invalidate_memregion() failures are turned into warnings. Crashing the kernel is ok there since system integrity is at risk if caches cannot be managed around physical address mutation events like CXL region destruction. A new device_for_each_child_reverse_from() is added to cleanup port->commit_end after all dependent decoders have been disabled. In other words if decoders are allocated 0->1->2 and disabled 1->2->0 then port->commit_end only decrements from 2 after 2 has been disabled, and it decrements all the way to zero since 1 was disabled previously. Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Cc: stable@vger.kernel.org Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Jonathan Cameron Cc: Greg Kroah-Hartman Cc: Davidlohr Bueso Cc: Dave Jiang Cc: Alison Schofield Cc: Ira Weiny Cc: Zijun Hu Signed-off-by: Dan Williams Reviewed-by: Ira Weiny Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgi= t@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny --- drivers/base/core.c | 35 +++++++++++++++++++++++++ drivers/cxl/core/hdm.c | 50 ++++++++++++++++++++++++++++++------ drivers/cxl/core/region.c | 48 ++++++++++------------------------ drivers/cxl/cxl.h | 3 ++- include/linux/device.h | 3 +++ tools/testing/cxl/test/cxl.c | 14 ++++------ 6 files changed, 100 insertions(+), 53 deletions(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index a4c853411a6b2..e42f1ad730780 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -4037,6 +4037,41 @@ int device_for_each_child_reverse(struct device *par= ent, void *data, } EXPORT_SYMBOL_GPL(device_for_each_child_reverse); =20 +/** + * device_for_each_child_reverse_from - device child iterator in reversed = order. + * @parent: parent struct device. + * @from: optional starting point in child list + * @fn: function to be called for each device. + * @data: data for the callback. + * + * Iterate over @parent's child devices, starting at @from, and call @fn + * for each, passing it @data. This helper is identical to + * device_for_each_child_reverse() when @from is NULL. + * + * @fn is checked each iteration. If it returns anything other than 0, + * iteration stop and that value is returned to the caller of + * device_for_each_child_reverse_from(); + */ +int device_for_each_child_reverse_from(struct device *parent, + struct device *from, const void *data, + int (*fn)(struct device *, const void *)) +{ + struct klist_iter i; + struct device *child; + int error =3D 0; + + if (!parent->p) + return 0; + + klist_iter_init_node(&parent->p->klist_children, &i, + (from ? &from->p->knode_parent : NULL)); + while ((child =3D prev_device(&i)) && !error) + error =3D fn(child, data); + klist_iter_exit(&i); + return error; +} +EXPORT_SYMBOL_GPL(device_for_each_child_reverse_from); + /** * device_find_child - device iterator for locating a particular device. * @parent: parent struct device diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 3df10517a3278..223c273c0cd17 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -712,7 +712,44 @@ static int cxl_decoder_commit(struct cxl_decoder *cxld) return 0; } =20 -static int cxl_decoder_reset(struct cxl_decoder *cxld) +static int commit_reap(struct device *dev, const void *data) +{ + struct cxl_port *port =3D to_cxl_port(dev->parent); + struct cxl_decoder *cxld; + + if (!is_switch_decoder(dev) && !is_endpoint_decoder(dev)) + return 0; + + cxld =3D to_cxl_decoder(dev); + if (port->commit_end =3D=3D cxld->id && + ((cxld->flags & CXL_DECODER_F_ENABLE) =3D=3D 0)) { + port->commit_end--; + dev_dbg(&port->dev, "reap: %s commit_end: %d\n", + dev_name(&cxld->dev), port->commit_end); + } + + return 0; +} + +void cxl_port_commit_reap(struct cxl_decoder *cxld) +{ + struct cxl_port *port =3D to_cxl_port(cxld->dev.parent); + + lockdep_assert_held_write(&cxl_region_rwsem); + + /* + * Once the highest committed decoder is disabled, free any other + * decoders that were pinned allocated by out-of-order release. + */ + port->commit_end--; + dev_dbg(&port->dev, "reap: %s commit_end: %d\n", dev_name(&cxld->dev), + port->commit_end); + device_for_each_child_reverse_from(&port->dev, &cxld->dev, NULL, + commit_reap); +} +EXPORT_SYMBOL_NS_GPL(cxl_port_commit_reap, CXL); + +static void cxl_decoder_reset(struct cxl_decoder *cxld) { struct cxl_port *port =3D to_cxl_port(cxld->dev.parent); struct cxl_hdm *cxlhdm =3D dev_get_drvdata(&port->dev); @@ -721,14 +758,14 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld) u32 ctrl; =20 if ((cxld->flags & CXL_DECODER_F_ENABLE) =3D=3D 0) - return 0; + return; =20 - if (port->commit_end !=3D id) { + if (port->commit_end =3D=3D id) + cxl_port_commit_reap(cxld); + else dev_dbg(&port->dev, "%s: out of order reset, expected decoder%d.%d\n", dev_name(&cxld->dev), port->id, port->commit_end); - return -EBUSY; - } =20 down_read(&cxl_dpa_rwsem); ctrl =3D readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id)); @@ -741,7 +778,6 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld) writel(0, hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id)); up_read(&cxl_dpa_rwsem); =20 - port->commit_end--; cxld->flags &=3D ~CXL_DECODER_F_ENABLE; =20 /* Userspace is now responsible for reconfiguring this decoder */ @@ -751,8 +787,6 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld) cxled =3D to_cxl_endpoint_decoder(&cxld->dev); cxled->state =3D CXL_DECODER_STATE_MANUAL; } - - return 0; } =20 static int cxl_setup_hdm_decoder_from_dvsec( diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index e701e4b040328..3478d20583035 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -232,8 +232,8 @@ static int cxl_region_invalidate_memregion(struct cxl_r= egion *cxlr) "Bypassing cpu_cache_invalidate_memregion() for testing!\n"); return 0; } else { - dev_err(&cxlr->dev, - "Failed to synchronize CPU cache state\n"); + dev_WARN(&cxlr->dev, + "Failed to synchronize CPU cache state\n"); return -ENXIO; } } @@ -242,19 +242,17 @@ static int cxl_region_invalidate_memregion(struct cxl= _region *cxlr) return 0; } =20 -static int cxl_region_decode_reset(struct cxl_region *cxlr, int count) +static void cxl_region_decode_reset(struct cxl_region *cxlr, int count) { struct cxl_region_params *p =3D &cxlr->params; - int i, rc =3D 0; + int i; =20 /* - * Before region teardown attempt to flush, and if the flush - * fails cancel the region teardown for data consistency - * concerns + * Before region teardown attempt to flush, evict any data cached for + * this region, or scream loudly about missing arch / platform support + * for CXL teardown. */ - rc =3D cxl_region_invalidate_memregion(cxlr); - if (rc) - return rc; + cxl_region_invalidate_memregion(cxlr); =20 for (i =3D count - 1; i >=3D 0; i--) { struct cxl_endpoint_decoder *cxled =3D p->targets[i]; @@ -277,23 +275,17 @@ static int cxl_region_decode_reset(struct cxl_region = *cxlr, int count) cxl_rr =3D cxl_rr_load(iter, cxlr); cxld =3D cxl_rr->decoder; if (cxld->reset) - rc =3D cxld->reset(cxld); - if (rc) - return rc; + cxld->reset(cxld); set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags); } =20 endpoint_reset: - rc =3D cxled->cxld.reset(&cxled->cxld); - if (rc) - return rc; + cxled->cxld.reset(&cxled->cxld); set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags); } =20 /* all decoders associated with this region have been torn down */ clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags); - - return 0; } =20 static int commit_decoder(struct cxl_decoder *cxld) @@ -409,16 +401,8 @@ static ssize_t commit_store(struct device *dev, struct= device_attribute *attr, * still pending. */ if (p->state =3D=3D CXL_CONFIG_RESET_PENDING) { - rc =3D cxl_region_decode_reset(cxlr, p->interleave_ways); - /* - * Revert to committed since there may still be active - * decoders associated with this region, or move forward - * to active to mark the reset successful - */ - if (rc) - p->state =3D CXL_CONFIG_COMMIT; - else - p->state =3D CXL_CONFIG_ACTIVE; + cxl_region_decode_reset(cxlr, p->interleave_ways); + p->state =3D CXL_CONFIG_ACTIVE; } } =20 @@ -2054,13 +2038,7 @@ static int cxl_region_detach(struct cxl_endpoint_dec= oder *cxled) get_device(&cxlr->dev); =20 if (p->state > CXL_CONFIG_ACTIVE) { - /* - * TODO: tear down all impacted regions if a device is - * removed out of order - */ - rc =3D cxl_region_decode_reset(cxlr, p->interleave_ways); - if (rc) - goto out; + cxl_region_decode_reset(cxlr, p->interleave_ways); p->state =3D CXL_CONFIG_ACTIVE; } =20 diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 0d8b810a51f04..5406e3ab3d4a4 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -359,7 +359,7 @@ struct cxl_decoder { struct cxl_region *region; unsigned long flags; int (*commit)(struct cxl_decoder *cxld); - int (*reset)(struct cxl_decoder *cxld); + void (*reset)(struct cxl_decoder *cxld); }; =20 /* @@ -730,6 +730,7 @@ static inline bool is_cxl_root(struct cxl_port *port) int cxl_num_decoders_committed(struct cxl_port *port); bool is_cxl_port(const struct device *dev); struct cxl_port *to_cxl_port(const struct device *dev); +void cxl_port_commit_reap(struct cxl_decoder *cxld); struct pci_bus; int devm_cxl_register_pci_bus(struct device *host, struct device *uport_de= v, struct pci_bus *bus); diff --git a/include/linux/device.h b/include/linux/device.h index b4bde8d226979..667cb6db90193 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -1078,6 +1078,9 @@ int device_for_each_child(struct device *dev, void *d= ata, int (*fn)(struct device *dev, void *data)); int device_for_each_child_reverse(struct device *dev, void *data, int (*fn)(struct device *dev, void *data)); +int device_for_each_child_reverse_from(struct device *parent, + struct device *from, const void *data, + int (*fn)(struct device *, const void *)); struct device *device_find_child(struct device *dev, void *data, int (*match)(struct device *dev, void *data)); struct device *device_find_child_by_name(struct device *parent, diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c index 90d5afd52dd06..c5bbd89b31920 100644 --- a/tools/testing/cxl/test/cxl.c +++ b/tools/testing/cxl/test/cxl.c @@ -693,26 +693,22 @@ static int mock_decoder_commit(struct cxl_decoder *cx= ld) return 0; } =20 -static int mock_decoder_reset(struct cxl_decoder *cxld) +static void mock_decoder_reset(struct cxl_decoder *cxld) { struct cxl_port *port =3D to_cxl_port(cxld->dev.parent); int id =3D cxld->id; =20 if ((cxld->flags & CXL_DECODER_F_ENABLE) =3D=3D 0) - return 0; + return; =20 dev_dbg(&port->dev, "%s reset\n", dev_name(&cxld->dev)); - if (port->commit_end !=3D id) { + if (port->commit_end =3D=3D id) + cxl_port_commit_reap(cxld); + else dev_dbg(&port->dev, "%s: out of order reset, expected decoder%d.%d\n", dev_name(&cxld->dev), port->id, port->commit_end); - return -EBUSY; - } - - port->commit_end--; cxld->flags &=3D ~CXL_DECODER_F_ENABLE; - - return 0; } =20 static void default_mock_decoder(struct cxl_decoder *cxld) --=20 2.43.0