From: Pawel Mielimonka
To: dan.j.williams@intel.com, alison.schofield@intel.com
Cc: Smita.KoralahalliChannabasappa@amd.com, linux-cxl@vger.kernel.org,
 linux-kernel@vger.kernel.org, dave@stgolabs.net,
 jonathan.cameron@huawei.com, dave.jiang@intel.com,
 vishal.l.verma@intel.com, ira.weiny@intel.com, lizhijian@fujitsu.com,
 Pawel Mielimonka
Subject: [ndctl PATCH v4 RESEND] cxl/cli: enforce HPA-descending teardown
Date: Tue, 17 Feb 2026 17:27:05 +0900
Message-ID: <20260217082705.2475753-1-pawel.mielimonka@fujitsu.com>

When destroying
CXL regions, users may observe failures such as "set_dpa_size failed:
Device or resource busy" even when targeting valid regions. After such
failures, subsequent destroy/create cycles may become impossible without
a full system reset.

A region can only be destroyed if it is the last one in HPA order, but
the current logic does not guarantee descending HPA order across regions,
even when each region is mapped to a single endpoint decoder.

Alison observed that the issue extends to the case where a region's
mappings reach endpoint decoders under different root decoders. In such
cases, the HPA-descending order must be considered across all endpoint
decoders that share any of the root decoders involved, effectively
covering the entire bus/port. Without this global ordering, destroy
operations may fail unpredictably, and subsequent create operations can
also be blocked.

This change does not alter the underlying kernel behavior or decoder
programming rules. Instead, it enforces the existing ordering constraints
at the CLI level, preventing users from issuing destroy operations that
would violate the HPA continuity required by the CXL specification
(section 8.2.4.20.12).

Link to v2 - Alison's findings:
https://lore.kernel.org/linux-cxl/aTTKRCUmbNC9jIrG@aschofie-mobl2.lan/

base-commit: 4f7a1c63b3305c97013d3c46daa6c0f76feff10d

v4 updates:
- expand the commit message to include detailed description
- add references to prior discussion and failure scenarios
- follow guidelines for subject and formatting
- no functional changes compared to v3

v3 updates:
- fix iteration to cover all endpoint decoders under a common bus/port
- collapse series into a single patch

v2 updates:
- sent by mistake from wrong local branch, does not compile, should be ignored

v1: https://lore.kernel.org/linux-cxl/20251125143826.282312-1-pawel.mielimonka@fujitsu.com/

Signed-off-by: Pawel Mielimonka
---
 cxl/region.c | 128 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 125 insertions(+), 3 deletions(-)

diff --git a/cxl/region.c b/cxl/region.c
index 207cf2d..d86e45f 100644
--- a/cxl/region.c
+++ b/cxl/region.c
@@ -831,6 +831,72 @@ out:
 	return cxl_region_disable(region);
 }
 
+/*
+ * cmp_region_hpa() - Compare CXL regions by their HPA.
+ * @l: pointer to the first element (const struct cxl_region **).
+ * @r: pointer to the second element (const struct cxl_region **).
+ *
+ * Comparison function for CXL regions based on the Host Physical Address
+ * returned by cxl_region_get_resource().
+ *
+ * Return:
+ * < 0 if the HPA of the region pointed to by @l is less than that of @r
+ * = 0 if both regions have the same HPA (not expected)
+ * > 0 if the HPA of the region pointed to by @l is greater than that of @r
+ */
+static int cmp_region_hpa(const void *l, const void *r)
+{
+	const struct cxl_region *const *left = l;
+	const struct cxl_region *const *right = r;
+	u64 hpa1 = cxl_region_get_resource((struct cxl_region *) *left);
+	u64 hpa2 = cxl_region_get_resource((struct cxl_region *) *right);
+
+	return (hpa1 > hpa2) - (hpa1 < hpa2);
+}
+
+static int collect_regions_sorted(struct cxl_decoder *root,
+				  struct cxl_region ***out, int *out_nr)
+{
+	struct cxl_region *region;
+	struct cxl_region **list = NULL;
+	int nr = 0, alloc = 0;
+
+	struct cxl_port *port = cxl_decoder_get_port(root);
+	struct cxl_decoder *decoder;
+
+	cxl_decoder_foreach(port, decoder) {
+		if (!cxl_port_is_root(port))
+			continue;
+		cxl_region_foreach(decoder, region) {
+			if (nr == alloc) {
+				int new_alloc = alloc ? alloc * 2 : 8;
+				size_t new_size = (size_t)new_alloc * sizeof(*list);
+				struct cxl_region **tmp;
+
+				tmp = realloc(list, new_size);
+				if (!tmp) {
+					free(list);
+					return -ENOMEM;
+				}
+				list = tmp;
+				alloc = new_alloc;
+			}
+			list[nr++] = region;
+		}
+
+		if (!nr) {
+			free(list);
+			*out = NULL;
+			*out_nr = 0;
+			return 0;
+		}
+	}
+	qsort(list, nr, sizeof(*list), cmp_region_hpa);
+	*out = list;
+	*out_nr = nr;
+	return 0;
+}
+
 static int destroy_region(struct cxl_region *region)
 {
 	const char *devname = cxl_region_get_devname(region);
@@ -895,6 +961,59 @@ static int destroy_region(struct cxl_region *region)
 	return cxl_region_delete(region);
 }
 
+static int destroy_multiple_regions(
+	struct parsed_params *p,
+	struct cxl_decoder *decoder,
+	int *count)
+{
+	struct cxl_region **list;
+	int nr, rc, i;
+	bool skipped = false;
+
+	rc = collect_regions_sorted(decoder, &list, &nr);
+	if (rc) {
+		log_err(&rl, "failed to allocate region list: %s\n", strerror(-rc));
+		return rc;
+	}
+
+	for (i = nr - 1; i >= 0; --i) {
+		struct cxl_region *region = NULL;
+
+		for (int j = 0; j < p->argc; j++) {
+			region = util_cxl_region_filter(list[i], p->argv[j]);
+			if (region)
+				break;
+		}
+
+		if (!region) {
+			skipped = true;
+			continue;
+		}
+
+		/*
+		 * If current region matches filter, but previous didn't, destroying would
+		 * result in breaking HPA continuity
+		 */
+		if (skipped) {
+			log_err(&rl, "failed to destroy %s: out of order %s reset\n",
+				cxl_region_get_devname(region),
+				cxl_decoder_get_devname(decoder));
+			rc = -EINVAL;
+			break;
+		}
+
+		rc = destroy_region(region);
+		if (rc) {
+			log_err(&rl, "%s: failed: %s\n",
+				cxl_region_get_devname(region), strerror(-rc));
+			break;
+		}
+		++(*count);
+	}
+	free(list);
+	return rc;
+}
+
 static int do_region_xable(struct cxl_region *region, enum region_actions action)
 {
 	switch (action) {
@@ -902,8 +1021,6 @@ static int do_region_xable(struct cxl_region *region, enum region_actions action
 		return cxl_region_enable(region);
 	case ACTION_DISABLE:
 		return disable_region(region);
-	case ACTION_DESTROY:
-		return destroy_region(region);
 	default:
 		return -EINVAL;
 	}
@@ -971,7 +1088,12 @@ static int region_action(int argc, const char **argv, struct cxl_ctx *ctx,
 			if (!util_cxl_decoder_filter(decoder,
 						     param.root_decoder))
 				continue;
-			rc = decoder_region_action(p, decoder, action, count);
+
+			if (action == ACTION_DESTROY)
+				rc = destroy_multiple_regions(p, decoder, count);
+			else
+				rc = decoder_region_action(p, decoder, action, count);
+
 			if (rc)
 				err_rc = rc;
 		}
-- 
2.47.3
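
Illustration only, not part of the patch: the ordering rule that
destroy_multiple_regions() enforces can be sketched in isolation as below.
Everything here is hypothetical and independent of libcxl: struct
demo_region, cmp_demo_hpa() and demo_destroy_descending() are made-up names,
the "selected" flag stands in for the CLI filter match, and "destroying" a
region is reduced to incrementing a counter. The sketch sorts by HPA with the
same overflow-safe three-way compare, then walks from the highest HPA down
and rejects any selected region that sits below an unselected one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a region: only what the ordering check needs. */
struct demo_region {
	const char *name;
	uint64_t hpa;	/* start of the region in host physical address space */
	bool selected;	/* true if the user asked to destroy this region */
};

/* Three-way compare; avoids the truncation a plain "a - b" on u64 would risk. */
static int cmp_demo_hpa(const void *l, const void *r)
{
	const struct demo_region *a = l;
	const struct demo_region *b = r;

	return (a->hpa > b->hpa) - (a->hpa < b->hpa);
}

/*
 * Sort ascending by HPA, then walk from the highest HPA to the lowest.
 * Selected regions are counted as destroyed until an unselected region is
 * seen; a selected region below that gap is an out-of-order request.
 * Returns 0 on success, -1 on an out-of-order request. *count is advanced
 * once per region that was (notionally) destroyed before any failure.
 */
static int demo_destroy_descending(struct demo_region *regions, size_t nr,
				   int *count)
{
	bool skipped = false;

	assert(count != NULL);
	qsort(regions, nr, sizeof(*regions), cmp_demo_hpa);

	for (size_t i = nr; i-- > 0;) {
		if (!regions[i].selected) {
			skipped = true;
			continue;
		}
		if (skipped)
			return -1;
		++(*count);
	}
	return 0;
}
```

With three regions at ascending HPAs, selecting only the top two succeeds and
counts two destroys, while selecting the lowest and highest but not the
middle one destroys the highest and then fails on the lowest.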