From nobody Sun Apr 12 06:06:53 2026
From: Alireza Sanaee via qemu development
Reply-to: Alireza Sanaee
Subject: [PATCH v4 3/3] hw/cxl: Add a performant (and correct) path for the non interleaved cases
Date: Wed, 25 Feb 2026 15:19:15 +0000
Message-ID: <20260225151916.390-4-alireza.sanaee@huawei.com>
In-Reply-To: <20260225151916.390-1-alireza.sanaee@huawei.com>
References: <20260225151916.390-1-alireza.sanaee@huawei.com>
List-Id: qemu development

The CXL address-to-device decoding logic is complex because it has to
decode fine-grained interleave correctly. The current implementation
cannot be used with KVM when executed instructions may reside in that
memory, and it is very slow even under TCG.

In many real cases non-interleaved memory configurations are useful, and
for those we can use a more conventional memory region alias, giving
performance similar to other memory in the system.

Whether this fast path is applicable can only be established once the
full set of HDM decoders has been committed (in whatever order the guest
decides to commit them). A check is therefore performed on each
commit/uncommit of an HDM decoder to establish whether the alias should
be added or removed.

Co-developed-by: Jonathan Cameron
Signed-off-by: Jonathan Cameron
Signed-off-by: Alireza Sanaee
---
Change log:
v3 -> v4: The tear down path has been changed a bit, because it is not
exactly the same as setup and requires some checks in a logically
different order.
 hw/cxl/cxl-component-utils.c |   6 ++
 hw/cxl/cxl-host.c            | 188 +++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c           |   4 +
 include/hw/cxl/cxl.h         |   1 +
 include/hw/cxl/cxl_device.h  |   4 +
 5 files changed, 203 insertions(+)

diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
index d36162e91b..a10fdb0cc2 100644
--- a/hw/cxl/cxl-component-utils.c
+++ b/hw/cxl/cxl-component-utils.c
@@ -142,6 +142,12 @@ static void dumb_hdm_handler(CXLComponentState *cxl_cstate, hwaddr offset,
         value = FIELD_DP32(value, CXL_HDM_DECODER0_CTRL, COMMITTED, 0);
     }
     stl_le_p((uint8_t *)cache_mem + offset, value);
+
+    if (should_commit) {
+        cfmws_update_non_interleaved(true);
+    } else if (should_uncommit) {
+        cfmws_update_non_interleaved(false);
+    }
 }
 
 static void bi_handler(CXLComponentState *cxl_cstate, hwaddr offset,
diff --git a/hw/cxl/cxl-host.c b/hw/cxl/cxl-host.c
index 2dc9f77007..bc3b14f028 100644
--- a/hw/cxl/cxl-host.c
+++ b/hw/cxl/cxl-host.c
@@ -264,6 +264,194 @@ static PCIDevice *cxl_cfmws_find_device(CXLFixedWindow *fw, hwaddr addr,
     return d;
 }
 
+typedef struct CXLDirectPTState {
+    CXLType3Dev *ct3d;
+    hwaddr decoder_base;
+    hwaddr decoder_size;
+    hwaddr dpa_base;
+    unsigned int hdm_decoder_idx;
+} CXLDirectPTState;
+
+static void cxl_fmws_direct_passthrough_setup(CXLDirectPTState *state,
+                                              CXLFixedWindow *fw)
+{
+    CXLType3Dev *ct3d = state->ct3d;
+    MemoryRegion *mr = NULL;
+    uint64_t vmr_size = 0, pmr_size = 0, offset = 0;
+    MemoryRegion *direct_mr;
+    g_autofree char *direct_mr_name = NULL;
+    unsigned int idx = state->hdm_decoder_idx;
+
+    if (ct3d->hostvmem) {
+        MemoryRegion *vmr = host_memory_backend_get_memory(ct3d->hostvmem);
+
+        vmr_size = memory_region_size(vmr);
+        if (state->dpa_base < vmr_size) {
+            mr = vmr;
+            offset = state->dpa_base;
+        }
+    }
+    if (!mr && ct3d->hostpmem) {
+        MemoryRegion *pmr = host_memory_backend_get_memory(ct3d->hostpmem);
+
+        pmr_size = memory_region_size(pmr);
+        if (state->dpa_base - vmr_size < pmr_size) {
+            mr = pmr;
+            offset = state->dpa_base - vmr_size;
+        }
+    }
+    if (!mr) {
+        return;
+    }
+
+    if (ct3d->direct_mr_fw[idx]) {
+        return;
+    }
+
+    direct_mr = &ct3d->direct_mr[idx];
+    direct_mr_name = g_strdup_printf("cxl-direct-mapping-alias-%u", idx);
+    if (!direct_mr_name)
+        return;
+
+    memory_region_init_alias(direct_mr, OBJECT(ct3d), direct_mr_name, mr,
+                             offset, state->decoder_size);
+    memory_region_add_subregion(&fw->mr,
+                                state->decoder_base - fw->base, direct_mr);
+    ct3d->direct_mr_fw[idx] = fw;
+}
+
+static void cxl_fmws_direct_passthrough_remove(CXLType3Dev *ct3d,
+                                               uint64_t decoder_base,
+                                               unsigned int idx)
+{
+    CXLFixedWindow *owner_fw = ct3d->direct_mr_fw[idx];
+    MemoryRegion *direct_mr = &ct3d->direct_mr[idx];
+
+    if (!owner_fw) {
+        return;
+    }
+
+    if (!memory_region_is_mapped(direct_mr)) {
+        return;
+    }
+
+    if (cxl_cfmws_find_device(owner_fw, decoder_base, false)) {
+        return;
+    }
+
+    memory_region_del_subregion(&owner_fw->mr, direct_mr);
+    ct3d->direct_mr_fw[idx] = NULL;
+}
+
+static int cxl_fmws_direct_passthrough(Object *obj, void *opaque)
+{
+    CXLDirectPTState *state = opaque;
+    CXLFixedWindow *fw;
+
+    if (!object_dynamic_cast(obj, TYPE_CXL_FMW)) {
+        return 0;
+    }
+
+    fw = CXL_FMW(obj);
+
+    /* Verify not interleaved */
+    if (!cxl_cfmws_find_device(fw, state->decoder_base, false)) {
+        return 0;
+    }
+
+    cxl_fmws_direct_passthrough_setup(state, fw);
+
+    return 0;
+}
+
+static int update_non_interleaved(Object *obj, void *opaque)
+{
+    const int hdm_inc = R_CXL_HDM_DECODER1_BASE_LO - R_CXL_HDM_DECODER0_BASE_LO;
+    bool commit = *(bool *)opaque;
+    CXLType3Dev *ct3d;
+    uint32_t *cache_mem;
+    unsigned int hdm_count, i;
+    uint32_t cap;
+    uint64_t dpa_base = 0;
+
+    if (!object_dynamic_cast(obj, TYPE_CXL_TYPE3)) {
+        return 0;
+    }
+
+    ct3d = CXL_TYPE3(obj);
+    cache_mem = ct3d->cxl_cstate.crb.cache_mem_registers;
+    cap = ldl_le_p(cache_mem + R_CXL_HDM_DECODER_CAPABILITY);
+    hdm_count = cxl_decoder_count_dec(FIELD_EX32(cap,
+                                                 CXL_HDM_DECODER_CAPABILITY,
+                                                 DECODER_COUNT));
+    for (i = 0; i < hdm_count; i++) {
+        uint64_t decoder_base, decoder_size, skip;
+        uint32_t hdm_ctrl, low, high;
+        int iw, committed;
+
+        hdm_ctrl = ldl_le_p(cache_mem + R_CXL_HDM_DECODER0_CTRL + i * hdm_inc);
+        committed = FIELD_EX32(hdm_ctrl, CXL_HDM_DECODER0_CTRL, COMMITTED);
+
+        /*
+         * Optimization: Looking for a fully committed path; if the type 3 HDM
+         * decoder is not committed, it cannot lie on such a path.
+         */
+        if (commit && !committed) {
+            return 0;
+        }
+
+        low = ldl_le_p(cache_mem + R_CXL_HDM_DECODER0_DPA_SKIP_LO +
+                       i * hdm_inc);
+        high = ldl_le_p(cache_mem + R_CXL_HDM_DECODER0_DPA_SKIP_HI +
+                        i * hdm_inc);
+        skip = ((uint64_t)high << 32) | (low & 0xf0000000);
+        dpa_base += skip;
+
+        low = ldl_le_p(cache_mem + R_CXL_HDM_DECODER0_SIZE_LO + i * hdm_inc);
+        high = ldl_le_p(cache_mem + R_CXL_HDM_DECODER0_SIZE_HI + i * hdm_inc);
+        decoder_size = ((uint64_t)high << 32) | (low & 0xf0000000);
+
+        low = ldl_le_p(cache_mem + R_CXL_HDM_DECODER0_BASE_LO + i * hdm_inc);
+        high = ldl_le_p(cache_mem + R_CXL_HDM_DECODER0_BASE_HI + i * hdm_inc);
+        decoder_base = ((uint64_t)high << 32) | (low & 0xf0000000);
+
+        iw = FIELD_EX32(hdm_ctrl, CXL_HDM_DECODER0_CTRL, IW);
+
+        if (iw == 0) {
+            if (!commit) {
+                cxl_fmws_direct_passthrough_remove(ct3d, decoder_base, i);
+            } else {
+                CXLDirectPTState state = {
+                    .ct3d = ct3d,
+                    .decoder_base = decoder_base,
+                    .decoder_size = decoder_size,
+                    .dpa_base = dpa_base,
+                    .hdm_decoder_idx = i,
+                };
+
+                object_child_foreach_recursive(object_get_root(),
+                                               cxl_fmws_direct_passthrough,
+                                               &state);
+            }
+        }
+        dpa_base += decoder_size / cxl_interleave_ways_dec(iw, &error_fatal);
+    }
+
+    return 0;
+}
+
+bool cfmws_update_non_interleaved(bool commit)
+{
+    /*
+     * Walk endpoints to find both committed and uncommitted decoders,
+     * then check if they are not interleaved (but the path is fully set up).
+     */
+    object_child_foreach_recursive(object_get_root(),
+                                   update_non_interleaved, &commit);
+
+    return false;
+}
+
 static MemTxResult cxl_read_cfmws(void *opaque, hwaddr addr, uint64_t *data,
                                   unsigned size, MemTxAttrs attrs)
 {
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 3f09c589ae..a95f6a4014 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -427,6 +427,8 @@ static void hdm_decoder_commit(CXLType3Dev *ct3d, int which)
     ctrl = FIELD_DP32(ctrl, CXL_HDM_DECODER0_CTRL, COMMITTED, 1);
 
     stl_le_p(cache_mem + R_CXL_HDM_DECODER0_CTRL + which * hdm_inc, ctrl);
+
+    cfmws_update_non_interleaved(true);
 }
 
 static void hdm_decoder_uncommit(CXLType3Dev *ct3d, int which)
@@ -442,6 +444,8 @@ static void hdm_decoder_uncommit(CXLType3Dev *ct3d, int which)
     ctrl = FIELD_DP32(ctrl, CXL_HDM_DECODER0_CTRL, COMMITTED, 0);
 
     stl_le_p(cache_mem + R_CXL_HDM_DECODER0_CTRL + which * hdm_inc, ctrl);
+
+    cfmws_update_non_interleaved(false);
 }
 
 static int ct3d_qmp_uncor_err_to_cxl(CxlUncorErrorType qmp_err)
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 998f495a98..931f5680bd 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -71,4 +71,5 @@ CXLComponentState *cxl_usp_to_cstate(CXLUpstreamPort *usp);
 typedef struct CXLDownstreamPort CXLDownstreamPort;
 DECLARE_INSTANCE_CHECKER(CXLDownstreamPort, CXL_DSP, TYPE_CXL_DSP)
 
+bool cfmws_update_non_interleaved(bool commit);
 #endif
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 393f312217..ba551fa5f9 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -685,6 +685,8 @@ typedef struct CXLSetFeatureInfo {
     size_t data_size;
 } CXLSetFeatureInfo;
 
+typedef struct CXLFixedWindow CXLFixedWindow;
+
 struct CXLSanitizeInfo;
 
 typedef struct CXLAlertConfig {
@@ -712,6 +714,8 @@ struct CXLType3Dev {
     uint64_t sn;
 
     /* State */
+    MemoryRegion direct_mr[CXL_HDM_DECODER_COUNT];
+    CXLFixedWindow *direct_mr_fw[CXL_HDM_DECODER_COUNT];
     AddressSpace hostvmem_as;
     AddressSpace hostpmem_as;
     CXLComponentState cxl_cstate;
-- 
2.43.0