From nobody Wed Jun 17 02:50:55 2026 Received: from out30-100.freemail.mail.aliyun.com (out30-100.freemail.mail.aliyun.com [115.124.30.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 42230449EB2; Tue, 28 Apr 2026 13:14:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.100 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382061; cv=none; b=Im8M+D1Z/gMPsvoFbwDsvmp+6rayiFxIymH0r1HQezdSs1A86z2jikuI8RxQ9hse/PyKkGPNIuQRU3sFQplpH8kz83csFviMl3b4Tb6zUpwGKcLNZc/HXhdB8rg8chb1OsaEbwnNENig9OAiAZgyqcPkYpZcKlgH0U17Z4MM990= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382061; c=relaxed/simple; bh=OYEP1JP8BVFHXbULzGk1wNAbCbBlFfgcJ1H95hoOhwQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=gZ99vEU9qxNlniK0PFvHgXq8MftWZ74mqriWc5vRdzv0IayhszRd4lKBcol02EK49vN1U4fvjtDKoNQ2RkccgWkteLQQESSOPBFBMm53R7iKnCNCuYyux5Zc56d7T/niJ5dagSWL5WX9AK23B8aZAkRIzzoW8VkknmSqxZSxExY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=KJjAT3K+; arc=none smtp.client-ip=115.124.30.100 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="KJjAT3K+" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382055; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=ZUHnJEfer0hml+VpdHPHpDcwMmO2L2tgfhLYa+ZhlhQ=; 
b=KJjAT3K+Axk0MWlmVWuAuzpBYj+xMvI+RjiSnpXeBcYTziK31IFL7Z24+jWxtwXLLduPa7TYdbnC8ylxNyUo+RsoHDb+VeiYXRYg3NfIjusIyRc4vdVGD8niG5MBnqriWpbyJs6ed5kTTEylhmKs4p4rxuoF7HObsUWBfoE+08o= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R171e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045098064;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=23;SR=0;TI=SMTPD_---0X1ubIDQ_1777382052; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIDQ_1777382052 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:13 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Fangyu Yu Subject: [RFC PATCH 01/11] iommupt: Add RISC-V Second-stage (iohgatp) page table support Date: Tue, 28 Apr 2026 21:13:49 +0800 Message-Id: <20260428131359.34872-2-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Fangyu Yu Add support for Sv39x4/Sv48x4/Sv57x4 Second-stage page tables used by the RISC-V IOMMU iohgatp register. The x4 root page table is 16 KiB instead of the usual 4 KiB, covering 2 extra GPA bits (hw_max_vasz_lg2 =3D 41/50/59). 
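The address-width figures above follow from simple arithmetic, sketched below. This is only an illustration under stated assumptions: the helper and constant names are hypothetical and do not exist in the kernel tree; the sketch assumes 4 KiB pages, 8-byte PTEs, and a table with top_level + 1 levels.

```c
#include <stdint.h>

/* Hypothetical constants for illustration only. */
enum {
	DEMO_PAGE_LG2 = 12,	/* 4 KiB page / table size, log2 */
	DEMO_PTE_LG2 = 3,	/* 8-byte PTE, log2 */
	DEMO_X4_EXTRA_BITS = 2,	/* extra GPA bits resolved by an x4 root */
};

/* Index bits per ordinary level: 4 KiB / 8 bytes = 512 entries = 2^9. */
static inline unsigned int demo_level_index_bits(void)
{
	return DEMO_PAGE_LG2 - DEMO_PTE_LG2;
}

/* Input address width for a table whose top level is top_level. */
static inline unsigned int demo_vasz_lg2(unsigned int top_level, int second_stage)
{
	unsigned int bits = DEMO_PAGE_LG2 +
			    (top_level + 1) * demo_level_index_bits();

	/* Only the root of a second-stage table is widened. */
	return second_stage ? bits + DEMO_X4_EXTRA_BITS : bits;
}

/* Root table size in bytes: 4 KiB normally, 16 KiB for an x4 root. */
static inline uint64_t demo_root_table_bytes(int second_stage)
{
	unsigned int items_lg2 = demo_level_index_bits() +
				 (second_stage ? DEMO_X4_EXTRA_BITS : 0);

	return (uint64_t)1 << (items_lg2 + DEMO_PTE_LG2);
}
```

With top levels 2, 3 and 4 this gives 39/48/57 for first-stage and 41/50/59 for second-stage tables, which is why the widths alone can tell the two cases apart.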
Signed-off-by: Fangyu Yu --- drivers/iommu/generic_pt/fmt/riscv.h | 64 +++++++++++++++++++++++++--- include/linux/generic_pt/common.h | 5 +++ include/linux/generic_pt/iommu.h | 17 +++++++- 3 files changed, 80 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/generic_pt/fmt/riscv.h b/drivers/iommu/generic_p= t/fmt/riscv.h index a7fef6266a36..4fe645e60375 100644 --- a/drivers/iommu/generic_pt/fmt/riscv.h +++ b/drivers/iommu/generic_pt/fmt/riscv.h @@ -37,7 +37,16 @@ enum { PT_MAX_OUTPUT_ADDRESS_LG2 =3D 34, PT_MAX_TOP_LEVEL =3D 1, #else - PT_MAX_VA_ADDRESS_LG2 =3D 57, + /* + * PT_MAX_VA_ADDRESS_LG2 is the upper bound accepted by the generic + * pt_iommu_init() range check. It must cover both first-stage and + * second-stage (G-stage) modes: + * + * First-stage (fsc/iosatp): Sv39=3D39, Sv48=3D48, Sv57=3D57 + * Second-stage (iohgatp): Sv39x4=3D41, Sv48x4=3D50, Sv57x4=3D59 + * + */ + PT_MAX_VA_ADDRESS_LG2 =3D 59, PT_MAX_OUTPUT_ADDRESS_LG2 =3D 56, PT_MAX_TOP_LEVEL =3D 4, #endif @@ -124,6 +133,14 @@ riscvpt_entry_num_contig_lg2(const struct pt_state *pt= s) =20 static inline unsigned int riscvpt_num_items_lg2(const struct pt_state *pt= s) { + /* + * Second-stage (iohgatp) root page tables have 4x the usual number of + * entries (2048 =3D 2^11 instead of 512 =3D 2^9) to cover the 2 extra GPA + * bits in Sv39x4/Sv48x4/Sv57x4. Only the root (top) level is + * enlarged; all other levels remain at the standard 9-bit index width. 
+ */ + if (to_riscvpt(pts)->second_stage && pts->level =3D=3D pts->range->top_le= vel) + return PT_TABLEMEM_LG2SZ - ilog2(sizeof(u64)) + 2; return PT_TABLEMEM_LG2SZ - ilog2(sizeof(u64)); } #define pt_num_items_lg2 riscvpt_num_items_lg2 @@ -254,6 +271,7 @@ riscvpt_iommu_fmt_init(struct pt_iommu_riscv_64 *iommu_= table, struct pt_riscv *table =3D &iommu_table->riscv_64pt; =20 switch (cfg->common.hw_max_vasz_lg2) { + /* First-stage (fsc/iosatp): Sv39 / Sv48 / Sv57 */ case 39: pt_top_set_level(&table->common, 2); break; @@ -263,6 +281,22 @@ riscvpt_iommu_fmt_init(struct pt_iommu_riscv_64 *iommu= _table, case 57: pt_top_set_level(&table->common, 4); break; + /* + * Second-stage (iohgatp): Sv39x4 / Sv48x4 / Sv57x4. + * The top level is the same as for the first-stage counterpart. + */ + case 41: + pt_top_set_level(&table->common, 2); + table->second_stage =3D true; + break; + case 50: + pt_top_set_level(&table->common, 3); + table->second_stage =3D true; + break; + case 59: + pt_top_set_level(&table->common, 4); + table->second_stage =3D true; + break; default: return -EINVAL; } @@ -283,10 +317,17 @@ riscvpt_iommu_fmt_hw_info(struct pt_iommu_riscv_64 *t= able, PT_WARN_ON(top_phys & ~PT_TOP_PHYS_MASK); =20 /* - * See Table 3. Encodings of iosatp.MODE field" for DC.tx.SXL =3D 0: - * 8 =3D Sv39 =3D top level 2 - * 9 =3D Sv38 =3D top level 3 - * 10 =3D Sv57 =3D top level 4 + * Both first-stage (fsc/iosatp) and second-stage (iohgatp) share the + * same MODE numeric values for a given top level: + * top_level 2 -> MODE 8 (Sv39 / Sv39x4) + * top_level 3 -> MODE 9 (Sv48 / Sv48x4) + * top_level 4 -> MODE 10 (Sv57 / Sv57x4) + * + * The union members fsc_iosatp_mode and iohgatp_mode occupy the same + * byte; the caller selects the appropriate name based on domain type. + * + * See "Table 3. Encodings of iosatp.MODE field" (DC.tc.SXL =3D 0) and + * "Table 2. Encoding of iohgatp.MODE field" in the RISC-V IOMMU spec. 
*/ info->fsc_iosatp_mode =3D top_range->top_level + 6; } @@ -294,6 +335,7 @@ riscvpt_iommu_fmt_hw_info(struct pt_iommu_riscv_64 *tab= le, =20 #if defined(GENERIC_PT_KUNIT) static const struct pt_iommu_riscv_64_cfg riscv_64_kunit_fmt_cfgs[] =3D { + /* First-stage (fsc/iosatp): Sv39 / Sv48 / Sv57 */ [0] =3D { .common.features =3D BIT(PT_FEAT_RISCV_SVNAPOT_64K), .common.hw_max_oasz_lg2 =3D 56, .common.hw_max_vasz_lg2 =3D 39 }, @@ -303,6 +345,18 @@ static const struct pt_iommu_riscv_64_cfg riscv_64_kun= it_fmt_cfgs[] =3D { [2] =3D { .common.features =3D BIT(PT_FEAT_RISCV_SVNAPOT_64K), .common.hw_max_oasz_lg2 =3D 56, .common.hw_max_vasz_lg2 =3D 57 }, + /* + * Second-stage (iohgatp): Sv39x4 / Sv48x4 / Sv57x4. + */ + [3] =3D { .common.features =3D BIT(PT_FEAT_RISCV_SVNAPOT_64K), + .common.hw_max_oasz_lg2 =3D 56, + .common.hw_max_vasz_lg2 =3D 41 }, + [4] =3D { .common.features =3D 0, + .common.hw_max_oasz_lg2 =3D 56, + .common.hw_max_vasz_lg2 =3D 50 }, + [5] =3D { .common.features =3D BIT(PT_FEAT_RISCV_SVNAPOT_64K), + .common.hw_max_oasz_lg2 =3D 56, + .common.hw_max_vasz_lg2 =3D 59 }, }; #define kunit_fmt_cfgs riscv_64_kunit_fmt_cfgs enum { diff --git a/include/linux/generic_pt/common.h b/include/linux/generic_pt/c= ommon.h index fc5d0b5edadc..e82dff33ece8 100644 --- a/include/linux/generic_pt/common.h +++ b/include/linux/generic_pt/common.h @@ -181,6 +181,11 @@ struct pt_riscv_32 { =20 struct pt_riscv_64 { struct pt_common common; + /* + * True when this table is used for second-stage / iohgatp + * address translation. 
+ */ + bool second_stage; }; =20 enum { diff --git a/include/linux/generic_pt/iommu.h b/include/linux/generic_pt/io= mmu.h index dd0edd02a48a..f27d229ff318 100644 --- a/include/linux/generic_pt/iommu.h +++ b/include/linux/generic_pt/iommu.h @@ -328,7 +328,22 @@ struct pt_iommu_riscv_64_cfg { =20 struct pt_iommu_riscv_64_hw_info { u64 ppn; - u8 fsc_iosatp_mode; + union { + /* + * First-stage (fsc/iosatp) MODE encoding: + * 8 =3D Sv39, 9 =3D Sv48, 10 =3D Sv57 + * Used to program DC.fsc.iosatp.MODE. + */ + u8 fsc_iosatp_mode; + /* + * Second-stage (iohgatp) MODE encoding: + * 8 =3D Sv39x4, 9 =3D Sv48x4, 10 =3D Sv57x4 + * Used to program DC.iohgatp.MODE. + * The numeric values are identical to fsc_iosatp_mode; + * the caller selects the interpretation based on domain type. + */ + u8 iohgatp_mode; + }; }; =20 IOMMU_FORMAT(riscv_64, riscv_64pt); --=20 2.50.1 From nobody Wed Jun 17 02:50:55 2026 Received: from out30-101.freemail.mail.aliyun.com (out30-101.freemail.mail.aliyun.com [115.124.30.101]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C07B8450903; Tue, 28 Apr 2026 13:14:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.101 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382068; cv=none; b=eun6w1jqD/eo69uz3iQI2cGtTPMuYd2wp2nxY1gUK/d6ThCNAotBWTxXQMe9itQJycbo5TcpJy9WLuQNzZxtnTGuF/l/fYzF4kAixD41IYv3MXHHZeX2ELX1Izm7ll0pV40DVbIbCpS/5PXpUVft1LGDbPC/F0cpoxBG0I0/2Qc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382068; c=relaxed/simple; bh=HdxLgGLxNR1dpWrEI4X/cPS6HEO/pUfKU/3U6Qtdilg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=nEPOifmS3eQgW5XAzTzb6jpE+DevYxELRLfsAUYSChK0yoPuhUnQ4Ch9W0hfnE2snIjcenGg2p4V6LjaIjgHWomMncXLaXHR7F2e0IWI8AcQ5VPIlYcoK+VrAF4rCKXSmqFq+2CTSKz/xTVzU2DsRC/7biJre/zuYJHGMHJdpjM= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=nFaZEV3/; arc=none smtp.client-ip=115.124.30.101 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="nFaZEV3/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382057; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=4AWwGUYTanTn719lzq6A/EUfbomVklkhE0MF8yj4zBQ=; b=nFaZEV3/WQ5v0NvWWbYLRKGvi1DG4CQvSnFVnTQape7JCB2zITzHMgHW8Z7RMRb6fG8bzI2sp1jh66T0oxUjuJHYAFMqUXDnGruJ4+R0uPJcRhfa3TduaqHnZ+mkm0vlA8a7zY3hpuBcdArdBxFk57eSy1P8Te5j17H1G2QZZl0= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R871e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037026112;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=23;SR=0;TI=SMTPD_---0X1ubIDe_1777382054; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIDe_1777382054 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:15 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Fangyu Yu Subject: [RFC PATCH 02/11] iommu/riscv: 
report iommu capabilities Date: Tue, 28 Apr 2026 21:13:50 +0800 Message-Id: <20260428131359.34872-3-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Tomasz Jeznach Report RISC-V IOMMU capabilities required by VFIO subsystem to enable PCIe device assignment. Signed-off-by: Tomasz Jeznach Signed-off-by: Fangyu Yu --- drivers/iommu/riscv/iommu.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c index a31f50bbad35..15e2a333f969 100644 --- a/drivers/iommu/riscv/iommu.c +++ b/drivers/iommu/riscv/iommu.c @@ -1336,6 +1336,17 @@ static struct iommu_group *riscv_iommu_device_group(= struct device *dev) return generic_device_group(dev); } =20 +static bool riscv_iommu_capable(struct device *dev, enum iommu_cap cap) +{ + switch (cap) { + case IOMMU_CAP_CACHE_COHERENCY: + case IOMMU_CAP_DEFERRED_FLUSH: + return true; + default: + return false; + } +} + static int riscv_iommu_of_xlate(struct device *dev, const struct of_phandl= e_args *args) { return iommu_fwspec_add_ids(dev, args->args, 1); @@ -1397,6 +1408,7 @@ static void riscv_iommu_release_device(struct device = *dev) =20 static const struct iommu_ops riscv_iommu_ops =3D { .of_xlate =3D riscv_iommu_of_xlate, + .capable =3D riscv_iommu_capable, .identity_domain =3D &riscv_iommu_identity_domain, .blocked_domain =3D &riscv_iommu_blocking_domain, .release_domain =3D &riscv_iommu_blocking_domain, --=20 2.50.1 From nobody Wed Jun 17 02:50:55 2026 Received: from out30-111.freemail.mail.aliyun.com (out30-111.freemail.mail.aliyun.com [115.124.30.111]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AF1AE45091D; Tue, 28 Apr 2026 13:14:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.111 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382070; cv=none; b=myxnp27XkuCl/4eWn5QzMB0thNw4ttzJs3EhNK60ZaGdR5/Dryt9jdtRuJArLFCEbyuaUwj6OMaAFfdYXt33ZI9nsE3WMPgf40ydLEmhxdocKoodvsbSMxe2jR+EL6meEbFYXuS6tPDkPp1VENAR4g/w7zYIDA1KmdRYecksIbA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382070; c=relaxed/simple; bh=9l5J2IcJPjHU73EY5WFkQWCg8u7oIh9DL9z3SVjfGpg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=q2prAG/CVteXzbjg870/3fUBMB/VJC1AiFhves4jjb8BFvLbFcjjEwMrCjOhaP6ykrcWQ6msBuMFKYEWIrXoWyUIuO2BwxT42RPs5hHOUalIXCy2FD11XTHrpu6GSeLaJfxCUT8JQJavRr90gKdlrc9R0AMTJMXHdfuvCpeKbv8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=W9s16CXP; arc=none smtp.client-ip=115.124.30.111 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="W9s16CXP" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382059; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=MqNgI4sv+FepWpFcUsgsx+IZ0W7HlKVhYzr2f1YWc7s=; b=W9s16CXPKPac3Kt6thDkIY+bNgz66Fcn7yvM650X3ePkz421wApDXQz6xvIDPxHQkgsbtASJmqi36EpXHItT+mNL5jsNviB9j4aaevoe6uj0k1FpxpGa6Dtcu32vxtjwFDLrD3/pFmheUz6qYfXiBmglATaoB6QGCAlVRQ9Qr2Y= X-Alimail-AntiSpam: 
AC=PASS;BC=-1|-1;BR=01201311R271e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045098064;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=24;SR=0;TI=SMTPD_---0X1ubIET_1777382056; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIET_1777382056 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:17 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Zong Li , Fangyu Yu Subject: [RFC PATCH 03/11] iommu/riscv: use data structure instead of individual values Date: Tue, 28 Apr 2026 21:13:51 +0800 Message-Id: <20260428131359.34872-4-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Zong Li The parameter list of riscv_iommu_iodir_update() will keep growing as more bit fields of the device context need to be set up. Pass the values in a data structure instead of as individual arguments. 
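The motivation can be shown with a reduced, stand-alone sketch. Everything here is hypothetical: struct demo_dc, DEMO_VALID and both helpers are simplified stand-ins, not the driver's struct riscv_iommu_dc or its masks. It only contrasts the two calling conventions.

```c
#include <stdint.h>

/* Hypothetical valid bit carried in 'ta' by the caller (assumption). */
#define DEMO_VALID 0x1ULL

/* Simplified stand-in for a device context. */
struct demo_dc {
	uint64_t tc;
	uint64_t iohgatp;
	uint64_t ta;
	uint64_t fsc;
};

/* Old shape: each new field adds a parameter at every call site. */
static inline void demo_update_args(struct demo_dc *dc, uint64_t fsc,
				    uint64_t ta)
{
	dc->fsc = fsc;
	dc->ta = ta & ~DEMO_VALID;
	dc->tc |= ta & DEMO_VALID;	/* valid bit is applied last */
}

/* New shape: callers fill a struct, so the signature stays stable. */
static inline void demo_update_struct(struct demo_dc *dc,
				      const struct demo_dc *new_dc)
{
	dc->fsc = new_dc->fsc;
	dc->ta = new_dc->ta & ~DEMO_VALID;
	dc->iohgatp = new_dc->iohgatp;	/* extra field, no new parameter */
	dc->tc |= new_dc->ta & DEMO_VALID;
}
```

A caller that only cares about fsc and ta zero-initialises the struct and sets those two members; a field added later is picked up by the struct variant without touching existing callers.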
Signed-off-by: Zong Li Signed-off-by: Fangyu Yu --- drivers/iommu/riscv/iommu.c | 27 +++++++++++++++++---------- 1 file changed, 17 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c index 15e2a333f969..369c98b7e1e5 100644 --- a/drivers/iommu/riscv/iommu.c +++ b/drivers/iommu/riscv/iommu.c @@ -1077,7 +1077,7 @@ static void riscv_iommu_iodir_iotinval(struct riscv_i= ommu_device *iommu, * interim translation faults. */ static void riscv_iommu_iodir_update(struct riscv_iommu_device *iommu, - struct device *dev, u64 fsc, u64 ta) + struct device *dev, struct riscv_iommu_dc *new_dc) { struct iommu_fwspec *fwspec =3D dev_iommu_fwspec_get(dev); struct riscv_iommu_dc *dc; @@ -1116,10 +1116,10 @@ static void riscv_iommu_iodir_update(struct riscv_i= ommu_device *iommu, for (i =3D 0; i < fwspec->num_ids; i++) { dc =3D riscv_iommu_get_dc(iommu, fwspec->ids[i]); tc =3D READ_ONCE(dc->tc); - tc |=3D ta & RISCV_IOMMU_DC_TC_V; + tc |=3D new_dc->ta & RISCV_IOMMU_DC_TC_V; =20 - WRITE_ONCE(dc->fsc, fsc); - WRITE_ONCE(dc->ta, ta & RISCV_IOMMU_PC_TA_PSCID); + WRITE_ONCE(dc->fsc, new_dc->fsc); + WRITE_ONCE(dc->ta, new_dc->ta & RISCV_IOMMU_PC_TA_PSCID); /* Update device context, write TC.V as the last step. 
*/ dma_wmb(); WRITE_ONCE(dc->tc, tc); @@ -1205,22 +1205,22 @@ static int riscv_iommu_attach_paging_domain(struct = iommu_domain *iommu_domain, struct riscv_iommu_device *iommu =3D dev_to_iommu(dev); struct riscv_iommu_info *info =3D dev_iommu_priv_get(dev); struct pt_iommu_riscv_64_hw_info pt_info; - u64 fsc, ta; + struct riscv_iommu_dc dc =3D {0}; =20 pt_iommu_riscv_64_hw_info(&domain->riscvpt, &pt_info); =20 if (!riscv_iommu_pt_supported(iommu, pt_info.fsc_iosatp_mode)) return -ENODEV; =20 - fsc =3D FIELD_PREP(RISCV_IOMMU_PC_FSC_MODE, pt_info.fsc_iosatp_mode) | + dc.fsc =3D FIELD_PREP(RISCV_IOMMU_PC_FSC_MODE, pt_info.fsc_iosatp_mode) | FIELD_PREP(RISCV_IOMMU_PC_FSC_PPN, pt_info.ppn); - ta =3D FIELD_PREP(RISCV_IOMMU_PC_TA_PSCID, domain->pscid) | + dc.ta =3D FIELD_PREP(RISCV_IOMMU_PC_TA_PSCID, domain->pscid) | RISCV_IOMMU_PC_TA_V; =20 if (riscv_iommu_bond_link(domain, dev)) return -ENOMEM; =20 - riscv_iommu_iodir_update(iommu, dev, fsc, ta); + riscv_iommu_iodir_update(iommu, dev, &dc); riscv_iommu_bond_unlink(info->domain, dev); info->domain =3D domain; =20 @@ -1292,9 +1292,12 @@ static int riscv_iommu_attach_blocking_domain(struct= iommu_domain *iommu_domain, { struct riscv_iommu_device *iommu =3D dev_to_iommu(dev); struct riscv_iommu_info *info =3D dev_iommu_priv_get(dev); + struct riscv_iommu_dc dc =3D {0}; + + dc.fsc =3D RISCV_IOMMU_FSC_BARE; =20 /* Make device context invalid, translation requests will fault w/ #258 */ - riscv_iommu_iodir_update(iommu, dev, RISCV_IOMMU_FSC_BARE, 0); + riscv_iommu_iodir_update(iommu, dev, &dc); riscv_iommu_bond_unlink(info->domain, dev); info->domain =3D NULL; =20 @@ -1314,8 +1317,12 @@ static int riscv_iommu_attach_identity_domain(struct= iommu_domain *iommu_domain, { struct riscv_iommu_device *iommu =3D dev_to_iommu(dev); struct riscv_iommu_info *info =3D dev_iommu_priv_get(dev); + struct riscv_iommu_dc dc =3D {0}; + + dc.fsc =3D RISCV_IOMMU_FSC_BARE; + dc.ta =3D RISCV_IOMMU_PC_TA_V; =20 - riscv_iommu_iodir_update(iommu, dev, 
RISCV_IOMMU_FSC_BARE, RISCV_IOMMU_PC= _TA_V); + riscv_iommu_iodir_update(iommu, dev, &dc); riscv_iommu_bond_unlink(info->domain, dev); info->domain =3D NULL; =20 --=20 2.50.1 From nobody Wed Jun 17 02:50:55 2026 Received: from out30-132.freemail.mail.aliyun.com (out30-132.freemail.mail.aliyun.com [115.124.30.132]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7224C450903; Tue, 28 Apr 2026 13:14:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.132 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382073; cv=none; b=AV9RzEKV/TeIhibc0QX5RlSq0da7jr+Y/YLnwWydjMWoQed2mCp0UiMcyLTmJHoe+TA39BdKB0uoMTrxKH3Wr1g5Ns9zI9BE1ZC0QswYnQXNeHIicP5qvAMhB4yC2LCCcT1ZCG+CY0c6XJfyL4eY55wdoQxAPErp+gs1n2BX9Fs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382073; c=relaxed/simple; bh=0j5ZL3J+m+k4ABT8fB03YfQnL0eBL5sT6rdu7L7+iYQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=F2N1ygceJFH9CtLDIa1/myZv3kr63xGj7zczOc7EEdHg6tBKOVoMWob7yLX8lAQWmOlg6e4nj27wnjHqHdwJj7Ne9QqzJueipsciY0TTbF1lkMNX9l8W4+KB/uMcbfnJAjsUvunbbwIsJa/oltM9kjl9OiD4FpWXe4Xll41KZto= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=osCM39ps; arc=none smtp.client-ip=115.124.30.132 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="osCM39ps" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=linux.alibaba.com; s=default; t=1777382061; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=iWNf/dxG28Z+J1fFtK1WtJe+wKDMG+JY4xphdsapE1c=; b=osCM39psTopiwVz3b2EJ1UFGC1cMLwdefB8sWax/2cXMALals6mZemQT+jlvJSWCWXyifItHF/fYLrA0MWKj+1mW74lITUoruo36cR3wUg59CA8PUEGjHnhjad9wxPYpgzzLNtfv9tLhF2UR8wOR1cTRFX4wg6ZwYGDy1SyGMc0= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R481e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam011083073210;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=24;SR=0;TI=SMTPD_---0X1ubIF3_1777382058; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIF3_1777382058 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:19 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Zong Li , Fangyu Yu Subject: [RFC PATCH 04/11] iommu/riscv: support GSCID and GVMA invalidation command Date: Tue, 28 Apr 2026 21:13:52 +0800 Message-Id: <20260428131359.34872-5-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Zong Li Add an ID allocator for GSCIDs and a wrapper for setting the GSCID in the IOTLB invalidation command. 
Set up iohgatp to enable second stage table and flush stage-2 table if the GSCID is set. The GSCID of domain should be freed when release domain. GSCID will be allocated for parent domain in nested IOMMU process. Signed-off-by: Zong Li Signed-off-by: Fangyu Yu --- drivers/iommu/riscv/iommu-bits.h | 7 +++++++ drivers/iommu/riscv/iommu.c | 32 ++++++++++++++++++++++++++------ 2 files changed, 33 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/riscv/iommu-bits.h b/drivers/iommu/riscv/iommu-b= its.h index 29a0040b1c32..7c440926fa23 100644 --- a/drivers/iommu/riscv/iommu-bits.h +++ b/drivers/iommu/riscv/iommu-bits.h @@ -716,6 +716,13 @@ static inline void riscv_iommu_cmd_inval_vma(struct ri= scv_iommu_command *cmd) cmd->dword1 =3D 0; } =20 +static inline void riscv_iommu_cmd_inval_gvma(struct riscv_iommu_command *= cmd) +{ + cmd->dword0 =3D FIELD_PREP(RISCV_IOMMU_CMD_OPCODE, RISCV_IOMMU_CMD_IOTINV= AL_OPCODE) | + FIELD_PREP(RISCV_IOMMU_CMD_FUNC, RISCV_IOMMU_CMD_IOTINVAL_FUNC_GVM= A); + cmd->dword1 =3D 0; +} + static inline void riscv_iommu_cmd_inval_set_addr(struct riscv_iommu_comma= nd *cmd, u64 addr) { diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c index 369c98b7e1e5..5dadf6d09139 100644 --- a/drivers/iommu/riscv/iommu.c +++ b/drivers/iommu/riscv/iommu.c @@ -48,6 +48,10 @@ static DEFINE_IDA(riscv_iommu_pscids); #define RISCV_IOMMU_MAX_PSCID (BIT(20) - 1) =20 +/* IOMMU GSCID allocation namespace. */ +static DEFINE_IDA(riscv_iommu_gscids); +#define RISCV_IOMMU_MAX_GSCID (BIT(16) - 1) + /* Device resource-managed allocations */ struct riscv_iommu_devres { void *addr; @@ -819,6 +823,7 @@ struct riscv_iommu_domain { struct list_head bonds; spinlock_t lock; /* protect bonds list updates. 
*/ int pscid; + int gscid; }; PT_IOMMU_CHECK_DOMAIN(struct riscv_iommu_domain, riscvpt.iommu, domain); =20 @@ -967,15 +972,20 @@ static void riscv_iommu_iotlb_inval(struct riscv_iomm= u_domain *domain, =20 /* * IOTLB invalidation request can be safely omitted if already sent - * to the IOMMU for the same PSCID, and with domain->bonds list + * to the IOMMU for the same PSCID/GSCID, and with domain->bonds list * arranged based on the device's IOMMU, it's sufficient to check * last device the invalidation was sent to. */ if (iommu =3D=3D prev) continue; =20 - riscv_iommu_cmd_inval_vma(&cmd); - riscv_iommu_cmd_inval_set_pscid(&cmd, domain->pscid); + if (domain->gscid) { + riscv_iommu_cmd_inval_gvma(&cmd); + riscv_iommu_cmd_inval_set_gscid(&cmd, domain->gscid); + } else { + riscv_iommu_cmd_inval_vma(&cmd); + riscv_iommu_cmd_inval_set_pscid(&cmd, domain->pscid); + } if (end - start < RISCV_IOMMU_IOTLB_INVAL_LIMIT - 1) { unsigned long iova =3D start; =20 @@ -1120,6 +1130,7 @@ static void riscv_iommu_iodir_update(struct riscv_iom= mu_device *iommu, =20 WRITE_ONCE(dc->fsc, new_dc->fsc); WRITE_ONCE(dc->ta, new_dc->ta & RISCV_IOMMU_PC_TA_PSCID); + WRITE_ONCE(dc->iohgatp, new_dc->iohgatp); /* Update device context, write TC.V as the last step. 
*/ dma_wmb(); WRITE_ONCE(dc->tc, tc); @@ -1175,8 +1186,10 @@ static void riscv_iommu_free_paging_domain(struct io= mmu_domain *iommu_domain) =20 WARN_ON(!list_empty(&domain->bonds)); =20 - if ((int)domain->pscid > 0) + if (domain->pscid > 0) ida_free(&riscv_iommu_pscids, domain->pscid); + if (domain->gscid > 0) + ida_free(&riscv_iommu_gscids, domain->gscid); =20 pt_iommu_deinit(&domain->riscvpt.iommu); kfree(domain); @@ -1212,8 +1225,15 @@ static int riscv_iommu_attach_paging_domain(struct i= ommu_domain *iommu_domain, if (!riscv_iommu_pt_supported(iommu, pt_info.fsc_iosatp_mode)) return -ENODEV; =20 - dc.fsc =3D FIELD_PREP(RISCV_IOMMU_PC_FSC_MODE, pt_info.fsc_iosatp_mode) | - FIELD_PREP(RISCV_IOMMU_PC_FSC_PPN, pt_info.ppn); + if (domain->gscid) { + dc.iohgatp =3D FIELD_PREP(RISCV_IOMMU_DC_IOHGATP_MODE, pt_info.iohgatp_m= ode) | + FIELD_PREP(RISCV_IOMMU_DC_IOHGATP_GSCID, domain->gscid) | + FIELD_PREP(RISCV_IOMMU_DC_IOHGATP_PPN, pt_info.ppn); + } else { + dc.fsc =3D FIELD_PREP(RISCV_IOMMU_PC_FSC_MODE, pt_info.fsc_iosatp_mode) | + FIELD_PREP(RISCV_IOMMU_PC_FSC_PPN, pt_info.ppn); + } + dc.ta =3D FIELD_PREP(RISCV_IOMMU_PC_TA_PSCID, domain->pscid) | RISCV_IOMMU_PC_TA_V; =20 --=20 2.50.1 From nobody Wed Jun 17 02:50:55 2026 Received: from out30-111.freemail.mail.aliyun.com (out30-111.freemail.mail.aliyun.com [115.124.30.111]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC59345BD7F; Tue, 28 Apr 2026 13:14:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.111 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382073; cv=none; b=J3MleoXPWxBIvNftjQ38VmZ0JSxia3CSH54jSGYDc0nYlGD/g1NVnOxADA9Z9xD86xMnp38Ifyfm1mvHltFT2wztqlWpG64qXjWcWdS2d3ZLTkm7iO0/EfkLnBc9901Ht0bOS+7/PG86LYTnr1BRzuz+Cj8px8LsOsjz4GN/pkI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1777382073; c=relaxed/simple; bh=TJtwwmOes+WMqZV10KyeFqw49c64kqL7395J+MxsNMo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=fJbvN76KkyniroOtCEsA/Bo5JS8X/JKPrmb/+CKRa/brtt2g5jt+y9xTdoWaMxMcJFYmw3ykizX+0GG9FFwqWxeyzpV6oIs5JexUzTRdngUYUJkiqaCMVjD+Mf5uu1z8uwfGQryn27GFgD2Y4ZzBOHEiupT+z/ja/GLMBi5Shvs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=UKSGHG8e; arc=none smtp.client-ip=115.124.30.111 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="UKSGHG8e" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382063; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=RWu/puVJ24xC3wGWoddID7UrDVPs6J4eX8hoo+uZKZs=; b=UKSGHG8eRXdi22jfhZDotVlb+9KiVX4hP+wie53qzJSLUUTEp5F2vpajwXpXMsxa4M5vRamEjL5tADnxtSVWLZOXNhcDGrRsXH/jqku4Ul3M4FptQUzk/F0P5UMdQRsK9X7DUm+cIcWv4CvaRAa8yUP7cAGFZw/crZnKBqTO7Kk= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R661e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045133197;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=24;SR=0;TI=SMTPD_---0X1ubIFW_1777382060; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIFW_1777382060 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:21 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, 
baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Andrew Jones , Fangyu Yu Subject: [RFC PATCH 05/11] RISC-V: KVM: Enable KVM_VFIO interfaces on RISC-V arch Date: Tue, 28 Apr 2026 21:13:53 +0800 Message-Id: <20260428131359.34872-6-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Tomasz Jeznach Enable KVM/VFIO support on RISC-V architecture. Signed-off-by: Tomasz Jeznach Signed-off-by: Andrew Jones Signed-off-by: Fangyu Yu --- arch/riscv/kvm/Kconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig index ec2cee0a39e0..54ee90f010ef 100644 --- a/arch/riscv/kvm/Kconfig +++ b/arch/riscv/kvm/Kconfig @@ -30,8 +30,10 @@ config KVM select KVM_GENERIC_HARDWARE_ENABLING select KVM_MMIO select VIRT_XFER_TO_GUEST_WORK + select KVM_VFIO select SCHED_INFO select GUEST_PERF_EVENTS if PERF_EVENTS + select SRCU help Support hosting virtualized guest machines. 
=20 --=20 2.50.1 From nobody Wed Jun 17 02:50:55 2026 Received: from out30-101.freemail.mail.aliyun.com (out30-101.freemail.mail.aliyun.com [115.124.30.101]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3A77E450909; Tue, 28 Apr 2026 13:14:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.101 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382072; cv=none; b=EoBTU00Y+twYrBWJXg/VTfz1X8NLAzVpCocky6D1ncPSudhzt9hNo4EDWm2lllITqeXpn2pzxbC8LZyQKyAIyEgS3t2CJWLY7vgvO1oPg6FdszWxg1EgXo0ATRaDnJ9uUDzJfU5SH7iZBQgl0PMVkzoWj1Gu/MuyvFvAH+RRqHM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382072; c=relaxed/simple; bh=T4H+/LpGeeu7kcTVvhLBYQ/XYxFHumzultQXItxHw80=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cheod4NDrpQNO3M/yDctdRt5fNxOeXppNKTpkeAa1oR26/90amRyp5Fy+wHMXWRcNyREKE/81W+bpHPHsA6J9so04W/tSLbzc3Lmk3Eo6Pji10EooTeR/PjMs5qxmwSMghBYft3GoYYyOV+Il3boN9F6fmJKJyq6e0KegFOMscc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=ZfS8a9z3; arc=none smtp.client-ip=115.124.30.101 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="ZfS8a9z3" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382065; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=Qsv/5zq52oNQKyoYl2cE2fHq6XjpzP4W+zUJu/N+63k=; 
b=ZfS8a9z3NVCZNAyt8G5hcKma8DzaFJdle7Q0tFKRq/YsW6CnhcQPxILrS0pi+Oi+oTysQqfqOff6+P16h6y3zNHibMm9PwIb4sUmY6rWzfwGni93Nsiw3c/yJkoqrkgJ1cULnB+TMfmnvT2Gh4TI/cxDKs6AZc9DYa7yIMlfakk= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R211e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam011083073210;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=23;SR=0;TI=SMTPD_---0X1ubIFt_1777382062; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIFt_1777382062 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:23 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Fangyu Yu Subject: [RFC PATCH 06/11] iommu/riscv: Add domain_alloc_paging_flags for second-stage domain Date: Tue, 28 Apr 2026 21:13:54 +0800 Message-Id: <20260428131359.34872-7-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Fangyu Yu Replace .domain_alloc_paging with .domain_alloc_paging_flags so callers can pass allocation flags to select the appropriate page-table type. When IOMMU_HWPT_ALLOC_NEST_PARENT or IOMMU_HWPT_ALLOC_DIRTY_TRACKING is set in @flags, allocate a second-stage (iohgatp) domain. 
When @flags is 0, the behaviour is identical to the previous domain_alloc_paging: a first-stage (iosatp) domain is allocated. Signed-off-by: Fangyu Yu --- drivers/iommu/riscv/iommu.c | 66 ++++++++++++++++++++++++++++--------- 1 file changed, 51 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c index 5dadf6d09139..0c13430ecc7f 100644 --- a/drivers/iommu/riscv/iommu.c +++ b/drivers/iommu/riscv/iommu.c @@ -1255,23 +1255,50 @@ static const struct iommu_domain_ops riscv_iommu_pa= ging_domain_ops =3D { .flush_iotlb_all =3D riscv_iommu_iotlb_flush_all, }; =20 -static struct iommu_domain *riscv_iommu_alloc_paging_domain(struct device = *dev) +static struct iommu_domain *riscv_iommu_domain_alloc_paging_flags( + struct device *dev, u32 flags, + const struct iommu_user_data *user_data) { + const bool second_stage =3D flags & + (IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING); struct pt_iommu_riscv_64_cfg cfg =3D {}; struct riscv_iommu_domain *domain; struct riscv_iommu_device *iommu; int ret; =20 + if (user_data) + return ERR_PTR(-EOPNOTSUPP); + iommu =3D dev_to_iommu(dev); - if (iommu->caps & RISCV_IOMMU_CAPABILITIES_SV57) { - cfg.common.hw_max_vasz_lg2 =3D 57; - } else if (iommu->caps & RISCV_IOMMU_CAPABILITIES_SV48) { - cfg.common.hw_max_vasz_lg2 =3D 48; - } else if (iommu->caps & RISCV_IOMMU_CAPABILITIES_SV39) { - cfg.common.hw_max_vasz_lg2 =3D 39; + + if (second_stage) { + /* + * Second-stage (iohgatp) page table for KVM VFIO device + * pass-through and dirty tracking. The GPA space is 2 bits + * wider than the corresponding first-stage VA space (x4 root + * page table), so hw_max_vasz_lg2 values are 41/50/59.
+ */ + if (iommu->caps & RISCV_IOMMU_CAPABILITIES_SV57X4) { + cfg.common.hw_max_vasz_lg2 =3D 59; + } else if (iommu->caps & RISCV_IOMMU_CAPABILITIES_SV48X4) { + cfg.common.hw_max_vasz_lg2 =3D 50; + } else if (iommu->caps & RISCV_IOMMU_CAPABILITIES_SV39X4) { + cfg.common.hw_max_vasz_lg2 =3D 41; + } else { + dev_err(dev, "cannot find supported second-stage page table mode\n"); + return ERR_PTR(-ENODEV); + } } else { - dev_err(dev, "cannot find supported page table mode\n"); - return ERR_PTR(-ENODEV); + if (iommu->caps & RISCV_IOMMU_CAPABILITIES_SV57) { + cfg.common.hw_max_vasz_lg2 =3D 57; + } else if (iommu->caps & RISCV_IOMMU_CAPABILITIES_SV48) { + cfg.common.hw_max_vasz_lg2 =3D 48; + } else if (iommu->caps & RISCV_IOMMU_CAPABILITIES_SV39) { + cfg.common.hw_max_vasz_lg2 =3D 39; + } else { + dev_err(dev, "cannot find supported page table mode\n"); + return ERR_PTR(-ENODEV); + } } cfg.common.hw_max_oasz_lg2 =3D 56; =20 @@ -1291,11 +1318,20 @@ static struct iommu_domain *riscv_iommu_alloc_pagin= g_domain(struct device *dev) domain->riscvpt.iommu.nid =3D dev_to_node(iommu->dev); domain->domain.ops =3D &riscv_iommu_paging_domain_ops; =20 - domain->pscid =3D ida_alloc_range(&riscv_iommu_pscids, 1, - RISCV_IOMMU_MAX_PSCID, GFP_KERNEL); - if (domain->pscid < 0) { - riscv_iommu_free_paging_domain(&domain->domain); - return ERR_PTR(-ENOMEM); + if (second_stage) { + domain->gscid =3D ida_alloc_range(&riscv_iommu_gscids, 1, + RISCV_IOMMU_MAX_GSCID, GFP_KERNEL); + if (domain->gscid < 0) { + riscv_iommu_free_paging_domain(&domain->domain); + return ERR_PTR(-ENOMEM); + } + } else { + domain->pscid =3D ida_alloc_range(&riscv_iommu_pscids, 1, + RISCV_IOMMU_MAX_PSCID, GFP_KERNEL); + if (domain->pscid < 0) { + riscv_iommu_free_paging_domain(&domain->domain); + return ERR_PTR(-ENOMEM); + } } =20 ret =3D pt_iommu_riscv_64_init(&domain->riscvpt, &cfg, GFP_KERNEL); @@ -1439,7 +1475,7 @@ static const struct iommu_ops riscv_iommu_ops =3D { .identity_domain =3D &riscv_iommu_identity_domain, 
.blocked_domain =3D &riscv_iommu_blocking_domain, .release_domain =3D &riscv_iommu_blocking_domain, - .domain_alloc_paging =3D riscv_iommu_alloc_paging_domain, + .domain_alloc_paging_flags =3D riscv_iommu_domain_alloc_paging_flags, .device_group =3D riscv_iommu_device_group, .probe_device =3D riscv_iommu_probe_device, .release_device =3D riscv_iommu_release_device, --=20 2.50.1 From nobody Wed Jun 17 02:50:55 2026 Received: from out30-99.freemail.mail.aliyun.com (out30-99.freemail.mail.aliyun.com [115.124.30.99]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 898AA477984; Tue, 28 Apr 2026 13:14:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.99 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382076; cv=none; b=qk5/V1UWTVey9FpJMMnEZx+Q0kwAMuPuiGI2dlDSKVmpi0LnoSrGGTWwxouPx432+etlX+wfofI1r0ePjrit/95PlU5zwhNzhL1eZldBuafFNlNd9NV5JX4CBTG2UEHx6GcpmkBvu+aG3oOnve83IXoM++TtjsrbBScJclFwocs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382076; c=relaxed/simple; bh=hOS96Apq9JzCXIC4USAb423oCppmaEpcM2qpGyF2GP0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=YYbgksqrRU/NDQwGbeNpkgi/vvj7sZiEIsorx9xsJe2MGAHY38jM8kdUiii9MOe0jGhyU0PQRgs6bVPS+OdYoRivfygqyCL3hkr1qGJUC/tlMPSTnFrTWUGN1+QThyV2Q4D2XFDqwnBg+JZtqon/FYbZyu0F/DxI49VqWSngFL8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=JlKJgXYZ; arc=none smtp.client-ip=115.124.30.99 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="JlKJgXYZ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382067; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=d/CJi0j31iQURubrtaBsC+TOdWT/tV2UED1jlPNtsn8=; b=JlKJgXYZvrCr2OfhSa26rx9wnKDQg78stV+5Bykno9GcqPw3elXt5luwquqErqhj9ZW72hpqWkKG8aRnBNWV4iuLmVde8cLXjchTokg3YQM43PelO7wWQ0XAX5A175UQr35T+AQfbliKH3Rt7qYgEUkXEk2GYPvoHrcV/YXyafI= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R141e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037026112;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=23;SR=0;TI=SMTPD_---0X1ubIGF_1777382064; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIGF_1777382064 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:25 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Fangyu Yu Subject: [RFC PATCH 07/11] iommupt: Don't preset D when RISC-V IOMMU dirty tracking on Date: Tue, 28 Apr 2026 21:13:55 +0800 Message-Id: <20260428131359.34872-8-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable 
Content-Type: text/plain; charset="utf-8" From: Fangyu Yu When mapping writable pages, the RISC-V format code currently pre-sets the PTE D bit unconditionally. If hardware dirty tracking is active (DC.tc.GADE set), the IOMMU sets D autonomously on the first write. Pre-setting D makes every new mapping appear dirty immediately and breaks dirty tracking. Introduce PT_FEAT_RISCV_DIRTY_TRACKING_ACTIVE and, when set, leave D cleared for new writable mappings so hardware can capture the first write. Keep pre-setting D when dirty tracking is inactive. The feature is only meaningful for second-stage (iohgatp) page tables. Signed-off-by: Fangyu Yu --- drivers/iommu/generic_pt/fmt/riscv.h | 13 +++++++++++-- include/linux/generic_pt/common.h | 8 ++++++++ 2 files changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/generic_pt/fmt/riscv.h b/drivers/iommu/generic_p= t/fmt/riscv.h index 4fe645e60375..0281356cfaf6 100644 --- a/drivers/iommu/generic_pt/fmt/riscv.h +++ b/drivers/iommu/generic_pt/fmt/riscv.h @@ -248,8 +248,17 @@ static inline int riscvpt_iommu_set_prot(struct pt_com= mon *common, u64 pte; =20 pte =3D RISCVPT_A | RISCVPT_U; - if (iommu_prot & IOMMU_WRITE) - pte |=3D RISCVPT_W | RISCVPT_R | RISCVPT_D; + if (iommu_prot & IOMMU_WRITE) { + pte |=3D RISCVPT_W | RISCVPT_R; + /* + * When hardware dirty tracking is active (GADE set), the IOMMU + * sets the D bit autonomously on the first write access. + * Pre-set D only when dirty tracking is inactive. + */ + if (!(common->features & + BIT(PT_FEAT_RISCV_DIRTY_TRACKING_ACTIVE))) + pte |=3D RISCVPT_D; + } if (iommu_prot & IOMMU_READ) pte |=3D RISCVPT_R; if (!(iommu_prot & IOMMU_NOEXEC)) diff --git a/include/linux/generic_pt/common.h b/include/linux/generic_pt/c= ommon.h index e82dff33ece8..4606c7464c27 100644 --- a/include/linux/generic_pt/common.h +++ b/include/linux/generic_pt/common.h @@ -193,6 +193,14 @@ enum { * Support the 64k contiguous page size following the Svnapot extension.
*/ PT_FEAT_RISCV_SVNAPOT_64K =3D PT_FEAT_FMT_START, + /* + * Hardware dirty tracking is currently active: DC.tc.GADE is set and + * the IOMMU will set the D bit in PTEs autonomously on write access. + * When this flag is set, new mappings must not pre-set the D bit so + * that every write is correctly captured by hardware. + * Only meaningful for second-stage (iohgatp) page tables. + */ + PT_FEAT_RISCV_DIRTY_TRACKING_ACTIVE, =20 }; =20 --=20 2.50.1 From nobody Wed Jun 17 02:50:55 2026 Received: from out30-99.freemail.mail.aliyun.com (out30-99.freemail.mail.aliyun.com [115.124.30.99]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 08AD9478871; Tue, 28 Apr 2026 13:14:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.99 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382081; cv=none; b=nlS2iwpB26XHjapzqSolKcu7t445uyE8vbzD1sHAiexoOsjrMGw7RcO2O0WAwkPXtvNvV0Jzl/47VRYxrWq0YphM67eBdBZD1kZMRcJn4a6VxY0tru+hHp38yKC/XdwquL8jWYD3c5Rz9Y0u4lfIRtB5KQI0QNU1dx0C+aPTm1A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382081; c=relaxed/simple; bh=LB3Lz/w6tAwr1A9br7Z481rveDfNzKHq5xoD3VRRSwM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=i9UQ3nCn5XoJE4z3Dc1thEYQsQrMUQQ0iCVLHyDgnfawVbEctaCFzwmbM4fO+lxuqemVU5Y4D+v96AYssy+B+Z1fHJnxK1LC9tky8+hO8JDPqTFpG6Z1B4jqJAN+R7Qrma/8KX/VE09zs+09SRjN8sOz1rtXmCEWRwYJxackeR8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=BzLsBhvk; arc=none smtp.client-ip=115.124.30.99 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: 
smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="BzLsBhvk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382070; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=wDviUyznucLdHZjKk0KKSvorH0J1X46CC4BtnY6yTUA=; b=BzLsBhvkTI4eKPF2a1hfzE97pPocHBu9Mqt9xssvW+Goonc4ui7un2PLmpPKn3kCEYGKnohYaKiF2z7GmxH/ekozTNhvLvBHSU5DzfQJv6wnL9UN58BvzmVhcxM5sM5bkYMLYEkngfW3DuN2X6fWin2gpAHblULJFXZW46AjoC4= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R121e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045098064;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=23;SR=0;TI=SMTPD_---0X1ubIGy_1777382066; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIGy_1777382066 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:27 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Fangyu Yu Subject: [RFC PATCH 08/11] iommu/riscv: Add dirty tracking support for second-stage domains Date: Tue, 28 Apr 2026 21:13:56 +0800 Message-Id: <20260428131359.34872-9-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: 
List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Fangyu Yu Add hardware dirty tracking support for second-stage (iohgatp) domains used in KVM VFIO device pass-through. The RISC-V IOMMU can automatically set the dirty bit in PTEs on write access when DC.tc.GADE is set and the hardware has AMO_HWAD capability. Wire this up to the iommufd dirty tracking interface: - riscv_iommu_set_dirty_tracking(): Walks all bonds of the domain and sets or clears DC.tc.GADE in each device context entry. - riscv_iommu_dirty_ops: Exposes set_dirty_tracking and the generic page-table read_and_clear_dirty via IOMMU_PT_DIRTY_OPS(riscv_64). - domain_alloc_paging_flags: Assigns dirty_ops to second-stage domains when AMO_HWAD is advertised in hardware capabilities. - riscv_iommu_capable: Reports IOMMU_CAP_DIRTY_TRACKING when AMO_HWAD is present. Signed-off-by: Fangyu Yu --- drivers/iommu/riscv/iommu.c | 84 +++++++++++++++++++++++++++++++++++++ 1 file changed, 84 insertions(+) diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c index 0c13430ecc7f..1f7967074492 100644 --- a/drivers/iommu/riscv/iommu.c +++ b/drivers/iommu/riscv/iommu.c @@ -1247,6 +1247,84 @@ static int riscv_iommu_attach_paging_domain(struct i= ommu_domain *iommu_domain, return 0; } =20 +/* + * Enable or disable hardware A/D bit updates (GADE) in the device context= for + * all devices attached to a second-stage domain. When dirty tracking is + * enabled the IOMMU hardware will set the dirty bit in PTEs on write acce= ss, + * making them visible to read_and_clear_dirty(). 
+ */ +static int riscv_iommu_set_dirty_tracking(struct iommu_domain *iommu_domai= n, + bool enable) +{ + struct riscv_iommu_domain *domain =3D iommu_domain_to_riscv(iommu_domain); + struct riscv_iommu_bond *bond; + struct riscv_iommu_device *iommu, *prev; + struct riscv_iommu_dc *dc; + struct iommu_fwspec *fwspec; + struct riscv_iommu_command cmd; + u64 tc; + int i; + + rcu_read_lock(); + + list_for_each_entry_rcu(bond, &domain->bonds, list) { + iommu =3D dev_to_iommu(bond->dev); + fwspec =3D dev_iommu_fwspec_get(bond->dev); + + for (i =3D 0; i < fwspec->num_ids; i++) { + dc =3D riscv_iommu_get_dc(iommu, fwspec->ids[i]); + tc =3D READ_ONCE(dc->tc); + if (!(tc & RISCV_IOMMU_DC_TC_V)) + continue; + + if (enable) + tc |=3D RISCV_IOMMU_DC_TC_GADE; + else + tc &=3D ~RISCV_IOMMU_DC_TC_GADE; + WRITE_ONCE(dc->tc, tc); + + /* Invalidate cached device context entry */ + riscv_iommu_cmd_iodir_inval_ddt(&cmd); + riscv_iommu_cmd_iodir_set_did(&cmd, fwspec->ids[i]); + riscv_iommu_cmd_send(iommu, &cmd); + riscv_iommu_iodir_iotinval(iommu, false, dc->iohgatp, dc, NULL); + } + } + + prev =3D NULL; + list_for_each_entry_rcu(bond, &domain->bonds, list) { + iommu =3D dev_to_iommu(bond->dev); + if (iommu =3D=3D prev) + continue; + + riscv_iommu_cmd_sync(iommu, RISCV_IOMMU_IOTINVAL_TIMEOUT); + prev =3D iommu; + } + + rcu_read_unlock(); + + /* + * Reflect the active dirty-tracking state in the page table feature + * flags. When active, riscvpt_iommu_set_prot() will leave D=3D0 in + * new mappings so that the hardware can set it on the first write, + * providing accurate per-page dirty information. When inactive, + * new mappings get D=3D1 to avoid write faults on a D=3D0 PTE. 
+ */ + if (enable) + domain->riscvpt.riscv_64pt.common.features |=3D + BIT(PT_FEAT_RISCV_DIRTY_TRACKING_ACTIVE); + else + domain->riscvpt.riscv_64pt.common.features &=3D + ~BIT(PT_FEAT_RISCV_DIRTY_TRACKING_ACTIVE); + + return 0; +} + +static const struct iommu_dirty_ops riscv_iommu_dirty_ops =3D { + IOMMU_PT_DIRTY_OPS(riscv_64), + .set_dirty_tracking =3D riscv_iommu_set_dirty_tracking, +}; + static const struct iommu_domain_ops riscv_iommu_paging_domain_ops =3D { IOMMU_PT_DOMAIN_OPS(riscv_64), .attach_dev =3D riscv_iommu_attach_paging_domain, @@ -1325,6 +1403,8 @@ static struct iommu_domain *riscv_iommu_domain_alloc_= paging_flags( riscv_iommu_free_paging_domain(&domain->domain); return ERR_PTR(-ENOMEM); } + if (iommu->caps & RISCV_IOMMU_CAPABILITIES_AMO_HWAD) + domain->domain.dirty_ops =3D &riscv_iommu_dirty_ops; } else { domain->pscid =3D ida_alloc_range(&riscv_iommu_pscids, 1, RISCV_IOMMU_MAX_PSCID, GFP_KERNEL); @@ -1401,10 +1481,14 @@ static struct iommu_group *riscv_iommu_device_group= (struct device *dev) =20 static bool riscv_iommu_capable(struct device *dev, enum iommu_cap cap) { + struct riscv_iommu_device *iommu =3D dev_to_iommu(dev); + switch (cap) { case IOMMU_CAP_CACHE_COHERENCY: case IOMMU_CAP_DEFERRED_FLUSH: return true; + case IOMMU_CAP_DIRTY_TRACKING: + return !!(iommu->caps & RISCV_IOMMU_CAPABILITIES_AMO_HWAD); default: return false; } --=20 2.50.1 From nobody Wed Jun 17 02:50:55 2026 Received: from out30-133.freemail.mail.aliyun.com (out30-133.freemail.mail.aliyun.com [115.124.30.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DCABE47798F; Tue, 28 Apr 2026 13:14:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382077; cv=none; 
b=OYAeE7l/EEdRGEv/JO42PV8OwSmrRaD6ZB/YbaE9/fTGeTC8xE2IH1JeaHuma9lV025zQf2JHoa77o6ljbXIzlP7lOzzahGYACYjq9hE1AnoL6ql1dKmQZhs2Xgd4+HJ+PCJX/mS5sYElxrdXB1mzp8RlRsR8fzrr/Wekav6tuE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382077; c=relaxed/simple; bh=4xKIRapWnT2z6xwd1ZKn2asQzFjw891Qxs3StuBXga4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=h8NKQlgV4MDR3+z8GfH68dzSyNFQw6AYF0FvLPy9fPWtlHP4QsO51SnWkgVh0JbvqgOcN1w5YeH+ImsZaAWo54LTVAPPfo85NKGmC7souDE/gS74NkWdnbhRTId/sKamb76f3bdb+2qfN+gcIVqvohdWszq9/4v7WSx6a6yRKhI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=ZpfJ78H9; arc=none smtp.client-ip=115.124.30.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="ZpfJ78H9" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382072; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=7Qo2K+aVXskZxtjU2vtmdryTxJmRp8P78C8zyT6CDZ8=; b=ZpfJ78H9Ye9JoLqiUKCiRlIZ2NZhGPwX+BwTqQJfvmeU9oM35GGHl8HQbxIGJy/Sn2XRrJhRutJTmyIRCpICgTcy+RiF+BfuwDhaoKOweJhFKEHdCID5IuAptsFUTLeYhuh67yzlzEEpyA93U7czrJ/T/oULKHB3/mKyNh9FX2Y= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R161e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037033178;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=23;SR=0;TI=SMTPD_---0X1ubIHL_1777382068; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIHL_1777382068 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 
21:14:29 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Fangyu Yu Subject: [RFC PATCH 09/11] iommu/riscv: Add IOTINVAL.GVMA after updating DDT/PDT entries Date: Tue, 28 Apr 2026 21:13:57 +0800 Message-Id: <20260428131359.34872-10-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Fangyu Yu Previously, only IOTINVAL.VMA was issued, which is insufficient for second-stage address translation consistency. Signed-off-by: Fangyu Yu --- drivers/iommu/riscv/iommu.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c index 1f7967074492..cb9d315e82ee 100644 --- a/drivers/iommu/riscv/iommu.c +++ b/drivers/iommu/riscv/iommu.c @@ -1065,12 +1065,15 @@ static void riscv_iommu_iodir_iotinval(struct riscv= _iommu_device *iommu, /* * else: IOTINVAL.VMA with GV=3D1,AV=3DPSCV=3D0,and * GSCID=3DDC.iohgatp.GSCID - * + */ + riscv_iommu_cmd_send(iommu, &cmd); + /* * IOTINVAL.GVMA with GV=3D1,AV=3D0,and * GSCID=3DDC.iohgatp.GSCID - * TODO: For now, the Second-Stage feature have not yet been merged, - * also issue IOTINVAL.GVMA once second-stage support is merged. 
*/ + riscv_iommu_cmd_inval_gvma(&cmd); + riscv_iommu_cmd_inval_set_gscid(&cmd, + FIELD_GET(RISCV_IOMMU_DC_IOHGATP_GSCID, iohgatp)); } riscv_iommu_cmd_send(iommu, &cmd); } --=20 2.50.1 From nobody Wed Jun 17 02:50:55 2026 Received: from out30-113.freemail.mail.aliyun.com (out30-113.freemail.mail.aliyun.com [115.124.30.113]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5661A4779A1; Tue, 28 Apr 2026 13:14:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.113 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382077; cv=none; b=eIhMvc/BIzhqubtqIPJUDgoJgOjmPi3lygJKmB8qy1kpq1TNo56qveherYJJnH8LN4fh/b5SMb1lG+5xtQ4zahhmViXyX5SP4lLy31gnZCs6hrLg07yR+ntmjmnLssows9368GSfM5E2ZPgkjvvBxKxb/1JBK7BlmAkGR+0asoM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382077; c=relaxed/simple; bh=CYrKDGIxXX5Mlt9NTR7wFUenmnP6V8xyIg08eBd1CxA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=G1Nn5q08rhrc3LQ/SjfNqRggEGOWqH+B4jh6/3bsm6uZOxhLNLekFW+lfCQdI+wM16gBAgK5TUQ8dj+9esSQsnZALNEKylBIoT2i+BLhfijwlCwtEP4umzyBCbZO/sflXJdSBEecdiiCsLXGPOAhgitYrq8bxeJYsYP34OeMyDM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=H1v+lgsP; arc=none smtp.client-ip=115.124.30.113 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="H1v+lgsP" DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382073; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=RbONRCU5tMlRlZ5V3hZOSCmiwWT78tBKTzkHzfkMhIE=; b=H1v+lgsPDUdCJn4OQNalwDpL8eS4qETfadnD6dmqLbd0ZZsj6JVjH2VvATqbF//Vdbpj7q0jlkx8TnRDJHsOm/AdaExSXQL9JezneS5bMot0hr46MUqgKxjiBP+6nBA0hHFK8GynHM2xcuc36aVdA6WT35Audmzke7Gf09cUQ4s= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R951e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045098064;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=23;SR=0;TI=SMTPD_---0X1ubII2_1777382070; Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubII2_1777382070 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:31 +0800 From: fangyu.yu@linux.alibaba.com To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Fangyu Yu Subject: [RFC PATCH 10/11] iommupt: Add RISC-V dirty tracking PTE ops Date: Tue, 28 Apr 2026 21:13:58 +0800 Message-Id: <20260428131359.34872-11-fangyu.yu@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Fangyu Yu Implement the three dirty-tracking hooks required by the generic page table framework for the RISC-V format: pt_entry_is_write_dirty(): Check the D bit 
(bit 7) in the PTE.
pt_entry_make_write_clean(): Clear the D bit across the full contiguous range.
pt_entry_make_write_dirty(): Atomically set D via try_cmpxchg64() on a single PTE.

Signed-off-by: Fangyu Yu
---
 drivers/iommu/generic_pt/fmt/riscv.h | 43 ++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/drivers/iommu/generic_pt/fmt/riscv.h b/drivers/iommu/generic_pt/fmt/riscv.h
index 0281356cfaf6..44b87e70f029 100644
--- a/drivers/iommu/generic_pt/fmt/riscv.h
+++ b/drivers/iommu/generic_pt/fmt/riscv.h
@@ -222,6 +222,49 @@ static inline void riscvpt_attr_from_entry(const struct pt_state *pts,
 }
 #define pt_attr_from_entry riscvpt_attr_from_entry
 
+/*
+ * Dirty tracking: RISC-V PTEs use D (bit 7) as the hardware dirty bit.
+ * When Svnapot 64K is active a leaf entry spans 16 consecutive PTEs; we
+ * must check / clear all of them so that no dirty indication is lost.
+ */
+static inline bool riscvpt_entry_is_write_dirty(const struct pt_state *pts)
+{
+	unsigned int num_contig_lg2 = riscvpt_entry_num_contig_lg2(pts);
+	const pt_riscv_entry_t *tablep =
+		pt_cur_table(pts, pt_riscv_entry_t) +
+		log2_set_mod(pts->index, 0, num_contig_lg2);
+	const pt_riscv_entry_t *end = tablep + log2_to_int(num_contig_lg2);
+
+	for (; tablep != end; tablep++)
+		if (READ_ONCE(*tablep) & RISCVPT_D)
+			return true;
+	return false;
+}
+#define pt_entry_is_write_dirty riscvpt_entry_is_write_dirty
+
+static inline void riscvpt_entry_make_write_clean(struct pt_state *pts)
+{
+	unsigned int num_contig_lg2 = riscvpt_entry_num_contig_lg2(pts);
+	pt_riscv_entry_t *tablep =
+		pt_cur_table(pts, pt_riscv_entry_t) +
+		log2_set_mod(pts->index, 0, num_contig_lg2);
+	pt_riscv_entry_t *end = tablep + log2_to_int(num_contig_lg2);
+
+	for (; tablep != end; tablep++)
+		WRITE_ONCE(*tablep, READ_ONCE(*tablep) & ~(pt_riscv_entry_t)RISCVPT_D);
+}
+#define pt_entry_make_write_clean riscvpt_entry_make_write_clean
+
+static inline bool riscvpt_entry_make_write_dirty(struct
pt_state *pts)
+{
+	pt_riscv_entry_t *tablep =
+		pt_cur_table(pts, pt_riscv_entry_t) + pts->index;
+	pt_riscv_entry_t new = pts->entry | RISCVPT_D;
+
+	return try_cmpxchg64(tablep, &pts->entry, new);
+}
+#define pt_entry_make_write_dirty riscvpt_entry_make_write_dirty
+
 /* --- iommu */
 #include
 #include
-- 
2.50.1

From nobody Wed Jun 17 02:50:55 2026
Received: from out30-131.freemail.mail.aliyun.com (out30-131.freemail.mail.aliyun.com [115.124.30.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 46DF0478E56; Tue, 28 Apr 2026 13:14:39 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.131
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382082; cv=none; b=udbDO+A2/NhIi543qbIyXk0bJfbRfO5CtgYvifJfdSI826G8IxKaHTQuBCEVOhhDNGRPt/kWZxRFR2FjFoWcn/d5fiqSM3GZgWvzxSK5FlRzik058mP1rpyasYTDJwhSrQDYSRvq8Gm1wZzdqwQp2rlJiZwEdGoauDLEyQRMyKo=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777382082; c=relaxed/simple; bh=qEbL7NazfstBPbALI9WwA7iUCsNKl7ldy/g2UhgQkbQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Z0PLojV1ZzGcDGhDx9Q25VUxmCjnsa98TB7Amz2F/mmSoZtXxYfv6iQ49sc7JPIAFDw3iw+1t4UrgIM8H+ewQ0g38MPb8/rmmyK5NZytA/m1/JtdeP5Oq+Da5ToFP+dXezrQLhnHJlStF17zwPZAatbOo5deW1BDLqFmGyOOf24=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=H8PpUuUD; arc=none smtp.client-ip=115.124.30.131
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass
(1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="H8PpUuUD"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1777382077; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=GTaXyo6u6wVsR7hfJRYDUBPmXTghSbJ0+BI4PVwt11c=; b=H8PpUuUDWrG2qxO4FDj3a0AJZ+pEyFbh8hPyeFL4A1qhuVBg4gykoZvlpGnODtD+F+hi8sE1m0DBiKCSwIf2WmTYm1957gex4h2dUTvx+0BvYYaHxLGdEAhFehrMN+wbtDSCma5EUw0EMRSFUP44oh3HcoTQUYbo4GVcm/7WAdU=
X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R191e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037033178;MF=fangyu.yu@linux.alibaba.com;NM=1;PH=DS;RN=24;SR=0;TI=SMTPD_---0X1ubIIu_1777382072;
Received: from localhost.localdomain(mailfrom:fangyu.yu@linux.alibaba.com fp:SMTPD_---0X1ubIIu_1777382072 cluster:ay36) by smtp.aliyun-inc.com; Tue, 28 Apr 2026 21:14:33 +0800
From: fangyu.yu@linux.alibaba.com
To: joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr, tjeznach@rivosinc.com, jgg@ziepe.ca, kevin.tian@intel.com, baolu.lu@linux.intel.com, vasant.hegde@amd.com, anup@brainfault.org, atish.patra@linux.dev, skhawaja@google.com, jgg@nvidia.com
Cc: guoren@kernel.org, kvm@vger.kernel.org, iommu@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Zong Li, Fangyu Yu
Subject: [RFC PATCH 11/11] iommu/riscv: support nested iommu for getting iommu hardware information
Date: Tue, 28 Apr 2026 21:13:59 +0800
Message-Id: <20260428131359.34872-12-fangyu.yu@linux.alibaba.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com>
References: <20260428131359.34872-1-fangyu.yu@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Zong Li

Implement the .hw_info operation and the related data structure for
passing the IOMMU hardware capabilities to iommufd.

Signed-off-by: Zong Li
Reviewed-by: Jason Gunthorpe
Signed-off-by: Fangyu Yu
---
 drivers/iommu/riscv/iommu.c  | 19 +++++++++++++++++++
 include/uapi/linux/iommufd.h | 18 ++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
index cb9d315e82ee..9abf446e1b85 100644
--- a/drivers/iommu/riscv/iommu.c
+++ b/drivers/iommu/riscv/iommu.c
@@ -1556,8 +1556,27 @@ static void riscv_iommu_release_device(struct device *dev)
 	kfree_rcu_mightsleep(info);
 }
 
+static void *riscv_iommu_hw_info(struct device *dev, u32 *length, u32 *type)
+{
+	struct riscv_iommu_device *iommu = dev_to_iommu(dev);
+	struct iommu_hw_info_riscv_iommu *info;
+
+	info = kzalloc_obj(*info);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	info->capability = iommu->caps;
+	info->fctl = riscv_iommu_readl(iommu, RISCV_IOMMU_REG_FCTL);
+
+	*length = sizeof(*info);
+	*type = IOMMU_HW_INFO_TYPE_RISCV_IOMMU;
+
+	return info;
+}
+
 static const struct iommu_ops riscv_iommu_ops = {
 	.of_xlate = riscv_iommu_of_xlate,
+	.hw_info = riscv_iommu_hw_info,
 	.capable = riscv_iommu_capable,
 	.identity_domain = &riscv_iommu_identity_domain,
 	.blocked_domain = &riscv_iommu_blocking_domain,
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index e998dfbd6960..79d3dc5e8d19 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -660,6 +660,22 @@ struct iommu_hw_info_amd {
 	__aligned_u64 efr2;
 };
 
+/**
+ * struct iommu_hw_info_riscv_iommu - RISC-V IOMMU hardware information
+ *
+ * @capability: Value of RISC-V IOMMU capability register defined in
+ *              RISC-V IOMMU spec section 5.3 IOMMU capabilities
+ * @fctl: Value of RISC-V IOMMU feature control register defined in
+ *        RISC-V IOMMU spec section 5.4 Features-control register
+ *
+ * Don't advertise ATS
support to the guest because the driver doesn't support it.
+ */
+struct iommu_hw_info_riscv_iommu {
+	__aligned_u64 capability;
+	__u32 fctl;
+	__u32 __reserved;
+};
+
 /**
  * enum iommu_hw_info_type - IOMMU Hardware Info Types
  * @IOMMU_HW_INFO_TYPE_NONE: Output by the drivers that do not report hardware
@@ -670,6 +686,7 @@ struct iommu_hw_info_amd {
  * @IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV: NVIDIA Tegra241 CMDQV (extension for ARM
  *                                     SMMUv3) info type
  * @IOMMU_HW_INFO_TYPE_AMD: AMD IOMMU info type
+ * @IOMMU_HW_INFO_TYPE_RISCV_IOMMU: RISC-V IOMMU info type
  */
 enum iommu_hw_info_type {
 	IOMMU_HW_INFO_TYPE_NONE = 0,
@@ -678,6 +695,7 @@ enum iommu_hw_info_type {
 	IOMMU_HW_INFO_TYPE_ARM_SMMUV3 = 2,
 	IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV = 3,
 	IOMMU_HW_INFO_TYPE_AMD = 4,
+	IOMMU_HW_INFO_TYPE_RISCV_IOMMU = 5,
 };
 
 /**
-- 
2.50.1
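
Illustration only, for readers of the hw_info layout in patch 11/11: a
minimal standalone sketch, assuming a local mirror of the proposed
structure. The names demo_hw_info_riscv, DEMO_CAP_ATS and
demo_caps_without_ats() are hypothetical and do not appear in the series.
The ATS position (bit 25 of the capabilities register) is an assumption
taken from the driver's existing capability definitions and should be
checked against the RISC-V IOMMU specification. The sketch shows the
16-byte layout and one way a capability value could have ATS hidden, as the
kernel-doc comment describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the proposed struct iommu_hw_info_riscv_iommu. */
struct demo_hw_info_riscv {
	uint64_t capability;	/* value of the capabilities register */
	uint32_t fctl;		/* value of the features-control register */
	uint32_t reserved;	/* explicit padding, expected to be zero */
};

/* Assumption: ATS is reported in bit 25 of the capabilities register. */
#define DEMO_CAP_ATS (UINT64_C(1) << 25)

/* The explicit padding keeps the layout at 16 bytes with fctl at offset 8. */
static_assert(sizeof(struct demo_hw_info_riscv) == 16, "unexpected size");
static_assert(offsetof(struct demo_hw_info_riscv, fctl) == 8, "unexpected offset");

/* Return the capability value with the (assumed) ATS bit cleared. */
static inline uint64_t demo_caps_without_ats(uint64_t caps)
{
	return caps & ~DEMO_CAP_ATS;
}
```

A caller would pass the raw capability value through
demo_caps_without_ats() before handing it to a guest. Whether that masking
belongs in the kernel or in the VMM is a design choice this sketch does not
settle.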