From nobody Thu Jan 8 12:34:33 2026 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass(p=none dis=none) header.from=gmail.com ARC-Seal: i=1; a=rsa-sha256; t=1766421503; cv=none; d=zohomail.com; s=zohoarc; b=JDnyahWVP1Yfm3E/QMDk2/NCRLxPo/H2O/MICdC4d0LeMyO/00lr7OpoV24XyxajpJCObe6MdbrQRpkaOnF3rlzaq/HHthJPNEXI0bDNA5HbbW33uD9lHm8wnlIH3+IcWSle+hxHLVs/f90h8oFfAgNTjLVdpVDo0glIUZ3eH30= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1766421503; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=K1roM12Mw39RifSf3h2tzbDV94jY317E0NvXspPdXDM=; b=kB1oXq/ZdIIPGUVZcl3cRTIRoSS8FMDvkVD3pgwxjHDrOqYzlqnF3P4LP+5wCDK1qe75qzaZNIXqOMisOoBDW8bktpZMHgwLF6rz6JhVXRkogGHk+Nx+qURpxw2i3a0ryB61a/F9N0L667cg+d+C9R6NRPw3kqMiLgecJr24Mvc= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 1766421503610374.2227518006076; Mon, 22 Dec 2025 
08:38:23 -0800 (PST) Received: from list by lists.xenproject.org with outflank-mailman.1192169.1511509 (Exim 4.92) (envelope-from ) id 1vXiv7-0001Wd-B4; Mon, 22 Dec 2025 16:38:05 +0000 Received: by outflank-mailman (output) from mailman id 1192169.1511509; Mon, 22 Dec 2025 16:38:05 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1vXiv7-0001WS-7Z; Mon, 22 Dec 2025 16:38:05 +0000 Received: by outflank-mailman (input) for mailman id 1192169; Mon, 22 Dec 2025 16:38:03 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1vXiv5-00085h-KW for xen-devel@lists.xenproject.org; Mon, 22 Dec 2025 16:38:03 +0000 Received: from mail-ed1-x532.google.com (mail-ed1-x532.google.com [2a00:1450:4864:20::532]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 94ebe4c7-df54-11f0-9cce-f158ae23cfc8; Mon, 22 Dec 2025 17:38:01 +0100 (CET) Received: by mail-ed1-x532.google.com with SMTP id 4fb4d7f45d1cf-64ba74e6892so4587400a12.2 for ; Mon, 22 Dec 2025 08:38:01 -0800 (PST) Received: from fedora (user-109-243-71-38.play-internet.pl. 
[109.243.71.38]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b8037de004fsm1149128166b.45.2025.12.22.08.37.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Dec 2025 08:38:00 -0800 (PST) X-Outflank-Mailman: Message body and most headers restored to incoming version X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 94ebe4c7-df54-11f0-9cce-f158ae23cfc8 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1766421481; x=1767026281; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=K1roM12Mw39RifSf3h2tzbDV94jY317E0NvXspPdXDM=; b=Ei+19PkuGnWiCN9JLPuSQ8p34702XDYS3jQI52OnQ1PO7GfWQHyjRyuAIYwGssel41 wfoW47wXkwzaNsTcsYZcZBpP4bbP5jNLF59Hwh0MRGrd7wns2TFNSLHBmo0xyE2h2NXk vpvzO/aEcePXtKAGK+bVt3v6unWr5/Cl5eZAX0HTG+aPZh5+2u+mLHIybWmaNCU5fa/Y NNCJUKBTTuyj3frdMQ9NhsoDa43b3oSlLNwNZ54J/22WjXBoPU8g/MmpMd8F/m+5RboO JJSu4oeEYDl6WcHdvvkG5qN1GAotSOZzShQvCDCdPK4kbjPSq0z1eQBSsriDYGql+1jk mH2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1766421481; x=1767026281; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=K1roM12Mw39RifSf3h2tzbDV94jY317E0NvXspPdXDM=; b=uU7MklnY5E9zULx/nYmQ3lE7JXzF387x7PFSzqD+zEiq2rQTSqKIORbwqW7cL3aFYS Xg+fH4Q3Erm3IwneriVCVDm0l8B491RNib3LKXsBZEJWafV4I5Urqb4o3+a0F8Q4nF4t iicZIlpFqff1CBHJm2yI+u9up/D981mJb3suUMiGIGiKU8XodUiyljhZn0Tqy3ThDTrj XB5mJoetrEFP2AhzOi5O2U+GwKR96DIcz37mDkI2kjwWX50TSE7kAN+SBzS1RM2HQSbg /7eEoMI6dV7OfthPY/qLR8buB3qmBY32Jkm5pMe5+NNTNefBftgh522qBA6ck8T94dtk OZQg== X-Gm-Message-State: 
AOJu0YzoYs/aDammFe9UuGL9IgLMiYbHU0Okr+Jm81AhmlztzgINohBX x7HWxqQ2OD8l7pQ/YzQaW6JieyvKac3GBh5qR04ejjx/eYzJcTfw7tNAHZ+oIg== X-Gm-Gg: AY/fxX65zQaFgrgg2ZKcp239Iqu0OL4r4SdL5ij3Iemx5ANo5vyu4c+onGUDHNcPPbw yeifFESO9jhgzT5D4ibdZ1NsrshlUhmtHqpV9xylxIu2v1GvMN18Mf+WJqqobmHwmq2axdtt45Z dCmnwORkMM8ItQwtSOlK0y6H0QPzmoPgrTD2NejPsj6F8oRvkKs9NPzb1u1OaMp/GbHZrbXxGAJ /5ZoktuTC2s/tdDGLl6KW/C4KLDqkTK19yAU1vznbQq0PEaKLAIQ7oD2O5eTQjVnIKZ3kqWSCb/ xcYCNQSwcbGyrcCQIV7TPYKRVeK4ITddiJ9cPQMAWxaHA2TRv9CrqcDRrRe6lILMS2Nn1/H3FuN pkwb8AuH7oPfll4lW5Xf1grt6QyXWjHlpp0SCfzb9cyOaAdrxbAjQDRp7zj9Nwg9Gh8EgwJngcm cD391JpB0Evb8ak6NWC2brfSuCNQ4Eu2cfhv31XVNRcnKzSEGcKlNAGGQ= X-Google-Smtp-Source: AGHT+IEtMrSq3gUuUEY2z045LwPXlSjaMt9lt7KSp0MWAxGMXSryGzCzI3aKm15gi35ZNFwbXJ4o9A== X-Received: by 2002:a17:907:1caa:b0:b77:1b03:66a1 with SMTP id a640c23a62f3a-b80371756c1mr1329058666b.41.1766421480634; Mon, 22 Dec 2025 08:38:00 -0800 (PST) From: Oleksii Kurochko To: xen-devel@lists.xenproject.org Cc: Oleksii Kurochko , Alistair Francis , Bob Eshleman , Connor Davis , Andrew Cooper , Anthony PERARD , Michal Orzel , Jan Beulich , Julien Grall , =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= , Stefano Stabellini Subject: [PATCH v8 1/3] xen/riscv: add support of page lookup by GFN Date: Mon, 22 Dec 2025 17:37:47 +0100 Message-ID: <5d10efb00eebb35861135280dfee391d0c55cf0d.1766406895.git.oleksii.kurochko@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass (identity @gmail.com) X-ZM-MESSAGEID: 1766421505659158500 Content-Type: text/plain; charset="utf-8" Introduce helper functions for safely querying the P2M (physical-to-machine) mapping: - add p2m_read_lock(), p2m_read_unlock(), and p2m_is_locked() for managing P2M lock state. - Implement p2m_get_entry() to retrieve mapping details for a given GFN, including MFN, page order, and validity. 
- Introduce p2m_get_page_from_gfn() to convert a GFN into a page_info
  pointer, acquiring a reference to the page if valid.
- Introduce get_page().

Implementations are based on Arm's functions with some minor modifications:
- p2m_get_entry():
  - Reverse traversal of page tables, as RISC-V uses the opposite level
    numbering compared to Arm.
  - Removed the return of p2m_access_t from p2m_get_entry() since
    mem_access_settings is not introduced for RISC-V.
  - Updated BUILD_BUG_ON() to check using the level 0 mask, which
    corresponds to Arm's THIRD_MASK.
  - Replaced open-coded bit shifts with the BIT() macro.

Signed-off-by: Oleksii Kurochko
---
Changes in V8:
 - Drop the local variable masked_gfn inside check_outside_boundary() and
   fold the is_lower conditionals into the for loop.
 - Initialize the local variable level in p2m_get_entry() to the root level
   and drop the explicit assignment when the root page table wasn't found,
   as it now defaults to the root level.
 - Introduce gfn_limit_bits and use it to calculate the maximum GFN for the
   MMU second stage, and return the appropriate page_order when the GFN
   exceeds this limit.
---
Changes in V7:
 - Refactor check_outside_boundary().
 - Reword the comment above p2m_get_entry().
 - As at the moment no caller of p2m_get_entry() passes `t` as NULL, drop
   the "if ( t )" checks inside it to avoid dead code.
 - Add a check inside p2m_get_entry() that the requested gfn is correct.
 - Add an "if ( t )" check inside p2m_get_page_from_gfn() as there are going
   to be some callers with t = NULL.
---
Changes in V6:
 - Move if-condition with initialization up in p2m_get_page_from_gfn().
 - Pass p2mt to the call of p2m_get_entry() inside p2m_get_page_from_gfn()
   to avoid an issue when 't' is passed as NULL. With p2mt passed to
   p2m_get_entry() we will receive a proper type, and so the rest of the
   function will be able to continue using a proper type.
 - In check_outside_boundary(), in the case when is_lower == true, fill the
   bottom bits of masked_gfn with all 1s.
 - Update code of check_outside_boundary() to return the proper level in the
   case when `level` is equal to 0.
 - Add ASSERT(p2m) in check_outside_boundary() to be sure that p2m isn't
   NULL, as P2M_LEVEL_MASK() depends on the p2m value.
---
Changes in V5:
 - Use P2M_DECLARE_OFFSETS(), introduced in earlier patches, instead of
   DECLARE_OFFSETS().
 - Drop blank line before check_outside_boundary().
 - Use a more readable version of the if statements inside
   check_outside_boundary().
 - Accumulate mask in check_outside_boundary() instead of re-writing it for
   each page table level to have correct gfns for comparison.
 - Set argument `t` of p2m_get_entry() to p2m_invalid by default.
 - Drop checking of (rc == P2M_TABLE_MAP_NOMEM ) when
   p2m_next_level(...,false,...) is called.
 - Add ASSERT(mfn & (BIT(P2M_LEVEL_ORDER(level), UL) - 1)); in
   p2m_get_entry() to be sure that the received `mfn` has its lowest bits
   cleared.
 - Drop `valid` argument from p2m_get_entry(), it is not needed anymore.
 - Drop p2m_lookup(), use p2m_get_entry() explicitly inside
   p2m_get_page_from_gfn().
 - Update the commit message.
---
Changes in V4:
 - Update prototype of p2m_is_locked() to return bool and accept
   pointer-to-const.
 - Correct the comment above p2m_get_entry().
 - Drop the check "BUILD_BUG_ON(XEN_PT_LEVEL_MAP_MASK(0) != PAGE_MASK);"
   inside p2m_get_entry() as it is stale: it was needed to be sure that 4k
   page(s) are used on L3 (in Arm terms), which is true for RISC-V (if no
   special extensions are used). There was another reason for Arm to have it
   (and I copied it to RISC-V), but it isn't true for RISC-V (some details
   can be found in the response to the patch).
 - Style fixes.
 - Add explanatory comment what the loop inside "gfn is higher then the
   highest p2m mapping" does. Move this loop to a separate function,
   check_outside_boundary(), to cover both boundaries (lowest_mapped_gfn and
   max_mapped_gfn).
 - There is no need to allocate a page table as it is expected that
   p2m_get_entry() normally would be called after a corresponding
   p2m_set_entry() was called. So change 'true' to 'false' in the page table
   walking loop inside p2m_get_entry().
 - Correct handling of p2m_is_foreign case inside p2m_get_page_from_gfn().
 - Introduce and use P2M_LEVEL_MASK instead of XEN_PT_LEVEL_MASK as the
   latter doesn't take into account the two extra bits for the root table in
   case of P2M.
 - Drop stale item from "change in v3" - Add is_p2m_foreign() macro and
   connected stuff.
 - Add p2m_read_(un)lock().
---
Changes in V3:
 - Change struct domain *d argument of p2m_get_page_from_gfn() to
   struct p2m_domain.
 - Update the comment above p2m_get_entry().
 - s/_t/p2mt for local variable in p2m_get_entry().
 - Drop local variable addr in p2m_get_entry() and use gfn_to_gaddr(gfn)
   to define offsets array.
 - Code style fixes.
 - Update a check of rc code from p2m_next_level() in p2m_get_entry()
   and drop "else" case.
 - Do not call p2m_get_type() if p2m_get_entry()'s t argument is NULL.
 - Use struct p2m_domain instead of struct domain for p2m_lookup() and
   p2m_get_page_from_gfn().
 - Move definition of get_page() from "xen/riscv: implement mfn_valid() and
   page reference, ownership handling helpers"
---
Changes in V2:
 - New patch.
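Illustration only, not part of the patch: a minimal standalone sketch of how
the MFN for a GFN inside a superpage can be derived once the walk stops at a
given level, which is the adjustment p2m_get_entry() applies with mfn_add().
The names level_order() and superpage_mfn(), the 4 KiB page size, and the
"9 index bits per level" layout are assumptions made for this sketch; in the
patch the order comes from P2M_LEVEL_ORDER(level).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumptions for illustration only: 4 KiB pages and 9 index bits per
 * page-table level, so one entry at `level` covers 2^(9 * level) pages.
 * In the patch this role is played by P2M_LEVEL_ORDER(level).
 */
#define SKETCH_PAGETABLE_ORDER 9U

/* Page order covered by a single entry at the given level (0 = 4 KiB). */
static unsigned int level_order(unsigned int level)
{
    return level * SKETCH_PAGETABLE_ORDER;
}

/*
 * Given the MFN stored in a leaf entry found at `level`, return the MFN
 * backing `gfn`: the superpage base plus the GFN's offset within it.
 */
static uint64_t superpage_mfn(uint64_t base_mfn, uint64_t gfn,
                              unsigned int level)
{
    uint64_t mask = (UINT64_C(1) << level_order(level)) - 1;

    /* A leaf at this level must be aligned to its own size. */
    assert(!(base_mfn & mask));

    return base_mfn + (gfn & mask);
}
```

Under these assumptions, a level-1 leaf with base MFN 0x80000 maps GFN
0x12345 to MFN 0x80145 and the reported page order would be 9; a level-0
leaf returns the base MFN unchanged.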
--- xen/arch/riscv/include/asm/p2m.h | 21 ++++ xen/arch/riscv/mm.c | 13 +++ xen/arch/riscv/p2m.c | 185 +++++++++++++++++++++++++++++++ 3 files changed, 219 insertions(+) diff --git a/xen/arch/riscv/include/asm/p2m.h b/xen/arch/riscv/include/asm/= p2m.h index b48693a2b41c..f63b5dec99b1 100644 --- a/xen/arch/riscv/include/asm/p2m.h +++ b/xen/arch/riscv/include/asm/p2m.h @@ -41,6 +41,9 @@ =20 #define P2M_GFN_LEVEL_SHIFT(lvl) (P2M_LEVEL_ORDER(lvl) + PAGE_SHIFT) =20 +#define P2M_LEVEL_MASK(p2m, lvl) \ + (P2M_TABLE_OFFSET(p2m, lvl) << P2M_GFN_LEVEL_SHIFT(lvl)) + #define paddr_bits PADDR_BITS =20 /* Get host p2m table */ @@ -234,6 +237,24 @@ static inline bool p2m_is_write_locked(struct p2m_doma= in *p2m) =20 unsigned long construct_hgatp(const struct p2m_domain *p2m, uint16_t vmid); =20 +static inline void p2m_read_lock(struct p2m_domain *p2m) +{ + read_lock(&p2m->lock); +} + +static inline void p2m_read_unlock(struct p2m_domain *p2m) +{ + read_unlock(&p2m->lock); +} + +static inline bool p2m_is_locked(const struct p2m_domain *p2m) +{ + return rw_is_locked(&p2m->lock); +} + +struct page_info *p2m_get_page_from_gfn(struct p2m_domain *p2m, gfn_t gfn, + p2m_type_t *t); + #endif /* ASM__RISCV__P2M_H */ =20 /* diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c index e25f995b727f..e9ce182d066c 100644 --- a/xen/arch/riscv/mm.c +++ b/xen/arch/riscv/mm.c @@ -673,3 +673,16 @@ struct domain *page_get_owner_and_reference(struct pag= e_info *page) =20 return owner; } + +bool get_page(struct page_info *page, const struct domain *domain) +{ + const struct domain *owner =3D page_get_owner_and_reference(page); + + if ( likely(owner =3D=3D domain) ) + return true; + + if ( owner !=3D NULL ) + put_page(page); + + return false; +} diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c index 8d572f838fc3..66943b969e8a 100644 --- a/xen/arch/riscv/p2m.c +++ b/xen/arch/riscv/p2m.c @@ -1057,3 +1057,188 @@ int map_regions_p2mt(struct domain *d, =20 return rc; } + +/* + * p2m_get_entry() 
should always return the correct order value, even if an + * entry is not present (i.e. the GFN is outside the range): + * [p2m->lowest_mapped_gfn, p2m->max_mapped_gfn] (1) + * + * This ensures that callers of p2m_get_entry() can determine what range of + * address space would be altered by a corresponding p2m_set_entry(). + * Also, it would help to avoid costly page walks for GFNs outside range (= 1). + * + * Therefore, this function returns true for GFNs outside range (1), and in + * that case the corresponding level is returned via the level_out argumen= t. + * Otherwise, it returns false and p2m_get_entry() performs a page walk to + * find the proper entry. + */ +static bool check_outside_boundary(const struct p2m_domain *p2m, gfn_t gfn, + gfn_t boundary, bool is_lower, + unsigned int *level_out) +{ + unsigned int level =3D P2M_ROOT_LEVEL(p2m); + bool ret =3D false; + + ASSERT(p2m); + + if ( is_lower ? gfn_x(gfn) < gfn_x(boundary) + : gfn_x(gfn) > gfn_x(boundary) ) + { + for ( ; level; level-- ) + { + unsigned long mask =3D BIT(P2M_GFN_LEVEL_SHIFT(level), UL) - 1; + + if ( is_lower ? (gfn_x(gfn) | mask) < gfn_x(boundary) + : (gfn_x(gfn) & ~mask) > gfn_x(boundary) ) + break; + } + + ret =3D true; + } + + if ( level_out ) + *level_out =3D level; + + return ret; +} + +/* + * Get the details of a given gfn. + * + * If the entry is present, the associated MFN, the p2m type of the mappin= g, + * and the page order of the mapping in the page table (i.e., it could be a + * superpage) will be returned. + * + * If the entry is not present, INVALID_MFN will be returned, page_order w= ill + * be set according to the order of the invalid range, and the type will be + * p2m_invalid. 
+ */ +static mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn, + p2m_type_t *t, + unsigned int *page_order) +{ + unsigned int level =3D P2M_ROOT_LEVEL(p2m); + unsigned int gfn_limit_bits =3D + P2M_LEVEL_ORDER(level + 1) + P2M_ROOT_EXTRA_BITS(p2m, level); + pte_t entry, *table; + int rc; + mfn_t mfn =3D INVALID_MFN; + + P2M_BUILD_LEVEL_OFFSETS(p2m, offsets, gfn_to_gaddr(gfn)); + + ASSERT(p2m_is_locked(p2m)); + + *t =3D p2m_invalid; + + if ( gfn_x(gfn) > (BIT(gfn_limit_bits, UL) - 1) ) + { + if ( page_order ) + *page_order =3D gfn_limit_bits; + + return mfn; + } + + if ( check_outside_boundary(p2m, gfn, p2m->lowest_mapped_gfn, true, + &level) ) + goto out; + + if ( check_outside_boundary(p2m, gfn, p2m->max_mapped_gfn, false, &lev= el) ) + goto out; + + table =3D p2m_get_root_pointer(p2m, gfn); + + /* + * The table should always be non-NULL because the gfn is below + * p2m->max_mapped_gfn and the root table pages are always present. + */ + if ( !table ) + { + ASSERT_UNREACHABLE(); + goto out; + } + + for ( level =3D P2M_ROOT_LEVEL(p2m); level; level-- ) + { + rc =3D p2m_next_level(p2m, false, level, &table, offsets[level]); + if ( rc =3D=3D P2M_TABLE_MAP_NONE ) + goto out_unmap; + + if ( rc !=3D P2M_TABLE_NORMAL ) + break; + } + + entry =3D table[offsets[level]]; + + if ( pte_is_valid(entry) ) + { + *t =3D p2m_get_type(entry); + + mfn =3D pte_get_mfn(entry); + + ASSERT(!(mfn_x(mfn) & (BIT(P2M_LEVEL_ORDER(level), UL) - 1))); + + /* + * The entry may point to a superpage. Find the MFN associated + * to the GFN. 
+ */ + mfn =3D mfn_add(mfn, + gfn_x(gfn) & (BIT(P2M_LEVEL_ORDER(level), UL) - 1)); + } + + out_unmap: + unmap_domain_page(table); + + out: + if ( page_order ) + *page_order =3D P2M_LEVEL_ORDER(level); + + return mfn; +} + +struct page_info *p2m_get_page_from_gfn(struct p2m_domain *p2m, gfn_t gfn, + p2m_type_t *t) +{ + struct page_info *page; + p2m_type_t p2mt; + mfn_t mfn; + + p2m_read_lock(p2m); + mfn =3D p2m_get_entry(p2m, gfn, &p2mt, NULL); + + if ( t ) + *t =3D p2mt; + + if ( !mfn_valid(mfn) ) + { + p2m_read_unlock(p2m); + return NULL; + } + + page =3D mfn_to_page(mfn); + + /* + * get_page won't work on foreign mapping because the page doesn't + * belong to the current domain. + */ + if ( unlikely(p2m_is_foreign(p2mt)) ) + { + const struct domain *fdom =3D page_get_owner_and_reference(page); + + p2m_read_unlock(p2m); + + if ( fdom ) + { + if ( likely(fdom !=3D p2m->domain) ) + return page; + + ASSERT_UNREACHABLE(); + put_page(page); + } + + return NULL; + } + + p2m_read_unlock(p2m); + + return get_page(page, p2m->domain) ? 
page : NULL; +} --=20 2.52.0 From nobody Thu Jan 8 12:34:33 2026 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass(p=none dis=none) header.from=gmail.com ARC-Seal: i=1; a=rsa-sha256; t=1766421508; cv=none; d=zohomail.com; s=zohoarc; b=RFnon/Ws3WvKhuYF0Jtrr44PVtfz5DOFLyWBn5IkwCDe0F9wbzLbAATYSg+X4bREkceKN46vjYKZCr2Y4QNSPrVtodldUC5iYZNe5pKHC9ZBy2GlK586WJY54GJdCkQar2q/mPwP79HKTSpfLIf0TboiNGzoWy88x1ylAFZ5OXM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1766421508; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=PcWFriMyjLzR9smAZRA9nO9Zffm+beqrJCupBT4WdqE=; b=jsy5jcZRa5nhGs1fVpzkoTaa8sv8XuF8mP4TKCjEOfLLhE8YoCfOpLDBKAk2e+IENNRmAc0uqW2Yf5lxeM1aWlg6nnN6DGpU0psDV4T8kkH1OGEkEPy8Y5zW87yQu8BxNB+LgmmeCebhHoZdzKs2/+nxUIiInkPVMZvDoW2KMwQ= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 
1766421508498263.95059627641285; Mon, 22 Dec 2025 08:38:28 -0800 (PST) Received: from list by lists.xenproject.org with outflank-mailman.1192170.1511521 (Exim 4.92) (envelope-from ) id 1vXiv8-0001n4-M8; Mon, 22 Dec 2025 16:38:06 +0000 Received: by outflank-mailman (output) from mailman id 1192170.1511521; Mon, 22 Dec 2025 16:38:06 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1vXiv8-0001mk-Hb; Mon, 22 Dec 2025 16:38:06 +0000 Received: by outflank-mailman (input) for mailman id 1192170; Mon, 22 Dec 2025 16:38:04 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1vXiv6-00085h-JP for xen-devel@lists.xenproject.org; Mon, 22 Dec 2025 16:38:04 +0000 Received: from mail-ej1-x632.google.com (mail-ej1-x632.google.com [2a00:1450:4864:20::632]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 955b4bb4-df54-11f0-9cce-f158ae23cfc8; Mon, 22 Dec 2025 17:38:02 +0100 (CET) Received: by mail-ej1-x632.google.com with SMTP id a640c23a62f3a-b79e7112398so670716766b.3 for ; Mon, 22 Dec 2025 08:38:02 -0800 (PST) Received: from fedora (user-109-243-71-38.play-internet.pl. 
[109.243.71.38]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b8037de004fsm1149128166b.45.2025.12.22.08.38.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Dec 2025 08:38:01 -0800 (PST) X-Outflank-Mailman: Message body and most headers restored to incoming version X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 955b4bb4-df54-11f0-9cce-f158ae23cfc8 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1766421482; x=1767026282; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=PcWFriMyjLzR9smAZRA9nO9Zffm+beqrJCupBT4WdqE=; b=kBaTsotG3pT3PBM68g3SLWlITeGDbQuEVQGLeEtHOHX+k8aXj7KE7y7qttb0gQD78c iYILfpCFffjQz068dXMNKoHiVLbjn+ST4DXPccZv4fL/F1+EbTmzwk8zssrOQ9x7EuGi /AvhJ+Jbbj+MR95pwO43vee7rjDXNrhozfKBVWq9UndU5GZlmRf6ib4MVUNQWL8yYjeO jfUp3TDe7M6MIr3vWKQ66TKOailO4wKAQLJSzac/IgJx/eoHHrQa+200lZ+YxW5/6zxP M6BoZEPCdn/HVHpO16PxSR+AbNBdosU6RRZlvfgGyiqNeCIpyy2SU1oYzOZfNQrTZGlM kRUw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1766421482; x=1767026282; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=PcWFriMyjLzR9smAZRA9nO9Zffm+beqrJCupBT4WdqE=; b=W7ZctGQ778+9ZE/nj4N3yZp4Ou6bxDh2dKdvVkL6Hj3UR5EZt6TYpDsGVoU3WWn1rQ ANNMWVYgYDrkiAsDwWkTdvpjvDkLBhgOB4XfBRzjIQ/ZM5CuGvev/dt8d47MHvpGPZYI o9ysNqtq4V/oXniUdGWk5nco+oR13WjTPCLZsiMmZJpjHMZoP4bVsWesftHjQh598yLS Dq2AdU5k3bJnj/ehLLT7usJ3EcbWmxFwQzbYW8YuWzMRhCoIU9Hk/Ke8Ht0XTI60rL0X JzgkIdESIdscxtxGPcSEgSvmLZdvPusIUaZ9P1gHypXIwNdUz1fyq1a3VcEWVjsbUXA8 U5dw== X-Gm-Message-State: 
AOJu0YyY6FEvUYNp7D3wuda83XMX2u4I762k6R07slBPm8Ip28DWBPVA xWNhfzbL57+Q9Gl06GQfoclm1h1+Jv+pbrlM7R7sorhkLFb2igYgdJxvh91iVg== X-Gm-Gg: AY/fxX7w5HAl24acNibZQVWWTnaCbvDsRC+f20f9vhB4V0SiPElozZsuh+yNYMZGJXo nvsEyW3guGQY9v7CZlv+bPyZo63k1EVAD7f3XpDjZBWEHoJipGKc6CTr7R1qLAdDB8BBXQVLVNi OsFkalDI3k3fkAiM9QeervrI1DEBSIN3ybSbbg37DfvU/9vqB3dqztIbxKNwyjG0J2EH3FCYK1M ITlCNbsOM6T8wzG6291SNltQ3fO95gVrf8eD3u6tBPNHjmOSADEVFYMtI3BqvVAXxKVWWsmNFqs 07Z+ebqX5LRaT1mIvVk6iDoGGoixG6ci3cH9XGyIxrHOQSZbbyydWxSeenU/d6qj/Qo0JxmFrNc EHKCIavu8molC7oSbsR1/IMOmQ2lwoDpg+JPdePCJkV4zDbjB3iOA7fXpTe08rkjpoXM75zOfHC HNimRbtdcRapwIlHwoEJCSVYbfAOEvRFVt+EEnmVdg986IA1MQJyKuNTc= X-Google-Smtp-Source: AGHT+IEucjD41PXDoO0Sehqsa3CDqyfVYMMsagIuNHXh6gvXRB09XPn5DocVYWIVvH/L1Kj0c8MHng== X-Received: by 2002:a17:907:2d0e:b0:b73:80de:e6b2 with SMTP id a640c23a62f3a-b803705dbd1mr1400467366b.31.1766421481530; Mon, 22 Dec 2025 08:38:01 -0800 (PST) From: Oleksii Kurochko To: xen-devel@lists.xenproject.org Cc: Oleksii Kurochko , Alistair Francis , Bob Eshleman , Connor Davis , Andrew Cooper , Anthony PERARD , Michal Orzel , Jan Beulich , Julien Grall , =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= , Stefano Stabellini Subject: [PATCH v8 2/3] xen/riscv: introduce metadata table to store P2M type Date: Mon, 22 Dec 2025 17:37:48 +0100 Message-ID: <127d893e3b6a0da1195f9a128c8d0591e6ef473d.1766406895.git.oleksii.kurochko@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass (identity @gmail.com) X-ZM-MESSAGEID: 1766421509582158500 RISC-V's PTE has only two available bits that can be used to store the P2M type. This is insufficient to represent all the current RISC-V P2M types. Therefore, some P2M types must be stored outside the PTE bits. To address this, a metadata table is introduced to store P2M types that cannot fit in the PTE itself. 
Not all P2M types are stored in the metadata table, only those that require
it.

The metadata table is linked to the intermediate page table via the
`struct page_info`'s v.md.pg field of the corresponding intermediate page.
Such pages are allocated with MEMF_no_owner, which allows us to use the v
field for the purpose of storing the metadata table.

To simplify the allocation and linking of intermediate and metadata page
tables, `p2m_{alloc,free}_table()` functions are implemented.

These changes impact `p2m_split_superpage()`, since when a superpage is
split, it is necessary to update the metadata table of the new intermediate
page table if the entry being split has its P2M type set to
`p2m_ext_storage` in its `P2M_TYPES` bits. In addition to updating the
metadata of the new intermediate page table, the corresponding entry in the
metadata for the original superpage is invalidated.

Also, update p2m_{get,set}_type to work with P2M types which don't fit into
PTE bits.

Suggested-by: Jan Beulich
Signed-off-by: Oleksii Kurochko
---
Changes in V8:
 - Update the comment above p2m_set_type().
 - Drop BUG_ON(ctx->level ...) and
   "if ( ctx->level <= P2M_MAX_SUPPORTED_LEVEL_MAPPING )" as p2m_set_type()
   doesn't care about ctx->level and it is expected that the passed `pte` is
   valid, and so ctx->level is expected to be valid too.
 - Rename p2m_pte_ctx argument to ctx for p2m_pte_from_mfn() and
   p2m_free_subtree().
 - Initialize local variable p2m_pte_ctx inside p2m_split_superpage() with
   an initializer. Drop an assignment of p2m_pte_ctx->level when the old
   PTE's type is obtained.
 - Use initializer for tmp_ctx and drop an assignment of tmp_ctx.p2m inside
   p2m_set_type().
 - Drop brackets around p2m_free_subtree() call inside p2m_set_entry().
---
Changes in V7:
 - Put p2m_domain * inside struct p2m_pte_ctx and update the APIs of
   p2m_set_type(), p2m_pte_from_mfn(). Also, move ASSERT(p2m) closer to
   p2m_alloc_page(ctx->p2m) inside p2m_set_type(). Update all callers of
   p2m_set_type() and p2m_pte_from_mfn().
 - Update the comment above BUILD_BUG_ON(p2m_invalid): drop unnecessary
   sentences and make it shorter than 80 chars.
 - Drop the comment and BUILD_BUG_ON() in p2m_get_type() as it is enough
   to have it in p2m_set_type().
 - Update the comment above p2m_set_type() about the p2m argument which was
   dropped.
 - Make ctx argument of p2m_set_type() const to be able to re-use
   p2m_pte_ctx across multiple iterations without fully reinitializing.
 - Declare "struct p2m_pte_ctx tmp_ctx;" as function scope variable and
   rework p2m_set_entry() correspondingly.
---
Changes in V6:
 - Introduce new type md_t to use it instead of pte_t to store metadata
   types outside PTE bits.
 - Integrate introduced struct md_t.
 - Drop local variable "struct domain *d" inside p2m_set_type().
 - Drop __func__ printing and use %pv.
 - Code style fixes.
 - Drop unnecessary check inside if-condition in p2m_pte_from_mfn() as we
   have ASSERT(p2m) inside p2m_set_type() anyway.
 - Return back the comment inside page_to_p2m_table() as it was deleted
   accidentally.
 - Move the initialization of p2m_pte_ctx.pt_page and p2m_pte_ctx.level
   ahead of the loop.
 - Add BUILD_BUG_ON(p2m_invalid) before the call of p2m_alloc_page() in
   p2m_set_type() and in p2m_get_type() before
   " if ( type == p2m_ext_storage )".
 - Set to NULL tbl_pg->v.md.pg in p2m_free_table().
 - Make argument 't' of p2m_set_type() non-const as we are going to change
   it.
 - Add some explanatory comments.
 - Update ASSERT at the start of p2m_set_type() to verify that the passed
   ctx->index is less than 512 and drop calculation of an index of root
   page as it is guaranteed by calc_offset() and get_root_pointer() that we
   will already get the proper page and proper index inside this page.
---
Changes in V5:
 - Rename metadata member of struct md inside struct page_info to pg.
 - Stray blank in the declaration of p2m_alloc_table().
 - Use "<" instead of "<=" in ASSERT() in p2m_set_type().
 - Move the check that ctx is provided to an earlier point in
   p2m_set_type().
 - Set `md_pg` after ASSERT() in p2m_set_type().
 - Add BUG_ON() instead of ASSERT_UNREACHABLE() in p2m_set_type().
 - Drop a check that metadata isn't NULL before unmap_domain_page() is
   being called.
 - Make const `md` variable in p2m_get_type().
 - Unmap correct domain's page in p2m_get_type: use `md` instead of
   ctx->pt_page->v.md.pg.
 - Add description of how p2m and p2m_pte_ctx are expected to be used in
   p2m_pte_from_mfn() and drop a comment from page_to_p2m_table().
 - Drop the stale part of the comment above p2m_alloc_table().
 - Drop ASSERT(tbl_pg->v.md.pg) from p2m_free_table() as tbl_pg->v.md.pg
   is created conditionally now.
 - Drop an introduction of p2m_alloc_table(), update p2m_alloc_page()
   correspondingly and use it instead.
 - Add missing blank in definition of level member for tmp_ctx variable in
   p2m_free_subtree(). Also, add the comma at the end.
 - Initialize old_type once before for-loop in p2m_split_superpage() as
   old type will be used for all newly created PTEs.
 - Properly initialize p2m_pte_ctx.level with next_level instead of
   level when p2m_set_type() is going to be called for new PTEs.
 - Fix indentation.
 - Move ASSERT(p2m) on top of p2m_set_type() to be sure that NULL isn't
   passed for p2m argument of p2m_set_type().
 - s/virt_to_page(table)/mfn_to_page(domain_page_map_to_mfn(table))
   to receive the correct page for a table which is mapped by
   domain_page_map().
 - Add "return;" after domain_crash() in p2m_set_type() to avoid potential
   NULL pointer dereference of md_pg.
---
Changes in V4:
 - Add Suggested-by: Jan Beulich.
 - Update the comment above declaration of md structure inside struct
   page_info to: "Page is used as an intermediate P2M page table".
 - Allocate metadata table on demand to save some memory. (1)
 - Rework p2m_set_type():
   - Add allocation of metadata page only if needed.
   - Move a check what kind of type we are handling inside p2m_set_type().
 - Move mapping of metadata page inside p2m_get_type() as it is needed only
   in case if PTE's type is equal to p2m_ext_storage.
 - Add some description to p2m_get_type() function.
 - Drop blank after return type of p2m_alloc_table().
 - Drop allocation of metadata page inside p2m_alloc_table because of (1).
 - Fix p2m_free_table() to free metadata page only if it was allocated.
---
Changes in V3:
 - Add is_p2m_foreign() macro and connected stuff.
 - Change struct domain *d argument of p2m_get_page_from_gfn() to
   struct p2m_domain.
 - Update the comment above p2m_get_entry().
 - s/_t/p2mt for local variable in p2m_get_entry().
 - Drop local variable addr in p2m_get_entry() and use gfn_to_gaddr(gfn)
   to define offsets array.
 - Code style fixes.
 - Update a check of rc code from p2m_next_level() in p2m_get_entry()
   and drop "else" case.
 - Do not call p2m_get_type() if p2m_get_entry()'s t argument is NULL.
 - Use struct p2m_domain instead of struct domain for p2m_lookup() and
   p2m_get_page_from_gfn().
 - Move definition of get_page() from "xen/riscv: implement mfn_valid() and
   page reference, ownership handling helpers"
---
Changes in V2:
 - New patch.
---
xen/arch/riscv/include/asm/mm.h | 9 ++ xen/arch/riscv/p2m.c | 236 ++++++++++++++++++++++++++++---- 2 files changed, 215 insertions(+), 30 deletions(-) diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h index 1a99e1cf0a3c..48162f5d65cd 100644 --- a/xen/arch/riscv/include/asm/mm.h +++ b/xen/arch/riscv/include/asm/mm.h @@ -149,6 +149,15 @@ struct page_info /* Order-size of the free chunk this page is the head of. */ unsigned int order; } free; + + /* Page is used as an intermediate P2M page table */ + struct { + /* + * Pointer to a page which store metadata for an intermediate page + * table. 
+ */ + struct page_info *pg; + } md; } v; =20 union { diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c index 66943b969e8a..24dd07165bd1 100644 --- a/xen/arch/riscv/p2m.c +++ b/xen/arch/riscv/p2m.c @@ -26,6 +26,25 @@ */ #define P2M_MAX_SUPPORTED_LEVEL_MAPPING _AC(2, U) =20 +struct md_t { + /* + * Describes a type stored outside PTE bits. + * Look at the comment above definition of enum p2m_type_t. + */ + p2m_type_t type : 4; +}; + +/* + * P2M PTE context is used only when a PTE's P2M type is p2m_ext_storage. + * In this case, the P2M type is stored separately in the metadata page. + */ +struct p2m_pte_ctx { + struct p2m_domain *p2m; + struct page_info *pt_page; /* Page table page containing the PTE. */ + unsigned int index; /* Index of the PTE within that page. */ + unsigned int level; /* Paging level at which the PTE resides.= */ +}; + static struct gstage_mode_desc __ro_after_init max_gstage_mode =3D { .mode =3D HGATP_MODE_OFF, .paging_levels =3D 0, @@ -37,6 +56,10 @@ unsigned char get_max_supported_mode(void) return max_gstage_mode.mode; } =20 +/* + * If anything is changed here, it may also require updates to + * p2m_{get,set}_type(). + */ static inline unsigned int calc_offset(const struct p2m_domain *p2m, const unsigned int lvl, const paddr_t gpa) @@ -79,6 +102,9 @@ static inline unsigned int calc_offset(const struct p2m_= domain *p2m, * The caller is responsible for unmapping the page after use. * * Returns NULL if the calculated offset into the root table is invalid. + * + * If anything is changed here, it may also require updates to + * p2m_{get,set}_type(). */ static pte_t *p2m_get_root_pointer(struct p2m_domain *p2m, gfn_t gfn) { @@ -370,24 +396,96 @@ static struct page_info *p2m_alloc_page(struct p2m_do= main *p2m) return pg; } =20 -static int p2m_set_type(pte_t *pte, p2m_type_t t) +/* + * `pte` =E2=80=93 PTE entry for which the type `t` will be stored. + * + * If `t` >=3D p2m_first_external, a valid `ctx` must be provided. 
+ */ +static void p2m_set_type(pte_t *pte, p2m_type_t t, + const struct p2m_pte_ctx *ctx) { - int rc =3D 0; + struct page_info **md_pg; + struct md_t *metadata =3D NULL; =20 - if ( t > p2m_first_external ) - panic("unimplemeted\n"); - else - pte->pte |=3D MASK_INSR(t, P2M_TYPE_PTE_BITS_MASK); + /* + * It is sufficient to compare ctx->index with PAGETABLE_ENTRIES becau= se, + * even for the p2m root page table (which is a 16 KB page allocated as + * four 4 KB pages), calc_offset() guarantees that the page-table index + * will always fall within the range [0, 511]. + */ + ASSERT(ctx && ctx->index < PAGETABLE_ENTRIES); =20 - return rc; + /* + * At the moment, p2m_get_root_pointer() returns one of four possible = p2m + * root pages, so there is no need to search for the correct ->pt_page + * here. + * Non-root page tables are 4 KB pages, so simply using ->pt_page is + * sufficient. + */ + md_pg =3D &ctx->pt_page->v.md.pg; + + if ( !*md_pg && (t >=3D p2m_first_external) ) + { + /* + * Since p2m_alloc_page() initializes an allocated page with + * zeros, p2m_invalid is expected to have the value 0 as well. + */ + BUILD_BUG_ON(p2m_invalid); + + ASSERT(ctx->p2m); + + *md_pg =3D p2m_alloc_page(ctx->p2m); + if ( !*md_pg ) + { + printk("%pd: can't allocate metadata page\n", + ctx->p2m->domain); + domain_crash(ctx->p2m->domain); + + return; + } + } + + if ( *md_pg ) + metadata =3D __map_domain_page(*md_pg); + + if ( t >=3D p2m_first_external ) + { + metadata[ctx->index].type =3D t; + + t =3D p2m_ext_storage; + } + else if ( metadata ) + metadata[ctx->index].type =3D p2m_invalid; + + pte->pte |=3D MASK_INSR(t, P2M_TYPE_PTE_BITS_MASK); + + unmap_domain_page(metadata); } =20 -static p2m_type_t p2m_get_type(const pte_t pte) +/* + * `pte` -> PTE entry that stores the PTE's type. + * + * If the PTE's type is `p2m_ext_storage`, `ctx` should be provided; + * otherwise it could be NULL. 
+ */ +static p2m_type_t p2m_get_type(const pte_t pte, const struct p2m_pte_ctx *= ctx) { p2m_type_t type =3D MASK_EXTR(pte.pte, P2M_TYPE_PTE_BITS_MASK); =20 if ( type =3D=3D p2m_ext_storage ) - panic("unimplemented\n"); + { + const struct md_t *md =3D __map_domain_page(ctx->pt_page->v.md.pg); + + type =3D md[ctx->index].type; + + /* + * Since p2m_set_type() guarantees that the type will be greater t= han + * p2m_first_external, just check that we received a valid type he= re. + */ + ASSERT(type > p2m_first_external); + + unmap_domain_page(md); + } =20 return type; } @@ -477,7 +575,14 @@ static void p2m_set_permission(pte_t *e, p2m_type_t t) } } =20 -static pte_t p2m_pte_from_mfn(mfn_t mfn, p2m_type_t t, bool is_table) +/* + * If p2m_pte_from_mfn() is called with ctx =3D NULL, + * it means the function is working with a page table for which the `t` + * should not be applicable. Otherwise, the function is handling a leaf PTE + * for which `t` is applicable. + */ +static pte_t p2m_pte_from_mfn(mfn_t mfn, p2m_type_t t, + struct p2m_pte_ctx *ctx) { pte_t e =3D (pte_t) { PTE_VALID }; =20 @@ -485,7 +590,7 @@ static pte_t p2m_pte_from_mfn(mfn_t mfn, p2m_type_t t, = bool is_table) =20 ASSERT(!(mfn_to_maddr(mfn) & ~PADDR_MASK) || mfn_eq(mfn, INVALID_MFN)); =20 - if ( !is_table ) + if ( ctx ) { switch ( t ) { @@ -498,7 +603,7 @@ static pte_t p2m_pte_from_mfn(mfn_t mfn, p2m_type_t t, = bool is_table) } =20 p2m_set_permission(&e, t); - p2m_set_type(&e, t); + p2m_set_type(&e, t, ctx); } else /* @@ -518,7 +623,22 @@ static pte_t page_to_p2m_table(const struct page_info = *page) * set to true and p2m_type_t shouldn't be applied for PTEs which * describe an intermediate table. */ - return p2m_pte_from_mfn(page_to_mfn(page), p2m_invalid, true); + return p2m_pte_from_mfn(page_to_mfn(page), p2m_invalid, NULL); +} + +static void p2m_free_page(struct p2m_domain *p2m, struct page_info *pg); + +/* + * Free page table's page and metadata page linked to page table's page. 
+ */ +static void p2m_free_table(struct p2m_domain *p2m, struct page_info *tbl_p= g) +{ + if ( tbl_pg->v.md.pg ) + { + p2m_free_page(p2m, tbl_pg->v.md.pg); + tbl_pg->v.md.pg =3D NULL; + } + p2m_free_page(p2m, tbl_pg); } =20 /* Allocate a new page table page and hook it in via the given entry. */ @@ -679,12 +799,14 @@ static void p2m_free_page(struct p2m_domain *p2m, str= uct page_info *pg) =20 /* Free pte sub-tree behind an entry */ static void p2m_free_subtree(struct p2m_domain *p2m, - pte_t entry, unsigned int level) + pte_t entry, + const struct p2m_pte_ctx *ctx) { unsigned int i; pte_t *table; mfn_t mfn; struct page_info *pg; + unsigned int level =3D ctx->level; =20 /* * Check if the level is valid: only 4K - 2M - 1G mappings are support= ed. @@ -700,7 +822,7 @@ static void p2m_free_subtree(struct p2m_domain *p2m, =20 if ( pte_is_mapping(entry) ) { - p2m_type_t p2mt =3D p2m_get_type(entry); + p2m_type_t p2mt =3D p2m_get_type(entry, ctx); =20 #ifdef CONFIG_IOREQ_SERVER /* @@ -719,10 +841,22 @@ static void p2m_free_subtree(struct p2m_domain *p2m, return; } =20 - table =3D map_domain_page(pte_get_mfn(entry)); + mfn =3D pte_get_mfn(entry); + ASSERT(mfn_valid(mfn)); + table =3D map_domain_page(mfn); + pg =3D mfn_to_page(mfn); =20 for ( i =3D 0; i < P2M_PAGETABLE_ENTRIES(p2m, level); i++ ) - p2m_free_subtree(p2m, table[i], level - 1); + { + struct p2m_pte_ctx tmp_ctx =3D { + .pt_page =3D pg, + .index =3D i, + .level =3D level - 1, + .p2m =3D p2m, + }; + + p2m_free_subtree(p2m, table[i], &tmp_ctx); + } =20 unmap_domain_page(table); =20 @@ -734,17 +868,13 @@ static void p2m_free_subtree(struct p2m_domain *p2m, */ p2m_tlb_flush_sync(p2m); =20 - mfn =3D pte_get_mfn(entry); - ASSERT(mfn_valid(mfn)); - - pg =3D mfn_to_page(mfn); - - p2m_free_page(p2m, pg); + p2m_free_table(p2m, pg); } =20 static bool p2m_split_superpage(struct p2m_domain *p2m, pte_t *entry, unsigned int level, unsigned int target, - const unsigned int *offsets) + const unsigned int *offsets, + struct 
page_info *tbl_pg) { struct page_info *page; unsigned long i; @@ -756,6 +886,14 @@ static bool p2m_split_superpage(struct p2m_domain *p2m= , pte_t *entry, unsigned int next_level =3D level - 1; unsigned int level_order =3D P2M_LEVEL_ORDER(next_level); =20 + struct p2m_pte_ctx p2m_pte_ctx =3D { + .p2m =3D p2m, + .level =3D level, + }; + + /* Init with p2m_invalid just to make compiler happy. */ + p2m_type_t old_type =3D p2m_invalid; + /* * This should only be called with target !=3D level and the entry is * a superpage. @@ -777,6 +915,17 @@ static bool p2m_split_superpage(struct p2m_domain *p2m= , pte_t *entry, =20 table =3D __map_domain_page(page); =20 + if ( MASK_EXTR(entry->pte, P2M_TYPE_PTE_BITS_MASK) =3D=3D p2m_ext_stor= age ) + { + p2m_pte_ctx.pt_page =3D tbl_pg; + p2m_pte_ctx.index =3D offsets[level]; + + old_type =3D p2m_get_type(*entry, &p2m_pte_ctx); + } + + p2m_pte_ctx.pt_page =3D page; + p2m_pte_ctx.level =3D next_level; + for ( i =3D 0; i < P2M_PAGETABLE_ENTRIES(p2m, next_level); i++ ) { pte_t *new_entry =3D table + i; @@ -788,6 +937,13 @@ static bool p2m_split_superpage(struct p2m_domain *p2m= , pte_t *entry, pte =3D *entry; pte_set_mfn(&pte, mfn_add(mfn, i << level_order)); =20 + if ( MASK_EXTR(pte.pte, P2M_TYPE_PTE_BITS_MASK) =3D=3D p2m_ext_sto= rage ) + { + p2m_pte_ctx.index =3D i; + + p2m_set_type(&pte, old_type, &p2m_pte_ctx); + } + write_pte(new_entry, pte); } =20 @@ -799,7 +955,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m,= pte_t *entry, */ if ( next_level !=3D target ) rv =3D p2m_split_superpage(p2m, table + offsets[next_level], - next_level, target, offsets); + next_level, target, offsets, page); =20 if ( p2m->clean_dcache ) clean_dcache_va_range(table, PAGE_SIZE); @@ -840,6 +996,9 @@ static int p2m_set_entry(struct p2m_domain *p2m, * are still allowed. 
*/ bool removing_mapping =3D mfn_eq(mfn, INVALID_MFN); + struct p2m_pte_ctx tmp_ctx =3D { + .p2m =3D p2m, + }; P2M_BUILD_LEVEL_OFFSETS(p2m, offsets, gfn_to_gaddr(gfn)); =20 ASSERT(p2m_is_write_locked(p2m)); @@ -890,13 +1049,19 @@ static int p2m_set_entry(struct p2m_domain *p2m, { /* We need to split the original page. */ pte_t split_pte =3D *entry; + struct page_info *tbl_pg =3D mfn_to_page(domain_page_map_to_mfn(ta= ble)); =20 ASSERT(pte_is_superpage(*entry, level)); =20 - if ( !p2m_split_superpage(p2m, &split_pte, level, target, offsets)= ) + if ( !p2m_split_superpage(p2m, &split_pte, level, target, offsets, + tbl_pg) ) { + tmp_ctx.pt_page =3D tbl_pg; + tmp_ctx.index =3D offsets[level]; + tmp_ctx.level =3D level; + /* Free the allocated sub-tree */ - p2m_free_subtree(p2m, split_pte, level); + p2m_free_subtree(p2m, split_pte, &tmp_ctx); =20 rc =3D -ENOMEM; goto out; @@ -922,6 +1087,10 @@ static int p2m_set_entry(struct p2m_domain *p2m, entry =3D table + offsets[level]; } =20 + tmp_ctx.pt_page =3D mfn_to_page(domain_page_map_to_mfn(table)); + tmp_ctx.index =3D offsets[level]; + tmp_ctx.level =3D level; + /* * We should always be there with the correct level because all the * intermediate tables have been installed if necessary. 
@@ -934,7 +1103,7 @@ static int p2m_set_entry(struct p2m_domain *p2m, p2m_clean_pte(entry, p2m->clean_dcache); else { - pte_t pte =3D p2m_pte_from_mfn(mfn, t, false); + pte_t pte =3D p2m_pte_from_mfn(mfn, t, &tmp_ctx); =20 p2m_write_pte(entry, pte, p2m->clean_dcache); =20 @@ -970,7 +1139,7 @@ static int p2m_set_entry(struct p2m_domain *p2m, if ( pte_is_valid(orig_pte) && (!pte_is_valid(*entry) || !mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte))) ) - p2m_free_subtree(p2m, orig_pte, level); + p2m_free_subtree(p2m, orig_pte, &tmp_ctx); =20 out: unmap_domain_page(table); @@ -1171,7 +1340,14 @@ static mfn_t p2m_get_entry(struct p2m_domain *p2m, g= fn_t gfn, =20 if ( pte_is_valid(entry) ) { - *t =3D p2m_get_type(entry); + struct p2m_pte_ctx p2m_pte_ctx =3D { + .pt_page =3D mfn_to_page(domain_page_map_to_mfn(table)), + .index =3D offsets[level], + .level =3D level, + .p2m =3D p2m, + }; + + *t =3D p2m_get_type(entry, &p2m_pte_ctx); =20 mfn =3D pte_get_mfn(entry); =20 --=20 2.52.0 From nobody Thu Jan 8 12:34:33 2026 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass(p=none dis=none) header.from=gmail.com ARC-Seal: i=1; a=rsa-sha256; t=1766421500; cv=none; d=zohomail.com; s=zohoarc; 
b=Hcp5fVYSszWtFfrONpzWyfg4f3tRPqyO7eS6c1MwL2fzv07RXaOLdfj3FdOIkwT9dnX/qOTm3J0dAi8e1rTdsCTiZHsLTAkCMf5DN6OgaBj7Fo2uKUxvge2U5UaP5EKbaVYgIWs1PhEJh1yKosTFGHNj+k3yG8XA+ar4Bzy+Fv0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1766421500; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=snjG4R/pA1CFKmrsMjW984DLPxncWtCMO3mbZglzYjc=; b=hNSD1e99bDtvVdzwE/xfFG/msIOWZigSDiRVn3DSLaPU+3w9RKHincA5zrpsn0DrWcyFiINejCp/0gTMk74buJqTZra+QKWh7TyDjz96lZQ1/7kIPDaG1ktJtedmIjzuyI1k1vRkw4bW0To/k/H3Pd1LTV2A/pAt2heBMB5PE7w= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 1766421500255623.3740375844218; Mon, 22 Dec 2025 08:38:20 -0800 (PST) Received: from list by lists.xenproject.org with outflank-mailman.1192172.1511526 (Exim 4.92) (envelope-from ) id 1vXiv9-0001rN-0J; Mon, 22 Dec 2025 16:38:07 +0000 Received: by outflank-mailman (output) from mailman id 1192172.1511526; Mon, 22 Dec 2025 16:38:06 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1vXiv8-0001qw-RB; Mon, 22 Dec 2025 16:38:06 +0000 Received: by outflank-mailman (input) for mailman id 1192172; Mon, 22 Dec 2025 16:38:05 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1vXiv7-00085h-C5 for xen-devel@lists.xenproject.org; Mon, 22 Dec 2025 16:38:05 +0000 Received: from mail-ed1-x52c.google.com 
(mail-ed1-x52c.google.com [2a00:1450:4864:20::52c]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 95da2142-df54-11f0-9cce-f158ae23cfc8; Mon, 22 Dec 2025 17:38:03 +0100 (CET) Received: by mail-ed1-x52c.google.com with SMTP id 4fb4d7f45d1cf-64b8e5d1611so4637192a12.3 for ; Mon, 22 Dec 2025 08:38:03 -0800 (PST) Received: from fedora (user-109-243-71-38.play-internet.pl. [109.243.71.38]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b8037de004fsm1149128166b.45.2025.12.22.08.38.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Dec 2025 08:38:02 -0800 (PST) X-Outflank-Mailman: Message body and most headers restored to incoming version X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 95da2142-df54-11f0-9cce-f158ae23cfc8 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1766421483; x=1767026283; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=snjG4R/pA1CFKmrsMjW984DLPxncWtCMO3mbZglzYjc=; b=FTKDi5LVX/yGU1afdBpJUkl+FPe1he+lOvWoAbkoVLqKgsee1qztHw5+t44M27Ct/d 3MPUcVIsrs4i6WyCpVmVdHy+yMUVFTtjjxBA/dfiBFBaV9zac2iuFTF61FO3mf+JZ6EV lwqAtiUGOw6ay3/dG3iSHidwxKoLrNlMh2Y4dOWuVVhpja18U3A3DtB/GO0TGP4tvYnQ iobzSbqyGfu+bBMwEUbL+e2MOJQ4/Um+I/GvOAoiH4rJXRG/nuBzBf5n8tQA+cb7STD4 2ztGB0RYobR+fgnGLJ9g0sM47ydFTG7oSITBI4rgI5MCScr9J9BQpUBqK1LSbucBheUe y3hQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1766421483; x=1767026283; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=snjG4R/pA1CFKmrsMjW984DLPxncWtCMO3mbZglzYjc=; 
b=Exxc2B9x+SmZZF0PnRUVQwvsEKgygSYn9RJMGYrBAxYdfLyHkOyo5Jlp2x3wFRDl8I O0EfNlNXAliM6hk2zMXO1b6+rWb3SzbvRYiuz8HWwb7X5bOC6FCqTA6AHc7BA78s7DJY bzvCjC7D2vtBqmHaCo7anLvhse3Umq9ahJ9Vb5PZ8LhnBeForCovyUsHpBxbrKx/CCYq 2gE4Z7ICQKC3P0eWCrmUyg0fAXGK2CCG5umb3Tctx2rdnly+sXdFHz9H2SH7bq3OUYE4 UcXTWWi6FUY9/w4ZL9GXBaQtBtHTD50XvlJBYg4upxQKoMyDD1Kqnjs4/Fmavw7DUOwe gCMQ== X-Gm-Message-State: AOJu0YxqocfFVL+4bX18VVWnibNgIuCv7aIAMhQTaKtGed8Xt/bu3/Zk FRDArDkmeuQicmBIEJ3QY1s0EGjhUtfn969S9OC1wI4JpsxGypovuWDBHdoFYQ== X-Gm-Gg: AY/fxX5qs6b2z+JfOhqBqV816Is61ly1Tpd4LC7QqbvXWzcFVOLwt3DS75i5pSSmWhX KM0X5rz5MraCuuoLb4/G6rRVgkoX6xuW4MN9121oXjDr1BSfVR3CgQkyFk7Aqv6/0iUyE6iJytw ITURn70I/d2FegVyzkIdI3/bVa1ob6H26bDZelVdwQA0ULZAeuNbPf6LcM2RyLPV8ab4kMg+KcU axfVFWctsztQ/KpTCrUGF+HTt42QSlLxWaltYpeqyQ9fLhfa1OZujdnxlM9ZwRtVn8pYSxD/ZYx mAKK/WXE3spIJPFJ5T6s4KgnWEjhFJ3vaxOtG/xvZwGkIQlwM19fCMdo24+Ab1wCLgfsob2rywA 6C+8JtaEopSt2PHyuoGRUZBycFx6Cif8e46nkcMAshufeJxkZI17Y9foquW3EuvLa5IIoxFCLZ2 pW5dnhj5HFNXhYy7085BAuRpolgRJh2m6l/Eecyp4s9XDERSr1j++2p49g7dxZ2s/Vyw== X-Google-Smtp-Source: AGHT+IGl4+5cwit28zyS+FC7ckzBQFpj70pEn0gT8nTWI/cUrgY36FnvSC8pIPig+JQtSEWYV8J2iw== X-Received: by 2002:a17:907:3f98:b0:b73:78f3:15b3 with SMTP id a640c23a62f3a-b80371f98f1mr1136707266b.47.1766421482573; Mon, 22 Dec 2025 08:38:02 -0800 (PST) From: Oleksii Kurochko To: xen-devel@lists.xenproject.org Cc: Oleksii Kurochko , Alistair Francis , Bob Eshleman , Connor Davis , Andrew Cooper , Anthony PERARD , Michal Orzel , Jan Beulich , Julien Grall , =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= , Stefano Stabellini Subject: [PATCH v8 3/3] xen/riscv: update p2m_set_entry() to free unused metadata pages Date: Mon, 22 Dec 2025 17:37:49 +0100 Message-ID: <898c078cb84f8cf0b24ea1c61480b264a4da6ba5.1766406895.git.oleksii.kurochko@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass (identity @gmail.com) X-ZM-MESSAGEID: 1766421501547158500 Content-Type: text/plain; 
charset="utf-8"

Introduce tracking of metadata page entry usage and free the metadata page
once all of its entries are p2m_invalid.

Intermediate P2M page tables are allocated with MEMF_no_owner, so we are free
to repurpose struct page_info fields for them. Since page_info.u.* is not
used for such pages, introduce a used_entries counter in struct page_info
to track how many metadata entries are in use for a given intermediate P2M
page table.

The counter is updated in p2m_set_type() when metadata entries transition
between p2m_invalid and a valid external type. When the last metadata entry
is cleared (used_entries == 0), the associated metadata page is freed and
returned to the P2M pool.

Refactor metadata page freeing into a new helper, p2m_free_metadata_page(),
as the same logic is needed both when tearing down a P2M table and when
all metadata entries become p2m_invalid in p2m_set_type(). As part of this
refactoring, move the declaration of p2m_free_page() earlier to satisfy the
new helper.

Additionally, implement page_set_tlbflush_timestamp() for RISC-V instead of
BUGing, as it is invoked when returning memory to the domheap.

Suggested-by: Jan Beulich
Signed-off-by: Oleksii Kurochko
Acked-by: Jan Beulich
---
Changes in V9:
- Add Acked-by: Jan Beulich.
---
Changes in V8:
- New patch.
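The counting scheme described in the commit message above can be sketched
outside of Xen as follows. This is a minimal, self-contained C illustration
under stated assumptions, not the code from this patch: struct toy_table,
toy_set_type(), the TOY_* type values and ENTRIES are hypothetical stand-ins,
calloc()/free() replace the P2M pool allocator, and the PTE type bits are
modelled as a plain return value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical number of entries per page table. */
#define ENTRIES 512

/* Hypothetical stand-ins for p2m_type_t values; the numbers are made up. */
enum toy_type {
    TOY_INVALID = 0,        /* must be 0 so a zeroed metadata array is "invalid" */
    TOY_RAM = 1,            /* a type that fits into the PTE bits */
    TOY_EXT_STORAGE = 2,    /* PTE marker: "real type lives in metadata" */
    TOY_FIRST_EXTERNAL = 3, /* first type stored outside the PTE */
    TOY_EXT_A = 3,
    TOY_EXT_B = 4
};

/* Models one intermediate page table plus its optional metadata. */
struct toy_table {
    unsigned char *md;          /* metadata array, allocated on demand */
    unsigned long used_entries; /* number of md[] slots != TOY_INVALID */
};

/*
 * Record type t for slot index. Returns the value that would be written
 * into the PTE type bits, or -1 if the metadata array cannot be allocated.
 */
static int toy_set_type(struct toy_table *tbl, unsigned int index,
                        enum toy_type t)
{
    int pte_type = t;

    assert(index < ENTRIES);

    /* Allocate metadata lazily, only when an external type is stored. */
    if ( !tbl->md && t >= TOY_FIRST_EXTERNAL )
    {
        tbl->md = calloc(ENTRIES, sizeof(*tbl->md));
        if ( !tbl->md )
            return -1;
    }

    if ( t >= TOY_FIRST_EXTERNAL )
    {
        /* invalid -> external transition: one more slot in use. */
        if ( tbl->md[index] == TOY_INVALID )
            tbl->used_entries++;

        tbl->md[index] = (unsigned char)t;
        pte_type = TOY_EXT_STORAGE;
    }
    else if ( tbl->md )
    {
        /* external -> in-PTE transition: one slot fewer in use. */
        if ( tbl->md[index] != TOY_INVALID )
            tbl->used_entries--;

        tbl->md[index] = TOY_INVALID;
    }

    /* The last used slot was cleared: release the metadata array. */
    if ( tbl->md && !tbl->used_entries )
    {
        free(tbl->md);
        tbl->md = NULL;
    }

    return pte_type;
}
```

In this sketch, overwriting one external type with another leaves the
counter unchanged; only transitions to or from TOY_INVALID adjust it, which
mirrors the behaviour the commit message describes for used_entries.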
---
 xen/arch/riscv/include/asm/flushtlb.h |  2 +-
 xen/arch/riscv/include/asm/mm.h       | 12 ++++++++++
 xen/arch/riscv/p2m.c                  | 32 +++++++++++++++++++++------
 3 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/xen/arch/riscv/include/asm/flushtlb.h b/xen/arch/riscv/include/asm/flushtlb.h
index ab32311568ac..4f64f9757058 100644
--- a/xen/arch/riscv/include/asm/flushtlb.h
+++ b/xen/arch/riscv/include/asm/flushtlb.h
@@ -38,7 +38,7 @@ static inline void tlbflush_filter(cpumask_t *mask, uint32_t page_timestamp) {}
 
 static inline void page_set_tlbflush_timestamp(struct page_info *page)
 {
-    BUG_ON("unimplemented");
+    page->tlbflush_timestamp = tlbflush_current_time();
 }
 
 static inline void arch_flush_tlb_mask(const cpumask_t *mask)
diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 48162f5d65cd..a005d0247a6f 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -113,6 +113,18 @@ struct page_info
             unsigned long type_info;
         } inuse;
 
+        /* Page is used as an intermediate P2M page table: count_info == 0 */
+        struct {
+            /*
+             * Tracks the number of used entries in the metadata page table.
+             *
+             * If used_entries == 0, then `page_info.v.md.pg` can be freed and
+             * returned to the P2M pool.
+             */
+            unsigned long used_entries;
+        } md;
+
+
         /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
         union {
             struct {
diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
index 24dd07165bd1..a6e4c01b873d 100644
--- a/xen/arch/riscv/p2m.c
+++ b/xen/arch/riscv/p2m.c
@@ -51,6 +51,18 @@ static struct gstage_mode_desc __ro_after_init max_gstage_mode = {
     .name = "Bare",
 };
 
+static void p2m_free_page(struct p2m_domain *p2m, struct page_info *pg);
+
+static inline void p2m_free_metadata_page(struct p2m_domain *p2m,
+                                          struct page_info **md_pg)
+{
+    if ( *md_pg )
+    {
+        p2m_free_page(p2m, *md_pg);
+        *md_pg = NULL;
+    }
+}
+
 unsigned char get_max_supported_mode(void)
 {
     return max_gstage_mode.mode;
@@ -450,16 +462,27 @@ static void p2m_set_type(pte_t *pte, p2m_type_t t,
 
     if ( t >= p2m_first_external )
     {
+        if ( metadata[ctx->index].type == p2m_invalid )
+            ctx->pt_page->u.md.used_entries++;
+
         metadata[ctx->index].type = t;
 
         t = p2m_ext_storage;
     }
     else if ( metadata )
+    {
+        if ( metadata[ctx->index].type != p2m_invalid )
+            ctx->pt_page->u.md.used_entries--;
+
         metadata[ctx->index].type = p2m_invalid;
+    }
 
     pte->pte |= MASK_INSR(t, P2M_TYPE_PTE_BITS_MASK);
 
     unmap_domain_page(metadata);
+
+    if ( *md_pg && !ctx->pt_page->u.md.used_entries )
+        p2m_free_metadata_page(ctx->p2m, md_pg);
 }
 
 /*
@@ -626,18 +649,13 @@ static pte_t page_to_p2m_table(const struct page_info *page)
     return p2m_pte_from_mfn(page_to_mfn(page), p2m_invalid, NULL);
 }
 
-static void p2m_free_page(struct p2m_domain *p2m, struct page_info *pg);
-
 /*
  * Free page table's page and metadata page linked to page table's page.
  */
 static void p2m_free_table(struct p2m_domain *p2m, struct page_info *tbl_pg)
 {
-    if ( tbl_pg->v.md.pg )
-    {
-        p2m_free_page(p2m, tbl_pg->v.md.pg);
-        tbl_pg->v.md.pg = NULL;
-    }
+    p2m_free_metadata_page(p2m, &tbl_pg->v.md.pg);
+
     p2m_free_page(p2m, tbl_pg);
 }
 
-- 
2.52.0