From: Bernhard Kaindl
To: xen-devel@lists.xenproject.org
Cc: Bernhard Kaindl, Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné, Stefano Stabellini, Marcus Granado, Alejandro Vallejo
Subject: [PATCH v2 5/7] xen/page_alloc: Create per-node outstanding claims
Date: Sat, 16 Aug 2025 13:19:31 +0200
Message-ID: <646f3bdbbfef5ace7902ca18c532b5518612f36e.1755341947.git.bernhard.kaindl@cloud.com>

Extend domain_set_outstanding_pages() to allow staking claims on a
specific NUMA node instead of host-wide:

A claim on a specific NUMA node is the amount of d->outstanding_pages
where the new field d->claim_node is not NUMA_NO_NODE.

We use the most straightforward implementation to minimise the amount
of changes in this commit and the rest of the series:

In the next series, which converts the claims handling to multi-node
claims, this will of course be converted into another structure.

It helps to keep this commit focused on the central challenge of the
new type of claim and leaves extending claims to multi-node claims
for the next series.

Also extend get_free_buddy() for when it circles round-robin over nodes:
Make it skip NUMA nodes that do not have enough unclaimed memory left.

---
Changes since v1:
- Join all conditions into a single if clause
- Improve the function description and comments
- Use const when passing struct domain when applicable
- Renamed pernode_oc[] to per_node_outstanding_claims[]
- Reject invalid node IDs in domain_set_outstanding_pages()
- Use nodeid_t instead of unsigned int for the claim_node field.
- Removed dependency on MEMF_EXACT_NODE (checked in get_free_buddy())
- Added awareness for honoring NUMA claims to get_free_buddy()

Signed-off-by: Bernhard Kaindl
Signed-off-by: Marcus Granado
Signed-off-by: Alejandro Vallejo
---
 xen/common/page_alloc.c | 37 +++++++++++++++++++++++++++++++++++--
 xen/include/xen/sched.h |  1 +
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index e8ba21dc46..63ecd74dcc 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -491,6 +491,7 @@ static unsigned long per_node_avail_pages[MAX_NUMNODES];
 
 static DEFINE_SPINLOCK(heap_lock);
 static long outstanding_claims; /* total outstanding claims by all domains */
+static unsigned long per_node_outstanding_claims[MAX_NUMNODES];
 
 static unsigned long avail_heap_pages(
     unsigned int zone_lo, unsigned int zone_hi, unsigned int node)
@@ -532,8 +533,12 @@ unsigned long domain_adjust_tot_pages(struct domain *d, nodeid_t node,
      *
      * If the domain has no outstanding claims (or we freed pages instead),
      * we don't update outstanding claims and skip the claims adjustment.
+     *
+     * Also don't update outstanding claims when the domain has node-specific
+     * claims, but the memory allocation was from a different NUMA node.
      */
-    if ( !d->outstanding_pages || pages <= 0 )
+    if ( !d->outstanding_pages || pages <= 0 ||
+         (d->claim_node != NUMA_NO_NODE && d->claim_node != node) )
         goto out;
 
     spin_lock(&heap_lock);
@@ -544,6 +549,8 @@ unsigned long domain_adjust_tot_pages(struct domain *d, nodeid_t node,
      */
     adjustment = min(d->outstanding_pages, (unsigned int)pages);
     d->outstanding_pages -= adjustment;
+    if ( d->claim_node != NUMA_NO_NODE ) /* adjust the static per-node claims */
+        per_node_outstanding_claims[d->claim_node] -= adjustment;
     outstanding_claims -= adjustment;
     spin_unlock(&heap_lock);
 
@@ -557,6 +564,9 @@ int domain_set_outstanding_pages(struct domain *d, nodeid_t node,
     int ret = -ENOMEM;
     unsigned long avail_pages;
 
+    if ( node != NUMA_NO_NODE && !node_online(node) )
+        return -EINVAL;
+
     /*
      * take the domain's page_alloc_lock, else all d->tot_page adjustments
      * must always take the global heap_lock rather than only in the much
@@ -569,6 +579,10 @@ int domain_set_outstanding_pages(struct domain *d, nodeid_t node,
     if ( pages == 0 )
     {
         outstanding_claims -= d->outstanding_pages;
+
+        if ( d->claim_node != NUMA_NO_NODE )
+            per_node_outstanding_claims[d->claim_node] -= d->outstanding_pages;
+
         d->outstanding_pages = 0;
         ret = 0;
         goto out;
@@ -591,12 +605,26 @@ int domain_set_outstanding_pages(struct domain *d, nodeid_t node,
     /* how much memory is available? */
     avail_pages = total_avail_pages - outstanding_claims;
 
+    /* This check can't be skipped for the NUMA case, or we may overclaim */
     if ( pages > avail_pages )
         goto out;
 
+    if ( node != NUMA_NO_NODE )
+    {
+        avail_pages = per_node_avail_pages[node] - per_node_outstanding_claims[node];
+
+        if ( pages > avail_pages )
+            goto out;
+    }
+
     /* yay, claim fits in available memory, stake the claim, success! */
     d->outstanding_pages = pages;
     outstanding_claims += d->outstanding_pages;
+    d->claim_node = node;
+
+    if ( node != NUMA_NO_NODE )
+        per_node_outstanding_claims[node] += pages;
+
     ret = 0;
 
 out:
@@ -934,7 +962,12 @@ static struct page_info *get_free_buddy(unsigned int zone_lo,
         zone = zone_hi;
         do {
             /* Check if target node can support the allocation. */
-            if ( !avail[node] || (avail[node][zone] < (1UL << order)) )
+            if ( !avail[node] || (avail[node][zone] < (1UL << order)) ||
+                 /* For host-wide allocations, skip nodes without enough
+                  * unclaimed memory. */
+                 (req_node == NUMA_NO_NODE && outstanding_claims &&
+                  ((per_node_avail_pages[node] -
+                    per_node_outstanding_claims[node]) < (1UL << order))) )
                 continue;
 
             /* Find smallest order which can satisfy the request. */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index fd5c9f9333..9535ed7a6a 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -406,6 +406,7 @@ struct domain
     unsigned int     max_pages;         /* maximum value for domain_tot_pages() */
     unsigned int     extra_pages;       /* pages not included in domain_tot_pages() */
 
+    nodeid_t         claim_node;        /* NUMA_NO_NODE for host-wide claims */
 #ifdef CONFIG_MEM_SHARING
     atomic_t         shr_pages;         /* shared pages */
 #endif
-- 
2.43.0
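
Illustration only, not part of the patch: the two-level check that the
patch adds to domain_set_outstanding_pages() (host-wide first, then
per-node) can be sketched as a small user-space model. Everything named
model_* or MODEL_* is a hypothetical stand-in invented for this sketch.
Locking is omitted, and the "one claim per domain at a time" rule is an
assumption of the model, since that part of the function is not visible
in the diff above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the hypervisor's real constants. */
#define MODEL_MAX_NODES 4
#define MODEL_NO_NODE   0xFFu

struct model_heap {
    unsigned long total_avail;                  /* free pages host-wide */
    unsigned long total_claims;                 /* sum of all claims */
    unsigned long node_avail[MODEL_MAX_NODES];  /* free pages per node */
    unsigned long node_claims[MODEL_MAX_NODES]; /* node-specific claims */
};

struct model_domain {
    unsigned long outstanding_pages; /* pages claimed, not yet allocated */
    unsigned int claim_node;         /* MODEL_NO_NODE for host-wide */
};

/* Drop the domain's claim from the host-wide and per-node totals. */
static void model_release_claim(struct model_heap *h, struct model_domain *d)
{
    assert(h->total_claims >= d->outstanding_pages);
    h->total_claims -= d->outstanding_pages;

    if ( d->claim_node != MODEL_NO_NODE )
    {
        assert(h->node_claims[d->claim_node] >= d->outstanding_pages);
        h->node_claims[d->claim_node] -= d->outstanding_pages;
    }

    d->outstanding_pages = 0;
    d->claim_node = MODEL_NO_NODE;
}

/* Stake a claim; returns false if the node is invalid or it does not fit. */
static bool model_stake_claim(struct model_heap *h, struct model_domain *d,
                              unsigned int node, unsigned long pages)
{
    if ( node != MODEL_NO_NODE && node >= MODEL_MAX_NODES )
        return false;

    if ( d->outstanding_pages ) /* model assumption: one claim at a time */
        return false;

    /* The host-wide check applies to node-specific claims as well. */
    assert(h->total_claims <= h->total_avail);
    if ( pages > h->total_avail - h->total_claims )
        return false;

    /* A node-specific claim must also fit in that node's unclaimed memory. */
    if ( node != MODEL_NO_NODE )
    {
        assert(h->node_claims[node] <= h->node_avail[node]);
        if ( pages > h->node_avail[node] - h->node_claims[node] )
            return false;
    }

    d->outstanding_pages = pages;
    d->claim_node = node;
    h->total_claims += pages;
    if ( node != MODEL_NO_NODE )
        h->node_claims[node] += pages;

    return true;
}
```

In this model, with 100 free pages split 60/40 over two nodes, a 30-page
claim on node 1 leaves only 10 unclaimed pages there, so a second 20-page
claim on node 1 is refused even though the host as a whole still has 70
unclaimed pages.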