From: Bernhard Kaindl
To: xen-devel@lists.xenproject.org
Cc: Alejandro Vallejo, Bernhard Kaindl, Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Roger Pau Monné, Shawn Anastasio, Alistair Francis, Bob Eshleman, Connor Davis, Oleksii
Kurochko, Jan Beulich
Subject: [PATCH v3 1/7] xen/numa: Add per_node() variables paralleling per_cpu() variables
Date: Sun, 7 Sep 2025 18:15:16 +0200
Message-ID: <2a2e557f84ba4785f3f8788d31d3edf64e689da0.1757261045.git.bernhard.kaindl@cloud.com>

During the review of the 3rd commit of the NUMA claims v1 series, it was
found concerning (performance-wise) to add another array like this, which
is randomly written from all nodes:

+/* Per-node counts of free pages */
+static unsigned long pernode_avail_pages[MAX_NUMNODES];

As a solution, it was suggested to introduce per_node(), paralleling
per_cpu(), or (less desirably) to make sure one particular cache line
would only ever be written from a single node. It was mentioned that
node_need_scrub[] could/should use it, and I assume others may benefit too.

per_cpu() is a simple, standard blueprint that is easy to copy, so add
per_node(), paralleling per_cpu(), as the preferred suggestion. It is
entirely derived from per_cpu(), with a few differences:

- No add/remove callback: nodes are onlined at boot and never offlined.
- As per_node(avail_pages) and per_node(outstanding_claims) are used by
  the buddy allocator itself, and the buddy allocator is used to allocate
  the per_node() memory from the local NUMA node, there is a catch:
  per_node() must already be working to have a working buddy allocator.
- Hence, init per_node() before the buddy allocator is ready, as it needs
  to be set up before its first use, e.g. to init per_node(avail_pages).

Use an early static __initdata array during early boot and migrate it to
the NUMA-node-local xenheap before we enable the secondary CPUs.

Cc: Jan Beulich
Signed-off-by: Bernhard Kaindl
---
Changes:
- This patch is new in v3, to resolve the suggestion from the review.
- The previous patch #2 is removed from the series as not required, which
  is best visualized by how claims are used:
  - Claim the needed memory
  - Allocate all domain memory
  - Cancel a possible leftover claim
  - Finish building the domain and unpause it.
  As it makes no sense to repeat "Claim needed memory" at any time, the
  change it made had no practical significance. It can be applied later as
  a tiny, unimportant cleanup, e.g. with multi-node claims.

Implementation note on this patch (not needed for the commit message):
Instead of the __initdata array, I tried to allocate bootmem, but it caused
paging_init() to panic with not enough memory for the p2m on a very large
4-socket, 480-pCPU, 4 TiB RAM host (or it caused boot to hang after the
microcode updates of the 480 pCPUs). The static __initdata array is freed
after init and does not affect bootmem allocation.

PS: Yes, node_need_scrub[] should use per_node() too; do it after this series.
---
 xen/arch/arm/xen.lds.S    |  1 +
 xen/arch/ppc/xen.lds.S    |  1 +
 xen/arch/riscv/xen.lds.S  |  1 +
 xen/arch/x86/xen.lds.S    |  1 +
 xen/common/numa.c         | 53 ++++++++++++++++++++++++++++++++++++++-
 xen/include/xen/numa.h    | 15 +++++++++++
 xen/include/xen/xen.lds.h |  8 ++++++
 7 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index db17ff1efa..d296a95dd3 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -176,6 +176,7 @@ SECTIONS
        *(.bss.stack_aligned)
        *(.bss.page_aligned)
        PERCPU_BSS
+       PERNODE_BSS
        *(.bss .bss.*)
        . = ALIGN(POINTER_ALIGN);
        __bss_end = .;
diff --git a/xen/arch/ppc/xen.lds.S b/xen/arch/ppc/xen.lds.S
index 1de0b77fc6..29d1b5da58 100644
--- a/xen/arch/ppc/xen.lds.S
+++ b/xen/arch/ppc/xen.lds.S
@@ -151,6 +151,7 @@ SECTIONS
        *(.bss.stack_aligned)
        *(.bss.page_aligned)
        PERCPU_BSS
+       PERNODE_BSS
        *(.bss .bss.*)
        . = ALIGN(POINTER_ALIGN);
        __bss_end = .;
diff --git a/xen/arch/riscv/xen.lds.S b/xen/arch/riscv/xen.lds.S
index edcadff90b..e154427353 100644
--- a/xen/arch/riscv/xen.lds.S
+++ b/xen/arch/riscv/xen.lds.S
@@ -146,6 +146,7 @@ SECTIONS
        *(.bss.stack_aligned)
        *(.bss.page_aligned)
        PERCPU_BSS
+       PERNODE_BSS
        *(.sbss .sbss.* .bss .bss.*)
        . = ALIGN(POINTER_ALIGN);
        __bss_end = .;
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index 966e514f20..95040cd516 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -327,6 +327,7 @@ SECTIONS
        __bss_start = .;
        *(.bss.page_aligned*)
        PERCPU_BSS
+       PERNODE_BSS
        *(.bss .bss.*)
        . = ALIGN(POINTER_ALIGN);
        __bss_end = .;
diff --git a/xen/common/numa.c b/xen/common/numa.c
index ad75955a16..5e66471159 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -320,6 +320,51 @@ static bool __init nodes_cover_memory(void)
     return true;
 }
 
+/* Defined on the BSS in xen.lds.S, used for area sizes and relative offsets */
+extern const char __pernode_start[];
+extern const char __pernode_end[];
+
+unsigned long __read_mostly __pernode_offset[MAX_NUMNODES];
+
+#define EARLY_PERNODE_AREA_SIZE (SMP_CACHE_BYTES)
+
+static char early_pernode_area[MAX_NUMNODES][EARLY_PERNODE_AREA_SIZE]
+    __initdata __cacheline_aligned;
+
+/* per_node() needs to be ready before the first alloc call using the heap */
+static void __init early_init_pernode_areas(void)
+{
+    unsigned int node;
+
+    if (__pernode_end - __pernode_start > EARLY_PERNODE_AREA_SIZE)
+        panic("per_node() area too small, increase EARLY_PERNODE_AREA_SIZE");
+
+    for_each_online_node(node)
+        __pernode_offset[node] = early_pernode_area[node] - __pernode_start;
+}
+
+/* Before going SMP, migrate the per_node memory areas to their NUMA nodes */
+static int __init init_pernode_areas(void)
+{
+    const int pernode_size = __pernode_end - __pernode_start; /* size in BSS */
+    unsigned int node;
+
+    for_each_online_node(node)
+    {
+        char *p = alloc_xenheap_pages(get_order_from_bytes(pernode_size),
+                                      MEMF_node(node));
+
+        if ( !p )
+            return -ENOMEM;
+        /* migrate the pernode data from the bootmem area to the xenheap */
+        memcpy(p, early_pernode_area[node], SMP_CACHE_BYTES);
+        __pernode_offset[node] = p - __pernode_start;
+    }
+    return 0;
+}
+
+presmp_initcall(init_pernode_areas);
+
 /* Use discovered information to actually set up the nodes. */
 static bool __init numa_process_nodes(paddr_t start, paddr_t end)
 {
@@ -617,7 +662,7 @@ static int __init numa_emulation(unsigned long start_pfn,
 }
 #endif
 
-void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
+static void __init init_nodes(unsigned long start_pfn, unsigned long end_pfn)
 {
     unsigned int i;
     paddr_t start = pfn_to_paddr(start_pfn);
@@ -656,6 +701,12 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
     setup_node_bootmem(0, start, end);
 }
 
+void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
+{
+    init_nodes(start_pfn, end_pfn);
+    early_init_pernode_areas(); /* With all nodes registered, init per_node() */
+}
+
 void numa_add_cpu(unsigned int cpu)
 {
     cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index f6c1f27ca1..729c400d64 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -152,4 +152,19 @@ static inline nodeid_t mfn_to_nid(mfn_t mfn)
 
 #define page_to_nid(pg) mfn_to_nid(page_to_mfn(pg))
 
+/* Per NUMA node data area handling based on per-cpu data area handling. */
+extern unsigned long __pernode_offset[];
+
+#define DECLARE_PER_NODE(type, name) \
+    extern __typeof__(type) pernode__ ## name
+
+#define __DEFINE_PER_NODE(attr, type, name) \
+    attr __typeof__(type) pernode_ ## name
+
+#define DEFINE_PER_NODE(type, name) \
+    __DEFINE_PER_NODE(__section(".bss.pernode"), type, _ ## name)
+
+#define per_node(var, node) \
+    (*RELOC_HIDE(&pernode__##var, __pernode_offset[node]))
+
 #endif /* _XEN_NUMA_H */
diff --git a/xen/include/xen/xen.lds.h b/xen/include/xen/xen.lds.h
index b126dfe887..a32423dcec 100644
--- a/xen/include/xen/xen.lds.h
+++ b/xen/include/xen/xen.lds.h
@@ -174,6 +174,14 @@
 #define LOCK_PROFILE_DATA
 #endif
 
+/* Per-node BSS for declaring per_node vars, based on per_cpu, but simpler */
+#define PERNODE_BSS                  \
+    . = ALIGN(PAGE_SIZE);            \
+    __pernode_start = .;             \
+    *(.bss.pernode)                  \
+    . = ALIGN(SMP_CACHE_BYTES);      \
+    __pernode_end = .;               \
+
 #define PERCPU_BSS                   \
     . = ALIGN(PAGE_SIZE);            \
     __per_cpu_start = .;             \
-- 
2.43.0
From: Bernhard Kaindl
To: xen-devel@lists.xenproject.org
Cc: Alejandro Vallejo, Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné, Stefano Stabellini, Jan Beulich
Subject: [PATCH v3 2/7] xen/page_alloc: Simplify domain_adjust_tot_pages() further
Date: Sun, 7 Sep 2025 18:15:17 +0200
Message-ID: <15ae395c6933e74da0cdd8f9d71d349a7bfad3f3.1757261045.git.bernhard.kaindl@cloud.com>

When domain memory is allocated, domain_adjust_tot_pages() also reduces
the domain's outstanding claim. Replace the checks that prevent reducing
the claim below 0 with min(), which prevents the claim from becoming
negative (and also from being over-consumed for the node and globally).

Cc: Jan Beulich
Signed-off-by: Bernhard Kaindl
---
Changes:
- Was added as 2/7 in v2; the review feedback by Jan Beulich is applied.
---
 xen/common/page_alloc.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 1f67b88a89..e056624583 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -510,8 +510,15 @@ static unsigned long avail_heap_pages(
     return free_pages;
 }
 
+/*
+ * Update the total number of pages and outstanding claims of a domain.
+ * - When pages were freed, we do not increase outstanding claims.
+ * - On a domain's claims update, global outstanding_claims are updated as well.
+ */
 unsigned long domain_adjust_tot_pages(struct domain *d, long pages)
 {
+    unsigned long adjustment;
+
     ASSERT(rspin_is_locked(&d->page_alloc_lock));
     d->tot_pages += pages;
 
@@ -519,23 +526,22 @@ unsigned long domain_adjust_tot_pages(struct domain *d, long pages)
      * can test d->outstanding_pages race-free because it can only change
      * if d->page_alloc_lock and heap_lock are both held, see also
      * domain_set_outstanding_pages below
+     *
+     * If the domain has no outstanding claims (or we freed pages instead),
+     * we don't update outstanding claims and skip the claims adjustment.
      */
     if ( !d->outstanding_pages || pages <= 0 )
         goto out;
 
     spin_lock(&heap_lock);
     BUG_ON(outstanding_claims < d->outstanding_pages);
-    if ( d->outstanding_pages < pages )
-    {
-        /* `pages` exceeds the domain's outstanding count. Zero it out. */
-        outstanding_claims -= d->outstanding_pages;
-        d->outstanding_pages = 0;
-    }
-    else
-    {
-        outstanding_claims -= pages;
-        d->outstanding_pages -= pages;
-    }
+    /*
+     * Reduce claims by outstanding claims or pages (whichever is smaller):
+     * If allocated > outstanding, reduce the claims only by outstanding pages.
+     */
+    adjustment = min(d->outstanding_pages + 0UL, pages + 0UL);
+    d->outstanding_pages -= adjustment;
+    outstanding_claims -= adjustment;
     spin_unlock(&heap_lock);
 
  out:
-- 
2.43.0
From: Bernhard Kaindl
To: xen-devel@lists.xenproject.org
Cc: Alejandro Vallejo, Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné, Stefano Stabellini
Subject: [PATCH v3 3/7] xen/page_alloc: Add and track per_node(avail_pages)
Date: Sun, 7 Sep 2025 18:15:18 +0200

From: Alejandro Vallejo

The per-NUMA-node count of free pages is the sum of the free memory in all
zones of a node. It is an optimisation to avoid computing that sum
frequently in the following patches, which introduce per-NUMA-node claims.

Signed-off-by: Alejandro Vallejo
Signed-off-by: Bernhard Kaindl
---
Changes in v2:
- Added ASSERT(per_node(avail_pages, node) >= request) as requested during
  review by Roger.
  Comment by me: As we have ASSERT(avail[node][zone] >= request); directly
  before it, the request is already valid, so this checks that
  per_node(avail_pages, node) is not mis-accounted too low.
Changes in v3:
- Converted from a static array to per_node(avail_pages, node)
---
 xen/common/page_alloc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index e056624583..b8acb500da 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -486,6 +486,10 @@ static unsigned long node_need_scrub[MAX_NUMNODES];
 static unsigned long *avail[MAX_NUMNODES];
 static long total_avail_pages;
 
+/* Per-NUMA-node counts of free pages */
+DECLARE_PER_NODE(unsigned long, avail_pages);
+DEFINE_PER_NODE(unsigned long, avail_pages);
+
 static DEFINE_SPINLOCK(heap_lock);
 static long outstanding_claims; /* total outstanding claims by all domains */
 
@@ -1074,6 +1078,8 @@ static struct page_info *alloc_heap_pages(
 
     ASSERT(avail[node][zone] >= request);
     avail[node][zone] -= request;
+    ASSERT(per_node(avail_pages, node) >= request);
+    per_node(avail_pages, node) -= request;
     total_avail_pages -= request;
     ASSERT(total_avail_pages >= 0);
 
@@ -1234,6 +1240,8 @@ static int reserve_offlined_page(struct page_info *head)
             continue;
 
         avail[node][zone]--;
+        ASSERT(per_node(avail_pages, node) > 0);
+        per_node(avail_pages, node)--;
         total_avail_pages--;
         ASSERT(total_avail_pages >= 0);
 
@@ -1558,6 +1566,7 @@ static void free_heap_pages(
     }
 
     avail[node][zone] += 1 << order;
+    per_node(avail_pages, node) += 1 << order;
     total_avail_pages += 1 << order;
     if ( need_scrub )
     {
-- 
2.43.0
/WFdcq2oYHzXTeAIFhgdJ/iBf6Fyrzr4JlyapqMSRDbMo9XUFt6dPDQPD/wbj8ZX+1LO h5X/LefUNCUI0kShFtHU6W4g8zMbXP2df9VDMO2Wc9Kiu+ksTCZtR/IrsuK8tEhiHvst E1PA6C2i+ZSjh3Fcx8aZ4r5cPnQ82SifSkrdQYyp9LvLxxPRGU7BSZnxiQUg/GjjDP6d ybvCScy4IcrSpYc7TYHfeKSRuqipzEh0X06tqBqPEFDSAeX83xHdcnv8PUozKMKljry4 8MZA== X-Gm-Message-State: AOJu0YwY8Es4+rAdbvF23sv+BN5w+6sJvwYjjUr0JNkRX1ptcUIbaDE1 acmOqyCTMG7fW5EvdOV/13qzF5NYSSCF+vkxhPg6CaP0AFLyho91GdTDzGYpCFb2mAuH01Bi6gV HTnsCawE= X-Gm-Gg: ASbGncs7pEa1VzyPnKtjuHR6YvGMLvcMxYQQCSDELaOwAL8nfJqbwmFCcprr08B9mMn imzoccsYKyYN8aFJoGt1gtGvTutAimPLsK72wgA1QUhwOJKkzsoe69lTtmOm2VyUm/7EpR0F9eN d51Lw7AQoRZmVQ5Xv1y7RbipXMs7GhlUDs+zB7oyGv7Y2WZCfSv5DMamf3VKg+Ng+avTO21Kd/0 yDEYcQr6hoAj0udbucsrj3mHJiJGauWa9qbXM+9yIP+/Nzt9XcMlQg4mv/zsL5DsiX6fyh/Hrnd WeHY9NEqoEofyGdChGGWyC9YmoLCDXV7UuFXTcxLf1bZyVStrXt9X0M+xDZs1iWOXEIlPzFUKm8 mG3lv5s0f5TW/qFEcYHKl6sI3NFezQ1U6fDIxHcz3sT0rATsProvO9q75JhF0oDmJqC0= X-Google-Smtp-Source: AGHT+IGJpDSkkIcfCORJaJVTeHzgRf+E7BrBIiM6Xe5wOoPEEV56CMMZIVXBq/xg2FvjtqZ9Wxu4/w== X-Received: by 2002:a17:906:d54a:b0:b04:827c:9139 with SMTP id a640c23a62f3a-b04b16e4f4dmr432603466b.65.1757261729960; Sun, 07 Sep 2025 09:15:29 -0700 (PDT) From: Bernhard Kaindl To: xen-devel@lists.xenproject.org Cc: Alejandro Vallejo , Bernhard Kaindl , Andrew Cooper , Anthony PERARD , Michal Orzel , Jan Beulich , Julien Grall , =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= , Stefano Stabellini Subject: [PATCH v3 4/7] xen/page_alloc: Add staking a NUMA node claim for a domain Date: Sun, 7 Sep 2025 18:15:19 +0200 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass (identity @cloud.com) X-ZM-MESSAGEID: 1757261760590116601 Content-Type: text/plain; charset="utf-8" Update domain_set_outstanding_pages() to domain_claim_pages() for staking claims for domains on NUMA nodes: domain_claim_pages() is a handler for claiming pages, where its former name suggested that it just sets the domain's 
outstanding claims. Actually, three different code locations perform exactly
this task. Fix this by adding a helper that only updates the domain's
outstanding pages: it removes the need to repeat the same sequence of
operations at three different places (an anti-pattern) and provides a single
location for adding multi-node claims. It also makes the code much shorter
and easier to follow.

Fix the meaning of the claims argument of domain_claim_pages() for
NUMA-node claims:
- For NUMA-node claims, we need to claim defined amounts of memory on
  different NUMA nodes. Previously, the argument was a "reservation" and
  the claim was made on the difference between d->tot_pages and the
  reservation. Of course, the argument needed to be > d->tot_pages.
  This interacts badly with NUMA claims: NUMA-node claims are not related
  to potentially already allocated memory, and reducing the claim by
  already allocated memory would not work when d->tot_pages already holds
  some pages.
- Fix this by simply claiming the given amount of pages.
- Update the legacy caller of domain_claim_pages() accordingly by moving
  the reduction of the claim by d->tot_pages into it: no change for the
  users of the legacy hypercall, and a usable interface for staking NUMA
  claims.

Signed-off-by: Bernhard Kaindl
Signed-off-by: Alejandro Vallejo
---
Changes in v3:
- Renamed domain_set_outstanding_pages() and added a check from review.
- Reorganized v3, v4 and v5 as per review to avoid non-functional changes:
  - Combined patch v2#2 with v2#5 into a consolidated patch.
  - Moved the unrelated changes for domain_adjust_tot_pages() to #5.
--- xen/common/domain.c | 2 +- xen/common/memory.c | 15 ++++++- xen/common/page_alloc.c | 93 ++++++++++++++++++++++++++++------------- xen/include/xen/mm.h | 3 +- xen/include/xen/sched.h | 1 + 5 files changed, 81 insertions(+), 33 deletions(-) diff --git a/xen/common/domain.c b/xen/common/domain.c index 775c339285..6ee9f23b10 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -1247,7 +1247,7 @@ int domain_kill(struct domain *d) rspin_barrier(&d->domain_lock); argo_destroy(d); vnuma_destroy(d->vnuma); - domain_set_outstanding_pages(d, 0); + domain_claim_pages(d, NUMA_NO_NODE, 0); /* fallthrough */ case DOMDYING_dying: rc =3D domain_teardown(d); diff --git a/xen/common/memory.c b/xen/common/memory.c index 3688e6dd50..3371edec11 100644 --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -1682,7 +1682,20 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDL= E_PARAM(void) arg) rc =3D xsm_claim_pages(XSM_PRIV, d); =20 if ( !rc ) - rc =3D domain_set_outstanding_pages(d, reservation.nr_extents); + { + unsigned long new_claim =3D reservation.nr_extents; + + /* + * For backwards compatibility, keep the meaning of nr_extents: + * it is the target number of pages for the domain. + * In case memory for the domain was allocated before, we must + * subtract the already allocated pages from the reservation.
+ */ + if ( new_claim ) + new_claim -=3D domain_tot_pages(d); + + rc =3D domain_claim_pages(d, NUMA_NO_NODE, new_claim); + } =20 rcu_unlock_domain(d); =20 diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index b8acb500da..bbb34994b7 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -492,6 +492,30 @@ DEFINE_PER_NODE(unsigned long, avail_pages); =20 static DEFINE_SPINLOCK(heap_lock); static long outstanding_claims; /* total outstanding claims by all domains= */ +DECLARE_PER_NODE(long, outstanding_claims); +DEFINE_PER_NODE(long, outstanding_claims); + +#define domain_has_node_claim(d) (d->claim_node !=3D NUMA_NO_NODE) + +static inline bool insufficient_memory(nodeid_t node, unsigned long reques= t) +{ + return per_node(avail_pages, node) - + per_node(outstanding_claims, node) < request; +} + +/* + * Adjust the claim of a domain host-wide and, if set, for the claimed node + * + * All callers already hold d->page_alloc_lock and the heap_lock. + */ +static inline void domain_adjust_outstanding_claim(struct domain *d, long = pages) +{ + outstanding_claims +=3D pages; /* Update the host-wide outstanding c= laims */ + d->outstanding_pages +=3D pages; /* Update the domain's outstanding cl= aims */ + + if ( domain_has_node_claim(d) ) /* Update the claims of that node */ + per_node(outstanding_claims, d->claim_node) +=3D pages; +} =20 static unsigned long avail_heap_pages( unsigned int zone_lo, unsigned int zone_hi, unsigned int node) @@ -529,7 +553,7 @@ unsigned long domain_adjust_tot_pages(struct domain *d,= long pages) /* * can test d->outstanding_pages race-free because it can only change * if d->page_alloc_lock and heap_lock are both held, see also * domain_claim_pages below * * If the domain has no outstanding claims (or we freed pages instead), * we don't update outstanding claims and skip the claims adjustment.
@@ -544,18 +568,37 @@ unsigned long domain_adjust_tot_pages(struct domain *= d, long pages) * If allocated > outstanding, reduce the claims only by outstanding p= ages. */ adjustment =3D min(d->outstanding_pages + 0UL, pages + 0UL); - d->outstanding_pages -=3D adjustment; - outstanding_claims -=3D adjustment; + + domain_adjust_outstanding_claim(d, -adjustment); spin_unlock(&heap_lock); =20 out: return d->tot_pages; } =20 -int domain_set_outstanding_pages(struct domain *d, unsigned long pages) +/* + * Stake a claim for memory for future allocations of a domain. + * + * The claim is an abstract stake on future memory allocations; + * no actual memory is allocated at this point. Instead, it guarantees + * that future allocations up to the claim's size will succeed. + * + * If node =3D=3D NUMA_NO_NODE, the claim is host-wide. + * Otherwise, it is local to the specific NUMA node defined by d->claim_no= de. + * + * It should normally only ever be called before allocating the memory of + * the domain. When libxenguest code has finished populating the memory of + * the domain, it passes 0 to release any remaining outstanding claims. + * + * Returns 0 on success, -EINVAL if the request is invalid, + * or -ENOMEM if the claim cannot be satisfied in available memory. + */ +int domain_claim_pages(struct domain *d, nodeid_t node, unsigned long clai= m) { - int ret =3D -ENOMEM; - unsigned long claim, avail_pages; + int ret =3D -EINVAL; + + if ( node !=3D NUMA_NO_NODE && !node_online(node) ) + goto out; /* passed node is not valid */ =20 /* * take the domain's page_alloc_lock, else all d->tot_page adjustments @@ -565,45 +608,35 @@ int domain_set_outstanding_pages(struct domain *d, un= signed long pages) nrspin_lock(&d->page_alloc_lock); spin_lock(&heap_lock); =20 - /* pages=3D=3D0 means "unset" the claim.
*/ + if ( claim =3D=3D 0 ) { - outstanding_claims -=3D d->outstanding_pages; - d->outstanding_pages =3D 0; + domain_adjust_outstanding_claim(d, -d->outstanding_pages); ret =3D 0; goto out; } =20 /* only one active claim per domain please */ if ( d->outstanding_pages ) - { - ret =3D -EINVAL; goto out; - } =20 - /* disallow a claim not exceeding domain_tot_pages() or above max_page= s */ - if ( (pages <=3D domain_tot_pages(d)) || (pages > d->max_pages) ) - { - ret =3D -EINVAL; + /* If we allocated for the domain already, the claim is on top of that= . */ + if ( (domain_tot_pages(d) + claim) > d->max_pages ) goto out; - } =20 - /* how much memory is available? */ - avail_pages =3D total_avail_pages; - - avail_pages -=3D outstanding_claims; + ret =3D -ENOMEM; + /* Check if the host-wide available memory is sufficient for this clai= m */ + if ( claim > total_avail_pages - outstanding_claims ) + goto out; =20 - /* - * Note, if domain has already allocated memory before making a claim - * then the claim must take domain_tot_pages() into account - */ - claim =3D pages - domain_tot_pages(d); - if ( claim > avail_pages ) + /* Check if the node's available memory is insufficient for this claim= */ + if ( node !=3D NUMA_NO_NODE && insufficient_memory(node, claim) ) goto out; =20 /* yay, claim fits in available memory, stake the claim, success!
*/ - d->outstanding_pages =3D claim; - outstanding_claims +=3D d->outstanding_pages; + d->claim_node =3D node; + domain_adjust_outstanding_claim(d, claim); + ret =3D 0; =20 out: diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h index b968f47b87..52c12c5783 100644 --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -65,6 +65,7 @@ #include #include #include +#include #include #include #include @@ -131,7 +132,7 @@ int populate_pt_range(unsigned long virt, unsigned long= nr_mfns); /* Claim handling */ unsigned long __must_check domain_adjust_tot_pages(struct domain *d, long pages); -int domain_set_outstanding_pages(struct domain *d, unsigned long pages); +int domain_claim_pages(struct domain *d, nodeid_t node, unsigned long page= s); void get_outstanding_claims(uint64_t *free_pages, uint64_t *outstanding_pa= ges); =20 /* Domain suballocator. These functions are *not* interrupt-safe.*/ diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 02bdc256ce..9b91261f20 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -405,6 +405,7 @@ struct domain unsigned int outstanding_pages; /* pages claimed but not possessed= */ unsigned int max_pages; /* maximum value for domain_tot_pa= ges() */ unsigned int extra_pages; /* pages not included in domain_to= t_pages() */ + nodeid_t claim_node; /* NUMA_NO_NODE for host-wide clai= ms */ =20 #ifdef CONFIG_MEM_SHARING atomic_t shr_pages; /* shared pages */ --=20 2.43.0 From nobody Wed Sep 10 06:05:35 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass(p=reject dis=none) 
header.from=cloud.com ARC-Seal: i=1; a=rsa-sha256; t=1757262068; cv=none; d=zohomail.com; s=zohoarc; b=EnU96bWM6JHpwJAoNH4re/WshJAjj+/AjizcCyM2oiHrFGyDptye2vPalQmJpV0t7uXUrL94rJupArhw04NN7YN0NWKGPi9Mxk8Kp/ksVzlfagx2Phxp+/ca0a96AofaoxZBXqKdQIWlMRVjQ9cpGP4Yfeqthd5bVvQMMfKO75c= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1757262068; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=kxqBH4UoXF5zKcyV4oRkehvh5BG1QBuBGrUOn4bw4TQ=; b=NLMmZX/AMpJRZ4MRba+ahQMb33W5cCQscShcY/gq+aD6hpVy+Wg8DqdKDf28ZcFoQhDNMSdsn2aam8XiL9tddrkZRv6tm7c7K4gmdb4mswI+4pk82twp4xwRpOf3+XzKmli5KSeKGakmjh1nM7vDE2n9nrN3FxgYIp1DCuAY8Cw= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass header.from= (p=reject dis=none) Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 1757262068196468.1703117883352; Sun, 7 Sep 2025 09:21:08 -0700 (PDT) Received: from list by lists.xenproject.org with outflank-mailman.1114188.1461334 (Exim 4.92) (envelope-from ) id 1uvI8I-00043b-50; Sun, 07 Sep 2025 16:20:50 +0000 Received: by outflank-mailman (output) from mailman id 1114188.1461334; Sun, 07 Sep 2025 16:20:50 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1uvI8H-00043U-Vy; Sun, 07 Sep 2025 16:20:49 +0000 Received: by outflank-mailman (input) for mailman id 1114188; Sun, 07 Sep 2025 16:20:48 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1uvI41-00009z-NJ for 
xen-devel@lists.xenproject.org; Sun, 07 Sep 2025 16:16:25 +0000 Received: from mail-ej1-x62e.google.com (mail-ej1-x62e.google.com [2a00:1450:4864:20::62e]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id ffa40c74-8c05-11f0-9809-7dc792cee155; Sun, 07 Sep 2025 18:16:24 +0200 (CEST) Received: by mail-ej1-x62e.google.com with SMTP id a640c23a62f3a-b04b869abb9so149781466b.1 for ; Sun, 07 Sep 2025 09:16:24 -0700 (PDT) Received: from MinisforumBD795m.phoenix-carat.ts.net ([2a02:1748:f7df:8cb1:5474:d7c3:6edd:e683]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b047b61cf00sm908263766b.15.2025.09.07.09.15.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 07 Sep 2025 09:16:22 -0700 (PDT) X-Outflank-Mailman: Message body and most headers restored to incoming version X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: ffa40c74-8c05-11f0-9809-7dc792cee155 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.com; s=cloud; t=1757261783; x=1757866583; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=kxqBH4UoXF5zKcyV4oRkehvh5BG1QBuBGrUOn4bw4TQ=; b=BaMyCOjDkwGZ2u73dbJ7SgDKbTt37YCQOSWkDjTDW/EGETYm2x8weqpPjSDcgHnSyz BIz77+tfhymkXZusWhxoBn31kwQo3DOc8PuPyfTOg8MK0DUFw1fzT35BJ/AZwi2Z4X3K lxiHHHwRlCBKjpmCV5Gkr+r0wvRtnZj3sjqMU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757261783; x=1757866583; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kxqBH4UoXF5zKcyV4oRkehvh5BG1QBuBGrUOn4bw4TQ=; b=VkHOKXqVAIEm2BQI5wdYdUIFvFzxJq59IaW8fNcgZlU7wsdlBnqjrNC2vn4miyylnj 
SEYnEmw1UWcUU4jbqTVzAslL1u4ZF8KRPYpYq5gA4oksgo2puMzf8F6guCMz3QEtwX7T SyoYLIjAfOgz8DqRhQgsW0LzbQzxt7mkceIcnUdHePByuu0x1E+bWpJEYEtAhoGV57/H wRUGC3fW8la9vqXHix3xxlPwswtMkgBx+O68rhBpvnL1pgAi+QbuM9y4UIssasMm+6eB YB9Dyne7vTO1fcwQ4K6dL/cNoPp4Xxeh67KzzvlX7CVPRZkW8B/iGKvvjhXz8tZ38c6T BrtQ== X-Gm-Message-State: AOJu0Yzp6qpredsVDAGtxLNT7TTH82gyb7ddFnGaNqlvsHDHxu+4L5bU BefLffoBfXYN4YWkqftM90NHj2zEOysPALUQmyUnzqVlA0e5atAoZGA87ovrAj7hSQjNPLDysLL HvEZVkTk= X-Gm-Gg: ASbGnctTPup+qfU4eopObUYAO5dl7hrPdQCs/CqY+9LBKAzck5JAt8J9R+91kTCBmhb Vmz1CRIUhJ9mVdl35Ow/3oYo+Js0Moz7061TrsOlpaObR4S27iKHYXTqTwJCa8OP1t725Cl8BxK TrcQTAHnJ8oDRwaHoX+Glzwu4qF1augqeAjBJ+e8DPIwOGZxlC093YQ6sPWMIwOGrp9PmH3zM05 v/qBfkDWv1MBsgFa+PIiVyS9iH+AFieb9Zj3fEtPoT7jCOqDgzvYWWjKS2VuCaZxUgVNjqu+Xsb G16Gp3ObCnPa2SwYjbF+V7QPq/a0/ut388Uyv4qAqPaUGO31pn8qkP7OSwk0ANhozaQ66sziKM2 2o6MWhFPM26bE+v0Ry7PjAL7N3tU31vbJJLiYy4kQ5op8hbwuYT2GFmK3C3/c0PgwGbFybQaloG 3XZA== X-Google-Smtp-Source: AGHT+IGHe1uaFOFNYQ4DJtikarPhM2GREyUd/9dIGycR1RQtQQeD3n91MRtwwnizdTXOOCUvQtLQmw== X-Received: by 2002:a17:906:d54b:b0:b04:53cc:4400 with SMTP id a640c23a62f3a-b04b14a198emr528981066b.27.1757261783273; Sun, 07 Sep 2025 09:16:23 -0700 (PDT) From: Bernhard Kaindl To: xen-devel@lists.xenproject.org Cc: Alejandro Vallejo , Bernhard Kaindl , Jan Beulich , Andrew Cooper , =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= , Anthony PERARD , Michal Orzel , Julien Grall , Stefano Stabellini , Tamas K Lengyel Subject: [PATCH v3 5/7] xen/page_alloc: Pass node to adjust_tot_pages and check it Date: Sun, 7 Sep 2025 18:15:20 +0200 Message-ID: <80adbc587f6acf6bae05bf66016ffecb532f8877.1757261045.git.bernhard.kaindl@cloud.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass (identity @cloud.com) X-ZM-MESSAGEID: 1757262070136116600 Content-Type: text/plain; charset="utf-8" domain_adjust_tot_pages() consumes remaining claims as pages are allocated, now also from the claimed node. 
Update it to skip consuming the outstanding claims when the page was
allocated from a different NUMA node.

This in itself would not be critically needed, as a page should only be
allocated from a different NUMA node when the target node has no available
memory. But for multi-node claims, we need to reduce the outstanding claims
only on the NUMA node the page was allocated from.

For this, we need to pass the NUMA node of the allocated page, so we can
use it to perform this check (and, in the future, update the claim only on
the NUMA node the page was allocated from).

Signed-off-by: Alejandro Vallejo
Signed-off-by: Bernhard Kaindl
---
- Reorganized v3, v4 and v5 as per review to avoid non-functional changes:
  - Split from patch v2#3 and merged the related changes from v2#5 into a
    consolidated patch.
--- xen/arch/x86/mm.c | 3 ++- xen/arch/x86/mm/mem_sharing.c | 4 ++-- xen/common/grant_table.c | 4 ++-- xen/common/memory.c | 3 ++- xen/common/page_alloc.c | 21 ++++++++++++++++----- xen/include/xen/mm.h | 2 +- 6 files changed, 25 insertions(+), 12 deletions(-) diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index b929d15d00..b0f654e02e 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -4442,7 +4442,8 @@ int steal_page( page_list_del(page, &d->page_list); =20 /* Unlink from original owner.
*/ - if ( !(memflags & MEMF_no_refcount) && !domain_adjust_tot_pages(d, -1)= ) + if ( !(memflags & MEMF_no_refcount) && + !domain_adjust_tot_pages(d, NUMA_NO_NODE, -1) ) drop_dom_ref =3D true; =20 nrspin_unlock(&d->page_alloc_lock); diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c index 4787b27964..15b8a3a9d9 100644 --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -720,7 +720,7 @@ static int page_make_sharable(struct domain *d, if ( !validate_only ) { page_set_owner(page, dom_cow); - drop_dom_ref =3D !domain_adjust_tot_pages(d, -1); + drop_dom_ref =3D !domain_adjust_tot_pages(d, NUMA_NO_NODE, -1); page_list_del(page, &d->page_list); } =20 @@ -766,7 +766,7 @@ static int page_make_private(struct domain *d, struct p= age_info *page) ASSERT(page_get_owner(page) =3D=3D dom_cow); page_set_owner(page, d); =20 - if ( domain_adjust_tot_pages(d, 1) =3D=3D 1 ) + if ( domain_adjust_tot_pages(d, page_to_nid(page), 1) =3D=3D 1 ) get_knownalive_domain(d); page_list_add_tail(page, &d->page_list); nrspin_unlock(&d->page_alloc_lock); diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c index cf131c43a1..8fea75dbb2 100644 --- a/xen/common/grant_table.c +++ b/xen/common/grant_table.c @@ -2405,7 +2405,7 @@ gnttab_transfer( } =20 /* Okay, add the page to 'e'. */ - if ( unlikely(domain_adjust_tot_pages(e, 1) =3D=3D 1) ) + if ( unlikely(domain_adjust_tot_pages(e, page_to_nid(page), 1) =3D= =3D 1) ) get_knownalive_domain(e); =20 /* @@ -2431,7 +2431,7 @@ gnttab_transfer( * page in the page total */ nrspin_lock(&e->page_alloc_lock); - drop_dom_ref =3D !domain_adjust_tot_pages(e, -1); + drop_dom_ref =3D !domain_adjust_tot_pages(e, NUMA_NO_NODE, -1); nrspin_unlock(&e->page_alloc_lock); =20 if ( okay /* i.e. 
e->is_dying due to the surrounding if() */ ) diff --git a/xen/common/memory.c b/xen/common/memory.c index 3371edec11..4c54ce5ede 100644 --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -775,7 +775,8 @@ static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_= memory_exchange_t) arg) =20 nrspin_lock(&d->page_alloc_lock); drop_dom_ref =3D (dec_count && - !domain_adjust_tot_pages(d, -dec_count)); + !domain_adjust_tot_pages(d, NUMA_NO_NODE, + -dec_count)); nrspin_unlock(&d->page_alloc_lock); =20 if ( drop_dom_ref ) diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index bbb34994b7..ebf41a1b33 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -542,8 +542,11 @@ static unsigned long avail_heap_pages( * Update the total number of pages and outstanding claims of a domain. * - When pages were freed, we do not increase outstanding claims. * - On a domain's claims update, global outstanding_claims are updated as= well. + * - If the domain's claim is on a NUMA node, we only update outstanding c= laims + * of the domain and the node, when the allocation is from the same NUMA= node. */ -unsigned long domain_adjust_tot_pages(struct domain *d, long pages) +unsigned long domain_adjust_tot_pages(struct domain *d, nodeid_t node, + long pages) { unsigned long adjustment; =20 @@ -557,8 +560,12 @@ unsigned long domain_adjust_tot_pages(struct domain *d= , long pages) * * If the domain has no outstanding claims (or we freed pages instead), * we don't update outstanding claims and skip the claims adjustment. + * + * Else, a page was allocated: But if the domain has a node_claim and + * the page was allocated from a different node, don't update claims. 
*/ - if ( !d->outstanding_pages || pages <=3D 0 ) + if ( !d->outstanding_pages || pages <=3D 0 || + (domain_has_node_claim(d) && d->claim_node !=3D node) ) goto out; =20 spin_lock(&heap_lock); @@ -2662,6 +2669,8 @@ int assign_pages( =20 if ( !(memflags & MEMF_no_refcount) ) { + nodeid_t node =3D page_to_nid(&pg[0]); + if ( unlikely(d->tot_pages + nr < nr) ) { gprintk(XENLOG_INFO, @@ -2672,8 +2681,9 @@ int assign_pages( rc =3D -E2BIG; goto out; } + ASSERT(node =3D=3D page_to_nid(&pg[nr - 1])); =20 - if ( unlikely(domain_adjust_tot_pages(d, nr) =3D=3D nr) ) + if ( unlikely(domain_adjust_tot_pages(d, node, nr) =3D=3D nr) ) get_knownalive_domain(d); } =20 @@ -2806,7 +2816,8 @@ void free_domheap_pages(struct page_info *pg, unsigne= d int order) } } =20 - drop_dom_ref =3D !domain_adjust_tot_pages(d, -(1 << order)); + drop_dom_ref =3D !domain_adjust_tot_pages(d, NUMA_NO_NODE, + -(1 << order)); =20 rspin_unlock(&d->page_alloc_lock); =20 @@ -3012,7 +3023,7 @@ void free_domstatic_page(struct page_info *page) =20 arch_free_heap_page(d, page); =20 - drop_dom_ref =3D !domain_adjust_tot_pages(d, -1); + drop_dom_ref =3D !domain_adjust_tot_pages(d, NUMA_NO_NODE, -1); =20 unprepare_staticmem_pages(page, 1, scrub_debug); =20 diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h index 52c12c5783..5a5252fc69 100644 --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -131,7 +131,7 @@ mfn_t xen_map_to_mfn(unsigned long va); int populate_pt_range(unsigned long virt, unsigned long nr_mfns); /* Claim handling */ unsigned long __must_check domain_adjust_tot_pages(struct domain *d, - long pages); + nodeid_t node, long pages); int domain_claim_pages(struct domain *d, nodeid_t node, unsigned long page= s); void get_outstanding_claims(uint64_t *free_pages, uint64_t *outstanding_pa= ges); =20 --=20 2.43.0 From nobody Wed Sep 10 06:05:35 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) 
client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass(p=reject dis=none) header.from=cloud.com ARC-Seal: i=1; a=rsa-sha256; t=1757262067; cv=none; d=zohomail.com; s=zohoarc; b=FBN/m4yjitIVGmTH1KTrdO8e9apdfqKLcVVUTLIAtwcA2ySYPYeJohXNyGuVtbDtqcXtcS3ve8XyONQTTkUScsMh5RA7oTOPccGytfupYHDLbZSQU9cqk6JdoRlM2c1WyGdTDpJr8YkDYPx/uX4PL040ti5dprBiTLB3Vb+qHjQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1757262067; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=vBTzAM9dJn/x75BZsxvktFF5ekUsBrdK+iYKuv7j92Y=; b=k3AEA4KmrJZTHtpn+WEsK9Lt1V++MVe9cyDC81c/TQ6/FkwtXvJdgzuMkfJghEpxDYxMuHLxev/RSkWTKt8sTLZG5uvYRbUg4+9UsMqZBdbpGUKprtNVYV+N/OBal7C5vD8Qp/haAQjKxdlhdphBKjuH0SdKEqcRyh2RWADF+0Y= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass header.from= (p=reject dis=none) Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 1757262067800475.49864868660325; Sun, 7 Sep 2025 09:21:07 -0700 (PDT) Received: from list by lists.xenproject.org with outflank-mailman.1114194.1461344 (Exim 4.92) (envelope-from ) id 1uvI8K-0004JQ-DA; Sun, 07 Sep 2025 16:20:52 +0000 Received: by outflank-mailman (output) from mailman id 1114194.1461344; Sun, 07 Sep 2025 16:20:52 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) 
(envelope-from ) id 1uvI8K-0004JB-8V; Sun, 07 Sep 2025 16:20:52 +0000 Received: by outflank-mailman (input) for mailman id 1114194; Sun, 07 Sep 2025 16:20:51 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1uvI42-00009z-Nd for xen-devel@lists.xenproject.org; Sun, 07 Sep 2025 16:16:26 +0000 Received: from mail-ej1-x62c.google.com (mail-ej1-x62c.google.com [2a00:1450:4864:20::62c]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 00344c93-8c06-11f0-9809-7dc792cee155; Sun, 07 Sep 2025 18:16:25 +0200 (CEST) Received: by mail-ej1-x62c.google.com with SMTP id a640c23a62f3a-b046f6fb230so604182166b.1 for ; Sun, 07 Sep 2025 09:16:25 -0700 (PDT) Received: from MinisforumBD795m.phoenix-carat.ts.net ([2a02:1748:f7df:8cb1:5474:d7c3:6edd:e683]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b047b61cf00sm908263766b.15.2025.09.07.09.16.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 07 Sep 2025 09:16:23 -0700 (PDT) X-Outflank-Mailman: Message body and most headers restored to incoming version X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 00344c93-8c06-11f0-9809-7dc792cee155 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.com; s=cloud; t=1757261784; x=1757866584; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=vBTzAM9dJn/x75BZsxvktFF5ekUsBrdK+iYKuv7j92Y=; b=Ehd7Nxq1zVQLYJAv33FDslPZN7BVOQLT/G0L4euiksG3FNcFkQ0utT+d1mf0dMLsEV K+1SeIu2+IakYyBvM2k+6mlxir91mMcQORLcJFiiv9AnAmuUKlVmeap/+Vttj6g1U1lr lKvqmT0A24Yf+zGEXN7UtgGI8nRpelsjoMgVI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; 
t=1757261784; x=1757866584; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=vBTzAM9dJn/x75BZsxvktFF5ekUsBrdK+iYKuv7j92Y=; b=iE9jbJd4dFA8ve0fXZfkI8An9z9BfuMcS+SEukz2Qdg58X90CMSluIDRO36NqBBxNi vhJvpCJQX/FcXvb6k1TYLtO/0YPjie0DJCE7TVsGMZT7hpgCXRK8CRPaFOQgI1UuKNiU oZOTpbZqUWX+j5TfCmXp2PKX0TGMZV+MK+IWez3rN+9nDRs/U18rKYf+oiovXZAgYbhG ZHpNdelNolHE8XmoAlT8/afMq5x3KVLSGNPPJllaWNZG2orbekyUDxmKjd4gKP26JBB2 UVsLarcyHYJR4rPGbD4QEnHvmHDSEwUvJOt6prIDKe/2RD8KQ4R7MBA2wPSWfjKg+G1J 11Fg== X-Gm-Message-State: AOJu0Yzj3jBlSERwNX7qLmAyMKapd0SNaBJFsqFegjbpbvewly0n2SkR QB5skVRTa9Db4fq/wlqWizs8nFpH28zui2bp9ImyjzsOAUqKG0qHBmr+g8Gnn2mR8vtuXaEwY2m n047UIUY= X-Gm-Gg: ASbGncsIzNlbmovo48jc4vgobIdQjBp/ENa1qIfqAqCGLX1do5k+d+tPvl1zMQW8zAY BmF39m30e+LwuiTmdzcV82ACHWj47JlK/GoOhGi4ZdZvf0ckv1r4KJNowS2pecXnfFTPVICG2A8 jpbZT1v35Rfnjl1+Oe0vKKVxro2AvkOGklQV3r6U9kozVxQfyKGvqtEvMVmMOWYicpeAvx93oNv vJEaeiqmO5VvH3wPnUqtr7tIVKbstKxYtp/CoZvHjBfa537elEgQcRn01LSxoqwYSR9E0KxQSqi ok/9cRCpfnp5LpDQsypQ2b9O1lk7C8ODcYSt34AxNTJfmNpuJQcb+kekGQ4zJzT1U8zMVJHz6AN 8uPbQT9YVH1v51fgyj276vGNienp97C60TBWymEvfl5cUnwJ3KeEIbwznHcbLf9hEvUMsZdybBw k4iA== X-Google-Smtp-Source: AGHT+IFNkJmollUYz0E9myq+0ab1Fzt9pFOg5uIHhtER2VpUj1FgGRJDBcAtjSl+tX9M37Qw/YC6FA== X-Received: by 2002:a17:907:805:b0:afe:94d7:7283 with SMTP id a640c23a62f3a-b04932a2452mr1010424466b.32.1757261784279; Sun, 07 Sep 2025 09:16:24 -0700 (PDT) From: Bernhard Kaindl To: xen-devel@lists.xenproject.org Cc: Alejandro Vallejo , Bernhard Kaindl , Andrew Cooper , Anthony PERARD , Michal Orzel , Jan Beulich , Julien Grall , =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= , Stefano Stabellini Subject: [PATCH v3 6/7] xen/page_alloc: Protect claimed memory against other allocations Date: Sun, 7 Sep 2025 18:15:21 +0200 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass 
Extend get_free_buddy() to only allocate from nodes with enough unclaimed
memory left, unless the allocation is made by a domain with sufficient
claims on this node to cover the allocation.

Signed-off-by: Marcus Granado
Signed-off-by: Alejandro Vallejo
Signed-off-by: Bernhard Kaindl
---
Changes in v3:
Rewritten based on a check by Marcus Granado, which needs to be inside the
NUMA node loop of get_free_buddy() so that it only allows allocating from
NUMA nodes with enough unclaimed memory. The check was originally intended
only for the loop over all NUMA nodes, but it also needs to be applied when
falling back to other nodes. I made the check generic: it is now used for
all requests, by integrating the check of the domain's claim from
Alejandro's can_alloc() helper into it.

This fixes the following issue: when falling back from a nodemask to
allocate from (based on MEMF_get_node(memflags) or on d->node_affinity) to
other NUMA nodes, we still only allocate from nodes with enough unclaimed
memory left, unless the allocation is made by a domain with sufficient
claims on this node to cover the allocation.

This makes the can_alloc() helper function obsolete, as the needed checks
are done for the NUMA nodes as they are considered, not only for the
originally requested NUMA node (and not just before searching).
---
 xen/common/page_alloc.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index ebf41a1b33..b866076b78 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -980,9 +980,19 @@ static struct page_info *get_free_buddy(unsigned int zone_lo,
     {
         zone = zone_hi;
         do {
-            /* Check if target node can support the allocation. */
-            if ( !avail[node] || (avail[node][zone] < (1UL << order)) )
-                continue;
+            unsigned long request = 1UL << order;
+            /*
+             * Check if this node is currently suitable for this allocation.
+             * 1. It has sufficient memory in the requested zone and the
+             * 2. request must fit in the unclaimed memory of the node minus
+             *    outstanding claims, unless the allocation is made by a domain
+             *    with sufficient node-claimed memory to cover the allocation.
+             */
+            if ( !avail[node] || (avail[node][zone] < request) ||
+                 (insufficient_memory(node, request) &&
+                  (!d || node != d->claim_node ||       /* a domain with claims */
+                   request > d->outstanding_pages)) )   /* claim covers request */
+                continue;   /* next zone/node if insufficient memory or claims */

             /* Find smallest order which can satisfy the request. */
             for ( j = order; j <= MAX_ORDER; j++ )
-- 
2.43.0

From nobody Wed Sep 10 06:05:35 2025
From: Bernhard Kaindl
To: xen-devel@lists.xenproject.org
Cc: Alejandro Vallejo, Bernhard Kaindl, "Daniel P. Smith", Anthony PERARD, Andrew Cooper, Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné, Stefano Stabellini, Juergen Gross, Christian Lindig, David Scott
Subject: [PATCH v3 7/7] xen: New hypercall to claim memory using XEN_DOMCTL_claim_memory
Date: Sun, 7 Sep 2025 18:15:22 +0200

Add the new hypercall requested during the review of the v1 series, so that
multi-node claims will not require changing the API. The hypercall receives
a number of claims, intended to be one claim per NUMA node, and limited to
one claim for now. The changes to the NUMA claims management needed to
update the claims for multiple NUMA nodes of a domain at once are deferred
to the next series.

Cc: Alejandro Vallejo
Signed-off-by: Bernhard Kaindl
---
Changes in v3:
- A review asked to check nr_claims to be > 0 (superfluous for an unsigned
  int), but to avoid a raised eyebrow, add "> 0", which the compiler will
  optimise away anyway.
---
 tools/flask/policy/modules/dom0.te  |  1 +
 tools/flask/policy/modules/xen.if   |  1 +
 tools/include/xenctrl.h             |  4 +++
 tools/libs/ctrl/xc_domain.c         | 42 +++++++++++++++++++++++++++++
 tools/ocaml/libs/xc/xenctrl.ml      |  9 +++++++
 tools/ocaml/libs/xc/xenctrl.mli     |  9 +++++++
 tools/ocaml/libs/xc/xenctrl_stubs.c | 21 +++++++++++++++
 xen/common/domain.c                 | 29 ++++++++++++++++++++
 xen/common/domctl.c                 |  8 ++++++
 xen/include/public/domctl.h         | 17 ++++++++++++
 xen/include/xen/domain.h            |  2 ++
 xen/xsm/flask/hooks.c               |  3 +++
 xen/xsm/flask/policy/access_vectors |  2 ++
 13 files changed, 148 insertions(+)

diff --git a/tools/flask/policy/modules/dom0.te b/tools/flask/policy/modules/dom0.te
index ad2b4f9ea7..8801cb24f2 100644
--- a/tools/flask/policy/modules/dom0.te
+++ b/tools/flask/policy/modules/dom0.te
@@ -105,6 +105,7 @@ allow dom0_t dom0_t:domain2 {
 	get_cpu_policy
 	dt_overlay
 	get_domain_state
+	claim_memory
 };
 allow dom0_t dom0_t:resource { add
diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
index ef7d8f438c..8e2dceb505 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -98,6 +98,7 @@ define(`create_domain_common', `
 			vuart_op
 			set_llc_colors
 			get_domain_state
+			claim_memory
 	};
 	allow $1 $2:security check_context;
 	allow $1 $2:shadow enable;
diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 965d3b585a..43ece3f2a7 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -2660,6 +2660,10 @@ int xc_domain_set_llc_colors(xc_interface *xch, uint32_t domid,
                              const uint32_t *llc_colors,
                              uint32_t num_llc_colors);
 
+int xc_domain_claim_memory(xc_interface *xch, uint32_t domid,
+                           uint32_t nr_claims,
+                           const memory_claim_t *claims);
+
 #if defined(__arm__) || defined(__aarch64__)
 int xc_dt_overlay(xc_interface *xch, void *overlay_fdt,
                   uint32_t overlay_fdt_size, uint8_t overlay_op);
diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
index 2ddc3f4f42..e022b76430 100644
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -2229,6 +2229,48 @@ out:
 
     return ret;
 }
+
+/*
+ * Claim memory for a domain. A Domain can only have one type of claim:
+ *
+ * If the number of claims is 0, existing claims are cancelled.
+ * Updating claims is not supported, cancel the existing claim first.
+ *
+ * Memory allocations consume the outstanding claim and if not enough memory is
+ * free, the allocation must be satisfied from the remaining outstanding claim.
+ */
+int xc_domain_claim_memory(xc_interface *xch, uint32_t domid,
+                           uint32_t nr_claims,
+                           const memory_claim_t *claims)
+{
+    struct xen_domctl domctl = {
+        .cmd = XEN_DOMCTL_claim_memory,
+        .domain = domid,
+        .u.claim_memory.nr_claims = nr_claims,
+    };
+    int ret;
+    DECLARE_HYPERCALL_BUFFER(struct xen_domctl_claim_memory, buffer);
+
+    /* Use an array to not need changes for multi-node claims in the future */
+    if ( nr_claims > 0 )
+    {
+        size_t bytes = sizeof(memory_claim_t) * nr_claims;
+
+        buffer = xc_hypercall_buffer_alloc(xch, buffer, bytes);
+        if ( buffer == NULL )
+        {
+            PERROR("Could not allocate memory for xc_domain_claim_memory");
+            return -1;
+        }
+        memcpy(buffer, claims, bytes);
+        set_xen_guest_handle(domctl.u.claim_memory.claims, buffer);
+    }
+
+    ret = do_domctl(xch, &domctl);
+    xc_hypercall_buffer_free(xch, buffer);
+    return ret;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 97108b9d86..c8692fb169 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -370,6 +370,15 @@ external domain_deassign_device: handle -> domid -> (int * int * int * int) -> u
 external domain_test_assign_device: handle -> domid -> (int * int * int * int) -> bool
        = "stub_xc_domain_test_assign_device"
 
+type claim =
+  {
+    node: int;
+    nr_pages: int64;
+  }
+
+external domain_claim_memory: handle -> domid -> int -> claim array -> unit
+  = "stub_xc_domain_claim_memory"
+
 external version: handle -> version = "stub_xc_version_version"
 external version_compile_info: handle -> compile_info
        = "stub_xc_version_compile_info"
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index 9fccb2c2c2..82d59fc80d 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -297,6 +297,15 @@ external domain_deassign_device: handle -> domid -> (int * int * int * int) -> u
 external domain_test_assign_device: handle -> domid -> (int * int * int * int) -> bool
        = "stub_xc_domain_test_assign_device"
 
+type claim =
+  {
+    node: int;
+    nr_pages: int64;
+  }
+
+external domain_claim_memory: handle -> domid -> int -> claim array -> unit
+  = "stub_xc_domain_claim_memory"
+
 external version : handle -> version = "stub_xc_version_version"
 external version_compile_info : handle -> compile_info
        = "stub_xc_version_compile_info"
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index ac2a7537d6..53f56c5437 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -1435,6 +1435,27 @@ CAMLprim value stub_xc_watchdog(value xch_val, value domid, value timeout)
 	CAMLreturn(Val_int(ret));
 }
 
+/* Claim memory for a domain. See xc_domain_claim_memory() for details. */
+CAMLprim value stub_xc_domain_claim_memory(value xch_val, value domid,
+                                           value num_claims, value desc)
+{
+	CAMLparam4(xch_val, domid, num_claims, desc);
+	xc_interface *xch = xch_of_val(xch_val);
+	int i, retval, nr_claims = Int_val(num_claims);
+	memory_claim_t claim[nr_claims];
+
+	for (i = 0; i < nr_claims; i++) {
+		claim[i].node = Int_val(Field(desc, i*2));
+		claim[i].nr_pages = Int64_val(Field(desc, i*2 + 1));
+	}
+
+	retval = xc_domain_claim_memory(xch, Int_val(domid), nr_claims, claim);
+	if (retval < 0)
+		failwith_xc(xch);
+
+	CAMLreturn(Val_unit);
+}
+
 /*
  * Local variables:
  * indent-tabs-mode: t
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 6ee9f23b10..39f1c3718c 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -267,6 +267,35 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
     return rc;
 }
 
+/* XEN_DOMCTL_claim_memory: Claim an amount of memory for a domain */
+int claim_memory(struct domain *d, const struct xen_domctl_claim_memory *uinfo)
+{
+    memory_claim_t claim;
+    int rc;
+
+    switch ( uinfo->nr_claims )
+    {
+    case 0:
+        /* Cancel existing claim. */
+        rc = domain_claim_pages(d, 0, 0);
+        break;
+
+    case 1:
+        /* Only single node claims supported at the moment. */
+        if ( copy_from_guest(&claim, uinfo->claims, 1) )
+            return -EFAULT;
+
+        rc = domain_claim_pages(d, claim.node, claim.nr_pages);
+        break;
+
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+
+    return rc;
+}
+
 static void __domain_finalise_shutdown(struct domain *d)
 {
     struct vcpu *v;
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 71e712c1f3..cf9537b02c 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -863,6 +863,14 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
         ret = get_domain_state(&op->u.get_domain_state, d, &op->domain);
         break;
 
+    case XEN_DOMCTL_claim_memory:
+        ret = xsm_claim_pages(XSM_PRIV, d);
+        if ( ret )
+            break;
+
+        ret = claim_memory(d, &op->u.claim_memory);
+        break;
+
     default:
         ret = arch_do_domctl(op, d, u_domctl);
         break;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 8f6708c0a7..1cebbb878e 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1276,6 +1276,21 @@ struct xen_domctl_get_domain_state {
     uint64_t unique_id;     /* Unique domain identifier. */
 };
 
+struct xen_memory_claim {
+    unsigned int node;      /* NUMA node, XC_NUMA_NO_NODE for a host claim */
+    unsigned long nr_pages; /* Number of pages to claim */
+};
+typedef struct xen_memory_claim memory_claim_t;
+DEFINE_XEN_GUEST_HANDLE(memory_claim_t);
+
+/* XEN_DOMCTL_claim_memory: Claim an amount of memory for a domain */
+struct xen_domctl_claim_memory {
+    /* IN: array of memory claims */
+    XEN_GUEST_HANDLE_64(memory_claim_t) claims;
+    /* IN: number of claims */
+    unsigned int nr_claims;
+};
+
 struct xen_domctl {
     /* Stable domctl ops: interface_version is required to be 0. */
     uint32_t cmd;
@@ -1368,6 +1383,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_gsi_permission                88
 #define XEN_DOMCTL_set_llc_colors                89
 #define XEN_DOMCTL_get_domain_state              90 /* stable interface */
+#define XEN_DOMCTL_claim_memory                  91
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1436,6 +1452,7 @@ struct xen_domctl {
 #endif
         struct xen_domctl_set_llc_colors    set_llc_colors;
         struct xen_domctl_get_domain_state  get_domain_state;
+        struct xen_domctl_claim_memory      claim_memory;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 8aab05ae93..cd3e933fbf 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -195,4 +195,6 @@ extern bool vmtrace_available;
 
 extern bool vpmu_is_available;
 
+int claim_memory(struct domain *d, const struct xen_domctl_claim_memory *uinfo);
+
 #endif /* __XEN_DOMAIN_H__ */
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index b0308e1b26..6b2535b666 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -853,6 +853,9 @@ static int cf_check flask_domctl(struct domain *d, unsigned int cmd,
     case XEN_DOMCTL_set_llc_colors:
         return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__SET_LLC_COLORS);
 
+    case XEN_DOMCTL_claim_memory:
+        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__CLAIM_MEMORY);
+
     default:
         return avc_unknown_permission("domctl", cmd);
     }
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 51a1577a66..87338b5c2a 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -259,6 +259,8 @@ class domain2
     set_llc_colors
 # XEN_DOMCTL_get_domain_state
     get_domain_state
+# XEN_DOMCTL_claim_memory
+    claim_memory
 }
 
 # Similar to class domain, but primarily contains domctls related to HVM domains
-- 
2.43.0