From: Mateusz Guzik
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linus.walleij@linaro.org, pasha.tatashin@soleen.com, Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, Mateusz Guzik
Subject: [PATCH] fork: stop ignoring NUMA while handling cached thread stacks
Date: Mon, 17 Nov 2025 15:07:47 +0100
Message-ID: <20251117140747.2566239-1-mjguzik@gmail.com>

The per-CPU thread stack cache had two NUMA problems:

1. The numa parameter was straight up ignored.
2. Nothing was done to check whether the to-be-cached/allocated stack is
   backed by pages on the local node.

The node id remains ignored on free in the case of memoryless nodes.

Note the current caching is already bad in that the cache keeps
overflowing, and a different solution is needed for the long run, to be
worked out(tm).

Stats collected over a kernel build with the patch applied, on the
following topology:

NUMA node(s):      2
NUMA node0 CPU(s): 0-11
NUMA node1 CPU(s): 12-23

caller's node vs stack backing pages on free:
  matching:   50083 (70%)
  mismatched: 21492 (30%)

caching efficiency:
  cached:  32651 (65.2%)
  dropped: 17432 (34.8%)

Signed-off-by: Mateusz Guzik
Reviewed-by: Linus Walleij
---
I lifted the page node id checking out of vmalloc; I presume it works(tm).

 kernel/fork.c | 55 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 45 insertions(+), 10 deletions(-)
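A side note from editing, not part of the patch and not how the stats
above were gathered: the same "caller's node vs backing page" comparison
can be reproduced from userspace for an arbitrary buffer, which is handy
for sanity-checking locality questions. Below is a minimal sketch; it
relies on move_pages(2) in query mode (passing nodes == NULL makes it
report the backing node of each page in status[]) and on the glibc
getcpu() wrapper (glibc 2.29+; older setups can use sched_getcpu() plus
libnuma's numa_node_of_cpu() instead). The file name and output format
are made up for illustration; build with "gcc nodecheck.c -lnuma".

/*
 * Illustrative sketch only: check whether the page backing a buffer sits
 * on the same NUMA node the caller is currently running on.
 */
#define _GNU_SOURCE
#include <numaif.h>     /* move_pages(), link with -lnuma */
#include <sched.h>      /* getcpu() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        long page_size = sysconf(_SC_PAGESIZE);
        void *buf;
        int status = -1;
        unsigned int cpu, node;

        if (posix_memalign(&buf, page_size, page_size))
                return 1;
        /* Touch the page so it is actually backed by memory somewhere. */
        memset(buf, 0, page_size);

        /* nodes == NULL: do not migrate, just report the backing node. */
        if (move_pages(0, 1, &buf, NULL, &status, 0)) {
                perror("move_pages");
                return 1;
        }
        if (getcpu(&cpu, &node)) {
                perror("getcpu");
                return 1;
        }

        printf("page on node %d, caller on node %u -> %s\n",
               status, node, status == (int)node ? "matching" : "mismatched");
        free(buf);
        return 0;
}

None of this affects the patch below; it is just a self-contained way to
poke at page locality.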
diff --git a/kernel/fork.c b/kernel/fork.c
index f1857672426e..9448582737ff 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -208,15 +208,54 @@ struct vm_stack {
         struct vm_struct *stack_vm_area;
 };
 
+static struct vm_struct *alloc_thread_stack_node_from_cache(struct task_struct *tsk, int node)
+{
+        struct vm_struct *vm_area;
+        unsigned int i;
+
+        /*
+         * If the node has memory, we are guaranteed the stacks are backed by local pages.
+         * Otherwise the pages are arbitrary.
+         *
+         * Note that depending on cpuset it is possible we will get migrated to a different
+         * node immediately after allocating here, so this does *not* guarantee locality for
+         * arbitrary callers.
+         */
+        scoped_guard(preempt) {
+                if (node != NUMA_NO_NODE && numa_node_id() != node)
+                        return NULL;
+
+                for (i = 0; i < NR_CACHED_STACKS; i++) {
+                        vm_area = this_cpu_xchg(cached_stacks[i], NULL);
+                        if (vm_area)
+                                return vm_area;
+                }
+        }
+
+        return NULL;
+}
+
 static bool try_release_thread_stack_to_cache(struct vm_struct *vm_area)
 {
         unsigned int i;
+        int nid;
+
+        scoped_guard(preempt) {
+                nid = numa_node_id();
+                if (node_state(nid, N_MEMORY)) {
+                        for (i = 0; i < vm_area->nr_pages; i++) {
+                                struct page *page = vm_area->pages[i];
+                                if (page_to_nid(page) != nid)
+                                        return false;
+                        }
+                }
 
-        for (i = 0; i < NR_CACHED_STACKS; i++) {
-                struct vm_struct *tmp = NULL;
+                for (i = 0; i < NR_CACHED_STACKS; i++) {
+                        struct vm_struct *tmp = NULL;
 
-                if (this_cpu_try_cmpxchg(cached_stacks[i], &tmp, vm_area))
-                        return true;
+                        if (this_cpu_try_cmpxchg(cached_stacks[i], &tmp, vm_area))
+                                return true;
+                }
         }
         return false;
 }
@@ -283,13 +322,9 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
         struct vm_struct *vm_area;
         void *stack;
-        int i;
-
-        for (i = 0; i < NR_CACHED_STACKS; i++) {
-                vm_area = this_cpu_xchg(cached_stacks[i], NULL);
-                if (!vm_area)
-                        continue;
 
+        vm_area = alloc_thread_stack_node_from_cache(tsk, node);
+        if (vm_area) {
                 if (memcg_charge_kernel_stack(vm_area)) {
                         vfree(vm_area->addr);
                         return -ENOMEM;
-- 
2.48.1