From nobody Thu Oct 2 06:18:04 2025 Received: from mail-yb1-f179.google.com (mail-yb1-f179.google.com [209.85.219.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4714F2D0267 for ; Fri, 19 Sep 2025 19:52:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758311550; cv=none; b=uxD9Bj5EZrX2avZzwH18UoUPsUA1QiFnUxWJZ75mhFz7GFBd/Ce+NA4WHqSbK6MNngAwdLx3mr06VQqvaHZVuVhRtgV5FkIV6NA1SMBzXtM2V/40dmH2i1RzDnTmgrDc5jmtsCv9yQBncJEWKCs1oirz09BLpbnbIsSxyE5vhjE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758311550; c=relaxed/simple; bh=dgs8/FCp3I80HNIj6iXMInXrKCLpOwWYIK2D6ydLQfg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JqzBmjLtudhQNDeaY/3OdJz0VPM2+gAqgJABbnfkhkLq/bLDrXymuF0wXogoAc3QqCemaLhoooKitNpx+76fjp4hdd+DgYwa35nwXx10LF96hK6VtxHTQNVWghWJyMAOws8PqjtP3pfmHh+C2EyL0so2hJ9KxXRfBMT3QdmvKNs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=JlMghjxD; arc=none smtp.client-ip=209.85.219.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="JlMghjxD" Received: by mail-yb1-f179.google.com with SMTP id 3f1490d57ef6-e970e624b7cso3378448276.0 for ; Fri, 19 Sep 2025 12:52:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1758311546; x=1758916346; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=PNokewsZJe0VctGE28J9Sg6BSEv7Rs3SI9ltzwt8Wv8=; b=JlMghjxDaRyCaj5ecq7gTaxS1k3K5S1ErZPMm7rxSQxaBd2OYcVgonpWavDVYlQPxq eMWTdY5bxw18DBida3WS7RAr504EM52fKhaK1ZFoTgXpn9l5LQhIAhIVOgltybZrl2ar EJnxTMG+pY3uB9Ifcyj729ooMSKM9JvXDY5cIsPrhP/qz12JXi76q7SCvvDezLSdYV13 IaJ6RnrtzdyhkRhCkSa85IoUjx0WbS8JFdxZIQiCZf9lyHHxgzZ1eY3mbomHmvRpSC0X mq0GvZ01uugrjdhEgJM7JYs+jVSXmZ5juaFVDlPr0c7BouWKUA9ObB+GHbTxIzibHSph U0zg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1758311546; x=1758916346; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PNokewsZJe0VctGE28J9Sg6BSEv7Rs3SI9ltzwt8Wv8=; b=GJFgdneZevgLwEuhp24HkGdJs2kUP/qo8MmHtHQigna+BTe0eZpUVoja4d3HYXb19Z 2ClNw5Rpxlj/T7fz0ehPWDWLmjSNUFeL7BkzGgYEZS58w6zM8GlaW7uRTJhHf6PQ5dLr 9vRb+zHVOfYMjkTkGhV4VKwc/L+WjXn/Wf80J3Du8v1+Q7jvRbhqE5MOdGXn19msULFj f2BxUkNyrahlEnB9XjNh0FMSt9e2okxkW5ugOx7mez2mlpqyZtQPlQoQkS5ucnFY6Pe5 l9VOn8KZaQXk3RBh0m/4iY8Huiv+tai/gKEdezKnJSQApxNn8QldxF+JaZgEXOkTk9E8 d8XA== X-Forwarded-Encrypted: i=1; AJvYcCU/TndgHy4fguVrZAVbSCxf2NLgr++A0nbHU1zaOkoFEqivyT/tGhAYsQ17EvIqL3jR55+Dk+EOT0WM/EU=@vger.kernel.org X-Gm-Message-State: AOJu0YyNPgsqogTiAni36SX+ee1oiQdgEoszIw0XSNO46hsFpAHdjrJr 1I3/HsCS2PIQ/Bo5GYXBiqtVqtn54H1KqPOtl7XhnIpJIofOuM05fsgo X-Gm-Gg: ASbGnctaGfYqK1swInsQMnsHqpFltJCRrvFD9llkQbIjhFwolkZXxNmChJOQtXm4QDt tBZTgxpy07bT3z1Iiqzyc5iIXAWLKq2QErweWmzzADdBo9bolIYJ6Iq8VK8ojRoNGQFD13zPa0P RvRmzkrC4yOWTj9oRHTKyTP99z7ELU1bKP6ZTBMXkkvFYKznqREieZ7EN0gxH+gc3fdMCzqHsJD acjBA0lqaDRHWkIFMYhLbqVsJFQaE87aylcL7ods6U/YGeutsMMU/pZEKrHiJhViTtB8A4EGRxA kbJqqWOJMIvvl85cjI2QEBYxWlY80LkfXnwONVUSLIeOnUEDGKoMDZVvqv2sHAcruYizbQiCkBz bt4kjglknf+CgdzdxUumAZVHpWMcdYbu4fGZg1WDi4hVj53T8WeVM X-Google-Smtp-Source: 
AGHT+IGiHhqEEGk8maRWdeb2WQIByD25ZoDI5E5e1pECrjiWUHQigM38gM3n9qWST+GwllEBBEo01w== X-Received: by 2002:a05:690c:9506:b0:733:aa00:3860 with SMTP id 00721157ae682-739708ca5b2mr60891147b3.23.1758311545864; Fri, 19 Sep 2025 12:52:25 -0700 (PDT) Received: from localhost ([2a03:2880:25ff:6::]) by smtp.gmail.com with ESMTPSA id 00721157ae682-744a3110618sm1682767b3.13.2025.09.19.12.52.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Sep 2025 12:52:25 -0700 (PDT) From: Joshua Hahn To: Andrew Morton , Johannes Weiner Cc: Chris Mason , Kiryl Shutsemau , "Liam R. Howlett" , Brendan Jackman , David Hildenbrand , Lorenzo Stoakes , Michal Hocko , Mike Rapoport , Suren Baghdasaryan , Vlastimil Babka , Zi Yan , linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-team@meta.com Subject: [PATCH 1/4] mm/page_alloc/vmstat: Simplify refresh_cpu_vm_stats change detection Date: Fri, 19 Sep 2025 12:52:19 -0700 Message-ID: <20250919195223.1560636-2-joshua.hahnjy@gmail.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250919195223.1560636-1-joshua.hahnjy@gmail.com> References: <20250919195223.1560636-1-joshua.hahnjy@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, refresh_cpu_vm_stats returns an int, indicating how many changes were made during its updates. Using this information, callers like vmstat_update can heuristically determine if more work will be done in the future. However, all of refresh_cpu_vm_stats's callers either (a) ignore the result, only caring about performing the updates, or (b) only care about whether changes were made, but not *how many* changes were made. Simplify the code by returning a bool instead to indicate if updates were made. In addition, simplify fold_diff and decay_pcp_high to return a bool for the same reason. 
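For illustration only, here is a minimal user-space sketch of the "report whether anything changed" pattern described above. The names model_fold and model_refresh are hypothetical stand-ins and are not kernel functions; the sketch assumes plain arrays in place of the per-cpu and atomic counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for fold_diff(): add each nonzero delta to its
 * global total and report only whether anything was added.
 */
static bool model_fold(long *global, const int *delta, size_t n)
{
	bool changed = false;

	assert(global && delta);
	for (size_t i = 0; i < n; i++) {
		if (delta[i]) {
			global[i] += delta[i];
			changed = true;
		}
	}
	return changed;
}

/*
 * Hypothetical stand-in for refresh_cpu_vm_stats(): OR the results
 * together so a single "true" from any step reaches the caller.
 */
static bool model_refresh(long *global, const int *zone_delta,
			  const int *node_delta, size_t n)
{
	bool changed = false;

	changed |= model_fold(global, zone_delta, n);
	changed |= model_fold(global, node_delta, n);
	return changed;
}
```

A caller in this model only tests the result for truth, which is all that the existing callers do with the count today.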
Signed-off-by: Joshua Hahn --- include/linux/gfp.h | 2 +- mm/page_alloc.c | 8 ++++---- mm/vmstat.c | 26 +++++++++++++------------- 3 files changed, 18 insertions(+), 18 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 5ebf26fcdcfa..63c72cb1d117 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -386,7 +386,7 @@ extern void free_pages(unsigned long addr, unsigned int= order); #define free_page(addr) free_pages((addr), 0) =20 void page_alloc_init_cpuhp(void); -int decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp); +bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp); void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp); void drain_all_pages(struct zone *zone); void drain_local_pages(struct zone *zone); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d1d037f97c5f..77e7d9a5f149 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2561,10 +2561,10 @@ static int rmqueue_bulk(struct zone *zone, unsigned= int order, * Called from the vmstat counter updater to decay the PCP high. * Return whether there are addition works to do. 
*/ -int decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp) +bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp) { int high_min, to_drain, batch; - int todo =3D 0; + bool todo =3D false; =20 high_min =3D READ_ONCE(pcp->high_min); batch =3D READ_ONCE(pcp->batch); @@ -2577,7 +2577,7 @@ int decay_pcp_high(struct zone *zone, struct per_cpu_= pages *pcp) pcp->high =3D max3(pcp->count - (batch << CONFIG_PCP_BATCH_SCALE_MAX), pcp->high - (pcp->high >> 3), high_min); if (pcp->high > high_min) - todo++; + todo =3D true; } =20 to_drain =3D pcp->count - pcp->high; @@ -2585,7 +2585,7 @@ int decay_pcp_high(struct zone *zone, struct per_cpu_= pages *pcp) spin_lock(&pcp->lock); free_pcppages_bulk(zone, to_drain, pcp, 0); spin_unlock(&pcp->lock); - todo++; + todo =3D true; } =20 return todo; diff --git a/mm/vmstat.c b/mm/vmstat.c index 71cd1ceba191..1f74a3517ab2 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -771,25 +771,25 @@ EXPORT_SYMBOL(dec_node_page_state); =20 /* * Fold a differential into the global counters. - * Returns the number of counters updated. + * Returns whether counters were updated. */ static int fold_diff(int *zone_diff, int *node_diff) { int i; - int changes =3D 0; + bool changed =3D false; =20 for (i =3D 0; i < NR_VM_ZONE_STAT_ITEMS; i++) if (zone_diff[i]) { atomic_long_add(zone_diff[i], &vm_zone_stat[i]); - changes++; + changed =3D true; } =20 for (i =3D 0; i < NR_VM_NODE_STAT_ITEMS; i++) if (node_diff[i]) { atomic_long_add(node_diff[i], &vm_node_stat[i]); - changes++; + changed =3D true; } - return changes; + return changed; } =20 /* @@ -806,16 +806,16 @@ static int fold_diff(int *zone_diff, int *node_diff) * with the global counters. These could cause remote node cache line * bouncing and will have to be only done when necessary. * - * The function returns the number of global counters updated. + * The function returns whether global counters were updated. 
*/ -static int refresh_cpu_vm_stats(bool do_pagesets) +static bool refresh_cpu_vm_stats(bool do_pagesets) { struct pglist_data *pgdat; struct zone *zone; int i; int global_zone_diff[NR_VM_ZONE_STAT_ITEMS] =3D { 0, }; int global_node_diff[NR_VM_NODE_STAT_ITEMS] =3D { 0, }; - int changes =3D 0; + bool changed =3D false; =20 for_each_populated_zone(zone) { struct per_cpu_zonestat __percpu *pzstats =3D zone->per_cpu_zonestats; @@ -839,7 +839,7 @@ static int refresh_cpu_vm_stats(bool do_pagesets) if (do_pagesets) { cond_resched(); =20 - changes +=3D decay_pcp_high(zone, this_cpu_ptr(pcp)); + changed |=3D decay_pcp_high(zone, this_cpu_ptr(pcp)); #ifdef CONFIG_NUMA /* * Deal with draining the remote pageset of this @@ -861,13 +861,13 @@ static int refresh_cpu_vm_stats(bool do_pagesets) } =20 if (__this_cpu_dec_return(pcp->expire)) { - changes++; + changed =3D true; continue; } =20 if (__this_cpu_read(pcp->count)) { drain_zone_pages(zone, this_cpu_ptr(pcp)); - changes++; + changed =3D true; } #endif } @@ -887,8 +887,8 @@ static int refresh_cpu_vm_stats(bool do_pagesets) } } =20 - changes +=3D fold_diff(global_zone_diff, global_node_diff); - return changes; + changed |=3D fold_diff(global_zone_diff, global_node_diff); + return changed; } =20 /* --=20 2.47.3 From nobody Thu Oct 2 06:18:04 2025 Received: from mail-yw1-f169.google.com (mail-yw1-f169.google.com [209.85.128.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F2AB32BEC27 for ; Fri, 19 Sep 2025 19:52:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758311549; cv=none; b=S6UT3pXD54L1M+toCq8Ed54A6rwWV0feIPWoMqm+Y0mq8P71xNBngTf46+nemP5Y7J+IDs6WeSMMB5THyqX7Eko4E49kUBWwAZxhWtwu9/tC71J4r3opEEQTDqBD0pdAVYKDPexBlCofufJwexiWu34fNCNEm1SPlxMBZE+eF6o= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1758311549; c=relaxed/simple; bh=42+eESs+hNHeT/GrhhilKtr9b3BDoXjdJa2yGQYtxfQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=dJvhPLkJKQkPVrY+x+Zuik6FNLgQ7WdW4vl0MbtHyWxooFCwMnmYizH7Y8P3qlc19oiMDTnptu2eI6xZOvAyDdN7vH/Q2tPB6mmMnS+27XD5prcFyt93RmIOiEMw5zgcniks4hJzRIz8vYyiovnE9S6mWtpPqdBmep+mfUJ+Ljo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=nO/FHyLK; arc=none smtp.client-ip=209.85.128.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="nO/FHyLK" Received: by mail-yw1-f169.google.com with SMTP id 00721157ae682-72e565bf2f0so21305807b3.3 for ; Fri, 19 Sep 2025 12:52:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1758311547; x=1758916347; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MRPftj4LaqpBRfvv5a/Vndy5zTyggHCJw7J9EdAecUc=; b=nO/FHyLKhKTWszg0s8XVUsgLLVQmDHtPeNZKK0q6bd6tH5AmvxlhCMzGIFL6J+plEn MXP/K59joYB0T2Aio5H+NTcOUYzAX8a5qwTED9LpF6q9MZGFsHhmyRtXkA+H7Ztoae7M 1TK2T477QRj4Mhp7rs9jdxEYabWJwbI+RxZzC5QNWF+bI7s75mn6a/qPrGsvTJpHb9ds 23tMs6iFXx9emwldPDeohIHmAjIfaAalaIOwlEDluj7XEDufwMHAiK//TPO6BbHHoiDH dVIQnIwxi+XVWcOmiUk4EgyX8DBT4Ot4y0CL6fAog+oR/c+7MMBJGcs4LSDgHtRoTIZT KP0w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1758311547; x=1758916347; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MRPftj4LaqpBRfvv5a/Vndy5zTyggHCJw7J9EdAecUc=; b=Dvc/51ZDv/KQYPWbwzOgsMnRHMo5PLljCypk7BH0OeM53lwjoprDnDrIkoY2/gNtau bytnba4FtTzLZmRBEc2EcE3jrHb//uqZJf5CT6ytqSH7QUdo3k2aQvHzcloIE2CO55Ml Uhakh6cnjG+bbgoxeEU5RKoIfEMs0x1tZagntci5kP2fyNxJ9QeC9AWNYcpu9L59QKAI BjwAsfixPE0oV5ZHuyq+rOrcV5324D6/4qTIvXeTj7Zhs2/311ihtmEOFY5YA+wfzV2P odvcXWjvpbjK5UtSC9Lr8yPKrnP0fz9YFDr9pNPd/ACVI0erAtbkCeiekdvj7NWYV1j6 67Hw== X-Forwarded-Encrypted: i=1; AJvYcCVINBLAqZNIICWRvCRnLyNsyqE5JxX/reJdjOLgrlyMU5IWCVxytqXJLT/Ug94VTvEn8y7rJb9T1yqW4zQ=@vger.kernel.org X-Gm-Message-State: AOJu0YwCydVHV+DEe1Pb8tFNrBJEw1pQ8WCwmhalqZKEq4DLCj7Gv2ef yzGOUf9VWhqOROOA/0wvuOjYSVzRhls8mOVVOSyFNpn1bAKaf5tMAgMN X-Gm-Gg: ASbGnctCwCRZuHNzQlN+9E/URnh3UcTSgpavgPDhcBMmjLDRWKS9MJEGC645GNr/RYv k03vNYplGomUcx0DcrpXhL9MQRLpTuXHY6CQKrOq3gY8axrWUjiue7H6pmT34kMlGo8Z6ReDoxx 0VG5E45RCex9MIhLTc8Kgl2KHRl+WuKKlL55l25pb0XdA9Pe48pk0gslu2fEfr0Z4Lqmy8H77v6 ADcSedCZypBvjmRUhJov9aodsdjiXADgs6G9FKkjoLPshMqI42yWQwMtzj0yPJv9RIikaXc2eHJ 4Ah5Fg9fWkMRCtTe8WB1EP74ouMhh1L+y5mEwpGc3n0UvjP/gcuotR87tFmfS1vjZSJYgbNcx0x k/RZIxvgeT8nJcdmqHWzGjin2c4GTJO5+T/ja7nqSqM0cyT04Y7Cg2g== X-Google-Smtp-Source: AGHT+IEdgDrFewuDJoDd3/cY/EMbtsLE4/UlFxrWj3yry9gWHcvbmUqj/wKnoUCikShDc6KxoRtvUg== X-Received: by 2002:a05:690c:3507:b0:729:df2d:4a23 with SMTP id 00721157ae682-73d3a52a51amr39975377b3.32.1758311546930; Fri, 19 Sep 2025 12:52:26 -0700 (PDT) Received: from localhost ([2a03:2880:25ff:43::]) by smtp.gmail.com with ESMTPSA id 00721157ae682-739716f35c5sm16631707b3.22.2025.09.19.12.52.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Sep 2025 12:52:26 -0700 (PDT) From: Joshua Hahn To: Andrew Morton , Johannes Weiner Cc: Chris Mason , Kiryl Shutsemau , Brendan Jackman , Michal Hocko , Suren Baghdasaryan , Vlastimil Babka , Zi Yan , linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-team@meta.com Subject: [PATCH 2/4] mm/page_alloc: Perform 
appropriate batching in drain_pages_zone Date: Fri, 19 Sep 2025 12:52:20 -0700 Message-ID: <20250919195223.1560636-3-joshua.hahnjy@gmail.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250919195223.1560636-1-joshua.hahnjy@gmail.com> References: <20250919195223.1560636-1-joshua.hahnjy@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" drain_pages_zone completely drains a zone of its pcp free pages by repeatedly calling free_pcppages_bulk until pcp->count reaches 0. The loop already batches its calls so that free_pcppages_bulk isn't asked to free too many pages at once, and it releases and reacquires the pcp lock between calls so that other processes are not starved of the lock. However, the current batch size is too large to prevent lock starvation. Each batch is pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX pages, which has been seen in Meta workloads to be up to 64 << 5 == 2048 pages. While it is true that CONFIG_PCP_BATCH_SCALE_MAX is a config option that the system admin can set to any value from 0 to 6, its default value of 5 is still too high to be reasonable for any system. Instead, let's create batches of pcp->batch pages, which gives a more reasonable 64 pages per call to free_pcppages_bulk. This gives other processes a chance to grab the lock and prevents starvation. Each individual call to drain_pages_zone may take longer, but we avoid the worst-case scenario of completely starving out other system-critical threads from acquiring the pcp lock while 2048 pages are freed one by one. 
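To make the arithmetic above concrete, here is a small user-space sketch of the drain loop. It is only a model: model_drain_rounds is a hypothetical helper, the lock is represented by a comment, and the figures 64 and 5 are the pcp->batch and CONFIG_PCP_BATCH_SCALE_MAX values quoted in this message.

```c
#include <assert.h>

/*
 * Hypothetical model of the drain loop: returns how many lock/unlock
 * rounds are needed to drain `count` pages when each round frees at most
 * `chunk` pages, and stores the largest number of pages freed under a
 * single lock hold in *max_per_hold.
 */
static int model_drain_rounds(int count, int chunk, int *max_per_hold)
{
	int rounds = 0;

	assert(chunk > 0 && max_per_hold);
	*max_per_hold = 0;
	while (count > 0) {
		int to_drain = count < chunk ? count : chunk;

		/* In the kernel, pcp->lock is held only around this step. */
		if (to_drain > *max_per_hold)
			*max_per_hold = to_drain;
		count -= to_drain;
		rounds++;
	}
	return rounds;
}
```

With 2048 pages queued, a chunk of 64 << 5 drains everything in one lock hold of 2048 pages, while a chunk of 64 takes 32 rounds of 64 pages each, so waiters get 31 additional chances to take the lock.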
Signed-off-by: Joshua Hahn --- mm/page_alloc.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 77e7d9a5f149..b861b647f184 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2623,8 +2623,7 @@ static void drain_pages_zone(unsigned int cpu, struct= zone *zone) spin_lock(&pcp->lock); count =3D pcp->count; if (count) { - int to_drain =3D min(count, - pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX); + int to_drain =3D min(count, pcp->batch); =20 free_pcppages_bulk(zone, to_drain, pcp, 0); count -=3D to_drain; --=20 2.47.3 From nobody Thu Oct 2 06:18:04 2025 Received: from mail-yb1-f179.google.com (mail-yb1-f179.google.com [209.85.219.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2A0F2322753 for ; Fri, 19 Sep 2025 19:52:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758311552; cv=none; b=r81K4ducVjPzlpypOq9U1t+dbTRtl7VjRfORct7foL7Md6WFaKZXRmnh4rlYzNb2NiwQtRday72kQ4Ddmbs+6ng2/yi9mNWn1RJ48m1i6uuJKSnkR6AlysEzk38s0lsg+nLlBnwQ3dT02Ly2bh9JXeAgVb8GlYBuEkSYpjJQUQo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758311552; c=relaxed/simple; bh=/DJi6ZU0FtcXqFd9Nu5mKpK8PpkyQYwpqq9uwgCiJuU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=HmEh8WKv4uEXAA+2lVMtmUOA7D9EP5Aj9u+MsvVSBKfTIpgGrwiwszFT3VKZZhmADjMxWe4GtqOTRbGovkBS+r/yXjowqMq8Pm/pn+09uIqQU6MvvcGRqJnE7yRL1GvqGI9/llICoNH1KW9PvDmuykEVIZNHP4CyYIK8raLQus0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=YWf1+A62; arc=none smtp.client-ip=209.85.219.179 Authentication-Results: smtp.subspace.kernel.org; 
dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="YWf1+A62" Received: by mail-yb1-f179.google.com with SMTP id 3f1490d57ef6-ea5b96d2488so1878138276.0 for ; Fri, 19 Sep 2025 12:52:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1758311548; x=1758916348; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=QeFP7YZ+0ipT42W2voI8x5bVWJQwIjHVnEimGCqA+fo=; b=YWf1+A62Bv6ktG7AtHEj7U4gELZJFiVV1IX6XEAIQgPM5yHsKlxY5vMXQApXxlM+AY 4zawzFL237Ivi+kG+tk/PpD2MgZCl288HMwT9zKinOYtbDKOyqUvjnuboPRX6TBStMQb PVXm4xB6pUsymFB2p4Xp/0ylDW6ee1qpGKKNv3lauZHkbiVHJLThnEIAqSZ6oVkr3dWV qXIrjcTRw6xTQHD6pCLz33tcd4QsXVM/mnQ5ZmjlySM1YMc4GlPyQPmJS3OGlWAo6UHJ 9BHulcubeeST8V/SPXjIsN0AVwufdUS4epLBFzvvkAv4t/seTClFLWzETpocF7yV2Wjy VChQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1758311548; x=1758916348; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QeFP7YZ+0ipT42W2voI8x5bVWJQwIjHVnEimGCqA+fo=; b=Eiqapdz6vGQkSg/mK975X3LpOcWUnCvSKLzIiF4haGGvujSyNAjUrLu92TGXghLcYP WxR//sl82g0YFYMdT/fbqQN+9+nUu1OUvP57yM1dBT7xYLTY6lX30GcH+KNAkxOseMEP ZnZbX612SONSrbVueu6k1DgaL82uLCXdVO7utOctDSPNi7UtKtmPetQ+BSIPPMVgjklo 4IEytFOWEOOiBaGpLLdLkEUuwdXVl8MeIQv36bAjj37FqiBvtOHghbSpV3KH8AEMDTVK q1DGHObazBxXxf+A2k3PfiwN9JxKplrcykCZbwyUlGQgxeoeO9orgu+zYwxmgYbEC53h Q1ig== X-Forwarded-Encrypted: i=1; AJvYcCXiNL615kVdEzCiv6LhR9JQUFCbgjb58dL1G6IPrqidiA8qI4gqdINMbqGBqd1CQWmoWQKmLPRJMckmq4M=@vger.kernel.org X-Gm-Message-State: AOJu0Yw6cUtneOijUVqQ5UK3Vq9YHunzvhjt2tGWGAtPQP5+d/iRR5g2 
0Jap3JEVP/KR3vgiQ98jBnJHFILbNPGSQgoH9qMJ/8tmRoWWl5KItc4AQ/E7Qw== X-Gm-Gg: ASbGncsy2H9g9QQWR5inj7qlKmyBd9qyl8JCk4zUpsO70sCU8wbzwMfIKPoGXg2McrT 8UnbBUgh7lv2hLz9v5HQX6YedI7haHTa6uX/5UgjE2IqXODlWLM8UbfXYgTnZt3QN4XfZLAZhE7 PYcSJG0bX0cW4dtOGW99KVDUhBa1fdlbBd8yhykm07CtADUJOVsy60Q6PweUaTrI9EkoLaTafgI qGObFOfykxTd+o7y+lcuosdtPQjcIUeR8uaryXIm3H71HEb/v3GX9LElzFL0792q2ZTH8n8zCkv /gh2ymZ+7rxKhUks+qk8lYNbXlz/o+um7anUfvogT1/motBz+VyVX46OH1K0wfgvj3YY5r/uPBz EAUnbJXA5UThsn1VQT/7PlEMWC3BgkuzvLzOj61dAqDWKXk5usbaAdw== X-Google-Smtp-Source: AGHT+IEsrmhTSR/Ouh5RyIoB6jxstfFRsj47OaTQJsuUNGkSTK4pmk8eUUNqdr9I9Afg7H4rSVdbiQ== X-Received: by 2002:a05:6902:70f:b0:ead:1e1c:2754 with SMTP id 3f1490d57ef6-ead1e2b7e0fmr137258276.42.1758311548027; Fri, 19 Sep 2025 12:52:28 -0700 (PDT) Received: from localhost ([2a03:2880:25ff:5c::]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-ea5ce974386sm1950713276.28.2025.09.19.12.52.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Sep 2025 12:52:27 -0700 (PDT) From: Joshua Hahn To: Andrew Morton Cc: Johannes Weiner , Chris Mason , Kiryl Shutsemau , Brendan Jackman , Michal Hocko , Suren Baghdasaryan , Vlastimil Babka , Zi Yan , linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-team@meta.com Subject: [PATCH 3/4] mm/page_alloc: Batch page freeing in decay_pcp_high Date: Fri, 19 Sep 2025 12:52:21 -0700 Message-ID: <20250919195223.1560636-4-joshua.hahnjy@gmail.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250919195223.1560636-1-joshua.hahnjy@gmail.com> References: <20250919195223.1560636-1-joshua.hahnjy@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" It is possible for pcp->count - pcp->high to be greatly over pcp->batch. 
When this happens, free the pages in batches of at most pcp->batch so that free_pcppages_bulk isn't called with too many pages at once, which would starve out other threads that need the pcp lock. Since we still free only the difference between the initial pcp->count and pcp->high values, the total number of pages freed does not change. Suggested-by: Chris Mason Suggested-by: Andrew Morton Co-developed-by: Johannes Weiner Signed-off-by: Joshua Hahn --- mm/page_alloc.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index b861b647f184..467e524a99df 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2563,7 +2563,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned i= nt order, */ bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp) { - int high_min, to_drain, batch; + int high_min, to_drain, to_drain_batched, batch; bool todo =3D false; =20 high_min =3D READ_ONCE(pcp->high_min); @@ -2581,11 +2581,14 @@ bool decay_pcp_high(struct zone *zone, struct per_c= pu_pages *pcp) } =20 to_drain =3D pcp->count - pcp->high; - if (to_drain > 0) { + while (to_drain > 0) { + to_drain_batched =3D min(to_drain, batch); spin_lock(&pcp->lock); - free_pcppages_bulk(zone, to_drain, pcp, 0); + free_pcppages_bulk(zone, to_drain_batched, pcp, 0); spin_unlock(&pcp->lock); todo =3D true; + + to_drain -=3D to_drain_batched; } =20 return todo; --=20 2.47.3 From nobody Thu Oct 2 06:18:04 2025 Received: from mail-yw1-f182.google.com (mail-yw1-f182.google.com [209.85.128.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2977B321434 for ; Fri, 19 Sep 2025 19:52:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758311551; cv=none; 
b=VfxernptKkx/nCld+iDYkIVSI0yP2eDp6OwaelLoN52l2we+CPbsdguQ7zxBOCxjE0OPZ2YXKVKbOmtb0y0TxZN3NShvKoEEC7Ahl73xHf55LGyF6n+OMHxKsV92GmvHcNoamvlSdXImtFL6suroz8R+Gacn2WGsVCWHCy5QnE4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758311551; c=relaxed/simple; bh=Z4+KbE6GsIZkdULkpVOcM/D3ULSNz+M1WkXs5KNc/Is=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=umIechjUHUAebZDQ27weLN+vCjq1x386EdQDNgXDhboBBOPBjImJatXF+CX8lgIc3spXEOiCwtYd1kn5ZL419zknerBbuQm8Kd1cPj3u1fulwkXgEzcqIyhPU7hxxmHZveQ5Pux+nsxuG02fD9nVz4He/BGTnsjXwVIOx1mvhJs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=N23zZ2LH; arc=none smtp.client-ip=209.85.128.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="N23zZ2LH" Received: by mail-yw1-f182.google.com with SMTP id 00721157ae682-72ce9790acdso25844087b3.0 for ; Fri, 19 Sep 2025 12:52:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1758311549; x=1758916349; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=w5514hkUU6j/LpNp+jQR2Vw3xX4bDEVjgpb6NqfXeu0=; b=N23zZ2LHPNSqxZ0UHXnZPXqgA+4NQPjtwItMiy0aIKDTpfIPwheku6BfapemLFEFlh q/y3axXJL+PJEmQiW/51o2OpCmv/oatvgXpbAZ/0txX6iPg9yscQWwrcl4B7W+mjkhFD 4RnphEdComGCfd7z7bZo/6cVqfphCOdvZKtkJdYmONtXXSg+eW73fTno6SRHghoZlqle P6SVKmnvgzTZ+ApVoddZ4pR43GuHuKjCf8uxt+kJs+CNJeUFmSJJCbpiy5B9wZT7LtCK e4MJBsADScbtcSMwbwiXPunLDiNj1k0VqF0YMsc90VIOqOQae5sKtfVvxuttOwD9NL63 +DUw== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1758311549; x=1758916349; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=w5514hkUU6j/LpNp+jQR2Vw3xX4bDEVjgpb6NqfXeu0=; b=Tmc0UEb7SdoX4QBTYjPrX6qG8Zzu3oRE3WZcjPzi8Ko4dnW7hsoNnCz6iAwmpeVEx3 AQ6L9qxBIsMycMmECg0Oa7WlpC7BgVU6VGQo1hnEW0Lb6URrd8ytEWeQU0vy7Kc3lWQD xrFRmMyNbBJKCeCja7jR429gN8YSgSSyRpwHa6PhStrIu17iCzvUCbUjQoLOic+bDHBO L/dfolYA9IMA9JO9OXrB2+zwuid9rskfX+zm1BkHodIYxy6yrnF4tFJqiSQiX3+U463v 3+f1B+gmfVES+CQku5icb4AemC64b0Ey1RHkIG4dqsRkBqWuwZ8c7Fi+PI0DbNT0ahxA +rbw== X-Forwarded-Encrypted: i=1; AJvYcCWwjF2QXw8dAfL6nFhUhC+9roLwekj9nPkuJb5kUm0KhAq6UewXQp7INZmf1LZ6vqLTJG1UY8UgmO810Ww=@vger.kernel.org X-Gm-Message-State: AOJu0YyyVbAVpPSAaJQmugrqhnozWEnyBye61ZBxYgq//x2Ob1msgO3H E2/J84x/9WHjvC45S0nICm3oh0GDPB34QB3o/goz/tGXt/3bB2Ua5lzF X-Gm-Gg: ASbGncvNNKDEF/f7rmGR5s8n6F0OSf4tFfBSVt5c1iiW2+aYqfRim3lfL2GmRLw1Rk+ /flnw4QCE5JbfS61/+jc7TggJxa4Fbm+sntmohrp6HK5H5yuRQgTck+6H3kMYI9lUrgLbqHgBH7 ADOQB5KjvvHAnaDTWEWS/7Vm9DdH67VvAuBtEnD2GpAHZVJHc20vFryb7nMHNLNqrBQ4HzJiH2/ m40iBCC+OHsdfXpIviUMXKME7QailjsJjlZTmTVGsQj1E+oFPoIDE02VOM/Grdu8XDQ6hl6zotr b2S3X1EW3c/XKrri/ruE9E1x75/AX48x+ZirDlSN8DIVtkefxCnZheHJacwFhRKScCtVATaRbf7 HWT7JPx7CmClbWrVrfsjxyTIdrTj0GFXSCAen/WR7MqvWZiBSpzvo X-Google-Smtp-Source: AGHT+IEVBz0LvxtgrKD8uQV/kHRpMtKxa5IqUhA+Ja3x1EMKQQ8NgxEBIv4STTROBqXMAS7ua4+BRw== X-Received: by 2002:a05:690c:6812:b0:72f:d215:60c9 with SMTP id 00721157ae682-73d3c0162afmr46956397b3.26.1758311549074; Fri, 19 Sep 2025 12:52:29 -0700 (PDT) Received: from localhost ([2a03:2880:25ff:e::]) by smtp.gmail.com with ESMTPSA id 00721157ae682-739716caf81sm16662397b3.1.2025.09.19.12.52.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Sep 2025 12:52:28 -0700 (PDT) From: Joshua Hahn To: Andrew Morton , Johannes Weiner Cc: Chris Mason , Kiryl Shutsemau , Brendan Jackman , 
Michal Hocko , Suren Baghdasaryan , Vlastimil Babka , Zi Yan , linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-team@meta.com Subject: [PATCH 4/4] mm/page_alloc: Batch page freeing in free_frozen_page_commit Date: Fri, 19 Sep 2025 12:52:22 -0700 Message-ID: <20250919195223.1560636-5-joshua.hahnjy@gmail.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250919195223.1560636-1-joshua.hahnjy@gmail.com> References: <20250919195223.1560636-1-joshua.hahnjy@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Before returning, free_frozen_page_commit calls free_pcppages_bulk, using nr_pcp_free to determine how many pages can appropriately be freed based on the tunable parameters stored in pcp. While this number accurately represents how many pages should be freed in total, it is not an appropriate number of pages to free in a single call to free_pcppages_bulk: we have seen the value consistently go above 2000 in the Meta fleet on larger machines. As such, free the pages in batches of at most pcp->batch pages per call to free_pcppages_bulk. To ensure that other processes are not starved of the pcp (and zone) lock, release the pcp lock between batches and reacquire it with a trylock. Note that because free_frozen_page_commit now retakes the pcp spinlock inside the function, and that trylock can fail, the function may now return with the pcp unlocked. To handle this, return true if the pcp is locked on exit and false otherwise. 
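The locking contract described above can be sketched in user space as follows. This is a single-threaded model under stated assumptions, not the kernel code: struct model_pcp and all model_* functions are hypothetical, the lock is a plain flag, and the "contended" field simply pretends that another CPU takes the lock as soon as it is released.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded stand-in for struct per_cpu_pages. */
struct model_pcp {
	bool locked;	/* stands in for pcp->lock */
	bool contended;	/* assumption: another CPU wins the lock once released */
	int count;
	int high;
	int batch;
};

static bool model_trylock(struct model_pcp *p)
{
	if (p->locked || p->contended)
		return false;
	p->locked = true;
	return true;
}

static void model_unlock(struct model_pcp *p)
{
	p->locked = false;
}

/*
 * Caller holds the lock on entry. Frees in batches and drops the lock
 * between batches. Returns true if the lock is still held on return,
 * false if the trylock failed and the lock was lost.
 */
static bool model_commit(struct model_pcp *p, int to_free)
{
	assert(p->locked && p->batch > 0);
	while (to_free > 0 && p->count >= p->high) {
		int want = to_free < p->batch ? to_free : p->batch;
		int freed = want < p->count ? want : p->count;

		p->count -= freed;
		to_free -= want;
		if (p->count >= p->high) {
			model_unlock(p);
			if (!model_trylock(p))
				return false;
		}
	}
	return true;
}

/* Caller pattern: unlock only if the callee still holds the lock. */
static void model_free_one(struct model_pcp *p, int to_free)
{
	if (model_commit(p, to_free))
		model_unlock(p);
}
```

The point of the boolean is that the caller must not unlock a pcp it no longer holds, which is why both call sites in the patch check the return value before calling pcp_spin_unlock or continuing to use pcp.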
Suggested-by: Chris Mason Co-developed-by: Johannes Weiner Signed-off-by: Joshua Hahn --- mm/page_alloc.c | 45 ++++++++++++++++++++++++++++++++++++--------- 1 file changed, 36 insertions(+), 9 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 467e524a99df..dc9412e295dc 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2821,11 +2821,19 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, s= truct zone *zone, return high; } =20 -static void free_frozen_page_commit(struct zone *zone, +/* + * Tune pcp alloc factor and adjust count & free_count. Free pages to brin= g the + * pcp's watermarks below high. + * + * May return a freed pcp, if during page freeing the pcp spinlock cannot = be + * reacquired. Return true if pcp is locked, false otherwise. + */ +static bool free_frozen_page_commit(struct zone *zone, struct per_cpu_pages *pcp, struct page *page, int migratetype, unsigned int order, fpi_t fpi_flags) { int high, batch; + int to_free, to_free_batched; int pindex; bool free_high =3D false; =20 @@ -2864,17 +2872,34 @@ static void free_frozen_page_commit(struct zone *zo= ne, * Do not attempt to take a zone lock. Let pcp->count get * over high mark temporarily. 
*/ - return; + return true; } high =3D nr_pcp_high(pcp, zone, batch, free_high); - if (pcp->count >=3D high) { - free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high), - pcp, pindex); + to_free =3D nr_pcp_free(pcp, batch, high, free_high); + while (to_free > 0 && pcp->count >=3D high) { + to_free_batched =3D min(to_free, batch); + free_pcppages_bulk(zone, to_free_batched, pcp, pindex); if (test_bit(ZONE_BELOW_HIGH, &zone->flags) && zone_watermark_ok(zone, 0, high_wmark_pages(zone), ZONE_MOVABLE, 0)) clear_bit(ZONE_BELOW_HIGH, &zone->flags); + + high =3D nr_pcp_high(pcp, zone, batch, free_high); + to_free -=3D to_free_batched; + if (pcp->count >=3D high) { + pcp_spin_unlock(pcp); + pcp_trylock_finish(UP_flags); + + pcp_trylock_prepare(UP_flags); + pcp =3D pcp_spin_trylock(zone->per_cpu_pageset); + if (!pcp) { + pcp_trylock_finish(UP_flags); + return false; + } + } } + + return true; } =20 /* @@ -2922,8 +2947,9 @@ static void __free_frozen_pages(struct page *page, un= signed int order, pcp_trylock_prepare(UP_flags); pcp =3D pcp_spin_trylock(zone->per_cpu_pageset); if (pcp) { - free_frozen_page_commit(zone, pcp, page, migratetype, order, fpi_flags); - pcp_spin_unlock(pcp); + if (free_frozen_page_commit(zone, pcp, page, migratetype, order, + fpi_flags)) + pcp_spin_unlock(pcp); } else { free_one_page(zone, page, pfn, order, fpi_flags); } @@ -3022,8 +3048,9 @@ void free_unref_folios(struct folio_batch *folios) migratetype =3D MIGRATE_MOVABLE; =20 trace_mm_page_free_batched(&folio->page); - free_frozen_page_commit(zone, pcp, &folio->page, migratetype, - order, FPI_NONE); + if (!free_frozen_page_commit(zone, pcp, &folio->page, + migratetype, order, FPI_NONE)) + pcp =3D NULL; } =20 if (pcp) { --=20 2.47.3