From nobody Wed Sep 10 02:01:36 2025
Date: Mon, 8 Sep 2025 15:15:03 -0700 (PDT)
From: Hugh Dickins
To: Andrew Morton
cc: Alexander Krabler, "Aneesh Kumar K.V", Axel Rasmussen, Chris Li,
    Christoph Hellwig, David Hildenbrand, Frederick Mayle,
    Jason Gunthorpe, Johannes Weiner, John Hubbard, Keir Fraser,
    Konstantin Khlebnikov, Li Zhe, Matthew Wilcox, Peter Xu,
    Rik van Riel, Shivank Garg, Vlastimil Babka, Wei Xu, Will Deacon,
    yangge, Yuanchu Xie, Yu Zhao, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH v2 1/6] mm/gup: check ref_count instead of lru before migration
In-Reply-To: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>
References: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>

Will Deacon reports:-

When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not
be longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the
list off to migrate_pages() for the actual migration.

It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve
stage-2 page faults from the guest) on a 6.15-based Pixel 6 device and
this results in the VM terminating prematurely.

In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.

Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page()
batch by pagevec"), mlock/munlock operations on a folio (formerly page),
are deferred. For example, mlock_folio() takes an additional reference
on the target folio before placing it into a per-cpu 'folio_batch' for
later processing by mlock_folio_batch(), which drops the refcount once
the operation is complete. Processing of the batches is coupled with the
LRU batch logic and can be forcefully drained with lru_add_drain_all()
but as long as a folio remains unprocessed on the batch, its refcount
will be elevated.

This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration
code fails to migrate a folio due to the elevated refcount from the
pending mlock operation.

Hugh Dickins adds:-

!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio
is held with an extra reference in a per-CPU LRU cache.

5.18 commit 2fbb0c10d1e8 may have made it more unreliable.
Then 6.11 commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page
before adding to LRU batch") tried to make it reliable, by moving LRU
flag clearing; but missed the mlock/munlock batches, so still unreliable
as reported.

And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.

Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.

Note on PG_private_2: ceph and nfs are still using the deprecated
PG_private_2 flag, with the aid of netfs and filemap support functions.
Although it is consistently matched by an increment of folio ref_count,
folio_expected_ref_count() intentionally does not recognize it, and ceph
folio migration currently depends on that for PG_private_2 folios to be
rejected. New references to the deprecated flag are discouraged, so do
not add it into the collect_longterm_unpinnable_folios() calculation:
but longterm pinning of transiently PG_private_2 ceph and nfs folios (an
uncommon case) may invoke a redundant lru_add_drain_all(). And this
makes easy the backport to earlier releases: up to and including 6.12,
btrfs also used PG_private_2, but without a ref_count increment.

Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm: add
folio_expected_ref_count() for reference count calculation").

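The difference between the two heuristics can be illustrated with a
small userspace model. This is only a sketch under stated assumptions,
not kernel code: struct model_folio and every model_* function below are
hypothetical stand-ins invented for illustration, and the real
folio_ref_count() and folio_expected_ref_count() account for far more
than this model does. It shows a folio whose LRU flag is set while a
per-CPU batch still holds a reference: the flag test sees nothing to
drain, while the ref_count test does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a folio; NOT the kernel's struct folio. */
struct model_folio {
	int ref_count;		/* all references currently held */
	int expected_refs;	/* references explained by mappings, page cache, etc. */
	bool lru_flag;		/* models the LRU flag */
	bool in_cpu_batch;	/* a per-CPU batch holds one extra reference */
};

/* Old heuristic (model): drain only when the LRU flag is clear. */
bool model_old_check_wants_drain(const struct model_folio *f)
{
	return !f->lru_flag;
}

/*
 * New heuristic (model): drain when the count differs from the expected
 * references plus the one pin the caller has just taken.
 */
bool model_new_check_wants_drain(const struct model_folio *f)
{
	return f->ref_count != f->expected_refs + 1;
}

/* Models a drain: the batch is processed and drops its reference. */
void model_drain(struct model_folio *f)
{
	if (f->in_cpu_batch) {
		assert(f->ref_count > 0);
		f->ref_count--;
		f->in_cpu_batch = false;
	}
}

/* A folio mapped once, pinned once, and still pending in a per-CPU batch. */
struct model_folio model_batched_folio_with_lru_flag(void)
{
	struct model_folio f = {
		.ref_count = 1 /* mapping */ + 1 /* our pin */ + 1 /* batch */,
		.expected_refs = 1,
		.lru_flag = true,
		.in_cpu_batch = true,
	};
	return f;
}
```

In this model the old check declines to drain, the new check asks for a
drain, and after model_drain() the new check is satisfied. The model
cannot show the false positives mentioned above, where the extra
reference is not held by any batch and the drain does not help.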
Reported-by: Will Deacon
Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins
Cc:
Acked-by: David Hildenbrand
Acked-by: Kiryl Shutsemau
---
 mm/gup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index adffe663594d..82aec6443c0a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
 			continue;
 		}
 
-		if (!folio_test_lru(folio) && drain_allow) {
+		if (drain_allow && folio_ref_count(folio) !=
+				   folio_expected_ref_count(folio) + 1) {
 			lru_add_drain_all();
 			drain_allow = false;
 		}
-- 
2.51.0

From nobody Wed Sep 10 02:01:36 2025
Date: Mon, 8 Sep 2025 15:16:53 -0700 (PDT)
From: Hugh Dickins
To: Andrew Morton
cc: Alexander Krabler, "Aneesh Kumar K.V", Axel Rasmussen, Chris Li,
    Christoph Hellwig, David Hildenbrand, Frederick Mayle,
    Jason Gunthorpe, Johannes Weiner, John Hubbard, Keir Fraser,
    Konstantin Khlebnikov, Li Zhe, Matthew Wilcox, Peter Xu,
    Rik van Riel, Shivank Garg, Vlastimil Babka, Wei Xu, Will Deacon,
    yangge, Yuanchu Xie, Yu Zhao, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH v2 2/6] mm/gup: local lru_add_drain() to avoid lru_add_drain_all()
In-Reply-To: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>
Message-ID: <66f2751f-283e-816d-9530-765db7edc465@google.com>
References: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>

In many cases, if collect_longterm_unpinnable_folios() does need to
drain the LRU cache to release a reference, the cache in question is on
this same CPU, and much more efficiently drained by a preliminary local
lru_add_drain(), than the later cross-CPU lru_add_drain_all().

Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration". Note for
clean backports: can take 6.16 commit a03db236aebf ("gup: optimize
longterm pin_user_pages() for large folio") first.

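The escalation from a local drain to a cross-CPU drain can be sketched
as a userspace model. This is an illustration under assumptions, not the
kernel implementation: struct drain_folio, struct drain_stats,
MODEL_NR_CPUS and the model_* functions are hypothetical names, and the
real drains process per-CPU folio batches rather than a single field.
The three-valued state (0 = nothing drained, 1 = local drain done,
2 = all CPUs drained) mirrors the 'drained' variable in the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4		/* assumed CPU count for the model */

/* Hypothetical stand-in for a folio with at most one batch reference. */
struct drain_folio {
	int ref_count;		/* all references currently held */
	int expected_refs;	/* references the caller can account for */
	int batch_cpu;		/* CPU whose batch holds a reference, or -1 */
};

/* Counts how often each kind of drain ran. */
struct drain_stats {
	int local_drains;
	int global_drains;
};

bool model_has_unexpected_ref(const struct drain_folio *f)
{
	/* "+ 1" is the pin the caller itself holds. */
	return f->ref_count != f->expected_refs + 1;
}

/* Models draining one CPU's batch: drop the reference if it is held there. */
void model_drain_cpu(struct drain_folio *f, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (f->batch_cpu == cpu) {
		f->ref_count--;
		f->batch_cpu = -1;
	}
}

/*
 * Try the cheap local drain first, and fall back to draining every CPU
 * only if the unexpected reference is still there. Returns the new
 * 'drained' state so that each stage runs at most once per call chain.
 */
int model_maybe_drain(struct drain_folio *f, int this_cpu, int drained,
		      struct drain_stats *st)
{
	if (drained == 0 && model_has_unexpected_ref(f)) {
		model_drain_cpu(f, this_cpu);
		st->local_drains++;
		drained = 1;
	}
	if (drained == 1 && model_has_unexpected_ref(f)) {
		for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
			model_drain_cpu(f, cpu);
		st->global_drains++;
		drained = 2;
	}
	return drained;
}
```

When the reference sits in the calling CPU's batch, the model stops
after the local drain and never runs the global one. When it sits on
another CPU, both stages run once. Passing the returned state into later
calls keeps either drain from being repeated for subsequent folios.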
Signed-off-by: Hugh Dickins
Cc:
Acked-by: David Hildenbrand
---
 mm/gup.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 82aec6443c0a..b47066a54f52 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2287,8 +2287,8 @@ static unsigned long collect_longterm_unpinnable_folios(
 		struct pages_or_folios *pofs)
 {
 	unsigned long collected = 0;
-	bool drain_allow = true;
 	struct folio *folio;
+	int drained = 0;
 	long i = 0;
 
 	for (folio = pofs_get_folio(pofs, i); folio;
@@ -2307,10 +2307,17 @@ static unsigned long collect_longterm_unpinnable_folios(
 			continue;
 		}
 
-		if (drain_allow && folio_ref_count(folio) !=
-				   folio_expected_ref_count(folio) + 1) {
+		if (drained == 0 &&
+				folio_ref_count(folio) !=
+				folio_expected_ref_count(folio) + 1) {
+			lru_add_drain();
+			drained = 1;
+		}
+		if (drained == 1 &&
+				folio_ref_count(folio) !=
+				folio_expected_ref_count(folio) + 1) {
 			lru_add_drain_all();
-			drain_allow = false;
+			drained = 2;
 		}
 
 		if (!folio_isolate_lru(folio))
-- 
2.51.0

From nobody Wed Sep 10 02:01:36 2025
Date: Mon, 8 Sep 2025 15:19:17 -0700 (PDT)
From: Hugh Dickins
To: Andrew Morton
cc: Alexander Krabler, "Aneesh Kumar K.V", Axel Rasmussen, Chris Li,
    Christoph Hellwig, David Hildenbrand, Frederick Mayle,
    Jason Gunthorpe, Johannes Weiner, John Hubbard, Keir Fraser,
    Konstantin Khlebnikov, Li Zhe, Matthew Wilcox, Peter Xu,
    Rik van Riel, Shivank Garg, Vlastimil Babka, Wei Xu, Will Deacon,
    yangge, Yuanchu Xie, Yu Zhao, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH v2 3/6] mm: Revert "mm/gup: clear the LRU flag of a page before adding to LRU batch"
In-Reply-To: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>
Message-ID: <05905d7b-ed14-68b1-79d8-bdec30367eba@google.com>
References: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>

This reverts commit 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9: now that
collect_longterm_unpinnable_folios() is checking ref_count instead of
lru, and mlock/munlock do not participate in the revised LRU flag
clearing, those changes are misleading, and enlarge the window during
which mlock/munlock may miss an mlock_count update.

It is possible (I'd hesitate to claim probable) that the greater
likelihood of missed mlock_count updates would explain the "Realtime
threads delayed due to kcompactd0" observed on 6.12 in the Link below.
If that is the case, this reversion will help; but a complete solution
needs also a further patch, beyond the scope of this series.

Included some 80-column cleanup around folio_batch_add_and_move().

The role of folio_test_clear_lru() (before taking per-memcg lru_lock) is
questionable since 6.13 removed mem_cgroup_move_account() etc; but
perhaps there are still some races which need it - not examined here.

Link: https://lore.kernel.org/linux-mm/DU0PR01MB10385345F7153F334100981888259A@DU0PR01MB10385.eurprd01.prod.exchangelabs.com/
Signed-off-by: Hugh Dickins
Acked-by: David Hildenbrand
Cc:
---
 mm/swap.c | 50 ++++++++++++++++++++++++++------------------------
 1 file changed, 26 insertions(+), 24 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 3632dd061beb..6ae2d5680574 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -164,6 +164,10 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 	for (i = 0; i < folio_batch_count(fbatch); i++) {
 		struct folio *folio = fbatch->folios[i];
 
+		/* block memcg migration while the folio moves between lru */
+		if (move_fn != lru_add && !folio_test_clear_lru(folio))
+			continue;
+
 		folio_lruvec_relock_irqsave(folio, &lruvec, &flags);
 		move_fn(lruvec, folio);
 
@@ -176,14 +180,10 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 }
 
 static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
-		struct folio *folio, move_fn_t move_fn,
-		bool on_lru, bool disable_irq)
+		struct folio *folio, move_fn_t move_fn, bool disable_irq)
 {
 	unsigned long flags;
 
-	if (on_lru && !folio_test_clear_lru(folio))
-		return;
-
 	folio_get(folio);
 
 	if (disable_irq)
@@ -191,8 +191,8 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
 	else
 		local_lock(&cpu_fbatches.lock);
 
-	if (!folio_batch_add(this_cpu_ptr(fbatch), folio) || folio_test_large(folio) ||
-	    lru_cache_disabled())
+	if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
+			folio_test_large(folio) || lru_cache_disabled())
 		folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
 
 	if (disable_irq)
@@ -201,13 +201,13 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
 		local_unlock(&cpu_fbatches.lock);
 }
 
-#define folio_batch_add_and_move(folio, op, on_lru)						\
-	__folio_batch_add_and_move(								\
-		&cpu_fbatches.op,								\
-		folio,										\
-		op,										\
-		on_lru,										\
-		offsetof(struct cpu_fbatches, op) >= offsetof(struct cpu_fbatches, lock_irq)	\
+#define folio_batch_add_and_move(folio, op)		\
+	__folio_batch_add_and_move(			\
+		&cpu_fbatches.op,			\
+		folio,					\
+		op,					\
+		offsetof(struct cpu_fbatches, op) >=	\
+		offsetof(struct cpu_fbatches, lock_irq)	\
 	)
 
 static void lru_move_tail(struct lruvec *lruvec, struct folio *folio)
@@ -231,10 +231,10 @@ static void lru_move_tail(struct lruvec *lruvec, struct folio *folio)
 void folio_rotate_reclaimable(struct folio *folio)
 {
 	if (folio_test_locked(folio) || folio_test_dirty(folio) ||
-	    folio_test_unevictable(folio))
+	    folio_test_unevictable(folio) || !folio_test_lru(folio))
 		return;
 
-	folio_batch_add_and_move(folio, lru_move_tail, true);
+	folio_batch_add_and_move(folio, lru_move_tail);
 }
 
 void lru_note_cost_unlock_irq(struct lruvec *lruvec, bool file,
@@ -328,10 +328,11 @@ static void folio_activate_drain(int cpu)
 
 void folio_activate(struct folio *folio)
 {
-	if (folio_test_active(folio) || folio_test_unevictable(folio))
+	if (folio_test_active(folio) || folio_test_unevictable(folio) ||
+	    !folio_test_lru(folio))
 		return;
 
-	folio_batch_add_and_move(folio, lru_activate, true);
+	folio_batch_add_and_move(folio, lru_activate);
 }
 
 #else
@@ -507,7 +508,7 @@ void folio_add_lru(struct folio *folio)
 	    lru_gen_in_fault() && !(current->flags & PF_MEMALLOC))
 		folio_set_active(folio);
 
-	folio_batch_add_and_move(folio, lru_add, false);
+	folio_batch_add_and_move(folio, lru_add);
 }
 EXPORT_SYMBOL(folio_add_lru);
 
@@ -685,13 +686,13 @@ void lru_add_drain_cpu(int cpu)
 void deactivate_file_folio(struct folio *folio)
 {
 	/* Deactivating an unevictable folio will not accelerate reclaim */
-	if (folio_test_unevictable(folio))
+	if (folio_test_unevictable(folio) || !folio_test_lru(folio))
 		return;
 
 	if (lru_gen_enabled() && lru_gen_clear_refs(folio))
 		return;
 
-	folio_batch_add_and_move(folio, lru_deactivate_file, true);
+	folio_batch_add_and_move(folio, lru_deactivate_file);
 }
 
 /*
@@ -704,13 +705,13 @@ void deactivate_file_folio(struct folio *folio)
  */
 void folio_deactivate(struct folio *folio)
 {
-	if (folio_test_unevictable(folio))
+	if (folio_test_unevictable(folio) || !folio_test_lru(folio))
 		return;
 
 	if (lru_gen_enabled() ? lru_gen_clear_refs(folio) : !folio_test_active(folio))
 		return;
 
-	folio_batch_add_and_move(folio, lru_deactivate, true);
+	folio_batch_add_and_move(folio, lru_deactivate);
 }
 
 /**
@@ -723,10 +724,11 @@ void folio_deactivate(struct folio *folio)
 void folio_mark_lazyfree(struct folio *folio)
 {
 	if (!folio_test_anon(folio) || !folio_test_swapbacked(folio) ||
+	    !folio_test_lru(folio) ||
 	    folio_test_swapcache(folio) || folio_test_unevictable(folio))
 		return;
 
-	folio_batch_add_and_move(folio, lru_lazyfree, true);
+	folio_batch_add_and_move(folio, lru_lazyfree);
 }
 
 void lru_add_drain(void)
-- 
2.51.0

From nobody Wed Sep 10 02:01:36 2025
[172.10.233.147]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e9d4019d53csm4681153276.7.2025.09.08.15.21.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Sep 2025 15:21:15 -0700 (PDT) Date: Mon, 8 Sep 2025 15:21:12 -0700 (PDT) From: Hugh Dickins To: Andrew Morton cc: Alexander Krabler , "Aneesh Kumar K.V" , Axel Rasmussen , Chris Li , Christoph Hellwig , David Hildenbrand , Frederick Mayle , Jason Gunthorpe , Johannes Weiner , John Hubbard , Keir Fraser , Konstantin Khlebnikov , Li Zhe , Matthew Wilcox , Peter Xu , Rik van Riel , Shivank Garg , Vlastimil Babka , Wei Xu , Will Deacon , yangge , Yuanchu Xie , Yu Zhao , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 4/6] mm: Revert "mm: vmscan.c: fix OOM on swap stress test" In-Reply-To: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com> Message-ID: References: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This reverts commit 0885ef4705607936fc36a38fd74356e1c465b023: that was a fix to the reverted 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9. 

Signed-off-by: Hugh Dickins
Acked-by: David Hildenbrand
Cc:
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a48aec8bfd92..674999999cd0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4507,7 +4507,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 	}
 
 	/* ineligible */
-	if (!folio_test_lru(folio) || zone > sc->reclaim_idx) {
+	if (zone > sc->reclaim_idx) {
 		gen = folio_inc_gen(lruvec, folio, false);
 		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
 		return true;
-- 
2.51.0

From nobody Wed Sep 10 02:01:36 2025
Date: Mon, 8 Sep 2025 15:23:15 -0700 (PDT)
From: Hugh Dickins
To: Andrew Morton
cc: Alexander Krabler, "Aneesh Kumar K.V", Axel Rasmussen, Chris Li,
    Christoph Hellwig, David Hildenbrand, Frederick Mayle, Jason Gunthorpe,
    Johannes Weiner, John Hubbard, Keir Fraser, Konstantin Khlebnikov,
    Li Zhe, Matthew Wilcox, Peter Xu, Rik van Riel, Shivank Garg,
    Vlastimil Babka, Wei Xu, Will Deacon, yangge, Yuanchu Xie, Yu Zhao,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 5/6] mm: folio_may_be_lru_cached() unless folio_test_large()
In-Reply-To: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>
Message-ID: <57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com>
References: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>

mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as
a large folio is added: so collect_longterm_unpinnable_folios() just
wastes effort when calling lru_add_drain[_all]() on a large folio.

But although there is good reason not to batch up PMD-sized folios,
we might well benefit from batching a small number of low-order mTHPs
(though unclear how that "small number" limitation will be implemented).

So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred
to "folio_is_batchable" because large folios can well be put on a batch:
it's just the per-CPU LRU caches, drained much later, which need care.

Marked for stable, to counter the increase in lru_add_drain_all()s
from "mm/gup: check ref_count instead of lru before migration".

Suggested-by: David Hildenbrand
Signed-off-by: Hugh Dickins
Cc:
Acked-by: David Hildenbrand
---
 include/linux/swap.h | 10 ++++++++++
 mm/gup.c             |  4 ++--
 mm/mlock.c           |  6 +++---
 mm/swap.c            |  2 +-
 4 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2fe6ed2cc3fd..7012a0f758d8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -385,6 +385,16 @@ void folio_add_lru_vma(struct folio *, struct vm_area_struct *);
 void mark_page_accessed(struct page *);
 void folio_mark_accessed(struct folio *);
 
+static inline bool folio_may_be_lru_cached(struct folio *folio)
+{
+	/*
+	 * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+	 * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+	 * will be sensible, but nobody has implemented and tested that yet.
+	 */
+	return !folio_test_large(folio);
+}
+
 extern atomic_t lru_disable_count;
 
 static inline bool lru_cache_disabled(void)
diff --git a/mm/gup.c b/mm/gup.c
index b47066a54f52..0bc4d140fc07 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2307,13 +2307,13 @@ static unsigned long collect_longterm_unpinnable_folios(
 			continue;
 		}
 
-		if (drained == 0 &&
+		if (drained == 0 && folio_may_be_lru_cached(folio) &&
 		    folio_ref_count(folio) !=
 				folio_expected_ref_count(folio) + 1) {
 			lru_add_drain();
 			drained = 1;
 		}
-		if (drained == 1 &&
+		if (drained == 1 && folio_may_be_lru_cached(folio) &&
 		    folio_ref_count(folio) !=
 				folio_expected_ref_count(folio) + 1) {
 			lru_add_drain_all();
diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..bb0776f5ef7c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
 
 	folio_get(folio);
 	if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
-	    folio_test_large(folio) || lru_cache_disabled())
+	    !folio_may_be_lru_cached(folio) || lru_cache_disabled())
 		mlock_folio_batch(fbatch);
 	local_unlock(&mlock_fbatch.lock);
 }
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio)
 
 	folio_get(folio);
 	if (!folio_batch_add(fbatch, mlock_new(folio)) ||
-	    folio_test_large(folio) || lru_cache_disabled())
+	    !folio_may_be_lru_cached(folio) || lru_cache_disabled())
 		mlock_folio_batch(fbatch);
 	local_unlock(&mlock_fbatch.lock);
 }
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
 	 */
 	folio_get(folio);
 	if (!folio_batch_add(fbatch, folio) ||
-	    folio_test_large(folio) || lru_cache_disabled())
+	    !folio_may_be_lru_cached(folio) || lru_cache_disabled())
 		mlock_folio_batch(fbatch);
 	local_unlock(&mlock_fbatch.lock);
 }
diff --git a/mm/swap.c b/mm/swap.c
index 6ae2d5680574..b74ebe865dd9 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
 		local_lock(&cpu_fbatches.lock);
 
 	if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
-	    folio_test_large(folio) || lru_cache_disabled())
+	    !folio_may_be_lru_cached(folio) || lru_cache_disabled())
 		folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
 
 	if (disable_irq)
-- 
2.51.0

From nobody Wed Sep 10 02:01:36 2025
Date: Mon, 8 Sep 2025 15:24:54 -0700 (PDT)
From: Hugh Dickins
To: Andrew Morton
cc: Alexander Krabler, "Aneesh Kumar K.V", Axel Rasmussen, Chris Li,
    Christoph Hellwig, David Hildenbrand, Frederick Mayle, Jason Gunthorpe,
    Johannes Weiner, John Hubbard, Keir Fraser, Konstantin Khlebnikov,
    Li Zhe, Matthew Wilcox, Peter Xu, Rik van Riel, Shivank Garg,
    Vlastimil Babka, Wei Xu, Will Deacon, yangge, Yuanchu Xie, Yu Zhao,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 6/6] mm: lru_add_drain_all() do local lru_add_drain() first
In-Reply-To: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>
Message-ID: <33389bf8-f79d-d4dd-b7a4-680c4aa21b23@google.com>
References: <41395944-b0e3-c3ac-d648-8ddd70451d28@google.com>

No numbers to back this up, but it seemed obvious to me, that if there
are competing lru_add_drain_all()ers, the work will be minimized if each
flushes its own local queues before locking and doing cross-CPU drains.

Signed-off-by: Hugh Dickins
Acked-by: David Hildenbrand
---
 mm/swap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/swap.c b/mm/swap.c
index b74ebe865dd9..881e53b2877e 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -834,6 +834,9 @@ static inline void __lru_add_drain_all(bool force_all_cpus)
 	 */
 	this_gen = smp_load_acquire(&lru_drain_gen);
 
+	/* It helps everyone if we do our own local drain immediately. */
+	lru_add_drain();
+
 	mutex_lock(&lock);
 
 	/*
-- 
2.51.0
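
Illustrative note, not part of the series: the batching rule from 5/6 and the drain ordering from 6/6 can be modelled in a few lines of single-threaded userspace C. This is only a sketch under stated assumptions. Every model_* name, NR_MODEL_CPUS and MODEL_BATCH_SIZE is hypothetical and exists only in this example; none of it is kernel API. It does not model locking or contention, so it shows the ordering only, not the claimed saving.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_MODEL_CPUS    4	/* hypothetical CPU count for the model */
#define MODEL_BATCH_SIZE 15	/* hypothetical per-CPU batch capacity */

struct model_folio {
	unsigned int order;	/* 0 = small folio, >0 = large folio */
	bool on_lru;		/* set once the folio has left a batch */
};

struct model_batch {
	size_t nr;
	struct model_folio *folios[MODEL_BATCH_SIZE];
};

static struct model_batch model_cpu_batch[NR_MODEL_CPUS];

/* Same decision as the predicate in 5/6: only small folios may linger. */
static bool model_may_be_lru_cached(const struct model_folio *folio)
{
	return folio->order == 0;
}

/* Move everything in one CPU's batch to the modelled LRU. */
static size_t model_drain_cpu(int cpu)
{
	struct model_batch *b = &model_cpu_batch[cpu];
	size_t moved = b->nr;

	for (size_t i = 0; i < b->nr; i++)
		b->folios[i]->on_lru = true;
	b->nr = 0;
	return moved;
}

/* Add on @cpu; drain at once if the batch is full or the folio is large. */
static void model_batch_add(int cpu, struct model_folio *folio)
{
	struct model_batch *b = &model_cpu_batch[cpu];

	assert(b->nr < MODEL_BATCH_SIZE);
	b->folios[b->nr++] = folio;
	if (b->nr == MODEL_BATCH_SIZE || !model_may_be_lru_cached(folio))
		model_drain_cpu(cpu);
}

/*
 * Ordering from 6/6: drain the caller's own CPU first, then visit the
 * others. Returns how many other CPUs still had folios to drain.
 */
static int model_drain_all(int this_cpu)
{
	int remote_work = 0;

	model_drain_cpu(this_cpu);
	for (int cpu = 0; cpu < NR_MODEL_CPUS; cpu++)
		if (model_cpu_batch[cpu].nr && model_drain_cpu(cpu))
			remote_work++;
	return remote_work;
}
```

In this model a large folio never stays in a batch, which is why a later drain on its behalf finds nothing to do; and a caller whose only pending folios sit in its own batch gets a zero return from model_drain_all(), meaning no other CPU needed a visit.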