From nobody Mon Apr 6 16:48:29 2026
From: Johannes Weiner
To: Andrew Morton
Cc: David Hildenbrand, Shakeel Butt, Yosry Ahmed, Zi Yan, "Liam R. Howlett",
    Usama Arif, Kiryl Shutsemau, Dave Chinner, Roman Gushchin,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 1/7] mm: list_lru: lock_list_lru_of_memcg() cannot return NULL if !skip_empty
Date: Wed, 18 Mar 2026 15:53:19 -0400
Message-ID: <20260318200352.1039011-2-hannes@cmpxchg.org>
In-Reply-To: <20260318200352.1039011-1-hannes@cmpxchg.org>
References: <20260318200352.1039011-1-hannes@cmpxchg.org>

skip_empty is only for the shrinker, so it can abort and skip a list
that's empty or whose cgroup is being deleted. For list additions and
deletions, the cgroup hierarchy is walked upward until a valid list_lru
head is found, falling back to the node list as a last resort. Acquiring
the lock therefore cannot fail. Remove the NULL checks in those callers.
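For reference, a condensed sketch of the walk in question (based on the
MEMCG variant of lock_list_lru_of_memcg() as it stands later in this
series; locking details elided):

	rcu_read_lock();
again:
	l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
	if (likely(l)) {
		/* lock l and return it, unless the list is marked dead */
	}
	/*
	 * Dead or missing list: a !skip_empty caller walks up the
	 * hierarchy. This terminates, as the root's list always exists.
	 */
	memcg = parent_mem_cgroup(memcg);
	goto again;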
Reviewed-by: David Hildenbrand (Arm)
Signed-off-by: Johannes Weiner
Acked-by: Shakeel Butt
Reviewed-by: Lorenzo Stoakes (Oracle)
---
 mm/list_lru.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/mm/list_lru.c b/mm/list_lru.c
index 26463ae29c64..d96fd50fc9af 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -165,8 +165,6 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
 	struct list_lru_one *l;
 
 	l = lock_list_lru_of_memcg(lru, nid, memcg, false, false);
-	if (!l)
-		return false;
 	if (list_empty(item)) {
 		list_add_tail(item, &l->list);
 		/* Set shrinker bit if the first element was added */
@@ -203,9 +201,8 @@ bool list_lru_del(struct list_lru *lru, struct list_head *item, int nid,
 {
 	struct list_lru_node *nlru = &lru->node[nid];
 	struct list_lru_one *l;
+
 	l = lock_list_lru_of_memcg(lru, nid, memcg, false, false);
-	if (!l)
-		return false;
 	if (!list_empty(item)) {
 		list_del_init(item);
 		l->nr_items--;
-- 
2.53.0

From nobody Mon Apr 6 16:48:29 2026
From: Johannes Weiner
To: Andrew Morton
Cc: David Hildenbrand, Shakeel Butt, Yosry Ahmed, Zi Yan, "Liam R. Howlett",
    Usama Arif, Kiryl Shutsemau, Dave Chinner, Roman Gushchin,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 2/7] mm: list_lru: deduplicate unlock_list_lru()
Date: Wed, 18 Mar 2026 15:53:20 -0400
Message-ID: <20260318200352.1039011-3-hannes@cmpxchg.org>
In-Reply-To: <20260318200352.1039011-1-hannes@cmpxchg.org>
References: <20260318200352.1039011-1-hannes@cmpxchg.org>

The MEMCG and !MEMCG variants of unlock_list_lru() are identical, and
lock_list_lru() open-codes the same pattern when it bails out.
Consolidate them into a single common implementation.
Reviewed-by: David Hildenbrand (Arm)
Acked-by: Shakeel Butt
Signed-off-by: Johannes Weiner
Reviewed-by: Lorenzo Stoakes (Oracle)
---
 mm/list_lru.c | 29 +++++++++--------------------
 1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/mm/list_lru.c b/mm/list_lru.c
index d96fd50fc9af..e873bc26a7ef 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -15,6 +15,14 @@
 #include "slab.h"
 #include "internal.h"
 
+static inline void unlock_list_lru(struct list_lru_one *l, bool irq_off)
+{
+	if (irq_off)
+		spin_unlock_irq(&l->lock);
+	else
+		spin_unlock(&l->lock);
+}
+
 #ifdef CONFIG_MEMCG
 static LIST_HEAD(memcg_list_lrus);
 static DEFINE_MUTEX(list_lrus_mutex);
@@ -67,10 +75,7 @@ static inline bool lock_list_lru(struct list_lru_one *l, bool irq)
 	else
 		spin_lock(&l->lock);
 	if (unlikely(READ_ONCE(l->nr_items) == LONG_MIN)) {
-		if (irq)
-			spin_unlock_irq(&l->lock);
-		else
-			spin_unlock(&l->lock);
+		unlock_list_lru(l, irq);
 		return false;
 	}
 	return true;
@@ -101,14 +106,6 @@ lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 		memcg = parent_mem_cgroup(memcg);
 	goto again;
 }
-
-static inline void unlock_list_lru(struct list_lru_one *l, bool irq_off)
-{
-	if (irq_off)
-		spin_unlock_irq(&l->lock);
-	else
-		spin_unlock(&l->lock);
-}
 #else
 static void list_lru_register(struct list_lru *lru)
 {
@@ -147,14 +144,6 @@ lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 
 	return l;
 }
-
-static inline void unlock_list_lru(struct list_lru_one *l, bool irq_off)
-{
-	if (irq_off)
-		spin_unlock_irq(&l->lock);
-	else
-		spin_unlock(&l->lock);
-}
 #endif /* CONFIG_MEMCG */
 
 /* The caller must ensure the memcg lifetime. */
-- 
2.53.0

From nobody Mon Apr 6 16:48:29 2026
From: Johannes Weiner
To: Andrew Morton
Cc: David Hildenbrand, Shakeel Butt, Yosry Ahmed, Zi Yan, "Liam R. Howlett",
    Usama Arif, Kiryl Shutsemau, Dave Chinner, Roman Gushchin,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 3/7] mm: list_lru: move list dead check to lock_list_lru_of_memcg()
Date: Wed, 18 Mar 2026 15:53:21 -0400
Message-ID: <20260318200352.1039011-4-hannes@cmpxchg.org>
In-Reply-To: <20260318200352.1039011-1-hannes@cmpxchg.org>
References: <20260318200352.1039011-1-hannes@cmpxchg.org>

Only the MEMCG variant of lock_list_lru() needs to check if there is a
race with cgroup deletion and list reparenting. Move the check to the
caller, so that the next patch can unify the lock_list_lru() variants.
Reviewed-by: David Hildenbrand (Arm)
Signed-off-by: Johannes Weiner
Acked-by: Shakeel Butt
Reviewed-by: Lorenzo Stoakes (Oracle)
---
 mm/list_lru.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/mm/list_lru.c b/mm/list_lru.c
index e873bc26a7ef..1a39ff490643 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -68,17 +68,12 @@ list_lru_from_memcg_idx(struct list_lru *lru, int nid, int idx)
 	return &lru->node[nid].lru;
 }
 
-static inline bool lock_list_lru(struct list_lru_one *l, bool irq)
+static inline void lock_list_lru(struct list_lru_one *l, bool irq)
 {
 	if (irq)
 		spin_lock_irq(&l->lock);
 	else
 		spin_lock(&l->lock);
-	if (unlikely(READ_ONCE(l->nr_items) == LONG_MIN)) {
-		unlock_list_lru(l, irq);
-		return false;
-	}
-	return true;
 }
 
 static inline struct list_lru_one *
@@ -90,9 +85,13 @@ lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 	rcu_read_lock();
 again:
 	l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
-	if (likely(l) && lock_list_lru(l, irq)) {
-		rcu_read_unlock();
-		return l;
+	if (likely(l)) {
+		lock_list_lru(l, irq);
+		if (likely(READ_ONCE(l->nr_items) != LONG_MIN)) {
+			rcu_read_unlock();
+			return l;
+		}
+		unlock_list_lru(l, irq);
 	}
 	/*
 	 * Caller may simply bail out if raced with reparenting or
-- 
2.53.0

From nobody Mon Apr 6 16:48:29 2026
From: Johannes Weiner
To: Andrew Morton
Cc: David Hildenbrand, Shakeel Butt, Yosry Ahmed, Zi Yan, "Liam R. Howlett",
    Usama Arif, Kiryl Shutsemau, Dave Chinner, Roman Gushchin,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 4/7] mm: list_lru: deduplicate lock_list_lru()
Date: Wed, 18 Mar 2026 15:53:22 -0400
Message-ID: <20260318200352.1039011-5-hannes@cmpxchg.org>
In-Reply-To: <20260318200352.1039011-1-hannes@cmpxchg.org>
References: <20260318200352.1039011-1-hannes@cmpxchg.org>

The MEMCG and !MEMCG locking paths now follow the same pattern. Share
the code.
Reviewed-by: David Hildenbrand (Arm)
Signed-off-by: Johannes Weiner
Acked-by: Shakeel Butt
Reviewed-by: Lorenzo Stoakes (Oracle)
---
 mm/list_lru.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/mm/list_lru.c b/mm/list_lru.c
index 1a39ff490643..4d74c2e9c2a5 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -15,6 +15,14 @@
 #include "slab.h"
 #include "internal.h"
 
+static inline void lock_list_lru(struct list_lru_one *l, bool irq)
+{
+	if (irq)
+		spin_lock_irq(&l->lock);
+	else
+		spin_lock(&l->lock);
+}
+
 static inline void unlock_list_lru(struct list_lru_one *l, bool irq_off)
 {
 	if (irq_off)
@@ -68,14 +76,6 @@ list_lru_from_memcg_idx(struct list_lru *lru, int nid, int idx)
 	return &lru->node[nid].lru;
 }
 
-static inline void lock_list_lru(struct list_lru_one *l, bool irq)
-{
-	if (irq)
-		spin_lock_irq(&l->lock);
-	else
-		spin_lock(&l->lock);
-}
-
 static inline struct list_lru_one *
 lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 		       bool irq, bool skip_empty)
@@ -136,10 +136,7 @@ lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 {
 	struct list_lru_one *l = &lru->node[nid].lru;
 
-	if (irq)
-		spin_lock_irq(&l->lock);
-	else
-		spin_lock(&l->lock);
+	lock_list_lru(l, irq);
 
 	return l;
 }
-- 
2.53.0

From nobody Mon Apr 6 16:48:29 2026
From: Johannes Weiner
To: Andrew Morton
Cc: David Hildenbrand, Shakeel Butt, Yosry Ahmed, Zi Yan, "Liam R. Howlett",
    Usama Arif, Kiryl Shutsemau, Dave Chinner, Roman Gushchin,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 5/7] mm: list_lru: introduce caller locking for additions and deletions
Date: Wed, 18 Mar 2026 15:53:23 -0400
Message-ID: <20260318200352.1039011-6-hannes@cmpxchg.org>
In-Reply-To: <20260318200352.1039011-1-hannes@cmpxchg.org>
References: <20260318200352.1039011-1-hannes@cmpxchg.org>

Locking is currently internal to the list_lru API. However, a caller
might want to keep auxiliary state synchronized with the LRU state. For
example, the THP shrinker uses the lock of its custom LRU to keep
PG_partially_mapped and vmstats consistent.

To allow the THP shrinker to switch to list_lru, provide normal and
irqsafe locking primitives as well as caller-locked variants of the
addition and deletion functions.
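As an illustration, the caller-locked variants are meant to be used like
this (sketch only; "obj" and its "queued" flag are hypothetical stand-ins
for the caller's object and auxiliary state, not part of this series):

	struct list_lru_one *l;

	l = list_lru_lock(lru, nid, memcg);
	if (__list_lru_add(lru, l, &obj->lru_entry, nid, memcg))
		obj->queued = true;	/* auxiliary state, same lock */
	list_lru_unlock(l);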
Reviewed-by: David Hildenbrand (Arm)
Signed-off-by: Johannes Weiner
Acked-by: Shakeel Butt
Reviewed-by: Lorenzo Stoakes (Oracle)
---
 include/linux/list_lru.h |  34 +++++++++++++
 mm/list_lru.c            | 107 +++++++++++++++++++++++++++------------
 2 files changed, 110 insertions(+), 31 deletions(-)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index fe739d35a864..4afc02deb44d 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -83,6 +83,40 @@ int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
 			 gfp_t gfp);
 void memcg_reparent_list_lrus(struct mem_cgroup *memcg, struct mem_cgroup *parent);
 
+/**
+ * list_lru_lock: lock the sublist for the given node and memcg
+ * @lru: the lru pointer
+ * @nid: the node id of the sublist to lock.
+ * @memcg: the cgroup of the sublist to lock.
+ *
+ * Returns the locked list_lru_one sublist. The caller must call
+ * list_lru_unlock() when done.
+ *
+ * You must ensure that the memcg is not freed during this call (e.g., with
+ * rcu or by taking a css refcnt).
+ *
+ * Return: the locked list_lru_one, or NULL on failure
+ */
+struct list_lru_one *list_lru_lock(struct list_lru *lru, int nid,
+				   struct mem_cgroup *memcg);
+
+/**
+ * list_lru_unlock: unlock a sublist locked by list_lru_lock()
+ * @l: the list_lru_one to unlock
+ */
+void list_lru_unlock(struct list_lru_one *l);
+
+struct list_lru_one *list_lru_lock_irqsave(struct list_lru *lru, int nid,
+		struct mem_cgroup *memcg, unsigned long *irq_flags);
+void list_lru_unlock_irqrestore(struct list_lru_one *l,
+				unsigned long *irq_flags);
+
+/* Caller-locked variants, see list_lru_add() etc for documentation */
+bool __list_lru_add(struct list_lru *lru, struct list_lru_one *l,
+		    struct list_head *item, int nid, struct mem_cgroup *memcg);
+bool __list_lru_del(struct list_lru *lru, struct list_lru_one *l,
+		    struct list_head *item, int nid);
+
 /**
  * list_lru_add: add an element to the lru list's tail
  * @lru: the lru pointer
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 4d74c2e9c2a5..b817c0f48f73 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -15,17 +15,23 @@
 #include "slab.h"
 #include "internal.h"
 
-static inline void lock_list_lru(struct list_lru_one *l, bool irq)
+static inline void lock_list_lru(struct list_lru_one *l, bool irq,
+				 unsigned long *irq_flags)
 {
-	if (irq)
+	if (irq_flags)
+		spin_lock_irqsave(&l->lock, *irq_flags);
+	else if (irq)
 		spin_lock_irq(&l->lock);
 	else
 		spin_lock(&l->lock);
 }
 
-static inline void unlock_list_lru(struct list_lru_one *l, bool irq_off)
+static inline void unlock_list_lru(struct list_lru_one *l, bool irq_off,
+				   unsigned long *irq_flags)
 {
-	if (irq_off)
+	if (irq_flags)
+		spin_unlock_irqrestore(&l->lock, *irq_flags);
+	else if (irq_off)
 		spin_unlock_irq(&l->lock);
 	else
 		spin_unlock(&l->lock);
@@ -78,7 +84,7 @@ list_lru_from_memcg_idx(struct list_lru *lru, int nid, int idx)
 
 static inline struct list_lru_one *
 lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
-		       bool irq, bool skip_empty)
+		       bool irq, unsigned long *irq_flags, bool skip_empty)
 {
 	struct list_lru_one *l;
 
@@ -86,12 +92,12 @@ lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 again:
 	l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
 	if (likely(l)) {
-		lock_list_lru(l, irq);
+		lock_list_lru(l, irq, irq_flags);
 		if (likely(READ_ONCE(l->nr_items) != LONG_MIN)) {
 			rcu_read_unlock();
 			return l;
 		}
-		unlock_list_lru(l, irq);
+		unlock_list_lru(l, irq, irq_flags);
 	}
 	/*
 	 * Caller may simply bail out if raced with reparenting or
@@ -132,37 +138,81 @@ list_lru_from_memcg_idx(struct list_lru *lru, int nid, int idx)
 
 static inline struct list_lru_one *
 lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
-		       bool irq, bool skip_empty)
+		       bool irq, unsigned long *irq_flags, bool skip_empty)
 {
 	struct list_lru_one *l = &lru->node[nid].lru;
 
-	lock_list_lru(l, irq);
+	lock_list_lru(l, irq, irq_flags);
 
 	return l;
 }
 #endif /* CONFIG_MEMCG */
 
-/* The caller must ensure the memcg lifetime. */
-bool list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
-		  struct mem_cgroup *memcg)
+struct list_lru_one *list_lru_lock(struct list_lru *lru, int nid,
+				   struct mem_cgroup *memcg)
 {
-	struct list_lru_node *nlru = &lru->node[nid];
-	struct list_lru_one *l;
+	return lock_list_lru_of_memcg(lru, nid, memcg, /*irq=*/false,
+				      /*irq_flags=*/NULL, /*skip_empty=*/false);
+}
+
+void list_lru_unlock(struct list_lru_one *l)
+{
+	unlock_list_lru(l, /*irq_off=*/false, /*irq_flags=*/NULL);
+}
+
+struct list_lru_one *list_lru_lock_irqsave(struct list_lru *lru, int nid,
+					   struct mem_cgroup *memcg,
+					   unsigned long *flags)
+{
+	return lock_list_lru_of_memcg(lru, nid, memcg, /*irq=*/true,
+				      /*irq_flags=*/flags, /*skip_empty=*/false);
+}
+
+void list_lru_unlock_irqrestore(struct list_lru_one *l, unsigned long *flags)
+{
+	unlock_list_lru(l, /*irq_off=*/true, /*irq_flags=*/flags);
+}
 
-	l = lock_list_lru_of_memcg(lru, nid, memcg, false, false);
+bool __list_lru_add(struct list_lru *lru, struct list_lru_one *l,
+		    struct list_head *item, int nid,
+		    struct mem_cgroup *memcg)
+{
 	if (list_empty(item)) {
 		list_add_tail(item, &l->list);
 		/* Set shrinker bit if the first element was added */
 		if (!l->nr_items++)
 			set_shrinker_bit(memcg, nid, lru_shrinker_id(lru));
-		unlock_list_lru(l, false);
-		atomic_long_inc(&nlru->nr_items);
+		atomic_long_inc(&lru->node[nid].nr_items);
+		return true;
+	}
+	return false;
+}
+
+bool __list_lru_del(struct list_lru *lru, struct list_lru_one *l,
+		    struct list_head *item, int nid)
+{
+	if (!list_empty(item)) {
+		list_del_init(item);
+		l->nr_items--;
+		atomic_long_dec(&lru->node[nid].nr_items);
 		return true;
 	}
-	unlock_list_lru(l, false);
 	return false;
 }
 
+/* The caller must ensure the memcg lifetime. */
+bool list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
+		  struct mem_cgroup *memcg)
+{
+	struct list_lru_one *l;
+	bool ret;
+
+	l = list_lru_lock(lru, nid, memcg);
+	ret = __list_lru_add(lru, l, item, nid, memcg);
+	list_lru_unlock(l);
+	return ret;
+}
+
 bool list_lru_add_obj(struct list_lru *lru, struct list_head *item)
 {
 	bool ret;
@@ -184,19 +234,13 @@ EXPORT_SYMBOL_GPL(list_lru_add_obj);
 bool list_lru_del(struct list_lru *lru, struct list_head *item, int nid,
 		  struct mem_cgroup *memcg)
 {
-	struct list_lru_node *nlru = &lru->node[nid];
 	struct list_lru_one *l;
+	bool ret;
 
-	l = lock_list_lru_of_memcg(lru, nid, memcg, false, false);
-	if (!list_empty(item)) {
-		list_del_init(item);
-		l->nr_items--;
-		unlock_list_lru(l, false);
-		atomic_long_dec(&nlru->nr_items);
-		return true;
-	}
-	unlock_list_lru(l, false);
-	return false;
+	l = list_lru_lock(lru, nid, memcg);
+	ret = __list_lru_del(lru, l, item, nid);
+	list_lru_unlock(l);
+	return ret;
 }
 
 bool list_lru_del_obj(struct list_lru *lru, struct list_head *item)
@@ -269,7 +313,8 @@ __list_lru_walk_one(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 	unsigned long isolated = 0;
 
 restart:
-	l = lock_list_lru_of_memcg(lru, nid, memcg, irq_off, true);
+	l = lock_list_lru_of_memcg(lru, nid, memcg, /*irq=*/irq_off,
+				   /*irq_flags=*/NULL, /*skip_empty=*/true);
 	if (!l)
 		return isolated;
 	list_for_each_safe(item, n, &l->list) {
@@ -310,7 +355,7 @@ __list_lru_walk_one(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 			BUG();
 		}
 	}
-	unlock_list_lru(l, irq_off);
+	unlock_list_lru(l, irq_off, NULL);
 out:
 	return isolated;
 }
-- 
2.53.0

From nobody Mon Apr 6 16:48:29 2026
From: Johannes Weiner
To: Andrew Morton
Cc: David Hildenbrand, Shakeel Butt, Yosry Ahmed, Zi Yan, "Liam R. Howlett",
    Usama Arif, Kiryl Shutsemau, Dave Chinner, Roman Gushchin,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 6/7] mm: list_lru: introduce folio_memcg_list_lru_alloc()
Date: Wed, 18 Mar 2026 15:53:24 -0400
Message-ID: <20260318200352.1039011-7-hannes@cmpxchg.org>
In-Reply-To: <20260318200352.1039011-1-hannes@cmpxchg.org>
References: <20260318200352.1039011-1-hannes@cmpxchg.org>

memcg_list_lru_alloc() is called every time an object that may end up
on the list_lru is created. It needs to quickly check if the list_lru
heads for the memcg already exist, and allocate them when they don't.

Doing this with folio objects is tricky: folio_memcg() is not stable
and requires either RCU protection or pinning the cgroup. But it's
desirable to make the existence check lightweight under RCU, and only
pin the memcg when we need to allocate list_lru heads and may block.

In preparation for switching the THP shrinker to list_lru, add a
helper function for allocating list_lru heads coming from a folio.
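A sketch of the intended call site, right after a splittable folio has
been allocated and charged ("my_lru" is a stand-in for whatever list_lru
the caller maintains; the concrete call sites are added in the next
patch):

	if (folio_memcg_list_lru_alloc(folio, &my_lru, gfp)) {
		/* allocating the list_lru heads failed, back out */
		folio_put(folio);
		return NULL;
	}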
Reviewed-by: David Hildenbrand (Arm)
Signed-off-by: Johannes Weiner
Acked-by: Shakeel Butt
Reviewed-by: Lorenzo Stoakes (Oracle)
---
 include/linux/list_lru.h | 12 ++++++++++++
 mm/list_lru.c            | 39 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 46 insertions(+), 5 deletions(-)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 4afc02deb44d..4bd29b61c59a 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -81,6 +81,18 @@ static inline int list_lru_init_memcg_key(struct list_lru *lru, struct shrinker
 
 int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
 			 gfp_t gfp);
+
+#ifdef CONFIG_MEMCG
+int folio_memcg_list_lru_alloc(struct folio *folio, struct list_lru *lru,
+			       gfp_t gfp);
+#else
+static inline int folio_memcg_list_lru_alloc(struct folio *folio,
+					     struct list_lru *lru, gfp_t gfp)
+{
+	return 0;
+}
+#endif
+
 void memcg_reparent_list_lrus(struct mem_cgroup *memcg, struct mem_cgroup *parent);
 
 /**
diff --git a/mm/list_lru.c b/mm/list_lru.c
index b817c0f48f73..1ccdd45b1d14 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -537,17 +537,14 @@ static inline bool memcg_list_lru_allocated(struct mem_cgroup *memcg,
 	return idx < 0 || xa_load(&lru->xa, idx);
 }
 
-int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
-			 gfp_t gfp)
+static int __memcg_list_lru_alloc(struct mem_cgroup *memcg,
+				  struct list_lru *lru, gfp_t gfp)
 {
 	unsigned long flags;
 	struct list_lru_memcg *mlru = NULL;
 	struct mem_cgroup *pos, *parent;
 	XA_STATE(xas, &lru->xa, 0);
 
-	if (!list_lru_memcg_aware(lru) || memcg_list_lru_allocated(memcg, lru))
-		return 0;
-
 	gfp &= GFP_RECLAIM_MASK;
 	/*
 	 * Because the list_lru can be reparented to the parent cgroup's
@@ -588,6 +585,38 @@ int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
 
 	return xas_error(&xas);
 }
+
+int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
+			 gfp_t gfp)
+{
+	if (!list_lru_memcg_aware(lru) || memcg_list_lru_allocated(memcg, lru))
+		return 0;
+	return __memcg_list_lru_alloc(memcg, lru, gfp);
+}
+
+int folio_memcg_list_lru_alloc(struct folio *folio, struct list_lru *lru,
+			       gfp_t gfp)
+{
+	struct mem_cgroup *memcg;
+	int res;
+
+	if (!list_lru_memcg_aware(lru))
+		return 0;
+
+	/* Fast path when list_lru heads already exist */
+	rcu_read_lock();
+	memcg = folio_memcg(folio);
+	res = memcg_list_lru_allocated(memcg, lru);
+	rcu_read_unlock();
+	if (likely(res))
+		return 0;
+
+	/* Allocation may block, pin the memcg */
+	memcg = get_mem_cgroup_from_folio(folio);
+	res = __memcg_list_lru_alloc(memcg, lru, gfp);
+	mem_cgroup_put(memcg);
+	return res;
+}
 #else
 static inline void memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
 {
-- 
2.53.0

From nobody Mon Apr 6 16:48:29 2026
From: Johannes Weiner
To: Andrew Morton
Cc: David Hildenbrand, Shakeel Butt, Yosry Ahmed, Zi Yan, "Liam R. Howlett",
    Usama Arif, Kiryl Shutsemau, Dave Chinner, Roman Gushchin,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 7/7] mm: switch deferred split shrinker to list_lru
Date: Wed, 18 Mar 2026 15:53:25 -0400
Message-ID: <20260318200352.1039011-8-hannes@cmpxchg.org>
In-Reply-To: <20260318200352.1039011-1-hannes@cmpxchg.org>
References: <20260318200352.1039011-1-hannes@cmpxchg.org>

The deferred split queue handles cgroups in a suboptimal fashion: the
queue is per-NUMA-node or per-cgroup, not the intersection of the two.
That means that on a cgrouped system, a node-restricted allocation
entering reclaim can end up splitting large pages on other nodes:

alloc/unmap
  deferred_split_folio()
    list_add_tail(memcg->split_queue)
    set_shrinker_bit(memcg, node, deferred_shrinker_id)

for_each_zone_zonelist_nodemask(restricted_nodes)
  mem_cgroup_iter()
    shrink_slab(node, memcg)
      shrink_slab_memcg(node, memcg)
        if test_shrinker_bit(memcg, node, deferred_shrinker_id)
          deferred_split_scan()
            walks memcg->split_queue

The shrinker bit is only an imperfect guard rail: as soon as the cgroup
has a single large page on the node of interest, all large pages owned
by that memcg, including those on other nodes, will be split.

list_lru properly sets up per-node, per-cgroup lists. As a bonus, it
streamlines a lot of the list operations and reclaim walks, and it is
already used widely by other major shrinkers. Convert the deferred
split queue as well.

The list_lru per-memcg heads are instantiated on demand when the first
object of interest is allocated for a cgroup, by calling
folio_memcg_list_lru_alloc(). Add calls where splittable pages are
created: anon faults, swapin faults, khugepaged collapse. These calls
create all possible node heads for the cgroup at once, so the migration
code (between nodes) doesn't need any special care.
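For illustration, with the primitives from earlier in this series, the
enqueue side of the conversion can be sketched roughly as follows (a
simplified sketch, not the exact deferred_split_folio() changes, which
are in the diff below):

	unsigned long flags;
	struct list_lru_one *l;

	l = list_lru_lock_irqsave(&deferred_split_lru, folio_nid(folio),
				  folio_memcg(folio), &flags);
	__list_lru_add(&deferred_split_lru, l, &folio->_deferred_list,
		       folio_nid(folio), folio_memcg(folio));
	/* PG_partially_mapped and vmstat updates stay under the same lock */
	list_lru_unlock_irqrestore(l, &flags);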
Signed-off-by: Johannes Weiner
Acked-by: Shakeel Butt
---
 include/linux/huge_mm.h    |   6 +-
 include/linux/memcontrol.h |   4 -
 include/linux/mmzone.h     |  12 --
 mm/huge_memory.c           | 342 ++++++++++++-------------------------
 mm/internal.h              |   2 +-
 mm/khugepaged.c            |   7 +
 mm/memcontrol.c            |  12 +-
 mm/memory.c                |  52 +++---
 mm/mm_init.c               |  15 --
 9 files changed, 151 insertions(+), 301 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index bd7f0e1d8094..8d801ed378db 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -414,10 +414,9 @@ static inline int split_huge_page(struct page *page)
 {
 	return split_huge_page_to_list_to_order(page, NULL, 0);
 }
+
+extern struct list_lru deferred_split_lru;
 void deferred_split_folio(struct folio *folio, bool partially_mapped);
-#ifdef CONFIG_MEMCG
-void reparent_deferred_split_queue(struct mem_cgroup *memcg);
-#endif
 
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long address, bool freeze);
@@ -650,7 +649,6 @@ static inline int try_folio_split_to_order(struct folio *folio,
 }
 
 static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
-static inline void reparent_deferred_split_queue(struct mem_cgroup *memcg) {}
 #define split_huge_pmd(__vma, __pmd, __address)	\
 	do { } while (0)
 
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 086158969529..0782c72a1997 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -277,10 +277,6 @@ struct mem_cgroup {
 	struct memcg_cgwb_frn cgwb_frn[MEMCG_CGWB_FRN_CNT];
 #endif
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	struct deferred_split deferred_split_queue;
-#endif
-
 #ifdef CONFIG_LRU_GEN_WALKS_MMU
 	/* per-memcg mm_struct list */
 	struct lru_gen_mm_list mm_list;
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 7bd0134c241c..232b7a71fd69 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1429,14 +1429,6 @@ struct zonelist {
  */
 extern struct page *mem_map;
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-struct deferred_split {
-	spinlock_t split_queue_lock;
-	struct list_head split_queue;
-	unsigned long split_queue_len;
-};
-#endif
-
 #ifdef CONFIG_MEMORY_FAILURE
 /*
  * Per NUMA node memory failure handling statistics.
@@ -1562,10 +1554,6 @@ typedef struct pglist_data {
 	unsigned long first_deferred_pfn;
 #endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	struct deferred_split deferred_split_queue;
-#endif
-
 #ifdef CONFIG_NUMA_BALANCING
 	/* start time in ms of current promote rate limit period */
 	unsigned int nbp_rl_start;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3fc02913b63e..e90d08db219d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include <linux/list_lru.h>
 #include
 #include
 #include
@@ -67,6 +68,8 @@ unsigned long transparent_hugepage_flags __read_mostly =
 	(1<<
[...]
 	deferred_split_shrinker->count_objects = deferred_split_count;
 	deferred_split_shrinker->scan_objects = deferred_split_scan;
 	shrinker_register(deferred_split_shrinker);
@@ -939,6 +949,7 @@ static int __init thp_shrinker_init(void)
 
 	huge_zero_folio_shrinker = shrinker_alloc(0, "thp-zero");
 	if (!huge_zero_folio_shrinker) {
+		list_lru_destroy(&deferred_split_lru);
 		shrinker_free(deferred_split_shrinker);
 		return -ENOMEM;
 	}
@@ -953,6 +964,7 @@
 static void __init thp_shrinker_exit(void)
 {
 	shrinker_free(huge_zero_folio_shrinker);
+	list_lru_destroy(&deferred_split_lru);
 	shrinker_free(deferred_split_shrinker);
 }
 
@@ -1133,119 +1145,6 @@ pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
 	return pmd;
 }
 
-static struct deferred_split *split_queue_node(int nid)
-{
-	struct pglist_data *pgdata = NODE_DATA(nid);
-
-	return &pgdata->deferred_split_queue;
-}
-
-#ifdef CONFIG_MEMCG
-static inline
-struct mem_cgroup *folio_split_queue_memcg(struct folio *folio,
-					   struct deferred_split *queue)
-{
-	if (mem_cgroup_disabled())
-		return NULL;
-	if (split_queue_node(folio_nid(folio)) == queue)
-		return NULL;
-	return container_of(queue, struct mem_cgroup, deferred_split_queue);
-}
-
-static struct deferred_split *memcg_split_queue(int nid, struct mem_cgroup *memcg)
-{
-	return memcg ? &memcg->deferred_split_queue : split_queue_node(nid);
-}
-#else
-static inline
-struct mem_cgroup *folio_split_queue_memcg(struct folio *folio,
-					   struct deferred_split *queue)
-{
-	return NULL;
-}
-
-static struct deferred_split *memcg_split_queue(int nid, struct mem_cgroup *memcg)
-{
-	return split_queue_node(nid);
-}
-#endif
-
-static struct deferred_split *split_queue_lock(int nid, struct mem_cgroup *memcg)
-{
-	struct deferred_split *queue;
-
-retry:
-	queue = memcg_split_queue(nid, memcg);
-	spin_lock(&queue->split_queue_lock);
-	/*
-	 * There is a period between setting memcg to dying and reparenting
-	 * deferred split queue, and during this period the THPs in the deferred
-	 * split queue will be hidden from the shrinker side.
-	 */
-	if (unlikely(memcg_is_dying(memcg))) {
-		spin_unlock(&queue->split_queue_lock);
-		memcg = parent_mem_cgroup(memcg);
-		goto retry;
-	}
-
-	return queue;
-}
-
-static struct deferred_split *
-split_queue_lock_irqsave(int nid, struct mem_cgroup *memcg, unsigned long *flags)
-{
-	struct deferred_split *queue;
-
-retry:
-	queue = memcg_split_queue(nid, memcg);
-	spin_lock_irqsave(&queue->split_queue_lock, *flags);
-	if (unlikely(memcg_is_dying(memcg))) {
-		spin_unlock_irqrestore(&queue->split_queue_lock, *flags);
-		memcg = parent_mem_cgroup(memcg);
-		goto retry;
-	}
-
-	return queue;
-}
-
-static struct deferred_split *folio_split_queue_lock(struct folio *folio)
-{
-	struct deferred_split *queue;
-
-	rcu_read_lock();
-	queue = split_queue_lock(folio_nid(folio), folio_memcg(folio));
-	/*
-	 * The memcg destruction path is acquiring the split queue lock for
-	 * reparenting. Once you have it locked, it's safe to drop the rcu lock.
-	 */
-	rcu_read_unlock();
-
-	return queue;
-}
-
-static struct deferred_split *
-folio_split_queue_lock_irqsave(struct folio *folio, unsigned long *flags)
-{
-	struct deferred_split *queue;
-
-	rcu_read_lock();
-	queue = split_queue_lock_irqsave(folio_nid(folio), folio_memcg(folio), flags);
-	rcu_read_unlock();
-
-	return queue;
-}
-
-static inline void split_queue_unlock(struct deferred_split *queue)
-{
-	spin_unlock(&queue->split_queue_lock);
-}
-
-static inline void split_queue_unlock_irqrestore(struct deferred_split *queue,
-						 unsigned long flags)
-{
-	spin_unlock_irqrestore(&queue->split_queue_lock, flags);
-}
-
 static inline bool is_transparent_hugepage(const struct folio *folio)
 {
 	if (!folio_test_large(folio))
@@ -1346,6 +1245,14 @@ static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
 		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
 		return NULL;
 	}
+
+	if (folio_memcg_list_lru_alloc(folio, &deferred_split_lru, gfp)) {
+		folio_put(folio);
+		count_vm_event(THP_FAULT_FALLBACK);
+		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
+		return NULL;
+	}
+
 	folio_throttle_swaprate(folio, gfp);
 
 	/*
@@ -3854,34 +3761,34 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 	struct folio *end_folio = folio_next(folio);
 	struct folio *new_folio, *next;
 	int old_order = folio_order(folio);
+	struct list_lru_one *l;
+	bool dequeue_deferred;
 	int ret = 0;
-	struct deferred_split *ds_queue;
 
 	VM_WARN_ON_ONCE(!mapping && end);
 	/* Prevent deferred_split_scan() touching ->_refcount */
-	ds_queue = folio_split_queue_lock(folio);
+	dequeue_deferred = folio_test_anon(folio) && old_order > 1;
+	if (dequeue_deferred) {
+		rcu_read_lock();
+		l = list_lru_lock(&deferred_split_lru,
+				  folio_nid(folio), folio_memcg(folio));
+	}
 	if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
 		struct swap_cluster_info *ci = NULL;
 		struct lruvec *lruvec;
 
-		if (old_order > 1) {
-			if (!list_empty(&folio->_deferred_list)) {
-				ds_queue->split_queue_len--;
-				/*
-				 * Reinitialize page_deferred_list after removing the
-				 * page from the split_queue, otherwise a subsequent
-				 * split will see list corruption when checking the
-				 * page_deferred_list.
-				 */
-				list_del_init(&folio->_deferred_list);
-			}
+		if (dequeue_deferred) {
+			__list_lru_del(&deferred_split_lru, l,
+				       &folio->_deferred_list, folio_nid(folio));
 			if (folio_test_partially_mapped(folio)) {
 				folio_clear_partially_mapped(folio);
 				mod_mthp_stat(old_order,
 					      MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
 			}
+			list_lru_unlock(l);
+			rcu_read_unlock();
 		}
-		split_queue_unlock(ds_queue);
+
 		if (mapping) {
 			int nr = folio_nr_pages(folio);
 
@@ -3982,7 +3889,10 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 		if (ci)
 			swap_cluster_unlock(ci);
 	} else {
-		split_queue_unlock(ds_queue);
+		if (dequeue_deferred) {
+			list_lru_unlock(l);
+			rcu_read_unlock();
+		}
 		return -EAGAIN;
 	}
 
@@ -4349,33 +4259,35 @@ int split_folio_to_list(struct folio *folio, struct list_head *list)
  * queueing THP splits, and that list is (racily observed to be) non-empty.
  *
  * It is unsafe to call folio_unqueue_deferred_split() until folio refcount is
- * zero: because even when split_queue_lock is held, a non-empty _deferred_list
- * might be in use on deferred_split_scan()'s unlocked on-stack list.
+ * zero: because even when the list_lru lock is held, a non-empty
+ * _deferred_list might be in use on deferred_split_scan()'s unlocked
+ * on-stack list.
  *
- * If memory cgroups are enabled, split_queue_lock is in the mem_cgroup: it is
- * therefore important to unqueue deferred split before changing folio memcg.
+ * The list_lru sublist is determined by folio's memcg: it is therefore
+ * important to unqueue deferred split before changing folio memcg.
  */
 bool __folio_unqueue_deferred_split(struct folio *folio)
 {
-	struct deferred_split *ds_queue;
+	struct list_lru_one *l;
+	int nid = folio_nid(folio);
 	unsigned long flags;
 	bool unqueued = false;
 
 	WARN_ON_ONCE(folio_ref_count(folio));
 	WARN_ON_ONCE(!mem_cgroup_disabled() && !folio_memcg_charged(folio));
 
-	ds_queue = folio_split_queue_lock_irqsave(folio, &flags);
-	if (!list_empty(&folio->_deferred_list)) {
-		ds_queue->split_queue_len--;
+	rcu_read_lock();
+	l = list_lru_lock_irqsave(&deferred_split_lru, nid, folio_memcg(folio), &flags);
+	if (__list_lru_del(&deferred_split_lru, l, &folio->_deferred_list, nid)) {
 		if (folio_test_partially_mapped(folio)) {
 			folio_clear_partially_mapped(folio);
 			mod_mthp_stat(folio_order(folio),
 				      MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
 		}
-		list_del_init(&folio->_deferred_list);
 		unqueued = true;
 	}
-	split_queue_unlock_irqrestore(ds_queue, flags);
+	list_lru_unlock_irqrestore(l, &flags);
+	rcu_read_unlock();
 
 	return unqueued;	/* useful for debug warnings */
 }
@@ -4383,7 +4295,9 @@ bool __folio_unqueue_deferred_split(struct folio *folio)
 /* partially_mapped=false won't clear PG_partially_mapped folio flag */
 void deferred_split_folio(struct folio *folio, bool partially_mapped)
 {
-	struct deferred_split *ds_queue;
+	struct list_lru_one *l;
+	int nid;
+	struct mem_cgroup *memcg;
 	unsigned long flags;
 
 	/*
@@ -4406,7 +4320,11 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 	if (folio_test_swapcache(folio))
 		return;
 
-	ds_queue = folio_split_queue_lock_irqsave(folio, &flags);
+	nid = folio_nid(folio);
+
+	rcu_read_lock();
+	memcg = folio_memcg(folio);
+	l = list_lru_lock_irqsave(&deferred_split_lru, nid, memcg, &flags);
 	if (partially_mapped) {
 		if (!folio_test_partially_mapped(folio)) {
 			folio_set_partially_mapped(folio);
@@ -4414,36 +4332,20 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 			count_vm_event(THP_DEFERRED_SPLIT_PAGE);
 		count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
 		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
-		}
 	} else {
 		/* partially mapped folios cannot become non-partially mapped */
 		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
 	}
-	if (list_empty(&folio->_deferred_list)) {
-		struct mem_cgroup *memcg;
-
-		memcg = folio_split_queue_memcg(folio, ds_queue);
-		list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
-		ds_queue->split_queue_len++;
-		if (memcg)
-			set_shrinker_bit(memcg, folio_nid(folio),
-					 shrinker_id(deferred_split_shrinker));
-	}
-	split_queue_unlock_irqrestore(ds_queue, flags);
+	__list_lru_add(&deferred_split_lru, l, &folio->_deferred_list, nid, memcg);
+	list_lru_unlock_irqrestore(l, &flags);
+	rcu_read_unlock();
 }
 
 static unsigned long deferred_split_count(struct shrinker *shrink,
 					  struct shrink_control *sc)
 {
-	struct pglist_data *pgdata = NODE_DATA(sc->nid);
-	struct deferred_split *ds_queue = &pgdata->deferred_split_queue;
-
-#ifdef CONFIG_MEMCG
-	if (sc->memcg)
-		ds_queue = &sc->memcg->deferred_split_queue;
-#endif
-	return READ_ONCE(ds_queue->split_queue_len);
+	return list_lru_shrink_count(&deferred_split_lru, sc);
 }
 
 static bool thp_underused(struct folio *folio)
@@ -4473,45 +4375,47 @@ static bool thp_underused(struct folio *folio)
 	return false;
 }
 
+static enum lru_status deferred_split_isolate(struct list_head *item,
+					      struct list_lru_one *lru,
+					      void *cb_arg)
+{
+	struct folio *folio = container_of(item, struct folio, _deferred_list);
+	struct list_head *freeable = cb_arg;
+
+	if (folio_try_get(folio)) {
+		list_lru_isolate_move(lru, item, freeable);
+		return LRU_REMOVED;
+	}
+
+	/* We lost race with folio_put() */
+	list_lru_isolate(lru, item);
+	if (folio_test_partially_mapped(folio)) {
+		folio_clear_partially_mapped(folio);
+		mod_mthp_stat(folio_order(folio),
+			      MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+	}
+	return LRU_REMOVED;
+}
+
 static unsigned long deferred_split_scan(struct shrinker *shrink,
 					 struct shrink_control *sc)
 {
-	struct deferred_split *ds_queue;
-	unsigned long flags;
+	LIST_HEAD(dispose);
 	struct folio *folio, *next;
-	int split = 0, i;
-	struct folio_batch fbatch;
+	int split = 0;
+	unsigned long isolated;
 
-	folio_batch_init(&fbatch);
+	isolated = list_lru_shrink_walk_irq(&deferred_split_lru, sc,
+					    deferred_split_isolate, &dispose);
 
-retry:
-	ds_queue = split_queue_lock_irqsave(sc->nid, sc->memcg, &flags);
-	/* Take pin on all head pages to avoid freeing them under us */
-	list_for_each_entry_safe(folio, next, &ds_queue->split_queue,
-				 _deferred_list) {
-		if (folio_try_get(folio)) {
-			folio_batch_add(&fbatch, folio);
-		} else if (folio_test_partially_mapped(folio)) {
-			/* We lost race with folio_put() */
-			folio_clear_partially_mapped(folio);
-			mod_mthp_stat(folio_order(folio),
-				      MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
-		}
-		list_del_init(&folio->_deferred_list);
-		ds_queue->split_queue_len--;
-		if (!--sc->nr_to_scan)
-			break;
-		if (!folio_batch_space(&fbatch))
-			break;
-	}
-	split_queue_unlock_irqrestore(ds_queue, flags);
-
-	for (i = 0; i < folio_batch_count(&fbatch); i++) {
+	list_for_each_entry_safe(folio, next, &dispose, _deferred_list) {
 		bool did_split = false;
 		bool underused = false;
-		struct deferred_split *fqueue;
+		struct list_lru_one *l;
+		unsigned long flags;
+
+		list_del_init(&folio->_deferred_list);
 
-		folio = fbatch.folios[i];
 		if (!folio_test_partially_mapped(folio)) {
 			/*
 			 * See try_to_map_unused_to_zeropage(): we cannot
@@ -4534,64 +4438,32 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 		}
 		folio_unlock(folio);
next:
-		if (did_split || !folio_test_partially_mapped(folio))
-			continue;
 		/*
 		 * Only add back to the queue if folio is partially mapped.
 		 * If thp_underused returns false, or if split_folio fails
 		 * in the case it was underused, then consider it used and
 		 * don't add it back to split_queue.
 		 */
-		fqueue = folio_split_queue_lock_irqsave(folio, &flags);
-		if (list_empty(&folio->_deferred_list)) {
-			list_add_tail(&folio->_deferred_list, &fqueue->split_queue);
-			fqueue->split_queue_len++;
+		if (!did_split && folio_test_partially_mapped(folio)) {
+			rcu_read_lock();
+			l = list_lru_lock_irqsave(&deferred_split_lru,
+						  folio_nid(folio),
+						  folio_memcg(folio),
+						  &flags);
+			__list_lru_add(&deferred_split_lru, l,
+				       &folio->_deferred_list,
+				       folio_nid(folio), folio_memcg(folio));
+			list_lru_unlock_irqrestore(l, &flags);
+			rcu_read_unlock();
 		}
-		split_queue_unlock_irqrestore(fqueue, flags);
-	}
-	folios_put(&fbatch);
-
-	if (sc->nr_to_scan && !list_empty(&ds_queue->split_queue)) {
-		cond_resched();
-		goto retry;
+		folio_put(folio);
 	}
 
-	/*
-	 * Stop shrinker if we didn't split any page, but the queue is empty.
-	 * This can happen if pages were freed under us.
-	 */
-	if (!split && list_empty(&ds_queue->split_queue))
+	if (!split && !isolated)
 		return SHRINK_STOP;
 	return split;
 }
 
-#ifdef CONFIG_MEMCG
-void reparent_deferred_split_queue(struct mem_cgroup *memcg)
-{
-	struct mem_cgroup *parent = parent_mem_cgroup(memcg);
-	struct deferred_split *ds_queue = &memcg->deferred_split_queue;
-	struct deferred_split *parent_ds_queue = &parent->deferred_split_queue;
-	int nid;
-
-	spin_lock_irq(&ds_queue->split_queue_lock);
-	spin_lock_nested(&parent_ds_queue->split_queue_lock, SINGLE_DEPTH_NESTING);
-
-	if (!ds_queue->split_queue_len)
-		goto unlock;
-
-	list_splice_tail_init(&ds_queue->split_queue, &parent_ds_queue->split_queue);
-	parent_ds_queue->split_queue_len += ds_queue->split_queue_len;
-	ds_queue->split_queue_len = 0;
-
-	for_each_node(nid)
-		set_shrinker_bit(parent, nid, shrinker_id(deferred_split_shrinker));
-
-unlock:
-	spin_unlock(&parent_ds_queue->split_queue_lock);
-	spin_unlock_irq(&ds_queue->split_queue_lock);
-}
-#endif
-
 #ifdef CONFIG_DEBUG_FS
 static void split_huge_pages_all(void)
 {
diff --git a/mm/internal.h b/mm/internal.h
index f98f4746ac41..d8c737338df5 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -863,7 +863,7 @@ static inline bool folio_unqueue_deferred_split(struct folio *folio)
 	/*
 	 * At this point, there is no one trying to add the folio to
 	 * deferred_list. If folio is not in deferred_list, it's safe
-	 * to check without acquiring the split_queue_lock.
+	 * to check without acquiring the list_lru lock.
 	 */
 	if (data_race(list_empty(&folio->_deferred_list)))
 		return false;
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4b0e59c7c0e6..b2ac28ddd480 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1081,6 +1081,7 @@ static enum scan_result alloc_charge_folio(struct folio **foliop, struct mm_stru
 	}
 
 	count_vm_event(THP_COLLAPSE_ALLOC);
+
 	if (unlikely(mem_cgroup_charge(folio, mm, gfp))) {
 		folio_put(folio);
 		*foliop = NULL;
@@ -1089,6 +1090,12 @@ static enum scan_result alloc_charge_folio(struct folio **foliop, struct mm_stru
 
 	count_memcg_folio_events(folio, THP_COLLAPSE_ALLOC, 1);
 
+	if (folio_memcg_list_lru_alloc(folio, &deferred_split_lru, gfp)) {
+		folio_put(folio);
+		*foliop = NULL;
+		return SCAN_CGROUP_CHARGE_FAIL;
+	}
+
 	*foliop = folio;
 	return SCAN_SUCCEED;
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a47fb68dd65f..f381cb6bdff1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4015,11 +4015,6 @@ static struct mem_cgroup *mem_cgroup_alloc(struct mem_cgroup *parent)
 	for (i = 0; i < MEMCG_CGWB_FRN_CNT; i++)
 		memcg->cgwb_frn[i].done =
 			__WB_COMPLETION_INIT(&memcg_cgwb_frn_waitq);
-#endif
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	spin_lock_init(&memcg->deferred_split_queue.split_queue_lock);
-	INIT_LIST_HEAD(&memcg->deferred_split_queue.split_queue);
-	memcg->deferred_split_queue.split_queue_len = 0;
 #endif
 	lru_gen_init_memcg(memcg);
 	return memcg;
@@ -4167,11 +4162,10 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
 	zswap_memcg_offline_cleanup(memcg);
 
 	memcg_offline_kmem(memcg);
-	reparent_deferred_split_queue(memcg);
 	/*
-	 * The reparenting of objcg must be after the reparenting of the
-	 * list_lru and deferred_split_queue above, which ensures that they will
-	 * not mistakenly get the parent list_lru and deferred_split_queue.
+	 * The reparenting of objcg must be after the reparenting of
+	 * the list_lru in memcg_offline_kmem(), which ensures that
+	 * they will not mistakenly get the parent list_lru.
 	 */
 	memcg_reparent_objcgs(memcg);
 	reparent_shrinker_deferred(memcg);
diff --git a/mm/memory.c b/mm/memory.c
index 219b9bf6cae0..e68ceb4aa624 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4651,13 +4651,19 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
 	while (orders) {
 		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
 		folio = vma_alloc_folio(gfp, order, vma, addr);
-		if (folio) {
-			if (!mem_cgroup_swapin_charge_folio(folio, vma->vm_mm,
-							    gfp, entry))
-				return folio;
+		if (!folio)
+			goto next;
+		if (mem_cgroup_swapin_charge_folio(folio, vma->vm_mm, gfp, entry)) {
 			count_mthp_stat(order, MTHP_STAT_SWPIN_FALLBACK_CHARGE);
 			folio_put(folio);
+			goto next;
 		}
+		if (folio_memcg_list_lru_alloc(folio, &deferred_split_lru, gfp)) {
+			folio_put(folio);
+			goto fallback;
+		}
+		return folio;
+next:
 		count_mthp_stat(order, MTHP_STAT_SWPIN_FALLBACK);
 		order = next_order(&orders, order);
 	}
@@ -5169,24 +5175,28 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 	while (orders) {
 		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
 		folio = vma_alloc_folio(gfp, order, vma, addr);
-		if (folio) {
-			if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
-				count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
-				folio_put(folio);
-				goto next;
-			}
-			folio_throttle_swaprate(folio, gfp);
-			/*
-			 * When a folio is not zeroed during allocation
-			 * (__GFP_ZERO not used) or user folios require special
-			 * handling, folio_zero_user() is used to make sure
-			 * that the page corresponding to the faulting address
-			 * will be hot in the cache after zeroing.
-			 */
-			if (user_alloc_needs_zeroing())
-				folio_zero_user(folio, vmf->address);
-			return folio;
+		if (!folio)
+			goto next;
+		if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
+			count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
+			folio_put(folio);
+			goto next;
 		}
+		if (folio_memcg_list_lru_alloc(folio, &deferred_split_lru, gfp)) {
+			folio_put(folio);
+			goto fallback;
+		}
+		folio_throttle_swaprate(folio, gfp);
+		/*
+		 * When a folio is not zeroed during allocation
+		 * (__GFP_ZERO not used) or user folios require special
+		 * handling, folio_zero_user() is used to make sure
+		 * that the page corresponding to the faulting address
+		 * will be hot in the cache after zeroing.
+		 */
+		if (user_alloc_needs_zeroing())
+			folio_zero_user(folio, vmf->address);
+		return folio;
 next:
 		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
 		order = next_order(&orders, order);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index cec7bb758bdd..f293a62e652a 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1388,19 +1388,6 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
 	pr_debug("On node %d totalpages: %lu\n", pgdat->node_id, realtotalpages);
 }
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-static void pgdat_init_split_queue(struct pglist_data *pgdat)
-{
-	struct deferred_split *ds_queue = &pgdat->deferred_split_queue;
-
-	spin_lock_init(&ds_queue->split_queue_lock);
-	INIT_LIST_HEAD(&ds_queue->split_queue);
-	ds_queue->split_queue_len = 0;
-}
-#else
-static void pgdat_init_split_queue(struct pglist_data *pgdat) {}
-#endif
-
 #ifdef CONFIG_COMPACTION
 static void pgdat_init_kcompactd(struct pglist_data *pgdat)
 {
@@ -1416,8 +1403,6 @@ static void __meminit pgdat_init_internals(struct pglist_data *pgdat)
 
 	pgdat_resize_init(pgdat);
 	pgdat_kswapd_lock_init(pgdat);
-
-	pgdat_init_split_queue(pgdat);
 	pgdat_init_kcompactd(pgdat);
 
 	init_waitqueue_head(&pgdat->kswapd_wait);
-- 
2.53.0