From: Usama Arif
To: Andrew Morton, riel@surriel.com, david@kernel.org, baohua@kernel.org,
    baoquan.he@linux.dev, chrisl@kernel.org, kasong@tencent.com,
    linux-mm@kvack.org
Cc: hannes@cmpxchg.org, shakeel.butt@linux.dev, nphamcs@gmail.com,
    shikemeng@huaweicloud.com, youngjun.park@lge.com,
    linux-kernel@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [PATCH] mm/swap_state: remove unnecessary lru_add_drain() from readahead
Date: Mon, 8 Jun 2026 07:32:42 -0700
Message-ID: <20260608143242.2869392-1-usama.arif@linux.dev>

swap_cluster_readahead() and swap_vma_readahead() end the readahead
loop with an explicit lru_add_drain() call. That drain is a leftover
from 2.6.12 era code and serves no functional purpose for the callers:

- do_swap_page() ignores LRU residency for the readahead folios; it
  only needs the target folio it called swapin_readahead() for, and if
  the write-fault path needs the target folio on the LRU to count
  references accurately, it runs its own lru_add_drain() at the
  wp_can_reuse_anon_folio() and do_swap_page() sites.

- shmem_swapin_cluster() immediately locks the returned folio, waits
  for writeback, then operates on it - LRU residency of either the
  target or the readahead folios is irrelevant.

- try_to_unuse() likewise locks the folio and calls unuse_pte() without
  depending on LRU presence.

Folios newly added to the swap cache by the readahead loop sit in the
per-CPU LRU folio_batch and will be drained naturally as the batch
fills (FOLIO_BATCH_SIZE), by the next reclaim/compaction
lru_add_drain_all() and so on. The unconditional drain only
synchronously flushes a partial batch and forces contention on
lruvec_lock. On a 176-CPU production host running a memory-pressured
workload, this path was observed to call folio_batch_move_lru() from
swap_cluster_readahead() ~28K/min, a very large source of LRU lock
traffic.

This is a direct continuation of the cleanup started in commit
1aa43598c03b ("mm: remove unnecessary calls to lru_add_drain"), which
removed the equivalent drain from free_pages_and_swap_cache() with the
same rationale. The detailed reasoning is in [1].

Remove both drains.

[1] https://lore.kernel.org/all/dca2824e8e88e826c6b260a831d79089b5b9c79d.camel@surriel.com/T/#u

Signed-off-by: Usama Arif
Acked-by: Johannes Weiner
Acked-by: Shakeel Butt
Reviewed-by: Barry Song
Reviewed-by: Kairui Song
---
 mm/swap_state.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 9c3a5cf99778..6fd6e3415b71 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -836,7 +836,6 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	}
 	blk_finish_plug(&plug);
 	swap_read_unplug(splug);
-	lru_add_drain();	/* Push any new pages onto the LRU now */
 skip:
 	/* The page was likely read above, so no need for plugging here */
 	return swap_cache_read_folio(entry, gfp_mask, mpol, ilx, NULL, false);
@@ -951,7 +950,6 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
 		pte_unmap(pte);
 	blk_finish_plug(&plug);
 	swap_read_unplug(splug);
-	lru_add_drain();
 skip:
 	/* The folio was likely read above, so no need for plugging here */
 	folio = swap_cache_read_folio(targ_entry, gfp_mask, mpol, targ_ilx,
-- 
2.52.0
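
The batching argument in the changelog can be illustrated with a small userspace toy model. This is only a sketch and not kernel code: every toy_* name is hypothetical, TOY_BATCH_CAPACITY is an assumed value standing in for FOLIO_BATCH_SIZE, and one "flush" is counted as one lruvec_lock round trip, which simplifies what folio_batch_move_lru() actually does.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed capacity for illustration; stands in for FOLIO_BATCH_SIZE. */
#define TOY_BATCH_CAPACITY 31

/* Hypothetical stand-in for one CPU's LRU-add batch. */
struct toy_batch {
	unsigned int nr;	/* folios currently queued */
	unsigned long flushes;	/* flushes so far, one lock round trip each */
};

/* Flush whatever is queued; flushing an empty batch costs nothing. */
static void toy_drain(struct toy_batch *b)
{
	if (b->nr == 0)
		return;
	b->flushes++;
	b->nr = 0;
}

/* Queue one folio and flush only once the batch is full. */
static void toy_add(struct toy_batch *b)
{
	assert(b->nr < TOY_BATCH_CAPACITY);
	b->nr++;
	if (b->nr == TOY_BATCH_CAPACITY)
		toy_drain(b);
}

/*
 * Model `calls` readahead invocations that each queue `folios_per_call`
 * folios. With drain_each_call set, every invocation ends with an
 * explicit drain, as the readahead loops did before this patch.
 * Returns the number of flushes, i.e. modelled lock acquisitions.
 */
unsigned long toy_simulate(unsigned int calls, unsigned int folios_per_call,
			   bool drain_each_call)
{
	struct toy_batch b = { 0, 0 };
	unsigned int i, j;

	for (i = 0; i < calls; i++) {
		for (j = 0; j < folios_per_call; j++)
			toy_add(&b);
		if (drain_each_call)
			toy_drain(&b);
	}
	return b.flushes;
}
```

In this model, toy_simulate(1000, 8, true) returns 1000 flushes, while toy_simulate(1000, 8, false) returns 258, because the batch is flushed only when it fills. The real ratio depends on the readahead window and on other users of the per-CPU batch, so these numbers show the mechanism and are not a measurement.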