From: Kairui Song
Date: Fri, 05 Dec 2025 03:29:17 +0800
Subject: [PATCH v4 09/19] mm, swap: swap entry of a bad slot should not be
 considered as swapped out
Message-Id: <20251205-swap-table-p2-v4-9-cb7e28a26a40@tencent.com>
References: <20251205-swap-table-p2-v4-0-cb7e28a26a40@tencent.com>
In-Reply-To: <20251205-swap-table-p2-v4-0-cb7e28a26a40@tencent.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Baoquan He, Barry Song, Chris Li, Nhat Pham,
 Yosry Ahmed, David Hildenbrand, Johannes Weiner, Youngjun Park,
 Hugh Dickins, Baolin Wang, Ying Huang, Kemeng Shi, Lorenzo Stoakes,
 "Matthew Wilcox (Oracle)", linux-kernel@vger.kernel.org, Kairui Song
X-Mailer: b4 0.14.3

From: Kairui Song

When checking if a swap entry is swapped out, we simply check if the
bitwise result of the count value is larger than 0.
But SWAP_MAP_BAD will also be considered as a swap count value larger than 0.

SWAP_MAP_BAD being considered as a count value larger than 0 is useful
for the swap allocator: bad slots will be seen as used, so the
allocator will skip them. But for the swapped out check, this
isn't correct.

There is currently no observable issue. The swapped out check is only
useful for readahead and the folio swapped-out status check. For
readahead, the swap cache layer will abort upon checking and updating
the swap map. For the folio swapped-out status check, the swap allocator
will never allocate an entry of a bad slot to a folio, so that part is
fine too. The worst that could happen now is redundant allocation/freeing
of folios, which wastes CPU time.

This also makes it easier to get rid of swap map checking and updating
during folio insertion in the swap cache layer.

Signed-off-by: Kairui Song
---
 include/linux/swap.h |  6 ++++--
 mm/swap_state.c      |  4 ++--
 mm/swapfile.c        | 22 +++++++++++-----------
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index bf72b548a96d..936fa8f9e5f3 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -466,7 +466,8 @@ int find_first_swap(dev_t *device);
 extern unsigned int count_swap_pages(int, int);
 extern sector_t swapdev_block(int, pgoff_t);
 extern int __swap_count(swp_entry_t entry);
-extern bool swap_entry_swapped(struct swap_info_struct *si, swp_entry_t entry);
+extern bool swap_entry_swapped(struct swap_info_struct *si,
+			       unsigned long offset);
 extern int swp_swapcount(swp_entry_t entry);
 struct backing_dev_info;
 extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
@@ -535,7 +536,8 @@ static inline int __swap_count(swp_entry_t entry)
 	return 0;
 }
 
-static inline bool swap_entry_swapped(struct swap_info_struct *si, swp_entry_t entry)
+static inline bool swap_entry_swapped(struct swap_info_struct *si,
+				      unsigned long offset)
 {
 	return false;
 }
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 8c429dc33ca9..0c5aad537716 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -527,8 +527,8 @@ struct folio *swap_cache_alloc_folio(swp_entry_t entry, gfp_t gfp_mask,
 	if (folio)
 		return folio;
 
-	/* Skip allocation for unused swap slot for readahead path. */
-	if (!swap_entry_swapped(si, entry))
+	/* Skip allocation for unused and bad swap slot for readahead. */
+	if (!swap_entry_swapped(si, swp_offset(entry)))
 		return NULL;
 
 	/* Allocate a new folio to be added into the swap cache. */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index e23287c06f1c..5a766d4fcaa5 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1766,21 +1766,21 @@ int __swap_count(swp_entry_t entry)
 	return swap_count(si->swap_map[offset]);
 }
 
-/*
- * How many references to @entry are currently swapped out?
- * This does not give an exact answer when swap count is continued,
- * but does include the high COUNT_CONTINUED flag to allow for that.
+/**
+ * swap_entry_swapped - Check if the swap entry at @offset is swapped.
+ * @si: the swap device.
+ * @offset: offset of the swap entry.
  */
-bool swap_entry_swapped(struct swap_info_struct *si, swp_entry_t entry)
+bool swap_entry_swapped(struct swap_info_struct *si, unsigned long offset)
 {
-	pgoff_t offset = swp_offset(entry);
 	struct swap_cluster_info *ci;
 	int count;
 
 	ci = swap_cluster_lock(si, offset);
 	count = swap_count(si->swap_map[offset]);
 	swap_cluster_unlock(ci);
-	return !!count;
+
+	return count && count != SWAP_MAP_BAD;
 }
 
 /*
@@ -1866,7 +1866,7 @@ static bool folio_swapped(struct folio *folio)
 		return false;
 
 	if (!IS_ENABLED(CONFIG_THP_SWAP) || likely(!folio_test_large(folio)))
-		return swap_entry_swapped(si, entry);
+		return swap_entry_swapped(si, swp_offset(entry));
 
 	return swap_page_trans_huge_swapped(si, entry, folio_order(folio));
 }
@@ -3677,10 +3677,10 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 		count = si->swap_map[offset + i];
 
 		/*
-		 * swapin_readahead() doesn't check if a swap entry is valid, so the
-		 * swap entry could be SWAP_MAP_BAD. Check here with lock held.
+		 * Allocator never allocates bad slots, and readahead is guarded
+		 * by swap_entry_swapped.
 		 */
-		if (unlikely(swap_count(count) == SWAP_MAP_BAD)) {
+		if (WARN_ON(swap_count(count) == SWAP_MAP_BAD)) {
 			err = -ENOENT;
 			goto unlock_out;
 		}
-- 
2.52.0
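
For illustration only, here is a minimal userspace sketch of the check
this patch changes. It is not kernel code and takes no cluster lock. The
constant values are assumptions that mirror include/linux/swap.h at the
time of writing, and the helper names swapped_old() and swapped_new()
are hypothetical, introduced only to contrast the two behaviours.

```c
#include <stdbool.h>

/* Assumed values, mirroring include/linux/swap.h at the time of writing. */
#define SWAP_HAS_CACHE	0x40	/* flag bit: slot has a swap cache folio */
#define SWAP_MAP_BAD	0x3f	/* marker for a bad (unusable) slot */

/* Userspace model of swap_count(): strip the cache flag from a map byte. */
static unsigned char swap_count(unsigned char ent)
{
	return ent & ~SWAP_HAS_CACHE;
}

/* Hypothetical name: the check before this patch, any non-zero count. */
static bool swapped_old(unsigned char ent)
{
	return !!swap_count(ent);
}

/* Hypothetical name: the check after this patch, bad slots excluded. */
static bool swapped_new(unsigned char ent)
{
	int count = swap_count(ent);

	return count && count != SWAP_MAP_BAD;
}
```

With these definitions, a map byte of SWAP_MAP_BAD is reported as swapped
out by swapped_old() but not by swapped_new(), while free slots and slots
with a real reference count are treated the same by both.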