From: Kairui Song
Date: Mon, 26 Jan 2026 01:57:30 +0800
Subject: [PATCH 07/12] mm, swap: mark bad slots in swap table directly
Message-Id: <20260126-swap-table-p3-v1-7-a74155fab9b0@tencent.com>
References: <20260126-swap-table-p3-v1-0-a74155fab9b0@tencent.com>
In-Reply-To: <20260126-swap-table-p3-v1-0-a74155fab9b0@tencent.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
    Johannes Weiner, David Hildenbrand, Lorenzo Stoakes,
    linux-kernel@vger.kernel.org, Chris Li, Kairui Song
X-Mailer: b4 0.14.3

From: Kairui Song

In preparation for deprecating swap_map, mark bad slots in the swap
table as well when setting SWAP_MAP_BAD in swap_map.

Also refine the swap table sanity check on freeing to adapt to this
change. For swapoff, the bad slot count must match the cluster usage
count, since nothing else should touch bad slots and they contribute to
the cluster usage count at swapon time. For ordinary swap table freeing,
the swap table of a cluster containing bad slots should never be freed,
because its usage count never drops to zero.

Signed-off-by: Kairui Song
---
 mm/swapfile.c | 56 +++++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 15 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index df8b13eecab1..bdce2abd9135 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -454,16 +454,37 @@ static void swap_table_free(struct swap_table *table)
 		swap_table_free_folio_rcu_cb);
 }
 
+/*
+ * Sanity check to ensure nothing leaked, and the specified range is empty.
+ * One special case is that bad slots can't be freed, so check the number of
+ * bad slots for swapoff, and non-swapoff path must never free bad slots.
+ */
+static void swap_cluster_assert_empty(struct swap_cluster_info *ci, bool swapoff)
+{
+	unsigned int ci_off = 0, ci_end = SWAPFILE_CLUSTER;
+	unsigned long swp_tb;
+	int bad_slots = 0;
+
+	if (!IS_ENABLED(CONFIG_DEBUG_VM) && !swapoff)
+		return;
+
+	do {
+		swp_tb = __swap_table_get(ci, ci_off);
+		if (swp_tb_is_bad(swp_tb))
+			bad_slots++;
+		else
+			WARN_ON_ONCE(!swp_tb_is_null(swp_tb));
+	} while (++ci_off < ci_end);
+
+	WARN_ON_ONCE(bad_slots != (swapoff ? ci->count : 0));
+}
+
 static void swap_cluster_free_table(struct swap_cluster_info *ci)
 {
-	unsigned int ci_off;
 	struct swap_table *table;
 
 	/* Only empty cluster's table is allow to be freed */
 	lockdep_assert_held(&ci->lock);
-	VM_WARN_ON_ONCE(!cluster_is_empty(ci));
-	for (ci_off = 0; ci_off < SWAPFILE_CLUSTER; ci_off++)
-		VM_WARN_ON_ONCE(!swp_tb_is_null(__swap_table_get(ci, ci_off)));
 	table = (void *)rcu_dereference_protected(ci->table, true);
 	rcu_assign_pointer(ci->table, NULL);
 
@@ -567,6 +588,7 @@ static void swap_cluster_schedule_discard(struct swap_info_struct *si,
 
 static void __free_cluster(struct swap_info_struct *si, struct swap_cluster_info *ci)
 {
+	swap_cluster_assert_empty(ci, false);
 	swap_cluster_free_table(ci);
 	move_cluster(si, ci, &si->free_clusters, CLUSTER_FLAG_FREE);
 	ci->order = 0;
@@ -747,9 +769,11 @@ static int swap_cluster_setup_bad_slot(struct swap_info_struct *si,
 		struct swap_cluster_info *cluster_info, unsigned int offset,
 		bool mask)
 {
+	unsigned int ci_off = offset % SWAPFILE_CLUSTER;
 	unsigned long idx = offset / SWAPFILE_CLUSTER;
-	struct swap_table *table;
 	struct swap_cluster_info *ci;
+	struct swap_table *table;
+	int ret = 0;
 
 	/* si->max may got shrunk by swap swap_activate() */
 	if (offset >= si->max && !mask) {
@@ -767,13 +791,7 @@ static int swap_cluster_setup_bad_slot(struct swap_info_struct *si,
 		pr_warn("Empty swap-file\n");
 		return -EINVAL;
 	}
-	/* Check for duplicated bad swap slots. */
-	if (si->swap_map[offset]) {
-		pr_warn("Duplicated bad slot offset %d\n", offset);
-		return -EINVAL;
-	}
 
-	si->swap_map[offset] = SWAP_MAP_BAD;
 	ci = cluster_info + idx;
 	if (!ci->table) {
 		table = swap_table_alloc(GFP_KERNEL);
@@ -781,13 +799,21 @@ static int swap_cluster_setup_bad_slot(struct swap_info_struct *si,
 			return -ENOMEM;
 		rcu_assign_pointer(ci->table, table);
 	}
-
-	ci->count++;
+	spin_lock(&ci->lock);
+	/* Check for duplicated bad swap slots. */
+	if (__swap_table_xchg(ci, ci_off, SWP_TB_BAD) != SWP_TB_NULL) {
+		pr_warn("Duplicated bad slot offset %d\n", offset);
+		ret = -EINVAL;
+	} else {
+		si->swap_map[offset] = SWAP_MAP_BAD;
+		ci->count++;
+	}
+	spin_unlock(&ci->lock);
 
 	WARN_ON(ci->count > SWAPFILE_CLUSTER);
 	WARN_ON(ci->flags);
 
-	return 0;
+	return ret;
 }
 
 /*
@@ -2743,7 +2769,7 @@ static void free_swap_cluster_info(struct swap_cluster_info *cluster_info,
 		/* Cluster with bad marks count will have a remaining table */
 		spin_lock(&ci->lock);
 		if (rcu_dereference_protected(ci->table, true)) {
-			ci->count = 0;
+			swap_cluster_assert_empty(ci, true);
 			swap_cluster_free_table(ci);
 		}
 		spin_unlock(&ci->lock);
-- 
2.52.0
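
[Editor's illustration] The invariant the patch enforces can be summarized
with a small self-contained userspace model. This is not kernel code: the
names mock_cluster, mock_setup_bad_slot and mock_assert_empty are
hypothetical stand-ins for the cluster, swap_cluster_setup_bad_slot() and
swap_cluster_assert_empty() in the diff, and SLOT_BAD/SLOT_NULL stand in
for SWP_TB_BAD/SWP_TB_NULL.

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define CLUSTER_SLOTS 512		/* stand-in for SWAPFILE_CLUSTER */

enum slot_state { SLOT_NULL, SLOT_BAD, SLOT_USED };

struct mock_cluster {
	enum slot_state table[CLUSTER_SLOTS];	/* stand-in for the swap table */
	unsigned int count;			/* usage count, incl. bad slots */
};

/* Mark one slot bad; duplicates are rejected, mirroring the patch. */
static int mock_setup_bad_slot(struct mock_cluster *ci, unsigned int off)
{
	if (ci->table[off] != SLOT_NULL)
		return -1;		/* duplicated bad slot */
	ci->table[off] = SLOT_BAD;
	ci->count++;
	return 0;
}

/* Teardown check: only bad slots may remain, and they must equal count. */
static void mock_assert_empty(const struct mock_cluster *ci, bool swapoff)
{
	unsigned int off, bad = 0;

	for (off = 0; off < CLUSTER_SLOTS; off++) {
		if (ci->table[off] == SLOT_BAD)
			bad++;
		else
			assert(ci->table[off] == SLOT_NULL);
	}
	assert(bad == (swapoff ? ci->count : 0));
}

int main(void)
{
	struct mock_cluster ci = { 0 };

	assert(mock_setup_bad_slot(&ci, 3) == 0);
	assert(mock_setup_bad_slot(&ci, 3) != 0);	/* duplicate detected */
	mock_assert_empty(&ci, true);			/* swapoff-style check passes */
	printf("bad slots: %u\n", ci.count);
	return 0;
}

As in the patch, a cluster holding only bad slots keeps a nonzero usage
count, so its table is never freed on the ordinary path and the swapoff
path expects the bad-slot total to fully explain the remaining count.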