From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Chris Li, Barry Song, Ryan Roberts, Hugh Dickins,
    Yosry Ahmed, "Huang, Ying", Baoquan He, Nhat Pham, Johannes Weiner,
    Kalesh Singh, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v4 04/13] mm, swap: use cluster lock for HDD
Date: Tue, 14 Jan 2025 01:57:23 +0800
Message-ID: <20250113175732.48099-5-ryncsn@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250113175732.48099-1-ryncsn@gmail.com>
References: <20250113175732.48099-1-ryncsn@gmail.com>
Reply-To: Kairui Song

From: Kairui Song

The cluster lock (ci->lock) was introduced to reduce contention for
certain operations. Using the cluster lock for HDD is not helpful since
HDD performance is poor, so locking isn't the bottleneck there. But
having a different set of locks for HDD and non-HDD devices prevents
further rework of the device lock (si->lock).

This commit changes all lock_cluster_or_swap_info calls to lock_cluster,
which is a safe and straightforward conversion since cluster info is
always allocated now, and removes all cluster_info related checks.

Suggested-by: Chris Li
Signed-off-by: Kairui Song
Reviewed-by: Baoquan He
---
 mm/swapfile.c | 109 ++++++++++++++++----------------------------------
 1 file changed, 35 insertions(+), 74 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index fca58d43b836..83ebc24cc94b 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -58,10 +58,9 @@ static void swap_entry_range_free(struct swap_info_struct *si, swp_entry_t entry
 static void swap_range_alloc(struct swap_info_struct *si, unsigned long offset,
 			     unsigned int nr_entries);
 static bool folio_swapcache_freeable(struct folio *folio);
-static struct swap_cluster_info *lock_cluster_or_swap_info(
-		struct swap_info_struct *si, unsigned long offset);
-static void unlock_cluster_or_swap_info(struct swap_info_struct *si,
-					struct swap_cluster_info *ci);
+static struct swap_cluster_info *lock_cluster(struct swap_info_struct *si,
+					      unsigned long offset);
+static void unlock_cluster(struct swap_cluster_info *ci);
 
 static DEFINE_SPINLOCK(swap_lock);
 static unsigned int nr_swapfiles;
@@ -222,9 +221,9 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
 	 * swap_map is HAS_CACHE only, which means the slots have no page table
 	 * reference or pending writeback, and can't be allocated to others.
 	 */
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	need_reclaim = swap_is_has_cache(si, offset, nr_pages);
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	if (!need_reclaim)
 		goto out_unlock;
 
@@ -404,45 +403,15 @@ static inline struct swap_cluster_info *lock_cluster(struct swap_info_struct *si
 {
 	struct swap_cluster_info *ci;
 
-	ci = si->cluster_info;
-	if (ci) {
-		ci += offset / SWAPFILE_CLUSTER;
-		spin_lock(&ci->lock);
-	}
-	return ci;
-}
-
-static inline void unlock_cluster(struct swap_cluster_info *ci)
-{
-	if (ci)
-		spin_unlock(&ci->lock);
-}
-
-/*
- * Determine the locking method in use for this device. Return
- * swap_cluster_info if SSD-style cluster-based locking is in place.
- */
-static inline struct swap_cluster_info *lock_cluster_or_swap_info(
-		struct swap_info_struct *si, unsigned long offset)
-{
-	struct swap_cluster_info *ci;
-
-	/* Try to use fine-grained SSD-style locking if available: */
-	ci = lock_cluster(si, offset);
-	/* Otherwise, fall back to traditional, coarse locking: */
-	if (!ci)
-		spin_lock(&si->lock);
+	ci = &si->cluster_info[offset / SWAPFILE_CLUSTER];
+	spin_lock(&ci->lock);
 
 	return ci;
 }
 
-static inline void unlock_cluster_or_swap_info(struct swap_info_struct *si,
-					       struct swap_cluster_info *ci)
+static inline void unlock_cluster(struct swap_cluster_info *ci)
 {
-	if (ci)
-		unlock_cluster(ci);
-	else
-		spin_unlock(&si->lock);
+	spin_unlock(&ci->lock);
 }
 
 /* Add a cluster to discard list and schedule it to do discard */
@@ -558,9 +527,6 @@ static void inc_cluster_info_page(struct swap_info_struct *si,
 	unsigned long idx = page_nr / SWAPFILE_CLUSTER;
 	struct swap_cluster_info *ci;
 
-	if (!cluster_info)
-		return;
-
 	ci = cluster_info + idx;
 	ci->count++;
 
@@ -576,9 +542,6 @@ static void inc_cluster_info_page(struct swap_info_struct *si,
 static void dec_cluster_info_page(struct swap_info_struct *si,
 				  struct swap_cluster_info *ci, int nr_pages)
 {
-	if (!si->cluster_info)
-		return;
-
 	VM_BUG_ON(ci->count < nr_pages);
 	VM_BUG_ON(cluster_is_free(ci));
 	lockdep_assert_held(&si->lock);
@@ -940,7 +903,7 @@ static void swap_range_alloc(struct swap_info_struct *si, unsigned long offset,
 		si->highest_bit = 0;
 		del_from_avail_list(si);
 
-		if (si->cluster_info && vm_swap_full())
+		if (vm_swap_full())
 			schedule_work(&si->reclaim_work);
 	}
 }
@@ -1007,8 +970,6 @@ static int cluster_alloc_swap(struct swap_info_struct *si,
 {
 	int n_ret = 0;
 
-	VM_BUG_ON(!si->cluster_info);
-
 	si->flags += SWP_SCANNING;
 
 	while (n_ret < nr) {
@@ -1052,10 +1013,10 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
 		}
 
 		/*
-		 * Swapfile is not block device or not using clusters so unable
+		 * Swapfile is not block device so unable
 		 * to allocate large entries.
 		 */
-		if (!(si->flags & SWP_BLKDEV) || !si->cluster_info)
+		if (!(si->flags & SWP_BLKDEV))
 			return 0;
 	}
 
@@ -1295,9 +1256,9 @@ static unsigned char __swap_entry_free(struct swap_info_struct *si,
 	unsigned long offset = swp_offset(entry);
 	unsigned char usage;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	usage = __swap_entry_free_locked(si, offset, 1);
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	if (!usage)
 		free_swap_slot(entry);
 
@@ -1320,14 +1281,14 @@ static bool __swap_entries_free(struct swap_info_struct *si,
 	if (nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER)
 		goto fallback;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	if (!swap_is_last_map(si, offset, nr, &has_cache)) {
-		unlock_cluster_or_swap_info(si, ci);
+		unlock_cluster(ci);
 		goto fallback;
 	}
 	for (i = 0; i < nr; i++)
 		WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 
 	if (!has_cache) {
 		for (i = 0; i < nr; i++)
@@ -1383,7 +1344,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
 	DECLARE_BITMAP(to_free, BITS_PER_LONG) = { 0 };
 	int i, nr;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	while (nr_pages) {
 		nr = min(BITS_PER_LONG, nr_pages);
 		for (i = 0; i < nr; i++) {
@@ -1391,18 +1352,18 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
 			bitmap_set(to_free, i, 1);
 		}
 		if (!bitmap_empty(to_free, BITS_PER_LONG)) {
-			unlock_cluster_or_swap_info(si, ci);
+			unlock_cluster(ci);
 			for_each_set_bit(i, to_free, BITS_PER_LONG)
 				free_swap_slot(swp_entry(si->type, offset + i));
 			if (nr == nr_pages)
 				return;
 			bitmap_clear(to_free, 0, BITS_PER_LONG);
-			ci = lock_cluster_or_swap_info(si, offset);
+			ci = lock_cluster(si, offset);
 		}
 		offset += nr;
 		nr_pages -= nr;
 	}
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 }
 
 /*
@@ -1441,9 +1402,9 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
 	if (!si)
 		return;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	if (size > 1 && swap_is_has_cache(si, offset, size)) {
-		unlock_cluster_or_swap_info(si, ci);
+		unlock_cluster(ci);
 		spin_lock(&si->lock);
 		swap_entry_range_free(si, entry, size);
 		spin_unlock(&si->lock);
@@ -1451,14 +1412,14 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
 	}
 	for (int i = 0; i < size; i++, entry.val++) {
 		if (!__swap_entry_free_locked(si, offset + i, SWAP_HAS_CACHE)) {
-			unlock_cluster_or_swap_info(si, ci);
+			unlock_cluster(ci);
 			free_swap_slot(entry);
 			if (i == size - 1)
 				return;
-			lock_cluster_or_swap_info(si, offset);
+			lock_cluster(si, offset);
 		}
 	}
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 }
 
 static int swp_entry_cmp(const void *ent1, const void *ent2)
@@ -1522,9 +1483,9 @@ int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
 	struct swap_cluster_info *ci;
 	int count;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	count = swap_count(si->swap_map[offset]);
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	return count;
 }
 
@@ -1547,7 +1508,7 @@ int swp_swapcount(swp_entry_t entry)
 
 	offset = swp_offset(entry);
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 
 	count = swap_count(si->swap_map[offset]);
 	if (!(count & COUNT_CONTINUED))
@@ -1570,7 +1531,7 @@ int swp_swapcount(swp_entry_t entry)
 		n *= (SWAP_CONT_MAX + 1);
 	} while (tmp_count & COUNT_CONTINUED);
 out:
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	return count;
 }
 
@@ -1585,8 +1546,8 @@ static bool swap_page_trans_huge_swapped(struct swap_info_struct *si,
 	int i;
 	bool ret = false;
 
-	ci = lock_cluster_or_swap_info(si, offset);
-	if (!ci || nr_pages == 1) {
+	ci = lock_cluster(si, offset);
+	if (nr_pages == 1) {
 		if (swap_count(map[roffset]))
 			ret = true;
 		goto unlock_out;
@@ -1598,7 +1559,7 @@ static bool swap_page_trans_huge_swapped(struct swap_info_struct *si,
 		}
 	}
 unlock_out:
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	return ret;
 }
 
@@ -3428,7 +3389,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 	offset = swp_offset(entry);
 	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
 	VM_WARN_ON(usage == 1 && nr > 1);
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 
 	err = 0;
 	for (i = 0; i < nr; i++) {
@@ -3483,7 +3444,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 	}
 
 unlock_out:
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	return err;
 }
 
-- 
2.47.1
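
Illustrative aside, not part of the patch: every converted call site now
follows one pattern, map the offset to its cluster and take that cluster's
own lock, with no fallback to the device-wide si->lock. Below is a minimal
userspace sketch of that pattern; pthread mutexes stand in for the kernel
spinlock and the simplified structs are stand-ins, not the kernel's
swap_info_struct / swap_cluster_info API.

/*
 * Per-cluster locking sketch: every offset maps to exactly one cluster,
 * and that cluster's own lock is taken unconditionally. Simplified
 * stand-in types; pthread_mutex_t models the kernel spinlock ci->lock.
 */
#include <pthread.h>
#include <stdio.h>

#define SWAPFILE_CLUSTER 512	/* swap entries per cluster */
#define NR_CLUSTERS	 4

struct cluster_info {
	pthread_mutex_t lock;	/* per-cluster lock, like ci->lock */
	unsigned int count;	/* allocated entries in this cluster */
};

static struct cluster_info clusters[NR_CLUSTERS];

/* Always valid: cluster info exists for every device, HDD included. */
static struct cluster_info *lock_cluster(unsigned long offset)
{
	struct cluster_info *ci = &clusters[offset / SWAPFILE_CLUSTER];

	pthread_mutex_lock(&ci->lock);
	return ci;
}

static void unlock_cluster(struct cluster_info *ci)
{
	pthread_mutex_unlock(&ci->lock);
}

int main(void)
{
	struct cluster_info *ci;

	for (int i = 0; i < NR_CLUSTERS; i++)
		pthread_mutex_init(&clusters[i].lock, NULL);

	/* Offset 1000 falls in cluster 1 (1000 / 512 == 1). */
	ci = lock_cluster(1000);
	ci->count++;
	unlock_cluster(ci);

	printf("cluster 1 count: %u\n", clusters[1].count);
	return 0;
}

Two offsets in different clusters never contend on the same lock, which is
why dropping the si->lock fallback keeps the fine-grained behaviour on SSD
and costs nothing on HDD, where locking was never the bottleneck.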