From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org, yosryahmed@google.com, nphamcs@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com, 21cnbao@gmail.com, akpm@linux-foundation.org
Cc: nanhai.zou@intel.com, wajdi.k.feghali@intel.com, vinodh.gopal@intel.com, kanchana.p.sridhar@intel.com
Subject: [RFC PATCH v1 1/4] mm: zswap: zswap_is_folio_same_filled() takes an index in the folio.
Date: Tue, 13 Aug 2024 23:28:27 -0700
Message-Id: <20240814062830.26833-2-kanchana.p.sridhar@intel.com>
In-Reply-To: <20240814062830.26833-1-kanchana.p.sridhar@intel.com>
References: <20240814062830.26833-1-kanchana.p.sridhar@intel.com>

This change is made so that zswap_store() can process mTHP folios.
zswap_is_folio_same_filled() is modified to work for folios of any order
by accepting an additional "index" parameter that identifies the page
within the folio on which to run the same-filled check.

Signed-off-by: Kanchana P Sridhar
---
 mm/zswap.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index a50e2986cd2f..a6b0a7c636db 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1373,14 +1373,14 @@ static void shrink_worker(struct work_struct *w)
 /*********************************
 * same-filled functions
 **********************************/
-static bool zswap_is_folio_same_filled(struct folio *folio, unsigned long *value)
+static bool zswap_is_folio_same_filled(struct folio *folio, long index, unsigned long *value)
 {
 	unsigned long *page;
 	unsigned long val;
 	unsigned int pos, last_pos = PAGE_SIZE / sizeof(*page) - 1;
 	bool ret = false;
 
-	page = kmap_local_folio(folio, 0);
+	page = kmap_local_folio(folio, index * PAGE_SIZE);
 	val = page[0];
 
 	if (val != page[last_pos])
@@ -1450,7 +1450,7 @@ bool zswap_store(struct folio *folio)
 		goto reject;
 	}
 
-	if (zswap_is_folio_same_filled(folio, &value)) {
+	if (zswap_is_folio_same_filled(folio, 0, &value)) {
 		entry->length = 0;
 		entry->value = value;
 		atomic_inc(&zswap_same_filled_pages);
-- 
2.27.0
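
For reference, a minimal sketch (not part of this patch) of how a
large-folio-aware caller could use the new "index" parameter;
zswap_store_page() in patch 3/4 is the real caller, and the helper name
below is purely illustrative:

static bool zswap_folio_all_pages_same_filled(struct folio *folio)
{
	long nr_pages = folio_nr_pages(folio);
	unsigned long value;
	long index;

	for (index = 0; index < nr_pages; index++) {
		/*
		 * "index" selects a page; zswap_is_folio_same_filled()
		 * turns it into the byte offset index * PAGE_SIZE that
		 * it passes to kmap_local_folio().
		 */
		if (!zswap_is_folio_same_filled(folio, index, &value))
			return false;
	}

	return true;
}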

From: Kanchana P Sridhar
Subject: [RFC PATCH v1 2/4] mm: vmstat: Per mTHP-size zswap_store vmstat event counters.
Date: Tue, 13 Aug 2024 23:28:28 -0700
Message-Id: <20240814062830.26833-3-kanchana.p.sridhar@intel.com>
In-Reply-To: <20240814062830.26833-1-kanchana.p.sridhar@intel.com>
References: <20240814062830.26833-1-kanchana.p.sridhar@intel.com>

Add per-mTHP-size vmstat event counters that account for folios of
different sizes being successfully stored in zswap.

For this RFC, it is not clear whether these zswpout counters should
instead be added to the existing mTHP stats in
/sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats, for instance as
/sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/zswpout. That
would make it possible to distinguish mTHP zswap swapouts from non-zswap
swapouts through
/sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/zswpout and
/sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/swpout
respectively.

Comments on which approach is preferable would be appreciated.
Signed-off-by: Kanchana P Sridhar
---
 include/linux/vm_event_item.h | 15 +++++++++++++++
 mm/vmstat.c                   | 15 +++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 747943bc8cc2..2451bcfcf05c 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -114,6 +114,9 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		THP_ZERO_PAGE_ALLOC,
 		THP_ZERO_PAGE_ALLOC_FAILED,
 		THP_SWPOUT,
+#ifdef CONFIG_ZSWAP
+		ZSWPOUT_PMD_THP_FOLIO,
+#endif
 		THP_SWPOUT_FALLBACK,
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
@@ -143,6 +146,18 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		ZSWPIN,
 		ZSWPOUT,
 		ZSWPWB,
+		ZSWPOUT_4KB_FOLIO,
+#ifdef CONFIG_THP_SWAP
+		mTHP_ZSWPOUT_8kB,
+		mTHP_ZSWPOUT_16kB,
+		mTHP_ZSWPOUT_32kB,
+		mTHP_ZSWPOUT_64kB,
+		mTHP_ZSWPOUT_128kB,
+		mTHP_ZSWPOUT_256kB,
+		mTHP_ZSWPOUT_512kB,
+		mTHP_ZSWPOUT_1024kB,
+		mTHP_ZSWPOUT_2048kB,
+#endif
 #endif
 #ifdef CONFIG_X86
 		DIRECT_MAP_LEVEL2_SPLIT,
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 8507c497218b..0e66c8b0c486 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1375,6 +1375,9 @@ const char * const vmstat_text[] = {
 	"thp_zero_page_alloc",
 	"thp_zero_page_alloc_failed",
 	"thp_swpout",
+#ifdef CONFIG_ZSWAP
+	"zswpout_pmd_thp_folio",
+#endif
 	"thp_swpout_fallback",
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
@@ -1405,6 +1408,18 @@ const char * const vmstat_text[] = {
 	"zswpin",
 	"zswpout",
 	"zswpwb",
+	"zswpout_4kb_folio",
+#ifdef CONFIG_THP_SWAP
+	"mthp_zswpout_8kb",
+	"mthp_zswpout_16kb",
+	"mthp_zswpout_32kb",
+	"mthp_zswpout_64kb",
+	"mthp_zswpout_128kb",
+	"mthp_zswpout_256kb",
+	"mthp_zswpout_512kb",
+	"mthp_zswpout_1024kb",
+	"mthp_zswpout_2048kb",
+#endif
 #endif
 #ifdef CONFIG_X86
 	"direct_map_level2_splits",
-- 
2.27.0
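
For reference, a minimal sketch (not part of this patch) of how one of
the new counters is meant to be bumped once a zswap store of, say, an
order-4 (64kB with 4kB base pages) folio succeeds; the real accounting
helper is added in patch 4/4, and the helper name below is purely
illustrative:

/* Illustrative only: bump the order-4 counter after a successful store. */
static inline void zswap_count_64kb_store(void)
{
	if (IS_ENABLED(CONFIG_THP_SWAP))
		count_vm_event(mTHP_ZSWPOUT_64kB);
}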

From: Kanchana P Sridhar
Subject: [RFC PATCH v1 3/4] mm: zswap: zswap_store() extended to handle mTHP folios.
Date: Tue, 13 Aug 2024 23:28:29 -0700
Message-Id: <20240814062830.26833-4-kanchana.p.sridhar@intel.com>
In-Reply-To: <20240814062830.26833-1-kanchana.p.sridhar@intel.com>
References: <20240814062830.26833-1-kanchana.p.sridhar@intel.com>

zswap_store() will now process and store mTHP and PMD-size THP folios.

This change reuses and adapts the functionality in Ryan Roberts' RFC
patch [1]:

  "[RFC,v1] mm: zswap: Store large folios without splitting"

  [1] https://lore.kernel.org/linux-mm/20231019110543.3284654-1-ryan.roberts@arm.com/T/#u

This patch provides a sequential implementation of storing an mTHP in
zswap_store() by iterating through each page in the folio, compressing
it, and storing it in the zswap zpool. Towards this goal,
zswap_compress() is modified to take a page instead of a folio as input.

Each page's swap offset is stored as a separate zswap entry. If an error
is encountered while storing any page of the mTHP, all previously stored
pages/entries are invalidated. Thus, an mTHP is either stored entirely in
zswap, or not at all.

This forms the basis for batching pages during the zswap store of large
folios, by compressing batches of up to, say, 8 pages of an mTHP in
parallel in hardware with the Intel In-Memory Analytics Accelerator
(Intel IAA).
Co-developed-by: Ryan Roberts
Signed-off-by: Kanchana P Sridhar
---
 mm/zswap.c | 219 ++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 157 insertions(+), 62 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index a6b0a7c636db..98ff98b485f5 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -899,7 +899,7 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 	return 0;
 }
 
-static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
+static bool zswap_compress(struct page *page, struct zswap_entry *entry)
 {
 	struct crypto_acomp_ctx *acomp_ctx;
 	struct scatterlist input, output;
@@ -917,7 +917,7 @@ static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
 
 	dst = acomp_ctx->buffer;
 	sg_init_table(&input, 1);
-	sg_set_page(&input, &folio->page, PAGE_SIZE, 0);
+	sg_set_page(&input, page, PAGE_SIZE, 0);
 
 	/*
 	 * We need PAGE_SIZE * 2 here since there maybe over-compression case,
@@ -1409,36 +1409,82 @@ static void zswap_fill_page(void *ptr, unsigned long value)
 /*********************************
 * main API
 **********************************/
-bool zswap_store(struct folio *folio)
+
+/*
+ * Returns true if the entry was successfully
+ * stored in the xarray, and false otherwise.
+ */
+static bool zswap_store_entry(struct xarray *tree,
+			      struct zswap_entry *entry)
 {
-	swp_entry_t swp = folio->swap;
-	pgoff_t offset = swp_offset(swp);
-	struct xarray *tree = swap_zswap_tree(swp);
-	struct zswap_entry *entry, *old;
-	struct obj_cgroup *objcg = NULL;
-	struct mem_cgroup *memcg = NULL;
-	unsigned long value;
+	struct zswap_entry *old;
+	pgoff_t offset = swp_offset(entry->swpentry);
 
-	VM_WARN_ON_ONCE(!folio_test_locked(folio));
-	VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
+	old = xa_store(tree, offset, entry, GFP_KERNEL);
 
-	/* Large folios aren't supported */
-	if (folio_test_large(folio))
+	if (xa_is_err(old)) {
+		int err = xa_err(old);
+
+		WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err);
+		zswap_reject_alloc_fail++;
 		return false;
+	}
 
-	if (!zswap_enabled)
-		goto check_old;
+	/*
+	 * We may have had an existing entry that became stale when
+	 * the folio was redirtied and now the new version is being
+	 * swapped out. Get rid of the old.
+	 */
+	if (old)
+		zswap_entry_free(old);
 
-	/* Check cgroup limits */
-	objcg = get_obj_cgroup_from_folio(folio);
-	if (objcg && !obj_cgroup_may_zswap(objcg)) {
-		memcg = get_mem_cgroup_from_objcg(objcg);
-		if (shrink_memcg(memcg)) {
-			mem_cgroup_put(memcg);
-			goto reject;
-		}
-		mem_cgroup_put(memcg);
+	return true;
+}
+
+/*
+ * If the zswap store fails or zswap is disabled, we must invalidate the
+ * possibly stale entry which was previously stored at this offset.
+ * Otherwise, writeback could overwrite the new data in the swapfile.
+ *
+ * This is called after the store of the i-th offset
+ * in a large folio, has failed. All entries from
+ * [i-1 .. 0] must be deleted.
+ *
+ * This is also called if zswap_store() is called,
+ * but zswap is not enabled. All offsets for the folio
+ * are deleted from zswap in this case.
+ */
+static void zswap_delete_stored_offsets(struct xarray *tree,
+					pgoff_t offset,
+					long nr_pages)
+{
+	struct zswap_entry *entry;
+	long i;
+
+	for (i = 0; i < nr_pages; ++i) {
+		entry = xa_erase(tree, offset + i);
+		if (entry)
+			zswap_entry_free(entry);
 	}
+}
+
+/*
+ * Stores the page at specified "index" in a folio.
+ */
+static bool zswap_store_page(struct folio *folio, long index,
+			     struct obj_cgroup *objcg,
+			     struct zswap_pool *pool)
+{
+	swp_entry_t swp = folio->swap;
+	int type = swp_type(swp);
+	pgoff_t offset = swp_offset(swp) + index;
+	struct page *page = folio_page(folio, index);
+	struct xarray *tree = swap_zswap_tree(swp);
+	struct zswap_entry *entry;
+	unsigned long value;
+
+	if (objcg)
+		obj_cgroup_get(objcg);
 
 	if (zswap_check_limits())
 		goto reject;
@@ -1450,7 +1496,7 @@ bool zswap_store(struct folio *folio)
 		goto reject;
 	}
 
-	if (zswap_is_folio_same_filled(folio, 0, &value)) {
+	if (zswap_is_folio_same_filled(folio, index, &value)) {
 		entry->length = 0;
 		entry->value = value;
 		atomic_inc(&zswap_same_filled_pages);
@@ -1458,42 +1504,20 @@ bool zswap_store(struct folio *folio)
 	}
 
 	/* if entry is successfully added, it keeps the reference */
-	entry->pool = zswap_pool_current_get();
-	if (!entry->pool)
+	if (!zswap_pool_get(pool))
 		goto freepage;
 
-	if (objcg) {
-		memcg = get_mem_cgroup_from_objcg(objcg);
-		if (memcg_list_lru_alloc(memcg, &zswap_list_lru, GFP_KERNEL)) {
-			mem_cgroup_put(memcg);
-			goto put_pool;
-		}
-		mem_cgroup_put(memcg);
-	}
+	entry->pool = pool;
 
-	if (!zswap_compress(folio, entry))
+	if (!zswap_compress(page, entry))
 		goto put_pool;
 
 store_entry:
-	entry->swpentry = swp;
+	entry->swpentry = swp_entry(type, offset);
 	entry->objcg = objcg;
 
-	old = xa_store(tree, offset, entry, GFP_KERNEL);
-	if (xa_is_err(old)) {
-		int err = xa_err(old);
-
-		WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err);
-		zswap_reject_alloc_fail++;
+	if (!zswap_store_entry(tree, entry))
 		goto store_failed;
-	}
-
-	/*
-	 * We may have had an existing entry that became stale when
-	 * the folio was redirtied and now the new version is being
-	 * swapped out. Get rid of the old.
-	 */
-	if (old)
-		zswap_entry_free(old);
 
 	if (objcg) {
 		obj_cgroup_charge_zswap(objcg, entry->length);
@@ -1527,7 +1551,7 @@ bool zswap_store(struct folio *folio)
 	else {
 		zpool_free(zswap_find_zpool(entry), entry->handle);
 put_pool:
-		zswap_pool_put(entry->pool);
+		zswap_pool_put(pool);
 	}
 freepage:
 	zswap_entry_cache_free(entry);
@@ -1535,16 +1559,87 @@ bool zswap_store(struct folio *folio)
 	obj_cgroup_put(objcg);
 	if (zswap_pool_reached_full)
 		queue_work(shrink_wq, &zswap_shrink_work);
-check_old:
+
+	return false;
+}
+
+/*
+ * Modified to store mTHP folios.
+ * Each page in the mTHP will be compressed
+ * and stored sequentially.
+ */
+bool zswap_store(struct folio *folio)
+{
+	long nr_pages = folio_nr_pages(folio);
+	swp_entry_t swp = folio->swap;
+	pgoff_t offset = swp_offset(swp);
+	struct xarray *tree = swap_zswap_tree(swp);
+	struct obj_cgroup *objcg = NULL;
+	struct mem_cgroup *memcg = NULL;
+	struct zswap_pool *pool;
+	bool ret = false;
+	long index;
+
+	VM_WARN_ON_ONCE(!folio_test_locked(folio));
+	VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
+
 	/*
-	 * If the zswap store fails or zswap is disabled, we must invalidate the
-	 * possibly stale entry which was previously stored at this offset.
-	 * Otherwise, writeback could overwrite the new data in the swapfile.
+	 * If zswap is disabled, we must invalidate the possibly stale entry
+	 * which was previously stored at this offset. Otherwise, writeback
+	 * could overwrite the new data in the swapfile.
 	 */
-	entry = xa_erase(tree, offset);
-	if (entry)
-		zswap_entry_free(entry);
-	return false;
+	if (!zswap_enabled)
+		goto reject;
+
+	/* Check cgroup limits */
+	objcg = get_obj_cgroup_from_folio(folio);
+	if (objcg && !obj_cgroup_may_zswap(objcg)) {
+		memcg = get_mem_cgroup_from_objcg(objcg);
+		if (shrink_memcg(memcg)) {
+			mem_cgroup_put(memcg);
+			goto put_objcg;
+		}
+		mem_cgroup_put(memcg);
+	}
+
+	if (zswap_check_limits())
+		goto put_objcg;
+
+	pool = zswap_pool_current_get();
+	if (!pool)
+		goto put_objcg;
+
+	if (objcg) {
+		memcg = get_mem_cgroup_from_objcg(objcg);
+		if (memcg_list_lru_alloc(memcg, &zswap_list_lru, GFP_KERNEL)) {
+			mem_cgroup_put(memcg);
+			goto put_pool;
+		}
+		mem_cgroup_put(memcg);
+	}
+
+	/*
+	 * Store each page of the folio as a separate entry. If we fail to store
+	 * a page, unwind by removing all the previous pages we stored.
+	 */
+	for (index = 0; index < nr_pages; ++index) {
+		if (!zswap_store_page(folio, index, objcg, pool))
+			goto put_pool;
+	}
+
+	ret = true;
+
+put_pool:
+	zswap_pool_put(pool);
put_objcg:
+	obj_cgroup_put(objcg);
+	if (zswap_pool_reached_full)
+		queue_work(shrink_wq, &zswap_shrink_work);
+reject:
+	if (!ret)
+		zswap_delete_stored_offsets(tree, offset, nr_pages);
+
+	return ret;
 }
 
 bool zswap_load(struct folio *folio)
-- 
2.27.0
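
For reference, a condensed sketch (assuming only the helpers introduced
above; the wrapper name is illustrative, and it omits the pool/objcg
reference handling) of the all-or-nothing behavior described in the
commit message, with the error path unwinding every entry for the folio's
swap offsets:

static bool zswap_store_folio_sketch(struct folio *folio, struct xarray *tree,
				     pgoff_t offset, long nr_pages,
				     struct obj_cgroup *objcg,
				     struct zswap_pool *pool)
{
	long index;

	for (index = 0; index < nr_pages; ++index) {
		if (!zswap_store_page(folio, index, objcg, pool)) {
			/* Unwind: erase every entry for this folio's offsets. */
			zswap_delete_stored_offsets(tree, offset, nr_pages);
			return false;
		}
	}

	return true;
}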

From: Kanchana P Sridhar
Subject: [RFC PATCH v1 4/4] mm: page_io: Count successful mTHP zswap stores in vmstat.
Date: Tue, 13 Aug 2024 23:28:30 -0700
Message-Id: <20240814062830.26833-5-kanchana.p.sridhar@intel.com>
In-Reply-To: <20240814062830.26833-1-kanchana.p.sridhar@intel.com>
References: <20240814062830.26833-1-kanchana.p.sridhar@intel.com>

Add count_zswap_thp_swpout_vm_event(), which increments the appropriate
mTHP/PMD vmstat event counters when zswap_store() succeeds for a folio.

A successful zswap_store() of a folio of order [0, HPAGE_PMD_ORDER-1]
increments one of these vmstat event counters:

  ZSWPOUT_4KB_FOLIO
  mTHP_ZSWPOUT_8kB
  mTHP_ZSWPOUT_16kB
  mTHP_ZSWPOUT_32kB
  mTHP_ZSWPOUT_64kB
  mTHP_ZSWPOUT_128kB
  mTHP_ZSWPOUT_256kB
  mTHP_ZSWPOUT_512kB
  mTHP_ZSWPOUT_1024kB

A successful zswap_store() of a PMD-size THP, i.e., a folio of order
HPAGE_PMD_ORDER, increments both of these vmstat event counters:

  ZSWPOUT_PMD_THP_FOLIO
  mTHP_ZSWPOUT_2048kB

Signed-off-by: Kanchana P Sridhar
---
 mm/page_io.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/mm/page_io.c b/mm/page_io.c
index 0a150c240bf4..ab54d2060cc4 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -172,6 +172,49 @@ int generic_swapfile_activate(struct swap_info_struct *sis,
 	goto out;
 }
 
+/*
+ * Count vmstats for ZSWAP store of large folios (mTHP and PMD-size THP).
+ */
+static inline void count_zswap_thp_swpout_vm_event(struct folio *folio)
+{
+	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && folio_test_pmd_mappable(folio)) {
+		count_vm_event(ZSWPOUT_PMD_THP_FOLIO);
+		count_vm_event(mTHP_ZSWPOUT_2048kB);
+	} else if (folio_order(folio) == 0) {
+		count_vm_event(ZSWPOUT_4KB_FOLIO);
+	} else if (IS_ENABLED(CONFIG_THP_SWAP)) {
+		switch (folio_order(folio)) {
+		case 1:
+			count_vm_event(mTHP_ZSWPOUT_8kB);
+			break;
+		case 2:
+			count_vm_event(mTHP_ZSWPOUT_16kB);
+			break;
+		case 3:
+			count_vm_event(mTHP_ZSWPOUT_32kB);
+			break;
+		case 4:
+			count_vm_event(mTHP_ZSWPOUT_64kB);
+			break;
+		case 5:
+			count_vm_event(mTHP_ZSWPOUT_128kB);
+			break;
+		case 6:
+			count_vm_event(mTHP_ZSWPOUT_256kB);
+			break;
+		case 7:
+			count_vm_event(mTHP_ZSWPOUT_512kB);
+			break;
+		case 8:
+			count_vm_event(mTHP_ZSWPOUT_1024kB);
+			break;
+		case 9:
+			count_vm_event(mTHP_ZSWPOUT_2048kB);
+			break;
+		}
+	}
+}
+
 /*
  * We may have stale swap cache pages in memory: notice
  * them here and get rid of the unnecessary final write.
@@ -196,6 +239,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
 		return ret;
 	}
 	if (zswap_store(folio)) {
+		count_zswap_thp_swpout_vm_event(folio);
 		folio_start_writeback(folio);
 		folio_unlock(folio);
 		folio_end_writeback(folio);
-- 
2.27.0
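
For reference, a small sketch (not part of this patch; the helper is
illustrative) of the order-to-size mapping that the switch statement
above encodes: with 4kB base pages, an order-N folio spans
PAGE_SIZE << N bytes, so order 5 maps to mthp_zswpout_128kb and order 9
(the PMD order on x86_64) to mthp_zswpout_2048kb:

/* Illustrative only: folio size in kB as a function of its order. */
static inline unsigned long folio_size_kb(struct folio *folio)
{
	return (PAGE_SIZE << folio_order(folio)) / 1024;
}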