[PATCH v8 0/3] mm: enable large folios swap-in support

Barry Song posted 3 patches 2 weeks ago
From: Barry Song <v-songbaohua@oppo.com>

Currently, we support mTHP swapout but not swapin. This means that once an mTHP
is swapped out, it will come back as small folios when swapped in. This is
particularly detrimental for systems like Android, where more than half of
the memory is in swap.

The lack of mTHP swapin functionality is a showstopper for mTHP in scenarios
that heavily rely on swap. This patchset introduces mTHP swap-in support.
It starts with synchronous devices similar to zRAM, aiming to benefit as
many users as possible with minimal changes.

-v8:
 * fix the conflicts with zeromap (this is also a hotfix to zeromap with a
   Fixes tag), reported by Kairui, thanks!
   Usama, Yosry, thanks for all your comments during the discussion!
 * refine the changelog to add the case Kanchana reported: with Intel
   IAA, mTHP swap-in can improve zRAM read latency by 7X. thanks!
 * some other code cleanup
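
 The partial-zeromap case mentioned above can be pictured with a small
 user-space sketch. This is not the kernel code: classify_zeromap() and the
 enum are hypothetical names, and the bool array merely stands in for
 whatever per-slot record marks a swap slot as zero-filled. The point is
 only that a range backing a large folio can be all zero-filled, not
 zero-filled at all, or a mix, and the mixed case needs separate handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of a range of swap slots. */
enum zeromap_state { ZEROMAP_NONE, ZEROMAP_ALL, ZEROMAP_PARTIAL };

/*
 * Hypothetical helper, not a kernel API: classify 'nr' slots starting at
 * 'start'. zero[i] is true when slot i was recorded as zero-filled.
 */
static enum zeromap_state classify_zeromap(const bool *zero, size_t start,
					   size_t nr)
{
	size_t zeros = 0;

	assert(zero != NULL && nr > 0);

	/* Count how many slots in the range are marked zero-filled. */
	for (size_t i = 0; i < nr; i++)
		zeros += zero[start + i] ? 1 : 0;

	if (zeros == 0)
		return ZEROMAP_NONE;
	if (zeros == nr)
		return ZEROMAP_ALL;
	return ZEROMAP_PARTIAL; /* mixed: cannot be treated as one uniform read */
}
```

 A caller would treat only ZEROMAP_NONE and ZEROMAP_ALL as uniform ranges
 and handle ZEROMAP_PARTIAL as a special case.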

-v7:
 https://lore.kernel.org/linux-mm/20240821074541.516249-1-hanchuanhua@oppo.com/
 * collect Chris's ack tags, thanks!
 * adjust the comment and subject, as pointed out by Christoph.
 * make alloc_swap_folio() always charge the folio to fix the problem of charge
   failure in memcg when the memory limit is reached (reported and pointed out
   by Kairui), as pointed out by Kefeng, Matthew.

-v6:
 https://lore.kernel.org/linux-mm/20240802122031.117548-1-21cnbao@gmail.com/
 * remove the swapin control added in v5, per Willy, Christoph;
   The original reason for adding the swapin_enabled control was primarily
   to address concerns for slower devices. Currently, since we only support
   fast sync devices, swap-in size is less of a concern.
   We’ll gain a clearer understanding of the next steps as more devices
   begin to support mTHP swap-in.
 * add nr argument in mem_cgroup_swapin_uncharge_swap() instead of adding
   new API, Willy;
 * swapcache_prepare() and swapcache_clear() large folios support is also
   removed, as it has been split out per Baolin's request and is now in
   mm-unstable.
 * provide more data in changelog.

-v5:
 https://lore.kernel.org/linux-mm/20240726094618.401593-1-21cnbao@gmail.com/

 * Add swap-in control policy according to Ying's proposal. Right now only
   "always" and "never" are supported, later we can extend to "auto";
 * Fix the comment regarding zswap_never_enabled() according to Yosry;
 * Filter out unaligned swp entries earlier;
 * add mem_cgroup_swapin_uncharge_swap_nr() helper
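
 The "filter out unaligned swp entries" item above can be illustrated with
 a small user-space sketch. It is not the kernel code:
 swapin_range_aligned() and SKETCH_PAGE_SHIFT are hypothetical names, and
 the sketch assumes 4 KiB pages and that a large folio of nr pages is only
 a candidate when the faulting page and its swap slot sit at the same index
 inside their naturally aligned blocks of nr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Hypothetical helper, not a kernel API: return true when the page at
 * fault address 'addr' and the swap slot 'swp_off' have the same index
 * within their naturally aligned blocks of 'nr' pages/slots.
 */
static bool swapin_range_aligned(uint64_t addr, uint64_t swp_off,
				 unsigned int nr)
{
	uint64_t page_idx, slot_idx;

	assert(nr != 0 && (nr & (nr - 1)) == 0); /* nr must be a power of two */

	page_idx = (addr >> SKETCH_PAGE_SHIFT) & (nr - 1);
	slot_idx = swp_off & (nr - 1);

	return page_idx == slot_idx;
}
```

 Rejecting a mismatch early means no large folio is allocated for a range
 that could never be mapped as one contiguous batch.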

-v4:
 https://lore.kernel.org/linux-mm/20240629111010.230484-1-21cnbao@gmail.com/

 Many parts of v3 have been merged into the mm tree with review help from
 Ryan, David, Ying, Chris and others. Thank you very much!
 This is the final part to allocate large folios and map them.

 * Use Yosry's zswap_never_enabled(); note there is a bug in it. I put the bug
   fix in this v4 RFC, though it should be fixed in Yosry's patch
 * lots of code improvement (drop large stack, hold ptl etc) according
   to Yosry's and Ryan's feedback
 * rebased on top of the latest mm-unstable and utilized some new helpers
   introduced recently.

-v3:
 https://lore.kernel.org/linux-mm/20240304081348.197341-1-21cnbao@gmail.com/
 * avoid over-writing err in __swap_duplicate_nr, pointed out by Yosry,
   thanks!
 * fix the issue that a folio is charged twice in do_swap_page, separating
   alloc_anon_folio and alloc_swap_folio as they now have many differences:
   * memcg charging
   * clearing allocated folio or not

-v2:
 https://lore.kernel.org/linux-mm/20240229003753.134193-1-21cnbao@gmail.com/
 * lots of code cleanup according to Chris's comments, thanks!
 * collect Chris's ack tags, thanks!
 * address David's comment on moving to use folio_add_new_anon_rmap
   for !folio_test_anon in do_swap_page, thanks!
 * remove the MADV_PAGEOUT patch from this series as Ryan will
   integrate it into the swap-out series
 * Apply Kairui's work of "mm/swap: fix race when skipping swapcache"
   on large folios swap-in as well
 * fixed corrupted data (zero-filled data) in two races, zswap and the case
   where some entries are in swapcache while others are not, by checking
   SWAP_HAS_CACHE while swapping in a large folio
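
 The SWAP_HAS_CACHE check in the last item can be sketched in user space
 as follows. This is only an illustration under stated assumptions:
 slots_all_uncached() is a hypothetical name, the byte array stands in for
 a per-slot swap map, and SKETCH_HAS_CACHE is an assumed flag bit rather
 than a value taken from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_HAS_CACHE 0x40 /* assumption: bit marking a slot as in swapcache */

/*
 * Hypothetical helper, not a kernel API: return true only when none of
 * the 'nr' slots starting at 'start' is marked as being in swapcache.
 * A range with any cached slot is not treated as one large swap-in.
 */
static bool slots_all_uncached(const unsigned char *map, size_t start,
			       size_t nr)
{
	assert(map != NULL);

	for (size_t i = 0; i < nr; i++) {
		if (map[start + i] & SKETCH_HAS_CACHE)
			return false;
	}
	return true;
}
```

 When the check fails, a caller in this sketch would fall back to swapping
 in the faulting page on its own.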

-v1:
 https://lore.kernel.org/all/20240118111036.72641-1-21cnbao@gmail.com/#t

Barry Song (2):
  mm: Fix swap_read_folio_zeromap() for large folios with partial
    zeromap
  mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to
    support large folios

Chuanhua Han (1):
  mm: support large folios swap-in for sync io devices

 include/linux/memcontrol.h |   5 +-
 mm/memcontrol.c            |   7 +-
 mm/memory.c                | 261 +++++++++++++++++++++++++++++++++----
 mm/page_io.c               |  32 +----
 mm/swap.h                  |  33 +++++
 mm/swap_state.c            |   2 +-
 6 files changed, 282 insertions(+), 58 deletions(-)

-- 
2.34.1