[RFC PATCH v2 0/3] mm/zsmalloc: reduce lock contention in zs_free()

Wenchao Hao posted 3 patches 3 days ago
This series reduces lock contention in zs_free(), which dominates the
unmap path under memory pressure on Android (LMK kills) and on x86
servers running zswap-heavy workloads.

The current zs_free() takes pool->lock (rwlock, read side) just to
look up the size_class for a handle, then takes class->lock and holds
it across __free_zspage(), which can call into the buddy allocator
and acquire zone->lock.  Two costs follow:

  * pool->lock reader-counter cacheline bouncing among concurrent
    zs_free() callers.
  * class->lock held across folio_put(), so any zone->lock wait
    fans out to every other zs_free() on the same class.
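
The second cost, and the usual way around it, can be illustrated in
userspace. This is only a minimal sketch, not code from the patches:
every demo_* name is hypothetical, an atomic_flag spinlock stands in
for class->lock, and free() stands in for folio_put(). Empty pages
are detached while the lock is held and released only after it is
dropped, so a slow release never stalls other lock holders.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for an empty zspage waiting to be released. */
struct demo_page {
	struct demo_page *next;
};

/* Hypothetical stand-in for a size_class; lock protects the list. */
struct demo_class {
	atomic_flag lock;
	struct demo_page *empty;
};

static void demo_class_init(struct demo_class *c)
{
	atomic_flag_clear(&c->lock);
	c->empty = NULL;
}

static void demo_lock(struct demo_class *c)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		;
}

static void demo_unlock(struct demo_class *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Queue one empty page on the class. Returns 0, or -1 if malloc fails. */
static int demo_add_empty(struct demo_class *c)
{
	struct demo_page *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	demo_lock(c);
	p->next = c->empty;
	c->empty = p;
	demo_unlock(c);
	return 0;
}

/*
 * Detach the whole list under the lock, then release the pages with the
 * lock dropped. Returns the number of pages released.
 */
static int demo_free_empty(struct demo_class *c)
{
	struct demo_page *detached;
	int freed = 0;

	demo_lock(c);
	detached = c->empty;
	c->empty = NULL;
	demo_unlock(c);

	while (detached) {
		struct demo_page *next = detached->next;

		free(detached);	/* slow path runs without the lock */
		detached = next;
		freed++;
	}
	return freed;
}
```

The critical section shrinks to two pointer writes, so any wait in
the release path is paid only by the caller doing the release.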

The series tackles both:

  Patch 1: encode size_class index into obj alongside PFN and obj_idx,
           so zs_free() can locate the class without pool->lock.
  Patch 2: drop pool->lock from zs_free() on 64-bit; 32-bit unchanged.
  Patch 3: move zspage page-freeing out of class->lock.
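
The idea behind Patch 1 can be sketched as plain bit packing. The
field widths and DEMO_*/demo_* names below are assumptions chosen for
illustration, not the macros the patch introduces, and any tag bits
the real obj value carries are ignored. The point is only that the
class index travels inside the obj value, so it can be recovered
without consulting a pool-wide structure.

```c
#include <stdint.h>

/* Assumed widths for illustration; the patch defines its own layout. */
#define DEMO_OBJ_IDX_BITS	10u
#define DEMO_CLASS_BITS		8u
#define DEMO_OBJ_IDX_MASK	((UINT64_C(1) << DEMO_OBJ_IDX_BITS) - 1)
#define DEMO_CLASS_MASK		((UINT64_C(1) << DEMO_CLASS_BITS) - 1)
#define DEMO_PFN_SHIFT		(DEMO_OBJ_IDX_BITS + DEMO_CLASS_BITS)

/* Pack as [ pfn | class_idx | obj_idx ], from high bits to low bits. */
static uint64_t demo_encode_obj(uint64_t pfn, unsigned int class_idx,
				unsigned int obj_idx)
{
	return (pfn << DEMO_PFN_SHIFT) |
	       (((uint64_t)class_idx & DEMO_CLASS_MASK) << DEMO_OBJ_IDX_BITS) |
	       ((uint64_t)obj_idx & DEMO_OBJ_IDX_MASK);
}

/* Recover the class index from the value alone, with no lock taken. */
static unsigned int demo_obj_class(uint64_t obj)
{
	return (unsigned int)((obj >> DEMO_OBJ_IDX_BITS) & DEMO_CLASS_MASK);
}

static unsigned int demo_obj_idx(uint64_t obj)
{
	return (unsigned int)(obj & DEMO_OBJ_IDX_MASK);
}

static uint64_t demo_obj_pfn(uint64_t obj)
{
	return obj >> DEMO_PFN_SHIFT;
}
```

With these assumed widths a 64-bit value leaves 46 bits for the PFN.
A 32-bit value would leave only 14, which is one plausible reason the
32-bit path is left unchanged.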

Performance results:

Test: each process independently mmaps 256MB, writes data, and calls
madvise(MADV_PAGEOUT) to swap the region out via zram (lzo-rle); the
processes then munmap concurrently, and the munmap time is measured.
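
A per-process step of such a test might look roughly like the sketch
below. It is not the benchmark used for the numbers that follow: the
function name, the fill pattern and the timing method are assumptions,
and the multi-process orchestration is left out. mmap(), madvise()
with MADV_PAGEOUT, munmap() and clock_gettime() are the standard
Linux calls.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* Linux value, for older libc headers */
#endif

/*
 * Map len bytes, dirty them, ask the kernel to page them out, then time
 * munmap(). Returns the munmap time in nanoseconds, or -1 on failure.
 * *paged_out is set to 1 if madvise() succeeded, else 0. Success only
 * means the request was accepted, not that every page reached swap.
 */
static long long demo_pageout_munmap_ns(size_t len, int *paged_out)
{
	struct timespec t0, t1;
	unsigned char *p;
	size_t i;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Arbitrary non-constant fill (an assumption about the data). */
	for (i = 0; i < len; i++)
		p[i] = (unsigned char)(i * 31u + (i >> 12));

	*paged_out = (madvise(p, len, MADV_PAGEOUT) == 0);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (munmap(p, len) != 0)
		return -1;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}
```

Calling demo_pageout_munmap_ns(256UL << 20, &ok) from several
processes released at the same moment would approximate the "multi"
rows. On a kernel without MADV_PAGEOUT or without swap configured,
ok comes back 0 and the timing does not exercise zs_free().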

Raspberry Pi 4B (4-core ARM64 Cortex-A72):

  mode        Base       Patched     Speedup
  single      59.0ms     56.0ms      1.05x
  multi 2p    94.6ms     66.7ms      1.42x
  multi 4p    202.9ms    110.6ms     1.83x

x86 (Intel i7-12700, 12 cores / 20 threads, 16 concurrent processes):

  mode        Base       Patched     Speedup
  single      11.7ms     9.8ms       1.19x
  multi 2p    24.1ms     17.2ms      1.40x
  multi 4p    63.0ms     45.3ms      1.39x

Changes since v1:
- Rename obj-encoding macros for clarity.
- Make 32-bit / 64-bit handling transparent at call sites
  (no #ifdef in callers).
- Extract a helper to keep zs_free() unified.

Wenchao Hao (2):
  mm/zsmalloc: encode class index in obj value for lockless class lookup
  mm/zsmalloc: drop pool->lock from zs_free on 64-bit systems

Xueyuan Chen (1):
  mm/zsmalloc: drop class lock before freeing zspage

 mm/zsmalloc.c | 175 +++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 143 insertions(+), 32 deletions(-)

--
2.34.1