This series reduces lock contention in zs_free(), which dominates the
unmap path under memory pressure on Android (LMK kills) and on x86
servers running zswap-heavy workloads.

The current zs_free() takes pool->lock (rwlock, read side) just to
look up the size_class for a handle, then takes class->lock and holds
it across __free_zspage(), which can call into the buddy allocator and
acquire zone->lock. Two costs follow:

  * pool->lock reader-counter cacheline bouncing among concurrent
    zs_free() callers.
  * class->lock held across folio_put(), so any zone->lock wait
    fans out to every other zs_free() on the same class.
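
The second cost can be illustrated with a small userspace model. This
is only a sketch: toy_class, toy_lock() and the other toy_* names are
hypothetical stand-ins and do not appear in mm/zsmalloc.c. It counts
how often a potentially slow release runs while the class lock is held,
first with the release inside the critical section and then with only
the bookkeeping done under the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from mm/zsmalloc.c. */
struct toy_class {
	bool locked;           /* models class->lock being held */
	int nr_pages;          /* bookkeeping protected by the lock */
	int frees_under_lock;  /* slow releases that ran with the lock held */
};

static void toy_lock(struct toy_class *c)
{
	assert(!c->locked);
	c->locked = true;
}

static void toy_unlock(struct toy_class *c)
{
	assert(c->locked);
	c->locked = false;
}

/* Models the potentially slow release into the page allocator. */
static void toy_release_page(struct toy_class *c, void *page)
{
	if (c->locked)
		c->frees_under_lock++;
	free(page);
}

/* The slow release happens inside the critical section. */
static void toy_free_locked(struct toy_class *c, void *page)
{
	toy_lock(c);
	c->nr_pages--;
	toy_release_page(c, page);
	toy_unlock(c);
}

/* Only bookkeeping is done under the lock; the release follows it. */
static void toy_free_deferred(struct toy_class *c, void *page)
{
	toy_lock(c);
	c->nr_pages--;
	toy_unlock(c);
	toy_release_page(c, page);
}
```

In the first variant every waiter on the lock also waits for the
release; in the second, the lock covers only the counter update.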

The series tackles both:

  Patch 1: encode the size_class index into obj alongside PFN and
           obj_idx, so zs_free() can locate the class without
           pool->lock.
  Patch 2: drop pool->lock from zs_free() on 64-bit; 32-bit unchanged.
  Patch 3: move zspage page-freeing out of class->lock.
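
To make the idea behind patch 1 concrete, here is a minimal sketch of
packing a class index into a 64-bit obj value next to the PFN and
obj_idx. The field widths, the field order, and all TOY_* / toy_* names
are assumptions made for illustration; the actual macros and layout are
defined by the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths, chosen only for illustration. */
#define TOY_OBJ_IDX_BITS 12
#define TOY_CLASS_BITS   8
#define TOY_OBJ_IDX_MASK ((UINT64_C(1) << TOY_OBJ_IDX_BITS) - 1)
#define TOY_CLASS_MASK   ((UINT64_C(1) << TOY_CLASS_BITS) - 1)
#define TOY_PFN_SHIFT    (TOY_OBJ_IDX_BITS + TOY_CLASS_BITS)

/* Pack PFN (high bits), class index (middle) and obj_idx (low bits). */
static uint64_t toy_obj_encode(uint64_t pfn, unsigned int class_idx,
			       unsigned int obj_idx)
{
	return (pfn << TOY_PFN_SHIFT) |
	       (((uint64_t)class_idx & TOY_CLASS_MASK) << TOY_OBJ_IDX_BITS) |
	       ((uint64_t)obj_idx & TOY_OBJ_IDX_MASK);
}

/* Recover the class index from obj alone, with no pool-wide lookup. */
static unsigned int toy_obj_class(uint64_t obj)
{
	return (unsigned int)((obj >> TOY_OBJ_IDX_BITS) & TOY_CLASS_MASK);
}

static unsigned int toy_obj_idx(uint64_t obj)
{
	return (unsigned int)(obj & TOY_OBJ_IDX_MASK);
}

static uint64_t toy_obj_pfn(uint64_t obj)
{
	return obj >> TOY_PFN_SHIFT;
}
```

With this toy layout, 44 bits remain for the PFN in a 64-bit value. A
32-bit value would leave only 12 bits, which is one plausible reason to
leave 32-bit unchanged.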

Performance results:

Test: each process independently mmaps 256MB, writes data, and calls
madvise(MADV_PAGEOUT) to swap it out via zram (lzo-rle); the processes
then munmap concurrently.
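
For readers who want to reproduce something similar, the per-process
part of the test could look roughly like the sketch below. It is not
the benchmark used for the numbers in this letter: the function name
toy_timed_munmap(), the fill pattern, and the choice to time only
munmap() are assumptions, and the fork and start synchronization of the
concurrent processes is left out.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* Linux value, for older libc headers */
#endif

/*
 * Hypothetical helper: map len bytes, dirty them, ask the kernel to
 * page them out, then time munmap(). Returns nanoseconds, or -1 if
 * mmap() or munmap() fails.
 */
static long long toy_timed_munmap(size_t len)
{
	struct timespec t0, t1;
	uint32_t seed = 1;
	unsigned char *buf;
	size_t i;

	assert(len > 0);
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Non-uniform data, so pages are not trivially same-filled. */
	for (i = 0; i < len; i++) {
		seed = seed * 1664525u + 1013904223u;
		buf[i] = (unsigned char)(seed >> 24);
	}

	/* Best effort: reclaim may be partial or unsupported here. */
	(void)madvise(buf, len, MADV_PAGEOUT);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (munmap(buf, len) != 0)
		return -1;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}
```

The zs_free() path is only exercised if zram is configured as swap;
otherwise munmap() just drops resident pages.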

Raspberry Pi 4B (4-core ARM64 Cortex-A72):

  mode       Base      Patched   Speedup
  single     59.0ms    56.0ms    1.05x
  multi 2p   94.6ms    66.7ms    1.42x
  multi 4p   202.9ms   110.6ms   1.83x

x86 (20-thread Intel i7-12700, 16 concurrent processes):

  mode       Base      Patched   Speedup
  single     11.7ms    9.8ms     1.19x
  multi 2p   24.1ms    17.2ms    1.40x
  multi 4p   63.0ms    45.3ms    1.39x

Changes since v1:
  - Rename obj-encoding macros for clarity.
  - Make 32-bit / 64-bit handling transparent at call sites
    (no #ifdef in callers).
  - Extract a helper to keep zs_free() unified.

Wenchao Hao (2):
  mm/zsmalloc: encode class index in obj value for lockless class lookup
  mm/zsmalloc: drop pool->lock from zs_free on 64-bit systems

Xueyuan Chen (1):
  mm/zsmalloc: drop class lock before freeing zspage

 mm/zsmalloc.c | 175 +++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 143 insertions(+), 32 deletions(-)

--
2.34.1