From: Chi Zhiling <chizhiling@kylinos.cn>
This series optimizes shmem read performance by implementing folio
batching in the read path and eliminating unnecessary folio lock
operations.

Performance testing with fio
(--ioengine=sync --rw=read --size=1G --runtime=120):

  shmem (THP disabled):
    bs=1M:   11.4 GiB/s
    bs=64k:  11.1 GiB/s
    bs=4k:   3814 MiB/s

  shmem (THP disabled) + fbatch:
    bs=1M:   12.8 GiB/s (+12%)
    bs=64k:  12.3 GiB/s (+11%)
    bs=4k:   3783 MiB/s (-0.8%)

  shmem (THP enabled):
    bs=1M:   13.8 GiB/s
    bs=64k:  13.1 GiB/s
    bs=4k:   3851 MiB/s

  shmem (THP enabled) + fbatch:
    bs=1M:   14.0 GiB/s (+1%)
    bs=64k:  13.4 GiB/s (+2%)
    bs=4k:   3811 MiB/s (-1%)

  shmem preallocated via fallocate (THP disabled):
    bs=1M:   24.0 GiB/s
    bs=64k:  22.5 GiB/s
    bs=4k:   4670 MiB/s

  shmem preallocated via fallocate (THP disabled) + fbatch:
    bs=1M:   29.3 GiB/s (+22%)
    bs=64k:  26.7 GiB/s (+19%)
    bs=4k:   4654 MiB/s (-0.3%)

  shmem preallocated via fallocate (THP enabled):
    bs=1M:   24.0 GiB/s
    bs=64k:  22.9 GiB/s
    bs=4k:   4698 MiB/s

  shmem preallocated via fallocate (THP enabled) + fbatch:
    bs=1M:   34.3 GiB/s (+43%)
    bs=64k:  31.5 GiB/s (+38%)
    bs=4k:   4689 MiB/s (-0.2%)
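
For anyone who wants to recheck the percentages, here is a minimal sketch
in Python. It assumes each delta is the plain relative change against the
matching baseline run. The helper name pct_change is hypothetical and not
part of the series, and only a few rows from the results are included.

```python
def pct_change(before: float, after: float) -> float:
    """Return the relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (label, baseline throughput, throughput with the series applied),
# copied from the fio results quoted in this cover letter.
rows = [
    ("shmem, THP disabled, bs=1M (GiB/s)", 11.4, 12.8),
    ("shmem, THP disabled, bs=4k (MiB/s)", 3814, 3783),
    ("fallocate, THP disabled, bs=1M (GiB/s)", 24.0, 29.3),
    ("fallocate, THP enabled, bs=1M (GiB/s)", 24.0, 34.3),
]

for label, before, after in rows:
    print(f"{label}: {pct_change(before, after):+.1f}%")
```

Rounded to whole percent, these rows agree with the figures quoted in the
results; the bs=4k deltas are all within about 1% of the baseline.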

Chi Zhiling (4):
  mm/shmem: add SGP_GET to get unlocked folio
  mm/shmem: use SGP_GET in read operations
  mm/shmem: optimize file read with folio batching
  mm/shmem: make SGP_NOALLOC succeed on hole like SGP_READ

 include/linux/shmem_fs.h |   5 +-
 mm/khugepaged.c          |   2 +-
 mm/shmem.c               | 132 ++++++++++++++++++++++++++++++++-------
 3 files changed, 112 insertions(+), 27 deletions(-)

--
2.43.0