Hi everyone,

We've been interested in the patch series about parallelizing writeback [1]
and have been following its discussion and development. Our testing in
several application scenarios on mobile devices has shown significant
performance improvements.

Currently, we're focusing on how the number of writeback contexts impacts
performance on different filesystems and storage workloads. We noticed the
previous discussion about making the number of writeback contexts an opt-in
configuration to adapt to different filesystems [2]. Currently, it can only
be set via a sysfs interface at system initialization. We'd like to discuss
the possibility of supporting dynamic runtime configuration of the number
of writeback contexts.

We have developed a mechanism that allows the number of writeback contexts
to be configured at runtime via a sysfs interface. To configure, use:

  echo <nr_wb_ctx> > /sys/class/bdi/<dev>/nwritebacks

Our implementation supports *increasing* the number of writeback contexts.
This is achieved by dynamically allocating new writeback contexts and
replacing the existing bdi->wb_ctx_arr and bdi->nr_wb_ctx. However, we have
not yet solved the problem of safely *reducing* bdi->nr_wb_ctx. Several
challenges remain:

- How should we safely handle ongoing I/O when contexts are removed?
- What is the correct way to migrate pending writeback work and related
  resources to other writeback contexts?
- Should this be a per-device or a global setting?

We're sharing this early implementation to gather feedback on:

1. Is runtime configurability of writeback contexts a worthwhile goal?
2. How should we handle synchronization and migration when dynamically
   changing bdi->nr_wb_ctx, particularly when removing active writeback
   contexts?
3. Are there better tests to validate the stability of this approach?

We look forward to feedback and suggestions for further improvements.

[1] Parallelizing filesystem writeback:
    https://lore.kernel.org/linux-fsdevel/20250529111504.89912-1-kundan.kumar@samsung.com/
[2] Discussion on configuring the number of writeback contexts:
    https://lore.kernel.org/linux-fsdevel/20250609040056.GA26101@lst.de/

wangyufei (1):
  writeback: add sysfs to config the number of writeback contexts

 include/linux/backing-dev.h |  3 ++
 mm/backing-dev.c            | 59 +++++++++++++++++++++++++++++++++++
 mm/page-writeback.c         | 61 +++++++++++++++++++++++++++++++++++++
 3 files changed, 123 insertions(+)

-- 
2.39.0
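For reference, here is a minimal sketch of what such a grow-only store
handler could look like. The wb_ctx_arr/nr_wb_ctx fields and struct
bdi_writeback_ctx are taken from the cover letter above; nwritebacks_store,
bdi_wb_ctx_alloc and bdi_wb_ctx_free are hypothetical names, and the bare
pointer/counter swap at the end only stands in for whatever synchronization
scheme the real patch would need:

#include <linux/backing-dev.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/slab.h>

/* Hypothetical helpers: allocate/start and tear down one writeback context. */
static struct bdi_writeback_ctx *bdi_wb_ctx_alloc(struct backing_dev_info *bdi);
static void bdi_wb_ctx_free(struct bdi_writeback_ctx *wb_ctx);

static ssize_t nwritebacks_store(struct device *dev,
				 struct device_attribute *attr,
				 const char *buf, size_t count)
{
	struct backing_dev_info *bdi = dev_get_drvdata(dev);
	struct bdi_writeback_ctx **new_arr, **old_arr;
	unsigned int new_nr, i;
	int err;

	err = kstrtouint(buf, 10, &new_nr);
	if (err)
		return err;

	/* Shrinking is the unsolved part; only allow growing for now. */
	if (new_nr <= bdi->nr_wb_ctx)
		return -EINVAL;

	new_arr = kcalloc(new_nr, sizeof(*new_arr), GFP_KERNEL);
	if (!new_arr)
		return -ENOMEM;

	/* Existing contexts keep running; only their slots are copied. */
	old_arr = bdi->wb_ctx_arr;
	for (i = 0; i < bdi->nr_wb_ctx; i++)
		new_arr[i] = old_arr[i];

	for (; i < new_nr; i++) {
		new_arr[i] = bdi_wb_ctx_alloc(bdi);
		if (!new_arr[i]) {
			/* Roll back the partially initialized tail. */
			while (i-- > bdi->nr_wb_ctx)
				bdi_wb_ctx_free(new_arr[i]);
			kfree(new_arr);
			return -ENOMEM;
		}
	}

	/*
	 * Publish the array before the count so a racing reader never
	 * indexes past initialized slots. Freeing old_arr immediately is
	 * only safe if lookups are excluded by a proper scheme (e.g. RCU
	 * with a grace period); it is a placeholder here.
	 */
	bdi->wb_ctx_arr = new_arr;
	smp_wmb();
	bdi->nr_wb_ctx = new_nr;
	kfree(old_arr);

	return count;
}
static DEVICE_ATTR_WO(nwritebacks);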
On 8/25/2025 5:59 PM, wangyufei wrote:
> Hi everyone,
>
> We've been interested in the patch series about parallelizing writeback [1]
> and have been following its discussion and development. Our testing in
> several application scenarios on mobile devices has shown significant
> performance improvements.
>

Hi,

Thanks for sharing this work.

Could you clarify a few details about your test setup?

- Which filesystem did you run these experiments on?
- What were the specifics of the workload (number of threads, block size,
  I/O size)?
- If you are using fio, can you please share the fio command?
- How much RAM was available on the test system?
- Can you share the performance improvement numbers you observed?

That would help in understanding the impact of parallel writeback.

I made similar modifications to dynamically configure the number of
writeback threads in this experimental patch; refer to patches 14 and 15:
https://lore.kernel.org/all/20250807045706.2848-1-kundan.kumar@samsung.com/
The key difference is that this change also enables a reduction in the
number of writeback threads.

Thanks,
Kundan
On 8/29/2025 4:59 PM, Kundan Kumar wrote:
> On 8/25/2025 5:59 PM, wangyufei wrote:
>> Hi everyone,
>>
>> We've been interested in the patch series about parallelizing writeback [1]
>> and have been following its discussion and development. Our testing in
>> several application scenarios on mobile devices has shown significant
>> performance improvements.
>>
> Hi,
>
> Thanks for sharing this work.
>
> Could you clarify a few details about your test setup?
>
> - Which filesystem did you run these experiments on?
> - What were the specifics of the workload (number of threads, block size,
>   I/O size)?
> - If you are using fio, can you please share the fio command?
> - How much RAM was available on the test system?
> - Can you share the performance improvement numbers you observed?
>
> That would help in understanding the impact of parallel writeback.

Hi Kundan,

We tested this patch mostly on mobile devices. The test platform setup is
as follows:

- Filesystem: F2FS
- System config: 8 CPUs, 11 GiB of RAM
- Workload: the same fio command as mentioned in your patch:

  fio --directory=/mnt --name=test --bs=4k --iodepth=1024 --rw=randwrite
      --ioengine=io_uring --time_based=1 --runtime=60 --numjobs=8
      --size=450M --direct=0 --eta-interval=1 --eta-newline=1
      --group_reporting

- Performance gains:
  Base F2FS               :  973 MiB/s
  Parallel writeback F2FS : 1237 MiB/s (+27%)

> I made similar modifications to dynamically configure the number of
> writeback threads in this experimental patch; refer to patches 14 and 15:
> https://lore.kernel.org/all/20250807045706.2848-1-kundan.kumar@samsung.com/
> The key difference is that this change also enables a reduction in the
> number of writeback threads.

Thanks for sharing the patch. I have a few questions:

- The current approach freezes the filesystem and reallocates all
  writeback_ctx structures. Could this introduce latency? In some cases,
  I think the existing bdi_writeback_ctx structures could be reused
  instead (see the sketch below).
- Are there other use cases for dynamic thread tuning besides
  initialization and testing?
- What methods are used to test the stability of this function?

Finally, I would like to ask whether there are any remaining problems to be
solved, or optimization directions worth discussing, for parallelizing
filesystem writeback.

Thanks,
yufei
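To make the reuse question above concrete, here is a rough sketch, again
using the wb_ctx_arr/nr_wb_ctx layout from the cover letter, with
hypothetical bdi_wb_ctx_create/bdi_wb_ctx_flush/bdi_wb_ctx_destroy helpers
and the assumption that wb_ctx_arr was allocated with enough spare capacity
up front. Growing only initializes the new tail and leaves surviving
contexts (and their dirty state) untouched, instead of freezing and
rebuilding everything; the shrink half is only an outline of the open
migration problem, not a solution:

#include <linux/backing-dev.h>

/* Hypothetical helpers; the real series would define these. */
static int bdi_wb_ctx_create(struct backing_dev_info *bdi, unsigned int idx);
static void bdi_wb_ctx_flush(struct backing_dev_info *bdi, unsigned int idx);
static void bdi_wb_ctx_destroy(struct backing_dev_info *bdi, unsigned int idx);

static int bdi_resize_wb_ctx(struct backing_dev_info *bdi, unsigned int new_nr)
{
	unsigned int old_nr = bdi->nr_wb_ctx;
	unsigned int i;

	if (new_nr == old_nr)
		return 0;

	if (new_nr > old_nr) {
		/* Grow: surviving contexts stay live and keep their state. */
		for (i = old_nr; i < new_nr; i++) {
			int err = bdi_wb_ctx_create(bdi, i);

			if (err)
				return err;
		}
		bdi->nr_wb_ctx = new_nr;	/* publish after init */
		return 0;
	}

	/*
	 * Shrink: stop routing new work to the tail contexts first, then
	 * drain what they already own before tearing them down. How to
	 * migrate pending writeback (and inodes already bound to a tail
	 * context) safely is exactly the open question in this thread.
	 */
	bdi->nr_wb_ctx = new_nr;		/* new work avoids the tail */
	for (i = new_nr; i < old_nr; i++) {
		bdi_wb_ctx_flush(bdi, i);	/* drain in-flight I/O */
		bdi_wb_ctx_destroy(bdi, i);
	}
	return 0;
}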
On 25.08.25 14:29, wangyufei wrote:
> Hi everyone,
>
> We've been interested in the patch series about parallelizing writeback [1]
> and have been following its discussion and development. Our testing in
> several application scenarios on mobile devices has shown significant
> performance improvements.
>
> Currently, we're focusing on how the number of writeback contexts impacts
> performance on different filesystems and storage workloads. We noticed the
> previous discussion about making the number of writeback contexts an opt-in
> configuration to adapt to different filesystems [2]. Currently, it can only
> be set via a sysfs interface at system initialization. We'd like to discuss
> the possibility of supporting dynamic runtime configuration of the number
> of writeback contexts.
>
> We have developed a mechanism that allows the number of writeback contexts
> to be configured at runtime via a sysfs interface. To configure, use:
>
>   echo <nr_wb_ctx> > /sys/class/bdi/<dev>/nwritebacks

What's the target use case for updating it dynamically?

If it's mostly for debugging/testing (find out what works, what doesn't),
it might better go into debugfs or just be carried out of tree.

If it's about setting sane defaults based on specific filesystems, maybe it
could be optimized from within the kernel, without the need to expose this
to an admin?

-- 
Cheers

David / dhildenb
On Mon, Aug 25, 2025 at 04:46:46PM +0200, David Hildenbrand wrote:
> On 25.08.25 14:29, wangyufei wrote:
> > Hi everyone,
> >
> > We've been interested in the patch series about parallelizing writeback [1]
> > and have been following its discussion and development. Our testing in
> > several application scenarios on mobile devices has shown significant
> > performance improvements.
> >
> > Currently, we're focusing on how the number of writeback contexts impacts
> > performance on different filesystems and storage workloads. We noticed the
> > previous discussion about making the number of writeback contexts an opt-in
> > configuration to adapt to different filesystems [2]. Currently, it can only
> > be set via a sysfs interface at system initialization. We'd like to discuss
> > the possibility of supporting dynamic runtime configuration of the number
> > of writeback contexts.
> >
> > We have developed a mechanism that allows the number of writeback contexts
> > to be configured at runtime via a sysfs interface. To configure, use:
> >
> >   echo <nr_wb_ctx> > /sys/class/bdi/<dev>/nwritebacks
>
> What's the target use case for updating it dynamically?
>
> If it's mostly for debugging/testing (find out what works, what doesn't),
> it might better go into debugfs or just be carried out of tree.
>
> If it's about setting sane defaults based on specific filesystems, maybe it
> could be optimized from within the kernel, without the need to expose this
> to an admin?

I was assuming that this patch is for people who are experimenting to
gather data more effectively. I'd NAK it being included, but it's good to
have it out on the list so other people don't have to reinvent it.