Jens has already completed the development of uncached buffered I/O
in git [1], and in f2fs, uncached buffered I/O read can be enabled
simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.

I have been testing a use case locally, which aligns with Jens' test
case [2]. In the read scenario, using uncached buffered I/O results in
more stable read performance and a lower load on the background memory
reclaim thread (kswapd). So let's enable uncached buffered I/O reads on
F2FS.
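For readers who want to try the read side from user space, here is a minimal sketch of how an application could request an uncached buffered read. It assumes a glibc that provides preadv2() and a kernel that knows RWF_DONTCACHE; the helper names read_dontcache() and demo_read_roundtrip() and the /tmp path are made up for illustration, and the fallback to a plain pread() is my own choice, not something this patch mandates.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes preadv2() in <sys/uio.h> on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080   /* uapi value; older headers may lack it */
#endif

/*
 * Read up to len bytes at offset off and ask the kernel to drop the
 * page-cache folios once the data has been copied out. If the kernel or
 * the filesystem rejects the flag, fall back to a normal cached pread().
 */
ssize_t read_dontcache(int fd, void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t n;

	assert(buf != NULL);
	n = preadv2(fd, &iov, 1, off, RWF_DONTCACHE);
	if (n < 0 && (errno == EOPNOTSUPP || errno == EINVAL || errno == ENOSYS))
		n = pread(fd, buf, len, off);   /* flag unsupported here */
	return n;
}

/* Hypothetical self-check: write a scratch file and read it back. */
int demo_read_roundtrip(void)
{
	char path[] = "/tmp/dontcache_read_XXXXXX";
	const char msg[] = "uncached buffered read";
	char buf[64] = { 0 };
	int fd = mkstemp(path);
	int ok;

	if (fd < 0)
		return -1;
	ok = write(fd, msg, sizeof(msg)) == (ssize_t)sizeof(msg) &&
	     read_dontcache(fd, buf, sizeof(buf), 0) == (ssize_t)sizeof(msg) &&
	     memcmp(buf, msg, sizeof(msg)) == 0;
	close(fd);
	unlink(path);
	return ok ? 0 : -1;
}
```

The data returned is the same with or without the flag; only the page-cache residency afterwards should differ, which is why the fallback is safe for correctness.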
Read test data without using uncached buffered I/O:
reading bs 32768, uncached 0
1s: 1856MB/sec, MB=1856
2s: 1907MB/sec, MB=3763
3s: 1830MB/sec, MB=5594
4s: 1745MB/sec, MB=7333
5s: 1829MB/sec, MB=9162
6s: 1903MB/sec, MB=11075
7s: 1878MB/sec, MB=12942
8s: 1763MB/sec, MB=14718
9s: 1845MB/sec, MB=16549
10s: 1915MB/sec, MB=18481
11s: 1831MB/sec, MB=20295
12s: 1750MB/sec, MB=22066
13s: 1787MB/sec, MB=23832
14s: 1913MB/sec, MB=25769
15s: 1898MB/sec, MB=27668
16s: 1795MB/sec, MB=29436
17s: 1812MB/sec, MB=31248
18s: 1890MB/sec, MB=33139
19s: 1880MB/sec, MB=35020
20s: 1754MB/sec, MB=36810
08:36:26 UID PID %usr %system %guest %wait %CPU CPU Command
08:36:27 0 93 0.00 0.00 0.00 0.00 0.00 7 kswapd0
08:36:28 0 93 0.00 0.00 0.00 0.00 0.00 7 kswapd0
08:36:29 0 93 0.00 0.00 0.00 0.00 0.00 7 kswapd0
08:36:30 0 93 0.00 56.00 0.00 0.00 56.00 7 kswapd0
08:36:31 0 93 0.00 73.00 0.00 0.00 73.00 7 kswapd0
08:36:32 0 93 0.00 83.00 0.00 0.00 83.00 7 kswapd0
08:36:33 0 93 0.00 75.00 0.00 0.00 75.00 7 kswapd0
08:36:34 0 93 0.00 81.00 0.00 0.00 81.00 7 kswapd0
08:36:35 0 93 0.00 54.00 0.00 1.00 54.00 2 kswapd0
08:36:36 0 93 0.00 61.00 0.00 0.00 61.00 0 kswapd0
08:36:37 0 93 0.00 68.00 0.00 0.00 68.00 7 kswapd0
08:36:38 0 93 0.00 53.00 0.00 0.00 53.00 2 kswapd0
08:36:39 0 93 0.00 82.00 0.00 0.00 82.00 7 kswapd0
08:36:40 0 93 0.00 77.00 0.00 0.00 77.00 1 kswapd0
08:36:41 0 93 0.00 74.00 0.00 1.00 74.00 7 kswapd0
08:36:42 0 93 0.00 71.00 0.00 0.00 71.00 7 kswapd0
08:36:43 0 93 0.00 78.00 0.00 0.00 78.00 7 kswapd0
08:36:44 0 93 0.00 85.00 0.00 0.00 85.00 7 kswapd0
08:36:45 0 93 0.00 83.00 0.00 0.00 83.00 7 kswapd0
08:36:46 0 93 0.00 70.00 0.00 0.00 70.00 7 kswapd0
08:36:47 0 93 0.00 78.00 0.00 1.00 78.00 2 kswapd0
08:36:48 0 93 0.00 81.00 0.00 0.00 81.00 3 kswapd0
08:36:49 0 93 0.00 54.00 0.00 0.00 54.00 7 kswapd0
08:36:50 0 93 0.00 76.00 0.00 0.00 76.00 1 kswapd0
08:36:51 0 93 0.00 75.00 0.00 0.00 75.00 0 kswapd0
08:36:52 0 93 0.00 73.00 0.00 0.00 73.00 7 kswapd0
08:36:53 0 93 0.00 61.00 0.00 1.00 61.00 7 kswapd0
08:36:54 0 93 0.00 80.00 0.00 0.00 80.00 7 kswapd0
08:36:55 0 93 0.00 64.00 0.00 0.00 64.00 7 kswapd0
08:36:56 0 93 0.00 56.00 0.00 0.00 56.00 7 kswapd0
08:36:57 0 93 0.00 26.00 0.00 0.00 26.00 2 kswapd0
08:36:58 0 93 0.00 24.00 0.00 1.00 24.00 3 kswapd0
08:36:59 0 93 0.00 22.00 0.00 1.00 22.00 3 kswapd0
08:37:00 0 93 0.00 15.84 0.00 0.00 15.84 3 kswapd0
08:37:01 0 93 0.00 0.00 0.00 0.00 0.00 3 kswapd0
08:37:02 0 93 0.00 0.00 0.00 0.00 0.00 3 kswapd0
Read test data after using uncached buffered I/O:
reading bs 32768, uncached 1
1s: 1863MB/sec, MB=1863
2s: 1903MB/sec, MB=3766
3s: 1860MB/sec, MB=5627
4s: 1864MB/sec, MB=7491
5s: 1860MB/sec, MB=9352
6s: 1854MB/sec, MB=11206
7s: 1874MB/sec, MB=13081
8s: 1874MB/sec, MB=14943
9s: 1840MB/sec, MB=16798
10s: 1849MB/sec, MB=18647
11s: 1863MB/sec, MB=20511
12s: 1798MB/sec, MB=22310
13s: 1897MB/sec, MB=24207
14s: 1817MB/sec, MB=26025
15s: 1893MB/sec, MB=27918
16s: 1917MB/sec, MB=29836
17s: 1863MB/sec, MB=31699
18s: 1904MB/sec, MB=33604
19s: 1894MB/sec, MB=35499
20s: 1907MB/sec, MB=37407
08:38:00 UID PID %usr %system %guest %wait %CPU CPU Command
08:38:01 0 93 0.00 0.00 0.00 0.00 0.00 4 kswapd0
08:38:02 0 93 0.00 0.00 0.00 0.00 0.00 4 kswapd0
08:38:03 0 93 0.00 0.00 0.00 0.00 0.00 4 kswapd0
08:38:04 0 93 0.00 0.00 0.00 0.00 0.00 4 kswapd0
08:38:05 0 93 0.00 0.00 0.00 0.00 0.00 4 kswapd0
08:38:06 0 93 0.00 1.00 0.00 1.00 1.00 0 kswapd0
08:38:07 0 93 0.00 0.00 0.00 0.00 0.00 0 kswapd0
08:38:08 0 93 0.00 0.00 0.00 0.00 0.00 0 kswapd0
08:38:09 0 93 0.00 1.00 0.00 0.00 1.00 1 kswapd0
08:38:10 0 93 0.00 0.00 0.00 0.00 0.00 1 kswapd0
08:38:11 0 93 0.00 0.00 0.00 0.00 0.00 1 kswapd0
08:38:12 0 93 0.00 0.00 0.00 0.00 0.00 1 kswapd0
08:38:13 0 93 0.00 0.00 0.00 0.00 0.00 1 kswapd0
08:38:14 0 93 0.00 0.00 0.00 0.00 0.00 1 kswapd0
08:38:15 0 93 0.00 3.00 0.00 0.00 3.00 0 kswapd0
08:38:16 0 93 0.00 0.00 0.00 0.00 0.00 0 kswapd0
08:38:17 0 93 0.00 0.00 0.00 0.00 0.00 0 kswapd0
08:38:18 0 93 0.00 0.00 0.00 0.00 0.00 0 kswapd0
08:38:19 0 93 0.00 0.00 0.00 0.00 0.00 0 kswapd0
08:38:20 0 93 0.00 0.00 0.00 0.00 0.00 0 kswapd0
08:38:21 0 93 0.00 0.00 0.00 0.00 0.00 0 kswapd0
08:38:22 0 93 0.00 0.00 0.00 0.00 0.00 0 kswapd0
08:38:23 0 93 0.00 3.00 0.00 0.00 3.00 4 kswapd0
08:38:24 0 93 0.00 0.00 0.00 0.00 0.00 4 kswapd0
08:38:25 0 93 0.00 0.00 0.00 0.00 0.00 4 kswapd0
08:38:26 0 93 0.00 4.00 0.00 0.00 4.00 3 kswapd0
08:38:27 0 93 0.00 0.00 0.00 0.00 0.00 3 kswapd0
08:38:28 0 93 0.00 0.00 0.00 0.00 0.00 3 kswapd0
08:38:29 0 93 0.00 0.00 0.00 0.00 0.00 3 kswapd0
08:38:30 0 93 0.00 0.00 0.00 0.00 0.00 3 kswapd0
08:38:31 0 93 0.00 0.00 0.00 0.00 0.00 3 kswapd0
08:38:32 0 93 0.00 0.00 0.00 0.00 0.00 3 kswapd0
08:38:33 0 93 0.00 0.00 0.00 0.00 0.00 3 kswapd0
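To put a number on "more stable", the per-second throughput samples from the two runs above can be summarized as below. This is only a sketch: the struct tput_stats type and the summarize() and print_summary() helpers are hypothetical names, and mean absolute deviation is just one possible spread measure that I picked because it needs no math library. The sample arrays are copied from the two 20-second runs in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Per-second read throughput in MB/s, copied from the two runs. */
const int cached_mbps[20] = {
	1856, 1907, 1830, 1745, 1829, 1903, 1878, 1763, 1845, 1915,
	1831, 1750, 1787, 1913, 1898, 1795, 1812, 1890, 1880, 1754,
};
const int uncached_mbps[20] = {
	1863, 1903, 1860, 1864, 1860, 1854, 1874, 1874, 1840, 1849,
	1863, 1798, 1897, 1817, 1893, 1917, 1863, 1904, 1894, 1907,
};

struct tput_stats {
	int min, max;
	double mean;   /* arithmetic mean, MB/s */
	double mad;    /* mean absolute deviation from the mean, MB/s */
};

struct tput_stats summarize(const int *v, size_t n)
{
	struct tput_stats s;
	double sum = 0.0, dev = 0.0;
	size_t i;

	assert(v != NULL && n > 0);
	s.min = s.max = v[0];
	for (i = 0; i < n; i++) {
		if (v[i] < s.min)
			s.min = v[i];
		if (v[i] > s.max)
			s.max = v[i];
		sum += v[i];
	}
	s.mean = sum / (double)n;
	for (i = 0; i < n; i++) {
		double d = (double)v[i] - s.mean;
		dev += d < 0 ? -d : d;
	}
	s.mad = dev / (double)n;
	return s;
}

void print_summary(const char *label, const int *v, size_t n)
{
	struct tput_stats s = summarize(v, n);

	printf("%-10s min=%d max=%d mean=%.1f mad=%.1f\n",
	       label, s.min, s.max, s.mean, s.mad);
}
```

Calling print_summary("cached", cached_mbps, 20) and print_summary("uncached", uncached_mbps, 20) prints one line per run; a smaller max-min range and a smaller mad for the uncached run is what "more stable" means here.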
[1]
https://lore.kernel.org/all/20241220154831.1086649-10-axboe@kernel.dk/T/#m58520a94b46f543d82db3711453dfc7bb594b2b0
[2]
https://pastebin.com/u8eCBzB5
Signed-off-by: Qi Han <hanqi@vivo.com>
---
fs/f2fs/file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 696131e655ed..d8da1fc2febf 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -5425,5 +5425,5 @@ const struct file_operations f2fs_file_operations = {
 	.splice_read	= f2fs_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fadvise	= f2fs_file_fadvise,
-	.fop_flags	= FOP_BUFFER_RASYNC,
+	.fop_flags	= FOP_BUFFER_RASYNC | FOP_DONTCACHE,
 };
--
2.48.1
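As background on why a one-line flag change is enough, the following is a simplified userspace model of the gate that, as far as I understand it, the VFS applies when a request carries RWF_DONTCACHE: the request is refused with -EOPNOTSUPP unless the file's operations advertise FOP_DONTCACHE. Everything prefixed with MODEL_ or model_ is hypothetical, including the bit values, which are placeholders and not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder bit values for the model only, NOT the kernel's. */
#define MODEL_FOP_BUFFER_RASYNC (1u << 0)
#define MODEL_FOP_DONTCACHE     (1u << 1)
#define MODEL_RWF_DONTCACHE     (1u << 0)

/* Stand-in for the fop_flags member of struct file_operations. */
struct model_fops {
	unsigned int fop_flags;
};

/*
 * Model of the per-request flag check: an uncached request is only
 * accepted when the filesystem has opted in through its fop_flags.
 * Returns 0 when the request may proceed, -EOPNOTSUPP otherwise.
 */
int model_check_rw_flags(const struct model_fops *fops, unsigned int rwf)
{
	assert(fops != NULL);
	if ((rwf & MODEL_RWF_DONTCACHE) &&
	    !(fops->fop_flags & MODEL_FOP_DONTCACHE))
		return -EOPNOTSUPP;
	return 0;
}
```

With fop_flags set to MODEL_FOP_BUFFER_RASYNC only (the state before this patch), an uncached request is rejected; with MODEL_FOP_DONTCACHE or-ed in, it passes. Note that the model takes no read/write argument, so the opt-in is per file_operations rather than per direction.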
On 7/25/25 15:53, Qi Han wrote:
> Jens has already completed the development of uncached buffered I/O
> in git [1], and in f2fs, uncached buffered I/O read can be enabled
> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
>
> [...]
>
> Signed-off-by: Qi Han <hanqi@vivo.com>

Reviewed-by: Chao Yu <chao@kernel.org>

Thanks,
On 7/25/25 15:53, Qi Han wrote:
> Jens has already completed the development of uncached buffered I/O
> in git [1], and in f2fs, uncached buffered I/O read can be enabled
> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.

IIUC, we may hit a locking issue when we call pwritev(..., RWF_DONTCACHE),
as Jens mentioned for the path below, right?

soft-irq
- folio_end_writeback()
- filemap_end_dropbehind_write()
- filemap_end_dropbehind()
- folio_unmap_invalidate()
- lock i_lock

Thanks,

> [...]
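For anyone wanting to check from user space whether a given file accepts an uncached write, the sketch below probes it with pwritev2(), which is the glibc call that takes per-request flags. It assumes a glibc with pwritev2(); the names probe_dontcache_write() and demo_probe_tmpfile() and the /tmp path are hypothetical. It only reports whether the flag is accepted and says nothing about the softirq locking question raised above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes pwritev2() in <sys/uio.h> on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080   /* uapi value; older headers may lack it */
#endif

/*
 * Returns 1 if a one-byte RWF_DONTCACHE write at offset 0 succeeded,
 * 0 if the kernel or filesystem rejected the flag, -1 on any other error.
 * The probe overwrites the first byte, so use it on scratch files only.
 */
int probe_dontcache_write(int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	ssize_t n;

	assert(fd >= 0);
	n = pwritev2(fd, &iov, 1, 0, RWF_DONTCACHE);
	if (n == 1)
		return 1;
	if (n < 0 && (errno == EOPNOTSUPP || errno == EINVAL || errno == ENOSYS))
		return 0;
	return -1;
}

/* Hypothetical driver: run the probe against a fresh scratch file. */
int demo_probe_tmpfile(void)
{
	char path[] = "/tmp/dontcache_write_XXXXXX";
	int fd = mkstemp(path);
	int ret;

	if (fd < 0)
		return -1;
	ret = probe_dontcache_write(fd);
	close(fd);
	unlink(path);
	return ret;
}
```

A result of 0 is the expected answer on kernels or filesystems that do not advertise the feature, so callers should treat it as "fall back to a normal write" rather than as an error.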
On 2025/7/28 15:38, Chao Yu wrote:
> On 7/25/25 15:53, Qi Han wrote:
>> Jens has already completed the development of uncached buffered I/O
>> in git [1], and in f2fs, uncached buffered I/O read can be enabled
>> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
> IIUC, we may hit a locking issue when we call pwritev(..., RWF_DONTCACHE),
> as Jens mentioned for the path below, right?
>
> soft-irq
> - folio_end_writeback()
> - filemap_end_dropbehind_write()
> - filemap_end_dropbehind()
> - folio_unmap_invalidate()
> - lock i_lock
>
> Thanks,

That's how I understand it.

>> [...]
On 7/28/25 16:03, hanqi wrote:
> On 2025/7/28 15:38, Chao Yu wrote:
>
>> On 7/25/25 15:53, Qi Han wrote:
>>> Jens has already completed the development of uncached buffered I/O
>>> in git [1], and in f2fs, uncached buffered I/O read can be enabled
>>> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
>> IIUC, we may hit a locking issue when we call pwritev(..., RWF_DONTCACHE),
>> as Jens mentioned for the path below, right?
>>
>> soft-irq
>> - folio_end_writeback()
>> - filemap_end_dropbehind_write()
>> - filemap_end_dropbehind()
>> - folio_unmap_invalidate()
>> - lock i_lock
>>
>> Thanks,
>
> That's how I understand it.

So I guess we need to wait for RWF_DONTCACHE support on the write path,
unless you can work around the write path in this patch.

Thanks,

>>> [...]
On 2025/7/28 16:07, Chao Yu wrote:
> On 7/28/25 16:03, hanqi wrote:
>> On 2025/7/28 15:38, Chao Yu wrote:
>>
>>> On 7/25/25 15:53, Qi Han wrote:
>>>> Jens has already completed the development of uncached buffered I/O
>>>> in git [1], and in f2fs, uncached buffered I/O read can be enabled
>>>> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
>>> IIUC, we may hit a locking issue when we call pwritev(..., RWF_DONTCACHE),
>>> as Jens mentioned for the path below, right?
>>>
>>> soft-irq
>>> - folio_end_writeback()
>>> - filemap_end_dropbehind_write()
>>> - filemap_end_dropbehind()
>>> - folio_unmap_invalidate()
>>> - lock i_lock
>>>
>>> Thanks,
>> That's how I understand it.
> So I guess we need to wait for RWF_DONTCACHE support on the write path, unless
> you can work around the write path in this patch.
>
> Thanks,

I think the read and write paths can be submitted separately.
Currently, an uncached buffered I/O write requires setting the
FGP_DONTCACHE flag when the filesystem allocates a folio. In
f2fs, this is done in the following path:

- write_begin
- f2fs_write_begin
- __filemap_get_folio

As I understand it, if we don't set the FGP_DONTCACHE flag here, this
issue shouldn't occur.

Thanks

>>>> I have been testing a use case locally, which aligns with Jens' test
>>>> case [2]. In the read scenario, using uncached buffered I/O results in
>>>> more stable read performance and a lower load on the background memory
>>>> reclaim thread (kswapd). So let's enable uncached buffered I/O reads on
>>>> F2FS.
On 7/28/25 2:28 AM, hanqi wrote:
> On 2025/7/28 16:07, Chao Yu wrote:
>> On 7/28/25 16:03, hanqi wrote:
>>> On 2025/7/28 15:38, Chao Yu wrote:
>>>
>>>> On 7/25/25 15:53, Qi Han wrote:
>>>>> Jens has already completed the development of uncached buffered I/O
>>>>> in git [1], and in f2fs, uncached buffered I/O read can be enabled
>>>>> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
>>>> IIUC, we may suffer lock issue when we call pwritev(.. ,RWF_DONTCACHE)?
>>>> as Jens mentioned in below path, right?
>>>>
>>>> soft-irq
>>>> - folio_end_writeback()
>>>>  - filemap_end_dropbehind_write()
>>>>   - filemap_end_dropbehind()
>>>>    - folio_unmap_invalidate()
>>>>     - lock i_lock
>>>>
>>>> Thanks,
>>> That's how I understand it.
>> So I guess we need to wait for the support RWF_DONTCACHE on write path, unless
>> you can work around for write path in this patch.
>>
>> Thanks,
>
> I think the read and write paths can be submitted separately.
> Currently, uncached buffered I/O write requires setting the
> FGP_DONTCACHE flag when the filesystem allocates a folio. In
> f2fs, this is done in the following path:
>
> - write_begin
>  - f2fs_write_begin
>   - __filemap_get_folio
> As I understand it, if we don't set the FGP_DONTCACHE flag here, this
> issue shouldn't occur.

It won't cause an issue, but it also won't work in the sense that the
intent is that if the file system doesn't support DONTCACHE, it would
get errored at submission time. Your approach would just ignore the flag
for writes, rather than return -EOPNOTSUPP as would be expected.

You could potentially make it work just on the read side by having the
f2fs write submit side check DONTCACHE on the write side and error them
out.

-- 
Jens Axboe
On 7/30/25 23:20, Jens Axboe wrote:
> On 7/28/25 2:28 AM, hanqi wrote:
>> On 2025/7/28 16:07, Chao Yu wrote:
>>> On 7/28/25 16:03, hanqi wrote:
>>>> On 2025/7/28 15:38, Chao Yu wrote:
>>>>
>>>>> On 7/25/25 15:53, Qi Han wrote:
>>>>>> Jens has already completed the development of uncached buffered I/O
>>>>>> in git [1], and in f2fs, uncached buffered I/O read can be enabled
>>>>>> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
>>>>> IIUC, we may suffer lock issue when we call pwritev(.. ,RWF_DONTCACHE)?
>>>>> as Jens mentioned in below path, right?
>>>>>
>>>>> soft-irq
>>>>> - folio_end_writeback()
>>>>>  - filemap_end_dropbehind_write()
>>>>>   - filemap_end_dropbehind()
>>>>>    - folio_unmap_invalidate()
>>>>>     - lock i_lock
>>>>>
>>>>> Thanks,
>>>> That's how I understand it.
>>> So I guess we need to wait for the support RWF_DONTCACHE on write path, unless
>>> you can work around for write path in this patch.
>>>
>>> Thanks,
>>
>> I think the read and write paths can be submitted separately.
>> Currently, uncached buffered I/O write requires setting the
>> FGP_DONTCACHE flag when the filesystem allocates a folio. In
>> f2fs, this is done in the following path:
>>
>> - write_begin
>>  - f2fs_write_begin
>>   - __filemap_get_folio
>> As I understand it, if we don't set the FGP_DONTCACHE flag here, this
>> issue shouldn't occur.
>
> It won't cause an issue, but it also won't work in the sense that the
> intent is that if the file system doesn't support DONTCACHE, it would
> get errored at submission time. Your approach would just ignore the flag
> for writes, rather than return -EOPNOTSUPP as would be expected.

Jens,

Do you mean like what we have done in kiocb_set_rw_flags()?

	if (flags & RWF_DONTCACHE) {
		/* file system must support it */
		if (!(ki->ki_filp->f_op->fop_flags & FOP_DONTCACHE))
			return -EOPNOTSUPP;
		...
	}

IIUC, it's better to have this in the original patch; let me know if I'm
missing something.

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 9b8d24097b7a..7f09cad6b6d7 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -5185,6 +5185,11 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		goto out;
 	}
 
+	if (iocb->ki_flags & IOCB_DONTCACHE) {
+		ret = -EOPNOTSUPP;
+		goto out;
+	}
+
 	if (!f2fs_is_compress_backend_ready(inode)) {
 		ret = -EOPNOTSUPP;
 		goto out;
-- 
Thanks,

>
> You could potentially make it work just on the read side by having the
> f2fs write submit side check DONTCACHE on the write side and error them
> out.
>
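The difference between silently ignoring the flag and refusing it at submission time can be sketched with a small userspace model. This is not kernel code: `struct model_fs`, `model_submit()`, and the `MODEL_*` constants are hypothetical names with made-up values, and the model only mirrors the shape of the kiocb_set_rw_flags() check and the proposed f2fs_file_write_iter() check quoted in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative values only; the real kernel constants differ. */
#define MODEL_RWF_DONTCACHE 0x1u /* per-I/O request flag from userspace */
#define MODEL_FOP_DONTCACHE 0x1u /* capability bit in fop_flags */

/* Hypothetical stand-in for the relevant state of a filesystem. */
struct model_fs {
	unsigned int fop_flags;    /* advertises MODEL_FOP_DONTCACHE or not */
	bool write_side_ready;     /* can writes honor DONTCACHE yet? */
	bool reject_unready_write; /* models the extra write-side check */
};

/*
 * Returns 0 if the I/O is accepted, with *uncached telling whether
 * drop-behind is really applied, or -EOPNOTSUPP if it is refused.
 */
static int model_submit(const struct model_fs *fs, bool is_write,
			unsigned int rwf_flags, bool *uncached)
{
	assert(fs && uncached);
	*uncached = false;

	if (!(rwf_flags & MODEL_RWF_DONTCACHE))
		return 0; /* ordinary buffered I/O */

	/* Generic gate, shaped like the kiocb_set_rw_flags() snippet. */
	if (!(fs->fop_flags & MODEL_FOP_DONTCACHE))
		return -EOPNOTSUPP;

	if (is_write && !fs->write_side_ready) {
		if (fs->reject_unready_write)
			return -EOPNOTSUPP; /* explicit refusal */
		return 0; /* accepted, but the flag is silently ignored */
	}

	*uncached = true;
	return 0;
}
```

With `reject_unready_write` false, a DONTCACHE write returns 0 while `*uncached` stays false, which is the "ignore the flag" behavior objected to in the thread. With it set to true, the same write fails with -EOPNOTSUPP while reads still succeed uncached.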
On 7/30/25 8:35 PM, Chao Yu wrote:
> On 7/30/25 23:20, Jens Axboe wrote:
>> On 7/28/25 2:28 AM, hanqi wrote:
>>> On 2025/7/28 16:07, Chao Yu wrote:
>>>> On 7/28/25 16:03, hanqi wrote:
>>>>> On 2025/7/28 15:38, Chao Yu wrote:
>>>>>
>>>>>> On 7/25/25 15:53, Qi Han wrote:
>>>>>>> Jens has already completed the development of uncached buffered I/O
>>>>>>> in git [1], and in f2fs, uncached buffered I/O read can be enabled
>>>>>>> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
>>>>>> IIUC, we may suffer lock issue when we call pwritev(.. ,RWF_DONTCACHE)?
>>>>>> as Jens mentioned in below path, right?
>>>>>>
>>>>>> soft-irq
>>>>>> - folio_end_writeback()
>>>>>>  - filemap_end_dropbehind_write()
>>>>>>   - filemap_end_dropbehind()
>>>>>>    - folio_unmap_invalidate()
>>>>>>     - lock i_lock
>>>>>>
>>>>>> Thanks,
>>>>> That's how I understand it.
>>>> So I guess we need to wait for the support RWF_DONTCACHE on write path, unless
>>>> you can work around for write path in this patch.
>>>>
>>>> Thanks,
>>>
>>> I think the read and write paths can be submitted separately.
>>> Currently, uncached buffered I/O write requires setting the
>>> FGP_DONTCACHE flag when the filesystem allocates a folio. In
>>> f2fs, this is done in the following path:
>>>
>>> - write_begin
>>>  - f2fs_write_begin
>>>   - __filemap_get_folio
>>> As I understand it, if we don't set the FGP_DONTCACHE flag here, this
>>> issue shouldn't occur.
>>
>> It won't cause an issue, but it also won't work in the sense that the
>> intent is that if the file system doesn't support DONTCACHE, it would
>> get errored at submission time. Your approach would just ignore the flag
>> for writes, rather than return -EOPNOTSUPP as would be expected.
>
> Jens,
>
> Do you mean like what we have done in kiocb_set_rw_flags()?
>
> 	if (flags & RWF_DONTCACHE) {
> 		/* file system must support it */
> 		if (!(ki->ki_filp->f_op->fop_flags & FOP_DONTCACHE))
> 			return -EOPNOTSUPP;
> 		...
> 	}
>
> IIUC, it's better to have this in original patch, let me know if I'm
> missing something.

Right, that would certainly be required to have it functional on the
read side but not yet on the write side. It still leaves a weird gap:
with other file systems (like XFS and ext4), you can rely on the fact
that if read or write support is there, the other direction is supported
too. f2fs would be the only one where the read side works, but you get
-EOPNOTSUPP on the write side.

Unless there's a rush on the read side for some reason, I think it'd be
better to hold off on setting FOP_DONTCACHE until the write side has
been completed too.

-- 
Jens Axboe
On 8/2/25 23:35, Jens Axboe wrote:
> On 7/30/25 8:35 PM, Chao Yu wrote:
>> On 7/30/25 23:20, Jens Axboe wrote:
>>> On 7/28/25 2:28 AM, hanqi wrote:
>>>> On 2025/7/28 16:07, Chao Yu wrote:
>>>>> On 7/28/25 16:03, hanqi wrote:
>>>>>> On 2025/7/28 15:38, Chao Yu wrote:
>>>>>>
>>>>>>> On 7/25/25 15:53, Qi Han wrote:
>>>>>>>> Jens has already completed the development of uncached buffered I/O
>>>>>>>> in git [1], and in f2fs, uncached buffered I/O read can be enabled
>>>>>>>> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
>>>>>>> IIUC, we may suffer lock issue when we call pwritev(.. ,RWF_DONTCACHE)?
>>>>>>> as Jens mentioned in below path, right?
>>>>>>>
>>>>>>> soft-irq
>>>>>>> - folio_end_writeback()
>>>>>>>  - filemap_end_dropbehind_write()
>>>>>>>   - filemap_end_dropbehind()
>>>>>>>    - folio_unmap_invalidate()
>>>>>>>     - lock i_lock
>>>>>>>
>>>>>>> Thanks,
>>>>>> That's how I understand it.
>>>>> So I guess we need to wait for the support RWF_DONTCACHE on write path, unless
>>>>> you can work around for write path in this patch.
>>>>>
>>>>> Thanks,
>>>>
>>>> I think the read and write paths can be submitted separately.
>>>> Currently, uncached buffered I/O write requires setting the
>>>> FGP_DONTCACHE flag when the filesystem allocates a folio. In
>>>> f2fs, this is done in the following path:
>>>>
>>>> - write_begin
>>>>  - f2fs_write_begin
>>>>   - __filemap_get_folio
>>>> As I understand it, if we don't set the FGP_DONTCACHE flag here, this
>>>> issue shouldn't occur.
>>>
>>> It won't cause an issue, but it also won't work in the sense that the
>>> intent is that if the file system doesn't support DONTCACHE, it would
>>> get errored at submission time. Your approach would just ignore the flag
>>> for writes, rather than return -EOPNOTSUPP as would be expected.
>>
>> Jens,
>>
>> Do you mean like what we have done in kiocb_set_rw_flags()?
>>
>> 	if (flags & RWF_DONTCACHE) {
>> 		/* file system must support it */
>> 		if (!(ki->ki_filp->f_op->fop_flags & FOP_DONTCACHE))
>> 			return -EOPNOTSUPP;
>> 		...
>> 	}
>>
>> IIUC, it's better to have this in original patch, let me know if I'm
>> missing something.
>
> Right, that would certainly be required to have it functional on the
> read side but not yet on the write side. Still leaves a weirder gap
> where other file systems (like XFS and ext4) you can rely on if read or
> write support is there, then the other direction is supported too. f2fs
> would be the only one where the read side works, but you get -EOPNOTSUPP
> on the write side.
>
> Unless there's a rush on the read side for some reason, I think it'd be
> better to hold off on setting FOP_DONTCACHE until the write side has been
> completed too.

Sure, let's wait for dontcache support on both the read and write sides,
unless something turns out to block the write side. Let's see. :)

Thanks,

>
On 2025/7/30 23:20, Jens Axboe wrote:
> On 7/28/25 2:28 AM, hanqi wrote:
>> On 2025/7/28 16:07, Chao Yu wrote:
>>> On 7/28/25 16:03, hanqi wrote:
>>>> On 2025/7/28 15:38, Chao Yu wrote:
>>>>
>>>>> On 7/25/25 15:53, Qi Han wrote:
>>>>>> Jens has already completed the development of uncached buffered I/O
>>>>>> in git [1], and in f2fs, uncached buffered I/O read can be enabled
>>>>>> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
>>>>> IIUC, we may suffer lock issue when we call pwritev(.. ,RWF_DONTCACHE)?
>>>>> as Jens mentioned in below path, right?
>>>>>
>>>>> soft-irq
>>>>> - folio_end_writeback()
>>>>>  - filemap_end_dropbehind_write()
>>>>>   - filemap_end_dropbehind()
>>>>>    - folio_unmap_invalidate()
>>>>>     - lock i_lock
>>>>>
>>>>> Thanks,
>>>> That's how I understand it.
>>> So I guess we need to wait for the support RWF_DONTCACHE on write path, unless
>>> you can work around for write path in this patch.
>>>
>>> Thanks,
>> I think the read and write paths can be submitted separately.
>> Currently, uncached buffered I/O write requires setting the
>> FGP_DONTCACHE flag when the filesystem allocates a folio. In
>> f2fs, this is done in the following path:
>>
>> - write_begin
>>  - f2fs_write_begin
>>   - __filemap_get_folio
>> As I understand it, if we don't set the FGP_DONTCACHE flag here, this
>> issue shouldn't occur.
> It won't cause an issue, but it also won't work in the sense that the
> intent is that if the file system doesn't support DONTCACHE, it would
> get errored at submission time. Your approach would just ignore the flag
> for writes, rather than return -EOPNOTSUPP as would be expected.
>
> You could potentially make it work just on the read side by having the
> f2fs write submit side check DONTCACHE on the write side and error them
> out.

Hi Jens,

Thank you for your suggestions. I am currently working on modifying
F2FS to handle the dontcache unmap operation in a workqueue. I expect
to submit the patch soon, after which F2FS should also support uncached
buffered I/O writes.

Thanks,

>
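The deferral idea in this message can be illustrated with a small userspace model. It assumes the problem is the one named earlier in the thread (i_lock being taken from soft-irq during write completion). Every name here (`struct model_ctx`, `model_end_writeback()`, `model_worker()`, `MODEL_QUEUE_CAP`) is hypothetical; this is a sketch of the pattern only, not of the actual F2FS patch or of the kernel workqueue API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_CAP 8

/* Hypothetical model state; none of these names are f2fs or kernel APIs. */
struct model_ctx {
	bool in_softirq;              /* are we in completion context? */
	int pending[MODEL_QUEUE_CAP]; /* folio ids waiting for drop-behind */
	size_t npending;
	int invalidated;              /* folios dropped so far */
};

/* Stands in for the step that takes i_lock in the quoted call chain. */
static void model_unmap_invalidate(struct model_ctx *c, int folio_id)
{
	(void)folio_id;
	assert(!c->in_softirq); /* locking from soft-irq is the reported issue */
	c->invalidated++;
}

/* Write completion: runs in soft-irq, so it only records the work. */
static bool model_end_writeback(struct model_ctx *c, int folio_id)
{
	assert(c->in_softirq);
	if (c->npending == MODEL_QUEUE_CAP)
		return false; /* model only: queue full, drop-behind skipped */
	c->pending[c->npending++] = folio_id;
	return true;
}

/* Worker: runs later in process context and drains the queue. */
static void model_worker(struct model_ctx *c)
{
	assert(!c->in_softirq);
	for (size_t i = 0; i < c->npending; i++)
		model_unmap_invalidate(c, c->pending[i]);
	c->npending = 0;
}
```

The completion handler never reaches the locking step itself; it only appends to the queue, and the invalidation happens once the worker runs outside soft-irq context.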
On 7/30/25 7:58 PM, hanqi wrote:
>
> On 2025/7/30 23:20, Jens Axboe wrote:
>> On 7/28/25 2:28 AM, hanqi wrote:
>>> On 2025/7/28 16:07, Chao Yu wrote:
>>>> On 7/28/25 16:03, hanqi wrote:
>>>>> On 2025/7/28 15:38, Chao Yu wrote:
>>>>>
>>>>>> On 7/25/25 15:53, Qi Han wrote:
>>>>>>> Jens has already completed the development of uncached buffered I/O
>>>>>>> in git [1], and in f2fs, uncached buffered I/O read can be enabled
>>>>>>> simply by setting the FOP_DONTCACHE flag in f2fs_file_operations.
>>>>>> IIUC, we may suffer lock issue when we call pwritev(.. ,RWF_DONTCACHE)?
>>>>>> as Jens mentioned in below path, right?
>>>>>>
>>>>>> soft-irq
>>>>>> - folio_end_writeback()
>>>>>>  - filemap_end_dropbehind_write()
>>>>>>   - filemap_end_dropbehind()
>>>>>>    - folio_unmap_invalidate()
>>>>>>     - lock i_lock
>>>>>>
>>>>>> Thanks,
>>>>> That's how I understand it.
>>>> So I guess we need to wait for the support RWF_DONTCACHE on write path, unless
>>>> you can work around for write path in this patch.
>>>>
>>>> Thanks,
>>> I think the read and write paths can be submitted separately.
>>> Currently, uncached buffered I/O write requires setting the
>>> FGP_DONTCACHE flag when the filesystem allocates a folio. In
>>> f2fs, this is done in the following path:
>>>
>>> - write_begin
>>>  - f2fs_write_begin
>>>   - __filemap_get_folio
>>> As I understand it, if we don't set the FGP_DONTCACHE flag here, this
>>> issue shouldn't occur.
>> It won't cause an issue, but it also won't work in the sense that the
>> intent is that if the file system doesn't support DONTCACHE, it would
>> get errored at submission time. Your approach would just ignore the flag
>> for writes, rather than return -EOPNOTSUPP as would be expected.
>>
>> You could potentially make it work just on the read side by having the
>> f2fs write submit side check DONTCACHE on the write side and error them
>> out.
>
> Hi Jens,
>
> Thank you for your suggestions. I am currently working on modifying
> F2FS to handle the dontcache unmap operation in a workqueue. I expect
> to submit the patch soon, after which F2FS should also support uncached
> buffered I/O writes.

Sounds good, that's the right approach. Userspace needs to be able to
rely on the fact that if RWF_DONTCACHE I/O is submitted without error,
the target does the right thing as well, barring bugs of course.

-- 
Jens Axboe