From: Jeuk Kim <jeuk20.kim@samsung.com>
Inline encryption derives the DUN from <inode, file offset>, so bios
from different inodes can't merge. With multi-threaded buffered O_SYNC
writes where each thread writes to its own file, per-page (4KiB) LBA
allocation interleaves across inodes and causes bio splits. Serialize
writeback for fscrypt inline-crypto inodes via __should_serialize_io()
to keep foreground writeback focused on one inode at a time and avoid
the splits.
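
For illustration, a minimal sketch of the merge constraint described
above (not the actual fscrypt code; the kernel's check lives in
fscrypt_mergeable_bio() / bio_crypt_dun_is_contiguous()):

	/*
	 * A page can join an existing bio only if its DUN continues the
	 * bio's DUN sequence.  With the DUN derived from <inode, file
	 * offset>, pages of two different inodes never have contiguous
	 * DUNs, so their bios can never merge.
	 */
	static bool dun_contiguous(u64 bio_first_dun, unsigned int bio_blocks,
				   u64 next_page_dun)
	{
		return next_page_dun == bio_first_dun + bio_blocks;
	}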
Test: fio --name=wb_osync --rw=write --bs=1M \
--time_based=1 --runtime=60s --size=2G \
--ioengine=psync --direct=0 --sync=1 \
--numjobs=8 --thread=1 --nrfiles=1 \
--filename_format='wb_osync.$jobnum'
device: UFS
Before -
write throughput: 675MiB/s
device I/O size distribution (by count, total 1027414):
4 KiB: 923139 (89.9%)
8 KiB: 84798 (8.3%)
≥512 KiB: 453 (0.0%)
After -
write throughput: 1760MiB/s
device I/O size distribution (by count, total 231750):
4 KiB: 16904 (7.3%)
8 KiB: 72128 (31.1%)
≥512 KiB: 118900 (51.3%)
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
---
fs/f2fs/data.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index ef38e62cda8f..ae6fb435d576 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3217,6 +3217,8 @@ static inline bool __should_serialize_io(struct inode *inode,
 
 	if (f2fs_need_compress_data(inode))
 		return true;
+	if (fscrypt_inode_uses_inline_crypto(inode))
+		return true;
 	if (wbc->sync_mode != WB_SYNC_ALL)
 		return true;
 	if (get_dirty_pages(inode) >= SM_I(F2FS_I_SB(inode))->min_seq_blocks)
--
2.43.0
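
For context on the hook being extended: when __should_serialize_io()
returns true, f2fs_write_data_pages() takes the per-superblock
writepages mutex, so only one inode's pages are written back at a time
and its blocks get contiguous LBAs. A simplified excerpt of the caller
(trimmed from fs/f2fs/data.c; surrounding logic omitted):

	bool locked = false;

	if (__should_serialize_io(inode, wbc)) {
		mutex_lock(&sbi->writepages);
		locked = true;
	}

	blk_start_plug(&plug);
	ret = f2fs_write_cache_pages(mapping, wbc, io_type);
	blk_finish_plug(&plug);

	if (locked)
		mutex_unlock(&sbi->writepages);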
On 10/16/2025 1:16 PM, Jeuk Kim wrote:
> From: Jeuk Kim <jeuk20.kim@samsung.com>
>
> Inline encryption derives the DUN from <inode, file offset>, so bios
> from different inodes can't merge. With multi-threaded buffered O_SYNC
> writes where each thread writes to its own file, per-page (4KiB) LBA
> allocation interleaves across inodes and causes bio splits. Serialize
> writeback for fscrypt inline-crypto inodes via __should_serialize_io()
> to keep foreground writeback focused on one inode at a time and avoid
> the splits.
>
[snip]
>
> @@ -3217,6 +3217,8 @@ static inline bool __should_serialize_io(struct inode *inode,
>
>  	if (f2fs_need_compress_data(inode))
>  		return true;
> +	if (fscrypt_inode_uses_inline_crypto(inode))
> +		return true;
>  	if (wbc->sync_mode != WB_SYNC_ALL)
>  		return true;
>  	if (get_dirty_pages(inode) >= SM_I(F2FS_I_SB(inode))->min_seq_blocks)

Jeuk,

Can you please try tuning /sys/fs/f2fs/<dev>/min_seq_blocks to see
whether it can achieve the goal?

Thanks,
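For reference, min_seq_blocks is counted in 4KiB blocks, and its
default corresponds to one segment (512 blocks, i.e. the 2MB mentioned
later in the thread); an inode whose dirty-page count reaches the
threshold takes the serialized path. Lowering it as suggested would
look like this (a sketch; the exact sysfs path depends on the device
name):

	# 128 blocks * 4KiB = 512KB threshold
	echo 128 > /sys/fs/f2fs/<dev>/min_seq_blocks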
On 10/16/2025 7:12 PM, Chao Yu wrote:
> On 10/16/2025 1:16 PM, Jeuk Kim wrote:
[snip]
>
> Jeuk,
>
> Can you please try tuning /sys/fs/f2fs/<dev>/min_seq_blocks to see
> whether it can achieve the goal?
>
> Thanks,

Hi Chao,

Thanks a lot for the suggestion.
I tried tuning `/sys/fs/f2fs/<dev>/min_seq_blocks` as you suggested,
and it achieved a similar performance improvement on my setup.

Your approach looks cleaner than the one I proposed.

From what I see, even after reducing this value from the default (2MB)
to 512KB on my local system, there doesn't seem to be any noticeable
performance drop or other side effects.
Do you see any possible downsides to lowering this value that I might
have missed?

Thanks again for your help.
On 10/21/25 11:33, Jeuk Kim wrote:
> On 10/16/2025 7:12 PM, Chao Yu wrote:
[snip]
> Hi Chao,
>
> Thanks a lot for the suggestion.
> I tried tuning `/sys/fs/f2fs/<dev>/min_seq_blocks` as you suggested,
> and it achieved a similar performance improvement on my setup.
>
> From what I see, even after reducing this value from the default (2MB)
> to 512KB on my local system, there doesn't seem to be any noticeable
> performance drop or other side effects.
> Do you see any possible downsides to lowering this value that I might
> have missed?

Hi Jeuk,

We're using sbi->writepages to serialize large IOs. Once the default is
tuned from 2MB down to 512KB, threads in Android that issue [512K, 2M)
sized IOs will start racing to grab the .writepages lock; I guess that
will cause a potential performance regression, right?

Thanks,
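To illustrate the concern with the thread's numbers: under the default
2MB threshold, a thread with, say, 1MB of dirty pages on its inode
skips the sbi->writepages mutex and writes back in parallel; with the
threshold lowered to 512KB, that same thread now takes the mutex, so
several threads issuing [512K, 2M) writes would serialize behind one
another even though their IOs were already reasonably large.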
On 10/21/2025 3:51 PM, Chao Yu wrote:
> On 10/21/25 11:33, Jeuk Kim wrote:
[snip]
> Hi Jeuk,
>
> We're using sbi->writepages to serialize large IOs. Once the default is
> tuned from 2MB down to 512KB, threads in Android that issue [512K, 2M)
> sized IOs will start racing to grab the .writepages lock; I guess that
> will cause a potential performance regression, right?

That's right, that could happen.
I'll run some tests to check that, including a few other cases that
might be affected, and I'll share the results here if I find anything
noticeable.

Thanks for your help!