[RESEND PATCH 0/4] Implement File-Based optimization functionality

Jiaming Li posted 4 patches 3 years, 5 months ago
[RESEND PATCH 0/4] Implement File-Based optimization functionality
Posted by Jiaming Li 3 years, 5 months ago
From: lijiaming3 <lijiaming3@xiaomi.com>

Storage devices have a long lifespan. A device's performance is not
constant over that lifespan and may deteriorate over time. To remedy
this, JEDEC came up with the UFS File-Based Optimization (FBO)
extension (JC-64.1-22-67). The FBO feature mitigates this performance
regression via physical defragmentation of the LBA ranges that are
associated with specific files.

This feature expects the following host-device dialog:
1) The host lets the device know of the LBA range(s) of interest. Those
   ranges are typically associated with a specific file. One can
   obtain them from the inode of the file and some offset calculations.
2) The host asks the device for the current physical fragmentation
   level of this file.
3) If required, the host instructs the device to perform
   defragmentation.
4) Upon successful completion of the defragmentation phase, the host
   may ask for the new fragmentation level of the file.
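
As a rough illustration only, the four steps above could be driven from
the host along these lines. Everything in this sketch is hypothetical:
struct fbo_dev_ops, fbo_optimize(), the threshold parameter and the toy
fake_* device are invented here and are not taken from the patches or
from the JEDEC text. It also assumes that a higher fragmentation level
means "more fragmented".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and names; none of them come from the patch series. */
struct lba_range {
	uint64_t open_lba;
	uint64_t close_lba;
};

struct fbo_dev_ops {
	int (*send_ranges)(void *dev, const struct lba_range *r, size_t n); /* step 1 */
	int (*query_level)(void *dev, unsigned int *level);            /* steps 2 and 4 */
	int (*defrag)(void *dev);                                      /* step 3 */
};

/*
 * Run the dialog once. Returns 0 on success with *before and *after
 * filled in, or the first non-zero error reported by the device.
 * Defragmentation is only requested when *before >= threshold
 * (assumption: a higher level means more fragmentation).
 */
int fbo_optimize(const struct fbo_dev_ops *ops, void *dev,
		 const struct lba_range *r, size_t n,
		 unsigned int threshold,
		 unsigned int *before, unsigned int *after)
{
	int ret;

	assert(ops && r && n > 0 && before && after);

	ret = ops->send_ranges(dev, r, n);		/* 1) ranges of interest */
	if (ret)
		return ret;

	ret = ops->query_level(dev, before);		/* 2) current level */
	if (ret)
		return ret;

	*after = *before;
	if (*before < threshold)			/* 3) only if required */
		return 0;

	ret = ops->defrag(dev);
	if (ret)
		return ret;

	return ops->query_level(dev, after);		/* 4) new level */
}

/* Toy in-memory device, used only to exercise the flow above. */
struct fake_dev {
	unsigned int level;
	int defrag_calls;
};

int fake_send(void *dev, const struct lba_range *r, size_t n)
{
	(void)dev; (void)r; (void)n;
	return 0;
}

int fake_query(void *dev, unsigned int *level)
{
	*level = ((struct fake_dev *)dev)->level;
	return 0;
}

int fake_defrag(void *dev)
{
	struct fake_dev *d = dev;

	d->level = 0;
	d->defrag_calls++;
	return 0;
}

const struct fbo_dev_ops fake_ops = {
	.send_ranges = fake_send,
	.query_level = fake_query,
	.defrag = fake_defrag,
};
```

Note that nothing in this flow ties the ranges sent in step 1 to the
file's layout at the time of step 3; the sketch simply assumes they are
still valid.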

lijiaming3 (4):
  scsi:ufs:remove sanity check
  scsi:ufs:add File-Based Optimization descriptor
  scsi:ufs:add FBO module
  scsi:ufs:add fbo functionality

 Documentation/ABI/testing/sysfs-driver-ufs | 129 +++++
 drivers/ufs/core/Kconfig                   |  12 +
 drivers/ufs/core/Makefile                  |   1 +
 drivers/ufs/core/ufs-sysfs.c               |  26 ++
 drivers/ufs/core/ufsfbo.c                  | 519 +++++++++++++++++++++
 drivers/ufs/core/ufsfbo.h                  |  23 +
 drivers/ufs/core/ufshcd.c                  |  15 +-
 include/ufs/ufs.h                          |  16 +
 include/ufs/ufshcd.h                       |   1 +
 9 files changed, 734 insertions(+), 8 deletions(-)
 create mode 100644 drivers/ufs/core/ufsfbo.c
 create mode 100644 drivers/ufs/core/ufsfbo.h

-- 
2.38.1
Re: [RESEND PATCH 0/4] Implement File-Based optimization functionality
Posted by Bart Van Assche 3 years, 5 months ago
On 11/1/22 22:30, Jiaming Li wrote:
> From: lijiaming3 <lijiaming3@xiaomi.com>

Hi Jiaming,

...@...corp-partner.google.com email addresses must NOT be used for 
communication on open source mailing lists. Please use your Xiaomi.com 
e-mail address for communication on open source mailing lists.

Bart.
Re: [RESEND PATCH 0/4] Implement File-Based optimization functionality
Posted by Christoph Hellwig 3 years, 5 months ago
On Wed, Nov 02, 2022 at 01:30:54PM +0800, Jiaming Li wrote:
> 1) The host let the device know of lba range(s) of interest. Those
>    ranges are typically associated with a specific file. One can
>    obtain it from the iNode of the file and some offset calculations.

This is complete and utter madness.  Files are a logical concept that
is non-unique (reflinks, snapshots) and can change at any time
(defragmentation, GC, dedup).  Whoever came up with this scheme is on
crack, and it has no business being in the Linux kernel.

NAK.
Re: [RESEND PATCH 0/4] Implement File-Based optimization functionality
Posted by Juhyung Park 3 years, 5 months ago
On 11/2/22 17:47, Christoph Hellwig wrote:
> On Wed, Nov 02, 2022 at 01:30:54PM +0800, Jiaming Li wrote:
>> 1) The host let the device know of lba range(s) of interest. Those
>>     ranges are typically associated with a specific file. One can
>>     obtain it from the iNode of the file and some offset calculations.
> 
> This is completely and utter madness.  Files are a logic concept, that
> is non-unique (reflinks, snapshot) and can change at any time
> (defragmentation, GC, dedup).  Whoever came up with this scheme is on
> crack and the it has no business being in the Linux kernel
> 
> NAK.
> 
> 

Is the idea really utter madness? The majority of regular files that
may be of interest from the perspective of UFS aren't reflinked or
snapshotted (let alone the lack of support from ext4 or f2fs).

Device-side fragmentation is a real issue [1] and it makes more than
enough sense to defrag LBAs of interest to improve performance. This
was long overdue, unless the block interface itself changes somehow.

The question is how to implement it correctly without creating a mess
with mismatched/outdated LBAs as you've mentioned, preferably through
file-system integration: if the LBAs in question are indeed
reflinked, how do we handle it? If the LBAs are moved/invalidated by
defrag or GC, how do we make sure that UFS is up-to-date? Etc.

 >
 > From: lijiaming3 <lijiaming3@xiaomi.com>
 >
 > add fbo analysis and defrag function
 >
 > We can send LBA info to the device as a comma separated string. Each
 > adjacent pair represents a range:<open-lba>,<close-lba>.
 > e.g. The LBA range of the file is 0x1234,0x3456;0x4567,0x5678
 > 	echo 0x1234,0x3456,0x4567,0x5678 > fbo_send_lba
 >

Like, ew. Why would we ever want *the userspace* to be able to 
manipulate this directly?
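
To make concrete what such an interface would have to accept, here is
a minimal sketch of parsing the comma-separated form used in the echo
example quoted above. It is not the code from the patch: the function
name parse_lba_ranges(), the struct and the error codes are assumptions
made for this illustration, and it only checks syntax (an even number of
values, open <= close). Whether the ranges still belong to the file is
not something a parser can check.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for one <open-lba>,<close-lba> pair. */
struct lba_range {
	uint64_t open;
	uint64_t close;
};

/*
 * Parse "a,b,c,d,..." (optionally newline-terminated) into the ranges
 * {a,b}, {c,d}, ...  Values may be decimal, octal or 0x-prefixed hex.
 * Returns the number of ranges stored in out[], or a negative errno
 * value on malformed input.
 */
int parse_lba_ranges(const char *s, struct lba_range *out, size_t max)
{
	size_t nvals = 0;

	for (;;) {
		char *end;
		unsigned long long v;

		/* Rejects empty fields, signs and leading whitespace. */
		if (!isdigit((unsigned char)*s))
			return -EINVAL;

		errno = 0;
		v = strtoull(s, &end, 0);
		if (errno == ERANGE)
			return -ERANGE;

		if (nvals / 2 >= max)
			return -E2BIG;

		if (nvals % 2 == 0) {
			out[nvals / 2].open = v;
		} else {
			if (v < out[nvals / 2].open)
				return -EINVAL;	/* close before open */
			out[nvals / 2].close = v;
		}
		nvals++;

		s = end;
		if (*s != ',')
			break;
		s++;
	}

	if (*s == '\n')		/* echo appends a newline */
		s++;
	if (*s != '\0' || nvals % 2 != 0)
		return -EINVAL;	/* trailing junk or unpaired value */

	return (int)(nvals / 2);
}
```

With the string from the example, this yields two ranges,
0x1234..0x3456 and 0x4567..0x5678.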

[1] 
https://www.usenix.org/conference/atc17/technical-sessions/presentation/hahn 
- Section 3.3: "For example, even if a file was not fragmented at all in 
the logical space (DoFL=1), if the file had a DoFP value of 0.5, the I/O 
throughput became only 48% of that with DoFP=0."
Re: [RESEND PATCH 0/4] Implement File-Based optimization functionality
Posted by Christoph Hellwig 3 years, 5 months ago
On Thu, Nov 03, 2022 at 03:11:16PM +0900, Juhyung Park wrote:
> Is the idea really an utter madness?

Yes.

> Majority of regular files that may be
> of interest from the perspective of UFS aren't reflinked or snapshotted (let
> alone the lack of support from ext4 or f2fs).

Linux does not require you in any way to use only obsolete file system
designs on any given block device.

> Device-side fragmentation is a real issue [1] and it makes more than enough
> sense to defrag LBAs of interests to improve performance. This was long
> overdue, unless the block interface itself changes somehow.

Or maybe random writes to flash aren't a good idea if your FTL sucks?
Full blown FTLs tend to not do any extent based mappings, so
fragmentation does not matter.  The price paid for that is much larger
FTL tables.  If you stop pretending flash is random writable through
saner interfaces like ZNS you automatically solve this fragmentation
problem as well.

> The question is how to implement it correctly without creating a mess with
> mismatched/outdated LBAs as you've mentioned, preferably through
> file-system's integration: If the LBAs in questions are indeed reflinked,
> how do we handle it?, If the LBAs are moved/invalidated from defrag or GC,
> how do we make sure that UFS is up-to-date?, etc.

The fix is to plug the leaking abstractions in UFS.  If it wants to look
like a random writable block device it better perform when doing that.
And if it doesn't want to pay the price for that it'd better expose
an abstraction that actually fits the underlying media.  It's not like
some of us haven't worked on that for the last decade.
Re: [RESEND PATCH 0/4] Implement File-Based optimization functionality
Posted by Matias Bjørling 3 years, 5 months ago
On 03/11/2022 07.11, Juhyung Park wrote:
...
> 
> Is the idea really an utter madness? Majority of regular files that may 
> be of interest from the perspective of UFS aren't reflinked or 
> snapshotted (let alone the lack of support from ext4 or f2fs).
> 
> Device-side fragmentation is a real issue [1] and it makes more than 
> enough sense to defrag LBAs of interests to improve performance. This 
> was long overdue, unless the block interface itself changes somehow.

There is ongoing work with UFS to extend the block interface with 
zones. This approach eliminates the mismatch between the device-side 
mapping and host-side mapping and lets the host and device collaborate 
on the data placement.

> 
> The question is how to implement it correctly without creating a mess 
> with mismatched/outdated LBAs as you've mentioned, preferably through 
> file-system's integration: If the LBAs in questions are indeed 
> reflinked, how do we handle it?, If the LBAs are moved/invalidated from 
> defrag or GC, how do we make sure that UFS is up-to-date?, etc.

If using zoned UFS, the file-system can use zones for LBA tracking, 
eliminating the mismatched/outdated LBA issue. f2fs already supports 
this approach (works today with SMR HDDs and ZNS SSDs). It'll extend to 
UFS when zone support is added/implemented.
Re: [RESEND PATCH 0/4] Implement File-Based optimization functionality
Posted by Juhyung Park 3 years, 5 months ago
On Fri, Nov 4, 2022 at 9:37 PM Matias Bjørling <m@bjorling.me> wrote:
>
> On 03/11/2022 07.11, Juhyung Park wrote:
> ...
> >
> > Is the idea really an utter madness? Majority of regular files that may
> > be of interest from the perspective of UFS aren't reflinked or
> > snapshotted (let alone the lack of support from ext4 or f2fs).
> >
> > Device-side fragmentation is a real issue [1] and it makes more than
> > enough sense to defrag LBAs of interests to improve performance. This
> > was long overdue, unless the block interface itself changes somehow.
>
> There are ongoing work with UFS to extend the block interface with
> zones. This approach eliminates the mismatch between the device-side
> mapping and host-side mapping and lets the host and device collaborate
> on the data placement.
>
> >
> > The question is how to implement it correctly without creating a mess
> > with mismatched/outdated LBAs as you've mentioned, preferably through
> > file-system's integration: If the LBAs in questions are indeed
> > reflinked, how do we handle it?, If the LBAs are moved/invalidated from
> > defrag or GC, how do we make sure that UFS is up-to-date?, etc.
>
> If using zoned UFS, the file-system can use zones for LBA tracking,
> eliminating the mismatched/outdated LBA issue. f2fs already supports
> this approach (works today with SMR HDDs and ZNS SSDs). It'll extend to
> UFS when zone support is added/implemented.
>

More reasons to have this functionality integrated with the
file-system instead of allowing users to specify random LBA ranges.