[PATCH RFC 0/5] device mapper atomic write support

John Garry posted 5 patches 1 year, 1 month ago
There is a newer version of this series
[PATCH RFC 0/5] device mapper atomic write support
Posted by John Garry 1 year, 1 month ago
This series introduces initial device mapper atomic write support.

Since we already support stacking atomic writes limits, it's quite
straightforward to support.

Only dm-linear is supported for now, but other personalities could
be supported.

Patch #1 is a proper fix, but the rest of the series is RFC - this is
because I have not fully tested and we are close to the end of this
development cycle.

Based on v6.13-rc6

John Garry (5):
  block: Ensure start sector is aligned for stacking atomic writes
  block: Change blk_stack_atomic_writes_limits() unit_min check
  dm-table: Atomic writes support
  dm: Ensure cloned bio is same length for atomic write
  dm-linear: Enable atomic writes

 block/blk-settings.c          |  9 ++++++---
 drivers/md/dm-linear.c        |  3 ++-
 drivers/md/dm-table.c         | 12 ++++++++++++
 drivers/md/dm.c               |  3 +++
 include/linux/blkdev.h        | 21 ++++++++++++---------
 include/linux/device-mapper.h |  3 +++
 6 files changed, 38 insertions(+), 13 deletions(-)

-- 
2.31.1
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by Mike Snitzer 1 year, 1 month ago
On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
> This series introduces initial device mapper atomic write support.
> 
> Since we already support stacking atomic writes limits, it's quite
> straightforward to support.
> 
> Only dm-linear is supported for now, but other personalities could
> be supported.
> 
> Patch #1 is a proper fix, but the rest of the series is RFC - this is
> because I have not fully tested and we are close to the end of this
> development cycle.

In general, looks reasonable.  But I would prefer to see atomic write
support added to dm-striped as well.  Not that I have some need, but
because it will help verify the correctness of the general stacking
code changes (in both block and DM core).  I wrote and/or fixed a fair
amount of the non-atomic block limits stacking code over the
years.. so this is just me trying to inform this effort based on
limits stacking gotchas we've experienced to this point.

Looks like adding dm-striped support would just need to ensure that
the chunk_size is a multiple of the atomic write size (so chunk_size >=
atomic write size).

As for linear, limits-stacking tests should also verify that
concatenated volumes work.

Thanks,
Mike
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by John Garry 1 year, 1 month ago
On 06/01/2025 17:26, Mike Snitzer wrote:
> On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
>> This series introduces initial device mapper atomic write support.
>>
>> Since we already support stacking atomic writes limits, it's quite
>> straightforward to support.
>>
>> Only dm-linear is supported for now, but other personalities could
>> be supported.
>>
>> Patch #1 is a proper fix, but the rest of the series is RFC - this is
>> because I have not fully tested and we are close to the end of this
>> development cycle.
> In general, looks reasonable.  But I would prefer to see atomic write
> support added to dm-striped as well.  Not that I have some need, but
> because it will help verify the correctness of the general stacking
> code changes (in both block and DM core). 

That should be fine. We already have md raid0 support working (for 
atomic writes), so I would expect much of the required support is 
already available.

> I wrote and/or fixed a fair
> amount of the non-atomic block limits stacking code over the
> years.. so this is just me trying to inform this effort based on
> limits stacking gotchas we've experienced to this point.

Yeah, understood. And that is why I am on the lookout for points at which 
we try to split atomic writes in the submission path. The only reason 
that it should happen is due to the limits being incorrectly calculated.

> 
> Looks like adding dm-striped support would just need to ensure that
> the chunk_size is a multiple of the atomic write size (so chunk_size >=
> atomic write size).

Right, so the block queue limits code already will throttle the atomic 
write max so that chunk_size % atomic write upper limit == 0.
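
To illustrate the constraint being described, here is a minimal standalone sketch. It is not the kernel's block-limits code; the helper name clamp_atomic_unit_max() is hypothetical, and it assumes sizes are counted in 512-byte sectors and that the atomic write unit max is a power of two. It halves the limit until it divides the chunk size evenly.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not kernel code): return the
 * largest power-of-two value <= unit_max_sectors that divides
 * chunk_sectors evenly. A result of 0 means "no atomic write support".
 */
static uint32_t clamp_atomic_unit_max(uint32_t unit_max_sectors,
                                      uint32_t chunk_sectors)
{
    /* Nothing to clamp if atomic writes are unsupported or unchunked. */
    if (unit_max_sectors == 0 || chunk_sectors == 0)
        return unit_max_sectors;

    /* Assumption: the incoming limit is a power of two. */
    assert((unit_max_sectors & (unit_max_sectors - 1)) == 0);

    /* Halve until the limit divides the chunk; 1 always divides. */
    while (chunk_sectors % unit_max_sectors != 0)
        unit_max_sectors /= 2;

    return unit_max_sectors;
}
```

With a 96-sector chunk and a 64-sector limit, the loop settles on 32 sectors, so a naturally aligned atomic write of that size never straddles a chunk boundary.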

> 
> Relative to linear, testing limits stacking in terms of linear should
> also verify that concatenated volumes work.

ok,

Thanks,
John
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by Mikulas Patocka 1 year, 1 month ago

On Mon, 6 Jan 2025, John Garry wrote:

> On 06/01/2025 17:26, Mike Snitzer wrote:
> > On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
> > > This series introduces initial device mapper atomic write support.
> > > 
> > > Since we already support stacking atomic writes limits, it's quite
> > > straightforward to support.
> > > 
> > > Only dm-linear is supported for now, but other personalities could
> > > be supported.
> > > 
> > > Patch #1 is a proper fix, but the rest of the series is RFC - this is
> > > because I have not fully tested and we are close to the end of this
> > > development cycle.
> > In general, looks reasonable.  But I would prefer to see atomic write
> > support added to dm-striped as well.  Not that I have some need, but
> > because it will help verify the correctness of the general stacking
> > code changes (in both block and DM core). 
> 
> That should be fine. We already have md raid0 support working (for atomic
> writes), so I would expect much of the required support is already available.

BTW. could it be possible to add dm-mirror support as well? dm-mirror is 
used when the user moves the logical volume to another physical volume, so 
it would be nice if this worked without resulting in not-supported errors.

dm-mirror uses dm-io to perform the writes on multiple mirror legs (see 
the function do_write() -> dm_io()), I looked at the code and it seems 
that the support for atomic writes in dm-mirror and dm-io would be 
straightforward.

Another possibility would be dm-snapshot support, assuming that the atomic 
i/o size <= snapshot chunk size, the support should be easy - i.e. just 
pass the flag REQ_ATOMIC through. Perhaps it could be supported for 
dm-thin as well.
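
The size condition above can be sketched as a standalone check. This is an assumption-laden illustration, not dm-snapshot code: the helper name atomic_write_fits_chunk() is hypothetical, and it tests both that the write is no larger than a chunk and that it does not cross a chunk boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): true when a write of
 * nr_sectors starting at start_sector lies entirely within a single
 * chunk of chunk_sectors, so it could be remapped as one unit.
 */
static bool atomic_write_fits_chunk(uint64_t start_sector,
                                    uint32_t nr_sectors,
                                    uint32_t chunk_sectors)
{
    assert(chunk_sectors != 0);

    /* Empty writes and writes larger than a chunk never qualify. */
    if (nr_sectors == 0 || nr_sectors > chunk_sectors)
        return false;

    /* First and last sector must fall in the same chunk index. */
    return start_sector / chunk_sectors ==
           (start_sector + nr_sectors - 1) / chunk_sectors;
}
```

An 8-sector write at sector 8 fits a 16-sector chunk, while the same write at sector 12 would span two chunks and would have to be rejected rather than split.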

Mikulas
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by John Garry 1 year ago
On 07/01/2025 17:13, Mikulas Patocka wrote:
> 
> 
> On Mon, 6 Jan 2025, John Garry wrote:
> 
>> On 06/01/2025 17:26, Mike Snitzer wrote:
>>> On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
>>>> This series introduces initial device mapper atomic write support.
>>>>
>>>> Since we already support stacking atomic writes limits, it's quite
>>>> straightforward to support.
>>>>
>>>> Only dm-linear is supported for now, but other personalities could
>>>> be supported.
>>>>
>>>> Patch #1 is a proper fix, but the rest of the series is RFC - this is
>>>> because I have not fully tested and we are close to the end of this
>>>> development cycle.
>>> In general, looks reasonable.  But I would prefer to see atomic write
>>> support added to dm-striped as well.  Not that I have some need, but
>>> because it will help verify the correctness of the general stacking
>>> code changes (in both block and DM core).
>>
>> That should be fine. We already have md raid0 support working (for atomic
>> writes), so I would expect much of the required support is already available.
> 
> BTW. could it be possible to add dm-mirror support as well? dm-mirror is
> used when the user moves the logical volume to another physical volume, so
> it would be nice if this worked without resulting in not-supported errors.
> 
> dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> the function do_write() -> dm_io()), I looked at the code and it seems
> that the support for atomic writes in dm-mirror and dm-io would be
> straightforward.

I tried this out, and it seems to work ok.

However, I need to set DM_TARGET_ATOMIC_WRITES in the 
mirror_target.features member, like:

diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index 9511dae5b556..913a92c55904 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -1485,6 +1485,7 @@ static struct target_type mirror_target = {
	.name    = "mirror",
	.version = {1, 14, 0},
	.module  = THIS_MODULE,
+	.features = DM_TARGET_ATOMIC_WRITES,
	.ctr     = mirror_ctr,
	.dtr     = mirror_dtr,
	.map     = mirror_map,


Is this the right thing to do? I ask, as none of the other DM_TARGET* 
flags are set already, which makes me suspicious.

Thanks,
John
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by Mikulas Patocka 1 year ago

On Thu, 16 Jan 2025, John Garry wrote:

> On 07/01/2025 17:13, Mikulas Patocka wrote:
> > 
> > 
> > On Mon, 6 Jan 2025, John Garry wrote:
> > 
> > BTW. could it be possible to add dm-mirror support as well? dm-mirror is
> > used when the user moves the logical volume to another physical volume, so
> > it would be nice if this worked without resulting in not-supported errors.
> > 
> > dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> > the function do_write() -> dm_io()), I looked at the code and it seems
> > that the support for atomic writes in dm-mirror and dm-io would be
> > straightforward.
> 
> I tried this out, and it seems to work ok.
> 
> However, I need to set DM_TARGET_ATOMIC_WRITES in the mirror_target.features
> member, like:
> 
> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
> index 9511dae5b556..913a92c55904 100644
> --- a/drivers/md/dm-raid1.c
> +++ b/drivers/md/dm-raid1.c
> @@ -1485,6 +1485,7 @@ static struct target_type mirror_target = {
> 	.name    = "mirror",
> 	.version = {1, 14, 0},
> 	.module  = THIS_MODULE,
> +	.features = DM_TARGET_ATOMIC_WRITES,
> 	.ctr     = mirror_ctr,
> 	.dtr     = mirror_dtr,
> 	.map     = mirror_map,
> 
> 
> Is this the right thing to do? I ask, as none of the other DM_TARGET* flags
> are set already, which makes me suspicious.
> 
> Thanks,
> John

Yes - that's right. I suggest that you verify that the atomic flag is 
really passed through the dm-raid1.c and dm-io.c stack. Add a printk that 
tests if REQ_ATOMIC is set to the function do_region in dm-io.c just 
before "submit_bio(bio)".

Alternatively, you can use blktrace to test if the REQ_ATOMIC is passed 
through correctly.

Mikulas
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by John Garry 1 year ago
On 16/01/2025 12:59, Mikulas Patocka wrote:
>>> dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
>>> the function do_write() -> dm_io()), I looked at the code and it seems
>>> that the support for atomic writes in dm-mirror and dm-io would be
>>> straightforward.
>> I tried this out, and it seems to work ok.
>>
>> However, I need to set DM_TARGET_ATOMIC_WRITES in the mirror_target.features
>> member, like:
>>
>> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
>> index 9511dae5b556..913a92c55904 100644
>> --- a/drivers/md/dm-raid1.c
>> +++ b/drivers/md/dm-raid1.c
>> @@ -1485,6 +1485,7 @@ static struct target_type mirror_target = {
>> 	.name    = "mirror",
>> 	.version = {1, 14, 0},
>> 	.module  = THIS_MODULE,
>> +	.features = DM_TARGET_ATOMIC_WRITES,
>> 	.ctr     = mirror_ctr,
>> 	.dtr     = mirror_dtr,
>> 	.map     = mirror_map,
>>
>>
>> Is this the right thing to do? I ask, as none of the other DM_TARGET* flags
>> are set already, which makes me suspicious.
>>
>> Thanks,
>> John
> Yes - that's right. I suggest that you verify that the atomic flag is
> really passed through the dm-raid1.c and dm-io.c stack. Add a printk that
> tests if REQ_ATOMIC is set to the function do_region in dm-io.c just
> before "submit_bio(bio)".
> 
> Alternatively, you can use blktrace to test if the REQ_ATOMIC is passed
> through correctly.

Yes, it is passed ok.

JFYI, I can also verify proper atomic write functionality on /dev/dmX 
with fio in verify mode.

Thanks,
John
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by Mikulas Patocka 1 year ago

On Thu, 16 Jan 2025, John Garry wrote:

> On 16/01/2025 12:59, Mikulas Patocka wrote:
> > > > dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> > > > the function do_write() -> dm_io()), I looked at the code and it seems
> > > > that the support for atomic writes in dm-mirror and dm-io would be
> > > > straightforward.
> > > I tried this out, and it seems to work ok.
> > > 
> > > However, I need to set DM_TARGET_ATOMIC_WRITES in the
> > > mirror_target.features
> > > member, like:
> > > 
> > > diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
> > > index 9511dae5b556..913a92c55904 100644
> > > --- a/drivers/md/dm-raid1.c
> > > +++ b/drivers/md/dm-raid1.c
> > > @@ -1485,6 +1485,7 @@ static struct target_type mirror_target = {
> > > 	.name    = "mirror",
> > > 	.version = {1, 14, 0},
> > > 	.module  = THIS_MODULE,
> > > +	.features = DM_TARGET_ATOMIC_WRITES,
> > > 	.ctr     = mirror_ctr,
> > > 	.dtr     = mirror_dtr,
> > > 	.map     = mirror_map,
> > > 
> > > 
> > > Is this the right thing to do? I ask, as none of the other DM_TARGET*
> > > flags
> > > are set already, which makes me suspicious.
> > > 
> > > Thanks,
> > > John
> > Yes - that's right. I suggest that you verify that the atomic flag is
> > really passed through the dm-raid1.c and dm-io.c stack. Add a printk that
> > tests if REQ_ATOMIC is set to the function do_region in dm-io.c just
> > before "submit_bio(bio)".
> > 
> > Alternatively, you can use blktrace to test if the REQ_ATOMIC is passed
> > through correctly.
> 
> Yes, it is passed ok.
> 
> JFYI, I can also verify proper atomic write functionality on /dev/dmX with fio
> in verify mode.
> 
> Thanks,
> John

Yes - so please send version 2 of the patches and I will stage them for 
this merge window.

Mikulas
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by John Garry 1 year ago
On 16/01/2025 14:58, Mikulas Patocka wrote:
> Yes - so please send version 2 of the patches and I will stage them for
> this merge window.

I'll send a v2 today, however I made some block changes based on the 
feedback from Mike on the dm-table changes in v1.

So prob quite late for this cycle, considering that it touches many 
trees ...

Cheers,
John
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by John Garry 1 year, 1 month ago
On 07/01/2025 17:13, Mikulas Patocka wrote:
> On Mon, 6 Jan 2025, John Garry wrote:
> 
>> On 06/01/2025 17:26, Mike Snitzer wrote:
>>> On Mon, Jan 06, 2025 at 12:41:14PM +0000, John Garry wrote:
>>>> This series introduces initial device mapper atomic write support.
>>>>
>>>> Since we already support stacking atomic writes limits, it's quite
>>>> straightforward to support.
>>>>
>>>> Only dm-linear is supported for now, but other personalities could
>>>> be supported.
>>>>
>>>> Patch #1 is a proper fix, but the rest of the series is RFC - this is
>>>> because I have not fully tested and we are close to the end of this
>>>> development cycle.
>>> In general, looks reasonable.  But I would prefer to see atomic write
>>> support added to dm-striped as well.  Not that I have some need, but
>>> because it will help verify the correctness of the general stacking
>>> code changes (in both block and DM core).
>> That should be fine. We already have md raid0 support working (for atomic
>> writes), so I would expect much of the required support is already available.
> BTW. could it be possible to add dm-mirror support as well? dm-mirror is
> used when the user moves the logical volume to another physical volume, so
> it would be nice if this worked without resulting in not-supported errors.
> 
> dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> the function do_write() -> dm_io()), I looked at the code and it seems
> that the support for atomic writes in dm-mirror and dm-io would be
> straightforward.

FWIW, we do support atomic writes for md raid1. The key principle is 
that we atomically write to each disk. Obviously we cannot write to 
multiple disks atomically. So the copies in each mirror may be 
out-of-sync after an unexpected power failure, but that is ok as each 
copy will have either all old or all new data, which is what we guarantee.

> 
> Another possibility would be dm-snapshot support, assuming that the atomic
> i/o size <= snapshot chunk size, the support should be easy - i.e. just
> pass the flag REQ_ATOMIC through. Perhaps it could be supported for
> dm-thin as well.

Do you think that there will be users for these?

Atomic writes provide guarantees for users, and it would be hard to 
detect when these guarantees become broken through software bugs. I 
would just be concerned that we enable atomic writes for many of these 
more complicated personalities, and they are not actively used and break.

Thanks,
John
Re: [PATCH RFC 0/5] device mapper atomic write support
Posted by Mikulas Patocka 1 year, 1 month ago

On Tue, 7 Jan 2025, John Garry wrote:

> On 07/01/2025 17:13, Mikulas Patocka wrote:
> > On Mon, 6 Jan 2025, John Garry wrote:
> > 
> > BTW. could it be possible to add dm-mirror support as well? dm-mirror is
> > used when the user moves the logical volume to another physical volume, so
> > it would be nice if this worked without resulting in not-supported errors.
> > 
> > dm-mirror uses dm-io to perform the writes on multiple mirror legs (see
> > the function do_write() -> dm_io()), I looked at the code and it seems
> > that the support for atomic writes in dm-mirror and dm-io would be
> > straightforward.
> 
> FWIW, we do support atomic writes for md raid1. The key principle is that we
> atomically write to each disk. Obviously we cannot write to multiple disks
> atomically. So the copies in each mirror may be out-of-sync after an
> unexpected power fail, but that is ok as either will have all of old or new
> data, which is what we guarantee.

Yes - something like that can be implemented for dm-mirror too.

> > Another possibility would be dm-snapshot support, assuming that the atomic
> > i/o size <= snapshot chunk size, the support should be easy - i.e. just
> > pass the flag REQ_ATOMIC through. Perhaps it could be supported for
> > dm-thin as well.
> 
> Do you think that there will be users for these?
> 
> atomic writes provide guarantees for users, and it would be hard to detect
> when these guarantees become broken through software bugs. I would be just
> concerned that we enable atomic writes for many of these more complicated
> personalities, and they are not actively used and break.
> 
> Thanks,
> John

dm-snapshot is not much used, but dm-thin is. I added Joe to the 
recipients list, so that he can decide whether dm-thin should support 
atomic writes or not.

Mikulas