drivers/md/dm-ioctl.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)
do_resume when loading a new map first calls dm_suspend, which could
silently fail. When we proceeded to dm_swap_table, we would bail out
with EINVAL. Instead, restore new_map and return ERESTARTSYS when
signaled.
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
---
drivers/md/dm-ioctl.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
RFC as I am rather unfamiliar with the locking semantics here - whether
we do need to re-grab hash_lock to write to an hc we previously grabbed,
and whether the device becoming unhashed while we're in this function is
really something that needs to be checked. However, this patch does fix
the issue we were seeing - we'd get EINVAL when thread in ioctl was
signaled.
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index c2c07bfa6471..b81650c6d096 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1181,8 +1181,22 @@ static int do_resume(struct dm_ioctl *param)
suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
if (param->flags & DM_NOFLUSH_FLAG)
suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
- if (!dm_suspended_md(md))
- dm_suspend(md, suspend_flags);
+ if (!dm_suspended_md(md)) {
+ r = dm_suspend(md, suspend_flags);
+ if (r == -EINTR)
+ r = -ERESTARTSYS;
+ if (r) {
+ down_write(&_hash_lock);
+ hc = dm_get_mdptr(md);
+ if (!hc)
+ r = -ENXIO;
+ else
+ hc->new_map = new_map;
+ up_write(&_hash_lock);
+ dm_put(md);
+ return r;
+ }
+ }
old_size = dm_get_size(md);
old_map = dm_swap_table(md, new_map);
--
2.45.2.993.g49e7a77208-goog
Hi
I am wondering why does do_resume need to call dm_suspend at all. Does
anyone here remember why is this code path needed?
Mikulas
On Wed, 17 Jul 2024, Khazhismel Kumykov wrote:
> do_resume when loading a new map first calls dm_suspend, which could
> silently fail. When we proceeded to dm_swap_table, we would bail out
> with EINVAL. Instead, restore new_map and return ERESTARTSYS when
> signaled.
>
> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> ---
> drivers/md/dm-ioctl.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
>
> RFC as I am rather unfamiliar with the locking semantics here - whether
> we do need to re-grab hash_lock to write to an hc we previously grabbed,
> and whether the device becoming unhashed while we're in this function is
> really something that needs to be checked. However, this patch does fix
> the issue we were seeing - we'd get EINVAL when thread in ioctl was
> signaled.
>
>
> diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
> index c2c07bfa6471..b81650c6d096 100644
> --- a/drivers/md/dm-ioctl.c
> +++ b/drivers/md/dm-ioctl.c
> @@ -1181,8 +1181,22 @@ static int do_resume(struct dm_ioctl *param)
> suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
> if (param->flags & DM_NOFLUSH_FLAG)
> suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
> - if (!dm_suspended_md(md))
> - dm_suspend(md, suspend_flags);
> + if (!dm_suspended_md(md)) {
> + r = dm_suspend(md, suspend_flags);
> + if (r == -EINTR)
> + r = -ERESTARTSYS;
> + if (r) {
> + down_write(&_hash_lock);
> + hc = dm_get_mdptr(md);
> + if (!hc)
> + r = -ENXIO;
> + else
> + hc->new_map = new_map;
> + up_write(&_hash_lock);
> + dm_put(md);
> + return r;
> + }
> + }
>
> old_size = dm_get_size(md);
> old_map = dm_swap_table(md, new_map);
> --
> 2.45.2.993.g49e7a77208-goog
>
On Wed, Jul 17, 2024 at 12:45 PM Mikulas Patocka <mpatocka@redhat.com> wrote:
>
> Hi
>
> I am wondering why does do_resume need to call dm_suspend at all. Does
> anyone here remember why is this code path needed?
In our case, we have a sequence with load_table followed by a resume,
with no suspend first. The resume path suspends if needed, swaps
tables, then resumes. Removing the suspend here would break existing
userspace, I'd imagine. It seems like minimizing the suspended time
would also be a nice benefit.
>
> Mikulas
>
>
>
> On Wed, 17 Jul 2024, Khazhismel Kumykov wrote:
>
> > do_resume when loading a new map first calls dm_suspend, which could
> > silently fail. When we proceeded to dm_swap_table, we would bail out
> > with EINVAL. Instead, restore new_map and return ERESTARTSYS when
> > signaled.
> >
> > Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> > ---
> > drivers/md/dm-ioctl.c | 18 ++++++++++++++++--
> > 1 file changed, 16 insertions(+), 2 deletions(-)
> >
> >
> > RFC as I am rather unfamiliar with the locking semantics here - whether
> > we do need to re-grab hash_lock to write to an hc we previously grabbed,
> > and whether the device becoming unhashed while we're in this function is
> > really something that needs to be checked. However, this patch does fix
> > the issue we were seeing - we'd get EINVAL when thread in ioctl was
> > signaled.
> >
> >
> > diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
> > index c2c07bfa6471..b81650c6d096 100644
> > --- a/drivers/md/dm-ioctl.c
> > +++ b/drivers/md/dm-ioctl.c
> > @@ -1181,8 +1181,22 @@ static int do_resume(struct dm_ioctl *param)
> > suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
> > if (param->flags & DM_NOFLUSH_FLAG)
> > suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
> > - if (!dm_suspended_md(md))
> > - dm_suspend(md, suspend_flags);
> > + if (!dm_suspended_md(md)) {
> > + r = dm_suspend(md, suspend_flags);
> > + if (r == -EINTR)
> > + r = -ERESTARTSYS;
> > + if (r) {
> > + down_write(&_hash_lock);
> > + hc = dm_get_mdptr(md);
> > + if (!hc)
> > + r = -ENXIO;
> > + else
> > + hc->new_map = new_map;
Oh - I probably want to check if hc->new_map has become non-null in
the meantime and if so... pick a winner then put the loser? Presumably
the newest map should win if that happens / is possible. although the
concept seems suspect to me.
> > + up_write(&_hash_lock);
> > + dm_put(md);
> > + return r;
> > + }
> > + }
> >
> > old_size = dm_get_size(md);
> > old_map = dm_swap_table(md, new_map);
> > --
> > 2.45.2.993.g49e7a77208-goog
> >
>
Dne 17. 07. 24 v 21:52 Khazhy Kumykov napsal(a): > On Wed, Jul 17, 2024 at 12:45 PM Mikulas Patocka <mpatocka@redhat.com> wrote: >> Hi >> >> I am wondering why does do_resume need to call dm_suspend at all. Does >> anyone here remember why is this code path needed? > In our case, we have a sequence with load_table followed by a resume, > with no suspend first. The resume path suspends if needed, swaps > tables, then resumes. Removing the suspend here would break existing > userspace, I'd imagine. It seems like minimizing the suspended time > would also be a nice benefit. lvm2 maintainer POV Automatic 'suspend' for resume is a kernel 'feature' that should not be normally used from the userspace. Userspace is supposed to call 'suspend' - handle error cases - eventually drop preloaded table and resume existing table that should work. If userspace is using ONLY 'resume' without calling suspend upfront - there are some unsolvable error cases. So no - 'minimizing' suspend time is NOT the main reason here. The only valid reason to use it is basically if you are admin and you need to reload table for a device you are running from - in this case calling 'dmsetup suspend' might leave your system in 'blocked' state since your rootfs will be 'frozen/suspend' and you would have no chance to call 'dmsetup resume'. lvm2 app is locking itself in the RAM in this critical section so it can proceed with regular sequence: 'write metadata - preload DM - suspend DM - commit metadata - resume DM' which basicall all userland apps should be using. Regards Zdenek
do_resume when loading a new map first calls dm_suspend, which could
silently fail. When we proceeded to dm_swap_table, we would bail out
with EINVAL. Instead, attempt to restore new_map and return ERESTARTSYS
when signaled.
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
---
drivers/md/dm-ioctl.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
v2: don't leak new_map if we can't assign it back to hc.
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index c2c07bfa6471..0591455ad63c 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1181,8 +1181,27 @@ static int do_resume(struct dm_ioctl *param)
suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
if (param->flags & DM_NOFLUSH_FLAG)
suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
- if (!dm_suspended_md(md))
- dm_suspend(md, suspend_flags);
+ if (!dm_suspended_md(md)) {
+ r = dm_suspend(md, suspend_flags);
+ if (r == -EINTR)
+ r = -ERESTARTSYS;
+ if (r) {
+ down_write(&_hash_lock);
+ hc = dm_get_mdptr(md);
+ if (!hc)
+ r = -ENXIO;
+ if (hc && !hc->new_map) {
+ hc->new_map = new_map;
+ up_write(&_hash_lock);
+ } else {
+ up_write(&_hash_lock);
+ dm_sync_table(md);
+ dm_table_destroy(new_map);
+ }
+ dm_put(md);
+ return r;
+ }
+ }
old_size = dm_get_size(md);
old_map = dm_swap_table(md, new_map);
--
2.45.2.993.g49e7a77208-goog
On Wed, 17 Jul 2024, Khazhismel Kumykov wrote:
> do_resume when loading a new map first calls dm_suspend, which could
> silently fail. When we proceeded to dm_swap_table, we would bail out
> with EINVAL. Instead, attempt to restore new_map and return ERESTARTSYS
> when signaled.
>
> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> ---
> drivers/md/dm-ioctl.c | 23 +++++++++++++++++++++--
> 1 file changed, 21 insertions(+), 2 deletions(-)
>
> v2: don't leak new_map if we can't assign it back to hc.
>
> diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
> index c2c07bfa6471..0591455ad63c 100644
> --- a/drivers/md/dm-ioctl.c
> +++ b/drivers/md/dm-ioctl.c
> @@ -1181,8 +1181,27 @@ static int do_resume(struct dm_ioctl *param)
> suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
> if (param->flags & DM_NOFLUSH_FLAG)
> suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
> - if (!dm_suspended_md(md))
> - dm_suspend(md, suspend_flags);
> + if (!dm_suspended_md(md)) {
> + r = dm_suspend(md, suspend_flags);
> + if (r == -EINTR)
> + r = -ERESTARTSYS;
I'd like to ask why the "EINTR -> ERESTARTSYS" conversion is here and why
it isn't in dm_suspend?
What do libdevmapper+lvm maintainers think about it? Does lvm hadle EINTR
by restarting the ioctl syscall? Should we return ERESTARTSYS when suspend
is interrupted?
Mikulas
Dne 23. 07. 24 v 14:51 Mikulas Patocka napsal(a):
>
>
> On Wed, 17 Jul 2024, Khazhismel Kumykov wrote:
>
>> do_resume when loading a new map first calls dm_suspend, which could
>> silently fail. When we proceeded to dm_swap_table, we would bail out
>> with EINVAL. Instead, attempt to restore new_map and return ERESTARTSYS
>> when signaled.
>>
>> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
>> ---
>> drivers/md/dm-ioctl.c | 23 +++++++++++++++++++++--
>> 1 file changed, 21 insertions(+), 2 deletions(-)
>>
>> v2: don't leak new_map if we can't assign it back to hc.
>>
>> diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
>> index c2c07bfa6471..0591455ad63c 100644
>> --- a/drivers/md/dm-ioctl.c
>> +++ b/drivers/md/dm-ioctl.c
>> @@ -1181,8 +1181,27 @@ static int do_resume(struct dm_ioctl *param)
>> suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
>> if (param->flags & DM_NOFLUSH_FLAG)
>> suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
>> - if (!dm_suspended_md(md))
>> - dm_suspend(md, suspend_flags);
>> + if (!dm_suspended_md(md)) {
>> + r = dm_suspend(md, suspend_flags);
>> + if (r == -EINTR)
>> + r = -ERESTARTSYS;
>
> I'd like to ask why the "EINTR -> ERESTARTSYS" conversion is here and why
> it isn't in dm_suspend?
>
> What do libdevmapper+lvm maintainers think about it? Does lvm hadle EINTR
> by restarting the ioctl syscall? Should we return ERESTARTSYS when suspend
> is interrupted?
In general - with suspend failures - we are just stopping whole operation -
and restoring previous state - so user can run operation again.
There is no special check for exact reason of ioctl failure.
Regards
Zdenek
On Tue, Jul 23, 2024 at 6:11 AM Zdenek Kabelac <zdenek.kabelac@gmail.com> wrote:
>
> Dne 23. 07. 24 v 14:51 Mikulas Patocka napsal(a):
> >
> >
> > On Wed, 17 Jul 2024, Khazhismel Kumykov wrote:
> >
> >> do_resume when loading a new map first calls dm_suspend, which could
> >> silently fail. When we proceeded to dm_swap_table, we would bail out
> >> with EINVAL. Instead, attempt to restore new_map and return ERESTARTSYS
> >> when signaled.
> >>
> >> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> >> ---
> >> drivers/md/dm-ioctl.c | 23 +++++++++++++++++++++--
> >> 1 file changed, 21 insertions(+), 2 deletions(-)
> >>
> >> v2: don't leak new_map if we can't assign it back to hc.
> >>
> >> diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
> >> index c2c07bfa6471..0591455ad63c 100644
> >> --- a/drivers/md/dm-ioctl.c
> >> +++ b/drivers/md/dm-ioctl.c
> >> @@ -1181,8 +1181,27 @@ static int do_resume(struct dm_ioctl *param)
> >> suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
> >> if (param->flags & DM_NOFLUSH_FLAG)
> >> suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
> >> - if (!dm_suspended_md(md))
> >> - dm_suspend(md, suspend_flags);
> >> + if (!dm_suspended_md(md)) {
> >> + r = dm_suspend(md, suspend_flags);
> >> + if (r == -EINTR)
> >> + r = -ERESTARTSYS;
> >
> > I'd like to ask why the "EINTR -> ERESTARTSYS" conversion is here and why
> > it isn't in dm_suspend?
I proposed ERESTARTSYS here since the act of waiting for the device to
suspend successfully seems "restartable" - I think the same reasoning
would apply to do_suspend.
> >
> > What do libdevmapper+lvm maintainers think about it? Does lvm hadle EINTR
> > by restarting the ioctl syscall? Should we return ERESTARTSYS when suspend
> > is interrupted?
>
> In general - with suspend failures - we are just stopping whole operation -
> and restoring previous state - so user can run operation again.
>
> There is no special check for exact reason of ioctl failure.
>
> Regards
>
> Zdenek
>
On Wed, Jul 17, 2024 at 04:18:33PM -0700, Khazhismel Kumykov wrote:
> do_resume when loading a new map first calls dm_suspend, which could
> silently fail. When we proceeded to dm_swap_table, we would bail out
> with EINVAL. Instead, attempt to restore new_map and return ERESTARTSYS
> when signaled.
>
> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> ---
> drivers/md/dm-ioctl.c | 23 +++++++++++++++++++++--
> 1 file changed, 21 insertions(+), 2 deletions(-)
>
> v2: don't leak new_map if we can't assign it back to hc.
>
> diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
> index c2c07bfa6471..0591455ad63c 100644
> --- a/drivers/md/dm-ioctl.c
> +++ b/drivers/md/dm-ioctl.c
> @@ -1181,8 +1181,27 @@ static int do_resume(struct dm_ioctl *param)
> suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
> if (param->flags & DM_NOFLUSH_FLAG)
> suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
> - if (!dm_suspended_md(md))
> - dm_suspend(md, suspend_flags);
> + if (!dm_suspended_md(md)) {
> + r = dm_suspend(md, suspend_flags);
> + if (r == -EINTR)
> + r = -ERESTARTSYS;
> + if (r) {
> + down_write(&_hash_lock);
> + hc = dm_get_mdptr(md);
> + if (!hc)
> + r = -ENXIO;
> + if (hc && !hc->new_map) {
> + hc->new_map = new_map;
> + up_write(&_hash_lock);
> + } else {
> + up_write(&_hash_lock);
> + dm_sync_table(md);
> + dm_table_destroy(new_map);
> + }
> + dm_put(md);
> + return r;
> + }
> + }
>
> old_size = dm_get_size(md);
> old_map = dm_swap_table(md, new_map);
> --
> 2.45.2.993.g49e7a77208-goog
>
>
Thanks for the patch. The header could use more context for how this
issue has caused problems in practice (you touched on that in reply to
Mikulas for v1).
But I will review this closely starting the week of July 29. This is
a very fundamental codepath for DM so needs extended review.
Mike
© 2016 - 2025 Red Hat, Inc.