From: Hou Tao <houtao1@huawei.com>
The freeing of a relinquished volume wakes up the pending volume
acquisition with wake_up_bit(). However, this does not match the
wait_var_event() used in fscache_wait_on_volume_collision(): the two
functions operate on different wait-queues, so the wake-up never
reaches the waiter.

According to the implementation of fscache_wait_on_volume_collision(),
if the wake-up of the pending acquisition is delayed for longer than
20 seconds (e.g., due to a delay in on-demand fd closing), the first
wait_var_event_timeout() will time out and the following
wait_var_event() will hang forever, as shown below:

 FS-Cache: Potential volume collision new=00000024 old=00000022
 ......
 INFO: task mount:1148 blocked for more than 122 seconds.
       Not tainted 6.1.0-rc6+ #1
 task:mount state:D stack:0 pid:1148 ppid:1
 Call Trace:
  <TASK>
  __schedule+0x2f6/0xb80
  schedule+0x67/0xe0
  fscache_wait_on_volume_collision.cold+0x80/0x82
  __fscache_acquire_volume+0x40d/0x4e0
  erofs_fscache_register_volume+0x51/0xe0 [erofs]
  erofs_fscache_register_fs+0x19c/0x240 [erofs]
  erofs_fc_fill_super+0x746/0xaf0 [erofs]
  vfs_get_super+0x7d/0x100
  get_tree_nodev+0x16/0x20
  erofs_fc_get_tree+0x20/0x30 [erofs]
  vfs_get_tree+0x24/0xb0
  path_mount+0x2fa/0xa90
  do_mount+0x7c/0xa0
  __x64_sys_mount+0x8b/0xe0
  do_syscall_64+0x30/0x60
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fix it by using wake_up_var() instead of wake_up_bit(). In addition,
because wake_up_var() checks waitqueue_active() and clear_bit() doesn't
imply any memory barrier, call smp_mb__after_atomic() before invoking
wake_up_var().
Fixes: 62ab63352350 ("fscache: Implement volume registration")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
fs/fscache/volume.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/fscache/volume.c b/fs/fscache/volume.c
index ab8ceddf9efa..cf8293bb1aca 100644
--- a/fs/fscache/volume.c
+++ b/fs/fscache/volume.c
@@ -348,7 +348,12 @@ static void fscache_wake_pending_volume(struct fscache_volume *volume,
 		if (fscache_volume_same(cursor, volume)) {
 			fscache_see_volume(cursor, fscache_volume_see_hash_wake);
 			clear_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &cursor->flags);
-			wake_up_bit(&cursor->flags, FSCACHE_VOLUME_ACQUIRE_PENDING);
+			/*
+			 * Paired with barrier in wait_var_event(). Check
+			 * waitqueue_active() and wake_up_var() for details.
+			 */
+			smp_mb__after_atomic();
+			wake_up_var(&cursor->flags);
 			return;
 		}
 	}
--
2.29.2
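
To make the mismatch described in the commit message concrete, here is a minimal userspace sketch of how a wake-up is matched against a waiter. It is a simplified model, not the kernel implementation: `struct wait_key`, `bit_key()`, `var_key()` and `wake_matches()` are hypothetical names, and the bit number in the usage note is arbitrary. It assumes, as in the kernel's wait-bit code, that a variable waiter is keyed on the address with a bit number of -1, while a bit waiter is keyed on the address plus the real bit number.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: mirrors the idea of the kernel's wait-bit key. */
struct wait_key {
	const void *addr;	/* address of the flags word or variable */
	int bit_nr;		/* bit number, or -1 for a "var" waiter */
};

#define VAR_BIT_NR (-1)		/* assumption: var waiters use bit_nr == -1 */

/* Key used when waiting on, or waking, one bit of a word. */
static struct wait_key bit_key(const void *word, int bit)
{
	struct wait_key k = { word, bit };
	return k;
}

/* Key used when waiting on, or waking, a whole variable. */
static struct wait_key var_key(const void *var)
{
	struct wait_key k = { var, VAR_BIT_NR };
	return k;
}

/* A wake-up reaches a waiter only if both parts of the key match. */
static bool wake_matches(struct wait_key waiter, struct wait_key waker)
{
	return waiter.addr == waker.addr && waiter.bit_nr == waker.bit_nr;
}
```

With a waiter registered through `var_key(&flags)`, `wake_matches()` is false for `bit_key(&flags, n)` for any real bit `n` and true for `var_key(&flags)`. In this model, that is why a bit-style wake-up cannot release a var-style waiter on the same word.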
Hi David,

Could you please pick it up for v6.2?

On 11/28/2022 11:19 AM, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
>
> [...]
Hou Tao <houtao@huaweicloud.com> wrote:

> >  			clear_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &cursor->flags);

Maybe this should be clear_bit_unlock() instead.

And I wonder if:

	set_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &candidate->flags);

in fscache_hash_volume() needs a barrier before it.

> > -			wake_up_bit(&cursor->flags, FSCACHE_VOLUME_ACQUIRE_PENDING);
> > +			/*
> > +			 * Paired with barrier in wait_var_event(). Check
> > +			 * waitqueue_active() and wake_up_var() for details.
> > +			 */
> > +			smp_mb__after_atomic();
> > +			wake_up_var(&cursor->flags);

That doesn't seem right.

wake_up_bit() is more selective, so should be preferred to wake_up_var().

David
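
For context on the clear_bit_unlock() suggestion: that helper is conventionally the release half of a bit lock, with test_and_set_bit_lock() as the acquire half. The following is a minimal C11 userspace sketch of that pairing, not kernel code. `bit_trylock()` and `bit_unlock()` are hypothetical names, and unlike test_and_set_bit_lock(), which returns the old bit value, `bit_trylock()` returns true on success.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace analogue of test_and_set_bit_lock(): acquire ordering. */
static bool bit_trylock(atomic_ulong *word, unsigned int bit)
{
	unsigned long mask = 1UL << bit;
	unsigned long old = atomic_fetch_or_explicit(word, mask,
						     memory_order_acquire);

	return !(old & mask);	/* true if this caller took the lock */
}

/* Userspace analogue of clear_bit_unlock(): release ordering. */
static void bit_unlock(atomic_ulong *word, unsigned int bit)
{
	atomic_fetch_and_explicit(word, ~(1UL << bit), memory_order_release);
}
```

The release in `bit_unlock()` only orders earlier accesses before the clear. It says nothing about accesses that come after it, such as a later check for queued waiters.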
Hi David,

Sorry for the late reply. I have been busy with other work.

On 12/9/2022 7:26 PM, David Howells wrote:
> Hou Tao <houtao@huaweicloud.com> wrote:
>
>>>  			clear_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &cursor->flags);
> Maybe this should be clear_bit_unlock() instead.

I'm not sure about that. In my understanding, clear_bit_unlock() is
usually paired with test_and_set_bit_lock() to implement a bit lock, so
that writes made before clear_bit_unlock() are visible to reads in a
concurrent process, right? But the caller of
fscache_wake_pending_volume() only modifies cursor->flags and nothing
else, so I don't think it is needed here.

If its intended purpose is to provide the missing smp_mb() for
wake_up_bit(), I don't think that is right either, because the release
barrier provided by clear_bit_unlock() doesn't guarantee the ordering
between cursor->flags and wq_head. So I think an extra
smp_mb__after_atomic() is still needed after
clear_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &cursor->flags).

If the above reasoning makes sense to you, I think we also need to add
smp_mb__after_atomic() for wake_up_bit() in
fscache_create_volume_work().

> And I wonder if:
>
> 	set_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &candidate->flags);
>
> in fscache_hash_volume() needs a barrier before it.

I don't get that either. The barrier would be used to guarantee the
ordering between cursor->flags and candidate->flags, right? But the
writes and reads of cursor->flags and candidate->flags are protected by
the same hash lock.

>>> -			wake_up_bit(&cursor->flags, FSCACHE_VOLUME_ACQUIRE_PENDING);
>>> +			/*
>>> +			 * Paired with barrier in wait_var_event(). Check
>>> +			 * waitqueue_active() and wake_up_var() for details.
>>> +			 */
>>> +			smp_mb__after_atomic();
>>> +			wake_up_var(&cursor->flags);
> That doesn't seem right.
>
> wake_up_bit() is more selective, so should be preferred to wake_up_var().

OK. I will update fscache_wait_on_volume_collision() to use
wait_on_bit() accordingly.

> David
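
The ordering argument in this reply can be sketched in userspace with C11 atomics. This is a model under stated assumptions, not the kernel code: `struct model`, `PENDING_BIT`, `clear_pending_and_check()` and `enqueue_and_check_pending()` are hypothetical names, a waiter counter stands in for the wait-queue head, and `atomic_thread_fence(memory_order_seq_cst)` stands in for smp_mb__after_atomic() on the waker side and for the barrier on the wait side.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define PENDING_BIT 0	/* hypothetical bit number, for illustration only */

struct model {
	atomic_ulong flags;	/* stands in for cursor->flags */
	atomic_int nr_waiters;	/* stands in for a non-empty wait-queue */
};

/*
 * Waker side: clear the bit, then decide whether a wake-up is needed.
 * Returns true if a queued waiter was observed.
 */
static bool clear_pending_and_check(struct model *m)
{
	/* Relaxed RMW: like clear_bit(), it implies no ordering by itself. */
	atomic_fetch_and_explicit(&m->flags, ~(1UL << PENDING_BIT),
				  memory_order_relaxed);
	/* Full fence: stands in for smp_mb__after_atomic(). */
	atomic_thread_fence(memory_order_seq_cst);
	/* Like waitqueue_active(): an unlocked check for queued waiters. */
	return atomic_load_explicit(&m->nr_waiters, memory_order_relaxed) > 0;
}

/*
 * Waiter side: queue first, then re-check the condition.
 * Returns true if the waiter still has to sleep.
 */
static bool enqueue_and_check_pending(struct model *m)
{
	atomic_fetch_add_explicit(&m->nr_waiters, 1, memory_order_relaxed);
	/* Full fence: stands in for the barrier on the wait side. */
	atomic_thread_fence(memory_order_seq_cst);
	return (atomic_load_explicit(&m->flags, memory_order_relaxed) &
		(1UL << PENDING_BIT)) != 0;
}
```

With a full fence on both sides, at least one side must observe the other's store: either the waker sees the queued waiter or the waiter sees the cleared bit. Replacing the waker's fence with a release-only clear would not give that guarantee, because release ordering does not keep the later waiter check from being satisfied before the clear becomes visible.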
Hi,

Thanks for catching this.

On 11/28/22 11:19 AM, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
>
> [...]
>
> Fixes: 62ab63352350 ("fscache: Implement volume registration")
> Signed-off-by: Hou Tao <houtao1@huawei.com>

Reviewed-and-tested-by: Jingbo Xu <jefflexu@linux.alibaba.com>

-- 
Thanks,
Jingbo