When a filesystem is frozen, quotactl_block() enters a retry loop
waiting for the filesystem to thaw. It acquires s_umount, checks the
freeze state, drops s_umount, and uses an sb_start_write() /
sb_end_write() pair to wait for the unfreeze.
However, this retry loop can trigger a livelock issue, specifically on
kernels with preemption disabled.
The mechanism is as follows:
1. freeze_super() sets SB_FREEZE_WRITE and calls sb_wait_write().
2. sb_wait_write() calls percpu_down_write(), which initiates
synchronize_rcu().
3. Simultaneously, quotactl_block() spins in its retry loop, immediately
executing the sb_start_write() - sb_end_write() pair.
4. Because the kernel is non-preemptible and the loop contains no
scheduling points, quotactl_block() never yields the CPU. This
prevents that CPU from reaching an RCU quiescent state.
5. synchronize_rcu() in the freezer thread waits indefinitely for the
quotactl_block() CPU to report a quiescent state.
6. quotactl_block() spins indefinitely waiting for the freezer to
advance, which it cannot do as it is blocked on the RCU sync.
This results in a hang of the freezer process and 100% CPU usage by the
quota process.
While this can occur intermittently on multi-core systems, it
reproduces reliably on a node with the following script, which runs both
the freezer and the quota toggle on the same CPU:
# mkfs.ext4 -O quota /dev/sda 2g && mkdir a_mount
# mount /dev/sda -o quota,usrquota,grpquota a_mount
# taskset -c 3 bash -c "while true; do xfs_freeze -f a_mount; \
xfs_freeze -u a_mount; done" &
# taskset -c 3 bash -c "while true; do quotaon a_mount; \
quotaoff a_mount; done" &
Adding cond_resched() to the retry loop fixes the issue. It acts as an
RCU quiescent state, allowing synchronize_rcu() in percpu_down_write()
to complete.
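
To make the interaction easier to follow, here is a minimal userspace
sketch of the mechanism described above. It is a toy model, not kernel
code: it assumes one non-preemptible CPU shared by the freezer and the
quota task, and every name in it (toy_cpu, freezer_step,
simulate_retry_loop) is hypothetical. It only models the window in which
the freezer is waiting for an RCU grace period.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical states of the freezer task. */
enum freezer_state {
	FREEZER_WAITING_FOR_GP,	/* stuck in synchronize_rcu() */
	FREEZER_FROZEN,		/* grace period done, fs frozen */
	FREEZER_THAWED,		/* fs unfrozen again */
};

struct toy_cpu {
	enum freezer_state freezer;
	bool quiescent_state_seen;	/* set when the quota task yields */
};

/* Advance the freezer by one step; called only when it gets the CPU. */
static void freezer_step(struct toy_cpu *cpu)
{
	switch (cpu->freezer) {
	case FREEZER_WAITING_FOR_GP:
		/* The grace period ends only after a quiescent state. */
		if (cpu->quiescent_state_seen)
			cpu->freezer = FREEZER_FROZEN;
		break;
	case FREEZER_FROZEN:
		cpu->freezer = FREEZER_THAWED;
		break;
	case FREEZER_THAWED:
		break;
	}
}

/*
 * Model of the retry loop. Returns how many retries the quota task
 * needed before it saw a thawed filesystem, or -1 if it was still
 * spinning after max_retries (the livelock case).
 */
static int simulate_retry_loop(bool use_cond_resched, int max_retries)
{
	struct toy_cpu cpu = {
		.freezer = FREEZER_WAITING_FOR_GP,
		.quiescent_state_seen = false,
	};
	int retries;

	assert(max_retries > 0);

	for (retries = 0; retries < max_retries; retries++) {
		if (cpu.freezer == FREEZER_THAWED)
			return retries;

		if (cpu.freezer == FREEZER_FROZEN) {
			/* Modeled as a sleep in sb_start_write(). */
			freezer_step(&cpu);
		} else if (use_cond_resched) {
			/* Yielding counts as a quiescent state. */
			cpu.quiescent_state_seen = true;
			freezer_step(&cpu);
		}
		/* Otherwise the write pair returns at once and we spin. */
	}
	return -1;
}
```

In this model, simulate_retry_loop(false, 1000) returns -1 because the
freezer never leaves FREEZER_WAITING_FOR_GP, while
simulate_retry_loop(true, 1000) returns 2. The real kernel behaviour
depends on the scheduler and RCU configuration, so the model illustrates
the ordering argument only.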
Fixes: 576215cffdef ("fs: Drop wait_unfrozen wait queue")
Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
---
fs/quota/quota.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 7c2b75a44485..de4379a9c792 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -899,6 +899,7 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
 		sb_start_write(sb);
 		sb_end_write(sb);
 		put_super(sb);
+		cond_resched();
 		goto retry;
 	}
 	return sb;
--
2.52.0.457.g6b5491de43-goog
On Thu 15-01-26 21:31:03, Abhishek Bapat wrote:
Thanks for the fix! I've added it to my tree.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR