From: Jinliang Zheng <alexjlzheng@tencent.com>
We noticed a softlockup like:
crash> bt
PID: 5153 TASK: ffff8960a7ca0000 CPU: 115 COMMAND: "xfsaild/dm-4"
#0 [ffffc9001b1d4d58] machine_kexec at ffffffff9b086081
#1 [ffffc9001b1d4db8] __crash_kexec at ffffffff9b20817a
#2 [ffffc9001b1d4e78] panic at ffffffff9b107d8f
#3 [ffffc9001b1d4ef8] watchdog_timer_fn at ffffffff9b243511
#4 [ffffc9001b1d4f28] __hrtimer_run_queues at ffffffff9b1e62ff
#5 [ffffc9001b1d4f80] hrtimer_interrupt at ffffffff9b1e73d4
#6 [ffffc9001b1d4fd8] __sysvec_apic_timer_interrupt at ffffffff9b07bb29
#7 [ffffc9001b1d4ff0] sysvec_apic_timer_interrupt at ffffffff9bd689f9
--- <IRQ stack> ---
#8 [ffffc90031cd3a18] asm_sysvec_apic_timer_interrupt at ffffffff9be00e86
[exception RIP: part_in_flight+47]
RIP: ffffffff9b67960f RSP: ffffc90031cd3ac8 RFLAGS: 00000282
RAX: 00000000000000a9 RBX: 00000000000c4645 RCX: 00000000000000f5
RDX: ffffe89fffa36fe0 RSI: 0000000000000180 RDI: ffffffff9d1ae260
RBP: ffff898083d30000 R8: 00000000000000a8 R9: 0000000000000000
R10: ffff89808277d800 R11: 0000000000001000 R12: 0000000101a7d5be
R13: 0000000000000000 R14: 0000000000001001 R15: 0000000000001001
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffffc90031cd3ad8] update_io_ticks at ffffffff9b6602e4
#10 [ffffc90031cd3b00] bdev_start_io_acct at ffffffff9b66031b
#11 [ffffc90031cd3b20] dm_io_acct at ffffffffc18d7f98 [dm_mod]
#12 [ffffc90031cd3b50] dm_submit_bio_remap at ffffffffc18d8195 [dm_mod]
#13 [ffffc90031cd3b70] dm_split_and_process_bio at ffffffffc18d9799 [dm_mod]
#14 [ffffc90031cd3be0] dm_submit_bio at ffffffffc18d9b07 [dm_mod]
#15 [ffffc90031cd3c20] __submit_bio at ffffffff9b65f61c
#16 [ffffc90031cd3c38] __submit_bio_noacct at ffffffff9b65f73e
#17 [ffffc90031cd3c80] xfs_buf_ioapply_map at ffffffffc23df4ea [xfs]
#18 [ffffc90031cd3ce0] _xfs_buf_ioapply at ffffffffc23df64f [xfs]
#19 [ffffc90031cd3d50] __xfs_buf_submit at ffffffffc23df7b8 [xfs]
#20 [ffffc90031cd3d70] xfs_buf_delwri_submit_buffers at ffffffffc23dffbd [xfs]
#21 [ffffc90031cd3df8] xfsaild_push at ffffffffc24268e5 [xfs]
#22 [ffffc90031cd3eb8] xfsaild at ffffffffc2426f88 [xfs]
#23 [ffffc90031cd3ef8] kthread at ffffffff9b1378fc
#24 [ffffc90031cd3f30] ret_from_fork at ffffffff9b042dd0
#25 [ffffc90031cd3f50] ret_from_fork_asm at ffffffff9b007e2b
This patch adds cond_resched() to avoid softlockups similar to the one
described above.
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
fs/xfs/xfs_buf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 47edf3041631..f1f8595d5e40 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -2026,6 +2026,8 @@ xfs_buf_delwri_submit_nowait(
 		bp->b_flags |= XBF_ASYNC;
 		xfs_buf_list_del(bp);
 		xfs_buf_submit(bp);
+
+		cond_resched();
 	}
 
 	blk_finish_plug(&plug);
--
2.49.0
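For context, the hunk above sits at the tail of the delwri buffer
submission loop. A minimal sketch of the patched loop follows; only the
lines that appear in the hunk are verbatim, while the plugged list-walk
framing (blk_start_plug()/list_for_each_entry_safe() over buffer_list)
is an assumption based on how this path has historically been
structured, with the per-buffer setup elided:

	blk_start_plug(&plug);
	list_for_each_entry_safe(bp, n, buffer_list, b_list) {
		/* ... per-buffer locking and flag setup elided ... */
		bp->b_flags |= XBF_ASYNC;
		xfs_buf_list_del(bp);
		xfs_buf_submit(bp);

		/*
		 * The proposed change: offer to reschedule after each
		 * submission so that a long buffer list cannot keep the
		 * xfsaild pinned on one CPU for watchdog-visible periods.
		 */
		cond_resched();
	}
	blk_finish_plug(&plug);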
On Thu, Feb 05, 2026 at 04:26:21PM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
>
> We noticed a softlockup like:
>
> crash> bt
> PID: 5153 TASK: ffff8960a7ca0000 CPU: 115 COMMAND: "xfsaild/dm-4"
> #0 [ffffc9001b1d4d58] machine_kexec at ffffffff9b086081
> #1 [ffffc9001b1d4db8] __crash_kexec at ffffffff9b20817a
> #2 [ffffc9001b1d4e78] panic at ffffffff9b107d8f
> #3 [ffffc9001b1d4ef8] watchdog_timer_fn at ffffffff9b243511
> #4 [ffffc9001b1d4f28] __hrtimer_run_queues at ffffffff9b1e62ff
> #5 [ffffc9001b1d4f80] hrtimer_interrupt at ffffffff9b1e73d4
> #6 [ffffc9001b1d4fd8] __sysvec_apic_timer_interrupt at ffffffff9b07bb29
> #7 [ffffc9001b1d4ff0] sysvec_apic_timer_interrupt at ffffffff9bd689f9
> --- <IRQ stack> ---
> #8 [ffffc90031cd3a18] asm_sysvec_apic_timer_interrupt at ffffffff9be00e86
> [exception RIP: part_in_flight+47]
> RIP: ffffffff9b67960f RSP: ffffc90031cd3ac8 RFLAGS: 00000282
> RAX: 00000000000000a9 RBX: 00000000000c4645 RCX: 00000000000000f5
> RDX: ffffe89fffa36fe0 RSI: 0000000000000180 RDI: ffffffff9d1ae260
> RBP: ffff898083d30000 R8: 00000000000000a8 R9: 0000000000000000
> R10: ffff89808277d800 R11: 0000000000001000 R12: 0000000101a7d5be
> R13: 0000000000000000 R14: 0000000000001001 R15: 0000000000001001
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #9 [ffffc90031cd3ad8] update_io_ticks at ffffffff9b6602e4
> #10 [ffffc90031cd3b00] bdev_start_io_acct at ffffffff9b66031b
> #11 [ffffc90031cd3b20] dm_io_acct at ffffffffc18d7f98 [dm_mod]
> #12 [ffffc90031cd3b50] dm_submit_bio_remap at ffffffffc18d8195 [dm_mod]
> #13 [ffffc90031cd3b70] dm_split_and_process_bio at ffffffffc18d9799 [dm_mod]
> #14 [ffffc90031cd3be0] dm_submit_bio at ffffffffc18d9b07 [dm_mod]
> #15 [ffffc90031cd3c20] __submit_bio at ffffffff9b65f61c
> #16 [ffffc90031cd3c38] __submit_bio_noacct at ffffffff9b65f73e
> #17 [ffffc90031cd3c80] xfs_buf_ioapply_map at ffffffffc23df4ea [xfs]

This isn't from a TOT kernel. xfs_buf_ioapply_map() went away a year
ago. What kernel is this occurring on?

> #18 [ffffc90031cd3ce0] _xfs_buf_ioapply at ffffffffc23df64f [xfs]
> #19 [ffffc90031cd3d50] __xfs_buf_submit at ffffffffc23df7b8 [xfs]
> #20 [ffffc90031cd3d70] xfs_buf_delwri_submit_buffers at ffffffffc23dffbd [xfs]
> #21 [ffffc90031cd3df8] xfsaild_push at ffffffffc24268e5 [xfs]
> #22 [ffffc90031cd3eb8] xfsaild at ffffffffc2426f88 [xfs]
> #23 [ffffc90031cd3ef8] kthread at ffffffff9b1378fc
> #24 [ffffc90031cd3f30] ret_from_fork at ffffffff9b042dd0
> #25 [ffffc90031cd3f50] ret_from_fork_asm at ffffffff9b007e2b
>
> This patch adds cond_resched() to avoid softlockups similar to the one
> described above.

Again: how does this softlockup occur?

xfsaild_push() pushes at most 1000 items at a time for IO. It would
have to be a fairly fast device not to block on the request queues
filling as we submit batches of 1000 buffers at a time.

Then the higher level AIL traversal loop would also have to be
making continuous progress without blocking. Hence it must not hit
the end of the AIL, nor ever hit pinned, stale, flushing or locked
items in the AIL for as long as it takes for the soft lockup timer
to fire. This seems ... highly unlikely.

IOWs, if we are looping in this path without giving up the CPU for
seconds at a time, then it is not behaving as I'd expect it to
behave. We need to understand why this code is apparently behaving
in an unexpected way, not just silence the warning....

Can you please explain how the softlockup timer is being hit here so we
can try to understand the root cause of the problem? Workload,
hardware, filesystem config, storage stack, etc all matter here,
because they all play a part in these paths never blocking on
a lock, a full queue, a pinned buffer, etc, whilst processing
hundreds of thousands of dirty objects for IO.

At least, I'm assuming we're talking about hundreds of thousands of
objects, because I know the AIL can push a hundred thousand dirty
buffers to disk every second when it is close to being CPU bound. So
if it's not giving up the CPU for long enough to fire the soft
lockup timer, we must be talking about processing millions of
objects without blocking even once....

-Dave.
-- 
Dave Chinner
david@fromorbit.com
On Thu, 5 Feb 2026 22:44:51 +1100, david@fromorbit.com wrote:
> On Thu, Feb 05, 2026 at 04:26:21PM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> >
> > We noticed a softlockup like:
> >
> > crash> bt
> > PID: 5153 TASK: ffff8960a7ca0000 CPU: 115 COMMAND: "xfsaild/dm-4"
> > #0 [ffffc9001b1d4d58] machine_kexec at ffffffff9b086081
> > #1 [ffffc9001b1d4db8] __crash_kexec at ffffffff9b20817a
> > #2 [ffffc9001b1d4e78] panic at ffffffff9b107d8f
> > #3 [ffffc9001b1d4ef8] watchdog_timer_fn at ffffffff9b243511
> > #4 [ffffc9001b1d4f28] __hrtimer_run_queues at ffffffff9b1e62ff
> > #5 [ffffc9001b1d4f80] hrtimer_interrupt at ffffffff9b1e73d4
> > #6 [ffffc9001b1d4fd8] __sysvec_apic_timer_interrupt at ffffffff9b07bb29
> > #7 [ffffc9001b1d4ff0] sysvec_apic_timer_interrupt at ffffffff9bd689f9
> > --- <IRQ stack> ---
> > #8 [ffffc90031cd3a18] asm_sysvec_apic_timer_interrupt at ffffffff9be00e86
> > [exception RIP: part_in_flight+47]
> > RIP: ffffffff9b67960f RSP: ffffc90031cd3ac8 RFLAGS: 00000282
> > RAX: 00000000000000a9 RBX: 00000000000c4645 RCX: 00000000000000f5
> > RDX: ffffe89fffa36fe0 RSI: 0000000000000180 RDI: ffffffff9d1ae260
> > RBP: ffff898083d30000 R8: 00000000000000a8 R9: 0000000000000000
> > R10: ffff89808277d800 R11: 0000000000001000 R12: 0000000101a7d5be
> > R13: 0000000000000000 R14: 0000000000001001 R15: 0000000000001001
> > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > #9 [ffffc90031cd3ad8] update_io_ticks at ffffffff9b6602e4
> > #10 [ffffc90031cd3b00] bdev_start_io_acct at ffffffff9b66031b
> > #11 [ffffc90031cd3b20] dm_io_acct at ffffffffc18d7f98 [dm_mod]
> > #12 [ffffc90031cd3b50] dm_submit_bio_remap at ffffffffc18d8195 [dm_mod]
> > #13 [ffffc90031cd3b70] dm_split_and_process_bio at ffffffffc18d9799 [dm_mod]
> > #14 [ffffc90031cd3be0] dm_submit_bio at ffffffffc18d9b07 [dm_mod]
> > #15 [ffffc90031cd3c20] __submit_bio at ffffffff9b65f61c
> > #16 [ffffc90031cd3c38] __submit_bio_noacct at ffffffff9b65f73e
> > #17 [ffffc90031cd3c80] xfs_buf_ioapply_map at ffffffffc23df4ea [xfs]
>
> This isn't from a TOT kernel. xfs_buf_ioapply_map() went away a year
> ago. What kernel is this occurring on?
Thanks for your reply. :)
It's based on v6.6.
>
> > #18 [ffffc90031cd3ce0] _xfs_buf_ioapply at ffffffffc23df64f [xfs]
> > #19 [ffffc90031cd3d50] __xfs_buf_submit at ffffffffc23df7b8 [xfs]
> > #20 [ffffc90031cd3d70] xfs_buf_delwri_submit_buffers at ffffffffc23dffbd [xfs]
> > #21 [ffffc90031cd3df8] xfsaild_push at ffffffffc24268e5 [xfs]
> > #22 [ffffc90031cd3eb8] xfsaild at ffffffffc2426f88 [xfs]
> > #23 [ffffc90031cd3ef8] kthread at ffffffff9b1378fc
> > #24 [ffffc90031cd3f30] ret_from_fork at ffffffff9b042dd0
> > #25 [ffffc90031cd3f50] ret_from_fork_asm at ffffffff9b007e2b
> >
> > This patch adds cond_resched() to avoid softlockups similar to the one
> > described above.
>
> Again: how does this softlockup occur?
[28089.641309] watchdog: BUG: soft lockup - CPU#115 stuck for 26s! [xfsaild/dm-4:5153]
>
> xfsaild_push() pushes at most 1000 items at a time for IO. It would
> have to be a fairly fast device not to block on the request queues
> filling as we submit batches of 1000 buffers at a time.
>
> Then the higher level AIL traversal loop would also have to be
> making continuous progress without blocking. Hence it must not hit
> the end of the AIL, nor ever hit pinned, stale, flushing or locked
> items in the AIL for as long as it takes for the soft lockup timer
> to fire. This seems ... highly unlikely.
>
> IOWs, if we are looping in this path without giving up the CPU for
> seconds at a time, then it is not behaving as I'd expect it to
> behave. We need to understand why this code is apparently behaving
> in an unexpected way, not just silence the warning....
>
> Can you please explain how the softlockup timer is being hit here so we
> can try to understand the root cause of the problem? Workload,
Again, a testsuite combining stress-ng, LTP, and fio, executed concurrently.
> hardware, filesystem config, storage stack, etc all matter here,
================================= CPU ======================================
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 45 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 384
On-line CPU(s) list: 0-383
================================= MEM ======================================
[root@localhost ~]# free -h
total used free shared buff/cache available
Mem: 1.5Ti 479Gi 1.0Ti 2.0Gi 6.2Gi 1.0Ti
Swap: 0B 0B 0B
================================= XFS ======================================
[root@localhost ~]# xfs_info /dev/ts/home
meta-data=/dev/mapper/ts-home isize=512 agcount=4, agsize=45875200 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=1
= reflink=1 bigtime=1 inobtcount=1 nrext64=1
data = bsize=4096 blocks=183500800, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=89600, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
================================= IOS ======================================
sdb
├─sdb1 vfat FAT32 EE62-ACF4 591.3M 1% /boot/efi
├─sdb2 xfs 95e0a12a-ed33-45b7-abe1-640c4e334468 1.6G 17% /boot
└─sdb3 LVM2_member LVM2 001 fSTN7x-BNcZ-w67a-gZBf-Vlps-7gtv-KAreS7
├─ts-root xfs f78c147c-86d7-4675-b9bd-ed0d512f37f4 84.5G 50% /
└─ts-home xfs 01677289-32f2-41b0-ab59-3c5d14a1eefa 668.6G 4% /home
> because they all play a part in these paths never blocking on
> a lock, a full queue, a pinned buffer, etc, whilst processing
> hundreds of thousands of dirty objects for IO.
And there's another softlockup, which is similar:
watchdog: BUG: soft lockup - CPU#342 stuck for 22s! [xfsaild/dm-4:5045]
crash> bt
PID: 5045 TASK: ffff89e0a0150000 CPU: 342 COMMAND: "xfsaild/dm-4"
#0 [ffffc9001d98cd58] machine_kexec at ffffffffa8086081
#1 [ffffc9001d98cdb8] __crash_kexec at ffffffffa820817a
#2 [ffffc9001d98ce78] panic at ffffffffa8107d8f
#3 [ffffc9001d98cef8] watchdog_timer_fn at ffffffffa8243511
#4 [ffffc9001d98cf28] __hrtimer_run_queues at ffffffffa81e62ff
#5 [ffffc9001d98cf80] hrtimer_interrupt at ffffffffa81e73d4
#6 [ffffc9001d98cfd8] __sysvec_apic_timer_interrupt at ffffffffa807bb29
#7 [ffffc9001d98cff0] sysvec_apic_timer_interrupt at ffffffffa8d689f9
--- <IRQ stack> ---
#8 [ffffc900351efa48] asm_sysvec_apic_timer_interrupt at ffffffffa8e00e86
[exception RIP: kernel_fpu_begin_mask+66]
RIP: ffffffffa8044f52 RSP: ffffc900351efaf8 RFLAGS: 00000206
RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff899f8f9ee000 RDI: ffff89e0a0150000
RBP: 0000000000001000 R8: 0000000000000000 R9: 0000000000001000
R10: 0000000000000000 R11: ffff899f8f9ee030 R12: ffffc900351efb38
R13: ffff899f8f9ee000 R14: ffff88e084624158 R15: ffff88e083828000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffffc900351efb10] crc32c_pcl_intel_update at ffffffffc199b390 [crc32c_intel]
#10 [ffffc900351efb30] crc32c at ffffffffc196d03f [libcrc32c]
#11 [ffffc900351efcb8] xfs_dir3_data_write_verify at ffffffffc250cfd9 [xfs]
#12 [ffffc900351efce0] _xfs_buf_ioapply at ffffffffc253d5cd [xfs]
#13 [ffffc900351efd50] __xfs_buf_submit at ffffffffc253d7b8 [xfs]
#14 [ffffc900351efd70] xfs_buf_delwri_submit_buffers at ffffffffc253dfbd [xfs]
#15 [ffffc900351efdf8] xfsaild_push at ffffffffc25848e5 [xfs]
#16 [ffffc900351efeb8] xfsaild at ffffffffc2584f88 [xfs]
#17 [ffffc900351efef8] kthread at ffffffffa81378fc
#18 [ffffc900351eff30] ret_from_fork at ffffffffa8042dd0
#19 [ffffc900351eff50] ret_from_fork_asm at ffffffffa8007e2b
Thanks,
Jinliang Zheng. :)
>
> At least, I'm assuming we're talking about hundreds of thousands of
> objects, because I know the AIL can push a hundred thousand dirty
> buffers to disk every second when it is close to being CPU bound. So
> if it's not giving up the CPU for long enough to fire the soft
> lockup timer, we must be talking about processing millions of
> objects without blocking even once....
>
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com
On Thu, Feb 05, 2026 at 08:49:59PM +0800, Jinliang Zheng wrote:
> On Thu, 5 Feb 2026 22:44:51 +1100, david@fromorbit.com wrote:
> > On Thu, Feb 05, 2026 at 04:26:21PM +0800, alexjlzheng@gmail.com wrote:
> > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > >
> > > We noticed a softlockup like:
> > >
> > > crash> bt
> > > PID: 5153 TASK: ffff8960a7ca0000 CPU: 115 COMMAND: "xfsaild/dm-4"
> > > #0 [ffffc9001b1d4d58] machine_kexec at ffffffff9b086081
> > > #1 [ffffc9001b1d4db8] __crash_kexec at ffffffff9b20817a
> > > #2 [ffffc9001b1d4e78] panic at ffffffff9b107d8f
> > > #3 [ffffc9001b1d4ef8] watchdog_timer_fn at ffffffff9b243511
> > > #4 [ffffc9001b1d4f28] __hrtimer_run_queues at ffffffff9b1e62ff
> > > #5 [ffffc9001b1d4f80] hrtimer_interrupt at ffffffff9b1e73d4
> > > #6 [ffffc9001b1d4fd8] __sysvec_apic_timer_interrupt at ffffffff9b07bb29
> > > #7 [ffffc9001b1d4ff0] sysvec_apic_timer_interrupt at ffffffff9bd689f9
> > > --- <IRQ stack> ---
> > > #8 [ffffc90031cd3a18] asm_sysvec_apic_timer_interrupt at ffffffff9be00e86
> > > [exception RIP: part_in_flight+47]
> > > RIP: ffffffff9b67960f RSP: ffffc90031cd3ac8 RFLAGS: 00000282
> > > RAX: 00000000000000a9 RBX: 00000000000c4645 RCX: 00000000000000f5
> > > RDX: ffffe89fffa36fe0 RSI: 0000000000000180 RDI: ffffffff9d1ae260
> > > RBP: ffff898083d30000 R8: 00000000000000a8 R9: 0000000000000000
> > > R10: ffff89808277d800 R11: 0000000000001000 R12: 0000000101a7d5be
> > > R13: 0000000000000000 R14: 0000000000001001 R15: 0000000000001001
> > > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > > #9 [ffffc90031cd3ad8] update_io_ticks at ffffffff9b6602e4
> > > #10 [ffffc90031cd3b00] bdev_start_io_acct at ffffffff9b66031b
> > > #11 [ffffc90031cd3b20] dm_io_acct at ffffffffc18d7f98 [dm_mod]
> > > #12 [ffffc90031cd3b50] dm_submit_bio_remap at ffffffffc18d8195 [dm_mod]
> > > #13 [ffffc90031cd3b70] dm_split_and_process_bio at ffffffffc18d9799 [dm_mod]
> > > #14 [ffffc90031cd3be0] dm_submit_bio at ffffffffc18d9b07 [dm_mod]
> > > #15 [ffffc90031cd3c20] __submit_bio at ffffffff9b65f61c
> > > #16 [ffffc90031cd3c38] __submit_bio_noacct at ffffffff9b65f73e
> > > #17 [ffffc90031cd3c80] xfs_buf_ioapply_map at ffffffffc23df4ea [xfs]
> >
> > This isn't from a TOT kernel. xfs_buf_ioapply_map() went away a year
> > ago. What kernel is this occurring on?
>
> Thanks for your reply. :)
>
> It's based on v6.6.

v6.6 was released in late 2023. I think we largely fixed this problem
with this series that was merged into 6.11 in mid 2024:

https://lore.kernel.org/linux-xfs/20220809230353.3353059-1-david@fromorbit.com/

In more detail...

> > Can you please explain how the softlockup timer is being hit here so we
> > can try to understand the root cause of the problem? Workload,
>
> Again, a testsuite combining stress-ng, LTP, and fio, executed concurrently.
>
> > hardware, filesystem config, storage stack, etc all matter here,
>
> ================================= CPU ======================================
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Address sizes: 45 bits physical, 48 bits virtual
> Byte Order: Little Endian
> CPU(s): 384

... 384 CPUs banging on a single filesystem....

> ================================= XFS ======================================
> [root@localhost ~]# xfs_info /dev/ts/home
> meta-data=/dev/mapper/ts-home isize=512 agcount=4, agsize=45875200 blks

... that has very limited parallelism, and ...

> = sectsz=4096 attr=2, projid32bit=1
> = crc=1 finobt=1, sparse=1, rmapbt=1
> = reflink=1 bigtime=1 inobtcount=1 nrext64=1
> data = bsize=4096 blocks=183500800, imaxpct=25
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0, ftype=1
> log =internal log bsize=4096 blocks=89600, version=2

... a relatively small log (350MB) compared to the size of the system
that is hammering on it.

i.e. This is exactly the sort of system architecture that will push
heaps of concurrency into the filesystem's transaction reservation
slow path and keep it there for long periods of time. Especially
under sustained, highly concurrent, modification heavy stress
workloads.

Exposing any kernel spin lock to unbound user controlled concurrency
will eventually result in a workload that causes catastrophic spin
lock contention breakdown. Then everything that uses said lock will
spend excessive amounts of time spinning and not making progress.

This is one of the scalability problems the patchset I linked above
addressed. Prior to that patchset, the transaction reservation slow
path (the "journal full" path) exposed the AIL lock to unbound
userspace concurrency via the "update the AIL push target"
mechanism. Both journal IO completion and the xfsaild are heavy
users of the AIL lock, but don't normally contend with each other
because internal filesystem concurrency is tightly bounded. Once
userspace starts banging on it, however....

Silencing soft lockups with cond_resched() is almost never the right
thing to do - they are generally indicative of some other problem
occurring. We need to understand what that "some other problem" is
before we do anything else...

-Dave.
-- 
Dave Chinner
david@fromorbit.com
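For reference, the 350MB log figure follows directly from the xfs_info
output quoted above:

	89600 log blocks * 4096 bytes/block = 367,001,600 bytes = 350 MiB

The contention breakdown described above can also be demonstrated in
miniature from userspace. The sketch below is illustrative only - plain
POSIX threads code, not the kernel AIL path - but it shows the same
failure mode: a single spin lock shared by an unbounded number of
threads burns progressively more CPU time spinning as the thread count
grows, even though the critical section itself stays tiny.

	/*
	 * Illustrative spin lock contention demo (userspace, not kernel
	 * code). Build: gcc -O2 -pthread spin_demo.c -o spin_demo
	 * Run with increasing thread counts and watch the wall time per
	 * increment climb: ./spin_demo 1; ./spin_demo 8; ./spin_demo 64
	 */
	#include <pthread.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <time.h>

	static pthread_spinlock_t lock;
	static long counter;

	static void *worker(void *arg)
	{
		long iters = (long)arg;

		for (long i = 0; i < iters; i++) {
			pthread_spin_lock(&lock);
			counter++;	/* tiny critical section */
			pthread_spin_unlock(&lock);
		}
		return NULL;
	}

	int main(int argc, char **argv)
	{
		int nthreads = argc > 1 ? atoi(argv[1]) : 4;
		long iters = 1000000;
		pthread_t tids[nthreads];
		struct timespec t0, t1;

		pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);

		clock_gettime(CLOCK_MONOTONIC, &t0);
		for (int i = 0; i < nthreads; i++)
			pthread_create(&tids[i], NULL, worker, (void *)iters);
		for (int i = 0; i < nthreads; i++)
			pthread_join(tids[i], NULL);
		clock_gettime(CLOCK_MONOTONIC, &t1);

		double secs = (t1.tv_sec - t0.tv_sec) +
			      (t1.tv_nsec - t0.tv_nsec) / 1e9;
		printf("%d threads: %.2fs for %ld increments\n",
		       nthreads, secs, counter);
		return 0;
	}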