blk_zone_wplug_handle_write() increments zwplug->ref via kref_get()
when preparing to handle a zone write. On the error path where
blk_zone_wplug_handle_write_noalloc() fails, the function returns
without calling kref_put() on zwplug->ref, leaking the reference.
Add kref_put(&zwplug->ref, ...) on the error path to properly release
the reference.
Fixes: dd291d77cc90 ("block: Introduce zone write plugging")
Cc: stable@vger.kernel.org
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
---
block/blk-zoned.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 42ef830054dc..24b899663a48 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -1503,6 +1503,7 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs)
if (!blk_zone_wplug_prepare_bio(zwplug, bio)) {
spin_unlock_irqrestore(&zwplug->lock, flags);
+ disk_put_zone_wplug(zwplug);
bio_io_error(bio);
return true;
}
@@ -1511,6 +1512,7 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs)
zwplug->flags |= BLK_ZONE_WPLUG_PLUGGED;
spin_unlock_irqrestore(&zwplug->lock, flags);
+ disk_put_zone_wplug(zwplug);
return false;
--
2.34.1
Hello,
kernel test robot noticed "RIP:disk_free_zone_wplug" on:
commit: d9343256aa173471dbb7f3e02a2177801f2f2136 ("[PATCH] block: blk-zoned: fix zwplug refcount leak on write error path")
url: https://github.com/intel-lab-lkp/linux/commits/Wentao-Liang/block-blk-zoned-fix-zwplug-refcount-leak-on-write-error-path/20260526-234750
base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux.git for-next
patch link: https://lore.kernel.org/all/20260526141824.2293025-1-vulab@iscas.ac.cn/
patch subject: [PATCH] block: blk-zoned: fix zwplug refcount leak on write error path
in testcase: blktests
version: blktests-x86_64-9131687-1_20260529
with following parameters:
test: zbd-004
config: x86_64-rhel-9.4-func
compiler: gcc-14
test machine: 16 threads Intel(R) Core(TM) i7-13620H (Raptor Lake) with 32G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202605311048.34c03950-lkp@intel.com
[ 38.687737][ T420] INFO: lkp CACHE_DIR is /opt/rootfs/tmp
[ 38.687747][ T420]
[ 40.261858][ T1207] loop: module loaded
[ 40.340461][ T1252] null_blk: disk nullb0 created
[ 40.341026][ T1252] null_blk: module loaded
[ 40.347691][ T1255] null_blk: nullb1: using native zone append
[ 40.349880][ T1255] null_blk: disk nullb1 created
[ 40.402539][ T1255] run blktests zbd/004 at 2026-05-30 10:54:31
[ 40.445801][ T1286] ------------[ cut here ]------------
[ 40.446398][ T1286] WARNING: block/blk-zoned.c:590 at disk_free_zone_wplug+0x26b/0x330, CPU#3: dd/1286
[ 40.447008][ T1286] Modules linked in: null_blk loop btrfs libblake2b zstd_compress raid6_pq xor binfmt_misc snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda soundwire_cadence intel_rapl_msr snd_sof_pci snd_sof_xtensa_dsp snd_soc_sdw_utils intel_uncore_frequency intel_uncore_frequency_common x86_pkg_temp_thermal snd_sof snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks soundwire_generic_allocation snd_soc_acpi crc8 soundwire_bus snd_soc_sdca intel_powerclamp coretemp snd_soc_avs snd_soc_hda_codec snd_hda_ext_core spi_pxa2xx_platform snd_hda_codec dw_dmac kvm_intel snd_hda_core spi_pxa2xx_core i915 snd_intel_dspcfg snd_intel_sdw_acpi processor_thermal_device_pci snd_hwdep processor_thermal_device kvm intel_gtt
[ 40.447061][ T1286] processor_thermal_wt_hint drm_buddy snd_soc_core ttm btusb platform_temperature_control iwlwifi btrtl snd_compress processor_thermal_soc_slider drm_display_helper spi_nor btintel processor_thermal_rfim snd_pcm irqbypass think_lmi cec processor_thermal_rapl btbcm rapl intel_rapl_common drm_client_lib btmtk drm_kms_helper mtd intel_pmc_core snd_timer ahci intel_cstate intel_lpss_pci processor_thermal_wt_req cfg80211 firmware_attributes_class wmi_bmof bluetooth pmt_telemetry video libahci processor_thermal_power_floor mei_me snd spi_intel_pci i2c_i801 pmt_discovery processor_thermal_mbox intel_lpss intel_uncore libata pl2303 pmt_class i2c_smbus pcspkr idma64 spi_intel rfkill soundcore mei int340x_thermal_zone wmi intel_pmc_ssram_telemetry int3400_thermal acpi_thermal_rel intel_vsec pinctrl_tigerlake acpi_pad acpi_tad drm fuse nfnetlink
[ 40.452569][ T1286] CPU: 3 UID: 0 PID: 1286 Comm: dd Tainted: G S W 7.1.0-rc3+ #1 PREEMPT(lazy)
[ 40.453213][ T1286] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
[ 40.453849][ T1286] Hardware name: LENOVO 90XW004HPL/336B, BIOS M5LKT1CA 01/06/2025
[ 40.454439][ T1286] RIP: 0010:disk_free_zone_wplug (blk-zoned.c:592 (discriminator 1))
[ 40.455005][ T1286] Code: 5d 41 5e 41 5f e9 f5 fc e2 fe 83 e2 07 38 d0 7f 08 84 c0 0f 85 82 00 00 00 41 c6 04 24 ff e9 15 ff ff ff 0f 0b e9 2c fe ff ff <0f> 0b a8 01 0f 84 f8 fd ff ff 0f 0b e9 f1 fd ff ff e8 bf 7a 61 ff
All code
========
0: 5d pop %rbp
1: 41 5e pop %r14
3: 41 5f pop %r15
5: e9 f5 fc e2 fe jmp 0xfffffffffee2fcff
a: 83 e2 07 and $0x7,%edx
d: 38 d0 cmp %dl,%al
f: 7f 08 jg 0x19
11: 84 c0 test %al,%al
13: 0f 85 82 00 00 00 jne 0x9b
19: 41 c6 04 24 ff movb $0xff,(%r12)
1e: e9 15 ff ff ff jmp 0xffffffffffffff38
23: 0f 0b ud2
25: e9 2c fe ff ff jmp 0xfffffffffffffe56
2a:* 0f 0b ud2 <-- trapping instruction
2c: a8 01 test $0x1,%al
2e: 0f 84 f8 fd ff ff je 0xfffffffffffffe2c
34: 0f 0b ud2
36: e9 f1 fd ff ff jmp 0xfffffffffffffe2c
3b: e8 bf 7a 61 ff call 0xffffffffff617aff
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: a8 01 test $0x1,%al
4: 0f 84 f8 fd ff ff je 0xfffffffffffffe02
a: 0f 0b ud2
c: e9 f1 fd ff ff jmp 0xfffffffffffffe02
11: e8 bf 7a 61 ff call 0xffffffffff617ad5
[ 40.455652][ T1286] RSP: 0018:ffffc9000179f4d8 EFLAGS: 00010246
[ 40.456258][ T1286] RAX: 0000000000000000 RBX: ffff888883a71800 RCX: ffffffff8293c99a
[ 40.456863][ T1286] RDX: 1ffff1111074e30e RSI: 0000000000000004 RDI: ffff888883a71870
[ 40.457503][ T1286] RBP: ffff8881efd97000 R08: 0000000000000001 R09: ffffed111074e30d
[ 40.458107][ T1286] R10: ffff888883a7186f R11: ffff888200fac01c R12: ffff888890e92940
[ 40.458785][ T1286] R13: ffff88889dac03f8 R14: 000000096a349520 R15: ffff8888a6d6b780
[ 40.459408][ T1286] FS: 00007f265573c780(0000) GS:ffff8887cd24b000(0000) knlGS:0000000000000000
[ 40.460016][ T1286] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 40.460681][ T1286] CR2: 000055b746683f88 CR3: 000000019ef16006 CR4: 0000000000f72ef0
[ 40.461295][ T1286] PKRU: 55555554
[ 40.461905][ T1286] Call Trace:
[ 40.462554][ T1286] <TASK>
[ 40.463161][ T1286] blk_mq_finish_request (blk.h:548 blk-mq.c:786)
[ 40.463864][ T1286] __blk_mq_end_request (blk-mq.c:1164)
[ 40.464518][ T1286] null_queue_rq (block/null_blk/main.c:1703 (discriminator 1)) null_blk
[ 40.465132][ T1286] null_queue_rqs (block/null_blk/main.c:1717) null_blk
[ 40.465828][ T1286] ? __pfx_null_queue_rqs (block/null_blk/main.c:1326) null_blk
[ 40.466460][ T1286] ? _raw_spin_lock_irqsave (linux/instrumented.h:55 linux/atomic/atomic-instrumented.h:1301 asm-generic/qspinlock.h:111 linux/spinlock.h:187 linux/spinlock_api_smp.h:133 locking/spinlock.c:166)
[ 40.467069][ T1286] ? __pfx__raw_spin_lock_irqsave (locking/spinlock.c:273)
[ 40.467748][ T1286] blk_mq_dispatch_queue_requests (blk-mq.c:2903 (discriminator 1))
[ 40.468362][ T1286] blk_mq_flush_plug_list (blk-mq.c:2991)
[ 40.469031][ T1286] ? blk_account_io_start (blk-mq.c:1145 blk-mq.c:1121)
[ 40.469695][ T1286] ? __pfx_blk_mq_flush_plug_list (blk-mq.h:364 (discriminator 1))
[ 40.470310][ T1286] ? blk_mq_submit_bio (blk-mq.c:3231)
[ 40.470968][ T1286] __blk_flush_plug (blk-core.c:1229)
[ 40.471617][ T1286] ? __pfx___blk_flush_plug (linux/list.h:46 (discriminator 2))
[ 40.472224][ T1286] ? gup_fast_fallback (gup.c:3202)
[ 40.472917][ T1286] __submit_bio (blk-core.c:1256 blk-core.c:648)
[ 40.473551][ T1286] ? get_page_from_freelist (page_alloc.c:1866 page_alloc.c:3946)
[ 40.474215][ T1286] ? __pfx___submit_bio (blk-core.c:1257 (discriminator 1))
[ 40.474900][ T1286] submit_bio_noacct_nocheck (blk-core.c:721 blk-core.c:752)
[ 40.475528][ T1286] ? __pfx_submit_bio_noacct_nocheck (blk-core.c:712)
[ 40.476196][ T1286] ? submit_bio_noacct (blk-core.c:881)
[ 40.476875][ T1286] bio_await (bio.c:1499)
[ 40.477495][ T1286] ? __pfx_bio_await (bio.c:1471)
[ 40.478094][ T1286] ? bio_iov_iter_get_pages (bio.c:1266)
[ 40.478753][ T1286] submit_bio_wait (bio.c:1517)
[ 40.479354][
[...]
[ 40.483105][ T1286] blkdev_write_iter (fops.c:722 fops.c:790)
[ 40.483748][ T1286] vfs_write (read_write.c:595 read_write.c:688)
[ 40.484345][ T1286] ? __pfx_vfs_write (linux/percpu-rwsem.h:131 (discriminator 38))
[ 40.484934][ T1286] ? __pfx_css_rstat_updated (cgroup/rstat.c:548)
[ 40.485546][ T1286] ? do_syscall_64+
[...]
[ 40.570578][ T1287] ------------[ cut here ]------------
[ 40.571274][ T1287] refcount_t: underflow; use-after-free.
[ 40.571922][ T1287] WARNING: lib/refcount.c:28 at refcount_warn_saturate+0xa9/0xf0, CPU#12: dd/1287
[ 40.572594][ T1287] Modules linked in: null_blk loop btrfs libblake2b zstd_compress raid6_pq xor binfmt_misc snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic snd_hda_intel snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel snd_sof_intel_hda_sdw_bpt snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda soundwire_cadence intel_rapl_msr snd_sof_pci snd_sof_xtensa_dsp snd_soc_sdw_utils
[...]
[ 40.572665][ T1287] processor_thermal_wt_hint drm_buddy snd_soc_core ttm btusb platform_temperature_control iwlwifi btrtl snd_compress processor_thermal_soc_slider drm_display_helper spi_nor btintel processor_thermal_rfim snd_pcm irqbypass think_lmi ce
[...]
[ 40.595153][ T1287] ? __pfx__raw_spin_lock_irqsave (locking/spinlock.c:273)
[ 40.595903][ T1287] blk_mq_dispatch_queue_requests (blk-mq.c:2903 (discriminator 1))
[ 40.596653][ T1287] blk_mq_flush_plug_list (blk-mq.c:2991)
[ 40.597402][ T1287] ? blk_account_io_start (blk-mq.c:1145 blk-mq.c:1121)
[...]
[ 40.617272][ T1287] ? folio_add_lru_vma (swap.c:536)
[ 40.617965][ T1287] ksys_write (read_write.c:740)
[ 40.618658][ T1287] ? __pfx_ksys_write (read_write.c:724)
[ 40.619354][ T1287] ? folio_add_new_anon_rmap (linux/instrumented.h:82 asm-generic/bitops/instrumented-non-atomic.h:141 linux/page-flags.h:843 linux/page-flags.h:864 linux/mm.h:1724 rmap.c:1697)
[ 40.620041][ T1287] do_syscall_
[...]
[ 43.443332][ T420] LKP: stdout: 365: /lkp/lkp/src/bin/run-lkp /lkp/jobs/scheduled/igk-rpl-d05/blktests-zbd-004-debian-13-x86_64-20250902.cgz-d9343256aa17-20260530-48630-1qqgxqy-5.yaml
[ 43.443340][ T420]
[ 43.445914][ T420] RESULT_ROOT=/result/blktests/zbd-004/igk-rpl-d05/debian-13-x86_64-20250902.cgz/x86_64-rhel-9.4-func/gcc-14/d9343256aa173471dbb7f3e02a2177801f2f2136/5
[ 43.445920][ T420]
[ 43.448290][ T420] job=/lkp/jobs/scheduled/igk-rpl-d05/blktests-zbd-004-debian-13-x86_64-20250902.cgz-d9343256aa17-20260530-48630-1qqgxqy-5.yaml
[ 43.448293][ T420]
[ 49.435619][ T420] result_service: raw_upload, RESULT_MNT: /internal-lkp-server/result, RESULT_ROOT: /internal-lkp-server/result/blktests/zbd-004/igk-rpl-d05/debian-13-x86_64-20250902.cgz/x86_64-rhel-9.4-func/gcc-14/d9343256aa173471dbb7f3e02a2177801f2f2136/5, TMP_RESULT_ROOT: /tmp/lkp/result
[ 49.435628][ T420]
[ 49.438631][ T420] run-job /lkp/jobs/scheduled/igk-rpl-d05/blktests-zbd-004-debian-13-x86_64-20250902.cgz-d9343256aa17-20260530-48630-1qqgxqy-5.yaml
[ 49.438634][ T420]
[ 50.297903][ T420] /usr/bin/wget -q --timeout=3600 --tries=1 --local-encoding=UTF-8 http://internal-lkp-server:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/igk-rpl-d05/blktests-zbd-004-debian-13-x86_64-20250902.cgz-d9343256aa17-20260530-48630-1qqgxqy-5.yaml&job_state=running -O /dev/null
[ 50.297912][ T420]
[ 50.300165][ T420] target ucode: 0x4129
[ 50.300168][ T420]
[ 50.301778][ T420] LKP: stdout: 1009: current_version: 4129, target_version: 4129
[ 50.301791][ T420]
[ 50.302957][ T420] check_nr_cpu
[ 50.302959][ T420]
[ 50.304375][ T420] CPU(s): 16
[ 50.304378][ T420]
[ 50.305812][ T420] On-line CPU(s) list: 0-15
[ 50.305815][ T420]
[ 50.307490][ T420] Model name: 13th Gen Intel(R) Core(TM) i7-13620H
[ 50.307494][ T420]
[ 50.308972][ T420] Thread(s) per core: 2
[ 50.308974][ T420]
[ 50.310450][ T420] Core(s) per socket: 10
[ 50.310453][ T420]
[ 50.311846][ T420] Socket(s): 1
[ 50.311849][ T420]
[ 50.313233][ T420] CPU(s) scaling MHz: 30%
[ 50.313236][ T420]
[ 50.314648][ T420] NUMA node(s): 1
[ 50.314650][ T420]
[ 50.316064][ T420] NUMA node0 CPU(s): 0-15
[ 50.316066][ T420]
[ 50.317507][ T420] 2026-05-30 10:54:26 cd /lkp/benchmarks/blktests
[ 50.317509][ T420]
[ 50.319031][ T420] Defaulting to policy_version 2 because kernel supports it.
[ 50.319034][ T420]
[ 50.320596][ T420] Customizing passphrase hashing difficulty for this system...
[ 50.320600][ T420]
[ 50.322106][ T420] Created global config file at "/etc/fscrypt.conf".
[ 50.322108][ T420]
[ 50.323861][ T420] Allow users other than root to create fscrypt metadata on the root filesystem?
[ 50.323864][ T420]
[ 50.326241][ T420] (See https://github.com/google/fscrypt#setting-up-fscrypt-on-a-filesystem) [y/N] Metadata directories created at "/.fscrypt", writable by root only.
[ 50.326244][ T420]
[ 50.327687][ T420] 2026-05-30 10:54:31 echo zbd/004
[ 50.327689][ T420]
[ 50.329072][ T420] 2026-05-30 10:54:31 ./check zbd/004
[ 50.329074][ T420]
[ 50.330721][ T420] zbd/004 => nullb1 (write split across sequential zones)
[ 50.330724][ T420]
[ 50.332420][ T420] zbd/004 => nullb1 (write split across sequential zones) [failed]
[ 50.332423][ T420]
[ 50.333796][ T420] runtime ... 0.246s
[ 50.333798][ T420]
[ 50.335223][ T420] something found in dmesg:
[ 50.335226][ T420]
[ 50.336967][ T420]
[ 40.402539] [ T1255] run blktests zbd/004 at 2026-05-30 10:54:31
[ 50.336970][ T420]
[ 50.338633][ T420]
[ 40.445801] [ T1286] ------------[ cut here ]------------
[ 50.338636][ T420]
[ 50.340724][ T420]
[ 40.446398] [ T1286] WARNING: block/blk-zoned.c:590 at disk_free_zone_wplug+0x26b/0x330, CPU#3: dd/1286
[ 50.340727][ T420]
[ 60.489873][ T1461] EXT4-fs (nvme0n1p3): unmounting filesystem 1516f48d-9247-4757-9a6e-5cfcf67431a2.
[ 62.359503][ T1] watchdog: watchdog0: watchdog did not stop!
[ 62.393655][ T1] watchdog: watchdog0: watchdog did not stop!
[ 62.479655][ T1] r8169 0000:02:00.0 eth0: Link is Down
[ 0.000000][ T0] Linux version 7.1.0-rc3+ (kbuild@6767f1d4f5ea) (gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Sat May 30 06:29:39 CEST 2026
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260531/202605311048.34c03950-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
On 5/26/26 16:18, Wentao Liang wrote:
> blk_zone_wplug_handle_write() increments zwplug->ref via kref_get()
> when preparing to handle a zone write. On the error path where
> blk_zone_wplug_handle_write_noalloc() fails, the function returns
> without calling kref_put() on zwplug->ref, leaking the reference.
>
> Add kref_put(&zwplug->ref, ...) on the error path to properly release
> the reference.
>
> Fixes: dd291d77cc90 ("block: Introduce zone write plugging")
> Cc: stable@vger.kernel.org
> Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
> ---
> block/blk-zoned.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 42ef830054dc..24b899663a48 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -1503,6 +1503,7 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs)
>
> if (!blk_zone_wplug_prepare_bio(zwplug, bio)) {
> spin_unlock_irqrestore(&zwplug->lock, flags);
> + disk_put_zone_wplug(zwplug);
I am not sure if this is needed. The code above adds the
BIO_ZONE_WRITE_PLUGGING flag to the bio, which means the
blk_zone_write_plug_bio_endio would be called which should then call
disk_put_zone_wplug.
I do wonder if there are special cases when blk_zone_bio_endio is not
called.
> bio_io_error(bio);
> return true;
> }
> @@ -1511,6 +1512,7 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs)
> zwplug->flags |= BLK_ZONE_WPLUG_PLUGGED;
>
> spin_unlock_irqrestore(&zwplug->lock, flags);
> + disk_put_zone_wplug(zwplug);
>
> return false;
>
On 5/27/26 3:54 AM, Haris Iqbal wrote:
>
>
> On 5/26/26 16:18, Wentao Liang wrote:
>> blk_zone_wplug_handle_write() increments zwplug->ref via kref_get()
>> when preparing to handle a zone write. On the error path where
>> blk_zone_wplug_handle_write_noalloc() fails, the function returns
>> without calling kref_put() on zwplug->ref, leaking the reference.
>>
>> Add kref_put(&zwplug->ref, ...) on the error path to properly release
>> the reference.
>>
>> Fixes: dd291d77cc90 ("block: Introduce zone write plugging")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
>> ---
>> block/blk-zoned.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index 42ef830054dc..24b899663a48 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -1503,6 +1503,7 @@ static bool blk_zone_wplug_handle_write(struct bio
>> *bio, unsigned int nr_segs)
>> if (!blk_zone_wplug_prepare_bio(zwplug, bio)) {
>> spin_unlock_irqrestore(&zwplug->lock, flags);
>> + disk_put_zone_wplug(zwplug);
>
> I am not sure if this is needed. The code above adds the
> BIO_ZONE_WRITE_PLUGGING flag to the bio, which means the
> blk_zone_write_plug_bio_endio would be called which should then call
> disk_put_zone_wplug.
Correct. This patch is not correct at all. The write plug reference is dropped
in the BIO completion path.
Wentao,
You clearly did not test this at all because if you had, you would have seen
all the warning splats that your patch triggers.
--
Damien Le Moal
Western Digital Research
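The ownership rule described in the reply above (the reference taken at submission belongs to the BIO and is dropped when the BIO completes) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_wplug and every model_* function are hypothetical stand-ins invented for illustration, not the kernel code, and the model counts only the reference held by the in-flight BIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a zone write plug; not the kernel struct. */
struct model_wplug {
	int ref;        /* references currently held */
	bool underflow; /* set when a put happens with no reference left */
};

static void model_get(struct model_wplug *p)
{
	assert(p);
	p->ref++;
}

static void model_put(struct model_wplug *p)
{
	assert(p);
	if (p->ref == 0) {
		/* The kernel reports this as "refcount_t: underflow; use-after-free." */
		p->underflow = true;
		fprintf(stderr, "model: refcount underflow\n");
		return;
	}
	p->ref--;
}

/* Completion side: drops the reference taken at submission, on success or error. */
static void model_bio_endio(struct model_wplug *p)
{
	model_put(p);
}

/*
 * Submission side when the prepare step fails. The failed BIO is still
 * completed, so the completion handler drops the reference. Setting
 * extra_put models the additional put that the patch introduced.
 */
static void model_handle_write_error(struct model_wplug *p, bool extra_put)
{
	model_get(p);         /* reference owned by the in-flight BIO */
	if (extra_put)
		model_put(p); /* the put added on the error path */
	model_bio_endio(p);   /* failing the BIO runs its completion */
}

/* Returns true if the sequence underflowed the count. */
static bool model_run(bool extra_put)
{
	struct model_wplug p = { .ref = 0, .underflow = false };

	model_handle_write_error(&p, extra_put);
	return p.underflow;
}
```

In this model, model_run(false) leaves the count balanced, while model_run(true) performs two puts for a single get and reports an underflow, which matches the kind of warning shown in the test reports in this thread.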
On May 27, 2026 / 08:15, Damien Le Moal wrote:
[...]
> Wentao,
>
> You clearly did not test this at all because if you had, you would have seen
> all the warning splats that your patch triggers.
FYI, the blktests CI run for the patch caught failures at block/017, zbd/004,
zbd/009 and zbd/012.
# RUN_ZONED_TESTS=1 ./check block/017
block/017 (do I/O and check the inflight counter) [passed]
runtime 2.264s ... 2.140s
block/017 (zoned) (do I/O and check the inflight counter) [failed]
runtime 2.107s ... 2.080s
something found in dmesg:
[ 207.429382] [ T1852] run blktests block/017 at 2026-05-27 20:43:45
[ 207.466894] [ T1852] null_blk: nullb1: using native zone append
[ 207.479158] [ T1852] null_blk: disk nullb1 created
[ 207.810531] [ T1956] null_blk: disk nullb0 created
[ 207.811528] [ T1956] null_blk: module loaded
[ 207.830801] [ T1852] null_blk: nullb1: using native zone append
[ 208.404359] [ T1852] null_blk: disk nullb1 created
[ 209.174141] [ C2] ------------[ cut here ]------------
[ 209.175354] [ C2] WARNING: block/blk-zoned.c:590 at disk_free_zone_wplug+0x30c/0x3b0, CPU#2: swapper/2/0
[ 209.176896] [ C2] Modules linked in: null_blk nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr sunrpc 9pnet_virtio 9pnet i2c_piix4 pcspkr netfs i2c_smbus dm_multipath nfnetlink zram vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common vsock bochs drm_client_lib nvme drm_shmem_helper xfs drm_kms_helper sym53c8xx nvme_core floppy nvme_keyring nvme_auth scsi_transport_spi e1000 drm serio_raw ata_generic pata_acpi i2c_dev qemu_fw_cfg virtiofs fuse virtio_console [last unloaded: null_blk]
...
(See '/home/shin/Blktests/blktests/results/nodev_zoned/block/017.dmesg' for the entire message)
# ./check zbd/004 zbd/009 zbd/012
zbd/004 => nullb1 (write split across sequential zones) [failed]
runtime 0.152s ... 0.626s
something found in dmesg:
[ 231.263084] [ T2067] run blktests zbd/004 at 2026-05-27 20:44:08
[ 231.714947] [ T2105] ------------[ cut here ]------------
[ 231.716700] [ T2105] refcount_t: underflow; use-after-free.
[ 231.717849] [ T2105] WARNING: lib/refcount.c:28 at refcount_warn_saturate+0xa9/0xe0, CPU#3: dd/2105
[ 231.720269] [ T2105] Modules linked in: null_blk nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr sunrpc 9pnet_virtio 9pnet i2c_piix4 pcspkr netfs i2c_smbus dm_multipath nfnetlink zram vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common vsock bochs drm_client_lib nvme drm_shmem_helper xfs drm_kms_helper sym53c8xx nvme_core floppy nvme_keyring nvme_auth scsi_transport_spi e1000 drm serio_raw ata_generic pata_acpi i2c_dev qemu_fw_cfg virtiofs fuse virtio_console [last unloaded: null_blk]
[ 231.730390] [ T2105] CPU: 3 UID: 0 PID: 2105 Comm: dd Tainted: G W 7.1.0-rc5+ #3 PREEMPT(full)
[ 231.732289] [ T2105] Tainted: [W]=WARN
[ 231.733281] [ T2105] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-10.fc44 06/10/2025
[ 231.735090] [ T2105] RIP: 0010:refcount_warn_saturate+0xa9/0xe0
[ 231.736514] [ T2105] Code: bd ee 5d 03 67 48 0f b9 3a 5b 5d c3 cc cc cc cc 48 8d 3d ba ee 5d 03 67 48 0f b9 3a 5b 5d e9 ce ea 85 01 48 8d 3d b7 ee 5d 03 <67> 48 0f b9 3a 5b 5d c3 cc cc cc cc 48 8d 3d b4 ee 5d 03 67 48 0f
...
(See '/home/shin/Blktests/blktests/results/nullb1/zbd/004.dmesg' for the entire message)
zbd/009 (test gap zone support with BTRFS) [failed]
runtime 11.646s ... 1.424s
--- tests/zbd/009.out 2023-04-06 10:11:07.928670527 +0900
+++ /home/shin/Blktests/blktests/results/nodev/zbd/009.out.bad 2026-05-27 20:44:12.743034470 +0900
@@ -1,2 +1,4 @@
Running zbd/009
-Test complete
+mount: /home/shin/Blktests/blktests/results/tmpdir.zbd.009.xLW/mnt: wrong fs type, bad option, bad superblock on /dev/sdd, missing codepage or helper program, or other error.
+ dmesg(1) may have more information after failed mount system call.
+Test failed
zbd/012 (test requeuing of zoned writes and queue freezing) [failed]
runtime 42.181s ... 23.791s
--- tests/zbd/012.out 2025-03-06 19:32:02.536851507 +0900
+++ /home/shin/Blktests/blktests/results/nodev/zbd/012.out.bad 2026-05-27 20:44:38.677211476 +0900
@@ -2,6 +2,4 @@
1
2
4
-8
-16
-Test complete
+Test failed
On 2026/05/27 20:47, Shin'ichiro Kawasaki wrote:
> On May 27, 2026 / 08:15, Damien Le Moal wrote:
> [...]
>> Wentao,
>>
>> You clearly did not test this at all because if you had, you would have seen
>> all the warning splats that your patch triggers.
>
> FYI, the blktests CI run for the patch caught failures at block/017, zbd/004,
> zbd/009 and zbd/012.
Thanks Shin'ichiro.
I did a simple manual test issuing an unaligned write with dd on a zloop device.
That was enough to trigger warnings similar to what the CI reported.
--
Damien Le Moal
Western Digital Research