Date: Wed, 11 Mar 2026 16:06:31 +0800 (CST)
From: "Jianzhou Zhao"
To: James.Bottomley@HansenPartnership.com, martin.petersen@oracle.com, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: KCSAN: data-race in scsi_block_when_processing_errors / scsi_host_set_state
Message-ID: <36d59d0e.6db0.19cdbeee01b.Coremail.luckd0g@163.com>

Subject: [BUG] scsi: core: KCSAN: data-race in scsi_block_when_processing_errors / scsi_host_set_state

Dear Maintainers,

We are writing to report a KCSAN-detected data race in the SCSI core subsystem (`drivers/scsi/hosts.c` and `include/scsi/scsi_host.h`). The bug was found by our custom fuzzing tool, RacePilot. The race occurs when the host state changes while error recovery is finishing: `scsi_host_set_state()` writes `shost->shost_state` while `scsi_block_when_processing_errors()` reads it without locking through `scsi_host_in_recovery()`. We observed this bug on Linux kernel version 6.18.0-08691-g2061f18ad76e-dirty.

Call Trace & Context

==================================================================
BUG: KCSAN: data-race in scsi_block_when_processing_errors / scsi_host_set_state

write to 0xffff888009ff4280 of 4 bytes by task 307 on cpu 1:
 scsi_host_set_state+0x92/0x180 drivers/scsi/hosts.c:148
 scsi_restart_operations drivers/scsi/scsi_error.c:2162 [inline]
 scsi_error_handler+0x269/0x840 drivers/scsi/scsi_error.c:2372
 ...

read to 0xffff888009ff4280 of 4 bytes by task 22653 on cpu 0:
 scsi_host_in_recovery include/scsi/scsi_host.h:754 [inline]
 scsi_block_when_processing_errors+0x41/0x240 drivers/scsi/scsi_error.c:388
 sr_open+0x2e/0x60 drivers/scsi/sr.c:609
 cdrom_open+0xbc/0xec0 drivers/cdrom/cdrom.c:1154
 sr_block_open+0x9b/0x120 drivers/scsi/sr.c:512
 blkdev_get_whole+0x55/0x1f0 block/bdev.c:758
 ...
 __x64_sys_openat+0xc2/0x130 fs/open.c:1447

value changed: 0x00000005 -> 0x00000002

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 22653 Comm: syz.1.672 Not tainted 6.18.0-08691-g2061f18ad76e-dirty #42 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
==================================================================

Execution Flow & Code Context

When the SCSI error handler has finished recovery and restarts operations, it invokes `scsi_restart_operations()`, which moves the SCSI host state back to `SHOST_RUNNING` by calling `scsi_host_set_state()`. The new state is stored with a plain assignment:

```c
// drivers/scsi/hosts.c
int scsi_host_set_state(struct Scsi_Host *shost, enum scsi_host_state state)
{
	...
	shost->shost_state = state; // <-- Plain concurrent 4-byte write
	return 0;
	...
}
```

At the same time, a separate thread opening the SCSI CD-ROM device calls `wait_event(sdev->host->host_wait, !scsi_host_in_recovery(sdev->host));` inside `scsi_block_when_processing_errors()`. `wait_event` re-evaluates its condition every time the task is woken, and that condition, `scsi_host_in_recovery()`, reads `shost_state` without holding a lock:

```c
// include/scsi/scsi_host.h
static inline int scsi_host_in_recovery(struct Scsi_Host *shost)
{
	return shost->shost_state == SHOST_RECOVERY || // <-- Plain concurrent 4-byte read
		shost->shost_state == SHOST_CANCEL_RECOVERY ||
		shost->shost_state == SHOST_DEL_RECOVERY ||
		shost->tmf_in_progress;
}
```

Root Cause Analysis

KCSAN flags the race because the plain write to `shost->shost_state` can run concurrently with the plain, lockless reads in the `wait_event` condition. `scsi_host_in_recovery()` may read `shost_state` up to three times per evaluation, and the value can change between those reads (from `SHOST_RECOVERY` to `SHOST_RUNNING`, which the KCSAN trace shows as `0x00000005 -> 0x00000002`). Because neither access is marked, the compiler is permitted to tear, fuse, or repeat the load, and KCSAN will keep reporting the pair.

Unfortunately, we were unable to generate a reproducer for this bug.

Potential Impact

The most visible effect is that KCSAN repeatedly warns when the SCSI block open path is exercised, which obscures other reports. In principle, plain accesses to a concurrently modified variable also allow the compiler to tear or re-load the value, so the predicate could observe an inconsistent state and cause `wait_event` to behave unexpectedly.

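The single-snapshot access pattern at issue can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct demo_host`, `demo_set_state()` and `demo_in_recovery()` are hypothetical stand-ins, not kernel code, and the two macros are simplified volatile-cast versions of the kernel's `READ_ONCE`/`WRITE_ONCE` (they rely on the GCC/Clang `__typeof__` extension). The enum numbering is chosen so that `SHOST_RUNNING` is 2 and `SHOST_RECOVERY` is 5, matching the value change in the KCSAN trace.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified user-space stand-ins for the kernel macros (assumption: the
 * real definitions do additional type checking). They only illustrate the
 * "one marked access" idea and are not a substitute for C11 atomics.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Numbered so that RUNNING == 2 and RECOVERY == 5, as in the trace. */
enum scsi_host_state {
	SHOST_CREATED = 1,
	SHOST_RUNNING,
	SHOST_CANCEL,
	SHOST_DEL,
	SHOST_RECOVERY,
	SHOST_CANCEL_RECOVERY,
	SHOST_DEL_RECOVERY,
};

/* Hypothetical two-field stand-in for struct Scsi_Host. */
struct demo_host {
	enum scsi_host_state shost_state;
	unsigned int tmf_in_progress;
};

/* Writer side: a single marked store instead of a plain assignment. */
static inline void demo_set_state(struct demo_host *h,
				  enum scsi_host_state state)
{
	assert(state >= SHOST_CREATED && state <= SHOST_DEL_RECOVERY);
	WRITE_ONCE(h->shost_state, state);
}

/* Reader side: take one snapshot, then compare only the local copy. */
static inline bool demo_in_recovery(struct demo_host *h)
{
	enum scsi_host_state state = READ_ONCE(h->shost_state);

	return state == SHOST_RECOVERY ||
	       state == SHOST_CANCEL_RECOVERY ||
	       state == SHOST_DEL_RECOVERY ||
	       h->tmf_in_progress;
}
```

With the snapshot in a local variable, all three comparisons see the same value even if another thread changes the state in the middle of the predicate. The plain version may compare against different values within one evaluation.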
Proposed Fix

We propose using `WRITE_ONCE()` in `scsi_host_set_state()` and `READ_ONCE()` in `scsi_host_in_recovery()`, so that the accesses to `shost_state` are marked and the wait-queue predicate evaluates a single snapshot of the state.

```diff
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -145,7 +145,7 @@ int scsi_host_set_state(struct Scsi_Host *shost, enum scsi_host_state state)
 		}
 		break;
 	}
-	shost->shost_state = state;
+	WRITE_ONCE(shost->shost_state, state);
 	return 0;
 
 illegal:
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -751,9 +751,11 @@ static inline struct Scsi_Host *dev_to_shost(struct device *dev)
 
 static inline int scsi_host_in_recovery(struct Scsi_Host *shost)
 {
-	return shost->shost_state == SHOST_RECOVERY ||
-		shost->shost_state == SHOST_CANCEL_RECOVERY ||
-		shost->shost_state == SHOST_DEL_RECOVERY ||
+	enum scsi_host_state state = READ_ONCE(shost->shost_state);
+
+	return state == SHOST_RECOVERY ||
+		state == SHOST_CANCEL_RECOVERY ||
+		state == SHOST_DEL_RECOVERY ||
 		shost->tmf_in_progress;
 }
```

We would be honored if this report is of any help.

Best regards,
RacePilot Team