[v3] scsi: virtio_scsi: move INIT_WORK calls to virtscsi_init

[PATCH v3 0/3] scsi: virtio_scsi: move INIT_WORK calls to virtscsi_init

Posted by Joshua Daley 3 weeks ago

Changelog v2 -> v3:

- switched the order of patches 2 & 3 to fix compilation error.
- added reviewed-by tags.

-----

v2 cover letter:

Changelog v1 -> v2:

- Added 2 additional patches:
  - [PATCH v2 1/3] scsi: virtio_scsi: kick event_list unconditionally
    - Removes the conditions surrounding event_list operations (suggested by Stefan Hajnoczi <stefanha@redhat.com>)
  - [PATCH v2 2/3] scsi: virtio_scsi: remove unnecessary fn declaration
    - Removes virtscsi_handle_event() prototype (suggested by Eric Farman <farman@linux.ibm.com>)

- [PATCH 1/1] -> [PATCH v2 3/3] scsi: virtio_scsi: move INIT_WORK calls to virtscsi_init
  - Removed the condition surrounding INIT_WORK calls

-----

v1 cover letter:

This patch avoids a kernel warning that may occur if a virtio_scsi
controller is detached immediately following a disk detach. See the
commit message for details. The following are instructions to
produce the warning (without the proposed patch).

Timing matters--if all event work items call INIT_WORK before they are
flushed by cancel_work_sync, then the warning will not occur.

The warning will occur consistently if a sleep is added in
virtscsi_kick_event before the INIT_WORK call, like so:

#include <linux/delay.h>

static int virtscsi_kick_event(struct virtio_scsi *vscsi,
			       struct virtio_scsi_event_node *event_node)
{
    int err;
    struct scatterlist sg;
    unsigned long flags;

 -> msleep(1000);
    INIT_WORK(&event_node->work, virtscsi_handle_event);
	
    ...
}

Then, just detach a disk and its controller in quick succession:

virsh detach-device --domain <domain> disk.xml; \
virsh detach-device --domain <domain> controller.xml

where disk.xml and controller.xml are text files containing the XML
of the disk and controller.

Or, with the libvirt python module:

domain.detachDevice(str(disk_xml))
domain.detachDevice(str(controller_xml))

Joshua Daley (3):
  scsi: virtio_scsi: kick event_list unconditionally
  scsi: virtio_scsi: move INIT_WORK calls to virtscsi_init
  scsi: virtio_scsi: remove unnecessary fn declaration

 drivers/scsi/virtio_scsi.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

-- 
2.34.1

Re: [PATCH v3 0/3] scsi: virtio_scsi: move INIT_WORK calls to virtscsi_init

Posted by Martin K. Petersen 2 weeks, 3 days ago

Hi Joshua!

> This patch avoids a kernel warning that may occur if a virtio_scsi
> controller is detached immediately following a disk detach. See the
> commit message for details. The following are instructions to
> produce the warning (without the proposed patch).

A few issues were flagged. Please review:

  https://sashiko.dev/#/patchset/20260316153341.2062278-1-jdaley%40linux.ibm.com

Thanks!

-- 
Martin K. Petersen

Re: [PATCH v3 0/3] scsi: virtio_scsi: move INIT_WORK calls to virtscsi_init

Posted by Joshua Daley 2 weeks ago

On 3/19/2026 10:19 PM, Martin K. Petersen wrote:
> 
> Hi Joshua!
> 
>> This patch avoids a kernel warning that may occur if a virtio_scsi
>> controller is detached immediately following a disk detach. See the
>> commit message for details. The following are instructions to
>> produce the warning (without the proposed patch).
> 
> A few issues were flagged. Please review:
> 
>    https://sashiko.dev/#/patchset/20260316153341.2062278-1-jdaley%40linux.ibm.com
> 
> Thanks!
> 

Hello,

I've reviewed Sashiko's comments and concluded that this series requires further changes. I will 
prepare a new version. Below are my responses to Sashiko's comments.

Thanks!

---

 From Sashiko:
> 
> commit a1db3f127c8402cd51cacfe63d01f15805153a29
> Author: Joshua Daley <jdaley@linux.ibm.com>
> 
> scsi: virtio_scsi: kick event_list unconditionally
> 
> The event_list processes non-hotplug events (such as LUN capacity
> changes), so remove the conditions that guard the initial kicks in
> _probe() and _restore(), as well as the work cancellation in _remove().
> 
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index 0ed8558dad72..982f49bc6c69 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -983,9 +983,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
>>  		goto scsi_add_host_failed;
>>  
>>  	virtio_device_ready(vdev);
>> -
>> -	if (virtio_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG))
>> -		virtscsi_kick_event_all(vscsi);
>> +	virtscsi_kick_event_all(vscsi);
> 
> Does this code violate the virtio-scsi specification?
> 
> The specification mandates that a driver must not place buffers into the
> event virtqueue if neither VIRTIO_SCSI_F_HOTPLUG nor VIRTIO_SCSI_F_CHANGE
> has been negotiated.
> 
> By completely removing the VIRTIO_SCSI_F_HOTPLUG check without expanding it
> to check for VIRTIO_SCSI_F_CHANGE, could this unconditionally populate the
> event queue and cause strict implementations to reject the buffers or
> transition the device into a broken state?
> 

Stefan Hajnoczi <stefanha@redhat.com> addressed this already. "No, this is a hallucination."

>>  	scsi_scan_host(shost);
>>  	return 0;
> [ ... ]
>> @@ -1002,8 +1000,7 @@ static void virtscsi_remove(struct virtio_device *vdev)
>>  	struct Scsi_Host *shost = virtio_scsi_host(vdev);
>>  	struct virtio_scsi *vscsi = shost_priv(shost);
>>  
>> -	if (virtio_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG))
>> -		virtscsi_cancel_event_work(vscsi);
>> +	virtscsi_cancel_event_work(vscsi);
> 
> Can this introduce a use-after-free race during device removal that breaks
> bisection?
> 
> If a concurrent worker thread executing virtscsi_handle_event() reaches
> virtscsi_kick_event(), it invokes INIT_WORK(&event_node->work, ...).
> INIT_WORK() clears work->data bits to 0.
> 
> If cancel_work_sync() runs concurrently in virtscsi_cancel_event_work(), it
> might read work->data as 0, falsely conclude the work item is idle, and
> return immediately without waiting for the work to complete.
> 
> virtscsi_remove() would then proceed to free the vscsi structure via
> scsi_host_put(), while the worker thread continues and dereferences the now
> freed vscsi->event_vq.vq_lock.
> 
> While this specific race appears to be fixed later in the patch series by
> commit b602884db439 ("scsi: virtio_scsi: move INIT_WORK calls to
> virtscsi_init"), removing the feature guard here exposes all virtio-scsi
> instances to this crash, which might break bisection.
> 

see below

>>  	scsi_remove_host(shost);
>>  	virtscsi_remove_vqs(vdev);
> [ ... ]
>> @@ -1028,9 +1025,7 @@ static int virtscsi_restore(struct virtio_device *vdev)
>>  		return err;
>>  
>>  	virtio_device_ready(vdev);
>> -
>> -	if (virtio_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG))
>> -		virtscsi_kick_event_all(vscsi);
>> +	virtscsi_kick_event_all(vscsi);
> 
> Could this cause workqueue linked list corruption during suspend and resume?
> 
> When PM suspend begins, freezable workqueues are frozen. Any unhandled event
> from virtscsi_complete_event() leaves its work item queued. virtscsi_freeze()
> tears down virtqueues but does not cancel pending work or set stop_events =
> true.
> 
> Upon resume, virtscsi_restore() unconditionally calls virtscsi_kick_event_all()
> here, which executes INIT_WORK() on all event nodes. Executing INIT_WORK() on
> an actively queued work_struct blindly resets its list pointers, which silently
> corrupts the frozen workqueue's doubly-linked list. This can lead to a kernel
> panic when the workqueue is subsequently thawed.
> 
> Prior to this patch, this issue only affected devices that negotiated
> VIRTIO_SCSI_F_HOTPLUG, but removing the guard exposes every virtio-scsi
> device to this suspend/resume crash.
> 

Both this and the previous review comment are based on the current issue with the INIT_WORK call 
being located in virtscsi_kick_event().

As the Sashiko pointed out, this is addressed later in the series.

Performing this patch first does extend the INIT_WORK issue to devices that do not negotiate 
hotplug, having a somewhat negative effect on bisection. So in that regard it makes sense to reorder 
the patches such that the INIT_WORK call is moved first.

>>  	return err;
>>  }

> commit b602884db439878b4fde9b7b100a2b2263958fdf
> Author: Joshua Daley <jdaley@linux.ibm.com>
> 
> scsi: virtio_scsi: move INIT_WORK calls to virtscsi_init
> 
> The last step of virtscsi_handle_event is to call virtscsi_kick_event,
> which calls INIT_WORK on it's own work item. INIT_WORK resets the
> work item's data bits to 0.
> 
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index 982f49bc6c69..15ca2a2d7aa4 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -242,7 +242,6 @@ static int virtscsi_kick_event(struct virtio_scsi *vscsi,
>>  	struct scatterlist sg;
>>  	unsigned long flags;
>>  
>> -	INIT_WORK(&event_node->work, virtscsi_handle_event);
>>  	sg_init_one(&sg, event_node->event, sizeof(struct virtio_scsi_event));
>>  
>>  	spin_lock_irqsave(&vscsi->event_vq.vq_lock, flags);
> [ ... ]
>> @@ -898,6 +897,9 @@ static int virtscsi_init(struct virtio_device *vdev,
>>  	virtscsi_config_set(vdev, cdb_size, VIRTIO_SCSI_CDB_SIZE);
>>  	virtscsi_config_set(vdev, sense_size, VIRTIO_SCSI_SENSE_SIZE);
>>  
>> +	for (i = 0; i < VIRTIO_SCSI_EVENT_LEN; i++)
>> +		INIT_WORK(&vscsi->event_list[i].work, virtscsi_handle_event);
> 
> Can this code corrupt the workqueue's linked list during system resume?
> 
> Event works are queued on system_freezable_wq in virtscsi_complete_event():
> 
> virtscsi_complete_event() {
> 	...
> 	if (!vscsi->stop_events)
> 		queue_work(system_freezable_wq, &event_node->work);
> }
> 
> During system suspend, virtscsi_freeze() resets the virtio device but does
> not cancel pending event works, as it lacks a call to
> virtscsi_cancel_event_work(). Because the workqueue is freezable, pending
> works remain actively linked in the workqueue's internal pending list.
> 
> Upon system resume, virtscsi_restore() runs and calls virtscsi_init(),
> which unconditionally executes this INIT_WORK() during resume:
> 
> virtscsi_restore() -> virtscsi_init() {
> 	...
> 	for (i = 0; i < VIRTIO_SCSI_EVENT_LEN; i++)
> 		INIT_WORK(&vscsi->event_list[i].work, virtscsi_handle_event);
> }
> 
> Calling INIT_WORK() on a work item that is actively linked in a workqueue
> executes INIT_LIST_HEAD(&work->entry), overwriting its list pointers while
> adjacent workqueue list elements still point to it. This can result in a
> corrupted workqueue linked list when the workqueue is thawed.

This appears to be a valid concern.

INIT_WORK need only be called on the event nodes ONCE in the lifetime of the device, it should not 
be recalled on resume, especially since work may remain frozen in the workqueue and have its data 
corrupted by this call.

Lets move the calls to virtscsi_probe() instead of virtscsi_init().

Re: [PATCH v3 0/3] scsi: virtio_scsi: move INIT_WORK calls to virtscsi_init

Posted by Stefan Hajnoczi 2 weeks, 3 days ago

On Thu, Mar 19, 2026 at 10:19:00PM -0400, Martin K. Petersen wrote:
> > This patch avoids a kernel warning that may occur if a virtio_scsi
> > controller is detached immediately following a disk detach. See the
> > commit message for details. The following are instructions to
> > produce the warning (without the proposed patch).
> 
> A few issues were flagged. Please review:
> 
>   https://sashiko.dev/#/patchset/20260316153341.2062278-1-jdaley%40linux.ibm.com

Hi Joshua,
I am responding to the following sashiko review comment (haven't figured
out a way to reply in the web UI or via direct email to sashiko). I feel
responsible for this one since I suggested the change that sashiko is
questioning. I haven't looked at the other review comments, please
triage them yourself.

From Sashiko:
> Does this code violate the virtio-scsi specification?
>
> The specification mandates that a driver must not place buffers into the
> event virtqueue if neither VIRTIO_SCSI_F_HOTPLUG nor VIRTIO_SCSI_F_CHANGE
> has been negotiated.
>
> By completely removing the VIRTIO_SCSI_F_HOTPLUG check without expanding it
> to check for VIRTIO_SCSI_F_CHANGE, could this unconditionally populate the
> event queue and cause strict implementations to reject the buffers or
> transition the device into a broken state?

No, this is a hallucination. The spec does not mandate that a driver
must not place buffers into the event virtqueue when neither
VIRTIO_SCSI_F_HOTPLUG nor VIRTIO_SCSI_F_CHANGE has been negotiated:
https://docs.oasis-open.org/virtio/virtio/v1.4/virtio-v1.4.html#x1-4510006

The event virtqueue still serves a purpose when both
VIRTIO_SCSI_F_HOTPLUG and VIRTIO_SCSI_F_CHANGE are not negotiated. For
example, see "Asynchronous notification subscription" and the
VIRTIO_SCSI_T_ASYNC_NOTIFY event type.

Stefan