[PATCH v2] virtio_ring: Fix data race by tagging event_triggered as racy for KCSAN

Zhongqiu Han posted 1 patch 9 months, 1 week ago
drivers/virtio/virtio_ring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH v2] virtio_ring: Fix data race by tagging event_triggered as racy for KCSAN
Posted by Zhongqiu Han 9 months, 1 week ago
syzbot reports a data-race when accessing the event_triggered, here is the
simplified stack when the issue occurred:

==================================================================
BUG: KCSAN: data-race in virtqueue_disable_cb / virtqueue_enable_cb_delayed

write to 0xffff8881025bc452 of 1 bytes by task 3288 on cpu 0:
 virtqueue_enable_cb_delayed+0x42/0x3c0 drivers/virtio/virtio_ring.c:2653
 start_xmit+0x230/0x1310 drivers/net/virtio_net.c:3264
 __netdev_start_xmit include/linux/netdevice.h:5151 [inline]
 netdev_start_xmit include/linux/netdevice.h:5160 [inline]
 xmit_one net/core/dev.c:3800 [inline]

read to 0xffff8881025bc452 of 1 bytes by interrupt on cpu 1:
 virtqueue_disable_cb_split drivers/virtio/virtio_ring.c:880 [inline]
 virtqueue_disable_cb+0x92/0x180 drivers/virtio/virtio_ring.c:2566
 skb_xmit_done+0x5f/0x140 drivers/net/virtio_net.c:777
 vring_interrupt+0x161/0x190 drivers/virtio/virtio_ring.c:2715
 __handle_irq_event_percpu+0x95/0x490 kernel/irq/handle.c:158
 handle_irq_event_percpu kernel/irq/handle.c:193 [inline]

value changed: 0x01 -> 0x00
==================================================================

When the data race occurs, the function virtqueue_enable_cb_delayed() sets
event_triggered to false, and virtqueue_disable_cb_split/packed() reads it
as false due to the race condition. Since event_triggered is an unreliable
hint used for optimization, this should only cause the driver temporarily
suggest that the device not send an interrupt notification when the event
index is used.

Fix this KCSAN reported data-race issue by explicitly tagging the access as
data_racy.

Reported-by: syzbot+efe683d57990864b8c8e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67c7761a.050a0220.15b4b9.0018.GAE@google.com/
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
---
v1 -> v2:
- Use data_race() instead of memory barriers.
- Simplify and rewrite commit messages.
- Link to v1: https://lore.kernel.org/all/20250311131735.3205493-1-quic_zhonhan@quicinc.com/

 drivers/virtio/virtio_ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index fdd2d2b07b5a..b784aab66867 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2650,7 +2650,7 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
 	struct vring_virtqueue *vq = to_vvq(_vq);
 
 	if (vq->event_triggered)
-		vq->event_triggered = false;
+		data_race(vq->event_triggered = false);
 
 	return vq->packed_ring ? virtqueue_enable_cb_delayed_packed(_vq) :
 				 virtqueue_enable_cb_delayed_split(_vq);
-- 
2.25.1
Re: [PATCH v2] virtio_ring: Fix data race by tagging event_triggered as racy for KCSAN
Posted by Jason Wang 9 months ago
On Wed, Mar 12, 2025 at 9:04 PM Zhongqiu Han <quic_zhonhan@quicinc.com> wrote:
>
> syzbot reports a data-race when accessing the event_triggered, here is the
> simplified stack when the issue occurred:
>
> ==================================================================
> BUG: KCSAN: data-race in virtqueue_disable_cb / virtqueue_enable_cb_delayed
>
> write to 0xffff8881025bc452 of 1 bytes by task 3288 on cpu 0:
>  virtqueue_enable_cb_delayed+0x42/0x3c0 drivers/virtio/virtio_ring.c:2653
>  start_xmit+0x230/0x1310 drivers/net/virtio_net.c:3264
>  __netdev_start_xmit include/linux/netdevice.h:5151 [inline]
>  netdev_start_xmit include/linux/netdevice.h:5160 [inline]
>  xmit_one net/core/dev.c:3800 [inline]
>
> read to 0xffff8881025bc452 of 1 bytes by interrupt on cpu 1:
>  virtqueue_disable_cb_split drivers/virtio/virtio_ring.c:880 [inline]
>  virtqueue_disable_cb+0x92/0x180 drivers/virtio/virtio_ring.c:2566
>  skb_xmit_done+0x5f/0x140 drivers/net/virtio_net.c:777
>  vring_interrupt+0x161/0x190 drivers/virtio/virtio_ring.c:2715
>  __handle_irq_event_percpu+0x95/0x490 kernel/irq/handle.c:158
>  handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
>
> value changed: 0x01 -> 0x00
> ==================================================================
>
> When the data race occurs, the function virtqueue_enable_cb_delayed() sets
> event_triggered to false, and virtqueue_disable_cb_split/packed() reads it
> as false due to the race condition. Since event_triggered is an unreliable
> hint used for optimization, this should only cause the driver temporarily
> suggest that the device not send an interrupt notification when the event
> index is used.
>
> Fix this KCSAN reported data-race issue by explicitly tagging the access as
> data_racy.
>
> Reported-by: syzbot+efe683d57990864b8c8e@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/67c7761a.050a0220.15b4b9.0018.GAE@google.com/
> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
> ---
> v1 -> v2:
> - Use data_race() instead of memory barriers.
> - Simplify and rewrite commit messages.
> - Link to v1: https://lore.kernel.org/all/20250311131735.3205493-1-quic_zhonhan@quicinc.com/
>
>  drivers/virtio/virtio_ring.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index fdd2d2b07b5a..b784aab66867 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -2650,7 +2650,7 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
>         struct vring_virtqueue *vq = to_vvq(_vq);
>
>         if (vq->event_triggered)
> -               vq->event_triggered = false;
> +               data_race(vq->event_triggered = false);
>
>         return vq->packed_ring ? virtqueue_enable_cb_delayed_packed(_vq) :
>                                  virtqueue_enable_cb_delayed_split(_vq);
> --
> 2.25.1
>

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks