[PATCH] virtio-net: disable delayed refill when setting up xdp

Bui Quang Minh posted 1 patch 10 months, 1 week ago
Posted by Bui Quang Minh 10 months, 1 week ago
When setting up XDP for a running interface, we call napi_disable() on
the receive queue's napi. The delayed refill_work also calls
napi_disable() on the receive queue's napi. This can lead to a deadlock
when napi_disable() is called on an already disabled napi. This commit
fixes this by disabling future delayed refill work and cancelling all
in-flight delayed refill work before calling napi_disable() in
virtnet_xdp_set().

Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set")
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
 drivers/net/virtio_net.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7e4617216a4b..33406d59efe2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -5956,6 +5956,15 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 	if (!prog && !old_prog)
 		return 0;
 
+	/*
+	 * Make sure refill_work does not run concurrently to
+	 * avoid napi_disable race which leads to deadlock.
+	 */
+	if (netif_running(dev)) {
+		disable_delayed_refill(vi);
+		cancel_delayed_work_sync(&vi->refill);
+	}
+
 	if (prog)
 		bpf_prog_add(prog, vi->max_queue_pairs - 1);
 
@@ -6004,6 +6013,8 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 			virtnet_napi_tx_enable(&vi->sq[i]);
 		}
 	}
+	if (netif_running(dev))
+		enable_delayed_refill(vi);
 
 	return 0;
 
@@ -6019,6 +6030,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 			virtnet_napi_enable(&vi->rq[i]);
 			virtnet_napi_tx_enable(&vi->sq[i]);
 		}
+		enable_delayed_refill(vi);
 	}
 	if (prog)
 		bpf_prog_sub(prog, vi->max_queue_pairs - 1);
-- 
2.43.0
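
To illustrate why a second napi_disable() on an already disabled napi hangs, here is a minimal user-space sketch. It is a toy model, not the kernel implementation: struct toy_napi, toy_napi_disable(), toy_napi_enable() and toy_double_disable_hangs() are hypothetical names, and the max_spins cap is an assumption added so the hang can be reported instead of reproduced. The model only captures that disabling waits for the "scheduled" bit to clear and then takes it, and that only an enable clears it again.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct napi_struct: only the "scheduled" bit is modelled. */
struct toy_napi {
	bool sched;
};

/*
 * Toy model of napi_disable(): wait for the SCHED bit to clear, then take it.
 * The real function waits without bound; here the wait is capped at
 * max_spins iterations (an assumption of this sketch).
 * Returns true if the napi was disabled, false if the wait never ended.
 */
static bool toy_napi_disable(struct toy_napi *n, unsigned int max_spins)
{
	while (n->sched) {
		if (max_spins-- == 0)
			return false;	/* nobody cleared the bit: deadlock */
	}
	n->sched = true;
	return true;
}

/* Toy model of napi_enable(): release the bit taken by the disable. */
static void toy_napi_enable(struct toy_napi *n)
{
	assert(n->sched);
	n->sched = false;
}

/*
 * Scenario from the commit message: one path disables the napi, then a
 * second path tries to disable it again before anyone re-enables it.
 * Returns true if the second disable would hang.
 */
static bool toy_double_disable_hangs(void)
{
	struct toy_napi n = { .sched = false };
	bool first = toy_napi_disable(&n, 1000);	/* e.g. virtnet_xdp_set() */
	bool second = toy_napi_disable(&n, 1000);	/* e.g. refill_work() */

	return first && !second;
}
```

In this model, calling toy_napi_enable() between the two disables lets the second one succeed, which is the ordering the patch tries to guarantee by keeping refill_work out of the way.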
Re: [PATCH] virtio-net: disable delayed refill when setting up xdp
Posted by Bui Quang Minh 10 months, 1 week ago
On 4/2/25 12:42, Bui Quang Minh wrote:
> When setting up XDP for a running interface, we call napi_disable() on
> the receive queue's napi. The delayed refill_work also calls
> napi_disable() on the receive queue's napi. This can lead to a deadlock
> when napi_disable() is called on an already disabled napi. This commit
> fixes this by disabling future delayed refill work and cancelling all
> in-flight delayed refill work before calling napi_disable() in
> virtnet_xdp_set().
>
> Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set")
> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
> ---
>   drivers/net/virtio_net.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7e4617216a4b..33406d59efe2 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -5956,6 +5956,15 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>   	if (!prog && !old_prog)
>   		return 0;
>   
> +	/*
> +	 * Make sure refill_work does not run concurrently to
> +	 * avoid napi_disable race which leads to deadlock.
> +	 */
> +	if (netif_running(dev)) {
> +		disable_delayed_refill(vi);
> +		cancel_delayed_work_sync(&vi->refill);
> +	}
> +
>   	if (prog)
>   		bpf_prog_add(prog, vi->max_queue_pairs - 1);
>   
> @@ -6004,6 +6013,8 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>   			virtnet_napi_tx_enable(&vi->sq[i]);
>   		}
>   	}
> +	if (netif_running(dev))
> +		enable_delayed_refill(vi);
While doing some testing, it looks like we must call try_fill_recv()
to resume the rx path. I'll do more testing and send a v2 patch.
>   
>   	return 0;
>   
> @@ -6019,6 +6030,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>   			virtnet_napi_enable(&vi->rq[i]);
>   			virtnet_napi_tx_enable(&vi->sq[i]);
>   		}
> +		enable_delayed_refill(vi);
>   	}
>   	if (prog)
>   		bpf_prog_sub(prog, vi->max_queue_pairs - 1);
Re: [PATCH] virtio-net: disable delayed refill when setting up xdp
Posted by Bui Quang Minh 10 months, 1 week ago
On 4/3/25 17:43, Bui Quang Minh wrote:
> On 4/2/25 12:42, Bui Quang Minh wrote:
>> When setting up XDP for a running interface, we call napi_disable() on
>> the receive queue's napi. The delayed refill_work also calls
>> napi_disable() on the receive queue's napi. This can lead to a deadlock
>> when napi_disable() is called on an already disabled napi. This commit
>> fixes this by disabling future delayed refill work and cancelling all
>> in-flight delayed refill work before calling napi_disable() in
>> virtnet_xdp_set().
>>
>> Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set")
>> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
>> ---
>>   drivers/net/virtio_net.c | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 7e4617216a4b..33406d59efe2 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -5956,6 +5956,15 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>>       if (!prog && !old_prog)
>>           return 0;
>>
>> +    /*
>> +     * Make sure refill_work does not run concurrently to
>> +     * avoid napi_disable race which leads to deadlock.
>> +     */
>> +    if (netif_running(dev)) {
>> +        disable_delayed_refill(vi);
>> +        cancel_delayed_work_sync(&vi->refill);
>> +    }
>> +
>>       if (prog)
>>           bpf_prog_add(prog, vi->max_queue_pairs - 1);
>>
>> @@ -6004,6 +6013,8 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>>               virtnet_napi_tx_enable(&vi->sq[i]);
>>           }
>>       }
>> +    if (netif_running(dev))
>> +        enable_delayed_refill(vi);
> While doing some testing, it looks like we must call
> try_fill_recv() to resume the rx path. I'll do more testing and send a
> v2 patch.

I've sent a new patch here:
https://lore.kernel.org/virtualization/20250404093903.37416-1-minhquangbui99@gmail.com/T/#u
Because the commit title changed slightly, I did not use the v2 tag.

Thank you,
Quang Minh.


Re: [PATCH] virtio-net: disable delayed refill when setting up xdp
Posted by Paolo Abeni 10 months, 1 week ago
On 4/2/25 7:42 AM, Bui Quang Minh wrote:
> When setting up XDP for a running interface, we call napi_disable() on
> the receive queue's napi. The delayed refill_work also calls
> napi_disable() on the receive queue's napi. This can lead to a deadlock
> when napi_disable() is called on an already disabled napi. This commit
> fixes this by disabling future delayed refill work and cancelling all
> in-flight delayed refill work before calling napi_disable() in
> virtnet_xdp_set().
> 
> Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set")
> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
> ---
>  drivers/net/virtio_net.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7e4617216a4b..33406d59efe2 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -5956,6 +5956,15 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>  	if (!prog && !old_prog)
>  		return 0;
>  
> +	/*
> +	 * Make sure refill_work does not run concurrently to
> +	 * avoid napi_disable race which leads to deadlock.
> +	 */
> +	if (netif_running(dev)) {
> +		disable_delayed_refill(vi);
> +		cancel_delayed_work_sync(&vi->refill);

AFAICS, at this point refill_work() could still be running. Why don't you
need to call flush_delayed_work()?

@Jason: somewhat related, why doesn't virtnet_close() use
flush_delayed_work(), too?

Thanks,

Paolo
Re: [PATCH] virtio-net: disable delayed refill when setting up xdp
Posted by Bui Quang Minh 10 months, 1 week ago
On 4/3/25 14:24, Paolo Abeni wrote:
> On 4/2/25 7:42 AM, Bui Quang Minh wrote:
>> When setting up XDP for a running interface, we call napi_disable() on
>> the receive queue's napi. The delayed refill_work also calls
>> napi_disable() on the receive queue's napi. This can lead to a deadlock
>> when napi_disable() is called on an already disabled napi. This commit
>> fixes this by disabling future delayed refill work and cancelling all
>> in-flight delayed refill work before calling napi_disable() in
>> virtnet_xdp_set().
>>
>> Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set")
>> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
>> ---
>>   drivers/net/virtio_net.c | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 7e4617216a4b..33406d59efe2 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -5956,6 +5956,15 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>>   	if (!prog && !old_prog)
>>   		return 0;
>>   
>> +	/*
>> +	 * Make sure refill_work does not run concurrently to
>> +	 * avoid napi_disable race which leads to deadlock.
>> +	 */
>> +	if (netif_running(dev)) {
>> +		disable_delayed_refill(vi);
>> +		cancel_delayed_work_sync(&vi->refill);
> AFAICS, at this point refill_work() could still be running. Why don't you
> need to call flush_delayed_work()?

AFAIK, cancel_delayed_work_sync() (the synchronous version) provides a
somewhat stronger guarantee than flush_delayed_work(). Internally,
cancel_delayed_work_sync() also calls __flush_work(). It temporarily
disables the work before calling __flush_work(), so even if refill_work
tries to re-queue itself, that re-queue will fail. Since refill_work can
actually re-queue itself, I think we must use cancel_delayed_work_sync()
here.
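
The difference described above can be sketched with a small user-space toy model. This is only an illustration of the reasoning in this mail, not the workqueue implementation: struct toy_work and all toy_* functions are hypothetical names, and the model is single-threaded, so "waiting" for an in-flight handler is represented by simply finishing it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a self-re-arming delayed work item. */
struct toy_work {
	bool pending;		/* queued, waiting to run */
	bool running;		/* handler currently in flight */
	int disable_depth;	/* > 0: queueing attempts are refused */
};

/* Toy queueing: refused while the work is disabled or already pending. */
static bool toy_queue(struct toy_work *w)
{
	if (w->disable_depth > 0 || w->pending)
		return false;
	w->pending = true;
	return true;
}

/*
 * Let an in-flight handler finish. Its last step is an attempt to
 * re-queue itself, as refill_work can do.
 */
static void toy_finish_running(struct toy_work *w)
{
	if (w->running) {
		w->running = false;
		toy_queue(w);
	}
}

/* Toy flush: start anything pending and wait for the handler to finish. */
static void toy_flush(struct toy_work *w)
{
	if (w->pending) {
		w->pending = false;
		w->running = true;
	}
	toy_finish_running(w);
}

/* Toy cancel-sync: disable, drop pending work, wait, then re-enable. */
static void toy_cancel_sync(struct toy_work *w)
{
	w->disable_depth++;
	w->pending = false;
	toy_finish_running(w);	/* the re-queue attempt fails here */
	assert(w->disable_depth > 0);
	w->disable_depth--;
}
```

In this model, toy_flush() on an in-flight item leaves it pending again because the handler re-armed itself, while toy_cancel_sync() leaves it neither pending nor running.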

Thanks,
Quang Minh.