Introduce page_pool support in virtio_net driver to enable page recycling
in RX buffer allocation and avoid repeated page allocator calls. This
applies to mergeable and small buffer modes.
Beyond performance improvements, this patch is a prerequisite for enabling
memory provider-based zero-copy features in virtio_net, specifically devmem
TCP and io_uring ZCRX, which require drivers to use page_pool for buffer
management.
The implementation preserves the DMA premapping optimization introduced in
commit 31f3cd4e5756 ("virtio-net: rq submits premapped per-buffer") by
conditionally using PP_FLAG_DMA_MAP when the virtio backend supports
standard DMA API (vhost, virtio-pci), and falling back to allocation-only
mode for backends with custom DMA mechanisms (VDUSE).
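To illustrate the conditional strategy, here is a minimal sketch of the per-queue pool setup in the spirit of this series, written as if it lived in drivers/net/virtio_net.c. The helper name virtnet_create_page_pool() and the rq->page_pool field are illustrative assumptions, not the patch's exact code:

/* Sketch: one page_pool per RX queue, created at probe time. */
static int virtnet_create_page_pool(struct virtnet_info *vi,
				    struct receive_queue *rq)
{
	struct device *dma_dev = virtqueue_dma_dev(rq->vq);
	struct page_pool_params pp = {
		.order		= 0,
		.pool_size	= virtqueue_get_vring_size(rq->vq),
		.nid		= dev_to_node(&vi->vdev->dev),
		.napi		= &rq->napi,
		.netdev		= vi->dev,
	};

	if (dma_dev) {
		/* vhost/virtio-pci: let page_pool premap pages once. */
		pp.flags   = PP_FLAG_DMA_MAP;
		pp.dev     = dma_dev;
		pp.dma_dir = DMA_FROM_DEVICE;
	}
	/* else (e.g. VDUSE): allocation/recycling only, no DMA mapping. */

	rq->page_pool = page_pool_create(&pp);
	if (IS_ERR(rq->page_pool)) {
		int err = PTR_ERR(rq->page_pool);

		rq->page_pool = NULL;
		return err;
	}
	return 0;
}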
Changes in v2
=============
Addressing reviewer feedback from v1:
- Add "select PAGE_POOL" to Kconfig (Jason Wang)
- Move page pool creation from ndo_open to probe for device lifetime
management (Xuan Zhuo, Jason Wang)
- Implement conditional DMA strategy using virtqueue_dma_dev():
- When non-NULL: use PP_FLAG_DMA_MAP for page_pool-managed DMA premapping
- When NULL (VDUSE): page_pool handles allocation only
- Use page_pool_get_dma_addr() + virtqueue_add_inbuf_premapped() to
preserve the DMA premapping optimization from commit 31f3cd4e5756
("virtio-net: rq submits premapped per-buffer") (Jason Wang); see the
sketch after this list
- Remove dual allocation code paths - page_pool now always used for
small/mergeable modes (Jason Wang)
- Remove unused virtnet_rq_alloc/virtnet_rq_init_one_sg functions
- Add comprehensive performance data (Michael S. Tsirkin)
- v1 link: https://lore.kernel.org/virtualization/20260106221924.123856-1-vishs@meta.com/
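As referenced above, a rough sketch of the refill path that combines page_pool allocation with premapped submission follows. It is simplified: the mergeable-buffer context/headroom bookkeeping is omitted, virtnet_add_recvbuf_pp() is an illustrative name, and the non-premapped fallback shown is an assumption rather than the patch's literal code:

/* Sketch: post one RX buffer taken from the page_pool. */
static int virtnet_add_recvbuf_pp(struct receive_queue *rq,
				  unsigned int len, gfp_t gfp)
{
	unsigned int offset;
	struct page *page;
	void *buf;

	page = page_pool_alloc_frag(rq->page_pool, &offset, len, gfp);
	if (!page)
		return -ENOMEM;
	buf = page_address(page) + offset;

	if (virtqueue_dma_dev(rq->vq)) {
		/* The page was premapped by page_pool; hand the device
		 * address straight to the virtqueue and skip a per-buffer
		 * dma_map call in the hot path.
		 */
		sg_init_table(rq->sg, 1);
		rq->sg[0].dma_address = page_pool_get_dma_addr(page) + offset;
		rq->sg[0].length = len;
		return virtqueue_add_inbuf_premapped(rq->vq, rq->sg, 1,
						     buf, NULL, gfp);
	}

	/* No usable DMA device (e.g. VDUSE): let the transport map it. */
	sg_init_one(rq->sg, buf, len);
	return virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, NULL, gfp);
}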
Performance Results
===================
Tested using iperf3 TCP_STREAM with virtio-net on a vhost backend.
Each run lasted 300 seconds; results show throughput and TCP retransmissions.
The base kernel is synced to the net tree at commit 709bbb015538.
Mergeable Buffer Mode (mrg_rxbuf=on, GSO enabled, MTU 1500):
+--------+---------+---------+------------+------------+--------+--------+
| Queues | Streams | Patch   | Throughput | Retries    | Delta  | Retry% |
+--------+---------+---------+------------+------------+--------+--------+
| 1      | 1       | base    | 25.7 Gbps  | 0          | -      | -      |
| 1      | 1       | pp      | 26.2 Gbps  | 0          | +1.9%  | 0%     |
+--------+---------+---------+------------+------------+--------+--------+
| 8      | 8       | base    | 95.6 Gbps  | 236,432    | -      | -      |
| 8      | 8       | pp      | 97.9 Gbps  | 188,249    | +2.4%  | -20.4% |
+--------+---------+---------+------------+------------+--------+--------+
Small Buffer Mode (mrg_rxbuf=off, GSO disabled, MTU 1500):
+--------+---------+---------+------------+------------+--------+--------+
| Queues | Streams | Patch   | Throughput | Retries    | Delta  | Retry% |
+--------+---------+---------+------------+------------+--------+--------+
| 1      | 1       | base    | 9.17 Gbps  | 15,152     | -      | -      |
| 1      | 1       | pp      | 9.19 Gbps  | 12,203     | +0.2%  | -19.5% |
+--------+---------+---------+------------+------------+--------+--------+
| 8      | 8       | base    | 43.0 Gbps  | 974,500    | -      | -      |
| 8      | 8       | pp      | 44.7 Gbps  | 717,411    | +4.0%  | -26.4% |
+--------+---------+---------+------------+------------+--------+--------+
Testing
=======
The patches have been tested with:
- iperf3 bulk transfer workloads (multiple queue/stream configurations)
- Included selftests for buffer circulation verification
- Edge case testing: device unbind/bind cycles, rapid interface open/close,
traffic during close, ethtool feature toggling, close with pending refill
work, and data integrity verification
Vishwanath Seshagiri (2):
virtio_net: add page_pool support for buffer allocation
selftests: virtio_net: add buffer circulation test
drivers/net/Kconfig | 1 +
drivers/net/virtio_net.c | 353 ++++++++++--------
.../drivers/net/virtio_net/basic_features.sh | 70 ++++
3 files changed, 273 insertions(+), 151 deletions(-)
--
2.47.3
On Thu, Jan 29, 2026 at 5:20 AM Vishwanath Seshagiri <vishs@meta.com> wrote:
>
> Introduce page_pool support in virtio_net driver to enable page recycling
> in RX buffer allocation and avoid repeated page allocator calls. This
> applies to mergeable and small buffer modes.
>
> Beyond performance improvements, this patch is a prerequisite for enabling
> memory provider-based zero-copy features in virtio_net, specifically devmem
> TCP and io_uring ZCRX, which require drivers to use page_pool for buffer
> management.
>
> The implementation preserves the DMA premapping optimization introduced in
> commit 31f3cd4e5756 ("virtio-net: rq submits premapped per-buffer") by
> conditionally using PP_FLAG_DMA_MAP when the virtio backend supports
> standard DMA API (vhost, virtio-pci), and falling back to allocation-only
> mode for backends with custom DMA mechanisms (VDUSE).
>
> Changes in v2
> =============
>
> Addressing reviewer feedback from v1:
>
> - Add "select PAGE_POOL" to Kconfig (Jason Wang)
> - Move page pool creation from ndo_open to probe for device lifetime
> management (Xuan Zhuo, Jason Wang)
> - Implement conditional DMA strategy using virtqueue_dma_dev():
> - When non-NULL: use PP_FLAG_DMA_MAP for page_pool-managed DMA premapping
> - When NULL (VDUSE): page_pool handles allocation only
> - Use page_pool_get_dma_addr() + virtqueue_add_inbuf_premapped() to
> preserve DMA premapping optimization from commit 31f3cd4e5756
> ("virtio-net: rq submits premapped per-buffer") (Jason Wang)
> - Remove dual allocation code paths - page_pool now always used for
> small/mergeable modes (Jason Wang)
> - Remove unused virtnet_rq_alloc/virtnet_rq_init_one_sg functions
> - Add comprehensive performance data (Michael S. Tsirkin)
> - v1 link: https://lore.kernel.org/virtualization/20260106221924.123856-1-vishs@meta.com/
>
> Performance Results
> ===================
>
> Tested using iperf3 TCP_STREAM with virtio-net on vhost backend.
> 300-second runs, results show throughput and TCP retransmissions.
> The base kernel is synced to net tree and commit: 709bbb015538.
>
> Mergeable Buffer Mode (mrg_rxbuf=on, GSO enabled, MTU 1500):
> +--------+---------+---------+------------+------------+--------+--------+
> | Queues | Streams | Patch | Throughput | Retries | Delta | Retry% |
> +--------+---------+---------+------------+------------+--------+--------+
> | 1 | 1 | base | 25.7 Gbps | 0 | - | - |
> | 1 | 1 | pp | 26.2 Gbps | 0 | +1.9% | 0% |
> +--------+---------+---------+------------+------------+--------+--------+
> | 8 | 8 | base | 95.6 Gbps | 236,432 | - | - |
> | 8 | 8 | pp | 97.9 Gbps | 188,249 | +2.4% | -20.4% |
> +--------+---------+---------+------------+------------+--------+--------+
>
> Small Buffer Mode (mrg_rxbuf=off, GSO disabled, MTU 1500):
> +--------+---------+---------+------------+------------+--------+--------+
> | Queues | Streams | Patch | Throughput | Retries | Delta | Retry% |
> +--------+---------+---------+------------+------------+--------+--------+
> | 1 | 1 | base | 9.17 Gbps | 15,152 | - | - |
> | 1 | 1 | pp | 9.19 Gbps | 12,203 | +0.2% | -19.5% |
> +--------+---------+---------+------------+------------+--------+--------+
> | 8 | 8 | base | 43.0 Gbps | 974,500 | - | - |
> | 8 | 8 | pp | 44.7 Gbps | 717,411 | +4.0% | -26.4% |
> +--------+---------+---------+------------+------------+--------+--------+
It would be better to have more benchmarks, like:
PPS (using pktgen on the host and XDP_DROP in the guest)
That way we can see PPS as well as XDP performance.
Thanks
>
> Testing
> =======
>
> The patches have been tested with:
> - iperf3 bulk transfer workloads (multiple queue/stream configurations)
> - Included selftests for buffer circulation verification
> - Edge case testing: device unbind/bind cycles, rapid interface open/close,
> traffic during close, ethtool feature toggling, close with pending refill
> work, and data integrity verification
>
> Vishwanath Seshagiri (2):
> virtio_net: add page_pool support for buffer allocation
> selftests: virtio_net: add buffer circulation test
>
> drivers/net/Kconfig | 1 +
> drivers/net/virtio_net.c | 353 ++++++++++--------
> .../drivers/net/virtio_net/basic_features.sh | 70 ++++
> 3 files changed, 273 insertions(+), 151 deletions(-)
>
> --
> 2.47.3
>
On Wed, 28 Jan 2026 13:20:29 -0800 Vishwanath Seshagiri wrote:
> Introduce page_pool support in virtio_net driver to enable page recycling
> in RX buffer allocation and avoid repeated page allocator calls. This
> applies to mergeable and small buffer modes.
>
> Beyond performance improvements, this patch is a prerequisite for enabling
> memory provider-based zero-copy features in virtio_net, specifically devmem
> TCP and io_uring ZCRX, which require drivers to use page_pool for buffer
> management.

Struggles to boot in the CI:

[ 11.424197][ C0] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN
[ 11.424454][ C0] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 11.424606][ C0] CPU: 0 UID: 0 PID: 271 Comm: ip Not tainted 6.19.0-rc6-virtme #1 PREEMPT(full)
[ 11.424784][ C0] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 11.424913][ C0] RIP: 0010:page_pool_alloc_frag_netmem+0x34/0x8e0
[ 11.425054][ C0] Code: b8 00 00 00 00 00 fc ff df 41 57 41 89 c8 41 56 41 55 41 89 d5 48 89 fa 41 54 48 c1 ea 03 49 89 f4 55 53 48 89 fb 48 83 ec 30 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 32 05 00 00 8b 0b 83 f9 3f 0f
[ 11.425413][ C0] RSP: 0018:ffa0000000007a00 EFLAGS: 00010286
[ 11.425544][ C0] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000820
[ 11.425697][ C0] RDX: 0000000000000000 RSI: ffa0000000007ac0 RDI: 0000000000000000
[ 11.425846][ C0] RBP: 1ff4000000000f54 R08: 0000000000000820 R09: fff3fc0000000f8f
[ 11.426000][ C0] R10: fff3fc0000000f90 R11: 0000000000000001 R12: ffa0000000007ac0
[ 11.426156][ C0] R13: 0000000000000600 R14: ff11000008719d00 R15: ff1100000926de00
[ 11.426308][ C0] FS: 00007f4db5334400(0000) GS:ff110000786fe000(0000) knlGS:0000000000000000
[ 11.426487][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 11.426617][ C0] CR2: 00000000004e5e60 CR3: 000000000b9da001 CR4: 0000000000771ef0
[ 11.426768][ C0] PKRU: 55555554
[ 11.426846][ C0] Call Trace:
[ 11.426927][ C0] <IRQ>
[ 11.426981][ C0] ? alloc_chain_hlocks+0x1e5/0x5c0
[ 11.427085][ C0] ? buf_to_xdp.isra.0+0x2f0/0x2f0
[ 11.427190][ C0] page_pool_alloc_frag+0xe/0x20
[ 11.427289][ C0] add_recvbuf_mergeable+0x1e0/0x940
[ 11.427392][ C0] ? page_to_skb+0x760/0x760
[ 11.427497][ C0] ? lock_acquire.part.0+0xbc/0x260
[ 11.427597][ C0] ? find_held_lock+0x2b/0x80
[ 11.427696][ C0] try_fill_recv+0x180/0x240
[ 11.427794][ C0] virtnet_poll+0xc79/0x1450
[ 11.427901][ C0] ? receive_buf+0x690/0x690
[ 11.428005][ C0] ? virtnet_xdp_handler+0x900/0x900
[ 11.428109][ C0] ? do_raw_spin_unlock+0x59/0x250
[ 11.428208][ C0] ? rcu_is_watching+0x15/0xd0
[ 11.428310][ C0] __napi_poll.constprop.0+0x97/0x390
[ 11.428415][ C0] net_rx_action+0x4f6/0xed0
[ 11.428517][ C0] ? run_backlog_napi+0x90/0x90
[ 11.428617][ C0] ? sched_balance_domains+0x270/0xd40
[ 11.428721][ C0] ? lockdep_hardirqs_on_prepare.part.0+0x9a/0x160
[ 11.428844][ C0] ? lockdep_hardirqs_on+0x84/0x130
[ 11.428949][ C0] ? sched_balance_update_blocked_averages+0x137/0x1a0
[ 11.429073][ C0] ? mark_held_locks+0x40/0x70
[ 11.429172][ C0] handle_softirqs+0x1d7/0x840
[ 11.429271][ C0] ? _local_bh_enable+0xd0/0xd0
[ 11.429371][ C0] ? __flush_smp_call_function_queue+0x449/0x6d0
[ 11.429497][ C0] ? rcu_is_watching+0x15/0xd0
[ 11.429597][ C0] do_softirq+0xa9/0xe0

https://netdev-ctrl.bots.linux.dev/logs/vmksft/virtio/results/494081/1-basic-features-sh/stderr
--
pw-bot: cr
On 1/28/26 5:37 PM, Jakub Kicinski wrote:
> On Wed, 28 Jan 2026 13:20:29 -0800 Vishwanath Seshagiri wrote:
>> Introduce page_pool support in virtio_net driver to enable page recycling
>> in RX buffer allocation and avoid repeated page allocator calls. This
>> applies to mergeable and small buffer modes.
>>
>> Beyond performance improvements, this patch is a prerequisite for enabling
>> memory provider-based zero-copy features in virtio_net, specifically devmem
>> TCP and io_uring ZCRX, which require drivers to use page_pool for buffer
>> management.
>
> Struggles to boot in the CI:
>
> [ 11.424197][ C0] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN
> [ 11.424454][ C0] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> [...]
> [ 11.424913][ C0] RIP: 0010:page_pool_alloc_frag_netmem+0x34/0x8e0
> [...]
> [ 11.427190][ C0] page_pool_alloc_frag+0xe/0x20
> [ 11.427289][ C0] add_recvbuf_mergeable+0x1e0/0x940
> [ 11.427696][ C0] try_fill_recv+0x180/0x240
> [ 11.427794][ C0] virtnet_poll+0xc79/0x1450
> [...]
>
> https://netdev-ctrl.bots.linux.dev/logs/vmksft/virtio/results/494081/1-basic-features-sh/stderr

The CI error is a bug in my patch where page pools are not created for all
queues when num_online_cpus < max_queue_pairs. This is the issue Jason
caught about using max_queue_pairs instead of curr_queue_pairs. I will fix
it in v3.
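For reference, the fix described above amounts to sizing the pool-creation loop by max_queue_pairs rather than curr_queue_pairs. A hedged sketch, reusing the hypothetical virtnet_create_page_pool() helper named earlier in the thread (not the literal v3 code):

/* Sketch: create a pool for every possible RX queue at probe time, so
 * queues beyond curr_queue_pairs (e.g. when num_online_cpus() is smaller
 * than max_queue_pairs) still have a valid page_pool once enabled later.
 */
static int virtnet_create_page_pools(struct virtnet_info *vi)
{
	int i, err;

	for (i = 0; i < vi->max_queue_pairs; i++) {
		err = virtnet_create_page_pool(vi, &vi->rq[i]);
		if (err)
			goto err_unwind;
	}
	return 0;

err_unwind:
	while (--i >= 0) {
		page_pool_destroy(vi->rq[i].page_pool);
		vi->rq[i].page_pool = NULL;
	}
	return err;
}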