This patchset fix a possible time window problem for page_pool and
the dma API misuse problem as mentioned in [1], and try to avoid the
overhead of the fixing using some optimization.
From the below performance data, the overhead is not so obvious
due to performance variations in arm64 server and less than 1 ns in
x86 server for time_bench_page_pool01_fast_path() and
time_bench_page_pool02_ptr_ring, and there is about 10~20ns overhead
for time_bench_page_pool03_slow(), see more detail in [2].
arm64 server:
Before this patchset:
fast_path ptr_ring slow
1. 31.171 ns 60.980 ns 164.917 ns
2. 28.824 ns 60.891 ns 170.241 ns
3. 14.236 ns 60.583 ns 164.355 ns
With patchset:
6. 26.163 ns 53.781 ns 189.450 ns
7. 26.189 ns 53.798 ns 189.466 ns
X86 server:
| Test name |Cycles | 1-5 | | Nanosec | 1-5 | | % |
| (tasklet_*)|Before | After |diff| Before | After | diff | change |
|------------+-------+-------+----+---------+--------+--------+--------|
| fast_path | 19 | 19 | 0| 5.399 | 5.492 | 0.093 | 1.7 |
| ptr_ring | 54 | 57 | 3| 15.090 | 15.849 | 0.759 | 5.0 |
| slow | 238 | 284 | 46| 66.134 | 78.909 | 12.775 | 19.3 |
And about 16 bytes of memory is also needed for each page_pool owned
page to fix the dma API misuse problem
1. https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@kernel.org/T/
2. https://lore.kernel.org/all/f558df7a-d983-4fc5-8358-faf251994d23@kernel.org/
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
CC: Robin Murphy <robin.murphy@arm.com>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: IOMMU <iommu@lists.linux.dev>
CC: MM <linux-mm@kvack.org>
Change log:
V8:
1. Drop last 3 patch as it causes observable performance degradation
for x86 system.
2. Remove rcu read lock in page_pool_napi_local().
3. Renaming item function more consistently.
V7:
1. Fix a used-after-free bug reported by KASAN as mentioned by Jakub.
2. Fix the 'netmem' variable not setting up correctly bug as mentioned
by Simon.
V6:
1. Repost based on latest net-next.
2. Rename page_pool_to_pp() to page_pool_get_pp().
V5:
1. Support unlimit inflight pages.
2. Add some optimization to avoid the overhead of fixing bug.
V4:
1. use scanning to do the unmapping
2. spilt dma sync skipping into separate patch
V3:
1. Target net-next tree instead of net tree.
2. Narrow the rcu lock as the discussion in v2.
3. Check the ummapping cnt against the inflight cnt.
V2:
1. Add a item_full stat.
2. Use container_of() for page_pool_to_pp().
Yunsheng Lin (5):
page_pool: introduce page_pool_get_pp() API
page_pool: fix timing for checking and disabling napi_local
page_pool: fix IOMMU crash when driver has already unbound
page_pool: support unlimited number of inflight pages
page_pool: skip dma sync operation for inflight pages
drivers/net/ethernet/freescale/fec_main.c | 8 +-
.../ethernet/google/gve/gve_buffer_mgmt_dqo.c | 2 +-
drivers/net/ethernet/intel/iavf/iavf_txrx.c | 6 +-
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 14 +-
drivers/net/ethernet/intel/libeth/rx.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/en/xdp.c | 3 +-
drivers/net/netdevsim/netdev.c | 6 +-
drivers/net/wireless/mediatek/mt76/mt76.h | 2 +-
include/linux/mm_types.h | 2 +-
include/linux/skbuff.h | 1 +
include/net/libeth/rx.h | 3 +-
include/net/netmem.h | 22 +-
include/net/page_pool/helpers.h | 15 +
include/net/page_pool/types.h | 46 +-
net/core/devmem.c | 4 +-
net/core/netmem_priv.h | 5 +-
net/core/page_pool.c | 425 ++++++++++++++++--
net/core/page_pool_priv.h | 10 +-
net/core/xdp.c | 3 +-
19 files changed, 500 insertions(+), 79 deletions(-)
--
2.33.0