This patchset introduces a 4-tuple hash for connected udp sockets, to make
connected udp lookup faster. Testing with udpgso_bench_tx/rx shows no
obvious difference with and without this patchset for unconnected socket
receiving (see v4 thread).
Patch1: Add a new counter for hslot2 named hash4_cnt, to avoid a cache line
miss during lookup.
Patch2 and 3: Implement 4-tuple hash for ipv4.
(The ipv6 counterpart is in progress.)
The detailed motivation is described in Patch 3.
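
For illustration only, the lookup flow described above can be modelled in
user space roughly as follows. This is a sketch under assumptions, not the
code in the patches: every toy_* name, the hash functions, and the table
size are hypothetical. It only shows the idea that hash4_cnt sits in the
hash2 slot, which the lookup touches anyway, so the 4-tuple table is
consulted only when that counter is non-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 16u

/* Toy user-space model; every name here is hypothetical, not kernel API. */
struct toy_sock {
	uint32_t laddr, faddr;		/* local / foreign IPv4 address */
	uint16_t lport, fport;		/* local / foreign port */
	int connected;			/* non-zero once a peer is fixed */
	struct toy_sock *next2;		/* chain in the 2-tuple (hash2) slot */
	struct toy_sock *next4;		/* chain in the 4-tuple (hash4) slot */
};

struct toy_slot2 {
	struct toy_sock *head;
	unsigned int hash4_cnt;		/* connected sockets hashed via this slot */
};

struct toy_table {
	struct toy_slot2 hash2[TOY_SLOTS];
	struct toy_sock *hash4[TOY_SLOTS];
};

static unsigned int toy_hash2(uint32_t laddr, uint16_t lport)
{
	return ((laddr * 2654435761u) ^ lport) % TOY_SLOTS;
}

static unsigned int toy_hash4(uint32_t laddr, uint16_t lport,
			      uint32_t faddr, uint16_t fport)
{
	uint32_t ports = ((uint32_t)lport << 16) | fport;

	return (((laddr ^ faddr) * 2654435761u) ^ ports) % TOY_SLOTS;
}

/* Every socket goes into hash2; connected ones also go into hash4. */
static void toy_insert(struct toy_table *t, struct toy_sock *sk)
{
	struct toy_slot2 *s2;
	unsigned int h4;

	assert(t != NULL && sk != NULL);
	s2 = &t->hash2[toy_hash2(sk->laddr, sk->lport)];
	sk->next2 = s2->head;
	s2->head = sk;
	if (!sk->connected)
		return;
	h4 = toy_hash4(sk->laddr, sk->lport, sk->faddr, sk->fport);
	sk->next4 = t->hash4[h4];
	t->hash4[h4] = sk;
	s2->hash4_cnt++;
}

static struct toy_sock *toy_lookup(struct toy_table *t,
				   uint32_t laddr, uint16_t lport,
				   uint32_t faddr, uint16_t fport)
{
	struct toy_slot2 *s2 = &t->hash2[toy_hash2(laddr, lport)];
	struct toy_sock *sk;

	/* Cheap check on the slot we already loaded; skip hash4 if empty. */
	if (s2->hash4_cnt) {
		sk = t->hash4[toy_hash4(laddr, lport, faddr, fport)];
		for (; sk; sk = sk->next4)
			if (sk->laddr == laddr && sk->lport == lport &&
			    sk->faddr == faddr && sk->fport == fport)
				return sk;
	}
	/* Fall back to the 2-tuple chain for unconnected sockets. */
	for (sk = s2->head; sk; sk = sk->next2)
		if (!sk->connected &&
		    sk->laddr == laddr && sk->lport == lport)
			return sk;
	return NULL;
}
```

In this model an unconnected receiver pays only for the hash4_cnt test, which
is in line with the "no obvious difference" result reported above for
unconnected sockets.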
The 4-tuple hash increases the size of udp_sock and udp_hslot. Thus it is
added under a CONFIG_BASE_SMALL check, i.e., it is a no-op with
CONFIG_BASE_SMALL.
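
As a rough sketch of that compile-time pattern (not the actual code in the
patches), the extra field and its helpers can be compiled out as below.
DEMO_BASE_SMALL is a stand-in macro for CONFIG_BASE_SMALL, and the demo_*
struct and helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for CONFIG_BASE_SMALL; build with -DDEMO_BASE_SMALL=1 to flip. */
#ifndef DEMO_BASE_SMALL
#define DEMO_BASE_SMALL 0
#endif

static_assert(DEMO_BASE_SMALL == 0 || DEMO_BASE_SMALL == 1,
	      "DEMO_BASE_SMALL must be 0 or 1");

/* Hypothetical slot: the counter exists only in the non-small build. */
struct demo_hslot2 {
	unsigned int count;
#if !DEMO_BASE_SMALL
	unsigned int hash4_cnt;
#endif
};

#if DEMO_BASE_SMALL
/* Small build: helpers are stubs, so callers need no #ifdef of their own. */
static inline bool demo_has_hash4(const struct demo_hslot2 *s)
{
	(void)s;
	return false;
}

static inline void demo_hash4_inc(struct demo_hslot2 *s)
{
	(void)s;
}
#else
static inline bool demo_has_hash4(const struct demo_hslot2 *s)
{
	return s->hash4_cnt != 0;
}

static inline void demo_hash4_inc(struct demo_hslot2 *s)
{
	s->hash4_cnt++;
}
#endif
```

With the stubs in place, call sites stay identical in both configurations and
the small build carries neither the extra field nor any hash4 work.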
AFAICS the patchset can be further improved by:
(a) Better interaction with hash2/reuseport. Currently hash4 hardly affects
other mechanisms, but maintaining sockets in both hash4 and hash2 lists
seems unnecessary.
(b) Supporting early demux and ipv6.
changelogs:
v4 -> v5 (Paolo Abeni):
- Add a CONFIG_BASE_SMALL check, under which udp hash4 does nothing
v3 -> v4 (Willem de Bruijn):
- fix mistakes in udp_pernet_table_alloc()
RFCv2 -> v3 (Gur Stavi):
- minor fix in udp_hashslot2() and udp_table_init()
- add rcu sync in rehash4()
RFCv1 -> RFCv2:
- add a new struct for hslot2
- remove the sockopt UDP_HASH4 because hash4 has little side effect on
unconnected sockets
- add rehash in connect()
- re-organize the patch into 3 smaller ones
- other minor fixes
v4:
https://lore.kernel.org/all/20241012012918.70888-1-lulie@linux.alibaba.com/
v3:
https://lore.kernel.org/all/20241010090351.79698-1-lulie@linux.alibaba.com/
RFCv2:
https://lore.kernel.org/all/20240924110414.52618-1-lulie@linux.alibaba.com/
RFCv1:
https://lore.kernel.org/all/20240913100941.8565-1-lulie@linux.alibaba.com/
Philo Lu (3):
net/udp: Add a new struct for hash2 slot
net/udp: Add 4-tuple hash list basis
ipv4/udp: Add 4-tuple hash for connected socket
include/linux/udp.h | 11 +++
include/net/udp.h | 111 +++++++++++++++++++++-
net/ipv4/udp.c | 217 +++++++++++++++++++++++++++++++++++++++-----
net/ipv6/udp.c | 17 ++--
4 files changed, 317 insertions(+), 39 deletions(-)
--
2.32.0.3.g01195cf9f