[PATCH net-next v3 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb

Dongli Zhang posted 3 patches 4 years, 4 months ago
There is a newer version of this series
[PATCH net-next v3 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb
Posted by Dongli Zhang 4 years, 4 months ago
Commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()") introduced
kfree_skb_reason() to help track why an skb was dropped.

The tun and tap drivers are commonly used as virtio-net/vhost-net backends.
This series uses kfree_skb_reason() to trace dropped skbs in those two drivers.

Changed since v1:
- rename many of the reasons to make them as generic as possible, so that
  they can be reused by core networking and drivers

Changed since v2:
- declare drop_reason as type "enum skb_drop_reason"
- handle the drop in the skb_list_walk_safe() case for the tap driver, and
  introduce kfree_skb_list_reason()
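
To illustrate the list case, here is a minimal userspace sketch of the idea:
walk a singly linked list and free every element with the same drop reason,
saving the next pointer before each free. All names in it (struct buf,
free_buf_reason(), free_buf_list_reason(), the enum values) are hypothetical
stand-ins, not the kernel definitions, and the real helper in the patches may
differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
enum sketch_reason { SKETCH_NOT_SPECIFIED, SKETCH_FULL_RING };

struct buf {
	struct buf *next;
	int id;
};

/* Counts how many buffers were freed, so the effect can be observed. */
static int freed_count;

/* Stand-in for a single free with a reason: report it, then free. */
static void free_buf_reason(struct buf *b, enum sketch_reason reason)
{
	if (!b)
		return;
	printf("drop buf %d, reason %d\n", b->id, (int)reason);
	freed_count++;
	free(b);
}

/* Stand-in for the list variant: free every element with one reason. */
static void free_buf_list_reason(struct buf *head, enum sketch_reason reason)
{
	while (head) {
		struct buf *next = head->next;	/* save before freeing */

		free_buf_reason(head, reason);
		head = next;
	}
}
```

A caller that fails halfway through a segment list would pass the remaining
head and a single reason, so every freed element is reported the same way.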


The following reasons are introduced.

- SKB_DROP_REASON_SKB_CSUM

This is used whenever there is a checksum error with the sk_buff.

- SKB_DROP_REASON_SKB_COPY_DATA

The kernel may copy (or zero-copy) data to or from an sk_buff, e.g.,
zerocopy_sg_from_iter(), skb_copy_datagram_from_iter() and
skb_orphan_frags_rx(). This reason covers copy-related errors.

- SKB_DROP_REASON_SKB_GSO_SEG

Any error reported during GSO processing of the sk_buff. GSO processing of
sk_buff data is frequent, so a dedicated reason is introduced for it.

- SKB_DROP_REASON_SKB_PULL
- SKB_DROP_REASON_SKB_TRIM

Pulling or trimming sk_buff data is frequent; these reasons cover failures of
those operations.

- SKB_DROP_REASON_DEV_HDR

A driver may report this reason when there is an error in the metadata on the
DMA ring buffer.

- SKB_DROP_REASON_DEV_READY

The device is not ready, not online, or not initialized to receive data.

- SKB_DROP_REASON_DEV_FILTER

David Ahern suggested SKB_DROP_REASON_TAP_FILTER. I changed from 'TAP' to 'DEV'
to make it more generic.

- SKB_DROP_REASON_FULL_RING

Suggested by Eric Dumazet.

- SKB_DROP_REASON_BPF_FILTER

Dropped by an eBPF filter.
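
As a rough sketch of how the reasons above relate to the strings seen in the
trace output, the following maps each reason to its short name. The names come
from the list above plus NOT_SPECIFIED; the SKETCH_ prefix, the numeric order,
and the lookup helper are assumptions made for illustration and do not
reflect the actual enum layout in include/linux/skbuff.h.

```c
/* Illustrative only: the order and values are assumptions. */
enum sketch_drop_reason {
	SKETCH_NOT_SPECIFIED,
	SKETCH_SKB_CSUM,
	SKETCH_SKB_COPY_DATA,
	SKETCH_SKB_GSO_SEG,
	SKETCH_SKB_PULL,
	SKETCH_SKB_TRIM,
	SKETCH_DEV_HDR,
	SKETCH_DEV_READY,
	SKETCH_DEV_FILTER,
	SKETCH_FULL_RING,
	SKETCH_BPF_FILTER,
	SKETCH_MAX
};

/* Short names, as they would appear after "reason:" in a trace line. */
static const char *const sketch_reason_names[SKETCH_MAX] = {
	[SKETCH_NOT_SPECIFIED]	= "NOT_SPECIFIED",
	[SKETCH_SKB_CSUM]	= "SKB_CSUM",
	[SKETCH_SKB_COPY_DATA]	= "SKB_COPY_DATA",
	[SKETCH_SKB_GSO_SEG]	= "SKB_GSO_SEG",
	[SKETCH_SKB_PULL]	= "SKB_PULL",
	[SKETCH_SKB_TRIM]	= "SKB_TRIM",
	[SKETCH_DEV_HDR]	= "DEV_HDR",
	[SKETCH_DEV_READY]	= "DEV_READY",
	[SKETCH_DEV_FILTER]	= "DEV_FILTER",
	[SKETCH_FULL_RING]	= "FULL_RING",
	[SKETCH_BPF_FILTER]	= "BPF_FILTER",
};

/* Hypothetical helper: return the short name, or "UNKNOWN" if out of range. */
static const char *sketch_reason_name(unsigned int reason)
{
	if (reason >= SKETCH_MAX)
		return "UNKNOWN";
	return sketch_reason_names[reason];
}
```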


This is the output for a TUN device.

# cat /sys/kernel/debug/tracing/trace_pipe
          <idle>-0       [018] ..s1.  1478.130490: kfree_skb: skbaddr=00000000c4f21b8d protocol=0 location=00000000aff342c7 reason: NOT_SPECIFIED
      vhost-9003-9020    [012] b..1.  1478.196264: kfree_skb: skbaddr=00000000b174fb9b protocol=2054 location=000000001cf38db0 reason: FULL_RING
          arping-9639    [018] b..1.  1479.082993: kfree_skb: skbaddr=00000000c4f21b8d protocol=2054 location=000000001cf38db0 reason: FULL_RING
          <idle>-0       [012] b.s3.  1479.110472: kfree_skb: skbaddr=00000000e0c3681f protocol=4 location=000000001cf38db0 reason: FULL_RING
          arping-9639    [018] b..1.  1480.083086: kfree_skb: skbaddr=00000000c4f21b8d protocol=2054 location=000000001cf38db0 reason: FULL_RING


This is the output for a TAP device.

# cat /sys/kernel/debug/tracing/trace_pipe
          <idle>-0       [014] ..s1.  1096.418621: kfree_skb: skbaddr=00000000f8f41946 protocol=0 location=00000000aff342c7 reason: NOT_SPECIFIED
          arping-7006    [001] ..s1.  1096.843961: kfree_skb: skbaddr=000000002ec803a8 protocol=2054 location=000000009a57b32f reason: FULL_RING
          arping-7006    [001] ..s1.  1097.844035: kfree_skb: skbaddr=000000002ec803a8 protocol=2054 location=000000009a57b32f reason: FULL_RING
          arping-7006    [001] ..s1.  1098.844102: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
          arping-7006    [001] ..s1.  1099.844160: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
          arping-7006    [001] ..s1.  1100.844214: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
          arping-7006    [001] ..s1.  1101.844230: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
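
For anyone who wants to post-process output like the above, here is a small
userspace sketch that extracts the fields of one kfree_skb line. It assumes
the exact "skbaddr=... protocol=... location=... reason: ..." layout shown in
these traces; struct drop_event and parse_kfree_skb_line() are hypothetical
names, and the format may differ on other kernel versions.

```c
#include <stdio.h>
#include <string.h>

/* Fields of one kfree_skb trace event, as printed in the output above. */
struct drop_event {
	unsigned long long skbaddr;	/* hashed pointer, printed in hex */
	unsigned int protocol;		/* printed in decimal */
	unsigned long long location;	/* hashed pointer, printed in hex */
	char reason[32];		/* short reason name, e.g. FULL_RING */
};

/*
 * Parse one trace_pipe line. Returns 0 on success, or -1 if the line is
 * not a kfree_skb event in the layout assumed here.
 */
static int parse_kfree_skb_line(const char *line, struct drop_event *ev)
{
	const char *p = strstr(line, "kfree_skb: ");

	if (!p)
		return -1;
	if (sscanf(p,
		   "kfree_skb: skbaddr=%llx protocol=%u location=%llx reason: %31s",
		   &ev->skbaddr, &ev->protocol, &ev->location, ev->reason) != 4)
		return -1;
	return 0;
}
```

Note that protocol=2054 in the arping lines is 0x0806 in hex, the EtherType
for ARP, which matches the process that triggered the drops.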


 drivers/net/tap.c          | 35 +++++++++++++++++++++++++----------
 drivers/net/tun.c          | 38 ++++++++++++++++++++++++++++++--------
 include/linux/skbuff.h     | 18 ++++++++++++++++++
 include/trace/events/skb.h | 10 ++++++++++
 net/core/skbuff.c          | 11 +++++++++--
 5 files changed, 92 insertions(+), 20 deletions(-)

Please let me know if you have any suggestions on the definitions of the
reasons.

Thank you very much!

Dongli Zhang


Re: [PATCH net-next v3 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb
Posted by Dongli Zhang 4 years, 4 months ago
The subject should be [PATCH net-next v3 0/4], not [PATCH net-next v3 0/3].

Sorry for the mistake.

Dongli Zhang

On 2/20/22 9:34 PM, Dongli Zhang wrote:
> [...]