From nobody Mon Feb 9 12:44:54 2026 Received: from mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com [205.220.168.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 77DAE3C067; Fri, 1 Mar 2024 20:14:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.168.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709324043; cv=none; b=KCRv/QIK+AG5jxntv5NBoqJlCMPQ5wTJSOrgpPHA33UAitpfMR2dkOC2zlwfi1FcdR7oAIVVG9w9otJ576qj3qyizYfmcEUkcx9h0IGEAnzWxqli5U5Gj/emR9Lmez8vwBNSdv8ijXzPgOY3bSRCzigbrrHAdiNLXB09BQ7ShYU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709324043; c=relaxed/simple; bh=IwbTQ4DeIGKDEICIUHpPbiwt3fZ6OnsvqK661h2R2Cw=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=ALs2iTxXG24+lX9i/1ZLQnj8h7JnKLC5G7XTdPsssnH0RsMt1UE85sijhLWl2JATJy49P8aMwv3A5kB5b2Cz1/SXj53N6NX36arxy9Udx7qdWr4MrMZm/AzxeBmKB9vdDyW3AkA9xlJiSvFzhMeJlqSfiID7CqU9QkmqFAA71kw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=quicinc.com; spf=pass smtp.mailfrom=qualcomm.com; dkim=pass (2048-bit key) header.d=quicinc.com header.i=@quicinc.com header.b=VqMgyPhb; arc=none smtp.client-ip=205.220.168.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=quicinc.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=qualcomm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=quicinc.com header.i=@quicinc.com header.b="VqMgyPhb" Received: from pps.filterd (m0279862.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.24/8.17.1.24) with ESMTP id 421JvMcE018234; Fri, 1 Mar 2024 20:13:50 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h= from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; s=qcppdkim1; bh=i+ubq+VAiTd30EeFxAqi ALdccqNyudp9rox9xyOObaw=; b=VqMgyPhbRzNdU3NkVHgITfyjbPzO0mkHGqYL bLwRDAvdKy+0CXb0AoQQ9Jco6Qg3ILOpFRTm2M5ztOY3+36bQltN0nrgvXmK5Ytn ji/dy8vz6kEOmxaY5LfRQ+fRNIr0P5463HLcJt+EnQs1sH14LJOXiRjp/9dj9j9p eDYKE8us/x+l5UKsxdxJOq1d2JHoQhaVgROQ7Qik1aAXSeYhEUfj2DQcumEjdjP3 rS1TrdaP4C1ey6NEzTwbF72S5TvY3EFlF8vv2Z/F9tJou9jk1x2FC0j7KAgwh0jD NOSTaZiNSQVfh45Z1Z44s/z9xLrRa6vUsHk+7C5HlOOZkD524Q== Received: from nalasppmta05.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3wkk3u0c81-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 01 Mar 2024 20:13:50 +0000 (GMT) Received: from pps.filterd (NALASPPMTA05.qualcomm.com [127.0.0.1]) by NALASPPMTA05.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 421KDnPN021604; Fri, 1 Mar 2024 20:13:49 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA05.qualcomm.com (PPS) with ESMTPS id 3wk4mkf73q-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 01 Mar 2024 20:13:49 +0000 Received: from NALASPPMTA05.qualcomm.com (NALASPPMTA05.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 421KDn7k021599; Fri, 1 Mar 2024 20:13:49 GMT Received: from hu-devc-lv-u20-a-new.qualcomm.com (hu-abchauha-lv.qualcomm.com [10.81.25.35]) by NALASPPMTA05.qualcomm.com (PPS) with ESMTPS id 421KDmS2021598 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 01 Mar 2024 20:13:49 +0000 Received: by hu-devc-lv-u20-a-new.qualcomm.com (Postfix, from userid 214165) id 80C1A22134; Fri, 1 Mar 2024 12:13:48 -0800 (PST) From: Abhishek Chauhan To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Andrew Halaney , Willem de Bruijn , Martin KaFai Lau , Martin KaFai Lau Cc: kernel@quicinc.com Subject: [PATCH net-next v4] net: Re-use and set mono_delivery_time bit for userspace tstamp packets Date: Fri, 1 Mar 2024 12:13:48 -0800 Message-Id: <20240301201348.2815102-1-quic_abchauha@quicinc.com> X-Mailer: git-send-email 2.25.1 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: lOJH0ewvXDELo8bCtYxZZaDxvhQMcvG0 X-Proofpoint-ORIG-GUID: lOJH0ewvXDELo8bCtYxZZaDxvhQMcvG0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.1011,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2024-03-01_20,2024-03-01_03,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 lowpriorityscore=0 clxscore=1015 mlxscore=0 impostorscore=0 adultscore=0 spamscore=0 malwarescore=0 priorityscore=1501 suspectscore=0 bulkscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2402120000 definitions=main-2403010167 Content-Type: text/plain; charset="utf-8" Bridge driver today has no support to forward the userspace timestamp packets and ends up resetting the timestamp. ETF qdisc checks the packet coming from userspace and encounters to be 0 thereby dropping time sensitive packets. These changes will allow userspace timestamps packets to be forwarded from the bridge to NIC drivers. Setting the same bit (mono_delivery_time) to avoid dropping of userspace tstamp packets in the forwarding path. Existing functionality of mono_delivery_time remains unaltered here, instead just extended with userspace tstamp support for bridge forwarding path. Signed-off-by: Abhishek Chauhan Reviewed-by: Willem de Bruijn --- Changes since v3 - Setting mono_delivery_time at all instances where the skb->tstamp is=20 initialized with sockcm.transmit_time as reviewed by Willem - Removed repetitive comments from all the sources file and limited only to skbuff.h as suggested by Willem - Re-phrased the comment explanation in skbuff.h and made it much simpler=20 and generic as suggested by Willem=20 Changes since v2 - Updated the commit subject and message.=20 - Took care of few comments from Willem to re-use mono_delivery_time with comments and documentations in the header and source file. - Took care of comment from Andrew on the typo in the comment. - Existing self-test test cases are executed to make sure existing=20 implementation is not impacted as stated by Paolo.(so_txtime.sh).=20 - Internal validation of UDP packets using iperf/so_priority/so_txtime with MQPRIO + ETF offload is executed as well. - Test case is included below Test 1 :- FQ + ETF (SW path) [root@ecbldauto-lvarm04-lnx ~]# ./so_txtime.sh [ 280.640551] q->last time is 1707955476143297550 [ 283.338947] IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready [ 284.078429] IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready SO_TXTIME ipv4 clock monotonic payload:a delay:109 expected:0 (us) SO_TXTIME ipv6 clock monotonic payload:a delay:140 expected:0 (us) SO_TXTIME ipv6 clock monotonic payload:a delay:12739 expected:10000 (us) SO_TXTIME ipv4 clock monotonic payload:a delay:10054 expected:10000 (us) payload:b delay:20043 expected:20000 (us) SO_TXTIME ipv6 clock monotonic payload:b delay:20078 expected:20000 (us) payload:a delay:20177 expected:20000 (us) SO_TXTIME ipv4 clock tai send: pkt a at -1707955482913ms dropped: invalid txtime [ 287.070504] now is set to 1707955482913404839 [ 287.070509] tx time from SKB is 0 ./so_txtime: recv: timeout: Resource temporarily unavailable SO_TXTIME ipv6 clock tai send: pkt a at 0ms dropped: invalid txtime [ 287.070510] q->last time is 0 [ 287.420590] now is set to 1707955483263491298 [ 287.420596] tx time from SKB is 1707955483263454527 ./so_txtime: recv: timeout: Resource temporarily unavailable SO_TXTIME ipv6 clock tai [ 287.420597] q->last time is 0 [ 287.700598] now is set to 1707955483543498954 [ 287.700604] tx time from SKB is 1707955483553463173 payload:a delay:9655 expected:10000 (us) SO_TXTIME ipv4 clock tai [ 287.700605] q->last time is 0 [ 288.100532] now is set to 1707955483943432391 [ 288.100537] tx time from SKB is 1707955483953413016 payload:a delay:9668 expected:10000 (us)[ 288.100538] q->last time is 1707= 955483553463173 [ 288.100546] now is set to 1707955483943446975 [ 288.100547] tx time from SKB is 1707955483963413016 payload:b delay:20484 expected:20000 (us) SO_TXTIME ipv6 clock tai [ 288.100547] q->last time is 1707955483553463173 [ 288.440582] now is set to 1707955484283482495 [ 288.440587] tx time from SKB is 1707955484303452808 payload:b delay:9648 expected:10000 (us)[ 288.440588] q->last time is 1707= 955483963413016 [ 288.440598] now is set to 1707955484283499370 payload:a delay:22037 expected:20000 (us) [ 288.440599] tx time from SKB is 1707955484293452808 OK. All tests passed Test case 2 (MQPRIO + ETF HW offload) [root@ecbldauto-lvarm04-lnx ~]# tc qdisc add dev eth0 handle 100: parent ro= ot mqprio num_tc 4 \ map 0 2 1 3 3 2 2 2 2 2 2 2 2 2 2 2 \ queues 1@0 1@1 1@2 1@3\ hw 0 [root@ecbldauto-lvarm04-lnx ~]# tc qdisc replace dev eth0 parent 100:4 etf \ clockid CLOCK_TAI delta 40000 offload skip_sock_check [ 89.145838] qcom-ethqos 23040000.ethernet eth0: enabled ETF for Queue te= st log 3, number of queues 4, qopt enable 1, tbs queue bit 1 [ 89.145846] qcom-ethqos 23040000.ethernet eth0: enabled ETF for Queue 3 [root@ecbldauto-lvarm04-lnx ~]# ./a.out -4 -c tai -S 192.168.1.1 -D 192.168= .1.2 a,1,b,2 SO_TXTIME ipv4 clock tai glob_tstat =3D 1707955395256170394 [ 199.623650] now is set to 1707955395256215810 [ 199.623655] tx time from SKB is 1707955395257170394 [ 199.623656] q->last time is 0 [ 199.623663] now is set to 1707955395256230029 [ 199.623664] tx time from SKB is 1707955395258170394 [ 199.623665] q->last time is 0 [ 199.624589] qcom-ethqos 23040000.ethernet eth0: emac ethqos tx_xmit : la= uching tbs packet at 1707955395 sec and 257170394 nsec [ 199.625573] qcom-ethqos 23040000.ethernet eth0: emac ethqos tx_xmit : la= uching tbs packet at 1707955395 sec and 258170394 nsec Changes since v1=20 - Changed the commit subject as i am modifying the mono_delivery_time=20 bit with clockid_delivery_time. - Took care of suggestion mentioned by Willem to use the same bit for=20 userspace delivery time as there are no conflicts between TCP and=20 SCM_TXTIME, because explicit cmsg makes no sense for TCP and only RAW and DGRAM sockets interprets it.=20 - Clear explaination of why this is needed mentioned below and this=20 is extending the work done by Martin for mono_delivery_time=20 https://patchwork.kernel.org/project/netdevbpf/patch/20220302195525.34802= 80-1-kafai@fb.com/ - Version 1 patch can be referenced with below link which states=20 the exact problem with tc-etf and discussions which took place https://lore.kernel.org/all/20240215215632.2899370-1-quic_abchauha@quicin= c.com/=20 include/linux/skbuff.h | 6 +++--- net/ipv4/ip_output.c | 1 + net/ipv4/raw.c | 1 + net/ipv6/ip6_output.c | 2 +- net/ipv6/raw.c | 2 +- net/packet/af_packet.c | 4 +++- 6 files changed, 10 insertions(+), 6 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 2dde34c29203..4726298d4ed4 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -817,9 +817,9 @@ typedef unsigned char *sk_buff_data_t; * @decrypted: Decrypted SKB * @slow_gro: state present at GRO time, slower prepare step required * @mono_delivery_time: When set, skb->tstamp has the - * delivery_time in mono clock base (i.e. EDT). Otherwise, the - * skb->tstamp has the (rcv) timestamp at ingress and - * delivery_time at egress. + * delivery_time in mono clock base (i.e., EDT) or a clock base chosen + * by SO_TXTIME. If zero, skb->tstamp has the (rcv) timestamp at + * ingress. * @napi_id: id of the NAPI struct this skb came from * @sender_cpu: (aka @napi_id) source CPU in XPS * @alloc_cpu: CPU which did the skb allocation. diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 5b5a0adb927f..ff1df64c5697 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1455,6 +1455,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk, skb->priority =3D (cork->tos !=3D -1) ? cork->priority: READ_ONCE(sk->sk_= priority); skb->mark =3D cork->mark; skb->tstamp =3D cork->transmit_time; + skb->mono_delivery_time =3D !!skb->tstamp; /* * Steal rt from cork.dst to avoid a pair of atomic_inc/atomic_dec * on dst refcount diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index aea89326c697..c4c29fc5b73f 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -353,6 +353,7 @@ static int raw_send_hdrinc(struct sock *sk, struct flow= i4 *fl4, skb->priority =3D READ_ONCE(sk->sk_priority); skb->mark =3D sockc->mark; skb->tstamp =3D sockc->transmit_time; + skb->mono_delivery_time =3D !!skb->tstamp; skb_dst_set(skb, &rt->dst); *rtp =3D NULL; =20 diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index a722a43dd668..2fc1d03dc07d 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1922,7 +1922,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk, skb->priority =3D READ_ONCE(sk->sk_priority); skb->mark =3D cork->base.mark; skb->tstamp =3D cork->base.transmit_time; - + skb->mono_delivery_time =3D !!skb->tstamp; ip6_cork_steal_dst(skb, cork); IP6_INC_STATS(net, rt->rt6i_idev, IPSTATS_MIB_OUTREQUESTS); if (proto =3D=3D IPPROTO_ICMPV6) { diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index 03dbb874c363..13f54f8eea35 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -616,7 +616,7 @@ static int rawv6_send_hdrinc(struct sock *sk, struct ms= ghdr *msg, int length, skb->priority =3D READ_ONCE(sk->sk_priority); skb->mark =3D sockc->mark; skb->tstamp =3D sockc->transmit_time; - + skb->mono_delivery_time =3D !!skb->tstamp; skb_put(skb, length); skb_reset_network_header(skb); iph =3D ipv6_hdr(skb); diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index c9bbc2686690..0db31ca4982d 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2057,7 +2057,7 @@ static int packet_sendmsg_spkt(struct socket *sock, s= truct msghdr *msg, skb->priority =3D READ_ONCE(sk->sk_priority); skb->mark =3D READ_ONCE(sk->sk_mark); skb->tstamp =3D sockc.transmit_time; - + skb->mono_delivery_time =3D !!skb->tstamp; skb_setup_tx_timestamp(skb, sockc.tsflags); =20 if (unlikely(extra_len =3D=3D 4)) @@ -2586,6 +2586,7 @@ static int tpacket_fill_skb(struct packet_sock *po, s= truct sk_buff *skb, skb->priority =3D READ_ONCE(po->sk.sk_priority); skb->mark =3D READ_ONCE(po->sk.sk_mark); skb->tstamp =3D sockc->transmit_time; + skb->mono_delivery_time =3D !!skb->tstamp; skb_setup_tx_timestamp(skb, sockc->tsflags); skb_zcopy_set_nouarg(skb, ph.raw); =20 @@ -3064,6 +3065,7 @@ static int packet_snd(struct socket *sock, struct msg= hdr *msg, size_t len) skb->priority =3D READ_ONCE(sk->sk_priority); skb->mark =3D sockc.mark; skb->tstamp =3D sockc.transmit_time; + skb->mono_delivery_time =3D !!skb->tstamp; =20 if (unlikely(extra_len =3D=3D 4)) skb->no_fcs =3D 1; --=20 2.25.1