From nobody Sat Nov 30 00:51:24 2024 Received: from mail-wm1-f41.google.com (mail-wm1-f41.google.com [209.85.128.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EDB411D7E21 for ; Fri, 13 Sep 2024 09:39:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726220369; cv=none; b=sZCjVXk0defPuqpRMosxQ+CX3D3pCFNIs+5mY2jZAWbUjO+7x3953dHe624Rn0CDVsyB7tOIlY0IRAIgRAiSyXr9CKiQEFC1PY2opfjdSBh6ebt9fch5m3bcbrZ1QT4iRsbuh1eJSE4V7uuvmsBF1ActwJ6++HYzxezDOYk2XCA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726220369; c=relaxed/simple; bh=Tw4uEMSI5Dr5SJ/9tx+GPhyTrfSGhN6/WFouKDUT30Q=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=QTn+m/PX9JIID4c8EHrAQ4wAVrCba+CyLpMQV7FCn5w/xUkEBz4vM9Ywxs9kOqcnk7Uh3baMteeq7iPENKvsaZoUG+/Dp86s5+NmXi/h0+AgDQgs7OG8j0EpHyyC6cFOeDfMayUQ+tOLtxF8OZUW5TLI1BeE/EvflhCn5v5m9ns= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com; spf=pass smtp.mailfrom=cloudflare.com; dkim=pass (2048-bit key) header.d=cloudflare.com header.i=@cloudflare.com header.b=Nlvw6BYZ; arc=none smtp.client-ip=209.85.128.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cloudflare.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cloudflare.com header.i=@cloudflare.com header.b="Nlvw6BYZ" Received: by mail-wm1-f41.google.com with SMTP id 5b1f17b1804b1-42cb60aff1eso18612145e9.0 for ; Fri, 13 Sep 2024 02:39:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google09082023; t=1726220365; x=1726825165; 
darn=vger.kernel.org; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=f/F5VsUKzBY+Gvp0nTts52JPym3MtK6dVK27Lo+tSTM=; b=Nlvw6BYZ76+Y74dwpyYMfEt793vm8RH9JUskZ/y3SLWDL2ffYuTq0Kdto4x1Hm+oAo nNT3WMGg+nqEC1uVEqORAv9T6Tc3T6KI4Fsrh91vb+uBiZFogAgr+I+e3h6UIcZFVXfs uqm5XKWevnyBcafDgvxgrW0lVbMED9g2LVfrXtGNl6g4eI48G5nxv8/SsXlZCBHQG6UJ YbM+iqovUVTTjHy9c1SIZJB4aEiyrOmVYFlluDB8/igdjUB4df4HbnQx/dfXQA44BGKE 9QC0iF9to+jQp0hG8SPx/xFEdV9z8al62X8Zj2GKfXRPBC2NEUII9DbxfsTnABorKEl0 iRnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726220365; x=1726825165; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=f/F5VsUKzBY+Gvp0nTts52JPym3MtK6dVK27Lo+tSTM=; b=NdDSWxyoF+lEJZSd9OWcwek0cQ7r8CgWOSbwROFaix8cfuwL52xBRQPIKI2pyV1HqX J94vlkvTQgJft+xtWc0OZFGryMFpMdyQI4Vaf4Xnf1d4YGUNckfb6hSqMrmPxOjm2W1C 70GRmw50LtEMQFngusNnq+PHVh5UVw3oHvc6O7CdacLJsVrEpsaKSzRRqxaBBhcuCY6C NxSoS5VMOYwk4zdz1e8KJ2Kyb1S3wJykskyfhtUaBlUp9OCGO2InpR6ylhuQyLRANkoH OkenkRPouCIS15QlMo8FhuEVBNzh+3PuQcox9OtDxWcB4BGFprgNOS66xlsKsp759Lol YRvw== X-Forwarded-Encrypted: i=1; AJvYcCUvI3FP/qu9oL3Mor4CGJx9s8nSVPNcXU269NUba188V1GLeuobgvv+vkggfhcyia908rlWpti9T223eRc=@vger.kernel.org X-Gm-Message-State: AOJu0Ywd+Up1OfAnLGwO/1kTvN9vQRWGdfv45D5Tkv1pVIJwZFYA0LfL mmYNBaRIwfOz3Muk41oJ1VAqmCLEPTHbnxcPrBJsEE2e1iy/5KWjy+psq8aKegQJ5Q35RUm6Oas ux1ZvHg== X-Google-Smtp-Source: AGHT+IFCLnGFac4d/lytgKhxj1y4y6Vc0xqjrzbKxpeoAAZXwZWlNKOpOSOEZ8fmlHYwc00Ryjx6Ug== X-Received: by 2002:a05:600c:3505:b0:42c:a8d5:2df5 with SMTP id 5b1f17b1804b1-42cdb586ee0mr45446675e9.24.1726220364867; Fri, 13 Sep 2024 02:39:24 -0700 (PDT) Received: from [127.0.1.1] ([2a09:bac5:3802:d2::15:37a]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-37895665548sm16474484f8f.34.2024.09.13.02.39.23 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Sep 2024 02:39:24 -0700 (PDT)
From: Tiago Lam
Date: Fri, 13 Sep 2024 10:39:19 +0100
Subject: [RFC PATCH 1/3] ipv4: Run a reverse sk_lookup on sendmsg.
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20240913-reverse-sk-lookup-v1-1-e721ea003d4c@cloudflare.com>
References: <20240913-reverse-sk-lookup-v1-0-e721ea003d4c@cloudflare.com>
In-Reply-To: <20240913-reverse-sk-lookup-v1-0-e721ea003d4c@cloudflare.com>
To: "David S. Miller", David Ahern, Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Willem de Bruijn, Alexei Starovoitov, Daniel Borkmann,
 Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
 Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
 Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, Jakub Sitnicki,
 Tiago Lam, kernel-team@cloudflare.com
X-Mailer: b4 0.14.1

In order to check if egress traffic should be allowed through, we run a
reverse socket lookup (i.e. a normal socket lookup with the src/dst
addresses and ports reversed) to check if the corresponding ingress
traffic is allowed in. Thus, if the reverse sk_lookup call returns a
socket that matches the egress socket, we also let the egress traffic
through, following the principle of allowing return traffic to proceed
if ingress traffic is allowed in. The reverse lookup is only performed
when an sk_lookup eBPF program is attached and the source address and/or
port for the return traffic have been modified. The source address and
port can be modified by using ancillary messages.
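
A minimal userspace model of the check described above may help; it is a
sketch, not the kernel code. The names struct listener, toy_lookup() and
egress_allowed() are invented for illustration, and the steering policy
is assumed to be a flat table keyed on local address and port. In the
patch itself the lookup is bpf_sk_lookup_run_v4() and the comparison is
done on sk_cookie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one sk_lookup steering rule: ingress packets
 * addressed to local_addr:local_port are handed to the socket identified
 * by cookie.
 */
struct listener {
	uint32_t local_addr;
	uint16_t local_port;
	uint64_t cookie;
};

/* Toy ingress lookup: return the cookie of the socket that would receive
 * a packet sent to local_addr:local_port, or 0 if no rule matches. The
 * remote side is ignored here; a real sk_lookup program may inspect it.
 */
static uint64_t toy_lookup(const struct listener *tbl, size_t n,
			   uint32_t local_addr, uint16_t local_port)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].local_addr == local_addr &&
		    tbl[i].local_port == local_port)
			return tbl[i].cookie;
	return 0;
}

/* Reverse check: egress from src_addr:src_port is allowed only if ingress
 * traffic to that same address and port would reach the sending socket.
 */
static int egress_allowed(const struct listener *tbl, size_t n,
			  uint64_t sender_cookie,
			  uint32_t src_addr, uint16_t src_port)
{
	uint64_t hit;

	assert(sender_cookie != 0);	/* 0 is reserved for "no match" */
	hit = toy_lookup(tbl, n, src_addr, src_port);
	return hit != 0 && hit == sender_cookie;
}
```

With a table that steers 10.0.0.1:53 to the socket with cookie 7,
egress_allowed() accepts a reply sourced from 10.0.0.1:53 by that socket
and rejects a reply sourced from an address or port that the table
steers to another socket or does not cover at all.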
Up until now, it was possible to specify a different source address to
sendmsg by providing it in an IP_PKTINFO ancillary message, but there
was no way to change the source port. This patch also extends the
ancillary messages supported by sendmsg to support the IP_ORIGDSTADDR
ancillary message, reusing the same cmsg and struct used in recvmsg,
which already supports specifying a port.

Suggested-by: Jakub Sitnicki
Signed-off-by: Tiago Lam
---
 include/net/ip.h       |  1 +
 net/ipv4/ip_sockglue.c | 11 +++++++++++
 net/ipv4/udp.c         | 33 ++++++++++++++++++++++++++++++++-
 3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index c5606cadb1a5..e5753abd7247 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -75,6 +75,7 @@ static inline unsigned int ip_hdrlen(const struct sk_buff *skb)
 struct ipcm_cookie {
 	struct sockcm_cookie	sockc;
 	__be32			addr;
+	__be16			port;
 	int			oif;
 	struct ip_options_rcu	*opt;
 	__u8			protocol;
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index cf377377b52d..6e55bd25b5f7 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -297,6 +297,17 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
 			ipc->addr = info->ipi_spec_dst.s_addr;
 			break;
 		}
+		case IP_ORIGDSTADDR:
+		{
+			struct sockaddr_in *dst_addr;
+
+			if (cmsg->cmsg_len != CMSG_LEN(sizeof(struct sockaddr_in)))
+				return -EINVAL;
+			dst_addr = (struct sockaddr_in *)CMSG_DATA(cmsg);
+			ipc->port = dst_addr->sin_port;
+			ipc->addr = dst_addr->sin_addr.s_addr;
+			break;
+		}
 		case IP_TTL:
 			if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
 				return -EINVAL;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 49c622e743e8..b9dc0a88b0c6 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1060,6 +1060,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name);
 	struct flowi4 fl4_stack;
 	struct flowi4 *fl4;
+	__u8 flow_flags = inet_sk_flowi_flags(sk);
 	int ulen = len;
 	struct ipcm_cookie ipc;
 	struct rtable *rt = NULL;
@@ -1179,6 +1180,37 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		}
 	}
 
+	/* If we're egressing with a different source address and/or port, we
+	 * perform a reverse socket lookup. The rationale behind this is that we
+	 * can allow return UDP traffic that has ingressed through sk_lookup to
+	 * also egress correctly. If the reverse lookup fails, we only log it.
+	 *
+	 * The lookup is performed if either source address and/or port changed, and
+	 * neither is "0".
+	 */
+	if (static_branch_unlikely(&bpf_sk_lookup_enabled) &&
+	    !connected &&
+	    (ipc.port && ipc.addr) &&
+	    (inet->inet_saddr != ipc.addr || inet->inet_sport != ipc.port)) {
+		struct sock *sk_egress;
+
+		bpf_sk_lookup_run_v4(sock_net(sk), IPPROTO_UDP,
+				     daddr, dport, ipc.addr, ntohs(ipc.port), 1, &sk_egress);
+		if (IS_ERR_OR_NULL(sk_egress) ||
+		    atomic64_read(&sk_egress->sk_cookie) != atomic64_read(&sk->sk_cookie)) {
+			net_info_ratelimited("No reverse socket lookup match for local addr %pI4:%d remote addr %pI4:%d\n",
+					     &ipc.addr, ntohs(ipc.port), &daddr, ntohs(dport));
+		} else {
+			/* Override the source port to use with the one we got in cmsg,
+			 * and tell routing to let us use a non-local address. Otherwise
+			 * route lookups will fail with non-local source address when
+			 * IP_TRANSPARENT isn't set.
+			 */
+			inet->inet_sport = ipc.port;
+			flow_flags |= FLOWI_FLAG_ANYSRC;
+		}
+	}
+
 	saddr = ipc.addr;
 	ipc.addr = faddr = daddr;
 
@@ -1223,7 +1255,6 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	if (!rt) {
 		struct net *net = sock_net(sk);
-		__u8 flow_flags = inet_sk_flowi_flags(sk);
 
 		fl4 = &fl4_stack;
 
-- 
2.34.1

From nobody Sat Nov 30 00:51:24 2024
From: Tiago Lam
Date: Fri, 13 Sep 2024 10:39:20 +0100
Subject: [RFC PATCH 2/3] ipv6: Run a reverse sk_lookup on sendmsg.
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20240913-reverse-sk-lookup-v1-2-e721ea003d4c@cloudflare.com>
References: <20240913-reverse-sk-lookup-v1-0-e721ea003d4c@cloudflare.com>
In-Reply-To: <20240913-reverse-sk-lookup-v1-0-e721ea003d4c@cloudflare.com>
To: "David S. Miller", David Ahern, Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Willem de Bruijn, Alexei Starovoitov, Daniel Borkmann,
 Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
 Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
 Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, Jakub Sitnicki,
 Tiago Lam, kernel-team@cloudflare.com
X-Mailer: b4 0.14.1

This follows the same rationale as the ipv4 counterpart: on sendmsg, a
reverse socket lookup is now run when the source address and/or port are
changed, to check whether egress traffic should be allowed through. As
with ipv4, the ipv6 sendmsg path is also extended here to support the
IPV6_ORIGDSTADDR ancillary message, so that a source address/port can be
specified.
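
The condition under which the lookup runs ("changed, and neither is 0")
can be sketched in userspace as below. needs_reverse_lookup() is a
hypothetical helper, not a function from this series. It assumes that
"0" means the unspecified address (::) or port 0. The posted
reverse_sk_lookup() appears to test the saddr pointer rather than the
address value, so this sketch reflects the stated intent rather than the
exact code.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: decide whether the source requested via cmsg
 * differs from what the socket is bound to, and is fully specified.
 * Ports only need to be in the same byte order on both sides.
 */
static bool needs_reverse_lookup(const struct in6_addr *bound_addr,
				 in_port_t bound_port,
				 const struct in6_addr *req_addr,
				 in_port_t req_port)
{
	assert(bound_addr && req_addr);

	/* "Neither is 0": skip when the address is :: or the port is 0. */
	if (IN6_IS_ADDR_UNSPECIFIED(req_addr) || req_port == 0)
		return false;

	/* Run the lookup if either the address or the port changed. */
	return memcmp(bound_addr, req_addr, sizeof(*req_addr)) != 0 ||
	       bound_port != req_port;
}
```

A request that repeats the bound address and port, or that leaves either
field unset, skips the lookup and falls through to the usual source
address checks.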

Suggested-by: Jakub Sitnicki
Signed-off-by: Tiago Lam
---
 net/ipv6/datagram.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/udp.c      |  8 ++++--
 2 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index fff78496803d..4214dda1c320 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -756,6 +756,27 @@ void ip6_datagram_recv_ctl(struct sock *sk, struct msghdr *msg,
 }
 EXPORT_SYMBOL_GPL(ip6_datagram_recv_ctl);
 
+static inline bool reverse_sk_lookup(struct flowi6 *fl6, struct sock *sk,
+				     struct in6_addr *saddr, __be16 sport)
+{
+	if (static_branch_unlikely(&bpf_sk_lookup_enabled) &&
+	    (saddr && sport) &&
+	    (ipv6_addr_cmp(&sk->sk_v6_rcv_saddr, saddr) || inet_sk(sk)->inet_sport != sport)) {
+		struct sock *sk_egress;
+
+		bpf_sk_lookup_run_v6(sock_net(sk), IPPROTO_UDP, &fl6->daddr, fl6->fl6_dport,
+				     saddr, ntohs(sport), 0, &sk_egress);
+		if (!IS_ERR_OR_NULL(sk_egress) &&
+		    atomic64_read(&sk_egress->sk_cookie) == atomic64_read(&sk->sk_cookie))
+			return true;
+
+		net_info_ratelimited("No reverse socket lookup match for local addr %pI6:%d remote addr %pI6:%d\n",
+				     saddr, ntohs(sport), &fl6->daddr, ntohs(fl6->fl6_dport));
+	}
+
+	return false;
+}
+
 int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 			  struct msghdr *msg, struct flowi6 *fl6,
 			  struct ipcm6_cookie *ipc6)
@@ -844,7 +865,62 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 
 			break;
 		}
+		case IPV6_ORIGDSTADDR:
+		{
+			struct sockaddr_in6 *sockaddr_in;
+			struct net_device *dev = NULL;
+
+			if (cmsg->cmsg_len < CMSG_LEN(sizeof(struct sockaddr_in6))) {
+				err = -EINVAL;
+				goto exit_f;
+			}
+
+			sockaddr_in = (struct sockaddr_in6 *)CMSG_DATA(cmsg);
+
+			addr_type = __ipv6_addr_type(&sockaddr_in->sin6_addr);
+
+			if (addr_type & IPV6_ADDR_LINKLOCAL)
+				return -EINVAL;
+
+			/* If we're egressing with a different source address and/or port, we
+			 * perform a reverse socket lookup. The rationale behind this is that we
+			 * can allow return UDP traffic that has ingressed through sk_lookup to
+			 * also egress correctly. In case the reverse lookup fails, we
+			 * continue with the normal path.
+			 *
+			 * The lookup is performed if either source address and/or port changed, and
+			 * neither is "0".
+			 */
+			if (reverse_sk_lookup(fl6, sk, &sockaddr_in->sin6_addr,
+					      sockaddr_in->sin6_port)) {
+				/* Override the source port and address to use with the one we
+				 * got in cmsg and bail early.
+				 */
+				fl6->saddr = sockaddr_in->sin6_addr;
+				fl6->fl6_sport = sockaddr_in->sin6_port;
+				break;
+			}
 
+			if (addr_type != IPV6_ADDR_ANY) {
+				int strict = __ipv6_addr_src_scope(addr_type) <= IPV6_ADDR_SCOPE_LINKLOCAL;
+
+				if (!ipv6_can_nonlocal_bind(net, inet_sk(sk)) &&
+				    !ipv6_chk_addr_and_flags(net,
+							     &sockaddr_in->sin6_addr,
+							     dev, !strict, 0,
+							     IFA_F_TENTATIVE) &&
+				    !ipv6_chk_acast_addr_src(net, dev,
+							     &sockaddr_in->sin6_addr))
+					err = -EINVAL;
+				else
+					fl6->saddr = sockaddr_in->sin6_addr;
+			}
+
+			if (err)
+				goto exit_f;
+
+			break;
+		}
 		case IPV6_FLOWINFO:
 			if (cmsg->cmsg_len < CMSG_LEN(4)) {
 				err = -EINVAL;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 6602a2e9cdb5..6121cbb71ad3 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1476,6 +1476,12 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	fl6->flowi6_uid = sk->sk_uid;
 
+	/* We use fl6's daddr and fl6_sport in the reverse sk_lookup done
+	 * within ip6_datagram_send_ctl() now.
+	 */
+	fl6->daddr = *daddr;
+	fl6->fl6_sport = inet->inet_sport;
+
 	if (msg->msg_controllen) {
 		opt = &opt_space;
 		memset(opt, 0, sizeof(struct ipv6_txoptions));
@@ -1511,10 +1517,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	fl6->flowi6_proto = sk->sk_protocol;
 	fl6->flowi6_mark = ipc6.sockc.mark;
-	fl6->daddr = *daddr;
 	if (ipv6_addr_any(&fl6->saddr) && !ipv6_addr_any(&np->saddr))
 		fl6->saddr = np->saddr;
-	fl6->fl6_sport = inet->inet_sport;
 
 	if (cgroup_bpf_enabled(CGROUP_UDP6_SENDMSG) && !connected) {
 		err = BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk,
-- 
2.34.1

From nobody Sat Nov 30 00:51:24 2024
From: Tiago Lam
Date: Fri, 13 Sep 2024 10:39:21 +0100
Subject: [RFC PATCH 3/3] bpf: Add sk_lookup test to use ORIGDSTADDR cmsg.
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20240913-reverse-sk-lookup-v1-3-e721ea003d4c@cloudflare.com>
References: <20240913-reverse-sk-lookup-v1-0-e721ea003d4c@cloudflare.com>
In-Reply-To: <20240913-reverse-sk-lookup-v1-0-e721ea003d4c@cloudflare.com>
To: "David S. Miller", David Ahern, Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Willem de Bruijn, Alexei Starovoitov, Daniel Borkmann,
 Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
 Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
 Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, Jakub Sitnicki,
 Tiago Lam, kernel-team@cloudflare.com
X-Mailer: b4 0.14.1

This patch reuses the framework already in place for the sk_lookup
selftests, and allows it to send a reply from the server fd directly,
instead of having to create a socket bound to the original destination
address and reply from there. It does this by passing the source address
and port to reply from in an IP_ORIGDSTADDR ancillary message passed to
sendmsg.

Signed-off-by: Tiago Lam
Suggested-by: Jakub Sitnicki
---
 tools/testing/selftests/bpf/prog_tests/sk_lookup.c | 70 +++++++++++++++-------
 1 file changed, 48 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c
index ae87c00867ba..b99aff2e3976 100644
--- a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c
+++ b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c
@@ -75,6 +75,7 @@ struct test {
 	struct inet_addr listen_at;
 	enum server accept_on;
 	bool reuseport_has_conns; /* Add a connected socket to reuseport group */
+	bool dont_bind_reply; /* Don't bind, send direct from server fd. */
 };
 
 struct cb_opts {
@@ -363,7 +364,7 @@ static void v4_to_v6(struct sockaddr_storage *ss)
 	memset(&v6->sin6_addr.s6_addr[0], 0, 10);
 }
 
-static int udp_recv_send(int server_fd)
+static int udp_recv_send(int server_fd, bool dont_bind_reply)
 {
 	char cmsg_buf[CMSG_SPACE(sizeof(struct sockaddr_storage))];
 	struct sockaddr_storage _src_addr = { 0 };
@@ -415,26 +416,41 @@ static int udp_recv_send(int server_fd)
 		v4_to_v6(dst_addr);
 	}
 
-	/* Reply from original destination address. */
-	fd = start_server_addr(SOCK_DGRAM, dst_addr, sizeof(*dst_addr), NULL);
-	if (!ASSERT_OK_FD(fd, "start_server_addr")) {
-		log_err("failed to create tx socket");
-		return -1;
-	}
+	if (dont_bind_reply) {
+		/* Reply directly from server fd, specifying the source address and/or
+		 * port in struct msghdr.
+		 */
+		n = sendmsg(server_fd, &msg, 0);
+		if (CHECK(n <= 0, "sendmsg", "failed\n")) {
+			log_err("failed to send echo reply");
+			return -1;
+		}
+	} else {
+		/* Reply from original destination address. */
+		fd = socket(dst_addr->ss_family, SOCK_DGRAM, 0);
+		if (CHECK(fd < 0, "socket", "failed\n")) {
+			log_err("failed to create tx socket");
+			return -1;
+		}
 
-	msg.msg_control = NULL;
-	msg.msg_controllen = 0;
-	n = sendmsg(fd, &msg, 0);
-	if (CHECK(n <= 0, "sendmsg", "failed\n")) {
-		log_err("failed to send echo reply");
-		ret = -1;
-		goto out;
-	}
+		ret = bind(fd, (struct sockaddr *)dst_addr, sizeof(*dst_addr));
+		if (CHECK(ret, "bind", "failed\n")) {
+			log_err("failed to bind tx socket");
+			close(fd);
+			return ret;
+		}
 
-	ret = 0;
-out:
-	close(fd);
-	return ret;
+		msg.msg_control = NULL;
+		msg.msg_controllen = 0;
+		n = sendmsg(fd, &msg, 0);
+		if (CHECK(n <= 0, "sendmsg", "failed\n")) {
+			log_err("failed to send echo reply");
+			close(fd);
+			return -1;
+		}
+
+		close(fd);
+	}
 }
 
 static int tcp_echo_test(int client_fd, int server_fd)
@@ -454,14 +470,14 @@ static int tcp_echo_test(int client_fd, int server_fd)
 	return 0;
 }
 
-static int udp_echo_test(int client_fd, int server_fd)
+static int udp_echo_test(int client_fd, int server_fd, bool dont_bind_reply)
 {
 	int err;
 
 	err = send_byte(client_fd);
 	if (err)
 		return -1;
-	err = udp_recv_send(server_fd);
+	err = udp_recv_send(server_fd, dont_bind_reply);
 	if (err)
 		return -1;
 	err = recv_byte(client_fd);
@@ -653,7 +669,7 @@ static void run_lookup_prog(const struct test *t)
 	if (t->sotype == SOCK_STREAM)
 		tcp_echo_test(client_fd, server_fds[t->accept_on]);
 	else
-		udp_echo_test(client_fd, server_fds[t->accept_on]);
+		udp_echo_test(client_fd, server_fds[t->accept_on], t->dont_bind_reply);
 
 	close(client_fd);
 close:
@@ -775,6 +791,16 @@ static void test_redirect_lookup(struct test_sk_lookup *skel)
 			.listen_at	= { INT_IP4, INT_PORT },
 			.accept_on	= SERVER_B,
 		},
+		{
+			.desc		= "UDP IPv4 redir different ports",
+			.lookup_prog	= skel->progs.select_sock_a_no_reuseport,
+			.sock_map	= skel->maps.redir_map,
+			.sotype		= SOCK_DGRAM,
+			.connect_to	= { EXT_IP4, EXT_PORT },
+			.listen_at	= { INT_IP4, INT_PORT },
+			.accept_on	= SERVER_A,
+			.dont_bind_reply = true,
+		},
 		{
 			.desc		= "UDP IPv4 redir and reuseport with conns",
 			.lookup_prog	= skel->progs.select_sock_a,
-- 
2.34.1
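
For readers who want to try the new send path outside the selftest, here
is a hedged sketch of how an application might build the control
message. set_reply_source() is a hypothetical helper name. The cmsg
level (IPPROTO_IP), type (IP_ORIGDSTADDR), payload (struct sockaddr_in)
and exact-length rule follow the ip_cmsg_send() hunk in patch 1/3; the
fallback value 20 for IP_ORIGDSTADDR is the one defined in linux/in.h.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#ifndef IP_ORIGDSTADDR
#define IP_ORIGDSTADDR 20	/* value from linux/in.h, for older libcs */
#endif

/* Hypothetical helper: attach an IP_ORIGDSTADDR control message to msg,
 * naming the source address and port the reply should be sent from.
 * cbuf must hold at least CMSG_SPACE(sizeof(struct sockaddr_in)) bytes.
 * Returns 0 on success, -1 if the buffer is too small or the address
 * does not parse.
 */
static int set_reply_source(struct msghdr *msg, void *cbuf, size_t cbuf_len,
			    const char *src_ip, unsigned short src_port)
{
	struct sockaddr_in src;
	struct cmsghdr *cmsg;

	if (cbuf_len < CMSG_SPACE(sizeof(src)))
		return -1;

	memset(&src, 0, sizeof(src));
	src.sin_family = AF_INET;
	src.sin_port = htons(src_port);	/* the patch copies sin_port as is */
	if (inet_pton(AF_INET, src_ip, &src.sin_addr) != 1)
		return -1;

	memset(cbuf, 0, cbuf_len);
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(src));

	cmsg = CMSG_FIRSTHDR(msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = IPPROTO_IP;
	cmsg->cmsg_type = IP_ORIGDSTADDR;
	/* The patched ip_cmsg_send() requires exactly this length. */
	cmsg->cmsg_len = CMSG_LEN(sizeof(src));
	memcpy(CMSG_DATA(cmsg), &src, sizeof(src));
	return 0;
}
```

After a successful call, and once msg_name and msg_iov are filled in as
usual, sendmsg(server_fd, &msg, 0) would ask the kernel to send from the
given address and port. This applies only to a kernel carrying this RFC
series; an unpatched kernel is expected to reject the unknown cmsg type
with EINVAL.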