From nobody Sat Nov 30 02:31:35 2024
From: Tiago Lam
Date: Fri, 13 Sep 2024 10:39:19 +0100
Subject: [RFC PATCH 1/3] ipv4: Run a reverse sk_lookup on sendmsg.
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20240913-reverse-sk-lookup-v1-1-e721ea003d4c@cloudflare.com>
References: <20240913-reverse-sk-lookup-v1-0-e721ea003d4c@cloudflare.com>
In-Reply-To: <20240913-reverse-sk-lookup-v1-0-e721ea003d4c@cloudflare.com>
To: "David S. Miller", David Ahern, Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Willem de Bruijn, Alexei Starovoitov, Daniel Borkmann,
 Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
 Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
 Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, Jakub Sitnicki,
 Tiago Lam, kernel-team@cloudflare.com
X-Mailer: b4 0.14.1

To check whether egress traffic should be allowed through, run a reverse
socket lookup (i.e. a normal socket lookup with the src/dst addresses and
ports swapped) and check whether the corresponding ingress traffic would
be allowed in. If the reverse sk_lookup call returns a socket that matches
the egress socket, let the egress traffic through as well, following the
principle that return traffic may proceed if the matching ingress traffic
is allowed in.

The reverse lookup is only performed if an sk_lookup eBPF program is
attached and the source address and/or port of the return traffic have
been modified.

The source address and port can be modified using ancillary messages.
Up until now, it was possible to specify a different source address to
sendmsg by providing it in an IP_PKTINFO ancillary message, but there was
no way to change the source port. This patch therefore also extends the
ancillary messages supported by sendmsg with IP_ORIGDSTADDR, reusing the
same cmsg and struct used in recvmsg, which already supports specifying a
port.

Suggested-by: Jakub Sitnicki
Signed-off-by: Tiago Lam
---
 include/net/ip.h       |  1 +
 net/ipv4/ip_sockglue.c | 11 +++++++++++
 net/ipv4/udp.c         | 33 ++++++++++++++++++++++++++++++++-
 3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index c5606cadb1a5..e5753abd7247 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -75,6 +75,7 @@ static inline unsigned int ip_hdrlen(const struct sk_buff *skb)
 struct ipcm_cookie {
 	struct sockcm_cookie sockc;
 	__be32 addr;
+	__be16 port;
 	int oif;
 	struct ip_options_rcu *opt;
 	__u8 protocol;
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index cf377377b52d..6e55bd25b5f7 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -297,6 +297,17 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
 			ipc->addr = info->ipi_spec_dst.s_addr;
 			break;
 		}
+		case IP_ORIGDSTADDR:
+		{
+			struct sockaddr_in *dst_addr;
+
+			if (cmsg->cmsg_len != CMSG_LEN(sizeof(struct sockaddr_in)))
+				return -EINVAL;
+			dst_addr = (struct sockaddr_in *)CMSG_DATA(cmsg);
+			ipc->port = dst_addr->sin_port;
+			ipc->addr = dst_addr->sin_addr.s_addr;
+			break;
+		}
 		case IP_TTL:
 			if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
 				return -EINVAL;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 49c622e743e8..b9dc0a88b0c6 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1060,6 +1060,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name);
 	struct flowi4 fl4_stack;
 	struct flowi4 *fl4;
+	__u8 flow_flags = inet_sk_flowi_flags(sk);
 	int ulen = len;
 	struct ipcm_cookie ipc;
 	struct rtable *rt = NULL;
@@ -1179,6 +1180,37 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		}
 	}
 
+	/* If we're egressing with a different source address and/or port, we
+	 * perform a reverse socket lookup. The rationale behind this is that we
+	 * can allow return UDP traffic that has ingressed through sk_lookup to
+	 * also egress correctly. If the reverse lookup fails, we only log it.
+	 *
+	 * The lookup is performed if either source address and/or port changed, and
+	 * neither is "0".
+	 */
+	if (static_branch_unlikely(&bpf_sk_lookup_enabled) &&
+	    !connected &&
+	    (ipc.port && ipc.addr) &&
+	    (inet->inet_saddr != ipc.addr || inet->inet_sport != ipc.port)) {
+		struct sock *sk_egress;
+
+		bpf_sk_lookup_run_v4(sock_net(sk), IPPROTO_UDP,
+				     daddr, dport, ipc.addr, ntohs(ipc.port), 1, &sk_egress);
+		if (IS_ERR_OR_NULL(sk_egress) ||
+		    atomic64_read(&sk_egress->sk_cookie) != atomic64_read(&sk->sk_cookie)) {
+			net_info_ratelimited("No reverse socket lookup match for local addr %pI4:%d remote addr %pI4:%d\n",
+					     &ipc.addr, ntohs(ipc.port), &daddr, ntohs(dport));
+		} else {
+			/* Override the source port to use with the one we got in cmsg,
+			 * and tell routing to let us use a non-local address. Otherwise
+			 * route lookups will fail with non-local source address when
+			 * IP_TRANSPARENT isn't set.
+			 */
+			inet->inet_sport = ipc.port;
+			flow_flags |= FLOWI_FLAG_ANYSRC;
+		}
+	}
+
 	saddr = ipc.addr;
 	ipc.addr = faddr = daddr;
 
@@ -1223,7 +1255,6 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	if (!rt) {
 		struct net *net = sock_net(sk);
-		__u8 flow_flags = inet_sk_flowi_flags(sk);
 
 		fl4 = &fl4_stack;
 
-- 
2.34.1
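
For illustration only (not part of the patch): a minimal userspace sketch
of how a sender might build the IP_ORIGDSTADDR ancillary message that
ip_cmsg_send() parses above. The helper name set_origdstaddr_cmsg() and
its parameters are hypothetical, and the fallback value 20 for
IP_ORIGDSTADDR is taken from the Linux UAPI headers. The sketch only
fills in the control buffer; whether sendmsg() honours it depends on a
kernel carrying this RFC.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef IP_ORIGDSTADDR
#define IP_ORIGDSTADDR 20	/* value from the Linux UAPI <linux/in.h> */
#endif

/* Hypothetical helper, not part of the patch: attach an IP_ORIGDSTADDR
 * cmsg carrying the desired source address and port to msg. cbuf must be
 * at least CMSG_SPACE(sizeof(struct sockaddr_in)) bytes and aligned for
 * struct cmsghdr. Returns 0 on success, -1 on a short buffer or bad address.
 */
static int set_origdstaddr_cmsg(struct msghdr *msg, void *cbuf, size_t cbuf_len,
				const char *src_ip, uint16_t src_port)
{
	struct sockaddr_in src;
	struct cmsghdr *cmsg;

	if (cbuf_len < CMSG_SPACE(sizeof(src)))
		return -1;

	memset(&src, 0, sizeof(src));
	src.sin_family = AF_INET;
	src.sin_port = htons(src_port);	/* the kernel side reads sin_port as __be16 */
	if (inet_pton(AF_INET, src_ip, &src.sin_addr) != 1)
		return -1;

	memset(cbuf, 0, cbuf_len);
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(src));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = IPPROTO_IP;
	cmsg->cmsg_type = IP_ORIGDSTADDR;
	/* The patch rejects any other length with -EINVAL. */
	cmsg->cmsg_len = CMSG_LEN(sizeof(src));
	memcpy(CMSG_DATA(cmsg), &src, sizeof(src));
	return 0;
}
```

The filled-in msghdr would then be passed to sendmsg() on an unconnected
UDP socket, with msg_name set to the peer address. Per the udp.c hunk, the
reverse lookup only runs when both the address and the port in the cmsg
are non-zero and at least one of them differs from the socket's own.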