From: Michal Luczaj
Date: Fri, 06 Mar 2026 00:30:59 +0100
Subject: [PATCH bpf v3 5/5] bpf, sockmap: Adapt for af_unix-specific lock
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260306-unix-proto-update-null-ptr-deref-v3-5-2f0c7410c523@rbox.co>
References: <20260306-unix-proto-update-null-ptr-deref-v3-0-2f0c7410c523@rbox.co>
In-Reply-To: <20260306-unix-proto-update-null-ptr-deref-v3-0-2f0c7410c523@rbox.co>
To: John Fastabend, Jakub Sitnicki, Eric Dumazet, Kuniyuki Iwashima,
 Paolo Abeni, Willem de Bruijn, "David S. Miller", Jakub Kicinski,
 Simon Horman, Yonghong Song, Andrii Nakryiko, Alexei Starovoitov,
 Daniel Borkmann, Martin KaFai Lau, Eduard Zingerman, Song Liu,
 Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
 Shuah Khan, Cong Wang
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 Michal Luczaj
X-Mailer: b4 0.14.3

unix_stream_connect() sets sk_state (`WRITE_ONCE(sk->sk_state,
TCP_ESTABLISHED)`) _before_ it assigns a peer (`unix_peer(sk) = newsk`).
sk_state == TCP_ESTABLISHED makes sock_map_sk_state_allowed() believe that
socket is properly set up, which would include having a defined peer. IOW,
there's a window when unix_stream_bpf_update_proto() can be called on
socket which still has unix_peer(sk) == NULL.

        T0 bpf                          T1 connect
        ------                          ----------

                                WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED)
sock_map_sk_state_allowed(sk)
...
sk_pair = unix_peer(sk)
sock_hold(sk_pair)
                                sock_hold(newsk)
                                smp_mb__after_atomic()
                                unix_peer(sk) = newsk

BUG: kernel NULL pointer dereference, address: 0000000000000080
RIP: 0010:unix_stream_bpf_update_proto+0xa0/0x1b0
Call Trace:
  sock_map_link+0x564/0x8b0
  sock_map_update_common+0x6e/0x340
  sock_map_update_elem_sys+0x17d/0x240
  __sys_bpf+0x26db/0x3250
  __x64_sys_bpf+0x21/0x30
  do_syscall_64+0x6b/0x3a0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

Initial idea was to move peer assignment _before_ the sk_state update[1],
but that involved an additional memory barrier, and changing the hot path
was rejected.

Then a check during proto update was considered[2], but a follow-up
discussion[3] concluded the root cause is sockmap taking a wrong lock. Or,
more specifically, an insufficient lock[4].

Thus, teach sockmap about the af_unix-specific locking: af_unix protects
critical sections under unix_state_lock() operating on unix_sock::lock.

[1]: https://lore.kernel.org/netdev/ba5c50aa-1df4-40c2-ab33-a72022c5a32e@rbox.co/
[2]: https://lore.kernel.org/netdev/20240610174906.32921-1-kuniyu@amazon.com/
[3]: https://lore.kernel.org/netdev/7603c0e6-cd5b-452b-b710-73b64bd9de26@linux.dev/
[4]: https://lore.kernel.org/netdev/CAAVpQUA+8GL_j63CaKb8hbxoL21izD58yr1NvhOhU=j+35+3og@mail.gmail.com/

Suggested-by: Kuniyuki Iwashima
Suggested-by: Martin KaFai Lau
Fixes: c63829182c37 ("af_unix: Implement ->psock_update_sk_prot()")
Signed-off-by: Michal Luczaj
---
 net/core/sock_map.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 7ba6a7f24ccd..6109fbe6f99c 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 
 struct bpf_stab {
@@ -115,19 +116,43 @@ int sock_map_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype)
 }
 
 static void sock_map_sk_acquire(struct sock *sk)
-	__acquires(&sk->sk_lock.slock)
 {
 	lock_sock(sk);
+
+	if (sk_is_unix(sk))
+		unix_state_lock(sk);
+
 	rcu_read_lock();
 }
 
 static void sock_map_sk_release(struct sock *sk)
-	__releases(&sk->sk_lock.slock)
 {
 	rcu_read_unlock();
+
+	if (sk_is_unix(sk))
+		unix_state_unlock(sk);
+
 	release_sock(sk);
 }
 
+static inline void sock_map_sk_acquire_fast(struct sock *sk)
+{
+	local_bh_disable();
+	bh_lock_sock(sk);
+
+	if (sk_is_unix(sk))
+		unix_state_lock(sk);
+}
+
+static inline void sock_map_sk_release_fast(struct sock *sk)
+{
+	if (sk_is_unix(sk))
+		unix_state_unlock(sk);
+
+	bh_unlock_sock(sk);
+	local_bh_enable();
+}
+
 static void sock_map_add_link(struct sk_psock *psock,
 			      struct sk_psock_link *link,
 			      struct bpf_map *map, void *link_raw)
@@ -604,16 +629,14 @@ static long sock_map_update_elem(struct bpf_map *map, void *key,
 	if (!sock_map_sk_is_suitable(sk))
 		return -EOPNOTSUPP;
 
-	local_bh_disable();
-	bh_lock_sock(sk);
+	sock_map_sk_acquire_fast(sk);
 	if (!sock_map_sk_state_allowed(sk))
 		ret = -EOPNOTSUPP;
 	else if (map->map_type == BPF_MAP_TYPE_SOCKMAP)
 		ret = sock_map_update_common(map, *(u32 *)key, sk, flags);
 	else
 		ret = sock_hash_update_common(map, key, sk, flags);
-	bh_unlock_sock(sk);
-	local_bh_enable();
+	sock_map_sk_release_fast(sk);
 	return ret;
 }
 
-- 
2.52.0
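
Illustration only, not part of the patch: the window described in the commit
message can be modelled in userspace with two pthread mutexes. This is a
rough sketch under stated assumptions. Every name in it (model_sock,
model_update, model_connect and the enums) is hypothetical and none is a
kernel API. It also assumes, as the description implies, that the connect
path performs both stores while holding the af_unix state lock. The updater
is called from inside the window and uses trylock, so the interleaving is
deterministic and needs no second thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace model only: every name below is hypothetical, not a kernel API. */
enum model_state { MODEL_CLOSE, MODEL_ESTABLISHED };

struct model_sock {
	pthread_mutex_t sock_lock;   /* stands in for lock_sock()/bh_lock_sock() */
	pthread_mutex_t state_lock;  /* stands in for unix_state_lock() */
	enum model_state state;
	struct model_sock *peer;
};

enum update_result {
	UPDATE_OK,         /* state and peer were consistent */
	UPDATE_REJECTED,   /* state not established, update refused */
	UPDATE_NULL_PEER,  /* saw ESTABLISHED with peer == NULL: the bug */
	UPDATE_WOULD_WAIT  /* a lock was busy: a real caller would block here */
};

static void model_sock_init(struct model_sock *sk)
{
	pthread_mutex_init(&sk->sock_lock, NULL);
	pthread_mutex_init(&sk->state_lock, NULL);
	sk->state = MODEL_CLOSE;
	sk->peer = NULL;
}

/* Models the proto update: check the state, then look at the peer. */
static enum update_result model_update(struct model_sock *sk,
				       int take_state_lock)
{
	enum update_result res;

	if (pthread_mutex_trylock(&sk->sock_lock) != 0)
		return UPDATE_WOULD_WAIT;

	if (take_state_lock && pthread_mutex_trylock(&sk->state_lock) != 0) {
		pthread_mutex_unlock(&sk->sock_lock);
		return UPDATE_WOULD_WAIT;
	}

	if (sk->state != MODEL_ESTABLISHED)
		res = UPDATE_REJECTED;
	else if (sk->peer == NULL)
		res = UPDATE_NULL_PEER;
	else
		res = UPDATE_OK;

	if (take_state_lock)
		pthread_mutex_unlock(&sk->state_lock);
	pthread_mutex_unlock(&sk->sock_lock);
	return res;
}

/*
 * Models the connect path: both stores happen under state_lock, with the
 * state published first. The updater runs between the two stores and its
 * result is returned, so the caller can see what it observed in the window.
 */
static enum update_result model_connect(struct model_sock *sk,
					struct model_sock *newsk,
					int updater_takes_state_lock)
{
	enum update_result seen;

	pthread_mutex_lock(&sk->state_lock);
	sk->state = MODEL_ESTABLISHED;
	seen = model_update(sk, updater_takes_state_lock); /* the window */
	sk->peer = newsk;
	pthread_mutex_unlock(&sk->state_lock);
	return seen;
}
```

In this model, model_connect(&sk, &peer, 0) reports UPDATE_NULL_PEER because
the socket lock alone does not exclude the connect path. With the state lock
also taken, model_connect(&sk, &peer, 1) reports UPDATE_WOULD_WAIT, and a
later model_update(&sk, 1) sees a fully set up socket. The updater takes the
socket lock first and the state lock second, the same nesting order as
sock_map_sk_acquire() in the diff above.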