In llc_ui_release(), sock_orphan() was called before llc_sk_free()
stopped all LLC timers. A pending timer callback
(llc_conn_ack_tmr_cb()->llc_process_tmr_ev()->llc_conn_state_process())
could fire between these two operations and dereference the
NULL sk->sk_socket that sock_orphan() sets, causing a kernel
page fault.

Fix the race by moving sock_orphan() into llc_sk_free(), after
llc_sk_stop_all_timers() has completed. This guarantees that
all timers are stopped before the socket is orphaned, eliminating
the window for the race.

Fixes: aa2b2eb39348 ("llc: call sock_orphan() at release time")
Signed-off-by: Jiakai Xu <xujiakai24@mails.ucas.ac.cn>
---
V1 -> V2:
- Replaced sk->sk_socket NULL checks with moving sock_orphan()
after timer stop, as suggested by Paolo Abeni.
Link: https://lore.kernel.org/lkml/20260526013541.796307-1-xujiakai24@mails.ucas.ac.cn/T/#u
---
net/llc/af_llc.c | 1 -
net/llc/llc_conn.c | 5 +++++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 35278c519a30..92f3576b339a 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -227,7 +227,6 @@ static int llc_ui_release(struct socket *sock)
 	}
 	netdev_put(llc->dev, &llc->dev_tracker);
 	sock_put(sk);
-	sock_orphan(sk);
 	sock->sk = NULL;
 	llc_sk_free(sk);
 out:
diff --git a/net/llc/llc_conn.c b/net/llc/llc_conn.c
index 5c0ac243b248..c02285441592 100644
--- a/net/llc/llc_conn.c
+++ b/net/llc/llc_conn.c
@@ -977,6 +977,11 @@ void llc_sk_free(struct sock *sk)
 	llc->state = LLC_CONN_OUT_OF_SVC;
 	/* Stop all (possibly) running timers */
 	llc_sk_stop_all_timers(sk, true);
+	/* Orphan the socket after timers are stopped; otherwise a pending
+	 * timer callback could dereference the NULL sk->sk_socket that
+	 * sock_orphan() sets.
+	 */
+	sock_orphan(sk);
 #ifdef DEBUG_LLC_CONN_ALLOC
 	printk(KERN_INFO "%s: unackq=%d, txq=%d\n", __func__,
 		skb_queue_len(&llc->pdu_unack_q),
--
2.34.1
On Fri, 29 May 2026 02:00:59 +0000 Jiakai Xu wrote:
> In llc_ui_release(), sock_orphan() was called before llc_sk_free()
> stopped all LLC timers. A pending timer callback
> (llc_conn_ack_tmr_cb()->llc_process_tmr_ev()->llc_conn_state_process())
> could fire between these two operations and dereference the
> NULL sk->sk_socket that sock_orphan() sets, causing a kernel
> page fault.
>
> Fix the race by moving sock_orphan() into llc_sk_free(), after
> llc_sk_stop_all_timers() has completed. This guarantees that
> all timers are stopped before the socket is orphaned, eliminating
> the window for the race.

Sashiko points out that there's more issues if the timer runs after
llc_ui_release(). Can you reliably reproduce this? Have you checked
that this change is sufficient? Sashiko says that llc->dev may
disappear even tho we don't clear that pointer in _release().
-- 
pw-bot: cr

Thank you very much for your review and feedback. I really appreciate
you taking the time to look at this.

> Sashiko points out that there's more issues if the timer runs after
> llc_ui_release(). Can you reliably reproduce this? Have you checked
> that this change is sufficient? Sashiko says that llc->dev may
> disappear even tho we don't clear that pointer in _release().

This crash was discovered by fuzzing. Unfortunately, the fuzzer did
not generate a reproducer program, so I am unable to reproduce it.
Our analysis has been based entirely on the crash report.

I'm not an expert in this area, so the quality of my patches may be
low. I really appreciate your patience and the time you've taken to
review this. Would this V3 approach (moving both sock_orphan() and
netdev_put() into llc_sk_free() after the timer stop) be the correct
way to proceed?

Regards,
Jiakai

On Wed, 3 Jun 2026 01:30:07 +0000 Jiakai Xu wrote:
> > Sashiko points out that there's more issues if the timer runs after
> > llc_ui_release(). Can you reliably reproduce this? Have you checked
> > that this change is sufficient? Sashiko says that llc->dev may
> > disappear even tho we don't clear that pointer in _release().
>
> This crash was discovered by fuzzing. Unfortunately, the fuzzer did
> not generate a reproducer program, so I am unable to reproduce it.
> Our analysis has been based entirely on the crash report.
>
> I'm not an expert in this area, so the quality of my patches may be
> low. I really appreciate your patience and the time you've taken to
> review this. Would this V3 approach (moving both sock_orphan() and
> netdev_put() into llc_sk_free() after the timer stop) be the correct
> way to proceed?

Not sure, feels like we're trying to fix symptoms instead of addressing
the real root cause.
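
For readers following the thread, the three teardown orderings discussed
(the original code, V2, and the proposed V3) can be compared with a small
userspace model. This is only a sketch under stated assumptions: every
name below (model_sock, model_dev, model_teardown_is_safe(), and so on)
is hypothetical and not a kernel API; it assumes that stopping the timers
is synchronous and that a stopped timer never re-arms; and it treats a
dropped device reference as making the device unsafe to touch. It does
not address the root-cause question raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not kernel types. */
struct model_dev {
	int refs;                /* models the reference held via llc->dev */
};

struct model_sock {
	void *socket;            /* models sk->sk_socket */
	struct model_dev *dev;   /* models llc->dev */
	bool timer_armed;        /* models a pending LLC timer */
};

enum step { STEP_STOP_TIMERS, STEP_ORPHAN, STEP_DEV_PUT };

/* Models a timer callback: true if it would only touch live state. */
static bool model_timer_fire(const struct model_sock *sk)
{
	if (!sk->timer_armed)
		return true;     /* timer stopped: the callback never runs */
	return sk->socket != NULL && sk->dev->refs > 0;
}

/* Applies one teardown step to the modelled socket. */
static void model_apply(struct model_sock *sk, enum step s)
{
	switch (s) {
	case STEP_STOP_TIMERS:
		sk->timer_armed = false;
		break;
	case STEP_ORPHAN:
		sk->socket = NULL;
		break;
	case STEP_DEV_PUT:
		assert(sk->dev->refs > 0);
		sk->dev->refs--;
		break;
	}
}

/* Lets the timer fire after every step; true if no firing was unsafe. */
static bool model_teardown_is_safe(const enum step order[3])
{
	int placeholder = 0;
	struct model_dev dev = { .refs = 1 };
	struct model_sock sk = { &placeholder, &dev, true };
	bool safe = true;

	for (int i = 0; i < 3; i++) {
		model_apply(&sk, order[i]);
		safe = model_timer_fire(&sk) && safe;
	}
	return safe;
}
```

In this model, the original order (device put, orphan, stop timers) and
the V2 order (device put, stop timers, orphan) both leave a window in
which the callback sees torn-down state, while the V3 order (stop timers,
orphan, device put) does not. Whether that carries over to the real code
depends on the assumptions listed above.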