From nobody Mon Jun 8 04:26:42 2026
From: Breno Leitao
Date: Thu, 04 Jun 2026 09:10:10 -0700
Subject: [PATCH net-next v3 1/5] netconsole: do not schedule skb pool refill from NMI
Message-Id: <20260604-netcons_fix_before_move-v3-1-ab055b3a6aa5@debian.org>
References: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
In-Reply-To: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
To: Breno Leitao, Andrew Lunn, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
X-Mailer: b4 0.14.3

When alloc_skb() fails in find_skb(), the fallback path dequeues an skb
from np->skb_pool and unconditionally calls schedule_work() to top the
pool back up. schedule_work() ends up taking the workqueue pool locks,
which are not NMI-safe.

netconsole_write() is registered as the nbcon write_atomic callback and
is explicitly marked CON_NBCON_ATOMIC_UNSAFE, meaning it is invoked from
emergency/panic contexts including NMIs. If the NMI interrupts a thread
already holding the workqueue pool lock, calling schedule_work()
self-deadlocks and the panic message that was being printed is lost.

Introduce netcons_skb_pop() to fold the pool dequeue and the refill
request into a single helper.
The helper skips schedule_work() when called from NMI context; the pool
is best-effort, so the refill is simply deferred to the next non-NMI
find_skb() call that exhausts alloc_skb() and hits the fallback again.
This keeps the fast path untouched and the locking rules around the
fallback pool documented in one place.

Note this only removes the schedule_work() hazard from the NMI path. The
allocation itself is still not fully NMI-safe: the alloc_skb(GFP_ATOMIC)
attempted first may take slab locks, and the skb_dequeue() fallback takes
np->skb_pool.lock, so either can deadlock if the NMI interrupts a holder
of those locks. Closing those windows requires an NMI-safe (lockless) skb
pool and is left to a follow-up; this patch addresses the schedule_work()
deadlock, which is both the most likely and the easiest to trigger.

Signed-off-by: Breno Leitao
---
 drivers/net/netconsole.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 8ecc2c71c699..918e4a9f4456 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -1654,6 +1654,23 @@ static struct notifier_block netconsole_netdev_notifier = {
 	.notifier_call = netconsole_netdev_event,
 };
 
+/* Pop a pre-allocated skb from the pool and request a refill.
+ *
+ * The refill is requested via schedule_work(), which takes the workqueue
+ * pool locks and is therefore not NMI-safe. Skip the refill when called
+ * from NMI context; the next non-NMI caller will top the pool back up.
+ */
+static struct sk_buff *netcons_skb_pop(struct netpoll *np)
+{
+	struct sk_buff *skb;
+
+	skb = skb_dequeue(&np->skb_pool);
+	if (!in_nmi())
+		schedule_work(&np->refill_wq);
+
+	return skb;
+}
+
 static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve)
 {
 	int count = 0;
@@ -1663,10 +1680,8 @@ static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve)
 repeat:
 
 	skb = alloc_skb(len, GFP_ATOMIC);
-	if (!skb) {
-		skb = skb_dequeue(&np->skb_pool);
-		schedule_work(&np->refill_wq);
-	}
+	if (!skb)
+		skb = netcons_skb_pop(np);
 
 	if (!skb) {
 		if (++count < 10) {
-- 
2.53.0-Meta

From nobody Mon Jun 8 04:26:42 2026
From: Breno Leitao
Date: Thu, 04 Jun 2026 09:10:11 -0700
Subject: [PATCH net-next v3 2/5] netconsole: do not dequeue pooled skbs that cannot satisfy len
Message-Id: <20260604-netcons_fix_before_move-v3-2-ab055b3a6aa5@debian.org>
References: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
In-Reply-To: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
To: Breno Leitao, Andrew Lunn, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
X-Mailer: b4 0.14.3

find_skb() falls back to np->skb_pool when the GFP_ATOMIC alloc_skb()
fails. The pool is refilled by refill_skbs(), which always allocates
buffers of MAX_SKB_SIZE (ethhdr + iphdr + udphdr + MAX_UDP_CHUNK == 1502
bytes). netconsole, however, computes the requested length dynamically as

    total_len + np->dev->needed_tailroom

If the egress device declares a non-zero needed_tailroom (e.g. some
tunnel or hardware accelerator devices), the required length can exceed
MAX_SKB_SIZE. The pooled skb is then handed back to the caller, which
immediately performs skb_put(skb, len), trips the tail > end check, and
triggers skb_over_panic().
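The size arithmetic above can be sketched in user space. This is only an illustration, not kernel code: the DEMO_* constants and both helper functions are hypothetical names, and the header sizes are assumed to be the usual 14-byte Ethernet, 20-byte IPv4 (no options) and 8-byte UDP headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed stand-ins for sizeof(struct ethhdr), sizeof(struct iphdr) and
 * sizeof(struct udphdr); the names are hypothetical, not kernel symbols.
 */
enum {
	DEMO_ETH_HLEN = 14,
	DEMO_IP_HLEN = 20,
	DEMO_UDP_HLEN = 8,
	DEMO_MAX_UDP_CHUNK = 1460,
	DEMO_MAX_SKB_SIZE = DEMO_ETH_HLEN + DEMO_IP_HLEN + DEMO_UDP_HLEN +
			    DEMO_MAX_UDP_CHUNK,
};

/* The pool buffer size quoted in the commit message. */
static_assert(DEMO_MAX_SKB_SIZE == 1502, "pool buffers are 1502 bytes");

/* Length the sender asks for: the packet plus the device's tailroom. */
size_t demo_requested_len(size_t total_len, size_t needed_tailroom)
{
	return total_len + needed_tailroom;
}

/* True when a pooled buffer of DEMO_MAX_SKB_SIZE bytes can hold len. */
bool demo_pool_can_satisfy(size_t len)
{
	return len <= DEMO_MAX_SKB_SIZE;
}
```

With a full-size packet and zero tailroom the request fits exactly; any non-zero tailroom on top of a full-size packet pushes it past the pool buffer, which is the case that would reach skb_over_panic().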
Leave the normal alloc_skb(len, GFP_ATOMIC) path untouched -- the slab
allocator can still satisfy oversized requests when memory is available,
so senders to devices with non-zero needed_tailroom keep working in the
common case. Only the pool fallback is gated: when alloc_skb() failed and
len exceeds the pool buffer size, skip the skb_dequeue() instead of
burning a pre-allocated skb on a request that would later trip
skb_over_panic(). Reserving pool entries for requests they can actually
satisfy also keeps the panic path, which depends on the pool being
primed, intact.

When that drop happens, emit a rate-limited net_warn() so the user
notices that netconsole is unable to push messages on the egress device.
The warn is skipped under in_nmi() for the same reason schedule_work()
is: printk machinery taken by net_warn_ratelimited() is not NMI-safe and
would risk recursing into the same nbcon console we are servicing.

MAX_SKB_SIZE / MAX_UDP_CHUNK were private to net/core/netpoll.c. Move
them to include/linux/netpoll.h so netconsole can reference the same
definition that refill_skbs() uses, keeping the two in sync by
construction. The header now pulls in and explicitly so MAX_SKB_SIZE
remains self-contained for any future user.

Signed-off-by: Breno Leitao
---
 drivers/net/netconsole.c | 22 ++++++++++++++++++++--
 include/linux/netpoll.h  | 16 ++++++++++++++++
 net/core/netpoll.c       |  7 -------
 3 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 918e4a9f4456..58250e648f8b 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -1655,15 +1655,33 @@ static struct notifier_block netconsole_netdev_notifier = {
 };
 
 /* Pop a pre-allocated skb from the pool and request a refill.
+ *
+ * The pool is refilled with MAX_SKB_SIZE buffers, so a pooled skb cannot
+ * satisfy a larger request. Return NULL in that case rather than handing
+ * back a too-small skb that would later trip skb_over_panic() in skb_put();
+ * the caller still polls and retries, and alloc_skb() itself can satisfy the
+ * oversized request once memory frees up.
  *
  * The refill is requested via schedule_work(), which takes the workqueue
  * pool locks and is therefore not NMI-safe. Skip the refill when called
  * from NMI context; the next non-NMI caller will top the pool back up.
  */
-static struct sk_buff *netcons_skb_pop(struct netpoll *np)
+static struct sk_buff *netcons_skb_pop(struct netpoll *np, int len)
 {
 	struct sk_buff *skb;
 
+	if (len > MAX_SKB_SIZE) {
+		/* net_warn_ratelimited() pulls in printk machinery that is not
+		 * NMI-safe and could recurse into the nbcon console we are
+		 * servicing, so only warn outside NMI.
+		 */
+		if (!in_nmi())
+			net_warn_ratelimited("netconsole: dropping message, requested skb len %d exceeds pool buffer size %zu on %s\n",
+					     len, (size_t)MAX_SKB_SIZE,
+					     np->dev->name);
+		return NULL;
+	}
+
 	skb = skb_dequeue(&np->skb_pool);
 	if (!in_nmi())
 		schedule_work(&np->refill_wq);
@@ -1681,7 +1699,7 @@ static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve)
 
 	skb = alloc_skb(len, GFP_ATOMIC);
 	if (!skb)
-		skb = netcons_skb_pop(np);
+		skb = netcons_skb_pop(np, len);
 
 	if (!skb) {
 		if (++count < 10) {
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index e4b8f1f91e54..88f7daa8560e 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -13,12 +13,28 @@
 #include
 #include
 #include
+#include
+#include
 
 union inet_addr {
 	__be32 ip;
 	struct in6_addr in6;
 };
 
+/*
+ * Maximum payload netpoll's preallocated skb pool can carry. Keep this in
+ * sync with the buffer size used by refill_skbs() in net/core/netpoll.c;
+ * callers (e.g. netconsole) use it to detect requests the pool can never
+ * satisfy and avoid dequeuing a pooled skb that would later trip
+ * skb_over_panic() in skb_put().
+ */
+#define MAX_UDP_CHUNK 1460
+#define MAX_SKB_SIZE \
+	(sizeof(struct ethhdr) + \
+	 sizeof(struct iphdr) + \
+	 sizeof(struct udphdr) + \
+	 MAX_UDP_CHUNK)
+
 struct netpoll {
 	struct net_device *dev;
 	netdevice_tracker dev_tracker;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index b3fe59445f2d..229dde818ab3 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -41,16 +41,9 @@
  * message gets out even in extreme OOM situations.
  */
 
-#define MAX_UDP_CHUNK 1460
 #define MAX_SKBS 32
 #define USEC_PER_POLL 50
 
-#define MAX_SKB_SIZE \
-	(sizeof(struct ethhdr) + \
-	 sizeof(struct iphdr) + \
-	 sizeof(struct udphdr) + \
-	 MAX_UDP_CHUNK)
-
 static unsigned int carrier_timeout = 4;
 module_param(carrier_timeout, uint, 0644);
 
-- 
2.53.0-Meta

From nobody Mon Jun 8 04:26:42 2026
From: Breno Leitao
Date: Thu, 04 Jun 2026 09:10:12 -0700
Subject: [PATCH net-next v3 3/5] netconsole: take target_cleanup_list_lock in drop_netconsole_target()
Message-Id: <20260604-netcons_fix_before_move-v3-3-ab055b3a6aa5@debian.org>
References: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
In-Reply-To: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
To: Breno Leitao, Andrew Lunn, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
X-Mailer: b4 0.14.3

drop_netconsole_target() unlinks the target while only holding
target_list_lock. However, when the underlying interface has been
unregistered, netconsole_netdev_event() moves the target from
target_list to target_cleanup_list, and
netconsole_process_cleanups_core() walks that list under
target_cleanup_list_lock only. If a user removes the configfs target at
the same time the cleanup worker is iterating target_cleanup_list,
list_del() can corrupt the list because the two paths take disjoint
locks while operating on the same list node.
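Why the unlink has to be covered by both locks can be seen from how an intrusive list unlink works. The following user-space sketch uses hypothetical demo_* helpers modelled on the kernel's circular doubly-linked lists (they are not the kernel implementation, and this demo_list_del() reinitialises the node instead of poisoning it). The unlink only rewrites the node's neighbours, so it has no idea which list head, and therefore which lock, it is racing with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly-linked list for illustration. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

void demo_list_init(struct demo_list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert n right after head. */
void demo_list_add(struct demo_list_head *n, struct demo_list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n from whatever list it is currently on. Only the neighbours of
 * n are touched; the list head is never passed in.
 */
void demo_list_del(struct demo_list_head *n)
{
	assert(n->next != NULL && n->prev != NULL);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n;
	n->prev = n;
}

bool demo_list_empty(const struct demo_list_head *h)
{
	return h->next == h;
}

/* Move n from its current list to dst, as the netdev notifier is
 * described as doing for deactivated targets.
 */
void demo_list_move(struct demo_list_head *n, struct demo_list_head *dst)
{
	demo_list_del(n);
	demo_list_add(n, dst);
}
```

Once a node has been moved to the second list, demo_list_del() on it edits that second list's links, so a caller that only holds the first list's lock is modifying a list it has not locked.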
Acquire target_cleanup_list_lock around the list_del() so the unlink is
serialised against netconsole_process_cleanups_core() regardless of
which list the target currently belongs to. The state transition that
downgrades STATE_DEACTIVATED to STATE_DISABLED is left intact and is
performed under the same combined locking, preserving the existing
ordering with resume_target().

Signed-off-by: Breno Leitao
---
 drivers/net/netconsole.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 58250e648f8b..d8be2fef3826 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -1452,6 +1452,7 @@ static void drop_netconsole_target(struct config_group *group,
 
 	dynamic_netconsole_mutex_lock();
 
+	mutex_lock(&target_cleanup_list_lock);
 	spin_lock_irqsave(&target_list_lock, flags);
 	/* Disable deactivated target to prevent races between resume attempt
 	 * and target removal.
@@ -1460,6 +1461,7 @@ static void drop_netconsole_target(struct config_group *group,
 		nt->state = STATE_DISABLED;
 	list_del(&nt->list);
 	spin_unlock_irqrestore(&target_list_lock, flags);
+	mutex_unlock(&target_cleanup_list_lock);
 
 	dynamic_netconsole_mutex_unlock();
 
-- 
2.53.0-Meta

From nobody Mon Jun 8 04:26:42 2026
From: Breno Leitao
Date: Thu, 04 Jun 2026 09:10:13 -0700
Subject: [PATCH net-next v3 4/5] netconsole: clean up deactivated targets dropped before the cleanup worker
Message-Id: <20260604-netcons_fix_before_move-v3-4-ab055b3a6aa5@debian.org>
References: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
In-Reply-To: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
To: Breno Leitao, Andrew Lunn, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
X-Mailer: b4 0.14.3

drop_netconsole_target() downgrades a STATE_DEACTIVATED target to
STATE_DISABLED and then only calls netpoll_cleanup() when the target is
STATE_ENABLED.
A target becomes STATE_DEACTIVATED when its underlying interface is
unregistered: netconsole_netdev_event() moves it to target_cleanup_list,
and netconsole_process_cleanups_core() is expected to run
do_netpoll_cleanup() on it.

Now that drop_netconsole_target() takes target_cleanup_list_lock around
the unlink, a configfs removal racing with NETDEV_UNREGISTER can pull
the target off target_cleanup_list before the cleanup worker processes
it. The notifier drops the lock before calling
netconsole_process_cleanups_core(), so the worker then iterates a list
that no longer contains the target and never runs do_netpoll_cleanup()
on it.

Because drop_netconsole_target() has already rewritten the state to
STATE_DISABLED, its own STATE_ENABLED check is false and
netpoll_cleanup() is skipped too. The net_device reference taken by
netpoll_setup() is then leaked and unregister_netdevice() hangs forever
in netdev_wait_allrefs().

Capture whether the target still owns a netpoll before the state is
downgraded and clean it up for both STATE_ENABLED and STATE_DEACTIVATED
targets. netpoll_cleanup() is idempotent -- it skips when np->dev is
already NULL -- so it is safe even when the cleanup worker won the race
and already tore the netpoll down.
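The capture-before-downgrade idea can be modelled in a few lines of user-space C. Everything below is a hypothetical stand-in (the demo_* names, the `dev` pointer playing the role of np->dev, and the `cleanups` counter are assumptions for illustration); it is a sketch of the ordering, not the netconsole implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_state { DEMO_DISABLED, DEMO_ENABLED, DEMO_DEACTIVATED };

struct demo_target {
	enum demo_state state;
	void *dev;	/* stands in for np->dev; non-NULL means a held reference */
	int cleanups;	/* how many times a real teardown happened */
};

/* Idempotent teardown: a second call finds dev == NULL and does nothing. */
void demo_netpoll_cleanup(struct demo_target *t)
{
	assert(t != NULL);
	if (!t->dev)
		return;
	t->dev = NULL;
	t->cleanups++;
}

/* Model of the drop path: record ownership first, then downgrade the
 * state, then clean up based on the recorded value rather than on the
 * (already rewritten) state.
 */
void demo_drop_target(struct demo_target *t)
{
	bool needs_cleanup = t->state == DEMO_ENABLED ||
			     t->state == DEMO_DEACTIVATED;

	if (t->state == DEMO_DEACTIVATED)
		t->state = DEMO_DISABLED;

	if (needs_cleanup)
		demo_netpoll_cleanup(t);
}
```

If the check were made against the state after the downgrade, a DEMO_DEACTIVATED target would skip the teardown and keep its `dev` reference, which corresponds to the leak described above. Calling demo_netpoll_cleanup() a second time is harmless, mirroring the case where the cleanup worker got there first.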
Signed-off-by: Breno Leitao
---
 drivers/net/netconsole.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index d8be2fef3826..80c5393ffa1c 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -1449,11 +1449,21 @@ static void drop_netconsole_target(struct config_group *group,
 {
 	struct netconsole_target *nt = to_target(item);
 	unsigned long flags;
+	bool needs_cleanup;
 
 	dynamic_netconsole_mutex_lock();
 
 	mutex_lock(&target_cleanup_list_lock);
 	spin_lock_irqsave(&target_list_lock, flags);
+	/* A STATE_DEACTIVATED target may have been moved to
+	 * target_cleanup_list by netconsole_netdev_event() but not yet
+	 * processed by netconsole_process_cleanups_core(). Unlinking it below
+	 * hides it from the cleanup worker, so this path has to clean it up
+	 * itself. Record that the target still owns a netpoll before the
+	 * state is downgraded.
+	 */
+	needs_cleanup = nt->state == STATE_ENABLED ||
+			nt->state == STATE_DEACTIVATED;
 	/* Disable deactivated target to prevent races between resume attempt
 	 * and target removal.
 	 */
@@ -1475,8 +1485,10 @@ static void drop_netconsole_target(struct config_group *group,
 	/*
 	 * The target may have never been enabled, or was manually disabled
 	 * before being removed so netpoll may have already been cleaned up.
+	 * netpoll_cleanup() is idempotent (it skips when np->dev is NULL), so
+	 * it is safe even if the cleanup worker already tore the netpoll down.
 	 */
-	if (nt->state == STATE_ENABLED)
+	if (needs_cleanup)
 		netpoll_cleanup(&nt->np);
 
 	config_item_put(&nt->group.cg_item);
-- 
2.53.0-Meta

From nobody Mon Jun 8 04:26:42 2026
From: Breno Leitao
Date: Thu, 04 Jun 2026 09:10:14 -0700
Subject: [PATCH net-next v3 5/5] netconsole: close netdevice unregister window during target resume
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20260604-netcons_fix_before_move-v3-5-ab055b3a6aa5@debian.org>
References: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
In-Reply-To: <20260604-netcons_fix_before_move-v3-0-ab055b3a6aa5@debian.org>
To: Breno Leitao, Andrew Lunn, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
X-Mailer: b4 0.14.3

process_resume_target() removes the target from target_list before calling
resume_target() so that netpoll_setup() can run with interrupts enabled,
then re-adds it once setup completes. netpoll_setup() acquires a
net_device reference (netdev_hold()) and releases the RTNL before
returning.

While the target is off target_list and the RTNL is not held,
netconsole_netdev_event() cannot find it. If the egress device is
unregistered in that window, the NETDEV_UNREGISTER notifier walks
target_list, misses the resuming target, and never tears it down.
The target is then re-added in STATE_ENABLED still holding a reference to
the now-unregistered device, leaking it and hanging
unregister_netdevice() in netdev_wait_allrefs().

Re-check under RTNL before re-publishing the target: if the device left
NETREG_REGISTERED while we were off the list, run do_netpoll_cleanup()
and mark the target disabled. Taking the RTNL across the check and the
list_add() serialises against the NETDEV_UNREGISTER notifier, which also
runs under RTNL, so the device is either still registered (and the
notifier will find the re-added target later) or already unregistering
(and we drop the reference here).

netdev_wait_allrefs() runs from netdev_run_todo() outside the RTNL, so
dropping the reference here cannot deadlock against the pending
unregister.

Signed-off-by: Breno Leitao
---
 drivers/net/netconsole.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 80c5393ffa1c..606e265cdfd7 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -335,6 +335,24 @@ static void process_resume_target(struct work_struct *work)
 
 	resume_target(nt);
 
+	/* netpoll_setup() took a net_device reference and dropped the RTNL
+	 * before returning, all while this target was off target_list and
+	 * thus invisible to netconsole_netdev_event(). If the device was
+	 * unregistered in that window the NETDEV_UNREGISTER notifier could not
+	 * tear this target down, which would leak the reference and hang
+	 * unregister_netdevice(). Re-check under the RTNL before re-publishing:
+	 * taking it across the check and the list_add() serialises against the
+	 * notifier (which also runs under the RTNL), so the device is either
+	 * still registered (the notifier will find the re-added target) or
+	 * already unregistering (we drop the reference here).
+	 */
+	rtnl_lock();
+	if (nt->state == STATE_ENABLED && nt->np.dev &&
+	    nt->np.dev->reg_state != NETREG_REGISTERED) {
+		do_netpoll_cleanup(&nt->np);
+		nt->state = STATE_DISABLED;
+	}
+
 	/* At this point the target is either enabled or disabled and
 	 * was cleaned up before getting deactivated. Either way, add it
 	 * back to target list.
@@ -342,6 +360,7 @@ static void process_resume_target(struct work_struct *work)
 	spin_lock_irqsave(&target_list_lock, flags);
 	list_add(&nt->list, &target_list);
 	spin_unlock_irqrestore(&target_list_lock, flags);
+	rtnl_unlock();
 
 out_unlock:
 	dynamic_netconsole_mutex_unlock();
-- 
2.53.0-Meta
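
For readers who want to see the 5/5 race in isolation, here is a minimal
userspace sketch of the "re-check under the lock before re-publishing"
pattern. It is not kernel code and makes no claim to match the netconsole
implementation: struct device, struct target, big_lock (standing in for the
RTNL), device_unregister() and target_republish() are all hypothetical names
invented for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
enum dev_reg_state { DEV_REGISTERED, DEV_UNREGISTERING };
enum target_state { TARGET_DISABLED, TARGET_ENABLED };

struct device {
	enum dev_reg_state reg_state;
	int refcount;			/* references held by targets */
};

struct target {
	enum target_state state;
	struct device *dev;		/* owns one reference while non-NULL */
	bool on_list;			/* visible to the unregister path? */
};

/* Plays the role the RTNL plays in the patch description. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Drop the target's device reference; safe to call more than once. */
static void target_drop_dev(struct target *t)
{
	if (!t->dev)
		return;
	assert(t->dev->refcount > 0);
	t->dev->refcount--;
	t->dev = NULL;
}

/* Unregister path: runs under big_lock and only sees published targets. */
static void device_unregister(struct device *dev, struct target *targets,
			      size_t n)
{
	pthread_mutex_lock(&big_lock);
	dev->reg_state = DEV_UNREGISTERING;
	for (size_t i = 0; i < n; i++) {
		if (targets[i].on_list && targets[i].dev == dev) {
			target_drop_dev(&targets[i]);
			targets[i].state = TARGET_DISABLED;
		}
	}
	pthread_mutex_unlock(&big_lock);
}

/*
 * Resume path: the target was off the list while its device reference was
 * taken. Re-check the device state and publish under the same lock, so the
 * unregister path either ran already (we drop the reference here) or will
 * run later and find the published target.
 */
static void target_republish(struct target *t)
{
	pthread_mutex_lock(&big_lock);
	if (t->state == TARGET_ENABLED && t->dev &&
	    t->dev->reg_state != DEV_REGISTERED) {
		target_drop_dev(t);
		t->state = TARGET_DISABLED;
	}
	t->on_list = true;
	pthread_mutex_unlock(&big_lock);
}
```

In this model, calling device_unregister() while the target is unpublished
leaves the reference in place, and the subsequent target_republish() is what
releases it. Without the re-check the target would be published as enabled
with a reference to an unregistering device, which is the leak the commit
message describes.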