From nobody Mon May 25 07:34:52 2026
From: Petr Oros
To: netdev@vger.kernel.org
Cc: ivecera@redhat.com, Petr Oros, Vadim Fedorenko,
 Arkadiusz Kubalewski, Jiri Pirko, Milena Olech, "David S. Miller",
 Michal Michalik, linux-kernel@vger.kernel.org
Subject: [PATCH net 1/2] dpll: fix NULL deref in dpll_device_ops() during teardown race
Date: Sat, 16 May 2026 21:13:16 +0200
Message-ID: <20260516191317.1005612-2-poros@redhat.com>
In-Reply-To: <20260516191317.1005612-1-poros@redhat.com>
References: <20260516191317.1005612-1-poros@redhat.com>

When a dpll device driver (zl3073x) is unloaded via rmmod while a foreign
driver (ice) still holds pins on it via dpll_pin_on_pin_register(),
dpll_device_unregister() empties dpll->registration_list but the dpll
object stays alive on ice's refcount. A subsequent PIN_DELETED
notification on ice's pin reaches dpll_cmd_pin_get_one() which calls
dpll_device_ops() on the still-indexed but half-dead dpll;
dpll_device_registration_first() returns NULL and the accessor
dereferences ops at offset 0x10 (struct dpll_device_registration::ops):

  CPU A (zl3073x rmmod)            CPU B (ice workqueue)
  ---------------------            ---------------------
  dpll_device_unregister():
    drop reg from
    dpll->registration_list
                                   dpll_pin_on_pin_unregister
                                     dpll_pin_delete_ntf
                                       dpll_cmd_pin_get_one
                                         dpll_device_ops()
                                           registration_first() = NULL
                                           NULL deref @ 0x10

  WARNING: CPU: 27 PID: 972 at drivers/dpll/dpll_core.c:1072 dpll_device_ops+0x20/0x30
  Workqueue: ice_dpll_wq ice_dpll_pin_notify_work [ice]
  RIP: 0010:dpll_device_ops+0x20/0x30
  Call Trace:
   ? __warn+0x84/0x140
   ? dpll_device_ops+0x20/0x30
   ? report_bug+0x16b/0x180
   ? handle_bug+0x3c/0x70
   ? exc_invalid_op+0x14/0x70
   ? asm_exc_invalid_op+0x16/0x20
   ? dpll_device_ops+0x20/0x30
   dpll_cmd_pin_get_one+0x303/0x490
   dpll_pin_event_send+0x93/0x140
   dpll_pin_on_pin_unregister+0x45/0xe0
   ice_dpll_pin_notify_work+0x7b/0x150 [ice]
   process_one_work+0x188/0x3b0
   worker_thread+0x2ef/0x410
   kthread+0x122/0x240
   ret_from_fork+0x28/0x50
  ---[ end trace 0000000000000000 ]---
  BUG: kernel NULL pointer dereference, address: 0000000000000010
  RIP: 0010:dpll_device_ops+0x24/0x30
  Call Trace:
   dpll_cmd_pin_get_one+0x303/0x490
   dpll_pin_event_send+0x93/0x140
   dpll_pin_on_pin_unregister+0x45/0xe0
   ice_dpll_pin_notify_work+0x7b/0x150 [ice]
   process_one_work+0x188/0x3b0
   worker_thread+0x2ef/0x410
   kthread+0x122/0x240
   ret_from_fork+0x28/0x50

dpll_lock serializes the two paths but cannot reorder them: ice's work
was queued before zl3073x took the lock. "Empty registration_list on a
still-indexed dpll" is a legitimate transient state caused by
peer-driver refcounting; the accessors must tolerate it.

Drop the WARN_ON in dpll_device_registration_first(), make dpll_priv()
and dpll_device_ops() return NULL on the empty-list state, and bail out
cleanly in dpll_cmd_pin_get_one() when the first dpll the pin references
is already gone.

In-tree doit/dumpit callers of dpll_priv() / dpll_device_ops() obtain
their dpll via dpll_pre_doit() -> dpll_device_get_by_id(), which only
returns DPLL_REGISTERED devices; dpll_device_unregister() clears that
mark only after observing list_empty(&dpll->registration_list), so those
callers cannot see the new NULL return. The only path that can is
dpll_cmd_pin_get_one() reached from dpll_pin_event_send(), gated here.

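As an illustration only, the tolerated state can be modelled in userspace.
This is a minimal sketch, not kernel code: every model_* name is
hypothetical, the singly linked list stands in for registration_list, and
dpll_lock is not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct model_ops {
	int unused;
};

struct model_registration {
	struct model_registration *next;
	const struct model_ops *ops;
	void *priv;
};

struct model_dpll {
	/* NULL models an empty registration_list on a still-referenced dpll. */
	struct model_registration *registrations;
};

/* Return the first registration's ops, or NULL when no registration is left. */
const struct model_ops *model_device_ops(const struct model_dpll *dpll)
{
	const struct model_registration *reg;

	assert(dpll != NULL);
	reg = dpll->registrations;
	if (!reg)
		return NULL;
	return reg->ops;
}

/* Models the gate in the notification path: report -ENODEV rather than
 * dereferencing a registration that no longer exists.
 */
int model_pin_get_one(const struct model_dpll *first_dpll)
{
	if (!model_device_ops(first_dpll))
		return -ENODEV;
	return 0;
}
```

Calling model_pin_get_one() on a model_dpll whose registrations pointer is
NULL yields -ENODEV; with at least one registration it yields 0.
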
Fixes: 9431063ad323 ("dpll: core: Add DPLL framework base functions")
Signed-off-by: Petr Oros
---
 drivers/dpll/dpll_core.c    | 12 ++++++------
 drivers/dpll/dpll_netlink.c |  6 ++++++
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/dpll/dpll_core.c b/drivers/dpll/dpll_core.c
index cbb635db43210f..4a058b46c69d4f 100644
--- a/drivers/dpll/dpll_core.c
+++ b/drivers/dpll/dpll_core.c
@@ -1060,12 +1060,8 @@ EXPORT_SYMBOL_GPL(dpll_pin_ref_sync_pair_add);
 static struct dpll_device_registration *
 dpll_device_registration_first(struct dpll_device *dpll)
 {
-	struct dpll_device_registration *reg;
-
-	reg = list_first_entry_or_null((struct list_head *)&dpll->registration_list,
-				       struct dpll_device_registration, list);
-	WARN_ON(!reg);
-	return reg;
+	return list_first_entry_or_null((struct list_head *)&dpll->registration_list,
+					struct dpll_device_registration, list);
 }
 
 void *dpll_priv(struct dpll_device *dpll)
@@ -1073,6 +1069,8 @@ void *dpll_priv(struct dpll_device *dpll)
 	struct dpll_device_registration *reg;
 
 	reg = dpll_device_registration_first(dpll);
+	if (!reg)
+		return NULL;
 	return reg->priv;
 }
 
@@ -1081,6 +1079,8 @@ const struct dpll_device_ops *dpll_device_ops(struct dpll_device *dpll)
 	struct dpll_device_registration *reg;
 
 	reg = dpll_device_registration_first(dpll);
+	if (!reg)
+		return NULL;
 	return reg->ops;
 }
 
diff --git a/drivers/dpll/dpll_netlink.c b/drivers/dpll/dpll_netlink.c
index 0ff1658c2dc1ba..8e7e61982b867c 100644
--- a/drivers/dpll/dpll_netlink.c
+++ b/drivers/dpll/dpll_netlink.c
@@ -682,6 +682,12 @@ dpll_cmd_pin_get_one(struct sk_buff *msg, struct dpll_pin *pin,
 	ref = dpll_xa_ref_dpll_first(&pin->dpll_refs);
 	ASSERT_NOT_NULL(ref);
 
+	/* The first dpll the pin references may be torn down while still
+	 * pinned by foreign-driver refs; drop the notification cleanly.
+	 */
+	if (!dpll_device_ops(ref->dpll))
+		return -ENODEV;
+
 	ret = dpll_msg_add_pin_handle(msg, pin);
 	if (ret)
 		return ret;
-- 
2.53.0

From nobody Mon May 25 07:34:52 2026
From: Petr Oros
To: netdev@vger.kernel.org
Cc: ivecera@redhat.com, Petr Oros, Vadim Fedorenko,
 Arkadiusz Kubalewski, Jiri Pirko, Milena Olech, "David S. Miller",
 Michal Michalik, linux-kernel@vger.kernel.org
Subject: [PATCH net 2/2] dpll: filter pin->dpll_refs by cookie in dpll_pin_on_pin_unregister
Date: Sat, 16 May 2026 21:13:17 +0200
Message-ID: <20260516191317.1005612-3-poros@redhat.com>
In-Reply-To: <20260516191317.1005612-1-poros@redhat.com>
References: <20260516191317.1005612-1-poros@redhat.com>

dpll_pin_on_pin_register() iterates parent->dpll_refs and adds the child
pin to each dpll the parent contributes to, tagging the registration with
cookie = parent. The unregister side asymmetrically iterates
pin->dpll_refs (a union across all of pin's parents) and calls
__dpll_pin_unregister() on every entry without checking that the entry
carries a (ops, priv, parent) registration.

When the same pin is reachable through multiple parents whose
supported-dpll sets differ, unregistering one parent walks entries owned
by other parents and trips WARN_ON(!reg) in dpll_xa_ref_pin_del():

  parent_A in {dpll_X}, parent_B in {dpll_X, dpll_Y}
  pin reachable through both
  pin->dpll_refs = {dpll_X, dpll_Y}
    dpll_X has cookies {parent_A, parent_B}
    dpll_Y has cookie  {parent_B}

  unregister(parent_A) iterates {dpll_X, dpll_Y}; the dpll_Y step
  looks up cookie = parent_A which was never there -> WARN.

  WARNING: CPU: 137 PID: 4498 at drivers/dpll/dpll_core.c:252 dpll_xa_ref_pin_del.isra.0+0x1b0/0x1c0
  Call Trace:
   __dpll_pin_unregister+0xed/0x2e0
   dpll_pin_on_pin_unregister+0x91/0xe0
   ice_dpll_deinit_pins+0xb2/0x400 [ice]
   ice_dpll_deinit+0x32/0x130 [ice]
   ice_deinit_features.part.0+0xb6/0x100 [ice]
   ice_unload+0xdf/0x120 [ice]
   ice_remove+0x144/0x2f0 [ice]
   ice_shutdown+0x16/0x50 [ice]
   pci_device_shutdown+0x34/0x60
   device_shutdown+0x158/0x200
   kernel_kexec+0x48/0x160
   __do_sys_reboot+0xda/0x230
   do_syscall_64+0xec/0x680
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

Filter the iteration by the (ops, priv, parent) cookie so only the
entries this parent actually contributed to get touched. Keep
pin->dpll_refs as the iteration source (rather than parent->dpll_refs as
in register) so cleanup is also correct when the parent has been
independently unregistered from some of its dpll devices in the
meantime; switching sides would silently leak those registrations.

The fix is placed at the call site rather than relaxing WARN_ON(!reg) in
dpll_xa_ref_pin_del(): other callers (dpll_pin_unregister() with
cookie = NULL, the dpll_pin_on_pin_register() error rollback with
cookie = parent) pass cookies that must be present, and the WARN remains
useful there as a double-unregister catch.

Reproduced on E825-C + zl3073x via kexec stress: WARN appears within 1-4
reboots without the fix, none across several hundred reboots with it.

Fixes: 9431063ad323 ("dpll: core: Add DPLL framework base functions")
Signed-off-by: Petr Oros
---
 drivers/dpll/dpll_core.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/dpll/dpll_core.c b/drivers/dpll/dpll_core.c
index 4a058b46c69d4f..4779976682fdb9 100644
--- a/drivers/dpll/dpll_core.c
+++ b/drivers/dpll/dpll_core.c
@@ -1024,8 +1024,14 @@ void dpll_pin_on_pin_unregister(struct dpll_pin *parent, struct dpll_pin *pin,
 	mutex_lock(&dpll_lock);
 	dpll_pin_delete_ntf(pin);
 	dpll_xa_ref_pin_del(&pin->parent_refs, parent, ops, priv, pin);
-	xa_for_each(&pin->dpll_refs, i, ref)
+	/* pin->dpll_refs is the union over all of pin's parents; only
+	 * touch entries actually registered via @parent.
+	 */
+	xa_for_each(&pin->dpll_refs, i, ref) {
+		if (!dpll_pin_registration_find(ref, ops, priv, parent))
+			continue;
 		__dpll_pin_unregister(ref->dpll, pin, ops, priv, parent);
+	}
 	mutex_unlock(&dpll_lock);
 }
 EXPORT_SYMBOL_GPL(dpll_pin_on_pin_unregister);
-- 
2.53.0
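
The parent_A / parent_B scenario from the 2/2 commit message can be
replayed in a small userspace model. This is a hedged sketch under stated
assumptions: all model_* names are hypothetical, the cookie is reduced to
the parent pointer alone (the kernel matches on ops, priv and parent), and
a fixed-size array replaces the xarray.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_COOKIES 4

/* Hypothetical model of one pin->dpll_refs entry: the parents ("cookies")
 * that registered the pin on that dpll.
 */
struct model_ref {
	const void *cookies[MODEL_MAX_COOKIES];
	size_t ncookies;
};

/* Return true if @cookie holds a registration on @ref. */
bool model_registration_find(const struct model_ref *ref, const void *cookie)
{
	for (size_t i = 0; i < ref->ncookies; i++)
		if (ref->cookies[i] == cookie)
			return true;
	return false;
}

/* Remove @cookie from @ref; false marks the case where the kernel WARNs. */
bool model_unregister_one(struct model_ref *ref, const void *cookie)
{
	for (size_t i = 0; i < ref->ncookies; i++) {
		if (ref->cookies[i] == cookie) {
			/* Overwrite the removed slot with the last entry. */
			ref->cookies[i] = ref->cookies[--ref->ncookies];
			return true;
		}
	}
	return false;
}

/* Walk every ref of the pin. With @filter set, refs that were not
 * registered via @cookie are skipped, as in the patched loop.
 * Returns the number of missed lookups (would-be warnings).
 */
size_t model_on_pin_unregister(struct model_ref *refs, size_t nrefs,
			       const void *cookie, bool filter)
{
	size_t misses = 0;

	assert(refs != NULL);
	for (size_t i = 0; i < nrefs; i++) {
		if (filter && !model_registration_find(&refs[i], cookie))
			continue;
		if (!model_unregister_one(&refs[i], cookie))
			misses++;
	}
	return misses;
}
```

With refs[0] = dpll_X carrying {parent_A, parent_B} and refs[1] = dpll_Y
carrying {parent_B}, unregistering parent_A without the filter reports one
miss (the dpll_Y step); with the filter it reports none and leaves
parent_B's registrations on both entries untouched.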