Date: Sat, 9 May 2026 22:04:50 +0100 (BST)
From: "Maciej W. Rozycki"
To: Andrew Lunn, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next] FDDI: defza: Sanitise the reset safety timer

The reset actions of the DEFZA adapters are exceedingly slow, taking up
to 30 seconds to complete according to the device spec and typically in
the range of 10 seconds in reality, as required for the device RTOS to
boot, which is still quite a lot.  Therefore an interrupt-driven state
machine is used; however, a safety mechanism is required in case of
adapter malfunction, so that if no state change interrupt has arrived
in time, the situation is taken care of.

The safety mechanism depends on the origin of the reset.  For regular
adapter initialisation at device probe time a sleep is requested.
However, a reset is also required by the device spec when the adapter
has transitioned into the halted state, such as in response to a PC
Trace event in the course of ring fault recovery, possibly a common
network event.  In that case no sleep is possible, as a device halt is
reported at the hardirq level.

A timer is therefore set up to ensure progress in case no adapter state
change interrupt has arrived in time, but as from commit 168f6b6ffbee
("timers: Use del_timer_sync() even on UP") a warning is issued, as the
timer is deleted in the hardirq handler upon an expected state change:

defza: v.1.1.4 Oct 6 2018 Maciej W. Rozycki
tc2: DEC FDDIcontroller 700 or 700-C at 0x18000000, irq 4
tc2: resetting the board...
------------[ cut here ]------------
WARNING: kernel/time/timer.c:1611 at __timer_delete_sync+0x104/0x120, CPU#0: swapper/0/0
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 7.0.0-dirty #2 VOLUNTARY
Stack : 9800000002027d08 00000000140120e0 0000000000000000 ffffffff8089d468
        0000000000000000 0000000000000000 ffffffff807ed6b8 ffffffff80897458
        ffffffff80897400 9800000002027b88 0000000000000000 7070617773203a6d
        0000000000000000 9800000002027ba4 0000000000001000 6465746e69617420
        0000000000000000 ffffffff807ed6b8 00000000140120e0 0000000000000009
        000000000000064b ffffffff800dd14c 0000000000000036 9800000002184000
        0000000000000000 0000000000000020 0000000000000000 ffffffff80910000
        ffffffff8085c000 9800000002027c70 0000000000000001 ffffffff80045fa0
        0000000000000000 0000000000000000 0000000000000000 0000000000000009
        000000000000064b ffffffff800502b8 ffffffff807ed6b8 ffffffff80045fa0
        ...
Call Trace:
[] show_stack+0x28/0xf0
[] dump_stack_lvl+0x48/0x7c
[] __warn+0xa0/0x128
[] warn_slowpath_fmt+0x64/0xa4
[] __timer_delete_sync+0x104/0x120
[] fza_interrupt+0xc74/0xeb8
[] __handle_irq_event_percpu+0x70/0x228
[] handle_irq_event_percpu+0x18/0x78
[] handle_percpu_irq+0x50/0x80
[] generic_handle_irq+0x90/0xd0
[] do_IRQ+0x1c/0x30
[] handle_int+0x148/0x154
[] do_idle+0x40/0x108
[] cpu_startup_entry+0x2c/0x38
[] kernel_init+0x0/0x108

---[ end trace 0000000000000000 ]---
tc2: OK
tc2: model 700 (DEFZA-AA), MMF PMD, address 08-00-2b-xx-xx-xx
tc2: ROM rev. 1.0, firmware rev. 1.2, RMC rev. A, SMT ver. 1
tc2: link unavailable
------------[ cut here ]------------
WARNING: kernel/time/timer.c:1611 at __timer_delete_sync+0x104/0x120, CPU#0: swapper/0/0
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G W 7.0.0-dirty #2 VOLUNTARY
Tainted: [W]=WARN
Stack : 9800000002027d08 00000000140120e0 0000000000000000 ffffffff8089d468
        0000000000000000 0000000000000000 ffffffff807ed6b8 ffffffff80897458
        ffffffff80897400 9800000002027b88 0000000000000000 0000000000000000
        0000000000000000 9800000002027ba4 0000000000001000 0000000000000000
        0000000000000000 ffffffff807ed6b8 00000000140120e0 0000000000000009
        000000000000064b ffffffff800dd14c 0000000000000036 9800000002184000
        0000000000000000 0000000000000020 0000000000000000 ffffffff80910000
        ffffffff8085c000 9800000002027c70 0000000000000001 ffffffff80045fa0
        0000000000000000 0000000000000000 0000000000000000 0000000000000009
        000000000000064b ffffffff800502b8 ffffffff807ed6b8 ffffffff80045fa0
        ...
Call Trace:
[] show_stack+0x28/0xf0
[] dump_stack_lvl+0x48/0x7c
[] __warn+0xa0/0x128
[] warn_slowpath_fmt+0x64/0xa4
[] __timer_delete_sync+0x104/0x120
[] fza_interrupt+0xc74/0xeb8
[] __handle_irq_event_percpu+0x70/0x228
[] handle_irq_event_percpu+0x18/0x78
[] handle_percpu_irq+0x50/0x80
[] generic_handle_irq+0x90/0xd0
[] do_IRQ+0x1c/0x30
[] handle_int+0x148/0x154
[] arch_local_irq_disable+0x4/0x28
[] do_idle+0x50/0x108
[] cpu_startup_entry+0x2c/0x38
[] kernel_init+0x0/0x108

---[ end trace 0000000000000000 ]---
tc2: registered as fddi0

The immediate origin of the new warning is the switch away from aliasing
del_timer_sync() to del_timer() (timer_delete_sync() to timer_delete()
in terms of current function names) for UP configurations.  UP is
however the only choice for this driver anyway, as no SMP hardware
supports the TURBOchannel bus this device interfaces to.  The warning is
therefore a sign of a very remote issue only.

Specifically, suppose an adapter reset issued upon a transition to the
halted state times out and first triggers fza_reset_timer() for another
reset assertion, which then schedules fza_reset_timer() for reset
deassertion.  If that second call is pre-empted after poking at the
hardware but before the timer has been rearmed, and high system load
causes such scheduling latency that control is not handed back before a
transition to the uninitialised state has caused the timer to be
deleted, i.e. even before it has been started, then fza_reset_timer()
will be called yet again and issue another reset even though by then
the adapter has already recovered.

Prevent this situation from happening by switching to timer_delete() for
the transition to the halted state and protecting the affected code
region with a spinlock.  The lock also makes sure add_timer() is not
called twice in a row due to an execution race between the interrupt
handler and the timer handler (this could only happen on SMP, but let's
keep the driver clean).  It's a very unlikely sequence of events, so
there's no point in trying to be overly clever about it, such as by
placing printk() calls outside the protection.  For the transition to
the uninitialised state switch to timer_delete_sync_try() instead, so
that a timer that has just been rearmed by the timer handler, and needs
to watch for the device to come out of reset again, isn't deleted
(again, an SMP scenario only).

Retain timer_delete_sync() invocations outside the hardirq context so
that a stray timer does not fire once device structures have been
released.

Fixes: 61414f5ec9834 ("FDDI: defza: Add support for DEC FDDIcontroller 700 TURBOchannel adapter")
Signed-off-by: Maciej W. Rozycki
---
 drivers/net/fddi/defza.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

linux-defza-timer-lock.diff
Index: linux-macro/drivers/net/fddi/defza.c
===================================================================
--- linux-macro.orig/drivers/net/fddi/defza.c
+++ linux-macro/drivers/net/fddi/defza.c
@@ -984,7 +984,7 @@ static irqreturn_t fza_interrupt(int irq
 
 		case FZA_STATE_UNINITIALIZED:
 			netif_carrier_off(dev);
-			timer_delete_sync(&fp->reset_timer);
+			timer_delete_sync_try(&fp->reset_timer);
 			fp->ring_cmd_index = 0;
 			fp->ring_uns_index = 0;
 			fp->ring_rmc_tx_index = 0;
@@ -1018,7 +1018,9 @@ static irqreturn_t fza_interrupt(int irq
 			fp->queue_active = 0;
 			netif_stop_queue(dev);
 			pr_debug("%s: queue stopped\n", fp->name);
-			timer_delete_sync(&fp->reset_timer);
+
+			spin_lock(&fp->lock);
+			timer_delete(&fp->reset_timer);
 			pr_warn("%s: halted, reason: %x\n", fp->name,
 				FZA_STATUS_GET_HALT(status));
 			fza_regs_dump(fp);
@@ -1027,6 +1029,8 @@ static irqreturn_t fza_interrupt(int irq
 			fp->timer_state = 0;
 			fp->reset_timer.expires = jiffies + 45 * HZ;
 			add_timer(&fp->reset_timer);
+			spin_unlock(&fp->lock);
+
 			break;
 
 		default:
@@ -1046,7 +1050,9 @@ static irqreturn_t fza_interrupt(int irq
 static void fza_reset_timer(struct timer_list *t)
 {
 	struct fza_private *fp = timer_container_of(fp, t, reset_timer);
+	unsigned long flags;
 
+	spin_lock_irqsave(&fp->lock, flags);
 	if (!fp->timer_state) {
 		pr_err("%s: RESET timed out!\n", fp->name);
 		pr_info("%s: trying harder...\n", fp->name);
@@ -1069,6 +1075,7 @@ static void fza_reset_timer(struct timer
 		fp->reset_timer.expires = jiffies + 45 * HZ;
 	}
 	add_timer(&fp->reset_timer);
+	spin_unlock_irqrestore(&fp->lock, flags);
 }
 
 static int fza_set_mac_address(struct net_device *dev, void *addr)
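
For illustration only, the race described in the commit message can be
modelled in userspace as a single-threaded step sequence.  This is a
rough sketch under stated assumptions, not driver code: all the names
below (struct model, model_irq(), and so on) are hypothetical, the timer
is reduced to a "pending" flag, and spin_lock_irqsave() on UP is reduced
to masking interrupt delivery.  It also assumes the adapter raises its
state-change interrupt right after the hardware has been poked, which in
reality takes seconds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in defza.c. */
struct model {
	bool timer_pending;	/* stands in for an armed reset_timer */
	bool irqs_masked;	/* stands in for spin_lock_irqsave() on UP */
	bool irq_raised;	/* state-change interrupt awaiting service */
};

/*
 * Interrupt handler for the transition to the uninitialised state:
 * the adapter has recovered, so the safety timer is deleted.
 */
static void model_irq(struct model *m)
{
	m->timer_pending = false;
	m->irq_raised = false;
}

/* Deliver a raised interrupt unless interrupts are currently masked. */
static void model_deliver(struct model *m)
{
	if (m->irq_raised && !m->irqs_masked)
		model_irq(m);
}

/*
 * Timer handler: poke the hardware, then rearm the timer.  The adapter
 * is assumed to raise its interrupt between those two steps.
 */
static void model_timer_handler(struct model *m, bool use_lock)
{
	m->timer_pending = false;	/* the timer has just fired */
	if (use_lock)
		m->irqs_masked = true;

	m->irq_raised = true;		/* step 1: hardware poked, adapter recovers */
	model_deliver(m);		/* handler is interrupted here if unmasked */

	m->timer_pending = true;	/* step 2: rearm the timer */

	if (use_lock) {
		m->irqs_masked = false;
		model_deliver(m);	/* the deferred interrupt runs now */
	}
}

/* Returns true if a stale timer is left armed after the adapter recovered. */
bool stale_timer_after_recovery(bool use_lock)
{
	struct model m = { false, false, false };

	model_timer_handler(&m, use_lock);
	return m.timer_pending;
}
```

In this model stale_timer_after_recovery(false) leaves the timer armed,
which corresponds to the spurious extra reset, while
stale_timer_after_recovery(true) does not, because the deletion can only
happen after the timer has been rearmed.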