[PATCH v2 net] octeontx2-pf: Fix use-after-free bugs in otx2_sync_tstamp()

Posted by Duoming Zhou 2 weeks, 1 day ago
The original code relies on cancel_delayed_work() in otx2_ptp_destroy(),
which does not ensure that the delayed work item synctstamp_work has fully
completed if it was already running. This leads to use-after-free scenarios
where otx2_ptp is deallocated by otx2_ptp_destroy(), while synctstamp_work
remains active and attempts to dereference otx2_ptp in otx2_sync_tstamp().
Furthermore, because synctstamp_work is cyclic, the likelihood of triggering
the bug is non-negligible.

A typical race condition is illustrated below:

CPU 0 (cleanup)           | CPU 1 (delayed work callback)
otx2_remove()             |
  otx2_ptp_destroy()      | otx2_sync_tstamp()
    cancel_delayed_work() |
    kfree(ptp)            |
                          |   ptp = container_of(...); //UAF
                          |   ptp-> //UAF

This is confirmed by a KASAN report:

BUG: KASAN: slab-use-after-free in __run_timer_base.part.0+0x7d7/0x8c0
Write of size 8 at addr ffff88800aa09a18 by task bash/136
...
Call Trace:
 <IRQ>
 dump_stack_lvl+0x55/0x70
 print_report+0xcf/0x610
 ? __run_timer_base.part.0+0x7d7/0x8c0
 kasan_report+0xb8/0xf0
 ? __run_timer_base.part.0+0x7d7/0x8c0
 __run_timer_base.part.0+0x7d7/0x8c0
 ? __pfx___run_timer_base.part.0+0x10/0x10
 ? __pfx_read_tsc+0x10/0x10
 ? ktime_get+0x60/0x140
 ? lapic_next_event+0x11/0x20
 ? clockevents_program_event+0x1d4/0x2a0
 run_timer_softirq+0xd1/0x190
 handle_softirqs+0x16a/0x550
 irq_exit_rcu+0xaf/0xe0
 sysvec_apic_timer_interrupt+0x70/0x80
 </IRQ>
...
Allocated by task 1:
 kasan_save_stack+0x24/0x50
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0x7f/0x90
 otx2_ptp_init+0xb1/0x860
 otx2_probe+0x4eb/0xc30
 local_pci_probe+0xdc/0x190
 pci_device_probe+0x2fe/0x470
 really_probe+0x1ca/0x5c0
 __driver_probe_device+0x248/0x310
 driver_probe_device+0x44/0x120
 __driver_attach+0xd2/0x310
 bus_for_each_dev+0xed/0x170
 bus_add_driver+0x208/0x500
 driver_register+0x132/0x460
 do_one_initcall+0x89/0x300
 kernel_init_freeable+0x40d/0x720
 kernel_init+0x1a/0x150
 ret_from_fork+0x10c/0x1a0
 ret_from_fork_asm+0x1a/0x30

Freed by task 136:
 kasan_save_stack+0x24/0x50
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3a/0x60
 __kasan_slab_free+0x3f/0x50
 kfree+0x137/0x370
 otx2_ptp_destroy+0x38/0x80
 otx2_remove+0x10d/0x4c0
 pci_device_remove+0xa6/0x1d0
 device_release_driver_internal+0xf8/0x210
 pci_stop_bus_device+0x105/0x150
 pci_stop_and_remove_bus_device_locked+0x15/0x30
 remove_store+0xcc/0xe0
 kernfs_fop_write_iter+0x2c3/0x440
 vfs_write+0x871/0xd70
 ksys_write+0xee/0x1c0
 do_syscall_64+0xac/0x280
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
...

Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the delayed work item is canceled, and that any running instance has
finished, before otx2_ptp is deallocated.
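
For illustration only, the same teardown ordering can be sketched in
userspace C with pthreads. This is an analogy, not driver code: struct
demo_ptp, demo_sync_tstamp(), demo_ptp_init() and demo_ptp_destroy() are
hypothetical names, and pthread_join() merely stands in for the "wait
until the callback has finished" guarantee of cancel_delayed_work_sync().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for struct otx2_ptp: owns a cyclic worker. */
struct demo_ptp {
	pthread_t worker;
	atomic_bool stop;	/* set by teardown to end the cycle */
	atomic_long ticks;	/* written by the worker on every cycle */
};

/* Cyclic callback, loosely analogous to otx2_sync_tstamp(). */
static void *demo_sync_tstamp(void *arg)
{
	struct demo_ptp *ptp = arg;
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

	do {
		/* Dereferences ptp; a UAF if ptp were already freed. */
		atomic_fetch_add(&ptp->ticks, 1);
		nanosleep(&ts, NULL);
	} while (!atomic_load(&ptp->stop));

	return NULL;
}

struct demo_ptp *demo_ptp_init(void)
{
	struct demo_ptp *ptp = calloc(1, sizeof(*ptp));

	if (!ptp)
		return NULL;
	atomic_init(&ptp->stop, false);
	atomic_init(&ptp->ticks, 0);
	if (pthread_create(&ptp->worker, NULL, demo_sync_tstamp, ptp)) {
		free(ptp);
		return NULL;
	}
	return ptp;
}

/* Stops the worker, waits for it, then frees. Returns cycles run. */
long demo_ptp_destroy(struct demo_ptp *ptp)
{
	long ticks;

	assert(ptp != NULL);
	atomic_store(&ptp->stop, true);		/* request cancellation */
	pthread_join(ptp->worker, NULL);	/* the "_sync" step */
	ticks = atomic_load(&ptp->ticks);
	free(ptp);				/* safe: worker has exited */
	return ticks;
}
```

Dropping the pthread_join() call from demo_ptp_destroy() would recreate
the race shown in the diagram above: the worker could still be touching
ptp after free() returns.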

This bug was initially identified through static analysis. To reproduce
and test it, I simulated the OcteonTX2 PCI device in QEMU and introduced
artificial delays within the otx2_sync_tstamp() function to increase the
likelihood of triggering the bug.

Fixes: 2958d17a8984 ("octeontx2-pf: Add support for ptp 1-step mode on CN10K silicon")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
---
Changes in v2:
  - Describe how the issue was discovered and how the patch was tested.

 drivers/net/ethernet/marvell/octeontx2/nic/otx2_ptp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ptp.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ptp.c
index e52cc6b1a26c..dedd586ed310 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ptp.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ptp.c
@@ -491,7 +491,7 @@ void otx2_ptp_destroy(struct otx2_nic *pfvf)
 	if (!ptp)
 		return;
 
-	cancel_delayed_work(&pfvf->ptp->synctstamp_work);
+	cancel_delayed_work_sync(&pfvf->ptp->synctstamp_work);
 
 	ptp_clock_unregister(ptp->ptp_clock);
 	kfree(ptp);
-- 
2.34.1
Re: [PATCH v2 net] octeontx2-pf: Fix use-after-free bugs in otx2_sync_tstamp()
Posted by Vadim Fedorenko 2 weeks ago
On 17/09/2025 07:38, Duoming Zhou wrote:
> @@ -491,7 +491,7 @@ void otx2_ptp_destroy(struct otx2_nic *pfvf)
>   	if (!ptp)
>   		return;
>   
> -	cancel_delayed_work(&pfvf->ptp->synctstamp_work);
> +	cancel_delayed_work_sync(&pfvf->ptp->synctstamp_work);
>   
>   	ptp_clock_unregister(ptp->ptp_clock);
>   	kfree(ptp);

Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>