This adds a check before freeing the rx->skb in flush and close
functions to handle the kernel crash seen while removing driver after FW
download fails or before FW download completes.
dmesg log:
[ 54.634586] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080
[ 54.643398] Mem abort info:
[ 54.646204] ESR = 0x0000000096000004
[ 54.649964] EC = 0x25: DABT (current EL), IL = 32 bits
[ 54.655286] SET = 0, FnV = 0
[ 54.658348] EA = 0, S1PTW = 0
[ 54.661498] FSC = 0x04: level 0 translation fault
[ 54.666391] Data abort info:
[ 54.669273] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 54.674768] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 54.674771] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 54.674775] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000048860000
[ 54.674780] [0000000000000080] pgd=0000000000000000, p4d=0000000000000000
[ 54.703880] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 54.710152] Modules linked in: btnxpuart(-) overlay fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine authenc libdes crct10dif_ce polyval_ce polyval_generic snd_soc_imx_spdif snd_soc_imx_card snd_soc_ak5558 snd_soc_ak4458 caam secvio error snd_soc_fsl_micfil snd_soc_fsl_spdif snd_soc_fsl_sai snd_soc_fsl_utils imx_pcm_dma gpio_ir_recv rc_core sch_fq_codel fuse
[ 54.744357] CPU: 3 PID: 72 Comm: kworker/u9:0 Not tainted 6.6.3-otbr-g128004619037 #2
[ 54.744364] Hardware name: FSL i.MX8MM EVK board (DT)
[ 54.744368] Workqueue: hci0 hci_power_on
[ 54.757244] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 54.757249] pc : kfree_skb_reason+0x18/0xb0
[ 54.772299] lr : btnxpuart_flush+0x40/0x58 [btnxpuart]
[ 54.782921] sp : ffff8000805ebca0
[ 54.782923] x29: ffff8000805ebca0 x28: ffffa5c6cf1869c0 x27: ffffa5c6cf186000
[ 54.782931] x26: ffff377b84852400 x25: ffff377b848523c0 x24: ffff377b845e7230
[ 54.782938] x23: ffffa5c6ce8dbe08 x22: ffffa5c6ceb65410 x21: 00000000ffffff92
[ 54.782945] x20: ffffa5c6ce8dbe98 x19: ffffffffffffffac x18: ffffffffffffffff
[ 54.807651] x17: 0000000000000000 x16: ffffa5c6ce2824ec x15: ffff8001005eb857
[ 54.821917] x14: 0000000000000000 x13: ffffa5c6cf1a02e0 x12: 0000000000000642
[ 54.821924] x11: 0000000000000040 x10: ffffa5c6cf19d690 x9 : ffffa5c6cf19d688
[ 54.821931] x8 : ffff377b86000028 x7 : 0000000000000000 x6 : 0000000000000000
[ 54.821938] x5 : ffff377b86000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 54.843331] x2 : 0000000000000000 x1 : 0000000000000002 x0 : ffffffffffffffac
[ 54.857599] Call trace:
[ 54.857601] kfree_skb_reason+0x18/0xb0
[ 54.863878] btnxpuart_flush+0x40/0x58 [btnxpuart]
[ 54.863888] hci_dev_open_sync+0x3a8/0xa04
[ 54.872773] hci_power_on+0x54/0x2e4
[ 54.881832] process_one_work+0x138/0x260
[ 54.881842] worker_thread+0x32c/0x438
[ 54.881847] kthread+0x118/0x11c
[ 54.881853] ret_from_fork+0x10/0x20
[ 54.896406] Code: a9be7bfd 910003fd f9000bf3 aa0003f3 (b940d400)
[ 54.896410] ---[ end trace 0000000000000000 ]---
Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
Tested-by: Guillaume Legoupil <guillaume.legoupil@nxp.com>
---
drivers/bluetooth/btnxpuart.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/bluetooth/btnxpuart.c b/drivers/bluetooth/btnxpuart.c
index 0b93c2ff29e4..0677b48a456e 100644
--- a/drivers/bluetooth/btnxpuart.c
+++ b/drivers/bluetooth/btnxpuart.c
@@ -1253,8 +1253,10 @@ static int btnxpuart_close(struct hci_dev *hdev)
ps_wakeup(nxpdev);
serdev_device_close(nxpdev->serdev);
skb_queue_purge(&nxpdev->txq);
- kfree_skb(nxpdev->rx_skb);
- nxpdev->rx_skb = NULL;
+ if (!IS_ERR_OR_NULL(nxpdev->rx_skb)) {
+ kfree_skb(nxpdev->rx_skb);
+ nxpdev->rx_skb = NULL;
+ }
clear_bit(BTNXPUART_SERDEV_OPEN, &nxpdev->tx_state);
return 0;
}
@@ -1269,8 +1271,10 @@ static int btnxpuart_flush(struct hci_dev *hdev)
cancel_work_sync(&nxpdev->tx_work);
- kfree_skb(nxpdev->rx_skb);
- nxpdev->rx_skb = NULL;
+ if (!IS_ERR_OR_NULL(nxpdev->rx_skb)) {
+ kfree_skb(nxpdev->rx_skb);
+ nxpdev->rx_skb = NULL;
+ }
return 0;
}
--
2.34.1
Hi,
Neeraj Sanjay Kale wrote on Wed, May 15, 2024 at 12:36:55PM +0530:
> @@ -1269,8 +1271,10 @@ static int btnxpuart_flush(struct hci_dev *hdev)
>
> cancel_work_sync(&nxpdev->tx_work);
>
> - kfree_skb(nxpdev->rx_skb);
> - nxpdev->rx_skb = NULL;
> + if (!IS_ERR_OR_NULL(nxpdev->rx_skb)) {
> + kfree_skb(nxpdev->rx_skb);
> + nxpdev->rx_skb = NULL;
> + }
This is an old patch but I hit that on our slightly old tree and was
wondering about this patch: why does flush() have to free rx at all?
I think this either needs a lock or (preferably) just remove this free:
- This is inherently racy with btnxpuart_receive_buf() which is run in
another workqueue with no lock involved as far as I understand, so this
is not just about errors but you could also free something in a weird
place here.
As far as I understand, even if we don't do anything, the rx path will
free the reply if no matching request was found.
- looking at other drivers, the hdev->flush() call never does anything
about rx and seems to only be about rx
(ah, checking again as of master drivers/bluetooth/btmtkuart.c seems to
have this same problem as of before this patch e.g. they're not checking
for errors either... This probably needs something akin to this patch or
removal as well. All other drivers have flush seem to be mostly about
tx, but I do see some cancel work for rx as well so I'm a bit unclear as
to what is expected of flush())
Thank you,
--
Dominique
© 2016 - 2026 Red Hat, Inc.