[ Resending due to cut and paste failure of email address ]
From: Steven Rostedt (Google) <rostedt@goodmis.org>
I was looking at a crash report on a timer list being corrupted, which
usually happens when a timer is freed while still active. This is
commonly triggered by code calling del_timer() instead of
del_timer_sync() just before freeing.
One possible culprit is the hci_qca driver, which does exactly that.
Cc: stable@vger.kernel.org
Fixes: 0ff252c1976da ("Bluetooth: hciuart: Add support QCA chipset for UART")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index f6e91fb432a3..73a8c72b5aae 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -696,8 +696,8 @@ static int qca_close(struct hci_uart *hu)
 	skb_queue_purge(&qca->tx_wait_q);
 	skb_queue_purge(&qca->txq);
 	skb_queue_purge(&qca->rx_memdump_q);
-	del_timer(&qca->tx_idle_timer);
-	del_timer(&qca->wake_retrans_timer);
+	del_timer_sync(&qca->tx_idle_timer);
+	del_timer_sync(&qca->wake_retrans_timer);
 	destroy_workqueue(qca->workqueue);
 	qca->hu = NULL;
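
For readers following the thread, the distinction the patch relies on can be sketched with a small userspace model. This is not kernel code: struct model_timer and the model_* helpers are hypothetical stand-ins, and the waiting that del_timer_sync() performs is collapsed into a single assignment. It only illustrates that del_timer() dequeues a pending timer without waiting for a handler that is already running, whereas del_timer_sync() also waits for that handler to finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a timer; NOT the kernel's struct timer_list. */
struct model_timer {
	bool pending;	/* queued, handler has not started yet */
	bool running;	/* handler is executing on another CPU */
};

/* del_timer() analogue: dequeue if pending, never wait for the handler. */
static bool model_del_timer(struct model_timer *t)
{
	bool was_pending;

	assert(t != NULL);
	was_pending = t->pending;
	t->pending = false;
	return was_pending;
}

/* del_timer_sync() analogue: dequeue, then wait until the handler is done. */
static bool model_del_timer_sync(struct model_timer *t)
{
	bool was_pending = model_del_timer(t);

	/* The real call waits here; the model just lets the handler finish. */
	t->running = false;
	return was_pending;
}

/* The containing object may be freed only if the timer is fully idle. */
static bool model_safe_to_free(const struct model_timer *t)
{
	assert(t != NULL);
	return !t->pending && !t->running;
}
```

In this model, a timer whose handler is still running stays unsafe to free after model_del_timer() and becomes safe only after model_del_timer_sync().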
On 4/4/22 15:22, Steven Rostedt wrote:
> [ Resending due to cut and paste failure of email address ]
>
> From: Steven Rostedt (Google) <rostedt@goodmis.org>
>
> I was looking at a crash report on a timer list being corrupted, which
> usually happens when a timer is freed while still active. This is
> commonly triggered by code calling del_timer() instead of
> del_timer_sync() just before freeing.
>
> One possible culprit is the hci_qca driver, which does exactly that.
>
> Cc: stable@vger.kernel.org
> Fixes: 0ff252c1976da ("Bluetooth: hciuart: Add support QCA chipset for UART")
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> index f6e91fb432a3..73a8c72b5aae 100644
> --- a/drivers/bluetooth/hci_qca.c
> +++ b/drivers/bluetooth/hci_qca.c
> @@ -696,8 +696,8 @@ static int qca_close(struct hci_uart *hu)
> skb_queue_purge(&qca->tx_wait_q);
> skb_queue_purge(&qca->txq);
> skb_queue_purge(&qca->rx_memdump_q);
> - del_timer(&qca->tx_idle_timer);
> - del_timer(&qca->wake_retrans_timer);
> + del_timer_sync(&qca->tx_idle_timer);
> + del_timer_sync(&qca->wake_retrans_timer);

It seems the wake_retrans_timer could be re-armed from a work queue.
So perhaps we need to make sure qca->workqueue is destroyed
before these del_timer_sync() calls ?

> destroy_workqueue(qca->workqueue);

ie move this destroy_workqueue() up ?

> qca->hu = NULL;
>
On Mon, 4 Apr 2022 17:22:00 -0700
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> > diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> > index f6e91fb432a3..73a8c72b5aae 100644
> > --- a/drivers/bluetooth/hci_qca.c
> > +++ b/drivers/bluetooth/hci_qca.c
> > @@ -696,8 +696,8 @@ static int qca_close(struct hci_uart *hu)
> > skb_queue_purge(&qca->tx_wait_q);
> > skb_queue_purge(&qca->txq);
> > skb_queue_purge(&qca->rx_memdump_q);
> > - del_timer(&qca->tx_idle_timer);
> > - del_timer(&qca->wake_retrans_timer);
> > + del_timer_sync(&qca->tx_idle_timer);
> > + del_timer_sync(&qca->wake_retrans_timer);
>
>
> It seems the wake_retrans_timer could be re-armed from a work queue.
>
> So perhaps we need to make sure qca->workqueue is destroyed
>
> before these del_timer_sync() calls ?
>
> > destroy_workqueue(qca->workqueue);
>
>
> ie move this destroy_workqueue() up ?

Yeah, that could be a problem. I would think moving it up would help, if
that's what requeues the timers.

-- Steve

>
> > qca->hu = NULL;
> >
On Mon, Apr 04, 2022 at 05:22:00PM -0700, Eric Dumazet wrote:
>
> On 4/4/22 15:22, Steven Rostedt wrote:
> > [ Resending due to cut and paste failure of email address ]
> >
> > From: Steven Rostedt (Google) <rostedt@goodmis.org>
> >
> > I was looking at a crash report on a timer list being corrupted, which
> > usually happens when a timer is freed while still active. This is
> > commonly triggered by code calling del_timer() instead of
> > del_timer_sync() just before freeing.
> >
> > One possible culprit is the hci_qca driver, which does exactly that.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: 0ff252c1976da ("Bluetooth: hciuart: Add support QCA chipset for UART")
> > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> > ---
> > diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> > index f6e91fb432a3..73a8c72b5aae 100644
> > --- a/drivers/bluetooth/hci_qca.c
> > +++ b/drivers/bluetooth/hci_qca.c
> > @@ -696,8 +696,8 @@ static int qca_close(struct hci_uart *hu)
> > skb_queue_purge(&qca->tx_wait_q);
> > skb_queue_purge(&qca->txq);
> > skb_queue_purge(&qca->rx_memdump_q);
> > - del_timer(&qca->tx_idle_timer);
> > - del_timer(&qca->wake_retrans_timer);
> > + del_timer_sync(&qca->tx_idle_timer);
> > + del_timer_sync(&qca->wake_retrans_timer);
>
>
> It seems the wake_retrans_timer could be re-armed from a work queue.
>
> So perhaps we need to make sure qca->workqueue is destroyed
>
> before these del_timer_sync() calls ?
>
> > destroy_workqueue(qca->workqueue);
>
>
> ie move this destroy_workqueue() up ?
>
What prevents the timer code from queueing work into the destroyed
workqueue ?
Thanks,
Guenter
On Wed, 6 Apr 2022 08:39:07 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> > ie move this destroy_workqueue() up ?
> >
>
> What prevents the timer code from queueing work into the destroyed
> workqueue ?

So we have a chicken versus egg issue here?

-- Steve
On 4/6/22 08:46, Steven Rostedt wrote:
> On Wed, 6 Apr 2022 08:39:07 -0700
> Guenter Roeck <linux@roeck-us.net> wrote:
>
>>> ie move this destroy_workqueue() up ?
>>>
>>
>> What prevents the timer code from queueing work into the destroyed
>> workqueue ?
>
> So we have a chicken versus egg issue here?
>

Almost looks like it, unless I am missing something. Maybe some flag
is needed to prevent the timer handling code from queuing into the
destroyed workqueue, or the workqueue handler from updating the timer.

Guenter
On Wed, 6 Apr 2022 09:36:10 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> > So we have a chicken versus egg issue here?
> >
>
> Almost looks like it, unless I am missing something. Maybe some flag
> is needed to prevent the timer handling code from queuing into the
> destroyed workqueue, or the workqueue handler from updating the timer.

That's exactly what I was thinking. I do not know all the code here. I
could try to write a patch, but I may likely miss something.

-- Steve
On 4/6/22 09:46, Steven Rostedt wrote:
> On Wed, 6 Apr 2022 09:36:10 -0700
> Guenter Roeck <linux@roeck-us.net> wrote:
>
>>> So we have a chicken versus egg issue here?
>>>
>> Almost looks like it, unless I am missing something. Maybe some flag
>> is needed to prevent the timer handling code from queuing into the
>> destroyed workqueue, or the workqueue handler from updating the timer.
> That's exactly what I was thinking. I do not know all the code here. I
> could try to write a patch, but I may likely miss something.
>
> -- Steve

Take a look at

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=1946014ca3b19be9e485e780e862c375c6f98bad

Ie, use an ->live (or ->dead) field.
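
The flag idea discussed above can be illustrated with a deliberately simplified, single-threaded userspace model. Everything in it is an assumption made for illustration: struct model_dev, its fields, and the helper names are hypothetical and are not taken from hci_qca.c or from the commit linked above, and a real driver would also need proper locking or memory ordering around the flag. The sketch only shows why clearing a ->live style flag first lets the timer and the workqueue be torn down in either order without one re-arming the other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; names are illustrative, not from hci_qca.c. */
struct model_dev {
	bool live;		/* cleared first in close(); gates re-arming */
	bool use_flag;		/* false reproduces the unguarded behaviour */
	bool timer_pending;	/* stands in for an armed timer */
	bool work_pending;	/* stands in for work queued on the workqueue */
	bool wq_destroyed;
	int bad_queue;		/* work queued after the workqueue was destroyed */
};

static bool may_rearm(const struct model_dev *d)
{
	return !d->use_flag || d->live;
}

/* Timer handler: queues work, unless teardown has started. */
static void timer_fire(struct model_dev *d)
{
	assert(d != NULL);
	if (!d->timer_pending)
		return;
	d->timer_pending = false;
	if (!may_rearm(d))
		return;
	if (d->wq_destroyed)
		d->bad_queue++;
	d->work_pending = true;
}

/* Work handler: re-arms the timer, unless teardown has started. */
static void work_run(struct model_dev *d)
{
	assert(d != NULL);
	if (!d->work_pending)
		return;
	d->work_pending = false;
	if (may_rearm(d))
		d->timer_pending = true;
}

/* Teardown: clear the flag, stop the timer, then drain and destroy. */
static void model_close(struct model_dev *d)
{
	d->live = false;
	d->timer_pending = false;	/* del_timer_sync() analogue */
	work_run(d);			/* destroy_workqueue() drains pending work */
	d->wq_destroyed = true;
}

static bool quiescent(const struct model_dev *d)
{
	return !d->timer_pending && !d->work_pending && d->bad_queue == 0;
}
```

With use_flag set, model_close() leaves the model quiescent. With it cleared, the drained work re-arms the timer after it was stopped, and a later timer_fire() queues work onto the destroyed workqueue, which is the circular dependency raised in the thread.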