[v2] can: m_can: fix pm_runtime and CAN state handling

[PATCH v2 2/7] can: m_can: only handle active interrupts

Posted by Marc Kleine-Budde 5 months ago

The M_CAN IP core has an Interrupt Register (IR) and an Interrupt
Enable (IE) register. An interrupt is triggered if at least 1 bit is
set in the bitwise and of IR and IE.

Depending on the configuration not all interrupts are enabled in the
IE register. However the m_can_rx_handler() IRQ handler looks at all
interrupts not just the enabled ones. This may lead to handling of not
activated interrupts.

Fix the problem and mask the irqstatus (IR register) with the
active_interrupts (cached value of IE register).

Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/m_can/m_can.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index fe74dbd2c966..16b38e6c3985 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1057,6 +1057,7 @@ static int m_can_poll(struct napi_struct *napi, int quota)
 	u32 irqstatus;
 
 	irqstatus = cdev->irqstatus | m_can_read(cdev, M_CAN_IR);
+	irqstatus &= cdev->active_interrupts;
 
 	work_done = m_can_rx_handler(dev, quota, irqstatus);
 
@@ -1243,6 +1244,8 @@ static int m_can_interrupt_handler(struct m_can_classdev *cdev)
 	}
 
 	m_can_coalescing_update(cdev, ir);
+
+	ir &= cdev->active_interrupts;
 	if (!ir)
 		return IRQ_NONE;
 

-- 
2.51.0

Re: [PATCH v2 2/7] can: m_can: only handle active interrupts

Posted by Markus Schneider-Pargmann 5 months ago

On Tue Sep 9, 2025 at 7:53 PM CEST, Marc Kleine-Budde wrote:
> The M_CAN IP core has an Interrupt Register (IR) and an Interrupt
> Enable (IE) register. An interrupt is triggered if at least 1 bit is
> set in the bitwise and of IR and IE.
>
> Depending on the configuration not all interrupts are enabled in the
> IE register. However the m_can_rx_handler() IRQ handler looks at all
> interrupts not just the enabled ones. This may lead to handling of not
> activated interrupts.
>
> Fix the problem and mask the irqstatus (IR register) with the
> active_interrupts (cached value of IE register).
>
> Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support")
> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
> ---
>  drivers/net/can/m_can/m_can.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
> index fe74dbd2c966..16b38e6c3985 100644
> --- a/drivers/net/can/m_can/m_can.c
> +++ b/drivers/net/can/m_can/m_can.c
> @@ -1057,6 +1057,7 @@ static int m_can_poll(struct napi_struct *napi, int quota)
>  	u32 irqstatus;
>  
>  	irqstatus = cdev->irqstatus | m_can_read(cdev, M_CAN_IR);
> +	irqstatus &= cdev->active_interrupts;
>  
>  	work_done = m_can_rx_handler(dev, quota, irqstatus);
>  
> @@ -1243,6 +1244,8 @@ static int m_can_interrupt_handler(struct m_can_classdev *cdev)
>  	}
>  
>  	m_can_coalescing_update(cdev, ir);
> +
> +	ir &= cdev->active_interrupts;

m_can_coalescing_update() can change active_interrupts, meaning the
interrupt that caused the interrupt handler to run may be disabled in
active_interrupts above and then masked in this added line. Would that
still work or does it confuse the hardware?

Best
Markus

>  	if (!ir)
>  		return IRQ_NONE;
>

Re: [PATCH v2 2/7] can: m_can: only handle active interrupts

Posted by Marc Kleine-Budde 5 months ago

On 10.09.2025 10:41:28, Markus Schneider-Pargmann wrote:
> > diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
> > index fe74dbd2c966..16b38e6c3985 100644
> > --- a/drivers/net/can/m_can/m_can.c
> > +++ b/drivers/net/can/m_can/m_can.c
> > @@ -1057,6 +1057,7 @@ static int m_can_poll(struct napi_struct *napi, int quota)
> >  	u32 irqstatus;
> >  
> >  	irqstatus = cdev->irqstatus | m_can_read(cdev, M_CAN_IR);
> > +	irqstatus &= cdev->active_interrupts;
> >  
> >  	work_done = m_can_rx_handler(dev, quota, irqstatus);
> >  
> > @@ -1243,6 +1244,8 @@ static int m_can_interrupt_handler(struct m_can_classdev *cdev)
> >  	}
> >  
> >  	m_can_coalescing_update(cdev, ir);
> > +
> > +	ir &= cdev->active_interrupts;
> 
> m_can_coalescing_update() can change active_interrupts, meaning the
> interrupt that caused the interrupt handler to run may be disabled in
> active_interrupts above and then masked in this added line. Would that
> still work or does it confuse the hardware?

I think m_can_coalescing_update() expects the RX/TX will be cleared. Are
the following comments OK...

| diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
| index 16b38e6c3985..8cb9cc1cddbf 100644
| --- a/drivers/net/can/m_can/m_can.c
| +++ b/drivers/net/can/m_can/m_can.c
| @@ -1188,28 +1188,39 @@ static int m_can_echo_tx_event(struct net_device *dev)
|  
|  static void m_can_coalescing_update(struct m_can_classdev *cdev, u32 ir)
|  {
|          u32 new_interrupts = cdev->active_interrupts;
|          bool enable_rx_timer = false;
|          bool enable_tx_timer = false;
|  
|          if (!cdev->net->irq)
|                  return;
|  
| +        /* If there is a packet in the FIFO then:
| +         * - start timer
| +         * - disable not empty IRQ
| +         * - handle FIFO
                ^^^^^^^^^^^

...especially this one?

| +         */
|          if (cdev->rx_coalesce_usecs_irq > 0 && (ir & (IR_RF0N | IR_RF0W))) {
|                  enable_rx_timer = true;
|                  new_interrupts &= ~IR_RF0N;
|          }
|          if (cdev->tx_coalesce_usecs_irq > 0 && (ir & (IR_TEFN | IR_TEFW))) {
|                  enable_tx_timer = true;
|                  new_interrupts &= ~IR_TEFN;
|          }
| +
| +        /* If:
| +         * - timer is not going to be start
| +         * - and timer is not active
| +         * -> then enable FIFO empty IRQ
| +         */
|          if (!enable_rx_timer && !hrtimer_active(&cdev->hrtimer))
|                  new_interrupts |= IR_RF0N;
|          if (!enable_tx_timer && !hrtimer_active(&cdev->hrtimer))
|                  new_interrupts |= IR_TEFN;
|  
|          m_can_interrupt_enable(cdev, new_interrupts);
|          if (enable_rx_timer | enable_tx_timer)
|                  hrtimer_start(&cdev->hrtimer, cdev->irq_timer_wait,
|                                HRTIMER_MODE_REL);
|  }

Currently the m_can_coalescing_update() is called at the beginning of
the IRQ handler. Does it make sense to move it to the end and pass the
unmasked M_CAN_IR?

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |

Re: [PATCH v2 2/7] can: m_can: only handle active interrupts

Posted by Marc Kleine-Budde 5 months ago

On 10.09.2025 16:28:54, Marc Kleine-Budde wrote:
> On 10.09.2025 10:41:28, Markus Schneider-Pargmann wrote:
> > > diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
> > > index fe74dbd2c966..16b38e6c3985 100644
> > > --- a/drivers/net/can/m_can/m_can.c
> > > +++ b/drivers/net/can/m_can/m_can.c
> > > @@ -1057,6 +1057,7 @@ static int m_can_poll(struct napi_struct *napi, int quota)
> > >  	u32 irqstatus;
> > >  
> > >  	irqstatus = cdev->irqstatus | m_can_read(cdev, M_CAN_IR);
> > > +	irqstatus &= cdev->active_interrupts;
> > >  
> > >  	work_done = m_can_rx_handler(dev, quota, irqstatus);
> > >  
> > > @@ -1243,6 +1244,8 @@ static int m_can_interrupt_handler(struct m_can_classdev *cdev)
> > >  	}
> > >  
> > >  	m_can_coalescing_update(cdev, ir);
> > > +
> > > +	ir &= cdev->active_interrupts;
> > 
> > m_can_coalescing_update() can change active_interrupts, meaning the
> > interrupt that caused the interrupt handler to run may be disabled in
> > active_interrupts above and then masked in this added line. Would that
> > still work or does it confuse the hardware?
> 
> I think m_can_coalescing_update() expects the RX/TX will be cleared. Are
> the following comments OK...
> 
> | diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
> | index 16b38e6c3985..8cb9cc1cddbf 100644
> | --- a/drivers/net/can/m_can/m_can.c
> | +++ b/drivers/net/can/m_can/m_can.c
> | @@ -1188,28 +1188,39 @@ static int m_can_echo_tx_event(struct net_device *dev)
> |  
> |  static void m_can_coalescing_update(struct m_can_classdev *cdev, u32 ir)
> |  {
> |          u32 new_interrupts = cdev->active_interrupts;
> |          bool enable_rx_timer = false;
> |          bool enable_tx_timer = false;
> |  
> |          if (!cdev->net->irq)
> |                  return;
> |  
> | +        /* If there is a packet in the FIFO then:
> | +         * - start timer
> | +         * - disable not empty IRQ
> | +         * - handle FIFO
>                 ^^^^^^^^^^^
> 
> ...especially this one?
> 
> | +         */
> |          if (cdev->rx_coalesce_usecs_irq > 0 && (ir & (IR_RF0N | IR_RF0W))) {
> |                  enable_rx_timer = true;
> |                  new_interrupts &= ~IR_RF0N;
> |          }
> |          if (cdev->tx_coalesce_usecs_irq > 0 && (ir & (IR_TEFN | IR_TEFW))) {
> |                  enable_tx_timer = true;
> |                  new_interrupts &= ~IR_TEFN;
> |          }
> | +
> | +        /* If:
> | +         * - timer is not going to be start
> | +         * - and timer is not active
> | +         * -> then enable FIFO empty IRQ
> | +         */
> |          if (!enable_rx_timer && !hrtimer_active(&cdev->hrtimer))
> |                  new_interrupts |= IR_RF0N;
> |          if (!enable_tx_timer && !hrtimer_active(&cdev->hrtimer))
> |                  new_interrupts |= IR_TEFN;
> |  
> |          m_can_interrupt_enable(cdev, new_interrupts);
> |          if (enable_rx_timer | enable_tx_timer)
> |                  hrtimer_start(&cdev->hrtimer, cdev->irq_timer_wait,
> |                                HRTIMER_MODE_REL);
> |  }

I can't reproduce the problem I had before. I will drop this patch for
now.

In an upcoming series, however, I would still like to move
can_coalescing_update() to the end of the IRQ handler.

> Currently the m_can_coalescing_update() is called at the beginning of
> the IRQ handler. Does it make sense to move it to the end and pass the
> unmasked M_CAN_IR?

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |

Re: [PATCH v2 2/7] can: m_can: only handle active interrupts

Posted by Markus Schneider-Pargmann 5 months ago

On Wed Sep 10, 2025 at 5:06 PM CEST, Marc Kleine-Budde wrote:
> On 10.09.2025 16:28:54, Marc Kleine-Budde wrote:
>> On 10.09.2025 10:41:28, Markus Schneider-Pargmann wrote:
>> > > diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
>> > > index fe74dbd2c966..16b38e6c3985 100644
>> > > --- a/drivers/net/can/m_can/m_can.c
>> > > +++ b/drivers/net/can/m_can/m_can.c
>> > > @@ -1057,6 +1057,7 @@ static int m_can_poll(struct napi_struct *napi, int quota)
>> > >  	u32 irqstatus;
>> > >  
>> > >  	irqstatus = cdev->irqstatus | m_can_read(cdev, M_CAN_IR);
>> > > +	irqstatus &= cdev->active_interrupts;
>> > >  
>> > >  	work_done = m_can_rx_handler(dev, quota, irqstatus);
>> > >  
>> > > @@ -1243,6 +1244,8 @@ static int m_can_interrupt_handler(struct m_can_classdev *cdev)
>> > >  	}
>> > >  
>> > >  	m_can_coalescing_update(cdev, ir);
>> > > +
>> > > +	ir &= cdev->active_interrupts;
>> > 
>> > m_can_coalescing_update() can change active_interrupts, meaning the
>> > interrupt that caused the interrupt handler to run may be disabled in
>> > active_interrupts above and then masked in this added line. Would that
>> > still work or does it confuse the hardware?
>> 
>> I think m_can_coalescing_update() expects the RX/TX will be cleared. Are
>> the following comments OK...
>> 
>> | diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
>> | index 16b38e6c3985..8cb9cc1cddbf 100644
>> | --- a/drivers/net/can/m_can/m_can.c
>> | +++ b/drivers/net/can/m_can/m_can.c
>> | @@ -1188,28 +1188,39 @@ static int m_can_echo_tx_event(struct net_device *dev)
>> |  
>> |  static void m_can_coalescing_update(struct m_can_classdev *cdev, u32 ir)
>> |  {
>> |          u32 new_interrupts = cdev->active_interrupts;
>> |          bool enable_rx_timer = false;
>> |          bool enable_tx_timer = false;
>> |  
>> |          if (!cdev->net->irq)
>> |                  return;
>> |  
>> | +        /* If there is a packet in the FIFO then:
>> | +         * - start timer
>> | +         * - disable not empty IRQ
>> | +         * - handle FIFO
>>                 ^^^^^^^^^^^
>> 
>> ...especially this one?
>> 
>> | +         */
>> |          if (cdev->rx_coalesce_usecs_irq > 0 && (ir & (IR_RF0N | IR_RF0W))) {
>> |                  enable_rx_timer = true;
>> |                  new_interrupts &= ~IR_RF0N;
>> |          }
>> |          if (cdev->tx_coalesce_usecs_irq > 0 && (ir & (IR_TEFN | IR_TEFW))) {
>> |                  enable_tx_timer = true;
>> |                  new_interrupts &= ~IR_TEFN;
>> |          }
>> | +
>> | +        /* If:
>> | +         * - timer is not going to be start
>> | +         * - and timer is not active
>> | +         * -> then enable FIFO empty IRQ
>> | +         */
>> |          if (!enable_rx_timer && !hrtimer_active(&cdev->hrtimer))
>> |                  new_interrupts |= IR_RF0N;
>> |          if (!enable_tx_timer && !hrtimer_active(&cdev->hrtimer))
>> |                  new_interrupts |= IR_TEFN;
>> |  
>> |          m_can_interrupt_enable(cdev, new_interrupts);
>> |          if (enable_rx_timer | enable_tx_timer)
>> |                  hrtimer_start(&cdev->hrtimer, cdev->irq_timer_wait,
>> |                                HRTIMER_MODE_REL);
>> |  }
>
> I can't reproduce the problem I had before. I will drop this patch for
> now.
>
> In an upcoming series, however, I would still like to move
> can_coalescing_update() to the end of the IRQ handler.

The interrupts are acked just before calling m_can_coalescing_update().
I think ideally the unwanted interrupts should be disabled quick,
especially if there are SPI transfers for handling packets which take
some time.

Another point is to refresh the timer consistently so it doesn't trigger
because we are late refreshing it. The runtime of the interrupt handler
depends on the amount of work in the fifo.

Best
Markus