From: Sven Eckelmann <sven@narfation.org>

ag71xx_probe is registering ag71xx_interrupt as handler for gmac0/gmac1
interrupts. The handler is trying to use napi_schedule to handle the
processing of packets. But the netif_napi_add for this device is
called a lot later in ag71xx_probe.

It can therefore happen that a still running gmac0/gmac1 is triggering the
interrupt handler with a bit from AG71XX_INT_POLL set in
AG71XX_REG_INT_STATUS. The handler will then call napi_schedule and the
napi code will crash the system because the ag->napi is not yet
initialized.

The gmac0/gmac1 must be brought into a state in which it doesn't signal
AG71XX_INT_POLL-related status bits as interrupts before registering the
interrupt handler. ag71xx_hw_start will take care of re-initializing the
AG71XX_REG_INT_ENABLE.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Rosen Penev <rosenp@gmail.com>
---
drivers/net/ethernet/atheros/ag71xx.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
index 0674a042e8d3..435c4b19acdd 100644
--- a/drivers/net/ethernet/atheros/ag71xx.c
+++ b/drivers/net/ethernet/atheros/ag71xx.c
@@ -1855,6 +1855,12 @@ static int ag71xx_probe(struct platform_device *pdev)
 	if (!ag->mac_base)
 		return -ENOMEM;
 
+	/* ensure that HW is in manual polling mode before interrupts are
+	 * activated. Otherwise ag71xx_interrupt might call napi_schedule
+	 * before it is initialized by netif_napi_add.
+	 */
+	ag71xx_int_disable(ag, AG71XX_INT_POLL);
+
 	ndev->irq = platform_get_irq(pdev, 0);
 	err = devm_request_irq(&pdev->dev, ndev->irq, ag71xx_interrupt,
 			       0x0, dev_name(&pdev->dev), ndev);
--
2.46.0
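
For readers following the thread, the ordering problem described in the
commit message can be sketched as a small user-space model. This is not
driver code: the struct, the MODEL_* bit values and every model_* function
are hypothetical names invented for illustration, and the model assumes
the interrupt line is raised only for status bits that are also enabled.
It only shows why masking the poll bits before registering the handler
closes the window in which the handler could schedule NAPI too early.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for this model; the real masks are in the driver. */
#define MODEL_INT_TX   (1u << 0)
#define MODEL_INT_RX   (1u << 4)
#define MODEL_INT_POLL (MODEL_INT_TX | MODEL_INT_RX)

struct model_mac {
	uint32_t int_enable;   /* stands in for AG71XX_REG_INT_ENABLE */
	uint32_t int_status;   /* stands in for AG71XX_REG_INT_STATUS */
	bool irq_registered;   /* set once the handler has been registered */
	bool napi_ready;       /* set once the netif_napi_add step has run */
	unsigned bad_schedule; /* schedule attempts made before NAPI was ready */
};

/* Models the interrupt handler: schedule polling for enabled poll bits. */
static void model_interrupt(struct model_mac *mac)
{
	uint32_t pending = mac->int_status & mac->int_enable;

	if (!(pending & MODEL_INT_POLL))
		return;

	/* Mask the poll bits, as the handler does before scheduling. */
	mac->int_enable &= ~MODEL_INT_POLL;
	if (!mac->napi_ready)
		mac->bad_schedule++; /* in the kernel this is the crash case */
}

/* Models clearing bits in the interrupt enable register. */
static void model_int_disable(struct model_mac *mac, uint32_t bits)
{
	mac->int_enable &= ~bits;
}

/* Models registering the handler: a pending, enabled bit fires at once. */
static void model_request_irq(struct model_mac *mac)
{
	mac->irq_registered = true;
	if (mac->int_status & mac->int_enable)
		model_interrupt(mac);
}

/* Models the probe sequence; returns the number of too-early schedules. */
unsigned model_probe(bool mask_first)
{
	/* Assumed start state: the MAC was left running with RX pending. */
	struct model_mac mac = {
		.int_enable = MODEL_INT_POLL,
		.int_status = MODEL_INT_RX,
	};

	if (mask_first)
		model_int_disable(&mac, MODEL_INT_POLL);

	model_request_irq(&mac);
	mac.napi_ready = true; /* the netif_napi_add step comes only now */

	return mac.bad_schedule;
}
```

Calling model_probe(false) models the unpatched order and reports one
schedule attempt before NAPI is ready; model_probe(true) models the
patched order and reports none.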
On 8/28/2024 1:41 PM, Rosen Penev wrote:
> From: Sven Eckelmann <sven@narfation.org>
>
> ag71xx_probe is registering ag71xx_interrupt as handler for gmac0/gmac1
> interrupts. The handler is trying to use napi_schedule to handle the
> processing of packets. But the netif_napi_add for this device is
> called a lot later in ag71xx_probe.
>
> It can therefore happen that a still running gmac0/gmac1 is triggering the
> interrupt handler with a bit from AG71XX_INT_POLL set in
> AG71XX_REG_INT_STATUS. The handler will then call napi_schedule and the
> napi code will crash the system because the ag->napi is not yet
> initialized.
>
> The gmac0/gmac1 must be brought into a state in which it doesn't signal
> AG71XX_INT_POLL-related status bits as interrupts before registering the
> interrupt handler. ag71xx_hw_start will take care of re-initializing the
> AG71XX_REG_INT_ENABLE.
>
> Signed-off-by: Sven Eckelmann <sven@narfation.org>
> Signed-off-by: Rosen Penev <rosenp@gmail.com>
> ---

The description reads like a bug fix, so I would expect this to be
targeted to net and have a Fixes tag indicating what commit introduced
the issue, maybe:

Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")

The change seems reasonable to me otherwise.

> drivers/net/ethernet/atheros/ag71xx.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
> index 0674a042e8d3..435c4b19acdd 100644
> --- a/drivers/net/ethernet/atheros/ag71xx.c
> +++ b/drivers/net/ethernet/atheros/ag71xx.c
> @@ -1855,6 +1855,12 @@ static int ag71xx_probe(struct platform_device *pdev)
> if (!ag->mac_base)
> return -ENOMEM;
>
> + /* ensure that HW is in manual polling mode before interrupts are
> + * activated. Otherwise ag71xx_interrupt might call napi_schedule
> + * before it is initialized by netif_napi_add.
> + */
> + ag71xx_int_disable(ag, AG71XX_INT_POLL);
> +
> ndev->irq = platform_get_irq(pdev, 0);
> err = devm_request_irq(&pdev->dev, ndev->irq, ag71xx_interrupt,
> 0x0, dev_name(&pdev->dev), ndev);
On Wed, Aug 28, 2024 at 2:05 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>
>
>
> On 8/28/2024 1:41 PM, Rosen Penev wrote:
> > From: Sven Eckelmann <sven@narfation.org>
> >
> > ag71xx_probe is registering ag71xx_interrupt as handler for gmac0/gmac1
> > interrupts. The handler is trying to use napi_schedule to handle the
> > processing of packets. But the netif_napi_add for this device is
> > called a lot later in ag71xx_probe.
> >
> > It can therefore happen that a still running gmac0/gmac1 is triggering the
> > interrupt handler with a bit from AG71XX_INT_POLL set in
> > AG71XX_REG_INT_STATUS. The handler will then call napi_schedule and the
> > napi code will crash the system because the ag->napi is not yet
> > initialized.
> >
> > The gmac0/gmac1 must be brought into a state in which it doesn't signal
> > AG71XX_INT_POLL-related status bits as interrupts before registering the
> > interrupt handler. ag71xx_hw_start will take care of re-initializing the
> > AG71XX_REG_INT_ENABLE.
> >
> > Signed-off-by: Sven Eckelmann <sven@narfation.org>
> > Signed-off-by: Rosen Penev <rosenp@gmail.com>
> > ---
>
> The description reads like a bug fix, so I would expect this to be
> targeted to net and have a Fixes tag indicating what commit introduced
> the issue, maybe:
>
> Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")
>
> The change seems reasonable to me otherwise.
OTOH there are currently no dual GMAC users upstream. Just single.
>
> > drivers/net/ethernet/atheros/ag71xx.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
> > index 0674a042e8d3..435c4b19acdd 100644
> > --- a/drivers/net/ethernet/atheros/ag71xx.c
> > +++ b/drivers/net/ethernet/atheros/ag71xx.c
> > @@ -1855,6 +1855,12 @@ static int ag71xx_probe(struct platform_device *pdev)
> > if (!ag->mac_base)
> > return -ENOMEM;
> >
> > + /* ensure that HW is in manual polling mode before interrupts are
> > + * activated. Otherwise ag71xx_interrupt might call napi_schedule
> > + * before it is initialized by netif_napi_add.
> > + */
> > + ag71xx_int_disable(ag, AG71XX_INT_POLL);
> > +
> > ndev->irq = platform_get_irq(pdev, 0);
> > err = devm_request_irq(&pdev->dev, ndev->irq, ag71xx_interrupt,
> > 0x0, dev_name(&pdev->dev), ndev);
> -----Original Message-----
> From: Rosen Penev <rosenp@gmail.com>
> Sent: Thursday, August 29, 2024 10:47 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; linux@armlinux.org.uk;
> linux-kernel@vger.kernel.org; o.rempel@pengutronix.de; p.zabel@pengutronix.de
> Subject: Re: [PATCH net-next] net: ag71xx: disable napi interrupts during probe
>
> On Wed, Aug 28, 2024 at 2:05 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
> >
> >
> >
> > On 8/28/2024 1:41 PM, Rosen Penev wrote:
> > > From: Sven Eckelmann <sven@narfation.org>
> > >
> > > ag71xx_probe is registering ag71xx_interrupt as handler for gmac0/gmac1
> > > interrupts. The handler is trying to use napi_schedule to handle the
> > > processing of packets. But the netif_napi_add for this device is
> > > called a lot later in ag71xx_probe.
> > >
> > > It can therefore happen that a still running gmac0/gmac1 is triggering the
> > > interrupt handler with a bit from AG71XX_INT_POLL set in
> > > AG71XX_REG_INT_STATUS. The handler will then call napi_schedule and the
> > > napi code will crash the system because the ag->napi is not yet
> > > initialized.
> > >
> > > The gmac0/gmac1 must be brought into a state in which it doesn't signal
> > > AG71XX_INT_POLL-related status bits as interrupts before registering the
> > > interrupt handler. ag71xx_hw_start will take care of re-initializing the
> > > AG71XX_REG_INT_ENABLE.
> > >
> > > Signed-off-by: Sven Eckelmann <sven@narfation.org>
> > > Signed-off-by: Rosen Penev <rosenp@gmail.com>
> > > ---
> >
> > The description reads like a bug fix, so I would expect this to be
> > targeted to net and have a Fixes tag indicating what commit introduced
> > the issue, maybe:
> >
> > Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")
> >
> > The change seems reasonable to me otherwise.
> OTOH there are currently no dual GMAC users upstream. Just single.
>
If that’s the case, updating the description to make that clear would help.