[PATCH] octeontx2-af: Fix PF driver crash with kexec kernel booting

Anshumali Gaur posted 1 patch 1 week, 2 days ago
There is a newer version of this series
Posted by Anshumali Gaur 1 week, 2 days ago
When both AF and PF drivers are built as modules, the PF driver in the
kexec kernel may probe before the AF driver is ready. This leads to
a crash due to uninitialized hardware state.

This patch ensures the PF driver properly detects and waits for AF
driver readiness before proceeding with initialization.

Fixes: 54494aa5d1e6 ("octeontx2-af: Add Marvell OcteonTX2 RVU AF driver")
Signed-off-by: Anshumali Gaur <agaur@marvell.com>
---
 drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
index 747fbdf2a908..8530df8b3fda 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
@@ -3632,11 +3632,22 @@ static void rvu_remove(struct pci_dev *pdev)
 	devm_kfree(&pdev->dev, rvu);
 }
 
+static void rvu_shutdown(struct pci_dev *pdev)
+{
+	struct rvu *rvu = pci_get_drvdata(pdev);
+
+	if (!rvu)
+		return;
+
+	rvu_clear_rvum_blk_revid(rvu);
+}
+
 static struct pci_driver rvu_driver = {
 	.name = DRV_NAME,
 	.id_table = rvu_id_table,
 	.probe = rvu_probe,
 	.remove = rvu_remove,
+	.shutdown = rvu_shutdown,
 };
 
 static int __init rvu_init_module(void)
-- 
2.25.1
Re: [PATCH] octeontx2-af: Fix PF driver crash with kexec kernel booting
Posted by Jacob Keller 1 week, 1 day ago

On 1/29/2026 1:19 AM, Anshumali Gaur wrote:
> When both AF and PF drivers are built as modules, the PF driver in the
> kexec kernel may probe before the AF driver is ready. This leads to
> a crash due to uninitialized hardware state.
> 
> This patch ensures the PF driver properly detects and waits for AF
> driver readiness before proceeding with initialization.
> 

To me, the patch description is not sufficient to describe the what and 
why of this change.

Could you please provide a better explanation of how the addition of the 
provided shutdown handler fixes initialization?

> Fixes: 54494aa5d1e6 ("octeontx2-af: Add Marvell OcteonTX2 RVU AF driver")
> Signed-off-by: Anshumali Gaur <agaur@marvell.com>
> ---
>   drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
> index 747fbdf2a908..8530df8b3fda 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
> @@ -3632,11 +3632,22 @@ static void rvu_remove(struct pci_dev *pdev)
>   	devm_kfree(&pdev->dev, rvu);
>   }
>   
> +static void rvu_shutdown(struct pci_dev *pdev)
> +{
> +	struct rvu *rvu = pci_get_drvdata(pdev);
> +
> +	if (!rvu)
> +		return;
> +
> +	rvu_clear_rvum_blk_revid(rvu);

Here, I guess you are clearing some data about the device status. Does 
that mean that when you initialize later you will wait for the AF driver 
to finish probing and configure this? It would be nice to explain how 
this change fixes initialization.
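
If that reading is right, the intended handshake might look roughly like the user-space model below. It is only a sketch under stated assumptions, not driver code: `struct fake_hw`, `model_af_probe()`, `model_af_shutdown()` and `model_pf_probe()` are hypothetical names, the single `rvum_revid` field stands in for whatever `rvu_clear_rvum_blk_revid()` actually clears, and the idea that the PF probe defers while that field reads zero is an assumption that the patch description does not state.

```c
#include <stdint.h>

/* Hypothetical stand-in for the RVUM block revision field that the AF
 * driver is assumed to program; this is not the real register layout. */
struct fake_hw {
	uint8_t rvum_revid;
};

/* EPROBE_DEFER is kernel-internal, so the model defines its own copy. */
#define MODEL_EPROBE_DEFER 517

/* Models AF probe: publish a non-zero revision once AF setup is done. */
static void model_af_probe(struct fake_hw *hw)
{
	hw->rvum_revid = 0x1;
}

/* Models the proposed rvu_shutdown(): clear the revision so that stale
 * state from the previous kernel does not survive into the kexec kernel. */
static void model_af_shutdown(struct fake_hw *hw)
{
	hw->rvum_revid = 0;
}

/* Models a PF probe that defers while the revision field reads zero
 * (assumed behaviour, see the note above). */
static int model_pf_probe(const struct fake_hw *hw)
{
	if (!hw->rvum_revid)
		return -MODEL_EPROBE_DEFER;
	return 0;
}
```

In this model, a PF probe that runs after `model_af_shutdown()` but before the next `model_af_probe()` gets a deferral instead of proceeding on stale state. Without the shutdown step the field would still be non-zero in the kexec kernel and the PF probe would go ahead early. If this is the actual mechanism, the commit message should say so.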

> +}
> +
>   static struct pci_driver rvu_driver = {
>   	.name = DRV_NAME,
>   	.id_table = rvu_id_table,
>   	.probe = rvu_probe,
>   	.remove = rvu_remove,
> +	.shutdown = rvu_shutdown,

This is the shutdown handler:
> 
>  * @shutdown:   Hook into reboot_notifier_list (kernel/sys.c).
>  *              Intended to stop any idling DMA operations.
>  *              Useful for enabling wake-on-lan (NIC) or changing
>  *              the power state of a device before reboot.
>  *              e.g. drivers/net/e100.c.

How does this have anything to do with initialization?

>   };
>   
>   static int __init rvu_init_module(void)