drivers/pci/controller/dwc/pci-imx6.c | 57 +++++++++++++++++++++------ 1 file changed, 46 insertions(+), 11 deletions(-)
From: Stefan Eichenberger <stefan.eichenberger@toradex.com>
The suspend/resume support is broken on the i.MX6QDL platform. This
patch resets the link upon resuming to recover functionality. It shares
most of the sequences with other i.MX devices but does not touch the
critical registers, which might break PCIe. This patch addresses the
same issue as the following downstream commit:
https://github.com/nxp-imx/linux-imx/commit/4e92355e1f79d225ea842511fcfd42b343b32995
In comparison this patch will also reset the device if possible because
the downstream patch alone would still make the ath10k driver crash.
Without this patch suspend/resume will not work if a PCIe device is
connected. The kernel will hang on resume and print an error:
ath10k_pci 0000:01:00.0: Unable to change power state from D3hot to D0, device inaccessible
8<--- cut here ---
Unhandled fault: imprecise external abort (0x1406) at 0x0106f944
Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com>
---
Changes in v3:
- Added a new flag to the driver data to indicate that the suspend/resume
is broken on the i.MX6QDL platform. (Frank)
- Fix comments to be more relevant (Mani)
- Use imx_pcie_assert_core_reset in suspend (Mani)
drivers/pci/controller/dwc/pci-imx6.c | 57 +++++++++++++++++++++------
1 file changed, 46 insertions(+), 11 deletions(-)
diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index 808d1f1054173..09e3b15f0908a 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -82,6 +82,11 @@ enum imx_pcie_variants {
#define IMX_PCIE_FLAG_HAS_SERDES BIT(6)
#define IMX_PCIE_FLAG_SUPPORT_64BIT BIT(7)
#define IMX_PCIE_FLAG_CPU_ADDR_FIXUP BIT(8)
+/**
+ * Because of ERR005723 (PCIe does not support L2 power down) we need to
+ * workaround suspend resume on some devices which are affected by this errata.
+ */
+#define IMX_PCIE_FLAG_BROKEN_SUSPEND BIT(9)
#define imx_check_flag(pci, val) (pci->drvdata->flags & val)
@@ -1237,9 +1242,19 @@ static int imx_pcie_suspend_noirq(struct device *dev)
return 0;
imx_pcie_msi_save_restore(imx_pcie, true);
- imx_pcie_pm_turnoff(imx_pcie);
- imx_pcie_stop_link(imx_pcie->pci);
- imx_pcie_host_exit(pp);
+ if (imx_check_flag(imx_pcie, IMX_PCIE_FLAG_BROKEN_SUSPEND)) {
+ /**
+ * The minimum for a workaround would be to set PERST# and to
+ * set the PCIE_TEST_PD flag. However, we can also disable the
+ * clock which saves some power.
+ */
+ imx_pcie_assert_core_reset(imx_pcie);
+ imx_pcie->drvdata->enable_ref_clk(imx_pcie, false);
+ } else {
+ imx_pcie_pm_turnoff(imx_pcie);
+ imx_pcie_stop_link(imx_pcie->pci);
+ imx_pcie_host_exit(pp);
+ }
return 0;
}
@@ -1253,14 +1268,32 @@ static int imx_pcie_resume_noirq(struct device *dev)
if (!(imx_pcie->drvdata->flags & IMX_PCIE_FLAG_SUPPORTS_SUSPEND))
return 0;
- ret = imx_pcie_host_init(pp);
- if (ret)
- return ret;
- imx_pcie_msi_save_restore(imx_pcie, false);
- dw_pcie_setup_rc(pp);
+ if (imx_check_flag(imx_pcie, IMX_PCIE_FLAG_BROKEN_SUSPEND)) {
+ ret = imx_pcie->drvdata->enable_ref_clk(imx_pcie, true);
+ if (ret)
+ return ret;
+ ret = imx_pcie_deassert_core_reset(imx_pcie);
+ if (ret)
+ return ret;
+ /**
+ * Using PCIE_TEST_PD seems to disable msi and powers down the
+ * root complex. This is why we have to setup the rc again and
+ * why we have to restore the msi register.
+ */
+ ret = dw_pcie_setup_rc(&imx_pcie->pci->pp);
+ if (ret)
+ return ret;
+ imx_pcie_msi_save_restore(imx_pcie, false);
+ } else {
+ ret = imx_pcie_host_init(pp);
+ if (ret)
+ return ret;
+ imx_pcie_msi_save_restore(imx_pcie, false);
+ dw_pcie_setup_rc(pp);
- if (imx_pcie->link_is_up)
- imx_pcie_start_link(imx_pcie->pci);
+ if (imx_pcie->link_is_up)
+ imx_pcie_start_link(imx_pcie->pci);
+ }
return 0;
}
@@ -1485,7 +1518,9 @@ static const struct imx_pcie_drvdata drvdata[] = {
[IMX6Q] = {
.variant = IMX6Q,
.flags = IMX_PCIE_FLAG_IMX_PHY |
- IMX_PCIE_FLAG_IMX_SPEED_CHANGE,
+ IMX_PCIE_FLAG_IMX_SPEED_CHANGE |
+ IMX_PCIE_FLAG_BROKEN_SUSPEND |
+ IMX_PCIE_FLAG_SUPPORTS_SUSPEND,
.dbi_length = 0x200,
.gpr = "fsl,imx6q-iomuxc-gpr",
.clk_names = imx6q_clks,
--
2.43.0
On Mon, Oct 21, 2024 at 02:49:13PM +0200, Stefan Eichenberger wrote: > From: Stefan Eichenberger <stefan.eichenberger@toradex.com> > > The suspend/resume support is broken on the i.MX6QDL platform. This > patch resets the link upon resuming to recover functionality. It shares > most of the sequences with other i.MX devices but does not touch the > critical registers, which might break PCIe. This patch addresses the > same issue as the following downstream commit: > https://github.com/nxp-imx/linux-imx/commit/4e92355e1f79d225ea842511fcfd42b343b32995 > In comparison this patch will also reset the device if possible because > the downstream patch alone would still make the ath10k driver crash. > Without this patch suspend/resume will not work if a PCIe device is > connected. The kernel will hang on resume and print an error: > ath10k_pci 0000:01:00.0: Unable to change power state from D3hot to D0, device inaccessible > 8<--- cut here --- > Unhandled fault: imprecise external abort (0x1406) at 0x0106f944 https://chris.beams.io/posts/git-commit/ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/maintainer-tip.rst?id=v6.11#n134 Add blank lines between paragraphs. Drop the "8<--- cut here" thing. What does "reset the link" mean? Please use the same terminology as the PCIe spec when possible. The downstream commit log ("WARNING: this is not the official workaround; user should take own risk to use it") doesn't exactly inspire confidence. It sounds like this resets *endpoints*? That sounds scary and unexpected in suspend/resume. > Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com> > --- > Changes in v3: > - Added a new flag to the driver data to indicate that the suspend/resume > is broken on the i.MX6QDL platform. (Frank) > - Fix comments to be more relevant (Mani) > - Use imx_pcie_assert_core_reset in suspend (Mani) > > drivers/pci/controller/dwc/pci-imx6.c | 57 +++++++++++++++++++++------ > 1 file changed, 46 insertions(+), 11 deletions(-) > > diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c > index 808d1f1054173..09e3b15f0908a 100644 > --- a/drivers/pci/controller/dwc/pci-imx6.c > +++ b/drivers/pci/controller/dwc/pci-imx6.c > @@ -82,6 +82,11 @@ enum imx_pcie_variants { > #define IMX_PCIE_FLAG_HAS_SERDES BIT(6) > #define IMX_PCIE_FLAG_SUPPORT_64BIT BIT(7) > #define IMX_PCIE_FLAG_CPU_ADDR_FIXUP BIT(8) > +/** > + * Because of ERR005723 (PCIe does not support L2 power down) we need to > + * workaround suspend resume on some devices which are affected by this errata. > + */ > +#define IMX_PCIE_FLAG_BROKEN_SUSPEND BIT(9) > > #define imx_check_flag(pci, val) (pci->drvdata->flags & val) > > @@ -1237,9 +1242,19 @@ static int imx_pcie_suspend_noirq(struct device *dev) > return 0; > > imx_pcie_msi_save_restore(imx_pcie, true); > - imx_pcie_pm_turnoff(imx_pcie); > - imx_pcie_stop_link(imx_pcie->pci); > - imx_pcie_host_exit(pp); > + if (imx_check_flag(imx_pcie, IMX_PCIE_FLAG_BROKEN_SUSPEND)) { > + /** Single asterisks above, here, and below. > + * The minimum for a workaround would be to set PERST# and to > + * set the PCIE_TEST_PD flag. However, we can also disable the > + * clock which saves some power. > + */ > + imx_pcie_assert_core_reset(imx_pcie); > + imx_pcie->drvdata->enable_ref_clk(imx_pcie, false); > + } else { > + imx_pcie_pm_turnoff(imx_pcie); > + imx_pcie_stop_link(imx_pcie->pci); > + imx_pcie_host_exit(pp); > + } > > return 0; > } > @@ -1253,14 +1268,32 @@ static int imx_pcie_resume_noirq(struct device *dev) > if (!(imx_pcie->drvdata->flags & IMX_PCIE_FLAG_SUPPORTS_SUSPEND)) > return 0; > > - ret = imx_pcie_host_init(pp); > - if (ret) > - return ret; > - imx_pcie_msi_save_restore(imx_pcie, false); > - dw_pcie_setup_rc(pp); > + if (imx_check_flag(imx_pcie, IMX_PCIE_FLAG_BROKEN_SUSPEND)) { > + ret = imx_pcie->drvdata->enable_ref_clk(imx_pcie, true); > + if (ret) > + return ret; > + ret = imx_pcie_deassert_core_reset(imx_pcie); > + if (ret) > + return ret; > + /** > + * Using PCIE_TEST_PD seems to disable msi and powers down the > + * root complex. This is why we have to setup the rc again and > + * why we have to restore the msi register. s/msi/MSI/ > + */ > + ret = dw_pcie_setup_rc(&imx_pcie->pci->pp); > + if (ret) > + return ret; > + imx_pcie_msi_save_restore(imx_pcie, false); > + } else { > + ret = imx_pcie_host_init(pp); > + if (ret) > + return ret; > + imx_pcie_msi_save_restore(imx_pcie, false); > + dw_pcie_setup_rc(pp); > > - if (imx_pcie->link_is_up) > - imx_pcie_start_link(imx_pcie->pci); > + if (imx_pcie->link_is_up) > + imx_pcie_start_link(imx_pcie->pci); > + } > > return 0; > } > @@ -1485,7 +1518,9 @@ static const struct imx_pcie_drvdata drvdata[] = { > [IMX6Q] = { > .variant = IMX6Q, > .flags = IMX_PCIE_FLAG_IMX_PHY | > - IMX_PCIE_FLAG_IMX_SPEED_CHANGE, > + IMX_PCIE_FLAG_IMX_SPEED_CHANGE | > + IMX_PCIE_FLAG_BROKEN_SUSPEND | > + IMX_PCIE_FLAG_SUPPORTS_SUSPEND, > .dbi_length = 0x200, > .gpr = "fsl,imx6q-iomuxc-gpr", > .clk_names = imx6q_clks, > -- > 2.43.0 >
On Tue, Oct 22, 2024 at 10:53:49AM -0500, Bjorn Helgaas wrote: > On Mon, Oct 21, 2024 at 02:49:13PM +0200, Stefan Eichenberger wrote: > > From: Stefan Eichenberger <stefan.eichenberger@toradex.com> > > > > The suspend/resume support is broken on the i.MX6QDL platform. This > > patch resets the link upon resuming to recover functionality. It shares > > most of the sequences with other i.MX devices but does not touch the > > critical registers, which might break PCIe. This patch addresses the > > same issue as the following downstream commit: > > https://github.com/nxp-imx/linux-imx/commit/4e92355e1f79d225ea842511fcfd42b343b32995 > > In comparison this patch will also reset the device if possible because > > the downstream patch alone would still make the ath10k driver crash. > > Without this patch suspend/resume will not work if a PCIe device is > > connected. The kernel will hang on resume and print an error: > > ath10k_pci 0000:01:00.0: Unable to change power state from D3hot to D0, device inaccessible > > 8<--- cut here --- > > Unhandled fault: imprecise external abort (0x1406) at 0x0106f944 > > https://chris.beams.io/posts/git-commit/ > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/maintainer-tip.rst?id=v6.11#n134 > > Add blank lines between paragraphs. Drop the "8<--- cut here" thing. Thanks for the feedback, I will fix this in the next version. > > What does "reset the link" mean? Please use the same terminology as > the PCIe spec when possible. I will try to come up with a better description in the next version. I think this sentence doesn't make sense anymore. > The downstream commit log ("WARNING: this is not the official > workaround; user should take own risk to use it") doesn't exactly > inspire confidence. > > It sounds like this resets *endpoints*? That sounds scary and > unexpected in suspend/resume. Yes, I completely agree with you, but NXP has never come up with an "official" workaround. Our problem is that with the current implementation, suspend/resume is completely broken when a PCIe device is connected. With this proposed patch we at least have a working device after resume. Even for the other i.MX devices, the driver resets the endpoints in the resume function (imx_pcie_resume_noir -> imx_pcie_host_init -> imx_pcie_assert_core_reset), we just do that now for the i.MX6QDL as well. If it is more appropriate to call imx_pcie_assert_core_reset in resume as we do for the other devices, that would be fine with me as well. I was thinking that if we need to reset the device anyway, we could put it into reset on suspend, as this might save some extra power. > > > Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com> > > --- > > Changes in v3: > > - Added a new flag to the driver data to indicate that the suspend/resume > > is broken on the i.MX6QDL platform. (Frank) > > - Fix comments to be more relevant (Mani) > > - Use imx_pcie_assert_core_reset in suspend (Mani) > > > > drivers/pci/controller/dwc/pci-imx6.c | 57 +++++++++++++++++++++------ > > 1 file changed, 46 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c > > index 808d1f1054173..09e3b15f0908a 100644 > > --- a/drivers/pci/controller/dwc/pci-imx6.c > > +++ b/drivers/pci/controller/dwc/pci-imx6.c > > @@ -82,6 +82,11 @@ enum imx_pcie_variants { > > #define IMX_PCIE_FLAG_HAS_SERDES BIT(6) > > #define IMX_PCIE_FLAG_SUPPORT_64BIT BIT(7) > > #define IMX_PCIE_FLAG_CPU_ADDR_FIXUP BIT(8) > > +/** > > + * Because of ERR005723 (PCIe does not support L2 power down) we need to > > + * workaround suspend resume on some devices which are affected by this errata. > > + */ > > +#define IMX_PCIE_FLAG_BROKEN_SUSPEND BIT(9) > > > > #define imx_check_flag(pci, val) (pci->drvdata->flags & val) > > > > @@ -1237,9 +1242,19 @@ static int imx_pcie_suspend_noirq(struct device *dev) > > return 0; > > > > imx_pcie_msi_save_restore(imx_pcie, true); > > - imx_pcie_pm_turnoff(imx_pcie); > > - imx_pcie_stop_link(imx_pcie->pci); > > - imx_pcie_host_exit(pp); > > + if (imx_check_flag(imx_pcie, IMX_PCIE_FLAG_BROKEN_SUSPEND)) { > > + /** > > Single asterisks above, here, and below. Thanks, I will fix this in the next version. > > + * The minimum for a workaround would be to set PERST# and to > > + * set the PCIE_TEST_PD flag. However, we can also disable the > > + * clock which saves some power. > > + */ > > + imx_pcie_assert_core_reset(imx_pcie); > > + imx_pcie->drvdata->enable_ref_clk(imx_pcie, false); > > + } else { > > + imx_pcie_pm_turnoff(imx_pcie); > > + imx_pcie_stop_link(imx_pcie->pci); > > + imx_pcie_host_exit(pp); > > + } > > > > return 0; > > } > > @@ -1253,14 +1268,32 @@ static int imx_pcie_resume_noirq(struct device *dev) > > if (!(imx_pcie->drvdata->flags & IMX_PCIE_FLAG_SUPPORTS_SUSPEND)) > > return 0; > > > > - ret = imx_pcie_host_init(pp); > > - if (ret) > > - return ret; > > - imx_pcie_msi_save_restore(imx_pcie, false); > > - dw_pcie_setup_rc(pp); > > + if (imx_check_flag(imx_pcie, IMX_PCIE_FLAG_BROKEN_SUSPEND)) { > > + ret = imx_pcie->drvdata->enable_ref_clk(imx_pcie, true); > > + if (ret) > > + return ret; > > + ret = imx_pcie_deassert_core_reset(imx_pcie); > > + if (ret) > > + return ret; > > + /** > > + * Using PCIE_TEST_PD seems to disable msi and powers down the > > + * root complex. This is why we have to setup the rc again and > > + * why we have to restore the msi register. > > s/msi/MSI/ Thanks, I will fix this in the next version. > > > + */ > > + ret = dw_pcie_setup_rc(&imx_pcie->pci->pp); > > + if (ret) > > + return ret; > > + imx_pcie_msi_save_restore(imx_pcie, false); > > + } else { > > + ret = imx_pcie_host_init(pp); > > + if (ret) > > + return ret; > > + imx_pcie_msi_save_restore(imx_pcie, false); > > + dw_pcie_setup_rc(pp); > > > > - if (imx_pcie->link_is_up) > > - imx_pcie_start_link(imx_pcie->pci); > > + if (imx_pcie->link_is_up) > > + imx_pcie_start_link(imx_pcie->pci); > > + } > > > > return 0; > > } > > @@ -1485,7 +1518,9 @@ static const struct imx_pcie_drvdata drvdata[] = { > > [IMX6Q] = { > > .variant = IMX6Q, > > .flags = IMX_PCIE_FLAG_IMX_PHY | > > - IMX_PCIE_FLAG_IMX_SPEED_CHANGE, > > + IMX_PCIE_FLAG_IMX_SPEED_CHANGE | > > + IMX_PCIE_FLAG_BROKEN_SUSPEND | > > + IMX_PCIE_FLAG_SUPPORTS_SUSPEND, > > .dbi_length = 0x200, > > .gpr = "fsl,imx6q-iomuxc-gpr", > > .clk_names = imx6q_clks, Regards, Stefan
On Wed, Oct 23, 2024 at 10:57:35AM +0200, Stefan Eichenberger wrote: > On Tue, Oct 22, 2024 at 10:53:49AM -0500, Bjorn Helgaas wrote: > > On Mon, Oct 21, 2024 at 02:49:13PM +0200, Stefan Eichenberger wrote: > > > From: Stefan Eichenberger <stefan.eichenberger@toradex.com> > > > > > > The suspend/resume support is broken on the i.MX6QDL platform. This > > > patch resets the link upon resuming to recover functionality. It shares > > > most of the sequences with other i.MX devices but does not touch the > > > critical registers, which might break PCIe. This patch addresses the > > > same issue as the following downstream commit: > > > https://github.com/nxp-imx/linux-imx/commit/4e92355e1f79d225ea842511fcfd42b343b32995 > > > In comparison this patch will also reset the device if possible because > > > the downstream patch alone would still make the ath10k driver crash. > > > Without this patch suspend/resume will not work if a PCIe device is > > > connected. The kernel will hang on resume and print an error: > > > ath10k_pci 0000:01:00.0: Unable to change power state from D3hot to D0, device inaccessible > ... > > The downstream commit log ("WARNING: this is not the official > > workaround; user should take own risk to use it") doesn't exactly > > inspire confidence. > > > > It sounds like this resets *endpoints*? That sounds scary and > > unexpected in suspend/resume. > > Yes, I completely agree with you, but NXP has never come up with an > "official" workaround. Our problem is that with the current > implementation, suspend/resume is completely broken when a PCIe device > is connected. With this proposed patch we at least have a working device > after resume. Even for the other i.MX devices, the driver resets the > endpoints in the resume function (imx_pcie_resume_noir -> > imx_pcie_host_init -> imx_pcie_assert_core_reset), we just do that now > for the i.MX6QDL as well. If it is more appropriate to call > imx_pcie_assert_core_reset in resume as we do for the other devices, > that would be fine with me as well. I was thinking that if we need to > reset the device anyway, we could put it into reset on suspend, as this > might save some extra power. OK, I have to admit I don't know enough about suspend/resume. Since we already do that for other i.MX platforms, maybe an endpoint reset is normal for suspend. I really don't know. In any case, if we do it for other i.MX platforms, I'm OK doing it for this one too. Bjorn
Hi Stefan, kernel test robot noticed the following build warnings: [auto build test WARNING on pci/next] [also build test WARNING on pci/for-linus linus/master v6.12-rc4 next-20241021] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Stefan-Eichenberger/PCI-imx6-Add-suspend-resume-support-for-i-MX6QDL/20241021-205155 base: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next patch link: https://lore.kernel.org/r/20241021124922.5361-1-eichest%40gmail.com patch subject: [PATCH v3] PCI: imx6: Add suspend/resume support for i.MX6QDL config: loongarch-allmodconfig (https://download.01.org/0day-ci/archive/20241022/202410221309.BXeJNrxw-lkp@intel.com/config) compiler: loongarch64-linux-gcc (GCC) 14.1.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241022/202410221309.BXeJNrxw-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202410221309.BXeJNrxw-lkp@intel.com/ All warnings (new ones prefixed by >>): >> drivers/pci/controller/dwc/pci-imx6.c:86: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Because of ERR005723 (PCIe does not support L2 power down) we need to vim +86 drivers/pci/controller/dwc/pci-imx6.c 75 76 #define IMX_PCIE_FLAG_IMX_PHY BIT(0) 77 #define IMX_PCIE_FLAG_IMX_SPEED_CHANGE BIT(1) 78 #define IMX_PCIE_FLAG_SUPPORTS_SUSPEND BIT(2) 79 #define IMX_PCIE_FLAG_HAS_PHYDRV BIT(3) 80 #define IMX_PCIE_FLAG_HAS_APP_RESET BIT(4) 81 #define IMX_PCIE_FLAG_HAS_PHY_RESET BIT(5) 82 #define IMX_PCIE_FLAG_HAS_SERDES BIT(6) 83 #define IMX_PCIE_FLAG_SUPPORT_64BIT BIT(7) 84 #define IMX_PCIE_FLAG_CPU_ADDR_FIXUP BIT(8) 85 /** > 86 * Because of ERR005723 (PCIe does not support L2 power down) we need to 87 * workaround suspend resume on some devices which are affected by this errata. 88 */ 89 #define IMX_PCIE_FLAG_BROKEN_SUSPEND BIT(9) 90 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
© 2016 - 2024 Red Hat, Inc.