[PATCH 8/9] ACPI: APEI: EINJ: Transition to the faux device interface

Sudeep Holla posted 9 patches 9 months ago
There is a newer version of this series
Posted by Sudeep Holla 9 months ago
The APEI error injection driver does not require the creation of a
platform device. Originally, this approach was chosen for simplicity
when the driver was first implemented.

With the introduction of the lightweight faux device interface, we now
have a more appropriate alternative. Migrate the driver to utilize the
faux bus, given that the platform device it previously created was not
a real one anyway. This will simplify the code, reducing its footprint
while maintaining functionality.

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/acpi/apei/einj-core.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index 04731a5b01faaba534bad853d0acc4c8a873a53b..7ff334422899e757de918107202507dd171d61da 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -21,7 +21,7 @@
 #include <linux/nmi.h>
 #include <linux/delay.h>
 #include <linux/mm.h>
-#include <linux/platform_device.h>
+#include <linux/device/faux.h>
 #include <linux/unaligned.h>
 
 #include "apei-internal.h"
@@ -749,7 +749,7 @@ static int einj_check_table(struct acpi_table_einj *einj_tab)
 	return 0;
 }
 
-static int __init einj_probe(struct platform_device *pdev)
+static int __init einj_probe(struct faux_device *fdev)
 {
 	int rc;
 	acpi_status status;
@@ -851,7 +851,7 @@ static int __init einj_probe(struct platform_device *pdev)
 	return rc;
 }
 
-static void __exit einj_remove(struct platform_device *pdev)
+static void __exit einj_remove(struct faux_device *fdev)
 {
 	struct apei_exec_context ctx;
 
@@ -872,34 +872,25 @@ static void __exit einj_remove(struct platform_device *pdev)
 	acpi_put_table((struct acpi_table_header *)einj_tab);
 }
 
-static struct platform_device *einj_dev;
+static struct faux_device *einj_dev;
 /*
  * einj_remove() lives in .exit.text. For drivers registered via
  * platform_driver_probe() this is ok because they cannot get unbound at
  * runtime. So mark the driver struct with __refdata to prevent modpost
  * triggering a section mismatch warning.
  */
-static struct platform_driver einj_driver __refdata = {
+static struct faux_device_ops einj_device_ops __refdata = {
+	.probe = einj_probe,
 	.remove = __exit_p(einj_remove),
-	.driver = {
-		.name = "acpi-einj",
-	},
 };
 
 static int __init einj_init(void)
 {
-	struct platform_device_info einj_dev_info = {
-		.name = "acpi-einj",
-		.id = -1,
-	};
-	int rc;
-
-	einj_dev = platform_device_register_full(&einj_dev_info);
-	if (IS_ERR(einj_dev))
-		return PTR_ERR(einj_dev);
+	einj_dev = faux_device_create("acpi-einj", NULL, &einj_device_ops);
+	if (!einj_dev)
+		return -ENODEV;
 
-	rc = platform_driver_probe(&einj_driver, einj_probe);
-	einj_initialized = rc == 0;
+	einj_initialized = true;
 
 	return 0;
 }
@@ -907,9 +898,8 @@ static int __init einj_init(void)
 static void __exit einj_exit(void)
 {
 	if (einj_initialized)
-		platform_driver_unregister(&einj_driver);
+		faux_device_destroy(einj_dev);
 
-	platform_device_unregister(einj_dev);
 }
 
 module_init(einj_init);

-- 
2.34.1
Re: [PATCH 8/9] ACPI: APEI: EINJ: Transition to the faux device interface
Posted by Dan Williams 6 months, 2 weeks ago
[ add linux-cxl ]

Sudeep Holla wrote:
> The APEI error injection driver does not require the creation of a
> platform device. Originally, this approach was chosen for simplicity
> when the driver was first implemented.
> 
> With the introduction of the lightweight faux device interface, we now
> have a more appropriate alternative. Migrate the driver to utilize the
> faux bus, given that the platform device it previously created was not
> a real one anyway. This will simplify the code, reducing its footprint
> while maintaining functionality.
> 
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: linux-acpi@vger.kernel.org
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>  drivers/acpi/apei/einj-core.c | 32 +++++++++++---------------------
>  1 file changed, 11 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
> index 04731a5b01faaba534bad853d0acc4c8a873a53b..7ff334422899e757de918107202507dd171d61da 100644
> --- a/drivers/acpi/apei/einj-core.c
> +++ b/drivers/acpi/apei/einj-core.c
[..]
>  static int __init einj_init(void)
>  {
> -	struct platform_device_info einj_dev_info = {
> -		.name = "acpi-einj",
> -		.id = -1,
> -	};
> -	int rc;
> -
> -	einj_dev = platform_device_register_full(&einj_dev_info);
> -	if (IS_ERR(einj_dev))
> -		return PTR_ERR(einj_dev);
> +	einj_dev = faux_device_create("acpi-einj", NULL, &einj_device_ops);
> +	if (!einj_dev)
> +		return -ENODEV;
>  
> -	rc = platform_driver_probe(&einj_driver, einj_probe);
> -	einj_initialized = rc == 0;
> +	einj_initialized = true;

git bisect says this change breaks CXL subsystem initialization. This
patch seems to not understand the semantic guarantees of
platform_driver_probe().

CXL init now fails with:

    faux acpi-einj: probe did not succeed, tearing down the device

...which will fire on the majority of CXL systems because EINJ is optional.

However, a probe failure must not turn into a module load failure,
because the CXL subsystem still needs access to the symbols even on
systems where the EINJ facility is disabled. In other words, CXL has a
module dependency on einj.ko, but all of its entry points into that
module first check einj_initialized. So part of the fix is:

diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index fea11a35eea3..9b041415a9d0 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -883,19 +883,16 @@ static int __init einj_init(void)
        }
 
        einj_dev = faux_device_create("acpi-einj", NULL, &einj_device_ops);
-       if (!einj_dev)
-               return -ENODEV;
 
-       einj_initialized = true;
+       if (einj_dev)
+               einj_initialized = true;
 
        return 0;
 }
 
 static void __exit einj_exit(void)
 {
-       if (einj_initialized)
-               faux_device_destroy(einj_dev);
-
+       faux_device_destroy(einj_dev);
 }
 
 module_init(einj_init);

However, that is not sufficient because faux_device_create(), unlike
platform_driver_probe(), does not suppress bind attributes, which means
that the einj_initialized result is not stable. I.e., somebody can
unbind the faux_driver from any faux_device at any time.
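
To make the two points above concrete, here is a minimal userspace
sketch. It is not kernel code: every model_* name is hypothetical, and
the error codes are assumptions chosen for illustration. It only models
the idea that init records the probe result without failing the load,
and that an unbind can leave that recorded result stale.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for driver state; not the kernel's definitions. */
static bool model_initialized;	/* plays the role of einj_initialized */
static bool model_bound;	/* whether the driver is bound to the device */

/* Probe fails when the platform has no EINJ facility (it is optional). */
static int model_probe(bool has_einj)
{
	if (!has_einj)
		return -ENODEV;
	model_bound = true;
	return 0;
}

/* Init records the probe result but always lets the module load. */
static int model_init(bool has_einj)
{
	model_bound = false;
	model_initialized = (model_probe(has_einj) == 0);
	return 0;
}

/* Entry point used by a dependent module: gated on the init flag only. */
static int model_inject(void)
{
	if (!model_initialized)
		return -ENXIO;
	return 0;
}

/* Models a sysfs unbind: the driver goes away, the flag is not updated. */
static void model_unbind(void)
{
	model_bound = false;
}
```

In this model, model_init(false) still returns 0 while model_inject()
refuses to run, which is the behavior dependent modules rely on. After
model_init(true) followed by model_unbind(), model_inject() still
returns 0 even though nothing is bound, which is the instability that
suppressing the bind attributes would rule out.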

I think it is reasonable for faux_devices to only have one chance to
call ->probe() at create time:

diff --git a/drivers/base/faux.c b/drivers/base/faux.c
index 9054d346bd7f..934da77ca48b 100644
--- a/drivers/base/faux.c
+++ b/drivers/base/faux.c
@@ -86,6 +86,7 @@ static struct device_driver faux_driver = {
        .name           = "faux_driver",
        .bus            = &faux_bus_type,
        .probe_type     = PROBE_FORCE_SYNCHRONOUS,
+       .suppress_bind_attrs = true,
 };
 
 static void faux_device_release(struct device *dev)

Unless that global change is acceptable, I do not think
faux_device_create() is a suitable replacement for
platform_driver_probe().

Lastly, for cases where probe failures are ok, the dev_err() is too
noisy, so another change to make it behave like platform_driver_probe()
would be:

diff --git a/drivers/base/faux.c b/drivers/base/faux.c
index 9054d346bd7f..f5fbda0a9a44 100644
--- a/drivers/base/faux.c
+++ b/drivers/base/faux.c
@@ -169,7 +170,7 @@ struct faux_device *faux_device_create_with_groups(const char *name,
         * successful is almost impossible to determine by the caller.
         */
        if (!dev->driver) {
-               dev_err(dev, "probe did not succeed, tearing down the device\n");
+               dev_dbg(dev, "probe did not succeed, tearing down the device\n");
                faux_device_destroy(faux_dev);
                faux_dev = NULL;
        }

Greg, if you are ok with suppress_bind_attrs = true for faux devices, I
will wrap the above into a 3-patch series to fix the regression.