[RFC PATCH 0/4] Rework SCMI transport drivers probing sequence

Cristian Marussi posted 4 patches 2 weeks, 6 days ago
[RFC PATCH 0/4] Rework SCMI transport drivers probing sequence
Posted by Cristian Marussi 2 weeks, 6 days ago
Hi,

when the SCMI transports were split out into standalone drivers [1] the
probe sequence was laid out in such a way that:

 - the transport drivers would have probed first, triggered by the firmware
   driven discovery process (DT/ACPI)

 - afterwards the control would have been passed to the core SCMI stack
   driver via the creation of a dedicated device that would have inherited
   the original firmware descriptor (since that same DT/ACPI node would
   have still been needed by the SCMI core driver to be parsed)

The tricky part came with transport drivers like virtio and optee, since
they, in turn, depend upfront on a distinct external kernel subsystem;
IOW they first have to undergo their own subsystem specific
probe/initialization to become fully operational as transports.
This kind of initialization sequencing must of course deal with the
possibility of probe deferrals, BUT at that time we avoided this by using
a trick in the virtio/optee transports: registering the next stage
transport drivers ONLY at the end of the subsystem specific probe routine,
from within the probe itself.

This behaviour has 2 issues:

 - it is frowned upon and can lead to hangs in the driver core whenever
   some core locking is changed as exposed in [2]
 - it limits these transport drivers to a single instance probing since of
   course you cannot register the same driver more than once

All of this was tolerable because optee/virtio are generally only employed
in a single SCMI instance scenario AND because no hangs were triggered in
the core driver subsystem... (not saying that was perfectly fine :P)

This RFC series aims to solve both of the above issues by:

 - moving the problematic platform driver registration away from the probe
   into its own module_init/exit

 - introducing an optional mechanism for the transport drivers to provide
   to the SCMI core some sort of transport specific handle so that multiple
   instances can be supported and multiple probed transport instances can
   be registered and identified by the core (who is who).
   Note that at the same time this optional handle mechanism is used to
   synchronize the driver probing sequences and possibly trigger, when
   needed, probe deferrals, if the transport driver still has to
   undergo its own subsystem initialization and as such is still NOT fully
   operational.
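
To make the handle/deferral idea a bit more concrete, here is a tiny
userspace model of it. This is NOT the code in the series: every name
below is hypothetical, and the "claim the first ready instance" policy is
only an assumption for illustration. The one real-world detail is the
numeric value 517, which is what the kernel defines EPROBE_DEFER as.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace model only: all names are hypothetical, not from the series. */
#define MODEL_EPROBE_DEFER	517	/* numeric value of the kernel's EPROBE_DEFER */
#define MODEL_MAX_INSTANCES	4

struct model_transport_handle {
	int instance_id;	/* identifies one probed transport instance */
	int claimed;		/* set once the core has bound to this instance */
};

static struct model_transport_handle model_handles[MODEL_MAX_INSTANCES];
static int model_nr_handles;

/* Transport side: called at the end of the subsystem specific probe. */
int model_transport_ready(int instance_id)
{
	if (model_nr_handles >= MODEL_MAX_INSTANCES)
		return -ENOSPC;

	model_handles[model_nr_handles].instance_id = instance_id;
	model_handles[model_nr_handles].claimed = 0;
	model_nr_handles++;

	return 0;
}

/* Core side: bind to the first ready, unclaimed instance, or defer. */
int model_core_probe(const struct model_transport_handle **out)
{
	assert(out != NULL);

	for (int i = 0; i < model_nr_handles; i++) {
		if (!model_handles[i].claimed) {
			model_handles[i].claimed = 1;
			*out = &model_handles[i];
			return 0;
		}
	}

	/* No operational transport instance yet: ask to be retried later. */
	return -MODEL_EPROBE_DEFER;
}
```

In this model a core probe that runs before any transport instance is
ready gets the deferral code, and succeeds once model_transport_ready()
has been called; each handle can be claimed only once, which is what lets
more than one instance coexist.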

Most of the above mechanism is provided in 1/4 and 2/4 while patches 3 and
4 take care of porting virtio and optee to the new probe scheme.

Note that optee required an additional mechanism (an xarray) to properly
match the tee_client_device with the agent in the TEE removal phase, now
that multiple instances are an option; but I could have missed a simpler
solution given my TEE-ignorance.
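
As a rough illustration of that device-to-agent matching, the following
userspace sketch uses a small fixed table where the kernel code would use
an xarray (xa_store()/xa_erase()). All names are hypothetical, and keying
by the device pointer is an assumption made here for illustration, not
necessarily what patch 4 does.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: all names are hypothetical, not from the series. */
#define MODEL_MAX_AGENTS	4

struct model_agent {
	int id;
};

struct model_map_entry {
	const void *dev;		/* key: stands in for the tee_client_device */
	struct model_agent *agent;	/* value: per-instance agent state */
};

static struct model_map_entry model_map[MODEL_MAX_AGENTS];

/* Probe path: remember which agent belongs to which device. */
int model_map_store(const void *dev, struct model_agent *agent)
{
	assert(dev != NULL);

	for (size_t i = 0; i < MODEL_MAX_AGENTS; i++) {
		if (!model_map[i].dev) {
			model_map[i].dev = dev;
			model_map[i].agent = agent;
			return 0;
		}
	}

	return -1;	/* table full */
}

/* Remove path: find the agent for this device and drop the entry. */
struct model_agent *model_map_erase(const void *dev)
{
	for (size_t i = 0; i < MODEL_MAX_AGENTS; i++) {
		if (model_map[i].dev == dev) {
			struct model_agent *agent = model_map[i].agent;

			model_map[i].dev = NULL;
			model_map[i].agent = NULL;
			return agent;
		}
	}

	return NULL;	/* unknown device */
}
```

With a single instance a lone global pointer would be enough; the lookup
only becomes necessary once several devices can be removed independently.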

Please be aware that all of this has been tested as a builtin or as modules
on an emulated setup BUT ONLY with the virtio transport as of now.
Optee is only build tested.

Based on v7.0-rc4

This being an RFC, I am NOT completely sure of the final code layout (we
may make most of this transparent really); I am mostly interested, of
course, in gathering some feedback about the whole mechanism.

Thanks,
Cristian
----
[1]: https://lore.kernel.org/arm-scmi/20240812173340.3912830-1-cristian.marussi@arm.com/
[2]: https://lore.kernel.org/lkml/aaA6t-J2gRy3dE1_@pluto/

Cristian Marussi (4):
  firmware: arm_scmi: Add transport instance handle
  firmware: arm_scmi: Propagate transport instance handle
  firmware: arm_scmi: virtio: Rework transport probe sequence
  firmware: arm_scmi: optee: Rework transport probe sequence

 drivers/firmware/arm_scmi/common.h            |  46 +++++-
 drivers/firmware/arm_scmi/driver.c            |  15 +-
 .../firmware/arm_scmi/transports/mailbox.c    |   5 +-
 drivers/firmware/arm_scmi/transports/optee.c  | 143 ++++++++++--------
 drivers/firmware/arm_scmi/transports/smc.c    |   4 +-
 drivers/firmware/arm_scmi/transports/virtio.c |  99 +++++++-----
 6 files changed, 202 insertions(+), 110 deletions(-)

-- 
2.53.0
Re: [RFC PATCH 0/4] Rework SCMI transport drivers probing sequence
Posted by Cristian Marussi 2 weeks, 4 days ago
On Tue, Mar 17, 2026 at 04:58:07PM +0000, Cristian Marussi wrote:
> Hi,
> 

Hi,

a clarification replying to myself down-below... O_o

> when the SCMI transports were split out into standalone drivers [1] the
> probe sequence was laid out in such a way that:
> 
>  - the transport drivers would have probed first, triggered by the firmware
>    driven discovery process (DT/ACPI)
> 
>  - afterwards the control would have been passed to the core SCMI stack
>    driver via the creation of a dedicated device that would have inherited
>    the original firmware descriptor (since that same DT/ACPI node would
>    have still been needed by the SCMI core driver to be parsed)
> 
> The tricky part came with transport drivers like virtio and optee, since
> they, in turn, depend upfront on a distinct external kernel subsystem;
> IOW they first have to undergo their own subsystem specific
> probe/initialization to become fully operational as transports.
> This kind of initialization sequencing must of course deal with the
> possibility of probe deferrals, BUT at that time we avoided this by using
> a trick in the virtio/optee transports: registering the next stage
> transport drivers ONLY at the end of the subsystem specific probe routine,
> from within the probe itself.
> 
> This behaviour has 2 issues:
> 
>  - it is frowned upon and can lead to hangs in the driver core whenever
>    some core locking is changed as exposed in [2]
>  - it limits these transport drivers to a single instance probing since of
>    course you cannot register the same driver more than once
> 
> All of this was tolerable because optee/virtio are generally only employed
> in a single SCMI instance scenario AND because no hangs were triggered in
> the core driver subsystem... (not saying that was perfectly fine :P)
> 
> This RFC series aims to solve both of the above issues by:
> 
>  - moving the problematic platform driver registration away from the probe
>    into its own module_init/exit
> 
>  - introducing an optional mechanism for the transport drivers to provide
>    to the SCMI core some sort of transport specific handle so that multiple
>    instances can be supported and multiple probed transport instances can
>    be registered and identified by the core (who is who).
>    Note that at the same time this optional handle mechanism is used to
>    synchronize the driver probing sequences and possibly trigger, when
>    needed, probe deferrals, if the transport driver still has to
>    undergo its own subsystem initialization and as such is still NOT fully
>    operational.

Implementation issues aside, one thing that I failed to make clear about
this series is that, while it addresses the platform registration
@probe-time issue above and makes it technically possible to support
multiple SCMI instances also for the optee/virtio transports (as is
already the case for mailbox and smc), the multi instance support for
optee/virtio, as of now, still suffers from some 'structural' limitations
that make it unfit for production and useful ONLY in a testing scenario
(...like in my virtio based setup :D)

The crux of the matter lies in the fact that there is still no way in such
transports to match a specific probed transport instance (device) with the
corresponding SCMI DT top node instance descriptor, which usually, in a
multi instance scenario, describes a different set of protocols, or the
same set of protocols enumerating a different set of resources... so,
e.g., if you want to match and reference by phandle a specific clock
domain from a specific instance, you cannot really be sure that the
instance reference that you have is the instance that you wanted, since it
depends on how that was associated during the probe.
(in mailbox and smc you have an explicit channel/shmem association ...)
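
A toy userspace model of the ambiguity, purely for illustration: it
assumes, hypothetically, that the k-th transport instance to become ready
is simply paired with the k-th SCMI DT node. The names are made up and the
actual pairing policy in the series may well differ; the point is only
that without an explicit association the result depends on probe order.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: all names are hypothetical, not from the series. */
#define MODEL_NR	2

/*
 * probe_order[k] is the transport instance that became ready k-th.
 * On return, dt_node_for_instance[i] is the index of the SCMI DT node
 * that transport instance i ended up bound to.
 */
void model_bind_in_probe_order(const int probe_order[MODEL_NR],
			       int dt_node_for_instance[MODEL_NR])
{
	for (int k = 0; k < MODEL_NR; k++) {
		assert(probe_order[k] >= 0 && probe_order[k] < MODEL_NR);
		/* First come, first served: k-th ready instance gets node k. */
		dt_node_for_instance[probe_order[k]] = k;
	}
}
```

Feeding the two possible probe orders to this function yields the two
opposite pairings, so a phandle into "node 0" may end up talking to either
transport instance.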

Still a lot to reason about here as of now... any suggestion about this
is all the more welcome :P

Thanks,
Cristian