Allow specifying an S2RAM sleep on pre-SYSTEM_SUSPEND PSCI impls

[PATCH 0/3] Allow specifying an S2RAM sleep on pre-SYSTEM_SUSPEND PSCI impls

Posted by Konrad Dybcio 3 weeks, 6 days ago

Certain firmwares expose exactly what PSCI_SYSTEM_SUSPEND does through
CPU_SUSPEND instead. Inform Linux about that.
Please see the commit messages for a more detailed explanation.

This is effectively a more educated follow-up to [1].

The ultimate goal is to stop making Linux think that certain states
only concern cores/clusters, and consequently setting
pm_set_suspend/resume_via_firmware(), so that client drivers (such as
NVMe, see related discussion over at [2]) can make informed decisions
about assuming the power state of the device they govern.

If this series gets green light, I'll push a follow-up one that wires
up said sleep state on Qualcomm SoCs across the board.

[1] https://lore.kernel.org/linux-arm-kernel/20231227-topic-psci_fw_sus-v1-0-6910add70bf3@linaro.org/
[2] https://lore.kernel.org/linux-nvme/20241024-topic-nvmequirk-v1-1-51249999d409@oss.qualcomm.com/

Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
---
Konrad Dybcio (3):
      dt-bindings: arm,psci: Allow S2RAM power_state parameter description
      firmware/psci: Set pm_set_resume/suspend_via_firmware() for SYSTEM_SUSPEND
      firmware/psci: Allow specifying an S2RAM state through CPU_SUSPEND

 Documentation/devicetree/bindings/arm/psci.yaml |  6 ++++
 drivers/firmware/psci/psci.c                    | 44 ++++++++++++++++++++++---
 2 files changed, 46 insertions(+), 4 deletions(-)
---
base-commit: a39230ecf6b3057f5897bc4744a790070cfbe7a8
change-id: 20241028-topic-cpu_suspend_s2ram-28fc095d0aa4

Best regards,
-- 
Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
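
For readers following along, the mechanism the cover letter refers to can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions, not the code from the series: the flag and accessor names mirror
the helpers in include/linux/suspend.h, while model_suspend_begin(),
storage_pick_suspend_action() and the enum are hypothetical names introduced
purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the flag bits kept in include/linux/suspend.h. */
#define PM_SUSPEND_FLAG_FW_SUSPEND (1u << 0)
#define PM_SUSPEND_FLAG_FW_RESUME  (1u << 1)

static unsigned int pm_suspend_global_flags;

static inline void pm_suspend_clear_flags(void)
{
	pm_suspend_global_flags = 0;
}

static inline void pm_set_suspend_via_firmware(void)
{
	pm_suspend_global_flags |= PM_SUSPEND_FLAG_FW_SUSPEND;
}

static inline bool pm_suspend_via_firmware(void)
{
	return !!(pm_suspend_global_flags & PM_SUSPEND_FLAG_FW_SUSPEND);
}

/* Hypothetical: the two outcomes a storage driver chooses between. */
enum storage_suspend_action {
	STORAGE_LOW_POWER_STATE,	/* keep the device powered, idle it */
	STORAGE_FULL_SHUTDOWN,		/* assume power is lost, shut down */
};

/*
 * Hypothetical: what a suspend backend does before entering a state.
 * The flag is only set when firmware may cut power beyond cores/clusters.
 */
static inline void model_suspend_begin(bool firmware_cuts_power)
{
	pm_suspend_clear_flags();
	if (firmware_cuts_power)
		pm_set_suspend_via_firmware();
	assert(pm_suspend_via_firmware() == firmware_cuts_power);
}

/* Hypothetical client-driver policy keyed on the firmware flag. */
static inline enum storage_suspend_action storage_pick_suspend_action(void)
{
	return pm_suspend_via_firmware() ? STORAGE_FULL_SHUTDOWN
					 : STORAGE_LOW_POWER_STATE;
}
```

In this model, a suspend that never sets the flag leaves the driver choosing
the low-power state, which is the situation the series is trying to change on
platforms where CPU_SUSPEND is the only way to reach the deep state.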

Re: [PATCH 0/3] Allow specifying an S2RAM sleep on pre-SYSTEM_SUSPEND PSCI impls

Posted by Ulf Hansson 1 week, 3 days ago

On Mon, 28 Oct 2024 at 15:24, Konrad Dybcio <konradybcio@kernel.org> wrote:
>
> Certain firmwares expose exactly what PSCI_SYSTEM_SUSPEND does through
> CPU_SUSPEND instead. Inform Linux about that.
> Please see the commit messages for a more detailed explanation.
>
> This is effectively a more educated follow-up to [1].
>
> The ultimate goal is to stop making Linux think that certain states
> only concern cores/clusters, and consequently setting
> pm_set_suspend/resume_via_firmware(), so that client drivers (such as
> NVMe, see related discussion over at [2]) can make informed decisions
> about assuming the power state of the device they govern.

In my opinion, this is not really the correct way to do it. Using
pm_set_suspend/resume_via_firmware() works fine for x86/ACPI, but not
for PSCI like this. Let me elaborate. If the NVMe storage device is
sharing the same power-rail as the CPU cluster, then yes we should use
PSCI to control it. But is that really the case? If so, there are in
principle two ways forward to deal with this correctly.

1) If PSCI OSI mode is being used, the corresponding NVMe storage
device should be hooked up to the CPU PM cluster domain via genpd and
controlled as any other devices sharing the cluster-rail. In this way,
genpd together with the cpuidle-psci-domain can decide whether it's
okay to turn off the cluster. I believe this is the preferred way, but
2) would work fine too.

2) If PSCI PC mode is being used, a separate channel/interface to the
FW (like SCMI or rpmh in the QC case) should inform the FW whether the
NVMe device still needs its power. This information should then be taken
into account by the PSCI FW when it decides what low-power state to
enter, which ultimately means whether the cluster-rail can be turned off
or not.

Assuming PSCI OSI mode is used here: if 1) doesn't work for you,
please elaborate on why, so we can help to make it work, as it should.

[...]

Kind regards
Uffe
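
The idea behind option 1) above can be sketched as a toy aggregation model:
the cluster domain may only be powered off once every consumer attached to it
is idle. This is a plain userspace illustration and not the genpd API; struct
cluster_domain and the domain_*() / cluster_can_power_off() helpers are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONSUMERS 8

/* Hypothetical: a power domain tracking which consumers are active. */
struct cluster_domain {
	bool consumer_active[MAX_CONSUMERS];
	size_t nr_consumers;
};

/* Attach a consumer (CPU, NVMe, ...); it starts out active. */
static inline size_t domain_attach(struct cluster_domain *d)
{
	assert(d->nr_consumers < MAX_CONSUMERS);
	d->consumer_active[d->nr_consumers] = true;
	return d->nr_consumers++;
}

/* Record that a consumer became active or idle. */
static inline void domain_set_active(struct cluster_domain *d, size_t id,
				     bool active)
{
	assert(id < d->nr_consumers);
	d->consumer_active[id] = active;
}

/* The rail can go down only when no attached consumer is active. */
static inline bool cluster_can_power_off(const struct cluster_domain *d)
{
	for (size_t i = 0; i < d->nr_consumers; i++)
		if (d->consumer_active[i])
			return false;
	return true;
}
```

With the NVMe device attached as one more consumer, an idle CPU alone is not
enough to drop the rail; the storage device has to be idle as well.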

Re: [PATCH 0/3] Allow specifying an S2RAM sleep on pre-SYSTEM_SUSPEND PSCI impls

Posted by Elliot Berman 1 week, 4 days ago

On Mon, Oct 28, 2024 at 03:22:56PM +0100, Konrad Dybcio wrote:
> Certain firmwares expose exactly what PSCI_SYSTEM_SUSPEND does through
> CPU_SUSPEND instead. Inform Linux about that.
> Please see the commit messages for a more detailed explanation.
> 
> This is effectively a more educated follow-up to [1].
> 
> The ultimate goal is to stop making Linux think that certain states
> only concern cores/clusters, and consequently setting
> pm_set_suspend/resume_via_firmware(), so that client drivers (such as
> NVMe, see related discussion over at [2]) can make informed decisions
> about assuming the power state of the device they govern.
> 
> If this series gets green light, I'll push a follow-up one that wires
> up said sleep state on Qualcomm SoCs across the board.
> 
> [1] https://lore.kernel.org/linux-arm-kernel/20231227-topic-psci_fw_sus-v1-0-6910add70bf3@linaro.org/
> [2] https://lore.kernel.org/linux-nvme/20241024-topic-nvmequirk-v1-1-51249999d409@oss.qualcomm.com/
> 

I got a bit confused, but I think I might've pieced it together. Konrad
wants to support s2ram (not clear why) on Qualcomm SoCs from 2015-2023.
On these SoCs, PSCI_SYSTEM_SUSPEND (s2ram) isn't supported but doing
s2idle gets you the same effect. You'd like s2ram to work, so you
provide a way to replace the PSCI_SYSTEM_SUSPEND param with
(effectively) the CPU_SUSPEND command. If this is the wrong
understanding, please correct me.

Could patch 2 be sent separately? I think it seems fine without the
rest of the series.

I'm not sure why you'd like to support s2ram. Is it *only* that you'd
like to be able to set pm_set_suspend/resume_via_firmware()? I hope this
doesn't sound silly: what if you register a platform_s2idle_ops for the
relevant SoCs which calls pm_set_suspend/resume_via_firmware()?

- Elliot
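
The shape of that suggestion can be sketched with a reduced userspace
stand-in. This is an assumption-laden illustration, not kernel code: the real
struct platform_s2idle_ops has more callbacks and is registered with
s2idle_set_ops(); here struct s2idle_ops_model, model_s2idle_set_ops(),
model_s2idle_cycle(), the soc_s2idle_*() callbacks and fw_suspend_flag are
hypothetical names, with the flag standing in for what
pm_set_suspend_via_firmware() would record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for the kernel's struct platform_s2idle_ops. */
struct s2idle_ops_model {
	int (*begin)(void);
	void (*end)(void);
};

static const struct s2idle_ops_model *registered_ops;

/* Stands in for the state set by pm_set_suspend_via_firmware(). */
static bool fw_suspend_flag;

/* Stand-in for s2idle_set_ops(). */
static inline void model_s2idle_set_ops(const struct s2idle_ops_model *ops)
{
	registered_ops = ops;
}

/* Hypothetical SoC callbacks: flag the suspend as firmware-managed. */
static int soc_s2idle_begin(void)
{
	fw_suspend_flag = true;
	return 0;
}

static void soc_s2idle_end(void)
{
	fw_suspend_flag = false;
}

static const struct s2idle_ops_model soc_s2idle_ops = {
	.begin = soc_s2idle_begin,
	.end = soc_s2idle_end,
};

/* One modeled s2idle cycle; returns what a driver would see while asleep. */
static inline bool model_s2idle_cycle(void)
{
	bool seen;

	fw_suspend_flag = false;
	if (registered_ops && registered_ops->begin) {
		int ret = registered_ops->begin();

		assert(ret == 0);
		(void)ret;
	}
	seen = fw_suspend_flag;
	if (registered_ops && registered_ops->end)
		registered_ops->end();
	return seen;
}
```

Without registered ops a driver sees the flag clear during the cycle; once the
SoC-specific ops are registered, the same s2idle path reports a
firmware-managed suspend without touching the PSCI code.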

Re: [PATCH 0/3] Allow specifying an S2RAM sleep on pre-SYSTEM_SUSPEND PSCI impls

Posted by Manivannan Sadhasivam 1 week, 5 days ago

On Mon, Oct 28, 2024 at 03:22:56PM +0100, Konrad Dybcio wrote:
> Certain firmwares expose exactly what PSCI_SYSTEM_SUSPEND does through
> CPU_SUSPEND instead. Inform Linux about that.
> Please see the commit messages for a more detailed explanation.
> 

It is still not PSCI_SYSTEM_SUSPEND though...

> This is effectively a more educated follow-up to [1].
> 
> The ultimate goal is to stop making Linux think that certain states
> only concern cores/clusters, and consequently setting
> pm_set_suspend/resume_via_firmware(), so that client drivers (such as
> NVMe, see related discussion over at [2]) can make informed decisions
> about assuming the power state of the device they govern.
> 
> If this series gets green light, I'll push a follow-up one that wires
> up said sleep state on Qualcomm SoCs across the board.
> 

Sorry. I don't think PSCI is the right place for this. Qcom SoCs have a common
firmware across all segments (mostly), so there is no S2R involved and only
S2Idle. If you use PSCI to implement suspend_via_firmware(), then all the SoCs
making use of the PSCI implementation will have the same behavior. I don't think
we would want that.

For instance, if a Qcom SoC is used in an android tablet with the same firmware,
then this would allow the NVMe device to be turned off during system suspend all
the time when user presses the lock button. And this will cause NVMe device to
wear out faster. The said approach will work fine for non-android usecases
though.

I have a couple of ideas in mind that I will post to NVMe list itself.

- Mani

> [1] https://lore.kernel.org/linux-arm-kernel/20231227-topic-psci_fw_sus-v1-0-6910add70bf3@linaro.org/
> [2] https://lore.kernel.org/linux-nvme/20241024-topic-nvmequirk-v1-1-51249999d409@oss.qualcomm.com/
> 
> Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
> ---
> Konrad Dybcio (3):
>       dt-bindings: arm,psci: Allow S2RAM power_state parameter description
>       firmware/psci: Set pm_set_resume/suspend_via_firmware() for SYSTEM_SUSPEND
>       firmware/psci: Allow specifying an S2RAM state through CPU_SUSPEND
> 
>  Documentation/devicetree/bindings/arm/psci.yaml |  6 ++++
>  drivers/firmware/psci/psci.c                    | 44 ++++++++++++++++++++++---
>  2 files changed, 46 insertions(+), 4 deletions(-)
> ---
> base-commit: a39230ecf6b3057f5897bc4744a790070cfbe7a8
> change-id: 20241028-topic-cpu_suspend_s2ram-28fc095d0aa4
> 
> Best regards,
> -- 
> Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
> 

-- 
மணிவண்ணன் சதாசிவம்

Re: [PATCH 0/3] Allow specifying an S2RAM sleep on pre-SYSTEM_SUSPEND PSCI impls

Posted by Konrad Dybcio 1 week, 5 days ago

On 11/12/24 19:01, Manivannan Sadhasivam wrote:
> On Mon, Oct 28, 2024 at 03:22:56PM +0100, Konrad Dybcio wrote:
>> Certain firmwares expose exactly what PSCI_SYSTEM_SUSPEND does through
>> CPU_SUSPEND instead. Inform Linux about that.
>> Please see the commit messages for a more detailed explanation.
>>
> 
> It is still not PSCI_SYSTEM_SUSPEND though...

It *literally* does the same thing on devices where it's exposed.

> 
>> This is effectively a more educated follow-up to [1].
>>
>> The ultimate goal is to stop making Linux think that certain states
>> only concern cores/clusters, and consequently setting
>> pm_set_suspend/resume_via_firmware(), so that client drivers (such as
>> NVMe, see related discussion over at [2]) can make informed decisions
>> about assuming the power state of the device they govern.
>>
>> If this series gets green light, I'll push a follow-up one that wires
>> up said sleep state on Qualcomm SoCs across the board.
>>
> 
> Sorry. I don't think PSCI is the right place for this. Qcom SoCs have a common
> firmware across all segments (mostly), 

This ^

> so there is no S2R involved and only S2Idle.

is not at all related to this ^, the "so" makes no sense.

(also you're wrong, this *is* S2RAM)

> If you use PSCI to implement suspend_via_firmware(), then all the SoCs
> making use of the PSCI implementation will have the same behavior. I don't think
> we would want that.

This is an issue with the NVMe framework that is totally unrelated to this
change, see below. Also, the code only sets that on targets where such state
exists and is described.

> For instance, if a Qcom SoC is used in an android tablet with the same firmware,
> then this would allow the NVMe device to be turned off during system suspend all
> the time when user presses the lock button. And this will cause NVMe device to
> wear out faster. The said approach will work fine for non-android usecases
> though.

The NVMe framework doesn't make a distinction between "phone screen off" and
"laptop lid closed & thrown in a bag" on *any* platform. The usecase you're
describing is not supported as of today since nobody *actually* has NVMe on a
phone that also happens to run upstream Linux.
I'm not going to solve imaginary problems.

Besides, userspace already has sysfs to tune device power state knobs. Which
Android uses very extensively on market devices.

Konrad

Re: [PATCH 0/3] Allow specifying an S2RAM sleep on pre-SYSTEM_SUSPEND PSCI impls

Posted by Manivannan Sadhasivam 1 week, 5 days ago

On Tue, Nov 12, 2024 at 07:32:36PM +0100, Konrad Dybcio wrote:
> 
> 
> On 11/12/24 19:01, Manivannan Sadhasivam wrote:
> > On Mon, Oct 28, 2024 at 03:22:56PM +0100, Konrad Dybcio wrote:
> > > Certain firmwares expose exactly what PSCI_SYSTEM_SUSPEND does through
> > > CPU_SUSPEND instead. Inform Linux about that.
> > > Please see the commit messages for a more detailed explanation.
> > > 
> > 
> > It is still not PSCI_SYSTEM_SUSPEND though...
> 
> It *literally* does the same thing on devices where it's exposed.
> 

But still...

> > 
> > > This is effectively a more educated follow-up to [1].
> > > 
> > > The ultimate goal is to stop making Linux think that certain states
> > > only concern cores/clusters, and consequently setting
> > > pm_set_suspend/resume_via_firmware(), so that client drivers (such as
> > > NVMe, see related discussion over at [2]) can make informed decisions
> > > about assuming the power state of the device they govern.
> > > 
> > > If this series gets green light, I'll push a follow-up one that wires
> > > up said sleep state on Qualcomm SoCs across the board.
> > > 
> > 
> > Sorry. I don't think PSCI is the right place for this. Qcom SoCs have a common
> > firmware across all segments (mostly),
> 
> This ^
> 
> > so there is no S2R involved and only S2Idle.
> 
> is not at all related to this ^, the "so" makes no sense.
> 
> (also you're wrong, this *is* S2RAM)
> 

What? Qcom SoCs supporting S2R? I've never heard of that.

> > If you use PSCI to implement suspend_via_firmware(), then all the SoCs
> > making use of the PSCI implementation will have the same behavior. I don't think
> > we would want that.
> 
> This is an issue with the NVMe framework that is totally unrelated to this
> change, see below. Also, the code only sets that on targets where such state
> exists and is described.
> 

Well, you are doing it just because you want the NVMe device to learn about the
platform requirement.

> > For instance, if a Qcom SoC is used in an android tablet with the same firmware,
> > then this would allow the NVMe device to be turned off during system suspend all
> > the time when user presses the lock button. And this will cause NVMe device to
> > wear out faster. The said approach will work fine for non-android usecases
> > though.
> 
> The NVMe framework doesn't make a distinction between "phone screen off" and
> "laptop lid closed & thrown in a bag" on *any* platform. The usecase you're
> describing is not supported as of today since nobody *actually* has NVMe on a
> phone that also happens to run upstream Linux.
> I'm not going to solve imaginary problems.
> 

Not just phone, NVMe device could be running on an android tablet. I'm not
talking about an imaginary problem, but a real problem that is in the foreseeable
future (that is also the reason why NVMe developers don't want to put the
device into power down mode always during system suspend).

And with this change, you are just going to make the NVMe lifetime miserable on
those platforms.

- Mani

> Besides, userspace already has sysfs to tune device power state knobs. Which
> Android uses very extensively on market devices.
> 
> Konrad

-- 
மணிவண்ணன் சதாசிவம்

Re: [PATCH 0/3] Allow specifying an S2RAM sleep on pre-SYSTEM_SUSPEND PSCI impls

Posted by Konrad Dybcio 1 week, 5 days ago


On 11/12/24 19:43, Manivannan Sadhasivam wrote:
> On Tue, Nov 12, 2024 at 07:32:36PM +0100, Konrad Dybcio wrote:
>>
>>
>> On 11/12/24 19:01, Manivannan Sadhasivam wrote:
>>> On Mon, Oct 28, 2024 at 03:22:56PM +0100, Konrad Dybcio wrote:
>>>> Certain firmwares expose exactly what PSCI_SYSTEM_SUSPEND does through
>>>> CPU_SUSPEND instead. Inform Linux about that.
>>>> Please see the commit messages for a more detailed explanation.
>>>>
>>>
>>> It is still not PSCI_SYSTEM_SUSPEND though...
>>
>> It *literally* does the same thing on devices where it's exposed.
>>
> 
> But still...

Still-what? We can't replace the signed firmware on (unironically) tens
of millions of devices in the wild and this is how it exposes that sleep
state. This is how arm platforms did it before the PSCI spec was
updated and SYSTEM_SUSPEND is *still optional today*.


>>>> This is effectively a more educated follow-up to [1].
>>>>
>>>> The ultimate goal is to stop making Linux think that certain states
>>>> only concern cores/clusters, and consequently setting
>>>> pm_set_suspend/resume_via_firmware(), so that client drivers (such as
>>>> NVMe, see related discussion over at [2]) can make informed decisions
>>>> about assuming the power state of the device they govern.
>>>>
>>>> If this series gets green light, I'll push a follow-up one that wires
>>>> up said sleep state on Qualcomm SoCs across the board.
>>>>
>>>
>>> Sorry. I don't think PSCI is the right place for this. Qcom SoCs have a common
>>> firmware across all segments (mostly),
>>
>> This ^
>>
>>> so there is no S2R involved and only S2Idle.
>>
>> is not at all related to this ^, the "so" makes no sense.
>>
>> (also you're wrong, this *is* S2RAM)
>>
> 
> What? Qcom SoCs supporting S2R? I've never heard of that.

Maybe you're thinking of hibernation, which is not widely (if at all)
supported.


>>> If you use PSCI to implement suspend_via_firmware(), then all the SoCs
>>> making use of the PSCI implementation will have the same behavior. I don't think
>>> we would want that.
>>
>> This is an issue with the NVMe framework that is totally unrelated to this
>> change, see below. Also, the code only sets that on targets where such state
>> exists and is described.
>>
> 
> Well, you are doing it just because you want the NVMe device to learn about the
> platform requirement.

And I can't see why you're having a problem with this. It's exactly how it
works on x86 too. Modern Standby also shuts down storage on Windows,
regardless of the CPU architecture.
  
>>> For instance, if a Qcom SoC is used in an android tablet with the same firmware,
>>> then this would allow the NVMe device to be turned off during system suspend all
>>> the time when user presses the lock button. And this will cause NVMe device to
>>> wear out faster. The said approach will work fine for non-android usecases
>>> though.
>>
>> The NVMe framework doesn't make a distinction between "phone screen off" and
>> "laptop lid closed & thrown in a bag" on *any* platform. The usecase you're
>> describing is not supported as of today since nobody *actually* has NVMe on a
>> phone that also happens to run upstream Linux.
>> I'm not going to solve imaginary problems.
>>
> 
> Not just phone, NVMe device could be running on an android tablet.

'Could' very much makes it imaginary. There are no supported devices that
fall into this category.

> I'm not
> talking about an imaginary problem, but a real problem that is in the foreseeable
> future

Keyword: future. This issue has been on hold for years because of 'issues'
that are pinky promised to happen eventually, without anyone suggesting any
actually acceptable solutions. This just undermines progress.

> (that is also the reason why NVMe developers don't want to put the
> device into power down mode always during system suspend).

This is the current behavior on any new x86 laptop, and has been for a
couple of years.

> And with this change, you are just going to make the NVMe lifetime miserable on
> those platforms.

Fearmongering and hearsay. See above.

Konrad

Re: [PATCH 0/3] Allow specifying an S2RAM sleep on pre-SYSTEM_SUSPEND PSCI impls

Posted by Manivannan Sadhasivam 1 week, 4 days ago

On Tue, Nov 12, 2024 at 08:04:34PM +0100, Konrad Dybcio wrote:
> 
> 
> On 11/12/24 19:43, Manivannan Sadhasivam wrote:
> > On Tue, Nov 12, 2024 at 07:32:36PM +0100, Konrad Dybcio wrote:
> > > 
> > > 
> > > On 11/12/24 19:01, Manivannan Sadhasivam wrote:
> > > > On Mon, Oct 28, 2024 at 03:22:56PM +0100, Konrad Dybcio wrote:
> > > > > Certain firmwares expose exactly what PSCI_SYSTEM_SUSPEND does through
> > > > > CPU_SUSPEND instead. Inform Linux about that.
> > > > > Please see the commit messages for a more detailed explanation.
> > > > > 
> > > > 
> > > > It is still not PSCI_SYSTEM_SUSPEND though...
> > > 
> > > It *literally* does the same thing on devices where it's exposed.
> > > 
> > 
> > But still...
> 
> Still-what? We can't replace the signed firmware on (unironically) tens
> of millions of devices in the wild and this is how it exposes that sleep
> state. This is how arm platforms did it before the PSCI spec was
> updated and SYSTEM_SUSPEND is *still optional today*.
> 

I never asked you to replace the firmware in the first place, so please don't
attribute to me something I never said. I see this approach as a way of
abusing/faking PSCI system suspend.

Moreover, I heard from Bjorn that Qcom doesn't want to put the PCIe devices into
D3Cold during system suspend for future platforms (based on their
experimentation). So if drivers rely on this static information, then even Qcom
cannot achieve what they want.

> 
> > > > > This is effectively a more educated follow-up to [1].
> > > > > 
> > > > > The ultimate goal is to stop making Linux think that certain states
> > > > > only concern cores/clusters, and consequently setting
> > > > > pm_set_suspend/resume_via_firmware(), so that client drivers (such as
> > > > > NVMe, see related discussion over at [2]) can make informed decisions
> > > > > about assuming the power state of the device they govern.
> > > > > 
> > > > > If this series gets green light, I'll push a follow-up one that wires
> > > > > up said sleep state on Qualcomm SoCs across the board.
> > > > > 
> > > > 
> > > > Sorry. I don't think PSCI is the right place for this. Qcom SoCs have a common
> > > > firmware across all segments (mostly),
> > > 
> > > This ^
> > > 
> > > > so there is no S2R involved and only S2Idle.
> > > 
> > > is not at all related to this ^, the "so" makes no sense.
> > > 
> > > (also you're wrong, this *is* S2RAM)
> > > 
> > 
> > What? Qcom SoCs supporting S2R? I've never heard of that.
> 
> Maybe you're thinking of hibernation, which is not widely (if at all)
> supported.
> 

Not hibernation. The Qcom platforms I'm aware of all support only S2Idle. I
don't work for Qcom, so I may be missing some insider information.

> 
> > > > If you use PSCI to implement suspend_via_firmware(), then all the SoCs
> > > > making use of the PSCI implementation will have the same behavior. I don't think
> > > > we would want that.
> > > 
> > > This is an issue with the NVMe framework that is totally unrelated to this
> > > change, see below. Also, the code only sets that on targets where such state
> > > exists and is described.
> > > 
> > 
> > Well, you are doing it just because you want the NVMe device to learn about the
> > platform requirement.
> 
> And I can't see why you're having a problem with this. It's exactly how it
> works on x86 too. Modern Standby also shuts down storage on Windows,
> regardless of the CPU architecture.

It is not just my problem. I'm relaying the concern that the NVMe folks have,
and have already expressed, about the similar solutions I proposed. And I
cannot just overrule them.

> > > > For instance, if a Qcom SoC is used in an android tablet with the same firmware,
> > > > then this would allow the NVMe device to be turned off during system suspend all
> > > > the time when user presses the lock button. And this will cause NVMe device to
> > > > wear out faster. The said approach will work fine for non-android usecases
> > > > though.
> > > 
> > > The NVMe framework doesn't make a distinction between "phone screen off" and
> > > "laptop lid closed & thrown in a bag" on *any* platform. The usecase you're
> > > describing is not supported as of today since nobody *actually* has NVMe on a
> > > phone that also happens to run upstream Linux.
> > > I'm not going to solve imaginary problems.
> > > 
> > 
> > Not just phone, NVMe device could be running on an android tablet.
> 
> 'Could' very much makes it imaginary. There are no supported devices that
> fall into this category.
> 

Agreed that there are no products in the market (yet). But having NVMe on
handheld devices is not something I would call 'imaginary'.

> > I'm not
> > talking about an imaginary problem, but a real problem that is in the foreseeable
> > future
> 
> Keyword: future. This issue has been on hold for years because of 'issues'
> that are pinky promised to happen eventually, without anyone suggesting any
> actually acceptable solutions. This just undermines progress.
> 

Not true. Solutions have been suggested, but it always takes time to reach
consensus. One of the approaches that I'm about to propose is to have a
userspace knob that specifies whether the device can be powered down or not
(leaving the default behavior to put it in a low power state), because the
decision to power the devices down or put them in a low power state sounds
more like a userspace policy. It was discussed at LPC 2023.

> > (that is also the reason why NVMe developers don't want to put the
> > device into power down mode always during system suspend).
> 
> This is the current behavior on any new x86 laptop, and has been for a
> couple of years.
> 
> > And with this change, you are just going to make the NVMe lifetime miserable on
> > those platforms.
> 
> Fearmongering and hearsay. See above.
> 

I can only wish you best of luck with this approach!

- Mani

-- 
மணிவண்ணன் சதாசிவம்