[PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot

Mukesh Ojha posted 1 patch 2 years, 2 months ago
drivers/base/firmware_loader/fallback.c | 6 +++++-
drivers/base/firmware_loader/firmware.h | 1 +
drivers/base/firmware_loader/main.c     | 1 +
3 files changed, 7 insertions(+), 1 deletion(-)
[PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot
Posted by Mukesh Ojha 2 years, 2 months ago
There could be following scenario where there is a ongoing reboot
is going from processA which tries to call all the reboot notifier
callback and one of them is firmware reboot call which tries to
abort all the ongoing firmware userspace request under fw_lock but
there could be another processB which tries to do request firmware,
which came just after abort done from ProcessA and ask for userspace
to load the firmware and this can stop the ongoing reboot ProcessA
to stall for next 60s(default timeout) which may not be expected
behaviour everyone like to see, instead we should abort any firmware
load request which came once firmware knows about the reboot through
notification.

      ProcessA                             ProcessB

kernel_restart_prepare
  blocking_notifier_call_chain
   fw_shutdown_notify
     kill_pending_fw_fallback_reqs
      __fw_load_abort
       fw_state_aborted	               request_firmware
         __fw_state_set                 firmware_fallback_sysfs
...                                       fw_load_from_user_helper
..                                         ...
.                                          ..
                                            usermodehelper_read_trylock
                                             fw_load_sysfs_fallback
                                              fw_sysfs_wait_timeout
usermodehelper_disable
 __usermodehelper_disable
  down_write()

Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
---
Changes from v2: https://lore.kernel.org/lkml/1696431327-7369-1-git-send-email-quic_mojha@quicinc.com/
 - Renamed the flag to fw_abort_load.

Changes from v1: https://lore.kernel.org/lkml/1694773288-15755-1-git-send-email-quic_mojha@quicinc.com/
 - Renamed the flag to fw_load_abort.
 - Kept the flag under fw_lock.
 - Repharsed commit text.

 drivers/base/firmware_loader/fallback.c | 6 +++++-
 drivers/base/firmware_loader/firmware.h | 1 +
 drivers/base/firmware_loader/main.c     | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
index bf68e3947814..a162020e98f2 100644
--- a/drivers/base/firmware_loader/fallback.c
+++ b/drivers/base/firmware_loader/fallback.c
@@ -57,6 +57,10 @@ void kill_pending_fw_fallback_reqs(bool only_kill_custom)
 		if (!fw_priv->need_uevent || !only_kill_custom)
 			 __fw_load_abort(fw_priv);
 	}
+
+	if (!only_kill_custom)
+		fw_abort_load = true;
+
 	mutex_unlock(&fw_lock);
 }
 
@@ -86,7 +90,7 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
 	}
 
 	mutex_lock(&fw_lock);
-	if (fw_state_is_aborted(fw_priv)) {
+	if (fw_abort_load || fw_state_is_aborted(fw_priv)) {
 		mutex_unlock(&fw_lock);
 		retval = -EINTR;
 		goto out;
diff --git a/drivers/base/firmware_loader/firmware.h b/drivers/base/firmware_loader/firmware.h
index bf549d6500d7..8b2549910f36 100644
--- a/drivers/base/firmware_loader/firmware.h
+++ b/drivers/base/firmware_loader/firmware.h
@@ -86,6 +86,7 @@ struct fw_priv {
 
 extern struct mutex fw_lock;
 extern struct firmware_cache fw_cache;
+extern bool fw_abort_load;
 
 static inline bool __fw_state_check(struct fw_priv *fw_priv,
 				    enum fw_status status)
diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
index b58c42f1b1ce..2658bdcdf1ee 100644
--- a/drivers/base/firmware_loader/main.c
+++ b/drivers/base/firmware_loader/main.c
@@ -93,6 +93,7 @@ static inline struct fw_priv *to_fw_priv(struct kref *ref)
 DEFINE_MUTEX(fw_lock);
 
 struct firmware_cache fw_cache;
+bool fw_abort_load;
 
 void fw_state_init(struct fw_priv *fw_priv)
 {
-- 
2.7.4
Re: [PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot
Posted by Luis Chamberlain 2 years, 2 months ago
On Thu, Oct 05, 2023 at 10:37:00AM +0530, Mukesh Ojha wrote:
> There could be following scenario where there is a ongoing reboot
> is going from processA which tries to call all the reboot notifier
> callback and one of them is firmware reboot call which tries to
> abort all the ongoing firmware userspace request under fw_lock but
> there could be another processB which tries to do request firmware,
> which came just after abort done from ProcessA and ask for userspace
> to load the firmware and this can stop the ongoing reboot ProcessA
> to stall for next 60s(default timeout) which may not be expected
> behaviour everyone like to see, instead we should abort any firmware
> load request which came once firmware knows about the reboot through
> notification.
> 
>       ProcessA                             ProcessB
> 
> kernel_restart_prepare
>   blocking_notifier_call_chain
>    fw_shutdown_notify
>      kill_pending_fw_fallback_reqs
>       __fw_load_abort
>        fw_state_aborted	               request_firmware
>          __fw_state_set                 firmware_fallback_sysfs
> ...                                       fw_load_from_user_helper
> ..                                         ...
> .                                          ..
>                                             usermodehelper_read_trylock
>                                              fw_load_sysfs_fallback
>                                               fw_sysfs_wait_timeout
> usermodehelper_disable
>  __usermodehelper_disable
>   down_write()
> 
> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
> ---
> Changes from v2: https://lore.kernel.org/lkml/1696431327-7369-1-git-send-email-quic_mojha@quicinc.com/
>  - Renamed the flag to fw_abort_load.
> 
> Changes from v1: https://lore.kernel.org/lkml/1694773288-15755-1-git-send-email-quic_mojha@quicinc.com/
>  - Renamed the flag to fw_load_abort.
>  - Kept the flag under fw_lock.
>  - Repharsed commit text.
> 
>  drivers/base/firmware_loader/fallback.c | 6 +++++-
>  drivers/base/firmware_loader/firmware.h | 1 +
>  drivers/base/firmware_loader/main.c     | 1 +
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> index bf68e3947814..a162020e98f2 100644
> --- a/drivers/base/firmware_loader/fallback.c
> +++ b/drivers/base/firmware_loader/fallback.c
> @@ -57,6 +57,10 @@ void kill_pending_fw_fallback_reqs(bool only_kill_custom)

This routine uses a bool for only_kill_custom for when the events we
should kill ar ecustom or not. Piggy backing on it to assume that the
negative of value represents a shutdown is abusing the semantics
and muddies the waters. So to avoid that, just extend the arguments
to kill_pending_fw_fallback_reqs() for a new bool shutdown, that allows
the code to be clearer and the intent is kept clear.

>  		if (!fw_priv->need_uevent || !only_kill_custom)
>  			 __fw_load_abort(fw_priv);
>  	}
> +
> +	if (!only_kill_custom)
> +		fw_abort_load = true;
> +
>  	mutex_unlock(&fw_lock);
>  }
>  
> @@ -86,7 +90,7 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
>  	}
>  
>  	mutex_lock(&fw_lock);
> -	if (fw_state_is_aborted(fw_priv)) {
> +	if (fw_abort_load || fw_state_is_aborted(fw_priv)) {

However, do we really need this ? Could we just use:

if (system_state == SYSTEM_HALT ||
    system_state == SYSTEM_POWER_OFF ||
    system_state == SYSTEM_RESTART ||
    fw_state_is_aborted(fw_priv))

?

  Luis
Re: [PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot
Posted by Mukesh Ojha 2 years, 2 months ago

On 10/20/2023 12:36 AM, Luis Chamberlain wrote:
> On Thu, Oct 05, 2023 at 10:37:00AM +0530, Mukesh Ojha wrote:
>> There could be following scenario where there is a ongoing reboot
>> is going from processA which tries to call all the reboot notifier
>> callback and one of them is firmware reboot call which tries to
>> abort all the ongoing firmware userspace request under fw_lock but
>> there could be another processB which tries to do request firmware,
>> which came just after abort done from ProcessA and ask for userspace
>> to load the firmware and this can stop the ongoing reboot ProcessA
>> to stall for next 60s(default timeout) which may not be expected
>> behaviour everyone like to see, instead we should abort any firmware
>> load request which came once firmware knows about the reboot through
>> notification.
>>
>>        ProcessA                             ProcessB
>>
>> kernel_restart_prepare
>>    blocking_notifier_call_chain
>>     fw_shutdown_notify
>>       kill_pending_fw_fallback_reqs
>>        __fw_load_abort
>>         fw_state_aborted	               request_firmware
>>           __fw_state_set                 firmware_fallback_sysfs
>> ...                                       fw_load_from_user_helper
>> ..                                         ...
>> .                                          ..
>>                                              usermodehelper_read_trylock
>>                                               fw_load_sysfs_fallback
>>                                                fw_sysfs_wait_timeout
>> usermodehelper_disable
>>   __usermodehelper_disable
>>    down_write()
>>
>> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
>> ---
>> Changes from v2: https://lore.kernel.org/lkml/1696431327-7369-1-git-send-email-quic_mojha@quicinc.com/
>>   - Renamed the flag to fw_abort_load.
>>
>> Changes from v1: https://lore.kernel.org/lkml/1694773288-15755-1-git-send-email-quic_mojha@quicinc.com/
>>   - Renamed the flag to fw_load_abort.
>>   - Kept the flag under fw_lock.
>>   - Repharsed commit text.
>>
>>   drivers/base/firmware_loader/fallback.c | 6 +++++-
>>   drivers/base/firmware_loader/firmware.h | 1 +
>>   drivers/base/firmware_loader/main.c     | 1 +
>>   3 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
>> index bf68e3947814..a162020e98f2 100644
>> --- a/drivers/base/firmware_loader/fallback.c
>> +++ b/drivers/base/firmware_loader/fallback.c
>> @@ -57,6 +57,10 @@ void kill_pending_fw_fallback_reqs(bool only_kill_custom)
> 
> This routine uses a bool for only_kill_custom for when the events we
> should kill ar ecustom or not. Piggy backing on it to assume that the
> negative of value represents a shutdown is abusing the semantics
> and muddies the waters. So to avoid that, just extend the arguments
> to kill_pending_fw_fallback_reqs() for a new bool shutdown, that allows
> the code to be clearer and the intent is kept clear.

Let me understand what does 'only_kill_custom' do,

1. It gets called from reboot notifier where it's default value is 
false. So, the intention here is to kill all the ongoing request along
with !buf->need_uevent ?

commit c4b768934be613fb882e4e4090946218d76c8e1b
Author: Luis R. Rodriguez <mcgrof@kernel.org>
Date:   Tue May 2 01:31:03 2017 -0700

     firmware: share fw fallback killing on reboot/suspend

     We kill pending fallback requests on suspend and reboot,
     the only difference is that on suspend we only kill custom
     fallback requests. Provide a wrapper that lets us customize
     the request with a flag.

     This also lets us simplify the #ifdef'ery over the calls.

     Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2. The second call is from fw_pm_notify() which is only
interested in aborting non-uevent calls.

And in this patch since, we are calling it from reboot which is the 
first case, so piggybacking on the 'only_kill_custom' would be fine for 
this patch. However, if you think we should rename this
'only_kill_custom' to something like its inverse 'kill_all' and reverse 
the below check to be more meaningful now ?

    		if (kill_all || !fw_priv->need_uevent)

> 
>>   		if (!fw_priv->need_uevent || !only_kill_custom)
>>   			 __fw_load_abort(fw_priv);
>>   	}
>> +
>> +	if (!only_kill_custom)
>> +		fw_abort_load = true;
>> +
>>   	mutex_unlock(&fw_lock);
>>   }
>>   
>> @@ -86,7 +90,7 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
>>   	}
>>   
>>   	mutex_lock(&fw_lock);
>> -	if (fw_state_is_aborted(fw_priv)) {
>> +	if (fw_abort_load || fw_state_is_aborted(fw_priv)) {
> 
> However, do we really need this ? Could we just use:
> 
> if (system_state == SYSTEM_HALT ||
>      system_state == SYSTEM_POWER_OFF ||
>      system_state == SYSTEM_RESTART ||
>      fw_state_is_aborted(fw_priv))
> 
> ?

There is slight window where system_state is not yet set and other
reboot notifiers after firmware one are running with ongoing
request_firmware().

-Mukesh

> 
>    Luis
Re: [PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot
Posted by Luis Chamberlain 2 years, 2 months ago
On Mon, Oct 23, 2023 at 06:11:18PM +0530, Mukesh Ojha wrote:
> However, if you think we should rename this
> 'only_kill_custom' to something like its inverse 'kill_all' and reverse the
> below check to be more meaningful now ?
> 
>    		if (kill_all || !fw_priv->need_uevent)

This seems like a better approach to make the intent clear and avoid
future confusion.

  Luis
Re: [PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot
Posted by Mukesh Ojha 2 years, 2 months ago

On 10/23/2023 9:45 PM, Luis Chamberlain wrote:
> On Mon, Oct 23, 2023 at 06:11:18PM +0530, Mukesh Ojha wrote:
>> However, if you think we should rename this
>> 'only_kill_custom' to something like its inverse 'kill_all' and reverse the
>> below check to be more meaningful now ?
>>
>>     		if (kill_all || !fw_priv->need_uevent)
> 
> This seems like a better approach to make the intent clear and avoid
> future confusion.

Thanks, have sent it here.

https://lore.kernel.org/lkml/1698330459-31776-1-git-send-email-quic_mojha@quicinc.com/

-Mukesh

> 
>    Luis