[PATCH v3 0/3] ACPI: battery: Do not generate too much pressure on ACPI methods

Rong Zhang posted 3 patches 4 hours ago
drivers/acpi/battery.c | 235 +++++++++++++++++++++++++++++++++++++------------
1 file changed, 177 insertions(+), 58 deletions(-)
[PATCH v3 0/3] ACPI: battery: Do not generate too much pressure on ACPI methods
Posted by Rong Zhang 4 hours ago
The acpi_battery_notify() and acpi_battery_get_property() callbacks
sometimes generate too much pressure on corresponding ACPI methods. On
some devices with fragile ACPI implementation, these methods share the
same mutex protecting EC accesses (hence slow to execute) with a lot of
other EC-related methods. Such pressure on them eventually leads to a
catastrophic situation that a bunch of ACPI method calls fail to acquire
the same mutex due to timeout. The firmware of these devices doesn't
handle mutex acquisition failure gracefully and return garbage data,
causing even more chaos.

For acpi_battery_notify(), a very common pattern in EC queries that
emits two consecutive battery notifications with event IDs 0x80 and 0x81
updates battery state and calls power_supply_changed() twice within a
short period, generating significant pressure on _STA, _BST and
_BIX/_BIF methods. Not only that, power_supply_ext properties may also
rely on some other ACPI methods, so both uevent assembling and userspace
processes call them. It becomes a nightmare when all these methods share
the same ACPI mutex and hence vulnerable to lock starvation. Even worse,
after the first uevent reaches userspace, some userspace processes start
to read all battery properties in order to refresh their internal
states, which competes with the second notification's handling and
uevent assembling, exacerbating the lock starvation.

For acpi_battery_get_property(), it generates too much pressure on the
_BST method because of the lack of synchronization. In detail, it
sometimes nullifies the cache mechanism of acpi_battery_get_state() when
multiple processes read power supply properties simultaneously, which
usually happens after a uevent. Normally, emitting a uevent implies that
the cache must have been refreshed due to power_supply_uevent() reading
all properties, so the mentioned processes should have seen cache hits.
Unfortunately, these fragile devices' power_supply_ext properties are
somehow slow to read after battery events, resulting in cache expiration
before power_supply_uevent() finishes. Hence, once the uevent reaches
userspace, the _BST method will be executed multiple times within a
short period due to userspace processes reading all properties again.

Improve acpi_battery_notify() by merging consecutive battery
notifications within 10ms using a delayed work, so that they only
refresh and/or update battery state once. ACPI netlink event and
notifier call chain are still triggered multiple times in order not to
break other components. Finally, call power_supply_changed() once and
lead to a single uevent instead of a bunch, preventing userspace
programs from causing too much pressure on power supply properties and
the underlying ACPI methods.

Fix acpi_battery_get_property() by introducing a mutex to protect all
accesses to battery properties, so that acpi_battery_get_property() can
take the advantage of the mutex and synchronize itself. This also
prevents potential race conditions, e.g., when multiple tasks read power
supply properties simultaneously, or when other callbacks are called
during its execution.

Since the series touches acpi_battery_alarm_store(), also convert the
use of sscanf("%lu\n") into the more preferred kstrtoul() beforehand.

With the series, the lock starvation issue on mentioned devices is
greatly improved according to the feedback from one of the device
owners.

Reported-by: Rick <rickk1166@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221065
Signed-off-by: Rong Zhang <i@rong.moe>

---
Changes in v3:
- Address Sashiko's concerns on my last-minute changes:
  - Set the number base to 10 in order not to break the ABI
  - Do not overwrite the initial value of `ret' in
    acpi_battery_get_property()
  - https://sashiko.dev/#/patchset/20260611-b4-acpi-battery-notification-v2-0-4e8ed651a151%40rong.moe
- Link to v2: https://patch.msgid.link/20260611-b4-acpi-battery-notification-v2-0-4e8ed651a151@rong.moe

Changes in v2:
- Address Sashiko's concerns:
  - Return from acpi_battery_notification_worker() early when the fifo
    is empty
  - Use pr_err_ratelimited() for potential event storms
  - Add missing `\n' in a printk message
  - Use a separated mutex to protect all properties instead of reusing
    update_lock
  - https://sashiko.dev/#/patchset/20260527-b4-acpi-battery-notification-v1-0-2303bed8ec0b%40rong.moe
- Minimalize the critical section of acpi_battery_notify()
- Rearrange the series
- Dropped Tested-by from patch 3 due to massive rewrite
- Link to v1: https://patch.msgid.link/20260527-b4-acpi-battery-notification-v1-0-2303bed8ec0b@rong.moe

---
Rong Zhang (3):
      ACPI: battery: Merge consecutive battery notifications
      ACPI: battery: Use kstrtoul() over sscanf("%lu\n")
      ACPI: battery: Protect all properties with a separated mutex

 drivers/acpi/battery.c | 235 +++++++++++++++++++++++++++++++++++++------------
 1 file changed, 177 insertions(+), 58 deletions(-)
---
base-commit: acb7500801e98639f6d8c2d796ed9f64cba83d3a
change-id: 20260520-b4-acpi-battery-notification-90d781a3f217

Thanks,
Rong