Fixes for concurrency and memory ordering bugs that were identified by
the Sashiko review tool when proposing the GS101 ACPM TMU addition.
While these bugs are genuine flaws, we haven't hit them yet, likely
because we don't have enough ACPM clients upstreamed to trigger the
race conditions.
These fixes can go in either at the -rc phase or as regular patches for
the next merge window. If the latter, we'll need a dedicated branch, as
these patches, together with the other ACPM thermal preparatory patches,
will be needed by the upcoming GS101 ACPM thermal driver.
Thanks,
ta
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
Changes in v4:
- Drop the SRAM boundary checks patch, incomplete band-aid.
- Split the concurrency and memory ordering fixes into dedicated logical
patches. This involved reordering the last patches to avoid modifying
the same code twice.
- Add a missing memory barrier in acpm_get_rx() to prevent weakly
ordered CPUs from advancing the hardware RX pointer before the
payload reads have completed.
- Fix a false-timeout race in the polling path by decoupling the
polling thread from the global allocator bitmap.
- Use test_and_set_bit_lock(). The address dependency was not enforced
when using the plain non-atomic read with find_next_zero_bit().
- Fix kernel doc.
- Link to v3: https://lore.kernel.org/r/20260429-acpm-fixes-sashiko-reports-v3-0-47cf74ab09ad@linaro.org
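For reviewers who want a picture of the acpm_get_rx() ordering item
above, here is a minimal user-space sketch using C11 atomics. It is not
the driver code: `struct rx_ring`, `rx_consume()`, `RING_SLOTS` and
`SLOT_BYTES` are hypothetical names, and a release store stands in for
whatever barrier the actual patch uses. The point is only that the
payload reads must complete before the consumer index is published.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 4	/* hypothetical ring depth */
#define SLOT_BYTES 16	/* hypothetical payload size */

struct rx_ring {
	uint8_t slots[RING_SLOTS][SLOT_BYTES];
	_Atomic uint32_t rear;	/* consumer index, observed by the producer */
};

/* Copy one payload out of the ring, then hand the slot back. */
static void rx_consume(struct rx_ring *ring, uint8_t *out)
{
	uint32_t rear = atomic_load_explicit(&ring->rear,
					     memory_order_relaxed);

	assert(rear < RING_SLOTS);

	/* Payload reads: these must finish before the slot is released. */
	memcpy(out, ring->slots[rear], SLOT_BYTES);

	/*
	 * Release store: the reads above cannot be reordered after it, so
	 * a producer that observes the new index with an acquire load may
	 * safely overwrite the slot.
	 */
	atomic_store_explicit(&ring->rear, (rear + 1) % RING_SLOTS,
			      memory_order_release);
}
```

With a plain or relaxed store in place of the release store, a weakly
ordered CPU could make the new index visible while the copy is still in
flight, which is the hazard the changelog entry describes.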
Changes in v3:
- validate more SRAM parameters and queue pointers (sashiko)
- consider/fix the acquire path (Krzysztof) - patch was moved
last in the series to avoid touching the same code twice.
- Link to v2: https://lore.kernel.org/r/20260427-acpm-fixes-sashiko-reports-v2-0-1ff8de94a997@linaro.org
Changes in v2:
- drop patch "firmware: samsung: acpm: Fix sequence number leak and infinite loop"
The patch freed sequence numbers on mailbox failures or timeouts. Because
the message is already in SRAM and tx.front was advanced, a delayed
firmware wake-up will process that abandoned message, stealing the
sequence number from a new thread and causing silent data corruption.
- fix mailbox channel leak when `acpm_achan_alloc_cmds()` failed. Did it
by moving the `devm_add_action_or_reset()` call.
- new patches, the last 3 in the set, fix some more sashiko reports.
- Link to v1: https://lore.kernel.org/r/20260423-acpm-fixes-sashiko-reports-v1-0-2217b790925e@linaro.org
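The sequence number handling discussed in the changelog can be sketched
in user space as follows. This is an analogy under stated assumptions,
not the driver implementation: `seqnum_bitmap`, `seqnum_claim()`,
`seqnum_release()` and `SEQNUM_MAX` are hypothetical, and a C11
fetch-or with acquire ordering plays the role that
test_and_set_bit_lock() plays in the kernel. The claim is atomic and
acquire-ordered, and the scan is bounded so that an exhausted pool
yields an error instead of spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SEQNUM_MAX 64	/* hypothetical pool size, not taken from the driver */

static _Atomic uint64_t seqnum_bitmap;

/* Claim a free sequence number; returns its index, or -1 if none is free. */
static int seqnum_claim(void)
{
	for (int i = 0; i < SEQNUM_MAX; i++) {
		uint64_t mask = UINT64_C(1) << i;
		/* Atomic test-and-set; acquire pairs with the release below. */
		uint64_t old = atomic_fetch_or_explicit(&seqnum_bitmap, mask,
							memory_order_acquire);

		if (!(old & mask))
			return i;	/* the bit was clear, we own it now */
	}

	return -1;	/* pool exhausted: report it rather than loop forever */
}

/* Return a sequence number to the pool. */
static void seqnum_release(int i)
{
	assert(i >= 0 && i < SEQNUM_MAX);
	atomic_fetch_and_explicit(&seqnum_bitmap, ~(UINT64_C(1) << i),
				  memory_order_release);
}
```

A scan that first reads the bitmap non-atomically and only then sets the
bit leaves a window in which two threads can pick the same number;
folding the test and the set into one atomic operation closes it.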
---
Tudor Ambarus (7):
firmware: samsung: acpm: Fix cross-thread RX length corruption
firmware: samsung: acpm: Fix mailbox channel leak on probe error
firmware: samsung: acpm: Fix dummy stubs to return ERR_PTR
firmware: samsung: acpm: Add memory barrier before advancing RX pointer
firmware: samsung: acpm: Fix false timeouts in polling path
firmware: samsung: acpm: Fix missing LKMM barriers in RX and TX paths
firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion
drivers/firmware/samsung/exynos-acpm-dvfs.c | 3 +
drivers/firmware/samsung/exynos-acpm.c | 119 ++++++++++++++++-----
.../linux/firmware/samsung/exynos-acpm-protocol.h | 3 +-
3 files changed, 96 insertions(+), 29 deletions(-)
---
base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
change-id: 20260423-acpm-fixes-sashiko-reports-ae28b6ed5581
Best regards,
--
Tudor Ambarus <tudor.ambarus@linaro.org>