Unlike non-blocking mode, blocking mode has never supported multiple
threads sending on the same channel. Prevent clients from assuming
otherwise by stating this explicitly in the API documentation.
Cc: stable@vger.kernel.org
Signed-off-by: Joonwon Kang <joonwonkang@google.com>
---
v1: Abandon the previous attempts to support multi-thread in blocking
mode and instead declare it is not supported.
drivers/mailbox/mailbox.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index bbc9fd75a95f..b00f7a32e866 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -258,6 +258,10 @@ EXPORT_SYMBOL_GPL(mbox_chan_tx_slots_available);
* over the chan, i.e, tx_done() is made.
* This function could be called from atomic context as it simply
* queues the data and returns a token against the request.
+ * In blocking mode, it is caller's responsibility to serialize threads'
+ * access to a channel if multi-threads are to send messages through the
+ * same channel, i.e. caller should not call this function until any
+ * previous call returns.
*
* Return: Non-negative integer for successful submission (non-blocking mode)
* or transmission over chan (blocking mode).
--
2.54.0.rc1.555.g9c883467ad-goog
On Tue, Apr 21, 2026 at 5:46 AM Joonwon Kang <joonwonkang@google.com> wrote:
>
> Unlike in non-blocking mode, multi-thread has not been supported in
> blocking mode. This commit is to prevent clients from having wrong
> assumption by explicitly specifying this fact to the API doc.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Joonwon Kang <joonwonkang@google.com>
> ---
> v1: Abandon the previous attempts to support multi-thread in blocking
> mode and instead declare it is not supported.
>
> drivers/mailbox/mailbox.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> index bbc9fd75a95f..b00f7a32e866 100644
> --- a/drivers/mailbox/mailbox.c
> +++ b/drivers/mailbox/mailbox.c
> @@ -258,6 +258,10 @@ EXPORT_SYMBOL_GPL(mbox_chan_tx_slots_available);
> * over the chan, i.e, tx_done() is made.
> * This function could be called from atomic context as it simply
> * queues the data and returns a token against the request.
> + * In blocking mode, it is caller's responsibility to serialize threads'
> + * access to a channel if multi-threads are to send messages through the
> + * same channel, i.e. caller should not call this function until any
> + * previous call returns.
> *
> * Return: Non-negative integer for successful submission (non-blocking mode)
> * or transmission over chan (blocking mode).
> --
> 2.54.0.rc1.555.g9c883467ad-goog
>

Documentation fixes don't go to stable, so removed the cc to stable.

Applied to mailbox/for-next

Thanks
Jassi
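For illustration, the serialization rule documented above can be sketched as a client-side wrapper. This is a userspace model using pthreads, not kernel code: `demo_client`, `demo_blocking_send()`, `demo_send_serialized()` and `demo_run()` are hypothetical names, and `demo_blocking_send()` only stands in for a blocking `mbox_send_message()` call.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical client-side state; none of these names are kernel APIs. */
struct demo_client {
	pthread_mutex_t send_lock; /* serializes blocking sends on one channel */
	int in_flight;             /* 1 while a blocking send has not returned */
	int sent;                  /* number of sends that have returned */
};

/* Stand-in for a blocking mbox_send_message(): must never be re-entered. */
static int demo_blocking_send(struct demo_client *cl)
{
	assert(!cl->in_flight); /* documented rule: the previous call has returned */
	cl->in_flight = 1;
	cl->sent++;
	cl->in_flight = 0;
	return 0;
}

/* What the client's threads call instead of sending directly. */
static int demo_send_serialized(struct demo_client *cl)
{
	int ret;

	pthread_mutex_lock(&cl->send_lock);
	ret = demo_blocking_send(cl);
	pthread_mutex_unlock(&cl->send_lock);
	return ret;
}

static void *demo_worker(void *arg)
{
	for (int i = 0; i < 1000; i++)
		demo_send_serialized(arg);
	return NULL;
}

/* Runs four threads against one modelled channel; returns the total sends. */
static int demo_run(void)
{
	struct demo_client cl = { .in_flight = 0, .sent = 0 };
	pthread_t threads[4];
	int n;

	pthread_mutex_init(&cl.send_lock, NULL);
	for (n = 0; n < 4; n++)
		if (pthread_create(&threads[n], NULL, demo_worker, &cl) != 0)
			break;
	for (int i = 0; i < n; i++)
		pthread_join(threads[i], NULL);
	pthread_mutex_destroy(&cl.send_lock);
	return cl.sent;
}
```

In a kernel client the analogous lock would presumably be a `struct mutex` held around the blocking `mbox_send_message()` call, since a blocking send sleeps while waiting for completion; that choice is an assumption here and is not something the patch prescribes.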
When the mailbox controller fails to transmit a message, the error code
is passed only to the client's tx_done handler and not to
mbox_send_message() in blocking mode. As a result, the function can
return a false success. Resolve this by recording the tx status on the
channel and checking it before mbox_send_message() returns.
Cc: stable@vger.kernel.org
Signed-off-by: Joonwon Kang <joonwonkang@google.com>
---
v4: Detach it from the previous commit that supports multi-thread in
blocking mode and rebase it on the latest for-next branch.
v3: No major change since v1.
drivers/mailbox/mailbox.c | 6 +++++-
include/linux/mailbox_controller.h | 2 ++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index b00f7a32e866..066702e5a46f 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -98,8 +98,10 @@ static void tx_tick(struct mbox_chan *chan, int r)
if (chan->cl->tx_done)
chan->cl->tx_done(chan->cl, mssg, r);
- if (r != -ETIME && chan->cl->tx_block)
+ if (r != -ETIME && chan->cl->tx_block) {
+ chan->tx_status = r;
complete(&chan->tx_complete);
+ }
}
static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
@@ -295,6 +297,8 @@ int mbox_send_message(struct mbox_chan *chan, void *mssg)
if (ret == 0) {
t = -ETIME;
tx_tick(chan, t);
+ } else if (chan->tx_status < 0) {
+ t = chan->tx_status;
}
}
diff --git a/include/linux/mailbox_controller.h b/include/linux/mailbox_controller.h
index dc93287a2a01..26a238a6f941 100644
--- a/include/linux/mailbox_controller.h
+++ b/include/linux/mailbox_controller.h
@@ -120,6 +120,7 @@ struct mbox_controller {
* @txdone_method: Way to detect TXDone chosen by the API
* @cl: Pointer to the current owner of this channel
* @tx_complete: Transmission completion
+ * @tx_status: Transmission status
* @active_req: Currently active request hook
* @msg_count: No. of mssg currently queued
* @msg_free: Index of next available mssg slot
@@ -132,6 +133,7 @@ struct mbox_chan {
unsigned txdone_method;
struct mbox_client *cl;
struct completion tx_complete;
+ int tx_status;
void *active_req;
unsigned msg_count, msg_free;
void *msg_data[MBOX_TX_QUEUE_LEN];
--
2.54.0.rc1.555.g9c883467ad-goog
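The control flow of the change above can be modelled in userspace as follows. This is only a sketch under stated assumptions: `model_chan`, `model_tx_tick()` and `model_send_blocking()` are hypothetical stand-ins for `struct mbox_chan`, `tx_tick()` and the blocking tail of `mbox_send_message()`, and the `responded` flag replaces the real wait on `tx_complete`.

```c
#include <errno.h>

/* Hypothetical stand-in for the mbox_chan fields the patch touches. */
struct model_chan {
	int tx_block;  /* client requested blocking mode */
	int tx_status; /* last result recorded by model_tx_tick() */
	int completed; /* stands in for complete(&chan->tx_complete) */
};

/* Mirrors the patched tx_tick(): record the result, then signal completion. */
static void model_tx_tick(struct model_chan *chan, int r)
{
	if (r != -ETIME && chan->tx_block) {
		chan->tx_status = r;
		chan->completed = 1;
	}
}

/*
 * Mirrors the blocking tail of the patched mbox_send_message().
 * token:     value the submission step would have returned
 * responded: non-zero if the controller reports completion before timeout
 * result:    status the controller reports (0 or a negative errno)
 */
static int model_send_blocking(struct model_chan *chan, int token,
			       int responded, int result)
{
	int t = token;

	chan->completed = 0;
	if (responded)
		model_tx_tick(chan, result);

	if (!chan->completed) {
		/* Timeout path, unchanged by the patch. */
		t = -ETIME;
		model_tx_tick(chan, t);
	} else if (chan->tx_status < 0) {
		/* New path: surface the controller's error to the caller. */
		t = chan->tx_status;
	}
	return t;
}
```

With this model, a controller failure such as `-EIO` reaches the caller instead of being hidden behind the submission token, while the success and timeout paths behave as before.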
On Tue, Apr 21, 2026 at 10:46:52AM +0000, Joonwon Kang wrote:
> When the mailbox controller failed transmitting message, the error code
> was only passed to the client's tx done handler and not to
> mbox_send_message() in blocking mode. For this reason, the function could
> return a false success. This commit resolves the issue by introducing the
> tx status and checking it before mbox_send_message() returns.
>

`tx_complete` and `tx_status` are per-channel, not per-message. Although
`mbox_send_message()` can queue multiple messages, all blocking callers wait
on the same completion, so a completion is not associated with the thread or
message that triggered it.

This creates two issues:

1. Concurrent blocking senders can consume each other’s completions. When
   message A completes, `tx_tick()` may submit message B, then set
   `chan->tx_status` and complete the shared completion. Any waiter may wake,
   including B’s sender, which can return while B is still in flight. This
   happens even without this change, but after this change the return value
   may also be wrong.

2. `tx_status` can be stale or overwritten. Since it is a single channel field
   written just before `complete()`, a second (possibly fast) `tx_tick()` can
   update it before the first awakened sender reads it. Because `msg_submit()`
   happens before status publication, the next message can complete before the
   previous status is observed if the controller re-enters `tx_tick()` for the
   same channel.

We need to see if there are other issues that need fixing before you can
propagate the tx error code. Let me know if I am missing something.

--
Regards,
Sudeep
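The second issue in the review above can be replayed deterministically in a small userspace model. This is a sketch, not kernel code: `shared_chan`, `shared_tx_tick()`, `shared_wait_and_read()` and `replay_stale_status()` are hypothetical names, and a plain counter stands in for the shared `struct completion`.

```c
#include <errno.h>

/* Hypothetical model: one status slot per channel, as in the patch. */
struct shared_chan {
	int tx_status;           /* single field shared by all senders */
	int pending_completions; /* stands in for the shared completion */
};

/* Controller side: publish the result, then signal the shared completion. */
static void shared_tx_tick(struct shared_chan *chan, int r)
{
	chan->tx_status = r;
	chan->pending_completions++;
}

/* Sender side: consume one completion and read whatever status is there now. */
static int shared_wait_and_read(struct shared_chan *chan)
{
	chan->pending_completions--;
	return chan->tx_status;
}

/*
 * Replays the interleaving described for unserialized senders:
 * message A fails, message B then succeeds, and only afterwards does
 * A's sender wake up and read the channel-wide status.
 */
static int replay_stale_status(void)
{
	struct shared_chan chan = { 0, 0 };

	shared_tx_tick(&chan, -EIO);        /* A fails */
	shared_tx_tick(&chan, 0);           /* B completes before A's sender runs */
	return shared_wait_and_read(&chan); /* A's sender reads B's status */
}
```

In this model A's sender observes 0 even though A failed with `-EIO`. The interleaving requires a second message to be in flight on the channel before the first sender has returned, which is exactly what a client avoids when it serializes its blocking sends.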
Hi Sudeep, I appreciate your review! And I apologize that I missed some
important context about this patch.

> On Tue, Apr 21, 2026 at 10:46:52AM +0000, Joonwon Kang wrote:
> > When the mailbox controller failed transmitting message, the error code
> > was only passed to the client's tx done handler and not to
> > mbox_send_message() in blocking mode. For this reason, the function could
> > return a false success. This commit resolves the issue by introducing the
> > tx status and checking it before mbox_send_message() returns.
> >
> `tx_complete` and `tx_status` are per-channel, not per-message. Although
> `mbox_send_message()` can queue multiple messages, all blocking callers wait
> on the same completion, so a completion is not associated with the thread or
> message that triggered it.
>
> This creates two issues:
>
> 1. Concurrent blocking senders can consume each other’s completions. When
>    message A completes, `tx_tick()` may submit message B, then set
>    `chan->tx_status` and complete the shared completion. Any waiter may wake,
>    including B’s sender, which can return while B is still in flight. It
>    happens even w/o this change but with possibly wrong return value after
>    this change.
>
> 2. `tx_status` can be stale or overwritten. Since it is a single channel field
>    written just before `complete()`, a second(possibly fast) `tx_tick()` can
>    update it before the first awakened sender reads it. Because `msg_submit()`
>    happens before status publication, the next message can complete before the
>    previous status is observed if the controller re-enters `tx_tick()` for the
>    same channel.
>
> We need to see if there are other issue that needs fixing before you can
> propagate the tx error code. Let me know if I am missing something.

Yes, the current mbox_send_message() in blocking mode does not support
multiple threads. I have tried adding multi-thread support [1] since the
first patchset and adding this patch on top of it [2], but the author was
not convinced about the necessity of multi-thread support and instead
preferred that clients, rather than the mailbox APIs, serialize the multiple
threads' access to the channel [3].

For this reason, I went with the author's preference [4] and clarified in
the API doc that multiple threads are not supported [5], so that clients are
clearly aware of it and serialize their threads' access to the channel.

So, this patch is based on the assumption that such multi-thread
protection is already provided by the clients, i.e. mbox_send_message() in
blocking mode is called on the same channel only when the previous call has
returned.

What is your opinion on this? Should we support multiple threads in the
mailbox APIs [1]? Or should we go with the current decision [5]? I
personally have been thinking the former is the way to go.

[1] https://lore.kernel.org/all/20260402170641.2082547-1-joonwonkang@google.com
[2] https://lore.kernel.org/all/20260402170641.2082547-3-joonwonkang@google.com/
[3] https://lore.kernel.org/all/CABb+yY0uDQh-3cadPQONV=NJKjMtc4mJekgjmHYVaHnfHXvGZQ@mail.gmail.com/
[4] https://lore.kernel.org/all/20260404124428.3077670-1-joonwonkang@google.com/
[5] https://lore.kernel.org/all/20260421104652.211276-1-joonwonkang@google.com/

Thanks,
Joonwon Kang
On Thu, May 07, 2026 at 02:47:32PM +0000, Joonwon Kang wrote:
> Hi Sudeep, I appreciate your review! And I apologize that I missed some
> important context about this patch.
>
> > On Tue, Apr 21, 2026 at 10:46:52AM +0000, Joonwon Kang wrote:
> > > When the mailbox controller failed transmitting message, the error code
> > > was only passed to the client's tx done handler and not to
> > > mbox_send_message() in blocking mode. For this reason, the function could
> > > return a false success. This commit resolves the issue by introducing the
> > > tx status and checking it before mbox_send_message() returns.
> > >
> > `tx_complete` and `tx_status` are per-channel, not per-message. Although
> > `mbox_send_message()` can queue multiple messages, all blocking callers wait
> > on the same completion, so a completion is not associated with the thread or
> > message that triggered it.
> >
> > This creates two issues:
> >
> > 1. Concurrent blocking senders can consume each other’s completions. When
> >    message A completes, `tx_tick()` may submit message B, then set
> >    `chan->tx_status` and complete the shared completion. Any waiter may wake,
> >    including B’s sender, which can return while B is still in flight. It
> >    happens even w/o this change but with possibly wrong return value after
> >    this change.
> >
> > 2. `tx_status` can be stale or overwritten. Since it is a single channel field
> >    written just before `complete()`, a second(possibly fast) `tx_tick()` can
> >    update it before the first awakened sender reads it. Because `msg_submit()`
> >    happens before status publication, the next message can complete before the
> >    previous status is observed if the controller re-enters `tx_tick()` for the
> >    same channel.
> >
> > We need to see if there are other issue that needs fixing before you can
> > propagate the tx error code. Let me know if I am missing something.
>
> Yes, the current mbox_send_message() in blocking mode does not support
> multi-threads. I have tried adding the multi-threads support [1] since the
> first patchset and adding this patch on top of it [2], but the author was
> not convinced about the necessity of the multi-threads support and instead
> preferred that clients, instead of the mailbox APIs, serialize the multiple
> threads' access to the channel [3].
>
> For this reason, I went with the author's preference [4] and clarified that
> multi-threads is not supported in the API doc [5] so that clients can be
> clearly aware of it and serialize its threads' access to the channel.
>
> So, this patch is based on the assumption that such multi-threads
> protection is given by the clients already, i.e. mbox_send_message() in
> blocking mode is called on the same channel only when the previous call has
> returned.
>

Fair enough! Add a reminder note in the commit message that multi-threading
is not supported and hence the proposed solution works. With that, you can
add:

Reviewed-by: Sudeep Holla <sudeep.holla@kernel.org>

--
Regards,
Sudeep
> When the mailbox controller failed transmitting message, the error code
> was only passed to the client's tx done handler and not to
> mbox_send_message() in blocking mode. For this reason, the function could
> return a false success. This commit resolves the issue by introducing the
> tx status and checking it before mbox_send_message() returns.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Joonwon Kang <joonwonkang@google.com>

Hi reviewers,

Could you help review this patch? This effort has been open since June 2025,
so if you are not available, it would be appreciated if you could suggest
other reviewers who can help.

Thanks,
Joonwon Kang