The current implementation of complete_all_mbos() repeatedly acquires
and releases the spinlock in a loop. This causes lock contention.

This patch refactors the function to use list_replace_init(), moving all
entries to a local list. This removes the loop-based locking approach
and significantly reduces lock contention.
Signed-off-by: Minu Jin <s9430939@naver.com>
---
drivers/staging/most/dim2/dim2.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/drivers/staging/most/dim2/dim2.c b/drivers/staging/most/dim2/dim2.c
index 80af965356d0..fb54bb1a803f 100644
--- a/drivers/staging/most/dim2/dim2.c
+++ b/drivers/staging/most/dim2/dim2.c
@@ -415,20 +415,16 @@ static irqreturn_t dim2_ahb_isr(int irq, void *_dev)
  */
 static void complete_all_mbos(struct list_head *head)
 {
+	struct mbo *mbo, *mbo_tmp;
 	unsigned long flags;
-	struct mbo *mbo;
-
-	for (;;) {
-		spin_lock_irqsave(&dim_lock, flags);
-		if (list_empty(head)) {
-			spin_unlock_irqrestore(&dim_lock, flags);
-			break;
-		}
+	LIST_HEAD(del_list);
 
-		mbo = list_first_entry(head, struct mbo, list);
-		list_del(head->next);
-		spin_unlock_irqrestore(&dim_lock, flags);
+	spin_lock_irqsave(&dim_lock, flags);
+	list_replace_init(head, &del_list);
+	spin_unlock_irqrestore(&dim_lock, flags);
 
+	list_for_each_entry_safe(mbo, mbo_tmp, &del_list, list) {
+		list_del(&mbo->list);
 		mbo->processed_length = 0;
 		mbo->status = MBO_E_CLOSE;
 		mbo->complete(mbo);
--
2.43.0
On Fri, Feb 06, 2026 at 01:02:31AM +0900, Minu Jin wrote:
> The current implementation of complete_all_mbos() repeatedly acquires
> and releases the spinlock in a loop. This causes lock contention.
>
> This patch refactors the function to use list_replace_init(), moving all
> entries to a local list. This removes the loop-based locking approach
> and significantly reduces lock contention.
>
> Signed-off-by: Minu Jin <s9430939@naver.com>

The subject talks about race conditions but the commit message talks
about reducing lock contention. It does obviously reduce lock contention
(although I don't think anyone has benchmarked it to see if it matters),
but does it prevent a race condition? Let's review:

This complete_all_mbos() function is called when we do a
most_stop_channel() and we ->poison_channel().

The list heads are &hdm_ch->started_list and &hdm_ch->pending_list. I
feel like if we add something to the list while we are also freeing
items from the list then we are toasted. In service_done_flag(), we
delete items from the list, but deleting items is fine in this context.

We add things to the ->pending_list in enqueue() and
service_done_flag(). We move things from the ->pending_list to the
->started_list in try_start_dim_transfer(). So if any of those three
functions can run at the same time as complete_all_mbos(), we are in
trouble.

The hdm_enqueue_thread() function calls enqueue() until
kthread_should_stop(). The most_stop_channel() function calls
kthread_stop(c->hdm_enqueue_task) before doing the ->poison_channel(),
so that's fine.

The service_done_flag() and try_start_dim_transfer() functions are
called from dim2_task_irq(). When do we stop taking interrupts? To be
honest, I don't know. I thought we had to call disable_irq()?

So that's the question: when do we disable IRQs in this driver? I
would have assumed it was in most_stop_channel(), but I can't see it;
then again, I'm also not very familiar with this code.

Let's answer this question and then either add a Fixes tag or say that
there doesn't appear to be a race condition.

regards,
dan carpenter
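To make the window concrete, here is the pre-patch loop reconstructed
from the removed lines in the hunk above. The comments marking where an
interrupt could slip in are illustrative annotations added for this
discussion, not part of the original driver, and they only apply if
interrupts are in fact still enabled at this point, which is the open
question raised above.

static void complete_all_mbos(struct list_head *head)
{
	unsigned long flags;
	struct mbo *mbo;

	for (;;) {
		spin_lock_irqsave(&dim_lock, flags);
		if (list_empty(head)) {
			spin_unlock_irqrestore(&dim_lock, flags);
			break;
		}

		mbo = list_first_entry(head, struct mbo, list);
		list_del(head->next);
		spin_unlock_irqrestore(&dim_lock, flags);

		/*
		 * Window: dim_lock is not held here.  If the DIM2
		 * interrupt fires now, dim2_task_irq() can still reach
		 * enqueue()/service_done_flag()/try_start_dim_transfer()
		 * and add to or move entries on ->pending_list and
		 * ->started_list while this loop is part-way through
		 * draining them.
		 */
		mbo->processed_length = 0;
		mbo->status = MBO_E_CLOSE;
		mbo->complete(mbo);
	}
}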
On Fri, Feb 06, 2026 at 10:20:51AM +0300, Dan Carpenter wrote:
> On Fri, Feb 06, 2026 at 01:02:31AM +0900, Minu Jin wrote:
> > The current implementation of complete_all_mbos() repeatedly acquires
> > and releases the spinlock in a loop. This causes lock contention.
> >
> > This patch refactors the function to use list_replace_init(), moving all
> > entries to a local list. This removes the loop-based locking approach
> > and significantly reduces lock contention.
> >
> > Signed-off-by: Minu Jin <s9430939@naver.com>
>
> This complete_all_mbos() function is called when we do a
> most_stop_channel() and we ->poison_channel().
>
> The list heads are &hdm_ch->started_list and &hdm_ch->pending_list. I
> feel like if we add something to the list while we are also freeing
> items from the list then we are toasted. In service_done_flag(), we
> delete items from the list, but deleting items is fine in this context.
>
> We add things to the ->pending_list in enqueue() and
> service_done_flag(). We move things from the ->pending_list to the
> ->started_list in try_start_dim_transfer(). So if any of those three
> functions can run at the same time as complete_all_mbos(), we are in
> trouble.
>
> The hdm_enqueue_thread() function calls enqueue() until
> kthread_should_stop(). The most_stop_channel() function calls
> kthread_stop(c->hdm_enqueue_task) before doing the ->poison_channel(),
> so that's fine.
>
> The service_done_flag() and try_start_dim_transfer() functions are
> called from dim2_task_irq(). When do we stop taking interrupts? To be
> honest, I don't know. I thought we had to call disable_irq()?
>
> So that's the question: when do we disable IRQs in this driver? I
> would have assumed it was in most_stop_channel(), but I can't see it;
> then again, I'm also not very familiar with this code.
>
> Let's answer this question and then either add a Fixes tag or say that
> there doesn't appear to be a race condition.
>
> regards,
> dan carpenter
>

Hi Dan,

Thank you for taking the time to do such a detailed review and analysis.

To be honest, my original intention was to reduce lock contention by
optimizing the repeated lock/unlock in the loop from O(n) lock
acquisitions down to O(1). I wanted to minimize the overhead of
acquiring the spinlock multiple times.

However, after reading your feedback, I traced the code paths you
pointed out again. I confirmed that IRQs are not disabled along the
call path:

	most_stop_channel() -> poison_channel() -> complete_all_mbos()

In the original code, the brief period where the lock is released
inside the loop creates a window where an interrupt (e.g.,
dim2_task_irq()) could trigger and modify the list, leading to a race
condition.

Although it wasn't my original intent, I think this patch could also
solve this race condition. By moving the list items to a local list
under a single lock, it provides the necessary isolation from
interrupts.

Does this reasoning make sense to you, or is there something I am
missing? I would appreciate your opinion before I update the commit
message and send a v2.

Minu Jin
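For reference, this is the post-patch function assembled from the hunk
above, with comments describing the isolation Minu is referring to. The
comments are annotations added here, not part of the patch itself, and
they do not settle whether the IRQ path can actually run concurrently
at this point; that remains the open question in the thread.

static void complete_all_mbos(struct list_head *head)
{
	struct mbo *mbo, *mbo_tmp;
	unsigned long flags;
	LIST_HEAD(del_list);

	/*
	 * Detach every entry in one shot: after list_replace_init() the
	 * shared head is empty and del_list is private to this function,
	 * so a concurrent dim2_task_irq() can no longer see or modify
	 * the entries being completed below.
	 */
	spin_lock_irqsave(&dim_lock, flags);
	list_replace_init(head, &del_list);
	spin_unlock_irqrestore(&dim_lock, flags);

	/*
	 * Completion now walks the private list without holding the
	 * lock.  Note that an interrupt could still queue new entries
	 * onto the (now empty) shared head afterwards; whether that can
	 * happen during channel teardown is exactly the question above.
	 */
	list_for_each_entry_safe(mbo, mbo_tmp, &del_list, list) {
		list_del(&mbo->list);
		mbo->processed_length = 0;
		mbo->status = MBO_E_CLOSE;
		mbo->complete(mbo);
	}
}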