[PATCH] iommu/amd: move wait_on_sem() out of spinlock

Ankit Soni posted 1 patch 2 months, 1 week ago
With iommu.strict=1, the existing completion wait path can cause soft
lockups in a stressed environment, as wait_on_sem() busy-waits under the
spinlock with interrupts disabled.

Move the completion wait in iommu_completion_wait() out of the spinlock.
wait_on_sem() only polls the hardware-updated cmd_sem and does not require
iommu->lock, so holding the lock during the busy wait unnecessarily
increases contention and extends the time with interrupts disabled.

Signed-off-by: Ankit Soni <Ankit.Soni@amd.com>
---
 drivers/iommu/amd/iommu.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2e1865daa1ce..3ef188b39bf8 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1161,7 +1161,12 @@ static int wait_on_sem(struct amd_iommu *iommu, u64 data)
 {
 	int i = 0;
 
-	while (*iommu->cmd_sem != data && i < LOOP_TIMEOUT) {
+	/*
+	 * cmd_sem holds a monotonically non-decreasing completion sequence
+	 * number.
+	 */
+	while ((__s64)(READ_ONCE(*iommu->cmd_sem) - data) < 0 &&
+	       i < LOOP_TIMEOUT) {
 		udelay(1);
 		i += 1;
 	}
@@ -1406,14 +1411,13 @@ static int iommu_completion_wait(struct amd_iommu *iommu)
 	raw_spin_lock_irqsave(&iommu->lock, flags);
 
 	ret = __iommu_queue_command_sync(iommu, &cmd, false);
+	raw_spin_unlock_irqrestore(&iommu->lock, flags);
+
 	if (ret)
-		goto out_unlock;
+		return ret;
 
 	ret = wait_on_sem(iommu, data);
 
-out_unlock:
-	raw_spin_unlock_irqrestore(&iommu->lock, flags);
-
 	return ret;
 }
 
@@ -3094,13 +3098,18 @@ static void iommu_flush_irt_and_complete(struct amd_iommu *iommu, u16 devid)
 	raw_spin_lock_irqsave(&iommu->lock, flags);
 	ret = __iommu_queue_command_sync(iommu, &cmd, true);
 	if (ret)
-		goto out;
+		goto out_err;
 	ret = __iommu_queue_command_sync(iommu, &cmd2, false);
 	if (ret)
-		goto out;
+		goto out_err;
+	raw_spin_unlock_irqrestore(&iommu->lock, flags);
+
 	wait_on_sem(iommu, data);
-out:
+	return;
+
+out_err:
 	raw_spin_unlock_irqrestore(&iommu->lock, flags);
+	return;
 }
 
 static inline u8 iommu_get_int_tablen(struct iommu_dev_data *dev_data)
-- 
2.43.0
Re: [PATCH] iommu/amd: move wait_on_sem() out of spinlock
Posted by Markus Elfring 4 weeks ago
…
> +++ b/drivers/iommu/amd/iommu.c
…
> @@ -3094,13 +3098,18 @@ static void iommu_flush_irt_and_complete(struct amd_iommu *iommu, u16 devid)
…
> +out_err:
>  	raw_spin_unlock_irqrestore(&iommu->lock, flags);
> +	return;
>  }
…

What do you think about omitting the return statement at the end of the
implementation of such a “void” function?

See also:
https://elixir.bootlin.com/linux/v6.19-rc4/source/scripts/checkpatch.pl#L5612-L5622

Regards,
Markus
Re: [PATCH] iommu/amd: move wait_on_sem() out of spinlock
Posted by Jörg Rödel 4 weeks ago
On Mon, Dec 01, 2025 at 02:39:40PM +0000, Ankit Soni wrote:
> With iommu.strict=1, the existing completion wait path can cause soft
> lockups in a stressed environment, as wait_on_sem() busy-waits under the
> spinlock with interrupts disabled.
> 
> Move the completion wait in iommu_completion_wait() out of the spinlock.
> wait_on_sem() only polls the hardware-updated cmd_sem and does not require
> iommu->lock, so holding the lock during the busy wait unnecessarily
> increases contention and extends the time with interrupts disabled.
> 
> Signed-off-by: Ankit Soni <Ankit.Soni@amd.com>
> ---
>  drivers/iommu/amd/iommu.c | 25 +++++++++++++++++--------
>  1 file changed, 17 insertions(+), 8 deletions(-)

Applied, thanks.
Re: [PATCH] iommu/amd: move wait_on_sem() out of spinlock
Posted by Vasant Hegde 2 months, 1 week ago

On 12/1/2025 8:09 PM, Ankit Soni wrote:
> With iommu.strict=1, the existing completion wait path can cause soft
> lockups in a stressed environment, as wait_on_sem() busy-waits under the
> spinlock with interrupts disabled.
> 
> Move the completion wait in iommu_completion_wait() out of the spinlock.
> wait_on_sem() only polls the hardware-updated cmd_sem and does not require
> iommu->lock, so holding the lock during the busy wait unnecessarily
> increases contention and extends the time with interrupts disabled.
> 
> Signed-off-by: Ankit Soni <Ankit.Soni@amd.com>

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>

-Vasant