[PATCH] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter

Waiman Long posted 1 patch 4 years ago
There is a newer version of this series
kernel/locking/rwsem.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
[PATCH] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
Posted by Waiman Long 4 years ago
With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
consistent"), the writer that sets the handoff bit can be interrupted
out without clearing the bit if the wait queue isn't empty. This disables
reader and writer optimistic lock spinning and stealing.

Now if a non-first writer in the queue is somehow woken up or first
entering the waiting loop, it can't acquire the lock.  This is not
the case before that commit as the writer that set the handoff bit
will clear it when exiting out via the out_nolock path. This is less
efficient as the busy rwsem stays in an unlock state for a longer time.

This patch allows a non-first writer to ignore the handoff bit if it
is not originally set or initiated by the first waiter.

Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/rwsem.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 9d1db4a54d34..65f0262f635e 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -335,8 +335,6 @@ struct rwsem_waiter {
 	struct task_struct *task;
 	enum rwsem_waiter_type type;
 	unsigned long timeout;
-
-	/* Writer only, not initialized in reader */
 	bool handoff_set;
 };
 #define rwsem_first_waiter(sem) \
@@ -459,10 +457,12 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
 			 * to give up the lock), request a HANDOFF to
 			 * force the issue.
 			 */
-			if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
-			    time_after(jiffies, waiter->timeout)) {
-				adjustment -= RWSEM_FLAG_HANDOFF;
-				lockevent_inc(rwsem_rlock_handoff);
+			if (time_after(jiffies, waiter->timeout)) {
+				if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
+					adjustment -= RWSEM_FLAG_HANDOFF;
+					lockevent_inc(rwsem_rlock_handoff);
+				}
+				waiter->handoff_set = true;
 			}
 
 			atomic_long_add(-adjustment, &sem->count);
@@ -599,7 +599,7 @@ rwsem_del_wake_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter,
 static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 					struct rwsem_waiter *waiter)
 {
-	bool first = rwsem_first_waiter(sem) == waiter;
+	struct rwsem_waiter *first = rwsem_first_waiter(sem);
 	long count, new;
 
 	lockdep_assert_held(&sem->wait_lock);
@@ -609,11 +609,20 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
 
 		if (has_handoff) {
-			if (!first)
+			/*
+			 * Honor handoff bit and yield only when the first
+			 * waiter is the one that set it. Otherwisee, we
+			 * still try to acquire the rwsem.
+			 */
+			if (first->handoff_set && (waiter != first))
 				return false;
 
-			/* First waiter inherits a previously set handoff bit */
-			waiter->handoff_set = true;
+			/*
+			 * First waiter can inherit a previously set handoff
+			 * bit and spin on rwsem if lock acquisition fails.
+			 */
+			if (waiter == first)
+				waiter->handoff_set = true;
 		}
 
 		new = count;
@@ -1027,6 +1036,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
 	waiter.task = current;
 	waiter.type = RWSEM_WAITING_FOR_READ;
 	waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
+	waiter.handoff_set = false;
 
 	raw_spin_lock_irq(&sem->wait_lock);
 	if (list_empty(&sem->wait_list)) {
-- 
2.27.0
Re: [PATCH] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
Posted by John Donnelly 4 years ago
On 4/27/22 12:31 PM, Waiman Long wrote:
> With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
> consistent"), the writer that sets the handoff bit can be interrupted
> out without clearing the bit if the wait queue isn't empty. This disables
> reader and writer optimistic lock spinning and stealing.
> 
> Now if a non-first writer in the queue is somehow woken up or first
> entering the waiting loop, it can't acquire the lock.  This is not
> the case before that commit as the writer that set the handoff bit
> will clear it when exiting out via the out_nolock path. This is less
> efficient as the busy rwsem stays in an unlock state for a longer time.
> 
> This patch allows a non-first writer to ignore the handoff bit if it
> is not originally set or initiated by the first waiter.
> 
> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")

Hi Waiman,

1. You likely need :

Cc: <stable@vger.kernel.org>

Since d257cc8cb8d5 has been migrated to other LTS threads (5.15.y for 
sure) .

2. I had posted this earlier:

[PATCH 5.15 1/1] Revert "locking/rwsem: Make handoff bit handling more 
consistent"

That is likely not needed now.


3. Please add :

Reported-by: Jorge Lopez <jorge.jo.lopez@oracle.com>

We can likely test this, but I can't give a ETA when that will happen.


> Signed-off-by: Waiman Long <longman@redhat.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>
> ---
>   kernel/locking/rwsem.c | 30 ++++++++++++++++++++----------
>   1 file changed, 20 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 9d1db4a54d34..65f0262f635e 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -335,8 +335,6 @@ struct rwsem_waiter {
>   	struct task_struct *task;
>   	enum rwsem_waiter_type type;
>   	unsigned long timeout;
> -
> -	/* Writer only, not initialized in reader */
>   	bool handoff_set;
>   };
>   #define rwsem_first_waiter(sem) \
> @@ -459,10 +457,12 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
>   			 * to give up the lock), request a HANDOFF to
>   			 * force the issue.
>   			 */
> -			if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
> -			    time_after(jiffies, waiter->timeout)) {
> -				adjustment -= RWSEM_FLAG_HANDOFF;
> -				lockevent_inc(rwsem_rlock_handoff);
> +			if (time_after(jiffies, waiter->timeout)) {
> +				if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
> +					adjustment -= RWSEM_FLAG_HANDOFF;
> +					lockevent_inc(rwsem_rlock_handoff);
> +				}
> +				waiter->handoff_set = true;
>   			}
>   
>   			atomic_long_add(-adjustment, &sem->count);
> @@ -599,7 +599,7 @@ rwsem_del_wake_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter,
>   static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
>   					struct rwsem_waiter *waiter)
>   {
> -	bool first = rwsem_first_waiter(sem) == waiter;
> +	struct rwsem_waiter *first = rwsem_first_waiter(sem);
>   	long count, new;
>   
>   	lockdep_assert_held(&sem->wait_lock);
> @@ -609,11 +609,20 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
>   		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
>   
>   		if (has_handoff) {
> -			if (!first)
> +			/*
> +			 * Honor handoff bit and yield only when the first
> +			 * waiter is the one that set it. Otherwisee, we
> +			 * still try to acquire the rwsem.
> +			 */
> +			if (first->handoff_set && (waiter != first))
>   				return false;
>   
> -			/* First waiter inherits a previously set handoff bit */
> -			waiter->handoff_set = true;
> +			/*
> +			 * First waiter can inherit a previously set handoff
> +			 * bit and spin on rwsem if lock acquisition fails.
> +			 */
> +			if (waiter == first)
> +				waiter->handoff_set = true;
>   		}
>   
>   		new = count;
> @@ -1027,6 +1036,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
>   	waiter.task = current;
>   	waiter.type = RWSEM_WAITING_FOR_READ;
>   	waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
> +	waiter.handoff_set = false;
>   
>   	raw_spin_lock_irq(&sem->wait_lock);
>   	if (list_empty(&sem->wait_list)) {
Re: [PATCH] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
Posted by Waiman Long 4 years ago
On 4/27/22 19:16, John Donnelly wrote:
> On 4/27/22 12:31 PM, Waiman Long wrote:
>> With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
>> consistent"), the writer that sets the handoff bit can be interrupted
>> out without clearing the bit if the wait queue isn't empty. This 
>> disables
>> reader and writer optimistic lock spinning and stealing.
>>
>> Now if a non-first writer in the queue is somehow woken up or first
>> entering the waiting loop, it can't acquire the lock.  This is not
>> the case before that commit as the writer that set the handoff bit
>> will clear it when exiting out via the out_nolock path. This is less
>> efficient as the busy rwsem stays in an unlock state for a longer time.
>>
>> This patch allows a non-first writer to ignore the handoff bit if it
>> is not originally set or initiated by the first waiter.
>>
>> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more 
>> consistent")
>
> Hi Waiman,
>
> 1. You likely need :
>
> Cc: <stable@vger.kernel.org>
>
> Since d257cc8cb8d5 has been migrated to other LTS threads (5.15.y for 
> sure) .

Since the commit has a fixes tag, the stable will automatically pick it up.

>
> 2. I had posted this earlier:
>
> [PATCH 5.15 1/1] Revert "locking/rwsem: Make handoff bit handling more 
> consistent"
>
> That is likely not needed now.
Right. The patch was created to fix a problem reported by another user. 
So reverting it may not the right move.
>
>
> 3. Please add :
>
> Reported-by: Jorge Lopez <jorge.jo.lopez@oracle.com>
>
> We can likely test this, but I can't give a ETA when that will happen.
>
>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>
> Acked-by: John Donnelly <john.p.donnelly@oracle.com>

Will add the that when I send out the next version.

Cheers,
Longman

Re: [PATCH] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
Posted by John Donnelly 4 years ago
On 4/27/22 18:56, Waiman Long wrote:
> On 4/27/22 19:16, John Donnelly wrote:
>> On 4/27/22 12:31 PM, Waiman Long wrote:
>>> With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
>>> consistent"), the writer that sets the handoff bit can be interrupted
>>> out without clearing the bit if the wait queue isn't empty. This 
>>> disables
>>> reader and writer optimistic lock spinning and stealing.
>>>
>>> Now if a non-first writer in the queue is somehow woken up or first
>>> entering the waiting loop, it can't acquire the lock.  This is not
>>> the case before that commit as the writer that set the handoff bit
>>> will clear it when exiting out via the out_nolock path. This is less
>>> efficient as the busy rwsem stays in an unlock state for a longer time.
>>>
>>> This patch allows a non-first writer to ignore the handoff bit if it
>>> is not originally set or initiated by the first waiter.
>>>
>>> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more 
>>> consistent")
>>
>> Hi Waiman,
>>
>> 1. You likely need :
>>
>> Cc: <stable@vger.kernel.org>
>>
>> Since d257cc8cb8d5 has been migrated to other LTS threads (5.15.y for 
>> sure) .
> 
> Since the commit has a fixes tag, the stable will automatically pick it up.
> 
>>
>> 2. I had posted this earlier:
>>
>> [PATCH 5.15 1/1] Revert "locking/rwsem: Make handoff bit handling more 
>> consistent"
>>
>> That is likely not needed now.
> Right. The patch was created to fix a problem reported by another user. 
> So reverting it may not the right move.



Thanks for the quick follow up.


>>
>>
>> 3. Please add :
>>
>> Reported-by: Jorge Lopez <jorge.jo.lopez@oracle.com>
>>
>> We can likely test this, but I can't give a ETA when that will happen.
>>
>>
>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>
>> Acked-by: John Donnelly <john.p.donnelly@oracle.com>
> 
> Will add the that when I send out the next version.
> 
> Cheers,
> Longman
>