From: Lance Yang <lance.yang@linux.dev>
When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a
reader-owned rwsem can lead to false positives in blocker tracking.
To mitigate this, let’s try to clear the owner field on unlock, as a NULL
owner is better than a stale one for diagnostics.
Signed-off-by: Lance Yang <lance.yang@linux.dev>
---
kernel/locking/rwsem.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 6cb29442d4fc..a310eb9896de 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem)
return false;
return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
}
-#endif
-#ifdef CONFIG_DEBUG_RWSEMS
/*
- * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
- * is a task pointer in owner of a reader-owned rwsem, it will be the
- * real owner or one of the real owners. The only exception is when the
- * unlock is done by up_read_non_owner().
+ * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured,
+ * it will make sure that the owner field of a reader-owned rwsem either
+ * points to a real reader-owner(s) or gets cleared. The only exception is
+ * when the unlock is done by up_read_non_owner().
*/
static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
{
--
2.49.0
On Thu, 12 Jun 2025 12:19:25 +0800 Lance Yang <ioworker0@gmail.com> wrote: > From: Lance Yang <lance.yang@linux.dev> > > When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a > reader-owned rwsem can lead to false positives in blocker tracking. > > To mitigate this, let’s try to clear the owner field on unlock, as a NULL > owner is better than a stale one for diagnostics. Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS. Thanks, > > Signed-off-by: Lance Yang <lance.yang@linux.dev> > --- > kernel/locking/rwsem.c | 10 ++++------ > 1 file changed, 4 insertions(+), 6 deletions(-) > > diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c > index 6cb29442d4fc..a310eb9896de 100644 > --- a/kernel/locking/rwsem.c > +++ b/kernel/locking/rwsem.c > @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem) > return false; > return rwsem_test_oflags(sem, RWSEM_READER_OWNED); > } > -#endif > > -#ifdef CONFIG_DEBUG_RWSEMS > /* > - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there > - * is a task pointer in owner of a reader-owned rwsem, it will be the > - * real owner or one of the real owners. The only exception is when the > - * unlock is done by up_read_non_owner(). > + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured, > + * it will make sure that the owner field of a reader-owned rwsem either > + * points to a real reader-owner(s) or gets cleared. The only exception is > + * when the unlock is done by up_read_non_owner(). > */ > static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem) > { > -- > 2.49.0 > -- Masami Hiramatsu (Google) <mhiramat@kernel.org>
On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote: > On Thu, 12 Jun 2025 12:19:25 +0800 > Lance Yang <ioworker0@gmail.com> wrote: > >> From: Lance Yang <lance.yang@linux.dev> >> >> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a >> reader-owned rwsem can lead to false positives in blocker tracking. >> >> To mitigate this, let’s try to clear the owner field on unlock, as a NULL >> owner is better than a stale one for diagnostics. > > Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and > remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS. Thanks for the feedback! I see your point about the dependency ;) Personlly, I'd perfer to keep them separate. The reasoning is that they addreess two distinct things, and I think splitting them makes this series clearer and easier to review ;) Patch #1 focuses on "ownership tracking": Its only job is to make the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned()) globally available when blocker tracking is enabled. Patch #2, on the other hand, is about "reader-owner cleanup": It introduces a functional change to the unlock path, trying to clear the owner field for reader-owned rwsems. Does this reasoning make sense to you? Thanks, Lance > > Thanks, > >> >> Signed-off-by: Lance Yang <lance.yang@linux.dev> >> --- >> kernel/locking/rwsem.c | 10 ++++------ >> 1 file changed, 4 insertions(+), 6 deletions(-) >> >> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c >> index 6cb29442d4fc..a310eb9896de 100644 >> --- a/kernel/locking/rwsem.c >> +++ b/kernel/locking/rwsem.c >> @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem) >> return false; >> return rwsem_test_oflags(sem, RWSEM_READER_OWNED); >> } >> -#endif >> >> -#ifdef CONFIG_DEBUG_RWSEMS >> /* >> - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there >> - * is a task pointer in owner of a reader-owned rwsem, it will be the >> - * real owner or one of the real owners. The only exception is when the >> - * unlock is done by up_read_non_owner(). >> + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured, >> + * it will make sure that the owner field of a reader-owned rwsem either >> + * points to a real reader-owner(s) or gets cleared. The only exception is >> + * when the unlock is done by up_read_non_owner(). >> */ >> static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem) >> { >> -- >> 2.49.0 >> > >
On Tue, 24 Jun 2025 09:44:55 +0800 Lance Yang <lance.yang@linux.dev> wrote: > > > On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote: > > On Thu, 12 Jun 2025 12:19:25 +0800 > > Lance Yang <ioworker0@gmail.com> wrote: > > > >> From: Lance Yang <lance.yang@linux.dev> > >> > >> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a > >> reader-owned rwsem can lead to false positives in blocker tracking. > >> > >> To mitigate this, let’s try to clear the owner field on unlock, as a NULL > >> owner is better than a stale one for diagnostics. > > > > Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and > > remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS. > > Thanks for the feedback! I see your point about the dependency ;) > > Personlly, I'd perfer to keep them separate. The reasoning is that > they addreess two distinct things, and I think splitting them makes > this series clearer and easier to review ;) > > Patch #1 focuses on "ownership tracking": Its only job is to make > the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned()) > globally available when blocker tracking is enabled. > > Patch #2, on the other hand, is about "reader-owner cleanup": It > introduces a functional change to the unlock path, trying to clear > the owner field for reader-owned rwsems. But without clearing the owner, the owner information can be broken, right? Since CONFIG_DEBUG_RWSEMS is working as it is, I think those cannot be decoupled. For example, comparing the result of both DETECT_HUNG_TASK_BLOCKER and DEBUG_RWSEMS are enabled and only DETECT_HUNG_TASK_BLOCKER is enabled, the result is different. > > Does this reasoning make sense to you? Sorry, no. I think "reader-owner cleanup" is a part of "ownership tracking" as DEBUG_RWSEMS does (and that keeps consistency of the ownership tracking behavior same as DEBUG_RWSEM). Thank you, > > Thanks, > Lance > > > > > Thanks, > > > >> > >> Signed-off-by: Lance Yang <lance.yang@linux.dev> > >> --- > >> kernel/locking/rwsem.c | 10 ++++------ > >> 1 file changed, 4 insertions(+), 6 deletions(-) > >> > >> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c > >> index 6cb29442d4fc..a310eb9896de 100644 > >> --- a/kernel/locking/rwsem.c > >> +++ b/kernel/locking/rwsem.c > >> @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem) > >> return false; > >> return rwsem_test_oflags(sem, RWSEM_READER_OWNED); > >> } > >> -#endif > >> > >> -#ifdef CONFIG_DEBUG_RWSEMS > >> /* > >> - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there > >> - * is a task pointer in owner of a reader-owned rwsem, it will be the > >> - * real owner or one of the real owners. The only exception is when the > >> - * unlock is done by up_read_non_owner(). > >> + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured, > >> + * it will make sure that the owner field of a reader-owned rwsem either > >> + * points to a real reader-owner(s) or gets cleared. The only exception is > >> + * when the unlock is done by up_read_non_owner(). > >> */ > >> static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem) > >> { > >> -- > >> 2.49.0 > >> > > > > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org>
On 2025/6/24 11:53, Masami Hiramatsu (Google) wrote: > On Tue, 24 Jun 2025 09:44:55 +0800 > Lance Yang <lance.yang@linux.dev> wrote: > >> >> >> On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote: >>> On Thu, 12 Jun 2025 12:19:25 +0800 >>> Lance Yang <ioworker0@gmail.com> wrote: >>> >>>> From: Lance Yang <lance.yang@linux.dev> >>>> >>>> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a >>>> reader-owned rwsem can lead to false positives in blocker tracking. >>>> >>>> To mitigate this, let’s try to clear the owner field on unlock, as a NULL >>>> owner is better than a stale one for diagnostics. >>> >>> Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and >>> remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS. >> >> Thanks for the feedback! I see your point about the dependency ;) >> >> Personlly, I'd perfer to keep them separate. The reasoning is that >> they addreess two distinct things, and I think splitting them makes >> this series clearer and easier to review ;) >> >> Patch #1 focuses on "ownership tracking": Its only job is to make >> the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned()) >> globally available when blocker tracking is enabled. >> >> Patch #2, on the other hand, is about "reader-owner cleanup": It >> introduces a functional change to the unlock path, trying to clear >> the owner field for reader-owned rwsems. > > But without clearing the owner, the owner information can be > broken, right? Since CONFIG_DEBUG_RWSEMS is working as it is, You're right, the owner info would be broken without the cleanup logic in patch #2. But ... > I think those cannot be decoupled. For example, comparing the > result of both DETECT_HUNG_TASK_BLOCKER and DEBUG_RWSEMS are > enabled and only DETECT_HUNG_TASK_BLOCKER is enabled, the > result is different. The actual blocker tracking for rwsems is only turned on in patch #3. So, there's no case where the feature is active without the cleanup logic already being in place. > >> >> Does this reasoning make sense to you? > > Sorry, no. I think "reader-owner cleanup" is a part of "ownership > tracking" as DEBUG_RWSEMS does (and that keeps consistency of > the ownership tracking behavior same as DEBUG_RWSEM). I thought this step-by-step approach was a bit cleaner, since there are currently only two users for these owner helpers (DEBUG_RWSEMS and DETECT_HUNG_TASK_BLOCKER). Anyway, if you still feel strongly that they should be merged, I'm happy to rework the series as you suggested ;p Thanks, Lance > > Thank you, > >> >> Thanks, >> Lance >> >>> >>> Thanks, >>> >>>> >>>> Signed-off-by: Lance Yang <lance.yang@linux.dev> >>>> --- >>>> kernel/locking/rwsem.c | 10 ++++------ >>>> 1 file changed, 4 insertions(+), 6 deletions(-) >>>> >>>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c >>>> index 6cb29442d4fc..a310eb9896de 100644 >>>> --- a/kernel/locking/rwsem.c >>>> +++ b/kernel/locking/rwsem.c >>>> @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem) >>>> return false; >>>> return rwsem_test_oflags(sem, RWSEM_READER_OWNED); >>>> } >>>> -#endif >>>> >>>> -#ifdef CONFIG_DEBUG_RWSEMS >>>> /* >>>> - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there >>>> - * is a task pointer in owner of a reader-owned rwsem, it will be the >>>> - * real owner or one of the real owners. The only exception is when the >>>> - * unlock is done by up_read_non_owner(). >>>> + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured, >>>> + * it will make sure that the owner field of a reader-owned rwsem either >>>> + * points to a real reader-owner(s) or gets cleared. The only exception is >>>> + * when the unlock is done by up_read_non_owner(). >>>> */ >>>> static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem) >>>> { >>>> -- >>>> 2.49.0 >>>> >>> >>> >> > >
On Tue, 24 Jun 2025 13:02:31 +0800 Lance Yang <lance.yang@linux.dev> wrote: > > > On 2025/6/24 11:53, Masami Hiramatsu (Google) wrote: > > On Tue, 24 Jun 2025 09:44:55 +0800 > > Lance Yang <lance.yang@linux.dev> wrote: > > > >> > >> > >> On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote: > >>> On Thu, 12 Jun 2025 12:19:25 +0800 > >>> Lance Yang <ioworker0@gmail.com> wrote: > >>> > >>>> From: Lance Yang <lance.yang@linux.dev> > >>>> > >>>> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a > >>>> reader-owned rwsem can lead to false positives in blocker tracking. > >>>> > >>>> To mitigate this, let’s try to clear the owner field on unlock, as a NULL > >>>> owner is better than a stale one for diagnostics. > >>> > >>> Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and > >>> remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS. > >> > >> Thanks for the feedback! I see your point about the dependency ;) > >> > >> Personlly, I'd perfer to keep them separate. The reasoning is that > >> they addreess two distinct things, and I think splitting them makes > >> this series clearer and easier to review ;) > >> > >> Patch #1 focuses on "ownership tracking": Its only job is to make > >> the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned()) > >> globally available when blocker tracking is enabled. > >> > >> Patch #2, on the other hand, is about "reader-owner cleanup": It > >> introduces a functional change to the unlock path, trying to clear > >> the owner field for reader-owned rwsems. > > > > But without clearing the owner, the owner information can be > > broken, right? Since CONFIG_DEBUG_RWSEMS is working as it is, > > You're right, the owner info would be broken without the cleanup logic > in patch #2. But ... > > > I think those cannot be decoupled. For example, comparing the > > result of both DETECT_HUNG_TASK_BLOCKER and DEBUG_RWSEMS are > > enabled and only DETECT_HUNG_TASK_BLOCKER is enabled, the > > result is different. > > The actual blocker tracking for rwsems is only turned on in patch #3. > So, there's no case where the feature is active without the cleanup > logic already being in place. > > > > >> > >> Does this reasoning make sense to you? > > > > Sorry, no. I think "reader-owner cleanup" is a part of "ownership > > tracking" as DEBUG_RWSEMS does (and that keeps consistency of > > the ownership tracking behavior same as DEBUG_RWSEM). > > I thought this step-by-step approach was a bit cleaner, since there are > currently only two users for these owner helpers (DEBUG_RWSEMS and > DETECT_HUNG_TASK_BLOCKER). I think the step-by-step approach fits better if the feature is evolving (a working feature is already there.) I don't like the intermediate state which does not work correctly, because if we have a unit test( like kUnit) it should fail. If you can say "this finds the rwsem owner as same as what the CONFIG_DEBUG_RWSEM is doing", it is simpler to explain what you are doing, and easy to understand. > > Anyway, if you still feel strongly that they should be merged, I'm happy > to rework the series as you suggested ;p Thanks, > > Thanks, > Lance > > > > > Thank you, > > > >> > >> Thanks, > >> Lance > >> > >>> > >>> Thanks, > >>> > >>>> > >>>> Signed-off-by: Lance Yang <lance.yang@linux.dev> > >>>> --- > >>>> kernel/locking/rwsem.c | 10 ++++------ > >>>> 1 file changed, 4 insertions(+), 6 deletions(-) > >>>> > >>>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c > >>>> index 6cb29442d4fc..a310eb9896de 100644 > >>>> --- a/kernel/locking/rwsem.c > >>>> +++ b/kernel/locking/rwsem.c > >>>> @@ -205,14 +205,12 @@ bool is_rwsem_reader_owned(struct rw_semaphore *sem) > >>>> return false; > >>>> return rwsem_test_oflags(sem, RWSEM_READER_OWNED); > >>>> } > >>>> -#endif > >>>> > >>>> -#ifdef CONFIG_DEBUG_RWSEMS > >>>> /* > >>>> - * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there > >>>> - * is a task pointer in owner of a reader-owned rwsem, it will be the > >>>> - * real owner or one of the real owners. The only exception is when the > >>>> - * unlock is done by up_read_non_owner(). > >>>> + * With CONFIG_DEBUG_RWSEMS or CONFIG_DETECT_HUNG_TASK_BLOCKER configured, > >>>> + * it will make sure that the owner field of a reader-owned rwsem either > >>>> + * points to a real reader-owner(s) or gets cleared. The only exception is > >>>> + * when the unlock is done by up_read_non_owner(). > >>>> */ > >>>> static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem) > >>>> { > >>>> -- > >>>> 2.49.0 > >>>> > >>> > >>> > >> > > > > > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org>
On 2025/6/24 14:13, Masami Hiramatsu (Google) wrote: > On Tue, 24 Jun 2025 13:02:31 +0800 > Lance Yang <lance.yang@linux.dev> wrote: > >> >> >> On 2025/6/24 11:53, Masami Hiramatsu (Google) wrote: >>> On Tue, 24 Jun 2025 09:44:55 +0800 >>> Lance Yang <lance.yang@linux.dev> wrote: >>> >>>> >>>> >>>> On 2025/6/24 08:26, Masami Hiramatsu (Google) wrote: >>>>> On Thu, 12 Jun 2025 12:19:25 +0800 >>>>> Lance Yang <ioworker0@gmail.com> wrote: >>>>> >>>>>> From: Lance Yang <lance.yang@linux.dev> >>>>>> >>>>>> When CONFIG_DETECT_HUNG_TASK_BLOCKER is enabled, a stale owner pointer in a >>>>>> reader-owned rwsem can lead to false positives in blocker tracking. >>>>>> >>>>>> To mitigate this, let’s try to clear the owner field on unlock, as a NULL >>>>>> owner is better than a stale one for diagnostics. >>>>> >>>>> Can we merge this to [PATCH 1/3]? It seems that you removed #ifdef and >>>>> remove it. This means in anyway we need the feature enabled by DEBUG_RWSEMS. >>>> >>>> Thanks for the feedback! I see your point about the dependency ;) >>>> >>>> Personlly, I'd perfer to keep them separate. The reasoning is that >>>> they addreess two distinct things, and I think splitting them makes >>>> this series clearer and easier to review ;) >>>> >>>> Patch #1 focuses on "ownership tracking": Its only job is to make >>>> the existing owner-related helpers (rwsem_owner(), is_rwsem_reader_owned()) >>>> globally available when blocker tracking is enabled. >>>> >>>> Patch #2, on the other hand, is about "reader-owner cleanup": It >>>> introduces a functional change to the unlock path, trying to clear >>>> the owner field for reader-owned rwsems. >>> >>> But without clearing the owner, the owner information can be >>> broken, right? Since CONFIG_DEBUG_RWSEMS is working as it is, >> >> You're right, the owner info would be broken without the cleanup logic >> in patch #2. But ... >> >>> I think those cannot be decoupled. For example, comparing the >>> result of both DETECT_HUNG_TASK_BLOCKER and DEBUG_RWSEMS are >>> enabled and only DETECT_HUNG_TASK_BLOCKER is enabled, the >>> result is different. >> >> The actual blocker tracking for rwsems is only turned on in patch #3. >> So, there's no case where the feature is active without the cleanup >> logic already being in place. >> >>> >>>> >>>> Does this reasoning make sense to you? >>> >>> Sorry, no. I think "reader-owner cleanup" is a part of "ownership >>> tracking" as DEBUG_RWSEMS does (and that keeps consistency of >>> the ownership tracking behavior same as DEBUG_RWSEM). >> >> I thought this step-by-step approach was a bit cleaner, since there are >> currently only two users for these owner helpers (DEBUG_RWSEMS and >> DETECT_HUNG_TASK_BLOCKER). > > I think the step-by-step approach fits better if the feature is evolving > (a working feature is already there.) I don't like the intermediate Agreed. > state which does not work correctly, because if we have a unit test( > like kUnit) it should fail. If you can say "this finds the rwsem Ah, I missed that ... > owner as same as what the CONFIG_DEBUG_RWSEM is doing", it is simpler > to explain what you are doing, and easy to understand. Thanks for the lesson! Will rework the series as you suggested ;) Lance
© 2016 - 2025 Red Hat, Inc.