Introduce functions to increase a refcount but with a top limit above which
they will fail to increase it (the limit is inclusive). Setting the limit to
INT_MAX indicates no limit.
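
For example, a hypothetical caller that wants to allow at most
EXAMPLE_LIMIT concurrent references could do (both names below are
illustrative, not part of this patch):

	#define EXAMPLE_LIMIT	128

	static bool example_get_ref(refcount_t *r)
	{
		/* Fails if the count is 0 or would go above EXAMPLE_LIMIT. */
		return __refcount_inc_not_zero_limited(r, NULL, EXAMPLE_LIMIT);
	}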
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
include/linux/refcount.h | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 35f039ecb272..5072ba99f05e 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -137,13 +137,23 @@ static inline unsigned int refcount_read(const refcount_t *r)
 }
 
 static inline __must_check __signed_wrap
-bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp)
+bool __refcount_add_not_zero_limited(int i, refcount_t *r, int *oldp,
+				     int limit)
 {
 	int old = refcount_read(r);
 
 	do {
 		if (!old)
 			break;
+
+		if (statically_true(limit == INT_MAX))
+			continue;
+
+		if (i > limit - old) {
+			if (oldp)
+				*oldp = old;
+			return false;
+		}
 	} while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));
 
 	if (oldp)
@@ -155,6 +165,12 @@ bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp)
 	return old;
 }
 
+static inline __must_check __signed_wrap
+bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp)
+{
+	return __refcount_add_not_zero_limited(i, r, oldp, INT_MAX);
+}
+
 /**
  * refcount_add_not_zero - add a value to a refcount unless it is 0
  * @i: the value to add to the refcount
@@ -213,6 +229,12 @@ static inline void refcount_add(int i, refcount_t *r)
 	__refcount_add(i, r, NULL);
 }
 
+static inline __must_check bool __refcount_inc_not_zero_limited(refcount_t *r,
+								int *oldp, int limit)
+{
+	return __refcount_add_not_zero_limited(1, r, oldp, limit);
+}
+
 static inline __must_check bool __refcount_inc_not_zero(refcount_t *r, int *oldp)
 {
 	return __refcount_add_not_zero(1, r, oldp);
--
2.47.1.613.gc27f4b7a9f-goog
On Fri, 10 Jan 2025 20:25:57 -0800
Suren Baghdasaryan <surenb@google.com> wrote:
> Introduce functions to increase a refcount but with a top limit above which
> they will fail to increase it (the limit is inclusive). Setting the limit to
> INT_MAX indicates no limit.
This function has never worked as expected!
I've stripped the patch's additions from the quote and added back the rest
of the existing code.
> diff --git a/include/linux/refcount.h b/include/linux/refcount.h
> index 35f039ecb272..5072ba99f05e 100644
> --- a/include/linux/refcount.h
> +++ b/include/linux/refcount.h
> @@ -137,13 +137,23 @@ static inline unsigned int refcount_read(const refcount_t *r)
>  }
>
>  static inline __must_check __signed_wrap
> -bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp)
>  {
>  	int old = refcount_read(r);
>
>  	do {
>  		if (!old)
>  			break;
>
>  	} while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));
>
>  	if (oldp)
>  		*oldp = old;
?
>  	if (unlikely(old < 0 || old + i < 0))
>  		refcount_warn_saturate(r, REFCOUNT_ADD_NOT_ZERO_OVF);
>
>  	return old;
>  }
The saturate test just doesn't work as expected.
In C, signed integer overflow is undefined behaviour (probably so that CPUs
that saturate or trap on signed overflow can be conformant) and gcc uses that
to optimise code.
So if you compile (https://www.godbolt.org/z/WYWo84Weq):
int inc_wraps(int i)
{
	return i < 0 || i + 1 < 0;
}
the second test is optimised away.
I don't think the kernel compiles disable this optimisation.
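
FWIW, one way to keep the test without relying on compiler flags is to do
the addition as unsigned, where wrapping is well defined (just a sketch, not
what refcount.h does; the conversion back to int is implementation-defined,
but gcc and clang define it as modulo):

	int inc_wraps_unsigned(int i)
	{
		/* the unsigned add cannot be optimised away */
		return i < 0 || (int)((unsigned int)i + 1u) < 0;
	}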
David
On Sat, Jan 11, 2025 at 12:39:00PM +0000, David Laight wrote:
> [...]
> So if you compile (https://www.godbolt.org/z/WYWo84Weq):
> int inc_wraps(int i)
> {
> return i < 0 || i + 1 < 0;
> }
> the second test is optimised away.
> I don't think the kernel compiles disable this optimisation.
Last I checked, my kernel compiles specified -fno-strict-overflow.
What happens if you try that in godbolt?
Thanx, Paul
On Sat, 11 Jan 2025 10:30:40 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:
> On Sat, Jan 11, 2025 at 12:39:00PM +0000, David Laight wrote:
> > [...]
> > So if you compile (https://www.godbolt.org/z/WYWo84Weq):
> > int inc_wraps(int i)
> > {
> > return i < 0 || i + 1 < 0;
> > }
> > the second test is optimised away.
> > I don't think the kernel compiles disable this optimisation.
>
> Last I checked, my kernel compiles specified -fno-strict-overflow.
> What happens if you try that in godbolt?
That does make gcc generate the wanted object code.
I know that compilation option has come up before, but I couldn't remember the
name or whether it was disabled :-(
You do get much better object code from return (i | i + 1) < 0;
And that is likely to be much better still if you need a conditional jump.
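
i.e. something like (sketch; the i + 1 still assumes -fno-strict-overflow):

	int inc_wraps(int i)
	{
		return (i | (i + 1)) < 0;	/* one sign test instead of two */
	}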
David
On Sat, 11 Jan 2025 22:19:39 +0000
David Laight <david.laight.linux@gmail.com> wrote:
> [...]
> You do get much better object code from return (i | i + 1) < 0;
> And that is likely to be much better still if you need a conditional jump.
I've just checked some more cases (see https://www.godbolt.org/z/YoM9odTbe).
gcc 11 onwards generates the same code for the two expressions.
Rather more worryingly clang 17.0.1 is getting this one wrong:
return i < 0 || i + 1 < 0 ? foo(i) : bar(i);
It ignores the 'i + 1' test even with -fno-strict-overflow.
That is more representative of the actual code.
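
As a standalone unit the testcase looks like this (foo/bar are just opaque
externs standing in for the real callees):

	int foo(int i);
	int bar(int i);

	int inc_wraps_branch(int i)
	{
		return i < 0 || i + 1 < 0 ? foo(i) : bar(i);
	}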
What have I missed now?
David
On Sat, 11 Jan 2025 22:50:16 +0000
David Laight <david.laight.linux@gmail.com> wrote:
> [...]
> Rather more worryingly clang 17.0.1 is getting this one wrong:
> return i < 0 || i + 1 < 0 ? foo(i) : bar(i);
> It ignores the 'i + 1' test even with -fno-strict-overflow.
> That is more representative of the actual code.
>
> What have I missed now?
A different optimisation :-(
On Sun, Jan 12, 2025 at 11:37:56AM +0000, David Laight wrote:
> > [...]
> > Rather more worryingly clang 17.0.1 is getting this one wrong:
> > return i < 0 || i + 1 < 0 ? foo(i) : bar(i);
> > It ignores the 'i + 1' test even with -fno-strict-overflow.
> > That is more representative of the actual code.
> >
> > What have I missed now?
>
> A different optimisation :-(
So the Linux kernel is good with signed integer overflow, right?
(Give or take compiler bugs, of course...)
Thanx, Paul
On Sat, Jan 11, 2025 at 12:39:00PM +0000, David Laight wrote:
> I don't think the kernel compiles disable this optimisation.

You're wrong.
On Fri, 10 Jan 2025 20:25:57 -0800 Suren Baghdasaryan <surenb@google.com>
> -bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp)
> +bool __refcount_add_not_zero_limited(int i, refcount_t *r, int *oldp,
> +				     int limit)
>  {
>  	int old = refcount_read(r);
>
>  	do {
>  		if (!old)
>  			break;
> +
> +		if (statically_true(limit == INT_MAX))
> +			continue;
> +
> +		if (i > limit - old) {
> +			if (oldp)
> +				*oldp = old;
> +			return false;
> +		}
>  	} while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));
The acquire version should be used, see atomic_long_try_cmpxchg_acquire()
in kernel/locking/rwsem.c.
Why not use the atomic_long_t without bothering to add this limited version?
On Fri, Jan 10, 2025 at 10:32 PM Hillf Danton <hdanton@sina.com> wrote:
>
> On Fri, 10 Jan 2025 20:25:57 -0800 Suren Baghdasaryan <surenb@google.com>
> > [...]
>
> The acquire version should be used, see atomic_long_try_cmpxchg_acquire()
> in kernel/locking/rwsem.c.
This is how __refcount_add_not_zero() is already implemented and I'm
only adding support for a limit. If you think it's implemented wrong
then IMHO it should be fixed separately.
>
> Why not use the atomic_long_t without bothering to add this limited version?
The check against the limit is not only for overflow protection but
also to avoid refcount increment when the writer bit is set. It makes
the locking code simpler if we have a function that prevents
refcounting when the vma is detached (vm_refcnt==0) or when it's
write-locked (vm_refcnt<VMA_REF_LIMIT).
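
Roughly like this (a sketch of the intended use; vm_refcnt and
VMA_REF_LIMIT are from this series, the helper name is made up):

	/* Succeeds only while the vma is attached and not write-locked. */
	static inline bool vma_ref_get(struct vm_area_struct *vma)
	{
		return __refcount_inc_not_zero_limited(&vma->vm_refcnt, NULL,
						       VMA_REF_LIMIT);
	}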
On Sat, 11 Jan 2025 01:59:41 -0800 Suren Baghdasaryan <surenb@google.com>
> On Fri, Jan 10, 2025 at 10:32 PM Hillf Danton <hdanton@sina.com> wrote:
> > [...]
> >
> > The acquire version should be used, see atomic_long_try_cmpxchg_acquire()
> > in kernel/locking/rwsem.c.
>
> This is how __refcount_add_not_zero() is already implemented and I'm
> only adding support for a limit. If you think it's implemented wrong
> then IMHO it should be fixed separately.
>
Two different things - refcount has nothing to do with locking in the
first place, while what you are adding to the mm directory is something
that replaces rwsem, so from the locking POV you have to mark the
boundaries of the locking section.
On Sat, Jan 11, 2025 at 4:13 AM Hillf Danton <hdanton@sina.com> wrote:
>
> [...]
> Two different things - refcount has nothing to do with locking in the
> first place, while what you are adding to the mm directory is something
> that replaces rwsem, so from the locking POV you have to mark the
> boundaries of the locking section.
I see your point. I think it's a strong argument to use atomic
directly instead of refcount for this locking. I'll try that and see
how it looks. Thanks for the feedback!
On Sat, Jan 11, 2025 at 09:11:52AM -0800, Suren Baghdasaryan wrote:
> On Sat, Jan 11, 2025 at 4:13 AM Hillf Danton <hdanton@sina.com> wrote:
> > [...]
> > Two different things - refcount has nothing to do with locking in the
> > first place, while what you are adding to the mm directory is something
> > that replaces rwsem, so from the locking POV you have to mark the
> > boundaries of the locking section.
>
> I see your point. I think it's a strong argument to use atomic
> directly instead of refcount for this locking. I'll try that and see
> how it looks. Thanks for the feedback!
Sigh; don't let hillf confuse you. *IF* you need an acquire it will be
in the part where you wait for readers to go away. But even there, think
about what you're serializing against. Readers don't typically modify
things.
And modifications are fully serialized by mmap_sem^H^H^Hlock.
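
IOW, if an acquire is needed at all, it sits in something like this
(sketch, names invented):

	static void vma_wait_for_readers(refcount_t *r)
	{
		while (refcount_read(r))	/* wait for readers to go away */
			cpu_relax();
		/* order the writer's later accesses after seeing no readers */
		smp_acquire__after_ctrl_dep();
	}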
On Wed, 15 Jan 2025 10:39:55 +0100 Peter Zijlstra <peterz@infradead.org>
>
> Sigh; don't let hillf confuse you. *IF* you need an acquire it will be
> in the part where you wait for readers to go away. But even there, think
> about what you're serializing against. Readers don't typically modify
> things.
>
> And modifications are fully serialized by mmap_sem^H^H^Hlock.
>
^H^H^H
- * Do note that inc_not_zero() provides a control dependency which will order
- * future stores against the inc, this ensures we'll never modify the object
- * if we did not in fact acquire a reference.
+ * Do note that inc_not_zero() does provide acquire order, which will order
+ * future load and stores against the inc, this ensures all subsequent accesses
+ * are from this object and not anything previously occupying this memory --
+ * consider SLAB_TYPESAFE_BY_RCU.
  *
  * The decrements will provide release order, such that all the prior loads and
  * stores will be issued before, it also provides a control dependency, which
@@ -144,7 +145,7 @@ bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp)
 	do {
 		if (!old)
 			break;
-	} while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));
+	} while (!atomic_try_cmpxchg_acquire(&r->refs, &old, old + i));
On Sat, 11 Jan 2025 09:11:52 -0800 Suren Baghdasaryan <surenb@google.com>
> I see your point. I think it's a strong argument to use atomic
> directly instead of refcount for this locking. I'll try that and see
> how it looks. Thanks for the feedback!
>
Better not before having a clear answer to why it is sane to invent
anything like rwsem in 2025. What, the 40 bytes? Nope, it is the fair
price paid for finer locking granularity.

BTW Vlastimil, the cc list is cut down because I have to walk around
the spam check on the mail agent side.
On Sat, Jan 11, 2025 at 3:45 PM Hillf Danton <hdanton@sina.com> wrote:
>
> Better not before having a clear answer to why it is sane to invent
> anything like rwsem in 2025. What, the 40 bytes? Nope, it is the fair
> price paid for finer locking granularity.

It's not just about the 40 bytes. It allows us to fold the separate
vma->detached flag nicely into the same refcounter, which consolidates
the vma state in one place. Later that makes it much easier to add
SLAB_TYPESAFE_BY_RCU because now we have to preserve only this
refcounter during the vma reuse.
On Sat, Jan 11, 2025 at 1:59 AM Suren Baghdasaryan <surenb@google.com> wrote:
> [...]
> The check against the limit is not only for overflow protection but
> also to avoid refcount increment when the writer bit is set. It makes
> the locking code simpler if we have a function that prevents
> refcounting when the vma is detached (vm_refcnt==0) or when it's
> write-locked (vm_refcnt<VMA_REF_LIMIT).
s / vm_refcnt<VMA_REF_LIMIT / vm_refcnt>VMA_REF_LIMIT