xen/x86: cmpxchg cleanup

[PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Roger Pau Monne 1 year, 11 months ago

The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
can be achieved with an atomic increment, which is both simpler to read, and
avoid any need for a loop.

The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/mm/mem_sharing.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 4f810706a315..fe299a2bf9aa 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg)
 
 static shr_handle_t get_next_handle(void)
 {
-    /* Get the next handle get_page style */
-    uint64_t x, y = next_handle;
-    do {
-        x = y;
-    }
-    while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
-    return x + 1;
+    return arch_fetch_and_add(&next_handle, 1) + 1;
 }
 
 static atomic_t nr_saved_mfns   = ATOMIC_INIT(0);
-- 
2.43.0

Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Jan Beulich 1 year, 11 months ago

On 22.02.2024 10:05, Roger Pau Monne wrote:
> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
> can be achieved with an atomic increment, which is both simpler to read, and
> avoid any need for a loop.
> 
> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>

Hmm, same typo again, so my commit script again got the author wrong.
I'm sorry again. I would consider adding an explicit confirmation step
when there's no (well-formed) S-o-b, but I use this script elsewhere
as well, and in e.g. binutils and gcc S-o-b are pretty uncommon. (And
to avoid the question: I'm specifically avoiding From: itself in the
common case, as for you and other people with non-ASCII characters in
their names those typically don't properly show up there.)

Jan

Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Julien Grall 1 year, 11 months ago

Hi Roger,

On 22/02/2024 09:05, Roger Pau Monne wrote:
> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
> can be achieved with an atomic increment, which is both simpler to read, and
> avoid any need for a loop.
> 
> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall

Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Jan Beulich 1 year, 11 months ago

On 22.02.2024 10:05, Roger Pau Monne wrote:
> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
> can be achieved with an atomic increment, which is both simpler to read, and
> avoid any need for a loop.
> 
> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit ...

> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg)
>  
>  static shr_handle_t get_next_handle(void)
>  {
> -    /* Get the next handle get_page style */
> -    uint64_t x, y = next_handle;
> -    do {
> -        x = y;
> -    }
> -    while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
> -    return x + 1;
> +    return arch_fetch_and_add(&next_handle, 1) + 1;
>  }

... the adding of 1 here is a little odd when taken together with
next_handle's initializer. Tamas, you've not written that code, but do
you have any thoughts towards the possible removal of either the
initializer or the adding here? Plus that variable of course could
very well do with moving into this function.

Jan

Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Tamas K Lengyel 1 year, 11 months ago

On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 22.02.2024 10:05, Roger Pau Monne wrote:
> > The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
> > can be achieved with an atomic increment, which is both simpler to read, and
> > avoid any need for a loop.
> >
> > The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
> > instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.
> >
> > Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> albeit ...
>
> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg)
> >
> >  static shr_handle_t get_next_handle(void)
> >  {
> > -    /* Get the next handle get_page style */
> > -    uint64_t x, y = next_handle;
> > -    do {
> > -        x = y;
> > -    }
> > -    while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
> > -    return x + 1;
> > +    return arch_fetch_and_add(&next_handle, 1) + 1;
> >  }
>
> ... the adding of 1 here is a little odd when taken together with
> next_handle's initializer. Tamas, you've not written that code, but do
> you have any thoughts towards the possible removal of either the
> initializer or the adding here? Plus that variable of course could
> very well do with moving into this function.

I have to say I find the existing logic here hard to parse but by the
looks I don't think we need the + 1 once we switch to
arch_fetch_and_add. Also could go without initializing next_handle to
1. Moving it into the function would not really accomplish anything
other than style AFAICT?

Tamas

Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Jan Beulich 1 year, 11 months ago

On 22.02.2024 19:03, Tamas K Lengyel wrote:
> On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 22.02.2024 10:05, Roger Pau Monne wrote:
>>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
>>> can be achieved with an atomic increment, which is both simpler to read, and
>>> avoid any need for a loop.
>>>
>>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
>>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.
>>>
>>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> albeit ...
>>
>>> --- a/xen/arch/x86/mm/mem_sharing.c
>>> +++ b/xen/arch/x86/mm/mem_sharing.c
>>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg)
>>>
>>>  static shr_handle_t get_next_handle(void)
>>>  {
>>> -    /* Get the next handle get_page style */
>>> -    uint64_t x, y = next_handle;
>>> -    do {
>>> -        x = y;
>>> -    }
>>> -    while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
>>> -    return x + 1;
>>> +    return arch_fetch_and_add(&next_handle, 1) + 1;
>>>  }
>>
>> ... the adding of 1 here is a little odd when taken together with
>> next_handle's initializer. Tamas, you've not written that code, but do
>> you have any thoughts towards the possible removal of either the
>> initializer or the adding here? Plus that variable of course could
>> very well do with moving into this function.
> 
> I have to say I find the existing logic here hard to parse but by the
> looks I don't think we need the + 1 once we switch to
> arch_fetch_and_add. Also could go without initializing next_handle to
> 1. Moving it into the function would not really accomplish anything
> other than style AFAICT?

Well, limiting scope of things can be viewed as purely style, but I
think it's more than that: It makes intentions more clear and reduces
the chance of abuse (deliberate or unintentional).

Jan

Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Roger Pau Monné 1 year, 11 months ago

On Fri, Feb 23, 2024 at 08:43:24AM +0100, Jan Beulich wrote:
> On 22.02.2024 19:03, Tamas K Lengyel wrote:
> > On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 22.02.2024 10:05, Roger Pau Monne wrote:
> >>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
> >>> can be achieved with an atomic increment, which is both simpler to read, and
> >>> avoid any need for a loop.
> >>>
> >>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
> >>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.
> >>>
> >>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
> >>
> >> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> >> albeit ...
> >>
> >>> --- a/xen/arch/x86/mm/mem_sharing.c
> >>> +++ b/xen/arch/x86/mm/mem_sharing.c
> >>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg)
> >>>
> >>>  static shr_handle_t get_next_handle(void)
> >>>  {
> >>> -    /* Get the next handle get_page style */
> >>> -    uint64_t x, y = next_handle;
> >>> -    do {
> >>> -        x = y;
> >>> -    }
> >>> -    while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
> >>> -    return x + 1;
> >>> +    return arch_fetch_and_add(&next_handle, 1) + 1;
> >>>  }
> >>
> >> ... the adding of 1 here is a little odd when taken together with
> >> next_handle's initializer. Tamas, you've not written that code, but do
> >> you have any thoughts towards the possible removal of either the
> >> initializer or the adding here? Plus that variable of course could
> >> very well do with moving into this function.
> > 
> > I have to say I find the existing logic here hard to parse but by the
> > looks I don't think we need the + 1 once we switch to
> > arch_fetch_and_add. Also could go without initializing next_handle to
> > 1. Moving it into the function would not really accomplish anything
> > other than style AFAICT?
> 
> Well, limiting scope of things can be viewed as purely style, but I
> think it's more than that: It makes intentions more clear and reduces
> the chance of abuse (deliberate or unintentional).

I'm afraid that whatever is the outcome here, I will defer it to a
further commit, since the purpose here is to be a non-functional
change.

Thanks, Roger.

Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Jan Beulich 1 year, 11 months ago

On 28.02.2024 11:53, Roger Pau Monné wrote:
> On Fri, Feb 23, 2024 at 08:43:24AM +0100, Jan Beulich wrote:
>> On 22.02.2024 19:03, Tamas K Lengyel wrote:
>>> On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 22.02.2024 10:05, Roger Pau Monne wrote:
>>>>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
>>>>> can be achieved with an atomic increment, which is both simpler to read, and
>>>>> avoid any need for a loop.
>>>>>
>>>>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
>>>>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.
>>>>>
>>>>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
>>>>
>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>> albeit ...
>>>>
>>>>> --- a/xen/arch/x86/mm/mem_sharing.c
>>>>> +++ b/xen/arch/x86/mm/mem_sharing.c
>>>>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg)
>>>>>
>>>>>  static shr_handle_t get_next_handle(void)
>>>>>  {
>>>>> -    /* Get the next handle get_page style */
>>>>> -    uint64_t x, y = next_handle;
>>>>> -    do {
>>>>> -        x = y;
>>>>> -    }
>>>>> -    while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
>>>>> -    return x + 1;
>>>>> +    return arch_fetch_and_add(&next_handle, 1) + 1;
>>>>>  }
>>>>
>>>> ... the adding of 1 here is a little odd when taken together with
>>>> next_handle's initializer. Tamas, you've not written that code, but do
>>>> you have any thoughts towards the possible removal of either the
>>>> initializer or the adding here? Plus that variable of course could
>>>> very well do with moving into this function.
>>>
>>> I have to say I find the existing logic here hard to parse but by the
>>> looks I don't think we need the + 1 once we switch to
>>> arch_fetch_and_add. Also could go without initializing next_handle to
>>> 1. Moving it into the function would not really accomplish anything
>>> other than style AFAICT?
>>
>> Well, limiting scope of things can be viewed as purely style, but I
>> think it's more than that: It makes intentions more clear and reduces
>> the chance of abuse (deliberate or unintentional).
> 
> I'm afraid that whatever is the outcome here, I will defer it to a
> further commit, since the purpose here is to be a non-functional
> change.

That's fine with me, but an ack from Tamas is still pending, unless I
missed something somewhere.

Jan

Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Roger Pau Monné 1 year, 11 months ago

On Wed, Feb 28, 2024 at 12:18:31PM +0100, Jan Beulich wrote:
> On 28.02.2024 11:53, Roger Pau Monné wrote:
> > On Fri, Feb 23, 2024 at 08:43:24AM +0100, Jan Beulich wrote:
> >> On 22.02.2024 19:03, Tamas K Lengyel wrote:
> >>> On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> On 22.02.2024 10:05, Roger Pau Monne wrote:
> >>>>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
> >>>>> can be achieved with an atomic increment, which is both simpler to read, and
> >>>>> avoid any need for a loop.
> >>>>>
> >>>>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
> >>>>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.
> >>>>>
> >>>>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >>>>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
> >>>>
> >>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> >>>> albeit ...
> >>>>
> >>>>> --- a/xen/arch/x86/mm/mem_sharing.c
> >>>>> +++ b/xen/arch/x86/mm/mem_sharing.c
> >>>>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg)
> >>>>>
> >>>>>  static shr_handle_t get_next_handle(void)
> >>>>>  {
> >>>>> -    /* Get the next handle get_page style */
> >>>>> -    uint64_t x, y = next_handle;
> >>>>> -    do {
> >>>>> -        x = y;
> >>>>> -    }
> >>>>> -    while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
> >>>>> -    return x + 1;
> >>>>> +    return arch_fetch_and_add(&next_handle, 1) + 1;
> >>>>>  }
> >>>>
> >>>> ... the adding of 1 here is a little odd when taken together with
> >>>> next_handle's initializer. Tamas, you've not written that code, but do
> >>>> you have any thoughts towards the possible removal of either the
> >>>> initializer or the adding here? Plus that variable of course could
> >>>> very well do with moving into this function.
> >>>
> >>> I have to say I find the existing logic here hard to parse but by the
> >>> looks I don't think we need the + 1 once we switch to
> >>> arch_fetch_and_add. Also could go without initializing next_handle to
> >>> 1. Moving it into the function would not really accomplish anything
> >>> other than style AFAICT?
> >>
> >> Well, limiting scope of things can be viewed as purely style, but I
> >> think it's more than that: It makes intentions more clear and reduces
> >> the chance of abuse (deliberate or unintentional).
> > 
> > I'm afraid that whatever is the outcome here, I will defer it to a
> > further commit, since the purpose here is to be a non-functional
> > change.
> 
> That's fine with me, but an ack from Tamas is still pending, unless I
> missed something somewhere.

No, just wanted to clarify that I wasn't expecting to do further
changes here, FTAOD.  Not sure if Tamas was expecting me to further
adjust the code.

Thanks, Roger.

Re: [PATCH 1/2] x86/memsharing: use an atomic add instead of a cmpxchg loop

Posted by Tamas K Lengyel 1 year, 11 months ago

On Wed, Feb 28, 2024 at 8:28 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Wed, Feb 28, 2024 at 12:18:31PM +0100, Jan Beulich wrote:
> > On 28.02.2024 11:53, Roger Pau Monné wrote:
> > > On Fri, Feb 23, 2024 at 08:43:24AM +0100, Jan Beulich wrote:
> > >> On 22.02.2024 19:03, Tamas K Lengyel wrote:
> > >>> On Thu, Feb 22, 2024 at 5:06 AM Jan Beulich <jbeulich@suse.com> wrote:
> > >>>> On 22.02.2024 10:05, Roger Pau Monne wrote:
> > >>>>> The usage of a cmpxchg loop in get_next_handle() is unnecessary, as the same
> > >>>>> can be achieved with an atomic increment, which is both simpler to read, and
> > >>>>> avoid any need for a loop.
> > >>>>>
> > >>>>> The cmpxchg usage is likely a remnant of 32bit support, which didn't have an
> > >>>>> instruction to do an atomic 64bit add, and instead a cmpxchg had to be used.
> > >>>>>
> > >>>>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > >>>>> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
> > >>>>
> > >>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> > >>>> albeit ...
> > >>>>
> > >>>>> --- a/xen/arch/x86/mm/mem_sharing.c
> > >>>>> +++ b/xen/arch/x86/mm/mem_sharing.c
> > >>>>> @@ -179,13 +179,7 @@ static void mem_sharing_page_unlock(struct page_info *pg)
> > >>>>>
> > >>>>>  static shr_handle_t get_next_handle(void)
> > >>>>>  {
> > >>>>> -    /* Get the next handle get_page style */
> > >>>>> -    uint64_t x, y = next_handle;
> > >>>>> -    do {
> > >>>>> -        x = y;
> > >>>>> -    }
> > >>>>> -    while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
> > >>>>> -    return x + 1;
> > >>>>> +    return arch_fetch_and_add(&next_handle, 1) + 1;
> > >>>>>  }
> > >>>>
> > >>>> ... the adding of 1 here is a little odd when taken together with
> > >>>> next_handle's initializer. Tamas, you've not written that code, but do
> > >>>> you have any thoughts towards the possible removal of either the
> > >>>> initializer or the adding here? Plus that variable of course could
> > >>>> very well do with moving into this function.
> > >>>
> > >>> I have to say I find the existing logic here hard to parse but by the
> > >>> looks I don't think we need the + 1 once we switch to
> > >>> arch_fetch_and_add. Also could go without initializing next_handle to
> > >>> 1. Moving it into the function would not really accomplish anything
> > >>> other than style AFAICT?
> > >>
> > >> Well, limiting scope of things can be viewed as purely style, but I
> > >> think it's more than that: It makes intentions more clear and reduces
> > >> the chance of abuse (deliberate or unintentional).
> > >
> > > I'm afraid that whatever is the outcome here, I will defer it to a
> > > further commit, since the purpose here is to be a non-functional
> > > change.
> >
> > That's fine with me, but an ack from Tamas is still pending, unless I
> > missed something somewhere.
>
> No, just wanted to clarify that I wasn't expecting to do further
> changes here, FTAOD.  Not sure if Tamas was expecting me to further
> adjust the code.

Acked-by: Tamas K Lengyel <tamas@tklengyel.com>