[PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

Julien Grall posted 1 patch 2 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/xen tags/patchew/20200722165300.22655-1-julien@xen.org
Maintainers: Andrew Cooper <andrew.cooper3@citrix.com>, Wei Liu <wl@xen.org>, Jan Beulich <jbeulich@suse.com>, "Roger Pau Monné" <roger.pau@citrix.com>
xen/arch/x86/irq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

[PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

Posted by Julien Grall 2 weeks ago
From: Julien Grall <jgrall@amazon.com>

Even if we assigned pirq->arch.irq to a variable, a compile is still
allowed to read pirq->arch.irq multiple time. This means that the value
checked may be different from the value used to get the desc.

Force the compiler to only do one read access by using read_atomic().

Signed-off-by: Julien Grall <jgrall@amazon.com>
---
 xen/arch/x86/irq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index a69937c840b9..25f2eb611692 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
 
     for ( ; ; )
     {
-        int irq = pirq->arch.irq;
+        int irq = read_atomic(&pirq->arch.irq);
 
         if ( irq <= 0 )
             return NULL;
-- 
2.17.1


Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

Posted by Jan Beulich 2 weeks ago
On 22.07.2020 18:53, Julien Grall wrote:
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
>  
>      for ( ; ; )
>      {
> -        int irq = pirq->arch.irq;
> +        int irq = read_atomic(&pirq->arch.irq);

There we go - I'd be fine this way, but I'm pretty sure Andrew
would want this to be ACCESS_ONCE(). So I guess now is the time
to settle which one to prefer in new code (or which criteria
there are to prefer one over the other).

And this is of course besides the fact that I think we have many
more instances where guaranteeing a single access would be
needed, if we're afraid of the described permitted compiler
behavior. Which then makes me wonder if this is really something
we should fix one by one, rather than by at least larger scope
audits (in order to not suggest "throughout the code base").

As a minor remark, unless you've observed problematic behavior,
would you mind adding "potential" or "theoretical" to the title?

Jan

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

Posted by Julien Grall 2 weeks ago
Hi Jan,

On 23/07/2020 12:23, Jan Beulich wrote:
> On 22.07.2020 18:53, Julien Grall wrote:
>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
>>   
>>       for ( ; ; )
>>       {
>> -        int irq = pirq->arch.irq;
>> +        int irq = read_atomic(&pirq->arch.irq);
> 
> There we go - I'd be fine this way, but I'm pretty sure Andrew
> would want this to be ACCESS_ONCE(). So I guess now is the time
> to settle which one to prefer in new code (or which criteria
> there are to prefer one over the other).

I would prefer if we have a single way to force the compiler to do a 
single access (read/write).

The existing implementation of ACCESS_ONCE() can only work on scalar 
type. The implementation is based on a Linux, although we have an extra 
check. Looking through the Linux history, it looks like it is not 
possible to make ACCESS_ONCE() work with non-scalar types:

     ACCESS_ONCE does not work reliably on non-scalar types. For
     example gcc 4.6 and 4.7 might remove the volatile tag for such
     accesses during the SRA (scalar replacement of aggregates) step
     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

I understand that our implementation of read_atomic(), write_atomic() 
would lead to less optimized code. So maybe we want to import 
READ_ONCE() and WRITE_ONCE() from Linux?

As a side note, I have seen suggestion only (see [1]) which suggest that 
they those helpers wouldn't be portable:

"One relatively unimportant misunderstanding is due to the fact that the 
standard only talks about accesses to volatile objects. It does not talk 
about accesses via volatile qualified pointers. Some programmers believe 
that using a pointer-to-volatile should be handled as though it pointed 
to a volatile object. That is not guaranteed by the standard and is 
therefore not portable. However, this is relatively unimportant because 
gcc does in fact treat a pointer-to-volatile as though it pointed to a 
volatile object."

I would assume that the use is OK on CLang and GCC given that Linux has 
been using it.


> And this is of course besides the fact that I think we have many
> more instances where guaranteeing a single access would be
> needed, if we're afraid of the described permitted compiler
> behavior. Which then makes me wonder if this is really something
> we should fix one by one, rather than by at least larger scope
> audits (in order to not suggest "throughout the code base").

It depends how much the contributor can invest on chasing the rest of 
the issues. The larger the scope is, the less likely you will find 
someone that has bandwith to allocate for fixing it completely.

If the scope is "a field", then I think it is a reasonable suggesting.

In this case, I had a look at arch.irq and wasn't able to spot other 
potential issue.

> As a minor remark, unless you've observed problematic behavior,
> would you mind adding "potential" or "theoretical" to the title?

I am not aware of any issues with compiler so far. So I can add 
"potential" in the title.

Cheers,

[1] https://www.airs.com/blog/archives/154

-- 
Julien Grall

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

Posted by Andrew Cooper 2 weeks ago
On 23/07/2020 14:22, Julien Grall wrote:
> Hi Jan,
>
> On 23/07/2020 12:23, Jan Beulich wrote:
>> On 22.07.2020 18:53, Julien Grall wrote:
>>> --- a/xen/arch/x86/irq.c
>>> +++ b/xen/arch/x86/irq.c
>>> @@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
>>>         for ( ; ; )
>>>       {
>>> -        int irq = pirq->arch.irq;
>>> +        int irq = read_atomic(&pirq->arch.irq);
>>
>> There we go - I'd be fine this way, but I'm pretty sure Andrew
>> would want this to be ACCESS_ONCE(). So I guess now is the time
>> to settle which one to prefer in new code (or which criteria
>> there are to prefer one over the other).
>
> I would prefer if we have a single way to force the compiler to do a
> single access (read/write).

Unlikely to happen, I'd expect.

But I would really like to get rid of (or at least rename)
read_atomic()/write_atomic() specifically because they've got nothing to
do with atomic_t's and the set of functionality who's namespace they share.

>
> The existing implementation of ACCESS_ONCE() can only work on scalar
> type. The implementation is based on a Linux, although we have an
> extra check. Looking through the Linux history, it looks like it is
> not possible to make ACCESS_ONCE() work with non-scalar types:
>
>     ACCESS_ONCE does not work reliably on non-scalar types. For
>     example gcc 4.6 and 4.7 might remove the volatile tag for such
>     accesses during the SRA (scalar replacement of aggregates) step
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
>
> I understand that our implementation of read_atomic(), write_atomic()
> would lead to less optimized code.

There are cases where read_atomic()/write_atomic() prevent optimisations
which ACCESS_ONCE() would allow, but it is only for code of the form:

ACCESS_ONCE(ptr) |= val;

Which a sufficiently clever compiler could convert to a single `or $val,
ptr` instruction on x86, while read_atomic()/write_atomic() would force
it to be `mov ptr, %reg; or $val, %reg; mov %reg, ptr`.

That said - your note about GCC treating the pointed-to object as
volatile probably means it won't make the above optimisation, even
though it would be appropriate to do so.

> So maybe we want to import READ_ONCE() and WRITE_ONCE() from Linux?

There is no point.  Linux has taken a massive detour through wildly
different READ/WRITE_ONCE() functions (to fix the above GCC bugs), and
are now back to something very similar to ACCESS_ONCE().

~Andrew

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

Posted by Jan Beulich 2 weeks ago
On 23.07.2020 15:22, Julien Grall wrote:
> On 23/07/2020 12:23, Jan Beulich wrote:
>> On 22.07.2020 18:53, Julien Grall wrote:
>>> --- a/xen/arch/x86/irq.c
>>> +++ b/xen/arch/x86/irq.c
>>> @@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
>>>   
>>>       for ( ; ; )
>>>       {
>>> -        int irq = pirq->arch.irq;
>>> +        int irq = read_atomic(&pirq->arch.irq);
>>
>> There we go - I'd be fine this way, but I'm pretty sure Andrew
>> would want this to be ACCESS_ONCE(). So I guess now is the time
>> to settle which one to prefer in new code (or which criteria
>> there are to prefer one over the other).
> 
> I would prefer if we have a single way to force the compiler to do a 
> single access (read/write).

Ideally yes. I'm unconvinced though that either construct fits all
needs (for {read,write}_atomic() there may be reasons why the
compiler is allowed to produce multiple generated code instances
from a single source instance, while for *_ONCE() the compiler may
be allowed to split the access into pieces (as can be easily seen
for an access to a uint64_t variable on 32-bit x86 at least, and
by deduction I then can't see why it shouldn't be allowed to use
byte-wise accesses).

> The existing implementation of ACCESS_ONCE() can only work on scalar 
> type. The implementation is based on a Linux, although we have an extra 
> check. Looking through the Linux history, it looks like it is not 
> possible to make ACCESS_ONCE() work with non-scalar types:
> 
>      ACCESS_ONCE does not work reliably on non-scalar types. For
>      example gcc 4.6 and 4.7 might remove the volatile tag for such
>      accesses during the SRA (scalar replacement of aggregates) step
>      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> 
> I understand that our implementation of read_atomic(), write_atomic() 
> would lead to less optimized code.

I.e. you see ways for the compiler to be more clever than using
a single "move" instruction for a single move? Or are you referring
to insn scheduling by the compiler (which my gut feeling would say
is impacted as much by an asm volatile() as by accessing a volatile
object).

> So maybe we want to import 
> READ_ONCE() and WRITE_ONCE() from Linux?

So far I was under the impression that our ACCESS_ONCE() is the
result of folding (older) Linux'es READ_ONCE() and WRITE_ONCE()
into a single construct.

> As a side note, I have seen suggestion only (see [1]) which suggest that 
> they those helpers wouldn't be portable:
> 
> "One relatively unimportant misunderstanding is due to the fact that the 
> standard only talks about accesses to volatile objects. It does not talk 
> about accesses via volatile qualified pointers. Some programmers believe 
> that using a pointer-to-volatile should be handled as though it pointed 
> to a volatile object. That is not guaranteed by the standard and is 
> therefore not portable. However, this is relatively unimportant because 
> gcc does in fact treat a pointer-to-volatile as though it pointed to a 
> volatile object."
> 
> I would assume that the use is OK on CLang and GCC given that Linux has 
> been using it.

Then again your change here is exactly to drop such assumptions of
ours on compiler behavior.

>> And this is of course besides the fact that I think we have many
>> more instances where guaranteeing a single access would be
>> needed, if we're afraid of the described permitted compiler
>> behavior. Which then makes me wonder if this is really something
>> we should fix one by one, rather than by at least larger scope
>> audits (in order to not suggest "throughout the code base").
> 
> It depends how much the contributor can invest on chasing the rest of 
> the issues. The larger the scope is, the less likely you will find 
> someone that has bandwith to allocate for fixing it completely.

I certainly understand that.

> If the scope is "a field", then I think it is a reasonable suggesting.
> 
> In this case, I had a look at arch.irq and wasn't able to spot other 
> potential issue.

That's good to know, and may be worth mentioning - if not in the
description, then maybe in a post-commit-message remark?

Jan