[PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush

Juergen Gross posted 1 patch 4 years ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/xen tags/patchew/20200422130753.14713-1-jgross@suse.com
Maintainers: Julien Grall <julien@xen.org>, Ian Jackson <ian.jackson@eu.citrix.com>, Wei Liu <wl@xen.org>, Jan Beulich <jbeulich@suse.com>, Stefano Stabellini <sstabellini@kernel.org>, Andrew Cooper <andrew.cooper3@citrix.com>, George Dunlap <george.dunlap@citrix.com>
xen/common/grant_table.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
[PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush
Posted by Juergen Gross 4 years ago
The GNTTABOP_cache_flush hypercall has a wrong test for hypercall
continuation, the test today is:

    if ( rc > 0 || opaque_out != 0 )

Unfortunately this will be true even in case of an error (rc < 0),
possibly leading to very long lasting hypercalls (times of more
than an hour have been observed in a test case).

Correct the test condition to result in false with rc < 0 and set
opaque_out only if no error occurred, to be on the safe side.

Partially-suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/grant_table.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 96080b3dec..5ef7ff940d 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3626,12 +3626,12 @@ do_grant_table_op(
         if ( unlikely(!guest_handle_okay(cflush, count)) )
             goto out;
         rc = gnttab_cache_flush(cflush, &opaque_in, count);
-        if ( rc > 0 )
+        if ( rc >= 0 )
         {
             guest_handle_add_offset(cflush, rc);
             uop = guest_handle_cast(cflush, void);
+            opaque_out = opaque_in;
         }
-        opaque_out = opaque_in;
         break;
     }
 
@@ -3641,7 +3641,7 @@ do_grant_table_op(
     }
 
   out:
-    if ( rc > 0 || opaque_out != 0 )
+    if ( rc > 0 || (opaque_out != 0 && rc == 0) )
     {
         /* Adjust rc, see gnttab_copy() for why this is needed. */
         if ( cmd == GNTTABOP_copy )
-- 
2.16.4


Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush
Posted by Julien Grall 4 years ago
Hi Juergen,

On 22/04/2020 14:07, Juergen Gross wrote:
> The GNTTABOP_cache_flush hypercall has a wrong test for hypercall
> continuation, the test today is:
> 
>      if ( rc > 0 || opaque_out != 0 )
> 
> Unfortunately this will be true even in case of an error (rc < 0),
> possibly leading to very long lasting hypercalls (times of more
> than an hour have been observed in a test case).
> 
> Correct the test condition to result in false with rc < 0 and set
> opaque_out only if no error occurred, to be on the safe side.
> 
> Partially-suggested-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   xen/common/grant_table.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index 96080b3dec..5ef7ff940d 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3626,12 +3626,12 @@ do_grant_table_op(
>           if ( unlikely(!guest_handle_okay(cflush, count)) )
>               goto out;
>           rc = gnttab_cache_flush(cflush, &opaque_in, count);
> -        if ( rc > 0 )
> +        if ( rc >= 0 )
>           {
>               guest_handle_add_offset(cflush, rc);
>               uop = guest_handle_cast(cflush, void);
> +            opaque_out = opaque_in;
>           }
> -        opaque_out = opaque_in;
>           break;
>       }
>   
> @@ -3641,7 +3641,7 @@ do_grant_table_op(
>       }
>   
>     out:
> -    if ( rc > 0 || opaque_out != 0 )
> +    if ( rc > 0 || (opaque_out != 0 && rc == 0) )

I don't understand this change. If you look at the implementation of 
gnttab_flush() it is not possible to have opaque_out non-zero with rc = 0.

So what's the point of the second condition?

>       {
>           /* Adjust rc, see gnttab_copy() for why this is needed. */
>           if ( cmd == GNTTABOP_copy )
> 

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush
Posted by Jürgen Groß 4 years ago
On 22.04.20 15:43, Julien Grall wrote:
> Hi Juergen,
> 
> On 22/04/2020 14:07, Juergen Gross wrote:
>> The GNTTABOP_cache_flush hypercall has a wrong test for hypercall
>> continuation, the test today is:
>>
>>      if ( rc > 0 || opaque_out != 0 )
>>
>> Unfortunately this will be true even in case of an error (rc < 0),
>> possibly leading to very long lasting hypercalls (times of more
>> than an hour have been observed in a test case).
>>
>> Correct the test condition to result in false with rc < 0 and set
>> opaque_out only if no error occurred, to be on the safe side.
>>
>> Partially-suggested-by: Jan Beulich <jbeulich@suse.com>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   xen/common/grant_table.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>> index 96080b3dec..5ef7ff940d 100644
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -3626,12 +3626,12 @@ do_grant_table_op(
>>           if ( unlikely(!guest_handle_okay(cflush, count)) )
>>               goto out;
>>           rc = gnttab_cache_flush(cflush, &opaque_in, count);
>> -        if ( rc > 0 )
>> +        if ( rc >= 0 )
>>           {
>>               guest_handle_add_offset(cflush, rc);
>>               uop = guest_handle_cast(cflush, void);
>> +            opaque_out = opaque_in;
>>           }
>> -        opaque_out = opaque_in;
>>           break;
>>       }
>> @@ -3641,7 +3641,7 @@ do_grant_table_op(
>>       }
>>     out:
>> -    if ( rc > 0 || opaque_out != 0 )
>> +    if ( rc > 0 || (opaque_out != 0 && rc == 0) )
> 
> I don't understand this change. If you look at the implementation of 
> gnttab_flush() it is not possible to have opaque_out non-zero with rc = 0.

Why not?

In gnttab_cache_flush() we have:

   if ( hypercall_preempt_check() )
       return i;

i will be 0 in the first loop iteration.


Juergen

Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush
Posted by Julien Grall 4 years ago

On 22/04/2020 14:43, Julien Grall wrote:
> Hi Juergen,
> 
> On 22/04/2020 14:07, Juergen Gross wrote:
>> The GNTTABOP_cache_flush hypercall has a wrong test for hypercall
>> continuation, the test today is:
>>
>>      if ( rc > 0 || opaque_out != 0 )
>>
>> Unfortunately this will be true even in case of an error (rc < 0),
>> possibly leading to very long lasting hypercalls (times of more
>> than an hour have been observed in a test case).
>>
>> Correct the test condition to result in false with rc < 0 and set
>> opaque_out only if no error occurred, to be on the safe side.
>>
>> Partially-suggested-by: Jan Beulich <jbeulich@suse.com>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   xen/common/grant_table.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
>> index 96080b3dec..5ef7ff940d 100644
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -3626,12 +3626,12 @@ do_grant_table_op(
>>           if ( unlikely(!guest_handle_okay(cflush, count)) )
>>               goto out;
>>           rc = gnttab_cache_flush(cflush, &opaque_in, count);
>> -        if ( rc > 0 )
>> +        if ( rc >= 0 )
>>           {
>>               guest_handle_add_offset(cflush, rc);
>>               uop = guest_handle_cast(cflush, void);
>> +            opaque_out = opaque_in;
>>           }
>> -        opaque_out = opaque_in;
>>           break;
>>       }
>> @@ -3641,7 +3641,7 @@ do_grant_table_op(
>>       }
>>     out:
>> -    if ( rc > 0 || opaque_out != 0 )
>> +    if ( rc > 0 || (opaque_out != 0 && rc == 0) )
> 
> I don't understand this change. If you look at the implementation of 
> gnttab_flush() it is not possible to have opaque_out non-zero with rc = 0.

Hmmm... I misread the code and missed the:

if ( hypercall_preempt_check() )
   return i;

Sorry for the noise.

I am also assuming this want a Fixes tag on 
18e8d22fe750c8c7b2830fa37961352693425cf1 "introduce GNTTABOP_cache_flush".

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush
Posted by Jan Beulich 4 years ago
On 22.04.2020 15:07, Juergen Gross wrote:
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3626,12 +3626,12 @@ do_grant_table_op(
>          if ( unlikely(!guest_handle_okay(cflush, count)) )
>              goto out;
>          rc = gnttab_cache_flush(cflush, &opaque_in, count);
> -        if ( rc > 0 )
> +        if ( rc >= 0 )
>          {
>              guest_handle_add_offset(cflush, rc);
>              uop = guest_handle_cast(cflush, void);
> +            opaque_out = opaque_in;
>          }
> -        opaque_out = opaque_in;
>          break;
>      }
>  
> @@ -3641,7 +3641,7 @@ do_grant_table_op(
>      }
>  
>    out:
> -    if ( rc > 0 || opaque_out != 0 )
> +    if ( rc > 0 || (opaque_out != 0 && rc == 0) )

I disagree with this part - opaque_out shouldn't end up non-zero
when rc < 0, and it won't anymore with the change in the earlier
hunk.

Jan

Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush
Posted by Stefano Stabellini 4 years ago
On Wed, 22 Apr 2020, Jan Beulich wrote:
> On 22.04.2020 15:07, Juergen Gross wrote:
> > --- a/xen/common/grant_table.c
> > +++ b/xen/common/grant_table.c
> > @@ -3626,12 +3626,12 @@ do_grant_table_op(
> >          if ( unlikely(!guest_handle_okay(cflush, count)) )
> >              goto out;
> >          rc = gnttab_cache_flush(cflush, &opaque_in, count);
> > -        if ( rc > 0 )
> > +        if ( rc >= 0 )
> >          {
> >              guest_handle_add_offset(cflush, rc);
> >              uop = guest_handle_cast(cflush, void);
> > +            opaque_out = opaque_in;
> >          }
> > -        opaque_out = opaque_in;
> >          break;
> >      }
> >  
> > @@ -3641,7 +3641,7 @@ do_grant_table_op(
> >      }
> >  
> >    out:
> > -    if ( rc > 0 || opaque_out != 0 )
> > +    if ( rc > 0 || (opaque_out != 0 && rc == 0) )
> 
> I disagree with this part - opaque_out shouldn't end up non-zero
> when rc < 0, and it won't anymore with the change in the earlier
> hunk.

But opaque_out could end up being non-zero when rc == 0. I think it is a
good improvement that we explicitly prevent rc < 0 from entering this
if condition. This is why this new version of the patch is my favorite:
it is the one that leads to the code most robust. 

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


That said, as I mentioned before, I would be OK even without the last
part because it would still be enough to fix the bug.

Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush
Posted by Jan Beulich 4 years ago
On 22.04.2020 18:31, Stefano Stabellini wrote:
> On Wed, 22 Apr 2020, Jan Beulich wrote:
>> On 22.04.2020 15:07, Juergen Gross wrote:
>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -3626,12 +3626,12 @@ do_grant_table_op(
>>>          if ( unlikely(!guest_handle_okay(cflush, count)) )
>>>              goto out;
>>>          rc = gnttab_cache_flush(cflush, &opaque_in, count);
>>> -        if ( rc > 0 )
>>> +        if ( rc >= 0 )
>>>          {
>>>              guest_handle_add_offset(cflush, rc);
>>>              uop = guest_handle_cast(cflush, void);
>>> +            opaque_out = opaque_in;
>>>          }
>>> -        opaque_out = opaque_in;
>>>          break;
>>>      }
>>>  
>>> @@ -3641,7 +3641,7 @@ do_grant_table_op(
>>>      }
>>>  
>>>    out:
>>> -    if ( rc > 0 || opaque_out != 0 )
>>> +    if ( rc > 0 || (opaque_out != 0 && rc == 0) )
>>
>> I disagree with this part - opaque_out shouldn't end up non-zero
>> when rc < 0, and it won't anymore with the change in the earlier
>> hunk.
> 
> But opaque_out could end up being non-zero when rc == 0.

Which is the case the original code meant to deal with. (I still
think it is unfortunate behavior of the cache-flush implementation
to assign meaning other than "success, done" to rc == 0.)

> I think it is a
> good improvement that we explicitly prevent rc < 0 from entering this
> if condition. This is why this new version of the patch is my favorite:
> it is the one that leads to the code most robust. 
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Well, looks like I'm outvoted then. Nevertheless thanks ...

> That said, as I mentioned before, I would be OK even without the last
> part because it would still be enough to fix the bug.

.. for also clarifying this.

Jan