[PATCH] Arm: tidy page_get_owner_and_nr_reference()

Jan Beulich posted 1 patch 4 months ago
Failed in applying to current master (apply log)
[PATCH] Arm: tidy page_get_owner_and_nr_reference()
Posted by Jan Beulich 4 months ago
When the bumping by <nr> (instead of just 1) was introduced, a comment
was left untouched, and a bogus ASSERT_UNREACHABLE() was inserted. That
code path can in principle be taken (depending on configuration coming
from the outside), and we shouldn't assert anything we didn't check
elsewhere.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Noticed while reviewing the RISC-V code copying this machinery almost
verbatim.

--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -256,16 +256,13 @@ static struct domain *page_get_owner_and
 
     /* Restrict nr to avoid "double" overflow */
     if ( nr >= PGC_count_mask )
-    {
-        ASSERT_UNREACHABLE();
         return NULL;
-    }
 
     do {
         x = y;
         /*
-         * Count ==  0: Page is not allocated, so we cannot take a reference.
-         * Count == -1: Reference count would wrap, which is invalid.
+         * Count ==   0: Page is not allocated, so we cannot take a reference.
+         * Count >= -nr: Reference count would wrap, which is invalid.
          */
         if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
             return NULL;
Re: [PATCH] Arm: tidy page_get_owner_and_nr_reference()
Posted by Julien Grall 4 months ago
Hi Jan,

On 02/07/2025 14:06, Jan Beulich wrote:
> When the bumping by <nr> (instead of just 1) was introduced, a comment
> was left untouched, and a bogus ASSERT_UNREACHABLE() was inserted. That
> code path can in principle be taken (depending on configuration coming
> from the outside), and we shouldn't assert anything we didn't check
> elsewhere.

I suggested to add the ASSERT_UNREACHABLE (see [1]). My take is the 
overflow is not something that should happen and it is good to be able 
to catch very clearly in debug build.

So I am reluctant to remove it at the moment.

Cheers,

[1] 
https://lore.kernel.org/xen-devel/47b4d6c2-3bc7-02bc-be5a-a2b782541c3c@xen.org/

-- 
Julien Grall
Re: [PATCH] Arm: tidy page_get_owner_and_nr_reference()
Posted by Jan Beulich 4 months ago
On 02.07.2025 15:18, Julien Grall wrote:
> On 02/07/2025 14:06, Jan Beulich wrote:
>> When the bumping by <nr> (instead of just 1) was introduced, a comment
>> was left untouched, and a bogus ASSERT_UNREACHABLE() was inserted. That
>> code path can in principle be taken (depending on configuration coming
>> from the outside), and we shouldn't assert anything we didn't check
>> elsewhere.
> 
> I suggested to add the ASSERT_UNREACHABLE (see [1]). My take is the 
> overflow is not something that should happen and it is good to be able 
> to catch very clearly in debug build.

But catching an anomalous entry in DT (which aiui the values come from,
even if perhaps indirectly) isn't going to depend on people using debug
builds. It depends on what DT blobs they use. And issues there shouldn't
be checked by assertions, as those blobs live outside of Xen.

> So I am reluctant to remove it at the moment.

But then some checking needs to be added elsewhere.

Jan
Re: [PATCH] Arm: tidy page_get_owner_and_nr_reference()
Posted by Julien Grall 3 months, 4 weeks ago
Hi Jan,

On 02/07/2025 14:37, Jan Beulich wrote:
> On 02.07.2025 15:18, Julien Grall wrote:
>> On 02/07/2025 14:06, Jan Beulich wrote:
>>> When the bumping by <nr> (instead of just 1) was introduced, a comment
>>> was left untouched, and a bogus ASSERT_UNREACHABLE() was inserted. That
>>> code path can in principle be taken (depending on configuration coming
>>> from the outside), and we shouldn't assert anything we didn't check
>>> elsewhere.
>>
>> I suggested to add the ASSERT_UNREACHABLE (see [1]). My take is the
>> overflow is not something that should happen and it is good to be able
>> to catch very clearly in debug build.
> 
> But catching an anomalous entry in DT (which aiui the values come from,
> even if perhaps indirectly) isn't going to depend on people using debug
> builds. It depends on what DT blobs they use. And issues there shouldn't
> be checked by assertions, as those blobs live outside of Xen.

I agree we probably want check upfront. But I still don't think we want 
to remove this ASSERT_UNREACHABLE().

By the way, I am not asking you to add those checks.

Cheers,

-- 
Julien Grall
Re: [PATCH] Arm: tidy page_get_owner_and_nr_reference()
Posted by Jan Beulich 3 months, 4 weeks ago
On 03.07.2025 10:52, Julien Grall wrote:
> On 02/07/2025 14:37, Jan Beulich wrote:
>> On 02.07.2025 15:18, Julien Grall wrote:
>>> On 02/07/2025 14:06, Jan Beulich wrote:
>>>> When the bumping by <nr> (instead of just 1) was introduced, a comment
>>>> was left untouched, and a bogus ASSERT_UNREACHABLE() was inserted. That
>>>> code path can in principle be taken (depending on configuration coming
>>>> from the outside), and we shouldn't assert anything we didn't check
>>>> elsewhere.
>>>
>>> I suggested to add the ASSERT_UNREACHABLE (see [1]). My take is the
>>> overflow is not something that should happen and it is good to be able
>>> to catch very clearly in debug build.
>>
>> But catching an anomalous entry in DT (which aiui the values come from,
>> even if perhaps indirectly) isn't going to depend on people using debug
>> builds. It depends on what DT blobs they use. And issues there shouldn't
>> be checked by assertions, as those blobs live outside of Xen.
> 
> I agree we probably want check upfront. But I still don't think we want 
> to remove this ASSERT_UNREACHABLE().
> 
> By the way, I am not asking you to add those checks.

Sure, I wouldn't even know where and what. I can re-send just the comment
change, but that would feel wrong as long as the ASSERT_UNREACHABLE() is
actually reachable.

Jan
Re: [PATCH] Arm: tidy page_get_owner_and_nr_reference()
Posted by Julien Grall 3 months, 2 weeks ago
Hi Jan,

Sorry for the late answer.

On 03/07/2025 11:04, Jan Beulich wrote:
> On 03.07.2025 10:52, Julien Grall wrote:
>> On 02/07/2025 14:37, Jan Beulich wrote:
>>> On 02.07.2025 15:18, Julien Grall wrote:
>>>> On 02/07/2025 14:06, Jan Beulich wrote:
>>>>> When the bumping by <nr> (instead of just 1) was introduced, a comment
>>>>> was left untouched, and a bogus ASSERT_UNREACHABLE() was inserted. That
>>>>> code path can in principle be taken (depending on configuration coming
>>>>> from the outside), and we shouldn't assert anything we didn't check
>>>>> elsewhere.
>>>>
>>>> I suggested to add the ASSERT_UNREACHABLE (see [1]). My take is the
>>>> overflow is not something that should happen and it is good to be able
>>>> to catch very clearly in debug build.
>>>
>>> But catching an anomalous entry in DT (which aiui the values come from,
>>> even if perhaps indirectly) isn't going to depend on people using debug
>>> builds. It depends on what DT blobs they use. And issues there shouldn't
>>> be checked by assertions, as those blobs live outside of Xen.
>>
>> I agree we probably want check upfront. But I still don't think we want
>> to remove this ASSERT_UNREACHABLE().
>>
>> By the way, I am not asking you to add those checks.
> 
> Sure, I wouldn't even know where and what. I can re-send just the comment
> change, but that would feel wrong as long as the ASSERT_UNREACHABLE() is
> actually reachable.

I am guessing you mean this change:

          /*
-         * Count ==  0: Page is not allocated, so we cannot take a 
reference.
-         * Count == -1: Reference count would wrap, which is invalid.
+         * Count ==   0: Page is not allocated, so we cannot take a 
reference.
+         * Count >= -nr: Reference count would wrap, which is invalid.
           */

If so, I think it is still valid regardless whether we continue to use 
ASSERT_UNREACHABLE().	

Cheers,

-- 
Julien Grall
Re: [PATCH] Arm: tidy page_get_owner_and_nr_reference()
Posted by Jan Beulich 3 months, 2 weeks ago
On 12.07.2025 12:37, Julien Grall wrote:
> Hi Jan,
> 
> Sorry for the late answer.
> 
> On 03/07/2025 11:04, Jan Beulich wrote:
>> On 03.07.2025 10:52, Julien Grall wrote:
>>> On 02/07/2025 14:37, Jan Beulich wrote:
>>>> On 02.07.2025 15:18, Julien Grall wrote:
>>>>> On 02/07/2025 14:06, Jan Beulich wrote:
>>>>>> When the bumping by <nr> (instead of just 1) was introduced, a comment
>>>>>> was left untouched, and a bogus ASSERT_UNREACHABLE() was inserted. That
>>>>>> code path can in principle be taken (depending on configuration coming
>>>>>> from the outside), and we shouldn't assert anything we didn't check
>>>>>> elsewhere.
>>>>>
>>>>> I suggested to add the ASSERT_UNREACHABLE (see [1]). My take is the
>>>>> overflow is not something that should happen and it is good to be able
>>>>> to catch very clearly in debug build.
>>>>
>>>> But catching an anomalous entry in DT (which aiui the values come from,
>>>> even if perhaps indirectly) isn't going to depend on people using debug
>>>> builds. It depends on what DT blobs they use. And issues there shouldn't
>>>> be checked by assertions, as those blobs live outside of Xen.
>>>
>>> I agree we probably want check upfront. But I still don't think we want
>>> to remove this ASSERT_UNREACHABLE().
>>>
>>> By the way, I am not asking you to add those checks.
>>
>> Sure, I wouldn't even know where and what. I can re-send just the comment
>> change, but that would feel wrong as long as the ASSERT_UNREACHABLE() is
>> actually reachable.
> 
> I am guessing you mean this change:
> 
>           /*
> -         * Count ==  0: Page is not allocated, so we cannot take a 
> reference.
> -         * Count == -1: Reference count would wrap, which is invalid.
> +         * Count ==   0: Page is not allocated, so we cannot take a 
> reference.
> +         * Count >= -nr: Reference count would wrap, which is invalid.
>            */
> 
> If so, I think it is still valid regardless whether we continue to use 
> ASSERT_UNREACHABLE().	

Yes, that's the comment change which is valid regardless. My reply was about
the dropping of the ASSERT_UNREACHABLE(), though: To me, dropping that
change simply feels wrong, as it was put there by mistake at the same time
as the comment was left unaltered. So to me both changes still go together,
unless by a patch going in earlier the unreachability of that return path
was indeed guaranteed.

In fact I guess I should have added a Fixes: tag to the patch.

Jan
Re: [PATCH] Arm: tidy page_get_owner_and_nr_reference()
Posted by Jan Beulich 1 month ago
On 14.07.2025 09:26, Jan Beulich wrote:
> On 12.07.2025 12:37, Julien Grall wrote:
>> On 03/07/2025 11:04, Jan Beulich wrote:
>>> On 03.07.2025 10:52, Julien Grall wrote:
>>>> On 02/07/2025 14:37, Jan Beulich wrote:
>>>>> On 02.07.2025 15:18, Julien Grall wrote:
>>>>>> On 02/07/2025 14:06, Jan Beulich wrote:
>>>>>>> When the bumping by <nr> (instead of just 1) was introduced, a comment
>>>>>>> was left untouched, and a bogus ASSERT_UNREACHABLE() was inserted. That
>>>>>>> code path can in principle be taken (depending on configuration coming
>>>>>>> from the outside), and we shouldn't assert anything we didn't check
>>>>>>> elsewhere.
>>>>>>
>>>>>> I suggested to add the ASSERT_UNREACHABLE (see [1]). My take is the
>>>>>> overflow is not something that should happen and it is good to be able
>>>>>> to catch very clearly in debug build.
>>>>>
>>>>> But catching an anomalous entry in DT (which aiui the values come from,
>>>>> even if perhaps indirectly) isn't going to depend on people using debug
>>>>> builds. It depends on what DT blobs they use. And issues there shouldn't
>>>>> be checked by assertions, as those blobs live outside of Xen.
>>>>
>>>> I agree we probably want check upfront. But I still don't think we want
>>>> to remove this ASSERT_UNREACHABLE().
>>>>
>>>> By the way, I am not asking you to add those checks.
>>>
>>> Sure, I wouldn't even know where and what. I can re-send just the comment
>>> change, but that would feel wrong as long as the ASSERT_UNREACHABLE() is
>>> actually reachable.
>>
>> I am guessing you mean this change:
>>
>>           /*
>> -         * Count ==  0: Page is not allocated, so we cannot take a 
>> reference.
>> -         * Count == -1: Reference count would wrap, which is invalid.
>> +         * Count ==   0: Page is not allocated, so we cannot take a 
>> reference.
>> +         * Count >= -nr: Reference count would wrap, which is invalid.
>>            */
>>
>> If so, I think it is still valid regardless whether we continue to use 
>> ASSERT_UNREACHABLE().	
> 
> Yes, that's the comment change which is valid regardless. My reply was about
> the dropping of the ASSERT_UNREACHABLE(), though: To me, dropping that
> change simply feels wrong, as it was put there by mistake at the same time
> as the comment was left unaltered. So to me both changes still go together,
> unless by a patch going in earlier the unreachability of that return path
> was indeed guaranteed.

I think we want this sorted for 4.21. I'm happy to shrink the patch to just
the comment change, but only if ahead of that whatever other change goes in,
making the assertion actually legitimate to live there.

> In fact I guess I should have added a Fixes: tag to the patch.

Locally I've added
Fixes: 5951b856d8d0 ("xen/arm: introduce put_page_nr and get_page_nr")

Jan