[PATCH v1 10/14] s390x/pv: check initial, not maximum RAM size

David Hildenbrand posted 14 patches 2 months, 2 weeks ago
There is a newer version of this series
[PATCH v1 10/14] s390x/pv: check initial, not maximum RAM size
Posted by David Hildenbrand 2 months, 2 weeks ago
We actually want to check the available RAM, not the maximum RAM size.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/kvm/pv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
index dde836d21a..424cce75ca 100644
--- a/target/s390x/kvm/pv.c
+++ b/target/s390x/kvm/pv.c
@@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
      * If the feature is not present or if the VM is not larger than 2 GiB,
      * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
      */
-    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
+    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
         !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
         return false;
     }
-- 
2.46.0
Re: [PATCH v1 10/14] s390x/pv: check initial, not maximum RAM size
Posted by Nina Schoetterl-Glausch 2 months ago
On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
> We actually want to check the available RAM, not the maximum RAM size.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Nit below.
> ---
>  target/s390x/kvm/pv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
> index dde836d21a..424cce75ca 100644
> --- a/target/s390x/kvm/pv.c
> +++ b/target/s390x/kvm/pv.c
> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>       * If the feature is not present or if the VM is not larger than 2 GiB,
>       * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>       */
> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>          !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>          return false;
>      }

If I understood the kernel code right, the decision is made wrt
the size of the gmap address space, which is the same as the
limit set for the VM. So using s390_get_memory_limit would be
semantically cleaner.
Re: [PATCH v1 10/14] s390x/pv: check initial, not maximum RAM size
Posted by David Hildenbrand 2 months ago
On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:
> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
>> We actually want to check the available RAM, not the maximum RAM size.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
> Nit below.
>> ---
>>   target/s390x/kvm/pv.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
>> index dde836d21a..424cce75ca 100644
>> --- a/target/s390x/kvm/pv.c
>> +++ b/target/s390x/kvm/pv.c
>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>>        * If the feature is not present or if the VM is not larger than 2 GiB,
>>        * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>>        */
>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>>           !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>>           return false;
>>       }
> 
> If I understood the kernel code right, the decision is made wrt
> the size of the gmap address space, which is the same as the
> limit set for the VM. So using s390_get_memory_limit would be
> semantically cleaner.

I wonder if we should just drop the RAM size check. Not convinced the 
slightly faster reboot for such small VMs is really relevant? Makes the 
code more complicated than really necessary.


-- 
Cheers,

David / dhildenb
Re: [PATCH v1 10/14] s390x/pv: check initial, not maximum RAM size
Posted by Christian Borntraeger 1 month, 3 weeks ago
Am 24.09.24 um 22:17 schrieb David Hildenbrand:
> On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:
>> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
>>> We actually want to check the available RAM, not the maximum RAM size.
>>>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>
>> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
>> Nit below.
>>> ---
>>>   target/s390x/kvm/pv.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
>>> index dde836d21a..424cce75ca 100644
>>> --- a/target/s390x/kvm/pv.c
>>> +++ b/target/s390x/kvm/pv.c
>>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>>>        * If the feature is not present or if the VM is not larger than 2 GiB,
>>>        * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>>>        */
>>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
>>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>>>           !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>>>           return false;
>>>       }
>>
>> If I understood the kernel code right, the decision is made wrt
>> the size of the gmap address space, which is the same as the
>> limit set for the VM. So using s390_get_memory_limit would be
>> semantically cleaner.
> 
> I wonder if we should just drop the RAM size check. Not convinced the slightly faster reboot for such small VMs is really relevant? Makes the code more complicated than really necessary.

IIRC there have been functional issues with small guests and asnyc. Claudio, do you remember?

Re: [PATCH v1 10/14] s390x/pv: check initial, not maximum RAM size
Posted by Claudio Imbrenda 1 month, 3 weeks ago
On Mon, 30 Sep 2024 13:15:52 +0200
Christian Borntraeger <borntraeger@linux.ibm.com> wrote:

> Am 24.09.24 um 22:17 schrieb David Hildenbrand:
> > On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:  
> >> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:  
> >>> We actually want to check the available RAM, not the maximum RAM size.
> >>>
> >>> Signed-off-by: David Hildenbrand <david@redhat.com>  
> >>
> >> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
> >> Nit below.  
> >>> ---
> >>>   target/s390x/kvm/pv.c | 2 +-
> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
> >>> index dde836d21a..424cce75ca 100644
> >>> --- a/target/s390x/kvm/pv.c
> >>> +++ b/target/s390x/kvm/pv.c
> >>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
> >>>        * If the feature is not present or if the VM is not larger than 2 GiB,
> >>>        * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
> >>>        */
> >>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
> >>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
> >>>           !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
> >>>           return false;
> >>>       }  
> >>
> >> If I understood the kernel code right, the decision is made wrt
> >> the size of the gmap address space, which is the same as the
> >> limit set for the VM. So using s390_get_memory_limit would be
> >> semantically cleaner.  
> > 
> > I wonder if we should just drop the RAM size check. Not convinced the slightly faster reboot for such small VMs is really relevant? Makes the code more complicated than really necessary.  
> 
> IIRC there have been functional issues with small guests and asnyc. Claudio, do you remember?

if we are <2G, KVM allocates a segment table as the highest level table
for the gmap ASCE. there are pointers lurking around in the reverse
mapping prefix_tree, which point directly into segment tables.

if the ASCE is region3 or higher, that's not an issue. if it's a
segment table, then it's an issue, because those pointers will end up
pointing into freed memory, once the old asce is freed.

in short, we have to guarantee that we will never set aside a gmap ASCE
if it is a segment table.
Re: [PATCH v1 10/14] s390x/pv: check initial, not maximum RAM size
Posted by David Hildenbrand 1 month, 3 weeks ago
On 30.09.24 13:37, Claudio Imbrenda wrote:
> On Mon, 30 Sep 2024 13:15:52 +0200
> Christian Borntraeger <borntraeger@linux.ibm.com> wrote:
> 
>> Am 24.09.24 um 22:17 schrieb David Hildenbrand:
>>> On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:
>>>> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
>>>>> We actually want to check the available RAM, not the maximum RAM size.
>>>>>
>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>
>>>> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
>>>> Nit below.
>>>>> ---
>>>>>    target/s390x/kvm/pv.c | 2 +-
>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
>>>>> index dde836d21a..424cce75ca 100644
>>>>> --- a/target/s390x/kvm/pv.c
>>>>> +++ b/target/s390x/kvm/pv.c
>>>>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>>>>>         * If the feature is not present or if the VM is not larger than 2 GiB,
>>>>>         * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>>>>>         */
>>>>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
>>>>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>>>>>            !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>>>>>            return false;
>>>>>        }
>>>>
>>>> If I understood the kernel code right, the decision is made wrt
>>>> the size of the gmap address space, which is the same as the
>>>> limit set for the VM. So using s390_get_memory_limit would be
>>>> semantically cleaner.
>>>
>>> I wonder if we should just drop the RAM size check. Not convinced the slightly faster reboot for such small VMs is really relevant? Makes the code more complicated than really necessary.
>>
>> IIRC there have been functional issues with small guests and asnyc. Claudio, do you remember?
> 
> if we are <2G, KVM allocates a segment table as the highest level table
> for the gmap ASCE. there are pointers lurking around in the reverse
> mapping prefix_tree, which point directly into segment tables.
> 
> if the ASCE is region3 or higher, that's not an issue. if it's a
> segment table, then it's an issue, because those pointers will end up
> pointing into freed memory, once the old asce is freed.
> 
> in short, we have to guarantee that we will never set aside a gmap ASCE
> if it is a segment table.

Thanks for the details, the kernel seems to properly safeguard against 
that. For now, I'll turn it into a check against the memory limit, which 
directly translates to the gmap ASCE used.

Thanks!

-- 
Cheers,

David / dhildenb


Re: [PATCH v1 10/14] s390x/pv: check initial, not maximum RAM size
Posted by Claudio Imbrenda 1 month, 3 weeks ago
On Mon, 30 Sep 2024 15:14:52 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 30.09.24 13:37, Claudio Imbrenda wrote:
> > On Mon, 30 Sep 2024 13:15:52 +0200
> > Christian Borntraeger <borntraeger@linux.ibm.com> wrote:
> >   
> >> Am 24.09.24 um 22:17 schrieb David Hildenbrand:  
> >>> On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:  
> >>>> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:  
> >>>>> We actually want to check the available RAM, not the maximum RAM size.
> >>>>>
> >>>>> Signed-off-by: David Hildenbrand <david@redhat.com>  
> >>>>
> >>>> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
> >>>> Nit below.  
> >>>>> ---
> >>>>>    target/s390x/kvm/pv.c | 2 +-
> >>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
> >>>>> index dde836d21a..424cce75ca 100644
> >>>>> --- a/target/s390x/kvm/pv.c
> >>>>> +++ b/target/s390x/kvm/pv.c
> >>>>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
> >>>>>         * If the feature is not present or if the VM is not larger than 2 GiB,
> >>>>>         * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
> >>>>>         */
> >>>>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
> >>>>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
> >>>>>            !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
> >>>>>            return false;
> >>>>>        }  
> >>>>
> >>>> If I understood the kernel code right, the decision is made wrt
> >>>> the size of the gmap address space, which is the same as the
> >>>> limit set for the VM. So using s390_get_memory_limit would be
> >>>> semantically cleaner.  
> >>>
> >>> I wonder if we should just drop the RAM size check. Not convinced the slightly faster reboot for such small VMs is really relevant? Makes the code more complicated than really necessary.  
> >>
> >> IIRC there have been functional issues with small guests and asnyc. Claudio, do you remember?  
> > 
> > if we are <2G, KVM allocates a segment table as the highest level table
> > for the gmap ASCE. there are pointers lurking around in the reverse
> > mapping prefix_tree, which point directly into segment tables.
> > 
> > if the ASCE is region3 or higher, that's not an issue. if it's a
> > segment table, then it's an issue, because those pointers will end up
> > pointing into freed memory, once the old asce is freed.
> > 
> > in short, we have to guarantee that we will never set aside a gmap ASCE
> > if it is a segment table.  
> 
> Thanks for the details, the kernel seems to properly safeguard against 

it does, but it returns an error if userspace tries; the check there is
to prevent qemu from emitting confusing error messages.

> that. For now, I'll turn it into a check against the memory limit, which 
> directly translates to the gmap ASCE used.

sounds good

> 
> Thanks!

no problem :)
 
Re: [PATCH v1 10/14] s390x/pv: check initial, not maximum RAM size
Posted by David Hildenbrand 1 month, 4 weeks ago
On 24.09.24 22:17, David Hildenbrand wrote:
> On 24.09.24 18:22, Nina Schoetterl-Glausch wrote:
>> On Tue, 2024-09-10 at 19:58 +0200, David Hildenbrand wrote:
>>> We actually want to check the available RAM, not the maximum RAM size.
>>>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>
>> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
>> Nit below.
>>> ---
>>>    target/s390x/kvm/pv.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
>>> index dde836d21a..424cce75ca 100644
>>> --- a/target/s390x/kvm/pv.c
>>> +++ b/target/s390x/kvm/pv.c
>>> @@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
>>>         * If the feature is not present or if the VM is not larger than 2 GiB,
>>>         * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
>>>         */
>>> -    if ((MACHINE(ms)->maxram_size <= 2 * GiB) ||
>>> +    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
>>>            !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
>>>            return false;
>>>        }
>>
>> If I understood the kernel code right, the decision is made wrt
>> the size of the gmap address space, which is the same as the
>> limit set for the VM. So using s390_get_memory_limit would be
>> semantically cleaner.
> 
> I wonder if we should just drop the RAM size check. Not convinced the
> slightly faster reboot for such small VMs is really relevant? Makes the
> code more complicated than really necessary.

Thinking about it,

diff --git a/target/s390x/kvm/pv.c b/target/s390x/kvm/pv.c
index 424cce75ca..bbb2108546 100644
--- a/target/s390x/kvm/pv.c
+++ b/target/s390x/kvm/pv.c
@@ -133,7 +133,7 @@ bool s390_pv_vm_try_disable_async(S390CcwMachineState *ms)
       * If the feature is not present or if the VM is not larger than 2 GiB,
       * KVM_PV_ASYNC_CLEANUP_PREPARE fill fail; no point in attempting it.
       */
-    if ((MACHINE(ms)->ram_size <= 2 * GiB) ||
+    if (s390_get_memory_limit(ms) < 2 * GiB ||
          !kvm_check_extension(kvm_state, KVM_CAP_S390_PROTECTED_ASYNC_DISABLE)) {
          return false;
      }

Is probably the easiest change, thanks!


-- 
Cheers,

David / dhildenb