[libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

Christian Borntraeger posted 1 patch 6 years, 6 months ago
Failed in applying to current master (apply log)
hw/s390x/s390-virtio-ccw.c         | 8 --------
include/hw/s390x/s390-virtio-ccw.h | 3 ---
target/s390x/kvm.c                 | 2 +-
3 files changed, 1 insertion(+), 12 deletions(-)
[libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Christian Borntraeger 6 years, 6 months ago
Starting a guest with
   <os>
    <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
  </os>
  <cpu mode='host-model'/>

on an IBM z14 results in

"qemu-system-s390x: Some features requested in the CPU model are not
available in the configuration: gs"

This is because guarded storage is fenced for compat machines that did not have
guarded storage support, but libvirt expands the cpu model according to the
latest available machine.

While this prevents future migration abort (by not starting the guest at all),
not being able to start a "host-model" guest is very much unexpected.  As it
turns out, even if we would modify libvirt to not expand the cpu model to
contain "gs" for compat machines, it cannot guarantee that a migration will
succeed. For example if the kernel changes its features (or the user has
nested=1 on one host but not on the other) the migration will fail
nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
all machine types that support the CPU model. This will make "host-model"
runnable all the time, while relying on the CPU model to reject invalid
migration attempts.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 hw/s390x/s390-virtio-ccw.c         | 8 --------
 include/hw/s390x/s390-virtio-ccw.h | 3 ---
 target/s390x/kvm.c                 | 2 +-
 3 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index fabe4a6..ae5b01a 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -414,7 +414,6 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
     s390mc->ri_allowed = true;
     s390mc->cpu_model_allowed = true;
     s390mc->css_migration_enabled = true;
-    s390mc->gs_allowed = true;
     mc->init = ccw_init;
     mc->reset = s390_machine_reset;
     mc->hot_add_cpu = s390_hot_add_cpu;
@@ -495,12 +494,6 @@ bool cpu_model_allowed(void)
     return get_machine_class()->cpu_model_allowed;
 }
 
-bool gs_allowed(void)
-{
-    /* for "none" machine this results in true */
-    return get_machine_class()->gs_allowed;
-}
-
 static char *machine_get_loadparm(Object *obj, Error **errp)
 {
     S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
@@ -740,7 +733,6 @@ static void ccw_machine_2_9_class_options(MachineClass *mc)
 {
     S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
 
-    s390mc->gs_allowed = false;
     ccw_machine_2_10_class_options(mc);
     SET_MACHINE_COMPAT(mc, CCW_COMPAT_2_9);
     s390mc->css_migration_enabled = false;
diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
index a9a90c2..ac896e3 100644
--- a/include/hw/s390x/s390-virtio-ccw.h
+++ b/include/hw/s390x/s390-virtio-ccw.h
@@ -40,15 +40,12 @@ typedef struct S390CcwMachineClass {
     bool ri_allowed;
     bool cpu_model_allowed;
     bool css_migration_enabled;
-    bool gs_allowed;
 } S390CcwMachineClass;
 
 /* runtime-instrumentation allowed by the machine */
 bool ri_allowed(void);
 /* cpu model allowed by the machine */
 bool cpu_model_allowed(void);
-/* guarded-storage allowed by the machine */
-bool gs_allowed(void);
 
 /**
  * Returns true if (vmstate based) migration of the channel subsystem
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 4c85ed8..020a7ea 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -363,7 +363,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
             cap_ri = 1;
         }
     }
-    if (gs_allowed()) {
+    if (cpu_model_allowed()) {
         if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
             cap_gs = 1;
         }
-- 
2.9.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Christian Borntraeger 6 years, 6 months ago
FWIW, this patch alone probably makes sense to fix this particular issue.

The question is do we want to have some additional interfaces between libvirt/qemu
that tell libvirt what CPUs/features are allowed for a specific machine version?

Halil just brough up an example. Imagine to have a QEMU binary that can emulate
an x86 and an s390x depending on the machine type. Should libvirt be able to detect
that -cpu pentium only makes sense for the q35 machines, but not for the s390-ccw-virtio
ones?




On 10/20/2017 04:54 PM, Christian Borntraeger wrote:
> Starting a guest with
>    <os>
>     <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
>   </os>
>   <cpu mode='host-model'/>
> 
> on an IBM z14 results in
> 
> "qemu-system-s390x: Some features requested in the CPU model are not
> available in the configuration: gs"
> 
> This is because guarded storage is fenced for compat machines that did not have
> guarded storage support, but libvirt expands the cpu model according to the
> latest available machine.
> 
> While this prevents future migration abort (by not starting the guest at all),
> not being able to start a "host-model" guest is very much unexpected.  As it
> turns out, even if we would modify libvirt to not expand the cpu model to
> contain "gs" for compat machines, it cannot guarantee that a migration will
> succeed. For example if the kernel changes its features (or the user has
> nested=1 on one host but not on the other) the migration will fail
> nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
> all machine types that support the CPU model. This will make "host-model"
> runnable all the time, while relying on the CPU model to reject invalid
> migration attempts.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  hw/s390x/s390-virtio-ccw.c         | 8 --------
>  include/hw/s390x/s390-virtio-ccw.h | 3 ---
>  target/s390x/kvm.c                 | 2 +-
>  3 files changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index fabe4a6..ae5b01a 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -414,7 +414,6 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
>      s390mc->ri_allowed = true;
>      s390mc->cpu_model_allowed = true;
>      s390mc->css_migration_enabled = true;
> -    s390mc->gs_allowed = true;
>      mc->init = ccw_init;
>      mc->reset = s390_machine_reset;
>      mc->hot_add_cpu = s390_hot_add_cpu;
> @@ -495,12 +494,6 @@ bool cpu_model_allowed(void)
>      return get_machine_class()->cpu_model_allowed;
>  }
> 
> -bool gs_allowed(void)
> -{
> -    /* for "none" machine this results in true */
> -    return get_machine_class()->gs_allowed;
> -}
> -
>  static char *machine_get_loadparm(Object *obj, Error **errp)
>  {
>      S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
> @@ -740,7 +733,6 @@ static void ccw_machine_2_9_class_options(MachineClass *mc)
>  {
>      S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
> 
> -    s390mc->gs_allowed = false;
>      ccw_machine_2_10_class_options(mc);
>      SET_MACHINE_COMPAT(mc, CCW_COMPAT_2_9);
>      s390mc->css_migration_enabled = false;
> diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
> index a9a90c2..ac896e3 100644
> --- a/include/hw/s390x/s390-virtio-ccw.h
> +++ b/include/hw/s390x/s390-virtio-ccw.h
> @@ -40,15 +40,12 @@ typedef struct S390CcwMachineClass {
>      bool ri_allowed;
>      bool cpu_model_allowed;
>      bool css_migration_enabled;
> -    bool gs_allowed;
>  } S390CcwMachineClass;
> 
>  /* runtime-instrumentation allowed by the machine */
>  bool ri_allowed(void);
>  /* cpu model allowed by the machine */
>  bool cpu_model_allowed(void);
> -/* guarded-storage allowed by the machine */
> -bool gs_allowed(void);
> 
>  /**
>   * Returns true if (vmstate based) migration of the channel subsystem
> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> index 4c85ed8..020a7ea 100644
> --- a/target/s390x/kvm.c
> +++ b/target/s390x/kvm.c
> @@ -363,7 +363,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>              cap_ri = 1;
>          }
>      }
> -    if (gs_allowed()) {
> +    if (cpu_model_allowed()) {
>          if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>              cap_gs = 1;
>          }
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Christian Borntraeger 6 years, 6 months ago
Ping, I plan to submit belows patch for 2.11. We can then still look into
a libvirt<->qemu interface for limiting host-model depending on machine versions
(or not).

On 10/20/2017 04:54 PM, Christian Borntraeger wrote:
> Starting a guest with
>    <os>
>     <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
>   </os>
>   <cpu mode='host-model'/>
> 
> on an IBM z14 results in
> 
> "qemu-system-s390x: Some features requested in the CPU model are not
> available in the configuration: gs"
> 
> This is because guarded storage is fenced for compat machines that did not have
> guarded storage support, but libvirt expands the cpu model according to the
> latest available machine.
> 
> While this prevents future migration abort (by not starting the guest at all),
> not being able to start a "host-model" guest is very much unexpected.  As it
> turns out, even if we would modify libvirt to not expand the cpu model to
> contain "gs" for compat machines, it cannot guarantee that a migration will
> succeed. For example if the kernel changes its features (or the user has
> nested=1 on one host but not on the other) the migration will fail
> nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
> all machine types that support the CPU model. This will make "host-model"
> runnable all the time, while relying on the CPU model to reject invalid
> migration attempts.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  hw/s390x/s390-virtio-ccw.c         | 8 --------
>  include/hw/s390x/s390-virtio-ccw.h | 3 ---
>  target/s390x/kvm.c                 | 2 +-
>  3 files changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index fabe4a6..ae5b01a 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -414,7 +414,6 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
>      s390mc->ri_allowed = true;
>      s390mc->cpu_model_allowed = true;
>      s390mc->css_migration_enabled = true;
> -    s390mc->gs_allowed = true;
>      mc->init = ccw_init;
>      mc->reset = s390_machine_reset;
>      mc->hot_add_cpu = s390_hot_add_cpu;
> @@ -495,12 +494,6 @@ bool cpu_model_allowed(void)
>      return get_machine_class()->cpu_model_allowed;
>  }
> 
> -bool gs_allowed(void)
> -{
> -    /* for "none" machine this results in true */
> -    return get_machine_class()->gs_allowed;
> -}
> -
>  static char *machine_get_loadparm(Object *obj, Error **errp)
>  {
>      S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
> @@ -740,7 +733,6 @@ static void ccw_machine_2_9_class_options(MachineClass *mc)
>  {
>      S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
> 
> -    s390mc->gs_allowed = false;
>      ccw_machine_2_10_class_options(mc);
>      SET_MACHINE_COMPAT(mc, CCW_COMPAT_2_9);
>      s390mc->css_migration_enabled = false;
> diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
> index a9a90c2..ac896e3 100644
> --- a/include/hw/s390x/s390-virtio-ccw.h
> +++ b/include/hw/s390x/s390-virtio-ccw.h
> @@ -40,15 +40,12 @@ typedef struct S390CcwMachineClass {
>      bool ri_allowed;
>      bool cpu_model_allowed;
>      bool css_migration_enabled;
> -    bool gs_allowed;
>  } S390CcwMachineClass;
> 
>  /* runtime-instrumentation allowed by the machine */
>  bool ri_allowed(void);
>  /* cpu model allowed by the machine */
>  bool cpu_model_allowed(void);
> -/* guarded-storage allowed by the machine */
> -bool gs_allowed(void);
> 
>  /**
>   * Returns true if (vmstate based) migration of the channel subsystem
> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> index 4c85ed8..020a7ea 100644
> --- a/target/s390x/kvm.c
> +++ b/target/s390x/kvm.c
> @@ -363,7 +363,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>              cap_ri = 1;
>          }
>      }
> -    if (gs_allowed()) {
> +    if (cpu_model_allowed()) {
>          if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>              cap_gs = 1;
>          }
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by David Hildenbrand 6 years, 6 months ago
On 25.10.2017 12:18, Christian Borntraeger wrote:
> Ping, I plan to submit belows patch for 2.11. We can then still look into
> a libvirt<->qemu interface for limiting host-model depending on machine versions
> (or not).

I think this would be sufficient for now.

Having different host models, depending on the the machine type sounds
wrong. But maybe we'll need it in the future.


-- 

Thanks,

David

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Boris Fiuczynski 6 years, 6 months ago
On 10/25/2017 12:23 PM, David Hildenbrand wrote:
> On 25.10.2017 12:18, Christian Borntraeger wrote:
>> Ping, I plan to submit belows patch for 2.11. We can then still look into
>> a libvirt<->qemu interface for limiting host-model depending on machine versions
>> (or not).
> 
> I think this would be sufficient for now.
> 
> Having different host models, depending on the the machine type sounds
> wrong. But maybe we'll need it in the future.
> 

David, I disagree if your proposal is to generally tolerate new cpu 
features in old machine types. This *might* work for gs but how do you 
guaranty that guests do not behave differently/wrong when suddenly new 
cpu features are made available to guest when (re-)starting them.
That is my feedback for the qemu side of this mater.

Regarding the libvirt side of this:
When looking at https://libvirt.org/formatdomain.html#elementsCPU I 
found the following sentence:
Since the CPU definition is copied just before starting a domain, 
exactly the same XML can be used on different hosts while still 
providing the best guest CPU each host supports.

My interpretation of "the best guest CPU each host supports" is that 
besides limiting factors like hardware, kernel and qemu capabilities the 
requested machine type for the guest is a limiting factor as well.

Nevertheless if my interpretation is found to be incorrect than we 
should think about another new cpu mode that includes the machine type 
into the "best guest CPU" detection.
My assumption is that we must not require the users to know which cpu 
model they manually need to define to match a specific machine type
AND we want to guarantee that guests run without risking any side 
effects by tolerating any additional cpu features.

-- 
Mit freundlichen Grüßen/Kind regards
    Boris Fiuczynski

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martina Köderitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by David Hildenbrand 6 years, 6 months ago
On 25.10.2017 17:09, Boris Fiuczynski wrote:
> On 10/25/2017 12:23 PM, David Hildenbrand wrote:
>> On 25.10.2017 12:18, Christian Borntraeger wrote:
>>> Ping, I plan to submit belows patch for 2.11. We can then still look into
>>> a libvirt<->qemu interface for limiting host-model depending on machine versions
>>> (or not).
>>
>> I think this would be sufficient for now.
>>
>> Having different host models, depending on the the machine type sounds
>> wrong. But maybe we'll need it in the future.
>>
> 
> David, I disagree if your proposal is to generally tolerate new cpu 
> features in old machine types. This *might* work for gs but how do you 
> guaranty that guests do not behave differently/wrong when suddenly new 
> cpu features are made available to guest when (re-)starting them.
> That is my feedback for the qemu side of this mater.

My point would be that it seems to work for all existing architectures
(so far as I am aware) and this one problem is easily fixed (and stems
from old CPU feature compatibility handling). So my question would be,
are there any potential CPU features that make such handling necessary
right now or in the near future?

> 
> Regarding the libvirt side of this:
> When looking at https://libvirt.org/formatdomain.html#elementsCPU I 
> found the following sentence:
> Since the CPU definition is copied just before starting a domain, 
> exactly the same XML can be used on different hosts while still 
> providing the best guest CPU each host supports.
> 
> My interpretation of "the best guest CPU each host supports" is that 
> besides limiting factors like hardware, kernel and qemu capabilities the 
> requested machine type for the guest is a limiting factor as well.

I understand "what the host supports" as combination of hw+kernel+qemu.
But the definition can be interpreted differently. I don't think that
the requested machine has to be taken into account at this point.
(Again, do you have any real examples where this would be applicable?)

> 
> Nevertheless if my interpretation is found to be incorrect than we 
> should think about another new cpu mode that includes the machine type 
> into the "best guest CPU" detection.

Which use case? I just want to understand how the current solution could
be problematic? (besides the problem we had, which is easily fixed)

> My assumption is that we must not require the users to know which cpu 
> model they manually need to define to match a specific machine type
> AND we want to guarantee that guests run without risking any side 
> effects by tolerating any additional cpu features.

That's why I think CPU models should be independent of the used QEMU
machine. It just over complicates things as we have seen.

Especially suddenly having multiple "host" CPU models depending on the
machine type, confusing. If we can, we should keep it simple.

-- 

Thanks,

David

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by David Hildenbrand 6 years, 6 months ago
On 25.10.2017 17:09, Boris Fiuczynski wrote:
> On 10/25/2017 12:23 PM, David Hildenbrand wrote:
>> On 25.10.2017 12:18, Christian Borntraeger wrote:
>>> Ping, I plan to submit belows patch for 2.11. We can then still look into
>>> a libvirt<->qemu interface for limiting host-model depending on machine versions
>>> (or not).
>>
>> I think this would be sufficient for now.
>>
>> Having different host models, depending on the the machine type sounds
>> wrong. But maybe we'll need it in the future.
>>
> 
> David, I disagree if your proposal is to generally tolerate new cpu 
> features in old machine types. This *might* work for gs but how do you 
> guaranty that guests do not behave differently/wrong when suddenly new 
> cpu features are made available to guest when (re-)starting them.
> That is my feedback for the qemu side of this mater.

Just re-reading this section, so what you mean is:

a) guest is started, host model is copied and used. guest works.
b) guest is stopped.
c) QEMU/KVM/HW is updated.
d) guest is started, new host model is copied. guest no longer works.

d) could be because there are now some additional features with e.g.
broken guest implementation or now missing features.


What you propose (if I am not wrong) is a to bind features somehow to a
QEMU machine. I think that should never be done. You could not catch now
missing features.

What would you think about a persistent host-model copy option? So
instead of copying at every start, only copy and replace it once in the XML?

Easy to specify by the user and no CPU feature changes will happend
"blindly".


-- 

Thanks,

David

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Marc Hartmayer 6 years, 6 months ago
On Wed, Oct 25, 2017 at 05:50 PM +0200, David Hildenbrand <david@redhat.com> wrote:
> On 25.10.2017 17:09, Boris Fiuczynski wrote:
>> On 10/25/2017 12:23 PM, David Hildenbrand wrote:
>>> On 25.10.2017 12:18, Christian Borntraeger wrote:
>>>> Ping, I plan to submit belows patch for 2.11. We can then still look into
>>>> a libvirt<->qemu interface for limiting host-model depending on machine versions
>>>> (or not).
>>>
>>> I think this would be sufficient for now.
>>>
>>> Having different host models, depending on the the machine type sounds
>>> wrong. But maybe we'll need it in the future.
>>>
>>
>> David, I disagree if your proposal is to generally tolerate new cpu
>> features in old machine types. This *might* work for gs but how do you
>> guaranty that guests do not behave differently/wrong when suddenly new
>> cpu features are made available to guest when (re-)starting them.
>> That is my feedback for the qemu side of this mater.
>
> Just re-reading this section, so what you mean is:
>
> a) guest is started, host model is copied and used. guest works.
> b) guest is stopped.
> c) QEMU/KVM/HW is updated.
> d) guest is started, new host model is copied. guest no longer works.
>
> d) could be because there are now some additional features with e.g.
> broken guest implementation or now missing features.
>
>
> What you propose (if I am not wrong) is a to bind features somehow to a
> QEMU machine. I think that should never be done. You could not catch now
> missing features.

What exactly do you mean by the last sentence?

>
> What would you think about a persistent host-model copy option? So
> instead of copying at every start, only copy and replace it once in the XML?
>
> Easy to specify by the user and no CPU feature changes will happend
> "blindly".
>
>
> --
>
> Thanks,
>
> David
>
--
Beste Grüße / Kind regards
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by David Hildenbrand 6 years, 6 months ago
On 25.10.2017 18:45, Marc Hartmayer wrote:
> On Wed, Oct 25, 2017 at 05:50 PM +0200, David Hildenbrand <david@redhat.com> wrote:
>> On 25.10.2017 17:09, Boris Fiuczynski wrote:
>>> On 10/25/2017 12:23 PM, David Hildenbrand wrote:
>>>> On 25.10.2017 12:18, Christian Borntraeger wrote:
>>>>> Ping, I plan to submit belows patch for 2.11. We can then still look into
>>>>> a libvirt<->qemu interface for limiting host-model depending on machine versions
>>>>> (or not).
>>>>
>>>> I think this would be sufficient for now.
>>>>
>>>> Having different host models, depending on the the machine type sounds
>>>> wrong. But maybe we'll need it in the future.
>>>>
>>>
>>> David, I disagree if your proposal is to generally tolerate new cpu
>>> features in old machine types. This *might* work for gs but how do you
>>> guaranty that guests do not behave differently/wrong when suddenly new
>>> cpu features are made available to guest when (re-)starting them.
>>> That is my feedback for the qemu side of this mater.
>>
>> Just re-reading this section, so what you mean is:
>>
>> a) guest is started, host model is copied and used. guest works.
>> b) guest is stopped.
>> c) QEMU/KVM/HW is updated.
>> d) guest is started, new host model is copied. guest no longer works.
>>
>> d) could be because there are now some additional features with e.g.
>> broken guest implementation or now missing features.
>>
>>
>> What you propose (if I am not wrong) is a to bind features somehow to a
>> QEMU machine. I think that should never be done. You could not catch now
>> missing features.
> 
> What exactly do you mean by the last sentence?

In general, up/downgrading QEMU/KVM/HW can lead to the removal of features.

Another example is the "nested" flag for KVM. toggling that can lead to
the host feature looking differently (+/- SIE features).

So if you really want to make sure that a VM XML that once ran perfectly
fine on a host will still run after any QEMU/KVM/HW changes on that host:

a) specify an explicit CPU model, not the host model (e.g. "z13")
b) copy the host model to the XML persistently.

Linking any of that to the machine types is in my opinion the very wrong
approach.

> 
>>
>> What would you think about a persistent host-model copy option? So
>> instead of copying at every start, only copy and replace it once in the XML?
>>
>> Easy to specify by the user and no CPU feature changes will happend
>> "blindly".
>>
>>
>> --
>>
>> Thanks,
>>
>> David
>>
> --
> Beste Grüße / Kind regards
>    Marc Hartmayer
> 
> IBM Deutschland Research & Development GmbH
> Vorsitzende des Aufsichtsrats: Martina Koederitz
> Geschäftsführung: Dirk Wittkopp
> Sitz der Gesellschaft: Böblingen
> Registergericht: Amtsgericht Stuttgart, HRB 243294
> 


-- 

Thanks,

David

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Jiri Denemark 6 years, 5 months ago
On Thu, Oct 26, 2017 at 10:09:46 +0200, David Hildenbrand wrote:
> On 25.10.2017 18:45, Marc Hartmayer wrote:
> > On Wed, Oct 25, 2017 at 05:50 PM +0200, David Hildenbrand <david@redhat.com> wrote:
> >> On 25.10.2017 17:09, Boris Fiuczynski wrote:
> >>> David, I disagree if your proposal is to generally tolerate new cpu
> >>> features in old machine types. This *might* work for gs but how do you
> >>> guaranty that guests do not behave differently/wrong when suddenly new
> >>> cpu features are made available to guest when (re-)starting them.
> >>> That is my feedback for the qemu side of this mater.
> >>
> >> Just re-reading this section, so what you mean is:
> >>
> >> a) guest is started, host model is copied and used. guest works.
> >> b) guest is stopped.
> >> c) QEMU/KVM/HW is updated.
> >> d) guest is started, new host model is copied. guest no longer works.
> >>
> >> d) could be because there are now some additional features with e.g.
> >> broken guest implementation or now missing features.
> >>
> >>
> >> What you propose (if I am not wrong) is a to bind features somehow to a
> >> QEMU machine. I think that should never be done. You could not catch now
> >> missing features.
> > 
> > What exactly do you mean by the last sentence?
> 
> In general, up/downgrading QEMU/KVM/HW can lead to the removal of features.
> 
> Another example is the "nested" flag for KVM. toggling that can lead to
> the host feature looking differently (+/- SIE features).
> 
> So if you really want to make sure that a VM XML that once ran perfectly
> fine on a host will still run after any QEMU/KVM/HW changes on that host:
> 
> a) specify an explicit CPU model, not the host model (e.g. "z13")
> b) copy the host model to the XML persistently.
> 
> Linking any of that to the machine types is in my opinion the very wrong
> approach.

I agree, we should do that only if it's really impossible to even create
a machine with a given machine type in combination with some CPU models.
And I believe this is not the case.

The host-model CPU guarantees guest ABI only while a domain is running.
Once it stops and a user starts it again, the ABI seen by the guest OS
may be different and this may sometimes cause the guest OS won't start.
It's pretty similar to what can happen when you change the machine type.
Of course, machine type doesn't change automatically while host-model
CPUs do change. But that's just how host-model is defined. If you don't
want this to happen, you should use a specific CPU model; you can copy
it from domain capabilities XML to make a persistent version of a
host-model CPU.

BTW, using host-model CPU may actually break migration to host with
older QEMU even though an old machine type is used. This is OK since
host-model may bring features which are not supported on the older host
and backward migration is only supported when no new features are used.
However, if a domain with a host-model CPU was started on the old host
and migrated to the new one, migrating it back to the old one is
supported and it will work.

Jirka

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Jason J. Herne 6 years, 6 months ago
On 10/20/2017 10:54 AM, Christian Borntraeger wrote:
> Starting a guest with
>     <os>
>      <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
>    </os>
>    <cpu mode='host-model'/>
> 
> on an IBM z14 results in
> 
> "qemu-system-s390x: Some features requested in the CPU model are not
> available in the configuration: gs"
> 
> This is because guarded storage is fenced for compat machines that did not have
> guarded storage support, but libvirt expands the cpu model according to the
> latest available machine.
> 
> While this prevents future migration abort (by not starting the guest at all),
> not being able to start a "host-model" guest is very much unexpected.  As it
> turns out, even if we would modify libvirt to not expand the cpu model to
> contain "gs" for compat machines, it cannot guarantee that a migration will
> succeed. For example if the kernel changes its features (or the user has
> nested=1 on one host but not on the other) the migration will fail
> nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
> all machine types that support the CPU model. This will make "host-model"
> runnable all the time, while relying on the CPU model to reject invalid
> migration attempts.
...
> -    if (gs_allowed()) {
> +    if (cpu_model_allowed()) {
>           if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>               cap_gs = 1;

Ok, honestly, I dislike this idea because it means we are effectively 
lying now. We will happily accept a +gs cpu model with a 2.9 compat 
machine when the point of compat machines is to mimick the older version 
of Qemu which does not support GS.  Yes, model checking will prevent the 
worst side effects, namely, exploding migrations.

I'd far prefer a solution that makes host-model dependent on the machine 
type. But I understand some of the backlash on that idea. Still, it 
seems like the cleanest approach even if it will be more work.

With all of that said, I know we want this fixed and your patch seems to 
fix the problem. So if you need an R-b...

Reviewed-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>

Can we have a new tag? :-D
Reviewed-by-with-reservations: Jason J. Herne <jjherne@linux.vnet.ibm.com>

-- 
-- Jason J. Herne (jjherne@linux.vnet.ibm.com)

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Halil Pasic 6 years, 5 months ago

On 10/25/2017 08:13 PM, Jason J. Herne wrote:
> On 10/20/2017 10:54 AM, Christian Borntraeger wrote:
>> Starting a guest with
>>     <os>
>>      <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
>>    </os>
>>    <cpu mode='host-model'/>
>>
>> on an IBM z14 results in
>>
>> "qemu-system-s390x: Some features requested in the CPU model are not
>> available in the configuration: gs"
>>
>> This is because guarded storage is fenced for compat machines that did not have
>> guarded storage support, but libvirt expands the cpu model according to the
>> latest available machine.
>>
>> While this prevents future migration abort (by not starting the guest at all),
>> not being able to start a "host-model" guest is very much unexpected.  As it
>> turns out, even if we would modify libvirt to not expand the cpu model to
>> contain "gs" for compat machines, it cannot guarantee that a migration will
>> succeed. For example if the kernel changes its features (or the user has
>> nested=1 on one host but not on the other) the migration will fail
>> nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
>> all machine types that support the CPU model. This will make "host-model"
>> runnable all the time, while relying on the CPU model to reject invalid
>> migration attempts.
> ...
>> -    if (gs_allowed()) {
>> +    if (cpu_model_allowed()) {
>>           if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>>               cap_gs = 1;


@Jason

Hi Jason,

I don't have access to a z14 at the moment, and since you do, I would
like to try out something.

I will first describe my concern, and then the test scenario.

The last line above, cap_gs = 1, has the side effect of returning
true ever after.

int kvm_s390_get_gs(void)                                                       
{                                                                               
    return cap_gs;                                                              
}  

Now considering
static bool gscb_needed(void *opaque)
{
    return kvm_s390_get_gs();
}

const VMStateDescription vmstate_gscb = {
    .name = "cpu/gscb",
    .version_id = 1,
    .minimum_version_id = 1,
    .needed = gscb_needed,
    .fields = (VMStateField[]) {
        VMSTATE_UINT64_ARRAY(env.gscb, S390CPU, 4),
        VMSTATE_END_OF_LIST()
        }
};

const VMStateDescription vmstate_s390_cpu = {
    .name = "cpu",
    .post_load = cpu_post_load,
    .pre_save = cpu_pre_save,
    .version_id = 4,
    .minimum_version_id = 3,
    .fields      = (VMStateField[]) {
        VMSTATE_UINT64_ARRAY(env.regs, S390CPU, 16),
        VMSTATE_UINT64(env.psw.mask, S390CPU),
        VMSTATE_UINT64(env.psw.addr, S390CPU),
        VMSTATE_UINT64(env.psa, S390CPU),
        VMSTATE_UINT32(env.todpr, S390CPU),
        VMSTATE_UINT64(env.pfault_token, S390CPU),
        VMSTATE_UINT64(env.pfault_compare, S390CPU),
        VMSTATE_UINT64(env.pfault_select, S390CPU),
        VMSTATE_UINT64(env.cputm, S390CPU),
        VMSTATE_UINT64(env.ckc, S390CPU),
        VMSTATE_UINT64(env.gbea, S390CPU),
        VMSTATE_UINT64(env.pp, S390CPU),
        VMSTATE_UINT32_ARRAY(env.aregs, S390CPU, 16),
        VMSTATE_UINT64_ARRAY(env.cregs, S390CPU, 16),
        VMSTATE_UINT8(env.cpu_state, S390CPU),
        VMSTATE_UINT8(env.sigp_order, S390CPU),
        VMSTATE_UINT32_V(irqstate_saved_size, S390CPU, 4),
        VMSTATE_VBUFFER_UINT32(irqstate, S390CPU, 4, NULL,
                               irqstate_saved_size),
        VMSTATE_END_OF_LIST()
    },
    .subsections = (const VMStateDescription*[]) {
        &vmstate_fpu,
        &vmstate_vregs,
        &vmstate_riccb,
        &vmstate_exval,
        &vmstate_gscb,
        NULL
    },
};

I would expect the vmstate_gscb subsection being sent, even if gs is disabled
via cpu-model if kernel and possibly machine has gs support (and qemu
has cpu-models).

So the test scenario I want you to play trough is the following. Take
the latest-greatest qemu with this patch applied. Make sure gs works
(is provided to the guest) with a 2.9 machine version, and a fully
specified cpu-model. Now disable gs explicitly.

Try to migrate this to another machine having a 2.9 binary. I expect
the migration failing because, the subsection is going to be sent by
the latest-greatest binary, but is unknown to the 2.9 binary.

Notice this is despite the fact that gs is explicitly disabled.

Now that I think about it, maybe the 2.9 binary is going to reject
the explicit gs flag altogether, because it's unknown.

Isn't this a problem? I'm afraid like this the only migration-safe
variant is -base, but that would essentially make adding features
incrementally impossible. 

But I hypothesize trying to migrate with z13 or even z13-base
would also trigger the unknown subsection problem.

Unfortunately I can't test this because my kernel never
makes kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) return 0, because
I lack HW support in the host.

Regards,
Halil


> 
> Ok, honestly, I dislike this idea because it means we are effectively lying now. We will happily accept a +gs cpu model with a 2.9 compat machine when the point of compat machines is to mimick the older version of Qemu which does not support GS.  Yes, model checking will prevent the worst side effects, namely, exploding migrations.
> 
> I'd far prefer a solution that makes host-model dependent on the machine type. But I understand some of the backlash on that idea. Still, it seems like the cleanest approach even if it will be more work.
> 
> With all of that said, I know we want this fixed and your patch seems to fix the problem. So if you need an R-b...
> 
> Reviewed-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
> 
> Can we have a new tag? :-D
> Reviewed-by-with-reservations: Jason J. Herne <jjherne@linux.vnet.ibm.com>
> 


Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Christian Borntraeger 6 years, 5 months ago

On 10/27/2017 02:31 PM, Halil Pasic wrote:
> 
> 
> On 10/25/2017 08:13 PM, Jason J. Herne wrote:
>> On 10/20/2017 10:54 AM, Christian Borntraeger wrote:
>>> Starting a guest with
>>>     <os>
>>>      <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
>>>    </os>
>>>    <cpu mode='host-model'/>
>>>
>>> on an IBM z14 results in
>>>
>>> "qemu-system-s390x: Some features requested in the CPU model are not
>>> available in the configuration: gs"
>>>
>>> This is because guarded storage is fenced for compat machines that did not have
>>> guarded storage support, but libvirt expands the cpu model according to the
>>> latest available machine.
>>>
>>> While this prevents future migration abort (by not starting the guest at all),
>>> not being able to start a "host-model" guest is very much unexpected.  As it
>>> turns out, even if we would modify libvirt to not expand the cpu model to
>>> contain "gs" for compat machines, it cannot guarantee that a migration will
>>> succeed. For example if the kernel changes its features (or the user has
>>> nested=1 on one host but not on the other) the migration will fail
>>> nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
>>> all machine types that support the CPU model. This will make "host-model"
>>> runnable all the time, while relying on the CPU model to reject invalid
>>> migration attempts.
>> ...
>>> -    if (gs_allowed()) {
>>> +    if (cpu_model_allowed()) {
>>>           if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>>>               cap_gs = 1;
> 
> 
> @Jason
> 
> Hi Jason,
> 
> I don't have access to a z14 at the moment, and since you do, I would
> like to try out something.
> 
> I will first describe my concern, and then the test scenario.
> 
> The last line above, cap_gs = 1, has the side effect of returning
> true ever after.
> 
> int kvm_s390_get_gs(void)                                                       
> {                                                                               
>     return cap_gs;                                                              
> }  
> 
> Now considering
> static bool gscb_needed(void *opaque)
> {
>     return kvm_s390_get_gs();
> }

Yes, we should also replace that with

 return s390_has_feat(S390_FEAT_GUARDED_STORAGE)

I can fixup my patch or provide a 2nd one.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Cornelia Huck 6 years, 5 months ago
On Fri, 27 Oct 2017 14:42:57 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> Yes, we should also replace that with
> 
>  return s390_has_feat(S390_FEAT_GUARDED_STORAGE)
> 
> I can fixup my patch or provide a 2nd one.
> 

Consider a fixed up patch acked by me.

Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Halil Pasic 6 years, 5 months ago

On 10/27/2017 03:31 PM, Cornelia Huck wrote:
> On Fri, 27 Oct 2017 14:42:57 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
>> Yes, we should also replace that with
>>
>>  return s390_has_feat(S390_FEAT_GUARDED_STORAGE)
>>
>> I can fixup my patch or provide a 2nd one.
>>
> 
> Consider a fixed up patch acked by me.
> 

+1 You can keep my ack too. I will try to find some time and read
the v2 though.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [Qemu-devel] [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Christian Borntraeger 6 years, 5 months ago

On 10/27/2017 02:31 PM, Halil Pasic wrote:
gs is explicitly disabled.
> 
> Now that I think about it, maybe the 2.9 binary is going to reject
> the explicit gs flag altogether, because it's unknown.
> 
> Isn't this a problem? 

No. This is exactly the _solution_ and not the problem. The target will reject
unknown cpu features and migration will be aborted. This is exactly what the CPU
model is for.



Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Christian Borntraeger 6 years, 5 months ago

On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
> 
> 
> On 10/27/2017 02:31 PM, Halil Pasic wrote:
> gs is explicitly disabled.
>>
>> Now that I think about it, maybe the 2.9 binary is going to reject
>> the explicit gs flag altogether, because it's unknown.
>>
>> Isn't this a problem? 
> 
> No. This is exactly the _solution_ and not the problem. The target will reject
> unknown cpu features and migration will be aborted. This is exactly what the CPU
> model is for.
FWIW, I think in your particular case the QEMU will reject the z14 cpu and not even come
to checking the gs. 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Halil Pasic 6 years, 5 months ago

On 10/27/2017 02:57 PM, Christian Borntraeger wrote:
> 
> 
> On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
>>
>>
>> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>> gs is explicitly disabled.
>>>
>>> Now that I think about it, maybe the 2.9 binary is going to reject
>>> the explicit gs flag altogether, because it's unknown.
>>>
>>> Isn't this a problem? 
>>
>> No. This is exactly the _solution_ and not the problem. The target will reject
>> unknown cpu features and migration will be aborted. This is exactly what the CPU
>> model is for.

I'm not sure we talk abut the same thing. I'm talking about the following. I
want to disable a cpu-model feature for the sake of migration (because I know
that binary version X does not support the feature, because it does not know
about it). Now if I do it via let's say -cpu z13,gs=off on let's say 2.11,
and start with the exact same command line (-cpu z13,gs=off) on lets say 2.9
my migration will explode because of the unknown feature I'm specifying
not to be used.

Well I'm not sure what I describe is relevant. My thinking is along the lines
some features are added incrementally. How do use those of the features not included
in -base model which both of my environments support and disable those that
are unsupported by one of the environments.

I will think about it some more. I've asked Boris about this situation,
and he did not put my mind at ease (to be more precise he seemed to
see this as a potential problem too), so I've decided to mention it.
Sorry if I've generated some unnecessary noise.

I think the root of the problem is that I don't understand the difference between
z13-base and z13, and the associated rules and expected/intended usages. 

> FWIW, I think in your particular case the QEMU will reject the z14 cpu and not even come
> to checking the gs. 
> 

I had a z13 cpu model in mind. I don't mention a z14 cpu-model (QEMU, not hw) in
my whole email.

Regards,
Halil

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Christian Borntraeger 6 years, 5 months ago

On 10/27/2017 03:40 PM, Halil Pasic wrote:
> 
> 
> On 10/27/2017 02:57 PM, Christian Borntraeger wrote:
>>
>>
>> On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>>> gs is explicitly disabled.
>>>>
>>>> Now that I think about it, maybe the 2.9 binary is going to reject
>>>> the explicit gs flag altogether, because it's unknown.
>>>>
>>>> Isn't this a problem? 
>>>
>>> No. This is exactly the _solution_ and not the problem. The target will reject
>>> unknown cpu features and migration will be aborted. This is exactly what the CPU
>>> model is for.
> 
> I'm not sure we talk abut the same thing. I'm talking about the following. I
> want to disable a cpu-model feature for the sake of migration (because I know
> that binary version X does not support the feature, because it does not know
> about it). Now if I do it via let's say -cpu z13,gs=off on let's say 2.11,
> and start with the exact same command line (-cpu z13,gs=off) on lets say 2.9
> my migration will explode because of the unknown feature I'm specifying
> not to be used.

The migration will be rejected because the target qemu will not startup.
You can easily simulate that, e.g. by doing

qemu-system-s390x -cpu z13,notyetknown=off
qemu-system-s390x: can't apply global z13-s390x-cpu.notyetknown=off: Property '.notyetknown' not found

But libvirt will not use a full model (and the disable things) instead it will
use the base model and add things. (So libvirt should never use xxx=off)


I think this is really not an issue. If you specify a feature that is not known then
QEMU will not start on the target and migration is rejected. The guest continues to run
on the source. So if you specify a "too new" facility yourself its really a user error.
Everything that uses an explicit model (e.g. -cpu z13 or -cpu,sief2=on) will work, but only
as long as the conditions are met. If you specify -cpu z14, it does not matter if it fails
if the kernel or QEMU is too old, or if you just happen to run on a z13.

The  only question is/was:  what is about "host-model".
With my patch (+ the gs fixup) the following things will work:
- host-model will work on z13
- host-model will work on z14 (any machine version)
- host model on z13 and then migrating to z14 will work (any machine version)
- host model on z13 and then migrating to z14 and then migrating back will work (any version)
- qemu with fixup + host model on z14 with machine version 2.10 can be migrated

The only thing that does not work is
- qemu with fixup + host model on z14 with machine 2.9 can not be migrated to qemu 2.9 on z14.

Now: this would have not worked anyway, because qemu 2.9 does not know z14. So in theory 
QEMU must forbit z14 for compat machines (which we do not know).

I talked to several people and it seems that on x86 the host model will also enable new features
that are not known by older QEMUs and its considered works as designed. (see also Jiris mail)


> 
> Well I'm not sure what I describe is relevant. My thinking is along the lines
> some features are added incrementally. How do use those of the features not included
> in -base model which both of my environments support and disable those that
> are unsupported by one of the environments.
> 
> I will think about it some more. I've asked Boris about this situation,
> and he did not put my mind at ease (to be more precise he seemed to
> see this as a potential problem too), so I've decided to mention it.
> Sorry if I've generated some unnecessary noise.
> 
> I think the root of the problem is that I don't understand the difference between
> z13-base and z13, and the associated rules and expected/intended usages. 

z13-base contains only those features that a guaranteed to be there (there is
the list of non-hypervisor managed features). z13 is z13-base + all features that
will be available in a reasonably recent kernel+qemu combination and make sense
to be there a default. So it might happen that you cannot start -cpu z14, e.g. 
if you run on a kernel < 4.12.

 
>> FWIW, I think in your particular case the QEMU will reject the z14 cpu and not even come
>> to checking the gs. 
>>
> 
> I had a z13 cpu model in mind. I don't mention a z14 cpu-model (QEMU, not hw) in
> my whole email.
> 
> Regards,
> Halil
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Halil Pasic 6 years, 5 months ago

On 10/27/2017 04:06 PM, Christian Borntraeger wrote:
> 
> 
> On 10/27/2017 03:40 PM, Halil Pasic wrote:
>>
>>
>> On 10/27/2017 02:57 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
>>>>
>>>>
>>>> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>>>> gs is explicitly disabled.
>>>>>
>>>>> Now that I think about it, maybe the 2.9 binary is going to reject
>>>>> the explicit gs flag altogether, because it's unknown.
>>>>>
>>>>> Isn't this a problem? 
>>>>
>>>> No. This is exactly the _solution_ and not the problem. The target will reject
>>>> unknown cpu features and migration will be aborted. This is exactly what the CPU
>>>> model is for.
>>
>> I'm not sure we talk abut the same thing. I'm talking about the following. I
>> want to disable a cpu-model feature for the sake of migration (because I know
>> that binary version X does not support the feature, because it does not know
>> about it). Now if I do it via let's say -cpu z13,gs=off on let's say 2.11,
>> and start with the exact same command line (-cpu z13,gs=off) on lets say 2.9
>> my migration will explode because of the unknown feature I'm specifying
>> not to be used.
> 
> The migration will be rejected because the target qemu will not startup.
> You can easily simulate that, e.g. by doing
> 
> qemu-system-s390x -cpu z13,notyetknown=off
> qemu-system-s390x: can't apply global z13-s390x-cpu.notyetknown=off: Property '.notyetknown' not found
> 
> But libvirt will not use a full model (and the disable things) instead it will
> use the base model and add things. (So libvirt should never use xxx=off)
> 

That piece of the puzzle was missing for me (no xxx=off for M minus M-base
features).

> 
> I think this is really not an issue. If you specify a feature that is not known then
> QEMU will not start on the target and migration is rejected. The guest continues to run
> on the source. So if you specify a "too new" facility yourself its really a user error.
> Everything that uses an explicit model (e.g. -cpu z13 or -cpu,sief2=on) will work, but only
> as long as the conditions are met. If you specify -cpu z14, it does not matter if it fails
> if the kernel or QEMU is too old, or if you just happen to run on a z13.
> 
> The  only question is/was:  what is about "host-model".
> With my patch (+ the gs fixup) the following things will work:
> - host-model will work on z13
> - host-model will work on z14 (any machine version)
> - host model on z13 and then migrating to z14 will work (any machine version)
> - host model on z13 and then migrating to z14 and then migrating back will work (any version)
> - qemu with fixup + host model on z14 with machine version 2.10 can be migrated
> 
> The only thing that does not work is
> - qemu with fixup + host model on z14 with machine 2.9 can not be migrated to qemu 2.9 on z14.
> 


I agree.

> Now: this would have not worked anyway, because qemu 2.9 does not know z14. So in theory 
> QEMU must forbit z14 for compat machines (which we do not know).>

Noted.
 
> I talked to several people and it seems that on x86 the host model will also enable new features
> that are not known by older QEMUs and its considered works as designed. (see also Jiris mail)
> 

Yes, I've seen that. It would be nice though if this design was easier to
find in written. Unfortunately I can read minds only to a very limited extent,
and the written stuff I've read did not give me a full understanding of the
design -- although the entity to blame for this could be my limited intellect. 

> 
>>
>> Well I'm not sure what I describe is relevant. My thinking is along the lines
>> some features are added incrementally. How do use those of the features not included
>> in -base model which both of my environments support and disable those that
>> are unsupported by one of the environments.
>>
>> I will think about it some more. I've asked Boris about this situation,
>> and he did not put my mind at ease (to be more precise he seemed to
>> see this as a potential problem too), so I've decided to mention it.
>> Sorry if I've generated some unnecessary noise.
>>
>> I think the root of the problem is that I don't understand the difference between
>> z13-base and z13, and the associated rules and expected/intended usages. 
> 
> z13-base contains only those features that a guaranteed to be there (there is
> the list of non-hypervisor managed features). z13 is z13-base + all features that
> will be available in a reasonably recent kernel+qemu combination and make sense
> to be there a default. So it might happen that you cannot start -cpu z14, e.g. 
> if you run on a kernel < 4.12.
> 

OK. I will have to learn more about this. IMHO it does not make sense
to burden the community with the holes in my knowledge any more, but I will
have to stuff them to feel comfortable in this area.

Regards,
Halil

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Jiri Denemark 6 years, 5 months ago
On Fri, Oct 27, 2017 at 17:18:44 +0200, Halil Pasic wrote:
> On 10/27/2017 04:06 PM, Christian Borntraeger wrote:
...
> > I talked to several people and it seems that on x86 the host model will also enable new features
> > that are not known by older QEMUs and its considered works as designed. (see also Jiris mail)
> 
> Yes, I've seen that. It would be nice though if this design was easier to
> find in written. Unfortunately I can read minds only to a very limited extent,
> and the written stuff I've read did not give me a full understanding of the
> design -- although the entity to blame for this could be my limited intellect.

I think it's more likely the documentation is just not perfect. I'll
look at it and try to make it better. I know about some parts which need
to be clarified thanks to this discussion.

Jirka

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [Qemu-devel] [libvirt] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Halil Pasic 6 years, 5 months ago

On 10/27/2017 07:12 PM, Jiri Denemark wrote:
> On Fri, Oct 27, 2017 at 17:18:44 +0200, Halil Pasic wrote:
>> On 10/27/2017 04:06 PM, Christian Borntraeger wrote:
> ...
>>> I talked to several people and it seems that on x86 the host model will also enable new features
>>> that are not known by older QEMUs and its considered works as designed. (see also Jiris mail)
>>
>> Yes, I've seen that. It would be nice though if this design was easier to
>> find in written. Unfortunately I can read minds only to a very limited extent,
>> and the written stuff I've read did not give me a full understanding of the
>> design -- although the entity to blame for this could be my limited intellect.
> 
> I think it's more likely the documentation is just not perfect. I'll
> look at it and try to make it better. I know about some parts which need
> to be clarified thanks to this discussion.
> 
> Jirka
> 

Feel free to put me on cc for the patch. I would be happy to give it a
look, but since I'm not actively reading the libvirt list, I'm afraid it will
slip past me if I'm not in cc.

Regards,
Halil


Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Halil Pasic 6 years, 6 months ago

On 10/20/2017 04:54 PM, Christian Borntraeger wrote:
> Starting a guest with
>    <os>
>     <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
>   </os>
>   <cpu mode='host-model'/>
> 
> on an IBM z14 results in
> 
> "qemu-system-s390x: Some features requested in the CPU model are not
> available in the configuration: gs"
> 
> This is because guarded storage is fenced for compat machines that did not have
> guarded storage support, but libvirt expands the cpu model according to the
> latest available machine.
> 
> While this prevents future migration abort (by not starting the guest at all),
> not being able to start a "host-model" guest is very much unexpected.  As it
> turns out, even if we would modify libvirt to not expand the cpu model to
> contain "gs" for compat machines, it cannot guarantee that a migration will
> succeed. For example if the kernel changes its features (or the user has
> nested=1 on one host but not on the other) the migration will fail
> nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
> all machine types that support the CPU model. This will make "host-model"
> runnable all the time, while relying on the CPU model to reject invalid
> migration attempts.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

I've tried to review this patch. Unfortunately I don't have access to a
z14, so I could not try the most interesting scenarios out.

The idea of the patch is very clear, but I don't understand the bigger gs
feature context fully.

>From what I read in the code, the attempt to enable the gs capability in
the kernel is made regardless of the cpu model. If the attempt was
successful kvm_s390_get_gs will keep returning true. That would in turn
affect migration, as far as I see (usages of kvm_s390_get_gs). I could
not figure out how does gs being turned off via cpu-model (-cpu
z14,gs=off) does turn of gs support -- at least not the details. I wanted
to give a timely review, so I've limited myself there. 

>From what I see, this patch does what it advertises, and since I think
it's the right thing to do in the current situation I gonna give it an:
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>

At the same time, I would prefer the commit message being reworded. IMHO
this patch is a good stop-gap measure, but essentially it trades an
annoying and obvious bug for a subtle and hopefully painless one.

Let me explain this last statement. For starters, I  do share some of the
concerns Boris has voiced.  I won't repeat those. Same goes for the
example Christian paraphrased previously, and the the fear of an implicit
requirement for having to support a Cartesian product of the advertised
machine-types and cpu-models (for each qemu binary).

In my eyes, a cpu isn't all that different from the other devices which
get attached to a board (represented by machine-type). So I don't see why
should it be exempt from the usual compatibility requirements tied to
machine-types (for the sake of stability and compatibility). (I basically
mean: no new features are added to a device in the context of a given
(fully qualified) machine-type (with new QEMU -- binary -- versions). As
far as I understand all (other) devices shall respect these requirements.
Or am I wrong? If I'm not, please enlighten me, how is a CPU
fundamentally different than let's say a FLIC.

A related thing is, that to implement some features indicated/controlled
via cpu-model features, we need to add capabilities to certain devices.
Now if the devices shall obey the 'no new features for the same
machine-type' rule, but the cpu-model feature shall obey our new
'retroactively introduce/enable for all machines supporting cpu-models'
rule, I think we have a conflict.  An example for what I'm talking about
is zPCI, AIS and FLIC. In case of the AIS and the FLIC, AFAIK the
conflict was resolved so that the AIS feature/code of the FLIC is not
subject to usual compat-macro mechanism. Another example, AP facility in
not just about the CPU instructions, but also about a device.

It's also true for the last paragraph: I might very well be wrong, and if
I am please do tell (where is the hole in  my reasoning). I will try to
re-check my statements tomorrow -- again trying to deliver today along
the lines better a small bird today than a big one tomorrow).

And another question. If we adopt this introducing features for
machine-types retroactively, how should the machine-type versions be
understood like? My current understanding is, that the machine-type
(version) is supposed to limit the observable changes when upgrading
the binary to the bare minimum (basically possibly modified timings -- which
can't be avoided -- and bug-fixes) for the sake of updating the
binary being as safe as possible.

Last but not least, I have to say, I'm neither an domain expert for
cpu-models nor for libvirt and it's models. For that reason, I've
personally asked Jason to do a more detailed review on this -- and am
hoping to wiggle out with an weak ack. I do intend to keep on following
the discussion (despite of not feeling to be entitled to make any calls)
and contribute where I can.

Regards,
Halil

 

> ---
>  hw/s390x/s390-virtio-ccw.c         | 8 --------
>  include/hw/s390x/s390-virtio-ccw.h | 3 ---
>  target/s390x/kvm.c                 | 2 +-
>  3 files changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index fabe4a6..ae5b01a 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -414,7 +414,6 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
>      s390mc->ri_allowed = true;
>      s390mc->cpu_model_allowed = true;
>      s390mc->css_migration_enabled = true;
> -    s390mc->gs_allowed = true;
>      mc->init = ccw_init;
>      mc->reset = s390_machine_reset;
>      mc->hot_add_cpu = s390_hot_add_cpu;
> @@ -495,12 +494,6 @@ bool cpu_model_allowed(void)
>      return get_machine_class()->cpu_model_allowed;
>  }
> 
> -bool gs_allowed(void)
> -{
> -    /* for "none" machine this results in true */
> -    return get_machine_class()->gs_allowed;
> -}
> -
>  static char *machine_get_loadparm(Object *obj, Error **errp)
>  {
>      S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
> @@ -740,7 +733,6 @@ static void ccw_machine_2_9_class_options(MachineClass *mc)
>  {
>      S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
> 
> -    s390mc->gs_allowed = false;
>      ccw_machine_2_10_class_options(mc);
>      SET_MACHINE_COMPAT(mc, CCW_COMPAT_2_9);
>      s390mc->css_migration_enabled = false;
> diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
> index a9a90c2..ac896e3 100644
> --- a/include/hw/s390x/s390-virtio-ccw.h
> +++ b/include/hw/s390x/s390-virtio-ccw.h
> @@ -40,15 +40,12 @@ typedef struct S390CcwMachineClass {
>      bool ri_allowed;
>      bool cpu_model_allowed;
>      bool css_migration_enabled;
> -    bool gs_allowed;
>  } S390CcwMachineClass;
> 
>  /* runtime-instrumentation allowed by the machine */
>  bool ri_allowed(void);
>  /* cpu model allowed by the machine */
>  bool cpu_model_allowed(void);
> -/* guarded-storage allowed by the machine */
> -bool gs_allowed(void);
> 
>  /**
>   * Returns true if (vmstate based) migration of the channel subsystem
> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> index 4c85ed8..020a7ea 100644
> --- a/target/s390x/kvm.c
> +++ b/target/s390x/kvm.c
> @@ -363,7 +363,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>              cap_ri = 1;
>          }
>      }
> -    if (gs_allowed()) {
> +    if (cpu_model_allowed()) {
>          if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>              cap_gs = 1;
>          }
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Christian Borntraeger 6 years, 6 months ago

On 10/26/2017 01:35 AM, Halil Pasic wrote:
 try the most interesting scenarios out.
> 
> The idea of the patch is very clear, but I don't understand the bigger gs
> feature context fully.
> 
> From what I read in the code, the attempt to enable the gs capability in
> the kernel is made regardless of the cpu model. If the attempt was
> successful kvm_s390_get_gs will keep returning true. That would in turn
> affect migration, as far as I see (usages of kvm_s390_get_gs). I could
> not figure out how does gs being turned off via cpu-model (-cpu
> z14,gs=off) does turn of gs support -- at least not the details. I wanted
> to give a timely review, so I've limited myself there. 

When the CPU model turns off gs, facility bit 133 will be turned off.
Then the kernel does the right thing, see priv.c handle_gs.

guarded storage is enabled lazily. Whenever the guest issues a Gs instruction
we will get an exit and only enable GS if facility 133 is set.

> 
> From what I see, this patch does what it advertises, and since I think
> it's the right thing to do in the current situation I gonna give it an:
> Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> 
> At the same time, I would prefer the commit message being reworded. IMHO
> this patch is a good stop-gap measure, but essentially it trades an
> annoying and obvious bug for a subtle and hopefully painless one.
> 
> Let me explain this last statement. For starters, I  do share some of the
> concerns Boris has voiced.  I won't repeat those. Same goes for the
> example Christian paraphrased previously, and the the fear of an implicit
> requirement for having to support a Cartesian product of the advertised
> machine-types and cpu-models (for each qemu binary).

I will try to come up with a patch description that explains the open issues
and it will tell that additional might or might not be necessary depending
on followup discussions.
I will schedule this patch for 2.11 then.


Re: [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Halil Pasic 6 years, 6 months ago

On 10/26/2017 10:13 AM, Christian Borntraeger wrote:
> 
> 
> On 10/26/2017 01:35 AM, Halil Pasic wrote:
>  try the most interesting scenarios out.
>>
>> The idea of the patch is very clear, but I don't understand the bigger gs
>> feature context fully.
>>
>> From what I read in the code, the attempt to enable the gs capability in
>> the kernel is made regardless of the cpu model. If the attempt was
>> successful kvm_s390_get_gs will keep returning true. That would in turn
>> affect migration, as far as I see (usages of kvm_s390_get_gs). I could
>> not figure out how does gs being turned off via cpu-model (-cpu
>> z14,gs=off) does turn of gs support -- at least not the details. I wanted
>> to give a timely review, so I've limited myself there. 
> 
> When the CPU model turns off gs, facility bit 133 will be turned off.
> Then the kernel does the right thing, see priv.c handle_gs.
> 
> guarded storage is enabled lazily. Whenever the guest issues a Gs instruction
> we will get an exit and only enable GS if facility 133 is set.
> 
>>
>> From what I see, this patch does what it advertises, and since I think
>> it's the right thing to do in the current situation I gonna give it an:
>> Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>>
>> At the same time, I would prefer the commit message being reworded. IMHO
>> this patch is a good stop-gap measure, but essentially it trades an
>> annoying and obvious bug for a subtle and hopefully painless one.
>>
>> Let me explain this last statement. For starters, I  do share some of the
>> concerns Boris has voiced.  I won't repeat those. Same goes for the
>> example Christian paraphrased previously, and the the fear of an implicit
>> requirement for having to support a Cartesian product of the advertised
>> machine-types and cpu-models (for each qemu binary).
> 
> I will try to come up with a patch description that explains the open issues
> and it will tell that additional might or might not be necessary depending
> on followup discussions.

I would be already happy with adding something like:
During the discussion enabling cpu-model features for preexisting
machine-types came out as at least controversial. We agreed to implement
this patch as a stop-gap measure for the next release. What is a clean
enough solution shall be decided without time pressure.

> I will schedule this patch for 2.11 then.
> 

Sounds like a plan.

Cheers,
Halil

> 


Re: [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
Posted by Cornelia Huck 6 years, 6 months ago
On Fri, 20 Oct 2017 16:54:37 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> Starting a guest with
>    <os>
>     <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
>   </os>
>   <cpu mode='host-model'/>
> 
> on an IBM z14 results in
> 
> "qemu-system-s390x: Some features requested in the CPU model are not
> available in the configuration: gs"
> 
> This is because guarded storage is fenced for compat machines that did not have
> guarded storage support, but libvirt expands the cpu model according to the
> latest available machine.
> 
> While this prevents future migration abort (by not starting the guest at all),
> not being able to start a "host-model" guest is very much unexpected.  As it
> turns out, even if we would modify libvirt to not expand the cpu model to
> contain "gs" for compat machines, it cannot guarantee that a migration will
> succeed. For example if the kernel changes its features (or the user has
> nested=1 on one host but not on the other) the migration will fail
> nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
> all machine types that support the CPU model. This will make "host-model"
> runnable all the time, while relying on the CPU model to reject invalid
> migration attempts.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  hw/s390x/s390-virtio-ccw.c         | 8 --------
>  include/hw/s390x/s390-virtio-ccw.h | 3 ---
>  target/s390x/kvm.c                 | 2 +-
>  3 files changed, 1 insertion(+), 12 deletions(-)

Acked-by: Cornelia Huck <cohuck@redhat.com>