[PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group

Matthew Rosato posted 4 patches 4 years, 2 months ago
Maintainers: David Hildenbrand <david@redhat.com>, Eric Farman <farman@linux.ibm.com>, Richard Henderson <richard.henderson@linaro.org>, Cornelia Huck <cohuck@redhat.com>, Thomas Huth <thuth@redhat.com>, Christian Borntraeger <borntraeger@de.ibm.com>, Halil Pasic <pasic@linux.ibm.com>, Matthew Rosato <mjrosato@linux.ibm.com>
There is a newer version of this series
[PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by Matthew Rosato 4 years, 2 months ago
The current default PCI group being used can technically collide with a
real group ID passed from a hostdev.  Let's instead use a group ID that comes
from a special pool that is architected to be reserved for simulated devices.

Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 include/hw/s390x/s390-pci-bus.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index aa891c178d..2727e7bdef 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -313,7 +313,7 @@ typedef struct ZpciFmb {
 } ZpciFmb;
 QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
 
-#define ZPCI_DEFAULT_FN_GRP 0x20
+#define ZPCI_DEFAULT_FN_GRP 0xFF
 typedef struct S390PCIGroup {
     ClpRspQueryPciGrp zpci_group;
     int id;
-- 
2.27.0


Re: [PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by David Hildenbrand 4 years, 2 months ago
On 02.12.21 17:41, Matthew Rosato wrote:
> The current default PCI group being used can technically collide with a
> real group ID passed from a hostdev.  Let's instead use a group ID that comes
> from a special pool that is architected to be reserved for simulated devices.
> 
> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  include/hw/s390x/s390-pci-bus.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
> index aa891c178d..2727e7bdef 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>  } ZpciFmb;
>  QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
>  
> -#define ZPCI_DEFAULT_FN_GRP 0x20
> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>  typedef struct S390PCIGroup {
>      ClpRspQueryPciGrp zpci_group;
>      int id;
> 

What happens if we migrate a VM from old to new QEMU? Won't the guest be
able to observe the change?

-- 
Thanks,

David / dhildenb


Re: [PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by Matthew Rosato 4 years, 2 months ago
On 12/2/21 11:43 AM, David Hildenbrand wrote:
> On 02.12.21 17:41, Matthew Rosato wrote:
>> The current default PCI group being used can technically collide with a
>> real group ID passed from a hostdev.  Let's instead use a group ID that comes
>> from a special pool that is architected to be reserved for simulated devices.
>>
>> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   include/hw/s390x/s390-pci-bus.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
>> index aa891c178d..2727e7bdef 100644
>> --- a/include/hw/s390x/s390-pci-bus.h
>> +++ b/include/hw/s390x/s390-pci-bus.h
>> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>>   } ZpciFmb;
>>   QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
>>   
>> -#define ZPCI_DEFAULT_FN_GRP 0x20
>> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>>   typedef struct S390PCIGroup {
>>       ClpRspQueryPciGrp zpci_group;
>>       int id;
>>
> 
> What happens if we migrate a VM from old to new QEMU? Won't the guest be
> able to observe the change?
> 

Yes, technically --  But # itself is not really all that important, it 
is provided from CLP Q PCI FN to be subsequently used as input into Q 
PCI FNGRP -- With the fundamental notion being that all functions that 
share the same group # share the same group CLP info.  Whether the 
number is, say, 1 or 5 doesn't matter so much.

However..  0xF0 and greater are the only values reserved for hypervisor 
use.  By using 0x20 we run the risk of accidentally conflating simulated 
devices and real hardware, hence the desire to change it.

Is your concern about a migrated guest with a virtio device trying to do 
a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could 
modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch 
simulated devices trying to use something other than the default group, 
e.g.:

if ((pbdev->fh & FH_SHM_EMUL) &&
     (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
         /* Simulated device MUST have default group */
	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
}

What do you think?

Re: [PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by David Hildenbrand 4 years, 2 months ago
On 02.12.21 18:11, Matthew Rosato wrote:
> On 12/2/21 11:43 AM, David Hildenbrand wrote:
>> On 02.12.21 17:41, Matthew Rosato wrote:
>>> The current default PCI group being used can technically collide with a
>>> real group ID passed from a hostdev.  Let's instead use a group ID that comes
>>> from a special pool that is architected to be reserved for simulated devices.
>>>
>>> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>   include/hw/s390x/s390-pci-bus.h | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
>>> index aa891c178d..2727e7bdef 100644
>>> --- a/include/hw/s390x/s390-pci-bus.h
>>> +++ b/include/hw/s390x/s390-pci-bus.h
>>> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>>>   } ZpciFmb;
>>>   QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
>>>   
>>> -#define ZPCI_DEFAULT_FN_GRP 0x20
>>> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>>>   typedef struct S390PCIGroup {
>>>       ClpRspQueryPciGrp zpci_group;
>>>       int id;
>>>
>>
>> What happens if we migrate a VM from old to new QEMU? Won't the guest be
>> able to observe the change?
>>
> 
> Yes, technically --  But # itself is not really all that important, it 
> is provided from CLP Q PCI FN to be subsequently used as input into Q 
> PCI FNGRP -- With the fundamental notion being that all functions that 
> share the same group # share the same group CLP info.  Whether the 
> number is, say, 1 or 5 doesn't matter so much.
> 
> However..  0xF0 and greater are the only values reserved for hypervisor 
> use.  By using 0x20 we run the risk of accidentally conflating simulated 
> devices and real hardware, hence the desire to change it.
> 
> Is your concern about a migrated guest with a virtio device trying to do 
> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could 
> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch 
> simulated devices trying to use something other than the default group, 
> e.g.:
> 
> if ((pbdev->fh & FH_SHM_EMUL) &&
>      (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>          /* Simulated device MUST have default group */
> 	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
> 	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
> }
> 
> What do you think?
> 

Good question, I'm certainly not a zPCI expert. However if you think
that we cannot really break migration in a sane use case, I'm fine with
it as well :)

The question about breaking migration just popped up in my head when I
stumbled over this patch.

-- 
Thanks,

David / dhildenb


Re: [PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by Halil Pasic 4 years, 2 months ago
On Thu, 2 Dec 2021 12:11:38 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> > 
> > What happens if we migrate a VM from old to new QEMU? Won't the guest be
> > able to observe the change?
> >   
> 
> Yes, technically --  But # itself is not really all that important, it 
> is provided from CLP Q PCI FN to be subsequently used as input into Q 
> PCI FNGRP -- With the fundamental notion being that all functions that 
> share the same group # share the same group CLP info.  Whether the 
> number is, say, 1 or 5 doesn't matter so much.
> 
> However..  0xF0 and greater are the only values reserved for hypervisor 
> use.  By using 0x20 we run the risk of accidentally conflating simulated 
> devices and real hardware, hence the desire to change it.
> 
> Is your concern about a migrated guest with a virtio device trying to do 
> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could 
> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch 
> simulated devices trying to use something other than the default group, 
> e.g.:
> 
> if ((pbdev->fh & FH_SHM_EMUL) &&
>      (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>          /* Simulated device MUST have default group */
> 	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
> 	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
> }
> 
> What do you think?

Another option, and in my opinion the cleaner one would be to tie this
change to a new machine version. That is if a post-change qemu is used
in compatibility mode, we would still have the old behavior.

What do you think?

Regards,
Halil

Re: [PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by Matthew Rosato 4 years, 2 months ago
On 12/2/21 6:06 PM, Halil Pasic wrote:
> On Thu, 2 Dec 2021 12:11:38 -0500
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>>>
>>> What happens if we migrate a VM from old to new QEMU? Won't the guest be
>>> able to observe the change?
>>>    
>>
>> Yes, technically --  But # itself is not really all that important, it
>> is provided from CLP Q PCI FN to be subsequently used as input into Q
>> PCI FNGRP -- With the fundamental notion being that all functions that
>> share the same group # share the same group CLP info.  Whether the
>> number is, say, 1 or 5 doesn't matter so much.
>>
>> However..  0xF0 and greater are the only values reserved for hypervisor
>> use.  By using 0x20 we run the risk of accidentally conflating simulated
>> devices and real hardware, hence the desire to change it.
>>
>> Is your concern about a migrated guest with a virtio device trying to do
>> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could
>> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch
>> simulated devices trying to use something other than the default group,
>> e.g.:
>>
>> if ((pbdev->fh & FH_SHM_EMUL) &&
>>       (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>>           /* Simulated device MUST have default group */
>> 	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
>> 	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
>> }
>>
>> What do you think?
> 
> Another option, and in my opinion the cleaner one would be to tie this
> change to a new machine version. That is if a post-change qemu is used
> in compatibility mode, we would still have the old behavior.
> 
> What do you think?
> 

The problem there is that the old behavior goes against the architecture 
(group 0x20 could belong to real hardware) and AFAIU assigning this new 
behavior only to a new machine version means we can't fix old stable 
QEMU versions.

Also, wait a minute -- migration isn't even an option right now, it's 
blocked for zpci devices, both passthrough and simulated (see 
aede5d5dfc5f 's390x/pci: mark zpci devices as unmigratable') so I say 
let's just move to a proper default group now before we potentially 
allow migration later.

Re: [PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by Pierre Morel 4 years, 2 months ago

On 12/3/21 03:25, Matthew Rosato wrote:
> On 12/2/21 6:06 PM, Halil Pasic wrote:
>> On Thu, 2 Dec 2021 12:11:38 -0500
>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>
>>>>
>>>> What happens if we migrate a VM from old to new QEMU? Won't the 
>>>> guest be
>>>> able to observe the change?
>>>
>>> Yes, technically --  But # itself is not really all that important, it
>>> is provided from CLP Q PCI FN to be subsequently used as input into Q
>>> PCI FNGRP -- With the fundamental notion being that all functions that
>>> share the same group # share the same group CLP info.  Whether the
>>> number is, say, 1 or 5 doesn't matter so much.
>>>
>>> However..  0xF0 and greater are the only values reserved for hypervisor
>>> use.  By using 0x20 we run the risk of accidentally conflating simulated
>>> devices and real hardware, hence the desire to change it.
>>>
>>> Is your concern about a migrated guest with a virtio device trying to do
>>> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could
>>> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch
>>> simulated devices trying to use something other than the default group,
>>> e.g.:
>>>
>>> if ((pbdev->fh & FH_SHM_EMUL) &&
>>>       (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>>>           /* Simulated device MUST have default group */
>>>     pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
>>>     group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
>>> }
>>>
>>> What do you think?
>>
>> Another option, and in my opinion the cleaner one would be to tie this
>> change to a new machine version. That is if a post-change qemu is used
>> in compatibility mode, we would still have the old behavior.
>>
>> What do you think?
>>
> 
> The problem there is that the old behavior goes against the architecture 
> (group 0x20 could belong to real hardware) and AFAIU assigning this new 
> behavior only to a new machine version means we can't fix old stable 
> QEMU versions.
> 
> Also, wait a minute -- migration isn't even an option right now, it's 
> blocked for zpci devices, both passthrough and simulated (see 
> aede5d5dfc5f 's390x/pci: mark zpci devices as unmigratable') so I say 
> let's just move to a proper default group now before we potentially 
> allow migration later.

Looks right to me.

-- 
Pierre Morel
IBM Lab Boeblingen

Re: [PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by David Hildenbrand 4 years, 2 months ago
On 03.12.21 03:25, Matthew Rosato wrote:
> On 12/2/21 6:06 PM, Halil Pasic wrote:
>> On Thu, 2 Dec 2021 12:11:38 -0500
>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>
>>>>
>>>> What happens if we migrate a VM from old to new QEMU? Won't the guest be
>>>> able to observe the change?
>>>>    
>>>
>>> Yes, technically --  But # itself is not really all that important, it
>>> is provided from CLP Q PCI FN to be subsequently used as input into Q
>>> PCI FNGRP -- With the fundamental notion being that all functions that
>>> share the same group # share the same group CLP info.  Whether the
>>> number is, say, 1 or 5 doesn't matter so much.
>>>
>>> However..  0xF0 and greater are the only values reserved for hypervisor
>>> use.  By using 0x20 we run the risk of accidentally conflating simulated
>>> devices and real hardware, hence the desire to change it.
>>>
>>> Is your concern about a migrated guest with a virtio device trying to do
>>> a CLP QUERY PCI FNGRP using 0x20 on a new QEMU?  I suppose we could
>>> modify 'clp_service_call, case CLP_QUERY_PCI_FNGRP' to silently catch
>>> simulated devices trying to use something other than the default group,
>>> e.g.:
>>>
>>> if ((pbdev->fh & FH_SHM_EMUL) &&
>>>       (pbdev->zpci_fn.pfgid != ZPCI_DEFAULT_FN_GRP)) {
>>>           /* Simulated device MUST have default group */
>>> 	pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
>>> 	group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
>>> }
>>>
>>> What do you think?
>>
>> Another option, and in my opinion the cleaner one would be to tie this
>> change to a new machine version. That is if a post-change qemu is used
>> in compatibility mode, we would still have the old behavior.
>>
>> What do you think?
>>
> 
> The problem there is that the old behavior goes against the architecture 
> (group 0x20 could belong to real hardware) and AFAIU assigning this new 
> behavior only to a new machine version means we can't fix old stable 
> QEMU versions.
> 
> Also, wait a minute -- migration isn't even an option right now, it's 
> blocked for zpci devices, both passthrough and simulated (see 
> aede5d5dfc5f 's390x/pci: mark zpci devices as unmigratable') so I say 
> let's just move to a proper default group now before we potentially 
> allow migration later.

Perfect, thanks for confirming!


-- 
Thanks,

David / dhildenb


Re: [PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by Eric Farman 4 years, 2 months ago
On Thu, 2021-12-02 at 11:41 -0500, Matthew Rosato wrote:
> The current default PCI group being used can technically collide with
> a
> real group ID passed from a hostdev.  Let's instead use a group ID
> that comes
> from a special pool that is architected to be reserved for simulated
> devices.
> 
> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Regardless of the question regarding virtio migration, this is good.

Reviewed-by: Eric Farman <farman@linux.ibm.com>

> ---
>  include/hw/s390x/s390-pci-bus.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-
> pci-bus.h
> index aa891c178d..2727e7bdef 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>  } ZpciFmb;
>  QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in
> ZpciFmb");
>  
> -#define ZPCI_DEFAULT_FN_GRP 0x20
> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>  typedef struct S390PCIGroup {
>      ClpRspQueryPciGrp zpci_group;
>      int id;


Re: [PATCH 1/4] s390x/pci: use a reserved ID for the default PCI group
Posted by Pierre Morel 4 years, 2 months ago

On 12/2/21 17:41, Matthew Rosato wrote:
> The current default PCI group being used can technically collide with a
> real group ID passed from a hostdev.  Let's instead use a group ID that comes
> from a special pool that is architected to be reserved for simulated devices.

NIT: May be add that PCIFG between 0xF0 and 0xFF is specified for this 
reserved pool.


> 
> Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   include/hw/s390x/s390-pci-bus.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
> index aa891c178d..2727e7bdef 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -313,7 +313,7 @@ typedef struct ZpciFmb {
>   } ZpciFmb;
>   QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
>   
> -#define ZPCI_DEFAULT_FN_GRP 0x20
> +#define ZPCI_DEFAULT_FN_GRP 0xFF
>   typedef struct S390PCIGroup {
>       ClpRspQueryPciGrp zpci_group;
>       int id;
> 

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>


-- 
Pierre Morel
IBM Lab Boeblingen