[RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue

Philippe Mathieu-Daudé posted 1 patch 3 years, 5 months ago
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20201117120115.1234994-1-philmd@redhat.com
Maintainers: Thomas Huth <thuth@redhat.com>, Cornelia Huck <cohuck@redhat.com>
hw/s390x/s390-pci-inst.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Philippe Mathieu-Daudé 3 years, 5 months ago
Fix an endianness issue reported by Cornelia:

> s390x tcg guest on x86, virtio-pci devices are not detected. The
> relevant feature bits are visible to the guest. Same breakage with
> different guest kernels.
> KVM guests and s390x tcg guests on s390x are fine.

Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
Reported-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
RFC because review-only patch, untested
---
 hw/s390x/s390-pci-inst.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 58cd041d17f..cfb54b4d8ec 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
         ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
         S390PCIGroup *group;
 
-        group = s390_group_find(reqgrp->g);
+        group = s390_group_find(ldl_p(&reqgrp->g));
         if (!group) {
             /* We do not allow access to unknown groups */
             /* The group must have been obtained with a vfio device */
-- 
2.26.2

Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Cornelia Huck 3 years, 5 months ago
On Tue, 17 Nov 2020 13:01:15 +0100
Philippe Mathieu-Daudé <philmd@redhat.com> wrote:

> Fix an endianness issue reported by Cornelia:
> 
> > s390x tcg guest on x86, virtio-pci devices are not detected. The
> > relevant feature bits are visible to the guest. Same breakage with
> > different guest kernels.
> > KVM guests and s390x tcg guests on s390x are fine.  
> 
> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
> Reported-by: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
> RFC because review-only patch, untested
> ---
>  hw/s390x/s390-pci-inst.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index 58cd041d17f..cfb54b4d8ec 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>          ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
>          S390PCIGroup *group;
>  
> -        group = s390_group_find(reqgrp->g);
> +        group = s390_group_find(ldl_p(&reqgrp->g));
>          if (!group) {
>              /* We do not allow access to unknown groups */
>              /* The group must have been obtained with a vfio device */

Hmm... when I tested this change on top of 28dc86a0729, it worked; when
tested on top of my s390-fixes branch, I get

operand exception: 0015 ilc:3 [#1] SMP 
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3+ #174
Hardware name: QEMU 2964 QEMU (KVM/Linux)
Krnl PSW : 0704d00180000000 00000000001727be (zpci_mod_fc+0x2e/0xf8)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS: 000003d080fb8040 8002000000000004 8002000000000004 000003e00000bb30
           000003e00000bb2f 0000000040ffffff 000000003efa8804 0000000000000002
           000003e00000bb2f 8002000000000004 0000000000000000 000000003f1b4400
           000000003f41c000 0000000000000000 000003e00000ba88 000003e00000ba20
Krnl Code: 00000000001727ae: e3e0f0980024	stg	%r14,152(%r15)
           00000000001727b4: b9040084		lgr	%r8,%r4
          #00000000001727b8: e310300000d0	mpcifc	%r1,0(%r3)
          >00000000001727be: b22200b0		ipm	%r11
           00000000001727c2: 88b0001c		srl	%r11,28
           00000000001727c6: 42b0f0a7		stc	%r11,167(%r15)
           00000000001727ca: eb110018000c	srlg	%r1,%r1,24
           00000000001727d0: 42104000		stc	%r1,0(%r4)
Call Trace:
 [<00000000001727be>] zpci_mod_fc+0x2e/0xf8 
 [<000000000016d364>] zpci_register_ioat+0x64/0x88 
 [<0000000000170532>] zpci_dma_init_device+0x13a/0x1b8 
 [<000000000016e008>] zpci_enable_device+0x48/0x70 
 [<000000000016e1b2>] zpci_create_device+0x14a/0x170 
 [<0000000000171224>] clp_add_pci_device+0x504/0x528 
 [<0000000000170c46>] clp_list_pci+0xf6/0x1d0 
 [<00000000001712f4>] clp_scan_pci_devices+0x44/0x68 
 [<00000000011e859e>] pci_base_init+0x12e/0x1a0 
 [<00000000011ddf92>] do_one_initcall+0x132/0x2e0 
 [<00000000011de3cc>] kernel_init_freeable+0x28c/0x308 
 [<0000000000cfc822>] kernel_init+0x22/0x150 
 [<0000000000d08f60>] ret_from_fork+0x24/0x28 
INFO: lockdep is turned off.
Last Breaking-Event-Address:
 [<000000000016d35e>] zpci_register_ioat+0x5e/0x88

(Yeah, I know it's not the newest guest kernel.)

This patch probably uncovers further endianness problems in the
following zpci patches. Looking.


Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Peter Maydell 3 years, 5 months ago
On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>
> Fix an endianness issue reported by Cornelia:
>
> > s390x tcg guest on x86, virtio-pci devices are not detected. The
> > relevant feature bits are visible to the guest. Same breakage with
> > different guest kernels.
> > KVM guests and s390x tcg guests on s390x are fine.
>
> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
> Reported-by: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
> RFC because review-only patch, untested
> ---
>  hw/s390x/s390-pci-inst.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index 58cd041d17f..cfb54b4d8ec 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>          ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
>          S390PCIGroup *group;
>
> -        group = s390_group_find(reqgrp->g);
> +        group = s390_group_find(ldl_p(&reqgrp->g));

'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
adding the ldl_p() will have no effect unless (a) the
structure is not 4-aligned and (b) the host will fault on
unaligned accesses, which isn't the case for x86 hosts.

Q: is this struct really in host order, or should we
be using ldl_le_p() or ldl_be_p() and friends here and
elsewhere?

thanks
-- PMM

Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Philippe Mathieu-Daudé 3 years, 5 months ago
On 11/17/20 2:00 PM, Peter Maydell wrote:
> On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>
>> Fix an endianness issue reported by Cornelia:
>>
>>> s390x tcg guest on x86, virtio-pci devices are not detected. The
>>> relevant feature bits are visible to the guest. Same breakage with
>>> different guest kernels.
>>> KVM guests and s390x tcg guests on s390x are fine.
>>
>> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
>> Reported-by: Cornelia Huck <cohuck@redhat.com>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>> RFC because review-only patch, untested
>> ---
>>  hw/s390x/s390-pci-inst.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>> index 58cd041d17f..cfb54b4d8ec 100644
>> --- a/hw/s390x/s390-pci-inst.c
>> +++ b/hw/s390x/s390-pci-inst.c
>> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>>          ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
>>          S390PCIGroup *group;
>>
>> -        group = s390_group_find(reqgrp->g);
>> +        group = s390_group_find(ldl_p(&reqgrp->g));
> 
> 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
> adding the ldl_p() will have no effect unless (a) the
> structure is not 4-aligned and (b) the host will fault on
> unaligned accesses, which isn't the case for x86 hosts.
> 
> Q: is this struct really in host order, or should we
> be using ldl_le_p() or ldl_be_p() and friends here and
> elsewhere?

Now than I had lunch, your comment is obvious...

IIUC we should use ldl_le_p() here and fix all the other
uses.

> 
> thanks
> -- PMM
> 


Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Cornelia Huck 3 years, 5 months ago
On Tue, 17 Nov 2020 14:12:00 +0100
Philippe Mathieu-Daudé <philmd@redhat.com> wrote:

> On 11/17/20 2:00 PM, Peter Maydell wrote:
> > On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:  
> >>
> >> Fix an endianness issue reported by Cornelia:
> >>  
> >>> s390x tcg guest on x86, virtio-pci devices are not detected. The
> >>> relevant feature bits are visible to the guest. Same breakage with
> >>> different guest kernels.
> >>> KVM guests and s390x tcg guests on s390x are fine.  
> >>
> >> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
> >> Reported-by: Cornelia Huck <cohuck@redhat.com>
> >> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >> ---
> >> RFC because review-only patch, untested
> >> ---
> >>  hw/s390x/s390-pci-inst.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >> index 58cd041d17f..cfb54b4d8ec 100644
> >> --- a/hw/s390x/s390-pci-inst.c
> >> +++ b/hw/s390x/s390-pci-inst.c
> >> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
> >>          ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
> >>          S390PCIGroup *group;
> >>
> >> -        group = s390_group_find(reqgrp->g);
> >> +        group = s390_group_find(ldl_p(&reqgrp->g));  
> > 
> > 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
> > adding the ldl_p() will have no effect unless (a) the
> > structure is not 4-aligned and (b) the host will fault on
> > unaligned accesses, which isn't the case for x86 hosts.
> > 
> > Q: is this struct really in host order, or should we
> > be using ldl_le_p() or ldl_be_p() and friends here and
> > elsewhere?  
> 
> Now than I had lunch, your comment is obvious...
> 
> IIUC we should use ldl_le_p() here and fix all the other
> uses.

I'm not sure whether it shouldn't be be -- waiting for the IBM folks to
comment on the expected layout.

> 
> > 
> > thanks
> > -- PMM
> >   
> 


Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Pierre Morel 3 years, 5 months ago

On 11/17/20 2:00 PM, Peter Maydell wrote:
> On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>
>> Fix an endianness issue reported by Cornelia:
>>
>>> s390x tcg guest on x86, virtio-pci devices are not detected. The
>>> relevant feature bits are visible to the guest. Same breakage with
>>> different guest kernels.
>>> KVM guests and s390x tcg guests on s390x are fine.
>>
>> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
>> Reported-by: Cornelia Huck <cohuck@redhat.com>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>> RFC because review-only patch, untested
>> ---
>>   hw/s390x/s390-pci-inst.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>> index 58cd041d17f..cfb54b4d8ec 100644
>> --- a/hw/s390x/s390-pci-inst.c
>> +++ b/hw/s390x/s390-pci-inst.c
>> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>>           ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
>>           S390PCIGroup *group;
>>
>> -        group = s390_group_find(reqgrp->g);
>> +        group = s390_group_find(ldl_p(&reqgrp->g));
> 
> 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
> adding the ldl_p() will have no effect unless (a) the
> structure is not 4-aligned and (b) the host will fault on
> unaligned accesses, which isn't the case for x86 hosts.
> 
> Q: is this struct really in host order, or should we
> be using ldl_le_p() or ldl_be_p() and friends here and
> elsewhere?
> 
> thanks
> -- PMM
> 

Hi, I think we better modify the structure here, g should be a byte.

Connie, can you please try this if it resolves the issue?

diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
index fa3bf8b5aa..641d19c815 100644
--- a/hw/s390x/s390-pci-inst.h
+++ b/hw/s390x/s390-pci-inst.h
@@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
      uint32_t fmt;
      uint64_t reserved1;
  #define CLP_REQ_QPCIG_MASK_PFGID 0xff
-    uint32_t g;
+    uint32_t g0 :24;
+    uint32_t g  :8;
      uint32_t reserved2;
      uint64_t reserved3;
  } QEMU_PACKED ClpReqQueryPciGrp;

-- 
Pierre Morel
IBM Lab Boeblingen

Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Peter Maydell 3 years, 5 months ago
On Tue, 17 Nov 2020 at 13:24, Pierre Morel <pmorel@linux.ibm.com> wrote:
>
>
>
> On 11/17/20 2:00 PM, Peter Maydell wrote:
> > On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> >>
> >> Fix an endianness issue reported by Cornelia:
> >>
> >>> s390x tcg guest on x86, virtio-pci devices are not detected. The
> >>> relevant feature bits are visible to the guest. Same breakage with
> >>> different guest kernels.
> >>> KVM guests and s390x tcg guests on s390x are fine.
> >>
> >> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
> >> Reported-by: Cornelia Huck <cohuck@redhat.com>
> >> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >> ---
> >> RFC because review-only patch, untested
> >> ---
> >>   hw/s390x/s390-pci-inst.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >> index 58cd041d17f..cfb54b4d8ec 100644
> >> --- a/hw/s390x/s390-pci-inst.c
> >> +++ b/hw/s390x/s390-pci-inst.c
> >> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
> >>           ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
> >>           S390PCIGroup *group;
> >>
> >> -        group = s390_group_find(reqgrp->g);
> >> +        group = s390_group_find(ldl_p(&reqgrp->g));
> >
> > 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
> > adding the ldl_p() will have no effect unless (a) the
> > structure is not 4-aligned and (b) the host will fault on
> > unaligned accesses, which isn't the case for x86 hosts.
> >
> > Q: is this struct really in host order, or should we
> > be using ldl_le_p() or ldl_be_p() and friends here and
> > elsewhere?
> >
> > thanks
> > -- PMM
> >
>
> Hi, I think we better modify the structure here, g should be a byte.
>
> Connie, can you please try this if it resolves the issue?
>
> diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
> index fa3bf8b5aa..641d19c815 100644
> --- a/hw/s390x/s390-pci-inst.h
> +++ b/hw/s390x/s390-pci-inst.h
> @@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
>       uint32_t fmt;
>       uint64_t reserved1;
>   #define CLP_REQ_QPCIG_MASK_PFGID 0xff
> -    uint32_t g;
> +    uint32_t g0 :24;
> +    uint32_t g  :8;
>       uint32_t reserved2;

Bitfields in a struct intending to be a representation of
a guest in-memory structure? Almost certainly the wrong thing
because the order of g and g0 will differ depending on host
endianness...

>       uint64_t reserved3;
>   } QEMU_PACKED ClpReqQueryPciGrp;

thanks
-- PMM

Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Cornelia Huck 3 years, 5 months ago
On Tue, 17 Nov 2020 14:23:57 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 11/17/20 2:00 PM, Peter Maydell wrote:
> > On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:  
> >>
> >> Fix an endianness issue reported by Cornelia:
> >>  
> >>> s390x tcg guest on x86, virtio-pci devices are not detected. The
> >>> relevant feature bits are visible to the guest. Same breakage with
> >>> different guest kernels.
> >>> KVM guests and s390x tcg guests on s390x are fine.  
> >>
> >> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
> >> Reported-by: Cornelia Huck <cohuck@redhat.com>
> >> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >> ---
> >> RFC because review-only patch, untested
> >> ---
> >>   hw/s390x/s390-pci-inst.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >> index 58cd041d17f..cfb54b4d8ec 100644
> >> --- a/hw/s390x/s390-pci-inst.c
> >> +++ b/hw/s390x/s390-pci-inst.c
> >> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
> >>           ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
> >>           S390PCIGroup *group;
> >>
> >> -        group = s390_group_find(reqgrp->g);
> >> +        group = s390_group_find(ldl_p(&reqgrp->g));  
> > 
> > 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
> > adding the ldl_p() will have no effect unless (a) the
> > structure is not 4-aligned and (b) the host will fault on
> > unaligned accesses, which isn't the case for x86 hosts.
> > 
> > Q: is this struct really in host order, or should we
> > be using ldl_le_p() or ldl_be_p() and friends here and
> > elsewhere?
> > 
> > thanks
> > -- PMM
> >   
> 
> Hi, I think we better modify the structure here, g should be a byte.
> 
> Connie, can you please try this if it resolves the issue?
> 
> diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
> index fa3bf8b5aa..641d19c815 100644
> --- a/hw/s390x/s390-pci-inst.h
> +++ b/hw/s390x/s390-pci-inst.h
> @@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
>       uint32_t fmt;
>       uint64_t reserved1;
>   #define CLP_REQ_QPCIG_MASK_PFGID 0xff
> -    uint32_t g;
> +    uint32_t g0 :24;
> +    uint32_t g  :8;
>       uint32_t reserved2;
>       uint64_t reserved3;
>   } QEMU_PACKED ClpReqQueryPciGrp;
> 

No, same crash... I fear there are more things broken wrt endianness.


Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Matthew Rosato 3 years, 5 months ago
On 11/17/20 8:31 AM, Cornelia Huck wrote:
> On Tue, 17 Nov 2020 14:23:57 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 11/17/20 2:00 PM, Peter Maydell wrote:
>>> On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>>>
>>>> Fix an endianness issue reported by Cornelia:
>>>>   
>>>>> s390x tcg guest on x86, virtio-pci devices are not detected. The
>>>>> relevant feature bits are visible to the guest. Same breakage with
>>>>> different guest kernels.
>>>>> KVM guests and s390x tcg guests on s390x are fine.
>>>>
>>>> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
>>>> Reported-by: Cornelia Huck <cohuck@redhat.com>
>>>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>> ---
>>>> RFC because review-only patch, untested
>>>> ---
>>>>    hw/s390x/s390-pci-inst.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>>>> index 58cd041d17f..cfb54b4d8ec 100644
>>>> --- a/hw/s390x/s390-pci-inst.c
>>>> +++ b/hw/s390x/s390-pci-inst.c
>>>> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>>>>            ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
>>>>            S390PCIGroup *group;
>>>>
>>>> -        group = s390_group_find(reqgrp->g);
>>>> +        group = s390_group_find(ldl_p(&reqgrp->g));
>>>
>>> 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
>>> adding the ldl_p() will have no effect unless (a) the
>>> structure is not 4-aligned and (b) the host will fault on
>>> unaligned accesses, which isn't the case for x86 hosts.
>>>
>>> Q: is this struct really in host order, or should we
>>> be using ldl_le_p() or ldl_be_p() and friends here and
>>> elsewhere?
>>>
>>> thanks
>>> -- PMM
>>>    
>>
>> Hi, I think we better modify the structure here, g should be a byte.
>>
>> Connie, can you please try this if it resolves the issue?
>>
>> diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
>> index fa3bf8b5aa..641d19c815 100644
>> --- a/hw/s390x/s390-pci-inst.h
>> +++ b/hw/s390x/s390-pci-inst.h
>> @@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
>>        uint32_t fmt;
>>        uint64_t reserved1;
>>    #define CLP_REQ_QPCIG_MASK_PFGID 0xff
>> -    uint32_t g;
>> +    uint32_t g0 :24;
>> +    uint32_t g  :8;
>>        uint32_t reserved2;
>>        uint64_t reserved3;
>>    } QEMU_PACKED ClpReqQueryPciGrp;
>>
> 
> No, same crash... I fear there are more things broken wrt endianness.
> 

Sorry, just getting online now, looking at the code....  Are the 2 
memcpy calls added in 9670ee75 and 28dc86a0 the issue?  Won't they just 
present the Q PCI FN / Q PCI FN GRP results in host endianness?

Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Cornelia Huck 3 years, 5 months ago
On Tue, 17 Nov 2020 09:02:37 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> On 11/17/20 8:31 AM, Cornelia Huck wrote:
> > On Tue, 17 Nov 2020 14:23:57 +0100
> > Pierre Morel <pmorel@linux.ibm.com> wrote:
> >   
> >> On 11/17/20 2:00 PM, Peter Maydell wrote:  
> >>> On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:  
> >>>>
> >>>> Fix an endianness issue reported by Cornelia:
> >>>>     
> >>>>> s390x tcg guest on x86, virtio-pci devices are not detected. The
> >>>>> relevant feature bits are visible to the guest. Same breakage with
> >>>>> different guest kernels.
> >>>>> KVM guests and s390x tcg guests on s390x are fine.  
> >>>>
> >>>> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
> >>>> Reported-by: Cornelia Huck <cohuck@redhat.com>
> >>>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >>>> ---
> >>>> RFC because review-only patch, untested
> >>>> ---
> >>>>    hw/s390x/s390-pci-inst.c | 2 +-
> >>>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >>>> index 58cd041d17f..cfb54b4d8ec 100644
> >>>> --- a/hw/s390x/s390-pci-inst.c
> >>>> +++ b/hw/s390x/s390-pci-inst.c
> >>>> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
> >>>>            ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
> >>>>            S390PCIGroup *group;
> >>>>
> >>>> -        group = s390_group_find(reqgrp->g);
> >>>> +        group = s390_group_find(ldl_p(&reqgrp->g));  
> >>>
> >>> 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
> >>> adding the ldl_p() will have no effect unless (a) the
> >>> structure is not 4-aligned and (b) the host will fault on
> >>> unaligned accesses, which isn't the case for x86 hosts.
> >>>
> >>> Q: is this struct really in host order, or should we
> >>> be using ldl_le_p() or ldl_be_p() and friends here and
> >>> elsewhere?
> >>>
> >>> thanks
> >>> -- PMM
> >>>      
> >>
> >> Hi, I think we better modify the structure here, g should be a byte.
> >>
> >> Connie, can you please try this if it resolves the issue?
> >>
> >> diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
> >> index fa3bf8b5aa..641d19c815 100644
> >> --- a/hw/s390x/s390-pci-inst.h
> >> +++ b/hw/s390x/s390-pci-inst.h
> >> @@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
> >>        uint32_t fmt;
> >>        uint64_t reserved1;
> >>    #define CLP_REQ_QPCIG_MASK_PFGID 0xff
> >> -    uint32_t g;
> >> +    uint32_t g0 :24;
> >> +    uint32_t g  :8;
> >>        uint32_t reserved2;
> >>        uint64_t reserved3;
> >>    } QEMU_PACKED ClpReqQueryPciGrp;
> >>  
> > 
> > No, same crash... I fear there are more things broken wrt endianness.
> >   
> 
> Sorry, just getting online now, looking at the code....  Are the 2 
> memcpy calls added in 9670ee75 and 28dc86a0 the issue?  Won't they just 
> present the Q PCI FN / Q PCI FN GRP results in host endianness?
> 

I just re-added some st?_p operations in set_pbdev_info and that fixes
at least the crash I was seeing with Phil's patch applied. Still, no
pci functions get detected, so that's not enough. Those memcpy calls
look like a possible culprit.


Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Matthew Rosato 3 years, 5 months ago
On 11/17/20 9:13 AM, Cornelia Huck wrote:
> On Tue, 17 Nov 2020 09:02:37 -0500
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>> On 11/17/20 8:31 AM, Cornelia Huck wrote:
>>> On Tue, 17 Nov 2020 14:23:57 +0100
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>    
>>>> On 11/17/20 2:00 PM, Peter Maydell wrote:
>>>>> On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>>>>>
>>>>>> Fix an endianness issue reported by Cornelia:
>>>>>>      
>>>>>>> s390x tcg guest on x86, virtio-pci devices are not detected. The
>>>>>>> relevant feature bits are visible to the guest. Same breakage with
>>>>>>> different guest kernels.
>>>>>>> KVM guests and s390x tcg guests on s390x are fine.
>>>>>>
>>>>>> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
>>>>>> Reported-by: Cornelia Huck <cohuck@redhat.com>
>>>>>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>>>> ---
>>>>>> RFC because review-only patch, untested
>>>>>> ---
>>>>>>     hw/s390x/s390-pci-inst.c | 2 +-
>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>>>>>> index 58cd041d17f..cfb54b4d8ec 100644
>>>>>> --- a/hw/s390x/s390-pci-inst.c
>>>>>> +++ b/hw/s390x/s390-pci-inst.c
>>>>>> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>>>>>>             ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
>>>>>>             S390PCIGroup *group;
>>>>>>
>>>>>> -        group = s390_group_find(reqgrp->g);
>>>>>> +        group = s390_group_find(ldl_p(&reqgrp->g));
>>>>>
>>>>> 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
>>>>> adding the ldl_p() will have no effect unless (a) the
>>>>> structure is not 4-aligned and (b) the host will fault on
>>>>> unaligned accesses, which isn't the case for x86 hosts.
>>>>>
>>>>> Q: is this struct really in host order, or should we
>>>>> be using ldl_le_p() or ldl_be_p() and friends here and
>>>>> elsewhere?
>>>>>
>>>>> thanks
>>>>> -- PMM
>>>>>       
>>>>
>>>> Hi, I think we better modify the structure here, g should be a byte.
>>>>
>>>> Connie, can you please try this if it resolves the issue?
>>>>
>>>> diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
>>>> index fa3bf8b5aa..641d19c815 100644
>>>> --- a/hw/s390x/s390-pci-inst.h
>>>> +++ b/hw/s390x/s390-pci-inst.h
>>>> @@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
>>>>         uint32_t fmt;
>>>>         uint64_t reserved1;
>>>>     #define CLP_REQ_QPCIG_MASK_PFGID 0xff
>>>> -    uint32_t g;
>>>> +    uint32_t g0 :24;
>>>> +    uint32_t g  :8;
>>>>         uint32_t reserved2;
>>>>         uint64_t reserved3;
>>>>     } QEMU_PACKED ClpReqQueryPciGrp;
>>>>   
>>>
>>> No, same crash... I fear there are more things broken wrt endianness.
>>>    
>>
>> Sorry, just getting online now, looking at the code....  Are the 2
>> memcpy calls added in 9670ee75 and 28dc86a0 the issue?  Won't they just
>> present the Q PCI FN / Q PCI FN GRP results in host endianness?
>>
> 
> I just re-added some st?_p operations in set_pbdev_info and that fixes
> at least the crash I was seeing with Phil's patch applied. Still, no
> pci functions get detected, so that's not enough. Those memcpy calls
> look like a possible culprit.
> 

OK, so if everything in set_pbdev_info and s390_pci_init_default_group() 
is handled with st?_p operations, then the memcpy should be OK...

Pierre was on to something with his recommendation, as the group id is 
only 1B of the 'g' field (see CLP_REQ_QPCIG_MASK_PFGID) - the other bits 
just happen to be unused.

Did you include his change with your st?_p changes to set_pbdev_info 
(sorry, I don't have this environment set up but clearly need to do so 
for future testing)

Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Cornelia Huck 3 years, 5 months ago
On Tue, 17 Nov 2020 09:34:41 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> On 11/17/20 9:13 AM, Cornelia Huck wrote:
> > On Tue, 17 Nov 2020 09:02:37 -0500
> > Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> >   
> >> On 11/17/20 8:31 AM, Cornelia Huck wrote:  
> >>> On Tue, 17 Nov 2020 14:23:57 +0100
> >>> Pierre Morel <pmorel@linux.ibm.com> wrote:
> >>>      
> >>>> On 11/17/20 2:00 PM, Peter Maydell wrote:  
> >>>>> On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:  
> >>>>>>
> >>>>>> Fix an endianness issue reported by Cornelia:
> >>>>>>        
> >>>>>>> s390x tcg guest on x86, virtio-pci devices are not detected. The
> >>>>>>> relevant feature bits are visible to the guest. Same breakage with
> >>>>>>> different guest kernels.
> >>>>>>> KVM guests and s390x tcg guests on s390x are fine.  
> >>>>>>
> >>>>>> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
> >>>>>> Reported-by: Cornelia Huck <cohuck@redhat.com>
> >>>>>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >>>>>> ---
> >>>>>> RFC because review-only patch, untested
> >>>>>> ---
> >>>>>>     hw/s390x/s390-pci-inst.c | 2 +-
> >>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >>>>>> index 58cd041d17f..cfb54b4d8ec 100644
> >>>>>> --- a/hw/s390x/s390-pci-inst.c
> >>>>>> +++ b/hw/s390x/s390-pci-inst.c
> >>>>>> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
> >>>>>>             ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
> >>>>>>             S390PCIGroup *group;
> >>>>>>
> >>>>>> -        group = s390_group_find(reqgrp->g);
> >>>>>> +        group = s390_group_find(ldl_p(&reqgrp->g));  
> >>>>>
> >>>>> 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
> >>>>> adding the ldl_p() will have no effect unless (a) the
> >>>>> structure is not 4-aligned and (b) the host will fault on
> >>>>> unaligned accesses, which isn't the case for x86 hosts.
> >>>>>
> >>>>> Q: is this struct really in host order, or should we
> >>>>> be using ldl_le_p() or ldl_be_p() and friends here and
> >>>>> elsewhere?
> >>>>>
> >>>>> thanks
> >>>>> -- PMM
> >>>>>         
> >>>>
> >>>> Hi, I think we better modify the structure here, g should be a byte.
> >>>>
> >>>> Connie, can you please try this if it resolves the issue?
> >>>>
> >>>> diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
> >>>> index fa3bf8b5aa..641d19c815 100644
> >>>> --- a/hw/s390x/s390-pci-inst.h
> >>>> +++ b/hw/s390x/s390-pci-inst.h
> >>>> @@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
> >>>>         uint32_t fmt;
> >>>>         uint64_t reserved1;
> >>>>     #define CLP_REQ_QPCIG_MASK_PFGID 0xff
> >>>> -    uint32_t g;
> >>>> +    uint32_t g0 :24;
> >>>> +    uint32_t g  :8;
> >>>>         uint32_t reserved2;
> >>>>         uint64_t reserved3;
> >>>>     } QEMU_PACKED ClpReqQueryPciGrp;
> >>>>     
> >>>
> >>> No, same crash... I fear there are more things broken wrt endianness.
> >>>      
> >>
> >> Sorry, just getting online now, looking at the code....  Are the 2
> >> memcpy calls added in 9670ee75 and 28dc86a0 the issue?  Won't they just
> >> present the Q PCI FN / Q PCI FN GRP results in host endianness?
> >>  
> > 
> > I just re-added some st?_p operations in set_pbdev_info and that fixes
> > at least the crash I was seeing with Phil's patch applied. Still, no
> > pci functions get detected, so that's not enough. Those memcpy calls
> > look like a possible culprit.
> >   
> 
> OK, so if everything in set_pbdev_info and s390_pci_init_default_group() 
> is handled with st?_p operations, then the memcpy should be OK...
> 
> Pierre was on to something with his recommendation, as the group id is 
> only 1B of the 'g' field (see CLP_REQ_QPCIG_MASK_PFGID) - the other bits 
> just happen to be unused.
> 
> Did you include his change with your st?_p changes to set_pbdev_info 
> (sorry, I don't have this environment set up but clearly need to do so 
> for future testing)

I tried in conjunction with Phil's patch (otherwise, I don't even get
to the part where it crashes.) Do we need to apply that mask somewhere?
It is hard to guess if you don't know what the structure is supposed to
look like :)


Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Matthew Rosato 3 years, 5 months ago
On 11/17/20 10:17 AM, Cornelia Huck wrote:
> On Tue, 17 Nov 2020 09:34:41 -0500
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>> On 11/17/20 9:13 AM, Cornelia Huck wrote:
>>> On Tue, 17 Nov 2020 09:02:37 -0500
>>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>>    
>>>> On 11/17/20 8:31 AM, Cornelia Huck wrote:
>>>>> On Tue, 17 Nov 2020 14:23:57 +0100
>>>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>>>       
>>>>>> On 11/17/20 2:00 PM, Peter Maydell wrote:
>>>>>>> On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>>>>>>>
>>>>>>>> Fix an endianness issue reported by Cornelia:
>>>>>>>>         
>>>>>>>>> s390x tcg guest on x86, virtio-pci devices are not detected. The
>>>>>>>>> relevant feature bits are visible to the guest. Same breakage with
>>>>>>>>> different guest kernels.
>>>>>>>>> KVM guests and s390x tcg guests on s390x are fine.
>>>>>>>>
>>>>>>>> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
>>>>>>>> Reported-by: Cornelia Huck <cohuck@redhat.com>
>>>>>>>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>>>>>> ---
>>>>>>>> RFC because review-only patch, untested
>>>>>>>> ---
>>>>>>>>      hw/s390x/s390-pci-inst.c | 2 +-
>>>>>>>>      1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>>>>>>>> index 58cd041d17f..cfb54b4d8ec 100644
>>>>>>>> --- a/hw/s390x/s390-pci-inst.c
>>>>>>>> +++ b/hw/s390x/s390-pci-inst.c
>>>>>>>> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>>>>>>>>              ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
>>>>>>>>              S390PCIGroup *group;
>>>>>>>>
>>>>>>>> -        group = s390_group_find(reqgrp->g);
>>>>>>>> +        group = s390_group_find(ldl_p(&reqgrp->g));
>>>>>>>
>>>>>>> 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
>>>>>>> adding the ldl_p() will have no effect unless (a) the
>>>>>>> structure is not 4-aligned and (b) the host will fault on
>>>>>>> unaligned accesses, which isn't the case for x86 hosts.
>>>>>>>
>>>>>>> Q: is this struct really in host order, or should we
>>>>>>> be using ldl_le_p() or ldl_be_p() and friends here and
>>>>>>> elsewhere?
>>>>>>>
>>>>>>> thanks
>>>>>>> -- PMM
>>>>>>>          
>>>>>>
>>>>>> Hi, I think we better modify the structure here, g should be a byte.
>>>>>>
>>>>>> Connie, can you please try this if it resolves the issue?
>>>>>>
>>>>>> diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
>>>>>> index fa3bf8b5aa..641d19c815 100644
>>>>>> --- a/hw/s390x/s390-pci-inst.h
>>>>>> +++ b/hw/s390x/s390-pci-inst.h
>>>>>> @@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
>>>>>>          uint32_t fmt;
>>>>>>          uint64_t reserved1;
>>>>>>      #define CLP_REQ_QPCIG_MASK_PFGID 0xff
>>>>>> -    uint32_t g;
>>>>>> +    uint32_t g0 :24;
>>>>>> +    uint32_t g  :8;
>>>>>>          uint32_t reserved2;
>>>>>>          uint64_t reserved3;
>>>>>>      } QEMU_PACKED ClpReqQueryPciGrp;
>>>>>>      
>>>>>
>>>>> No, same crash... I fear there are more things broken wrt endianness.
>>>>>       
>>>>
>>>> Sorry, just getting online now, looking at the code....  Are the 2
>>>> memcpy calls added in 9670ee75 and 28dc86a0 the issue?  Won't they just
>>>> present the Q PCI FN / Q PCI FN GRP results in host endianness?
>>>>   
>>>
>>> I just re-added some st?_p operations in set_pbdev_info and that fixes
>>> at least the crash I was seeing with Phil's patch applied. Still, no
>>> pci functions get detected, so that's not enough. Those memcpy calls
>>> look like a possible culprit.
>>>    
>>
>> OK, so if everything in set_pbdev_info and s390_pci_init_default_group()
>> is handled with st?_p operations, then the memcpy should be OK...
>>
>> Pierre was on to something with his recommendation, as the group id is
>> only 1B of the 'g' field (see CLP_REQ_QPCIG_MASK_PFGID) - the other bits
>> just happen to be unused.
>>
>> Did you include his change with your st?_p changes to set_pbdev_info
>> (sorry, I don't have this environment set up but clearly need to do so
>> for future testing)
> 
> I tried in conjunction with Phil's patch (otherwise, I don't even get
> to the part where it crashes.) Do we need to apply that mask somewhere?
> It is hard to guess if you don't know what the structure is supposed to
> look like :)
> 

OK, I just got the issue reproduced here (no PCI without Phil's patch, 
crash with Phil's patch);  Let me investigate and I will get back.




Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Cornelia Huck 3 years, 5 months ago
On Tue, 17 Nov 2020 11:01:31 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> On 11/17/20 10:17 AM, Cornelia Huck wrote:
> > On Tue, 17 Nov 2020 09:34:41 -0500
> > Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> >   
> >> On 11/17/20 9:13 AM, Cornelia Huck wrote:  
> >>> On Tue, 17 Nov 2020 09:02:37 -0500
> >>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> >>>      
> >>>> On 11/17/20 8:31 AM, Cornelia Huck wrote:  
> >>>>> On Tue, 17 Nov 2020 14:23:57 +0100
> >>>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
> >>>>>         
> >>>>>> On 11/17/20 2:00 PM, Peter Maydell wrote:  
> >>>>>>> On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:  
> >>>>>>>>
> >>>>>>>> Fix an endianness issue reported by Cornelia:
> >>>>>>>>           
> >>>>>>>>> s390x tcg guest on x86, virtio-pci devices are not detected. The
> >>>>>>>>> relevant feature bits are visible to the guest. Same breakage with
> >>>>>>>>> different guest kernels.
> >>>>>>>>> KVM guests and s390x tcg guests on s390x are fine.  
> >>>>>>>>
> >>>>>>>> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
> >>>>>>>> Reported-by: Cornelia Huck <cohuck@redhat.com>
> >>>>>>>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >>>>>>>> ---
> >>>>>>>> RFC because review-only patch, untested
> >>>>>>>> ---
> >>>>>>>>      hw/s390x/s390-pci-inst.c | 2 +-
> >>>>>>>>      1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >>>>>>>> index 58cd041d17f..cfb54b4d8ec 100644
> >>>>>>>> --- a/hw/s390x/s390-pci-inst.c
> >>>>>>>> +++ b/hw/s390x/s390-pci-inst.c
> >>>>>>>> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
> >>>>>>>>              ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
> >>>>>>>>              S390PCIGroup *group;
> >>>>>>>>
> >>>>>>>> -        group = s390_group_find(reqgrp->g);
> >>>>>>>> +        group = s390_group_find(ldl_p(&reqgrp->g));  
> >>>>>>>
> >>>>>>> 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
> >>>>>>> adding the ldl_p() will have no effect unless (a) the
> >>>>>>> structure is not 4-aligned and (b) the host will fault on
> >>>>>>> unaligned accesses, which isn't the case for x86 hosts.
> >>>>>>>
> >>>>>>> Q: is this struct really in host order, or should we
> >>>>>>> be using ldl_le_p() or ldl_be_p() and friends here and
> >>>>>>> elsewhere?
> >>>>>>>
> >>>>>>> thanks
> >>>>>>> -- PMM
> >>>>>>>            
> >>>>>>
> >>>>>> Hi, I think we better modify the structure here, g should be a byte.
> >>>>>>
> >>>>>> Connie, can you please try this if it resolves the issue?
> >>>>>>
> >>>>>> diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
> >>>>>> index fa3bf8b5aa..641d19c815 100644
> >>>>>> --- a/hw/s390x/s390-pci-inst.h
> >>>>>> +++ b/hw/s390x/s390-pci-inst.h
> >>>>>> @@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
> >>>>>>          uint32_t fmt;
> >>>>>>          uint64_t reserved1;
> >>>>>>      #define CLP_REQ_QPCIG_MASK_PFGID 0xff
> >>>>>> -    uint32_t g;
> >>>>>> +    uint32_t g0 :24;
> >>>>>> +    uint32_t g  :8;
> >>>>>>          uint32_t reserved2;
> >>>>>>          uint64_t reserved3;
> >>>>>>      } QEMU_PACKED ClpReqQueryPciGrp;
> >>>>>>        
> >>>>>
> >>>>> No, same crash... I fear there are more things broken wrt endianness.
> >>>>>         
> >>>>
> >>>> Sorry, just getting online now, looking at the code....  Are the 2
> >>>> memcpy calls added in 9670ee75 and 28dc86a0 the issue?  Won't they just
> >>>> present the Q PCI FN / Q PCI FN GRP results in host endianness?
> >>>>     
> >>>
> >>> I just re-added some st?_p operations in set_pbdev_info and that fixes
> >>> at least the crash I was seeing with Phil's patch applied. Still, no
> >>> pci functions get detected, so that's not enough. Those memcpy calls
> >>> look like a possible culprit.
> >>>      
> >>
> >> OK, so if everything in set_pbdev_info and s390_pci_init_default_group()
> >> is handled with st?_p operations, then the memcpy should be OK...
> >>
> >> Pierre was on to something with his recommendation, as the group id is
> >> only 1B of the 'g' field (see CLP_REQ_QPCIG_MASK_PFGID) - the other bits
> >> just happen to be unused.
> >>
> >> Did you include his change with your st?_p changes to set_pbdev_info
> >> (sorry, I don't have this environment set up but clearly need to do so
> >> for future testing)  
> > 
> > I tried in conjunction with Phil's patch (otherwise, I don't even get
> > to the part where it crashes.) Do we need to apply that mask somewhere?
> > It is hard to guess if you don't know what the structure is supposed to
> > look like :)
> >   
> 
> OK, I just got the issue reproduced here (no PCI without Phil's patch, 
> crash with Phil's patch);  Let me investigate and I will get back.

I think I may have something (there were some more fields that needed
care). Let me check whether it works on s390x-on-s390x as well, then
I'll polish it and post.

(Still hoping to get that included; I have two other fixes queued.)


Re: [RFC PATCH-for-5.2] hw/s390x/pci: Fix endianness issue
Posted by Thomas Huth 3 years, 5 months ago
On 17/11/2020 15.34, Matthew Rosato wrote:
> On 11/17/20 9:13 AM, Cornelia Huck wrote:
>> On Tue, 17 Nov 2020 09:02:37 -0500
>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>
>>> On 11/17/20 8:31 AM, Cornelia Huck wrote:
>>>> On Tue, 17 Nov 2020 14:23:57 +0100
>>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>>   
>>>>> On 11/17/20 2:00 PM, Peter Maydell wrote:
>>>>>> On Tue, 17 Nov 2020 at 12:03, Philippe Mathieu-Daudé
>>>>>> <philmd@redhat.com> wrote:
>>>>>>>
>>>>>>> Fix an endianness issue reported by Cornelia:
>>>>>>>     
>>>>>>>> s390x tcg guest on x86, virtio-pci devices are not detected. The
>>>>>>>> relevant feature bits are visible to the guest. Same breakage with
>>>>>>>> different guest kernels.
>>>>>>>> KVM guests and s390x tcg guests on s390x are fine.
>>>>>>>
>>>>>>> Fixes: 28dc86a0729 ("s390x/pci: use a PCI Group structure")
>>>>>>> Reported-by: Cornelia Huck <cohuck@redhat.com>
>>>>>>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>>>>> ---
>>>>>>> RFC because review-only patch, untested
>>>>>>> ---
>>>>>>>     hw/s390x/s390-pci-inst.c | 2 +-
>>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>>>>>>> index 58cd041d17f..cfb54b4d8ec 100644
>>>>>>> --- a/hw/s390x/s390-pci-inst.c
>>>>>>> +++ b/hw/s390x/s390-pci-inst.c
>>>>>>> @@ -305,7 +305,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2,
>>>>>>> uintptr_t ra)
>>>>>>>             ClpReqQueryPciGrp *reqgrp = (ClpReqQueryPciGrp *)reqh;
>>>>>>>             S390PCIGroup *group;
>>>>>>>
>>>>>>> -        group = s390_group_find(reqgrp->g);
>>>>>>> +        group = s390_group_find(ldl_p(&reqgrp->g));
>>>>>>
>>>>>> 'g' in the ClpReqQueryPciGrp struct is a uint32_t, so
>>>>>> adding the ldl_p() will have no effect unless (a) the
>>>>>> structure is not 4-aligned and (b) the host will fault on
>>>>>> unaligned accesses, which isn't the case for x86 hosts.
>>>>>>
>>>>>> Q: is this struct really in host order, or should we
>>>>>> be using ldl_le_p() or ldl_be_p() and friends here and
>>>>>> elsewhere?
>>>>>>
>>>>>> thanks
>>>>>> -- PMM
>>>>>>       
>>>>>
>>>>> Hi, I think we better modify the structure here, g should be a byte.
>>>>>
>>>>> Connie, can you please try this if it resolves the issue?
>>>>>
>>>>> diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
>>>>> index fa3bf8b5aa..641d19c815 100644
>>>>> --- a/hw/s390x/s390-pci-inst.h
>>>>> +++ b/hw/s390x/s390-pci-inst.h
>>>>> @@ -146,7 +146,8 @@ typedef struct ClpReqQueryPciGrp {
>>>>>         uint32_t fmt;
>>>>>         uint64_t reserved1;
>>>>>     #define CLP_REQ_QPCIG_MASK_PFGID 0xff
>>>>> -    uint32_t g;
>>>>> +    uint32_t g0 :24;
>>>>> +    uint32_t g  :8;
>>>>>         uint32_t reserved2;
>>>>>         uint64_t reserved3;
>>>>>     } QEMU_PACKED ClpReqQueryPciGrp;
>>>>>   
>>>>
>>>> No, same crash... I fear there are more things broken wrt endianness.
>>>>    
>>>
>>> Sorry, just getting online now, looking at the code....  Are the 2
>>> memcpy calls added in 9670ee75 and 28dc86a0 the issue?  Won't they just
>>> present the Q PCI FN / Q PCI FN GRP results in host endianness?
>>>
>>
>> I just re-added some st?_p operations in set_pbdev_info and that fixes
>> at least the crash I was seeing with Phil's patch applied. Still, no
>> pci functions get detected, so that's not enough. Those memcpy calls
>> look like a possible culprit.
>>
> 
> OK, so if everything in set_pbdev_info and s390_pci_init_default_group() is
> handled with st?_p operations, then the memcpy should be OK...
> 
> Pierre was on to something with his recommendation, as the group id is only
> 1B of the 'g' field (see CLP_REQ_QPCIG_MASK_PFGID) - the other bits just
> happen to be unused.
> 
> Did you include his change with your st?_p changes to set_pbdev_info 
As Peter also already wrote: Bitfields are not endianess safe either. You'd
need to replace the g0:24 with "uint8_t g0[3]" to get it working that way.

 Thomas