[PATCH] .travis.yml: skip ppc64abi32-linux-user with plugins

Alex Bennée posted 1 patch 3 years, 9 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20200714175516.5475-1-alex.bennee@linaro.org
[PATCH] .travis.yml: skip ppc64abi32-linux-user with plugins
Posted by Alex Bennée 3 years, 9 months ago
We actually see failures on threadcount running without plugins:

  retry.py -n 1000 -c -- \
    ./ppc64abi32-linux-user/qemu-ppc64abi32 \
    ./tests/tcg/ppc64abi32-linux-user/threadcount

which reports:

  0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
  -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
  -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
  Ran command 1000 times, 978 passes

But when running with plugins we hit the failure a lot more often:

  0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
  -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
  Ran command 100 times, 91 passes
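
To put a number on "a lot more often", the two runs above can be compared
directly. The following is only a sketch: the function names are made up for
illustration, and the one-sided Fisher exact test is an assumed choice of
check, not something used in the original analysis. The counts come from the
reports above (22 failures in 1000 runs without plugins, 9 in 100 with).

```python
from math import comb


def failure_rate(failures, runs):
    """Fraction of runs that did not exit with status 0."""
    return failures / runs


def fisher_one_sided(fail_a, runs_a, fail_b, runs_b):
    """Chance of seeing at least fail_b failures in sample B if both
    samples actually share a single underlying failure rate
    (hypergeometric tail, i.e. a one-sided Fisher exact test)."""
    total_runs = runs_a + runs_b
    total_fail = fail_a + fail_b
    denom = comb(total_runs, runs_b)
    upper = min(total_fail, runs_b)
    tail = sum(
        comb(total_fail, k) * comb(total_runs - total_fail, runs_b - k)
        for k in range(fail_b, upper + 1)
    )
    return tail / denom


# Counts taken from the two reports: (failures, runs)
without_plugins = (22, 1000)   # 21 x "-6" plus 1 x "-11"
with_plugins = (9, 100)        # 9 x "-11"

rate_without = failure_rate(*without_plugins)
rate_with = failure_rate(*with_plugins)
p_value = fisher_one_sided(*without_plugins, *with_plugins)
```

A small p_value here would suggest the higher failure rate with plugins is
unlikely to be sampling noise, although 100 runs is still a small sample.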

The crash occurs in guest code which is the same in both pass and fail
cases. However, we see various messages reported on the console about
corrupted memory lists, which seem to imply the guest memory allocation
is corrupted. This lines up with the segfault being in the guest
__libc_free function. So we think this is a guest bug which is
exacerbated by various modes of translation. If anyone has access to
real hardware to soak test the test case we could prove this properly.
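
For anyone wanting to repeat the soak test elsewhere, a harness of this kind
can be sketched in a few lines. This is not the retry.py used above; the
function names are hypothetical, and it assumes the negative codes in the
reports follow the Python subprocess convention of "killed by signal N" (on
Linux, 6 is SIGABRT and 11 is SIGSEGV).

```python
import signal
import subprocess
from collections import Counter


def describe(returncode):
    """Label a return code; negative values mean 'killed by signal N'."""
    if returncode < 0:
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return "pass" if returncode == 0 else f"exit {returncode}"


def soak(cmd, runs):
    """Run cmd `runs` times and count how often each return code occurs."""
    tally = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        tally[result.returncode] += 1
    return tally


def report(tally):
    """Format the tally as one line per return code, most frequent first."""
    total = sum(tally.values())
    return [
        f"{code} ({describe(code)}): {count} times "
        f"({100.0 * count / total:.2f}%)"
        for code, count in sorted(tally.items(), key=lambda kv: -kv[1])
    ]
```

Calling soak() with the qemu-ppc64abi32 command line shown earlier as a list
of arguments, then printing report() of the result, would give a breakdown
similar in shape to the ones above, without the timing statistics.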

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 .travis.yml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index ab429500fc..6695c0620f 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -350,9 +350,10 @@ jobs:
     # Run check-tcg against linux-user (with plugins)
     # we skip sparc64-linux-user until it has been fixed somewhat
     # we skip cris-linux-user as it doesn't use the common run loop
+    # we skip ppc64abi32-linux-user as it seems to have a broken libc
     - name: "GCC plugins check-tcg (user)"
       env:
-        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user"
+        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user,ppc64abi32-linux-user"
         - TEST_BUILD_CMD="make build-tcg"
         - TEST_CMD="make check-tcg"
         - CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"
-- 
2.20.1


Re: [PATCH] .travis.yml: skip ppc64abi32-linux-user with plugins
Posted by Philippe Mathieu-Daudé 3 years, 9 months ago
On 7/14/20 7:55 PM, Alex Bennée wrote:
> We actually see failures on threadcount running without plugins:
> 
>   retry.py -n 1000 -c -- \
>     ./ppc64abi32-linux-user/qemu-ppc64abi32 \
>     ./tests/tcg/ppc64abi32-linux-user/threadcount
> 
> which reports:
> 
>   0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
>   -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
>   -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
>   Ran command 1000 times, 978 passes
> 
> But when running with plugins we hit the failure a lot more often:
> 
>   0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
>   -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
>   Ran command 100 times, 91 passes
> 
> The crash occurs in guest code which is the same in both pass and fail
> cases. However, we see various messages reported on the console about
> corrupted memory lists, which seem to imply the guest memory allocation
> is corrupted. This lines up with the segfault being in the guest
> __libc_free function. So we think this is a guest bug which is
> exacerbated by various modes of translation. If anyone has access to
> real hardware to soak test the test case we could prove this properly.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> ---
>  .travis.yml | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index ab429500fc..6695c0620f 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -350,9 +350,10 @@ jobs:
>      # Run check-tcg against linux-user (with plugins)
>      # we skip sparc64-linux-user until it has been fixed somewhat
>      # we skip cris-linux-user as it doesn't use the common run loop
> +    # we skip ppc64abi32-linux-user as it seems to have a broken libc
>      - name: "GCC plugins check-tcg (user)"
>        env:
> -        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user"
> +        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user,ppc64abi32-linux-user"
>          - TEST_BUILD_CMD="make build-tcg"
>          - TEST_CMD="make check-tcg"
>          - CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"
> 


Re: [PATCH] .travis.yml: skip ppc64abi32-linux-user with plugins
Posted by David Gibson 3 years, 9 months ago
On Tue, Jul 14, 2020 at 06:55:16PM +0100, Alex Bennée wrote:
> We actually see failures on threadcount running without plugins:
> 
>   retry.py -n 1000 -c -- \
>     ./ppc64abi32-linux-user/qemu-ppc64abi32 \
>     ./tests/tcg/ppc64abi32-linux-user/threadcount
> 
> which reports:
> 
>   0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
>   -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
>   -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
>   Ran command 1000 times, 978 passes
> 
> But when running with plugins we hit the failure a lot more often:
> 
>   0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
>   -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
>   Ran command 100 times, 91 passes
> 
> The crash occurs in guest code which is the same in both pass and fail
> cases. However, we see various messages reported on the console about
> corrupted memory lists, which seem to imply the guest memory allocation
> is corrupted. This lines up with the segfault being in the guest
> __libc_free function. So we think this is a guest bug which is
> exacerbated by various modes of translation. If anyone has access to
> real hardware to soak test the test case we could prove this properly.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>

Acked-by: David Gibson <david@gibson.dropbear.id.au>

Honestly, AFAICT the ppc64abi32-linux-user target is pretty much
entirely broken anyway.  Many things about it appear to make no
sense, it's difficult to work out what it's even supposed to be, and I
strongly suspect no-one's actually used it in like a decade.

> ---
>  .travis.yml | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index ab429500fc..6695c0620f 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -350,9 +350,10 @@ jobs:
>      # Run check-tcg against linux-user (with plugins)
>      # we skip sparc64-linux-user until it has been fixed somewhat
>      # we skip cris-linux-user as it doesn't use the common run loop
> +    # we skip ppc64abi32-linux-user as it seems to have a broken libc
>      - name: "GCC plugins check-tcg (user)"
>        env:
> -        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user"
> +        - CONFIG="--disable-system --enable-plugins --enable-debug-tcg --target-list-exclude=sparc64-linux-user,cris-linux-user,ppc64abi32-linux-user"
>          - TEST_BUILD_CMD="make build-tcg"
>          - TEST_CMD="make check-tcg"
>          - CACHE_NAME="${TRAVIS_BRANCH}-linux-gcc-debug-tcg"

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

Re: [PATCH] .travis.yml: skip ppc64abi32-linux-user with plugins
Posted by Alex Bennée 3 years, 9 months ago
David Gibson <david@gibson.dropbear.id.au> writes:

> On Tue, Jul 14, 2020 at 06:55:16PM +0100, Alex Bennée wrote:
>> We actually see failures on threadcount running without plugins:
>> 
>>   retry.py -n 1000 -c -- \
>>     ./ppc64abi32-linux-user/qemu-ppc64abi32 \
>>     ./tests/tcg/ppc64abi32-linux-user/threadcount
>> 
>> which reports:
>> 
>>   0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
>>   -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
>>   -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
>>   Ran command 1000 times, 978 passes
>> 
>> But when running with plugins we hit the failure a lot more often:
>> 
>>   0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
>>   -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
>>   Ran command 100 times, 91 passes
>> 
>> The crash occurs in guest code which is the same in both pass and fail
>> cases. However, we see various messages reported on the console about
>> corrupted memory lists, which seem to imply the guest memory allocation
>> is corrupted. This lines up with the segfault being in the guest
>> __libc_free function. So we think this is a guest bug which is
>> exacerbated by various modes of translation. If anyone has access to
>> real hardware to soak test the test case we could prove this properly.
>> 
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Cc: David Gibson <david@gibson.dropbear.id.au>
>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>
>
> Honestly, AFAICT the ppc64abi32-linux-user target is pretty much
> entirely broken anyway.  Many things about it appear to make no
> sense, it's difficult to work out what it's even supposed to be, and I
> strongly suspect no-one's actually used it in like a decade.

Should we think about marking it deprecated for 5.2?

-- 
Alex Bennée

Re: [PATCH] .travis.yml: skip ppc64abi32-linux-user with plugins
Posted by David Gibson 3 years, 9 months ago
On Wed, Jul 15, 2020 at 09:02:05AM +0100, Alex Bennée wrote:
> 
> David Gibson <david@gibson.dropbear.id.au> writes:
> 
> > On Tue, Jul 14, 2020 at 06:55:16PM +0100, Alex Bennée wrote:
> >> We actually see failures on threadcount running without plugins:
> >> 
> >>   retry.py -n 1000 -c -- \
> >>     ./ppc64abi32-linux-user/qemu-ppc64abi32 \
> >>     ./tests/tcg/ppc64abi32-linux-user/threadcount
> >> 
> >> which reports:
> >> 
> >>   0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
> >>   -6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
> >>   -11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
> >>   Ran command 1000 times, 978 passes
> >> 
> >> But when running with plugins we hit the failure a lot more often:
> >> 
> >>   0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
> >>   -11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
> >>   Ran command 100 times, 91 passes
> >> 
> >> The crash occurs in guest code which is the same in both pass and fail
> >> cases. However, we see various messages reported on the console about
> >> corrupted memory lists, which seem to imply the guest memory allocation
> >> is corrupted. This lines up with the segfault being in the guest
> >> __libc_free function. So we think this is a guest bug which is
> >> exacerbated by various modes of translation. If anyone has access to
> >> real hardware to soak test the test case we could prove this properly.
> >> 
> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> Cc: David Gibson <david@gibson.dropbear.id.au>
> >> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> >
> > Acked-by: David Gibson <david@gibson.dropbear.id.au>
> >
> > Honestly, AFAICT the ppc64abi32-linux-user target is pretty much
> > entirely broken anyway.  Many things about it appear to make no
> > sense, it's difficult to work out what it's even supposed to be, and I
> > strongly suspect no-one's actually used it in like a decade.
> 
> Should we think about marking it deprecated for 5.2?

Yes, probably.  I just haven't gotten around to it.


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson