[Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu

Alex Bennée posted 4 patches 5 years, 3 months ago
Test checkpatch passed
Test asan failed
Test docker-mingw@fedora failed
Test docker-quick@centos7 failed
Test docker-clang@ubuntu failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20181217150116.10446-1-alex.bennee@linaro.org
accel/tcg/Makefile.objs      |   1 +
accel/tcg/cputlb.c           |  63 +----
accel/tcg/cputlb.h           |  21 ++
accel/tcg/softmmu.c          | 452 +++++++++++++++++++++++++++++++++++
accel/tcg/softmmu_template.h | 446 ----------------------------------
5 files changed, 485 insertions(+), 498 deletions(-)
create mode 100644 accel/tcg/cputlb.h
create mode 100644 accel/tcg/softmmu.c
delete mode 100644 accel/tcg/softmmu_template.h
[Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu
Posted by Alex Bennée 5 years, 3 months ago
Hi,

This is a re-based re-spin of my RFC series with a few minor updates
thanks to updates to the softmmu code. The series is attempting to
achieve the same results we did with softfloat by replacing our nest of
macro expansions with some common helpers. By using __flatten__ to force
the whole function to be un-rolled the compiler can then dead-code away
parts of the common function not used in each particular instantiated
helper.

This is a win from a debugging point of view - no more "where is
helper_le_lduw_mmu defined" questions. I think it will also make
eventual instrumentation easier because there is exactly one slow path
load and store function to instrument.

Although the individual functions are now bigger (__flatten__ stops the
compiler ignoring inline if it wants to) the size of the resulting
binary is slightly smaller!

original - 73027224 bytes demacro - 66913848 bytes

Unfortunately in my simple boot test I see a slight performance
degradation:

original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13
deviation) demacro: 10 times (100.00%), avg time 5.760 (0.08
varience/0.29 deviation)

Emilio,

Any chance you could run this through your more comprehensive benchmark
suite?

Alex Bennée (4):
  accel/tcg: export some cputlb functions
  accel/tcg: introduce softmmu.c
  accel/tcg: use TLB helpers from softmmu.o
  accel/tcg: remove softmmu_template.h

 accel/tcg/Makefile.objs      |   1 +
 accel/tcg/cputlb.c           |  63 +----
 accel/tcg/cputlb.h           |  21 ++
 accel/tcg/softmmu.c          | 452 +++++++++++++++++++++++++++++++++++
 accel/tcg/softmmu_template.h | 446 ----------------------------------
 5 files changed, 485 insertions(+), 498 deletions(-)
 create mode 100644 accel/tcg/cputlb.h
 create mode 100644 accel/tcg/softmmu.c
 delete mode 100644 accel/tcg/softmmu_template.h

-- 
2.17.1


Re: [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu
Posted by Alex Bennée 5 years, 3 months ago
Alex Bennée <alex.bennee@linaro.org> writes:

> Hi,
>
<snip>
>
> Unfortunately in my simple boot test I see a slight performance
> degradation:
>
> original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation)
> demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation)

Moving stuff back into cputlb seems to help:

 10 times (100.00%), avg time 5.583 (0.03 varience/0.17 deviation)

>
> Emilio,
>
> Any chance you could run this through your more comprehensive benchmark
> suite?
>
> Alex Bennée (4):
>   accel/tcg: export some cputlb functions
>   accel/tcg: introduce softmmu.c
>   accel/tcg: use TLB helpers from softmmu.o
>   accel/tcg: remove softmmu_template.h
>
>  accel/tcg/Makefile.objs      |   1 +
>  accel/tcg/cputlb.c           |  63 +----
>  accel/tcg/cputlb.h           |  21 ++
>  accel/tcg/softmmu.c          | 452 +++++++++++++++++++++++++++++++++++
>  accel/tcg/softmmu_template.h | 446 ----------------------------------
>  5 files changed, 485 insertions(+), 498 deletions(-)
>  create mode 100644 accel/tcg/cputlb.h
>  create mode 100644 accel/tcg/softmmu.c
>  delete mode 100644 accel/tcg/softmmu_template.h


--
Alex Bennée

Re: [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu
Posted by Alex Bennée 5 years, 3 months ago
Alex Bennée <alex.bennee@linaro.org> writes:

> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> Hi,
>>
> <snip>
>>
>> Unfortunately in my simple boot test I see a slight performance
>> degradation:
>>
>> original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation)
>> demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation)
>
> Moving stuff back into cputlb seems to help:
>
>  10 times (100.00%), avg time 5.583 (0.03 varience/0.17 deviation)

See:

  https://github.com/stsquad/qemu/tree/ldst/demacrofy-v2

Which:

  - keeps everything in cputlb (dropping the externs)
  - factors out unaligned handling
  - uses __always_inline__ instead of __flatten__

>
>>
>> Emilio,
>>
>> Any chance you could run this through your more comprehensive benchmark
>> suite?
>>
>> Alex Bennée (4):
>>   accel/tcg: export some cputlb functions
>>   accel/tcg: introduce softmmu.c
>>   accel/tcg: use TLB helpers from softmmu.o
>>   accel/tcg: remove softmmu_template.h
>>
>>  accel/tcg/Makefile.objs      |   1 +
>>  accel/tcg/cputlb.c           |  63 +----
>>  accel/tcg/cputlb.h           |  21 ++
>>  accel/tcg/softmmu.c          | 452 +++++++++++++++++++++++++++++++++++
>>  accel/tcg/softmmu_template.h | 446 ----------------------------------
>>  5 files changed, 485 insertions(+), 498 deletions(-)
>>  create mode 100644 accel/tcg/cputlb.h
>>  create mode 100644 accel/tcg/softmmu.c
>>  delete mode 100644 accel/tcg/softmmu_template.h


--
Alex Bennée

Re: [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu
Posted by Emilio G. Cota 5 years, 3 months ago
On Mon, Dec 17, 2018 at 16:15:10 +0000, Alex Bennée wrote:
> 
> Alex Bennée <alex.bennee@linaro.org> writes:
> 
> > Hi,
> >
> <snip>
> >
> > Unfortunately in my simple boot test I see a slight performance
> > degradation:
> >
> > original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation)
> > demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation)
> 
> Moving stuff back into cputlb seems to help:
> 
>  10 times (100.00%), avg time 5.583 (0.03 varience/0.17 deviation)

Yes, I'd move it there. Also playing with attr __noinline__
for the slow paths like we did for hardfloat should help.

> > Emilio,
> >
> > Any chance you could run this through your more comprehensive benchmark
> > suite?

Sure, please give me a few days (got a paper submission deadline
on Wed).

Do you have a branch I can pull from?

Thanks,

		Emilio

Re: [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu
Posted by no-reply@patchew.org 5 years, 3 months ago
Patchew URL: https://patchew.org/QEMU/20181217150116.10446-1-alex.bennee@linaro.org/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-mingw@fedora SHOW_ENV=1 J=8
=== TEST SCRIPT END ===

  CC      x86_64-softmmu/target/i386/cpu.o
  CC      x86_64-softmmu/target/i386/gdbstub.o
  CC      aarch64-softmmu/hw/block/virtio-blk.o
/tmp/qemu-test/src/accel/tcg/softmmu.c:207:1: error: conflicting types for 'helper_le_ldq_mmu'
 helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 ^~~~~~~~~~~~~~~~~
In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0,
---
/tmp/qemu-test/src/tcg/tcg.h:1307:10: note: previous declaration of 'helper_le_ldq_mmu' was here
 uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
          ^~~~~~~~~~~~~~~~~
/tmp/qemu-test/src/accel/tcg/softmmu.c:214:1: error: conflicting types for 'helper_be_ldq_mmu'
 helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 ^~~~~~~~~~~~~~~~~
In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0,
---
  CC      aarch64-softmmu/hw/misc/omap_tap.o
  CC      aarch64-softmmu/hw/misc/bcm2835_mbox.o
  CC      aarch64-softmmu/hw/misc/bcm2835_property.o
/tmp/qemu-test/src/accel/tcg/softmmu.c:207:1: error: conflicting types for 'helper_le_ldq_mmu'
 helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 ^~~~~~~~~~~~~~~~~
In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0,
---
/tmp/qemu-test/src/tcg/tcg.h:1307:10: note: previous declaration of 'helper_le_ldq_mmu' was here
 uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
          ^~~~~~~~~~~~~~~~~
/tmp/qemu-test/src/accel/tcg/softmmu.c:214:1: error: conflicting types for 'helper_be_ldq_mmu'
 helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 ^~~~~~~~~~~~~~~~~
In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0,


The full log is available at
http://patchew.org/logs/20181217150116.10446-1-alex.bennee@linaro.org/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com