On Fri, Nov 23, 2018 at 15:45:21 +0100, Richard Henderson wrote:
> This includes everything queued so far -- softmmu out-of-line
> patches
Reviewed-by: Emilio G. Cota <cota@braap.org>
for patches 1-9.
I am sad to report that on a Skylake host, this series gives
a ~10% average slowdown for x86_64-softmmu SPEC06int
(I'm reporting speedup, so <1 means slowdown):
https://imgur.com/a/25iu8Yl
Turns out that despite the higher icache hit rate, the IPC
ends up being lower. For instance, here are perf counts from
running hmmer three times in a row right after bootup (bootup
is included in the counts):
- Before:
249,392,070,159 cycles
781,327,593,681 instructions # 3.13 insn per cycle
85,914,418,873 branches
242,572,820 branch-misses # 0.28% of all branches
1,567,954,032 L1-icache-load-misses
70.559864567 seconds time elapsed
- After:
277,806,651,701 cycles
813,619,725,225 instructions # 2.93 insn per cycle
132,453,633,831 branches
306,969,989 branch-misses # 0.23% of all branches
1,250,619,057 L1-icache-load-misses
78.420517079 seconds time elapsed
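For reference, counters like the above can be gathered with perf stat;
a sketch of the kind of invocation involved (the qemu binary path, guest
image, and boot options are placeholders, not the exact command used):

```shell
# Count the events reported above across a full qemu-system run.
# Paths and guest options below are illustrative placeholders.
perf stat -e cycles,instructions,branches,branch-misses,L1-icache-load-misses \
    ./x86_64-softmmu/qemu-system-x86_64 -nographic \
    -drive file=guest.img,format=raw
```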
On the bright side, on an older system (Sandy Bridge) I get
a fairly neutral average perf impact, with some workloads
speeding up and others slowing down:
https://imgur.com/a/AokDbkm
(Note that v1 of this series gave an overall slowdown, so that's
progress.)
Given the above, perhaps the best way forward is to add a
configure flag to disable OOL thunks, unless you have any
further optimizations coming up.
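To be concrete, I'm imagining something along these lines (the flag
name below is hypothetical; no such option exists in configure today):

```shell
# Hypothetical build-time knob to fall back to inline softmmu helpers;
# --disable-tcg-ool-thunks is a made-up name for illustration only.
./configure --target-list=x86_64-softmmu --disable-tcg-ool-thunks
```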
Thanks,
Emilio