[v1] PowerPC interrupt rework

[RFC PATCH 00/13] PowerPC interrupt rework

Posted by Matheus Ferst 3 years, 5 months ago

Currently, PowerPC interrupts are handled as follows:

1) The CPU_INTERRUPT_HARD bit of cs->interrupt_request gates all
   interrupts;
2) The bits of env->pending_interrupts identify which particular
   interrupt is raised;
3) ppc_set_irq can be used to set/clear env->pending_interrupt bit and
   CPU_INTERRUPT_HARD, but some places access env->pending_interrupt
   directly;
4) ppc_cpu_exec_interrupt is called by cpu_handle_interrupt when
   cs->interrupt_request indicates that there is some interrupt pending.
   This method checks CPU_INTERRUPT_HARD and calls ppc_hw_interrupt. If
   env->pending_interrupt is zero after this call, CPU_INTERRUPT_HARD
   will be cleared.
5) ppc_hw_interrupt checks if there is any unmasked interrupt and calls
   powerpc_excp with the appropriate POWERPC_EXCP_* value. The method
   will also reset the corresponding bit in env->pending_interrupt for
   interrupts that clear on delivery.

If all pending interrupts are masked, CPU_INTERRUPT_HARD will be set,
but ppc_hw_interrupt will not deliver or clear the interrupt, so
CPU_INTERRUPT_HARD will not be reset by ppc_cpu_exec_interrupt. With
that, cs->has_work keeps returning true, creating a loop that acquires
and release qemu_mutex_lock_iothread, causing the poor performance
reported in [1].

This patch series attempts to rework the PowerPC interrupt code to set
CPU_INTERRUPT_HARD only when there are unmasked interrupts. Then
cs->has_work can be simplified to a check of CPU_INTERRUPT_HARD, so it
also only returns true when at least one interrupt can be delivered.

To achieve that, we are basically following Alex Bannée's suggestion[2]
in the original thread: the interrupt masking logic will be factored
out of ppc_hw_interrupt in a new method, ppc_pending_interrupts. This
method is then used to decide if CPU_INTERRUPT_HARD should be set or
cleared after changes to MSR, LPCR, env->pending_interrupts, and
power-management instructions.

We used [3] to check for regressions at each patch in this series. After
patch 12, booting a powernv machine with a newer skiboot with "-smp 4"
goes from 1m09s to 20.79s.

[1] https://lists.gnu.org/archive/html/qemu-ppc/2022-06/msg00336.html
[2] https://lists.gnu.org/archive/html/qemu-ppc/2022-06/msg00372.html
[3] https://github.com/legoater/qemu-ppc-boot

Matheus Ferst (13):
  target/ppc: define PPC_INTERRUPT_* values directly
  target/ppc: always use ppc_set_irq to set env->pending_interrupts
  target/ppc: move interrupt masking out of ppc_hw_interrupt
  target/ppc: prepare to split ppc_interrupt_pending by excp_model
  target/ppc: create a interrupt masking method for POWER9/POWER10
  target/ppc: remove embedded interrupts from ppc_pending_interrupt_p9
  target/ppc: create a interrupt masking method for POWER8
  target/ppc: remove unused interrupts from ppc_pending_interrupt_p8
  target/ppc: create a interrupt masking method for POWER7
  target/ppc: remove unused interrupts from ppc_pending_interrupt_p7
  target/ppc: remove ppc_store_lpcr from CONFIG_USER_ONLY builds
  target/ppc: introduce ppc_maybe_interrupt
  target/ppc: unify cpu->has_work based on cs->interrupt_request

 hw/ppc/ppc.c             |  17 +-
 hw/ppc/trace-events      |   2 +-
 target/ppc/cpu.c         |   2 +
 target/ppc/cpu.h         |  43 +--
 target/ppc/cpu_init.c    | 212 +------------
 target/ppc/excp_helper.c | 651 ++++++++++++++++++++++++++++++++-------
 target/ppc/helper_regs.c |   2 +
 target/ppc/misc_helper.c |  11 +-
 target/ppc/translate.c   |   8 +-
 9 files changed, 580 insertions(+), 368 deletions(-)

-- 
2.25.1

Re: [RFC PATCH 00/13] PowerPC interrupt rework

Posted by Cédric Le Goater 3 years, 5 months ago

[ Adding Fabiano who reworked all exception models for 7.0 and Nick
   who rewrote the Linux side sometime ago ]

On 8/15/22 18:20, Matheus Ferst wrote:
> Currently, PowerPC interrupts are handled as follows:
> 
> 1) The CPU_INTERRUPT_HARD bit of cs->interrupt_request gates all
>     interrupts;
> 2) The bits of env->pending_interrupts identify which particular
>     interrupt is raised;
> 3) ppc_set_irq can be used to set/clear env->pending_interrupt bit and
>     CPU_INTERRUPT_HARD, but some places access env->pending_interrupt
>     directly;
> 4) ppc_cpu_exec_interrupt is called by cpu_handle_interrupt when
>     cs->interrupt_request indicates that there is some interrupt pending.
>     This method checks CPU_INTERRUPT_HARD and calls ppc_hw_interrupt. If
>     env->pending_interrupt is zero after this call, CPU_INTERRUPT_HARD
>     will be cleared.
> 5) ppc_hw_interrupt checks if there is any unmasked interrupt and calls
>     powerpc_excp with the appropriate POWERPC_EXCP_* value. The method
>     will also reset the corresponding bit in env->pending_interrupt for
>     interrupts that clear on delivery.
> 
> If all pending interrupts are masked, CPU_INTERRUPT_HARD will be set,
> but ppc_hw_interrupt will not deliver or clear the interrupt, so
> CPU_INTERRUPT_HARD will not be reset by ppc_cpu_exec_interrupt. With
> that, cs->has_work keeps returning true, creating a loop that acquires
> and release qemu_mutex_lock_iothread, causing the poor performance
> reported in [1].
> 
> This patch series attempts to rework the PowerPC interrupt code to set
> CPU_INTERRUPT_HARD only when there are unmasked interrupts. Then
> cs->has_work can be simplified to a check of CPU_INTERRUPT_HARD, so it
> also only returns true when at least one interrupt can be delivered.
> 
> To achieve that, we are basically following Alex Bannée's suggestion[2]
> in the original thread: the interrupt masking logic will be factored
> out of ppc_hw_interrupt in a new method, ppc_pending_interrupts. This
> method is then used to decide if CPU_INTERRUPT_HARD should be set or
> cleared after changes to MSR, LPCR, env->pending_interrupts, and
> power-management instructions.
> 
> We used [3] to check for regressions at each patch in this series. After
> patch 12, booting a powernv machine with a newer skiboot with "-smp 4"
> goes from 1m09s to 20.79s.

whaou ! PowerNV is really an heavy weight platform, so that's a great
improvement.

Did you try KVM guests under PowerNV (L1 under an emulated L0) and KVM
under pseries (L2 under an emulated L1) ? Try some intensive I/O on a
SMP machine, like a large scp transfer.

We should try the MacOS images also.

Thanks,

C.


> 
> [1] https://lists.gnu.org/archive/html/qemu-ppc/2022-06/msg00336.html
> [2] https://lists.gnu.org/archive/html/qemu-ppc/2022-06/msg00372.html
> [3] https://github.com/legoater/qemu-ppc-boot
> 
> Matheus Ferst (13):
>    target/ppc: define PPC_INTERRUPT_* values directly
>    target/ppc: always use ppc_set_irq to set env->pending_interrupts
>    target/ppc: move interrupt masking out of ppc_hw_interrupt
>    target/ppc: prepare to split ppc_interrupt_pending by excp_model
>    target/ppc: create a interrupt masking method for POWER9/POWER10
>    target/ppc: remove embedded interrupts from ppc_pending_interrupt_p9
>    target/ppc: create a interrupt masking method for POWER8
>    target/ppc: remove unused interrupts from ppc_pending_interrupt_p8
>    target/ppc: create a interrupt masking method for POWER7
>    target/ppc: remove unused interrupts from ppc_pending_interrupt_p7
>    target/ppc: remove ppc_store_lpcr from CONFIG_USER_ONLY builds
>    target/ppc: introduce ppc_maybe_interrupt
>    target/ppc: unify cpu->has_work based on cs->interrupt_request
> 
>   hw/ppc/ppc.c             |  17 +-
>   hw/ppc/trace-events      |   2 +-
>   target/ppc/cpu.c         |   2 +
>   target/ppc/cpu.h         |  43 +--
>   target/ppc/cpu_init.c    | 212 +------------
>   target/ppc/excp_helper.c | 651 ++++++++++++++++++++++++++++++++-------
>   target/ppc/helper_regs.c |   2 +
>   target/ppc/misc_helper.c |  11 +-
>   target/ppc/translate.c   |   8 +-
>   9 files changed, 580 insertions(+), 368 deletions(-)
>

Re: [RFC PATCH 00/13] PowerPC interrupt rework

Posted by Matheus K. Ferst 3 years, 5 months ago

On 15/08/2022 17:02, Cédric Le Goater wrote:
> [ Adding Fabiano who reworked all exception models for 7.0 and Nick
>    who rewrote the Linux side sometime ago ]
> 
> On 8/15/22 18:20, Matheus Ferst wrote:
>> Currently, PowerPC interrupts are handled as follows:
>>
>> 1) The CPU_INTERRUPT_HARD bit of cs->interrupt_request gates all
>>     interrupts;
>> 2) The bits of env->pending_interrupts identify which particular
>>     interrupt is raised;
>> 3) ppc_set_irq can be used to set/clear env->pending_interrupt bit and
>>     CPU_INTERRUPT_HARD, but some places access env->pending_interrupt
>>     directly;
>> 4) ppc_cpu_exec_interrupt is called by cpu_handle_interrupt when
>>     cs->interrupt_request indicates that there is some interrupt pending.
>>     This method checks CPU_INTERRUPT_HARD and calls ppc_hw_interrupt. If
>>     env->pending_interrupt is zero after this call, CPU_INTERRUPT_HARD
>>     will be cleared.
>> 5) ppc_hw_interrupt checks if there is any unmasked interrupt and calls
>>     powerpc_excp with the appropriate POWERPC_EXCP_* value. The method
>>     will also reset the corresponding bit in env->pending_interrupt for
>>     interrupts that clear on delivery.
>>
>> If all pending interrupts are masked, CPU_INTERRUPT_HARD will be set,
>> but ppc_hw_interrupt will not deliver or clear the interrupt, so
>> CPU_INTERRUPT_HARD will not be reset by ppc_cpu_exec_interrupt. With
>> that, cs->has_work keeps returning true, creating a loop that acquires
>> and release qemu_mutex_lock_iothread, causing the poor performance
>> reported in [1].
>>
>> This patch series attempts to rework the PowerPC interrupt code to set
>> CPU_INTERRUPT_HARD only when there are unmasked interrupts. Then
>> cs->has_work can be simplified to a check of CPU_INTERRUPT_HARD, so it
>> also only returns true when at least one interrupt can be delivered.
>>
>> To achieve that, we are basically following Alex Bannée's suggestion[2]
>> in the original thread: the interrupt masking logic will be factored
>> out of ppc_hw_interrupt in a new method, ppc_pending_interrupts. This
>> method is then used to decide if CPU_INTERRUPT_HARD should be set or
>> cleared after changes to MSR, LPCR, env->pending_interrupts, and
>> power-management instructions.
>>
>> We used [3] to check for regressions at each patch in this series. After
>> patch 12, booting a powernv machine with a newer skiboot with "-smp 4"
>> goes from 1m09s to 20.79s.
> 
> whaou ! PowerNV is really an heavy weight platform, so that's a great
> improvement.
> 

Note that this result uses Frederic's test case, where one CPU 
decompresses the kernel and the others keep spinning to deliver a masked 
decrement interrupt. The improvement may not be that great with other 
workloads.

> Did you try KVM guests under PowerNV (L1 under an emulated L0) and KVM
> under pseries (L2 under an emulated L1) ? Try some intensive I/O on a
> SMP machine, like a large scp transfer.
> 

So far, I have mainly tested with buildroot boot+poweroff. I'll try 
these other tests too.

> We should try the MacOS images also.
> 
> Thanks,
> 
> C.

Unfortunately, I can't test with MacOS :/

Thanks,
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Analista de Software
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>