A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
ids, which happen to be numerically equal in general, but are really
different entities that can diverge in some setups. This was causing
QEMU to misconfigure XIVE and to crash the guest.

The confusion comes from XICS actually. Interrupt presenters in XICS
are identified by a "server number" which is a 1:1 mapping to vCPU
ids. The range of these "server numbers" is exposed to the guest in
the "ibm,interrupt-server-ranges" property. A xics_max_server_number()
helper was introduced at some point to compute the upper limit of the
range. When XIVE was added, commit 1a518e7693c9 renamed the helper to
spapr_max_server_number(). It ended up being used to size a bunch of
things in XIVE that are per-vCPU, such as internal END tables or
IPI ranges presented to the guest. The problem is that the maximum
"server number" can be much higher (up to 8 times) than the actual
number of vCPUs when the VSMT mode doesn't match the number of threads
per core in the guest:

    DIV_ROUND_UP(ms->smp.max_cpus * spapr->vsmt, ms->smp.threads);

Since QEMU 4.2, the default behavior is to set spapr->vsmt to
ms->smp.threads. Setups with custom VSMT settings will configure XIVE
to use more HW resources than needed. This is a bit unfortunate but
not extremely harmful, unless maybe if a lot of guests are running on
the host. The sizing of the IPI range is more problematic though
as it eventually led to [1].

This series first does some renaming to make it clear when we're
dealing with vCPU ids. It then fixes the machine code to pass
smp.max_cpus to XIVE where appropriate. Since these changes are
guest/migration visible, a machine property is added to keep the
existing behavior for older machine types. The series is thus based
on Connie's recent patch that introduces compat machines for
QEMU 6.0.
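To make the "up to 8 times" figure concrete, here is a minimal
stand-alone sketch of the DIV_ROUND_UP expression quoted above. It is
an illustration only, not the QEMU code: max_vcpu_ids() and
check_examples() are hypothetical names, the macro is written out
locally, and the guest with max_cpus = 32 is an assumed example.

```c
#include <assert.h>

/* Rounding-up integer division, written out locally for this sketch. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical stand-alone helper (not the QEMU function itself).
 * max_cpus, vsmt and threads stand in for ms->smp.max_cpus,
 * spapr->vsmt and ms->smp.threads.
 */
unsigned int max_vcpu_ids(unsigned int max_cpus, unsigned int vsmt,
                          unsigned int threads)
{
    return DIV_ROUND_UP(max_cpus * vsmt, threads);
}

/* Worked examples for an assumed guest with max_cpus = 32. */
void check_examples(void)
{
    /* Default since QEMU 4.2: vsmt == threads, limit equals max_cpus. */
    assert(max_vcpu_ids(32, 4, 4) == 32);
    /* Custom VSMT of 8 with 4 threads per core: twice max_cpus. */
    assert(max_vcpu_ids(32, 8, 4) == 64);
    /* Custom VSMT of 8 with 1 thread per core: 8 times max_cpus. */
    assert(max_vcpu_ids(32, 8, 1) == 256);
}
```

In all three cases the guest has at most 32 vCPUs, so anything that is
per-vCPU only needs 32 entries; sizing it with the vCPU id limit
instead over-allocates by the vsmt/threads ratio.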
Based-on: 20201109173928.1001764-1-cohuck@redhat.com

Note that we still use spapr_max_vcpu_ids() when activating the
in-kernel irqchip because this is what both XICS-on-XIVE and XIVE
KVM devices expect.

[1] https://bugs.launchpad.net/qemu/+bug/1900241

v2: - comments on v1 highlighted that problems mostly come from
      spapr_max_server_number() which got misused over the years.
      Updated the cover letter accordingly.
    - completely new approach. Instead of messing with device properties,
      pass the appropriate values to the IC backend handlers.
    - rename a few things using the "max_vcpu_ids" wording instead of
      "nr_servers" and "max_server_number"

Greg Kurz (3):
  spapr: Improve naming of some vCPU id related items
  spapr/xive: Fix size of END table and number of claimed IPIs
  spapr/xive: Fix the "ibm,xive-lisn-ranges" property

 include/hw/ppc/spapr.h      |  3 ++-
 include/hw/ppc/spapr_irq.h  | 12 ++++++------
 include/hw/ppc/spapr_xive.h |  2 +-
 include/hw/ppc/xics_spapr.h |  2 +-
 hw/intc/spapr_xive.c        |  9 +++++----
 hw/intc/spapr_xive_kvm.c    |  4 ++--
 hw/intc/xics_kvm.c          |  4 ++--
 hw/intc/xics_spapr.c        | 11 ++++++-----
 hw/ppc/spapr.c              | 12 ++++++++----
 hw/ppc/spapr_irq.c          | 34 ++++++++++++++++++++++++----------
 10 files changed, 57 insertions(+), 36 deletions(-)

--
2.26.2

On 11/30/20 5:52 PM, Greg Kurz wrote:
> A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
> RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
> ids, which happen to be numerically equal in general, but are really
> different entities that can diverge in some setups. This was causing
> QEMU to misconfigure XIVE and to crash the guest.
>
> The confusion comes from XICS actually. Interrupt presenters in XICS
> are identified by a "server number" which is a 1:1 mapping to vCPU
> ids. The range of these "server numbers" is exposed to the guest in
> the "ibm,interrupt-server-ranges" property. A xics_max_server_number()
> helper was introduced at some point to compute the upper limit of the
> range. When XIVE was added, commit 1a518e7693c9 renamed the helper to
> spapr_max_server_number(). It ended up being used to size a bunch of
> things in XIVE that are per-vCPU, such as internal END tables or
> IPI ranges presented to the guest. The problem is that the maximum
> "server number" can be much higher (up to 8 times) than the actual
> number of vCPUs when the VSMT mode doesn't match the number of threads
> per core in the guest:
>
>     DIV_ROUND_UP(ms->smp.max_cpus * spapr->vsmt, ms->smp.threads);
>
> Since QEMU 4.2, the default behavior is to set spapr->vsmt to
> ms->smp.threads. Setups with custom VSMT settings will configure XIVE
> to use more HW resources than needed. This is a bit unfortunate but
> not extremely harmful,

Indeed. The default usage case (without vsmt) has no impact since
it does not fragment the XIVE VP space more than needed.

> unless maybe if a lot of guests are running on the host.

We can run 4K (-2) KVM guests today on a P9 system. To reach the
internal limits, each should have 32 vCPUs. It's possible with a
lot of RAM but it's not a common scenario.

C.

> The sizing of the IPI range is more problematic though
> as it eventually led to [1].
>
> This series first does some renaming to make it clear when we're
> dealing with vCPU ids. It then fixes the machine code to pass
> smp.max_cpus to XIVE where appropriate. Since these changes are
> guest/migration visible, a machine property is added to keep the
> existing behavior for older machine types. The series is thus based
> on Connie's recent patch that introduces compat machines for
> QEMU 6.0.
>
> Based-on: 20201109173928.1001764-1-cohuck@redhat.com
>
> Note that we still use spapr_max_vcpu_ids() when activating the
> in-kernel irqchip because this is what both XICS-on-XIVE and XIVE
> KVM devices expect.
>
> [1] https://bugs.launchpad.net/qemu/+bug/1900241
>
> v2: - comments on v1 highlighted that problems mostly come from
>       spapr_max_server_number() which got misused over the years.
>       Updated the cover letter accordingly.
>     - completely new approach. Instead of messing with device properties,
>       pass the appropriate values to the IC backend handlers.
>     - rename a few things using the "max_vcpu_ids" wording instead of
>       "nr_servers" and "max_server_number"
>
> Greg Kurz (3):
>   spapr: Improve naming of some vCPU id related items
>   spapr/xive: Fix size of END table and number of claimed IPIs
>   spapr/xive: Fix the "ibm,xive-lisn-ranges" property
>
>  include/hw/ppc/spapr.h      |  3 ++-
>  include/hw/ppc/spapr_irq.h  | 12 ++++++------
>  include/hw/ppc/spapr_xive.h |  2 +-
>  include/hw/ppc/xics_spapr.h |  2 +-
>  hw/intc/spapr_xive.c        |  9 +++++----
>  hw/intc/spapr_xive_kvm.c    |  4 ++--
>  hw/intc/xics_kvm.c          |  4 ++--
>  hw/intc/xics_spapr.c        | 11 ++++++-----
>  hw/ppc/spapr.c              | 12 ++++++++----
>  hw/ppc/spapr_irq.c          | 34 ++++++++++++++++++++++++----------
>  10 files changed, 57 insertions(+), 36 deletions(-)
>