hw/acpi/piix4.c | 2 hw/char/virtio-serial-bus.c | 2 hw/core/bus.c | 11 -- hw/core/qdev.c | 15 ++- hw/pci/pci.c | 33 +++++++ hw/pci/pcie.c | 2 hw/pci/shpc.c | 2 hw/ppc/spapr.c | 205 ++++++++++++++++++++++++++++++++++++++++- hw/ppc/spapr_drc.c | 65 ++++++++++--- hw/ppc/spapr_events.c | 3 + hw/ppc/spapr_hcall.c | 20 ++++ hw/ppc/spapr_iommu.c | 22 +++- hw/ppc/spapr_pci.c | 86 +++++++++++++---- hw/s390x/css-bridge.c | 2 hw/s390x/s390-pci-bus.c | 6 + hw/scsi/virtio-scsi.c | 2 hw/scsi/vmw_pvscsi.c | 2 hw/usb/dev-smartcard-reader.c | 2 include/hw/compat.h | 3 + include/hw/pci-host/spapr.h | 9 +- include/hw/pci/pci.h | 3 + include/hw/ppc/spapr.h | 15 +++ include/hw/ppc/spapr_drc.h | 8 ++ include/hw/qdev-core.h | 4 - 24 files changed, 446 insertions(+), 78 deletions(-)
This series is based on patches from Michel Roth posted in 2015:
https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg04246.html
It addresses comments made during the RFC review, and also provides a bunch
of preliminary cleanup/fix patches. Since we have reached hard-freeze, this
feature is provided by a new pseries-2.11 machine type, introduced by this
series. It is based on David Gibson's ppc-for-2.10 branch, and I believe some
of the preliminary fixes are eligible for 2.10.
Note that it also requires an updated SLOF that supports a new private hcall:
KVMPPC_H_UPDATE_PHANDLE. This is needed because SLOF changes FDT phandles to
Open Firmware phandles. Since the guest only sees the latter, QEMU must use
the updated value when populating the FDT for the hotplugged PHB (otherwise
the guest can't setup IRQs for the PCI devices). SLOF part is already upstream:
http://git.qemu.org/?p=SLOF.git;h=604d28cc3f791657414f8b21103921fa0147fc63
With these patches we support the following:
(qemu) device_add spapr-pci-host-bridge,index=2,id=phb2
(qemu) device_add virtio-net-pci,id=hp2.0,bus=phb2.0
(qemu) device_del hp2.0
(qemu) device_del phb2
I could run some successful tests with a fedora25 guest:
- hotplug PHB + migrate + unplug PHB
- hotplug PHB + hotplug PCI device + unplug PHB => PCI device gets unplugged
- migrate before OS starts + hotplug PHB => destination uses OF phandles
- no regression observed with older machine types
All the patches are also available here:
https://github.com/gkurz/qemu/commits/spapr-hotplug-phb
Cheers,
--
Greg
---
Greg Kurz (14):
spapr: move spapr_create_phb() to core machine code
spapr_pci: use memory_region_add_subregion() with DMA windows
spapr_iommu: use g_strdup_printf() instead of snprintf()
spapr_drc: use g_strdup_printf() instead of snprintf()
spapr_iommu: convert TCE table object to realize()
spapr_pci: parent the MSI memory region to the PHB
spapr_drc: fix realize and unrealize
spapr_drc: add unrealize method to physical DRC class
spapr_iommu: unregister vmstate at unrealize time
spapr: add pseries-2.11 machine type
spapr_pci: introduce drc_id property
spapr: allow guest to update the XICS phandle
spapr_pci: drop abusive sanity check when migrating the LSI table
spapr: add hotplug hooks for PHB hotplug
Michael Roth (11):
spapr_drc: pass object ownership to parent/owner
spapr_iommu: pass object ownership to parent/owner
pci: allow cleanup/unregistration of PCI buses
qdev: store DeviceState's canonical path to use when unparenting
spapr_pci: add PHB unrealize
spapr: enable PHB hotplug for pseries-2.11
spapr: create DR connectors for PHBs
spapr_events: add support for phb hotplug events
qdev: pass an Object * to qbus_set_hotplug_handler()
spapr_pci: provide node start offset via spapr_populate_pci_dt()
spapr_pci: add ibm, my-drc-index property for PHB hotplug
Nathan Fontenot (1):
spapr: populate PHB DRC entries for root DT node
hw/acpi/piix4.c | 2
hw/char/virtio-serial-bus.c | 2
hw/core/bus.c | 11 --
hw/core/qdev.c | 15 ++-
hw/pci/pci.c | 33 +++++++
hw/pci/pcie.c | 2
hw/pci/shpc.c | 2
hw/ppc/spapr.c | 205 ++++++++++++++++++++++++++++++++++++++++-
hw/ppc/spapr_drc.c | 65 ++++++++++---
hw/ppc/spapr_events.c | 3 +
hw/ppc/spapr_hcall.c | 20 ++++
hw/ppc/spapr_iommu.c | 22 +++-
hw/ppc/spapr_pci.c | 86 +++++++++++++----
hw/s390x/css-bridge.c | 2
hw/s390x/s390-pci-bus.c | 6 +
hw/scsi/virtio-scsi.c | 2
hw/scsi/vmw_pvscsi.c | 2
hw/usb/dev-smartcard-reader.c | 2
include/hw/compat.h | 3 +
include/hw/pci-host/spapr.h | 9 +-
include/hw/pci/pci.h | 3 +
include/hw/ppc/spapr.h | 15 +++
include/hw/ppc/spapr_drc.h | 8 ++
include/hw/qdev-core.h | 4 -
24 files changed, 446 insertions(+), 78 deletions(-)
On 26/07/17 03:57, Greg Kurz wrote: > This series is based on patches from Michel Roth posted in 2015: > > https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg04246.html > > It addresses comments made during the RFC review, and also provides a bunch > of preliminary cleanup/fix patches. Since we have reached hard-freeze, this > feature is provided by a new pseries-2.11 machine type, introduced by this > series. It is based on David Gibson's ppc-for-2.10 branch, and I believe some > of the preliminary fixes are eligible for 2.10. > > Note that it also requires an updated SLOF that supports a new private hcall: > KVMPPC_H_UPDATE_PHANDLE. This is needed because SLOF changes FDT phandles to > Open Firmware phandles. Since the guest only sees the latter, QEMU must use > the updated value when populating the FDT for the hotplugged PHB (otherwise > the guest can't setup IRQs for the PCI devices). SLOF part is already upstream: > > http://git.qemu.org/?p=SLOF.git;h=604d28cc3f791657414f8b21103921fa0147fc63 > > With these patches we support the following: > > (qemu) device_add spapr-pci-host-bridge,index=2,id=phb2 > (qemu) device_add virtio-net-pci,id=hp2.0,bus=phb2.0 > (qemu) device_del hp2.0 > (qemu) device_del phb2 > > I could run some successful tests with a fedora25 guest: > - hotplug PHB + migrate + unplug PHB > - hotplug PHB + hotplug PCI device + unplug PHB => PCI device gets unplugged > - migrate before OS starts + hotplug PHB => destination uses OF phandles > - no regression observed with older machine types > > All the patches are also available here: > > https://github.com/gkurz/qemu/commits/spapr-hotplug-phb > > Cheers, > > -- > Greg > > --- > > Greg Kurz (14): > spapr: move spapr_create_phb() to core machine code > spapr_pci: use memory_region_add_subregion() with DMA windows > spapr_iommu: use g_strdup_printf() instead of snprintf() > spapr_drc: use g_strdup_printf() instead of snprintf() > spapr_iommu: convert TCE table object to realize() > spapr_pci: parent the MSI memory region to the PHB > spapr_drc: fix realize and unrealize > spapr_drc: add unrealize method to physical DRC class > spapr_iommu: unregister vmstate at unrealize time > spapr: add pseries-2.11 machine type > spapr_pci: introduce drc_id property > spapr: allow guest to update the XICS phandle > spapr_pci: drop abusive sanity check when migrating the LSI table > spapr: add hotplug hooks for PHB hotplug This one did not make it to the lists. > > Michael Roth (11): > spapr_drc: pass object ownership to parent/owner > spapr_iommu: pass object ownership to parent/owner > pci: allow cleanup/unregistration of PCI buses > qdev: store DeviceState's canonical path to use when unparenting > spapr_pci: add PHB unrealize > spapr: enable PHB hotplug for pseries-2.11> spapr: create DR connectors for PHBs > spapr_events: add support for phb hotplug events > qdev: pass an Object * to qbus_set_hotplug_handler() > spapr_pci: provide node start offset via spapr_populate_pci_dt() > spapr_pci: add ibm, my-drc-index property for PHB hotplug > > Nathan Fontenot (1): > spapr: populate PHB DRC entries for root DT node > > > hw/acpi/piix4.c | 2 > hw/char/virtio-serial-bus.c | 2 > hw/core/bus.c | 11 -- > hw/core/qdev.c | 15 ++- > hw/pci/pci.c | 33 +++++++ > hw/pci/pcie.c | 2 > hw/pci/shpc.c | 2 > hw/ppc/spapr.c | 205 ++++++++++++++++++++++++++++++++++++++++- > hw/ppc/spapr_drc.c | 65 ++++++++++--- > hw/ppc/spapr_events.c | 3 + > hw/ppc/spapr_hcall.c | 20 ++++ > hw/ppc/spapr_iommu.c | 22 +++- > hw/ppc/spapr_pci.c | 86 +++++++++++++---- > hw/s390x/css-bridge.c | 2 > hw/s390x/s390-pci-bus.c | 6 + > hw/scsi/virtio-scsi.c | 2 > hw/scsi/vmw_pvscsi.c | 2 > hw/usb/dev-smartcard-reader.c | 2 > include/hw/compat.h | 3 + > include/hw/pci-host/spapr.h | 9 +- > include/hw/pci/pci.h | 3 + > include/hw/ppc/spapr.h | 15 +++ > include/hw/ppc/spapr_drc.h | 8 ++ > include/hw/qdev-core.h | 4 - > 24 files changed, 446 insertions(+), 78 deletions(-) > > -- Alexey
On Wed, 26 Jul 2017 13:44:55 +1000 Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > On 26/07/17 03:57, Greg Kurz wrote: > > This series is based on patches from Michel Roth posted in 2015: > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg04246.html > > > > It addresses comments made during the RFC review, and also provides a bunch > > of preliminary cleanup/fix patches. Since we have reached hard-freeze, this > > feature is provided by a new pseries-2.11 machine type, introduced by this > > series. It is based on David Gibson's ppc-for-2.10 branch, and I believe some > > of the preliminary fixes are eligible for 2.10. > > > > Note that it also requires an updated SLOF that supports a new private hcall: > > KVMPPC_H_UPDATE_PHANDLE. This is needed because SLOF changes FDT phandles to > > Open Firmware phandles. Since the guest only sees the latter, QEMU must use > > the updated value when populating the FDT for the hotplugged PHB (otherwise > > the guest can't setup IRQs for the PCI devices). SLOF part is already upstream: > > > > http://git.qemu.org/?p=SLOF.git;h=604d28cc3f791657414f8b21103921fa0147fc63 > > > > With these patches we support the following: > > > > (qemu) device_add spapr-pci-host-bridge,index=2,id=phb2 > > (qemu) device_add virtio-net-pci,id=hp2.0,bus=phb2.0 > > (qemu) device_del hp2.0 > > (qemu) device_del phb2 > > > > I could run some successful tests with a fedora25 guest: > > - hotplug PHB + migrate + unplug PHB > > - hotplug PHB + hotplug PCI device + unplug PHB => PCI device gets unplugged > > - migrate before OS starts + hotplug PHB => destination uses OF phandles > > - no regression observed with older machine types > > > > All the patches are also available here: > > > > https://github.com/gkurz/qemu/commits/spapr-hotplug-phb > > > > Cheers, > > > > -- > > Greg > > > > --- > > > > Greg Kurz (14): > > spapr: move spapr_create_phb() to core machine code > > spapr_pci: use memory_region_add_subregion() with DMA windows > > spapr_iommu: use g_strdup_printf() instead of snprintf() > > spapr_drc: use g_strdup_printf() instead of snprintf() > > spapr_iommu: convert TCE table object to realize() > > spapr_pci: parent the MSI memory region to the PHB > > spapr_drc: fix realize and unrealize > > spapr_drc: add unrealize method to physical DRC class > > spapr_iommu: unregister vmstate at unrealize time > > spapr: add pseries-2.11 machine type > > > spapr_pci: introduce drc_id property > > spapr: allow guest to update the XICS phandle > > spapr_pci: drop abusive sanity check when migrating the LSI table > > spapr: add hotplug hooks for PHB hotplug > > > This one did not make it to the lists. > Yeah, I've just realized that 'stg mail' failed to send this patch because of a connection problem with OVH's mail relay... :-\ I've sent it again with --in-reply-to, so that it goes to the appropriate thread. > > > > > > Michael Roth (11): > > spapr_drc: pass object ownership to parent/owner > > spapr_iommu: pass object ownership to parent/owner > > pci: allow cleanup/unregistration of PCI buses > > qdev: store DeviceState's canonical path to use when unparenting > > spapr_pci: add PHB unrealize > > spapr: enable PHB hotplug for pseries-2.11> spapr: create DR connectors for PHBs > > > > spapr_events: add support for phb hotplug events > > qdev: pass an Object * to qbus_set_hotplug_handler() > > spapr_pci: provide node start offset via spapr_populate_pci_dt() > > spapr_pci: add ibm, my-drc-index property for PHB hotplug > > > > Nathan Fontenot (1): > > spapr: populate PHB DRC entries for root DT node > > > > > > hw/acpi/piix4.c | 2 > > hw/char/virtio-serial-bus.c | 2 > > hw/core/bus.c | 11 -- > > hw/core/qdev.c | 15 ++- > > hw/pci/pci.c | 33 +++++++ > > hw/pci/pcie.c | 2 > > hw/pci/shpc.c | 2 > > hw/ppc/spapr.c | 205 ++++++++++++++++++++++++++++++++++++++++- > > hw/ppc/spapr_drc.c | 65 ++++++++++--- > > hw/ppc/spapr_events.c | 3 + > > hw/ppc/spapr_hcall.c | 20 ++++ > > hw/ppc/spapr_iommu.c | 22 +++- > > hw/ppc/spapr_pci.c | 86 +++++++++++++---- > > hw/s390x/css-bridge.c | 2 > > hw/s390x/s390-pci-bus.c | 6 + > > hw/scsi/virtio-scsi.c | 2 > > hw/scsi/vmw_pvscsi.c | 2 > > hw/usb/dev-smartcard-reader.c | 2 > > include/hw/compat.h | 3 + > > include/hw/pci-host/spapr.h | 9 +- > > include/hw/pci/pci.h | 3 + > > include/hw/ppc/spapr.h | 15 +++ > > include/hw/ppc/spapr_drc.h | 8 ++ > > include/hw/qdev-core.h | 4 - > > 24 files changed, 446 insertions(+), 78 deletions(-) > > > > > >
I've tested the patch set using Greg's Github branch. It worked fine in my tests using a Fedora 26 and an Ubuntu 17.04 guests. I have two observations though: 1 - This is not related to this patch set per se because it is reproducible on master, but I think it is interfering with this new feature. There is a warning/error message in the kernel right after SLOF that goes: (...) -> smp_release_cpus() spinning_secondaries = 0 <- smp_release_cpus() Linux ppc64le #1 SMP Mon Jul 1[ 0.030450] pci 0000:00:02.0: of_irq_parse_pci: failed with rc=-22 [ 0.030552] pci 0000:00:0f.0: of_irq_parse_pci: failed with rc=-22 [ OK ] Started Security Auditing Service. (...) I get this same message after hotplugging a PCI device with this series, but the hotplug shows up in the lspci as expected: (qemu) device_add virtio-net-pci,id=hp2.0,bus=phb2.0 (qemu) [ 412.233441] pci 0002:00:00.0: of_irq_parse_pci: failed with rc=-22 2 - when hotplugging the same PHB for the second time (device_add phb2, device_del phb2, device_add phb2 again) the hotplug works but dmesg got spammed by the messages: (qemu) device_add spapr-pci-host-bridge,index=2,id=phb2 (qemu) [ 378.309490] rpaphp: rpaphp_register_slot: slot[C131106] is already registered [ 378.309674] rpaphp: rpaphp_register_slot: slot[C131074] is already registered [ 378.309841] rpaphp: rpaphp_register_slot: slot[C131087] is already registered [ 378.310847] rpaphp: rpaphp_register_slot: slot[C131078] is already registered ( .... about 250+ messages like that ) Looks like device_del isn't cleaning up everything after the first hotplug. Thanks, Daniel On 07/25/2017 02:57 PM, Greg Kurz wrote: > This series is based on patches from Michel Roth posted in 2015: > > https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg04246.html > > It addresses comments made during the RFC review, and also provides a bunch > of preliminary cleanup/fix patches. Since we have reached hard-freeze, this > feature is provided by a new pseries-2.11 machine type, introduced by this > series. It is based on David Gibson's ppc-for-2.10 branch, and I believe some > of the preliminary fixes are eligible for 2.10. > > Note that it also requires an updated SLOF that supports a new private hcall: > KVMPPC_H_UPDATE_PHANDLE. This is needed because SLOF changes FDT phandles to > Open Firmware phandles. Since the guest only sees the latter, QEMU must use > the updated value when populating the FDT for the hotplugged PHB (otherwise > the guest can't setup IRQs for the PCI devices). SLOF part is already upstream: > > http://git.qemu.org/?p=SLOF.git;h=604d28cc3f791657414f8b21103921fa0147fc63 > > With these patches we support the following: > > (qemu) device_add spapr-pci-host-bridge,index=2,id=phb2 > (qemu) device_add virtio-net-pci,id=hp2.0,bus=phb2.0 > (qemu) device_del hp2.0 > (qemu) device_del phb2 > > I could run some successful tests with a fedora25 guest: > - hotplug PHB + migrate + unplug PHB > - hotplug PHB + hotplug PCI device + unplug PHB => PCI device gets unplugged > - migrate before OS starts + hotplug PHB => destination uses OF phandles > - no regression observed with older machine types > > All the patches are also available here: > > https://github.com/gkurz/qemu/commits/spapr-hotplug-phb > > Cheers, > > -- > Greg > > --- > > Greg Kurz (14): > spapr: move spapr_create_phb() to core machine code > spapr_pci: use memory_region_add_subregion() with DMA windows > spapr_iommu: use g_strdup_printf() instead of snprintf() > spapr_drc: use g_strdup_printf() instead of snprintf() > spapr_iommu: convert TCE table object to realize() > spapr_pci: parent the MSI memory region to the PHB > spapr_drc: fix realize and unrealize > spapr_drc: add unrealize method to physical DRC class > spapr_iommu: unregister vmstate at unrealize time > spapr: add pseries-2.11 machine type > spapr_pci: introduce drc_id property > spapr: allow guest to update the XICS phandle > spapr_pci: drop abusive sanity check when migrating the LSI table > spapr: add hotplug hooks for PHB hotplug > > Michael Roth (11): > spapr_drc: pass object ownership to parent/owner > spapr_iommu: pass object ownership to parent/owner > pci: allow cleanup/unregistration of PCI buses > qdev: store DeviceState's canonical path to use when unparenting > spapr_pci: add PHB unrealize > spapr: enable PHB hotplug for pseries-2.11 > spapr: create DR connectors for PHBs > spapr_events: add support for phb hotplug events > qdev: pass an Object * to qbus_set_hotplug_handler() > spapr_pci: provide node start offset via spapr_populate_pci_dt() > spapr_pci: add ibm, my-drc-index property for PHB hotplug > > Nathan Fontenot (1): > spapr: populate PHB DRC entries for root DT node > > > hw/acpi/piix4.c | 2 > hw/char/virtio-serial-bus.c | 2 > hw/core/bus.c | 11 -- > hw/core/qdev.c | 15 ++- > hw/pci/pci.c | 33 +++++++ > hw/pci/pcie.c | 2 > hw/pci/shpc.c | 2 > hw/ppc/spapr.c | 205 ++++++++++++++++++++++++++++++++++++++++- > hw/ppc/spapr_drc.c | 65 ++++++++++--- > hw/ppc/spapr_events.c | 3 + > hw/ppc/spapr_hcall.c | 20 ++++ > hw/ppc/spapr_iommu.c | 22 +++- > hw/ppc/spapr_pci.c | 86 +++++++++++++---- > hw/s390x/css-bridge.c | 2 > hw/s390x/s390-pci-bus.c | 6 + > hw/scsi/virtio-scsi.c | 2 > hw/scsi/vmw_pvscsi.c | 2 > hw/usb/dev-smartcard-reader.c | 2 > include/hw/compat.h | 3 + > include/hw/pci-host/spapr.h | 9 +- > include/hw/pci/pci.h | 3 + > include/hw/ppc/spapr.h | 15 +++ > include/hw/ppc/spapr_drc.h | 8 ++ > include/hw/qdev-core.h | 4 - > 24 files changed, 446 insertions(+), 78 deletions(-) > >
On Wed, 26 Jul 2017 17:31:17 -0300
Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> wrote:
> I've tested the patch set using Greg's Github branch. It worked fine in
> my tests
> using a Fedora 26 and an Ubuntu 17.04 guests. I have two observations
> though:
>
> 1 - This is not related to this patch set per se because it is
> reproducible on master, but
> I think it is interfering with this new feature. There is a
> warning/error message in
> the kernel right after SLOF that goes:
>
> (...)
> -> smp_release_cpus()
> spinning_secondaries = 0
> <- smp_release_cpus()
> Linux ppc64le
> #1 SMP Mon Jul 1[ 0.030450] pci 0000:00:02.0: of_irq_parse_pci:
> failed with rc=-22
> [ 0.030552] pci 0000:00:0f.0: of_irq_parse_pci: failed with rc=-22
> [ OK ] Started Security Auditing Service.
> (...)
>
This is a regression in QEMU master introduced by this commit:
commit b87680427e8a3ff682f66514e99a8344e7437247
Author: Cédric Le Goater <clg@kaod.org>
Date: Wed Jul 5 19:13:15 2017 +0200
spapr: populate device tree depending on XIVE_EXPLOIT option
When XIVE is supported, the device tree should be populated
accordingly and the XIVE memory regions mapped to activate MMIOs.
Depending on the design we choose, we could also allocate different
ICS and ICP objects, or switch between objects. This needs to be
discussed.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It is very similar to the issue that motivated the new KVMPPC_H_UPDATE_PHANDLE
hcall (see patch 24 and 26 in this series):
- QEMU creates an "interrupt-controller" node with a phandle property
with the value 0x1111
- QEMU passes the FDT to SLOF
- SLOF converts all references to the phandle to an SLOF internal value
=> from now on (ie, until the next machine reset), the guest only knows
the OF phandle.
- during CAS, if we go XICS, we send back an updated FDT with the
phandle of the "interrupt-controller" node reverted to 0x1111
=> the guest complains because all cold-plugged devices nodes refer
to the OF phandle, not 0x1111
A solution is to use the value set by KVMPPC_H_UPDATE_PHANDLE during CAS
instead of 0x1111. I could verify it makes the guest warning disappear.
I'll send a dedicated patchset to fix this in 2.10.
> I get this same message after hotplugging a PCI device with this series,
> but the
> hotplug shows up in the lspci as expected:
>
> (qemu) device_add virtio-net-pci,id=hp2.0,bus=phb2.0
> (qemu) [ 412.233441] pci 0002:00:00.0: of_irq_parse_pci: failed with rc=-22
>
This is the same error (the devices plugged in the new PHB all refer to
the OF phandle).
>
> 2 - when hotplugging the same PHB for the second time (device_add phb2,
> device_del phb2, device_add phb2 again) the hotplug works but dmesg got
> spammed
> by the messages:
>
> (qemu) device_add spapr-pci-host-bridge,index=2,id=phb2
> (qemu) [ 378.309490] rpaphp: rpaphp_register_slot: slot[C131106] is
> already registered
> [ 378.309674] rpaphp: rpaphp_register_slot: slot[C131074] is already
> registered
> [ 378.309841] rpaphp: rpaphp_register_slot: slot[C131087] is already
> registered
> [ 378.310847] rpaphp: rpaphp_register_slot: slot[C131078] is already
> registered
Yeah, I saw that but I haven't investigated that yet.
> ( .... about 250+ messages like that )
>
The current default is to have 256 slots per PHB.
> Looks like device_del isn't cleaning up everything after the first hotplug.
>
Or maybe the guest part (rpaphp or drmgr) ?
>
>
> Thanks,
>
Thanks very much for the testing! :)
Cheers,
--
Greg
>
> Daniel
>
>
> On 07/25/2017 02:57 PM, Greg Kurz wrote:
> > This series is based on patches from Michel Roth posted in 2015:
> >
> > https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg04246.html
> >
> > It addresses comments made during the RFC review, and also provides a bunch
> > of preliminary cleanup/fix patches. Since we have reached hard-freeze, this
> > feature is provided by a new pseries-2.11 machine type, introduced by this
> > series. It is based on David Gibson's ppc-for-2.10 branch, and I believe some
> > of the preliminary fixes are eligible for 2.10.
> >
> > Note that it also requires an updated SLOF that supports a new private hcall:
> > KVMPPC_H_UPDATE_PHANDLE. This is needed because SLOF changes FDT phandles to
> > Open Firmware phandles. Since the guest only sees the latter, QEMU must use
> > the updated value when populating the FDT for the hotplugged PHB (otherwise
> > the guest can't setup IRQs for the PCI devices). SLOF part is already upstream:
> >
> > http://git.qemu.org/?p=SLOF.git;h=604d28cc3f791657414f8b21103921fa0147fc63
> >
> > With these patches we support the following:
> >
> > (qemu) device_add spapr-pci-host-bridge,index=2,id=phb2
> > (qemu) device_add virtio-net-pci,id=hp2.0,bus=phb2.0
> > (qemu) device_del hp2.0
> > (qemu) device_del phb2
> >
> > I could run some successful tests with a fedora25 guest:
> > - hotplug PHB + migrate + unplug PHB
> > - hotplug PHB + hotplug PCI device + unplug PHB => PCI device gets unplugged
> > - migrate before OS starts + hotplug PHB => destination uses OF phandles
> > - no regression observed with older machine types
> >
> > All the patches are also available here:
> >
> > https://github.com/gkurz/qemu/commits/spapr-hotplug-phb
> >
> > Cheers,
> >
> > --
> > Greg
> >
> > ---
> >
> > Greg Kurz (14):
> > spapr: move spapr_create_phb() to core machine code
> > spapr_pci: use memory_region_add_subregion() with DMA windows
> > spapr_iommu: use g_strdup_printf() instead of snprintf()
> > spapr_drc: use g_strdup_printf() instead of snprintf()
> > spapr_iommu: convert TCE table object to realize()
> > spapr_pci: parent the MSI memory region to the PHB
> > spapr_drc: fix realize and unrealize
> > spapr_drc: add unrealize method to physical DRC class
> > spapr_iommu: unregister vmstate at unrealize time
> > spapr: add pseries-2.11 machine type
> > spapr_pci: introduce drc_id property
> > spapr: allow guest to update the XICS phandle
> > spapr_pci: drop abusive sanity check when migrating the LSI table
> > spapr: add hotplug hooks for PHB hotplug
> >
> > Michael Roth (11):
> > spapr_drc: pass object ownership to parent/owner
> > spapr_iommu: pass object ownership to parent/owner
> > pci: allow cleanup/unregistration of PCI buses
> > qdev: store DeviceState's canonical path to use when unparenting
> > spapr_pci: add PHB unrealize
> > spapr: enable PHB hotplug for pseries-2.11
> > spapr: create DR connectors for PHBs
> > spapr_events: add support for phb hotplug events
> > qdev: pass an Object * to qbus_set_hotplug_handler()
> > spapr_pci: provide node start offset via spapr_populate_pci_dt()
> > spapr_pci: add ibm, my-drc-index property for PHB hotplug
> >
> > Nathan Fontenot (1):
> > spapr: populate PHB DRC entries for root DT node
> >
> >
> > hw/acpi/piix4.c | 2
> > hw/char/virtio-serial-bus.c | 2
> > hw/core/bus.c | 11 --
> > hw/core/qdev.c | 15 ++-
> > hw/pci/pci.c | 33 +++++++
> > hw/pci/pcie.c | 2
> > hw/pci/shpc.c | 2
> > hw/ppc/spapr.c | 205 ++++++++++++++++++++++++++++++++++++++++-
> > hw/ppc/spapr_drc.c | 65 ++++++++++---
> > hw/ppc/spapr_events.c | 3 +
> > hw/ppc/spapr_hcall.c | 20 ++++
> > hw/ppc/spapr_iommu.c | 22 +++-
> > hw/ppc/spapr_pci.c | 86 +++++++++++++----
> > hw/s390x/css-bridge.c | 2
> > hw/s390x/s390-pci-bus.c | 6 +
> > hw/scsi/virtio-scsi.c | 2
> > hw/scsi/vmw_pvscsi.c | 2
> > hw/usb/dev-smartcard-reader.c | 2
> > include/hw/compat.h | 3 +
> > include/hw/pci-host/spapr.h | 9 +-
> > include/hw/pci/pci.h | 3 +
> > include/hw/ppc/spapr.h | 15 +++
> > include/hw/ppc/spapr_drc.h | 8 ++
> > include/hw/qdev-core.h | 4 -
> > 24 files changed, 446 insertions(+), 78 deletions(-)
> >
> >
>
On 28/07/17 02:39, Greg Kurz wrote: > On Wed, 26 Jul 2017 17:31:17 -0300 > Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> wrote: > >> I've tested the patch set using Greg's Github branch. It worked fine in >> my tests >> using a Fedora 26 and an Ubuntu 17.04 guests. I have two observations >> though: >> >> 1 - This is not related to this patch set per se because it is >> reproducible on master, but >> I think it is interfering with this new feature. There is a >> warning/error message in >> the kernel right after SLOF that goes: >> >> (...) >> -> smp_release_cpus() >> spinning_secondaries = 0 >> <- smp_release_cpus() >> Linux ppc64le >> #1 SMP Mon Jul 1[ 0.030450] pci 0000:00:02.0: of_irq_parse_pci: >> failed with rc=-22 >> [ 0.030552] pci 0000:00:0f.0: of_irq_parse_pci: failed with rc=-22 >> [ OK ] Started Security Auditing Service. >> (...) >> > > This is a regression in QEMU master introduced by this commit: > > commit b87680427e8a3ff682f66514e99a8344e7437247 > Author: Cédric Le Goater <clg@kaod.org> > Date: Wed Jul 5 19:13:15 2017 +0200 > > spapr: populate device tree depending on XIVE_EXPLOIT option > > When XIVE is supported, the device tree should be populated > accordingly and the XIVE memory regions mapped to activate MMIOs. > > Depending on the design we choose, we could also allocate different > ICS and ICP objects, or switch between objects. This needs to be > discussed. > > Signed-off-by: Cédric Le Goater <clg@kaod.org> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > It is very similar to the issue that motivated the new KVMPPC_H_UPDATE_PHANDLE > hcall (see patch 24 and 26 in this series): > > - QEMU creates an "interrupt-controller" node with a phandle property > with the value 0x1111 > - QEMU passes the FDT to SLOF > - SLOF converts all references to the phandle to an SLOF internal value > > => from now on (ie, until the next machine reset), the guest only knows > the OF phandle. > > - during CAS, if we go XICS, we send back an updated FDT with the > phandle of the "interrupt-controller" node reverted to 0x1111 > > => the guest complains because all cold-plugged devices nodes refer > to the OF phandle, not 0x1111 > > A solution is to use the value set by KVMPPC_H_UPDATE_PHANDLE during CAS > instead of 0x1111. I could verify it makes the guest warning disappear. > > I'll send a dedicated patchset to fix this in 2.10. The SLOF I pushed for 2.10 does not have it though. And the rest of XIVE is not targeted for 2.10 anyway. So imho the solution is reverting "spapr: populate device tree depending on XIVE_EXPLOIT option" for 2.10. -- Alexey
On Fri, Jul 28, 2017 at 01:27:05PM +1000, Alexey Kardashevskiy wrote: > On 28/07/17 02:39, Greg Kurz wrote: > > On Wed, 26 Jul 2017 17:31:17 -0300 > > Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> wrote: > > > >> I've tested the patch set using Greg's Github branch. It worked fine in > >> my tests > >> using a Fedora 26 and an Ubuntu 17.04 guests. I have two observations > >> though: > >> > >> 1 - This is not related to this patch set per se because it is > >> reproducible on master, but > >> I think it is interfering with this new feature. There is a > >> warning/error message in > >> the kernel right after SLOF that goes: > >> > >> (...) > >> -> smp_release_cpus() > >> spinning_secondaries = 0 > >> <- smp_release_cpus() > >> Linux ppc64le > >> #1 SMP Mon Jul 1[ 0.030450] pci 0000:00:02.0: of_irq_parse_pci: > >> failed with rc=-22 > >> [ 0.030552] pci 0000:00:0f.0: of_irq_parse_pci: failed with rc=-22 > >> [ OK ] Started Security Auditing Service. > >> (...) > >> > > > > This is a regression in QEMU master introduced by this commit: > > > > commit b87680427e8a3ff682f66514e99a8344e7437247 > > Author: Cédric Le Goater <clg@kaod.org> > > Date: Wed Jul 5 19:13:15 2017 +0200 > > > > spapr: populate device tree depending on XIVE_EXPLOIT option > > > > When XIVE is supported, the device tree should be populated > > accordingly and the XIVE memory regions mapped to activate MMIOs. > > > > Depending on the design we choose, we could also allocate different > > ICS and ICP objects, or switch between objects. This needs to be > > discussed. > > > > Signed-off-by: Cédric Le Goater <clg@kaod.org> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > It is very similar to the issue that motivated the new KVMPPC_H_UPDATE_PHANDLE > > hcall (see patch 24 and 26 in this series): > > > > - QEMU creates an "interrupt-controller" node with a phandle property > > with the value 0x1111 > > - QEMU passes the FDT to SLOF > > - SLOF converts all references to the phandle to an SLOF internal value > > > > => from now on (ie, until the next machine reset), the guest only knows > > the OF phandle. > > > > - during CAS, if we go XICS, we send back an updated FDT with the > > phandle of the "interrupt-controller" node reverted to 0x1111 > > > > => the guest complains because all cold-plugged devices nodes refer > > to the OF phandle, not 0x1111 > > > > A solution is to use the value set by KVMPPC_H_UPDATE_PHANDLE during CAS > > instead of 0x1111. I could verify it makes the guest warning disappear. > > > > I'll send a dedicated patchset to fix this in 2.10. > > > The SLOF I pushed for 2.10 does not have it though. And the rest of XIVE is > not targeted for 2.10 anyway. So imho the solution is reverting "spapr: > populate device tree depending on XIVE_EXPLOIT option" for 2.10. I agree, I've applied the revert to ppc-for-2.10 now. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On 07/28/2017 05:40 AM, David Gibson wrote: > On Fri, Jul 28, 2017 at 01:27:05PM +1000, Alexey Kardashevskiy wrote: >> On 28/07/17 02:39, Greg Kurz wrote: >>> On Wed, 26 Jul 2017 17:31:17 -0300 >>> Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> wrote: >>> >>>> I've tested the patch set using Greg's Github branch. It worked fine in >>>> my tests >>>> using a Fedora 26 and an Ubuntu 17.04 guests. I have two observations >>>> though: >>>> >>>> 1 - This is not related to this patch set per se because it is >>>> reproducible on master, but >>>> I think it is interfering with this new feature. There is a >>>> warning/error message in >>>> the kernel right after SLOF that goes: >>>> >>>> (...) >>>> -> smp_release_cpus() >>>> spinning_secondaries = 0 >>>> <- smp_release_cpus() >>>> Linux ppc64le >>>> #1 SMP Mon Jul 1[ 0.030450] pci 0000:00:02.0: of_irq_parse_pci: >>>> failed with rc=-22 >>>> [ 0.030552] pci 0000:00:0f.0: of_irq_parse_pci: failed with rc=-22 >>>> [ OK ] Started Security Auditing Service. >>>> (...) >>>> >>> >>> This is a regression in QEMU master introduced by this commit: >>> >>> commit b87680427e8a3ff682f66514e99a8344e7437247 >>> Author: Cédric Le Goater <clg@kaod.org> >>> Date: Wed Jul 5 19:13:15 2017 +0200 >>> >>> spapr: populate device tree depending on XIVE_EXPLOIT option >>> >>> When XIVE is supported, the device tree should be populated >>> accordingly and the XIVE memory regions mapped to activate MMIOs. >>> >>> Depending on the design we choose, we could also allocate different >>> ICS and ICP objects, or switch between objects. This needs to be >>> discussed. >>> >>> Signed-off-by: Cédric Le Goater <clg@kaod.org> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >>> >>> It is very similar to the issue that motivated the new KVMPPC_H_UPDATE_PHANDLE >>> hcall (see patch 24 and 26 in this series): >>> >>> - QEMU creates an "interrupt-controller" node with a phandle property >>> with the value 0x1111 >>> - QEMU passes the FDT to SLOF >>> - SLOF converts all references to the phandle to an SLOF internal value >>> >>> => from now on (ie, until the next machine reset), the guest only knows >>> the OF phandle. >>> >>> - during CAS, if we go XICS, we send back an updated FDT with the >>> phandle of the "interrupt-controller" node reverted to 0x1111 >>> >>> => the guest complains because all cold-plugged devices nodes refer >>> to the OF phandle, not 0x1111 >>> >>> A solution is to use the value set by KVMPPC_H_UPDATE_PHANDLE during CAS >>> instead of 0x1111. I could verify it makes the guest warning disappear. >>> >>> I'll send a dedicated patchset to fix this in 2.10. >> >> >> The SLOF I pushed for 2.10 does not have it though. And the rest of XIVE is >> not targeted for 2.10 anyway. So imho the solution is reverting "spapr: >> populate device tree depending on XIVE_EXPLOIT option" for 2.10. > > I agree, I've applied the revert to ppc-for-2.10 now. Sure. Greg, have you found a solution to get around this problem or do we need to populate the device tree with the interrupt controller node for SLOF ? Thanks, C.
On Fri, 28 Jul 2017 07:35:34 +0200 Cédric Le Goater <clg@kaod.org> wrote: > On 07/28/2017 05:40 AM, David Gibson wrote: > > On Fri, Jul 28, 2017 at 01:27:05PM +1000, Alexey Kardashevskiy wrote: > >> On 28/07/17 02:39, Greg Kurz wrote: > >>> On Wed, 26 Jul 2017 17:31:17 -0300 > >>> Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> wrote: > >>> > >>>> I've tested the patch set using Greg's Github branch. It worked fine in > >>>> my tests > >>>> using a Fedora 26 and an Ubuntu 17.04 guests. I have two observations > >>>> though: > >>>> > >>>> 1 - This is not related to this patch set per se because it is > >>>> reproducible on master, but > >>>> I think it is interfering with this new feature. There is a > >>>> warning/error message in > >>>> the kernel right after SLOF that goes: > >>>> > >>>> (...) > >>>> -> smp_release_cpus() > >>>> spinning_secondaries = 0 > >>>> <- smp_release_cpus() > >>>> Linux ppc64le > >>>> #1 SMP Mon Jul 1[ 0.030450] pci 0000:00:02.0: of_irq_parse_pci: > >>>> failed with rc=-22 > >>>> [ 0.030552] pci 0000:00:0f.0: of_irq_parse_pci: failed with rc=-22 > >>>> [ OK ] Started Security Auditing Service. > >>>> (...) > >>>> > >>> > >>> This is a regression in QEMU master introduced by this commit: > >>> > >>> commit b87680427e8a3ff682f66514e99a8344e7437247 > >>> Author: Cédric Le Goater <clg@kaod.org> > >>> Date: Wed Jul 5 19:13:15 2017 +0200 > >>> > >>> spapr: populate device tree depending on XIVE_EXPLOIT option > >>> > >>> When XIVE is supported, the device tree should be populated > >>> accordingly and the XIVE memory regions mapped to activate MMIOs. > >>> > >>> Depending on the design we choose, we could also allocate different > >>> ICS and ICP objects, or switch between objects. This needs to be > >>> discussed. > >>> > >>> Signed-off-by: Cédric Le Goater <clg@kaod.org> > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > >>> > >>> It is very similar to the issue that motivated the new KVMPPC_H_UPDATE_PHANDLE > >>> hcall (see patch 24 and 26 in this series): > >>> > >>> - QEMU creates an "interrupt-controller" node with a phandle property > >>> with the value 0x1111 > >>> - QEMU passes the FDT to SLOF > >>> - SLOF converts all references to the phandle to an SLOF internal value > >>> > >>> => from now on (ie, until the next machine reset), the guest only knows > >>> the OF phandle. > >>> > >>> - during CAS, if we go XICS, we send back an updated FDT with the > >>> phandle of the "interrupt-controller" node reverted to 0x1111 > >>> > >>> => the guest complains because all cold-plugged devices nodes refer > >>> to the OF phandle, not 0x1111 > >>> > >>> A solution is to use the value set by KVMPPC_H_UPDATE_PHANDLE during CAS > >>> instead of 0x1111. I could verify it makes the guest warning disappear. > >>> > >>> I'll send a dedicated patchset to fix this in 2.10. > >> > >> > >> The SLOF I pushed for 2.10 does not have it though. And the rest of XIVE is > >> not targeted for 2.10 anyway. So imho the solution is reverting "spapr: > >> populate device tree depending on XIVE_EXPLOIT option" for 2.10. > > > > I agree, I've applied the revert to ppc-for-2.10 now. > > Sure. > > Greg, have you found a solution to get around this problem or do we need > to populate the device tree with the interrupt controller node for SLOF ? > With commit b87680427e8a, spapr_dt_xics() is called twice actually: first time during machine reset and a second time during CAS. The first call means we start in XICS mode by default. But I don't understand why it is also called during CAS: if the guest didn't advertise XIVE, then we don't have anything to do, do we ? Cheers, -- Greg > Thanks, > > C. >
© 2016 - 2025 Red Hat, Inc.