I've provided answers for all comments from the v3 review that I deliberately
don't address in v4.
v4: - some patches from v3 got merged
- added some more preparatory cleanup in xics (patches 1,2)
- merge cpu_setup() handler into realize() (patch 4)
- see individual changelogs for patches 3 and 6
v3: - preparatory cleanup in pnv (patch 1)
- rework ICPState realization and vmstate registration (patches 2,3,4)
- fix migration using dummy icp/server entries (patch 5)
v2: - some patches from v1 are already merged in ppc-for-2.10
- added a new fix to a potential memory leak (patch 1)
- consolidate dt_id computation (patch 3)
- see individual changelogs for patches 2 and 4
I could successfully do the following on a POWER8 host with full cores (SMT8):
1) start a pseries-2.9 machine with QEMU 2.9:
-smp cores=1,threads=2,maxcpus=8
2) hotplug a core:
device_add host-spapr-cpu-core,core-id=4
3) migrate to QEMU 2.10 configured with core-id 0,4
4) hotplug another core:
device_add host-spapr-cpu-core,core-id=2
5) migrate back to QEMU 2.9 configured with core-id 0,4,2
6) hotplug the core in the last available slot:
device_add host-spapr-cpu-core,core-id=6
7) migrate to QEMU 2.10 configured with core-id 0,4,2,6
I could check that the guest is functional after each migration.
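For reference, the hotplug and migration steps above translate to HMP monitor
commands along these lines (a sketch only; the tcp URI and port are examples,
and the destination QEMU must have been started with a matching -incoming
option):

  # source QEMU monitor: hotplug a core (steps 2, 4 and 6)
  (qemu) device_add host-spapr-cpu-core,core-id=4
  # then migrate to the waiting destination (steps 3, 5 and 7)
  (qemu) migrate tcp:dest-host:4444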
--
Greg
---
Greg Kurz (6):
xics: introduce macros for ICP/ICS link properties
xics: pass appropriate types to realize() handlers.
xics: setup cpu at realize time
xics: drop ICPStateClass::cpu_setup() handler
xics: directly register ICPState objects to vmstate
spapr: fix migration of ICPState objects from/to older QEMU
hw/intc/xics.c | 95 ++++++++++++++++++++---------------------------
hw/intc/xics_kvm.c | 18 ++++-----
hw/intc/xics_pnv.c | 6 +--
hw/ppc/pnv_core.c | 17 ++++----
hw/ppc/pnv_psi.c | 3 +
hw/ppc/spapr.c | 89 +++++++++++++++++++++++++++++++++++++++++++-
hw/ppc/spapr_cpu_core.c | 22 +++++------
include/hw/ppc/spapr.h | 2 +
include/hw/ppc/xics.h | 16 ++++----
9 files changed, 168 insertions(+), 100 deletions(-)
On Thu, Jun 08, 2017 at 03:42:32PM +0200, Greg Kurz wrote:
> I've provided answers for all comments from the v3 review that I deliberately
> don't address in v4.

I've merged patches 1-4. 5 & 6 I'm still reviewing.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Fri, 9 Jun 2017 12:28:13 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Thu, Jun 08, 2017 at 03:42:32PM +0200, Greg Kurz wrote:
> > I've provided answers for all comments from the v3 review that I deliberately
> > don't address in v4.
>
> I've merged patches 1-4. 5 & 6 I'm still reviewing.
>
Cool. FYI, I forgot to mention that I only tested with KVM.
I'm now trying with TCG and I hit various guest crashes on
the destination (using your ppc-for-2.10 branch WITHOUT
my patches):
cpu 0x0: Vector: 700 (Program Check) at [c0000000787ebae0]
pc: c0000000002803c4: __fput+0x284/0x310
lr: c000000000280258: __fput+0x118/0x310
sp: c0000000787ebd60
msr: 8000000000029033
current = 0xc00000007cbab640
paca = 0xc000000007b80000 softe: 0 irq_happened: 0x01
pid = 1812, comm = gawk
kernel BUG at ../include/linux/fs.h:2399!
enter ? for help
[c0000000787ebdb0] c0000000000d7d84 task_work_run+0xe4/0x160
[c0000000787ebe00] c000000000018054 do_notify_resume+0xb4/0xc0
[c0000000787ebe30] c00000000000a730 ret_from_except_lite+0x5c/0x60
--- Exception: c00 (System Call) at 00003fff9026dd90
SP (3fffcb37b790) is in userspace
0:mon>
or
cpu 0x0: Vector: 300 (Data Access) at [c00000007fff7490]
pc: c0000000001ef768: free_pcppages_bulk+0x2b8/0x500
lr: c0000000001ef524: free_pcppages_bulk+0x74/0x500
sp: c00000007fff7710
msr: 8000000000009033
dar: c0000000807afc70
dsisr: 40000000
current = 0xc00000007c609190
paca = 0xc000000007b80000 softe: 0 irq_happened: 0x01
pid = 1631, comm = systemctl
enter ? for help
[c00000007fff77c0] c0000000001eff24 free_hot_cold_page+0x204/0x270
[c00000007fff7810] c0000000001f5848 __put_single_page+0x48/0x60
[c00000007fff7840] c00000000059ac50 skb_release_data+0xb0/0x180
[c00000007fff7880] c00000000059ae38 kfree_skb+0x58/0x130
[c00000007fff78c0] c00000000063f604 __udp4_lib_mcast_deliver+0x3d4/0x460
[c00000007fff7a50] c00000000063fb0c __udp4_lib_rcv+0x47c/0x770
[c00000007fff7b00] c0000000006023a8 ip_local_deliver_finish+0x148/0x310
[c00000007fff7b50] c0000000006026c4 ip_rcv_finish+0x154/0x420
[c00000007fff7bd0] c0000000005b1154 __netif_receive_skb_core+0x874/0xac0
[c00000007fff7cc0] c0000000005b30d4 netif_receive_skb+0x34/0xd0
[c00000007fff7d00] d000000000ef3c74 virtnet_poll+0x514/0x8a0 [virtio_net]
[c00000007fff7e10] c0000000005b3668 net_rx_action+0x1d8/0x310
[c00000007fff7ea0] c0000000000b0cc4 __do_softirq+0x154/0x330
[c00000007fff7f90] c0000000000251ac call_do_softirq+0x14/0x24
[c00000007fff3ef0] c000000000011be0 do_softirq+0xe0/0x110
[c00000007fff3f30] c0000000000b10e8 irq_exit+0xc8/0x110
[c00000007fff3f60] c0000000000117e8 __do_irq+0xb8/0x1c0
[c00000007fff3f90] c0000000000251d0 call_do_irq+0x14/0x24
[c00000007a94bac0] c000000000011990 do_IRQ+0xa0/0x120
[c00000007a94bb20] c00000000000a8b0 restore_check_irq_replay+0x2c/0x5c
--- Exception: 501 (Hardware Interrupt) at c000000000010f84 arch_local_irq_restore+0x74/0x90
[c00000007a94be10] 000000000000000c (unreliable)
[c00000007a94be30] c00000000000a704 ret_from_except_lite+0x30/0x60
--- Exception: 501 (Hardware Interrupt) at 00003fffa04a2c28
SP (3ffff7f1bf60) is in userspace
0:mon>
These don't seem to occur with QEMU master. I'll try to investigate.
On Fri, Jun 09, 2017 at 11:36:31AM +0200, Greg Kurz wrote:
> Cool. FYI, I forgot to mention that I only tested with KVM.
>
> I'm now trying with TCG and I hit various guest crashes on
> the destination (using your ppc-for-2.10 branch WITHOUT
> my patches):

Drat. What's your reproducer for this crash?

> These don't seem to occur with QEMU master. I'll try to
> investigate.

Thanks. I'm going to be in China for the next couple of weeks. I'll
still be working, but my time will be divided.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Fri, 9 Jun 2017 20:28:32 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Fri, Jun 09, 2017 at 11:36:31AM +0200, Greg Kurz wrote:
> > On Fri, 9 Jun 2017 12:28:13 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Thu, Jun 08, 2017 at 03:42:32PM +0200, Greg Kurz wrote:
> > > > I've provided answers for all comments from the v3 review that I deliberately
> > > > don't address in v4.
> > >
> > > I've merged patches 1-4. 5 & 6 I'm still reviewing.
> > >
> >
> > Cool. FYI, I forgot to mention that I only tested with KVM.
> >
> > I'm now trying with TCG and I hit various guest crashes on
> > the destination (using your ppc-for-2.10 branch WITHOUT
> > my patches):
>
> Drat. What's your reproducer for this crash?
>
1) start guest
qemu-system-ppc64 \
-nodefaults -nographic -snapshot -no-shutdown -serial mon:stdio \
-device virtio-net,netdev=netdev0,id=net0 \
-netdev bridge,id=netdev0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
-device virtio-blk,drive=drive0,id=blk0 \
-drive file=/home/greg/images/sle12-sp1-ppc64le.qcow2,id=drive0,if=none \
-machine type=pseries,accel=tcg -cpu POWER8
2) migrate
3) destination crashes (immediately or after a very short delay) or hangs
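Concretely, step 2 amounts to something like this (a sketch; the port number
is arbitrary):

  # destination: same command line as above, plus an -incoming option
  qemu-system-ppc64 ... -machine type=pseries,accel=tcg -cpu POWER8 \
   -incoming tcp:0:4444

  # source HMP monitor:
  (qemu) migrate tcp:localhost:4444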
> > These don't seem to occur with QEMU master. I'll try to
> > investigate.
>
Bisect leads to:
f0b0685d6694a28c66018f438e822596243b1250 is the first bad commit
commit f0b0685d6694a28c66018f438e822596243b1250
Author: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Date: Thu Apr 27 10:48:23 2017 +0530
tcg: enable MTTCG by default for PPC64 on x86
I guess we're still not completely ready to support MTTCG...
Cc'ing Nikunj for insights.
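For anyone who wants to redo the bisect, it boils down to something like this
(a sketch; taking v2.9.0 as the known-good revision is an assumption):

  git bisect start
  git bisect bad ppc-for-2.10   # destination crashes after migration
  git bisect good v2.9.0        # baseline where migration works
  # rebuild, run the reproducer above, then mark each step with
  # 'git bisect good' or 'git bisect bad' until the first bad commit

Until this is understood, MTTCG can presumably be avoided by forcing a single
TCG thread, i.e. replacing '-machine type=pseries,accel=tcg' with:

  -machine type=pseries -accel tcg,thread=single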
> Thanks. I'm going to be in China for the next couple of weeks. I'll
> still be working, but my time will be divided.
>
Hey, have a good trip! :)
Cheers,
--
Greg
On Fri, Jun 09, 2017 at 05:09:13PM +0200, Greg Kurz wrote:
> > Drat. What's your reproducer for this crash?
>
> 1) start guest
>
> qemu-system-ppc64 \
>  -nodefaults -nographic -snapshot -no-shutdown -serial mon:stdio \
>  -device virtio-net,netdev=netdev0,id=net0 \
>  -netdev bridge,id=netdev0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
>  -device virtio-blk,drive=drive0,id=blk0 \
>  -drive file=/home/greg/images/sle12-sp1-ppc64le.qcow2,id=drive0,if=none \
>  -machine type=pseries,accel=tcg -cpu POWER8
>
> 2) migrate
>
> 3) destination crashes (immediately or after a very short delay) or
> hangs

Ok. I'll bisect it when I can, but you might well get to it first.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson