I've provided answers for all comments from the v3 review that I deliberately
don't address in v4.
v4: - some patches from v3 got merged
- added some more preparatory cleanup in xics (patches 1,2)
- merge cpu_setup() handler into realize() (patch 4)
- see individual changelogs for patches 3 and 6
v3: - preparatory cleanup in pnv (patch 1)
- rework ICPState realization and vmstate registration (patches 2,3,4)
- fix migration using dummy icp/server entries (patch 5)
v2: - some patches from v1 are already merged in ppc-for-2.10
- added a new fix to a potential memory leak (patch 1)
- consolidate dt_id computation (patch 3)
- see individual changelogs for patches 2 and 4
I could successfully do the following on a POWER8 host with full cores (SMT8):
1) start a pseries-2.9 machine with QEMU 2.9:
-smp cores=1,threads=2,maxcpus=8
2) hotplug a core:
device_add host-spapr-cpu-core,core-id=4
3) migrate to QEMU 2.10 configured with core-id 0,4
4) hotplug another core:
device_add host-spapr-cpu-core,core-id=2
5) migrate back to QEMU 2.9 configured with core-id 0,4,2
6) hotplug the core in the last available slot:
device_add host-spapr-cpu-core,core-id=6
7) migrate to QEMU 2.10 configured with core-id 0,4,2,6
I could check that the guest is functional after each migration.
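For reference, the hotplug and migration steps above translate to HMP monitor
commands along these lines (a sketch only; the tcp URI and port are examples,
and the destination QEMU must have been started with a matching -incoming
option):

  # source QEMU monitor: hotplug a core (steps 2, 4 and 6)
  (qemu) device_add host-spapr-cpu-core,core-id=4
  # then migrate to the waiting destination (steps 3, 5 and 7)
  (qemu) migrate tcp:dest-host:4444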
--
Greg
---
Greg Kurz (6):
xics: introduce macros for ICP/ICS link properties
xics: pass appropriate types to realize() handlers.
xics: setup cpu at realize time
xics: drop ICPStateClass::cpu_setup() handler
xics: directly register ICPState objects to vmstate
spapr: fix migration of ICPState objects from/to older QEMU
hw/intc/xics.c | 95 ++++++++++++++++++++---------------------------
hw/intc/xics_kvm.c | 18 ++++-----
hw/intc/xics_pnv.c | 6 +--
hw/ppc/pnv_core.c | 17 ++++----
hw/ppc/pnv_psi.c | 3 +
hw/ppc/spapr.c | 89 +++++++++++++++++++++++++++++++++++++++++++-
hw/ppc/spapr_cpu_core.c | 22 +++++------
include/hw/ppc/spapr.h | 2 +
include/hw/ppc/xics.h | 16 ++++----
9 files changed, 168 insertions(+), 100 deletions(-)
On Thu, Jun 08, 2017 at 03:42:32PM +0200, Greg Kurz wrote:
> I've provided answers for all comments from the v3 review that I deliberately
> don't address in v4.

I've merged patches 1-4. 5 & 6 I'm still reviewing.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Fri, 9 Jun 2017 12:28:13 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Thu, Jun 08, 2017 at 03:42:32PM +0200, Greg Kurz wrote:
> > I've provided answers for all comments from the v3 review that I deliberately
> > don't address in v4.
>
> I've merged patches 1-4. 5 & 6 I'm still reviewing.
>
Cool. FYI, I forgot to mention that I only tested with KVM.
I'm now trying with TCG and I hit various guest crashes on
the destination (using your ppc-for-2.10 branch WITHOUT
my patches):
cpu 0x0: Vector: 700 (Program Check) at [c0000000787ebae0]
pc: c0000000002803c4: __fput+0x284/0x310
lr: c000000000280258: __fput+0x118/0x310
sp: c0000000787ebd60
msr: 8000000000029033
current = 0xc00000007cbab640
paca = 0xc000000007b80000 softe: 0 irq_happened: 0x01
pid = 1812, comm = gawk
kernel BUG at ../include/linux/fs.h:2399!
enter ? for help
[c0000000787ebdb0] c0000000000d7d84 task_work_run+0xe4/0x160
[c0000000787ebe00] c000000000018054 do_notify_resume+0xb4/0xc0
[c0000000787ebe30] c00000000000a730 ret_from_except_lite+0x5c/0x60
--- Exception: c00 (System Call) at 00003fff9026dd90
SP (3fffcb37b790) is in userspace
0:mon>
or
cpu 0x0: Vector: 300 (Data Access) at [c00000007fff7490]
pc: c0000000001ef768: free_pcppages_bulk+0x2b8/0x500
lr: c0000000001ef524: free_pcppages_bulk+0x74/0x500
sp: c00000007fff7710
msr: 8000000000009033
dar: c0000000807afc70
dsisr: 40000000
current = 0xc00000007c609190
paca = 0xc000000007b80000 softe: 0 irq_happened: 0x01
pid = 1631, comm = systemctl
enter ? for help
[c00000007fff77c0] c0000000001eff24 free_hot_cold_page+0x204/0x270
[c00000007fff7810] c0000000001f5848 __put_single_page+0x48/0x60
[c00000007fff7840] c00000000059ac50 skb_release_data+0xb0/0x180
[c00000007fff7880] c00000000059ae38 kfree_skb+0x58/0x130
[c00000007fff78c0] c00000000063f604 __udp4_lib_mcast_deliver+0x3d4/0x460
[c00000007fff7a50] c00000000063fb0c __udp4_lib_rcv+0x47c/0x770
[c00000007fff7b00] c0000000006023a8 ip_local_deliver_finish+0x148/0x310
[c00000007fff7b50] c0000000006026c4 ip_rcv_finish+0x154/0x420
[c00000007fff7bd0] c0000000005b1154 __netif_receive_skb_core+0x874/0xac0
[c00000007fff7cc0] c0000000005b30d4 netif_receive_skb+0x34/0xd0
[c00000007fff7d00] d000000000ef3c74 virtnet_poll+0x514/0x8a0 [virtio_net]
[c00000007fff7e10] c0000000005b3668 net_rx_action+0x1d8/0x310
[c00000007fff7ea0] c0000000000b0cc4 __do_softirq+0x154/0x330
[c00000007fff7f90] c0000000000251ac call_do_softirq+0x14/0x24
[c00000007fff3ef0] c000000000011be0 do_softirq+0xe0/0x110
[c00000007fff3f30] c0000000000b10e8 irq_exit+0xc8/0x110
[c00000007fff3f60] c0000000000117e8 __do_irq+0xb8/0x1c0
[c00000007fff3f90] c0000000000251d0 call_do_irq+0x14/0x24
[c00000007a94bac0] c000000000011990 do_IRQ+0xa0/0x120
[c00000007a94bb20] c00000000000a8b0 restore_check_irq_replay+0x2c/0x5c
--- Exception: 501 (Hardware Interrupt) at c000000000010f84 arch_local_irq_restore+0x74/0x90
[c00000007a94be10] 000000000000000c (unreliable)
[c00000007a94be30] c00000000000a704 ret_from_except_lite+0x30/0x60
--- Exception: 501 (Hardware Interrupt) at 00003fffa04a2c28
SP (3ffff7f1bf60) is in userspace
0:mon>
These don't seem to occur with QEMU master. I'll try to investigate.
On Fri, Jun 09, 2017 at 11:36:31AM +0200, Greg Kurz wrote:
> Cool. FYI, I forgot to mention that I only tested with KVM.
>
> I'm now trying with TCG and I hit various guest crashes on
> the destination (using your ppc-for-2.10 branch WITHOUT
> my patches):

Drat. What's your reproducer for this crash?

> These don't seem to occur with QEMU master. I'll try to
> investigate.

Thanks. I'm going to be in China for the next couple of weeks. I'll
still be working, but my time will be divided.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Fri, 9 Jun 2017 20:28:32 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Fri, Jun 09, 2017 at 11:36:31AM +0200, Greg Kurz wrote:
> > On Fri, 9 Jun 2017 12:28:13 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Thu, Jun 08, 2017 at 03:42:32PM +0200, Greg Kurz wrote:
> > > > I've provided answers for all comments from the v3 review that I deliberately
> > > > don't address in v4.
> > >
> > > I've merged patches 1-4. 5 & 6 I'm still reviewing.
> > >
> >
> > Cool. FYI, I forgot to mention that I only tested with KVM.
> >
> > I'm now trying with TCG and I hit various guest crashes on
> > the destination (using your ppc-for-2.10 branch WITHOUT
> > my patches):
>
> Drat. What's your reproducer for this crash?
>
1) start guest
qemu-system-ppc64 \
-nodefaults -nographic -snapshot -no-shutdown -serial mon:stdio \
-device virtio-net,netdev=netdev0,id=net0 \
-netdev bridge,id=netdev0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
-device virtio-blk,drive=drive0,id=blk0 \
-drive file=/home/greg/images/sle12-sp1-ppc64le.qcow2,id=drive0,if=none \
-machine type=pseries,accel=tcg -cpu POWER8
2) migrate
3) destination crashes (immediately or after a very short delay) or hangs
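Concretely, step 2 amounts to something like this (a sketch; the port number
is arbitrary):

  # destination: same command line as above, plus an -incoming option
  qemu-system-ppc64 ... -machine type=pseries,accel=tcg -cpu POWER8 \
   -incoming tcp:0:4444

  # source HMP monitor:
  (qemu) migrate tcp:localhost:4444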
> > These don't seem to occur with QEMU master. I'll try to
> > investigate.
>
Bisect leads to:
f0b0685d6694a28c66018f438e822596243b1250 is the first bad commit
commit f0b0685d6694a28c66018f438e822596243b1250
Author: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Date: Thu Apr 27 10:48:23 2017 +0530
tcg: enable MTTCG by default for PPC64 on x86
I guess we're still not completely ready to support MTTCG...
Cc'ing Nikunj for insights.
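For anyone who wants to redo the bisect, it boils down to something like this
(a sketch; taking v2.9.0 as the known-good revision is an assumption):

  git bisect start
  git bisect bad ppc-for-2.10   # destination crashes after migration
  git bisect good v2.9.0        # baseline where migration works
  # rebuild, run the reproducer above, then mark each step with
  # 'git bisect good' or 'git bisect bad' until the first bad commit

Until this is understood, MTTCG can presumably be avoided by forcing a single
TCG thread, i.e. replacing '-machine type=pseries,accel=tcg' with:

  -machine type=pseries -accel tcg,thread=single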
> Thanks. I'm going to be in China for the next couple of weeks. I'll
> still be working, but my time will be divided.
>
Hey, have a good trip! :)
Cheers,
--
Greg
On Fri, Jun 09, 2017 at 05:09:13PM +0200, Greg Kurz wrote:
> > Drat. What's your reproducer for this crash?
>
> 1) start guest
>
> qemu-system-ppc64 \
>  -nodefaults -nographic -snapshot -no-shutdown -serial mon:stdio \
>  -device virtio-net,netdev=netdev0,id=net0 \
>  -netdev bridge,id=netdev0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
>  -device virtio-blk,drive=drive0,id=blk0 \
>  -drive file=/home/greg/images/sle12-sp1-ppc64le.qcow2,id=drive0,if=none \
>  -machine type=pseries,accel=tcg -cpu POWER8
>
> 2) migrate
>
> 3) destination crashes (immediately or after a very short delay) or
> hangs

Ok. I'll bisect it when I can, but you might well get to it first.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson