[Qemu-devel] [PULL 0/2] ppc-for-4.1 queue 20190813

David Gibson posted 2 patches 17 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20190813065920.23203-1-david@gibson.dropbear.id.au
Maintainers: "Cédric Le Goater" <clg@kaod.org>, David Gibson <david@gibson.dropbear.id.au>

Posted by David Gibson 17 weeks ago
The following changes since commit 5e7bcdcfe69ce0fad66012b2cfb2035003c37eef:

  display/bochs: fix pcie support (2019-08-12 16:36:41 +0100)

are available in the Git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-4.1-20190813

for you to fetch changes up to 310cda5b5e9df642b19a0e9c504368ffba3b3ab9:

  spapr/xive: Fix migration of hot-plugged CPUs (2019-08-13 16:50:30 +1000)

----------------------------------------------------------------
ppc patch queue 2019-08-13 (last minute qemu-4.1 fixes)

Here's a very, very last minute pull request for qemu-4.1.  This fixes
two nasty bugs with the XIVE interrupt controller in "dual" mode
(where the guest decides which interrupt controller it wants to use).
One occurs when resetting the guest while I/O is active, and the other
with migration of hotplugged CPUs.

The timing here is very unfortunate.  Alas, we only spotted these bugs
very late, and I was sick last week, delaying analysis and fix even
further.

This series hasn't had nearly as much testing as I'd really like, but
I'd still like to squeeze it into qemu-4.1 if possible, since
definitely fixing two bad bugs seems like an acceptable tradeoff for
the risk of introducing different bugs.

----------------------------------------------------------------
Cédric Le Goater (1):
      spapr/xive: Fix migration of hot-plugged CPUs

David Gibson (1):
      spapr: Reset CAS & IRQ subsystem after devices

 hw/intc/spapr_xive_kvm.c | 19 +++++++++++++++++--
 hw/intc/xive.c           | 21 ++++++++++++++++++++-
 hw/ppc/spapr.c           | 24 ++++++++++++------------
 include/hw/ppc/xive.h    |  1 +
 4 files changed, 50 insertions(+), 15 deletions(-)

Re: [Qemu-devel] [PULL 0/2] ppc-for-4.1 queue 20190813

Posted by Peter Maydell 17 weeks ago
On Tue, 13 Aug 2019 at 07:59, David Gibson <david@gibson.dropbear.id.au> wrote:
>
> The following changes since commit 5e7bcdcfe69ce0fad66012b2cfb2035003c37eef:
>
>   display/bochs: fix pcie support (2019-08-12 16:36:41 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/dgibson/qemu.git tags/ppc-for-4.1-20190813
>
> for you to fetch changes up to 310cda5b5e9df642b19a0e9c504368ffba3b3ab9:
>
>   spapr/xive: Fix migration of hot-plugged CPUs (2019-08-13 16:50:30 +1000)
>
> ----------------------------------------------------------------
> ppc patch queue 2019-08-13 (last minute qemu-4.1 fixes)
>
> Here's a very, very last minute pull request for qemu-4.1.  This fixes
> two nasty bugs with the XIVE interrupt controller in "dual" mode
> (where the guest decides which interrupt controller it wants to use).
> One occurs when resetting the guest while I/O is active, and the other
> with migration of hotplugged CPUs.
>
> The timing here is very unfortunate.  Alas, we only spotted these bugs
> very late, and I was sick last week, delaying analysis and fix even
> further.
>
> This series hasn't had nearly as much testing as I'd really like, but
> I'd still like to squeeze it into qemu-4.1 if possible, since
> definitely fixing two bad bugs seems like an acceptable tradeoff for
> the risk of introducing different bugs.

Are these regressions? Are they security issues?

We are going to have an rc5 today, but my intention was to only put in
the security-fix bug in the bochs display device, and then have
a final release Thursday.

thanks
-- PMM

Re: [Qemu-devel] [PULL 0/2] ppc-for-4.1 queue 20190813

Posted by Peter Maydell 17 weeks ago
On Tue, 13 Aug 2019 at 10:23, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Tue, 13 Aug 2019 at 07:59, David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > The following changes since commit 5e7bcdcfe69ce0fad66012b2cfb2035003c37eef:
> >
> >   display/bochs: fix pcie support (2019-08-12 16:36:41 +0100)
> >
> > are available in the Git repository at:
> >
> >   git://github.com/dgibson/qemu.git tags/ppc-for-4.1-20190813
> >
> > for you to fetch changes up to 310cda5b5e9df642b19a0e9c504368ffba3b3ab9:
> >
> >   spapr/xive: Fix migration of hot-plugged CPUs (2019-08-13 16:50:30 +1000)
> >
> > ----------------------------------------------------------------
> > ppc patch queue 2019-08-13 (last minute qemu-4.1 fixes)
> >
> > Here's a very, very last minute pull request for qemu-4.1.  This fixes
> > two nasty bugs with the XIVE interrupt controller in "dual" mode
> > (where the guest decides which interrupt controller it wants to use).
> > One occurs when resetting the guest while I/O is active, and the other
> > with migration of hotplugged CPUs.
> >
> > The timing here is very unfortunate.  Alas, we only spotted these bugs
> > very late, and I was sick last week, delaying analysis and fix even
> > further.
> >
> > This series hasn't had nearly as much testing as I'd really like, but
> > I'd still like to squeeze it into qemu-4.1 if possible, since
> > definitely fixing two bad bugs seems like an acceptable tradeoff for
> > the risk of introducing different bugs.
>
> Are these regressions? Are they security issues?
>
> We are going to have an rc5 today, but my intention was to only put in
> the security-fix bug in the bochs display device, and then have
> a final release Thursday.

After thinking about this and reading the commit messages I've
applied this pullreq, since it clearly only affects spapr and you're
in a better position to judge the significance of the fixes than me,
but it was really really borderline...

thanks
-- PMM

Re: [Qemu-devel] [PULL 0/2] ppc-for-4.1 queue 20190813

Posted by David Gibson 17 weeks ago
On Tue, Aug 13, 2019 at 12:45:51PM +0100, Peter Maydell wrote:
> On Tue, 13 Aug 2019 at 10:23, Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > On Tue, 13 Aug 2019 at 07:59, David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > The following changes since commit 5e7bcdcfe69ce0fad66012b2cfb2035003c37eef:
> > >
> > >   display/bochs: fix pcie support (2019-08-12 16:36:41 +0100)
> > >
> > > are available in the Git repository at:
> > >
> > >   git://github.com/dgibson/qemu.git tags/ppc-for-4.1-20190813
> > >
> > > for you to fetch changes up to 310cda5b5e9df642b19a0e9c504368ffba3b3ab9:
> > >
> > >   spapr/xive: Fix migration of hot-plugged CPUs (2019-08-13 16:50:30 +1000)
> > >
> > > ----------------------------------------------------------------
> > > ppc patch queue 2019-08-13 (last minute qemu-4.1 fixes)
> > >
> > > Here's a very, very last minute pull request for qemu-4.1.  This fixes
> > > two nasty bugs with the XIVE interrupt controller in "dual" mode
> > > (where the guest decides which interrupt controller it wants to use).
> > > One occurs when resetting the guest while I/O is active, and the other
> > > with migration of hotplugged CPUs.
> > >
> > > The timing here is very unfortunate.  Alas, we only spotted these bugs
> > > very late, and I was sick last week, delaying analysis and fix even
> > > further.
> > >
> > > This series hasn't had nearly as much testing as I'd really like, but
> > > I'd still like to squeeze it into qemu-4.1 if possible, since
> > > definitely fixing two bad bugs seems like an acceptable tradeoff for
> > > the risk of introducing different bugs.
> >
> > Are these regressions? Are they security issues?

They're effectively regressions.  Pedantically, they're bugs in a new
feature, but since the new feature is enabled by default in the new
machine type (and it's the interrupt controller, so you can't do
without it), a normal setup will be broken where the normal setup in
the old version wasn't.

> > We are going to have an rc5 today, but my intention was to only put in
> > the security-fix bug in the bochs display device, and then have
> > a final release Thursday.
> 
> After thinking about this and reading the commit messages I've
> applied this pullreq, since it clearly only affects spapr and you're
> in a better position to judge the significance of the fixes than me,
> but it was really really borderline...

Fair enough.  As I said, the timing sucked, but there's not really
anything I could do about that.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

Re: [Qemu-devel] [PULL 0/2] ppc-for-4.1 queue 20190813

Posted by Cédric Le Goater 17 weeks ago
On 13/08/2019 13:45, Peter Maydell wrote:
> On Tue, 13 Aug 2019 at 10:23, Peter Maydell <peter.maydell@linaro.org> wrote:
>>
>> On Tue, 13 Aug 2019 at 07:59, David Gibson <david@gibson.dropbear.id.au> wrote:
>>>
>>> The following changes since commit 5e7bcdcfe69ce0fad66012b2cfb2035003c37eef:
>>>
>>>   display/bochs: fix pcie support (2019-08-12 16:36:41 +0100)
>>>
>>> are available in the Git repository at:
>>>
>>>   git://github.com/dgibson/qemu.git tags/ppc-for-4.1-20190813
>>>
>>> for you to fetch changes up to 310cda5b5e9df642b19a0e9c504368ffba3b3ab9:
>>>
>>>   spapr/xive: Fix migration of hot-plugged CPUs (2019-08-13 16:50:30 +1000)
>>>
>>> ----------------------------------------------------------------
>>> ppc patch queue 2019-08-13 (last minute qemu-4.1 fixes)
>>>
>>> Here's a very, very last minute pull request for qemu-4.1.  This fixes
>>> two nasty bugs with the XIVE interrupt controller in "dual" mode
>>> (where the guest decides which interrupt controller it wants to use).
>>> One occurs when resetting the guest while I/O is active, and the other
>>> with migration of hotplugged CPUs.
>>>
>>> The timing here is very unfortunate.  Alas, we only spotted these bugs
>>> very late, and I was sick last week, delaying analysis and fix even
>>> further.
>>>
>>> This series hasn't had nearly as much testing as I'd really like, but
>>> I'd still like to squeeze it into qemu-4.1 if possible, since
>>> definitely fixing two bad bugs seems like an acceptable tradeoff for
>>> the risk of introducing different bugs.
>>
>> Are these regressions? Are they security issues?
>>
>> We are going to have an rc5 today, but my intention was to only put in
>> the security-fix bug in the bochs display device, and then have
>> a final release Thursday.
> 
> After thinking about this and reading the commit messages I've
> applied this pullreq, since it clearly only affects spapr and you're
> in a better position to judge the significance of the fixes than me,
> but it was really really borderline...

I was going to reply but you were faster to apply. Here is some more
context.

The XIVE interrupt mode is activated by default in 4.1, so these are
regressions with respect to XICS, the interrupt mode spapr used
previously. This applies especially to the first patch.

The second patch is a fix for the restoration of the hot-plugged CPUs.
The restoration of the spapr machine became more complex with the
XIVE interrupt controller because we need the machine state to be
loaded to know which KVM IRQ device to activate, XICS or XIVE. From
there we can restore the KVM states and HW contexts of the different
models in use: sources, controllers and presenters.

The post_load handler of the spapr machine relies on the fact that
it is called last and does the work for all models. I realized last
evening that this is not true for hot-plugged CPUs, whose state comes
after the machine. I was under the impression that this was not the
case, but I might be mistaken. It took me a while to get the
save/restore sequences correct.
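
The ordering problem described above can be sketched with a toy model. This is not QEMU code: every name below (toy_machine, toy_cpu, toy_migrate_in, and so on) is hypothetical, and the load order is simplified to "cold-plugged CPUs, then the machine, then hot-plugged CPUs". The per-CPU hook is only one possible remedy, assumed here for illustration; this message does not describe how the actual patch solves it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, none of this is QEMU API. */
enum irq_mode { IRQ_XICS, IRQ_XIVE };

struct toy_cpu {
    bool hotplugged;    /* state arrives after the machine's state */
    bool state_loaded;  /* incoming migration data has been read */
    bool kvm_restored;  /* presenter state has been pushed to KVM */
};

struct toy_machine {
    enum irq_mode mode;
    bool irq_device_active;
    struct toy_cpu *cpus;
    size_t ncpus;
};

/* A CPU can only be restored once the IRQ device is known and active
 * and the CPU's own state has been loaded. */
static bool toy_cpu_restore(const struct toy_machine *m, struct toy_cpu *c)
{
    if (!m->irq_device_active || !c->state_loaded) {
        return false;
    }
    c->kvm_restored = true;
    return true;
}

/* Machine post_load: pick the IRQ device, then restore every CPU whose
 * state is already loaded. It assumes it runs last. */
static void toy_machine_post_load(struct toy_machine *m, enum irq_mode mode)
{
    m->mode = mode;
    m->irq_device_active = true;
    for (size_t i = 0; i < m->ncpus; i++) {
        toy_cpu_restore(m, &m->cpus[i]);
    }
}

/* Assumed remedy: a per-CPU hook that restores a CPU whose state shows
 * up after the machine's post_load has already run. */
static void toy_cpu_post_load(const struct toy_machine *m, struct toy_cpu *c)
{
    c->state_loaded = true;
    toy_cpu_restore(m, c);
}

/* Simulate an incoming migration and return how many CPUs end up
 * restored. with_cpu_hook selects whether the per-CPU hook is used. */
static size_t toy_migrate_in(struct toy_machine *m, enum irq_mode mode,
                             bool with_cpu_hook)
{
    size_t restored = 0;

    assert(m != NULL && m->cpus != NULL);
    m->irq_device_active = false;
    for (size_t i = 0; i < m->ncpus; i++) {
        m->cpus[i].kvm_restored = false;
        /* Cold-plugged CPU state is loaded before the machine state. */
        m->cpus[i].state_loaded = !m->cpus[i].hotplugged;
    }

    toy_machine_post_load(m, mode);

    /* Hot-plugged CPU state arrives only now. */
    for (size_t i = 0; i < m->ncpus; i++) {
        struct toy_cpu *c = &m->cpus[i];
        if (!c->hotplugged) {
            continue;
        }
        if (with_cpu_hook) {
            toy_cpu_post_load(m, c);
        } else {
            c->state_loaded = true;
        }
    }

    for (size_t i = 0; i < m->ncpus; i++) {
        restored += m->cpus[i].kvm_restored ? 1 : 0;
    }
    return restored;
}
```

With two cold-plugged CPUs and one hot-plugged CPU, calling toy_migrate_in() without the hook leaves the hot-plugged CPU unrestored, because the machine's post_load ran before that CPU's state was available. With the hook enabled, all three are restored.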


Thanks,

C.