[PATCH 0/12] Get Xen PV shim running in qemu

David Woodhouse posted 12 patches 6 months, 2 weeks ago
There is a newer version of this series
blockdev.c                                     |  15 +++-
hw/block/xen-block.c                           |  26 +++++-
hw/char/trace-events                           |   8 ++
hw/char/xen_console.c                          | 522 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------
hw/i386/kvm/meson.build                        |   1 +
hw/i386/kvm/trace-events                       |   2 +
hw/i386/kvm/xen-stubs.c                        |   5 ++
hw/i386/kvm/xen_evtchn.c                       |   6 ++
hw/i386/kvm/xen_gnttab.c                       |  32 ++++++-
hw/i386/kvm/xen_primary_console.c              | 167 ++++++++++++++++++++++++++++++++++
hw/i386/kvm/xen_primary_console.h              |  22 +++++
hw/i386/kvm/xen_xenstore.c                     |  21 ++++-
hw/xen/xen-backend.c                           |  81 +++++++++++++----
hw/xen/xen-bus.c                               |  21 ++++-
hw/xen/xen-legacy-backend.c                    |   1 -
include/hw/xen/interface/arch-arm.h            |  37 ++++----
include/hw/xen/interface/arch-x86/cpuid.h      |  31 +++----
include/hw/xen/interface/arch-x86/xen-x86_32.h |  19 +---
include/hw/xen/interface/arch-x86/xen-x86_64.h |  19 +---
include/hw/xen/interface/arch-x86/xen.h        |  26 +-----
include/hw/xen/interface/event_channel.h       |  19 +---
include/hw/xen/interface/features.h            |  19 +---
include/hw/xen/interface/grant_table.h         |  19 +---
include/hw/xen/interface/hvm/hvm_op.h          |  19 +---
include/hw/xen/interface/hvm/params.h          |  19 +---
include/hw/xen/interface/io/blkif.h            |  27 ++----
include/hw/xen/interface/io/console.h          |  19 +---
include/hw/xen/interface/io/fbif.h             |  19 +---
include/hw/xen/interface/io/kbdif.h            |  19 +---
include/hw/xen/interface/io/netif.h            |  25 ++----
include/hw/xen/interface/io/protocols.h        |  19 +---
include/hw/xen/interface/io/ring.h             |  49 +++++-----
include/hw/xen/interface/io/usbif.h            |  19 +---
include/hw/xen/interface/io/xenbus.h           |  19 +---
include/hw/xen/interface/io/xs_wire.h          |  36 ++++----
include/hw/xen/interface/memory.h              |  30 +++----
include/hw/xen/interface/physdev.h             |  23 +----
include/hw/xen/interface/sched.h               |  19 +---
include/hw/xen/interface/trace.h               |  19 +---
include/hw/xen/interface/vcpu.h                |  19 +---
include/hw/xen/interface/version.h             |  19 +---
include/hw/xen/interface/xen-compat.h          |  19 +---
include/hw/xen/interface/xen.h                 |  19 +---
include/hw/xen/xen-backend.h                   |   4 +
include/hw/xen/xen-bus.h                       |   2 +
include/sysemu/kvm_xen.h                       |   1 +
target/i386/kvm/kvm.c                          |   4 +
target/i386/kvm/xen-emu.c                      |  35 ++++++--
48 files changed, 941 insertions(+), 680 deletions(-)
[PATCH 0/12] Get Xen PV shim running in qemu
Posted by David Woodhouse 6 months, 2 weeks ago
I hadn't got round to getting the PV shim running yet; I thought it would
need work on the multiboot loader. Turns out it doesn't. I *did* need to
fix a couple of brown-paper-bag bugs in the per-vCPU upcall vector support,
and implement Xen console support though. Now I can test PV guests:

 $ qemu-system-x86_64 --accel kvm,xen-version=0x40011,kernel-irqchip=split \
   -chardev stdio,mux=on,id=char0 -device xen-console,chardev=char0 \
   -drive file=${GUEST_IMAGE},if=xen -display none -m 1G \
   -kernel ~/git/xen/xen/xen -initrd ~/git/linux/arch/x86/boot/bzImage \
   -append "loglvl=all -- console=hvc0 root=/dev/xvda1"

Re: [PATCH 0/12] Get Xen PV shim running in qemu
Posted by Alex Bennée 6 months, 1 week ago
David Woodhouse <dwmw2@infradead.org> writes:

> I hadn't got round to getting the PV shim running yet; I thought it would
> need work on the multiboot loader. Turns out it doesn't. I *did* need to
> fix a couple of brown-paper-bag bugs in the per-vCPU upcall vector support,
> and implement Xen console support though. Now I can test PV guests:
>
>  $ qemu-system-x86_64 --accel kvm,xen-version=0x40011,kernel-irqchip=split \
>    -chardev stdio,mux=on,id=char0 -device xen-console,chardev=char0 \
>    -drive file=${GUEST_IMAGE},if=xen -display none -m 1G \
>    -kernel ~/git/xen/xen/xen -initrd ~/git/linux/arch/x86/boot/bzImage \

So this is a KVM guest running the Xen hypervisor (via -kernel) and a
Dom0 Linux guest (via -initrd)?

Should this work for any Xen architecture or is this x86 specific? Does
the -M machine model matter?

Would it be possible to have some sort of overview document in our
manual for how Xen guests are supported under KVM?

>    -append "loglvl=all -- console=hvc0 root=/dev/xvda1"
>
<snip>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
Re: [PATCH 0/12] Get Xen PV shim running in qemu
Posted by David Woodhouse 6 months, 1 week ago
On Tue, 2023-10-24 at 16:24 +0100, Alex Bennée wrote:
> 
> David Woodhouse <dwmw2@infradead.org> writes:
> 
> > I hadn't got round to getting the PV shim running yet; I thought it would
> > need work on the multiboot loader. Turns out it doesn't. I *did* need to
> > fix a couple of brown-paper-bag bugs in the per-vCPU upcall vector support,
> > and implement Xen console support though. Now I can test PV guests:
> > 
> >  $ qemu-system-x86_64 --accel kvm,xen-version=0x40011,kernel-irqchip=split \
> >    -chardev stdio,mux=on,id=char0 -device xen-console,chardev=char0 \
> >    -drive file=${GUEST_IMAGE},if=xen -display none -m 1G \
> >    -kernel ~/git/xen/xen/xen -initrd ~/git/linux/arch/x86/boot/bzImage \
> 

(Reordering your questions so the answers flow better)

> Would it be possible to have some sort of overview document in our
> manual for how Xen guests are supported under KVM?

https://qemu-project.gitlab.io/qemu/system/i386/xen.html covers running
Xen HVM guests under QEMU/KVM.

What I'm adding here is the facility to support Xen PV guests. There is
a corresponding update to the documentation in my working tree at 
https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv

https://git.infradead.org/users/dwmw2/qemu.git/commitdiff/af693bf51141

PV mode is the old mode which predates proper virtualization support in
the CPUs, where a guest kernel *knows* it doesn't have access to real
(or indeed virtualized) hardware. It runs in ring 1 (or ring 3 on
x86_64) and makes hypercalls to Xen to ask it to do all the MMU
management.

When Spectre/Meltdown happened, running actual PV guests directly under
Xen became kind of insane, so we hacked a version of Xen to work as a
"shim", running inside a proper HVM guest, and just providing those MMU
management services to its guest. Its *only* guest. This shim doesn't
even do any of the PV disk/block stuff; that's passed through directly
to the real hypervisor.

So you have a real Xen hypervisor, then a "PV shim" Xen running inside
that hardware virtual machine, and a guest kernel hosted by that PV
shim.

Since QEMU/KVM can now pretend to be Xen and host guests that
think they're running as Xen HVM guests... QEMU/KVM can host that PV
shim too. As noted, I just had to realise that we could use '-initrd'
to trick QEMU's multiboot loader into doing it... and fix a few
brown-paper-bag bugs.

> So this is a KVM guest running the Xen hypervisor (via -kernel) and a
> Dom0 Linux guest (via -initrd)?

Pretty much. It's a KVM guest running that "shim" version of the Xen
hypervisor via -kernel, and a Linux guest via -initrd.

Although I wouldn't call that a "Dom0 Linux guest" because we tend to
use "Dom0" to mean the control domain, which can launch other (DomU)
guests... and that isn't what's happening here. It's more of a "Dom1".
The one and only unprivileged guest.

In particular, there's no nested virtualization here. Because in that
sense, what "Xen" does to host a PV guest isn't really virtualization.

> Should this work for any Xen architecture or is this x86 specific? Does
> the -M machine model matter?

It's currently x86-specific and KVM-specific. You can use the pc or q35
models as you see fit, although see the doc linked above for discussion
of the IDE 'unplug' mechanism, and the recent patches on the list that
fix it for q35.

It would be interesting to make it work on other platforms, and even
with TCG. I've tried to keep it as portable as possible up to a point,
but without too much gratuitous overengineering to chase that goal.

Making it work with TCG would require dealing with all the struct
layouts where alignment/padding differs on different host
architectures, so we probably wouldn't be able to use the Xen public
header files directly. And we would need to implement some of the basic
event channel delivery and shared info page handling that we rely on
KVM to do for us. The latter probably isn't that hard; the former is
what stopped me even bothering.
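
To illustrate the layout problem, here is a minimal sketch using Python's
struct module. The record (a 32-bit port followed by a 64-bit timeout) is
hypothetical and not copied from the Xen public headers; it only shows how
the same C declaration can occupy 12 bytes under the i386 ABI and 16 bytes
under x86_64, so a host that reads guest memory with its own native layout
may see the wrong values.

```python
import struct

# Hypothetical guest-visible record: uint32_t port; uint64_t timeout;
# (illustrative only, not taken from the Xen public headers).

# i386 System V ABI: uint64_t needs only 4-byte alignment, so no padding.
LAYOUT_I386 = "<IQ"
# x86_64 ABI: uint64_t is 8-byte aligned, so 4 bytes of padding are inserted.
LAYOUT_X86_64 = "<I4xQ"


def describe(layout):
    """Return (total size, offset of the trailing 64-bit field)."""
    size = struct.calcsize(layout)
    offset = size - struct.calcsize("<Q")  # the 64-bit field is last
    return size, offset


def misread(port, timeout):
    """Pack with the i386 layout, then decode as if it were x86_64."""
    raw = struct.pack(LAYOUT_I386, port, timeout)
    # Pad with zeroes so the longer x86_64 layout has enough bytes to read.
    raw = raw.ljust(struct.calcsize(LAYOUT_X86_64), b"\0")
    return struct.unpack(LAYOUT_X86_64, raw)


if __name__ == "__main__":
    print("i386   (size, offset):", describe(LAYOUT_I386))
    print("x86_64 (size, offset):", describe(LAYOUT_X86_64))
    port, timeout = misread(5, 0x1122334455667788)
    print("misread: port=%d timeout=%#x" % (port, timeout))
```

The port survives because it sits at offset 0 in both layouts, but the
timeout comes back truncated because the decoder skips four bytes that the
32-bit layout used for data.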

Making it work for e.g. Arm would require porting some of the KVM
support to Arm in the kernel (it's currently x86-specific), and/or
making it work for TCG. But the parts that *are* accelerated in the
kernel (timers, IPIs, etc.) are there for a reason. If we do make
it work for TCG by implementing those in userspace, I wouldn't
necessarily want a *KVM* guest to have to rely on those in userspace.
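
For a rough idea of what event channel delivery in userspace would involve,
the following is a simplified model of the Xen 2-level event channel ABI as
I understand it. The class names are invented for illustration, a 64-bit
guest is assumed, and there are no atomics and no FIFO ABI, so treat it as a
sketch of the logic rather than how QEMU or KVM implements it.

```python
BITS_PER_WORD = 64  # assumption: 64-bit guest, so 64 * 64 = 4096 ports


class SharedInfoModel:
    """Toy stand-in for the pending/mask bitmaps in the shared info page."""

    def __init__(self):
        self.evtchn_pending = [0] * BITS_PER_WORD
        self.evtchn_mask = [0] * BITS_PER_WORD


class VcpuInfoModel:
    """Toy stand-in for the per-vCPU upcall state."""

    def __init__(self):
        self.evtchn_upcall_pending = 0
        self.evtchn_pending_sel = 0


def set_pending(shinfo, vcpu, port):
    """Mark `port` pending; return True if the vCPU needs a new upcall."""
    word, bit = divmod(port, BITS_PER_WORD)
    if shinfo.evtchn_pending[word] & (1 << bit):
        return False  # already pending, nothing new to signal
    shinfo.evtchn_pending[word] |= 1 << bit
    if shinfo.evtchn_mask[word] & (1 << bit):
        return False  # masked: remains pending, but no upcall yet
    if vcpu.evtchn_pending_sel & (1 << word):
        return False  # guest has already been told to scan this word
    vcpu.evtchn_pending_sel |= 1 << word
    was_pending = vcpu.evtchn_upcall_pending
    vcpu.evtchn_upcall_pending = 1
    return not was_pending  # only kick the vCPU on a 0 -> 1 transition


if __name__ == "__main__":
    shinfo, vcpu = SharedInfoModel(), VcpuInfoModel()
    print("first raise needs upcall:", set_pending(shinfo, vcpu, 70))
    print("second raise needs upcall:", set_pending(shinfo, vcpu, 70))
```

A real implementation would also have to inject the upcall itself (for
example via the per-vCPU upcall vector) and handle unmasking, both of which
are left out here.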