[RFC 0/3] Qemu FM emulation

nifan.cxl@gmail.com posted 3 patches 8 months, 1 week ago
Failed in applying to current master (apply log)
[RFC 0/3] Qemu FM emulation
Posted by nifan.cxl@gmail.com 8 months, 1 week ago
From: Fan Ni <fan.ni@samsung.com>

The RFC provides a way for FM emulation in Qemu. The goal is to provide
a context where we can have more FM emulation discussions and share solutions
for a reasonable FM implementation in Qemu.

The basic idea is as follows.

We have two VMs: the VM we want to test (the Target VM) and the FM VM. The
Target VM runs the kernel we are interested in (for example, one with the DCD
or RAS feature enabled). The FM VM can run any kernel version as long as
OOB communication support is enabled.

An application running in the FM VM issues FM commands to the underlying device
over an OOB channel (e.g., MCTP over I2C). When the device receives the message,
it does not respond to the request locally. Instead, the request is stored
in a shared buffer (implemented with /dev/shm), and a QMP request is sent
to the target VM to notify it that there is an MCTP message in the shared
buffer that needs to be processed. The FM waits for the request to complete.
The target VM reads the buffer and processes the message.
When processing completes, the output payload and any other information that
needs to be returned are stored in the buffer, and a state field is reset to
notify the FM that processing has completed.
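
To make the handshake above concrete, here is a minimal sketch in Python. It is
not the code from the series: the buffer layout (a u32 state, a u32 length, then
the payload), the state values, and all function names are assumptions made for
illustration, and a temporary file stands in for the /dev/shm object. Two
mappings of the same file play the roles of the two QEMU processes.

```python
import mmap
import struct
import tempfile

# Hypothetical layout, NOT taken from cxl_mctp_message.h:
#   u32 state | u32 length | payload bytes
HEADER = struct.Struct("<II")
BUF_SIZE = 4096
STATE_IDLE, STATE_PENDING = 0, 1


def open_shared_buffer(path):
    """Map an existing file as a fixed-size shared buffer."""
    with open(path, "r+b") as f:
        f.truncate(BUF_SIZE)
        return mmap.mmap(f.fileno(), BUF_SIZE)


def fm_post_request(buf, payload):
    """FM side: store the request, then mark the buffer as pending."""
    if len(payload) > BUF_SIZE - HEADER.size:
        raise ValueError("payload too large for shared buffer")
    buf[HEADER.size:HEADER.size + len(payload)] = payload
    HEADER.pack_into(buf, 0, STATE_PENDING, len(payload))


def target_process(buf, handler):
    """Target side: read the request, store the response, reset the state."""
    state, length = HEADER.unpack_from(buf, 0)
    if state != STATE_PENDING:
        return False
    request = bytes(buf[HEADER.size:HEADER.size + length])
    response = handler(request)
    if len(response) > BUF_SIZE - HEADER.size:
        raise ValueError("response too large for shared buffer")
    buf[HEADER.size:HEADER.size + len(response)] = response
    HEADER.pack_into(buf, 0, STATE_IDLE, len(response))
    return True


def fm_read_response(buf):
    """FM side: return the response once the state is reset, else None."""
    state, length = HEADER.unpack_from(buf, 0)
    if state != STATE_IDLE:
        return None
    return bytes(buf[HEADER.size:HEADER.size + length])


def demo():
    """Run one request/response round trip through two mappings of one file."""
    with tempfile.NamedTemporaryFile() as tmp:
        fm_view = open_shared_buffer(tmp.name)
        target_view = open_shared_buffer(tmp.name)
        try:
            fm_post_request(fm_view, b"GET_INFO")
            pending = fm_read_response(fm_view)  # None while still pending
            handled = target_process(target_view, lambda req: b"OK:" + req)
            return pending, handled, fm_read_response(fm_view)
        finally:
            fm_view.close()
            target_view.close()
```

In the real flow the FM would poll (or otherwise wait on) the state field after
sending the QMP notification; the sketch calls the target side directly so the
ordering is easy to follow.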

The nice points of the method:
1. It is a simple model (a producer-consumer model with shm as the shared buffer).
2. The communication between the two VMs through the qmp interface is simple.
One qmp interface works for all MCTP messages. Moreover, the qmp interface may
also be usable for communication between two VMs in other contexts.
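
As an illustration of point 2, the doorbell amounts to one QMP command sent to
the target VM's QMP socket. The sketch below follows the standard QMP exchange
(server greeting, qmp_capabilities, then the command). The command name
"cxl-process-mctp-message" and its "path" argument are assumptions made for
this example; the real name and arguments are defined in qapi/cxl.json in the
series. A fake server on a socketpair stands in for the target VM so the
sketch runs without QEMU.

```python
import json
import socket
import threading


def qmp_command(f, execute, arguments=None):
    """Send one QMP command and return its reply, skipping async events."""
    msg = {"execute": execute}
    if arguments is not None:
        msg["arguments"] = arguments
    f.write(json.dumps(msg).encode() + b"\n")
    f.flush()
    while True:
        line = f.readline()
        if not line:
            raise ConnectionError("QMP peer closed the connection")
        reply = json.loads(line)
        if "return" in reply or "error" in reply:
            return reply


def ring_doorbell(sock, command, arguments):
    """Negotiate capabilities, then send the (hypothetical) doorbell command."""
    with sock.makefile("rwb") as f:
        greeting = json.loads(f.readline())
        if "QMP" not in greeting:
            raise ValueError("unexpected QMP greeting")
        qmp_command(f, "qmp_capabilities")
        return qmp_command(f, command, arguments)


def fake_qmp_server(sock, seen):
    """Stand-in for the target VM: greet, then acknowledge two commands."""
    with sock.makefile("rwb") as f:
        f.write(b'{"QMP": {"version": {}, "capabilities": []}}\r\n')
        f.flush()
        for _ in range(2):
            seen.append(json.loads(f.readline()))
            f.write(b'{"return": {}}\r\n')
            f.flush()


def demo():
    client, server = socket.socketpair()
    seen = []
    t = threading.Thread(target=fake_qmp_server, args=(server, seen))
    t.start()
    try:
        # Command name and argument are hypothetical, for illustration only.
        reply = ring_doorbell(client, "cxl-process-mctp-message",
                              {"path": "/machine/peripheral/cxl-dcd0"})
    finally:
        t.join(timeout=5)
        client.close()
        server.close()
    return reply, seen
```

Against a real target VM, the socketpair would be replaced by a TCP connection
to the address given in its -qmp option.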

How do we run the test?
Step 1: Start the VM we want to test (the Target VM).
The device of interest must have "allow-fm-attach=on,mctp-buf-init=on".
For example, in my test, it is the DCD device.

In our test, the kernel running on the target VM is Ira's DCD branch:
https://github.com/weiny2/linux-kernel/tree/dcd-v4-2024-12-11.

qemu-system-x86_64 -gdb tcp::1235  -kernel bzImage -append "root=/dev/sda rw console=ttyS0,115200 ignore_loglevel nokaslr" \
-smp 8 -accel kvm -serial mon:stdio  -nographic  -qmp tcp:localhost:4445,server,wait=off \
-netdev user,id=network0,hostfwd=tcp::2024-:22    \
-device e1000,netdev=network0  -monitor telnet:127.0.0.1:12346,server,nowait \
-drive file=/home/fan/cxl/images/qemu-image.img,index=0,media=disk,format=raw \
-machine q35,cxl=on -cpu qemu64,mce=on -m 8G,maxmem=64G,slots=8 \
-virtfs local,path=/opt/lib/modules,mount_tag=modshare,security_model=mapped  \
-virtfs local,path=/home/fan,mount_tag=homeshare,security_model=mapped \
-object memory-backend-file,id=cxl-mem2,mem-path=/tmp/host0/t3_cxl2.raw,size=4G \
-object memory-backend-file,id=cxl-lsa2,mem-path=/tmp/host0/t3_lsa2.raw,size=1M \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \
-device cxl-rp,port=0,bus=cxl.1,id=cxl_rp_port0,chassis=0,slot=2 \
-device cxl-upstream,port=2,sn=1234,bus=cxl_rp_port0,id=us0,addr=0.0,multifunction=on, \
-device cxl-switch-mailbox-cci,bus=cxl_rp_port0,addr=0.1,target=us0 \
-device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
-device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
-device cxl-downstream,port=3,bus=us0,id=swport2,chassis=0,slot=6 \
-device cxl-type3,bus=swport2,volatile-dc-memdev=cxl-mem2,id=cxl-dcd0,lsa=cxl-lsa2,num-dc-regions=2,sn=99,allow-fm-attach=on,mctp-buf-init=on \
-machine cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=1k \
-device i2c_mctp_cxl,bus=aspeed.i2c.bus.0,address=4,target=us0 \
-device i2c_mctp_cxl,bus=aspeed.i2c.bus.0,address=6,target=cxl-dcd0 \
-device virtio-rng-pci,bus=swport1

Step 2: Start the FM VM and run the test program, which sends MCTP requests
that are forwarded to the target VM for processing.

Note: the kernel for the FM VM should have MCTP support.

In the test, we use linux-v6.6-rc6 with Jonathan's MCTP hack patches:
https://github.com/moking/cxl-test-tool/blob/main/test-workflows/mctp/mctp-patches-kernel.patch

qemu-system-x86_64 -gdb tcp::1236 -kernel fm-bzImage -append "root=/dev/sda rw console=ttyS0,115200 ignore_loglevel nokaslr " \
-smp 8 -accel kvm -serial mon:stdio  -nographic  -qmp tcp:localhost:4446,server,wait=off \
-netdev user,id=network0,hostfwd=tcp::2025-:22    \
-device e1000,netdev=network0  -monitor telnet:127.0.0.1:12347,server,nowait \
-drive file=/home/fan/cxl/images/qemu-image-fm.img,index=0,media=disk,format=raw \
-machine q35,cxl=on -cpu qemu64,mce=on -m 8G,maxmem=64G,slots=8  \
-virtfs local,path=/opt/lib/modules,mount_tag=modshare,security_model=mapped  \
-virtfs local,path=/home/fan,mount_tag=homeshare,security_model=mapped \
-object memory-backend-file,id=cxl-mem2,mem-path=/tmp/host1/t3_cxl2.raw,size=4G \
-object memory-backend-file,id=cxl-lsa2,mem-path=/tmp/host1/t3_lsa2.raw,size=1M \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \
-device cxl-rp,port=0,bus=cxl.1,id=cxl_rp_port0,chassis=0,slot=2 \
-device cxl-upstream,port=2,sn=1234,bus=cxl_rp_port0,id=us0,addr=0.0,multifunction=on, \
-device cxl-switch-mailbox-cci,bus=cxl_rp_port0,addr=0.1,target=us0 \
-device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
-device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
-device cxl-downstream,port=3,bus=us0,id=swport2,chassis=0,slot=6 \
-device cxl-type3,bus=swport2,volatile-dc-memdev=cxl-mem2,id=cxl-dcd0,lsa=cxl-lsa2,num-dc-regions=2,sn=99,allow-fm-attach=on \
-machine cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=1k \
-device i2c_mctp_cxl,bus=aspeed.i2c.bus.0,address=4,target=us0 \
-device i2c_mctp_cxl,bus=aspeed.i2c.bus.0,address=6,target=cxl-dcd0,qmp=127.0.0.1:4445,mctp-msg-forward=on \
-device virtio-rng-pci,bus=swport1
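
The two command lines are linked by one detail: the qmp=127.0.0.1:4445 property
on the FM VM's forwarding i2c_mctp_cxl device has to point at the port given in
the target VM's -qmp option. A small sanity check for that could look like the
sketch below. The helper names are made up for this example, and the command
lines are shortened to the options that matter here.

```python
import shlex

# Shortened versions of the two command lines above.
TARGET_CMD = (
    "qemu-system-x86_64 -qmp tcp:localhost:4445,server,wait=off "
    "-device cxl-type3,bus=swport2,id=cxl-dcd0,"
    "allow-fm-attach=on,mctp-buf-init=on"
)
FM_CMD = (
    "qemu-system-x86_64 -qmp tcp:localhost:4446,server,wait=off "
    "-device i2c_mctp_cxl,bus=aspeed.i2c.bus.0,address=6,target=cxl-dcd0,"
    "qmp=127.0.0.1:4445,mctp-msg-forward=on"
)


def qmp_listen_port(cmdline):
    """Return the TCP port of a '-qmp tcp:HOST:PORT,...' option, or None."""
    args = shlex.split(cmdline)
    for opt, value in zip(args, args[1:]):
        if opt == "-qmp" and value.startswith("tcp:"):
            hostport = value.split(",")[0]  # drop ',server,wait=off'
            return int(hostport.rsplit(":", 1)[1])
    return None


def forward_port(cmdline):
    """Return the port of 'qmp=HOST:PORT' on a forwarding device, or None."""
    args = shlex.split(cmdline)
    for opt, value in zip(args, args[1:]):
        if opt != "-device":
            continue
        # The first comma-separated field is the device type; the rest
        # are key=value properties.
        props = dict(p.split("=", 1) for p in value.split(",")[1:] if "=" in p)
        if props.get("mctp-msg-forward") == "on" and "qmp" in props:
            return int(props["qmp"].rsplit(":", 1)[1])
    return None


def ports_match(fm_cmdline, target_cmdline):
    """True if the FM VM forwards to the port the target VM listens on."""
    port = forward_port(fm_cmdline)
    return port is not None and port == qmp_listen_port(target_cmdline)
```

With the values above, ports_match(FM_CMD, TARGET_CMD) holds because both
sides use port 4445; the FM VM's own QMP socket (4446) is not involved.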

Currently, the code is not clean at all; it is a POC to prove the idea. Only
type3 (including DCD) devices can accept requests from the FM, but it should be
easy to extend this to support switch-targeted FM command processing.

The code is based on Jonathan's cxl-2025-03-20 branch.
A qemu branch with the code: https://github.com/moking/qemu-jic-clone/tree/fm-qmp

FYI.
I have a tool to make the test easier.
https://github.com/moking/cxl-test-tool/tree/main

Part of .var.config, see run_vars.example

QEMU_ROOT=~/cxl/jic/qemu
# for FM VM
FM_KERNEL_ROOT=~/cxl/linux-v6.6-rc6/
FM_QEMU_IMG=~/cxl/images/qemu-image-fm.img

# for Target VM
KERNEL_ROOT=~/cxl/linux-dcd/
QEMU_IMG=~/cxl/images/qemu-image.img

Commands:
1. cxl-tool.py --run -T FM_TARGET
2. cxl-tool.py --attach-VM -T FM_CLIENT
3. cxl-tool.py --install-libcxlmi-fm
4. cxl-tool.py --setup-mctp-fm
5. cxl-tool.py --login-fm (run the test program with libcxlmi)

Fan Ni (3):
  cxl_type3: Preparing information sharing between VMs
  cxl_type3: Add qmp_cxl_process_mctp_message qmp interface
  cxl/i2c_mctp_cxl: Add support to process MCTP command remotely

 hw/cxl/cxl-mctp-qmp.c             |  85 +++++++++++++++
 hw/cxl/i2c_mctp_cxl.c             |  68 ++++++++++--
 hw/cxl/meson.build                |   2 +-
 hw/mem/cxl_type3.c                | 166 +++++++++++++++++++++++++++++-
 hw/mem/cxl_type3_stubs.c          |   5 +
 include/hw/cxl/cxl_device.h       |   8 ++
 include/hw/cxl/cxl_mctp_message.h |  43 ++++++++
 qapi/cxl.json                     |  18 ++++
 8 files changed, 387 insertions(+), 8 deletions(-)
 create mode 100644 hw/cxl/cxl-mctp-qmp.c
 create mode 100644 include/hw/cxl/cxl_mctp_message.h

-- 
2.47.2
Re: [RFC 0/3] Qemu FM emulation
Posted by Gregory Price 8 months, 1 week ago
On Mon, Apr 07, 2025 at 09:20:27PM -0700, nifan.cxl@gmail.com wrote:
> From: Fan Ni <fan.ni@samsung.com>
> 
> The RFC provides a way for FM emulation in Qemu. The goal is to provide
> a context where we can have more FM emulation discussions and share solutions
> for a reasonable FM implementation in Qemu.
>
... snip ...

Took a browse of the series, and I like this method.  It seems simple
and straight-forward, avoids any complex networking between the vms and
gives us what we want.

I'll wait for Jonathan's commentary, but solid prototype (bn_n)b

~Gregory
Re: [RFC 0/3] Qemu FM emulation
Posted by Fan Ni 8 months ago
On Tue, Apr 08, 2025 at 11:04:20AM -0400, Gregory Price wrote:
> On Mon, Apr 07, 2025 at 09:20:27PM -0700, nifan.cxl@gmail.com wrote:
> > From: Fan Ni <fan.ni@samsung.com>
> > 
> > The RFC provides a way for FM emulation in Qemu. The goal is to provide
> > a context where we can have more FM emulation discussions and share solutions
> > for a reasonable FM implementation in Qemu.
> >
> ... snip ...
> 
> Took a browse of the series, and I like this method.  It seems simple
> and straight-forward, avoids any complex networking between the vms and
> gives us what we want.
> 
> I'll wait for Jonathan's commentary, but solid prototype (bn_n)b
> 
> ~Gregory

Hi Jonathan,

Any feedback for this RFC?

Fan
Re: [RFC 0/3] Qemu FM emulation
Posted by Jonathan Cameron via 8 months ago
On Mon, 14 Apr 2025 08:44:07 -0700
Fan Ni <nifan.cxl@gmail.com> wrote:

> On Tue, Apr 08, 2025 at 11:04:20AM -0400, Gregory Price wrote:
> > On Mon, Apr 07, 2025 at 09:20:27PM -0700, nifan.cxl@gmail.com wrote:  
> > > From: Fan Ni <fan.ni@samsung.com>
> > > 
> > > The RFC provides a way for FM emulation in Qemu. The goal is to provide
> > > a context where we can have more FM emulation discussions and share solutions
> > > for a reasonable FM implementation in Qemu.
> > >  
> > ... snip ...
> > 
> > Took a browse of the series, and I like this method.  It seems simple
> > and straight-forward, avoids any complex networking between the vms and
> > gives us what we want.
> > 
> > I'll wait for Jonathan's commentary, but solid prototype (bn_n)b
> > 
> > ~Gregory  
> 
> Hi Jonathan,
> 
> Any feedback for this RFC?

Immediate question is whether anything similar is done in other use cases
in QEMU.  There are vaguely similar things that work via a socket but
I'm not sure the mix of a shared buffer and a qmp based doorbell is done
elsewhere.  There is use of shared memory for inter VM comms but that uses
a socket for its doorbell / interrupt path, not qmp.
https://www.qemu.org/docs/master/specs/ivshmem-spec.html

So without looking in that much detail yet, I'm not yet convinced this is
preferable to a socket over which we can send the mctp packets.

In general we need to also solve how to upstream the mctp support in
qemu or this is adding yet more stuff to my cxl staging tree.

+CC Markus for QMP part.

https://lore.kernel.org/all/20250408043051.430340-1-nifan.cxl@gmail.com/
is start of thread.
https://lore.kernel.org/all/20250408043051.430340-3-nifan.cxl@gmail.com/
is the qmp patch adding what is more or less a doorbell pinged by a device
on a different QEMU instance.

Jonathan

> 
> Fan