These are the QEMU-side patches that work with the associated Xen patches to enable vNVDIMM support for Xen HVM domains. Xen relies on QEMU to build guest NFIT and NVDIMM namespace devices, and to allocate guest address space for vNVDIMM devices.

All patches can be found at
Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3
QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3

Patch 1 avoids dereferencing the NULL pointer to non-existing label data, as the Xen-side support for labels is not implemented yet.

Patches 2 & 3 add a memory backend dedicated for Xen usage and a hotplug memory region for Xen guests, in order to make the existing nvdimm device plugging path work on Xen.

Patches 4-10 build and copy the NFIT from QEMU to the Xen guest, when QEMU is used as the Xen device model.

Haozhong Zhang (10):
  nvdimm: do not intiailize nvdimm->label_data if label size is zero
  hw/xen-hvm: create the hotplug memory region on Xen
  hostmem-xen: add a host memory backend for Xen
  nvdimm acpi: do not use fw_cfg on Xen
  hw/xen-hvm: initialize DM ACPI
  hw/xen-hvm: add function to copy ACPI into guest memory
  nvdimm acpi: copy NFIT to Xen guest
  nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest
  nvdimm acpi: do not build _FIT method on Xen
  hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled

 backends/Makefile.objs | 1 +
 backends/hostmem-xen.c | 108 ++++++++++++++++++++++++++
 backends/hostmem.c | 9 +++
 hw/acpi/aml-build.c | 10 ++-
 hw/acpi/nvdimm.c | 79 ++++++++++++++-----
 hw/i386/pc.c | 102 ++++++++++++++-----------
 hw/i386/xen/xen-hvm.c | 204 ++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/mem/nvdimm.c | 10 ++-
 hw/mem/pc-dimm.c | 6 +-
 include/hw/i386/pc.h | 1 +
 include/hw/xen/xen.h | 25 ++++++
 stubs/xen-hvm.c | 10 +++
 12 files changed, 495 insertions(+), 70 deletions(-)
 create mode 100644 backends/hostmem-xen.c

--
2.11.0
Hi, This series seems to have some coding style problems. See output below for more information: Subject: [Qemu-devel] [RFC QEMU PATCH v3 00/10] Implement vNVDIMM for Xen HVM guest Message-id: 20170911044157.15403-1-haozhong.zhang@intel.com Type: series === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 git config --local diff.renamelimit 0 git config --local diff.renames True commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/20170911044157.15403-1-haozhong.zhang@intel.com -> patchew/20170911044157.15403-1-haozhong.zhang@intel.com Switched to a new branch 'test' d5f5b8faf2 hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled 73b52971f5 nvdimm acpi: do not build _FIT method on Xen 1c6eeac40e nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest f2d6097366 nvdimm acpi: copy NFIT to Xen guest 69ddac3d65 hw/xen-hvm: add function to copy ACPI into guest memory cae88474b2 hw/xen-hvm: initialize DM ACPI 23a0e4204a nvdimm acpi: do not use fw_cfg on Xen e298be5d96 hostmem-xen: add a host memory backend for Xen f069bbb659 hw/xen-hvm: create the hotplug memory region on Xen 69b6b6e9fa nvdimm: do not intiailize nvdimm->label_data if label size is zero === OUTPUT BEGIN === Checking PATCH 1/10: nvdimm: do not intiailize nvdimm->label_data if label size is zero... ERROR: braces {} are necessary for all arms of this statement #33: FILE: hw/mem/nvdimm.c:97: + if (nvdimm->label_size) [...] total: 1 errors, 0 warnings, 16 lines checked Your patch has style problems, please review. 
If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Checking PATCH 2/10: hw/xen-hvm: create the hotplug memory region on Xen... ERROR: braces {} are necessary for all arms of this statement #29: FILE: hw/i386/pc.c:1356: + if (!pcmc->has_reserved_memory || machine->ram_size >= machine->maxram_size) [...] total: 1 errors, 0 warnings, 113 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Checking PATCH 3/10: hostmem-xen: add a host memory backend for Xen... Checking PATCH 4/10: nvdimm acpi: do not use fw_cfg on Xen... Checking PATCH 5/10: hw/xen-hvm: initialize DM ACPI... Checking PATCH 6/10: hw/xen-hvm: add function to copy ACPI into guest memory... Checking PATCH 7/10: nvdimm acpi: copy NFIT to Xen guest... Checking PATCH 8/10: nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest... Checking PATCH 9/10: nvdimm acpi: do not build _FIT method on Xen... Checking PATCH 10/10: hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled... === OUTPUT END === Test command exited with code: 1 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-devel@freelists.org
On Mon, 11 Sep 2017 12:41:47 +0800 Haozhong Zhang <haozhong.zhang@intel.com> wrote: > This is the QEMU part patches that works with the associated Xen > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > guest address space for vNVDIMM devices. > > All patches can be found at > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > label data, as the Xen side support for labels is not implemented yet. > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > memory region for Xen guest, in order to make the existing nvdimm > device plugging path work on Xen. > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > used as the Xen device model. I've skimmed over the patch set and can't say that I'm happy with the number of xen_enabled() invariants it introduces, as well as with the partial blobs it creates. I'd like to reduce the above, and a way to do this might be making Xen 1. use fw_cfg 2. fetch QEMU-built ACPI tables from fw_cfg 3. extract the nvdimm tables (which is trivial) and use them Looking at xen_load_linux(), it seems possible to use fw_cfg. So what's stopping Xen from using it elsewhere, instead of adding more Xen-specific code to do 'the same' job without reusing/sharing common code with tcg/kvm?
> Haozhong Zhang (10): > nvdimm: do not intiailize nvdimm->label_data if label size is zero > hw/xen-hvm: create the hotplug memory region on Xen > hostmem-xen: add a host memory backend for Xen > nvdimm acpi: do not use fw_cfg on Xen > hw/xen-hvm: initialize DM ACPI > hw/xen-hvm: add function to copy ACPI into guest memory > nvdimm acpi: copy NFIT to Xen guest > nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest > nvdimm acpi: do not build _FIT method on Xen > hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled > > backends/Makefile.objs | 1 + > backends/hostmem-xen.c | 108 ++++++++++++++++++++++++++ > backends/hostmem.c | 9 +++ > hw/acpi/aml-build.c | 10 ++- > hw/acpi/nvdimm.c | 79 ++++++++++++++----- > hw/i386/pc.c | 102 ++++++++++++++----------- > hw/i386/xen/xen-hvm.c | 204 ++++++++++++++++++++++++++++++++++++++++++++++++- > hw/mem/nvdimm.c | 10 ++- > hw/mem/pc-dimm.c | 6 +- > include/hw/i386/pc.h | 1 + > include/hw/xen/xen.h | 25 ++++++ > stubs/xen-hvm.c | 10 +++ > 12 files changed, 495 insertions(+), 70 deletions(-) > create mode 100644 backends/hostmem-xen.c >
CC'ing xen-devel, and the Xen tools and x86 maintainers. On Mon, 11 Sep 2017, Igor Mammedov wrote: > On Mon, 11 Sep 2017 12:41:47 +0800 > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > This is the QEMU part patches that works with the associated Xen > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > guest address space for vNVDIMM devices. > > > > All patches can be found at > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > label data, as the Xen side support for labels is not implemented yet. > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > memory region for Xen guest, in order to make the existing nvdimm > > device plugging path work on Xen. > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > used as the Xen device model. > > I've skimmed over patch-set and can't say that I'm happy with > number of xen_enabled() invariants it introduced as well as > with partial blobs it creates. I have not read the series (Haozhong, please CC me, Anthony and xen-devel to the whole series next time), but yes, indeed. Let's not add more xen_enabled() if possible. Haozhong, was there a design document thread on xen-devel about this? If so, did it reach a conclusion? Was the design accepted? If so, please add a link to the design doc in the introductory email, so that everybody can read it and be on the same page. > I'd like to reduce above and a way to do this might be making xen > 1. use fw_cfg > 2. fetch QEMU build acpi tables from fw_cfg > 3. extract nvdim tables (which is trivial) and use them > > looking at xen_load_linux(), it seems possible to use fw_cfg. 
> > So what's stopping xen from using it elsewhere?, > instead of adding more xen specific code to do 'the same' > job and not reusing/sharing common code with tcg/kvm. So far, ACPI tables have not been generated by QEMU. Xen HVM machines rely on a firmware-like application called "hvmloader" that runs in guest context and generates the ACPI tables. I have no opinions on hvmloader and I'll let the Xen maintainers talk about it. However, keep in mind that with an HVM guest some devices are emulated by Xen and/or by other device emulators that can run alongside QEMU. QEMU doesn't have a full view of the system. Here the question is: does it have to be QEMU that generates the ACPI blobs for the nvdimm? It would be nicer if it were up to hvmloader like the rest, instead of introducing this split-brain design about ACPI. We need to see a design doc to fully understand this. If the design doc thread led to thinking that it has to be QEMU that generates them, then would it make the code nicer if we used fw_cfg to get the (full or partial) tables from QEMU, as Igor suggested?
> > Haozhong Zhang (10): > > nvdimm: do not intiailize nvdimm->label_data if label size is zero > > hw/xen-hvm: create the hotplug memory region on Xen > > hostmem-xen: add a host memory backend for Xen > > nvdimm acpi: do not use fw_cfg on Xen > > hw/xen-hvm: initialize DM ACPI > > hw/xen-hvm: add function to copy ACPI into guest memory > > nvdimm acpi: copy NFIT to Xen guest > > nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest > > nvdimm acpi: do not build _FIT method on Xen > > hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled > > > > backends/Makefile.objs | 1 + > > backends/hostmem-xen.c | 108 ++++++++++++++++++++++++++ > > backends/hostmem.c | 9 +++ > > hw/acpi/aml-build.c | 10 ++- > > hw/acpi/nvdimm.c | 79 ++++++++++++++----- > > hw/i386/pc.c | 102 ++++++++++++++----------- > > hw/i386/xen/xen-hvm.c | 204 ++++++++++++++++++++++++++++++++++++++++++++++++- > > hw/mem/nvdimm.c | 10 ++- > > hw/mem/pc-dimm.c | 6 +- > > include/hw/i386/pc.h | 1 + > > include/hw/xen/xen.h | 25 ++++++ > > stubs/xen-hvm.c | 10 +++ > > 12 files changed, 495 insertions(+), 70 deletions(-) > > create mode 100644 backends/hostmem-xen.c > > >
On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > On Mon, 11 Sep 2017 12:41:47 +0800 > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > This is the QEMU part patches that works with the associated Xen > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > guest address space for vNVDIMM devices. > > > > > > All patches can be found at > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > memory region for Xen guest, in order to make the existing nvdimm > > > device plugging path work on Xen. > > > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > > used as the Xen device model. > > > > I've skimmed over patch-set and can't say that I'm happy with > > number of xen_enabled() invariants it introduced as well as > > with partial blobs it creates. > > I have not read the series (Haozhong, please CC me, Anthony and > xen-devel to the whole series next time), but yes, indeed. Let's not add > more xen_enabled() if possible. > > Haozhong, was there a design document thread on xen-devel about this? If > so, did it reach a conclusion? Was the design accepted? If so, please > add a link to the design doc in the introductory email, so that > everybody can read it and be on the same page. Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed the guest ACPI. 
[1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html > > > > I'd like to reduce above and a way to do this might be making xen > > 1. use fw_cfg > > 2. fetch QEMU build acpi tables from fw_cfg > > 3. extract nvdim tables (which is trivial) and use them > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > So what's stopping xen from using it elsewhere?, > > instead of adding more xen specific code to do 'the same' > > job and not reusing/sharing common code with tcg/kvm. > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > rely on a firmware-like application called "hvmloader" that runs in > guest context and generates the ACPI tables. I have no opinions on > hvmloader and I'll let the Xen maintainers talk about it. However, keep > in mind that with an HVM guest some devices are emulated by Xen and/or > by other device emulators that can run alongside QEMU. QEMU doesn't have > a full few of the system. > > Here the question is: does it have to be QEMU the one to generate the > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > like the rest, instead of introducing this split-brain design about > ACPI. We need to see a design doc to fully understand this. > hvmloader runs in the guest and is responsible for building/loading guest ACPI. However, it is not capable of building AML at runtime (for lack of an AML builder). If any guest ACPI object is needed (e.g. by the guest DSDT), it has to be generated from ASL by iasl at Xen compile time and then loaded by hvmloader at runtime. Xen includes an OperationRegion "BIOS" in the statically generated guest DSDT, whose address is hardcoded and which contains a list of values filled by hvmloader at runtime. Other ACPI objects can refer to those values (e.g., the number of vCPUs).
But this is not enough for generating guest NVDIMM ACPI objects at compile time and then customizing and loading them via hvmloader, because their structure (i.e., the number of namespace devices) cannot be decided until the guest config is known. Alternatively, we may introduce an AML builder in hvmloader and build all guest ACPI completely in hvmloader. Looking at the similar implementation in QEMU, it would not be small, compared to the current size of hvmloader. Besides, I'm still going to let QEMU handle guest NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to build the NVDIMM ACPI. > If the design doc thread led into thinking that it has to be QEMU to > generate them, then would it make the code nicer if we used fw_cfg to > get the (full or partial) tables from QEMU, as Igor suggested? I'll have a look at the code (which I didn't notice) pointed out by Igor. One possible issue with using fw_cfg is how to avoid conflicts between ACPI built by QEMU and ACPI built by hvmloader (e.g., both may use the same table signature / device names / ...). In my current design, QEMU will pass the table signatures and device names used in its ACPI to Xen, and Xen can check for conflicts with its own ACPI. Perhaps we can add the necessary functions in fw_cfg as well. Anyway, let me first look at the code. Thanks, Haozhong
On Tue, Sep 12, 2017 at 11:15:09AM +0800, Haozhong Zhang wrote: > On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > > On Mon, 11 Sep 2017 12:41:47 +0800 > > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > > > This is the QEMU part patches that works with the associated Xen > > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > > guest address space for vNVDIMM devices. > > > > > > > > All patches can be found at > > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > > memory region for Xen guest, in order to make the existing nvdimm > > > > device plugging path work on Xen. > > > > > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > > > used as the Xen device model. > > > > > > I've skimmed over patch-set and can't say that I'm happy with > > > number of xen_enabled() invariants it introduced as well as > > > with partial blobs it creates. > > > > I have not read the series (Haozhong, please CC me, Anthony and > > xen-devel to the whole series next time), but yes, indeed. Let's not add > > more xen_enabled() if possible. > > > > Haozhong, was there a design document thread on xen-devel about this? If > > so, did it reach a conclusion? Was the design accepted? If so, please > > add a link to the design doc in the introductory email, so that > > everybody can read it and be on the same page. > > Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed > the guest ACPI. 
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html Igor, did you have a chance to read it? .. see below > > > > > > > > I'd like to reduce above and a way to do this might be making xen > > > 1. use fw_cfg > > > 2. fetch QEMU build acpi tables from fw_cfg > > > 3. extract nvdim tables (which is trivial) and use them > > > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > > > So what's stopping xen from using it elsewhere?, > > > instead of adding more xen specific code to do 'the same' > > > job and not reusing/sharing common code with tcg/kvm. > > > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > > rely on a firmware-like application called "hvmloader" that runs in > > guest context and generates the ACPI tables. I have no opinions on > > hvmloader and I'll let the Xen maintainers talk about it. However, keep > > in mind that with an HVM guest some devices are emulated by Xen and/or > > by other device emulators that can run alongside QEMU. QEMU doesn't have > > a full few of the system. > > > > Here the question is: does it have to be QEMU the one to generate the > > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > > like the rest, instead of introducing this split-brain design about > > ACPI. We need to see a design doc to fully understand this. > > > > hvmloader runs in the guest and is responsible to build/load guest > ACPI. However, it's not capable to build AML at runtime (for the lack > of AML builder). If any guest ACPI object is needed (e.g. by guest > DSDT), it has to be generated from ASL by iasl at Xen compile time and > then be loaded by hvmloader at runtime. > > Xen includes an OperationRegion "BIOS" in the static generated guest > DSDT, whose address is hardcoded and which contains a list of values > filled by hvmloader at runtime. Other ACPI objects can refer to those > values (e.g., the number of vCPUs). 
But it's not enough for generating > guest NVDIMM ACPI objects at compile time and then being customized > and loaded by hvmload, because its structure (i.e., the number of > namespace devices) cannot be decided util the guest config is known. > > Alternatively, we may introduce an AML builder in hvmloader and build > all guest ACPI completely in hvmloader. Looking at the similar > implementation in QEMU, it would not be small, compared to the current > size of hvmloader. Besides, I'm still going to let QEMU handle guest > NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to > build NVDIMM ACPI. > > > If the design doc thread led into thinking that it has to be QEMU to > > generate them, then would it make the code nicer if we used fw_cfg to > > get the (full or partial) tables from QEMU, as Igor suggested? > > I'll have a look at the code (which I didn't notice) pointed by Igor. And there is a spec too! https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt Igor, did you have in mind to use FW_CFG_FILE_DIR to retrieve the ACPI AML code? > > One possible issue to use fw_cfg is how to avoid the conflict between > ACPI built by QEMU and ACPI built by hvmloader (e.g., both may use the > same table signature / device names / ...). In my current design, QEMU > will pass the table signatures and device names used in its ACPI to > Xen, and Xen can check the conflict with its own ACPI. Perhaps we can > add necessary functions in fw_cfg as well. Anyway, let me first look > at the code. > > Thanks, > Haozhong
On 10/10/17 12:05 -0400, Konrad Rzeszutek Wilk wrote: > On Tue, Sep 12, 2017 at 11:15:09AM +0800, Haozhong Zhang wrote: > > On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > > > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > > > > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > > > On Mon, 11 Sep 2017 12:41:47 +0800 > > > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > > > > > This is the QEMU part patches that works with the associated Xen > > > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > > > guest address space for vNVDIMM devices. > > > > > > > > > > All patches can be found at > > > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > > > memory region for Xen guest, in order to make the existing nvdimm > > > > > device plugging path work on Xen. > > > > > > > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > > > > used as the Xen device model. > > > > > > > > I've skimmed over patch-set and can't say that I'm happy with > > > > number of xen_enabled() invariants it introduced as well as > > > > with partial blobs it creates. > > > > > > I have not read the series (Haozhong, please CC me, Anthony and > > > xen-devel to the whole series next time), but yes, indeed. Let's not add > > > more xen_enabled() if possible. > > > > > > Haozhong, was there a design document thread on xen-devel about this? If > > > so, did it reach a conclusion? Was the design accepted? 
If so, please > > > add a link to the design doc in the introductory email, so that > > > everybody can read it and be on the same page. > > > > Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed > > the guest ACPI. > > > > [1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html > > Igor, did you have a chance to read it? > > .. see below > > > > > > > > > > > > I'd like to reduce above and a way to do this might be making xen > > > > 1. use fw_cfg > > > > 2. fetch QEMU build acpi tables from fw_cfg > > > > 3. extract nvdim tables (which is trivial) and use them > > > > > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > > > > > So what's stopping xen from using it elsewhere?, > > > > instead of adding more xen specific code to do 'the same' > > > > job and not reusing/sharing common code with tcg/kvm. > > > > > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > > > rely on a firmware-like application called "hvmloader" that runs in > > > guest context and generates the ACPI tables. I have no opinions on > > > hvmloader and I'll let the Xen maintainers talk about it. However, keep > > > in mind that with an HVM guest some devices are emulated by Xen and/or > > > by other device emulators that can run alongside QEMU. QEMU doesn't have > > > a full few of the system. > > > > > > Here the question is: does it have to be QEMU the one to generate the > > > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > > > like the rest, instead of introducing this split-brain design about > > > ACPI. We need to see a design doc to fully understand this. > > > > > > > hvmloader runs in the guest and is responsible to build/load guest > > ACPI. However, it's not capable to build AML at runtime (for the lack > > of AML builder). If any guest ACPI object is needed (e.g. 
by guest > > DSDT), it has to be generated from ASL by iasl at Xen compile time and > > then be loaded by hvmloader at runtime. > > > > Xen includes an OperationRegion "BIOS" in the static generated guest > > DSDT, whose address is hardcoded and which contains a list of values > > filled by hvmloader at runtime. Other ACPI objects can refer to those > > values (e.g., the number of vCPUs). But it's not enough for generating > > guest NVDIMM ACPI objects at compile time and then being customized > > and loaded by hvmload, because its structure (i.e., the number of > > namespace devices) cannot be decided util the guest config is known. > > > > Alternatively, we may introduce an AML builder in hvmloader and build > > all guest ACPI completely in hvmloader. Looking at the similar > > implementation in QEMU, it would not be small, compared to the current > > size of hvmloader. Besides, I'm still going to let QEMU handle guest > > NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to > > build NVDIMM ACPI. > > > > > If the design doc thread led into thinking that it has to be QEMU to > > > generate them, then would it make the code nicer if we used fw_cfg to > > > get the (full or partial) tables from QEMU, as Igor suggested? > > > > I'll have a look at the code (which I didn't notice) pointed by Igor. > > And there is a spec too! > > https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt > > Igor, did you have in mind to use FW_CFG_FILE_DIR to retrieve the > ACPI AML code? > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and /rom@etc/table-loader. The former is unstructured to guest, and contains all data of guest ACPI. The latter is a BIOSLinkerLoader organized as a set of commands, which direct the guest (e.g., SeaBIOS on KVM/QEMU) to relocate data in the former file, recalculate checksum of specified area, and fill guest address in specified ACPI field. 
One part of my patches is to implement a mechanism to tell Xen which part of ACPI data is a table (NFIT), and which part defines a namespace device and what the device name is. I can add two new loader commands for them respectively. Because they just provide information and SeaBIOS in non-xen environment ignores unrecognized commands, they will not break SeaBIOS in non-xen environment. On QEMU side, most Xen-specific hacks in ACPI builder could be dropped, and replaced by adding the new loader commands (though they may be used only by Xen). On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor are needed in, perhaps, hvmloader. Haozhong
On 12/10/2017 14:45, Haozhong Zhang wrote: > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > /rom@etc/table-loader. The former is unstructured to guest, and > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > organized as a set of commands, which direct the guest (e.g., SeaBIOS > on KVM/QEMU) to relocate data in the former file, recalculate checksum > of specified area, and fill guest address in specified ACPI field. > > One part of my patches is to implement a mechanism to tell Xen which > part of ACPI data is a table (NFIT), and which part defines a > namespace device and what the device name is. I can add two new loader > commands for them respectively. > > Because they just provide information and SeaBIOS in non-xen > environment ignores unrecognized commands, they will not break SeaBIOS > in non-xen environment. > > On QEMU side, most Xen-specific hacks in ACPI builder could be > dropped, and replaced by adding the new loader commands (though they > may be used only by Xen). > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > are needed in, perhaps, hvmloader. If Xen has to parse BIOSLinkerLoader, it can use the existing commands to process a reduced set of ACPI tables. In other words, etc/acpi/tables would only include the NFIT, the SSDT with namespace devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. hvmloader can then: 1) allocate some memory for where the XSDT will go 2) process the BIOSLinkerLoader like SeaBIOS would do 3) find the RSDP in low memory, since the loader script must have placed it there. If it cannot find it, allocate some low memory, fill it with the RSDP header and revision, and jump to step 6 4) If it found QEMU's RSDP, use it to find QEMU's XSDT 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT.
7) overwrite the RSDP in low memory with a pointer to hvmloader's own RSDT and/or XSDT, and update the checksums. QEMU's XSDT remains there somewhere in memory, unused but harmless. Paolo
On 10/12/17 17:45 +0200, Paolo Bonzini wrote: > On 12/10/2017 14:45, Haozhong Zhang wrote: > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > > /rom@etc/table-loader. The former is unstructured to guest, and > > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > on KVM/QEMU) to relocate data in the former file, recalculate checksum > > of specified area, and fill guest address in specified ACPI field. > > > > One part of my patches is to implement a mechanism to tell Xen which > > part of ACPI data is a table (NFIT), and which part defines a > > namespace device and what the device name is. I can add two new loader > > commands for them respectively. > > > > Because they just provide information and SeaBIOS in non-xen > > environment ignores unrecognized commands, they will not break SeaBIOS > > in non-xen environment. > > > > On QEMU side, most Xen-specific hacks in ACPI builder could be > > dropped, and replaced by adding the new loader commands (though they > > may be used only by Xen). > > > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > are needed in, perhaps, hvmloader. > > If Xen has to parse BIOSLinkerLoader, it can use the existing commands > to process a reduced set of ACPI tables. In other words, > etc/acpi/tables would only include the NFIT, the SSDT with namespace > devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. > > hvmloader can then: > > 1) allocate some memory for where the XSDT will go > > 2) process the BIOSLinkerLoader like SeaBIOS would do > > 3) find the RSDP in low memory, since the loader script must have placed > it there. If it cannot find it, allocate some low memory, fill it with > the RSDP header and revision, and and jump to step 6 > > 4) If it found QEMU's RSDP, use it to find QEMU's XSDT > > 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT. 
> > 6) build hvmloader tables and link them into the RSDT and/or XSDT as usual. > > 7) overwrite the RSDP in low memory with a pointer to hvmloader's own > RSDT and/or XSDT, and update the checksums > > QEMU's XSDT remains there somewhere in memory, unused but harmless. > It can work for plain tables which do not contain AML. However, for a namespace device, Xen needs to know its name in order to detect the potential name conflict with those used in Xen-built ACPI. Xen does not (and is not going to) introduce an AML parser, so it cannot get those device names from QEMU-built ACPI on its own. The idea of either this patch series or the new BIOSLinkerLoader command is to let QEMU tell Xen where the definition body of a namespace device (i.e. the part within the outermost "Device(NAME)") is and what the device name is. Xen, after the name conflict check, can re-package the definition body in a namespace device (w/ minimal AML builder code added in Xen) and then in an SSDT. Haozhong
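The re-packaging step Haozhong describes needs only a tiny AML builder: Device(NAME) { ... } encodes as ExtOpPrefix (0x5B), DeviceOp (0x82), a PkgLength, the 4-character NameSeg, then the definition body verbatim. A hedged sketch of what "minimal AML builder code added in Xen" could look like — illustrative code, not from the Xen tree, and handling only the single-byte PkgLength form (total length up to 63 bytes):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Wrap an already-built AML definition body in Device(NAME) { body }.
 * Returns the number of bytes written to out.  Only the one-byte
 * PkgLength encoding is implemented; larger bodies would need the
 * multi-byte PkgLength form from the ACPI spec. */
static size_t aml_wrap_device(uint8_t *out, const char name[4],
                              const uint8_t *body, size_t body_len)
{
    size_t pkg_len = 1 + 4 + body_len;   /* PkgLength byte + NameSeg + body */
    assert(pkg_len <= 0x3f);             /* single-byte PkgLength only */

    size_t n = 0;
    out[n++] = 0x5b;                     /* ExtOpPrefix */
    out[n++] = 0x82;                     /* DeviceOp */
    out[n++] = (uint8_t)pkg_len;
    memcpy(out + n, name, 4);            /* NameSeg, e.g. "NVDR" */
    n += 4;
    memcpy(out + n, body, body_len);     /* definition body from QEMU */
    return n + body_len;
}
```

The resulting Device can then be dropped into an SSDT whose header and checksum Xen already knows how to emit.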
On Fri, 13 Oct 2017 15:53:26 +0800 Haozhong Zhang <haozhong.zhang@intel.com> wrote: > On 10/12/17 17:45 +0200, Paolo Bonzini wrote: > > On 12/10/2017 14:45, Haozhong Zhang wrote: > > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > > > /rom@etc/table-loader. The former is unstructured to guest, and > > > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > > on KVM/QEMU) to relocate data in the former file, recalculate checksum > > > of specified area, and fill guest address in specified ACPI field. > > > > > > One part of my patches is to implement a mechanism to tell Xen which > > > part of ACPI data is a table (NFIT), and which part defines a > > > namespace device and what the device name is. I can add two new loader > > > commands for them respectively. > > > > > > Because they just provide information and SeaBIOS in non-xen > > > environment ignores unrecognized commands, they will not break SeaBIOS > > > in non-xen environment. > > > > > > On QEMU side, most Xen-specific hacks in ACPI builder could be > > > dropped, and replaced by adding the new loader commands (though they > > > may be used only by Xen). > > > > > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > > are needed in, perhaps, hvmloader. > > > > If Xen has to parse BIOSLinkerLoader, it can use the existing commands > > to process a reduced set of ACPI tables. In other words, > > etc/acpi/tables would only include the NFIT, the SSDT with namespace > > devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. > > > > hvmloader can then: > > > > 1) allocate some memory for where the XSDT will go > > > > 2) process the BIOSLinkerLoader like SeaBIOS would do > > > > 3) find the RSDP in low memory, since the loader script must have placed > > it there. 
If it cannot find it, allocate some low memory, fill it with > > the RSDP header and revision, and and jump to step 6 > > > > 4) If it found QEMU's RSDP, use it to find QEMU's XSDT > > > > 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT. > > > > 6) build hvmloader tables and link them into the RSDT and/or XSDT as usual. > > > > 7) overwrite the RSDP in low memory with a pointer to hvmloader's own > > RSDT and/or XSDT, and updated the checksums > > > > QEMU's XSDT remains there somewhere in memory, unused but harmless. > > +1 to Paolo's suggestion, i.e. 1. add BIOSLinkerLoader into hvmloader 2. load/process QEMU's tables with #1 3. get pointers to QEMU generated NFIT and NVDIMM SSDT from QEMU's RSDT/XSDT and put them in hvmloader's RSDT > It can work for plan tables which do not contain AML. > > However, for a namespace device, Xen needs to know its name in order > to detect the potential name conflict with those used in Xen built > ACPI. Xen does not (and is not going to) introduce an AML parser, so > it cannot get those device names from QEMU built ACPI by its own. > > The idea of either this patch series or the new BIOSLinkerLoader > command is to let QEMU tell Xen where the definition body of a > namespace device (i.e. that part within the outmost "Device(NAME)") is > and what the device name is. Xen, after the name conflict check, can > re-package the definition body in a namespace device (w/ minimal AML > builder code added in Xen) and then in SSDT. I'd skip conflict check at runtime as hvmloader doesn't currently have "\\_SB\NVDR" device so instead of doing runtime check it might do primitive check at build time that ASL sources in hvmloader do not contain reserved for QEMU "NVDR" keyword to avoid its addition by accident in future. (it also might be reused in future if some other tables from QEMU will be reused). 
It's a bit hackish but at least it does the job and keeps the BIOSLinkerLoader interface the same for all supported firmwares (I'd consider it a temporary hack on the way to ACPI tables fully built by QEMU for Xen). Ideally it would be better for QEMU to build all ACPI tables for hvmloader, to avoid split-brain issues and the need to invent extra interfaces every time a feature is added to pass configuration data from QEMU to firmware. But that's probably out of scope of this project; it could be done on top of this if the Xen folks would like to do it. Adding BIOSLinkerLoader to hvmloader would be a good starting point for that future effort.
On 10/13/17 10:44 +0200, Igor Mammedov wrote: > On Fri, 13 Oct 2017 15:53:26 +0800 > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > On 10/12/17 17:45 +0200, Paolo Bonzini wrote: > > > On 12/10/2017 14:45, Haozhong Zhang wrote: > > > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > > > > /rom@etc/table-loader. The former is unstructured to guest, and > > > > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > > > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > > > on KVM/QEMU) to relocate data in the former file, recalculate checksum > > > > of specified area, and fill guest address in specified ACPI field. > > > > > > > > One part of my patches is to implement a mechanism to tell Xen which > > > > part of ACPI data is a table (NFIT), and which part defines a > > > > namespace device and what the device name is. I can add two new loader > > > > commands for them respectively. > > > > > > > > Because they just provide information and SeaBIOS in non-xen > > > > environment ignores unrecognized commands, they will not break SeaBIOS > > > > in non-xen environment. > > > > > > > > On QEMU side, most Xen-specific hacks in ACPI builder could be > > > > dropped, and replaced by adding the new loader commands (though they > > > > may be used only by Xen). > > > > > > > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > > > are needed in, perhaps, hvmloader. > > > > > > If Xen has to parse BIOSLinkerLoader, it can use the existing commands > > > to process a reduced set of ACPI tables. In other words, > > > etc/acpi/tables would only include the NFIT, the SSDT with namespace > > > devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. 
> > > > > > hvmloader can then: > > > > > > 1) allocate some memory for where the XSDT will go > > > > > > 2) process the BIOSLinkerLoader like SeaBIOS would do > > > > > > 3) find the RSDP in low memory, since the loader script must have placed > > > it there. If it cannot find it, allocate some low memory, fill it with > > > the RSDP header and revision, and and jump to step 6 > > > > > > 4) If it found QEMU's RSDP, use it to find QEMU's XSDT > > > > > > 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT. > > > > > > 6) build hvmloader tables and link them into the RSDT and/or XSDT as usual. > > > > > > 7) overwrite the RSDP in low memory with a pointer to hvmloader's own > > > RSDT and/or XSDT, and updated the checksums > > > > > > QEMU's XSDT remains there somewhere in memory, unused but harmless. > > > > +1 to Paolo's suggestion, i.e. > 1. add BIOSLinkerLoader into hvmloader > 2. load/process QEMU's tables with #1 > 3. get pointers to QEMU generated NFIT and NVDIMM SSDT from QEMU's RSDT/XSDT > and put them in hvmloader's RSDT > > > It can work for plan tables which do not contain AML. > > > > However, for a namespace device, Xen needs to know its name in order > > to detect the potential name conflict with those used in Xen built > > ACPI. Xen does not (and is not going to) introduce an AML parser, so > > it cannot get those device names from QEMU built ACPI by its own. > > > > The idea of either this patch series or the new BIOSLinkerLoader > > command is to let QEMU tell Xen where the definition body of a > > namespace device (i.e. that part within the outmost "Device(NAME)") is > > and what the device name is. Xen, after the name conflict check, can > > re-package the definition body in a namespace device (w/ minimal AML > > builder code added in Xen) and then in SSDT. 
> > I'd skip conflict check at runtime as hvmloader doesn't currently > have "\\_SB\NVDR" device so instead of doing runtime check it might > do primitive check at build time that ASL sources in hvmloader do > not contain reserved for QEMU "NVDR" keyword to avoid its addition > by accident in future. (it also might be reused in future if some > other tables from QEMU will be reused). > It's a bit hackish but at least it does the job and keeps > BIOSLinkerLoader interface the same for all supported firmwares > (I'd consider it as a temporary hack on the way to fully build > by QEMU ACPI tables for Xen). > > Ideally it would be better for QEMU to build all ACPI tables for > hvmloader to avoid split brain issues and need to invent extra > interfaces every time a feature is added to pass configuration > data from QEMU to firmware. > But that's probably out of scope of this project, it could be > done on top of this if Xen folks would like to do it. Adding > BIOSLinkerLoader to hvmloader would be a good starting point > for that future effort. If we can let QEMU build the entire guest ACPI, we may not even need to introduce fw_cfg and BIOSLinkerLoader code to hvmloader. SeaBIOS is currently loaded after hvmloader and can be used to load the QEMU-built ACPI. To Jan, Andrew, Stefano and Anthony, what do you think about allowing QEMU to build the entire guest ACPI and letting SeaBIOS load it? The ACPI builder code in hvmloader is still there and just bypassed in this case. Haozhong
>>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > To Jan, Andrew, Stefano and Anthony, > > what do you think about allowing QEMU to build the entire guest ACPI > and letting SeaBIOS load it? ACPI builder code in hvmloader is > still there and just bypassed in this case. Well, if that can be made to work in a non-quirky way and without loss of functionality, I'd probably be fine. I do think, however, that there's a reason this is being handled in hvmloader right now. Jan
On Fri, 13 Oct 2017, Jan Beulich wrote: > >>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > > To Jan, Andrew, Stefano and Anthony, > > > > what do you think about allowing QEMU to build the entire guest ACPI > > and letting SeaBIOS to load it? ACPI builder code in hvmloader is > > still there and just bypassed in this case. > > Well, if that can be made work in a non-quirky way and without > loss of functionality, I'd probably be fine. I do think, however, > that there's a reason this is being handled in hvmloader right now. And not to discourage you, just as a clarification, you'll also need to consider backward compatibility: unless the tables are identical, I imagine we'll have to keep using the old tables for already installed virtual machines.
On Fri, Oct 13, 2017 at 03:46:39PM -0700, Stefano Stabellini wrote: > On Fri, 13 Oct 2017, Jan Beulich wrote: > > >>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > > > To Jan, Andrew, Stefano and Anthony, > > > > > > what do you think about allowing QEMU to build the entire guest ACPI > > > and letting SeaBIOS to load it? ACPI builder code in hvmloader is > > > still there and just bypassed in this case. > > > > Well, if that can be made work in a non-quirky way and without > > loss of functionality, I'd probably be fine. I do think, however, > > that there's a reason this is being handled in hvmloader right now. > > And not to discourage you, just as a clarification, you'll also need to > consider backward compatibility: unless the tables are identical, I > imagine we'll have to keep using the old tables for already installed > virtual machines. Maybe you can handle this using machine type versioning. Installed guests would use the old type. -- MST
On Sun, Oct 15, 2017 at 03:31:15AM +0300, Michael S. Tsirkin wrote: > On Fri, Oct 13, 2017 at 03:46:39PM -0700, Stefano Stabellini wrote: > > On Fri, 13 Oct 2017, Jan Beulich wrote: > > > >>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > > > > To Jan, Andrew, Stefano and Anthony, > > > > > > > > what do you think about allowing QEMU to build the entire guest ACPI > > > > and letting SeaBIOS to load it? ACPI builder code in hvmloader is > > > > still there and just bypassed in this case. > > > > > > Well, if that can be made work in a non-quirky way and without > > > loss of functionality, I'd probably be fine. I do think, however, > > > that there's a reason this is being handled in hvmloader right now. > > > > And not to discourage you, just as a clarification, you'll also need to > > consider backward compatibility: unless the tables are identical, I > > imagine we'll have to keep using the old tables for already installed > > virtual machines. > > Maybe you can handle this using machine type versioning. <nods> And the type could be v2 if nvdimm was provided (which is something that the toolstack would figure out). The toolstack could also have a separate 'v2' config flag if somebody wanted to play with this _outside_ of having NVDIMM in the guest? > Installed guests would use the old type. <nods>
On 14/10/2017 00:46, Stefano Stabellini wrote: > On Fri, 13 Oct 2017, Jan Beulich wrote: >>>>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: >>> To Jan, Andrew, Stefano and Anthony, >>> >>> what do you think about allowing QEMU to build the entire guest ACPI >>> and letting SeaBIOS to load it? ACPI builder code in hvmloader is >>> still there and just bypassed in this case. >> Well, if that can be made work in a non-quirky way and without >> loss of functionality, I'd probably be fine. I do think, however, >> that there's a reason this is being handled in hvmloader right now. > And not to discourage you, just as a clarification, you'll also need to > consider backward compatibility: unless the tables are identical, I > imagine we'll have to keep using the old tables for already installed > virtual machines. I agree. Some of them are already identical, some are not but the QEMU version should be okay, and for yet more it's probably better to keep the Xen-specific parts in hvmloader. The good thing is that it's possible to proceed incrementally once you have the hvmloader support for merging the QEMU and hvmloader RSDT or XSDT (whatever you are using), starting with just NVDIMM and proceeding later with whatever you see fit. Paolo
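The RSDT/XSDT merge Paolo mentions boils down to appending QEMU's 64-bit table pointers after hvmloader's own entries and growing the header's Length field. A sketch under the assumptions that both tables use the standard 36-byte ACPI header and that the host is little-endian like the x86 guest (function names are illustrative, not hvmloader's real API):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ACPI_HDR_LEN 36   /* standard ACPI table header size */

/* Number of 64-bit entries in an XSDT, from its Length field. */
static uint32_t xsdt_entry_count(const uint8_t *xsdt)
{
    uint32_t len;
    memcpy(&len, xsdt + 4, 4);               /* Length, little-endian */
    return (len - ACPI_HDR_LEN) / 8;
}

/* Append every table pointer from src_xsdt after dst_xsdt's own entries
 * and grow dst's Length field.  dst_xsdt must have room for the extra
 * entries, and the caller must recompute dst's checksum afterwards. */
static void xsdt_copy_entries(uint8_t *dst_xsdt, const uint8_t *src_xsdt)
{
    uint32_t src_n = xsdt_entry_count(src_xsdt);
    uint32_t dst_len;
    memcpy(&dst_len, dst_xsdt + 4, 4);

    memcpy(dst_xsdt + dst_len, src_xsdt + ACPI_HDR_LEN, src_n * 8);
    dst_len += src_n * 8;
    memcpy(dst_xsdt + 4, &dst_len, 4);
}
```

Proceeding incrementally then means filtering which of QEMU's entries get copied, starting with just the NFIT and the NVDIMM SSDT.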
On 10/17/17 13:45 +0200, Paolo Bonzini wrote: > On 14/10/2017 00:46, Stefano Stabellini wrote: > > On Fri, 13 Oct 2017, Jan Beulich wrote: > >>>>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > >>> To Jan, Andrew, Stefano and Anthony, > >>> > >>> what do you think about allowing QEMU to build the entire guest ACPI > >>> and letting SeaBIOS to load it? ACPI builder code in hvmloader is > >>> still there and just bypassed in this case. > >> Well, if that can be made work in a non-quirky way and without > >> loss of functionality, I'd probably be fine. I do think, however, > >> that there's a reason this is being handled in hvmloader right now. > > And not to discourage you, just as a clarification, you'll also need to > > consider backward compatibility: unless the tables are identical, I > > imagine we'll have to keep using the old tables for already installed > > virtual machines. > > I agree. Some of them are already identical, some are not but the QEMU > version should be okay, and for yet more it's probably better to keep > the Xen-specific parts in hvmloader. > > The good thing is that it's possible to proceed incrementally once you > have the hvmloader support for merging the QEMU and hvmloader RSDT or > XSDT (whatever you are using), starting with just NVDIMM and proceeding > later with whatever you see fit. > I'll have a try to check how much the differences would affect. If it would not take too much work, I'd like to adapt Xen NVDIMM enabling patches to the all QEMU built ACPI. Otherwise, I'll fall back to Paolo and MST's suggestions. Thanks, Haozhong
On Tue, Oct 17, 2017 at 08:16:47PM +0800, Haozhong Zhang wrote: > On 10/17/17 13:45 +0200, Paolo Bonzini wrote: > > On 14/10/2017 00:46, Stefano Stabellini wrote: > > > On Fri, 13 Oct 2017, Jan Beulich wrote: > > >>>>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > > >>> To Jan, Andrew, Stefano and Anthony, > > >>> > > >>> what do you think about allowing QEMU to build the entire guest ACPI > > >>> and letting SeaBIOS to load it? ACPI builder code in hvmloader is > > >>> still there and just bypassed in this case. > > >> Well, if that can be made work in a non-quirky way and without > > >> loss of functionality, I'd probably be fine. I do think, however, > > >> that there's a reason this is being handled in hvmloader right now. > > > And not to discourage you, just as a clarification, you'll also need to > > > consider backward compatibility: unless the tables are identical, I > > > imagine we'll have to keep using the old tables for already installed > > > virtual machines. > > > > I agree. Some of them are already identical, some are not but the QEMU > > version should be okay, and for yet more it's probably better to keep > > the Xen-specific parts in hvmloader. > > > > The good thing is that it's possible to proceed incrementally once you > > have the hvmloader support for merging the QEMU and hvmloader RSDT or > > XSDT (whatever you are using), starting with just NVDIMM and proceeding > > later with whatever you see fit. > > > > I'll have a try to check how much the differences would affect. If it > would not take too much work, I'd like to adapt Xen NVDIMM enabling > patches to the all QEMU built ACPI. Otherwise, I'll fall back to Paolo > and MST's suggestions. I don't agree with the end goal of fully switching to the QEMU-built ACPI tables. First of all, the only entity that has all the information about the guest is the toolstack, and so it should be the one in control of the ACPI tables.
Also, Xen guests can use several device models concurrently (via the ioreq server interface), and each should be able to contribute to the information presented in the ACPI tables. Intel is also working on adding IOMMU emulation to the Xen hypervisor, in which case the vIOMMU ACPI tables should be created by the toolstack and not QEMU. And finally keep in mind that there are Xen guests (PVH) that use ACPI tables but not QEMU. Thanks, Roger.
On 18/10/2017 10:32, Roger Pau Monné wrote: >> I'll have a try to check how much the differences would affect. If it >> would not take too much work, I'd like to adapt Xen NVDIMM enabling >> patches to the all QEMU built ACPI. Otherwise, I'll fall back to Paolo >> and MST's suggestions. > I don't agree with the end goal of fully switching to the QEMU build > ACPI tables. First of all, the only entity that has all the > information about the guest it's the toolstack, and so it should be > the one in control of the ACPI tables. > > Also, Xen guests can use several device models concurrently (via the > ioreq server interface), and each should be able to contribute to the > information presented in the ACPI tables. Intel is also working on > adding IOMMU emulation to the Xen hypervisor, in which case the vIOMMU > ACPI tables should be created by the toolstack and not QEMU. And > finally keep in mind that there are Xen guests (PVH) that use ACPI > tables but not QEMU. I agree with this in fact; QEMU has a view of _most_ of the emulated hardware, but not all. However, I disagree that the toolstack should be alone in controlling the ACPI tables; rather, each involved part of the stack should be providing its own part of the tables. For example, QEMU (in addition to NVDIMM information) should be the one providing an SSDT for southbridge devices (floppy, COMx, LPTx, etc.). The Xen stack (or more likely, hvmloader itself) would provide all the bits that are provided by the hypervisor (MADT for the IOAPIC, another SSDT for the HPET and RTC, DMAR tables for IOMMU, and so on). This should also work just fine for PVH. Of course backwards compatibility is the enemy of simplification, but in the end things _should_ actually be simpler and I think it's a good idea if a prerequisite for Xen vNVDIMM is to move AML code for QEMU devices out of hvmloader. Paolo
On Wed, Oct 18, 2017 at 10:46:57AM +0200, Paolo Bonzini wrote: > On 18/10/2017 10:32, Roger Pau Monné wrote: > >> I'll have a try to check how much the differences would affect. If it > >> would not take too much work, I'd like to adapt Xen NVDIMM enabling > >> patches to the all QEMU built ACPI. Otherwise, I'll fall back to Paolo > >> and MST's suggestions. > > I don't agree with the end goal of fully switching to the QEMU build > > ACPI tables. First of all, the only entity that has all the > > information about the guest it's the toolstack, and so it should be > > the one in control of the ACPI tables. > > > > Also, Xen guests can use several device models concurrently (via the > > ioreq server interface), and each should be able to contribute to the > > information presented in the ACPI tables. Intel is also working on > > adding IOMMU emulation to the Xen hypervisor, in which case the vIOMMU > > ACPI tables should be created by the toolstack and not QEMU. And > > finally keep in mind that there are Xen guests (PVH) that use ACPI > > tables but not QEMU. > > I agree with this in fact; QEMU has a view of _most_ of the emulated > hardware, but not all. > > However, I disagree that the toolstack should be alone in controlling > the ACPI tables; rather, each involved part of the stack should be > providing its own part of the tables. For example, QEMU (in addition to > NVDIMM information) should be the one providing an SSDT for southbridge > devices (floppy, COMx, LPTx, etc.). Yes, that's what I wanted to say, rather than the toolstack providing all the ACPI tables by itself. Every component should provide the tables of the devices under its control, and that should be glued together by the toolstack (i.e. hvmloader). Roger.
On Fri, Oct 13, 2017 at 03:53:26PM +0800, Haozhong Zhang wrote: > On 10/12/17 17:45 +0200, Paolo Bonzini wrote: > > On 12/10/2017 14:45, Haozhong Zhang wrote: > > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > > > /rom@etc/table-loader. The former is unstructured to guest, and > > > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > > on KVM/QEMU) to relocate data in the former file, recalculate checksum > > > of specified area, and fill guest address in specified ACPI field. > > > > > > One part of my patches is to implement a mechanism to tell Xen which > > > part of ACPI data is a table (NFIT), and which part defines a > > > namespace device and what the device name is. I can add two new loader > > > commands for them respectively. > > > > > > Because they just provide information and SeaBIOS in non-xen > > > environment ignores unrecognized commands, they will not break SeaBIOS > > > in non-xen environment. > > > > > > On QEMU side, most Xen-specific hacks in ACPI builder could be > > > dropped, and replaced by adding the new loader commands (though they > > > may be used only by Xen). > > > > > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > > are needed in, perhaps, hvmloader. > > > > If Xen has to parse BIOSLinkerLoader, it can use the existing commands > > to process a reduced set of ACPI tables. In other words, > > etc/acpi/tables would only include the NFIT, the SSDT with namespace > > devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. > > > > hvmloader can then: > > > > 1) allocate some memory for where the XSDT will go > > > > 2) process the BIOSLinkerLoader like SeaBIOS would do > > > > 3) find the RSDP in low memory, since the loader script must have placed > > it there. 
If it cannot find it, allocate some low memory, fill it with > > the RSDP header and revision, and and jump to step 6 > > > > 4) If it found QEMU's RSDP, use it to find QEMU's XSDT > > > > 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT. > > > > 6) build hvmloader tables and link them into the RSDT and/or XSDT as usual. > > > > 7) overwrite the RSDP in low memory with a pointer to hvmloader's own > > RSDT and/or XSDT, and updated the checksums > > > > QEMU's XSDT remains there somewhere in memory, unused but harmless. > > > > It can work for plan tables which do not contain AML. > > However, for a namespace device, Xen needs to know its name in order > to detect the potential name conflict with those used in Xen built > ACPI. Xen does not (and is not going to) introduce an AML parser, so > it cannot get those device names from QEMU built ACPI by its own. > > The idea of either this patch series or the new BIOSLinkerLoader > command is to let QEMU tell Xen where the definition body of a > namespace device (i.e. that part within the outmost "Device(NAME)") is > and what the device name is. Xen, after the name conflict check, can > re-package the definition body in a namespace device (w/ minimal AML > builder code added in Xen) and then in SSDT. > > > Haozhong You most likely can do this without a new command. You can use something similiar to build_append_named_dword in combination with BIOS_LINKER_LOADER_COMMAND_ADD_POINTER like vm gen id does. -- MST
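MST's suggestion relies on BIOS_LINKER_LOADER_COMMAND_ADD_POINTER, whose documented guest-side semantics are: the patched field initially holds an offset within the pointee blob, and the executor adds the guest address the pointee was loaded at, turning it into an absolute pointer. A simplified sketch of what a hvmloader executor would do per command (the fw_cfg file-name lookup that selects the two blobs is omitted; little-endian host assumed, which is fine for x86 guests):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Execute one ADD_POINTER command: read the pointer_size-byte field at
 * pointer_offset in the pointer blob (it holds an offset into the
 * pointee blob), add the pointee's guest load address, write it back. */
static void linker_add_pointer(uint8_t *pointer_file, uint32_t pointer_offset,
                               unsigned pointer_size /* 1, 2, 4 or 8 */,
                               uint64_t pointee_addr)
{
    uint64_t val = 0;
    memcpy(&val, pointer_file + pointer_offset, pointer_size);
    val += pointee_addr;
    memcpy(pointer_file + pointer_offset, &val, pointer_size);
}
```

With this in place, QEMU can publish the address of a small buffer (as vmgenid does) and the guest firmware patches the ACPI blob without any new command types.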
On Thu, Oct 12, 2017 at 08:45:44PM +0800, Haozhong Zhang wrote: > On 10/10/17 12:05 -0400, Konrad Rzeszutek Wilk wrote: > > On Tue, Sep 12, 2017 at 11:15:09AM +0800, Haozhong Zhang wrote: > > > On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > > > > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > > > > > > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > > > > On Mon, 11 Sep 2017 12:41:47 +0800 > > > > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > > > > > > > This is the QEMU part patches that works with the associated Xen > > > > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > > > > guest address space for vNVDIMM devices. > > > > > > > > > > > > All patches can be found at > > > > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > > > > memory region for Xen guest, in order to make the existing nvdimm > > > > > > device plugging path work on Xen. > > > > > > > > > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > > > > > used as the Xen device model. > > > > > > > > > > I've skimmed over patch-set and can't say that I'm happy with > > > > > number of xen_enabled() invariants it introduced as well as > > > > > with partial blobs it creates. > > > > > > > > I have not read the series (Haozhong, please CC me, Anthony and > > > > xen-devel to the whole series next time), but yes, indeed. Let's not add > > > > more xen_enabled() if possible. > > > > > > > > Haozhong, was there a design document thread on xen-devel about this? 
If > > > > so, did it reach a conclusion? Was the design accepted? If so, please > > > > add a link to the design doc in the introductory email, so that > > > > everybody can read it and be on the same page. > > > > > > Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed > > > the guest ACPI. > > > > > > [1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html > > > > Igor, did you have a chance to read it? > > > > .. see below > > > > > > > > > > > > > > > > I'd like to reduce above and a way to do this might be making xen > > > > > 1. use fw_cfg > > > > > 2. fetch QEMU build acpi tables from fw_cfg > > > > > 3. extract nvdim tables (which is trivial) and use them > > > > > > > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > > > > > > > So what's stopping xen from using it elsewhere?, > > > > > instead of adding more xen specific code to do 'the same' > > > > > job and not reusing/sharing common code with tcg/kvm. > > > > > > > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > > > > rely on a firmware-like application called "hvmloader" that runs in > > > > guest context and generates the ACPI tables. I have no opinions on > > > > hvmloader and I'll let the Xen maintainers talk about it. However, keep > > > > in mind that with an HVM guest some devices are emulated by Xen and/or > > > > by other device emulators that can run alongside QEMU. QEMU doesn't have > > > > a full few of the system. > > > > > > > > Here the question is: does it have to be QEMU the one to generate the > > > > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > > > > like the rest, instead of introducing this split-brain design about > > > > ACPI. We need to see a design doc to fully understand this. > > > > > > > > > > hvmloader runs in the guest and is responsible to build/load guest > > > ACPI. 
However, it's not capable to build AML at runtime (for the lack > > > of AML builder). If any guest ACPI object is needed (e.g. by guest > > > DSDT), it has to be generated from ASL by iasl at Xen compile time and > > > then be loaded by hvmloader at runtime. > > > > > > Xen includes an OperationRegion "BIOS" in the static generated guest > > > DSDT, whose address is hardcoded and which contains a list of values > > > filled by hvmloader at runtime. Other ACPI objects can refer to those > > > values (e.g., the number of vCPUs). But it's not enough for generating > > > guest NVDIMM ACPI objects at compile time and then being customized > > > and loaded by hvmload, because its structure (i.e., the number of > > > namespace devices) cannot be decided util the guest config is known. > > > > > > Alternatively, we may introduce an AML builder in hvmloader and build > > > all guest ACPI completely in hvmloader. Looking at the similar > > > implementation in QEMU, it would not be small, compared to the current > > > size of hvmloader. Besides, I'm still going to let QEMU handle guest > > > NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to > > > build NVDIMM ACPI. > > > > > > > If the design doc thread led into thinking that it has to be QEMU to > > > > generate them, then would it make the code nicer if we used fw_cfg to > > > > get the (full or partial) tables from QEMU, as Igor suggested? > > > > > > I'll have a look at the code (which I didn't notice) pointed by Igor. > > > > And there is a spec too! > > > > https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt > > > > Igor, did you have in mind to use FW_CFG_FILE_DIR to retrieve the > > ACPI AML code? > > > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > /rom@etc/table-loader. The former is unstructured to guest, and > contains all data of guest ACPI. 
The latter is a BIOSLinkerLoader > organized as a set of commands, which direct the guest (e.g., SeaBIOS > on KVM/QEMU) to relocate data in the former file, recalculate the checksum > of a specified area, and fill a guest address into a specified ACPI field. > > One part of my patches is to implement a mechanism to tell Xen which > part of ACPI data is a table (NFIT), and which part defines a > namespace device and what the device name is. I can add two new loader > commands for them respectively. <nods> > > Because they just provide information and SeaBIOS in a non-Xen > environment ignores unrecognized commands, they will not break SeaBIOS > in a non-Xen environment. > > On the QEMU side, most Xen-specific hacks in the ACPI builder could be Wooot! > dropped, and replaced by adding the new loader commands (though they > may be used only by Xen). And eventually all of the hvmloader built-in ACPI code could be dropped in favor of the loader commands? > > On the Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > are needed in, perhaps, hvmloader. <nods> > > > Haozhong >
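For reference, the checksum recalculation that one of these loader commands directs the guest to perform is simple: an ACPI table carries a checksum byte chosen so that all bytes of the table sum to zero modulo 256. A minimal sketch (the helper name is hypothetical, not from QEMU or hvmloader):

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/*
 * Compute the value to store in an ACPI table's checksum byte so that
 * the bytes of the whole table sum to zero modulo 256.  A loader
 * checksum command would tell the guest which region to sum and which
 * offset within it holds the checksum byte.
 */
static uint8_t acpi_checksum(const uint8_t *table, size_t len, size_t cksum_off)
{
    unsigned sum = 0;

    for (size_t i = 0; i < len; i++) {
        if (i != cksum_off) {        /* skip the checksum byte itself */
            sum += table[i];
        }
    }
    /* value that makes the total sum 0 mod 256 */
    return (uint8_t)(0x100 - (sum & 0xff));
}
```

Any guest-side executor, whether SeaBIOS or hvmloader, would apply the same arithmetic after relocating the table data.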
On 10/12/17 13:39 -0400, Konrad Rzeszutek Wilk wrote: > On Thu, Oct 12, 2017 at 08:45:44PM +0800, Haozhong Zhang wrote: > > On 10/10/17 12:05 -0400, Konrad Rzeszutek Wilk wrote: > > > On Tue, Sep 12, 2017 at 11:15:09AM +0800, Haozhong Zhang wrote: > > > > On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > > > > > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > > > > > > > > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > > > > > On Mon, 11 Sep 2017 12:41:47 +0800 > > > > > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > > > > > > > > > These are the QEMU part patches that work with the associated Xen > > > > > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > > > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > > > > > guest address space for vNVDIMM devices. > > > > > > > > > > > > > > All patches can be found at > > > > > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > > > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > > > > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > > > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > > > > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > > > > > memory region for Xen guest, in order to make the existing nvdimm > > > > > > > device plugging path work on Xen. > > > > > > > > > > > > > > Patch 4 - 10 build and copy NFIT from QEMU to Xen guest, when QEMU is > > > > > > > used as the Xen device model. > > > > > > > > > > > > I've skimmed over the patch-set and can't say that I'm happy with > > > > > > number of xen_enabled() invariants it introduced as well as > > > > > > with partial blobs it creates. > > > > > > > > > > I have not read the series (Haozhong, please CC me, Anthony and > > > > > xen-devel to the whole series next time), but yes, indeed.
Let's not add > > > > > more xen_enabled() if possible. > > > > > > > > > > Haozhong, was there a design document thread on xen-devel about this? If > > > > > so, did it reach a conclusion? Was the design accepted? If so, please > > > > > add a link to the design doc in the introductory email, so that > > > > > everybody can read it and be on the same page. > > > > > > > > Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed > > > > the guest ACPI. > > > > > > > > [1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html > > > > > > Igor, did you have a chance to read it? > > > > > > .. see below > > > > > > > > > > > > > > > > > > > > I'd like to reduce the above, and a way to do this might be making xen > > > > > > 1. use fw_cfg > > > > > > 2. fetch QEMU-built acpi tables from fw_cfg > > > > > > 3. extract nvdimm tables (which is trivial) and use them > > > > > > > > > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > > > > > > > > > So what's stopping xen from using it elsewhere, > > > > > > instead of adding more xen specific code to do 'the same' > > > > > > job and not reusing/sharing common code with tcg/kvm? > > > > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > > > > > rely on a firmware-like application called "hvmloader" that runs in > > > > > guest context and generates the ACPI tables. I have no opinions on > > > > > hvmloader and I'll let the Xen maintainers talk about it. However, keep > > > > > in mind that with an HVM guest some devices are emulated by Xen and/or > > > > > by other device emulators that can run alongside QEMU. QEMU doesn't have > > > > > a full view of the system. > > > > > > > > > > Here the question is: does QEMU have to be the one to generate the > > > > > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > > > > > like the rest, instead of introducing this split-brain design about > > > > > ACPI.
We need to see a design doc to fully understand this. > > > > > > > > > > > > > hvmloader runs in the guest and is responsible for building/loading guest > > > > ACPI. However, it's not capable of building AML at runtime (for the lack > > > > of an AML builder). If any guest ACPI object is needed (e.g. by guest > > > > DSDT), it has to be generated from ASL by iasl at Xen compile time and > > > > then be loaded by hvmloader at runtime. > > > > > > > > Xen includes an OperationRegion "BIOS" in the statically generated guest > > > > DSDT, whose address is hardcoded and which contains a list of values > > > > filled by hvmloader at runtime. Other ACPI objects can refer to those > > > > values (e.g., the number of vCPUs). But it's not enough for generating > > > > guest NVDIMM ACPI objects at compile time and then being customized > > > > and loaded by hvmloader, because their structure (i.e., the number of > > > > namespace devices) cannot be decided until the guest config is known. > > > > > > > > Alternatively, we may introduce an AML builder in hvmloader and build > > > > all guest ACPI completely in hvmloader. Looking at the similar > > > > implementation in QEMU, it would not be small, compared to the current > > > > size of hvmloader. Besides, I'm still going to let QEMU handle guest > > > > NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to > > > > build NVDIMM ACPI. > > > > > > > > > If the design doc thread led into thinking that it has to be QEMU to > > > > > generate them, then would it make the code nicer if we used fw_cfg to > > > > > get the (full or partial) tables from QEMU, as Igor suggested? > > > > > > > > I'll have a look at the code (which I didn't notice) pointed to by Igor. > > > > > > And there is a spec too! > > > > > > https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt > > > > > > Igor, did you have in mind to use FW_CFG_FILE_DIR to retrieve the > > > ACPI AML code?
> > > > > > > Basically, QEMU builds two ROMs for the guest, /rom@etc/acpi/tables and > > /rom@etc/table-loader. The former is unstructured to the guest, and > > contains all data of the guest ACPI. The latter is a BIOSLinkerLoader > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > on KVM/QEMU) to relocate data in the former file, recalculate the checksum > > of a specified area, and fill a guest address into a specified ACPI field. > > > > One part of my patches is to implement a mechanism to tell Xen which > > part of ACPI data is a table (NFIT), and which part defines a > > namespace device and what the device name is. I can add two new loader > > commands for them respectively. > > <nods> > > > > Because they just provide information and SeaBIOS in a non-Xen > > environment ignores unrecognized commands, they will not break SeaBIOS > > in a non-Xen environment. > > > > On the QEMU side, most Xen-specific hacks in the ACPI builder could be > > Wooot! > > dropped, and replaced by adding the new loader commands (though they > > may be used only by Xen). > > And eventually all of the hvmloader built-in ACPI code could be dropped > in favor of the loader commands? If Xen is going to rely on QEMU to build the entire ACPI for an HVM guest, then there would be no need to check the signature/name conflict, so the new BIOSLinkerLoader commands here would not be necessary in that case (or only for backwards compatibility). I don't know how much work would be needed, and that could be another project. Haozhong > > > > > On the Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > are needed in, perhaps, hvmloader. > > <nods> > > > > > > Haozhong > > >
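To illustrate the command-executor idea discussed above: the /rom@etc/table-loader blob is a sequence of fixed-size entries that a guest-side executor walks one by one, acting on the commands it recognizes and skipping the rest, which is how SeaBIOS tolerates new commands such as the two proposed here. The sketch below uses a deliberately simplified, hypothetical entry layout (the real layout is defined by QEMU's BIOSLinkerLoader code, not reproduced here); only the dispatch-and-skip structure is the point.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical, reduced command set; QEMU's real command values and
 * entry layout differ. */
enum {
    LDR_ALLOCATE     = 1,   /* load a fw_cfg file into guest memory */
    LDR_ADD_POINTER  = 2,   /* patch a guest address into a loaded blob */
    LDR_ADD_CHECKSUM = 3,   /* recompute a checksum over a region */
};

struct ldr_entry {
    uint32_t command;       /* one of the LDR_* values, or unknown */
    char     file[56];      /* fw_cfg file name the command refers to */
    uint32_t arg[3];        /* command-specific offsets/sizes */
};

/*
 * Walk the command stream, counting each recognized command; a real
 * executor would instead allocate, patch pointers, and fix checksums
 * as directed.  Unrecognized command values are silently skipped,
 * mirroring how the guest firmware stays compatible with loader
 * commands it does not know about.
 */
static void ldr_walk(const struct ldr_entry *e, size_t n, unsigned cnt[4])
{
    for (size_t i = 0; i < n; i++) {
        if (e[i].command >= LDR_ALLOCATE && e[i].command <= LDR_ADD_CHECKSUM) {
            cnt[e[i].command]++;
        }
    }
}
```

An hvmloader-side executor would follow the same loop after reading the blob through a fw_cfg driver, which is the remaining piece the thread identifies as needed on the Xen side.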