These are the QEMU-side patches that work with the associated Xen patches to enable vNVDIMM support for Xen HVM domains. Xen relies on QEMU to build guest NFIT and NVDIMM namespace devices, and to allocate guest address space for vNVDIMM devices.

All patches can be found at
Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3
QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3

Patch 1 avoids dereferencing the NULL pointer to non-existing label data, as the Xen-side support for labels is not implemented yet.

Patches 2 & 3 add a memory backend dedicated for Xen usage and a hotplug memory region for Xen guests, in order to make the existing nvdimm device plugging path work on Xen.

Patches 4-10 build and copy the NFIT from QEMU to the Xen guest, when QEMU is used as the Xen device model.

Haozhong Zhang (10):
  nvdimm: do not intiailize nvdimm->label_data if label size is zero
  hw/xen-hvm: create the hotplug memory region on Xen
  hostmem-xen: add a host memory backend for Xen
  nvdimm acpi: do not use fw_cfg on Xen
  hw/xen-hvm: initialize DM ACPI
  hw/xen-hvm: add function to copy ACPI into guest memory
  nvdimm acpi: copy NFIT to Xen guest
  nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest
  nvdimm acpi: do not build _FIT method on Xen
  hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled

 backends/Makefile.objs | 1 +
 backends/hostmem-xen.c | 108 ++++++++++++++++++++++++++
 backends/hostmem.c | 9 +++
 hw/acpi/aml-build.c | 10 ++-
 hw/acpi/nvdimm.c | 79 ++++++++++++++-----
 hw/i386/pc.c | 102 ++++++++++++++-----------
 hw/i386/xen/xen-hvm.c | 204 ++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/mem/nvdimm.c | 10 ++-
 hw/mem/pc-dimm.c | 6 +-
 include/hw/i386/pc.h | 1 +
 include/hw/xen/xen.h | 25 ++++++
 stubs/xen-hvm.c | 10 +++
 12 files changed, 495 insertions(+), 70 deletions(-)
 create mode 100644 backends/hostmem-xen.c

--
2.11.0
Hi, This series seems to have some coding style problems. See output below for more information: Subject: [Qemu-devel] [RFC QEMU PATCH v3 00/10] Implement vNVDIMM for Xen HVM guest Message-id: 20170911044157.15403-1-haozhong.zhang@intel.com Type: series === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 git config --local diff.renamelimit 0 git config --local diff.renames True commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/20170911044157.15403-1-haozhong.zhang@intel.com -> patchew/20170911044157.15403-1-haozhong.zhang@intel.com Switched to a new branch 'test' d5f5b8faf2 hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled 73b52971f5 nvdimm acpi: do not build _FIT method on Xen 1c6eeac40e nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest f2d6097366 nvdimm acpi: copy NFIT to Xen guest 69ddac3d65 hw/xen-hvm: add function to copy ACPI into guest memory cae88474b2 hw/xen-hvm: initialize DM ACPI 23a0e4204a nvdimm acpi: do not use fw_cfg on Xen e298be5d96 hostmem-xen: add a host memory backend for Xen f069bbb659 hw/xen-hvm: create the hotplug memory region on Xen 69b6b6e9fa nvdimm: do not intiailize nvdimm->label_data if label size is zero === OUTPUT BEGIN === Checking PATCH 1/10: nvdimm: do not intiailize nvdimm->label_data if label size is zero... ERROR: braces {} are necessary for all arms of this statement #33: FILE: hw/mem/nvdimm.c:97: + if (nvdimm->label_size) [...] total: 1 errors, 0 warnings, 16 lines checked Your patch has style problems, please review. 
If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Checking PATCH 2/10: hw/xen-hvm: create the hotplug memory region on Xen... ERROR: braces {} are necessary for all arms of this statement #29: FILE: hw/i386/pc.c:1356: + if (!pcmc->has_reserved_memory || machine->ram_size >= machine->maxram_size) [...] total: 1 errors, 0 warnings, 113 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Checking PATCH 3/10: hostmem-xen: add a host memory backend for Xen... Checking PATCH 4/10: nvdimm acpi: do not use fw_cfg on Xen... Checking PATCH 5/10: hw/xen-hvm: initialize DM ACPI... Checking PATCH 6/10: hw/xen-hvm: add function to copy ACPI into guest memory... Checking PATCH 7/10: nvdimm acpi: copy NFIT to Xen guest... Checking PATCH 8/10: nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest... Checking PATCH 9/10: nvdimm acpi: do not build _FIT method on Xen... Checking PATCH 10/10: hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled... === OUTPUT END === Test command exited with code: 1 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-devel@freelists.org
On Mon, 11 Sep 2017 12:41:47 +0800 Haozhong Zhang <haozhong.zhang@intel.com> wrote: > This is the QEMU part patches that works with the associated Xen > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > guest address space for vNVDIMM devices. > > All patches can be found at > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > label data, as the Xen side support for labels is not implemented yet. > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > memory region for Xen guest, in order to make the existing nvdimm > device plugging path work on Xen. > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > used as the Xen device model. I've skimmed over the patch set and can't say that I'm happy with the number of xen_enabled() invariants it introduces, as well as with the partial blobs it creates. I'd like to reduce the above, and a way to do this might be making Xen 1. use fw_cfg 2. fetch QEMU-built ACPI tables from fw_cfg 3. extract the nvdimm tables (which is trivial) and use them Looking at xen_load_linux(), it seems possible to use fw_cfg. So what's stopping Xen from using it elsewhere, instead of adding more Xen-specific code to do 'the same' job without reusing/sharing common code with tcg/kvm?
> Haozhong Zhang (10): > nvdimm: do not intiailize nvdimm->label_data if label size is zero > hw/xen-hvm: create the hotplug memory region on Xen > hostmem-xen: add a host memory backend for Xen > nvdimm acpi: do not use fw_cfg on Xen > hw/xen-hvm: initialize DM ACPI > hw/xen-hvm: add function to copy ACPI into guest memory > nvdimm acpi: copy NFIT to Xen guest > nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest > nvdimm acpi: do not build _FIT method on Xen > hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled > > backends/Makefile.objs | 1 + > backends/hostmem-xen.c | 108 ++++++++++++++++++++++++++ > backends/hostmem.c | 9 +++ > hw/acpi/aml-build.c | 10 ++- > hw/acpi/nvdimm.c | 79 ++++++++++++++----- > hw/i386/pc.c | 102 ++++++++++++++----------- > hw/i386/xen/xen-hvm.c | 204 ++++++++++++++++++++++++++++++++++++++++++++++++- > hw/mem/nvdimm.c | 10 ++- > hw/mem/pc-dimm.c | 6 +- > include/hw/i386/pc.h | 1 + > include/hw/xen/xen.h | 25 ++++++ > stubs/xen-hvm.c | 10 +++ > 12 files changed, 495 insertions(+), 70 deletions(-) > create mode 100644 backends/hostmem-xen.c >
CC'ing xen-devel, and the Xen tools and x86 maintainers. On Mon, 11 Sep 2017, Igor Mammedov wrote: > On Mon, 11 Sep 2017 12:41:47 +0800 > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > This is the QEMU part patches that works with the associated Xen > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > guest address space for vNVDIMM devices. > > > > All patches can be found at > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > label data, as the Xen side support for labels is not implemented yet. > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > memory region for Xen guest, in order to make the existing nvdimm > > device plugging path work on Xen. > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > used as the Xen device model. > > I've skimmed over patch-set and can't say that I'm happy with > number of xen_enabled() invariants it introduced as well as > with partial blobs it creates. I have not read the series (Haozhong, please CC me, Anthony and xen-devel to the whole series next time), but yes, indeed. Let's not add more xen_enabled() if possible. Haozhong, was there a design document thread on xen-devel about this? If so, did it reach a conclusion? Was the design accepted? If so, please add a link to the design doc in the introductory email, so that everybody can read it and be on the same page. > I'd like to reduce above and a way to do this might be making xen > 1. use fw_cfg > 2. fetch QEMU build acpi tables from fw_cfg > 3. extract nvdim tables (which is trivial) and use them > > looking at xen_load_linux(), it seems possible to use fw_cfg. 
> > So what's stopping xen from using it elsewhere?, > instead of adding more xen specific code to do 'the same' > job and not reusing/sharing common code with tcg/kvm. So far, ACPI tables have not been generated by QEMU. Xen HVM machines rely on a firmware-like application called "hvmloader" that runs in guest context and generates the ACPI tables. I have no opinions on hvmloader and I'll let the Xen maintainers talk about it. However, keep in mind that with an HVM guest some devices are emulated by Xen and/or by other device emulators that can run alongside QEMU. QEMU doesn't have a full view of the system. Here the question is: does it have to be QEMU that generates the ACPI blobs for the nvdimm? It would be nicer if it were up to hvmloader like the rest, instead of introducing this split-brain design about ACPI. We need to see a design doc to fully understand this. If the design doc thread led to thinking that it has to be QEMU that generates them, then would it make the code nicer if we used fw_cfg to get the (full or partial) tables from QEMU, as Igor suggested?
> > Haozhong Zhang (10): > > nvdimm: do not intiailize nvdimm->label_data if label size is zero > > hw/xen-hvm: create the hotplug memory region on Xen > > hostmem-xen: add a host memory backend for Xen > > nvdimm acpi: do not use fw_cfg on Xen > > hw/xen-hvm: initialize DM ACPI > > hw/xen-hvm: add function to copy ACPI into guest memory > > nvdimm acpi: copy NFIT to Xen guest > > nvdimm acpi: copy ACPI namespace device of vNVDIMM to Xen guest > > nvdimm acpi: do not build _FIT method on Xen > > hw/xen-hvm: enable building DM ACPI if vNVDIMM is enabled > > > > backends/Makefile.objs | 1 + > > backends/hostmem-xen.c | 108 ++++++++++++++++++++++++++ > > backends/hostmem.c | 9 +++ > > hw/acpi/aml-build.c | 10 ++- > > hw/acpi/nvdimm.c | 79 ++++++++++++++----- > > hw/i386/pc.c | 102 ++++++++++++++----------- > > hw/i386/xen/xen-hvm.c | 204 ++++++++++++++++++++++++++++++++++++++++++++++++- > > hw/mem/nvdimm.c | 10 ++- > > hw/mem/pc-dimm.c | 6 +- > > include/hw/i386/pc.h | 1 + > > include/hw/xen/xen.h | 25 ++++++ > > stubs/xen-hvm.c | 10 +++ > > 12 files changed, 495 insertions(+), 70 deletions(-) > > create mode 100644 backends/hostmem-xen.c > > >
On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > On Mon, 11 Sep 2017 12:41:47 +0800 > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > This is the QEMU part patches that works with the associated Xen > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > guest address space for vNVDIMM devices. > > > > > > All patches can be found at > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > memory region for Xen guest, in order to make the existing nvdimm > > > device plugging path work on Xen. > > > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > > used as the Xen device model. > > > > I've skimmed over patch-set and can't say that I'm happy with > > number of xen_enabled() invariants it introduced as well as > > with partial blobs it creates. > > I have not read the series (Haozhong, please CC me, Anthony and > xen-devel to the whole series next time), but yes, indeed. Let's not add > more xen_enabled() if possible. > > Haozhong, was there a design document thread on xen-devel about this? If > so, did it reach a conclusion? Was the design accepted? If so, please > add a link to the design doc in the introductory email, so that > everybody can read it and be on the same page. Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed the guest ACPI. 
[1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html > > > > I'd like to reduce above and a way to do this might be making xen > > 1. use fw_cfg > > 2. fetch QEMU build acpi tables from fw_cfg > > 3. extract nvdim tables (which is trivial) and use them > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > So what's stopping xen from using it elsewhere?, > > instead of adding more xen specific code to do 'the same' > > job and not reusing/sharing common code with tcg/kvm. > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > rely on a firmware-like application called "hvmloader" that runs in > guest context and generates the ACPI tables. I have no opinions on > hvmloader and I'll let the Xen maintainers talk about it. However, keep > in mind that with an HVM guest some devices are emulated by Xen and/or > by other device emulators that can run alongside QEMU. QEMU doesn't have > a full few of the system. > > Here the question is: does it have to be QEMU the one to generate the > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > like the rest, instead of introducing this split-brain design about > ACPI. We need to see a design doc to fully understand this. > hvmloader runs in the guest and is responsible for building/loading guest ACPI. However, it is not capable of building AML at runtime (for lack of an AML builder). If any guest ACPI object is needed (e.g. by the guest DSDT), it has to be generated from ASL by iasl at Xen compile time and then loaded by hvmloader at runtime. Xen includes an OperationRegion "BIOS" in the statically generated guest DSDT, whose address is hardcoded and which contains a list of values filled by hvmloader at runtime. Other ACPI objects can refer to those values (e.g., the number of vCPUs).
But this is not enough for generating guest NVDIMM ACPI objects at compile time and then customizing and loading them via hvmloader, because their structure (i.e., the number of namespace devices) cannot be decided until the guest config is known. Alternatively, we may introduce an AML builder in hvmloader and build all guest ACPI completely in hvmloader. Looking at the similar implementation in QEMU, it would not be small, compared to the current size of hvmloader. Besides, I'm still going to let QEMU handle guest NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to build the NVDIMM ACPI. > If the design doc thread led into thinking that it has to be QEMU to > generate them, then would it make the code nicer if we used fw_cfg to > get the (full or partial) tables from QEMU, as Igor suggested? I'll have a look at the code (which I didn't notice) pointed out by Igor. One possible issue with using fw_cfg is how to avoid conflicts between ACPI built by QEMU and ACPI built by hvmloader (e.g., both may use the same table signature / device names / ...). In my current design, QEMU will pass the table signatures and device names used in its ACPI to Xen, and Xen can check for conflicts with its own ACPI. Perhaps we can add the necessary functions in fw_cfg as well. Anyway, let me first look at the code. Thanks, Haozhong
On Tue, Sep 12, 2017 at 11:15:09AM +0800, Haozhong Zhang wrote: > On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > > On Mon, 11 Sep 2017 12:41:47 +0800 > > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > > > This is the QEMU part patches that works with the associated Xen > > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > > guest address space for vNVDIMM devices. > > > > > > > > All patches can be found at > > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > > memory region for Xen guest, in order to make the existing nvdimm > > > > device plugging path work on Xen. > > > > > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > > > used as the Xen device model. > > > > > > I've skimmed over patch-set and can't say that I'm happy with > > > number of xen_enabled() invariants it introduced as well as > > > with partial blobs it creates. > > > > I have not read the series (Haozhong, please CC me, Anthony and > > xen-devel to the whole series next time), but yes, indeed. Let's not add > > more xen_enabled() if possible. > > > > Haozhong, was there a design document thread on xen-devel about this? If > > so, did it reach a conclusion? Was the design accepted? If so, please > > add a link to the design doc in the introductory email, so that > > everybody can read it and be on the same page. > > Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed > the guest ACPI. 
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html Igor, did you have a chance to read it? .. see below > > > > > > > > I'd like to reduce above and a way to do this might be making xen > > > 1. use fw_cfg > > > 2. fetch QEMU build acpi tables from fw_cfg > > > 3. extract nvdim tables (which is trivial) and use them > > > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > > > So what's stopping xen from using it elsewhere?, > > > instead of adding more xen specific code to do 'the same' > > > job and not reusing/sharing common code with tcg/kvm. > > > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > > rely on a firmware-like application called "hvmloader" that runs in > > guest context and generates the ACPI tables. I have no opinions on > > hvmloader and I'll let the Xen maintainers talk about it. However, keep > > in mind that with an HVM guest some devices are emulated by Xen and/or > > by other device emulators that can run alongside QEMU. QEMU doesn't have > > a full few of the system. > > > > Here the question is: does it have to be QEMU the one to generate the > > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > > like the rest, instead of introducing this split-brain design about > > ACPI. We need to see a design doc to fully understand this. > > > > hvmloader runs in the guest and is responsible to build/load guest > ACPI. However, it's not capable to build AML at runtime (for the lack > of AML builder). If any guest ACPI object is needed (e.g. by guest > DSDT), it has to be generated from ASL by iasl at Xen compile time and > then be loaded by hvmloader at runtime. > > Xen includes an OperationRegion "BIOS" in the static generated guest > DSDT, whose address is hardcoded and which contains a list of values > filled by hvmloader at runtime. Other ACPI objects can refer to those > values (e.g., the number of vCPUs). 
But it's not enough for generating > guest NVDIMM ACPI objects at compile time and then being customized > and loaded by hvmload, because its structure (i.e., the number of > namespace devices) cannot be decided util the guest config is known. > > Alternatively, we may introduce an AML builder in hvmloader and build > all guest ACPI completely in hvmloader. Looking at the similar > implementation in QEMU, it would not be small, compared to the current > size of hvmloader. Besides, I'm still going to let QEMU handle guest > NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to > build NVDIMM ACPI. > > > If the design doc thread led into thinking that it has to be QEMU to > > generate them, then would it make the code nicer if we used fw_cfg to > > get the (full or partial) tables from QEMU, as Igor suggested? > > I'll have a look at the code (which I didn't notice) pointed by Igor. And there is a spec too! https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt Igor, did you have in mind to use FW_CFG_FILE_DIR to retrieve the ACPI AML code? > > One possible issue to use fw_cfg is how to avoid the conflict between > ACPI built by QEMU and ACPI built by hvmloader (e.g., both may use the > same table signature / device names / ...). In my current design, QEMU > will pass the table signatures and device names used in its ACPI to > Xen, and Xen can check the conflict with its own ACPI. Perhaps we can > add necessary functions in fw_cfg as well. Anyway, let me first look > at the code. > > Thanks, > Haozhong
On 10/10/17 12:05 -0400, Konrad Rzeszutek Wilk wrote: > On Tue, Sep 12, 2017 at 11:15:09AM +0800, Haozhong Zhang wrote: > > On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > > > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > > > > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > > > On Mon, 11 Sep 2017 12:41:47 +0800 > > > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > > > > > This is the QEMU part patches that works with the associated Xen > > > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > > > guest address space for vNVDIMM devices. > > > > > > > > > > All patches can be found at > > > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > > > memory region for Xen guest, in order to make the existing nvdimm > > > > > device plugging path work on Xen. > > > > > > > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > > > > used as the Xen device model. > > > > > > > > I've skimmed over patch-set and can't say that I'm happy with > > > > number of xen_enabled() invariants it introduced as well as > > > > with partial blobs it creates. > > > > > > I have not read the series (Haozhong, please CC me, Anthony and > > > xen-devel to the whole series next time), but yes, indeed. Let's not add > > > more xen_enabled() if possible. > > > > > > Haozhong, was there a design document thread on xen-devel about this? If > > > so, did it reach a conclusion? Was the design accepted? 
If so, please > > > add a link to the design doc in the introductory email, so that > > > everybody can read it and be on the same page. > > > > Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed > > the guest ACPI. > > > > [1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html > > Igor, did you have a chance to read it? > > .. see below > > > > > > > > > > > > I'd like to reduce above and a way to do this might be making xen > > > > 1. use fw_cfg > > > > 2. fetch QEMU build acpi tables from fw_cfg > > > > 3. extract nvdim tables (which is trivial) and use them > > > > > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > > > > > So what's stopping xen from using it elsewhere?, > > > > instead of adding more xen specific code to do 'the same' > > > > job and not reusing/sharing common code with tcg/kvm. > > > > > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > > > rely on a firmware-like application called "hvmloader" that runs in > > > guest context and generates the ACPI tables. I have no opinions on > > > hvmloader and I'll let the Xen maintainers talk about it. However, keep > > > in mind that with an HVM guest some devices are emulated by Xen and/or > > > by other device emulators that can run alongside QEMU. QEMU doesn't have > > > a full few of the system. > > > > > > Here the question is: does it have to be QEMU the one to generate the > > > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > > > like the rest, instead of introducing this split-brain design about > > > ACPI. We need to see a design doc to fully understand this. > > > > > > > hvmloader runs in the guest and is responsible to build/load guest > > ACPI. However, it's not capable to build AML at runtime (for the lack > > of AML builder). If any guest ACPI object is needed (e.g. 
by guest > > DSDT), it has to be generated from ASL by iasl at Xen compile time and > > then be loaded by hvmloader at runtime. > > > > Xen includes an OperationRegion "BIOS" in the static generated guest > > DSDT, whose address is hardcoded and which contains a list of values > > filled by hvmloader at runtime. Other ACPI objects can refer to those > > values (e.g., the number of vCPUs). But it's not enough for generating > > guest NVDIMM ACPI objects at compile time and then being customized > > and loaded by hvmload, because its structure (i.e., the number of > > namespace devices) cannot be decided util the guest config is known. > > > > Alternatively, we may introduce an AML builder in hvmloader and build > > all guest ACPI completely in hvmloader. Looking at the similar > > implementation in QEMU, it would not be small, compared to the current > > size of hvmloader. Besides, I'm still going to let QEMU handle guest > > NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to > > build NVDIMM ACPI. > > > > > If the design doc thread led into thinking that it has to be QEMU to > > > generate them, then would it make the code nicer if we used fw_cfg to > > > get the (full or partial) tables from QEMU, as Igor suggested? > > > > I'll have a look at the code (which I didn't notice) pointed by Igor. > > And there is a spec too! > > https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt > > Igor, did you have in mind to use FW_CFG_FILE_DIR to retrieve the > ACPI AML code? > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and /rom@etc/table-loader. The former is unstructured to guest, and contains all data of guest ACPI. The latter is a BIOSLinkerLoader organized as a set of commands, which direct the guest (e.g., SeaBIOS on KVM/QEMU) to relocate data in the former file, recalculate checksum of specified area, and fill guest address in specified ACPI field. 
One part of my patches is to implement a mechanism to tell Xen which part of ACPI data is a table (NFIT), and which part defines a namespace device and what the device name is. I can add two new loader commands for them respectively. Because they just provide information and SeaBIOS in non-xen environment ignores unrecognized commands, they will not break SeaBIOS in non-xen environment. On QEMU side, most Xen-specific hacks in ACPI builder could be dropped, and replaced by adding the new loader commands (though they may be used only by Xen). On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor are needed in, perhaps, hvmloader. Haozhong
On 12/10/2017 14:45, Haozhong Zhang wrote: > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > /rom@etc/table-loader. The former is unstructured to guest, and > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > organized as a set of commands, which direct the guest (e.g., SeaBIOS > on KVM/QEMU) to relocate data in the former file, recalculate checksum > of specified area, and fill guest address in specified ACPI field. > > One part of my patches is to implement a mechanism to tell Xen which > part of ACPI data is a table (NFIT), and which part defines a > namespace device and what the device name is. I can add two new loader > commands for them respectively. > > Because they just provide information and SeaBIOS in non-xen > environment ignores unrecognized commands, they will not break SeaBIOS > in non-xen environment. > > On QEMU side, most Xen-specific hacks in ACPI builder could be > dropped, and replaced by adding the new loader commands (though they > may be used only by Xen). > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > are needed in, perhaps, hvmloader. If Xen has to parse BIOSLinkerLoader, it can use the existing commands to process a reduced set of ACPI tables. In other words, etc/acpi/tables would only include the NFIT, the SSDT with namespace devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. hvmloader can then: 1) allocate some memory for where the XSDT will go 2) process the BIOSLinkerLoader like SeaBIOS would do 3) find the RSDP in low memory, since the loader script must have placed it there. If it cannot find it, allocate some low memory, fill it with the RSDP header and revision, and jump to step 6 4) If it found QEMU's RSDP, use it to find QEMU's XSDT 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT.
7) overwrite the RSDP in low memory with a pointer to hvmloader's own RSDT and/or XSDT, and update the checksums. QEMU's XSDT remains there somewhere in memory, unused but harmless. Paolo
On 10/12/17 17:45 +0200, Paolo Bonzini wrote: > On 12/10/2017 14:45, Haozhong Zhang wrote: > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > > /rom@etc/table-loader. The former is unstructured to guest, and > > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > on KVM/QEMU) to relocate data in the former file, recalculate checksum > > of specified area, and fill guest address in specified ACPI field. > > > > One part of my patches is to implement a mechanism to tell Xen which > > part of ACPI data is a table (NFIT), and which part defines a > > namespace device and what the device name is. I can add two new loader > > commands for them respectively. > > > > Because they just provide information and SeaBIOS in non-xen > > environment ignores unrecognized commands, they will not break SeaBIOS > > in non-xen environment. > > > > On QEMU side, most Xen-specific hacks in ACPI builder could be > > dropped, and replaced by adding the new loader commands (though they > > may be used only by Xen). > > > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > are needed in, perhaps, hvmloader. > > If Xen has to parse BIOSLinkerLoader, it can use the existing commands > to process a reduced set of ACPI tables. In other words, > etc/acpi/tables would only include the NFIT, the SSDT with namespace > devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. > > hvmloader can then: > > 1) allocate some memory for where the XSDT will go > > 2) process the BIOSLinkerLoader like SeaBIOS would do > > 3) find the RSDP in low memory, since the loader script must have placed > it there. If it cannot find it, allocate some low memory, fill it with > the RSDP header and revision, and and jump to step 6 > > 4) If it found QEMU's RSDP, use it to find QEMU's XSDT > > 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT. 
> > 6) build hvmloader tables and link them into the RSDT and/or XSDT as usual. > > 7) overwrite the RSDP in low memory with a pointer to hvmloader's own > RSDT and/or XSDT, and update the checksums > > QEMU's XSDT remains there somewhere in memory, unused but harmless. > It can work for plain tables which do not contain AML. However, for a namespace device, Xen needs to know its name in order to detect the potential name conflict with those used in Xen-built ACPI. Xen does not (and is not going to) introduce an AML parser, so it cannot get those device names from QEMU-built ACPI on its own. The idea of either this patch series or the new BIOSLinkerLoader command is to let QEMU tell Xen where the definition body of a namespace device (i.e. the part within the outermost "Device(NAME)") is and what the device name is. Xen, after the name conflict check, can re-package the definition body in a namespace device (w/ minimal AML builder code added in Xen) and then in an SSDT. Haozhong
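The re-packaging step Haozhong describes needs only a tiny AML builder: Device(NAME) { ... } encodes as ExtOpPrefix (0x5B), DeviceOp (0x82), a PkgLength, the 4-character NameSeg, then the definition body verbatim. A hedged sketch of what "minimal AML builder code added in Xen" could look like — illustrative code, not from the Xen tree, and handling only the single-byte PkgLength form (total length up to 63 bytes):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Wrap an already-built AML definition body in Device(NAME) { body }.
 * Returns the number of bytes written to out.  Only the one-byte
 * PkgLength encoding is implemented; larger bodies would need the
 * multi-byte PkgLength form from the ACPI spec. */
static size_t aml_wrap_device(uint8_t *out, const char name[4],
                              const uint8_t *body, size_t body_len)
{
    size_t pkg_len = 1 + 4 + body_len;   /* PkgLength byte + NameSeg + body */
    assert(pkg_len <= 0x3f);             /* single-byte PkgLength only */

    size_t n = 0;
    out[n++] = 0x5b;                     /* ExtOpPrefix */
    out[n++] = 0x82;                     /* DeviceOp */
    out[n++] = (uint8_t)pkg_len;
    memcpy(out + n, name, 4);            /* NameSeg, e.g. "NVDR" */
    n += 4;
    memcpy(out + n, body, body_len);     /* definition body from QEMU */
    return n + body_len;
}
```

The resulting Device can then be dropped into an SSDT whose header and checksum Xen already knows how to emit.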
On Fri, 13 Oct 2017 15:53:26 +0800 Haozhong Zhang <haozhong.zhang@intel.com> wrote: > On 10/12/17 17:45 +0200, Paolo Bonzini wrote: > > On 12/10/2017 14:45, Haozhong Zhang wrote: > > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > > > /rom@etc/table-loader. The former is unstructured to guest, and > > > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > > on KVM/QEMU) to relocate data in the former file, recalculate checksum > > > of specified area, and fill guest address in specified ACPI field. > > > > > > One part of my patches is to implement a mechanism to tell Xen which > > > part of ACPI data is a table (NFIT), and which part defines a > > > namespace device and what the device name is. I can add two new loader > > > commands for them respectively. > > > > > > Because they just provide information and SeaBIOS in non-xen > > > environment ignores unrecognized commands, they will not break SeaBIOS > > > in non-xen environment. > > > > > > On QEMU side, most Xen-specific hacks in ACPI builder could be > > > dropped, and replaced by adding the new loader commands (though they > > > may be used only by Xen). > > > > > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > > are needed in, perhaps, hvmloader. > > > > If Xen has to parse BIOSLinkerLoader, it can use the existing commands > > to process a reduced set of ACPI tables. In other words, > > etc/acpi/tables would only include the NFIT, the SSDT with namespace > > devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. > > > > hvmloader can then: > > > > 1) allocate some memory for where the XSDT will go > > > > 2) process the BIOSLinkerLoader like SeaBIOS would do > > > > 3) find the RSDP in low memory, since the loader script must have placed > > it there. 
If it cannot find it, allocate some low memory, fill it with > > the RSDP header and revision, and and jump to step 6 > > > > 4) If it found QEMU's RSDP, use it to find QEMU's XSDT > > > > 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT. > > > > 6) build hvmloader tables and link them into the RSDT and/or XSDT as usual. > > > > 7) overwrite the RSDP in low memory with a pointer to hvmloader's own > > RSDT and/or XSDT, and updated the checksums > > > > QEMU's XSDT remains there somewhere in memory, unused but harmless. > > +1 to Paolo's suggestion, i.e. 1. add BIOSLinkerLoader into hvmloader 2. load/process QEMU's tables with #1 3. get pointers to QEMU generated NFIT and NVDIMM SSDT from QEMU's RSDT/XSDT and put them in hvmloader's RSDT > It can work for plan tables which do not contain AML. > > However, for a namespace device, Xen needs to know its name in order > to detect the potential name conflict with those used in Xen built > ACPI. Xen does not (and is not going to) introduce an AML parser, so > it cannot get those device names from QEMU built ACPI by its own. > > The idea of either this patch series or the new BIOSLinkerLoader > command is to let QEMU tell Xen where the definition body of a > namespace device (i.e. that part within the outmost "Device(NAME)") is > and what the device name is. Xen, after the name conflict check, can > re-package the definition body in a namespace device (w/ minimal AML > builder code added in Xen) and then in SSDT. I'd skip conflict check at runtime as hvmloader doesn't currently have "\\_SB\NVDR" device so instead of doing runtime check it might do primitive check at build time that ASL sources in hvmloader do not contain reserved for QEMU "NVDR" keyword to avoid its addition by accident in future. (it also might be reused in future if some other tables from QEMU will be reused). 
It's a bit hackish but at least it does the job and keeps the BIOSLinkerLoader interface the same for all supported firmwares (I'd consider it a temporary hack on the way to ACPI tables fully built by QEMU for Xen). Ideally it would be better for QEMU to build all ACPI tables for hvmloader, to avoid split-brain issues and the need to invent extra interfaces every time a feature is added to pass configuration data from QEMU to firmware. But that's probably out of scope of this project; it could be done on top of this if the Xen folks would like to do it. Adding BIOSLinkerLoader to hvmloader would be a good starting point for that future effort.
On 10/13/17 10:44 +0200, Igor Mammedov wrote: > On Fri, 13 Oct 2017 15:53:26 +0800 > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > On 10/12/17 17:45 +0200, Paolo Bonzini wrote: > > > On 12/10/2017 14:45, Haozhong Zhang wrote: > > > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > > > > /rom@etc/table-loader. The former is unstructured to guest, and > > > > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > > > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > > > on KVM/QEMU) to relocate data in the former file, recalculate checksum > > > > of specified area, and fill guest address in specified ACPI field. > > > > > > > > One part of my patches is to implement a mechanism to tell Xen which > > > > part of ACPI data is a table (NFIT), and which part defines a > > > > namespace device and what the device name is. I can add two new loader > > > > commands for them respectively. > > > > > > > > Because they just provide information and SeaBIOS in non-xen > > > > environment ignores unrecognized commands, they will not break SeaBIOS > > > > in non-xen environment. > > > > > > > > On QEMU side, most Xen-specific hacks in ACPI builder could be > > > > dropped, and replaced by adding the new loader commands (though they > > > > may be used only by Xen). > > > > > > > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > > > are needed in, perhaps, hvmloader. > > > > > > If Xen has to parse BIOSLinkerLoader, it can use the existing commands > > > to process a reduced set of ACPI tables. In other words, > > > etc/acpi/tables would only include the NFIT, the SSDT with namespace > > > devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. 
> > > > > > hvmloader can then: > > > > > > 1) allocate some memory for where the XSDT will go > > > > > > 2) process the BIOSLinkerLoader like SeaBIOS would do > > > > > > 3) find the RSDP in low memory, since the loader script must have placed > > > it there. If it cannot find it, allocate some low memory, fill it with > > > the RSDP header and revision, and and jump to step 6 > > > > > > 4) If it found QEMU's RSDP, use it to find QEMU's XSDT > > > > > > 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT. > > > > > > 6) build hvmloader tables and link them into the RSDT and/or XSDT as usual. > > > > > > 7) overwrite the RSDP in low memory with a pointer to hvmloader's own > > > RSDT and/or XSDT, and updated the checksums > > > > > > QEMU's XSDT remains there somewhere in memory, unused but harmless. > > > > +1 to Paolo's suggestion, i.e. > 1. add BIOSLinkerLoader into hvmloader > 2. load/process QEMU's tables with #1 > 3. get pointers to QEMU generated NFIT and NVDIMM SSDT from QEMU's RSDT/XSDT > and put them in hvmloader's RSDT > > > It can work for plan tables which do not contain AML. > > > > However, for a namespace device, Xen needs to know its name in order > > to detect the potential name conflict with those used in Xen built > > ACPI. Xen does not (and is not going to) introduce an AML parser, so > > it cannot get those device names from QEMU built ACPI by its own. > > > > The idea of either this patch series or the new BIOSLinkerLoader > > command is to let QEMU tell Xen where the definition body of a > > namespace device (i.e. that part within the outmost "Device(NAME)") is > > and what the device name is. Xen, after the name conflict check, can > > re-package the definition body in a namespace device (w/ minimal AML > > builder code added in Xen) and then in SSDT. 
> > I'd skip conflict check at runtime as hvmloader doesn't currently > have "\\_SB\NVDR" device so instead of doing runtime check it might > do primitive check at build time that ASL sources in hvmloader do > not contain reserved for QEMU "NVDR" keyword to avoid its addition > by accident in future. (it also might be reused in future if some > other tables from QEMU will be reused). > It's a bit hackish but at least it does the job and keeps > BIOSLinkerLoader interface the same for all supported firmwares > (I'd consider it as a temporary hack on the way to fully build > by QEMU ACPI tables for Xen). > > Ideally it would be better for QEMU to build all ACPI tables for > hvmloader to avoid split brain issues and need to invent extra > interfaces every time a feature is added to pass configuration > data from QEMU to firmware. > But that's probably out of scope of this project, it could be > done on top of this if Xen folks would like to do it. Adding > BIOSLinkerLoader to hvmloader would be a good starting point > for that future effort. If we can let QEMU build the entire guest ACPI, we may not even need to introduce fw_cfg and BIOSLinkerLoader code to hvmloader. SeaBIOS is currently loaded after hvmloader and can be used to load the QEMU-built ACPI. To Jan, Andrew, Stefano and Anthony, what do you think about allowing QEMU to build the entire guest ACPI and letting SeaBIOS load it? The ACPI builder code in hvmloader is still there and just bypassed in this case. Haozhong
>>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > To Jan, Andrew, Stefano and Anthony, > > what do you think about allowing QEMU to build the entire guest ACPI > and letting SeaBIOS load it? ACPI builder code in hvmloader is > still there and just bypassed in this case. Well, if that can be made to work in a non-quirky way and without loss of functionality, I'd probably be fine. I do think, however, that there's a reason this is being handled in hvmloader right now. Jan
On Fri, 13 Oct 2017, Jan Beulich wrote: > >>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > > To Jan, Andrew, Stefano and Anthony, > > > > what do you think about allowing QEMU to build the entire guest ACPI > > and letting SeaBIOS to load it? ACPI builder code in hvmloader is > > still there and just bypassed in this case. > > Well, if that can be made work in a non-quirky way and without > loss of functionality, I'd probably be fine. I do think, however, > that there's a reason this is being handled in hvmloader right now. And not to discourage you, just as a clarification, you'll also need to consider backward compatibility: unless the tables are identical, I imagine we'll have to keep using the old tables for already installed virtual machines.
On Fri, Oct 13, 2017 at 03:46:39PM -0700, Stefano Stabellini wrote: > On Fri, 13 Oct 2017, Jan Beulich wrote: > > >>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > > > To Jan, Andrew, Stefano and Anthony, > > > > > > what do you think about allowing QEMU to build the entire guest ACPI > > > and letting SeaBIOS to load it? ACPI builder code in hvmloader is > > > still there and just bypassed in this case. > > > > Well, if that can be made work in a non-quirky way and without > > loss of functionality, I'd probably be fine. I do think, however, > > that there's a reason this is being handled in hvmloader right now. > > And not to discourage you, just as a clarification, you'll also need to > consider backward compatibility: unless the tables are identical, I > imagine we'll have to keep using the old tables for already installed > virtual machines. Maybe you can handle this using machine type versioning. Installed guests would use the old type. -- MST
On Sun, Oct 15, 2017 at 03:31:15AM +0300, Michael S. Tsirkin wrote: > On Fri, Oct 13, 2017 at 03:46:39PM -0700, Stefano Stabellini wrote: > > On Fri, 13 Oct 2017, Jan Beulich wrote: > > > >>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > > > > To Jan, Andrew, Stefano and Anthony, > > > > > > > > what do you think about allowing QEMU to build the entire guest ACPI > > > > and letting SeaBIOS to load it? ACPI builder code in hvmloader is > > > > still there and just bypassed in this case. > > > > > > Well, if that can be made work in a non-quirky way and without > > > loss of functionality, I'd probably be fine. I do think, however, > > > that there's a reason this is being handled in hvmloader right now. > > > > And not to discourage you, just as a clarification, you'll also need to > > consider backward compatibility: unless the tables are identical, I > > imagine we'll have to keep using the old tables for already installed > > virtual machines. > > Maybe you can handle this using machine type versioning. <nods> And the type could be v2 if nvdimm was provided (which is something that the toolstack would figure out). The toolstack could also have a separate 'v2' config flag if somebody wanted to play with this _outside_ of having NVDIMM in the guest? > Installed guests would use the old type. <nods>
On 14/10/2017 00:46, Stefano Stabellini wrote: > On Fri, 13 Oct 2017, Jan Beulich wrote: >>>>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: >>> To Jan, Andrew, Stefano and Anthony, >>> >>> what do you think about allowing QEMU to build the entire guest ACPI >>> and letting SeaBIOS to load it? ACPI builder code in hvmloader is >>> still there and just bypassed in this case. >> Well, if that can be made work in a non-quirky way and without >> loss of functionality, I'd probably be fine. I do think, however, >> that there's a reason this is being handled in hvmloader right now. > And not to discourage you, just as a clarification, you'll also need to > consider backward compatibility: unless the tables are identical, I > imagine we'll have to keep using the old tables for already installed > virtual machines. I agree. Some of them are already identical, some are not but the QEMU version should be okay, and for yet more it's probably better to keep the Xen-specific parts in hvmloader. The good thing is that it's possible to proceed incrementally once you have the hvmloader support for merging the QEMU and hvmloader RSDT or XSDT (whatever you are using), starting with just NVDIMM and proceeding later with whatever you see fit. Paolo
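The RSDT/XSDT merge Paolo mentions boils down to appending QEMU's 64-bit table pointers after hvmloader's own entries and growing the header's Length field. A sketch under the assumptions that both tables use the standard 36-byte ACPI header and that the host is little-endian like the x86 guest (function names are illustrative, not hvmloader's real API):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ACPI_HDR_LEN 36   /* standard ACPI table header size */

/* Number of 64-bit entries in an XSDT, from its Length field. */
static uint32_t xsdt_entry_count(const uint8_t *xsdt)
{
    uint32_t len;
    memcpy(&len, xsdt + 4, 4);               /* Length, little-endian */
    return (len - ACPI_HDR_LEN) / 8;
}

/* Append every table pointer from src_xsdt after dst_xsdt's own entries
 * and grow dst's Length field.  dst_xsdt must have room for the extra
 * entries, and the caller must recompute dst's checksum afterwards. */
static void xsdt_copy_entries(uint8_t *dst_xsdt, const uint8_t *src_xsdt)
{
    uint32_t src_n = xsdt_entry_count(src_xsdt);
    uint32_t dst_len;
    memcpy(&dst_len, dst_xsdt + 4, 4);

    memcpy(dst_xsdt + dst_len, src_xsdt + ACPI_HDR_LEN, src_n * 8);
    dst_len += src_n * 8;
    memcpy(dst_xsdt + 4, &dst_len, 4);
}
```

Proceeding incrementally then means filtering which of QEMU's entries get copied, starting with just the NFIT and the NVDIMM SSDT.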
On 10/17/17 13:45 +0200, Paolo Bonzini wrote: > On 14/10/2017 00:46, Stefano Stabellini wrote: > > On Fri, 13 Oct 2017, Jan Beulich wrote: > >>>>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > >>> To Jan, Andrew, Stefano and Anthony, > >>> > >>> what do you think about allowing QEMU to build the entire guest ACPI > >>> and letting SeaBIOS to load it? ACPI builder code in hvmloader is > >>> still there and just bypassed in this case. > >> Well, if that can be made work in a non-quirky way and without > >> loss of functionality, I'd probably be fine. I do think, however, > >> that there's a reason this is being handled in hvmloader right now. > > And not to discourage you, just as a clarification, you'll also need to > > consider backward compatibility: unless the tables are identical, I > > imagine we'll have to keep using the old tables for already installed > > virtual machines. > > I agree. Some of them are already identical, some are not but the QEMU > version should be okay, and for yet more it's probably better to keep > the Xen-specific parts in hvmloader. > > The good thing is that it's possible to proceed incrementally once you > have the hvmloader support for merging the QEMU and hvmloader RSDT or > XSDT (whatever you are using), starting with just NVDIMM and proceeding > later with whatever you see fit. > I'll have a try to check how much the differences would affect. If it would not take too much work, I'd like to adapt Xen NVDIMM enabling patches to the all QEMU built ACPI. Otherwise, I'll fall back to Paolo and MST's suggestions. Thanks, Haozhong
On Tue, Oct 17, 2017 at 08:16:47PM +0800, Haozhong Zhang wrote: > On 10/17/17 13:45 +0200, Paolo Bonzini wrote: > > On 14/10/2017 00:46, Stefano Stabellini wrote: > > > On Fri, 13 Oct 2017, Jan Beulich wrote: > > >>>>> On 13.10.17 at 13:13, <haozhong.zhang@intel.com> wrote: > > >>> To Jan, Andrew, Stefano and Anthony, > > >>> > > >>> what do you think about allowing QEMU to build the entire guest ACPI > > >>> and letting SeaBIOS to load it? ACPI builder code in hvmloader is > > >>> still there and just bypassed in this case. > > >> Well, if that can be made work in a non-quirky way and without > > >> loss of functionality, I'd probably be fine. I do think, however, > > >> that there's a reason this is being handled in hvmloader right now. > > > And not to discourage you, just as a clarification, you'll also need to > > > consider backward compatibility: unless the tables are identical, I > > > imagine we'll have to keep using the old tables for already installed > > > virtual machines. > > > > I agree. Some of them are already identical, some are not but the QEMU > > version should be okay, and for yet more it's probably better to keep > > the Xen-specific parts in hvmloader. > > > > The good thing is that it's possible to proceed incrementally once you > > have the hvmloader support for merging the QEMU and hvmloader RSDT or > > XSDT (whatever you are using), starting with just NVDIMM and proceeding > > later with whatever you see fit. > > > > I'll have a try to check how much the differences would affect. If it > would not take too much work, I'd like to adapt Xen NVDIMM enabling > patches to the all QEMU built ACPI. Otherwise, I'll fall back to Paolo > and MST's suggestions. I don't agree with the end goal of fully switching to the QEMU-built ACPI tables. First of all, the only entity that has all the information about the guest is the toolstack, and so it should be the one in control of the ACPI tables.
Also, Xen guests can use several device models concurrently (via the ioreq server interface), and each should be able to contribute to the information presented in the ACPI tables. Intel is also working on adding IOMMU emulation to the Xen hypervisor, in which case the vIOMMU ACPI tables should be created by the toolstack and not QEMU. And finally keep in mind that there are Xen guests (PVH) that use ACPI tables but not QEMU. Thanks, Roger.
On 18/10/2017 10:32, Roger Pau Monné wrote: >> I'll have a try to check how much the differences would affect. If it >> would not take too much work, I'd like to adapt Xen NVDIMM enabling >> patches to the all QEMU built ACPI. Otherwise, I'll fall back to Paolo >> and MST's suggestions. > I don't agree with the end goal of fully switching to the QEMU build > ACPI tables. First of all, the only entity that has all the > information about the guest it's the toolstack, and so it should be > the one in control of the ACPI tables. > > Also, Xen guests can use several device models concurrently (via the > ioreq server interface), and each should be able to contribute to the > information presented in the ACPI tables. Intel is also working on > adding IOMMU emulation to the Xen hypervisor, in which case the vIOMMU > ACPI tables should be created by the toolstack and not QEMU. And > finally keep in mind that there are Xen guests (PVH) that use ACPI > tables but not QEMU. I agree with this in fact; QEMU has a view of _most_ of the emulated hardware, but not all. However, I disagree that the toolstack should be alone in controlling the ACPI tables; rather, each involved part of the stack should be providing its own part of the tables. For example, QEMU (in addition to NVDIMM information) should be the one providing an SSDT for southbridge devices (floppy, COMx, LPTx, etc.). The Xen stack (or more likely, hvmloader itself) would provide all the bits that are provided by the hypervisor (MADT for the IOAPIC, another SSDT for the HPET and RTC, DMAR tables for IOMMU, and so on). This should also work just fine for PVH. Of course backwards compatibility is the enemy of simplification, but in the end things _should_ actually be simpler and I think it's a good idea if a prerequisite for Xen vNVDIMM is to move AML code for QEMU devices out of hvmloader. Paolo
On Wed, Oct 18, 2017 at 10:46:57AM +0200, Paolo Bonzini wrote: > On 18/10/2017 10:32, Roger Pau Monné wrote: > >> I'll have a try to check how much the differences would affect. If it > >> would not take too much work, I'd like to adapt Xen NVDIMM enabling > >> patches to the all QEMU built ACPI. Otherwise, I'll fall back to Paolo > >> and MST's suggestions. > > I don't agree with the end goal of fully switching to the QEMU build > > ACPI tables. First of all, the only entity that has all the > > information about the guest it's the toolstack, and so it should be > > the one in control of the ACPI tables. > > > > Also, Xen guests can use several device models concurrently (via the > > ioreq server interface), and each should be able to contribute to the > > information presented in the ACPI tables. Intel is also working on > > adding IOMMU emulation to the Xen hypervisor, in which case the vIOMMU > > ACPI tables should be created by the toolstack and not QEMU. And > > finally keep in mind that there are Xen guests (PVH) that use ACPI > > tables but not QEMU. > > I agree with this in fact; QEMU has a view of _most_ of the emulated > hardware, but not all. > > However, I disagree that the toolstack should be alone in controlling > the ACPI tables; rather, each involved part of the stack should be > providing its own part of the tables. For example, QEMU (in addition to > NVDIMM information) should be the one providing an SSDT for southbridge > devices (floppy, COMx, LPTx, etc.). Yes, that's what I wanted to say, rather than the toolstack providing all the ACPI tables by itself. Every component should provide the tables of the devices under its control, and that should be glued together by the toolstack (i.e. hvmloader). Roger.
On Fri, Oct 13, 2017 at 03:53:26PM +0800, Haozhong Zhang wrote: > On 10/12/17 17:45 +0200, Paolo Bonzini wrote: > > On 12/10/2017 14:45, Haozhong Zhang wrote: > > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > > > /rom@etc/table-loader. The former is unstructured to guest, and > > > contains all data of guest ACPI. The latter is a BIOSLinkerLoader > > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > > on KVM/QEMU) to relocate data in the former file, recalculate checksum > > > of specified area, and fill guest address in specified ACPI field. > > > > > > One part of my patches is to implement a mechanism to tell Xen which > > > part of ACPI data is a table (NFIT), and which part defines a > > > namespace device and what the device name is. I can add two new loader > > > commands for them respectively. > > > > > > Because they just provide information and SeaBIOS in non-xen > > > environment ignores unrecognized commands, they will not break SeaBIOS > > > in non-xen environment. > > > > > > On QEMU side, most Xen-specific hacks in ACPI builder could be > > > dropped, and replaced by adding the new loader commands (though they > > > may be used only by Xen). > > > > > > On Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > > are needed in, perhaps, hvmloader. > > > > If Xen has to parse BIOSLinkerLoader, it can use the existing commands > > to process a reduced set of ACPI tables. In other words, > > etc/acpi/tables would only include the NFIT, the SSDT with namespace > > devices, and the XSDT. etc/acpi/rsdp would include the RSDP table as usual. > > > > hvmloader can then: > > > > 1) allocate some memory for where the XSDT will go > > > > 2) process the BIOSLinkerLoader like SeaBIOS would do > > > > 3) find the RSDP in low memory, since the loader script must have placed > > it there. 
If it cannot find it, allocate some low memory, fill it with > > the RSDP header and revision, and and jump to step 6 > > > > 4) If it found QEMU's RSDP, use it to find QEMU's XSDT > > > > 5) Copy ACPI table pointers from QEMU to hvmloader's RSDT and/or XSDT. > > > > 6) build hvmloader tables and link them into the RSDT and/or XSDT as usual. > > > > 7) overwrite the RSDP in low memory with a pointer to hvmloader's own > > RSDT and/or XSDT, and updated the checksums > > > > QEMU's XSDT remains there somewhere in memory, unused but harmless. > > > > It can work for plan tables which do not contain AML. > > However, for a namespace device, Xen needs to know its name in order > to detect the potential name conflict with those used in Xen built > ACPI. Xen does not (and is not going to) introduce an AML parser, so > it cannot get those device names from QEMU built ACPI by its own. > > The idea of either this patch series or the new BIOSLinkerLoader > command is to let QEMU tell Xen where the definition body of a > namespace device (i.e. that part within the outmost "Device(NAME)") is > and what the device name is. Xen, after the name conflict check, can > re-package the definition body in a namespace device (w/ minimal AML > builder code added in Xen) and then in SSDT. > > > Haozhong You most likely can do this without a new command. You can use something similiar to build_append_named_dword in combination with BIOS_LINKER_LOADER_COMMAND_ADD_POINTER like vm gen id does. -- MST
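MST's suggestion relies on BIOS_LINKER_LOADER_COMMAND_ADD_POINTER, whose documented guest-side semantics are: the patched field initially holds an offset within the pointee blob, and the executor adds the guest address the pointee was loaded at, turning it into an absolute pointer. A simplified sketch of what a hvmloader executor would do per command (the fw_cfg file-name lookup that selects the two blobs is omitted; little-endian host assumed, which is fine for x86 guests):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Execute one ADD_POINTER command: read the pointer_size-byte field at
 * pointer_offset in the pointer blob (it holds an offset into the
 * pointee blob), add the pointee's guest load address, write it back. */
static void linker_add_pointer(uint8_t *pointer_file, uint32_t pointer_offset,
                               unsigned pointer_size /* 1, 2, 4 or 8 */,
                               uint64_t pointee_addr)
{
    uint64_t val = 0;
    memcpy(&val, pointer_file + pointer_offset, pointer_size);
    val += pointee_addr;
    memcpy(pointer_file + pointer_offset, &val, pointer_size);
}
```

With this in place, QEMU can publish the address of a small buffer (as vmgenid does) and the guest firmware patches the ACPI blob without any new command types.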
On Thu, Oct 12, 2017 at 08:45:44PM +0800, Haozhong Zhang wrote: > On 10/10/17 12:05 -0400, Konrad Rzeszutek Wilk wrote: > > On Tue, Sep 12, 2017 at 11:15:09AM +0800, Haozhong Zhang wrote: > > > On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > > > > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > > > > > > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > > > > On Mon, 11 Sep 2017 12:41:47 +0800 > > > > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > > > > > > > This is the QEMU part patches that works with the associated Xen > > > > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > > > > guest address space for vNVDIMM devices. > > > > > > > > > > > > All patches can be found at > > > > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > > > > memory region for Xen guest, in order to make the existing nvdimm > > > > > > device plugging path work on Xen. > > > > > > > > > > > > Patch 4 - 10 build and cooy NFIT from QEMU to Xen guest, when QEMU is > > > > > > used as the Xen device model. > > > > > > > > > > I've skimmed over patch-set and can't say that I'm happy with > > > > > number of xen_enabled() invariants it introduced as well as > > > > > with partial blobs it creates. > > > > > > > > I have not read the series (Haozhong, please CC me, Anthony and > > > > xen-devel to the whole series next time), but yes, indeed. Let's not add > > > > more xen_enabled() if possible. > > > > > > > > Haozhong, was there a design document thread on xen-devel about this? 
If > > > > so, did it reach a conclusion? Was the design accepted? If so, please > > > > add a link to the design doc in the introductory email, so that > > > > everybody can read it and be on the same page. > > > > > > Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed > > > the guest ACPI. > > > > > > [1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html > > > > Igor, did you have a chance to read it? > > > > .. see below > > > > > > > > > > > > > > > > I'd like to reduce above and a way to do this might be making xen > > > > > 1. use fw_cfg > > > > > 2. fetch QEMU build acpi tables from fw_cfg > > > > > 3. extract nvdim tables (which is trivial) and use them > > > > > > > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > > > > > > > So what's stopping xen from using it elsewhere?, > > > > > instead of adding more xen specific code to do 'the same' > > > > > job and not reusing/sharing common code with tcg/kvm. > > > > > > > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > > > > rely on a firmware-like application called "hvmloader" that runs in > > > > guest context and generates the ACPI tables. I have no opinions on > > > > hvmloader and I'll let the Xen maintainers talk about it. However, keep > > > > in mind that with an HVM guest some devices are emulated by Xen and/or > > > > by other device emulators that can run alongside QEMU. QEMU doesn't have > > > > a full few of the system. > > > > > > > > Here the question is: does it have to be QEMU the one to generate the > > > > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > > > > like the rest, instead of introducing this split-brain design about > > > > ACPI. We need to see a design doc to fully understand this. > > > > > > > > > > hvmloader runs in the guest and is responsible to build/load guest > > > ACPI. 
However, it's not capable to build AML at runtime (for the lack > > > of AML builder). If any guest ACPI object is needed (e.g. by guest > > > DSDT), it has to be generated from ASL by iasl at Xen compile time and > > > then be loaded by hvmloader at runtime. > > > > > > Xen includes an OperationRegion "BIOS" in the static generated guest > > > DSDT, whose address is hardcoded and which contains a list of values > > > filled by hvmloader at runtime. Other ACPI objects can refer to those > > > values (e.g., the number of vCPUs). But it's not enough for generating > > > guest NVDIMM ACPI objects at compile time and then being customized > > > and loaded by hvmload, because its structure (i.e., the number of > > > namespace devices) cannot be decided util the guest config is known. > > > > > > Alternatively, we may introduce an AML builder in hvmloader and build > > > all guest ACPI completely in hvmloader. Looking at the similar > > > implementation in QEMU, it would not be small, compared to the current > > > size of hvmloader. Besides, I'm still going to let QEMU handle guest > > > NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to > > > build NVDIMM ACPI. > > > > > > > If the design doc thread led into thinking that it has to be QEMU to > > > > generate them, then would it make the code nicer if we used fw_cfg to > > > > get the (full or partial) tables from QEMU, as Igor suggested? > > > > > > I'll have a look at the code (which I didn't notice) pointed by Igor. > > > > And there is a spec too! > > > > https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt > > > > Igor, did you have in mind to use FW_CFG_FILE_DIR to retrieve the > > ACPI AML code? > > > > Basically, QEMU builds two ROMs for guest, /rom@etc/acpi/tables and > /rom@etc/table-loader. The former is unstructured to guest, and > contains all data of guest ACPI. 
The latter is a BIOSLinkerLoader > organized as a set of commands, which direct the guest (e.g., SeaBIOS > on KVM/QEMU) to relocate data in the former file, recalculate the checksum > of a specified area, and fill a guest address into a specified ACPI field. > > One part of my patches is to implement a mechanism to tell Xen which > part of ACPI data is a table (NFIT), and which part defines a > namespace device and what the device name is. I can add two new loader > commands for them respectively. <nods> > > Because they just provide information and SeaBIOS in a non-Xen > environment ignores unrecognized commands, they will not break SeaBIOS > in a non-Xen environment. > > On the QEMU side, most Xen-specific hacks in the ACPI builder could be Wooot! > dropped, and replaced by adding the new loader commands (though they > may be used only by Xen). And eventually all of the hvmloader built-in ACPI code could be dropped in favor of the loader commands? > > On the Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > are needed in, perhaps, hvmloader. <nods> > > > Haozhong >
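For reference, the checksum recalculation that one of these loader commands directs the guest to perform is simple: an ACPI table carries a checksum byte chosen so that all bytes of the table sum to zero modulo 256. A minimal sketch (the helper name is hypothetical, not from QEMU or hvmloader):

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/*
 * Compute the value to store in an ACPI table's checksum byte so that
 * the bytes of the whole table sum to zero modulo 256.  A loader
 * checksum command would tell the guest which region to sum and which
 * offset within it holds the checksum byte.
 */
static uint8_t acpi_checksum(const uint8_t *table, size_t len, size_t cksum_off)
{
    unsigned sum = 0;

    for (size_t i = 0; i < len; i++) {
        if (i != cksum_off) {        /* skip the checksum byte itself */
            sum += table[i];
        }
    }
    /* value that makes the total sum 0 mod 256 */
    return (uint8_t)(0x100 - (sum & 0xff));
}
```

Any guest-side executor, whether SeaBIOS or hvmloader, would apply the same arithmetic after relocating the table data.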
On 10/12/17 13:39 -0400, Konrad Rzeszutek Wilk wrote: > On Thu, Oct 12, 2017 at 08:45:44PM +0800, Haozhong Zhang wrote: > > On 10/10/17 12:05 -0400, Konrad Rzeszutek Wilk wrote: > > > On Tue, Sep 12, 2017 at 11:15:09AM +0800, Haozhong Zhang wrote: > > > > On 09/11/17 11:52 -0700, Stefano Stabellini wrote: > > > > > CC'ing xen-devel, and the Xen tools and x86 maintainers. > > > > > > > > > > On Mon, 11 Sep 2017, Igor Mammedov wrote: > > > > > > On Mon, 11 Sep 2017 12:41:47 +0800 > > > > > > Haozhong Zhang <haozhong.zhang@intel.com> wrote: > > > > > > > > > > > > > These are the QEMU part patches that work with the associated Xen > > > > > > > patches to enable vNVDIMM support for Xen HVM domains. Xen relies on > > > > > > > QEMU to build guest NFIT and NVDIMM namespace devices, and allocate > > > > > > > guest address space for vNVDIMM devices. > > > > > > > > > > > > > > All patches can be found at > > > > > > > Xen: https://github.com/hzzhan9/xen.git nvdimm-rfc-v3 > > > > > > > QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3 > > > > > > > > > > > > > > Patch 1 is to avoid dereferencing the NULL pointer to non-existing > > > > > > > label data, as the Xen side support for labels is not implemented yet. > > > > > > > > > > > > > > Patch 2 & 3 add a memory backend dedicated for Xen usage and a hotplug > > > > > > > memory region for Xen guest, in order to make the existing nvdimm > > > > > > > device plugging path work on Xen. > > > > > > > > > > > > > > Patch 4 - 10 build and copy NFIT from QEMU to Xen guest, when QEMU is > > > > > > > used as the Xen device model. > > > > > > > > > > > > I've skimmed over the patch-set and can't say that I'm happy with > > > > > > number of xen_enabled() invariants it introduced as well as > > > > > > with partial blobs it creates. > > > > > > > > > > I have not read the series (Haozhong, please CC me, Anthony and > > > > > xen-devel to the whole series next time), but yes, indeed.
Let's not add > > > > > more xen_enabled() if possible. > > > > > > > > > > Haozhong, was there a design document thread on xen-devel about this? If > > > > > so, did it reach a conclusion? Was the design accepted? If so, please > > > > > add a link to the design doc in the introductory email, so that > > > > > everybody can read it and be on the same page. > > > > > > > > Yes, there is a design [1] discussed and reviewed. Section 4.3 discussed > > > > the guest ACPI. > > > > > > > > [1] https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html > > > > > > Igor, did you have a chance to read it? > > > > > > .. see below > > > > > > > > > > > > > > > > > > > > I'd like to reduce the above, and a way to do this might be making xen > > > > > > 1. use fw_cfg > > > > > > 2. fetch QEMU-built acpi tables from fw_cfg > > > > > > 3. extract nvdimm tables (which is trivial) and use them > > > > > > > > > > > > looking at xen_load_linux(), it seems possible to use fw_cfg. > > > > > > > > > > > > So what's stopping xen from using it elsewhere, > > > > > > instead of adding more xen specific code to do 'the same' > > > > > > job and not reusing/sharing common code with tcg/kvm? > > > > > So far, ACPI tables have not been generated by QEMU. Xen HVM machines > > > > > rely on a firmware-like application called "hvmloader" that runs in > > > > > guest context and generates the ACPI tables. I have no opinions on > > > > > hvmloader and I'll let the Xen maintainers talk about it. However, keep > > > > > in mind that with an HVM guest some devices are emulated by Xen and/or > > > > > by other device emulators that can run alongside QEMU. QEMU doesn't have > > > > > a full view of the system. > > > > > > > > > > Here the question is: does QEMU have to be the one to generate the > > > > > ACPI blobs for the nvdimm? It would be nicer if it was up to hvmloader > > > > > like the rest, instead of introducing this split-brain design about > > > > > ACPI.
We need to see a design doc to fully understand this. > > > > > > > > > > > > > hvmloader runs in the guest and is responsible for building/loading guest > > > > ACPI. However, it's not capable of building AML at runtime (for the lack > > > > of an AML builder). If any guest ACPI object is needed (e.g. by guest > > > > DSDT), it has to be generated from ASL by iasl at Xen compile time and > > > > then be loaded by hvmloader at runtime. > > > > > > > > Xen includes an OperationRegion "BIOS" in the statically generated guest > > > > DSDT, whose address is hardcoded and which contains a list of values > > > > filled by hvmloader at runtime. Other ACPI objects can refer to those > > > > values (e.g., the number of vCPUs). But it's not enough for generating > > > > guest NVDIMM ACPI objects at compile time and then being customized > > > > and loaded by hvmloader, because their structure (i.e., the number of > > > > namespace devices) cannot be decided until the guest config is known. > > > > > > > > Alternatively, we may introduce an AML builder in hvmloader and build > > > > all guest ACPI completely in hvmloader. Looking at the similar > > > > implementation in QEMU, it would not be small, compared to the current > > > > size of hvmloader. Besides, I'm still going to let QEMU handle guest > > > > NVDIMM _DSM and _FIT calls, which is another reason I use QEMU to > > > > build NVDIMM ACPI. > > > > > > > > > If the design doc thread led into thinking that it has to be QEMU to > > > > > generate them, then would it make the code nicer if we used fw_cfg to > > > > > get the (full or partial) tables from QEMU, as Igor suggested? > > > > > > > > I'll have a look at the code (which I didn't notice) pointed to by Igor. > > > > > > And there is a spec too! > > > > > > https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt > > > > > > Igor, did you have in mind to use FW_CFG_FILE_DIR to retrieve the > > > ACPI AML code?
> > > > > > > Basically, QEMU builds two ROMs for the guest, /rom@etc/acpi/tables and > > /rom@etc/table-loader. The former is unstructured to the guest, and > > contains all data of the guest ACPI. The latter is a BIOSLinkerLoader > > organized as a set of commands, which direct the guest (e.g., SeaBIOS > > on KVM/QEMU) to relocate data in the former file, recalculate the checksum > > of a specified area, and fill a guest address into a specified ACPI field. > > > > One part of my patches is to implement a mechanism to tell Xen which > > part of ACPI data is a table (NFIT), and which part defines a > > namespace device and what the device name is. I can add two new loader > > commands for them respectively. > > <nods> > > > > Because they just provide information and SeaBIOS in a non-Xen > > environment ignores unrecognized commands, they will not break SeaBIOS > > in a non-Xen environment. > > > > On the QEMU side, most Xen-specific hacks in the ACPI builder could be > > Wooot! > > dropped, and replaced by adding the new loader commands (though they > > may be used only by Xen). > > And eventually all of the hvmloader built-in ACPI code could be dropped > in favor of the loader commands? If Xen is going to rely on QEMU to build the entire ACPI for an HVM guest, then there would be no need to check the signature/name conflict, so the new BIOSLinkerLoader commands here would not be necessary in that case (or only for backwards compatibility). I don't know how much work would be needed, and that could be another project. Haozhong > > > > > On the Xen side, a fw_cfg driver and a BIOSLinkerLoader command executor > > are needed in, perhaps, hvmloader. > > <nods> > > > > > > Haozhong > > >
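To illustrate the command-executor idea discussed above: the /rom@etc/table-loader blob is a sequence of fixed-size entries that a guest-side executor walks one by one, acting on the commands it recognizes and skipping the rest, which is how SeaBIOS tolerates new commands such as the two proposed here. The sketch below uses a deliberately simplified, hypothetical entry layout (the real layout is defined by QEMU's BIOSLinkerLoader code, not reproduced here); only the dispatch-and-skip structure is the point.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical, reduced command set; QEMU's real command values and
 * entry layout differ. */
enum {
    LDR_ALLOCATE     = 1,   /* load a fw_cfg file into guest memory */
    LDR_ADD_POINTER  = 2,   /* patch a guest address into a loaded blob */
    LDR_ADD_CHECKSUM = 3,   /* recompute a checksum over a region */
};

struct ldr_entry {
    uint32_t command;       /* one of the LDR_* values, or unknown */
    char     file[56];      /* fw_cfg file name the command refers to */
    uint32_t arg[3];        /* command-specific offsets/sizes */
};

/*
 * Walk the command stream, counting each recognized command; a real
 * executor would instead allocate, patch pointers, and fix checksums
 * as directed.  Unrecognized command values are silently skipped,
 * mirroring how the guest firmware stays compatible with loader
 * commands it does not know about.
 */
static void ldr_walk(const struct ldr_entry *e, size_t n, unsigned cnt[4])
{
    for (size_t i = 0; i < n; i++) {
        if (e[i].command >= LDR_ALLOCATE && e[i].command <= LDR_ADD_CHECKSUM) {
            cnt[e[i].command]++;
        }
    }
}
```

An hvmloader-side executor would follow the same loop after reading the blob through a fw_cfg driver, which is the remaining piece the thread identifies as needed on the Xen side.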