[PATCH v7 0/5] QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0

Jonathan Cameron via posted 5 patches 1 year, 7 months ago
Failed in applying to current master (apply log)
MAINTAINERS                    |   7 +
hw/cxl/cxl-cdat.c              | 222 ++++++++++++++++++++
hw/cxl/meson.build             |   1 +
hw/mem/cxl_type3.c             | 236 +++++++++++++++++++++
hw/pci-bridge/cxl_upstream.c   | 182 +++++++++++++++-
hw/pci/meson.build             |   1 +
hw/pci/pcie_doe.c              | 367 +++++++++++++++++++++++++++++++++
include/hw/cxl/cxl_cdat.h      | 166 +++++++++++++++
include/hw/cxl/cxl_component.h |   7 +
include/hw/cxl/cxl_device.h    |   3 +
include/hw/cxl/cxl_pci.h       |   1 +
include/hw/pci/pci_ids.h       |   3 +
include/hw/pci/pcie.h          |   1 +
include/hw/pci/pcie_doe.h      | 123 +++++++++++
include/hw/pci/pcie_regs.h     |   4 +
15 files changed, 1323 insertions(+), 1 deletion(-)
create mode 100644 hw/cxl/cxl-cdat.c
create mode 100644 hw/pci/pcie_doe.c
create mode 100644 include/hw/cxl/cxl_cdat.h
create mode 100644 include/hw/pci/pcie_doe.h
[PATCH v7 0/5] QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0
Posted by Jonathan Cameron via 1 year, 7 months ago
Whilst I have carried on Huai-Cheng Kuo's series version numbering and
naming, there have been very substantial changes since v6 so I would
suggest fresh review makes sense for anyone who has looked at this before.
In particularly if the Avery design folks could check I haven't broken
anything that would be great.

For reference v6: QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0
https://lore.kernel.org/qemu-devel/1623330943-18290-1-git-send-email-cbrowy@avery-design.com/

Summary of changes:
1) Linux headers definitions for DOE are now upstream so drop that patch.
2) Add CDAT for switch upstream port.
3) Generate 'plausible' default CDAT tables when a file is not provided.
4) General refactoring to calculate the correct table sizes and allocate
   based on that rather than copying from a local static array.
5) Changes from earlier reviews such as matching QEMU type naming style.
6) Moved compliance and SPDM usecases to future patch sets.

Sign-offs on these are complex because the patches were originally developed
by Huai-Cheng Kuo, but posted by Chris Browy and then picked up by Jonathan
Cameron who made substantial changes.

Huai-Cheng Kuo / Chris Browy, please confirm you are still happy to maintain this
code as per the original MAINTAINERS entry.

What's here?

This series brings generic PCI Express Data Object Exchange support (DOE)
DOE is defined in the PCIe Base Spec r6.0. It consists of a mailbox in PCI
config space via a PCIe Extended Capability Structure.
The PCIe spec defines several protocols (including one to discover what
protocols a given DOE instance supports) and other specification such as
CXL define additional protocols using their own vendor IDs.

In this series we make use of the DOE to support the CXL spec defined
Table Access Protocol, specifically to provide access to CDAT - a
table specified in a specification that is hosted by the UEFI forum
and is used to provide runtime discoverability of the sort of information
that would otherwise be available in firmware tables (memory types,
latency and bandwidth information etc).

The Linux kernel gained support for DOE / CDAT on CXL type 3 EPs in 6.0.
The version merged did not support interrupts (earlier versions did
so that support in the emulation was tested a while back).

This series provides CDAT emulation for CXL switch upstream ports
and CXL type 3 memory devices. Note that to exercise the switch support
additional Linux kernel patches are needed.
https://lore.kernel.org/linux-cxl/20220503153449.4088-1-Jonathan.Cameron@huawei.com/
(I'll post a new version of that support shortly)

Additional protocols will be supported by follow on patch sets:
* CXL compliance protocol.
* CMA / SPDM device attestation.
(Old version at https://gitlab.com/jic23/qemu/-/commits/cxl-next - will refresh
that tree next week)

Huai-Cheng Kuo (3):
  hw/pci: PCIe Data Object Exchange emulation
  hw/cxl/cdat: CXL CDAT Data Object Exchange implementation
  hw/mem/cxl-type3: Add CXL CDAT Data Object Exchange

Jonathan Cameron (2):
  hw/mem/cxl-type3: Add MSIX support
  hw/pci-bridge/cxl-upstream: Add a CDAT table access DOE

 MAINTAINERS                    |   7 +
 hw/cxl/cxl-cdat.c              | 222 ++++++++++++++++++++
 hw/cxl/meson.build             |   1 +
 hw/mem/cxl_type3.c             | 236 +++++++++++++++++++++
 hw/pci-bridge/cxl_upstream.c   | 182 +++++++++++++++-
 hw/pci/meson.build             |   1 +
 hw/pci/pcie_doe.c              | 367 +++++++++++++++++++++++++++++++++
 include/hw/cxl/cxl_cdat.h      | 166 +++++++++++++++
 include/hw/cxl/cxl_component.h |   7 +
 include/hw/cxl/cxl_device.h    |   3 +
 include/hw/cxl/cxl_pci.h       |   1 +
 include/hw/pci/pci_ids.h       |   3 +
 include/hw/pci/pcie.h          |   1 +
 include/hw/pci/pcie_doe.h      | 123 +++++++++++
 include/hw/pci/pcie_regs.h     |   4 +
 15 files changed, 1323 insertions(+), 1 deletion(-)
 create mode 100644 hw/cxl/cxl-cdat.c
 create mode 100644 hw/pci/pcie_doe.c
 create mode 100644 include/hw/cxl/cxl_cdat.h
 create mode 100644 include/hw/pci/pcie_doe.h

-- 
2.37.2
Re: [PATCH v7 0/5] QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0
Posted by Jonathan Cameron via 1 year, 7 months ago
On Fri, 7 Oct 2022 16:21:51 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> Whilst I have carried on Huai-Cheng Kuo's series version numbering and
> naming, there have been very substantial changes since v6 so I would
> suggest fresh review makes sense for anyone who has looked at this before.
> In particularly if the Avery design folks could check I haven't broken
> anything that would be great.

I forgot to run checkpatch on these and there is some white space that
will need cleaning up and one instance of missing brackets.
As that doesn't greatly affect review, I'll wait for a few days to see
if there is other feedback to incorporate in v8.

Sorry for the resulting noise!

These are now available at
https://gitlab.com/jic23/qemu/-/commits/cxl-2022-10-09
along with a bunch of other CXL features:
* Compliance DOE protocol
* SPDM / CMA over DOE supprot
* ARM64 support in general.
* Various small emulation additions.
* CPMU support

I'll add a few more features to similarly named branches over the next
week or so including initial support for standalone switch CCI mailboxes.

Jonathan
 
> 
> For reference v6: QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0
> https://lore.kernel.org/qemu-devel/1623330943-18290-1-git-send-email-cbrowy@avery-design.com/
> 
> Summary of changes:
> 1) Linux headers definitions for DOE are now upstream so drop that patch.
> 2) Add CDAT for switch upstream port.
> 3) Generate 'plausible' default CDAT tables when a file is not provided.
> 4) General refactoring to calculate the correct table sizes and allocate
>    based on that rather than copying from a local static array.
> 5) Changes from earlier reviews such as matching QEMU type naming style.
> 6) Moved compliance and SPDM usecases to future patch sets.
> 
> Sign-offs on these are complex because the patches were originally developed
> by Huai-Cheng Kuo, but posted by Chris Browy and then picked up by Jonathan
> Cameron who made substantial changes.
> 
> Huai-Cheng Kuo / Chris Browy, please confirm you are still happy to maintain this
> code as per the original MAINTAINERS entry.
> 
> What's here?
> 
> This series brings generic PCI Express Data Object Exchange support (DOE)
> DOE is defined in the PCIe Base Spec r6.0. It consists of a mailbox in PCI
> config space via a PCIe Extended Capability Structure.
> The PCIe spec defines several protocols (including one to discover what
> protocols a given DOE instance supports) and other specification such as
> CXL define additional protocols using their own vendor IDs.
> 
> In this series we make use of the DOE to support the CXL spec defined
> Table Access Protocol, specifically to provide access to CDAT - a
> table specified in a specification that is hosted by the UEFI forum
> and is used to provide runtime discoverability of the sort of information
> that would otherwise be available in firmware tables (memory types,
> latency and bandwidth information etc).
> 
> The Linux kernel gained support for DOE / CDAT on CXL type 3 EPs in 6.0.
> The version merged did not support interrupts (earlier versions did
> so that support in the emulation was tested a while back).
> 
> This series provides CDAT emulation for CXL switch upstream ports
> and CXL type 3 memory devices. Note that to exercise the switch support
> additional Linux kernel patches are needed.
> https://lore.kernel.org/linux-cxl/20220503153449.4088-1-Jonathan.Cameron@huawei.com/
> (I'll post a new version of that support shortly)
> 
> Additional protocols will be supported by follow on patch sets:
> * CXL compliance protocol.
> * CMA / SPDM device attestation.
> (Old version at https://gitlab.com/jic23/qemu/-/commits/cxl-next - will refresh
> that tree next week)
> 
> Huai-Cheng Kuo (3):
>   hw/pci: PCIe Data Object Exchange emulation
>   hw/cxl/cdat: CXL CDAT Data Object Exchange implementation
>   hw/mem/cxl-type3: Add CXL CDAT Data Object Exchange
> 
> Jonathan Cameron (2):
>   hw/mem/cxl-type3: Add MSIX support
>   hw/pci-bridge/cxl-upstream: Add a CDAT table access DOE
> 
>  MAINTAINERS                    |   7 +
>  hw/cxl/cxl-cdat.c              | 222 ++++++++++++++++++++
>  hw/cxl/meson.build             |   1 +
>  hw/mem/cxl_type3.c             | 236 +++++++++++++++++++++
>  hw/pci-bridge/cxl_upstream.c   | 182 +++++++++++++++-
>  hw/pci/meson.build             |   1 +
>  hw/pci/pcie_doe.c              | 367 +++++++++++++++++++++++++++++++++
>  include/hw/cxl/cxl_cdat.h      | 166 +++++++++++++++
>  include/hw/cxl/cxl_component.h |   7 +
>  include/hw/cxl/cxl_device.h    |   3 +
>  include/hw/cxl/cxl_pci.h       |   1 +
>  include/hw/pci/pci_ids.h       |   3 +
>  include/hw/pci/pcie.h          |   1 +
>  include/hw/pci/pcie_doe.h      | 123 +++++++++++
>  include/hw/pci/pcie_regs.h     |   4 +
>  15 files changed, 1323 insertions(+), 1 deletion(-)
>  create mode 100644 hw/cxl/cxl-cdat.c
>  create mode 100644 hw/pci/pcie_doe.c
>  create mode 100644 include/hw/cxl/cxl_cdat.h
>  create mode 100644 include/hw/pci/pcie_doe.h
>
Re: [PATCH v7 0/5] QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0
Posted by Huai-Cheng 1 year, 7 months ago
Hi Jonathan,

We've reviewed the patches related to DOE and everything looks good. And we
are glad to maintain the code as the maintainers.

Thanks for applying the changes.

Best Regards,
Huai-Cheng Kuo

On Mon, Oct 10, 2022 at 6:30 PM Jonathan Cameron <
Jonathan.Cameron@huawei.com> wrote:

> On Fri, 7 Oct 2022 16:21:51 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>
> > Whilst I have carried on Huai-Cheng Kuo's series version numbering and
> > naming, there have been very substantial changes since v6 so I would
> > suggest fresh review makes sense for anyone who has looked at this
> before.
> > In particularly if the Avery design folks could check I haven't broken
> > anything that would be great.
>
> I forgot to run checkpatch on these and there is some white space that
> will need cleaning up and one instance of missing brackets.
> As that doesn't greatly affect review, I'll wait for a few days to see
> if there is other feedback to incorporate in v8.
>
> Sorry for the resulting noise!
>
> These are now available at
> https://gitlab.com/jic23/qemu/-/commits/cxl-2022-10-09
> along with a bunch of other CXL features:
> * Compliance DOE protocol
> * SPDM / CMA over DOE supprot
> * ARM64 support in general.
> * Various small emulation additions.
> * CPMU support
>
> I'll add a few more features to similarly named branches over the next
> week or so including initial support for standalone switch CCI mailboxes.
>
> Jonathan
>
> >
> > For reference v6: QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0
> >
> https://lore.kernel.org/qemu-devel/1623330943-18290-1-git-send-email-cbrowy@avery-design.com/
> >
> > Summary of changes:
> > 1) Linux headers definitions for DOE are now upstream so drop that patch.
> > 2) Add CDAT for switch upstream port.
> > 3) Generate 'plausible' default CDAT tables when a file is not provided.
> > 4) General refactoring to calculate the correct table sizes and allocate
> >    based on that rather than copying from a local static array.
> > 5) Changes from earlier reviews such as matching QEMU type naming style.
> > 6) Moved compliance and SPDM usecases to future patch sets.
> >
> > Sign-offs on these are complex because the patches were originally
> developed
> > by Huai-Cheng Kuo, but posted by Chris Browy and then picked up by
> Jonathan
> > Cameron who made substantial changes.
> >
> > Huai-Cheng Kuo / Chris Browy, please confirm you are still happy to
> maintain this
> > code as per the original MAINTAINERS entry.
> >
> > What's here?
> >
> > This series brings generic PCI Express Data Object Exchange support (DOE)
> > DOE is defined in the PCIe Base Spec r6.0. It consists of a mailbox in
> PCI
> > config space via a PCIe Extended Capability Structure.
> > The PCIe spec defines several protocols (including one to discover what
> > protocols a given DOE instance supports) and other specification such as
> > CXL define additional protocols using their own vendor IDs.
> >
> > In this series we make use of the DOE to support the CXL spec defined
> > Table Access Protocol, specifically to provide access to CDAT - a
> > table specified in a specification that is hosted by the UEFI forum
> > and is used to provide runtime discoverability of the sort of information
> > that would otherwise be available in firmware tables (memory types,
> > latency and bandwidth information etc).
> >
> > The Linux kernel gained support for DOE / CDAT on CXL type 3 EPs in 6.0.
> > The version merged did not support interrupts (earlier versions did
> > so that support in the emulation was tested a while back).
> >
> > This series provides CDAT emulation for CXL switch upstream ports
> > and CXL type 3 memory devices. Note that to exercise the switch support
> > additional Linux kernel patches are needed.
> >
> https://lore.kernel.org/linux-cxl/20220503153449.4088-1-Jonathan.Cameron@huawei.com/
> > (I'll post a new version of that support shortly)
> >
> > Additional protocols will be supported by follow on patch sets:
> > * CXL compliance protocol.
> > * CMA / SPDM device attestation.
> > (Old version at https://gitlab.com/jic23/qemu/-/commits/cxl-next - will
> refresh
> > that tree next week)
> >
> > Huai-Cheng Kuo (3):
> >   hw/pci: PCIe Data Object Exchange emulation
> >   hw/cxl/cdat: CXL CDAT Data Object Exchange implementation
> >   hw/mem/cxl-type3: Add CXL CDAT Data Object Exchange
> >
> > Jonathan Cameron (2):
> >   hw/mem/cxl-type3: Add MSIX support
> >   hw/pci-bridge/cxl-upstream: Add a CDAT table access DOE
> >
> >  MAINTAINERS                    |   7 +
> >  hw/cxl/cxl-cdat.c              | 222 ++++++++++++++++++++
> >  hw/cxl/meson.build             |   1 +
> >  hw/mem/cxl_type3.c             | 236 +++++++++++++++++++++
> >  hw/pci-bridge/cxl_upstream.c   | 182 +++++++++++++++-
> >  hw/pci/meson.build             |   1 +
> >  hw/pci/pcie_doe.c              | 367 +++++++++++++++++++++++++++++++++
> >  include/hw/cxl/cxl_cdat.h      | 166 +++++++++++++++
> >  include/hw/cxl/cxl_component.h |   7 +
> >  include/hw/cxl/cxl_device.h    |   3 +
> >  include/hw/cxl/cxl_pci.h       |   1 +
> >  include/hw/pci/pci_ids.h       |   3 +
> >  include/hw/pci/pcie.h          |   1 +
> >  include/hw/pci/pcie_doe.h      | 123 +++++++++++
> >  include/hw/pci/pcie_regs.h     |   4 +
> >  15 files changed, 1323 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/cxl/cxl-cdat.c
> >  create mode 100644 hw/pci/pcie_doe.c
> >  create mode 100644 include/hw/cxl/cxl_cdat.h
> >  create mode 100644 include/hw/pci/pcie_doe.h
> >
>
>
[PATCH 0/5] Multi-Region and Volatile Memory support for CXL Type-3 Devices
Posted by Gregory Price 1 year, 7 months ago
Summary of Changes:
1) Correction of PCI_CLASS from STORAGE_EXPRESS to MEMORY_CXL on init
2) Add CXL_CAPACITY_MULTIPLIER definition to replace magic numbers
3) Refactor CDAT DSMAS Initialization for multi-region initialization
4) Multi-Region and Volatile Memory support for CXL Type-3 Devices
5) Test and Documentation updates

Developed with input from Jonathan Cameron and Davidloh Bueso.

This series brings 2 features to CXL Type-3 Devices:
    1) Volatile Memory Region support
    2) Multi-Region support (1 Volatile, 1 Persistent)

In this series we implement multi-region and volatile region support
through 6 major changes to CXL devices
    1) The HostMemoryBackend [hostmem] has been replaced by two
       [hostvmem] and [hostpmem] to store volatile and persistent memory
       respectively
    2) The single AddressSpace has been replaced by two AddressSpaces
       [hostvmem_as] and [hostpmem_as] to map respective memdevs.
    3) Each memory region size and total region are stored separately
    4) The CDAT and DVSEC memory map entries have been updated:
       a) if vmem is present, vmem is mapped at DPA(0)
       b) if pmem is present
          i)  and vmem is present, pmem is mapped at DPA(vmem->size)
          ii) else, pmem is mapped at DPA(0)
       c) partitioning of pmem is not supported in this patch set but
          has been discussed and this design should suffice.
    5) Read/Write functions have been updated to access AddressSpaces
       according to the mapping described in #4
    6) cxl-mailbox has been updated to report the respective size of
       volatile and persistent memory regions

CXL Spec (3.0) Section 8.2.9.8.2.0 - Get Partition Info
  Active Volatile Memory
    The device shall provide this volatile capacity starting at DPA 0
  Active Persistent Memory
    The device shall provide this persistent capacity starting at the
    DPA immediately following the volatile capacity

Partitioning of Persistent Memory regions may be supported on following
patch sets.

Submitted as an extention to the CDAT emulation because the CDAT DSMAS
entry concerns memory mapping and is required to successfully map memory
regions correctly in bios/efi.

Gregory Price (5):
  hw/cxl: set cxl-type3 device type to PCI_CLASS_MEMORY_CXL
  hw/cxl: Add CXL_CAPACITY_MULTIPLIER definition
  hw/mem/cxl_type: Generalize CDATDsmas initialization for Memory
    Regions
  hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)
  cxl: update tests and documentation for new cxl properties

 docs/system/devices/cxl.rst |  53 ++++-
 hw/cxl/cxl-mailbox-utils.c  |  23 +-
 hw/mem/cxl_type3.c          | 449 +++++++++++++++++++++++-------------
 include/hw/cxl/cxl_device.h |  11 +-
 tests/qtest/cxl-test.c      |  81 ++++++-
 5 files changed, 416 insertions(+), 201 deletions(-)

-- 
2.37.3
Re: [PATCH 0/5] Multi-Region and Volatile Memory support for CXL Type-3 Devices
Posted by Michael S. Tsirkin 1 year, 7 months ago
On Tue, Oct 11, 2022 at 05:19:11PM -0400, Gregory Price wrote:
> Summary of Changes:
> 1) Correction of PCI_CLASS from STORAGE_EXPRESS to MEMORY_CXL on init
> 2) Add CXL_CAPACITY_MULTIPLIER definition to replace magic numbers
> 3) Refactor CDAT DSMAS Initialization for multi-region initialization
> 4) Multi-Region and Volatile Memory support for CXL Type-3 Devices
> 5) Test and Documentation updates
> 
> Developed with input from Jonathan Cameron and Davidloh Bueso.
> 
> This series brings 2 features to CXL Type-3 Devices:
>     1) Volatile Memory Region support
>     2) Multi-Region support (1 Volatile, 1 Persistent)
> 
> In this series we implement multi-region and volatile region support
> through 6 major changes to CXL devices
>     1) The HostMemoryBackend [hostmem] has been replaced by two
>        [hostvmem] and [hostpmem] to store volatile and persistent memory
>        respectively
>     2) The single AddressSpace has been replaced by two AddressSpaces
>        [hostvmem_as] and [hostpmem_as] to map respective memdevs.
>     3) Each memory region size and total region are stored separately
>     4) The CDAT and DVSEC memory map entries have been updated:
>        a) if vmem is present, vmem is mapped at DPA(0)
>        b) if pmem is present
>           i)  and vmem is present, pmem is mapped at DPA(vmem->size)
>           ii) else, pmem is mapped at DPA(0)
>        c) partitioning of pmem is not supported in this patch set but
>           has been discussed and this design should suffice.
>     5) Read/Write functions have been updated to access AddressSpaces
>        according to the mapping described in #4
>     6) cxl-mailbox has been updated to report the respective size of
>        volatile and persistent memory regions
> 
> CXL Spec (3.0) Section 8.2.9.8.2.0 - Get Partition Info
>   Active Volatile Memory
>     The device shall provide this volatile capacity starting at DPA 0
>   Active Persistent Memory
>     The device shall provide this persistent capacity starting at the
>     DPA immediately following the volatile capacity
> 
> Partitioning of Persistent Memory regions may be supported on following
> patch sets.
> 
> Submitted as an extention to the CDAT emulation because the CDAT DSMAS
> entry concerns memory mapping and is required to successfully map memory
> regions correctly in bios/efi.

As there will be v8 of CDAT patches I expect there will be a rebase
of this patchset too.

> Gregory Price (5):
>   hw/cxl: set cxl-type3 device type to PCI_CLASS_MEMORY_CXL
>   hw/cxl: Add CXL_CAPACITY_MULTIPLIER definition
>   hw/mem/cxl_type: Generalize CDATDsmas initialization for Memory
>     Regions
>   hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)
>   cxl: update tests and documentation for new cxl properties
> 
>  docs/system/devices/cxl.rst |  53 ++++-
>  hw/cxl/cxl-mailbox-utils.c  |  23 +-
>  hw/mem/cxl_type3.c          | 449 +++++++++++++++++++++++-------------
>  include/hw/cxl/cxl_device.h |  11 +-
>  tests/qtest/cxl-test.c      |  81 ++++++-
>  5 files changed, 416 insertions(+), 201 deletions(-)
> 
> -- 
> 2.37.3
[PATCH 1/5] hw/cxl: set cxl-type3 device type to PCI_CLASS_MEMORY_CXL
Posted by Gregory Price 1 year, 7 months ago
Current code sets to STORAGE_EXPRESS and then overrides it.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/mem/cxl_type3.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 3e7ca7a455..282f274266 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -535,7 +535,6 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
     }
 
     pci_config_set_prog_interface(pci_conf, 0x10);
-    pci_config_set_class(pci_conf, PCI_CLASS_MEMORY_CXL);
 
     pcie_endpoint_cap_init(pci_dev, 0x80);
     if (ct3d->sn != UI64_NULL) {
@@ -763,7 +762,7 @@ static void ct3_class_init(ObjectClass *oc, void *data)
     pc->config_read = ct3d_config_read;
     pc->realize = ct3_realize;
     pc->exit = ct3_exit;
-    pc->class_id = PCI_CLASS_STORAGE_EXPRESS;
+    pc->class_id = PCI_CLASS_MEMORY_CXL;
     pc->vendor_id = PCI_VENDOR_ID_INTEL;
     pc->device_id = 0xd93; /* LVF for now */
     pc->revision = 1;
-- 
2.37.3
[PATCH 2/5] hw/cxl: Add CXL_CAPACITY_MULTIPLIER definition
Posted by Gregory Price 1 year, 7 months ago
Remove usage of magic numbers when accessing capacity fields and replace
with CXL_CAPACITY_MULTIPLIER, matching the kernel definition.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
---
 hw/cxl/cxl-mailbox-utils.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index c7e1a88b44..776c8cbadc 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -14,6 +14,8 @@
 #include "qemu/log.h"
 #include "qemu/uuid.h"
 
+#define CXL_CAPACITY_MULTIPLIER   0x10000000 /* SZ_256M */
+
 /*
  * How to add a new command, example. The command set FOO, with cmd BAR.
  *  1. Add the command set and cmd to the enum.
@@ -140,7 +142,7 @@ static ret_code cmd_firmware_update_get_info(struct cxl_cmd *cmd,
     } QEMU_PACKED *fw_info;
     QEMU_BUILD_BUG_ON(sizeof(*fw_info) != 0x50);
 
-    if (cxl_dstate->pmem_size < (256 << 20)) {
+    if (cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER) {
         return CXL_MBOX_INTERNAL_ERROR;
     }
 
@@ -285,7 +287,7 @@ static ret_code cmd_identify_memory_device(struct cxl_cmd *cmd,
     CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d);
     uint64_t size = cxl_dstate->pmem_size;
 
-    if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
+    if (!QEMU_IS_ALIGNED(size, CXL_CAPACITY_MULTIPLIER)) {
         return CXL_MBOX_INTERNAL_ERROR;
     }
 
@@ -295,8 +297,8 @@ static ret_code cmd_identify_memory_device(struct cxl_cmd *cmd,
     /* PMEM only */
     snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
 
-    id->total_capacity = size / (256 << 20);
-    id->persistent_capacity = size / (256 << 20);
+    id->total_capacity = size / CXL_CAPACITY_MULTIPLIER;
+    id->persistent_capacity = size / CXL_CAPACITY_MULTIPLIER;
     id->lsa_size = cvc->get_lsa_size(ct3d);
     id->poison_list_max_mer[1] = 0x1; /* 256 poison records */
 
@@ -317,14 +319,14 @@ static ret_code cmd_ccls_get_partition_info(struct cxl_cmd *cmd,
     QEMU_BUILD_BUG_ON(sizeof(*part_info) != 0x20);
     uint64_t size = cxl_dstate->pmem_size;
 
-    if (!QEMU_IS_ALIGNED(size, 256 << 20)) {
+    if (!QEMU_IS_ALIGNED(size, CXL_CAPACITY_MULTIPLIER)) {
         return CXL_MBOX_INTERNAL_ERROR;
     }
 
     /* PMEM only */
     part_info->active_vmem = 0;
     part_info->next_vmem = 0;
-    part_info->active_pmem = size / (256 << 20);
+    part_info->active_pmem = size / CXL_CAPACITY_MULTIPLIER;
     part_info->next_pmem = 0;
 
     *len = sizeof(*part_info);
-- 
2.37.3
[PATCH 3/5] hw/mem/cxl_type: Generalize CDATDsmas initialization for Memory Regions
Posted by Gregory Price 1 year, 7 months ago
This is a preparatory commit for enabling multiple memory regions within
a single CXL Type-3 device.  We will need to initialize multiple CDAT
DSMAS regions (and subsequent DSLBIS, and DSEMTS entries), so generalize
the intialization into a function.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
 hw/mem/cxl_type3.c | 275 +++++++++++++++++++++++++--------------------
 1 file changed, 154 insertions(+), 121 deletions(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 282f274266..dda78704c2 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -24,145 +24,178 @@
 #define UI64_NULL ~(0ULL)
 #define DWORD_BYTE 4
 
+static int ct3_build_dsmas(CDATDsmas *dsmas,
+                           CDATDslbis *dslbis,
+                           CDATDsemts *dsemts,
+                           MemoryRegion *mr,
+                           int dsmad_handle,
+                           bool is_pmem,
+                           uint64_t dpa_base)
+{
+    int len = 0;
+    /* ttl_len should be incremented for every entry */
+
+    /* Device Scoped Memory Affinity Structure */
+    *dsmas = (CDATDsmas) {
+        .header = {
+            .type = CDAT_TYPE_DSMAS,
+            .length = sizeof(*dsmas),
+        },
+        .DSMADhandle = dsmad_handle,
+        .flags = (is_pmem ? CDAT_DSMAS_FLAG_NV : 0),
+        .DPA_base = dpa_base,
+        .DPA_length = int128_get64(mr->size),
+    };
+    len++;
+
+    /* For now, no memory side cache, plausiblish numbers */
+    dslbis[0] = (CDATDslbis) {
+        .header = {
+            .type = CDAT_TYPE_DSLBIS,
+            .length = sizeof(*dslbis),
+        },
+        .handle = dsmad_handle,
+        .flags = HMAT_LB_MEM_MEMORY,
+        .data_type = HMAT_LB_DATA_READ_LATENCY,
+        .entry_base_unit = 10000, /* 10ns base */
+        .entry[0] = 15, /* 150ns */
+    };
+    len++;
+
+    dslbis[1] = (CDATDslbis) {
+        .header = {
+            .type = CDAT_TYPE_DSLBIS,
+            .length = sizeof(*dslbis),
+        },
+        .handle = dsmad_handle,
+        .flags = HMAT_LB_MEM_MEMORY,
+        .data_type = HMAT_LB_DATA_WRITE_LATENCY,
+        .entry_base_unit = 10000,
+        .entry[0] = 25, /* 250ns */
+    };
+    len++;
+
+    dslbis[2] = (CDATDslbis) {
+        .header = {
+            .type = CDAT_TYPE_DSLBIS,
+            .length = sizeof(*dslbis),
+            },
+        .handle = dsmad_handle,
+        .flags = HMAT_LB_MEM_MEMORY,
+        .data_type = HMAT_LB_DATA_READ_BANDWIDTH,
+        .entry_base_unit = 1000, /* GB/s */
+        .entry[0] = 16,
+    };
+    len++;
+
+    dslbis[3] = (CDATDslbis) {
+        .header = {
+            .type = CDAT_TYPE_DSLBIS,
+            .length = sizeof(*dslbis),
+        },
+        .handle = dsmad_handle,
+        .flags = HMAT_LB_MEM_MEMORY,
+        .data_type = HMAT_LB_DATA_WRITE_BANDWIDTH,
+        .entry_base_unit = 1000, /* GB/s */
+        .entry[0] = 16,
+    };
+    len++;
+
+    *dsemts = (CDATDsemts) {
+        .header = {
+            .type = CDAT_TYPE_DSEMTS,
+            .length = sizeof(*dsemts),
+        },
+        .DSMAS_handle = dsmad_handle,
+        /* EFI_MEMORY_NV implies EfiReservedMemoryType */
+        .EFI_memory_type_attr = is_pmem ? 2 : 0,
+        /* Reserved - the non volatile from DSMAS matters */
+        .DPA_offset = 0,
+        .DPA_length = int128_get64(mr->size),
+    };
+    len++;
+    return len;
+}
+
 static int ct3_build_cdat_table(CDATSubHeader ***cdat_table,
                                 void *priv)
 {
-    g_autofree CDATDsmas *dsmas_nonvolatile = NULL;
-    g_autofree CDATDslbis *dslbis_nonvolatile = NULL;
-    g_autofree CDATDsemts *dsemts_nonvolatile = NULL;
+    g_autofree CDATDsmas *dsmas = NULL;
+    g_autofree CDATDslbis *dslbis = NULL;
+    g_autofree CDATDsemts *dsemts = NULL;
     CXLType3Dev *ct3d = priv;
-    int len = 0;
-    int i = 0;
-    int next_dsmad_handle = 0;
-    int nonvolatile_dsmad = -1;
-    int dslbis_nonvolatile_num = 4;
+    int cdat_len = 0;
+    int cdat_idx = 0, sub_idx = 0;
+    int dsmas_num, dslbis_num, dsemts_num;
+    int dsmad_handle = 0;
+    uint64_t dpa_base = 0;
     MemoryRegion *mr;
 
-    /* Non volatile aspects */
-    if (ct3d->hostmem) {
-        dsmas_nonvolatile = g_malloc(sizeof(*dsmas_nonvolatile));
-        if (!dsmas_nonvolatile) {
-            return -ENOMEM;
-        }
-        nonvolatile_dsmad = next_dsmad_handle++;
-        mr = host_memory_backend_get_memory(ct3d->hostmem);
-        if (!mr) {
-            return -EINVAL;
-        }
-        *dsmas_nonvolatile = (CDATDsmas) {
-            .header = {
-                .type = CDAT_TYPE_DSMAS,
-                .length = sizeof(*dsmas_nonvolatile),
-            },
-            .DSMADhandle = nonvolatile_dsmad,
-            .flags = CDAT_DSMAS_FLAG_NV,
-            .DPA_base = 0,
-            .DPA_length = int128_get64(mr->size),
-        };
-        len++;
-
-        /* For now, no memory side cache, plausiblish numbers */
-        dslbis_nonvolatile = g_malloc(sizeof(*dslbis_nonvolatile) * dslbis_nonvolatile_num);
-        if (!dslbis_nonvolatile)
-            return -ENOMEM;
-
-        dslbis_nonvolatile[0] = (CDATDslbis) {
-            .header = {
-                .type = CDAT_TYPE_DSLBIS,
-                .length = sizeof(*dslbis_nonvolatile),
-            },
-            .handle = nonvolatile_dsmad,
-            .flags = HMAT_LB_MEM_MEMORY,
-            .data_type = HMAT_LB_DATA_READ_LATENCY,
-            .entry_base_unit = 10000, /* 10ns base */
-            .entry[0] = 15, /* 150ns */
-        };
-        len++;
-
-        dslbis_nonvolatile[1] = (CDATDslbis) {
-            .header = {
-                .type = CDAT_TYPE_DSLBIS,
-                .length = sizeof(*dslbis_nonvolatile),
-            },
-            .handle = nonvolatile_dsmad,
-            .flags = HMAT_LB_MEM_MEMORY,
-            .data_type = HMAT_LB_DATA_WRITE_LATENCY,
-            .entry_base_unit = 10000,
-            .entry[0] = 25, /* 250ns */
-        };
-        len++;
-       
-        dslbis_nonvolatile[2] = (CDATDslbis) {
-            .header = {
-                .type = CDAT_TYPE_DSLBIS,
-                .length = sizeof(*dslbis_nonvolatile),
-            },
-            .handle = nonvolatile_dsmad,
-            .flags = HMAT_LB_MEM_MEMORY,
-            .data_type = HMAT_LB_DATA_READ_BANDWIDTH,
-            .entry_base_unit = 1000, /* GB/s */
-            .entry[0] = 16,
-        };
-        len++;
-
-        dslbis_nonvolatile[3] = (CDATDslbis) {
-            .header = {
-                .type = CDAT_TYPE_DSLBIS,
-                .length = sizeof(*dslbis_nonvolatile),
-            },
-            .handle = nonvolatile_dsmad,
-            .flags = HMAT_LB_MEM_MEMORY,
-            .data_type = HMAT_LB_DATA_WRITE_BANDWIDTH,
-            .entry_base_unit = 1000, /* GB/s */
-            .entry[0] = 16,
-        };
-        len++;
-
-        mr = host_memory_backend_get_memory(ct3d->hostmem);
-        if (!mr) {
-            return -EINVAL;
-        }
-        dsemts_nonvolatile = g_malloc(sizeof(*dsemts_nonvolatile));
-        *dsemts_nonvolatile = (CDATDsemts) {
-            .header = {
-                .type = CDAT_TYPE_DSEMTS,
-                .length = sizeof(*dsemts_nonvolatile),
-            },
-            .DSMAS_handle = nonvolatile_dsmad,
-            .EFI_memory_type_attr = 2, /* Reserved - the non volatile from DSMAS matters */
-            .DPA_offset = 0,
-            .DPA_length = int128_get64(mr->size),
-        };
-        len++;
+    if (!ct3d->hostmem | !host_memory_backend_get_memory(ct3d->hostmem)) {
+        return -EINVAL;
+    }
+
+    dsmas_num = 1;
+    dslbis_num = 4 * dsmas_num;
+    dsemts_num = dsmas_num;
+
+    dsmas = g_malloc(sizeof(*dsmas) * dsmas_num);
+    dslbis = g_malloc(sizeof(*dslbis) * dslbis_num);
+    dsemts = g_malloc(sizeof(*dsemts) * dsemts_num);
+
+    if (!dsmas || !dslbis || !dsemts) {
+        return -ENOMEM;
+    }
+
+    mr = host_memory_backend_get_memory(ct3d->hostmem);
+    cdat_len += ct3_build_dsmas(&dsmas[dsmad_handle],
+                                &dslbis[4 * dsmad_handle],
+                                &dsemts[dsmad_handle],
+                                mr,
+                                dsmad_handle,
+                                false,
+                                dpa_base);
+    dpa_base += mr->size;
+    dsmad_handle++;
+
+    /* Allocate and fill in the CDAT table */
+    *cdat_table = g_malloc0(cdat_len * sizeof(*cdat_table));
+    if (!*cdat_table) {
+        return -ENOMEM;
     }
 
-    *cdat_table = g_malloc0(len * sizeof(*cdat_table));
     /* Header always at start of structure */
-    if (dsmas_nonvolatile) {
-        (*cdat_table)[i++] = g_steal_pointer(&dsmas_nonvolatile);
+    CDATDsmas *dsmas_ent = g_steal_pointer(&dsmas);
+    for (sub_idx = 0; sub_idx < dsmas_num; sub_idx++) {
+        (*cdat_table)[cdat_idx++] = (CDATSubHeader*)&dsmas_ent[sub_idx];
     }
-    if (dslbis_nonvolatile) {
-        CDATDslbis *dslbis = g_steal_pointer(&dslbis_nonvolatile);        
-        int j;
 
-        for (j = 0; j < dslbis_nonvolatile_num; j++) {
-            (*cdat_table)[i++] = (CDATSubHeader *)&dslbis[j];
-        }
+    CDATDslbis *dslbis_ent = g_steal_pointer(&dslbis);
+    for (sub_idx = 0; sub_idx < dslbis_num; sub_idx++) {
+        (*cdat_table)[cdat_idx++] = (CDATSubHeader*)&dslbis_ent[sub_idx];
     }
-    if (dsemts_nonvolatile) {
-        (*cdat_table)[i++] = g_steal_pointer(&dsemts_nonvolatile);
+
+    CDATDsemts *dsemts_ent = g_steal_pointer(&dsemts);
+    for (sub_idx = 0; sub_idx < dsemts_num; sub_idx++) {
+        (*cdat_table)[cdat_idx++] = (CDATSubHeader*)&dsemts_ent[sub_idx];
     }
-    
-    return len;
+
+    return cdat_len;
 }
 
 static void ct3_free_cdat_table(CDATSubHeader **cdat_table, int num, void *priv)
 {
-    int i;
+    int dsmas_num = 1;
+    int dslbis_idx = dsmas_num;
+    int dsemts_idx = dsmas_num + (dsmas_num * 4);
+
+    /* There are only 3 sub-tables to free: dsmas, dslbis, dsemts */
+    assert(num == (dsmas_num + (dsmas_num * 4) + (dsmas_num)));
+
+    g_free(cdat_table[0]);
+    g_free(cdat_table[dslbis_idx]);
+    g_free(cdat_table[dsemts_idx]);
 
-    for (i = 0; i < num; i++) {
-        g_free(cdat_table[i]);
-    }
     g_free(cdat_table);
 }
 
-- 
2.37.3
Re: [PATCH 3/5] hw/mem/cxl_type: Generalize CDATDsmas initialization for Memory Regions
Posted by Jonathan Cameron via 1 year, 7 months ago
On Tue, 11 Oct 2022 17:19:14 -0400
Gregory Price <gourry.memverge@gmail.com> wrote:

> This is a preparatory commit for enabling multiple memory regions within
> a single CXL Type-3 device.  We will need to initialize multiple CDAT
> DSMAS regions (and subsequent DSLBIS, and DSEMTS entries), so generalize
> the intialization into a function.
> 
> Signed-off-by: Gregory Price <gregory.price@memverge.com>

Hi Gregory,

Main comment here is that DOE isn't in yet.  Some of the changes
you have made in here should be review comments on that series rather
than here.

I'm also not keen on taking the various allocations to arrays,
particularly when seeing the hacky result in the free routine.

Just spin some more pointers and 3 more allocations (once persistent is
added).

Jonathan

> ---
>  hw/mem/cxl_type3.c | 275 +++++++++++++++++++++++++--------------------
>  1 file changed, 154 insertions(+), 121 deletions(-)
> 
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 282f274266..dda78704c2 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -24,145 +24,178 @@
>  #define UI64_NULL ~(0ULL)
>  #define DWORD_BYTE 4
>  
> +static int ct3_build_dsmas(CDATDsmas *dsmas,
> +                           CDATDslbis *dslbis,
> +                           CDATDsemts *dsemts,
> +                           MemoryRegion *mr,
> +                           int dsmad_handle,
> +                           bool is_pmem,
> +                           uint64_t dpa_base)

Rewrap this to be just under 80 characters per line.

This is building a lot more than DSMAS.
Could rename it, or could break it down into functions
that deal with each type  of entry.

> +{
> +    int len = 0;
> +    /* ttl_len should be incremented for every entry */

ttl_ ?

Given you now allocate outside of this function, probably
more appropriate to just add the entries up there.


> +
> +    /* Device Scoped Memory Affinity Structure */
> +    *dsmas = (CDATDsmas) {
> +        .header = {
> +            .type = CDAT_TYPE_DSMAS,
> +            .length = sizeof(*dsmas),
> +        },
> +        .DSMADhandle = dsmad_handle,
> +        .flags = (is_pmem ? CDAT_DSMAS_FLAG_NV : 0),
> +        .DPA_base = dpa_base,
> +        .DPA_length = int128_get64(mr->size),
> +    };
> +    len++;
> +
> +    /* For now, no memory side cache, plausiblish numbers */
> +    dslbis[0] = (CDATDslbis) {
> +        .header = {
> +            .type = CDAT_TYPE_DSLBIS,
> +            .length = sizeof(*dslbis),
> +        },
> +        .handle = dsmad_handle,
> +        .flags = HMAT_LB_MEM_MEMORY,
> +        .data_type = HMAT_LB_DATA_READ_LATENCY,
> +        .entry_base_unit = 10000, /* 10ns base */
> +        .entry[0] = 15, /* 150ns */
> +    };
> +    len++;
> +
> +    dslbis[1] = (CDATDslbis) {
> +        .header = {
> +            .type = CDAT_TYPE_DSLBIS,
> +            .length = sizeof(*dslbis),
> +        },
> +        .handle = dsmad_handle,
> +        .flags = HMAT_LB_MEM_MEMORY,
> +        .data_type = HMAT_LB_DATA_WRITE_LATENCY,
> +        .entry_base_unit = 10000,
> +        .entry[0] = 25, /* 250ns */
> +    };
> +    len++;
> +
> +    dslbis[2] = (CDATDslbis) {
> +        .header = {
> +            .type = CDAT_TYPE_DSLBIS,
> +            .length = sizeof(*dslbis),
> +            },
> +        .handle = dsmad_handle,
> +        .flags = HMAT_LB_MEM_MEMORY,
> +        .data_type = HMAT_LB_DATA_READ_BANDWIDTH,
> +        .entry_base_unit = 1000, /* GB/s */
> +        .entry[0] = 16,
> +    };
> +    len++;
> +
> +    dslbis[3] = (CDATDslbis) {
> +        .header = {
> +            .type = CDAT_TYPE_DSLBIS,
> +            .length = sizeof(*dslbis),
> +        },
> +        .handle = dsmad_handle,
> +        .flags = HMAT_LB_MEM_MEMORY,
> +        .data_type = HMAT_LB_DATA_WRITE_BANDWIDTH,
> +        .entry_base_unit = 1000, /* GB/s */
> +        .entry[0] = 16,
> +    };
> +    len++;
> +
> +    *dsemts = (CDATDsemts) {
> +        .header = {
> +            .type = CDAT_TYPE_DSEMTS,
> +            .length = sizeof(*dsemts),
> +        },
> +        .DSMAS_handle = dsmad_handle,
> +        /* EFI_MEMORY_NV implies EfiReservedMemoryType */
> +        .EFI_memory_type_attr = is_pmem ? 2 : 0,
> +        /* Reserved - the non volatile from DSMAS matters */
> +        .DPA_offset = 0,
> +        .DPA_length = int128_get64(mr->size),
> +    };
> +    len++;
> +    return len;
> +}
> +
>  static int ct3_build_cdat_table(CDATSubHeader ***cdat_table,
>                                  void *priv)
>  {
> -    g_autofree CDATDsmas *dsmas_nonvolatile = NULL;
> -    g_autofree CDATDslbis *dslbis_nonvolatile = NULL;
> -    g_autofree CDATDsemts *dsemts_nonvolatile = NULL;
> +    g_autofree CDATDsmas *dsmas = NULL;
> +    g_autofree CDATDslbis *dslbis = NULL;
> +    g_autofree CDATDsemts *dsemts = NULL;
>      CXLType3Dev *ct3d = priv;
> -    int len = 0;

There are changes in here that aren't necessary and just result
in a much harder diff to review.  Why rename len to cdat_len?

> -    int i = 0;
> -    int next_dsmad_handle = 0;
> -    int nonvolatile_dsmad = -1;
> -    int dslbis_nonvolatile_num = 4;
> +    int cdat_len = 0;
> +    int cdat_idx = 0, sub_idx = 0;
> +    int dsmas_num, dslbis_num, dsemts_num;
> +    int dsmad_handle = 0;
> +    uint64_t dpa_base = 0;
>      MemoryRegion *mr;
>  
> -    /* Non volatile aspects */
> -    if (ct3d->hostmem) {
> -        dsmas_nonvolatile = g_malloc(sizeof(*dsmas_nonvolatile));
> -        if (!dsmas_nonvolatile) {
> -            return -ENOMEM;
> -        }
> -        nonvolatile_dsmad = next_dsmad_handle++;
> -        mr = host_memory_backend_get_memory(ct3d->hostmem);
> -        if (!mr) {
> -            return -EINVAL;
> -        }
> -        *dsmas_nonvolatile = (CDATDsmas) {
> -            .header = {
> -                .type = CDAT_TYPE_DSMAS,
> -                .length = sizeof(*dsmas_nonvolatile),
> -            },
> -            .DSMADhandle = nonvolatile_dsmad,
> -            .flags = CDAT_DSMAS_FLAG_NV,
> -            .DPA_base = 0,
> -            .DPA_length = int128_get64(mr->size),
> -        };
> -        len++;
> -
> -        /* For now, no memory side cache, plausiblish numbers */
> -        dslbis_nonvolatile = g_malloc(sizeof(*dslbis_nonvolatile) * dslbis_nonvolatile_num);
> -        if (!dslbis_nonvolatile)
> -            return -ENOMEM;
> -
> -        dslbis_nonvolatile[0] = (CDATDslbis) {
> -            .header = {
> -                .type = CDAT_TYPE_DSLBIS,
> -                .length = sizeof(*dslbis_nonvolatile),
> -            },
> -            .handle = nonvolatile_dsmad,
> -            .flags = HMAT_LB_MEM_MEMORY,
> -            .data_type = HMAT_LB_DATA_READ_LATENCY,
> -            .entry_base_unit = 10000, /* 10ns base */
> -            .entry[0] = 15, /* 150ns */
> -        };
> -        len++;
> -
> -        dslbis_nonvolatile[1] = (CDATDslbis) {
> -            .header = {
> -                .type = CDAT_TYPE_DSLBIS,
> -                .length = sizeof(*dslbis_nonvolatile),
> -            },
> -            .handle = nonvolatile_dsmad,
> -            .flags = HMAT_LB_MEM_MEMORY,
> -            .data_type = HMAT_LB_DATA_WRITE_LATENCY,
> -            .entry_base_unit = 10000,
> -            .entry[0] = 25, /* 250ns */
> -        };
> -        len++;
> -       
> -        dslbis_nonvolatile[2] = (CDATDslbis) {
> -            .header = {
> -                .type = CDAT_TYPE_DSLBIS,
> -                .length = sizeof(*dslbis_nonvolatile),
> -            },
> -            .handle = nonvolatile_dsmad,
> -            .flags = HMAT_LB_MEM_MEMORY,
> -            .data_type = HMAT_LB_DATA_READ_BANDWIDTH,
> -            .entry_base_unit = 1000, /* GB/s */
> -            .entry[0] = 16,
> -        };
> -        len++;
> -
> -        dslbis_nonvolatile[3] = (CDATDslbis) {
> -            .header = {
> -                .type = CDAT_TYPE_DSLBIS,
> -                .length = sizeof(*dslbis_nonvolatile),
> -            },
> -            .handle = nonvolatile_dsmad,
> -            .flags = HMAT_LB_MEM_MEMORY,
> -            .data_type = HMAT_LB_DATA_WRITE_BANDWIDTH,
> -            .entry_base_unit = 1000, /* GB/s */
> -            .entry[0] = 16,
> -        };
> -        len++;
> -
> -        mr = host_memory_backend_get_memory(ct3d->hostmem);
> -        if (!mr) {
> -            return -EINVAL;
> -        }
> -        dsemts_nonvolatile = g_malloc(sizeof(*dsemts_nonvolatile));
> -        *dsemts_nonvolatile = (CDATDsemts) {
> -            .header = {
> -                .type = CDAT_TYPE_DSEMTS,
> -                .length = sizeof(*dsemts_nonvolatile),
> -            },
> -            .DSMAS_handle = nonvolatile_dsmad,
> -            .EFI_memory_type_attr = 2, /* Reserved - the non volatile from DSMAS matters */
> -            .DPA_offset = 0,
> -            .DPA_length = int128_get64(mr->size),
> -        };
> -        len++;
> +    if (!ct3d->hostmem | !host_memory_backend_get_memory(ct3d->hostmem)) {

I don't immediately see why we need this test here and didn't previously.  If it
should always have been there, put that as a review on the DOE patches not here.

> +        return -EINVAL;
> +    }
> +
> +    dsmas_num = 1;
> +    dslbis_num = 4 * dsmas_num;
> +    dsemts_num = dsmas_num;
> +
> +    dsmas = g_malloc(sizeof(*dsmas) * dsmas_num);

As we aren't likely to add any more entries after non volatile
I'm not convinced making everything arrays then indexing into them
is worth while, particularly as it's causing huge amounts of churn.

If this style of preallocate then fill makes more sense (I don't particularly
like it breaks up the handling of each element), then propose that in review
of the original series.  Having this level of 'style' of function
change here makes for a hard to read set.  We can still change the
original patch.  I'm not yet convinced it's worth making that change.

I think I'd rather see the allocation and fill all in the factored out function.


> +    dslbis = g_malloc(sizeof(*dslbis) * dslbis_num);
> +    dsemts = g_malloc(sizeof(*dsemts) * dsemts_num);
> +
> +    if (!dsmas || !dslbis || !dsemts) {
> +        return -ENOMEM;
> +    }
> +
> +    mr = host_memory_backend_get_memory(ct3d->hostmem);
> +    cdat_len += ct3_build_dsmas(&dsmas[dsmad_handle],

There isn't a specific connection between dsmad_handle.  This code
kind of makes it look like it's not just that we've decided to use handle
0 and later 1.

That's another reason I'd rather not do this with arrays.

> +                                &dslbis[4 * dsmad_handle],
> +                                &dsemts[dsmad_handle],
> +                                mr,
> +                                dsmad_handle,
> +                                false,
> +                                dpa_base);
> +    dpa_base += mr->size;
> +    dsmad_handle++;
> +
> +    /* Allocate and fill in the CDAT table */
> +    *cdat_table = g_malloc0(cdat_len * sizeof(*cdat_table));
> +    if (!*cdat_table) {
> +        return -ENOMEM;
>      }
>  
> -    *cdat_table = g_malloc0(len * sizeof(*cdat_table));

Looks like I'm missing an allocation failure check here in original
code. Please put that in a review of that series rather than introducing
the change hidden down in here. 

>      /* Header always at start of structure */
> -    if (dsmas_nonvolatile) {
> -        (*cdat_table)[i++] = g_steal_pointer(&dsmas_nonvolatile);
> +    CDATDsmas *dsmas_ent = g_steal_pointer(&dsmas);

We should not be introducing new local variables down here.
Style wise stick to old school C style of all definitions at the top
or within a scoped region {}.


> +    for (sub_idx = 0; sub_idx < dsmas_num; sub_idx++) {
> +        (*cdat_table)[cdat_idx++] = (CDATSubHeader*)&dsmas_ent[sub_idx];

for a local index j is fine.
Using a more specific name for what was i is fair enough.  Belongs
in review of original patch given that hasn't been accepted yet.

>      }
> -    if (dslbis_nonvolatile) {
> -        CDATDslbis *dslbis = g_steal_pointer(&dslbis_nonvolatile);        
> -        int j;
>  
> -        for (j = 0; j < dslbis_nonvolatile_num; j++) {
> -            (*cdat_table)[i++] = (CDATSubHeader *)&dslbis[j];
> -        }
> +    CDATDslbis *dslbis_ent = g_steal_pointer(&dslbis);
> +    for (sub_idx = 0; sub_idx < dslbis_num; sub_idx++) {
> +        (*cdat_table)[cdat_idx++] = (CDATSubHeader*)&dslbis_ent[sub_idx];
>      }
> -    if (dsemts_nonvolatile) {
> -        (*cdat_table)[i++] = g_steal_pointer(&dsemts_nonvolatile);
> +
> +    CDATDsemts *dsemts_ent = g_steal_pointer(&dsemts);
> +    for (sub_idx = 0; sub_idx < dsemts_num; sub_idx++) {
> +        (*cdat_table)[cdat_idx++] = (CDATSubHeader*)&dsemts_ent[sub_idx];
>      }
> -    
> -    return len;
> +
> +    return cdat_len;
>  }
>  
>  static void ct3_free_cdat_table(CDATSubHeader **cdat_table, int num, void *priv)
>  {
> -    int i;
> +    int dsmas_num = 1;
> +    int dslbis_idx = dsmas_num;
> +    int dsemts_idx = dsmas_num + (dsmas_num * 4);
> +
> +    /* There are only 3 sub-tables to free: dsmas, dslbis, dsemts */
> +    assert(num == (dsmas_num + (dsmas_num * 4) + (dsmas_num)));

This alone is a very good reason not to do allocations as arrays.
It looks extremely fragile to me.  Lets just pay the cost of a few
more small allocations to keep the code easier to maintain.


> +
> +    g_free(cdat_table[0]);
> +    g_free(cdat_table[dslbis_idx]);
> +    g_free(cdat_table[dsemts_idx]);
>  
> -    for (i = 0; i < num; i++) {
> -        g_free(cdat_table[i]);
> -    }
>      g_free(cdat_table);
>  }
>
[PATCH 4/5] hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)
Posted by Gregory Price 1 year, 7 months ago
This commit enables each CXL Type-3 device to contain one volatile
memory region and one persistent region.

Two new properties have been added to cxl-type3 device initialization:
    [volatile-memdev] and [persistent-memdev]

The existing [memdev] property has been deprecated and will default the
memory region to a persistent memory region (although a user may assign
the region to a ram or file backed region). It cannot be used in
combination with the new [persistent-memdev] property.

Partitioning volatile memory from persistent memory is not yet supported.

Volatile memory is mapped at DPA(0x0), while Persistent memory is mapped
at DPA(vmem->size), per CXL Spec 8.2.9.8.2.0 - Get Partition Info.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
 hw/cxl/cxl-mailbox-utils.c  |  21 ++--
 hw/mem/cxl_type3.c          | 197 ++++++++++++++++++++++++++----------
 include/hw/cxl/cxl_device.h |  11 +-
 3 files changed, 162 insertions(+), 67 deletions(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 776c8cbadc..88d33e9a37 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -142,7 +142,7 @@ static ret_code cmd_firmware_update_get_info(struct cxl_cmd *cmd,
     } QEMU_PACKED *fw_info;
     QEMU_BUILD_BUG_ON(sizeof(*fw_info) != 0x50);
 
-    if (cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER) {
+    if (cxl_dstate->mem_size < CXL_CAPACITY_MULTIPLIER) {
         return CXL_MBOX_INTERNAL_ERROR;
     }
 
@@ -285,20 +285,20 @@ static ret_code cmd_identify_memory_device(struct cxl_cmd *cmd,
 
     CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
     CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d);
-    uint64_t size = cxl_dstate->pmem_size;
 
-    if (!QEMU_IS_ALIGNED(size, CXL_CAPACITY_MULTIPLIER)) {
+    if ((!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, CXL_CAPACITY_MULTIPLIER)) ||
+        (!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER))) {
         return CXL_MBOX_INTERNAL_ERROR;
     }
 
     id = (void *)cmd->payload;
     memset(id, 0, sizeof(*id));
 
-    /* PMEM only */
     snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
 
-    id->total_capacity = size / CXL_CAPACITY_MULTIPLIER;
-    id->persistent_capacity = size / CXL_CAPACITY_MULTIPLIER;
+    id->total_capacity = cxl_dstate->mem_size / CXL_CAPACITY_MULTIPLIER;
+    id->persistent_capacity = cxl_dstate->pmem_size / CXL_CAPACITY_MULTIPLIER;
+    id->volatile_capacity = cxl_dstate->vmem_size / CXL_CAPACITY_MULTIPLIER;
     id->lsa_size = cvc->get_lsa_size(ct3d);
     id->poison_list_max_mer[1] = 0x1; /* 256 poison records */
 
@@ -317,16 +317,15 @@ static ret_code cmd_ccls_get_partition_info(struct cxl_cmd *cmd,
         uint64_t next_pmem;
     } QEMU_PACKED *part_info = (void *)cmd->payload;
     QEMU_BUILD_BUG_ON(sizeof(*part_info) != 0x20);
-    uint64_t size = cxl_dstate->pmem_size;
 
-    if (!QEMU_IS_ALIGNED(size, CXL_CAPACITY_MULTIPLIER)) {
+    if ((!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, CXL_CAPACITY_MULTIPLIER)) ||
+        (!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER))) {
         return CXL_MBOX_INTERNAL_ERROR;
     }
 
-    /* PMEM only */
-    part_info->active_vmem = 0;
+    part_info->active_vmem = cxl_dstate->vmem_size / CXL_CAPACITY_MULTIPLIER;
     part_info->next_vmem = 0;
-    part_info->active_pmem = size / CXL_CAPACITY_MULTIPLIER;
+    part_info->active_pmem = cxl_dstate->pmem_size / CXL_CAPACITY_MULTIPLIER;
     part_info->next_pmem = 0;
 
     *len = sizeof(*part_info);
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index dda78704c2..c371cd06e1 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -131,11 +131,13 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table,
     uint64_t dpa_base = 0;
     MemoryRegion *mr;
 
-    if (!ct3d->hostmem | !host_memory_backend_get_memory(ct3d->hostmem)) {
+    if ((!ct3d->hostvmem && !ct3d->hostpmem) ||
+        (ct3d->hostvmem && !host_memory_backend_get_memory(ct3d->hostvmem)) ||
+        (ct3d->hostpmem && !host_memory_backend_get_memory(ct3d->hostpmem))) {
         return -EINVAL;
     }
 
-    dsmas_num = 1;
+    dsmas_num = (ct3d->hostvmem ? 1 : 0) + (ct3d->hostpmem ? 1 : 0);
     dslbis_num = 4 * dsmas_num;
     dsemts_num = dsmas_num;
 
@@ -147,16 +149,30 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table,
         return -ENOMEM;
     }
 
-    mr = host_memory_backend_get_memory(ct3d->hostmem);
-    cdat_len += ct3_build_dsmas(&dsmas[dsmad_handle],
-                                &dslbis[4 * dsmad_handle],
-                                &dsemts[dsmad_handle],
-                                mr,
-                                dsmad_handle,
-                                false,
-                                dpa_base);
-    dpa_base += mr->size;
-    dsmad_handle++;
+    if (ct3d->hostvmem) {
+        mr = host_memory_backend_get_memory(ct3d->hostvmem);
+        cdat_len += ct3_build_dsmas(&dsmas[dsmad_handle],
+                        &dslbis[4 * dsmad_handle],
+                        &dsemts[dsmad_handle],
+                        mr,
+                        dsmad_handle,
+                        false,
+                        dpa_base);
+        dpa_base += mr->size;
+        dsmad_handle++;
+    }
+    if (ct3d->hostpmem) {
+        mr = host_memory_backend_get_memory(ct3d->hostpmem);
+        cdat_len += ct3_build_dsmas(&dsmas[dsmad_handle],
+                        &dslbis[4 * dsmad_handle],
+                        &dsemts[dsmad_handle],
+                        mr,
+                        dsmad_handle,
+                        false,
+                        dpa_base);
+        dpa_base += mr->size;
+        dsmad_handle++;
+    }
 
     /* Allocate and fill in the CDAT table */
     *cdat_table = g_malloc0(cdat_len * sizeof(*cdat_table));
@@ -185,7 +201,8 @@ static int ct3_build_cdat_table(CDATSubHeader ***cdat_table,
 
 static void ct3_free_cdat_table(CDATSubHeader **cdat_table, int num, void *priv)
 {
-    int dsmas_num = 1;
+    CXLType3Dev *ct3d = priv;
+    int dsmas_num = (ct3d->hostvmem ? 1 : 0) + (ct3d->hostpmem ? 1 : 0);
     int dslbis_idx = dsmas_num;
     int dsemts_idx = dsmas_num + (dsmas_num * 4);
 
@@ -386,16 +403,48 @@ static void build_dvsecs(CXLType3Dev *ct3d)
     CXLDVSECRegisterLocator *regloc_dvsec;
     uint8_t *dvsec;
     int i;
+    uint32_t range1_size_hi = 0, range1_size_lo = 0,
+             range1_base_hi = 0, range1_base_lo = 0,
+             range2_size_hi = 0, range2_size_lo = 0,
+             range2_base_hi = 0, range2_base_lo = 0;
+
+    /*
+     * Volatile memory is mapped as (0x0)
+     * Persistent memory is mapped at (volatile->size)
+     */
+    if (ct3d->hostvmem && ct3d->hostpmem) {
+        range1_size_hi = ct3d->hostvmem->size >> 32;
+        range1_size_lo = (2 << 5) | (2 << 2) | 0x3 |
+                         (ct3d->hostvmem->size & 0xF0000000);
+        range1_base_hi = 0;
+        range1_base_lo = 0;
+        range2_size_hi = ct3d->hostpmem->size >> 32;
+        range2_size_lo = (2 << 5) | (2 << 2) | 0x3 |
+                         (ct3d->hostpmem->size & 0xF0000000);
+        range2_base_hi = ct3d->hostvmem->size >> 32;
+        range2_base_lo = ct3d->hostvmem->size & 0xF0000000;
+    } else {
+        HostMemoryBackend* hmbe = ct3d->hostvmem ?
+                                  ct3d->hostvmem : ct3d->hostpmem;
+        range1_size_hi = hmbe->size >> 32;
+        range1_size_lo = (2 << 5) | (2 << 2) | 0x3 |
+                         (hmbe->size & 0xF0000000);
+        range1_base_hi = 0;
+        range1_base_lo = 0;
+    }
 
     dvsec = (uint8_t *)&(CXLDVSECDevice){
         .cap = 0x1e,
         .ctrl = 0x2,
         .status2 = 0x2,
-        .range1_size_hi = ct3d->hostmem->size >> 32,
-        .range1_size_lo = (2 << 5) | (2 << 2) | 0x3 |
-        (ct3d->hostmem->size & 0xF0000000),
-        .range1_base_hi = 0,
-        .range1_base_lo = 0,
+        .range1_size_hi = range1_size_hi,
+        .range1_size_lo = range1_size_lo,
+        .range1_base_hi = range1_base_hi,
+        .range1_base_lo = range1_base_lo,
+        .range2_size_hi = range2_size_hi,
+        .range2_size_lo = range2_size_lo,
+        .range2_base_hi = range2_base_hi,
+        .range2_base_lo = range2_base_lo
     };
     cxl_component_create_dvsec(cxl_cstate, CXL2_TYPE3_DEVICE,
                                PCIE_CXL_DEVICE_DVSEC_LENGTH,
@@ -483,35 +532,57 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
     MemoryRegion *mr;
     char *name;
 
-    if (!ct3d->hostmem) {
-        error_setg(errp, "memdev property must be set");
+    if (!ct3d->hostmem && !ct3d->hostvmem && !ct3d->hostpmem) {
+        error_setg(errp, "at least one memdev property must be set");
+        return false;
+    } else if (ct3d->hostmem && ct3d->hostpmem) {
+        error_setg(errp, "[memdev] cannot be used with new "
+                         "[persistent-memdev] property");
         return false;
+    } else if (ct3d->hostmem) {
+        /* Use of hostmem property implies pmem */
+        ct3d->hostpmem = ct3d->hostmem;
+        ct3d->hostmem = NULL;
     }
 
-    mr = host_memory_backend_get_memory(ct3d->hostmem);
-    if (!mr) {
-        error_setg(errp, "memdev property must be set");
+    if (ct3d->hostpmem && !ct3d->lsa) {
+        error_setg(errp, "lsa property must be set for persistent devices");
         return false;
     }
-    memory_region_set_nonvolatile(mr, true);
-    memory_region_set_enabled(mr, true);
-    host_memory_backend_set_mapped(ct3d->hostmem, true);
 
-    if (ds->id) {
-        name = g_strdup_printf("cxl-type3-dpa-space:%s", ds->id);
-    } else {
-        name = g_strdup("cxl-type3-dpa-space");
+    if (ct3d->hostvmem)
+    {
+        mr = host_memory_backend_get_memory(ct3d->hostvmem);
+        memory_region_set_nonvolatile(mr, false);
+        memory_region_set_enabled(mr, true);
+        host_memory_backend_set_mapped(ct3d->hostvmem, true);
+        if (ds->id) {
+            name = g_strdup_printf("cxl-type3-dpa-vmem-space:%s", ds->id);
+        } else {
+            name = g_strdup("cxl-type3-dpa-vmem-space");
+        }
+        address_space_init(&ct3d->hostvmem_as, mr, name);
+        ct3d->cxl_dstate.vmem_size = mr->size;
+        ct3d->cxl_dstate.mem_size += mr->size;
+        g_free(name);
     }
-    address_space_init(&ct3d->hostmem_as, mr, name);
-    g_free(name);
 
-    ct3d->cxl_dstate.pmem_size = ct3d->hostmem->size;
-
-    if (!ct3d->lsa) {
-        error_setg(errp, "lsa property must be set");
-        return false;
+    if (ct3d->hostpmem)
+    {
+        mr = host_memory_backend_get_memory(ct3d->hostpmem);
+        memory_region_set_nonvolatile(mr, true);
+        memory_region_set_enabled(mr, true);
+        host_memory_backend_set_mapped(ct3d->hostpmem, true);
+        if (ds->id) {
+            name = g_strdup_printf("cxl-type3-dpa-pmem-space:%s", ds->id);
+        } else {
+            name = g_strdup("cxl-type3-dpa-pmem-space");
+        }
+        address_space_init(&ct3d->hostpmem_as, mr, name);
+        ct3d->cxl_dstate.pmem_size = mr->size;
+        ct3d->cxl_dstate.mem_size += mr->size;
+        g_free(name);
     }
-
     return true;
 }
 
@@ -627,7 +698,10 @@ static void ct3_exit(PCIDevice *pci_dev)
     cxl_doe_cdat_release(cxl_cstate);
     spdm_sock_fini(ct3d->doe_spdm.socket);
     g_free(regs->special_ops);
-    address_space_destroy(&ct3d->hostmem_as);
+    if (ct3d->hostvmem)
+        address_space_destroy(&ct3d->hostvmem_as);
+    if (ct3d->hostpmem)
+        address_space_destroy(&ct3d->hostpmem_as);
 }
 
 /* TODO: Support multiple HDM decoders and DPA skip */
@@ -667,11 +741,15 @@ MemTxResult cxl_type3_read(PCIDevice *d, hwaddr host_addr, uint64_t *data,
 {
     CXLType3Dev *ct3d = CXL_TYPE3(d);
     uint64_t dpa_offset;
-    MemoryRegion *mr;
+    MemoryRegion *vmr = NULL, *pmr = NULL;
+    AddressSpace* as;
 
-    /* TODO support volatile region */
-    mr = host_memory_backend_get_memory(ct3d->hostmem);
-    if (!mr) {
+    if (ct3d->hostvmem)
+        vmr = host_memory_backend_get_memory(ct3d->hostvmem);
+    if (ct3d->hostpmem)
+        pmr = host_memory_backend_get_memory(ct3d->hostpmem);
+
+    if (!vmr && !pmr) {
         return MEMTX_ERROR;
     }
 
@@ -679,11 +757,13 @@ MemTxResult cxl_type3_read(PCIDevice *d, hwaddr host_addr, uint64_t *data,
         return MEMTX_ERROR;
     }
 
-    if (dpa_offset > int128_get64(mr->size)) {
+    if (dpa_offset > int128_get64(ct3d->cxl_dstate.mem_size)) {
         return MEMTX_ERROR;
     }
 
-    return address_space_read(&ct3d->hostmem_as, dpa_offset, attrs, data, size);
+    as = (vmr && (dpa_offset <= int128_get64(vmr->size))) ?
+        &ct3d->hostvmem_as : &ct3d->hostpmem_as;
+    return address_space_read(as, dpa_offset, attrs, data, size);
 }
 
 MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
@@ -691,10 +771,15 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
 {
     CXLType3Dev *ct3d = CXL_TYPE3(d);
     uint64_t dpa_offset;
-    MemoryRegion *mr;
+    MemoryRegion *vmr = NULL, *pmr = NULL;
+    AddressSpace* as;
+
+    if (ct3d->hostvmem)
+        vmr = host_memory_backend_get_memory(ct3d->hostvmem);
+    if (ct3d->hostpmem)
+        pmr = host_memory_backend_get_memory(ct3d->hostpmem);
 
-    mr = host_memory_backend_get_memory(ct3d->hostmem);
-    if (!mr) {
+    if (!vmr && !pmr) {
         return MEMTX_OK;
     }
 
@@ -702,11 +787,13 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
         return MEMTX_OK;
     }
 
-    if (dpa_offset > int128_get64(mr->size)) {
+    if (dpa_offset > int128_get64(ct3d->cxl_dstate.mem_size)) {
         return MEMTX_OK;
     }
-    return address_space_write(&ct3d->hostmem_as, dpa_offset, attrs,
-                               &data, size);
+
+    as = (vmr && (dpa_offset <= int128_get64(vmr->size))) ?
+        &ct3d->hostvmem_as : &ct3d->hostpmem_as;
+    return address_space_write(as, dpa_offset, attrs, &data, size);
 }
 
 static void ct3d_reset(DeviceState *dev)
@@ -721,7 +808,11 @@ static void ct3d_reset(DeviceState *dev)
 
 static Property ct3_props[] = {
     DEFINE_PROP_LINK("memdev", CXLType3Dev, hostmem, TYPE_MEMORY_BACKEND,
-                     HostMemoryBackend *),
+                     HostMemoryBackend *), /* for backward compatibility */
+    DEFINE_PROP_LINK("persistent-memdev", CXLType3Dev, hostpmem,
+                     TYPE_MEMORY_BACKEND, HostMemoryBackend *),
+    DEFINE_PROP_LINK("volatile-memdev", CXLType3Dev, hostvmem,
+                     TYPE_MEMORY_BACKEND, HostMemoryBackend *),
     DEFINE_PROP_LINK("lsa", CXLType3Dev, lsa, TYPE_MEMORY_BACKEND,
                      HostMemoryBackend *),
     DEFINE_PROP_UINT64("sn", CXLType3Dev, sn, UI64_NULL),
@@ -804,7 +895,7 @@ static void ct3_class_init(ObjectClass *oc, void *data)
     pc->config_read = ct3d_config_read;
 
     set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-    dc->desc = "CXL PMEM Device (Type 3)";
+    dc->desc = "CXL Memory Device (Type 3)";
     dc->reset = ct3d_reset;
     device_class_set_props(dc, ct3_props);
 
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 0f4e29345f..458853b373 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -141,8 +141,10 @@ typedef struct cxl_device_state {
         uint64_t host_set;
     } timestamp;
 
-    /* memory region for persistent memory, HDM */
+    /* memory region size, HDM */
+    uint64_t mem_size;
     uint64_t pmem_size;
+    uint64_t vmem_size;
 
     /* Move me later */
     CPMUState cpmu[CXL_NUM_CPMU_INSTANCES];
@@ -270,12 +272,15 @@ struct CXLType3Dev {
     PCIDevice parent_obj;
 
     /* Properties */
-    HostMemoryBackend *hostmem;
+    HostMemoryBackend *hostmem; /* deprecated */
+    HostMemoryBackend *hostvmem;
+    HostMemoryBackend *hostpmem;
     HostMemoryBackend *lsa;
     uint64_t sn;
 
     /* State */
-    AddressSpace hostmem_as;
+    AddressSpace hostvmem_as;
+    AddressSpace hostpmem_as;
     CXLComponentState cxl_cstate;
     CXLDeviceState cxl_dstate;
 
-- 
2.37.3
[PATCH 5/5] cxl: update tests and documentation for new cxl properties
Posted by Gregory Price 1 year, 7 months ago
Adds explicit examples for --persistent-memdev and --volatile-memdev

Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
 docs/system/devices/cxl.rst | 53 ++++++++++++++++++------
 tests/qtest/cxl-test.c      | 81 +++++++++++++++++++++++++++++++------
 2 files changed, 110 insertions(+), 24 deletions(-)

diff --git a/docs/system/devices/cxl.rst b/docs/system/devices/cxl.rst
index f25783a4ec..9e165064c8 100644
--- a/docs/system/devices/cxl.rst
+++ b/docs/system/devices/cxl.rst
@@ -300,15 +300,36 @@ Example topology involving a switch::
 
 Example command lines
 ---------------------
-A very simple setup with just one directly attached CXL Type 3 device::
+A very simple setup with just one directly attached CXL Type 3 Persistent Memory device::
 
   qemu-system-aarch64 -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 -cpu max \
   ...
-  -object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest.raw,size=256M \
-  -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa.raw,size=256M \
+  -object memory-backend-file,pmem=true,id=pmem0,share=on,mem-path=/tmp/cxltest.raw,size=256M \
+  -object memory-backend-file,pmem=true,id=cxl-lsa0,share=on,mem-path=/tmp/lsa.raw,size=256M \
+  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
+  -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
+  -device cxl-type3,bus=root_port13,persistent-memdev=pmem0,lsa=cxl-lsa1,id=cxl-pmem0 \
+  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
+
+A very simple setup with just one directly attached CXL Type 3 Volatile Memory device::
+
+  qemu-system-aarch64 -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 -cpu max \
+  ...
+  -object memory-backend-ram,id=vmem0,share=on,size=256M \
   -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
   -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
-  -device cxl-type3,bus=root_port13,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0 \
+  -device cxl-type3,bus=root_port13,volatile-memdev=vmem0,id=cxl-vmem0 \
+  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
+
+The same volatile setup may optionally include an LSA region::
+
+  qemu-system-aarch64 -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 -cpu max \
+  ...
+  -object memory-backend-ram,id=vmem0,share=on,size=256M \
+  -object memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa.raw,size=256M \
+  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
+  -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
+  -device cxl-type3,bus=root_port13,volatile-memdev=vmem0,lsa=cxl-lsa0,id=cxl-vmem0 \
   -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
 
 A setup suitable for 4 way interleave. Only one fixed window provided, to enable 2 way
@@ -328,13 +349,13 @@ the CXL Type3 device directly attached (no switches).::
   -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
   -device pxb-cxl,bus_nr=222,bus=pcie.0,id=cxl.2 \
   -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
-  -device cxl-type3,bus=root_port13,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0 \
+  -device cxl-type3,bus=root_port13,persistent-memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0 \
   -device cxl-rp,port=1,bus=cxl.1,id=root_port14,chassis=0,slot=3 \
-  -device cxl-type3,bus=root_port14,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem1 \
+  -device cxl-type3,bus=root_port14,persistent-memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem1 \
   -device cxl-rp,port=0,bus=cxl.2,id=root_port15,chassis=0,slot=5 \
-  -device cxl-type3,bus=root_port15,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem2 \
+  -device cxl-type3,bus=root_port15,persistent-memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem2 \
   -device cxl-rp,port=1,bus=cxl.2,id=root_port16,chassis=0,slot=6 \
-  -device cxl-type3,bus=root_port16,memdev=cxl-mem4,lsa=cxl-lsa4,id=cxl-pmem3 \
+  -device cxl-type3,bus=root_port16,persistent-memdev=cxl-mem4,lsa=cxl-lsa4,id=cxl-pmem3 \
   -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.targets.1=cxl.2,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8k
 
 An example of 4 devices below a switch suitable for 1, 2 or 4 way interleave::
@@ -354,15 +375,23 @@ An example of 4 devices below a switch suitable for 1, 2 or 4 way interleave::
   -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
   -device cxl-upstream,bus=root_port0,id=us0 \
   -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
-  -device cxl-type3,bus=swport0,memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0,size=256M \
+  -device cxl-type3,bus=swport0,persistent-memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0,size=256M \
   -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
-  -device cxl-type3,bus=swport1,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1,size=256M \
+  -device cxl-type3,bus=swport1,persistent-memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1,size=256M \
   -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \
-  -device cxl-type3,bus=swport2,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2,size=256M \
+  -device cxl-type3,bus=swport2,persistent-memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2,size=256M \
   -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \
-  -device cxl-type3,bus=swport3,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3,size=256M \
+  -device cxl-type3,bus=swport3,persistent-memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3,size=256M \
   -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k
 
+Deprecations
+------------
+
+The Type 3 device [memdev] attribute has been deprecated in favor
+of the [persistent-memdev] and [volatile-memdev] attributes. [memdev]
+will default to a persistent memory device for backward compatibility
+and is incapable of being used in combination with [persistent-memdev].
+
 Kernel Configuration Options
 ----------------------------
 
diff --git a/tests/qtest/cxl-test.c b/tests/qtest/cxl-test.c
index f0a8a4045d..1a7a25dc53 100644
--- a/tests/qtest/cxl-test.c
+++ b/tests/qtest/cxl-test.c
@@ -34,29 +34,44 @@
                  "-device cxl-rp,id=rp2,bus=cxl.1,chassis=0,slot=2 " \
                  "-device cxl-rp,id=rp3,bus=cxl.1,chassis=0,slot=3 "
 
-#define QEMU_T3D "-object memory-backend-file,id=cxl-mem0,mem-path=%s,size=256M " \
-                 "-object memory-backend-file,id=lsa0,mem-path=%s,size=256M "    \
-                 "-device cxl-type3,bus=rp0,memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 "
+#define QEMU_T3D_DEPRECATED \
+  "-object memory-backend-file,id=cxl-mem0,mem-path=%s,size=256M " \
+  "-object memory-backend-file,id=lsa0,mem-path=%s,size=256M "    \
+  "-device cxl-type3,bus=rp0,memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 "
+
+#define QEMU_T3D_PMEM \
+  "-object memory-backend-file,id=m0,mem-path=%s,size=256M " \
+  "-object memory-backend-file,id=lsa0,mem-path=%s,size=256M " \
+  "-device cxl-type3,bus=rp0,persistent-memdev=cxl-m0,lsa=lsa0,id=pmem0 "
+
+#define QEMU_T3D_VMEM \
+  "-object memory-backend-ram,id=mem0,size=256M " \
+  "-device cxl-type3,bus=rp0,volatile-memdev=mem0,id=mem0 "
+
+#define QEMU_T3D_VMEM_LSA \
+  "-object memory-backend-ram,id=mem0,size=256M " \
+  "-object memory-backend-file,id=lsa0,mem-path=%s,size=256M " \
+  "-device cxl-type3,bus=rp0,volatile-memdev=mem0,lsa=lsa0,id=mem0 "
 
 #define QEMU_2T3D "-object memory-backend-file,id=cxl-mem0,mem-path=%s,size=256M "    \
                   "-object memory-backend-file,id=lsa0,mem-path=%s,size=256M "    \
-                  "-device cxl-type3,bus=rp0,memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 " \
+                  "-device cxl-type3,bus=rp0,persistent-memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 " \
                   "-object memory-backend-file,id=cxl-mem1,mem-path=%s,size=256M "    \
                   "-object memory-backend-file,id=lsa1,mem-path=%s,size=256M "    \
-                  "-device cxl-type3,bus=rp1,memdev=cxl-mem1,lsa=lsa1,id=cxl-pmem1 "
+                  "-device cxl-type3,bus=rp1,persistent-memdev=cxl-mem1,lsa=lsa1,id=cxl-pmem1 "
 
 #define QEMU_4T3D "-object memory-backend-file,id=cxl-mem0,mem-path=%s,size=256M " \
                   "-object memory-backend-file,id=lsa0,mem-path=%s,size=256M "    \
-                  "-device cxl-type3,bus=rp0,memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 " \
+                  "-device cxl-type3,bus=rp0,persistent-memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 " \
                   "-object memory-backend-file,id=cxl-mem1,mem-path=%s,size=256M "    \
                   "-object memory-backend-file,id=lsa1,mem-path=%s,size=256M "    \
-                  "-device cxl-type3,bus=rp1,memdev=cxl-mem1,lsa=lsa1,id=cxl-pmem1 " \
+                  "-device cxl-type3,bus=rp1,persistent-memdev=cxl-mem1,lsa=lsa1,id=cxl-pmem1 " \
                   "-object memory-backend-file,id=cxl-mem2,mem-path=%s,size=256M "    \
                   "-object memory-backend-file,id=lsa2,mem-path=%s,size=256M "    \
-                  "-device cxl-type3,bus=rp2,memdev=cxl-mem2,lsa=lsa2,id=cxl-pmem2 " \
+                  "-device cxl-type3,bus=rp2,persistent-memdev=cxl-mem2,lsa=lsa2,id=cxl-pmem2 " \
                   "-object memory-backend-file,id=cxl-mem3,mem-path=%s,size=256M "    \
                   "-object memory-backend-file,id=lsa3,mem-path=%s,size=256M "    \
-                  "-device cxl-type3,bus=rp3,memdev=cxl-mem3,lsa=lsa3,id=cxl-pmem3 "
+                  "-device cxl-type3,bus=rp3,persistent-memdev=cxl-mem3,lsa=lsa3,id=cxl-pmem3 "
 
 static void cxl_basic_hb(void)
 {
@@ -95,14 +110,53 @@ static void cxl_2root_port(void)
 }
 
 #ifdef CONFIG_POSIX
-static void cxl_t3d(void)
+static void cxl_t3d_deprecated(void)
+{
+    g_autoptr(GString) cmdline = g_string_new(NULL);
+    g_autofree const char *tmpfs = NULL;
+
+    tmpfs = g_dir_make_tmp("cxl-test-XXXXXX", NULL);
+
+    g_string_printf(cmdline, QEMU_PXB_CMD QEMU_RP QEMU_T3D_DEPRECATED,
+                    tmpfs, tmpfs);
+
+    qtest_start(cmdline->str);
+    qtest_end();
+}
+
+static void cxl_t3d_persistent(void)
+{
+    g_autoptr(GString) cmdline = g_string_new(NULL);
+    g_autofree const char *tmpfs = NULL;
+
+    tmpfs = g_dir_make_tmp("cxl-test-XXXXXX", NULL);
+
+    g_string_printf(cmdline, QEMU_PXB_CMD QEMU_RP QEMU_T3D_PMEM,
+                    tmpfs, tmpfs);
+
+    qtest_start(cmdline->str);
+    qtest_end();
+}
+
+static void cxl_t3d_volatile(void)
+{
+    g_autoptr(GString) cmdline = g_string_new(NULL);
+
+    g_string_printf(cmdline, QEMU_PXB_CMD QEMU_RP QEMU_T3D_VMEM);
+
+    qtest_start(cmdline->str);
+    qtest_end();
+}
+
+static void cxl_t3d_volatile_lsa(void)
 {
     g_autoptr(GString) cmdline = g_string_new(NULL);
     g_autofree const char *tmpfs = NULL;
 
     tmpfs = g_dir_make_tmp("cxl-test-XXXXXX", NULL);
 
-    g_string_printf(cmdline, QEMU_PXB_CMD QEMU_RP QEMU_T3D, tmpfs, tmpfs);
+    g_string_printf(cmdline, QEMU_PXB_CMD QEMU_RP QEMU_T3D_VMEM_LSA,
+                    tmpfs);
 
     qtest_start(cmdline->str);
     qtest_end();
@@ -167,7 +221,10 @@ int main(int argc, char **argv)
         qtest_add_func("/pci/cxl/rp", cxl_root_port);
         qtest_add_func("/pci/cxl/rp_x2", cxl_2root_port);
 #ifdef CONFIG_POSIX
-        qtest_add_func("/pci/cxl/type3_device", cxl_t3d);
+        qtest_add_func("/pci/cxl/type3_device", cxl_t3d_deprecated);
+        qtest_add_func("/pci/cxl/type3_device_pmem", cxl_t3d_persistent);
+        qtest_add_func("/pci/cxl/type3_device_vmem", cxl_t3d_volatile);
+        qtest_add_func("/pci/cxl/type3_device_vmem_lsa", cxl_t3d_volatile_lsa);
         qtest_add_func("/pci/cxl/rp_x2_type3_x2", cxl_1pxb_2rp_2t3d);
         qtest_add_func("/pci/cxl/pxb_x2_root_port_x4_type3_x4",
                        cxl_2pxb_4rp_4t3d);
-- 
2.37.3