[PATCH v2 0/8] pdx: introduce a new compression algorithm

Roger Pau Monne posted 8 patches 4 months, 1 week ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20250620111130.29057-1-roger.pau@citrix.com
There is a newer version of this series
[PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Roger Pau Monne 4 months, 1 week ago
Hello,

This series implements a new PDX compression algorithm to cope with the
sparse memory maps found on Intel Sapphire Rapids and Granite Rapids.

Patches 1 to 7 prepare the existing code to make it easier to introduce
a new PDX compression, including generalizing the initialization and
setup functions and adding a unit test for PDX compression.

Patch 8 introduces the new compression.  The new compression is only
enabled by default on x86; other architectures are left with their
previous defaults.

Thanks, Roger.

Roger Pau Monne (8):
  x86/pdx: simplify calculation of domain struct allocation boundary
  kconfig: turn PDX compression into a choice
  pdx: provide a unified set of unit functions
  pdx: introduce command line compression toggle
  pdx: allow per-arch optimization of PDX conversion helpers
  test/pdx: add PDX compression unit tests
  pdx: move some helpers in preparation for new compression
  pdx: introduce a new compression algorithm based on region offsets

 CHANGELOG.md                           |   3 +
 docs/misc/xen-command-line.pandoc      |   9 +
 tools/tests/Makefile                   |   1 +
 tools/tests/pdx/.gitignore             |   3 +
 tools/tests/pdx/Makefile               |  49 ++++
 tools/tests/pdx/harness.h              |  99 +++++++
 tools/tests/pdx/test-pdx.c             | 224 +++++++++++++++
 xen/arch/arm/include/asm/Makefile      |   1 +
 xen/arch/arm/setup.c                   |  34 +--
 xen/arch/ppc/include/asm/Makefile      |   1 +
 xen/arch/riscv/include/asm/Makefile    |   1 +
 xen/arch/x86/domain.c                  |  40 +--
 xen/arch/x86/include/asm/cpufeatures.h |   1 +
 xen/arch/x86/include/asm/pdx.h         |  75 +++++
 xen/arch/x86/srat.c                    |  30 +-
 xen/common/Kconfig                     |  37 ++-
 xen/common/pdx.c                       | 379 ++++++++++++++++++++++---
 xen/include/asm-generic/pdx.h          |  24 ++
 xen/include/xen/pdx.h                  | 201 +++++++++----
 19 files changed, 1056 insertions(+), 156 deletions(-)
 create mode 100644 tools/tests/pdx/.gitignore
 create mode 100644 tools/tests/pdx/Makefile
 create mode 100644 tools/tests/pdx/harness.h
 create mode 100644 tools/tests/pdx/test-pdx.c
 create mode 100644 xen/arch/x86/include/asm/pdx.h
 create mode 100644 xen/include/asm-generic/pdx.h

-- 
2.49.0
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Stefano Stabellini 4 months ago
Hi Roger,

We have an ARM board with the following memory layout:

0x0-0x80000000, 0, 2GB
0x800000000-0x880000000, 32GB, 2GB
0x50000000000-0x50080000000, 5TB, 2GB
0x60000000000-0x60080000000, 6TB, 2GB
0x70000000000-0x70080000000, 7TB, 2GB

It looks like your PDX series is exactly what we need.  However, I tried
to use it and it doesn't seem to be hooked properly on ARM yet. I spent
some time trying to fix it but I was unsuccessful.

As far as I can tell the following functions need to be adjusted but I
am not sure the list is comprehensive:

xen/arch/arm/include/asm/mmu/mm.h:maddr_to_virt
xen/arch/arm/mmu/mm.c:setup_frametable_mappings
xen/arch/arm/setup.c:init_pdx

Cheers,

Stefano

On Fri, 20 Jun 2025, Roger Pau Monne wrote:
> Hello,
> 
> This series implements a new PDX compression algorithm to cope with the
> spare memory maps found on the Intel Sapphire/Granite Rapids.
> 
> Patches 1 to 7 prepare the existing code to make it easier to introduce
> a new PDX compression, including generalizing the initialization and
> setup functions and adding a unit test for PDX compression.
> 
> Patch 8 introduce the new compression.  The new compression is only
> enabled by default on x86, other architectures are left with their
> previous defaults.
> 
> Thanks, Roger.
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Roger Pau Monné 3 months, 4 weeks ago
On Fri, Jun 27, 2025 at 07:08:29PM -0700, Stefano Stabellini wrote:
> Hi Roger,
> 
> We have an ARM board with the following memory layout:
> 
> 0x0-0x80000000, 0, 2G
> 0x800000000,0x880000000, 32GB, 2G
> 0x50000000000-0x50080000000 5T, 2GB 
> 0x60000000000-0x60080000000 6T, 2GB 
> 0x70000000000-0x70080000000 7T, 2GB 

I would like to add this memory map to the PDX unit tests.  Do you
have a name I could use as a reference?  For example, for the Intel
sparse map I'm using: "Real memory map from a 4s Intel GNR.".  I
currently have yours listed as: "Stefano's ARM board.", but that's not
a very descriptive name :).

Thanks, Roger.
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Stefano Stabellini 3 months, 4 weeks ago
On Thu, 3 Jul 2025, Roger Pau Monné wrote:
> On Fri, Jun 27, 2025 at 07:08:29PM -0700, Stefano Stabellini wrote:
> > Hi Roger,
> > 
> > We have an ARM board with the following memory layout:
> > 
> > 0x0-0x80000000, 0, 2G
> > 0x800000000,0x880000000, 32GB, 2G
> > 0x50000000000-0x50080000000 5T, 2GB 
> > 0x60000000000-0x60080000000 6T, 2GB 
> > 0x70000000000-0x70080000000 7T, 2GB 
> 
> I would like to add this memory map to the PDX unit testing, do you
> have a name I could use as a reference?  For example for the Intel
> sparse map I'm using: "Real memory map from a 4s Intel GNR.".  I
> currently have yours listed as: "Stefano's ARM board.", but that's not
> a very descriptive naming :).

The name of the board is AMD "Versal Gen 2"
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Roger Pau Monné 4 months ago
On Fri, Jun 27, 2025 at 07:08:29PM -0700, Stefano Stabellini wrote:
> Hi Roger,
> 
> We have an ARM board with the following memory layout:
> 
> 0x0-0x80000000, 0, 2G
> 0x800000000,0x880000000, 32GB, 2G
> 0x50000000000-0x50080000000 5T, 2GB 
> 0x60000000000-0x60080000000 6T, 2GB 
> 0x70000000000-0x70080000000 7T, 2GB 

With the current PDX mask compression you could compress 4 bits AFAICT.

> It looks like your PDX series is exactly what we need.  However, I tried
> to use it and it doesn't seem to be hooked properly on ARM yet. I spent
> some time trying to fix it but I was unsuccessful.

Hm, weird.  It shouldn't need any special hooking, unless assumptions
about the existing PDX mask compression have leaked into ARM code.

> As far as I can tell the following functions need to be adjusted but I
> am not sure the list is comprehensive:
> 
> xen/arch/arm/include/asm/mmu/mm.h:maddr_to_virt

At least for CONFIG_ARM_64 this seems to be implemented correctly, as
it's using maddr_to_directmapoff() which should have the correct
translation between paddr -> directmap virt.

Also given the memory map above the adjustments done in ARM to remove
any initial memory map offset should be no-ops, since I expect
base_mfn == 0 in setup_directmap_mappings() in that particular case,
and then directmap_mfn_start = directmap_base_pdx = 0 and
directmap_virt_start = DIRECTMAP_VIRT_START.  FWIW, if ARM uses offset
compression the special casing about removing the initial gap can be
removed, as the compression should already take care of that.

> xen/arch/arm/mmu/mm.c:setup_frametable_mappings
> xen/arch/arm/setup.c:init_pdx

I've attempted to adjust init_pdx() myself so it works with the new
generic PDX compression setup.  It seemed to work fine on the CI, but I
don't have any real ARM machines to test on myself.

Is there a way I could reproduce the issue(s) you are seeing with
QEMU?

I'm already working on v3, as this version's implementation of
mfn_valid() is buggy.  Maybe that's what you are hitting?

Regards, Roger.
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Stefano Stabellini 4 months ago
On Mon, 30 Jun 2025, Roger Pau Monné wrote:
> On Fri, Jun 27, 2025 at 07:08:29PM -0700, Stefano Stabellini wrote:
> > Hi Roger,
> > 
> > We have an ARM board with the following memory layout:
> > 
> > 0x0-0x80000000, 0, 2G
> > 0x800000000,0x880000000, 32GB, 2G
> > 0x50000000000-0x50080000000 5T, 2GB 
> > 0x60000000000-0x60080000000 6T, 2GB 
> > 0x70000000000-0x70080000000 7T, 2GB 
> 
> With the current PDX mask compression you could compress 4bits AFAICT.
> 
> > It looks like your PDX series is exactly what we need.  However, I tried
> > to use it and it doesn't seem to be hooked properly on ARM yet. I spent
> > some time trying to fix it but I was unsuccessful.
> 
> Hm, weird.  It shouldn't need any special hooking, unless assumptions
> about the existing PDX mask compression have leaked into ARM code.
> 
> > As far as I can tell the following functions need to be adjusted but I
> > am not sure the list is comprehensive:
> > 
> > xen/arch/arm/include/asm/mmu/mm.h:maddr_to_virt
> 
> At least for CONFIG_ARM_64 this seems to be implemented correctly, as
> it's using maddr_to_directmapoff() which should have the correct
> translation between paddr -> directmap virt.
> 
> Also given the memory map above the adjustments done in ARM to remove
> any initial memory map offset should be no-ops, since I expect
> base_mfn == 0 in setup_directmap_mappings() in that particular case,
> and then directmap_mfn_start = directmap_base_pdx = 0 and
> directmap_virt_start = DIRECTMAP_VIRT_START.  FWIW, if ARM uses offset
> compression the special casing about removing the initial gap can be
> removed, as the compression should already take care of that.
> 
> > xen/arch/arm/mmu/mm.c:setup_frametable_mappings
> > xen/arch/arm/setup.c:init_pdx
> 
> I've attempted to adjust init_pdx() myself so it works with the new
> generic PDX compression setup, it seemed to work fine on the CI, but I
> don't have any real ARM machines to test myself.
 
> Is there a way I could reproduce the issue(s) you are seeing with
> QEMU?

Maybe. You can see how we run QEMU from gitlab-ci, but I don't know off
the top of my head how to force QEMU to emulate multiple RAM banks at
specific addresses.


> I'm already working on v3, as this version implementation of
> mfn_valid() is buggy.  Maybe that's what you are hitting?
> 

This is the error:

(XEN) [0000000179e5f96b] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
(XEN) [0000000179e90619] ----[ Xen-4.21-unstable  arm64  debug=y  Not tainted ]----
(XEN) [0000000179e9ee58] CPU:    0
(XEN) [0000000179eac907] PC:     00000a00002da5fc setup_mm+0x174/0x200
(XEN) [0000000179ed3ed0] LR:     00000a00002da580
(XEN) [0000000179edc486] SP:     00000a0000327e10
(XEN) [0000000179ee6b3a] CPSR:   00000000200003c9 MODE:64-bit EL2h (Hypervisor, handler)
(XEN) [0000000179ef5b4f]      X0: 0000050000000000  X1: 0000000050000000  X2: 0000000000080000
(XEN) [0000000179f05de3]      X3: 0000000000000017  X4: 0000000000000000  X5: 0000000050000000
(XEN) [0000000179f19396]      X6: 000000004fffffff  X7: 0000000000000000  X8: 0000000000020400
(XEN) [0000000179f2d797]      X9: 000000000001b808 X10: 0000000000000080 X11: 00000000000186de
(XEN) [0000000179f3d492]     X12: 000000000001a7df X13: 000000000001214f X14: 0000000000017275
(XEN) [0000000179f50f4c]     X15: 00000a00002b48bc X16: 00000a0000291478 X17: 0000000000000000
(XEN) [0000000179f60902]     X18: 000000007be9bbe0 X19: 0000000000000002 X20: 0000000000000000
(XEN) [0000000179f6fde5]     X21: 0000050080000000 X22: 00000a00002f8008 X23: 00000a00002b5c90
(XEN) [0000000179f7eeea]     X24: 0000000180000000 X25: 00000a00002b5e90 X26: 0000000000000000
(XEN) [0000000179f8ee55]     X27: 0000000000000000 X28: 000000007bff2f70  FP: 00000a0000327e10
(XEN) [0000000179fa6deb] 
(XEN) [0000000179fadf84]   VTCR_EL2: 0000000000000000
(XEN) [0000000179fb9994]  VTTBR_EL2: 0000000000000000
(XEN) [0000000179fc689d] 
(XEN) [0000000179fcc1a0]  SCTLR_EL2: 0000000030cd183d
(XEN) [0000000179fd95e3]    HCR_EL2: 0000000000000038
(XEN) [0000000179fe7082]  TTBR0_EL2: 0000000022148000
(XEN) [0000000179ff0d00] 
(XEN) [0000000179ff6d07]    ESR_EL2: 00000000f2000001
(XEN) [000000017a0003fe]  HPFAR_EL2: 0000000000000000
(XEN) [000000017a00c8f4]    FAR_EL2: 0000000000000000
(XEN) [000000017a018511] 
(XEN) [000000017a01fbe5] Xen stack trace from sp=00000a0000327e10:
(XEN) [000000017a02aa88]    00000a0000327e60 00000a00002e40c4 0000000022200000 000000000000f000
(XEN) [000000017a03e578]    00000a0000c0a5c0 00000a0000332000 00000a0000a00000 0000000000000000
(XEN) [000000017a04e676]    0000000000000000 0000000000000000 000000007be89ea0 00000a00002001a4
(XEN) [000000017a0636e1]    0000000022000000 fffff60021e00000 0000000022200000 0000000000001710
(XEN) [000000017a072ae0]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a084bf8]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a097ced]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a0a6829]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a0b8e71]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a0cdb4b]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a0e44b9]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a0f6a2b]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a1074a2]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a1178b3]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a128463]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [000000017a13a015]    0000000000000000 0000000000000000
(XEN) [000000017a144d66] Xen call trace:
(XEN) [000000017a14bcee]    [<00000a00002da5fc>] setup_mm+0x174/0x200 (PC)
(XEN) [000000017a15db0a]    [<00000a00002da580>] setup_mm+0xf8/0x200 (LR)
(XEN) [000000017a167dbb]    [<00000a00002e40c4>] start_xen+0x118/0x9d0
(XEN) [000000017a171724]    [<00000a00002001a4>] arch/arm/arm64/head.o#primary_switched+0x4/0x24
(XEN) [000000017a18abb4] 
(XEN) [000000017a19a465] 
(XEN) [000000017a19ffed] ****************************************
(XEN) [000000017a1aad66] Panic on CPU 0:
(XEN) [000000017a1b2757] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
(XEN) [000000017a1daedf] ****************************************
(XEN) [000000017a1eb0a9] 
(XEN) [000000017a1f2b27] Reboot in five seconds...


If I remove the ASSERT:

(XEN) [00000003bc65c616] parameter "debug" unknown!
(XEN) [00000003bc70915a] 
(XEN) [00000003bc70fd14] ****************************************
(XEN) [00000003bc71afec] Panic on CPU 0:
(XEN) [00000003bc724d03] The frametable cannot cover the physical region 0000000000000000 - 0x00070080000000
(XEN) [00000003bc73786c] ****************************************
(XEN) [00000003bc741a19] 
(XEN) [00000003bc747833] Reboot in five seconds...


I think the issue (or one issue) is the implementation of
setup_frametable_mappings on ARM which is ignoring the pdx_group_valid
bitmap. I am attaching a work-in-progress patch from Michal to add
support for it for your reference. Remove commit fe6a12a08 to apply the
patch without conflict.

With Michal's patch, I can boot *without* your patches on the
problematic board.

I still cannot boot with your patches, even with Michal's patch. I still
hit the same ASSERT. If I remove the ASSERT I go further and hit:

(XEN) [00000001bccbd3ab] Panic on CPU 0:
(XEN) [00000001bccc4c3e] Frametable too small

I added some debug messages (see attached stefano-debug.patch).
Something seems to be wrong with the pdx_group_valid bitmap after
0x880000, as we start getting MFN ranges such as 0x254c0000-0x25500000
which don't make any sense to me.

(XEN) [00000001563012a8] DEBUG init_pdx 294 start=0 end=80000000
(XEN) [000000015630d6d9] DEBUG init_pdx 294 start=800000000 end=880000000
(XEN) [000000015631c73c] DEBUG init_pdx 294 start=50000000000 end=50080000000
(XEN) [000000015632947b] DEBUG init_pdx 294 start=60000000000 end=60080000000
(XEN) [00000001563365a8] DEBUG init_pdx 294 start=70000000000 end=70080000000
(XEN) [000000015637c6aa] DEBUG init_frametable 65 start=0 end=80000
(XEN) [00000001563898e1] DEBUG init_frametable_chunk 28 virt=a0800000000 base_mfn=7007e000 pfn_start=0 pfn_end=80000
(XEN) [000000015692ed1f] DEBUG init_frametable 65 start=800000 end=880000
(XEN) [00000001569399fe] DEBUG init_frametable_chunk 28 virt=a081c000000 base_mfn=7007c000 pfn_start=800000 pfn_end=880000
(XEN) [00000001573bad45] DEBUG init_frametable 65 start=254c0000 end=25500000
(XEN) [00000001573dee6a] DEBUG init_frametable_chunk 28 virt=a1028a00000 base_mfn=7007a000 pfn_start=254c0000 pfn_end=25500000
(XEN) [00000001578ad5c2] DEBUG init_frametable 65 start=25700000 end=257c0000
(XEN) [00000001578b841d] DEBUG init_frametable_chunk 28 virt=a1030800000 base_mfn=70076000 pfn_start=25700000 pfn_end=257c0000
(XEN) [000000015853b121] DEBUG init_frametable 65 start=27400000 end=27440000
(XEN) [00000001585470fe] DEBUG init_frametable_chunk 28 virt=a1096000000 base_mfn=70074000 pfn_start=27400000 pfn_end=27440000
(XEN) [0000000158880a59] DEBUG init_frametable 65 start=27480000 end=27500000
(XEN) [000000015888d583] DEBUG init_frametable_chunk 28 virt=a1097c00000 base_mfn=70072000 pfn_start=27480000 pfn_end=27500000
(XEN) [0000000158eacf55] DEBUG init_frametable 65 start=27580000 end=27a40000
(XEN) [0000000158eb7f8e] DEBUG init_frametable_chunk 28 virt=a109b400000 base_mfn=70060000 pfn_start=27580000 pfn_end=27a40000
(XEN) [000000015cac7416] DEBUG init_frametable 65 start=27a80000 end=27ac0000
(XEN) [000000015cad6818] DEBUG init_frametable_chunk 28 virt=a10acc00000 base_mfn=7005e000 pfn_start=27a80000 pfn_end=27ac0000
(XEN) [000000015cb26b99] arch/arm/mmu/pt.c:360: Changing MFN for a valid entry is not allowed (0x70071800 -> 0x7005e000).
(XEN) [000000015cb80f94] Xen WARN at arch/arm/mmu/pt.c:360
(XEN) [000000015cbabedc] ----[ Xen-4.21-unstable  arm64  debug=y  Not tainted ]----
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Jan Beulich 4 months ago
On 01.07.2025 03:50, Stefano Stabellini wrote:
> On Mon, 30 Jun 2025, Roger Pau Monné wrote:
>> On Fri, Jun 27, 2025 at 07:08:29PM -0700, Stefano Stabellini wrote:
>>> Hi Roger,
>>>
>>> We have an ARM board with the following memory layout:
>>>
>>> 0x0-0x80000000, 0, 2G
>>> 0x800000000,0x880000000, 32GB, 2G
>>> 0x50000000000-0x50080000000 5T, 2GB 
>>> 0x60000000000-0x60080000000 6T, 2GB 
>>> 0x70000000000-0x70080000000 7T, 2GB 
>>
>> With the current PDX mask compression you could compress 4bits AFAICT.
>>
>>> It looks like your PDX series is exactly what we need.  However, I tried
>>> to use it and it doesn't seem to be hooked properly on ARM yet. I spent
>>> some time trying to fix it but I was unsuccessful.
>>
>> Hm, weird.  It shouldn't need any special hooking, unless assumptions
>> about the existing PDX mask compression have leaked into ARM code.
>>
>>> As far as I can tell the following functions need to be adjusted but I
>>> am not sure the list is comprehensive:
>>>
>>> xen/arch/arm/include/asm/mmu/mm.h:maddr_to_virt
>>
>> At least for CONFIG_ARM_64 this seems to be implemented correctly, as
>> it's using maddr_to_directmapoff() which should have the correct
>> translation between paddr -> directmap virt.
>>
>> Also given the memory map above the adjustments done in ARM to remove
>> any initial memory map offset should be no-ops, since I expect
>> base_mfn == 0 in setup_directmap_mappings() in that particular case,
>> and then directmap_mfn_start = directmap_base_pdx = 0 and
>> directmap_virt_start = DIRECTMAP_VIRT_START.  FWIW, if ARM uses offset
>> compression the special casing about removing the initial gap can be
>> removed, as the compression should already take care of that.
>>
>>> xen/arch/arm/mmu/mm.c:setup_frametable_mappings
>>> xen/arch/arm/setup.c:init_pdx
>>
>> I've attempted to adjust init_pdx() myself so it works with the new
>> generic PDX compression setup, it seemed to work fine on the CI, but I
>> don't have any real ARM machines to test myself.
>  
>> Is there a way I could reproduce the issue(s) you are seeing with
>> QEMU?
> 
> Maybe. You can see how we run QEMU from gitlab-ci, but I don't know on
> top of my head how to force QEMU to emulate multiple RAM banks at
> specific addresses.
> 
> 
>> I'm already working on v3, as this version implementation of
>> mfn_valid() is buggy.  Maybe that's what you are hitting?
>>
> 
> This is the error:
> 
> (XEN) [0000000179e5f96b] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
> (XEN) [0000000179e90619] ----[ Xen-4.21-unstable  arm64  debug=y  Not tainted ]----
> (XEN) [0000000179e9ee58] CPU:    0
> (XEN) [0000000179eac907] PC:     00000a00002da5fc setup_mm+0x174/0x200
> (XEN) [0000000179ed3ed0] LR:     00000a00002da580
> (XEN) [0000000179edc486] SP:     00000a0000327e10
> (XEN) [0000000179ee6b3a] CPSR:   00000000200003c9 MODE:64-bit EL2h (Hypervisor, handler)
> (XEN) [0000000179ef5b4f]      X0: 0000050000000000  X1: 0000000050000000  X2: 0000000000080000
> (XEN) [0000000179f05de3]      X3: 0000000000000017  X4: 0000000000000000  X5: 0000000050000000
> (XEN) [0000000179f19396]      X6: 000000004fffffff  X7: 0000000000000000  X8: 0000000000020400
> (XEN) [0000000179f2d797]      X9: 000000000001b808 X10: 0000000000000080 X11: 00000000000186de
> (XEN) [0000000179f3d492]     X12: 000000000001a7df X13: 000000000001214f X14: 0000000000017275
> (XEN) [0000000179f50f4c]     X15: 00000a00002b48bc X16: 00000a0000291478 X17: 0000000000000000
> (XEN) [0000000179f60902]     X18: 000000007be9bbe0 X19: 0000000000000002 X20: 0000000000000000
> (XEN) [0000000179f6fde5]     X21: 0000050080000000 X22: 00000a00002f8008 X23: 00000a00002b5c90
> (XEN) [0000000179f7eeea]     X24: 0000000180000000 X25: 00000a00002b5e90 X26: 0000000000000000
> (XEN) [0000000179f8ee55]     X27: 0000000000000000 X28: 000000007bff2f70  FP: 00000a0000327e10
> (XEN) [0000000179fa6deb] 
> (XEN) [0000000179fadf84]   VTCR_EL2: 0000000000000000
> (XEN) [0000000179fb9994]  VTTBR_EL2: 0000000000000000
> (XEN) [0000000179fc689d] 
> (XEN) [0000000179fcc1a0]  SCTLR_EL2: 0000000030cd183d
> (XEN) [0000000179fd95e3]    HCR_EL2: 0000000000000038
> (XEN) [0000000179fe7082]  TTBR0_EL2: 0000000022148000
> (XEN) [0000000179ff0d00] 
> (XEN) [0000000179ff6d07]    ESR_EL2: 00000000f2000001
> (XEN) [000000017a0003fe]  HPFAR_EL2: 0000000000000000
> (XEN) [000000017a00c8f4]    FAR_EL2: 0000000000000000
> (XEN) [000000017a018511] 
> (XEN) [000000017a01fbe5] Xen stack trace from sp=00000a0000327e10:
> (XEN) [000000017a02aa88]    00000a0000327e60 00000a00002e40c4 0000000022200000 000000000000f000
> (XEN) [000000017a03e578]    00000a0000c0a5c0 00000a0000332000 00000a0000a00000 0000000000000000
> (XEN) [000000017a04e676]    0000000000000000 0000000000000000 000000007be89ea0 00000a00002001a4
> (XEN) [000000017a0636e1]    0000000022000000 fffff60021e00000 0000000022200000 0000000000001710
> (XEN) [000000017a072ae0]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a084bf8]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a097ced]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a0a6829]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a0b8e71]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a0cdb4b]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a0e44b9]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a0f6a2b]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a1074a2]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a1178b3]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a128463]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) [000000017a13a015]    0000000000000000 0000000000000000
> (XEN) [000000017a144d66] Xen call trace:
> (XEN) [000000017a14bcee]    [<00000a00002da5fc>] setup_mm+0x174/0x200 (PC)
> (XEN) [000000017a15db0a]    [<00000a00002da580>] setup_mm+0xf8/0x200 (LR)
> (XEN) [000000017a167dbb]    [<00000a00002e40c4>] start_xen+0x118/0x9d0
> (XEN) [000000017a171724]    [<00000a00002001a4>] arch/arm/arm64/head.o#primary_switched+0x4/0x24
> (XEN) [000000017a18abb4] 
> (XEN) [000000017a19a465] 
> (XEN) [000000017a19ffed] ****************************************
> (XEN) [000000017a1aad66] Panic on CPU 0:
> (XEN) [000000017a1b2757] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
> (XEN) [000000017a1daedf] ****************************************
> (XEN) [000000017a1eb0a9] 
> (XEN) [000000017a1f2b27] Reboot in five seconds...
> 
> 
> If I remove the ASSERT:
> 
> (XEN) [00000003bc65c616] parameter "debug" unknown!
> (XEN) [00000003bc70915a] 
> (XEN) [00000003bc70fd14] ****************************************
> (XEN) [00000003bc71afec] Panic on CPU 0:
> (XEN) [00000003bc724d03] The frametable cannot cover the physical region 0000000000000000 - 0x00070080000000
> (XEN) [00000003bc73786c] ****************************************
> (XEN) [00000003bc741a19] 
> (XEN) [00000003bc747833] Reboot in five seconds...
> 
> 
> I think the issue (or one issue) is the implementation of
> setup_frametable_mappings on ARM which is ignoring the pdx_group_valid
> bitmap. I am attaching a work-in-progress patch from Michal to add
> support for it for your reference. Remove commit fe6a12a08 to apply the
> patch without conflict.
> 
> With Michal's patch, I can boot *without* your patches on the
> problematic board.
> 
> I still cannot boot with your patches, even with Michal's patch. I still
> hit the same ASSERT. If I remove the ASSERT I go further and hit:
> 
> (XEN) [00000001bccbd3ab] Panic on CPU 0:
> (XEN) [00000001bccc4c3e] Frametable too small
> 
> I added some debug messages (see
> attached stefano-debug.patch). Something seems to be wrong with the
> pdx_group_valid bitmap after 0x880000, as we start getting MFN ranges
> such as 0x254c0000-0x25500000 which don't make any sense to me.

But in pdx_group_valid it would want to be PDXes.

> (XEN) [00000001563012a8] DEBUG init_pdx 294 start=0 end=80000000
> (XEN) [000000015630d6d9] DEBUG init_pdx 294 start=800000000 end=880000000
> (XEN) [000000015631c73c] DEBUG init_pdx 294 start=50000000000 end=50080000000
> (XEN) [000000015632947b] DEBUG init_pdx 294 start=60000000000 end=60080000000
> (XEN) [00000001563365a8] DEBUG init_pdx 294 start=70000000000 end=70080000000
> (XEN) [000000015637c6aa] DEBUG init_frametable 65 start=0 end=80000
> (XEN) [00000001563898e1] DEBUG init_frametable_chunk 28 virt=a0800000000 base_mfn=7007e000 pfn_start=0 pfn_end=80000
> (XEN) [000000015692ed1f] DEBUG init_frametable 65 start=800000 end=880000
> (XEN) [00000001569399fe] DEBUG init_frametable_chunk 28 virt=a081c000000 base_mfn=7007c000 pfn_start=800000 pfn_end=880000
> (XEN) [00000001573bad45] DEBUG init_frametable 65 start=254c0000 end=25500000
> (XEN) [00000001573dee6a] DEBUG init_frametable_chunk 28 virt=a1028a00000 base_mfn=7007a000 pfn_start=254c0000 pfn_end=25500000
> (XEN) [00000001578ad5c2] DEBUG init_frametable 65 start=25700000 end=257c0000
> (XEN) [00000001578b841d] DEBUG init_frametable_chunk 28 virt=a1030800000 base_mfn=70076000 pfn_start=25700000 pfn_end=257c0000
> (XEN) [000000015853b121] DEBUG init_frametable 65 start=27400000 end=27440000
> (XEN) [00000001585470fe] DEBUG init_frametable_chunk 28 virt=a1096000000 base_mfn=70074000 pfn_start=27400000 pfn_end=27440000
> (XEN) [0000000158880a59] DEBUG init_frametable 65 start=27480000 end=27500000
> (XEN) [000000015888d583] DEBUG init_frametable_chunk 28 virt=a1097c00000 base_mfn=70072000 pfn_start=27480000 pfn_end=27500000
> (XEN) [0000000158eacf55] DEBUG init_frametable 65 start=27580000 end=27a40000
> (XEN) [0000000158eb7f8e] DEBUG init_frametable_chunk 28 virt=a109b400000 base_mfn=70060000 pfn_start=27580000 pfn_end=27a40000
> (XEN) [000000015cac7416] DEBUG init_frametable 65 start=27a80000 end=27ac0000
> (XEN) [000000015cad6818] DEBUG init_frametable_chunk 28 virt=a10acc00000 base_mfn=7005e000 pfn_start=27a80000 pfn_end=27ac0000
> (XEN) [000000015cb26b99] arch/arm/mmu/pt.c:360: Changing MFN for a valid entry is not allowed (0x70071800 -> 0x7005e000).
> (XEN) [000000015cb80f94] Xen WARN at arch/arm/mmu/pt.c:360
> (XEN) [000000015cbabedc] ----[ Xen-4.21-unstable  arm64  debug=y  Not tainted ]----

Sadly from this you omitted the output from the setup of the offsets
arrays. Considering also your later reply, I'd be curious to know what
mfn_to_pdx(0x50000000) is.

Jan

Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Stefano Stabellini 4 months ago
On Tue, 1 Jul 2025, Jan Beulich wrote:
> Sadly from this you omitted the output from the setup of the offsets
> arrays. Considering also your later reply, I'd be curious to know what
> mfn_to_pdx(0x50000000) is.
 
Full logs here, and debug patch in attachment.

(XEN) Checking for initrd in /chosen
(XEN) RAM: 0000000000000000 - 000000007fffffff
(XEN) RAM: 0000000800000000 - 000000087fffffff
(XEN) RAM: 0000050000000000 - 000005007fffffff
(XEN) RAM: 0000060000000000 - 000006007fffffff
(XEN) RAM: 0000070000000000 - 000007007fffffff
(XEN) 
(XEN) MODULE[0]: 0000000022000000 - 0000000022172fff Xen         
(XEN) MODULE[1]: 0000000022200000 - 000000002220efff Device Tree 
(XEN) MODULE[2]: 0000000020400000 - 0000000021e2ffff Kernel      
(XEN)  RESVD[0]: 0000000000000000 - 0000000000ffffff
(XEN)  RESVD[1]: 0000000001000000 - 00000000015fffff
(XEN)  RESVD[2]: 0000000001600000 - 00000000017fffff
(XEN)  RESVD[3]: 0000000001800000 - 00000000097fffff
(XEN)  RESVD[4]: 0000000009800000 - 000000000bffffff
(XEN)  RESVD[5]: 0000000011126000 - 000000001114dfff
(XEN)  RESVD[6]: 000000001114e000 - 000000001214efff
(XEN)  RESVD[7]: 0000000017275000 - 000000001729cfff
(XEN)  RESVD[8]: 000000001729d000 - 000000001829dfff
(XEN)  RESVD[9]: 000000001a7df000 - 000000001a806fff
(XEN)  RESVD[10]: 000000001a807000 - 000000001b807fff
(XEN)  RESVD[11]: 000000001d908000 - 000000001d92ffff
(XEN)  RESVD[12]: 000000001d930000 - 000000001e930fff
(XEN)  RESVD[13]: 000000001829e000 - 000000001869dfff
(XEN)  RESVD[14]: 000000001869e000 - 00000000186ddfff
(XEN)  RESVD[15]: 0000000800000000 - 000000083fffffff
(XEN) 
(XEN) 
(XEN) Command line: console=dtuart dom0_mem=2048M console_timestamps=boot debug bootscrub=0 vwfi=native sched=null
(XEN) [00000006bfc302ec] parameter "debug" unknown!
(XEN) [00000006bfcc0476] DEBUG init_pdx 294 start=0 end=80000000
(XEN) [00000006bfcd2400] DEBUG init_pdx 294 start=800000000 end=880000000
(XEN) [00000006bfce29ec] DEBUG init_pdx 294 start=50000000000 end=50080000000
(XEN) [00000006bfcf1768] DEBUG init_pdx 294 start=60000000000 end=60080000000
(XEN) [00000006bfd015a4] DEBUG init_pdx 294 start=70000000000 end=70080000000
(XEN) [00000006bfd1444f] DEBUG setup_mm 252
(XEN) [00000006bfd3dc6f] DEBUG setup_mm 273 start=0 size=80000000 ram_end=80000000 directmap_base_pdx=0
(XEN) [00000006bfd5616e] DEBUG setup_directmap_mappings 229 base_mfn=0 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=0
(XEN) [00000006bfd7d38a] DEBUG setup_directmap_mappings 237 base_mfn=0 nr_mfns=80000 directmap_base_pdx=0
(XEN) [00000006bfd92728] DEBUG setup_mm 273 start=800000000 size=80000000 ram_end=880000000 directmap_base_pdx=0
(XEN) [00000006bfdaba3b] DEBUG setup_directmap_mappings 229 base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=800000
(XEN) [00000006bfdcd79c] DEBUG setup_directmap_mappings 237 base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0
(XEN) [00000006bfde4d82] DEBUG setup_mm 273 start=50000000000 size=80000000 ram_end=50080000000 directmap_base_pdx=0
(XEN) [00000006bfdfaef0] DEBUG setup_directmap_mappings 229 base_mfn=50000000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=50000000
(XEN) [00000006bfe35249] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
(XEN) [00000006bfe68507] ----[ Xen-4.21-unstable  arm64  debug=y  Not tainted ]----
(XEN) [00000006bfe766bf] CPU:    0
(XEN) [00000006bfe832e0] PC:     00000a00002da70c setup_mm+0x284/0x308
(XEN) [00000006bfea5b1a] LR:     00000a00002da6b0
(XEN) [00000006bfeb1032] SP:     00000a0000327e00
(XEN) [00000006bfebf403] CPSR:   00000000200003c9 MODE:64-bit EL2h (Hypervisor, handler)
(XEN) [00000006bfed4634]      X0: 0000000000000017  X1: 0000000000000000  X2: 0000000050000000
(XEN) [00000006bfee4d11]      X3: 000000004fffffff  X4: 0000000000000020  X5: 0000000000000000
(XEN) [00000006bfef48cf]      X6: 0000000000000000  X7: 0000000000000000  X8: ffffffffffffffff
(XEN) [00000006bff047ac]      X9: fefefefefefeff09 X10: 0000000000000080 X11: 0101010101010101
(XEN) [00000006bff153b4]     X12: 0000000000000008 X13: 0000000000000009 X14: 0000000000000030
(XEN) [00000006bff2620d]     X15: 00000a0000a00000 X16: 00000a0000291478 X17: 0000000000000000
(XEN) [00000006bff35c41]     X18: 000000007be9bbe0 X19: 00000a0000292c40 X20: 00000a00002ade68
(XEN) [00000006bff465a5]     X21: 0000050080000000 X22: 0000000000000000 X23: 0000000180000000
(XEN) [00000006bff57a51]     X24: 0000000000000002 X25: 00000a0000292c50 X26: 0000000050000000
(XEN) [00000006bff67d91]     X27: 0000000000080000 X28: 0000050000000000  FP: 00000a0000327e00
(XEN) [00000006bff76ebe] 
(XEN) [00000006bff7c3e3]   VTCR_EL2: 0000000000000000
(XEN) [00000006bff8501a]  VTTBR_EL2: 0000000000000000
(XEN) [00000006bff8f616] 
(XEN) [00000006bff94c4a]  SCTLR_EL2: 0000000030cd183d
(XEN) [00000006bff9e3f7]    HCR_EL2: 0000000000000038
(XEN) [00000006bffaac9c]  TTBR0_EL2: 0000000022148000
(XEN) [00000006bffb6794] 
(XEN) [00000006bffbc972]    ESR_EL2: 00000000f2000001
(XEN) [00000006bffcb424]  HPFAR_EL2: 0000000000000000
(XEN) [00000006bffd7c69]    FAR_EL2: 0000000000000000
(XEN) [00000006bffe3719] 
(XEN) [00000006bffecd4b] Xen stack trace from sp=00000a0000327e00:
(XEN) [00000006bfff9321]    00000a0000327e60 00000a00002e4378 0000000022200000 000000000000f000
(XEN) [00000006c000e3e1]    00000a0000c0a5c0 00000a0000332000 00000a0000a00000 0000000000000000
(XEN) [00000006c001f69c]    0000000000000000 0000000000000000 0000000000000000 000000007bff2f70
(XEN) [00000006c0031b91]    000000007be89ea0 00000a00002001a4 0000000022000000 fffff60021e00000
(XEN) [00000006c0041c20]    0000000022200000 0000000000001710 0000000000000000 0000000000000000
(XEN) [00000006c0052629]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c0065bde]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c00752d1]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c00858cc]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c0096b34]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c00a72f3]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c00b8357]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c00ce60f]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c00e2ee4]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c00f53e7]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c01091f3]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [00000006c011cd30] Xen call trace:
(XEN) [00000006c01264b7]    [<00000a00002da70c>] setup_mm+0x284/0x308 (PC)
(XEN) [00000006c01348a8]    [<00000a00002da6b0>] setup_mm+0x228/0x308 (LR)
(XEN) [00000006c0144263]    [<00000a00002e4378>] start_xen+0x118/0x9d0
(XEN) [00000006c01529c3]    [<00000a00002001a4>] arch/arm/arm64/head.o#primary_switched+0x4/0x24
(XEN) [00000006c0165f60] 
(XEN) [00000006c0176bd8] 
(XEN) [00000006c017c5cf] ****************************************
(XEN) [00000006c018964c] Panic on CPU 0:
(XEN) [00000006c0190b79] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
(XEN) [00000006c01af78d] ****************************************
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Roger Pau Monné 4 months ago
On Tue, Jul 01, 2025 at 01:46:19PM -0700, Stefano Stabellini wrote:
> On Tue, 1 Jul 2025, Jan Beulich wrote:
> > Sadly from this you omitted the output from the setup of the offsets
> > arrays. Considering also your later reply, I'd be curious to know what
> > mfn_to_pdx(0x50000000) is.
>  
> Full logs here, and debug patch in attachment.
> 
> (XEN) [00000006bfe35249] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72

As said in the other reply, the issue here is that with the v2 PDX
offset compression logic your memory map is not compressible, and this
leads to an overflow, as anything above 5TiB won't fit in the
directmap AFAICT.  We already discussed with Jan that ARM seems to be
missing any logic to account for the max addressable page:

https://lore.kernel.org/xen-devel/9074f1a6-a605-43f4-97f3-d0a626252d3f@suse.com/

x86 has setup_max_pdx() that truncates the maximum addressable MFN
based on the active PDX compression and the virtual memory map
restrictions.  ARM needs similar logic to account for these
restrictions.
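
For illustration only, the arithmetic behind the failing assertion can be
reproduced with a small standalone C sketch.  The names toy_mfn_to_pdx(),
fits_in_directmap() and ASSUMED_DIRECTMAP_SIZE are hypothetical, and the
5TiB direct map size is an assumption taken from the remark above rather
than from the Xen headers.  The identity translation models a memory map
that compression cannot shrink:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Assumption: 5TiB of direct map, per the remark above (not from Xen headers). */
#define ASSUMED_DIRECTMAP_SIZE (UINT64_C(5) << 40)

/* Identity translation: models a memory map that PDX compression cannot shrink. */
static uint64_t toy_mfn_to_pdx(uint64_t mfn)
{
    return mfn;
}

/* Same shape as the assertion in the log: (pdx - base) < (size >> PAGE_SHIFT). */
static bool fits_in_directmap(uint64_t maddr, uint64_t directmap_base_pdx)
{
    uint64_t pdx = toy_mfn_to_pdx(maddr >> PAGE_SHIFT);

    /* Guard against unsigned wrap-around in the subtraction below. */
    assert(pdx >= directmap_base_pdx);

    return (pdx - directmap_base_pdx) < (ASSUMED_DIRECTMAP_SIZE >> PAGE_SHIFT);
}
```

Under these assumptions the banks at 0 and 0x800000000 pass the check,
while the bank at 0x50000000000 translates to PDX 0x50000000, which equals
the limit and so fails, matching the order of events in the log.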

Thanks, Roger.
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Orzel, Michal 4 months ago

On 02/07/2025 09:00, Roger Pau Monné wrote:
> On Tue, Jul 01, 2025 at 01:46:19PM -0700, Stefano Stabellini wrote:
>> On Tue, 1 Jul 2025, Jan Beulich wrote:
>>> Sadly from this you omitted the output from the setup of the offsets
>>> arrays. Considering also your later reply, I'd be curious to know what
>>> mfn_to_pdx(0x50000000) is.
>>  
>> Full logs here, and debug patch in attachment.
>>
>> (XEN) [00000006bfe35249] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
> 
> As said on the other reply, the issue here is that with the v2 PDX
> offset compression logic your memory map is not compressible, and this
> leads to an overflow, as anything above 5TiB won't fit in the
> directmap AFAICT.  We already discussed with Jan that ARM seems to be
> missing any logic to account for the max addressable page:
> 
> https://lore.kernel.org/xen-devel/9074f1a6-a605-43f4-97f3-d0a626252d3f@suse.com/
> 
> x86 has setup_max_pdx() that truncates the maximum addressable MFN
> based on the active PDX compression and the virtual memory map
> restrictions.  ARM needs similar logic to account for this
> restrictions.

We have a few issues on Arm. First, we don't check whether the direct map is
big enough for max_pdx, which we don't set at all. Second, we don't really use
PDX grouping (which can also be used without compression). My patch (that
Stefano attached previously) fixes the second issue (Alejandro will take it
over to come up with a common solution). For the first issue, we need to know
max_page (at the moment we calculate it at the very end of setup_mm(), but we
could do it in init_pdx() so that it is known before setting up the direct
map) and the PDX offset (on x86 there is no offset). I also think that on Arm
we should just panic if the direct map is too small.

The issue can also be reproduced with PDX compression disabled, so it is not
specific to Roger's patch.

@Julien, I'm thinking of something like this:

diff --git a/xen/arch/arm/arm32/mmu/mm.c b/xen/arch/arm/arm32/mmu/mm.c
index 4d22f35618aa..e6d9b49acd3c 100644
--- a/xen/arch/arm/arm32/mmu/mm.c
+++ b/xen/arch/arm/arm32/mmu/mm.c
@@ -190,7 +190,6 @@ void __init setup_mm(void)

     /* Frame table covers all of RAM region, including holes */
     setup_frametable_mappings(ram_start, ram_end);
-    max_page = PFN_DOWN(ram_end);

     /*
      * The allocators may need to use map_domain_page() (such as for
diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
index a0a2dd8cc762..3e64be6ae664 100644
--- a/xen/arch/arm/arm64/mmu/mm.c
+++ b/xen/arch/arm/arm64/mmu/mm.c
@@ -224,6 +224,9 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
          */
         directmap_virt_start = DIRECTMAP_VIRT_START +
             (base_mfn - mfn_gb) * PAGE_SIZE;
+
+        if ( (max_pdx - directmap_base_pdx) > (DIRECTMAP_SIZE >> PAGE_SHIFT) )
+            panic("Direct map is too small\n");
     }

     if ( base_mfn < mfn_x(directmap_mfn_start) )
@@ -278,7 +281,6 @@ void __init setup_mm(void)
     directmap_mfn_end = maddr_to_mfn(ram_end);

     setup_frametable_mappings(ram_start, ram_end);
-    max_page = PFN_DOWN(ram_end);

     init_staticmem_pages();
     init_sharedmem_pages();
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 58acc2d0d4b8..e047225eb413 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -265,6 +265,7 @@ void __init init_pdx(void)
      */
     uint64_t mask = pdx_init_mask(0x0);
     int bank;
+    paddr_t ram_end = 0;

     for ( bank = 0 ; bank < mem->nr_banks; bank++ )
     {
@@ -290,10 +291,14 @@ void __init init_pdx(void)
         bank_start = mem->bank[bank].start;
         bank_size = mem->bank[bank].size;
         bank_end = bank_start + bank_size;
+        ram_end = max(ram_end, bank_end);

         set_pdx_range(paddr_to_pfn(bank_start),
                       paddr_to_pfn(bank_end));
     }
+
+    max_page = PFN_DOWN(ram_end);
+    max_pdx = pfn_to_pdx(max_page - 1) + 1;
 }

 size_t __read_mostly dcache_line_bytes;

~Michal


Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Julien Grall 4 months ago
Hi Michal,

On 02/07/2025 08:52, Orzel, Michal wrote:
> We have a few issues on Arm. First, we don't check whether direct map is big
> enough provided max_pdx that we don't set at all. Second, we don't really use
> PDX grouping (can be also used without compression). My patch (that Stefano
> attached previously) fixes the second issue (Allejandro will take it over to
> come up with common solution). For the first issue, we need to know max_page (at
> the moment we calculate it in setup_mm() at the very end but we could do it in
> init_pdx() to know it ahead of setting direct map) and PDX offset (on x86 there
> is no offset). I also think that on Arm we should just panic if direct map is
> too small.
> 
> The issue can be reproduced by disabling PDX compression, so not only with
> Roger's patch.
> 
> @Julien, I'm thinking of something like this:

The change below looks good to me.

Cheers,

-- 
Julien Grall
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Roger Pau Monné 4 months ago
On Wed, Jul 02, 2025 at 09:52:45AM +0200, Orzel, Michal wrote:
> 
> 
> On 02/07/2025 09:00, Roger Pau Monné wrote:
> > On Tue, Jul 01, 2025 at 01:46:19PM -0700, Stefano Stabellini wrote:
> >> On Tue, 1 Jul 2025, Jan Beulich wrote:
> >>> Sadly from this you omitted the output from the setup of the offsets
> >>> arrays. Considering also your later reply, I'd be curious to know what
> >>> mfn_to_pdx(0x50000000) is.
> >>  
> >> Full logs here, and debug patch in attachment.
> >>
> >> (XEN) [00000006bfe35249] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
> > 
> > As said on the other reply, the issue here is that with the v2 PDX
> > offset compression logic your memory map is not compressible, and this
> > leads to an overflow, as anything above 5TiB won't fit in the
> > directmap AFAICT.  We already discussed with Jan that ARM seems to be
> > missing any logic to account for the max addressable page:
> > 
> > https://lore.kernel.org/xen-devel/9074f1a6-a605-43f4-97f3-d0a626252d3f@suse.com/
> > 
> > x86 has setup_max_pdx() that truncates the maximum addressable MFN
> > based on the active PDX compression and the virtual memory map
> > restrictions.  ARM needs similar logic to account for this
> > restrictions.
> 
> We have a few issues on Arm. First, we don't check whether direct map is big
> enough provided max_pdx that we don't set at all. Second, we don't really use
> PDX grouping (can be also used without compression). My patch (that Stefano
> attached previously) fixes the second issue (Allejandro will take it over to
> come up with common solution).

You probably can handle those as different issues, as PDX grouping is
completely disjoint from PDX compression.  It might be helpful if
we could split the PDX grouping into a separate file from the PDX
compression.

One weirdness I've noticed with ARM is the addition of start offsets
to the existing PDX compression, by using directmap_base_pdx,
directmap_mfn_start &c.  I'm not sure whether this will
interfere with the PDX compression, but it looks like a bodge.  This
should be part of the generic PDX compression implementation, not an
extra added on a per-arch basis.

FWIW, PDX offset translation should already compress any gaps from 0
to the first RAM range, and hence this won't be needed (in fact it
would just make ARM translations slower by doing an extra unneeded
operation).  My recommendation would be to move this initial offset
compression inside the PDX mask translation.
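
As a rough illustration of why a per-arch start offset becomes redundant,
here is a toy table-based translation in C.  It is only a sketch of the
general idea of subtracting a per-region offset, not the implementation in
this series: toy_offset[], toy_pfn_to_pdx(), the slot size and the two RAM
ranges are all made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SLOT_SHIFT 20   /* hypothetical: one table slot per 2^20 frames */
#define TOY_NR_SLOTS   9

/* Offset to subtract for each slot; slots without RAM are left at 0. */
static const uint64_t toy_offset[TOY_NR_SLOTS] = {
    [1] = 0x100000,   /* first RAM range starts at PFN 0x100000, maps to PDX 0 */
    [8] = 0x700000,   /* second range at PFN 0x800000, maps to PDX 0x100000 */
};

static uint64_t toy_pfn_to_pdx(uint64_t pfn)
{
    uint64_t slot = pfn >> TOY_SLOT_SHIFT;

    assert(slot < TOY_NR_SLOTS);

    return pfn - toy_offset[slot];
}
```

In this toy the hole between 0 and the first RAM range is absorbed by the
table entry for slot 1, so the first populated frame already lands on PDX 0
without any separate base adjustment.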

> For the first issue, we need to know max_page (at
> the moment we calculate it in setup_mm() at the very end but we could do it in
> init_pdx() to know it ahead of setting direct map) and PDX offset (on x86 there
> is no offset). I also think that on Arm we should just panic if direct map is
> too small.

Hm, that's up to the ARM folks, but my opinion is that you should
simply ignore memory above the threshold.  Panicking should IMO be a
last resort option when there's no way to work around the issue.
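
To make the suggestion concrete, a minimal sketch of ignoring memory above
a threshold could look like the following.  struct toy_bank and
clip_banks() are hypothetical names, and the limit is expressed as a plain
physical address, which assumes no PDX compression is in effect; with
compression the comparison would have to be done on translated values
instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_bank {
    uint64_t start;
    uint64_t size;
};

/*
 * Clip every bank to [0, limit).  Banks entirely above the limit are
 * dropped, a bank straddling it is shortened.  Returns the number of
 * banks kept, compacted at the front of the array.
 */
static size_t clip_banks(struct toy_bank *banks, size_t nr, uint64_t limit)
{
    size_t kept = 0;

    for ( size_t i = 0; i < nr; i++ )
    {
        uint64_t start = banks[i].start;
        uint64_t end = start + banks[i].size;

        assert(end >= start);     /* no wrap-around in the toy input */

        if ( start >= limit )
            continue;             /* completely unreachable: ignore */
        if ( end > limit )
            end = limit;          /* partially reachable: truncate */

        banks[kept].start = start;
        banks[kept].size = end - start;
        kept++;
    }

    return kept;
}
```

With the five RAM banks from the log and an assumed 5TiB limit, only the
banks at 0 and 0x800000000 would be kept and boot could continue with 4GiB
of usable memory instead of panicking.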

> The issue can be reproduced by disabling PDX compression, so not only with
> Roger's patch.
> 
> @Julien, I'm thinking of something like this:
> 
> diff --git a/xen/arch/arm/arm32/mmu/mm.c b/xen/arch/arm/arm32/mmu/mm.c
> index 4d22f35618aa..e6d9b49acd3c 100644
> --- a/xen/arch/arm/arm32/mmu/mm.c
> +++ b/xen/arch/arm/arm32/mmu/mm.c
> @@ -190,7 +190,6 @@ void __init setup_mm(void)
> 
>      /* Frame table covers all of RAM region, including holes */
>      setup_frametable_mappings(ram_start, ram_end);
> -    max_page = PFN_DOWN(ram_end);
> 
>      /*
>       * The allocators may need to use map_domain_page() (such as for
> diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
> index a0a2dd8cc762..3e64be6ae664 100644
> --- a/xen/arch/arm/arm64/mmu/mm.c
> +++ b/xen/arch/arm/arm64/mmu/mm.c
> @@ -224,6 +224,9 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
>           */
>          directmap_virt_start = DIRECTMAP_VIRT_START +
>              (base_mfn - mfn_gb) * PAGE_SIZE;
> +
> +        if ( (max_pdx - directmap_base_pdx) > (DIRECTMAP_SIZE >> PAGE_SHIFT) )
> +            panic("Direct map is too small\n");

As said above - I would avoid propagating the usage of those offsets
into generic memory management code; their usage should be confined
to the translation functions.

Here you probably want to use maddr_to_virt() or similar.

You can maybe pick up:

https://lore.kernel.org/xen-devel/20250611171636.5674-3-roger.pau@citrix.com/

And attempt to hook it into ARM?

I don't think it would be that difficult to reduce the consumption of
memory map ranges to what Xen can handle.

Thanks, Roger.

Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Orzel, Michal 4 months ago

On 02/07/2025 10:26, Roger Pau Monné wrote:
> On Wed, Jul 02, 2025 at 09:52:45AM +0200, Orzel, Michal wrote:
>>
>>
>> On 02/07/2025 09:00, Roger Pau Monné wrote:
>>> On Tue, Jul 01, 2025 at 01:46:19PM -0700, Stefano Stabellini wrote:
>>>> On Tue, 1 Jul 2025, Jan Beulich wrote:
>>>>> Sadly from this you omitted the output from the setup of the offsets
>>>>> arrays. Considering also your later reply, I'd be curious to know what
>>>>> mfn_to_pdx(0x50000000) is.
>>>>  
>>>> Full logs here, and debug patch in attachment.
>>>>
>>>> (XEN) [00000006bfe35249] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
>>>
>>> As said on the other reply, the issue here is that with the v2 PDX
>>> offset compression logic your memory map is not compressible, and this
>>> leads to an overflow, as anything above 5TiB won't fit in the
>>> directmap AFAICT.  We already discussed with Jan that ARM seems to be
>>> missing any logic to account for the max addressable page:
>>>
>>> https://lore.kernel.org/xen-devel/9074f1a6-a605-43f4-97f3-d0a626252d3f@suse.com/
>>>
>>> x86 has setup_max_pdx() that truncates the maximum addressable MFN
>>> based on the active PDX compression and the virtual memory map
>>> restrictions.  ARM needs similar logic to account for these
>>> restrictions.
>>
>> We have a few issues on Arm. First, we don't check whether direct map is big
>> enough provided max_pdx that we don't set at all. Second, we don't really use
>> PDX grouping (can be also used without compression). My patch (that Stefano
>> attached previously) fixes the second issue (Allejandro will take it over to
>> come up with common solution).
> 
> You probably can handle those as different issues, as PDX grouping is
> completely disjoint from PDX compression.  It might be helpful if
> we could split the PDX grouping into a separate file from the PDX
> compression.
> 
> One weirdness I've noticed with ARM is the addition of start offsets
> to the existing PDX compression, by using directmap_base_pdx,
> directmap_mfn_start, directmap_base_pdx &c.  I'm not sure whether this will
> interfere with the PDX compression, but it looks like a bodge.  This
> should be part of the generic PDX compression implementation, not an
> extra added on a per-arch basis.
> 
> FWIW, PDX offset translation should already compress any gaps from 0
> to the first RAM range, and hence this won't be needed (in fact it
> would just make ARM translations slower by doing an extra unneeded
> operation).  My recommendation would be to move this initial offset
> compression inside the PDX mask translation.
> 
>> For the first issue, we need to know max_page (at
>> the moment we calculate it in setup_mm() at the very end but we could do it in
>> init_pdx() to know it ahead of setting direct map) and PDX offset (on x86 there
>> is no offset). I also think that on Arm we should just panic if direct map is
>> too small.
> 
> Hm, that's up to the ARM folks, but my opinion is that you should
> simply ignore memory above the threshold.  Panicking should IMO be a
> last resort option when there's no way to workaround the issue.
On Arm we usually handle user errors and suspicious behavior as panics, as opposed
to x86, which is more liberal in that regard. We want to fail as soon as possible.

> 
>> The issue can be reproduced by disabling PDX compression, so not only with
>> Roger's patch.
>>
>> @Julien, I'm thinking of something like this:
>>
>> diff --git a/xen/arch/arm/arm32/mmu/mm.c b/xen/arch/arm/arm32/mmu/mm.c
>> index 4d22f35618aa..e6d9b49acd3c 100644
>> --- a/xen/arch/arm/arm32/mmu/mm.c
>> +++ b/xen/arch/arm/arm32/mmu/mm.c
>> @@ -190,7 +190,6 @@ void __init setup_mm(void)
>>
>>      /* Frame table covers all of RAM region, including holes */
>>      setup_frametable_mappings(ram_start, ram_end);
>> -    max_page = PFN_DOWN(ram_end);
>>
>>      /*
>>       * The allocators may need to use map_domain_page() (such as for
>> diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
>> index a0a2dd8cc762..3e64be6ae664 100644
>> --- a/xen/arch/arm/arm64/mmu/mm.c
>> +++ b/xen/arch/arm/arm64/mmu/mm.c
>> @@ -224,6 +224,9 @@ static void __init setup_directmap_mappings(unsigned long
>> base_mfn,
>>           */
>>          directmap_virt_start = DIRECTMAP_VIRT_START +
>>              (base_mfn - mfn_gb) * PAGE_SIZE;
>> +
>> +        if ( (max_pdx - directmap_base_pdx) > (DIRECTMAP_SIZE >> PAGE_SHIFT) )
>> +            panic("Direct map is too small\n");
> 
> As said above - I would avoid propagating the usage of those offsets
> into generic memory management code, its usage should be confined
> inside the translation functions.
directmap_base_pdx is set a few lines above, so I would not call it propagation.

> 
> Here you probably want to use maddr_to_virt() or similar.
I can't because maddr_to_virt() has the ASSERT with similar check.
> 
> You can maybe pickup:
> 
> https://lore.kernel.org/xen-devel/20250611171636.5674-3-roger.pau@citrix.com/
> 
> And attempt to hook it into ARM?
As said above, we have different ways to approach setting max_pdx. On Arm we
want to panic, on x86 you want to limit the max_pdx.

> 
> I don't think it would be that difficult to reduce the consumption of
> memory map ranges to what Xen can handle.
> 
> Thanks, Roger.

The diff I sent fixes the issue for direct map now. We can take it now if we
want to solve the issue. If we instead want to wait for frametable fixes (wrt
grouping) and possible PDX changes (making offsets common) to be done first, I
can simply park this patch.
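
For illustration only, the arithmetic behind the failing ASSERT and the
proposed check can be reproduced with a minimal standalone sketch. This is not
the Xen code: it assumes a 5TiB direct map (taken from the "anything above
5TiB" remark above), 4KiB pages, and no PDX compression, so mfn_to_pdx() is
modelled as the identity. The helper names maddr_in_directmap() and
directmap_too_small() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT     12
/* Assumption: 5TiB direct map, as suggested by the discussion. */
#define DIRECTMAP_SIZE (UINT64_C(5) << 40)

/* Assumption: memory map not compressible, so the translation is 1:1. */
static uint64_t mfn_to_pdx(uint64_t mfn)
{
    return mfn;
}

/* Same condition as the ASSERT from the log (hypothetical helper name). */
static bool maddr_in_directmap(uint64_t ma, uint64_t directmap_base_pdx)
{
    uint64_t pdx = mfn_to_pdx(ma >> PAGE_SHIFT);

    assert(pdx >= directmap_base_pdx);
    return (pdx - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT);
}

/* Same condition as the proposed panic() check (hypothetical helper name). */
static bool directmap_too_small(uint64_t max_pdx, uint64_t directmap_base_pdx)
{
    return (max_pdx - directmap_base_pdx) > (DIRECTMAP_SIZE >> PAGE_SHIFT);
}
```

With directmap_base_pdx=0, the banks at 0 and 0x800000000 pass the first
check, while the bank at 0x50000000000 has PDX 0x50000000, which equals
DIRECTMAP_SIZE >> PAGE_SHIFT and therefore fails the strict comparison.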

~Michal


Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Stefano Stabellini 3 months, 4 weeks ago
On Wed, 2 Jul 2025, Orzel, Michal wrote:
> > Hm, that's up to the ARM folks, but my opinion is that you should
> > simply ignore memory above the threshold.  Panicking should IMO be a
> > last resort option when there's no way to workaround the issue.
> On Arm we handle user errors and suspicious behavior usually as panics as opposed
> to x86 which is more liberal in that regard. We want to fail as soon as possible.

If we think about it, this is natural because Xen on ARM was mostly
aimed at embedded developers configuring an embedded system. Embedded
developers might not be Xen experts but they are typically engineers.
These people would definitely want to know if part of the memory was
ignored, and might be able to write a fix.

On the other hand Xen on x86 was aimed at non-expert users -- people
apt-get'ing Xen on a Debian system. These people wouldn't know how to
read a panic so we would certainly want to boot anyway even with only
partial resources.

This has worked well so far, but now we are getting x86 in embedded and
ARM on servers, so I think we should discuss and agree on a common
pattern or a configurable pattern to handle this kind of situation.

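
As a starting point for that discussion, a configurable pattern could look
roughly like the sketch below. It is only an illustration: the names
oversize_policy, ram_bank and apply_limit() are hypothetical and do not exist
in Xen, and the limit is expressed as a plain physical address rather than a
PDX.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical policy knob: fail early, or boot with partial resources. */
enum oversize_policy { POLICY_PANIC, POLICY_TRUNCATE };

/* Half-open physical range [start, end). */
struct ram_bank {
    uint64_t start, end;
};

/*
 * Apply an addressable limit to the memory map in place.
 * Returns the number of banks kept, or -1 when the policy asks to panic.
 */
static int apply_limit(struct ram_bank *banks, int nr, uint64_t limit,
                       enum oversize_policy policy)
{
    int kept = 0;

    assert(nr >= 0);

    for ( int i = 0; i < nr; i++ )
    {
        if ( banks[i].end <= limit )
        {
            /* Bank fits entirely below the limit. */
            banks[kept++] = banks[i];
            continue;
        }

        if ( policy == POLICY_PANIC )
            return -1;

        /* Truncate a straddling bank, drop banks fully above the limit. */
        if ( banks[i].start < limit )
        {
            banks[kept] = banks[i];
            banks[kept].end = limit;
            kept++;
        }
    }

    return kept;
}
```

With the memory map from this thread and a 5TiB limit, POLICY_PANIC would
report the failure, while POLICY_TRUNCATE would keep only the first two banks.
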
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Roger Pau Monné 4 months ago
On Wed, Jul 02, 2025 at 10:54:24AM +0200, Orzel, Michal wrote:
> 
> 
> On 02/07/2025 10:26, Roger Pau Monné wrote:
> > On Wed, Jul 02, 2025 at 09:52:45AM +0200, Orzel, Michal wrote:
> >>
> >>
> >> On 02/07/2025 09:00, Roger Pau Monné wrote:
> >>> On Tue, Jul 01, 2025 at 01:46:19PM -0700, Stefano Stabellini wrote:
> >>>> On Tue, 1 Jul 2025, Jan Beulich wrote:
> >>>>> Sadly from this you omitted the output from the setup of the offsets
> >>>>> arrays. Considering also your later reply, I'd be curious to know what
> >>>>> mfn_to_pdx(0x50000000) is.
> >>>>  
> >>>> Full logs here, and debug patch in attachment.
> >>>>
> >>>> (XEN) Checking for initrd in /chosen
> >>>> (XEN) RAM: 0000000000000000 - 000000007fffffff
> >>>> (XEN) RAM: 0000000800000000 - 000000087fffffff
> >>>> (XEN) RAM: 0000050000000000 - 000005007fffffff
> >>>> (XEN) RAM: 0000060000000000 - 000006007fffffff
> >>>> (XEN) RAM: 0000070000000000 - 000007007fffffff
> >>>> (XEN) 
> >>>> (XEN) MODULE[0]: 0000000022000000 - 0000000022172fff Xen         
> >>>> (XEN) MODULE[1]: 0000000022200000 - 000000002220efff Device Tree 
> >>>> (XEN) MODULE[2]: 0000000020400000 - 0000000021e2ffff Kernel      
> >>>> (XEN)  RESVD[0]: 0000000000000000 - 0000000000ffffff
> >>>> (XEN)  RESVD[1]: 0000000001000000 - 00000000015fffff
> >>>> (XEN)  RESVD[2]: 0000000001600000 - 00000000017fffff
> >>>> (XEN)  RESVD[3]: 0000000001800000 - 00000000097fffff
> >>>> (XEN)  RESVD[4]: 0000000009800000 - 000000000bffffff
> >>>> (XEN)  RESVD[5]: 0000000011126000 - 000000001114dfff
> >>>> (XEN)  RESVD[6]: 000000001114e000 - 000000001214efff
> >>>> (XEN)  RESVD[7]: 0000000017275000 - 000000001729cfff
> >>>> (XEN)  RESVD[8]: 000000001729d000 - 000000001829dfff
> >>>> (XEN)  RESVD[9]: 000000001a7df000 - 000000001a806fff
> >>>> (XEN)  RESVD[10]: 000000001a807000 - 000000001b807fff
> >>>> (XEN)  RESVD[11]: 000000001d908000 - 000000001d92ffff
> >>>> (XEN)  RESVD[12]: 000000001d930000 - 000000001e930fff
> >>>> (XEN)  RESVD[13]: 000000001829e000 - 000000001869dfff
> >>>> (XEN)  RESVD[14]: 000000001869e000 - 00000000186ddfff
> >>>> (XEN)  RESVD[15]: 0000000800000000 - 000000083fffffff
> >>>> (XEN) 
> >>>> (XEN) 
> >>>> (XEN) Command line: console=dtuart dom0_mem=2048M console_timestamps=boot debug bootscrub=0 vwfi=native sched=null
> >>>> (XEN) [00000006bfc302ec] parameter "debug" unknown!
> >>>> (XEN) [00000006bfcc0476] DEBUG init_pdx 294 start=0 end=80000000
> >>>> (XEN) [00000006bfcd2400] DEBUG init_pdx 294 start=800000000 end=880000000
> >>>> (XEN) [00000006bfce29ec] DEBUG init_pdx 294 start=50000000000 end=50080000000
> >>>> (XEN) [00000006bfcf1768] DEBUG init_pdx 294 start=60000000000 end=60080000000
> >>>> (XEN) [00000006bfd015a4] DEBUG init_pdx 294 start=70000000000 end=70080000000
> >>>> (XEN) [00000006bfd1444f] DEBUG setup_mm 252
> >>>> (XEN) [00000006bfd3dc6f] DEBUG setup_mm 273 start=0 size=80000000 ram_end=80000000 directmap_base_pdx=0
> >>>> (XEN) [00000006bfd5616e] DEBUG setup_directmap_mappings 229 base_mfn=0 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=0
> >>>> (XEN) [00000006bfd7d38a] DEBUG setup_directmap_mappings 237 base_mfn=0 nr_mfns=80000 directmap_base_pdx=0
> >>>> (XEN) [00000006bfd92728] DEBUG setup_mm 273 start=800000000 size=80000000 ram_end=880000000 directmap_base_pdx=0
> >>>> (XEN) [00000006bfdaba3b] DEBUG setup_directmap_mappings 229 base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=800000
> >>>> (XEN) [00000006bfdcd79c] DEBUG setup_directmap_mappings 237 base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0
> >>>> (XEN) [00000006bfde4d82] DEBUG setup_mm 273 start=50000000000 size=80000000 ram_end=50080000000 directmap_base_pdx=0
> >>>> (XEN) [00000006bfdfaef0] DEBUG setup_directmap_mappings 229 base_mfn=50000000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=50000000
> >>>> (XEN) [00000006bfe35249] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
> >>>
> >>> As said on the other reply, the issue here is that with the v2 PDX
> >>> offset compression logic your memory map is not compressible, and this
> >>> leads to an overflow, as anything above 5TiB won't fit in the
> >>> directmap AFAICT.  We already discussed with Jan that ARM seems to be
> >>> missing any logic to account for the max addressable page:
> >>>
> >>> https://lore.kernel.org/xen-devel/9074f1a6-a605-43f4-97f3-d0a626252d3f@suse.com/
> >>>
> >>> x86 has setup_max_pdx() that truncates the maximum addressable MFN
> >>> based on the active PDX compression and the virtual memory map
> >>> restrictions.  ARM needs similar logic to account for these
> >>> restrictions.
> >>
> >> We have a few issues on Arm. First, we don't check whether direct map is big
> >> enough provided max_pdx that we don't set at all. Second, we don't really use
> >> PDX grouping (can be also used without compression). My patch (that Stefano
> >> attached previously) fixes the second issue (Allejandro will take it over to
> >> come up with common solution).
> > 
> > You probably can handle those as different issues, as PDX grouping is
> > completely disjoint from PDX compression.  It might be helpful if
> > we could split the PDX grouping into a separate file from the PDX
> > compression.
> > 
> > One weirdness I've noticed with ARM is the addition of start offsets
> > to the existing PDX compression, by using directmap_base_pdx,
> > directmap_mfn_start, directmap_base_pdx &c.  I'm not sure whether this will
> > interfere with the PDX compression, but it looks like a bodge.  This
> > should be part of the generic PDX compression implementation, not an
> > extra added on a per-arch basis.
> > 
> > FWIW, PDX offset translation should already compress any gaps from 0
> > to the first RAM range, and hence this won't be needed (in fact it
> > would just make ARM translations slower by doing an extra unneeded
> > operation).  My recommendation would be to move this initial offset
> > compression inside the PDX mask translation.
> > 
> >> For the first issue, we need to know max_page (at
> >> the moment we calculate it in setup_mm() at the very end but we could do it in
> >> init_pdx() to know it ahead of setting direct map) and PDX offset (on x86 there
> >> is no offset). I also think that on Arm we should just panic if direct map is
> >> too small.
> > 
> > Hm, that's up to the ARM folks, but my opinion is that you should
> > simply ignore memory above the threshold.  Panicking should IMO be a
> > last resort option when there's no way to workaround the issue.
> > On Arm we handle user errors and suspicious behavior usually as panics as opposed
> to x86 which is more liberal in that regard. We want to fail as soon as possible.
> 
> > 
> >> The issue can be reproduced by disabling PDX compression, so not only with
> >> Roger's patch.
> >>
> >> @Julien, I'm thinking of something like this:
> >>
> >> diff --git a/xen/arch/arm/arm32/mmu/mm.c b/xen/arch/arm/arm32/mmu/mm.c
> >> index 4d22f35618aa..e6d9b49acd3c 100644
> >> --- a/xen/arch/arm/arm32/mmu/mm.c
> >> +++ b/xen/arch/arm/arm32/mmu/mm.c
> >> @@ -190,7 +190,6 @@ void __init setup_mm(void)
> >>
> >>      /* Frame table covers all of RAM region, including holes */
> >>      setup_frametable_mappings(ram_start, ram_end);
> >> -    max_page = PFN_DOWN(ram_end);
> >>
> >>      /*
> >>       * The allocators may need to use map_domain_page() (such as for
> >> diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
> >> index a0a2dd8cc762..3e64be6ae664 100644
> >> --- a/xen/arch/arm/arm64/mmu/mm.c
> >> +++ b/xen/arch/arm/arm64/mmu/mm.c
> >> @@ -224,6 +224,9 @@ static void __init setup_directmap_mappings(unsigned long
> >> base_mfn,
> >>           */
> >>          directmap_virt_start = DIRECTMAP_VIRT_START +
> >>              (base_mfn - mfn_gb) * PAGE_SIZE;
> >> +
> >> +        if ( (max_pdx - directmap_base_pdx) > (DIRECTMAP_SIZE >> PAGE_SHIFT) )
> >> +            panic("Direct map is too small\n");
> > 
> > As said above - I would avoid propagating the usage of those offsets
> > > into generic memory management code, its usage should be confined
> > inside the translation functions.
> directmap_base_pdx is set a few lines above, so I would not call it propagation.
> 
> > 
> > Here you probably want to use maddr_to_virt() or similar.
> I can't because maddr_to_virt() has the ASSERT with similar check.
> > 
> > You can maybe pickup:
> > 
> > https://lore.kernel.org/xen-devel/20250611171636.5674-3-roger.pau@citrix.com/
> > 
> > And attempt to hook it into ARM?
> As said above, we have different ways to approach setting max_pdx. On Arm we
> want to panic, on x86 you want to limit the max_pdx.
> 
> > 
> > > I don't think it would be that difficult to reduce the consumption of
> > memory map ranges to what Xen can handle.
> > 
> > Thanks, Roger.
> 
> The diff I sent fixes the issue for direct map now. We can take it now if we
> want to solve the issue. If we instead want to wait for frametable fixes (wrt
> grouping) and possible PDX changes (making offsets common) to be done first, I
> can simply park this patch.

No please, don't park it just because of my opinions.  I think Julien
is OK with it, so don't hold back because of my x86 based opinion on
how to handle errors.

Regards, Roger.

Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Stefano Stabellini 3 months, 4 weeks ago
On Wed, 2 Jul 2025, Roger Pau Monné wrote:
> On Wed, Jul 02, 2025 at 10:54:24AM +0200, Orzel, Michal wrote:
> > 
> > 
> > On 02/07/2025 10:26, Roger Pau Monné wrote:
> > > On Wed, Jul 02, 2025 at 09:52:45AM +0200, Orzel, Michal wrote:
> > >>
> > >>
> > >> On 02/07/2025 09:00, Roger Pau Monné wrote:
> > >>> On Tue, Jul 01, 2025 at 01:46:19PM -0700, Stefano Stabellini wrote:
> > >>>> On Tue, 1 Jul 2025, Jan Beulich wrote:
> > >>>>> Sadly from this you omitted the output from the setup of the offsets
> > >>>>> arrays. Considering also your later reply, I'd be curious to know what
> > >>>>> mfn_to_pdx(0x50000000) is.
> > >>>>  
> > >>>> Full logs here, and debug patch in attachment.
> > >>>>
> > >>>> (XEN) Checking for initrd in /chosen
> > >>>> (XEN) RAM: 0000000000000000 - 000000007fffffff
> > >>>> (XEN) RAM: 0000000800000000 - 000000087fffffff
> > >>>> (XEN) RAM: 0000050000000000 - 000005007fffffff
> > >>>> (XEN) RAM: 0000060000000000 - 000006007fffffff
> > >>>> (XEN) RAM: 0000070000000000 - 000007007fffffff
> > >>>> (XEN) 
> > >>>> (XEN) MODULE[0]: 0000000022000000 - 0000000022172fff Xen         
> > >>>> (XEN) MODULE[1]: 0000000022200000 - 000000002220efff Device Tree 
> > >>>> (XEN) MODULE[2]: 0000000020400000 - 0000000021e2ffff Kernel      
> > >>>> (XEN)  RESVD[0]: 0000000000000000 - 0000000000ffffff
> > >>>> (XEN)  RESVD[1]: 0000000001000000 - 00000000015fffff
> > >>>> (XEN)  RESVD[2]: 0000000001600000 - 00000000017fffff
> > >>>> (XEN)  RESVD[3]: 0000000001800000 - 00000000097fffff
> > >>>> (XEN)  RESVD[4]: 0000000009800000 - 000000000bffffff
> > >>>> (XEN)  RESVD[5]: 0000000011126000 - 000000001114dfff
> > >>>> (XEN)  RESVD[6]: 000000001114e000 - 000000001214efff
> > >>>> (XEN)  RESVD[7]: 0000000017275000 - 000000001729cfff
> > >>>> (XEN)  RESVD[8]: 000000001729d000 - 000000001829dfff
> > >>>> (XEN)  RESVD[9]: 000000001a7df000 - 000000001a806fff
> > >>>> (XEN)  RESVD[10]: 000000001a807000 - 000000001b807fff
> > >>>> (XEN)  RESVD[11]: 000000001d908000 - 000000001d92ffff
> > >>>> (XEN)  RESVD[12]: 000000001d930000 - 000000001e930fff
> > >>>> (XEN)  RESVD[13]: 000000001829e000 - 000000001869dfff
> > >>>> (XEN)  RESVD[14]: 000000001869e000 - 00000000186ddfff
> > >>>> (XEN)  RESVD[15]: 0000000800000000 - 000000083fffffff
> > >>>> (XEN) 
> > >>>> (XEN) 
> > >>>> (XEN) Command line: console=dtuart dom0_mem=2048M console_timestamps=boot debug bootscrub=0 vwfi=native sched=null
> > >>>> (XEN) [00000006bfc302ec] parameter "debug" unknown!
> > >>>> (XEN) [00000006bfcc0476] DEBUG init_pdx 294 start=0 end=80000000
> > >>>> (XEN) [00000006bfcd2400] DEBUG init_pdx 294 start=800000000 end=880000000
> > >>>> (XEN) [00000006bfce29ec] DEBUG init_pdx 294 start=50000000000 end=50080000000
> > >>>> (XEN) [00000006bfcf1768] DEBUG init_pdx 294 start=60000000000 end=60080000000
> > >>>> (XEN) [00000006bfd015a4] DEBUG init_pdx 294 start=70000000000 end=70080000000
> > >>>> (XEN) [00000006bfd1444f] DEBUG setup_mm 252
> > >>>> (XEN) [00000006bfd3dc6f] DEBUG setup_mm 273 start=0 size=80000000 ram_end=80000000 directmap_base_pdx=0
> > >>>> (XEN) [00000006bfd5616e] DEBUG setup_directmap_mappings 229 base_mfn=0 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=0
> > >>>> (XEN) [00000006bfd7d38a] DEBUG setup_directmap_mappings 237 base_mfn=0 nr_mfns=80000 directmap_base_pdx=0
> > >>>> (XEN) [00000006bfd92728] DEBUG setup_mm 273 start=800000000 size=80000000 ram_end=880000000 directmap_base_pdx=0
> > >>>> (XEN) [00000006bfdaba3b] DEBUG setup_directmap_mappings 229 base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=800000
> > >>>> (XEN) [00000006bfdcd79c] DEBUG setup_directmap_mappings 237 base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0
> > >>>> (XEN) [00000006bfde4d82] DEBUG setup_mm 273 start=50000000000 size=80000000 ram_end=50080000000 directmap_base_pdx=0
> > >>>> (XEN) [00000006bfdfaef0] DEBUG setup_directmap_mappings 229 base_mfn=50000000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=50000000
> > >>>> (XEN) [00000006bfe35249] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72
> > >>>
> > >>> As said on the other reply, the issue here is that with the v2 PDX
> > >>> offset compression logic your memory map is not compressible, and this
> > >>> leads to an overflow, as anything above 5TiB won't fit in the
> > >>> directmap AFAICT.  We already discussed with Jan that ARM seems to be
> > >>> missing any logic to account for the max addressable page:
> > >>>
> > >>> https://lore.kernel.org/xen-devel/9074f1a6-a605-43f4-97f3-d0a626252d3f@suse.com/
> > >>>
> > >>> x86 has setup_max_pdx() that truncates the maximum addressable MFN
> > >>> based on the active PDX compression and the virtual memory map
> > >>> restrictions.  ARM needs similar logic to account for these
> > >>> restrictions.
> > >>
> > >> We have a few issues on Arm. First, we don't check whether direct map is big
> > >> enough provided max_pdx that we don't set at all. Second, we don't really use
> > >> PDX grouping (can be also used without compression). My patch (that Stefano
> > >> attached previously) fixes the second issue (Allejandro will take it over to
> > >> come up with common solution).
> > > 
> > > You probably can handle those as different issues, as PDX grouping is
> > > completely disjoint from PDX compression.  It might be helpful if
> > > we could split the PDX grouping into a separate file from the PDX
> > > compression.
> > > 
> > > One weirdness I've noticed with ARM is the addition of start offsets
> > > to the existing PDX compression, by using directmap_base_pdx,
> > > directmap_mfn_start, directmap_base_pdx &c.  I'm not sure whether this will
> > > interfere with the PDX compression, but it looks like a bodge.  This
> > > should be part of the generic PDX compression implementation, not an
> > > extra added on a per-arch basis.
> > > 
> > > FWIW, PDX offset translation should already compress any gaps from 0
> > > to the first RAM range, and hence this won't be needed (in fact it
> > > would just make ARM translations slower by doing an extra unneeded
> > > operation).  My recommendation would be to move this initial offset
> > > compression inside the PDX mask translation.
> > > 
> > >> For the first issue, we need to know max_page (at
> > >> the moment we calculate it in setup_mm() at the very end but we could do it in
> > >> init_pdx() to know it ahead of setting direct map) and PDX offset (on x86 there
> > >> is no offset). I also think that on Arm we should just panic if direct map is
> > >> too small.
> > > 
> > > Hm, that's up to the ARM folks, but my opinion is that you should
> > > simply ignore memory above the threshold.  Panicking should IMO be a
> > > last resort option when there's no way to workaround the issue.
> > > On Arm we handle user errors and suspicious behavior usually as panics as opposed
> > to x86 which is more liberal in that regard. We want to fail as soon as possible.
> > 
> > > 
> > >> The issue can be reproduced by disabling PDX compression, so not only with
> > >> Roger's patch.
> > >>
> > >> @Julien, I'm thinking of something like this:
> > >>
> > >> diff --git a/xen/arch/arm/arm32/mmu/mm.c b/xen/arch/arm/arm32/mmu/mm.c
> > >> index 4d22f35618aa..e6d9b49acd3c 100644
> > >> --- a/xen/arch/arm/arm32/mmu/mm.c
> > >> +++ b/xen/arch/arm/arm32/mmu/mm.c
> > >> @@ -190,7 +190,6 @@ void __init setup_mm(void)
> > >>
> > >>      /* Frame table covers all of RAM region, including holes */
> > >>      setup_frametable_mappings(ram_start, ram_end);
> > >> -    max_page = PFN_DOWN(ram_end);
> > >>
> > >>      /*
> > >>       * The allocators may need to use map_domain_page() (such as for
> > >> diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
> > >> index a0a2dd8cc762..3e64be6ae664 100644
> > >> --- a/xen/arch/arm/arm64/mmu/mm.c
> > >> +++ b/xen/arch/arm/arm64/mmu/mm.c
> > >> @@ -224,6 +224,9 @@ static void __init setup_directmap_mappings(unsigned long
> > >> base_mfn,
> > >>           */
> > >>          directmap_virt_start = DIRECTMAP_VIRT_START +
> > >>              (base_mfn - mfn_gb) * PAGE_SIZE;
> > >> +
> > >> +        if ( (max_pdx - directmap_base_pdx) > (DIRECTMAP_SIZE >> PAGE_SHIFT) )
> > >> +            panic("Direct map is too small\n");
> > > 
> > > As said above - I would avoid propagating the usage of those offsets
> > > > into generic memory management code, its usage should be confined
> > > inside the translation functions.
> > directmap_base_pdx is set a few lines above, so I would not call it propagation.
> > 
> > > 
> > > Here you probably want to use maddr_to_virt() or similar.
> > I can't because maddr_to_virt() has the ASSERT with similar check.
> > > 
> > > You can maybe pickup:
> > > 
> > > https://lore.kernel.org/xen-devel/20250611171636.5674-3-roger.pau@citrix.com/
> > > 
> > > And attempt to hook it into ARM?
> > As said above, we have different ways to approach setting max_pdx. On Arm we
> > want to panic, on x86 you want to limit the max_pdx.
> > 
> > > 
> > > > I don't think it would be that difficult to reduce the consumption of
> > > memory map ranges to what Xen can handle.
> > > 
> > > Thanks, Roger.
> > 
> > The diff I sent fixes the issue for direct map now. We can take it now if we
> > want to solve the issue. If we instead want to wait for frametable fixes (wrt
> > grouping) and possible PDX changes (making offsets common) to be done first, I
> > can simply park this patch.
> 
> No please, don't park it just because of my opinions.  I think Julien
> is OK with it, so don't hold back because of my x86 based opinion on
> how to handle errors.

I would also rather have the small improvement now rather than later.

Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Julien Grall 4 months ago
Hi Roger,

On 02/07/2025 09:26, Roger Pau Monné wrote:
> On Wed, Jul 02, 2025 at 09:52:45AM +0200, Orzel, Michal wrote:
>> We have a few issues on Arm. First, we don't check whether direct map is big
>> enough provided max_pdx that we don't set at all. Second, we don't really use
>> PDX grouping (can be also used without compression). My patch (that Stefano
>> attached previously) fixes the second issue (Allejandro will take it over to
>> come up with common solution).
> 
> You probably can handle those as different issues, as PDX grouping is
> completely disjoint from PDX compression.  It might be helpful if
> we could split the PDX grouping into a separate file from the PDX
> compression.
> 
> One weirdness I've noticed with ARM is the addition of start offsets
> to the existing PDX compression, by using directmap_base_pdx,
> directmap_mfn_start, directmap_base_pdx &c.  I'm not sure whether this will
> interfere with the PDX compression, but it looks like a bodge.  This
> should be part of the generic PDX compression implementation, not an
> extra added on a per-arch basis.

They were introduced right at the beginning of the ARM port because we 
have quite a few platforms where the memory doesn't start at 0 and there 
was still a fairly large hole between two banks. IIRC until this series 
we would have been able to handle the hole but not the offset.

If this can be handled in common code, then I would be happy with that.

> 
> FWIW, PDX offset translation should already compress any gaps from 0
> to the first RAM range, and hence this won't be needed (in fact it
> would just make ARM translations slower by doing an extra unneeded
> operation).  My recommendation would be to move this initial offset
> compression inside the PDX mask translation.
> 
>> For the first issue, we need to know max_page (at
>> the moment we calculate it in setup_mm() at the very end but we could do it in
>> init_pdx() to know it ahead of setting direct map) and PDX offset (on x86 there
>> is no offset). I also think that on Arm we should just panic if direct map is
>> too small.
> 
> Hm, that's up to the ARM folks, but my opinion is that you should
> simply ignore memory above the threshold.  Panicking should IMO be a
> last resort option when there's no way to workaround the issue.

This follows the existing pattern within the Arm port. We want to fail
early with a clear error rather than booting a half-broken system.

Cheers,

-- 
Julien Grall


Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Jan Beulich 4 months ago
On 01.07.2025 22:46, Stefano Stabellini wrote:
> On Tue, 1 Jul 2025, Jan Beulich wrote:
>> Sadly from this you omitted the output from the setup of the offsets
>> arrays. Considering also your later reply, I'd be curious to know what
>> mfn_to_pdx(0x50000000) is.
>  
> Full logs here, and debug patch in attachment.
> 
> (XEN) Checking for initrd in /chosen
> (XEN) RAM: 0000000000000000 - 000000007fffffff
> (XEN) RAM: 0000000800000000 - 000000087fffffff
> (XEN) RAM: 0000050000000000 - 000005007fffffff
> (XEN) RAM: 0000060000000000 - 000006007fffffff
> (XEN) RAM: 0000070000000000 - 000007007fffffff
> (XEN) 
> (XEN) MODULE[0]: 0000000022000000 - 0000000022172fff Xen         
> (XEN) MODULE[1]: 0000000022200000 - 000000002220efff Device Tree 
> (XEN) MODULE[2]: 0000000020400000 - 0000000021e2ffff Kernel      
> (XEN)  RESVD[0]: 0000000000000000 - 0000000000ffffff
> (XEN)  RESVD[1]: 0000000001000000 - 00000000015fffff
> (XEN)  RESVD[2]: 0000000001600000 - 00000000017fffff
> (XEN)  RESVD[3]: 0000000001800000 - 00000000097fffff
> (XEN)  RESVD[4]: 0000000009800000 - 000000000bffffff
> (XEN)  RESVD[5]: 0000000011126000 - 000000001114dfff
> (XEN)  RESVD[6]: 000000001114e000 - 000000001214efff
> (XEN)  RESVD[7]: 0000000017275000 - 000000001729cfff
> (XEN)  RESVD[8]: 000000001729d000 - 000000001829dfff
> (XEN)  RESVD[9]: 000000001a7df000 - 000000001a806fff
> (XEN)  RESVD[10]: 000000001a807000 - 000000001b807fff
> (XEN)  RESVD[11]: 000000001d908000 - 000000001d92ffff
> (XEN)  RESVD[12]: 000000001d930000 - 000000001e930fff
> (XEN)  RESVD[13]: 000000001829e000 - 000000001869dfff
> (XEN)  RESVD[14]: 000000001869e000 - 00000000186ddfff
> (XEN)  RESVD[15]: 0000000800000000 - 000000083fffffff
> (XEN) 
> (XEN) 
> (XEN) Command line: console=dtuart dom0_mem=2048M console_timestamps=boot debug bootscrub=0 vwfi=native sched=null
> (XEN) [00000006bfc302ec] parameter "debug" unknown!
> (XEN) [00000006bfcc0476] DEBUG init_pdx 294 start=0 end=80000000
> (XEN) [00000006bfcd2400] DEBUG init_pdx 294 start=800000000 end=880000000
> (XEN) [00000006bfce29ec] DEBUG init_pdx 294 start=50000000000 end=50080000000
> (XEN) [00000006bfcf1768] DEBUG init_pdx 294 start=60000000000 end=60080000000
> (XEN) [00000006bfd015a4] DEBUG init_pdx 294 start=70000000000 end=70080000000
> (XEN) [00000006bfd1444f] DEBUG setup_mm 252

This one is immediately after init_pdx(), i.e. by here the log messages from
Roger's patch (out of pfn_pdx_compression_setup()) should have appeared.
Which at least falsifies my earlier suspicion about there being an ordering
issue. You do have PDX_OFFSET_COMPRESSION=y in your .config, don't you? Are
we perhaps taking the only "return false" path in pfn_offset_sanitize_ranges()
that doesn't issue a log message? I can't see how we could plausibly take the
"Avoid compression if there's no gain" path in pfn_pdx_compression_setup()
itself.

Jan

Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Roger Pau Monné 4 months ago
On Wed, Jul 02, 2025 at 08:32:27AM +0200, Jan Beulich wrote:
> On 01.07.2025 22:46, Stefano Stabellini wrote:
> > On Tue, 1 Jul 2025, Jan Beulich wrote:
> >> Sadly from this you omitted the output from the setup of the offsets
> >> arrays. Considering also your later reply, I'd be curious to know what
> >> mfn_to_pdx(0x50000000) is.
> >  
> > Full logs here, and debug patch in attachment.
> > 
> > (XEN) Checking for initrd in /chosen
> > (XEN) RAM: 0000000000000000 - 000000007fffffff
> > (XEN) RAM: 0000000800000000 - 000000087fffffff
> > (XEN) RAM: 0000050000000000 - 000005007fffffff
> > (XEN) RAM: 0000060000000000 - 000006007fffffff
> > (XEN) RAM: 0000070000000000 - 000007007fffffff
> > (XEN) 
> > (XEN) MODULE[0]: 0000000022000000 - 0000000022172fff Xen         
> > (XEN) MODULE[1]: 0000000022200000 - 000000002220efff Device Tree 
> > (XEN) MODULE[2]: 0000000020400000 - 0000000021e2ffff Kernel      
> > (XEN)  RESVD[0]: 0000000000000000 - 0000000000ffffff
> > (XEN)  RESVD[1]: 0000000001000000 - 00000000015fffff
> > (XEN)  RESVD[2]: 0000000001600000 - 00000000017fffff
> > (XEN)  RESVD[3]: 0000000001800000 - 00000000097fffff
> > (XEN)  RESVD[4]: 0000000009800000 - 000000000bffffff
> > (XEN)  RESVD[5]: 0000000011126000 - 000000001114dfff
> > (XEN)  RESVD[6]: 000000001114e000 - 000000001214efff
> > (XEN)  RESVD[7]: 0000000017275000 - 000000001729cfff
> > (XEN)  RESVD[8]: 000000001729d000 - 000000001829dfff
> > (XEN)  RESVD[9]: 000000001a7df000 - 000000001a806fff
> > (XEN)  RESVD[10]: 000000001a807000 - 000000001b807fff
> > (XEN)  RESVD[11]: 000000001d908000 - 000000001d92ffff
> > (XEN)  RESVD[12]: 000000001d930000 - 000000001e930fff
> > (XEN)  RESVD[13]: 000000001829e000 - 000000001869dfff
> > (XEN)  RESVD[14]: 000000001869e000 - 00000000186ddfff
> > (XEN)  RESVD[15]: 0000000800000000 - 000000083fffffff
> > (XEN) 
> > (XEN) 
> > (XEN) Command line: console=dtuart dom0_mem=2048M console_timestamps=boot debug bootscrub=0 vwfi=native sched=null
> > (XEN) [00000006bfc302ec] parameter "debug" unknown!
> > (XEN) [00000006bfcc0476] DEBUG init_pdx 294 start=0 end=80000000
> > (XEN) [00000006bfcd2400] DEBUG init_pdx 294 start=800000000 end=880000000
> > (XEN) [00000006bfce29ec] DEBUG init_pdx 294 start=50000000000 end=50080000000
> > (XEN) [00000006bfcf1768] DEBUG init_pdx 294 start=60000000000 end=60080000000
> > (XEN) [00000006bfd015a4] DEBUG init_pdx 294 start=70000000000 end=70080000000
> > (XEN) [00000006bfd1444f] DEBUG setup_mm 252
> 
> This one is immediately after init_pdx(), i.e. by here the log messages from
> Roger's patch (out of pfn_pdx_compression_setup()) should have appeared.
> Which at least falsifies my earlier suspicion about there being an ordering
> issue. You do have PDX_OFFSET_COMPRESSION=y in your .config, don't you? Are
> we perhaps taking the only "return false" path in pfn_offset_sanitize_ranges()
> that doesn't issue a log message?

Sorry, should have posted this yesterday.  With the current offset
compression algorithm the memory map provided by Stefano is not
compressible, as the calculated PFN shift leads to lookup table
indexes that overflow the default table size.

I'm working on an improved version that attempts to always preserve
the most significant bits in the lookup table index, even if that
leads to merging regions.
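
To make the overflow concrete, here is a rough sketch of the index
arithmetic for the memory map quoted above. It is not the code from the
series: the 64-entry table (TBL_ORDER), the helper names, and the idea of
picking the shift from the 2GB region size are all assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical default table size: 2^6 = 64 entries (assumption). */
#define TBL_ORDER 6
#define TBL_SIZE  (1UL << TBL_ORDER)

/* Lookup table index of a PFN for a given shift (illustrative helper). */
static inline uint64_t tbl_idx(uint64_t pfn, unsigned int shift)
{
    assert(shift < 64);
    return pfn >> shift;
}

/*
 * Smallest shift for which the index of the highest PFN still fits in
 * the table, i.e. the shift that keeps the most significant bits.
 */
static inline unsigned int min_fitting_shift(uint64_t max_pfn)
{
    unsigned int shift = 0;

    while ( tbl_idx(max_pfn, shift) >= TBL_SIZE )
        shift++;

    return shift;
}
```

Under those assumptions, a shift of 19 (regions of 0x80000 pages) gives
the last region, which ends at PFN 0x7007ffff, an index of 0xe00, well
beyond 64 entries. min_fitting_shift(0x7007ffff) is 25, and with that
shift the regions starting at PFNs 0 and 0x800000 both land on index 0
while the remaining three get indexes 40, 48 and 56. That is the kind of
region merging mentioned above.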

Thanks, Roger.
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Jan Beulich 4 months ago
On 01.07.2025 22:46, Stefano Stabellini wrote:
> On Tue, 1 Jul 2025, Jan Beulich wrote:
>> Sadly from this you omitted the output from the setup of the offsets
>> arrays. Considering also your later reply, I'd be curious to know what
>> mfn_to_pdx(0x50000000) is.
>  
> Full logs here, and debug patch in attachment.

Interesting. Up to ...

> (XEN) Checking for initrd in /chosen
> (XEN) RAM: 0000000000000000 - 000000007fffffff
> (XEN) RAM: 0000000800000000 - 000000087fffffff
> (XEN) RAM: 0000050000000000 - 000005007fffffff
> (XEN) RAM: 0000060000000000 - 000006007fffffff
> (XEN) RAM: 0000070000000000 - 000007007fffffff
> (XEN) 
> (XEN) MODULE[0]: 0000000022000000 - 0000000022172fff Xen         
> (XEN) MODULE[1]: 0000000022200000 - 000000002220efff Device Tree 
> (XEN) MODULE[2]: 0000000020400000 - 0000000021e2ffff Kernel      
> (XEN)  RESVD[0]: 0000000000000000 - 0000000000ffffff
> (XEN)  RESVD[1]: 0000000001000000 - 00000000015fffff
> (XEN)  RESVD[2]: 0000000001600000 - 00000000017fffff
> (XEN)  RESVD[3]: 0000000001800000 - 00000000097fffff
> (XEN)  RESVD[4]: 0000000009800000 - 000000000bffffff
> (XEN)  RESVD[5]: 0000000011126000 - 000000001114dfff
> (XEN)  RESVD[6]: 000000001114e000 - 000000001214efff
> (XEN)  RESVD[7]: 0000000017275000 - 000000001729cfff
> (XEN)  RESVD[8]: 000000001729d000 - 000000001829dfff
> (XEN)  RESVD[9]: 000000001a7df000 - 000000001a806fff
> (XEN)  RESVD[10]: 000000001a807000 - 000000001b807fff
> (XEN)  RESVD[11]: 000000001d908000 - 000000001d92ffff
> (XEN)  RESVD[12]: 000000001d930000 - 000000001e930fff
> (XEN)  RESVD[13]: 000000001829e000 - 000000001869dfff
> (XEN)  RESVD[14]: 000000001869e000 - 00000000186ddfff
> (XEN)  RESVD[15]: 0000000800000000 - 000000083fffffff
> (XEN) 
> (XEN) 
> (XEN) Command line: console=dtuart dom0_mem=2048M console_timestamps=boot debug bootscrub=0 vwfi=native sched=null
> (XEN) [00000006bfc302ec] parameter "debug" unknown!
> (XEN) [00000006bfcc0476] DEBUG init_pdx 294 start=0 end=80000000
> (XEN) [00000006bfcd2400] DEBUG init_pdx 294 start=800000000 end=880000000
> (XEN) [00000006bfce29ec] DEBUG init_pdx 294 start=50000000000 end=50080000000
> (XEN) [00000006bfcf1768] DEBUG init_pdx 294 start=60000000000 end=60080000000
> (XEN) [00000006bfd015a4] DEBUG init_pdx 294 start=70000000000 end=70080000000
> (XEN) [00000006bfd1444f] DEBUG setup_mm 252
> (XEN) [00000006bfd3dc6f] DEBUG setup_mm 273 start=0 size=80000000 ram_end=80000000 directmap_base_pdx=0
> (XEN) [00000006bfd5616e] DEBUG setup_directmap_mappings 229 base_mfn=0 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=0
> (XEN) [00000006bfd7d38a] DEBUG setup_directmap_mappings 237 base_mfn=0 nr_mfns=80000 directmap_base_pdx=0
> (XEN) [00000006bfd92728] DEBUG setup_mm 273 start=800000000 size=80000000 ram_end=880000000 directmap_base_pdx=0
> (XEN) [00000006bfdaba3b] DEBUG setup_directmap_mappings 229 base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=800000
> (XEN) [00000006bfdcd79c] DEBUG setup_directmap_mappings 237 base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0
> (XEN) [00000006bfde4d82] DEBUG setup_mm 273 start=50000000000 size=80000000 ram_end=50080000000 directmap_base_pdx=0
> (XEN) [00000006bfdfaef0] DEBUG setup_directmap_mappings 229 base_mfn=50000000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=50000000
> (XEN) [00000006bfe35249] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at ./arch/arm/include/asm/mmu/mm.h:72

... here there's no sign of PDX compression actually being set up; all that's
there are the init_pdx() messages. Do you perhaps have an ordering problem on
Arm? The register values ...

> (XEN) [00000006bfe68507] ----[ Xen-4.21-unstable  arm64  debug=y  Not tainted ]----
> (XEN) [00000006bfe766bf] CPU:    0
> (XEN) [00000006bfe832e0] PC:     00000a00002da70c setup_mm+0x284/0x308
> (XEN) [00000006bfea5b1a] LR:     00000a00002da6b0
> (XEN) [00000006bfeb1032] SP:     00000a0000327e00
> (XEN) [00000006bfebf403] CPSR:   00000000200003c9 MODE:64-bit EL2h (Hypervisor, handler)
> (XEN) [00000006bfed4634]      X0: 0000000000000017  X1: 0000000000000000  X2: 0000000050000000
> (XEN) [00000006bfee4d11]      X3: 000000004fffffff  X4: 0000000000000020  X5: 0000000000000000
> (XEN) [00000006bfef48cf]      X6: 0000000000000000  X7: 0000000000000000  X8: ffffffffffffffff
> (XEN) [00000006bff047ac]      X9: fefefefefefeff09 X10: 0000000000000080 X11: 0101010101010101
> (XEN) [00000006bff153b4]     X12: 0000000000000008 X13: 0000000000000009 X14: 0000000000000030
> (XEN) [00000006bff2620d]     X15: 00000a0000a00000 X16: 00000a0000291478 X17: 0000000000000000
> (XEN) [00000006bff35c41]     X18: 000000007be9bbe0 X19: 00000a0000292c40 X20: 00000a00002ade68
> (XEN) [00000006bff465a5]     X21: 0000050080000000 X22: 0000000000000000 X23: 0000000180000000
> (XEN) [00000006bff57a51]     X24: 0000000000000002 X25: 00000a0000292c50 X26: 0000000050000000
> (XEN) [00000006bff67d91]     X27: 0000000000080000 X28: 0000050000000000  FP: 00000a0000327e00

... also suggest (x2, x3, and x26 in particular) that offsets are still all
zero, i.e. PDX == MFN. And aiui DIRECTMAP_SIZE is 5TB.
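
For reference, a small sketch of the arithmetic behind the failed
assertion, assuming a 5 TiB directmap, 4K pages, and an identity
MFN to PDX translation (no compression in effect). The helper names are
made up for illustration; only the comparison mirrors the assertion text
in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT     12
/* Assumption: 5 TiB directmap, as per the "aiui" remark above. */
#define DIRECTMAP_SIZE (5ULL << 40)

/* Identity translation, i.e. PDX == MFN (hypothetical helper). */
static inline uint64_t mfn_to_pdx_identity(uint64_t mfn)
{
    return mfn;
}

/* Same comparison as the assertion in asm/mmu/mm.h quoted in the log. */
static inline bool fits_in_directmap(uint64_t maddr,
                                     uint64_t directmap_base_pdx)
{
    uint64_t pdx = mfn_to_pdx_identity(maddr >> PAGE_SHIFT);

    return (pdx - directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT);
}
```

With these assumptions DIRECTMAP_SIZE >> PAGE_SHIFT is 0x50000000, so
fits_in_directmap(0x800000000, 0) holds for the second bank, while
fits_in_directmap(0x50000000000, 0) does not: the third bank starts
exactly at the 5 TiB boundary, and an uncompressed PDX of 0x50000000 is
the first value that no longer fits.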

Jan
Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
Posted by Stefano Stabellini 4 months ago
On Mon, 30 Jun 2025, Stefano Stabellini wrote:
> I added some debug messages (see
> attached stefano-debug.patch). Something seems to be wrong with the
> pdx_group_valid bitmap after 0x880000, as we start getting MFN ranges
> such as 0x254c0000-0x25500000 which don't make any sense to me.

From what I can see, the first time setup_directmap_mappings() is called
with base_mfn=50000000, __mfn_to_virt() goes wrong and triggers the ASSERT
in maddr_to_virt().