This is an updated series, addressing the review comments from an AI agent on
version 1 [0] of the series (some of which were documented as shortcomings).
See below for the changes.
The Realm Guest Linux support is broken without rodata=full (fortunately the default
for arm64), as we detect the RSI support after we have created the linear map
with Block/Contiguous mappings. If the boot CPU doesn't support BBML2_NOABORT
(there are CPUs out there with FEAT_RME and no usable BBML2_NOABORT),
we are then not able to split the page tables down to PTE level if the system
as such doesn't support BBML2.
See the following link for the discussion.
https://lore.kernel.org/all/20260330161705.3349825-2-ryan.roberts@arm.com/
The available options are:
1. Start with PTE level mappings at paging_init() and then "FOLD" the page tables
to Block/Cont mappings after we have the full picture available. Looking at the
future (with BBML3), this might mean "additional work" for most of the systems
at boot. But not as bad as splitting them?
2. Hold the secondary CPUs in a busy loop with the MMU disabled and split the mappings
on the boot CPU with the MMU off (if the boot CPU can't support BBML2). This is tricky
with the page allocations required to add the page tables.
3. Move the detection of Realm support earlier to make a better decision for
paging_init(), with an added bonus of earlycon support for Realms without
the user having to work out the "top bit" for the Realm.
This series is an attempt to implement (3) (without the earlycon support). We try
to probe the PSCI conduit early from the DT/ACPI. The DT is not unflattened at this time.
The ACPI tables are not mapped in full, so we have to map one table at a time,
walking from the root pointer (RSDP) through to the XSDT and finding the FADT in
the array of table pointers there. Minimal verification is performed on the
tables (e.g., revision checks, standard FADT sanity checks). Checksums are not
verified, but that should be possible for the parts we consume.
On arm64, during the normal boot, we could fall back to using DT if the ACPI
tables are not usable. So, during the early probe, we try to follow similar
logic and probe the conduit from both DT and ACPI where available. If both of
them contain a conduit, we only proceed if they match. Otherwise, we skip the
early probe and do things the normal way. (Any sane system shouldn't have such
a mismatch, but..)
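The matching rule above can be written down as a small decision function. This is a minimal sketch, assuming a three-valued conduit type; enum conduit and early_conduit() are hypothetical names for illustration and are not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical conduit type: what a firmware description advertises. */
enum conduit { CONDUIT_NONE = 0, CONDUIT_SMC, CONDUIT_HVC };

/*
 * Decide whether the early probe may proceed.
 * - Both DT and ACPI name a conduit: proceed only if they agree.
 * - Only one names a conduit: use that one.
 * - Neither does: skip the early probe.
 * Returns true and stores the conduit in *out when the early probe can go ahead.
 */
static bool early_conduit(enum conduit dt, enum conduit acpi, enum conduit *out)
{
	if (dt != CONDUIT_NONE && acpi != CONDUIT_NONE && dt != acpi)
		return false;	/* mismatch: leave it to the normal, late probe */

	*out = (dt != CONDUIT_NONE) ? dt : acpi;
	return *out != CONDUIT_NONE;
}
```

Returning false on a mismatch keeps the early path conservative: nothing is decided early, and the existing late probe remains the single source of truth.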
Once we probe the PSCI conduit, PSCI is probed, along with the presence of SMCCC.
With that in place, we try to probe the RSI support after the early probe and
advertise the Realm World. If the early probe wasn't successful, we fall back
to the late mode, where we could end up with the original problem (on possibly
rare, broken firmware).
NOTE: This is an early RFC attempt at moving the PSCI detection earlier. The other
option(s) that may be worth exploring are:
1. On systems with EFI, parse this from the EFI stub and pass the data back in the
DT stub, under the chosen node, e.g., "linux,uefi-arm-psci-conduit".
Challenge: the EFI stub doesn't seem to be ACPI aware. We could make that change,
as we only need a few table walks.
2. Have EFI firmware provide this information (with my limited knowledge of the
area, this looks like too much work, and bending the standards).
3. Extend the arm64 boot protocol to have this information passed to the kernel
(firmware provided) - (Steven's idea).
4. Any other options?
This series is also available here:
git@git.gitlab.arm.com:linux-arm/linux-cca.git cca-guest/early-rsi-detection/rfc-v2
Thoughts ?
Suzuki
Changes since v1:
(Mainly addressing review comments from AI agent)
[0] https://lore.kernel.org/all/20260429103535.266728-1-suzuki.poulose@arm.com
- Handle the ACPI XSDT properly for tables larger than PAGE_SIZE
- Stricter checking for the PSCI DT node: match the compatible to PSCI 0.2 or 1.0
and honor the "status" property of the node, to be closer to the late check
Suzuki K Poulose (4):
  arm64: acpi: Refactor FADT table verification
  psci: Add support for Early detection and init
  arm64: psci: Move detection and SMCCC probe earlier
  arm64: realm: Move RSI detection earlier
arch/arm64/include/asm/acpi.h | 1 +
arch/arm64/include/asm/rsi.h | 1 +
arch/arm64/kernel/acpi.c | 151 ++++++++++++++++++++++++++++------
arch/arm64/kernel/rsi.c | 23 +++++-
arch/arm64/kernel/setup.c | 79 ++++++++++++++++++
drivers/firmware/psci/psci.c | 49 ++++++++++-
include/linux/psci.h | 2 +
7 files changed, 277 insertions(+), 29 deletions(-)
--
2.43.0
Hi Suzuki,

On Tue, May 05, 2026 at 04:57:38PM +0100, Suzuki K Poulose wrote:
> This is an updated series, addressing the review comments from AI agent on
> the version 1 [0] of the series, (some of which were documented as short comings).
> See below for the changes.
>
> The Realm Guest linux support is broken without rodata=full (fortunately default
> for arm64), as we detect the RSI support after we have created the Linear map
> with Block/Contiguous mappings. If the boot CPU doesn't support BBML2_NOABORT
> (there are CPUs out there with FEAT_RME and no - useable - BBML2_NOABORT)
> we are then not able to split the page tables down to PTE level if the system
> as such doesn't support BBML2.
>
> See the following link for the discussion.
>
> https://lore.kernel.org/all/20260330161705.3349825-2-ryan.roberts@arm.com/
>
> The available options are :
> 1. Start with PTE level mappings at paging_init() and then "FOLD" the page tables
> to Block/Cont mappings after we have the full picture available. Looking at the
> future (with BBML3), this might mean "additional work" for most of the systems
> at boot. But not bad as splitting them ?
> 2. Hold the secondary CPUs in busy loop with MMU disabled and split the mappings
> by the boot CPU with MMU off (if Boot CPU can't support BBML2). This is tricky
> with the page allocations required to add the page-tables.
> 3. Move the detection of Realm support earlier to make a better decision for
> paging_init(), with an added bonus of earlycon support for Realms without
> the user having to work out the "top bit" for the Realm.
>
> This series is an attempt to implement (3) (without the earlycon support). We try
> to probe the PSCI conduit early from the DT/ACPI. DT is not flattened at this time.

Looking at the patches, I'm no longer sure it's worth justifying the
complexity. Trying to recall the previous thread - does it only matter
if BBML2_NOABORT is not supported and the kernel boots with rodata=off?

I guess we can ignore big.LITTLE configurations for now since the
deployment of CCA doesn't target mobile yet.

Could we instead add a more informative message in arm64_rsi_init() if
!force_pte_mappings() && !cpu_supports_bbml2_noabort() (before
is_realm_world() becomes true)? Well, it may not print anything if the
early console is not set up yet.

--
Catalin
On 28/05/2026 17:06, Catalin Marinas wrote:
> Hi Suzuki,
>
> On Tue, May 05, 2026 at 04:57:38PM +0100, Suzuki K Poulose wrote:
>> This is an updated series, addressing the review comments from AI agent on
>> the version 1 [0] of the series, (some of which were documented as short comings).
>> See below for the changes.
>>
>> The Realm Guest linux support is broken without rodata=full (fortunately default
>> for arm64), as we detect the RSI support after we have created the Linear map
>> with Block/Contiguous mappings. If the boot CPU doesn't support BBML2_NOABORT
>> (there are CPUs out there with FEAT_RME and no - useable - BBML2_NOABORT)
>> we are then not able to split the page tables down to PTE level if the system
>> as such doesn't support BBML2.
>>
>> See the following link for the discussion.
>>
>> https://lore.kernel.org/all/20260330161705.3349825-2-ryan.roberts@arm.com/
>>
>> The available options are :
>> 1. Start with PTE level mappings at paging_init() and then "FOLD" the page tables
>> to Block/Cont mappings after we have the full picture available. Looking at the
>> future (with BBML3), this might mean "additional work" for most of the systems
>> at boot. But not bad as splitting them ?
>> 2. Hold the secondary CPUs in busy loop with MMU disabled and split the mappings
>> by the boot CPU with MMU off (if Boot CPU can't support BBML2). This is tricky
>> with the page allocations required to add the page-tables.
>> 3. Move the detection of Realm support earlier to make a better decision for
>> paging_init(), with an added bonus of earlycon support for Realms without
>> the user having to work out the "top bit" for the Realm.
>>
>> This series is an attempt to implement (3) (without the earlycon support). We try
>> to probe the PSCI conduit early from the DT/ACPI. DT is not flattened at this time.
>
> Looking at the patches, I'm no longer sure it's worth justifying the
> complexity.

Yep, it is not simple with ACPI.

> Trying to recall the previous thread - does it only matter
> if BBML2_NOABORT is not supported and the kernel boots with rodata=off?

Correct.

> I guess we can ignore big.LITTLE configurations for now since the
> deployment of CCA doesn't target mobile yet.

True.

>
> Could we instead add a more informative message in arm64_rsi_init() if
> !force_pte_mappings() && !cpu_supports_bbml2_noabort() (before
> is_realm_world() becomes true)? Well, it may not print anything if the
> early console is not set up yet.

That is true, but with some expertise you may be able to enable earlycon
and maybe we could get some new mechanism for "earlycon" for Realms.

The other way to look at it is:

When the system doesn't support BBML2_NOABORT:

Creating block/Cont mappings to start with and then splitting them to PTE
is quite difficult, as we:
 1. Need to allocate pages for leaf level tables
 2. Hold the other CPUs in a tight loop

Instead, creating the block/CONT levels from fully "page level"
mappings is easier, as we can:

 1. Easily fold the tables to Block mappings, reclaiming the leaf
    level page tables.

 2. Avoid the secondary CPUs dance, as they all support BBML2_NOABORT.

This shouldn't be as bad as the opposite?

I understand there are concerns about the boot time. Maybe we could add a
kernel command line option to force block mappings and slowly deprecate it?

Suzuki
On Fri, May 29, 2026 at 01:27:01PM +0100, Suzuki K Poulose wrote:
> On 28/05/2026 17:06, Catalin Marinas wrote:
> > On Tue, May 05, 2026 at 04:57:38PM +0100, Suzuki K Poulose wrote:
> > > This is an updated series, addressing the review comments from AI agent on
> > > the version 1 [0] of the series, (some of which were documented as short comings).
> > > See below for the changes.
> > >
> > > The Realm Guest linux support is broken without rodata=full (fortunately default
> > > for arm64), as we detect the RSI support after we have created the Linear map
> > > with Block/Contiguous mappings. If the boot CPU doesn't support BBML2_NOABORT
> > > (there are CPUs out there with FEAT_RME and no - useable - BBML2_NOABORT)
> > > we are then not able to split the page tables down to PTE level if the system
> > > as such doesn't support BBML2.
> > >
> > > See the following link for the discussion.
> > >
> > > https://lore.kernel.org/all/20260330161705.3349825-2-ryan.roberts@arm.com/
> > >
> > > The available options are :
> > > 1. Start with PTE level mappings at paging_init() and then "FOLD" the page tables
> > > to Block/Cont mappings after we have the full picture available. Looking at the
> > > future (with BBML3), this might mean "additional work" for most of the systems
> > > at boot. But not bad as splitting them ?
> > > 2. Hold the secondary CPUs in busy loop with MMU disabled and split the mappings
> > > by the boot CPU with MMU off (if Boot CPU can't support BBML2). This is tricky
> > > with the page allocations required to add the page-tables.
> > > 3. Move the detection of Realm support earlier to make a better decision for
> > > paging_init(), with an added bonus of earlycon support for Realms without
> > > the user having to work out the "top bit" for the Realm.
> > >
> > > This series is an attempt to implement (3) (without the earlycon support). We try
> > > to probe the PSCI conduit early from the DT/ACPI. DT is not flattened at this time.
[...]
> > Could we instead add a more informative message in arm64_rsi_init() if
> > !force_pte_mappings() && !cpu_supports_bbml2_noabort() (before
> > is_realm_world() becomes true)? Well, it may not print anything if the
> > early console is not set up yet.
>
> That is true, but with some expertise you may be able to enable earlycon
> and may be we could get some new mechanism for "earlycon" for Realms.
>
> The other way to look at is:
>
> When the system doesn't support BBML2 Abort:
>
> Creating block/Cont mappings to start with and then splitting it to PTE
> is quite difficult as we :
>  1. Need to allocate pages for leaf level tables
>  2. Hold the other CPUs in tight loop

Agree, that's not easily possible at runtime.

> Instead, creating the block/CONT levels from a fully "page level"
> mappings are easier, as we can:
>
> 1. Can easily fold the tables to Block mapping with reclaiming the leaf
>    level pagetables.
>
> 2. Avoid the secondary CPUs dance, as they all support BBML2_NOABORT.
>
> This shouldn't be that bad as the opposite ?

I don't think it solves our problem. Aren't we concerned with the
rodata=off && !BBML2_NOABORT && is_realm_world() case? I don't think
your second point stands.

Currently we have:

rodata=full && BBML2_NOABORT => block mappings irrespective of realms

rodata=off && BBML2_NOABORT => block mappings first, can be split later
                               if is_realm_world()

rodata=off && !BBML2_NOABORT => block mappings first, serious problem if
                                is_realm_world()

It's the last case we need to fix. Starting with page mappings does
avoid the in-realm failure but the !is_realm_world() case folding to
block mappings still requires proper BBM.

--
Catalin
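For readers following the thread, the three cases listed in the message above can be restated as a small C decision function. This is only a sketch of the discussion, not kernel code: enum map_outcome and linear_map_outcome() are hypothetical names, and the fourth combination (rodata=full without BBML2_NOABORT, shown here as PTE mappings) is not stated in the message and is an assumption drawn from the cover letter's remark that Realm guests work with rodata=full.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the linear map outcomes discussed in the thread. */
enum map_outcome {
	MAP_PTE,		/* page mappings from the start (assumed case) */
	MAP_BLOCK,		/* block mappings, nothing further needed */
	MAP_BLOCK_SPLIT_LATER,	/* block mappings first, split once in a realm */
	MAP_BLOCK_BROKEN,	/* block mappings first, cannot be split safely */
};

static enum map_outcome linear_map_outcome(bool rodata_full, bool bbml2_noabort,
					   bool realm)
{
	if (rodata_full) {
		/* "block mappings irrespective of realms" when BBML2_NOABORT */
		/* ASSUMPTION: without BBML2_NOABORT, rodata=full uses PTEs. */
		return bbml2_noabort ? MAP_BLOCK : MAP_PTE;
	}

	/* rodata=off: block mappings first in every case. */
	if (!realm)
		return MAP_BLOCK;

	/* In a realm the linear map must be split; that needs BBML2_NOABORT. */
	return bbml2_noabort ? MAP_BLOCK_SPLIT_LATER : MAP_BLOCK_BROKEN;
}
```

Only the last branch can return MAP_BLOCK_BROKEN, which matches the point that rodata=off together with !BBML2_NOABORT in a realm is the one case that needs fixing.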
On 11/06/2026 12:14, Catalin Marinas wrote:
> On Fri, May 29, 2026 at 01:27:01PM +0100, Suzuki K Poulose wrote:
>> On 28/05/2026 17:06, Catalin Marinas wrote:
>>> On Tue, May 05, 2026 at 04:57:38PM +0100, Suzuki K Poulose wrote:
>>>> This is an updated series, addressing the review comments from AI agent on
>>>> the version 1 [0] of the series, (some of which were documented as short comings).
>>>> See below for the changes.
>>>>
>>>> The Realm Guest linux support is broken without rodata=full (fortunately default
>>>> for arm64), as we detect the RSI support after we have created the Linear map
>>>> with Block/Contiguous mappings. If the boot CPU doesn't support BBML2_NOABORT
>>>> (there are CPUs out there with FEAT_RME and no - useable - BBML2_NOABORT)
>>>> we are then not able to split the page tables down to PTE level if the system
>>>> as such doesn't support BBML2.
>>>>
>>>> See the following link for the discussion.
>>>>
>>>> https://lore.kernel.org/all/20260330161705.3349825-2-ryan.roberts@arm.com/
>>>>
>>>> The available options are :
>>>> 1. Start with PTE level mappings at paging_init() and then "FOLD" the page tables
>>>> to Block/Cont mappings after we have the full picture available. Looking at the
>>>> future (with BBML3), this might mean "additional work" for most of the systems
>>>> at boot. But not bad as splitting them ?
>>>> 2. Hold the secondary CPUs in busy loop with MMU disabled and split the mappings
>>>> by the boot CPU with MMU off (if Boot CPU can't support BBML2). This is tricky
>>>> with the page allocations required to add the page-tables.
>>>> 3. Move the detection of Realm support earlier to make a better decision for
>>>> paging_init(), with an added bonus of earlycon support for Realms without
>>>> the user having to work out the "top bit" for the Realm.
>>>>
>>>> This series is an attempt to implement (3) (without the earlycon support). We try
>>>> to probe the PSCI conduit early from the DT/ACPI. DT is not flattened at this time.
> [...]
>>> Could we instead add a more informative message in arm64_rsi_init() if
>>> !force_pte_mappings() && !cpu_supports_bbml2_noabort() (before
>>> is_realm_world() becomes true)? Well, it may not print anything if the
>>> early console is not set up yet.
>>
>> That is true, but with some expertise you may be able to enable earlycon
>> and may be we could get some new mechanism for "earlycon" for Realms.
>>
>> The other way to look at is:
>>
>> When the system doesn't support BBML2 Abort:
>>
>> Creating block/Cont mappings to start with and then splitting it to PTE
>> is quite difficult as we :
>>  1. Need to allocate pages for leaf level tables
>>  2. Hold the other CPUs in tight loop
>
> Agree, that's not easily possible at runtime.
>
>> Instead, creating the block/CONT levels from a fully "page level"
>> mappings are easier, as we can:
>>
>> 1. Can easily fold the tables to Block mapping with reclaiming the leaf
>>    level pagetables.
>>
>> 2. Avoid the secondary CPUs dance, as they all support BBML2_NOABORT.
>>
>> This shouldn't be that bad as the opposite ?
>
> I don't think it solves our problem. Aren't we concerned with the
> rodata=off && !BBML2_NOABORT && is_realm_world() case? I don't think
> your second point stands.
>
> Currently we have:
>
> rodata=full && BBML2_NOABORT => block mappings irrespective of realms
>
> rodata=off && BBML2_NOABORT => block mappings first, can be split later
>                                if is_realm_world()
>
> rodata=off && !BBML2_NOABORT => block mappings first, serious problem if
>                                 is_realm_world()
>
> It's the last case we need to fix. Starting with page mappings does
> avoid the in-realm failure but the !is_realm_world() case folding to
> block mappings still requires proper BBM.

I see, the case I was missing is:

  !is_realm_world() and !BBML2_NOABORT, and we want Block mappings if
  rodata=off.

Yes, in this case we need the secondaries on hold, with proper BBM on
the boot CPU too.

Again, it is easier to "collapse the tables to Block" than the reverse.

Suzuki