arch/x86/xen/setup.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
When fixing a conflict in xen_e820_resolve_conflicts(), the loop over
the E820 map entries needs to be restarted, as the E820 map will have
been modified by the fix. Otherwise entries might be skipped by
accident.
Fixes: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
arch/x86/xen/setup.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index ac8021c3a997..bb95a05259b8 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -695,17 +695,22 @@ static void __init xen_e820_resolve_conflicts(phys_addr_t start,
return;
end = start + size;
- entry = xen_e820_table.entries;
+ mapcnt = 0;
- for (mapcnt = 0; mapcnt < xen_e820_table.nr_entries; mapcnt++) {
+ while (mapcnt < xen_e820_table.nr_entries) {
+ entry = xen_e820_table.entries + mapcnt;
if (entry->addr >= end)
return;
if (entry->addr + entry->size > start &&
- entry->type == E820_TYPE_NVS)
+ entry->type == E820_TYPE_NVS) {
xen_e820_swap_entry_with_ram(entry);
+ /* E820 map has been changed, restart loop! */
+ mapcnt = 0;
+ continue;
+ }
- entry++;
+ mapcnt++;
}
}
--
2.54.0
On 05.05.2026 10:06, Juergen Gross wrote:
> When fixing a conflict in xen_e820_resolve_conflicts(), the loop over
> the E820 map entries needs to be restarted, as the E820 map will have
> been modified by the fix. Otherwise entries might be skipped by
> accident.
>
> Fixes: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory")
> Signed-off-by: Juergen Gross <jgross@suse.com>
First, while trying to review this, isn't there another issue in
xen_e820_swap_entry_with_ram(), in that
entry->addr = entry_end - swap_size +
swap_addr - swap_entry->addr;
really means to be
entry->addr = entry_end - swap_size +
swap_entry->addr - swap_addr;
(affecting non-page-aligned E820 entries)?
Further, that function converts swap_entry to the page-aligned superset
of the passed in range. How is it guaranteed that this new range won't
overlap with the predecessor and/or successor one? Wouldn't that need
to be conversion to the page-aligned subset instead?
And then, is passing the page-aligned superset to xen_add_remap_nonram()
really appropriate? Why would any leading or trailing space there be
subject to remapping?
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -695,17 +695,22 @@ static void __init xen_e820_resolve_conflicts(phys_addr_t start,
> return;
>
> end = start + size;
> - entry = xen_e820_table.entries;
> + mapcnt = 0;
>
> - for (mapcnt = 0; mapcnt < xen_e820_table.nr_entries; mapcnt++) {
> + while (mapcnt < xen_e820_table.nr_entries) {
> + entry = xen_e820_table.entries + mapcnt;
> if (entry->addr >= end)
> return;
>
> if (entry->addr + entry->size > start &&
> - entry->type == E820_TYPE_NVS)
> + entry->type == E820_TYPE_NVS) {
> xen_e820_swap_entry_with_ram(entry);
> + /* E820 map has been changed, restart loop! */
> + mapcnt = 0;
> + continue;
> + }
>
> - entry++;
> + mapcnt++;
> }
> }
Given what exactly xen_e820_swap_entry_with_ram() does, restarting from
entry 0 looks to be needed only if the non-RAM entry ended up moving down
(strictly speaking even there it wouldn't need to be entry 0). If it
moved up, simply not incrementing mapcnt would look to suffice. Since the
extra overhead is likely tolerable here (with simplicity of the code
being more important), this may want mentioning in a code comment (or at
least the description). Preferably with that:
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan
On 05.05.26 10:43, Jan Beulich wrote:
> On 05.05.2026 10:06, Juergen Gross wrote:
>> When fixing a conflict in xen_e820_resolve_conflicts(), the loop over
>> the E820 map entries needs to be restarted, as the E820 map will have
>> been modified by the fix. Otherwise entries might be skipped by
>> accident.
>>
>> Fixes: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory")
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>
> First, while trying to review this, isn't there another issue in
> xen_e820_swap_entry_with_ram(), in that
>
> entry->addr = entry_end - swap_size +
> swap_addr - swap_entry->addr;
>
>
> really means to be
>
> entry->addr = entry_end - swap_size +
> swap_entry->addr - swap_addr;
>
> (affecting non-page-aligned E820 entries)?
Yes, you are right.
>
> Further, that function converts swap_entry to the page-aligned superset
> of the passed in range. How is it guaranteed that this new range won't
> overlap with the predecessor and/or successor one? Wouldn't that need
> to be conversion to the page-aligned subset instead?
This is subtle. :-)
We are converting to RAM (usable), so the type value is 1. e820__update_table()
will handle overlaps just fine, with higher type values "winning" against lower
ones. So any other region overlapping with the new RAM region will result in
another conflict in the next loop iteration.
Using the page-aligned subset would result in possible memory holes, which would
be problematic (the kernel or page tables shouldn't have holes, after all).
>
> And then, is passing the page-aligned superset to xen_add_remap_nonram()
> really appropriate? Why would any leading or trailing space there be
> subject to remapping?
How would you want to remap a sub-page physical memory area to another location
without affecting the rest of the page? We are reworking the final p2m map here.
>
>> --- a/arch/x86/xen/setup.c
>> +++ b/arch/x86/xen/setup.c
>> @@ -695,17 +695,22 @@ static void __init xen_e820_resolve_conflicts(phys_addr_t start,
>> return;
>>
>> end = start + size;
>> - entry = xen_e820_table.entries;
>> + mapcnt = 0;
>>
>> - for (mapcnt = 0; mapcnt < xen_e820_table.nr_entries; mapcnt++) {
>> + while (mapcnt < xen_e820_table.nr_entries) {
>> + entry = xen_e820_table.entries + mapcnt;
>> if (entry->addr >= end)
>> return;
>>
>> if (entry->addr + entry->size > start &&
>> - entry->type == E820_TYPE_NVS)
>> + entry->type == E820_TYPE_NVS) {
>> xen_e820_swap_entry_with_ram(entry);
>> + /* E820 map has been changed, restart loop! */
>> + mapcnt = 0;
>> + continue;
>> + }
>>
>> - entry++;
>> + mapcnt++;
>> }
>> }
>
> Given what exactly xen_e820_swap_entry_with_ram() does, restarting from
> entry 0 looks to be needed only if the non-RAM entry ended up moving down
> (strictly speaking even there it wouldn't need to be entry 0). If it
> moved up, simply not incrementing mapcnt would look to suffice. Since the
> extra overhead is likely tolerable here (with simplicity of the code
> being more important), this may want mentioning in a code comment (or at
> least the description). Preferably with that:
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Thanks,
Juergen
On 05.05.2026 11:13, Jürgen Groß wrote:
> On 05.05.26 10:43, Jan Beulich wrote:
>> On 05.05.2026 10:06, Juergen Gross wrote:
>>> When fixing a conflict in xen_e820_resolve_conflicts(), the loop over
>>> the E820 map entries needs to be restarted, as the E820 map will have
>>> been modified by the fix. Otherwise entries might be skipped by
>>> accident.
>>>
>>> Fixes: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory")
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>
>> First, while trying to review this, isn't there another issue in
>> xen_e820_swap_entry_with_ram(), in that
>>
>> entry->addr = entry_end - swap_size +
>> swap_addr - swap_entry->addr;
>>
>>
>> really means to be
>>
>> entry->addr = entry_end - swap_size +
>> swap_entry->addr - swap_addr;
>>
>> (affecting non-page-aligned E820 entries)?
>
> Yes, you are right.
>
>>
>> Further, that function converts swap_entry to the page-aligned superset
>> of the passed in range. How is it guaranteed that this new range won't
>> overlap with the predecessor and/or successor one? Wouldn't that need
>> to be conversion to the page-aligned subset instead?
>
> This is subtle. :-)
>
> We are converting to RAM (usable), so the type value is 1. e820__update_table()
> will handle overlaps just fine, with higher type values "winning" against lower
> ones. So any other region overlapping with the new RAM region will result in
> another conflict in the next loop iteration.
Oh, wow, and this is a property of the function that one can rely upon?
> Using the page-aligned subset would result in possible memory holes, which would
> be problematic (the kernel or page tables shouldn't have holes, after all).
Aren't such holes normal to occur, e.g. on misaligned RAM/UNUSABLE
boundaries?
>> And then, is passing the page-aligned superset to xen_add_remap_nonram()
>> really appropriate? Why would any leading or trailing space there be
>> subject to remapping?
>
> How would you want to remap a sub-page physical memory area to another location
> without affecting the rest of the page? We are reworking the final p2m map here.
Well, first and foremost: xen_add_remap_nonram() takes and stores byte-
granular addresses / sizes, with the sole requirement being that the
offset-into-page be identical between both addresses. That check alone
already indicates that non-page-aligned addresses are expected to be
passed into here.
Further, xen_acpi_os_ioremap() uses the resulting remap table, and is
byte granular. With the physical address adjustment there, both mappings
could (theoretically) coexist. But the problem I'm trying to point out
is that by passing the page-aligned superset into xen_add_remap_nonram()
you misguide xen_acpi_os_ioremap() (while at the same time
xen_do_remap_nonram() will do suitable rounding to page boundaries even
if exact addresses were passed).
Jan
On 05.05.26 12:24, Jan Beulich wrote:
> On 05.05.2026 11:13, Jürgen Groß wrote:
>> On 05.05.26 10:43, Jan Beulich wrote:
>>> On 05.05.2026 10:06, Juergen Gross wrote:
>>>> When fixing a conflict in xen_e820_resolve_conflicts(), the loop over
>>>> the E820 map entries needs to be restarted, as the E820 map will have
>>>> been modified by the fix. Otherwise entries might be skipped by
>>>> accident.
>>>>
>>>> Fixes: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory")
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>
>>> First, while trying to review this, isn't there another issue in
>>> xen_e820_swap_entry_with_ram(), in that
>>>
>>> entry->addr = entry_end - swap_size +
>>> swap_addr - swap_entry->addr;
>>>
>>>
>>> really means to be
>>>
>>> entry->addr = entry_end - swap_size +
>>> swap_entry->addr - swap_addr;
>>>
>>> (affecting non-page-aligned E820 entries)?
>>
>> Yes, you are right.
>>
>>>
>>> Further, that function converts swap_entry to the page-aligned superset
>>> of the passed in range. How is it guaranteed that this new range won't
>>> overlap with the predecessor and/or successor one? Wouldn't that need
>>> to be conversion to the page-aligned subset instead?
>>
>> This is subtle. :-)
>>
>> We are converting to RAM (usable), so the type value is 1. e820__update_table()
>> will handle overlaps just fine, with higher type values "winning" against lower
>> ones. So any other region overlapping with the new RAM region will result in
>> another conflict in the next loop iteration.
>
> Oh, wow, and this is a property of the function that one can rely upon?
It is documented to be handled this way.
>> Using the page-aligned subset would result in possible memory holes, which would
>> be problematic (the kernel or page tables shouldn't have holes, after all).
>
> Aren't such holes normal to occur, e.g. on misaligned RAM/UNUSABLE
> boundaries?
This can happen, yes, but it should not be the case in the area where the
kernel is actually located.
>
>>> And then, is passing the page-aligned superset to xen_add_remap_nonram()
>>> really appropriate? Why would any leading or trailing space there be
>>> subject to remapping?
>>
>> How would you want to remap a sub-page physical memory area to another location
>> without affecting the rest of the page? We are reworking the final p2m map here.
>
> Well, first and foremost: xen_add_remap_nonram() takes and stores byte-
> granular addresses / sizes, with the sole requirement being that the
> offset-into-page be identical between both addresses. That check alone
> already indicates that non-page-aligned addresses are expected to be
> passed into here.
I'd say "tolerated" instead of "expected".
> Further, xen_acpi_os_ioremap() uses the resulting remap table, and is
> byte granular. With the physical address adjustment there, both mappings
> could (theoretically) coexist. But the problem I'm trying to point out
> is that by passing the page-aligned superset into xen_add_remap_nonram()
> you misguide xen_acpi_os_ioremap() (while at the same time
> xen_do_remap_nonram() will do suitable rounding to page boundaries even
> if exact addresses were passed).
Ah, okay, now I understand your concern.
I'm on the edge whether a change is wanted or not. The current implementation
is correct, while I agree that using non-page-aligned addresses should work.
OTOH using a superset is fine, too. Especially as the remap is done based on
memory map entries, while the caller of xen_acpi_os_ioremap() will act based
on ACPI table entries. It is perfectly fine to have multiple NVS records in
an area covered by a single memory map entry, so calling xen_acpi_os_ioremap()
only for a part of a remap entry isn't weird at all.
So the implementation needs to ensure that a remap entry is allowed to be a
superset of an area mapped via xen_acpi_os_ioremap(), resulting in no need to
modify the current coding.
As this whole handling was added to support a very rare case, I'd rather not
risk to break that case by doing cosmetic changes. OTOH I wouldn't NACK a patch.
Juergen
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 5f8719945244dd65b5fa06195f4600db62581610
Gitweb: https://git.kernel.org/tip/5f8719945244dd65b5fa06195f4600db62581610
Author: Juergen Gross <jgross@suse.com>
AuthorDate: Tue, 05 May 2026 10:06:53 +02:00
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 05 May 2026 10:17:00 +02:00
x86/xen: Fix a potential problem in xen_e820_resolve_conflicts()
When fixing a conflict in xen_e820_resolve_conflicts(), the loop over
the E820 map entries needs to be restarted, as the E820 map will have
been modified by the fix. Otherwise entries might be skipped by
accident.
Fixes: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: xen-devel@lists.xenproject.org
Link: https://patch.msgid.link/20260505080653.197775-1-jgross@suse.com
---
arch/x86/xen/setup.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index ac8021c..bb95a05 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -695,17 +695,22 @@ static void __init xen_e820_resolve_conflicts(phys_addr_t start,
return;
end = start + size;
- entry = xen_e820_table.entries;
+ mapcnt = 0;
- for (mapcnt = 0; mapcnt < xen_e820_table.nr_entries; mapcnt++) {
+ while (mapcnt < xen_e820_table.nr_entries) {
+ entry = xen_e820_table.entries + mapcnt;
if (entry->addr >= end)
return;
if (entry->addr + entry->size > start &&
- entry->type == E820_TYPE_NVS)
+ entry->type == E820_TYPE_NVS) {
xen_e820_swap_entry_with_ram(entry);
+ /* E820 map has been changed, restart loop! */
+ mapcnt = 0;
+ continue;
+ }
- entry++;
+ mapcnt++;
}
}
© 2016 - 2026 Red Hat, Inc.