The address translation logic in get_physical_address() currently
truncates physical addresses to 32 bits unless long mode is enabled.
This is incorrect when using physical address extensions (PAE) outside
of long mode, with the result that a 32-bit operating system using PAE
to access memory above 4G will experience undefined behaviour.
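
As a minimal standalone illustration (not QEMU code; the physical
address below is made up), truncating a PAE physical address silently
aliases it back below 4G:

    /* pae_trunc.c - illustration only, values are hypothetical. */
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* A PAE page table entry can point above 4G... */
        uint64_t paddr = 0x123456000ULL;
        /* ...but a 32-bit truncation aliases it below 4G. */
        uint64_t truncated = (uint32_t)paddr;
        printf("%#llx -> %#llx\n",
               (unsigned long long)paddr,
               (unsigned long long)truncated);
        return 0;
    }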
The truncation code was originally introduced in commit 33dfdb5 ("x86:
only allow real mode to access 32bit without LMA"), where it applied
only to translations performed while paging is disabled (and so cannot
affect guests using PAE).
Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
rearranged the code such that the truncation also applied to the use
of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
atomic operations for pte updates") brought this truncation into scope
for page table entry accesses, and is the first commit for which a
Windows 10 32-bit guest will reliably fail to boot if memory above 4G
is present.
The truncation code, however, is not completely redundant. Even though the
maximum address size for any executed instruction is 32 bits, helpers for
operations such as BOUND, FSAVE or XSAVE may ask get_physical_address()
to translate an address outside of the 32-bit range, if invoked with an
argument that is close to the 4G boundary.
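
As a hedged sketch of one such case (the helper_boundl() line that
reads the upper bound from a0 + 4 is quoted in the discussion below;
the standalone harness here is hypothetical):

    /* bound_4g.c - how a BOUND upper-bound load can cross 4G. */
    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t target_ulong;  /* 64-bit target_ulong, as in QEMU */

    int main(void)
    {
        /* BOUND operand sitting right at the 4G boundary. */
        target_ulong a0 = 0xfffffffcULL;
        /* The upper bound is read from a0 + 4, computed without
         * truncation, so the address lands outside 32 bits. */
        target_ulong high_addr = a0 + 4;
        printf("upper-bound load at %#llx\n",
               (unsigned long long)high_addr);
        return 0;
    }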
So, move the address truncation in get_physical_address() so that it
is applied only in the CR0.PG=0 case.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040
Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18)
Cc: qemu-stable@nongnu.org
Co-developed-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/tcg/sysemu/excp_helper.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 11126c860d4..eee1af52710 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -577,17 +577,14 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
}
return mmu_translate(env, &in, out, err);
}
+
+ /* No paging implies long mode is disabled. */
+ addr = (uint32_t)addr;
break;
}
- /* Translation disabled. */
+ /* No translation needed. */
out->paddr = addr & x86_get_a20_mask(env);
-#ifdef TARGET_X86_64
- if (!(env->hflags & HF_LMA_MASK)) {
- /* Without long mode we can only address 32bits in real mode */
- out->paddr = (uint32_t)out->paddr;
- }
-#endif
out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
out->page_size = TARGET_PAGE_SIZE;
return true;
--
2.43.0
On 22/12/2023 17:59, Paolo Bonzini wrote:
> The address translation logic in get_physical_address() currently
> truncates physical addresses to 32 bits unless long mode is enabled.
> This is incorrect when using physical address extensions (PAE)
> outside of long mode, with the result that a 32-bit operating system
> using PAE to access memory above 4G will experience undefined
> behaviour.
>
> The truncation code was originally introduced in commit 33dfdb5
> ("x86: only allow real mode to access 32bit without LMA"), where it
> applied only to translations performed while paging is disabled (and
> so cannot affect guests using PAE).
>
> Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
> rearranged the code such that the truncation also applied to the use
> of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
> atomic operations for pte updates") brought this truncation into
> scope for page table entry accesses, and is the first commit for
> which a Windows 10 32-bit guest will reliably fail to boot if memory
> above 4G is present.
>
> The truncation code, however, is not completely redundant. Even
> though the maximum address size for any executed instruction is 32
> bits, helpers for operations such as BOUND, FSAVE or XSAVE may ask
> get_physical_address() to translate an address outside of the 32-bit
> range, if invoked with an argument that is close to the 4G boundary.

Thank you for investigating and updating this patch.

I am confused by how BOUND can result in an access to a linear address
outside of the address-size range. I don't know the internals well
enough, but I'm guessing it might be in the line in helper_boundl():

    high = cpu_ldl_data_ra(env, a0 + 4, GETPC());

where an address is calculated as (a0 + 4) using a 64-bit target_ulong
type with no truncation to 32 bits applied.

If so, then ought the truncation to be applied on this line instead
(and the equivalent in helper_boundw())? My understanding (which may
well be incorrect) is that the linear address gets truncated to the
instruction address size (16 or 32 bits) before any conversion to a
physical address takes place.

Regardless: this updated patch (in isolation) definitely fixes the
issue that I observed, so I'm happy for an added

Tested-by: Michael Brown <mcb30@ipxe.org>

Thanks,

Michael
On Sat, 23 Dec 2023 at 11:34, Michael Brown <mcb30@ipxe.org> wrote:
> I am confused by how BOUND can result in an access to a linear
> address outside of the address-size range. I don't know the internals
> well enough, but I'm guessing it might be in the line in
> helper_boundl():
>
>     high = cpu_ldl_data_ra(env, a0 + 4, GETPC());
>
> where an address is calculated as (a0 + 4) using a 64-bit
> target_ulong type with no truncation to 32 bits applied.
>
> If so, then ought the truncation to be applied on this line instead
> (and the equivalent in helper_boundw())? My understanding (which may
> well be incorrect) is that the linear address gets truncated to the
> instruction address size (16 or 32 bits) before any conversion to a
> physical address takes place.

The linear address is the one that has the segment base added, and it
is not truncated to 16 bits (otherwise the whole A20 thing would not
exist). The same should be true of e.g. an FSAVE instruction; it would
allow access slightly beyond the usual 1M+64K limit that is possible
in real mode with 286 and later processors.

In big real mode with 32-bit addresses, it should not be possible to
go beyond the 4G physical address by adding the segment base; it
should wrap around, and that's what I implemented. However, you're
probably right that this patch has a hole for accesses made from
32-bit code segments with paging enabled. I think LMA was the wrong
bit to test all the time, and I am not even sure if the masking must
be applied even before the call to mmu_translate(). I will ponder it a
bit and possibly send a revised version.

Paolo

> Regardless: this updated patch (in isolation) definitely fixes the
> issue that I observed, so I'm happy for an added
>
> Tested-by: Michael Brown <mcb30@ipxe.org>
>
> Thanks,
>
> Michael
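
A minimal sketch of the wraparound described above; the segment base
and offset values are made up for illustration:

    /* bigreal_wrap.c - in big real mode, the linear address (segment
     * base + 32-bit offset) wraps modulo 2^32 instead of crossing 4G. */
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t seg_base = 0xfffff000u;        /* big-real-mode base */
        uint32_t offset   = 0x00002000u;        /* 32-bit offset */
        uint32_t linear   = seg_base + offset;  /* wraps, stays below 4G */
        printf("linear = %#x\n", (unsigned)linear);  /* prints 0x1000 */
        return 0;
    }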
On 23/12/2023 11:47, Paolo Bonzini wrote:
> The linear address is the one that has the segment base added, and it
> is not truncated to 16 bits (otherwise the whole A20 thing would not
> exist). The same should be true of e.g. an FSAVE instruction; it
> would allow access slightly beyond the usual 1M+64K limit that is
> possible in real mode with 286 and later processors.
>
> In big real mode with 32-bit addresses, it should not be possible to
> go beyond the 4G physical address by adding the segment base; it
> should wrap around, and that's what I implemented. However, you're
> probably right that this patch has a hole for accesses made from
> 32-bit code segments with paging enabled. I think LMA was the wrong
> bit to test all the time, and I am not even sure if the masking must
> be applied even before the call to mmu_translate(). I will ponder it
> a bit and possibly send a revised version.

You are of course correct that the linear address is not truncated to
16 bits when the address size is 16 bits - my mistake.

I've been looking through the SDM for any definitive statement on the
topic. The closest I can find is in volume 3 table 4-1, which states
that the linear address width is:

- 32 bits with paging disabled
- 32 bits with 32-bit paging
- 32 bits with PAE paging
- 48 bits with 4-level paging
- 57 bits with 5-level paging

My previous experiment seems to show that the linear address *does*
also get truncated to 32 bits for an instruction with a 32-bit address
size, even when running in long mode with 4-level paging (on a Core
i7-6600U), so this table definitely isn't telling the complete story.

My best guess at this point is that the linear address gets truncated
to 32 bits when the address size is 32 bits (which will always be the
case when paging is disabled, or when using 32-bit paging or PAE
paging).

Thanks,

Michael
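
A small sketch of the truncation that experiment suggests; the operand
values are illustrative only, and this models just the modular
arithmetic, not real hardware:

    /* addrsize_wrap.c - with a 32-bit address size, the effective
     * address wraps modulo 2^32 even under 4-level paging. */
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t base = 0xffffffffULL;  /* 32-bit base register value */
        uint64_t disp = 0x10;           /* displacement */
        uint32_t ea   = (uint32_t)(base + disp);  /* 32-bit addr size */
        printf("effective address = %#x\n", (unsigned)ea);  /* 0xf */
        return 0;
    }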