[PATCH 0/2] linux-user: Fix incorrect sign-extension of guest addresses

Peter Maydell posted 2 patches 2 days, 10 hours ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20260330143123.1685142-1-peter.maydell@linaro.org
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, Warner Losh <imp@bsdimp.com>, Kyle Evans <kevans@freebsd.org>, Laurent Vivier <laurent@vivier.eu>, Pierrick Bouvier <pierrick.bouvier@linaro.org>
accel/tcg/user-exec.c        | 26 +++++++--------
bsd-user/qemu.h              |  1 +
include/accel/tcg/cpu-ldst.h |  4 ---
include/user/guest-host.h    | 62 ++++++++++++++++++++++++++++++++----
linux-user/qemu.h            |  1 +
5 files changed, 71 insertions(+), 23 deletions(-)
[PATCH 0/2] linux-user: Fix incorrect sign-extension of guest addresses
Posted by Peter Maydell 2 days, 10 hours ago
In commit 7804c84a ("include/user: Use vaddr in guest-host.h") we
changed all the functions in guest-host.h that took or returned their
guest address argument in type abi_ptr to instead use vaddr.

This introduced regressions for the case of a 32-bit guest and an
address above 2GB for the common situation where the address is a
syscall argument stored in a variable of type 'abi_long'.  With
abi_ptr (which will be an unsigned 32-bit type for 32-bit guests),
the address is cast to unsigned 32-bit, and then zero-extended to
64-bits in g2h_untagged_vaddr().  With the switch to vaddr (which is
always a 64-bit unsigned type), the guest address will instead be
sign-extended to 64 bits, which gives the wrong answer.

This is https://gitlab.com/qemu-project/qemu/-/work_items/3333

There are multiple ways we could fix this, and I wasn't really
very enthusiastic about any of the ways I thought of:

 * we could require all the code that has guest addresses in
   abi_long variables to carefully make sure it casts them
   to abi_ptr before calling any of the relevant functions.
   That would be making changes to quite a lot of code, and
   it feels somewhat bug-prone because the "obvious" way to
   write code (..."g2h(arg1)"...) is the wrong way.

 * we could try to abstract away direct calls to these functions
   (as a comment in the header says, we likely need to do that
   at some point if we ever want to have usermode softmmu), as
   we already do somewhat with the lock_user() and unlock_user()
   primitives. But that would be a huge upheaval not suitable
   for fixing a regression.

 * we could make the "normal" versions of these functions the
   same as they've traditionally been in QEMU, and provide
   variants specifically using vaddr for the small amount of
   code that needs them and is being compiled only once.
   This is at least minimally invasive and fairly obviously
   reviewable for correctness, but having two versions of each
   function is a bit ugly.

In this patchset I've opted for the last of these possibilities.
If anybody has a better idea I didn't think of, feel free to
suggest it :-)

Patch 1 in this series isn't a requirement for patch 2;
but cutting down the number of places where we include the
header helped make me more confident that I wasn't missing
any uses.

thanks
-- PMM

Peter Maydell (2):
  include: Don't include guest-host.h in cpu-ldst.h
  include/user/guest-host.h: Provide g2h etc for both abi_ptr and vaddr

 accel/tcg/user-exec.c        | 26 +++++++--------
 bsd-user/qemu.h              |  1 +
 include/accel/tcg/cpu-ldst.h |  4 ---
 include/user/guest-host.h    | 62 ++++++++++++++++++++++++++++++++----
 linux-user/qemu.h            |  1 +
 5 files changed, 71 insertions(+), 23 deletions(-)

-- 
2.43.0
Re: [PATCH 0/2] linux-user: Fix incorrect sign-extension of guest addresses
Posted by Peter Maydell 1 day, 8 hours ago
On Mon, 30 Mar 2026 at 15:31, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> In commit 7804c84a ("include/user: Use vaddr in guest-host.h") we
> changed all the functions in guest-host.h that took or returned their
> guest address argument in type abi_ptr to instead use vaddr.
>
> This introduced regressions for the case of a 32-bit guest and an
> address above 2GB for the common situation where the address is a
> syscall argument stored in a variable of type 'abi_long'.  With
> abi_ptr (which will be an unsigned 32-bit type for 32-bit guests),
> the address is cast to unsigned 32-bit, and then zero-extended to
> 64-bits in g2h_untagged_vaddr().  With the switch to vaddr (which is
> always a 64-bit unsigned type), the guest address will instead be
> sign-extended to 64 bits, which gives the wrong answer.

I'm going to pick these up for a pullreq I'm putting together
for this and some other recent linux-user bug fixes, because
I think it's better to have the fix in rc2 than delayed into rc3.
I know this has only been on the list a day or so, but I think
tweaking it in rc3 if needed is better than definitely slipping it
into rc3.

thanks
-- PMM