We were checking one byte beyond the actual range that would be accessed.
Originally, valid_user_address would consider the user guard page to be
valid, so checks including the final accessible byte would still succeed.
However, after commit 86e6b1547b3d ("x86: fix user address masking
non-canonical speculation issue") this is no longer the case.
Update the logic to always consider the final address in the range.
Fixes: 86e6b1547b3d ("x86: fix user address masking non-canonical speculation issue")
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
---
arch/x86/include/asm/uaccess_64.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index b0a887209400..3e0eb72c036f 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -100,9 +100,11 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
 	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
 		return valid_user_address(ptr);
 	} else {
-		unsigned long sum = size + (__force unsigned long)ptr;
+		unsigned long end = (__force unsigned long)ptr;
 
-		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
+		if (size)
+			end += size - 1;
+		return valid_user_address(end) && end >= (__force unsigned long)ptr;
 	}
 }
 #define __access_ok __access_ok
--
2.47.0
Hi,

I hit an issue with using gdb (and eventually more) on a system with 9p as
rootfs which I eventually root-caused to this, so I'm just posting here for
reference / another testing datapoint, since I couldn't find any other
mentions of this error elsewhere and this is in the latest stable kernel
(6.12 / 6.12.1). Apologies in advance that I might not be offering much else
useful, but I can confirm that applying this patch fixes it.

I'm running a development VM where the rootfs is a 9p mount, and from 6.12 I
get this if I try to debug anything with gdb:

[    6.258525][   T88] netfs: Couldn't get user pages (rc=-14)
[    6.259414][   T88] netfs: Zero-sized read [R=1ff3]
/bin/sh: error while loading shared libraries: /lib/x86_64-linux-gnu/libc.so.6: cannot read file data: Input/output error
During startup program exited with code 127.

After some further testing I realized that basically *everything* was broken
(e.g. /bin/sh) if I disable ASLR (via /proc/sys/kernel/randomize_va_space),
with the same messages printed. The user-space is a Debian distribution.

Basically I think the user-space initialisation tries to call read with (for
example) buf=0x7fffffffdfc8 and count=832, so it spans the last two valid
user-space pages, and the access_ok in gup_fast_fallback eventually fails
(because somewhere above it rounds to whole pages).

I think this doesn't happen with a "normal" ext4 root (otherwise I would be
surprised if nobody else has reported it yet) - it might just have been
surfaced by recent v9fs changes.
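For reference, a minimal standalone sketch (not kernel code) of why this read
fails with the old check. It assumes 4-level x86-64, i.e. TASK_SIZE_MAX ==
0x7ffffffff000, and USER_PTR_MAX == TASK_SIZE_MAX - 1 as set up by commit
86e6b1547b3d; as the report notes, the range is rounded out to whole pages
before the check, so it ends exactly at the top of user space:

/*
 * Sketch only: simplified model of the old vs. fixed range check with
 * the page-rounded GUP range from the 9p report above.  The constant is
 * an assumption for 4-level paging, not the kernel's runtime value.
 */
#include <stdbool.h>
#include <stdio.h>

#define USER_PTR_MAX	0x00007fffffffefffUL	/* TASK_SIZE_MAX - 1 (assumed) */

static bool valid_user_address(unsigned long addr)
{
	return addr <= USER_PTR_MAX;
}

int main(void)
{
	/* buf=0x7fffffffdfc8, count=832, rounded out to whole pages: */
	unsigned long start = 0x7fffffffd000UL;
	unsigned long len   = 0x2000UL;		/* the last two user pages */

	unsigned long sum = start + len;	/* old check: one past the range */
	unsigned long end = start + len - 1;	/* fixed check: last byte accessed */

	printf("old check: %d\n", valid_user_address(sum));	/* 0 -> -EFAULT */
	printf("new check: %d\n", valid_user_address(end));	/* 1 -> allowed */
	return 0;
}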
From: Tingmao Wang <m@maowtm.org>
> Sent: 26 November 2024 01:09
>
> I hit an issue with using gdb (and eventually more) on a system with 9p
> as rootfs which I eventually root-caused to this, so I'm just posting
> here for reference / another testing datapoint, since I couldn't find
> any other mentions of this error elsewhere and this is in the latest
> stable kernel (6.12 / 6.12.1). Apologies in advance that I might not be
> offering much else useful, but I can confirm that applying this patch
> fixes it.
I believe Linus has applied a different patch that does:
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2389,12 +2389,12 @@ void __init arch_cpu_finalize_init(void)
alternative_instructions();
if (IS_ENABLED(CONFIG_X86_64)) {
- unsigned long USER_PTR_MAX = TASK_SIZE_MAX-1;
+ unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
It probably hasn't been back-ported yet.
David
From: Mikel Rychliski
> Sent: 09 November 2024 21:03
>
> We were checking one byte beyond the actual range that would be accessed.
> Originally, valid_user_address would consider the user guard page to be
> valid, so checks including the final accessible byte would still succeed.
Did it allow the entire page or just the first byte?
The test for ignoring small constant sizes rather assumes that accesses
to the guard page are errored (or transfers start with the first byte).
> However, after commit 86e6b1547b3d ("x86: fix user address masking
> non-canonical speculation issue") this is no longer the case.
>
> Update the logic to always consider the final address in the range.
>
> Fixes: 86e6b1547b3d ("x86: fix user address masking non-canonical speculation issue")
> Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
> ---
> arch/x86/include/asm/uaccess_64.h | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index b0a887209400..3e0eb72c036f 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -100,9 +100,11 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
> if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
> return valid_user_address(ptr);
> } else {
> - unsigned long sum = size + (__force unsigned long)ptr;
> + unsigned long end = (__force unsigned long)ptr;
>
> - return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
> + if (size)
> + end += size - 1;
> + return valid_user_address(end) && end >= (__force unsigned long)ptr;
Why not:
	if (statically_true(size <= PAGE_SIZE) || !size)
		return valid_user_address(ptr);
	end = ptr + size - 1;
	return ptr <= end && valid_user_address(end);
Although it is questionable whether a zero size should be allowed.
Also, if you assume that the actual copies are 'reasonably sequential',
it is valid to just ignore the length completely.
It also ought to be possible to get the 'size == 0' check out of the common path.
Maybe something like:
	if (statically_true(size <= PAGE_SIZE))
		return valid_user_address(ptr);
	end = ptr + size - 1;
	return (ptr <= end || (end++, !size)) && valid_user_address(end);
You might want a likely() around the <=, but I suspect it makes little
difference on modern x86 (esp. Intel ones).
David
Hi David,
Thanks for the review:
On Sunday, November 10, 2024 2:36:49 P.M. EST David Laight wrote:
> From: Mikel Rychliski
>
> > Sent: 09 November 2024 21:03
> >
> > We were checking one byte beyond the actual range that would be accessed.
> > Originally, valid_user_address would consider the user guard page to be
> > valid, so checks including the final accessible byte would still succeed.
>
> Did it allow the entire page or just the first byte?
> The test for ignoring small constant sizes rather assumes that accesses
> to the guard page are errored (or transfers start with the first byte).
>
valid_user_address() allowed the whole guard page. __access_ok() was
inconsistent about ranges including the guard page (and, as you mention, would
continue to be with this change).
The problem is that before 86e6b1547b3d, the off-by-one calculation just led to
another harmless inconsistency in checks including the guard page. Now it
prohibits reads of the last mapped userspace byte.
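As a standalone illustration of that inconsistency (a sketch with simplified
constants, not the kernel implementation): a small constant-size access that
starts at the last mapped byte still passes the fast path and relies on the
guard page to fault, while the variable-size path with the fixed end
calculation rejects the same range up front. Addresses assume 4-level x86-64.

/* Sketch only: the two __access_ok() branches, modelled with plain integers. */
#include <stdbool.h>
#include <stdio.h>

#define USER_PTR_MAX	0x00007fffffffefffUL	/* TASK_SIZE_MAX - 1 (assumed) */

static bool valid_user_address(unsigned long addr)
{
	return addr <= USER_PTR_MAX;
}

int main(void)
{
	unsigned long last = 0x7fffffffefffUL;	/* last mapped user byte */

	/* Small constant-size fast path: only the base is checked, so a
	 * 2-byte access starting at 'last' is accepted and relies on the
	 * guard page faulting the second byte. */
	printf("fast path: %d\n", valid_user_address(last));		/* 1 */

	/* Variable-size path with the fixed end calculation: the last byte
	 * of the same 2-byte range lies in the guard page, so it is
	 * rejected up front. */
	printf("slow path: %d\n", valid_user_address(last + 2 - 1));	/* 0 */
	return 0;
}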
> > diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> > index b0a887209400..3e0eb72c036f 100644
> > --- a/arch/x86/include/asm/uaccess_64.h
> > +++ b/arch/x86/include/asm/uaccess_64.h
> > @@ -100,9 +100,11 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
> >  	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
> >  		return valid_user_address(ptr);
> >  	} else {
> > -		unsigned long sum = size + (__force unsigned long)ptr;
> > +		unsigned long end = (__force unsigned long)ptr;
> >
> > -		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
> > +		if (size)
> > +			end += size - 1;
> > +		return valid_user_address(end) && end >= (__force unsigned long)ptr;
>
> Why not:
> 	if (statically_true(size <= PAGE_SIZE) || !size)
> 		return valid_user_address(ptr);
> 	end = ptr + size - 1;
> 	return ptr <= end && valid_user_address(end);
Sure, agree this works as well.
> Although it is questionable whether a zero size should be allowed.
> Also, if you assume that the actual copies are 'reasonably sequential',
> it is valid to just ignore the length completely.
>
> It also ought to be possible to get the 'size == 0' check out of the common
> path. Maybe something like:
> 	if (statically_true(size <= PAGE_SIZE))
> 		return valid_user_address(ptr);
> 	end = ptr + size - 1;
> 	return (ptr <= end || (end++, !size)) && valid_user_address(end);
The first issue I ran into with size == 0 is that __import_iovec() is
checking access for vectors with iov_len == 0 (and the check needs to succeed,
otherwise userspace will get a -EFAULT). Not sure if there are others.

Similarly, the iovec case depends on access_ok(0, 0) succeeding. So with
the example here, end underflows and gets rejected.
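To make that constraint concrete, a small standalone model (simplified names
and constants, not the kernel code) of why the check has to accept size == 0:
a readv() iovec may legitimately contain zero-length entries, including
{NULL, 0}, and computing end = base + len - 1 without the size check would
underflow and spuriously reject them:

/*
 * Sketch only: why access_ok() must accept size == 0.  The iovec import
 * path checks each segment; check_segment() below mirrors the fixed
 * __access_ok() arithmetic with plain integers.
 */
#include <stdbool.h>
#include <stddef.h>

#define USER_PTR_MAX	0x00007fffffffefffUL	/* assumed 4-level x86-64 value */

static bool valid_user_address(unsigned long addr)
{
	return addr <= USER_PTR_MAX;
}

struct iovec_model { unsigned long base; size_t len; };

static bool check_segment(struct iovec_model v)
{
	unsigned long end = v.base;

	/* Without this size check, {base = 0, len = 0} would compute
	 * end = base - 1, underflow, and be rejected. */
	if (v.len)
		end += v.len - 1;
	return valid_user_address(end) && end >= v.base;
}

int main(void)
{
	struct iovec_model segs[] = {
		{ 0x7fffffffd000UL, 0x1000 },	/* normal segment */
		{ 0, 0 },			/* zero-length entry: must pass */
	};

	for (size_t i = 0; i < sizeof(segs) / sizeof(segs[0]); i++)
		if (!check_segment(segs[i]))
			return 1;		/* would surface as -EFAULT */
	return 0;
}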
From: Mikel Rychliski
> Sent: 11 November 2024 18:33
>
> Hi David,
>
> Thanks for the review:
>
> On Sunday, November 10, 2024 2:36:49 P.M. EST David Laight wrote:
> > From: Mikel Rychliski
> > > Sent: 09 November 2024 21:03
> > >
> > > We were checking one byte beyond the actual range that would be accessed.
> > > Originally, valid_user_address would consider the user guard page to be
> > > valid, so checks including the final accessible byte would still succeed.
> >
> > Did it allow the entire page or just the first byte?
> > The test for ignoring small constant sizes rather assumes that accesses
> > to the guard page are errored (or transfers start with the first byte).
>
> valid_user_address() allowed the whole guard page. __access_ok() was
> inconsistent about ranges including the guard page (and, as you mention, would
> continue to be with this change).
>
> The problem is that before 86e6b1547b3d, the off-by-one calculation just led to
> another harmless inconsistency in checks including the guard page. Now it
> prohibits reads of the last mapped userspace byte.

So if you could find code that didn't read the first byte of a short buffer
first you could access the first page of kernel memory.
(Ignoring the STAC/CLAC instructions.)
So that has always been wrong!

OTOH I suspect that all user accesses start with the first byte and are
either 'reasonably sequential' or recheck an updated pointer.
So an architecture with a guard page (not all do) need only check the
base address of a user buffer for being below/equal to the guard page.

...

> > Why not:
> > 	if (statically_true(size <= PAGE_SIZE) || !size)
> > 		return valid_user_address(ptr);
> > 	end = ptr + size - 1;
> > 	return ptr <= end && valid_user_address(end);
>
> Sure, agree this works as well.

But it is likely to replicate the valid_user_address() code.

> > Although it is questionable whether a zero size should be allowed.
> > Also, if you assume that the actual copies are 'reasonably sequential',
> > it is valid to just ignore the length completely.
> >
> > It also ought to be possible to get the 'size == 0' check out of the common
> > path. Maybe something like:
> > 	if (statically_true(size <= PAGE_SIZE))
> > 		return valid_user_address(ptr);
> > 	end = ptr + size - 1;
> > 	return (ptr <= end || (end++, !size)) && valid_user_address(end);
>
> The first issue I ran into with size == 0 is that __import_iovec() is
> checking access for vectors with iov_len == 0 (and the check needs to succeed,
> otherwise userspace will get a -EFAULT). Not sure if there are others.

I've looked at __import_iovec() in the past.
The API is horrid! and the 32bit compat version is actually faster.
It doesn't need to call access_ok() either; the check is done later.

> Similarly, the iovec case depends on access_ok(0, 0) succeeding. So with
> the example here, end underflows and gets rejected.

I've even wondered what the actual issue is with speculative kernel reads
from get_user().
The read itself can't be an issue (a valid user address will also displace
any cache lines), so I think the value read must be used to form an address
in order for any kernel data to be leaked.
You might find a compare (eg the length in import_iovec()) but that can only
expose high bits of a byte - and probably requires i-cache timing.
But I'm no expert - and the experts hide the fine details.

David
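A minimal sketch of the base-address-only check described above; it is only
valid under the stated assumptions that every copy touches the first byte and
proceeds roughly sequentially, so any overrun walks into the guard page and
faults there:

/*
 * Sketch only, under the assumptions above: the length is ignored and
 * the guard page is relied on to fault any overrun.  Plain integer
 * types and an assumed USER_PTR_MAX keep the snippet standalone.
 */
#include <stdbool.h>

#define USER_PTR_MAX	0x00007fffffffefffUL	/* assumed: last valid user byte */

static bool access_ok_base_only(unsigned long ptr, unsigned long size)
{
	(void)size;			/* length deliberately ignored */
	return ptr <= USER_PTR_MAX;	/* base must be a valid user address */
}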
From: David Laight
> Sent: 10 November 2024 19:37
>
> From: Mikel Rychliski
> > Sent: 09 November 2024 21:03
> >
> > We were checking one byte beyond the actual range that would be accessed.
> > Originally, valid_user_address would consider the user guard page to be
> > valid, so checks including the final accessible byte would still succeed.
>
> Did it allow the entire page or just the first byte?
> The test for ignoring small constant sizes rather assumes that accesses
> to the guard page are errored (or transfers start with the first byte).
>
> > However, after commit 86e6b1547b3d ("x86: fix user address masking
> > non-canonical speculation issue") this is no longer the case.
> >
> > Update the logic to always consider the final address in the range.
> >
> > Fixes: 86e6b1547b3d ("x86: fix user address masking non-canonical speculation issue")
> > Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
> > ---
> > arch/x86/include/asm/uaccess_64.h | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> > index b0a887209400..3e0eb72c036f 100644
> > --- a/arch/x86/include/asm/uaccess_64.h
> > +++ b/arch/x86/include/asm/uaccess_64.h
> > @@ -100,9 +100,11 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
> > if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
> > return valid_user_address(ptr);
> > } else {
> > - unsigned long sum = size + (__force unsigned long)ptr;
> > + unsigned long end = (__force unsigned long)ptr;
> >
> > - return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
> > + if (size)
> > + end += size - 1;
> > + return valid_user_address(end) && end >= (__force unsigned long)ptr;
>
> Why not:
> 	if (statically_true(size <= PAGE_SIZE) || !size)
> 		return valid_user_address(ptr);
> 	end = ptr + size - 1;
> 	return ptr <= end && valid_user_address(end);
Thinking about it more, that version probably bloats the code with two
copies of valid_user_address().
> Although it is questionable whether a zero size should be allowed.
> Also, if you assume that the actual copies are 'reasonably sequential',
> it is valid to just ignore the length completely.
>
> It also ought to be possible to get the 'size == 0' check out of the common path.
> Maybe something like:
> 	if (statically_true(size <= PAGE_SIZE))
> 		return valid_user_address(ptr);
> 	end = ptr + size - 1;
> 	return (ptr <= end || (end++, !size)) && valid_user_address(end);
> You might want a likely() around the <=, but I suspect it makes little
> difference on modern x86 (esp. Intel ones).
I can't actually remember Linus's final version and don't have it to hand.
But I'm sure I remember something about a 64bit address constant.
(Probably 'cmpi, sbb, or' and then test the sign or carry flag.)
It might just be enough to increase that by one so that buffers that
end at the boundary are ok.
David
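For reference, a rough C model of the 'cmp, sbb, or' idea being recalled here
(a simplification, not the actual masking asm from 86e6b1547b3d): compare the
pointer against USER_PTR_MAX, turn the borrow into an all-ones mask, and OR it
in so an out-of-range pointer becomes non-canonical and faults even under
speculation. On that reading, raising the constant from TASK_SIZE_MAX - 1 to
TASK_SIZE_MAX (the change quoted earlier in the thread) is exactly the
"increase that by one" that lets a range ending on the boundary pass.

/*
 * Sketch only: branch-free model of the compare/subtract-with-borrow
 * masking.  The real code uses inline asm and a runtime constant; here
 * USER_PTR_MAX is just an assumed 4-level x86-64 value.
 */
#define USER_PTR_MAX	0x00007fffffffefffUL

static unsigned long mask_user_ptr(unsigned long ptr)
{
	/* cmp + sbb: all-ones if ptr is above the limit, zero otherwise */
	unsigned long mask = (ptr > USER_PTR_MAX) ? ~0UL : 0UL;

	return ptr | mask;	/* or: out-of-range pointers become non-canonical */
}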