[PATCH] x86: Fix off-by-one error in __access_ok

Posted by Mikel Rychliski 1 year, 3 months ago
We were checking one byte beyond the actual range that would be accessed.
Originally, valid_user_address would consider the user guard page to be
valid, so checks including the final accessible byte would still succeed.
However, after commit 86e6b1547b3d ("x86: fix user address masking
non-canonical speculation issue") this is no longer the case.

Update the logic to always consider the final address in the range.
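
For illustration (hypothetical addresses; assuming 4-level paging where
TASK_SIZE_MAX == 0x7ffffffff000), take a non-constant-size read that ends at
the last mapped user byte, so the range-check path is used:

	ptr = 0x7fffffffefc0, size = 64
	old: sum = ptr + size     = 0x7ffffffff000  -> now rejected
	new: end = ptr + size - 1 = 0x7fffffffefff  -> accepted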

Fixes: 86e6b1547b3d ("x86: fix user address masking non-canonical speculation issue")
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
---
 arch/x86/include/asm/uaccess_64.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index b0a887209400..3e0eb72c036f 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -100,9 +100,11 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
 	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
 		return valid_user_address(ptr);
 	} else {
-		unsigned long sum = size + (__force unsigned long)ptr;
+		unsigned long end = (__force unsigned long)ptr;
 
-		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
+		if (size)
+			end += size - 1;
+		return valid_user_address(end) && end >= (__force unsigned long)ptr;
 	}
 }
 #define __access_ok __access_ok
-- 
2.47.0
Re: [PATCH] x86: Fix off-by-one error in __access_ok
Posted by Tingmao Wang 1 year, 2 months ago
Hi,

I hit an issue with using gdb (and eventually more) on a system with 9p 
as rootfs, which I eventually root-caused to this, so I'm just posting 
here for reference / another testing datapoint, since I couldn't find 
any other mentions of this error elsewhere and this is in the latest 
stable kernel (6.12 / 6.12.1). Apologies in advance that I might not be 
offering much else useful, but I can confirm that applying this patch 
fixes it.

I'm running a development VM where the rootfs is a 9p mount, and from 
6.12 I get this if I try to debug anything with gdb:

[    6.258525][   T88] netfs: Couldn't get user pages (rc=-14)
[    6.259414][   T88] netfs: Zero-sized read [R=1ff3]
/bin/sh: error while loading shared libraries: 
/lib/x86_64-linux-gnu/libc.so.6: cannot read file data: Input/output error
During startup program exited with code 127.

After some further testing I realized that basically *everything* was 
broken (e.g. /bin/sh) if I disable ASLR (via 
/proc/sys/kernel/randomize_va_space), with the same messages printed. 
The user-space is a Debian distribution.

Basically I think the user-space initialisation tries to call read with 
(for example) buf=0x7fffffffdfc8 and count=832, so it spans the last two 
valid user-space pages, and the access_ok in gup_fast_fallback 
eventually fails (because somewhere above it rounds to whole pages).
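
For what it's worth, my rough understanding of the arithmetic (assuming
4-level paging, so TASK_SIZE_MAX == 0x7ffffffff000):

	start = 0x7fffffffd000  (buf rounded down to a page boundary)
	len   = 2 pages = 0x2000
	old check: start + len     = 0x7ffffffff000  -> rejected, hence the -EFAULT (-14)
	patched:   start + len - 1 = 0x7fffffffefff  -> accepted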

I think this doesn't happen with a "normal" ext4 root (otherwise I would 
be surprised that nobody else has reported it yet) - it might just have 
been surfaced by recent v9fs changes.
RE: [PATCH] x86: Fix off-by-one error in __access_ok
Posted by David Laight 1 year, 2 months ago
From: Tingmao Wang <m@maowtm.org>
> Sent: 26 November 2024 01:09
> 
> I hit an issue with using gdb (and eventually more) on a system with 9p
> as rootfs which I eventually root-caused to this, so I'm just posting
> here for reference / another testing datapoint, since I couldn't find
> any other mentions of this error elsewhere and this is in the latest
> stable kernel (6.12 / 6.12.1). Apologies in advance that I might not be
> offering much else useful, but I can confirm that applying this patch
> fixes it.

I believe Linus has applied a different patch that does:
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2389,12 +2389,12 @@ void __init arch_cpu_finalize_init(void)
 	alternative_instructions();
 
 	if (IS_ENABLED(CONFIG_X86_64)) {
-		unsigned long USER_PTR_MAX = TASK_SIZE_MAX-1;
+		unsigned long USER_PTR_MAX = TASK_SIZE_MAX;

Probably not been back-ported yet.
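
As a simplified user-space model (not the kernel code), valid_user_address()
effectively reduces to an unsigned compare against that runtime constant, so
raising it by one lets a range ending exactly at TASK_SIZE_MAX pass again:

	#include <stdbool.h>

	#define TASK_SIZE_MAX	0x7ffffffff000ul	/* x86-64, 4-level paging */

	static unsigned long user_ptr_max = TASK_SIZE_MAX - 1;	/* pre-fix value */

	static bool valid_user_address(unsigned long x)
	{
		return x <= user_ptr_max;
	}

	static bool old_access_ok(unsigned long ptr, unsigned long size)
	{
		unsigned long sum = ptr + size;	/* one past the last byte */

		return valid_user_address(sum) && sum >= ptr;
	}

	/*
	 * old_access_ok(0x7fffffffefc0, 64) fails while user_ptr_max is
	 * TASK_SIZE_MAX - 1 (sum == TASK_SIZE_MAX), but passes once
	 * USER_PTR_MAX is raised to TASK_SIZE_MAX.
	 */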

	David

RE: [PATCH] x86: Fix off-by-one error in __access_ok
Posted by David Laight 1 year, 2 months ago
From: Mikel Rychliski
> Sent: 09 November 2024 21:03
> 
> We were checking one byte beyond the actual range that would be accessed.
> Originally, valid_user_address would consider the user guard page to be
> valid, so checks including the final accessible byte would still succeed.

Did it allow the entire page or just the first byte?
The test for ignoring small constant sizes rather assumes that accesses
to the guard page are errored (or transfers start with the first byte).

> However, after commit 86e6b1547b3d ("x86: fix user address masking
> non-canonical speculation issue") this is no longer the case.
> 
> Update the logic to always consider the final address in the range.
> 
> Fixes: 86e6b1547b3d ("x86: fix user address masking non-canonical speculation issue")
> Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
> ---
>  arch/x86/include/asm/uaccess_64.h | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index b0a887209400..3e0eb72c036f 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -100,9 +100,11 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
>  	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
>  		return valid_user_address(ptr);
>  	} else {
> -		unsigned long sum = size + (__force unsigned long)ptr;
> +		unsigned long end = (__force unsigned long)ptr;
> 
> -		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
> +		if (size)
> +			end += size - 1;
> +		return valid_user_address(end) && end >= (__force unsigned long)ptr;

Why not:
	if (statically_true(size <= PAGE_SIZE) || !size)
		return valid_user_address(ptr);
	end = ptr + size - 1;
	return ptr <= end && valid_user_address(end);

Although it is questionable whether a zero size should be allowed.
Also, if you assume that the actual copies are 'reasonably sequential',
it is valid to just ignore the length completely.

It also ought to be possible to get the 'size == 0' check out of the common path.
Maybe something like:
	if (statically_true(size <= PAGE_SIZE))
		return valid_user_address(ptr);
	end = ptr + size - 1;
	return (ptr <= end || (end++, !size)) && valid_user_address(end);
You might want a likely() around the <=, but I suspect it makes little
difference on modern x86 (esp. Intel ones).

	David

Re: [PATCH] x86: Fix off-by-one error in __access_ok
Posted by Mikel Rychliski 1 year, 2 months ago
Hi David,

Thanks for the review:

On Sunday, November 10, 2024 2:36:49 P.M. EST David Laight wrote:
> From: Mikel Rychliski
> 
> > Sent: 09 November 2024 21:03
> > 
> > We were checking one byte beyond the actual range that would be accessed.
> > Originally, valid_user_address would consider the user guard page to be
> > valid, so checks including the final accessible byte would still succeed.
> 
> Did it allow the entire page or just the first byte?
> The test for ignoring small constant sizes rather assumes that accesses
> to the guard page are errored (or transfers start with the first byte).
> 

valid_user_address() allowed the whole guard page. __access_ok() was 
inconsistent about ranges including the guard page (and, as you mention, would 
continue to be with this change).

The problem is that before 86e6b1547b3d, the off-by-one calculation just led to 
another harmless inconsistency in checks including the guard page. Now it 
prohibits reads of the last mapped userspace byte.

> > diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> > index b0a887209400..3e0eb72c036f 100644
> > --- a/arch/x86/include/asm/uaccess_64.h
> > +++ b/arch/x86/include/asm/uaccess_64.h
> > @@ -100,9 +100,11 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
> >  	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
> >  		return valid_user_address(ptr);
> >  	} else {
> > -		unsigned long sum = size + (__force unsigned long)ptr;
> > +		unsigned long end = (__force unsigned long)ptr;
> > 
> > -		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
> > +		if (size)
> > +			end += size - 1;
> > +		return valid_user_address(end) && end >= (__force unsigned long)ptr;
> 
> Why not:
> 	if (statically_true(size <= PAGE_SIZE) || !size)
> 		return valid_user_address(ptr);
> 	end = ptr + size - 1;
> 	return ptr <= end && valid_user_address(end);

Sure, agree this works as well.

> Although it is questionable whether a zero size should be allowed.
> Also, if you assume that the actual copies are 'reasonably sequential',
> it is valid to just ignore the length completely.
> 
> It also ought to be possible to get the 'size == 0' check out of the common
> path. Maybe something like:
> 	if (statically_true(size <= PAGE_SIZE))
> 		return valid_user_address(ptr);
> 	end = ptr + size - 1;
> 	return (ptr <= end || (end++, !size)) && valid_user_address(end);

The first issue I ran into with size == 0 is that __import_iovec() checks 
access for vectors with iov_len == 0 (and the check needs to succeed, 
otherwise userspace will get -EFAULT). Not sure if there are others.

Similarly, the iovec case depends on access_ok(0, 0) succeeding. So with 
the example here, end underflows and gets rejected.
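
E.g. tracing the second suggestion for that case:

	ptr = 0, size = 0:
		end = ptr + size - 1 = ULONG_MAX
		"ptr <= end" is true, so the (end++, !size) branch never runs
		valid_user_address(ULONG_MAX) fails -> the zero-length iovec gets -EFAULT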
RE: [PATCH] x86: Fix off-by-one error in __access_ok
Posted by David Laight 1 year, 2 months ago
From: Mikel Rychliski
> Sent: 11 November 2024 18:33
> 
> Hi David,
> 
> Thanks for the review:
> 
> On Sunday, November 10, 2024 2:36:49 P.M. EST David Laight wrote:
> > From: Mikel Rychliski
> >
> > > Sent: 09 November 2024 21:03
> > >
> > > We were checking one byte beyond the actual range that would be accessed.
> > > Originally, valid_user_address would consider the user guard page to be
> > > valid, so checks including the final accessible byte would still succeed.
> >
> > Did it allow the entire page or just the first byte?
> > The test for ignoring small constant sizes rather assumes that accesses
> > to the guard page are errored (or transfers start with the first byte).
> >
> 
> valid_user_address() allowed the whole guard page. __access_ok() was
> inconsistent about ranges including the guard page (and, as you mention, would
> continue to be with this change).
> 
> The problem is that before 86e6b1547b3d, the off-by-one calculation just led to
> another harmless inconsistency in checks including the guard page. Now it
> prohibits reads of the last mapped userspace byte.

So if you could find code that didn't read the first byte of a short buffer
first, you could access the first page of kernel memory.
(Ignoring the STAC/CLAC instructions.)
So that has always been wrong!

OTOH I suspect that all user accesses start with the first byte
and are either 'reasonably sequential' or recheck an updated pointer.
So an architecture with a guard page (not all do) need only check
the base address of a user buffer for being below/equal to the
guard page.

...
> > Why not:
> > 	if (statically_true(size <= PAGE_SIZE) || !size)
> > 		return valid_user_address(ptr);
> > 	end = ptr + size - 1;
> > 	return ptr <= end && valid_user_address(end);
> 
> Sure, agree this works as well.

But it is likely to replicate the valid_user_address() code.

> > Although it is questionable whether a zero size should be allowed.
> > Also, if you assume that the actual copies are 'reasonably sequential',
> > it is valid to just ignore the length completely.
> >
> > It also ought to be possible to get the 'size == 0' check out of the common
> > path. Maybe something like:
> > 	if (statically_true(size <= PAGE_SIZE))
> > 		return valid_user_address(ptr);
> > 	end = ptr + size - 1;
> > 	return (ptr <= end || (end++, !size)) && valid_user_address(end);
> 
> The first issue I ran into with the size==0 is that __import_iovec() is
> checking access for vectors with io_len==0 (and the check needs to succeed,
> otherwise userspace will get a -EFAULT). Not sure if there are others.

I've looked at __import_iovec() in the past.
The API is horrid! And the 32bit compat version is actually faster.
It doesn't need to call access_ok() either; the check is done later.

> Similarly, the iovec case is depending on access_ok(0, 0) succeeding. So with
> the example here, end underflows and gets rejected.

I've even wondered what the actual issue is with speculative kernel
reads from get_user().
The read itself can't be an issue (a valid user address will also displace
any cache lines), so I think the value read must be used to form an
address in order for any kernel data to be leaked.
You might find a compare (e.g. the length in import_iovec()), but that can
only expose high bits of a byte - and probably requires i-cache timing.
But I'm no expert - and the experts hide the fine details.

	David

RE: [PATCH] x86: Fix off-by-one error in __access_ok
Posted by David Laight 1 year, 2 months ago
From: David Laight
> Sent: 10 November 2024 19:37
> 
> From: Mikel Rychliski
> > Sent: 09 November 2024 21:03
> >
> > We were checking one byte beyond the actual range that would be accessed.
> > Originally, valid_user_address would consider the user guard page to be
> > valid, so checks including the final accessible byte would still succeed.
> 
> Did it allow the entire page or just the first byte?
> The test for ignoring small constant sizes rather assumes that accesses
> to the guard page are errored (or transfers start with the first byte).
> 
> > However, after commit 86e6b1547b3d ("x86: fix user address masking
> > non-canonical speculation issue") this is no longer the case.
> >
> > Update the logic to always consider the final address in the range.
> >
> > Fixes: 86e6b1547b3d ("x86: fix user address masking non-canonical speculation issue")
> > Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
> > ---
> >  arch/x86/include/asm/uaccess_64.h | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> > index b0a887209400..3e0eb72c036f 100644
> > --- a/arch/x86/include/asm/uaccess_64.h
> > +++ b/arch/x86/include/asm/uaccess_64.h
> > @@ -100,9 +100,11 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
> >  	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
> >  		return valid_user_address(ptr);
> >  	} else {
> > -		unsigned long sum = size + (__force unsigned long)ptr;
> > +		unsigned long end = (__force unsigned long)ptr;
> >
> > -		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
> > +		if (size)
> > +			end += size - 1;
> > +		return valid_user_address(end) && end >= (__force unsigned long)ptr;
> 
> Why not:
> 	if (statically_true(size <= PAGE_SIZE) || !size)
> 		return valid_user_address(ptr);
> 	end = ptr + size - 1;
> 	return ptr <= end && valid_user_address(end);

Thinking about it more, that version probably bloats the code with two
copies of valid_user_address().

> Although it is questionable whether a zero size should be allowed.
> Also, if you assume that the actual copies are 'reasonably sequential',
> it is valid to just ignore the length completely.
> 
> It also ought to be possible to get the 'size == 0' check out of the common path.
> Maybe something like:
> 	if (statically_true(size <= PAGE_SIZE))
> 		return valid_user_address(ptr);
> 	end = ptr + size - 1;
> 	return (ptr <= end || (end++, !size)) && valid_user_address(end);
> You might want a likely() around the <=, but I suspect it makes little
> difference on modern x86 (esp. Intel ones).

I can't actually remember Linus's final version and don't have it to hand.
But I'm sure I remember something about a 64bit address constant.
(Probably 'cmp, sbb, or' and then test the sign or carry flag.)
It might just be enough to increase that by one so that buffers that
end at the boundary are ok.
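
Roughly the idea (a user-space sketch of the masking trick, not the actual
kernel asm) is to fold the compare result into an all-ones mask so that an
out-of-range pointer turns into an address that simply faults:

	/* sketch only: "cmp; sbb" yields ~0ul when ptr > limit, else 0 */
	static unsigned long mask_user_ptr(unsigned long ptr, unsigned long limit)
	{
		unsigned long mask = ptr > limit ? ~0ul : 0;

		return ptr | mask;	/* out-of-range pointers become ~0ul */
	}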

	David
