[PATCH v5 19/34] random: vDSO: only access vDSO datapage after random_init()

Posted by Thomas Weißschuh 1 month, 1 week ago
Upcoming changes to the generic vDSO library will mean that the vDSO
datapage will not yet be usable during early boot.

Introduce a static key which prevents early accesses.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Tested-by: Andreas Larsson <andreas@gaisler.com>
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
---
 drivers/char/random.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 73c53a4fb949bfd2ed723fa3cec3fe0d066a7fa3..f39524fb076a0c77bab228d4f2d45fee37291eb0 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -88,6 +88,7 @@ static DEFINE_STATIC_KEY_FALSE(crng_is_ready);
 static DECLARE_WAIT_QUEUE_HEAD(crng_init_wait);
 static struct fasync_struct *fasync;
 static ATOMIC_NOTIFIER_HEAD(random_ready_notifier);
+static DEFINE_STATIC_KEY_FALSE(random_vdso_is_ready);
 
 /* Control how we warn userspace. */
 static struct ratelimit_state urandom_warning =
@@ -252,6 +253,9 @@ static void random_vdso_update_generation(unsigned long next_gen)
 	if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
 		return;
 
+	if (!static_branch_likely(&random_vdso_is_ready))
+		return;
+
 	/* base_crng.generation's invalid value is ULONG_MAX, while
 	 * vdso_k_rng_data->generation's invalid value is 0, so add one to the
 	 * former to arrive at the latter. Use smp_store_release so that this
@@ -274,6 +278,9 @@ static void random_vdso_set_ready(void)
 	if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
 		return;
 
+	if (!static_branch_likely(&random_vdso_is_ready))
+		return;
+
 	WRITE_ONCE(vdso_k_rng_data->is_ready, true);
 }
 
@@ -925,6 +932,9 @@ void __init random_init(void)
 	_mix_pool_bytes(&entropy, sizeof(entropy));
 	add_latent_entropy();
 
+	if (IS_ENABLED(CONFIG_VDSO_GETRANDOM))
+		static_branch_enable(&random_vdso_is_ready);
+
 	/*
 	 * If we were initialized by the cpu or bootloader before jump labels
 	 * or workqueues are initialized, then we should enable the static
@@ -934,8 +944,10 @@ void __init random_init(void)
 		crng_set_ready(NULL);
 
 	/* Reseed if already seeded by earlier phases. */
-	if (crng_ready())
+	if (crng_ready()) {
 		crng_reseed(NULL);
+		random_vdso_set_ready();
+	}
 
 	WARN_ON(register_pm_notifier(&pm_notifier));
 

-- 
2.51.0

Re: [PATCH v5 19/34] random: vDSO: only access vDSO datapage after random_init()
Posted by Jason A. Donenfeld 1 month, 1 week ago
Hi Thomas,

I'm not a huge fan of this change:

On Thu, Nov 06, 2025 at 11:02:12AM +0100, Thomas Weißschuh wrote:
> +static DEFINE_STATIC_KEY_FALSE(random_vdso_is_ready);
>  
>  /* Control how we warn userspace. */
>  static struct ratelimit_state urandom_warning =
> @@ -252,6 +253,9 @@ static void random_vdso_update_generation(unsigned long next_gen)
>  	if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
>  		return;
>  
> +	if (!static_branch_likely(&random_vdso_is_ready))
> +		return;
> +
>  	/* base_crng.generation's invalid value is ULONG_MAX, while
>  	 * vdso_k_rng_data->generation's invalid value is 0, so add one to the
>  	 * former to arrive at the latter. Use smp_store_release so that this
> @@ -274,6 +278,9 @@ static void random_vdso_set_ready(void)
>  	if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
>  		return;
>  
> +	if (!static_branch_likely(&random_vdso_is_ready))
> +		return;
> +
>  	WRITE_ONCE(vdso_k_rng_data->is_ready, true);
>  }
>  
> @@ -925,6 +932,9 @@ void __init random_init(void)
>  	_mix_pool_bytes(&entropy, sizeof(entropy));
>  	add_latent_entropy();
>  
> +	if (IS_ENABLED(CONFIG_VDSO_GETRANDOM))
> +		static_branch_enable(&random_vdso_is_ready);
> +
>  	/*
>  	 * If we were initialized by the cpu or bootloader before jump labels
>  	 * or workqueues are initialized, then we should enable the static
> @@ -934,8 +944,10 @@ void __init random_init(void)
>  		crng_set_ready(NULL);
>  
>  	/* Reseed if already seeded by earlier phases. */
> -	if (crng_ready())
> +	if (crng_ready()) {
>  		crng_reseed(NULL);
> +		random_vdso_set_ready();
> +	}

The fact that the vdso datapage is set up by the time random_init() is
called seems incredibly contingent on init details. Why not, instead,
make this a necessary part of the structure of vdso setup code, which
can actually know about what happens when? For example, one clean way of
doing that would be to make vdso_k_rng_data always valid by having it
initially point to __initdata memory, and then when it's time to
initialize the real datapage, memcpy() the __initdata memory to the new
specially allocated memory. Then we don't need the complex state
tracking that this commit and the prior one introduce.

Jason
Re: [PATCH v5 19/34] random: vDSO: only access vDSO datapage after random_init()
Posted by Thomas Weißschuh 1 month, 1 week ago
On Sat, Nov 08, 2025 at 12:46:05AM +0100, Jason A. Donenfeld wrote:
> I'm not a huge fan of this change:
> 
> On Thu, Nov 06, 2025 at 11:02:12AM +0100, Thomas Weißschuh wrote:
> > [...]
> 
> The fact that the vdso datapage is set up by the time random_init() is
> called seems incredibly contingent on init details. Why not, instead,
> make this a necessary part of the structure of vdso setup code, which
> can actually know about what happens when?

The whole early init is "carefully" ordered in any case. I would have been
happy to allocate the data pages before the random initialization, but the
allocator is not yet usable by then.
We could also make the ordering more visible by having the vDSO datastore call
into a dedicated function to allow the random core to touch the data pages:
random_vdso_enable_datapages().

> For example, one clean way of
> doing that would be to make vdso_k_rng_data always valid by having it
> initially point to __initdata memory, and then when it's time to
> initialize the real datapage, memcpy() the __initdata memory to the new
> specially allocated memory. Then we don't need the complex state
> tracking that this commit and the prior one introduce.

Wouldn't that require synchronization between the update path and the memcpy()
path? Also if the pointer is going to change at some point we'll probably need
to use READ_ONCE()/WRITE_ONCE(). In general I would be happy about a cleaner
solution for this but didn't find a great one.


Thomas
Re: [PATCH v5 19/34] random: vDSO: only access vDSO datapage after random_init()
Posted by Jason A. Donenfeld 1 month, 1 week ago
On Mon, Nov 10, 2025 at 10:04:17AM +0100, Thomas Weißschuh wrote:
> On Sat, Nov 08, 2025 at 12:46:05AM +0100, Jason A. Donenfeld wrote:
> > I'm not a huge fan of this change:
> > 
> > On Thu, Nov 06, 2025 at 11:02:12AM +0100, Thomas Weißschuh wrote:
> > > [...]
> > 
> > The fact that the vdso datapage is set up by the time random_init() is
> > called seems incredibly contingent on init details. Why not, instead,
> > make this a necessary part of the structure of vdso setup code, which
> > can actually know about what happens when?
> 
> The whole early init is "carefully" ordered in any case. I would have been
> happy to allocate the data pages before the random initialization, but the
> allocator is not yet usable by then.
> We could also make the ordering more visible by having the vDSO datastore call
> into a dedicated function to allow the random core to touch the data pages:
> random_vdso_enable_datapages().
> 
> > For example, one clean way of
> > doing that would be to make vdso_k_rng_data always valid by having it
> > initially point to __initdata memory, and then when it's time to
> > initialize the real datapage, memcpy() the __initdata memory to the new
> > specially allocated memory. Then we don't need the complex state
> > tracking that this commit and the prior one introduce.
> 
> Wouldn't that require synchronization between the update path and the memcpy()
> path? Also if the pointer is going to change at some point we'll probably need
> to use READ_ONCE()/WRITE_ONCE(). In general I would be happy about a cleaner
> solution for this but didn't find a great one.

This is still before userspace has started, and interrupts are disabled,
so I don't think so? Also, you only care about being after
mm_core_init(), right? So move your thing before sched_init() and then
you'll really have nothing to worry about.

But I think globally I agree with Andy/Arnd -- this is kind of ugly and
not worth it. Disable vDSO for these old CPUs with cache aliasing
issues.

Jason
Re: [PATCH v5 19/34] random: vDSO: only access vDSO datapage after random_init()
Posted by Thomas Weißschuh 1 month, 1 week ago
On Mon, Nov 10, 2025 at 11:37:07AM +0100, Jason A. Donenfeld wrote:
> On Mon, Nov 10, 2025 at 10:04:17AM +0100, Thomas Weißschuh wrote:
> > On Sat, Nov 08, 2025 at 12:46:05AM +0100, Jason A. Donenfeld wrote:
> > > I'm not a huge fan of this change:
> > > 
> > > On Thu, Nov 06, 2025 at 11:02:12AM +0100, Thomas Weißschuh wrote:
> > > > [...]
> > > 
> > > The fact that the vdso datapage is set up by the time random_init() is
> > > called seems incredibly contingent on init details. Why not, instead,
> > > make this a necessary part of the structure of vdso setup code, which
> > > can actually know about what happens when?
> > 
> > The whole early init is "carefully" ordered in any case. I would have been
> > happy to allocate the data pages before the random initialization, but the
> > allocator is not yet usable by then.
> > We could also make the ordering more visible by having the vDSO datastore call
> > into a dedicated function to allow the random core to touch the data pages:
> > random_vdso_enable_datapages().
> > 
> > > For example, one clean way of
> > > doing that would be to make vdso_k_rng_data always valid by having it
> > > initially point to __initdata memory, and then when it's time to
> > > initialize the real datapage, memcpy() the __initdata memory to the new
> > > specially allocated memory. Then we don't need the complex state
> > > tracking that this commit and the prior one introduce.
> > 
> > Wouldn't that require synchronization between the update path and the memcpy()
> > path? Also if the pointer is going to change at some point we'll probably need
> > to use READ_ONCE()/WRITE_ONCE(). In general I would be happy about a cleaner
> > solution for this but didn't find a great one.
> 
> This is still before userspace has started, and interrupts are disabled,
> so I don't think so?

Interrupts being disabled is a good point. But we are still leaking
implementation details from the random code into the vdso datastore.

> Also, you only care about being after
> mm_core_init(), right? So move your thing before sched_init() and then
> you'll really have nothing to worry about.

The callchains random_init_early() -> crng_reseed()/_credit_init_bits() could
still touch the datapage before it is allocated.
Adding conditionals to prevent those is essentially what my patch does.

> But I think globally I agree with Andy/Arnd -- this is kind of ugly and
> not worth it. Disable vDSO for these old CPUs with cache aliasing
> issues.

(I obviously still need to properly respond to Andy and Arnd)

The dynamic allocation does more:
1) It is a preparation for mlockall() for the datapages [0].
2) On the SPARC T4 Niagara this was tested on, the MMU would raise an exception
if userspace accessed the mapped kernel .data page. Even compensating for dcache
aliasing did not prevent this, so that is not the reason. It is not clear why
it happens. At some point I gave up investigating, as point 1) would still hold
true.

[0] https://lore.kernel.org/lkml/20250901-vdso-mlockall-v2-0-68f5a6f03345@linutronix.de/
    (This revision used 'struct page' with .data memory, but that doesn't
    actually work everywhere)


Thomas
Re: [PATCH v5 19/34] random: vDSO: only access vDSO datapage after random_init()
Posted by Jason A. Donenfeld 1 month, 1 week ago
On Mon, Nov 10, 2025 at 12:24:13PM +0100, Thomas Weißschuh wrote:
> > > > For example, one clean way of
> > > > doing that would be to make vdso_k_rng_data always valid by having it
> > > > initially point to __initdata memory, and then when it's time to
> > > > initialize the real datapage, memcpy() the __initdata memory to the new
> > > > specially allocated memory. Then we don't need the complex state
> > > > tracking that this commit and the prior one introduce.
> > > 
> > > Wouldn't that require synchronization between the update path and the memcpy()
> > > path? Also if the pointer is going to change at some point we'll probably need
> > > to use READ_ONCE()/WRITE_ONCE(). In general I would be happy about a cleaner
> > > solution for this but didn't find a great one.
> > 
> > This is still before userspace has started, and interrupts are disabled,
> > so I don't think so?
> 
> Interrupts being disabled is a good point. But we are still leaking
> implementation details from the random code into the vdso datastore.

It wouldn't. You do this generically with memcpy().

> 
> > Also, you only care about being after
> > mm_core_init(), right? So move your thing before sched_init() and then
> > you'll really have nothing to worry about.
> 
> The callchains random_init_early() -> crng_reseed()/_credit_init_bits() could
> still touch the datapage before it is allocated.
> Adding conditionals to prevent those is essentially what my patch does.

I think I wasn't very clear in my proposal, because this isn't an issue
in it.

Global scope:

static struct vdso_rng_data placeholder_vdso_k_rng_data __initdata;
struct vdso_rng_data *vdso_k_rng_data = &placeholder_vdso_k_rng_data;

Then,

void __init vdso_setup_data_pages(void)
{
    ...
    vdso_k_rng_data = blabla();
    ...
    memcpy(vdso_k_rng_data, &placeholder_vdso_k_rng_data, sizeof(*vdso_k_rng_data));
    ...
}

If vdso_setup_data_pages() is called early enough in init, this is safe
to do, and then you don't need to muck up the random code with awful
state machine ordering stuff.

Jason
Re: [PATCH v5 19/34] random: vDSO: only access vDSO datapage after random_init()
Posted by Thomas Weißschuh 1 month, 1 week ago
On Mon, Nov 10, 2025 at 12:40:17PM +0100, Jason A. Donenfeld wrote:
> On Mon, Nov 10, 2025 at 12:24:13PM +0100, Thomas Weißschuh wrote:
> > > > > For example, one clean way of
> > > > > doing that would be to make vdso_k_rng_data always valid by having it
> > > > > initially point to __initdata memory, and then when it's time to
> > > > > initialize the real datapage, memcpy() the __initdata memory to the new
> > > > > specially allocated memory. Then we don't need the complex state
> > > > > tracking that this commit and the prior one introduce.
> > > > 
> > > > Wouldn't that require synchronization between the update path and the memcpy()
> > > > path? Also if the pointer is going to change at some point we'll probably need
> > > > to use READ_ONCE()/WRITE_ONCE(). In general I would be happy about a cleaner
> > > > solution for this but didn't find a great one.
> > > 
> > > This is still before userspace has started, and interrupts are disabled,
> > > so I don't think so?
> > 
> > Interrupts being disabled is a good point. But we are still leaking
> > implementation details from the random code into the vdso datastore.
> 
> It wouldn't. You do this generically with memcpy().

With "implementation details" I meant the fact that it is fine to swap out the
datapage behind the random code's back, and the fact that the memcpy() cannot
introduce any races.

> > > Also, you only care about being after
> > > mm_core_init(), right? So move your thing before sched_init() and then
> > > you'll really have nothing to worry about.
> > 
> > The callchains random_init_early() -> crng_reseed()/_credit_init_bits() could
> > still touch the datapage before it is allocated.
> > Adding conditionals to prevent those is essentially what my patch does.
> 
> I think I wasn't very clear in my proposal, because this isn't an issue
> in it.

I interpreted your previous mail as two different proposals:
1) do the memcpy() thing
2) move the page allocation after mm_core_init()

Now it makes more sense.

> Global scope:
> 
> static struct vdso_rng_data placeholder_vdso_k_rng_data __initdata;
> struct vdso_rng_data *vdso_k_rng_data = &placeholder_vdso_k_rng_data;
> 
> Then,
> 
> void __init vdso_setup_data_pages(void)
> {
>     ...
>     vdso_k_rng_data = blabla();
>     ...
>     memcpy(vdso_k_rng_data, &placeholder_vdso_k_rng_data, sizeof(*vdso_k_rng_data));
>     ...
> }
> 
> If vdso_setup_data_pages() is called early enough in init, this is safe
> to do, and then you don't need to muck up the random code with awful
> state machine ordering stuff.

Yes, it is safe, but that safety is not obvious in my opinion.
However, I'll use your proposal for the next revision.


Thomas