[PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c)

Ingo Molnar posted 41 patches 8 months, 3 weeks ago
arch/x86/include/asm/text-patching.h |  23 ++---
arch/x86/kernel/alternative.c        | 255 ++++++++++++++++++++++++++++---------------------------
arch/x86/kernel/ftrace.c             |  18 ++--
arch/x86/kernel/jump_label.c         |   6 +-
arch/x86/kernel/kprobes/core.c       |   4 +-
arch/x86/kernel/kprobes/opt.c        |   6 +-
arch/x86/kernel/module.c             |   2 +-
arch/x86/kernel/static_call.c        |   2 +-
arch/x86/kernel/traps.c              |   6 +-
arch/x86/mm/init.c                   |  16 ++--
arch/x86/net/bpf_jit_comp.c          |   2 +-
11 files changed, 172 insertions(+), 168 deletions(-)
[PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c)
Posted by Ingo Molnar 8 months, 3 weeks ago
This series has 3 main parts:

(1)

The first part of this series performs a thorough text-patching API namespace
cleanup discussed with Peter Zijlstra:

	https://lore.kernel.org/r/20250325123119.GL36322@noisy.programming.kicks-ass.net

Non-SMP APIs retain their existing text_poke*() namespace:

	text_poke()
	text_poke_sync_each_cpu()
	text_poke_kgdb()
	text_poke_copy()
	text_poke_copy
	text_poke_copy_locked()
	text_poke_set()

The SMP text-patching APIs had 3 separate prefixes:

	text_poke_
	text_poke_bp_
	poke_int3_

These get standardized to the single text_poke_int3*() namespace:

	text_poke_addr()	=> text_poke_int3_addr()
	poke_int3_handler()	=> text_poke_int3_handler()
	text_poke_bp_batch()	=> text_poke_int3_batch_process()
	text_poke_loc_init()	=> text_poke_int3_loc_add()
	text_poke_flush()	=> text_poke_int3_finish()
	text_poke_finish()	=> text_poke_int3_flush()
	text_poke_queue()	=> text_poke_int3_queue()
	text_poke_bp()		=> text_poke_int3_now()

(2)

The second part of the series simplifies and standardizes the SMP batch-patching
data & types namespace, around the new tp_array* namespace:

	int3_patching_desc	=> [removed]
	temp_mm_state_t		=> [removed]
	try_get_desc()		=> [removed]
	put_desc()		=> [removed]

	tp_vec,tp_vec_nr	=> tp_array
	int3_refs		=> tp_array_refs

(3)

The third part of the series contains additional patches, that
together with the data-namespace simplification changes remove
about 3 layers of unnecessary indirections and simplify/streamline
various aspects of the code:

	[PATCH] x86/alternatives: Remove the confusing, inaccurate & unnecessary 'temp_mm_state_t' abstraction
	[PATCH] x86/alternatives: Use non-inverted logic instead of 'tp_order_fail()'
	[PATCH] x86/alternatives: Remove the 'addr == NULL means forced-flush' hack from text_poke_int3_finish()/text_poke_int3_flush()/tp_addr_ordered()
	[PATCH] x86/alternatives: Simplify text_poke_int3() by using tp_vec and existing APIs
	[PATCH] x86/alternatives: Introduce 'struct text_poke_int3_array' and move tp_vec and tp_vec_nr to it
	[PATCH] x86/alternatives: Remove the tp_vec indirection
	[PATCH] x86/alternatives: Simplify try_get_tp_array()
	[PATCH] x86/alternatives: Simplify text_poke_int3_handler()
	[PATCH] x86/alternatives: Simplify text_poke_int3_batch()
	[PATCH] x86/alternatives: Move the tp_array manipulation into text_poke_int3_loc_init() and rename it to text_poke_int3_loc_add()
	[PATCH] x86/alternatives: Move tp_array completion from text_poke_int3_finish() and text_poke_int3_flush() to text_poke_int3_batch_process()
	[PATCH] x86/alternatives: Simplify tp_addr_ordered()

Various APIs also had their names clarified, as part of the renames.
I also added comments where justified.

There's almost no functional changes in the end, other than
mixed text_poke_int3_now() & text_poke_int3_queue() calls
are now probably working better than before - although I'm not
aware of such in-tree usage at the moment.

After these changes there's a reduction of about ~20 lines of
code if we exclude comments, and some reduction in text size:

   text       data        bss        dec        hex    filename
  13637       1009       4112      18758       4946    arch/x86/kernel/alternative.o.before
  13549       1009       4156      18714       491a    arch/x86/kernel/alternative.o.after

But the main goal was to perform a thorough round of source code TLC,
to make the code easier to read & maintain, and to remove a chunk
of technical debt accumulated incrementally over 20 years, which
improvements are only partly reflected in line count and code size decreases.

Lightly tested only.

This tree can also be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip WIP.x86/alternatives

Thanks,

    Ingo

================>

Ingo Molnar (41):
  x86/alternatives: Rename 'struct bp_patching_desc' to 'struct int3_patching_desc'
  x86/alternatives: Rename 'bp_refs' to 'int3_refs'
  x86/alternatives: Rename 'text_poke_bp_batch()' to 'text_poke_int3_batch()'
  x86/alternatives: Rename 'text_poke_bp()' to 'text_poke_int3()'
  x86/alternatives: Rename 'poke_int3_handler()' to 'text_poke_int3_handler()'
  x86/alternatives: Rename 'poking_mm' to 'text_poke_mm'
  x86/alternatives: Rename 'text_poke_addr' to 'text_poke_int3_addr'
  x86/alternatives: Rename 'poking_addr' to 'text_poke_addr'
  x86/alternatives: Rename 'bp_desc' to 'int3_desc'
  x86/alternatives: Remove duplicate 'text_poke_early()' prototype
  x86/alternatives: Update comments in int3_emulate_push()
  x86/alternatives: Remove the confusing, inaccurate & unnecessary 'temp_mm_state_t' abstraction
  x86/alternatives: Rename 'text_poke_flush()' to 'text_poke_int3_flush()'
  x86/alternatives: Rename 'text_poke_finish()' to 'text_poke_int3_finish()'
  x86/alternatives: Rename 'text_poke_queue()' to 'text_poke_int3_queue()'
  x86/alternatives: Rename 'text_poke_loc_init()' to 'text_poke_int3_loc_init()'
  x86/alternatives: Rename 'struct text_poke_loc' to 'struct text_poke_int3_loc'
  x86/alternatives: Rename 'struct int3_patching_desc' to 'struct text_poke_int3_vec'
  x86/alternatives: Rename 'int3_desc' to 'int3_vec'
  x86/alternatives: Add text_mutex) assert to text_poke_int3_flush()
  x86/alternatives: Assert that text_poke_int3_handler() can only ever handle 'tp_vec[]' based requests
  x86/alternatives: Use non-inverted logic instead of 'tp_order_fail()'
  x86/alternatives: Remove the 'addr == NULL means forced-flush' hack from text_poke_int3_finish()/text_poke_int3_flush()/tp_addr_ordered()
  x86/alternatives: Simplify text_poke_int3() by using tp_vec and existing APIs
  x86/alternatives: Assert input parameters in text_poke_int3_batch()
  x86/alternatives: Introduce 'struct text_poke_int3_array' and move tp_vec and tp_vec_nr to it
  x86/alternatives: Remove the tp_vec indirection
  x86/alternatives: Rename 'try_get_desc()' to 'try_get_tp_array()'
  x86/alternatives: Rename 'put_desc()' to 'put_tp_array()'
  x86/alternatives: Simplify try_get_tp_array()
  x86/alternatives: Simplify text_poke_int3_handler()
  x86/alternatives: Simplify text_poke_int3_batch()
  x86/alternatives: Rename 'text_poke_int3_batch()' to 'text_poke_int3_batch_process()'
  x86/alternatives: Rename 'int3_refs' to 'tp_array_refs'
  x86/alternatives: Move the tp_array manipulation into text_poke_int3_loc_init() and rename it to text_poke_int3_loc_add()
  x86/alternatives: Remove the mixed-patching restriction on text_poke_int3()
  x86/alternatives: Rename 'text_poke_int3()' to 'text_poke_int3_now()'
  x86/alternatives: Add documentation for text_poke_int3_queue()
  x86/alternatives: Move tp_array completion from text_poke_int3_finish() and text_poke_int3_flush() to text_poke_int3_batch_process()
  x86/alternatives: Rename 'text_poke_sync()' to 'text_poke_sync_each_cpu()'
  x86/alternatives: Simplify tp_addr_ordered()

 arch/x86/include/asm/text-patching.h |  23 ++---
 arch/x86/kernel/alternative.c        | 255 ++++++++++++++++++++++++++++---------------------------
 arch/x86/kernel/ftrace.c             |  18 ++--
 arch/x86/kernel/jump_label.c         |   6 +-
 arch/x86/kernel/kprobes/core.c       |   4 +-
 arch/x86/kernel/kprobes/opt.c        |   6 +-
 arch/x86/kernel/module.c             |   2 +-
 arch/x86/kernel/static_call.c        |   2 +-
 arch/x86/kernel/traps.c              |   6 +-
 arch/x86/mm/init.c                   |  16 ++--
 arch/x86/net/bpf_jit_comp.c          |   2 +-
 11 files changed, 172 insertions(+), 168 deletions(-)

-- 
2.45.2
Re: [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c)
Posted by Linus Torvalds 8 months, 3 weeks ago
On Thu, 27 Mar 2025 at 13:54, Ingo Molnar <mingo@kernel.org> wrote:
>
> The second part of the series simplifies and standardizes the SMP batch-patching
> data & types namespace, around the new tp_array* namespace:
>
>         int3_patching_desc      => [removed]
>         temp_mm_state_t         => [removed]
>         try_get_desc()          => [removed]
>         put_desc()              => [removed]
>
>         tp_vec,tp_vec_nr        => tp_array
>         int3_refs               => tp_array_refs

Honestly, I think "int3" is better than "tp" as a part of the name.

"tp" doesn't say _anything_ to me, even though I understand it is
short for "text poke". But if you want to say 'text_poke", please just
write it out.

At least "int3" has some meaning in x86 context, unlike "tp".

So please either write out "text_poke" and accept that the names are a
bit longer (but a lot more descriptive), or use "int3" if you want to
save some typing.

        Linus

PS. The casual meaning "tp" has in English everyday language is short
for "toilet paper".
Re: [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c)
Posted by Peter Zijlstra 8 months, 2 weeks ago
On Thu, Mar 27, 2025 at 03:19:40PM -0700, Linus Torvalds wrote:

> PS. The casual meaning "tp" has in English everyday language is short
> for "toilet paper".

Yes, which is why the old names were awesome :-) /me runs
Re: [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c)
Posted by Ingo Molnar 8 months, 3 weeks ago
* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 27 Mar 2025 at 13:54, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > The second part of the series simplifies and standardizes the SMP batch-patching
> > data & types namespace, around the new tp_array* namespace:
> >
> >         int3_patching_desc      => [removed]
> >         temp_mm_state_t         => [removed]
> >         try_get_desc()          => [removed]
> >         put_desc()              => [removed]
> >
> >         tp_vec,tp_vec_nr        => tp_array
> >         int3_refs               => tp_array_refs
> 
> Honestly, I think "int3" is better than "tp" as a part of the name.

Yeah, was thinking hard about those two as well, but skipped it for 
this series, because of the brevity issue: text_poke_int3_ is quite a 
mouthful for a commonly used global state variable.

> "tp" doesn't say _anything_ to me, even though I understand it is 
> short for "text poke". But if you want to say 'text_poke", please 
> just write it out.
> 
> At least "int3" has some meaning in x86 context, unlike "tp".
> 
> So please either write out "text_poke" and accept that the names are 
> a bit longer (but a lot more descriptive), or use "int3" if you want 
> to save some typing.

Yeah.

So the thing is, the whole _int3 naming itself is a bit artificial 
IMHO, what we *really* want to signal here is whether something is 
boot/UP functionality or SMP functionality under the text_mutex.

That the SMP functionality relies on INT3 traps is an implementational 
detail that isn't necessary to show up in the API namespace. It also 
relies on CR3 flushing, so we could as well have added _cr3. ;-)

So I was thinking about something like this for the boot/UP variants:

  text_poke_*()

and for the SMP variants:

  smp_text_poke_*()

and text_poke_* for the data/type space.

Plus I think with the adding of 'smp_' we can also add 'batch_' to a 
few APIs to make the family of APIs clearer, plus a few other things:

A quick summary of changes (mockup):

	# boot/UP APIs & single-thread helpers:

						text_poke()
						text_poke_kgdb()
	[ unchanged APIs: ]			text_poke_copy()
						text_poke_copy_locked()
						text_poke_set()

						text_poke_addr()

	# SMP API & helpers namespace:

	text_poke_bp()			=>	smp_text_poke_single()
	text_poke_loc_init()		=>	__smp_text_poke_batch_add()
	text_poke_queue()		=>	smp_text_poke_batch_add()
	text_poke_finish()		=>	smp_text_poke_batch_flush()
	text_poke_flush()		=>	smp_text_poke_batch_finish()

	text_poke_bp_batch()		=>	smp_text_poke_batch_process()
	poke_int3_handler()		=>	smp_text_poke_int3_trap_handler()
        text_poke_sync()		=>	smp_text_poke_sync_each_cpu()


	# data/type namespace:

	int3_patching_desc		=>	[removed]
	temp_mm_state_t			=>	[removed]
	try_get_desc()			=>	[removed]
	put_desc()			=>	[removed]

	tp_vec,tp_vec_nr		=>	text_poke_array
	int3_refs			=>	text_poke_array_refs

Some of the API names are now a bit long, but I think this is one of 
the cases where clarity is more important than brevity, plus these are 
usually used in a pretty compact, straightforward fashion to trigger 
text-patching processing, not part of complex compound expressions.

I'll propagate this nomenclature into the series and repost.

> PS. The casual meaning "tp" has in English everyday language is short 
> for "toilet paper".

LOL, this seals the deal, the tp_ prefix is *so* dead.

Thanks,

	Ingo