From: Jim Cromie
To: linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org, intel-gvt-dev@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org
Cc: Jim Cromie, Jason Baron, Peter Zijlstra, Josh Poimboeuf,
	Thomas Gleixner, Alice Ryhl, Steven Rostedt, Ard Biesheuvel,
	Alexandre Chartre, Juergen Gross, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin",
	Kees Cook, Nathan Chancellor, Lukas Bulwahn
Subject: [RFC PATCH 1/7] jump_label: expose queueing API for batched static key updates
Date: Thu, 5 Mar 2026 18:50:04 -0700
Message-ID: <20260306015022.1940986-2-jim.cromie@gmail.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260306015022.1940986-1-jim.cromie@gmail.com>
References: <20260306015022.1940986-1-jim.cromie@gmail.com>

Currently, `HAVE_JUMP_LABEL_BATCH` provides an architecture-level
mechanism to defer instruction synchronization (`text_poke_sync()`)
when patching a sequence of static keys.  However, this deferred
batching capability is not exposed as a public kernel API.  Subsystems
that need to toggle a large number of static keys (e.g. dynamic_debug)
currently suffer O(N) overhead from repeated machine-wide
synchronizations (stop_machine).

This patch introduces a public queueing API to expose this deferred
synchronization mechanism to the rest of the kernel.
This allows multiple static keys to be enabled/disabled by queueing
their architecture-level updates, then issuing a single machine-wide
synchronization barrier once all the instructions have been modified.

The new API consists of:

 - static_key_enable_queued(key)
 - static_key_disable_queued(key)
 - static_key_apply_queued()	(the global barrier/flush)
 - static_branch_enable_queued(x) / static_branch_disable_queued(x) macros

NOTES:

The '_queued' API suffix was chosen to match the underlying
arch_jump_label_transform_queue() and to avoid confusion with the
existing rate-limited 'static_key_deferred' API.  The names are also
unified under the 'static_key_*' prefix, renaming
jump_label_apply_queued to static_key_apply_queued (with a
compatibility macro) for consistency.

A pr_debug() is added to show the poked addresses; this exposed the
semi-random ordering coming from dynamic-debug, despite its ordered
descriptors.  So x86/kernel/alternatives gets new code to do an
insertion sort, by memcpy & memmove after appending.  This sorting
yields a dramatic IPI reduction; a following patch to dynamic-debug
uses the API and includes the numbers.
Cc: Jason Baron
Cc: Peter Zijlstra
Cc: Josh Poimboeuf
Cc: Thomas Gleixner
Cc: Alice Ryhl
Cc: Steven Rostedt
Cc: Ard Biesheuvel
Cc: Alexandre Chartre
Cc: Juergen Gross
Cc: Andy Lutomirski
Signed-off-by: Jim Cromie
---
 arch/Kconfig                  |   3 +
 arch/x86/Kconfig              |   1 +
 arch/x86/kernel/alternative.c |  50 +++++++++-----
 arch/x86/kernel/jump_label.c  |  13 +++-
 include/linux/jump_label.h    |  24 +++++++
 kernel/jump_label.c           | 125 +++++++++++++++++++++++++++++++---
 6 files changed, 186 insertions(+), 30 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 102ddbd4298e..388a73545005 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -505,6 +505,9 @@ config HAVE_ARCH_JUMP_LABEL
 config HAVE_ARCH_JUMP_LABEL_RELATIVE
 	bool
 
+config HAVE_JUMP_LABEL_BATCH
+	bool
+
 config MMU_GATHER_TABLE_FREE
 	bool
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e2df1b147184..4d7705890558 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -249,6 +249,7 @@ config X86
 	select HAVE_IOREMAP_PROT
 	select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_64
 	select HAVE_IRQ_TIME_ACCOUNTING
+	select HAVE_JUMP_LABEL_BATCH
 	select HAVE_JUMP_LABEL_HACK if HAVE_OBJTOOL
 	select HAVE_KERNEL_BZIP2
 	select HAVE_KERNEL_GZIP
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index a888ae0f01fb..85df82c36543 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -3137,26 +3137,19 @@ static void __smp_text_poke_batch_add(void *addr, const void *opcode, size_t len
 }
 
 /*
- * We hard rely on the text_poke_array.vec being ordered; ensure this is so by flushing
- * early if needed.
+ * We hard rely on the text_poke_array.vec being ordered; ensure this
+ * by finding where to insert to preserve the order, and mem-moving
+ * into place after appending it.
  */
-static bool text_poke_addr_ordered(void *addr)
+static int text_poke_get_insert_idx(void *addr)
 {
-	WARN_ON_ONCE(!addr);
+	int i;
 
-	if (!text_poke_array.nr_entries)
-		return true;
-
-	/*
-	 * If the last current entry's address is higher than the
-	 * new entry's address we'd like to add, then ordering
-	 * is violated and we must first flush all pending patching
-	 * requests:
-	 */
-	if (text_poke_addr(text_poke_array.vec + text_poke_array.nr_entries-1) > addr)
-		return false;
-
-	return true;
+	for (i = 0; i < text_poke_array.nr_entries; i++) {
+		if (text_poke_addr(&text_poke_array.vec[i]) > addr)
+			return i;
+	}
+	return text_poke_array.nr_entries;
 }
 
 /**
@@ -3174,9 +3167,30 @@ static bool text_poke_addr_ordered(void *addr)
  */
 void __ref smp_text_poke_batch_add(void *addr, const void *opcode, size_t len, const void *emulate)
 {
-	if (text_poke_array.nr_entries == TEXT_POKE_ARRAY_MAX || !text_poke_addr_ordered(addr))
+	int insert_idx;
+
+	pr_debug("incoming addr=%px, current_qlen=%d\n",
+		 addr, text_poke_array.nr_entries);
+
+	if (text_poke_array.nr_entries == TEXT_POKE_ARRAY_MAX)
 		smp_text_poke_batch_finish();
+
+	insert_idx = text_poke_get_insert_idx(addr);
 	__smp_text_poke_batch_add(addr, opcode, len, emulate);
+
+	if (insert_idx < text_poke_array.nr_entries - 1) {
+		struct smp_text_poke_loc tmp;
+		int last = text_poke_array.nr_entries - 1;
+		/* Copy the newly appended item out */
+		memcpy(&tmp, &text_poke_array.vec[last], sizeof(tmp));
+
+		/* Shift everything from insert_idx over by 1 */
+		memmove(&text_poke_array.vec[insert_idx + 1],
+			&text_poke_array.vec[insert_idx],
+			(last - insert_idx) * sizeof(struct smp_text_poke_loc));
+		/* Drop the new item into its sorted home */
+		memcpy(&text_poke_array.vec[insert_idx], &tmp, sizeof(tmp));
+	}
 }
 
 /**
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index a7949a54a0ff..6b5bab5f34e8 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -120,6 +120,8 @@ void arch_jump_label_transform(struct jump_entry *entry,
 	jump_label_transform(entry, type, 0);
 }
 
+static int jump_label_queue_len;
+
 bool arch_jump_label_transform_queue(struct jump_entry *entry,
 				     enum jump_label_type type)
 {
@@ -135,14 +137,23 @@ bool arch_jump_label_transform_queue(struct jump_entry *entry,
 
 	mutex_lock(&text_mutex);
 	jlp = __jump_label_patch(entry, type);
-	smp_text_poke_batch_add((void *)jump_entry_code(entry), jlp.code, jlp.size, NULL);
+	smp_text_poke_batch_add((void *)jump_entry_code(entry),
+				jlp.code, jlp.size, NULL);
+	jump_label_queue_len++;
 	mutex_unlock(&text_mutex);
 	return true;
 }
 
 void arch_jump_label_transform_apply(void)
 {
+	if (!jump_label_queue_len) {
+		pr_debug("no queued jump_labels to apply\n");
+		return;
+	}
+
+	pr_debug("applying %d queued jump_labels\n", jump_label_queue_len);
 	mutex_lock(&text_mutex);
 	smp_text_poke_batch_finish();
+	jump_label_queue_len = 0;
 	mutex_unlock(&text_mutex);
 }
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index fdb79dd1ebd8..17f572abe4bb 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -234,10 +234,20 @@ extern void static_key_slow_dec_cpuslocked(struct static_key *key);
 extern int static_key_count(struct static_key *key);
 extern void static_key_enable(struct static_key *key);
 extern void static_key_disable(struct static_key *key);
+extern void static_key_enable_queued(struct static_key *key);
+extern void static_key_disable_queued(struct static_key *key);
+extern void static_key_apply_queued(void);
 extern void static_key_enable_cpuslocked(struct static_key *key);
 extern void static_key_disable_cpuslocked(struct static_key *key);
 extern enum jump_label_type jump_label_init_type(struct jump_entry *entry);
 
+#define static_branch_enable(x)			static_key_enable(&(x)->key)
+#define static_branch_disable(x)		static_key_disable(&(x)->key)
+#define static_branch_enable_queued(x)		static_key_enable_queued(&(x)->key)
+#define static_branch_disable_queued(x)		static_key_disable_queued(&(x)->key)
+#define static_branch_enable_cpuslocked(x)	static_key_enable_cpuslocked(&(x)->key)
+#define static_branch_disable_cpuslocked(x)	static_key_disable_cpuslocked(&(x)->key)
+
 /*
  * We should be using ATOMIC_INIT() for initializing .enabled, but
  * the inclusion of atomic.h is problematic for inclusion of jump_label.h
@@ -340,6 +350,18 @@ static inline void static_key_disable(struct static_key *key)
 	atomic_set(&key->enabled, 0);
 }
 
+static inline void static_key_enable_queued(struct static_key *key)
+{
+	static_key_enable(key);
+}
+
+static inline void static_key_disable_queued(struct static_key *key)
+{
+	static_key_disable(key);
+}
+
+static inline void static_key_apply_queued(void) {}
+
 #define static_key_enable_cpuslocked(k)		static_key_enable((k))
 #define static_key_disable_cpuslocked(k)	static_key_disable((k))
 
@@ -535,6 +557,8 @@ extern bool ____wrong_branch_error(void);
 
 #define static_branch_enable(x)			static_key_enable(&(x)->key)
 #define static_branch_disable(x)		static_key_disable(&(x)->key)
+#define static_branch_enable_queued(x)		static_key_enable_queued(&(x)->key)
+#define static_branch_disable_queued(x)		static_key_disable_queued(&(x)->key)
 #define static_branch_enable_cpuslocked(x)	static_key_enable_cpuslocked(&(x)->key)
 #define static_branch_disable_cpuslocked(x)	static_key_disable_cpuslocked(&(x)->key)
 
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 7cb19e601426..76a0f4e68b73 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -91,6 +91,7 @@ jump_label_sort_entries(struct jump_entry *start, struct jump_entry *stop)
 }
 
 static void jump_label_update(struct static_key *key);
+static void jump_label_update_queued(struct static_key *key);
 
 /*
  * There are similar definitions for the !CONFIG_JUMP_LABEL case in jump_label.h.
@@ -250,6 +251,41 @@ void static_key_disable(struct static_key *key)
 }
 EXPORT_SYMBOL_GPL(static_key_disable);
 
+void static_key_enable_queued(struct static_key *key)
+{
+	STATIC_KEY_CHECK_USE(key);
+
+	if (atomic_read(&key->enabled) > 0) {
+		WARN_ON_ONCE(atomic_read(&key->enabled) != 1);
+		return;
+	}
+
+	jump_label_lock();
+	if (atomic_read(&key->enabled) == 0) {
+		atomic_set(&key->enabled, -1);
+		jump_label_update_queued(key);
+		atomic_set_release(&key->enabled, 1);
+	}
+	jump_label_unlock();
+}
+EXPORT_SYMBOL_GPL(static_key_enable_queued);
+
+void static_key_disable_queued(struct static_key *key)
+{
+	STATIC_KEY_CHECK_USE(key);
+
+	if (atomic_read(&key->enabled) != 1) {
+		WARN_ON_ONCE(atomic_read(&key->enabled) != 0);
+		return;
+	}
+
+	jump_label_lock();
+	if (atomic_cmpxchg(&key->enabled, 1, 0) == 1)
+		jump_label_update_queued(key);
+	jump_label_unlock();
+}
+EXPORT_SYMBOL_GPL(static_key_disable_queued);
+
 static bool static_key_dec_not_one(struct static_key *key)
 {
 	int v;
@@ -488,39 +524,59 @@ static bool jump_label_can_update(struct jump_entry *entry, bool init)
 	return true;
 }
 
-#ifndef HAVE_JUMP_LABEL_BATCH
 static void __jump_label_update(struct static_key *key,
 				struct jump_entry *entry,
 				struct jump_entry *stop,
 				bool init)
 {
+#ifndef HAVE_JUMP_LABEL_BATCH
 	for (; (entry < stop) && (jump_entry_key(entry) == key); entry++) {
 		if (jump_label_can_update(entry, init))
 			arch_jump_label_transform(entry, jump_label_type(entry));
 	}
-}
 #else
-static void __jump_label_update(struct static_key *key,
-				struct jump_entry *entry,
-				struct jump_entry *stop,
-				bool init)
-{
 	for (; (entry < stop) && (jump_entry_key(entry) == key); entry++) {
 
 		if (!jump_label_can_update(entry, init))
 			continue;
 
 		if (!arch_jump_label_transform_queue(entry, jump_label_type(entry))) {
-			/*
-			 * Queue is full: Apply the current queue and try again.
-			 */
 			arch_jump_label_transform_apply();
-			BUG_ON(!arch_jump_label_transform_queue(entry, jump_label_type(entry)));
+			WARN_ON_ONCE(!arch_jump_label_transform_queue(entry, jump_label_type(entry)));
 		}
 	}
 	arch_jump_label_transform_apply();
+#endif
 }
+
+static void __jump_label_update_queued(struct static_key *key,
+				       struct jump_entry *entry,
+				       struct jump_entry *stop,
+				       bool init)
+{
+#ifdef HAVE_JUMP_LABEL_BATCH
+	for (; (entry < stop) && (jump_entry_key(entry) == key); entry++) {
+
+		if (!jump_label_can_update(entry, init))
+			continue;
+
+		if (!arch_jump_label_transform_queue(entry, jump_label_type(entry))) {
+			arch_jump_label_transform_apply();
+			WARN_ON_ONCE(!arch_jump_label_transform_queue(entry, jump_label_type(entry)));
+		}
+	}
+#else
+	__jump_label_update(key, entry, stop, init);
+#endif
+}
+
+void static_key_apply_queued(void)
+{
+#ifdef HAVE_JUMP_LABEL_BATCH
+	arch_jump_label_transform_apply();
 #endif
+}
+EXPORT_SYMBOL_GPL(static_key_apply_queued);
 
 void __init jump_label_init(void)
 {
@@ -696,6 +752,27 @@ static void __jump_label_mod_update(struct static_key *key)
 	}
 }
 
+static void __jump_label_mod_update_queued(struct static_key *key)
+{
+	struct static_key_mod *mod;
+
+	for (mod = static_key_mod(key); mod; mod = mod->next) {
+		struct jump_entry *stop;
+		struct module *m;
+
+		if (!mod->entries)
+			continue;
+
+		m = mod->mod;
+		if (!m)
+			stop = __stop___jump_table;
+		else
+			stop = m->jump_entries + m->num_jump_entries;
+		__jump_label_update_queued(key, mod->entries, stop,
+					   m && m->state == MODULE_STATE_COMING);
+	}
+}
+
 static int jump_label_add_module(struct module *mod)
 {
 	struct jump_entry *iter_start = mod->jump_entries;
@@ -919,6 +996,32 @@ static void jump_label_update(struct static_key *key)
 	__jump_label_update(key, entry, stop, init);
 }
 
+static void jump_label_update_queued(struct static_key *key)
+{
+	struct jump_entry *stop = __stop___jump_table;
+	bool init = system_state < SYSTEM_RUNNING;
+	struct jump_entry *entry;
+#ifdef CONFIG_MODULES
+	struct module *mod;
+
+	if (static_key_linked(key)) {
+		__jump_label_mod_update_queued(key);
+		return;
+	}
+
+	scoped_guard(rcu) {
+		mod = __module_address((unsigned long)key);
+		if (mod) {
+			stop = mod->jump_entries + mod->num_jump_entries;
+			init = mod->state == MODULE_STATE_COMING;
+		}
+	}
+#endif
+	entry = static_key_entries(key);
+	if (entry)
+		__jump_label_update_queued(key, entry, stop, init);
+}
+
 #ifdef CONFIG_STATIC_KEYS_SELFTEST
 static DEFINE_STATIC_KEY_TRUE(sk_true);
 static DEFINE_STATIC_KEY_FALSE(sk_false);
-- 
2.53.0