From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt, Juergen Gross
Subject: [patch 01/38] x86/paravirt: Ensure proper alignment
Message-ID: <20220716230952.727288644@linutronix.de>
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:11 +0200 (CEST)

The entries in the .parainstr sections are 8 byte aligned and the
corresponding C struct makes the array offset 16 bytes. Though the pushed
entries are only using 12 bytes. .parainstr_end is therefore 4 bytes short.

That works by chance because it's only used in a loop:

     for (p = start; p < end; p++)

But this falls flat when calculating the number of elements:

     n = end - start

That's obviously off by one.

Ensure that the gap is filled and the last entry is occupying 16 bytes.

Signed-off-by: Thomas Gleixner
Cc: Juergen Gross
---
 arch/x86/include/asm/paravirt.h       | 1 +
 arch/x86/include/asm/paravirt_types.h | 1 +
 2 files changed, 2 insertions(+)

--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -743,6 +743,7 @@ extern void default_banner(void);
	 word 771b;				\
	 .byte ptype;				\
	 .byte 772b-771b;			\
+	 _ASM_ALIGN;				\
	.popsection

--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -294,6 +294,7 @@ extern struct paravirt_patch_template pv
		  " .byte " type "\n"				\
		  " .byte 772b-771b\n"				\
		  " .short " clobber "\n"			\
+		  _ASM_ALIGN "\n"				\
		  ".popsection\n"

 /* Generate patchable code, with the default asm parameters. */
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt
Subject: [patch 02/38] x86/cpu: Use native_wrmsrl() in load_percpu_segment()
Message-ID: <20220716230952.787452088@linutronix.de>
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:12 +0200 (CEST)

load_percpu_segment() is using wrmsr() which is paravirtualized. That's an
issue because the code sequence is:

      __loadsegment_simple(gs, 0);
      wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));

So anything which uses a per CPU variable between setting GS to 0 and
writing GSBASE is going to end up in a NULL pointer dereference. That can
be triggered with instrumentation and is guaranteed to be triggered with
callthunks for call depth tracking.

Use native_wrmsrl() instead. XEN_PV will trap and emulate, but that's not a
hot path.

Also make it static and mark it noinstr so neither kprobes, sanitizers nor
anything else can touch it.
Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/processor.h |  1 -
 arch/x86/kernel/cpu/common.c     | 12 ++++++++++--
 2 files changed, 10 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -673,7 +673,6 @@ extern struct desc_ptr early_gdt_descr;
 extern void switch_to_new_gdt(int);
 extern void load_direct_gdt(int);
 extern void load_fixmap_gdt(int);
-extern void load_percpu_segment(int);
 extern void cpu_init(void);
 extern void cpu_init_secondary(void);
 extern void cpu_init_exception_handling(void);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -701,13 +701,21 @@ static const char *table_lookup_model(st
 __u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
 __u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
 
-void load_percpu_segment(int cpu)
+static noinstr void load_percpu_segment(int cpu)
 {
 #ifdef CONFIG_X86_32
 	loadsegment(fs, __KERNEL_PERCPU);
 #else
 	__loadsegment_simple(gs, 0);
-	wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
+	/*
+	 * Because of the __loadsegment_simple(gs, 0) above, any GS-prefixed
+	 * instruction will explode right about here. As such, we must not have
+	 * any CALL-thunks using per-cpu data.
+	 *
+	 * Therefore, use native_wrmsrl() and have XenPV take the fault and
+	 * emulate.
+	 */
+	native_wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
 #endif
 }
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt
Subject: [patch 03/38] x86/modules: Set VM_FLUSH_RESET_PERMS in module_alloc()
Message-ID: <20220716230952.845576840@linutronix.de>
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:14 +0200 (CEST)

Instead of resetting permissions all over the place when freeing module
memory, tell the vmalloc code to do so. This avoids the exercise for the
next upcoming user.

Signed-off-by: Thomas Gleixner
---
 arch/x86/kernel/ftrace.c       | 2 --
 arch/x86/kernel/kprobes/core.c | 1 -
 arch/x86/kernel/module.c       | 9 +++++----
 3 files changed, 5 insertions(+), 7 deletions(-)

--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -412,8 +412,6 @@ create_trampoline(struct ftrace_ops *ops
 	/* ALLOC_TRAMP flags lets us know we created it */
 	ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
 
-	set_vm_flush_reset_perms(trampoline);
-
 	if (likely(system_state != SYSTEM_BOOTING))
 		set_memory_ro((unsigned long)trampoline, npages);
 	set_memory_x((unsigned long)trampoline, npages);
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -416,7 +416,6 @@ void *alloc_insn_page(void)
 	if (!page)
 		return NULL;
 
-	set_vm_flush_reset_perms(page);
 	/*
	 * First make the page read-only, and only then make it executable to
	 * prevent it from being W+X in between.
	 */
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -74,10 +74,11 @@ void *module_alloc(unsigned long size)
 		return NULL;
 
 	p = __vmalloc_node_range(size, MODULE_ALIGN,
-				 MODULES_VADDR + get_module_load_offset(),
-				 MODULES_END, gfp_mask,
-				 PAGE_KERNEL, VM_DEFER_KMEMLEAK, NUMA_NO_NODE,
-				 __builtin_return_address(0));
+				 MODULES_VADDR + get_module_load_offset(),
+				 MODULES_END, gfp_mask, PAGE_KERNEL,
+				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
+				 NUMA_NO_NODE, __builtin_return_address(0));
+
 	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
 		vfree(p);
 		return NULL;
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt
Subject: [patch 04/38] x86/vdso: Ensure all kernel code is seen by objtool
Message-ID: <20220716230952.904222100@linutronix.de>
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:16 +0200 (CEST)

extable.c is kernel code and not part of the VDSO.

Signed-off-by: Thomas Gleixner
---
 arch/x86/entry/vdso/Makefile | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -30,11 +30,12 @@ vobjs32-y += vdso32/vclock_gettime.o
 vobjs-$(CONFIG_X86_SGX)	+= vsgx.o
 
 # files to link into kernel
-obj-y				+= vma.o extable.o
-KASAN_SANITIZE_vma.o		:= y
-UBSAN_SANITIZE_vma.o		:= y
-KCSAN_SANITIZE_vma.o		:= y
-OBJECT_FILES_NON_STANDARD_vma.o	:= n
+obj-y					+= vma.o extable.o
+KASAN_SANITIZE_vma.o			:= y
+UBSAN_SANITIZE_vma.o			:= y
+KCSAN_SANITIZE_vma.o			:= y
+OBJECT_FILES_NON_STANDARD_vma.o		:= n
+OBJECT_FILES_NON_STANDARD_extable.o	:= n
 
 # vDSO images to build
 vdso_img-$(VDSO64-y)		+= 64
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt
Subject: [patch 05/38] btree: Initialize early when builtin
Message-ID: <20220716230952.961938321@linutronix.de>
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:17 +0200 (CEST)

An upcoming user of btree needs it early on. Initialize it in
start_kernel().

Signed-off-by: Thomas Gleixner
---
 include/linux/btree.h | 6 ++++++
 init/main.c           | 2 ++
 lib/btree.c           | 8 +++++++-
 3 files changed, 15 insertions(+), 1 deletion(-)

--- a/include/linux/btree.h
+++ b/include/linux/btree.h
@@ -5,6 +5,12 @@
 #include
 #include
 
+#if IS_BUILTIN(CONFIG_BTREE)
+extern void btree_cache_init(void);
+#else
+static inline void btree_cache_init(void) {}
+#endif
+
 /**
  * DOC: B+Tree basics
  *
--- a/init/main.c
+++ b/init/main.c
@@ -75,6 +75,7 @@
 #include
 #include
 #include
+#include <linux/btree.h>
 #include
 #include
 #include
@@ -1125,6 +1126,7 @@ asmlinkage __visible void __init __no_sa
 	cgroup_init();
 	taskstats_init_early();
 	delayacct_init();
+	btree_cache_init();
 
 	poking_init();
 	check_bugs();
--- a/lib/btree.c
+++ b/lib/btree.c
@@ -787,15 +787,21 @@ static int __init btree_module_init(void
 	return 0;
 }
 
+#if IS_MODULE(CONFIG_BTREE)
 static void __exit btree_module_exit(void)
 {
 	kmem_cache_destroy(btree_cachep);
 }
 
-/* If core code starts using btree, initialization should happen even earlier */
 module_init(btree_module_init);
 module_exit(btree_module_exit);
 
 MODULE_AUTHOR("Joern Engel ");
 MODULE_AUTHOR("Johannes Berg ");
 MODULE_LICENSE("GPL");
+#else
+void __init btree_cache_init(void)
+{
+	BUG_ON(btree_module_init());
+}
+#endif
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt, "Peter Zijlstra (Intel)"
Subject: [patch 06/38] objtool: Allow GS relative relocs
Message-ID: <20220716230953.021143397@linutronix.de>
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:19 +0200 (CEST)

From: Peter Zijlstra

Objtool doesn't currently much like per-cpu usage in alternatives:

  arch/x86/entry/entry_64.o: warning: objtool: .altinstr_replacement+0xf: unsupported relocation in alternatives section
    f:	65 c7 04 25 00 00 00 00 00 00 00 80	movl $0x80000000,%gs:0x0	13: R_X86_64_32S	__x86_call_depth

Allow this.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Thomas Gleixner
---
 tools/objtool/arch/x86/decode.c       | 26 +++++++++++++++++++++-----
 tools/objtool/check.c                 |  6 ++----
 tools/objtool/include/objtool/arch.h  |  4 +---
 tools/objtool/include/objtool/check.h | 20 +++++++++++---------
 4 files changed, 35 insertions(+), 21 deletions(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -103,24 +103,37 @@ unsigned long arch_jump_destination(stru
 #define rm_is_mem(reg)	(mod_is_mem() && !is_RIP() && rm_is(reg))
 #define rm_is_reg(reg)	(mod_is_reg() && modrm_rm == (reg))
 
-static bool has_notrack_prefix(struct insn *insn)
+static bool has_prefix(struct insn *insn, u8 prefix)
 {
 	int i;
 
 	for (i = 0; i < insn->prefixes.nbytes; i++) {
-		if (insn->prefixes.bytes[i] == 0x3e)
+		if (insn->prefixes.bytes[i] == prefix)
 			return true;
 	}
 
 	return false;
 }
 
+static bool has_notrack_prefix(struct insn *insn)
+{
+	return has_prefix(insn, 0x3e);
+}
+
+static bool has_gs_prefix(struct insn *insn)
+{
+	return has_prefix(insn, 0x65);
+}
+
 int arch_decode_instruction(struct objtool_file *file, const struct section *sec,
 			    unsigned long offset, unsigned int maxlen,
-			    unsigned int *len, enum insn_type *type,
-			    unsigned long *immediate,
-			    struct list_head *ops_list)
+			    struct instruction *instruction)
 {
+	struct list_head *ops_list = &instruction->stack_ops;
+	unsigned long *immediate = &instruction->immediate;
+	enum insn_type *type = &instruction->type;
+	unsigned int *len = &instruction->len;
+	const struct elf *elf = file->elf;
 	struct insn insn;
 	int x86_64, ret;
@@ -149,6 +162,9 @@ int arch_decode_instruction(struct objto
 	if (insn.vex_prefix.nbytes)
 		return 0;
 
+	if (has_gs_prefix(&insn))
+		instruction->alt_reloc_safe = 1;
+
 	prefix = insn.prefixes.bytes[0];
 
 	op1 = insn.opcode.bytes[0];
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -396,9 +396,7 @@ static int decode_instructions(struct ob
 
 			ret = arch_decode_instruction(file, sec, offset,
 						      sec->sh.sh_size - offset,
-						      &insn->len, &insn->type,
-						      &insn->immediate,
-						      &insn->stack_ops);
+						      insn);
 			if (ret)
 				goto err;
 
@@ -1620,7 +1618,7 @@ static int handle_group_alt(struct objto
 		 * accordingly.
 		 */
 		alt_reloc = insn_reloc(file, insn);
-		if (alt_reloc &&
+		if (alt_reloc && !insn->alt_reloc_safe &&
 		    !arch_support_alt_relocation(special_alt, insn, alt_reloc)) {
 
 			WARN_FUNC("unsupported relocation in alternatives section",
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -73,9 +73,7 @@ void arch_initial_func_cfi_state(struct
 
 int arch_decode_instruction(struct objtool_file *file, const struct section *sec,
 			    unsigned long offset, unsigned int maxlen,
-			    unsigned int *len, enum insn_type *type,
-			    unsigned long *immediate,
-			    struct list_head *ops_list);
+			    struct instruction *insn);
 
 bool arch_callee_saved_reg(unsigned char reg);
 
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -47,15 +47,17 @@ struct instruction {
 	unsigned long immediate;
 
 	u16 dead_end		: 1,
-	    ignore		: 1,
-	    ignore_alts		: 1,
-	    hint		: 1,
-	    save		: 1,
-	    restore		: 1,
-	    retpoline_safe	: 1,
-	    noendbr		: 1,
-	    entry		: 1;
-	/* 7 bit hole */
+	    ignore		: 1,
+	    ignore_alts		: 1,
+	    hint		: 1,
+	    save		: 1,
+	    restore		: 1,
+	    retpoline_safe	: 1,
+	    noendbr		: 1,
+	    entry		: 1,
+	    alt_reloc_safe	: 1;
+
+	/* 6 bit hole */
 
 	s8   instr;
 	u8   visited;
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt, "Peter Zijlstra (Intel)"
Subject: [patch 07/38] objtool: Track init section
Message-ID: <20220716230953.080514530@linutronix.de>
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:21 +0200 (CEST)

From: Peter Zijlstra

For future usage of .init.text exclusion, track the init section in the
instruction decoder and use the result in retpoline validation.
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Thomas Gleixner
---
 tools/objtool/check.c               | 17 ++++++++++-------
 tools/objtool/include/objtool/elf.h |  2 +-
 2 files changed, 11 insertions(+), 8 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -380,6 +380,15 @@ static int decode_instructions(struct ob
 		    !strncmp(sec->name, ".text.__x86.", 12))
 			sec->noinstr = true;
 
+		/*
+		 * .init.text code is ran before userspace and thus doesn't
+		 * strictly need retpolines, except for modules which are
+		 * loaded late, they very much do need retpoline in their
+		 * .init.text
+		 */
+		if (!strcmp(sec->name, ".init.text") && !opts.module)
+			sec->init = true;
+
 		for (offset = 0; offset < sec->sh.sh_size; offset += insn->len) {
 			insn = malloc(sizeof(*insn));
 			if (!insn) {
@@ -3720,13 +3729,7 @@ static int validate_retpoline(struct obj
 		if (insn->retpoline_safe)
 			continue;
 
-		/*
-		 * .init.text code is ran before userspace and thus doesn't
-		 * strictly need retpolines, except for modules which are
-		 * loaded late, they very much do need retpoline in their
-		 * .init.text
-		 */
-		if (!strcmp(insn->sec->name, ".init.text") && !opts.module)
+		if (insn->sec->init)
 			continue;
 
 		if (insn->type == INSN_RETURN) {
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -38,7 +38,7 @@ struct section {
 	Elf_Data *data;
 	char *name;
 	int idx;
-	bool changed, text, rodata, noinstr;
+	bool changed, text, rodata, noinstr, init;
 };
 
 struct symbol {
([23.128.96.19]:38992 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232978AbiGPXRi (ORCPT ); Sat, 16 Jul 2022 19:17:38 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 469FC22B18 for ; Sat, 16 Jul 2022 16:17:25 -0700 (PDT) Message-ID: <20220716230953.138004117@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013443; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=QvzD5+agpmv+z+EPnO126OtvixIotUEzcS1YTM2Sf54=; b=MkHx/IkFOkZ/D9EqEFMfURLeqDyMptI4jvdIi9x6qwssQXJU2ADy3Sj1ttt6G/E++K/sbL pa5LFvU156K5VBDuHZ1oOTsLNv0nS5HMP/iAmvLJAglaqvRhfO+/nP0R9ALVDukwUmlHUs J9i12CzVK617RllEgwsnrJPTGRnhBodzAKvwoMPpWUB3YaWGVUNtaA2wbick1P1ghqX4Cl evjqEtkpX65MkW61eqXpRIEBWRpj9fMpI3BiNcL2V4CfTljHpK2mIfn7V/ZsALFo8iFJ6E yrzbOhtHsx4MSr5WeO6z0JA+86vu7ZE3vmfIkn6uKTOs7+z26Coem8+6T9kv7g== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013443; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=QvzD5+agpmv+z+EPnO126OtvixIotUEzcS1YTM2Sf54=; b=MzsacAQSccZGhFmiqh4qA4460KcAtZYXekavesQt88ORHeNvVzZRXJmhFclaOFnaMLCRIN Z1+NOdYtx+wI5VAg== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. 
Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 08/38] objtool: Add .call_sites section References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:22 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra In preparation for call depth tracking provide a section which collects all direct calls. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner --- arch/x86/kernel/vmlinux.lds.S | 7 ++++ tools/objtool/check.c | 51 ++++++++++++++++++++++++++++++++ tools/objtool/include/objtool/objtool.h | 1 tools/objtool/objtool.c | 1 4 files changed, 60 insertions(+) --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -290,6 +290,13 @@ SECTIONS *(.return_sites) __return_sites_end = .; } + + . = ALIGN(8); + .call_sites : AT(ADDR(.call_sites) - LOAD_OFFSET) { + __call_sites = .; + *(.call_sites) + __call_sites_end = .; + } #endif #ifdef CONFIG_X86_KERNEL_IBT --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -898,6 +898,49 @@ static int create_mcount_loc_sections(st return 0; } +static int create_direct_call_sections(struct objtool_file *file) +{ + struct instruction *insn; + struct section *sec; + unsigned int *loc; + int idx; + + sec = find_section_by_name(file->elf, ".call_sites"); + if (sec) { + INIT_LIST_HEAD(&file->call_list); + WARN("file already has .call_sites section, skipping"); + return 0; + } + + if (list_empty(&file->call_list)) + return 0; + + idx = 0; + list_for_each_entry(insn, &file->call_list, call_node) + idx++; + + sec = elf_create_section(file->elf, ".call_sites", 0, sizeof(unsigned int), idx); + if (!sec) + return -1; + + idx = 0; + list_for_each_entry(insn, &file->call_list, call_node) { + + loc = (unsigned int *)sec->data->d_buf + idx; + memset(loc, 0, sizeof(unsigned int)); + + if (elf_add_reloc_to_insn(file->elf, sec, + idx * sizeof(unsigned int), + R_X86_64_PC32, + insn->sec, insn->offset)) + return -1; + + idx++; + } + + return 0; +} + /* * Warnings shouldn't be reported for ignored functions. */ @@ -1252,6 +1295,9 @@ static void annotate_call_site(struct ob return; } + if (insn->type == INSN_CALL && !insn->sec->init) + list_add_tail(&insn->call_node, &file->call_list); + if (!sibling && dead_end_function(file, sym)) insn->dead_end = true; } @@ -4275,6 +4321,11 @@ int check(struct objtool_file *file) if (ret < 0) goto out; warnings += ret; + + ret = create_direct_call_sections(file); + if (ret < 0) + goto out; + warnings += ret; } if (opts.mcount) { --- a/tools/objtool/include/objtool/objtool.h +++ b/tools/objtool/include/objtool/objtool.h @@ -28,6 +28,7 @@ struct objtool_file { struct list_head static_call_list; struct list_head mcount_loc_list; struct list_head endbr_list; + struct list_head call_list; bool ignore_unreachables, hints, rodata; unsigned int nr_endbr; --- a/tools/objtool/objtool.c +++ b/tools/objtool/objtool.c @@ -106,6 +106,7 @@ struct objtool_file *objtool_open_read(c INIT_LIST_HEAD(&file.static_call_list); INIT_LIST_HEAD(&file.mcount_loc_list); INIT_LIST_HEAD(&file.endbr_list); + INIT_LIST_HEAD(&file.call_list); file.ignore_unreachables = opts.no_unreachable; file.hints = false;
From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31125C433EF for ; Sat, 16 Jul 2022 23:17:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233051AbiGPXRw (ORCPT ); Sat, 16 Jul 2022 19:17:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
with ESMTP id S232878AbiGPXRi (ORCPT ); Sat, 16 Jul 2022 19:17:38 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 15E7522BCC for ; Sat, 16 Jul 2022 16:17:26 -0700 (PDT) Message-ID: <20220716230953.205004151@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013444; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=08QH2YM/1Di8RC5cyk9vD0xjmV190+dyfUw7UHxm3Ko=; b=vEi4ExZA1d0vxPBYQIeE4UnGrv0F59Oknn92PiWUPmxt3YpuKRIo8vtpjSKFn4wrA4Aqlf TyX6xBOTPA//coSn3ZAZgH6nKAw8/u+z7JFLNjcLvJVD3pEFYrMpFhr+3cSVrjUjYJy+Yi qYXpS1am6E+d1mgpo3J6gkkb9RgLlpT6E6YpFkC37p2v4NTRv8l9m/LBtpKnltMnHdC/o5 BA9jpi0o5A7lOydNKgcBqTUrH5IbuWEJkKC/POdIYhIlbu6cDgqDNLDMyAmva2X0xwcf4i B3l6gzQBxsCeGBrOcio9JtzAlWidomw0sDoW7vvONzhMaQmEpCZpLspNfI+oMg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013444; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=08QH2YM/1Di8RC5cyk9vD0xjmV190+dyfUw7UHxm3Ko=; b=oAXaQqRtn8Rcp1W7qeYV1HSmc730sVKlmHSRsOvMYoyEdrbWaaMIG7/OSYExnCOSVCkmKY I2HlxhAihSn4muDw== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. 
Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 09/38] objtool: Add .sym_sites section References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:24 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra In preparation for call depth tracking provide a section which collects all !init symbols to generate thunks for. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner --- arch/x86/kernel/vmlinux.lds.S | 7 +++++ tools/objtool/check.c | 55 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 62 insertions(+) --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -297,6 +297,13 @@ SECTIONS *(.call_sites) __call_sites_end = .; } + + . = ALIGN(8); + .sym_sites : AT(ADDR(.sym_sites) - LOAD_OFFSET) { + __sym_sites = .; + *(.sym_sites) + __sym_sites_end = .; + } #endif #ifdef CONFIG_X86_KERNEL_IBT --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -941,6 +941,56 @@ static int create_direct_call_sections(s return 0; } +static int create_sym_thunk_sections(struct objtool_file *file) +{ + struct section *sec, *s; + struct symbol *sym; + unsigned int *loc; + int idx; + + sec = find_section_by_name(file->elf, ".sym_sites"); + if (sec) { + INIT_LIST_HEAD(&file->call_list); + WARN("file already has .sym_sites section, skipping"); + return 0; + } + + idx = 0; + for_each_sec(file, s) { + if (!s->text || s->init) + continue; + + list_for_each_entry(sym, &s->symbol_list, list) + idx++; + } + + sec = elf_create_section(file->elf, ".sym_sites", 0, sizeof(unsigned int), idx); + if (!sec) + return -1; + + idx = 0; + for_each_sec(file, s) { + if (!s->text || s->init) + continue; + + list_for_each_entry(sym, &s->symbol_list, list) { + + loc = (unsigned int *)sec->data->d_buf + idx; + memset(loc, 0, sizeof(unsigned int)); + + if (elf_add_reloc_to_insn(file->elf, sec, + idx * sizeof(unsigned int), + R_X86_64_PC32, + s, sym->offset)) + return -1; + + idx++; + } + } + + return 0; +} + /* * Warnings shouldn't be reported for ignored functions. */ @@ -4326,6 +4376,11 @@ int check(struct objtool_file *file) if (ret < 0) goto out; warnings += ret; + + ret = create_sym_thunk_sections(file); + if (ret < 0) + goto out; + warnings += ret; } if (opts.mcount) {
From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B68FFC43334 for ; Sat, 16 Jul 2022 23:18:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233069AbiGPXR7 (ORCPT ); Sat, 16 Jul 2022 19:17:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39318 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232725AbiGPXRk (ORCPT ); Sat, 16 Jul 2022 19:17:40 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DBF7922BE7 for ; Sat, 16 Jul 2022 16:17:28 -0700 (PDT) Message-ID: <20220716230953.265188514@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013446; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=3supOg+sDQQOfSp9ifUSIygdjymw6D8ncsEppKfmHZA=; b=lprSnZOrYsVcKhla+weKD3GW8WaOjVFwhF5vQ7tqfZr7swIkUCxUaWUEmFfVrlVmf+P2Dl iBkimFBI6lVycjhXXYg3zoBkiocSn1AIOAl04A00I7oaarB1ExIr15g8LFG+UhXvxi4h60 rQzpNxoqDS0lQpA1mFgMXsjoWdSiFEnp07CX2uG3ezJJkdx1slb0RDpC7VG1YnfmnlIyZ1 VrtL2GWlu2Z6PDWQklZL5fkTL0O6djafREywpx18Ligr1JBEWofhHDTkEQ9I56OANqWhxD
Fo4Dij2OfRM9xFAFjFUgOeVOApqT9CKKcKI6uyH5hsTcuKnTrK9lKSesZPi6kA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013446; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=3supOg+sDQQOfSp9ifUSIygdjymw6D8ncsEppKfmHZA=; b=NYy2oXk2xptMAHSOgKlFJOeSqUAAOCZIiYPMmqpSGFiXoEU6nWjlfbJ1ADNlu0rCgNzxiL 0zGgfqqGJDYk+SBg== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 10/38] objtool: Add --hacks=skylake References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:25 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra Make the call/func sections selectable via the --hacks option. 
Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner --- scripts/Makefile.lib | 3 ++- tools/objtool/builtin-check.c | 7 ++++++- tools/objtool/check.c | 18 ++++++++++-------- tools/objtool/include/objtool/builtin.h | 1 + 4 files changed, 19 insertions(+), 10 deletions(-) --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -231,7 +231,8 @@ objtool := $(objtree)/tools/objtool/objt objtool_args = \ $(if $(CONFIG_HAVE_JUMP_LABEL_HACK), --hacks=jump_label) \ - $(if $(CONFIG_HAVE_NOINSTR_HACK), --hacks=noinstr) \ + $(if $(CONFIG_HAVE_NOINSTR_HACK), --hacks=noinstr) \ + $(if $(CONFIG_CALL_DEPTH_TRACKING), --hacks=skylake) \ $(if $(CONFIG_X86_KERNEL_IBT), --ibt) \ $(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount) \ $(if $(CONFIG_UNWINDER_ORC), --orc) \ --- a/tools/objtool/builtin-check.c +++ b/tools/objtool/builtin-check.c @@ -57,12 +57,17 @@ static int parse_hacks(const struct opti found = true; } + if (!str || strstr(str, "skylake")) { + opts.hack_skylake = true; + found = true; + } + return found ? 0 : -1; } const struct option check_options[] = { OPT_GROUP("Actions:"), - OPT_CALLBACK_OPTARG('h', "hacks", NULL, NULL, "jump_label,noinstr", "patch toolchain bugs/limitations", parse_hacks), + OPT_CALLBACK_OPTARG('h', "hacks", NULL, NULL, "jump_label,noinstr,skylake", "patch toolchain bugs/limitations", parse_hacks), OPT_BOOLEAN('i', "ibt", &opts.ibt, "validate and annotate IBT"), OPT_BOOLEAN('m', "mcount", &opts.mcount, "annotate mcount/fentry calls for ftrace"), OPT_BOOLEAN('n', "noinstr", &opts.noinstr, "validate noinstr rules"), --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -4372,15 +4372,17 @@ int check(struct objtool_file *file) goto out; warnings += ret; - ret = create_direct_call_sections(file); - if (ret < 0) - goto out; - warnings += ret; + if (opts.hack_skylake) { + ret = create_direct_call_sections(file); + if (ret < 0) + goto out; + warnings += ret; + ret = create_sym_thunk_sections(file); + if (ret < 0) + goto out; + warnings += ret; + } } if (opts.mcount) { --- a/tools/objtool/include/objtool/builtin.h +++ b/tools/objtool/include/objtool/builtin.h @@ -14,6 +14,7 @@ struct opts { bool dump_orc; bool hack_jump_label; bool hack_noinstr; + bool hack_skylake; bool ibt; bool mcount; bool noinstr;
From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 87459C433EF for ; Sat, 16 Jul 2022 23:17:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232279AbiGPXRz (ORCPT ); Sat, 16 Jul 2022 19:17:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38678 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233027AbiGPXRk
(ORCPT ); Sat, 16 Jul 2022 19:17:40 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0017C22BF8 for ; Sat, 16 Jul 2022 16:17:29 -0700 (PDT) Message-ID: <20220716230953.326426616@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013448; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=gtD1bYCGdHrGTPvqbbT1WQIkd3q5mX14d2yTGTsEWIY=; b=lrQq7zg12vfL/fN+uWktN47xduM1mb9mGBRJv3iMxtvBOgIMIhDsVpJkxRMmE2tf9NZTBT ZKGuaXKuEn0KXxwHmVi+tAO+zWMmNbQArFmCpxa9fF1V4uGRcQshj/kSZla/lsaBHEAetk uwC0ADPf7B27lXUJUueshlIaqi0gp9Khexp9baNI55GajgJ0eH0C6IoU7MwKZXD3htOCDg C7VQsv7X4eMPSgV1X9mv3coV2VYVk9N9H4QJHOL2l3gb9mOzNC9iXnig0D0YuI395ld1za jfXEqOajTsPvbTBF3JuyF0cB5BpmSjAhvjqRkIOXc+sFn/z1dqtzIO4U4DuE1w== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013448; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=gtD1bYCGdHrGTPvqbbT1WQIkd3q5mX14d2yTGTsEWIY=; b=hgFT98BVliI+NewUNQtK7EJ65UB1bY1yMvjI0qVxyd5uxSQhe9bl/1q9f/swB7jITiOGEE 5yJTOqW/Ptk2F4Dw== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. 
Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 11/38] objtool: Allow STT_NOTYPE -> STT_FUNC+0 tail-calls References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:27 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Allow STT_NOTYPE to tail-call into STT_FUNC; by definition STT_NOTYPE is not a sub-function of the STT_FUNC. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner --- tools/objtool/check.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-) --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1420,6 +1420,16 @@ static void add_return_call(struct objto static bool same_function(struct instruction *insn1, struct instruction *insn2) { + if (!insn1->func && !insn2->func) + return true; + + /* Allow STT_NOTYPE -> STT_FUNC+0 tail-calls */ + if (!insn1->func && insn1->func != insn2->func) + return false; + + if (!insn2->func) + return true; + return insn1->func->pfunc == insn2->func->pfunc; } @@ -1537,18 +1547,19 @@ static int add_jump_destinations(struct strstr(jump_dest->func->name, ".cold")) { insn->func->cfunc = jump_dest->func; jump_dest->func->pfunc = insn->func; - - } else if (!same_function(insn, jump_dest) && - is_first_func_insn(file, jump_dest)) { - /* - * Internal sibling call without reloc or with - * STT_SECTION reloc. - */ - add_call_dest(file, insn, jump_dest->func, true); - continue; } } + if (!same_function(insn, jump_dest) && + is_first_func_insn(file, jump_dest)) { + /* + * Internal sibling call without reloc or with + * STT_SECTION reloc.
+ */ + add_call_dest(file, insn, jump_dest->func, true); + continue; + } + insn->jump_dest =3D jump_dest; } From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79436C433EF for ; Sat, 16 Jul 2022 23:18:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232985AbiGPXSD (ORCPT ); Sat, 16 Jul 2022 19:18:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39378 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232965AbiGPXRl (ORCPT ); Sat, 16 Jul 2022 19:17:41 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 70D6423174 for ; Sat, 16 Jul 2022 16:17:32 -0700 (PDT) Message-ID: <20220716230953.385846695@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013449; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=HpmwuGNovGo47P809SY+5iM/RPE9nY5KEl2jrra4VzI=; b=skFOje2PrLR0pdzlqKwbqANdn8WzvZLOy8Phznr0Jo/t1Bocx31+VVWh9hN5dFsEpuXdff BCapJFj4JPfdDoKHCGo7zOrSX3qnMQLA1r+6B9M3+747uP1Q6e5cOVQDura2n1qjQSL8WP 1L9QwUyy2E0Q6hG+r9FAprnCBPyzSSkQ/E+LulTvNYhjBPwCEcA89Coh7DnKaUrcug6K3f 0i85XosdVhbFsSm0jAaV/DaQXER+fJJP3XqXfUTRN10eoBQvH96Dxoscm/CIlEA9YSX68/ 5dMqn6cgcmHOxGA5x2KvOhEsX4AntonvlOTfx3hYt0J4gtINm3xGXBq0wENGzA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013449; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=HpmwuGNovGo47P809SY+5iM/RPE9nY5KEl2jrra4VzI=; 
b=Y6ouY3ChRBWW4VLRhuyDj7T4uHASNMiMDjeRWDOZ05p/XoU+ZQBP6Z08vCEW75PuhBQ1mD rTjeAuMs85bj3TCg== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 12/38] x86/entry: Make sync_regs() invocation a tail call References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:29 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra No point in having a call there. Spare the call/ret overhead. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner --- arch/x86/entry/entry_64.S | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1060,11 +1060,8 @@ SYM_CODE_START_LOCAL(error_entry) UNTRAIN_RET =20 leaq 8(%rsp), %rdi /* arg0 =3D pt_regs pointer */ -.Lerror_entry_from_usermode_after_swapgs: - /* Put us onto the real thread stack. 
*/ - call sync_regs - RET + jmp sync_regs =20 /* * There are two places in the kernel that can potentially fault with @@ -1122,7 +1119,7 @@ SYM_CODE_START_LOCAL(error_entry) leaq 8(%rsp), %rdi /* arg0 =3D pt_regs pointer */ call fixup_bad_iret mov %rax, %rdi - jmp .Lerror_entry_from_usermode_after_swapgs + jmp sync_regs SYM_CODE_END(error_entry) =20 SYM_CODE_START_LOCAL(error_return) From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70E5CC433EF for ; Sat, 16 Jul 2022 23:18:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233080AbiGPXSG (ORCPT ); Sat, 16 Jul 2022 19:18:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233070AbiGPXRm (ORCPT ); Sat, 16 Jul 2022 19:17:42 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1767237DE for ; Sat, 16 Jul 2022 16:17:33 -0700 (PDT) Message-ID: <20220716230953.442937066@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013451; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=Ithq52Gx+iixg96k39i55L8lXk6+1urMKhPljXhhy5M=; b=RD2ZTbVGxpgYx2xHnJLhKdrR9TpRyheRmkAEzrEOEWkzDB4Ukw+t2vJTuhQO1WwKUXJo4q KbcJAeqltspsDr7gDwj64eWufE2h15/ccwSZpKV2wm2mUhh7O/GHjH6RbgTBT8Fwt33EVZ qezVc1BUikkMIM85qxt4UFPGfQ6jToBEqdNd+HXHVB6do5kLexhr4qR75u3VUNUqbfZ4df 29JO3pBgXM3rR+UoT3Yd/B5dD2rr9zcqD0lATCoVzGS1jai9u50OLpwn/AwO0e1tdmFF+5 Z3C9wBgZWfbuBIVHPdgyaMZLrO2qiICGYEGN+X/Q8RkDMAm/EdYAm89RceiyQw== DKIM-Signature: v=1; a=ed25519-sha256; 
c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013451; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=Ithq52Gx+iixg96k39i55L8lXk6+1urMKhPljXhhy5M=; b=RXE/Y2xi68Iu624evLhKupppsTCt/fYBAR++Lhsf2/IqVwseeEu1DuhI8cqxV+2yeznLln yj5jUB0y+f6HtfCA== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 13/38] x86/modules: Make module_alloc() generally available References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:30 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" module_alloc() allocates from the module region which is also available when CONFIG_MODULES=n. Non-module builds should be able to allocate from that region nevertheless, e.g. for creating call thunks. Split the code out and make it possible to select for !MODULES builds. Signed-off-by: Thomas Gleixner --- arch/x86/Kconfig | 3 ++ arch/x86/kernel/module.c | 58 ------------------------------------------- arch/x86/mm/Makefile | 2 + arch/x86/mm/module_alloc.c | 62 +++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 67 insertions(+), 58 deletions(-) --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2236,6 +2236,9 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING If unsure, leave at the default value.
+config MODULE_ALLOC + def_bool MODULES + config HOTPLUG_CPU def_bool y depends on SMP --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -8,21 +8,14 @@ #include #include -#include #include #include #include -#include #include -#include -#include #include -#include #include #include -#include -#include #include #if 0 @@ -36,57 +29,6 @@ do { \ } while (0) #endif -#ifdef CONFIG_RANDOMIZE_BASE -static unsigned long module_load_offset; - -/* Mutex protects the module_load_offset. */ -static DEFINE_MUTEX(module_kaslr_mutex); - -static unsigned long int get_module_load_offset(void) -{ - if (kaslr_enabled()) { - mutex_lock(&module_kaslr_mutex); - /* - * Calculate the module_load_offset the first time this - * code is called. Once calculated it stays the same until - * reboot. - */ - if (module_load_offset == 0) - module_load_offset = - (get_random_int() % 1024 + 1) * PAGE_SIZE; - mutex_unlock(&module_kaslr_mutex); - } - return module_load_offset; -} -#else -static unsigned long int get_module_load_offset(void) -{ - return 0; -} -#endif - -void *module_alloc(unsigned long size) -{ - gfp_t gfp_mask = GFP_KERNEL; - void *p; - - if (PAGE_ALIGN(size) > MODULES_LEN) - return NULL; - - p = __vmalloc_node_range(size, MODULE_ALIGN, - MODULES_VADDR + get_module_load_offset(), - MODULES_END, gfp_mask, PAGE_KERNEL, - VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK, - NUMA_NO_NODE, __builtin_return_address(0)); - - if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) { - vfree(p); - return NULL; - } - - return p; -} - #ifdef CONFIG_X86_32 int apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -59,3 +59,5 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_enc obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o + +obj-$(CONFIG_MODULE_ALLOC) += module_alloc.o --- /dev/null +++ b/arch/x86/mm/module_alloc.c @@ -0,0 +1,62 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +#include +#include +#include +#include +#include +#include + +#include +#include + +#ifdef CONFIG_RANDOMIZE_BASE +static unsigned long module_load_offset; + +/* Mutex protects the module_load_offset. */ +static DEFINE_MUTEX(module_kaslr_mutex); + +static unsigned long int get_module_load_offset(void) +{ + if (kaslr_enabled()) { + mutex_lock(&module_kaslr_mutex); + /* + * Calculate the module_load_offset the first time this + * code is called. Once calculated it stays the same until + * reboot. + */ + if (module_load_offset == 0) + module_load_offset = + (get_random_int() % 1024 + 1) * PAGE_SIZE; + mutex_unlock(&module_kaslr_mutex); + } + return module_load_offset; +} +#else +static unsigned long int get_module_load_offset(void) +{ + return 0; +} +#endif + +void *module_alloc(unsigned long size) +{ + gfp_t gfp_mask = GFP_KERNEL; + void *p; + + if (PAGE_ALIGN(size) > MODULES_LEN) + return NULL; + + p = __vmalloc_node_range(size, MODULE_ALIGN, + MODULES_VADDR + get_module_load_offset(), + MODULES_END, gfp_mask, PAGE_KERNEL, + VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK, + NUMA_NO_NODE, __builtin_return_address(0)); + + if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) { + vfree(p); + return NULL; + } + + return p; +}
From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5943FC43334 for ; Sat, 16 Jul 2022 23:18:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233145AbiGPXSN (ORCPT ); Sat, 16 Jul 2022 19:18:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39666 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232621AbiGPXRo (ORCPT ); Sat, 16 Jul 2022 19:17:44 -0400 Received: from galois.linutronix.de
(Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 123AA23BCD for ; Sat, 16 Jul 2022 16:17:34 -0700 (PDT) Message-ID: <20220716230953.500517388@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013452; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=QqAiRodBnzoFq0Uv7yhac721jXiH8ElvZylHXExmCIo=; b=rc1Ng1Bc0mSzCZ/ImhA9uf8toqbfvxb+B6ftlfcWG2JyuRJudFt2g7sSsCzqjtYkEj14DI QPvFMvBcJb6wRR9dqB9aD2kkVIg5F5z9es74kovLN9hxHidtN5xAH9bmqFnm6X4hHfMaHe GyyETPaiceyyDHB1Evwmp4Qyc4aHqJqIINX5CrbDAArAGTNzkzyAwKZIdF+6KRcCC7a4MP AMHMu/jbLzNJAuQJVaQxOZB3TTeX8yCJTzCewDNIrG+X2ndUPEPKOg65QPP4Idi8ZBdtJE AFdrioCjAKEUF1y2Ccxcg22gBstUtfw4QMup4ftqKotH9l/IOiBKUsOqzUMNyA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013452; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=QqAiRodBnzoFq0Uv7yhac721jXiH8ElvZylHXExmCIo=; b=h3EmeZaBUFDKVEHqcnwlhn/N/0VIM4S+ifGV9IOeuhi+KU5NpQAvw512pb2glCrTBLzpD9 g7STLvCrOC9Cz0Aw== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. 
Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 14/38] x86/Kconfig: Add CONFIG_CALL_THUNKS References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:32 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In preparation for mitigating the Intel SKL RSB underflow issue in software, add a new configuration symbol which allows building the required call thunk infrastructure conditionally. Signed-off-by: Thomas Gleixner --- arch/x86/Kconfig | 8 ++++++++ 1 file changed, 8 insertions(+) --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2439,6 +2439,14 @@ config CC_HAS_SLS config CC_HAS_RETURN_THUNK def_bool $(cc-option,-mfunction-return=thunk-extern) + +config HAVE_CALL_THUNKS + def_bool y + depends on RETHUNK && OBJTOOL + +config CALL_THUNKS + def_bool n + select MODULE_ALLOC + menuconfig SPECULATION_MITIGATIONS bool "Mitigations for speculative execution vulnerabilities" default y
From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 658C6C433EF for ; Sat, 16 Jul 2022 23:18:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233136AbiGPXSK (ORCPT ); Sat, 16 Jul 2022 19:18:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39052 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233026AbiGPXRv (ORCPT ); Sat, 16 Jul 2022 19:17:51 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1DEBA222B6 for ; Sat, 16 Jul 2022 16:17:37 -0700 (PDT) Message-ID: <20220716230953.558921214@linutronix.de>
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013454; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=38u937fwI+dedrnZIykHKHbxdYzLzIUKpcUX1dAfXdw=; b=z1m9Uz3ZPv3XGE2Kk3qMGigHQ181tfTp2tUkxLDVCbvdScqcqFpz0YjDpoUAXOWqZNv3xH KfUDwXf4Zf8s8tGQYAFZqPRaPTFCcsxAH8V3gih2u7AJHgsMRt0Prt/7WaisvbpG9upIK4 XNNxDxx39Q6AhI0DSE/Ea9gOAul4u4/P+tbc0mVgbeYtZjY4G+V624w0j5r45NTbwLNMo6 T6NzRKZtIenozgHnsxmx1EmFv2a3F9ZMTSytroy/XlT8nnHpysu8kbVsyGIgClG51RaaFh tYeuI18ip1a8FqWIEAm7DkHzzlwCuFg+xnzakpvNpkdJrVOaa2CGAVAUftKzAg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013454; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=38u937fwI+dedrnZIykHKHbxdYzLzIUKpcUX1dAfXdw=; b=lYfKPWqq4rcPPTgJ0f25hz/DFP5uvORIW8N12pn5I9dGQHSLWoczbHolBGXEJSzLnVJ3L4 81ygoox59GWTmDDQ== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 15/38] x86/retbleed: Add X86_FEATURE_CALL_DEPTH References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:34 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Intel SKL CPUs fall back to other predictors when the RSB underflows. The only microcode mitigation is IBRS which is insanely expensive. It comes with performance drops of up to 30% depending on the workload. A way less expensive, but nevertheless horrible mitigation is to track the call depth in software and overeagerly fill the RSB when returns underflow the software counter. 
Provide a configuration symbol and a CPU misfeature bit. Signed-off-by: Thomas Gleixner --- arch/x86/Kconfig | 13 +++++++++++++ arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/disabled-features.h | 9 ++++++++- 3 files changed, 22 insertions(+), 1 deletion(-) --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2498,6 +2498,19 @@ config CPU_UNRET_ENTRY help Compile the kernel with support for the retbleed=3Dunret mitigation. =20 +config CALL_DEPTH_TRACKING + bool "Mitigate RSB underflow with call depth tracking" + depends on CPU_SUP_INTEL && HAVE_CALL_THUNKS + select CALL_THUNKS + default y + help + Compile the kernel with call depth tracking to mitigate the Intel + SKL Return-Speculation-Buffer (RSB) underflow issue. The mitigation + is off by default and needs to be enabled on the kernel command line + via the retbleed=3Dstuff option. For non-affected systems the overhead + of this option is marginal as the call depth tracking is using + run-time generated call thunks and call patching. 
+ config CPU_IBPB_ENTRY bool "Enable IBPB on kernel entry" depends on CPU_SUP_AMD --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -302,6 +302,7 @@ #define X86_FEATURE_RETPOLINE_LFENCE (11*32+13) /* "" Use LFENCE for Spect= re variant 2 */ #define X86_FEATURE_RETHUNK (11*32+14) /* "" Use REturn THUNK */ #define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */ +#define X86_FEATURE_CALL_DEPTH (11*32+16) /* "" Call depth tracking for R= SB stuffing */ =20 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */ --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -69,6 +69,12 @@ # define DISABLE_UNRET (1 << (X86_FEATURE_UNRET & 31)) #endif =20 +#ifdef CONFIG_CALL_DEPTH_TRACKING +# define DISABLE_CALL_DEPTH_TRACKING 0 +#else +# define DISABLE_CALL_DEPTH_TRACKING (1 << (X86_FEATURE_CALL_DEPTH & 31)) +#endif + #ifdef CONFIG_INTEL_IOMMU_SVM # define DISABLE_ENQCMD 0 #else @@ -101,7 +107,8 @@ #define DISABLED_MASK8 (DISABLE_TDX_GUEST) #define DISABLED_MASK9 (DISABLE_SGX) #define DISABLED_MASK10 0 -#define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET) +#define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \ + DISABLE_CALL_DEPTH_TRACKING) #define DISABLED_MASK12 0 #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5BA8C43334 for ; Sat, 16 Jul 2022 23:18:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233157AbiGPXSP (ORCPT ); Sat, 16 Jul 2022 19:18:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39920 "EHLO lindbergh.monkeyblade.net" 
Message-ID: <20220716230953.619868339@linutronix.de> From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J.
Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 16/38] modules: Make struct module_layout unconditionally available References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:35 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" To simplify the upcoming call thunk code it's desired to expose struct module_layout even on !MODULES builds. This spares conditionals and #ifdeffery. Signed-off-by: Thomas Gleixner --- include/linux/module.h | 44 ++++++++++++++++++++++---------------------- 1 file changed, 22 insertions(+), 22 deletions(-) --- a/include/linux/module.h +++ b/include/linux/module.h @@ -67,6 +67,28 @@ struct module_version_attribute { const char *version; }; =20 +struct mod_tree_node { + struct module *mod; + struct latch_tree_node node; +}; + +struct module_layout { + /* The actual code + data. */ + void *base; + /* Total size. */ + unsigned int size; + /* The size of the executable code. */ + unsigned int text_size; + /* Size of RO section of the module (text+rodata) */ + unsigned int ro_size; + /* Size of RO after init section */ + unsigned int ro_after_init_size; + +#ifdef CONFIG_MODULES_TREE_LOOKUP + struct mod_tree_node mtn; +#endif +}; + extern ssize_t __modver_version_show(struct module_attribute *, struct module_kobject *, char *); =20 @@ -316,28 +338,6 @@ enum module_state { MODULE_STATE_UNFORMED, /* Still setting it up. */ }; =20 -struct mod_tree_node { - struct module *mod; - struct latch_tree_node node; -}; - -struct module_layout { - /* The actual code + data. */ - void *base; - /* Total size. */ - unsigned int size; - /* The size of the executable code.
*/ - unsigned int text_size; - /* Size of RO section of the module (text+rodata) */ - unsigned int ro_size; - /* Size of RO after init section */ - unsigned int ro_after_init_size; - -#ifdef CONFIG_MODULES_TREE_LOOKUP - struct mod_tree_node mtn; -#endif -}; - #ifdef CONFIG_MODULES_TREE_LOOKUP /* Only touch one cacheline for common rbtree-for-core-layout case. */ #define __module_layout_align ____cacheline_aligned From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52707C43334 for ; Sat, 16 Jul 2022 23:18:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233054AbiGPXST (ORCPT ); Sat, 16 Jul 2022 19:18:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233089AbiGPXRw (ORCPT ); Sat, 16 Jul 2022 19:17:52 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 150A922530 for ; Sat, 16 Jul 2022 16:17:40 -0700 (PDT) Message-ID: <20220716230953.680326814@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013457; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=Gik5ycS44UWfcpIctkiTE3e/YDYlUOruM5e3iQXUL0g=; b=X807HrVwl4RC2I76VLm1mwNFcLukNwIJy98zK5GuKfgtZf78axpgJZQWCpN+TOlzWC7ftR LtYqIgn/J1aBEXXH103KQVkF9zwYkdibcM35upniyt2knLFbZWZ06H4RG5Lln3768u5Gw2 Y2SRVNMu6fHvgffnewmw86eCkEfstW6tYMlutpTFZsAUF5ueHmuoiFAm5ORKmqLBAHHZjo zBCMeLvQGzJ/NHyNgtttC4hKxegc0OHeUmELscK+O5/7B/ClV7qUUhDmnJk2aNIkYZk+th dWgiAqX1xI2XjVXT3t0th/uszclhma45tc8XZD4rHeubm7Usb8hEeJ0JN7YdMQ== DKIM-Signature: v=1; 
From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 17/38] module: Add arch_data to module_layout References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:37 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" For the upcoming call depth tracking it's required to store extra information in the module layout. Add a pointer.
Signed-off-by: Thomas Gleixner --- include/linux/module.h | 3 +++ 1 file changed, 3 insertions(+) --- a/include/linux/module.h +++ b/include/linux/module.h @@ -87,6 +87,9 @@ struct module_layout { #ifdef CONFIG_MODULES_TREE_LOOKUP struct mod_tree_node mtn; #endif +#ifdef CONFIG_CALL_THUNKS + void *arch_data; +#endif }; =20 extern ssize_t __modver_version_show(struct module_attribute *, From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 03197C43334 for ; Sat, 16 Jul 2022 23:18:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233029AbiGPXSh (ORCPT ); Sat, 16 Jul 2022 19:18:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231490AbiGPXR7 (ORCPT ); Sat, 16 Jul 2022 19:17:59 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0364D22BCC for ; Sat, 16 Jul 2022 16:17:44 -0700 (PDT) Message-ID: <20220716230953.738288030@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013459; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=sFncniQWCWFsoBIPdxcaL179NgZVykSQzUetuwoc3AM=; b=PL93QLSh3uV4HEBJbTcb7BC8MtTc9SjR+uT3D8LvzsJn8zwmSgojcfEEozpAFiOuP5zxBZ bqpqyMDtE1t+72rRcKCL87pt01ejps5pRPj+MmPqRq4BDGsGDWI/6VQ7ztBgh0L/y5R7Vn 7amUunaJxj+O2qHlkxajDijm3UsWHaIttoMBWbZ+erWBREx9sa+Mcuw7LkjUV6zfV4Up15 YuF90HLR8XTGv+Yisk/FRG8eYKciQ0YbGFhfiSBZenEMaCbaHeG5lb0eA54QkIPmzYVSPz 52cw72YkHQvzbxSI1ju1UpHAUMIibcQmg0P6s9SwhZ1MyfMFvKSZbeoylb+2yQ== DKIM-Signature: v=1; a=ed25519-sha256; 
c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013459; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=sFncniQWCWFsoBIPdxcaL179NgZVykSQzUetuwoc3AM=; b=OK/agc9RiD69sFyUzsaN7zbo1HHl/VF89+eGNZcPi5/CX/96OhlLPD2zF+pvCRssvY0yNk Mv0oCBIRK5BVs9AQ== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 18/38] mm/vmalloc: Provide huge page mappings References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:38 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra Provide VM_HUGE_VMAP, which unconditionally tries to use huge mappings. Unlike VM_ALLOW_HUGE_VMAP it doesn't care about the number of NUMA nodes or the size of the allocation. If the page allocator fails to provide huge pages, it will silently fall back. 
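The flag semantics can be condensed into a small decision function mirroring the patched `__vmalloc_node_range()` logic in the diff below. This is a simplified sketch: the real code also consults `arch_vmap_pmd_supported()` and `arch_vmap_pte_supported_shift()`, which are collapsed here into fixed shifts.

```c
#include <assert.h>

#define VM_ALLOW_HUGE_VMAP	0x00000400UL	/* huge pages if the size heuristic allows */
#define VM_HUGE_VMAP		0x00000800UL	/* force huge pages unconditionally */
#define PAGE_SHIFT	12
#define PMD_SHIFT	21
#define PMD_SIZE	(1UL << PMD_SHIFT)

/* Pick the mapping granularity for a vmalloc request. */
static unsigned int mapping_shift(unsigned long vm_flags, unsigned long size,
				  unsigned int nr_nodes)
{
	if (vm_flags & VM_ALLOW_HUGE_VMAP) {
		/* Old behaviour: only go huge when each node gets >= PMD_SIZE */
		unsigned long size_per_node = size / nr_nodes;

		return size_per_node >= PMD_SIZE ? PMD_SHIFT : PAGE_SHIFT;
	}
	if (vm_flags & VM_HUGE_VMAP)
		return PMD_SHIFT;	/* new flag: huge regardless of size or nodes */

	return PAGE_SHIFT;
}
```

The point of the new flag shows up in the first test case: even a 4 KiB request gets a PMD-sized mapping when VM_HUGE_VMAP is set, whereas VM_ALLOW_HUGE_VMAP keeps the per-node size heuristic.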
Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner --- include/linux/vmalloc.h | 3 ++- mm/vmalloc.c | 33 +++++++++++++++++++-------------- 2 files changed, 21 insertions(+), 15 deletions(-) --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -27,10 +27,11 @@ struct notifier_block; /* in notifier.h #define VM_FLUSH_RESET_PERMS 0x00000100 /* reset direct map and flush TLB = on unmap, can't be freed in atomic context */ #define VM_MAP_PUT_PAGES 0x00000200 /* put pages and free array in vfree */ #define VM_ALLOW_HUGE_VMAP 0x00000400 /* Allow for huge pages on arch= s with HAVE_ARCH_HUGE_VMALLOC */ +#define VM_HUGE_VMAP 0x00000800 /* Force for huge pages on archs wit= h HAVE_ARCH_HUGE_VMALLOC */ =20 #if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \ !defined(CONFIG_KASAN_VMALLOC) -#define VM_DEFER_KMEMLEAK 0x00000800 /* defer kmemleak object creation */ +#define VM_DEFER_KMEMLEAK 0x00001000 /* defer kmemleak object creation */ #else #define VM_DEFER_KMEMLEAK 0 #endif --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3099,23 +3099,28 @@ void *__vmalloc_node_range(unsigned long return NULL; } =20 - if (vmap_allow_huge && (vm_flags & VM_ALLOW_HUGE_VMAP)) { - unsigned long size_per_node; + if (vmap_allow_huge && (vm_flags & (VM_HUGE_VMAP|VM_ALLOW_HUGE_VMAP))) { =20 - /* - * Try huge pages. Only try for PAGE_KERNEL allocations, - * others like modules don't yet expect huge pages in - * their allocations due to apply_to_page_range not - * supporting them. - */ + if (vm_flags & VM_ALLOW_HUGE_VMAP) { + unsigned long size_per_node; =20 - size_per_node =3D size; - if (node =3D=3D NUMA_NO_NODE) - size_per_node /=3D num_online_nodes(); - if (arch_vmap_pmd_supported(prot) && size_per_node >=3D PMD_SIZE) + /* + * Try huge pages. Only try for PAGE_KERNEL allocations, + * others like modules don't yet expect huge pages in + * their allocations due to apply_to_page_range not + * supporting them. 
+ */ + + size_per_node =3D size; + if (node =3D=3D NUMA_NO_NODE) + size_per_node /=3D num_online_nodes(); + if (arch_vmap_pmd_supported(prot) && size_per_node >=3D PMD_SIZE) + shift =3D PMD_SHIFT; + else + shift =3D arch_vmap_pte_supported_shift(size_per_node); + } else { shift =3D PMD_SHIFT; - else - shift =3D arch_vmap_pte_supported_shift(size_per_node); + } =20 align =3D max(real_align, 1UL << shift); size =3D ALIGN(real_size, 1UL << shift); From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E54C1C433EF for ; Sat, 16 Jul 2022 23:18:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232920AbiGPXSp (ORCPT ); Sat, 16 Jul 2022 19:18:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40222 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233100AbiGPXR6 (ORCPT ); Sat, 16 Jul 2022 19:17:58 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E94B522B2E for ; Sat, 16 Jul 2022 16:17:42 -0700 (PDT) Message-ID: <20220716230953.797450674@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013461; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=Gd8NwNu0ThhW0QIu3FBYrL07cfGnrA/dNAI5sapg5h4=; b=0O24DsJodMkB5DrVniuZ4ogXsrq9F1bYco2YpjBQGKf+eecMM8rY85pVx1I/GadFtI62gr dTM8eMRJeDwEw7lwUNNrPzWOp7GYZ8jfEPKhNyZSSgN8jrekY2JSr60ANa2pakJL2EAT4e X2q824R+a5cANkpk9QvYidykE69ATDuY8zf9WI/ycMH4Z4hzInpgXan320MlWmMWmrzCou xexzhHemoJ5jtG7kn5Z/U7R/HNif/qplGITBm6aZ31YFp2l1Dm9GnwXAyrjW+2NkpZm9OH 
Wr/PWb3p9A5dzY+JzIAK0IKa5nxYp3qfNHTs3jY6ioajGrf5Ds1mmUsgDJXOaA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013461; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=Gd8NwNu0ThhW0QIu3FBYrL07cfGnrA/dNAI5sapg5h4=; b=gWxtHbQ2jxPqv72i7D8iswJsg0w+AneZOt2jcgCU3+AzgemDt1n5OCvA9fGElCshbOCGkr lplWgFJrplByDHAA== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 19/38] x86/module: Provide __module_alloc() References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:40 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Provide a function to allocate from module space with large TLBs. This is required for callthunks as otherwise the ITLB pressure kills performance. 
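The ITLB argument is easy to quantify: covering a thunk area of roughly 2 MB (patch 23's estimate for a Debian config) takes hundreds of 4 KiB TLB entries but a single 2 MiB entry. A trivial check of that arithmetic:

```c
#include <assert.h>

#define SZ_4K	(4UL << 10)
#define SZ_2M	(2UL << 20)

/* Number of page-table/TLB entries needed to map 'size' bytes. */
static unsigned long entries_needed(unsigned long size, unsigned long page_size)
{
	return (size + page_size - 1) / page_size;	/* round up */
}
```

Since every patched call site jumps through the thunk area, those 512 potential 4 KiB ITLB entries would be hot on every call, which is why the allocation uses large mappings.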
Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/module.h | 2 ++ arch/x86/mm/module_alloc.c | 10 ++++++++-- 2 files changed, 10 insertions(+), 2 deletions(-) --- a/arch/x86/include/asm/module.h +++ b/arch/x86/include/asm/module.h @@ -13,4 +13,6 @@ struct mod_arch_specific { #endif }; =20 +extern void *__module_alloc(unsigned long size, unsigned long vmflags); + #endif /* _ASM_X86_MODULE_H */ --- a/arch/x86/mm/module_alloc.c +++ b/arch/x86/mm/module_alloc.c @@ -39,7 +39,7 @@ static unsigned long int get_module_load } #endif =20 -void *module_alloc(unsigned long size) +void *__module_alloc(unsigned long size, unsigned long vmflags) { gfp_t gfp_mask =3D GFP_KERNEL; void *p; @@ -47,10 +47,11 @@ void *module_alloc(unsigned long size) if (PAGE_ALIGN(size) > MODULES_LEN) return NULL; =20 + vmflags |=3D VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK; p =3D __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR + get_module_load_offset(), MODULES_END, gfp_mask, PAGE_KERNEL, - VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK, + vmflags, NUMA_NO_NODE, __builtin_return_address(0)); =20 if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) { @@ -60,3 +61,8 @@ void *module_alloc(unsigned long size) =20 return p; } + +void *module_alloc(unsigned long size) +{ + return __module_alloc(size, 0); +} From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 256FEC433EF for ; Sat, 16 Jul 2022 23:18:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232125AbiGPXSe (ORCPT ); Sat, 16 Jul 2022 19:18:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232810AbiGPXSA (ORCPT ); Sat, 16 Jul 2022 19:18:00 -0400 Received: from 
galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A191222BD3 for ; Sat, 16 Jul 2022 16:17:44 -0700 (PDT) Message-ID: <20220716230953.858048083@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013462; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=Jzk1Ro7moQj4/ZNdqGDe6vOHP2eSGjateMVY4xuPVLw=; b=ABOqAJ8ZC4ybwE0awspMk33HahjgcjGnviUBghK+dC9qnrV8SnCm6lTrZMxLuZYupBjdo5 Hzl79sPZcu8H5By4Ta/YWxKPOR98XlsQNkXAwNbu6+nGGB3HtP4RbT6urZBuBh+Q/byCMC c8e1nlgTJBOIxuehEm3qtVTaUogEwUOBFtXCJQp8iwlrQi14uhorOuxqPX/fRn0B+XcsCb zeYe6VWHF8ime4A3WbJQv1cnXneJM6BJWPSPA+KtZ5Tv0dLRsd+aBRz9xd3zBOn58vs+up cwfBwHa5E1lyiYppeElTRdCes0zp6ZlspRAuR6u9CfJkCNaZVui9SP5l3fOhkQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013462; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=Jzk1Ro7moQj4/ZNdqGDe6vOHP2eSGjateMVY4xuPVLw=; b=5EUd6EGPJWOJ58Rki0DfMwZwyqF9oxESgWMLkJkvEILhwWyMNNcx9fj+0aX+ajxMHp4A2v EBtK/o0Q4h6wJHAg== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 20/38] x86/alternatives: Provide text_poke_[copy|set]_locked() References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:42 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The upcoming call thunk patching must hold text_mutex and needs access to text_poke_copy() and text_poke_set(), which take text_mutex. 
Provide _locked postfixed variants to expose the inner workings. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/text-patching.h | 2 + arch/x86/kernel/alternative.c | 48 +++++++++++++++++++++---------= ----- 2 files changed, 32 insertions(+), 18 deletions(-) --- a/arch/x86/include/asm/text-patching.h +++ b/arch/x86/include/asm/text-patching.h @@ -45,6 +45,8 @@ extern void *text_poke(void *addr, const extern void text_poke_sync(void); extern void *text_poke_kgdb(void *addr, const void *opcode, size_t len); extern void *text_poke_copy(void *addr, const void *opcode, size_t len); +extern void *text_poke_copy_locked(void *addr, const void *opcode, size_t = len); +extern void *text_poke_set_locked(void *addr, int c, size_t len); extern void *text_poke_set(void *addr, int c, size_t len); extern int poke_int3_handler(struct pt_regs *regs); extern void text_poke_bp(void *addr, const void *opcode, size_t len, const= void *emulate); --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -1225,6 +1225,26 @@ void *text_poke_kgdb(void *addr, const v return __text_poke(text_poke_memcpy, addr, opcode, len); } =20 +void *text_poke_copy_locked(void *addr, const void *opcode, size_t len) +{ + unsigned long start =3D (unsigned long)addr; + size_t patched =3D 0; + + if (WARN_ON_ONCE(core_kernel_text(start))) + return NULL; + + while (patched < len) { + unsigned long ptr =3D start + patched; + size_t s; + + s =3D min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched); + + __text_poke(text_poke_memcpy, (void *)ptr, opcode + patched, s); + patched +=3D s; + } + return addr; +} + /** * text_poke_copy - Copy instructions into (an unused part of) RX memory * @addr: address to modify @@ -1239,23 +1259,29 @@ void *text_poke_kgdb(void *addr, const v */ void *text_poke_copy(void *addr, const void *opcode, size_t len) { + mutex_lock(&text_mutex); + addr =3D text_poke_copy_locked(addr, opcode, len); + mutex_unlock(&text_mutex); + return addr; +} + +void 
*text_poke_set_locked(void *addr, int c, size_t len) +{ unsigned long start =3D (unsigned long)addr; size_t patched =3D 0; =20 if (WARN_ON_ONCE(core_kernel_text(start))) return NULL; =20 - mutex_lock(&text_mutex); while (patched < len) { unsigned long ptr =3D start + patched; size_t s; =20 s =3D min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched); =20 - __text_poke(text_poke_memcpy, (void *)ptr, opcode + patched, s); + __text_poke(text_poke_memset, (void *)ptr, (void *)&c, s); patched +=3D s; } - mutex_unlock(&text_mutex); return addr; } =20 @@ -1270,22 +1296,8 @@ void *text_poke_copy(void *addr, const v */ void *text_poke_set(void *addr, int c, size_t len) { - unsigned long start =3D (unsigned long)addr; - size_t patched =3D 0; - - if (WARN_ON_ONCE(core_kernel_text(start))) - return NULL; - mutex_lock(&text_mutex); - while (patched < len) { - unsigned long ptr =3D start + patched; - size_t s; - - s =3D min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched); - - __text_poke(text_poke_memset, (void *)ptr, (void *)&c, s); - patched +=3D s; - } + addr =3D text_poke_set_locked(addr, c, len); mutex_unlock(&text_mutex); return addr; } From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C855C433EF for ; Sat, 16 Jul 2022 23:18:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233216AbiGPXSt (ORCPT ); Sat, 16 Jul 2022 19:18:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39874 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233130AbiGPXSC (ORCPT ); Sat, 16 Jul 2022 19:18:02 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58CFE2409C for ; 
Sat, 16 Jul 2022 16:17:49 -0700 (PDT) Message-ID: <20220716230953.918740143@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013464; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=/h0ZE3KOnMvzgiDFOYx6/4OXuRqRs3WRg7RaeD3nrMY=; b=A9iZ8Od0A9eYHT4eukFRfhAJ79PiHYnUG7+WTvD0VkBMWvNFZTTRtnJZnH1lTMZrTW+mso eyz+x3IkDHY1jPe8Sg8XnnIUnqOZhbWI+IAjQ2ytxiCZ5lmbXESygdD7QFuIxEXKHe9pau fcjWp3hBTqaNLJpE0cpeNpGD88ODnmrCOil9AXb1h/aWJSM2JGf4dHrq09ppgSP+Ls7xPg p+ohZTC9PqKDb6etzmMsYhXs1Gi6aFrePhZbYmauTsQGOyOAfvQN7CnaJxcvhwtKXfr5jo MG0ptkQJ/waBXBldXXXLjLG6F4YtUR7zasHR5quzAPUKfqHq5hi3erLOuNHMpA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013464; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=/h0ZE3KOnMvzgiDFOYx6/4OXuRqRs3WRg7RaeD3nrMY=; b=aUPrudRtQHSyrbnyMPkrBQnTg0pB/OEBjtFWxlSDNsjncYJSExM7N8fNA5Ru2E92ORvC4A fM04bIPi4f46RvCg== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 21/38] x86/entry: Make some entry symbols global References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:43 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" paranoid_entry(), error_entry() and xen_error_entry() have to be exempted from call accounting by thunk patching because they are before UNTRAIN_RET. Expose them so they are available in the alternative code. 
Signed-off-by: Thomas Gleixner --- arch/x86/entry/entry_64.S | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -326,7 +326,8 @@ SYM_CODE_END(ret_from_fork) #endif .endm =20 -SYM_CODE_START_LOCAL(xen_error_entry) +SYM_CODE_START(xen_error_entry) + ANNOTATE_NOENDBR UNWIND_HINT_FUNC PUSH_AND_CLEAR_REGS save_ret=3D1 ENCODE_FRAME_POINTER 8 @@ -904,7 +905,8 @@ SYM_CODE_END(xen_failsafe_callback) * R14 - old CR3 * R15 - old SPEC_CTRL */ -SYM_CODE_START_LOCAL(paranoid_entry) +SYM_CODE_START(paranoid_entry) + ANNOTATE_NOENDBR UNWIND_HINT_FUNC PUSH_AND_CLEAR_REGS save_ret=3D1 ENCODE_FRAME_POINTER 8 @@ -1039,7 +1041,8 @@ SYM_CODE_END(paranoid_exit) /* * Switch GS and CR3 if needed. */ -SYM_CODE_START_LOCAL(error_entry) +SYM_CODE_START(error_entry) + ANNOTATE_NOENDBR UNWIND_HINT_FUNC =20 PUSH_AND_CLEAR_REGS save_ret=3D1 From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AF19AC43334 for ; Sat, 16 Jul 2022 23:18:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232987AbiGPXSz (ORCPT ); Sat, 16 Jul 2022 19:18:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38702 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233128AbiGPXSC (ORCPT ); Sat, 16 Jul 2022 19:18:02 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9762523BF1 for ; Sat, 16 Jul 2022 16:17:49 -0700 (PDT) Message-ID: <20220716230953.977047812@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013465; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=oPPu+jFTcgeESCfOqjttTTcYWEzNGD5UiaAKckbHmr0=; b=pIV9Yx9mr7gx9fADTXXrnpkLGro7vX+pdJsuS/V4ImbfKmT/4IYDH0nMKD8EQsr3cyWByN nbGSlcu2S3JV7Z9Z2xq3EV16WiBKO9Tj0qT0hYqqQIgXLbSSqwFEf1JMiCUYBwB+v09Agr 53N6Kd094/Jw555/M/KAfP3BvkwaZUjq4mz4Q/fn+WrJnco8EMfMvHaNrOdlGMQeSDeXh2 oeZT4TUSAK9FYJ5093NRAMLHCf3v2toOlpQ54hC+Mp6IJXotQKHpcr4gHXXlS2qHAxoux1 EquiMkZ6DX8Xty2zX2YUXONfRy2JSatfpKmKpFqN7wduwZzNMrTQWWkTeJ3jmA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013465; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=oPPu+jFTcgeESCfOqjttTTcYWEzNGD5UiaAKckbHmr0=; b=i/kNQGMQ1jH4nLjWWcOmPYC1HlRpD49P2EDwpgi0S2rlHT5JqmIlI5vVNR90A33ob7uFrq gVpEgyyvvchGY+Bw== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 22/38] x86/paravirt: Make struct paravirt_call_site unconditionally available References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:45 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" For the upcoming call thunk patching it's less ifdeffery when the data structure is unconditionally available. The code can then be trivially fenced off with IS_ENABLED(). 
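The IS_ENABLED() pattern the changelog alludes to looks like this. The macro below is a stub standing in for the kernel's real IS_ENABLED(), which expands CONFIG_* preprocessor tokens; the point is that since the struct is now always visible, callers compile unconditionally and the compiler eliminates the dead branch.

```c
#include <assert.h>

/* Stub for the kernel's IS_ENABLED(CONFIG_x) macro (sketch only). */
#define IS_ENABLED(cfg)	(cfg)
#define CONFIG_PARAVIRT	1	/* pretend paravirt is enabled */

/* Always visible now, even on !CONFIG_PARAVIRT builds. */
struct paravirt_patch_site {
	unsigned char *instr;	/* original instructions */
	unsigned char type;	/* type of this instruction */
	unsigned char len;	/* length of original instruction */
};

/* Compiles unconditionally; the early return is dead code when paravirt
 * is off, so no #ifdef is needed around the caller. */
static int count_patch_sites(struct paravirt_patch_site *start,
			     struct paravirt_patch_site *end)
{
	int n = 0;

	if (!IS_ENABLED(CONFIG_PARAVIRT))
		return 0;

	for (struct paravirt_patch_site *p = start; p < end; p++)
		n++;
	return n;
}
```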
Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/paravirt.h       |  4 ++--
 arch/x86/include/asm/paravirt_types.h | 20 ++++++++++++--------
 2 files changed, 14 insertions(+), 10 deletions(-)

--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -4,13 +4,13 @@
 /* Various instructions on x86 need to be replaced for
  * para-virtualization: those hooks are defined here. */

+#include
+
 #ifdef CONFIG_PARAVIRT
 #include
 #include
 #include

-#include
-
 #ifndef __ASSEMBLY__
 #include
 #include
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -2,6 +2,17 @@
 #ifndef _ASM_X86_PARAVIRT_TYPES_H
 #define _ASM_X86_PARAVIRT_TYPES_H

+#ifndef __ASSEMBLY__
+/* These all sit in the .parainstructions section to tell us what to patch. */
+struct paravirt_patch_site {
+	u8 *instr;	/* original instructions */
+	u8 type;	/* type of this instruction */
+	u8 len;		/* length of original instruction */
+};
+#endif
+
+#ifdef CONFIG_PARAVIRT
+
 /* Bitmask of what can be clobbered: usually at least eax. */
 #define CLBR_EAX  (1 << 0)
 #define CLBR_ECX  (1 << 1)
@@ -584,16 +595,9 @@ unsigned long paravirt_ret0(void);

 #define paravirt_nop	((void *)_paravirt_nop)

-/* These all sit in the .parainstructions section to tell us what to patch. */
-struct paravirt_patch_site {
-	u8 *instr;	/* original instructions */
-	u8 type;	/* type of this instruction */
-	u8 len;		/* length of original instruction */
-};
-
 extern struct paravirt_patch_site __parainstructions[],
 	__parainstructions_end[];

 #endif	/* __ASSEMBLY__ */
-
+#endif	/* CONFIG_PARAVIRT */
 #endif	/* _ASM_X86_PARAVIRT_TYPES_H */

From nobody Sat Apr 18 05:54:57 2026
Message-ID: <20220716230954.036332074@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt
Subject: [patch 23/38] x86/callthunks: Add call patching for call depth tracking
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:46 +0200 (CEST)

Mitigating the Intel SKL RSB underflow issue in software requires tracking
the call depth. This could be done with help of the compiler by adding at
least 7 bytes of NOPs before every direct call, which amounts to a 15+
percent text size increase for a vmlinux built with a Debian kernel config.

While CPUs are quite efficient at ignoring NOPs, this still is a massive
penalty in terms of I-cache for all CPUs which do not have this issue.

Inflict the pain only on SKL CPUs by creating call thunks for each function
and patching the calls to invoke the thunks instead. The thunks are created
in module memory to stay within the 32bit displacement boundary. The thunk
then does:

    ACCOUNT_DEPTH
    JMP function

The function and call site lists are generated by objtool. The memory
requirement is 16 bytes per call thunk plus btree memory for keeping track
of them. For a Debian distro config this amounts to ~1.6MB of thunk memory
and 2MB of btree storage. This is only required when call depth tracking is
enabled on the kernel command line, so the burden is solely on SKL[-X].
The thunks are all stored in one 2MB memory region which is mapped with a
large TLB entry to prevent ITLB pressure. The thunks are generated from a
template and the btree is used to store them by destination address. The
actual call patching retrieves the thunks from the btree and replaces the
original function call by a call to the thunk.

Module handling and the actual thunk code for SKL will be added in
subsequent steps.

Signed-off-by: Thomas Gleixner
---
 arch/x86/Kconfig                   |  13 +
 arch/x86/include/asm/alternative.h |  13 +
 arch/x86/kernel/Makefile           |   2
 arch/x86/kernel/alternative.c      |   6
 arch/x86/kernel/callthunks.c       | 459 +++++++++++++++++++++++++++++++++
 5 files changed, 493 insertions(+)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -125,6 +125,7 @@ config X86
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANTS_THP_SWAP		if X86_64
 	select ARCH_HAS_PARANOID_L1D_FLUSH
+	select BTREE				if CALL_DEPTH_TRACKING
 	select BUILDTIME_TABLE_SORT
 	select CLKEVT_I8253
 	select CLOCKSOURCE_VALIDATE_LAST_CYCLE
@@ -2511,6 +2512,18 @@ config CALL_DEPTH_TRACKING
 	  of this option is marginal as the call depth tracking is using
 	  run-time generated call thunks and call patching.

+config CALL_THUNKS_DEBUG
+	bool "Enable call thunks and call depth tracking debugging"
+	depends on CALL_DEPTH_TRACKING
+	default n
+	help
+	  Enable call/ret counters for imbalance detection and build in
+	  a noisy dmesg about callthunks generation and call patching for
+	  trouble shooting. The debug prints need to be enabled on the
+	  kernel command line with 'debug-callthunks'.
+	  Only enable this when you are debugging call thunks as this
+	  creates a noticeable runtime overhead. If unsure say N.
+
 config CPU_IBPB_ENTRY
 	bool "Enable IBPB on kernel entry"
 	depends on CPU_SUP_AMD
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -80,6 +80,19 @@ extern void apply_returns(s32 *start, s3
 extern void apply_ibt_endbr(s32 *start, s32 *end);

 struct module;
+struct paravirt_patch_site;
+
+struct callthunk_sites {
+	s32			*syms_start, *syms_end;
+	s32			*call_start, *call_end;
+	struct paravirt_patch_site *pv_start, *pv_end;
+};
+
+#ifdef CONFIG_CALL_THUNKS
+extern void callthunks_patch_builtin_calls(void);
+#else
+static __always_inline void callthunks_patch_builtin_calls(void) {}
+#endif

 #ifdef CONFIG_SMP
 extern void alternatives_smp_module_add(struct module *mod, char *name,
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -141,6 +141,8 @@ obj-$(CONFIG_UNWINDER_GUESS)	+= unwind_

 obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= sev.o

+obj-$(CONFIG_CALL_THUNKS)	+= callthunks.o
+
 ###
 # 64 bit specific files
 ifeq ($(CONFIG_X86_64),y)
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -936,6 +936,12 @@ void __init alternative_instructions(voi
 	 */
 	apply_alternatives(__alt_instructions, __alt_instructions_end);

+	/*
+	 * Now all calls are established. Apply the call thunks if
+	 * required.
+	 */
+	callthunks_patch_builtin_calls();
+
 	apply_ibt_endbr(__ibt_endbr_seal, __ibt_endbr_seal_end);

 #ifdef CONFIG_SMP
--- /dev/null
+++ b/arch/x86/kernel/callthunks.c
@@ -0,0 +1,459 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define pr_fmt(fmt) "callthunks: " fmt
+
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#ifdef CONFIG_CALL_THUNKS_DEBUG
+static int __initdata_or_module debug_callthunks;
+
+#define prdbg(fmt, args...)					\
+do {								\
+	if (debug_callthunks)					\
+		printk(KERN_DEBUG pr_fmt(fmt), ##args);		\
+} while(0)
+
+static int __init debug_thunks(char *str)
+{
+	debug_callthunks = 1;
+	return 1;
+}
+__setup("debug-callthunks", debug_thunks);
+#else
+#define prdbg(fmt, args...) do { } while(0)
+#endif
+
+extern s32 __call_sites[], __call_sites_end[];
+extern s32 __sym_sites[], __sym_sites_end[];
+
+static struct btree_head64 call_thunks;
+
+static bool thunks_initialized __ro_after_init;
+static struct module_layout builtin_layout __ro_after_init;
+
+struct thunk_desc {
+	void		*template;
+	unsigned int	template_size;
+	unsigned int	thunk_size;
+};
+
+static struct thunk_desc callthunk_desc __ro_after_init;
+
+struct thunk_mem {
+	void		*base;
+	unsigned int	size;
+	unsigned int	nthunks;
+	bool		is_rx;
+	struct list_head list;
+	unsigned long	map[0];
+};
+
+struct thunk_mem_area {
+	struct thunk_mem *tmem;
+	unsigned long	start;
+	unsigned long	nthunks;
+};
+
+static LIST_HEAD(thunk_mem_list);
+
+extern void error_entry(void);
+extern void xen_error_entry(void);
+extern void paranoid_entry(void);
+
+static inline bool is_inittext(struct module_layout *layout, void *addr)
+{
+	if (!layout->mtn.mod)
+		return is_kernel_inittext((unsigned long)addr);
+
+	return within_module_init((unsigned long)addr, layout->mtn.mod);
+}
+
+static __init_or_module bool skip_addr(void *dest)
+{
+	if (dest == error_entry)
+		return true;
+	if (dest == paranoid_entry)
+		return true;
+	if (dest == xen_error_entry)
+		return true;
+	/* Does FILL_RSB... */
+	if (dest == __switch_to_asm)
+		return true;
+	/* Accounts directly */
+	if (dest == ret_from_fork)
+		return true;
+#ifdef CONFIG_FUNCTION_TRACER
+	if (dest == __fentry__)
+		return true;
+#endif
+	return false;
+}
+
+static __init_or_module void *call_get_dest(void *addr)
+{
+	struct insn insn;
+	void *dest;
+	int ret;
+
+	ret = insn_decode_kernel(&insn, addr);
+	if (ret)
+		return ERR_PTR(ret);
+
+	/* Patched out call? */
+	if (insn.opcode.bytes[0] != CALL_INSN_OPCODE)
+		return NULL;
+
+	dest = addr + insn.length + insn.immediate.value;
+	if (skip_addr(dest))
+		return NULL;
+	return dest;
+}
+
+static void *jump_get_dest(void *addr)
+{
+	struct insn insn;
+	int ret;
+
+	ret = insn_decode_kernel(&insn, addr);
+	if (WARN_ON_ONCE(ret))
+		return NULL;
+
+	if (insn.opcode.bytes[0] != JMP32_INSN_OPCODE) {
+		WARN_ON_ONCE(insn.opcode.bytes[0] != INT3_INSN_OPCODE);
+		return NULL;
+	}
+
+	return addr + insn.length + insn.immediate.value;
+}
+
+static __init_or_module void callthunk_free(struct thunk_mem_area *area,
+					    bool set_int3)
+{
+	struct thunk_mem *tmem = area->tmem;
+	unsigned int i, size;
+	u8 *thunk, *tp;
+
+	lockdep_assert_held(&text_mutex);
+
+	prdbg("Freeing tmem %px %px %lu %lu\n", tmem->base,
+	      tmem->base + area->start * callthunk_desc.thunk_size,
+	      area->start, area->nthunks);
+
+	/* Jump starts right after the template */
+	thunk = tmem->base + area->start * callthunk_desc.thunk_size;
+	tp = thunk + callthunk_desc.template_size;
+
+	for (i = 0; i < area->nthunks; i++) {
+		void *dest = jump_get_dest(tp);
+
+		if (dest)
+			btree_remove64(&call_thunks, (unsigned long)dest);
+		tp += callthunk_desc.thunk_size;
+	}
+	bitmap_clear(tmem->map, area->start, area->nthunks);
+
+	if (bitmap_empty(tmem->map, tmem->nthunks)) {
+		list_del(&tmem->list);
+		prdbg("Freeing empty tmem: %px %u %u\n", tmem->base,
+		      tmem->size, tmem->nthunks);
+		vfree(tmem->base);
+		kfree(tmem);
+	} else if (set_int3) {
+		size = area->nthunks * callthunk_desc.thunk_size;
+		text_poke_set_locked(thunk, 0xcc, size);
+	}
+	kfree(area);
+}
+
+static __init_or_module
+int callthunk_setup_one(void *dest, u8 *thunk, u8 *buffer,
+			struct module_layout *layout)
+{
+	unsigned long key = (unsigned long)dest;
+	u8 *jmp;
+
+	if (is_inittext(layout, dest)) {
+		prdbg("Ignoring init dest: %pS %px\n", dest, dest);
+		return 0;
+	}
+
+	/* Multiple symbols can have the same location. */
+	if (btree_lookup64(&call_thunks, key)) {
+		prdbg("Ignoring duplicate dest: %pS %px\n", dest, dest);
+		return 0;
+	}
+
+	memcpy(buffer, callthunk_desc.template, callthunk_desc.template_size);
+	jmp = thunk + callthunk_desc.template_size;
+	buffer += callthunk_desc.template_size;
+	__text_gen_insn(buffer, JMP32_INSN_OPCODE, jmp, dest, JMP32_INSN_SIZE);
+
+	return btree_insert64(&call_thunks, key, (void *)thunk, GFP_KERNEL) ? : 1;
+}
+
+static __always_inline char *layout_getname(struct module_layout *layout)
+{
+#ifdef CONFIG_MODULES
+	if (layout->mtn.mod)
+		return layout->mtn.mod->name;
+#endif
+	return "builtin";
+}
+
+static __init_or_module void patch_call(void *addr, struct module_layout *layout)
+{
+	void *thunk, *dest;
+	unsigned long key;
+	u8 bytes[8];
+
+	if (is_inittext(layout, addr))
+		return;
+
+	dest = call_get_dest(addr);
+	if (!dest || WARN_ON_ONCE(IS_ERR(dest)))
+		return;
+
+	key = (unsigned long)dest;
+	thunk = btree_lookup64(&call_thunks, key);
+
+	if (!thunk) {
+		WARN_ONCE(!is_inittext(layout, dest),
+			  "Lookup %s thunk for %pS -> %pS %016lx failed\n",
+			  layout_getname(layout), addr, dest, key);
+		return;
+	}
+
+	__text_gen_insn(bytes, CALL_INSN_OPCODE, addr, thunk, CALL_INSN_SIZE);
+	text_poke_early(addr, bytes, CALL_INSN_SIZE);
+}
+
+static __init_or_module void patch_call_sites(s32 *start, s32 *end,
+					      struct module_layout *layout)
+{
+	s32 *s;
+
+	for (s = start; s < end; s++)
+		patch_call((void *)s + *s, layout);
+}
+
+static __init_or_module void
+patch_paravirt_call_sites(struct paravirt_patch_site *start,
+			  struct paravirt_patch_site *end,
+			  struct module_layout *layout)
+{
+	struct paravirt_patch_site *p;
+
+	for (p = start; p < end; p++)
+		patch_call(p->instr, layout);
+}
+
+static struct thunk_mem_area *callthunks_alloc(unsigned int nthunks)
+{
+	struct thunk_mem_area *area;
+	unsigned int size, mapsize;
+	struct thunk_mem *tmem;
+
+	area = kzalloc(sizeof(*area), GFP_KERNEL);
+	if (!area)
+		return NULL;
+
+	list_for_each_entry(tmem, &thunk_mem_list, list) {
+		unsigned long start;
+
+		start = bitmap_find_next_zero_area(tmem->map, tmem->nthunks,
+						   0, nthunks, 0);
+		if (start >= tmem->nthunks)
+			continue;
+		area->tmem = tmem;
+		area->start = start;
+		prdbg("Using tmem %px %px %lu %u\n", tmem->base,
+		      tmem->base + start * callthunk_desc.thunk_size,
+		      start, nthunks);
+		return area;
+	}
+
+	size = nthunks * callthunk_desc.thunk_size;
+	size = round_up(size, PMD_SIZE);
+	nthunks = size / callthunk_desc.thunk_size;
+	mapsize = nthunks / 8;
+
+	tmem = kzalloc(sizeof(*tmem) + mapsize, GFP_KERNEL);
+	if (!tmem)
+		goto free_area;
+	INIT_LIST_HEAD(&tmem->list);
+
+	tmem->base = __module_alloc(size, VM_HUGE_VMAP);
+	if (!tmem->base)
+		goto free_tmem;
+	memset(tmem->base, INT3_INSN_OPCODE, size);
+	tmem->size = size;
+	tmem->nthunks = nthunks;
+	list_add(&tmem->list, &thunk_mem_list);
+
+	area->tmem = tmem;
+	area->start = 0;
+	prdbg("Allocated tmem %px %x %u\n", tmem->base, size, nthunks);
+	return area;
+
+free_tmem:
+	kfree(tmem);
+free_area:
+	kfree(area);
+	return NULL;
+}
+
+static __init_or_module void callthunk_area_set_rx(struct thunk_mem_area *area)
+{
+	unsigned long base, size;
+
+	base = (unsigned long)area->tmem->base;
+	size = area->tmem->size / PAGE_SIZE;
+
+	prdbg("Set RX: %016lx %lx\n", base, size);
+	set_memory_ro(base, size);
+	set_memory_x(base, size);
+
+	area->tmem->is_rx = true;
+}
+
+static __init_or_module int callthunks_setup(struct callthunk_sites *cs,
+					     struct module_layout *layout)
+{
+	u8 *tp, *thunk, *buffer, *vbuf = NULL;
+	unsigned int nthunks, bitpos;
+	struct thunk_mem_area *area;
+	int ret, text_size, size;
+	s32 *s;
+
+	lockdep_assert_held(&text_mutex);
+
+	prdbg("Setup %s\n", layout_getname(layout));
+	/* Calculate the number of thunks required */
+	nthunks = cs->syms_end - cs->syms_start;
+
+	/*
+	 * thunk_size can be 0 when there are no intra module calls,
+	 * but there might still be sites to patch.
+	 */
+	if (!nthunks)
+		goto patch;
+
+	area = callthunks_alloc(nthunks);
+	if (!area)
+		return -ENOMEM;
+
+	bitpos = area->start;
+	thunk = area->tmem->base + bitpos * callthunk_desc.thunk_size;
+	tp = thunk;
+
+	prdbg("Thunk %px\n", tp);
+	/*
+	 * If the memory area is already RX, use a temporary
+	 * buffer. Otherwise just copy into the unused area.
+	 */
+	if (!area->tmem->is_rx) {
+		prdbg("Using thunk direct\n");
+		buffer = thunk;
+	} else {
+		size = nthunks * callthunk_desc.thunk_size;
+		vbuf = vmalloc(size);
+		if (!vbuf) {
+			ret = -ENOMEM;
+			goto fail;
+		}
+		memset(vbuf, INT3_INSN_OPCODE, size);
+		buffer = vbuf;
+		prdbg("Using thunk vbuf %px\n", vbuf);
+	}
+
+	for (s = cs->syms_start; s < cs->syms_end; s++, bitpos++) {
+		void *dest = (void *)s + *s;
+
+		ret = callthunk_setup_one(dest, tp, buffer, layout);
+		if (ret)
+			goto fail;
+		buffer += callthunk_desc.thunk_size;
+		tp += callthunk_desc.thunk_size;
+		bitmap_set(area->tmem->map, bitpos, 1);
+		area->nthunks++;
+	}
+
+	text_size = tp - thunk;
+	prdbg("Thunk %px .. %px 0x%x\n", thunk, tp, text_size);
+
+	/*
+	 * If thunk memory is already RX, poke the buffer into it.
+	 * Otherwise make the memory RX.
+	 */
+	if (vbuf)
+		text_poke_copy_locked(thunk, vbuf, text_size);
+	else
+		callthunk_area_set_rx(area);
+	sync_core();
+
+	layout->base = thunk;
+	layout->size = text_size;
+	layout->text_size = text_size;
+	layout->arch_data = area;
+
+	vfree(vbuf);
+
+patch:
+	prdbg("Patching call sites %s\n", layout_getname(layout));
+	patch_call_sites(cs->call_start, cs->call_end, layout);
+	patch_paravirt_call_sites(cs->pv_start, cs->pv_end, layout);
+	prdbg("Patching call sites done%s\n", layout_getname(layout));
+	return 0;
+
+fail:
+	WARN_ON_ONCE(ret);
+	callthunk_free(area, false);
+	vfree(vbuf);
+	return ret;
+}
+
+static __init noinline void callthunks_init(struct callthunk_sites *cs)
+{
+	int ret;
+
+	if (!callthunk_desc.template)
+		return;
+
+	if (WARN_ON_ONCE(btree_init64(&call_thunks)))
+		return;
+
+	ret = callthunks_setup(cs, &builtin_layout);
+	if (WARN_ON_ONCE(ret))
+		return;
+
+	thunks_initialized = true;
+}
+
+void __init callthunks_patch_builtin_calls(void)
+{
+	struct callthunk_sites cs = {
+		.syms_start	= __sym_sites,
+		.syms_end	= __sym_sites_end,
+		.call_start	= __call_sites,
+		.call_end	= __call_sites_end,
+		.pv_start	= __parainstructions,
+		.pv_end		= __parainstructions_end
+	};
+
+	mutex_lock(&text_mutex);
+	callthunks_init(&cs);
+	mutex_unlock(&text_mutex);
+}

From nobody Sat Apr 18 05:54:57 2026
Message-ID: <20220716230954.095980377@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt, "Peter Zijlstra (Intel)"
Subject: [patch 24/38] module: Add layout for callthunks tracking
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:48 +0200 (CEST)

From: Peter Zijlstra

Various things will need to be able to tell if a specific address is a
callthunk or not (ORC, BPF, static_call). In order to answer this question
in the face of modules it is necessary to (quickly) find the module
associated with a specific (callthunk) address.

Extend the __module_address() infrastructure with knowledge of the
(per module) callthunk range.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Thomas Gleixner
---
 include/linux/module.h      | 21 +++++++++++++++++++--
 kernel/module/internal.h    |  8 ++++++++
 kernel/module/main.c        |  6 ++++++
 kernel/module/tree_lookup.c | 17 ++++++++++++++++-
 4 files changed, 49 insertions(+), 3 deletions(-)

--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -424,6 +424,9 @@ struct module {
 	/* Core layout: rbtree is accessed frequently, so keep together. */
 	struct module_layout core_layout __module_layout_align;
 	struct module_layout init_layout;
+#ifdef CONFIG_CALL_THUNKS
+	struct module_layout thunk_layout;
+#endif
 #ifdef CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC
 	struct module_layout data_layout;
 #endif
@@ -590,9 +593,23 @@ static inline bool within_module_init(un
 	       addr < (unsigned long)mod->init_layout.base + mod->init_layout.size;
 }

-static inline bool within_module(unsigned long addr, const struct module *mod)
+static inline bool within_module_thunk(unsigned long addr,
+				       const struct module *mod)
+{
+#ifdef CONFIG_CALL_THUNKS
+	return (unsigned long)mod->thunk_layout.base <= addr &&
+	       addr < (unsigned long)mod->thunk_layout.base + mod->thunk_layout.size;
+#else
+	return false;
+#endif
+}
+
+static inline bool within_module(unsigned long addr,
+				 const struct module *mod)
 {
-	return within_module_init(addr, mod) || within_module_core(addr, mod);
+	return within_module_core(addr, mod) ||
+	       within_module_thunk(addr, mod) ||
+	       within_module_init(addr, mod);
 }

 /* Search for module by name: must be in a RCU-sched critical section. */
--- a/kernel/module/internal.h
+++ b/kernel/module/internal.h
@@ -219,6 +219,14 @@ static inline struct module *mod_find(un
 }
 #endif /* CONFIG_MODULES_TREE_LOOKUP */

+#if defined(CONFIG_MODULES_TREE_LOOKUP) && defined(CONFIG_CALL_THUNKS)
+void mod_tree_insert_thunk(struct module *mod);
+void mod_tree_remove_thunk(struct module *mod);
+#else
+static inline void mod_tree_insert_thunk(struct module *mod) { }
+static inline void mod_tree_remove_thunk(struct module *mod) { }
+#endif
+
 void module_enable_ro(const struct module *mod, bool after_init);
 void module_enable_nx(const struct module *mod);
 void module_enable_x(const struct module *mod);
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -1154,6 +1154,7 @@ static void free_module(struct module *m
 	 */
 	mutex_lock(&module_mutex);
 	mod->state = MODULE_STATE_UNFORMED;
+	mod_tree_remove_thunk(mod);
 	mutex_unlock(&module_mutex);

 	/* Remove dynamic debug info */
@@ -2770,6 +2771,10 @@ static int load_module(struct load_info
 	if (err < 0)
 		goto free_modinfo;

+	mutex_lock(&module_mutex);
+	mod_tree_insert_thunk(mod);
+	mutex_unlock(&module_mutex);
+
 	flush_module_icache(mod);

 	/* Setup CFI for the module. */
@@ -2859,6 +2864,7 @@ static int load_module(struct load_info
 	mutex_lock(&module_mutex);
 	/* Unlink carefully: kallsyms could be walking list. */
 	list_del_rcu(&mod->list);
+	mod_tree_remove_thunk(mod);
 	mod_tree_remove(mod);
 	wake_up_all(&module_wq);
 	/* Wait for RCU-sched synchronizing before releasing mod->list. */
--- a/kernel/module/tree_lookup.c
+++ b/kernel/module/tree_lookup.c
@@ -66,11 +66,26 @@ static noinline void __mod_tree_insert(s
 	latch_tree_insert(&node->node, &tree->root, &mod_tree_ops);
 }

-static void __mod_tree_remove(struct mod_tree_node *node, struct mod_tree_root *tree)
+static noinline void __mod_tree_remove(struct mod_tree_node *node, struct mod_tree_root *tree)
 {
 	latch_tree_erase(&node->node, &tree->root, &mod_tree_ops);
 }

+#ifdef CONFIG_CALL_THUNKS
+void mod_tree_insert_thunk(struct module *mod)
+{
+	mod->thunk_layout.mtn.mod = mod;
+	if (mod->thunk_layout.size)
+		__mod_tree_insert(&mod->thunk_layout.mtn, &mod_tree);
+}
+
+void mod_tree_remove_thunk(struct module *mod)
+{
+	if (mod->thunk_layout.size)
+		__mod_tree_remove(&mod->thunk_layout.mtn, &mod_tree);
+}
+#endif
+
 /*
  * These modifications: insert, remove_init and remove; are serialized by the
  * module_mutex.

From nobody Sat Apr 18 05:54:57 2026
Message-ID: <20220716230954.154789166@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper, Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu", Joao Moreira, Joseph Nuzman, Steven Rostedt
Subject: [patch 25/38] x86/modules: Add call thunk patching
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:17:50 +0200 (CEST)

As for the builtins, create call thunks and patch the call sites to call
the thunk on Intel SKL CPUs for the retbleed mitigation.

Note that module init functions are ignored for the sake of simplicity
because loading modules is not done in tight loops and an attacker does not
really have a handle on when this happens in order to launch a matching
attack.

The depth tracking will still work for calls into the builtins, and because
the call is not accounted the counter will underflow faster and overstuff,
but that's mitigated by the saturating counter and the side effect is only
temporary.

Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/alternative.h |  7 ++++++
 arch/x86/kernel/callthunks.c       | 49 +++++++++++++++++++++++++++++++++
 arch/x86/kernel/module.c           | 29 +++++++++++++++++++++
 include/linux/module.h             |  4 +++
 4 files changed, 88 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -90,8 +90,15 @@ struct callthunk_sites {

 #ifdef CONFIG_CALL_THUNKS
 extern void callthunks_patch_builtin_calls(void);
+extern void callthunks_patch_module_calls(struct callthunk_sites *sites,
+					  struct module *mod);
+extern void callthunks_module_free(struct module *mod);
 #else
 static __always_inline void callthunks_patch_builtin_calls(void) {}
+static __always_inline void
+callthunks_patch_module_calls(struct callthunk_sites *sites,
+			      struct module *mod) {}
+static __always_inline void callthunks_module_free(struct module *mod) { }
 #endif

 #ifdef CONFIG_SMP
--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -329,6 +329,20 @@ static __init_or_module void callthunk_a
 	area->tmem->is_rx = true;
 }

+static __init_or_module int callthunk_set_modname(struct module_layout *layout)
+{
+#ifdef CONFIG_MODULES
+	struct module *mod = layout->mtn.mod;
+
+	if (mod) {
+		mod->callthunk_name = kasprintf(GFP_KERNEL, "callthunk:%s", mod->name);
+		if (!mod->callthunk_name)
+			return -ENOMEM;
+	}
+#endif
+	return 0;
+}
+
 static __init_or_module int callthunks_setup(struct callthunk_sites *cs,
 					     struct module_layout *layout)
 {
@@ -404,6 +418,10 @@ static __init_or_module int callthunks_s
 	callthunk_area_set_rx(area);
 	sync_core();

+	ret = callthunk_set_modname(layout);
+	if (ret)
+		goto fail;
+
 	layout->base = thunk;
 	layout->size = text_size;
layout->text_size =3D text_size; @@ -457,3 +475,34 @@ void __init callthunks_patch_builtin_cal callthunks_init(&cs); mutex_unlock(&text_mutex); } + +#ifdef CONFIG_MODULES +void noinline callthunks_patch_module_calls(struct callthunk_sites *cs, + struct module *mod) +{ + struct module_layout *layout =3D &mod->thunk_layout; + + if (!thunks_initialized) + return; + + layout->mtn.mod =3D mod; + mutex_lock(&text_mutex); + WARN_ON_ONCE(callthunks_setup(cs, layout)); + mutex_unlock(&text_mutex); +} + +void callthunks_module_free(struct module *mod) +{ + struct module_layout *layout =3D &mod->thunk_layout; + struct thunk_mem_area *area =3D layout->arch_data; + + if (!thunks_initialized || !area) + return; + + prdbg("Free %s\n", layout_getname(layout)); + layout->arch_data =3D NULL; + mutex_lock(&text_mutex); + callthunk_free(area, true); + mutex_unlock(&text_mutex); +} +#endif /* CONFIG_MODULES */ --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -196,7 +196,8 @@ int module_finalize(const Elf_Ehdr *hdr, { const Elf_Shdr *s, *text =3D NULL, *alt =3D NULL, *locks =3D NULL, *para =3D NULL, *orc =3D NULL, *orc_ip =3D NULL, - *retpolines =3D NULL, *returns =3D NULL, *ibt_endbr =3D NULL; + *retpolines =3D NULL, *returns =3D NULL, *ibt_endbr =3D NULL, + *syms =3D NULL, *calls =3D NULL; char *secstrings =3D (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; =20 for (s =3D sechdrs; s < sechdrs + hdr->e_shnum; s++) { @@ -216,6 +217,10 @@ int module_finalize(const Elf_Ehdr *hdr, retpolines =3D s; if (!strcmp(".return_sites", secstrings + s->sh_name)) returns =3D s; + if (!strcmp(".sym_sites", secstrings + s->sh_name)) + syms =3D s; + if (!strcmp(".call_sites", secstrings + s->sh_name)) + calls =3D s; if (!strcmp(".ibt_endbr_seal", secstrings + s->sh_name)) ibt_endbr =3D s; } @@ -241,10 +246,31 @@ int module_finalize(const Elf_Ehdr *hdr, void *aseg =3D (void *)alt->sh_addr; apply_alternatives(aseg, aseg + alt->sh_size); } + if (calls || syms || para) { + struct 
callthunk_sites cs =3D {}; + + if (syms) { + cs.syms_start =3D (void *)syms->sh_addr; + cs.syms_end =3D (void *)syms->sh_addr + syms->sh_size; + } + + if (calls) { + cs.call_start =3D (void *)calls->sh_addr; + cs.call_end =3D (void *)calls->sh_addr + calls->sh_size; + } + + if (para) { + cs.pv_start =3D (void *)para->sh_addr; + cs.pv_end =3D (void *)para->sh_addr + para->sh_size; + } + + callthunks_patch_module_calls(&cs, me); + } if (ibt_endbr) { void *iseg =3D (void *)ibt_endbr->sh_addr; apply_ibt_endbr(iseg, iseg + ibt_endbr->sh_size); } + if (locks && text) { void *lseg =3D (void *)locks->sh_addr; void *tseg =3D (void *)text->sh_addr; @@ -266,4 +292,5 @@ int module_finalize(const Elf_Ehdr *hdr, void module_arch_cleanup(struct module *mod) { alternatives_smp_module_del(mod); + callthunks_module_free(mod); } --- a/include/linux/module.h +++ b/include/linux/module.h @@ -525,6 +525,10 @@ struct module { struct pi_entry **printk_index_start; #endif =20 +#ifdef CONFIG_CALL_THUNKS + char *callthunk_name; +#endif + #ifdef CONFIG_MODULE_UNLOAD /* What modules depend on me? 
*/ struct list_head source_list. From nobody Sat Apr 18 05:54:57 2026 Message-ID: <20220716230954.214825322@linutronix.de>
From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 26/38] x86/returnthunk: Allow different return thunks References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:52 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra In preparation for call depth tracking on Intel SKL CPUs, make it possible to patch in a SKL specific return thunk. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/nospec-branch.h | 6 ++++++ arch/x86/kernel/alternative.c | 19 ++++++++++++++----- arch/x86/kernel/ftrace.c | 2 +- arch/x86/kernel/static_call.c | 2 +- arch/x86/net/bpf_jit_comp.c | 2 +- 5 files changed, 23 insertions(+), 8 deletions(-) --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -168,6 +168,12 @@ extern void __x86_return_thunk(void); extern void zen_untrain_ret(void); extern void entry_ibpb(void); =20 +#ifdef CONFIG_CALL_THUNKS +extern void (*x86_return_thunk)(void); +#else +#define x86_return_thunk (&__x86_return_thunk) +#endif + #ifdef CONFIG_RETPOLINE =20 #define GEN(reg) \ --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -509,6 +509,11 @@ void __init_or_module noinline apply_ret } =20 #ifdef CONFIG_RETHUNK + +#ifdef CONFIG_CALL_THUNKS +void (*x86_return_thunk)(void) __ro_after_init =3D &__x86_return_thunk; +#endif + /* * Rewrite the compiler generated return thunk tail-calls. 
* @@ -524,14 +529,18 @@ static int patch_return(void *addr, stru { int i =3D 0; =20 - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) - return -1; - - bytes[i++] =3D RET_INSN_OPCODE; + if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { + if (x86_return_thunk =3D=3D __x86_return_thunk) + return -1; + + i =3D JMP32_INSN_SIZE; + __text_gen_insn(bytes, JMP32_INSN_OPCODE, addr, x86_return_thunk, i); + } else { + bytes[i++] =3D RET_INSN_OPCODE; + } =20 for (; i < insn->length;) bytes[i++] =3D INT3_INSN_OPCODE; - return i; } =20 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -358,7 +358,7 @@ create_trampoline(struct ftrace_ops *ops =20 ip =3D trampoline + size; if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) - __text_gen_insn(ip, JMP32_INSN_OPCODE, ip, &__x86_return_thunk, JMP32_INSN_SIZE); + __text_gen_insn(ip, JMP32_INSN_OPCODE, ip, x86_return_thunk, JMP32_INSN_SIZE); else memcpy(ip, retq, sizeof(retq)); =20 --- a/arch/x86/kernel/static_call.c +++ b/arch/x86/kernel/static_call.c @@ -52,7 +52,7 @@ static void __ref __static_call_transfor =20 case RET: if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) - code =3D text_gen_insn(JMP32_INSN_OPCODE, insn, &__x86_return_thunk); + code =3D text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else code =3D &retinsn; break; --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -430,7 +430,7 @@ static void emit_return(u8 **pprog, u8 * u8 *prog =3D *pprog; =20 if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { - emit_jump(&prog, &__x86_return_thunk, ip); + emit_jump(&prog, x86_return_thunk, ip); } else { EMIT1(0xC3); /* ret */ if (IS_ENABLED(CONFIG_SLS)) From nobody Sat Apr 18 05:54:57 2026 Message-ID: <20220716230954.275034921@linutronix.de> From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 27/38] x86/asm: Provide ALTERNATIVE_3 References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:53 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra A fairly straightforward adaptation/extension of ALTERNATIVE_2. Required for call depth tracking. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/alternative.h | 33 ++++++++++++++++++++++++++++++--- 1 file changed, 30 insertions(+), 3 deletions(-) --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -367,6 +367,7 @@ static inline int alternatives_text_rese #define old_len 141b-140b #define new_len1 144f-143f #define new_len2 145f-144f +#define new_len3 146f-145f =20 /* * gas compatible max based on the idea from: @@ -374,7 +375,8 @@ static inline int alternatives_text_rese * * The additional "-" is needed because gas uses a "true" value of -1.
*/ -#define alt_max_short(a, b) ((a) ^ (((a) ^ (b)) & -(-((a) < (b))))) +#define alt_max_2(a, b) ((a) ^ (((a) ^ (b)) & -(-((a) < (b))))) +#define alt_max_3(a, b, c) (alt_max_2(alt_max_2(a, b), c)) =20 =20 /* * @@ -386,8 +388,8 @@ static inline int alternatives_text_rese 140: \oldinstr 141: - .skip -((alt_max_short(new_len1, new_len2) - (old_len)) > 0) * \ - (alt_max_short(new_len1, new_len2) - (old_len)),0x90 + .skip -((alt_max_2(new_len1, new_len2) - (old_len)) > 0) * \ + (alt_max_2(new_len1, new_len2) - (old_len)),0x90 142: =20 .pushsection .altinstructions,"a" @@ -404,6 +406,31 @@ static inline int alternatives_text_rese .popsection .endm =20 +.macro ALTERNATIVE_3 oldinstr, newinstr1, feature1, newinstr2, feature2, newinstr3, feature3 +140: + \oldinstr +141: + .skip -((alt_max_3(new_len1, new_len2, new_len3) - (old_len)) > 0) * \ + (alt_max_3(new_len1, new_len2, new_len3) - (old_len)),0x90 +142: + + .pushsection .altinstructions,"a" + altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f + altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f + altinstruction_entry 140b,145f,\feature3,142b-140b,146f-145f + .popsection + + .pushsection .altinstr_replacement,"ax" +143: + \newinstr1 +144: + \newinstr2 +145: + \newinstr3 +146: + .popsection +.endm + /* If @feature is set, patch in @newinstr_yes, otherwise @newinstr_no.
*/ #define ALTERNATIVE_TERNARY(oldinstr, feature, newinstr_yes, newinstr_no) \ ALTERNATIVE_2 oldinstr, newinstr_no, X86_FEATURE_ALWAYS, \ From nobody Sat Apr 18 05:54:57 2026 Message-ID: <20220716230954.334016834@linutronix.de> From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt Subject: [patch 28/38] x86/retbleed: Add SKL return thunk References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:55 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" To address the Intel SKL RSB underflow issue in software it's required to do call depth tracking. Provide a return thunk for call depth tracking on Intel SKL CPUs. The tracking does not use a counter. It uses arithmetic shift right on call entry and logical shift left on return. The depth tracking variable is initialized to 0x8000.... when the call depth is zero. The arithmetic shift right sign extends the MSB and saturates after the 12th call. The shift count is 5 so the tracking covers 12 nested calls. On return the variable is shifted left logically so it becomes zero again. CALL RET 0: 0x8000000000000000 0x0000000000000000 1: 0xfc00000000000000 0xf000000000000000 ... 11: 0xfffffffffffffff8 0xfffffffffffffc00 12: 0xffffffffffffffff 0xffffffffffffffe0 After a return buffer fill the depth is credited 12 calls before the next stuffing has to take place. There is an inaccuracy for situations like this: 10 calls 5 returns 3 calls 4 returns 3 calls .... The shift count might cause this to be off by one in either direction, but there is still a cushion vs. the RSB depth. The algorithm does not claim to be perfect, but it should obfuscate the problem enough to make exploitation extremely difficult.
The theory behind this is: RSB is a stack with depth 16 which is filled on every call. On the return path speculation "pops" entries to speculate down the call chain. Once the speculative RSB is empty it switches to other predictors, e.g. the Branch History Buffer, which can be mistrained by user space and misguide the speculation path to a gadget. Call depth tracking is designed to break this speculation path by stuffing speculation trap calls into the RSB which are never getting a corresponding return executed. This stalls the prediction path until it gets resteered. The assumption is that stuffing at the 12th return is sufficient to break the speculation before it hits the underflow and the fallback to the other predictors. Testing confirms that it works. Johannes, one of the retbleed researchers, tried to attack this approach but failed. There is obviously no scientific proof that this will withstand future research progress, but all we can do right now is to speculate about it. The SAR/SHL usage was suggested by Andi Kleen. Signed-off-by: Thomas Gleixner --- arch/x86/entry/entry_64.S | 10 +-- arch/x86/include/asm/nospec-branch.h | 114 +++++++++++++++++++++++++++++++++-- arch/x86/kernel/cpu/common.c | 5 + arch/x86/lib/retpoline.S | 30 +++++++++ 4 files changed, 149 insertions(+), 10 deletions(-) --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -287,6 +287,7 @@ SYM_FUNC_END(__switch_to_asm) SYM_CODE_START(ret_from_fork) UNWIND_HINT_EMPTY ANNOTATE_NOENDBR // copy_thread + CALL_DEPTH_ACCOUNT movq %rax, %rdi call schedule_tail /* rdi: 'prev' task parameter */ =20 @@ -331,7 +332,7 @@ SYM_CODE_START(xen_error_entry) UNWIND_HINT_FUNC PUSH_AND_CLEAR_REGS save_ret=3D1 ENCODE_FRAME_POINTER 8 - UNTRAIN_RET + UNTRAIN_RET_FROM_CALL RET SYM_CODE_END(xen_error_entry) =20 @@ -975,7 +976,7 @@ SYM_CODE_START(paranoid_entry) * CR3 above, keep the old value in a callee saved register.
*/ IBRS_ENTER save_reg=3D%r15 - UNTRAIN_RET + UNTRAIN_RET_FROM_CALL =20 RET SYM_CODE_END(paranoid_entry) @@ -1060,7 +1061,7 @@ SYM_CODE_START(error_entry) /* We have user CR3. Change to kernel CR3. */ SWITCH_TO_KERNEL_CR3 scratch_reg=3D%rax IBRS_ENTER - UNTRAIN_RET + UNTRAIN_RET_FROM_CALL =20 leaq 8(%rsp), %rdi /* arg0 =3D pt_regs pointer */ /* Put us onto the real thread stack. */ @@ -1095,6 +1096,7 @@ SYM_CODE_START(error_entry) */ .Lerror_entry_done_lfence: FENCE_SWAPGS_KERNEL_ENTRY + CALL_DEPTH_ACCOUNT leaq 8(%rsp), %rax /* return pt_regs pointer */ ANNOTATE_UNRET_END RET @@ -1113,7 +1115,7 @@ SYM_CODE_START(error_entry) FENCE_SWAPGS_USER_ENTRY SWITCH_TO_KERNEL_CR3 scratch_reg=3D%rax IBRS_ENTER - UNTRAIN_RET + UNTRAIN_RET_FROM_CALL =20 /* * Pretend that the exception came from user mode: set up pt_regs --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -11,8 +11,53 @@ #include #include #include +#include =20 #define RETPOLINE_THUNK_SIZE 32 +#define RSB_CLEAR_LOOPS 32 /* To forcibly overwrite all entries */ + +/* + * Call depth tracking for Intel SKL CPUs to address the RSB underflow + * issue in software. + * + * The tracking does not use a counter. It uses arithmetic shift + * right on call entry and logical shift left on return. + * + * The depth tracking variable is initialized to 0x8000.... when the call + * depth is zero. The arithmetic shift right sign extends the MSB and + * saturates after the 12th call. The shift count is 5 for both directions + * so the tracking covers 12 nested calls. + * + * Call + * 0: 0x8000000000000000 0x0000000000000000 + * 1: 0xfc00000000000000 0xf000000000000000 + * ... + * 11: 0xfffffffffffffff8 0xfffffffffffffc00 + * 12: 0xffffffffffffffff 0xffffffffffffffe0 + * + * After a return buffer fill the depth is credited 12 calls before the + * next stuffing has to take place.
+ * + * There is an inaccuracy for situations like this: + * + * 10 calls + * 5 returns + * 3 calls + * 4 returns + * 3 calls + * .... + * + * The shift count might cause this to be off by one in either direction, + * but there is still a cushion vs. the RSB depth. The algorithm does not + * claim to be perfect and it can be speculated around by the CPU, but it + * is considered that it obfuscates the problem enough to make exploitation + * extremely difficult. + */ +#define RET_DEPTH_SHIFT 5 +#define RSB_RET_STUFF_LOOPS 16 +#define RET_DEPTH_INIT 0x8000000000000000ULL +#define RET_DEPTH_INIT_FROM_CALL 0xfc00000000000000ULL +#define RET_DEPTH_CREDIT 0xffffffffffffffffULL =20 /* * Fill the CPU return stack buffer. @@ -31,7 +76,28 @@ * from C via asm(".include ") but let's not go there. */ =20 -#define RSB_CLEAR_LOOPS 32 /* To forcibly overwrite all entries */ +#ifdef CONFIG_CALL_DEPTH_TRACKING +#define CREDIT_CALL_DEPTH \ + movq $-1, PER_CPU_VAR(__x86_call_depth); + +#define RESET_CALL_DEPTH \ + mov $0x80, %rax; \ + shl $56, %rax; \ + movq %rax, PER_CPU_VAR(__x86_call_depth); + +#define RESET_CALL_DEPTH_FROM_CALL \ + mov $0xfc, %rax; \ + shl $56, %rax; \ + movq %rax, PER_CPU_VAR(__x86_call_depth); + +#define INCREMENT_CALL_DEPTH \ + sarq $5, %gs:__x86_call_depth +#else +#define CREDIT_CALL_DEPTH +#define RESET_CALL_DEPTH +#define INCREMENT_CALL_DEPTH +#define RESET_CALL_DEPTH_FROM_CALL +#endif =20 /* * Google experimented with loop-unrolling and this turned out to be @@ -59,7 +125,9 @@ 774: \ add $(BITS_PER_LONG/8) * 2, sp; \ dec reg; \ - jnz 771b; + jnz 771b; \ + \ + CREDIT_CALL_DEPTH =20 #ifdef __ASSEMBLY__ =20 @@ -145,11 +213,32 @@ * where we have a stack but before any RET instruction.
*/ .macro UNTRAIN_RET -#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) +#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \ + defined(CONFIG_X86_FEATURE_CALL_DEPTH) ANNOTATE_UNRET_END - ALTERNATIVE_2 "", \ - CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET, \ - "call entry_ibpb", X86_FEATURE_ENTRY_IBPB + ALTERNATIVE_3 "", \ + CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET, \ + "call entry_ibpb", X86_FEATURE_ENTRY_IBPB, \ + __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH +#endif +.endm + +.macro UNTRAIN_RET_FROM_CALL +#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \ + defined(CONFIG_X86_FEATURE_CALL_DEPTH) + ANNOTATE_UNRET_END + ALTERNATIVE_3 "", \ + CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET, \ + "call entry_ibpb", X86_FEATURE_ENTRY_IBPB, \ + __stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH +#endif +.endm + + +.macro CALL_DEPTH_ACCOUNT +#ifdef CONFIG_CALL_DEPTH_TRACKING + ALTERNATIVE "", \ + __stringify(INCREMENT_CALL_DEPTH), X86_FEATURE_CALL_DEPTH #endif .endm =20 @@ -174,6 +263,19 @@ extern void (*x86_return_thunk)(void); #define x86_return_thunk (&__x86_return_thunk) #endif =20 +#ifdef CONFIG_CALL_DEPTH_TRACKING +extern void __x86_return_skl(void); + +static inline void x86_set_skl_return_thunk(void) +{ + x86_return_thunk =3D &__x86_return_skl; +} + +DECLARE_PER_CPU(u64, __x86_call_depth); +#else +static inline void x86_set_skl_return_thunk(void) {} +#endif + #ifdef CONFIG_RETPOLINE =20 #define GEN(reg) \ --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -2002,6 +2002,11 @@ EXPORT_PER_CPU_SYMBOL(__preempt_count); =20 DEFINE_PER_CPU(unsigned long, cpu_current_top_of_stack) =3D TOP_OF_INIT_STACK; =20 +#ifdef CONFIG_CALL_DEPTH_TRACKING +DEFINE_PER_CPU(u64, __x86_call_depth); +EXPORT_PER_CPU_SYMBOL_GPL(__x86_call_depth); +#endif + static void wrmsrl_cstar(unsigned long val) { /* --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -8,6 +8,7 @@ #include
#include #include +#include #include =20 .section .text.__x86.indirect_thunk @@ -140,3 +141,32 @@ SYM_FUNC_END(zen_untrain_ret) EXPORT_SYMBOL(__x86_return_thunk) =20 #endif /* CONFIG_RETHUNK */ + +#ifdef CONFIG_CALL_DEPTH_TRACKING + + .align 64 +SYM_FUNC_START(__x86_return_skl) + ANNOTATE_NOENDBR + /* Keep the hotpath in a 16byte I-fetch */ + shlq $5, PER_CPU_VAR(__x86_call_depth) + jz 1f + ANNOTATE_UNRET_SAFE + ret + int3 +1: + .rept 16 + ANNOTATE_INTRA_FUNCTION_CALL + call 2f + int3 +2: + .endr + add $(8*16), %rsp + + CREDIT_CALL_DEPTH + + ANNOTATE_UNRET_SAFE + ret + int3 +SYM_FUNC_END(__x86_return_skl) + +#endif /* CONFIG_CALL_DEPTH_TRACKING */ From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6EB1C433EF for ; Sat, 16 Jul 2022 23:19:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233117AbiGPXTY (ORCPT ); Sat, 16 Jul 2022 19:19:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39668 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233105AbiGPXSb (ORCPT ); Sat, 16 Jul 2022 19:18:31 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0352F2316D for ; Sat, 16 Jul 2022 16:18:00 -0700 (PDT) Message-ID: <20220716230954.395957513@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013477; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=NshALtucQnymJ2WeE+UdxR6xGfxJcgehG3zhGvwcNYE=; b=cmNoIR/Nd13Op6eENfywnX5YX3Z7qMI1LTE0yMmp3CE0xfzSHEnZcTK+ZqDax/a0EGxz3g 
vDPO0hmznLDoCQzjpvuI0Mx+DbiXq6RaXS1qUHH3MIbULdBZXzaCK0dgJeQ2HtmxFkyz78 IN0WeM/iFuGCQ9tV3/4V3Dj6oa5hdFSpAreybu9fS9OVtpyHmEmB4zdFAnEL8hDSVYd/PJ 4rxlyWOqRNjhwQ4BFmC0+J+m78uTqiBhLPC+ybedZ2b2qJu7Lwu3vMkeeqB2gi1xgaPwcY gWTyj9HxtuoxRvbD06qBoakP7p9ShJRv7W+KrLlopgpcVdqhBi43nIymV3OnfQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013477; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=NshALtucQnymJ2WeE+UdxR6xGfxJcgehG3zhGvwcNYE=; b=Bgn4iEM4KKgMrcLrRw90Qr6YklxVs4CQalimSI3lbrqSjbRpxI4PKKhUARgAzusucdIX7m dN5TN+aDzngGK8Bw== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 29/38] x86/retpoline: Add SKL retthunk retpolines References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:17:56 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra Ensure that retpolines do the proper call accounting so that the return accounting works correctly. Specifically; retpolines are used to replace both 'jmp *%reg' and 'call *%reg', however these two cases do not have the same accounting requirements. Therefore split things up and provide two different retpoline arrays for SKL. The 'jmp *%reg' case needs no accounting, the __x86_indirect_jump_thunk_array[] covers this. The retpoline is changed to not use the return thunk; it's a simple call;ret construct. 
[ strictly speaking it should do: andq $(~0x1f), PER_CPU_VAR(__x86_call_depth) but we can argue this can be covered by the fuzz we already have in the accounting depth (12) vs the RSB depth (16) ] The 'call *%reg' case does need accounting, the __x86_indirect_call_thunk_array[] covers this. Again, this retpoline avoids the use of the return-thunk, in this case to avoid double accounting. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/nospec-branch.h | 12 +++++ arch/x86/kernel/alternative.c | 43 +++++++++++++++++++-- arch/x86/lib/retpoline.S | 71 +++++++++++++++++++++++++++++++---- arch/x86/net/bpf_jit_comp.c | 5 +- 4 files changed, 119 insertions(+), 12 deletions(-) --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -252,6 +252,8 @@ =20 typedef u8 retpoline_thunk_t[RETPOLINE_THUNK_SIZE]; extern retpoline_thunk_t __x86_indirect_thunk_array[]; +extern retpoline_thunk_t __x86_indirect_call_thunk_array[]; +extern retpoline_thunk_t __x86_indirect_jump_thunk_array[]; =20 extern void __x86_return_thunk(void); extern void zen_untrain_ret(void); @@ -283,6 +285,16 @@ static inline void x86_set_skl_return_th #include #undef GEN =20 +#define GEN(reg) \ + extern retpoline_thunk_t __x86_indirect_call_thunk_ ## reg; +#include +#undef GEN + +#define GEN(reg) \ + extern retpoline_thunk_t __x86_indirect_jump_thunk_ ## reg; +#include +#undef GEN + #ifdef CONFIG_X86_64 =20 /* --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -377,6 +377,38 @@ static int emit_indirect(int op, int reg return i; } =20 +static int emit_call_track_retpoline(void *addr, struct insn *insn, int reg, u8 *bytes) +{ + u8 op =3D insn->opcode.bytes[0]; + int i =3D 0; + + if (insn->length =3D=3D 6) + bytes[i++] =3D 0x2e; /* CS-prefix */ + + switch (op) { + case CALL_INSN_OPCODE: + __text_gen_insn(bytes+i, op, addr+i, + __x86_indirect_call_thunk_array[reg], + CALL_INSN_SIZE); + i +=3D CALL_INSN_SIZE; + 
break; + + case JMP32_INSN_OPCODE: + __text_gen_insn(bytes+i, op, addr+i, + __x86_indirect_jump_thunk_array[reg], + JMP32_INSN_SIZE); + i +=3D JMP32_INSN_SIZE; + break; + + default: + BUG(); + } + + WARN_ON_ONCE(i !=3D insn->length); + + return i; +} + /* * Rewrite the compiler generated retpoline thunk calls. * @@ -408,11 +440,16 @@ static int patch_retpoline(void *addr, s /* If anyone ever does: CALL/JMP *%rsp, we're in deep trouble. */ BUG_ON(reg =3D=3D 4); =20 + op =3D insn->opcode.bytes[0]; + if (cpu_feature_enabled(X86_FEATURE_RETPOLINE) && - !cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) + !cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { + if (cpu_feature_enabled(X86_FEATURE_CALL_DEPTH)) { + i +=3D emit_call_track_retpoline(addr, insn, reg, bytes); + return i; + } return -1; - - op =3D insn->opcode.bytes[0]; + } =20 /* * Convert: --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -13,17 +13,18 @@ =20 .section .text.__x86.indirect_thunk =20 -.macro RETPOLINE reg + +.macro POLINE reg ANNOTATE_INTRA_FUNCTION_CALL call .Ldo_rop_\@ -.Lspec_trap_\@: - UNWIND_HINT_EMPTY - pause - lfence - jmp .Lspec_trap_\@ + int3 .Ldo_rop_\@: mov %\reg, (%_ASM_SP) UNWIND_HINT_FUNC +.endm + +.macro RETPOLINE reg + POLINE \reg RET .endm =20 @@ -53,7 +54,6 @@ SYM_INNER_LABEL(__x86_indirect_thunk_\re */ =20 #define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym) -#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg) =20 .align RETPOLINE_THUNK_SIZE SYM_CODE_START(__x86_indirect_thunk_array) @@ -65,10 +65,65 @@ SYM_CODE_START(__x86_indirect_thunk_arra .align RETPOLINE_THUNK_SIZE SYM_CODE_END(__x86_indirect_thunk_array) =20 -#define GEN(reg) EXPORT_THUNK(reg) +#define GEN(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg) +#include +#undef GEN + +#ifdef CONFIG_CALL_DEPTH_TRACKING +.macro CALL_THUNK reg + .align RETPOLINE_THUNK_SIZE + +SYM_INNER_LABEL(__x86_indirect_call_thunk_\reg, SYM_L_GLOBAL) + UNWIND_HINT_EMPTY + ANNOTATE_NOENDBR + + 
CALL_DEPTH_ACCOUNT + POLINE \reg + ANNOTATE_UNRET_SAFE + ret + int3 +.endm + + .align RETPOLINE_THUNK_SIZE +SYM_CODE_START(__x86_indirect_call_thunk_array) + +#define GEN(reg) CALL_THUNK reg +#include +#undef GEN + + .align RETPOLINE_THUNK_SIZE +SYM_CODE_END(__x86_indirect_call_thunk_array) + +#define GEN(reg) __EXPORT_THUNK(__x86_indirect_call_thunk_ ## reg) #include #undef GEN =20 +.macro JUMP_THUNK reg + .align RETPOLINE_THUNK_SIZE + +SYM_INNER_LABEL(__x86_indirect_jump_thunk_\reg, SYM_L_GLOBAL) + UNWIND_HINT_EMPTY + ANNOTATE_NOENDBR + POLINE \reg + ANNOTATE_UNRET_SAFE + ret + int3 +.endm + + .align RETPOLINE_THUNK_SIZE +SYM_CODE_START(__x86_indirect_jump_thunk_array) + +#define GEN(reg) JUMP_THUNK reg +#include +#undef GEN + + .align RETPOLINE_THUNK_SIZE +SYM_CODE_END(__x86_indirect_jump_thunk_array) + +#define GEN(reg) __EXPORT_THUNK(__x86_indirect_jump_thunk_ ## reg) +#include +#undef GEN +#endif /* * This function name is magical and is used by -mfunction-return=3Dthunk-extern * for the compiler to generate JMPs to it.
--- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -417,7 +417,10 @@ static void emit_indirect_jump(u8 **ppro EMIT2(0xFF, 0xE0 + reg); } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) { OPTIMIZER_HIDE_VAR(reg); - emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip); + if (cpu_feature_enabled(X86_FEATURE_CALL_DEPTH)) + emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg], ip); + else + emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip); } else { EMIT2(0xFF, 0xE0 + reg); } From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 08B9FC433EF for ; Sat, 16 Jul 2022 23:19:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233318AbiGPXTh (ORCPT ); Sat, 16 Jul 2022 19:19:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39966 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233206AbiGPXSf (ORCPT ); Sat, 16 Jul 2022 19:18:35 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2791023BD3 for ; Sat, 16 Jul 2022 16:18:05 -0700 (PDT) Message-ID: <20220716230954.470918864@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013479; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=QMc5vSkp40P2d+jY5LFr071gTHJY7fEdf/VlKzlHVVI=; b=4IQkCHfHf1/mvq1CUwZOV29fe86HDSfLQZ37SslVG18eluh8arFjSxoTnb+67JsQMGTeUE KGWIWof9HQ39phxWZnGnw2NQyOxYBzQ2NHjKfSuyo8umK2clI/NVdzRmL2I1MqQBiOPrzU EupHbjkDsrwgWL9Vfxui9R0d5Ho6xFsLirL56eUQ9NSAgN61MlzEJ/VNA40J6mySlfWNbp pULWGP1xlBMIk+kcNWy/3fGoSYlBfyDDCiDBaJyohgxynyi2+KhdfbZwjhY4vJA/rsH3PB 
From: Thomas Gleixner
Subject: [patch 30/38] x86/retbleed: Add SKL call thunk
Date: Sun, 17 Jul 2022 01:17:58 +0200 (CEST)

Add the actual SKL call thunk for call depth accounting.
Signed-off-by: Thomas Gleixner
---
 arch/x86/kernel/callthunks.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -52,6 +52,24 @@ struct thunk_desc {

 static struct thunk_desc callthunk_desc __ro_after_init;

+asm (
+	".pushsection .rodata			\n"
+	".global skl_call_thunk_template	\n"
+	"skl_call_thunk_template:		\n"
+	__stringify(INCREMENT_CALL_DEPTH)"	\n"
+	".global skl_call_thunk_tail		\n"
+	"skl_call_thunk_tail:			\n"
+	".popsection				\n"
+);
+
+extern u8 skl_call_thunk_template[];
+extern u8 skl_call_thunk_tail[];
+
+#define SKL_TMPL_SIZE \
+	((unsigned int)(skl_call_thunk_tail - skl_call_thunk_template))
+#define SKL_CALLTHUNK_CODE_SIZE	(SKL_TMPL_SIZE + JMP32_INSN_SIZE + INT3_INSN_SIZE)
+#define SKL_CALLTHUNK_SIZE	roundup_pow_of_two(SKL_CALLTHUNK_CODE_SIZE)
+
 struct thunk_mem {
 	void *base;
 	unsigned int size;
@@ -447,6 +465,12 @@ static __init noinline void callthunks_i
 {
 	int ret;

+	if (cpu_feature_enabled(X86_FEATURE_CALL_DEPTH)) {
+		callthunk_desc.template = skl_call_thunk_template;
+		callthunk_desc.template_size = SKL_TMPL_SIZE;
+		callthunk_desc.thunk_size = SKL_CALLTHUNK_SIZE;
+	}
+
 	if (!callthunk_desc.template)
 		return;
From: Thomas Gleixner
Subject: [patch 31/38] x86/calldepth: Add ret/call counting for debug
Date: Sun, 17 Jul 2022 01:18:00 +0200 (CEST)

Add a debugfs mechanism to validate the accounting, e.g. against the
call/ret balance, and to gather statistics about the stuffing-to-call
ratio.
Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/nospec-branch.h | 32 +++++++++++++++++++--
 arch/x86/kernel/callthunks.c | 51 +++++++++++++++++++++++++++++++++++
 arch/x86/lib/retpoline.S | 7 ++++
 3 files changed, 86 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -77,6 +77,23 @@
  */

 #ifdef CONFIG_CALL_DEPTH_TRACKING
+
+#ifdef CONFIG_CALL_THUNKS_DEBUG
+# define CALL_THUNKS_DEBUG_INC_CALLS				\
+	incq	%gs:__x86_call_count;
+# define CALL_THUNKS_DEBUG_INC_RETS				\
+	incq	%gs:__x86_ret_count;
+# define CALL_THUNKS_DEBUG_INC_STUFFS				\
+	incq	%gs:__x86_stuffs_count;
+# define CALL_THUNKS_DEBUG_INC_CTXSW				\
+	incq	%gs:__x86_ctxsw_count;
+#else
+# define CALL_THUNKS_DEBUG_INC_CALLS
+# define CALL_THUNKS_DEBUG_INC_RETS
+# define CALL_THUNKS_DEBUG_INC_STUFFS
+# define CALL_THUNKS_DEBUG_INC_CTXSW
+#endif
+
 #define CREDIT_CALL_DEPTH					\
	movq	$-1, PER_CPU_VAR(__x86_call_depth);

@@ -88,10 +105,12 @@
 #define RESET_CALL_DEPTH_FROM_CALL				\
	mov	$0xfc, %rax;					\
	shl	$56, %rax;					\
-	movq	%rax, PER_CPU_VAR(__x86_call_depth);
+	movq	%rax, PER_CPU_VAR(__x86_call_depth);		\
+	CALL_THUNKS_DEBUG_INC_CALLS

 #define INCREMENT_CALL_DEPTH					\
-	sarq	$5, %gs:__x86_call_depth
+	sarq	$5, %gs:__x86_call_depth;			\
+	CALL_THUNKS_DEBUG_INC_CALLS
 #else
 #define CREDIT_CALL_DEPTH
 #define RESET_CALL_DEPTH
@@ -127,7 +146,8 @@
	dec	reg;						\
	jnz	771b;						\
								\
-	CREDIT_CALL_DEPTH
+	CREDIT_CALL_DEPTH					\
+	CALL_THUNKS_DEBUG_INC_CTXSW

 #ifdef __ASSEMBLY__

@@ -274,6 +294,12 @@ static inline void x86_set_skl_return_th
 }

 DECLARE_PER_CPU(u64, __x86_call_depth);
+#ifdef CONFIG_CALL_THUNKS_DEBUG
+DECLARE_PER_CPU(u64, __x86_call_count);
+DECLARE_PER_CPU(u64, __x86_ret_count);
+DECLARE_PER_CPU(u64, __x86_stuffs_count);
+DECLARE_PER_CPU(u64, __x86_ctxsw_count);
+#endif
 #else
 static inline void x86_set_skl_return_thunk(void) {}
 #endif
--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -3,6 +3,7 @@
 #define pr_fmt(fmt)
"callthunks: " fmt

 #include
+#include <linux/debugfs.h>
 #include
 #include
 #include
@@ -32,6 +33,13 @@ static int __init debug_thunks(char *str
	return 1;
 }
 __setup("debug-callthunks", debug_thunks);
+
+DEFINE_PER_CPU(u64, __x86_call_count);
+DEFINE_PER_CPU(u64, __x86_ret_count);
+DEFINE_PER_CPU(u64, __x86_stuffs_count);
+DEFINE_PER_CPU(u64, __x86_ctxsw_count);
+EXPORT_SYMBOL_GPL(__x86_ctxsw_count);
+
 #else
 #define prdbg(fmt, args...) do { } while (0)
 #endif
@@ -530,3 +538,46 @@ void callthunks_module_free(struct modul
	mutex_unlock(&text_mutex);
 }
 #endif /* CONFIG_MODULES */
+
+#if defined(CONFIG_CALL_THUNKS_DEBUG) && defined(CONFIG_DEBUG_FS)
+static int callthunks_debug_show(struct seq_file *m, void *p)
+{
+	unsigned long cpu = (unsigned long)m->private;
+
+	seq_printf(m, "C: %16llu R: %16llu S: %16llu X: %16llu\n",
+		   per_cpu(__x86_call_count, cpu),
+		   per_cpu(__x86_ret_count, cpu),
+		   per_cpu(__x86_stuffs_count, cpu),
+		   per_cpu(__x86_ctxsw_count, cpu));
+	return 0;
+}
+
+static int callthunks_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, callthunks_debug_show, inode->i_private);
+}
+
+static const struct file_operations dfs_ops = {
+	.open		= callthunks_debug_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static int __init callthunks_debugfs_init(void)
+{
+	struct dentry *dir;
+	unsigned long cpu;
+
+	dir = debugfs_create_dir("callthunks", NULL);
+	for_each_possible_cpu(cpu) {
+		void *arg = (void *)cpu;
+		char name[10];
+
+		sprintf(name, "cpu%lu", cpu);
+		debugfs_create_file(name, 0644, dir, arg, &dfs_ops);
+	}
+	return 0;
+}
+__initcall(callthunks_debugfs_init);
+#endif
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -202,13 +202,18 @@ EXPORT_SYMBOL(__x86_return_thunk)
	.align 64
 SYM_FUNC_START(__x86_return_skl)
	ANNOTATE_NOENDBR
-	/* Keep the hotpath in a 16byte I-fetch */
+	/*
+	 * Keep the hotpath in a 16-byte I-fetch for the non-debug
+	 * case.
+	 */
+	CALL_THUNKS_DEBUG_INC_RETS
	shlq	$5, PER_CPU_VAR(__x86_call_depth)
	jz	1f
	ANNOTATE_UNRET_SAFE
	ret
	int3
 1:
+	CALL_THUNKS_DEBUG_INC_STUFFS
	.rept	16
	ANNOTATE_INTRA_FUNCTION_CALL
	call	2f
From: Thomas Gleixner
Subject: [patch 32/38] static_call: Add call depth tracking support
Date: Sun, 17 Jul 2022 01:18:01 +0200 (CEST)

From: Peter Zijlstra

When indirect calls are switched to direct calls, it has to be ensured
that the call target is not the function itself but its call thunk, when
call depth tracking is enabled. But static calls are available before
call thunks have been set up.

Ensure a second run through the static call patching code after call
thunks have been created. When call thunks are not enabled this has no
side effects.
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/alternative.h | 5 +++++
 arch/x86/kernel/callthunks.c | 37 +++++++++++++++++++++++++++++++++++
 arch/x86/kernel/static_call.c | 1 +
 include/linux/static_call.h | 2 ++
 kernel/static_call_inline.c | 23 ++++++++++++++++++-----
 5 files changed, 63 insertions(+), 5 deletions(-)

--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -93,12 +93,17 @@ extern void callthunks_patch_builtin_cal
 extern void callthunks_patch_module_calls(struct callthunk_sites *sites,
					  struct module *mod);
 extern void callthunks_module_free(struct module *mod);
+extern void *callthunks_translate_call_dest(void *dest);
 #else
 static __always_inline void callthunks_patch_builtin_calls(void) {}
 static __always_inline void
 callthunks_patch_module_calls(struct callthunk_sites *sites,
			      struct module *mod) {}
 static __always_inline void callthunks_module_free(struct module *mod) { }
+static __always_inline void *callthunks_translate_call_dest(void *dest)
+{
+	return dest;
+}
 #endif

 #ifdef CONFIG_SMP
--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include <linux/static_call.h>
 #include

 #include
@@ -492,6 +493,7 @@ static __init noinline void callthunks_i
	if (WARN_ON_ONCE(ret))
		return;

+	static_call_force_reinit();
	thunks_initialized = true;
 }

@@ -511,6 +513,41 @@ void __init callthunks_patch_builtin_cal
	mutex_unlock(&text_mutex);
 }

+static bool is_module_init_dest(void *dest)
+{
+	bool ret = false;
+
+#ifdef CONFIG_MODULES
+	struct module *mod;
+
+	preempt_disable();
+	mod = __module_address((unsigned long)dest);
+	if (mod && within_module_init((unsigned long)dest, mod))
+		ret = true;
+	preempt_enable();
+#endif
+	return ret;
+}
+
+void *callthunks_translate_call_dest(void *dest)
+{
+	void *thunk;
+
+	lockdep_assert_held(&text_mutex);
+
+	if (!thunks_initialized || skip_addr(dest))
+		return dest;
+
+	thunk =
btree_lookup64(&call_thunks, (unsigned long)dest);
+
+	if (thunk)
+		return thunk;
+
+	WARN_ON_ONCE(!is_kernel_inittext((unsigned long)dest) &&
+		     !is_module_init_dest(dest));
+	return dest;
+}
+
 #ifdef CONFIG_MODULES
 void noinline callthunks_patch_module_calls(struct callthunk_sites *cs,
					    struct module *mod)
--- a/arch/x86/kernel/static_call.c
+++ b/arch/x86/kernel/static_call.c
@@ -34,6 +34,7 @@ static void __ref __static_call_transfor

	switch (type) {
	case CALL:
+		func = callthunks_translate_call_dest(func);
		code = text_gen_insn(CALL_INSN_OPCODE, insn, func);
		if (func == &__static_call_return0) {
			emulate = code;
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -162,6 +162,8 @@ extern void arch_static_call_transform(v

 extern int __init static_call_init(void);

+extern void static_call_force_reinit(void);
+
 struct static_call_mod {
	struct static_call_mod *next;
	struct module *mod; /* for vmlinux, mod == NULL */
--- a/kernel/static_call_inline.c
+++ b/kernel/static_call_inline.c
@@ -15,7 +15,18 @@ extern struct static_call_site __start_s
 extern struct static_call_tramp_key __start_static_call_tramp_key[],
				    __stop_static_call_tramp_key[];

-static bool static_call_initialized;
+static int static_call_initialized;
+
+/*
+ * Must be called before early_initcall() to be effective.
+ */
+void static_call_force_reinit(void)
+{
+	if (WARN_ON_ONCE(!static_call_initialized))
+		return;
+
+	static_call_initialized++;
+}

 /* mutex to protect key modules/sites */
 static DEFINE_MUTEX(static_call_mutex);
@@ -475,7 +486,8 @@ int __init static_call_init(void)
 {
	int ret;

-	if (static_call_initialized)
+	/* See static_call_force_reinit().
+	 */
+	if (static_call_initialized == 1)
		return 0;

	cpus_read_lock();
@@ -490,11 +502,12 @@ int __init static_call_init(void)
		BUG();
	}

-	static_call_initialized = true;
-
 #ifdef CONFIG_MODULES
-	register_module_notifier(&static_call_module_nb);
+	if (!static_call_initialized)
+		register_module_notifier(&static_call_module_nb);
 #endif
+
+	static_call_initialized = 1;
	return 0;
 }
 early_initcall(static_call_init);
From: Thomas Gleixner
Subject: [patch 33/38] kallsyms: Take callthunks into account
Date: Sun, 17 Jul 2022 01:18:03 +0200 (CEST)

From: Peter Zijlstra

Similar to ftrace and bpf, call thunks create symbols which are
interesting for things like printing stack traces, perf and
live patching.

Add the required functionality to the core and implement it for x86.

Callthunks report the same function name as their target, but their
module name is "callthunk" or "callthunk:${modname}" for modules.
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Thomas Gleixner
Cc: Masami Hiramatsu
---
 arch/x86/kernel/callthunks.c | 155 +++++++++++++++++++++++++++++++++++
 include/linux/kallsyms.h | 24 ++++++
 kernel/kallsyms.c | 23 ++++++
 3 files changed, 202 insertions(+)

--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -4,6 +4,7 @@

 #include
 #include
+#include <linux/kallsyms.h>
 #include
 #include
 #include
@@ -548,6 +549,160 @@ void *callthunks_translate_call_dest(voi
	return dest;
 }

+static bool is_module_callthunk(void *addr)
+{
+	bool ret = false;
+
+#ifdef CONFIG_MODULES
+	struct module *mod;
+
+	preempt_disable();
+	mod = __module_address((unsigned long)addr);
+	if (mod && within_module_thunk((unsigned long)addr, mod))
+		ret = true;
+	preempt_enable();
+#endif
+	return ret;
+}
+
+static bool is_callthunk(void *addr)
+{
+	if (builtin_layout.base <= addr &&
+	    addr < builtin_layout.base + builtin_layout.size)
+		return true;
+	return is_module_callthunk(addr);
+}
+
+static void *__callthunk_dest(void *addr)
+{
+	unsigned long mask = callthunk_desc.thunk_size - 1;
+	void *thunk;
+
+	thunk = (void *)((unsigned long)addr & ~mask);
+	thunk += callthunk_desc.template_size;
+	return jump_get_dest(thunk);
+}
+
+static void *callthunk_dest(void *addr)
+{
+	if (!is_callthunk(addr))
+		return NULL;
+	return __callthunk_dest(addr);
+}
+
+static void set_modname(char **modname, unsigned long addr)
+{
+	if (!modname || !IS_ENABLED(CONFIG_MODULES))
+		*modname = "callthunk";
+
+#ifdef CONFIG_MODULES
+	else {
+		struct module *mod;
+
+		preempt_disable();
+		mod = __module_address(addr);
+		*modname = mod->callthunk_name;
+		preempt_enable();
+	}
+#endif
+}
+
+const char *
+callthunk_address_lookup(unsigned long addr, unsigned long *size,
+			 unsigned long *off, char **modname, char *sym)
+{
+	unsigned long dest, mask = callthunk_desc.thunk_size - 1;
+	const char *ret;
+
+	if (!thunks_initialized)
+		return NULL;
+
+	dest = (unsigned
long)callthunk_dest((void *)addr);
+	if (!dest)
+		return NULL;
+
+	ret = kallsyms_lookup(dest, size, off, modname, sym);
+	if (!ret)
+		return NULL;
+
+	*off = addr & mask;
+	*size = callthunk_desc.thunk_size;
+
+	set_modname(modname, addr);
+	return ret;
+}
+
+static int get_module_thunk(char **modname, struct module_layout **layoutp,
+			    unsigned int symthunk)
+{
+#ifdef CONFIG_MODULES
+	extern struct list_head modules;
+	struct module *mod;
+	unsigned int size;
+
+	symthunk -= (*layoutp)->text_size;
+	list_for_each_entry_rcu(mod, &modules, list) {
+		if (mod->state == MODULE_STATE_UNFORMED)
+			continue;
+
+		*layoutp = &mod->thunk_layout;
+		size = mod->thunk_layout.text_size;
+
+		if (symthunk >= size) {
+			symthunk -= size;
+			continue;
+		}
+		*modname = mod->callthunk_name;
+		return symthunk;
+	}
+#endif
+	return -ERANGE;
+}
+
+int callthunk_get_kallsym(unsigned int symnum, unsigned long *value,
+			  char *type, char *name, char *module_name,
+			  int *exported)
+{
+	int symthunk = symnum * callthunk_desc.thunk_size;
+	struct module_layout *layout = &builtin_layout;
+	char *modname = "callthunk";
+	void *thunk, *dest;
+	int ret = -ERANGE;
+
+	if (!thunks_initialized)
+		return -ERANGE;
+
+	preempt_disable();
+
+	if (symthunk >= layout->text_size) {
+		symthunk = get_module_thunk(&modname, &layout, symthunk);
+		if (symthunk < 0)
+			goto out;
+	}
+
+	thunk = layout->base + symthunk;
+	dest = __callthunk_dest(thunk);
+
+	if (!dest) {
+		strlcpy(name, "(unknown callthunk)", KSYM_NAME_LEN);
+		ret = 0;
+		goto out;
+	}
+
+	ret = lookup_symbol_name((unsigned long)dest, name);
+	if (ret)
+		goto out;
+
+	*value = (unsigned long)thunk;
+	*exported = 0;
+	*type = 't';
+	strlcpy(module_name, modname, MODULE_NAME_LEN);
+
+out:
+	preempt_enable();
+	return ret;
+}
+
 #ifdef CONFIG_MODULES
 void noinline callthunks_patch_module_calls(struct callthunk_sites *cs,
					    struct module *mod)
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -65,6
+65,30 @@ static inline void *dereference_symbol_d
	return ptr;
 }

+#ifdef CONFIG_CALL_THUNKS
+extern const char *
+callthunk_address_lookup(unsigned long addr, unsigned long *size,
+			 unsigned long *off, char **modname, char *sym);
+extern int callthunk_get_kallsym(unsigned int symnum, unsigned long *value,
+				 char *type, char *name, char *module_name,
+				 int *exported);
+#else
+static inline const char *
+callthunk_address_lookup(unsigned long addr, unsigned long *size,
+			 unsigned long *off, char **modname, char *sym)
+{
+	return NULL;
+}
+
+static inline
+int callthunk_get_kallsym(unsigned int symnum, unsigned long *value,
+			  char *type, char *name, char *module_name,
+			  int *exported)
+{
+	return -1;
+}
+#endif
+
 #ifdef CONFIG_KALLSYMS
 int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
				      unsigned long),
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -365,6 +365,10 @@ static const char *kallsyms_lookup_build
		ret = ftrace_mod_address_lookup(addr, symbolsize,
						offset, modname, namebuf);

+	if (!ret)
+		ret = callthunk_address_lookup(addr, symbolsize,
+					       offset, modname, namebuf);
+
 found:
	cleanup_symbol_name(namebuf);
	return ret;
@@ -578,6 +582,7 @@ struct kallsym_iter {
	loff_t pos_mod_end;
	loff_t pos_ftrace_mod_end;
	loff_t pos_bpf_end;
+	loff_t pos_callthunk_end;
	unsigned long value;
	unsigned int nameoff; /* If iterating in core kernel symbols.
				 */
	char type;
@@ -657,6 +662,20 @@ static int get_ksymbol_bpf(struct kallsy
	return 1;
 }

+static int get_ksymbol_callthunk(struct kallsym_iter *iter)
+{
+	int ret = callthunk_get_kallsym(iter->pos - iter->pos_bpf_end,
+					&iter->value, &iter->type,
+					iter->name, iter->module_name,
+					&iter->exported);
+	if (ret < 0) {
+		iter->pos_callthunk_end = iter->pos;
+		return 0;
+	}
+
+	return 1;
+}
+
 /*
  * This uses "__builtin__kprobes" as a module name for symbols for pages
  * allocated for kprobes' purposes, even though "__builtin__kprobes" is not a
@@ -724,6 +743,10 @@ static int update_iter_mod(struct kallsy
	    get_ksymbol_bpf(iter))
		return 1;

+	if ((!iter->pos_callthunk_end || iter->pos_callthunk_end > pos) &&
+	    get_ksymbol_callthunk(iter))
+		return 1;
+
	return get_ksymbol_kprobe(iter);
 }
From: Thomas Gleixner
Subject: [patch 34/38] x86/orc: Make it callthunk aware
Date: Sun, 17 Jul 2022 01:18:05 +0200 (CEST)

From: Peter Zijlstra

Callthunk addresses on the stack would confuse the ORC unwinder. Handle
them correctly and tell ORC to proceed further down the stack.
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Thomas Gleixner
Cc: Josh Poimboeuf
---
 arch/x86/include/asm/alternative.h | 5 +++++
 arch/x86/kernel/callthunks.c | 2 +-
 arch/x86/kernel/unwind_orc.c | 21 ++++++++++++++++++++-
 3 files changed, 26 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -94,6 +94,7 @@ extern void callthunks_patch_module_call
					  struct module *mod);
 extern void callthunks_module_free(struct module *mod);
 extern void *callthunks_translate_call_dest(void *dest);
+extern bool is_callthunk(void *addr);
 #else
 static __always_inline void callthunks_patch_builtin_calls(void) {}
 static __always_inline void
@@ -104,6 +105,10 @@ static __always_inline void *callthunks_
 {
	return dest;
 }
+static __always_inline bool is_callthunk(void *addr)
+{
+	return false;
+}
 #endif

 #ifdef CONFIG_SMP
--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -565,7 +565,7 @@ static bool is_module_callthunk(void *ad
	return ret;
 }

-static bool is_callthunk(void *addr)
+bool is_callthunk(void *addr)
 {
	if (builtin_layout.base <= addr &&
	    addr < builtin_layout.base + builtin_layout.size)
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -131,6 +131,21 @@ static struct orc_entry null_orc_entry =
	.type = UNWIND_HINT_TYPE_CALL
 };

+#ifdef CONFIG_CALL_THUNKS
+static struct orc_entry *orc_callthunk_find(unsigned long ip)
+{
+	if (!is_callthunk((void *)ip))
+		return NULL;
+
+	return &null_orc_entry;
+}
+#else
+static struct orc_entry *orc_callthunk_find(unsigned long ip)
+{
+	return NULL;
+}
+#endif
+
 /* Fake frame pointer entry -- used as a fallback for generated code */
 static struct orc_entry orc_fp_entry = {
	.type = UNWIND_HINT_TYPE_CALL,
@@ -184,7 +199,11 @@ static struct orc_entry *orc_find(unsign
	if (orc)
		return orc;

-	return orc_ftrace_find(ip);
+	orc = orc_ftrace_find(ip);
+	if (orc)
+		return orc;
+
+	return orc_callthunk_find(ip);
 }
 #ifdef CONFIG_MODULES
LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" , Masami Hiramatsu Subject: [patch 35/38] kprobes: Add callthunk blacklisting References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:18:06 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra Callthunks are not safe for probing. Add them to the kprobes black listed areas. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner Cc: Masami Hiramatsu --- arch/x86/kernel/callthunks.c | 5 ++++ kernel/kprobes.c | 52 +++++++++++++++++++++++++++-----------= ----- 2 files changed, 38 insertions(+), 19 deletions(-) --- a/arch/x86/kernel/callthunks.c +++ b/arch/x86/kernel/callthunks.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -476,6 +477,7 @@ static __init_or_module int callthunks_s =20 static __init noinline void callthunks_init(struct callthunk_sites *cs) { + unsigned long base, size; int ret; =20 if (cpu_feature_enabled(X86_FEATURE_CALL_DEPTH)) { @@ -494,6 +496,9 @@ static __init noinline void callthunks_i if (WARN_ON_ONCE(ret)) return; =20 + base =3D (unsigned long)builtin_layout.base; + size =3D builtin_layout.size; + kprobe_add_area_blacklist(base, base + size); static_call_force_reinit(); thunks_initialized =3D true; } --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -2439,40 +2439,38 @@ void dump_kprobe(struct kprobe *kp) } NOKPROBE_SYMBOL(dump_kprobe); =20 -int kprobe_add_ksym_blacklist(unsigned long entry) +static int __kprobe_add_ksym_blacklist(unsigned long start, unsigned long = end) { struct kprobe_blacklist_entry *ent; - unsigned long offset =3D 0, size =3D 0; - - if (!kernel_text_address(entry) || - 
!kallsyms_lookup_size_offset(entry, &size, &offset)) - return -EINVAL; =20 ent =3D kmalloc(sizeof(*ent), GFP_KERNEL); if (!ent) return -ENOMEM; - ent->start_addr =3D entry; - ent->end_addr =3D entry + size; + ent->start_addr =3D start; + ent->end_addr =3D end; INIT_LIST_HEAD(&ent->list); list_add_tail(&ent->list, &kprobe_blacklist); =20 - return (int)size; + return (int)(end - start); +} + +int kprobe_add_ksym_blacklist(unsigned long entry) +{ + unsigned long offset =3D 0, size =3D 0; + + if (!kernel_text_address(entry) || + !kallsyms_lookup_size_offset(entry, &size, &offset)) + return -EINVAL; + + return __kprobe_add_ksym_blacklist(entry, entry + size); } =20 /* Add all symbols in given area into kprobe blacklist */ int kprobe_add_area_blacklist(unsigned long start, unsigned long end) { - unsigned long entry; - int ret =3D 0; + int ret =3D __kprobe_add_ksym_blacklist(start, end); =20 - for (entry =3D start; entry < end; entry +=3D ret) { - ret =3D kprobe_add_ksym_blacklist(entry); - if (ret < 0) - return ret; - if (ret =3D=3D 0) /* In case of alias symbol */ - ret =3D 1; - } - return 0; + return ret < 0 ? 
ret : 0; } =20 /* Remove all symbols in given area from kprobe blacklist */ @@ -2578,6 +2576,14 @@ static void add_module_kprobe_blacklist( end =3D start + mod->noinstr_text_size; kprobe_add_area_blacklist(start, end); } + +#ifdef CONFIG_CALL_THUNKS + start =3D (unsigned long)mod->thunk_layout.base; + if (start) { + end =3D start + mod->thunk_layout.size; + kprobe_remove_area_blacklist(start, end); + } +#endif } =20 static void remove_module_kprobe_blacklist(struct module *mod) @@ -2601,6 +2607,14 @@ static void remove_module_kprobe_blackli end =3D start + mod->noinstr_text_size; kprobe_remove_area_blacklist(start, end); } + +#ifdef CONFIG_CALL_THUNKS + start =3D (unsigned long)mod->thunk_layout.base; + if (start) { + end =3D start + mod->thunk_layout.size; + kprobe_remove_area_blacklist(start, end); + } +#endif } =20 /* Module notifier call back, checking kprobes on the module */ From nobody Sat Apr 18 05:54:57 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17503C43334 for ; Sat, 16 Jul 2022 23:20:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232964AbiGPXT7 (ORCPT ); Sat, 16 Jul 2022 19:19:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39970 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233215AbiGPXTd (ORCPT ); Sat, 16 Jul 2022 19:19:33 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D455F24F1C for ; Sat, 16 Jul 2022 16:18:16 -0700 (PDT) Message-ID: <20220716230954.835254576@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1658013488; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=PII+cwKLyD+P05ol3oLjiYv5SJEasmDF6oIXlLdij2g=; b=m3VUJtBcCROOUBGwbEBONJJigGj+W4WZzj2RctcUR0DtY0EwqJwcsUMfq02ZAiem7/7qP9 Qi+0E/ceGixkq0tkSek6amfM+oBHnAES3cKxJw5nSTDJ39pQj6t2SpngsmnZA07VawP+yZ reCpZXAFL1PueX5Qxr7RFlO3drmIVLAU3lNCzA9xYZg7ZSvjmhcABXwFmuHw8dEYGYrVcC tIUYyO+9wMQ9KuXa+O+S9NqS5FlmwpBvvtwdlpwT9uTTtRseiaPonkZlSXXtoNSMDyoNHO dwxlQCHQCyvDfKUpPY8Pk9CHREJJqqE73KyUJMGXbt9z57whZ8ISvOqXt7Bjzw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1658013488; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=PII+cwKLyD+P05ol3oLjiYv5SJEasmDF6oIXlLdij2g=; b=X932vk2axR6UO80rh2//i3Y151zee7HakzWAlj5t2aT1m+U/Jbzz3AnvU/GTdAiy31fk9D Ufo1iUX29DwmcECg== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Linus Torvalds , Tim Chen , Josh Poimboeuf , Andrew Cooper , Pawan Gupta , Johannes Wikner , Alyssa Milburn , Jann Horn , "H.J. Lu" , Joao Moreira , Joseph Nuzman , Steven Rostedt , "Peter Zijlstra (Intel)" Subject: [patch 36/38] x86/ftrace: Make it call depth tracking aware References: <20220716230344.239749011@linutronix.de> MIME-Version: 1.0 Date: Sun, 17 Jul 2022 01:18:08 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra Since ftrace has trampolines, don't use thunks for the __fentry__ site but instead require that every function called from there includes accounting. This very much includes all the direct-call functions. Additionally, ftrace uses ROP tricks in two places: - return_to_handler(), and - ftrace_regs_caller() when pt_regs->orig_ax is set by a direct-call. 
return_to_handler() already uses a retpoline to replace an indirect jump
in order to defeat IBT. Since this is a jump-type retpoline, make sure
there is no accounting done and ALTERNATIVE the RET into a ret.

ftrace_regs_caller() does much the same, but currently causes an RSB
imbalance by effectively doing a PUSH+RET combo; rebalance it.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/nospec-branch.h        |  8 +++++++
 arch/x86/kernel/ftrace.c                    | 16 ++++++++++----
 arch/x86/kernel/ftrace_64.S                 | 31 +++++++++++++++++++++++++--
 arch/x86/net/bpf_jit_comp.c                 |  6 +++++
 kernel/trace/trace_selftest.c               |  5 +++-
 samples/ftrace/ftrace-direct-modify.c       |  2 +
 samples/ftrace/ftrace-direct-multi-modify.c |  2 +
 samples/ftrace/ftrace-direct-multi.c        |  1
 samples/ftrace/ftrace-direct-too.c          |  1
 samples/ftrace/ftrace-direct.c              |  1
 10 files changed, 66 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -293,6 +293,11 @@ static inline void x86_set_skl_return_th
 	x86_return_thunk = &__x86_return_skl;
 }

+#define CALL_DEPTH_ACCOUNT					\
+	ALTERNATIVE("",						\
+		    __stringify(INCREMENT_CALL_DEPTH),		\
+		    X86_FEATURE_CALL_DEPTH)
+
 DECLARE_PER_CPU(u64, __x86_call_depth);
 #ifdef CONFIG_CALL_THUNKS_DEBUG
 DECLARE_PER_CPU(u64, __x86_call_count);
@@ -302,6 +307,9 @@ DECLARE_PER_CPU(u64, __x86_ctxsw_count);
 #endif
 #else
 static inline void x86_set_skl_return_thunk(void) {}
+
+#define CALL_DEPTH_ACCOUNT
+
 #endif

 #ifdef CONFIG_RETPOLINE
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -69,6 +69,10 @@ static const char *ftrace_nop_replace(vo

 static const char *ftrace_call_replace(unsigned long ip, unsigned long addr)
 {
+	/*
+	 * No need to translate into a callthunk. The trampoline does
+	 * the depth accounting itself.
+	 */
 	return text_gen_insn(CALL_INSN_OPCODE, (void *)ip, (void *)addr);
 }

@@ -316,7 +320,7 @@ create_trampoline(struct ftrace_ops *ops
 	unsigned long size;
 	unsigned long *ptr;
 	void *trampoline;
-	void *ip;
+	void *ip, *dest;
 	/* 48 8b 15 is movq (%rip), %rdx */
 	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
 	unsigned const char retq[] = { RET_INSN_OPCODE, INT3_INSN_OPCODE };
@@ -403,10 +407,14 @@ create_trampoline(struct ftrace_ops *ops
 	/* put in the call to the function */
 	mutex_lock(&text_mutex);
 	call_offset -= start_offset;
+	/*
+	 * No need to translate into a callthunk. The trampoline does
+	 * the depth accounting before the call already.
+	 */
+	dest = ftrace_ops_get_func(ops);
 	memcpy(trampoline + call_offset,
-	       text_gen_insn(CALL_INSN_OPCODE,
-			     trampoline + call_offset,
-			     ftrace_ops_get_func(ops)), CALL_INSN_SIZE);
+	       text_gen_insn(CALL_INSN_OPCODE, trampoline + call_offset, dest),
+	       CALL_INSN_SIZE);
 	mutex_unlock(&text_mutex);

 	/* ALLOC_TRAMP flags lets us know we created it */
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -132,6 +132,7 @@
 #ifdef CONFIG_DYNAMIC_FTRACE

 SYM_FUNC_START(__fentry__)
+	CALL_DEPTH_ACCOUNT
 	RET
 SYM_FUNC_END(__fentry__)
 EXPORT_SYMBOL(__fentry__)
@@ -140,6 +141,8 @@ SYM_FUNC_START(ftrace_caller)
 	/* save_mcount_regs fills in first two parameters */
 	save_mcount_regs

+	CALL_DEPTH_ACCOUNT
+
 	/* Stack - skipping return address of ftrace_caller */
 	leaq MCOUNT_REG_SIZE+8(%rsp), %rcx
 	movq %rcx, RSP(%rsp)
@@ -155,6 +158,9 @@ SYM_INNER_LABEL(ftrace_caller_op_ptr, SY
 	/* Only ops with REGS flag set should have CS register set */
 	movq $0, CS(%rsp)

+	/* Account for the function call below */
+	CALL_DEPTH_ACCOUNT
+
 SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
 	ANNOTATE_NOENDBR
 	call ftrace_stub
@@ -195,6 +201,8 @@ SYM_FUNC_START(ftrace_regs_caller)
 	save_mcount_regs 8
 	/* save_mcount_regs fills in first two parameters */

+	CALL_DEPTH_ACCOUNT
+
 SYM_INNER_LABEL(ftrace_regs_caller_op_ptr, SYM_L_GLOBAL)
 	ANNOTATE_NOENDBR
 	/* Load the ftrace_ops into the 3rd parameter */
@@ -225,6 +233,9 @@ SYM_INNER_LABEL(ftrace_regs_caller_op_pt
 	/* regs go into 4th parameter */
 	leaq (%rsp), %rcx

+	/* Account for the function call below */
+	CALL_DEPTH_ACCOUNT
+
 SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL)
 	ANNOTATE_NOENDBR
 	call ftrace_stub
@@ -280,7 +291,19 @@ SYM_INNER_LABEL(ftrace_regs_caller_end,
 	/* Restore flags */
 	popfq
 	UNWIND_HINT_FUNC
-	jmp	ftrace_epilogue
+
+	/*
+	 * Since we're effectively emulating a tail-call with PUSH;RET
+	 * make sure we don't unbalance the RSB and mess up accounting.
+	 */
+	ANNOTATE_INTRA_FUNCTION_CALL
+	call	2f
+	int3
+2:
+	add	$8, %rsp
+	ALTERNATIVE __stringify(RET), \
+		    __stringify(ANNOTATE_UNRET_SAFE; ret; int3), \
+		    X86_FEATURE_CALL_DEPTH

 SYM_FUNC_END(ftrace_regs_caller)
 STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller)
@@ -289,6 +312,8 @@ STACK_FRAME_NON_STANDARD_FP(ftrace_regs_
 #else /* ! CONFIG_DYNAMIC_FTRACE */

 SYM_FUNC_START(__fentry__)
+	CALL_DEPTH_ACCOUNT
+
 	cmpq $ftrace_stub, ftrace_trace_function
 	jnz trace

@@ -345,6 +370,8 @@ SYM_CODE_START(return_to_handler)
 	int3
 .Ldo_rop:
 	mov %rdi, (%rsp)
-	RET
+	ALTERNATIVE __stringify(RET), \
+		    __stringify(ANNOTATE_UNRET_SAFE; ret; int3), \
+		    X86_FEATURE_CALL_DEPTH
 SYM_CODE_END(return_to_handler)
 #endif
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -2090,6 +2091,11 @@ int arch_prepare_bpf_trampoline(struct b
 	prog = image;

 	EMIT_ENDBR();
+	/*
+	 * This is the direct-call trampoline, as such it needs accounting
+	 * for the __fentry__ call.
+	 */
+	x86_call_depth_emit_accounting(&prog, __fentry__);
 	EMIT1(0x55);		 /* push rbp */
 	EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */
 	EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -785,7 +785,10 @@ static struct fgraph_ops fgraph_ops __in
 };

 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
-noinline __noclone static void trace_direct_tramp(void) { }
+noinline __noclone static void trace_direct_tramp(void)
+{
+	asm(CALL_DEPTH_ACCOUNT);
+}
 #endif

 /*
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -34,6 +34,7 @@ asm (
 	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
+	CALL_DEPTH_ACCOUNT
 "	call my_direct_func1\n"
 "	leave\n"
 "	.size		my_tramp1, .-my_tramp1\n"
@@ -45,6 +46,7 @@ asm (
 	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
+	CALL_DEPTH_ACCOUNT
 "	call my_direct_func2\n"
 "	leave\n"
 	ASM_RET
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -32,6 +32,7 @@ asm (
 	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
+	CALL_DEPTH_ACCOUNT
 "	pushq %rdi\n"
 "	movq 8(%rbp), %rdi\n"
 "	call my_direct_func1\n"
@@ -46,6 +47,7 @@ asm (
 	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
+	CALL_DEPTH_ACCOUNT
 "	pushq %rdi\n"
 "	movq 8(%rbp), %rdi\n"
 "	call my_direct_func2\n"
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -27,6 +27,7 @@ asm (
 	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
+	CALL_DEPTH_ACCOUNT
 "	pushq %rdi\n"
 "	movq 8(%rbp), %rdi\n"
 "	call my_direct_func\n"
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -29,6 +29,7 @@ asm (
 	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
+	CALL_DEPTH_ACCOUNT
 "	pushq %rdi\n"
 "	pushq %rsi\n"
 "	pushq %rdx\n"
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -26,6 +26,7 @@ asm (
 	ASM_ENDBR
 "	pushq %rbp\n"
 "	movq %rsp, %rbp\n"
+	CALL_DEPTH_ACCOUNT
 "	pushq %rdi\n"
 "	call my_direct_func\n"
 "	popq %rdi\n"

From nobody Sat Apr 18 05:54:57 2026

Message-ID: <20220716230954.898341815@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper,
 Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu",
 Joao Moreira, Joseph Nuzman, Steven Rostedt, Alexei Starovoitov,
 Daniel Borkmann
Subject: [patch 37/38] x86/bpf: Emit call depth accounting if required
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:18:09 +0200 (CEST)

Ensure that calls in BPF jitted programs emit call depth accounting when
enabled, to keep calls and returns balanced. The return thunk jump is
already injected due to the earlier retbleed mitigations.

Signed-off-by: Thomas Gleixner
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
---
 arch/x86/include/asm/alternative.h |  6 +++++
 arch/x86/kernel/callthunks.c       | 19 ++++++++++++++++
 arch/x86/net/bpf_jit_comp.c        | 43 ++++++++++++++++++++++----------
 3 files changed, 53 insertions(+), 15 deletions(-)

--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -95,6 +95,7 @@ extern void callthunks_patch_module_call
 extern void callthunks_module_free(struct module *mod);
 extern void *callthunks_translate_call_dest(void *dest);
 extern bool is_callthunk(void *addr);
+extern int x86_call_depth_emit_accounting(u8 **pprog, void *func);
 #else
 static __always_inline void callthunks_patch_builtin_calls(void) {}
 static __always_inline void
@@ -109,6 +110,11 @@ static __always_inline bool is_callthunk
 {
 	return false;
 }
+static __always_inline int x86_call_depth_emit_accounting(u8 **pprog,
							  void *func)
+{
+	return 0;
+}
 #endif

 #ifdef CONFIG_SMP
--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -706,6 +706,25 @@ int callthunk_get_kallsym(unsigned int s
 	return ret;
 }

+#ifdef CONFIG_BPF_JIT
+int x86_call_depth_emit_accounting(u8 **pprog, void *func)
+{
+	unsigned int tmpl_size = callthunk_desc.template_size;
+	void *tmpl = callthunk_desc.template;
+
+	if (!thunks_initialized)
+		return 0;
+
+	/* Is function call target a thunk? */
+	if (is_callthunk(func))
+		return 0;
+
+	memcpy(*pprog, tmpl, tmpl_size);
+	*pprog += tmpl_size;
+	return tmpl_size;
+}
+#endif
+
 #ifdef CONFIG_MODULES
 void noinline callthunks_patch_module_calls(struct callthunk_sites *cs,
 					    struct module *mod)
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -340,6 +340,12 @@ static int emit_call(u8 **pprog, void *f
 	return emit_patch(pprog, func, ip, 0xE8);
 }

+static int emit_rsb_call(u8 **pprog, void *func, void *ip)
+{
+	x86_call_depth_emit_accounting(pprog, func);
+	return emit_patch(pprog, func, ip, 0xE8);
+}
+
 static int emit_jump(u8 **pprog, void *func, void *ip)
 {
 	return emit_patch(pprog, func, ip, 0xE9);
@@ -1431,19 +1437,26 @@ st:			if (is_imm8(insn->off))
 			break;

 			/* call */
-		case BPF_JMP | BPF_CALL:
+		case BPF_JMP | BPF_CALL: {
+			int offs;
+
 			func = (u8 *) __bpf_call_base + imm32;
 			if (tail_call_reachable) {
 				/* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */
 				EMIT3_off32(0x48, 0x8B, 0x85,
 					    -round_up(bpf_prog->aux->stack_depth, 8) - 8);
-				if (!imm32 || emit_call(&prog, func, image + addrs[i - 1] + 7))
+				if (!imm32)
 					return -EINVAL;
+				offs = 7 + x86_call_depth_emit_accounting(&prog, func);
 			} else {
-				if (!imm32 || emit_call(&prog, func, image + addrs[i - 1]))
+				if (!imm32)
 					return -EINVAL;
+				offs = x86_call_depth_emit_accounting(&prog, func);
 			}
+			if (emit_call(&prog, func, image + addrs[i - 1] + offs))
+				return -EINVAL;
 			break;
+		}

 		case BPF_JMP | BPF_TAIL_CALL:
 			if (imm32)
@@ -1808,10 +1821,10 @@ static int invoke_bpf_prog(const struct
 	/* arg2: lea rsi, [rbp - ctx_cookie_off] */
 	EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);

-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_enter_sleepable :
-		      __bpf_prog_enter, prog))
-			return -EINVAL;
+	if (emit_rsb_call(&prog,
+			  p->aux->sleepable ? __bpf_prog_enter_sleepable :
+			  __bpf_prog_enter, prog))
+			return -EINVAL;
 	/* remember prog start time returned by __bpf_prog_enter */
 	emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);

@@ -1831,7 +1844,7 @@ static int invoke_bpf_prog(const struct
 		       (long) p->insnsi >> 32,
 		       (u32) (long) p->insnsi);
 	/* call JITed bpf program or interpreter */
-	if (emit_call(&prog, p->bpf_func, prog))
+	if (emit_rsb_call(&prog, p->bpf_func, prog))
 		return -EINVAL;

 	/*
@@ -1855,10 +1868,10 @@ static int invoke_bpf_prog(const struct
 	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
 	/* arg3: lea rdx, [rbp - run_ctx_off] */
 	EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_exit_sleepable :
-		      __bpf_prog_exit, prog))
-			return -EINVAL;
+	if (emit_rsb_call(&prog,
+			  p->aux->sleepable ? __bpf_prog_exit_sleepable :
+			  __bpf_prog_exit, prog))
+			return -EINVAL;

 	*pprog = prog;
 	return 0;
@@ -2123,7 +2136,7 @@ int arch_prepare_bpf_trampoline(struct b
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		/* arg1: mov rdi, im */
 		emit_mov_imm64(&prog, BPF_REG_1, (long) im >> 32, (u32) (long) im);
-		if (emit_call(&prog, __bpf_tramp_enter, prog)) {
+		if (emit_rsb_call(&prog, __bpf_tramp_enter, prog)) {
 			ret = -EINVAL;
 			goto cleanup;
 		}
@@ -2151,7 +2164,7 @@ int arch_prepare_bpf_trampoline(struct b
 	restore_regs(m, &prog, nr_args, regs_off);

 	/* call original function */
-	if (emit_call(&prog, orig_call, prog)) {
+	if (emit_rsb_call(&prog, orig_call, prog)) {
 		ret = -EINVAL;
 		goto cleanup;
 	}
@@ -2194,7 +2207,7 @@ int arch_prepare_bpf_trampoline(struct b
 	im->ip_epilogue = prog;
 	/* arg1: mov rdi, im */
 	emit_mov_imm64(&prog, BPF_REG_1, (long) im >> 32, (u32) (long) im);
-	if (emit_call(&prog, __bpf_tramp_exit, prog)) {
+	if (emit_rsb_call(&prog, __bpf_tramp_exit, prog)) {
 		ret = -EINVAL;
 		goto cleanup;
 	}
From nobody Sat Apr 18 05:54:57 2026

Message-ID: <20220716230954.957997370@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Linus Torvalds, Tim Chen, Josh Poimboeuf, Andrew Cooper,
 Pawan Gupta, Johannes Wikner, Alyssa Milburn, Jann Horn, "H.J. Lu",
 Joao Moreira, Joseph Nuzman, Steven Rostedt
Subject: [patch 38/38] x86/retbleed: Add call depth tracking mitigation
References: <20220716230344.239749011@linutronix.de>
Date: Sun, 17 Jul 2022 01:18:11 +0200 (CEST)

The fully secure mitigation for RSB underflow on Intel SKL CPUs is IBRS,
which inflicts up to a 30% penalty for pathological syscall-heavy
workloads.

Software-based call depth tracking and RSB refill is not perfect, but it
reduces the attack surface massively. The penalty for the pathological
case is about 8%, which is still annoying but definitely more palatable
than IBRS.

Add a retbleed=stuff command line option to enable the call depth
tracking and software refill of the RSB.

This gives admins a choice. IBeeRS are safe and cause headaches, call
depth tracking is considered to be s(t)ufficiently safe.

Signed-off-by: Thomas Gleixner
---
 arch/x86/kernel/cpu/bugs.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -784,6 +784,7 @@ enum retbleed_mitigation {
 	RETBLEED_MITIGATION_IBPB,
 	RETBLEED_MITIGATION_IBRS,
 	RETBLEED_MITIGATION_EIBRS,
+	RETBLEED_MITIGATION_STUFF,
 };

 enum retbleed_mitigation_cmd {
@@ -791,6 +792,7 @@ enum retbleed_mitigation_cmd {
 	RETBLEED_CMD_AUTO,
 	RETBLEED_CMD_UNRET,
 	RETBLEED_CMD_IBPB,
+	RETBLEED_CMD_STUFF,
 };

 const char * const retbleed_strings[] = {
@@ -799,6 +801,7 @@ const char * const retbleed_strings[] =
 	[RETBLEED_MITIGATION_IBPB]	= "Mitigation: IBPB",
 	[RETBLEED_MITIGATION_IBRS]	= "Mitigation: IBRS",
 	[RETBLEED_MITIGATION_EIBRS]	= "Mitigation: Enhanced IBRS",
+	[RETBLEED_MITIGATION_STUFF]	= "Mitigation: Stuffing",
 };

 static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
@@ -828,6 +831,8 @@ static int __init retbleed_parse_cmdline
 			retbleed_cmd = RETBLEED_CMD_UNRET;
 		} else if (!strcmp(str, "ibpb")) {
 			retbleed_cmd = RETBLEED_CMD_IBPB;
+		} else if (!strcmp(str, "stuff")) {
+			retbleed_cmd = RETBLEED_CMD_STUFF;
 		} else if (!strcmp(str, "nosmt")) {
 			retbleed_nosmt = true;
 		} else {
@@ -876,6 +881,21 @@ static void __init retbleed_select_mitig
 		}
 		break;

+	case RETBLEED_CMD_STUFF:
+		if (IS_ENABLED(CONFIG_CALL_DEPTH_TRACKING) &&
+		    spectre_v2_enabled == SPECTRE_V2_RETPOLINE) {
+			retbleed_mitigation = RETBLEED_MITIGATION_STUFF;
+
+		} else {
+			if (IS_ENABLED(CONFIG_CALL_DEPTH_TRACKING))
+				pr_err("WARNING: retbleed=stuff depends on spectre_v2=retpoline\n");
+			else
+				pr_err("WARNING: kernel not compiled with CALL_DEPTH_TRACKING.\n");
+
+			goto do_cmd_auto;
+		}
+		break;
+
 do_cmd_auto:
 	case RETBLEED_CMD_AUTO:
 	default:
@@ -913,6 +933,12 @@ static void __init retbleed_select_mitig
 		mitigate_smt = true;
 		break;

+	case RETBLEED_MITIGATION_STUFF:
+		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
+		setup_force_cpu_cap(X86_FEATURE_CALL_DEPTH);
+		x86_set_skl_return_thunk();
+		break;
+
 	default:
 		break;
 	}
@@ -923,7 +949,7 @@ static void __init retbleed_select_mitig

 	/*
 	 * Let IBRS trump all on Intel without affecting the effects of the
-	 * retbleed= cmdline option.
+	 * retbleed= cmdline option except for call depth based stuffing
 	 */
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
 		switch (spectre_v2_enabled) {
@@ -936,7 +962,8 @@ static void __init retbleed_select_mitig
 			retbleed_mitigation = RETBLEED_MITIGATION_EIBRS;
 			break;
 		default:
-			pr_err(RETBLEED_INTEL_MSG);
+			if (retbleed_mitigation != RETBLEED_MITIGATION_STUFF)
+				pr_err(RETBLEED_INTEL_MSG);
 		}
 	}

@@ -1361,6 +1388,7 @@ static void __init spectre_v2_select_mit
 	if (IS_ENABLED(CONFIG_CPU_IBRS_ENTRY) &&
 	    boot_cpu_has_bug(X86_BUG_RETBLEED) &&
 	    retbleed_cmd != RETBLEED_CMD_OFF &&
+	    retbleed_cmd != RETBLEED_CMD_STUFF &&
 	    boot_cpu_has(X86_FEATURE_IBRS) &&
 	    boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
 		mode = SPECTRE_V2_IBRS;