From nobody Sun Feb 8 05:37:17 2026
Date: Fri, 19 Dec 2025 11:20:06 +0000
In-Reply-To: <20251219112007.2827302-1-edumazet@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
References:
 <20251219112007.2827302-1-edumazet@google.com>
X-Mailer: git-send-email 2.52.0.322.g1dd061c0dc-goog
Message-ID: <20251219112007.2827302-2-edumazet@google.com>
Subject: [PATCH 1/2] clang: work around asm output constraint problems
From: Eric Dumazet
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 Uros Bizjak, Linus Torvalds, x86@kernel.org, "H . Peter Anvin"
Cc: linux-kernel, Eric Dumazet, Eric Dumazet
Content-Type: text/plain; charset="utf-8"

Work around clang problems with the "=rm" asm output constraint.

clang seems to always choose the memory output, even though
it is almost always the worst choice.

Add ASM_OUTPUT_RM so that we can replace the "=rm" constraint
where it matters for clang, while not penalizing gcc.

Signed-off-by: Eric Dumazet
Suggested-by: Uros Bizjak
Cc: Linus Torvalds
---
 include/linux/compiler-clang.h | 7 +++++++
 include/linux/compiler_types.h | 6 +++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 107ce05bd16eb4a2ecb0f8cbf5c9efeab1845a1c..a82db8ab2812b9aa8851bf51daed3f81d2a41979 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -146,6 +146,13 @@
 #define ASM_INPUT_G "ir"
 #define ASM_INPUT_RM "r"
 
+/*
+ * clang has non optimal behavior with "=rm" constraints for asm outputs.
+ * It favors memory operand, forcing additional logic in tail functions
+ * when CONFIG_STACKPROTECTOR_STRONG=y.
+ */
+#define ASM_OUTPUT_RM "=r"
+
 /*
  * Declare compiler support for __typeof_unqual__() operator.
 *
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 1280693766b9dd844ec7ca83053381799be12456..b66aa58dacd595546b17582df03a762dd23cb15c 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -548,13 +548,17 @@ struct ftrace_likely_data {
 
 /*
  * Clang has trouble with constraints with multiple
- * alternative behaviors (mainly "g" and "rm").
+ * alternative behaviors ("g" , "rm" and "=rm").
  */
 #ifndef ASM_INPUT_G
   #define ASM_INPUT_G "g"
   #define ASM_INPUT_RM "rm"
 #endif
 
+#ifndef ASM_OUTPUT_RM
+  #define ASM_OUTPUT_RM "=rm"
+#endif
+
 #ifdef CONFIG_CC_HAS_ASM_INLINE
 #define asm_inline asm __inline
 #else
-- 
2.52.0.322.g1dd061c0dc-goog

From nobody Sun Feb 8 05:37:17 2026
Date: Fri, 19 Dec 2025 11:20:07 +0000
In-Reply-To: <20251219112007.2827302-1-edumazet@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20251219112007.2827302-1-edumazet@google.com>
X-Mailer: git-send-email 2.52.0.322.g1dd061c0dc-goog
Message-ID: <20251219112007.2827302-3-edumazet@google.com>
Subject: [PATCH 2/2] x86/irqflags: Use ASM_OUTPUT_RM in native_save_fl()
From: Eric Dumazet
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 Uros Bizjak, Linus Torvalds, x86@kernel.org, "H . Peter Anvin"
Cc: linux-kernel, Eric Dumazet, Eric Dumazet
Content-Type: text/plain; charset="utf-8"

clang is generating very inefficient code for native_save_fl(),
which is used for local_irq_save() in critical spots.

Allowing the "pop %0" to use memory:

1) forces the compiler to add annoying stack canaries when
   CONFIG_STACKPROTECTOR_STRONG=y in many places.

2) is almost always followed by an immediate "move memory,register".

One good example is _raw_spin_lock_irqsave, with 8 extra instructions:

ffffffff82067a30 <_raw_spin_lock_irqsave>:
ffffffff82067a30:  ...
ffffffff82067a39:  53                       push   %rbx

// Three instructions to adjust the stack, read the per-cpu canary
// and copy it to 8(%rsp)
ffffffff82067a3a:  48 83 ec 10              sub    $0x10,%rsp
ffffffff82067a3e:  65 48 8b 05 da 15 45 02  mov    %gs:0x24515da(%rip),%rax        # <__stack_chk_guard>
ffffffff82067a46:  48 89 44 24 08           mov    %rax,0x8(%rsp)

ffffffff82067a4b:  9c                       pushf

// instead of pop %rbx, compiler uses 2 instructions.
ffffffff82067a4c:  8f 04 24                 pop    (%rsp)
ffffffff82067a4f:  48 8b 1c 24              mov    (%rsp),%rbx

ffffffff82067a53:  fa                       cli
ffffffff82067a54:  b9 01 00 00 00           mov    $0x1,%ecx
ffffffff82067a59:  31 c0                    xor    %eax,%eax
ffffffff82067a5b:  f0 0f b1 0f              lock cmpxchg %ecx,(%rdi)
ffffffff82067a5f:  75 1d                    jne    ffffffff82067a7e <_raw_spin_lock_irqsave+0x4e>

// three instructions to check the stack canary
ffffffff82067a61:  65 48 8b 05 b7 15 45 02  mov    %gs:0x24515b7(%rip),%rax        # <__stack_chk_guard>
ffffffff82067a69:  48 3b 44 24 08           cmp    0x8(%rsp),%rax
ffffffff82067a6e:  75 17                    jne    ffffffff82067a87
...

// One extra instruction to adjust the stack.
ffffffff82067a73:  48 83 c4 10              add    $0x10,%rsp
...

// One more instruction in case the stack was mangled.
ffffffff82067a87:  e8 a4 35 ff ff           call   ffffffff8205b030 <__stack_chk_fail>

This patch changes nothing for gcc, but for clang saves ~20000 bytes of text
even though more functions are inlined.

$ size vmlinux.gcc.before vmlinux.gcc.after vmlinux.clang.before vmlinux.clang.after
    text     data     bss      dec     hex filename
45565821 25005462 4704800 75276083 47c9f33 vmlinux.gcc.before
45565821 25005462 4704800 75276083 47c9f33 vmlinux.gcc.after
45121072 24638617 5533040 75292729 47ce039 vmlinux.clang.before
45093887 24638633 5536808 75269328 47c84d0 vmlinux.clang.after

$ scripts/bloat-o-meter -t vmlinux.clang.before vmlinux.clang.after
add/remove: 1/2 grow/shrink: 21/533 up/down: 2250/-22112 (-19862)
Function                             old     new   delta
wakeup_cpu_via_vmgexit              1002    1447    +445
rcu_tasks_trace_pregp_step          1052    1454    +402
snp_kexec_finish                    1290    1527    +237
check_all_holdout_tasks_trace        909    1106    +197
x2apic_send_IPI_mask_allbutself       38     198    +160
hpet_set_rtc_irq_bit                 118     265    +147
x2apic_send_IPI_mask                  38     184    +146
ring_buffer_poll_wait                261     405    +144
rb_watermark_hit                     253     386    +133
__unlikely_text_end                  368     416     +48
printk_trigger_flush                 262     298     +36
__softirqentry_text_end                -      32     +32
pstore_dump                         1145    1164     +19
printk_legacy_allow_panic_sync       159     178     +19
netlink_insert                       979     995     +16
console_try_replay_all               268     283     +15
do_flush_tlb_all                     151     165     +14
__flush_tlb_all                      151     165     +14
synchronize_rcu_expedited           2248    2259     +11
...
tcp_wfree                            402     332     -70
stacktrace_trigger                   133      62     -71
w1_touch_bit                         418     343     -75
w1_triplet                           446     370     -76
link_create                          980     902     -78
drain_dead_softirq_workfn            425     347     -78
kcryptd_queue_crypt                  253     174     -79
perf_event_aux_pause                 448     368     -80
idle_worker_timeout                  320     240     -80
srcu_funnel_exp_start                418     333     -85
call_rcu                             751     666     -85
enable_IR_x2apic                     279     191     -88
bpf_link_free                        432     342     -90
synchronize_rcu                      497     403     -94
identify_cpu                        2665    2569     -96
ftrace_modify_all_code               355     258     -97
load_gs_index                        212     104    -108
verity_end_io                        369     257    -112
bpf_prog_detach                      672     555    -117
__x2apic_send_IPI_mask               552     275    -277
snp_cleanup_vmsa                     284       -    -284
__noinstr_text_start                3072    1920   -1152
Total: Before=28577936, After=28558074, chg -0.07%

Signed-off-by: Eric Dumazet
Cc: Uros Bizjak
Cc: Linus Torvalds
---
v2: use ASM_OUTPUT_RM (Uros Bizjak)
v1: https://lore.kernel.org/lkml/CANn89iJ+HKXRn7qF4KrT6gghw6CwWcsvoj8Scw17CkCqhGbk=A@mail.gmail.com/T/#mc2322d458f07118580eca7c5fa1f0bc931c32d30

 arch/x86/include/asm/irqflags.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index b30e5474c18e1be63b7c69354c26ae6a6cb02731..a1193e9d65f2000d6de88468bee58f2dae9c6cd5 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -25,7 +25,7 @@ extern __always_inline unsigned long native_save_fl(void)
 	 */
 	asm volatile("# __raw_save_flags\n\t"
 		     "pushf ; pop %0"
-		     : "=rm" (flags)
+		     : ASM_OUTPUT_RM (flags)
 		     : /* no input */
 		     : "memory");
 
-- 
2.52.0.322.g1dd061c0dc-goog
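
For readers who want to try the constraint-selection pattern outside the
kernel tree, here is a rough userspace sketch. It is not the kernel code:
DEMO_ASM_OUTPUT_RM, demo_load_const() and demo_self_check() are hypothetical
names, and "movl $42, %0" is only a stand-in for "pushf ; pop %0", chosen
because it accepts either a register or a memory destination and is safe to
run in an ordinary process. The non-x86 branch is an assumption added so the
sketch still builds elsewhere.

```c
#include <assert.h>

/*
 * Hypothetical userspace mirror of the ASM_OUTPUT_RM idea:
 * clang gets a register-only output constraint, while other
 * compilers keep the more permissive "=rm".
 */
#ifdef __clang__
#define DEMO_ASM_OUTPUT_RM "=r"
#else
#define DEMO_ASM_OUTPUT_RM "=rm"
#endif

/* Store a constant through an asm output operand. */
static inline unsigned int demo_load_const(void)
{
	unsigned int val;

#if defined(__x86_64__) || defined(__i386__)
	/* The macro expands to a string literal in the constraint position. */
	__asm__ volatile("movl $42, %0"
			 : DEMO_ASM_OUTPUT_RM (val));
#else
	val = 42;	/* non-x86 fallback, assumption for portability only */
#endif
	return val;
}

/* Sanity check: the value must be the same whichever constraint is used. */
static inline void demo_self_check(void)
{
	assert(demo_load_const() == 42);
}
```

Building this with both gcc and clang at -O2 and comparing the output of
objdump -d should show whether the compiler picked a register or a stack
slot for the output operand.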