From nobody Fri Sep 12 16:18:08 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 954A2C636CC for ; Wed, 8 Feb 2023 17:24:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231760AbjBHRYM (ORCPT ); Wed, 8 Feb 2023 12:24:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55386 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229539AbjBHRYG (ORCPT ); Wed, 8 Feb 2023 12:24:06 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE8AC2BEC5 for ; Wed, 8 Feb 2023 09:23:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=ZM2H9NcBYzk3p+h+REZIrOr8Wdv2eWikVDS+/wEJ9YQ=; b=QPSoc0VOKEEOeRBS1h3oyzg4aF B0ACXT394ce1mbXzw6peCznewOKbFYH8iFtM0xyr1jgjnNFirxwwQjEfBFpYKP7dah5m2Aipp/gPU FX/yyS0uMQ2PJIQTdaNXI0zUa+/jNpbwk2z6sLH+OVwg+mlCzezFdejh5cGJd5B+LeL2RnyezLQ0J KLgZNifgVqc+0VDdyvOJNQQ0ZsJZKE0kMadt01JthBlz/YcR1jn5/69iVm0muATWTdaJ9GTojaJxl kCbDeF7jzLw0D0oMFkLuVcoUtxIbOeDyWVOUmyA9yFyDp1Vd33XAgC31FyePRCYe3v+SqPy8skxwx 2ve9GtRw==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pPoA4-007Vve-0A; Wed, 08 Feb 2023 17:23:12 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not 
present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 199C1300446; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 026E220A3C1B4; Wed, 8 Feb 2023 18:23:49 +0100 (CET) Message-ID: <20230208172245.291087549@infradead.org> User-Agent: quilt/0.66 Date: Wed, 08 Feb 2023 18:17:57 +0100 From: Peter Zijlstra To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net Cc: linux-kernel@vger.kernel.org, peterz@infradead.org Subject: [PATCH 01/10] objtool: Change arch_decode_instruction() signature References: <20230208171756.898991570@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In preparation for changing struct instruction around a bit, avoid passing its members by pointer and instead pass the whole thing. A cleanup in its own right too. Signed-off-by: Peter Zijlstra (Intel) Acked-by: Josh Poimboeuf Tested-by: Nathan Chancellor # build only Tested-by: Thomas Wei=C3=9Fschuh # compile and run --- tools/objtool/arch/powerpc/decode.c | 22 +++---- tools/objtool/arch/x86/decode.c | 105 +++++++++++++++++-------------= ----- tools/objtool/check.c | 4 - tools/objtool/include/objtool/arch.h | 4 - 4 files changed, 64 insertions(+), 71 deletions(-) --- a/tools/objtool/arch/powerpc/decode.c +++ b/tools/objtool/arch/powerpc/decode.c @@ -41,38 +41,36 @@ const char *arch_ret_insn(int len) =20 int arch_decode_instruction(struct objtool_file *file, const struct sectio= n *sec, unsigned long offset, unsigned int maxlen, - unsigned int *len, enum insn_type *type, - unsigned long *immediate, - struct list_head *ops_list) + struct instruction *insn) { unsigned int opcode; enum insn_type typ; unsigned long imm; - u32 insn; + u32 ins; =20 - insn =3D bswap_if_needed(file->elf, *(u32 *)(sec->data->d_buf + offset)); - opcode =3D insn >> 26; + ins =3D 
bswap_if_needed(file->elf, *(u32 *)(sec->data->d_buf + offset)); + opcode =3D ins >> 26; typ =3D INSN_OTHER; imm =3D 0; =20 switch (opcode) { case 18: /* b[l][a] */ - if ((insn & 3) =3D=3D 1) /* bl */ + if ((ins & 3) =3D=3D 1) /* bl */ typ =3D INSN_CALL; =20 - imm =3D insn & 0x3fffffc; + imm =3D ins & 0x3fffffc; if (imm & 0x2000000) imm -=3D 0x4000000; break; } =20 if (opcode =3D=3D 1) - *len =3D 8; + insn->len =3D 8; else - *len =3D 4; + insn->len =3D 4; =20 - *type =3D typ; - *immediate =3D imm; + insn->type =3D typ; + insn->immediate =3D imm; =20 return 0; } --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -146,12 +146,11 @@ static bool has_notrack_prefix(struct in =20 int arch_decode_instruction(struct objtool_file *file, const struct sectio= n *sec, unsigned long offset, unsigned int maxlen, - unsigned int *len, enum insn_type *type, - unsigned long *immediate, - struct list_head *ops_list) + struct instruction *insn) { + struct list_head *ops_list =3D &insn->stack_ops; const struct elf *elf =3D file->elf; - struct insn insn; + struct insn ins; int x86_64, ret; unsigned char op1, op2, op3, prefix, rex =3D 0, rex_b =3D 0, rex_r =3D 0, rex_w =3D 0, rex_x =3D 0, @@ -165,42 +164,42 @@ int arch_decode_instruction(struct objto if (x86_64 =3D=3D -1) return -1; =20 - ret =3D insn_decode(&insn, sec->data->d_buf + offset, maxlen, + ret =3D insn_decode(&ins, sec->data->d_buf + offset, maxlen, x86_64 ? 
INSN_MODE_64 : INSN_MODE_32); if (ret < 0) { WARN("can't decode instruction at %s:0x%lx", sec->name, offset); return -1; } =20 - *len =3D insn.length; - *type =3D INSN_OTHER; + insn->len =3D ins.length; + insn->type =3D INSN_OTHER; =20 - if (insn.vex_prefix.nbytes) + if (ins.vex_prefix.nbytes) return 0; =20 - prefix =3D insn.prefixes.bytes[0]; + prefix =3D ins.prefixes.bytes[0]; =20 - op1 =3D insn.opcode.bytes[0]; - op2 =3D insn.opcode.bytes[1]; - op3 =3D insn.opcode.bytes[2]; + op1 =3D ins.opcode.bytes[0]; + op2 =3D ins.opcode.bytes[1]; + op3 =3D ins.opcode.bytes[2]; =20 - if (insn.rex_prefix.nbytes) { - rex =3D insn.rex_prefix.bytes[0]; + if (ins.rex_prefix.nbytes) { + rex =3D ins.rex_prefix.bytes[0]; rex_w =3D X86_REX_W(rex) >> 3; rex_r =3D X86_REX_R(rex) >> 2; rex_x =3D X86_REX_X(rex) >> 1; rex_b =3D X86_REX_B(rex); } =20 - if (insn.modrm.nbytes) { - modrm =3D insn.modrm.bytes[0]; + if (ins.modrm.nbytes) { + modrm =3D ins.modrm.bytes[0]; modrm_mod =3D X86_MODRM_MOD(modrm); modrm_reg =3D X86_MODRM_REG(modrm) + 8*rex_r; modrm_rm =3D X86_MODRM_RM(modrm) + 8*rex_b; } =20 - if (insn.sib.nbytes) { - sib =3D insn.sib.bytes[0]; + if (ins.sib.nbytes) { + sib =3D ins.sib.bytes[0]; /* sib_scale =3D X86_SIB_SCALE(sib); */ sib_index =3D X86_SIB_INDEX(sib) + 8*rex_x; sib_base =3D X86_SIB_BASE(sib) + 8*rex_b; @@ -254,7 +253,7 @@ int arch_decode_instruction(struct objto break; =20 case 0x70 ... 0x7f: - *type =3D INSN_JUMP_CONDITIONAL; + insn->type =3D INSN_JUMP_CONDITIONAL; break; =20 case 0x80 ... 
0x83: @@ -278,7 +277,7 @@ int arch_decode_instruction(struct objto if (!rm_is_reg(CFI_SP)) break; =20 - imm =3D insn.immediate.value; + imm =3D ins.immediate.value; if (op1 & 2) { /* sign extend */ if (op1 & 1) { /* imm32 */ imm <<=3D 32; @@ -309,7 +308,7 @@ int arch_decode_instruction(struct objto ADD_OP(op) { op->src.type =3D OP_SRC_AND; op->src.reg =3D CFI_SP; - op->src.offset =3D insn.immediate.value; + op->src.offset =3D ins.immediate.value; op->dest.type =3D OP_DEST_REG; op->dest.reg =3D CFI_SP; } @@ -356,7 +355,7 @@ int arch_decode_instruction(struct objto op->src.reg =3D CFI_SP; op->dest.type =3D OP_DEST_REG_INDIRECT; op->dest.reg =3D modrm_rm; - op->dest.offset =3D insn.displacement.value; + op->dest.offset =3D ins.displacement.value; } break; } @@ -389,7 +388,7 @@ int arch_decode_instruction(struct objto op->src.reg =3D modrm_reg; op->dest.type =3D OP_DEST_REG_INDIRECT; op->dest.reg =3D CFI_BP; - op->dest.offset =3D insn.displacement.value; + op->dest.offset =3D ins.displacement.value; } break; } @@ -402,7 +401,7 @@ int arch_decode_instruction(struct objto op->src.reg =3D modrm_reg; op->dest.type =3D OP_DEST_REG_INDIRECT; op->dest.reg =3D CFI_SP; - op->dest.offset =3D insn.displacement.value; + op->dest.offset =3D ins.displacement.value; } break; } @@ -419,7 +418,7 @@ int arch_decode_instruction(struct objto ADD_OP(op) { op->src.type =3D OP_SRC_REG_INDIRECT; op->src.reg =3D CFI_BP; - op->src.offset =3D insn.displacement.value; + op->src.offset =3D ins.displacement.value; op->dest.type =3D OP_DEST_REG; op->dest.reg =3D modrm_reg; } @@ -432,7 +431,7 @@ int arch_decode_instruction(struct objto ADD_OP(op) { op->src.type =3D OP_SRC_REG_INDIRECT; op->src.reg =3D CFI_SP; - op->src.offset =3D insn.displacement.value; + op->src.offset =3D ins.displacement.value; op->dest.type =3D OP_DEST_REG; op->dest.reg =3D modrm_reg; } @@ -464,7 +463,7 @@ int arch_decode_instruction(struct objto =20 /* lea disp(%src), %dst */ ADD_OP(op) { - op->src.offset =3D 
insn.displacement.value; + op->src.offset =3D ins.displacement.value; if (!op->src.offset) { /* lea (%src), %dst */ op->src.type =3D OP_SRC_REG; @@ -487,7 +486,7 @@ int arch_decode_instruction(struct objto break; =20 case 0x90: - *type =3D INSN_NOP; + insn->type =3D INSN_NOP; break; =20 case 0x9c: @@ -511,39 +510,39 @@ int arch_decode_instruction(struct objto if (op2 =3D=3D 0x01) { =20 if (modrm =3D=3D 0xca) - *type =3D INSN_CLAC; + insn->type =3D INSN_CLAC; else if (modrm =3D=3D 0xcb) - *type =3D INSN_STAC; + insn->type =3D INSN_STAC; =20 } else if (op2 >=3D 0x80 && op2 <=3D 0x8f) { =20 - *type =3D INSN_JUMP_CONDITIONAL; + insn->type =3D INSN_JUMP_CONDITIONAL; =20 } else if (op2 =3D=3D 0x05 || op2 =3D=3D 0x07 || op2 =3D=3D 0x34 || op2 =3D=3D 0x35) { =20 /* sysenter, sysret */ - *type =3D INSN_CONTEXT_SWITCH; + insn->type =3D INSN_CONTEXT_SWITCH; =20 } else if (op2 =3D=3D 0x0b || op2 =3D=3D 0xb9) { =20 /* ud2 */ - *type =3D INSN_BUG; + insn->type =3D INSN_BUG; =20 } else if (op2 =3D=3D 0x0d || op2 =3D=3D 0x1f) { =20 /* nopl/nopw */ - *type =3D INSN_NOP; + insn->type =3D INSN_NOP; =20 } else if (op2 =3D=3D 0x1e) { =20 if (prefix =3D=3D 0xf3 && (modrm =3D=3D 0xfa || modrm =3D=3D 0xfb)) - *type =3D INSN_ENDBR; + insn->type =3D INSN_ENDBR; =20 =20 } else if (op2 =3D=3D 0x38 && op3 =3D=3D 0xf8) { - if (insn.prefixes.nbytes =3D=3D 1 && - insn.prefixes.bytes[0] =3D=3D 0xf2) { + if (ins.prefixes.nbytes =3D=3D 1 && + ins.prefixes.bytes[0] =3D=3D 0xf2) { /* ENQCMD cannot be used in the kernel. 
*/ WARN("ENQCMD instruction at %s:%lx", sec->name, offset); @@ -591,29 +590,29 @@ int arch_decode_instruction(struct objto =20 case 0xcc: /* int3 */ - *type =3D INSN_TRAP; + insn->type =3D INSN_TRAP; break; =20 case 0xe3: /* jecxz/jrcxz */ - *type =3D INSN_JUMP_CONDITIONAL; + insn->type =3D INSN_JUMP_CONDITIONAL; break; =20 case 0xe9: case 0xeb: - *type =3D INSN_JUMP_UNCONDITIONAL; + insn->type =3D INSN_JUMP_UNCONDITIONAL; break; =20 case 0xc2: case 0xc3: - *type =3D INSN_RETURN; + insn->type =3D INSN_RETURN; break; =20 case 0xc7: /* mov imm, r/m */ if (!opts.noinstr) break; =20 - if (insn.length =3D=3D 3+4+4 && !strncmp(sec->name, ".init.text", 10)) { + if (ins.length =3D=3D 3+4+4 && !strncmp(sec->name, ".init.text", 10)) { struct reloc *immr, *disp; struct symbol *func; int idx; @@ -661,17 +660,17 @@ int arch_decode_instruction(struct objto =20 case 0xca: /* retf */ case 0xcb: /* retf */ - *type =3D INSN_CONTEXT_SWITCH; + insn->type =3D INSN_CONTEXT_SWITCH; break; =20 case 0xe0: /* loopne */ case 0xe1: /* loope */ case 0xe2: /* loop */ - *type =3D INSN_JUMP_CONDITIONAL; + insn->type =3D INSN_JUMP_CONDITIONAL; break; =20 case 0xe8: - *type =3D INSN_CALL; + insn->type =3D INSN_CALL; /* * For the impact on the stack, a CALL behaves like * a PUSH of an immediate value (the return address). 
@@ -683,30 +682,30 @@ int arch_decode_instruction(struct objto break; =20 case 0xfc: - *type =3D INSN_CLD; + insn->type =3D INSN_CLD; break; =20 case 0xfd: - *type =3D INSN_STD; + insn->type =3D INSN_STD; break; =20 case 0xff: if (modrm_reg =3D=3D 2 || modrm_reg =3D=3D 3) { =20 - *type =3D INSN_CALL_DYNAMIC; - if (has_notrack_prefix(&insn)) + insn->type =3D INSN_CALL_DYNAMIC; + if (has_notrack_prefix(&ins)) WARN("notrack prefix found at %s:0x%lx", sec->name, offset); =20 } else if (modrm_reg =3D=3D 4) { =20 - *type =3D INSN_JUMP_DYNAMIC; - if (has_notrack_prefix(&insn)) + insn->type =3D INSN_JUMP_DYNAMIC; + if (has_notrack_prefix(&ins)) WARN("notrack prefix found at %s:0x%lx", sec->name, offset); =20 } else if (modrm_reg =3D=3D 5) { =20 /* jmpf */ - *type =3D INSN_CONTEXT_SWITCH; + insn->type =3D INSN_CONTEXT_SWITCH; =20 } else if (modrm_reg =3D=3D 6) { =20 @@ -723,7 +722,7 @@ int arch_decode_instruction(struct objto break; } =20 - *immediate =3D insn.immediate.nbytes ? insn.immediate.value : 0; + insn->immediate =3D ins.immediate.nbytes ? 
ins.immediate.value : 0; =20 return 0; } --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -404,9 +404,7 @@ static int decode_instructions(struct ob =20 ret =3D arch_decode_instruction(file, sec, offset, sec->sh.sh_size - offset, - &insn->len, &insn->type, - &insn->immediate, - &insn->stack_ops); + insn); if (ret) goto err; =20 --- a/tools/objtool/include/objtool/arch.h +++ b/tools/objtool/include/objtool/arch.h @@ -75,9 +75,7 @@ void arch_initial_func_cfi_state(struct =20 int arch_decode_instruction(struct objtool_file *file, const struct sectio= n *sec, unsigned long offset, unsigned int maxlen, - unsigned int *len, enum insn_type *type, - unsigned long *immediate, - struct list_head *ops_list); + struct instruction *insn); =20 bool arch_callee_saved_reg(unsigned char reg); From nobody Fri Sep 12 16:18:08 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F11A2C05027 for ; Wed, 8 Feb 2023 17:24:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231877AbjBHRYR (ORCPT ); Wed, 8 Feb 2023 12:24:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55402 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231473AbjBHRYI (ORCPT ); Wed, 8 Feb 2023 12:24:08 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3401C11174 for ; Wed, 8 Feb 2023 09:23:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=8TbebfEQGupoy/9v/z5uEIiMfU8D0SE8LTUS76/4dow=; b=jEjQtbcquVHU8Ggz47ZybOguHR 
1bxalUflyRQu0unb/sLmUz0tbYotSThHT0iuuYQN7Gv3yJQlNRkhJnIJwtAUQIocwuulqGRkc6aYe lkpvxSu6hD5qRKQb6is+Reml5E5j2QDsJNJSJoOb4cXa0H6FTdH1/72tIeuPw8ynD6OPIvrSE62F8 1iAvWhSBH8gjam9wv/Ov3Gz7shF6DcNn480VHqyDUeQAZIJ3h8NB55udNf8MX9i6DjM37Sf0Uf7RS apj5TLukX/82BUZz5VbRyGKbpihRKVob316SF0YBxl+62BsvDuubNV8aE/1pmmTn4UxzGlhNTGpmD nBibaSoQ==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1pPoAg-001PY4-Gg; Wed, 08 Feb 2023 17:23:51 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 1ECF73007FB; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 0469123D88E92; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Message-ID: <20230208172245.362196959@infradead.org> User-Agent: quilt/0.66 Date: Wed, 08 Feb 2023 18:17:58 +0100 From: Peter Zijlstra To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net Cc: linux-kernel@vger.kernel.org, peterz@infradead.org Subject: [PATCH 02/10] objtool: Make instruction::stack_ops a single-linked list References: <20230208171756.898991570@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ unsigned int len; /* 64 4 */ enum insn_type type; /* 68 4 */ long unsigned int immediate; /* 72 8 */ u16 dead_end:1; /* 80: 0 2 */ u16 ignore:1; /* 80: 1 2 
*/ u16 ignore_alts:1; /* 80: 2 2 */ u16 hint:1; /* 80: 3 2 */ u16 save:1; /* 80: 4 2 */ u16 restore:1; /* 80: 5 2 */ u16 retpoline_safe:1; /* 80: 6 2 */ u16 noendbr:1; /* 80: 7 2 */ u16 entry:1; /* 80: 8 2 */ /* XXX 7 bits hole, try to pack */ s8 instr; /* 82 1 */ u8 visited; /* 83 1 */ /* XXX 4 bytes hole, try to pack */ struct alt_group * alt_group; /* 88 8 */ struct symbol * call_dest; /* 96 8 */ struct instruction * jump_dest; /* 104 8 */ struct instruction * first_jump_src; /* 112 8 */ struct reloc * jump_table; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ struct reloc * reloc; /* 128 8 */ struct list_head alts; /* 136 16 */ struct symbol * sym; /* 152 8 */ - struct list_head stack_ops; /* 160 16 */ - struct cfi_state * cfi; /* 176 8 */ + struct stack_op * stack_ops; /* 160 8 */ + struct cfi_state * cfi; /* 168 8 */ - /* size: 184, cachelines: 3, members: 29 */ - /* sum members: 178, holes: 1, sum holes: 4 */ + /* size: 176, cachelines: 3, members: 29 */ + /* sum members: 170, holes: 1, sum holes: 4 */ /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits */ - /* last cacheline: 56 bytes */ + /* last cacheline: 48 bytes */ }; pre: 5:58.22 real, 226.69 user, 131.22 sys, 26221520 mem post: 5:58.50 real, 229.64 user, 128.65 sys, 26221520 mem Signed-off-by: Peter Zijlstra (Intel) Acked-by: Josh Poimboeuf Tested-by: Nathan Chancellor # build only Tested-by: Thomas Wei=C3=9Fschuh # compile and run --- tools/objtool/arch/x86/decode.c | 4 ++-- tools/objtool/check.c | 11 +++++------ tools/objtool/include/objtool/arch.h | 2 +- tools/objtool/include/objtool/check.h | 2 +- 4 files changed, 9 insertions(+), 10 deletions(-) --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -105,7 +105,7 @@ bool arch_pc_relative_reloc(struct reloc #define ADD_OP(op) \ if (!(op =3D calloc(1, sizeof(*op)))) \ return -1; \ - else for (list_add_tail(&op->list, ops_list); op; op =3D NULL) + else for (*ops_list =3D op, ops_list =3D 
&op->next; op; op =3D NULL) =20 /* * Helpers to decode ModRM/SIB: @@ -148,7 +148,7 @@ int arch_decode_instruction(struct objto unsigned long offset, unsigned int maxlen, struct instruction *insn) { - struct list_head *ops_list =3D &insn->stack_ops; + struct stack_op **ops_list =3D &insn->stack_ops; const struct elf *elf =3D file->elf; struct insn ins; int x86_64, ret; --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -396,7 +396,6 @@ static int decode_instructions(struct ob } memset(insn, 0, sizeof(*insn)); INIT_LIST_HEAD(&insn->alts); - INIT_LIST_HEAD(&insn->stack_ops); INIT_LIST_HEAD(&insn->call_node); =20 insn->sec =3D sec; @@ -1319,12 +1318,13 @@ static struct reloc *insn_reloc(struct o =20 static void remove_insn_ops(struct instruction *insn) { - struct stack_op *op, *tmp; + struct stack_op *op, *next; =20 - list_for_each_entry_safe(op, tmp, &insn->stack_ops, list) { - list_del(&op->list); + for (op =3D insn->stack_ops; op; op =3D next) { + next =3D op->next; free(op); } + insn->stack_ops =3D NULL; } =20 static void annotate_call_site(struct objtool_file *file, @@ -1769,7 +1769,6 @@ static int handle_group_alt(struct objto } memset(nop, 0, sizeof(*nop)); INIT_LIST_HEAD(&nop->alts); - INIT_LIST_HEAD(&nop->stack_ops); =20 nop->sec =3D special_alt->new_sec; nop->offset =3D special_alt->new_off + special_alt->new_len; @@ -3214,7 +3213,7 @@ static int handle_insn_ops(struct instru { struct stack_op *op; =20 - list_for_each_entry(op, &insn->stack_ops, list) { + for (op =3D insn->stack_ops; op; op =3D op->next) { =20 if (update_cfi_state(insn, next_insn, &state->cfi, op)) return 1; --- a/tools/objtool/include/objtool/arch.h +++ b/tools/objtool/include/objtool/arch.h @@ -62,9 +62,9 @@ struct op_src { }; =20 struct stack_op { + struct stack_op *next; struct op_dest dest; struct op_src src; - struct list_head list; }; =20 struct instruction; --- a/tools/objtool/include/objtool/check.h +++ b/tools/objtool/include/objtool/check.h @@ -68,7 +68,7 @@ struct 
instruction { struct reloc *reloc; struct list_head alts; struct symbol *sym; - struct list_head stack_ops; + struct stack_op *stack_ops; struct cfi_state *cfi; }; From nobody Fri Sep 12 16:18:08 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6D67C636CC for ; Wed, 8 Feb 2023 17:24:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231906AbjBHRYa (ORCPT ); Wed, 8 Feb 2023 12:24:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55402 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231791AbjBHRYJ (ORCPT ); Wed, 8 Feb 2023 12:24:09 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76E77EC45 for ; Wed, 8 Feb 2023 09:23:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=dWf0mF+CkANB3UjknX/6VNA3izZPStL4WfuVfUb/La0=; b=RjRDvoUb9moSjybIp7cnAw4pfY tSoZoJOJAGO0KL5lbysz5L2bljP11PAJtlrVMPOeWexevT7+Zf8Tt9w70DhYBETpzGVItUQfwsU6D JU4OA6ZgcLimyWOY6U8JFgvZ5W1k15qVtCU9R3c2HMQMrAPM3czGBcGj374RjuHQI2WbN+hGvFSF0 j/nxUeu/VZJezzk+0WvzulBTAymZKB+iFs96QZ3CSQwp2Ro41FFMQN3LvGmEKihlOK3a7R7Ft36Zm Ux1IJckmUpBA65SOcfm6UyW8rMul5xei58aW/9c/hFMOjNMAaNUy/fq5oIi4El+teEqC5pm6mt7A5 sHnOtFFQ==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pPoA4-007Vvd-05; Wed, 08 Feb 2023 17:23:12 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net 
[192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 22C9A30080C; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 0A12E203D3415; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Message-ID: <20230208172245.430556498@infradead.org> User-Agent: quilt/0.66 Date: Wed, 08 Feb 2023 18:17:59 +0100 From: Peter Zijlstra To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net Cc: linux-kernel@vger.kernel.org, peterz@infradead.org Subject: [PATCH 03/10] objtool: Make instruction::alts a single-linked list References: <20230208171756.898991570@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ unsigned int len; /* 64 4 */ enum insn_type type; /* 68 4 */ long unsigned int immediate; /* 72 8 */ u16 dead_end:1; /* 80: 0 2 */ u16 ignore:1; /* 80: 1 2 */ u16 ignore_alts:1; /* 80: 2 2 */ u16 hint:1; /* 80: 3 2 */ u16 save:1; /* 80: 4 2 */ u16 restore:1; /* 80: 5 2 */ u16 retpoline_safe:1; /* 80: 6 2 */ u16 noendbr:1; /* 80: 7 2 */ u16 entry:1; /* 80: 8 2 */ /* XXX 7 bits hole, try to pack */ s8 instr; /* 82 1 */ u8 visited; /* 83 1 */ /* XXX 4 bytes hole, try to pack */ struct alt_group * alt_group; /* 88 8 */ struct symbol * call_dest; /* 96 8 */ struct instruction * jump_dest; /* 104 8 */ struct instruction * first_jump_src; /* 112 8 */ struct reloc * jump_table; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ struct reloc * reloc; /* 128 8 */ - struct 
list_head alts; /* 136 16 */ - struct symbol * sym; /* 152 8 */ - struct stack_op * stack_ops; /* 160 8 */ - struct cfi_state * cfi; /* 168 8 */ + struct alternative * alts; /* 136 8 */ + struct symbol * sym; /* 144 8 */ + struct stack_op * stack_ops; /* 152 8 */ + struct cfi_state * cfi; /* 160 8 */ - /* size: 176, cachelines: 3, members: 29 */ - /* sum members: 170, holes: 1, sum holes: 4 */ + /* size: 168, cachelines: 3, members: 29 */ + /* sum members: 162, holes: 1, sum holes: 4 */ /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits */ - /* last cacheline: 48 bytes */ + /* last cacheline: 40 bytes */ }; pre: 5:58.50 real, 229.64 user, 128.65 sys, 26221520 mem post: 5:48.86 real, 220.30 user, 128.34 sys, 24834672 mem Signed-off-by: Peter Zijlstra (Intel) Acked-by: Josh Poimboeuf Tested-by: Nathan Chancellor # build only Tested-by: Thomas Wei=C3=9Fschuh # compile and run --- tools/objtool/check.c | 18 +++++++++--------- tools/objtool/include/objtool/check.h | 2 +- 2 files changed, 10 insertions(+), 10 deletions(-) --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -23,7 +23,7 @@ #include =20 struct alternative { - struct list_head list; + struct alternative *next; struct instruction *insn; bool skip_orig; }; @@ -395,7 +395,6 @@ static int decode_instructions(struct ob return -1; } memset(insn, 0, sizeof(*insn)); - INIT_LIST_HEAD(&insn->alts); INIT_LIST_HEAD(&insn->call_node); =20 insn->sec =3D sec; @@ -1768,7 +1767,6 @@ static int handle_group_alt(struct objto return -1; } memset(nop, 0, sizeof(*nop)); - INIT_LIST_HEAD(&nop->alts); =20 nop->sec =3D special_alt->new_sec; nop->offset =3D special_alt->new_off + special_alt->new_len; @@ -1966,7 +1964,8 @@ static int add_special_section_alts(stru alt->insn =3D new_insn; alt->skip_orig =3D special_alt->skip_orig; orig_insn->ignore_alts |=3D special_alt->skip_alt; - list_add_tail(&alt->list, &orig_insn->alts); + alt->next =3D orig_insn->alts; + orig_insn->alts =3D alt; =20 
list_del(&special_alt->list); free(special_alt); @@ -2025,7 +2024,8 @@ static int add_jump_table(struct objtool } =20 alt->insn =3D dest_insn; - list_add_tail(&alt->list, &insn->alts); + alt->next =3D insn->alts; + insn->alts =3D alt; prev_offset =3D reloc->offset; } =20 @@ -3576,10 +3576,10 @@ static int validate_branch(struct objtoo if (propagate_alt_cfi(file, insn)) return 1; =20 - if (!insn->ignore_alts && !list_empty(&insn->alts)) { + if (!insn->ignore_alts && insn->alts) { bool skip_orig =3D false; =20 - list_for_each_entry(alt, &insn->alts, list) { + for (alt =3D insn->alts; alt; alt =3D alt->next) { if (alt->skip_orig) skip_orig =3D true; =20 @@ -3778,11 +3778,11 @@ static int validate_entry(struct objtool =20 insn->visited |=3D VISITED_ENTRY; =20 - if (!insn->ignore_alts && !list_empty(&insn->alts)) { + if (!insn->ignore_alts && insn->alts) { struct alternative *alt; bool skip_orig =3D false; =20 - list_for_each_entry(alt, &insn->alts, list) { + for (alt =3D insn->alts; alt; alt =3D alt->next) { if (alt->skip_orig) skip_orig =3D true; =20 --- a/tools/objtool/include/objtool/check.h +++ b/tools/objtool/include/objtool/check.h @@ -66,7 +66,7 @@ struct instruction { struct instruction *first_jump_src; struct reloc *jump_table; struct reloc *reloc; - struct list_head alts; + struct alternative *alts; struct symbol *sym; struct stack_op *stack_ops; struct cfi_state *cfi; From nobody Fri Sep 12 16:18:08 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0F7E2C636CC for ; Wed, 8 Feb 2023 17:24:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231874AbjBHRYO (ORCPT ); Wed, 8 Feb 2023 12:24:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55384 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with 
ESMTP id S230467AbjBHRYG (ORCPT ); Wed, 8 Feb 2023 12:24:06 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E06CA2C665 for ; Wed, 8 Feb 2023 09:23:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=c4iXDIqTZJbezvVFwGhR/tsclWYbiTSQB02fC9Oia0s=; b=hNNye+947Z9LCWb8YzeJ6lRyUv rTTT2sZx442WDoWjt6m6VKy2MzplIkf6PaKWA42xBamy2yb+GumhiCiKWp2yFIMVKMB03NZ1xle21 Z0vJP1Jq05cmHP07MeaZCc2tAg3jYr/oOPk5B32Oc9XchAcd4BJLUX+Wa4LhDfz03SQ9R09hPJqeN u05fieyFJ1n0Iif0cLEJB/kHAsi/tMpGrYZKmL8+0+w8YXwvG+026+LmG4xGb+yu2Wgmv5E7VH4ws 1zu6jHhjnoAn4gWk3LtxSzW4HiWM9VySQO1foPna4z6wkuSgy+sX5hGAgnlEb1rFhGj+a5uzCZccw wZKkWwCw==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1pPoAg-001PY3-6C; Wed, 08 Feb 2023 17:23:50 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 26795300912; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 0E352203C2EC8; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Message-ID: <20230208172245.501847188@infradead.org> User-Agent: quilt/0.66 Date: Wed, 08 Feb 2023 18:18:00 +0100 From: Peter Zijlstra To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net Cc: linux-kernel@vger.kernel.org, peterz@infradead.org Subject: [PATCH 04/10] objtool: Shrink instruction::{type,visited} References: 
<20230208171756.898991570@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Since we don't have that many types in enum insn_type, force it into a u8 and re-arrange the members to get rid of the holes; this saves another 8 bytes. struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ - unsigned int len; /* 64 4 */ - enum insn_type type; /* 68 4 */ - long unsigned int immediate; /* 72 8 */ - u16 dead_end:1; /* 80: 0 2 */ - u16 ignore:1; /* 80: 1 2 */ - u16 ignore_alts:1; /* 80: 2 2 */ - u16 hint:1; /* 80: 3 2 */ - u16 save:1; /* 80: 4 2 */ - u16 restore:1; /* 80: 5 2 */ - u16 retpoline_safe:1; /* 80: 6 2 */ - u16 noendbr:1; /* 80: 7 2 */ - u16 entry:1; /* 80: 8 2 */ + long unsigned int immediate; /* 64 8 */ + unsigned int len; /* 72 4 */ + u8 type; /* 76 1 */ - /* XXX 7 bits hole, try to pack */ + /* Bitfield combined with previous fields */ - s8 instr; /* 82 1 */ - u8 visited; /* 83 1 */ + u16 dead_end:1; /* 76: 8 2 */ + u16 ignore:1; /* 76: 9 2 */ + u16 ignore_alts:1; /* 76:10 2 */ + u16 hint:1; /* 76:11 2 */ + u16 save:1; /* 76:12 2 */ + u16 restore:1; /* 76:13 2 */ + u16 retpoline_safe:1; /* 76:14 2 */ + u16 noendbr:1; /* 76:15 2 */ + u16 entry:1; /* 78: 0 2 */ + u16 visited:4; /* 78: 1 2 */ - /* XXX 4 bytes hole, try to pack */ + /* XXX 3 bits hole, try to pack */ + /* Bitfield combined with next fields */ - struct alt_group * alt_group; /* 88 8 */ - struct symbol * call_dest; /* 96 8 */ - struct instruction * jump_dest; /* 104 8 */ - struct instruction * first_jump_src; /* 112 8 */ - struct reloc * jump_table; /* 120 8 */ + s8 instr; /* 79 1 */ + struct alt_group * alt_group; /* 80 8 */ + struct symbol * call_dest; /* 88 8 */ + struct 
instruction * jump_dest; /* 96 8 */ + struct instruction * first_jump_src; /* 104 8 */ + struct reloc * jump_table; /* 112 8 */ + struct reloc * reloc; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ - struct reloc * reloc; /* 128 8 */ - struct alternative * alts; /* 136 8 */ - struct symbol * sym; /* 144 8 */ - struct stack_op * stack_ops; /* 152 8 */ - struct cfi_state * cfi; /* 160 8 */ + struct alternative * alts; /* 128 8 */ + struct symbol * sym; /* 136 8 */ + struct stack_op * stack_ops; /* 144 8 */ + struct cfi_state * cfi; /* 152 8 */ - /* size: 168, cachelines: 3, members: 29 */ - /* sum members: 162, holes: 1, sum holes: 4 */ - /* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits */ - /* last cacheline: 40 bytes */ + /* size: 160, cachelines: 3, members: 29 */ + /* sum members: 158 */ + /* sum bitfield members: 13 bits, bit holes: 1, sum bit holes: 3 bits */ + /* last cacheline: 32 bytes */ }; pre: 5:48.86 real, 220.30 user, 128.34 sys, 24834672 mem post: 5:48.89 real, 220.96 user, 127.55 sys, 24834672 mem Signed-off-by: Peter Zijlstra (Intel) Acked-by: Josh Poimboeuf Tested-by: Nathan Chancellor # build only Tested-by: Thomas Wei=C3=9Fschuh # compile and run --- tools/objtool/include/objtool/check.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) --- a/tools/objtool/include/objtool/check.h +++ b/tools/objtool/include/objtool/check.h @@ -42,9 +42,9 @@ struct instruction { struct list_head call_node; struct section *sec; unsigned long offset; - unsigned int len; - enum insn_type type; unsigned long immediate; + unsigned int len; + u8 type; =20 u16 dead_end : 1, ignore : 1, @@ -54,11 +54,11 @@ struct instruction { restore : 1, retpoline_safe : 1, noendbr : 1, - entry : 1; - /* 7 bit hole */ + entry : 1, + visited : 4; + /* 3 bit hole */ =20 s8 instr; - u8 visited; =20 struct alt_group *alt_group; struct symbol *call_dest; From nobody Fri Sep 12 16:18:08 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
(2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DB64C636CC for ; Wed, 8 Feb 2023 17:24:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231897AbjBHRYZ (ORCPT ); Wed, 8 Feb 2023 12:24:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231713AbjBHRYJ (ORCPT ); Wed, 8 Feb 2023 12:24:09 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9817D37B5E for ; Wed, 8 Feb 2023 09:23:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=GvO8aVeF7JKwzvQc7hhJHgk9VBm8YhlWG+s1YwCYQvE=; b=M8movAMwfAbHmAKhR1cTW3NTbS tY9ts2TQCEnA4Yz7KFMm31MAR7KopSfB85a8YlwIjf+k1kzP114AUTPp+MBBJHgaVfBW2memKa+lM HsyoWxGFJrxcHNvQ099cbYjeiQh4hv34h602NFfrdE/jGraVNx2m67Lc7heVP38bhDwfkxU/VK1Z5 jo0grckrq/yrBgFISK5/UnTWj0sh0fIekIiBLZWz1Az3cNZM580DRZmCgbMJ45vnC4BNP/IUJcxJs H8zmnbKRE5W75yUVsOqnvrADLBSZhGoIY559M3z9h9514a6liwQTsNx+u/XuLB8bVG3z6NZF33m26 CU66KdVQ==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pPoA4-007Vvj-2I; Wed, 08 Feb 2023 17:23:13 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 497B53010E0; 
Wed, 8 Feb 2023 18:23:50 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 1269923698889; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Message-ID: <20230208172245.572145269@infradead.org> User-Agent: quilt/0.66 Date: Wed, 08 Feb 2023 18:18:01 +0100 From: Peter Zijlstra To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net Cc: linux-kernel@vger.kernel.org, peterz@infradead.org Subject: [PATCH 05/10] objtool: Remove instruction::reloc References: <20230208171756.898991570@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Instead of caching the reloc for each instruction, only keep a negative cache of not having a reloc (by far the most common case). struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ long unsigned int immediate; /* 64 8 */ unsigned int len; /* 72 4 */ u8 type; /* 76 1 */ /* Bitfield combined with previous fields */ u16 dead_end:1; /* 76: 8 2 */ u16 ignore:1; /* 76: 9 2 */ u16 ignore_alts:1; /* 76:10 2 */ u16 hint:1; /* 76:11 2 */ u16 save:1; /* 76:12 2 */ u16 restore:1; /* 76:13 2 */ u16 retpoline_safe:1; /* 76:14 2 */ u16 noendbr:1; /* 76:15 2 */ u16 entry:1; /* 78: 0 2 */ u16 visited:4; /* 78: 1 2 */ + u16 no_reloc:1; /* 78: 5 2 */ - /* XXX 3 bits hole, try to pack */ + /* XXX 2 bits hole, try to pack */ /* Bitfield combined with next fields */ s8 instr; /* 79 1 */ struct alt_group * alt_group; /* 80 8 */ struct symbol * call_dest; /* 88 8 */ struct instruction * jump_dest; /* 96 8 */ struct instruction * first_jump_src; /* 104 8 */ struct reloc * jump_table; /* 112 8 */ - struct reloc * reloc; /* 120 8 */ + struct alternative * alts; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) 
--- */ - struct alternative * alts; /* 128 8 */ - struct symbol * sym; /* 136 8 */ - struct stack_op * stack_ops; /* 144 8 */ - struct cfi_state * cfi; /* 152 8 */ + struct symbol * sym; /* 128 8 */ + struct stack_op * stack_ops; /* 136 8 */ + struct cfi_state * cfi; /* 144 8 */ - /* size: 160, cachelines: 3, members: 29 */ - /* sum members: 158 */ - /* sum bitfield members: 13 bits, bit holes: 1, sum bit holes: 3 bits */ - /* last cacheline: 32 bytes */ + /* size: 152, cachelines: 3, members: 29 */ + /* sum members: 150 */ + /* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */ + /* last cacheline: 24 bytes */ }; pre: 5:48.89 real, 220.96 user, 127.55 sys, 24834672 mem post: 5:39.35 real, 215.58 user, 123.69 sys, 23448736 mem Signed-off-by: Peter Zijlstra (Intel) Acked-by: Josh Poimboeuf Tested-by: Nathan Chancellor # build only Tested-by: Thomas Wei=C3=9Fschuh # compile and run --- tools/objtool/check.c | 26 ++++++++++++-------------- tools/objtool/include/objtool/check.h | 6 +++--- 2 files changed, 15 insertions(+), 17 deletions(-) --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1307,26 +1307,24 @@ __weak bool arch_is_rethunk(struct symbo return false; } =20 -#define NEGATIVE_RELOC ((void *)-1L) - static struct reloc *insn_reloc(struct objtool_file *file, struct instruct= ion *insn) { - if (insn->reloc =3D=3D NEGATIVE_RELOC) + struct reloc *reloc; + + if (insn->no_reloc) return NULL; =20 - if (!insn->reloc) { - if (!file) - return NULL; - - insn->reloc =3D find_reloc_by_dest_range(file->elf, insn->sec, - insn->offset, insn->len); - if (!insn->reloc) { - insn->reloc =3D NEGATIVE_RELOC; - return NULL; - } + if (!file) + return NULL; + + reloc =3D find_reloc_by_dest_range(file->elf, insn->sec, + insn->offset, insn->len); + if (!reloc) { + insn->no_reloc =3D 1; + return NULL; } =20 - return insn->reloc; + return reloc; } =20 static void remove_insn_ops(struct instruction *insn) --- a/tools/objtool/include/objtool/check.h +++ 
b/tools/objtool/include/objtool/check.h @@ -55,8 +55,9 @@ struct instruction { retpoline_safe : 1, noendbr : 1, entry : 1, - visited : 4; - /* 3 bit hole */ + visited : 4, + no_reloc : 1; + /* 2 bit hole */ =20 s8 instr; =20 @@ -65,7 +66,6 @@ struct instruction { struct instruction *jump_dest; struct instruction *first_jump_src; struct reloc *jump_table; - struct reloc *reloc; struct alternative *alts; struct symbol *sym; struct stack_op *stack_ops; From nobody Fri Sep 12 16:18:08 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B616C636D3 for ; Wed, 8 Feb 2023 17:24:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230491AbjBHRYU (ORCPT ); Wed, 8 Feb 2023 12:24:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55412 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231510AbjBHRYI (ORCPT ); Wed, 8 Feb 2023 12:24:08 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 599EE303EB for ; Wed, 8 Feb 2023 09:23:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=tf6weflLVL5UNGPZOQBPuXAtnzAI3KVAbkARyh3VqT8=; b=ICLCoCx5dEH4rNkQwezCwmC3gk AU72ccSNgmlMRkx02M1/krwPZba/bbsuzgrewgBt7aIC2dppxiF0eMQ4IStWQiMqvtV3FogXR4Voe KFHp+zQruIu8h1cIGSqnf302/TE2BgdPT4QmBzgTT0a4LAZr6rXv1IRFnL+Jv3qe+eV5gHor6cKKk 5fPE7nlHBBjJ4l6ZOhFf7MSs8DM5nbNTip07kANOKNG3KSTRTeupjiaNRUk4c/5W4t0IVMI6FkXN7 FfHE2KV/dGAcP1C+Ze2C4LeaQLof6wgX1GpfMnnwIxMkvJ3vVgv0mYvLcln477JvsglN6EwST4yRl 7uL6p8mQ==; 
Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pPoA4-007Vvk-2T; Wed, 08 Feb 2023 17:23:13 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 4970C300AFB; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 1697A23D8CFB2; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Message-ID: <20230208172245.640914454@infradead.org> User-Agent: quilt/0.66 Date: Wed, 08 Feb 2023 18:18:02 +0100 From: Peter Zijlstra To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net Cc: linux-kernel@vger.kernel.org, peterz@infradead.org Subject: [PATCH 06/10] objtool: Union instruction::{call_dest,jump_table} References: <20230208171756.898991570@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The instruction call_dest and jump_table members can never be used at the same time, their usage depends on type. 
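A minimal, self-contained sketch of the idea described above, for illustration only: two pointers that are never live at the same time share storage in an anonymous union, and accessors keyed on the instruction type decide which view is valid. The toy_* names, the trimmed-down types and the asserting setters are hypothetical and are not objtool's; only the type-based selection mirrors what the changelog describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for objtool's types. */
enum toy_insn_type {
	TOY_INSN_CALL,
	TOY_INSN_JUMP_DYNAMIC,
	TOY_INSN_CALL_DYNAMIC,
};

struct toy_symbol { const char *name; };
struct toy_reloc  { unsigned long offset; };

struct toy_insn {
	enum toy_insn_type type;
	union {				/* one pointer of storage for both views */
		struct toy_symbol *_call_dest;
		struct toy_reloc  *_jump_table;
	};
};

/* Dynamic jumps/calls own the jump-table view, everything else the call view. */
static int toy_is_dynamic(const struct toy_insn *insn)
{
	return insn->type == TOY_INSN_JUMP_DYNAMIC ||
	       insn->type == TOY_INSN_CALL_DYNAMIC;
}

static struct toy_symbol *toy_call_dest(const struct toy_insn *insn)
{
	return toy_is_dynamic(insn) ? NULL : insn->_call_dest;
}

static struct toy_reloc *toy_jump_table(const struct toy_insn *insn)
{
	return toy_is_dynamic(insn) ? insn->_jump_table : NULL;
}

/* The asserting setters are an addition of this sketch, not of the patch. */
static void toy_set_call_dest(struct toy_insn *insn, struct toy_symbol *sym)
{
	assert(!toy_is_dynamic(insn));
	insn->_call_dest = sym;
}

static void toy_set_jump_table(struct toy_insn *insn, struct toy_reloc *reloc)
{
	assert(toy_is_dynamic(insn));
	insn->_jump_table = reloc;
}
```

Reading through the accessor that does not match the type yields NULL rather than a reinterpreted pointer, which is why callers can keep their existing NULL checks.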
struct instruction { struct list_head list; /* 0 16 */ struct hlist_node hash; /* 16 16 */ struct list_head call_node; /* 32 16 */ struct section * sec; /* 48 8 */ long unsigned int offset; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ long unsigned int immediate; /* 64 8 */ unsigned int len; /* 72 4 */ u8 type; /* 76 1 */ /* Bitfield combined with previous fields */ u16 dead_end:1; /* 76: 8 2 */ u16 ignore:1; /* 76: 9 2 */ u16 ignore_alts:1; /* 76:10 2 */ u16 hint:1; /* 76:11 2 */ u16 save:1; /* 76:12 2 */ u16 restore:1; /* 76:13 2 */ u16 retpoline_safe:1; /* 76:14 2 */ u16 noendbr:1; /* 76:15 2 */ u16 entry:1; /* 78: 0 2 */ u16 visited:4; /* 78: 1 2 */ u16 no_reloc:1; /* 78: 5 2 */ /* XXX 2 bits hole, try to pack */ /* Bitfield combined with next fields */ s8 instr; /* 79 1 */ struct alt_group * alt_group; /* 80 8 */ - struct symbol * call_dest; /* 88 8 */ - struct instruction * jump_dest; /* 96 8 */ - struct instruction * first_jump_src; /* 104 8 */ - struct reloc * jump_table; /* 112 8 */ - struct alternative * alts; /* 120 8 */ + struct instruction * jump_dest; /* 88 8 */ + struct instruction * first_jump_src; /* 96 8 */ + union { + struct symbol * _call_dest; /* 104 8 */ + struct reloc * _jump_table; /* 104 8 */ + }; /* 104 8 */ + struct alternative * alts; /* 112 8 */ + struct symbol * sym; /* 120 8 */ /* --- cacheline 2 boundary (128 bytes) --- */ - struct symbol * sym; /* 128 8 */ - struct stack_op * stack_ops; /* 136 8 */ - struct cfi_state * cfi; /* 144 8 */ + struct stack_op * stack_ops; /* 128 8 */ + struct cfi_state * cfi; /* 136 8 */ - /* size: 152, cachelines: 3, members: 29 */ - /* sum members: 150 */ + /* size: 144, cachelines: 3, members: 28 */ + /* sum members: 142 */ /* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */ - /* last cacheline: 24 bytes */ + /* last cacheline: 16 bytes */ }; pre: 5:39.35 real, 215.58 user, 123.69 sys, 23448736 mem post: 5:38.18 real, 213.25 user, 124.90 sys, 23449040 mem Signed-off-by: 
Peter Zijlstra (Intel) Acked-by: Josh Poimboeuf Tested-by: Nathan Chancellor # build only Tested-by: Thomas Wei=C3=9Fschuh # compile and run --- tools/objtool/check.c | 73 +++++++++++++++++++++--------= ----- tools/objtool/include/objtool/check.h | 6 +- 2 files changed, 50 insertions(+), 29 deletions(-) --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -114,16 +114,34 @@ static struct instruction *prev_insn_sam for (insn =3D next_insn_same_sec(file, insn); insn; \ insn =3D next_insn_same_sec(file, insn)) =20 +static inline struct symbol *insn_call_dest(struct instruction *insn) +{ + if (insn->type =3D=3D INSN_JUMP_DYNAMIC || + insn->type =3D=3D INSN_CALL_DYNAMIC) + return NULL; + + return insn->_call_dest; +} + +static inline struct reloc *insn_jump_table(struct instruction *insn) +{ + if (insn->type =3D=3D INSN_JUMP_DYNAMIC || + insn->type =3D=3D INSN_CALL_DYNAMIC) + return insn->_jump_table; + + return NULL; +} + static bool is_jump_table_jump(struct instruction *insn) { struct alt_group *alt_group =3D insn->alt_group; =20 - if (insn->jump_table) + if (insn_jump_table(insn)) return true; =20 /* Retpoline alternative for a jump table? */ return alt_group && alt_group->orig_group && - alt_group->orig_group->first_insn->jump_table; + insn_jump_table(alt_group->orig_group->first_insn); } =20 static bool is_sibling_call(struct instruction *insn) @@ -137,8 +155,8 @@ static bool is_sibling_call(struct instr return !is_jump_table_jump(insn); } =20 - /* add_jump_destinations() sets insn->call_dest for sibling calls. */ - return (is_static_jump(insn) && insn->call_dest); + /* add_jump_destinations() sets insn_call_dest(insn) for sibling calls. */ + return (is_static_jump(insn) && insn_call_dest(insn)); } =20 /* @@ -273,8 +291,8 @@ static void init_insn_state(struct objto =20 /* * We need the full vmlinux for noinstr validation, otherwise we can - * not correctly determine insn->call_dest->sec (external symbols do - * not have a section). 
+ * not correctly determine insn_call_dest(insn)->sec (external symbols + * do not have a section). */ if (opts.link && opts.noinstr && sec) state->noinstr =3D sec->noinstr; @@ -677,7 +695,7 @@ static int create_static_call_sections(s return -1; =20 /* find key symbol */ - key_name =3D strdup(insn->call_dest->name); + key_name =3D strdup(insn_call_dest(insn)->name); if (!key_name) { perror("strdup"); return -1; @@ -708,7 +726,7 @@ static int create_static_call_sections(s * trampoline address. This is fixed up in * static_call_add_module(). */ - key_sym =3D insn->call_dest; + key_sym =3D insn_call_dest(insn); } free(key_name); =20 @@ -1342,7 +1360,7 @@ static void annotate_call_site(struct ob struct instruction *insn, bool sibling) { struct reloc *reloc =3D insn_reloc(file, insn); - struct symbol *sym =3D insn->call_dest; + struct symbol *sym =3D insn_call_dest(insn); =20 if (!sym) sym =3D reloc->sym; @@ -1427,7 +1445,7 @@ static void annotate_call_site(struct ob static void add_call_dest(struct objtool_file *file, struct instruction *i= nsn, struct symbol *dest, bool sibling) { - insn->call_dest =3D dest; + insn->_call_dest =3D dest; if (!dest) return; =20 @@ -1685,12 +1703,12 @@ static int add_call_destinations(struct if (insn->ignore) continue; =20 - if (!insn->call_dest) { + if (!insn_call_dest(insn)) { WARN_FUNC("unannotated intra-function call", insn->sec, insn->offset); return -1; } =20 - if (insn_func(insn) && insn->call_dest->type !=3D STT_FUNC) { + if (insn_func(insn) && insn_call_dest(insn)->type !=3D STT_FUNC) { WARN_FUNC("unsupported call to non-function", insn->sec, insn->offset); return -1; @@ -2127,7 +2145,7 @@ static void mark_func_jump_tables(struct reloc =3D find_jump_table(file, func, insn); if (reloc) { reloc->jump_table_start =3D true; - insn->jump_table =3D reloc; + insn->_jump_table =3D reloc; } } } @@ -2139,10 +2157,10 @@ static int add_func_jump_tables(struct o int ret; =20 func_for_each_insn(file, func, insn) { - if (!insn->jump_table) + 
if (!insn_jump_table(insn)) continue; =20 - ret =3D add_jump_table(file, insn, insn->jump_table); + ret =3D add_jump_table(file, insn, insn_jump_table(insn)); if (ret) return ret; } @@ -2614,8 +2632,8 @@ static int decode_sections(struct objtoo static bool is_fentry_call(struct instruction *insn) { if (insn->type =3D=3D INSN_CALL && - insn->call_dest && - insn->call_dest->fentry) + insn_call_dest(insn) && + insn_call_dest(insn)->fentry) return true; =20 return false; @@ -3322,8 +3340,8 @@ static inline const char *call_dest_name struct reloc *rel; int idx; =20 - if (insn->call_dest) - return insn->call_dest->name; + if (insn_call_dest(insn)) + return insn_call_dest(insn)->name; =20 rel =3D insn_reloc(NULL, insn); if (rel && !strcmp(rel->sym->name, "pv_ops")) { @@ -3405,13 +3423,13 @@ static int validate_call(struct objtool_ struct insn_state *state) { if (state->noinstr && state->instr <=3D 0 && - !noinstr_call_dest(file, insn, insn->call_dest)) { + !noinstr_call_dest(file, insn, insn_call_dest(insn))) { WARN_FUNC("call to %s() leaves .noinstr.text section", insn->sec, insn->offset, call_dest_name(insn)); return 1; } =20 - if (state->uaccess && !func_uaccess_safe(insn->call_dest)) { + if (state->uaccess && !func_uaccess_safe(insn_call_dest(insn))) { WARN_FUNC("call to %s() with UACCESS enabled", insn->sec, insn->offset, call_dest_name(insn)); return 1; @@ -3849,11 +3867,11 @@ static int validate_entry(struct objtool =20 /* fallthrough */ case INSN_CALL: - dest =3D find_insn(file, insn->call_dest->sec, - insn->call_dest->offset); + dest =3D find_insn(file, insn_call_dest(insn)->sec, + insn_call_dest(insn)->offset); if (!dest) { WARN("Unresolved function after linking!?: %s", - insn->call_dest->name); + insn_call_dest(insn)->name); return -1; } =20 @@ -3954,13 +3972,13 @@ static int validate_retpoline(struct obj static bool is_kasan_insn(struct instruction *insn) { return (insn->type =3D=3D INSN_CALL && - !strcmp(insn->call_dest->name, "__asan_handle_no_return")); + 
!strcmp(insn_call_dest(insn)->name, "__asan_handle_no_return")); } =20 static bool is_ubsan_insn(struct instruction *insn) { return (insn->type =3D=3D INSN_CALL && - !strcmp(insn->call_dest->name, + !strcmp(insn_call_dest(insn)->name, "__ubsan_handle_builtin_unreachable")); } =20 @@ -4038,7 +4056,8 @@ static bool ignore_unreachable_insn(stru * It may also insert a UD2 after calling a __noreturn function. */ prev_insn =3D list_prev_entry(insn, list); - if ((prev_insn->dead_end || dead_end_function(file, prev_insn->call_dest)= ) && + if ((prev_insn->dead_end || + dead_end_function(file, insn_call_dest(prev_insn))) && (insn->type =3D=3D INSN_BUG || (insn->type =3D=3D INSN_JUMP_UNCONDITIONAL && insn->jump_dest && insn->jump_dest->type =3D=3D INSN_BUG))) --- a/tools/objtool/include/objtool/check.h +++ b/tools/objtool/include/objtool/check.h @@ -62,10 +62,12 @@ struct instruction { s8 instr; =20 struct alt_group *alt_group; - struct symbol *call_dest; struct instruction *jump_dest; struct instruction *first_jump_src; - struct reloc *jump_table; + union { + struct symbol *_call_dest; + struct reloc *_jump_table; + }; struct alternative *alts; struct symbol *sym; struct stack_op *stack_ops; From nobody Fri Sep 12 16:18:08 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9B05C636CC for ; Wed, 8 Feb 2023 17:24:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230471AbjBHRYe (ORCPT ); Wed, 8 Feb 2023 12:24:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55502 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231820AbjBHRYK (ORCPT ); Wed, 8 Feb 2023 12:24:10 -0500 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id 634424F84E for ; Wed, 8 Feb 2023 09:23:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=rcZCmoIUcgLAbnZrH9bmcKFtR9+VWwtjQJpHM8y0dwU=; b=ZwzycoRhBe720t52V6JFKc/80J TLYRBAd2Q1EQRx2k7IQPbUI/JnFkzI6XV7zjMBdByLB6VTtVH1c7PODLaP1ce3IX/YqJiNPzAb90N HP5WsGY45Q5ADfBGiFbanS/iqb+F7jr0/suucQu2rl4Wcrpi2dlUn39Uw7CJ+dDaCrVpM3Hxeh5zv fVFcc3PqO0Bc4QquCSX2jFBrnb87rVzkLoYomWyJoKG8nt9J20aBl3PPbL+QsUswItCzcpQPJR4XE ymKSNI0RgVkyFlOxWVWLPBwwHNIQxO7xW8AK23aFuWTa0P+I1WfWl0YKj8GpQgVzq+vvik5hySYu5 qOLrn9ug==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pPoA5-007Vvl-14; Wed, 08 Feb 2023 17:23:13 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 49753300BE3; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 1ABAD23D8CFB3; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Message-ID: <20230208172245.711471461@infradead.org> User-Agent: quilt/0.66 Date: Wed, 08 Feb 2023 18:18:03 +0100 From: Peter Zijlstra To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net Cc: linux-kernel@vger.kernel.org, peterz@infradead.org Subject: [PATCH 07/10] objtool: Fix overlapping alternatives References: <20230208171756.898991570@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" 
Things like ALTERNATIVE_{2,3}() generate multiple alternatives on the same place, objtool would override the first orig_alt_group with the second (or third), failing to check the CFI among all the different variants. Signed-off-by: Peter Zijlstra (Intel) Acked-by: Josh Poimboeuf Tested-by: Nathan Chancellor # build only Tested-by: Thomas Wei=C3=9Fschuh # compile and run --- tools/objtool/check.c | 69 +++++++++++++++++++++++++++++++--------------= ----- 1 file changed, 43 insertions(+), 26 deletions(-) --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1732,36 +1732,49 @@ static int handle_group_alt(struct objto struct instruction *orig_insn, struct instruction **new_insn) { - struct instruction *last_orig_insn, *last_new_insn =3D NULL, *insn, *nop = =3D NULL; + struct instruction *last_new_insn =3D NULL, *insn, *nop =3D NULL; struct alt_group *orig_alt_group, *new_alt_group; unsigned long dest_off; =20 - - orig_alt_group =3D malloc(sizeof(*orig_alt_group)); + orig_alt_group =3D orig_insn->alt_group; if (!orig_alt_group) { - WARN("malloc failed"); - return -1; - } - orig_alt_group->cfi =3D calloc(special_alt->orig_len, - sizeof(struct cfi_state *)); - if (!orig_alt_group->cfi) { - WARN("calloc failed"); - return -1; - } + struct instruction *last_orig_insn =3D NULL; =20 - last_orig_insn =3D NULL; - insn =3D orig_insn; - sec_for_each_insn_from(file, insn) { - if (insn->offset >=3D special_alt->orig_off + special_alt->orig_len) - break; + orig_alt_group =3D malloc(sizeof(*orig_alt_group)); + if (!orig_alt_group) { + WARN("malloc failed"); + return -1; + } + orig_alt_group->cfi =3D calloc(special_alt->orig_len, + sizeof(struct cfi_state *)); + if (!orig_alt_group->cfi) { + WARN("calloc failed"); + return -1; + } =20 - insn->alt_group =3D orig_alt_group; - last_orig_insn =3D insn; - } - orig_alt_group->orig_group =3D NULL; - orig_alt_group->first_insn =3D orig_insn; - orig_alt_group->last_insn =3D last_orig_insn; + insn =3D orig_insn; + 
sec_for_each_insn_from(file, insn) { + if (insn->offset >=3D special_alt->orig_off + special_alt->orig_len) + break; =20 + insn->alt_group =3D orig_alt_group; + last_orig_insn =3D insn; + } + orig_alt_group->orig_group =3D NULL; + orig_alt_group->first_insn =3D orig_insn; + orig_alt_group->last_insn =3D last_orig_insn; + } else { + if (orig_alt_group->last_insn->offset + orig_alt_group->last_insn->len - + orig_alt_group->first_insn->offset !=3D special_alt->orig_len) { + WARN_FUNC("weirdly overlapping alternative! %ld !=3D %d", + orig_insn->sec, orig_insn->offset, + orig_alt_group->last_insn->offset + + orig_alt_group->last_insn->len - + orig_alt_group->first_insn->offset, + special_alt->orig_len); + return -1; + } + } =20 new_alt_group =3D malloc(sizeof(*new_alt_group)); if (!new_alt_group) { @@ -1836,7 +1849,7 @@ static int handle_group_alt(struct objto =20 dest_off =3D arch_jump_destination(insn); if (dest_off =3D=3D special_alt->new_off + special_alt->new_len) { - insn->jump_dest =3D next_insn_same_sec(file, last_orig_insn); + insn->jump_dest =3D next_insn_same_sec(file, orig_alt_group->last_insn); if (!insn->jump_dest) { WARN_FUNC("can't find alternative jump destination", insn->sec, insn->offset); @@ -3214,8 +3227,12 @@ static int propagate_alt_cfi(struct objt alt_cfi[group_off] =3D insn->cfi; } else { if (cficmp(alt_cfi[group_off], insn->cfi)) { - WARN_FUNC("stack layout conflict in alternatives", - insn->sec, insn->offset); + struct alt_group *orig_group =3D insn->alt_group->orig_group ?: insn->a= lt_group; + struct instruction *orig =3D orig_group->first_insn; + char *where =3D offstr(insn->sec, insn->offset); + WARN_FUNC("stack layout conflict in alternatives: %s", + orig->sec, orig->offset, where); + free(where); return -1; } } From nobody Fri Sep 12 16:18:08 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id B43A3C636D3 for ; Wed, 8 Feb 2023 17:24:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231901AbjBHRY2 (ORCPT ); Wed, 8 Feb 2023 12:24:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231775AbjBHRYJ (ORCPT ); Wed, 8 Feb 2023 12:24:09 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D1824ED19 for ; Wed, 8 Feb 2023 09:23:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=Ki0eJe+BBfgVdZtQHY6Ihx06bAxCNT64t7j0255dl/4=; b=P0M3DiAj0TvRasgj3eum/crA7n 9qbBlg3VJId6sDFxJtT2p5PdMqOGA18NShnLqyTf3r+9EVhKkKBFD6zEYybYnyVbdMX2flhaE3ydH eKUsczXA9WOf/Nh7Y8W5OLHdtzTIMwJnpv0rXofyQKkAfaMqTEsdiFKgf4eUbK4QXUWc3Ns3QXfvH MagsUVTDBKBGhnTa0NsHfki99SM+PBWizgXthmI/PNnZo/SU8YSpe2IZUPRwc8gVpGcJAlojKS/2D KumQaYuPs9uznL8Sq8sP5lA1WK+/zXYNrxwJL7dc4SyhGpM0A/lvKRXzBFMZzS7MCOP7NWjjqWfC6 S5TX1I1A==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1pPoAg-001PY7-Qh; Wed, 08 Feb 2023 17:23:51 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 497E13021D4; Wed, 8 Feb 2023 18:23:50 +0100 (CET) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 1EF4623D8CFB4; Wed, 8 Feb 2023 
18:23:50 +0100 (CET) Message-ID: <20230208172245.783099843@infradead.org> User-Agent: quilt/0.66 Date: Wed, 08 Feb 2023 18:18:04 +0100 From: Peter Zijlstra To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net Cc: linux-kernel@vger.kernel.org, peterz@infradead.org Subject: [PATCH 08/10] x86: Fix FILL_RETURN_BUFFER References: <20230208171756.898991570@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" With overlapping alternative validation fixed, objtool promptly complains: vmlinux.o: warning: objtool: __switch_to_asm+0x2c: stack layout conflict in= alternatives: .altinstr_replacement+0x47 .rela.altinstructions: 000000000000009c 0000000200000002 R_X86_64_PC32 0000000000000000 = .text + 16dc 00000000000000a0 0000000600000002 R_X86_64_PC32 0000000000000000 = .altinstr_replacement + 3a 00000000000000a8 0000000200000002 R_X86_64_PC32 0000000000000000 = .text + 16dc 00000000000000ac 0000000600000002 R_X86_64_PC32 0000000000000000 = .altinstr_replacement + 66 .text: 00000000000016b0 <__switch_to_asm>: 16b0: f3 0f 1e fa endbr64 16b4: 55 push %rbp 16b5: 53 push %rbx 16b6: 41 54 push %r12 16b8: 41 55 push %r13 16ba: 41 56 push %r14 16bc: 41 57 push %r15 16be: 48 89 a7 18 0b 00 00 mov %rsp,0xb18(%rdi) 16c5: 48 8b a6 18 0b 00 00 mov 0xb18(%rsi),%rsp 16cc: 48 8b 9e 28 05 00 00 mov 0x528(%rsi),%rbx 16d3: 65 48 89 1c 25 00 00 00 00 mov %rbx,%gs:0x0 16d= 8: R_X86_64_32S fixed_percpu_data+0x28 16dc: eb 2a jmp 1708 <__switch_to_asm+0x58> 16de: 90 nop 16df: 90 nop 16e0: 90 nop 16e1: 90 nop 16e2: 90 nop 16e3: 90 nop 16e4: 90 nop 16e5: 90 nop 16e6: 90 nop 16e7: 90 nop 16e8: 90 nop 16e9: 90 nop 16ea: 90 nop 16eb: 90 nop 16ec: 90 nop 16ed: 90 nop 16ee: 90 nop 16ef: 90 nop 16f0: 90 nop 16f1: 90 nop 16f2: 90 nop 16f3: 90 nop 16f4: 90 nop 16f5: 90 nop 16f6: 90 nop 16f7: 90 nop 16f8: 90 nop 16f9: 90 nop 16fa: 90 nop 16fb: 90 nop 16fc: 90 nop 
16fd: 90 nop 16fe: 90 nop 16ff: 90 nop 1700: 90 nop 1701: 90 nop 1702: 90 nop 1703: 90 nop 1704: 90 nop 1705: 90 nop 1706: 90 nop 1707: 90 nop 1708: 41 5f pop %r15 170a: 41 5e pop %r14 170c: 41 5d pop %r13 170e: 41 5c pop %r12 1710: 5b pop %rbx 1711: 5d pop %rbp 1712: e9 00 00 00 00 jmp 1717 <__switch_to_asm+0x67> = 1713: R_X86_64_PLT32 __switch_to-0x4 .altinstr_replacement: 3a: 49 c7 c4 10 00 00 00 mov $0x10,%r12 41: e8 01 00 00 00 call 47 <.altinstr_replacement+0x= 47> 46: cc int3 47: e8 01 00 00 00 call 4d <.altinstr_replacement+0x= 4d> 4c: cc int3 4d: 48 83 c4 10 add $0x10,%rsp 51: 49 ff cc dec %r12 54: 75 eb jne 41 <.altinstr_replacement+0x= 41> 56: 0f ae e8 lfence 59: 65 48 c7 04 25 00 00 00 00 ff ff ff ff movq $0xfffffffff= fffffff,%gs:0x0 5e: R_X86_64_32S pcpu_hot+0x10 66: e8 01 00 00 00 call 6c <.altinstr_replacement+0x= 6c> 6b: cc int3 6c: 48 83 c4 08 add $0x8,%rsp 70: 0f ae e8 lfence As can be seen from the two alternatives, when overlaid, the NOP after the shorter (starting at 66) coincides with the call at 47, leading to conflicting CFI state for that instruction. By offsetting the shorter alternative by 2 bytes, this alignment is undone. TODO: arguably objtool should be taught about the max nop length used by the kernel for tail padding, or unconditionally use JMP.d8 to not use the intervening bytes at all.
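The offset arithmetic behind that observation can be checked with a small sketch. It is an illustration under stated assumptions, not objtool code: the instruction lengths are read off the disassembly above, and the tail padding of the shorter replacement is modelled as a single NOP that starts right after it. The function and array names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Byte lengths of the instructions in the longer replacement (the one at
 * .altinstr_replacement+0x3a), read off the disassembly in the changelog:
 * mov, call, int3, call, int3, add, dec, jne, lfence, movq.
 */
static const unsigned int long_alt_len[] = { 7, 5, 1, 5, 1, 4, 3, 2, 3, 13 };
#define LONG_ALT_NR (sizeof(long_alt_len) / sizeof(long_alt_len[0]))

/* Total size of the longer replacement in bytes. */
static unsigned int long_alt_total(void)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < LONG_ALT_NR; i++)
		total += long_alt_len[i];
	return total;
}

/* Does an instruction of the longer replacement start at group offset @off? */
static int insn_starts_at(unsigned int off)
{
	unsigned int pos = 0;
	size_t i;

	for (i = 0; i < LONG_ALT_NR; i++) {
		if (pos == off)
			return 1;
		pos += long_alt_len[i];
	}
	return 0;
}

/*
 * Assumption of this sketch: the padding after a shorter replacement of
 * @short_len bytes is one NOP starting at group offset @short_len. It
 * "collides" when the longer replacement has an instruction boundary there.
 */
static int tail_nop_collides(unsigned int short_len)
{
	assert(short_len <= long_alt_total());
	return insn_starts_at(short_len);
}
```

With the shorter replacement at 5 + 1 + 4 + 3 = 13 bytes, tail_nop_collides(13) is true, since 0x3a + 13 = 0x47 is where the second call starts. With two extra NOP bytes in front, tail_nop_collides(15) is false, because offset 15 falls inside that call.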
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Josh Poimboeuf
Tested-by: Nathan Chancellor # build only
Tested-by: Thomas Weißschuh # compile and run
---
 arch/x86/include/asm/nospec-branch.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -261,7 +261,7 @@
 .macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
 	ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
 		__stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
-		__stringify(__FILL_ONE_RETURN), \ftr2
+		__stringify(nop;nop;__FILL_ONE_RETURN), \ftr2
 
 .Lskip_rsb_\@:
 .endm

From nobody Fri Sep 12 16:18:08 2025
Message-ID: <20230208172245.851307606@infradead.org>
User-Agent: quilt/0.66
Date: Wed, 08 Feb 2023 18:18:05 +0100
From: Peter Zijlstra
To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org
Subject: [PATCH 09/10] objtool: Remove instruction::list
References: <20230208171756.898991570@infradead.org>

Replace the instruction::list by allocating instructions in arrays of
256 entries and stringing them together by (amortized) find_insn().
This shrinks instruction by 16 bytes and brings it down to 128.

 struct instruction {
-	struct list_head           list;                 /*     0    16 */
-	struct hlist_node          hash;                 /*    16    16 */
-	struct list_head           call_node;            /*    32    16 */
-	struct section *           sec;                  /*    48     8 */
-	long unsigned int          offset;               /*    56     8 */
-	/* --- cacheline 1 boundary (64 bytes) --- */
-	long unsigned int          immediate;            /*    64     8 */
-	unsigned int               len;                  /*    72     4 */
-	u8                         type;                 /*    76     1 */
-
-	/* Bitfield combined with previous fields */
+	struct hlist_node          hash;                 /*     0    16 */
+	struct list_head           call_node;            /*    16    16 */
+	struct section *           sec;                  /*    32     8 */
+	long unsigned int          offset;               /*    40     8 */
+	long unsigned int          immediate;            /*    48     8 */
+	u8                         len;                  /*    56     1 */
+	u8                         prev_len;             /*    57     1 */
+	u8                         type;                 /*    58     1 */
+	s8                         instr;                /*    59     1 */
+	u32                        idx:8;                /*    60: 0  4 */
+	u32                        dead_end:1;           /*    60: 8  4 */
+	u32                        ignore:1;             /*    60: 9  4 */
+	u32                        ignore_alts:1;        /*    60:10  4 */
+	u32                        hint:1;               /*    60:11  4 */
+	u32                        save:1;               /*    60:12  4 */
+	u32                        restore:1;            /*    60:13  4 */
+	u32                        retpoline_safe:1;     /*    60:14  4 */
+	u32                        noendbr:1;            /*    60:15  4 */
+	u32                        entry:1;              /*    60:16  4 */
+	u32                        visited:4;            /*    60:17  4 */
+	u32                        no_reloc:1;           /*    60:21  4 */
-	u16                        dead_end:1;           /*    76: 8  2 */
-	u16                        ignore:1;             /*    76: 9  2 */
-	u16                        ignore_alts:1;        /*    76:10  2 */
-	u16                        hint:1;               /*    76:11  2 */
-	u16                        save:1;               /*    76:12  2 */
-	u16                        restore:1;            /*    76:13  2 */
-	u16                        retpoline_safe:1;     /*    76:14  2 */
-	u16                        noendbr:1;            /*    76:15  2 */
-	u16                        entry:1;              /*    78: 0  2 */
-	u16                        visited:4;            /*    78: 1  2 */
-	u16                        no_reloc:1;           /*    78: 5  2 */
+	/* XXX 10 bits hole, try to pack */
-	/* XXX 2 bits hole, try to pack */
-	/* Bitfield combined with next fields */
-
-	s8                         instr;                /*    79     1 */
-	struct alt_group *         alt_group;            /*    80     8 */
-	struct instruction *       jump_dest;            /*    88     8 */
-	struct instruction *       first_jump_src;       /*    96     8 */
+	/* --- cacheline 1 boundary (64 bytes) --- */
+	struct alt_group *         alt_group;            /*    64     8 */
+	struct instruction *       jump_dest;            /*    72     8 */
+	struct instruction *       first_jump_src;       /*    80     8 */
 	union {
-		struct symbol *    _call_dest;           /*   104     8 */
-		struct reloc *     _jump_table;          /*   104     8 */
-	};                                               /*   104     8 */
-	struct alternative *       alts;                 /*   112     8 */
-	struct symbol *            sym;                  /*   120     8 */
-	/* --- cacheline 2 boundary (128 bytes) --- */
-	struct stack_op *          stack_ops;            /*   128     8 */
-	struct cfi_state *         cfi;                  /*   136     8 */
+		struct symbol *    _call_dest;           /*    88     8 */
+		struct reloc *     _jump_table;          /*    88     8 */
+	};                                               /*    88     8 */
+	struct alternative *       alts;                 /*    96     8 */
+	struct symbol *            sym;                  /*   104     8 */
+	struct stack_op *          stack_ops;            /*   112     8 */
+	struct cfi_state *         cfi;                  /*   120     8 */
-	/* size: 144, cachelines: 3, members: 28 */
-	/* sum members: 142 */
-	/* sum bitfield members: 14 bits, bit holes: 1, sum bit holes: 2 bits */
-	/* last cacheline: 16 bytes */
+	/* size: 128, cachelines: 2, members: 29 */
+	/* sum members: 124 */
+	/* sum bitfield members: 22 bits, bit holes: 1, sum bit holes: 10 bits */
 };

pre:	5:38.18 real,   213.25 user,    124.90 sys,     23449040 mem
post:	5:03.34 real,   210.75 user,    88.80 sys,      20241232 mem

Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Josh Poimboeuf
Tested-by: Nathan Chancellor # build only
Tested-by: Thomas Weißschuh # compile and run
---
 tools/objtool/check.c                   |  166 ++++++++++++++++++++------------
 tools/objtool/include/objtool/check.h   |   51 +++++----
 tools/objtool/include/objtool/objtool.h |    1
 tools/objtool/objtool.c                 |    1
 4 files changed, 133 insertions(+), 86 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -47,27 +47,29 @@ struct instruction *find_insn(struct obj
 	return NULL;
 }
 
-static struct instruction *next_insn_same_sec(struct objtool_file *file,
-					      struct instruction *insn)
+struct instruction *next_insn_same_sec(struct objtool_file *file,
+				       struct instruction *insn)
 {
-	struct instruction *next = list_next_entry(insn, list);
+	if (insn->idx == INSN_CHUNK_MAX)
+		return find_insn(file, insn->sec, insn->offset + insn->len);
 
-	if (!next || &next->list == &file->insn_list || next->sec != insn->sec)
+	insn++;
+	if (!insn->len)
 		return NULL;
 
-	return next;
+	return insn;
 }
 
 static struct instruction *next_insn_same_func(struct objtool_file *file,
 					       struct instruction *insn)
 {
-	struct instruction *next = list_next_entry(insn, list);
+	struct instruction *next = next_insn_same_sec(file, insn);
 	struct symbol *func = insn_func(insn);
 
 	if (!func)
 		return NULL;
 
-	if (&next->list != &file->insn_list && insn_func(next) == func)
+	if (next && insn_func(next) == func)
 		return next;
 
 	/* Check if we're already in the subfunction: */
@@ -78,17 +80,35 @@ static struct instruction *next_insn_sam
 	return find_insn(file, func->cfunc->sec, func->cfunc->offset);
 }
 
+static struct instruction *prev_insn_same_sec(struct objtool_file *file,
+					       struct instruction *insn)
+{
+	if (insn->idx == 0) {
+		if (insn->prev_len)
+			return find_insn(file, insn->sec, insn->offset - insn->prev_len);
+		return NULL;
+	}
+
+	return insn - 1;
+}
+
 static struct instruction *prev_insn_same_sym(struct objtool_file *file,
-					       struct instruction *insn)
+					      struct instruction *insn)
 {
-	struct instruction *prev = list_prev_entry(insn, list);
+	struct instruction *prev = prev_insn_same_sec(file, insn);
 
-	if (&prev->list != &file->insn_list && insn_func(prev) == insn_func(insn))
+	if (prev && insn_func(prev) == insn_func(insn))
 		return prev;
 
 	return NULL;
 }
 
+#define for_each_insn(file, insn)					\
+	for (struct section *__sec, *__fake = (struct section *)1;	\
+	     __fake; __fake = NULL)					\
+		for_each_sec(file, __sec)				\
+			sec_for_each_insn(file, __sec, insn)
+
 #define func_for_each_insn(file, func, insn)				\
 	for (insn = find_insn(file, func->sec, func->offset);		\
 	     insn;							\
@@ -96,16 +116,13 @@ static struct instruction *prev_insn_sam
 
 #define sym_for_each_insn(file, sym, insn)				\
 	for (insn = find_insn(file, sym->sec, sym->offset);		\
-	     insn && &insn->list != &file->insn_list &&			\
-		insn->sec == sym->sec &&				\
-		insn->offset < sym->offset + sym->len;			\
-	     insn = list_next_entry(insn, list))
+	     insn && insn->offset < sym->offset + sym->len;		\
+	     insn = next_insn_same_sec(file, insn))
 
 #define sym_for_each_insn_continue_reverse(file, sym, insn)		\
-	for (insn = list_prev_entry(insn, list);			\
-	     &insn->list != &file->insn_list &&				\
-		insn->sec == sym->sec && insn->offset >= sym->offset;	\
-	     insn = list_prev_entry(insn, list))
+	for (insn = prev_insn_same_sec(file, insn);			\
+	     insn && insn->offset >= sym->offset;			\
+	     insn = prev_insn_same_sec(file, insn))
 
 #define sec_for_each_insn_from(file, insn)				\
 	for (; insn; insn = next_insn_same_sec(file, insn))
@@ -383,6 +400,9 @@ static int decode_instructions(struct ob
 	int ret;
 
 	for_each_sec(file, sec) {
+		struct instruction *insns = NULL;
+		u8 prev_len = 0;
+		u8 idx = 0;
 
 		if (!(sec->sh.sh_flags & SHF_EXECINSTR))
 			continue;
@@ -407,22 +427,31 @@ static int decode_instructions(struct ob
 			sec->init = true;
 
 		for (offset = 0; offset < sec->sh.sh_size; offset += insn->len) {
-			insn = malloc(sizeof(*insn));
-			if (!insn) {
-				WARN("malloc failed");
-				return -1;
+			if (!insns || idx == INSN_CHUNK_MAX) {
+				insns = calloc(sizeof(*insn), INSN_CHUNK_SIZE);
+				if (!insns) {
+					WARN("malloc failed");
+					return -1;
+				}
+				idx = 0;
+			} else {
+				idx++;
 			}
-			memset(insn, 0, sizeof(*insn));
-			INIT_LIST_HEAD(&insn->call_node);
+			insn = &insns[idx];
+			insn->idx = idx;
 
+			INIT_LIST_HEAD(&insn->call_node);
 			insn->sec = sec;
 			insn->offset = offset;
+			insn->prev_len = prev_len;
 
 			ret = arch_decode_instruction(file, sec, offset,
 						      sec->sh.sh_size - offset,
 						      insn);
 			if (ret)
-				goto err;
+				return ret;
+
+			prev_len = insn->len;
 
 			/*
 			 * By default, "ud2" is a dead end unless otherwise
@@ -433,10 +462,11 @@ static int decode_instructions(struct ob
 				insn->dead_end = true;
 
 			hash_add(file->insn_hash, &insn->hash, sec_offset_hash(sec, insn->offset));
-			list_add_tail(&insn->list, &file->insn_list);
 			nr_insns++;
 		}
 
+//		printf("%s: last chunk used: %d\n", sec->name, (int)idx);
+
 		list_for_each_entry(func, &sec->symbol_list, list) {
 			if (func->type != STT_NOTYPE && func->type != STT_FUNC)
 				continue;
@@ -479,10 +509,6 @@ static int decode_instructions(struct ob
 		printf("nr_insns: %lu\n", nr_insns);
 
 	return 0;
-
-err:
-	free(insn);
-	return ret;
 }
 
 /*
@@ -597,7 +623,7 @@ static int add_dead_ends(struct objtool_
 		}
 		insn = find_insn(file, reloc->sym->sec, reloc->addend);
 		if (insn)
-			insn = list_prev_entry(insn, list);
+			insn = prev_insn_same_sec(file, insn);
 		else if (reloc->addend == reloc->sym->sec->sh.sh_size) {
 			insn = find_last_insn(file, reloc->sym->sec);
 			if (!insn) {
@@ -632,7 +658,7 @@ static int add_dead_ends(struct objtool_
 		}
 		insn = find_insn(file, reloc->sym->sec, reloc->addend);
 		if (insn)
-			insn = list_prev_entry(insn, list);
+			insn = prev_insn_same_sec(file, insn);
 		else if (reloc->addend == reloc->sym->sec->sh.sh_size) {
 			insn = find_last_insn(file, reloc->sym->sec);
 			if (!insn) {
@@ -1763,6 +1789,7 @@ static int handle_group_alt(struct objto
 		orig_alt_group->orig_group = NULL;
 		orig_alt_group->first_insn = orig_insn;
 		orig_alt_group->last_insn = last_orig_insn;
+		orig_alt_group->nop = NULL;
 	} else {
 		if (orig_alt_group->last_insn->offset + orig_alt_group->last_insn->len -
 		    orig_alt_group->first_insn->offset != special_alt->orig_len) {
@@ -1864,12 +1891,11 @@ static int handle_group_alt(struct objto
 		return -1;
 	}
 
-	if (nop)
-		list_add(&nop->list, &last_new_insn->list);
 end:
 	new_alt_group->orig_group = orig_alt_group;
 	new_alt_group->first_insn = *new_insn;
-	new_alt_group->last_insn = nop ? : last_new_insn;
+	new_alt_group->last_insn = last_new_insn;
+	new_alt_group->nop = nop;
 	new_alt_group->cfi = orig_alt_group->cfi;
 	return 0;
 }
@@ -1919,7 +1945,7 @@ static int handle_jump_alt(struct objtoo
 	else
 		file->jl_long++;
 
-	*new_insn = list_next_entry(orig_insn, list);
+	*new_insn = next_insn_same_sec(file, orig_insn);
 	return 0;
 }
 
@@ -3504,11 +3530,28 @@ static struct instruction *next_insn_to_
 	 * Simulate the fact that alternatives are patched in-place.  When the
 	 * end of a replacement alt_group is reached, redirect objtool flow to
 	 * the end of the original alt_group.
+	 *
+	 * insn->alts->insn -> alt_group->first_insn
+	 *		       ...
+	 *		       alt_group->last_insn
+	 *		       [alt_group->nop]      -> next(orig_group->last_insn)
 	 */
-	if (alt_group && insn == alt_group->last_insn && alt_group->orig_group)
-		return next_insn_same_sec(file, alt_group->orig_group->last_insn);
+	if (alt_group) {
+		if (alt_group->nop) {
+			/* ->nop implies ->orig_group */
+			if (insn == alt_group->last_insn)
+				return alt_group->nop;
+			if (insn == alt_group->nop)
+				goto next_orig;
+		}
+		if (insn == alt_group->last_insn && alt_group->orig_group)
+			goto next_orig;
+	}
 
 	return next_insn_same_sec(file, insn);
+
+next_orig:
+	return next_insn_same_sec(file, alt_group->orig_group->last_insn);
 }
 
 /*
@@ -3759,11 +3802,25 @@ static int validate_branch(struct objtoo
 	return 0;
 }
 
+static int validate_unwind_hint(struct objtool_file *file,
+				  struct instruction *insn,
+				  struct insn_state *state)
+{
+	if (insn->hint && !insn->visited && !insn->ignore) {
+		int ret = validate_branch(file, insn_func(insn), insn, *state);
+		if (ret && opts.backtrace)
+			BT_FUNC("<=== (hint)", insn);
+		return ret;
+	}
+
+	return 0;
+}
+
 static int validate_unwind_hints(struct objtool_file *file, struct section *sec)
 {
 	struct instruction *insn;
 	struct insn_state state;
-	int ret, warnings = 0;
+	int warnings = 0;
 
 	if (!file->hints)
 		return 0;
@@ -3771,22 +3828,11 @@ static int validate_unwind_hints(struct
 	init_insn_state(file, &state, sec);
 
 	if (sec) {
-		insn = find_insn(file, sec, 0);
-		if (!insn)
-			return 0;
+		sec_for_each_insn(file, sec, insn)
+			warnings += validate_unwind_hint(file, insn, &state);
 	} else {
-		insn = list_first_entry(&file->insn_list, typeof(*insn), list);
-	}
-
-	while (&insn->list != &file->insn_list && (!sec || insn->sec == sec)) {
-		if (insn->hint && !insn->visited && !insn->ignore) {
-			ret = validate_branch(file, insn_func(insn), insn, state);
-			if (ret && opts.backtrace)
-				BT_FUNC("<=== (hint)", insn);
-			warnings += ret;
-		}
-
-		insn = list_next_entry(insn, list);
+		for_each_insn(file, insn)
+			warnings += validate_unwind_hint(file, insn, &state);
 	}
 
 	return warnings;
@@ -4052,7 +4098,7 @@ static bool ignore_unreachable_insn(stru
 	 *
 	 * It may also insert a UD2 after calling a __noreturn function.
 	 */
-	prev_insn = list_prev_entry(insn, list);
+	prev_insn = prev_insn_same_sec(file, insn);
 	if ((prev_insn->dead_end ||
 	     dead_end_function(file, insn_call_dest(prev_insn))) &&
 	    (insn->type == INSN_BUG ||
@@ -4084,7 +4130,7 @@ static bool ignore_unreachable_insn(stru
 		if (insn->offset + insn->len >= insn_func(insn)->offset + insn_func(insn)->len)
 			break;
 
-		insn = list_next_entry(insn, list);
+		insn = next_insn_same_sec(file, insn);
 	}
 
 	return false;
@@ -4097,10 +4143,10 @@ static int add_prefix_symbol(struct objt
 		return 0;
 
 	for (;;) {
-		struct instruction *prev = list_prev_entry(insn, list);
+		struct instruction *prev = prev_insn_same_sec(file, insn);
 		u64 offset;
 
-		if (&prev->list == &file->insn_list)
+		if (!prev)
 			break;
 
 		if (prev->type != INSN_NOP)
@@ -4493,7 +4539,7 @@ int check(struct objtool_file *file)
 
 	warnings += ret;
 
-	if (list_empty(&file->insn_list))
+	if (!nr_insns)
 		goto out;
 
 	if (opts.retpoline) {
@@ -4602,7 +4648,7 @@ int check(struct objtool_file *file)
 		warnings += ret;
 	}
 
-	if (opts.orc && !list_empty(&file->insn_list)) {
+	if (opts.orc && nr_insns) {
 		ret = orc_create(file);
 		if (ret < 0)
 			goto out;
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -27,7 +27,7 @@ struct alt_group {
 	struct alt_group *orig_group;
 
 	/* First and last instructions in the group */
-	struct instruction *first_insn, *last_insn;
+	struct instruction *first_insn, *last_insn, *nop;
 
 	/*
 	 * Byte-offset-addressed len-sized array of pointers to CFI structs.
@@ -36,31 +36,36 @@ struct alt_group {
 	struct cfi_state **cfi;
 };
 
+#define INSN_CHUNK_BITS		8
+#define INSN_CHUNK_SIZE		(1 << INSN_CHUNK_BITS)
+#define INSN_CHUNK_MAX		(INSN_CHUNK_SIZE - 1)
+
 struct instruction {
-	struct list_head list;
 	struct hlist_node hash;
 	struct list_head call_node;
 	struct section *sec;
 	unsigned long offset;
 	unsigned long immediate;
-	unsigned int len;
-	u8 type;
-
-	u16 dead_end		: 1,
-	   ignore		: 1,
-	   ignore_alts		: 1,
-	   hint			: 1,
-	   save			: 1,
-	   restore		: 1,
-	   retpoline_safe	: 1,
-	   noendbr		: 1,
-	   entry		: 1,
-	   visited		: 4,
-	   no_reloc		: 1;
-		/* 2 bit hole */
 
+	u8 len;
+	u8 prev_len;
+	u8 type;
 	s8 instr;
 
+	u32 idx			: INSN_CHUNK_BITS,
+	    dead_end		: 1,
+	    ignore		: 1,
+	    ignore_alts		: 1,
+	    hint		: 1,
+	    save		: 1,
+	    restore		: 1,
+	    retpoline_safe	: 1,
+	    noendbr		: 1,
+	    entry		: 1,
+	    visited		: 4,
+	    no_reloc		: 1;
+		/* 10 bit hole */
+
 	struct alt_group *alt_group;
 	struct instruction *jump_dest;
 	struct instruction *first_jump_src;
@@ -109,13 +114,11 @@ static inline bool is_jump(struct instru
 struct instruction *find_insn(struct objtool_file *file,
 			      struct section *sec, unsigned long offset);
 
-#define for_each_insn(file, insn)					\
-	list_for_each_entry(insn, &file->insn_list, list)
+struct instruction *next_insn_same_sec(struct objtool_file *file, struct instruction *insn);
 
-#define sec_for_each_insn(file, sec, insn)				\
-	for (insn = find_insn(file, sec, 0);				\
-	     insn && &insn->list != &file->insn_list &&			\
-			insn->sec == sec;				\
-	     insn = list_next_entry(insn, list))
+#define sec_for_each_insn(file, _sec, insn)				\
+	for (insn = find_insn(file, _sec, 0);				\
+	     insn && insn->sec == _sec;					\
+	     insn = next_insn_same_sec(file, insn))
 
 #endif /* _CHECK_H */
--- a/tools/objtool/include/objtool/objtool.h
+++ b/tools/objtool/include/objtool/objtool.h
@@ -21,7 +21,6 @@ struct pv_state {
 
 struct objtool_file {
 	struct elf *elf;
-	struct list_head insn_list;
 	DECLARE_HASHTABLE(insn_hash, 20);
 	struct list_head retpoline_call_list;
 	struct list_head return_thunk_list;
--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -99,7 +99,6 @@ struct objtool_file *objtool_open_read(c
 		return NULL;
 	}
 
-	INIT_LIST_HEAD(&file.insn_list);
 	hash_init(file.insn_hash);
 	INIT_LIST_HEAD(&file.retpoline_call_list);
 	INIT_LIST_HEAD(&file.return_thunk_list);

From nobody Fri Sep 12 16:18:08 2025
Message-ID: <20230208172245.922980544@infradead.org>
User-Agent: quilt/0.66
Date: Wed, 08 Feb 2023 18:18:06 +0100
From: Peter Zijlstra
To: x86@kernel.org, jpoimboe@redhat.com, linux@weissschuh.net
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org
Subject: [PATCH 10/10][HACK] objtool: Shrink reloc
References: <20230208171756.898991570@infradead.org>

Glorious hack, do not merge.

Good for another ~850M of allyesconfig savings.
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Josh Poimboeuf
Tested-by: Nathan Chancellor # build only
Tested-by: Thomas Weißschuh # compile and run
---
 tools/objtool/include/objtool/elf.h |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -71,17 +71,23 @@ struct reloc {
 	union {
 		GElf_Rela rela;
 		GElf_Rel  rel;
+		struct {
+			u64 offset;
+			u64 __bar;
+			s64 addend;
+		};
 	};
 	struct section *sec;
 	struct symbol *sym;
 	struct list_head sym_reloc_entry;
-	unsigned long offset;
-	unsigned int type;
-	s64 addend;
 	int idx;
+	unsigned short type;
 	bool jump_table_start;
 };
 
+static_assert(offsetof(struct reloc, rela.r_offset) == offsetof(struct reloc, offset));
+static_assert(offsetof(struct reloc, rela.r_addend) == offsetof(struct reloc, addend));
+
 #define ELF_HASH_BITS	20
 
 struct elf {