From nobody Mon Feb 9 04:13:53 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 20FDC22655B; Sun, 21 Dec 2025 04:31:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1766291508; cv=none; b=kscowoKiEhOwdsk/0jHqopvpYQx7jRIPG0xB5IAr/JFWIQgAkxZZ48vWDJYX/MNYXA/46bl28aLolMaNEWJtnOYm2rbt0qxSQucT2xH2dRiVcw70Pln0YhxplZZG5wvoHnjvbpryc6lxMj+ekHHdoq/3oUFvsEIZHNyzOkhWD8k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1766291508; c=relaxed/simple; bh=5CVZJSO1vs60pqptnZve9rSGX1Qu+JSFvxM3nsRVdgQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ErhsWFZPh8XDbNLvjDtNgjiM8i8B8p/h1JOxhLpflB/x2ob1kNX1PvL70JWp0kUjFMaFqTX7eKxEqbOjKjDQAUGBYGzPAmi2uX+RmytoHs9cR9CXi2DaH55N+xFHKoy39Ywa7xHJwatdBHFeDRWV3EIEeidlffqOwDmmBkjOphI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=dXEgRduh; arc=none smtp.client-ip=198.175.65.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="dXEgRduh" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1766291506; x=1797827506; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5CVZJSO1vs60pqptnZve9rSGX1Qu+JSFvxM3nsRVdgQ=; b=dXEgRduhpp36wo5h/W0qXEWE91Ulclzs9iYEmWR4fBOz8nENhletdrbz H2Ky5dEl/aouK5pPLPYuRvcyIEC0C5MP1g0T+3NAxwmM1tlbmMXmxlxqc Lt/3Bsgvh1qx+InMEngqdisRKh227m4+Out0ErdfXtDxen5cfO2dGpNlK HBZR1BPMVvDR2/YtWP1T4VFdlFqeeOtRo/Yb6cjpa/bmAxiIHLmgmu6Bj nHp3Y575GMwPZGz9JprHVm4qxXYgPRRg/tb6qrgl7Ii9P1kSh9eyjBl9d 3UjnmxPVWgNMD4LWzK+xYLdoikG27sXM0wym/nveLqWGnc6k3aVhd5IK3 A==; X-CSE-ConnectionGUID: aRp6bso6SEi/a9aecfASeQ== X-CSE-MsgGUID: WjJmfvtlQ1O1SseZAGJuJg== X-IronPort-AV: E=McAfee;i="6800,10657,11635"; a="68132419" X-IronPort-AV: E=Sophos;i="6.20,256,1758610800"; d="scan'208";a="68132419" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 Dec 2025 20:31:45 -0800 X-CSE-ConnectionGUID: ZEjWdzS+R/ez02yKsF97Pw== X-CSE-MsgGUID: BAsLqxneQ5mcF4JNB2efKw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.21,164,1763452800"; d="scan'208";a="229885020" Received: from chang-linux-3.sc.intel.com (HELO chang-linux-3) ([172.25.66.172]) by orviesa002.jf.intel.com with ESMTP; 20 Dec 2025 20:31:45 -0800 From: "Chang S. Bae" To: pbonzini@redhat.com, seanjc@google.com Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, chao.gao@intel.com, chang.seok.bae@intel.com Subject: [PATCH 10/16] KVM: emulate: Handle EGPR index and REX2-incompatible opcodes Date: Sun, 21 Dec 2025 04:07:36 +0000 Message-ID: <20251221040742.29749-11-chang.seok.bae@intel.com> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251221040742.29749-1-chang.seok.bae@intel.com> References: <20251221040742.29749-1-chang.seok.bae@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Prepare the emulator for REX2 handling by introducing the NoRex opcode flag and supporting extended register indices. Add a helper to factor out common logic for calculating register indices from a given register identifier and REX bits alone. REX2 does not support three-byte opcodes. Instead, the REX2.M bit selects between one- and two-byte opcode tables, which were previously distinguished by the 0x0F escape byte. Some legacy instructions in those tables never reference extended registers. When prefixed with REX, such instructions are treated as if the prefix were absent. In contrast, a REX2 prefix causes a #UD, which should be handled explicitly. Link: https://lore.kernel.org/1ebf3a23-5671-41c1-8daa-c83f2f105936@redhat.c= om Suggested-by: Paolo Bonzini Signed-off-by: Chang S. Bae --- Changes since last version: * Introduce a dedicated flag to mark REX2-incompatible opcodes (Paolo). With JMPABS dropped, there is no longer a need to separate opcode tables. * Refactor register index extraction into a common helper (Paolo) and adjust its return value to match existing call sites. Have __always_inline to ensure no build issue with BUILD_BUG_ON(!__builtin_constant_p(fld)) * Rewrite and clarify the changelog accordingly. --- arch/x86/kvm/emulate.c | 80 +++++++++++++++++++++++--------------- arch/x86/kvm/kvm_emulate.h | 1 + 2 files changed, 50 insertions(+), 31 deletions(-) diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index c8e292e9a24d..242e043634b9 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -175,6 +175,7 @@ #define TwoMemOp ((u64)1 << 55) /* Instruction has two memory operand = */ #define IsBranch ((u64)1 << 56) /* Instruction is considered a branch.= */ #define ShadowStack ((u64)1 << 57) /* Instruction affects Shadow Stacks. = */ +#define NoRex ((u64)1 << 58) /* Instruction has no use of REX prefi= x */ =20 #define DstXacc (DstAccLo | SrcAccHi | SrcWrite) =20 @@ -244,6 +245,7 @@ enum rex_bits { REX_X =3D 2, REX_R =3D 4, REX_W =3D 8, + REX_M =3D 0x80, }; =20 static void writeback_registers(struct x86_emulate_ctxt *ctxt) @@ -1078,6 +1080,15 @@ static int em_fnstsw(struct x86_emulate_ctxt *ctxt) return X86EMUL_CONTINUE; } =20 +static __always_inline int rex_get_rxb(u8 rex, u8 fld) +{ + BUILD_BUG_ON(!__builtin_constant_p(fld)); + BUILD_BUG_ON(fld !=3D REX_B && fld !=3D REX_X && fld !=3D REX_R); + + rex >>=3D ffs(fld) - 1; + return (rex & 1 ? 8 : 0) + (rex & 0x10 ? 16 : 0); +} + static void __decode_register_operand(struct x86_emulate_ctxt *ctxt, struct operand *op, int reg) { @@ -1117,7 +1128,7 @@ static void decode_register_operand(struct x86_emulat= e_ctxt *ctxt, if (ctxt->d & ModRM) reg =3D ctxt->modrm_reg; else - reg =3D (ctxt->b & 7) | (ctxt->rex_bits & REX_B ? 8 : 0); + reg =3D (ctxt->b & 7) | rex_get_rxb(ctxt->rex_bits, REX_B); =20 __decode_register_operand(ctxt, op, reg); } @@ -1136,9 +1147,9 @@ static int decode_modrm(struct x86_emulate_ctxt *ctxt, int rc =3D X86EMUL_CONTINUE; ulong modrm_ea =3D 0; =20 - ctxt->modrm_reg =3D (ctxt->rex_bits & REX_R ? 8 : 0); - index_reg =3D (ctxt->rex_bits & REX_X ? 8 : 0); - base_reg =3D (ctxt->rex_bits & REX_B ? 8 : 0); + ctxt->modrm_reg =3D rex_get_rxb(ctxt->rex_bits, REX_R); + index_reg =3D rex_get_rxb(ctxt->rex_bits, REX_X); + base_reg =3D rex_get_rxb(ctxt->rex_bits, REX_B); =20 ctxt->modrm_mod =3D (ctxt->modrm & 0xc0) >> 6; ctxt->modrm_reg |=3D (ctxt->modrm & 0x38) >> 3; @@ -4245,7 +4256,7 @@ static const struct opcode opcode_table[256] =3D { /* 0x38 - 0x3F */ I6ALU(NoWrite, em_cmp), N, N, /* 0x40 - 0x4F */ - X8(I(DstReg, em_inc)), X8(I(DstReg, em_dec)), + X8(I(DstReg | NoRex, em_inc)), X8(I(DstReg | NoRex, em_dec)), /* 0x50 - 0x57 */ X8(I(SrcReg | Stack, em_push)), /* 0x58 - 0x5F */ @@ -4263,7 +4274,7 @@ static const struct opcode opcode_table[256] =3D { I2bvIP(DstDI | SrcDX | Mov | String | Unaligned, em_in, ins, check_perm_i= n), /* insb, insw/insd */ I2bvIP(SrcSI | DstDX | String, em_out, outs, check_perm_out), /* outsb, o= utsw/outsd */ /* 0x70 - 0x7F */ - X16(D(SrcImmByte | NearBranch | IsBranch)), + X16(D(SrcImmByte | NearBranch | IsBranch | NoRex)), /* 0x80 - 0x87 */ G(ByteOp | DstMem | SrcImm, group1), G(DstMem | SrcImm, group1), @@ -4287,15 +4298,15 @@ static const struct opcode opcode_table[256] =3D { II(ImplicitOps | Stack, em_popf, popf), I(ImplicitOps, em_sahf), I(ImplicitOps, em_lahf), /* 0xA0 - 0xA7 */ - I2bv(DstAcc | SrcMem | Mov | MemAbs, em_mov), - I2bv(DstMem | SrcAcc | Mov | MemAbs | PageTable, em_mov), - I2bv(SrcSI | DstDI | Mov | String | TwoMemOp, em_mov), - I2bv(SrcSI | DstDI | String | NoWrite | TwoMemOp, em_cmp_r), + I2bv(DstAcc | SrcMem | Mov | MemAbs | NoRex, em_mov), + I2bv(DstMem | SrcAcc | Mov | MemAbs | PageTable | NoRex, em_mov), + I2bv(SrcSI | DstDI | Mov | String | TwoMemOp | NoRex, em_mov), + I2bv(SrcSI | DstDI | String | NoWrite | TwoMemOp | NoRex, em_cmp_r), /* 0xA8 - 0xAF */ - I2bv(DstAcc | SrcImm | NoWrite, em_test), - I2bv(SrcAcc | DstDI | Mov | String, em_mov), - I2bv(SrcSI | DstAcc | Mov | String, em_mov), - I2bv(SrcAcc | DstDI | String | NoWrite, em_cmp_r), + I2bv(DstAcc | SrcImm | NoWrite | NoRex, em_test), + I2bv(SrcAcc | DstDI | Mov | String | NoRex, em_mov), + I2bv(SrcSI | DstAcc | Mov | String | NoRex, em_mov), + I2bv(SrcAcc | DstDI | String | NoWrite | NoRex, em_cmp_r), /* 0xB0 - 0xB7 */ X8(I(ByteOp | DstReg | SrcImm | Mov, em_mov)), /* 0xB8 - 0xBF */ @@ -4325,17 +4336,17 @@ static const struct opcode opcode_table[256] =3D { /* 0xD8 - 0xDF */ N, E(0, &escape_d9), N, E(0, &escape_db), N, E(0, &escape_dd), N, N, /* 0xE0 - 0xE7 */ - X3(I(SrcImmByte | NearBranch | IsBranch, em_loop)), - I(SrcImmByte | NearBranch | IsBranch, em_jcxz), - I2bvIP(SrcImmUByte | DstAcc, em_in, in, check_perm_in), - I2bvIP(SrcAcc | DstImmUByte, em_out, out, check_perm_out), + X3(I(SrcImmByte | NearBranch | IsBranch | NoRex, em_loop)), + I(SrcImmByte | NearBranch | IsBranch | NoRex, em_jcxz), + I2bvIP(SrcImmUByte | DstAcc | NoRex, em_in, in, check_perm_in), + I2bvIP(SrcAcc | DstImmUByte | NoRex, em_out, out, check_perm_out), /* 0xE8 - 0xEF */ - I(SrcImm | NearBranch | IsBranch | ShadowStack, em_call), - D(SrcImm | ImplicitOps | NearBranch | IsBranch), - I(SrcImmFAddr | No64 | IsBranch, em_jmp_far), - D(SrcImmByte | ImplicitOps | NearBranch | IsBranch), - I2bvIP(SrcDX | DstAcc, em_in, in, check_perm_in), - I2bvIP(SrcAcc | DstDX, em_out, out, check_perm_out), + I(SrcImm | NearBranch | IsBranch | ShadowStack | NoRex, em_call), + D(SrcImm | ImplicitOps | NearBranch | IsBranch | NoRex), + I(SrcImmFAddr | No64 | IsBranch | NoRex, em_jmp_far), + D(SrcImmByte | ImplicitOps | NearBranch | IsBranch | NoRex), + I2bvIP(SrcDX | DstAcc | NoRex, em_in, in, check_perm_in), + I2bvIP(SrcAcc | DstDX | NoRex, em_out, out, check_perm_out), /* 0xF0 - 0xF7 */ N, DI(ImplicitOps, icebp), N, N, DI(ImplicitOps | Priv, hlt), D(ImplicitOps), @@ -4376,12 +4387,12 @@ static const struct opcode twobyte_table[256] =3D { N, GP(ModRM | DstMem | SrcReg | Mov | Sse | Avx, &pfx_0f_2b), N, N, N, N, /* 0x30 - 0x3F */ - II(ImplicitOps | Priv, em_wrmsr, wrmsr), - IIP(ImplicitOps, em_rdtsc, rdtsc, check_rdtsc), - II(ImplicitOps | Priv, em_rdmsr, rdmsr), - IIP(ImplicitOps, em_rdpmc, rdpmc, check_rdpmc), - I(ImplicitOps | EmulateOnUD | IsBranch | ShadowStack, em_sysenter), - I(ImplicitOps | Priv | EmulateOnUD | IsBranch | ShadowStack, em_sysexit), + II(ImplicitOps | Priv | NoRex, em_wrmsr, wrmsr), + IIP(ImplicitOps | NoRex, em_rdtsc, rdtsc, check_rdtsc), + II(ImplicitOps | Priv | NoRex, em_rdmsr, rdmsr), + IIP(ImplicitOps | NoRex, em_rdpmc, rdpmc, check_rdpmc), + I(ImplicitOps | EmulateOnUD | IsBranch | ShadowStack | NoRex, em_sysenter= ), + I(ImplicitOps | Priv | EmulateOnUD | IsBranch | ShadowStack | NoRex, em_s= ysexit), N, N, N, N, N, N, N, N, N, N, /* 0x40 - 0x4F */ @@ -4399,7 +4410,7 @@ static const struct opcode twobyte_table[256] =3D { N, N, N, N, N, N, N, GP(SrcReg | DstMem | ModRM | Mov, &pfx_0f_6f_0f_7f), /* 0x80 - 0x8F */ - X16(D(SrcImm | NearBranch | IsBranch)), + X16(D(SrcImm | NearBranch | IsBranch | NoRex)), /* 0x90 - 0x9F */ X16(D(ByteOp | DstMem | SrcNone | ModRM| Mov)), /* 0xA0 - 0xA7 */ @@ -4992,6 +5003,13 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, v= oid *insn, int insn_len, int opcode =3D opcode_table[ctxt->b]; } =20 + /* + * Instructions marked with NoRex ignore a legacy REX prefix, but + * #UD should be raised when prefixed with REX2. + */ + if (ctxt->d & NoRex && ctxt->rex_prefix =3D=3D REX2_PREFIX) + opcode.flags =3D Undefined; + if (opcode.flags & ModRM) ctxt->modrm =3D insn_fetch(u8, ctxt); =20 diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h index 16b35a796a7f..dd5d1e489db6 100644 --- a/arch/x86/kvm/kvm_emulate.h +++ b/arch/x86/kvm/kvm_emulate.h @@ -325,6 +325,7 @@ typedef void (*fastop_t)(struct fastop *); enum rex_type { REX_NONE, REX_PREFIX, + REX2_PREFIX, }; =20 struct x86_emulate_ctxt { --=20 2.51.0