[PATCH RFC v1 11/20] KVM: x86: Refactor opcode table lookup in instruction emulation

Chang S. Bae posted 20 patches 3 months ago
Posted by Chang S. Bae 3 months ago
Refactor opcode lookup to clearly separate handling of different byte
sequences and prefix types, in preparation for REX2 support.

The decoder begins with a one-byte opcode table by default and falls
through to other tables on escape bytes, but the logic is intertwined and
hard to extend.

REX2 introduces a dedicated bit in its payload byte to indicate which
opcode table to use. To accommodate this mapping bit, the existing lookup
path needs to be restructured.

No functional changes intended.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
---
 arch/x86/kvm/emulate.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 763fbd139242..9c98843094a1 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4773,7 +4773,6 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len, int
 	ctxt->_eip = ctxt->eip;
 	ctxt->fetch.ptr = ctxt->fetch.data;
 	ctxt->fetch.end = ctxt->fetch.data + insn_len;
-	ctxt->opcode_len = 1;
 	ctxt->intercept = x86_intercept_none;
 	if (insn_len > 0)
 		memcpy(ctxt->fetch.data, insn, insn_len);
@@ -4877,20 +4876,24 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len, int
 	if (ctxt->rex.bits.w)
 		ctxt->op_bytes = 8;
 
-	/* Opcode byte(s). */
-	opcode = opcode_table[ctxt->b];
-	/* Two-byte opcode? */
+	/* Determine opcode byte(s): */
 	if (ctxt->b == 0x0f) {
-		ctxt->opcode_len = 2;
+		/* Escape byte: start two-byte opcode sequence */
 		ctxt->b = insn_fetch(u8, ctxt);
-		opcode = twobyte_table[ctxt->b];
-
-		/* 0F_38 opcode map */
 		if (ctxt->b == 0x38) {
+			/* Three-byte opcode */
 			ctxt->opcode_len = 3;
 			ctxt->b = insn_fetch(u8, ctxt);
 			opcode = opcode_map_0f_38[ctxt->b];
+		} else {
+			/* Two-byte opcode */
+			ctxt->opcode_len = 2;
+			opcode = twobyte_table[ctxt->b];
 		}
+	} else {
+		/* Single-byte opcode */
+		ctxt->opcode_len = 1;
+		opcode = opcode_table[ctxt->b];
 	}
 	ctxt->d = opcode.flags;
 
-- 
2.51.0
Re: [PATCH RFC v1 11/20] KVM: x86: Refactor opcode table lookup in instruction emulation
Posted by Paolo Bonzini 2 months, 4 weeks ago
On 11/10/25 19:01, Chang S. Bae wrote:
> Refactor opcode lookup to clearly separate handling of different byte
> sequences and prefix types, in preparation for REX2 support.
> 
> The decoder begins with a one-byte opcode table by default and falls
> through to other tables on escape bytes, but the logic is intertwined and
> hard to extend.
> 
> REX2 introduces a dedicated bit in its payload byte to indicate which
> opcode table to use. To accommodate this mapping bit, the existing lookup
> path needs to be restructured.
> 
> No functional changes intended.
> 
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> ---
>   arch/x86/kvm/emulate.c | 19 +++++++++++--------
>   1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 763fbd139242..9c98843094a1 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -4773,7 +4773,6 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len, int
>   	ctxt->_eip = ctxt->eip;
>   	ctxt->fetch.ptr = ctxt->fetch.data;
>   	ctxt->fetch.end = ctxt->fetch.data + insn_len;
> -	ctxt->opcode_len = 1;
>   	ctxt->intercept = x86_intercept_none;
>   	if (insn_len > 0)
>   		memcpy(ctxt->fetch.data, insn, insn_len);
> @@ -4877,20 +4876,24 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len, int
>   	if (ctxt->rex.bits.w)
>   		ctxt->op_bytes = 8;
>   
> -	/* Opcode byte(s). */
> -	opcode = opcode_table[ctxt->b];
> -	/* Two-byte opcode? */
> +	/* Determine opcode byte(s): */
>   	if (ctxt->b == 0x0f) {
> -		ctxt->opcode_len = 2;
> +		/* Escape byte: start two-byte opcode sequence */
>   		ctxt->b = insn_fetch(u8, ctxt);
> -		opcode = twobyte_table[ctxt->b];
> -
> -		/* 0F_38 opcode map */
>   		if (ctxt->b == 0x38) {
> +			/* Three-byte opcode */
>   			ctxt->opcode_len = 3;
>   			ctxt->b = insn_fetch(u8, ctxt);
>   			opcode = opcode_map_0f_38[ctxt->b];
> +		} else {
> +			/* Two-byte opcode */
> +			ctxt->opcode_len = 2;
> +			opcode = twobyte_table[ctxt->b];
>   		}
> +	} else {
> +		/* Single-byte opcode */
> +		ctxt->opcode_len = 1;
> +		opcode = opcode_table[ctxt->b];
>   	}
>   	ctxt->d = opcode.flags;
>   

This will also conflict with the VEX patches. Overall, I think patches
10-12 can all be merged into one.

Paolo
Re: [PATCH RFC v1 11/20] KVM: x86: Refactor opcode table lookup in instruction emulation
Posted by Chang S. Bae 2 months, 3 weeks ago
On 11/11/2025 8:55 AM, Paolo Bonzini wrote:
> 
> This will also conflict with the VEX patches. Overall, I think patches
> 10-12 can all be merged into one.

I initially split these into micro-patches to make the first round of
review easier. But yes, now that the series is shaping up, folding those
preparatory pieces together makes perfect sense to me.
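
As a reading aid, the lookup structure the patch arrives at can be sketched
as standalone C. This is only a minimal sketch under stated assumptions, not
the KVM code: `enum opcode_map`, `struct lookup`, and `select_table()` are
hypothetical names invented here, the real decoder fetches bytes with
`insn_fetch()` rather than from a bounds-checked buffer, and it stores the
selected table entry in `opcode` instead of returning a map identifier.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; none of these are KVM identifiers. */
enum opcode_map { MAP_ONE_BYTE, MAP_0F, MAP_0F_38, MAP_INVALID };

struct lookup {
	enum opcode_map map;	/* which table would be indexed */
	int opcode_len;		/* 1, 2, or 3 opcode bytes */
	uint8_t index;		/* final opcode byte, used as the table index */
};

/*
 * Mirror the if/else shape of the patch: each branch sets both the
 * opcode length and the table, so nothing is assigned up front and
 * then overridden.
 */
static struct lookup select_table(const uint8_t *p, size_t len)
{
	struct lookup r = { MAP_INVALID, 0, 0 };

	if (len < 1)
		return r;

	if (p[0] == 0x0f) {
		/* Escape byte: at least one more opcode byte follows. */
		if (len < 2)
			return r;
		if (p[1] == 0x38) {
			/* Three-byte opcode (0F 38 map). */
			if (len < 3)
				return r;
			r = (struct lookup){ MAP_0F_38, 3, p[2] };
		} else {
			/* Two-byte opcode (0F map). */
			r = (struct lookup){ MAP_0F, 2, p[1] };
		}
	} else {
		/* Single-byte opcode. */
		r = (struct lookup){ MAP_ONE_BYTE, 1, p[0] };
	}
	return r;
}
```

Calling `select_table()` on the bytes `0f 38 f0` selects the 0F 38 map with
an opcode length of 3 and index 0xf0. Because every branch is self-contained,
a map selector supplied by a prefix could presumably pick a branch directly
without first seeing the 0x0f escape byte. That part is an assumption about
the later patches and is not shown here.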