[PATCH RFC v1 12/20] KVM: x86: Support REX2-extended register index in the decoder

Chang S. Bae posted 20 patches 3 months ago
Posted by Chang S. Bae 3 months ago
Update register index decoding to account for the additional bit fields
introduced by the REX2 prefix.

Both ModR/M and opcode register decoding paths now consider the extended
index bits (R4, X4, B4) in addition to the legacy REX bits (R3, X3, B3).

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
---
 arch/x86/kvm/emulate.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 9c98843094a1..ed3a8c0bca20 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1084,7 +1084,8 @@ static void decode_register_operand(struct x86_emulate_ctxt *ctxt,
 		reg = ctxt->modrm_reg;
 	} else {
 		reg = (ctxt->b & 7) |
-		      (ctxt->rex.bits.b3 * BIT(3));
+		      (ctxt->rex.bits.b3 * BIT(3)) |
+		      (ctxt->rex.bits.b4 * BIT(4));
 	}
 
 	if (ctxt->d & Sse) {
@@ -1124,9 +1125,12 @@ static int decode_modrm(struct x86_emulate_ctxt *ctxt,
 	int rc = X86EMUL_CONTINUE;
 	ulong modrm_ea = 0;
 
-	ctxt->modrm_reg = ctxt->rex.bits.r3 * BIT(3);
-	index_reg       = ctxt->rex.bits.x3 * BIT(3);
-	base_reg        = ctxt->rex.bits.b3 * BIT(3);
+	ctxt->modrm_reg	= (ctxt->rex.bits.r3 * BIT(3)) |
+			  (ctxt->rex.bits.r4 * BIT(4));
+	index_reg	= (ctxt->rex.bits.x3 * BIT(3)) |
+			  (ctxt->rex.bits.x4 * BIT(4));
+	base_reg	= (ctxt->rex.bits.b3 * BIT(3)) |
+			  (ctxt->rex.bits.b4 * BIT(4));
 
 	ctxt->modrm_mod = (ctxt->modrm & 0xc0) >> 6;
 	ctxt->modrm_reg |= (ctxt->modrm & 0x38) >> 3;
-- 
2.51.0
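
To illustrate the index arithmetic in the hunks above, here is a minimal
userspace sketch of how a 3-bit field is widened to a 5-bit register index.
The struct and function names (rex_index_bits, reg_index, modrm_reg_index)
are hypothetical stand-ins, not the kernel's definitions; only the bit
positions are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the decoded prefix bits; not the kernel struct. */
struct rex_index_bits {
	unsigned int r3, x3, b3;	/* legacy REX extension bits */
	unsigned int r4, x4, b4;	/* additional REX2 extension bits */
};

/* Widen a 3-bit field with its bit-3 and bit-4 extensions. */
static unsigned int reg_index(unsigned int low3, unsigned int bit3,
			      unsigned int bit4)
{
	assert(low3 < 8 && bit3 < 2 && bit4 < 2);
	return low3 | (bit3 << 3) | (bit4 << 4);
}

/* ModR/M.reg lives in bits 5:3 of the ModR/M byte; R3/R4 extend it. */
static unsigned int modrm_reg_index(uint8_t modrm,
				    const struct rex_index_bits *rex)
{
	return reg_index((modrm & 0x38) >> 3, rex->r3, rex->r4);
}
```

With both R3 and R4 set, a ModR/M reg field of 1 selects register 25, which
is outside the 16 registers reachable with a legacy REX prefix.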
Re: [PATCH RFC v1 12/20] KVM: x86: Support REX2-extended register index in the decoder
Posted by Paolo Bonzini 2 months, 4 weeks ago
On 11/10/25 19:01, Chang S. Bae wrote:
> Update register index decoding to account for the additional bit fields
> introduced by the REX2 prefix.
> 
> Both ModR/M and opcode register decoding paths now consider the extended
> index bits (R4, X4, B4) in addition to the legacy REX bits (R3, X3, B3).
> 
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>

Replying here for both patches 10 and 12, because I think you can merge 
them into one. I'd prefer to avoid bitfields.

You only need a single enum:

enum {
	REX_B = 1,
	REX_X = 2,
	REX_R = 4,
	REX_W = 8,
	REX_M = 0x80,
};

for REX_W/REX_M you access them directly, while for RXB you go through a 
function:

static inline u8 rex_get_rxb(u8 rex, u8 fld)
{
	BUILD_BUG_ON(!__builtin_constant_p(fld));
	BUILD_BUG_ON(fld != REX_B && fld != REX_X && fld != REX_R);

	rex >>= ffs(fld) - 1;	// bits 0+4
	return (rex & 1) * 8 + (rex & 0x10);	// index bits 3+4
}

>   	} else {
>   		reg = (ctxt->b & 7) |
> -		      (ctxt->rex.bits.b3 * BIT(3));
> +		      (ctxt->rex.bits.b3 * BIT(3)) |
> +		      (ctxt->rex.bits.b4 * BIT(4));

+		      rex_get_rxb(ctxt->rex, REX_B);

and likewise everywhere else.

>   	}
>   
>   	if (ctxt->d & Sse) {
> @@ -1124,9 +1125,12 @@ static int decode_modrm(struct x86_emulate_ctxt *ctxt,
>   	int rc = X86EMUL_CONTINUE;
>   	ulong modrm_ea = 0;
>   
> -	ctxt->modrm_reg = ctxt->rex.bits.r3 * BIT(3);
> -	index_reg       = ctxt->rex.bits.x3 * BIT(3);
> -	base_reg        = ctxt->rex.bits.b3 * BIT(3);
> +	ctxt->modrm_reg	= (ctxt->rex.bits.r3 * BIT(3)) |
> +			  (ctxt->rex.bits.r4 * BIT(4));
> +	index_reg	= (ctxt->rex.bits.x3 * BIT(3)) |
> +			  (ctxt->rex.bits.x4 * BIT(4));
> +	base_reg	= (ctxt->rex.bits.b3 * BIT(3)) |
> +			  (ctxt->rex.bits.b4 * BIT(4));
>   
>   	ctxt->modrm_mod = (ctxt->modrm & 0xc0) >> 6;
>   	ctxt->modrm_reg |= (ctxt->modrm & 0x38) >> 3;
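
A userspace sketch of the single-byte approach suggested in this reply may
help when checking the bit positions. It assumes the prefix byte is laid out
as M0 R4 X4 B4 W R3 X3 B3 from bit 7 down to bit 0, which is what the enum
values imply. The function name rex_index_bits is hypothetical, a switch
replaces ffs(), and a runtime assert stands in for BUILD_BUG_ON. The return
value is already positioned at bits 3 and 4 so that it can be OR-ed with the
low three bits; that placement is an assumption about the intended use.

```c
#include <assert.h>
#include <stdint.h>

enum {
	REX_B = 1,
	REX_X = 2,
	REX_R = 4,
	REX_W = 8,
	REX_M = 0x80,
};

/* Return the bit-3 and bit-4 contributions of one of B, X or R. */
static unsigned int rex_index_bits(uint8_t rex, uint8_t fld)
{
	unsigned int shift;

	switch (fld) {
	case REX_B: shift = 0; break;
	case REX_X: shift = 1; break;
	case REX_R: shift = 2; break;
	default:
		assert(!"fld must be REX_B, REX_X or REX_R");
		return 0;
	}

	/* After the shift, bit 0 holds the "3" bit and bit 4 the "4" bit. */
	rex >>= shift;
	return ((rex & 1u) << 3) | (rex & 0x10u);
}
```

For example, a prefix byte of 0x11 (B3 and B4 set) yields 24 for REX_B and 0
for REX_X and REX_R, while REX_W and REX_M never leak into the result.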
Re: [PATCH RFC v1 12/20] KVM: x86: Support REX2-extended register index in the decoder
Posted by Chang S. Bae 2 months, 3 weeks ago
On 11/11/2025 8:53 AM, Paolo Bonzini wrote:
> 
> Replying here for both patches 10 and 12, because I think you can merge 
> them in one, 

Done

> I'd prefer to avoid bitfields.
> 
> You only need a single enum:
> 
> enum {
>      REX_B = 1,
>      REX_X = 2,
>      REX_R = 4,
>      REX_W = 8,
>      REX_M = 0x80,
> };
> 
> for REX_W/REX_M you access them directly, while for RXB you go through a 
> function:
> 
> static inline u8 rex_get_rxb(u8 rex, u8 fld)
> {
>      BUILD_BUG_ON(!__builtin_constant_p(fld));
>      BUILD_BUG_ON(fld != REX_B && fld != REX_X && fld != REX_R);
> 
>      rex >>= ffs(fld) - 1;    // bits 0+4
>      return (rex & 1) * 8 + (rex & 0x10);    // index bits 3+4
> }
> 
>>       } else {
>>           reg = (ctxt->b & 7) |
>> -              (ctxt->rex.bits.b3 * BIT(3));
>> +              (ctxt->rex.bits.b3 * BIT(3)) |
>> +              (ctxt->rex.bits.b4 * BIT(4));
> 
> +              rex_get_rxb(ctxt->rex, REX_B);
> 
> and likewise everywhere else.

Neat! This looks much better. Thanks for the suggestion.