[PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand

Posted by Pawan Gupta 2 months ago
Robert Gill reported the following #GP in 32-bit mode when the dosemu
software was executing the vm86() system call:

  general protection fault: 0000 [#1] PREEMPT SMP
  CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
  Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
  EIP: restore_all_switch_stack+0xbe/0xcf
  EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
  ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
  DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
  CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
  Call Trace:
   show_regs+0x70/0x78
   die_addr+0x29/0x70
   exc_general_protection+0x13c/0x348
   exc_bounds+0x98/0x98
   handle_exception+0x14d/0x14d
   exc_bounds+0x98/0x98
   restore_all_switch_stack+0xbe/0xcf
   exc_bounds+0x98/0x98
   restore_all_switch_stack+0xbe/0xcf

This only happens in 32-bit mode when VERW-based mitigations like MDS/RFDS
are enabled. This is because a segment register holding an arbitrary user
value can result in #GP when executing VERW. Intel SDM vol. 2C documents
the following behavior for the VERW instruction:

  #GP(0) - If a memory operand effective address is outside the CS, DS, ES,
	   FS, or GS segment limit.

The CLEAR_CPU_BUFFERS macro executes the VERW instruction before returning
to user space. Use the %cs selector to reference the VERW operand; this
ensures VERW will not #GP for an arbitrary user %ds.

Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
Cc: stable@vger.kernel.org # 5.10+
Reported-by: Robert Gill <rtgill82@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.info/
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 arch/x86/include/asm/nospec-branch.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index ff5f1ecc7d1e..e18a6aaf414c 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,12 +318,14 @@
 /*
  * Macro to execute VERW instruction that mitigate transient data sampling
  * attacks such as MDS. On affected systems a microcode update overloaded VERW
- * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF.
+ * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF. Using %cs
+ * to reference VERW operand avoids a #GP fault for an arbitrary user %ds in
+ * 32-bit mode.
  *
  * Note: Only the memory operand variant of VERW clears the CPU buffers.
  */
 .macro CLEAR_CPU_BUFFERS
-	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+	ALTERNATIVE "", __stringify(verw %cs:_ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
 .endm
 
 #ifdef CONFIG_X86_64

-- 
2.34.1
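
For reference, the %cs-prefixed form differs from the plain form only by a
CS segment-override prefix byte (0x2e) in front of VERW (opcode 0f 00 /5);
it avoids the fault because %cs still holds the kernel's flat code segment
when CLEAR_CPU_BUFFERS runs, while %ds may hold an arbitrary user or vm86
value. A minimal sketch, not taken from the kernel tree and using a
hypothetical local_verw_sel as a stand-in for mds_verw_sel, that should
assemble with a 32-bit GNU toolchain:

	.section .rodata
local_verw_sel:			/* stand-in for the kernel's mds_verw_sel */
	.word	0x0000		/* VERW reads a selector value from here */

	.text
	verw	local_verw_sel		/* implicit %ds-relative operand */
	verw	%cs:local_verw_sel	/* same operand, explicit %cs override */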
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Andrew Cooper 2 months ago
On 25/09/2024 11:25 pm, Pawan Gupta wrote:
> [...]
> @@ -318,12 +318,14 @@
>  /*
>   * Macro to execute VERW instruction that mitigate transient data sampling
>   * attacks such as MDS. On affected systems a microcode update overloaded VERW
> - * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF.
> + * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF. Using %cs
> + * to reference VERW operand avoids a #GP fault for an arbitrary user %ds in
> + * 32-bit mode.
>   *
>   * Note: Only the memory operand variant of VERW clears the CPU buffers.
>   */
>  .macro CLEAR_CPU_BUFFERS
> -	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> +	ALTERNATIVE "", __stringify(verw %cs:_ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
>  .endm

People ought rightly to double-take at this using %cs and not %ss. 
There is a good reason, but it needs describing explicitly.  May I
suggest the following:

*...
* In 32bit mode, the memory operand must be a %cs reference.  The data
segments may not be usable (vm86 mode), and the stack segment may not be
flat (espfix32).
*...

 .macro CLEAR_CPU_BUFFERS
#ifdef __x86_64__
    ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
#else
    ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
#endif
 .endm

This also lets you drop _ASM_RIP().  It's a cute idea, but is more
confusion than it's worth, because there's no such thing in 32bit mode.

"%cs:_ASM_RIP(mds_verw_sel)" reads as if it does nothing, because it
really doesn't in 64bit mode.

~Andrew
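
For context, the point about _ASM_RIP() depends on how it expands. A rough
sketch of the assumed shape (the authoritative definition lives in
arch/x86/include/asm/asm.h):

	/* Assumed expansion, for illustration only: */
	#ifdef CONFIG_X86_64
	# define _ASM_RIP(x)	x(%rip)		/* RIP-relative reference */
	#else
	# define _ASM_RIP(x)	x		/* plain absolute reference; 32-bit has no RIP */
	#endif

Under that assumption, "verw %cs:_ASM_RIP(mds_verw_sel)" reads as
"verw %cs:mds_verw_sel(%rip)" on 64-bit, where the %cs override changes
nothing, and as "verw %cs:mds_verw_sel" on 32-bit, where the override is
the part that matters.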
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Pawan Gupta 2 months ago
On Thu, Sep 26, 2024 at 12:29:00AM +0100, Andrew Cooper wrote:
> On 25/09/2024 11:25 pm, Pawan Gupta wrote:
> > [...]
> 
> People ought rightly to double-take at this using %cs and not %ss. 
> There is a good reason, but it needs describing explicitly.  May I
> suggest the following:
> 
> *...
> * In 32bit mode, the memory operand must be a %cs reference.  The data
> segments may not be usable (vm86 mode), and the stack segment may not be
> flat (espfix32).
> *...

Thanks for the suggestion. I will include this.

>  .macro CLEAR_CPU_BUFFERS
> #ifdef __x86_64__
>     ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
> #else
>     ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
> #endif
>  .endm
> 
> This also lets you drop _ASM_RIP().  It's a cute idea, but is more
> confusion than it's worth, because there's no such thing in 32bit mode.
> 
> "%cs:_ASM_RIP(mds_verw_sel)" reads as if it does nothing, because it
> really doesn't in 64bit mode.

Right, will drop _ASM_RIP() in 32-bit mode and %cs in 64-bit mode.
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Pawan Gupta 2 months ago
On Wed, Sep 25, 2024 at 04:46:23PM -0700, Pawan Gupta wrote:
> [...]
> Right, will drop _ASM_RIP() in 32-bit mode and %cs in 64-bit mode.

It's probably too soon for the next version, so pasting the patch here:

---8<---
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index e18a6aaf414c..4228a1fd2c2e 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,14 +318,21 @@
 /*
  * Macro to execute VERW instruction that mitigate transient data sampling
  * attacks such as MDS. On affected systems a microcode update overloaded VERW
- * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF. Using %cs
- * to reference VERW operand avoids a #GP fault for an arbitrary user %ds in
- * 32-bit mode.
+ * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF.
  *
  * Note: Only the memory operand variant of VERW clears the CPU buffers.
  */
 .macro CLEAR_CPU_BUFFERS
-	ALTERNATIVE "", __stringify(verw %cs:_ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+#ifdef CONFIG_X86_64
+	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+#else
+	/*
+	 * In 32bit mode, the memory operand must be a %cs reference. The data
+	 * segments may not be usable (vm86 mode), and the stack segment may not
+	 * be flat (ESPFIX32).
+	 */
+	ALTERNATIVE "", __stringify(verw %cs:mds_verw_sel), X86_FEATURE_CLEAR_CPU_BUF
+#endif
 .endm
 
 #ifdef CONFIG_X86_64
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Uros Bizjak 2 months ago

On 26. 09. 24 02:17, Pawan Gupta wrote:
> [...]
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index e18a6aaf414c..4228a1fd2c2e 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -318,14 +318,21 @@
>   /*
>    * Macro to execute VERW instruction that mitigate transient data sampling
>    * attacks such as MDS. On affected systems a microcode update overloaded VERW
> - * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF. Using %cs
> - * to reference VERW operand avoids a #GP fault for an arbitrary user %ds in
> - * 32-bit mode.
> + * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF.
>    *
>    * Note: Only the memory operand variant of VERW clears the CPU buffers.
>    */
>   .macro CLEAR_CPU_BUFFERS
> -	ALTERNATIVE "", __stringify(verw %cs:_ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> +#ifdef CONFIG_X86_64
> +	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF

You should drop _ASM_RIP here and directly use (%rip). This way, you 
also won't need __stringify:

ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF

> +#else
> +	/*
> +	 * In 32bit mode, the memory operand must be a %cs reference. The data
> +	 * segments may not be usable (vm86 mode), and the stack segment may not
> +	 * be flat (ESPFIX32).
> +	 */
> +	ALTERNATIVE "", __stringify(verw %cs:mds_verw_sel), X86_FEATURE_CLEAR_CPU_BUF

Also here, no need for __stringify:

ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF

This is in fact what Andrew proposed in his review.

Uros.
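
For context, __stringify() was only needed because ALTERNATIVE takes its
replacement instruction as a string and _ASM_RIP() is a preprocessor macro
that must be expanded before being stringified; once the operand is spelled
out directly, a plain string literal is equivalent. A sketch under those
assumptions, using the kernel-style helpers from include/linux/stringify.h:

	#define __stringify_1(x...)	#x
	#define __stringify(x...)	__stringify_1(x)
	#define _ASM_RIP(x)		x(%rip)		/* assumed 64-bit expansion */

	/* Both spellings yield the same replacement string for ALTERNATIVE: */
	__stringify(verw _ASM_RIP(mds_verw_sel))	/* -> "verw mds_verw_sel(%rip)" */
	/*        "verw mds_verw_sel(%rip)"	   <- plain literal, as suggested above */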
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Pawan Gupta 2 months ago
On Thu, Sep 26, 2024 at 04:52:53PM +0200, Uros Bizjak wrote:
> > [...]
> >   .macro CLEAR_CPU_BUFFERS
> > -	ALTERNATIVE "", __stringify(verw %cs:_ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> > +#ifdef CONFIG_X86_64
> > +	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> 
> You should drop _ASM_RIP here and directly use (%rip). This way, you also
> won't need __stringify:
> 
> ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
> 
> > +#else
> > +	/*
> > +	 * In 32bit mode, the memory operand must be a %cs reference. The data
> > +	 * segments may not be usable (vm86 mode), and the stack segment may not
> > +	 * be flat (ESPFIX32).
> > +	 */
> > +	ALTERNATIVE "", __stringify(verw %cs:mds_verw_sel), X86_FEATURE_CLEAR_CPU_BUF
> 
> Also here, no need for __stringify:
> 
> ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
> 
> This is in fact what Andrew proposed in his review.

Thanks for pointing that out, I completely missed that part. Below is how
it looks with __stringify gone:

--- >8 ---
Subject: [PATCH] x86/bugs: Use code segment selector for VERW operand

Robert Gill reported the following #GP in 32-bit mode when the dosemu
software was executing the vm86() system call:

  general protection fault: 0000 [#1] PREEMPT SMP
  CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
  Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
  EIP: restore_all_switch_stack+0xbe/0xcf
  EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
  ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
  DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
  CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
  Call Trace:
   show_regs+0x70/0x78
   die_addr+0x29/0x70
   exc_general_protection+0x13c/0x348
   exc_bounds+0x98/0x98
   handle_exception+0x14d/0x14d
   exc_bounds+0x98/0x98
   restore_all_switch_stack+0xbe/0xcf
   exc_bounds+0x98/0x98
   restore_all_switch_stack+0xbe/0xcf

This only happens in 32-bit mode when VERW-based mitigations like MDS/RFDS
are enabled. This is because a segment register holding an arbitrary user
value can result in #GP when executing VERW. Intel SDM vol. 2C documents
the following behavior for the VERW instruction:

  #GP(0) - If a memory operand effective address is outside the CS, DS, ES,
	   FS, or GS segment limit.

The CLEAR_CPU_BUFFERS macro executes the VERW instruction before returning
to user space. Use the %cs selector to reference the VERW operand; this
ensures VERW will not #GP for an arbitrary user %ds.

Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
Cc: stable@vger.kernel.org # 5.10+
Reported-by: Robert Gill <rtgill82@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.info/
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 arch/x86/include/asm/nospec-branch.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index ff5f1ecc7d1e..96b410b1d4e8 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -323,7 +323,16 @@
  * Note: Only the memory operand variant of VERW clears the CPU buffers.
  */
 .macro CLEAR_CPU_BUFFERS
-	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
+#ifdef CONFIG_X86_64
+	ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
+#else
+	/*
+	 * In 32bit mode, the memory operand must be a %cs reference. The data
+	 * segments may not be usable (vm86 mode), and the stack segment may not
+	 * be flat (ESPFIX32).
+	 */
+	ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
+#endif
 .endm
 
 #ifdef CONFIG_X86_64
-- 
2.34.1
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Andrew Cooper 2 months ago
On 26/09/2024 5:10 pm, Pawan Gupta wrote:
> [...]
>  .macro CLEAR_CPU_BUFFERS
> -	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> +#ifdef CONFIG_X86_64
> +	ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
> +#else
> +	/*
> +	 * In 32bit mode, the memory operand must be a %cs reference. The data
> +	 * segments may not be usable (vm86 mode), and the stack segment may not
> +	 * be flat (ESPFIX32).
> +	 */
> +	ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
> +#endif

You should also delete _ASM_RIP() as you're removing the only user of it.

But yes, with that, Reviewed-by: Andrew Cooper
<andrew.cooper3@citrix.com> FWIW.
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Pawan Gupta 2 months ago
On Thu, Sep 26, 2024 at 05:28:05PM +0100, Andrew Cooper wrote:
> [...]
> You should also delete _ASM_RIP() as you're removing the only user of it.

Can we? I see that __svm_vcpu_run() and __vmx_vcpu_run() are using _ASM_RIP().

> But yes, with that, Reviewed-by: Andrew Cooper
> <andrew.cooper3@citrix.com> FWIW.

Thanks.
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Andrew Cooper 2 months ago
On 26/09/2024 5:56 pm, Pawan Gupta wrote:
> [...]
>> You should also delete _ASM_RIP() as you're removing the only user of it.
> Can we? I see that __svm_vcpu_run() and __vmx_vcpu_run() are using _ASM_RIP().

Oh - so it is when I'm on the right branch.  Sorry for the noise.

~Andrew
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Andrew Cooper 2 months ago
On 26/09/2024 1:17 am, Pawan Gupta wrote:
> [...]
>  .macro CLEAR_CPU_BUFFERS
> -	ALTERNATIVE "", __stringify(verw %cs:_ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> +#ifdef CONFIG_X86_64
> +	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> +#else
> +	/*
> +	 * In 32bit mode, the memory operand must be a %cs reference. The data
> +	 * segments may not be usable (vm86 mode), and the stack segment may not
> +	 * be flat (ESPFIX32).
> +	 */

I was intending for this to replace the "Using %cs" sentence, as a new
paragraph in that main comment block.

Otherwise, yes, this is half of what I had in mind.

~Andrew
Re: [PATCH v7 3/3] x86/bugs: Use code segment selector for VERW operand
Posted by Pawan Gupta 2 months ago
On Thu, Sep 26, 2024 at 01:32:19AM +0100, Andrew Cooper wrote:
> > [...]
> >  .macro CLEAR_CPU_BUFFERS
> > -	ALTERNATIVE "", __stringify(verw %cs:_ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> > +#ifdef CONFIG_X86_64
> > +	ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
> > +#else
> > +	/*
> > +	 * In 32bit mode, the memory operand must be a %cs reference. The data
> > +	 * segments may not be usable (vm86 mode), and the stack segment may not
> > +	 * be flat (ESPFIX32).
> > +	 */
> 
> I was intending for this to replace the "Using %cs" sentence, as a new
> paragraph in that main comment block.

The reason I added the comment to the 32-bit leg is that most readers will
not care about 32-bit mode, so in the main block the comment would mostly
be a distraction. People who do care about 32-bit mode will read the
comment in the 32-bit leg. I can move the comment to the main block if you
still want.