[PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

Xin Li posted 5 patches 3 years, 1 month ago
[PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by Xin Li 3 years, 1 month ago
From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

The LKGS instruction atomically loads a segment descriptor into the
%gs descriptor registers, *except* that %gs.base is unchanged, and the
base is instead loaded into MSR_IA32_KERNEL_GS_BASE, which is exactly
what we want this function to do.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v3:
* We want less ASM not more, thus keep local_irq_save/restore() inside
  native_load_gs_index() (Thomas Gleixner).
* For paravirt enabled kernels, initialize pv_ops.cpu.load_gs_index to
  native_lkgs (Thomas Gleixner).

Changes since V2:
* Mark DI as input and output (+D) as in V1, since the exception handler
  modifies it (Brian Gerst).

Changes since V1:
* Use EX_TYPE_ZERO_REG instead of fixup code in the obsolete .fixup code
  section (Peter Zijlstra).
* Add a comment that states the LKGS_DI macro will be repalced with "lkgs %di"
  once the binutils support the LKGS instruction (Peter Zijlstra).
---
 arch/x86/include/asm/gsseg.h | 33 +++++++++++++++++++++++++++++----
 arch/x86/kernel/cpu/common.c |  1 +
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/gsseg.h b/arch/x86/include/asm/gsseg.h
index d15577c39e8d..ab6a595cea70 100644
--- a/arch/x86/include/asm/gsseg.h
+++ b/arch/x86/include/asm/gsseg.h
@@ -14,17 +14,42 @@
 
 extern asmlinkage void asm_load_gs_index(u16 selector);
 
+/* Replace with "lkgs %di" once binutils support LKGS instruction */
+#define LKGS_DI _ASM_BYTES(0xf2,0x0f,0x00,0xf7)
+
+static inline void native_lkgs(unsigned int selector)
+{
+	u16 sel = selector;
+	asm_inline volatile("1: " LKGS_DI
+			    _ASM_EXTABLE_TYPE_REG(1b, 1b, EX_TYPE_ZERO_REG, %k[sel])
+			    : [sel] "+D" (sel));
+}
+
 static inline void native_load_gs_index(unsigned int selector)
 {
-	unsigned long flags;
+	if (cpu_feature_enabled(X86_FEATURE_LKGS)) {
+		native_lkgs(selector);
+	} else {
+		unsigned long flags;
 
-	local_irq_save(flags);
-	asm_load_gs_index(selector);
-	local_irq_restore(flags);
+		local_irq_save(flags);
+		asm_load_gs_index(selector);
+		local_irq_restore(flags);
+	}
 }
 
 #endif /* CONFIG_X86_64 */
 
+static inline void __init lkgs_init(void)
+{
+#ifdef CONFIG_PARAVIRT_XXL
+#ifdef CONFIG_X86_64
+	if (cpu_feature_enabled(X86_FEATURE_LKGS))
+		pv_ops.cpu.load_gs_index = native_lkgs;
+#endif
+#endif
+}
+
 #ifndef CONFIG_PARAVIRT_XXL
 
 static inline void load_gs_index(unsigned int selector)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3e508f239098..d6eb4f60b47d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1939,6 +1939,7 @@ void __init identify_boot_cpu(void)
 	setup_cr_pinning();
 
 	tsx_init();
+	lkgs_init();
 }
 
 void identify_secondary_cpu(struct cpuinfo_x86 *c)
-- 
2.34.1
Re: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by Juergen Gross 3 years, 1 month ago
On 19.10.22 11:50, Xin Li wrote:
> From: "H. Peter Anvin (Intel)" <hpa@zytor.com>
> 
> The LKGS instruction atomically loads a segment descriptor into the
> %gs descriptor registers, *except* that %gs.base is unchanged, and the
> base is instead loaded into MSR_IA32_KERNEL_GS_BASE, which is exactly
> what we want this function to do.
> 
> Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Brian Gerst <brgerst@gmail.com>
> Signed-off-by: Xin Li <xin3.li@intel.com>
> ---
> 
> Changes since v3:
> * We want less ASM not more, thus keep local_irq_save/restore() inside
>    native_load_gs_index() (Thomas Gleixner).
> * For paravirt enabled kernels, initialize pv_ops.cpu.load_gs_index to
>    native_lkgs (Thomas Gleixner).
> 
> Changes since V2:
> * Mark DI as input and output (+D) as in V1, since the exception handler
>    modifies it (Brian Gerst).
> 
> Changes since V1:
> * Use EX_TYPE_ZERO_REG instead of fixup code in the obsolete .fixup code
>    section (Peter Zijlstra).
> * Add a comment that states the LKGS_DI macro will be repalced with "lkgs %di"
>    once the binutils support the LKGS instruction (Peter Zijlstra).
> ---
>   arch/x86/include/asm/gsseg.h | 33 +++++++++++++++++++++++++++++----
>   arch/x86/kernel/cpu/common.c |  1 +
>   2 files changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/include/asm/gsseg.h b/arch/x86/include/asm/gsseg.h
> index d15577c39e8d..ab6a595cea70 100644
> --- a/arch/x86/include/asm/gsseg.h
> +++ b/arch/x86/include/asm/gsseg.h
> @@ -14,17 +14,42 @@
>   
>   extern asmlinkage void asm_load_gs_index(u16 selector);
>   
> +/* Replace with "lkgs %di" once binutils support LKGS instruction */
> +#define LKGS_DI _ASM_BYTES(0xf2,0x0f,0x00,0xf7)
> +
> +static inline void native_lkgs(unsigned int selector)
> +{
> +	u16 sel = selector;
> +	asm_inline volatile("1: " LKGS_DI
> +			    _ASM_EXTABLE_TYPE_REG(1b, 1b, EX_TYPE_ZERO_REG, %k[sel])
> +			    : [sel] "+D" (sel));
> +}
> +
>   static inline void native_load_gs_index(unsigned int selector)
>   {
> -	unsigned long flags;
> +	if (cpu_feature_enabled(X86_FEATURE_LKGS)) {
> +		native_lkgs(selector);
> +	} else {
> +		unsigned long flags;
>   
> -	local_irq_save(flags);
> -	asm_load_gs_index(selector);
> -	local_irq_restore(flags);
> +		local_irq_save(flags);
> +		asm_load_gs_index(selector);
> +		local_irq_restore(flags);
> +	}
>   }
>   
>   #endif /* CONFIG_X86_64 */
>   
> +static inline void __init lkgs_init(void)
> +{
> +#ifdef CONFIG_PARAVIRT_XXL
> +#ifdef CONFIG_X86_64
> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
> +		pv_ops.cpu.load_gs_index = native_lkgs;

For this to work correctly when running as a Xen PV guest, you need to add

	setup_clear_cpu_cap(X86_FEATURE_LKGS);

to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise
the Xen specific .load_gs_index vector will be overwritten.


Juergen
RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by Li, Xin3 3 years, 1 month ago
> > +static inline void __init lkgs_init(void) { #ifdef
> > +CONFIG_PARAVIRT_XXL #ifdef CONFIG_X86_64
> > +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
> > +		pv_ops.cpu.load_gs_index = native_lkgs;
> 
> For this to work correctly when running as a Xen PV guest, you need to add
> 
> 	setup_clear_cpu_cap(X86_FEATURE_LKGS);
> 
> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise the Xen
> specific .load_gs_index vector will be overwritten.

Yeah, we definitely should add it to disable LKGS in a Xen PV guest.

So does it mean that the Xen PV uses a black list during feature detection?
If yes then new features are often required to be masked with an explicit
call to setup_clear_cpu_cap.

Wouldn't a white list be better?
Then the job is more just on the Xen PV side, and it can selectively enable
a new feature, sometimes with Xen PV specific handling code added.

Xin

> 
> 
> Juergen
Re: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by Juergen Gross 3 years, 1 month ago
On 19.10.22 19:45, Li, Xin3 wrote:
>>> +static inline void __init lkgs_init(void) { #ifdef
>>> +CONFIG_PARAVIRT_XXL #ifdef CONFIG_X86_64
>>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
>>> +		pv_ops.cpu.load_gs_index = native_lkgs;
>>
>> For this to work correctly when running as a Xen PV guest, you need to add
>>
>> 	setup_clear_cpu_cap(X86_FEATURE_LKGS);
>>
>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise the Xen
>> specific .load_gs_index vector will be overwritten.
> 
> Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
> 
> So does it mean that the Xen PV uses a black list during feature detection?
> If yes then new features are often required to be masked with an explicit
> call to setup_clear_cpu_cap.
> 
> Wouldn't a white list be better?
> Then the job is more just on the Xen PV side, and it can selectively enable
> a new feature, sometimes with Xen PV specific handling code added.

This is not how it works. Feature detection is generic code, so we'd need to
tweak that for switching to a whitelist.

Additionally most features don't require any Xen PV specific handling. This is
needed for some paravirtualized privileged operations only. So switching to a
whitelist would add more effort.


Juergen

RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by Li, Xin3 3 years, 1 month ago
> On 19.10.22 19:45, Li, Xin3 wrote:
> >>> +static inline void __init lkgs_init(void) { #ifdef
> >>> +CONFIG_PARAVIRT_XXL #ifdef CONFIG_X86_64
> >>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
> >>> +		pv_ops.cpu.load_gs_index = native_lkgs;
> >>
> >> For this to work correctly when running as a Xen PV guest, you need
> >> to add
> >>
> >> 	setup_clear_cpu_cap(X86_FEATURE_LKGS);
> >>
> >> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as
> >> otherwise the Xen specific .load_gs_index vector will be overwritten.
> >
> > Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
> >
> > So does it mean that the Xen PV uses a black list during feature detection?
> > If yes then new features are often required to be masked with an
> > explicit call to setup_clear_cpu_cap.
> >
> > Wouldn't a white list be better?
> > Then the job is more just on the Xen PV side, and it can selectively
> > enable a new feature, sometimes with Xen PV specific handling code added.
> 
> This is not how it works. Feature detection is generic code, so we'd need to
> tweak that for switching to a whitelist.
>

Yes, a Xen PV guest is basically a Linux system.  However IIRC, the Xen PV
CPUID is para-virtualized, so it's Xen hypervisor's responsibility to decide
features exposed to a Xen PV guest.  No?

> Additionally most features don't require any Xen PV specific handling. This is
> needed for some paravirtualized privileged operations only. So switching to a
> whitelist would add more effort.
> 

LKGS is allowed only in ring 0, thus only Xen hypervisor could use it.

Xin

> 
> Juergen

Re: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by Juergen Gross 3 years, 1 month ago
On 20.10.22 07:58, Li, Xin3 wrote:
>> On 19.10.22 19:45, Li, Xin3 wrote:
>>>>> +static inline void __init lkgs_init(void) { #ifdef
>>>>> +CONFIG_PARAVIRT_XXL #ifdef CONFIG_X86_64
>>>>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
>>>>> +		pv_ops.cpu.load_gs_index = native_lkgs;
>>>>
>>>> For this to work correctly when running as a Xen PV guest, you need
>>>> to add
>>>>
>>>> 	setup_clear_cpu_cap(X86_FEATURE_LKGS);
>>>>
>>>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as
>>>> otherwise the Xen specific .load_gs_index vector will be overwritten.
>>>
>>> Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
>>>
>>> So does it mean that the Xen PV uses a black list during feature detection?
>>> If yes then new features are often required to be masked with an
>>> explicit call to setup_clear_cpu_cap.
>>>
>>> Wouldn't a white list be better?
>>> Then the job is more just on the Xen PV side, and it can selectively
>>> enable a new feature, sometimes with Xen PV specific handling code added.
>>
>> This is not how it works. Feature detection is generic code, so we'd need to
>> tweak that for switching to a whitelist.
>>
> 
> Yes, a Xen PV guest is basically a Linux system.  However IIRC, the Xen PV
> CPUID is para-virtualized, so it's Xen hypervisor's responsibility to decide
> features exposed to a Xen PV guest.  No?

In theory you are right, of course.

OTOH the Xen PV interface has a long and complicated history, and we have to
deal with old hypervisor versions, too.

>> Additionally most features don't require any Xen PV specific handling. This is
>> needed for some paravirtualized privileged operations only. So switching to a
>> whitelist would add more effort.
>>
> 
> LKGS is allowed only in ring 0, thus only Xen hypervisor could use it.

Right, it would be one of the features where a whitelist would be nice.

OTOH today only 11 features need special handling in Xen PV guests, while
the rest of more than 300 features doesn't.


Juergen
RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by Li, Xin3 3 years, 1 month ago
> On 20.10.22 07:58, Li, Xin3 wrote:
> >> On 19.10.22 19:45, Li, Xin3 wrote:
> >>>>> +static inline void __init lkgs_init(void) { #ifdef
> >>>>> +CONFIG_PARAVIRT_XXL #ifdef CONFIG_X86_64
> >>>>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
> >>>>> +		pv_ops.cpu.load_gs_index = native_lkgs;
> >>>>
> >>>> For this to work correctly when running as a Xen PV guest, you need
> >>>> to add
> >>>>
> >>>> 	setup_clear_cpu_cap(X86_FEATURE_LKGS);
> >>>>
> >>>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as
> >>>> otherwise the Xen specific .load_gs_index vector will be overwritten.
> >>>
> >>> Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
> >>>
> >>> So does it mean that the Xen PV uses a black list during feature detection?
> >>> If yes then new features are often required to be masked with an
> >>> explicit call to setup_clear_cpu_cap.
> >>>
> >>> Wouldn't a white list be better?
> >>> Then the job is more just on the Xen PV side, and it can selectively
> >>> enable a new feature, sometimes with Xen PV specific handling code
> added.
> >>
> >> This is not how it works. Feature detection is generic code, so we'd
> >> need to tweak that for switching to a whitelist.
> >>
> >
> > Yes, a Xen PV guest is basically a Linux system.  However IIRC, the
> > Xen PV CPUID is para-virtualized, so it's Xen hypervisor's
> > responsibility to decide features exposed to a Xen PV guest.  No?
> 
> In theory you are right, of course.
> 
> OTOH the Xen PV interface has a long and complicated history, and we have to
> deal with old hypervisor versions, too.
> 
> >> Additionally most features don't require any Xen PV specific
> >> handling. This is needed for some paravirtualized privileged
> >> operations only. So switching to a whitelist would add more effort.
> >>
> >
> > LKGS is allowed only in ring 0, thus only Xen hypervisor could use it.
> 
> Right, it would be one of the features where a whitelist would be nice.
> 
> OTOH today only 11 features need special handling in Xen PV guests, while the
> rest of more than 300 features doesn't.
> 

Got to say, nothing is more convincing than strong data.
Xin

> 
> Juergen
RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by H. Peter Anvin 3 years, 1 month ago
On October 19, 2022 10:45:07 AM PDT, "Li, Xin3" <xin3.li@intel.com> wrote:
>> > +static inline void __init lkgs_init(void) { #ifdef
>> > +CONFIG_PARAVIRT_XXL #ifdef CONFIG_X86_64
>> > +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
>> > +		pv_ops.cpu.load_gs_index = native_lkgs;
>> 
>> For this to work correctly when running as a Xen PV guest, you need to add
>> 
>> 	setup_clear_cpu_cap(X86_FEATURE_LKGS);
>> 
>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise the Xen
>> specific .load_gs_index vector will be overwritten.
>
>Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
>
>So does it mean that the Xen PV uses a black list during feature detection?
>If yes then new features are often required to be masked with an explicit
>call to setup_clear_cpu_cap.
>
>Wouldn't a white list be better?
>Then the job is more just on the Xen PV side, and it can selectively enable
>a new feature, sometimes with Xen PV specific handling code added.
>
>Xin
>
>> 
>> 
>> Juergen
>

Most things don't frob the paravirt list.

Maybe we should make the paravirt frobbing a separate patch, at it is separable.
Re: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by Juergen Gross 3 years, 1 month ago
On 19.10.22 20:01, H. Peter Anvin wrote:
> On October 19, 2022 10:45:07 AM PDT, "Li, Xin3" <xin3.li@intel.com> wrote:
>>>> +static inline void __init lkgs_init(void) { #ifdef
>>>> +CONFIG_PARAVIRT_XXL #ifdef CONFIG_X86_64
>>>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
>>>> +		pv_ops.cpu.load_gs_index = native_lkgs;
>>>
>>> For this to work correctly when running as a Xen PV guest, you need to add
>>>
>>> 	setup_clear_cpu_cap(X86_FEATURE_LKGS);
>>>
>>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise the Xen
>>> specific .load_gs_index vector will be overwritten.
>>
>> Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
>>
>> So does it mean that the Xen PV uses a black list during feature detection?
>> If yes then new features are often required to be masked with an explicit
>> call to setup_clear_cpu_cap.
>>
>> Wouldn't a white list be better?
>> Then the job is more just on the Xen PV side, and it can selectively enable
>> a new feature, sometimes with Xen PV specific handling code added.
>>
>> Xin
>>
>>>
>>>
>>> Juergen
>>
> 
> Most things don't frob the paravirt list.
> 
> Maybe we should make the paravirt frobbing a separate patch, at it is separable.

Works for me.


Juergen
RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
Posted by Li, Xin3 3 years, 1 month ago
> >
> > Most things don't frob the paravirt list.
> >
> > Maybe we should make the paravirt frobbing a separate patch, at it is
> separable.
> 
> Works for me.

Thanks, I will send out the patch after Xen PV testing (need to setup it first).

Xin

> 
> 
> Juergen