[PATCH v5 01/20] KVM: x86: Move find_highest_vector() to a common header

Neeraj Upadhyay posted 20 patches 9 months, 2 weeks ago
There is a newer version of this series
[PATCH v5 01/20] KVM: x86: Move find_highest_vector() to a common header
Posted by Neeraj Upadhyay 9 months, 2 weeks ago
In preparation for using find_highest_vector() in Secure AVIC
guest APIC driver, move (and rename) find_highest_vector() to
apic.h.

Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
Changes since v4:
 - No change

 arch/x86/include/asm/apic.h | 23 +++++++++++++++++++++++
 arch/x86/kvm/lapic.c        | 23 +++--------------------
 2 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 1c136f54651c..c63c2fe8ad13 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -500,6 +500,29 @@ static inline bool is_vector_pending(unsigned int vector)
 	return lapic_vector_set_in_irr(vector) || pi_pending_this_cpu(vector);
 }
 
+#define MAX_APIC_VECTOR			256
+#define APIC_VECTORS_PER_REG		32
+
+static inline int apic_find_highest_vector(void *bitmap)
+{
+	unsigned int regno;
+	unsigned int vec;
+	u32 *reg;
+
+	/*
+	 * The registers in the bitmap are 32-bit wide and 16-byte
+	 * aligned. State of a vector is stored in a single bit.
+	 */
+	for (regno = MAX_APIC_VECTOR / APIC_VECTORS_PER_REG - 1; regno >= 0; regno--) {
+		vec = regno * APIC_VECTORS_PER_REG;
+		reg = bitmap + regno * 16;
+		if (*reg)
+			return __fls(*reg) + vec;
+	}
+
+	return -1;
+}
+
 /*
  * Warm reset vector position:
  */
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 28e3317124fd..775eb742d110 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -27,6 +27,7 @@
 #include <linux/export.h>
 #include <linux/math64.h>
 #include <linux/slab.h>
+#include <asm/apic.h>
 #include <asm/processor.h>
 #include <asm/mce.h>
 #include <asm/msr.h>
@@ -55,9 +56,6 @@
 /* 14 is the version for Xeon and Pentium 8.4.8*/
 #define APIC_VERSION			0x14UL
 #define LAPIC_MMIO_LENGTH		(1 << 12)
-/* followed define is not in apicdef.h */
-#define MAX_APIC_VECTOR			256
-#define APIC_VECTORS_PER_REG		32
 
 /*
  * Enable local APIC timer advancement (tscdeadline mode only) with adaptive
@@ -626,21 +624,6 @@ static const unsigned int apic_lvt_mask[KVM_APIC_MAX_NR_LVT_ENTRIES] = {
 	[LVT_CMCI] = LVT_MASK | APIC_MODE_MASK
 };
 
-static int find_highest_vector(void *bitmap)
-{
-	int vec;
-	u32 *reg;
-
-	for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
-	     vec >= 0; vec -= APIC_VECTORS_PER_REG) {
-		reg = bitmap + REG_POS(vec);
-		if (*reg)
-			return __fls(*reg) + vec;
-	}
-
-	return -1;
-}
-
 static u8 count_vectors(void *bitmap)
 {
 	int vec;
@@ -704,7 +687,7 @@ EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
 
 static inline int apic_search_irr(struct kvm_lapic *apic)
 {
-	return find_highest_vector(apic->regs + APIC_IRR);
+	return apic_find_highest_vector(apic->regs + APIC_IRR);
 }
 
 static inline int apic_find_highest_irr(struct kvm_lapic *apic)
@@ -779,7 +762,7 @@ static inline int apic_find_highest_isr(struct kvm_lapic *apic)
 	if (likely(apic->highest_isr_cache != -1))
 		return apic->highest_isr_cache;
 
-	result = find_highest_vector(apic->regs + APIC_ISR);
+	result = apic_find_highest_vector(apic->regs + APIC_ISR);
 	ASSERT(result == -1 || result >= 16);
 
 	return result;
-- 
2.34.1
Re: [PATCH v5 01/20] KVM: x86: Move find_highest_vector() to a common header
Posted by Sean Christopherson 9 months, 2 weeks ago
On Tue, Apr 29, 2025, Neeraj Upadhyay wrote:
> In preparation for using find_highest_vector() in Secure AVIC
> guest APIC driver, move (and rename) find_highest_vector() to
> apic.h.
> 
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> Changes since v4:
>  - No change
> 
>  arch/x86/include/asm/apic.h | 23 +++++++++++++++++++++++
>  arch/x86/kvm/lapic.c        | 23 +++--------------------
>  2 files changed, 26 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> index 1c136f54651c..c63c2fe8ad13 100644
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -500,6 +500,29 @@ static inline bool is_vector_pending(unsigned int vector)
>  	return lapic_vector_set_in_irr(vector) || pi_pending_this_cpu(vector);
>  }
>  
> +#define MAX_APIC_VECTOR			256
> +#define APIC_VECTORS_PER_REG		32
> +
> +static inline int apic_find_highest_vector(void *bitmap)
> +{
> +	unsigned int regno;
> +	unsigned int vec;
> +	u32 *reg;
> +
> +	/*
> +	 * The registers in the bitmap are 32-bit wide and 16-byte
> +	 * aligned. State of a vector is stored in a single bit.
> +	 */
> +	for (regno = MAX_APIC_VECTOR / APIC_VECTORS_PER_REG - 1; regno >= 0; regno--) {
> +		vec = regno * APIC_VECTORS_PER_REG;
> +		reg = bitmap + regno * 16;
> +		if (*reg)
> +			return __fls(*reg) + vec;
> +	}

NAK.  The changelog says nothing about rewriting the logic, and I have zero desire
to review or test this for correctness.  If someone has requested that the logic be
cleaned up, then do that as a separate patch (or patches) on top, with a changelog
that justifies the change, because to my eyes this isn't an improvement.

I suspect the rewrite is in part due to REG_POS() being a KVM helper that's poorly
named for a global macro.  lapic_vector_set_in_irr() already has open coded
versions of REG_POS() and VEC_POS(), just dedup those.
 
*sigh*

And you created your own versions of those in get_reg_bitmap() and get_vec_bit().

Please slot the below in.  And if there is any more code in this series that is
duplicating existing functionality, try to figure out a clean way to share code
instead of open coding yet another version.

--
From: Sean Christopherson <seanjc@google.com>
Date: Tue, 29 Apr 2025 07:30:47 -0700
Subject: [PATCH] x86/apic: KVM: Deduplicate APIC vector => register+bit math

Consolidate KVM's {REG,VEC}_POS() macros and lapic_vector_set_in_irr()'s
open coded equivalent logic in anticipation of the kernel gaining more
usage of vector => reg+bit lookups.

Use lapic_vector_set_in_irr()'s math as using divides for both the bit
number and register offset makes it easier to connect the dots, and for at
least one user, fixup_irqs(), "/ 32 * 0x10" generates ever so slightly
better code with gcc-14 (shaves a whole 3 bytes from the code stream):

((v) >> 5) << 4:
  c1 ef 05           shr    $0x5,%edi
  c1 e7 04           shl    $0x4,%edi
  81 c7 00 02 00 00  add    $0x200,%edi

(v) / 32 * 0x10:
  c1 ef 05           shr    $0x5,%edi
  83 c7 20           add    $0x20,%edi
  c1 e7 04           shl    $0x4,%edi

Keep KVM's tersely named macros as "wrappers" to avoid unnecessary churn
in KVM, and because the shorter names yield more readable code overall in
KVM.

No functional change intended (clang-19 and gcc-14 generate bit-for-bit
identical code for all of kvm.ko).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/apic.h | 7 +++++--
 arch/x86/kvm/lapic.h        | 4 ++--
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index c903d358405d..7082826030ba 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -488,11 +488,14 @@ static inline void apic_setup_apic_calls(void) { }
 
 extern void apic_ack_irq(struct irq_data *data);
 
+#define APIC_VECTOR_TO_BIT_NUMBER(v) ((v) % 32)
+#define APIC_VECTOR_TO_REG_OFFSET(v) ((v) / 32 * 0x10)
+
 static inline bool lapic_vector_set_in_irr(unsigned int vector)
 {
-	u32 irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
+	u32 irr = apic_read(APIC_IRR + APIC_VECTOR_TO_REG_OFFSET(vector));
 
-	return !!(irr & (1U << (vector % 32)));
+	return !!(irr & (1U << APIC_VECTOR_TO_BIT_NUMBER(vector)));
 }
 
 static inline bool is_vector_pending(unsigned int vector)
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index e33c969439f7..13a4bc60e292 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -145,8 +145,8 @@ void kvm_lapic_exit(void);
 
 u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic);
 
-#define VEC_POS(v) ((v) & (32 - 1))
-#define REG_POS(v) (((v) >> 5) << 4)
+#define VEC_POS(v) APIC_VECTOR_TO_BIT_NUMBER(v)
+#define REG_POS(v) APIC_VECTOR_TO_REG_OFFSET(v)
 
 static inline void kvm_lapic_clear_vector(int vec, void *bitmap)
 {

base-commit: 810a8562c8a326765a35e7c2415bd052cca9dd2a
--
Re: [PATCH v5 01/20] KVM: x86: Move find_highest_vector() to a common header
Posted by Neeraj Upadhyay 8 months ago

On 4/29/2025 8:12 PM, Sean Christopherson wrote:
> Please slot the below in.  And if there is any more code in this series that is
> duplicating existing functionality, try to figure out a clean way to share code
> instead of open coding yet another version.
> 
> --
> From: Sean Christopherson <seanjc@google.com>
> Date: Tue, 29 Apr 2025 07:30:47 -0700
> Subject: [PATCH] x86/apic: KVM: Deduplicate APIC vector => register+bit math
> 
> Consolidate KVM's {REG,VEC}_POS() macros and lapic_vector_set_in_irr()'s
> open coded equivalent logic in anticipation of the kernel gaining more
> usage of vector => reg+bit lookups.
> 
> Use lapic_vector_set_in_irr()'s math as using divides for both the bit
> number and register offset makes it easier to connect the dots, and for at
> least one user, fixup_irqs(), "/ 32 * 0x10" generates ever so slightly
> better code with gcc-14 (shaves a whole 3 bytes from the code stream):
> 
> ((v) >> 5) << 4:
>   c1 ef 05           shr    $0x5,%edi
>   c1 e7 04           shl    $0x4,%edi
>   81 c7 00 02 00 00  add    $0x200,%edi
> 
> (v) / 32 * 0x10:
>   c1 ef 05           shr    $0x5,%edi
>   83 c7 20           add    $0x20,%edi
>   c1 e7 04           shl    $0x4,%edi
> 
> Keep KVM's tersely named macros as "wrappers" to avoid unnecessary churn
> in KVM, and because the shorter names yield more readable code overall in
> KVM.
> 
> No functional change intended (clang-19 and gcc-14 generate bit-for-bit
> identical code for all of kvm.ko).
> 

With this change, I am observing difference in generated assembly for VEC_POS
and REG_POS, as KVM code passes vector param with type "int" to these macros.
Type casting "v" param of APIC_VECTOR_TO_BIT_NUMBER and APIC_VECTOR_TO_REG_OFFSET
to "unsigned int" in the macro definition restores the original assembly. Can
you have a look at this once? Below is the updated patch for this. Can you please
share your feedback on this?

--
x86/apic: KVM: Deduplicate APIC vector => register+bit math

Consolidate KVM's {REG,VEC}_POS() macros and lapic_vector_set_in_irr()'s
open coded equivalent logic in anticipation of the kernel gaining more
usage of vector => reg+bit lookups.

Use lapic_vector_set_in_irr()'s math as using divides for both the bit
number and register offset makes it easier to connect the dots, and for at
least one user, fixup_irqs(), "/ 32 * 0x10" generates ever so slightly
better code with gcc-14 (shaves a whole 3 bytes from the code stream):

((v) >> 5) << 4:
  c1 ef 05           shr    $0x5,%edi
  c1 e7 04           shl    $0x4,%edi
  81 c7 00 02 00 00  add    $0x200,%edi

(v) / 32 * 0x10:
  c1 ef 05           shr    $0x5,%edi
  83 c7 20           add    $0x20,%edi
  c1 e7 04           shl    $0x4,%edi

Keep KVM's tersely named macros as "wrappers" to avoid unnecessary churn
in KVM, and because the shorter names yield more readable code overall in
KVM.

The new macros type cast the vector parameter to "unsigned int". This is
required from better code generation for cases where an "int" is passed
to these macros in kvm code.

int v;

((v) >> 5) << 4:

  c1 f8 05    sar    $0x5,%eax
  c1 e0 04    shl    $0x4,%eax

((v) / 32 * 0x10):

  85 ff       test   %edi,%edi
  8d 47 1f    lea    0x1f(%rdi),%eax
  0f 49 c7    cmovns %edi,%eax
  c1 f8 05    sar    $0x5,%eax
  c1 e0 04    shl    $0x4,%eax

((unsigned int)(v) / 32 * 0x10):

  c1 f8 05    sar    $0x5,%eax
  c1 e0 04    shl    $0x4,%eax

(v) & (32 - 1):

  89 f8       mov    %edi,%eax
  83 e0 1f    and    $0x1f,%eax

(v) % 32

  89 fa       mov    %edi,%edx
  c1 fa 1f    sar    $0x1f,%edx
  c1 ea 1b    shr    $0x1b,%edx
  8d 04 17    lea    (%rdi,%rdx,1),%eax
  83 e0 1f    and    $0x1f,%eax
  29 d0       sub    %edx,%eax

(unsigned int)(v) % 32:

  89 f8       mov    %edi,%eax
  83 e0 1f    and    $0x1f,%eax

Overall kvm.ko size is impacted if "unsigned int" is not used.

Bin      Orig     New (w/o unsigned int)  New (w/ unsigned int)

lapic.o  28580        28772                 28580
kvm.o    670810       671002                670810
kvm.ko   708079       708271                708079

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
[Neeraj: Type cast vec macro param to "unsigned int", provide data
         in commit log on "unsigned int" requirement]
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
 arch/x86/include/asm/apic.h | 7 +++++--
 arch/x86/kvm/lapic.h        | 4 ++--
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 23d86c9750b9..c84d4e86fe4e 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -488,11 +488,14 @@ static inline void apic_setup_apic_calls(void) { }

 extern void apic_ack_irq(struct irq_data *data);

+#define APIC_VECTOR_TO_BIT_NUMBER(v) ((unsigned int)(v) % 32)
+#define APIC_VECTOR_TO_REG_OFFSET(v) ((unsigned int)(v) / 32 * 0x10)
+
 static inline bool lapic_vector_set_in_irr(unsigned int vector)
 {
-       u32 irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
+       u32 irr = apic_read(APIC_IRR + APIC_VECTOR_TO_REG_OFFSET(vector));

-       return !!(irr & (1U << (vector % 32)));
+       return !!(irr & (1U << APIC_VECTOR_TO_BIT_NUMBER(vector)));
 }

 static inline bool is_vector_pending(unsigned int vector)
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 1638a3da383a..56369d331bfc 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -145,8 +145,8 @@ void kvm_lapic_exit(void);

 u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic);

-#define VEC_POS(v) ((v) & (32 - 1))
-#define REG_POS(v) (((v) >> 5) << 4)
+#define VEC_POS(v) APIC_VECTOR_TO_BIT_NUMBER(v)
+#define REG_POS(v) APIC_VECTOR_TO_REG_OFFSET(v)

 static inline void kvm_lapic_clear_vector(int vec, void *bitmap)
 {

--
Re: [PATCH v5 01/20] KVM: x86: Move find_highest_vector() to a common header
Posted by Sean Christopherson 8 months ago
On Tue, Jun 10, 2025, Neeraj Upadhyay wrote:
> On 4/29/2025 8:12 PM, Sean Christopherson wrote:
> > Please slot the below in.  And if there is any more code in this series that is
> > duplicating existing functionality, try to figure out a clean way to share code
> > instead of open coding yet another version.
> > 
> > --
> > From: Sean Christopherson <seanjc@google.com>
> > Date: Tue, 29 Apr 2025 07:30:47 -0700
> > Subject: [PATCH] x86/apic: KVM: Deduplicate APIC vector => register+bit math
> > 
> > Consolidate KVM's {REG,VEC}_POS() macros and lapic_vector_set_in_irr()'s
> > open coded equivalent logic in anticipation of the kernel gaining more
> > usage of vector => reg+bit lookups.
> > 
> > Use lapic_vector_set_in_irr()'s math as using divides for both the bit
> > number and register offset makes it easier to connect the dots, and for at
> > least one user, fixup_irqs(), "/ 32 * 0x10" generates ever so slightly
> > better code with gcc-14 (shaves a whole 3 bytes from the code stream):
> > 
> > ((v) >> 5) << 4:
> >   c1 ef 05           shr    $0x5,%edi
> >   c1 e7 04           shl    $0x4,%edi
> >   81 c7 00 02 00 00  add    $0x200,%edi
> > 
> > (v) / 32 * 0x10:
> >   c1 ef 05           shr    $0x5,%edi
> >   83 c7 20           add    $0x20,%edi
> >   c1 e7 04           shl    $0x4,%edi
> > 
> > Keep KVM's tersely named macros as "wrappers" to avoid unnecessary churn
> > in KVM, and because the shorter names yield more readable code overall in
> > KVM.
> > 
> > No functional change intended (clang-19 and gcc-14 generate bit-for-bit
> > identical code for all of kvm.ko).
> > 
> 
> With this change, I am observing difference in generated assembly for VEC_POS
> and REG_POS, as KVM code passes vector param with type "int" to these macros.
> Type casting "v" param of APIC_VECTOR_TO_BIT_NUMBER and APIC_VECTOR_TO_REG_OFFSET
> to "unsigned int" in the macro definition restores the original assembly. Can
> you have a look at this once? Below is the updated patch for this. Can you please
> share your feedback on this?

LGTM.

Ideally, KVM would probably pass around an "unsigned int", but some higher level
APIs in KVM use -1 to indicate an invalid vector (e.g. no IRQ pending), and mixing
and matching types would get a little weird and would require a decent amount of
churn.  So casting in the macro where it matters seems like the best option, at
least for now.

Thanks much for taking care of this!
Re: [PATCH v5 01/20] KVM: x86: Move find_highest_vector() to a common header
Posted by Neeraj Upadhyay 9 months, 2 weeks ago

On 4/29/2025 8:12 PM, Sean Christopherson wrote:
...

>> +
>> +static inline int apic_find_highest_vector(void *bitmap)
>> +{
>> +	unsigned int regno;
>> +	unsigned int vec;
>> +	u32 *reg;
>> +
>> +	/*
>> +	 * The registers in the bitmap are 32-bit wide and 16-byte
>> +	 * aligned. State of a vector is stored in a single bit.
>> +	 */
>> +	for (regno = MAX_APIC_VECTOR / APIC_VECTORS_PER_REG - 1; regno >= 0; regno--) {
>> +		vec = regno * APIC_VECTORS_PER_REG;
>> +		reg = bitmap + regno * 16;
>> +		if (*reg)
>> +			return __fls(*reg) + vec;
>> +	}
> 
> NAK.  The changelog says nothing about rewriting the logic, and I have zero desire

My bad. I missed updating the changelog with the information about logic update.

> to review or test this for correctness.  If someone has requested that the logic be
> cleaned up, then do that as a separate patch (or patches) on top, with a changelog
> that justifies the change, because to my eyes this isn't an improvement.
>

Ok. I will keep the original logic in apic_find_highest_vector() in next version of
this patch.
 
> I suspect the rewrite is in part due to REG_POS() being a KVM helper that's poorly
> named for a global macro.  lapic_vector_set_in_irr() already has open coded
> versions of REG_POS() and VEC_POS(), just dedup those.
>  
> *sigh*
> 
> And you created your own versions of those in get_reg_bitmap() and get_vec_bit().
> 
> Please slot the below in.  And if there is any more code in this series that is
> duplicating existing functionality, try to figure out a clean way to share code
> instead of open coding yet another version.
> 

Ok sure. I will take your patch and reuse the APIC regs manipulation functions
between KVM and SAVIC in next version. Thanks for sharing the patch!


- Neeraj
Re: [PATCH v5 01/20] KVM: x86: Move find_highest_vector() to a common header
Posted by Borislav Petkov 9 months, 1 week ago
On Tue, Apr 29, 2025 at 11:39:53PM +0530, Neeraj Upadhyay wrote:
> My bad. I missed updating the changelog with the information about logic update.

No, remember: when you move code like this, your first patch is *solely*
*mechanical* move.

Then, ontop, in further patches you do other changes.

You want to keep mechanical move separate from other changes because it
complicates review unnecessarily.

One of the reasons I'm trying to get you guys to do review too is because then
you'll know.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v5 01/20] KVM: x86: Move find_highest_vector() to a common header
Posted by Neeraj Upadhyay 9 months, 1 week ago

On 5/7/2025 4:08 PM, Borislav Petkov wrote:
> On Tue, Apr 29, 2025 at 11:39:53PM +0530, Neeraj Upadhyay wrote:
>> My bad. I missed updating the changelog with the information about logic update.
> 
> No, remember: when you move code like this, your first patch is *solely*
> *mechanical* move.
> 
> Then, ontop, in further patches you do other changes.
> 
> You want to keep mechanical move separate from other changes because it
> complicates review unnecessarily.
> 
> One of the reasons I'm trying to get you guys to do review too is because then
> you'll know.
> 

Understood. Thanks for explaining this!


- Neeraj


> Thx.
>