[PATCH] x86/AMD: also determine L3 cache size

[PATCH] x86/AMD: also determine L3 cache size
Posted by Jan Beulich 3 years ago
For Intel CPUs we record L3 cache size, hence we should also do so for
AMD and the like.

While making these additions, also make sure (throughout the function)
that we don't needlessly overwrite prior values when the new value to be
stored is zero.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
I have to admit though that I'm not convinced the sole real use of the
field (in flush_area_local()) is a good one - flushing an entire L3's
worth of lines via CLFLUSH may not be more efficient than using WBINVD.
But I didn't measure it (yet).
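
For context, the decision in question is roughly of the following shape
(a much simplified sketch, not the actual flush_area_local() code; it
assumes x86_cache_size holds the size in KB and x86_clflush_size the line
size in bytes, as recorded by display_cacheinfo()):

	/* Sketch only: flush an address range line by line, falling back
	 * to WBINVD once the range is about as large as the recorded cache. */
	static void flush_range(const void *va, unsigned long bytes)
	{
		const struct cpuinfo_x86 *c = &current_cpu_data;
		unsigned int line = c->x86_clflush_size;

		if (line && c->x86_cache_size &&
		    (bytes >> 10) < c->x86_cache_size) {
			const char *p = va;
			unsigned long i;

			asm volatile ("mfence" ::: "memory");
			for (i = 0; i < bytes; i += line)
				asm volatile ("clflush %0" :: "m" (p[i]));
			asm volatile ("mfence" ::: "memory");
		} else
			asm volatile ("wbinvd" ::: "memory");
	}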

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -240,28 +240,41 @@ int get_model_name(struct cpuinfo_x86 *c
 
 void display_cacheinfo(struct cpuinfo_x86 *c)
 {
-	unsigned int dummy, ecx, edx, l2size;
+	unsigned int dummy, ecx, edx, size;
 
 	if (c->extended_cpuid_level >= 0x80000005) {
 		cpuid(0x80000005, &dummy, &dummy, &ecx, &edx);
-		if (opt_cpu_info)
-			printk("CPU: L1 I cache %dK (%d bytes/line),"
-			              " D cache %dK (%d bytes/line)\n",
-			       edx>>24, edx&0xFF, ecx>>24, ecx&0xFF);
-		c->x86_cache_size=(ecx>>24)+(edx>>24);	
+		if ((edx | ecx) >> 24) {
+			if (opt_cpu_info)
+				printk("CPU: L1 I cache %uK (%u bytes/line),"
+				              " D cache %uK (%u bytes/line)\n",
+				       edx >> 24, edx & 0xFF, ecx >> 24, ecx & 0xFF);
+			c->x86_cache_size = (ecx >> 24) + (edx >> 24);
+		}
 	}
 
 	if (c->extended_cpuid_level < 0x80000006)	/* Some chips just has a large L1. */
 		return;
 
-	ecx = cpuid_ecx(0x80000006);
-	l2size = ecx >> 16;
-	
-	c->x86_cache_size = l2size;
-
-	if (opt_cpu_info)
-		printk("CPU: L2 Cache: %dK (%d bytes/line)\n",
-		       l2size, ecx & 0xFF);
+	cpuid(0x80000006, &dummy, &dummy, &ecx, &edx);
+
+	size = ecx >> 16;
+	if (size) {
+		c->x86_cache_size = size;
+
+		if (opt_cpu_info)
+			printk("CPU: L2 Cache: %uK (%u bytes/line)\n",
+			       size, ecx & 0xFF);
+	}
+
+	size = edx >> 18;
+	if (size) {
+		c->x86_cache_size = size * 512;
+
+		if (opt_cpu_info)
+			printk("CPU: L3 Cache: %uM (%u bytes/line)\n",
+			       (size + (size & 1)) >> 1, edx & 0xFF);
+	}
 }
 
 static inline u32 _phys_pkg_id(u32 cpuid_apic, int index_msb)
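
For reference, the encodings the new code relies on: in CPUID leaf
0x80000006, ECX[31:16] is the L2 size in KB, while EDX[31:18] is the L3
size in 512-KB units - hence the multiplication by 512 to store KB, and
the halving (rounding up) to print whole MB. A hypothetical user-space
equivalent of the decoding, purely for illustration:

	/* Illustration only: decode the L2/L3 cache leaf the same way the
	 * patched display_cacheinfo() does (needs a compiler providing
	 * <cpuid.h>). */
	#include <cpuid.h>
	#include <stdio.h>

	int main(void)
	{
		unsigned int eax, ebx, ecx, edx, size;

		if (!__get_cpuid(0x80000006, &eax, &ebx, &ecx, &edx))
			return 1;

		size = ecx >> 16;
		if (size)
			printf("L2: %uK (%u bytes/line)\n", size, ecx & 0xff);

		size = edx >> 18;		/* 512-KB units */
		if (size)			/* e.g. 32 units -> 16384 KB -> 16M */
			printf("L3: %uM (%u bytes/line)\n",
			       (size + (size & 1)) >> 1, edx & 0xff);

		return 0;
	}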

Re: [PATCH] x86/AMD: also determine L3 cache size
Posted by Andrew Cooper 3 years ago
On 16/04/2021 14:20, Jan Beulich wrote:
> For Intel CPUs we record L3 cache size, hence we should also do so for
> AMD and the like.
>
> While making these additions, also make sure (throughout the function)
> that we don't needlessly overwrite prior values when the new value to be
> stored is zero.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> I have to admit though that I'm not convinced the sole real use of the
> field (in flush_area_local()) is a good one - flushing an entire L3's
> worth of lines via CLFLUSH may not be more efficient than using WBINVD.
> But I didn't measure it (yet).

WBINVD always needs a broadcast IPI to work correctly.

CLFLUSH and friends let you do this from a single CPU, using cache
coherency to DTRT with the line, wherever it is.
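
Concretely, the difference is roughly this (just a sketch, with va standing
for some address of interest):

	/* Global cache flush: every online CPU has to execute WBINVD,
	 * since it only touches the local CPU's caches. */
	static void wbinvd_fn(void *unused)
	{
		asm volatile ("wbinvd" ::: "memory");
	}
	/* ... */ on_each_cpu(wbinvd_fn, NULL, 1);

	/* Single line: one CPU suffices, coherency evicts the line from
	 * whichever cache currently holds it. */
	asm volatile ("clflush %0" :: "m" (*(const char *)va));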


Looking at that logic in flush_area_local(), I don't see how it can be
correct.  The WBINVD path is a decomposition inside the IPI, but in the
higher level helpers, I don't see how the "area too big, convert to
WBINVD" can be safe.

All users of FLUSH_CACHE are flush_all(), except two PCI
Passthrough-restricted cases. MMUEXT_FLUSH_CACHE_GLOBAL looks to be
safe, while vmx_do_resume() has very dubious reasoning, and is dead code
I think, because I'm not aware of a VT-x capable CPU without WBINVD-exiting.

~Andrew


Re: [PATCH] x86/AMD: also determine L3 cache size
Posted by Jan Beulich 3 years ago
On 16.04.2021 16:21, Andrew Cooper wrote:
> On 16/04/2021 14:20, Jan Beulich wrote:
>> For Intel CPUs we record L3 cache size, hence we should also do so for
>> AMD and the like.
>>
>> While making these additions, also make sure (throughout the function)
>> that we don't needlessly overwrite prior values when the new value to be
>> stored is zero.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> I have to admit though that I'm not convinced the sole real use of the
>> field (in flush_area_local()) is a good one - flushing an entire L3's
>> worth of lines via CLFLUSH may not be more efficient than using WBINVD.
>> But I didn't measure it (yet).
> 
> WBINVD always needs a broadcast IPI to work correctly.
> 
> CLFLUSH and friends let you do this from a single CPU, using cache
> coherency to DTRT with the line, wherever it is.
> 
> 
> Looking at that logic in flush_area_local(), I don't see how it can be
> correct.  The WBINVD path is a decomposition inside the IPI, but in the
> higher level helpers, I don't see how the "area too big, convert to
> WBINVD" can be safe.

Would you mind giving an example? I'm struggling to understand what
exactly you mean to point out.

Jan

> All users of FLUSH_CACHE are flush_all(), except two PCI
> Passthrough-restricted cases. MMUEXT_FLUSH_CACHE_GLOBAL looks to be
> safe, while vmx_do_resume() has very dubious reasoning, and is dead code
> I think, because I'm not aware of a VT-x capable CPU without WBINVD-exiting.
> 
> ~Andrew
> 


Re: [PATCH] x86/AMD: also determine L3 cache size
Posted by Jan Beulich 2 years, 12 months ago
On 16.04.2021 16:21, Andrew Cooper wrote:
> On 16/04/2021 14:20, Jan Beulich wrote:
>> For Intel CPUs we record L3 cache size, hence we should also do so for
>> AMD and the like.
>>
>> While making these additions, also make sure (throughout the function)
>> that we don't needlessly overwrite prior values when the new value to be
>> stored is zero.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> I have to admit though that I'm not convinced the sole real use of the
>> field (in flush_area_local()) is a good one - flushing an entire L3's
>> worth of lines via CLFLUSH may not be more efficient than using WBINVD.
>> But I didn't measure it (yet).
> 
> WBINVD always needs a broadcast IPI to work correctly.
> 
> CLFLUSH and friends let you do this from a single CPU, using cache
> coherency to DTRT with the line, wherever it is.
> 
> 
> Looking at that logic in flush_area_local(), I don't see how it can be
> correct.  The WBINVD path is a decomposition inside the IPI, but in the
> higher level helpers, I don't see how the "area too big, convert to
> WBINVD" can be safe.
> 
> All users of FLUSH_CACHE are flush_all(), except two PCI
> Passthrough-restricted cases. MMUEXT_FLUSH_CACHE_GLOBAL looks to be
> safe, while vmx_do_resume() has very dubious reasoning, and is dead code
> I think, because I'm not aware of a VT-x capable CPU without WBINVD-exiting.

Besides my prior question on your reply, may I also ask what all of
this means for the patch itself? After all you've been replying to
the post-commit-message remark only so far.

Jan

Ping: [PATCH] x86/AMD: also determine L3 cache size
Posted by Jan Beulich 2 years, 11 months ago
On 29.04.2021 11:21, Jan Beulich wrote:
> On 16.04.2021 16:21, Andrew Cooper wrote:
>> On 16/04/2021 14:20, Jan Beulich wrote:
>>> For Intel CPUs we record L3 cache size, hence we should also do so for
>>> AMD and the like.
>>>
>>> While making these additions, also make sure (throughout the function)
>>> that we don't needlessly overwrite prior values when the new value to be
>>> stored is zero.
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>> ---
>>> I have to admit though that I'm not convinced the sole real use of the
>>> field (in flush_area_local()) is a good one - flushing an entire L3's
>>> worth of lines via CLFLUSH may not be more efficient than using WBINVD.
>>> But I didn't measure it (yet).
>>
>> WBINVD always needs a broadcast IPI to work correctly.
>>
>> CLFLUSH and friends let you do this from a single CPU, using cache
>> coherency to DTRT with the line, wherever it is.
>>
>>
>> Looking at that logic in flush_area_local(), I don't see how it can be
>> correct.  The WBINVD path is a decomposition inside the IPI, but in the
>> higher level helpers, I don't see how the "area too big, convert to
>> WBINVD" can be safe.
>>
>> All users of FLUSH_CACHE are flush_all(), except two PCI
>> Passthrough-restricted cases. MMUEXT_FLUSH_CACHE_GLOBAL looks to be
>> safe, while vmx_do_resume() has very dubious reasoning, and is dead code
>> I think, because I'm not aware of a VT-x capable CPU without WBINVD-exiting.
> 
> Besides my prior question on your reply, may I also ask what all of
> this means for the patch itself? After all you've been replying to
> the post-commit-message remark only so far.

As with the other patch I've just pinged again: unless I hear back on the
patch itself by then, I'm intending to commit this the week after next, if
need be without any acks.

Jan