[PATCH] perf/x86/intel/ds: Fix loop ordering in release_ds_buffers()

Rik van Riel posted 1 patch 5 days, 22 hours ago
arch/x86/events/intel/ds.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
[PATCH] perf/x86/intel/ds: Fix loop ordering in release_ds_buffers()
Posted by Rik van Riel 5 days, 22 hours ago
release_ds_buffers() has three loops:
1. release_ds_buffer() - NULLs hwev->ds for each CPU
2. fini_debug_store_on_cpu() - clears MSR_IA32_DS_AREA
3. release_pebs_buffer()/release_bts_buffer() - unmaps CEA pages and
   frees backing pages

The problem is that fini_debug_store_on_cpu() checks if hwev->ds is
NULL and returns early if so. Since loop 1 already NULLed hwev->ds,
loop 2 never actually clears MSR_IA32_DS_AREA. Then loop 3 unmaps the
CEA pages, leaving the MSR pointing at now-unmapped memory. When a PEBS
overflow fires, the hardware writes to unmapped pages, causing page
faults in random victim code.

Fix by swapping the first two loops: clear MSR_IA32_DS_AREA first
(while hwev->ds is still valid), then NULL hwev->ds, then unmap and
free the CEA pages.

Fixes: b00233b25080 ("perf/x86/intel/ds: Create separate buffer for CEA")
Signed-off-by: Rik van Riel <riel@surriel.com>
---
PS: this bug was discovered in automatic syzkaller runs, with AI
    running Chris Mason's /kdebug analysis on every crash found.
    I'm slowly working my way through the bug analyses, trying to
    come up with fixes. This one seemed simple and self contained 
    enough to just send it out.

 arch/x86/events/intel/ds.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index eb07fcb61a9b0..927efea37caf0 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -970,18 +970,18 @@ void release_ds_buffers(void)
 	if (!x86_pmu.bts && !x86_pmu.ds_pebs)
 		return;
 
-	for_each_possible_cpu(cpu)
-		release_ds_buffer(cpu);
-
 	for_each_possible_cpu(cpu) {
 		/*
-		 * Again, ignore errors from offline CPUs, they will no longer
+		 * Ignore errors from offline CPUs, they will no longer
 		 * observe cpu_hw_events.ds and not program the DS_AREA when
 		 * they come up.
 		 */
 		fini_debug_store_on_cpu(cpu);
 	}
 
+	for_each_possible_cpu(cpu)
+		release_ds_buffer(cpu);
+
 	for_each_possible_cpu(cpu) {
 		if (x86_pmu.ds_pebs)
 			release_pebs_buffer(cpu);
-- 
2.52.0