From nobody Fri Oct 3 18:02:01 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ECE71239E8B; Thu, 28 Aug 2025 01:36:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.13 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756344981; cv=none; b=HHrvDcBjQW7XpzIO2Opf879XiVKV2W0nbsuw0IaJg6MxCvGrT0R8tymqjSDGmb0AIpRHyD26H+G7PYu0EJiGzenRo8ASgErd9H443cCtUj0V3EYjZ1OkQLN3WFWJjPUr5BOUSt4yxyqi65VSP20swZ6PR8QqYXmVtkAQcqhjoj4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756344981; c=relaxed/simple; bh=SxgdErdfZ4i3Zgnr39SKMu+J5NC+vAYvaf32s4Lygkg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=H/1ZrysTHqL4XqPUPZvIRP37zveG8RXRDFm+EAtjDyenXiKjEieoAdRjXRsmVftyUP+CbvjOTi+ej5TxyenFFXq8iSyXUNV0AYevGN6DTyXBIj1kw7QxsOJDL16SjnN22h7880ENrVQKmxdlw/CeWzEppFOcXfS2klmNag972Uk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=CW0MCxOV; arc=none smtp.client-ip=192.198.163.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="CW0MCxOV" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1756344980; x=1787880980; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SxgdErdfZ4i3Zgnr39SKMu+J5NC+vAYvaf32s4Lygkg=; b=CW0MCxOVVmcH7vKnzvVszfHth59BBUTz+k9Yt4xTAXssPk2fs+71PWDC qcJUT/ejYG8ARndvDlfnYYmIQel0I0pQJHwmPy8XdoHDc0vW41DqEjjv7 KzsOrO+POZPmbW/fTT7Ni3WzzAQdbYlUm1+BsYrt3DFKyx6u0pfhsM+pv dyIPFXFCWfR5Z7dJ0xM3VLGazuqcxcpMEF7l/hvWDht0SRiAx82/zDQ2n PwBn3Ov42cOmDhr2N8ZST4lr/fymyjGv4quqPds1jz/huuswoWpQHFkQ5 KC7gHh8K/lriWFRfbcxE+AOb76AN61TEr4z1euQ6KJis4DGaaXKIPvkV3 A==; X-CSE-ConnectionGUID: 8AkBzke5Rhy0nq2+EG+xbA== X-CSE-MsgGUID: jUBqSXNJT/mMxtxOBrEYoQ== X-IronPort-AV: E=McAfee;i="6800,10657,11535"; a="61240916" X-IronPort-AV: E=Sophos;i="6.18,217,1751266800"; d="scan'208";a="61240916" Received: from fmviesa008.fm.intel.com ([10.60.135.148]) by fmvoesa107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Aug 2025 18:36:19 -0700 X-CSE-ConnectionGUID: hpRQrzuSR9OAEomwkwIscQ== X-CSE-MsgGUID: KF0xwjVbRyiX7oT5Vwv4EQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.18,217,1751266800"; d="scan'208";a="170372586" Received: from spr.sh.intel.com ([10.112.230.239]) by fmviesa008.fm.intel.com with ESMTP; 27 Aug 2025 18:36:16 -0700 From: Dapeng Mi To: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Ian Rogers , Adrian Hunter , Alexander Shishkin , Kan Liang , Andi Kleen , Eranian Stephane Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi , Dapeng Mi Subject: [Patch v7 08/12] perf/x86/intel: Process arch-PEBS records or record fragments Date: Thu, 28 Aug 2025 09:34:31 +0800 Message-Id: <20250828013435.1528459-9-dapeng1.mi@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250828013435.1528459-1-dapeng1.mi@linux.intel.com> References: <20250828013435.1528459-1-dapeng1.mi@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" A significant difference with adaptive PEBS is that arch-PEBS record supports fragments which means an arch-PEBS record could be split into several independent fragments which have its own arch-PEBS header in each fragment. This patch defines architectural PEBS record layout structures and add helpers to process arch-PEBS records or fragments. Only legacy PEBS groups like basic, GPR, XMM and LBR groups are supported in this patch, the new added YMM/ZMM/OPMASK vector registers capturing would be supported in the future. Signed-off-by: Dapeng Mi --- arch/x86/events/intel/core.c | 13 +++ arch/x86/events/intel/ds.c | 182 ++++++++++++++++++++++++++++++ arch/x86/include/asm/msr-index.h | 6 + arch/x86/include/asm/perf_event.h | 96 ++++++++++++++++ 4 files changed, 297 insertions(+) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index bfb123ff7c9a..cab2ce73ad09 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3215,6 +3215,19 @@ static int handle_pmi_common(struct pt_regs *regs, u= 64 status) status &=3D ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT; } =20 + /* + * Arch PEBS sets bit 54 in the global status register + */ + if (__test_and_clear_bit(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT, + (unsigned long *)&status)) { + handled++; + static_call(x86_pmu_drain_pebs)(regs, &data); + + if (cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS] && + is_pebs_counter_event_group(cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS])) + status &=3D ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT; + } + /* * Intel PT */ diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index 121042d532b8..311e25d5c4cf 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -2270,6 +2270,114 @@ static void setup_pebs_adaptive_sample_data(struct = perf_event *event, format_group); } =20 +static inline bool arch_pebs_record_continued(struct arch_pebs_header *hea= der) +{ + /* Continue bit or null PEBS record indicates fragment follows. */ + return header->cont || !(header->format & GENMASK_ULL(63, 16)); +} + +static void setup_arch_pebs_sample_data(struct perf_event *event, + struct pt_regs *iregs, void *__pebs, + struct perf_sample_data *data, + struct pt_regs *regs) +{ + struct cpu_hw_events *cpuc =3D this_cpu_ptr(&cpu_hw_events); + u64 sample_type =3D event->attr.sample_type; + struct arch_pebs_header *header =3D NULL; + struct arch_pebs_aux *meminfo =3D NULL; + struct arch_pebs_gprs *gprs =3D NULL; + struct x86_perf_regs *perf_regs; + void *next_record; + void *at =3D __pebs; + + if (at =3D=3D NULL) + return; + + perf_regs =3D container_of(regs, struct x86_perf_regs, regs); + perf_regs->xmm_regs =3D NULL; + + __setup_perf_sample_data(event, iregs, data); + + *regs =3D *iregs; + +again: + header =3D at; + next_record =3D at + sizeof(struct arch_pebs_header); + if (header->basic) { + struct arch_pebs_basic *basic =3D next_record; + u16 retire =3D 0; + + next_record =3D basic + 1; + + if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) + retire =3D basic->valid ? basic->retire : 0; + __setup_pebs_basic_group(event, regs, data, sample_type, + basic->ip, basic->tsc, retire); + } + + /* + * The record for MEMINFO is in front of GP + * But PERF_SAMPLE_TRANSACTION needs gprs->ax. + * Save the pointer here but process later. + */ + if (header->aux) { + meminfo =3D next_record; + next_record =3D meminfo + 1; + } + + if (header->gpr) { + gprs =3D next_record; + next_record =3D gprs + 1; + + __setup_pebs_gpr_group(event, regs, (struct pebs_gprs *)gprs, + sample_type); + } + + if (header->aux) { + u64 ax =3D gprs ? gprs->ax : 0; + + __setup_pebs_meminfo_group(event, data, sample_type, + meminfo->cache_latency, + meminfo->instr_latency, + meminfo->address, meminfo->aux, + meminfo->tsx_tuning, ax); + } + + if (header->xmm) { + struct pebs_xmm *xmm; + + next_record +=3D sizeof(struct arch_pebs_xer_header); + + xmm =3D next_record; + perf_regs->xmm_regs =3D xmm->xmm; + next_record =3D xmm + 1; + } + + if (header->lbr) { + struct arch_pebs_lbr_header *lbr_header =3D next_record; + struct lbr_entry *lbr; + int num_lbr; + + next_record =3D lbr_header + 1; + lbr =3D next_record; + + num_lbr =3D header->lbr =3D=3D ARCH_PEBS_LBR_NUM_VAR ? lbr_header->depth= : + header->lbr * ARCH_PEBS_BASE_LBR_ENTRIES; + next_record +=3D num_lbr * sizeof(struct lbr_entry); + + if (has_branch_stack(event)) { + intel_pmu_store_pebs_lbrs(lbr); + intel_pmu_lbr_save_brstack(data, cpuc, event); + } + } + + /* Parse followed fragments if there are. */ + if (arch_pebs_record_continued(header)) { + at =3D at + header->size; + goto again; + } +} + static inline void * get_next_pebs_record_by_bit(void *base, void *top, int bit) { @@ -2772,6 +2880,79 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs = *iregs, struct perf_sample_d setup_pebs_adaptive_sample_data); } =20 +static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs, + struct perf_sample_data *data) +{ + struct perf_event *events[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = =3D {NULL}; + short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] =3D {}; + void *last[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS]; + struct cpu_hw_events *cpuc =3D this_cpu_ptr(&cpu_hw_events); + union arch_pebs_index index; + struct x86_perf_regs perf_regs; + struct pt_regs *regs =3D &perf_regs.regs; + void *base, *at, *top; + u64 mask; + + rdmsrq(MSR_IA32_PEBS_INDEX, index.whole); + + if (unlikely(!index.wr)) { + intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX); + return; + } + + base =3D cpuc->ds_pebs_vaddr; + top =3D (void *)((u64)cpuc->ds_pebs_vaddr + + (index.wr << ARCH_PEBS_INDEX_WR_SHIFT)); + + index.wr =3D 0; + index.full =3D 0; + wrmsrq(MSR_IA32_PEBS_INDEX, index.whole); + + mask =3D hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled; + + if (!iregs) + iregs =3D &dummy_iregs; + + /* Process all but the last event for each counter. */ + for (at =3D base; at < top;) { + struct arch_pebs_header *header; + struct arch_pebs_basic *basic; + u64 pebs_status; + + header =3D at; + + if (WARN_ON_ONCE(!header->size)) + break; + + /* 1st fragment or single record must have basic group */ + if (!header->basic) { + at +=3D header->size; + continue; + } + + basic =3D at + sizeof(struct arch_pebs_header); + pebs_status =3D mask & basic->applicable_counters; + __intel_pmu_handle_pebs_record(iregs, regs, data, at, + pebs_status, events, counts, last, + setup_arch_pebs_sample_data); + + /* Skip non-last fragments */ + while (arch_pebs_record_continued(header)) { + if (!header->size) + break; + at +=3D header->size; + header =3D at; + } + + /* Skip last fragment or the single record */ + at +=3D header->size; + } + + __intel_pmu_handle_last_pebs_record(iregs, regs, data, mask, events, + counts, last, + setup_arch_pebs_sample_data); +} + static void __init intel_arch_pebs_init(void) { /* @@ -2781,6 +2962,7 @@ static void __init intel_arch_pebs_init(void) */ x86_pmu.arch_pebs =3D 1; x86_pmu.pebs_buffer_size =3D PEBS_BUFFER_SIZE; + x86_pmu.drain_pebs =3D intel_pmu_drain_arch_pebs; x86_pmu.pebs_capable =3D ~0ULL; =20 x86_pmu.pebs_enable =3D __intel_pmu_pebs_enable; diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-in= dex.h index f627196eb796..7c306dd0713c 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -324,6 +324,12 @@ PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE | \ PERF_CAP_PEBS_TIMING_INFO) =20 +/* Arch PEBS */ +#define MSR_IA32_PEBS_BASE 0x000003f4 +#define MSR_IA32_PEBS_INDEX 0x000003f5 +#define ARCH_PEBS_OFFSET_MASK 0x7fffff +#define ARCH_PEBS_INDEX_WR_SHIFT 4 + #define MSR_IA32_RTIT_CTL 0x00000570 #define RTIT_CTL_TRACEEN BIT(0) #define RTIT_CTL_CYCLEACC BIT(1) diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_= event.h index 0dfa06722bab..3b3848f0d339 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -437,6 +437,8 @@ static inline bool is_topdown_idx(int idx) #define GLOBAL_STATUS_LBRS_FROZEN BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT) #define GLOBAL_STATUS_TRACE_TOPAPMI_BIT 55 #define GLOBAL_STATUS_TRACE_TOPAPMI BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_B= IT) +#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT 54 +#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD BIT_ULL(GLOBAL_STATUS_ARCH_PEBS_= THRESHOLD_BIT) #define GLOBAL_STATUS_PERF_METRICS_OVF_BIT 48 =20 #define GLOBAL_CTRL_EN_PERF_METRICS BIT_ULL(48) @@ -507,6 +509,100 @@ struct pebs_cntr_header { =20 #define INTEL_CNTR_METRICS 0x3 =20 +/* + * Arch PEBS + */ +union arch_pebs_index { + struct { + u64 rsvd:4, + wr:23, + rsvd2:4, + full:1, + en:1, + rsvd3:3, + thresh:23, + rsvd4:5; + }; + u64 whole; +}; + +struct arch_pebs_header { + union { + u64 format; + struct { + u64 size:16, /* Record size */ + rsvd:14, + mode:1, /* 64BIT_MODE */ + cont:1, + rsvd2:3, + cntr:5, + lbr:2, + rsvd3:7, + xmm:1, + ymmh:1, + rsvd4:2, + opmask:1, + zmmh:1, + h16zmm:1, + rsvd5:5, + gpr:1, + aux:1, + basic:1; + }; + }; + u64 rsvd6; +}; + +struct arch_pebs_basic { + u64 ip; + u64 applicable_counters; + u64 tsc; + u64 retire :16, /* Retire Latency */ + valid :1, + rsvd :47; + u64 rsvd2; + u64 rsvd3; +}; + +struct arch_pebs_aux { + u64 address; + u64 rsvd; + u64 rsvd2; + u64 rsvd3; + u64 rsvd4; + u64 aux; + u64 instr_latency :16, + pad2 :16, + cache_latency :16, + pad3 :16; + u64 tsx_tuning; +}; + +struct arch_pebs_gprs { + u64 flags, ip, ax, cx, dx, bx, sp, bp, si, di; + u64 r8, r9, r10, r11, r12, r13, r14, r15, ssp; + u64 rsvd; +}; + +struct arch_pebs_xer_header { + u64 xstate; + u64 rsvd; +}; + +#define ARCH_PEBS_LBR_NAN 0x0 +#define ARCH_PEBS_LBR_NUM_8 0x1 +#define ARCH_PEBS_LBR_NUM_16 0x2 +#define ARCH_PEBS_LBR_NUM_VAR 0x3 +#define ARCH_PEBS_BASE_LBR_ENTRIES 8 +struct arch_pebs_lbr_header { + u64 rsvd; + u64 ctl; + u64 depth; + u64 ler_from; + u64 ler_to; + u64 ler_info; +}; + /* * AMD Extended Performance Monitoring and Debug cpuid feature detection */ --=20 2.34.1