From nobody Sat Feb 7 22:55:24 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org
Cc: acme@kernel.org, namhyung@kernel.org, irogers@google.com,
	eranian@google.com, ak@linux.intel.com, dapeng1.mi@linux.intel.com,
	Kan Liang, stable@vger.kernel.org
Subject: [PATCH V2 1/4] perf/x86/intel/ds: Unconditionally drain PEBS DS when changing PEBS_DATA_CFG
Date: Tue, 19 Nov 2024 05:55:01 -0800
Message-Id: <20241119135504.1463839-2-kan.liang@linux.intel.com>
In-Reply-To: <20241119135504.1463839-1-kan.liang@linux.intel.com>
References: <20241119135504.1463839-1-kan.liang@linux.intel.com>
Content-Type: text/plain; charset="utf-8"

From: Kan Liang

The PEBS kernel warnings can still be observed when the commands below
run in parallel for a while.

  while true;
  do
	perf record --no-buildid -a --intr-regs=AX \
		    -e cpu/event=0xd0,umask=0x81/pp \
		    -c 10003 -o /dev/null ./triad;
  done &

  while true;
  do
	perf record -e 'cpu/mem-loads,ldlat=3/uP' -W -d -- ./dtlb
  done

The commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
PEBS_DATA_CFG") intends to flush the entire PEBS buffer before the
hardware is reprogrammed. However, it fails in the above case.

The first perf command utilizes the large PEBS, while the second perf
command only utilizes a single PEBS. When the second perf event is
added, only n_pebs is incremented. The intel_pmu_pebs_enable() is
invoked after intel_pmu_pebs_add(). So the
cpuc->n_pebs == cpuc->n_large_pebs check in the
intel_pmu_drain_large_pebs() fails. The PEBS DS is not flushed.
The new PEBS event should not be taken into account when flushing the
existing PEBS DS.

The check is unnecessary here. Before the hardware is reprogrammed, all
the stale records must be drained unconditionally.

For single PEBS or PEBS-via-PT, the DS must be empty. The drain_pebs()
can handle the empty case. There is no harm in unconditionally draining
the PEBS DS.

Fixes: b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG")
Signed-off-by: Kan Liang
Cc: stable@vger.kernel.org
---
 arch/x86/events/intel/ds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 8afc4ad3cd16..1a4b326ca2ce 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1489,7 +1489,7 @@ void intel_pmu_pebs_enable(struct perf_event *event)
 			 * hence we need to drain when changing said
 			 * size.
 			 */
-			intel_pmu_drain_large_pebs(cpuc);
+			intel_pmu_drain_pebs_buffer();
 			adaptive_pebs_record_size_update();
 			wrmsrl(MSR_PEBS_DATA_CFG, pebs_data_cfg);
 			cpuc->active_pebs_data_cfg = pebs_data_cfg;
-- 
2.38.1

From nobody Sat Feb 7 22:55:24 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org
Cc: acme@kernel.org, namhyung@kernel.org, irogers@google.com,
	eranian@google.com, ak@linux.intel.com, dapeng1.mi@linux.intel.com,
	Kan Liang
Subject: [PATCH V2 2/4] perf/x86/intel/ds: Clarify adaptive PEBS processing
Date: Tue, 19 Nov 2024 05:55:02 -0800
Message-Id: <20241119135504.1463839-3-kan.liang@linux.intel.com>
In-Reply-To: <20241119135504.1463839-1-kan.liang@linux.intel.com>
References: <20241119135504.1463839-1-kan.liang@linux.intel.com>
Content-Type: text/plain;
 charset="utf-8"

From: Kan Liang

Modify the pebs_basic and pebs_meminfo structs to make the bitfields
more explicit to ease readability of the code.

Co-developed-by: Stephane Eranian
Signed-off-by: Stephane Eranian
Signed-off-by: Kan Liang
---
 arch/x86/events/intel/ds.c        | 43 ++++++++++++++-----------------
 arch/x86/include/asm/perf_event.h | 16 ++++++++++--
 2 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 1a4b326ca2ce..35926d0d2341 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1917,8 +1917,6 @@ static void adaptive_pebs_save_regs(struct pt_regs *regs,
 }
 
 #define PEBS_LATENCY_MASK			0xffff
-#define PEBS_CACHE_LATENCY_OFFSET		32
-#define PEBS_RETIRE_LATENCY_OFFSET		32
 
 /*
  * With adaptive PEBS the layout depends on what fields are configured.
@@ -1932,8 +1930,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct pebs_basic *basic = __pebs;
 	void *next_record = basic + 1;
-	u64 sample_type;
-	u64 format_size;
+	u64 sample_type, format_group;
 	struct pebs_meminfo *meminfo = NULL;
 	struct pebs_gprs *gprs = NULL;
 	struct x86_perf_regs *perf_regs;
@@ -1945,7 +1942,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 	perf_regs->xmm_regs = NULL;
 
 	sample_type = event->attr.sample_type;
-	format_size = basic->format_size;
+	format_group = basic->format_group;
 	perf_sample_data_init(data, 0, event->hw.last_period);
 	data->period = event->hw.last_period;
 
@@ -1967,7 +1964,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 
 	if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) {
 		if (x86_pmu.flags & PMU_FL_RETIRE_LATENCY)
-			data->weight.var3_w = format_size >> PEBS_RETIRE_LATENCY_OFFSET & PEBS_LATENCY_MASK;
+			data->weight.var3_w = basic->retire_latency;
 		else
 			data->weight.var3_w = 0;
 	}
@@ -1977,12 +1974,12 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 	 * But PERF_SAMPLE_TRANSACTION needs gprs->ax.
 	 * Save the pointer here but process later.
 	 */
-	if (format_size & PEBS_DATACFG_MEMINFO) {
+	if (format_group & PEBS_DATACFG_MEMINFO) {
 		meminfo = next_record;
 		next_record = meminfo + 1;
 	}
 
-	if (format_size & PEBS_DATACFG_GP) {
+	if (format_group & PEBS_DATACFG_GP) {
 		gprs = next_record;
 		next_record = gprs + 1;
 
@@ -1995,14 +1992,13 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 			adaptive_pebs_save_regs(regs, gprs);
 	}
 
-	if (format_size & PEBS_DATACFG_MEMINFO) {
+	if (format_group & PEBS_DATACFG_MEMINFO) {
 		if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
-			u64 weight = meminfo->latency;
+			u64 latency = x86_pmu.flags & PMU_FL_INSTR_LATENCY ?
+					meminfo->cache_latency : meminfo->mem_latency;
 
-			if (x86_pmu.flags & PMU_FL_INSTR_LATENCY) {
-				data->weight.var2_w = weight & PEBS_LATENCY_MASK;
-				weight >>= PEBS_CACHE_LATENCY_OFFSET;
-			}
+			if (x86_pmu.flags & PMU_FL_INSTR_LATENCY)
+				data->weight.var2_w = meminfo->instr_latency;
 
 			/*
 			 * Although meminfo::latency is defined as a u64,
@@ -2010,12 +2006,13 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 			 * in practice on Ice Lake and earlier platforms.
 			 */
 			if (sample_type & PERF_SAMPLE_WEIGHT) {
-				data->weight.full = weight ?:
+				data->weight.full = latency ?:
 					intel_get_tsx_weight(meminfo->tsx_tuning);
 			} else {
-				data->weight.var1_dw = (u32)(weight & PEBS_LATENCY_MASK) ?:
+				data->weight.var1_dw = (u32)latency ?:
 					intel_get_tsx_weight(meminfo->tsx_tuning);
 			}
+
 			data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
 		}
 
@@ -2036,16 +2033,16 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 		}
 	}
 
-	if (format_size & PEBS_DATACFG_XMMS) {
+	if (format_group & PEBS_DATACFG_XMMS) {
 		struct pebs_xmm *xmm = next_record;
 
 		next_record = xmm + 1;
 		perf_regs->xmm_regs = xmm->xmm;
 	}
 
-	if (format_size & PEBS_DATACFG_LBRS) {
+	if (format_group & PEBS_DATACFG_LBRS) {
 		struct lbr_entry *lbr = next_record;
-		int num_lbr = ((format_size >> PEBS_DATACFG_LBR_SHIFT)
+		int num_lbr = ((format_group >> PEBS_DATACFG_LBR_SHIFT)
 					& 0xff) + 1;
 		next_record = next_record + num_lbr * sizeof(struct lbr_entry);
 
@@ -2055,11 +2052,11 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 		}
 	}
 
-	WARN_ONCE(next_record != __pebs + (format_size >> 48),
-			"PEBS record size %llu, expected %llu, config %llx\n",
-			format_size >> 48,
+	WARN_ONCE(next_record != __pebs + basic->format_size,
+			"PEBS record size %u, expected %llu, config %llx\n",
+			basic->format_size,
 			(u64)(next_record - __pebs),
-			basic->format_size);
+			format_group);
 }
 
 static inline void *
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 91b73571412f..cd8023d5ea46 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -422,7 +422,9 @@ static inline bool is_topdown_idx(int idx)
  */
 
 struct pebs_basic {
-	u64 format_size;
+	u64 format_group:32,
+	    retire_latency:16,
+	    format_size:16;
 	u64 ip;
 	u64 applicable_counters;
 	u64 tsc;
@@ -431,7 +433,17 @@ struct pebs_basic {
 struct pebs_meminfo {
 	u64 address;
 	u64 aux;
-	u64 latency;
+	union {
+		/* pre Alder Lake */
+		u64 mem_latency;
+		/* Alder Lake and later */
+		struct {
+			u64 instr_latency:16;
+			u64 pad2:16;
+			u64 cache_latency:16;
+			u64 pad3:16;
+		};
+	};
 	u64 tsx_tuning;
 };
 
-- 
2.38.1

From nobody Sat Feb 7 22:55:24 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org
Cc: acme@kernel.org, namhyung@kernel.org, irogers@google.com,
	eranian@google.com, ak@linux.intel.com, dapeng1.mi@linux.intel.com,
	Kan Liang
Subject: [PATCH V2 3/4] perf/x86/intel/ds: Factor out functions for PEBS records processing
Date: Tue, 19 Nov 2024 05:55:03 -0800
Message-Id: <20241119135504.1463839-4-kan.liang@linux.intel.com>
In-Reply-To: <20241119135504.1463839-1-kan.liang@linux.intel.com>
References: <20241119135504.1463839-1-kan.liang@linux.intel.com>
Content-Type:
 text/plain; charset="utf-8"

From: Kan Liang

Factor out functions to process normal and the last PEBS records, which
can be shared with the later patch.

Move the event updating related code (intel_pmu_save_and_restart())
to the end, where all samples have been processed.
For the current usage, it doesn't matter when perf updates event counts
and resets the counter, because all counters are stopped when the PEBS
buffer is drained.
Drop the return on the !intel_pmu_save_and_restart(event) check, because
that case never happens. The intel_pmu_save_and_restart(event) only
returns 0 when !hwc->event_base or the period_left > 0.
- The !hwc->event_base is impossible for the PEBS event, since the PEBS
  event is only available on GP and fixed counters, which always have
  a valid hwc->event_base.
- The check only happens for the case of non-AUTO_RELOAD and single
  PEBS, which implies that the event must have overflowed. The
  period_left must always be <= 0 for an overflowed event after the
  x86_pmu_update().

Co-developed-by: Peter Zijlstra (Intel)
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Kan Liang
---
 arch/x86/events/intel/ds.c | 109 +++++++++++++++++++++++--------------
 1 file changed, 67 insertions(+), 42 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 35926d0d2341..bf624499f3b4 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2167,46 +2167,33 @@ intel_pmu_save_and_restart_reload(struct perf_event *event, int count)
 	return 0;
 }
 
+typedef void (*setup_fn)(struct perf_event *, struct pt_regs *, void *,
+			 struct perf_sample_data *, struct pt_regs *);
+
+static struct pt_regs dummy_iregs;
+
 static __always_inline void
 __intel_pmu_pebs_event(struct perf_event *event,
 		       struct pt_regs *iregs,
+		       struct pt_regs *regs,
 		       struct perf_sample_data *data,
-		       void *base, void *top,
-		       int bit, int count,
-		       void (*setup_sample)(struct perf_event *,
-					    struct pt_regs *,
-					    void *,
-					    struct perf_sample_data *,
-					    struct pt_regs *))
+		       void *at,
+		       setup_fn setup_sample)
 {
-	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
-	struct hw_perf_event *hwc = &event->hw;
-	struct x86_perf_regs perf_regs;
-	struct pt_regs *regs = &perf_regs.regs;
-	void *at = get_next_pebs_record_by_bit(base, top, bit);
-	static struct pt_regs dummy_iregs;
-
-	if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) {
-		/*
-		 * Now, auto-reload is only enabled in fixed period mode.
-		 * The reload value is always hwc->sample_period.
-		 * May need to change it, if auto-reload is enabled in
-		 * freq mode later.
-		 */
-		intel_pmu_save_and_restart_reload(event, count);
-	} else if (!intel_pmu_save_and_restart(event))
-		return;
-
-	if (!iregs)
-		iregs = &dummy_iregs;
+	setup_sample(event, iregs, at, data, regs);
+	perf_event_output(event, data, regs);
+}
 
-	while (count > 1) {
-		setup_sample(event, iregs, at, data, regs);
-		perf_event_output(event, data, regs);
-		at += cpuc->pebs_record_size;
-		at = get_next_pebs_record_by_bit(at, top, bit);
-		count--;
-	}
+static __always_inline void
+__intel_pmu_pebs_last_event(struct perf_event *event,
+			    struct pt_regs *iregs,
+			    struct pt_regs *regs,
+			    struct perf_sample_data *data,
+			    void *at,
+			    int count,
+			    setup_fn setup_sample)
+{
+	struct hw_perf_event *hwc = &event->hw;
 
 	setup_sample(event, iregs, at, data, regs);
 	if (iregs == &dummy_iregs) {
@@ -2225,6 +2212,44 @@ __intel_pmu_pebs_event(struct perf_event *event,
 		if (perf_event_overflow(event, data, regs))
 			x86_pmu_stop(event, 0);
 	}
+
+	if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) {
+		/*
+		 * Now, auto-reload is only enabled in fixed period mode.
+		 * The reload value is always hwc->sample_period.
+		 * May need to change it, if auto-reload is enabled in
+		 * freq mode later.
+		 */
+		intel_pmu_save_and_restart_reload(event, count);
+	} else
+		intel_pmu_save_and_restart(event);
+}
+
+static __always_inline void
+__intel_pmu_pebs_events(struct perf_event *event,
+			struct pt_regs *iregs,
+			struct perf_sample_data *data,
+			void *base, void *top,
+			int bit, int count,
+			setup_fn setup_sample)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct x86_perf_regs perf_regs;
+	struct pt_regs *regs = &perf_regs.regs;
+	void *at = get_next_pebs_record_by_bit(base, top, bit);
+	int cnt = count;
+
+	if (!iregs)
+		iregs = &dummy_iregs;
+
+	while (cnt > 1) {
+		__intel_pmu_pebs_event(event, iregs, regs, data, at, setup_sample);
+		at += cpuc->pebs_record_size;
+		at = get_next_pebs_record_by_bit(at, top, bit);
+		cnt--;
+	}
+
+	__intel_pmu_pebs_last_event(event, iregs, regs, data, at, count, setup_sample);
 }
 
 static void intel_pmu_drain_pebs_core(struct pt_regs *iregs, struct perf_sample_data *data)
@@ -2261,8 +2286,8 @@ static void intel_pmu_drain_pebs_core(struct pt_regs *iregs, struct perf_sample_
 		return;
 	}
 
-	__intel_pmu_pebs_event(event, iregs, data, at, top, 0, n,
-			       setup_pebs_fixed_sample_data);
+	__intel_pmu_pebs_events(event, iregs, data, at, top, 0, n,
+				setup_pebs_fixed_sample_data);
 }
 
 static void intel_pmu_pebs_event_update_no_drain(struct cpu_hw_events *cpuc, int size)
@@ -2393,9 +2418,9 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 		}
 
 		if (counts[bit]) {
-			__intel_pmu_pebs_event(event, iregs, data, base,
-					       top, bit, counts[bit],
-					       setup_pebs_fixed_sample_data);
+			__intel_pmu_pebs_events(event, iregs, data, base,
+						top, bit, counts[bit],
+						setup_pebs_fixed_sample_data);
 		}
 	}
 }
@@ -2447,9 +2472,9 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 		if (WARN_ON_ONCE(!event->attr.precise_ip))
 			continue;
 
-		__intel_pmu_pebs_event(event, iregs, data, base,
-				       top, bit, counts[bit],
-				       setup_pebs_adaptive_sample_data);
+		__intel_pmu_pebs_events(event, iregs, data, base,
+					top, bit, counts[bit],
+					setup_pebs_adaptive_sample_data);
 	}
 }
 
-- 
2.38.1

From nobody Sat Feb 7 22:55:24 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org
Cc: acme@kernel.org, namhyung@kernel.org, irogers@google.com,
	eranian@google.com, ak@linux.intel.com, dapeng1.mi@linux.intel.com,
	Kan Liang
Subject: [PATCH V2 4/4] perf/x86/intel/ds: Simplify the PEBS records processing for adaptive PEBS
Date: Tue, 19 Nov 2024 05:55:04 -0800
Message-Id: <20241119135504.1463839-5-kan.liang@linux.intel.com>
In-Reply-To: <20241119135504.1463839-1-kan.liang@linux.intel.com>
References: <20241119135504.1463839-1-kan.liang@linux.intel.com>
Content-Type: text/plain; charset="utf-8"

From: Kan Liang

The current code may iterate over all the PEBS records in the DS area
several times. The first loop finds all active events and calculates
the available records for each event. The whole buffer is then iterated
again and again to process the available records until all active
events are processed.

The algorithm is inherited from the old generations. The old PEBS
hardware does not deal well with the situation when events happen near
each other. SW has to drop the erroneous records. Multiple iterations
are required.

The hardware limit has been addressed on newer platforms with adaptive
PEBS. A simple one-iteration algorithm is introduced.

With the patch, the samples are output in record order rather than in
event order. It doesn't impact the post-processing. The perf tool always
sorts the records by time before presenting them to the end user.

In an NMI, the last record has to be specially handled. Add a last[]
variable to track the last unprocessed record of each event.

Test:

11 PEBS events are used in the perf test. Only the basic information is
collected.
  perf record -e instructions:up,...,instructions:up -c 2000003 benchmark

The ftrace is used to record the duration of the
intel_pmu_drain_pebs_icl().

The average duration is reduced from 62.04us to 57.94us.

A small improvement can be observed with the new algorithm.
Also, the implementation becomes simpler and more straightforward.
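
The one-iteration idea described above can be sketched outside the
kernel roughly as follows. This is only an illustration under stated
assumptions, not the code in this patch: struct record, struct
drain_result, NUM_COUNTERS and drain_one_pass() are hypothetical names,
and the "output" is reduced to counting. It shows why a per-counter
last[] pointer is enough: a remembered record is emitted as a normal
sample only once a newer record for the same counter shows up, and
whatever is still remembered at the end is the last record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_COUNTERS 8	/* hypothetical size, not the kernel's */

/* Hypothetical stand-in for a PEBS record: only the counter bitmask. */
struct record {
	uint64_t applicable_counters;
};

/* Hypothetical result: what a real drain would hand to the output paths. */
struct drain_result {
	int normal[NUM_COUNTERS];	/* records emitted as ordinary samples */
	int last_index[NUM_COUNTERS];	/* index of the record kept for last, or -1 */
};

static void drain_one_pass(const struct record *recs, size_t n,
			   uint64_t enabled, struct drain_result *res)
{
	const struct record *last[NUM_COUNTERS] = { NULL };
	int counts[NUM_COUNTERS] = { 0 };
	size_t i;
	int bit;

	for (bit = 0; bit < NUM_COUNTERS; bit++) {
		res->normal[bit] = 0;
		res->last_index[bit] = -1;
	}

	/* Single walk over the buffer, in record order. */
	for (i = 0; i < n; i++) {
		uint64_t status = recs[i].applicable_counters & enabled;

		for (bit = 0; bit < NUM_COUNTERS; bit++) {
			if (!(status & (UINT64_C(1) << bit)))
				continue;
			/*
			 * A newer record arrived, so the remembered one is
			 * not the last: emit it as an ordinary sample.
			 */
			if (counts[bit]++)
				res->normal[bit]++;
			last[bit] = &recs[i];
		}
	}

	/* Whatever is still remembered is the last record of each counter. */
	for (bit = 0; bit < NUM_COUNTERS; bit++) {
		if (counts[bit])
			res->last_index[bit] = (int)(last[bit] - recs);
	}
}
```

For four records with counter masks 0x1, 0x3, 0x2, 0x1 and counters 0
and 1 enabled, counter 0 gets two normal samples and keeps record 3 for
last, while counter 1 gets one normal sample and keeps record 2.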
Suggested-by: Stephane Eranian
Reviewed-by: Dapeng Mi
Signed-off-by: Kan Liang
---
 arch/x86/events/intel/ds.c | 43 +++++++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index bf624499f3b4..dc9fe45df297 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2428,8 +2428,12 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_data *data)
 {
 	short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
+	void *last[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS];
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct debug_store *ds = cpuc->ds;
+	struct x86_perf_regs perf_regs;
+	struct pt_regs *regs = &perf_regs.regs;
+	struct pebs_basic *basic;
 	struct perf_event *event;
 	void *base, *at, *top;
 	int bit;
@@ -2451,30 +2455,41 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 		return;
 	}
 
-	for (at = base; at < top; at += cpuc->pebs_record_size) {
+	if (!iregs)
+		iregs = &dummy_iregs;
+
+	/* Process all but the last event for each counter. */
+	for (at = base; at < top; at += basic->format_size) {
 		u64 pebs_status;
 
-		pebs_status = get_pebs_status(at) & cpuc->pebs_enabled;
-		pebs_status &= mask;
+		basic = at;
+		if (basic->format_size != cpuc->pebs_record_size)
+			continue;
+
+		pebs_status = basic->applicable_counters & cpuc->pebs_enabled & mask;
+		for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX) {
+			event = cpuc->events[bit];
 
-		for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX)
-			counts[bit]++;
+			if (WARN_ON_ONCE(!event) ||
+			    WARN_ON_ONCE(!event->attr.precise_ip))
+				continue;
+
+			if (counts[bit]++) {
+				__intel_pmu_pebs_event(event, iregs, regs, data, last[bit],
+						       setup_pebs_adaptive_sample_data);
+			}
+			last[bit] = at;
+		}
 	}
 
 	for_each_set_bit(bit, (unsigned long *)&mask, X86_PMC_IDX_MAX) {
-		if (counts[bit] == 0)
+		if (!counts[bit])
 			continue;
 
 		event = cpuc->events[bit];
-		if (WARN_ON_ONCE(!event))
-			continue;
-
-		if (WARN_ON_ONCE(!event->attr.precise_ip))
-			continue;
 
-		__intel_pmu_pebs_events(event, iregs, data, base,
-					top, bit, counts[bit],
-					setup_pebs_adaptive_sample_data);
+		__intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
+					    counts[bit], setup_pebs_adaptive_sample_data);
 	}
 }
 
-- 
2.38.1
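
The loop in this patch reads basic->format_size directly, which relies
on the bitfields introduced in patch 2/4. As a rough cross-check of that
layout, the sketch below decodes the same 64-bit words with explicit
shifts and masks, mirroring the open-coded expressions that patch 2/4
removes (">> 48", ">> 32 & 0xffff", "& 0xffff"). The helper names are
hypothetical and not part of the kernel. The sketch also assumes, as the
kernel does for x86, that the compiler allocates bitfields starting
from the least significant bit.

```c
#include <assert.h>
#include <stdint.h>

/* First 64-bit word of the basic group (struct pebs_basic). */
static inline uint32_t hdr_format_group(uint64_t w)
{
	return (uint32_t)(w & 0xffffffffu);	/* bits 0..31 */
}

static inline uint16_t hdr_retire_latency(uint64_t w)
{
	return (uint16_t)((w >> 32) & 0xffff);	/* bits 32..47 */
}

static inline uint16_t hdr_format_size(uint64_t w)
{
	return (uint16_t)(w >> 48);		/* bits 48..63 */
}

/* Latency word of struct pebs_meminfo, Alder Lake and later layout. */
static inline uint16_t lat_instr(uint64_t w)
{
	return (uint16_t)(w & 0xffff);		/* bits 0..15 */
}

static inline uint16_t lat_cache(uint64_t w)
{
	return (uint16_t)((w >> 32) & 0xffff);	/* bits 32..47 */
}
```

On pre Alder Lake parts the whole latency word is a single value
(mem_latency in the patch), so no split is applied there.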