From nobody Fri Dec 19 07:31:48 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 62C3F1E287F for ; Fri, 2 Aug 2024 15:16:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611762; cv=none; b=auCg7ubFDXXmh951maaLUDyFWbT6hQKEUrafqxh7PikE5ACNC2yHjRdYdef4ScHszhEfJ3VLXmn4J4fnFhiJcQ2fxhwJxUZB0i5iJ78Xu5gAJmmaZt3/2ciUAAPOQwR7HG6vKp5FXQpNJuF20MyIhPYcBxNzxXcIa1y8Zyiz1hg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611762; c=relaxed/simple; bh=mGTuL2K61dIMThg7S4YuzF8bs4faqy5BjuNQiiLUyOs=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=mJU+/6szPqD84jz7Dguy3oI1vuPXdcqNYJy8u6UNT0ifJh5ZYyj9EjIxtMLhxPhKSL3zi6NP2MaEj0HJPqvJfstY3oSz24skMRnd3pPCpD9Kvc8RTTTDMC791JTOgs+oqR1zCRRIrE1P9XnEW0jgWU5Sr2qPt4GXW1YFLoveiWY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=LE3Vl+Gn; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="LE3Vl+Gn" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1722611761; x=1754147761; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mGTuL2K61dIMThg7S4YuzF8bs4faqy5BjuNQiiLUyOs=; 
b=LE3Vl+Gn19cNK/c1Wy7BTIhkMW9vR/Tl7a4hVBndP63K0VLHSWy18Neu 5Uth9AD/RPo5lpd69BXEjMwul9CtvkPr16H0feIPWLc1lBbTHJjGOd5jB wZEgp0KYpoKMjwZ36kJtEu+1klpT9b16oN0hU4ty5AJZSHyV20qGxqw9T 88BxoGozuOlIYouM5RvDm3ogKnnn/Mjdxqb6zzi8uc1GCu23kI87n4iop JjXD1aDFppyUAiyP8VOqNE7DIAdZBwfBfkZl9s+qb2X30Msi2fRk2ucGx ga/X8D+4r1Y9PmQ0Dd3qu1w2uyutknIif/pdMoABeKQrYoOF79BAQde9H A==; X-CSE-ConnectionGUID: b6RuXe6nQk6Ylv4DMfmXHA== X-CSE-MsgGUID: N26qlOykRfCpI9Qxlw4ZuA== X-IronPort-AV: E=McAfee;i="6700,10204,11152"; a="20473757" X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="20473757" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Aug 2024 08:15:59 -0700 X-CSE-ConnectionGUID: g6z1q1fmTtuODGljGYlXxw== X-CSE-MsgGUID: bX92xgvKTDGPP7XcfSNrtw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="55516929" Received: from kanliang-dev.jf.intel.com ([10.165.154.102]) by fmviesa010.fm.intel.com with ESMTP; 02 Aug 2024 08:15:58 -0700 From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, irogers@google.com, linux-kernel@vger.kernel.org Cc: Kan Liang Subject: [PATCH 1/7] perf: Generic hotplug support for a PMU with a scope Date: Fri, 2 Aug 2024 08:16:37 -0700 Message-Id: <20240802151643.1691631-2-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20240802151643.1691631-1-kan.liang@linux.intel.com> References: <20240802151643.1691631-1-kan.liang@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kan Liang The perf subsystem assumes that the counters of a PMU are per-CPU. So the user space tool reads a counter from each CPU in the system wide mode. However, many PMUs don't have a per-CPU counter. 
The counter is effective for a scope, e.g., a die or a socket. To address this, the kernel driver exposes a cpumask that restricts events to one CPU, which stands for a specific scope. In case the given CPU is removed, hotplug support has to be implemented for each such driver.

The code to support the cpumask and hotplug is very similar across drivers:
- Expose a cpumask in sysfs.
- Pick up another CPU in the same scope if the given CPU is removed.
- Invoke perf_pmu_migrate_context() to migrate to the new CPU.
- In event init, always set event->cpu to the CPU in the cpumask.

Similar code is duplicated in each such PMU driver. It would be good to introduce a generic infrastructure to avoid such duplication.

Five popular scopes are implemented here: core, die, cluster, pkg, and system-wide. The scope can be set when a PMU is registered. If so, a "cpumask" is automatically exposed for the PMU.

The "cpumask" is taken from the perf_online__mask, which tracks the active CPU for each scope. Each mask is set when the first CPU of the scope comes online via the generic perf hotplug support. When a corresponding CPU is removed, the perf_online__mask is updated accordingly and the PMU is moved to a new CPU from the same scope if possible.
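The hotplug bookkeeping described above can be sketched in userspace C. This is a simplified model under stated assumptions, not the kernel implementation: a 64-bit bitmask stands in for a cpumask, the topology is a made-up system of two dies with four CPUs each, and every name (toy_state, toy_cpu_online, toy_cpu_offline, ...) is hypothetical.

```c
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Hypothetical model: one "readers" mask holds one designated CPU per scope. */
struct toy_state {
	uint64_t online;	/* CPUs currently online */
	uint64_t readers;	/* designated CPU of each scope */
};

/* Assumed topology: CPUs 0-3 share die 0, CPUs 4-7 share die 1. */
static uint64_t toy_scope_cpus(int cpu)
{
	uint64_t mask = 0;

	for (int i = 0; i < TOY_NR_CPUS; i++)
		if (i / 4 == cpu / 4)
			mask |= 1ULL << i;
	return mask;
}

/* Lowest CPU set in the mask, or -1 if the mask is empty. */
static int toy_first_cpu(uint64_t mask)
{
	for (int i = 0; i < TOY_NR_CPUS; i++)
		if (mask & (1ULL << i))
			return i;
	return -1;
}

/* The first CPU of a scope to come online becomes its designated CPU. */
static void toy_cpu_online(struct toy_state *s, int cpu)
{
	s->online |= 1ULL << cpu;
	if (!(s->readers & toy_scope_cpus(cpu)))
		s->readers |= 1ULL << cpu;
}

/*
 * If the departing CPU was the designated one, hand the role to another
 * online CPU of the same scope. Returns the migration target, or -1 when
 * nothing has to be migrated or no CPU is left in the scope.
 */
static int toy_cpu_offline(struct toy_state *s, int cpu)
{
	uint64_t bit = 1ULL << cpu;
	int target;

	s->online &= ~bit;
	if (!(s->readers & bit))
		return -1;

	s->readers &= ~bit;
	target = toy_first_cpu(s->online & toy_scope_cpus(cpu));
	if (target >= 0)
		s->readers |= 1ULL << target;
	return target;
}
```

In this model, a non-negative return value from toy_cpu_offline() marks the point where a real driver would migrate its events to the target CPU.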
Signed-off-by: Kan Liang --- include/linux/perf_event.h | 18 ++++ kernel/events/core.c | 164 ++++++++++++++++++++++++++++++++++++- 2 files changed, 180 insertions(+), 2 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 1a8942277dda..1102d5c2be70 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -292,6 +292,19 @@ struct perf_event_pmu_context; #define PERF_PMU_CAP_AUX_OUTPUT 0x0080 #define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0100 =20 +/** + * pmu::scope + */ +enum perf_pmu_scope { + PERF_PMU_SCOPE_NONE =3D 0, + PERF_PMU_SCOPE_CORE, + PERF_PMU_SCOPE_DIE, + PERF_PMU_SCOPE_CLUSTER, + PERF_PMU_SCOPE_PKG, + PERF_PMU_SCOPE_SYS_WIDE, + PERF_PMU_MAX_SCOPE, +}; + struct perf_output_handle; =20 #define PMU_NULL_DEV ((void *)(~0UL)) @@ -315,6 +328,11 @@ struct pmu { */ int capabilities; =20 + /* + * PMU scope + */ + unsigned int scope; + int __percpu *pmu_disable_count; struct perf_cpu_pmu_context __percpu *cpu_pmu_context; atomic_t exclusive_cnt; /* < 0: cpu; > 0: tsk */ diff --git a/kernel/events/core.c b/kernel/events/core.c index aa3450bdc227..5e1877c4cb4c 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -407,6 +407,11 @@ static LIST_HEAD(pmus); static DEFINE_MUTEX(pmus_lock); static struct srcu_struct pmus_srcu; static cpumask_var_t perf_online_mask; +static cpumask_var_t perf_online_core_mask; +static cpumask_var_t perf_online_die_mask; +static cpumask_var_t perf_online_cluster_mask; +static cpumask_var_t perf_online_pkg_mask; +static cpumask_var_t perf_online_sys_mask; static struct kmem_cache *perf_event_cache; =20 /* @@ -11477,10 +11482,60 @@ perf_event_mux_interval_ms_store(struct device *d= ev, } static DEVICE_ATTR_RW(perf_event_mux_interval_ms); =20 +static inline const struct cpumask *perf_scope_cpu_topology_cpumask(unsign= ed int scope, int cpu) +{ + switch (scope) { + case PERF_PMU_SCOPE_CORE: + return topology_sibling_cpumask(cpu); + case PERF_PMU_SCOPE_DIE: + return 
topology_die_cpumask(cpu); + case PERF_PMU_SCOPE_CLUSTER: + return topology_cluster_cpumask(cpu); + case PERF_PMU_SCOPE_PKG: + return topology_core_cpumask(cpu); + case PERF_PMU_SCOPE_SYS_WIDE: + return cpu_online_mask; + } + + return NULL; +} + +static inline struct cpumask *perf_scope_cpumask(unsigned int scope) +{ + switch (scope) { + case PERF_PMU_SCOPE_CORE: + return perf_online_core_mask; + case PERF_PMU_SCOPE_DIE: + return perf_online_die_mask; + case PERF_PMU_SCOPE_CLUSTER: + return perf_online_cluster_mask; + case PERF_PMU_SCOPE_PKG: + return perf_online_pkg_mask; + case PERF_PMU_SCOPE_SYS_WIDE: + return perf_online_sys_mask; + } + + return NULL; +} + +static ssize_t cpumask_show(struct device *dev, struct device_attribute *a= ttr, + char *buf) +{ + struct pmu *pmu =3D dev_get_drvdata(dev); + struct cpumask *mask =3D perf_scope_cpumask(pmu->scope); + + if (mask) + return cpumap_print_to_pagebuf(true, buf, mask); + return 0; +} + +static DEVICE_ATTR_RO(cpumask); + static struct attribute *pmu_dev_attrs[] =3D { &dev_attr_type.attr, &dev_attr_perf_event_mux_interval_ms.attr, &dev_attr_nr_addr_filters.attr, + &dev_attr_cpumask.attr, NULL, }; =20 @@ -11492,6 +11547,10 @@ static umode_t pmu_dev_is_visible(struct kobject *= kobj, struct attribute *a, int if (n =3D=3D 2 && !pmu->nr_addr_filters) return 0; =20 + /* cpumask */ + if (n =3D=3D 3 && pmu->scope =3D=3D PERF_PMU_SCOPE_NONE) + return 0; + return a->mode; } =20 @@ -11576,6 +11635,11 @@ int perf_pmu_register(struct pmu *pmu, const char = *name, int type) goto free_pdc; } =20 + if (WARN_ONCE(pmu->scope >=3D PERF_PMU_MAX_SCOPE, "Can not register a pmu= with an invalid scope.\n")) { + ret =3D -EINVAL; + goto free_pdc; + } + pmu->name =3D name; =20 if (type >=3D 0) @@ -11730,6 +11794,22 @@ static int perf_try_init_event(struct pmu *pmu, st= ruct perf_event *event) event_has_any_exclude_flag(event)) ret =3D -EINVAL; =20 + if (pmu->scope !=3D PERF_PMU_SCOPE_NONE && event->cpu >=3D 0) { + const struct cpumask 
*cpumask =3D perf_scope_cpu_topology_cpumask(pmu->= scope, event->cpu); + struct cpumask *pmu_cpumask =3D perf_scope_cpumask(pmu->scope); + int cpu; + + if (pmu_cpumask && cpumask) { + cpu =3D cpumask_any_and(pmu_cpumask, cpumask); + if (cpu >=3D nr_cpu_ids) + ret =3D -ENODEV; + else + event->cpu =3D cpu; + } else { + ret =3D -ENODEV; + } + } + if (ret && event->destroy) event->destroy(event); } @@ -13681,6 +13761,12 @@ static void __init perf_event_init_all_cpus(void) int cpu; =20 zalloc_cpumask_var(&perf_online_mask, GFP_KERNEL); + zalloc_cpumask_var(&perf_online_core_mask, GFP_KERNEL); + zalloc_cpumask_var(&perf_online_die_mask, GFP_KERNEL); + zalloc_cpumask_var(&perf_online_cluster_mask, GFP_KERNEL); + zalloc_cpumask_var(&perf_online_pkg_mask, GFP_KERNEL); + zalloc_cpumask_var(&perf_online_sys_mask, GFP_KERNEL); + =20 for_each_possible_cpu(cpu) { swhash =3D &per_cpu(swevent_htable, cpu); @@ -13730,6 +13816,40 @@ static void __perf_event_exit_context(void *__info) raw_spin_unlock(&ctx->lock); } =20 +static void perf_event_clear_cpumask(unsigned int cpu) +{ + int target[PERF_PMU_MAX_SCOPE]; + unsigned int scope; + struct pmu *pmu; + + cpumask_clear_cpu(cpu, perf_online_mask); + + for (scope =3D PERF_PMU_SCOPE_NONE + 1; scope < PERF_PMU_MAX_SCOPE; scope= ++) { + const struct cpumask *cpumask =3D perf_scope_cpu_topology_cpumask(scope,= cpu); + struct cpumask *pmu_cpumask =3D perf_scope_cpumask(scope); + + target[scope] =3D -1; + if (WARN_ON_ONCE(!pmu_cpumask || !cpumask)) + continue; + + if (!cpumask_test_and_clear_cpu(cpu, pmu_cpumask)) + continue; + target[scope] =3D cpumask_any_but(cpumask, cpu); + if (target[scope] < nr_cpu_ids) + cpumask_set_cpu(target[scope], pmu_cpumask); + } + + /* migrate */ + list_for_each_entry_rcu(pmu, &pmus, entry, lockdep_is_held(&pmus_srcu)) { + if (pmu->scope =3D=3D PERF_PMU_SCOPE_NONE || + WARN_ON_ONCE(pmu->scope >=3D PERF_PMU_MAX_SCOPE)) + continue; + + if (target[pmu->scope] >=3D 0 && target[pmu->scope] < nr_cpu_ids) + 
perf_pmu_migrate_context(pmu, cpu, target[pmu->scope]); + } +} + static void perf_event_exit_cpu_context(int cpu) { struct perf_cpu_context *cpuctx; @@ -13737,6 +13857,11 @@ static void perf_event_exit_cpu_context(int cpu) =20 // XXX simplify cpuctx->online mutex_lock(&pmus_lock); + /* + * Clear the cpumasks, and migrate to other CPUs if possible. + * Must be invoked before the __perf_event_exit_context. + */ + perf_event_clear_cpumask(cpu); cpuctx =3D per_cpu_ptr(&perf_cpu_context, cpu); ctx =3D &cpuctx->ctx; =20 @@ -13744,7 +13869,6 @@ static void perf_event_exit_cpu_context(int cpu) smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1); cpuctx->online =3D 0; mutex_unlock(&ctx->mutex); - cpumask_clear_cpu(cpu, perf_online_mask); mutex_unlock(&pmus_lock); } #else @@ -13753,6 +13877,42 @@ static void perf_event_exit_cpu_context(int cpu) {= } =20 #endif =20 +static void perf_event_setup_cpumask(unsigned int cpu) +{ + struct cpumask *pmu_cpumask; + unsigned int scope; + + cpumask_set_cpu(cpu, perf_online_mask); + + /* + * Early boot stage, the cpumask hasn't been set yet. + * The perf_online__masks includes the first CPU of each domain. + * Always unconditionally set the boot CPU for the perf_online__m= asks. 
+ */ + if (!topology_sibling_cpumask(cpu)) { + for (scope =3D PERF_PMU_SCOPE_NONE + 1; scope < PERF_PMU_MAX_SCOPE; scop= e++) { + pmu_cpumask =3D perf_scope_cpumask(scope); + if (WARN_ON_ONCE(!pmu_cpumask)) + continue; + cpumask_set_cpu(cpu, pmu_cpumask); + } + return; + } + + for (scope =3D PERF_PMU_SCOPE_NONE + 1; scope < PERF_PMU_MAX_SCOPE; scope= ++) { + const struct cpumask *cpumask =3D perf_scope_cpu_topology_cpumask(scope,= cpu); + + pmu_cpumask =3D perf_scope_cpumask(scope); + + if (WARN_ON_ONCE(!pmu_cpumask || !cpumask)) + continue; + + if (!cpumask_empty(cpumask) && + cpumask_any_and(pmu_cpumask, cpumask) >=3D nr_cpu_ids) + cpumask_set_cpu(cpu, pmu_cpumask); + } +} + int perf_event_init_cpu(unsigned int cpu) { struct perf_cpu_context *cpuctx; @@ -13761,7 +13921,7 @@ int perf_event_init_cpu(unsigned int cpu) perf_swevent_init_cpu(cpu); =20 mutex_lock(&pmus_lock); - cpumask_set_cpu(cpu, perf_online_mask); + perf_event_setup_cpumask(cpu); cpuctx =3D per_cpu_ptr(&perf_cpu_context, cpu); ctx =3D &cpuctx->ctx; =20 --=20 2.38.1 From nobody Fri Dec 19 07:31:48 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4DF682101A4 for ; Fri, 2 Aug 2024 15:16:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611763; cv=none; b=R9mkkus3ojCh2uzRTOU3vl4qOnd2xq5rt7s+fbKaaXXXnINgGxOjWYT710+BF9UVRP/rrM6zD6JRxXFEzVngsWtcvlpRT6MDrEGZzuQZbpPXJCYOy9BJQKA2WYG8tJ+qOeq/UKpod7U0AUr1VKkIOjB+1IPNa6Nk6bHMO1PmCD8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611763; c=relaxed/simple; bh=onkK7eGKtLJBosXsYvUv4xlTMHPEHaRONgXoxX6LRTw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=dfGZufXHSOy1tmAJbFV+0pvfUQ/N4RQ2MddWNokHx6D7A5Kn7jDMnUOZC7bdmyVGL67RhqmcDc5OQ/28b8vrfyCSXja6HD6r+BsTObSUajbFPBzP6smg7uLEzxAxuaNFhk5BmnPdSQ/ATHJeLhiEQEyEzThv3iagOjEiZTvb3xU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=nakr+78l; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="nakr+78l" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1722611762; x=1754147762; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=onkK7eGKtLJBosXsYvUv4xlTMHPEHaRONgXoxX6LRTw=; b=nakr+78lwJCHM0cOWvGolr3Jrkppi3C0Z9lrL6f8fDaEY+kaDCNsvDCE Zy8hkxNttp+zUpExBSZTK+AkP9AJ4Q6syUTlcrWFDO8A9EYijAfH4xTD+ uMoOWvEceG9RCdiL976JOOl6lqW+QLh67ldfZDdl3SgMtAYUi7bl2SoqQ M2kaFddAZlz5zXObqrUTA+4n5v7Virozzs/D3sjFWqswYPZ5ALep0wB1t V6AlvJZ3C7Tv7eJX12IY+BuYUB9VzeKjD434Wb6bhtFcq+E7L/F7pSHWi t0aIGbJ/I4m81pOtOqZZPKtNFADf0AVmsOoPEUr+cBzBASSaAn0yYMhRv A==; X-CSE-ConnectionGUID: XLwjHKXpSwGvGc4Abu1y6A== X-CSE-MsgGUID: K3mjAHv1RV22kwyfg5iuOw== X-IronPort-AV: E=McAfee;i="6700,10204,11152"; a="20473762" X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="20473762" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Aug 2024 08:15:59 -0700 X-CSE-ConnectionGUID: tV7qxRiOQIixsbqeXItJxw== X-CSE-MsgGUID: uIEbc+T/TFqLPNLW/K05Dg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="55516931" Received: from 
kanliang-dev.jf.intel.com ([10.165.154.102]) by fmviesa010.fm.intel.com with ESMTP; 02 Aug 2024 08:15:58 -0700 From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, irogers@google.com, linux-kernel@vger.kernel.org Cc: Kan Liang Subject: [PATCH 2/7] perf: Add PERF_EV_CAP_READ_SCOPE Date: Fri, 2 Aug 2024 08:16:38 -0700 Message-Id: <20240802151643.1691631-3-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20240802151643.1691631-1-kan.liang@linux.intel.com> References: <20240802151643.1691631-1-kan.liang@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kan Liang Usually, an event can be read from any CPU of the scope. It doesn't need to be read from the advertised CPU. Add a new event cap, PERF_EV_CAP_READ_SCOPE. An event of a PMU with scope can be read from any active CPU in the scope. Signed-off-by: Kan Liang --- include/linux/perf_event.h | 3 +++ kernel/events/core.c | 14 +++++++++++--- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 1102d5c2be70..1206bc86eb4f 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -633,10 +633,13 @@ typedef void (*perf_overflow_handler_t)(struct perf_e= vent *, * PERF_EV_CAP_SIBLING: An event with this flag must be a group sibling and * cannot be a group leader. If an event with this flag is detached from t= he * group it is scheduled out and moved into an unrecoverable ERROR state. + * PERF_EV_CAP_READ_SCOPE: A CPU event that can be read from any CPU of the + * PMU scope where it is active. 
*/ #define PERF_EV_CAP_SOFTWARE BIT(0) #define PERF_EV_CAP_READ_ACTIVE_PKG BIT(1) #define PERF_EV_CAP_SIBLING BIT(2) +#define PERF_EV_CAP_READ_SCOPE BIT(3) =20 #define SWEVENT_HLIST_BITS 8 #define SWEVENT_HLIST_SIZE (1 << SWEVENT_HLIST_BITS) diff --git a/kernel/events/core.c b/kernel/events/core.c index 5e1877c4cb4c..c55294f34575 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -4463,16 +4463,24 @@ struct perf_read_data { int ret; }; =20 +static inline const struct cpumask *perf_scope_cpu_topology_cpumask(unsign= ed int scope, int cpu); + static int __perf_event_read_cpu(struct perf_event *event, int event_cpu) { + int local_cpu =3D smp_processor_id(); u16 local_pkg, event_pkg; =20 if ((unsigned)event_cpu >=3D nr_cpu_ids) return event_cpu; =20 - if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) { - int local_cpu =3D smp_processor_id(); + if (event->group_caps & PERF_EV_CAP_READ_SCOPE) { + const struct cpumask *cpumask =3D perf_scope_cpu_topology_cpumask(event-= >pmu->scope, event_cpu); + + if (cpumask && cpumask_test_cpu(local_cpu, cpumask)) + return local_cpu; + } =20 + if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) { event_pkg =3D topology_physical_package_id(event_cpu); local_pkg =3D topology_physical_package_id(local_cpu); =20 @@ -11804,7 +11812,7 @@ static int perf_try_init_event(struct pmu *pmu, str= uct perf_event *event) if (cpu >=3D nr_cpu_ids) ret =3D -ENODEV; else - event->cpu =3D cpu; + event->event_caps |=3D PERF_EV_CAP_READ_SCOPE; } else { ret =3D -ENODEV; } --=20 2.38.1 From nobody Fri Dec 19 07:31:48 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C50D721C161 for ; Fri, 2 Aug 2024 15:16:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; 
s=arc-20240116; t=1722611763; cv=none; b=MNsCVi2rMEadNZuS7OApPfLiLe//DwmQwOD+V6mmNqwgagmIQEEmZSo1cdbecGa373HAhV29Svt+yl/IZ4UUSe090nQjqt19fHwFI6BPk4/uZZWD0YqTfuZsgPLvavarUFgrYFIE+9ECeL10ifx8DRFtcJCVZl/rtCWxFn+kcMA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611763; c=relaxed/simple; bh=+ZXXHiNZZA1BghCtWfCFfCi1+sxQltAZQN3WjRZyB5I=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=XZpfpla+Xtfn67mYdbKCpf/XqZ8PdKoouDfCIb0JgX7nxyJDmjyoUZPBcGXZ2fHpxmRecDeGFEDr/gs2U+9Ct+HvC8GWVZVdMeCgUpqc6y7GWeeTHdCDflekvZRpgEGMRgs0EiaCb6Ivkn4smzj+qaH+vE141xZcrCX+5iqWXWg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Ap42uIZQ; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Ap42uIZQ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1722611762; x=1754147762; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+ZXXHiNZZA1BghCtWfCFfCi1+sxQltAZQN3WjRZyB5I=; b=Ap42uIZQ26wq/TjLdYcy3y56noPiRrMJRCUs0gpNeuB0KhN3SnKiFoJ8 MVg3rP+tr4YKnceTu8abPhQpTu+CrnPF8+djlYG3JCB43lPKykt/3lmB8 YAz3dklBx27gGnb/+mY+ld6qhCec+1DHp11izTkNT4VXmYmipnoFDGkaZ BymXAglvJIWymxhjrzXQabdSrmj58CKEScvB7YYkGi6rPqbYnkIkQOx3E Oq+rDG+E0gQqLpNfcoffvRKMOxeSHewMGYmQ2D2793EQJ2e3Y8tNt/rQu UaRD82MLoB5yhH7Q+ePxA1e7RDFbgP7n2oGj6VIhPJYt2bbYzi8u7li2T Q==; X-CSE-ConnectionGUID: gVp1phrGSw6OwUFDt4pWrA== X-CSE-MsgGUID: CfkKzYA8RVSsF3x90l/SVA== X-IronPort-AV: E=McAfee;i="6700,10204,11152"; 
a="20473767" X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="20473767" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Aug 2024 08:15:59 -0700 X-CSE-ConnectionGUID: WrbMoaqNQoiSplDd5WCrlg== X-CSE-MsgGUID: PrwkPJI9Rs2eINuiBRNsaA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="55516932" Received: from kanliang-dev.jf.intel.com ([10.165.154.102]) by fmviesa010.fm.intel.com with ESMTP; 02 Aug 2024 08:15:58 -0700 From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, irogers@google.com, linux-kernel@vger.kernel.org Cc: Kan Liang Subject: [PATCH 3/7] perf/x86/intel/cstate: Clean up cpumask and hotplug Date: Fri, 2 Aug 2024 08:16:39 -0700 Message-Id: <20240802151643.1691631-4-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20240802151643.1691631-1-kan.liang@linux.intel.com> References: <20240802151643.1691631-1-kan.liang@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kan Liang There are three cstate PMUs with different scopes: core, die, and module. The generic perf_event subsystem now supports these scopes. Set the scope for each PMU and remove all the cpumask and hotplug code. 
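As a rough illustration of "set the scope for each PMU" (a toy model, not the driver code), the sketch below tags each PMU with a scope and re-tags the package PMU as die scope on a multi-die system, loosely mirroring the topology_max_dies_per_package() > 1 check in this patch. The toy_scope enum, the toy_pmu struct, the "toy_*" names, and toy_fixup_pkg_scope() are all hypothetical.

```c
/* Hypothetical stand-ins for enum perf_pmu_scope and struct pmu. */
enum toy_scope {
	TOY_SCOPE_NONE = 0,
	TOY_SCOPE_CORE,
	TOY_SCOPE_DIE,
	TOY_SCOPE_CLUSTER,
	TOY_SCOPE_PKG,
};

struct toy_pmu {
	const char *name;
	enum toy_scope scope;
};

/* Each PMU only declares its scope; it keeps no cpumask of its own. */
static struct toy_pmu toy_core_pmu   = { "toy_core",   TOY_SCOPE_CORE };
static struct toy_pmu toy_pkg_pmu    = { "toy_pkg",    TOY_SCOPE_PKG };
static struct toy_pmu toy_module_pmu = { "toy_module", TOY_SCOPE_CLUSTER };

/*
 * On a multi-die package the counters are assumed to be per die, so the
 * package PMU is re-tagged before registration. Returns the name the PMU
 * would be registered under in this toy model.
 */
static const char *toy_fixup_pkg_scope(struct toy_pmu *pmu,
				       int max_dies_per_package)
{
	if (max_dies_per_package > 1) {
		pmu->scope = TOY_SCOPE_DIE;
		return "toy_die";
	}
	return pmu->name;
}
```

The design point is that the scope is plain data on the PMU, so the generic layer can derive the cpumask and the hotplug handling from it.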
Signed-off-by: Kan Liang --- arch/x86/events/intel/cstate.c | 142 ++------------------------------- include/linux/cpuhotplug.h | 2 - 2 files changed, 5 insertions(+), 139 deletions(-) diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c index be58cfb012dd..13d229f2cdda 100644 --- a/arch/x86/events/intel/cstate.c +++ b/arch/x86/events/intel/cstate.c @@ -128,10 +128,6 @@ static ssize_t __cstate_##_var##_show(struct device *d= ev, \ static struct device_attribute format_attr_##_var =3D \ __ATTR(_name, 0444, __cstate_##_var##_show, NULL) =20 -static ssize_t cstate_get_attr_cpumask(struct device *dev, - struct device_attribute *attr, - char *buf); - /* Model -> events mapping */ struct cstate_model { unsigned long core_events; @@ -206,22 +202,9 @@ static struct attribute_group cstate_format_attr_group= =3D { .attrs =3D cstate_format_attrs, }; =20 -static cpumask_t cstate_core_cpu_mask; -static DEVICE_ATTR(cpumask, S_IRUGO, cstate_get_attr_cpumask, NULL); - -static struct attribute *cstate_cpumask_attrs[] =3D { - &dev_attr_cpumask.attr, - NULL, -}; - -static struct attribute_group cpumask_attr_group =3D { - .attrs =3D cstate_cpumask_attrs, -}; - static const struct attribute_group *cstate_attr_groups[] =3D { &cstate_events_attr_group, &cstate_format_attr_group, - &cpumask_attr_group, NULL, }; =20 @@ -269,8 +252,6 @@ static struct perf_msr pkg_msr[] =3D { [PERF_CSTATE_PKG_C10_RES] =3D { MSR_PKG_C10_RESIDENCY, &group_cstate_pkg_= c10, test_msr }, }; =20 -static cpumask_t cstate_pkg_cpu_mask; - /* cstate_module PMU */ static struct pmu cstate_module_pmu; static bool has_cstate_module; @@ -291,28 +272,9 @@ static struct perf_msr module_msr[] =3D { [PERF_CSTATE_MODULE_C6_RES] =3D { MSR_MODULE_C6_RES_MS, &group_cstate_mo= dule_c6, test_msr }, }; =20 -static cpumask_t cstate_module_cpu_mask; - -static ssize_t cstate_get_attr_cpumask(struct device *dev, - struct device_attribute *attr, - char *buf) -{ - struct pmu *pmu =3D dev_get_drvdata(dev); - - if 
(pmu =3D=3D &cstate_core_pmu) - return cpumap_print_to_pagebuf(true, buf, &cstate_core_cpu_mask); - else if (pmu =3D=3D &cstate_pkg_pmu) - return cpumap_print_to_pagebuf(true, buf, &cstate_pkg_cpu_mask); - else if (pmu =3D=3D &cstate_module_pmu) - return cpumap_print_to_pagebuf(true, buf, &cstate_module_cpu_mask); - else - return 0; -} - static int cstate_pmu_event_init(struct perf_event *event) { u64 cfg =3D event->attr.config; - int cpu; =20 if (event->attr.type !=3D event->pmu->type) return -ENOENT; @@ -331,20 +293,13 @@ static int cstate_pmu_event_init(struct perf_event *e= vent) if (!(core_msr_mask & (1 << cfg))) return -EINVAL; event->hw.event_base =3D core_msr[cfg].msr; - cpu =3D cpumask_any_and(&cstate_core_cpu_mask, - topology_sibling_cpumask(event->cpu)); } else if (event->pmu =3D=3D &cstate_pkg_pmu) { if (cfg >=3D PERF_CSTATE_PKG_EVENT_MAX) return -EINVAL; cfg =3D array_index_nospec((unsigned long)cfg, PERF_CSTATE_PKG_EVENT_MAX= ); if (!(pkg_msr_mask & (1 << cfg))) return -EINVAL; - - event->event_caps |=3D PERF_EV_CAP_READ_ACTIVE_PKG; - event->hw.event_base =3D pkg_msr[cfg].msr; - cpu =3D cpumask_any_and(&cstate_pkg_cpu_mask, - topology_die_cpumask(event->cpu)); } else if (event->pmu =3D=3D &cstate_module_pmu) { if (cfg >=3D PERF_CSTATE_MODULE_EVENT_MAX) return -EINVAL; @@ -352,16 +307,10 @@ static int cstate_pmu_event_init(struct perf_event *e= vent) if (!(module_msr_mask & (1 << cfg))) return -EINVAL; event->hw.event_base =3D module_msr[cfg].msr; - cpu =3D cpumask_any_and(&cstate_module_cpu_mask, - topology_cluster_cpumask(event->cpu)); } else { return -ENOENT; } =20 - if (cpu >=3D nr_cpu_ids) - return -ENODEV; - - event->cpu =3D cpu; event->hw.config =3D cfg; event->hw.idx =3D -1; return 0; @@ -412,84 +361,6 @@ static int cstate_pmu_event_add(struct perf_event *eve= nt, int mode) return 0; } =20 -/* - * Check if exiting cpu is the designated reader. 
If so migrate the - * events when there is a valid target available - */ -static int cstate_cpu_exit(unsigned int cpu) -{ - unsigned int target; - - if (has_cstate_core && - cpumask_test_and_clear_cpu(cpu, &cstate_core_cpu_mask)) { - - target =3D cpumask_any_but(topology_sibling_cpumask(cpu), cpu); - /* Migrate events if there is a valid target */ - if (target < nr_cpu_ids) { - cpumask_set_cpu(target, &cstate_core_cpu_mask); - perf_pmu_migrate_context(&cstate_core_pmu, cpu, target); - } - } - - if (has_cstate_pkg && - cpumask_test_and_clear_cpu(cpu, &cstate_pkg_cpu_mask)) { - - target =3D cpumask_any_but(topology_die_cpumask(cpu), cpu); - /* Migrate events if there is a valid target */ - if (target < nr_cpu_ids) { - cpumask_set_cpu(target, &cstate_pkg_cpu_mask); - perf_pmu_migrate_context(&cstate_pkg_pmu, cpu, target); - } - } - - if (has_cstate_module && - cpumask_test_and_clear_cpu(cpu, &cstate_module_cpu_mask)) { - - target =3D cpumask_any_but(topology_cluster_cpumask(cpu), cpu); - /* Migrate events if there is a valid target */ - if (target < nr_cpu_ids) { - cpumask_set_cpu(target, &cstate_module_cpu_mask); - perf_pmu_migrate_context(&cstate_module_pmu, cpu, target); - } - } - return 0; -} - -static int cstate_cpu_init(unsigned int cpu) -{ - unsigned int target; - - /* - * If this is the first online thread of that core, set it in - * the core cpu mask as the designated reader. - */ - target =3D cpumask_any_and(&cstate_core_cpu_mask, - topology_sibling_cpumask(cpu)); - - if (has_cstate_core && target >=3D nr_cpu_ids) - cpumask_set_cpu(cpu, &cstate_core_cpu_mask); - - /* - * If this is the first online thread of that package, set it - * in the package cpu mask as the designated reader. 
- */ - target =3D cpumask_any_and(&cstate_pkg_cpu_mask, - topology_die_cpumask(cpu)); - if (has_cstate_pkg && target >=3D nr_cpu_ids) - cpumask_set_cpu(cpu, &cstate_pkg_cpu_mask); - - /* - * If this is the first online thread of that cluster, set it - * in the cluster cpu mask as the designated reader. - */ - target =3D cpumask_any_and(&cstate_module_cpu_mask, - topology_cluster_cpumask(cpu)); - if (has_cstate_module && target >=3D nr_cpu_ids) - cpumask_set_cpu(cpu, &cstate_module_cpu_mask); - - return 0; -} - static const struct attribute_group *core_attr_update[] =3D { &group_cstate_core_c1, &group_cstate_core_c3, @@ -526,6 +397,7 @@ static struct pmu cstate_core_pmu =3D { .stop =3D cstate_pmu_event_stop, .read =3D cstate_pmu_event_update, .capabilities =3D PERF_PMU_CAP_NO_INTERRUPT | PERF_PMU_CAP_NO_EXCLUDE, + .scope =3D PERF_PMU_SCOPE_CORE, .module =3D THIS_MODULE, }; =20 @@ -541,6 +413,7 @@ static struct pmu cstate_pkg_pmu =3D { .stop =3D cstate_pmu_event_stop, .read =3D cstate_pmu_event_update, .capabilities =3D PERF_PMU_CAP_NO_INTERRUPT | PERF_PMU_CAP_NO_EXCLUDE, + .scope =3D PERF_PMU_SCOPE_PKG, .module =3D THIS_MODULE, }; =20 @@ -556,6 +429,7 @@ static struct pmu cstate_module_pmu =3D { .stop =3D cstate_pmu_event_stop, .read =3D cstate_pmu_event_update, .capabilities =3D PERF_PMU_CAP_NO_INTERRUPT | PERF_PMU_CAP_NO_EXCLUDE, + .scope =3D PERF_PMU_SCOPE_CLUSTER, .module =3D THIS_MODULE, }; =20 @@ -809,9 +683,6 @@ static int __init cstate_probe(const struct cstate_mode= l *cm) =20 static inline void cstate_cleanup(void) { - cpuhp_remove_state_nocalls(CPUHP_AP_PERF_X86_CSTATE_ONLINE); - cpuhp_remove_state_nocalls(CPUHP_AP_PERF_X86_CSTATE_STARTING); - if (has_cstate_core) perf_pmu_unregister(&cstate_core_pmu); =20 @@ -826,11 +697,6 @@ static int __init cstate_init(void) { int err; =20 - cpuhp_setup_state(CPUHP_AP_PERF_X86_CSTATE_STARTING, - "perf/x86/cstate:starting", cstate_cpu_init, NULL); - cpuhp_setup_state(CPUHP_AP_PERF_X86_CSTATE_ONLINE, - 
"perf/x86/cstate:online", NULL, cstate_cpu_exit); - if (has_cstate_core) { err =3D perf_pmu_register(&cstate_core_pmu, cstate_core_pmu.name, -1); if (err) { @@ -843,6 +709,8 @@ static int __init cstate_init(void) =20 if (has_cstate_pkg) { if (topology_max_dies_per_package() > 1) { + /* CLX-AP is multi-die and the cstate is die-scope */ + cstate_pkg_pmu.scope =3D PERF_PMU_SCOPE_DIE; err =3D perf_pmu_register(&cstate_pkg_pmu, "cstate_die", -1); } else { diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 51ba681b915a..9ea6290ade56 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -152,7 +152,6 @@ enum cpuhp_state { CPUHP_AP_PERF_X86_AMD_UNCORE_STARTING, CPUHP_AP_PERF_X86_STARTING, CPUHP_AP_PERF_X86_AMD_IBS_STARTING, - CPUHP_AP_PERF_X86_CSTATE_STARTING, CPUHP_AP_PERF_XTENSA_STARTING, CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_DEBUG_MONITORS_STARTING, @@ -209,7 +208,6 @@ enum cpuhp_state { CPUHP_AP_PERF_X86_AMD_UNCORE_ONLINE, CPUHP_AP_PERF_X86_AMD_POWER_ONLINE, CPUHP_AP_PERF_X86_RAPL_ONLINE, - CPUHP_AP_PERF_X86_CSTATE_ONLINE, CPUHP_AP_PERF_S390_CF_ONLINE, CPUHP_AP_PERF_S390_SF_ONLINE, CPUHP_AP_PERF_ARM_CCI_ONLINE, --=20 2.38.1 From nobody Fri Dec 19 07:31:48 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 586D021C171 for ; Fri, 2 Aug 2024 15:16:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611763; cv=none; b=B/BZpn9/B+wBWWf3OM5w35YhYjvgYUOccTNtxmwKz4LV09kEF82XRPvxIsYwzJ3J5Voi0C4VhPwdkvTc8pZ9iufmjB1ZK0w3nXIxtQ4PioWh5fjyOnbdJDIWhvLoioGMpJsjxDx9tqFvvnh1Y3zoOGp5ArIborMlrPXbeo611nk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611763; c=relaxed/simple; 
bh=X/DZHGNEnpppzw6W66fG/5qG6ja3qRNvChuXNJE/48w=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=XRlU053M41M4lauqzf9VP4iNtTDqPW6ItIcDgaXGAz4/X93S4J1vm8gjGf60bBj2TuCsxQ1CmW6pGgZ9LEnxg2ZdW9XEXKU7eeJFRsnSxv3L2oMD/C234jsWR+mwquoAWPh1mZMZNqumR2IYogl388vjqfk3EBXXoZXaKZe9hec= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=OCqVVoaP; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="OCqVVoaP" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1722611763; x=1754147763; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=X/DZHGNEnpppzw6W66fG/5qG6ja3qRNvChuXNJE/48w=; b=OCqVVoaP7LPt8Nj90EH9pSKpsRCANA+JXhyFbhbCJo3XqZWg0US5lODt /TUv4LcijZMeDyX9dvxIqJm786d4rQ2Gu9CBcuNXfSHEvsDoLPcRHR451 ONt3QiRZCb3Lfg4KHD5WZyN/Jj7IBoJWq/9t1CRJnesmCZqjz0j31s6Ls 3oYH3SJLTaUpiVNGy9lqjTRk7PHiIr27afZ/3zeVYwAnyhRzW33TGzOAq 8WWXEXQvl3RLJiu1sRvOa59b+IKCdk26bVVcX9aZqQ+tefkG+fdi8UbCt /fYr1LUVH6dW2x/HECwNsFpl/6AVodkPk44icel3C3dQc7Ai52eaWYyEA w==; X-CSE-ConnectionGUID: RYRk/fM+Qt+cLVeAaed5iA== X-CSE-MsgGUID: zdUrDHe8Rm6T06izzIEfdw== X-IronPort-AV: E=McAfee;i="6700,10204,11152"; a="20473772" X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="20473772" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Aug 2024 08:16:00 -0700 X-CSE-ConnectionGUID: m3YEcdw7RruD1VS/A3Y19g== X-CSE-MsgGUID: 
f2Ic8290SdynwlOlesaHDw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="55516934" Received: from kanliang-dev.jf.intel.com ([10.165.154.102]) by fmviesa010.fm.intel.com with ESMTP; 02 Aug 2024 08:15:59 -0700 From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, irogers@google.com, linux-kernel@vger.kernel.org Cc: Kan Liang , Lu Baolu , David Woodhouse , Joerg Roedel , Will Deacon , iommu@lists.linux.dev Subject: [PATCH 4/7] iommu/vt-d: Clean up cpumask and hotplug for perfmon Date: Fri, 2 Aug 2024 08:16:40 -0700 Message-Id: <20240802151643.1691631-5-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20240802151643.1691631-1-kan.liang@linux.intel.com> References: <20240802151643.1691631-1-kan.liang@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kan Liang The iommu PMU has system-wide scope, which the generic perf_event subsystem now supports. Set the scope for the iommu PMU and remove all the cpumask and hotplug code.
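For readers unfamiliar with the "designated reader" bookkeeping that a system-wide PMU needs, the following is a minimal user-space sketch of the idea. It is loosely modeled on the per-driver online/offline handlers removed by this patch, not on the generic perf core code. All names here (struct sys_wide_reader, model_cpu_online(), model_cpu_offline(), NO_CPU, MAX_CPUS) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NO_CPU   (-1)
#define MAX_CPUS 64

/* Hypothetical model: one designated reader CPU for a system-wide PMU. */
struct sys_wide_reader {
	uint64_t online;	/* bit n set: CPU n is online */
	int reader;		/* designated CPU, or NO_CPU */
};

static void reader_init(struct sys_wide_reader *r)
{
	r->online = 0;
	r->reader = NO_CPU;
}

/* The first CPU to come online becomes the designated reader. */
static void model_cpu_online(struct sys_wide_reader *r, int cpu)
{
	assert(cpu >= 0 && cpu < MAX_CPUS);
	r->online |= UINT64_C(1) << cpu;
	if (r->reader == NO_CPU)
		r->reader = cpu;
}

/*
 * If the departing CPU was the reader, hand the role to any CPU that
 * is still online. Returns the new reader when the role moved (the
 * point where a real driver would migrate its events), else NO_CPU.
 */
static int model_cpu_offline(struct sys_wide_reader *r, int cpu)
{
	assert(cpu >= 0 && cpu < MAX_CPUS);
	r->online &= ~(UINT64_C(1) << cpu);
	if (r->reader != cpu)
		return NO_CPU;

	r->reader = NO_CPU;
	for (int i = 0; i < MAX_CPUS; i++) {
		if (r->online & (UINT64_C(1) << i)) {
			r->reader = i;
			break;
		}
	}
	return r->reader;
}
```

Declaring the scope on the PMU lets this kind of bookkeeping live in one place instead of being repeated in each driver, which is what makes the removal below possible.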
Reviewed-by: Lu Baolu Signed-off-by: Kan Liang Cc: David Woodhouse Cc: Joerg Roedel Cc: Will Deacon Cc: iommu@lists.linux.dev --- drivers/iommu/intel/iommu.h | 2 - drivers/iommu/intel/perfmon.c | 111 +--------------------------------- 2 files changed, 2 insertions(+), 111 deletions(-) diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index b67c14da1240..bd2c5a4ca11a 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -687,8 +687,6 @@ struct iommu_pmu { DECLARE_BITMAP(used_mask, IOMMU_PMU_IDX_MAX); struct perf_event *event_list[IOMMU_PMU_IDX_MAX]; unsigned char irq_name[16]; - struct hlist_node cpuhp_node; - int cpu; }; =20 #define IOMMU_IRQ_ID_OFFSET_PRQ (DMAR_UNITS_SUPPORTED) diff --git a/drivers/iommu/intel/perfmon.c b/drivers/iommu/intel/perfmon.c index 44083d01852d..75f493bcb353 100644 --- a/drivers/iommu/intel/perfmon.c +++ b/drivers/iommu/intel/perfmon.c @@ -34,28 +34,9 @@ static struct attribute_group iommu_pmu_events_attr_grou= p =3D { .attrs =3D attrs_empty, }; =20 -static cpumask_t iommu_pmu_cpu_mask; - -static ssize_t -cpumask_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - return cpumap_print_to_pagebuf(true, buf, &iommu_pmu_cpu_mask); -} -static DEVICE_ATTR_RO(cpumask); - -static struct attribute *iommu_pmu_cpumask_attrs[] =3D { - &dev_attr_cpumask.attr, - NULL -}; - -static struct attribute_group iommu_pmu_cpumask_attr_group =3D { - .attrs =3D iommu_pmu_cpumask_attrs, -}; - static const struct attribute_group *iommu_pmu_attr_groups[] =3D { &iommu_pmu_format_attr_group, &iommu_pmu_events_attr_group, - &iommu_pmu_cpumask_attr_group, NULL }; =20 @@ -565,6 +546,7 @@ static int __iommu_pmu_register(struct intel_iommu *iom= mu) iommu_pmu->pmu.attr_groups =3D iommu_pmu_attr_groups; iommu_pmu->pmu.attr_update =3D iommu_pmu_attr_update; iommu_pmu->pmu.capabilities =3D PERF_PMU_CAP_NO_EXCLUDE; + iommu_pmu->pmu.scope =3D PERF_PMU_SCOPE_SYS_WIDE; iommu_pmu->pmu.module =3D THIS_MODULE; =20 
return perf_pmu_register(&iommu_pmu->pmu, iommu_pmu->pmu.name, -1); @@ -773,89 +755,6 @@ static void iommu_pmu_unset_interrupt(struct intel_iom= mu *iommu) iommu->perf_irq =3D 0; } =20 -static int iommu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node) -{ - struct iommu_pmu *iommu_pmu =3D hlist_entry_safe(node, typeof(*iommu_pmu)= , cpuhp_node); - - if (cpumask_empty(&iommu_pmu_cpu_mask)) - cpumask_set_cpu(cpu, &iommu_pmu_cpu_mask); - - if (cpumask_test_cpu(cpu, &iommu_pmu_cpu_mask)) - iommu_pmu->cpu =3D cpu; - - return 0; -} - -static int iommu_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node) -{ - struct iommu_pmu *iommu_pmu =3D hlist_entry_safe(node, typeof(*iommu_pmu)= , cpuhp_node); - int target =3D cpumask_first(&iommu_pmu_cpu_mask); - - /* - * The iommu_pmu_cpu_mask has been updated when offline the CPU - * for the first iommu_pmu. Migrate the other iommu_pmu to the - * new target. - */ - if (target < nr_cpu_ids && target !=3D iommu_pmu->cpu) { - perf_pmu_migrate_context(&iommu_pmu->pmu, cpu, target); - iommu_pmu->cpu =3D target; - return 0; - } - - if (!cpumask_test_and_clear_cpu(cpu, &iommu_pmu_cpu_mask)) - return 0; - - target =3D cpumask_any_but(cpu_online_mask, cpu); - - if (target < nr_cpu_ids) - cpumask_set_cpu(target, &iommu_pmu_cpu_mask); - else - return 0; - - perf_pmu_migrate_context(&iommu_pmu->pmu, cpu, target); - iommu_pmu->cpu =3D target; - - return 0; -} - -static int nr_iommu_pmu; -static enum cpuhp_state iommu_cpuhp_slot; - -static int iommu_pmu_cpuhp_setup(struct iommu_pmu *iommu_pmu) -{ - int ret; - - if (!nr_iommu_pmu) { - ret =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, - "driver/iommu/intel/perfmon:online", - iommu_pmu_cpu_online, - iommu_pmu_cpu_offline); - if (ret < 0) - return ret; - iommu_cpuhp_slot =3D ret; - } - - ret =3D cpuhp_state_add_instance(iommu_cpuhp_slot, &iommu_pmu->cpuhp_node= ); - if (ret) { - if (!nr_iommu_pmu) - cpuhp_remove_multi_state(iommu_cpuhp_slot); - return ret; - } - nr_iommu_pmu++; - - 
return 0; -} - -static void iommu_pmu_cpuhp_free(struct iommu_pmu *iommu_pmu) -{ - cpuhp_state_remove_instance(iommu_cpuhp_slot, &iommu_pmu->cpuhp_node); - - if (--nr_iommu_pmu) - return; - - cpuhp_remove_multi_state(iommu_cpuhp_slot); -} - void iommu_pmu_register(struct intel_iommu *iommu) { struct iommu_pmu *iommu_pmu =3D iommu->pmu; @@ -866,17 +765,12 @@ void iommu_pmu_register(struct intel_iommu *iommu) if (__iommu_pmu_register(iommu)) goto err; =20 - if (iommu_pmu_cpuhp_setup(iommu_pmu)) - goto unregister; - /* Set interrupt for overflow */ if (iommu_pmu_set_interrupt(iommu)) - goto cpuhp_free; + goto unregister; =20 return; =20 -cpuhp_free: - iommu_pmu_cpuhp_free(iommu_pmu); unregister: perf_pmu_unregister(&iommu_pmu->pmu); err: @@ -892,6 +786,5 @@ void iommu_pmu_unregister(struct intel_iommu *iommu) return; =20 iommu_pmu_unset_interrupt(iommu); - iommu_pmu_cpuhp_free(iommu_pmu); perf_pmu_unregister(&iommu_pmu->pmu); } --=20 2.38.1 From nobody Fri Dec 19 07:31:48 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19F7721C17C; Fri, 2 Aug 2024 15:16:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611765; cv=none; b=VWKX2azBFaTWVZtuAtHqGfkwHPbCeSTT71E1LnovZfPmEimVJQ00gy0FDbGSwNNWyQlbBulJnTtOJR729rvs0SxJZvCpHD7xyZY4BFQvinfYZlp1fpRwGM17OizdUk0qZ8/dBuKAAMMmG0IICjcACvAV82t+0MkhPr9jL0lNvJ4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611765; c=relaxed/simple; bh=FUMrZD0Layse35uINcArM7Zo+s0aEYJ6DbXc7XPmI28=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=NNCXrfv0WZOOgaautpVGAPv33nK7fcMDBK1qoV/qcE8DUUO5hkAYcKb84TOdwjZukABWTtHNviHtJd3c7TVUJdV92+VeinGS9TCfI+2MpE0q899gDVqhSb42FKxgtzkJyu03TWLVdC4mRQpHB0shwKemIerCCkl+NFVfCIvuQDM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=fWWQctBk; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="fWWQctBk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1722611764; x=1754147764; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FUMrZD0Layse35uINcArM7Zo+s0aEYJ6DbXc7XPmI28=; b=fWWQctBkKu3c8vJShlxLHAJv3VzVPV2JCzo7iQxJW7+XDPDSo+d3ZCLO hEVGXKES4Y0ubUfe1yxe97OZzEp/bMA6kVXPEcl2dZ3eEXVd7V1FJOBZp wihVPkwi7rLXK1RoIHIfvS6UcgppWE2Romkqi+VrY+jLSPd8pgMbphk+6 8L/brxKYey9pJX+1BjICEzIYPmFIykc2ka3DlklJqP6FNlR23GN4OwQlv pWTWLJC1xCwTibfNJ0sVk69Qrmp8r7AyMVETPk/i1f9F+0LcASabE2M44 KN1haZhICgDUplkoXKkryizPllxSTEX0D2Nw0Em2ZqPy9Cdo902IPrlmA g==; X-CSE-ConnectionGUID: FhVOzcTbStWZ7Dj/0tptgg== X-CSE-MsgGUID: KbRJiZ8nSouP0hxHvCALjQ== X-IronPort-AV: E=McAfee;i="6700,10204,11152"; a="20473780" X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="20473780" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Aug 2024 08:16:00 -0700 X-CSE-ConnectionGUID: Jbg7JvPXR16YmXrB3efu+A== X-CSE-MsgGUID: +PMYS0sMSauaIxM+O1t2cw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="55516939" Received: from 
kanliang-dev.jf.intel.com ([10.165.154.102]) by fmviesa010.fm.intel.com with ESMTP; 02 Aug 2024 08:15:59 -0700 From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, irogers@google.com, linux-kernel@vger.kernel.org Cc: Kan Liang , Fenghua Yu , Dave Jiang , Vinod Koul , dmaengine@vger.kernel.org Subject: [PATCH 5/7] dmaengine: idxd: Clean up cpumask and hotplug for perfmon Date: Fri, 2 Aug 2024 08:16:41 -0700 Message-Id: <20240802151643.1691631-6-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20240802151643.1691631-1-kan.liang@linux.intel.com> References: <20240802151643.1691631-1-kan.liang@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kan Liang The idxd PMU has system-wide scope, which the generic perf_event subsystem now supports. Set the scope for the idxd PMU and remove all the cpumask and hotplug code.
Signed-off-by: Kan Liang Cc: Fenghua Yu Cc: Dave Jiang Cc: Vinod Koul Cc: dmaengine@vger.kernel.org Reviewed-by: Dave Jiang Reviewed-by: Fenghua Yu --- drivers/dma/idxd/idxd.h | 7 --- drivers/dma/idxd/init.c | 3 -- drivers/dma/idxd/perfmon.c | 98 +------------------------------------- 3 files changed, 1 insertion(+), 107 deletions(-) diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h index 868b724a3b75..d84e21daa991 100644 --- a/drivers/dma/idxd/idxd.h +++ b/drivers/dma/idxd/idxd.h @@ -124,7 +124,6 @@ struct idxd_pmu { =20 struct pmu pmu; char name[IDXD_NAME_SIZE]; - int cpu; =20 int n_counters; int counter_width; @@ -135,8 +134,6 @@ struct idxd_pmu { =20 unsigned long supported_filters; int n_filters; - - struct hlist_node cpuhp_node; }; =20 #define IDXD_MAX_PRIORITY 0xf @@ -803,14 +800,10 @@ void idxd_user_counter_increment(struct idxd_wq *wq, = u32 pasid, int index); int perfmon_pmu_init(struct idxd_device *idxd); void perfmon_pmu_remove(struct idxd_device *idxd); void perfmon_counter_overflow(struct idxd_device *idxd); -void perfmon_init(void); -void perfmon_exit(void); #else static inline int perfmon_pmu_init(struct idxd_device *idxd) { return 0; } static inline void perfmon_pmu_remove(struct idxd_device *idxd) {} static inline void perfmon_counter_overflow(struct idxd_device *idxd) {} -static inline void perfmon_init(void) {} -static inline void perfmon_exit(void) {} #endif =20 /* debugfs */ diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c index 21f6905b554d..5725ea82c409 100644 --- a/drivers/dma/idxd/init.c +++ b/drivers/dma/idxd/init.c @@ -878,8 +878,6 @@ static int __init idxd_init_module(void) else support_enqcmd =3D true; =20 - perfmon_init(); - err =3D idxd_driver_register(&idxd_drv); if (err < 0) goto err_idxd_driver_register; @@ -928,7 +926,6 @@ static void __exit idxd_exit_module(void) idxd_driver_unregister(&idxd_drv); pci_unregister_driver(&idxd_pci_driver); idxd_cdev_remove(); - perfmon_exit(); idxd_remove_debugfs(); } 
module_exit(idxd_exit_module); diff --git a/drivers/dma/idxd/perfmon.c b/drivers/dma/idxd/perfmon.c index 5e94247e1ea7..f511cf15845b 100644 --- a/drivers/dma/idxd/perfmon.c +++ b/drivers/dma/idxd/perfmon.c @@ -6,29 +6,6 @@ #include "idxd.h" #include "perfmon.h" =20 -static ssize_t cpumask_show(struct device *dev, struct device_attribute *a= ttr, - char *buf); - -static cpumask_t perfmon_dsa_cpu_mask; -static bool cpuhp_set_up; -static enum cpuhp_state cpuhp_slot; - -/* - * perf userspace reads this attribute to determine which cpus to open - * counters on. It's connected to perfmon_dsa_cpu_mask, which is - * maintained by the cpu hotplug handlers. - */ -static DEVICE_ATTR_RO(cpumask); - -static struct attribute *perfmon_cpumask_attrs[] =3D { - &dev_attr_cpumask.attr, - NULL, -}; - -static struct attribute_group cpumask_attr_group =3D { - .attrs =3D perfmon_cpumask_attrs, -}; - /* * These attributes specify the bits in the config word that the perf * syscall uses to pass the event ids and categories to perfmon. 
@@ -67,16 +44,9 @@ static struct attribute_group perfmon_format_attr_group = =3D { =20 static const struct attribute_group *perfmon_attr_groups[] =3D { &perfmon_format_attr_group, - &cpumask_attr_group, NULL, }; =20 -static ssize_t cpumask_show(struct device *dev, struct device_attribute *a= ttr, - char *buf) -{ - return cpumap_print_to_pagebuf(true, buf, &perfmon_dsa_cpu_mask); -} - static bool is_idxd_event(struct idxd_pmu *idxd_pmu, struct perf_event *ev= ent) { return &idxd_pmu->pmu =3D=3D event->pmu; @@ -217,7 +187,6 @@ static int perfmon_pmu_event_init(struct perf_event *ev= ent) return -EINVAL; =20 event->hw.event_base =3D ioread64(PERFMON_TABLE_OFFSET(idxd)); - event->cpu =3D idxd->idxd_pmu->cpu; event->hw.config =3D event->attr.config; =20 if (event->group_leader !=3D event) @@ -488,6 +457,7 @@ static void idxd_pmu_init(struct idxd_pmu *idxd_pmu) idxd_pmu->pmu.stop =3D perfmon_pmu_event_stop; idxd_pmu->pmu.read =3D perfmon_pmu_event_update; idxd_pmu->pmu.capabilities =3D PERF_PMU_CAP_NO_EXCLUDE; + idxd_pmu->pmu.scope =3D PERF_PMU_SCOPE_SYS_WIDE; idxd_pmu->pmu.module =3D THIS_MODULE; } =20 @@ -496,59 +466,17 @@ void perfmon_pmu_remove(struct idxd_device *idxd) if (!idxd->idxd_pmu) return; =20 - cpuhp_state_remove_instance(cpuhp_slot, &idxd->idxd_pmu->cpuhp_node); perf_pmu_unregister(&idxd->idxd_pmu->pmu); kfree(idxd->idxd_pmu); idxd->idxd_pmu =3D NULL; } =20 -static int perf_event_cpu_online(unsigned int cpu, struct hlist_node *node) -{ - struct idxd_pmu *idxd_pmu; - - idxd_pmu =3D hlist_entry_safe(node, typeof(*idxd_pmu), cpuhp_node); - - /* select the first online CPU as the designated reader */ - if (cpumask_empty(&perfmon_dsa_cpu_mask)) { - cpumask_set_cpu(cpu, &perfmon_dsa_cpu_mask); - idxd_pmu->cpu =3D cpu; - } - - return 0; -} - -static int perf_event_cpu_offline(unsigned int cpu, struct hlist_node *nod= e) -{ - struct idxd_pmu *idxd_pmu; - unsigned int target; - - idxd_pmu =3D hlist_entry_safe(node, typeof(*idxd_pmu), cpuhp_node); - - if 
(!cpumask_test_and_clear_cpu(cpu, &perfmon_dsa_cpu_mask)) - return 0; - - target =3D cpumask_any_but(cpu_online_mask, cpu); - /* migrate events if there is a valid target */ - if (target < nr_cpu_ids) { - cpumask_set_cpu(target, &perfmon_dsa_cpu_mask); - perf_pmu_migrate_context(&idxd_pmu->pmu, cpu, target); - } - - return 0; -} - int perfmon_pmu_init(struct idxd_device *idxd) { union idxd_perfcap perfcap; struct idxd_pmu *idxd_pmu; int rc =3D -ENODEV; =20 - /* - * perfmon module initialization failed, nothing to do - */ - if (!cpuhp_set_up) - return -ENODEV; - /* * If perfmon_offset or num_counters is 0, it means perfmon is * not supported on this hardware. @@ -624,11 +552,6 @@ int perfmon_pmu_init(struct idxd_device *idxd) if (rc) goto free; =20 - rc =3D cpuhp_state_add_instance(cpuhp_slot, &idxd_pmu->cpuhp_node); - if (rc) { - perf_pmu_unregister(&idxd->idxd_pmu->pmu); - goto free; - } out: return rc; free: @@ -637,22 +560,3 @@ int perfmon_pmu_init(struct idxd_device *idxd) =20 goto out; } - -void __init perfmon_init(void) -{ - int rc =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, - "driver/dma/idxd/perf:online", - perf_event_cpu_online, - perf_event_cpu_offline); - if (WARN_ON(rc < 0)) - return; - - cpuhp_slot =3D rc; - cpuhp_set_up =3D true; -} - -void __exit perfmon_exit(void) -{ - if (cpuhp_set_up) - cpuhp_remove_multi_state(cpuhp_slot); -} --=20 2.38.1 From nobody Fri Dec 19 07:31:48 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9060221C189 for ; Fri, 2 Aug 2024 15:16:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611765; cv=none; 
b=Nui7bj7aLN8JGzilEFnRLCjzrGQz1IT/a9NK0QaYMmRb1a6lTtABIGHmBAx3N1wJ1Z86lMyy7g54fxnHLHEoQJZeDYC6ee/5dAQdYhrID2XgfT0SeWYmLHy1j28LXCUH9Re9HTG6ni7Z4W20faF4LxHJmXUc6o1SEIaFjN7O9VA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611765; c=relaxed/simple; bh=p+4Lv+dh60cpYo7aE7ItjBxEiUBNsC63EBfUh4dALaQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=bRmOJ2VCvqXhko2oAuUbO7jh1Br4U9dPpmpZB83dS/xHF7AJyD87/C47ka8rCPIYozw5CzJQ2/Qe+oRrQa1ygz9Yicf5YDvpo5aXdoFdLuFrAno0zNZGTWw6xyqXXMoL25bRsH9gEnkJdEi3IDQ+IX9j+ZMaXXbd5JP6nVPuUr0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=b+u4TLE/; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="b+u4TLE/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1722611764; x=1754147764; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=p+4Lv+dh60cpYo7aE7ItjBxEiUBNsC63EBfUh4dALaQ=; b=b+u4TLE/TZuJ9nU4v8mHMhjPOPqJoDnSYD7yhhQsAZqVWpMNszyJHJWg Cx+xMGmFX4sXC3jsQQzIfi4QAjx1CUPkLqsosxraoXntvyRtH+IMWM9xw Csbjo+R3ZxCMKQ3O6mR+GIw1JDtxPd3F2VMS2Q6WawNOJ+RHZoiv5vcyK JQdb9wzXe/CXmDwa8j3QUysXxrhUg5YskTHHV3ZQzqXKh2A6gRv5SUUar lM8ywlBAJhJ6OqFMeDN/RoI86rnVWN7rXLoFgXwofHys0ETO2oBUkE48E s3t+T7WVRtV62yza4pSfTqPlAs3mYkZWSI6j5/eAZZg2YiIijerNnpsKy w==; X-CSE-ConnectionGUID: Ztjvnc7bSAG72Zem/7QZEA== X-CSE-MsgGUID: SSt7KGZNT8ilVFeWzOEdyg== X-IronPort-AV: E=McAfee;i="6700,10204,11152"; a="20473785" X-IronPort-AV: 
E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="20473785" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Aug 2024 08:16:00 -0700 X-CSE-ConnectionGUID: wky1+X3RS6WfkZNXZoenQQ== X-CSE-MsgGUID: Ln4DiuWVSPamDeM35Cs3Rg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="55516940" Received: from kanliang-dev.jf.intel.com ([10.165.154.102]) by fmviesa010.fm.intel.com with ESMTP; 02 Aug 2024 08:15:59 -0700 From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, irogers@google.com, linux-kernel@vger.kernel.org Cc: Kan Liang , Dhananjay Ugwekar Subject: [PATCH 6/7] perf/x86/rapl: Move the pmu allocation out of CPU hotplug Date: Fri, 2 Aug 2024 08:16:42 -0700 Message-Id: <20240802151643.1691631-7-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20240802151643.1691631-1-kan.liang@linux.intel.com> References: <20240802151643.1691631-1-kan.liang@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kan Liang The rapl pmu only needs to be allocated once. It does not matter whether that happens in the CPU hotplug callback or in the global init_rapl_pmus(). Move the pmu allocation to init_rapl_pmus() so that the generic hotplug support can be applied.
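The allocate-once pattern described above can be sketched in plain user-space C as follows. This is an illustration under stated assumptions, not the kernel code: struct die_pmu, alloc_die_pmus(), free_die_pmus() and the cpu_to_die/online arrays are hypothetical stand-ins for the rapl structures, topology_logical_die_id() and cpu_online_mask.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-die state, standing in for the rapl pmu structure. */
struct die_pmu {
	int die_id;
	int first_cpu;	/* online CPU that triggered the allocation */
};

/*
 * Walk the online CPUs once and allocate one structure per die that
 * does not have one yet. A failed allocation is skipped rather than
 * treated as fatal. Returns the number of structures allocated.
 */
static int alloc_die_pmus(const int *cpu_to_die, const int *online,
			  int ncpus, struct die_pmu **pmus, int ndies)
{
	int allocated = 0;

	for (int cpu = 0; cpu < ncpus; cpu++) {
		struct die_pmu *pmu;
		int die;

		if (!online[cpu])
			continue;

		die = cpu_to_die[cpu];
		assert(die >= 0 && die < ndies);
		if (pmus[die])
			continue;	/* this die is already covered */

		pmu = calloc(1, sizeof(*pmu));
		if (!pmu)
			continue;

		pmu->die_id = die;
		pmu->first_cpu = cpu;
		pmus[die] = pmu;
		allocated++;
	}
	return allocated;
}

/* Release every per-die structure and clear the table. */
static void free_die_pmus(struct die_pmu **pmus, int ndies)
{
	for (int die = 0; die < ndies; die++) {
		free(pmus[die]);
		pmus[die] = NULL;
	}
}
```

Note that in this sketch a die with no online CPU at init time gets no structure, so a caller has to tolerate an empty slot for such a die.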
Signed-off-by: Kan Liang Cc: Dhananjay Ugwekar --- arch/x86/events/rapl.c | 43 +++++++++++++++++++++++++++++------------- 1 file changed, 30 insertions(+), 13 deletions(-) diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c index b985ca79cf97..f8b6d504d03f 100644 --- a/arch/x86/events/rapl.c +++ b/arch/x86/events/rapl.c @@ -568,19 +568,8 @@ static int rapl_cpu_online(unsigned int cpu) struct rapl_pmu *pmu =3D cpu_to_rapl_pmu(cpu); int target; =20 - if (!pmu) { - pmu =3D kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu)); - if (!pmu) - return -ENOMEM; - - raw_spin_lock_init(&pmu->lock); - INIT_LIST_HEAD(&pmu->active_list); - pmu->pmu =3D &rapl_pmus->pmu; - pmu->timer_interval =3D ms_to_ktime(rapl_timer_ms); - rapl_hrtimer_init(pmu); - - rapl_pmus->pmus[topology_logical_die_id(cpu)] =3D pmu; - } + if (!pmu) + return -ENOMEM; =20 /* * Check if there is an online cpu in the package which collects rapl @@ -673,6 +662,32 @@ static const struct attribute_group *rapl_attr_update[= ] =3D { NULL, }; =20 +static void __init init_rapl_pmu(void) +{ + struct rapl_pmu *pmu; + int cpu; + + cpus_read_lock(); + + for_each_cpu(cpu, cpu_online_mask) { + pmu =3D cpu_to_rapl_pmu(cpu); + if (pmu) + continue; + pmu =3D kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu)); + if (!pmu) + continue; + raw_spin_lock_init(&pmu->lock); + INIT_LIST_HEAD(&pmu->active_list); + pmu->pmu =3D &rapl_pmus->pmu; + pmu->timer_interval =3D ms_to_ktime(rapl_timer_ms); + rapl_hrtimer_init(pmu); + + rapl_pmus->pmus[topology_logical_die_id(cpu)] =3D pmu; + } + + cpus_read_unlock(); +} + static int __init init_rapl_pmus(void) { int nr_rapl_pmu =3D topology_max_packages() * topology_max_dies_per_packa= ge(); @@ -681,6 +696,8 @@ static int __init init_rapl_pmus(void) if (!rapl_pmus) return -ENOMEM; =20 + init_rapl_pmu(); + rapl_pmus->nr_rapl_pmu =3D nr_rapl_pmu; rapl_pmus->pmu.attr_groups =3D rapl_attr_groups; rapl_pmus->pmu.attr_update =3D rapl_attr_update; --=20 2.38.1 From nobody Fri Dec 19 
07:31:48 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0A87321C19B for ; Fri, 2 Aug 2024 15:16:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611765; cv=none; b=daGRzqtjOqDhks0pE8JO34+oVLUlCoCQe68TYVUnVl3ZvuQMPvuHlEbJ/BQdaQx14Fc8fC/k8WdlhWwbXgG/4NwIa698GWA0IZH1kcIeZ9Irx6kXAgaNj2EPY9GavEwWheB2H3tnudEaTtnISFtsTiuIcvtDIxMYzJ1QcK+Gvlc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722611765; c=relaxed/simple; bh=+q6If9D2thIVk8mfxeitCcSfQbMYqV+Hm+rfEgc6qVY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ZQPC7pwFt30+ay00RdoxrWRMOhhDVnhKqWvgSaRyXis0AHLM2t4UdOGtPCc3X1j8VB0mmLlNXUDJSxcOTYELZ1dHqfSFmduIvsEevzpOtPXAzMNcl4NFNiZ1UEZVJzboG5Vofbm5PXq+Uvtk762wakA14R3okLlPmpilbav3BjM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=XHjqZIjO; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="XHjqZIjO" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1722611765; x=1754147765; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+q6If9D2thIVk8mfxeitCcSfQbMYqV+Hm+rfEgc6qVY=; 
b=XHjqZIjOnZHEGALT25zFzby1673kf/zAJA8u/7/cweW+i0Pwa6kHOPsx qIWpllQ+pwxC0bAGGbaI8GdIUEtFvu75go8nmxHkKTm4OBM5MinJsMIzL P8ui5jzEAn2DZNmc1mB8mX1K7eBnx5hCTxaJD1LUgYRa2MYfRHSrRbD1x puRN21xjUU4/EU895tcj19Ipk5Y5yMZJx+xW1oCM0zcSkOkRlHt1PImVz EVlVJpoji3v24Q7B05pW7SCf+hoAkm+FyT2dwG8n3Id4jhbaXyWUGmF9c 2KRqIMmg1Yw9Daeuc8ln8VAwZz6WLtO3TqTDjc5M8DWxyfnTJ2EOt2s6m g==; X-CSE-ConnectionGUID: UPSow0eoReikZ55wkBuC3A== X-CSE-MsgGUID: mt9aBvO4QgWcFlx3XMG0DQ== X-IronPort-AV: E=McAfee;i="6700,10204,11152"; a="20473793" X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="20473793" Received: from fmviesa010.fm.intel.com ([10.60.135.150]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Aug 2024 08:16:00 -0700 X-CSE-ConnectionGUID: lWCROytnRfa3mMPq4w0L0Q== X-CSE-MsgGUID: M/DABoEiRaWbfFYB4F/U5g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,258,1716274800"; d="scan'208";a="55516942" Received: from kanliang-dev.jf.intel.com ([10.165.154.102]) by fmviesa010.fm.intel.com with ESMTP; 02 Aug 2024 08:15:59 -0700 From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, irogers@google.com, linux-kernel@vger.kernel.org Cc: Kan Liang , Dhananjay Ugwekar Subject: [PATCH 7/7] perf/x86/rapl: Clean up cpumask and hotplug Date: Fri, 2 Aug 2024 08:16:43 -0700 Message-Id: <20240802151643.1691631-8-kan.liang@linux.intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20240802151643.1691631-1-kan.liang@linux.intel.com> References: <20240802151643.1691631-1-kan.liang@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kan Liang The rapl PMU has die scope, which the generic perf_event subsystem now supports. Set the scope for the rapl PMU and remove all the cpumask and hotplug code.
Signed-off-by: Kan Liang Cc: Dhananjay Ugwekar --- arch/x86/events/rapl.c | 80 +------------------------------------- include/linux/cpuhotplug.h | 1 - 2 files changed, 2 insertions(+), 79 deletions(-) diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c index f8b6d504d03f..b70ad880c5bc 100644 --- a/arch/x86/events/rapl.c +++ b/arch/x86/events/rapl.c @@ -135,7 +135,6 @@ struct rapl_model { /* 1/2^hw_unit Joule */ static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly; static struct rapl_pmus *rapl_pmus; -static cpumask_t rapl_cpu_mask; static unsigned int rapl_cntr_mask; static u64 rapl_timer_ms; static struct perf_msr *rapl_msrs; @@ -340,8 +339,6 @@ static int rapl_pmu_event_init(struct perf_event *event) if (event->cpu < 0) return -EINVAL; =20 - event->event_caps |=3D PERF_EV_CAP_READ_ACTIVE_PKG; - if (!cfg || cfg >=3D NR_RAPL_DOMAINS + 1) return -EINVAL; =20 @@ -360,7 +357,6 @@ static int rapl_pmu_event_init(struct perf_event *event) pmu =3D cpu_to_rapl_pmu(event->cpu); if (!pmu) return -EINVAL; - event->cpu =3D pmu->cpu; event->pmu_private =3D pmu; event->hw.event_base =3D rapl_msrs[bit].msr; event->hw.config =3D cfg; @@ -374,23 +370,6 @@ static void rapl_pmu_event_read(struct perf_event *eve= nt) rapl_event_update(event); } =20 -static ssize_t rapl_get_attr_cpumask(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return cpumap_print_to_pagebuf(true, buf, &rapl_cpu_mask); -} - -static DEVICE_ATTR(cpumask, S_IRUGO, rapl_get_attr_cpumask, NULL); - -static struct attribute *rapl_pmu_attrs[] =3D { - &dev_attr_cpumask.attr, - NULL, -}; - -static struct attribute_group rapl_pmu_attr_group =3D { - .attrs =3D rapl_pmu_attrs, -}; - RAPL_EVENT_ATTR_STR(energy-cores, rapl_cores, "event=3D0x01"); RAPL_EVENT_ATTR_STR(energy-pkg , rapl_pkg, "event=3D0x02"); RAPL_EVENT_ATTR_STR(energy-ram , rapl_ram, "event=3D0x03"); @@ -438,7 +417,6 @@ static struct attribute_group rapl_pmu_format_group =3D= { }; =20 static const struct attribute_group 
*rapl_attr_groups[] =3D { - &rapl_pmu_attr_group, &rapl_pmu_format_group, &rapl_pmu_events_group, NULL, @@ -541,49 +519,6 @@ static struct perf_msr amd_rapl_msrs[] =3D { [PERF_RAPL_PSYS] =3D { 0, &rapl_events_psys_group, NULL, false, 0 }, }; =20 -static int rapl_cpu_offline(unsigned int cpu) -{ - struct rapl_pmu *pmu =3D cpu_to_rapl_pmu(cpu); - int target; - - /* Check if exiting cpu is used for collecting rapl events */ - if (!cpumask_test_and_clear_cpu(cpu, &rapl_cpu_mask)) - return 0; - - pmu->cpu =3D -1; - /* Find a new cpu to collect rapl events */ - target =3D cpumask_any_but(topology_die_cpumask(cpu), cpu); - - /* Migrate rapl events to the new target */ - if (target < nr_cpu_ids) { - cpumask_set_cpu(target, &rapl_cpu_mask); - pmu->cpu =3D target; - perf_pmu_migrate_context(pmu->pmu, cpu, target); - } - return 0; -} - -static int rapl_cpu_online(unsigned int cpu) -{ - struct rapl_pmu *pmu =3D cpu_to_rapl_pmu(cpu); - int target; - - if (!pmu) - return -ENOMEM; - - /* - * Check if there is an online cpu in the package which collects rapl - * events already. - */ - target =3D cpumask_any_and(&rapl_cpu_mask, topology_die_cpumask(cpu)); - if (target < nr_cpu_ids) - return 0; - - cpumask_set_cpu(cpu, &rapl_cpu_mask); - pmu->cpu =3D cpu; - return 0; -} - static int rapl_check_hw_unit(struct rapl_model *rm) { u64 msr_rapl_power_unit_bits; @@ -709,6 +644,7 @@ static int __init init_rapl_pmus(void) rapl_pmus->pmu.stop =3D rapl_pmu_event_stop; rapl_pmus->pmu.read =3D rapl_pmu_event_read; rapl_pmus->pmu.module =3D THIS_MODULE; + rapl_pmus->pmu.scope =3D PERF_PMU_SCOPE_DIE; rapl_pmus->pmu.capabilities =3D PERF_PMU_CAP_NO_EXCLUDE; return 0; } @@ -856,24 +792,13 @@ static int __init rapl_pmu_init(void) if (ret) return ret; =20 - /* - * Install callbacks. Core will call them for each online cpu. 
- */ - ret =3D cpuhp_setup_state(CPUHP_AP_PERF_X86_RAPL_ONLINE, - "perf/x86/rapl:online", - rapl_cpu_online, rapl_cpu_offline); - if (ret) - goto out; - ret =3D perf_pmu_register(&rapl_pmus->pmu, "power", -1); if (ret) - goto out1; + goto out; =20 rapl_advertise(); return 0; =20 -out1: - cpuhp_remove_state(CPUHP_AP_PERF_X86_RAPL_ONLINE); out: pr_warn("Initialization failed (%d), disabled\n", ret); cleanup_rapl_pmus(); @@ -883,7 +808,6 @@ module_init(rapl_pmu_init); =20 static void __exit intel_rapl_exit(void) { - cpuhp_remove_state_nocalls(CPUHP_AP_PERF_X86_RAPL_ONLINE); perf_pmu_unregister(&rapl_pmus->pmu); cleanup_rapl_pmus(); } diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 9ea6290ade56..f408521be568 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -207,7 +207,6 @@ enum cpuhp_state { CPUHP_AP_PERF_X86_UNCORE_ONLINE, CPUHP_AP_PERF_X86_AMD_UNCORE_ONLINE, CPUHP_AP_PERF_X86_AMD_POWER_ONLINE, - CPUHP_AP_PERF_X86_RAPL_ONLINE, CPUHP_AP_PERF_S390_CF_ONLINE, CPUHP_AP_PERF_S390_SF_ONLINE, CPUHP_AP_PERF_ARM_CCI_ONLINE, --=20 2.38.1