From nobody Mon May 25 03:55:48 2026
From: Qi Liu
To: , , , , ,
CC: , , , , Zhenglang Hu
Subject: [RFC PATCH 1/3] perf/x86/amd/uncore: Add common PMU helper functions
Date: Tue, 19 May 2026 03:32:23 +0000
Message-ID: <20260519033225.1479907-2-liuqi@hygon.cn>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20260519033225.1479907-1-liuqi@hygon.cn>
References: <20260519033225.1479907-1-liuqi@hygon.cn>
X-Mailing-List: linux-kernel@vger.kernel.org

Add common helper functions for AMD-family uncore PMU handling.
The helpers cover event initialization, counter allocation, counter
read/update, event start/stop and per-CPU context management. These
paths are not tied to a specific uncore unit and can be reused by
drivers with a similar uncore PMU programming model.

Signed-off-by: Qi Liu
Tested-by: Zhenglang Hu
---
 arch/x86/events/amd/uncore_common.c | 390 ++++++++++++++++++++++++++++
 arch/x86/events/amd/uncore_common.h | 113 ++++++++
 2 files changed, 503 insertions(+)
 create mode 100644 arch/x86/events/amd/uncore_common.c
 create mode 100644 arch/x86/events/amd/uncore_common.h

diff --git a/arch/x86/events/amd/uncore_common.c b/arch/x86/events/amd/uncore_common.c
new file mode 100644
index 000000000000..a6d50fe803df
--- /dev/null
+++ b/arch/x86/events/amd/uncore_common.c
@@ -0,0 +1,390 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Common uncore PMU helpers for AMD-family x86 processors.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#include "uncore_common.h"
+
+#define COUNTER_SHIFT		16
+#define NUM_COUNTERS_MAX	64
+
+/* Interval for hrtimer, defaults to 60000 milliseconds */
+static unsigned int uncore_update_interval = 60 * MSEC_PER_SEC;
+
+void uncore_common_set_update_interval(unsigned int interval)
+{
+	uncore_update_interval = interval;
+}
+
+struct uncore_common_pmu *event_to_uncore_common_pmu(struct perf_event *event)
+{
+	return container_of(event->pmu, struct uncore_common_pmu, pmu);
+}
+
+static ssize_t cpumask_show(struct device *dev,
+			    struct device_attribute *attr,
+			    char *buf)
+{
+	struct pmu *ptr = dev_get_drvdata(dev);
+	struct uncore_common_pmu *pmu;
+
+	pmu = container_of(ptr, struct uncore_common_pmu, pmu);
+
+	return cpumap_print_to_pagebuf(true, buf, &pmu->active_mask);
+}
+static DEVICE_ATTR_RO(cpumask);
+
+static struct attribute *uncore_common_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+struct attribute_group uncore_common_attr_group = {
+	.attrs = uncore_common_attrs,
+};
+
+static enum hrtimer_restart uncore_common_hrtimer(struct hrtimer *hrtimer)
+{
+	struct uncore_common_ctx *ctx;
+	struct perf_event *event;
+	int bit;
+
+	ctx = container_of(hrtimer, struct uncore_common_ctx, hrtimer);
+
+	if (!ctx->nr_active || ctx->cpu != smp_processor_id())
+		return HRTIMER_NORESTART;
+
+	for_each_set_bit(bit, ctx->active_mask, NUM_COUNTERS_MAX) {
+		event = ctx->events[bit];
+		event->pmu->read(event);
+	}
+
+	hrtimer_forward_now(hrtimer, ns_to_ktime(ctx->hrtimer_duration));
+
+	return HRTIMER_RESTART;
+}
+
+void uncore_common_start_hrtimer(struct uncore_common_ctx *ctx)
+{
+	hrtimer_start(&ctx->hrtimer, ns_to_ktime(ctx->hrtimer_duration),
+		      HRTIMER_MODE_REL_PINNED_HARD);
+}
+
+static void uncore_common_cancel_hrtimer(struct uncore_common_ctx *ctx)
+{
+	hrtimer_cancel(&ctx->hrtimer);
+}
+
+static void uncore_common_init_hrtimer(struct uncore_common_ctx *ctx)
+{
+	hrtimer_setup(&ctx->hrtimer, uncore_common_hrtimer, CLOCK_MONOTONIC,
+		      HRTIMER_MODE_REL_HARD);
+}
+
+void uncore_common_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new;
+	s64 delta;
+
+	/*
+	 * since we do not enable counter overflow interrupts,
+	 * we do not have to worry about prev_count changing on us
+	 */
+	prev = local64_read(&hwc->prev_count);
+
+	/*
+	 * Some uncore PMUs do not have RDPMC assignments. In such cases,
+	 * read counts directly from the corresponding PERF_CTR.
+	 */
+	if (hwc->event_base_rdpmc < 0)
+		rdmsrq(hwc->event_base, new);
+	else
+		new = rdpmc(hwc->event_base_rdpmc);
+
+	local64_set(&hwc->prev_count, new);
+
+	delta = (new << COUNTER_SHIFT) - (prev << COUNTER_SHIFT);
+	delta >>= COUNTER_SHIFT;
+
+	local64_add(delta, &event->count);
+}
+
+void uncore_common_start(struct perf_event *event, int flags)
+{
+	struct uncore_common_pmu *pmu = event_to_uncore_common_pmu(event);
+	struct uncore_common_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (!ctx->nr_active++)
+		uncore_common_start_hrtimer(ctx);
+
+	if (flags & PERF_EF_RELOAD)
+		wrmsrq(hwc->event_base, (u64)local64_read(&hwc->prev_count));
+
+	hwc->state = 0;
+
+	__set_bit(hwc->idx, ctx->active_mask);
+	wrmsrq(hwc->config_base, (hwc->config | ARCH_PERFMON_EVENTSEL_ENABLE));
+
+	perf_event_update_userpage(event);
+}
+
+void uncore_common_stop(struct perf_event *event, int flags)
+{
+	struct uncore_common_pmu *pmu = event_to_uncore_common_pmu(event);
+	struct uncore_common_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	wrmsrq(hwc->config_base, hwc->config);
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		event->pmu->read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+
+	if (!--ctx->nr_active)
+		uncore_common_cancel_hrtimer(ctx);
+
+	__clear_bit(hwc->idx, ctx->active_mask);
+}
+
+int uncore_common_event_init(struct perf_event *event)
+{
+	struct uncore_common_pmu *pmu;
+	struct uncore_common_ctx *ctx;
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	pmu = event_to_uncore_common_pmu(event);
+	ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
+	if (!ctx)
+		return -ENODEV;
+
+	hwc->config = event->attr.config;
+	hwc->idx = -1;
+
+	event->cpu = ctx->cpu;
+
+	return 0;
+}
+
+int uncore_common_add(struct perf_event *event, int flags)
+{
+	struct uncore_common_pmu *pmu = event_to_uncore_common_pmu(event);
+	struct uncore_common_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	/* are we already assigned? */
+	if (hwc->idx != -1 && ctx->events[hwc->idx] == event)
+		goto out;
+
+	for (i = 0; i < pmu->num_counters; i++) {
+		if (ctx->events[i] == event) {
+			hwc->idx = i;
+			goto out;
+		}
+	}
+
+	/* if not, take the first available counter */
+	hwc->idx = -1;
+
+	for (i = 0; i < pmu->num_counters; i++) {
+		struct perf_event *tmp = NULL;
+
+		if (try_cmpxchg(&ctx->events[i], &tmp, event)) {
+			hwc->idx = i;
+			break;
+		}
+	}
+
+out:
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	hwc->config_base = pmu->msr_base + (2 * hwc->idx);
+	hwc->event_base = pmu->msr_base + 1 + (2 * hwc->idx);
+	hwc->event_base_rdpmc = pmu->rdpmc_base + hwc->idx;
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (pmu->rdpmc_base < 0)
+		hwc->event_base_rdpmc = -1;
+
+	if (flags & PERF_EF_START)
+		event->pmu->start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+void uncore_common_del(struct perf_event *event, int flags)
+{
+	struct uncore_common_pmu *pmu = event_to_uncore_common_pmu(event);
+	struct uncore_common_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	event->pmu->stop(event, PERF_EF_UPDATE);
+
+	for (i = 0; i < pmu->num_counters; i++) {
+		struct perf_event *tmp = event;
+
+		if (try_cmpxchg(&ctx->events[i], &tmp, NULL))
+			break;
+	}
+
+	hwc->idx = -1;
+}
+
+int uncore_common_ctx_init(struct uncore_common *uncore, unsigned int cpu)
+{
+	struct uncore_common_ctx *curr, *prev;
+	struct uncore_common_pmu *pmu;
+	int node, cid, gid;
+	int i, j;
+
+	if (!uncore->init_done || !uncore->num_pmus)
+		return 0;
+
+	cid = uncore_common_ctx_cid(uncore, cpu);
+	gid = uncore_common_ctx_gid(uncore, cpu);
+
+	for (i = 0; i < uncore->num_pmus; i++) {
+		pmu = &uncore->pmus[i];
+		*per_cpu_ptr(pmu->ctx, cpu) = NULL;
+		curr = NULL;
+
+		if (gid != pmu->group)
+			continue;
+
+		for_each_online_cpu(j) {
+			if (cpu == j)
+				continue;
+
+			prev = *per_cpu_ptr(pmu->ctx, j);
+			if (!prev)
+				continue;
+
+			if (cid == uncore_common_ctx_cid(uncore, j)) {
+				curr = prev;
+				break;
+			}
+		}
+
+		if (!curr) {
+			node = cpu_to_node(cpu);
+
+			curr = kzalloc_node(sizeof(*curr), GFP_KERNEL, node);
+			if (!curr)
+				goto fail;
+
+			curr->cpu = cpu;
+			curr->events = kzalloc_node(sizeof(*curr->events) *
+						    pmu->num_counters,
+						    GFP_KERNEL, node);
+			if (!curr->events) {
+				kfree(curr);
+				goto fail;
+			}
+
+			uncore_common_init_hrtimer(curr);
+			curr->hrtimer_duration = (u64)uncore_update_interval * NSEC_PER_MSEC;
+
+			cpumask_set_cpu(cpu, &pmu->active_mask);
+		}
+
+		curr->refcnt++;
+		*per_cpu_ptr(pmu->ctx, cpu) = curr;
+	}
+
+	return 0;
+
+fail:
+	uncore_common_ctx_free(uncore, cpu);
+
+	return -ENOMEM;
+}
+
+void uncore_common_ctx_free(struct uncore_common *uncore, unsigned int cpu)
+{
+	struct uncore_common_pmu *pmu;
+	struct uncore_common_ctx *ctx;
+	int i;
+
+	if (!uncore->init_done)
+		return;
+
+	for (i = 0; i < uncore->num_pmus; i++) {
+		pmu = &uncore->pmus[i];
+
+		if (!pmu->ctx)
+			continue;
+
+		ctx = *per_cpu_ptr(pmu->ctx, cpu);
+		if (!ctx)
+			continue;
+
+		if (cpu == ctx->cpu)
+			cpumask_clear_cpu(cpu, &pmu->active_mask);
+
+		if (!--ctx->refcnt) {
+			kfree(ctx->events);
+			kfree(ctx);
+		}
+
+		*per_cpu_ptr(pmu->ctx, cpu) = NULL;
+	}
+}
+
+void uncore_common_ctx_move(struct uncore_common *uncore, unsigned int cpu)
+{
+	struct uncore_common_ctx *curr, *next;
+	struct uncore_common_pmu *pmu;
+	int i, j;
+
+	if (!uncore->init_done)
+		return;
+
+	for (i = 0; i < uncore->num_pmus; i++) {
+		pmu = &uncore->pmus[i];
+		if (!pmu->ctx)
+			continue;
+
+		curr = *per_cpu_ptr(pmu->ctx, cpu);
+		if (!curr)
+			continue;
+
+		for_each_online_cpu(j) {
+			if (cpu == j)
+				continue;
+
+			next = *per_cpu_ptr(pmu->ctx, j);
+			if (!next)
+				continue;
+
+			if (curr == next) {
+				perf_pmu_migrate_context(&pmu->pmu, cpu, j);
+				cpumask_clear_cpu(cpu, &pmu->active_mask);
+				cpumask_set_cpu(j, &pmu->active_mask);
+				next->cpu = j;
+				break;
+			}
+		}
+	}
+}
diff --git a/arch/x86/events/amd/uncore_common.h b/arch/x86/events/amd/uncore_common.h
new file mode 100644
index 000000000000..3657b22268c0
--- /dev/null
+++ b/arch/x86/events/amd/uncore_common.h
@@ -0,0 +1,113 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _X86_EVENTS_UNCORE_COMMON_H
+#define _X86_EVENTS_UNCORE_COMMON_H
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define UNCORE_NAME_LEN		16
+#define UNCORE_GROUP_MAX	256
+#define NUM_COUNTERS_MAX	64
+
+#define DEFINE_UNCORE_FORMAT_ATTR(_var, _name, _format)	\
+static ssize_t __uncore_##_var##_show(struct device *dev,	\
+				struct device_attribute *attr,	\
+				char *page)	\
+{	\
+	BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE);	\
+	return sprintf(page, _format "\n");	\
+}	\
+static struct device_attribute format_attr_##_var =	\
+	__ATTR(_name, 0444, __uncore_##_var##_show, NULL)
+
+union uncore_common_info {
+	struct {
+		u64 aux_data:32;
+		u64 num_pmcs:8;
+		u64 gid:8;
+		u64 cid:8;
+		u64 private:8;
+	} split;
+	u64 full;
+};
+
+struct uncore_common_ctx {
+	int refcnt;
+	int cpu;
+	struct perf_event **events;
+	unsigned long active_mask[BITS_TO_LONGS(NUM_COUNTERS_MAX)];
+	int nr_active;
+	struct hrtimer hrtimer;
+	u64 hrtimer_duration;
+};
+
+struct uncore_common_pmu {
+	char name[UNCORE_NAME_LEN];
+	int num_counters;
+	int rdpmc_base;
+	u32 msr_base;
+	int group;
+	cpumask_t active_mask;
+	struct pmu pmu;
+	struct uncore_common_ctx * __percpu *ctx;
+	void *private;
+};
+
+struct uncore_common {
+	union uncore_common_info __percpu *info;
+	struct uncore_common_pmu *pmus;
+	unsigned int num_pmus;
+	bool init_done;
+	void (*scan)(struct uncore_common *uncore, unsigned int cpu);
+	int (*init)(struct uncore_common *uncore, unsigned int cpu);
+	void (*move)(struct uncore_common *uncore, unsigned int cpu);
+	void (*free)(struct uncore_common *uncore, unsigned int cpu);
+};
+
+extern struct attribute_group uncore_common_attr_group;
+
+static inline int uncore_common_ctx_cid(struct uncore_common *uncore,
+					unsigned int cpu)
+{
+	union uncore_common_info *info = per_cpu_ptr(uncore->info, cpu);
+
+	return info->split.cid;
+}
+
+static inline int uncore_common_ctx_gid(struct uncore_common *uncore,
+					unsigned int cpu)
+{
+	union uncore_common_info *info = per_cpu_ptr(uncore->info, cpu);
+
+	return info->split.gid;
+}
+
+static inline int uncore_common_ctx_num_pmcs(struct uncore_common *uncore,
+					     unsigned int cpu)
+{
+	union uncore_common_info *info = per_cpu_ptr(uncore->info, cpu);
+
+	return info->split.num_pmcs;
+}
+
+struct uncore_common_pmu *event_to_uncore_common_pmu(struct perf_event *event);
+
+void uncore_common_set_update_interval(unsigned int interval);
+int uncore_common_event_init(struct perf_event *event);
+int uncore_common_add(struct perf_event *event, int flags);
+void uncore_common_del(struct perf_event *event, int flags);
+void uncore_common_start(struct perf_event *event, int flags);
+void uncore_common_stop(struct perf_event *event, int flags);
+void uncore_common_read(struct perf_event *event);
+
+int uncore_common_ctx_init(struct uncore_common *uncore, unsigned int cpu);
+void uncore_common_ctx_free(struct uncore_common *uncore, unsigned int cpu);
+void uncore_common_ctx_move(struct uncore_common *uncore, unsigned int cpu);
+void uncore_common_start_hrtimer(struct uncore_common_ctx *ctx);
+
+#endif /* _X86_EVENTS_UNCORE_COMMON_H */
-- 
2.34.1

From nobody Mon May 25 03:55:48 2026
From: Qi Liu
To: , , , , ,
CC: , , , , Zhenglang Hu
Subject: [RFC PATCH 2/3] perf/x86/amd/uncore: Convert AMD driver to common PMU helpers
Date: Tue, 19 May 2026 03:32:24 +0000
Message-ID: <20260519033225.1479907-3-liuqi@hygon.cn>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20260519033225.1479907-1-liuqi@hygon.cn>
References: <20260519033225.1479907-1-liuqi@hygon.cn>
X-Mailing-List: linux-kernel@vger.kernel.org

Use the common uncore PMU helpers for AMD uncore event handling and
per-CPU context management. The AMD-specific DF, L3 and UMC discovery,
event encoding and PMU setup logic remains in the AMD driver. This only
replaces duplicated common operations with calls to the helper
functions.

Signed-off-by: Qi Liu
Tested-by: Zhenglang Hu
---
 arch/x86/events/amd/Makefile |   2 +-
 arch/x86/events/amd/uncore.c | 585 +++++------------------------------
 2 files changed, 76 insertions(+), 511 deletions(-)

diff --git a/arch/x86/events/amd/Makefile b/arch/x86/events/amd/Makefile
index 527d947eb76b..f951ae64ee36 100644
--- a/arch/x86/events/amd/Makefile
+++ b/arch/x86/events/amd/Makefile
@@ -4,7 +4,7 @@ obj-$(CONFIG_PERF_EVENTS_AMD_BRS) += brs.o
 obj-$(CONFIG_PERF_EVENTS_AMD_POWER) += power.o
 obj-$(CONFIG_X86_LOCAL_APIC) += ibs.o
 obj-$(CONFIG_PERF_EVENTS_AMD_UNCORE) += amd-uncore.o
-amd-uncore-objs := uncore.o
+amd-uncore-objs := uncore_common.o uncore.o
 ifdef CONFIG_AMD_IOMMU
 obj-$(CONFIG_CPU_SUP_AMD) += iommu.o
 endif
diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index dd956cfcadef..065824b70497 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -18,16 +18,16 @@
 #include
 #include
 
+#include "uncore_common.h"
+
 #define NUM_COUNTERS_NB		4
 #define NUM_COUNTERS_L2		4
 #define NUM_COUNTERS_L3		6
-#define NUM_COUNTERS_MAX	64
 
 #define RDPMC_BASE_NB		6
 #define RDPMC_BASE_LLC		10
 
 #define COUNTER_SHIFT		16
-#define UNCORE_NAME_LEN		16
 #define UNCORE_GROUP_MAX	256
 
 #undef pr_fmt
@@ -35,27 +35,6 @@
 
 static int pmu_version;
 
-struct amd_uncore_ctx {
-	int refcnt;
-	int cpu;
-	struct perf_event **events;
-	unsigned long active_mask[BITS_TO_LONGS(NUM_COUNTERS_MAX)];
-	int nr_active;
-	struct hrtimer hrtimer;
-	u64 hrtimer_duration;
-};
-
-struct amd_uncore_pmu {
-	char name[UNCORE_NAME_LEN];
-	int num_counters;
-	int rdpmc_base;
-	u32 msr_base;
-	int group;
-	cpumask_t active_mask;
-	struct pmu pmu;
-	struct amd_uncore_ctx * __percpu *ctx;
-};
-
 enum {
 	UNCORE_TYPE_DF,
 	UNCORE_TYPE_L3,
@@ -63,243 +42,11 @@ enum {
 
 	UNCORE_TYPE_MAX
 };
-
-union amd_uncore_info {
-	struct {
-		u64 aux_data:32; /* auxiliary data */
-		u64 num_pmcs:8; /* number of counters */
-		u64 gid:8; /* group id */
-		u64 cid:8; /* context id */
-	} split;
-	u64 full;
-};
-
-struct amd_uncore {
-	union amd_uncore_info __percpu *info;
-	struct amd_uncore_pmu *pmus;
-	unsigned int num_pmus;
-	bool init_done;
-	void (*scan)(struct amd_uncore *uncore, unsigned int cpu);
-	int (*init)(struct amd_uncore *uncore, unsigned int cpu);
-	void (*move)(struct amd_uncore *uncore, unsigned int cpu);
-	void (*free)(struct amd_uncore *uncore, unsigned int cpu);
-};
-
-static struct amd_uncore uncores[UNCORE_TYPE_MAX];
-
 /* Interval for hrtimer, defaults to 60000 milliseconds */
 static unsigned int update_interval = 60 * MSEC_PER_SEC;
 module_param(update_interval, uint, 0444);
 
-static struct amd_uncore_pmu *event_to_amd_uncore_pmu(struct perf_event *event)
-{
-	return container_of(event->pmu, struct amd_uncore_pmu, pmu);
-}
-
-static enum hrtimer_restart amd_uncore_hrtimer(struct hrtimer *hrtimer)
-{
-	struct amd_uncore_ctx *ctx;
-	struct perf_event *event;
-	int bit;
-
-	ctx = container_of(hrtimer, struct amd_uncore_ctx, hrtimer);
-
-	if (!ctx->nr_active || ctx->cpu != smp_processor_id())
-		return HRTIMER_NORESTART;
-
-	for_each_set_bit(bit, ctx->active_mask, NUM_COUNTERS_MAX) {
-		event = ctx->events[bit];
-		event->pmu->read(event);
-	}
-
-	hrtimer_forward_now(hrtimer, ns_to_ktime(ctx->hrtimer_duration));
-	return HRTIMER_RESTART;
-}
-
-static void amd_uncore_start_hrtimer(struct amd_uncore_ctx *ctx)
-{
-	hrtimer_start(&ctx->hrtimer, ns_to_ktime(ctx->hrtimer_duration),
-		      HRTIMER_MODE_REL_PINNED_HARD);
-}
-
-static void amd_uncore_cancel_hrtimer(struct amd_uncore_ctx *ctx)
-{
-	hrtimer_cancel(&ctx->hrtimer);
-}
-
-static void amd_uncore_init_hrtimer(struct amd_uncore_ctx *ctx)
-{
-	hrtimer_setup(&ctx->hrtimer, amd_uncore_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
-}
-
-static void amd_uncore_read(struct perf_event *event)
-{
-	struct hw_perf_event *hwc = &event->hw;
-	u64 prev, new;
-	s64 delta;
-
-	/*
-	 * since we do not enable counter overflow interrupts,
-	 * we do not have to worry about prev_count changing on us
-	 */
-
-	prev = local64_read(&hwc->prev_count);
-
-	/*
-	 * Some uncore PMUs do not have RDPMC assignments. In such cases,
-	 * read counts directly from the corresponding PERF_CTR.
-	 */
-	if (hwc->event_base_rdpmc < 0)
-		rdmsrq(hwc->event_base, new);
-	else
-		new = rdpmc(hwc->event_base_rdpmc);
-
-	local64_set(&hwc->prev_count, new);
-	delta = (new << COUNTER_SHIFT) - (prev << COUNTER_SHIFT);
-	delta >>= COUNTER_SHIFT;
-	local64_add(delta, &event->count);
-}
-
-static void amd_uncore_start(struct perf_event *event, int flags)
-{
-	struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
-	struct amd_uncore_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
-	struct hw_perf_event *hwc = &event->hw;
-
-	if (!ctx->nr_active++)
-		amd_uncore_start_hrtimer(ctx);
-
-	if (flags & PERF_EF_RELOAD)
-		wrmsrq(hwc->event_base, (u64)local64_read(&hwc->prev_count));
-
-	hwc->state = 0;
-	__set_bit(hwc->idx, ctx->active_mask);
-	wrmsrq(hwc->config_base, (hwc->config | ARCH_PERFMON_EVENTSEL_ENABLE));
-	perf_event_update_userpage(event);
-}
-
-static void amd_uncore_stop(struct perf_event *event, int flags)
-{
-	struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
-	struct amd_uncore_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
-	struct hw_perf_event *hwc = &event->hw;
-
-	wrmsrq(hwc->config_base, hwc->config);
-	hwc->state |= PERF_HES_STOPPED;
-
-	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
-		event->pmu->read(event);
-		hwc->state |= PERF_HES_UPTODATE;
-	}
-
-	if (!--ctx->nr_active)
-		amd_uncore_cancel_hrtimer(ctx);
-
-	__clear_bit(hwc->idx, ctx->active_mask);
-}
-
-static int amd_uncore_add(struct perf_event *event, int flags)
-{
-	int i;
-	struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
-	struct amd_uncore_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
-	struct hw_perf_event *hwc = &event->hw;
-
-	/* are we already assigned? */
-	if (hwc->idx != -1 && ctx->events[hwc->idx] == event)
-		goto out;
-
-	for (i = 0; i < pmu->num_counters; i++) {
-		if (ctx->events[i] == event) {
-			hwc->idx = i;
-			goto out;
-		}
-	}
-
-	/* if not, take the first available counter */
-	hwc->idx = -1;
-	for (i = 0; i < pmu->num_counters; i++) {
-		struct perf_event *tmp = NULL;
-
-		if (try_cmpxchg(&ctx->events[i], &tmp, event)) {
-			hwc->idx = i;
-			break;
-		}
-	}
-
-out:
-	if (hwc->idx == -1)
-		return -EBUSY;
-
-	hwc->config_base = pmu->msr_base + (2 * hwc->idx);
-	hwc->event_base = pmu->msr_base + 1 + (2 * hwc->idx);
-	hwc->event_base_rdpmc = pmu->rdpmc_base + hwc->idx;
-	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
-
-	if (pmu->rdpmc_base < 0)
-		hwc->event_base_rdpmc = -1;
-
-	if (flags & PERF_EF_START)
-		event->pmu->start(event, PERF_EF_RELOAD);
-
-	return 0;
-}
-
-static void amd_uncore_del(struct perf_event *event, int flags)
-{
-	int i;
-	struct amd_uncore_pmu *pmu = event_to_amd_uncore_pmu(event);
-	struct amd_uncore_ctx *ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
-	struct hw_perf_event *hwc = &event->hw;
-
-	event->pmu->stop(event, PERF_EF_UPDATE);
-
-	for (i = 0; i < pmu->num_counters; i++) {
-		struct perf_event *tmp = event;
-
-		if (try_cmpxchg(&ctx->events[i], &tmp, NULL))
-			break;
-	}
-
-	hwc->idx = -1;
-}
-
-static int amd_uncore_event_init(struct perf_event *event)
-{
-	struct amd_uncore_pmu *pmu;
-	struct amd_uncore_ctx *ctx;
-	struct hw_perf_event *hwc = &event->hw;
-
-	if (event->attr.type != event->pmu->type)
-		return -ENOENT;
-
-	if (event->cpu < 0)
-		return -EINVAL;
-
-	pmu = event_to_amd_uncore_pmu(event);
-	ctx = *per_cpu_ptr(pmu->ctx, event->cpu);
-	if (!ctx)
-		return -ENODEV;
-
-	/*
-	 * NB and Last level cache counters (MSRs) are shared across all cores
-	 * that share the same NB / Last level cache. On family 16h and below,
-	 * Interrupts can be directed to a single target core, however, event
-	 * counts generated by processes running on other cores cannot be masked
-	 * out. So we do not support sampling and per-thread events via
-	 * CAP_NO_INTERRUPT, and we do not enable counter overflow interrupts:
-	 */
-	hwc->config = event->attr.config;
-	hwc->idx = -1;
-
-	/*
-	 * since request can come in to any of the shared cores, we will remap
-	 * to a single common cpu.
-	 */
-	event->cpu = ctx->cpu;
-
-	return 0;
-}
+static struct uncore_common uncores[UNCORE_TYPE_MAX];
 
 static umode_t
 amd_f17h_uncore_is_visible(struct kobject *kobj, struct attribute *attr, int i)
@@ -314,37 +61,6 @@ amd_f19h_uncore_is_visible(struct kobject *kobj, struct attribute *attr, int i)
 	return boot_cpu_data.x86 >= 0x19 ? attr->mode : 0;
 }
 
-static ssize_t amd_uncore_attr_show_cpumask(struct device *dev,
-					    struct device_attribute *attr,
-					    char *buf)
-{
-	struct pmu *ptr = dev_get_drvdata(dev);
-	struct amd_uncore_pmu *pmu = container_of(ptr, struct amd_uncore_pmu, pmu);
-
-	return cpumap_print_to_pagebuf(true, buf, &pmu->active_mask);
-}
-static DEVICE_ATTR(cpumask, S_IRUGO, amd_uncore_attr_show_cpumask, NULL);
-
-static struct attribute *amd_uncore_attrs[] = {
-	&dev_attr_cpumask.attr,
-	NULL,
-};
-
-static struct attribute_group amd_uncore_attr_group = {
-	.attrs = amd_uncore_attrs,
-};
-
-#define DEFINE_UNCORE_FORMAT_ATTR(_var, _name, _format)	\
-static ssize_t __uncore_##_var##_show(struct device *dev,	\
-				struct device_attribute *attr,	\
-				char *page)	\
-{	\
-	BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE);	\
-	return sprintf(page, _format "\n");	\
-}	\
-static struct device_attribute format_attr_##_var =	\
-	__ATTR(_name, 0444, __uncore_##_var##_show, NULL)
-
 DEFINE_UNCORE_FORMAT_ATTR(event12, event, "config:0-7,32-35");
 DEFINE_UNCORE_FORMAT_ATTR(event14, event, "config:0-7,32-35,59-60"); /* F17h+ DF */
 DEFINE_UNCORE_FORMAT_ATTR(event14v2, event, "config:0-7,32-37"); /* PerfMonV2 DF */
@@ -425,13 +141,13 @@ static struct attribute_group amd_uncore_umc_format_group = {
 };
 
 static const struct attribute_group *amd_uncore_df_attr_groups[] = {
-	&amd_uncore_attr_group,
+	&uncore_common_attr_group,
 	&amd_uncore_df_format_group,
 	NULL,
 };
 
 static const struct attribute_group *amd_uncore_l3_attr_groups[] = {
-	&amd_uncore_attr_group,
+	&uncore_common_attr_group,
 	&amd_uncore_l3_format_group,
 	NULL,
 };
@@ -443,164 +159,14 @@ static const struct attribute_group *amd_uncore_l3_attr_update[] = {
 };
 
 static const struct attribute_group *amd_uncore_umc_attr_groups[] = {
-	&amd_uncore_attr_group,
+	&uncore_common_attr_group,
 	&amd_uncore_umc_format_group,
 	NULL,
 };
 
-static __always_inline
-int amd_uncore_ctx_cid(struct amd_uncore *uncore, unsigned int cpu)
-{
-	union amd_uncore_info *info = per_cpu_ptr(uncore->info, cpu);
-	return info->split.cid;
-}
-
-static __always_inline
-int amd_uncore_ctx_gid(struct amd_uncore *uncore, unsigned int cpu)
-{
-	union amd_uncore_info *info = per_cpu_ptr(uncore->info, cpu);
-	return info->split.gid;
-}
-
-static __always_inline
-int amd_uncore_ctx_num_pmcs(struct amd_uncore *uncore, unsigned int cpu)
-{
-	union amd_uncore_info *info = per_cpu_ptr(uncore->info, cpu);
-	return info->split.num_pmcs;
-}
-
-static void amd_uncore_ctx_free(struct amd_uncore *uncore, unsigned int cpu)
-{
-	struct amd_uncore_pmu *pmu;
-	struct amd_uncore_ctx *ctx;
-	int i;
-
-	if (!uncore->init_done)
-		return;
-
-	for (i = 0; i < uncore->num_pmus; i++) {
-		pmu = &uncore->pmus[i];
-		ctx = *per_cpu_ptr(pmu->ctx, cpu);
-		if (!ctx)
-			continue;
-
-		if (cpu == ctx->cpu)
-			cpumask_clear_cpu(cpu, &pmu->active_mask);
-
-		if (!--ctx->refcnt) {
-			kfree(ctx->events);
-			kfree(ctx);
-		}
-
-		*per_cpu_ptr(pmu->ctx, cpu) = NULL;
-	}
-}
-
-static int amd_uncore_ctx_init(struct amd_uncore *uncore, unsigned int cpu)
-{
-	struct amd_uncore_ctx *curr, *prev;
-	struct amd_uncore_pmu *pmu;
-	int node, cid, gid, i, j;
-
-	if (!uncore->init_done || !uncore->num_pmus)
-		return 0;
-
-	cid = amd_uncore_ctx_cid(uncore, cpu);
-	gid = amd_uncore_ctx_gid(uncore, cpu);
-
-	for (i = 0; i < uncore->num_pmus; i++) {
-		pmu = &uncore->pmus[i];
-		*per_cpu_ptr(pmu->ctx, cpu) = NULL;
-		curr = NULL;
-
-		/* Check for group exclusivity */
-		if (gid != pmu->group)
-			continue;
-
-		/* Find a sibling context */
-		for_each_online_cpu(j) {
-			if (cpu == j)
-				continue;
-
-			prev = *per_cpu_ptr(pmu->ctx, j);
-			if (!prev)
-				continue;
-
-			if (cid == amd_uncore_ctx_cid(uncore, j)) {
-				curr = prev;
-				break;
-			}
-		}
-
-		/* Allocate context if sibling does not exist */
-		if (!curr) {
-			node = cpu_to_node(cpu);
-			curr = kzalloc_node(sizeof(*curr), GFP_KERNEL, node);
-			if (!curr)
-				goto fail;
-
-			curr->cpu = cpu;
-			curr->events = kzalloc_node(sizeof(*curr->events) *
-						    pmu->num_counters,
-						    GFP_KERNEL, node);
-			if (!curr->events) {
-				kfree(curr);
-				goto fail;
-			}
-
-			amd_uncore_init_hrtimer(curr);
-			curr->hrtimer_duration = (u64)update_interval * NSEC_PER_MSEC;
-
-			cpumask_set_cpu(cpu, &pmu->active_mask);
-		}
-
-		curr->refcnt++;
-		*per_cpu_ptr(pmu->ctx, cpu) = curr;
-	}
-
-	return 0;
-
-fail:
-	amd_uncore_ctx_free(uncore, cpu);
-
-	return -ENOMEM;
-}
-
-static void amd_uncore_ctx_move(struct amd_uncore *uncore, unsigned int cpu)
-{
-	struct amd_uncore_ctx *curr, *next;
-	struct amd_uncore_pmu *pmu;
-	int i, j;
-
-	if (!uncore->init_done)
-		return;
-
-	for (i = 0; i < uncore->num_pmus; i++) {
-		pmu = &uncore->pmus[i];
-		curr = *per_cpu_ptr(pmu->ctx, cpu);
-		if (!curr)
-			continue;
-
-		/* Migrate to a shared sibling if possible */
-		for_each_online_cpu(j) {
-			next = *per_cpu_ptr(pmu->ctx, j);
-			if (!next || cpu == j)
-				continue;
-
-			if (curr == next) {
-				perf_pmu_migrate_context(&pmu->pmu, cpu, j);
-				cpumask_clear_cpu(cpu, &pmu->active_mask);
-				cpumask_set_cpu(j, &pmu->active_mask);
-				next->cpu = j;
-				break;
-			}
-		}
-	}
-}
-
 static int amd_uncore_cpu_starting(unsigned int cpu)
 {
-	struct amd_uncore *uncore;
+	struct uncore_common *uncore;
 	int i;
 
 	for (i = 0; i < UNCORE_TYPE_MAX; i++) {
@@ -613,7 +179,7 @@ static int amd_uncore_cpu_starting(unsigned int cpu)
 
 static int amd_uncore_cpu_online(unsigned int cpu)
 {
-	struct amd_uncore *uncore;
+	struct uncore_common *uncore;
 	int i;
 
 	for (i = 0; i < UNCORE_TYPE_MAX; i++) {
@@ -627,7 +193,7 @@ static int amd_uncore_cpu_online(unsigned int cpu)
 
 static int amd_uncore_cpu_down_prepare(unsigned int cpu)
 {
-	struct amd_uncore *uncore;
+	struct uncore_common *uncore;
 	int i;
 
 	for (i = 0; i < UNCORE_TYPE_MAX; i++) {
@@ -640,7 +206,7 @@ static int amd_uncore_cpu_down_prepare(unsigned int cpu)
 
 static int amd_uncore_cpu_dead(unsigned int cpu)
 {
-	struct amd_uncore *uncore;
+	struct uncore_common *uncore;
 	int i;
 
 	for (i = 0; i < UNCORE_TYPE_MAX; i++) {
@@ -654,7 +220,7 @@ static int amd_uncore_cpu_dead(unsigned int cpu)
 static int amd_uncore_df_event_init(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	int ret = amd_uncore_event_init(event);
+	int ret = uncore_common_event_init(event);
 
 	hwc->config = event->attr.config &
 		      (pmu_version >= 2 ? AMD64_PERFMON_V2_RAW_EVENT_MASK_NB :
@@ -665,7 +231,7 @@ static int amd_uncore_df_event_init(struct perf_event *event)
 
 static int amd_uncore_df_add(struct perf_event *event, int flags)
 {
-	int ret = amd_uncore_add(event, flags & ~PERF_EF_START);
+	int ret = uncore_common_add(event, flags & ~PERF_EF_START);
 	struct hw_perf_event *hwc = &event->hw;
 
 	if (ret)
@@ -683,16 +249,16 @@ static int amd_uncore_df_add(struct perf_event *event, int flags)
 
 	/* Delayed start after rdpmc base update */
 	if (flags & PERF_EF_START)
-		amd_uncore_start(event, PERF_EF_RELOAD);
+		uncore_common_start(event, PERF_EF_RELOAD);
 
 	return 0;
 }
 
 static
-void amd_uncore_df_ctx_scan(struct amd_uncore *uncore, unsigned int cpu)
+void amd_uncore_df_ctx_scan(struct uncore_common *uncore, unsigned int cpu)
 {
 	union cpuid_0x80000022_ebx ebx;
-	union amd_uncore_info info;
+	union uncore_common_info info;
 
 	if (!boot_cpu_has(X86_FEATURE_PERFCTR_NB))
 		return;
@@ -711,17 +277,17 @@ void amd_uncore_df_ctx_scan(struct amd_uncore *uncore, unsigned int cpu)
 }
 
 static
-int amd_uncore_df_ctx_init(struct amd_uncore *uncore, unsigned int cpu)
+int amd_uncore_df_ctx_init(struct uncore_common *uncore, unsigned int cpu)
 {
 	struct attribute **df_attr = amd_uncore_df_format_attr;
-	struct amd_uncore_pmu *pmu;
+	struct uncore_common_pmu *pmu;
 	int num_counters;
 
 	/* Run just once */
 	if (uncore->init_done)
-		return amd_uncore_ctx_init(uncore, cpu);
+		return uncore_common_ctx_init(uncore, cpu);
 
-	num_counters = amd_uncore_ctx_num_pmcs(uncore, cpu);
+	num_counters = uncore_common_ctx_num_pmcs(uncore, cpu);
 	if (!num_counters)
 		goto done;
 
@@ -741,7 +307,7 @@ int amd_uncore_df_ctx_init(struct amd_uncore *uncore, unsigned int cpu)
 	pmu->num_counters = num_counters;
 	pmu->msr_base = MSR_F15H_NB_PERF_CTL;
 	pmu->rdpmc_base = RDPMC_BASE_NB;
-	pmu->group = amd_uncore_ctx_gid(uncore, cpu);
+	pmu->group = uncore_common_ctx_gid(uncore, cpu);
 
 	if (pmu_version >= 2) {
 		*df_attr++
=3D &format_attr_event14v2.attr; @@ -750,7 +316,7 @@ int amd_uncore_df_ctx_init(struct amd_uncore *uncore, u= nsigned int cpu) *df_attr =3D &format_attr_event14.attr; } =20 - pmu->ctx =3D alloc_percpu(struct amd_uncore_ctx *); + pmu->ctx =3D alloc_percpu(struct uncore_common_ctx *); if (!pmu->ctx) goto done; =20 @@ -760,10 +326,10 @@ int amd_uncore_df_ctx_init(struct amd_uncore *uncore,= unsigned int cpu) .name =3D pmu->name, .event_init =3D amd_uncore_df_event_init, .add =3D amd_uncore_df_add, - .del =3D amd_uncore_del, - .start =3D amd_uncore_start, - .stop =3D amd_uncore_stop, - .read =3D amd_uncore_read, + .del =3D uncore_common_del, + .start =3D uncore_common_start, + .stop =3D uncore_common_stop, + .read =3D uncore_common_read, .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, .module =3D THIS_MODULE, }; @@ -774,8 +340,7 @@ int amd_uncore_df_ctx_init(struct amd_uncore *uncore, u= nsigned int cpu) goto done; } =20 - pr_info("%d %s%s counters detected\n", pmu->num_counters, - boot_cpu_data.x86_vendor =3D=3D X86_VENDOR_HYGON ? 
"HYGON " : "", + pr_info("%d %s counters detected\n", pmu->num_counters, pmu->pmu.name); =20 uncore->num_pmus =3D 1; @@ -783,12 +348,12 @@ int amd_uncore_df_ctx_init(struct amd_uncore *uncore,= unsigned int cpu) done: uncore->init_done =3D true; =20 - return amd_uncore_ctx_init(uncore, cpu); + return uncore_common_ctx_init(uncore, cpu); } =20 static int amd_uncore_l3_event_init(struct perf_event *event) { - int ret =3D amd_uncore_event_init(event); + int ret =3D uncore_common_event_init(event); struct hw_perf_event *hwc =3D &event->hw; u64 config =3D event->attr.config; u64 mask; @@ -826,9 +391,9 @@ static int amd_uncore_l3_event_init(struct perf_event *= event) } =20 static -void amd_uncore_l3_ctx_scan(struct amd_uncore *uncore, unsigned int cpu) +void amd_uncore_l3_ctx_scan(struct uncore_common *uncore, unsigned int cpu) { - union amd_uncore_info info; + union uncore_common_info info; =20 if (!boot_cpu_has(X86_FEATURE_PERFCTR_LLC)) return; @@ -845,17 +410,17 @@ void amd_uncore_l3_ctx_scan(struct amd_uncore *uncore= , unsigned int cpu) } =20 static -int amd_uncore_l3_ctx_init(struct amd_uncore *uncore, unsigned int cpu) +int amd_uncore_l3_ctx_init(struct uncore_common *uncore, unsigned int cpu) { struct attribute **l3_attr =3D amd_uncore_l3_format_attr; - struct amd_uncore_pmu *pmu; + struct uncore_common_pmu *pmu; int num_counters; =20 /* Run just once */ if (uncore->init_done) - return amd_uncore_ctx_init(uncore, cpu); + return uncore_common_ctx_init(uncore, cpu); =20 - num_counters =3D amd_uncore_ctx_num_pmcs(uncore, cpu); + num_counters =3D uncore_common_ctx_num_pmcs(uncore, cpu); if (!num_counters) goto done; =20 @@ -875,7 +440,7 @@ int amd_uncore_l3_ctx_init(struct amd_uncore *uncore, u= nsigned int cpu) pmu->num_counters =3D num_counters; pmu->msr_base =3D MSR_F16H_L2I_PERF_CTL; pmu->rdpmc_base =3D RDPMC_BASE_LLC; - pmu->group =3D amd_uncore_ctx_gid(uncore, cpu); + pmu->group =3D uncore_common_ctx_gid(uncore, cpu); =20 if (boot_cpu_data.x86 >=3D 0x17) { 
*l3_attr++ =3D &format_attr_event8.attr; @@ -885,7 +450,7 @@ int amd_uncore_l3_ctx_init(struct amd_uncore *uncore, u= nsigned int cpu) &format_attr_threadmask8.attr; } =20 - pmu->ctx =3D alloc_percpu(struct amd_uncore_ctx *); + pmu->ctx =3D alloc_percpu(struct uncore_common_ctx *); if (!pmu->ctx) goto done; =20 @@ -895,11 +460,11 @@ int amd_uncore_l3_ctx_init(struct amd_uncore *uncore,= unsigned int cpu) .attr_update =3D amd_uncore_l3_attr_update, .name =3D pmu->name, .event_init =3D amd_uncore_l3_event_init, - .add =3D amd_uncore_add, - .del =3D amd_uncore_del, - .start =3D amd_uncore_start, - .stop =3D amd_uncore_stop, - .read =3D amd_uncore_read, + .add =3D uncore_common_add, + .del =3D uncore_common_del, + .start =3D uncore_common_start, + .stop =3D uncore_common_stop, + .read =3D uncore_common_read, .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, .module =3D THIS_MODULE, }; @@ -910,8 +475,7 @@ int amd_uncore_l3_ctx_init(struct amd_uncore *uncore, u= nsigned int cpu) goto done; } =20 - pr_info("%d %s%s counters detected\n", pmu->num_counters, - boot_cpu_data.x86_vendor =3D=3D X86_VENDOR_HYGON ? 
"HYGON " : "", + pr_info("%d %s counters detected\n", pmu->num_counters, pmu->pmu.name); =20 uncore->num_pmus =3D 1; @@ -919,14 +483,14 @@ int amd_uncore_l3_ctx_init(struct amd_uncore *uncore,= unsigned int cpu) done: uncore->init_done =3D true; =20 - return amd_uncore_ctx_init(uncore, cpu); + return uncore_common_ctx_init(uncore, cpu); } =20 static int amd_uncore_umc_event_init(struct perf_event *event) { struct hw_perf_event *hwc =3D &event->hw; - int ret =3D amd_uncore_event_init(event); =20 + int ret =3D uncore_common_event_init(event); if (ret) return ret; =20 @@ -937,12 +501,12 @@ static int amd_uncore_umc_event_init(struct perf_even= t *event) =20 static void amd_uncore_umc_start(struct perf_event *event, int flags) { - struct amd_uncore_pmu *pmu =3D event_to_amd_uncore_pmu(event); - struct amd_uncore_ctx *ctx =3D *per_cpu_ptr(pmu->ctx, event->cpu); + struct uncore_common_pmu *pmu =3D event_to_uncore_common_pmu(event); + struct uncore_common_ctx *ctx =3D *per_cpu_ptr(pmu->ctx, event->cpu); struct hw_perf_event *hwc =3D &event->hw; =20 if (!ctx->nr_active++) - amd_uncore_start_hrtimer(ctx); + uncore_common_start_hrtimer(ctx); =20 if (flags & PERF_EF_RELOAD) wrmsrq(hwc->event_base, (u64)local64_read(&hwc->prev_count)); @@ -987,31 +551,31 @@ static void amd_uncore_umc_read(struct perf_event *ev= ent) } =20 static -void amd_uncore_umc_ctx_scan(struct amd_uncore *uncore, unsigned int cpu) +void amd_uncore_umc_ctx_scan(struct uncore_common *uncore, unsigned int cp= u) { union cpuid_0x80000022_ebx ebx; - union amd_uncore_info info; + union uncore_common_info info =3D {}; unsigned int eax, ecx, edx; =20 if (pmu_version < 2) return; =20 cpuid(EXT_PERFMON_DEBUG_FEATURES, &eax, &ebx.full, &ecx, &edx); - info.split.aux_data =3D ecx; /* stash active mask */ + info.split.aux_data =3D ecx; info.split.num_pmcs =3D ebx.split.num_umc_pmc; info.split.gid =3D topology_logical_package_id(cpu); info.split.cid =3D topology_logical_package_id(cpu); + *per_cpu_ptr(uncore->info, cpu) 
=3D info; } =20 -static -int amd_uncore_umc_ctx_init(struct amd_uncore *uncore, unsigned int cpu) +static int amd_uncore_umc_ctx_init(struct uncore_common *uncore, unsigned = int cpu) { DECLARE_BITMAP(gmask, UNCORE_GROUP_MAX) =3D { 0 }; u8 group_num_pmus[UNCORE_GROUP_MAX] =3D { 0 }; u8 group_num_pmcs[UNCORE_GROUP_MAX] =3D { 0 }; - union amd_uncore_info info; - struct amd_uncore_pmu *pmu; + union uncore_common_info info; + struct uncore_common_pmu *pmu; int gid, i; u16 index =3D 0; =20 @@ -1020,12 +584,13 @@ int amd_uncore_umc_ctx_init(struct amd_uncore *uncor= e, unsigned int cpu) =20 /* Run just once */ if (uncore->init_done) - return amd_uncore_ctx_init(uncore, cpu); + return uncore_common_ctx_init(uncore, cpu); =20 /* Find unique groups */ for_each_online_cpu(i) { info =3D *per_cpu_ptr(uncore->info, i); gid =3D info.split.gid; + if (test_bit(gid, gmask)) continue; =20 @@ -1050,8 +615,7 @@ int amd_uncore_umc_ctx_init(struct amd_uncore *uncore,= unsigned int cpu) pmu->msr_base =3D MSR_F19H_UMC_PERF_CTL + i * pmu->num_counters * 2; pmu->rdpmc_base =3D -1; pmu->group =3D gid; - - pmu->ctx =3D alloc_percpu(struct amd_uncore_ctx *); + pmu->ctx =3D alloc_percpu(struct uncore_common_ctx *); if (!pmu->ctx) goto done; =20 @@ -1060,10 +624,10 @@ int amd_uncore_umc_ctx_init(struct amd_uncore *uncor= e, unsigned int cpu) .attr_groups =3D amd_uncore_umc_attr_groups, .name =3D pmu->name, .event_init =3D amd_uncore_umc_event_init, - .add =3D amd_uncore_add, - .del =3D amd_uncore_del, + .add =3D uncore_common_add, + .del =3D uncore_common_del, .start =3D amd_uncore_umc_start, - .stop =3D amd_uncore_stop, + .stop =3D uncore_common_stop, .read =3D amd_uncore_umc_read, .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, .module =3D THIS_MODULE, @@ -1086,41 +650,40 @@ int amd_uncore_umc_ctx_init(struct amd_uncore *uncor= e, unsigned int cpu) uncore->num_pmus =3D index; uncore->init_done =3D true; =20 - return amd_uncore_ctx_init(uncore, cpu); + return 
uncore_common_ctx_init(uncore, cpu); } =20 -static struct amd_uncore uncores[UNCORE_TYPE_MAX] =3D { +static struct uncore_common uncores[UNCORE_TYPE_MAX] =3D { /* UNCORE_TYPE_DF */ { .scan =3D amd_uncore_df_ctx_scan, .init =3D amd_uncore_df_ctx_init, - .move =3D amd_uncore_ctx_move, - .free =3D amd_uncore_ctx_free, + .move =3D uncore_common_ctx_move, + .free =3D uncore_common_ctx_free, }, /* UNCORE_TYPE_L3 */ { .scan =3D amd_uncore_l3_ctx_scan, .init =3D amd_uncore_l3_ctx_init, - .move =3D amd_uncore_ctx_move, - .free =3D amd_uncore_ctx_free, + .move =3D uncore_common_ctx_move, + .free =3D uncore_common_ctx_free, }, /* UNCORE_TYPE_UMC */ { .scan =3D amd_uncore_umc_ctx_scan, .init =3D amd_uncore_umc_ctx_init, - .move =3D amd_uncore_ctx_move, - .free =3D amd_uncore_ctx_free, + .move =3D uncore_common_ctx_move, + .free =3D uncore_common_ctx_free, }, }; =20 static int __init amd_uncore_init(void) { - struct amd_uncore *uncore; + struct uncore_common *uncore; int ret =3D -ENODEV; int i; =20 - if (boot_cpu_data.x86_vendor !=3D X86_VENDOR_AMD && - boot_cpu_data.x86_vendor !=3D X86_VENDOR_HYGON) + if (boot_cpu_data.x86_vendor !=3D X86_VENDOR_AMD) return -ENODEV; =20 if (!boot_cpu_has(X86_FEATURE_TOPOEXT)) @@ -1129,6 +692,8 @@ static int __init amd_uncore_init(void) if (boot_cpu_has(X86_FEATURE_PERFMON_V2)) pmu_version =3D 2; =20 + uncore_common_set_update_interval(update_interval); + for (i =3D 0; i < UNCORE_TYPE_MAX; i++) { uncore =3D &uncores[i]; =20 @@ -1137,7 +702,7 @@ static int __init amd_uncore_init(void) BUG_ON(!uncore->move); BUG_ON(!uncore->free); =20 - uncore->info =3D alloc_percpu(union amd_uncore_info); + uncore->info =3D alloc_percpu(union uncore_common_info); if (!uncore->info) { ret =3D -ENOMEM; goto fail; @@ -1186,8 +751,8 @@ static int __init amd_uncore_init(void) =20 static void __exit amd_uncore_exit(void) { - struct amd_uncore *uncore; - struct amd_uncore_pmu *pmu; + struct uncore_common *uncore; + struct uncore_common_pmu *pmu; int i, j; =20 
cpuhp_remove_state(CPUHP_AP_PERF_X86_AMD_UNCORE_ONLINE); --=20 2.34.1 From nobody Mon May 25 03:55:48 2026 Received: from mailgw1.hygon.cn (unknown [101.204.27.37]) by smtp.subspace.kernel.org (Postfix) with ESMTP id E6D7F2673AA; Tue, 19 May 2026 03:32:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=101.204.27.37 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779161590; cv=none; b=V9w/W73qNUVEzvBZZXV9Sw0NsodL3P9JNgoVzwNRhFtZP53v+flkX5NPE61uewtc1BZqRNTJS3WxhE0teQYSWW+pvEpmgBuRFQwB88Ot81/Vo16A6P3zvINK+X/zhns0F0b8UK1jrO4suUdx8wj1jrYrwRXee4ZJOpqJWwUSCVU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779161590; c=relaxed/simple; bh=yc5obdwewimg5iwq+DDFeT+P/zi++j0aQ6h1d1Xq01A=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=XtEgEyBfibfdbcRQHkwiG/K3gLWhSbuBDpNLs9Vq/MuEVpxURRRB9eP3Lx6GbMorU128Y0FfXvd8sfzz4D4AmowTdouB2bl4V8syZsUUxTwNtmajM7PUkgazOJvzkj00CQ0Jen97/5EMJ8yFeU62FQTyjjBYKTePLA6Zyge7IlA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=hygon.cn; spf=fail smtp.mailfrom=hygon.cn; arc=none smtp.client-ip=101.204.27.37 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=hygon.cn Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=hygon.cn Received: from maildlp1.hygon.cn (unknown [127.0.0.1]) by mailgw1.hygon.cn (Postfix) with ESMTP id 4gKKxV3KVMzT1qQ; Tue, 19 May 2026 11:32:46 +0800 (CST) Received: from maildlp1.hygon.cn (unknown [172.23.18.60]) by mailgw1.hygon.cn (Postfix) with ESMTP id 4gKKxS0xdZzT1qQ; Tue, 19 May 2026 11:32:44 +0800 (CST) Received: from cncheex04.Hygon.cn (unknown [172.23.18.114]) by maildlp1.hygon.cn (Postfix) with ESMTPS id 3BE795955; Tue, 19 May 2026 11:32:39 +0800 (CST) Received: from neptune.hygon.cn (172.22.228.106) by cncheex04.Hygon.cn (172.23.18.114) with Microsoft SMTP Server 
(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.36; Tue, 19 May 2026 11:32:43 +0800 From: Qi Liu To: , , , , , CC: , , , , Zhenglang Hu Subject: [RFC PATCH 3/3] perf/x86/amd/uncore: Add Hygon uncore PMU support Date: Tue, 19 May 2026 03:32:25 +0000 Message-ID: <20260519033225.1479907-4-liuqi@hygon.cn> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20260519033225.1479907-1-liuqi@hygon.cn> References: <20260519033225.1479907-1-liuqi@hygon.cn> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: cncheex06.Hygon.cn (172.23.18.116) To cncheex04.Hygon.cn (172.23.18.114) Content-Type: text/plain; charset="utf-8"

Add uncore PMU support for Hygon processors based on the shared AMD-family uncore framework.

Hygon processors implement uncore PMUs for Data Fabric units with a programming model similar to AMD's, but with several differences:

- Different MSR base addresses
- Vendor-specific event encoding and masks
- DF IOD counter allocation semantics
- Additional format attributes (e.g. iod, constid)

Reuse the common uncore infrastructure introduced in the previous patch. This avoids duplicating the full uncore driver while keeping vendor-specific logic isolated.

Signed-off-by: Qi Liu Tested-by: Zhenglang Hu --- arch/x86/events/Kconfig | 11 + arch/x86/events/amd/Makefile | 2 + arch/x86/events/amd/hygon_uncore.c | 567 +++++++++++++++++++++++++++++ arch/x86/include/asm/msr-index.h | 2 + arch/x86/include/asm/perf_event.h | 20 + include/linux/cpuhotplug.h | 3 + 6 files changed, 605 insertions(+) create mode 100644 arch/x86/events/amd/hygon_uncore.c diff --git a/arch/x86/events/Kconfig b/arch/x86/events/Kconfig index dabdf3d7bf84..cc8387236b95 100644 --- a/arch/x86/events/Kconfig +++ b/arch/x86/events/Kconfig @@ -45,6 +45,17 @@ config PERF_EVENTS_AMD_UNCORE To compile this driver as a module, choose M here: the module will be called 'amd-uncore'. =20 +config PERF_EVENTS_HYGON_UNCORE + tristate "Hygon Uncore performance events" + depends on PERF_EVENTS && CPU_SUP_HYGON + default y + help + Include support for Hygon uncore performance events for use with + e.g., perf stat -e hygon_df/.../. + + To compile this driver as a module, choose M here: the + module will be called 'hygon-uncore'. 
+ config PERF_EVENTS_AMD_BRS depends on PERF_EVENTS && CPU_SUP_AMD bool "AMD Zen3 Branch Sampling support" diff --git a/arch/x86/events/amd/Makefile b/arch/x86/events/amd/Makefile index f951ae64ee36..32bf7aae2368 100644 --- a/arch/x86/events/amd/Makefile +++ b/arch/x86/events/amd/Makefile @@ -5,6 +5,8 @@ obj-$(CONFIG_PERF_EVENTS_AMD_POWER) +=3D power.o obj-$(CONFIG_X86_LOCAL_APIC) +=3D ibs.o obj-$(CONFIG_PERF_EVENTS_AMD_UNCORE) +=3D amd-uncore.o amd-uncore-objs :=3D uncore_common.o uncore.o +obj-$(CONFIG_PERF_EVENTS_HYGON_UNCORE) +=3D hygon-uncore.o +hygon-uncore-objs :=3D uncore_common.o hygon_uncore.o ifdef CONFIG_AMD_IOMMU obj-$(CONFIG_CPU_SUP_AMD) +=3D iommu.o endif diff --git a/arch/x86/events/amd/hygon_uncore.c b/arch/x86/events/amd/hygon= _uncore.c new file mode 100644 index 000000000000..133b6b1923de --- /dev/null +++ b/arch/x86/events/amd/hygon_uncore.c @@ -0,0 +1,567 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2026 Chengdu Haiguang IC Design Co., Ltd. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "uncore_common.h" + +#define NUM_COUNTERS_DF 4 + +#undef pr_fmt +#define pr_fmt(fmt) "hygon_uncore: " fmt + +enum { + HYGON_UNCORE_TYPE_DF, + HYGON_UNCORE_TYPE_DF_IOD, + HYGON_UNCORE_TYPE_MAX, +}; + +/* Interval for hrtimer, defaults to 60000 milliseconds */ +static unsigned int update_interval =3D 60 * MSEC_PER_SEC; +module_param(update_interval, uint, 0444); + +static struct uncore_common hygon_uncores[HYGON_UNCORE_TYPE_MAX]; + +static __always_inline bool hygon_uncore_is_df_iod(struct uncore_common_pm= u *pmu) +{ + struct uncore_common *uncore =3D pmu->private; + + return uncore =3D=3D &hygon_uncores[HYGON_UNCORE_TYPE_DF_IOD]; +} + +static __always_inline int hygon_uncore_num_iods(struct uncore_common *unc= ore, unsigned int cpu) +{ + union uncore_common_info *info =3D per_cpu_ptr(uncore->info, cpu); + + return info->split.private; +} + +static 
u64 hygon_uncore_df_event_mask(void) +{ + if (boot_cpu_data.x86_model =3D=3D 0x4 || + boot_cpu_data.x86_model =3D=3D 0x5) + return HYGON_F18H_M4H_RAW_EVENT_MASK_DF; + + if (boot_cpu_data.x86_model >=3D 0x6 && + boot_cpu_data.x86_model <=3D 0x18) + return HYGON_F18H_M6H_RAW_EVENT_MASK_DF; + + return HYGON_F18H_RAW_EVENT_MASK_DF; +} + +static ssize_t cpumask_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct pmu *ptr =3D dev_get_drvdata(dev); + struct uncore_common_pmu *pmu; + + pmu =3D container_of(ptr, struct uncore_common_pmu, pmu); + + return cpumap_print_to_pagebuf(true, buf, &pmu->active_mask); +} +static DEVICE_ATTR_RO(cpumask); + +static struct attribute *hygon_uncore_attrs[] =3D { + &dev_attr_cpumask.attr, + NULL, +}; + +static struct attribute_group hygon_uncore_attr_group =3D { + .attrs =3D hygon_uncore_attrs, +}; + +#define DEFINE_UNCORE_FORMAT_ATTR(_var, _name, _format) \ +static ssize_t __uncore_##_var##_show(struct device *dev, \ + struct device_attribute *attr, \ + char *page) \ +{ \ + BUILD_BUG_ON(sizeof(_format) >=3D PAGE_SIZE); \ + return sprintf(page, _format "\n"); \ +} \ +static struct device_attribute format_attr_##_var =3D \ + __ATTR(_name, 0444, __uncore_##_var##_show, NULL) + +DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-5"); +DEFINE_UNCORE_FORMAT_ATTR(umask8, umask, "config:8-15"); +DEFINE_UNCORE_FORMAT_ATTR(umask10, umask, "config:8-17"); +DEFINE_UNCORE_FORMAT_ATTR(umask12, umask, "config:8-19"); +DEFINE_UNCORE_FORMAT_ATTR(constid, constid, "config:6-7,32-35,61-62"); +DEFINE_UNCORE_FORMAT_ATTR(iod, iod, "config1:0-1"); + +static struct attribute *hygon_uncore_df_format_attr[] =3D { + &format_attr_event.attr, + &format_attr_umask8.attr, + &format_attr_constid.attr, + NULL, +}; + +static struct attribute *hygon_uncore_df_iod_format_attr[] =3D { + &format_attr_event.attr, + &format_attr_umask10.attr, + &format_attr_constid.attr, + &format_attr_iod.attr, + NULL, +}; + +static struct attribute_group 
hygon_uncore_df_format_group =3D { + .name =3D "format", + .attrs =3D hygon_uncore_df_format_attr, +}; + +static struct attribute_group hygon_uncore_df_iod_format_group =3D { + .name =3D "format", + .attrs =3D hygon_uncore_df_iod_format_attr, +}; + +static const struct attribute_group *hygon_uncore_df_attr_groups[] =3D { + &hygon_uncore_attr_group, + &hygon_uncore_df_format_group, + NULL, +}; + +static const struct attribute_group *hygon_uncore_df_iod_attr_groups[] =3D= { + &hygon_uncore_attr_group, + &hygon_uncore_df_iod_format_group, + NULL, +}; + +static int hygon_uncore_df_event_init(struct perf_event *event) +{ + struct hw_perf_event *hwc =3D &event->hw; + struct uncore_common_pmu *pmu; + struct uncore_common *uncore; + u64 event_mask; + int ret; + + ret =3D uncore_common_event_init(event); + if (ret) + return ret; + + pmu =3D event_to_uncore_common_pmu(event); + uncore =3D pmu->private; + + if (hygon_uncore_is_df_iod(pmu) && + event->attr.config1 >=3D hygon_uncore_num_iods(uncore, event->cpu)) + return -EINVAL; + + event_mask =3D hygon_uncore_df_event_mask(); + hwc->config =3D event->attr.config & event_mask; + + return 0; +} + +static int hygon_uncore_df_iod_add(struct perf_event *event, int flags) +{ + struct uncore_common_pmu *pmu =3D event_to_uncore_common_pmu(event); + struct uncore_common_ctx *ctx =3D *per_cpu_ptr(pmu->ctx, event->cpu); + struct hw_perf_event *hwc =3D &event->hw; + int iod_idx; + int i; + + if (hwc->idx !=3D -1 && ctx->events[hwc->idx] =3D=3D event) + goto out; + + for (i =3D 0; i < pmu->num_counters; i++) { + if (ctx->events[i] =3D=3D event) { + hwc->idx =3D i; + goto out; + } + } + + hwc->idx =3D -1; + iod_idx =3D event->attr.config1; + + for (i =3D iod_idx * NUM_COUNTERS_DF; i < (iod_idx + 1) * NUM_COUNTERS_DF= ; i++) { + struct perf_event *tmp =3D NULL; + + if (try_cmpxchg(&ctx->events[i], &tmp, event)) { + hwc->idx =3D i; + break; + } + } + +out: + if (hwc->idx =3D=3D -1) + return -EBUSY; + + hwc->config_base =3D pmu->msr_base + 2 
* hwc->idx; + hwc->event_base =3D pmu->msr_base + 1 + 2 * hwc->idx; + hwc->event_base_rdpmc =3D -1; + hwc->state =3D PERF_HES_UPTODATE | PERF_HES_STOPPED; + + if (flags & PERF_EF_START) + event->pmu->start(event, PERF_EF_RELOAD); + + return 0; +} + +static int hygon_uncore_cpu_starting(unsigned int cpu) +{ + struct uncore_common *uncore; + int i; + + for (i =3D 0; i < HYGON_UNCORE_TYPE_MAX; i++) { + uncore =3D &hygon_uncores[i]; + uncore->scan(uncore, cpu); + } + + return 0; +} + +static int hygon_uncore_cpu_online(unsigned int cpu) +{ + struct uncore_common *uncore; + int ret; + int i; + + for (i =3D 0; i < HYGON_UNCORE_TYPE_MAX; i++) { + uncore =3D &hygon_uncores[i]; + + ret =3D uncore->init(uncore, cpu); + if (ret) + return ret; + } + + return 0; +} + +static int hygon_uncore_cpu_down_prepare(unsigned int cpu) +{ + struct uncore_common *uncore; + int i; + + for (i =3D 0; i < HYGON_UNCORE_TYPE_MAX; i++) { + uncore =3D &hygon_uncores[i]; + uncore->move(uncore, cpu); + } + + return 0; +} + +static int hygon_uncore_cpu_dead(unsigned int cpu) +{ + struct uncore_common *uncore; + int i; + + for (i =3D 0; i < HYGON_UNCORE_TYPE_MAX; i++) { + uncore =3D &hygon_uncores[i]; + uncore->free(uncore, cpu); + } + + return 0; +} + +static int hygon_uncore_df_ctx_init(struct uncore_common *uncore, + unsigned int cpu) +{ + struct attribute *df_attr; + struct uncore_common_pmu *pmu; + int num_counters; + + if (uncore->init_done) + return uncore_common_ctx_init(uncore, cpu); + + num_counters =3D uncore_common_ctx_num_pmcs(uncore, cpu); + if (!num_counters) + goto done; + + uncore->pmus =3D kzalloc_obj(*uncore->pmus); + if (!uncore->pmus) + goto done; + + pmu =3D &uncore->pmus[0]; + strscpy(pmu->name, "hygon_df", sizeof(pmu->name)); + pmu->num_counters =3D num_counters; + pmu->msr_base =3D MSR_HYGON_F18H_DF_CTL; + pmu->rdpmc_base =3D -1; + pmu->group =3D uncore_common_ctx_gid(uncore, cpu); + pmu->private =3D uncore; + + df_attr =3D &format_attr_umask8.attr; + if 
(boot_cpu_data.x86_model =3D=3D 0x4 || + boot_cpu_data.x86_model =3D=3D 0x5) + df_attr =3D &format_attr_umask10.attr; + else if (boot_cpu_data.x86_model >=3D 0x6 && + boot_cpu_data.x86_model <=3D 0x18) + df_attr =3D &format_attr_umask12.attr; + hygon_uncore_df_format_attr[1] =3D df_attr; + + pmu->ctx =3D alloc_percpu(struct uncore_common_ctx *); + if (!pmu->ctx) + goto done; + + pmu->pmu =3D (struct pmu) { + .task_ctx_nr =3D perf_invalid_context, + .attr_groups =3D hygon_uncore_df_attr_groups, + .name =3D pmu->name, + .event_init =3D hygon_uncore_df_event_init, + .add =3D uncore_common_add, + .del =3D uncore_common_del, + .start =3D uncore_common_start, + .stop =3D uncore_common_stop, + .read =3D uncore_common_read, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, + .module =3D THIS_MODULE, + }; + + if (perf_pmu_register(&pmu->pmu, pmu->pmu.name, -1)) { + free_percpu(pmu->ctx); + pmu->ctx =3D NULL; + goto done; + } + + pr_info("%d %s counters detected\n", pmu->num_counters, pmu->pmu.name); + uncore->num_pmus =3D 1; + +done: + uncore->init_done =3D true; + return uncore_common_ctx_init(uncore, cpu); +} + +static void hygon_uncore_df_ctx_scan(struct uncore_common *uncore, + unsigned int cpu) +{ + unsigned int eax, ebx, ecx, edx; + union uncore_common_info info; + + if (!boot_cpu_has(X86_FEATURE_PERFCTR_NB)) + return; + + info.split.gid =3D 0; + info.split.aux_data =3D 0; + info.split.num_pmcs =3D NUM_COUNTERS_DF; + + cpuid(0x8000001e, &eax, &ebx, &ecx, &edx); + info.split.cid =3D ecx & 0xff; + + *per_cpu_ptr(uncore->info, cpu) =3D info; +} + +static void hygon_uncore_df_iod_ctx_scan(struct uncore_common *uncore, + unsigned int cpu) +{ + int num_packages, iods_per_package; + union uncore_common_info info; + + if (!boot_cpu_has(X86_FEATURE_PERFCTR_NB)) + return; + + if (boot_cpu_data.x86_model < 0x4 || boot_cpu_data.x86_model =3D=3D 0x6) + return; + + num_packages =3D topology_max_packages(); + iods_per_package =3D amd_nb_num() / num_packages - 
topology_max_dies_per_= package(); + if (iods_per_package <=3D 0) + return; + + info.split.cid =3D topology_physical_package_id(cpu); + info.split.gid =3D 0; + info.split.private =3D iods_per_package; + info.split.num_pmcs =3D NUM_COUNTERS_DF * iods_per_package; + + *per_cpu_ptr(uncore->info, cpu) =3D info; +} + +static int hygon_uncore_df_iod_ctx_init(struct uncore_common *uncore, + unsigned int cpu) +{ + struct uncore_common_pmu *pmu; + int num_counters; + + if (uncore->init_done) + return uncore_common_ctx_init(uncore, cpu); + + num_counters =3D uncore_common_ctx_num_pmcs(uncore, cpu); + if (!num_counters) + goto done; + + uncore->pmus =3D kzalloc_obj(*uncore->pmus); + if (!uncore->pmus) + goto done; + + pmu =3D &uncore->pmus[0]; + strscpy(pmu->name, "hygon_df_iod", sizeof(pmu->name)); + pmu->num_counters =3D num_counters; + pmu->msr_base =3D MSR_HYGON_F18H_DF_IOD_CTL; + pmu->rdpmc_base =3D -1; + pmu->group =3D uncore_common_ctx_gid(uncore, cpu); + pmu->private =3D uncore; + + if (boot_cpu_data.x86_model >=3D 0x6 && + boot_cpu_data.x86_model <=3D 0x18) + hygon_uncore_df_iod_format_attr[1] =3D &format_attr_umask12.attr; + + pmu->ctx =3D alloc_percpu(struct uncore_common_ctx *); + if (!pmu->ctx) + goto done; + + pmu->pmu =3D (struct pmu) { + .task_ctx_nr =3D perf_invalid_context, + .attr_groups =3D hygon_uncore_df_iod_attr_groups, + .name =3D pmu->name, + .event_init =3D hygon_uncore_df_event_init, + .add =3D hygon_uncore_df_iod_add, + .del =3D uncore_common_del, + .start =3D uncore_common_start, + .stop =3D uncore_common_stop, + .read =3D uncore_common_read, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, + .module =3D THIS_MODULE, + }; + + if (perf_pmu_register(&pmu->pmu, pmu->pmu.name, -1)) { + free_percpu(pmu->ctx); + pmu->ctx =3D NULL; + goto done; + } + + pr_info("%d %s counters detected\n", pmu->num_counters, pmu->pmu.name); + uncore->num_pmus =3D 1; + +done: + uncore->init_done =3D true; + return uncore_common_ctx_init(uncore, 
cpu); +} + +static struct uncore_common hygon_uncores[HYGON_UNCORE_TYPE_MAX] =3D { + /* HYGON_UNCORE_TYPE_DF */ + { + .scan =3D hygon_uncore_df_ctx_scan, + .init =3D hygon_uncore_df_ctx_init, + .move =3D uncore_common_ctx_move, + .free =3D uncore_common_ctx_free, + }, + /* HYGON_UNCORE_TYPE_DF_IOD */ + { + .scan =3D hygon_uncore_df_iod_ctx_scan, + .init =3D hygon_uncore_df_iod_ctx_init, + .move =3D uncore_common_ctx_move, + .free =3D uncore_common_ctx_free, + }, +}; + +static int __init hygon_uncore_init(void) +{ + struct uncore_common *uncore; + int ret =3D -ENODEV; + int i; + + if (boot_cpu_data.x86_vendor !=3D X86_VENDOR_HYGON) + return -ENODEV; + + if (!boot_cpu_has(X86_FEATURE_TOPOEXT)) + return -ENODEV; + + uncore_common_set_update_interval(update_interval); + + for (i =3D 0; i < HYGON_UNCORE_TYPE_MAX; i++) { + uncore =3D &hygon_uncores[i]; + + if (WARN_ON_ONCE(!uncore->scan || + !uncore->init || + !uncore->move || + !uncore->free)) { + ret =3D -EINVAL; + goto fail; + } + + uncore->info =3D alloc_percpu(union uncore_common_info); + if (!uncore->info) { + ret =3D -ENOMEM; + goto fail; + } + } + + ret =3D cpuhp_setup_state(CPUHP_PERF_X86_HYGON_UNCORE_PREP, + "perf/x86/hygon/uncore:prepare", + NULL, hygon_uncore_cpu_dead); + if (ret) + goto fail; + + ret =3D cpuhp_setup_state(CPUHP_AP_PERF_X86_HYGON_UNCORE_STARTING, + "perf/x86/hygon/uncore:starting", + hygon_uncore_cpu_starting, NULL); + if (ret) + goto fail_prep; + + ret =3D cpuhp_setup_state(CPUHP_AP_PERF_X86_HYGON_UNCORE_ONLINE, + "perf/x86/hygon/uncore:online", + hygon_uncore_cpu_online, + hygon_uncore_cpu_down_prepare); + if (ret) + goto fail_start; + + return 0; + +fail_start: + cpuhp_remove_state(CPUHP_AP_PERF_X86_HYGON_UNCORE_STARTING); +fail_prep: + cpuhp_remove_state(CPUHP_PERF_X86_HYGON_UNCORE_PREP); +fail: + for (i =3D 0; i < HYGON_UNCORE_TYPE_MAX; i++) { + uncore =3D &hygon_uncores[i]; + + if (uncore->info) { + free_percpu(uncore->info); + uncore->info =3D NULL; + } + } + + return ret; +} + +static
void __exit hygon_uncore_exit(void) +{ + struct uncore_common *uncore; + struct uncore_common_pmu *pmu; + int i, j; + + cpuhp_remove_state(CPUHP_AP_PERF_X86_HYGON_UNCORE_ONLINE); + cpuhp_remove_state(CPUHP_AP_PERF_X86_HYGON_UNCORE_STARTING); + cpuhp_remove_state(CPUHP_PERF_X86_HYGON_UNCORE_PREP); + + for (i =3D 0; i < HYGON_UNCORE_TYPE_MAX; i++) { + uncore =3D &hygon_uncores[i]; + + if (!uncore->info) + continue; + + free_percpu(uncore->info); + uncore->info =3D NULL; + + for (j =3D 0; j < uncore->num_pmus; j++) { + pmu =3D &uncore->pmus[j]; + + if (!pmu->ctx) + continue; + + perf_pmu_unregister(&pmu->pmu); + free_percpu(pmu->ctx); + pmu->ctx =3D NULL; + } + + kfree(uncore->pmus); + uncore->pmus =3D NULL; + } +} + +module_init(hygon_uncore_init); +module_exit(hygon_uncore_exit); + +MODULE_DESCRIPTION("Hygon Uncore Driver"); +MODULE_LICENSE("GPL"); diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-in= dex.h index 86554de9a3f5..b81d0003e295 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -851,6 +851,8 @@ #define MSR_F15H_PTSC 0xc0010280 #define MSR_F15H_IC_CFG 0xc0011021 #define MSR_F15H_EX_CFG 0xc001102c +#define MSR_HYGON_F18H_DF_CTL 0xc0010240 +#define MSR_HYGON_F18H_DF_IOD_CTL 0xc0010250 =20 /* Fam 10h MSRs */ #define MSR_FAM10H_MMIO_CONF_BASE 0xc0010058 diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_= event.h index 752cb319d5ea..9e56b5c94d4b 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -122,6 +122,26 @@ (AMD64_PERFMON_V2_EVENTSEL_EVENT_NB | \ AMD64_PERFMON_V2_EVENTSEL_UMASK_NB) =20 +#define HYGON_F18H_EVENTSEL_EVENT_DF 0x0000003FULL +#define HYGON_F18H_EVENTSEL_UMASK_DF 0x0000FF00ULL +#define HYGON_F18H_M4H_EVENTSEL_UMASK_DF 0x0003FF00ULL +#define HYGON_F18H_M6H_EVENTSEL_UMASK_DF 0x000FFF00ULL +#define HYGON_F18H_EVENTSEL_CONSTID_DF \ + (0x000000C0ULL | GENMASK_ULL(35, 32) | GENMASK_ULL(62, 61)) + +#define 
HYGON_F18H_RAW_EVENT_MASK_DF \ + (HYGON_F18H_EVENTSEL_EVENT_DF | \ + HYGON_F18H_EVENTSEL_UMASK_DF | \ + HYGON_F18H_EVENTSEL_CONSTID_DF) +#define HYGON_F18H_M4H_RAW_EVENT_MASK_DF \ + (HYGON_F18H_EVENTSEL_EVENT_DF | \ + HYGON_F18H_M4H_EVENTSEL_UMASK_DF | \ + HYGON_F18H_EVENTSEL_CONSTID_DF) +#define HYGON_F18H_M6H_RAW_EVENT_MASK_DF \ + (HYGON_F18H_EVENTSEL_EVENT_DF | \ + HYGON_F18H_M6H_EVENTSEL_UMASK_DF | \ + HYGON_F18H_EVENTSEL_CONSTID_DF) + #define AMD64_PERFMON_V2_ENABLE_UMC BIT_ULL(31) #define AMD64_PERFMON_V2_EVENTSEL_EVENT_UMC GENMASK_ULL(7, 0) #define AMD64_PERFMON_V2_EVENTSEL_RDWRMASK_UMC GENMASK_ULL(9, 8) diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 22ba327ec227..25f64bdfb5dc 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -62,6 +62,7 @@ enum cpuhp_state { CPUHP_CREATE_THREADS, CPUHP_PERF_X86_PREPARE, CPUHP_PERF_X86_AMD_UNCORE_PREP, + CPUHP_PERF_X86_HYGON_UNCORE_PREP, CPUHP_PERF_POWER, CPUHP_PERF_SUPERH, CPUHP_X86_HPET_DEAD, @@ -148,6 +149,7 @@ enum cpuhp_state { CPUHP_AP_IRQ_RISCV_SBI_IPI_STARTING, CPUHP_AP_ARM_MVEBU_COHERENCY, CPUHP_AP_PERF_X86_AMD_UNCORE_STARTING, + CPUHP_AP_PERF_X86_HYGON_UNCORE_STARTING, CPUHP_AP_PERF_X86_STARTING, CPUHP_AP_PERF_X86_AMD_IBS_STARTING, CPUHP_AP_PERF_XTENSA_STARTING, @@ -205,6 +207,7 @@ enum cpuhp_state { CPUHP_AP_PERF_X86_ONLINE, CPUHP_AP_PERF_X86_UNCORE_ONLINE, CPUHP_AP_PERF_X86_AMD_UNCORE_ONLINE, + CPUHP_AP_PERF_X86_HYGON_UNCORE_ONLINE, CPUHP_AP_PERF_X86_AMD_POWER_ONLINE, CPUHP_AP_PERF_S390_CF_ONLINE, CPUHP_AP_PERF_S390_SF_ONLINE, --=20 2.34.1
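
Note on the DF IOD counter allocation in hygon_uncore_df_iod_add() above: the arithmetic can be sketched in user space as follows. This is only an illustration, not part of the patch. The two constants are copied from the patch; the helper names (iod_first_slot, iod_last_slot, ctl_msr, ctr_msr) are hypothetical and exist only for this sketch. It assumes, as the driver code does, that each IOD owns NUM_COUNTERS_DF consecutive counter slots and that each slot maps to a control/counter MSR pair.

```c
#include <stdint.h>

/* Constants copied from the patch. */
#define NUM_COUNTERS_DF            4
#define MSR_HYGON_F18H_DF_IOD_CTL  0xc0010250u

/* Hypothetical helper: first counter slot reserved for IOD 'iod_idx'. */
static int iod_first_slot(int iod_idx)
{
	return iod_idx * NUM_COUNTERS_DF;
}

/* Hypothetical helper: last counter slot (inclusive) for IOD 'iod_idx'. */
static int iod_last_slot(int iod_idx)
{
	return (iod_idx + 1) * NUM_COUNTERS_DF - 1;
}

/* Mirrors hwc->config_base = pmu->msr_base + 2 * hwc->idx. */
static uint32_t ctl_msr(uint32_t msr_base, int idx)
{
	return msr_base + 2 * (uint32_t)idx;
}

/* Mirrors hwc->event_base = pmu->msr_base + 1 + 2 * hwc->idx. */
static uint32_t ctr_msr(uint32_t msr_base, int idx)
{
	return msr_base + 1 + 2 * (uint32_t)idx;
}
```

Under these assumptions, an event opened with iod=1 (config1 = 1) can only land in slots 4 to 7, and slot 4 would use MSRs 0xc0010258 (control) and 0xc0010259 (counter). If all four slots of that IOD are taken, the add callback returns -EBUSY even when other IODs still have free counters.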