From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:16:37 -0800
In-Reply-To:
<20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-2-seanjc@google.com>
Subject: [PATCH v6 01/44] perf: Skip pmu_ctx based on event_type
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

From: Kan Liang

To optimize the cgroup context switch, the perf_event_pmu_context
iteration skips PMUs that have no cgroup events. A bool "cgroup" was
introduced to flag that case. It works, but the approach is hard to
extend to other cases, e.g. skipping non-mediated PMUs, and it doesn't
make sense to keep adding bool parameters.

Pass the event_type instead of a dedicated bool, and check both the
event_type and the related pmu_ctx fields to decide whether to skip a
PMU. Event flags, e.g. EVENT_CGROUP, should be cleared from
ctx->is_active; add EVENT_FLAGS to identify such flags.

No functional change intended.
Signed-off-by: Kan Liang
Tested-by: Yongwei Ma
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 kernel/events/core.c | 74 ++++++++++++++++++++++++--------------------
 1 file changed, 40 insertions(+), 34 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2c35acc2722b..4cc95dd15620 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -164,7 +164,7 @@ enum event_type_t {
 	/* see ctx_resched() for details */
 	EVENT_CPU	= 0x10,
 	EVENT_CGROUP	= 0x20,
-
+	EVENT_FLAGS	= EVENT_CGROUP,
 	/* compound helpers */
 	EVENT_ALL	= EVENT_FLEXIBLE | EVENT_PINNED,
 	EVENT_TIME_FROZEN = EVENT_TIME | EVENT_FROZEN,
@@ -778,27 +778,37 @@ do {						\
 	___p;						\
 })
 
-#define for_each_epc(_epc, _ctx, _pmu, _cgroup)			\
+static bool perf_skip_pmu_ctx(struct perf_event_pmu_context *pmu_ctx,
+			      enum event_type_t event_type)
+{
+	if ((event_type & EVENT_CGROUP) && !pmu_ctx->nr_cgroups)
+		return true;
+	return false;
+}
+
+#define for_each_epc(_epc, _ctx, _pmu, _event_type)		\
 	list_for_each_entry(_epc, &((_ctx)->pmu_ctx_list), pmu_ctx_entry) \
-		if (_cgroup && !_epc->nr_cgroups)		\
+		if (perf_skip_pmu_ctx(_epc, _event_type))	\
 			continue;				\
 		else if (_pmu && _epc->pmu != _pmu)		\
 			continue;				\
 		else
 
-static void perf_ctx_disable(struct perf_event_context *ctx, bool cgroup)
+static void perf_ctx_disable(struct perf_event_context *ctx,
+			     enum event_type_t event_type)
 {
 	struct perf_event_pmu_context *pmu_ctx;
 
-	for_each_epc(pmu_ctx, ctx, NULL, cgroup)
+	for_each_epc(pmu_ctx, ctx, NULL, event_type)
 		perf_pmu_disable(pmu_ctx->pmu);
 }
 
-static void perf_ctx_enable(struct perf_event_context *ctx, bool cgroup)
+static void perf_ctx_enable(struct perf_event_context *ctx,
+			    enum event_type_t event_type)
 {
 	struct perf_event_pmu_context *pmu_ctx;
 
-	for_each_epc(pmu_ctx, ctx, NULL, cgroup)
+	for_each_epc(pmu_ctx, ctx, NULL, event_type)
 		perf_pmu_enable(pmu_ctx->pmu);
 }
 
@@ -963,8 +973,7 @@ static void perf_cgroup_switch(struct task_struct *task)
 		return;
 
 	WARN_ON_ONCE(cpuctx->ctx.nr_cgroups == 0);
-
-	perf_ctx_disable(&cpuctx->ctx, true);
+	perf_ctx_disable(&cpuctx->ctx, EVENT_CGROUP);
 
 	ctx_sched_out(&cpuctx->ctx, NULL, EVENT_ALL|EVENT_CGROUP);
 	/*
@@ -980,7 +989,7 @@ static void perf_cgroup_switch(struct task_struct *task)
 	 */
 	ctx_sched_in(&cpuctx->ctx, NULL, EVENT_ALL|EVENT_CGROUP);
 
-	perf_ctx_enable(&cpuctx->ctx, true);
+	perf_ctx_enable(&cpuctx->ctx, EVENT_CGROUP);
 }
 
 static int perf_cgroup_ensure_storage(struct perf_event *event,
@@ -2904,11 +2913,11 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
 
 	event_type &= EVENT_ALL;
 
-	for_each_epc(epc, &cpuctx->ctx, pmu, false)
+	for_each_epc(epc, &cpuctx->ctx, pmu, 0)
 		perf_pmu_disable(epc->pmu);
 
 	if (task_ctx) {
-		for_each_epc(epc, task_ctx, pmu, false)
+		for_each_epc(epc, task_ctx, pmu, 0)
 			perf_pmu_disable(epc->pmu);
 
 		task_ctx_sched_out(task_ctx, pmu, event_type);
@@ -2928,11 +2937,11 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
 
 	perf_event_sched_in(cpuctx, task_ctx, pmu);
 
-	for_each_epc(epc, &cpuctx->ctx, pmu, false)
+	for_each_epc(epc, &cpuctx->ctx, pmu, 0)
 		perf_pmu_enable(epc->pmu);
 
 	if (task_ctx) {
-		for_each_epc(epc, task_ctx, pmu, false)
+		for_each_epc(epc, task_ctx, pmu, 0)
 			perf_pmu_enable(epc->pmu);
 	}
 }
@@ -3481,11 +3490,10 @@ static void
 ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type)
 {
 	struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
+	enum event_type_t active_type = event_type & ~EVENT_FLAGS;
 	struct perf_event_pmu_context *pmu_ctx;
 	int is_active = ctx->is_active;
-	bool cgroup = event_type & EVENT_CGROUP;
 
-	event_type &= ~EVENT_CGROUP;
 
 	lockdep_assert_held(&ctx->lock);
 
@@ -3516,7 +3524,7 @@ ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
 	 * see __load_acquire() in perf_event_time_now()
 	 */
 	barrier();
-	ctx->is_active &= ~event_type;
+	ctx->is_active &= ~active_type;
 
 	if (!(ctx->is_active & EVENT_ALL)) {
 		/*
@@ -3537,7 +3545,7 @@ ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
 
 	is_active ^= ctx->is_active; /* changed bits */
 
-	for_each_epc(pmu_ctx, ctx, pmu, cgroup)
+	for_each_epc(pmu_ctx, ctx, pmu, event_type)
 		__pmu_ctx_sched_out(pmu_ctx, is_active);
 }
 
@@ -3693,7 +3701,7 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next)
 	raw_spin_lock_nested(&next_ctx->lock, SINGLE_DEPTH_NESTING);
 	if (context_equiv(ctx, next_ctx)) {
 
-		perf_ctx_disable(ctx, false);
+		perf_ctx_disable(ctx, 0);
 
 		/* PMIs are disabled; ctx->nr_no_switch_fast is stable. */
 		if (local_read(&ctx->nr_no_switch_fast) ||
@@ -3717,7 +3725,7 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next)
 
 		perf_ctx_sched_task_cb(ctx, task, false);
 
-		perf_ctx_enable(ctx, false);
+		perf_ctx_enable(ctx, 0);
 
 		/*
 		 * RCU_INIT_POINTER here is safe because we've not
@@ -3741,13 +3749,13 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next)
 
 	if (do_switch) {
 		raw_spin_lock(&ctx->lock);
-		perf_ctx_disable(ctx, false);
+		perf_ctx_disable(ctx, 0);
 
 inside_switch:
 		perf_ctx_sched_task_cb(ctx, task, false);
 		task_ctx_sched_out(ctx, NULL, EVENT_ALL);
 
-		perf_ctx_enable(ctx, false);
+		perf_ctx_enable(ctx, 0);
 		raw_spin_unlock(&ctx->lock);
 	}
 }
@@ -4056,11 +4064,9 @@ static void
 ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type)
 {
 	struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
+	enum event_type_t active_type = event_type & ~EVENT_FLAGS;
 	struct perf_event_pmu_context *pmu_ctx;
 	int is_active = ctx->is_active;
-	bool cgroup = event_type & EVENT_CGROUP;
-
-	event_type &= ~EVENT_CGROUP;
 
 	lockdep_assert_held(&ctx->lock);
 
@@ -4078,7 +4084,7 @@ ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
 		barrier();
 	}
 
-	ctx->is_active |= (event_type | EVENT_TIME);
+	ctx->is_active |= active_type | EVENT_TIME;
 	if (ctx->task) {
 		if (!(is_active & EVENT_ALL))
 			cpuctx->task_ctx = ctx;
@@ -4093,13 +4099,13 @@ ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
 	 * in order to give them the best chance of going on.
 	 */
 	if (is_active & EVENT_PINNED) {
-		for_each_epc(pmu_ctx, ctx, pmu, cgroup)
+		for_each_epc(pmu_ctx, ctx, pmu, event_type)
 			__pmu_ctx_sched_in(pmu_ctx, EVENT_PINNED);
 	}
 
 	/* Then walk through the lower prio flexible groups */
 	if (is_active & EVENT_FLEXIBLE) {
-		for_each_epc(pmu_ctx, ctx, pmu, cgroup)
+		for_each_epc(pmu_ctx, ctx, pmu, event_type)
 			__pmu_ctx_sched_in(pmu_ctx, EVENT_FLEXIBLE);
 	}
 }
@@ -4116,11 +4122,11 @@ static void perf_event_context_sched_in(struct task_struct *task)
 
 	if (cpuctx->task_ctx == ctx) {
 		perf_ctx_lock(cpuctx, ctx);
-		perf_ctx_disable(ctx, false);
+		perf_ctx_disable(ctx, 0);
 
 		perf_ctx_sched_task_cb(ctx, task, true);
 
-		perf_ctx_enable(ctx, false);
+		perf_ctx_enable(ctx, 0);
 		perf_ctx_unlock(cpuctx, ctx);
 		goto rcu_unlock;
 	}
@@ -4133,7 +4139,7 @@ static void perf_event_context_sched_in(struct task_struct *task)
 	if (!ctx->nr_events)
 		goto unlock;
 
-	perf_ctx_disable(ctx, false);
+	perf_ctx_disable(ctx, 0);
 	/*
 	 * We want to keep the following priority order:
 	 * cpu pinned (that don't need to move), task pinned,
@@ -4143,7 +4149,7 @@ static void perf_event_context_sched_in(struct task_struct *task)
 	 * events, no need to flip the cpuctx's events around.
 	 */
 	if (!RB_EMPTY_ROOT(&ctx->pinned_groups.tree)) {
-		perf_ctx_disable(&cpuctx->ctx, false);
+		perf_ctx_disable(&cpuctx->ctx, 0);
 		ctx_sched_out(&cpuctx->ctx, NULL, EVENT_FLEXIBLE);
 	}
 
@@ -4152,9 +4158,9 @@ static void perf_event_context_sched_in(struct task_struct *task)
 	perf_ctx_sched_task_cb(cpuctx->task_ctx, task, true);
 
 	if (!RB_EMPTY_ROOT(&ctx->pinned_groups.tree))
-		perf_ctx_enable(&cpuctx->ctx, false);
+		perf_ctx_enable(&cpuctx->ctx, 0);
 
-	perf_ctx_enable(ctx, false);
+	perf_ctx_enable(ctx, 0);
 
 unlock:
 	perf_ctx_unlock(cpuctx, ctx);
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:16:38 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-3-seanjc@google.com>
Subject: [PATCH v6 02/44] perf: Add generic exclude_guest support
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

From: Kan Liang

Only KVM knows the exact time when a guest is entering/exiting. Expose
two interfaces to KVM to switch ownership of the PMU resources. All
pinned events must be scheduled in first.
Extend the perf_event_sched_in() helper to take an extra event_type
flag, e.g. EVENT_GUEST.

Signed-off-by: Kan Liang
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 kernel/events/core.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4cc95dd15620..1e37ab90b815 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2872,14 +2872,15 @@ static void task_ctx_sched_out(struct perf_event_context *ctx,
 
 static void perf_event_sched_in(struct perf_cpu_context *cpuctx,
 				struct perf_event_context *ctx,
-				struct pmu *pmu)
+				struct pmu *pmu,
+				enum event_type_t event_type)
 {
-	ctx_sched_in(&cpuctx->ctx, pmu, EVENT_PINNED);
+	ctx_sched_in(&cpuctx->ctx, pmu, EVENT_PINNED | event_type);
 	if (ctx)
-		ctx_sched_in(ctx, pmu, EVENT_PINNED);
-	ctx_sched_in(&cpuctx->ctx, pmu, EVENT_FLEXIBLE);
+		ctx_sched_in(ctx, pmu, EVENT_PINNED | event_type);
+	ctx_sched_in(&cpuctx->ctx, pmu, EVENT_FLEXIBLE | event_type);
 	if (ctx)
-		ctx_sched_in(ctx, pmu, EVENT_FLEXIBLE);
+		ctx_sched_in(ctx, pmu, EVENT_FLEXIBLE | event_type);
 }
 
 /*
@@ -2935,7 +2936,7 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
 	else if (event_type & EVENT_PINNED)
 		ctx_sched_out(&cpuctx->ctx, pmu, EVENT_FLEXIBLE);
 
-	perf_event_sched_in(cpuctx, task_ctx, pmu);
+	perf_event_sched_in(cpuctx, task_ctx, pmu, 0);
 
 	for_each_epc(epc, &cpuctx->ctx, pmu, 0)
 		perf_pmu_enable(epc->pmu);
@@ -4153,7 +4154,7 @@ static void perf_event_context_sched_in(struct task_struct *task)
 		ctx_sched_out(&cpuctx->ctx, NULL, EVENT_FLEXIBLE);
 	}
 
-	perf_event_sched_in(cpuctx, ctx, NULL);
+	perf_event_sched_in(cpuctx, ctx, NULL, 0);
 
 	perf_ctx_sched_task_cb(cpuctx->task_ctx, task, true);
 
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:16:39 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-4-seanjc@google.com>
Subject: [PATCH v6 03/44] perf: Move security_perf_event_free() call to __free_event()
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

Move the freeing of any security state associated with a perf event from
_free_event() to __free_event(), i.e. invoke security_perf_event_free()
in the error paths of perf_event_alloc(). This allows adding error paths
to perf_event_alloc() that can occur after security state has been
allocated.

Note, kfree() and thus security_perf_event_free() is a nop if
event->security is NULL, i.e. calling security_perf_event_free() even if
security_perf_event_alloc() fails or is never reached is functionally OK.
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 kernel/events/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1e37ab90b815..e34112df8b31 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5602,6 +5602,8 @@ static void __free_event(struct perf_event *event)
 {
 	struct pmu *pmu = event->pmu;
 
+	security_perf_event_free(event);
+
 	if (event->attach_state & PERF_ATTACH_CALLCHAIN)
 		put_callchain_buffers();
 
@@ -5665,8 +5667,6 @@ static void _free_event(struct perf_event *event)
 
 	unaccount_event(event);
 
-	security_perf_event_free(event);
-
 	if (event->rb) {
 		/*
 		 * Can happen when we close an event with re-directed output.
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:16:40 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-5-seanjc@google.com>
Subject: [PATCH v6 04/44] perf: Add APIs to create/release mediated guest vPMUs
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

From: Kan Liang

Currently, exposing PMU capabilities to a KVM guest is done by emulating
guest PMCs via host perf events, i.e. by having KVM be "just" another
user of perf. As a result, the guest and host are effectively competing
for resources, and emulating guest accesses to vPMU resources requires
expensive actions (expensive relative to the native instruction). The
overhead and resource competition result in degraded guest performance
and ultimately very poor vPMU accuracy.

To address the issues with the perf-emulated vPMU, introduce a "mediated
vPMU", where the data plane (PMCs and enable/disable knobs) is exposed
directly to the guest, but the control plane (event selectors and access
to fixed counters) is managed by KVM (via MSR interceptions). To allow
host perf usage of the PMU to (partially) co-exist with KVM/guest usage
of the PMU, KVM and perf will coordinate a world switch between host
perf context and guest vPMU context near VM-Enter/VM-Exit.

Add two exported APIs, perf_{create,release}_mediated_pmu(), to allow
KVM to create and release a mediated PMU instance (per VM). Because host
perf context will be deactivated while the guest is running, mediated
PMU usage is mutually exclusive with perf analysis of the guest, i.e.
perf events that do NOT exclude the guest will not behave as expected.
To avoid silent failure of !exclude_guest perf events, disallow creating
a mediated PMU if there are active !exclude_guest events, and on the
perf side, disallow creating new !exclude_guest perf events while there
is at least one active mediated PMU. Exempt PMU resources that do not
support mediated PMU usage, i.e. that are outside the scope/view of
KVM's vPMU and will not be swapped out while the guest is running.

Guard mediated PMU with a new kconfig to help readers identify code
paths that are unique to mediated PMU support, and to allow for adding
arch-specific hooks without stubs. KVM x86 is expected to be the only
KVM architecture to support a mediated PMU in the near future (e.g.
arm64 is trending toward a partitioned PMU implementation), and KVM x86
will select PERF_GUEST_MEDIATED_PMU unconditionally, i.e. won't need
stubs. Immediately select PERF_GUEST_MEDIATED_PMU when KVM x86 is
enabled so that all paths are compile tested. Full KVM support is on
its way...

Suggested-by: Sean Christopherson
Signed-off-by: Kan Liang
Signed-off-by: Mingwei Zhang
[sean: add kconfig and WARNing, rewrite changelog, swizzle patch ordering]
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/Kconfig       |  1 +
 include/linux/perf_event.h |  6 +++
 init/Kconfig               |  4 ++
 kernel/events/core.c       | 82 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 93 insertions(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 278f08194ec8..d916bd766c94 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -37,6 +37,7 @@ config KVM_X86
        select SCHED_INFO
        select PERF_EVENTS
        select GUEST_PERF_EVENTS
+       select PERF_GUEST_MEDIATED_PMU
        select HAVE_KVM_MSI
        select HAVE_KVM_CPU_RELAX_INTERCEPT
        select HAVE_KVM_NO_POLL
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index fd1d91017b99..94f679634ef6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -305,6 +305,7 @@ struct perf_event_pmu_context;
 #define PERF_PMU_CAP_EXTENDED_HW_TYPE  0x0100
 #define PERF_PMU_CAP_AUX_PAUSE         0x0200
 #define PERF_PMU_CAP_AUX_PREFER_LARGE  0x0400
+#define PERF_PMU_CAP_MEDIATED_VPMU     0x0800
 
 /**
  * pmu::scope
@@ -1914,6 +1915,11 @@ extern int perf_event_account_interrupt(struct perf_event *event);
 extern int perf_event_period(struct perf_event *event, u64 value);
 extern u64 perf_event_pause(struct perf_event *event, bool reset);
 
+#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
+int perf_create_mediated_pmu(void);
+void perf_release_mediated_pmu(void);
+#endif
+
 #else /* !CONFIG_PERF_EVENTS: */
 
 static inline void *
diff --git a/init/Kconfig b/init/Kconfig
index cab3ad28ca49..45b9ac626829 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2010,6 +2010,10 @@ config GUEST_PERF_EVENTS
        bool
        depends on HAVE_PERF_EVENTS
 
+config PERF_GUEST_MEDIATED_PMU
+       bool
+       depends on GUEST_PERF_EVENTS
+
 config PERF_USE_VMALLOC
        bool
        help
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e34112df8b31..cfeea7d330f9 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5657,6 +5657,8 @@ static void __free_event(struct perf_event *event)
        call_rcu(&event->rcu_head, free_event_rcu);
 }
 
+static void mediated_pmu_unaccount_event(struct perf_event *event);
+
 DEFINE_FREE(__free_event, struct perf_event *, if (_T) __free_event(_T))
 
 /* vs perf_event_alloc() success */
@@ -5666,6 +5668,7 @@ static void _free_event(struct perf_event *event)
        irq_work_sync(&event->pending_disable_irq);
 
        unaccount_event(event);
+       mediated_pmu_unaccount_event(event);
 
        if (event->rb) {
                /*
@@ -6188,6 +6191,81 @@ u64 perf_event_pause(struct perf_event *event, bool reset)
 }
 EXPORT_SYMBOL_GPL(perf_event_pause);
 
+#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
+static atomic_t nr_include_guest_events __read_mostly;
+
+static atomic_t nr_mediated_pmu_vms __read_mostly;
+static DEFINE_MUTEX(perf_mediated_pmu_mutex);
+
+/* !exclude_guest event of PMU with PERF_PMU_CAP_MEDIATED_VPMU */
+static inline bool is_include_guest_event(struct perf_event *event)
+{
+       if ((event->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU) &&
+           !event->attr.exclude_guest)
+               return true;
+
+       return false;
+}
+
+static int mediated_pmu_account_event(struct perf_event *event)
+{
+       if (!is_include_guest_event(event))
+               return 0;
+
+       guard(mutex)(&perf_mediated_pmu_mutex);
+
+       if (atomic_read(&nr_mediated_pmu_vms))
+               return -EOPNOTSUPP;
+
+       atomic_inc(&nr_include_guest_events);
+       return 0;
+}
+
+static void mediated_pmu_unaccount_event(struct perf_event *event)
+{
+       if (!is_include_guest_event(event))
+               return;
+
+       atomic_dec(&nr_include_guest_events);
+}
+
+/*
+ * Currently invoked at VM creation to
+ * - Check whether there are existing !exclude_guest events of PMU with
+ *   PERF_PMU_CAP_MEDIATED_VPMU
+ * - Set nr_mediated_pmu_vms to prevent !exclude_guest event creation on
+ *   PMUs with PERF_PMU_CAP_MEDIATED_VPMU
+ *
+ * No impact for the PMU without PERF_PMU_CAP_MEDIATED_VPMU. The perf
+ * still owns all the PMU resources.
+ */
+int perf_create_mediated_pmu(void)
+{
+       guard(mutex)(&perf_mediated_pmu_mutex);
+       if (atomic_inc_not_zero(&nr_mediated_pmu_vms))
+               return 0;
+
+       if (atomic_read(&nr_include_guest_events))
+               return -EBUSY;
+
+       atomic_inc(&nr_mediated_pmu_vms);
+       return 0;
+}
+EXPORT_SYMBOL_GPL(perf_create_mediated_pmu);
+
+void perf_release_mediated_pmu(void)
+{
+       if (WARN_ON_ONCE(!atomic_read(&nr_mediated_pmu_vms)))
+               return;
+
+       atomic_dec(&nr_mediated_pmu_vms);
+}
+EXPORT_SYMBOL_GPL(perf_release_mediated_pmu);
+#else
+static int mediated_pmu_account_event(struct perf_event *event) { return 0; }
+static void mediated_pmu_unaccount_event(struct perf_event *event) {}
+#endif
+
 /*
  * Holding the top-level event's child_mutex means that any
  * descendant process that has inherited this event will block
@@ -13078,6 +13156,10 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
        if (err)
                return ERR_PTR(err);
 
+       err = mediated_pmu_account_event(event);
+       if (err)
+               return ERR_PTR(err);
+
        /* symmetric to unaccount_event() in _free_event() */
        account_event(event);
 
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:41 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-6-seanjc@google.com>
Subject: [PATCH v6 05/44] perf: Clean up perf ctx time
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
    Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li,
    "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
    Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson,
    Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    kvm@vger.kernel.org, loongarch@lists.linux.dev,
    kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
    Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang,
    Manali Shukla, Jim Mattson

From: Kan Liang

perf currently tracks two sets of timestamps, one for the normal ctx
and one for cgroups, using the same types of variables and nearly
identical code. A following patch will introduce a third set of
timestamps to track guest time. To avoid duplicating the code yet
again, add a new struct perf_time_ctx and factor out a generic helper,
update_perf_time_ctx(). No functional change intended.

Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Kan Liang
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 include/linux/perf_event.h | 13 +++----
 kernel/events/core.c       | 70 +++++++++++++++++---------------------
 2 files changed, 39 insertions(+), 44 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 94f679634ef6..42d1debc519f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -999,6 +999,11 @@ struct perf_event_groups {
        u64             index;
 };
 
+struct perf_time_ctx {
+       u64             time;
+       u64             stamp;
+       u64             offset;
+};
 
 /**
  * struct perf_event_context - event context structure
@@ -1037,9 +1042,7 @@ struct perf_event_context {
        /*
         * Context clock, runs when context enabled.
         */
-       u64                             time;
-       u64                             timestamp;
-       u64                             timeoffset;
+       struct perf_time_ctx            time;
 
        /*
         * These fields let us detect when two contexts have both
@@ -1172,9 +1175,7 @@ struct bpf_perf_event_data_kern {
  * This is a per-cpu dynamically allocated data structure.
  */
 struct perf_cgroup_info {
-       u64                             time;
-       u64                             timestamp;
-       u64                             timeoffset;
+       struct perf_time_ctx            time;
        int                             active;
 };
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index cfeea7d330f9..5db8f4c60b9e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -815,6 +815,24 @@ static void perf_ctx_enable(struct perf_event_context *ctx,
 static void ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type);
 static void ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type);
 
+static inline void update_perf_time_ctx(struct perf_time_ctx *time, u64 now, bool adv)
+{
+       if (adv)
+               time->time += now - time->stamp;
+       time->stamp = now;
+
+       /*
+        * The above: time' = time + (now - timestamp), can be re-arranged
+        * into: time` = now + (time - timestamp), which gives a single value
+        * offset to compute future time without locks on.
+        *
+        * See perf_event_time_now(), which can be used from NMI context where
+        * it's (obviously) not possible to acquire ctx->lock in order to read
+        * both the above values in a consistent manner.
+        */
+       WRITE_ONCE(time->offset, time->time - time->stamp);
+}
+
 #ifdef CONFIG_CGROUP_PERF
 
 static inline bool
@@ -856,7 +874,7 @@ static inline u64 perf_cgroup_event_time(struct perf_event *event)
        struct perf_cgroup_info *t;
 
        t = per_cpu_ptr(event->cgrp->info, event->cpu);
-       return t->time;
+       return t->time.time;
 }
 
 static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now)
@@ -865,22 +883,11 @@ static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now)
 
        t = per_cpu_ptr(event->cgrp->info, event->cpu);
        if (!__load_acquire(&t->active))
-               return t->time;
-       now += READ_ONCE(t->timeoffset);
+               return t->time.time;
+       now += READ_ONCE(t->time.offset);
        return now;
 }
 
-static inline void __update_cgrp_time(struct perf_cgroup_info *info, u64 now, bool adv)
-{
-       if (adv)
-               info->time += now - info->timestamp;
-       info->timestamp = now;
-       /*
-        * see update_context_time()
-        */
-       WRITE_ONCE(info->timeoffset, info->time - info->timestamp);
-}
-
 static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx, bool final)
 {
        struct perf_cgroup *cgrp = cpuctx->cgrp;
@@ -894,7 +901,7 @@ static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx,
                cgrp = container_of(css, struct perf_cgroup, css);
                info = this_cpu_ptr(cgrp->info);
 
-               __update_cgrp_time(info, now, true);
+               update_perf_time_ctx(&info->time, now, true);
                if (final)
                        __store_release(&info->active, 0);
        }
@@ -917,7 +924,7 @@ static inline void update_cgrp_time_from_event(struct perf_event *event)
         * Do not update time when cgroup is not active
         */
        if (info->active)
-               __update_cgrp_time(info, perf_clock(), true);
+               update_perf_time_ctx(&info->time, perf_clock(), true);
 }
 
 static inline void
@@ -941,7 +948,7 @@ perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx)
        for (css = &cgrp->css; css; css = css->parent) {
                cgrp = container_of(css, struct perf_cgroup, css);
                info = this_cpu_ptr(cgrp->info);
-               __update_cgrp_time(info, ctx->timestamp, false);
+               update_perf_time_ctx(&info->time, ctx->time.stamp, false);
                __store_release(&info->active, 1);
        }
 }
@@ -1562,20 +1569,7 @@ static void __update_context_time(struct perf_event_context *ctx, bool adv)
 
        lockdep_assert_held(&ctx->lock);
 
-       if (adv)
-               ctx->time += now - ctx->timestamp;
-       ctx->timestamp = now;
-
-       /*
-        * The above: time' = time + (now - timestamp), can be re-arranged
-        * into: time` = now + (time - timestamp), which gives a single value
-        * offset to compute future time without locks on.
-        *
-        * See perf_event_time_now(), which can be used from NMI context where
-        * it's (obviously) not possible to acquire ctx->lock in order to read
-        * both the above values in a consistent manner.
-        */
-       WRITE_ONCE(ctx->timeoffset, ctx->time - ctx->timestamp);
+       update_perf_time_ctx(&ctx->time, now, adv);
 }
 
 static void update_context_time(struct perf_event_context *ctx)
@@ -1593,7 +1587,7 @@ static u64 perf_event_time(struct perf_event *event)
        if (is_cgroup_event(event))
                return perf_cgroup_event_time(event);
 
-       return ctx->time;
+       return ctx->time.time;
 }
 
 static u64 perf_event_time_now(struct perf_event *event, u64 now)
@@ -1607,9 +1601,9 @@ static u64 perf_event_time_now(struct perf_event *event, u64 now)
                return perf_cgroup_event_time_now(event, now);
 
        if (!(__load_acquire(&ctx->is_active) & EVENT_TIME))
-               return ctx->time;
+               return ctx->time.time;
 
-       now += READ_ONCE(ctx->timeoffset);
+       now += READ_ONCE(ctx->time.offset);
        return now;
 }
 
@@ -12044,7 +12038,7 @@ static void task_clock_event_update(struct perf_event *event, u64 now)
 static void task_clock_event_start(struct perf_event *event, int flags)
 {
        event->hw.state = 0;
-       local64_set(&event->hw.prev_count, event->ctx->time);
+       local64_set(&event->hw.prev_count, event->ctx->time.time);
        perf_swevent_start_hrtimer(event);
 }
 
@@ -12053,7 +12047,7 @@ static void task_clock_event_stop(struct perf_event *event, int flags)
        event->hw.state = PERF_HES_STOPPED;
        perf_swevent_cancel_hrtimer(event);
        if (flags & PERF_EF_UPDATE)
-               task_clock_event_update(event, event->ctx->time);
+               task_clock_event_update(event, event->ctx->time.time);
 }
 
 static int task_clock_event_add(struct perf_event *event, int flags)
@@ -12073,8 +12067,8 @@ static void task_clock_event_del(struct perf_event *event, int flags)
 static void task_clock_event_read(struct perf_event *event)
 {
        u64 now = perf_clock();
-       u64 delta = now - event->ctx->timestamp;
-       u64 time = event->ctx->time + delta;
+       u64 delta = now - event->ctx->time.stamp;
+       u64 time = event->ctx->time.time + delta;
 
        task_clock_event_update(event, time);
 }
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:42 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-7-seanjc@google.com>
Subject: [PATCH v6 06/44] perf: Add an EVENT_GUEST flag
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
    Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li,
    "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
    Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson,
    Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    kvm@vger.kernel.org, loongarch@lists.linux.dev,
    kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
    Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang,
    Manali Shukla, Jim Mattson

From: Kan Liang

Currently, perf doesn't explicitly schedule out all exclude_guest
events while a guest is running.
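For reference, the timekeeping that this patch builds on (the
perf_time_ctx helper factored out in the previous patch, which this
patch reuses for the new guest clock) can be modeled in user space. The
deduction the changelog describes is simply total ctx time minus guest
time; the struct and helper mirror the kernel names, but the timestamps
and the second helper are illustrative, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of perf's clock bookkeeping: accumulate time across updates and
 * publish offset = time - stamp so that lock-free readers can compute
 * the current value as now + offset (WRITE_ONCE()/READ_ONCE() in the
 * kernel; plain stores suffice for this single-threaded sketch).
 */
struct perf_time_ctx {
        uint64_t time;   /* accumulated time, in ns */
        uint64_t stamp;  /* timestamp of the last update */
        uint64_t offset; /* time - stamp, for lock-free readers */
};

static void update_perf_time_ctx(struct perf_time_ctx *t, uint64_t now, bool adv)
{
        if (adv)
                t->time += now - t->stamp;
        t->stamp = now;
        t->offset = t->time - t->stamp;
}

/* What an exclude_guest event observes: total ctx time minus guest time. */
static uint64_t exclude_guest_event_time(const struct perf_time_ctx *total,
                                         const struct perf_time_ctx *guest)
{
        return total->time - guest->time;
}
```

With a ctx that runs from t=0 to t=100 and a guest occupying t=40..70,
an exclude_guest event accumulates 70 units while the ctx clock itself
keeps running to 100, which is the behavior required to keep the shared
ctx time usable for other PMUs (e.g. uncore events).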
This isn't a problem for the current emulated vPMU, because perf owns
all of the PMU counters: it can mask a counter that is assigned to an
exclude_guest event while a guest is running (the Intel way), or set
the corresponding HOSTONLY bit in the eventsel (the AMD way), so that
the counter doesn't count while a guest is running.

Neither approach works with the introduced mediated vPMU, as the guest
owns all of the PMU counters while it's running: the host must not mask
any counters, a counter may be in use by the guest, and the eventsel
may be overwritten. Instead, perf should explicitly schedule out all
exclude_guest events to release the PMU resources when entering a
guest, and resume counting when exiting the guest. Similarly, an
exclude_guest event that is created while a guest is running should not
be scheduled in.

The ctx time is shared among different PMUs and cannot be stopped while
a guest is running, as it is required to calculate the time for events
from other PMUs, e.g. uncore events. Add timeguest to track the guest
run time; for an exclude_guest event, the elapsed time equals the ctx
time minus the guest time. Cgroups have dedicated times; use the same
method to deduct the guest time from the cgroup time as well.

Co-developed-by: Peter Zijlstra (Intel)
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Kan Liang
Signed-off-by: Mingwei Zhang
[sean: massage comments]
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 include/linux/perf_event.h |   6 +
 kernel/events/core.c       | 232 ++++++++++++++++++++++++++++---------
 2 files changed, 186 insertions(+), 52 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 42d1debc519f..eaab830c9bf5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1044,6 +1044,11 @@ struct perf_event_context {
         */
        struct perf_time_ctx            time;
 
+       /*
+        * Context clock, runs when in the guest mode.
+        */
+       struct perf_time_ctx            timeguest;
+
        /*
         * These fields let us detect when two contexts have both
         * been cloned (inherited) from a common ancestor.
@@ -1176,6 +1181,7 @@ struct bpf_perf_event_data_kern {
  */
 struct perf_cgroup_info {
        struct perf_time_ctx            time;
+       struct perf_time_ctx            timeguest;
        int                             active;
 };
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5db8f4c60b9e..f72d4844b05e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -164,7 +164,19 @@ enum event_type_t {
        /* see ctx_resched() for details */
        EVENT_CPU       = 0x10,
        EVENT_CGROUP    = 0x20,
-       EVENT_FLAGS     = EVENT_CGROUP,
+
+       /*
+        * EVENT_GUEST is set when scheduling in/out events between the host
+        * and a guest with a mediated vPMU. Among other things, EVENT_GUEST
+        * is used:
+        *
+        *  - In for_each_epc() to skip PMUs that don't support events in a
+        *    MEDIATED_VPMU guest, i.e. don't need to be context switched.
+        *  - To indicate the start/end point of the events in a guest. Guest
+        *    running time is deducted for host-only (exclude_guest) events.
+        */
+       EVENT_GUEST     = 0x40,
+       EVENT_FLAGS     = EVENT_CGROUP | EVENT_GUEST,
        /* compound helpers */
        EVENT_ALL       = EVENT_FLEXIBLE | EVENT_PINNED,
        EVENT_TIME_FROZEN = EVENT_TIME | EVENT_FROZEN,
@@ -457,6 +469,11 @@ static cpumask_var_t perf_online_pkg_mask;
 static cpumask_var_t perf_online_sys_mask;
 static struct kmem_cache *perf_event_cache;
 
+static __always_inline bool is_guest_mediated_pmu_loaded(void)
+{
+       return false;
+}
+
 /*
  * perf event paranoia level:
  *  -1 - not paranoid at all
@@ -783,6 +800,9 @@ static bool perf_skip_pmu_ctx(struct perf_event_pmu_context *pmu_ctx,
 {
        if ((event_type & EVENT_CGROUP) && !pmu_ctx->nr_cgroups)
                return true;
+       if ((event_type & EVENT_GUEST) &&
+           !(pmu_ctx->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU))
+               return true;
        return false;
 }
 
@@ -833,6 +853,39 @@ static inline void update_perf_time_ctx(struct perf_time_ctx *time, u64 now, boo
        WRITE_ONCE(time->offset, time->time - time->stamp);
 }
 
+static_assert(offsetof(struct perf_event_context, timeguest) -
+             offsetof(struct perf_event_context, time) ==
+             sizeof(struct perf_time_ctx));
+
+#define T_TOTAL        0
+#define T_GUEST        1
+
+static inline u64 __perf_event_time_ctx(struct perf_event *event,
+                                       struct perf_time_ctx *times)
+{
+       u64 time = times[T_TOTAL].time;
+
+       if (event->attr.exclude_guest)
+               time -= times[T_GUEST].time;
+
+       return time;
+}
+
+static inline u64 __perf_event_time_ctx_now(struct perf_event *event,
+                                           struct perf_time_ctx *times,
+                                           u64 now)
+{
+       if (is_guest_mediated_pmu_loaded() && event->attr.exclude_guest) {
+               /*
+                * (now + times[total].offset) - (now + times[guest].offset) :=
+                * times[total].offset - times[guest].offset
+                */
+               return READ_ONCE(times[T_TOTAL].offset) - READ_ONCE(times[T_GUEST].offset);
+       }
+
+       return now + READ_ONCE(times[T_TOTAL].offset);
+}
+
 #ifdef CONFIG_CGROUP_PERF
 
 static inline bool
@@ -869,12 +922,16 @@ static inline int is_cgroup_event(struct perf_event *event)
        return event->cgrp != NULL;
 }
 
+static_assert(offsetof(struct perf_cgroup_info, timeguest) -
+             offsetof(struct perf_cgroup_info, time) ==
+             sizeof(struct perf_time_ctx));
+
 static inline u64 perf_cgroup_event_time(struct perf_event *event)
 {
        struct perf_cgroup_info *t;
 
        t = per_cpu_ptr(event->cgrp->info, event->cpu);
-       return t->time.time;
+       return __perf_event_time_ctx(event, &t->time);
 }
 
 static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now)
@@ -883,9 +940,21 @@ static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now)
 
        t = per_cpu_ptr(event->cgrp->info, event->cpu);
        if (!__load_acquire(&t->active))
-               return t->time.time;
-       now += READ_ONCE(t->time.offset);
-       return now;
+               return __perf_event_time_ctx(event, &t->time);
+
+       return __perf_event_time_ctx_now(event, &t->time, now);
+}
+
+static inline void __update_cgrp_guest_time(struct perf_cgroup_info *info, u64 now, bool adv)
+{
+       update_perf_time_ctx(&info->timeguest, now, adv);
+}
+
+static inline void update_cgrp_time(struct perf_cgroup_info *info, u64 now)
+{
+       update_perf_time_ctx(&info->time, now, true);
+       if (is_guest_mediated_pmu_loaded())
+               __update_cgrp_guest_time(info, now, true);
 }
 
 static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx, bool final)
@@ -901,7 +970,7 @@ static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx,
                cgrp = container_of(css, struct perf_cgroup, css);
                info = this_cpu_ptr(cgrp->info);
 
-               update_perf_time_ctx(&info->time, now, true);
+               update_cgrp_time(info, now);
                if (final)
                        __store_release(&info->active, 0);
        }
@@ -924,11 +993,11 @@ static inline void update_cgrp_time_from_event(struct perf_event *event)
         * Do not update time when cgroup is not active
         */
        if (info->active)
-               update_perf_time_ctx(&info->time, perf_clock(), true);
+               update_cgrp_time(info, perf_clock());
 }
 
 static inline void
-perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx)
+perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx, bool guest)
 {
        struct perf_event_context *ctx = &cpuctx->ctx;
        struct perf_cgroup *cgrp = cpuctx->cgrp;
@@ -948,8 +1017,12 @@ perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx)
        for (css = &cgrp->css; css; css = css->parent) {
                cgrp = container_of(css, struct perf_cgroup, css);
                info = this_cpu_ptr(cgrp->info);
-               update_perf_time_ctx(&info->time, ctx->time.stamp, false);
-               __store_release(&info->active, 1);
+               if (guest) {
+                       __update_cgrp_guest_time(info, ctx->time.stamp, false);
+               } else {
+                       update_perf_time_ctx(&info->time, ctx->time.stamp, false);
+                       __store_release(&info->active, 1);
+               }
        }
 }
 
@@ -1153,7 +1226,7 @@ static inline int perf_cgroup_connect(pid_t pid, struct perf_event *event,
 }
 
 static inline void
-perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx)
+perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx, bool guest)
 {
 }
 
@@ -1565,16 +1638,24 @@ static void perf_unpin_context(struct perf_event_context *ctx)
  */
 static void __update_context_time(struct perf_event_context *ctx, bool adv)
 {
-       u64 now = perf_clock();
+       lockdep_assert_held(&ctx->lock);
+
+       update_perf_time_ctx(&ctx->time, perf_clock(), adv);
+}
 
+static void __update_context_guest_time(struct perf_event_context *ctx, bool adv)
+{
        lockdep_assert_held(&ctx->lock);
 
-       update_perf_time_ctx(&ctx->time, now, adv);
+       /* must be called after __update_context_time(); */
+       update_perf_time_ctx(&ctx->timeguest, ctx->time.stamp, adv);
 }
 
 static void update_context_time(struct perf_event_context *ctx)
 {
        __update_context_time(ctx, true);
+       if (is_guest_mediated_pmu_loaded())
+               __update_context_guest_time(ctx, true);
 }
 
@@ -1587,7 +1668,7 @@ static u64 perf_event_time(struct perf_event *event)
        if (is_cgroup_event(event))
                return perf_cgroup_event_time(event);
 
-       return ctx->time.time;
+       return __perf_event_time_ctx(event, &ctx->time);
 }
 
@@ -1601,10 +1682,9 @@ static u64 perf_event_time_now(struct perf_event *event, u64 now)
                return perf_cgroup_event_time_now(event, now);
 
        if (!(__load_acquire(&ctx->is_active) & EVENT_TIME))
-               return ctx->time.time;
+               return __perf_event_time_ctx(event, &ctx->time);
 
-       now += READ_ONCE(ctx->time.offset);
-       return now;
+       return __perf_event_time_ctx_now(event, &ctx->time, now);
 }
 
 static enum event_type_t get_event_type(struct perf_event *event)
@@ -2427,20 +2507,23 @@ group_sched_out(struct perf_event *group_event, struct perf_event_context *ctx)
 }
 
 static inline void
-__ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx, bool final)
+__ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx,
+                 bool final, enum event_type_t event_type)
 {
        if (ctx->is_active & EVENT_TIME) {
                if (ctx->is_active & EVENT_FROZEN)
                        return;
+
                update_context_time(ctx);
-               update_cgrp_time_from_cpuctx(cpuctx, final);
+               /* vPMU should not stop time */
+               update_cgrp_time_from_cpuctx(cpuctx, !(event_type & EVENT_GUEST) && final);
        }
 }
 
 static inline void
 ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx)
 {
-       __ctx_time_update(cpuctx, ctx, false);
+       __ctx_time_update(cpuctx, ctx, false, 0);
 }
 
@@ -3512,7 +3595,7 @@ ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
         *
         * would only update time for the pinned events.
         */
-       __ctx_time_update(cpuctx, ctx, ctx == &cpuctx->ctx);
+       __ctx_time_update(cpuctx, ctx, ctx == &cpuctx->ctx, event_type);
 
        /*
         * CPU-release for the below ->is_active store,
@@ -3538,7 +3621,18 @@ ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
                cpuctx->task_ctx = NULL;
        }
 
-       is_active ^= ctx->is_active; /* changed bits */
+       if (event_type & EVENT_GUEST) {
+               /*
+                * Schedule out all exclude_guest events of PMU
+                * with PERF_PMU_CAP_MEDIATED_VPMU.
+                */
+               is_active = EVENT_ALL;
+               __update_context_guest_time(ctx, false);
+               perf_cgroup_set_timestamp(cpuctx, true);
+               barrier();
+       } else {
+               is_active ^= ctx->is_active; /* changed bits */
+       }
 
        for_each_epc(pmu_ctx, ctx, pmu, event_type)
                __pmu_ctx_sched_out(pmu_ctx, is_active);
@@ -3997,10 +4091,15 @@ static inline void group_update_userpage(struct perf_event *group_event)
                event_update_userpage(event);
 }
 
+struct merge_sched_data {
+       int can_add_hw;
+       enum event_type_t event_type;
+};
+
 static int merge_sched_in(struct perf_event *event, void *data)
 {
        struct perf_event_context *ctx = event->ctx;
-       int *can_add_hw = data;
+       struct merge_sched_data *msd = data;
 
        if (event->state <= PERF_EVENT_STATE_OFF)
                return 0;
@@ -4008,13 +4107,22 @@ static int merge_sched_in(struct perf_event *event, void *data)
        if (!event_filter_match(event))
                return 0;
 
-       if (group_can_go_on(event, *can_add_hw)) {
+       /*
+        * Don't schedule in any host events from PMU with
+        * PERF_PMU_CAP_MEDIATED_VPMU, while a guest is running.
+ */ + if (is_guest_mediated_pmu_loaded() && + event->pmu_ctx->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU && + !(msd->event_type & EVENT_GUEST)) + return 0; + + if (group_can_go_on(event, msd->can_add_hw)) { if (!group_sched_in(event, ctx)) list_add_tail(&event->active_list, get_event_list(event)); } =20 if (event->state =3D=3D PERF_EVENT_STATE_INACTIVE) { - *can_add_hw =3D 0; + msd->can_add_hw =3D 0; if (event->attr.pinned) { perf_cgroup_event_disable(event, ctx); perf_event_set_state(event, PERF_EVENT_STATE_ERROR); @@ -4037,11 +4145,15 @@ static int merge_sched_in(struct perf_event *event,= void *data) =20 static void pmu_groups_sched_in(struct perf_event_context *ctx, struct perf_event_groups *groups, - struct pmu *pmu) + struct pmu *pmu, + enum event_type_t event_type) { - int can_add_hw =3D 1; + struct merge_sched_data msd =3D { + .can_add_hw =3D 1, + .event_type =3D event_type, + }; visit_groups_merge(ctx, groups, smp_processor_id(), pmu, - merge_sched_in, &can_add_hw); + merge_sched_in, &msd); } =20 static void __pmu_ctx_sched_in(struct perf_event_pmu_context *pmu_ctx, @@ -4050,9 +4162,9 @@ static void __pmu_ctx_sched_in(struct perf_event_pmu_= context *pmu_ctx, struct perf_event_context *ctx =3D pmu_ctx->ctx; =20 if (event_type & EVENT_PINNED) - pmu_groups_sched_in(ctx, &ctx->pinned_groups, pmu_ctx->pmu); + pmu_groups_sched_in(ctx, &ctx->pinned_groups, pmu_ctx->pmu, event_type); if (event_type & EVENT_FLEXIBLE) - pmu_groups_sched_in(ctx, &ctx->flexible_groups, pmu_ctx->pmu); + pmu_groups_sched_in(ctx, &ctx->flexible_groups, pmu_ctx->pmu, event_type= ); } =20 static void @@ -4069,9 +4181,11 @@ ctx_sched_in(struct perf_event_context *ctx, struct = pmu *pmu, enum event_type_t return; =20 if (!(is_active & EVENT_TIME)) { + /* EVENT_TIME should be active while the guest runs */ + WARN_ON_ONCE(event_type & EVENT_GUEST); /* start ctx time */ __update_context_time(ctx, false); - perf_cgroup_set_timestamp(cpuctx); + perf_cgroup_set_timestamp(cpuctx, false); /* * 
CPU-release for the below ->is_active store, * see __load_acquire() in perf_event_time_now() @@ -4087,7 +4201,23 @@ ctx_sched_in(struct perf_event_context *ctx, struct = pmu *pmu, enum event_type_t WARN_ON_ONCE(cpuctx->task_ctx !=3D ctx); } =20 - is_active ^=3D ctx->is_active; /* changed bits */ + if (event_type & EVENT_GUEST) { + /* + * Schedule in the required exclude_guest events of PMU + * with PERF_PMU_CAP_MEDIATED_VPMU. + */ + is_active =3D event_type & EVENT_ALL; + + /* + * Update ctx time to set the new start time for + * the exclude_guest events. + */ + update_context_time(ctx); + update_cgrp_time_from_cpuctx(cpuctx, false); + barrier(); + } else { + is_active ^=3D ctx->is_active; /* changed bits */ + } =20 /* * First go through the list and put on any pinned groups @@ -4095,13 +4225,13 @@ ctx_sched_in(struct perf_event_context *ctx, struct= pmu *pmu, enum event_type_t */ if (is_active & EVENT_PINNED) { for_each_epc(pmu_ctx, ctx, pmu, event_type) - __pmu_ctx_sched_in(pmu_ctx, EVENT_PINNED); + __pmu_ctx_sched_in(pmu_ctx, EVENT_PINNED | (event_type & EVENT_GUEST)); } =20 /* Then walk through the lower prio flexible groups */ if (is_active & EVENT_FLEXIBLE) { for_each_epc(pmu_ctx, ctx, pmu, event_type) - __pmu_ctx_sched_in(pmu_ctx, EVENT_FLEXIBLE); + __pmu_ctx_sched_in(pmu_ctx, EVENT_FLEXIBLE | (event_type & EVENT_GUEST)= ); } } =20 @@ -6627,23 +6757,23 @@ void perf_event_update_userpage(struct perf_event *= event) if (!rb) goto unlock; =20 - /* - * compute total_time_enabled, total_time_running - * based on snapshot values taken when the event - * was last scheduled in. - * - * we cannot simply called update_context_time() - * because of locking issue as we can be called in - * NMI context - */ - calc_timer_values(event, &now, &enabled, &running); - - userpg =3D rb->user_page; /* * Disable preemption to guarantee consistent time stamps are stored to * the user page. 
*/ preempt_disable(); + + /* + * Compute total_time_enabled, total_time_running based on snapshot + * values taken when the event was last scheduled in. + * + * We cannot simply call update_context_time() because doing so would + * lead to deadlock when called from NMI context. + */ + calc_timer_values(event, &now, &enabled, &running); + + userpg =3D rb->user_page; + ++userpg->lock; barrier(); userpg->index =3D perf_event_index(event); @@ -7940,13 +8070,11 @@ static void perf_output_read(struct perf_output_han= dle *handle, u64 read_format =3D event->attr.read_format; =20 /* - * compute total_time_enabled, total_time_running - * based on snapshot values taken when the event - * was last scheduled in. + * Compute total_time_enabled, total_time_running based on snapshot + * values taken when the event was last scheduled in. * - * we cannot simply called update_context_time() - * because of locking issue as we are called in - * NMI context + * We cannot simply call update_context_time() because doing so would + * lead to deadlock when called from NMI context. 
	 */
	if (read_format & PERF_FORMAT_TOTAL_TIMES)
		calc_timer_values(event, &now, &enabled, &running);
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:16:43 -0800
Message-ID: <20251206001720.468579-8-seanjc@google.com>
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Subject: [PATCH v6 07/44] perf: Add APIs to load/put guest mediated PMU context
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel,
 Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin",
 Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
 Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das,
 Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

From: Kan Liang

Add exported APIs to load/put a guest mediated PMU context.  KVM will load
the guest PMU shortly before VM-Enter, and put the guest PMU shortly after
VM-Exit.  On the perf side of things, schedule out all exclude_guest events
when the guest context is loaded, and schedule them back in when the guest
context is put.  I.e. yield the hardware PMU resources to the guest, by way
of KVM.

Note, perf is only responsible for managing host context.  KVM is
responsible for loading/storing guest state to/from hardware.
Suggested-by: Sean Christopherson
Signed-off-by: Kan Liang
Signed-off-by: Mingwei Zhang
[sean: shuffle patches around, write changelog]
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 include/linux/perf_event.h |  2 ++
 kernel/events/core.c       | 61 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index eaab830c9bf5..cfc8cd86c409 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1925,6 +1925,8 @@ extern u64 perf_event_pause(struct perf_event *event, bool reset);
 #ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
 int perf_create_mediated_pmu(void);
 void perf_release_mediated_pmu(void);
+void perf_load_guest_context(void);
+void perf_put_guest_context(void);
 #endif
 
 #else /* !CONFIG_PERF_EVENTS: */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f72d4844b05e..81c35859e6ea 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -469,10 +469,19 @@ static cpumask_var_t perf_online_pkg_mask;
 static cpumask_var_t perf_online_sys_mask;
 static struct kmem_cache *perf_event_cache;
 
+#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
+static DEFINE_PER_CPU(bool, guest_ctx_loaded);
+
+static __always_inline bool is_guest_mediated_pmu_loaded(void)
+{
+	return __this_cpu_read(guest_ctx_loaded);
+}
+#else
 static __always_inline bool is_guest_mediated_pmu_loaded(void)
 {
	return false;
 }
+#endif
 
 /*
  * perf event paranoia level:
@@ -6385,6 +6394,58 @@ void perf_release_mediated_pmu(void)
	atomic_dec(&nr_mediated_pmu_vms);
 }
 EXPORT_SYMBOL_GPL(perf_release_mediated_pmu);
+
+/* When loading a guest's mediated PMU, schedule out all exclude_guest events. */
+void perf_load_guest_context(void)
+{
+	struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
+
+	lockdep_assert_irqs_disabled();
+
+	guard(perf_ctx_lock)(cpuctx, cpuctx->task_ctx);
+
+	if (WARN_ON_ONCE(__this_cpu_read(guest_ctx_loaded)))
+		return;
+
+	perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST);
+	ctx_sched_out(&cpuctx->ctx, NULL, EVENT_GUEST);
+	if (cpuctx->task_ctx) {
+		perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST);
+		task_ctx_sched_out(cpuctx->task_ctx, NULL, EVENT_GUEST);
+	}
+
+	perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST);
+	if (cpuctx->task_ctx)
+		perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST);
+
+	__this_cpu_write(guest_ctx_loaded, true);
+}
+EXPORT_SYMBOL_GPL(perf_load_guest_context);
+
+void perf_put_guest_context(void)
+{
+	struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
+
+	lockdep_assert_irqs_disabled();
+
+	guard(perf_ctx_lock)(cpuctx, cpuctx->task_ctx);
+
+	if (WARN_ON_ONCE(!__this_cpu_read(guest_ctx_loaded)))
+		return;
+
+	perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST);
+	if (cpuctx->task_ctx)
+		perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST);
+
+	perf_event_sched_in(cpuctx, cpuctx->task_ctx, NULL, EVENT_GUEST);
+
+	if (cpuctx->task_ctx)
+		perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST);
+	perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST);
+
+	__this_cpu_write(guest_ctx_loaded, false);
+}
+EXPORT_SYMBOL_GPL(perf_put_guest_context);
 #else
 static int mediated_pmu_account_event(struct perf_event *event) { return 0; }
 static void mediated_pmu_unaccount_event(struct perf_event *event) {}
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:16:44 -0800
Message-ID: <20251206001720.468579-9-seanjc@google.com>
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Subject: [PATCH v6 08/44]
perf/x86/core: Register a new vector for handling mediated guest PMIs
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel,
 Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin",
 Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
 Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das,
 Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

Wire up system vector 0xf5 for handling PMIs (i.e. interrupts delivered
through the LVTPC) while running KVM guests with a mediated PMU.

Perf currently delivers all PMIs as NMIs, e.g. so that events that trigger
while IRQs are disabled aren't delayed and generate useless records, but
due to the multiplexing of NMIs throughout the system, correctly
identifying NMIs for a mediated PMU is practically infeasible.

To (greatly) simplify identifying guest mediated PMU PMIs, perf will switch
the CPU's LVTPC between PERF_GUEST_MEDIATED_PMI_VECTOR and NMI when guest
PMU context is loaded/put.  I.e. PMIs that are generated by the CPU while
the guest is active will be identified purely based on the IRQ vector.

Route the vector through perf, e.g. as opposed to letting KVM attach a
handler directly a la posted interrupt notification vectors, as perf owns
the LVTPC and thus is the rightful owner of PERF_GUEST_MEDIATED_PMI_VECTOR.
Functionally, having KVM directly own the vector would be fine (both KVM
and perf will be completely aware of when a mediated PMU is active), but
would lead to an undesirable split in ownership: perf would be responsible
for installing the vector, but not handling the resulting IRQs.

Add a new perf_guest_info_callbacks hook (and static call) to allow KVM to
register its handler with perf when running guests with mediated PMUs.

Note, because KVM always runs guests with host IRQs enabled, there is no
danger of a PMI being delayed from the guest's perspective due to using a
regular IRQ instead of an NMI.

Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/entry/entry_fred.c                           |  1 +
 arch/x86/include/asm/hardirq.h                        |  3 +++
 arch/x86/include/asm/idtentry.h                       |  6 ++++++
 arch/x86/include/asm/irq_vectors.h                    |  4 +++-
 arch/x86/kernel/idt.c                                 |  3 +++
 arch/x86/kernel/irq.c                                 | 19 +++++++++++++++++++
 include/linux/perf_event.h                            |  8 ++++++++
 kernel/events/core.c                                  |  9 +++++++--
 .../beauty/arch/x86/include/asm/irq_vectors.h         |  3 ++-
 virt/kvm/kvm_main.c                                   |  3 +++
 10 files changed, 55 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
index f004a4dc74c2..d80861a4cd00 100644
--- a/arch/x86/entry/entry_fred.c
+++ b/arch/x86/entry/entry_fred.c
@@ -114,6 +114,7 @@ static idtentry_t sysvec_table[NR_SYSTEM_VECTORS] __ro_after_init = {
 
	SYSVEC(IRQ_WORK_VECTOR,			irq_work),
 
+	SYSVEC(PERF_GUEST_MEDIATED_PMI_VECTOR,	perf_guest_mediated_pmi_handler),
	SYSVEC(POSTED_INTR_VECTOR,		kvm_posted_intr_ipi),
	SYSVEC(POSTED_INTR_WAKEUP_VECTOR,	kvm_posted_intr_wakeup_ipi),
	SYSVEC(POSTED_INTR_NESTED_VECTOR,	kvm_posted_intr_nested_ipi),
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 6b6d472baa0b..9314642ae93c 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -18,6 +18,9 @@ typedef struct {
	unsigned int kvm_posted_intr_ipis;
	unsigned int kvm_posted_intr_wakeup_ipis;
	unsigned int kvm_posted_intr_nested_ipis;
+#endif
+#ifdef CONFIG_GUEST_PERF_EVENTS
+	unsigned int perf_guest_mediated_pmis;
 #endif
	unsigned int x86_platform_ipis;	/* arch dependent */
	unsigned int apic_perf_irqs;
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index abd637e54e94..e64294c906bc 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -746,6 +746,12 @@ DECLARE_IDTENTRY_SYSVEC(POSTED_INTR_NESTED_VECTOR, sysvec_kvm_posted_intr_nested
 # define fred_sysvec_kvm_posted_intr_nested_ipi	NULL
 #endif
 
+# ifdef CONFIG_GUEST_PERF_EVENTS
+DECLARE_IDTENTRY_SYSVEC(PERF_GUEST_MEDIATED_PMI_VECTOR, sysvec_perf_guest_mediated_pmi_handler);
+#else
+# define fred_sysvec_perf_guest_mediated_pmi_handler	NULL
+#endif
+
 # ifdef CONFIG_X86_POSTED_MSI
 DECLARE_IDTENTRY_SYSVEC(POSTED_MSI_NOTIFICATION_VECTOR, sysvec_posted_msi_notification);
 #else
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 47051871b436..85253fc8e384 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -77,7 +77,9 @@
  */
 #define IRQ_WORK_VECTOR			0xf6
 
-/* 0xf5 - unused, was UV_BAU_MESSAGE */
+/* IRQ vector for PMIs when running a guest with a mediated PMU. */
+#define PERF_GUEST_MEDIATED_PMI_VECTOR	0xf5
+
 #define DEFERRED_ERROR_VECTOR		0xf4
 
 /* Vector on which hypervisor callbacks will be delivered */
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index f445bec516a0..260456588756 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -158,6 +158,9 @@ static const __initconst struct idt_data apic_idts[] = {
	INTG(POSTED_INTR_WAKEUP_VECTOR,		asm_sysvec_kvm_posted_intr_wakeup_ipi),
	INTG(POSTED_INTR_NESTED_VECTOR,		asm_sysvec_kvm_posted_intr_nested_ipi),
 # endif
+#ifdef CONFIG_GUEST_PERF_EVENTS
+	INTG(PERF_GUEST_MEDIATED_PMI_VECTOR,	asm_sysvec_perf_guest_mediated_pmi_handler),
+#endif
 # ifdef CONFIG_IRQ_WORK
	INTG(IRQ_WORK_VECTOR,			asm_sysvec_irq_work),
 # endif
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 10721a125226..e9050e69717e 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -191,6 +191,13 @@ int arch_show_interrupts(struct seq_file *p, int prec)
			   irq_stats(j)->kvm_posted_intr_wakeup_ipis);
	seq_puts(p, "  Posted-interrupt wakeup event\n");
 #endif
+#ifdef CONFIG_GUEST_PERF_EVENTS
+	seq_printf(p, "%*s: ", prec, "VPMI");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ",
+			   irq_stats(j)->perf_guest_mediated_pmis);
+	seq_puts(p, "  Perf Guest Mediated PMI\n");
+#endif
 #ifdef CONFIG_X86_POSTED_MSI
	seq_printf(p, "%*s: ", prec, "PMN");
	for_each_online_cpu(j)
@@ -348,6 +355,18 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_x86_platform_ipi)
 }
 #endif
 
+#ifdef CONFIG_GUEST_PERF_EVENTS
+/*
+ * Handler for PERF_GUEST_MEDIATED_PMI_VECTOR.
+ */
+DEFINE_IDTENTRY_SYSVEC(sysvec_perf_guest_mediated_pmi_handler)
+{
+	apic_eoi();
+	inc_irq_stat(perf_guest_mediated_pmis);
+	perf_guest_handle_mediated_pmi();
+}
+#endif
+
 #if IS_ENABLED(CONFIG_KVM)
 static void dummy_handler(void) {}
 static void (*kvm_posted_intr_wakeup_handler)(void) = dummy_handler;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index cfc8cd86c409..9abfca6cf655 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1677,6 +1677,8 @@ struct perf_guest_info_callbacks {
	unsigned int			(*state)(void);
	unsigned long			(*get_ip)(void);
	unsigned int			(*handle_intel_pt_intr)(void);
+
+	void				(*handle_mediated_pmi)(void);
 };
 
 #ifdef CONFIG_GUEST_PERF_EVENTS
@@ -1686,6 +1688,7 @@ extern struct perf_guest_info_callbacks __rcu *perf_guest_cbs;
 DECLARE_STATIC_CALL(__perf_guest_state, *perf_guest_cbs->state);
 DECLARE_STATIC_CALL(__perf_guest_get_ip, *perf_guest_cbs->get_ip);
 DECLARE_STATIC_CALL(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);
+DECLARE_STATIC_CALL(__perf_guest_handle_mediated_pmi, *perf_guest_cbs->handle_mediated_pmi);
 
 static inline unsigned int perf_guest_state(void)
 {
@@ -1702,6 +1705,11 @@ static inline unsigned int perf_guest_handle_intel_pt_intr(void)
	return static_call(__perf_guest_handle_intel_pt_intr)();
 }
 
+static inline void perf_guest_handle_mediated_pmi(void)
+{
+	static_call(__perf_guest_handle_mediated_pmi)();
+}
+
 extern void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs);
 extern void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs);
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 81c35859e6ea..c6368c64b866 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7645,6 +7645,7 @@ struct perf_guest_info_callbacks __rcu *perf_guest_cbs;
 DEFINE_STATIC_CALL_RET0(__perf_guest_state, *perf_guest_cbs->state);
 DEFINE_STATIC_CALL_RET0(__perf_guest_get_ip, *perf_guest_cbs->get_ip);
 DEFINE_STATIC_CALL_RET0(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);
+DEFINE_STATIC_CALL_RET0(__perf_guest_handle_mediated_pmi, *perf_guest_cbs->handle_mediated_pmi);
 
 void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
 {
@@ -7659,6 +7660,10 @@ void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
	if (cbs->handle_intel_pt_intr)
		static_call_update(__perf_guest_handle_intel_pt_intr,
				   cbs->handle_intel_pt_intr);
+
+	if (cbs->handle_mediated_pmi)
+		static_call_update(__perf_guest_handle_mediated_pmi,
+				   cbs->handle_mediated_pmi);
 }
 EXPORT_SYMBOL_GPL(perf_register_guest_info_callbacks);
 
@@ -7670,8 +7675,8 @@ void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
	rcu_assign_pointer(perf_guest_cbs, NULL);
	static_call_update(__perf_guest_state, (void *)&__static_call_return0);
	static_call_update(__perf_guest_get_ip, (void *)&__static_call_return0);
-	static_call_update(__perf_guest_handle_intel_pt_intr,
-			   (void *)&__static_call_return0);
+	static_call_update(__perf_guest_handle_intel_pt_intr, (void *)&__static_call_return0);
+	static_call_update(__perf_guest_handle_mediated_pmi, (void *)&__static_call_return0);
	synchronize_rcu();
 }
 EXPORT_SYMBOL_GPL(perf_unregister_guest_info_callbacks);
diff --git a/tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h b/tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h
index 47051871b436..6e1d5b955aae 100644
--- a/tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h
+++ b/tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h
@@ -77,7 +77,8 @@
  */
 #define IRQ_WORK_VECTOR			0xf6
 
-/* 0xf5 - unused, was UV_BAU_MESSAGE */
+#define PERF_GUEST_MEDIATED_PMI_VECTOR	0xf5
+
 #define DEFERRED_ERROR_VECTOR		0xf4
 
 /* Vector on which hypervisor callbacks will be delivered */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f1f6a71b2b5f..4954cbbb05e8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -6471,11 +6471,14 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
	.state			= kvm_guest_state,
	.get_ip			= kvm_guest_get_ip,
	.handle_intel_pt_intr	= NULL,
+	.handle_mediated_pmi	= NULL,
 };
 
 void kvm_register_perf_callbacks(unsigned int (*pt_intr_handler)(void))
 {
	kvm_guest_cbs.handle_intel_pt_intr = pt_intr_handler;
+	kvm_guest_cbs.handle_mediated_pmi = NULL;
+
	perf_register_guest_info_callbacks(&kvm_guest_cbs);
 }
 
 void kvm_unregister_perf_callbacks(void)
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:45 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-10-seanjc@google.com>
Subject: [PATCH v6 09/44] perf/x86/core: Add APIs to switch to/from mediated PMI vector (for KVM)
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

Add APIs (exported only for KVM) to switch PMIs to the dedicated mediated
PMU IRQ vector when loading guest context, and back to perf's standard NMI
when the guest context is put.  I.e.
route PMIs to PERF_GUEST_MEDIATED_PMI_VECTOR
while the guest context is active, and to NMIs while the host context is
active.

While running with guest context loaded, ignore all NMIs (in perf).  Any
NMI that arrives while the LVTPC points at the mediated PMU IRQ vector
can't possibly be due to a host perf event.

Signed-off-by: Sean Christopherson
---
 arch/x86/events/core.c            | 32 +++++++++++++++++++++++++++++++
 arch/x86/include/asm/perf_event.h |  5 +++++
 2 files changed, 37 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index fa6c47b50989..abe6a129a87f 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -55,6 +55,8 @@ DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events) = {
 	.pmu = &pmu,
 };
 
+static DEFINE_PER_CPU(bool, guest_lvtpc_loaded);
+
 DEFINE_STATIC_KEY_FALSE(rdpmc_never_available_key);
 DEFINE_STATIC_KEY_FALSE(rdpmc_always_available_key);
 DEFINE_STATIC_KEY_FALSE(perf_is_hybrid);
@@ -1749,6 +1751,25 @@ void perf_events_lapic_init(void)
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 }
 
+#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
+void perf_load_guest_lvtpc(u32 guest_lvtpc)
+{
+	u32 masked = guest_lvtpc & APIC_LVT_MASKED;
+
+	apic_write(APIC_LVTPC,
+		   APIC_DM_FIXED | PERF_GUEST_MEDIATED_PMI_VECTOR | masked);
+	this_cpu_write(guest_lvtpc_loaded, true);
+}
+EXPORT_SYMBOL_FOR_MODULES(perf_load_guest_lvtpc, "kvm");
+
+void perf_put_guest_lvtpc(void)
+{
+	this_cpu_write(guest_lvtpc_loaded, false);
+	apic_write(APIC_LVTPC, APIC_DM_NMI);
+}
+EXPORT_SYMBOL_FOR_MODULES(perf_put_guest_lvtpc, "kvm");
+#endif /* CONFIG_PERF_GUEST_MEDIATED_PMU */
+
 static int
 perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 {
@@ -1756,6 +1777,17 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 	u64 finish_clock;
 	int ret;
 
+	/*
+	 * Ignore all NMIs when the CPU's LVTPC is configured to route PMIs to
+	 * PERF_GUEST_MEDIATED_PMI_VECTOR, i.e. when an incoming NMI can't be
+	 * due to a PMI.  Attempting to handle a PMI while the guest's context
+	 * is loaded will generate false positives and clobber guest state.
+	 * Note, the LVTPC is switched to/from the dedicated mediated PMI IRQ
+	 * vector while host events are quiesced.
+	 */
+	if (this_cpu_read(guest_lvtpc_loaded))
+		return NMI_DONE;
+
 	/*
 	 * All PMUs/events that share this PMI handler should make sure to
 	 * increment active_events for their events.
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 49a4d442f3fc..4cd38b9da0ba 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -651,6 +651,11 @@ static inline void perf_events_lapic_init(void) { }
 static inline void perf_check_microcode(void) { }
 #endif
 
+#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
+extern void perf_load_guest_lvtpc(u32 guest_lvtpc);
+extern void perf_put_guest_lvtpc(void);
+#endif
+
 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
 extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr, void *data);
 extern void x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
-- 
2.52.0.223.gf5cc29aaa4-goog
From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:46 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-11-seanjc@google.com>
Subject: [PATCH v6 10/44] perf/x86/core: Do not set bit width for unavailable counters
From: Sean Christopherson

From: Sandipan Das

Not all x86 processors have fixed counters.  It may also be the case that
a processor has only fixed counters and no general-purpose counters.  Set
the bit widths corresponding to each counter type only if such counters
are available.

Fixes: b3d9468a8bd2 ("perf, x86: Expose perf capability to other modules")
Signed-off-by: Sandipan Das
Co-developed-by: Dapeng Mi
Signed-off-by: Dapeng Mi
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/events/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index abe6a129a87f..bd9abe298469 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -3132,8 +3132,8 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
 	cap->version		= x86_pmu.version;
 	cap->num_counters_gp	= x86_pmu_num_counters(NULL);
 	cap->num_counters_fixed	= x86_pmu_num_counters_fixed(NULL);
-	cap->bit_width_gp	= x86_pmu.cntval_bits;
-	cap->bit_width_fixed	= x86_pmu.cntval_bits;
+	cap->bit_width_gp	= cap->num_counters_gp ? x86_pmu.cntval_bits : 0;
+	cap->bit_width_fixed	= cap->num_counters_fixed ? x86_pmu.cntval_bits : 0;
 	cap->events_mask	= (unsigned int)x86_pmu.events_maskl;
 	cap->events_mask_len	= x86_pmu.events_mask_len;
 	cap->pebs_ept		= x86_pmu.pebs_ept;
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:47 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-12-seanjc@google.com>
Subject: [PATCH v6 11/44] perf/x86/core: Plumb mediated PMU capability from x86_pmu to x86_pmu_cap
From: Sean Christopherson

From: Mingwei Zhang

Plumb the mediated PMU capability into x86_pmu_cap so that kernel
entities such as KVM can discover that the host PMU supports the mediated
PMU mode and has the implementation.

Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/events/core.c            | 1 +
 arch/x86/include/asm/perf_event.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index bd9abe298469..1b50f0117876 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -3137,6 +3137,7 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
 	cap->events_mask	= (unsigned int)x86_pmu.events_maskl;
 	cap->events_mask_len	= x86_pmu.events_mask_len;
 	cap->pebs_ept		= x86_pmu.pebs_ept;
+	cap->mediated		= !!(pmu.capabilities & PERF_PMU_CAP_MEDIATED_VPMU);
 }
 EXPORT_SYMBOL_GPL(perf_get_x86_pmu_capability);
 
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 4cd38b9da0ba..4714bdee17b2 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -296,6 +296,7 @@ struct x86_pmu_capability {
 	unsigned int events_mask;
 	int events_mask_len;
 	unsigned int pebs_ept	:1;
+	unsigned int mediated	:1;
 };
 
 /*
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:48 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-13-seanjc@google.com>
Subject: [PATCH v6 12/44] perf/x86/intel: Support PERF_PMU_CAP_MEDIATED_VPMU
From: Sean Christopherson

From: Kan Liang

Apply PERF_PMU_CAP_MEDIATED_VPMU to the Intel core PMU.  The capability
only indicates that the perf side of the core PMU is ready to support the
mediated vPMU.  Beyond the capability, the hypervisor, a.k.a. KVM, still
needs to check the PMU version and other PMU features/capabilities to
decide whether to enable mediated vPMU support.

Signed-off-by: Kan Liang
Signed-off-by: Mingwei Zhang
[sean: massage changelog]
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/events/intel/core.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index fe65be0b9d9c..5d53da858714 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5319,6 +5319,8 @@ static void intel_pmu_check_hybrid_pmus(struct x86_hybrid_pmu *pmu)
 	else
 		pmu->intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS;
 
+	pmu->pmu.capabilities |= PERF_PMU_CAP_MEDIATED_VPMU;
+
 	intel_pmu_check_event_constraints(pmu->event_constraints,
 					  pmu->cntr_mask64,
 					  pmu->fixed_cntr_mask64,
@@ -6936,6 +6938,9 @@ __init int intel_pmu_init(void)
 		pr_cont(" AnyThread deprecated, ");
 	}
 
+	/* The perf side of core PMU is ready to support the mediated vPMU. */
+	x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_MEDIATED_VPMU;
+
 	/*
 	 * Many features on and after V6 require dynamic constraint,
 	 * e.g., Arch PEBS, ACR.
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:49 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-14-seanjc@google.com>
Subject: [PATCH v6 13/44] perf/x86/amd: Support PERF_PMU_CAP_MEDIATED_VPMU for AMD host
From: Sean Christopherson

From: Sandipan Das

Apply the PERF_PMU_CAP_MEDIATED_VPMU flag for version 2 and later
implementations of the core PMU.  Aside from having Global Control and
Status registers, virtualizing the PMU using the mediated model requires
an interface to set or clear the overflow bits in the Global Status MSRs
while restoring or saving the PMU context of a vCPU.  PerfMonV2-capable
hardware has additional MSRs for this purpose, namely
PerfCntrGlobalStatusSet and PerfCntrGlobalStatusClr, thereby making it
suitable for use with mediated vPMU.
Signed-off-by: Sandipan Das
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/events/amd/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index b20661b8621d..8179fb5f1ee3 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -1433,6 +1433,8 @@ static int __init amd_core_pmu_init(void)

 	amd_pmu_global_cntr_mask = x86_pmu.cntr_mask64;

+	x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_MEDIATED_VPMU;
+
 	/* Update PMC handling functions */
 	x86_pmu.enable_all = amd_pmu_v2_enable_all;
 	x86_pmu.disable_all = amd_pmu_v2_disable_all;
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:50 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-15-seanjc@google.com>
Subject: [PATCH v6 14/44] KVM: Add a simplified wrapper for registering perf callbacks
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

Add a parameter-less API for registering perf callbacks in anticipation
of introducing another x86-only parameter for handling mediated PMU
PMIs.

No functional change intended.

Acked-by: Anup Patel
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/arm64/kvm/arm.c      |  2 +-
 arch/loongarch/kvm/main.c |  2 +-
 arch/riscv/kvm/main.c     |  2 +-
 arch/x86/kvm/x86.c        |  2 +-
 include/linux/kvm_host.h  | 11 +++++++++--
 virt/kvm/kvm_main.c       |  5 +++--
 6 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 97627638e802..153166ca626a 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2357,7 +2357,7 @@ static int __init init_subsystems(void)
 	if (err)
 		goto out;

-	kvm_register_perf_callbacks(NULL);
+	kvm_register_perf_callbacks();

 out:
 	if (err)
diff --git a/arch/loongarch/kvm/main.c b/arch/loongarch/kvm/main.c
index 80ea63d465b8..f62326fe29fa 100644
--- a/arch/loongarch/kvm/main.c
+++ b/arch/loongarch/kvm/main.c
@@ -394,7 +394,7 @@ static int kvm_loongarch_env_init(void)
 	}

 	kvm_init_gcsr_flag();
-	kvm_register_perf_callbacks(NULL);
+	kvm_register_perf_callbacks();

 	/* Register LoongArch IPI interrupt controller interface. */
 	ret = kvm_loongarch_register_ipi_device();
diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index 45536af521f0..0f3fe3986fc0 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -174,7 +174,7 @@ static int __init riscv_kvm_init(void)

 	kvm_riscv_setup_vendor_features();

-	kvm_register_perf_callbacks(NULL);
+	kvm_register_perf_callbacks();

 	rc = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
 	if (rc) {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0c6d899d53dd..1b2827cecf38 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10107,7 +10107,7 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	set_hv_tscchange_cb(kvm_hyperv_tsc_notifier);
 #endif

-	kvm_register_perf_callbacks(ops->handle_intel_pt_intr);
+	__kvm_register_perf_callbacks(ops->handle_intel_pt_intr, NULL);

 	if (IS_ENABLED(CONFIG_KVM_SW_PROTECTED_VM) && tdp_mmu_enabled)
 		kvm_caps.supported_vm_types |= BIT(KVM_X86_SW_PROTECTED_VM);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d93f75b05ae2..8e410d1a63df 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1749,10 +1749,17 @@ static inline bool kvm_arch_intc_initialized(struct kvm *kvm)
 #ifdef CONFIG_GUEST_PERF_EVENTS
 unsigned long kvm_arch_vcpu_get_ip(struct kvm_vcpu *vcpu);

-void kvm_register_perf_callbacks(unsigned int (*pt_intr_handler)(void));
+void __kvm_register_perf_callbacks(unsigned int (*pt_intr_handler)(void),
+				   void (*mediated_pmi_handler)(void));
+
+static inline void kvm_register_perf_callbacks(void)
+{
+	__kvm_register_perf_callbacks(NULL, NULL);
+}
+
 void kvm_unregister_perf_callbacks(void);
 #else
-static inline void kvm_register_perf_callbacks(void *ign) {}
+static inline void kvm_register_perf_callbacks(void) {}
 static inline void kvm_unregister_perf_callbacks(void) {}
 #endif /* CONFIG_GUEST_PERF_EVENTS */

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4954cbbb05e8..16b24da9cda5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -6474,10 +6474,11 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
 	.handle_mediated_pmi = NULL,
 };

-void kvm_register_perf_callbacks(unsigned int (*pt_intr_handler)(void))
+void __kvm_register_perf_callbacks(unsigned int (*pt_intr_handler)(void),
+				   void (*mediated_pmi_handler)(void))
 {
 	kvm_guest_cbs.handle_intel_pt_intr = pt_intr_handler;
-	kvm_guest_cbs.handle_mediated_pmi = NULL;
+	kvm_guest_cbs.handle_mediated_pmi = mediated_pmi_handler;

 	perf_register_guest_info_callbacks(&kvm_guest_cbs);
 }
-- 
2.52.0.223.gf5cc29aaa4-goog
From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:51 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-16-seanjc@google.com>
Subject: [PATCH v6 15/44] KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

Take a snapshot of the unadulterated PMU capabilities provided by perf
so that KVM can compare guest vPMU capabilities against hardware
capabilities when determining whether or not to intercept PMU MSRs (and
RDPMC).

Reviewed-by: Sandipan Das
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 487ad19a236e..7c219305b61d 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -108,6 +108,8 @@ void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
 	bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
 	int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS;

+	perf_get_x86_pmu_capability(&kvm_host_pmu);
+
 	/*
	 * Hybrid PMUs don't play nice with virtualization without careful
	 * configuration by userspace, and KVM's APIs for reporting supported
-- 
2.52.0.223.gf5cc29aaa4-goog
From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:52 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-17-seanjc@google.com>
Subject: [PATCH v6 16/44] KVM: x86/pmu: Start stubbing in mediated PMU support
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

From: Dapeng Mi

Introduce enable_mediated_pmu as a global variable, with the intent of
exposing it to userspace as a vendor module parameter, to control and
reflect mediated vPMU support. Wire up the perf plumbing to create and
release a mediated PMU, but defer exposing the parameter to userspace
until KVM support for mediated PMUs is fully landed.

To (a) minimize compatibility issues, (b) give userspace a chance to opt
out of the restrictive side effects of perf_create_mediated_pmu(), and
(c) avoid adding new dependencies between enabling an in-kernel irqchip
and a mediated vPMU, defer "creating" a mediated PMU in perf until the
first vCPU is created.

Regarding userspace compatibility, an alternative solution would be to
make the mediated PMU fully opt-in, e.g. to avoid unexpected failure due
to perf_create_mediated_pmu() failing. Ironically, that approach creates
an even bigger compatibility issue, as turning on enable_mediated_pmu
would silently break VMMs that don't utilize KVM_CAP_PMU_CAPABILITY
(well, silently until the guest tried to access PMU assets).

Regarding an in-kernel irqchip, create a mediated PMU if and only if the
VM has an in-kernel local APIC, as the mediated PMU will take a hard
dependency on forwarding PMIs to the guest without bouncing through host
userspace.
Silently "drop" the PMU instead of rejecting KVM_CREATE_VCPU, as KVM's
existing vPMU support doesn't function correctly if the local APIC is
emulated by userspace, e.g. PMIs will never be delivered. I.e. it's far,
far more likely that rejecting KVM_CREATE_VCPU would cause problems,
e.g. for tests or userspace daemons that just want to probe basic KVM
functionality.

Note! Deliberately make mediated PMU creation "sticky", i.e. don't
unwind it on failure to create a vCPU. Practically speaking, there's no
harm in having a VM with a mediated PMU and no vCPUs. To avoid an
"impossible" VM setup, reject KVM_CAP_PMU_CAPABILITY if a mediated PMU
has been created, i.e. don't let userspace disable PMU support after
failed vCPU creation (with PMU support enabled).

Defer vendor-specific requirements and constraints to the future.

Suggested-by: Sean Christopherson
Signed-off-by: Dapeng Mi
Co-developed-by: Mingwei Zhang
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/pmu.c              |  4 ++++
 arch/x86/kvm/pmu.h              |  7 +++++++
 arch/x86/kvm/x86.c              | 37 +++++++++++++++++++++++++++++++--
 arch/x86/kvm/x86.h              |  1 +
 5 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5a3bfa293e8b..defd979003be 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1484,6 +1484,7 @@ struct kvm_arch {

 	bool bus_lock_detection_enabled;
 	bool enable_pmu;
+	bool created_mediated_pmu;

 	u32 notify_window;
 	u32 notify_vmexit_flags;
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 7c219305b61d..0de0af5c6e4f 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -137,6 +137,10 @@ void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
 		enable_pmu = false;
 	}

+	if (!enable_pmu || !enable_mediated_pmu || !kvm_host_pmu.mediated ||
+	    !pmu_ops->is_mediated_pmu_supported(&kvm_host_pmu))
+		enable_mediated_pmu = false;
+
 	if (!enable_pmu) {
 		memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
 		return;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 5c3939e91f1d..a5c7c026b919 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -37,6 +37,8 @@ struct kvm_pmu_ops {
 	void (*deliver_pmi)(struct kvm_vcpu *vcpu);
 	void (*cleanup)(struct kvm_vcpu *vcpu);

+	bool (*is_mediated_pmu_supported)(struct x86_pmu_capability *host_pmu);
+
 	const u64 EVENTSEL_EVENT;
 	const int MAX_NR_GP_COUNTERS;
 	const int MIN_NR_GP_COUNTERS;
@@ -58,6 +60,11 @@ static inline bool kvm_pmu_has_perf_global_ctrl(struct kvm_pmu *pmu)
 	return pmu->version > 1;
 }

+static inline bool kvm_vcpu_has_mediated_pmu(struct kvm_vcpu *vcpu)
+{
+	return enable_mediated_pmu && vcpu_to_pmu(vcpu)->version;
+}
+
 /*
  * KVM tracks all counters in 64-bit bitmaps, with general purpose counters
  * mapped to bits 31:0 and fixed counters mapped to 63:32, e.g. fixed counter 0
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1b2827cecf38..fb3a5e861553 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -183,6 +183,10 @@ bool __read_mostly enable_pmu = true;
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(enable_pmu);
 module_param(enable_pmu, bool, 0444);

+/* Enable/disabled mediated PMU virtualization. */
+bool __read_mostly enable_mediated_pmu;
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(enable_mediated_pmu);
+
 bool __read_mostly eager_page_split = true;
 module_param(eager_page_split, bool, 0644);

@@ -6854,7 +6858,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			break;

 		mutex_lock(&kvm->lock);
-		if (!kvm->created_vcpus) {
+		if (!kvm->created_vcpus && !kvm->arch.created_mediated_pmu) {
 			kvm->arch.enable_pmu = !(cap->args[0] & KVM_PMU_CAP_DISABLE);
 			r = 0;
 		}
@@ -12641,8 +12645,13 @@ static int sync_regs(struct kvm_vcpu *vcpu)
 	return 0;
 }

+#define PERF_MEDIATED_PMU_MSG \
+	"Failed to enable mediated vPMU, try disabling system wide perf events and nmi_watchdog.\n"
+
 int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
 {
+	int r;
+
 	if (kvm_check_tsc_unstable() && kvm->created_vcpus)
 		pr_warn_once("SMP vm created on host with unstable TSC; "
 			     "guest TSC will not be reliable\n");
@@ -12653,7 +12662,29 @@ int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
 	if (id >= kvm->arch.max_vcpu_ids)
 		return -EINVAL;

-	return kvm_x86_call(vcpu_precreate)(kvm);
+	/*
+	 * Note, any actions done by .vcpu_create() must be idempotent with
+	 * respect to creating multiple vCPUs, and therefore are not undone if
+	 * creating a vCPU fails (including failure during pre-create).
+	 */
+	r = kvm_x86_call(vcpu_precreate)(kvm);
+	if (r)
+		return r;
+
+	if (enable_mediated_pmu && kvm->arch.enable_pmu &&
+	    !kvm->arch.created_mediated_pmu) {
+		if (irqchip_in_kernel(kvm)) {
+			r = perf_create_mediated_pmu();
+			if (r) {
+				pr_warn_ratelimited(PERF_MEDIATED_PMU_MSG);
+				return r;
+			}
+			kvm->arch.created_mediated_pmu = true;
+		} else {
+			kvm->arch.enable_pmu = false;
+		}
+	}
+	return 0;
 }

 int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
@@ -13319,6 +13350,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 		__x86_set_memory_region(kvm, TSS_PRIVATE_MEMSLOT, 0, 0);
 		mutex_unlock(&kvm->slots_lock);
 	}
+	if (kvm->arch.created_mediated_pmu)
+		perf_release_mediated_pmu();
 	kvm_destroy_vcpus(kvm);
 	kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
 #ifdef CONFIG_KVM_IOAPIC
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index fdab0ad49098..6e1fb1680c0a 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -470,6 +470,7 @@ extern struct kvm_caps kvm_caps;
 extern struct kvm_host_values kvm_host;

 extern bool enable_pmu;
+extern bool enable_mediated_pmu;

 /*
  * Get a filtered version of KVM's supported XCR0 that strips out dynamic
-- 
2.52.0.223.gf5cc29aaa4-goog
:date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=hhYBOxKMeDA97YbyTKBhQywJKJiMZ56sAPho58tY+oI=; b=H+N/uV2JYMdBCm4qjsLdyyPLap/Iz7LjoGaR/Mvc6mh2YD4fhnMatnsX7nGWlONo9S gZ6UXl4m0kVcR2/VQ7cB2oaSM37gbNAhGhSM9G8oMM/s1XJw2WFapuvKlsGTvPtAbGBe fQbz+G71gtOlTtuS544sNr2qhNmjeYx6zh1ZgSco3ecrRgFHac4ptS6z6nMasqvqpRRY //Nxlx49NqxyltSwxxWEBcxu14P4hQjn9FDJwMUrOh5juPaW6pR5wO62MX1Zcp99IX/Q UnUiwPNonSIGC+jfZuVl7lYfiku8X+I3m/oN1+AdftRC4GAw2mKGPmMobl7I1dnfSem9 JUIQ== X-Forwarded-Encrypted: i=1; AJvYcCXfnlh3NTCb9j3mBABiZw/kvAltIXsr9UA3WIeq/cneiBkKiAT3tWpCTWUJBuygR384yGoihtpCZ6oQIyk=@vger.kernel.org X-Gm-Message-State: AOJu0YxKbCf00mVllib14T22gf5kNUEd2u9uqJMNCEHR3Pd1mw+HyhjB +LCrmEMPod3eDwj1RsCqTm5ZGA3lX7kHFRJTIoa8DToByi6hKc+SH08ND8tOcGjNX95ZjBNm9H1 btR3hQw== X-Google-Smtp-Source: AGHT+IFFMAhL4pZP/9HCypRtffNLi9XXG9wzK7MTQvl2xq8p68VDBLvLazIKhLosMJgd33GPPNkDHe8jMow= X-Received: from pjbbo18.prod.google.com ([2002:a17:90b:912:b0:33b:c211:1fa9]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90a:ce07:b0:32d:d5f1:fe7f with SMTP id 98e67ed59e1d1-349a24f3283mr493681a91.15.1764980280055; Fri, 05 Dec 2025 16:18:00 -0800 (PST) Reply-To: Sean Christopherson Date: Fri, 5 Dec 2025 16:16:53 -0800 In-Reply-To: <20251206001720.468579-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20251206001720.468579-1-seanjc@google.com> X-Mailer: git-send-email 2.52.0.223.gf5cc29aaa4-goog Message-ID: <20251206001720.468579-18-seanjc@google.com> Subject: [PATCH v6 17/44] KVM: x86/pmu: Implement Intel mediated PMU requirements and constraints From: Sean Christopherson To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Xin Li , "H. 
 Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das,
 Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson
Content-Type: text/plain; charset="utf-8"

From: Dapeng Mi

Implement Intel PMU requirements and constraints for mediated PMU support.
Require host PMU version 4+ so that PERF_GLOBAL_STATUS_SET can be used to
precisely load the guest's status value into hardware, and require
full-width writes so that KVM can precisely load guest counter values.

Disable PEBS and LBRs if mediated PMU support is enabled, as they won't be
supported in the initial implementation.

Signed-off-by: Dapeng Mi
Co-developed-by: Mingwei Zhang
Signed-off-by: Mingwei Zhang
[sean: split to separate patch, add full-width writes dependency]
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/capabilities.h |  3 ++-
 arch/x86/kvm/vmx/pmu_intel.c    | 17 +++++++++++++++++
 arch/x86/kvm/vmx/vmx.c          |  3 ++-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 02aadb9d730e..26302fd6dd9c 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -395,7 +395,8 @@ static inline bool vmx_pt_mode_is_host_guest(void)
 
 static inline bool vmx_pebs_supported(void)
 {
-        return boot_cpu_has(X86_FEATURE_PEBS) && kvm_pmu_cap.pebs_ept;
+        return boot_cpu_has(X86_FEATURE_PEBS) && kvm_pmu_cap.pebs_ept &&
+               !enable_mediated_pmu;
 }
 
 static inline bool cpu_has_notify_vmexit(void)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index de1d9785c01f..050c21298213 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -767,6 +767,20 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)
         }
 }
 
+static bool intel_pmu_is_mediated_pmu_supported(struct x86_pmu_capability *host_pmu)
+{
+        u64 host_perf_cap = 0;
+
+        if (boot_cpu_has(X86_FEATURE_PDCM))
+                rdmsrq(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
+
+        /*
+         * Require v4+ for MSR_CORE_PERF_GLOBAL_STATUS_SET, and full-width
+         * writes so that KVM can precisely load guest counter values.
+         */
+        return host_pmu->version >= 4 && host_perf_cap & PERF_CAP_FW_WRITES;
+}
+
 struct kvm_pmu_ops intel_pmu_ops __initdata = {
         .rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
         .msr_idx_to_pmc = intel_msr_idx_to_pmc,
@@ -778,6 +792,9 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
         .reset = intel_pmu_reset,
         .deliver_pmi = intel_pmu_deliver_pmi,
         .cleanup = intel_pmu_cleanup,
+
+        .is_mediated_pmu_supported = intel_pmu_is_mediated_pmu_supported,
+
         .EVENTSEL_EVENT = ARCH_PERFMON_EVENTSEL_EVENT,
         .MAX_NR_GP_COUNTERS = KVM_MAX_NR_INTEL_GP_COUNTERS,
         .MIN_NR_GP_COUNTERS = 1,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4cbe8c84b636..fdd18ad1ede3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7958,7 +7958,8 @@ static __init u64 vmx_get_perf_capabilities(void)
         if (boot_cpu_has(X86_FEATURE_PDCM))
                 rdmsrq(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
 
-        if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR)) {
+        if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR) &&
+            !enable_mediated_pmu) {
                 x86_perf_get_lbr(&vmx_lbr_caps);
 
                 /*
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:54 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-19-seanjc@google.com>
Subject: [PATCH v6 18/44]
 KVM: x86/pmu: Implement AMD mediated PMU requirements
From: Sean Christopherson
Content-Type: text/plain; charset="utf-8"

Require host PMU version 2+ for AMD mediated PMU support, as
PERF_GLOBAL_CTRL and friends are hard requirements for the mediated PMU.

Signed-off-by: Dapeng Mi
Co-developed-by: Mingwei Zhang
Signed-off-by: Mingwei Zhang
[sean: extract to separate patch, write changelog]
Reviewed-by: Sandipan Das
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/svm/pmu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index bc062285fbf5..16c88b2a2eb8 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -227,6 +227,11 @@ static void amd_pmu_init(struct kvm_vcpu *vcpu)
         }
 }
 
+static bool amd_pmu_is_mediated_pmu_supported(struct x86_pmu_capability *host_pmu)
+{
+        return host_pmu->version >= 2;
+}
+
 struct kvm_pmu_ops amd_pmu_ops __initdata = {
         .rdpmc_ecx_to_pmc = amd_rdpmc_ecx_to_pmc,
         .msr_idx_to_pmc = amd_msr_idx_to_pmc,
@@ -236,6 +241,9 @@ struct kvm_pmu_ops amd_pmu_ops __initdata = {
         .set_msr = amd_pmu_set_msr,
         .refresh = amd_pmu_refresh,
         .init = amd_pmu_init,
+
+        .is_mediated_pmu_supported = amd_pmu_is_mediated_pmu_supported,
+
         .EVENTSEL_EVENT = AMD64_EVENTSEL_EVENT,
         .MAX_NR_GP_COUNTERS = KVM_MAX_NR_AMD_GP_COUNTERS,
         .MIN_NR_GP_COUNTERS = AMD64_NUM_COUNTERS,
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:55 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-20-seanjc@google.com>
Subject: [PATCH v6 19/44] KVM: x86/pmu: Register PMI handler for mediated vPMU
From: Sean Christopherson
Content-Type: text/plain; charset="utf-8"

From: Xiong Zhang

Register a dedicated PMI handler with perf's callback mechanism when
mediated PMU support is enabled.  Perf routes PMIs that arrive while guest
context is loaded to the provided callback by modifying the CPU's LVTPC to
point at a dedicated mediated PMI IRQ vector.

WARN upon receipt of a mediated PMI if there is no active vCPU, or if the
vCPU doesn't have a mediated PMU.  Even if a PMI manages to skid past
VM-Exit, it should never be delayed all the way beyond unloading the vCPU.
And while running vCPUs without a mediated PMU, the LVTPC should never be
wired up to the mediated PMI IRQ vector, i.e. PMIs should always be routed
through perf's NMI handler.
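The routing contract above can be modeled in user space with stub types;
kvm_get_running_vcpu(), KVM_REQ_PMI, and the WARN are the real kernel
concepts, while the struct and globals below are simplified stand-ins,
not KVM code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for KVM's vCPU and deferred-request machinery. */
struct vcpu {
        bool has_mediated_pmu;
        unsigned long requests;
};

#define REQ_PMI (1UL << 0)

static struct vcpu *running_vcpu;       /* kvm_get_running_vcpu() analogue */
static int warn_hits;                   /* counts WARN_ON_ONCE-style hits */

/*
 * Mirrors the handler's logic: a mediated PMI must only arrive while a
 * vCPU with a mediated PMU is loaded; anything else indicates a bug.
 */
static void handle_guest_mediated_pmi(void)
{
        struct vcpu *vcpu = running_vcpu;

        if (!vcpu || !vcpu->has_mediated_pmu) {
                warn_hits++;            /* PMI skidded past unloading the vCPU */
                return;
        }
        vcpu->requests |= REQ_PMI;      /* kvm_make_request(KVM_REQ_PMI, vcpu) */
}
```

The key property is that the handler never touches PMU state directly; it
only pends a request that the vCPU processes at the next safe point.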
Signed-off-by: Xiong Zhang
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.c | 10 ++++++++++
 arch/x86/kvm/pmu.h |  2 ++
 arch/x86/kvm/x86.c |  3 ++-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 0de0af5c6e4f..b3dde9a836ea 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -159,6 +159,16 @@ void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
                 perf_get_hw_event_config(PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
 }
 
+void kvm_handle_guest_mediated_pmi(void)
+{
+        struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
+
+        if (WARN_ON_ONCE(!vcpu || !kvm_vcpu_has_mediated_pmu(vcpu)))
+                return;
+
+        kvm_make_request(KVM_REQ_PMI, vcpu);
+}
+
 static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 {
         struct kvm_pmu *pmu = pmc_to_pmu(pmc);
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index a5c7c026b919..9849c2bb720d 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -46,6 +46,8 @@ struct kvm_pmu_ops {
 
 void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops);
 
+void kvm_handle_guest_mediated_pmi(void);
+
 static inline bool kvm_pmu_has_perf_global_ctrl(struct kvm_pmu *pmu)
 {
         /*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fb3a5e861553..1623afddff3b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10111,7 +10111,8 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
         set_hv_tscchange_cb(kvm_hyperv_tsc_notifier);
 #endif
 
-        __kvm_register_perf_callbacks(ops->handle_intel_pt_intr, NULL);
+        __kvm_register_perf_callbacks(ops->handle_intel_pt_intr,
+                                      enable_mediated_pmu ?
                                      kvm_handle_guest_mediated_pmi : NULL);
 
         if (IS_ENABLED(CONFIG_KVM_SW_PROTECTED_VM) && tdp_mmu_enabled)
                 kvm_caps.supported_vm_types |= BIT(KVM_X86_SW_PROTECTED_VM);
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:56 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-21-seanjc@google.com>
Subject: [PATCH v6 20/44] KVM: x86/pmu: Disable RDPMC interception for
 compatible mediated vPMU
From: Sean Christopherson
Content-Type: text/plain; charset="utf-8"

From: Dapeng Mi

Disable RDPMC interception for vCPUs with a mediated vPMU that is
compatible with the host PMU, i.e. that doesn't require KVM emulation of
RDPMC to honor the guest's vCPU model.  With a mediated vPMU, all guest
state accessible via RDPMC is loaded into hardware while the guest is
running.

Adjust RDPMC interception only for non-TDX guests, as the TDX module is
responsible for managing RDPMC intercepts based on the TD configuration.
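The "compatible" test is essentially an exact-match comparison of the
guest's configured vPMU against host hardware.  A rough user-space model
of that predicate (the struct and field names here are simplified
stand-ins for KVM's kvm_pmu/kvm_host_pmu fields, not the kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified view of the PMU capabilities that matter for RDPMC: counter
 * counts and widths, for both the guest's vPMU model and host hardware.
 */
struct pmu_caps {
        int nr_gp_counters;
        int nr_fixed_counters;
        int bit_width_gp;
        int bit_width_fixed;
};

/*
 * RDPMC can run unintercepted only if the guest's counters exactly match
 * hardware, because with a mediated PMU the real counters hold guest state
 * while the guest runs; any mismatch forces KVM to keep emulating RDPMC.
 */
static bool need_rdpmc_intercept(const struct pmu_caps *guest,
                                 const struct pmu_caps *host)
{
        return guest->nr_gp_counters != host->nr_gp_counters ||
               guest->nr_fixed_counters != host->nr_fixed_counters ||
               guest->bit_width_gp != host->bit_width_gp ||
               guest->bit_width_fixed != host->bit_width_fixed;
}
```

Note that the real check compares against *host* capabilities rather than
KVM's (possibly narrower) advertised capabilities, per the comment in the
patch.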
Co-developed-by: Mingwei Zhang
Signed-off-by: Mingwei Zhang
Co-developed-by: Sandipan Das
Signed-off-by: Sandipan Das
Signed-off-by: Dapeng Mi
Tested-by: Xudong Hao
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.c     | 26 ++++++++++++++++++++++++++
 arch/x86/kvm/pmu.h     |  1 +
 arch/x86/kvm/svm/svm.c |  5 +++++
 arch/x86/kvm/vmx/vmx.c |  7 +++++++
 arch/x86/kvm/x86.c     |  1 +
 5 files changed, 40 insertions(+)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index b3dde9a836ea..182ff2d8d119 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -716,6 +716,32 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
         return 0;
 }
 
+bool kvm_need_rdpmc_intercept(struct kvm_vcpu *vcpu)
+{
+        struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+        if (!kvm_vcpu_has_mediated_pmu(vcpu))
+                return true;
+
+        /*
+         * VMware allows access to these Pseudo-PMCs even when read via RDPMC
+         * in Ring3 when CR4.PCE=0.
+         */
+        if (enable_vmware_backdoor)
+                return true;
+
+        /*
+         * Note! Check *host* PMU capabilities, not KVM's PMU capabilities, as
+         * KVM's capabilities are constrained based on KVM support, i.e. KVM's
+         * capabilities themselves may be a subset of hardware capabilities.
+         */
+        return pmu->nr_arch_gp_counters != kvm_host_pmu.num_counters_gp ||
+               pmu->nr_arch_fixed_counters != kvm_host_pmu.num_counters_fixed ||
+               pmu->counter_bitmask[KVM_PMC_GP] != (BIT_ULL(kvm_host_pmu.bit_width_gp) - 1) ||
+               pmu->counter_bitmask[KVM_PMC_FIXED] != (BIT_ULL(kvm_host_pmu.bit_width_fixed) - 1);
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_need_rdpmc_intercept);
+
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 {
         if (lapic_in_kernel(vcpu)) {
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 9849c2bb720d..506c203587ea 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -238,6 +238,7 @@ void kvm_pmu_instruction_retired(struct kvm_vcpu *vcpu);
 void kvm_pmu_branch_retired(struct kvm_vcpu *vcpu);
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx);
+bool kvm_need_rdpmc_intercept(struct kvm_vcpu *vcpu);
 
 extern struct kvm_pmu_ops intel_pmu_ops;
 extern struct kvm_pmu_ops amd_pmu_ops;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 24d59ccfa40d..11913574de88 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1011,6 +1011,11 @@ static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu)
                         svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
                 }
         }
+
+        if (kvm_need_rdpmc_intercept(vcpu))
+                svm_set_intercept(svm, INTERCEPT_RDPMC);
+        else
+                svm_clr_intercept(svm, INTERCEPT_RDPMC);
 }
 
 static void svm_recalc_intercepts(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index fdd18ad1ede3..9f71ba99cf70 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4300,8 +4300,15 @@ static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
          */
 }
 
+static void vmx_recalc_instruction_intercepts(struct kvm_vcpu *vcpu)
+{
+        exec_controls_changebit(to_vmx(vcpu), CPU_BASED_RDPMC_EXITING,
+                                kvm_need_rdpmc_intercept(vcpu));
+}
+
 void vmx_recalc_intercepts(struct kvm_vcpu *vcpu)
 {
+        vmx_recalc_instruction_intercepts(vcpu);
         vmx_recalc_msr_intercepts(vcpu);
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1623afddff3b..76e86eb358df 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3945,6 +3945,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 
                 vcpu->arch.perf_capabilities = data;
                 kvm_pmu_refresh(vcpu);
+                kvm_make_request(KVM_REQ_RECALC_INTERCEPTS, vcpu);
                 break;
         case MSR_IA32_PRED_CMD: {
                 u64 reserved_bits = ~(PRED_CMD_IBPB | PRED_CMD_SBPB);
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Date: Fri, 5 Dec 2025 16:16:57 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-22-seanjc@google.com>
Subject: [PATCH v6 21/44] KVM: x86/pmu: Load/save GLOBAL_CTRL via entry/exit
 fields for mediated PMU
From: Sean Christopherson
Content-Type: text/plain; charset="utf-8"

From: Dapeng Mi

When running a guest with a mediated PMU, context switch PERF_GLOBAL_CTRL
via the dedicated VMCS fields for both host and guest.
For the host, always zero GLOBAL_CTRL on exit, as the guest's state will
still be loaded in hardware (KVM context switches the bulk of PMU state
outside of the inner run loop).  For the guest, use the dedicated fields
to atomically load and save PERF_GLOBAL_CTRL on all VM-Entries/VM-Exits.

For now, require VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL support (introduced
by Sapphire Rapids).  KVM could support CPUs without that control by
saving PERF_GLOBAL_CTRL via the MSR save list, a.k.a. the MSR auto-store
list, but defer that support as it adds a small amount of complexity and
is somewhat unique.

To minimize VM-Entry latency, propagate IA32_PERF_GLOBAL_CTRL to the VMCS
on-demand.  But to minimize complexity, read IA32_PERF_GLOBAL_CTRL out of
the VMCS on all non-failing VM-Exits, i.e. partially cache the MSR.  KVM
could track GLOBAL_CTRL as an EXREG and defer all reads, but writes are
rare, i.e. the dirty tracking for an EXREG is unnecessary, and it's not
obvious that shaving ~15-20 cycles per exit is meaningful given the total
overhead associated with mediated PMU context switches.
Suggested-by: Sean Christopherson
Signed-off-by: Dapeng Mi
Co-developed-by: Mingwei Zhang
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm-x86-pmu-ops.h |  2 ++
 arch/x86/include/asm/vmx.h             |  1 +
 arch/x86/kvm/pmu.c                     | 13 +++++++++--
 arch/x86/kvm/pmu.h                     |  3 ++-
 arch/x86/kvm/vmx/capabilities.h        |  6 +++++
 arch/x86/kvm/vmx/pmu_intel.c           | 25 ++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c                 | 31 +++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.h                 |  3 ++-
 8 files changed, 78 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index 9159bf1a4730..ad2cc82abf79 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -23,5 +23,7 @@ KVM_X86_PMU_OP_OPTIONAL(reset)
 KVM_X86_PMU_OP_OPTIONAL(deliver_pmi)
 KVM_X86_PMU_OP_OPTIONAL(cleanup)
 
+KVM_X86_PMU_OP_OPTIONAL(write_global_ctrl)
+
 #undef KVM_X86_PMU_OP
 #undef KVM_X86_PMU_OP_OPTIONAL
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index c85c50019523..b92ff87e3560 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -107,6 +107,7 @@
 #define VM_EXIT_PT_CONCEAL_PIP			0x01000000
 #define VM_EXIT_CLEAR_IA32_RTIT_CTL		0x02000000
 #define VM_EXIT_LOAD_CET_STATE			0x10000000
+#define VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL	0x40000000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 182ff2d8d119..c4a32bfb26f5 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -103,7 +103,7 @@ void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
 #undef __KVM_X86_PMU_OP
 }
 
-void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
+void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops)
 {
 	bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
 	int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS;
@@ -141,6 +141,9 @@ void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
 	    !pmu_ops->is_mediated_pmu_supported(&kvm_host_pmu))
 		enable_mediated_pmu = false;
 
+	if (!enable_mediated_pmu)
+		pmu_ops->write_global_ctrl = NULL;
+
 	if (!enable_pmu) {
 		memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
 		return;
@@ -836,6 +839,9 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		diff = pmu->global_ctrl ^ data;
 		pmu->global_ctrl = data;
 		reprogram_counters(pmu, diff);
+
+		if (kvm_vcpu_has_mediated_pmu(vcpu))
+			kvm_pmu_call(write_global_ctrl)(data);
 		}
 		break;
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
@@ -930,8 +936,11 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 	 * in the global controls).  Emulate that behavior when refreshing the
 	 * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
 	 */
-	if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+	if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters) {
 		pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
+		if (kvm_vcpu_has_mediated_pmu(vcpu))
+			kvm_pmu_call(write_global_ctrl)(pmu->global_ctrl);
+	}
 
 	bitmap_set(pmu->all_valid_pmc_idx, 0, pmu->nr_arch_gp_counters);
 	bitmap_set(pmu->all_valid_pmc_idx, KVM_FIXED_PMC_BASE_IDX,
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 506c203587ea..2ff469334c1a 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -38,6 +38,7 @@ struct kvm_pmu_ops {
 	void (*cleanup)(struct kvm_vcpu *vcpu);
 
 	bool (*is_mediated_pmu_supported)(struct x86_pmu_capability *host_pmu);
+	void (*write_global_ctrl)(u64 global_ctrl);
 
 	const u64 EVENTSEL_EVENT;
 	const int MAX_NR_GP_COUNTERS;
@@ -183,7 +184,7 @@ static inline bool pmc_is_locally_enabled(struct kvm_pmc *pmc)
 
 extern struct x86_pmu_capability kvm_pmu_cap;
 
-void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops);
+void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops);
 
 void kvm_pmu_recalc_pmc_emulation(struct kvm_pmu *pmu, struct kvm_pmc *pmc);
=20 diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilitie= s.h index 26302fd6dd9c..4e371c93ae16 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -109,6 +109,12 @@ static inline bool cpu_has_load_cet_ctrl(void) { return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE); } + +static inline bool cpu_has_save_perf_global_ctrl(void) +{ + return vmcs_config.vmexit_ctrl & VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL; +} + static inline bool cpu_has_vmx_mpx(void) { return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_BNDCFGS; diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index 050c21298213..dbab7cca7a62 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -778,7 +778,29 @@ static bool intel_pmu_is_mediated_pmu_supported(struct= x86_pmu_capability *host_ * Require v4+ for MSR_CORE_PERF_GLOBAL_STATUS_SET, and full-width * writes so that KVM can precisely load guest counter values. */ - return host_pmu->version >=3D 4 && host_perf_cap & PERF_CAP_FW_WRITES; + if (host_pmu->version < 4 || !(host_perf_cap & PERF_CAP_FW_WRITES)) + return false; + + /* + * All CPUs that support a mediated PMU are expected to support loading + * PERF_GLOBAL_CTRL via dedicated VMCS fields. + */ + if (WARN_ON_ONCE(!cpu_has_load_perf_global_ctrl())) + return false; + + /* + * KVM doesn't yet support mediated PMU on CPUs without support for + * saving PERF_GLOBAL_CTRL via a dedicated VMCS field. 
+	 */
+	if (!cpu_has_save_perf_global_ctrl())
+		return false;
+
+	return true;
+}
+
+static void intel_pmu_write_global_ctrl(u64 global_ctrl)
+{
+	vmcs_write64(GUEST_IA32_PERF_GLOBAL_CTRL, global_ctrl);
 }
 
 struct kvm_pmu_ops intel_pmu_ops __initdata = {
@@ -794,6 +816,7 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.cleanup = intel_pmu_cleanup,
 
 	.is_mediated_pmu_supported = intel_pmu_is_mediated_pmu_supported,
+	.write_global_ctrl = intel_pmu_write_global_ctrl,
 
 	.EVENTSEL_EVENT = ARCH_PERFMON_EVENTSEL_EVENT,
 	.MAX_NR_GP_COUNTERS = KVM_MAX_NR_INTEL_GP_COUNTERS,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9f71ba99cf70..72b92cea9d72 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4294,6 +4294,18 @@ static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, intercept);
 	}
 
+	if (enable_mediated_pmu) {
+		bool is_mediated_pmu = kvm_vcpu_has_mediated_pmu(vcpu);
+		struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+		vm_entry_controls_changebit(vmx,
+			VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL, is_mediated_pmu);
+
+		vm_exit_controls_changebit(vmx,
+			VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
+			VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL, is_mediated_pmu);
+	}
+
 	/*
 	 * x2APIC and LBR MSR intercepts are modified on-demand and cannot be
 	 * filtered by userspace.
@@ -4476,6 +4488,16 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
 		vmcs_writel(HOST_SSP, 0);
 		vmcs_writel(HOST_INTR_SSP_TABLE, 0);
 	}
+
+	/*
+	 * When running a guest with a mediated PMU, guest state is resident in
+	 * hardware after VM-Exit.  Zero PERF_GLOBAL_CTRL on exit so that host
+	 * activity doesn't bleed into the guest counters.  When running with
+	 * an emulated PMU, PERF_GLOBAL_CTRL is dynamically computed on every
+	 * entry/exit to merge guest and host PMU usage.
+	 */
+	if (enable_mediated_pmu)
+		vmcs_write64(HOST_IA32_PERF_GLOBAL_CTRL, 0);
 }
 
 void set_cr4_guest_host_mask(struct vcpu_vmx *vmx)
@@ -4543,7 +4565,8 @@ static u32 vmx_get_initial_vmexit_ctrl(void)
 			  VM_EXIT_CLEAR_IA32_RTIT_CTL);
 	/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
 	return vmexit_ctrl &
-	       ~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
+	       ~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER |
+		 VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL);
 }
 
 void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
@@ -7270,6 +7293,9 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 	struct perf_guest_switch_msr *msrs;
 	struct kvm_pmu *pmu = vcpu_to_pmu(&vmx->vcpu);
 
+	if (kvm_vcpu_has_mediated_pmu(&vmx->vcpu))
+		return;
+
 	pmu->host_cross_mapped_mask = 0;
 	if (pmu->pebs_enable & pmu->global_ctrl)
 		intel_pmu_cross_mapped_check(pmu);
@@ -7572,6 +7598,9 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
 
 	vmx->loaded_vmcs->launched = 1;
 
+	if (!msr_write_intercepted(vmx, MSR_CORE_PERF_GLOBAL_CTRL))
+		vcpu_to_pmu(vcpu)->global_ctrl = vmcs_read64(GUEST_IA32_PERF_GLOBAL_CTRL);
+
 	vmx_recover_nmi_blocking(vmx);
 	vmx_complete_interrupts(vmx);
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index bc3ed3145d7e..d7a96c84371f 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -510,7 +510,8 @@ static inline u8 vmx_get_rvi(void)
 	 VM_EXIT_CLEAR_BNDCFGS |				\
 	 VM_EXIT_PT_CONCEAL_PIP |				\
 	 VM_EXIT_CLEAR_IA32_RTIT_CTL |				\
-	 VM_EXIT_LOAD_CET_STATE)
+	 VM_EXIT_LOAD_CET_STATE |				\
+	 VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL)
 
 #define KVM_REQUIRED_VMX_PIN_BASED_VM_EXEC_CONTROL \
 	(PIN_BASED_EXT_INTR_MASK |				\
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:16:58 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-23-seanjc@google.com>
Subject: [PATCH v6 22/44] KVM: x86/pmu: Disable interception of select PMU MSRs for mediated vPMUs
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

From: Dapeng Mi

For vCPUs with a mediated vPMU, disable interception of counter MSRs for
PMCs that are exposed to the guest, and of GLOBAL_CTRL and related MSRs
when they are fully supported according to the vCPU model, i.e. when the
MSRs and all bits supported by hardware exist from the guest's point of
view.

Do NOT pass through event selector or fixed counter control MSRs, so that
KVM can enforce userspace-defined event filters, e.g. to prevent use of
AnyThread events (which is unfortunately a setting in the fixed counter
control MSR).

Defer support for nested passthrough of mediated PMU MSRs to the future,
as the logic for nested MSR interception is unfortunately vendor specific.
Suggested-by: Sean Christopherson
Co-developed-by: Mingwei Zhang
Signed-off-by: Mingwei Zhang
Co-developed-by: Sandipan Das
Signed-off-by: Sandipan Das
Signed-off-by: Dapeng Mi
[sean: squash patches, massage changelog, refresh VMX MSRs on filter change]
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.c           | 41 +++++++++++++++++--------
 arch/x86/kvm/pmu.h           |  1 +
 arch/x86/kvm/svm/svm.c       | 36 ++++++++++++++++++++++
 arch/x86/kvm/vmx/pmu_intel.c | 13 --------
 arch/x86/kvm/vmx/pmu_intel.h | 15 +++++++++
 arch/x86/kvm/vmx/vmx.c       | 59 +++++++++++++++++++++++++++++-------
 6 files changed, 128 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c4a32bfb26f5..57833f29a746 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -719,27 +719,41 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 	return 0;
 }
 
-bool kvm_need_rdpmc_intercept(struct kvm_vcpu *vcpu)
+static bool kvm_need_any_pmc_intercept(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	if (!kvm_vcpu_has_mediated_pmu(vcpu))
 		return true;
 
-	/*
-	 * VMware allows access to these Pseduo-PMCs even when read via RDPMC
-	 * in Ring3 when CR4.PCE=0.
-	 */
-	if (enable_vmware_backdoor)
-		return true;
-
 	/*
 	 * Note!  Check *host* PMU capabilities, not KVM's PMU capabilities, as
 	 * KVM's capabilities are constrained based on KVM support, i.e. KVM's
 	 * capabilities themselves may be a subset of hardware capabilities.
 	 */
 	return pmu->nr_arch_gp_counters != kvm_host_pmu.num_counters_gp ||
-	       pmu->nr_arch_fixed_counters != kvm_host_pmu.num_counters_fixed ||
+	       pmu->nr_arch_fixed_counters != kvm_host_pmu.num_counters_fixed;
+}
+
+bool kvm_need_perf_global_ctrl_intercept(struct kvm_vcpu *vcpu)
+{
+	return kvm_need_any_pmc_intercept(vcpu) ||
+	       !kvm_pmu_has_perf_global_ctrl(vcpu_to_pmu(vcpu));
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_need_perf_global_ctrl_intercept);
+
+bool kvm_need_rdpmc_intercept(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+	/*
+	 * VMware allows access to these Pseduo-PMCs even when read via RDPMC
+	 * in Ring3 when CR4.PCE=0.
+	 */
+	if (enable_vmware_backdoor)
+		return true;
+
+	return kvm_need_any_pmc_intercept(vcpu) ||
 	       pmu->counter_bitmask[KVM_PMC_GP] != (BIT_ULL(kvm_host_pmu.bit_width_gp) - 1) ||
 	       pmu->counter_bitmask[KVM_PMC_FIXED] != (BIT_ULL(kvm_host_pmu.bit_width_fixed) - 1);
 }
@@ -936,11 +950,12 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 	 * in the global controls).  Emulate that behavior when refreshing the
 	 * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
 	 */
-	if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters) {
+	if (pmu->nr_arch_gp_counters &&
+	    (kvm_pmu_has_perf_global_ctrl(pmu) || kvm_vcpu_has_mediated_pmu(vcpu)))
 		pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
-		if (kvm_vcpu_has_mediated_pmu(vcpu))
-			kvm_pmu_call(write_global_ctrl)(pmu->global_ctrl);
-	}
+
+	if (kvm_vcpu_has_mediated_pmu(vcpu))
+		kvm_pmu_call(write_global_ctrl)(pmu->global_ctrl);
 
 	bitmap_set(pmu->all_valid_pmc_idx, 0, pmu->nr_arch_gp_counters);
 	bitmap_set(pmu->all_valid_pmc_idx, KVM_FIXED_PMC_BASE_IDX,
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 2ff469334c1a..356b08e92bc9 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -239,6 +239,7 @@ void kvm_pmu_instruction_retired(struct kvm_vcpu *vcpu);
 void kvm_pmu_branch_retired(struct kvm_vcpu *vcpu);
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx);
+bool kvm_need_perf_global_ctrl_intercept(struct kvm_vcpu *vcpu);
 bool kvm_need_rdpmc_intercept(struct kvm_vcpu *vcpu);
 
 extern struct kvm_pmu_ops intel_pmu_ops;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 11913574de88..fa04e58ff524 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -730,6 +730,40 @@ void svm_vcpu_free_msrpm(void *msrpm)
 	__free_pages(virt_to_page(msrpm), get_order(MSRPM_SIZE));
 }
 
+static void svm_recalc_pmu_msr_intercepts(struct kvm_vcpu *vcpu)
+{
+	bool intercept = !kvm_vcpu_has_mediated_pmu(vcpu);
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	int i;
+
+	if (!enable_mediated_pmu)
+		return;
+
+	/* Legacy counters are always available for AMD CPUs with a PMU.
+	 */
+	for (i = 0; i < min(pmu->nr_arch_gp_counters, AMD64_NUM_COUNTERS); i++)
+		svm_set_intercept_for_msr(vcpu, MSR_K7_PERFCTR0 + i,
+					  MSR_TYPE_RW, intercept);
+
+	intercept |= !guest_cpu_cap_has(vcpu, X86_FEATURE_PERFCTR_CORE);
+	for (i = 0; i < pmu->nr_arch_gp_counters; i++)
+		svm_set_intercept_for_msr(vcpu, MSR_F15H_PERF_CTR + 2 * i,
+					  MSR_TYPE_RW, intercept);
+
+	for ( ; i < kvm_pmu_cap.num_counters_gp; i++)
+		svm_enable_intercept_for_msr(vcpu, MSR_F15H_PERF_CTR + 2 * i,
+					     MSR_TYPE_RW);
+
+	intercept = kvm_need_perf_global_ctrl_intercept(vcpu);
+	svm_set_intercept_for_msr(vcpu, MSR_AMD64_PERF_CNTR_GLOBAL_CTL,
+				  MSR_TYPE_RW, intercept);
+	svm_set_intercept_for_msr(vcpu, MSR_AMD64_PERF_CNTR_GLOBAL_STATUS,
+				  MSR_TYPE_RW, intercept);
+	svm_set_intercept_for_msr(vcpu, MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR,
+				  MSR_TYPE_RW, intercept);
+	svm_set_intercept_for_msr(vcpu, MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET,
+				  MSR_TYPE_RW, intercept);
+}
+
 static void svm_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -798,6 +832,8 @@ static void svm_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 	if (sev_es_guest(vcpu->kvm))
 		sev_es_recalc_msr_intercepts(vcpu);
 
+	svm_recalc_pmu_msr_intercepts(vcpu);
+
 	/*
 	 * x2APIC intercepts are modified on-demand and cannot be filtered by
 	 * userspace.
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index dbab7cca7a62..820da47454d7 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -128,19 +128,6 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
 	return &counters[array_index_nospec(idx, num_counters)];
 }
 
-static inline u64 vcpu_get_perf_capabilities(struct kvm_vcpu *vcpu)
-{
-	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_PDCM))
-		return 0;
-
-	return vcpu->arch.perf_capabilities;
-}
-
-static inline bool fw_writes_is_enabled(struct kvm_vcpu *vcpu)
-{
-	return (vcpu_get_perf_capabilities(vcpu) & PERF_CAP_FW_WRITES) != 0;
-}
-
 static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr)
 {
 	if (!fw_writes_is_enabled(pmu_to_vcpu(pmu)))
diff --git a/arch/x86/kvm/vmx/pmu_intel.h b/arch/x86/kvm/vmx/pmu_intel.h
index 5620d0882cdc..5d9357640aa1 100644
--- a/arch/x86/kvm/vmx/pmu_intel.h
+++ b/arch/x86/kvm/vmx/pmu_intel.h
@@ -4,6 +4,21 @@
 
 #include
 
+#include "cpuid.h"
+
+static inline u64 vcpu_get_perf_capabilities(struct kvm_vcpu *vcpu)
+{
+	if (!guest_cpu_cap_has(vcpu, X86_FEATURE_PDCM))
+		return 0;
+
+	return vcpu->arch.perf_capabilities;
+}
+
+static inline bool fw_writes_is_enabled(struct kvm_vcpu *vcpu)
+{
+	return (vcpu_get_perf_capabilities(vcpu) & PERF_CAP_FW_WRITES) != 0;
+}
+
 bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu);
 int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 72b92cea9d72..f0a20ff2a941 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4228,6 +4228,53 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
 	}
 }
 
+static void vmx_recalc_pmu_msr_intercepts(struct kvm_vcpu *vcpu)
+{
+	bool has_mediated_pmu = kvm_vcpu_has_mediated_pmu(vcpu);
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	bool intercept = !has_mediated_pmu;
+	int i;
+
+	if (!enable_mediated_pmu)
+		return;
+
+	vm_entry_controls_changebit(vmx, VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
+				    has_mediated_pmu);
+
+	vm_exit_controls_changebit(vmx, VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
+				   VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL,
+				   has_mediated_pmu);
+
+	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PERFCTR0 + i,
+					  MSR_TYPE_RW, intercept);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PMC0 + i, MSR_TYPE_RW,
+					  intercept || !fw_writes_is_enabled(vcpu));
+	}
+	for ( ; i < kvm_pmu_cap.num_counters_gp; i++) {
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PERFCTR0 + i,
+					  MSR_TYPE_RW, true);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PMC0 + i,
+					  MSR_TYPE_RW, true);
+	}
+
+	for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
+		vmx_set_intercept_for_msr(vcpu, MSR_CORE_PERF_FIXED_CTR0 + i,
+					  MSR_TYPE_RW, intercept);
+	for ( ; i < kvm_pmu_cap.num_counters_fixed; i++)
+		vmx_set_intercept_for_msr(vcpu, MSR_CORE_PERF_FIXED_CTR0 + i,
+					  MSR_TYPE_RW, true);
+
+	intercept = kvm_need_perf_global_ctrl_intercept(vcpu);
+	vmx_set_intercept_for_msr(vcpu, MSR_CORE_PERF_GLOBAL_STATUS,
+				  MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
+				  MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
+				  MSR_TYPE_RW, intercept);
+}
+
 static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 {
 	bool intercept;
@@ -4294,17 +4341,7 @@ static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, intercept);
 	}
 
-	if (enable_mediated_pmu) {
-		bool is_mediated_pmu = kvm_vcpu_has_mediated_pmu(vcpu);
-		struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-		vm_entry_controls_changebit(vmx,
-			VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL, is_mediated_pmu);
-
-		vm_exit_controls_changebit(vmx,
-			VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
-			VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL, is_mediated_pmu);
-	}
+	vmx_recalc_pmu_msr_intercepts(vcpu);
 
 	/*
 	 * x2APIC and LBR MSR intercepts are modified on-demand and cannot be
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:16:59 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-24-seanjc@google.com>
Subject: [PATCH v6 23/44] KVM: x86/pmu: Bypass perf checks when emulating mediated PMU counter accesses
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

From: Dapeng Mi

When emulating a PMC counter read or write for a mediated PMU, bypass the
perf checks and emulated_counter logic, as the counters aren't proxied
through perf, i.e. pmc->counter always holds the guest's up-to-date value,
and thus there's no need to defer emulated overflow checks.
Suggested-by: Sean Christopherson
Signed-off-by: Dapeng Mi
Co-developed-by: Mingwei Zhang
Signed-off-by: Mingwei Zhang
[sean: split from event filtering change, write shortlog+changelog]
Reviewed-by: Sandipan Das
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.c | 5 +++++
 arch/x86/kvm/pmu.h | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 57833f29a746..621722e8cc7e 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -381,6 +381,11 @@ static void pmc_update_sample_period(struct kvm_pmc *pmc)
 
 void pmc_write_counter(struct kvm_pmc *pmc, u64 val)
 {
+	if (kvm_vcpu_has_mediated_pmu(pmc->vcpu)) {
+		pmc->counter = val & pmc_bitmask(pmc);
+		return;
+	}
+
 	/*
 	 * Drop any unconsumed accumulated counts, the WRMSR is a write, not a
 	 * read-modify-write.  Adjust the counter value so that its value is
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 356b08e92bc9..9a199109d672 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -111,6 +111,9 @@ static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
 {
 	u64 counter, enabled, running;
 
+	if (kvm_vcpu_has_mediated_pmu(pmc->vcpu))
+		return pmc->counter & pmc_bitmask(pmc);
+
 	counter = pmc->counter + pmc->emulated_counter;
 
 	if (pmc->perf_event && !pmc->is_paused)
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
b=BgLaqoCrj6elc+SLzHm9iy7WKhjq7Efl30SiaPiVYXxu4huEJYHykybK7Mp6VXEMl2C7tV4nSMF9AMFBdxAdKBz/z4g77I5OJ7j33PNYKhgV0UOxsyLn8lH9UvHoWdT3+NTW2DwkTy8atAdHhG9uAhoiCLohb4PQbPmL7/uct7A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764980297; c=relaxed/simple; bh=Pzy06uwQwl662SZzAhKp4ZgGhpYEW/xfdtfXTkdDh0E=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=YRaFBY9I/u2di0WvxyuA1lkY3AtC1l7eEXROXfgY24srpRpOWN+Xyu++oTQLWG6Vka9FlkxvLca2FZPNTmBRAgU7+aIOAjEZgeM1jr+EkvXEAMU89RU8snzbzCM7INSeSS2l7kK2WgYbDgnn7QWmfXMadKKyGbMjSBabckz6Pjo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=U6K8xVeK; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="U6K8xVeK" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-340c0604e3dso3052140a91.2 for ; Fri, 05 Dec 2025 16:18:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1764980295; x=1765585095; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=0NKuhyI73DcmRDf6dJr4Yp2dlSnDCvKYnr68Tztb6CI=; b=U6K8xVeK3hqhGdDCpbZJH7RkcPnBYjxMzFiJWqxlXRybzO2Oy3k3R7Z5ixbPJgpmri hP+ZR/OPaCWEa2eXOzWx4qnLHipFkQphPpHUjCIcecAjYJCMsKH+jRhhjPExSPqWfBoJ Tc3XlengNsjyMp3QfQheGRhaZLLb/KSVb9nhf1z97XpuGtN4DRIUvXCII82T8nxlL7gL ziCN65iYGbMkV2wGpiaFB7z0fvDhELG06pgA+w/QNSCbdmoIYB3mJAmwsB2utl6q6FlR 
scMTYG6T25xGPRQ9HcGdy59fSfw596mNM1ao6PMALBT+VWw101xfocSgeWz521TluI0O kpBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764980295; x=1765585095; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=0NKuhyI73DcmRDf6dJr4Yp2dlSnDCvKYnr68Tztb6CI=; b=g65IJq5NLgZZGXL3ZXRGT77xAt9vzoQH+4QJH3p9+RY1KfV9QLeDUUZW+0Bav9Kb6Z OWBZYb+AkpaQjw6HBHX9hCxYItdfthNOTfEuLx+ugAjGgRoCVs9yf590vbJ9NxgeYF74 6DHucl5vPC3y7vfLcND/RFMRdkCa7twqYzvlRmYNd0pfb4Cn7RbMz0joLzLuu/yNxl// RnitatFDh7B1tF7F/aI8Hyonh8U9sxd2HLn91IYaDA8GAViuv7H/8e9FfCr9oAFDmvv8 7Ik1h3wPY5bmWjzStPf5JgNz/aX1R+GV2ynh9rseNQj/sFR4x0JyahCKQQZmx9AfZ7aG XCFw== X-Forwarded-Encrypted: i=1; AJvYcCV4OuRrG8K4yuD7r733F1WK+TOoRwZdgiQu2o7mcQ/Fnv4Z6v3E8DtRN/VKNtpLR3uvl3/jEzDPvWqcMiY=@vger.kernel.org X-Gm-Message-State: AOJu0Yz9OxMxu3k2Fta4jYFabGnAbpgA346Z5osSwBl78uwGgmiB2aFp dzKM53wPsxp6JmrNnQpODAeWDm4KtK9FJRzchtYYad5ApoBJhBu/2RPBYszxitz/ZSnVhZX5lj0 OD0aRsw== X-Google-Smtp-Source: AGHT+IEu368Ymit6Wf714y2QoU1WaQ4whiXftg8ijTgUsFOrrwNXh1LyvlB+ZwNskFSucsIG3f0JlRtM2cA= X-Received: from pjoa4.prod.google.com ([2002:a17:90a:8c04:b0:349:6296:2bb7]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:3bc7:b0:340:d578:f2a2 with SMTP id 98e67ed59e1d1-349a252b381mr638790a91.6.1764980295195; Fri, 05 Dec 2025 16:18:15 -0800 (PST) Reply-To: Sean Christopherson Date: Fri, 5 Dec 2025 16:17:00 -0800 In-Reply-To: <20251206001720.468579-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20251206001720.468579-1-seanjc@google.com> X-Mailer: git-send-email 2.52.0.223.gf5cc29aaa4-goog Message-ID: <20251206001720.468579-25-seanjc@google.com> Subject: [PATCH v6 24/44] KVM: x86/pmu: Introduce eventsel_hw to prepare for pmu event filtering From: Sean Christopherson To: Marc Zyngier , Oliver Upton , Tianrui 
 Zhao , Bibo Mao , Huacai Chen , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Xin Li , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Sean Christopherson , Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang , Xudong Hao , Sandipan Das , Dapeng Mi , Xiong Zhang , Manali Shukla , Jim Mattson
Content-Type: text/plain; charset="utf-8"

From: Mingwei Zhang

Introduce eventsel_hw and fixed_ctr_ctrl_hw to store the actual HW values
programmed into the PMU event selector MSRs.

The mediated PMU checks events before allowing their values to be written
to the PMU MSRs. However, to match hardware behavior, when a PMU event
check fails, KVM should still allow the guest to read back the value it
wrote. This essentially requires an extra variable to separate the
guest-requested value from the actual PMU MSR value. Note, this only
applies to event selectors.
Signed-off-by: Mingwei Zhang
Co-developed-by: Dapeng Mi
Signed-off-by: Dapeng Mi
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h | 2 ++
 arch/x86/kvm/pmu.c              | 7 +++++--
 arch/x86/kvm/svm/pmu.c          | 1 +
 arch/x86/kvm/vmx/pmu_intel.c    | 2 ++
 4 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index defd979003be..e72357f64b19 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -529,6 +529,7 @@ struct kvm_pmc {
          */
         u64 emulated_counter;
         u64 eventsel;
+        u64 eventsel_hw;
         struct perf_event *perf_event;
         struct kvm_vcpu *vcpu;
         /*
@@ -557,6 +558,7 @@ struct kvm_pmu {
         unsigned nr_arch_fixed_counters;
         unsigned available_event_types;
         u64 fixed_ctr_ctrl;
+        u64 fixed_ctr_ctrl_hw;
         u64 fixed_ctr_ctrl_rsvd;
         u64 global_ctrl;
         u64 global_status;
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 621722e8cc7e..36eebc1c7e70 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -902,11 +902,14 @@ static void kvm_pmu_reset(struct kvm_vcpu *vcpu)
                 pmc->counter = 0;
                 pmc->emulated_counter = 0;

-                if (pmc_is_gp(pmc))
+                if (pmc_is_gp(pmc)) {
                         pmc->eventsel = 0;
+                        pmc->eventsel_hw = 0;
+                }
         }

-        pmu->fixed_ctr_ctrl = pmu->global_ctrl = pmu->global_status = 0;
+        pmu->fixed_ctr_ctrl = pmu->fixed_ctr_ctrl_hw = 0;
+        pmu->global_ctrl = pmu->global_status = 0;

         kvm_pmu_call(reset)(vcpu);
 }
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 16c88b2a2eb8..c1ec1962314e 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -166,6 +166,7 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         data &= ~pmu->reserved_bits;
         if (data != pmc->eventsel) {
                 pmc->eventsel = data;
+                pmc->eventsel_hw = data;
                 kvm_pmu_request_counter_reprogram(pmc);
         }
         return 0;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 820da47454d7..855240678300 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -61,6 +61,7 @@ static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
         int i;

         pmu->fixed_ctr_ctrl = data;
+        pmu->fixed_ctr_ctrl_hw = data;
         for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
                 u8 new_ctrl = fixed_ctrl_field(data, i);
                 u8 old_ctrl = fixed_ctrl_field(old_fixed_ctr_ctrl, i);
@@ -430,6 +431,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)

                 if (data != pmc->eventsel) {
                         pmc->eventsel = data;
+                        pmc->eventsel_hw = data;
                         kvm_pmu_request_counter_reprogram(pmc);
                 }
                 break;
--
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:01 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-26-seanjc@google.com>
Subject: [PATCH v6 25/44] KVM: x86/pmu: Reprogram mediated PMU event selectors on event filter updates
From: Sean Christopherson
To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Xin Li , "H.
 Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Sean Christopherson , Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang , Xudong Hao , Sandipan Das , Dapeng Mi , Xiong Zhang , Manali Shukla , Jim Mattson
Content-Type: text/plain; charset="utf-8"

From: Dapeng Mi

Refresh the event selectors that are programmed into hardware when a PMC
is "reprogrammed" for a mediated PMU, i.e. if userspace changes the PMU
event filters.

Note, KVM doesn't utilize the reprogramming infrastructure to handle
counter overflow for mediated PMUs, as there's no need to reprogram a
non-existent perf event.

Suggested-by: Sean Christopherson
Signed-off-by: Dapeng Mi
Co-developed-by: Mingwei Zhang
Signed-off-by: Mingwei Zhang
[sean: add a helper to document behavior, split patch and rewrite changelog]
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 36eebc1c7e70..39904e6fd227 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -522,6 +522,25 @@ static bool pmc_is_event_allowed(struct kvm_pmc *pmc)
         return is_fixed_event_allowed(filter, pmc->idx);
 }

+static void kvm_mediated_pmu_refresh_event_filter(struct kvm_pmc *pmc)
+{
+        bool allowed = pmc_is_event_allowed(pmc);
+        struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
+        if (pmc_is_gp(pmc)) {
+                pmc->eventsel_hw &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
+                if (allowed)
+                        pmc->eventsel_hw |= pmc->eventsel &
+                                            ARCH_PERFMON_EVENTSEL_ENABLE;
+        } else {
+                u64 mask = intel_fixed_bits_by_idx(pmc->idx - KVM_FIXED_PMC_BASE_IDX, 0xf);
+
+                pmu->fixed_ctr_ctrl_hw &= ~mask;
+                if (allowed)
+                        pmu->fixed_ctr_ctrl_hw |= pmu->fixed_ctr_ctrl & mask;
+        }
+}
+
 static int reprogram_counter(struct kvm_pmc *pmc)
 {
         struct kvm_pmu *pmu = pmc_to_pmu(pmc);
@@ -530,6 +549,11 @@ static int reprogram_counter(struct kvm_pmc *pmc)
         bool emulate_overflow;
         u8 fixed_ctr_ctrl;

+        if (kvm_vcpu_has_mediated_pmu(pmu_to_vcpu(pmu))) {
+                kvm_mediated_pmu_refresh_event_filter(pmc);
+                return 0;
+        }
+
         emulate_overflow = pmc_pause_counter(pmc);

         if (!pmc_is_globally_enabled(pmc) || !pmc_is_locally_enabled(pmc) ||
--
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:02 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-27-seanjc@google.com>
Subject: [PATCH v6 26/44] KVM: x86/pmu: Always stuff GuestOnly=1,HostOnly=0 for mediated PMCs on AMD
From: Sean Christopherson
To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Xin Li , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Sean Christopherson , Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang , Xudong Hao , Sandipan Das , Dapeng Mi , Xiong Zhang , Manali Shukla , Jim Mattson
Content-Type: text/plain; charset="utf-8"

From: Sandipan Das

On AMD platforms, there is no way to restore PerfCntrGlobalCtl at VM-Entry
or clear it at VM-Exit.
Since the register states will be restored before entering and saved after
exiting guest context, the counters can keep ticking and even overflow,
leading to chaos while still in host context. To avoid this, intercept
event selectors, which is already done by the mediated PMU. In addition,
always set the GuestOnly bit and clear the HostOnly bit in the PMU
selectors on AMD. Doing so allows the counters to run only in guest
context, even if their enable bits are still set after VM-Exit and before
the host/guest PMU context switch.

Signed-off-by: Sandipan Das
Signed-off-by: Mingwei Zhang
[sean: massage shortlog]
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/svm/pmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index c1ec1962314e..6d5f791126b1 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -166,7 +166,8 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         data &= ~pmu->reserved_bits;
         if (data != pmc->eventsel) {
                 pmc->eventsel = data;
-                pmc->eventsel_hw = data;
+                pmc->eventsel_hw = (data & ~AMD64_EVENTSEL_HOSTONLY) |
+                                   AMD64_EVENTSEL_GUESTONLY;
                 kvm_pmu_request_counter_reprogram(pmc);
         }
         return 0;
--
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:03 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-28-seanjc@google.com>
Subject: [PATCH v6 27/44] KVM: x86/pmu: Load/put mediated PMU context when entering/exiting guest
From: Sean Christopherson
To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Xin Li , "H.
 Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Sean Christopherson , Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang , Xudong Hao , Sandipan Das , Dapeng Mi , Xiong Zhang , Manali Shukla , Jim Mattson
Content-Type: text/plain; charset="utf-8"

From: Dapeng Mi

Implement the PMU "world switch" between host perf and the guest mediated
PMU. When loading guest state, call into perf to switch from host to
guest, then load guest state into hardware; reverse those actions when
putting guest state.

On the KVM side, when loading guest state, zero PERF_GLOBAL_CTRL to ensure
all counters are disabled, then load selectors and counters, and finally
call into vendor code to load control/status information. While VMX and
SVM use different mechanisms to avoid counting host activity while guest
controls are loaded, both implementations require PERF_GLOBAL_CTRL to be
zeroed when the event selectors are in flux.

When putting guest state, reverse the order, and save and zero controls
and status prior to saving+zeroing selectors and counters. Defer clearing
PERF_GLOBAL_CTRL to vendor code, as only SVM needs to manually clear the
MSR; VMX configures PERF_GLOBAL_CTRL to be atomically cleared by the CPU
on VM-Exit.

Handle the difference in MSR layouts between Intel and AMD by
communicating the bases and stride via kvm_pmu_ops. Because KVM requires
Intel v4 (and full-width writes) and AMD v2, the MSRs to load/save are
constant for a given vendor, i.e. do not vary based on the guest PMU, and
do not vary based on the host PMU (because KVM will simply disable
mediated PMU support if the necessary MSRs are unsupported).
Except for retrieving the guest's PERF_GLOBAL_CTRL, which needs to be read before invoking any fastpath handler (spoiler alert), perform the context switch around KVM's inner run loop. State only needs to be synchronized from hardware before KVM can access the software "caches". Note, VMX already grabs the guest's PERF_GLOBAL_CTRL immediately after VM-Exit, as hardware saves value into the VMCS. Co-developed-by: Mingwei Zhang Signed-off-by: Mingwei Zhang Co-developed-by: Sandipan Das Signed-off-by: Sandipan Das Signed-off-by: Dapeng Mi Tested-by: Xudong Hao Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson --- arch/x86/include/asm/kvm-x86-pmu-ops.h | 2 + arch/x86/include/asm/msr-index.h | 1 + arch/x86/kvm/pmu.c | 130 ++++++++++++++++++++++++- arch/x86/kvm/pmu.h | 10 ++ arch/x86/kvm/svm/pmu.c | 34 +++++++ arch/x86/kvm/svm/svm.c | 3 + arch/x86/kvm/vmx/pmu_intel.c | 44 +++++++++ arch/x86/kvm/x86.c | 4 + 8 files changed, 225 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/= kvm-x86-pmu-ops.h index ad2cc82abf79..f0aa6996811f 100644 --- a/arch/x86/include/asm/kvm-x86-pmu-ops.h +++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h @@ -24,6 +24,8 @@ KVM_X86_PMU_OP_OPTIONAL(deliver_pmi) KVM_X86_PMU_OP_OPTIONAL(cleanup) =20 KVM_X86_PMU_OP_OPTIONAL(write_global_ctrl) +KVM_X86_PMU_OP(mediated_load) +KVM_X86_PMU_OP(mediated_put) =20 #undef KVM_X86_PMU_OP #undef KVM_X86_PMU_OP_OPTIONAL diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-in= dex.h index 9e1720d73244..0ba08fd4ac3f 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -1191,6 +1191,7 @@ #define MSR_CORE_PERF_GLOBAL_STATUS 0x0000038e #define MSR_CORE_PERF_GLOBAL_CTRL 0x0000038f #define MSR_CORE_PERF_GLOBAL_OVF_CTRL 0x00000390 +#define MSR_CORE_PERF_GLOBAL_STATUS_SET 0x00000391 =20 #define MSR_PERF_METRICS 0x00000329 =20 diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index 
39904e6fd227..578bf996bda2 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -882,10 +882,13 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr= _data *msr_info) diff =3D pmu->global_ctrl ^ data; pmu->global_ctrl =3D data; reprogram_counters(pmu, diff); - - if (kvm_vcpu_has_mediated_pmu(vcpu)) - kvm_pmu_call(write_global_ctrl)(data); } + /* + * Unconditionally forward writes to vendor code, i.e. to the + * VMC{B,S}, as pmu->global_ctrl is per-VCPU, not per-VMC{B,S}. + */ + if (kvm_vcpu_has_mediated_pmu(vcpu)) + kvm_pmu_call(write_global_ctrl)(data); break; case MSR_CORE_PERF_GLOBAL_OVF_CTRL: /* @@ -1246,3 +1249,124 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *k= vm, void __user *argp) kfree(filter); return r; } + +static __always_inline u32 fixed_counter_msr(u32 idx) +{ + return kvm_pmu_ops.FIXED_COUNTER_BASE + idx * kvm_pmu_ops.MSR_STRIDE; +} + +static __always_inline u32 gp_counter_msr(u32 idx) +{ + return kvm_pmu_ops.GP_COUNTER_BASE + idx * kvm_pmu_ops.MSR_STRIDE; +} + +static __always_inline u32 gp_eventsel_msr(u32 idx) +{ + return kvm_pmu_ops.GP_EVENTSEL_BASE + idx * kvm_pmu_ops.MSR_STRIDE; +} + +static void kvm_pmu_load_guest_pmcs(struct kvm_vcpu *vcpu) +{ + struct kvm_pmu *pmu =3D vcpu_to_pmu(vcpu); + struct kvm_pmc *pmc; + u32 i; + + /* + * No need to zero out unexposed GP/fixed counters/selectors since RDPMC + * is intercepted if hardware has counters that aren't visible to the + * guest (KVM will inject #GP as appropriate). 
+ */ + for (i =3D 0; i < pmu->nr_arch_gp_counters; i++) { + pmc =3D &pmu->gp_counters[i]; + + wrmsrl(gp_counter_msr(i), pmc->counter); + wrmsrl(gp_eventsel_msr(i), pmc->eventsel_hw); + } + for (i =3D 0; i < pmu->nr_arch_fixed_counters; i++) { + pmc =3D &pmu->fixed_counters[i]; + + wrmsrl(fixed_counter_msr(i), pmc->counter); + } +} + +void kvm_mediated_pmu_load(struct kvm_vcpu *vcpu) +{ + if (!kvm_vcpu_has_mediated_pmu(vcpu) || + KVM_BUG_ON(!lapic_in_kernel(vcpu), vcpu->kvm)) + return; + + lockdep_assert_irqs_disabled(); + + perf_load_guest_context(); + + /* + * Explicitly clear PERF_GLOBAL_CTRL, as "loading" the guest's context + * disables all individual counters (if any were enabled), but doesn't + * globally disable the entire PMU. Loading event selectors and PMCs + * with guest values while PERF_GLOBAL_CTRL is non-zero will generate + * unexpected events and PMIs. + * + * VMX will enable/disable counters at VM-Enter/VM-Exit by atomically + * loading PERF_GLOBAL_CONTROL. SVM effectively performs the switch by + * configuring all events to be GUEST_ONLY. Clear PERF_GLOBAL_CONTROL + * even for SVM to minimize the damage if a perf event is left enabled, + * and to ensure a consistent starting state. + */ + wrmsrq(kvm_pmu_ops.PERF_GLOBAL_CTRL, 0); + + perf_load_guest_lvtpc(kvm_lapic_get_reg(vcpu->arch.apic, APIC_LVTPC)); + + kvm_pmu_load_guest_pmcs(vcpu); + + kvm_pmu_call(mediated_load)(vcpu); +} + +static void kvm_pmu_put_guest_pmcs(struct kvm_vcpu *vcpu) +{ + struct kvm_pmu *pmu =3D vcpu_to_pmu(vcpu); + struct kvm_pmc *pmc; + u32 i; + + /* + * Clear selectors and counters to ensure hardware doesn't count using + * guest controls when the host (perf) restores its state. 
+	 */
+	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
+		pmc = &pmu->gp_counters[i];
+
+		pmc->counter = rdpmc(i);
+		if (pmc->counter)
+			wrmsrq(gp_counter_msr(i), 0);
+		if (pmc->eventsel_hw)
+			wrmsrq(gp_eventsel_msr(i), 0);
+	}
+
+	for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
+		pmc = &pmu->fixed_counters[i];
+
+		pmc->counter = rdpmc(INTEL_PMC_FIXED_RDPMC_BASE | i);
+		if (pmc->counter)
+			wrmsrq(fixed_counter_msr(i), 0);
+	}
+}
+
+void kvm_mediated_pmu_put(struct kvm_vcpu *vcpu)
+{
+	if (!kvm_vcpu_has_mediated_pmu(vcpu) ||
+	    KVM_BUG_ON(!lapic_in_kernel(vcpu), vcpu->kvm))
+		return;
+
+	lockdep_assert_irqs_disabled();
+
+	/*
+	 * Defer handling of PERF_GLOBAL_CTRL to vendor code.  On Intel, it's
+	 * atomically cleared on VM-Exit, i.e. doesn't need to be clear here.
+	 */
+	kvm_pmu_call(mediated_put)(vcpu);
+
+	kvm_pmu_put_guest_pmcs(vcpu);
+
+	perf_put_guest_lvtpc();
+
+	perf_put_guest_context();
+}
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 9a199109d672..25b583da9ee2 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -38,11 +38,19 @@ struct kvm_pmu_ops {
 	void (*cleanup)(struct kvm_vcpu *vcpu);
 
 	bool (*is_mediated_pmu_supported)(struct x86_pmu_capability *host_pmu);
+	void (*mediated_load)(struct kvm_vcpu *vcpu);
+	void (*mediated_put)(struct kvm_vcpu *vcpu);
 	void (*write_global_ctrl)(u64 global_ctrl);
 
 	const u64 EVENTSEL_EVENT;
 	const int MAX_NR_GP_COUNTERS;
 	const int MIN_NR_GP_COUNTERS;
+
+	const u32 PERF_GLOBAL_CTRL;
+	const u32 GP_EVENTSEL_BASE;
+	const u32 GP_COUNTER_BASE;
+	const u32 FIXED_COUNTER_BASE;
+	const u32 MSR_STRIDE;
 };
 
 void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops);
@@ -240,6 +248,8 @@ void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp);
 void kvm_pmu_instruction_retired(struct kvm_vcpu *vcpu);
 void kvm_pmu_branch_retired(struct kvm_vcpu *vcpu);
+void kvm_mediated_pmu_load(struct kvm_vcpu *vcpu);
+void
kvm_mediated_pmu_put(struct kvm_vcpu *vcpu);
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx);
 bool kvm_need_perf_global_ctrl_intercept(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 6d5f791126b1..7aa298eeb072 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -234,6 +234,32 @@ static bool amd_pmu_is_mediated_pmu_supported(struct x86_pmu_capability *host_pmu)
 	return host_pmu->version >= 2;
 }
 
+static void amd_mediated_pmu_load(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	u64 global_status;
+
+	rdmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS, global_status);
+	/* Clear host global_status MSR if non-zero. */
+	if (global_status)
+		wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, global_status);
+
+	wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET, pmu->global_status);
+	wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, pmu->global_ctrl);
+}
+
+static void amd_mediated_pmu_put(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+	wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0);
+	rdmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS, pmu->global_status);
+
+	/* Clear global status bits if non-zero */
+	if (pmu->global_status)
+		wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, pmu->global_status);
+}
+
 struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.rdpmc_ecx_to_pmc = amd_rdpmc_ecx_to_pmc,
 	.msr_idx_to_pmc = amd_msr_idx_to_pmc,
@@ -245,8 +271,16 @@ struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.init = amd_pmu_init,
 
 	.is_mediated_pmu_supported = amd_pmu_is_mediated_pmu_supported,
+	.mediated_load = amd_mediated_pmu_load,
+	.mediated_put = amd_mediated_pmu_put,
 
 	.EVENTSEL_EVENT = AMD64_EVENTSEL_EVENT,
 	.MAX_NR_GP_COUNTERS = KVM_MAX_NR_AMD_GP_COUNTERS,
 	.MIN_NR_GP_COUNTERS = AMD64_NUM_COUNTERS,
+
+	.PERF_GLOBAL_CTRL = MSR_AMD64_PERF_CNTR_GLOBAL_CTL,
+	.GP_EVENTSEL_BASE = MSR_F15H_PERF_CTL0,
+	.GP_COUNTER_BASE = MSR_F15H_PERF_CTR0,
+	.FIXED_COUNTER_BASE = 0,
+	.MSR_STRIDE = 2,
};
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index fa04e58ff524..cbebd3a18918 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4368,6 +4368,9 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
 
 	vcpu->arch.regs_avail &= ~SVM_REGS_LAZY_LOAD_SET;
 
+	if (!msr_write_intercepted(vcpu, MSR_AMD64_PERF_CNTR_GLOBAL_CTL))
+		rdmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, vcpu_to_pmu(vcpu)->global_ctrl);
+
 	trace_kvm_exit(vcpu, KVM_ISA_SVM);
 
 	svm_complete_interrupts(vcpu);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 855240678300..55249fa4db95 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -792,6 +792,42 @@ static void intel_pmu_write_global_ctrl(u64 global_ctrl)
 	vmcs_write64(GUEST_IA32_PERF_GLOBAL_CTRL, global_ctrl);
 }
 
+
+static void intel_mediated_pmu_load(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	u64 global_status, toggle;
+
+	rdmsrq(MSR_CORE_PERF_GLOBAL_STATUS, global_status);
+	toggle = pmu->global_status ^ global_status;
+	if (global_status & toggle)
+		wrmsrq(MSR_CORE_PERF_GLOBAL_OVF_CTRL, global_status & toggle);
+	if (pmu->global_status & toggle)
+		wrmsrq(MSR_CORE_PERF_GLOBAL_STATUS_SET, pmu->global_status & toggle);
+
+	wrmsrq(MSR_CORE_PERF_FIXED_CTR_CTRL, pmu->fixed_ctr_ctrl_hw);
+}
+
+static void intel_mediated_pmu_put(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+	/* MSR_CORE_PERF_GLOBAL_CTRL is already saved at VM-exit. */
+	rdmsrq(MSR_CORE_PERF_GLOBAL_STATUS, pmu->global_status);
+
+	/* Clear hardware MSR_CORE_PERF_GLOBAL_STATUS MSR, if non-zero. */
+	if (pmu->global_status)
+		wrmsrq(MSR_CORE_PERF_GLOBAL_OVF_CTRL, pmu->global_status);
+
+	/*
+	 * Clear hardware FIXED_CTR_CTRL MSR to avoid information leakage and
+	 * also to avoid accidentally enabling fixed counters (based on guest
+	 * state) while running in the host, e.g. when setting global ctrl.
+	 */
+	if (pmu->fixed_ctr_ctrl_hw)
+		wrmsrq(MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
+}
+
 struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
 	.msr_idx_to_pmc = intel_msr_idx_to_pmc,
@@ -805,9 +841,17 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.cleanup = intel_pmu_cleanup,
 
 	.is_mediated_pmu_supported = intel_pmu_is_mediated_pmu_supported,
+	.mediated_load = intel_mediated_pmu_load,
+	.mediated_put = intel_mediated_pmu_put,
 	.write_global_ctrl = intel_pmu_write_global_ctrl,
 
 	.EVENTSEL_EVENT = ARCH_PERFMON_EVENTSEL_EVENT,
 	.MAX_NR_GP_COUNTERS = KVM_MAX_NR_INTEL_GP_COUNTERS,
 	.MIN_NR_GP_COUNTERS = 1,
+
+	.PERF_GLOBAL_CTRL = MSR_CORE_PERF_GLOBAL_CTRL,
+	.GP_EVENTSEL_BASE = MSR_P6_EVNTSEL0,
+	.GP_COUNTER_BASE = MSR_IA32_PMC0,
+	.FIXED_COUNTER_BASE = MSR_CORE_PERF_FIXED_CTR0,
+	.MSR_STRIDE = 1,
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76e86eb358df..589a309259f4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11334,6 +11334,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		run_flags |= KVM_RUN_LOAD_DEBUGCTL;
 	vcpu->arch.host_debugctl = debug_ctl;
 
+	kvm_mediated_pmu_load(vcpu);
+
 	guest_timing_enter_irqoff();
 
 	/*
@@ -11372,6 +11374,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 	kvm_load_host_pkru(vcpu);
 
+	kvm_mediated_pmu_put(vcpu);
+
 	/*
 	 * Do this here before restoring debug registers on the host.
And
	 * since we do this before handling the vmexit, a DR access vmexit
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:04 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-29-seanjc@google.com>
Subject: [PATCH v6 28/44] KVM: x86/pmu: Disallow emulation in the fastpath if mediated PMCs are active
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

Don't handle exits in the fastpath if emulation is required, i.e. if an
instruction needs to be skipped, the mediated PMU is enabled, and one or
more PMCs is counting instructions.  With the mediated PMU, KVM's cache
of PMU state is inconsistent with respect to hardware until KVM exits
the inner run loop (when the mediated PMU is "put").
Reviewed-by: Sandipan Das
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.h | 10 ++++++++++
 arch/x86/kvm/x86.c |  9 +++++++++
 2 files changed, 19 insertions(+)

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 25b583da9ee2..0925246731cb 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -234,6 +234,16 @@ static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
 	return test_bit(pmc->idx, (unsigned long *)&pmu->global_ctrl);
 }
 
+static inline bool kvm_pmu_is_fastpath_emulation_allowed(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+	return !kvm_vcpu_has_mediated_pmu(vcpu) ||
+	       !bitmap_intersects(pmu->pmc_counting_instructions,
+				  (unsigned long *)&pmu->global_ctrl,
+				  X86_PMC_IDX_MAX);
+}
+
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu);
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu);
 int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 589a309259f4..4683df775b0a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2215,6 +2215,9 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_emulate_invd);
 
 fastpath_t handle_fastpath_invd(struct kvm_vcpu *vcpu)
 {
+	if (!kvm_pmu_is_fastpath_emulation_allowed(vcpu))
+		return EXIT_FASTPATH_NONE;
+
 	if (!kvm_emulate_invd(vcpu))
 		return EXIT_FASTPATH_EXIT_USERSPACE;
 
@@ -2271,6 +2274,9 @@ static inline bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
 
 static fastpath_t __handle_fastpath_wrmsr(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 {
+	if (!kvm_pmu_is_fastpath_emulation_allowed(vcpu))
+		return EXIT_FASTPATH_NONE;
+
 	switch (msr) {
 	case APIC_BASE_MSR + (APIC_ICR >> 4):
 		if (!lapic_in_kernel(vcpu) || !apic_x2apic_mode(vcpu->arch.apic) ||
@@ -11714,6 +11720,9 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_emulate_halt);
 
 fastpath_t handle_fastpath_hlt(struct kvm_vcpu *vcpu)
 {
+	if (!kvm_pmu_is_fastpath_emulation_allowed(vcpu))
+		return EXIT_FASTPATH_NONE;
+
 	if
(!kvm_emulate_halt(vcpu))
 		return EXIT_FASTPATH_EXIT_USERSPACE;
 
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:05 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-30-seanjc@google.com>
Subject: [PATCH v6 29/44] KVM: x86/pmu: Handle emulated instruction for mediated vPMU
From: Sean Christopherson

From: Dapeng Mi

Mediated vPMU needs to accumulate the emulated instructions into counter
and load the counter into HW at vm-entry.  Moreover, if the accumulation
leads to counter overflow, KVM needs to update GLOBAL_STATUS and inject
PMI into guest as well.
Suggested-by: Sean Christopherson
Signed-off-by: Dapeng Mi
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 578bf996bda2..cb07d9b62bee 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -1033,10 +1033,45 @@ void kvm_pmu_destroy(struct kvm_vcpu *vcpu)
 	kvm_pmu_reset(vcpu);
 }
 
+static bool pmc_is_pmi_enabled(struct kvm_pmc *pmc)
+{
+	u8 fixed_ctr_ctrl;
+
+	if (pmc_is_gp(pmc))
+		return pmc->eventsel & ARCH_PERFMON_EVENTSEL_INT;
+
+	fixed_ctr_ctrl = fixed_ctrl_field(pmc_to_pmu(pmc)->fixed_ctr_ctrl,
+					  pmc->idx - KVM_FIXED_PMC_BASE_IDX);
+	return fixed_ctr_ctrl & INTEL_FIXED_0_ENABLE_PMI;
+}
+
 static void kvm_pmu_incr_counter(struct kvm_pmc *pmc)
 {
-	pmc->emulated_counter++;
-	kvm_pmu_request_counter_reprogram(pmc);
+	struct kvm_vcpu *vcpu = pmc->vcpu;
+
+	/*
+	 * For perf-based PMUs, accumulate software-emulated events separately
+	 * from pmc->counter, as pmc->counter is offset by the count of the
+	 * associated perf event.  Request reprogramming, which will consult
+	 * both emulated and hardware-generated events to detect overflow.
+	 */
+	if (!kvm_vcpu_has_mediated_pmu(vcpu)) {
+		pmc->emulated_counter++;
+		kvm_pmu_request_counter_reprogram(pmc);
+		return;
+	}
+
+	/*
+	 * For mediated PMUs, pmc->counter is updated when the vCPU's PMU is
+	 * put, and will be loaded into hardware when the PMU is loaded.  Simply
+	 * increment the counter and signal overflow if it wraps to zero.
+	 */
+	pmc->counter = (pmc->counter + 1) & pmc_bitmask(pmc);
+	if (!pmc->counter) {
+		pmc_to_pmu(pmc)->global_status |= BIT_ULL(pmc->idx);
+		if (pmc_is_pmi_enabled(pmc))
+			kvm_make_request(KVM_REQ_PMI, vcpu);
+	}
 }
 
 static inline bool cpl_is_matched(struct kvm_pmc *pmc)
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:06 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-31-seanjc@google.com>
Subject: [PATCH v6 30/44] KVM: nVMX: Add macros to simplify nested MSR interception setting
From: Sean Christopherson

From: Dapeng Mi

Add macros nested_vmx_merge_msr_bitmaps_xxx() to simplify nested MSR
interception setting.  No function change intended.
Suggested-by: Sean Christopherson
Signed-off-by: Dapeng Mi
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 35 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 40777278eabb..b56ed2b1ac67 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -617,6 +617,19 @@ static inline void nested_vmx_set_intercept_for_msr(struct vcpu_vmx *vmx,
 					 msr_bitmap_l0, msr);
 }
 
+#define nested_vmx_merge_msr_bitmaps(msr, type)				\
+	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1,		\
+					 msr_bitmap_l0, msr, type)
+
+#define nested_vmx_merge_msr_bitmaps_read(msr)				\
+	nested_vmx_merge_msr_bitmaps(msr, MSR_TYPE_R)
+
+#define nested_vmx_merge_msr_bitmaps_write(msr)				\
+	nested_vmx_merge_msr_bitmaps(msr, MSR_TYPE_W)
+
+#define nested_vmx_merge_msr_bitmaps_rw(msr)				\
+	nested_vmx_merge_msr_bitmaps(msr, MSR_TYPE_RW)
+
 /*
  * Merge L0's and L1's MSR bitmap, return false to indicate that
  * we do not use the hardware.
@@ -700,23 +713,13 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
 	 * other runtime changes to vmcs01's bitmap, e.g. dynamic pass-through.
	 */
 #ifdef CONFIG_X86_64
-	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
-					 MSR_FS_BASE, MSR_TYPE_RW);
-
-	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
-					 MSR_GS_BASE, MSR_TYPE_RW);
-
-	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
-					 MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
+	nested_vmx_merge_msr_bitmaps_rw(MSR_FS_BASE);
+	nested_vmx_merge_msr_bitmaps_rw(MSR_GS_BASE);
+	nested_vmx_merge_msr_bitmaps_rw(MSR_KERNEL_GS_BASE);
 #endif
-	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
-					 MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
-
-	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
-					 MSR_IA32_PRED_CMD, MSR_TYPE_W);
-
-	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
-					 MSR_IA32_FLUSH_CMD, MSR_TYPE_W);
+	nested_vmx_merge_msr_bitmaps_rw(MSR_IA32_SPEC_CTRL);
+	nested_vmx_merge_msr_bitmaps_write(MSR_IA32_PRED_CMD);
+	nested_vmx_merge_msr_bitmaps_write(MSR_IA32_FLUSH_CMD);
 
 	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
 					 MSR_IA32_APERF, MSR_TYPE_R);
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:07 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-32-seanjc@google.com>
Subject: [PATCH v6 31/44] KVM: nVMX: Disable PMU MSR interception as
 appropriate while running L2
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li,
 "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson,
 Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev,
 kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang,
 Manali Shukla, Jim Mattson
Content-Type: text/plain; charset="utf-8"

From: Mingwei Zhang

Merge KVM's PMU MSR interception bitmaps with those of L1, i.e. merge the
bitmaps of vmcs01 and vmcs12, e.g. so that KVM doesn't interpose on MSR
accesses unnecessarily if L1 exposes a mediated PMU (or equivalent) to L2.

Signed-off-by: Mingwei Zhang
Co-developed-by: Dapeng Mi
Signed-off-by: Dapeng Mi
[sean: rewrite changelog and comment, omit MSRs that are always intercepted]
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b56ed2b1ac67..729cc1f05ac8 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -630,6 +630,34 @@ static inline void nested_vmx_set_intercept_for_msr(struct vcpu_vmx *vmx,
 #define nested_vmx_merge_msr_bitmaps_rw(msr) \
 	nested_vmx_merge_msr_bitmaps(msr, MSR_TYPE_RW)
 
+static void nested_vmx_merge_pmu_msr_bitmaps(struct kvm_vcpu *vcpu,
+					     unsigned long *msr_bitmap_l1,
+					     unsigned long *msr_bitmap_l0)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	int i;
+
+	/*
+	 * Skip the merges if the vCPU doesn't have a mediated PMU, i.e. if
+	 * none of the MSRs can possibly be passed through to L1.
+	 */
+	if (!kvm_vcpu_has_mediated_pmu(vcpu))
+		return;
+
+	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
+		nested_vmx_merge_msr_bitmaps_rw(MSR_IA32_PERFCTR0 + i);
+		nested_vmx_merge_msr_bitmaps_rw(MSR_IA32_PMC0 + i);
+	}
+
+	for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
+		nested_vmx_merge_msr_bitmaps_rw(MSR_CORE_PERF_FIXED_CTR0 + i);
+
+	nested_vmx_merge_msr_bitmaps_rw(MSR_CORE_PERF_GLOBAL_CTRL);
+	nested_vmx_merge_msr_bitmaps_read(MSR_CORE_PERF_GLOBAL_STATUS);
+	nested_vmx_merge_msr_bitmaps_write(MSR_CORE_PERF_GLOBAL_OVF_CTRL);
+}
+
 /*
  * Merge L0's and L1's MSR bitmap, return false to indicate that
  * we do not use the hardware.
@@ -745,6 +773,8 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
 	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
 					 MSR_IA32_PL3_SSP, MSR_TYPE_RW);
 
+	nested_vmx_merge_pmu_msr_bitmaps(vcpu, msr_bitmap_l1, msr_bitmap_l0);
+
 	kvm_vcpu_unmap(vcpu, &map);
 
 	vmx->nested.force_msr_bitmap_recalc = false;
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:08 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-33-seanjc@google.com>
Subject: [PATCH v6 32/44] KVM: nSVM: Disable PMU MSR interception as
 appropriate while running L2
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li,
 "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson,
 Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev,
 kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang,
 Manali Shukla, Jim Mattson
Content-Type: text/plain; charset="utf-8"

Add MSRs that might be passed through to L1 when running with a mediated
PMU to the nested SVM's set of to-be-merged MSR indices, i.e. disable
interception of PMU MSRs when running L2 if both KVM (L0) and L1 disable
interception.  There is no need for KVM to interpose on such MSR accesses,
e.g. if L1 exposes a mediated PMU (or equivalent) to L2.

Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/svm/nested.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index ba0f11c68372..93140ccec95d 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -194,7 +194,7 @@ void recalc_intercepts(struct vcpu_svm *svm)
  * Hardcode the capacity of the array based on the maximum number of _offsets_.
  * MSRs are batched together, so there are fewer offsets than MSRs.
 */
-static int nested_svm_msrpm_merge_offsets[7] __ro_after_init;
+static int nested_svm_msrpm_merge_offsets[10] __ro_after_init;
 static int nested_svm_nr_msrpm_merge_offsets __ro_after_init;
 typedef unsigned long nsvm_msrpm_merge_t;
 
@@ -222,6 +222,22 @@ int __init nested_svm_init_msrpm_merge_offsets(void)
 		MSR_IA32_LASTBRANCHTOIP,
 		MSR_IA32_LASTINTFROMIP,
 		MSR_IA32_LASTINTTOIP,
+
+		MSR_K7_PERFCTR0,
+		MSR_K7_PERFCTR1,
+		MSR_K7_PERFCTR2,
+		MSR_K7_PERFCTR3,
+		MSR_F15H_PERF_CTR0,
+		MSR_F15H_PERF_CTR1,
+		MSR_F15H_PERF_CTR2,
+		MSR_F15H_PERF_CTR3,
+		MSR_F15H_PERF_CTR4,
+		MSR_F15H_PERF_CTR5,
+
+		MSR_AMD64_PERF_CNTR_GLOBAL_CTL,
+		MSR_AMD64_PERF_CNTR_GLOBAL_STATUS,
+		MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR,
+		MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET,
 	};
 	int i, j;
 
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:09 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-34-seanjc@google.com>
Subject: [PATCH v6 33/44] KVM: x86/pmu: Expose enable_mediated_pmu parameter
 to user space
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li,
 "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson,
 Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev,
 kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang,
 Manali Shukla, Jim Mattson
Content-Type: text/plain; charset="utf-8"

From: Dapeng Mi

Expose the enable_mediated_pmu parameter to user space, i.e. allow
userspace to enable/disable mediated vPMU support.  Document the mediated
versus perf-based behavior as part of the kernel-parameters.txt entry, and
opportunistically add an entry for the core enable_pmu param as well.

Signed-off-by: Dapeng Mi
Signed-off-by: Mingwei Zhang
Tested-by: Xudong Hao
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
---
 .../admin-guide/kernel-parameters.txt         | 49 +++++++++++++++++++
 arch/x86/kvm/svm/svm.c                        |  2 +
 arch/x86/kvm/vmx/vmx.c                        |  2 +
 3 files changed, 53 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6c42061ca20e..ed6f2ed94756 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2926,6 +2926,26 @@
 
 			Default is Y (on).
 
+	kvm.enable_pmu=	[KVM,X86]
+			If enabled, KVM will virtualize PMU functionality based
+			on the virtual CPU model defined by userspace.  This
+			can be overridden on a per-VM basis via
+			KVM_CAP_PMU_CAPABILITY.
+
+			If disabled, KVM will not virtualize PMU functionality,
+			e.g. MSRs, PMCs, PMIs, etc., even if userspace defines
+			a virtual CPU model that contains PMU assets.
+
+			Note, KVM's vPMU support implicitly requires running
+			with an in-kernel local APIC, e.g. to deliver PMIs to
+			the guest.
+			Running without an in-kernel local APIC is
+			not supported, though KVM will allow such a combination
+			(with severely degraded functionality).
+
+			See also enable_mediated_pmu.
+
+			Default is Y (on).
+
 	kvm.enable_virt_at_load=
 			[KVM,ARM64,LOONGARCH,MIPS,RISCV,X86]
 			If enabled, KVM will enable virtualization in hardware
 			when KVM is loaded, and disable virtualization when KVM
@@ -2972,6 +2992,35 @@
 			If the value is 0 (the default), KVM will pick a period
 			based on the ratio, such that a page is zapped after
 			1 hour on average.
+
+	kvm-{amd,intel}.enable_mediated_pmu=
+			[KVM,AMD,INTEL]
+			If enabled, KVM will provide a mediated virtual PMU,
+			instead of the default perf-based virtual PMU (if
+			kvm.enable_pmu is true and PMU is enumerated via the
+			virtual CPU model).
+
+			With a perf-based vPMU, KVM operates as a user of perf,
+			i.e. emulates guest PMU counters using perf events.
+			KVM-created perf events are managed by perf as regular
+			(guest-only) events, e.g. are scheduled in/out, contend
+			for hardware resources, etc.  Using a perf-based vPMU
+			allows guest and host usage of the PMU to co-exist, but
+			incurs non-trivial overhead and can result in silently
+			dropped guest events (due to resource contention).
+
+			With a mediated vPMU, hardware PMU state is context
+			switched around the world switch to/from the guest.
+			KVM mediates which events the guest can utilize, but
+			gives the guest direct access to all other PMU assets
+			when possible (KVM may intercept some accesses if the
+			virtual CPU model provides a subset of hardware PMU
+			functionality).  Using a mediated vPMU significantly
+			reduces PMU virtualization overhead and eliminates lost
+			guest events, but is mutually exclusive with using perf
+			to profile KVM guests and adds latency to most VM-Exits
+			(to context switch PMU state).
+
+			Default is N (off).
+
 	kvm-amd.nested=	[KVM,AMD]
 			Control nested virtualization feature in KVM/SVM.
 			Default is 1 (enabled).
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index cbebd3a18918..20fb7c38bf75 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -170,6 +170,8 @@ module_param(intercept_smi, bool, 0444);
 bool vnmi = true;
 module_param(vnmi, bool, 0444);
 
+module_param(enable_mediated_pmu, bool, 0444);
+
 static bool svm_gp_erratum_intercept = true;
 
 static u8 rsm_ins_bytes[] = "\x0f\xaa";
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f0a20ff2a941..62ba2a2b9e98 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -150,6 +150,8 @@ module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO);
 extern bool __read_mostly allow_smaller_maxphyaddr;
 module_param(allow_smaller_maxphyaddr, bool, S_IRUGO);
 
+module_param(enable_mediated_pmu, bool, 0444);
+
 #define KVM_VM_CR0_ALWAYS_OFF (X86_CR0_NW | X86_CR0_CD)
 #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST X86_CR0_NE
 #define KVM_VM_CR0_ALWAYS_ON \
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:10 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-35-seanjc@google.com>
Subject: [PATCH v6 34/44] KVM: x86/pmu: Elide WRMSRs when loading guest PMCs
 if values already match
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li,
 "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson,
 Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev,
 kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang,
 Manali Shukla, Jim Mattson
Content-Type: text/plain; charset="utf-8"

When loading a mediated PMU state, elide the WRMSRs to load PMCs with the
guest's value if the value in hardware already matches the guest's value.
For the relatively common case where neither the guest nor the host is
actively using the PMU, i.e. when all/many counters are '0', eliding the
WRMSRs reduces the latency of handling VM-Exit by a measurable amount
(WRMSR is significantly more expensive than RDPMC).

As measured by KVM-Unit-Tests' CPUID VM-Exit testcase, this provides a
~25% reduction in latency (4k => 3k cycles) on Intel Emerald Rapids, and a
~13% reduction (6.2k => 5.3k cycles) on AMD Turin.
Cc: Manali Shukla
Tested-by: Xudong Hao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index cb07d9b62bee..fdf1df3100eb 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -1314,13 +1314,15 @@ static void kvm_pmu_load_guest_pmcs(struct kvm_vcpu *vcpu)
 	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
 		pmc = &pmu->gp_counters[i];
 
-		wrmsrl(gp_counter_msr(i), pmc->counter);
+		if (pmc->counter != rdpmc(i))
+			wrmsrl(gp_counter_msr(i), pmc->counter);
 		wrmsrl(gp_eventsel_msr(i), pmc->eventsel_hw);
 	}
 	for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
 		pmc = &pmu->fixed_counters[i];
 
-		wrmsrl(fixed_counter_msr(i), pmc->counter);
+		if (pmc->counter != rdpmc(INTEL_PMC_FIXED_RDPMC_BASE | i))
+			wrmsrl(fixed_counter_msr(i), pmc->counter);
 	}
 }
 
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
vf1OjT0dAZtitOUe0KC43JIY/9tutzotpYsk/Qd66e9c+p5qY+X2HJ/tA4TuZBMfP2aU p7YqtnoAjA6sspKANMuSUvImjCFWg5jxWzy95ZNro75R0anRotR3lLpTm4Lt13zA0P1W NeqlUcBlCjnoq2TuDWDXjIB0nUGaWHRzsuNYznQ0qC1jGOUDJN/q/HVnFVAV/7x3jcTr VMoLmUru2vVgthuD05TkqECQek5syokjHbOA2yHn2RHC4oY6tYI4x+QW3dUr0u7fEYTF v+xQ== X-Forwarded-Encrypted: i=1; AJvYcCWjhLtd/aJhuNrDgRPo61yAagiMxBBgC7ohwfFs0RP48/pqsW3Ih4M2L/y5T0y6x5INq7ztSkKkCNqpCuU=@vger.kernel.org X-Gm-Message-State: AOJu0YyrWROPODr1R/vh5ZMRrzaJBNgcp80cPwj8woaBAzir3lAXxbXb ckyqgxE7CVMLF4vieEcwg60uFBi+CYLMb13Op2qodEwaAq0OtlNRsYB2c20O2JIROfktkbg9ARG sUMthUA== X-Google-Smtp-Source: AGHT+IH4rM0hrWDLcj1m7KmQfI0LQhMNTOrUf9WMiZFUwgaE9DsfO2YaBs29AIBYl4bKM4D6uGtnUb5M7G0= X-Received: from pjbsb13.prod.google.com ([2002:a17:90b:50cd:b0:340:c0e9:24b6]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:4c89:b0:343:6611:f21 with SMTP id 98e67ed59e1d1-349a24e3b08mr610435a91.1.1764980317071; Fri, 05 Dec 2025 16:18:37 -0800 (PST) Reply-To: Sean Christopherson Date: Fri, 5 Dec 2025 16:17:11 -0800 In-Reply-To: <20251206001720.468579-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20251206001720.468579-1-seanjc@google.com> X-Mailer: git-send-email 2.52.0.223.gf5cc29aaa4-goog Message-ID: <20251206001720.468579-36-seanjc@google.com> Subject: [PATCH v6 35/44] KVM: VMX: Drop intermediate "guest" field from msr_autostore From: Sean Christopherson To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Xin Li , "H. 
Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Sean Christopherson , Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang , Xudong Hao , Sandipan Das , Dapeng Mi , Xiong Zhang , Manali Shukla , Jim Mattson
Content-Type: text/plain; charset="utf-8"

Drop the intermediate "guest" field from vcpu_vmx.msr_autostore, as the
value saved on VM-Exit isn't guaranteed to be the guest's value; it's
purely whatever is in hardware at the time of VM-Exit.  E.g. KVM's only
use of the store list at the moment is to snapshot TSC at VM-Exit, and
the value saved is always the raw TSC even if TSC-offsetting and/or
TSC-scaling is enabled for the guest.  And unlike msr_autoload, there is
no need to differentiate between "on-entry" and "on-exit".

No functional change intended.

Cc: Jim Mattson
Signed-off-by: Sean Christopherson
Reviewed-by: Dapeng Mi
---
 arch/x86/kvm/vmx/nested.c | 10 +++++-----
 arch/x86/kvm/vmx/vmx.c    |  2 +-
 arch/x86/kvm/vmx/vmx.h    |  4 +---
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 729cc1f05ac8..486789dac515 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1076,11 +1076,11 @@ static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu,
	 * VM-exit in L0, use the more accurate value.
	 */
	if (msr_index == MSR_IA32_TSC) {
-		int i = vmx_find_loadstore_msr_slot(&vmx->msr_autostore.guest,
+		int i = vmx_find_loadstore_msr_slot(&vmx->msr_autostore,
						    MSR_IA32_TSC);
 
		if (i >= 0) {
-			u64 val = vmx->msr_autostore.guest.val[i].value;
+			u64 val = vmx->msr_autostore.val[i].value;
 
			*data = kvm_read_l1_tsc(vcpu, val);
			return true;
@@ -1167,7 +1167,7 @@ static void prepare_vmx_msr_autostore_list(struct kvm_vcpu *vcpu,
					   u32 msr_index)
 {
	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	struct vmx_msrs *autostore = &vmx->msr_autostore.guest;
+	struct vmx_msrs *autostore = &vmx->msr_autostore;
	bool in_vmcs12_store_list;
	int msr_autostore_slot;
	bool in_autostore_list;
@@ -2366,7 +2366,7 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
	 * addresses are constant (for vmcs02), the counts can change based
	 * on L2's behavior, e.g. switching to/from long mode.
	 */
-	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->msr_autostore.guest.val));
+	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->msr_autostore.val));
	vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
	vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val));
 
@@ -2704,7 +2704,7 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
	 */
	prepare_vmx_msr_autostore_list(&vmx->vcpu, MSR_IA32_TSC);
 
-	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.guest.nr);
+	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.nr);
	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 62ba2a2b9e98..23c92c41fd83 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6567,7 +6567,7 @@ void dump_vmcs(struct kvm_vcpu *vcpu)
	if (vmcs_read32(VM_ENTRY_MSR_LOAD_COUNT) > 0)
		vmx_dump_msrs("guest autoload", &vmx->msr_autoload.guest);
	if (vmcs_read32(VM_EXIT_MSR_STORE_COUNT) > 0)
-		vmx_dump_msrs("guest autostore", &vmx->msr_autostore.guest);
+		vmx_dump_msrs("autostore", &vmx->msr_autostore);
 
	if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE)
		pr_err("S_CET = 0x%016lx, SSP = 0x%016lx, SSP TABLE = 0x%016lx\n",
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index d7a96c84371f..4ce653d729ca 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -245,9 +245,7 @@ struct vcpu_vmx {
		struct vmx_msrs host;
	} msr_autoload;
 
-	struct msr_autostore {
-		struct vmx_msrs guest;
-	} msr_autostore;
+	struct vmx_msrs msr_autostore;
 
	struct {
		int vm86_active;
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:12 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20251206001720.468579-1-seanjc@google.com>
X-Mailer: git-send-email 2.52.0.223.gf5cc29aaa4-goog
Message-ID: <20251206001720.468579-37-seanjc@google.com>
Subject: [PATCH v6 36/44] KVM: nVMX: Don't update msr_autostore count when saving TSC for vmcs12
From: Sean Christopherson
To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Xin Li , "H.
Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Sean Christopherson , Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang , Xudong Hao , Sandipan Das , Dapeng Mi , Xiong Zhang , Manali Shukla , Jim Mattson
Content-Type: text/plain; charset="utf-8"

Rework nVMX's use of the MSR auto-store list so that snapshotting TSC
sneaks MSR_IA32_TSC into the list _without_ updating KVM's software
tracking, and drop the generic functionality so that any future use of
the store list for nested-specific logic must consider the implications
of modifying the list.

Updating the list only for vmcs02 and only on nested VM-Enter is a
disaster waiting to happen, as it means vmcs01 is stale relative to the
software tracking, and KVM could unintentionally leave an MSR in the
store list in perpetuity while running L1, e.g. if KVM addressed the
first issue and updated vmcs01 on nested VM-Exit without removing TSC
from the list.

Furthermore, mixing KVM's desire to save an MSR with L1's desire to save
an MSR results in KVM clobbering/ignoring the needs of vmcs01 or vmcs02.
E.g. if KVM added MSR_IA32_TSC to the store list for its own purposes,
and then _removed_ MSR_IA32_TSC from the list after emulating nested
VM-Enter, then KVM would remove MSR_IA32_TSC from the list even though
saving TSC on VM-Exit from L2 is still desirable (to provide L1 with an
accurate TSC).  Similarly, removing an MSR from the list based on
vmcs12's settings could drop an MSR that KVM wants to save for its own
purposes.

In practice, the issues are currently benign, because KVM doesn't use
the store list for vmcs01.  But that will change with upcoming mediated
PMU support.
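The vmcs02-only append can be sketched in isolation.  Note, the struct
and constant names below are simplified stand-ins loosely mirroring
KVM's vmx_msrs bookkeeping, not the actual kernel definitions; the point
is only to show that hardware can be programmed with one extra entry
while the software-tracked count (and thus vmcs01's view) is untouched:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NR_LOADSTORE_MSRS 8

/* Simplified stand-ins for KVM's MSR list bookkeeping (assumed layout). */
struct msr_entry { uint32_t index; uint64_t value; };
struct msr_list  { unsigned int nr; struct msr_entry val[MAX_NR_LOADSTORE_MSRS]; };

/*
 * Stash a "nested only" MSR in the first slot past the software-tracked
 * entries and report the count to program into vmcs02 (one more than the
 * tracked count).  list->nr is deliberately NOT incremented, so a VMCS
 * that uses list->nr directly (vmcs01 in KVM's case) is unaffected.
 * Returns the slot used for the extra entry, or -1 if the list is full.
 */
int append_vmcs02_only_entry(struct msr_list *list, uint32_t msr,
			     unsigned int *hw_count)
{
	if (list->nr >= MAX_NR_LOADSTORE_MSRS) {
		*hw_count = list->nr;
		return -1;
	}
	list->val[list->nr].index = msr;
	*hw_count = list->nr + 1;	/* hardware sees one extra entry */
	return (int)list->nr;		/* software count stays unchanged */
}
```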
Alternatively, a "full" solution would be to track MSR list entries for
vmcs12 separately from KVM's standard lists, but MSR_IA32_TSC is likely
the only MSR that KVM would ever want to save on _every_ VM-Exit purely
based on vmcs12.  I.e. the added complexity isn't remotely justified at
this time.

Opportunistically escalate from a pr_warn_ratelimited() to a full WARN,
as KVM reserves eight entries in each MSR list, and as above KVM uses at
most one entry.

Opportunistically make vmx_find_loadstore_msr_slot() local to vmx.c, as
using it directly from nested code is unsafe due to the potential for
mixing vmcs01 and vmcs02 state (see above).

Cc: Jim Mattson
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 71 ++++++++++++---------------------------
 arch/x86/kvm/vmx/vmx.c    |  2 +-
 arch/x86/kvm/vmx/vmx.h    |  2 +-
 3 files changed, 24 insertions(+), 51 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 486789dac515..614b789ecf16 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1075,16 +1075,12 @@ static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu,
	 * does not include the time taken for emulation of the L2->L1
	 * VM-exit in L0, use the more accurate value.
	 */
-	if (msr_index == MSR_IA32_TSC) {
-		int i = vmx_find_loadstore_msr_slot(&vmx->msr_autostore,
-						    MSR_IA32_TSC);
+	if (msr_index == MSR_IA32_TSC && vmx->nested.tsc_autostore_slot >= 0) {
+		int slot = vmx->nested.tsc_autostore_slot;
+		u64 host_tsc = vmx->msr_autostore.val[slot].value;
 
-		if (i >= 0) {
-			u64 val = vmx->msr_autostore.val[i].value;
-
-			*data = kvm_read_l1_tsc(vcpu, val);
-			return true;
-		}
+		*data = kvm_read_l1_tsc(vcpu, host_tsc);
+		return true;
	}
 
	if (kvm_emulate_msr_read(vcpu, msr_index, data)) {
@@ -1163,42 +1159,6 @@ static bool nested_msr_store_list_has_msr(struct kvm_vcpu *vcpu, u32 msr_index)
	return false;
 }
 
-static void prepare_vmx_msr_autostore_list(struct kvm_vcpu *vcpu,
-					   u32 msr_index)
-{
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	struct vmx_msrs *autostore = &vmx->msr_autostore;
-	bool in_vmcs12_store_list;
-	int msr_autostore_slot;
-	bool in_autostore_list;
-	int last;
-
-	msr_autostore_slot = vmx_find_loadstore_msr_slot(autostore, msr_index);
-	in_autostore_list = msr_autostore_slot >= 0;
-	in_vmcs12_store_list = nested_msr_store_list_has_msr(vcpu, msr_index);
-
-	if (in_vmcs12_store_list && !in_autostore_list) {
-		if (autostore->nr == MAX_NR_LOADSTORE_MSRS) {
-			/*
-			 * Emulated VMEntry does not fail here.  Instead a less
-			 * accurate value will be returned by
-			 * nested_vmx_get_vmexit_msr_value() by reading KVM's
-			 * internal MSR state instead of reading the value from
-			 * the vmcs02 VMExit MSR-store area.
-			 */
-			pr_warn_ratelimited(
-				"Not enough msr entries in msr_autostore.  Can't add msr %x\n",
-				msr_index);
-			return;
-		}
-		last = autostore->nr++;
-		autostore->val[last].index = msr_index;
-	} else if (!in_vmcs12_store_list && in_autostore_list) {
-		last = --autostore->nr;
-		autostore->val[msr_autostore_slot] = autostore->val[last];
-	}
-}
-
 /*
  * Load guest's/host's cr3 at nested entry/exit.  @nested_ept is true if we are
  * emulating VM-Entry into a guest with EPT enabled.
On failure, the expected
@@ -2699,12 +2659,25 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
	}
 
	/*
-	 * Make sure the msr_autostore list is up to date before we set the
-	 * count in the vmcs02.
+	 * If vmcs12 is configured to save TSC on exit via the auto-store list,
+	 * append the MSR to vmcs02's auto-store list so that KVM effectively
+	 * reads TSC at the time of VM-Exit from L2.  The saved value will be
+	 * propagated to vmcs12's list on nested VM-Exit.
+	 *
+	 * Don't increment the number of MSRs in the vCPU structure, as saving
+	 * TSC is specific to this particular incarnation of vmcs02, i.e. must
+	 * not bleed into vmcs01.
	 */
-	prepare_vmx_msr_autostore_list(&vmx->vcpu, MSR_IA32_TSC);
+	if (nested_msr_store_list_has_msr(&vmx->vcpu, MSR_IA32_TSC) &&
+	    !WARN_ON_ONCE(vmx->msr_autostore.nr >= ARRAY_SIZE(vmx->msr_autostore.val))) {
+		vmx->nested.tsc_autostore_slot = vmx->msr_autostore.nr;
+		vmx->msr_autostore.val[vmx->msr_autostore.nr].index = MSR_IA32_TSC;
 
-	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.nr);
+		vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.nr + 1);
+	} else {
+		vmx->nested.tsc_autostore_slot = -1;
+		vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.nr);
+	}
	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 23c92c41fd83..52bcb817cc15 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1029,7 +1029,7 @@ static __always_inline void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
	vm_exit_controls_clearbit(vmx, exit);
 }
 
-int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr)
+static int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr)
 {
	unsigned int i;
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 4ce653d729ca..3175fedb5a4d 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -191,6 +191,7 @@ struct nested_vmx {
	u16 vpid02;
	u16 last_vpid;
 
+	int tsc_autostore_slot;
	struct nested_vmx_msrs msrs;
 
	/* SMM related state */
@@ -383,7 +384,6 @@ void vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx, unsigned int flags);
 unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx);
 bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs,
		    unsigned int flags);
-int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr);
 void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu);
 
 void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type, bool set);
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:13 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20251206001720.468579-1-seanjc@google.com>
X-Mailer: git-send-email 2.52.0.223.gf5cc29aaa4-goog
Message-ID: <20251206001720.468579-38-seanjc@google.com>
Subject: [PATCH v6 37/44] KVM: VMX: Dedup code for removing MSR from VMCS's auto-load list
From: Sean Christopherson
To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Xin Li , "H.
Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Sean Christopherson , Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang , Xudong Hao , Sandipan Das , Dapeng Mi , Xiong Zhang , Manali Shukla , Jim Mattson
Content-Type: text/plain; charset="utf-8"

Add a helper to remove an MSR from an auto-{load,store} list, to dedup
the msr_autoload code and in anticipation of adding similar
functionality for msr_autostore.

No functional change intended.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/vmx.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 52bcb817cc15..a51f66d1b201 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1040,9 +1040,22 @@ static int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr)
	return -ENOENT;
 }
 
+static void vmx_remove_auto_msr(struct vmx_msrs *m, u32 msr,
+				unsigned long vmcs_count_field)
+{
+	int i;
+
+	i = vmx_find_loadstore_msr_slot(m, msr);
+	if (i < 0)
+		return;
+
+	--m->nr;
+	m->val[i] = m->val[m->nr];
+	vmcs_write32(vmcs_count_field, m->nr);
+}
+
 static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
 {
-	int i;
	struct msr_autoload *m = &vmx->msr_autoload;
 
	switch (msr) {
@@ -1063,21 +1076,9 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
		}
		break;
	}
-	i = vmx_find_loadstore_msr_slot(&m->guest, msr);
-	if (i < 0)
-		goto skip_guest;
-	--m->guest.nr;
-	m->guest.val[i] = m->guest.val[m->guest.nr];
-	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
 
-skip_guest:
-	i = vmx_find_loadstore_msr_slot(&m->host, msr);
-	if (i < 0)
-		return;
-
-	--m->host.nr;
-	m->host.val[i] = m->host.val[m->host.nr];
-	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
+	vmx_remove_auto_msr(&m->guest, msr, VM_ENTRY_MSR_LOAD_COUNT);
+	vmx_remove_auto_msr(&m->host, msr, VM_EXIT_MSR_LOAD_COUNT);
 }
 
 static __always_inline void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:14 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20251206001720.468579-1-seanjc@google.com>
X-Mailer: git-send-email 2.52.0.223.gf5cc29aaa4-goog
Message-ID: <20251206001720.468579-39-seanjc@google.com>
Subject: [PATCH v6 38/44] KVM: VMX: Drop unused @entry_only param from add_atomic_switch_msr()
From: Sean Christopherson
To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Xin Li , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Sean Christopherson , Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang , Xudong Hao , Sandipan Das , Dapeng Mi , Xiong Zhang , Manali Shukla , Jim Mattson
Content-Type: text/plain; charset="utf-8"

Drop the "on VM-Enter only" parameter from add_atomic_switch_msr(), as
it is no longer used and, for all intents and purposes, was never used.
The functionality was added, under embargo, by commit 989e3992d2ec
("x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only
MSRs"), and then ripped out by commit 2f055947ae5e ("x86/kvm: Drop L1TF
MSR list approach") just a few commits later.
2f055947ae5e x86/kvm: Drop L1TF MSR list approach
72c6d2db64fa x86/litf: Introduce vmx status variable
215af5499d9e cpu/hotplug: Online siblings when SMT control is turned on
390d975e0c4e x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required
989e3992d2ec x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs

Furthermore, it's extremely unlikely KVM will ever _need_ to load an MSR
value via the auto-load lists only on VM-Enter.  MSR writes via the
lists aren't optimized in any way, and so the only reason to use the
lists instead of a WRMSR is for cases where the MSR _must_ be loaded
atomically with respect to VM-Enter (and/or VM-Exit).  While one could
argue that command MSRs, e.g. IA32_FLUSH_CMD, "need" to be done exactly
at VM-Enter, in practice doing such flushes within a few instructions of
VM-Enter is more than sufficient.

Note, the shortlog and changelog for commit 390d975e0c4e ("x86/KVM/VMX:
Use MSR save list for IA32_FLUSH_CMD if required") are misleading and
wrong.  That commit added MSR_IA32_FLUSH_CMD to the VM-Enter _load_
list, not the VM-Enter save list (which doesn't exist; only VM-Exit has
a store/save list).
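For reference, the swap-with-last removal idiom the atomic switch lists
rely on (and which the series consolidates into a single helper) can be
sketched in userspace.  The struct and function names here are
simplified stand-ins, not the kernel's actual types; the VMCS count
write is omitted since only the list manipulation is being illustrated:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NR_LOADSTORE_MSRS 8

/* Simplified stand-ins for KVM's MSR list bookkeeping (assumed layout). */
struct msr_entry { uint32_t index; uint64_t value; };
struct msr_list  { unsigned int nr; struct msr_entry val[MAX_NR_LOADSTORE_MSRS]; };

/* Linear scan over the first 'nr' entries; -1 if the MSR isn't listed. */
int find_msr_slot(const struct msr_list *m, uint32_t msr)
{
	for (unsigned int i = 0; i < m->nr; i++) {
		if (m->val[i].index == msr)
			return (int)i;
	}
	return -1;
}

/*
 * Remove an MSR by overwriting its slot with the last entry and
 * shrinking the count, i.e. O(1) removal with no ordering guarantees,
 * which is fine as the auto-load/store lists are unordered.
 */
void remove_msr(struct msr_list *m, uint32_t msr)
{
	int i = find_msr_slot(m, msr);

	if (i < 0)
		return;

	--m->nr;
	m->val[i] = m->val[m->nr];
}
```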
Signed-off-by: Sean Christopherson
Reviewed-by: Dapeng Mi
---
 arch/x86/kvm/vmx/vmx.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a51f66d1b201..38491962b2c1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1094,7 +1094,7 @@ static __always_inline void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
 }
 
 static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
-				  u64 guest_val, u64 host_val, bool entry_only)
+				  u64 guest_val, u64 host_val)
 {
	int i, j = 0;
	struct msr_autoload *m = &vmx->msr_autoload;
@@ -1132,8 +1132,7 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
	}
 
	i = vmx_find_loadstore_msr_slot(&m->guest, msr);
-	if (!entry_only)
-		j = vmx_find_loadstore_msr_slot(&m->host, msr);
+	j = vmx_find_loadstore_msr_slot(&m->host, msr);
 
	if ((i < 0 && m->guest.nr == MAX_NR_LOADSTORE_MSRS) ||
	    (j < 0 && m->host.nr == MAX_NR_LOADSTORE_MSRS)) {
@@ -1148,9 +1147,6 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
	m->guest.val[i].index = msr;
	m->guest.val[i].value = guest_val;
 
-	if (entry_only)
-		return;
-
	if (j < 0) {
		j = m->host.nr++;
		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
@@ -1190,8 +1186,7 @@ static bool update_transition_efer(struct vcpu_vmx *vmx)
	if (!(guest_efer & EFER_LMA))
		guest_efer &= ~EFER_LME;
	if (guest_efer != kvm_host.efer)
-		add_atomic_switch_msr(vmx, MSR_EFER,
-				      guest_efer, kvm_host.efer, false);
+		add_atomic_switch_msr(vmx, MSR_EFER, guest_efer, kvm_host.efer);
	else
		clear_atomic_switch_msr(vmx, MSR_EFER);
	return false;
@@ -7350,7 +7345,7 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
			clear_atomic_switch_msr(vmx, msrs[i].msr);
		else
			add_atomic_switch_msr(vmx, msrs[i].msr, msrs[i].guest,
-					      msrs[i].host, false);
+					      msrs[i].host);
 }
 
 static void vmx_update_hv_timer(struct kvm_vcpu *vcpu, bool force_immediate_exit)
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
From: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:15 -0800
Message-ID: <20251206001720.468579-40-seanjc@google.com>
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Subject: [PATCH v6 39/44] KVM: VMX: Bug the VM if either MSR auto-load list is full

WARN and bug the VM if either MSR auto-load list is full when adding an
MSR to the lists, as the set of MSRs that KVM loads via the lists is
finite and entirely KVM-controlled, i.e. overflowing the lists shouldn't
be possible in a fully released version of KVM.

Terminate the VM because the core KVM infrastructure has no insight into
_why_ an MSR is being added to the list, and failure to load an MSR on
VM-Enter and/or VM-Exit could be fatal to the host, e.g. running the
host with a guest-controlled PEBS MSR could generate unexpected writes
to the DS buffer and crash the host.
Signed-off-by: Sean Christopherson
Reviewed-by: Dapeng Mi
---
 arch/x86/kvm/vmx/vmx.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 38491962b2c1..2c50ebf4ff1b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1098,6 +1098,7 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 {
 	int i, j = 0;
 	struct msr_autoload *m = &vmx->msr_autoload;
+	struct kvm *kvm = vmx->vcpu.kvm;
 
 	switch (msr) {
 	case MSR_EFER:
@@ -1134,12 +1135,10 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 	i = vmx_find_loadstore_msr_slot(&m->guest, msr);
 	j = vmx_find_loadstore_msr_slot(&m->host, msr);
 
-	if ((i < 0 && m->guest.nr == MAX_NR_LOADSTORE_MSRS) ||
-	    (j < 0 && m->host.nr == MAX_NR_LOADSTORE_MSRS)) {
-		printk_once(KERN_WARNING "Not enough msr switch entries. "
-			    "Can't add msr %x\n", msr);
+	if (KVM_BUG_ON(i < 0 && m->guest.nr == MAX_NR_LOADSTORE_MSRS, kvm) ||
+	    KVM_BUG_ON(j < 0 && m->host.nr == MAX_NR_LOADSTORE_MSRS, kvm))
 		return;
-	}
+
 	if (i < 0) {
 		i = m->guest.nr++;
 		vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
From: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:16 -0800
Message-ID: <20251206001720.468579-41-seanjc@google.com>
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Subject: [PATCH v6 40/44] KVM: VMX: Set MSR index auto-load entry if and only if entry is "new"
When adding an MSR to the auto-load lists, update the MSR index in the
list entry if and only if a new entry is being inserted, as 'i' can only
be non-negative if vmx_find_loadstore_msr_slot() found an entry with the
MSR's index.  Unnecessarily setting the index is benign, but it makes it
harder to see that updating the value is necessary even when an existing
entry for the MSR was found.

No functional change intended.

Signed-off-by: Sean Christopherson
Reviewed-by: Dapeng Mi
---
 arch/x86/kvm/vmx/vmx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2c50ebf4ff1b..be2a2580e8f1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1141,16 +1141,16 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 
 	if (i < 0) {
 		i = m->guest.nr++;
+		m->guest.val[i].index = msr;
 		vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
 	}
-	m->guest.val[i].index = msr;
 	m->guest.val[i].value = guest_val;
 
 	if (j < 0) {
 		j = m->host.nr++;
+		m->host.val[j].index = msr;
 		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
 	}
-	m->host.val[j].index = msr;
 	m->host.val[j].value = host_val;
 }
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
From: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:17 -0800
Message-ID: <20251206001720.468579-42-seanjc@google.com>
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Subject: [PATCH v6 41/44] KVM: VMX: Compartmentalize adding MSRs to host vs. guest auto-load list

Undo the bundling of the "host" and "guest" MSR auto-load list logic so
that the code can be deduplicated by factoring the logic out to a
separate helper.  Now that "list full" situations are treated as fatal
to the VM, there is no need to pre-check both lists.

For all intents and purposes, this reverts the add_atomic_switch_msr()
changes made by commit 3190709335dd ("x86/KVM/VMX: Separate the VMX
AUTOLOAD guest/host number accounting").
Signed-off-by: Sean Christopherson
Reviewed-by: Dapeng Mi
---
 arch/x86/kvm/vmx/vmx.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index be2a2580e8f1..018e01daab68 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1096,9 +1096,9 @@ static __always_inline void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
 static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 				  u64 guest_val, u64 host_val)
 {
-	int i, j = 0;
 	struct msr_autoload *m = &vmx->msr_autoload;
 	struct kvm *kvm = vmx->vcpu.kvm;
+	int i;
 
 	switch (msr) {
 	case MSR_EFER:
@@ -1133,25 +1133,26 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 	}
 
 	i = vmx_find_loadstore_msr_slot(&m->guest, msr);
-	j = vmx_find_loadstore_msr_slot(&m->host, msr);
-
-	if (KVM_BUG_ON(i < 0 && m->guest.nr == MAX_NR_LOADSTORE_MSRS, kvm) ||
-	    KVM_BUG_ON(j < 0 && m->host.nr == MAX_NR_LOADSTORE_MSRS, kvm))
-		return;
-
 	if (i < 0) {
+		if (KVM_BUG_ON(m->guest.nr == MAX_NR_LOADSTORE_MSRS, kvm))
+			return;
+
 		i = m->guest.nr++;
 		m->guest.val[i].index = msr;
 		vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
 	}
 	m->guest.val[i].value = guest_val;
 
-	if (j < 0) {
-		j = m->host.nr++;
-		m->host.val[j].index = msr;
+	i = vmx_find_loadstore_msr_slot(&m->host, msr);
+	if (i < 0) {
+		if (KVM_BUG_ON(m->host.nr == MAX_NR_LOADSTORE_MSRS, kvm))
+			return;
+
+		i = m->host.nr++;
+		m->host.val[i].index = msr;
 		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
 	}
-	m->host.val[j].value = host_val;
+	m->host.val[i].value = host_val;
 }
 
 static bool update_transition_efer(struct vcpu_vmx *vmx)
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
From: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:18 -0800
Message-ID: <20251206001720.468579-43-seanjc@google.com>
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Subject: [PATCH v6 42/44] KVM: VMX: Dedup code for adding MSR to VMCS's auto list

Add a helper to add an MSR to a VMCS's "auto" list to deduplicate the
code in add_atomic_switch_msr(), and so that the functionality can be
used in the future for managing the MSR auto-store list.

No functional change intended.
Signed-off-by: Sean Christopherson
Reviewed-by: Dapeng Mi
---
 arch/x86/kvm/vmx/vmx.c | 41 +++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 018e01daab68..3f64d4b1b19c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1093,12 +1093,28 @@ static __always_inline void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
 	vm_exit_controls_setbit(vmx, exit);
 }
 
+static void vmx_add_auto_msr(struct vmx_msrs *m, u32 msr, u64 value,
+			     unsigned long vmcs_count_field, struct kvm *kvm)
+{
+	int i;
+
+	i = vmx_find_loadstore_msr_slot(m, msr);
+	if (i < 0) {
+		if (KVM_BUG_ON(m->nr == MAX_NR_LOADSTORE_MSRS, kvm))
+			return;
+
+		i = m->nr++;
+		m->val[i].index = msr;
+		vmcs_write32(vmcs_count_field, m->nr);
+	}
+	m->val[i].value = value;
+}
+
 static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 				  u64 guest_val, u64 host_val)
 {
 	struct msr_autoload *m = &vmx->msr_autoload;
 	struct kvm *kvm = vmx->vcpu.kvm;
-	int i;
 
 	switch (msr) {
 	case MSR_EFER:
@@ -1132,27 +1148,8 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 		wrmsrq(MSR_IA32_PEBS_ENABLE, 0);
 	}
 
-	i = vmx_find_loadstore_msr_slot(&m->guest, msr);
-	if (i < 0) {
-		if (KVM_BUG_ON(m->guest.nr == MAX_NR_LOADSTORE_MSRS, kvm))
-			return;
-
-		i = m->guest.nr++;
-		m->guest.val[i].index = msr;
-		vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
-	}
-	m->guest.val[i].value = guest_val;
-
-	i = vmx_find_loadstore_msr_slot(&m->host, msr);
-	if (i < 0) {
-		if (KVM_BUG_ON(m->host.nr == MAX_NR_LOADSTORE_MSRS, kvm))
-			return;
-
-		i = m->host.nr++;
-		m->host.val[i].index = msr;
-		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
-	}
-	m->host.val[i].value = host_val;
+	vmx_add_auto_msr(&m->guest, msr, guest_val, VM_ENTRY_MSR_LOAD_COUNT, kvm);
+	vmx_add_auto_msr(&m->host, msr, host_val, VM_EXIT_MSR_LOAD_COUNT, kvm);
 }
 
 static bool update_transition_efer(struct vcpu_vmx *vmx)
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
From: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:19 -0800
Message-ID: <20251206001720.468579-44-seanjc@google.com>
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Subject: [PATCH v6 43/44] KVM: VMX: Initialize vmcs01.VM_EXIT_MSR_STORE_ADDR with list address

Initialize vmcs01.VM_EXIT_MSR_STORE_ADDR to point at the vCPU's
msr_autostore list in anticipation of utilizing the auto-store
functionality, and to harden KVM against stray reads to pfn 0 (or, in
theory, a random pfn if the underlying CPU uses a complex scheme for
encoding VMCS data).  The MSR auto lists are supposed to be ignored if
the associated COUNT VMCS field is '0', but leaving the ADDR field
zero-initialized in memory is an unnecessary risk (albeit a minuscule
one) given that the cost is a single VMWRITE during vCPU creation.
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/vmx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3f64d4b1b19c..6a17cb90eaf4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4933,6 +4933,7 @@ static void init_vmcs(struct vcpu_vmx *vmx)
 	vmcs_write64(VM_FUNCTION_CONTROL, 0);
 
 	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
+	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->msr_autostore.val));
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, 0);
 	vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, 0);
-- 
2.52.0.223.gf5cc29aaa4-goog

From nobody Tue Dec 16 07:27:38 2025
From: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:20 -0800
Subject: [PATCH v6 44/44] KVM: VMX: Add mediated PMU support for CPUs without "save perf global ctrl"
Message-ID: <20251206001720.468579-45-seanjc@google.com>
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
References: <20251206001720.468579-1-seanjc@google.com>
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson, Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang, Manali Shukla, Jim Mattson

Extend mediated PMU support to Intel CPUs without support for saving
PERF_GLOBAL_CONTROL into the guest VMCS field on VM-Exit, e.g. for
Skylake and its derivatives, as well as Icelake.  While supporting CPUs
without VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL isn't completely trivial,
it's not that complex either.  And not supporting such CPUs would mean
not supporting 7+ years of Intel CPUs released in the past 10 years.

On VM-Exit, immediately propagate the saved PERF_GLOBAL_CTRL to the VMCS
as well as KVM's software cache so that KVM doesn't need to add full
EXREG tracking of PERF_GLOBAL_CTRL.  In practice, the vast majority of
VM-Exits won't trigger software writes to guest PERF_GLOBAL_CTRL, so
deferring the VMWRITE to the next VM-Enter would only delay the
inevitable without batching/avoiding VMWRITEs.

Note!  Take care to refresh VM_EXIT_MSR_STORE_COUNT on nested VM-Exit,
as it's unfortunately possible that KVM could recalculate MSR intercepts
while L2 is active, e.g. if userspace loads nested state and _then_ sets
PERF_CAPABILITIES.  Eating the VMWRITE on every nested VM-Exit is
unfortunate, but that's a pre-existing problem and can/should be solved
separately, e.g. modifying the number of auto-load entries while L2 is
active is also uncommon on modern CPUs.
Signed-off-by: Sean Christopherson
Reviewed-by: Dapeng Mi
Tested-by: Dapeng Mi
---
 arch/x86/kvm/vmx/nested.c    |  6 ++++-
 arch/x86/kvm/vmx/pmu_intel.c |  7 -----
 arch/x86/kvm/vmx/vmx.c       | 52 ++++++++++++++++++++++++++++++++----
 3 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 614b789ecf16..1ee1edc8419d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5142,7 +5142,11 @@ void __nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
 
 	kvm_nested_vmexit_handle_ibrs(vcpu);
 
-	/* Update any VMCS fields that might have changed while L2 ran */
+	/*
+	 * Update any VMCS fields that might have changed while vmcs02 was the
+	 * active VMCS.  The tracking is per-vCPU, not per-VMCS.
+	 */
+	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.nr);
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
 	vmcs_write64(TSC_OFFSET, vcpu->arch.tsc_offset);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 55249fa4db95..27eb76e6b6a0 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -777,13 +777,6 @@ static bool intel_pmu_is_mediated_pmu_supported(struct x86_pmu_capability *host_
 	if (WARN_ON_ONCE(!cpu_has_load_perf_global_ctrl()))
 		return false;
 
-	/*
-	 * KVM doesn't yet support mediated PMU on CPUs without support for
-	 * saving PERF_GLOBAL_CTRL via a dedicated VMCS field.
-	 */
-	if (!cpu_has_save_perf_global_ctrl())
-		return false;
-
 	return true;
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6a17cb90eaf4..ba1262c3e3ff 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1204,6 +1204,17 @@ static bool update_transition_efer(struct vcpu_vmx *vmx)
 	return true;
 }
 
+static void vmx_add_autostore_msr(struct vcpu_vmx *vmx, u32 msr)
+{
+	vmx_add_auto_msr(&vmx->msr_autostore, msr, 0, VM_EXIT_MSR_STORE_COUNT,
+			 vmx->vcpu.kvm);
+}
+
+static void vmx_remove_autostore_msr(struct vcpu_vmx *vmx, u32 msr)
+{
+	vmx_remove_auto_msr(&vmx->msr_autostore, msr, VM_EXIT_MSR_STORE_COUNT);
+}
+
 #ifdef CONFIG_X86_32
 /*
  * On 32-bit kernels, VM exits still load the FS and GS bases from the
@@ -4225,6 +4236,8 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
 
 static void vmx_recalc_pmu_msr_intercepts(struct kvm_vcpu *vcpu)
 {
+	u64 vm_exit_controls_bits = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
+				    VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL;
 	bool has_mediated_pmu = kvm_vcpu_has_mediated_pmu(vcpu);
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -4234,12 +4247,19 @@ static void vmx_recalc_pmu_msr_intercepts(struct kvm_vcpu *vcpu)
 	if (!enable_mediated_pmu)
 		return;
 
+	if (!cpu_has_save_perf_global_ctrl()) {
+		vm_exit_controls_bits &= ~VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL;
+
+		if (has_mediated_pmu)
+			vmx_add_autostore_msr(vmx, MSR_CORE_PERF_GLOBAL_CTRL);
+		else
+			vmx_remove_autostore_msr(vmx, MSR_CORE_PERF_GLOBAL_CTRL);
+	}
+
 	vm_entry_controls_changebit(vmx, VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
 				    has_mediated_pmu);
 
-	vm_exit_controls_changebit(vmx, VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
-				   VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL,
-				   has_mediated_pmu);
+	vm_exit_controls_changebit(vmx, vm_exit_controls_bits, has_mediated_pmu);
 
 	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PERFCTR0 + i,
@@ -7346,6 +7366,29 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 					msrs[i].host);
 }
 
+static void vmx_refresh_guest_perf_global_control(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (msr_write_intercepted(vmx, MSR_CORE_PERF_GLOBAL_CTRL))
+		return;
+
+	if (!cpu_has_save_perf_global_ctrl()) {
+		int slot = vmx_find_loadstore_msr_slot(&vmx->msr_autostore,
+						       MSR_CORE_PERF_GLOBAL_CTRL);
+
+		if (WARN_ON_ONCE(slot < 0))
+			return;
+
+		pmu->global_ctrl = vmx->msr_autostore.val[slot].value;
+		vmcs_write64(GUEST_IA32_PERF_GLOBAL_CTRL, pmu->global_ctrl);
+		return;
+	}
+
+	pmu->global_ctrl = vmcs_read64(GUEST_IA32_PERF_GLOBAL_CTRL);
+}
+
 static void vmx_update_hv_timer(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7631,8 +7674,7 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
 
 	vmx->loaded_vmcs->launched = 1;
 
-	if (!msr_write_intercepted(vmx, MSR_CORE_PERF_GLOBAL_CTRL))
-		vcpu_to_pmu(vcpu)->global_ctrl = vmcs_read64(GUEST_IA32_PERF_GLOBAL_CTRL);
+	vmx_refresh_guest_perf_global_control(vcpu);
 
 	vmx_recover_nmi_blocking(vmx);
 	vmx_complete_interrupts(vmx);
-- 
2.52.0.223.gf5cc29aaa4-goog