From nobody Sun Feb 8 02:56:17 2026
Date: Mon, 25 Mar 2024 10:27:02 -0700
In-Reply-To: <20240325172707.73966-1-peternewman@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240325172707.73966-1-peternewman@google.com>
Message-ID: <20240325172707.73966-2-peternewman@google.com>
Subject: [PATCH v1 1/6] x86/resctrl: Move __resctrl_sched_in() out-of-line
From: Peter Newman
To: Fenghua Yu, Reinette Chatre, James Morse
Cc: Stephane Eranian, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin", Peter Zijlstra,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
 Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider,
 Uros Bizjak, Mike Rapoport, "Kirill A. Shutemov", Rick Edgecombe,
 Xin Li, Babu Moger, Shaopeng Tan, Maciej Wieczor-Retman, Jens Axboe,
 Christian Brauner, Oleg Nesterov, Andrew Morton, Tycho Andersen,
 Nicholas Piggin, Beau Belgrave, "Matthew Wilcox (Oracle)",
 linux-kernel@vger.kernel.org, Peter Newman

__resctrl_sched_in() is unable to dereference a struct rdtgroup pointer
when defined inline because rdtgroup is a private structure defined in
internal.h.

This function is defined inline to avoid impacting context switch
performance for the majority of users who aren't using resctrl at all.
These benefits can already be realized without access to internal
resctrl data structures.

The logic of performing an out-of-line call to __resctrl_sched_in() only
when resctrl is mounted is architecture-independent, so the inline
definition of resctrl_sched_in() can be moved into linux/resctrl.h.
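For readers unfamiliar with the pattern, the split between a cheap inline
check and an out-of-line body can be sketched in plain userspace C. This
is only an analogy under stated assumptions: every demo_* name is
hypothetical, and an ordinary bool stands in for the rdt_enable_key
static key, which the kernel tests with static_branch_likely() rather
than a memory load.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the rdt_enable_key static key (assumption). */
static bool demo_enable_key;

/* Counts out-of-line calls so the effect of the gate is observable. */
static unsigned int demo_slow_calls;

/*
 * Out-of-line part: in the patch this role is played by
 * __resctrl_sched_in() in rdtgroup.c, which may use private structures.
 */
void demo_sched_in_slow(int task_id)
{
	(void)task_id;
	demo_slow_calls++;
}

/*
 * Inline part: the only cost on the context switch path while the
 * feature is unused is this check; no private types are needed here.
 */
static inline void demo_sched_in(int task_id)
{
	if (demo_enable_key)
		demo_sched_in_slow(task_id);
}
```

With demo_enable_key left false, demo_sched_in() never reaches the
out-of-line function, which is the property the commit message relies on.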
Signed-off-by: Peter Newman
---
 arch/x86/include/asm/resctrl.h         | 75 --------------------------
 arch/x86/kernel/cpu/resctrl/internal.h | 24 +++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 41 ++++++++++++++
 arch/x86/kernel/process_32.c           |  2 +-
 arch/x86/kernel/process_64.c           |  2 +-
 include/linux/resctrl.h                | 21 ++++++++
 6 files changed, 88 insertions(+), 77 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 12dbd2588ca7..99ba8c0dc155 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -14,30 +14,6 @@
  */
 #define X86_RESCTRL_EMPTY_CLOSID         ((u32)~0)
 
-/**
- * struct resctrl_pqr_state - State cache for the PQR MSR
- * @cur_rmid:		The cached Resource Monitoring ID
- * @cur_closid:	The cached Class Of Service ID
- * @default_rmid:	The user assigned Resource Monitoring ID
- * @default_closid:	The user assigned cached Class Of Service ID
- *
- * The upper 32 bits of MSR_IA32_PQR_ASSOC contain closid and the
- * lower 10 bits rmid. The update to MSR_IA32_PQR_ASSOC always
- * contains both parts, so we need to cache them. This also
- * stores the user configured per cpu CLOSID and RMID.
- *
- * The cache also helps to avoid pointless updates if the value does
- * not change.
- */
-struct resctrl_pqr_state {
-	u32			cur_rmid;
-	u32			cur_closid;
-	u32			default_rmid;
-	u32			default_closid;
-};
-
-DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
-
 extern bool rdt_alloc_capable;
 extern bool rdt_mon_capable;
 
@@ -79,50 +55,6 @@ static inline void resctrl_arch_disable_mon(void)
 	static_branch_dec_cpuslocked(&rdt_enable_key);
 }
 
-/*
- * __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
- *
- * Following considerations are made so that this has minimal impact
- * on scheduler hot path:
- * - This will stay as no-op unless we are running on an Intel SKU
- *   which supports resource control or monitoring and we enable by
- *   mounting the resctrl file system.
- * - Caches the per cpu CLOSid/RMID values and does the MSR write only
- *   when a task with a different CLOSid/RMID is scheduled in.
- * - We allocate RMIDs/CLOSids globally in order to keep this as
- *   simple as possible.
- * Must be called with preemption disabled.
- */
-static inline void __resctrl_sched_in(struct task_struct *tsk)
-{
-	struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
-	u32 closid = state->default_closid;
-	u32 rmid = state->default_rmid;
-	u32 tmp;
-
-	/*
-	 * If this task has a closid/rmid assigned, use it.
-	 * Else use the closid/rmid assigned to this cpu.
-	 */
-	if (static_branch_likely(&rdt_alloc_enable_key)) {
-		tmp = READ_ONCE(tsk->closid);
-		if (tmp)
-			closid = tmp;
-	}
-
-	if (static_branch_likely(&rdt_mon_enable_key)) {
-		tmp = READ_ONCE(tsk->rmid);
-		if (tmp)
-			rmid = tmp;
-	}
-
-	if (closid != state->cur_closid || rmid != state->cur_rmid) {
-		state->cur_closid = closid;
-		state->cur_rmid = rmid;
-		wrmsr(MSR_IA32_PQR_ASSOC, rmid, closid);
-	}
-}
-
 static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
 {
 	unsigned int scale = boot_cpu_data.x86_cache_occ_scale;
@@ -150,12 +82,6 @@ static inline bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 ignored,
 	return READ_ONCE(tsk->rmid) == rmid;
 }
 
-static inline void resctrl_sched_in(struct task_struct *tsk)
-{
-	if (static_branch_likely(&rdt_enable_key))
-		__resctrl_sched_in(tsk);
-}
-
 static inline u32 resctrl_arch_system_num_rmid_idx(void)
 {
 	/* RMID are independent numbers for x86. num_rmid_idx == num_rmid */
@@ -188,7 +114,6 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c);
 
 #else
 
-static inline void resctrl_sched_in(struct task_struct *tsk) {}
 static inline void resctrl_cpu_detect(struct cpuinfo_x86 *c) {}
 
 #endif /* CONFIG_X86_CPU_RESCTRL */
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c99f26ebe7a6..56a68e542572 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -331,6 +331,30 @@ struct rftype {
 			 char *buf, size_t nbytes, loff_t off);
 };
 
+/**
+ * struct resctrl_pqr_state - State cache for the PQR MSR
+ * @cur_rmid:		The cached Resource Monitoring ID
+ * @cur_closid:	The cached Class Of Service ID
+ * @default_rmid:	The user assigned Resource Monitoring ID
+ * @default_closid:	The user assigned cached Class Of Service ID
+ *
+ * The upper 32 bits of MSR_IA32_PQR_ASSOC contain closid and the
+ * lower 10 bits rmid. The update to MSR_IA32_PQR_ASSOC always
+ * contains both parts, so we need to cache them. This also
+ * stores the user configured per cpu CLOSID and RMID.
+ *
+ * The cache also helps to avoid pointless updates if the value does
+ * not change.
+ */
+struct resctrl_pqr_state {
+	u32			cur_rmid;
+	u32			cur_closid;
+	u32			default_rmid;
+	u32			default_closid;
+};
+
+DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
+
 /**
  * struct mbm_state - status for each MBM counter in each domain
  * @prev_bw_bytes: Previous bytes value read for bandwidth calculation
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 011e17efb1a6..5d599d99f94b 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -334,6 +334,47 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
 	return ret;
 }
 
+/*
+ * __resctrl_sched_in() - Writes the task's control and monitor IDs into the CPU
+ *
+ * Following considerations are made so that this has minimal impact
+ * on scheduler hot path:
+ * - Caches the per cpu CLOSid/RMID values and does the MSR write only
+ *   when a task with a different CLOSid/RMID is scheduled in.
+ * - We allocate RMIDs/CLOSids globally in order to keep this as
+ *   simple as possible.
+ * Must be called with preemption disabled.
+ */
+void __resctrl_sched_in(struct task_struct *tsk)
+{
+	struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
+	u32 closid = state->default_closid;
+	u32 rmid = state->default_rmid;
+	u32 tmp;
+
+	/*
+	 * If this task has a closid/rmid assigned, use it.
+	 * Else use the closid/rmid assigned to this cpu.
+	 */
+	if (static_branch_likely(&rdt_alloc_enable_key)) {
+		tmp = READ_ONCE(tsk->closid);
+		if (tmp)
+			closid = tmp;
+	}
+
+	if (static_branch_likely(&rdt_mon_enable_key)) {
+		tmp = READ_ONCE(tsk->rmid);
+		if (tmp)
+			rmid = tmp;
+	}
+
+	if (closid != state->cur_closid || rmid != state->cur_rmid) {
+		state->cur_closid = closid;
+		state->cur_rmid = rmid;
+		wrmsr(MSR_IA32_PQR_ASSOC, rmid, closid);
+	}
+}
+
 /*
  * This is safe against resctrl_sched_in() called from __switch_to()
  * because __switch_to() is executed with interrupts disabled. A local call
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 0917c7f25720..8f92a87d381d 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -38,6 +38,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -51,7 +52,6 @@
 #include
 #include
 #include
-#include
 #include
 
 #include "process.h"
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 7062b84dd467..d442269bb25b 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -53,7 +54,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index a365f67131ec..62d607939a73 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -304,4 +304,25 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain *d);
 extern unsigned int resctrl_rmid_realloc_threshold;
 extern unsigned int resctrl_rmid_realloc_limit;
 
+DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
+
+void __resctrl_sched_in(struct task_struct *tsk);
+
+/*
+ * resctrl_sched_in() - Assigns the incoming task's control/monitor IDs to the
+ *			 current CPU
+ *
+ * To minimize impact to the scheduler hot path, this will stay as no-op unless
+ * running on a system supporting resctrl and the filesystem is mounted.
+ *
+ * Must be called with preemption disabled.
+ */
+static inline void resctrl_sched_in(struct task_struct *tsk)
+{
+#ifdef CONFIG_X86_CPU_RESCTRL
+	if (static_branch_likely(&rdt_enable_key))
+		__resctrl_sched_in(tsk);
+#endif
+}
+
 #endif /* _RESCTRL_H */
-- 
2.44.0.396.g6e790dbe36-goog

From nobody Sun Feb 8 02:56:17 2026
Date: Mon, 25 Mar 2024 10:27:03 -0700
In-Reply-To: <20240325172707.73966-1-peternewman@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240325172707.73966-1-peternewman@google.com>
Message-ID: <20240325172707.73966-3-peternewman@google.com>
Subject: [PATCH v1 2/6] x86/resctrl: Add hook for releasing task_struct references
From: Peter Newman
To: Fenghua Yu, Reinette Chatre, James Morse
Cc: Stephane Eranian, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin", Peter Zijlstra,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
 Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider,
 Uros Bizjak, Mike Rapoport, "Kirill A. Shutemov", Rick Edgecombe,
 Xin Li, Babu Moger, Shaopeng Tan, Maciej Wieczor-Retman, Jens Axboe,
 Christian Brauner, Oleg Nesterov, Andrew Morton, Tycho Andersen,
 Nicholas Piggin, Beau Belgrave, "Matthew Wilcox (Oracle)",
 linux-kernel@vger.kernel.org, Peter Newman

In order for the task_struct to hold references to rdtgroups, it must be
possible to release these references before a concurrent deletion causes
them to be freed.

It is not possible for resctrl code to do this with
for_each_process_thread() because the task can still switch in after it
has been removed from the tasklist, at which point the task_struct could
be referring to freed memory.
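The hazard described above can be modelled in a few lines of userspace
C. This is a hedged sketch, not what this patch implements: the patch
only adds an empty exit_resctrl() hook, and the demo_* types and the
choice to repoint the task at a never-freed default group are
assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a resctrl group (assumption). */
struct demo_group {
	unsigned int closid;
	unsigned int rmid;
};

/* A default group that is never freed, so pointing at it is always safe. */
static struct demo_group demo_default_group = { 0, 0 };

/* Hypothetical task carrying a single group reference. */
struct demo_task {
	struct demo_group *group;
};

/*
 * Models an exit hook that runs while the task is still on the
 * tasklist: drop the reference to a group that may later be deleted.
 */
void demo_exit_hook(struct demo_task *t)
{
	t->group = &demo_default_group;
}

/*
 * Models a context switch after the task left the tasklist. Group
 * deletion can no longer find the task, so this dereference is only
 * safe if the exit hook already replaced the pointer.
 */
unsigned int demo_sched_in_closid(const struct demo_task *t)
{
	return t->group->closid;
}
```

The ordering is the point of the sketch: the pointer is replaced before
the task becomes unreachable to code that walks the tasklist.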
Signed-off-by: Peter Newman
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 ++++++++++
 include/linux/resctrl.h                |  6 ++++++
 kernel/exit.c                          |  3 +++
 3 files changed, 19 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5d599d99f94b..9b1969e4235a 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2931,6 +2931,16 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to,
 	read_unlock(&tasklist_lock);
 }
 
+/**
+ * exit_resctrl() - called at thread destruction to release resources
+ *
+ * This hook is called just before the task is removed from the global tasklist
+ * and still reachable via for_each_process_thread().
+ */
+void exit_resctrl(struct task_struct *tsk)
+{
+}
+
 static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
 {
 	struct rdtgroup *sentry, *stmp;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 62d607939a73..b2af1fbc7aa1 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -325,4 +325,10 @@ static inline void resctrl_sched_in(struct task_struct *tsk)
 #endif
 }
 
+#ifdef CONFIG_X86_CPU_RESCTRL
+void exit_resctrl(struct task_struct *tsk);
+#else
+static inline void exit_resctrl(struct task_struct *tsk) {}
+#endif
+
 #endif /* _RESCTRL_H */
diff --git a/kernel/exit.c b/kernel/exit.c
index 41a12630cbbc..ccdc90ff6d71 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -70,6 +70,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -862,6 +863,8 @@ void __noreturn do_exit(long code)
 	tsk->exit_code = code;
 	taskstats_exit(tsk, group_dead);
 
+	exit_resctrl(tsk);
+
 	exit_mm();
 
 	if (group_dead)
-- 
2.44.0.396.g6e790dbe36-goog

From nobody Sun Feb 8 02:56:17 2026
Date: Mon, 25 Mar 2024 10:27:04 -0700
In-Reply-To: <20240325172707.73966-1-peternewman@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240325172707.73966-1-peternewman@google.com>
Message-ID: <20240325172707.73966-4-peternewman@google.com>
Subject: [PATCH v1 3/6] x86/resctrl: Disallow mongroup rename on MPAM
From: Peter Newman
To: Fenghua Yu, Reinette Chatre, James Morse
Cc: Stephane Eranian, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin", Peter Zijlstra,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
 Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider,
 Uros Bizjak, Mike Rapoport, "Kirill A. Shutemov", Rick Edgecombe,
 Xin Li, Babu Moger, Shaopeng Tan, Maciej Wieczor-Retman, Jens Axboe,
 Christian Brauner, Oleg Nesterov, Andrew Morton, Tycho Andersen,
 Nicholas Piggin, Beau Belgrave, "Matthew Wilcox (Oracle)",
 linux-kernel@vger.kernel.org, Peter Newman

Moving a monitoring group to a different parent control assumes that the
monitors will not be impacted. This is not the case on MPAM where the
PMG is an extension of the PARTID.

Detect this situation by requiring the change in CLOSID not to affect
the result of resctrl_arch_rmid_idx_encode(), otherwise return
-EOPNOTSUPP.

Signed-off-by: Peter Newman
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9b1969e4235a..8d6979dbfd02 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3879,6 +3879,19 @@ static int rdtgroup_rename(struct kernfs_node *kn,
 		goto out;
 	}
 
+	/*
+	 * If changing the CLOSID impacts the RMID, this operation is not
+	 * supported.
+	 */
+	if (resctrl_arch_rmid_idx_encode(rdtgrp->mon.parent->closid,
+					 rdtgrp->mon.rmid) !=
+	    resctrl_arch_rmid_idx_encode(new_prdtgrp->closid,
+					 rdtgrp->mon.rmid)) {
+		rdt_last_cmd_puts("changing parent control group not supported\n");
+		ret = -EOPNOTSUPP;
+		goto out;
+	}
+
 	/*
 	 * If the MON group is monitoring CPUs, the CPUs must be assigned to the
 	 * current parent CTRL_MON group and therefore cannot be assigned to
-- 
2.44.0.396.g6e790dbe36-goog

From nobody Sun Feb 8 02:56:17 2026
Date: Mon, 25 Mar 2024 10:27:05 -0700
In-Reply-To: <20240325172707.73966-1-peternewman@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240325172707.73966-1-peternewman@google.com>
Message-ID: <20240325172707.73966-5-peternewman@google.com>
Subject: [PATCH v1 4/6] x86/resctrl: Use rdtgroup pointer to indicate task membership
From: Peter Newman
To: Fenghua Yu, Reinette Chatre, James Morse
Cc: Stephane Eranian, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin", Peter Zijlstra,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
 Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider,
 Uros Bizjak, Mike Rapoport, "Kirill A. Shutemov", Rick Edgecombe,
 Xin Li, Babu Moger, Shaopeng Tan, Maciej Wieczor-Retman, Jens Axboe,
 Christian Brauner, Oleg Nesterov, Andrew Morton, Tycho Andersen,
 Nicholas Piggin, Beau Belgrave, "Matthew Wilcox (Oracle)",
 linux-kernel@vger.kernel.org, Peter Newman

Caching the CLOSID and RMID values in all member tasks makes changing
either ID for a group expensive, as all task_structs must be inspected
while read-locking the tasklist_lock.

A single rdtgroup reference from the task_struct can indicate the
mongroup and ctrl group membership of a task.
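A single group pointer can stand in for two cached IDs, as the following
userspace sketch tries to show. It is an illustration under stated
assumptions, not the series' implementation: the demo_* names are
hypothetical, the layout is flattened compared with the mon.parent and
mon.rmid fields visible in the rename check of patch 3, and the atomic
pointer read that the kernel needs is omitted.

```c
#include <stddef.h>
#include <stdint.h>

enum demo_group_type { DEMO_CTRL_GROUP, DEMO_MON_GROUP };

/* Hypothetical, simplified stand-in for struct rdtgroup (assumption). */
struct demo_rdtgroup {
	enum demo_group_type type;
	uint32_t closid;		/* only meaningful for control groups */
	uint32_t rmid;			/* every group has its own RMID */
	struct demo_rdtgroup *parent;	/* monitor groups: owning control group */
};

/* Hypothetical task holding one group reference instead of two cached IDs. */
struct demo_task {
	struct demo_rdtgroup *group;
};

/* A monitor group has no CLOSID of its own; it is read via the parent. */
static uint32_t demo_task_closid(const struct demo_task *t)
{
	const struct demo_rdtgroup *g = t->group;

	return g->type == DEMO_MON_GROUP ? g->parent->closid : g->closid;
}

/* The RMID always comes from the group the task points at directly. */
static uint32_t demo_task_rmid(const struct demo_task *t)
{
	return t->group->rmid;
}
```

Changing the closid of a control group in this model is visible to every
member task on its next lookup, without visiting any task.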
In the case of mongroups, the parent pointer can be used to determine the CLOSID indirectly, avoiding the need for invalidating a cached CLOSID in all task_structs. This also solves the problem of tearing PARTID/PMG values in MPAM, as the parent pointer of a mongroup does not change. Therefore an atomic read of the rdt_group pointer provides a consistent view of current mongroup and control group membership, making __resctrl_sched_in() portable. Care must be taken to ensure that __resctrl_sched_in() does not dereference a pointer to a freed rdtgroup struct. Tasks may no longer be reachable via for_each_process_thread() but can still be switched in, so update the rdt_group pointer before the thread is removed from the tasklist. Co-developed-by: Stephane Eranian Signed-off-by: Stephane Eranian Signed-off-by: Peter Newman --- arch/x86/include/asm/resctrl.h | 18 --- arch/x86/kernel/cpu/resctrl/core.c | 3 +- arch/x86/kernel/cpu/resctrl/internal.h | 13 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 205 +++++++++++++------------ include/linux/sched.h | 3 +- 5 files changed, 110 insertions(+), 132 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index 99ba8c0dc155..be4afbc6180f 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -64,24 +64,6 @@ static inline unsigned int resctrl_arch_round_mon_val(un= signed int val) return val * scale; } =20 -static inline void resctrl_arch_set_closid_rmid(struct task_struct *tsk, - u32 closid, u32 rmid) -{ - WRITE_ONCE(tsk->closid, closid); - WRITE_ONCE(tsk->rmid, rmid); -} - -static inline bool resctrl_arch_match_closid(struct task_struct *tsk, u32 = closid) -{ - return READ_ONCE(tsk->closid) =3D=3D closid; -} - -static inline bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 ig= nored, - u32 rmid) -{ - return READ_ONCE(tsk->rmid) =3D=3D rmid; -} - static inline u32 resctrl_arch_system_num_rmid_idx(void) { /* RMID are independent numbers for x86. 
num_rmid_idx =3D=3D num_rmid */ diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 83e40341583e..ae5878d748fc 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -600,8 +600,7 @@ static void clear_closid_rmid(int cpu) { struct resctrl_pqr_state *state =3D this_cpu_ptr(&pqr_state); =20 - state->default_closid =3D RESCTRL_RESERVED_CLOSID; - state->default_rmid =3D RESCTRL_RESERVED_RMID; + state->default_group =3D &rdtgroup_default; state->cur_closid =3D RESCTRL_RESERVED_CLOSID; state->cur_rmid =3D RESCTRL_RESERVED_RMID; wrmsr(MSR_IA32_PQR_ASSOC, RESCTRL_RESERVED_RMID, diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/r= esctrl/internal.h index 56a68e542572..0ba0d2428780 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -334,14 +334,8 @@ struct rftype { /** * struct resctrl_pqr_state - State cache for the PQR MSR * @cur_rmid: The cached Resource Monitoring ID - * @cur_closid: The cached Class Of Service ID - * @default_rmid: The user assigned Resource Monitoring ID - * @default_closid: The user assigned cached Class Of Service ID - * - * The upper 32 bits of MSR_IA32_PQR_ASSOC contain closid and the - * lower 10 bits rmid. The update to MSR_IA32_PQR_ASSOC always - * contains both parts, so we need to cache them. This also - * stores the user configured per cpu CLOSID and RMID. + * @cur_closid: The cached Class Of Service ID + * @default_group: The user assigned rdtgroup * * The cache also helps to avoid pointless updates if the value does * not change. 
@@ -349,8 +343,7 @@ struct rftype { struct resctrl_pqr_state { u32 cur_rmid; u32 cur_closid; - u32 default_rmid; - u32 default_closid; + struct rdtgroup *default_group; }; =20 DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state); diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index 8d6979dbfd02..badf181c8cbb 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -348,25 +348,55 @@ static int rdtgroup_cpus_show(struct kernfs_open_file= *of, void __resctrl_sched_in(struct task_struct *tsk) { struct resctrl_pqr_state *state =3D this_cpu_ptr(&pqr_state); - u32 closid =3D state->default_closid; - u32 rmid =3D state->default_rmid; - u32 tmp; + u32 closid =3D state->cur_closid; + u32 rmid =3D state->cur_rmid; + struct rdtgroup *rgrp; =20 /* - * If this task has a closid/rmid assigned, use it. - * Else use the closid/rmid assigned to this cpu. + * A task's group assignment can change concurrently, but the CLOSID or + * RMID assigned to a group cannot change. */ + rgrp =3D READ_ONCE(tsk->rdt_group); + if (!rgrp || rgrp =3D=3D &rdtgroup_default) + /* + * If this task is a member of a control or monitoring group, + * use the IDs assigned to these groups. Else use the + * closid/rmid assigned to this cpu. + */ + rgrp =3D state->default_group; + + /* + * Context switches are possible before the cpuonline handler + * initializes default_group. + */ + if (!rgrp) + rgrp =3D &rdtgroup_default; + if (static_branch_likely(&rdt_alloc_enable_key)) { - tmp =3D READ_ONCE(tsk->closid); - if (tmp) - closid =3D tmp; + /* + * If the task is assigned to a monitoring group, the CLOSID is + * determined by the parent control group. + */ + if (rgrp->type =3D=3D RDTMON_GROUP) { + if (!WARN_ON(!rgrp->mon.parent)) + /* + * The parent rdtgroup cannot be freed until + * after the mon group is freed. 
In the event + * that the parent rdtgroup is removed (by + * rdtgroup_rmdir_ctrl()), rdt_mon_group would + * be redirected to rdtgroup_default, followed + * by a full barrier and synchronous IPI + * broadcast before proceeding to free the + * group. + */ + closid =3D rgrp->mon.parent->closid; + } else { + closid =3D rgrp->closid; + } } =20 - if (static_branch_likely(&rdt_mon_enable_key)) { - tmp =3D READ_ONCE(tsk->rmid); - if (tmp) - rmid =3D tmp; - } + if (static_branch_likely(&rdt_mon_enable_key)) + rmid =3D rgrp->mon.rmid; =20 if (closid !=3D state->cur_closid || rmid !=3D state->cur_rmid) { state->cur_closid =3D closid; @@ -385,10 +415,8 @@ static void update_cpu_closid_rmid(void *info) { struct rdtgroup *r =3D info; =20 - if (r) { - this_cpu_write(pqr_state.default_closid, r->closid); - this_cpu_write(pqr_state.default_rmid, r->mon.rmid); - } + if (r) + this_cpu_write(pqr_state.default_group, r); =20 /* * We cannot unconditionally write the MSR because the current @@ -624,49 +652,61 @@ static void update_task_closid_rmid(struct task_struc= t *t) =20 static bool task_in_rdtgroup(struct task_struct *tsk, struct rdtgroup *rdt= grp) { - u32 closid, rmid =3D rdtgrp->mon.rmid; + struct rdtgroup *task_group =3D READ_ONCE(tsk->rdt_group); =20 - if (rdtgrp->type =3D=3D RDTCTRL_GROUP) - closid =3D rdtgrp->closid; - else if (rdtgrp->type =3D=3D RDTMON_GROUP) - closid =3D rdtgrp->mon.parent->closid; - else - return false; + lockdep_assert_held(&rdtgroup_mutex); + + /* Uninitialized rdt_group pointer implies rdtgroup_default. */ + if (!task_group) + task_group =3D &rdtgroup_default; + + if (rdtgrp =3D=3D task_group) + return true; + + /* Tasks in child mongroups are members of the parent ctrlmon group. 
*/ + if (task_group->type =3D=3D RDTMON_GROUP && + task_group->mon.parent =3D=3D rdtgrp) + return true; =20 - return resctrl_arch_match_closid(tsk, closid) && - resctrl_arch_match_rmid(tsk, closid, rmid); + return false; } =20 static int __rdtgroup_move_task(struct task_struct *tsk, struct rdtgroup *rdtgrp) { + struct rdtgroup *task_group =3D READ_ONCE(tsk->rdt_group); + /* If the task is already in rdtgrp, no need to move the task. */ if (task_in_rdtgroup(tsk, rdtgrp)) return 0; =20 /* - * Set the task's closid/rmid before the PQR_ASSOC MSR can be - * updated by them. + * NULL is used in the task_struct so it can be overridden by a CPU's + * default_group + */ + if (!task_group) + task_group =3D &rdtgroup_default; + + /* + * Set the task's group before the CPU can be updated by them. * * For ctrl_mon groups, move both closid and rmid. * For monitor groups, can move the tasks only from - * their parent CTRL group. + * their parent CTRL group or another mon group under the same parent. */ - if (rdtgrp->type =3D=3D RDTMON_GROUP && - !resctrl_arch_match_closid(tsk, rdtgrp->mon.parent->closid)) { + if (rdtgrp->type =3D=3D RDTCTRL_GROUP) { + WRITE_ONCE(tsk->rdt_group, rdtgrp); + } else if (rdtgrp->type =3D=3D RDTMON_GROUP && + (task_group =3D=3D rdtgrp->mon.parent || + task_group->mon.parent =3D=3D rdtgrp->mon.parent)) { + WRITE_ONCE(tsk->rdt_group, rdtgrp); + } else { rdt_last_cmd_puts("Can't move task to different control group\n"); return -EINVAL; } =20 - if (rdtgrp->type =3D=3D RDTMON_GROUP) - resctrl_arch_set_closid_rmid(tsk, rdtgrp->mon.parent->closid, - rdtgrp->mon.rmid); - else - resctrl_arch_set_closid_rmid(tsk, rdtgrp->closid, - rdtgrp->mon.rmid); - /* - * Ensure the task's closid and rmid are written before determining if + * Ensure the task's group is written before determining if * the task is current that will decide if it will be interrupted. * This pairs with the full barrier between the rq->curr update and * resctrl_sched_in() during context switch. 
@@ -684,19 +724,6 @@ static int __rdtgroup_move_task(struct task_struct *ts= k, return 0; } =20 -static bool is_closid_match(struct task_struct *t, struct rdtgroup *r) -{ - return (resctrl_arch_alloc_capable() && (r->type =3D=3D RDTCTRL_GROUP) && - resctrl_arch_match_closid(t, r->closid)); -} - -static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r) -{ - return (resctrl_arch_mon_capable() && (r->type =3D=3D RDTMON_GROUP) && - resctrl_arch_match_rmid(t, r->mon.parent->closid, - r->mon.rmid)); -} - /** * rdtgroup_tasks_assigned - Test if tasks have been assigned to resource = group * @r: Resource group @@ -712,7 +739,7 @@ int rdtgroup_tasks_assigned(struct rdtgroup *r) =20 rcu_read_lock(); for_each_process_thread(p, t) { - if (is_closid_match(t, r) || is_rmid_match(t, r)) { + if (task_in_rdtgroup(t, r)) { ret =3D 1; break; } @@ -830,7 +857,7 @@ static void show_rdt_tasks(struct rdtgroup *r, struct s= eq_file *s) =20 rcu_read_lock(); for_each_process_thread(p, t) { - if (is_closid_match(t, r) || is_rmid_match(t, r)) { + if (task_in_rdtgroup(t, r)) { pid =3D task_pid_vnr(t); if (pid) seq_printf(s, "%d\n", pid); @@ -924,53 +951,34 @@ int proc_resctrl_show(struct seq_file *s, struct pid_= namespace *ns, struct pid *pid, struct task_struct *tsk) { struct rdtgroup *rdtg; - int ret =3D 0; - - mutex_lock(&rdtgroup_mutex); + struct rdtgroup *crg; + struct rdtgroup *mrg; =20 /* Return empty if resctrl has not been mounted. */ if (!resctrl_mounted) { seq_puts(s, "res:\nmon:\n"); - goto unlock; + return 0; } =20 - list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) { - struct rdtgroup *crg; + mutex_lock(&rdtgroup_mutex); =20 - /* - * Task information is only relevant for shareable - * and exclusive groups. 
- */ - if (rdtg->mode !=3D RDT_MODE_SHAREABLE && - rdtg->mode !=3D RDT_MODE_EXCLUSIVE) - continue; + rdtg =3D READ_ONCE(tsk->rdt_group); + if (!rdtg) + rdtg =3D &rdtgroup_default; =20 - if (!resctrl_arch_match_closid(tsk, rdtg->closid)) - continue; + mrg =3D rdtg; + crg =3D rdtg; + if (rdtg->type =3D=3D RDTMON_GROUP) + crg =3D rdtg->mon.parent; + + seq_printf(s, "res:%s%s\n", (crg =3D=3D &rdtgroup_default) ? "/" : "", + crg->kn->name); + seq_printf(s, "mon:%s%s\n", (mrg =3D=3D &rdtgroup_default) ? "/" : "", + mrg->kn->name); =20 - seq_printf(s, "res:%s%s\n", (rdtg =3D=3D &rdtgroup_default) ? "/" : "", - rdtg->kn->name); - seq_puts(s, "mon:"); - list_for_each_entry(crg, &rdtg->mon.crdtgrp_list, - mon.crdtgrp_list) { - if (!resctrl_arch_match_rmid(tsk, crg->mon.parent->closid, - crg->mon.rmid)) - continue; - seq_printf(s, "%s", crg->kn->name); - break; - } - seq_putc(s, '\n'); - goto unlock; - } - /* - * The above search should succeed. Otherwise return - * with an error. - */ - ret =3D -ENOENT; -unlock: mutex_unlock(&rdtgroup_mutex); =20 - return ret; + return 0; } #endif =20 @@ -2904,13 +2912,11 @@ static void rdt_move_group_tasks(struct rdtgroup *f= rom, struct rdtgroup *to, =20 read_lock(&tasklist_lock); for_each_process_thread(p, t) { - if (!from || is_closid_match(t, from) || - is_rmid_match(t, from)) { - resctrl_arch_set_closid_rmid(t, to->closid, - to->mon.rmid); + if (!from || task_in_rdtgroup(t, from)) { + WRITE_ONCE(t->rdt_group, to); =20 /* - * Order the closid/rmid stores above before the loads + * Order the group store above before the loads * in task_curr(). This pairs with the full barrier * between the rq->curr update and resctrl_sched_in() * during context switch. 
@@ -2939,6 +2945,7 @@ static void rdt_move_group_tasks(struct rdtgroup *fro= m, struct rdtgroup *to, */ void exit_resctrl(struct task_struct *tsk) { + WRITE_ONCE(tsk->rdt_group, &rdtgroup_default); } =20 static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp) @@ -3681,7 +3688,7 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp= , cpumask_var_t tmpmask) =20 /* Update per cpu rmid of the moved CPUs first */ for_each_cpu(cpu, &rdtgrp->cpu_mask) - per_cpu(pqr_state.default_rmid, cpu) =3D prdtgrp->mon.rmid; + per_cpu(pqr_state.default_group, cpu) =3D prdtgrp; /* * Update the MSR on moved CPUs and CPUs which have moved * task running on them. @@ -3724,10 +3731,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtg= rp, cpumask_var_t tmpmask) &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask); =20 /* Update per cpu closid and rmid of the moved CPUs first */ - for_each_cpu(cpu, &rdtgrp->cpu_mask) { - per_cpu(pqr_state.default_closid, cpu) =3D rdtgroup_default.closid; - per_cpu(pqr_state.default_rmid, cpu) =3D rdtgroup_default.mon.rmid; - } + for_each_cpu(cpu, &rdtgrp->cpu_mask) + per_cpu(pqr_state.default_group, cpu) =3D &rdtgroup_default; =20 /* * Update the MSR on moved CPUs and CPUs which have moved diff --git a/include/linux/sched.h b/include/linux/sched.h index 3c2abbc587b4..d07d7a80006b 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1236,8 +1236,7 @@ struct task_struct { struct list_head cg_list; #endif #ifdef CONFIG_X86_CPU_RESCTRL - u32 closid; - u32 rmid; + struct rdtgroup *rdt_group; #endif #ifdef CONFIG_FUTEX struct robust_list_head __user *robust_list; --=20 2.44.0.396.g6e790dbe36-goog From nobody Sun Feb 8 02:56:17 2026 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A54C7130AD4 for ; Mon, 25 Mar 2024 17:27:45 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387667; cv=none; b=aqR27DWZCf3P4SbUj5DK/ZzHzHoXvolzG3TsoafM4AVr3DoJQGK7U9WODqtT91ZeA0v4X68vMpg98OgfAb32vrEm7j05WzarYPWGnddMua3zGVTyVVBQo0Otph3jeTDsFinqMa2m5zg1V8E2+8B1Lz+ZrDWmdVOxKwfyW7ZnJUc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387667; c=relaxed/simple; bh=k78/xyNeRm+p5JTWQOob+AJOfSmYtbmPvcai3bGPDYA=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=f/Vy32GoTqJb6xPuSeYIS1Sbs7rbqORlpcfgx7+jsD2oPrAs53fzb4FvKV6SnBn+r5+M1WndkF7OUTgoW8sZQQzNtFDG8UU5VOK7GMyuK19qsMVpuWGUvptcq/xBKrtPAEx9fiOtP/Ahfciu/dnkosKAaudKi7eOMqAbVOwtC88= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--peternewman.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=eJo9biCV; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--peternewman.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="eJo9biCV" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-29b8f702cbfso3404997a91.1 for ; Mon, 25 Mar 2024 10:27:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1711387665; x=1711992465; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=iYXDvPU+5s3FumCxEyHgAv7m9MtFIzgZjtXbt/l7B3E=; b=eJo9biCVBxAzskJaWYFfuXUOwzU0Qe9xf0tWZTC7IYoEZMIbn+bOwxzLCm+lBG/c+y R3wiNurBs2SW9Dj5LNs8KRpsPyO/IH6yHeTmSREyIpCod8k1b+jn6o8tlQhc8rngIB+3 
O5pQIiLuRdH0qeBszxuGfF9/AoHnEebCspuYwCKnDAnlpqpOHKKgkVJ4qFJF6kcHodqc Lsh1R6Y8rP+S0fRlg9swmsu3aC6w8qC2WgqTBBvpI3qO+ZDbniDlpR1gYzJLdtEZ9d3C eNFhEaTF/hs+hpgnQAmLmb57Ir0PUHOCE/GCw6eew1Ceyc+Wqbru0F84bKkDYl7UDm9h 2sjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711387665; x=1711992465; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=iYXDvPU+5s3FumCxEyHgAv7m9MtFIzgZjtXbt/l7B3E=; b=YzKENjDGVrF/LolRnFGxNAzNTWXDj+BFV3XdXauZLZjAw19HAeDQmbCQO3GxEtRqL1 G2KbJNrnNDm7D6VUSi0qj86g2VQsJrqPRWrnD1S184nnH+Fat4hh5sqW6sfbdCsLiGUy vRFGdVKGps6BPixr0Pd0FIAA9Ly5Wehkw+cFO+lQ/Evd0rI32ucj+AXMuzlHFg4eatUc N2OIoEvyrYoNe8ijcXDiHgrq9e//GTs2WUezNZyrgMNl0Blh70QOxY9kgH8b+s+/8S8W 6M2E0iG1mAjnBcg4WqinqRh8L8+nTTkjjAjh+cDkF0gj8CMd9vJNnYu9Chco6BRSkFu1 nFBg== X-Forwarded-Encrypted: i=1; AJvYcCUMbmkF65Eha6TV+HRTXYmcVD1iQmI4ugRv4CZJaFsNBnQoBX0A89MK7+skcBS8pvvwbwnAG2LzEJGyIH56QWRhEsAgaURXv5+GGy7L X-Gm-Message-State: AOJu0Yw/GPqDTuBsG/lxZua5SRqnVMrk2yDs18aZKB9V/x+5j5APKs2S 0Hb6D52jAskp8KsAY8pyH8PabeqHXg7nl0m+ULXaHdKeuQf/WDzzZ8N7Y7NFddFyVG5Cy5jmWB9 oBQIJm660Zh+hue7fNPh4cQ== X-Google-Smtp-Source: AGHT+IHnCMj4K+ceCrkCTPaNx6/YZeAY1MukLGPRA5Ia+2uNmIgD1Fz+quh85teKWLKttkyfgEqVrvG4ZT+VnA2d4g== X-Received: from peternewman-us.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:3dcc]) (user=peternewman job=sendgmr) by 2002:a17:90b:78a:b0:2a0:4e94:3c9 with SMTP id l10-20020a17090b078a00b002a04e9403c9mr23042pjz.4.1711387664881; Mon, 25 Mar 2024 10:27:44 -0700 (PDT) Date: Mon, 25 Mar 2024 10:27:06 -0700 In-Reply-To: <20240325172707.73966-1-peternewman@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240325172707.73966-1-peternewman@google.com> X-Mailer: git-send-email 2.44.0.396.g6e790dbe36-goog Message-ID: <20240325172707.73966-6-peternewman@google.com> Subject: [PATCH v1 5/6] 
x86/resctrl: Abstract PQR_ASSOC from generic code From: Peter Newman To: Fenghua Yu , Reinette Chatre , James Morse Cc: Stephane Eranian , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Daniel Bristot de Oliveira , Valentin Schneider , Uros Bizjak , Mike Rapoport , "Kirill A. Shutemov" , Rick Edgecombe , Xin Li , Babu Moger , Shaopeng Tan , Maciej Wieczor-Retman , Jens Axboe , Christian Brauner , Oleg Nesterov , Andrew Morton , Tycho Andersen , Nicholas Piggin , Beau Belgrave , "Matthew Wilcox (Oracle)" , linux-kernel@vger.kernel.org, Peter Newman Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" While CLOSID and RMID originated in RDT, the concept applies to other architectures, as it's standard to write allocation and monitoring IDs into per-CPU registers. - Rename resctrl_pqr_state and pqr_state to be more architecturally-neutral. - Introduce resctrl_arch_update_cpu() to replace the explicit write to MSR_IA32_PQR_ASSOC in __resctrl_sched_in(). In the case of MPAM, PARTID[_I,D] and PMG are a simple function of closid, rmid, and an internal global. - Update terminology containing explicit references to the PQR_ASSOC register. 
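Illustration only, not part of the patch: the split described above can be sketched in userspace as generic code that caches the current IDs and calls the architecture hook only when one of them changes. Everything with a _model suffix is a hypothetical stand-in, and resctrl_arch_update_cpu() below merely records its arguments rather than writing MSR_IA32_PQR_ASSOC.

```c
/*
 * Userspace model only. struct cpu_state_model and sched_in_model() are
 * hypothetical stand-ins for the per-CPU state and __resctrl_sched_in().
 */
#include <assert.h>
#include <stdint.h>

struct cpu_state_model {
	uint32_t cur_closid;	/* last control ID made effective */
	uint32_t cur_rmid;	/* last monitoring ID made effective */
};

/* Bookkeeping so callers can observe what the "arch" side was asked to do. */
static unsigned int arch_updates;
static uint32_t last_ctrl_id, last_mon_id;

/*
 * Stand-in for the arch hook. On x86 the patch implements this as
 * wrmsr(MSR_IA32_PQR_ASSOC, mon_id, ctrl_id); here it only records the IDs.
 */
static void resctrl_arch_update_cpu(uint32_t ctrl_id, uint32_t mon_id)
{
	last_ctrl_id = ctrl_id;
	last_mon_id = mon_id;
	arch_updates++;
}

/* Generic side: skip the arch call when neither cached ID changes. */
static void sched_in_model(struct cpu_state_model *state,
			   uint32_t closid, uint32_t rmid)
{
	if (closid != state->cur_closid || rmid != state->cur_rmid) {
		state->cur_closid = closid;
		state->cur_rmid = rmid;
		resctrl_arch_update_cpu(closid, rmid);
	}
}
```

Under these assumptions, another architecture would only need to supply its own resctrl_arch_update_cpu(); the caching logic in the generic caller stays the same.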
Signed-off-by: Peter Newman --- arch/x86/kernel/cpu/resctrl/core.c | 11 ++++++++--- arch/x86/kernel/cpu/resctrl/internal.h | 6 +++--- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 4 ++-- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 18 +++++++++--------- include/linux/resctrl.h | 11 +++++++++++ 5 files changed, 33 insertions(+), 17 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index ae5878d748fc..4cc584754f8b 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -37,12 +37,12 @@ static DEFINE_MUTEX(domain_list_lock); =20 /* - * The cached resctrl_pqr_state is strictly per CPU and can never be + * The cached resctrl_cpu_state is strictly per CPU and can never be * updated from a remote CPU. Functions which modify the state * are called with interrupts disabled and no preemption, which * is sufficient for the protection. */ -DEFINE_PER_CPU(struct resctrl_pqr_state, pqr_state); +DEFINE_PER_CPU(struct resctrl_cpu_state, resctrl_state); =20 /* * Used to store the max resource name width and max resource data width @@ -309,6 +309,11 @@ static void rdt_get_cdp_l2_config(void) rdt_get_cdp_config(RDT_RESOURCE_L2); } =20 +void resctrl_arch_update_cpu(u32 ctrl_id, u32 mon_id) +{ + wrmsr(MSR_IA32_PQR_ASSOC, mon_id, ctrl_id); +} + static void mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resour= ce *r) { @@ -598,7 +603,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resou= rce *r) =20 static void clear_closid_rmid(int cpu) { - struct resctrl_pqr_state *state =3D this_cpu_ptr(&pqr_state); + struct resctrl_cpu_state *state =3D this_cpu_ptr(&resctrl_state); =20 state->default_group =3D &rdtgroup_default; state->cur_closid =3D RESCTRL_RESERVED_CLOSID; diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/r= esctrl/internal.h index 0ba0d2428780..e30f42744ac7 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ 
b/arch/x86/kernel/cpu/resctrl/internal.h @@ -332,7 +332,7 @@ struct rftype { }; =20 /** - * struct resctrl_pqr_state - State cache for the PQR MSR + * struct resctrl_cpu_state - State cache for allocation/monitoring group = IDs * @cur_rmid: The cached Resource Monitoring ID * @cur_closid: The cached Class Of Service ID * @default_group: The user assigned rdtgroup @@ -340,13 +340,13 @@ struct rftype { * The cache also helps to avoid pointless updates if the value does * not change. */ -struct resctrl_pqr_state { +struct resctrl_cpu_state { u32 cur_rmid; u32 cur_closid; struct rdtgroup *default_group; }; =20 -DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state); +DECLARE_PER_CPU(struct resctrl_cpu_state, resctrl_state); =20 /** * struct mbm_state - status for each MBM counter in each domain diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cp= u/resctrl/pseudo_lock.c index 884b88e25141..ca1805a566cb 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -480,8 +480,8 @@ static int pseudo_lock_fn(void *_rdtgrp) */ saved_msr =3D __rdmsr(MSR_MISC_FEATURE_CONTROL); __wrmsr(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits, 0x0); - closid_p =3D this_cpu_read(pqr_state.cur_closid); - rmid_p =3D this_cpu_read(pqr_state.cur_rmid); + closid_p =3D this_cpu_read(resctrl_state.cur_closid); + rmid_p =3D this_cpu_read(resctrl_state.cur_rmid); mem_r =3D plr->kmem; size =3D plr->size; line_size =3D plr->line_size; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index badf181c8cbb..bd067f7ed5b6 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -112,7 +112,7 @@ void rdt_staged_configs_clear(void) * + We can simply set current's closid to assign a task to a resource * group. 
* + Context switch code can avoid extra memory references deciding which - * CLOSID to load into the PQR_ASSOC MSR + * CLOSID to load into the CPU * - We give up some options in configuring resource groups across multi-s= ocket * systems. * - Our choices on how to configure each resource become progressively mo= re @@ -347,7 +347,7 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *= of, */ void __resctrl_sched_in(struct task_struct *tsk) { - struct resctrl_pqr_state *state =3D this_cpu_ptr(&pqr_state); + struct resctrl_cpu_state *state =3D this_cpu_ptr(&resctrl_state); u32 closid =3D state->cur_closid; u32 rmid =3D state->cur_rmid; struct rdtgroup *rgrp; @@ -401,7 +401,7 @@ void __resctrl_sched_in(struct task_struct *tsk) if (closid !=3D state->cur_closid || rmid !=3D state->cur_rmid) { state->cur_closid =3D closid; state->cur_rmid =3D rmid; - wrmsr(MSR_IA32_PQR_ASSOC, rmid, closid); + resctrl_arch_update_cpu(closid, rmid); } } =20 @@ -416,7 +416,7 @@ static void update_cpu_closid_rmid(void *info) struct rdtgroup *r =3D info; =20 if (r) - this_cpu_write(pqr_state.default_group, r); + this_cpu_write(resctrl_state.default_group, r); =20 /* * We cannot unconditionally write the MSR because the current @@ -635,8 +635,8 @@ static void rdtgroup_remove(struct rdtgroup *rdtgrp) static void _update_task_closid_rmid(void *task) { /* - * If the task is still current on this CPU, update PQR_ASSOC MSR. - * Otherwise, the MSR is updated when the task is scheduled in. + * If the task is still current on this CPU, update the current ctrl + * group. Otherwise, the CPU is updated when the task is scheduled in. */ if (task =3D=3D current) resctrl_sched_in(task); @@ -3005,7 +3005,7 @@ static void rmdir_all_sub(void) else rdtgroup_remove(rdtgrp); } - /* Notify online CPUs to update per cpu storage and PQR_ASSOC MSR */ + /* Update online CPUs to propagate group membership changes. 
*/ update_closid_rmid(cpu_online_mask, &rdtgroup_default); =20 kernfs_remove(kn_info); @@ -3688,7 +3688,7 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp= , cpumask_var_t tmpmask) =20 /* Update per cpu rmid of the moved CPUs first */ for_each_cpu(cpu, &rdtgrp->cpu_mask) - per_cpu(pqr_state.default_group, cpu) =3D prdtgrp; + per_cpu(resctrl_state.default_group, cpu) =3D prdtgrp; /* * Update the MSR on moved CPUs and CPUs which have moved * task running on them. @@ -3732,7 +3732,7 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgr= p, cpumask_var_t tmpmask) =20 /* Update per cpu closid and rmid of the moved CPUs first */ for_each_cpu(cpu, &rdtgrp->cpu_mask) - per_cpu(pqr_state.default_group, cpu) =3D &rdtgroup_default; + per_cpu(resctrl_state.default_group, cpu) =3D &rdtgroup_default; =20 /* * Update the MSR on moved CPUs and CPUs which have moved diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index b2af1fbc7aa1..a6b1b13cc769 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -306,6 +306,17 @@ extern unsigned int resctrl_rmid_realloc_limit; =20 DECLARE_STATIC_KEY_FALSE(rdt_enable_key); =20 +/* + * resctrl_arch_update_cpu() - Make control and monitoring group IDs effec= tive + * on the current CPU + * + * @ctrl_id: An identifier for the control group which is to be used on the + * current CPU. + * @mon_id: An identifier for the monitoring group which is to be used on + * the current CPU. 
+ */ +void resctrl_arch_update_cpu(u32 ctrl_id, u32 mon_id); + void __resctrl_sched_in(struct task_struct *tsk); =20 /* --=20 2.44.0.396.g6e790dbe36-goog From nobody Sun Feb 8 02:56:17 2026 Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB24F130E5D for ; Mon, 25 Mar 2024 17:27:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387669; cv=none; b=B/cHU+hL/42F96ySS2IDIxeVCGaWU9bIkxxMJoVHfcbyl5vQKIavAnsG0HoRp/W7nTOZVnwkWqd1fEf01Q0hMPhDOmzqQNObTFMrXyFffjPh3x0FjpCW8weWikMuvI+xDBZU/5UB3Djuww7hB23FGXgb0YJi/wCYqc0d8Ujc0rc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387669; c=relaxed/simple; bh=yTMbYEK5awdComl/ezDEyjlZWsoayimTaKfd0CBe8zE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=VCJK4/5851Ahy4/lY3V70NGY4jS66WpwElB62MiOo41MMYp6a1kKd0ik8UDAlktn5SfNlu3TEPh4OQSwJMnzWk//xGCxoRf7nuprs8YloFxdVQcdgGZTeS56EELZYJ7gfqlfXSlaQpAjwkwIPaNtfyq0XikU6ejJj4kP6fw+a2U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--peternewman.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=ieauYavy; arc=none smtp.client-ip=209.85.219.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--peternewman.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="ieauYavy" Received: by mail-yb1-f201.google.com with SMTP id 
3f1490d57ef6-dcbfe1a42a4so8683403276.2 for ; Mon, 25 Mar 2024 10:27:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1711387666; x=1711992466; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Sa3z2217X6fUx6p/cBiZWKstN9/uVhy7ssYwZcnFGPU=; b=ieauYavyZvxryp7MsmLBEGUgJQ4Eiyzz24lxTemWYG27K7NPLTGBU4vfuxh6/vOoqw duFzqs47tuEPn0E/ENrX1XDsbkA6uWGeONWdIu5ROtNF85+Box37JCvXIobNOATtIXWr LHm3P1DMqaRPah+AxisbFCtPNprRj6LZ0IrlrfaEOp6YdnAiZ4ExxlgIE3So8XJZWQJy 2CiK0TIDeD+CwbTQTPZECRTfCT6r7R0Ah2u6vjob8BkElunsT2yhiBYSCfa0qhNOYMx7 +Q7hqdk3T8xEdp2C29s6s52AR2Y28KLaPTzmt3QV94aZeALo/nr1KTlWJ6efqjtp1uPq pI1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711387666; x=1711992466; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Sa3z2217X6fUx6p/cBiZWKstN9/uVhy7ssYwZcnFGPU=; b=p+lhhojV/ZNt3RBq/cQ4jVENy1wiKA6l9os7SJ15q0Q1Em7MAEeLWVmThuwwp3zcY4 kzo9BrOjboGa/GB8qyWDnK57oxy7yy6qo8HJ7SDdgSyBKk1GV506PsgDQ6/21GH4ev4a a4niUu7V5kGQ+4Ne1eR3WkbYD3flpcs5a+xMfn7WbGLt4tTlaCDyorRlaDECXHUzsh+x mShrQzZ6juOyxn8DfphlEKc1r9qJSDZN8ElbZczm/Lkk2AnB0UmPauN9fRQ27Et02hcc d5vEzlmEG8HD13zKTPzdUspiQisYZxk1PUYNOdVBBaPPf8ULe0X4rl/g4uMCLmLvHMoP oBVw== X-Forwarded-Encrypted: i=1; AJvYcCVGJvo9EjcpPHNRzrX6KpUksz5TDuqyeSmMtryTcuRqSE0T0BGsAW6PF3j82PlNRdG09LbeWpkqL832B1CHVA4FJS7kewX0p42hLg27 X-Gm-Message-State: AOJu0YwaDrD9RAu0S3US8T30A8ILTssXlt18+muHIcLjy09kO1Xloh1B xralWucySkCD16Wv5XoPq1Ph4eFUZfEYx6Podhc09JIFg4h1W9b5NHTV3svH2ngNx73Muxa1wEm YVi2BYJ+ViFMKuJa4ay/s6g== X-Google-Smtp-Source: AGHT+IFgPlsoAE7hzmPfMTQGoYJe+yF0pzDalzKIKtFe75F+KyuVHePFsmRXYPtOcPxVj2mEZeHNwJ9LsbctE9XPDw== X-Received: from peternewman-us.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:3dcc]) (user=peternewman job=sendgmr) by 2002:a05:6902:2388:b0:dcc:5a91:aee9 with SMTP id 
dp8-20020a056902238800b00dcc5a91aee9mr2359571ybb.7.1711387666713; Mon, 25 Mar 2024 10:27:46 -0700 (PDT) Date: Mon, 25 Mar 2024 10:27:07 -0700 In-Reply-To: <20240325172707.73966-1-peternewman@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240325172707.73966-1-peternewman@google.com> X-Mailer: git-send-email 2.44.0.396.g6e790dbe36-goog Message-ID: <20240325172707.73966-7-peternewman@google.com> Subject: [PATCH v1 6/6] x86/resctrl: Don't search tasklist in mongroup rename From: Peter Newman To: Fenghua Yu , Reinette Chatre , James Morse Cc: Stephane Eranian , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Daniel Bristot de Oliveira , Valentin Schneider , Uros Bizjak , Mike Rapoport , "Kirill A. Shutemov" , Rick Edgecombe , Xin Li , Babu Moger , Shaopeng Tan , Maciej Wieczor-Retman , Jens Axboe , Christian Brauner , Oleg Nesterov , Andrew Morton , Tycho Andersen , Nicholas Piggin , Beau Belgrave , "Matthew Wilcox (Oracle)" , linux-kernel@vger.kernel.org, Peter Newman Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Iterating over all task_structs while read-locking the tasklist_lock results in significant task creation/destruction latency. Back-to-back move operations can thus be disastrous to the responsiveness of threadpool-based services. Now that the CLOSID is determined indirectly through a reference to the task's current rdtgroup, it is no longer necessary to update the CLOSID in all tasks belonging to the moved mongroup. The context switch handler just needs to be prepared for concurrent writes to the parent pointer. 
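Illustration only, not part of the patch: a minimal userspace sketch of why a rename no longer needs to visit any task. Each task holds one pointer to its group, a monitoring group holds a pointer to its parent, and the CLOSID is resolved through those pointers at switch-in time. All *_model names are hypothetical. C11 acquire/release atomics are used as a conservative stand-in for READ_ONCE()/WRITE_ONCE(), and the sketch does not model group lifetime or the IPI that refreshes running CPUs.

```c
/*
 * Userspace model only: the *_model types are hypothetical and much
 * smaller than struct rdtgroup / struct task_struct.
 */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

enum group_type_model { CTRL_GROUP_MODEL, MON_GROUP_MODEL };

struct group_model {
	enum group_type_model type;
	uint32_t closid;			/* meaningful for control groups */
	uint32_t rmid;
	struct group_model *_Atomic parent;	/* set for monitoring groups only */
};

struct task_model {
	struct group_model *_Atomic group;	/* NULL means "use the default group" */
};

/* Context-switch side: resolve the CLOSID through the group pointer. */
static uint32_t task_closid_model(struct task_model *tsk,
				  struct group_model *def)
{
	struct group_model *g =
		atomic_load_explicit(&tsk->group, memory_order_acquire);

	if (!g)
		g = def;
	/* A monitoring group borrows the CLOSID of its current parent. */
	if (g->type == MON_GROUP_MODEL)
		g = atomic_load_explicit(&g->parent, memory_order_acquire);
	return g->closid;
}

/* Rename side: a single store to the parent pointer, no walk over tasks. */
static void mongrp_reparent_model(struct group_model *mon,
				  struct group_model *new_parent)
{
	atomic_store_explicit(&mon->parent, new_parent, memory_order_release);
}
```

In this model, every task pointing at the monitoring group observes the new parent's CLOSID on its next lookup after mongrp_reparent_model() returns, while tasks in other groups are unaffected.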
Signed-off-by: Peter Newman
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 30 +++++++-------------------
 1 file changed, 8 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index bd067f7ed5b6..a007c0ec478f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -388,8 +388,11 @@ void __resctrl_sched_in(struct task_struct *tsk)
 			 * by a full barrier and synchronous IPI
 			 * broadcast before proceeding to free the
 			 * group.
+			 *
+			 * parent can be concurrently updated to a new
+			 * group as a result of mongrp_reparent().
 			 */
-			closid = rgrp->mon.parent->closid;
+			closid = READ_ONCE(rgrp->mon.parent)->closid;
 		} else {
 			closid = rgrp->closid;
 		}
@@ -3809,8 +3812,7 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)
  * Monitoring data for the group is unaffected by this operation.
  */
 static void mongrp_reparent(struct rdtgroup *rdtgrp,
-			    struct rdtgroup *new_prdtgrp,
-			    cpumask_var_t cpus)
+			    struct rdtgroup *new_prdtgrp)
 {
 	struct rdtgroup *prdtgrp = rdtgrp->mon.parent;
 
@@ -3825,13 +3827,10 @@ static void mongrp_reparent(struct rdtgroup *rdtgrp,
 	list_move_tail(&rdtgrp->mon.crdtgrp_list,
 		       &new_prdtgrp->mon.crdtgrp_list);
 
-	rdtgrp->mon.parent = new_prdtgrp;
+	WRITE_ONCE(rdtgrp->mon.parent, new_prdtgrp);
 	rdtgrp->closid = new_prdtgrp->closid;
 
-	/* Propagate updated closid to all tasks in this group. */
-	rdt_move_group_tasks(rdtgrp, rdtgrp, cpus);
-
-	update_closid_rmid(cpus, NULL);
+	update_closid_rmid(cpu_online_mask, NULL);
 }
 
 static int rdtgroup_rename(struct kernfs_node *kn,
@@ -3839,7 +3838,6 @@ static int rdtgroup_rename(struct kernfs_node *kn,
 {
 	struct rdtgroup *new_prdtgrp;
 	struct rdtgroup *rdtgrp;
-	cpumask_var_t tmpmask;
 	int ret;
 
 	rdtgrp = kernfs_to_rdtgroup(kn);
@@ -3909,16 +3907,6 @@ static int rdtgroup_rename(struct kernfs_node *kn,
 		goto out;
 	}
 
-	/*
-	 * Allocate the cpumask for use in mongrp_reparent() to avoid the
-	 * possibility of failing to allocate it after kernfs_rename() has
-	 * succeeded.
-	 */
-	if (!zalloc_cpumask_var(&tmpmask, GFP_KERNEL)) {
-		ret = -ENOMEM;
-		goto out;
-	}
-
 	/*
 	 * Perform all input validation and allocations needed to ensure
 	 * mongrp_reparent() will succeed before calling kernfs_rename(),
@@ -3927,9 +3915,7 @@ static int rdtgroup_rename(struct kernfs_node *kn,
 	 */
 	ret = kernfs_rename(kn, new_parent, new_name);
 	if (!ret)
-		mongrp_reparent(rdtgrp, new_prdtgrp, tmpmask);
-
-	free_cpumask_var(tmpmask);
+		mongrp_reparent(rdtgrp, new_prdtgrp);
 
  out:
 	mutex_unlock(&rdtgroup_mutex);
-- 
2.44.0.396.g6e790dbe36-goog
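
For readers outside the kernel tree, the parent-pointer handoff described in the commit message can be approximated in userspace. This is only a sketch under stated assumptions: demo_group, demo_effective_closid() and demo_reparent() are hypothetical stand-ins, not the resctrl definitions, and C11 relaxed atomics are swapped in for the kernel's READ_ONCE()/WRITE_ONCE(), which they only roughly resemble.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rdtgroup; not the kernel definition. */
struct demo_group {
	unsigned int closid;
	/* NULL for a control group, the owning group for a monitor group. */
	_Atomic(struct demo_group *) parent;
};

/* Reader side, loosely analogous to the closid lookup at context switch. */
static unsigned int demo_effective_closid(struct demo_group *grp)
{
	/*
	 * Load the pointer exactly once and dereference only the local
	 * copy, so a concurrent reparent cannot change it in between.
	 */
	struct demo_group *parent =
		atomic_load_explicit(&grp->parent, memory_order_relaxed);

	return parent ? parent->closid : grp->closid;
}

/* Writer side, loosely analogous to the pointer update on rename. */
static void demo_reparent(struct demo_group *grp,
			  struct demo_group *new_parent)
{
	/* Publish the new parent with a single, untorn store. */
	atomic_store_explicit(&grp->parent, new_parent, memory_order_relaxed);
	grp->closid = new_parent->closid;
}
```

The sketch is meant to show one thing: because the reader works from a single loaded copy of the pointer, a racing reparent yields either the old or the new parent's closid, never a mix of the two. It does not model the full barrier and IPI broadcast that the kernel comment relies on before a group is freed.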