From nobody Sat Nov 23 23:17:46 2024
From: Yafang Shao
To: mingo@redhat.com, peterz@infradead.org
Cc: juri.lelli@redhat.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
	mgorman@suse.de, vschneid@redhat.com, hannes@cmpxchg.org,
	surenb@google.com, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Yafang Shao
Subject: [PATCH v5 1/4] sched: Define sched_clock_irqtime as static key
Date: Fri, 8 Nov 2024 21:29:01 +0800
Message-Id: <20241108132904.6932-2-laoar.shao@gmail.com>
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
In-Reply-To: <20241108132904.6932-1-laoar.shao@gmail.com>
References: <20241108132904.6932-1-laoar.shao@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Since CPU time accounting is a performance-critical path, let's define
sched_clock_irqtime as a static key to minimize potential overhead.

Signed-off-by: Yafang Shao
---
 kernel/sched/cputime.c | 16 +++++++---------
 kernel/sched/sched.h   | 13 +++++++++++++
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 0bed0fa1acd9..5d9143dd0879 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -9,6 +9,8 @@
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 
+DEFINE_STATIC_KEY_FALSE(sched_clock_irqtime);
+
 /*
  * There are no locks covering percpu hardirq/softirq time.
  * They are only modified in vtime_account, on corresponding CPU
@@ -22,16 +24,14 @@
  */
 DEFINE_PER_CPU(struct irqtime, cpu_irqtime);
 
-static int sched_clock_irqtime;
-
 void enable_sched_clock_irqtime(void)
 {
-	sched_clock_irqtime = 1;
+	static_branch_enable(&sched_clock_irqtime);
 }
 
 void disable_sched_clock_irqtime(void)
 {
-	sched_clock_irqtime = 0;
+	static_branch_disable(&sched_clock_irqtime);
 }
 
 static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
@@ -57,7 +57,7 @@ void irqtime_account_irq(struct task_struct *curr, unsigned int offset)
 	s64 delta;
 	int cpu;
 
-	if (!sched_clock_irqtime)
+	if (!irqtime_enabled())
 		return;
 
 	cpu = smp_processor_id();
@@ -90,8 +90,6 @@ static u64 irqtime_tick_accounted(u64 maxtime)
 
 #else /* CONFIG_IRQ_TIME_ACCOUNTING */
 
-#define sched_clock_irqtime	(0)
-
 static u64 irqtime_tick_accounted(u64 dummy)
 {
 	return 0;
@@ -478,7 +476,7 @@ void account_process_tick(struct task_struct *p, int user_tick)
 	if (vtime_accounting_enabled_this_cpu())
 		return;
 
-	if (sched_clock_irqtime) {
+	if (irqtime_enabled()) {
 		irqtime_account_process_tick(p, user_tick, 1);
 		return;
 	}
@@ -507,7 +505,7 @@ void account_idle_ticks(unsigned long ticks)
 {
 	u64 cputime, steal;
 
-	if (sched_clock_irqtime) {
+	if (irqtime_enabled()) {
 		irqtime_account_idle_ticks(ticks);
 		return;
 	}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 081519ffab46..0c83ab35256e 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3179,6 +3179,12 @@ struct irqtime {
 };
 
 DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
+DECLARE_STATIC_KEY_FALSE(sched_clock_irqtime);
+
+static inline int irqtime_enabled(void)
+{
+	return static_branch_likely(&sched_clock_irqtime);
+}
 
 /*
  * Returns the irqtime minus the softirq time computed by ksoftirqd.
@@ -3199,6 +3205,13 @@ static inline u64 irq_time_read(int cpu)
 	return total;
 }
 
+#else
+
+static inline int irqtime_enabled(void)
+{
+	return 0;
+}
+
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
 
 #ifdef CONFIG_CPU_FREQ
-- 
2.43.5

From nobody Sat Nov 23 23:17:46 2024
From: Yafang Shao
To: mingo@redhat.com, peterz@infradead.org
Cc: juri.lelli@redhat.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
	mgorman@suse.de, vschneid@redhat.com, hannes@cmpxchg.org,
	surenb@google.com, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Yafang Shao
Subject: [PATCH v5 2/4] sched: Don't account irq time if sched_clock_irqtime is disabled
Date: Fri, 8 Nov 2024 21:29:02 +0800
Message-Id: <20241108132904.6932-3-laoar.shao@gmail.com>
In-Reply-To: <20241108132904.6932-1-laoar.shao@gmail.com>
References: <20241108132904.6932-1-laoar.shao@gmail.com>

sched_clock_irqtime may be disabled due to the clock source, in which case
IRQ time should not be accounted. Let's add a conditional check to avoid
unnecessary logic.

Signed-off-by: Yafang Shao
---
 kernel/sched/core.c | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dbfb5717d6af..a75dad9be4b9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -740,29 +740,31 @@ static void update_rq_clock_task(struct rq *rq, s64 delta)
 	s64 __maybe_unused steal = 0, irq_delta = 0;
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
-	irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
+	if (irqtime_enabled()) {
+		irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
 
-	/*
-	 * Since irq_time is only updated on {soft,}irq_exit, we might run into
-	 * this case when a previous update_rq_clock() happened inside a
-	 * {soft,}IRQ region.
-	 *
-	 * When this happens, we stop ->clock_task and only update the
-	 * prev_irq_time stamp to account for the part that fit, so that a next
-	 * update will consume the rest. This ensures ->clock_task is
-	 * monotonic.
-	 *
-	 * It does however cause some slight miss-attribution of {soft,}IRQ
-	 * time, a more accurate solution would be to update the irq_time using
-	 * the current rq->clock timestamp, except that would require using
-	 * atomic ops.
-	 */
-	if (irq_delta > delta)
-		irq_delta = delta;
+		/*
+		 * Since irq_time is only updated on {soft,}irq_exit, we might run into
+		 * this case when a previous update_rq_clock() happened inside a
+		 * {soft,}IRQ region.
+		 *
+		 * When this happens, we stop ->clock_task and only update the
+		 * prev_irq_time stamp to account for the part that fit, so that a next
+		 * update will consume the rest. This ensures ->clock_task is
+		 * monotonic.
+		 *
+		 * It does however cause some slight miss-attribution of {soft,}IRQ
+		 * time, a more accurate solution would be to update the irq_time using
+		 * the current rq->clock timestamp, except that would require using
+		 * atomic ops.
+		 */
+		if (irq_delta > delta)
+			irq_delta = delta;
 
-	rq->prev_irq_time += irq_delta;
-	delta -= irq_delta;
-	delayacct_irq(rq->curr, irq_delta);
+		rq->prev_irq_time += irq_delta;
+		delta -= irq_delta;
+		delayacct_irq(rq->curr, irq_delta);
+	}
 #endif
 #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
 	if (static_key_false((&paravirt_steal_rq_enabled))) {
-- 
2.43.5

From nobody Sat Nov 23 23:17:46 2024
From: Yafang Shao
To: mingo@redhat.com, peterz@infradead.org
Cc: juri.lelli@redhat.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
	mgorman@suse.de, vschneid@redhat.com, hannes@cmpxchg.org,
	surenb@google.com, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Yafang Shao
Subject: [PATCH v5 3/4] sched, psi: Don't account irq time if sched_clock_irqtime is disabled
Date: Fri, 8 Nov 2024 21:29:03 +0800
Message-Id: <20241108132904.6932-4-laoar.shao@gmail.com>
In-Reply-To: <20241108132904.6932-1-laoar.shao@gmail.com>
References: <20241108132904.6932-1-laoar.shao@gmail.com>

sched_clock_irqtime may be disabled due to the clock source. When disabled,
irq_time_read() won't change over time, so there is nothing to account. We
can save iterating the whole hierarchy on every tick and context switch.

Signed-off-by: Yafang Shao
Acked-by: Johannes Weiner
---
 kernel/sched/psi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 84dad1511d1e..210aec717dc7 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -998,7 +998,7 @@ void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct task_st
 	s64 delta;
 	u64 irq;
 
-	if (static_branch_likely(&psi_disabled))
+	if (static_branch_likely(&psi_disabled) || !irqtime_enabled())
 		return;
 
 	if (!curr->pid)
-- 
2.43.5

From nobody Sat Nov 23 23:17:46 2024
From: Yafang Shao
To: mingo@redhat.com, peterz@infradead.org
Cc: juri.lelli@redhat.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
	mgorman@suse.de, vschneid@redhat.com, hannes@cmpxchg.org,
	surenb@google.com, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Yafang Shao
Subject: [PATCH v5 4/4] sched: Fix cgroup irq time for CONFIG_IRQ_TIME_ACCOUNTING
Date: Fri, 8 Nov 2024 21:29:04 +0800
Message-Id: <20241108132904.6932-5-laoar.shao@gmail.com>
In-Reply-To: <20241108132904.6932-1-laoar.shao@gmail.com>
References: <20241108132904.6932-1-laoar.shao@gmail.com>

After enabling CONFIG_IRQ_TIME_ACCOUNTING to monitor IRQ pressure in our
container environment, we observed several noticeable behavioral changes.

One of our IRQ-heavy services, such as Redis, reported a significant
reduction in CPU usage after upgrading to the new kernel with
CONFIG_IRQ_TIME_ACCOUNTING enabled. However, despite adding more threads
to handle an increased workload, the CPU usage could not be raised.
In other words, even though the container’s CPU usage appeared low, it was
unable to process more workloads to utilize additional CPU resources, which
caused issues.

This behavior can be demonstrated using netperf:

  function start_server() {
      for j in `seq 1 3`; do
          netserver -p $[12345+j] > /dev/null &
      done
  }

  server_ip=$1
  function start_client() {
    # That applies to cgroup2 as well.
    mkdir -p /sys/fs/cgroup/cpuacct/test
    echo $$ > /sys/fs/cgroup/cpuacct/test/cgroup.procs
    for j in `seq 1 3`; do
        port=$[12345+j]
        taskset -c 0 netperf -H ${server_ip} -l ${run_time:-30000}   \
                -t TCP_STREAM -p $port -- -D -m 1k -M 1K -s 8k -S 8k \
                > /dev/null &
    done
  }

  start_server
  start_client

We can verify the CPU usage of the test cgroup using cpuacct.stat. The
output shows:

  system: 53
  user: 2

The CPU usage of the cgroup is relatively low at around 55%, but this usage
doesn't increase, even with more netperf tasks. The reason is that CPU0 is
at 100% utilization, as confirmed by mpstat:

  02:56:22 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  02:56:23 PM    0    0.99    0.00   55.45    0.00    0.99   42.57    0.00    0.00    0.00    0.00

  02:56:23 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  02:56:24 PM    0    2.00    0.00   55.00    0.00    0.00   43.00    0.00    0.00    0.00    0.00

It is clear that the %soft is excluded in the cgroup of the interrupted
task. This behavior is unexpected. We should include IRQ time in the
cgroup to reflect the pressure the group is under.

After a thorough analysis, I discovered that this change in behavior is due
to commit 305e6835e055 ("sched: Do not account irq time to current task"),
which altered whether IRQ time should be charged to the interrupted task.
While I agree that a task should not be penalized by random interrupts, the
task itself cannot progress while interrupted. Therefore, the interrupted
time should be reported to the user.

The system metric in cpuacct.stat is crucial in indicating whether a
container is under heavy system pressure, including IRQ/softirq activity.
Hence, IRQ/softirq time should be included in the cpuacct system usage,
which also applies to cgroup2’s rstat.

The reason it doesn't just add the cgroup_account_*() to
irqtime_account_irq() is that it might result in performance hit to hold
the rq_lock in the critical path. Taking inspiration from
commit ddae0ca2a8fe ("sched: Move psi_account_irqtime() out of
update_rq_clock_task() hotpath"), I've now adapted the approach to handle
it in a non-critical path, reducing the performance impact.

Signed-off-by: Yafang Shao
Cc: Johannes Weiner
---
 kernel/sched/core.c  | 33 +++++++++++++++++++++++++++++++--
 kernel/sched/psi.c   | 13 +++----------
 kernel/sched/sched.h |  2 +-
 kernel/sched/stats.h |  7 ++++---
 4 files changed, 39 insertions(+), 16 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a75dad9be4b9..61545db8ca4b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5579,6 +5579,35 @@ __setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
 static inline u64 cpu_resched_latency(struct rq *rq) { return 0; }
 #endif /* CONFIG_SCHED_DEBUG */
 
+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+static void account_irqtime(struct rq *rq, struct task_struct *curr,
+			    struct task_struct *prev)
+{
+	int cpu = smp_processor_id();
+	s64 delta;
+	u64 irq;
+
+	if (!irqtime_enabled())
+		return;
+
+	irq = irq_time_read(cpu);
+	delta = (s64)(irq - rq->irq_time);
+	if (delta < 0)
+		return;
+
+	rq->irq_time = irq;
+	psi_account_irqtime(rq, curr, prev, delta);
+	cgroup_account_cputime(curr, delta);
+	/* We account both softirq and irq into CPUTIME_IRQ */
+	cgroup_account_cputime_field(curr, CPUTIME_IRQ, delta);
+}
+#else
+static inline void account_irqtime(struct rq *rq, struct task_struct *curr,
+				   struct task_struct *prev)
+{
+}
+#endif
+
 /*
  * This function gets called by the timer code, with HZ frequency.
  * We call it with interrupts disabled.
@@ -5600,7 +5629,7 @@ void sched_tick(void)
 	rq_lock(rq, &rf);
 
 	curr = rq->curr;
-	psi_account_irqtime(rq, curr, NULL);
+	account_irqtime(rq, curr, NULL);
 
 	update_rq_clock(rq);
 	hw_pressure = arch_scale_hw_pressure(cpu_of(rq));
@@ -6683,7 +6712,7 @@ static void __sched notrace __schedule(int sched_mode)
 		++*switch_count;
 
 		migrate_disable_switch(rq, prev);
-		psi_account_irqtime(rq, prev, next);
+		account_irqtime(rq, prev, next);
 		psi_sched_switch(prev, next, block);
 
 		trace_sched_switch(preempt, prev, next, prev_state);
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 210aec717dc7..1adb41b2ae1d 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -990,15 +990,14 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 }
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
-void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct task_struct *prev)
+void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct task_struct *prev,
+			 s64 delta)
 {
 	int cpu = task_cpu(curr);
 	struct psi_group *group;
 	struct psi_group_cpu *groupc;
-	s64 delta;
-	u64 irq;
 
-	if (static_branch_likely(&psi_disabled) || !irqtime_enabled())
+	if (static_branch_likely(&psi_disabled))
 		return;
 
 	if (!curr->pid)
@@ -1009,12 +1008,6 @@ void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct task_st
 	if (prev && task_psi_group(prev) == group)
 		return;
 
-	irq = irq_time_read(cpu);
-	delta = (s64)(irq - rq->psi_irq_time);
-	if (delta < 0)
-		return;
-	rq->psi_irq_time = irq;
-
 	do {
 		u64 now;
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 0c83ab35256e..690fc6f9d97c 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1223,7 +1223,7 @@ struct rq {
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 	u64			prev_irq_time;
-	u64			psi_irq_time;
+	u64			irq_time;
 #endif
 #ifdef CONFIG_PARAVIRT
 	u64			prev_steal_time;
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 767e098a3bd1..17eefe5876a5 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -111,10 +111,11 @@ void psi_task_change(struct task_struct *task, int clear, int set);
 void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 		     bool sleep);
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
-void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct task_struct *prev);
+void psi_account_irqtime(struct rq *rq, struct task_struct *curr,
+			 struct task_struct *prev, s64 delta);
 #else
 static inline void psi_account_irqtime(struct rq *rq, struct task_struct *curr,
-				       struct task_struct *prev) {}
+				       struct task_struct *prev, s64 delta) {}
 #endif /*CONFIG_IRQ_TIME_ACCOUNTING */
 /*
  * PSI tracks state that persists across sleeps, such as iowaits and
@@ -215,7 +216,7 @@ static inline void psi_sched_switch(struct task_struct *prev,
 				    struct task_struct *next,
 				    bool sleep) {}
 static inline void psi_account_irqtime(struct rq *rq, struct task_struct *curr,
-				       struct task_struct *prev) {}
+				       struct task_struct *prev, s64 delta) {}
 #endif /* CONFIG_PSI */
 
 #ifdef CONFIG_SCHED_INFO
-- 
2.43.5
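
Note on the delta accounting in patch 4/4: the new account_irqtime() helper
reads a cumulative IRQ-time counter, compares it with a per-runqueue
snapshot, and hands the same delta to both PSI and the cgroup CPU-time
accounting. The following is only a minimal userspace sketch of that
pattern, not kernel code. The names struct rq_sketch and
account_irqtime_sketch(), the two "charged" counters, and the enabled
parameter (standing in for irqtime_enabled()) are hypothetical and exist
purely for illustration; irq_now stands in for the value that
irq_time_read() would return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-runqueue state; not the kernel's struct rq. */
struct rq_sketch {
	uint64_t irq_time;       /* cumulative IRQ time already charged */
	uint64_t psi_charged;    /* total handed to the PSI-like consumer */
	uint64_t cgroup_charged; /* total handed to the cgroup-like consumer */
};

/*
 * Charge the IRQ time accumulated since the previous call.
 *
 * irq_now: cumulative IRQ time for this CPU (assumed input).
 * enabled: non-zero when IRQ time accounting is active (assumed input).
 *
 * Returns the delta that was charged, or 0 when nothing was charged.
 */
int64_t account_irqtime_sketch(struct rq_sketch *rq, uint64_t irq_now, int enabled)
{
	int64_t delta;

	assert(rq != NULL);

	/* Accounting switched off: leave the snapshot untouched. */
	if (!enabled)
		return 0;

	/* Unsigned subtraction, then a signed view to detect a counter that went backwards. */
	delta = (int64_t)(irq_now - rq->irq_time);
	if (delta < 0)
		return 0;

	/* Advance the snapshot once, then give both consumers the same delta. */
	rq->irq_time = irq_now;
	rq->psi_charged += (uint64_t)delta;
	rq->cgroup_charged += (uint64_t)delta;

	return delta;
}
```

In this sketch, calling the helper with irq_now values of 100 and then 250
charges 100 and then 150. A smaller value than the snapshot, or a call with
enabled set to 0, charges nothing and leaves the snapshot as it was. Taking
one snapshot per runqueue is what lets a single delta serve both consumers.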