From nobody Wed Nov 27 18:27:59 2024 Received: from mail-pl1-f174.google.com (mail-pl1-f174.google.com [209.85.214.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A4D70190463; Tue, 8 Oct 2024 06:22:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728368538; cv=none; b=GNP5ErnauP66Ds6eubxW08AK3eBONx3ruEUUGUZ5kUAUXCz0mgDImVjg4Q2c6yyPdRcZKKwuk9CBdD8e0NFgjyBLTFDvD1K6xpOvXkWeO4iCo0hcUSei9O59QrX64Pr6LRDSxgJr0nv17zc5eTB9kyvj60YAKrrUCApWOZsZsdg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728368538; c=relaxed/simple; bh=mD4RJEhAin69c0Coj5xGRt+ncUmUyzzgnKmub6srj84=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=VT1LKGakhlQLpsqmdyO9ywEPdJpC95G70kAr7Y5f1fnWBWHiMp+b4jYMr7R0ZwvXKczM6AXt/7O0KKkc+gL8lFoB7iPSCoiNaY/fyYznFy3AgglwZ5T2sgOZ3Xqrb3WwT/baV0y0tCGknLVfDidERSgSx6312/bG+i3Ftet8B/A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=S4LvUqFa; arc=none smtp.client-ip=209.85.214.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="S4LvUqFa" Received: by mail-pl1-f174.google.com with SMTP id d9443c01a7336-20bb610be6aso58629275ad.1; Mon, 07 Oct 2024 23:22:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1728368536; x=1728973336; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZDYsYYa+owJ/7qI8MxzT4IGktWSYUFxVm242eY9jgz8=; b=S4LvUqFaJ9vFRsFOzCDTvtvGge3eXA1LLnMxpPrnU/Ebw1XNe5X5HXMTvMptCRnJaK cSDdbvuAtZaNafIQqnL9+bFTgoGzTB0Px72TTIBJZ4jx6EW532diZ/C0frC5sePMED6O x/sa5nVO+nP/buAg6GSYCuAk3UWTeme8D6yDoj/RgcGVT7W8qGHbOOIvVN5E797YStg+ RjkKfVDal2OTbhShj1ztueyj90z90Wyn7yl7OT/YDxvxAkvr58ZTCZV4xm9dNd95hF7Z hXFwWjYhSko9FtE/bdxTZB8PPcGDsFXBpCPXRtTaSDGs2s5OBrUasxbxIIs/v+kqVxWg nKug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728368536; x=1728973336; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZDYsYYa+owJ/7qI8MxzT4IGktWSYUFxVm242eY9jgz8=; b=a+DmX1bGL9yIGABVaDgDVHYWrAl/i45WRfdvstLyABwUmp6nWA4eU12kBcXyawaiSO 9BzkPOL+O2AU+lkFciu0OjMLf77thoBpzy5Nkn6PSksHQcn1LizRS9UH/t2oXvW/BUUe BAFWpVB1cx6k86IB5MqvLVnlkNDD414gPUpLirkNYEIcpe71w4CUGK9Im4/0o1gx3Wgn f7Q77BWO8A/7IhYo7yDefVRI1PCUN3zx5SbdzAkjn7Mv8uVtHypfZ95S1I+l9Rht0lIy PyOQGzWkrNx+UDiawSoV1AeTPog1eRIY2n/Tb+g3QPFWqRNMsNwr905qZ2Xp5fsUiDZf 9WFw== X-Forwarded-Encrypted: i=1; AJvYcCV+pfRnGOAnINXty0Ewrv63sjAMKgum1Jc8z/b5/8hHWKy2q7zKTO5BNL4IdegYw7Zr5k+ED7aFnMJDDf4=@vger.kernel.org X-Gm-Message-State: AOJu0Yx61nhB7g2prGP3dMmwjuNTxB9Kozq0MHP9h3qsrTA0ZUytDqlM bihmB3VDahwfy5t52AlZGUr6Hysst/xvVeLJot2etTRCiIidF5hI X-Google-Smtp-Source: AGHT+IEuM0/04eemhhnRkg5maHvKPdS/K7f2Wz8dYXLpVx63PL1J7Rihae/q0LlyTt07M4TVdqqjUw== X-Received: by 2002:a17:90b:4a4c:b0:2e0:d1fa:fdd7 with SMTP id 98e67ed59e1d1-2e1e631e2c9mr16741791a91.27.1728368535883; Mon, 07 Oct 2024 23:22:15 -0700 (PDT) Received: from localhost.localdomain ([39.144.105.70]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e1e85c8fd1sm8357525a91.18.2024.10.07.23.22.08 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 07 Oct 2024 23:22:15 
-0700 (PDT) From: Yafang Shao To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, hannes@cmpxchg.org, surenb@google.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Yafang Shao Subject: [PATCH v2 1/4] sched: Define sched_clock_irqtime as static key Date: Tue, 8 Oct 2024 14:19:48 +0800 Message-Id: <20241008061951.3980-2-laoar.shao@gmail.com> X-Mailer: git-send-email 2.30.1 (Apple Git-130) In-Reply-To: <20241008061951.3980-1-laoar.shao@gmail.com> References: <20241008061951.3980-1-laoar.shao@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Since CPU time accounting is a performance-critical path, let's define sched_clock_irqtime as a static key to minimize potential overhead. 
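For readers unfamiliar with static keys, the following is a rough userspace sketch of the call pattern this patch switches to. It is not kernel code: the fake_static_key type, the FAKE_DEFINE_STATIC_KEY_FALSE() macro, the fake_static_branch_*() helpers and fake_irqtime_account() are hypothetical stand-ins that only mimic the shape of DEFINE_STATIC_KEY_FALSE(), static_branch_enable(), static_branch_disable() and static_branch_likely(). A real static key patches the branch in the instruction stream at runtime; this stand-in merely reads a bool through __builtin_expect(), a GCC/Clang builtin.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a kernel static key; a real key patches code. */
struct fake_static_key {
	bool enabled;
};

#define FAKE_DEFINE_STATIC_KEY_FALSE(name) \
	struct fake_static_key name = { false }

static void fake_static_branch_enable(struct fake_static_key *key)
{
	key->enabled = true;
}

static void fake_static_branch_disable(struct fake_static_key *key)
{
	key->enabled = false;
}

/* Branch hint only: tells the compiler the "enabled" case is the hot one. */
static bool fake_static_branch_likely(const struct fake_static_key *key)
{
	return __builtin_expect(key->enabled, 1);
}

static FAKE_DEFINE_STATIC_KEY_FALSE(sched_clock_irqtime);

/* Illustrative counter, not a kernel variable. */
static unsigned long long irq_ns_accounted;

/* Mirrors the early-return shape of irqtime_account_irq() after the patch. */
static void fake_irqtime_account(unsigned long long delta_ns)
{
	if (!fake_static_branch_likely(&sched_clock_irqtime))
		return;
	irq_ns_accounted += delta_ns;
}
```

In this sketch, calls made while the key is false are dropped, and calls made after fake_static_branch_enable(&sched_clock_irqtime) are counted.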
Signed-off-by: Yafang Shao --- kernel/sched/cputime.c | 16 +++++++--------- kernel/sched/sched.h | 1 + 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index 0bed0fa1acd9..d0b6ea737d04 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -7,6 +7,8 @@ #include #endif =20 +DEFINE_STATIC_KEY_FALSE(sched_clock_irqtime); + #ifdef CONFIG_IRQ_TIME_ACCOUNTING =20 /* @@ -22,16 +24,14 @@ */ DEFINE_PER_CPU(struct irqtime, cpu_irqtime); =20 -static int sched_clock_irqtime; - void enable_sched_clock_irqtime(void) { - sched_clock_irqtime =3D 1; + static_branch_enable(&sched_clock_irqtime); } =20 void disable_sched_clock_irqtime(void) { - sched_clock_irqtime =3D 0; + static_branch_disable(&sched_clock_irqtime); } =20 static void irqtime_account_delta(struct irqtime *irqtime, u64 delta, @@ -57,7 +57,7 @@ void irqtime_account_irq(struct task_struct *curr, unsign= ed int offset) s64 delta; int cpu; =20 - if (!sched_clock_irqtime) + if (!static_branch_likely(&sched_clock_irqtime)) return; =20 cpu =3D smp_processor_id(); @@ -90,8 +90,6 @@ static u64 irqtime_tick_accounted(u64 maxtime) =20 #else /* CONFIG_IRQ_TIME_ACCOUNTING */ =20 -#define sched_clock_irqtime (0) - static u64 irqtime_tick_accounted(u64 dummy) { return 0; @@ -478,7 +476,7 @@ void account_process_tick(struct task_struct *p, int us= er_tick) if (vtime_accounting_enabled_this_cpu()) return; =20 - if (sched_clock_irqtime) { + if (static_branch_likely(&sched_clock_irqtime)) { irqtime_account_process_tick(p, user_tick, 1); return; } @@ -507,7 +505,7 @@ void account_idle_ticks(unsigned long ticks) { u64 cputime, steal; =20 - if (sched_clock_irqtime) { + if (static_branch_likely(&sched_clock_irqtime)) { irqtime_account_idle_ticks(ticks); return; } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 8063db62b027..db7d541eebff 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3174,6 +3174,7 @@ struct irqtime { }; =20 
DECLARE_PER_CPU(struct irqtime, cpu_irqtime); +DECLARE_STATIC_KEY_FALSE(sched_clock_irqtime); =20 /* * Returns the irqtime minus the softirq time computed by ksoftirqd. --=20 2.43.5 From nobody Wed Nov 27 18:27:59 2024 Received: from mail-pj1-f50.google.com (mail-pj1-f50.google.com [209.85.216.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8370C190463; Tue, 8 Oct 2024 06:22:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728368546; cv=none; b=Fr1+EORmCm/pwmFW6RWeInDWf2/OmqIYUC5OQ1aQ6F8XozYPO7sAB6YCbqUn+iw+zMoDPZLMfa82anDy/x3RZl73dvTtmwzHuXljUQViRDFTucpcngyX/xJpcz119dW+UOcSIpMd6+NWSLfeajVs0X1ID7VkJVNzKzJOkgEX9Ys= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728368546; c=relaxed/simple; bh=Eb8d5lyXFhSllOGcwU1VmnjYWiuaJB9lPUZZG8MTkDE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=hefMXRFUns3vqywfxmtLKW7zdOMtMr/5Kq0VskgxPSVeT9IWJC07j8tHrNbKiy6Ndjy8t92dbPxLokAEN2xE+Vw9yK4cZIE4bJV4sENLscisBZtWNTbkLtclLYuPJFiJXo2/P2ti3u6aZTXW8RrvLFunrl+RBsYL+uij8DXlbUw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=IjdXuNW4; arc=none smtp.client-ip=209.85.216.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="IjdXuNW4" Received: by mail-pj1-f50.google.com with SMTP id 98e67ed59e1d1-2e07d91f78aso3830689a91.1; Mon, 07 Oct 2024 23:22:25 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1728368545; x=1728973345; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DCA0RTe4B5CkvnInMX18gUuR5vgyuSITkXk+PtSiiZc=; b=IjdXuNW4hSvBlCgqNuw9WW7P54h7lL+jlGF2uZeAaUjJXa3Rh9S3ubQQsG7fOH9tX7 mrbqdaiJMSX/Z27eft6cn7QhvYkxlJbrx9ew+9XcJ24N35pBRWAhVUt/JNoMaSE2bnWv dIP7+oMUaayJ42zG0qdS95xPW99rTY4tLh0pfN4KF2jNrh8OjqmtiPsHUsaAnXAyy0Vb 7zai3XPLAs9XuA9OFWVPCjDFelZ8JUTsg85ss3kVMAexmFnjVnqMeXIsa/SbsM7ba057 T6MyYhibA2kPBaJv47kIXH0WIp/17e/0wQzmAlk07kg/r1j+c09tGo3Eft9WiuWJ5wx/ hfSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728368545; x=1728973345; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DCA0RTe4B5CkvnInMX18gUuR5vgyuSITkXk+PtSiiZc=; b=xTEEhbeH2O1KUhPLrlHCgUjWgQ+DG2tTYKfXoBXyRefNacDgCrtW1ZNedkXTfHSJ24 0t1HaE0rRtTzdXuSjgd/weDhgDnZBqUmDi2uL+uy8XQjzQ2G8/KY7xH1qO5oor7LzZFh 70pooenHEM0keiLmmlFInbhFvt6PD0TSwbwhd0Gt/PLuDylthuj3mdAQ7yYEQCPkS9Ak yr7x/uiLoilpuUga/nLCvJ2p/TDHZADSO09EgJvfPzp6sN9GzKgdQWE3QpM4p3EGtoDA vk4h385hiAehay0Z3r2rq4hLETqkmSXLt9qLiIv1VZtAkyoj9iUmZKbs/j2gFUuIA+Xk 23LQ== X-Forwarded-Encrypted: i=1; AJvYcCVIK1MHQTFPwwjoLI5wtf8BK66vC/V4tr4UsH1auGPMMKvbK90enZIwxiQpK1r+u7QCu2SWXZGzwkIKrGc=@vger.kernel.org X-Gm-Message-State: AOJu0YwD+cdK/2q3FBfHbmkFZ2FAVpLPB7SbFrkuO8MprtNPfxfF0Gkr D0r6NpG6FCl9CpToVECiAUeCApjMPqfBwgUl58NO7GxL+yT+tM2U X-Google-Smtp-Source: AGHT+IHgWeIfy/eX5qUF00EGeX9DuHDG/bJcMDVRAGpHPztKHZUJfEgweeCF+eX/1/FXCx/tRx6Sbg== X-Received: by 2002:a17:90a:db14:b0:2e2:8d7a:f1ba with SMTP id 98e67ed59e1d1-2e28d7af415mr308078a91.2.1728368544778; Mon, 07 Oct 2024 23:22:24 -0700 (PDT) Received: from localhost.localdomain ([39.144.105.70]) by smtp.gmail.com with ESMTPSA id 
98e67ed59e1d1-2e1e85c8fd1sm8357525a91.18.2024.10.07.23.22.16 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 07 Oct 2024 23:22:24 -0700 (PDT) From: Yafang Shao To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, hannes@cmpxchg.org, surenb@google.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Yafang Shao Subject: [PATCH v2 2/4] sched: Don't account irq time if sched_clock_irqtime is disabled Date: Tue, 8 Oct 2024 14:19:49 +0800 Message-Id: <20241008061951.3980-3-laoar.shao@gmail.com> X-Mailer: git-send-email 2.30.1 (Apple Git-130) In-Reply-To: <20241008061951.3980-1-laoar.shao@gmail.com> References: <20241008061951.3980-1-laoar.shao@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" sched_clock_irqtime may be disabled due to the clock source, in which case IRQ time should not be accounted. Let's add a conditional check to avoid unnecessary logic. 
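As an illustration only, here is a simplified userspace model of the IRQ-time step in update_rq_clock_task() once it is gated on the key. The struct fake_rq type, the fake_update_rq_clock_task() function and the irqtime_enabled parameter are hypothetical: irqtime_enabled stands in for static_branch_likely(&sched_clock_irqtime), and irq_time_now for irq_time_read(cpu_of(rq)). Steal-time handling and delayacct_irq() are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the struct rq fields used here. */
struct fake_rq {
	uint64_t clock_task;	/* time charged to tasks */
	uint64_t prev_irq_time;	/* IRQ time already consumed */
};

/*
 * Models the CONFIG_IRQ_TIME_ACCOUNTING part of update_rq_clock_task().
 * Returns the IRQ time that was subtracted from delta.
 */
static int64_t fake_update_rq_clock_task(struct fake_rq *rq, int64_t delta,
					 uint64_t irq_time_now,
					 bool irqtime_enabled)
{
	int64_t irq_delta = 0;

	if (irqtime_enabled) {
		irq_delta = (int64_t)(irq_time_now - rq->prev_irq_time);
		/* Clamp so that clock_task never moves backwards. */
		if (irq_delta > delta)
			irq_delta = delta;
		rq->prev_irq_time += irq_delta;
		delta -= irq_delta;
	}

	rq->clock_task += delta;
	return irq_delta;
}
```

With the flag off, the whole delta goes to clock_task and prev_irq_time is left untouched, which is the behavior the added check is meant to guarantee.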
Signed-off-by: Yafang Shao --- kernel/sched/core.c | 44 +++++++++++++++++++++++--------------------- 1 file changed, 23 insertions(+), 21 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index b6cc1cf499d6..8b633a14a60f 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -735,29 +735,31 @@ static void update_rq_clock_task(struct rq *rq, s64 d= elta) s64 __maybe_unused steal =3D 0, irq_delta =3D 0; =20 #ifdef CONFIG_IRQ_TIME_ACCOUNTING - irq_delta =3D irq_time_read(cpu_of(rq)) - rq->prev_irq_time; + if (static_branch_likely(&sched_clock_irqtime)) { + irq_delta =3D irq_time_read(cpu_of(rq)) - rq->prev_irq_time; =20 - /* - * Since irq_time is only updated on {soft,}irq_exit, we might run into - * this case when a previous update_rq_clock() happened inside a - * {soft,}IRQ region. - * - * When this happens, we stop ->clock_task and only update the - * prev_irq_time stamp to account for the part that fit, so that a next - * update will consume the rest. This ensures ->clock_task is - * monotonic. - * - * It does however cause some slight miss-attribution of {soft,}IRQ - * time, a more accurate solution would be to update the irq_time using - * the current rq->clock timestamp, except that would require using - * atomic ops. - */ - if (irq_delta > delta) - irq_delta =3D delta; + /* + * Since irq_time is only updated on {soft,}irq_exit, we might run into + * this case when a previous update_rq_clock() happened inside a + * {soft,}IRQ region. + * + * When this happens, we stop ->clock_task and only update the + * prev_irq_time stamp to account for the part that fit, so that a next + * update will consume the rest. This ensures ->clock_task is + * monotonic. + * + * It does however cause some slight miss-attribution of {soft,}IRQ + * time, a more accurate solution would be to update the irq_time using + * the current rq->clock timestamp, except that would require using + * atomic ops. 
+ */ + if (irq_delta > delta) + irq_delta =3D delta; =20 - rq->prev_irq_time +=3D irq_delta; - delta -=3D irq_delta; - delayacct_irq(rq->curr, irq_delta); + rq->prev_irq_time +=3D irq_delta; + delta -=3D irq_delta; + delayacct_irq(rq->curr, irq_delta); + } #endif #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING if (static_key_false((&paravirt_steal_rq_enabled))) { --=20 2.43.5 From nobody Wed Nov 27 18:27:59 2024 Received: from mail-pj1-f43.google.com (mail-pj1-f43.google.com [209.85.216.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3768F190463; Tue, 8 Oct 2024 06:22:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728368557; cv=none; b=V0csvKmtv20YjEZ8q7GohBWfVETubha4yriee+SECOWFDhUacUfmK/iG4FBYdkEa+E0qHDNT+1Enncw3dG+4ugHUqIwmTLS33bmmGHGBQTy2behwoflHh1UJogHw9e8YDBGksKpys8TLXTeD/HwjIYMuQ89KDaSWolXkyDdWysg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728368557; c=relaxed/simple; bh=l5SirwIs4fa/kJ+TrNpd43oWvaho729bYKl0VngKc14=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=WiCkcQtZQjRfMnmVBHMj2uLJIC/8BRtNY3gA6DBk9PMbBJBycb3sSbbvByPmfCTr98v+8/nhW89kIOj5XCoLCXDE130ugWc5/n4m7MX00ieIRskEb2eMQksU54XPHL1PGjwV6SU7Tx1T0sfJdzAA1wWu40D6Baeoi6YAvlaEg74= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Nq+KbRAE; arc=none smtp.client-ip=209.85.216.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) 
header.d=gmail.com header.i=@gmail.com header.b="Nq+KbRAE" Received: by mail-pj1-f43.google.com with SMTP id 98e67ed59e1d1-2e1bfa9ddb3so4031540a91.0; Mon, 07 Oct 2024 23:22:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1728368554; x=1728973354; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=bm9QXwvwwNOIs45Ze0l+LKqVq0pGGiMbOKHfffqvuOM=; b=Nq+KbRAEU3j58hRJ877uGZc8+x6dNnyxpgVGlPU+GJG5EYI2wqebnjdztw6i3eUnTK K60J7Vn6oewoUV6jksAtP6W4wA6gfGhGcqJS1AilivGGM4psxrXsR4Xc01WdcIl/BdLj 3KDvGk06PUvoWjFX8d1C/RGqbIPXDHKwSnJsMjqB8jUCbWeuOjX6qCZiIrUpAy42y5wa BVlUIyVtVt4jEcoE8p0tt15MGC25Xx+1/T3ahzF2wGr2ka0VJ3or0ES6uJTXKi99wmlk JrNiv7sbNSP91p9dDRTvWXqBcYjnIViwoQCzXwaiGeKAInruIi2W5RKrosXkfrdsXHXR 84dg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728368554; x=1728973354; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=bm9QXwvwwNOIs45Ze0l+LKqVq0pGGiMbOKHfffqvuOM=; b=KjS8CaCkB0D2015zAujZKHEWWHtNAMsfAvCwQBuWTWe6WNg4Mr9f98+jgX5Ro6N5Mh sn6ODGX357d14L2H2ZqVKaT5YMdRHjjAvmm1bVnLjC9LcwVU70GIxqgqXFgEbnyYoc2E lyYT8wlDjfTu5i2T9+17JOddbewwJYbTF55gmPw1kyEU1MI9yTlFtDFX5C7c+BxYK9ls kW49NrYKSy4G6vB8E79snBo5G3Bg4TEV0BG52D1ABl4euoDLA5ZPVrmqYty3eTa4PdoS p5YigEzWbZFVurjCQzb4+ZJgM8vmmG6IKNMC6758GcckBu8+00g36hFj6zwFKUMzUUqS kZAQ== X-Forwarded-Encrypted: i=1; AJvYcCWhP0YqhthCG3BMPN7N1KJVHKRaEIqdgtTrSKHO0QeE6Nt5Ori5ZP0FN9CvnGqUMiKn4nueNUSa6iltOQU=@vger.kernel.org X-Gm-Message-State: AOJu0YyTzUWB3rSQaCaJwdH2yZ73BIz/2vycaJ53yCHmvVQ0+tJ/84UG WezgG6x3oBmdMY/hVa7vCpKIoo+LdJTuDsvAGd+lwdfeTtucUhgD X-Google-Smtp-Source: AGHT+IEsUru35H3naDJFkp7mestgxOGxUXNbPwc7QGF7zWh3F8pFCwITAqaeWfDJK5Rpt3jYDJoFxw== X-Received: by 2002:a17:90a:b013:b0:2d8:82da:2627 with SMTP id 
98e67ed59e1d1-2e1e6354232mr16548925a91.27.1728368554524; Mon, 07 Oct 2024 23:22:34 -0700 (PDT) Received: from localhost.localdomain ([39.144.105.70]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e1e85c8fd1sm8357525a91.18.2024.10.07.23.22.25 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 07 Oct 2024 23:22:34 -0700 (PDT) From: Yafang Shao To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, hannes@cmpxchg.org, surenb@google.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Yafang Shao Subject: [PATCH v2 3/4] sched, psi: Don't account irq time if sched_clock_irqtime is disabled Date: Tue, 8 Oct 2024 14:19:50 +0800 Message-Id: <20241008061951.3980-4-laoar.shao@gmail.com> X-Mailer: git-send-email 2.30.1 (Apple Git-130) In-Reply-To: <20241008061951.3980-1-laoar.shao@gmail.com> References: <20241008061951.3980-1-laoar.shao@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" sched_clock_irqtime may be disabled due to the clock source, in which case IRQ time should not be accounted. Let's add a conditional check to avoid unnecessary logic. 
Signed-off-by: Yafang Shao Acked-by: Johannes Weiner --- kernel/sched/psi.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 020d58967d4e..49d9c75be0c8 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -1000,7 +1000,8 @@ void psi_account_irqtime(struct rq *rq, struct task_s= truct *curr, struct task_st u64 now, irq; s64 delta; =20 - if (static_branch_likely(&psi_disabled)) + if (static_branch_likely(&psi_disabled) || + !static_branch_likely(&sched_clock_irqtime)) return; =20 if (!curr->pid) --=20 2.43.5 From nobody Wed Nov 27 18:27:59 2024 Received: from mail-pg1-f169.google.com (mail-pg1-f169.google.com [209.85.215.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C0612190463; Tue, 8 Oct 2024 06:22:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728368569; cv=none; b=VvdiK5oCYsg21WxND/J+0RZnarW8GSZO0SPENRyRA9Cxs01Gib0y0GaDUMidTeh7I4J0LiewAwSUtI+LtPoMZEZUMbpQINRjrknIGZgcU4ElbLfOSC/VLkYefQEYiSuoeRbWcXXRv+liY5Y2fbZ5t+EY378UwNlHzP6yMNM6nck= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728368569; c=relaxed/simple; bh=86UgX/mbty8uWpIMTjQSPIXrHh3bYdC5nmoLRupt7Kc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=s23RhAFSNC3/7V7Qfg0/ctKY+1/371s4aF/jidfmve5wk1lG+Id/bkCLzl8HqU5+5uDXTWTu/4Pavd2l+2oeW6YnIE6qTckAGlyl9pBhF+Tm/fJPJIIjJ36j8m7QjG77tDRAnwoR678x8khFa3FBOX+yRUpxuVgGykMiKn/RVok= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=NhWyvaCv; arc=none smtp.client-ip=209.85.215.169 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="NhWyvaCv" Received: by mail-pg1-f169.google.com with SMTP id 41be03b00d2f7-7db908c9c83so3296971a12.2; Mon, 07 Oct 2024 23:22:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1728368567; x=1728973367; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=CREL76g3HNxkv23W1Xe5cLsUMj2uTYl1300sEXcL2Wo=; b=NhWyvaCvfcOjziqSYVQxjdRbGyo1Lyj2fieNaOAKwC8SomjlDRlzk0c9Rp5EJfNy8f I3/SSba6lwVejEya8ttMs6asIOV9CLLk78JEg4nrGnDb2NH8yiIomYw5exIvdYwwwphl tDIW7Au/cm1VJrU9z2HzHKQjzZajPn0v7MnZqWkfSvEQa32W7tLjb0XgcHyG+QAABFgN 3u4MbfnEsHznU7cfjuDdeKo30kLyFaG+9UWW7XJdCqT9kjnE1rydHod+Q4PZ/rUApTlk kl7gXjBelUND4yOFdSc04h8liZB+B/ZZlA1EI6YHUcZLWxYxYrPBEm7LLCew5CW378Q0 mfwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728368567; x=1728973367; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=CREL76g3HNxkv23W1Xe5cLsUMj2uTYl1300sEXcL2Wo=; b=ZRi0c6r/pwJ0invsjAOryp8r7px83sXt5K9ByEUmrGZB1XBMWNdSdA69Ke9p410aCa kwRglLs1A0K6Xqct95ISB2iUbUV5z7Znk4fOS9kD+jHQmE1PNJID43WmCKzyS7ZjJUrP saF6ivqdhW9QU18WrHjqYKqkingyiVn9XB5UCUV/wxd70xDyOB71SZEBa/7vqnkKxCiA 4dVVzTF4Iy/pUkO3QjhiIr9dmuKE8xtwcopT24JC/50rsk7jLmMuLkRjH4hGO0iCuTZ7 GVjtzV7DdSlNlqBnbt9qrHZmYWq7mCkJniKUqyR8bK0MxgkbFaCLuODLb+EgS8Zo2Q4L BYCA== X-Forwarded-Encrypted: i=1; AJvYcCWQnANmGWIzFxfMuxi9COiVnD9cFedjt7JUG0Ddemyk6vVpqbZ3CcKAm/csczAqsCuwTq6G72poSsSEngA=@vger.kernel.org X-Gm-Message-State: AOJu0YyP4kjsi7joxCuQcUEoqOuMPdbtm8ZKFi12rIAVv/k1R7vNz9T5 
yCIuaweM8V0/DBQ1GF9AGjtEK82KQtBCxvwiVKcRc2By3kAzVgceqz/Uu5qf0+0= X-Google-Smtp-Source: AGHT+IEGVu1GOoUdoVNewOPD2PejpPcmOyTHW1Lx20GCF66jj3PCbYeYhczjd4kPCVGPEEQ30XNeFQ== X-Received: by 2002:a17:90b:b14:b0:2d8:ebef:547 with SMTP id 98e67ed59e1d1-2e1e63bbf13mr16444269a91.35.1728368566981; Mon, 07 Oct 2024 23:22:46 -0700 (PDT) Received: from localhost.localdomain ([39.144.105.70]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e1e85c8fd1sm8357525a91.18.2024.10.07.23.22.35 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 07 Oct 2024 23:22:46 -0700 (PDT) From: Yafang Shao To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, hannes@cmpxchg.org, surenb@google.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Yafang Shao Subject: [PATCH v2 4/4] sched: Fix cgroup irq accounting for CONFIG_IRQ_TIME_ACCOUNTING Date: Tue, 8 Oct 2024 14:19:51 +0800 Message-Id: <20241008061951.3980-5-laoar.shao@gmail.com> X-Mailer: git-send-email 2.30.1 (Apple Git-130) In-Reply-To: <20241008061951.3980-1-laoar.shao@gmail.com> References: <20241008061951.3980-1-laoar.shao@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable After enabling CONFIG_IRQ_TIME_ACCOUNTING to monitor IRQ pressure in our container environment, we observed several noticeable behavioral changes. One of our IRQ-heavy services, such as Redis, reported a significant reduction in CPU usage after upgrading to the new kernel with CONFIG_IRQ_TIME_ACCOUNTING enabled. However, despite adding more threads to handle an increased workload, the CPU usage could not be raised. 
In other words, even though the container=E2=80=99s CPU usage appeared low,
it was unable to process more workloads to utilize additional CPU
resources, which caused issues.

This behavior can be demonstrated using netperf:

  function start_server() {
      for j in `seq 1 3`; do
          netserver -p $[12345+j] > /dev/null &
      done
  }

  server_ip=3D$1
  function start_client() {
      # That applies to cgroup2 as well.
      mkdir -p /sys/fs/cgroup/cpuacct/test
      echo $$ > /sys/fs/cgroup/cpuacct/test/cgroup.procs
      for j in `seq 1 3`; do
          port=3D$[12345+j]
          taskset -c 0 netperf -H ${server_ip} -l ${run_time:-30000} \
              -t TCP_STREAM -p $port -- -D -m 1k -M 1K -s 8k -S 8k \
              > /dev/null &
      done
  }

  start_server
  start_client

We can verify the CPU usage of the test cgroup using cpuacct.stat. The
output shows:

  system: 53
  user: 2

The CPU usage of the cgroup is relatively low at around 55%, but this usage
doesn't increase, even with more netperf tasks. The reason is that CPU0 is
at 100% utilization, as confirmed by mpstat:

  02:56:22 PM  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
  02:56:23 PM    0   0.99   0.00  55.45    0.00   0.99  42.57   0.00   0.00   0.00   0.00

  02:56:23 PM  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
  02:56:24 PM    0   2.00   0.00  55.00    0.00   0.00  43.00   0.00   0.00   0.00   0.00

This behavior is unexpected. We should account for IRQ time to the cgroup to
reflect the pressure the group is under.

After a thorough analysis, I discovered that this change in behavior is due
to commit 305e6835e055 ("sched: Do not account irq time to current task"),
which altered whether IRQ time should be charged to the interrupted task.
While I agree that a task should not be penalized by random interrupts, the
task itself cannot progress while interrupted. Therefore, the interrupted
time should be reported to the user.

The system metric in cpuacct.stat is crucial in indicating whether a
container is under heavy system pressure, including IRQ/softirq activity.
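To make the accounting gap concrete, here is a simplified userspace model of the snapshot-and-charge step that this patch adds to sched_tick() and __schedule(). It is a sketch under stated assumptions, not kernel code: struct fake_rq, struct fake_cgroup and fake_account_irqtime() are hypothetical, irq_time_now stands in for irq_time_read(cpu), and the two counter updates stand in for cgroup_account_cputime() and cgroup_account_cputime_field(..., CPUTIME_SOFTIRQ, ...). The PSI call is omitted.

```c
#include <stdint.h>

/* Hypothetical stand-ins: one runqueue and one cgroup's CPU-time counters. */
struct fake_rq {
	uint64_t psi_irq_time;	/* last IRQ-time snapshot */
};

struct fake_cgroup {
	uint64_t total;		/* all CPU time charged to the group */
	uint64_t softirq;	/* part of it reported as softirq */
};

/*
 * Charges the IRQ time accumulated since the last snapshot to the cgroup
 * of the task that was interrupted. Returns the charged delta, or 0 if
 * no new IRQ time was observed.
 */
static int64_t fake_account_irqtime(struct fake_rq *rq, struct fake_cgroup *cg,
				    uint64_t irq_time_now)
{
	int64_t delta = (int64_t)(irq_time_now - rq->psi_irq_time);

	if (delta <= 0)
		return 0;

	rq->psi_irq_time = irq_time_now;
	cg->total += (uint64_t)delta;
	/* IRQ and softirq time are both reported as softirq. */
	cg->softirq += (uint64_t)delta;
	return delta;
}
```

With hypothetical round numbers loosely based on the mpstat sample above (550 ms of task time and 430 ms of IRQ/softirq time in a one-second window), a group that starts with total = 550 ends at total = 980 after a single call with irq_time_now = 430, so the reported usage reflects that the CPU is nearly saturated. A second call with the same snapshot charges nothing.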
Hence, IRQ/softirq time should be accounted for in the cpuacct system usage, which also applies to cgroup2=E2=80=99s rstat. This patch reintroduces IRQ/softirq accounting to cgroups. Signed-off-by: Yafang Shao --- kernel/sched/core.c | 39 +++++++++++++++++++++++++++++++++++++-- kernel/sched/psi.c | 15 +++------------ kernel/sched/stats.h | 7 ++++--- 3 files changed, 44 insertions(+), 17 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 8b633a14a60f..533e015f8777 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -5587,7 +5587,24 @@ void sched_tick(void) rq_lock(rq, &rf); =20 curr =3D rq->curr; - psi_account_irqtime(rq, curr, NULL); + +#ifdef CONFIG_IRQ_TIME_ACCOUNTING + if (static_branch_likely(&sched_clock_irqtime)) { + u64 now, irq; + s64 delta; + + now =3D cpu_clock(cpu); + irq =3D irq_time_read(cpu); + delta =3D (s64)(irq - rq->psi_irq_time); + if (delta > 0) { + rq->psi_irq_time =3D irq; + psi_account_irqtime(rq, curr, NULL, now, delta); + cgroup_account_cputime(curr, delta); + /* We account both softirq and irq into softirq */ + cgroup_account_cputime_field(curr, CPUTIME_SOFTIRQ, delta); + } + } +#endif =20 update_rq_clock(rq); hw_pressure =3D arch_scale_hw_pressure(cpu_of(rq)); @@ -6667,7 +6684,25 @@ static void __sched notrace __schedule(int sched_mod= e) ++*switch_count; =20 migrate_disable_switch(rq, prev); - psi_account_irqtime(rq, prev, next); + +#ifdef CONFIG_IRQ_TIME_ACCOUNTING + if (static_branch_likely(&sched_clock_irqtime)) { + u64 now, irq; + s64 delta; + + now =3D cpu_clock(cpu); + irq =3D irq_time_read(cpu); + delta =3D (s64)(irq - rq->psi_irq_time); + if (delta > 0) { + rq->psi_irq_time =3D irq; + psi_account_irqtime(rq, prev, next, now, delta); + cgroup_account_cputime(prev, delta); + /* We account both softirq and irq into softirq */ + cgroup_account_cputime_field(prev, CPUTIME_SOFTIRQ, delta); + } + } +#endif + psi_sched_switch(prev, next, !task_on_rq_queued(prev)); =20 trace_sched_switch(preempt, prev, 
next, prev_state); diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 49d9c75be0c8..ffa8aa372fbd 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -992,16 +992,14 @@ void psi_task_switch(struct task_struct *prev, struct= task_struct *next, } =20 #ifdef CONFIG_IRQ_TIME_ACCOUNTING -void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct t= ask_struct *prev) +void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct t= ask_struct *prev, + u64 now, s64 delta) { int cpu =3D task_cpu(curr); struct psi_group *group; struct psi_group_cpu *groupc; - u64 now, irq; - s64 delta; =20 - if (static_branch_likely(&psi_disabled) || - !static_branch_likely(&sched_clock_irqtime)) + if (static_branch_likely(&psi_disabled)) return; =20 if (!curr->pid) @@ -1012,13 +1010,6 @@ void psi_account_irqtime(struct rq *rq, struct task_= struct *curr, struct task_st if (prev && task_psi_group(prev) =3D=3D group) return; =20 - now =3D cpu_clock(cpu); - irq =3D irq_time_read(cpu); - delta =3D (s64)(irq - rq->psi_irq_time); - if (delta < 0) - return; - rq->psi_irq_time =3D irq; - do { if (!group->enabled) continue; diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h index 237780aa3c53..7c5979761021 100644 --- a/kernel/sched/stats.h +++ b/kernel/sched/stats.h @@ -111,10 +111,11 @@ void psi_task_change(struct task_struct *task, int cl= ear, int set); void psi_task_switch(struct task_struct *prev, struct task_struct *next, bool sleep); #ifdef CONFIG_IRQ_TIME_ACCOUNTING -void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct t= ask_struct *prev); +void psi_account_irqtime(struct rq *rq, struct task_struct *curr, + struct task_struct *prev, u64 now, s64 delta); #else static inline void psi_account_irqtime(struct rq *rq, struct task_struct *= curr, - struct task_struct *prev) {} + struct task_struct *prev, u64 now, s64 delta) {} #endif /*CONFIG_IRQ_TIME_ACCOUNTING */ /* * PSI tracks state that persists across sleeps, such as iowaits 
and @@ -197,7 +198,7 @@ static inline void psi_sched_switch(struct task_struct = *prev, struct task_struct *next, bool sleep) {} static inline void psi_account_irqtime(struct rq *rq, struct task_struct *= curr, - struct task_struct *prev) {} + struct task_struct *prev, u64 now, s64 delta) {} #endif /* CONFIG_PSI */ =20 #ifdef CONFIG_SCHED_INFO --=20 2.43.5