From nobody Wed Sep 10 22:04:14 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70768C61DA4 for ; Wed, 22 Feb 2023 14:47:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232218AbjBVOrp (ORCPT ); Wed, 22 Feb 2023 09:47:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59772 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232216AbjBVOre (ORCPT ); Wed, 22 Feb 2023 09:47:34 -0500 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D17A938E80 for ; Wed, 22 Feb 2023 06:47:16 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id D07D8CE1DDE for ; Wed, 22 Feb 2023 14:47:00 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DA38CC433D2; Wed, 22 Feb 2023 14:46:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1677077219; bh=i0suEjF1OVttZE4v7oGGN5lbF5fnEyMlbmcS3axNMpQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Z5Oy+8JBnQ3KrQ1WgxmB3G+kQ2Y6XPmfVa9zlkxOvp3QAq/WCcKocc7sjnmbc8B3k +khwBCnBJdm3akmYQFadw/1DzXGAwLiQtC/qoljoKBPsFOfe8/ND0ia7HReSer0+k4 Gsn/HLGMxQI0/k7aqeLmW1f2RTrTbwa7ORJ3QM6TL4sskFilRwRdYyo7EhdYH6MYmS puc7pu/xTxbTrPQQsfSW3aBx+ug28InqceUeXhHH4InCIa0HcdN6zdvPxFc40wlM8H 0H4MMEew3ljsRgT1V22HOodaHm8KCkWJip4gvKy5tMbYyaNeKNiBCkQTzRa2vTfIip LsDHF+CUO6qOg== From: Frederic Weisbecker To: Thomas Gleixner Cc: LKML , Frederic Weisbecker , Alexey Dobriyan , Wei Li , Peter Zijlstra , Mirsad Goran Todorovac , Yu Liao , Hillf Danton , Ingo Molnar Subject: [PATCH 1/8] 
timers/nohz: Restructure and reshuffle struct tick_sched Date: Wed, 22 Feb 2023 15:46:42 +0100 Message-Id: <20230222144649.624380-2-frederic@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230222144649.624380-1-frederic@kernel.org> References: <20230222144649.624380-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Restructure and group fields by access in order to optimize cache layout. While at it, also add missing kernel doc for two fields: @last_jiffies and @idle_expires. Reported-by: Thomas Gleixner Acked-by: Peter Zijlstra (Intel) Cc: Hillf Danton Cc: Yu Liao Cc: Ingo Molnar Cc: Wei Li Cc: Alexey Dobriyan Cc: Mirsad Goran Todorovac Signed-off-by: Frederic Weisbecker --- kernel/time/tick-sched.h | 66 +++++++++++++++++++++++++--------------- 1 file changed, 41 insertions(+), 25 deletions(-) diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h index 504649513399..c6663254d17d 100644 --- a/kernel/time/tick-sched.h +++ b/kernel/time/tick-sched.h @@ -22,65 +22,81 @@ enum tick_nohz_mode { =20 /** * struct tick_sched - sched tick emulation and no idle tick control/stats - * @sched_timer: hrtimer to schedule the periodic tick in high - * resolution mode - * @check_clocks: Notification mechanism about clocksource changes - * @nohz_mode: Mode - one state of tick_nohz_mode + * * @inidle: Indicator that the CPU is in the tick idle mode * @tick_stopped: Indicator that the idle tick has been stopped * @idle_active: Indicator that the CPU is actively in the tick idle mode; * it is reset during irq handling phases. 
- * @do_timer_lst: CPU was the last one doing do_timer before going idle + * @do_timer_last: CPU was the last one doing do_timer before going idle * @got_idle_tick: Tick timer function has run with @inidle set + * @stalled_jiffies: Number of stalled jiffies detected across ticks + * @last_tick_jiffies: Value of jiffies seen on last tick + * @sched_timer: hrtimer to schedule the periodic tick in high + * resolution mode * @last_tick: Store the last tick expiry time when the tick * timer is modified for nohz sleeps. This is necessary * to resume the tick timer operation in the timeline * when the CPU returns from nohz sleep. * @next_tick: Next tick to be fired when in dynticks mode. * @idle_jiffies: jiffies at the entry to idle for idle time accounting + * @idle_waketime: Time when the idle was interrupted + * @idle_entrytime: Time when the idle call was entered + * @nohz_mode: Mode - one state of tick_nohz_mode + * @last_jiffies: Base jiffies snapshot when next event was last computed + * @timer_expires_base: Base time clock monotonic for @timer_expires + * @timer_expires: Anticipated timer expiration time (in case sched tick i= s stopped) + * @next_timer: Expiry time of next expiring timer for debugging purpose = only + * @idle_expires: Next tick in idle, for debugging purpose only * @idle_calls: Total number of idle calls * @idle_sleeps: Number of idle calls, where the sched tick was stopped - * @idle_entrytime: Time when the idle call was entered - * @idle_waketime: Time when the idle was interrupted * @idle_exittime: Time when the idle state was left * @idle_sleeptime: Sum of the time slept in idle with sched tick stopped * @iowait_sleeptime: Sum of the time slept in idle with sched tick stoppe= d, with IO outstanding - * @timer_expires: Anticipated timer expiration time (in case sched tick i= s stopped) - * @timer_expires_base: Base time clock monotonic for @timer_expires - * @next_timer: Expiry time of next expiring timer for debugging purpose = only * 
@tick_dep_mask: Tick dependency mask - is set, if someone needs the tick - * @last_tick_jiffies: Value of jiffies seen on last tick - * @stalled_jiffies: Number of stalled jiffies detected across ticks + * @check_clocks: Notification mechanism about clocksource changes */ struct tick_sched { - struct hrtimer sched_timer; - unsigned long check_clocks; - enum tick_nohz_mode nohz_mode; - + /* Common flags */ unsigned int inidle : 1; unsigned int tick_stopped : 1; unsigned int idle_active : 1; unsigned int do_timer_last : 1; unsigned int got_idle_tick : 1; =20 + /* Tick handling: jiffies stall check */ + unsigned int stalled_jiffies; + unsigned long last_tick_jiffies; + + /* Tick handling */ + struct hrtimer sched_timer; ktime_t last_tick; ktime_t next_tick; unsigned long idle_jiffies; - unsigned long idle_calls; - unsigned long idle_sleeps; - ktime_t idle_entrytime; ktime_t idle_waketime; - ktime_t idle_exittime; - ktime_t idle_sleeptime; - ktime_t iowait_sleeptime; + + /* Idle entry */ + ktime_t idle_entrytime; + + /* Tick stop */ + enum tick_nohz_mode nohz_mode; unsigned long last_jiffies; - u64 timer_expires; u64 timer_expires_base; + u64 timer_expires; u64 next_timer; ktime_t idle_expires; + unsigned long idle_calls; + unsigned long idle_sleeps; + + /* Idle exit */ + ktime_t idle_exittime; + ktime_t idle_sleeptime; + ktime_t iowait_sleeptime; + + /* Full dynticks handling */ atomic_t tick_dep_mask; - unsigned long last_tick_jiffies; - unsigned int stalled_jiffies; + + /* Clocksource changes */ + unsigned long check_clocks; }; =20 extern struct tick_sched *tick_get_tick_sched(int cpu); --=20 2.34.1 From nobody Wed Sep 10 22:04:14 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EFDAC61DA4 for ; Wed, 22 Feb 2023 14:47:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S231657AbjBVOrh (ORCPT ); Wed, 22 Feb 2023 09:47:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59486 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232169AbjBVOrZ (ORCPT ); Wed, 22 Feb 2023 09:47:25 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BF94E3BDAF for ; Wed, 22 Feb 2023 06:47:02 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3544661464 for ; Wed, 22 Feb 2023 14:47:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 819ADC4339B; Wed, 22 Feb 2023 14:46:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1677077221; bh=Wy5hbT4ewIbxtuRGZo3BNmMi3nK/RbjQJn9sH1g4yFs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FDmOTfUDDAvB318Tim6Svmwl4prd0CvCrO2yMhterj811AW25jGCizagfRmMU3PuO TtwrV6OOwv0mDAN5hY6twOv0xbamI50e31QG5hOBllfU0tfy1WiuNPI1PFNylJku9E QN4/p1q4Mkdo26fnClKlfGr9+zZk2dvpCo1jxbOYzQ+c677jHx0+V9SrSnUeintpfl JfyivG+S4iGlZNMP70TqKR+DSDUdc7Wbb+gTYayl3WGKBFkf2IXoLn4WuTbr4EivDM MdnybAwi2AGGOFmy382kl+pG1nF6aGuLLq2BGD/GXMWKOpgw5S3CvPPD7yfdLzqFb8 UabOOMYAYTN6w== From: Frederic Weisbecker To: Thomas Gleixner Cc: LKML , Frederic Weisbecker , Alexey Dobriyan , Wei Li , Peter Zijlstra , Mirsad Goran Todorovac , Yu Liao , Hillf Danton , Ingo Molnar Subject: [PATCH 2/8] timers/nohz: Only ever update sleeptime from idle exit Date: Wed, 22 Feb 2023 15:46:43 +0100 Message-Id: <20230222144649.624380-3-frederic@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230222144649.624380-1-frederic@kernel.org> References: <20230222144649.624380-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 
quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

The idle and IO sleeptime statistics appearing in /proc/stat can currently be updated from two sites: locally on idle exit and remotely by cpufreq. However, there is no synchronization mechanism protecting concurrent updates. It is therefore possible, among other broken scenarios, to account the sleeptime twice.

To avoid breaking the sleeptime accounting source, restrict the sleeptime updates to the local idle exit site. If there is a delta to add since the last update, IO/idle sleep time readers will now only compute the delta without actually writing it back to the internal idle statistic fields.

This fixes a writer VS writer race. Note there are still two known reader VS writer races to handle. A subsequent patch will fix one.

Reported-by: Yu Liao Acked-by: Peter Zijlstra (Intel) Cc: Hillf Danton Cc: Yu Liao Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Wei Li Cc: Alexey Dobriyan Cc: Mirsad Goran Todorovac Signed-off-by: Frederic Weisbecker --- kernel/time/tick-sched.c | 103 ++++++++++++++++----------------------- 1 file changed, 41 insertions(+), 62 deletions(-) diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index b0e3c9205946..9058b9eb8bc1 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -637,31 +637,21 @@ static void tick_nohz_update_jiffies(ktime_t now) touch_softlockup_watchdog_sched(); } =20 -/* - * Updates the per-CPU time idle statistics counters - */ -static void -update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now, u64 *las= t_update_time) -{ - ktime_t delta; - - if (ts->idle_active) { - delta =3D ktime_sub(now, ts->idle_entrytime); - if (nr_iowait_cpu(cpu) > 0) - ts->iowait_sleeptime =3D ktime_add(ts->iowait_sleeptime, delta); - else - ts->idle_sleeptime =3D ktime_add(ts->idle_sleeptime, delta); - ts->idle_entrytime =3D now; - } - - if (last_update_time) - 
*last_update_time =3D ktime_to_us(now); - -} - static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now) { - update_ts_time_stats(smp_processor_id(), ts, now, NULL); + ktime_t delta; + + if (WARN_ON_ONCE(!ts->idle_active)) + return; + + delta =3D ktime_sub(now, ts->idle_entrytime); + + if (nr_iowait_cpu(smp_processor_id()) > 0) + ts->iowait_sleeptime =3D ktime_add(ts->iowait_sleeptime, delta); + else + ts->idle_sleeptime =3D ktime_add(ts->idle_sleeptime, delta); + + ts->idle_entrytime =3D now; ts->idle_active =3D 0; =20 sched_clock_idle_wakeup_event(); @@ -674,6 +664,30 @@ static void tick_nohz_start_idle(struct tick_sched *ts) sched_clock_idle_sleep_event(); } =20 +static u64 get_cpu_sleep_time_us(struct tick_sched *ts, ktime_t *sleeptime, + bool compute_delta, u64 *last_update_time) +{ + ktime_t now, idle; + + if (!tick_nohz_active) + return -1; + + now =3D ktime_get(); + if (last_update_time) + *last_update_time =3D ktime_to_us(now); + + if (ts->idle_active && compute_delta) { + ktime_t delta =3D ktime_sub(now, ts->idle_entrytime); + + idle =3D ktime_add(*sleeptime, delta); + } else { + idle =3D *sleeptime; + } + + return ktime_to_us(idle); + +} + /** * get_cpu_idle_time_us - get the total idle time of a CPU * @cpu: CPU number to query @@ -691,27 +705,9 @@ static void tick_nohz_start_idle(struct tick_sched *ts) u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time) { struct tick_sched *ts =3D &per_cpu(tick_cpu_sched, cpu); - ktime_t now, idle; - - if (!tick_nohz_active) - return -1; - - now =3D ktime_get(); - if (last_update_time) { - update_ts_time_stats(cpu, ts, now, last_update_time); - idle =3D ts->idle_sleeptime; - } else { - if (ts->idle_active && !nr_iowait_cpu(cpu)) { - ktime_t delta =3D ktime_sub(now, ts->idle_entrytime); - - idle =3D ktime_add(ts->idle_sleeptime, delta); - } else { - idle =3D ts->idle_sleeptime; - } - } - - return ktime_to_us(idle); =20 + return get_cpu_sleep_time_us(ts, &ts->idle_sleeptime, + !nr_iowait_cpu(cpu), 
last_update_time); } EXPORT_SYMBOL_GPL(get_cpu_idle_time_us); =20 @@ -732,26 +728,9 @@ EXPORT_SYMBOL_GPL(get_cpu_idle_time_us); u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time) { struct tick_sched *ts =3D &per_cpu(tick_cpu_sched, cpu); - ktime_t now, iowait; =20 - if (!tick_nohz_active) - return -1; - - now =3D ktime_get(); - if (last_update_time) { - update_ts_time_stats(cpu, ts, now, last_update_time); - iowait =3D ts->iowait_sleeptime; - } else { - if (ts->idle_active && nr_iowait_cpu(cpu) > 0) { - ktime_t delta =3D ktime_sub(now, ts->idle_entrytime); - - iowait =3D ktime_add(ts->iowait_sleeptime, delta); - } else { - iowait =3D ts->iowait_sleeptime; - } - } - - return ktime_to_us(iowait); + return get_cpu_sleep_time_us(ts, &ts->iowait_sleeptime, + nr_iowait_cpu(cpu), last_update_time); } EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us); =20 --=20 2.34.1 From nobody Wed Sep 10 22:04:14 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D874EC636D6 for ; Wed, 22 Feb 2023 14:47:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232043AbjBVOrr (ORCPT ); Wed, 22 Feb 2023 09:47:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59704 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232196AbjBVOrf (ORCPT ); Wed, 22 Feb 2023 09:47:35 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3AD222A2F for ; Wed, 22 Feb 2023 06:47:19 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id DD4DD61486 for ; Wed, 22 Feb 2023 14:47:04 +0000 
(UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2C18EC433D2; Wed, 22 Feb 2023 14:47:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1677077224; bh=5sxyMTW09Kl3ct1nkcDaj+gcruIUS7UwkGWllRn6QWo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Sc66U+JeBvcIan2oTweypNsCy2L8Q2/rPUz6v+bXrFziNS6zAxmMtjoUctKljuRW5 59q297z8DODDV+4UtOB4+eFCzY4DcJO7esf/AF4bPFIy9gxqqrFX64DtVFKz2Hi73L n+uuZcaGU40P4h0HRfuKyBmN4WO6ylbAmBn6d6vwVZ2dGLjgwYfpnbr3phSDVbTIf9 P9JJM92Z7sgM9Mk7/fOrst3iDJtN+i2ImK7gpXdihpHHdy1YIovXunR3SkSKR/odZn pYDPMZqntFhTXNPrx41WL2rAOI+OlnhnEeGnjI9vbbChhMDztY2XoZwqdkBbG8/cX2 XTv0y2jKriNaQ== From: Frederic Weisbecker To: Thomas Gleixner Cc: LKML , Frederic Weisbecker , Alexey Dobriyan , Wei Li , Peter Zijlstra , Mirsad Goran Todorovac , Yu Liao , Hillf Danton , Ingo Molnar Subject: [PATCH 3/8] timers/nohz: Protect idle/iowait sleep time under seqcount Date: Wed, 22 Feb 2023 15:46:44 +0100 Message-Id: <20230222144649.624380-4-frederic@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230222144649.624380-1-frederic@kernel.org> References: <20230222144649.624380-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

Reading idle/io sleep time (e.g. from /proc/stat) can race with idle exit updates because the state machine handling the stats is not atomic and requires a coherent read batch. As a result, reading the sleep time may report irrelevant or backward values.

Fix this by protecting the simple state machine with a seqcount. This is expected to be cheap enough not to add measurable performance impact on the idle path.

Note this only partially fixes the reader VS writer condition. A race remains that involves remote updates of the CPU iowait task counter. It can hardly be fixed.
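For readers unfamiliar with the pattern, here is a minimal userspace sketch of the seqcount idea described above. It is not the kernel implementation: `struct idle_stats`, `idle_start()`, `idle_stop()` and `idle_read()` are hypothetical names, time is a plain integer passed in by the caller, and sequentially consistent C11 atomics stand in for the kernel's `seqcount_t` helpers and their barriers. It assumes a single writer, as on the local idle path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the tick_sched fields covered by the seqcount. */
struct idle_stats {
	atomic_uint seq;           /* even: stable, odd: write in progress */
	_Atomic int64_t entrytime; /* time of the last idle entry */
	_Atomic int64_t sleeptime; /* accumulated idle time */
	atomic_bool active;        /* currently idle? */
};

/* Writer side, idle entry: bump seq around the update (single writer assumed). */
static void idle_start(struct idle_stats *s, int64_t now)
{
	atomic_fetch_add(&s->seq, 1); /* seq becomes odd */
	atomic_store(&s->entrytime, now);
	atomic_store(&s->active, true);
	atomic_fetch_add(&s->seq, 1); /* seq becomes even again */
}

/* Writer side, idle exit: fold the pending delta into the total. */
static void idle_stop(struct idle_stats *s, int64_t now)
{
	assert(atomic_load(&s->active));

	atomic_fetch_add(&s->seq, 1);
	atomic_store(&s->sleeptime,
		     atomic_load(&s->sleeptime) + (now - atomic_load(&s->entrytime)));
	atomic_store(&s->entrytime, now);
	atomic_store(&s->active, false);
	atomic_fetch_add(&s->seq, 1);
}

/* Reader side: compute the total without writing, retry if a write interleaved. */
static int64_t idle_read(struct idle_stats *s, int64_t now)
{
	unsigned int seq;
	int64_t idle;

	do {
		/* Wait until no write is in progress, then snapshot seq. */
		while ((seq = atomic_load(&s->seq)) & 1)
			;
		idle = atomic_load(&s->sleeptime);
		if (atomic_load(&s->active))
			idle += now - atomic_load(&s->entrytime);
	} while (atomic_load(&s->seq) != seq); /* changed: snapshot was torn */

	return idle;
}
```

The reader never stores to the shared fields, so the only cost on the idle path is the two counter increments around each update.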
Reported-by: Yu Liao Acked-by: Peter Zijlstra (Intel) Cc: Hillf Danton Cc: Yu Liao Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Wei Li Cc: Alexey Dobriyan Cc: Mirsad Goran Todorovac Signed-off-by: Frederic Weisbecker --- kernel/time/tick-sched.c | 22 ++++++++++++++++------ kernel/time/tick-sched.h | 1 + 2 files changed, 17 insertions(+), 6 deletions(-) diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 9058b9eb8bc1..90d9b7b29875 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -646,6 +646,7 @@ static void tick_nohz_stop_idle(struct tick_sched *ts, = ktime_t now) =20 delta =3D ktime_sub(now, ts->idle_entrytime); =20 + write_seqcount_begin(&ts->idle_sleeptime_seq); if (nr_iowait_cpu(smp_processor_id()) > 0) ts->iowait_sleeptime =3D ktime_add(ts->iowait_sleeptime, delta); else @@ -653,14 +654,18 @@ static void tick_nohz_stop_idle(struct tick_sched *ts= , ktime_t now) =20 ts->idle_entrytime =3D now; ts->idle_active =3D 0; + write_seqcount_end(&ts->idle_sleeptime_seq); =20 sched_clock_idle_wakeup_event(); } =20 static void tick_nohz_start_idle(struct tick_sched *ts) { + write_seqcount_begin(&ts->idle_sleeptime_seq); ts->idle_entrytime =3D ktime_get(); ts->idle_active =3D 1; + write_seqcount_end(&ts->idle_sleeptime_seq); + sched_clock_idle_sleep_event(); } =20 @@ -668,6 +673,7 @@ static u64 get_cpu_sleep_time_us(struct tick_sched *ts,= ktime_t *sleeptime, bool compute_delta, u64 *last_update_time) { ktime_t now, idle; + unsigned int seq; =20 if (!tick_nohz_active) return -1; @@ -676,13 +682,17 @@ static u64 get_cpu_sleep_time_us(struct tick_sched *t= s, ktime_t *sleeptime, if (last_update_time) *last_update_time =3D ktime_to_us(now); =20 - if (ts->idle_active && compute_delta) { - ktime_t delta =3D ktime_sub(now, ts->idle_entrytime); + do { + seq =3D read_seqcount_begin(&ts->idle_sleeptime_seq); =20 - idle =3D ktime_add(*sleeptime, delta); - } else { - idle =3D *sleeptime; - } + if (ts->idle_active && compute_delta) { + ktime_t delta 
=3D ktime_sub(now, ts->idle_entrytime); + + idle =3D ktime_add(*sleeptime, delta); + } else { + idle =3D *sleeptime; + } + } while (read_seqcount_retry(&ts->idle_sleeptime_seq, seq)); =20 return ktime_to_us(idle); =20 diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h index c6663254d17d..5ed5a9d41d5a 100644 --- a/kernel/time/tick-sched.h +++ b/kernel/time/tick-sched.h @@ -75,6 +75,7 @@ struct tick_sched { ktime_t idle_waketime; =20 /* Idle entry */ + seqcount_t idle_sleeptime_seq; ktime_t idle_entrytime; =20 /* Tick stop */ --=20 2.34.1 From nobody Wed Sep 10 22:04:14 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65CBFC64EC7 for ; Wed, 22 Feb 2023 14:47:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231477AbjBVOrv (ORCPT ); Wed, 22 Feb 2023 09:47:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59818 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232239AbjBVOrf (ORCPT ); Wed, 22 Feb 2023 09:47:35 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A43C63B215 for ; Wed, 22 Feb 2023 06:47:20 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 85E1C6148E for ; Wed, 22 Feb 2023 14:47:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C9F62C4339E; Wed, 22 Feb 2023 14:47:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1677077227; bh=60UVPMKRBBZlhZputsHs6D/GUWH9jHmYFrcL5ldfROQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=freejV5546gLOuur9fYo+33buf40+Z96Mnn9j+UC1Jxp8s5/spdnEbt5zWmtjnFOc QXWQeq1FBg88cz9a7m5Gy36Ckj9xyFCDnW4KgPvsM7XrwYmAjJ1a7+Jsz1LPdA5FdL vH1yWwgm168Z5/4gQlURHesskOnAf4YQ3LQzCQ7F4B/yEKWx9uvjxJ5hMnNE+5xiqB jVMtIzfw+iF9GWl+VU1vnLRyMsgCU3nCDtNcEsoAbNycgCVZSVELl2HPGL7I8nvY/c fPG7t6B3IQLqPDmbHcRJ5RW1slecVqquDxO1pqOc1kSlvbmr2RZWPZQz+zETA6z6NN czPhb1KyzQPPg== From: Frederic Weisbecker To: Thomas Gleixner Cc: LKML , Frederic Weisbecker , Alexey Dobriyan , Wei Li , Peter Zijlstra , Mirsad Goran Todorovac , Yu Liao , Hillf Danton , Ingo Molnar Subject: [PATCH 4/8] timers/nohz: Add a comment about broken iowait counter update race Date: Wed, 22 Feb 2023 15:46:45 +0100 Message-Id: <20230222144649.624380-5-frederic@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230222144649.624380-1-frederic@kernel.org> References: <20230222144649.624380-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

The per-cpu iowait task counter is incremented locally upon sleeping. But since the task can be woken to (and by) another CPU, the counter may then be decremented remotely. This is the source of a race involving readers VS writer of idle/iowait sleeptime.

The following scenario shows an example where a /proc/stat reader observes a pending sleep time as IO whereas that pending sleep time later eventually gets accounted as non-IO.

    CPU 0                        CPU 1                           CPU 2
    -----                        -----                           ------
    //io_schedule() TASK A
    current->in_iowait = 1
    rq(0)->nr_iowait++
    //switch to idle
                                 // READ /proc/stat
                                 // See nr_iowait_cpu(0) == 1
                                 return ts->iowait_sleeptime +
                                        ktime_sub(ktime_get(),
                                                  ts->idle_entrytime)
                                                                 //try_to_wake_up(TASK A)
                                                                 rq(0)->nr_iowait--
    //idle exit
    // See nr_iowait_cpu(0) == 0
    ts->idle_sleeptime += ktime_sub(ktime_get(), ts->idle_entrytime)

As a result subsequent reads on /proc/stat may expose backward progress.

This is unfortunately hardly fixable.
Just add a comment about that condition. Acked-by: Peter Zijlstra (Intel) Cc: Hillf Danton Cc: Yu Liao Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Wei Li Cc: Alexey Dobriyan Cc: Mirsad Goran Todorovac Signed-off-by: Frederic Weisbecker --- kernel/time/tick-sched.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 90d9b7b29875..edd6e9f26d16 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -705,7 +705,10 @@ static u64 get_cpu_sleep_time_us(struct tick_sched *ts= , ktime_t *sleeptime, * counters if NULL. * * Return the cumulative idle time (since boot) for a given - * CPU, in microseconds. + * CPU, in microseconds. Note this is partially broken due to + * the counter of iowait tasks that can be remotely updated without + * any synchronization. Therefore it is possible to observe backward + * values within two consecutive reads. * * This time is measured via accounting rather than sampling, * and is as accurate as ktime_get() is. @@ -728,7 +731,10 @@ EXPORT_SYMBOL_GPL(get_cpu_idle_time_us); * counters if NULL. * * Return the cumulative iowait time (since boot) for a given - * CPU, in microseconds. + * CPU, in microseconds. Note this is partially broken due to + * the counter of iowait tasks that can be remotely updated without + * any synchronization. Therefore it is possible to observe backward + * values within two consecutive reads. * * This time is measured via accounting rather than sampling, * and is as accurate as ktime_get() is. 
--=20 2.34.1 From nobody Wed Sep 10 22:04:14 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 670E9C61DA4 for ; Wed, 22 Feb 2023 14:48:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230363AbjBVOsF (ORCPT ); Wed, 22 Feb 2023 09:48:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59772 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231945AbjBVOrl (ORCPT ); Wed, 22 Feb 2023 09:47:41 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B27E83BD92 for ; Wed, 22 Feb 2023 06:47:22 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2AC9961490 for ; Wed, 22 Feb 2023 14:47:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7259DC4339C; Wed, 22 Feb 2023 14:47:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1677077229; bh=/vEaMsqtOE5BkbGfNS7gZR33SAARhK8GonMdLs7ceS0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bxgURqkQdkUvU9rwsmMvbmQ+NaQ/3H72lC8HzB9tC2wpkHwyjDOJmuPdDA1QjJPZc ogLlOR/no2HPH3zJD3aCVLLkDtZGFcVcdYy3rS+o/Wycq5KwssYIZzyNWKxkp1H6q9 NkyVSxYNBDpvaF40BBjqj5zFfaXJ8/z8ZOZzR1CxsQYcJ5XZTK4y24u5KxQDf50lCL kDi3Gwhx4ntKzy9+AD6K1OFMLGEuLBvX+7TTIZGgQfV4g8ZvgFst7pjGLuKIlvyId/ yPEDf+4C+EpTeiEcGylwcfTsEZ6g5/adZYZmDBsruqLJErd2w/8NBc3kvNQOlAKIXZ LNxnamwUqx4Hg== From: Frederic Weisbecker To: Thomas Gleixner Cc: LKML , Frederic Weisbecker , Alexey Dobriyan , Wei Li , Peter Zijlstra , Mirsad Goran Todorovac , Yu Liao , Hillf Danton , Ingo Molnar Subject: [PATCH 
5/8] timers/nohz: Remove middle-function __tick_nohz_idle_stop_tick() Date: Wed, 22 Feb 2023 15:46:46 +0100 Message-Id: <20230222144649.624380-6-frederic@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230222144649.624380-1-frederic@kernel.org> References: <20230222144649.624380-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" There is no need for the __tick_nohz_idle_stop_tick() function between tick_nohz_idle_stop_tick() and its implementation. Remove that unnecessary step. Acked-by: Peter Zijlstra (Intel) Cc: Hillf Danton Cc: Yu Liao Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Wei Li Cc: Alexey Dobriyan Cc: Mirsad Goran Todorovac Signed-off-by: Frederic Weisbecker --- kernel/time/tick-sched.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index edd6e9f26d16..3b53b894ca98 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -1079,10 +1079,16 @@ static bool can_stop_idle_tick(int cpu, struct tick= _sched *ts) return true; } =20 -static void __tick_nohz_idle_stop_tick(struct tick_sched *ts) +/** + * tick_nohz_idle_stop_tick - stop the idle tick from the idle task + * + * When the next event is more than a tick into the future, stop the idle = tick + */ +void tick_nohz_idle_stop_tick(void) { + struct tick_sched *ts =3D this_cpu_ptr(&tick_cpu_sched); + int cpu =3D smp_processor_id(); ktime_t expires; - int cpu =3D smp_processor_id(); =20 /* * If tick_nohz_get_sleep_length() ran tick_nohz_next_event(), the @@ -1114,16 +1120,6 @@ static void __tick_nohz_idle_stop_tick(struct tick_s= ched *ts) } } =20 -/** - * tick_nohz_idle_stop_tick - stop the idle tick from the idle task - * - * When the next event is more than a tick into the future, stop the idle = tick - */ -void tick_nohz_idle_stop_tick(void) -{ - 
__tick_nohz_idle_stop_tick(this_cpu_ptr(&tick_cpu_sched)); -} - void tick_nohz_idle_retain_tick(void) { tick_nohz_retain_tick(this_cpu_ptr(&tick_cpu_sched)); --=20 2.34.1 From nobody Wed Sep 10 22:04:14 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0F58C61DA4 for ; Wed, 22 Feb 2023 14:48:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231735AbjBVOsJ (ORCPT ); Wed, 22 Feb 2023 09:48:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60114 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232280AbjBVOro (ORCPT ); Wed, 22 Feb 2023 09:47:44 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 45D3F3B3FC for ; Wed, 22 Feb 2023 06:47:24 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id BDBC2B81230 for ; Wed, 22 Feb 2023 14:47:13 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1BD26C4339E; Wed, 22 Feb 2023 14:47:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1677077232; bh=NHv2T4M6t3YcZt1kUoyxtpuC8v5xebDYXn942sZPC1Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VMn+Xg+QFyaDpYO0E6BpyUqeqGmDlgBBMKYEMIqvgN/H98zMYGaPN4Uq0rFnoCsBu tR6bonMsi9+qmL4BdwaYawwEBsOZObdXewrMkBug1CI7+uriwqKPXTyZzs48vIJdfu fAU55CEta+BA6FOXhOW5JOm7VCWcFs526FnrqzrR6OwoLEsd0F4SftzsnequpojC7S wDFZHiPuITun5zbc247ZGd8fIqkQrUBhmrxaCb3kMIFRagIeZkTEj464yA7byI5+m0 9bRHeIJDLkrCyxhO2vZ9AbiBXKw0jESwdJHJxQCJW6GwX4WL3WR9zm8qMx83qoE3yE ot9hFyqNsXFZw== From: Frederic Weisbecker To: Thomas 
Gleixner Cc: LKML , Frederic Weisbecker , Alexey Dobriyan , Wei Li , Peter Zijlstra , Mirsad Goran Todorovac , Yu Liao , Hillf Danton , Ingo Molnar Subject: [PATCH 6/8] MAINTAINERS: Remove stale email address Date: Wed, 22 Feb 2023 15:46:47 +0100 Message-Id: <20230222144649.624380-7-frederic@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230222144649.624380-1-frederic@kernel.org> References: <20230222144649.624380-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Acked-by: Peter Zijlstra (Intel) Cc: Hillf Danton Cc: Yu Liao Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Wei Li Cc: Alexey Dobriyan Cc: Mirsad Goran Todorovac Signed-off-by: Frederic Weisbecker --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index fb1471cb5ed3..300ca61fa0bc 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14794,7 +14794,7 @@ F: include/uapi/linux/nitro_enclaves.h F: samples/nitro_enclaves/ NOHZ, DYNTICKS SUPPORT -M: Frederic Weisbecker +M: Frederic Weisbecker M: Thomas Gleixner M: Ingo Molnar L: linux-kernel@vger.kernel.org -- 2.34.1 From nobody Wed Sep 10 22:04:14 2025
From: Frederic Weisbecker To: Thomas Gleixner Cc: LKML , Frederic Weisbecker , Alexey Dobriyan , Wei Li , Peter Zijlstra , Mirsad Goran Todorovac , Yu Liao , Hillf Danton , Ingo Molnar Subject: [PATCH 7/8] selftests/proc: Remove idle time monotonicity assertions Date: Wed, 22 Feb 2023 15:46:48 +0100 Message-Id: <20230222144649.624380-8-frederic@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230222144649.624380-1-frederic@kernel.org> References: <20230222144649.624380-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Due to broken iowait task counting design (cf: comments above get_cpu_idle_time_us() and nr_iowait()), it is not possible to provide the guarantee that /proc/stat or /proc/uptime display monotonic idle time values.
Remove the assertions that verify the related wrong assumption so that testers and maintainers don't spend more time on that. Reported-by: Yu Liao Reported-by: Thomas Gleixner Cc: Hillf Danton Cc: Ingo Molnar Cc: Wei Li Cc: Alexey Dobriyan Cc: Mirsad Goran Todorovac Cc: Peter Zijlstra Signed-off-by: Frederic Weisbecker --- tools/testing/selftests/proc/proc-uptime-001.c | 12 ++++++------ tools/testing/selftests/proc/proc-uptime-002.c | 13 ++++++------- tools/testing/selftests/proc/proc-uptime.h | 16 ++-------------- 3 files changed, 14 insertions(+), 27 deletions(-) diff --git a/tools/testing/selftests/proc/proc-uptime-001.c b/tools/testing/selftests/proc/proc-uptime-001.c index 781f7a50fc3f..35bddd9dd60b 100644 --- a/tools/testing/selftests/proc/proc-uptime-001.c +++ b/tools/testing/selftests/proc/proc-uptime-001.c @@ -13,7 +13,9 @@ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ -// Test that values in /proc/uptime increment monotonically. +// Test that boottime value in /proc/uptime increments monotonically.
+// We don't test idle time monotonicity due to broken iowait task +// counting, cf: comment above get_cpu_idle_time_us() #undef NDEBUG #include #include @@ -25,20 +27,18 @@ int main(void) { - uint64_t start, u0, u1, i0, i1; + uint64_t start, u0, u1; int fd; fd = open("/proc/uptime", O_RDONLY); assert(fd >= 0); - proc_uptime(fd, &u0, &i0); + u0 = proc_uptime(fd); start = u0; do { - proc_uptime(fd, &u1, &i1); + u1 = proc_uptime(fd); assert(u1 >= u0); - assert(i1 >= i0); u0 = u1; - i0 = i1; } while (u1 - start < 100); return 0; diff --git a/tools/testing/selftests/proc/proc-uptime-002.c b/tools/testing/selftests/proc/proc-uptime-002.c index 7d0aa22bdc12..7ad79d5eaa84 100644 --- a/tools/testing/selftests/proc/proc-uptime-002.c +++ b/tools/testing/selftests/proc/proc-uptime-002.c @@ -13,8 +13,9 @@ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ -// Test that values in /proc/uptime increment monotonically -// while shifting across CPUs. +// Test that boottime value in /proc/uptime increments monotonically +// while shifting across CPUs.
We don't test idle time monotonicity +// due to broken iowait task counting, cf: comment above get_cpu_idle_time_us() #undef NDEBUG #include #include @@ -45,7 +46,7 @@ int main(void) unsigned int len; unsigned long *m; unsigned int cpu; - uint64_t u0, u1, i0, i1; + uint64_t u0, u1; int fd; /* find out "nr_cpu_ids" */ @@ -60,7 +61,7 @@ int main(void) fd = open("/proc/uptime", O_RDONLY); assert(fd >= 0); - proc_uptime(fd, &u0, &i0); + u0 = proc_uptime(fd); for (cpu = 0; cpu < len * 8; cpu++) { memset(m, 0, len); m[cpu / (8 * sizeof(unsigned long))] |= 1UL << (cpu % (8 * sizeof(unsigned long))); @@ -68,11 +69,9 @@ int main(void) /* CPU might not exist, ignore error */ sys_sched_setaffinity(0, len, m); - proc_uptime(fd, &u1, &i1); + u1 = proc_uptime(fd); assert(u1 >= u0); - assert(i1 >= i0); u0 = u1; - i0 = i1; } return 0; diff --git a/tools/testing/selftests/proc/proc-uptime.h b/tools/testing/selftests/proc/proc-uptime.h index dc6a42b1d6b0..ca55abeb0ccc 100644 --- a/tools/testing/selftests/proc/proc-uptime.h +++ b/tools/testing/selftests/proc/proc-uptime.h @@ -22,7 +22,7 @@ #include "proc.h" -static void proc_uptime(int fd, uint64_t *uptime, uint64_t *idle) +static uint64_t proc_uptime(int fd) { uint64_t val1, val2; char buf[64], *p; @@ -43,18 +43,6 @@ static void proc_uptime(int fd, uint64_t *idle) assert(p[3] == ' '); val2 = (p[1] - '0') * 10 + p[2] - '0'; - *uptime = val1 * 100 + val2; - p += 4; - - val1 = xstrtoull(p, &p); - assert(p[0] == '.'); - assert('0' <= p[1] && p[1] <= '9'); - assert('0' <= p[2] && p[2] <= '9'); - assert(p[3] == '\n'); - - val2 = (p[1] - '0') * 10 + p[2] - '0'; - *idle = val1 * 100 + val2; - - assert(p + 4 == buf + rv); + return val1 * 100 + val2; } -- 2.34.1 From nobody Wed Sep 10 22:04:14 2025
From: Frederic Weisbecker To: Thomas Gleixner Cc: LKML , Frederic Weisbecker , Alexey Dobriyan , Wei Li , Peter Zijlstra , Mirsad Goran Todorovac , Yu Liao , Hillf Danton , Ingo Molnar Subject: [PATCH 8/8] selftests/proc: Assert clock_gettime(CLOCK_BOOTTIME) VS /proc/uptime monotonicity Date: Wed, 22 Feb 2023 15:46:49 +0100 Message-Id:
<20230222144649.624380-9-frederic@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230222144649.624380-1-frederic@kernel.org> References: <20230222144649.624380-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The first field of /proc/uptime relies on the CLOCK_BOOTTIME clock, which can also be fetched through the clock_gettime() API. Improve the test coverage by also verifying the monotonicity of CLOCK_BOOTTIME across both interfaces. Suggested-by: Thomas Gleixner Cc: Yu Liao Cc: Hillf Danton Cc: Ingo Molnar Cc: Wei Li Cc: Alexey Dobriyan Cc: Mirsad Goran Todorovac Cc: Peter Zijlstra Signed-off-by: Frederic Weisbecker --- .../testing/selftests/proc/proc-uptime-001.c | 21 ++++++++++++++---- .../testing/selftests/proc/proc-uptime-002.c | 22 +++++++++++++++---- tools/testing/selftests/proc/proc-uptime.h | 12 ++++++++++ 3 files changed, 47 insertions(+), 8 deletions(-) diff --git a/tools/testing/selftests/proc/proc-uptime-001.c b/tools/testing/selftests/proc/proc-uptime-001.c index 35bddd9dd60b..f335eec5067e 100644 --- a/tools/testing/selftests/proc/proc-uptime-001.c +++ b/tools/testing/selftests/proc/proc-uptime-001.c @@ -13,9 +13,9 @@ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ -// Test that boottime value in /proc/uptime increments monotonically. -// We don't test idle time monotonicity due to broken iowait task -// counting, cf: comment above get_cpu_idle_time_us() +// Test that boottime value in /proc/uptime and CLOCK_BOOTTIME increment +// monotonically.
We don't test idle time monotonicity due to broken iowait +// task counting, cf: comment above get_cpu_idle_time_us() #undef NDEBUG #include #include @@ -27,7 +27,7 @@ int main(void) { - uint64_t start, u0, u1; + uint64_t start, u0, u1, c0, c1; int fd; fd = open("/proc/uptime", O_RDONLY); @@ -35,10 +35,23 @@ int main(void) u0 = proc_uptime(fd); start = u0; + c0 = clock_boottime(); + do { u1 = proc_uptime(fd); + c1 = clock_boottime(); + + /* Is /proc/uptime monotonic ? */ assert(u1 >= u0); + + /* Is CLOCK_BOOTTIME monotonic ? */ + assert(c1 >= c0); + + /* Is CLOCK_BOOTTIME VS /proc/uptime monotonic ? */ + assert(c0 >= u0); + u0 = u1; + c0 = c1; } while (u1 - start < 100); return 0; diff --git a/tools/testing/selftests/proc/proc-uptime-002.c b/tools/testing/selftests/proc/proc-uptime-002.c index 7ad79d5eaa84..ae453daa96c1 100644 --- a/tools/testing/selftests/proc/proc-uptime-002.c +++ b/tools/testing/selftests/proc/proc-uptime-002.c @@ -13,9 +13,10 @@ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ -// Test that boottime value in /proc/uptime increments monotonically -// while shifting across CPUs. We don't test idle time monotonicity -// due to broken iowait task counting, cf: comment above get_cpu_idle_time_us() +// Test that boottime value in /proc/uptime and CLOCK_BOOTTIME increment +// monotonically while shifting across CPUs.
We don't test idle time +// monotonicity due to broken iowait task counting, cf: comment above +// get_cpu_idle_time_us() #undef NDEBUG #include #include @@ -43,10 +44,10 @@ static inline int sys_sched_setaffinity(pid_t pid, unsigned int len, unsigned lo int main(void) { + uint64_t u0, u1, c0, c1; unsigned int len; unsigned long *m; unsigned int cpu; - uint64_t u0, u1; int fd; /* find out "nr_cpu_ids" */ @@ -62,6 +63,8 @@ int main(void) assert(fd >= 0); u0 = proc_uptime(fd); + c0 = clock_boottime(); + for (cpu = 0; cpu < len * 8; cpu++) { memset(m, 0, len); m[cpu / (8 * sizeof(unsigned long))] |= 1UL << (cpu % (8 * sizeof(unsigned long))); @@ -70,8 +73,19 @@ int main(void) sys_sched_setaffinity(0, len, m); u1 = proc_uptime(fd); + c1 = clock_boottime(); + + /* Is /proc/uptime monotonic ? */ assert(u1 >= u0); + + /* Is CLOCK_BOOTTIME monotonic ? */ + assert(c1 >= c0); + + /* Is CLOCK_BOOTTIME VS /proc/uptime monotonic ? */ + assert(c0 >= u0); + u0 = u1; + c0 = c1; } return 0; diff --git a/tools/testing/selftests/proc/proc-uptime.h b/tools/testing/selftests/proc/proc-uptime.h index ca55abeb0ccc..730cce4a3d73 100644 --- a/tools/testing/selftests/proc/proc-uptime.h +++ b/tools/testing/selftests/proc/proc-uptime.h @@ -19,9 +19,21 @@ #include #include #include +#include #include "proc.h" +static uint64_t clock_boottime(void) +{ + struct timespec ts; + int err; + + err = clock_gettime(CLOCK_BOOTTIME, &ts); + assert(err >= 0); + + return (ts.tv_sec * 100) + (ts.tv_nsec / 10000000); +} + static uint64_t proc_uptime(int fd) { uint64_t val1, val2; -- 2.34.1