From nobody Sun Oct 5 20:11:54 2025 Received: from mail-pf1-f201.google.com (mail-pf1-f201.google.com [209.85.210.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5BE5B185E7F for ; Wed, 30 Jul 2025 02:24:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753842248; cv=none; b=F2AwBYE29M+BM9tvpkXuIJchs3l1PnWuF1SLP/O5oiEwXtWSieTKT+RFe+CaZeQa0LbO6ewir0FO0GRVGIMvW3qc882byh87Jih3UG90nny/3kWxpNWl43ssIJ4kU8dGBCsQ+UCfBIK9UJ7HjeLigITE1fC77/YWPOppV536d+g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753842248; c=relaxed/simple; bh=nS5WfiJyDN++X6i4GiHHq5GyaBNmUkSAVJNYFeAMNTI=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Wc+hdUYJDXuKDE49TJAx2dToZGG2PV6sCWx4NeJLk9TBbcyRVhKuZRqplQMJsYnZlTyVLx6cVX0F/QfKbu9MaG5SP8Q/BBddXPWrKrG1Kh47vXEhalLwXhPpjCd/QlU+z/x8rcG3Rb7C23EHJKCKjULdWYnSSrRqw8Gu0i2eaCM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--yuzhuo.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=3BS8FYHY; arc=none smtp.client-ip=209.85.210.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--yuzhuo.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="3BS8FYHY" Received: by mail-pf1-f201.google.com with SMTP id d2e1a72fcca58-74b29ee4f8bso5756184b3a.2 for ; Tue, 29 Jul 2025 19:24:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1753842246; x=1754447046; 
darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=YSnVBVITeOzZAfMv7RC5y4cK3Twu/w5tupUEloHTubA=; b=3BS8FYHYNA83lJiFLyPLKNW0+Mld0mPVXquHzfZrAhRG06kpILinyxgx3rthTsoyB1 W2EKRFBhTMfwNsQEbGA2LCuRJp+EaDNUB22gEfP7C9wV8QoEQ7sFLUe3j/FMufvNDgBM S6a5QnY63EhlmVkrkVAaZC/mIesAdcLE6cX6JuSJOrTqHkyV3eZr1M+LfMcNno4cTmMX OyxygtLuwEG7Yv/UrhfDqnJMaRokTaUtT1SBLYnbZDNjlpWpeRuxc44gwfDpRK0pctCs Daibxpn6vFjBtm2zBJuNeWWRiuQuhes9cIt5cP5PbGH63UlXMGWJWLDeibKV4Pz7gf8P C04g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753842246; x=1754447046; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=YSnVBVITeOzZAfMv7RC5y4cK3Twu/w5tupUEloHTubA=; b=bIfnpPO1OXddHlwkaATz54K0wZH9w+uMPaaF2zx7drVo705TAy9BKWUN9hiMwA2bfB prwrLfAUT0tf06dqMq2ve6ILHedURk1cd6Y4qzWEOZQPm/mERYnS7o5y6jmsq0TfzFVq 7MQRCj5PpKqGJtgSlDlzrU2JwaVApcnrgiGsmJsuoKL+TRBGvzplys8b/W3Y0Mc2+0rC jfF5iP7BMWmGIZsIFs2SaUySK2x5wHOAUjDUMIEjEfaJjsKmjiflgh3zIeYOchls7rQf jrjj8VF9ZEw1PmlKH5CXM/tk7MMWV/awByq98a+9DNzLTB2W/Mg0VVAEp7Ueh8/LAOMu SdSw== X-Forwarded-Encrypted: i=1; AJvYcCUz8xHh1s/cOLBmtoQpForT0V5mKvI74jlLbJn7G6Jt4egMr/+UJ8e+apmMfC2kJisvgqFnGr5Y7/b9n74=@vger.kernel.org X-Gm-Message-State: AOJu0Ywh1Z2K9s+CYIwwnMXTgwhYh2QNWM6ijP5wIyvKHwkUgRqSCwEt wTRE+W353KrwnrSmJF8tjrcbL4b3PR0+G7E0dXJysjdeK+dcAGv0qdMN0aiNuX5T2voGYYmIDDY +5wryXQ== X-Google-Smtp-Source: AGHT+IHoQsKozsPlh4E8P2d5oMA0zeGN4kDg69cefdHKKRDMh6Tt1jgzcPFhnMiLARA1JMxVUy4YUoynsJM= X-Received: from pfbic11.prod.google.com ([2002:a05:6a00:8a0b:b0:756:c6cf:2462]) (user=yuzhuo job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:3d0c:b0:234:cd25:735 with SMTP id adf61e73a8af0-23dc0e967afmr2202881637.38.1753842245632; Tue, 29 Jul 2025 19:24:05 -0700 (PDT) Date: Tue, 29 Jul 2025 19:23:44 -0700 In-Reply-To: <20250730022347.71722-1-yuzhuo@google.com> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250730022347.71722-1-yuzhuo@google.com> X-Mailer: git-send-email 2.50.1.552.g942d659e1b-goog Message-ID: <20250730022347.71722-2-yuzhuo@google.com> Subject: [PATCH v1 1/4] rcuscale: Create debugfs file for writer durations From: Yuzhuo Jing To: Ian Rogers , Yuzhuo Jing , Jonathan Corbet , Davidlohr Bueso , "Paul E . McKenney" , Josh Triplett , Frederic Weisbecker , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Uladzislau Rezki , Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Zqiang , Andrew Morton , Ingo Molnar , Borislav Petkov , Arnd Bergmann , Frank van der Linden , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org Cc: Yuzhuo Jing Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Create an "rcuscale" directory in debugfs containing a "writer_durations" file. The file is in CSV format: each line is one duration record, with columns defined as:

  writer_id,duration

Add an option "writer_no_print" to skip printing writer durations on cleanup. The debugfs file lets external tools read structured data, and skipping the printout drastically reduces cleanup time on machines with large core counts.
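For illustration only (this is not part of the patch), an external tool might consume the CSV roughly as sketched below. The struct and function names are hypothetical, and the duration unit is assumed to be nanoseconds because the values are computed from ktime_get_mono_fast_ns().

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One record from writer_durations: "writer_id,duration" (hypothetical type). */
struct duration_record {
	long long writer_id;
	uint64_t duration;	/* assumed to be nanoseconds */
};

/* Parse one CSV line; return 0 on success, -1 on malformed input. */
static int parse_duration_line(const char *line, struct duration_record *rec)
{
	long long id;
	uint64_t dur;

	if (sscanf(line, "%lld,%" SCNu64, &id, &dur) != 2)
		return -1;
	rec->writer_id = id;
	rec->duration = dur;
	return 0;
}

/*
 * Sum the durations of all well-formed lines read from fp and store the
 * number of records in *n. Malformed lines are skipped.
 */
static uint64_t sum_durations(FILE *fp, size_t *n)
{
	char line[64];
	struct duration_record rec;
	uint64_t total = 0;

	*n = 0;
	while (fgets(line, sizeof(line), fp)) {
		if (parse_duration_line(line, &rec))
			continue;
		total += rec.duration;
		(*n)++;
	}
	return total;
}
```

A caller would fopen() the debugfs file (or a saved copy such as durations.csv) once the test has completed and pass the stream to sum_durations(); dividing the total by the record count gives a mean duration.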
On a 256C/512T machine running nreaders=3D1 nwriters=3D511:

Before:
$ time modprobe -r rcuscale; modprobe -r torture
real 3m17.349s
user 0m0.000s
sys 3m15.288s

After:
$ time cat /sys/kernel/debug/rcuscale/writer_durations > durations.csv
real 0m0.005s
user 0m0.000s
sys 0m0.005s
$ time modprobe -r rcuscale; modprobe -r torture
real 0m0.388s
user 0m0.000s
sys 0m0.335s

Signed-off-by: Yuzhuo Jing
---
 .../admin-guide/kernel-parameters.txt | 5 + kernel/rcu/rcuscale.c | 142 +++++++++++++++++- 2 files changed, 139 insertions(+), 8 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentatio= n/admin-guide/kernel-parameters.txt index f1f2c0874da9..7b62a84a19d4 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5583,6 +5583,11 @@ periods, but in jiffies. The default of zero says no holdoff. =20 + rcuscale.writer_no_print=3D [KNL] + Do not print writer durations to kernel ring buffer. + Instead, users can read them from the + rcuscale/writer_durations file in debugfs. + rcutorture.fqs_duration=3D [KNL] Set duration of force_quiescent_state bursts in microseconds. 
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c index b521d0455992..ad10b42be6fc 100644 --- a/kernel/rcu/rcuscale.c +++ b/kernel/rcu/rcuscale.c @@ -40,6 +40,8 @@ #include #include #include +#include +#include =20 #include "rcu.h" =20 @@ -97,6 +99,7 @@ torture_param(bool, shutdown, RCUSCALE_SHUTDOWN, torture_param(int, verbose, 1, "Enable verbose debugging printk()s"); torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to d= isable"); torture_param(int, writer_holdoff_jiffies, 0, "Holdoff (jiffies) between G= Ps, zero to disable"); +torture_param(bool, writer_no_print, false, "Do not print writer durations= to ring buffer"); torture_param(int, kfree_rcu_test, 0, "Do we run a kfree_rcu() scale test?= "); torture_param(int, kfree_mult, 1, "Multiple of kfree_obj size to allocate.= "); torture_param(int, kfree_by_call_rcu, 0, "Use call_rcu() to emulate kfree_= rcu()?"); @@ -138,6 +141,9 @@ static u64 t_rcu_scale_writer_finished; static unsigned long b_rcu_gp_test_started; static unsigned long b_rcu_gp_test_finished; =20 +static struct dentry *debugfs_dir; +static struct dentry *debugfs_writer_durations; + #define MAX_MEAS 10000 #define MIN_MEAS 100 =20 @@ -607,6 +613,7 @@ rcu_scale_writer(void *arg) t =3D ktime_get_mono_fast_ns(); *wdp =3D t - *wdp; i_max =3D i; + writer_n_durations[me] =3D i_max + 1; if (!started && atomic_read(&n_rcu_scale_writer_started) >=3D nrealwriters) started =3D true; @@ -620,6 +627,7 @@ rcu_scale_writer(void *arg) nrealwriters) { schedule_timeout_interruptible(10); rcu_ftrace_dump(DUMP_ALL); + WRITE_ONCE(test_complete, true); SCALEOUT_STRING("Test complete"); t_rcu_scale_writer_finished =3D t; if (gp_exp) { @@ -666,7 +674,6 @@ rcu_scale_writer(void *arg) rcu_scale_free(wmbp); cur_ops->gp_barrier(); } - writer_n_durations[me] =3D i_max + 1; torture_kthread_stopping("rcu_scale_writer"); return 0; } @@ -941,6 +948,117 @@ kfree_scale_init(void) return firsterr; } =20 +/* + * A seq_file for writer_durations. 
Content is only visible when all writ= ers + * finish. Element i of the sequence is writer_durations + i. + */ +static void *writer_durations_start(struct seq_file *m, loff_t *pos) +{ + loff_t writer_id =3D *pos; + + if (!test_complete || writer_id < 0 || writer_id >=3D nrealwriters) + return NULL; + + return writer_durations + writer_id; +} + +static void *writer_durations_next(struct seq_file *m, void *v, loff_t *po= s) +{ + (*pos)++; + return writer_durations_start(m, pos); +} + +static void writer_durations_stop(struct seq_file *m, void *v) +{ +} + +/* + * Each element in the seq_file is an array of one writer's durations. + * Each element prints writer_n_durations[writer_id] lines, and each line + * contains one duration record, in CSV format: + * writer_id,duration + */ +static int writer_durations_show(struct seq_file *m, void *v) +{ + u64 **durations =3D v; + loff_t writer_id =3D durations - writer_durations; + + for (int i =3D 0; i < writer_n_durations[writer_id]; ++i) + seq_printf(m, "%lld,%llu\n", writer_id, durations[0][i]); + + return 0; +} + +static const struct seq_operations writer_durations_op =3D { + .start =3D writer_durations_start, + .next =3D writer_durations_next, + .stop =3D writer_durations_stop, + .show =3D writer_durations_show +}; + +static int writer_durations_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &writer_durations_op); +} + +static const struct file_operations writer_durations_fops =3D { + .owner =3D THIS_MODULE, + .open =3D writer_durations_open, + .read =3D seq_read, + .llseek =3D seq_lseek, + .release =3D seq_release, +}; + +/* + * Create an rcuscale directory exposing run states and results. 
+ */ +static int register_debugfs(void) +{ +#define try_create_file(variable, name, mode, parent, data, fops) \ +({ \ + variable =3D debugfs_create_file((name), (mode), (parent), (data), (fops)= ); \ + err =3D PTR_ERR_OR_ZERO(variable); \ + err; \ +}) + + int err; + + debugfs_dir =3D debugfs_create_dir("rcuscale", NULL); + err =3D PTR_ERR_OR_ZERO(debugfs_dir); + if (err) + goto fail; + + if (try_create_file(debugfs_writer_durations, "writer_durations", 0444, + debugfs_dir, NULL, &writer_durations_fops)) + goto fail; + + return 0; +fail: + pr_err("rcu-scale: Failed to create debugfs file."); + /* unregister_debugfs is called by rcu_scale_cleanup, avoid + * calling it twice. + */ + return err; +#undef try_create_file +} + +static void unregister_debugfs(void) +{ +#define try_remove(variable) \ +do { \ + if (!IS_ERR_OR_NULL(variable)) \ + debugfs_remove(variable); \ + variable =3D NULL; \ +} while (0) + + try_remove(debugfs_writer_durations); + + /* Remove directory after files. */ + try_remove(debugfs_dir); + +#undef try_remove +} + static void rcu_scale_cleanup(void) { @@ -961,6 +1079,8 @@ rcu_scale_cleanup(void) if (gp_exp && gp_async) SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!"); =20 + unregister_debugfs(); + // If built-in, just report all of the GP kthread's CPU time. 
if (IS_BUILTIN(CONFIG_RCU_SCALE_TEST) && !kthread_tp && cur_ops->rso_gp_k= thread) kthread_tp =3D cur_ops->rso_gp_kthread(); @@ -1020,13 +1140,15 @@ rcu_scale_cleanup(void) wdpp =3D writer_durations[i]; if (!wdpp) continue; - for (j =3D 0; j < writer_n_durations[i]; j++) { - wdp =3D &wdpp[j]; - pr_alert("%s%s %4d writer-duration: %5d %llu\n", - scale_type, SCALE_FLAG, - i, j, *wdp); - if (j % 100 =3D=3D 0) - schedule_timeout_uninterruptible(1); + if (!writer_no_print) { + for (j =3D 0; j < writer_n_durations[i]; j++) { + wdp =3D &wdpp[j]; + pr_alert("%s%s %4d writer-duration: %5d %llu\n", + scale_type, SCALE_FLAG, + i, j, *wdp); + if (j % 100 =3D=3D 0) + schedule_timeout_uninterruptible(1); + } } kfree(writer_durations[i]); if (writer_freelists) { @@ -1202,6 +1324,10 @@ rcu_scale_init(void) if (torture_init_error(firsterr)) goto unwind; } + + if (register_debugfs()) + goto unwind; + torture_init_end(); return 0; =20 --=20 2.50.1.552.g942d659e1b-goog From nobody Sun Oct 5 20:11:54 2025 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A4F601D54EE for ; Wed, 30 Jul 2025 02:24:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753842250; cv=none; b=fPGKqeF696UkO++i7qL96xFjMrHBdzNxisED5l/L6rKWNpIXxBjh8u70XyTLU1bO/8gz+mJ/pnYxumopcHOcPmnHolbdbge8CIlNoYa2jcC6lirF6HCVmLFnGUMIuqIin0c4fgnwD4to0+nbCr7fPpEVg7bFyCOiT3i+h29/6u4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753842250; c=relaxed/simple; bh=9Ujt5KaGB2d5xYGZWoI/rd724qikJUaU1Yrwkr7rzPM=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; 
b=B39KlCHu5b+QENKoBsFxVYSe0Vybo8SoPf++jHb4JYxd3rN58EbzP/ZCGLyOX+hEuwc0GgBdgU8wwcpLmj3apUGd23inzU2ONXBT5rklOQQhLK3sWCvRrFaPMhIITUspR36HEluj1xzz29H6MDBFImAwkAGS4zyxo2LIZ985JeI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--yuzhuo.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=hXR7Pbje; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--yuzhuo.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="hXR7Pbje" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-2369dd58602so66631625ad.1 for ; Tue, 29 Jul 2025 19:24:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1753842248; x=1754447048; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=db7w+qlj3x6A30qhGk3svrDuPx+Id3qEqe7Z1wrV5Fc=; b=hXR7Pbjevb5yl4z410JO/4uNbP4zs2ic5lmmZvYQdEhYd4xdM6Ay3PoBi2TKgzyMhT 7wJPpzjoLuse61ta58OwtGP6rwmxdDKK6qiBKKSLO0P+HRYnv3aq1JK+AxWYP0kvVWex Qzqm66JxUz3GdVICx1VfinwkwB+fi4rBnH4yBOlnzYDIinDRHHCl8wlTduORvrhk7RQJ ct8MwHk4ZQ4SJ7LXJwlnqWel/BGYMzaYIdRjrP2O0AaYd0zuiYK14egq/hDbvdECi0/f hjAApxTyp0DFGg714R/eeIZAaXHq+prSivwtmGpKJ/y2gEph5Nn4FHpBWKYdycL1HoIb TJcg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753842248; x=1754447048; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=db7w+qlj3x6A30qhGk3svrDuPx+Id3qEqe7Z1wrV5Fc=; b=Jzv0TYhnei93Y3tXwl3AF4dfGY3ldYIgXLHDieMgF0+lpdlYMy2729ZqGKKgcbP4+n 
3TVQqHIH7r/0mlv0xxWFlkEbbX8ijZYastLzbR9mS55O8uuiO+iauOOaYoJz+ZrOBfjX aEOyUl98KdrS8wdh6s4W0CAMBFuN7W4ByHBk0S02WrGbg1tKUUQZF22WKeqKLrC9387I 1HR/ndH2rO0RtMiZq4V+aUcol9fO/dIOFln4RsSWSnStQKHpsjNadduMccPb+aJ2q1R/ wWn1SLp/L19N73fYUW8HeVj0cCODbBx0NhqgHC7516UOMzbZ2u25AyVc/Km8LGGOI6fR 83cg== X-Forwarded-Encrypted: i=1; AJvYcCXpXi/DzeASM1j9uf79+FjItTbMekyAJKQaKQZGfmdD4DrcddIoVaC6fec//OQSB7w6pvw4h61HFSCEsAg=@vger.kernel.org X-Gm-Message-State: AOJu0YzNW8SWSskxneYw6417eOclQW1shI6bKEDcb2Fspc0W3KkLVHLd /8X/roNQgY4dO6g6U8bsQdnkpsU/pPtCs+FkDJ01acbKH4UPyOkLQK4QuN0HhU5YhE4hzslmZmi k124iAA== X-Google-Smtp-Source: AGHT+IHhLv6xFPeOTg+DJUtO46sNQgxdk80iCLDRyCDBD5ZkpZjf8Is9Cy4z221p/h5t/aI4fA2UkVY4hWU= X-Received: from plhk16.prod.google.com ([2002:a17:902:d590:b0:234:bca4:b7b3]) (user=yuzhuo job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:f790:b0:240:3239:21c7 with SMTP id d9443c01a7336-24096ba6594mr16406855ad.37.1753842247994; Tue, 29 Jul 2025 19:24:07 -0700 (PDT) Date: Tue, 29 Jul 2025 19:23:45 -0700 In-Reply-To: <20250730022347.71722-1-yuzhuo@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250730022347.71722-1-yuzhuo@google.com> X-Mailer: git-send-email 2.50.1.552.g942d659e1b-goog Message-ID: <20250730022347.71722-3-yuzhuo@google.com> Subject: [PATCH v1 2/4] rcuscale: Create debugfs files for worker thread PIDs From: Yuzhuo Jing To: Ian Rogers , Yuzhuo Jing , Jonathan Corbet , Davidlohr Bueso , "Paul E . 
McKenney" , Josh Triplett , Frederic Weisbecker , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Uladzislau Rezki , Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Zqiang , Andrew Morton , Ingo Molnar , Borislav Petkov , Arnd Bergmann , Frank van der Linden , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org Cc: Yuzhuo Jing Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Creates {reader,writer,kfree}_tasks files in the "rcuscale" debugfs folder. Each line contains one kernel thread PID. This provides a more robust way for external performance analysis tools to attach to kernel threads than using pgrep. Signed-off-by: Yuzhuo Jing --- kernel/rcu/rcuscale.c | 124 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 124 insertions(+) diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c index ad10b42be6fc..7c88d461ed2c 100644 --- a/kernel/rcu/rcuscale.c +++ b/kernel/rcu/rcuscale.c @@ -143,6 +143,9 @@ static unsigned long b_rcu_gp_test_finished; =20 static struct dentry *debugfs_dir; static struct dentry *debugfs_writer_durations; +static struct dentry *debugfs_reader_tasks; +static struct dentry *debugfs_writer_tasks; +static struct dentry *debugfs_kfree_tasks; =20 #define MAX_MEAS 10000 #define MIN_MEAS 100 @@ -1009,6 +1012,112 @@ static const struct file_operations writer_duration= s_fops =3D { .release =3D seq_release, }; =20 +/* + * Generic seq_file private data for tasks walkthrough. + */ +struct debugfs_pid_info { + int ntasks; + struct task_struct **tasks; +}; + +/* + * Generic seq_file pos to pointer conversion function, using private data + * of type debugfs_pid_info, and ensure it is within bound. 
+ */ +static void *debugfs_pid_start(struct seq_file *m, loff_t *pos) +{ + loff_t worker =3D *pos; + struct debugfs_pid_info *info =3D m->private; + + if (worker < 0 || worker >=3D info->ntasks) + return NULL; + + return info->tasks[worker]; +} + +static void *debugfs_pid_next(struct seq_file *m, void *v, loff_t *pos) +{ + (*pos)++; + return debugfs_pid_start(m, pos); +} + +/* + * Each line of the file contains one PID from the selected kernel threads. + */ +static int debugfs_pid_show(struct seq_file *m, void *v) +{ + seq_printf(m, "%d\n", ((struct task_struct *)v)->pid); + return 0; +} + +static void debugfs_pid_stop(struct seq_file *m, void *v) +{ +} + +static const struct seq_operations debugfs_pid_fops =3D { + .start =3D debugfs_pid_start, + .next =3D debugfs_pid_next, + .stop =3D debugfs_pid_stop, + .show =3D debugfs_pid_show +}; + +/* + * Generic seq_file creation function that sets private data of type + * debugfs_pid_info. + */ +static int debugfs_pid_open_info(struct inode *inode, struct file *file, + int ntasks, struct task_struct **tasks) +{ + struct debugfs_pid_info *info =3D + __seq_open_private(file, &debugfs_pid_fops, sizeof(*info)); + if (!info) + return -ENOMEM; + + info->ntasks =3D ntasks; + info->tasks =3D tasks; + + return 0; +} + +static int debugfs_pid_open_reader(struct inode *inode, struct file *file) +{ + return debugfs_pid_open_info(inode, file, nrealreaders, reader_tasks); +} + +static int debugfs_pid_open_writer(struct inode *inode, struct file *file) +{ + return debugfs_pid_open_info(inode, file, nrealwriters, writer_tasks); +} + +static int debugfs_pid_open_kfree(struct inode *inode, struct file *file) +{ + return debugfs_pid_open_info(inode, file, kfree_nrealthreads, kfree_reade= r_tasks); +} + +static const struct file_operations readers_fops =3D { + .owner =3D THIS_MODULE, + .open =3D debugfs_pid_open_reader, + .read =3D seq_read, + .llseek =3D seq_lseek, + .release =3D seq_release, +}; + +static const struct file_operations 
writers_fops =3D { + .owner =3D THIS_MODULE, + .open =3D debugfs_pid_open_writer, + .read =3D seq_read, + .llseek =3D seq_lseek, + .release =3D seq_release, +}; + +static const struct file_operations kfrees_fops =3D { + .owner =3D THIS_MODULE, + .open =3D debugfs_pid_open_kfree, + .read =3D seq_read, + .llseek =3D seq_lseek, + .release =3D seq_release, +}; + /* * Create an rcuscale directory exposing run states and results. */ @@ -1032,6 +1141,18 @@ static int register_debugfs(void) debugfs_dir, NULL, &writer_durations_fops)) goto fail; =20 + if (try_create_file(debugfs_reader_tasks, "reader_tasks", 0444, + debugfs_dir, NULL, &readers_fops)) + goto fail; + + if (try_create_file(debugfs_writer_tasks, "writer_tasks", 0444, + debugfs_dir, NULL, &writers_fops)) + goto fail; + + if (try_create_file(debugfs_kfree_tasks, "kfree_tasks", 0444, + debugfs_dir, NULL, &kfrees_fops)) + goto fail; + return 0; fail: pr_err("rcu-scale: Failed to create debugfs file."); @@ -1052,6 +1173,9 @@ do { \ } while (0) =20 try_remove(debugfs_writer_durations); + try_remove(debugfs_reader_tasks); + try_remove(debugfs_writer_tasks); + try_remove(debugfs_kfree_tasks); =20 /* Remove directory after files. 
*/ try_remove(debugfs_dir); --=20 2.50.1.552.g942d659e1b-goog From nobody Sun Oct 5 20:11:54 2025 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 22D651DE8A8 for ; Wed, 30 Jul 2025 02:24:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753842251; cv=none; b=TLwQsZjKisPMZ3MqV3tVyul1LMbhdnRmj0oWPETV7O4tdmyfeUWLO0YAxARi1BdHPZkD3Wgq/IfOER1bf1swm8SERccqTTVk2Oxu0eZA8zRmXGKkV+5Tj6pAubom9VGw5A8+G3dSgOUhGug1ImFw+H+O+dcIaDndGBOL4sUT0+8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753842251; c=relaxed/simple; bh=zoHPkwBwGtruOmM0x6EZYoZDO8zKEm7M8glLRU+YiOs=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=B+jL6xxzIyW5eYZ8MUWurs05aplXvdMINUmTCKeS9ogxpWiqlYQfSRkl45U2rddljRnhANiKK7whNzMN2n80T0bBSKZNcbW+RvaRazcPGw/70u8tP3zCV+UQHqEOGov9qRNN2sax04VI1BZu6LnY5PIQo6RFDwuwXZ5tBecskKg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--yuzhuo.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=YC2oReaB; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--yuzhuo.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="YC2oReaB" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-31327b2f8e4so6153124a91.1 for ; Tue, 29 Jul 2025 19:24:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=google.com; s=20230601; t=1753842249; x=1754447049; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=66ubHsamEtqUWfBlTghFg/Ipo5QGS23sl8Q6AjMFMf4=; b=YC2oReaB6nY5o6y12AoKFVEXjErZV0FMl46xvta/pgBw8ddC4m5sxgY0GLb7hwZUy0 Q207RQnMI/9hKjyEZiG5DX/v0NPVEyeSjsz9Uu2qsYx9f9BNhlMifOoMPfnbykoG2UM4 didPE3AZTFGRSKrYMijt/4NbMmYcvoYKCZMIlZqTCBGXGMmQIc8XWBg/gGLc6HyJQTQZ Tpu9o1Lw4c7RxvPcVEMT+3xDGQ+/7NkU286hshQU28KcehIRV5UKk7SM9DSKFcvqPkzP wyRk4nDnujQhCnBDvWreh/HTv46x4rshp40KRfwBTwfdA51lzpIYlHjboe+hPk5PV2dM yzsg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753842249; x=1754447049; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=66ubHsamEtqUWfBlTghFg/Ipo5QGS23sl8Q6AjMFMf4=; b=ASrJLxHDGmzOsqaiBObBp8jPxmxlnqooSG4ZPGvPJBhStQJwCVVcQ0zn+0u0f7/GRa 29+auYqMHoW0xH9E80RbaI8NLScOOr8GtdJ5Il3WLnwz4XbxOcCvzdUodc0asw6pBkqc ToHcye7KvEjr+NRa+WIf3/y0T39TqkfhwuWuq/XGX+YJOqIPhB+MJhmNxbH9GMykj457 p/nWznln0zkYm6elIKHqALpx9fJ+vdSe40ME/PXQlOtkrqQD8cJIOCCcVvzfKGs9hZwR ImknwuJZn392SsWK0AWETCqvMU3ffR5cHRyd+Cg5pqpbkfXNTnI4YXkfy0olZXn1MikS woWw== X-Forwarded-Encrypted: i=1; AJvYcCXHzk/7ODXjcwsx7oi/nU7cWFx2osrZDY3bwC8GqiaLn+gXWqiz1kg0otqkgiLT2tdNAtOi/OW64H1K0yo=@vger.kernel.org X-Gm-Message-State: AOJu0Yyz+Iez7WdCyBD6EgZS55pmzM/8MWxobpQ6d3AVEcR9uZGBu+yY 7z5ElzbNewBkILtYXWGsX/OkIy7mEUFRWJcVzIMdVQZc0Di8Ncm5B7MTEOCIDeNEz7APqpMX7xK 2w3KEZQ== X-Google-Smtp-Source: AGHT+IFtxe9xMxqmME9I6E6RogvWG3JYyz459sIIvli4kZYQgqRpMMmsSNedqV+8u3MYIRp8qJiKXgKF+ak= X-Received: from pjk13.prod.google.com ([2002:a17:90b:558d:b0:31c:15c8:4c80]) (user=yuzhuo job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:57e6:b0:31f:313b:4d23 with SMTP id 98e67ed59e1d1-31f5e3809e4mr2409302a91.20.1753842249538; Tue, 29 Jul 2025 19:24:09 -0700 (PDT) Date: Tue, 29 Jul 2025 19:23:46 -0700 In-Reply-To: 
<20250730022347.71722-1-yuzhuo@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250730022347.71722-1-yuzhuo@google.com> X-Mailer: git-send-email 2.50.1.552.g942d659e1b-goog Message-ID: <20250730022347.71722-4-yuzhuo@google.com> Subject: [PATCH v1 3/4] rcuscale: Add file based start/finish control From: Yuzhuo Jing To: Ian Rogers , Yuzhuo Jing , Jonathan Corbet , Davidlohr Bueso , "Paul E . McKenney" , Josh Triplett , Frederic Weisbecker , Neeraj Upadhyay , Joel Fernandes , Boqun Feng , Uladzislau Rezki , Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Zqiang , Andrew Morton , Ingo Molnar , Borislav Petkov , Arnd Bergmann , Frank van der Linden , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org Cc: Yuzhuo Jing Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

In addition to the existing timing-based start controls (holdoff, writer_holdoff), add file-based controls in debugfs.

Add an option "block_start" that holds all worker threads until a non-zero integer is written to the "rcuscale/should_start" debugfs file.

Also add a "test_complete" file to the same debugfs directory. It reads "0" while the experiment has not finished and "1" once it has finished.

Together these let external test tools control the start of a run and detect its completion.

Signed-off-by: Yuzhuo Jing
---
 .../admin-guide/kernel-parameters.txt | 5 ++ kernel/rcu/rcuscale.c | 79 +++++++++++++++++++ 2 files changed, 84 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentatio= n/admin-guide/kernel-parameters.txt index 7b62a84a19d4..5e233e511f81 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5487,6 +5487,11 @@ =20 Default is 0. 
=20 + rcuscale.block_start=3D [KNL] + Block the experiment start until "1" is written to the + rcuscale/should_start file in debugfs. This is useful + for start/finish control by external tools. + rcuscale.gp_async=3D [KNL] Measure performance of asynchronous grace-period primitives such as call_rcu(). diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c index 7c88d461ed2c..43bcaeac457f 100644 --- a/kernel/rcu/rcuscale.c +++ b/kernel/rcu/rcuscale.c @@ -87,6 +87,7 @@ MODULE_AUTHOR("Paul E. McKenney "); # define RCUSCALE_SHUTDOWN 1 #endif =20 +torture_param(bool, block_start, false, "Block all threads after creation = and wait for should_start"); torture_param(bool, gp_async, false, "Use asynchronous GP wait primitives"= ); torture_param(int, gp_async_max, 1000, "Max # outstanding waits per writer= "); torture_param(bool, gp_exp, false, "Use expedited GP wait primitives"); @@ -146,6 +147,12 @@ static struct dentry *debugfs_writer_durations; static struct dentry *debugfs_reader_tasks; static struct dentry *debugfs_writer_tasks; static struct dentry *debugfs_kfree_tasks; +static struct dentry *debugfs_should_start; +static struct dentry *debugfs_test_complete; + +static DECLARE_COMPLETION(start_barrier); +static bool should_start; +static bool test_complete; =20 #define MAX_MEAS 10000 #define MIN_MEAS 100 @@ -457,6 +464,23 @@ static void rcu_scale_wait_shutdown(void) schedule_timeout_uninterruptible(1); } =20 +/* + * Wait start_barrier if block_start is enabled. Exit early if shutdown + * is requested. + * + * Return: true if caller should exit; false if caller should continue. + */ +static bool wait_start_barrier(void) +{ + if (!block_start) + return false; + while (wait_for_completion_interruptible(&start_barrier)) { + if (torture_must_stop()) + return true; + } + return false; +} + /* * RCU scalability reader kthread. Repeatedly does empty RCU read-side * critical section, minimizing update-side interference. 
However, the @@ -475,6 +499,11 @@ rcu_scale_reader(void *arg) set_user_nice(current, MAX_NICE); atomic_inc(&n_rcu_scale_reader_started); =20 + if (wait_start_barrier()) { + torture_kthread_stopping("rcu_scale_reader"); + return 0; + } + do { local_irq_save(flags); idx =3D cur_ops->readlock(); @@ -560,6 +589,11 @@ rcu_scale_writer(void *arg) current->flags |=3D PF_NO_SETAFFINITY; sched_set_fifo_low(current); =20 + if (wait_start_barrier()) { + torture_kthread_stopping("rcu_scale_writer"); + return 0; + } + if (holdoff) schedule_timeout_idle(holdoff * HZ); =20 @@ -755,6 +789,11 @@ kfree_scale_thread(void *arg) set_user_nice(current, MAX_NICE); kfree_rcu_test_both =3D (kfree_rcu_test_single =3D=3D kfree_rcu_test_doub= le); =20 + if (wait_start_barrier()) { + torture_kthread_stopping("kfree_scale_thread"); + return 0; + } + start_time =3D ktime_get_mono_fast_ns(); =20 if (atomic_inc_return(&n_kfree_scale_thread_started) >=3D kfree_nrealthre= ads) { @@ -1118,6 +1157,32 @@ static const struct file_operations kfrees_fops =3D { .release =3D seq_release, }; =20 +/* + * For the "should_start" writable file, reuse debugfs integer parsing, but + * override write function to also send complete_all if should_start is + * changed to 1. + * + * Any non-zero value written to this file is converted to 1. + */ +static int should_start_set(void *data, u64 val) +{ + *(bool *)data =3D !!val; + + if (block_start && !!val) + complete_all(&start_barrier); + + return 0; +} + +static int bool_get(void *data, u64 *val) +{ + *val =3D *(bool *)data; + return 0; +} + +DEFINE_DEBUGFS_ATTRIBUTE(should_start_fops, bool_get, should_start_set, "%= llu"); +DEFINE_DEBUGFS_ATTRIBUTE(test_complete_fops, bool_get, NULL, "%llu"); + /* * Create an rcuscale directory exposing run states and results. 
  */
@@ -1153,6 +1218,15 @@ static int register_debugfs(void)
 			    debugfs_dir, NULL, &kfrees_fops))
 		goto fail;
 
+	if (try_create_file(debugfs_should_start, "should_start", 0644,
+			    debugfs_dir, &should_start, &should_start_fops))
+		goto fail;
+
+	/* Future: add notification method for readers waiting on file change. */
+	if (try_create_file(debugfs_test_complete, "test_complete", 0444,
+			    debugfs_dir, &test_complete, &test_complete_fops))
+		goto fail;
+
 	return 0;
 fail:
 	pr_err("rcu-scale: Failed to create debugfs file.");
@@ -1176,6 +1250,8 @@ do {									\
 	try_remove(debugfs_reader_tasks);
 	try_remove(debugfs_writer_tasks);
 	try_remove(debugfs_kfree_tasks);
+	try_remove(debugfs_should_start);
+	try_remove(debugfs_test_complete);
 
 	/* Remove directory after files. */
 	try_remove(debugfs_dir);
@@ -1372,6 +1448,9 @@ rcu_scale_init(void)
 	atomic_set(&n_rcu_scale_writer_finished, 0);
 	rcu_scale_print_module_parms(cur_ops, "Start of test");
 
+	if (!block_start)
+		should_start = true;
+
 	/* Start up the kthreads. */
 
 	if (shutdown) {
-- 
2.50.1.552.g942d659e1b-goog

From nobody Sun Oct 5 20:11:54 2025
Date: Tue, 29 Jul 2025 19:23:47 -0700
In-Reply-To: <20250730022347.71722-1-yuzhuo@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250730022347.71722-1-yuzhuo@google.com>
X-Mailer: git-send-email 2.50.1.552.g942d659e1b-goog
Message-ID: <20250730022347.71722-5-yuzhuo@google.com>
Subject: [PATCH v1 4/4] rcuscale: Add CPU affinity offset options
From: Yuzhuo Jing
To: Ian Rogers, Yuzhuo Jing, Jonathan Corbet, Davidlohr Bueso,
	"Paul E . McKenney", Josh Triplett, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Boqun Feng, Uladzislau Rezki,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Andrew Morton, Ingo Molnar, Borislav Petkov, Arnd Bergmann,
	Frank van der Linden, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, rcu@vger.kernel.org
Cc: Yuzhuo Jing
Content-Type: text/plain; charset="utf-8"

Currently, reader, writer, and kfree threads each set their CPU affinity
to id % nr_cpu_ids.  IDs of all three types start from 0, so reader,
writer, and kfree threads may be pinned to the same CPU.

This patch adds options to offset CPU affinity.

In the experiments below, writer grace-period durations differ markedly
between writer_cpu_offset=0 and writer_cpu_offset=1.  The experiments
were carried out on a 256C 512T machine running a PREEMPT=n kernel.

Experiment: nreaders=1 nwriters=1 reader_cpu_offset=0 writer_cpu_offset=0

Average grace-period duration: 108376 microseconds
Minimum grace-period duration: 13000.4
50th percentile grace-period duration: 115000
90th percentile grace-period duration: 121000
99th percentile grace-period duration: 121004
Maximum grace-period duration: 219000
Grace periods: 101 Batches: 1 Ratio: 101

Experiment: nreaders=1 nwriters=1 reader_cpu_offset=0 writer_cpu_offset=1

Average grace-period duration: 185950 microseconds
Minimum grace-period duration: 8999.84
50th percentile grace-period duration: 217946
90th percentile grace-period duration: 218003
99th percentile grace-period duration: 218018
Maximum grace-period duration: 272195
Grace periods: 101 Batches: 1 Ratio: 101

Signed-off-by: Yuzhuo Jing
---
 .../admin-guide/kernel-parameters.txt         | 19 +++++++++++++++++++
 kernel/rcu/rcuscale.c                         | 16 +++++++++++-----
 2 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 5e233e511f81..f68651c103a4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1,3 +1,4 @@
+# vim: noet:sw=8:sts=8:
 	accept_memory=  [MM]
 			Format: { eager | lazy }
 			default: lazy
@@ -5513,6 +5514,12 @@
 			test until boot completes in order to avoid
 			interference.
 
+	rcuscale.kfree_cpu_offset= [KNL]
+			Set the starting CPU affinity index of kfree threads.
+			CPUs are assigned sequentially from kfree_cpu_offset
+			to kfree_cpu_offset+kfree_nthreads-1, modulo the
+			number of CPUs.  A negative value is reset to 0.
+
 	rcuscale.kfree_by_call_rcu= [KNL]
 			In kernels built with CONFIG_RCU_LAZY=y, test
 			call_rcu() instead of kfree_rcu().
@@ -5567,6 +5574,12 @@
 			the same as for rcuscale.nreaders.
 			N, where N is the number of CPUs
 
+	rcuscale.reader_cpu_offset= [KNL]
+			Set the starting CPU affinity index of reader threads.
+			CPUs are assigned sequentially from reader_cpu_offset
+			to reader_cpu_offset+nreaders-1, modulo the number of
+			CPUs.  A negative value is reset to 0.
+
 	rcuscale.scale_type= [KNL]
 			Specify the RCU implementation to test.
 
@@ -5578,6 +5591,12 @@
 	rcuscale.verbose= [KNL]
 			Enable additional printk() statements.
 
+	rcuscale.writer_cpu_offset= [KNL]
+			Set the starting CPU affinity index of writer threads.
+			CPUs are assigned sequentially from writer_cpu_offset
+			to writer_cpu_offset+nwriters-1, modulo the number of
+			CPUs.  A negative value is reset to 0.
+
 	rcuscale.writer_holdoff= [KNL]
 			Write-side holdoff between grace periods,
 			in microseconds.  The default of zero says
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 43bcaeac457f..1208169be15e 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -95,12 +95,15 @@ torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
 torture_param(int, minruntime, 0, "Minimum run time (s)");
 torture_param(int, nreaders, -1, "Number of RCU reader threads");
 torture_param(int, nwriters, -1, "Number of RCU updater threads");
+torture_param(int, reader_cpu_offset, 0, "Offset of reader CPU affinity");
 torture_param(bool, shutdown, RCUSCALE_SHUTDOWN,
 	      "Shutdown at end of scalability tests.");
 torture_param(int, verbose, 1, "Enable verbose debugging printk()s");
+torture_param(int, writer_cpu_offset, 0, "Offset of writer CPU affinity");
 torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to disable");
 torture_param(int, writer_holdoff_jiffies, 0, "Holdoff (jiffies) between GPs, zero to disable");
 torture_param(bool, writer_no_print, false, "Do not print writer durations to ring buffer");
+torture_param(int, kfree_cpu_offset, 0, "Offset of kfree CPU affinity");
 torture_param(int, kfree_rcu_test, 0, "Do we run a kfree_rcu() scale test?");
 torture_param(int, kfree_mult, 1, "Multiple of kfree_obj size to allocate.");
 torture_param(int, kfree_by_call_rcu, 0, "Use call_rcu() to emulate kfree_rcu()?");
@@ -495,7 +498,7 @@ rcu_scale_reader(void *arg)
 	long me = (long)arg;
 
 	VERBOSE_SCALEOUT_STRING("rcu_scale_reader task started");
-	set_cpus_allowed_ptr(current, cpumask_of(me % nr_cpu_ids));
+	set_cpus_allowed_ptr(current, cpumask_of((reader_cpu_offset + me) % nr_cpu_ids));
 	set_user_nice(current, MAX_NICE);
 	atomic_inc(&n_rcu_scale_reader_started);
 
@@ -585,7 +588,7 @@ rcu_scale_writer(void *arg)
 
 	VERBOSE_SCALEOUT_STRING("rcu_scale_writer task started");
 	WARN_ON(!wdpp);
-	set_cpus_allowed_ptr(current, cpumask_of(me % nr_cpu_ids));
+	set_cpus_allowed_ptr(current, cpumask_of((writer_cpu_offset + me) % nr_cpu_ids));
 	current->flags |= PF_NO_SETAFFINITY;
 	sched_set_fifo_low(current);
 
@@ -719,8 +722,8 @@ static void
 rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
 {
 	pr_alert("%s" SCALE_FLAG
-		 "--- %s: gp_async=%d gp_async_max=%d gp_exp=%d holdoff=%d minruntime=%d nreaders=%d nwriters=%d writer_holdoff=%d writer_holdoff_jiffies=%d verbose=%d shutdown=%d\n",
-		 scale_type, tag, gp_async, gp_async_max, gp_exp, holdoff, minruntime, nrealreaders, nrealwriters, writer_holdoff, writer_holdoff_jiffies, verbose, shutdown);
+		 "--- %s: gp_async=%d gp_async_max=%d gp_exp=%d holdoff=%d minruntime=%d nreaders=%d nwriters=%d reader_cpu_offset=%d writer_cpu_offset=%d writer_holdoff=%d writer_holdoff_jiffies=%d kfree_cpu_offset=%d verbose=%d shutdown=%d\n",
+		 scale_type, tag, gp_async, gp_async_max, gp_exp, holdoff, minruntime, nrealreaders, nrealwriters, reader_cpu_offset, writer_cpu_offset, writer_holdoff, writer_holdoff_jiffies, kfree_cpu_offset, verbose, shutdown);
 }
 
 /*
@@ -785,7 +788,7 @@ kfree_scale_thread(void *arg)
 	DEFINE_TORTURE_RANDOM(tr);
 
 	VERBOSE_SCALEOUT_STRING("kfree_scale_thread task started");
-	set_cpus_allowed_ptr(current, cpumask_of(me % nr_cpu_ids));
+	set_cpus_allowed_ptr(current, cpumask_of((kfree_cpu_offset + me) % nr_cpu_ids));
 	set_user_nice(current, MAX_NICE);
 	kfree_rcu_test_both = (kfree_rcu_test_single == kfree_rcu_test_double);
 
@@ -1446,6 +1449,9 @@ rcu_scale_init(void)
 	atomic_set(&n_rcu_scale_reader_started, 0);
 	atomic_set(&n_rcu_scale_writer_started, 0);
 	atomic_set(&n_rcu_scale_writer_finished, 0);
+	reader_cpu_offset = max(reader_cpu_offset, 0);
+	writer_cpu_offset = max(writer_cpu_offset, 0);
+	kfree_cpu_offset = max(kfree_cpu_offset, 0);
 	rcu_scale_print_module_parms(cur_ops, "Start of test");
 
 	if (!block_start)
-- 
2.50.1.552.g942d659e1b-goog
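
For reviewers who want to sanity-check the mapping outside the kernel, here
is a minimal user-space sketch of the affinity arithmetic in this patch.  It
assumes only what the diff shows: each thread is pinned to
(offset + id) % nr_cpu_ids, and a negative offset is reset to 0 in
rcu_scale_init().  The helper names clamp_offset(), thread_cpu(), and
count_shared_cpus(), and the nr_cpus parameter standing in for nr_cpu_ids,
are illustrative and do not exist in rcuscale.c.

```c
#include <assert.h>

/* Mirror of the max(offset, 0) reset done in rcu_scale_init(). */
static int clamp_offset(int offset)
{
	return offset < 0 ? 0 : offset;
}

/*
 * CPU that thread number "id" would be pinned to, following the patch's
 * (offset + me) % nr_cpu_ids expression.  nr_cpus is a hypothetical
 * stand-in for the kernel's nr_cpu_ids and must be positive.
 */
static int thread_cpu(int offset, long id, int nr_cpus)
{
	assert(nr_cpus > 0);
	return (int)((clamp_offset(offset) + id) % nr_cpus);
}

/* Count reader/writer pairs that end up pinned to the same CPU. */
static int count_shared_cpus(int nreaders, int reader_offset,
			     int nwriters, int writer_offset, int nr_cpus)
{
	int shared = 0;

	for (long r = 0; r < nreaders; r++)
		for (long w = 0; w < nwriters; w++)
			if (thread_cpu(reader_offset, r, nr_cpus) ==
			    thread_cpu(writer_offset, w, nr_cpus))
				shared++;
	return shared;
}
```

With one reader and one writer, count_shared_cpus(1, 0, 1, 0, 512) reports
one shared CPU, while count_shared_cpus(1, 0, 1, 1, 512) reports none, which
mirrors the two configurations measured in the commit message.  The value 512
assumes that the 256C 512T test machine exposes 512 CPU IDs.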