From nobody Wed Apr 1 08:21:29 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 679DD41324B; Tue, 31 Mar 2026 16:32:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774974748; cv=none; b=WZAVN2tLj0mcury09p3PhcTL+DR/3/R2THM0A5vuCGDomxVWjZYNQ0B2r54VZhg9Njm4vPd9qApw3ky0ByaeSPfdBpFS7sGzD/ZakULuAkW/O79RBVw3bA16g27qZiL7KgHd/Cgi0b1B0MNODkuJiPXccRo8s5hStb5IaY1Vc+A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774974748; c=relaxed/simple; bh=QYX0t9TWAINAGj3jH/FIPfk9CR7MZIRgSRBhJS1hKPg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=CM16BK3gLKhz90CiLdTBIyi6pVK1eIueygml0JrwBuShDVjpXn9TKyr1pHZm8uz+A8eh4bCPytQucMzIS273SmUL0F/GKt3+9bWJqjli8xCokE/Yc6MC+gKpSFencKB4xbZszlr9fNp8cQYTBwfoSUgjXxFUnlTz8r2JEhk+fww= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Hr/i/PQ4; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Hr/i/PQ4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 78DD3C2BCB2; Tue, 31 Mar 2026 16:32:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1774974748; bh=QYX0t9TWAINAGj3jH/FIPfk9CR7MZIRgSRBhJS1hKPg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Hr/i/PQ48cIfVRwKZJSxKXC5LIYKcEDuthYEoHXvzJzZYoqCUmJ/a1bInyX4WA8v7 uDhRILsbOkHWuGdmQHzW/Sh9T5bKUPvroCMuBKOUhLSGpQMZXdymMa4d0s41aYEv1F qRzRdAf+5uUBCARmeEHGAM3IqMPTObI0hx/lG3bSrAtLKRx9A2WZenDJkhC/q0xwEO Af48BcCEem8ZWAsrAcYd7ljwTSBT8/cim4YeX0QjkQriAQUV8xgzwhD1MDnyNuEfI3 2WR69qI2HKOmovlb69uLV4f/QAQPdBqZ14M2wfweZWs9+BGX/34xpGsV7fvdYcxUjj Bzc0SZ0UtRErQ== From: "Masami Hiramatsu (Google)" To: Steven Rostedt Cc: Masami Hiramatsu , Mathieu Desnoyers , linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH v9 1/3] tracing: Make the backup instance non-reusable Date: Wed, 1 Apr 2026 01:32:26 +0900 Message-ID: <177497474578.569199.1438084675587027223.stgit@mhiramat.tok.corp.google.com> X-Mailer: git-send-email 2.53.0.1118.gaef5881109-goog In-Reply-To: <177497473558.569199.6527680985537865638.stgit@mhiramat.tok.corp.google.com> References: <177497473558.569199.6527680985537865638.stgit@mhiramat.tok.corp.google.com> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable From: Masami Hiramatsu (Google) Since there is no reason to reuse the backup instance, make it readonly (but erasable). Note that only backup instances are readonly, because other trace instances will be empty unless it is writable. Only backup instances have copy entries from the original. With this change, most of the trace control files are removed from the backup instance, including eventfs enable/filter etc. # find /sys/kernel/tracing/instances/backup/events/ | wc -l 4093 # find /sys/kernel/tracing/instances/boot_map/events/ | wc -l 9573 Signed-off-by: Masami Hiramatsu (Google) --- Changes in v9: - Add forcibly readonly check in open() operations. Changes in v8: - Remove read-only checks in read() operation. Changes in v7: - Return -EACCES instead of -EPERM. Changes in v6: - Remove tracing_on file from readonly instances. - Remove unused writable_mode from tracing_init_tracefs_percpu(). - Cleanup init_tracer_tracefs() and create_event_toplevel_files(). - Remove TRACE_MODE_WRITE_MASK. - Add TRACE_ARRAY_FL_RDONLY. Changes in v5: - Rebased on the latest for-next (and hide show_event_filters/triggers if the instance is readonly. Changes in v4: - Make trace data erasable. (not reusable) Changes in v3: - Resuse the beginning part of event_entries for readonly files. - Remove readonly file_operations and checking readonly flag in each write operation. Changes in v2: - Use readonly file_operations to prohibit writing instead of checking flags in write() callbacks. - Remove writable files from eventfs. --- kernel/trace/trace.c | 81 +++++++++++++++++++++++++++------------= ---- kernel/trace/trace.h | 7 ++++ kernel/trace/trace_boot.c | 5 ++- kernel/trace/trace_events.c | 76 +++++++++++++++++++++++----------------- 4 files changed, 104 insertions(+), 65 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 7b9dd6378849..8cec7bd70438 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3414,6 +3414,11 @@ int tracing_open_generic_tr(struct inode *inode, str= uct file *filp) if (ret) return ret; =20 + if ((filp->f_mode & FMODE_WRITE) && trace_array_is_readonly(tr)) { + trace_array_put(tr); + return -EACCES; + } + filp->private_data =3D inode->i_private; =20 return 0; @@ -6435,6 +6440,11 @@ static int tracing_clock_open(struct inode *inode, s= truct file *file) if (ret) return ret; =20 + if ((file->f_mode & FMODE_WRITE) && trace_array_is_readonly(tr)) { + trace_array_put(tr); + return -EACCES; + } + ret =3D single_open(file, tracing_clock_show, inode->i_private); if (ret < 0) trace_array_put(tr); @@ -8771,17 +8781,22 @@ static __init void create_trace_instances(struct de= ntry *d_tracer) static void init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) { + umode_t writable_mode =3D TRACE_MODE_WRITE; int cpu; =20 + if (trace_array_is_readonly(tr)) + writable_mode =3D TRACE_MODE_READ; + trace_create_file("available_tracers", TRACE_MODE_READ, d_tracer, - tr, &show_traces_fops); + tr, &show_traces_fops); =20 - trace_create_file("current_tracer", TRACE_MODE_WRITE, d_tracer, - tr, &set_tracer_fops); + trace_create_file("current_tracer", writable_mode, d_tracer, + tr, &set_tracer_fops); =20 - trace_create_file("tracing_cpumask", TRACE_MODE_WRITE, d_tracer, + trace_create_file("tracing_cpumask", writable_mode, d_tracer, tr, &tracing_cpumask_fops); =20 + /* Options are used for changing print-format even for readonly instance.= */ trace_create_file("trace_options", TRACE_MODE_WRITE, d_tracer, tr, &tracing_iter_fops); =20 @@ -8791,12 +8806,36 @@ init_tracer_tracefs(struct trace_array *tr, struct = dentry *d_tracer) trace_create_file("trace_pipe", TRACE_MODE_READ, d_tracer, tr, &tracing_pipe_fops); =20 - trace_create_file("buffer_size_kb", TRACE_MODE_WRITE, d_tracer, + trace_create_file("buffer_size_kb", writable_mode, d_tracer, tr, &tracing_entries_fops); =20 trace_create_file("buffer_total_size_kb", TRACE_MODE_READ, d_tracer, tr, &tracing_total_entries_fops); =20 + trace_create_file("trace_clock", writable_mode, d_tracer, tr, + &trace_clock_fops); + + trace_create_file("timestamp_mode", TRACE_MODE_READ, d_tracer, tr, + &trace_time_stamp_mode_fops); + + tr->buffer_percent =3D 50; + + trace_create_file("buffer_subbuf_size_kb", writable_mode, d_tracer, + tr, &buffer_subbuf_size_fops); + + create_trace_options_dir(tr); + + if (tr->range_addr_start) + trace_create_file("last_boot_info", TRACE_MODE_READ, d_tracer, + tr, &last_boot_fops); + + for_each_tracing_cpu(cpu) + tracing_init_tracefs_percpu(tr, cpu); + + /* Read-only instance has above files only. */ + if (trace_array_is_readonly(tr)) + return; + trace_create_file("free_buffer", 0200, d_tracer, tr, &tracing_free_buffer_fops); =20 @@ -8808,49 +8847,29 @@ init_tracer_tracefs(struct trace_array *tr, struct = dentry *d_tracer) trace_create_file("trace_marker_raw", 0220, d_tracer, tr, &tracing_mark_raw_fops); =20 - trace_create_file("trace_clock", TRACE_MODE_WRITE, d_tracer, tr, - &trace_clock_fops); - - trace_create_file("tracing_on", TRACE_MODE_WRITE, d_tracer, - tr, &rb_simple_fops); - - trace_create_file("timestamp_mode", TRACE_MODE_READ, d_tracer, tr, - &trace_time_stamp_mode_fops); - - tr->buffer_percent =3D 50; - trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, - tr, &buffer_percent_fops); - - trace_create_file("buffer_subbuf_size_kb", TRACE_MODE_WRITE, d_tracer, - tr, &buffer_subbuf_size_fops); + tr, &buffer_percent_fops); =20 trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer, - tr, &tracing_syscall_buf_fops); + tr, &tracing_syscall_buf_fops); =20 - create_trace_options_dir(tr); + trace_create_file("tracing_on", TRACE_MODE_WRITE, d_tracer, + tr, &rb_simple_fops); =20 trace_create_maxlat_file(tr, d_tracer); =20 if (ftrace_create_function_files(tr, d_tracer)) MEM_FAIL(1, "Could not allocate function filter files"); =20 - if (tr->range_addr_start) { - trace_create_file("last_boot_info", TRACE_MODE_READ, d_tracer, - tr, &last_boot_fops); #ifdef CONFIG_TRACER_SNAPSHOT - } else { + if (!tr->range_addr_start) trace_create_file("snapshot", TRACE_MODE_WRITE, d_tracer, tr, &snapshot_fops); #endif - } =20 trace_create_file("error_log", TRACE_MODE_WRITE, d_tracer, tr, &tracing_err_log_fops); =20 - for_each_tracing_cpu(cpu) - tracing_init_tracefs_percpu(tr, cpu); - ftrace_init_tracefs(tr, d_tracer); } =20 @@ -9635,7 +9654,7 @@ __init static void enable_instances(void) * Backup buffers can be freed but need vfree(). */ if (backup) - tr->flags |=3D TRACE_ARRAY_FL_VMALLOC; + tr->flags |=3D TRACE_ARRAY_FL_VMALLOC | TRACE_ARRAY_FL_RDONLY; =20 if (start || backup) { tr->flags |=3D TRACE_ARRAY_FL_BOOT | TRACE_ARRAY_FL_LAST_BOOT; diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index a3ea735a9ef6..2d9d26d423f1 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -464,6 +464,7 @@ enum { TRACE_ARRAY_FL_MOD_INIT =3D BIT(3), TRACE_ARRAY_FL_MEMMAP =3D BIT(4), TRACE_ARRAY_FL_VMALLOC =3D BIT(5), + TRACE_ARRAY_FL_RDONLY =3D BIT(6), }; =20 #ifdef CONFIG_MODULES @@ -493,6 +494,12 @@ extern unsigned long trace_adjust_address(struct trace= _array *tr, unsigned long =20 extern struct trace_array *printk_trace; =20 +static inline bool trace_array_is_readonly(struct trace_array *tr) +{ + /* backup instance is read only. */ + return tr->flags & TRACE_ARRAY_FL_RDONLY; +} + /* * The global tracer (top) should be the first trace array added, * but we check the flag anyway. diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c index dbe29b4c6a7a..2ca2541c8a58 100644 --- a/kernel/trace/trace_boot.c +++ b/kernel/trace/trace_boot.c @@ -61,7 +61,8 @@ trace_boot_set_instance_options(struct trace_array *tr, s= truct xbc_node *node) v =3D memparse(p, NULL); if (v < PAGE_SIZE) pr_err("Buffer size is too small: %s\n", p); - if (tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0) + if (trace_array_is_readonly(tr) || + tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0) pr_err("Failed to resize trace buffer to %s\n", p); } =20 @@ -597,7 +598,7 @@ trace_boot_enable_tracer(struct trace_array *tr, struct= xbc_node *node) =20 p =3D xbc_node_find_value(node, "tracer", NULL); if (p && *p !=3D '\0') { - if (tracing_set_tracer(tr, p) < 0) + if (trace_array_is_readonly(tr) || tracing_set_tracer(tr, p) < 0) pr_err("Failed to set given tracer: %s\n", p); } =20 diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index de807a9e2371..7ddcee312471 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -1401,6 +1401,9 @@ static int __ftrace_set_clr_event(struct trace_array = *tr, const char *match, { int ret; =20 + if (trace_array_is_readonly(tr)) + return -EACCES; + mutex_lock(&event_mutex); ret =3D __ftrace_set_clr_event_nolock(tr, match, sub, event, set, mod); mutex_unlock(&event_mutex); @@ -2969,8 +2972,8 @@ event_subsystem_dir(struct trace_array *tr, const cha= r *name, } else __get_system(system); =20 - /* ftrace only has directories no files */ - if (strcmp(name, "ftrace") =3D=3D 0) + /* ftrace only has directories no files, readonly instance too. */ + if (strcmp(name, "ftrace") =3D=3D 0 || trace_array_is_readonly(tr)) nr_entries =3D 0; else nr_entries =3D ARRAY_SIZE(system_entries); @@ -3135,28 +3138,30 @@ event_create_dir(struct eventfs_inode *parent, stru= ct trace_event_file *file) int ret; static struct eventfs_entry event_entries[] =3D { { - .name =3D "enable", + .name =3D "format", .callback =3D event_callback, - .release =3D event_release, }, +#ifdef CONFIG_PERF_EVENTS { - .name =3D "filter", + .name =3D "id", .callback =3D event_callback, }, +#endif +#define NR_RO_EVENT_ENTRIES (1 + IS_ENABLED(CONFIG_PERF_EVENTS)) +/* Readonly files must be above this line and counted by NR_RO_EVENT_ENTRI= ES. */ { - .name =3D "trigger", + .name =3D "enable", .callback =3D event_callback, + .release =3D event_release, }, { - .name =3D "format", + .name =3D "filter", .callback =3D event_callback, }, -#ifdef CONFIG_PERF_EVENTS { - .name =3D "id", + .name =3D "trigger", .callback =3D event_callback, }, -#endif #ifdef CONFIG_HIST_TRIGGERS { .name =3D "hist", @@ -3189,7 +3194,10 @@ event_create_dir(struct eventfs_inode *parent, struc= t trace_event_file *file) if (!e_events) return -ENOMEM; =20 - nr_entries =3D ARRAY_SIZE(event_entries); + if (trace_array_is_readonly(tr)) + nr_entries =3D NR_RO_EVENT_ENTRIES; + else + nr_entries =3D ARRAY_SIZE(event_entries); =20 name =3D trace_event_name(call); ei =3D eventfs_create_dir(name, e_events, event_entries, nr_entries, file= ); @@ -4532,31 +4540,44 @@ create_event_toplevel_files(struct dentry *parent, = struct trace_array *tr) int nr_entries; static struct eventfs_entry events_entries[] =3D { { - .name =3D "enable", + .name =3D "header_page", .callback =3D events_callback, }, { - .name =3D "header_page", + .name =3D "header_event", .callback =3D events_callback, }, +#define NR_RO_TOP_ENTRIES 2 +/* Readonly files must be above this line and counted by NR_RO_TOP_ENTRIES= . */ { - .name =3D "header_event", + .name =3D "enable", .callback =3D events_callback, }, }; =20 - entry =3D trace_create_file("set_event", TRACE_MODE_WRITE, parent, - tr, &ftrace_set_event_fops); - if (!entry) - return -ENOMEM; + if (!trace_array_is_readonly(tr)) { + entry =3D trace_create_file("set_event", TRACE_MODE_WRITE, parent, + tr, &ftrace_set_event_fops); + if (!entry) + return -ENOMEM; =20 - trace_create_file("show_event_filters", TRACE_MODE_READ, parent, tr, - &ftrace_show_event_filters_fops); + /* There are not as crucial, just warn if they are not created */ + trace_create_file("show_event_filters", TRACE_MODE_READ, parent, tr, + &ftrace_show_event_filters_fops); =20 - trace_create_file("show_event_triggers", TRACE_MODE_READ, parent, tr, - &ftrace_show_event_triggers_fops); + trace_create_file("show_event_triggers", TRACE_MODE_READ, parent, tr, + &ftrace_show_event_triggers_fops); =20 - nr_entries =3D ARRAY_SIZE(events_entries); + trace_create_file("set_event_pid", TRACE_MODE_WRITE, parent, + tr, &ftrace_set_event_pid_fops); + + trace_create_file("set_event_notrace_pid", + TRACE_MODE_WRITE, parent, tr, + &ftrace_set_event_notrace_pid_fops); + nr_entries =3D ARRAY_SIZE(events_entries); + } else { + nr_entries =3D NR_RO_TOP_ENTRIES; + } =20 e_events =3D eventfs_create_events_dir("events", parent, events_entries, nr_entries, tr); @@ -4565,15 +4586,6 @@ create_event_toplevel_files(struct dentry *parent, s= truct trace_array *tr) return -ENOMEM; } =20 - /* There are not as crucial, just warn if they are not created */ - - trace_create_file("set_event_pid", TRACE_MODE_WRITE, parent, - tr, &ftrace_set_event_pid_fops); - - trace_create_file("set_event_notrace_pid", - TRACE_MODE_WRITE, parent, tr, - &ftrace_set_event_notrace_pid_fops); - tr->event_dir =3D e_events; =20 return 0; From nobody Wed Apr 1 08:21:29 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 16CAD402BA9; Tue, 31 Mar 2026 16:32:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774974756; cv=none; b=QpVjoqHWDy9Zqmc0TS4dv/S3v3+R6RQ7sSMvHMXCv5EenHZcNFaeZk+BEJ6JjfjV16iBCouEpGZ/3UaO8QfKyatgyUTQ/YPNSoMoJ6c5c6c/ttat+1Sblb37TdV80z6DxHVQt3fEA/604ZRtXMEBAGEfT5uBc9M9iKzdwrngTho= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774974756; c=relaxed/simple; bh=UOAO6XujdJV+n7CK9mVVCRv84ROTAg0qJONCnoaInzA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=iRkpGPneFV/mlT2r+plrXrzqd+Addd7aiVGXF868P/j+Wkb89BJwp7ba6EsMM1ZMESFXm67au6VXRJ+TnUDkXbBvdPNRusgPZY6cOFMpDz3/JDi1hbhGkneryI1kLOxQukJg6BGD4wAArd4N3Pxt/N25QIssq2ByUnMTTPtvh8M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=F5cVWW97; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="F5cVWW97" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2EEE0C2BCB1; Tue, 31 Mar 2026 16:32:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1774974755; bh=UOAO6XujdJV+n7CK9mVVCRv84ROTAg0qJONCnoaInzA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=F5cVWW97BZGqEv9QFU1e9wQHgjrO5bOztHdpqaRfuD/4UldnR8A8/wWraee43SowJ KLY27CjGDqKEKKAGE0EYIULszSTbYS760jOLnSrXtnoXFnGdItmQRO0DHeSKA5f01Y uj1q6cIlzgSFVeK3BDPWqMva4XkdzYacrJkcs50Qu/l/SJEToq/VN1jQHj08fLJTFx 9svzSOQVn1GUFqHkRDJmdjlqKb/2kPAhcuQV80YYnYIGg+0oHUYWG/0EKNpFraKNd3 Plh4muceKEIEtPPwvPuMsKGvn0PZfTZ5bscKi5U7f7zBstfbxmH7/oSEuG0xtTJ/Ro 7bwJWdjdYBtAw== From: "Masami Hiramatsu (Google)" To: Steven Rostedt Cc: Masami Hiramatsu , Mathieu Desnoyers , linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH v9 2/3] tracing: Remove the backup instance automatically after read Date: Wed, 1 Apr 2026 01:32:33 +0900 Message-ID: <177497475349.569199.11513916633426967730.stgit@mhiramat.tok.corp.google.com> X-Mailer: git-send-email 2.53.0.1118.gaef5881109-goog In-Reply-To: <177497473558.569199.6527680985537865638.stgit@mhiramat.tok.corp.google.com> References: <177497473558.569199.6527680985537865638.stgit@mhiramat.tok.corp.google.com> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable From: Masami Hiramatsu (Google) Since the backup instance is readonly, after reading all data via pipe, no data is left on the instance. Thus it can be removed safely after closing all files. This also removes it if user resets the ring buffer manually via 'trace' file. Signed-off-by: Masami Hiramatsu (Google) --- Changes in v9: - Fix to initialize autoremove workqueue only for backup. - Fix to return -ENODEV if trace_array_get() refers freeing instance. Changes in v6: - Fix typo in comment. - Only when there is a readonly trace array, initialize autoremove_wq. - Fix to exit loop in trace_array_get() if tr is found in the list. Changes in v4: - Update description. --- kernel/trace/trace.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++= ---- kernel/trace/trace.h | 6 ++++ 2 files changed, 85 insertions(+), 6 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 8cec7bd70438..1d73400a01c7 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -539,8 +539,65 @@ void trace_set_ring_buffer_expanded(struct trace_array= *tr) tr->ring_buffer_expanded =3D true; } =20 +static int __remove_instance(struct trace_array *tr); + +static void trace_array_autoremove(struct work_struct *work) +{ + struct trace_array *tr =3D container_of(work, struct trace_array, autorem= ove_work); + + guard(mutex)(&event_mutex); + guard(mutex)(&trace_types_lock); + + /* + * This can be fail if someone gets @tr before starting this + * function, but in that case, this will be kicked again when + * putting it. So we don't care about the result. + */ + __remove_instance(tr); +} + +static struct workqueue_struct *autoremove_wq; + +static void trace_array_kick_autoremove(struct trace_array *tr) +{ + if (autoremove_wq && !work_pending(&tr->autoremove_work)) + queue_work(autoremove_wq, &tr->autoremove_work); +} + +static void trace_array_cancel_autoremove(struct trace_array *tr) +{ + if (autoremove_wq && work_pending(&tr->autoremove_work)) + cancel_work(&tr->autoremove_work); +} + +static void trace_array_init_autoremove(struct trace_array *tr) +{ + INIT_WORK(&tr->autoremove_work, trace_array_autoremove); +} + +static void trace_array_start_autoremove(void) +{ + if (autoremove_wq) + return; + + autoremove_wq =3D alloc_workqueue("tr_autoremove_wq", + WQ_UNBOUND | WQ_HIGHPRI, 0); + if (!autoremove_wq) + pr_warn("Unable to allocate tr_autoremove_wq. autoremove disabled.\n"); +} + LIST_HEAD(ftrace_trace_arrays); =20 +static int __trace_array_get(struct trace_array *this_tr) +{ + /* When free_on_close is set, this is not available anymore. */ + if (autoremove_wq && this_tr->free_on_close) + return -ENODEV; + + this_tr->ref++; + return 0; +} + int trace_array_get(struct trace_array *this_tr) { struct trace_array *tr; @@ -548,8 +605,7 @@ int trace_array_get(struct trace_array *this_tr) guard(mutex)(&trace_types_lock); list_for_each_entry(tr, &ftrace_trace_arrays, list) { if (tr =3D=3D this_tr) { - tr->ref++; - return 0; + return __trace_array_get(tr); } } =20 @@ -560,6 +616,12 @@ static void __trace_array_put(struct trace_array *this= _tr) { WARN_ON(!this_tr->ref); this_tr->ref--; + /* + * When free_on_close is set, prepare removing the array + * when the last reference is released. + */ + if (this_tr->ref =3D=3D 1 && this_tr->free_on_close) + trace_array_kick_autoremove(this_tr); } =20 /** @@ -4829,6 +4891,10 @@ static void update_last_data(struct trace_array *tr) /* Only if the buffer has previous boot data clear and update it. */ tr->flags &=3D ~TRACE_ARRAY_FL_LAST_BOOT; =20 + /* If this is a backup instance, mark it for autoremove. */ + if (tr->flags & TRACE_ARRAY_FL_VMALLOC) + tr->free_on_close =3D true; + /* Reset the module list and reload them */ if (tr->scratch) { struct trace_scratch *tscratch =3D tr->scratch; @@ -8442,8 +8508,8 @@ struct trace_array *trace_array_find_get(const char *= instance) =20 guard(mutex)(&trace_types_lock); tr =3D trace_array_find(instance); - if (tr) - tr->ref++; + if (tr && __trace_array_get(tr) < 0) + tr =3D NULL; =20 return tr; } @@ -8540,6 +8606,8 @@ trace_array_create_systems(const char *name, const ch= ar *systems, if (ftrace_allocate_ftrace_ops(tr) < 0) goto out_free_tr; =20 + trace_array_init_autoremove(tr); + ftrace_init_trace_array(tr); =20 init_trace_flags_index(tr); @@ -8650,7 +8718,9 @@ struct trace_array *trace_array_get_by_name(const cha= r *name, const char *system =20 list_for_each_entry(tr, &ftrace_trace_arrays, list) { if (tr->name && strcmp(tr->name, name) =3D=3D 0) { - tr->ref++; + /* if this fails, @tr is going to be removed. */ + if (__trace_array_get(tr) < 0) + tr =3D NULL; return tr; } } @@ -8689,6 +8759,7 @@ static int __remove_instance(struct trace_array *tr) set_tracer_flag(tr, 1ULL << i, 0); } =20 + trace_array_cancel_autoremove(tr); tracing_set_nop(tr); clear_ftrace_function_probes(tr); event_trace_del_tracer(tr); @@ -9653,8 +9724,10 @@ __init static void enable_instances(void) /* * Backup buffers can be freed but need vfree(). */ - if (backup) + if (backup) { tr->flags |=3D TRACE_ARRAY_FL_VMALLOC | TRACE_ARRAY_FL_RDONLY; + trace_array_start_autoremove(); + } =20 if (start || backup) { tr->flags |=3D TRACE_ARRAY_FL_BOOT | TRACE_ARRAY_FL_LAST_BOOT; diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 2d9d26d423f1..60e079177492 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -455,6 +455,12 @@ struct trace_array { * we do not waste memory on systems that are not using tracing. */ bool ring_buffer_expanded; + /* + * If the ring buffer is a read only backup instance, it will be + * removed after dumping all data via pipe, because no readable data. + */ + bool free_on_close; + struct work_struct autoremove_work; }; =20 enum { From nobody Wed Apr 1 08:21:29 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CDCAF425CE4; Tue, 31 Mar 2026 16:32:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774974763; cv=none; b=fUJyTx0DS8BT65aQps6Pbf0FTVaF7sNaIo/nkNKBRfxYouM3vUlcRGosXsyPl/4mJZG3p6qfC2w/7EAeZC3HNui1ca6KfF/TgeM6Svb0yO9DiIHz1uIs80WqPQNmwQMxZtFU19q6DP2+GtheYvmfsWLIKEG1M1cVYGTEwxoalPc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774974763; c=relaxed/simple; bh=S5HUOKOwvhSR3nZmMfKa96wdR+Qt2kEIy+bpCsLdFdo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=iOQgg+JjJXhN++Z1qh8ea22Z8l4LZwRnjIvdZDJb1gzO+iPNGoq/KWiRZMryRB1NbbtE87cr+XXduVQR1XitrZgB7CUrqF+xC8GsBSew/MHQt8PjfyYezIT1GEv29dFbJl2sTFccfQF+N9m3IiT+poSyuZBfX/1X8QENvNAz4kc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=eqxyKDGi; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="eqxyKDGi" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C1678C2BCB2; Tue, 31 Mar 2026 16:32:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1774974763; bh=S5HUOKOwvhSR3nZmMfKa96wdR+Qt2kEIy+bpCsLdFdo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=eqxyKDGiVchiEghQP8GkQdijGBRZTSmKlRUipJRI8+wDvvmr8C1gLqB6LErDc6+b6 HI7HxHle3Jy+v81Njjjqybv92EYUmETaH2StfzDiSjxYYvio5CAG8co8Q1SSk5ieTO goLfRK0XMQ6vV2L/BHmwfhOABEbfWfEVqVMPv8woSixyxg8DPWjZak9zQe3Q3CbKdN 3uaAFQsophoWYNoSu6A5MVpPew6laznAsL0+obOvAd6l5sI7wNT/POi3ees4l9juYK p2pPZalALgMXEmycZhy+JoL6Ji6w84EawUcUfVtqvKZ1C/0g6uBTiD7CA8MaFY0+VY lhPxMxbRiY72Q== From: "Masami Hiramatsu (Google)" To: Steven Rostedt Cc: Masami Hiramatsu , Mathieu Desnoyers , linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH v9 3/3] tracing/Documentation: Add a section about backup instance Date: Wed, 1 Apr 2026 01:32:41 +0900 Message-ID: <177497476117.569199.18085846838539980210.stgit@mhiramat.tok.corp.google.com> X-Mailer: git-send-email 2.53.0.1118.gaef5881109-goog In-Reply-To: <177497473558.569199.6527680985537865638.stgit@mhiramat.tok.corp.google.com> References: <177497473558.569199.6527680985537865638.stgit@mhiramat.tok.corp.google.com> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable From: Masami Hiramatsu (Google) Add a section about backup instance to the debugging.rst. Signed-off-by: Masami Hiramatsu (Google) --- Changes in v6: - Fix typos. --- Documentation/trace/debugging.rst | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/Documentation/trace/debugging.rst b/Documentation/trace/debugg= ing.rst index 4d88c346fc38..15857951b506 100644 --- a/Documentation/trace/debugging.rst +++ b/Documentation/trace/debugging.rst @@ -159,3 +159,22 @@ If setting it from the kernel command line, it is reco= mmended to also disable tracing with the "traceoff" flag, and enable tracing after boot up. Otherwise the trace from the most recent boot will be mixed with the trace from the previous boot, and may make it confusing to read. + +Using a backup instance for keeping previous boot data +------------------------------------------------------ + +It is also possible to record trace data at system boot time by specifying +events with the persistent ring buffer, but in this case the data before t= he +reboot will be lost before it can be read. This problem can be solved by a +backup instance. From the kernel command line:: + + reserve_mem=3D12M:4096:trace trace_instance=3Dboot_map@trace,sched,irq t= race_instance=3Dbackup=3Dboot_map + +On boot up, the previous data in the "boot_map" is copied to the "backup" +instance, and the "sched:*" and "irq:*" events for the current boot are tr= aced +in the "boot_map". Thus the user can read the previous boot data from the = "backup" +instance without stopping the trace. + +Note that this "backup" instance is readonly, and will be removed automati= cally +if you clear the trace data or read out all trace data from the "trace_pip= e" +or the "trace_pipe_raw" files. \ No newline at end of file