From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Ian Rogers, James Clark
Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
    linux-perf-users@vger.kernel.org, Steven Rostedt, Josh Poimboeuf,
    Indu Bhagat, Jens Remus, Mathieu Desnoyers,
    linux-trace-kernel@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH v6 3/6] perf record: Add --call-graph fp,defer option for deferred callchains
Date: Thu, 20 Nov 2025 15:48:01 -0800
Message-ID: <20251120234804.156340-4-namhyung@kernel.org>
In-Reply-To: <20251120234804.156340-1-namhyung@kernel.org>
References: <20251120234804.156340-1-namhyung@kernel.org>

Add a new callchain record mode option for deferred callchains.  For now
it only works with FP (frame-pointer) mode.

And add the missing feature detection logic to clear the flag on old
kernels.

  $ perf record --call-graph fp,defer -vv true
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    size                             136
    config                           0 (PERF_COUNT_HW_CPU_CYCLES)
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|CALLCHAIN|PERIOD
    read_format                      ID|LOST
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    sample_id_all                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
    defer_callchain                  1
    defer_output                     1
  ------------------------------------------------------------
  sys_perf_event_open: pid 162755  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off deferred callchain support

Reviewed-by: Ian Rogers
Signed-off-by: Namhyung Kim
---
 tools/perf/Documentation/perf-config.txt |  3 +++
 tools/perf/Documentation/perf-record.txt |  4 ++++
 tools/perf/util/callchain.c              | 16 +++++++++++++---
 tools/perf/util/callchain.h              |  1 +
 tools/perf/util/evsel.c                  | 19 +++++++++++++++++++
 tools/perf/util/evsel.h                  |  1 +
 6 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index c6f33565966735fe..642d1c490d9e3bcd 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -452,6 +452,9 @@ Variables
 		kernel space is controlled not by this option but by the
 		kernel config (CONFIG_UNWINDER_*).
 
+		The 'defer' mode can be used with 'fp' mode to enable deferred
+		user callchains (like 'fp,defer').
+
 	call-graph.dump-size::
 		The size of stack to dump in order to do post-unwinding. Default is 8192 (byte).
 		When using dwarf into record-mode, the default size will be used if omitted.
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 067891bd7da6edc8..e8b9aadbbfa50574 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -325,6 +325,10 @@ OPTIONS
 	by default.  User can change the number by passing it after comma
 	like "--call-graph fp,32".
 
+	Also "defer" can be used with "fp" (like "--call-graph fp,defer") to
+	enable deferred user callchain which will collect user-space callchains
+	when the thread returns to the user space.
+
 -q::
 --quiet::
 	Don't print any warnings or messages, useful for scripting.
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index d7b7eef740b9d6ed..2884187ccbbecfdc 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -275,9 +275,13 @@ int parse_callchain_record(const char *arg, struct callchain_param *param)
 			if (tok) {
 				unsigned long size;
 
-				size = strtoul(tok, &name, 0);
-				if (size < (unsigned) sysctl__max_stack())
-					param->max_stack = size;
+				if (!strncmp(tok, "defer", sizeof("defer"))) {
+					param->defer = true;
+				} else {
+					size = strtoul(tok, &name, 0);
+					if (size < (unsigned) sysctl__max_stack())
+						param->max_stack = size;
+				}
 			}
 			break;
 
@@ -314,6 +318,12 @@ int parse_callchain_record(const char *arg, struct callchain_param *param)
 	} while (0);
 
 	free(buf);
+
+	if (param->defer && param->record_mode != CALLCHAIN_FP) {
+		pr_err("callchain: deferred callchain only works with FP\n");
+		return -EINVAL;
+	}
+
 	return ret;
 }
 
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 86ed9e4d04f9ee7b..d5ae4fbb7ce5fa44 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -98,6 +98,7 @@ extern bool dwarf_callchain_users;
 
 struct callchain_param {
 	bool			enabled;
+	bool			defer;
 	enum perf_call_graph_mode record_mode;
 	u32			dump_size;
 	enum chain_mode 	mode;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f1a311637694ac0a..887c6ac6c49cc415 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1065,6 +1065,9 @@ static void __evsel__config_callchain(struct evsel *evsel, struct record_opts *o
 		pr_info("Disabling user space callchains for function trace event.\n");
 		attr->exclude_callchain_user = 1;
 	}
+
+	if (param->defer && !attr->exclude_callchain_user)
+		attr->defer_callchain = 1;
 }
 
 void evsel__config_callchain(struct evsel *evsel, struct record_opts *opts,
@@ -1511,6 +1514,7 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
 	attr->mmap2 = track && !perf_missing_features.mmap2;
 	attr->comm = track;
 	attr->build_id = track && opts->build_id;
+	attr->defer_output = track && callchain->defer;
 
 	/*
 	 * ksymbol is tracked separately with text poke because it needs to be
@@ -2199,6 +2203,10 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 
 static void evsel__disable_missing_features(struct evsel *evsel)
 {
+	if (perf_missing_features.defer_callchain && evsel->core.attr.defer_callchain)
+		evsel->core.attr.defer_callchain = 0;
+	if (perf_missing_features.defer_callchain && evsel->core.attr.defer_output)
+		evsel->core.attr.defer_output = 0;
 	if (perf_missing_features.inherit_sample_read && evsel->core.attr.inherit &&
 	    (evsel->core.attr.sample_type & PERF_SAMPLE_READ))
 		evsel->core.attr.inherit = 0;
@@ -2473,6 +2481,13 @@ static bool evsel__detect_missing_features(struct evsel *evsel, struct perf_cpu
 
 	/* Please add new feature detection here. */
 
+	attr.defer_callchain = true;
+	if (has_attr_feature(&attr, /*flags=*/0))
+		goto found;
+	perf_missing_features.defer_callchain = true;
+	pr_debug2("switching off deferred callchain support\n");
+	attr.defer_callchain = false;
+
 	attr.inherit = true;
 	attr.sample_type = PERF_SAMPLE_READ | PERF_SAMPLE_TID;
 	if (has_attr_feature(&attr, /*flags=*/0))
@@ -2584,6 +2599,10 @@ static bool evsel__detect_missing_features(struct evsel *evsel, struct perf_cpu
 	errno = old_errno;
 
 check:
+	if ((evsel->core.attr.defer_callchain || evsel->core.attr.defer_output) &&
+	    perf_missing_features.defer_callchain)
+		return true;
+
 	if (evsel->core.attr.inherit &&
 	    (evsel->core.attr.sample_type & PERF_SAMPLE_READ) &&
 	    perf_missing_features.inherit_sample_read)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 3ae4ac8f9a37e009..a08130ff2e47a887 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -221,6 +221,7 @@ struct perf_missing_features {
 	bool branch_counters;
 	bool aux_action;
 	bool inherit_sample_read;
+	bool defer_callchain;
 };
 
 extern struct perf_missing_features perf_missing_features;
-- 
2.52.0.rc2.455.g230fcf2819-goog
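
The handling of the token after "fp," that the callchain.c hunk adds to
parse_callchain_record() can be sketched in isolation. This is a minimal,
hypothetical stand-alone version and not the perf code itself: struct
fp_opts, parse_fp_token() and SKETCH_MAX_STACK are names invented for this
illustration, and the fixed limit of 127 is an assumption standing in for
the value perf obtains from sysctl__max_stack().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two callchain_param fields involved. */
struct fp_opts {
	bool defer;
	unsigned long max_stack;
};

/* Assumed limit; perf queries the real one with sysctl__max_stack(). */
#define SKETCH_MAX_STACK 127UL

/*
 * Parse the token that follows "fp," in --call-graph.
 * sizeof("defer") is 6, so strncmp() also compares the terminating NUL
 * and only the exact string "defer" matches.
 */
static void parse_fp_token(const char *tok, struct fp_opts *opts)
{
	assert(tok != NULL && opts != NULL);

	if (!strncmp(tok, "defer", sizeof("defer"))) {
		opts->defer = true;
	} else {
		unsigned long size = strtoul(tok, NULL, 0);

		/* As in the patch, an out-of-range size is silently ignored. */
		if (size < SKETCH_MAX_STACK)
			opts->max_stack = size;
	}
}
```

With this sketch, parse_fp_token("defer", &opts) sets opts.defer and leaves
max_stack untouched, while parse_fp_token("32", &opts) only updates
max_stack. Because the comparison length includes the terminating NUL, a
token such as "deferred" does not enable deferral.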