From: Ian Rogers
Date: Fri, 16 Aug 2024 23:44:42 -0700
Subject: [PATCH v1 13/13] perf callchain: Allow symbols to be optional when resolving a callchain
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
    Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
    Kan Liang, Masahiro Yamada, Arnd Bergmann, Jann Horn, Colin Ian King,
    Casey Chen, Athira Rajeev, Chaitanya S Prakash, James Clark, Ze Gao,
    Yang Jihong, Yunseong Kim, Weilin Wang, Dominique Martinet, Anne Macedo,
    Sun Haiyong, linux-perf-users@vger.kernel.org,
    linux-kernel@vger.kernel.org
Message-Id: <20240817064442.2152089-14-irogers@google.com>
In-Reply-To: <20240817064442.2152089-1-irogers@google.com>
References: <20240817064442.2152089-1-irogers@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.46.0.184.g6999bdac58-goog

In uses like perf inject it is not necessary to gather the symbol for
each call chain location; only the map for the sample IP is needed, so
that build IDs and the like can be injected. Make gathering the symbol
in the callchain_cursor optional.

For a perf inject -B command this lowers the peak RSS from 54.1MB to
29.6MB by avoiding loading symbols.

Signed-off-by: Ian Rogers
---
 tools/perf/builtin-inject.c |  2 +-
 tools/perf/util/callchain.c |  8 ++--
 tools/perf/util/callchain.h |  2 +-
 tools/perf/util/machine.c   | 92 +++++++++++++++++++++----------------
 tools/perf/util/machine.h   | 33 ++++++++++---
 5 files changed, 85 insertions(+), 52 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 2ff246f56a44..8ad7cc5f0c63 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -942,7 +942,7 @@ int perf_event__inject_buildid(const struct perf_tool *tool, union perf_event *e
 	}
 
 	sample__for_each_callchain_node(thread, evsel, sample, PERF_MAX_STACK_DEPTH,
-					mark_dso_hit_callback, &args);
+					/*symbols=*/false, mark_dso_hit_callback, &args);
 
 	thread__put(thread);
 repipe:
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 0d608e875fe9..0c7564747a14 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1800,7 +1800,7 @@ s64 callchain_avg_cycles(struct callchain_node *cnode)
 
 int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
 				    struct perf_sample *sample, int max_stack,
-				    callchain_iter_fn cb, void *data)
+				    bool symbols, callchain_iter_fn cb, void *data)
 {
 	struct callchain_cursor *cursor = get_tls_callchain_cursor();
 	int ret;
@@ -1809,9 +1809,9 @@ int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
 		return -ENOMEM;
 
 	/* Fill in the callchain. */
-	ret = thread__resolve_callchain(thread, cursor, evsel, sample,
-					/*parent=*/NULL, /*root_al=*/NULL,
-					max_stack);
+	ret = __thread__resolve_callchain(thread, cursor, evsel, sample,
+					  /*parent=*/NULL, /*root_al=*/NULL,
+					  max_stack, symbols);
 	if (ret)
 		return ret;
 
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 76891f8e2373..86ed9e4d04f9 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -315,6 +315,6 @@ typedef int (*callchain_iter_fn)(struct callchain_cursor_node *node, void *data)
 
 int sample__for_each_callchain_node(struct thread *thread, struct evsel *evsel,
 				    struct perf_sample *sample, int max_stack,
-				    callchain_iter_fn cb, void *data);
+				    bool symbols, callchain_iter_fn cb, void *data);
 
 #endif	/* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index cd79a830abae..6cffee6f9891 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2059,7 +2059,8 @@ static int add_callchain_ip(struct thread *thread,
 			    bool branch,
 			    struct branch_flags *flags,
 			    struct iterations *iter,
-			    u64 branch_from)
+			    u64 branch_from,
+			    bool symbols)
 {
 	struct map_symbol ms = {};
 	struct addr_location al;
@@ -2098,7 +2099,8 @@ static int add_callchain_ip(struct thread *thread,
 			}
 			goto out;
 		}
-		thread__find_symbol(thread, *cpumode, ip, &al);
+		if (symbols)
+			thread__find_symbol(thread, *cpumode, ip, &al);
 	}
 
 	if (al.sym != NULL) {
@@ -2227,7 +2229,8 @@ static int lbr_callchain_add_kernel_ip(struct thread *thread,
 				       struct symbol **parent,
 				       struct addr_location *root_al,
 				       u64 branch_from,
-				       bool callee, int end)
+				       bool callee, int end,
+				       bool symbols)
 {
 	struct ip_callchain *chain = sample->callchain;
 	u8 cpumode = PERF_RECORD_MISC_USER;
@@ -2237,7 +2240,8 @@ static int lbr_callchain_add_kernel_ip(struct thread *thread,
 		for (i = 0; i < end + 1; i++) {
 			err = add_callchain_ip(thread, cursor, parent,
 					       root_al, &cpumode, chain->ips[i],
-					       false, NULL, NULL, branch_from);
+					       false, NULL, NULL, branch_from,
+					       symbols);
 			if (err)
 				return err;
 		}
@@ -2247,7 +2251,8 @@ static int lbr_callchain_add_kernel_ip(struct thread *thread,
 	for (i = end; i >= 0; i--) {
 		err = add_callchain_ip(thread, cursor, parent,
 				       root_al, &cpumode, chain->ips[i],
-				       false, NULL, NULL, branch_from);
+				       false, NULL, NULL, branch_from,
+				       symbols);
 		if (err)
 			return err;
 	}
@@ -2290,7 +2295,8 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 				    struct symbol **parent,
 				    struct addr_location *root_al,
 				    u64 *branch_from,
-				    bool callee)
+				    bool callee,
+				    bool symbols)
 {
 	struct branch_stack *lbr_stack = sample->branch_stack;
 	struct branch_entry *entries = perf_sample__branch_entries(sample);
@@ -2323,7 +2329,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 		err = add_callchain_ip(thread, cursor, parent,
 				       root_al, &cpumode, ip,
 				       true, flags, NULL,
-				       *branch_from);
+				       *branch_from, symbols);
 		if (err)
 			return err;
 
@@ -2348,7 +2354,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 			err = add_callchain_ip(thread, cursor, parent,
 					       root_al, &cpumode, ip,
 					       true, flags, NULL,
-					       *branch_from);
+					       *branch_from, symbols);
 			if (err)
 				return err;
 			save_lbr_cursor_node(thread, cursor, i);
@@ -2363,7 +2369,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 			err = add_callchain_ip(thread, cursor, parent,
 					       root_al, &cpumode, ip,
 					       true, flags, NULL,
-					       *branch_from);
+					       *branch_from, symbols);
 			if (err)
 				return err;
 			save_lbr_cursor_node(thread, cursor, i);
@@ -2377,7 +2383,7 @@ static int lbr_callchain_add_lbr_ip(struct thread *thread,
 			err = add_callchain_ip(thread, cursor, parent,
 					       root_al, &cpumode, ip,
 					       true, flags, NULL,
-					       *branch_from);
+					       *branch_from, symbols);
 			if (err)
 				return err;
 		}
@@ -2544,7 +2550,8 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 					struct symbol **parent,
 					struct addr_location *root_al,
 					int max_stack,
-					unsigned int max_lbr)
+					unsigned int max_lbr,
+					bool symbols)
 {
 	bool callee = (callchain_param.order == ORDER_CALLEE);
 	struct ip_callchain *chain = sample->callchain;
@@ -2586,12 +2593,12 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 		/* Add kernel ip */
 		err = lbr_callchain_add_kernel_ip(thread, cursor, sample,
 						  parent, root_al, branch_from,
-						  true, i);
+						  true, i, symbols);
 		if (err)
 			goto error;
 
 		err = lbr_callchain_add_lbr_ip(thread, cursor, sample, parent,
-					       root_al, &branch_from, true);
+					       root_al, &branch_from, true, symbols);
 		if (err)
 			goto error;
 
@@ -2608,14 +2615,14 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 				goto error;
 		}
 		err = lbr_callchain_add_lbr_ip(thread, cursor, sample, parent,
-					       root_al, &branch_from, false);
+					       root_al, &branch_from, false, symbols);
 		if (err)
 			goto error;
 
 		/* Add kernel ip */
 		err = lbr_callchain_add_kernel_ip(thread, cursor, sample,
 						  parent, root_al, branch_from,
-						  false, i);
+						  false, i, symbols);
 		if (err)
 			goto error;
 	}
@@ -2629,7 +2636,7 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
 			     struct callchain_cursor *cursor,
 			     struct symbol **parent,
 			     struct addr_location *root_al,
-			     u8 *cpumode, int ent)
+			     u8 *cpumode, int ent, bool symbols)
 {
 	int err = 0;
 
@@ -2639,7 +2646,7 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
 		if (ip >= PERF_CONTEXT_MAX) {
 			err = add_callchain_ip(thread, cursor, parent,
 					       root_al, cpumode, ip,
-					       false, NULL, NULL, 0);
+					       false, NULL, NULL, 0, symbols);
 			break;
 		}
 	}
@@ -2661,7 +2668,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 					    struct perf_sample *sample,
 					    struct symbol **parent,
 					    struct addr_location *root_al,
-					    int max_stack)
+					    int max_stack,
+					    bool symbols)
 {
 	struct branch_stack *branch = sample->branch_stack;
 	struct branch_entry *entries = perf_sample__branch_entries(sample);
@@ -2681,7 +2689,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 
 		err = resolve_lbr_callchain_sample(thread, cursor, sample, parent,
 						   root_al, max_stack,
-						   !env ? 0 : env->max_branches);
+						   !env ? 0 : env->max_branches,
+						   symbols);
 		if (err)
 			return (err < 0) ? err : 0;
 	}
@@ -2746,13 +2755,14 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 					       root_al,
 					       NULL, be[i].to,
 					       true, &be[i].flags,
-					       NULL, be[i].from);
+					       NULL, be[i].from, symbols);
 
-			if (!err)
+			if (!err) {
 				err = add_callchain_ip(thread, cursor, parent, root_al,
 						       NULL, be[i].from,
 						       true, &be[i].flags,
-						       &iter[i], 0);
+						       &iter[i], 0, symbols);
+			}
 			if (err == -EINVAL)
 				break;
 			if (err)
@@ -2768,7 +2778,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 check_calls:
 	if (chain && callchain_param.order != ORDER_CALLEE) {
 		err = find_prev_cpumode(chain, thread, cursor, parent, root_al,
-					&cpumode, chain->nr - first_call);
+					&cpumode, chain->nr - first_call, symbols);
 		if (err)
 			return (err < 0) ? err : 0;
 	}
@@ -2790,7 +2800,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 			++nr_entries;
 		else if (callchain_param.order != ORDER_CALLEE) {
 			err = find_prev_cpumode(chain, thread, cursor, parent,
-						root_al, &cpumode, j);
+						root_al, &cpumode, j, symbols);
 			if (err)
 				return (err < 0) ? err : 0;
 			continue;
@@ -2817,8 +2827,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 			if (leaf_frame_caller && leaf_frame_caller != ip) {
 
 				err = add_callchain_ip(thread, cursor, parent,
-					       root_al, &cpumode, leaf_frame_caller,
-					       false, NULL, NULL, 0);
+						root_al, &cpumode, leaf_frame_caller,
+						false, NULL, NULL, 0, symbols);
 				if (err)
 					return (err < 0) ? err : 0;
 			}
@@ -2826,7 +2836,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 
 		err = add_callchain_ip(thread, cursor, parent,
 				       root_al, &cpumode, ip,
-				       false, NULL, NULL, 0);
+				       false, NULL, NULL, 0, symbols);
 
 		if (err)
 			return (err < 0) ? err : 0;
@@ -2906,7 +2916,7 @@ static int thread__resolve_callchain_unwind(struct thread *thread,
 					    struct callchain_cursor *cursor,
 					    struct evsel *evsel,
 					    struct perf_sample *sample,
-					    int max_stack)
+					    int max_stack, bool symbols)
 {
 	/* Can we do dwarf post unwind? */
 	if (!((evsel->core.attr.sample_type & PERF_SAMPLE_REGS_USER) &&
@@ -2918,17 +2928,21 @@ static int thread__resolve_callchain_unwind(struct thread *thread,
 	    (!sample->user_stack.size))
 		return 0;
 
+	if (!symbols)
+		pr_debug("Not resolving symbols with an unwinder isn't currently supported\n");
+
 	return unwind__get_entries(unwind_entry, cursor,
 				   thread, sample, max_stack, false);
 }
 
-int thread__resolve_callchain(struct thread *thread,
-			      struct callchain_cursor *cursor,
-			      struct evsel *evsel,
-			      struct perf_sample *sample,
-			      struct symbol **parent,
-			      struct addr_location *root_al,
-			      int max_stack)
+int __thread__resolve_callchain(struct thread *thread,
+				struct callchain_cursor *cursor,
+				struct evsel *evsel,
+				struct perf_sample *sample,
+				struct symbol **parent,
+				struct addr_location *root_al,
+				int max_stack,
+				bool symbols)
 {
 	int ret = 0;
 
@@ -2941,22 +2955,22 @@ int thread__resolve_callchain(struct thread *thread,
 		ret = thread__resolve_callchain_sample(thread, cursor,
 						       evsel, sample,
 						       parent, root_al,
-						       max_stack);
+						       max_stack, symbols);
 		if (ret)
 			return ret;
 		ret = thread__resolve_callchain_unwind(thread, cursor,
 						       evsel, sample,
-						       max_stack);
+						       max_stack, symbols);
 	} else {
 		ret = thread__resolve_callchain_unwind(thread, cursor,
 						       evsel, sample,
-						       max_stack);
+						       max_stack, symbols);
 		if (ret)
 			return ret;
 		ret = thread__resolve_callchain_sample(thread, cursor,
 						       evsel, sample,
 						       parent, root_al,
-						       max_stack);
+						       max_stack, symbols);
 	}
 
 	return ret;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index a687876e3453..2e5a4cb342d8 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -178,13 +178,32 @@ struct mem_info *sample__resolve_mem(struct perf_sample *sample,
 
 struct callchain_cursor;
 
-int thread__resolve_callchain(struct thread *thread,
-			      struct callchain_cursor *cursor,
-			      struct evsel *evsel,
-			      struct perf_sample *sample,
-			      struct symbol **parent,
-			      struct addr_location *root_al,
-			      int max_stack);
+int __thread__resolve_callchain(struct thread *thread,
+				struct callchain_cursor *cursor,
+				struct evsel *evsel,
+				struct perf_sample *sample,
+				struct symbol **parent,
+				struct addr_location *root_al,
+				int max_stack,
+				bool symbols);
+
+static inline int thread__resolve_callchain(struct thread *thread,
+					    struct callchain_cursor *cursor,
+					    struct evsel *evsel,
+					    struct perf_sample *sample,
+					    struct symbol **parent,
+					    struct addr_location *root_al,
+					    int max_stack)
+{
+	return __thread__resolve_callchain(thread,
+					   cursor,
+					   evsel,
+					   sample,
+					   parent,
+					   root_al,
+					   max_stack,
+					   /*symbols=*/true);
+}
 
 /*
  * Default guest kernel is defined by parameter --guestkallsyms
-- 
2.46.0.184.g6999bdac58-goog