Date: Tue, 29 Apr 2025 17:41:23 -0700 In-Reply-To: <20250430004128.474388-1-irogers@google.com> Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org Mime-Version: 1.0 References: <20250430004128.474388-1-irogers@google.com> X-Mailer: git-send-email 2.49.0.901.g37484f566f-goog Message-ID: <20250430004128.474388-2-irogers@google.com> Subject: [PATCH v2 1/6] perf demangle-rust: Add rustc-demangle C demangler From: Ian Rogers To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich, Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt, James Clark, Howard Chu, Jiapeng Chong, Ravi Bangoria, "Masami Hiramatsu (Google)", Stephen Brennan, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, rust-for-linux@vger.kernel.org, llvm@lists.linux.dev, Daniel Xu, Ariel Ben-Yehuda Cc: Ian Rogers Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Imported at commit 80e40f57d99f ("add comment about finding latest version of code") from: https://github.com/rust-lang/rustc-demangle/blob/main/crates/native-c/src/demangle.c https://github.com/rust-lang/rustc-demangle/blob/main/crates/native-c/include/demangle.h There is discussion of this issue motivating the import in: https://github.com/rust-lang/rust/issues/60705 https://lore.kernel.org/lkml/20250129193037.573431-1-irogers@google.com/ The SPDX lines reflect the dual license Apache-2 or MIT in: https://github.com/rust-lang/rustc-demangle/blob/main/README.md Following Miguel Ojeda's suggestion, comments were added on copyright and on keeping the code in sync with upstream. The files are renamed because perf supports multiple demanglers, so demangle as a name would be overloaded.
The work here was done by Ariel Ben-Yehuda and I am merely importing it as discussed in the rust-lang issue. Signed-off-by: Ian Rogers Reviewed-by: Miguel Ojeda --- tools/perf/util/demangle-rust-v0.c | 2042 ++++++++++++++++++++++++++++ tools/perf/util/demangle-rust-v0.h | 88 ++ 2 files changed, 2130 insertions(+) create mode 100644 tools/perf/util/demangle-rust-v0.c create mode 100644 tools/perf/util/demangle-rust-v0.h diff --git a/tools/perf/util/demangle-rust-v0.c b/tools/perf/util/demangle-rust-v0.c new file mode 100644 index 000000000000..28e74ca0ff13 --- /dev/null +++ b/tools/perf/util/demangle-rust-v0.c @@ -0,0 +1,2042 @@ +// SPDX-License-Identifier: Apache-2.0 OR MIT + +// The contents of this file come from the Rust rustc-demangle library, hosted +// in the repository, licensed +// under "Apache-2.0 OR MIT". For copyright details, see +// . +// Please note that the file should be kept as close as possible to upstream. + +// Code for demangling Rust symbols. This code is mostly +// a line-by-line translation of the Rust code in `rustc-demangle`. 
+ +// you can find the latest version of this code in https://github.com/rust= -lang/rustc-demangle + +#include +#include +#include +#include +#include +#include + +#include "demangle-rust-v0.h" + +#if defined(__GNUC__) || defined(__clang__) +#define NODISCARD __attribute__((warn_unused_result)) +#else +#define NODISCARD +#endif + +#define MAX_DEPTH 500 + +typedef enum { + DemangleOk, + DemangleInvalid, + DemangleRecursed, + DemangleBug, +} demangle_status; + +struct demangle_v0 { + const char *mangled; + size_t mangled_len; +}; + +struct demangle_legacy { + const char *mangled; + size_t mangled_len; + size_t elements; +}; + +// private version of memrchr to avoid _GNU_SOURCE +static void *demangle_memrchr(const void *s, int c, size_t n) { + const uint8_t *s_ =3D s; + for (; n !=3D 0; n--) { + if (s_[n-1] =3D=3D c) { + return (void*)&s_[n-1]; + } + } + return NULL; +} + + +static bool unicode_iscontrol(uint32_t ch) { + // this is *technically* a unicode table, but + // some unicode properties are simpler than you might think + return ch < 0x20 || (ch >=3D 0x7f && ch < 0xa0); +} + +// "good enough" tables, the only consequence is that when printing +// *constant strings*, some characters are printed as `\u{abcd}` rather th= an themselves. +// +// I'm leaving these here to allow easily replacing them with actual +// tables if desired. +static bool unicode_isprint(uint32_t ch) { + if (ch < 0x20) { + return false; + } + if (ch < 0x7f) { + return true; + } + return false; +} + +static bool unicode_isgraphemextend(uint32_t ch) { + (void)ch; + return false; +} + +static bool str_isascii(const char *s, size_t s_len) { + for (size_t i =3D 0; i < s_len; i++) { + if (s[i] & 0x80) { + return false; + } + } + + return true; +} + +typedef enum { + PunycodeOk, + PunycodeError +} punycode_status; + +struct parser { + // the parser assumes that `sym` has a safe "terminating byte". It mig= ht be NUL, + // but it might also be something else if a symbol is "truncated". 
+ const char *sym; + size_t sym_len; + size_t next; + uint32_t depth; +}; + +struct printer { + demangle_status status; // if status =3D=3D 0 parser is valid + struct parser parser; + char *out; // NULL for no output [in which case out_len is not decreme= nted] + size_t out_len; + uint32_t bound_lifetime_depth; + bool alternate; +}; + +static NODISCARD overflow_status printer_print_path(struct printer *printe= r, bool in_value); +static NODISCARD overflow_status printer_print_type(struct printer *printe= r); +static NODISCARD overflow_status printer_print_const(struct printer *print= er, bool in_value); + +static NODISCARD demangle_status try_parse_path(struct parser *parser) { + struct printer printer =3D { + DemangleOk, + *parser, + NULL, + SIZE_MAX, + 0, + false + }; + overflow_status ignore =3D printer_print_path(&printer, false); // can= 't fail since no output + (void)ignore; + *parser =3D printer.parser; + return printer.status; +} + +NODISCARD static demangle_status rust_demangle_v0_demangle(const char *s, = size_t s_len, struct demangle_v0 *res, const char **rest) { + if (s_len > strlen(s)) { + // s_len only exists to shorten the string, this is not a buffer A= PI + return DemangleInvalid; + } + + const char *inner; + size_t inner_len; + if (s_len >=3D 2 && !strncmp(s, "_R", strlen("_R"))) { + inner =3D s+2; + inner_len =3D s_len - 2; + } else if (s_len >=3D 1 && !strncmp(s, "R", strlen("R"))) { + // On Windows, dbghelp strips leading underscores, so we accept "R= ..." + // form too. + inner =3D s+1; + inner_len =3D s_len - 1; + } else if (s_len >=3D 3 && !strncmp(s, "__R", strlen("__R"))) { + // On OSX, symbols are prefixed with an extra _ + inner =3D s+3; + inner_len =3D s_len - 3; + } else { + return DemangleInvalid; + } + + // Paths always start with uppercase characters. 
+ if (*inner < 'A' || *inner > 'Z') { + return DemangleInvalid; + } + + if (!str_isascii(inner, inner_len)) { + return DemangleInvalid; + } + + struct parser parser =3D { inner, inner_len, 0, 0 }; + + demangle_status status =3D try_parse_path(&parser); + if (status !=3D DemangleOk) return status; + char next =3D parser.sym[parser.next]; + + // Instantiating crate (paths always start with uppercase characters). + if (parser.next < parser.sym_len && next >=3D 'A' && next <=3D 'Z') { + status =3D try_parse_path(&parser); + if (status !=3D DemangleOk) return status; + } + + res->mangled =3D inner; + res->mangled_len =3D inner_len; + if (rest) { + *rest =3D parser.sym + parser.next; + } + + return DemangleOk; +} + +// This might require `len` to be up to 3 characters bigger than the real = output len in case of utf-8 +NODISCARD static overflow_status rust_demangle_v0_display_demangle(struct = demangle_v0 res, char *out, size_t len, bool alternate) { + struct printer printer =3D { + DemangleOk, + { + res.mangled, + res.mangled_len, + 0, + 0 + }, + out, + len, + 0, + alternate + }; + if (printer_print_path(&printer, true) =3D=3D OverflowOverflow) { + return OverflowOverflow; + } + if (printer.out_len < OVERFLOW_MARGIN) { + return OverflowOverflow; + } + *printer.out =3D '\0'; + return OverflowOk; +} + +static size_t code_to_utf8(unsigned char *buffer, uint32_t code) +{ + if (code <=3D 0x7F) { + buffer[0] =3D code; + return 1; + } + if (code <=3D 0x7FF) { + buffer[0] =3D 0xC0 | (code >> 6); /* 110xxxxx */ + buffer[1] =3D 0x80 | (code & 0x3F); /* 10xxxxxx */ + return 2; + } + if (code <=3D 0xFFFF) { + buffer[0] =3D 0xE0 | (code >> 12); /* 1110xxxx */ + buffer[1] =3D 0x80 | ((code >> 6) & 0x3F); /* 10xxxxxx */ + buffer[2] =3D 0x80 | (code & 0x3F); /* 10xxxxxx */ + return 3; + } + if (code <=3D 0x10FFFF) { + buffer[0] =3D 0xF0 | (code >> 18); /* 11110xxx */ + buffer[1] =3D 0x80 | ((code >> 12) & 0x3F); /* 10xxxxxx */ + buffer[2] =3D 0x80 | ((code >> 6) & 0x3F); /* 10xxxxxx */ 
+ buffer[3] =3D 0x80 | (code & 0x3F); /* 10xxxxxx */ + return 4; + } + return 0; +} + + +// return length of char at byte, or SIZE_MAX if invalid. buf should have = 4 valid characters +static NODISCARD size_t utf8_next_char(uint8_t *s, uint32_t *ch) { + uint8_t byte =3D *s; + // UTF8-1 =3D %x00-7F + // UTF8-2 =3D %xC2-DF UTF8-tail + // UTF8-3 =3D %xE0 %xA0-BF UTF8-tail / %xE1-EC 2( UTF8-tail ) / + // %xED %x80-9F UTF8-tail / %xEE-EF 2( UTF8-tail ) + // UTF8-4 =3D %xF0 %x90-BF 2( UTF8-tail ) / %xF1-F3 3( UTF8-tail = ) / + // %xF4 %x80-8F 2( UTF8-tail ) + if (byte < 0x80) { + *ch =3D byte; + return 1; + } else if (byte < 0xc2) { + return SIZE_MAX; + } else if (byte < 0xe0) { + if (s[1] >=3D 0x80 && s[1] < 0xc0) { + *ch =3D ((byte&0x1f)<<6) + (s[1] & 0x3f); + return 2; + } + return SIZE_MAX; + } if (byte < 0xf0) { + if (!(s[1] >=3D 0x80 && s[1] < 0xc0) || !(s[2] >=3D 0x80 && s[2] <= 0xc0)) { + return SIZE_MAX; // basic validation + } + if (byte =3D=3D 0xe0 && s[1] < 0xa0) { + return SIZE_MAX; // overshort + } + if (byte =3D=3D 0xed && s[1] >=3D 0xa0) { + return SIZE_MAX; // surrogate + } + *ch =3D ((byte&0x0f)<<12) + ((s[1] & 0x3f)<<6) + (s[2] & 0x3f); + return 3; + } else if (byte < 0xf5) { + if (!(s[1] >=3D 0x80 && s[1] < 0xc0) || !(s[2] >=3D 0x80 && s[2] <= 0xc0) || !(s[3] >=3D 0x80 && s[3] < 0xc0)) { + return SIZE_MAX; // basic validation + } + if (byte =3D=3D 0xf0 && s[1] < 0x90) { + return SIZE_MAX; // overshort + } + if (byte =3D=3D 0xf4 && s[1] >=3D 0x90) { + return SIZE_MAX; // over max + } + *ch =3D ((byte&0x07)<<18) + ((s[1] & 0x3f)<<12) + ((s[2] & 0x3f)<<= 6) + (s[3]&0x3f); + return 4; + } else { + return SIZE_MAX; + } +} + +static NODISCARD bool validate_char(uint32_t n) { + return ((n ^ 0xd800) - 0x800) < 0x110000 - 0x800; +} + +#define SMALL_PUNYCODE_LEN 128 + +static NODISCARD punycode_status punycode_decode(const char *start, size_t= ascii_len, const char *punycode_start, size_t punycode_len, uint32_t (*out= _)[SMALL_PUNYCODE_LEN], size_t *out_len) { 
+ uint32_t *out =3D *out_; + + if (punycode_len =3D=3D 0) { + return PunycodeError; + } + + if (ascii_len > SMALL_PUNYCODE_LEN) { + return PunycodeError; + } + for (size_t i =3D 0; i < ascii_len; i++) { + out[i] =3D start[i]; + } + size_t len =3D ascii_len; + + size_t base =3D 36, t_min =3D 1, t_max =3D 26, skew =3D 38, damp =3D 7= 00, bias =3D 72, i =3D 0, n =3D 0x80; + for (;;) { + size_t delta =3D 0, w =3D 1, k =3D 0; + for (;;) { + k +=3D base; + size_t biased =3D k < bias ? 0 : k - bias; + size_t t =3D MIN(MAX(biased, t_min), t_max); + size_t d; + if (punycode_len =3D=3D 0) { + return PunycodeError; + } + char nx =3D *punycode_start++; + punycode_len--; + if ('a' <=3D nx && nx <=3D 'z') { + d =3D nx - 'a'; + } else if ('0' <=3D nx && nx <=3D '9') { + d =3D 26 + (nx - '0'); + } else { + return PunycodeError; + } + if (w =3D=3D 0 || d > SIZE_MAX / w || d*w > SIZE_MAX - delta) { + return PunycodeError; + } + delta +=3D d * w; + if (d < t) { + break; + } + if (base < t || w =3D=3D 0 || (base - t) > SIZE_MAX / w) { + return PunycodeError; + } + w *=3D (base - t); + } + + len +=3D 1; + if (i > SIZE_MAX - delta) { + return PunycodeError; + } + i +=3D delta; + if (n > SIZE_MAX - i / len) { + return PunycodeError; + } + n +=3D i / len; + i %=3D len; + + // char validation + if (n > UINT32_MAX || !validate_char((uint32_t)n)) { + return PunycodeError; + } + + // insert new character + if (len > SMALL_PUNYCODE_LEN) { + return PunycodeError; + } + memmove(out + i + 1, out + i, (len - i - 1) * sizeof(uint32_t)); + out[i] =3D (uint32_t)n; + + // start i index at incremented position + i++; + + // If there are no more deltas, decoding is complete. + if (punycode_len =3D=3D 0) { + *out_len =3D len; + return PunycodeOk; + } + + // Perform bias adaptation. 
+ delta /=3D damp; + damp =3D 2; + + delta +=3D delta / len; + k =3D 0; + while (delta > ((base - t_min) * t_max) / 2) { + delta /=3D base - t_min; + k +=3D base; + } + bias =3D k + ((base - t_min + 1) * delta) / (delta + skew); + } +} + +struct ident { + const char *ascii_start; + size_t ascii_len; + const char *punycode_start; + size_t punycode_len; +}; + +static NODISCARD overflow_status display_ident(const char *ascii_start, si= ze_t ascii_len, const char *punycode_start, size_t punycode_len, uint8_t *o= ut, size_t *out_len) { + uint32_t outbuf[SMALL_PUNYCODE_LEN]; + + size_t wide_len; + size_t out_buflen =3D *out_len; + + if (punycode_len =3D=3D 0) { + if (ascii_len > out_buflen) { + return OverflowOverflow; + } + memcpy(out, ascii_start, ascii_len); + *out_len =3D ascii_len; + } else if (punycode_decode(ascii_start, ascii_len, punycode_start, pun= ycode_len, &outbuf, &wide_len) =3D=3D PunycodeOk) { + size_t narrow_len =3D 0; + for (size_t i =3D 0; i < wide_len; i++) { + if (out_buflen - narrow_len < 4) { + return OverflowOverflow; + } + unsigned char *pos =3D &out[narrow_len]; + narrow_len +=3D code_to_utf8(pos, outbuf[i]); + } + *out_len =3D narrow_len; + } else { + size_t narrow_len =3D 0; + if (out_buflen < strlen("punycode{")) { + return OverflowOverflow; + } + memcpy(out, "punycode{", strlen("punycode{")); + narrow_len =3D strlen("punycode{"); + if (ascii_len > 0) { + if (out_buflen - narrow_len < ascii_len || out_buflen - narrow= _len - ascii_len < 1) { + return OverflowOverflow; + } + memcpy(out + narrow_len, ascii_start, ascii_len); + narrow_len +=3D ascii_len; + out[narrow_len] =3D '-'; + narrow_len++; + } + if (out_buflen - narrow_len < punycode_len || out_buflen - narrow_= len - punycode_len < 1) { + return OverflowOverflow; + } + memcpy(out + narrow_len, punycode_start, punycode_len); + narrow_len +=3D punycode_len; + out[narrow_len] =3D '}'; + narrow_len++; + *out_len =3D narrow_len; + } + + return OverflowOk; +} + +static NODISCARD bool 
try_parse_uint(const char *buf, size_t len, uint64_t= *result) { + size_t cur =3D 0; + for(;cur < len && buf[cur] =3D=3D '0';cur++); + uint64_t result_val =3D 0; + if (len - cur > 16) return false; + for(;cur < len;cur++) { + char c =3D buf[cur]; + result_val <<=3D 4; + if ('0' <=3D c && c <=3D '9') { + result_val +=3D c - '0'; + } else if ('a' <=3D c && c <=3D 'f') { + result_val +=3D 10 + (c - 'a'); + } else { + return false; + } + } + *result =3D result_val; + return true; +} + +static NODISCARD bool dinibble2int(const char *buf, uint8_t *result) { + uint8_t result_val =3D 0; + for (int i =3D 0; i < 2; i++) { + char c =3D buf[i]; + result_val <<=3D 4; + if ('0' <=3D c && c <=3D '9') { + result_val +=3D c - '0'; + } else if ('a' <=3D c && c <=3D 'f') { + result_val +=3D 10 + (c - 'a'); + } else { + return false; + } + } + *result =3D result_val; + return true; +} + + +typedef enum { + NtsOk =3D 0, + NtsOverflow =3D 1, + NtsInvalid =3D 2 +} nibbles_to_string_status; + +// '\u{10ffff}', +margin +#define ESCAPED_SIZE 12 + +static NODISCARD size_t char_to_string(uint32_t ch, uint8_t quote, bool fi= rst, char (*buf)[ESCAPED_SIZE]) { + // encode the character + char *escaped_buf =3D *buf; + escaped_buf[0] =3D '\\'; + size_t escaped_len =3D 2; + switch (ch) { + case '\0': + escaped_buf[1] =3D '0'; + break; + case '\t': + escaped_buf[1] =3D 't'; + break; + case '\r': + escaped_buf[1] =3D 'r'; + break; + case '\n': + escaped_buf[1] =3D 'n'; + break; + case '\\': + escaped_buf[1] =3D '\\'; + break; + default: + if (ch =3D=3D quote) { + escaped_buf[1] =3D ch; + } else if (!unicode_isprint(ch) || (first && unicode_isgraphemexte= nd(ch))) { + int hexlen =3D snprintf(escaped_buf, ESCAPED_SIZE, "\\u{%x}", = (unsigned int)ch); + if (hexlen < 0) { + return 0; // (snprintf shouldn't fail!) 
+ } + escaped_len =3D hexlen; + } else { + // printable character + escaped_buf[0] =3D ch; + escaped_len =3D 1; + } + break; + } + + return escaped_len; +} + +// convert nibbles to a single/double-quoted string +static NODISCARD nibbles_to_string_status nibbles_to_string(const char *bu= f, size_t len, uint8_t *out, size_t *out_len) { + uint8_t quote =3D '"'; + bool first =3D true; + + if ((len % 2) !=3D 0) { + return NtsInvalid; // odd number of nibbles + } + + size_t cur_out_len =3D 0; + + // write starting quote + if (out !=3D NULL) { + cur_out_len =3D *out_len; + if (cur_out_len =3D=3D 0) { + return NtsOverflow; + } + *out++ =3D quote; + cur_out_len--; + } + + uint8_t conv_buf[4] =3D {0}; + size_t conv_buf_len =3D 0; + while (len > 1 || conv_buf_len > 0) { + while (len > 1 && conv_buf_len < sizeof(conv_buf)) { + if (!dinibble2int(buf, &conv_buf[conv_buf_len])) { + return NtsInvalid; + } + conv_buf_len++; + buf +=3D 2; + len -=3D 2; + } + + // conv_buf is full here if possible, process 1 UTF-8 character + uint32_t ch =3D 0; + size_t consumed =3D utf8_next_char(conv_buf, &ch); + if (consumed > conv_buf_len) { + // either SIZE_MAX (invalid UTF-8) or finished input buffer and + // there are still bytes remaining, in both cases invalid + return NtsInvalid; + } + + // "consume" the character + memmove(conv_buf, conv_buf+consumed, conv_buf_len-consumed); + conv_buf_len -=3D consumed; + + char escaped_buf[ESCAPED_SIZE]; + size_t escaped_len =3D char_to_string(ch, '"', first, &escaped_buf= ); + if (out !=3D NULL) { + if (cur_out_len < escaped_len) { + return NtsOverflow; + } + memcpy(out, escaped_buf, escaped_len); + out +=3D escaped_len; + cur_out_len -=3D escaped_len; + } + first =3D false; + } + + // write ending quote + if (out !=3D NULL) { + if (cur_out_len =3D=3D 0) { + return NtsOverflow; + } + *out++ =3D quote; + cur_out_len--; + *out_len -=3D cur_out_len; // subtract remaining space to get used= space + } + + return NtsOk; +} + +static const char* 
basic_type(uint8_t tag) { + switch(tag) { + case 'b': + return "bool"; + case 'c': + return "char"; + case 'e': + return "str"; + case 'u': + return "()"; + case 'a': + return "i8"; + case 's': + return "i16"; + case 'l': + return "i32"; + case 'x': + return "i64"; + case 'n': + return "i128"; + case 'i': + return "isize"; + case 'h': + return "u8"; + case 't': + return "u16"; + case 'm': + return "u32"; + case 'y': + return "u64"; + case 'o': + return "u128"; + case 'j': + return "usize"; + case 'f': + return "f32"; + case 'd': + return "f64"; + case 'z': + return "!"; + case 'p': + return "_"; + case 'v': + return "..."; + default: + return NULL; + } +} + +static NODISCARD demangle_status parser_push_depth(struct parser *parser) { + parser->depth++; + if (parser->depth > MAX_DEPTH) { + return DemangleRecursed; + } else { + return DemangleOk; + } +} + +static demangle_status parser_pop_depth(struct parser *parser) { + parser->depth--; + return DemangleOk; +} + +static uint8_t parser_peek(struct parser const *parser) { + if (parser->next =3D=3D parser->sym_len) { + return 0; // add a "pseudo nul terminator" to avoid peeking past t= he end of a symbol + } else { + return parser->sym[parser->next]; + } +} + +static bool parser_eat(struct parser *parser, uint8_t ch) { + if (parser_peek(parser) =3D=3D ch) { + if (ch !=3D 0) { // safety: make sure we don't skip past the NUL t= erminator + parser->next++; + } + return true; + } else { + return false; + } +} + +static uint8_t parser_next(struct parser *parser) { + // don't advance after end of input, and return an imaginary NUL termi= nator + if (parser->next =3D=3D parser->sym_len) { + return 0; + } else { + return parser->sym[parser->next++]; + } +} + +static NODISCARD demangle_status parser_ch(struct parser *parser, uint8_t = *next) { + // don't advance after end of input + if (parser->next =3D=3D parser->sym_len) { + return DemangleInvalid; + } else { + *next =3D parser->sym[parser->next++]; + return DemangleOk; + } 
+} + +struct buf { + const char *start; + size_t len; +}; + +static NODISCARD demangle_status parser_hex_nibbles(struct parser *parser,= struct buf *buf) { + size_t start =3D parser->next; + for (;;) { + uint8_t ch =3D parser_next(parser); + if (ch =3D=3D '_') { + break; + } + if (!(('0' <=3D ch && ch <=3D '9') || ('a' <=3D ch && ch <=3D 'f')= )) { + return DemangleInvalid; + } + } + buf->start =3D parser->sym + start; + buf->len =3D parser->next - start - 1; // skip final _ + return DemangleOk; +} + +static NODISCARD demangle_status parser_digit_10(struct parser *parser, ui= nt8_t *out) { + uint8_t ch =3D parser_peek(parser); + if ('0' <=3D ch && ch <=3D '9') { + *out =3D ch - '0'; + parser->next++; + return DemangleOk; + } else { + return DemangleInvalid; + } +} + +static NODISCARD demangle_status parser_digit_62(struct parser *parser, ui= nt64_t *out) { + uint8_t ch =3D parser_peek(parser); + if ('0' <=3D ch && ch <=3D '9') { + *out =3D ch - '0'; + parser->next++; + return DemangleOk; + } else if ('a' <=3D ch && ch <=3D 'z') { + *out =3D 10 + (ch - 'a'); + parser->next++; + return DemangleOk; + } else if ('A' <=3D ch && ch <=3D 'Z') { + *out =3D 10 + 26 + (ch - 'A'); + parser->next++; + return DemangleOk; + } else { + return DemangleInvalid; + } +} + +static NODISCARD demangle_status parser_integer_62(struct parser *parser, = uint64_t *out) { + if (parser_eat(parser, '_')) { + *out =3D 0; + return DemangleOk; + } + + uint64_t x =3D 0; + demangle_status status; + while (!parser_eat(parser, '_')) { + uint64_t d; + if ((status =3D parser_digit_62(parser, &d)) !=3D DemangleOk) { + return status; + } + if (x > UINT64_MAX / 62) { + return DemangleInvalid; + } + x *=3D 62; + if (x > UINT64_MAX - d) { + return DemangleInvalid; + } + x +=3D d; + } + if (x =3D=3D UINT64_MAX) { + return DemangleInvalid; + } + *out =3D x + 1; + return DemangleOk; +} + +static NODISCARD demangle_status parser_opt_integer_62(struct parser *pars= er, uint8_t tag, uint64_t *out) { + if 
(!parser_eat(parser, tag)) { + *out =3D 0; + return DemangleOk; + } + + demangle_status status; + if ((status =3D parser_integer_62(parser, out)) !=3D DemangleOk) { + return status; + } + if (*out =3D=3D UINT64_MAX) { + return DemangleInvalid; + } + *out =3D *out + 1; + return DemangleOk; +} + +static NODISCARD demangle_status parser_disambiguator(struct parser *parse= r, uint64_t *out) { + return parser_opt_integer_62(parser, 's', out); +} + +typedef uint8_t parser_namespace_type; + +static NODISCARD demangle_status parser_namespace(struct parser *parser, p= arser_namespace_type *out) { + uint8_t next =3D parser_next(parser); + if ('A' <=3D next && next <=3D 'Z') { + *out =3D next; + return DemangleOk; + } else if ('a' <=3D next && next <=3D 'z') { + *out =3D 0; + return DemangleOk; + } else { + return DemangleInvalid; + } +} + +static NODISCARD demangle_status parser_backref(struct parser *parser, str= uct parser *out) { + size_t start =3D parser->next; + if (start =3D=3D 0) { + return DemangleBug; + } + size_t s_start =3D start - 1; + uint64_t i; + demangle_status status =3D parser_integer_62(parser, &i); + if (status !=3D DemangleOk) { + return status; + } + if (i >=3D s_start) { + return DemangleInvalid; + } + struct parser res =3D { + .sym =3D parser->sym, + .sym_len =3D parser->sym_len, + .next =3D (size_t)i, + .depth =3D parser->depth + }; + status =3D parser_push_depth(&res); + if (status !=3D DemangleOk) { + return status; + } + *out =3D res; + return DemangleOk; +} + +static NODISCARD demangle_status parser_ident(struct parser *parser, struc= t ident *out) { + bool is_punycode =3D parser_eat(parser, 'u'); + size_t len; + uint8_t d; + demangle_status status =3D parser_digit_10(parser, &d); + len =3D d; + if (status !=3D DemangleOk) { + return status; + } + if (len) { + for (;;) { + status =3D parser_digit_10(parser, &d); + if (status !=3D DemangleOk) { + break; + } + if (len > SIZE_MAX / 10) { + return DemangleInvalid; + } + len *=3D 10; + if (len > 
SIZE_MAX - d) { + return DemangleInvalid; + } + len +=3D d; + } + } + + // Skip past the optional `_` separator. + parser_eat(parser, '_'); + + size_t start =3D parser->next; + if (parser->sym_len - parser->next < len) { + return DemangleInvalid; + } + parser->next +=3D len; + + const char *ident =3D &parser->sym[start]; + + if (is_punycode) { + const char *underscore =3D demangle_memrchr(ident, '_', (size_t)le= n); + if (underscore =3D=3D NULL) { + *out =3D (struct ident){ + .ascii_start=3D"", + .ascii_len=3D0, + .punycode_start=3Dident, + .punycode_len=3Dlen + }; + } else { + size_t ascii_len =3D underscore - ident; + // ascii_len <=3D len - 1 since `_` is in the first len bytes + size_t punycode_len =3D len - 1 - ascii_len; + *out =3D (struct ident){ + .ascii_start=3Dident, + .ascii_len=3Dascii_len, + .punycode_start=3Dunderscore + 1, + .punycode_len=3Dpunycode_len + }; + } + if (out->punycode_len =3D=3D 0) { + return DemangleInvalid; + } + return DemangleOk; + } else { + *out =3D (struct ident) { + .ascii_start=3Dident, + .ascii_len=3D(size_t)len, + .punycode_start=3D"", + .punycode_len=3D0, + }; + return DemangleOk; + } +} + +#define INVALID_SYNTAX "{invalid syntax}" + +static const char *demangle_error_message(demangle_status status) { + switch (status) { + case DemangleInvalid: + return INVALID_SYNTAX; + case DemangleBug: + return "{bug}"; + case DemangleRecursed: + return "{recursion limit reached}"; + default: + return "{unknown error}"; + } +} + +#define PRINT(print_fn) \ + do { \ + if ((print_fn) =3D=3D OverflowOverflow) { \ + return OverflowOverflow; \ + } \ + } while(0) + +#define PRINT_CH(printer, s) PRINT(printer_print_ch((printer), (s))) +#define PRINT_STR(printer, s) PRINT(printer_print_str((printer), (s))) +#define PRINT_U64(printer, s) PRINT(printer_print_u64((printer), (s))) +#define PRINT_IDENT(printer, s) PRINT(printer_print_ident((printer), (s))) + +#define INVALID(printer) \ + do { \ + PRINT_STR((printer), INVALID_SYNTAX); \ + 
(printer)->status =3D DemangleInvalid; \ + return OverflowOk; \ + } while(0) + +#define PARSE(printer, method, ...) \ + do { \ + if ((printer)->status !=3D DemangleOk) { \ + PRINT_STR((printer), "?"); \ + return OverflowOk; \ + } else { \ + demangle_status _parse_status =3D method(&(printer)->parser, ## __VA= _ARGS__); \ + if (_parse_status !=3D DemangleOk) { \ + PRINT_STR((printer), demangle_error_message(_parse_status)); \ + (printer)->status =3D _parse_status; \ + return OverflowOk; \ + } \ + } \ + } while(0) + +#define PRINT_SEP_LIST(printer, body, sep) \ + do { \ + size_t _sep_list_i; \ + PRINT_SEP_LIST_COUNT(printer, _sep_list_i, body, sep); \ + } while(0) + +#define PRINT_SEP_LIST_COUNT(printer, count, body, sep) \ + do { \ + count =3D 0; \ + while ((printer)->status =3D=3D DemangleOk && !printer_eat((printer), = 'E')) { \ + if (count > 0) { PRINT_STR(printer, sep); } \ + body; \ + count++; \ + } \ + } while(0) + +static bool printer_eat(struct printer *printer, uint8_t b) { + if (printer->status !=3D DemangleOk) { + return false; + } + + return parser_eat(&printer->parser, b); +} + +static void printer_pop_depth(struct printer *printer) { + if (printer->status =3D=3D DemangleOk) { + parser_pop_depth(&printer->parser); + } +} + +static NODISCARD overflow_status printer_print_buf(struct printer *printer= , const char *start, size_t len) { + if (printer->out =3D=3D NULL) { + return OverflowOk; + } + if (printer->out_len < len) { + return OverflowOverflow; + } + + memcpy(printer->out, start, len); + printer->out +=3D len; + printer->out_len -=3D len; + return OverflowOk; +} + +static NODISCARD overflow_status printer_print_str(struct printer *printer= , const char *buf) { + return printer_print_buf(printer, buf, strlen(buf)); +} + +static NODISCARD overflow_status printer_print_ch(struct printer *printer,= char ch) { + return printer_print_buf(printer, &ch, 1); +} + +static NODISCARD overflow_status printer_print_u64(struct printer *printer= , uint64_t n) { + 
char buf[32] =3D {0}; + sprintf(buf, "%llu", (unsigned long long)n); // printing uint64 uses 2= 1 < 32 chars + return printer_print_str(printer, buf); +} + +static NODISCARD overflow_status printer_print_ident(struct printer *print= er, struct ident *ident) { + if (printer->out =3D=3D NULL) { + return OverflowOk; + } + + size_t out_len =3D printer->out_len; + overflow_status status; + if ((status =3D display_ident(ident->ascii_start, ident->ascii_len, id= ent->punycode_start, ident->punycode_len, (uint8_t*)printer->out, &out_len)= ) !=3D OverflowOk) { + return status; + } + printer->out +=3D out_len; + printer->out_len -=3D out_len; + return OverflowOk; +} + +typedef overflow_status (*printer_fn)(struct printer *printer); +typedef overflow_status (*backref_fn)(struct printer *printer, bool *arg); + +static NODISCARD overflow_status printer_print_backref(struct printer *pri= nter, backref_fn func, bool *arg) { + struct parser backref; + PARSE(printer, parser_backref, &backref); + + if (printer->out =3D=3D NULL) { + return OverflowOk; + } + + struct parser orig_parser =3D printer->parser; + demangle_status orig_status =3D printer->status; // fixme not sure thi= s is needed match for Ok on the Rust side + printer->parser =3D backref; + printer->status =3D DemangleOk; + overflow_status status =3D func(printer, arg); + printer->parser =3D orig_parser; + printer->status =3D orig_status; + + return status; +} + +static NODISCARD overflow_status printer_print_lifetime_from_index(struct = printer *printer, uint64_t lt) { + // Bound lifetimes aren't tracked when skipping printing. 
+ if (printer->out =3D=3D NULL) { + return OverflowOk; + } + + PRINT_STR(printer, "'"); + if (lt =3D=3D 0) { + PRINT_STR(printer, "_"); + return OverflowOk; + } + =20 + if (printer->bound_lifetime_depth < lt) { + INVALID(printer); + } else { + uint64_t depth =3D printer->bound_lifetime_depth - lt; + if (depth < 26) { + PRINT_CH(printer, 'a' + depth); + } else { + PRINT_STR(printer, "_"); + PRINT_U64(printer, depth); + } + + return OverflowOk; + } +} + +static NODISCARD overflow_status printer_in_binder(struct printer *printer= , printer_fn func) { + uint64_t bound_lifetimes; + PARSE(printer, parser_opt_integer_62, 'G', &bound_lifetimes); + + // Don't track bound lifetimes when skipping printing. + if (printer->out =3D=3D NULL) { + return func(printer); + } + + if (bound_lifetimes > 0) { + PRINT_STR(printer, "for<"); + for (uint64_t i =3D 0; i < bound_lifetimes; i++) { + if (i > 0) { + PRINT_STR(printer, ", "); + } + printer->bound_lifetime_depth++; + PRINT(printer_print_lifetime_from_index(printer, 1)); + } + PRINT_STR(printer, "> "); + } + + overflow_status r =3D func(printer); + printer->bound_lifetime_depth -=3D bound_lifetimes; + + return r; +} + +static NODISCARD overflow_status printer_print_generic_arg(struct printer = *printer) { + if (printer_eat(printer, 'L')) { + uint64_t lt; + PARSE(printer, parser_integer_62, &lt); + return printer_print_lifetime_from_index(printer, lt); + } else if (printer_eat(printer, 'K')) { + return printer_print_const(printer, false); + } else { + return printer_print_type(printer); + } +} + +static NODISCARD overflow_status printer_print_generic_args(struct printer= *printer) { + PRINT_STR(printer, "<"); + PRINT_SEP_LIST(printer, PRINT(printer_print_generic_arg(printer)), ", = "); + PRINT_STR(printer, ">"); + return OverflowOk; +} + +static NODISCARD overflow_status printer_print_path_out_of_value(struct pr= inter *printer, bool *_arg) { + (void)_arg; + return printer_print_path(printer, false); +} + +static NODISCARD
overflow_status printer_print_path_in_value(struct printe= r *printer, bool *_arg) { + (void)_arg; + return printer_print_path(printer, true); +} + +static NODISCARD overflow_status printer_print_path(struct printer *printe= r, bool in_value) { + PARSE(printer, parser_push_depth); + uint8_t tag; + PARSE(printer, parser_ch, &tag); + + overflow_status st; + uint64_t dis; + struct ident name; + parser_namespace_type ns; + char *orig_out; + + switch(tag) { + case 'C': + PARSE(printer, parser_disambiguator, &dis); + PARSE(printer, parser_ident, &name); + + PRINT_IDENT(printer, &name); + + if (printer->out !=3D NULL && !printer->alternate && dis !=3D 0) { + PRINT_STR(printer, "["); + char buf[24] =3D {0}; + sprintf(buf, "%llx", (unsigned long long)dis); + PRINT_STR(printer, buf); + PRINT_STR(printer, "]"); + } + break; + case 'N': + PARSE(printer, parser_namespace, &ns); + if ((st =3D printer_print_path(printer, in_value)) !=3D OverflowOk= ) { + return st; + } + + // HACK(eddyb) if the parser is already marked as having errored, + // `parse!` below will print a `?` without its preceding `::` + // (because printing the `::` is skipped in certain conditions, + // i.e. a lowercase namespace with an empty identifier), + // so in order to get `::?`, the `::` has to be printed here. 
+ if (printer->status !=3D DemangleOk) { + PRINT_STR(printer, "::"); + } + + PARSE(printer, parser_disambiguator, &dis); + PARSE(printer, parser_ident, &name); + // Special namespace, like closures and shims + if (ns) { + PRINT_STR(printer, "::{"); + if (ns =3D=3D 'C') { + PRINT_STR(printer, "closure"); + } else if (ns =3D=3D 'S') { + PRINT_STR(printer, "shim"); + } else { + PRINT_CH(printer, ns); + } + if (name.ascii_len !=3D 0 || name.punycode_len !=3D 0) { + PRINT_STR(printer, ":"); + PRINT_IDENT(printer, &name); + } + PRINT_STR(printer, "#"); + PRINT_U64(printer, dis); + PRINT_STR(printer, "}"); + } else { + // Implementation-specific/unspecified namespaces + if (name.ascii_len !=3D 0 || name.punycode_len !=3D 0) { + PRINT_STR(printer, "::"); + PRINT_IDENT(printer, &name); + } + } + break; + case 'M': + case 'X': + // for impls, ignore the impls own path + PARSE(printer, parser_disambiguator, &dis); + orig_out =3D printer->out; + printer->out =3D NULL; + PRINT(printer_print_path(printer, false)); + printer->out =3D orig_out; + + // fallthru + case 'Y': + PRINT_STR(printer, "<"); + PRINT(printer_print_type(printer)); + if (tag !=3D 'M') { + PRINT_STR(printer, " as "); + PRINT(printer_print_path(printer, false)); + } + PRINT_STR(printer, ">"); + break; + case 'I': + PRINT(printer_print_path(printer, in_value)); + if (in_value) { + PRINT_STR(printer, "::"); + } + PRINT(printer_print_generic_args(printer)); + break; + case 'B': + PRINT(printer_print_backref(printer, in_value ? 
printer_print_path_in_= value : printer_print_path_out_of_value, NULL)); + break; + default: + INVALID(printer); + break; + } + + printer_pop_depth(printer); + return OverflowOk; +} + +static NODISCARD overflow_status printer_print_const_uint(struct printer *= printer, uint8_t tag) { + struct buf hex; + PARSE(printer, parser_hex_nibbles, &hex); + + uint64_t val; + if (try_parse_uint(hex.start, hex.len, &val)) { + PRINT_U64(printer, val); + } else { + PRINT_STR(printer, "0x"); + PRINT(printer_print_buf(printer, hex.start, hex.len)); + } + + if (printer->out !=3D NULL && !printer->alternate) { + const char *ty =3D basic_type(tag); + if (/* safety */ ty !=3D NULL) { + PRINT_STR(printer, ty); + } + } + + return OverflowOk; +} + +static NODISCARD overflow_status printer_print_const_str_literal(struct pr= inter *printer) { + struct buf hex; + PARSE(printer, parser_hex_nibbles, &hex); + + size_t out_len =3D SIZE_MAX; + nibbles_to_string_status nts_status =3D nibbles_to_string(hex.start, h= ex.len, NULL, &out_len); + switch (nts_status) { + case NtsOk: + if (printer->out !=3D NULL) { + out_len =3D printer->out_len; + nts_status =3D nibbles_to_string(hex.start, hex.len, (uint8_t*= )printer->out, &out_len); + if (nts_status !=3D NtsOk) { + return OverflowOverflow; + } + printer->out +=3D out_len; + printer->out_len -=3D out_len; + } + return OverflowOk; + case NtsOverflow: + // technically if there is a string of size `SIZE_MAX/6` whose esc= aped version overflows + // SIZE_MAX but has an invalid char, this will be a "fake" overflo= w. In practice, + // that is not going to happen and a fuzzer will not generate stri= ngs of this length. 
+ return OverflowOverflow; + case NtsInvalid: + default: + INVALID(printer); + } +} + +static NODISCARD overflow_status printer_print_const_struct(struct printer= *printer) { + uint64_t dis; + struct ident name; + PARSE(printer, parser_disambiguator, &dis); + PARSE(printer, parser_ident, &name); + PRINT_IDENT(printer, &name); + PRINT_STR(printer, ": "); + return printer_print_const(printer, true); +} + +static NODISCARD overflow_status printer_print_const_out_of_value(struct p= rinter *printer, bool *_arg) { + (void)_arg; + return printer_print_const(printer, false); +} + +static NODISCARD overflow_status printer_print_const_in_value(struct print= er *printer, bool *_arg) { + (void)_arg; + return printer_print_const(printer, true); +} + +static NODISCARD overflow_status printer_print_const(struct printer *print= er, bool in_value) { + uint8_t tag; + + PARSE(printer, parser_ch, &tag); + PARSE(printer, parser_push_depth); + + struct buf hex; + uint64_t val; + size_t count; + + bool opened_brace =3D false; +#define OPEN_BRACE_IF_OUTSIDE_EXPR \ + do { if (!in_value) { \ + opened_brace =3D true; \ + PRINT_STR(printer, "{"); \ + } } while(0) + =20 + switch(tag) { + case 'p': + PRINT_STR(printer, "_"); + break; + // Primitive leaves with hex-encoded values (see `basic_type`). 
+ case 'a': + case 's': + case 'l': + case 'x': + case 'n': + case 'i': + if (printer_eat(printer, 'n')) { + PRINT_STR(printer, "-"); + } + /* fallthrough */ + case 'h': + case 't': + case 'm': + case 'y': + case 'o': + case 'j': + PRINT(printer_print_const_uint(printer, tag)); + break; + case 'b': + PARSE(printer, parser_hex_nibbles, &hex); + if (try_parse_uint(hex.start, hex.len, &val)) { + if (val =3D=3D 0) { + PRINT_STR(printer, "false"); + } else if (val =3D=3D 1) { + PRINT_STR(printer, "true"); + } else { + INVALID(printer); + } + } else { + INVALID(printer); + } + break; + case 'c': + PARSE(printer, parser_hex_nibbles, &hex); + if (try_parse_uint(hex.start, hex.len, &val) + && val < UINT32_MAX + && validate_char((uint32_t)val)) + { + char escaped_buf[ESCAPED_SIZE]; + size_t escaped_size =3D char_to_string((uint32_t)val, '\'', tr= ue, &escaped_buf); + + PRINT_STR(printer, "'"); + PRINT(printer_print_buf(printer, escaped_buf, escaped_size)); + PRINT_STR(printer, "'"); + } else { + INVALID(printer); + } + break; + case 'e': + OPEN_BRACE_IF_OUTSIDE_EXPR; + PRINT_STR(printer, "*"); + PRINT(printer_print_const_str_literal(printer)); + break; + case 'R': + case 'Q': + if (tag =3D=3D 'R' && printer_eat(printer, 'e')) { + PRINT(printer_print_const_str_literal(printer)); + } else { + OPEN_BRACE_IF_OUTSIDE_EXPR; + PRINT_STR(printer, "&"); + if (tag !=3D 'R') { + PRINT_STR(printer, "mut "); + } + PRINT(printer_print_const(printer, true)); + } + break; + case 'A': + OPEN_BRACE_IF_OUTSIDE_EXPR; + PRINT_STR(printer, "["); + PRINT_SEP_LIST(printer, PRINT(printer_print_const(printer, true)),= ", "); + PRINT_STR(printer, "]"); + break; + case 'T': + OPEN_BRACE_IF_OUTSIDE_EXPR; + PRINT_STR(printer, "("); + PRINT_SEP_LIST_COUNT(printer, count, PRINT(printer_print_const(pri= nter, true)), ", "); + if (count =3D=3D 1) { + PRINT_STR(printer, ","); + } + PRINT_STR(printer, ")"); + break; + case 'V': + OPEN_BRACE_IF_OUTSIDE_EXPR; + PRINT(printer_print_path(printer, true)); + 
PARSE(printer, parser_ch, &tag); + switch(tag) { + case 'U': + break; + case 'T': + PRINT_STR(printer, "("); + PRINT_SEP_LIST(printer, PRINT(printer_print_const(printer, true)),= ", "); + PRINT_STR(printer, ")"); + break; + case 'S': + PRINT_STR(printer, " { "); + PRINT_SEP_LIST(printer, PRINT(printer_print_const_struct(printer))= , ", "); + PRINT_STR(printer, " }"); + break; + default: + INVALID(printer); + } + break; + case 'B': + PRINT(printer_print_backref(printer, in_value ? printer_print_cons= t_in_value : printer_print_const_out_of_value, NULL)); + break; + default:=20 + INVALID(printer); + } +#undef OPEN_BRACE_IF_OUTSIDE_EXPR + + if (opened_brace) { + PRINT_STR(printer, "}"); + } + printer_pop_depth(printer); + + return OverflowOk; +} + +/// A trait in a trait object may have some "existential projections" +/// (i.e. associated type bindings) after it, which should be printed +/// in the `<...>` of the trait, e.g. `dyn Trait`. +/// To this end, this method will keep the `<...>` of an 'I' path +/// open, by omitting the `>`, and return `Ok(true)` in that case. +static NODISCARD overflow_status printer_print_maybe_open_generics(struct = printer *printer, bool *open) { + if (printer_eat(printer, 'B')) { + // NOTE(eddyb) the closure may not run if printing is being skippe= d, + // but in that case the returned boolean doesn't matter. 
+ *open =3D false; + return printer_print_backref(printer, printer_print_maybe_open_gen= erics, open); + } else if(printer_eat(printer, 'I')) { + PRINT(printer_print_path(printer, false)); + PRINT_STR(printer, "<"); + PRINT_SEP_LIST(printer, PRINT(printer_print_generic_arg(printer)),= ", "); + *open =3D true; + return OverflowOk; + } else { + PRINT(printer_print_path(printer, false)); + *open =3D false; + return OverflowOk; + } +} + +static NODISCARD overflow_status printer_print_dyn_trait(struct printer *p= rinter) { + bool open; + PRINT(printer_print_maybe_open_generics(printer, &open)); + + while (printer_eat(printer, 'p')) { + if (!open) { + PRINT_STR(printer, "<"); + open =3D true; + } else { + PRINT_STR(printer, ", "); + } + + struct ident name; + PARSE(printer, parser_ident, &name); + + PRINT_IDENT(printer, &name); + PRINT_STR(printer, " =3D "); + PRINT(printer_print_type(printer)); + } + + if (open) { + PRINT_STR(printer, ">"); + } + + return OverflowOk; +} + +static NODISCARD overflow_status printer_print_object_bounds(struct printe= r *printer) { + PRINT_SEP_LIST(printer, PRINT(printer_print_dyn_trait(printer)), " + "= ); + return OverflowOk; +} + +static NODISCARD overflow_status printer_print_function_type(struct printe= r *printer) { + bool is_unsafe =3D printer_eat(printer, 'U'); + const char *abi; + size_t abi_len; + if (printer_eat(printer, 'K')) { + if (printer_eat(printer, 'C')) { + abi =3D "C"; + abi_len =3D 1; + } else { + struct ident abi_ident; + PARSE(printer, parser_ident, &abi_ident); + if (abi_ident.ascii_len =3D=3D 0 || abi_ident.punycode_len != =3D 0) { + INVALID(printer); + } + abi =3D abi_ident.ascii_start; + abi_len =3D abi_ident.ascii_len; + } + } else { + abi =3D NULL; + abi_len =3D 0; + } + + if (is_unsafe) { + PRINT_STR(printer, "unsafe "); + } + + if (abi !=3D NULL) { + PRINT_STR(printer, "extern \""); + + // replace _ with - + while (abi_len > 0) { + const char *minus =3D memchr(abi, '_', abi_len); + if (minus =3D=3D NULL) { + 
PRINT(printer_print_buf(printer, (const char*)abi, abi_len= )); + break; + } else { + size_t space_to_minus =3D minus - abi; + PRINT(printer_print_buf(printer, (const char*)abi, space_t= o_minus)); + PRINT_STR(printer, "-"); + abi =3D minus + 1; + abi_len -=3D (space_to_minus + 1); + } + } + + PRINT_STR(printer, "\" "); + } + + PRINT_STR(printer, "fn("); + PRINT_SEP_LIST(printer, PRINT(printer_print_type(printer)), ", "); + PRINT_STR(printer, ")"); + + if (printer_eat(printer, 'u')) { + // Skip printing the return type if it's 'u', i.e. `()`. + } else { + PRINT_STR(printer, " -> "); + PRINT(printer_print_type(printer)); + } + + return OverflowOk; +} + +static NODISCARD overflow_status printer_print_type_backref(struct printer= *printer, bool *_arg) { + (void)_arg; + return printer_print_type(printer); +} + +static NODISCARD overflow_status printer_print_type(struct printer *printe= r) { + uint8_t tag; + PARSE(printer, parser_ch, &tag); + + const char *basic_ty =3D basic_type(tag); + if (basic_ty) { + return printer_print_str(printer, basic_ty); + } + + uint64_t count; + uint64_t lt; + + PARSE(printer, parser_push_depth); + switch (tag) { + case 'R': + case 'Q': + PRINT_STR(printer, "&"); + if (printer_eat(printer, 'L')) { + PARSE(printer, parser_integer_62, &lt); + if (lt !=3D 0) { + PRINT(printer_print_lifetime_from_index(printer, lt)); + PRINT_STR(printer, " "); + } + } + if (tag !=3D 'R') { + PRINT_STR(printer, "mut "); + } + PRINT(printer_print_type(printer)); + break; + case 'P': + case 'O': + PRINT_STR(printer, "*"); + if (tag !=3D 'P') { + PRINT_STR(printer, "mut "); + } else { + PRINT_STR(printer, "const "); + } + PRINT(printer_print_type(printer)); + break; + case 'A': + case 'S': + PRINT_STR(printer, "["); + PRINT(printer_print_type(printer)); + if (tag =3D=3D 'A') { + PRINT_STR(printer, "; "); + PRINT(printer_print_const(printer, true)); + } + PRINT_STR(printer, "]"); + break; + case 'T': + PRINT_STR(printer, "("); + PRINT_SEP_LIST_COUNT(printer, count,
PRINT(printer_print_type(prin= ter)), ", "); + if (count =3D=3D 1) { + PRINT_STR(printer, ","); + } + PRINT_STR(printer, ")"); + break; + case 'F': + PRINT(printer_in_binder(printer, printer_print_function_type)); + break; + case 'D': + PRINT_STR(printer, "dyn "); + PRINT(printer_in_binder(printer, printer_print_object_bounds)); + + if (!printer_eat(printer, 'L')) { + INVALID(printer); + } + PARSE(printer, parser_integer_62, &lt); + + if (lt !=3D 0) { + PRINT_STR(printer, " + "); + PRINT(printer_print_lifetime_from_index(printer, lt)); + } + break; + case 'B': + PRINT(printer_print_backref(printer, printer_print_type_backref, N= ULL)); + break; + default: + // Go back to the tag, so `print_path` also sees it. + if (printer->status =3D=3D DemangleOk && /* safety */ printer->par= ser.next > 0) { + printer->parser.next--; + } + PRINT(printer_print_path(printer, false)); + } + + printer_pop_depth(printer); + return OverflowOk; +} + +NODISCARD static demangle_status rust_demangle_legacy_demangle(const char = *s, size_t s_len, struct demangle_legacy *res, const char **rest) +{ + if (s_len > strlen(s)) { + // s_len only exists to shorten the string, this is not a buffer A= PI + return DemangleInvalid; + } + + const char *inner; + size_t inner_len; + if (s_len >=3D 3 && !strncmp(s, "_ZN", 3)) { + inner =3D s + 3; + inner_len =3D s_len - 3; + } else if (s_len >=3D 2 && !strncmp(s, "ZN", 2)) { + // On Windows, dbghelp strips leading underscores, so we accept "Z= N...E" + // form too.
+ inner =3D s + 2; + inner_len =3D s_len - 2; + } else if (s_len >=3D 4 && !strncmp(s, "__ZN", 4)) { + // On OSX, symbols are prefixed with an extra _ + inner =3D s + 4; + inner_len =3D s_len - 4; + } else { + return DemangleInvalid; + } + + if (!str_isascii(inner, inner_len)) { + return DemangleInvalid; + } + + size_t elements =3D 0; + const char *chars =3D inner; + size_t chars_len =3D inner_len; + if (chars_len =3D=3D 0) { + return DemangleInvalid; + } + char c; + while ((c =3D *chars) !=3D 'E') { + // Decode an identifier element's length + if (c < '0' || c > '9') { + return DemangleInvalid; + } + size_t len =3D 0; + while (c >=3D '0' && c <=3D '9') { + size_t d =3D c - '0'; + if (len > SIZE_MAX / 10) { + return DemangleInvalid; + } + len *=3D 10; + if (len > SIZE_MAX - d) { + return DemangleInvalid; + } + len +=3D d; + + chars++; + chars_len--; + if (chars_len =3D=3D 0) { + return DemangleInvalid; + } + c =3D *chars; + } + + // Advance by the length + if (chars_len <=3D len) { + return DemangleInvalid; + } + chars +=3D len; + chars_len -=3D len; + elements++; + } + *res =3D (struct demangle_legacy) { inner, inner_len, elements }; + *rest =3D chars + 1; + return DemangleOk; +} + +static bool is_rust_hash(const char *s, size_t len) { + if (len =3D=3D 0 || s[0] !=3D 'h') { + return false; + } + + for (size_t i =3D 1; i < len; i++) { + if (!((s[i] >=3D '0' && s[i] <=3D '9') || (s[i] >=3D 'a' && s[i] <= =3D 'f') || (s[i] >=3D 'A' && s[i] <=3D 'F'))) { + return false; + } + } + + return true; +} + +NODISCARD static overflow_status rust_demangle_legacy_display_demangle(str= uct demangle_legacy res, char *out, size_t len, bool alternate) +{ + struct printer printer =3D { + // not actually using the parser part of the printer, just keeping= it to share the format functions + DemangleOk, + { NULL },=20 + out, + len, + 0, + alternate + }; + const char *inner =3D res.mangled; + for (size_t element =3D 0; element < res.elements; element++) { + size_t i =3D 0; + const char 
*rest; + for (rest =3D inner; rest < res.mangled + res.mangled_len && *rest= >=3D '0' && *rest <=3D '9'; rest++) { + i *=3D 10; + i +=3D *rest - '0'; + } + if ((size_t)(res.mangled + res.mangled_len - rest) < i) { + // safety: shouldn't reach this place if the input string is v= alidated. bail out. + // safety: we know rest <=3D res.mangled + res.mangled_len fro= m the for-loop above + break; + } + + size_t len =3D i; + inner =3D rest + len; + + // From here on, inner contains a pointer to the next element, res= t[:len] to the current one =20 + if (alternate && element + 1 =3D=3D res.elements && is_rust_hash(r= est, i)) { + break; + } + if (element !=3D 0) { + PRINT_STR(&printer, "::"); + } + + if (len >=3D 2 && !strncmp(rest, "_$", 2)) { + rest++; + len--; + } + =20 + while (len > 0) { + if (rest[0] =3D=3D '.') { + if (len >=3D 2 && rest[1] =3D=3D '.') { + PRINT_STR(&printer, "::"); + rest +=3D 2; + len -=3D 2; + } else { + PRINT_STR(&printer, "."); + rest +=3D 1; + len -=3D 1; + } + } else if (rest[0] =3D=3D '$') { + const char *escape =3D memchr(rest + 1, '$', len - 1); + if (escape =3D=3D NULL) { + break; + } + const char *escape_start =3D rest + 1; + size_t escape_len =3D escape - (rest + 1); + =20 + size_t next_len =3D len - (escape + 1 - rest); + const char *next_rest =3D escape + 1; + + char ch; + if ((escape_len =3D=3D 2 && escape_start[0] =3D=3D 'S' && = escape_start[1] =3D=3D 'P')) { + ch =3D '@'; + } else if ((escape_len =3D=3D 2 && escape_start[0] =3D=3D = 'B' && escape_start[1] =3D=3D 'P')) { + ch =3D '*'; + } else if ((escape_len =3D=3D 2 && escape_start[0] =3D=3D = 'R' && escape_start[1] =3D=3D 'F')) { + ch =3D '&'; + } else if ((escape_len =3D=3D 2 && escape_start[0] =3D=3D = 'L' && escape_start[1] =3D=3D 'T')) { + ch =3D '<'; + } else if ((escape_len =3D=3D 2 && escape_start[0] =3D=3D = 'G' && escape_start[1] =3D=3D 'T')) { + ch =3D '>'; + } else if ((escape_len =3D=3D 2 && escape_start[0] =3D=3D = 'L' && escape_start[1] =3D=3D 'P')) { + ch 
=3D
'('; + } else if ((escape_len =3D=3D 2 && escape_start[0] =3D=3D = 'R' && escape_start[1] =3D=3D 'P')) { + ch =3D ')'; + } else if ((escape_len =3D=3D 1 && escape_start[0] =3D=3D = 'C')) { + ch =3D ','; + } else { + if (escape_len > 1 && escape_start[0] =3D=3D 'u') { + escape_start++; + escape_len--; + uint64_t val; + if (try_parse_uint(escape_start, escape_len, &val) + && val < UINT32_MAX + && validate_char((uint32_t)val)) + { + if (!unicode_iscontrol(val)) { + uint8_t wchr[4]; + size_t wchr_len =3D code_to_utf8(wchr, (ui= nt32_t)val); + PRINT(printer_print_buf(&printer, (const c= har*)wchr, wchr_len)); + len =3D next_len; + rest =3D next_rest; + continue; + } + } + } + break; // print the rest of this element raw + } + PRINT_CH(&printer, ch); + len =3D next_len; + rest =3D next_rest; + } else { + size_t j =3D 0; + for (;j < len && rest[j] !=3D '$' && rest[j] !=3D '.';j++); + if (j =3D=3D len) { + break; + } + PRINT(printer_print_buf(&printer, rest, j)); + rest +=3D j; + len -=3D j; + } + } + PRINT(printer_print_buf(&printer, rest, len)); + } + + if (printer.out_len < OVERFLOW_MARGIN) { + return OverflowOverflow; + } + *printer.out =3D '\0'; + return OverflowOk; +} + +static bool is_symbol_like(const char *s, size_t len) { + // rust-demangle definition of symbol like: control characters and spa= ce are not symbol-like, all else is + for (size_t i =3D 0; i < len; i++) { + char ch =3D s[i]; + if (!(ch >=3D 0x21 && ch <=3D 0x7e)) { + return false; + } + } + return true; +} + +void rust_demangle_demangle(const char *s, struct demangle *res) +{ + // During ThinLTO LLVM may import and rename internal symbols, so stri= p out + // those endings first as they're one of the last manglings applied to= symbol + // names. 
+ const char *llvm =3D ".llvm."; + const char *found_llvm =3D strstr(s, llvm); + size_t s_len =3D strlen(s); + if (found_llvm) { + const char *all_hex_ptr =3D found_llvm + strlen(".llvm."); + bool all_hex =3D true; + for (;*all_hex_ptr;all_hex_ptr++) { + if (!(('0' <=3D *all_hex_ptr && *all_hex_ptr <=3D '9') || + ('A' <=3D *all_hex_ptr && *all_hex_ptr <=3D 'F') || + *all_hex_ptr =3D=3D '@')) { + all_hex =3D false; + break; + } + } + + if (all_hex) { + s_len =3D found_llvm - s; + } + } + + const char *suffix; + struct demangle_legacy legacy; + demangle_status st =3D rust_demangle_legacy_demangle(s, s_len, &legacy= , &suffix); + if (st =3D=3D DemangleOk) { + *res =3D (struct demangle) { + .style=3DDemangleStyleLegacy, + .mangled=3Dlegacy.mangled, + .mangled_len=3Dlegacy.mangled_len, + .elements=3Dlegacy.elements, + .original=3Ds, + .original_len=3Ds_len, + .suffix=3Dsuffix, + .suffix_len=3Ds_len - (suffix - s), + }; + } else { + struct demangle_v0 v0; + st =3D rust_demangle_v0_demangle(s, s_len, &v0, &suffix); + if (st =3D=3D DemangleOk) { + *res =3D (struct demangle) { + .style=3DDemangleStyleV0, + .mangled=3Dv0.mangled, + .mangled_len=3Dv0.mangled_len, + .elements=3D0, + .original=3Ds, + .original_len=3Ds_len, + .suffix=3Dsuffix, + .suffix_len=3Ds_len - (suffix - s), + }; + } else { + *res =3D (struct demangle) { + .style=3DDemangleStyleUnknown, + .mangled=3DNULL, + .mangled_len=3D0, + .elements=3D0, + .original=3Ds, + .original_len=3Ds_len, + .suffix=3Ds, + .suffix_len=3D0, + }; =20 + } + } + + // Output like LLVM IR adds extra period-delimited words. See if + // we are in that case and save the trailing words if so. + if (res->suffix_len) { + if (res->suffix[0] =3D=3D '.' 
&& is_symbol_like(res->suffix, res->= suffix_len)) { + // Keep the suffix + } else { + // Reset the suffix and invalidate the demangling + res->style =3D DemangleStyleUnknown; + res->suffix_len =3D 0; + } + } +} + +bool rust_demangle_is_known(struct demangle *res) { + return res->style !=3D DemangleStyleUnknown; +} + +overflow_status rust_demangle_display_demangle(struct demangle const *res,= char *out, size_t len, bool alternate) { =20 + size_t original_len =3D res->original_len; + size_t out_len; + switch (res->style) { + case DemangleStyleUnknown: + if (len < original_len) { + return OverflowOverflow; + } else { + memcpy(out, res->original, original_len); + out +=3D original_len; + len -=3D original_len; + break; + } + break; + case DemangleStyleLegacy: { + struct demangle_legacy legacy =3D { + res->mangled, + res->mangled_len, + res->elements + }; + if (rust_demangle_legacy_display_demangle(legacy, out, len, altern= ate) =3D=3D OverflowOverflow) { + return OverflowOverflow; + } + out_len =3D strlen(out); + out +=3D out_len; + len -=3D out_len; + break; + } + case DemangleStyleV0: { + struct demangle_v0 v0 =3D { + res->mangled, + res->mangled_len + }; + if (rust_demangle_v0_display_demangle(v0, out, len, alternate) =3D= =3D OverflowOverflow) { + return OverflowOverflow; + } + out_len =3D strlen(out); + out +=3D out_len; + len -=3D out_len; + break; + } + } + size_t suffix_len =3D res->suffix_len; + if (len < suffix_len || len - suffix_len < OVERFLOW_MARGIN) { + return OverflowOverflow; + } + memcpy(out, res->suffix, suffix_len); + out[suffix_len] =3D 0; + return OverflowOk; +} diff --git a/tools/perf/util/demangle-rust-v0.h b/tools/perf/util/demangle-= rust-v0.h new file mode 100644 index 000000000000..d0092818610a --- /dev/null +++ b/tools/perf/util/demangle-rust-v0.h @@ -0,0 +1,88 @@ +// SPDX-License-Identifier: Apache-2.0 OR MIT + +// The contents of this file come from the Rust rustc-demangle library, ho= sted +// in the repository, licens= ed +// under 
"Apache-2.0 OR MIT". For copyright details, see +// . +// Please note that the file should be kept as close as possible to upstre= am. + +#ifndef _H_DEMANGLE_V0_H +#define _H_DEMANGLE_V0_H + +#ifdef __cplusplus +extern "C" { +#endif + +#include + +#if defined(__GNUC__) || defined(__clang__) +#define DEMANGLE_NODISCARD __attribute__((warn_unused_result)) +#else +#define DEMANGLE_NODISCARD +#endif + +typedef enum { + OverflowOk, + OverflowOverflow +} overflow_status; + +enum demangle_style { + DemangleStyleUnknown =3D 0, + DemangleStyleLegacy, + DemangleStyleV0, +}; + +// Not using a union here to make the struct easier to copy-paste if neede= d. +struct demangle { + enum demangle_style style; + // points to the "mangled" part of the name, + // not including `ZN` or `R` prefixes. + const char *mangled; + size_t mangled_len; + // In DemangleStyleLegacy, is the number of path elements + size_t elements; + // while it's called "original", it will not contain `.llvm.9D1C9369@@= 16` suffixes + // that are to be ignored. + const char *original; + size_t original_len; + // Contains the part after the mangled name that is to be outputted, + // which can be `.exit.i.i` suffixes LLVM sometimes adds. + const char *suffix; + size_t suffix_len; +}; + +// if the length of the output buffer is less than `output_len-OVERFLOW_MA= RGIN`, +// the demangler will return `OverflowOverflow` even if there is no overfl= ow. +#define OVERFLOW_MARGIN 4 + +/// Demangle a C string that refers to a Rust symbol and put the demangle = intermediate result in `res`. +/// Beware that `res` contains references into `s`. If `s` is modified (or= free'd) before calling +/// `rust_demangle_display_demangle` behavior is undefined. +/// +/// Use `rust_demangle_display_demangle` to convert it to an actual string. +void rust_demangle_demangle(const char *s, struct demangle *res); + +/// Write the string in a `struct demangle` into a buffer. 
+/// +/// Return `OverflowOk` if the output buffer was sufficiently big, `Overfl= owOverflow` if it wasn't. +/// This function is `O(n)` in the length of the input + *output* [$], but= the demangled output of demangling a symbol can +/// be exponentially[$$] large, therefore it is recommended to have a sane= bound (`rust-demangle` +/// uses 1,000,000 bytes) on `len`. +/// +/// `alternate`, if true, uses the less verbose alternate formatting (Rust= `{:#}`), which does not show +/// symbol hashes and types of constant ints. +/// +/// [$] It's `O(n * MAX_DEPTH)`, but `MAX_DEPTH` is a constant 300 and the= refore it's `O(n)` +/// [$$] Technically, bounded by `O(n^MAX_DEPTH)`, but this is practically= exponential. +DEMANGLE_NODISCARD overflow_status rust_demangle_display_demangle(struct d= emangle const *res, char *out, size_t len, bool alternate); + +/// Returns true if `res` refers to a known valid Rust demangling style, f= alse if it's an unknown style. +bool rust_demangle_is_known(struct demangle *res); + +#undef DEMANGLE_NODISCARD + +#ifdef __cplusplus +} +#endif + +#endif --=20 2.49.0.901.g37484f566f-goog From nobody Fri Sep 5 20:21:55 2025 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5C9CD801 for ; Wed, 30 Apr 2025 00:41:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745973704; cv=none; b=KwjGmkahbaXpeWGY6BXY7H4Ty4Wf2OkRjH/s+yaUtyV5BVV1lQBUZXXFFkWFzJwoABiseCq++dharUSTks/b5yAiMwxXD2GjQX6DCgDrYk1V+SUJEpVufs6PiAxsoYXQvkS317MVYr7paHvSEuMkFdD4TptWbDKUXZkcMlu6Y2A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745973704; c=relaxed/simple; bh=aQ8Ixb2ii7TM17c5CxHc/kZsrUa4Me1nM9JR+KWLbRA=;
b=uAw3OnDjoTThvlc1bJeZAEVKULyS/IDe59bK8XD4v0bLe532bwPgiCycoWNvpgJkL3 ha08Xx33nv/tmZzuJZs/1TMmyn1g8BwXqjkb2Vl9mXwt2RNhwhx7lqrIFj1W6ggmoD3D +S6SfvCg2mcEugmD32HC0ZDNbV18SkzEqYqMAaeZop2wu8m/osvlIwBbZkDC0Z1mQg3T AqOj18G48My2DjTB0uQ0FhcTaM9MQfRSCpNx7en6gYcI5l7tSRPia3vUQg69BybNukqf ayj9/I4cJiD/tXAYxMuNvyUsTLcFvD7c7ocBKAbqVPp/z24vMscZUc2RytBaBLh3t3Yi xyXw== X-Forwarded-Encrypted: i=1; AJvYcCUginPHJz/G9jhXvzK7dkKvLqCQIvS3/wjIRZzn8GV211D7WszhV/NMukzDFVZUsfRjcunsRmgqlQ6R5oU=@vger.kernel.org X-Gm-Message-State: AOJu0YxnDWz4dT0lUjQFgeM09UMdyDvQKssf1hPYJeKNbXL6r+Km/ErP VPLvz3x3dXxj+NBc0TMui6tkvP+XKyxTSEgTAa4AZBNIFbWt19ZBDVkNtaghIT2Az1e12slip+o JgtyKGg== X-Google-Smtp-Source: AGHT+IH5AV8B/lmxOmO7bvKCtJ0Ps0gr6YLjC9VmSVrpHIqji0bGhb7vyMUNXXGmFCeguUAOJMfQk3WEI6qq X-Received: from pfbfn23.prod.google.com ([2002:a05:6a00:2fd7:b0:740:555:f7af]) (user=irogers job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:6f09:b0:1f5:70af:a32a with SMTP id adf61e73a8af0-20a88e1c278mr1607808637.32.1745973701640; Tue, 29 Apr 2025 17:41:41 -0700 (PDT) Date: Tue, 29 Apr 2025 17:41:24 -0700 In-Reply-To: <20250430004128.474388-1-irogers@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250430004128.474388-1-irogers@google.com> X-Mailer: git-send-email 2.49.0.901.g37484f566f-goog Message-ID: <20250430004128.474388-3-irogers@google.com> Subject: [PATCH v2 2/6] perf symbol-elf: Integrate rust-v0 demangling From: Ian Rogers To: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Mark Rutland , Alexander Shishkin , Jiri Olsa , Adrian Hunter , Kan Liang , Miguel Ojeda , Alex Gaynor , Boqun Feng , Gary Guo , "=?UTF-8?q?Bj=C3=B6rn=20Roy=20Baron?=" , Benno Lossin , Andreas Hindborg , Alice Ryhl , Trevor Gross , Danilo Krummrich , Nathan Chancellor , Nick Desaulniers , Bill Wendling , Justin Stitt , James Clark , Howard Chu , Jiapeng Chong , Ravi Bangoria , "Masami Hiramatsu (Google)" , 
Stephen Brennan , linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, rust-for-linux@vger.kernel.org, llvm@lists.linux.dev, Daniel Xu , Ariel Ben-Yehuda Cc: Ian Rogers Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Use the demangle-rust-v0 APIs to see if symbol is Rust mangled and demangle if so. The API requires a pre-allocated output buffer, some estimation and retrying are added for this. Signed-off-by: Ian Rogers --- tools/perf/util/Build | 5 +++- tools/perf/util/symbol-elf.c | 47 ++++++++++++++++++++++++++---------- 2 files changed, 38 insertions(+), 14 deletions(-) diff --git a/tools/perf/util/Build b/tools/perf/util/Build index 4f7f072fa222..7910d908c814 100644 --- a/tools/perf/util/Build +++ b/tools/perf/util/Build @@ -241,9 +241,12 @@ perf-util-y +=3D cap.o perf-util-$(CONFIG_CXX_DEMANGLE) +=3D demangle-cxx.o perf-util-y +=3D demangle-ocaml.o perf-util-y +=3D demangle-java.o -perf-util-y +=3D demangle-rust.o +perf-util-y +=3D demangle-rust-v0.o perf-util-$(CONFIG_LIBLLVM) +=3D llvm-c-helpers.o =20 +CFLAGS_demangle-rust-v0.o +=3D -Wno-shadow -Wno-declaration-after-statemen= t \ + -Wno-switch-default -Wno-switch-enum -Wno-missing-field-initializers + ifdef CONFIG_JITDUMP perf-util-$(CONFIG_LIBELF) +=3D jitdump.o perf-util-$(CONFIG_LIBELF) +=3D genelf.o diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c index fbf6d0f73af9..3fc87309746f 100644 --- a/tools/perf/util/symbol-elf.c +++ b/tools/perf/util/symbol-elf.c @@ -16,13 +16,14 @@ #include "demangle-cxx.h" #include "demangle-ocaml.h" #include "demangle-java.h" -#include "demangle-rust.h" +#include "demangle-rust-v0.h" #include "machine.h" #include "vdso.h" #include "debug.h" #include "util/copyfile.h" #include #include +#include #include #include #include @@ -308,6 +309,9 @@ char *cxx_demangle_sym(const char *str __maybe_unused, = bool params __maybe_unuse =20 static char *demangle_sym(struct dso *dso, int kmodule, const char *elf_na= 
me) { + struct demangle rust_demangle =3D { + .style =3D DemangleStyleUnknown, + }; char *demangled =3D NULL; =20 /* @@ -318,21 +322,38 @@ static char *demangle_sym(struct dso *dso, int kmodul= e, const char *elf_name) if (!want_demangle(dso__kernel(dso) || kmodule)) return demangled; =20 - demangled =3D cxx_demangle_sym(elf_name, verbose > 0, verbose > 0); - if (demangled =3D=3D NULL) { - demangled =3D ocaml_demangle_sym(elf_name); - if (demangled =3D=3D NULL) { - demangled =3D java_demangle_sym(elf_name, JAVA_DEMANGLE_NORET); + rust_demangle_demangle(elf_name, &rust_demangle); + if (rust_demangle_is_known(&rust_demangle)) { + /* A rust mangled name. */ + if (rust_demangle.mangled_len =3D=3D 0) + return demangled; + + for (size_t buf_len =3D roundup_pow_of_two(rust_demangle.mangled_len * 2= ); + buf_len < 1024 * 1024; buf_len +=3D 32) { + char *tmp =3D realloc(demangled, buf_len); + + if (!tmp) { + /* Failure to grow output buffer, return what is there. */ + return demangled; + } + demangled =3D tmp; + if (rust_demangle_display_demangle(&rust_demangle, demangled, buf_len, + /*alternate=3D*/true) =3D=3D OverflowOk) + return demangled; } + /* Buffer exceeded sensible bounds, return what is there. */ + return demangled; } - else if (rust_is_mangled(demangled)) - /* - * Input to Rust demangling is the BFD-demangled - * name which it Rust-demangles in place. 
-		 */
-		rust_demangle_sym(demangled);
 
-	return demangled;
+	demangled = cxx_demangle_sym(elf_name, verbose > 0, verbose > 0);
+	if (demangled)
+		return demangled;
+
+	demangled = ocaml_demangle_sym(elf_name);
+	if (demangled)
+		return demangled;
+
+	return java_demangle_sym(elf_name, JAVA_DEMANGLE_NORET);
 }
 
 struct rel_info {
-- 
2.49.0.901.g37484f566f-goog

From nobody Fri Sep 5 20:21:55 2025
Date: Tue, 29 Apr 2025 17:41:25 -0700
In-Reply-To: <20250430004128.474388-1-irogers@google.com>
References: <20250430004128.474388-1-irogers@google.com>
Message-ID: <20250430004128.474388-4-irogers@google.com>
Subject: [PATCH v2 3/6] perf demangle-rust: Remove previous legacy rust decoder
From: Ian Rogers

The code is unused since the introduction of the rustc-demangle demangler.

Reviewed-by: Miguel Ojeda
Signed-off-by: Ian Rogers
---
 tools/perf/util/demangle-rust.c | 269 --------------------------------
 tools/perf/util/demangle-rust.h |   8 -
 2 files changed, 277 deletions(-)
 delete mode 100644 tools/perf/util/demangle-rust.c
 delete mode 100644 tools/perf/util/demangle-rust.h

diff --git a/tools/perf/util/demangle-rust.c b/tools/perf/util/demangle-rust.c
deleted file mode 100644
index a659fc69f73a..000000000000
--- a/tools/perf/util/demangle-rust.c
+++ /dev/null
@@ -1,269 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include
-#include "debug.h"
-
-#include "demangle-rust.h"
-
-/*
- * Mangled Rust symbols look like this:
- *
- *     _$LT$std..sys..fd..FileDesc$u20$as$u20$core..ops..Drop$GT$::drop::hc68340e1baa4987a
- *
- * The original symbol is:
- *
- *     ::drop
- *
- * The last component of the path is a 64-bit hash in lowercase hex, prefixed
- * with "h". Rust does not have a global namespace between crates, an illusion
- * which Rust maintains by using the hash to distinguish things that would
- * otherwise have the same symbol.
- *
- * Any path component not starting with a XID_Start character is prefixed with
- * "_".
- *
- * The following escape sequences are used:
- *
- *     "," => $C$
- *     "@" => $SP$
- *     "*" => $BP$
- *     "&" => $RF$
- *     "<" => $LT$
- *     ">" => $GT$
- *     "(" => $LP$
- *     ")" => $RP$
- *     " " => $u20$
- *     "'" => $u27$
- *     "[" => $u5b$
- *     "]" => $u5d$
- *     "~" => $u7e$
- *
- * A double ".." means "::" and a single "." means "-".
- *
- * The only characters allowed in the mangled symbol are a-zA-Z0-9 and _.:$
- */
-
-static const char *hash_prefix = "::h";
-static const size_t hash_prefix_len = 3;
-static const size_t hash_len = 16;
-
-static bool is_prefixed_hash(const char *start);
-static bool looks_like_rust(const char *sym, size_t len);
-static bool unescape(const char **in, char **out, const char *seq, char value);
-
-/*
- * INPUT:
- *     sym: symbol that has been through BFD-demangling
- *
- * This function looks for the following indicators:
- *
- *  1. The hash must consist of "h" followed by 16 lowercase hex digits.
- *
- *  2. As a sanity check, the hash must use between 5 and 15 of the 16 possible
- *     hex digits. This is true of 99.9998% of hashes so once in your life you
- *     may see a false negative. The point is to notice path components that
- *     could be Rust hashes but are probably not, like "haaaaaaaaaaaaaaaa". In
- *     this case a false positive (non-Rust symbol has an important path
- *     component removed because it looks like a Rust hash) is worse than a
- *     false negative (the rare Rust symbol is not demangled) so this sets the
- *     balance in favor of false negatives.
- *
- *  3. There must be no characters other than a-zA-Z0-9 and _.:$
- *
- *  4. There must be no unrecognized $-sign sequences.
- *
- *  5. There must be no sequence of three or more dots in a row ("...").
- */
-bool
-rust_is_mangled(const char *sym)
-{
-	size_t len, len_without_hash;
-
-	if (!sym)
-		return false;
-
-	len = strlen(sym);
-	if (len <= hash_prefix_len + hash_len)
-		/* Not long enough to contain "::h" + hash + something else */
-		return false;
-
-	len_without_hash = len - (hash_prefix_len + hash_len);
-	if (!is_prefixed_hash(sym + len_without_hash))
-		return false;
-
-	return looks_like_rust(sym, len_without_hash);
-}
-
-/*
- * A hash is the prefix "::h" followed by 16 lowercase hex digits. The hex
- * digits must comprise between 5 and 15 (inclusive) distinct digits.
- */
-static bool is_prefixed_hash(const char *str)
-{
-	const char *end;
-	bool seen[16];
-	size_t i;
-	int count;
-
-	if (strncmp(str, hash_prefix, hash_prefix_len))
-		return false;
-	str += hash_prefix_len;
-
-	memset(seen, false, sizeof(seen));
-	for (end = str + hash_len; str < end; str++)
-		if (*str >= '0' && *str <= '9')
-			seen[*str - '0'] = true;
-		else if (*str >= 'a' && *str <= 'f')
-			seen[*str - 'a' + 10] = true;
-		else
-			return false;
-
-	/* Count how many distinct digits seen */
-	count = 0;
-	for (i = 0; i < 16; i++)
-		if (seen[i])
-			count++;
-
-	return count >= 5 && count <= 15;
-}
-
-static bool looks_like_rust(const char *str, size_t len)
-{
-	const char *end = str + len;
-
-	while (str < end)
-		switch (*str) {
-		case '$':
-			if (!strncmp(str, "$C$", 3))
-				str += 3;
-			else if (!strncmp(str, "$SP$", 4)
-					|| !strncmp(str, "$BP$", 4)
-					|| !strncmp(str, "$RF$", 4)
-					|| !strncmp(str, "$LT$", 4)
-					|| !strncmp(str, "$GT$", 4)
-					|| !strncmp(str, "$LP$", 4)
-					|| !strncmp(str, "$RP$", 4))
-				str += 4;
-			else if (!strncmp(str, "$u20$", 5)
-					|| !strncmp(str, "$u27$", 5)
-					|| !strncmp(str, "$u5b$", 5)
-					|| !strncmp(str, "$u5d$", 5)
-					|| !strncmp(str, "$u7e$", 5))
-				str += 5;
-			else
-				return false;
-			break;
-		case '.':
-			/* Do not allow three or more consecutive dots */
-			if (!strncmp(str, "...", 3))
-				return false;
-			/* Fall through */
-		case 'a' ... 'z':
-		case 'A' ... 'Z':
-		case '0' ... '9':
-		case '_':
-		case ':':
-			str++;
-			break;
-		default:
-			return false;
-		}
-
-	return true;
-}
-
-/*
- * INPUT:
- *     sym: symbol for which rust_is_mangled(sym) returns true
- *
- * The input is demangled in-place because the mangled name is always longer
- * than the demangled one.
- */
-void
-rust_demangle_sym(char *sym)
-{
-	const char *in;
-	char *out;
-	const char *end;
-
-	if (!sym)
-		return;
-
-	in = sym;
-	out = sym;
-	end = sym + strlen(sym) - (hash_prefix_len + hash_len);
-
-	while (in < end)
-		switch (*in) {
-		case '$':
-			if (!(unescape(&in, &out, "$C$", ',')
-					|| unescape(&in, &out, "$SP$", '@')
-					|| unescape(&in, &out, "$BP$", '*')
-					|| unescape(&in, &out, "$RF$", '&')
-					|| unescape(&in, &out, "$LT$", '<')
-					|| unescape(&in, &out, "$GT$", '>')
-					|| unescape(&in, &out, "$LP$", '(')
-					|| unescape(&in, &out, "$RP$", ')')
-					|| unescape(&in, &out, "$u20$", ' ')
-					|| unescape(&in, &out, "$u27$", '\'')
-					|| unescape(&in, &out, "$u5b$", '[')
-					|| unescape(&in, &out, "$u5d$", ']')
-					|| unescape(&in, &out, "$u7e$", '~'))) {
-				pr_err("demangle-rust: unexpected escape sequence");
-				goto done;
-			}
-			break;
-		case '_':
-			/*
-			 * If this is the start of a path component and the next
-			 * character is an escape sequence, ignore the
-			 * underscore. The mangler inserts an underscore to make
-			 * sure the path component begins with a XID_Start
-			 * character.
-			 */
-			if ((in == sym || in[-1] == ':') && in[1] == '$')
-				in++;
-			else
-				*out++ = *in++;
-			break;
-		case '.':
-			if (in[1] == '.') {
-				/* ".." becomes "::" */
-				*out++ = ':';
-				*out++ = ':';
-				in += 2;
-			} else {
-				/* "." becomes "-" */
-				*out++ = '-';
-				in++;
-			}
-			break;
-		case 'a' ... 'z':
-		case 'A' ... 'Z':
-		case '0' ... '9':
-		case ':':
-			*out++ = *in++;
-			break;
-		default:
-			pr_err("demangle-rust: unexpected character '%c' in symbol\n",
-				*in);
-			goto done;
-		}
-
-done:
-	*out = '\0';
-}
-
-static bool unescape(const char **in, char **out, const char *seq, char value)
-{
-	size_t len = strlen(seq);
-
-	if (strncmp(*in, seq, len))
-		return false;
-
-	**out = value;
-
-	*in += len;
-	*out += 1;
-
-	return true;
-}
diff --git a/tools/perf/util/demangle-rust.h b/tools/perf/util/demangle-rust.h
deleted file mode 100644
index 2fca618b1aa5..000000000000
--- a/tools/perf/util/demangle-rust.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __PERF_DEMANGLE_RUST
-#define __PERF_DEMANGLE_RUST 1
-
-bool rust_is_mangled(const char *str);
-void rust_demangle_sym(char *str);
-
-#endif /* __PERF_DEMANGLE_RUST */
-- 
2.49.0.901.g37484f566f-goog

From nobody Fri Sep 5 20:21:55 2025
Date: Tue, 29 Apr 2025 17:41:26 -0700
In-Reply-To: <20250430004128.474388-1-irogers@google.com>
References: <20250430004128.474388-1-irogers@google.com>
Message-ID: <20250430004128.474388-5-irogers@google.com>
Subject: [PATCH v2 4/6] perf test demangle-rust: Add Rust demangling test
From: Ian Rogers

The test cases are the examples listed in:
https://doc.rust-lang.org/rustc/symbol-mangling/v0.html

This test was previously part of a different Rust v0 demangler:
https://lore.kernel.org/lkml/20250129193037.573431-1-irogers@google.com/

Signed-off-by: Ian Rogers
---
 tools/perf/tests/Build                   |  1 +
 tools/perf/tests/builtin-test.c          |  1 +
 tools/perf/tests/demangle-rust-v0-test.c | 74 ++++++++++++++++++++++++
 tools/perf/tests/tests.h                 |  1 +
 tools/perf/util/symbol-elf.c             |  2 +-
 5 files changed, 78 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/tests/demangle-rust-v0-test.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 934f32090553..2181f5a92148 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -56,6 +56,7 @@ perf-test-y += genelf.o
 perf-test-y += api-io.o
 perf-test-y += demangle-java-test.o
 perf-test-y += demangle-ocaml-test.o
+perf-test-y += demangle-rust-v0-test.o
 perf-test-y += pfm.o
 perf-test-y += parse-metric.o
 perf-test-y += pe-file-parsing.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 14d30a5053be..45d3d8b3317a 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -126,6 +126,7 @@ static struct test_suite *generic_tests[] = {
 	&suite__maps__merge_in,
 	&suite__demangle_java,
 	&suite__demangle_ocaml,
+	&suite__demangle_rust,
 	&suite__parse_metric,
 	&suite__pe_file_parsing,
 	&suite__expand_cgroup_events,
diff --git a/tools/perf/tests/demangle-rust-v0-test.c b/tools/perf/tests/demangle-rust-v0-test.c
new file mode 100644
index 000000000000..904f966c65d7
--- /dev/null
+++ b/tools/perf/tests/demangle-rust-v0-test.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: Apache-2.0 OR MIT
+#include "tests.h"
+#include "debug.h"
+#include "symbol.h"
+#include
+#include
+#include
+
+static int test__demangle_rust(struct test_suite *test __maybe_unused, int subtest __maybe_unused)
+{
+	int ret = TEST_OK;
+	char *buf = NULL;
+	size_t i;
+
+	struct {
+		const char *mangled, *demangled;
+	} test_cases[] = {
+		{ "_RNvMsr_NtCs3ssYzQotkvD_3std4pathNtB5_7PathBuf3newCs15kBYyAo9fc_7mycrate",
+		  "::new" },
+		{ "_RNvCs15kBYyAo9fc_7mycrate7example",
+		  "mycrate::example" },
+		{ "_RNvMs_Cs4Cv8Wi1oAIB_7mycrateNtB4_7Example3foo",
+		  "::foo" },
+		{ "_RNvXCs15kBYyAo9fc_7mycrateNtB2_7ExampleNtB2_5Trait3foo",
+		  "::foo" },
+		{ "_RNvMCs7qp2U7fqm6G_7mycrateNtB2_7Example3foo",
+		  "::foo" },
+		{ "_RNvMs_Cs7qp2U7fqm6G_7mycrateNtB4_7Example3bar",
+		  "::bar" },
+		{ "_RNvYNtCs15kBYyAo9fc_7mycrate7ExampleNtB4_5Trait7exampleB4_",
+		  "::example" },
+		{ "_RNCNvCsgStHSCytQ6I_7mycrate4main0B3_",
+		  "mycrate::main::{closure#0}" },
+		{ "_RNCNvCsgStHSCytQ6I_7mycrate4mains_0B3_",
+		  "mycrate::main::{closure#1}" },
+		{ "_RINvCsgStHSCytQ6I_7mycrate7examplelKj1_EB2_",
+		  "mycrate::example::" },
+		{ "_RINvCs7qp2U7fqm6G_7mycrate7exampleFG0_RL1_hRL0_tEuEB2_",
+		  "mycrate::example:: fn(&'a u8, &'b u16)>",
+		},
+		{ "_RINvCs7qp2U7fqm6G_7mycrate7exampleKy12345678_EB2_",
+		  "mycrate::example::<305419896>" },
+		{ "_RNvNvMCsd9PVOYlP1UU_7mycrateINtB4_7ExamplepKpE3foo14EXAMPLE_STATIC",
+		  ">::foo::EXAMPLE_STATIC",
+		},
+		{ "_RINvCs7qp2U7fqm6G_7mycrate7exampleAtj8_EB2_",
+		  "mycrate::example::<[u16; 8]>" },
+		{ "_RINvCs7qp2U7fqm6G_7mycrate7exampleNtB2_7ExampleBw_EB2_",
+		  "mycrate::example::" },
+		{ "_RINvMsY_NtCseXNvpPnDBDp_3std4pathNtB6_4Path3neweECs7qp2U7fqm6G_7mycrate",
+		  "::new::" },
+		{ "_RNvNvNvCs7qp2U7fqm6G_7mycrate7EXAMPLE7___getit5___KEY",
+		  "mycrate::EXAMPLE::__getit::__KEY" },
+	};
+
+	for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
+		buf = dso__demangle_sym(/*dso=*/NULL, /*kmodule=*/0, test_cases[i].mangled);
+		if (!buf) {
+			pr_debug("FAILED to demangle: \"%s\"\n \"%s\"\n", test_cases[i].mangled,
+				 test_cases[i].demangled);
+			continue;
+		}
+		if (strcmp(buf, test_cases[i].demangled)) {
+			pr_debug("FAILED: %s: %s != %s\n", test_cases[i].mangled,
+				 buf, test_cases[i].demangled);
+			ret = TEST_FAIL;
+		}
+		free(buf);
+	}
+
+	return ret;
+}
+
+DEFINE_SUITE("Demangle Rust", demangle_rust);
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 8aea344536b8..bb7951c61971 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -157,6 +157,7 @@ DECLARE_SUITE(jit_write_elf);
 DECLARE_SUITE(api_io);
 DECLARE_SUITE(demangle_java);
 DECLARE_SUITE(demangle_ocaml);
+DECLARE_SUITE(demangle_rust);
 DECLARE_SUITE(pfm);
 DECLARE_SUITE(parse_metric);
 DECLARE_SUITE(pe_file_parsing);
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 3fc87309746f..8734e8b6cf84 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -319,7 +319,7 @@ static char *demangle_sym(struct dso *dso, int kmodule, const char *elf_name)
 	 * DWARF DW_compile_unit has this, but we don't always have access
 	 * to it...
 	 */
-	if (!want_demangle(dso__kernel(dso) || kmodule))
+	if (!want_demangle((dso && dso__kernel(dso)) || kmodule))
 		return demangled;
 
 	rust_demangle_demangle(elf_name, &rust_demangle);
-- 
2.49.0.901.g37484f566f-goog

From nobody Fri Sep 5 20:21:55 2025
h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=o6vug3AfG8JKNUwEE2czgG00eEhZmcIDVapTNnNKnZD+Ta1a9AF9bY6hUcwNR5fMSPfB6TejQVR+KFXMztw1zPJQo7fpBnD77vz4ESMzWnX2HgVI+ZWbmmvZoEsnVixdxF1Ic669ecAWQlhIsKKsco8O1qooaSDtGn+ADD2Ffzk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--irogers.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=wgwPOYu/; arc=none smtp.client-ip=209.85.210.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--irogers.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="wgwPOYu/" Received: by mail-pf1-f202.google.com with SMTP id d2e1a72fcca58-739515de999so5108551b3a.1 for ; Tue, 29 Apr 2025 17:41:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1745973708; x=1746578508; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=9cYhstsrbro2bvVLigEn3Nm+qEHX+oY+qpb1akXLGPE=; b=wgwPOYu/3MCy4dUbuFwXITtgYdI4a2TuJ3lAVd7tiGEyf6CQr9WFUkktb30NiNLulW u9Jn84uOIuLSsyhdwdg5TKu0BG/0pKsNvfuCR31deHlWmq6MpgYT52wOs2cop3GZwRNE NbMLn/RqntV4VcOmbAapbDda5QwezI2nTjoYhTSeZTrfsB0pnLCwVPqYro7W2AUmNpCp rHDle2ZM6gSN1zI/QdgiWfdd+qCc+aZO9yFV0s+OSSEURrVYU89sI7mx1Vlv3xv2hmc1 EzHrOVB7G9XUKnhaZBhUH2j2F+1eBQCScS8NGjRvfqZzeljFvRbMTVYckJchUYcXFuWX iIIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1745973708; x=1746578508; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=9cYhstsrbro2bvVLigEn3Nm+qEHX+oY+qpb1akXLGPE=; 
b=GGhI2uOvH218uCapL/YkFnTGR+Zmbj+H3j/sdrez73l6cCqeHLQNwwpWKV88yUZkUh 1+znT+tzjr3np5eJCJJifmsxg0G3hZTe0Oa4kgpOwvUBWtZwdP2oDj1YU+vXiTI0oiVf Q1t6KJyT/tnahbwxszaAOFN3HyC7JeVN1fwCNDVzikEdgxdLRWo7nOM3LXiykwHLfOCa oZoZjAT8oERhLz7sWqABv4d347C4mOuUCDT7wmUb1cuq5RbcYSAt1JH4vOZeU179/o/v RBq/TbUmxPEcHQlJdNsVGO5vM5UWj2vq8DSzQj7QKT2nzS7Pq1lDhwSJQCltRQEp68UC w/pA== X-Forwarded-Encrypted: i=1; AJvYcCWD1YLxD/aRByN6Y1U/grrZ80tNwjV6xxYRWxvYtqIPLIrtk3TyfUyVj/BMwxm4K8EPKQxjYf3v3L1HWSQ=@vger.kernel.org X-Gm-Message-State: AOJu0YyHqgPiecVYf75Uzei7InIxgL0fEAfaP1QnbouSv41bjOMeSi3T HT8bDd8r+lpRRnc3q8DFWC3snHSSND6Wc7w+oPk+UnpHgsoT42GuCF5vOpeZ6N43QC/SUPd+Znq GYNEdKA== X-Google-Smtp-Source: AGHT+IEJsy1CiwSbTZ772fe8tVZ0+2GytQHzJ43kqO3OP7rBH78lCFDM5KKDrcvzIsMB/uaQe+1R5/yWGIpM X-Received: from pfbhw20.prod.google.com ([2002:a05:6a00:8914:b0:736:a70b:53c7]) (user=irogers job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:4b46:b0:736:6151:c6ca with SMTP id d2e1a72fcca58-74038939bc9mr1750119b3a.4.1745973708332; Tue, 29 Apr 2025 17:41:48 -0700 (PDT) Date: Tue, 29 Apr 2025 17:41:27 -0700 In-Reply-To: <20250430004128.474388-1-irogers@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250430004128.474388-1-irogers@google.com> X-Mailer: git-send-email 2.49.0.901.g37484f566f-goog Message-ID: <20250430004128.474388-6-irogers@google.com> Subject: [PATCH v2 5/6] perf test demangle-java: Switch to using dso__demangle_sym From: Ian Rogers To: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Mark Rutland , Alexander Shishkin , Jiri Olsa , Adrian Hunter , Kan Liang , Miguel Ojeda , Alex Gaynor , Boqun Feng , Gary Guo , "=?UTF-8?q?Bj=C3=B6rn=20Roy=20Baron?=" , Benno Lossin , Andreas Hindborg , Alice Ryhl , Trevor Gross , Danilo Krummrich , Nathan Chancellor , Nick Desaulniers , Bill Wendling , Justin Stitt , James Clark , Howard Chu , Jiapeng Chong , Ravi Bangoria , "Masami Hiramatsu 
(Google)" , Stephen Brennan , linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, rust-for-linux@vger.kernel.org, llvm@lists.linux.dev, Daniel Xu , Ariel Ben-Yehuda Cc: Ian Rogers Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

The use of the demangle-java APIs means we don't detect if a different
demangler is used before the Java one for the case that matters to
perf. Remove the return types from the demangled names as
dso__demangle_sym removes those.

Signed-off-by: Ian Rogers
---
 tools/perf/tests/demangle-java-test.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/tools/perf/tests/demangle-java-test.c b/tools/perf/tests/demangle-java-test.c
index 93c94408bdc8..ebaf60cdfa99 100644
--- a/tools/perf/tests/demangle-java-test.c
+++ b/tools/perf/tests/demangle-java-test.c
@@ -3,10 +3,9 @@
 #include
 #include
 #include
-#include "tests.h"
-#include "session.h"
 #include "debug.h"
-#include "demangle-java.h"
+#include "symbol.h"
+#include "tests.h"
 
 static int test__demangle_java(struct test_suite *test __maybe_unused, int subtest __maybe_unused)
 {
@@ -18,19 +17,19 @@ static int test__demangle_java(struct test_suite *test __maybe_unused, int subte
 		const char *mangled, *demangled;
 	} test_cases[] = {
 		{ "Ljava/lang/StringLatin1;equals([B[B)Z",
-		  "boolean java.lang.StringLatin1.equals(byte[], byte[])" },
+		  "java.lang.StringLatin1.equals(byte[], byte[])" },
 		{ "Ljava/util/zip/ZipUtils;CENSIZ([BI)J",
-		  "long java.util.zip.ZipUtils.CENSIZ(byte[], int)" },
+		  "java.util.zip.ZipUtils.CENSIZ(byte[], int)" },
 		{ "Ljava/util/regex/Pattern$BmpCharProperty;match(Ljava/util/regex/Matcher;ILjava/lang/CharSequence;)Z",
-		  "boolean java.util.regex.Pattern$BmpCharProperty.match(java.util.regex.Matcher, int, java.lang.CharSequence)" },
+		  "java.util.regex.Pattern$BmpCharProperty.match(java.util.regex.Matcher, int, java.lang.CharSequence)" },
 		{ "Ljava/lang/AbstractStringBuilder;appendChars(Ljava/lang/String;II)V",
-		  "void java.lang.AbstractStringBuilder.appendChars(java.lang.String, int, int)" },
+		  "java.lang.AbstractStringBuilder.appendChars(java.lang.String, int, int)" },
 		{ "Ljava/lang/Object;()V",
-		  "void java.lang.Object()" },
+		  "java.lang.Object()" },
 	};
 
 	for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
-		buf = java_demangle_sym(test_cases[i].mangled, 0);
+		buf = dso__demangle_sym(/*dso=*/NULL, /*kmodule=*/0, test_cases[i].mangled);
 		if (strcmp(buf, test_cases[i].demangled)) {
 			pr_debug("FAILED: %s: %s != %s\n", test_cases[i].mangled,
 				 buf, test_cases[i].demangled);
-- 
2.49.0.901.g37484f566f-goog

From nobody Fri Sep 5 20:21:55 2025 Received: from mail-pf1-f202.google.com (mail-pf1-f202.google.com [209.85.210.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5686EBA27 for ; Wed, 30 Apr 2025 00:41:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745973717; cv=none; b=m0Nq4pmi4yrdJ60eEsfgtip/a38I6wqq0pIWRgsUUidN9dsRHFFcmTMqc4eCuy/35euuE/1RmXo/IGqqTlfN2OnRKT/EhFGB32mlluxK8STmdPkdEFBfAokI3JSxgMR/GsBpbcPf7it/U3fEUxRuJDggl0uXTksLN+obENHKwwE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745973717; c=relaxed/simple; bh=pLvxUdxndsr2wYkJQYg49ecez/cviTwRj3eN4nJwgxo=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=ts9Ts26dCuC641vPkebi5KUWYN4cswrZL/EzQ5PSRF2GijrNDQb3L4MneT0PjeoAHgMPKu3chRNW5ioEyD4a+71HzTIBDKSG7urJheNNQya5Gfue37cljS/PalbGq1RRy/j58Mgya7DVLXlmg/hnOeiauKQ4IQdePjXMkfU4xPQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--irogers.bounces.google.com; dkim=pass (2048-bit key)
header.d=google.com header.i=@google.com header.b=apNF28FV; arc=none smtp.client-ip=209.85.210.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--irogers.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="apNF28FV" Received: by mail-pf1-f202.google.com with SMTP id d2e1a72fcca58-7369c5ed395so7516046b3a.0 for ; Tue, 29 Apr 2025 17:41:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1745973715; x=1746578515; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=VWjkucJuMaosxMcy1Z3rSUNsLtdDY3ZZywavpz7InSY=; b=apNF28FVbWdzH+0p7oT9pqhgVfytqVk4HeXo+6Pnia3EmomAcSMSbJQxSrx5jaqdv+ +9rgBfJ7WR8ZIb1AQIrdBZpSvJdoOtkLqFDnVB5bV6hzfAwZgIKVv8/2pSJiSiZ7O8ro aOOxHiF6fqhWoRE/7cGosma/CCpMzJuGDnNx7kYgLDxHMdteOMvJZP/U6mhPM6e9g58P 75/BKgxt+1kcfuxqKq+z9uOzONGN5f/ZMidHbVOGMhTnVnGPHEjJSLlRN9DiKcsZx84M WVra43nR9I9c4ShK8cQRuSXo708Z0UXlJWRWmYloVSAdwTMPf/tSHOLldBKRQrWtDlit lltQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1745973715; x=1746578515; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=VWjkucJuMaosxMcy1Z3rSUNsLtdDY3ZZywavpz7InSY=; b=PFbA8rPwxRG3cVoSnqq+sl47kqFSaUhK4Nfr0A6XI+4ErhS2CTBpZurKOMMRkWVWqt Li/BdWUbR7FWn5LmV+PDgDivQ0kbn5TeAdJD5ONsVdl9x+kL0Chi1DaFNoDQJMUPVd+T IV1uUykG6LpSRE9131pQFJSNYFZ2dF+NEd+hhAp16P7lrFxxNIXb7XTmQqVIRbOmC0Zs PeAvM4BffGcBtBGwgLWb9400fbJH5nJEnUDoRxo4ICyOP0Q3oIJOyFERTcIZ97fL457T 5GHtaEjxt3F30EwLqcLWB07IEO8dvwJEc5P2i464Zxp1Jk/4YErDuUMlG33SOJBLf6pA meFA== X-Forwarded-Encrypted: i=1; 
AJvYcCUv1a+S6g96nHJL+LSYpf8r6vEX9hyWo37UzCgLLynlLs8RCDLDIqYnq/s+Mjru4517bGtB8NVhH1lvg40=@vger.kernel.org X-Gm-Message-State: AOJu0YwnLKR7qYRxaDwTAvIE7fNkeJkF5TAnGr5xfVo5aEVxLgl6kXRb vEn63xEO4JqhINhednmjJBx2zjbGVuHPb3xD76BnMOUPgT3+2ljZBdlUp/OvmCNRMkDWj/KKuqL tczOLsg== X-Google-Smtp-Source: AGHT+IGBKZmj0xcnx3IB64TcbS0pb/r17lmOgH8lxAqqhmr49CDFUH1EBNpi6iyWZc1C2ZsKErjB7VbqxAQl X-Received: from pfbfn23.prod.google.com ([2002:a05:6a00:2fd7:b0:740:555:f7af]) (user=irogers job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:3924:b0:736:34a2:8a18 with SMTP id d2e1a72fcca58-7403a836556mr988402b3a.24.1745973715575; Tue, 29 Apr 2025 17:41:55 -0700 (PDT) Date: Tue, 29 Apr 2025 17:41:28 -0700 In-Reply-To: <20250430004128.474388-1-irogers@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250430004128.474388-1-irogers@google.com> X-Mailer: git-send-email 2.49.0.901.g37484f566f-goog Message-ID: <20250430004128.474388-7-irogers@google.com> Subject: [PATCH v2 6/6] perf test demangle-ocaml: Switch to using dso__demangle_sym From: Ian Rogers To: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Mark Rutland , Alexander Shishkin , Jiri Olsa , Adrian Hunter , Kan Liang , Miguel Ojeda , Alex Gaynor , Boqun Feng , Gary Guo , "=?UTF-8?q?Bj=C3=B6rn=20Roy=20Baron?=" , Benno Lossin , Andreas Hindborg , Alice Ryhl , Trevor Gross , Danilo Krummrich , Nathan Chancellor , Nick Desaulniers , Bill Wendling , Justin Stitt , James Clark , Howard Chu , Jiapeng Chong , Ravi Bangoria , "Masami Hiramatsu (Google)" , Stephen Brennan , linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, rust-for-linux@vger.kernel.org, llvm@lists.linux.dev, Daniel Xu , Ariel Ben-Yehuda Cc: Ian Rogers Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The use of the demangle-ocaml APIs means we don't detect if a different demangler is used before the 
OCaml one for the case that matters to perf.

Signed-off-by: Ian Rogers
---
 tools/perf/tests/demangle-ocaml-test.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tools/perf/tests/demangle-ocaml-test.c b/tools/perf/tests/demangle-ocaml-test.c
index 90a4285e2ad5..612c788b7e0d 100644
--- a/tools/perf/tests/demangle-ocaml-test.c
+++ b/tools/perf/tests/demangle-ocaml-test.c
@@ -2,10 +2,9 @@
 #include
 #include
 #include
-#include "tests.h"
-#include "session.h"
 #include "debug.h"
-#include "demangle-ocaml.h"
+#include "symbol.h"
+#include "tests.h"
 
 static int test__demangle_ocaml(struct test_suite *test __maybe_unused, int subtest __maybe_unused)
 {
@@ -27,7 +26,7 @@ static int test__demangle_ocaml(struct test_suite *test __maybe_unused, int subt
 	};
 
 	for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
-		buf = ocaml_demangle_sym(test_cases[i].mangled);
+		buf = dso__demangle_sym(/*dso=*/NULL, /*kmodule=*/0, test_cases[i].mangled);
 		if ((buf == NULL && test_cases[i].demangled != NULL) ||
 		    (buf != NULL && test_cases[i].demangled == NULL) ||
 		    (buf != NULL && strcmp(buf, test_cases[i].demangled))) {
-- 
2.49.0.901.g37484f566f-goog
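
Illustration only, not part of the series: the rationale in the two commit messages above (a test that calls one language's demangler directly cannot notice when a different demangler runs first) can be sketched with a toy first-match chain. Every name below (toy_demangler, toy_greedy_demangle, toy_java_demangle, toy_demangle_chain, good_order, bad_order) is hypothetical, and the code assumes nothing about how dso__demangle_sym is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are NOT perf's demanglers. */
typedef const char *(*toy_demangler)(const char *mangled);

/* Toy "greedy" demangler: claims any symbol that starts with 'L'. */
static const char *toy_greedy_demangle(const char *mangled)
{
	return mangled[0] == 'L' ? "greedy" : NULL;
}

/* Toy "Java" demangler: claims symbols shaped like "L<class>;<method>". */
static const char *toy_java_demangle(const char *mangled)
{
	return (mangled[0] == 'L' && strchr(mangled, ';')) ? "java" : NULL;
}

/* Try each demangler in order; the first non-NULL result wins. */
static const char *toy_demangle_chain(const toy_demangler *chain, size_t n,
				      const char *mangled)
{
	for (size_t i = 0; i < n; i++) {
		const char *out = chain[i](mangled);

		if (out)
			return out;
	}
	return NULL;
}

/* Two orderings of the same demanglers. */
static const toy_demangler good_order[] = { toy_java_demangle, toy_greedy_demangle };
static const toy_demangler bad_order[] = { toy_greedy_demangle, toy_java_demangle };
```

In this sketch, calling toy_java_demangle() directly gives the same answer whichever ordering is in use, so a test written against it passes either way. Only a test that goes through toy_demangle_chain() sees that bad_order hands the symbol to the wrong demangler, which is the kind of regression the switch to the common entry point is meant to expose.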