From: Ian Rogers
Date: Mon, 10 Feb 2025 11:12:31 -0800
Message-Id: <20250210191231.156294-1-irogers@google.com>
Subject: [PATCH v3] perf cpumap: Reduce cpu size from int to int16_t
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
    Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
    James Clark, Tim Chen, Yicong Yang, Ravi Bangoria,
    linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
    Kyle Meyer
Cc: Ian Rogers

Fewer than 32k logical CPUs are currently supported by perf. A cpumap
is indexed by an integer (see perf_cpu_map__cpu) yielding a perf_cpu
that wraps a 4-byte int for the logical CPU - the wrapping is done
deliberately to avoid confusing a logical CPU with an index into a
cpumap. Using a 4-byte int within the perf_cpu is larger than required
so this patch reduces it to the 2-byte int16_t. For a cpumap
containing 16 entries this will reduce the array size from 64 to 32
bytes. For very large servers with lots of logical CPUs the size
savings will be greater.

Signed-off-by: Ian Rogers
Reviewed-by: James Clark
---
v3. Add bounds checks as suggested by Leo Yan.
v2. Rebase and tweak commit message, add Reviewed-by: Tim Chen.
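
For reference, the size arithmetic in the commit message (16 entries,
64 bytes before, 32 bytes after) can be checked with a small standalone
sketch. The struct and function names below are hypothetical stand-ins
and are not code from this patch; the 64-byte figure assumes an ABI
where int is 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct perf_cpu before and after the change. */
struct perf_cpu_wide   { int cpu; };      /* before: int, 4 bytes on common ABIs */
struct perf_cpu_narrow { int16_t cpu; };  /* after: int16_t, 2 bytes */

/* Bytes used by an array of nr entries with the int-based wrapper. */
static inline size_t wide_array_bytes(size_t nr)
{
	return nr * sizeof(struct perf_cpu_wide);
}

/* Bytes used by an array of nr entries with the int16_t-based wrapper. */
static inline size_t narrow_array_bytes(size_t nr)
{
	return nr * sizeof(struct perf_cpu_narrow);
}
```

Calling wide_array_bytes(16) and narrow_array_bytes(16) gives the two
figures quoted above on such an ABI; the saving scales linearly with
the number of entries.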
---
 tools/lib/perf/cpumap.c              |  8 ++--
 tools/lib/perf/include/perf/cpumap.h |  3 +-
 tools/perf/util/cpumap.c             | 68 +++++++++++++++++++---------
 tools/perf/util/env.c                |  2 +-
 4 files changed, 54 insertions(+), 27 deletions(-)

diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index fcc47214062a..4454a5987570 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -185,7 +185,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
 	while (isdigit(*cpu_list)) {
 		p = NULL;
 		start_cpu = strtoul(cpu_list, &p, 0);
-		if (start_cpu >= INT_MAX
+		if (start_cpu >= INT16_MAX
 		    || (*p != '\0' && *p != ',' && *p != '-' && *p != '\n'))
 			goto invalid;

@@ -194,7 +194,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
 			p = NULL;
 			end_cpu = strtoul(cpu_list, &p, 0);

-			if (end_cpu >= INT_MAX || (*p != '\0' && *p != ',' && *p != '\n'))
+			if (end_cpu >= INT16_MAX || (*p != '\0' && *p != ',' && *p != '\n'))
 				goto invalid;

 			if (end_cpu < start_cpu)
@@ -209,7 +209,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
 		for (; start_cpu <= end_cpu; start_cpu++) {
 			/* check for duplicates */
 			for (i = 0; i < nr_cpus; i++)
-				if (tmp_cpus[i].cpu == (int)start_cpu)
+				if (tmp_cpus[i].cpu == (int16_t)start_cpu)
 					goto invalid;

 			if (nr_cpus == max_entries) {
@@ -219,7 +219,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
 					goto invalid;
 				tmp_cpus = tmp;
 			}
-			tmp_cpus[nr_cpus++].cpu = (int)start_cpu;
+			tmp_cpus[nr_cpus++].cpu = (int16_t)start_cpu;
 		}
 		if (*p)
 			++p;
diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
index 188a667babc6..8c1ab0f9194e 100644
--- a/tools/lib/perf/include/perf/cpumap.h
+++ b/tools/lib/perf/include/perf/cpumap.h
@@ -4,10 +4,11 @@

 #include
 #include
+#include

 /** A wrapper around a CPU to avoid confusion with the perf_cpu_map's map's indices. */
 struct perf_cpu {
-	int cpu;
+	int16_t cpu;
 };

 struct perf_cache {
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 5c329ad614e9..9bc5e0370234 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -67,19 +67,23 @@ static struct perf_cpu_map *cpu_map__from_entries(const struct perf_record_cpu_m
 	struct perf_cpu_map *map;

 	map = perf_cpu_map__empty_new(data->cpus_data.nr);
-	if (map) {
-		unsigned i;
-
-		for (i = 0; i < data->cpus_data.nr; i++) {
-			/*
-			 * Special treatment for -1, which is not real cpu number,
-			 * and we need to use (int) -1 to initialize map[i],
-			 * otherwise it would become 65535.
-			 */
-			if (data->cpus_data.cpu[i] == (u16) -1)
-				RC_CHK_ACCESS(map)->map[i].cpu = -1;
-			else
-				RC_CHK_ACCESS(map)->map[i].cpu = (int) data->cpus_data.cpu[i];
+	if (!map)
+		return NULL;
+
+	for (unsigned int i = 0; i < data->cpus_data.nr; i++) {
+		/*
+		 * Special treatment for -1, which is not real cpu number,
+		 * and we need to use (int) -1 to initialize map[i],
+		 * otherwise it would become 65535.
+		 */
+		if (data->cpus_data.cpu[i] == (u16) -1) {
+			RC_CHK_ACCESS(map)->map[i].cpu = -1;
+		} else if (data->cpus_data.cpu[i] < INT16_MAX) {
+			RC_CHK_ACCESS(map)->map[i].cpu = (int16_t) data->cpus_data.cpu[i];
+		} else {
+			pr_err("Invalid cpumap entry %u\n", data->cpus_data.cpu[i]);
+			perf_cpu_map__put(map);
+			return NULL;
 		}
 	}

@@ -106,8 +110,15 @@ static struct perf_cpu_map *cpu_map__from_mask(const struct perf_record_cpu_map_
 		int cpu;

 		perf_record_cpu_map_data__read_one_mask(data, i, local_copy);
-		for_each_set_bit(cpu, local_copy, 64)
-			RC_CHK_ACCESS(map)->map[j++].cpu = cpu + cpus_per_i;
+		for_each_set_bit(cpu, local_copy, 64) {
+			if (cpu + cpus_per_i < INT16_MAX) {
+				RC_CHK_ACCESS(map)->map[j++].cpu = cpu + cpus_per_i;
+			} else {
+				pr_err("Invalid cpumap entry %d\n", cpu + cpus_per_i);
+				perf_cpu_map__put(map);
+				return NULL;
+			}
+		}
 	}
 	return map;

@@ -127,8 +138,15 @@ static struct perf_cpu_map *cpu_map__from_range(const struct perf_record_cpu_map
 		RC_CHK_ACCESS(map)->map[i++].cpu = -1;

 	for (int cpu = data->range_cpu_data.start_cpu; cpu <= data->range_cpu_data.end_cpu;
-	     i++, cpu++)
-		RC_CHK_ACCESS(map)->map[i].cpu = cpu;
+	     i++, cpu++) {
+		if (cpu < INT16_MAX) {
+			RC_CHK_ACCESS(map)->map[i].cpu = cpu;
+		} else {
+			pr_err("Invalid cpumap entry %d\n", cpu);
+			perf_cpu_map__put(map);
+			return NULL;
+		}
+	}

 	return map;
 }
@@ -427,7 +445,7 @@ static void set_max_cpu_num(void)
 {
 	const char *mnt;
 	char path[PATH_MAX];
-	int ret = -1;
+	int max, ret = -1;

 	/* set up default */
 	max_cpu_num.cpu = 4096;
@@ -444,10 +462,12 @@ static void set_max_cpu_num(void)
 		goto out;
 	}

-	ret = get_max_num(path, &max_cpu_num.cpu);
+	ret = get_max_num(path, &max);
 	if (ret)
 		goto out;

+	max_cpu_num.cpu = max;
+
 	/* get the highest present cpu number for a sparse allocation */
 	ret = snprintf(path, PATH_MAX, "%s/devices/system/cpu/present", mnt);
 	if (ret >= PATH_MAX) {
@@ -455,8 +475,14 @@ static void set_max_cpu_num(void)
 		goto out;
 	}

-	ret = get_max_num(path, &max_present_cpu_num.cpu);
+	ret = get_max_num(path, &max);

+	if (!ret && max > INT16_MAX) {
+		pr_err("Read out of bounds max cpus of %d\n", max);
+		ret = -1;
+	}
+	if (!ret)
+		max_present_cpu_num.cpu = (int16_t)max;
 out:
 	if (ret)
 		pr_err("Failed to read max cpus, using default of %d\n", max_cpu_num.cpu);
@@ -606,7 +632,7 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
 #define COMMA first ? "" : ","

 	for (i = 0; i < perf_cpu_map__nr(map) + 1; i++) {
-		struct perf_cpu cpu = { .cpu = INT_MAX };
+		struct perf_cpu cpu = { .cpu = INT16_MAX };
 		bool last = i == perf_cpu_map__nr(map);

 		if (!last)
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index cae4f6d63318..36411749e007 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -543,7 +543,7 @@ int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu)

 		for (i = 0; i < env->nr_numa_nodes; i++) {
 			nn = &env->numa_nodes[i];
-			nr = max(nr, perf_cpu_map__max(nn->map).cpu);
+			nr = max(nr, (int)perf_cpu_map__max(nn->map).cpu);
 		}

 		nr++;
--
2.48.1.502.g6dc24dfdaf-goog
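
The three-way branch added to cpu_map__from_entries() can be looked at
in isolation. The sketch below is only an illustration: narrow_cpu() is
a hypothetical helper that does not exist in perf, and it uses uint16_t
in place of the kernel's u16. It keeps the same strict "< INT16_MAX"
bound as the patch, so 32767 itself is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: narrow a 16-bit unsigned
 * CPU value to int16_t. Returns false when the value must be rejected.
 */
static inline bool narrow_cpu(uint16_t raw, int16_t *out)
{
	if (raw == (uint16_t)-1) {
		/* 0xffff stands for -1, which is not a real CPU number. */
		*out = -1;
		return true;
	}
	if (raw < INT16_MAX) {
		/* In range: the cast cannot change the value. */
		*out = (int16_t)raw;
		return true;
	}
	/* Out of range: the caller would log an error and drop the map. */
	return false;
}
```

Without the first branch, 0xffff would fail the range check instead of
mapping to -1; without the second, values above 32767 would wrap to
negative CPU numbers after the cast.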