Date: Sat, 23 May 2026 16:27:21 +0000
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email
 2.54.0.746.g67dd491aae-goog
Message-ID: <20260523162722.2718940-1-cmllamas@google.com>
Subject: [PATCH v4] libbpf: fix UAF in strset__add_str()
From: Carlos Llamas
To: mykyta.yatsenko5@gmail.com, andrii.nakryiko@gmail.com,
 Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
 Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song,
 Jiri Olsa
Cc: kernel-team@android.com, linux-kernel@vger.kernel.org, Carlos Llamas,
 Mykyta Yatsenko, "open list:BPF [LIBRARY] (libbpf)"
Content-Type: text/plain; charset="utf-8"

strset_add_str_mem() might reallocate the strset data buffer in order to
accommodate the provided string 's'. However, if 's' points to a string
already present in the buffer, it becomes dangling after the realloc.
This leads to a use-after-free when attempting to memcpy() the string
into the new buffer.

One scenario that triggers this problematic path is when resolve_btfids
attempts to patch kfunc prototypes using existing BTF parameter names:

| resolve_btfids: function bpf_list_push_back_impl already exists in BTF
| Segmentation fault (core dumped)

Compiling resolve_btfids with fsanitize=address generates a detailed
report of the UAF:

| =================================================================
| ERROR: AddressSanitizer: heap-use-after-free on address 0x7f4c4a500bd4
| ==1507892==ERROR: AddressSanitizer: heap-use-after-free on address 0x7f4c4a500bd4 at pc 0x55d25155a2a8 bp 0x7ffcef879060 sp 0x7ffcef878818
| READ of size 5 at 0x7f4c4a500bd4 thread T0
|     #0 0x55d25155a2a7 in memcpy (tools/bpf/resolve_btfids/resolve_btfids+0xcf2a7)
|     #1 0x55d2515d708e in strset__add_str tools/lib/bpf/strset.c:162:2
|     #2 0x55d2515c730b in btf__add_str tools/lib/bpf/btf.c:2109:8
|     #3 0x55d2515c9020 in btf__add_func_param tools/lib/bpf/btf.c:3108:14
|     #4 0x55d25159f0b5 in process_kfunc_with_implicit_args tools/bpf/resolve_btfids/main.c:1196:9
|     #5 0x55d25159e004 in btf2btf tools/bpf/resolve_btfids/main.c:1229:9
|     #6 0x55d25159cee7 in main tools/bpf/resolve_btfids/main.c:1535:6
|     #7 0x7f4c78e29f76 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
|     #8 0x7f4c78e2a026 in __libc_start_main csu/../csu/libc-start.c:360:3
|     #9 0x55d2514bb860 in _start (tools/bpf/resolve_btfids/resolve_btfids+0x30860)
|
| 0x7f4c4a500bd4 is located 13268 bytes inside of 2829000-byte region [0x7f4c4a4fd800,0x7f4c4a7b02c8)
| freed by thread T0 here:
|     #0 0x55d25155b700 in realloc (tools/bpf/resolve_btfids/resolve_btfids+0xd0700)
|     #1 0x55d2515c426c in libbpf_reallocarray tools/lib/bpf/./libbpf_internal.h:220:9
|     #2 0x55d2515c426c in libbpf_add_mem tools/lib/bpf/btf.c:224:13
|
| previously allocated by thread T0 here:
|     #0 0x55d25155b2e3 in malloc (tools/bpf/resolve_btfids/resolve_btfids+0xd02e3)
|     #1 0x55d2515d6e7d in strset__new tools/lib/bpf/strset.c:58:20

While resolve_btfids could be refactored to avoid this call path, let's
instead fix this issue at the source in strset__add_str() and avoid
similar scenarios.

Let's check if set->strs_data was reallocated and whether 's' points to
an internal string within the old strset buffer. In such case, 's' is
reconstructed to point to the new buffer.

While already here, also fix strset__find_str() which suffers from the
same problem by factoring out the common operations into a new helper
function strset_str_append().

Fixes: 90d76d3ececc ("libbpf: Extract internal set-of-strings datastructure APIs")
Suggested-by: Andrii Nakryiko
Suggested-by: Mykyta Yatsenko
Signed-off-by: Carlos Llamas
---
v4: Store pointers as integers in advance before realloc to prevent UB.
    Access set->strs_data directly, not through external API.

v3: Switch to 's' reconstruction approach suggested by Andrii.
    Adjusted names and commit log accordingly.
    https://lore.kernel.org/all/20260518050550.2600101-1-cmllamas@google.com/

v2: Implemented the fix in strset__offset() helper as suggested by Mykyta.
    Added support to handle "substrings" of existing ones.
    Used 90d76d3ececc as Fixes tag as suggested by Sashiko.
    https://lore.kernel.org/all/20260515044759.2863546-1-cmllamas@google.com/

v1:
    https://lore.kernel.org/all/20260513232055.1681859-1-cmllamas@google.com/

 tools/lib/bpf/strset.c | 59 +++++++++++++++++++++++++++---------------
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/tools/lib/bpf/strset.c b/tools/lib/bpf/strset.c
index 2464bcbd04e0..b9faca828f09 100644
--- a/tools/lib/bpf/strset.c
+++ b/tools/lib/bpf/strset.c
@@ -107,6 +107,38 @@ static void *strset_add_str_mem(struct strset *set, size_t add_sz)
 			      set->strs_data_len, set->strs_data_max_len, add_sz);
 }
 
+static long strset_str_append(struct strset *set, const char *s)
+{
+	uintptr_t old_data = (uintptr_t)set->strs_data;
+	uintptr_t old_s = (uintptr_t)s;
+	long len = strlen(s) + 1;
+	void *p;
+
+	/* Hashmap keys are always offsets within set->strs_data, so to even
+	 * look up some string from the "outside", we need to first append it
+	 * at the end, so that it can be addressed with an offset. Luckily,
+	 * until set->strs_data_len is incremented, that string is just a piece
+	 * of garbage for the rest of the code, so no harm, no foul. On the
+	 * other hand, if the string is unique, it's already appended and
+	 * ready to be used, only a simple set->strs_data_len increment away.
+	 */
+	p = strset_add_str_mem(set, len);
+	if (!p)
+		return -ENOMEM;
+
+	/* The set->strs_data might have reallocated and if 's' pointed
+	 * to an internal string within the old buffer, then it became
+	 * dangling and needs to be reconstructed before the copy.
+	 */
+	if (old_data && old_data != (uintptr_t)set->strs_data &&
+	    old_s >= old_data && old_s < old_data + set->strs_data_len)
+		s = set->strs_data + (old_s - old_data);
+
+	memcpy(p, s, len);
+
+	return len;
+}
+
 /* Find string offset that corresponds to a given string *s*.
  * Returns:
  *   - >0 offset into string data, if string is found;
@@ -116,16 +148,12 @@ static void *strset_add_str_mem(struct strset *set, size_t add_sz)
 int strset__find_str(struct strset *set, const char *s)
 {
 	long old_off, new_off, len;
-	void *p;
 
-	/* see strset__add_str() for why we do this */
-	len = strlen(s) + 1;
-	p = strset_add_str_mem(set, len);
-	if (!p)
-		return -ENOMEM;
+	len = strset_str_append(set, s);
+	if (len < 0)
+		return len;
 
 	new_off = set->strs_data_len;
-	memcpy(p, s, len);
 
 	if (hashmap__find(set->strs_hash, new_off, &old_off))
 		return old_off;
@@ -142,24 +170,13 @@ int strset__find_str(struct strset *set, const char *s)
 int strset__add_str(struct strset *set, const char *s)
 {
 	long old_off, new_off, len;
-	void *p;
 	int err;
 
-	/* Hashmap keys are always offsets within set->strs_data, so to even
-	 * look up some string from the "outside", we need to first append it
-	 * at the end, so that it can be addressed with an offset. Luckily,
-	 * until set->strs_data_len is incremented, that string is just a piece
-	 * of garbage for the rest of the code, so no harm, no foul. On the
-	 * other hand, if the string is unique, it's already appended and
-	 * ready to be used, only a simple set->strs_data_len increment away.
-	 */
-	len = strlen(s) + 1;
-	p = strset_add_str_mem(set, len);
-	if (!p)
-		return -ENOMEM;
+	len = strset_str_append(set, s);
+	if (len < 0)
+		return len;
 
 	new_off = set->strs_data_len;
-	memcpy(p, s, len);
 
 	/* Now attempt to add the string, but only if the string with the same
 	 * contents doesn't exist already (HASHMAP_ADD strategy). If such
-- 
2.54.0.746.g67dd491aae-goog
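
For anyone who wants to try the 's' reconstruction idea outside of libbpf,
here is a minimal standalone sketch. It is not the libbpf code: struct buf
and buf_append() are hypothetical names made up for illustration, and the
doubling growth policy is an assumption. It follows the same steps as the
commit message: record the buffer address and 's' as integers before the
buffer can move, grow the buffer, then rebuild 's' from its old offset if
it pointed inside the old buffer. Unlike the patch, the sketch does not
check whether the buffer actually moved, since rebuilding 's' is a no-op
when it did not.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a growable string buffer. */
struct buf {
	char *data;	/* heap storage, may move on realloc() */
	size_t len;	/* bytes currently in use */
	size_t cap;	/* bytes allocated */
};

/* Append NUL-terminated 's' and return its offset, or -1 if out of memory.
 * 's' is allowed to point at a string stored inside b->data itself.
 */
static long buf_append(struct buf *b, const char *s)
{
	uintptr_t old_data, old_s;
	size_t n;
	long off;

	assert(b && s);

	/* Save both addresses as integers before any realloc() happens. */
	old_data = (uintptr_t)b->data;
	old_s = (uintptr_t)s;
	n = strlen(s) + 1;
	off = (long)b->len;

	if (b->len + n > b->cap) {
		size_t new_cap = (b->len + n) * 2;
		char *p = realloc(b->data, new_cap);

		if (!p)
			return -1;
		b->data = p;
		b->cap = new_cap;
	}

	/* If 's' pointed into the old buffer, rebuild it from its offset. */
	if (old_data && old_s >= old_data && old_s < old_data + b->len)
		s = b->data + (old_s - old_data);

	/* Source ends at or before b->len, so the regions do not overlap. */
	memcpy(b->data + b->len, s, n);
	b->len += n;

	return off;
}
```

Calling buf_append(&b, b.data) repeatedly on a buffer that already holds
one string exercises the internal-pointer case. Without the reconstruction
step, the memcpy() would read from freed memory whenever realloc() moves
the buffer.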