From: david.laight.linux@gmail.com
To: Willy Tarreau, Thomas Weißschuh, linux-kernel@vger.kernel.org, Cheng Li
Cc: David Laight
Subject: [PATCH next] tools/nolibc: Optimise and common up number to ascii functions
Date: Tue, 3 Feb 2026 15:13:15 +0000
Message-Id: <20260203151316.24506-1-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
X-Mailing-List: linux-kernel@vger.kernel.org

From: David Laight

Implement u[64]to[ah]_r() using a common function that uses multiply by
reciprocal to generate the least significant digit first and then
reverses the string.

On 32bit this is five multiplies (with 64bit product) for each output
digit. I think the old utoa_r() always did 36 multiplies and a lot of
subtracts - so this is likely faster even for 32bit values.
Definitely better for 64bit values (especially small ones).

Clearly shifts are faster for base 16, but reversing the output buffer
makes a big difference.

Sharing the code reduces the footprint (unless gcc decides to constant
fold the functions).
Definitely helps vfprintf() where the constants get loaded and a single
call is done.
Also makes it cheap to add octal support to vfprintf for completeness.

Signed-off-by: David Laight
Acked-by: Willy Tarreau
---
 tools/include/nolibc/stdlib.h | 145 ++++++++++++++--------------------
 1 file changed, 58 insertions(+), 87 deletions(-)

diff --git a/tools/include/nolibc/stdlib.h b/tools/include/nolibc/stdlib.h
index f184e108ed0a..8a2e86f60ae9 100644
--- a/tools/include/nolibc/stdlib.h
+++ b/tools/include/nolibc/stdlib.h
@@ -188,36 +188,67 @@ void *realloc(void *old_ptr, size_t new_size)
 	return ret;
 }
 
-/* Converts the unsigned long integer <in> to its hex representation into
+/* Converts the unsigned 64bit integer <in> to base <base> ascii into
  * buffer <buffer>, which must be long enough to store the number and the
- * trailing zero (17 bytes for "ffffffffffffffff" or 9 for "ffffffff"). The
- * buffer is filled from the first byte, and the number of characters emitted
- * (not counting the trailing zero) is returned. The function is constructed
- * in a way to optimize the code size and avoid any divide that could add a
- * dependency on large external functions.
+ * trailing zero. The buffer is filled from the first byte, and the number
+ * of characters emitted (not counting the trailing zero) is returned.
+ * The function uses 'multiply by reciprocal' for the divisions and
+ * requires the caller pass the correct compile-time constant.
  */
-static __attribute__((unused))
-int utoh_r(unsigned long in, char *buffer)
+#define __U64TOA_RECIP(base) ((base) & 1 ? ~0ull / (base) : (1ull << 63) / ((base) / 2))
+static __attribute__((unused, noinline))
+int __u64toa_base(uint64_t in, char *buffer, unsigned int base, uint64_t recip)
 {
-	signed char pos = (~0UL > 0xfffffffful) ? 60 : 28;
-	int digits = 0;
-	int dig;
+	unsigned int digits = 0;
+	unsigned int dig;
+	uint64_t q;
+	char *p;
 
+	/* Generate least significant digit first */
 	do {
-		dig = in >> pos;
-		in -= (uint64_t)dig << pos;
-		pos -= 4;
-		if (dig || digits || pos < 0) {
-			if (dig > 9)
-				dig += 'a' - '0' - 10;
-			buffer[digits++] = '0' + dig;
+#if defined(__SIZEOF_INT128__) && !defined(__mips__)
+		q = ((unsigned __int128)in * recip) >> 64;
+#else
+		uint64_t p = (uint32_t)in * (recip >> 32);
+		q = (in >> 32) * (recip >> 32) + (p >> 32);
+		p = (uint32_t)p + (in >> 32) * (uint32_t)recip;
+		q += p >> 32;
+#endif
+		dig = in - q * base;
+		/* Correct for any rounding errors (eg from low*low multiply) */
+		while (dig >= base) {
+			dig -= base;
+			q++;
 		}
-	} while (pos >= 0);
+		if (dig > 9)
+			dig += 'a' - '0' - 10;
+		buffer[digits++] = '0' + dig;
+	} while ((in = q));
 
 	buffer[digits] = 0;
+
+	/* Reverse the digits into the correct order */
+	for (p = buffer + digits - 1; p > buffer; buffer++, p--) {
+		dig = *buffer;
+		*buffer = *p;
+		*p = dig;
+	}
+
 	return digits;
 }
 
+/* Converts the unsigned long integer <in> to its hex representation into
+ * buffer <buffer>, which must be long enough to store the number and the
+ * trailing zero (17 bytes for "ffffffffffffffff" or 9 for "ffffffff"). The
+ * buffer is filled from the first byte, and the number of characters emitted
+ * (not counting the trailing zero) is returned.
+ */
+static __inline__ __attribute__((unused))
+int utoh_r(unsigned long in, char *buffer)
+{
+	return __u64toa_base(in, buffer, 16, __U64TOA_RECIP(16));
+}
+
 /* converts unsigned long <in> to an hex string using the static itoa_buffer
  * and returns the pointer to that string.
  */
@@ -233,30 +264,11 @@ char *utoh(unsigned long in)
  * trailing zero (21 bytes for 18446744073709551615 in 64-bit, 11 for
  * 4294967295 in 32-bit). The buffer is filled from the first byte, and the
  * number of characters emitted (not counting the trailing zero) is returned.
- * The function is constructed in a way to optimize the code size and avoid
- * any divide that could add a dependency on large external functions.
  */
-static __attribute__((unused))
+static __inline__ __attribute__((unused))
 int utoa_r(unsigned long in, char *buffer)
 {
-	unsigned long lim;
-	int digits = 0;
-	int pos = (~0UL > 0xfffffffful) ? 19 : 9;
-	int dig;
-
-	do {
-		for (dig = 0, lim = 1; dig < pos; dig++)
-			lim *= 10;
-
-		if (digits || in >= lim || !pos) {
-			for (dig = 0; in >= lim; dig++)
-				in -= lim;
-			buffer[digits++] = '0' + dig;
-		}
-	} while (pos--);
-
-	buffer[digits] = 0;
-	return digits;
+	return __u64toa_base(in, buffer, 10, __U64TOA_RECIP(10));
 }
 
 /* Converts the signed long integer <in> to its string representation into
@@ -324,34 +336,12 @@ char *utoa(unsigned long in)
  * buffer <buffer>, which must be long enough to store the number and the
  * trailing zero (17 bytes for "ffffffffffffffff"). The buffer is filled from
  * the first byte, and the number of characters emitted (not counting the
- * trailing zero) is returned. The function is constructed in a way to optimize
- * the code size and avoid any divide that could add a dependency on large
- * external functions.
+ * trailing zero) is returned.
  */
-static __attribute__((unused))
+static __inline__ __attribute__((unused))
 int u64toh_r(uint64_t in, char *buffer)
 {
-	signed char pos = 60;
-	int digits = 0;
-	int dig;
-
-	do {
-		if (sizeof(long) >= 8) {
-			dig = (in >> pos) & 0xF;
-		} else {
-			/* 32-bit platforms: avoid a 64-bit shift */
-			uint32_t d = (pos >= 32) ? (in >> 32) : in;
-			dig = (d >> (pos & 31)) & 0xF;
-		}
-		if (dig > 9)
-			dig += 'a' - '0' - 10;
-		pos -= 4;
-		if (dig || digits || pos < 0)
-			buffer[digits++] = '0' + dig;
-	} while (pos >= 0);
-
-	buffer[digits] = 0;
-	return digits;
+	return __u64toa_base(in, buffer, 16, __U64TOA_RECIP(16));
 }
 
 /* converts uint64_t <in> to an hex string using the static itoa_buffer and
@@ -368,31 +358,12 @@ char *u64toh(uint64_t in)
  * buffer <buffer>, which must be long enough to store the number and the
  * trailing zero (21 bytes for 18446744073709551615). The buffer is filled from
  * the first byte, and the number of characters emitted (not counting the
- * trailing zero) is returned. The function is constructed in a way to optimize
- * the code size and avoid any divide that could add a dependency on large
- * external functions.
+ * trailing zero) is returned.
  */
-static __attribute__((unused))
+static __inline__ __attribute__((unused))
 int u64toa_r(uint64_t in, char *buffer)
 {
-	unsigned long long lim;
-	int digits = 0;
-	int pos = 19; /* start with the highest possible digit */
-	int dig;
-
-	do {
-		for (dig = 0, lim = 1; dig < pos; dig++)
-			lim *= 10;
-
-		if (digits || in >= lim || !pos) {
-			for (dig = 0; in >= lim; dig++)
-				in -= lim;
-			buffer[digits++] = '0' + dig;
-		}
-	} while (pos--);
-
-	buffer[digits] = 0;
-	return digits;
+	return __u64toa_base(in, buffer, 10, __U64TOA_RECIP(10));
 }
 
 /* Converts the signed 64-bit integer <in> to its string representation into
-- 
2.39.5
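
For anyone wanting to check the multiply-by-reciprocal arithmetic outside
nolibc, here is a minimal user-space sketch. It is not part of the patch.
It assumes a hosted C library (for snprintf()); RECIP() copies the
expression used for __U64TOA_RECIP(), while mulhi_approx(), sketch_u64toa()
and count_mismatches() are hypothetical names used only for illustration.
It always takes the 32x32 multiply path, the one that relies on the
correction loop, and supports bases 2 to 16 only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same expression as __U64TOA_RECIP(): floor(2^64 / base) without overflow. */
#define RECIP(base) ((base) & 1 ? ~0ull / (base) : (1ull << 63) / ((base) / 2))

/* Approximate high 64 bits of in * recip using 32x32->64 multiplies only.
 * The low*low partial product is dropped, so the result may be slightly low,
 * never high. */
static uint64_t mulhi_approx(uint64_t in, uint64_t recip)
{
	uint64_t p = (uint32_t)in * (recip >> 32);
	uint64_t q = (in >> 32) * (recip >> 32) + (p >> 32);

	p = (uint32_t)p + (in >> 32) * (uint32_t)recip;
	return q + (p >> 32);
}

/* Hypothetical helper: digits are produced least significant first, then the
 * string is reversed in place. 'base' is assumed to be in the range 2..16. */
static int sketch_u64toa(uint64_t in, char *buf, unsigned int base)
{
	uint64_t recip = RECIP(base);
	int n = 0, i;

	do {
		uint64_t q = mulhi_approx(in, recip);
		uint64_t dig = in - q * base;

		/* q can only be too small, so step it up until dig is a digit. */
		while (dig >= base) {
			dig -= base;
			q++;
		}
		buf[n++] = "0123456789abcdef"[dig];
		in = q;
	} while (in);
	buf[n] = 0;

	for (i = 0; i < n / 2; i++) {
		char tmp = buf[i];

		buf[i] = buf[n - 1 - i];
		buf[n - 1 - i] = tmp;
	}
	return n;
}

/* Compare against snprintf() for a few boundary values; returns the number
 * of strings that differ. */
static int count_mismatches(void)
{
	static const uint64_t samples[] = {
		0, 9, 10, 255, 4294967295ull, 4294967296ull,
		9999999999999999999ull, UINT64_MAX,
	};
	char got[65], want[65];
	int bad = 0;
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		sketch_u64toa(samples[i], got, 10);
		snprintf(want, sizeof(want), "%" PRIu64, samples[i]);
		bad += strcmp(got, want) != 0;

		sketch_u64toa(samples[i], got, 16);
		snprintf(want, sizeof(want), "%" PRIx64, samples[i]);
		bad += strcmp(got, want) != 0;

		sketch_u64toa(samples[i], got, 8);
		snprintf(want, sizeof(want), "%" PRIo64, samples[i]);
		bad += strcmp(got, want) != 0;
	}
	return bad;
}
```

Calling count_mismatches() from a small main() is expected to return 0.
Because the reciprocal is rounded down and partial products are dropped,
the estimated quotient can only be too small, which is why a loop that
only ever increments it is enough to recover the exact digit.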