From: david.laight.linux@gmail.com
To: Willy Tarreau, Thomas Weißschuh, linux-kernel@vger.kernel.org, Cheng Li
Cc: David Laight
Subject: [PATCH v2 next] tools/nolibc: Optimise and common up the number to ascii functions
Date: Sun, 8 Feb 2026 19:53:08 +0000
Message-Id: <20260208195308.4074-1-david.laight.linux@gmail.com>

From: David Laight

Implement u[64]to[ah]_r() using a common function that uses multiply
by reciprocal to generate the least significant digit first and then
reverses the string.

On 32bit this is six multiplies (with 64bit product) for each output digit.
I think the old utoa_r() always did 36 multiplies and a lot of subtracts - so
this is likely faster even for 32bit values.
Definitely better for 64bit values (especially small ones).

Clearly shifts are faster for base 16, but reversing the output buffer
makes a big difference.

Sharing the code reduces the footprint (unless gcc decides to constant
fold the functions).
Definitely helps vfprintf() where the constants get loaded and a single
call is done.
Also makes it cheap to add octal support to vfprintf for completeness.

Signed-off-by: David Laight
---

Willy T. acked v1, but I've changed the code slightly since.

Changes for v2:
- Update some comments.
- Add _nolibc_/_NOLIBC_ prefix to the visible symbols.
- Change _NOLIBC_U64TOA_RECIP() to return a slightly larger value and
  include the 'low by low' multiply in 32bit systems so that the
  correction for a slightly small quotient isn't needed.
  The code is now slightly smaller on x86-64.

 tools/include/nolibc/stdlib.h | 150 ++++++++++++++--------------------
 1 file changed, 62 insertions(+), 88 deletions(-)

diff --git a/tools/include/nolibc/stdlib.h b/tools/include/nolibc/stdlib.h
index f184e108ed0a..62d2fdc8c8d0 100644
--- a/tools/include/nolibc/stdlib.h
+++ b/tools/include/nolibc/stdlib.h
@@ -188,36 +188,70 @@ void *realloc(void *old_ptr, size_t new_size)
 	return ret;
 }
 
-/* Converts the unsigned long integer to its hex representation into
+/* Converts the unsigned 64bit integer to base <base> ascii into
  * buffer <buffer>, which must be long enough to store the number and the
- * trailing zero (17 bytes for "ffffffffffffffff" or 9 for "ffffffff"). The
- * buffer is filled from the first byte, and the number of characters emitted
- * (not counting the trailing zero) is returned. The function is constructed
- * in a way to optimize the code size and avoid any divide that could add a
- * dependency on large external functions.
+ * trailing zero. The buffer is filled from the first byte, and the number
+ * of characters emitted (not counting the trailing zero) is returned.
+ * The function uses 'multiply by reciprocal' for the divisions and
+ * requires the caller pass the correct reciprocal.
+ *
+ * Note that unlike __div64_const32() in asm-generic/div64.h the divisor
+ * is never large enough to need a bias added.
+ * The division is correct for all divisors up to at least 1G.
+ *
+ * __int128 isn't used for mips because gcc prior to 10.0 will call
+ * __multi3 for MIPS64r6.
  */
-static __attribute__((unused))
-int utoh_r(unsigned long in, char *buffer)
+#define _NOLIBC_U64TOA_RECIP(base) ((~0ULL / (base)) + 1)
+static __attribute__((unused, noinline))
+int _nolibc_u64toa_base(uint64_t in, char *buffer, unsigned int base, uint64_t recip)
 {
-	signed char pos = (~0UL > 0xfffffffful) ? 60 : 28;
-	int digits = 0;
-	int dig;
+	unsigned int digits = 0;
+	unsigned int dig;
+	uint64_t q;
+	char *p;
 
+	/* Generate least significant digit first */
 	do {
-		dig = in >> pos;
-		in -= (uint64_t)dig << pos;
-		pos -= 4;
-		if (dig || digits || pos < 0) {
-			if (dig > 9)
-				dig += 'a' - '0' - 10;
-			buffer[digits++] = '0' + dig;
-		}
-	} while (pos >= 0);
+#if defined(__SIZEOF_INT128__) && !defined(__mips__)
+		q = ((unsigned __int128)in * recip) >> 64;
+#else
+		uint64_t p = ((uint64_t)(uint32_t)in * (uint32_t)recip) >> 32;
+		p += (uint32_t)in * (recip >> 32);
+		q = (in >> 32) * (recip >> 32) + (p >> 32);
+		p = (uint32_t)p + (in >> 32) * (uint32_t)recip;
+		q += p >> 32;
+#endif
+		dig = in - q * base;
+		if (dig > 9)
+			dig += 'a' - '0' - 10;
+		buffer[digits++] = '0' + dig;
+	} while ((in = q));
 
 	buffer[digits] = 0;
+
+	/* Order reverse to result */
+	for (p = buffer + digits - 1; p > buffer; buffer++, p--) {
+		dig = *buffer;
+		*buffer = *p;
+		*p = dig;
+	}
+
 	return digits;
 }
 
+/* Converts the unsigned long integer to its hex representation into
+ * buffer <buffer>, which must be long enough to store the number and the
+ * trailing zero (17 bytes for "ffffffffffffffff" or 9 for "ffffffff"). The
+ * buffer is filled from the first byte, and the number of characters emitted
+ * (not counting the trailing zero) is returned.
+ */
+static __inline__ __attribute__((unused))
+int utoh_r(unsigned long in, char *buffer)
+{
+	return _nolibc_u64toa_base(in, buffer, 16, _NOLIBC_U64TOA_RECIP(16));
+}
+
 /* converts unsigned long to an hex string using the static itoa_buffer
  * and returns the pointer to that string.
  */
@@ -233,30 +267,11 @@ char *utoh(unsigned long in)
  * trailing zero (21 bytes for 18446744073709551615 in 64-bit, 11 for
  * 4294967295 in 32-bit). The buffer is filled from the first byte, and the
  * number of characters emitted (not counting the trailing zero) is returned.
- * The function is constructed in a way to optimize the code size and avoid
- * any divide that could add a dependency on large external functions.
  */
-static __attribute__((unused))
+static __inline__ __attribute__((unused))
 int utoa_r(unsigned long in, char *buffer)
 {
-	unsigned long lim;
-	int digits = 0;
-	int pos = (~0UL > 0xfffffffful) ? 19 : 9;
-	int dig;
-
-	do {
-		for (dig = 0, lim = 1; dig < pos; dig++)
-			lim *= 10;
-
-		if (digits || in >= lim || !pos) {
-			for (dig = 0; in >= lim; dig++)
-				in -= lim;
-			buffer[digits++] = '0' + dig;
-		}
-	} while (pos--);
-
-	buffer[digits] = 0;
-	return digits;
+	return _nolibc_u64toa_base(in, buffer, 10, _NOLIBC_U64TOA_RECIP(10));
 }
 
 /* Converts the signed long integer to its string representation into
@@ -324,34 +339,12 @@ char *utoa(unsigned long in)
  * buffer <buffer>, which must be long enough to store the number and the
  * trailing zero (17 bytes for "ffffffffffffffff"). The buffer is filled from
  * the first byte, and the number of characters emitted (not counting the
- * trailing zero) is returned. The function is constructed in a way to optimize
- * the code size and avoid any divide that could add a dependency on large
- * external functions.
+ * trailing zero) is returned.
  */
-static __attribute__((unused))
+static __inline__ __attribute__((unused))
 int u64toh_r(uint64_t in, char *buffer)
 {
-	signed char pos = 60;
-	int digits = 0;
-	int dig;
-
-	do {
-		if (sizeof(long) >= 8) {
-			dig = (in >> pos) & 0xF;
-		} else {
-			/* 32-bit platforms: avoid a 64-bit shift */
-			uint32_t d = (pos >= 32) ? (in >> 32) : in;
-			dig = (d >> (pos & 31)) & 0xF;
-		}
-		if (dig > 9)
-			dig += 'a' - '0' - 10;
-		pos -= 4;
-		if (dig || digits || pos < 0)
-			buffer[digits++] = '0' + dig;
-	} while (pos >= 0);
-
-	buffer[digits] = 0;
-	return digits;
+	return _nolibc_u64toa_base(in, buffer, 16, _NOLIBC_U64TOA_RECIP(16));
 }
 
 /* converts uint64_t to an hex string using the static itoa_buffer and
@@ -368,31 +361,12 @@ char *u64toh(uint64_t in)
  * buffer <buffer>, which must be long enough to store the number and the
  * trailing zero (21 bytes for 18446744073709551615). The buffer is filled from
  * the first byte, and the number of characters emitted (not counting the
- * trailing zero) is returned. The function is constructed in a way to optimize
- * the code size and avoid any divide that could add a dependency on large
- * external functions.
+ * trailing zero) is returned.
  */
-static __attribute__((unused))
+static __inline__ __attribute__((unused))
 int u64toa_r(uint64_t in, char *buffer)
 {
-	unsigned long long lim;
-	int digits = 0;
-	int pos = 19; /* start with the highest possible digit */
-	int dig;
-
-	do {
-		for (dig = 0, lim = 1; dig < pos; dig++)
-			lim *= 10;
-
-		if (digits || in >= lim || !pos) {
-			for (dig = 0; in >= lim; dig++)
-				in -= lim;
-			buffer[digits++] = '0' + dig;
-		}
-	} while (pos--);
-
-	buffer[digits] = 0;
-	return digits;
+	return _nolibc_u64toa_base(in, buffer, 10, _NOLIBC_U64TOA_RECIP(10));
 }
 
 /* Converts the signed 64-bit integer to its string representation into
-- 
2.39.5
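
For anyone reading the non-__int128 branch of the patch, the following is a
minimal standalone sketch of the same idea: taking the high 64 bits of a
64x64-bit product using only 32x32->64 multiplies. The name mulhi64_portable()
and its local variable names are hypothetical and do not appear in the patch.
The sketch only illustrates the partial-product arithmetic and says nothing
about which divisors or inputs give an exact quotient.

```c
#include <stdint.h>

/* High 64 bits of a * b, built from four 32x32->64 partial products.
 * mulhi64_portable() is a hypothetical name used only for this sketch.
 */
static inline uint64_t mulhi64_portable(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
	uint64_t mid, hi;

	/* Carry from the 'low by low' product into the middle column */
	mid = (a_lo * b_lo) >> 32;
	/* First cross product; this sum cannot overflow 64 bits */
	mid += a_lo * b_hi;
	/* 'High by high' product plus the carry out of the middle column */
	hi = a_hi * b_hi + (mid >> 32);
	/* Second cross product, added to the low half of the middle column */
	mid = (uint32_t)mid + a_hi * b_lo;
	hi += mid >> 32;

	return hi;
}
```

With the patch's names, mulhi64_portable(in, recip) corresponds to the value
assigned to q, which a compiler with __int128 computes as
((unsigned __int128)in * recip) >> 64.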