Date: Mon, 10 Oct 2022 15:53:42 -0700
Message-ID: <20221010225342.3903590-1-ndesaulniers@google.com>
Subject: [PATCH] ARM: NWFPE: avoid compiler-generated __aeabi_uldivmod
From: Nick Desaulniers
To: Arnd Bergmann, Russell King
Cc: Tom Rix, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, llvm@lists.linux.dev, Miguel Ojeda,
 Ard Biesheuvel, Gary Guo, Craig Topper, Philip Reames, jh@jhauser.us,
 Nick Desaulniers, Nathan Chancellor
X-Mailing-List: linux-kernel@vger.kernel.org

clang-15's ability to elide loops completely became more aggressive when it
can deduce how a variable is being updated in a loop. Counting down one
variable by an increment of another can be replaced by a modulo
operation.

For 64b variables on 32b ARM EABI targets, this can result in the
compiler generating calls to __aeabi_uldivmod, which it does for a
do-while loop in float64_rem().

For the kernel, we'd generally prefer that developers not open code 64b
division via binary / operators and instead use the more explicit
helpers from div64.h. On arm-linux-gnueabi targets, failure to do so can
result in linkage failures due to undefined references to
__aeabi_uldivmod().

While developers can avoid open coding divisions on 64b variables, the
compiler doesn't know that the Linux kernel has a partial implementation
of a compiler runtime (--rtlib) to enforce this convention.

It's also undecidable for the compiler whether the code in question
would be faster to execute the loop vs elide it and do the 64b division.

While I actively avoid using the internal -mllvm command line flags, I
think we get better code than using barrier() here, which will force
reloads+spills in the loop for all toolchains.

Link: https://github.com/ClangBuiltLinux/linux/issues/1666
Reported-by: Nathan Chancellor
Signed-off-by: Nick Desaulniers
Reviewed-by: Arnd Bergmann
Tested-by: Nathan Chancellor
---
 arch/arm/nwfpe/Makefile | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/nwfpe/Makefile b/arch/arm/nwfpe/Makefile
index 303400fa2cdf..2aec85ab1e8b 100644
--- a/arch/arm/nwfpe/Makefile
+++ b/arch/arm/nwfpe/Makefile
@@ -11,3 +11,9 @@ nwfpe-y				+= fpa11.o fpa11_cpdo.o fpa11_cpdt.o \
 				   entry.o
 
 nwfpe-$(CONFIG_FPE_NWFPE_XP)	+= extended_cpdo.o
+
+# Try really hard to avoid generating calls to __aeabi_uldivmod() from
+# float64_rem() due to loop elision.
+ifdef CONFIG_CC_IS_CLANG
+CFLAGS_softfloat.o	+= -mllvm -replexitval=never
+endif
-- 
2.38.0.rc2.412.g84df46c1b4-goog
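
For readers unfamiliar with the transformation the commit message describes, here is a minimal C sketch of a count-down loop and the modulo it is equivalent to. This is not the float64_rem() code; rem_by_loop() and rem_by_modulo() are hypothetical names chosen for illustration, and the sketch assumes a nonzero divisor.

```c
#include <stdint.h>

/*
 * Hypothetical illustration: count `a` down in steps of `b` until less
 * than one step remains. Assumes b != 0.
 */
static uint64_t rem_by_loop(uint64_t a, uint64_t b)
{
	while (a >= b)
		a -= b;
	return a;
}

/*
 * The same result expressed as a single modulo. An optimizer that can
 * deduce how `a` evolves may rewrite the loop above into this form; per
 * the commit message, a 64b division on a 32b ARM EABI target can become
 * a call to __aeabi_uldivmod.
 */
static uint64_t rem_by_modulo(uint64_t a, uint64_t b)
{
	return a % b;
}
```

Both functions return the same value for any nonzero b. Which one runs faster depends on how many times the loop would iterate, which the compiler generally cannot know ahead of time.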