From nobody Sun Feb 8 18:03:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3DB4BC83F15 for ; Sun, 27 Aug 2023 01:27:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230129AbjH0B1J (ORCPT ); Sat, 26 Aug 2023 21:27:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230037AbjH0B0v (ORCPT ); Sat, 26 Aug 2023 21:26:51 -0400 Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A32BE1B3 for ; Sat, 26 Aug 2023 18:26:48 -0700 (PDT) Received: by mail-oo1-xc2c.google.com with SMTP id 006d021491bc7-571194584e2so1402214eaf.3 for ; Sat, 26 Aug 2023 18:26:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1693099608; x=1693704408; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=xfTS8tqhuCXvlRMkmz2PkepwLOiLzUz1oIVWEI6AmSg=; b=2uClxYn/559ZVUpa1YquW1FFWcG7vzRXkQRMoM5j2FLcrHWGTjCAKRd75m/5gdiqPh yM+9URcFsOcw0EQelDYFTuuZjZCoosRnNu79MJR79CKVohEkJwd6gRsL/eeobALPxFFg cQP6lkg8MltZCwzgfnmU0HspEgv5zpW/R1cgqxW3zzvjxYxixve7rX286VuKEZXZjWfa iHo1icydIYbrJ+qDmejDKV3LjvPKKECqFybgQlr4t5WJg70btYWsiKcwyHJjXEriRDiE ZBrdt7bfW73bkbWVNSu8uViudvT87B7nA3+sVIklrRrX7hUeKCGoS/qzG7wnSeXbMPtk mn/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693099608; x=1693704408; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=xfTS8tqhuCXvlRMkmz2PkepwLOiLzUz1oIVWEI6AmSg=; b=a4I6/Vt3wSGToBmzpy8pA3nk+cC5BeevxJ24YON4DgYXsmnQYlO60xzCbfdVHuFtdW 3o3AKV+sySownYCgXTQp7UgTSg6L8vEs2CcqDDG0qFqacgCELVnueZcSF+k1zaDm3mCF FTNvc72vFyXpT/fdjsFQqJBv7lOXqQ65FQvN5bY8KeqnHpUV1K8h8AnWaG6+WzDTTa19 p9kENN7XeZeiVNHhI/wGpDXx7kSS3E/idFjHsKUCBY4BOhTcgA1vZ1Nvp4W+g/ZuwF+m d5wk8TRqmYTDEnfwNQvcb9yO6fxzvXrvwUK2j9TSdAJ0OVOMXBUxWnwNhcmRUx8BY81C 1aNQ== X-Gm-Message-State: AOJu0YzEmdoQG5Sz12H+kSgFQ5q3OLvX6m9yezRcVHiBbTBk1Ep5Uddz nKaJObW8QQ+fX5/5SbrkCl4N1Xf6bgE0uaSLmLQ= X-Google-Smtp-Source: AGHT+IFLXaIwbb2H3mZ5Q6cj8cOqVl+jK5FQkosINRTZl886s4rzQbb1kPqE57ixOM1lSRn0MVvJpA== X-Received: by 2002:a05:6358:880a:b0:12b:e45b:3fac with SMTP id hv10-20020a056358880a00b0012be45b3facmr17843078rwb.32.1693099607875; Sat, 26 Aug 2023 18:26:47 -0700 (PDT) Received: from charlie.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id jf6-20020a170903268600b001b869410ed2sm4357404plb.72.2023.08.26.18.26.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Aug 2023 18:26:46 -0700 (PDT) From: Charlie Jenkins Date: Sat, 26 Aug 2023 18:26:06 -0700 Subject: [PATCH 1/5] riscv: Checksum header MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20230826-optimize_checksum-v1-1-937501b4522a@rivosinc.com> References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Paul Walmsley , Palmer Dabbelt , Albert Ou , Charlie Jenkins X-Mailer: b4 0.12.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Provide checksum algorithms that have been designed to leverage riscv instructions such as rotate. In 64-bit, can take advantage of the larger register to avoid some overflow checking. Add configuration for Zba extension and add march for Zba and Zbb. 
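As a rough illustration of the rotate-based fold described above, here is a userspace sketch. It is not the kernel code, and the names fold32, fold32_ref and fold32_selfcheck are hypothetical. Adding a 32-bit partial sum to its own 16-bit rotation leaves the folded ones' complement sum, end-around carry included, in the upper half, so no explicit carry check is needed:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: fold a 32-bit partial sum with a rotate and one add. */
static inline uint16_t fold32(uint32_t sum)
{
	/* (sum >> 16) | (sum << 16) is a rotate by 16 bits. */
	sum += (sum >> 16) | (sum << 16);
	/* The upper half now holds hi + lo plus the end-around carry. */
	return (uint16_t)~(sum >> 16);
}

/* Reference: the classic two-step fold with explicit carry handling. */
static inline uint16_t fold32_ref(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Compare both folds on a few edge values. */
static inline void fold32_selfcheck(void)
{
	static const uint32_t samples[] = {
		0x00000000u, 0x0000ffffu, 0xffff0000u,
		0xffffffffu, 0x12345678u, 0x8000ffffu,
	};
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(fold32(samples[i]) == fold32_ref(samples[i]));
}
```

On a target with a rotate instruction the first statement of fold32 can compile to one rotate and one add, which is presumably the benefit this patch is after.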
Signed-off-by: Charlie Jenkins --- arch/riscv/Kconfig | 23 +++++++++++ arch/riscv/Makefile | 2 + arch/riscv/include/asm/checksum.h | 86 +++++++++++++++++++++++++++++++++++= ++++ 3 files changed, 111 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 4c07b9189c86..8d7e475ca28d 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -507,6 +507,29 @@ config RISCV_ISA_V_DEFAULT_ENABLE =20 If you don't know what to do here, say Y. =20 +config TOOLCHAIN_HAS_ZBA + bool + default y + depends on !64BIT || $(cc-option,-mabi=3Dlp64 -march=3Drv64ima_zba) + depends on !32BIT || $(cc-option,-mabi=3Dilp32 -march=3Drv32ima_zba) + depends on LLD_VERSION >=3D 150000 || LD_VERSION >=3D 23900 + depends on AS_HAS_OPTION_ARCH + +config RISCV_ISA_ZBA + bool "Zba extension support for bit manipulation instructions" + depends on TOOLCHAIN_HAS_ZBA + depends on MMU + depends on RISCV_ALTERNATIVE + default y + help + Adds support to dynamically detect the presence of the ZBA + extension (basic bit manipulation) and enable its usage. + + The Zba extension provides instructions to accelerate a number + of bit-specific address creation operations. + + If you don't know what to do here, say Y. 
+ config TOOLCHAIN_HAS_ZBB bool default y diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile index 6ec6d52a4180..51fa3f67fc9a 100644 --- a/arch/riscv/Makefile +++ b/arch/riscv/Makefile @@ -61,6 +61,8 @@ riscv-march-$(CONFIG_ARCH_RV64I) :=3D rv64ima riscv-march-$(CONFIG_FPU) :=3D $(riscv-march-y)fd riscv-march-$(CONFIG_RISCV_ISA_C) :=3D $(riscv-march-y)c riscv-march-$(CONFIG_RISCV_ISA_V) :=3D $(riscv-march-y)v +riscv-march-$(CONFIG_RISCV_ISA_ZBA) :=3D $(riscv-march-y)_zba +riscv-march-$(CONFIG_RISCV_ISA_ZBB) :=3D $(riscv-march-y)_zbb =20 ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC KBUILD_CFLAGS +=3D -Wa,-misa-spec=3D2.2 diff --git a/arch/riscv/include/asm/checksum.h b/arch/riscv/include/asm/che= cksum.h new file mode 100644 index 000000000000..cd98f8cde888 --- /dev/null +++ b/arch/riscv/include/asm/checksum.h @@ -0,0 +1,86 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * IP checksum routines + * + * Copyright (C) 2023 Rivos Inc. + */ +#ifndef __ASM_RISCV_CHECKSUM_H +#define __ASM_RISCV_CHECKSUM_H + +#include +#include + +/* Default version is sufficient for 32 bit */ +#ifdef CONFIG_64BIT +#define _HAVE_ARCH_IPV6_CSUM +__sum16 csum_ipv6_magic(const struct in6_addr *saddr, + const struct in6_addr *daddr, + __u32 len, __u8 proto, __wsum sum); +#endif + +/* + * Fold a partial checksum without adding pseudo headers + */ +static inline __sum16 csum_fold(__wsum sum) +{ + sum +=3D (sum >> 16) | (sum << 16); + return (__force __sum16)(~(sum >> 16)); +} + +#define csum_fold csum_fold + +/* + * This is a version of ip_compute_csum() optimized for IP headers, + * which always checksum on 4 octet boundaries. + * Optimized for 32 and 64 bit platforms, with and without vector, with and + * without the bitmanip extensions zba/zbb. 
+ */ +#ifdef CONFIG_32BIT +static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl) +{ + __wsum csum =3D 0; + int pos =3D 0; + + do { + csum +=3D ((const __wsum *)iph)[pos]; + csum +=3D csum < ((const __wsum *)iph)[pos]; + } while (++pos < ihl); + return csum_fold(csum); +} +#else + +/* + * Quickly compute an IP checksum with the assumption that IPv4 headers wi= ll + * always be in multiples of 32-bits, and have an ihl of at least 5. + * @ihl is the number of 32 bit segments and must be greater than or equal= to 5. + * @iph is also assumed to be word aligned. + */ +static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl) +{ + unsigned long beginning; + unsigned long csum =3D 0; + + beginning =3D ((const unsigned long *)iph)[0]; + beginning +=3D ((const unsigned long *)iph)[1]; + beginning +=3D beginning < ((const unsigned long *)iph)[1]; + int pos =3D 4; + + do { + csum +=3D ((const unsigned int *)iph)[pos]; + } while (++pos < ihl); + csum +=3D beginning; + csum +=3D csum < beginning; + csum +=3D (csum >> 32) | (csum << 32); // Calculate overflow + return csum_fold((__force __wsum)(csum >> 32)); +} +#endif +#define ip_fast_csum ip_fast_csum + +#ifdef CONFIG_64BIT +extern unsigned int do_csum(const unsigned char *buff, int len); +#define do_csum do_csum +#endif + +#include + +#endif --=20 2.41.0 From nobody Sun Feb 8 18:03:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E30CEC83F01 for ; Sun, 27 Aug 2023 01:27:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230162AbjH0B1K (ORCPT ); Sat, 26 Aug 2023 21:27:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43852 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230038AbjH0B0w (ORCPT ); Sat, 26 Aug 2023 
21:26:52 -0400 Received: from mail-pj1-x1033.google.com (mail-pj1-x1033.google.com [IPv6:2607:f8b0:4864:20::1033]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E29B1B4 for ; Sat, 26 Aug 2023 18:26:49 -0700 (PDT) Received: by mail-pj1-x1033.google.com with SMTP id 98e67ed59e1d1-26f38171174so1334523a91.3 for ; Sat, 26 Aug 2023 18:26:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1693099609; x=1693704409; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=VcyEIUqviA7khUw+arj1F8KZbIH4pa6BCJUUF1erYKQ=; b=KBssN7pSowk/dJIIEgHKav1tbbyrmro3jYrP4WaAKlOMelorWlbKl6B5G9TRP9R9vm Cn/y4MlGtkiL+mEqtt7syBiBa6vzehktUGC4vJl5Bek9x1lbOAyLfEIuYLQBFfnr06QO vuULcGbun73JMaKzaej0JFrEU4DD5K4v6xHe2t941gT/NqP2zUPVypqwXGoSdrHg+zI0 FqfR4wVVNL3DpAl5aijkwqrmOMvaqeKZHUegDc3KJhhZIM4cay+jKTpVhBvOiV8hR6QN /b73p5FCVDSRnKsC5SBWWr+P1MzCDKNAEUuk2YRI6nqctwO0uqMuMim2uq+zxNnS+Z3V ceZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693099609; x=1693704409; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=VcyEIUqviA7khUw+arj1F8KZbIH4pa6BCJUUF1erYKQ=; b=AQnSe8GBcMkxvKp01csaCiqEWoIbsO7lzlT3jRF6ctAT5K5baCNk4fBJ1CsS8jGWPd OMoVvQyOTD09vPWYaPW/0zVBlaQ9/+Wi8q4HCIMaSsv8ImhYTh28qB1FWkWTgslHG5VZ Plt3kwpJvuE9mQroVai4UxRE4rx/YBf/JiOwX9X/BBM5byDBL4rIZRKII4kW5SZYkyDe fbUy5i3O8o4PVylo7whpfE9BL3VfeucnGZQv7fHgyLsISFJX1SRZO9DRTmpbiu5SkEld i6H5p3fbm3iXr8TeXKk7N/VSdAsrg6y6Oju5qWt6xNF3ocmdS+2ZtkBN/VxIxlfqk2oO nkcg== X-Gm-Message-State: AOJu0YyPQOdyg++DiFRXkjNoyC5vLhbHCdlvPtCkpYBpvUL67bD38/eV bbuzqZ6arhJq9PofFUAS8se9tBus1h/ICg+A7UQ= X-Google-Smtp-Source: AGHT+IHd5wO3p992s4Jo593vlDR2y4R+RaUPenkdSZfrxwMjPtp06m42dnM4COmq5sTNRn0RSemKjw== X-Received: by 2002:a17:90a:4ec2:b0:268:f987:305f with 
SMTP id v2-20020a17090a4ec200b00268f987305fmr21012361pjl.46.1693099608933; Sat, 26 Aug 2023 18:26:48 -0700 (PDT) Received: from charlie.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id jf6-20020a170903268600b001b869410ed2sm4357404plb.72.2023.08.26.18.26.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Aug 2023 18:26:48 -0700 (PDT) From: Charlie Jenkins Date: Sat, 26 Aug 2023 18:26:07 -0700 Subject: [PATCH 2/5] riscv: Add checksum library MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20230826-optimize_checksum-v1-2-937501b4522a@rivosinc.com> References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Paul Walmsley , Palmer Dabbelt , Albert Ou , Charlie Jenkins X-Mailer: b4 0.12.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Provide a 32 and 64 bit version of do_csum. When compiled for 32-bit will load from the buffer in groups of 32 bits, and when compiled for 64-bit will load in groups of 64 bits. 
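The idea of loading in wider groups can be sketched in userspace as follows. This is a simplified illustration under stated assumptions, not the do_csum in this patch: it only handles buffers whose length is a multiple of 8 bytes, it ignores misalignment, and sum_by_u64 and sum_by_u16 are hypothetical names. Each 64-bit load is added with an end-around carry, and the accumulator is folded down to 16 bits at the end:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: ones' complement sum, 8 bytes per step (len % 8 == 0). */
static uint16_t sum_by_u64(const unsigned char *buf, size_t len)
{
	uint64_t acc = 0, w;
	uint32_t s;
	size_t i;

	for (i = 0; i < len; i += 8) {
		memcpy(&w, buf + i, sizeof(w));	/* alignment-safe load */
		acc += w;
		acc += acc < w;			/* add the carry back in */
	}
	/* Fold 64 -> 32 bits with a rotate-and-add. */
	acc += (acc >> 32) | (acc << 32);
	s = (uint32_t)(acc >> 32);
	/* Fold 32 -> 16 bits the same way. */
	s += (s >> 16) | (s << 16);
	return (uint16_t)(s >> 16);
}

/* Reference: the same sum taken 2 bytes per step (len % 2 == 0, short buffers). */
static uint16_t sum_by_u16(const unsigned char *buf, size_t len)
{
	uint32_t acc = 0;
	uint16_t w;
	size_t i;

	for (i = 0; i < len; i += 2) {
		memcpy(&w, buf + i, sizeof(w));
		acc += w;
	}
	acc = (acc & 0xffff) + (acc >> 16);
	acc = (acc & 0xffff) + (acc >> 16);
	return (uint16_t)acc;
}
```

Both functions load in native byte order, so they agree on either endianness. The wide version takes a quarter of the loop iterations of the 16-bit one.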
Signed-off-by: Charlie Jenkins --- arch/riscv/include/asm/checksum.h | 2 - arch/riscv/lib/Makefile | 1 + arch/riscv/lib/csum.c | 118 ++++++++++++++++++++++++++++++++++= ++++ 3 files changed, 119 insertions(+), 2 deletions(-) diff --git a/arch/riscv/include/asm/checksum.h b/arch/riscv/include/asm/che= cksum.h index cd98f8cde888..af49b3409576 100644 --- a/arch/riscv/include/asm/checksum.h +++ b/arch/riscv/include/asm/checksum.h @@ -76,10 +76,8 @@ static inline __sum16 ip_fast_csum(const void *iph, unsi= gned int ihl) #endif #define ip_fast_csum ip_fast_csum =20 -#ifdef CONFIG_64BIT extern unsigned int do_csum(const unsigned char *buff, int len); #define do_csum do_csum -#endif =20 #include =20 diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile index 26cb2502ecf8..2aa1a4ad361f 100644 --- a/arch/riscv/lib/Makefile +++ b/arch/riscv/lib/Makefile @@ -6,6 +6,7 @@ lib-y +=3D memmove.o lib-y +=3D strcmp.o lib-y +=3D strlen.o lib-y +=3D strncmp.o +lib-y +=3D csum.o lib-$(CONFIG_MMU) +=3D uaccess.o lib-$(CONFIG_64BIT) +=3D tishift.o lib-$(CONFIG_RISCV_ISA_ZICBOZ) +=3D clear_page.o diff --git a/arch/riscv/lib/csum.c b/arch/riscv/lib/csum.c new file mode 100644 index 000000000000..2037041ce8a0 --- /dev/null +++ b/arch/riscv/lib/csum.c @@ -0,0 +1,118 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * IP checksum library + * + * Influenced by arch/arm64/lib/csum.c + * Copyright (C) 2023 Rivos Inc. 
+ */ +#include +#include +#include +#include + +#include + +/* Default version is sufficient for 32 bit */ +#ifdef CONFIG_64BIT +__sum16 csum_ipv6_magic(const struct in6_addr *saddr, + const struct in6_addr *daddr, + __u32 len, __u8 proto, __wsum csum) +{ + unsigned long sum, ulen, uproto; + + uproto =3D (unsigned long)htonl(proto); + ulen =3D (unsigned long)htonl(len); + sum =3D (unsigned long)csum; + + sum +=3D *(const unsigned long *)saddr->s6_addr; + sum +=3D sum < csum; + + sum +=3D *((const unsigned long *)saddr->s6_addr + 1); + sum +=3D sum < *((const unsigned long *)saddr->s6_addr + 1); + + sum +=3D *(const unsigned long *)daddr->s6_addr; + sum +=3D sum < *(const unsigned long *)daddr->s6_addr; + + sum +=3D *((const unsigned long *)daddr->s6_addr + 1); + sum +=3D sum < *((const unsigned long *)daddr->s6_addr + 1); + + sum +=3D ulen; + sum +=3D sum < ulen; + + sum +=3D uproto; + sum +=3D sum < uproto; + + sum +=3D (sum >> 32) | (sum << 32); + sum >>=3D 32; + return csum_fold((__force __wsum)sum); +} +EXPORT_SYMBOL(csum_ipv6_magic); +#endif + +#ifdef CONFIG_32BIT +typedef unsigned int csum_t; +#define OFFSET_MASK 3 +#else +typedef unsigned long csum_t; +#define OFFSET_MASK 7 +#endif + +/* + * Perform a checksum on an arbitrary memory address. + * Algorithm accounts for buff being misaligned. + * If not aligned on an 8-byte boundary, will read the whole word but not = use + * the bytes that it shouldn't. The same thing will occur on the tail-end = of the + * read. + */ +unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int = len) +{ + unsigned int offset, shift; + csum_t csum =3D 0, data; + const csum_t *ptr; + + if (unlikely(len <=3D 0)) + return 0; + /* + * To align the address, grab the whole first word in buff. + * Since it is inside of a same word, it will never cross pages or cache + * lines. + * Directly call KASAN with the alignment we will be using.
+ */ + offset =3D (csum_t)buff & OFFSET_MASK; + kasan_check_read(buff, len); + ptr =3D (const csum_t *)(buff - offset); + len =3D len + offset - sizeof(csum_t); + + /* + * RISC-V is always little endian, so need to clear bits to the right. + */ + shift =3D offset * 8; + data =3D *ptr; + data =3D (data >> shift) << shift; + + while (len > 0) { + csum +=3D data; + csum +=3D csum < data; + len -=3D sizeof(csum_t); + ptr +=3D 1; + data =3D *ptr; + } + + /* + * Perform alignment (and over-read) bytes on the tail if any bytes + * leftover. + */ + shift =3D len * -8; + data =3D (data << shift) >> shift; + csum +=3D data; + csum +=3D csum < data; + +#ifdef CONFIG_64BIT + csum +=3D (csum >> 32) | (csum << 32); + csum >>=3D 32; +#endif + csum =3D (unsigned int)csum + (((unsigned int)csum >> 16) | ((unsigned in= t)csum << 16)); + if (offset & 1) + return (unsigned short)swab32(csum); + return csum >> 16; +} --=20 2.41.0 From nobody Sun Feb 8 18:03:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E6DAC83F17 for ; Sun, 27 Aug 2023 01:27:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230193AbjH0B1L (ORCPT ); Sat, 26 Aug 2023 21:27:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43868 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230049AbjH0B0x (ORCPT ); Sat, 26 Aug 2023 21:26:53 -0400 Received: from mail-oo1-xc2e.google.com (mail-oo1-xc2e.google.com [IPv6:2607:f8b0:4864:20::c2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E6FE81B3 for ; Sat, 26 Aug 2023 18:26:50 -0700 (PDT) Received: by mail-oo1-xc2e.google.com with SMTP id 006d021491bc7-5733789a44cso1439289eaf.2 for ; Sat, 26 Aug 2023 18:26:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1693099610; x=1693704410; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=ExZs/Ozc5l4bbPGNHrfkcofcGWcAaQ7sfcQyuY4JEts=; b=mENLkbVNQcciD80loVn0MQJWYsOKvy3JwwwlPSEIvInRWvoIsBRPy7a/ERxr32CN41 dE5XWQgAbSdsVxrMaGxqmyHw5iKPyNf7QfCKohtIu3BMNW/3iH6Fd3s4o2sOGS5w+IrI 73aSoP63Wf02tG9CDg3Mj/0a6z3vsLsuqrXz9AgfxBL3gRNfRgfeBybbtlyXJO+NIYW5 Gf+GHn+aM6hDzQV0QNd+pflAy8kGBGDq3Vfu6SH+nRY58BZxl4xbFEGA3SRSVpm6yweq j4lbOvWCf1ReWiljEiMGnJ2stpC9FJYC9V60H5Fcc5vZiPvXSWYjRLZu1Qo49er9FCsP cS5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693099610; x=1693704410; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ExZs/Ozc5l4bbPGNHrfkcofcGWcAaQ7sfcQyuY4JEts=; b=Ntu4PwZQOAbuHY1xObegdv05wT8WEskHqCuLZ+0bLgYtXz/+2pFcFrbwGImrrZbSFm h8NkbX5jEQIopNCxP99G7Qz71bKWkkYojaCZYXsKqFxpm9oXabksEqysjaskkW+hGFY6 YIYcMbHmwbbhcJzYoFrsT2bYFwv/gAa04KaefVKiMj6jLUarE62mj0e5HjJTsFHrQ4Vr QKOuPmPd39qjCvGKt++vtAwPqVJk1rSZgDRERKlNhctRbzTLhnqRvH/5PRz403upO5DK fKawugzUAmAWZuOAbpfvFxbkJzOVP4DGD9LTSRJgpXJdImU+ZlxBkjgoga36TTYPZpwe Ws9g== X-Gm-Message-State: AOJu0YzPUO1as9JXPXc0yygMERKmY5LfpRyJh7cRszzhWFwRAUYHFVEW rwhT4C+9iEmbWqjgMRs3kNIP8Q== X-Google-Smtp-Source: AGHT+IEGHUp91K/R5AJzws7m5jTW/0V4MOrwdiKfnAZq2vc5gFCk+yTLxuLR2FbxrnzWKQCy61yFdQ== X-Received: by 2002:a05:6358:281d:b0:135:ae78:56c9 with SMTP id k29-20020a056358281d00b00135ae7856c9mr24257121rwb.6.1693099610119; Sat, 26 Aug 2023 18:26:50 -0700 (PDT) Received: from charlie.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id jf6-20020a170903268600b001b869410ed2sm4357404plb.72.2023.08.26.18.26.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Aug 2023 18:26:49 -0700 (PDT) From: Charlie Jenkins Date: Sat, 26 Aug 2023 
18:26:08 -0700 Subject: [PATCH 3/5] riscv: Vector checksum header MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20230826-optimize_checksum-v1-3-937501b4522a@rivosinc.com> References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Paul Walmsley , Palmer Dabbelt , Albert Ou , Charlie Jenkins X-Mailer: b4 0.12.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patch is not ready for merge as vector support in the kernel is limited. However, the code has been tested in QEMU so the algorithms do work. It is written in assembly rather than using the GCC vector intrinsics because they did not provide optimal code. Signed-off-by: Charlie Jenkins --- arch/riscv/include/asm/checksum.h | 81 +++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 81 insertions(+) diff --git a/arch/riscv/include/asm/checksum.h b/arch/riscv/include/asm/che= cksum.h index af49b3409576..7e31c0ad6346 100644 --- a/arch/riscv/include/asm/checksum.h +++ b/arch/riscv/include/asm/checksum.h @@ -10,6 +10,10 @@ #include #include =20 +#ifdef CONFIG_RISCV_ISA_V +#include +#endif + /* Default version is sufficient for 32 bit */ #ifdef CONFIG_64BIT #define _HAVE_ARCH_IPV6_CSUM @@ -36,6 +40,46 @@ static inline __sum16 csum_fold(__wsum sum) * without the bitmanip extensions zba/zbb.
*/ #ifdef CONFIG_32BIT +#ifdef CONFIG_RISCV_ISA_V +static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl) +{ + vuint64m1_t prev_buffer; + vuint32m1_t curr_buffer; + unsigned int vl; + unsigned int high_result; + unsigned int low_result; + + asm("vsetivli x0, 1, e64, ta, ma \n\t\ + vmv.v.i %[prev_buffer], 0 \n\t\ + 1: \n\t\ + vsetvli %[vl], %[ihl], e32, m1, ta, ma \n\t\ + vle32.v %[curr_buffer], (%[iph]) \n\t\ + vwredsumu.vs %[prev_buffer], %[curr_buffer], %[prev_buffer] \n\t\ + sub %[ihl], %[ihl], %[vl] \n\t" +#ifdef CONFIG_RISCV_ISA_ZBA + "sh2add %[iph], %[vl], %[iph] \n\t" +#else + "slli %[vl], %[vl], 2 \n\ + add %[iph], %[vl], %[iph] \n\t" +#endif + "bnez %[ihl], 1b \n\ + vsetivli x0, 1, e64, m1, ta, ma \n\ + vmv.x.s %[low_result], %[prev_buffer] \n\ + addi %[vl], x0, 32 \n\ + vsrl.vx %[prev_buffer], %[prev_buffer], %[vl] \n\ + vmv.x.s %[high_result], %[prev_buffer]" + : [vl] "=3D&r" (vl), [prev_buffer] "=3D&vd" (prev_buffer), + [curr_buffer] "=3D&vd" (curr_buffer), + [high_result] "=3D&r" (high_result), + [low_result] "=3D&r" (low_result) + : [iph] "r" (iph), [ihl] "r" (ihl)); + + high_result +=3D low_result; + high_result +=3D high_result < low_result; + return csum_fold((__force __wsum)(high_result)); +} + +#else static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl) { __wsum csum =3D 0; @@ -47,8 +91,44 @@ static inline __sum16 ip_fast_csum(const void *iph, unsi= gned int ihl) } while (++pos < ihl); return csum_fold(csum); } +#endif +#else + +#ifdef CONFIG_RISCV_ISA_V +static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl) +{ + vuint64m1_t prev_buffer; + vuint32m1_t curr_buffer; + unsigned long vl; + unsigned long result; + + asm("vsetivli x0, 1, e64, ta, ma \n\ + vmv.v.i %[prev_buffer], 0 \n\ + 1: \n\ + # Setup 32-bit sum of iph \n\ + vsetvli %[vl], %[ihl], e32, m1, ta, ma \n\ + vle32.v %[curr_buffer], (%[iph]) \n\ + # Sum each 32-bit segment of iph that can fit into a vector reg \n\ + vwredsumu.vs %[prev_buffer], 
%[curr_buffer], %[prev_buffer] \n\ + subw %[ihl], %[ihl], %[vl] \n\t" +#ifdef CONFIG_RISCV_ISA_ZBA + "sh2add %[iph], %[vl], %[iph] \n\t" #else + "slli %[vl], %[vl], 2 \n\ + addw %[iph], %[vl], %[iph] \n\t" +#endif + "# If not all of iph could fit into vector reg, do another sum \n\ + bnez %[ihl], 1b \n\ + vsetvli x0, x0, e64, m1, ta, ma \n\ + vmv.x.s %[result], %[prev_buffer]" + : [vl] "=3D&r" (vl), [prev_buffer] "=3D&vd" (prev_buffer), + [curr_buffer] "=3D&vd" (curr_buffer), [result] "=3D&r" (result) + : [iph] "r" (iph), [ihl] "r" (ihl)); =20 + result +=3D (result >> 32) | (result << 32); + return csum_fold((__force __wsum)(result >> 32)); +} +#else /* * Quickly compute an IP checksum with the assumption that IPv4 headers wi= ll * always be in multiples of 32-bits, and have an ihl of at least 5. @@ -74,6 +154,7 @@ static inline __sum16 ip_fast_csum(const void *iph, unsi= gned int ihl) return csum_fold((__force __wsum)(csum >> 32)); } #endif +#endif #define ip_fast_csum ip_fast_csum =20 extern unsigned int do_csum(const unsigned char *buff, int len); --=20 2.41.0 From nobody Sun Feb 8 18:03:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2F10C83F11 for ; Sun, 27 Aug 2023 01:27:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230233AbjH0B1N (ORCPT ); Sat, 26 Aug 2023 21:27:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43874 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230080AbjH0B0y (ORCPT ); Sat, 26 Aug 2023 21:26:54 -0400 Received: from mail-oo1-xc35.google.com (mail-oo1-xc35.google.com [IPv6:2607:f8b0:4864:20::c35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3113F1B3 for ; Sat, 26 Aug 2023 18:26:52 -0700 (PDT) Received: by mail-oo1-xc35.google.com with SMTP 
id 006d021491bc7-56d8bc0d909so1332493eaf.3 for ; Sat, 26 Aug 2023 18:26:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1693099611; x=1693704411; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=Ie5AvS87bg127Kdu5s03kcc9obiNPBZ0JNqYthYsEyc=; b=zUD1vsSrIt6NDaAmeB/rl4JKwtjqbf5Uby30fzjowl93TMiabkcelrdn0e50bqpdm0 1h4Gw18y3jK3WKRYljthTXqRU9bAfqMQryNerDQAh8mIa+QDvXoihH5oM9WWlfHtMlDo sgiR403FrWrNlom1heS9q9yEFb3GtxSznQIFRERg7lTwUx80bycVJjSHzai+JIQXZMao 8MiAePd6Cbl1EOPsVTPcCopM4Z23kLD1NS4gUaR0v2QMiPupe1AHOC7sEO8x7yxpqO9B gId8JcItLuzUQCzAcpnd0xH5/2WE0dmhbTHwkuoB9pjpuvZRoDJsfTOIFInXdnXv17l8 pOLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693099611; x=1693704411; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Ie5AvS87bg127Kdu5s03kcc9obiNPBZ0JNqYthYsEyc=; b=WhqX9BfNHwh65WK2GDNXGenhPEEiZDTmyrW8PBo6E/f3rF1lTvJ37Pzyei7Uk1uYFw xc1v0ZRsEyGwhfIiosv9m6idWvPbV2xV18PLHroX5Ds5x53tR9H+ENVX51WF5QIb9rcx SKUE6TW7XKzRMzSawD1fFd3Cpn65Mk84yv2E5Fgy4i4VapJ2RuS4d2aqU/vSUnKycm36 wtAzBeG1LBfFTbnEd/VvEVvY7SWvoZYmb+UV4KNk+9G+whn/uMJlfIlgI/Lqon0+ECJl U0otvrjfBE47Pm5wO0FDMey6owMtCNZS44tlAzQQ569EkgCWdmx5UANAz7GWUyjN0p2P yovg== X-Gm-Message-State: AOJu0YwWVqOA/mnWQVut/EaCkyV2roKJaRNpw5Ck3y+FjqAfFELalZkv k91oxalMjmkW8hLCfR57YBWjFpi/d4WCF23ru0E= X-Google-Smtp-Source: AGHT+IEoNs4BNtH12YP5V6L4J0N/KuUw1MFS8fSEwjeImsFFj1jnKWjJsURig57J3YVMHkIW5mJN+A== X-Received: by 2002:a05:6358:4319:b0:13a:4855:d885 with SMTP id r25-20020a056358431900b0013a4855d885mr24823282rwc.10.1693099611417; Sat, 26 Aug 2023 18:26:51 -0700 (PDT) Received: from charlie.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id 
jf6-20020a170903268600b001b869410ed2sm4357404plb.72.2023.08.26.18.26.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Aug 2023 18:26:50 -0700 (PDT) From: Charlie Jenkins Date: Sat, 26 Aug 2023 18:26:09 -0700 Subject: [PATCH 4/5] riscv: Vector checksum library MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20230826-optimize_checksum-v1-4-937501b4522a@rivosinc.com> References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Paul Walmsley , Palmer Dabbelt , Albert Ou , Charlie Jenkins X-Mailer: b4 0.12.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patch is not ready for merge as vector support in the kernel is limited. However, the code has been tested in QEMU so the algorithms do work. When Vector support is more mature, I will do more thorough testing of this code. It is written in assembly rather than using the GCC vector intrinsics because they did not provide optimal code. Signed-off-by: Charlie Jenkins --- arch/riscv/lib/csum.c | 165 ++++++++++++++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 165 insertions(+) diff --git a/arch/riscv/lib/csum.c b/arch/riscv/lib/csum.c index 2037041ce8a0..049a10596008 100644 --- a/arch/riscv/lib/csum.c +++ b/arch/riscv/lib/csum.c @@ -12,6 +12,10 @@ =20 #include =20 +#ifdef CONFIG_RISCV_ISA_V +#include +#endif + /* Default version is sufficient for 32 bit */ #ifdef CONFIG_64BIT __sum16 csum_ipv6_magic(const struct in6_addr *saddr, @@ -64,6 +68,166 @@ typedef unsigned long csum_t; * the bytes that it shouldn't. The same thing will occur on the tail-end = of the * read.
*/ +#ifdef CONFIG_RISCV_ISA_V +#ifdef CONFIG_32BIT +unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int = len) +{ + vuint64m1_t prev_buffer; + vuint32m1_t curr_buffer; + unsigned int shift; + unsigned int vl, high_result, low_result, csum, offset; + unsigned int tail_seg; + const unsigned int *ptr; + + if (len <=3D 0) + return 0; + + /* + * To align the address, grab the whole first byte in buff. + * Directly call KASAN with the alignment we will be using. + */ + offset =3D (unsigned int)buff & OFFSET_MASK; + kasan_check_read(buff, len); + ptr =3D (const unsigned int *)(buff - offset); + len +=3D offset; + + // Read the tail segment + tail_seg =3D len % 4; + csum =3D 0; + if (tail_seg) { + shift =3D (4 - tail_seg) * 8; + csum =3D *(unsigned int *)((const unsigned char *)ptr + len - tail_seg); + csum =3D ((unsigned int)csum << shift) >> shift; + len -=3D tail_seg; + } + + unsigned long start_mask =3D (unsigned int)(~(~0U << offset)); + + asm("vsetvli %[vl], %[len], e8, m1, ta, ma \n\ + # clear out mask and vector registers since we switch up sizes \n\ + vmclr.m v0 \n\ + vmclr.m %[prev_buffer] \n\ + vmclr.m %[curr_buffer] \n\ + # Mask out the leading bits of a misaligned address \n\ + vsetivli x0, 1, e64, m1, ta, ma \n\ + vmv.s.x %[prev_buffer], %[csum] \n\ + vmv.s.x v0, %[start_mask] \n\ + vsetvli %[vl], %[len], e8, m1, ta, ma \n\ + vmnot.m v0, v0 \n\ + vle8.v %[curr_buffer], (%[buff]), v0.t \n\ + j 2f \n\ + # Iterate through the buff and sum all words \n\ + 1: \n\ + vsetvli %[vl], %[len], e8, m1, ta, ma \n\ + vle8.v %[curr_buffer], (%[buff]) \n\ + 2: \n\ + vsetvli x0, x0, e32, m1, ta, ma \n\ + vwredsumu.vs %[prev_buffer], %[curr_buffer], %[prev_buffer] \n\ + sub %[len], %[len], %[vl] \n\t" +#ifdef CONFIG_RISCV_ISA_ZBA + "sh2add %[iph], %[vl], %[iph] \n\t" +#else + "slli %[vl], %[vl], 2 \n\ + add %[iph], %[vl], %[iph] \n\t" +#endif + "bnez %[len], 1b \n\ + vsetvli x0, x0, e64, m1, ta, ma \n\ + vmv.x.s %[result], %[prev_buffer] \n\ + addi %[vl], 
x0, 32 \n\ + vsrl.vx %[prev_buffer], %[prev_buffer], %[vl] \n\ + vmv.x.s %[high_result], %[prev_buffer]" + : [vl] "=3D&r" (vl), [prev_buffer] "=3D&vd" (prev_buffer), + [curr_buffer] "=3D&vd" (curr_buffer), + [high_result] "=3D&r" (high_result), + [low_result] "=3D&r" (low_result) + : [buff] "r" (ptr), [len] "r" (len), [start_mask] "r" (start_mask), + [csum] "r" (csum)); + + high_result +=3D low_result; + high_result +=3D high_result < low_result; + result =3D (unsigned int)result + (((unsigned int)result >> 16) | ((unsig= ned int)result << 16)); + if (offset & 1) + return (unsigned short)swab32(result); + return result >> 16; +} +#else +unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int = len) +{ + vuint64m1_t prev_buffer; + vuint32m1_t curr_buffer; + unsigned int shift; + unsigned long vl, result, csum, offset; + unsigned int tail_seg; + const unsigned long *ptr; + + if (len <=3D 0) + return 0; + + /* + * To align the address, grab the whole first byte in buff. + * Directly call KASAN with the alignment we will be using. 
+ */ + offset =3D (unsigned long)buff & 7; + kasan_check_read(buff, len); + ptr =3D (const unsigned long *)(buff - offset); + len +=3D offset; + + // Read the tail segment + tail_seg =3D len % 4; + csum =3D 0; + if (tail_seg) { + shift =3D (4 - tail_seg) * 8; + csum =3D *(unsigned int *)((const unsigned char *)ptr + len - tail_seg); + csum =3D ((unsigned int)csum << shift) >> shift; + len -=3D tail_seg; + } + + unsigned long start_mask =3D (unsigned int)(~(~0U << offset)); + + asm("vsetvli %[vl], %[len], e8, m1, ta, ma \n\ + # clear out mask and vector registers since we switch up sizes \n\ + vmclr.m v0 \n\ + vmclr.m %[prev_buffer] \n\ + vmclr.m %[curr_buffer] \n\ + # Mask out the leading bits of a misaligned address \n\ + vsetivli x0, 1, e64, m1, ta, ma \n\ + vmv.s.x %[prev_buffer], %[csum] \n\ + vmv.s.x v0, %[start_mask] \n\ + vsetvli %[vl], %[len], e8, m1, ta, ma \n\ + vmnot.m v0, v0 \n\ + vle8.v %[curr_buffer], (%[buff]), v0.t \n\ + j 2f \n\ + # Iterate through the buff and sum all words \n\ + 1: \n\ + vsetvli %[vl], %[len], e8, m1, ta, ma \n\ + vle8.v %[curr_buffer], (%[buff]) \n\ + 2: \n\ + vsetvli x0, x0, e32, m1, ta, ma \n\ + vwredsumu.vs %[prev_buffer], %[curr_buffer], %[prev_buffer] \n\ + subw %[len], %[len], %[vl] \n\t" +#ifdef CONFIG_RISCV_ISA_ZBA + "sh2add %[iph], %[vl], %[iph] \n\t" +#else + "slli %[vl], %[vl], 2 \n\ + addw %[iph], %[vl], %[iph] \n\t" +#endif + "bnez %[len], 1b \n\ + vsetvli x0, x0, e64, m1, ta, ma \n\ + vmv.x.s %[result], %[prev_buffer]" + : [vl] "=3D&r" (vl), [prev_buffer] "=3D&vd" (prev_buffer), + [curr_buffer] "=3D&vd" (curr_buffer), [result] "=3D&r" (result) + : [buff] "r" (ptr), [len] "r" (len), [start_mask] "r" (start_mask), + [csum] "r" (csum)); + + result +=3D (result >> 32) | (result << 32); + result >>=3D 32; + result =3D (unsigned int)result + (((unsigned int)result >> 16) | ((unsig= ned int)result << 16)); + if (offset & 1) + return (unsigned short)swab32(result); + return result >> 16; +} +#endif +#else unsigned int 
__no_sanitize_address do_csum(const unsigned char *buff, int len)
 {
 	unsigned int offset, shift;
@@ -116,3 +280,4 @@ unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len)
 		return (unsigned short)swab32(csum);
 	return csum >> 16;
 }
+#endif
-- 
2.41.0

From: Charlie Jenkins
Date: Sat, 26 Aug 2023 18:26:10 -0700
Subject: [PATCH 5/5] riscv: Test checksum functions
Message-Id: <20230826-optimize_checksum-v1-5-937501b4522a@rivosinc.com>
References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins

Add Kconfig support for riscv specific
testing modules. This was created to supplement lib/checksum_kunit.c,
and add tests for ip_fast_csum and csum_ipv6_magic.

Signed-off-by: Charlie Jenkins
---
 arch/riscv/Kconfig.debug              |   1 +
 arch/riscv/lib/Kconfig.debug          |  31 ++++++++++
 arch/riscv/lib/Makefile               |   2 +
 arch/riscv/lib/riscv_checksum_kunit.c | 111 ++++++++++++++++++++++++++++++++++
 4 files changed, 145 insertions(+)

diff --git a/arch/riscv/Kconfig.debug b/arch/riscv/Kconfig.debug
index e69de29bb2d1..53a84ec4f91f 100644
--- a/arch/riscv/Kconfig.debug
+++ b/arch/riscv/Kconfig.debug
@@ -0,0 +1 @@
+source "arch/riscv/lib/Kconfig.debug"
diff --git a/arch/riscv/lib/Kconfig.debug b/arch/riscv/lib/Kconfig.debug
new file mode 100644
index 000000000000..15fc83b68340
--- /dev/null
+++ b/arch/riscv/lib/Kconfig.debug
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: GPL-2.0-only
+menu "riscv Testing and Coverage"
+
+menuconfig RUNTIME_TESTING_MENU
+	bool "Runtime Testing"
+	def_bool y
+	help
+	  Enable riscv runtime testing.
+
+if RUNTIME_TESTING_MENU
+
+config RISCV_CHECKSUM_KUNIT
+	tristate "KUnit test riscv checksum functions at runtime" if !KUNIT_ALL_TESTS
+	depends on KUNIT
+	default KUNIT_ALL_TESTS
+	help
+	  Enable this option to test the checksum functions at boot.
+
+	  KUnit tests run during boot and output the results to the debug log
+	  in TAP format (http://testanything.org/). Only useful for kernel devs
+	  running the KUnit test harness, and not intended for inclusion into a
+	  production build.
+
+	  For more information on KUnit and unit tests in general please refer
+	  to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+	  If unsure, say N.
+
+endif # RUNTIME_TESTING_MENU
+
+endmenu # "riscv Testing and Coverage"
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 2aa1a4ad361f..1535a8c81430 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -12,3 +12,5 @@ lib-$(CONFIG_64BIT) += tishift.o
 lib-$(CONFIG_RISCV_ISA_ZICBOZ) += clear_page.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+
+obj-$(CONFIG_RISCV_CHECKSUM_KUNIT) += riscv_checksum_kunit.o
diff --git a/arch/riscv/lib/riscv_checksum_kunit.c b/arch/riscv/lib/riscv_checksum_kunit.c
new file mode 100644
index 000000000000..05b4710c907f
--- /dev/null
+++ b/arch/riscv/lib/riscv_checksum_kunit.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Test cases for checksum
+ */
+
+#include
+
+#include
+#include
+#include
+
+#define CHECK_EQ(lhs, rhs) KUNIT_ASSERT_EQ(test, lhs, rhs)
+
+static void test_csum_fold(struct kunit *test)
+{
+	unsigned int one = 1226127848;
+	unsigned int two = 446627905;
+	unsigned int three = 3644783064;
+	unsigned int four = 361842745;
+	unsigned int five = 4281073503;
+	unsigned int max = -1;
+
+	CHECK_EQ(0x7d02, csum_fold(one));
+	CHECK_EQ(0xe51f, csum_fold(two));
+	CHECK_EQ(0x2ce8, csum_fold(three));
+	CHECK_EQ(0xa235, csum_fold(four));
+	CHECK_EQ(0x174, csum_fold(five));
+	CHECK_EQ(0x0, csum_fold(max));
+}
+
+static void test_ip_fast_csum(struct kunit *test)
+{
+	unsigned char average[] = { 0x1c, 0x00, 0x00, 0x45, 0x00, 0x00, 0x68,
+				    0x74, 0x00, 0x00, 0x11, 0x80, 0x01, 0x64,
+				    0xa8, 0xc0, 0xe9, 0x9c, 0x46, 0xab };
+	unsigned char larger[] = { 0xa3, 0xde, 0x43, 0x41, 0x11, 0x19,
+				   0x2f, 0x73, 0x00, 0x00, 0xf1, 0xc5,
+				   0x31, 0xbb, 0xaa, 0xc1, 0x23, 0x5f,
+				   0x32, 0xde, 0x65, 0x39, 0xfe, 0xbc };
+	unsigned char overflow[] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				     0xff, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00,
+				     0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
+	unsigned char max[] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xfd, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xfd, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
+
+	CHECK_EQ(0x598f, ip_fast_csum(average, 5));
+	CHECK_EQ(0xdd4f, ip_fast_csum(larger, 6));
+	CHECK_EQ(0xfffe, ip_fast_csum(overflow, 5));
+	CHECK_EQ(0x400, ip_fast_csum(max, 14));
+}
+
+static void test_csum_ipv6_magic(struct kunit *test)
+{
+	struct in6_addr saddr = {
+		.s6_addr = { 0xf8, 0x43, 0x43, 0xf0, 0xdc, 0xa0, 0x39, 0x92,
+			     0x43, 0x67, 0x12, 0x03, 0xe3, 0x32, 0xfe, 0xed }};
+	struct in6_addr daddr = {
+		.s6_addr = { 0xa8, 0x23, 0x46, 0xdc, 0xc8, 0x2d, 0xaa, 0xe3,
+			     0xdc, 0x66, 0x72, 0x43, 0xe2, 0x12, 0xee, 0xfd }};
+	u32 len = 1 << 10;
+	u8 proto = 17;
+	__wsum csum = 53;
+
+	CHECK_EQ(0x2fbb, csum_ipv6_magic(&saddr, &daddr, len, proto, csum));
+}
+
+static void test_do_csum(struct kunit *test)
+{
+	unsigned char very_small[] = {0x32};
+	unsigned char small[] = {0xd3, 0x43, 0xad, 0x46};
+	unsigned char medium[] = {
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43
+	};
+	unsigned char *misaligned = medium + 1;
+	unsigned char large[] = {
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43,
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43,
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43,
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43
+	};
+	unsigned char *large_misaligned = large + 3;
+
+	CHECK_EQ(0xffcd, ip_compute_csum(very_small, 1));
+	CHECK_EQ(0x757f, ip_compute_csum(small, 4));
+	CHECK_EQ(0x5e56, ip_compute_csum(misaligned, 7));
+	CHECK_EQ(0x469d, ip_compute_csum(large, 29));
+	CHECK_EQ(0x43ae, ip_compute_csum(large_misaligned, 28));
+}
+
+static struct kunit_case __refdata riscv_checksum_test_cases[] = {
+	KUNIT_CASE(test_csum_fold),
+	KUNIT_CASE(test_ip_fast_csum),
+	KUNIT_CASE(test_csum_ipv6_magic),
+	KUNIT_CASE(test_do_csum),
+	{}
+};
+
+static struct kunit_suite riscv_checksum_test_suite = {
+	.name = "riscv_checksum",
+	.test_cases = riscv_checksum_test_cases,
+};
+
+kunit_test_suites(&riscv_checksum_test_suite);
+
+MODULE_AUTHOR("Charlie Jenkins ");
+MODULE_LICENSE("GPL");
-- 
2.41.0
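
For cross-checking the expected values in test_csum_fold and test_do_csum outside the kernel, a plain userspace reference can help. The sketch below is not part of the series: ref_csum_fold and ref_ip_compute_csum are hypothetical names, and the code assumes the buffer is summed as little-endian 16-bit words (as on riscv Linux) with an odd trailing byte taken as the low byte of a final word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference helper (not from the patch): fold a 32-bit
 * partial sum into 16 bits with end-around carry, then complement it.
 */
static uint16_t ref_csum_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);	/* absorb the carry of the first fold */
	return (uint16_t)~sum;
}

/*
 * Hypothetical reference helper (not from the patch): ones'-complement
 * checksum over len bytes, read as little-endian 16-bit words.
 */
static uint16_t ref_ip_compute_csum(const unsigned char *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);

	for (i = 0; i + 1 < len; i += 2) {
		sum += (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8);
		sum = (sum & 0xffff) + (sum >> 16);	/* keep the sum within 16 bits */
	}
	if (len & 1)
		sum += buf[len - 1];	/* odd tail byte is the low byte of a word */

	return ref_csum_fold(sum);
}
```

With these helpers, ref_csum_fold(1226127848) evaluates to 0x7d02 and ref_ip_compute_csum() over the 4-byte "small" buffer evaluates to 0x757f, which agrees with the expectations in the test file above. The byte-at-a-time loop is deliberately slow and alignment-agnostic, so it makes a useful oracle for the misaligned cases.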