From: David Laight
To: 'Eric Dumazet', Peter Zijlstra
CC: "'tglx@linutronix.de'", "'mingo@redhat.com'", 'Borislav Petkov', "'dave.hansen@linux.intel.com'", 'X86 ML', "'hpa@zytor.com'", "'alexanderduyck@fb.com'", 'open list', 'netdev', "'Noah Goldstein'"
Subject: [PATCH v2] x86/lib: Remove the special case for odd-aligned buffers in csum-partial_64.c
Date: Thu, 6 Jan 2022 14:45:41 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

There is no need to special case the very unusual odd-aligned buffers.
They are no worse than 4n+2 aligned buffers.

Signed-off-by: David Laight
Acked-by: Eric Dumazet
---

resend - v1 seems to have got lost :-)

v2: Also delete from32to16()
    Add acked-by from Eric (he sent one at some point)
    Fix possible whitespace error in the last hunk.

The penalty for any misaligned access seems to be minimal.
On an i7-7700 misaligned buffers add 2 or 3 clocks (in 115) to a 512 byte
checksum.
That is less than 1 clock for each cache line!
That is just measuring the main loop with an lfence prior to rdpmc to
read PERF_COUNT_HW_CPU_CYCLES.

 arch/x86/lib/csum-partial_64.c | 28 ++--------------------------
 1 file changed, 2 insertions(+), 26 deletions(-)

diff --git a/arch/x86/lib/csum-partial_64.c b/arch/x86/lib/csum-partial_64.c
index 1f8a8f895173..061b1ed74d6a 100644
--- a/arch/x86/lib/csum-partial_64.c
+++ b/arch/x86/lib/csum-partial_64.c
@@ -11,16 +11,6 @@
 #include
 #include

-static inline unsigned short from32to16(unsigned a)
-{
-	unsigned short b = a >> 16;
-	asm("addw %w2,%w0\n\t"
-	    "adcw $0,%w0\n"
-	    : "=r" (b)
-	    : "0" (b), "r" (a));
-	return b;
-}
-
 /*
  * Do a checksum on an arbitrary memory area.
  * Returns a 32bit checksum.
@@ -30,22 +20,12 @@ static inline unsigned short from32to16(unsigned a)
  *
  * Still, with CHECKSUM_COMPLETE this is called to compute
  * checksums on IPv6 headers (40 bytes) and other small parts.
- * it's best to have buff aligned on a 64-bit boundary
+ * The penalty for misaligned buff is negligable.
  */
 __wsum csum_partial(const void *buff, int len, __wsum sum)
 {
 	u64 temp64 = (__force u64)sum;
-	unsigned odd, result;
-
-	odd = 1 & (unsigned long) buff;
-	if (unlikely(odd)) {
-		if (unlikely(len == 0))
-			return sum;
-		temp64 = ror32((__force u32)sum, 8);
-		temp64 += (*(unsigned char *)buff << 8);
-		len--;
-		buff++;
-	}
+	unsigned result;

 	while (unlikely(len >= 64)) {
 		asm("addq 0*8(%[src]),%[res]\n\t"
@@ -130,10 +110,6 @@ __wsum csum_partial(const void *buff, int len, __wsum sum)
 #endif
 	}
 	result = add32_with_carry(temp64 >> 32, temp64 & 0xffffffff);
-	if (unlikely(odd)) {
-		result = from32to16(result);
-		result = ((result >> 8) & 0xff) | ((result & 0xff) << 8);
-	}
 	return (__force __wsum)result;
 }
 EXPORT_SYMBOL(csum_partial);
--
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
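
As a rough illustration of why dropping the odd-address path should not affect correctness, the sketch below sums a buffer using unaligned-safe loads and checks that the result depends only on the buffer contents, not on the start address. ref_csum() and csum_alignment_demo() are hypothetical names made up for this note. This is portable user-space C, not the kernel's csum_partial(), and it says nothing about the cycle counts quoted above.

```c
#include <assert.h>	/* only for callers that want to assert on the result */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical reference checksum (NOT the kernel implementation).
 * Adds the buffer as native-endian 32-bit words into a 64-bit
 * accumulator, then folds to 32 bits with end-around carry.
 */
static uint32_t ref_csum(const void *buff, size_t len, uint32_t sum)
{
	const unsigned char *p = buff;
	uint64_t acc = sum;

	while (len >= 4) {
		uint32_t word;

		memcpy(&word, p, sizeof(word));	/* load is safe at any alignment */
		acc += word;
		p += 4;
		len -= 4;
	}
	if (len) {
		uint32_t word = 0;

		memcpy(&word, p, len);	/* 1..3 trailing bytes, zero padded */
		acc += word;
	}

	/* Two folds are enough to bring the sum back into 32 bits. */
	acc = (acc & 0xffffffff) + (acc >> 32);
	acc = (acc & 0xffffffff) + (acc >> 32);
	return (uint32_t)acc;
}

/*
 * Copy the same bytes to start offsets 0..7 of an 8-byte aligned
 * backing store and compare the checksums.
 * Returns 1 if every offset gives the same value, 0 otherwise.
 */
int csum_alignment_demo(void)
{
	static uint64_t backing_words[66];	/* 528 bytes, 8-byte aligned */
	unsigned char *backing = (unsigned char *)backing_words;
	static const size_t lens[] = { 512, 41 };
	unsigned char data[512];
	size_t i, l, off;

	for (i = 0; i < sizeof(data); i++)
		data[i] = (unsigned char)(i * 31 + 7);	/* arbitrary pattern */

	for (l = 0; l < sizeof(lens) / sizeof(lens[0]); l++) {
		uint32_t expected = 0;

		for (off = 0; off < 8; off++) {
			uint32_t got;

			memcpy(backing + off, data, lens[l]);
			got = ref_csum(backing + off, lens[l], 0);
			if (off == 0)
				expected = got;
			else if (got != expected)
				return 0;
		}
	}
	return 1;
}
```

Calling csum_alignment_demo() from a small main() and checking for a return value of 1 is enough to exercise it. The loads go through memcpy() so the sketch stays well-defined C on any architecture; the kernel code relies on x86 tolerating misaligned addq operands directly.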