From nobody Thu Apr 2 17:44:29 2026
From: Nam Cao
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
 Andrew Jones, Clément Léger, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Nam Cao, Sebastian Andrzej Siewior
Subject: [PATCH 1/5] riscv: Clean up & optimize unaligned scalar access probe
Date: Wed, 11 Feb 2026 18:30:31 +0100
Message-ID: <9b9a20affe2e4f5c380926ceb885a47e20a59395.1770830596.git.namcao@linutronix.de>

check_unaligned_access_speed_all_cpus() is more complicated than it
should be:

- It uses on_each_cpu() to probe unaligned memory access on all CPUs,
  but excludes CPU0 with a check in the callback function, so the IPI
  sent to CPU0 is wasted.

- Probing on CPU0 is done with smp_call_on_cpu(), which is not as fast
  as on_each_cpu().

This design exists because the probe is timed with jiffies: on_each_cpu()
has to exclude CPU0, which stays behind to tend jiffies.

Instead, replace the jiffies usage with ktime_get_mono_fast_ns(). With
jiffies out of the way, on_each_cpu() can cover all CPUs and
smp_call_on_cpu() can be dropped.

To make ktime_get_mono_fast_ns() usable, move the probe to late_initcall.
Anything after the clocksource's fs_initcall works, but avoid depending
on the clocksource staying at fs_initcall.

The probe time is now 8000000 ns, which is the same as before (2 jiffies)
with the riscv defconfig. This is excessive for the CPUs I have and
should probably be reduced, but that is a separate discussion.

Suggested-by: Sebastian Andrzej Siewior
Signed-off-by: Nam Cao
---
 arch/riscv/kernel/unaligned_access_speed.c | 28 ++++++++--------------
 1 file changed, 10 insertions(+), 18 deletions(-)

diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index 70b5e6927620..8b744c4a41ea 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -17,6 +17,7 @@
 #include "copy-unaligned.h"
 
 #define MISALIGNED_ACCESS_JIFFIES_LG2 1
+#define MISALIGNED_ACCESS_NS 8000000
 #define MISALIGNED_BUFFER_SIZE 0x4000
 #define MISALIGNED_BUFFER_ORDER get_order(MISALIGNED_BUFFER_SIZE)
 #define MISALIGNED_COPY_SIZE ((MISALIGNED_BUFFER_SIZE / 2) - 0x80)
@@ -36,8 +37,8 @@ static int check_unaligned_access(void *param)
         u64 start_cycles, end_cycles;
         u64 word_cycles;
         u64 byte_cycles;
+        u64 start_ns;
         int ratio;
-        unsigned long start_jiffies, now;
         struct page *page = param;
         void *dst;
         void *src;
@@ -55,15 +56,13 @@ static int check_unaligned_access(void *param)
         /* Do a warmup. */
         __riscv_copy_words_unaligned(dst, src, MISALIGNED_COPY_SIZE);
         preempt_disable();
-        start_jiffies = jiffies;
-        while ((now = jiffies) == start_jiffies)
-                cpu_relax();
 
         /*
          * For a fixed amount of time, repeatedly try the function, and take
          * the best time in cycles as the measurement.
          */
-        while (time_before(jiffies, now + (1 << MISALIGNED_ACCESS_JIFFIES_LG2))) {
+        start_ns = ktime_get_mono_fast_ns();
+        while (ktime_get_mono_fast_ns() < start_ns + MISALIGNED_ACCESS_NS) {
                 start_cycles = get_cycles64();
                 /* Ensure the CSR read can't reorder WRT to the copy. */
                 mb();
@@ -77,11 +76,9 @@ static int check_unaligned_access(void *param)
 
         byte_cycles = -1ULL;
         __riscv_copy_bytes_unaligned(dst, src, MISALIGNED_COPY_SIZE);
-        start_jiffies = jiffies;
-        while ((now = jiffies) == start_jiffies)
-                cpu_relax();
 
-        while (time_before(jiffies, now + (1 << MISALIGNED_ACCESS_JIFFIES_LG2))) {
+        start_ns = ktime_get_mono_fast_ns();
+        while (ktime_get_mono_fast_ns() < start_ns + MISALIGNED_ACCESS_NS) {
                 start_cycles = get_cycles64();
                 mb();
                 __riscv_copy_bytes_unaligned(dst, src, MISALIGNED_COPY_SIZE);
@@ -125,13 +122,12 @@ static int check_unaligned_access(void *param)
         return 0;
 }
 
-static void __init check_unaligned_access_nonboot_cpu(void *param)
+static void __init _check_unaligned_access(void *param)
 {
         unsigned int cpu = smp_processor_id();
         struct page **pages = param;
 
-        if (smp_processor_id() != 0)
-                check_unaligned_access(pages[cpu]);
+        check_unaligned_access(pages[cpu]);
 }
 
 /* Measure unaligned access speed on all CPUs present at boot in parallel. */
@@ -158,11 +154,7 @@ static void __init check_unaligned_access_speed_all_cpus(void)
                 }
         }
 
-        /* Check everybody except 0, who stays behind to tend jiffies. */
-        on_each_cpu(check_unaligned_access_nonboot_cpu, bufs, 1);
-
-        /* Check core 0. */
-        smp_call_on_cpu(0, check_unaligned_access, bufs[0], true);
+        on_each_cpu(_check_unaligned_access, bufs, 1);
 
 out:
         for_each_cpu(cpu, cpu_online_mask) {
@@ -494,4 +486,4 @@ static int __init check_unaligned_access_all_cpus(void)
         return 0;
 }
 
-arch_initcall(check_unaligned_access_all_cpus);
+late_initcall(check_unaligned_access_all_cpus);
-- 
2.47.3

From nobody Thu Apr 2 17:44:29 2026
From: Nam Cao
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
 Andrew Jones, Clément Léger, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Nam Cao
Subject: [PATCH 2/5] riscv: Split out measure_cycles() for reuse
Date: Wed, 11 Feb 2026 18:30:32 +0100
Message-ID: <50d0598e45acc56c95176e52fbbe56e1f4becc84.1770830596.git.namcao@linutronix.de>

Byte cycle measurement and word cycle measurement of scalar misaligned
access are very similar. Split these parts out into a common
measure_cycles() function to avoid duplication.

This function will also be reused for the vector misaligned access probe
in a follow-up commit.

Signed-off-by: Nam Cao
---
 arch/riscv/kernel/unaligned_access_speed.c | 69 +++++++++++-----------
 1 file changed, 33 insertions(+), 36 deletions(-)

diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index 8b744c4a41ea..b964a666a973 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -31,30 +31,15 @@ static long unaligned_vector_speed_param = RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNO
 static cpumask_t fast_misaligned_access;
 
 #ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS
-static int check_unaligned_access(void *param)
+static u64 measure_cycles(void (*func)(void *dst, const void *src, size_t len),
+                          void *dst, void *src, size_t len)
 {
-        int cpu = smp_processor_id();
-        u64 start_cycles, end_cycles;
-        u64 word_cycles;
-        u64 byte_cycles;
+        u64 start_cycles, end_cycles, cycles = -1ULL;
         u64 start_ns;
-        int ratio;
-        struct page *page = param;
-        void *dst;
-        void *src;
-        long speed = RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW;
 
-        if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN)
-                return 0;
-
-        /* Make an unaligned destination buffer. */
-        dst = (void *)((unsigned long)page_address(page) | 0x1);
-        /* Unalign src as well, but differently (off by 1 + 2 = 3). */
-        src = dst + (MISALIGNED_BUFFER_SIZE / 2);
-        src += 2;
-        word_cycles = -1ULL;
         /* Do a warmup. */
-        __riscv_copy_words_unaligned(dst, src, MISALIGNED_COPY_SIZE);
+        func(dst, src, len);
+
         preempt_disable();
 
         /*
@@ -66,29 +51,41 @@ static int check_unaligned_access(void *param)
                 start_cycles = get_cycles64();
                 /* Ensure the CSR read can't reorder WRT to the copy. */
                 mb();
-                __riscv_copy_words_unaligned(dst, src, MISALIGNED_COPY_SIZE);
+                func(dst, src, len);
                 /* Ensure the copy ends before the end time is snapped. */
                 mb();
                 end_cycles = get_cycles64();
-                if ((end_cycles - start_cycles) < word_cycles)
-                        word_cycles = end_cycles - start_cycles;
+                if ((end_cycles - start_cycles) < cycles)
+                        cycles = end_cycles - start_cycles;
         }
 
-        byte_cycles = -1ULL;
-        __riscv_copy_bytes_unaligned(dst, src, MISALIGNED_COPY_SIZE);
+        preempt_enable();
 
-        start_ns = ktime_get_mono_fast_ns();
-        while (ktime_get_mono_fast_ns() < start_ns + MISALIGNED_ACCESS_NS) {
-                start_cycles = get_cycles64();
-                mb();
-                __riscv_copy_bytes_unaligned(dst, src, MISALIGNED_COPY_SIZE);
-                mb();
-                end_cycles = get_cycles64();
-                if ((end_cycles - start_cycles) < byte_cycles)
-                        byte_cycles = end_cycles - start_cycles;
-        }
+        return cycles;
+}
 
-        preempt_enable();
+static int check_unaligned_access(void *param)
+{
+        int cpu = smp_processor_id();
+        u64 word_cycles;
+        u64 byte_cycles;
+        int ratio;
+        struct page *page = param;
+        void *dst;
+        void *src;
+        long speed = RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW;
+
+        if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN)
+                return 0;
+
+        /* Make an unaligned destination buffer. */
+        dst = (void *)((unsigned long)page_address(page) | 0x1);
+        /* Unalign src as well, but differently (off by 1 + 2 = 3). */
+        src = dst + (MISALIGNED_BUFFER_SIZE / 2);
+        src += 2;
+
+        word_cycles = measure_cycles(__riscv_copy_words_unaligned, dst, src, MISALIGNED_COPY_SIZE);
+        byte_cycles = measure_cycles(__riscv_copy_bytes_unaligned, dst, src, MISALIGNED_COPY_SIZE);
 
         /* Don't divide by zero. */
         if (!word_cycles || !byte_cycles) {
-- 
2.47.3

From nobody Thu Apr 2 17:44:29 2026
From: Nam Cao
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
 Andrew Jones, Clément Léger, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Nam Cao
Subject: [PATCH 3/5] riscv: Reuse measure_cycles() in check_vector_unaligned_access()
Date: Wed, 11 Feb 2026 18:30:33 +0100

check_vector_unaligned_access() duplicates the logic in measure_cycles().
Reuse measure_cycles() and deduplicate.

Signed-off-by: Nam Cao
---
 arch/riscv/kernel/unaligned_access_speed.c | 54 ++++------------------
 1 file changed, 8 insertions(+), 46 deletions(-)

diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index b964a666a973..c0d39c4b2150 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -16,7 +16,6 @@
 
 #include "copy-unaligned.h"
 
-#define MISALIGNED_ACCESS_JIFFIES_LG2 1
 #define MISALIGNED_ACCESS_NS 8000000
 #define MISALIGNED_BUFFER_SIZE 0x4000
 #define MISALIGNED_BUFFER_ORDER get_order(MISALIGNED_BUFFER_SIZE)
@@ -30,9 +29,9 @@ static long unaligned_vector_speed_param = RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNO
 
 static cpumask_t fast_misaligned_access;
 
-#ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS
-static u64 measure_cycles(void (*func)(void *dst, const void *src, size_t len),
-                          void *dst, void *src, size_t len)
+static u64 __maybe_unused
+measure_cycles(void (*func)(void *dst, const void *src, size_t len),
+               void *dst, void *src, size_t len)
 {
         u64 start_cycles, end_cycles, cycles = -1ULL;
         u64 start_ns;
@@ -64,6 +63,7 @@ static u64 measure_cycles(void (*func)(void *dst, const void *src, size_t len),
         return cycles;
 }
 
+#ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS
 static int check_unaligned_access(void *param)
 {
         int cpu = smp_processor_id();
@@ -270,11 +270,9 @@ static int riscv_offline_cpu(unsigned int cpu)
 static void check_vector_unaligned_access(struct work_struct *work __always_unused)
 {
         int cpu = smp_processor_id();
-        u64 start_cycles, end_cycles;
         u64 word_cycles;
         u64 byte_cycles;
         int ratio;
-        unsigned long start_jiffies, now;
         struct page *page;
         void *dst;
         void *src;
@@ -294,50 +292,14 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
         /* Unalign src as well, but differently (off by 1 + 2 = 3). */
         src = dst + (MISALIGNED_BUFFER_SIZE / 2);
         src += 2;
-        word_cycles = -1ULL;
 
-        /* Do a warmup. */
         kernel_vector_begin();
-        __riscv_copy_vec_words_unaligned(dst, src, MISALIGNED_COPY_SIZE);
-
-        start_jiffies = jiffies;
-        while ((now = jiffies) == start_jiffies)
-                cpu_relax();
-
-        /*
-         * For a fixed amount of time, repeatedly try the function, and take
-         * the best time in cycles as the measurement.
-         */
-        while (time_before(jiffies, now + (1 << MISALIGNED_ACCESS_JIFFIES_LG2))) {
-                start_cycles = get_cycles64();
-                /* Ensure the CSR read can't reorder WRT to the copy. */
-                mb();
-                __riscv_copy_vec_words_unaligned(dst, src, MISALIGNED_COPY_SIZE);
-                /* Ensure the copy ends before the end time is snapped. */
-                mb();
-                end_cycles = get_cycles64();
-                if ((end_cycles - start_cycles) < word_cycles)
-                        word_cycles = end_cycles - start_cycles;
-        }
-
-        byte_cycles = -1ULL;
-        __riscv_copy_vec_bytes_unaligned(dst, src, MISALIGNED_COPY_SIZE);
-        start_jiffies = jiffies;
-        while ((now = jiffies) == start_jiffies)
-                cpu_relax();
 
-        while (time_before(jiffies, now + (1 << MISALIGNED_ACCESS_JIFFIES_LG2))) {
-                start_cycles = get_cycles64();
-                /* Ensure the CSR read can't reorder WRT to the copy. */
-                mb();
-                __riscv_copy_vec_bytes_unaligned(dst, src, MISALIGNED_COPY_SIZE);
-                /* Ensure the copy ends before the end time is snapped. */
-                mb();
-                end_cycles = get_cycles64();
-                if ((end_cycles - start_cycles) < byte_cycles)
-                        byte_cycles = end_cycles - start_cycles;
-        }
+        word_cycles = measure_cycles(__riscv_copy_vec_words_unaligned,
+                                     dst, src, MISALIGNED_COPY_SIZE);
 
+        byte_cycles = measure_cycles(__riscv_copy_vec_bytes_unaligned,
+                                     dst, src, MISALIGNED_COPY_SIZE);
         kernel_vector_end();
 
         /* Don't divide by zero. */
-- 
2.47.3

From nobody Thu Apr 2 17:44:29 2026
From: Nam Cao
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
 Andrew Jones, Clément Léger, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Nam Cao
Subject: [PATCH 4/5] riscv: Split out compare_unaligned_access()
Date: Wed, 11 Feb 2026 18:30:34 +0100
Message-ID: <3695f77279d473eead8ed6210d97c941321cd4f1.1770830596.git.namcao@linutronix.de>

The scalar misaligned access probe and the vector misaligned access probe
share very similar code. Split the common part out of the scalar probe
into compare_unaligned_access(), which will be reused for the vector
probe in a follow-up commit.

Signed-off-by: Nam Cao
---
 arch/riscv/kernel/unaligned_access_speed.c | 59 +++++++++++++++-------
 1 file changed, 40 insertions(+), 19 deletions(-)

diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index c0d39c4b2150..b3ed74b71d3e 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -63,58 +63,79 @@ measure_cycles(void (*func)(void *dst, const void *src, size_t len),
         return cycles;
 }
 
-#ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS
-static int check_unaligned_access(void *param)
+/*
+ * Return:
+ *  1 if unaligned accesses are fast
+ *  0 if unaligned accesses are slow
+ * -1 if check cannot be done
+ */
+static int __maybe_unused
+compare_unaligned_access(void (*word_copy)(void *dst, const void *src, size_t len),
+                         void (*byte_copy)(void *dst, const void *src, size_t len),
+                         void *buf)
 {
         int cpu = smp_processor_id();
         u64 word_cycles;
         u64 byte_cycles;
+        void *dst, *src;
+        bool fast;
         int ratio;
-        struct page *page = param;
-        void *dst;
-        void *src;
-        long speed = RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW;
-
-        if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN)
-                return 0;
 
         /* Make an unaligned destination buffer. */
-        dst = (void *)((unsigned long)page_address(page) | 0x1);
+        dst = (void *)((unsigned long)buf | 0x1);
         /* Unalign src as well, but differently (off by 1 + 2 = 3). */
         src = dst + (MISALIGNED_BUFFER_SIZE / 2);
         src += 2;
 
-        word_cycles = measure_cycles(__riscv_copy_words_unaligned, dst, src, MISALIGNED_COPY_SIZE);
-        byte_cycles = measure_cycles(__riscv_copy_bytes_unaligned, dst, src, MISALIGNED_COPY_SIZE);
+        word_cycles = measure_cycles(word_copy, dst, src, MISALIGNED_COPY_SIZE);
+        byte_cycles = measure_cycles(byte_copy, dst, src, MISALIGNED_COPY_SIZE);
 
         /* Don't divide by zero. */
         if (!word_cycles || !byte_cycles) {
                 pr_warn("cpu%d: rdtime lacks granularity needed to measure unaligned access speed\n",
                         cpu);
 
-                return 0;
+                return -1;
         }
 
-        if (word_cycles < byte_cycles)
-                speed = RISCV_HWPROBE_MISALIGNED_SCALAR_FAST;
+        fast = word_cycles < byte_cycles;
 
         ratio = div_u64((byte_cycles * 100), word_cycles);
         pr_info("cpu%d: Ratio of byte access time to unaligned word access is %d.%02d, unaligned accesses are %s\n",
                 cpu,
                 ratio / 100, ratio % 100,
-                (speed == RISCV_HWPROBE_MISALIGNED_SCALAR_FAST) ? "fast" : "slow");
+                fast ? "fast" : "slow");
 
-        per_cpu(misaligned_access_speed, cpu) = speed;
+        return fast;
+}
+
+#ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS
+static int check_unaligned_access(struct page *page)
+{
+        void *buf = page_address(page);
+        int cpu = smp_processor_id();
+        int ret;
+
+        if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN)
+                return 0;
+
+        ret = compare_unaligned_access(__riscv_copy_words_unaligned,
+                                       __riscv_copy_bytes_unaligned, buf);
+        if (ret < 0)
+                return 0;
 
         /*
          * Set the value of fast_misaligned_access of a CPU. These operations
          * are atomic to avoid race conditions.
          */
-        if (speed == RISCV_HWPROBE_MISALIGNED_SCALAR_FAST)
+        if (ret) {
+                per_cpu(misaligned_access_speed, cpu) = RISCV_HWPROBE_MISALIGNED_SCALAR_FAST;
                 cpumask_set_cpu(cpu, &fast_misaligned_access);
-        else
+        } else {
+                per_cpu(misaligned_access_speed, cpu) = RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW;
                 cpumask_clear_cpu(cpu, &fast_misaligned_access);
+        }
 
         return 0;
 }
-- 
2.47.3

From nobody Thu Apr 2 17:44:29 2026
From: Nam Cao
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
 Andrew Jones, Clément Léger, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Nam Cao
Subject: [PATCH 5/5] riscv: Reuse compare_unaligned_access() in check_vector_unaligned_access()
Date: Wed, 11 Feb 2026 18:30:35 +0100

check_vector_unaligned_access() duplicates the logic in
compare_unaligned_access(). Use compare_unaligned_access() and
deduplicate.

Signed-off-by: Nam Cao
---
 arch/riscv/kernel/unaligned_access_speed.c | 55 +++++++---------------
 1 file changed, 16 insertions(+), 39 deletions(-)

diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index b3ed74b71d3e..8a9f261dc10b 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -72,7 +72,7 @@ measure_cycles(void (*func)(void *dst, const void *src, size_t len),
 static int __maybe_unused
 compare_unaligned_access(void (*word_copy)(void *dst, const void *src, size_t len),
                          void (*byte_copy)(void *dst, const void *src, size_t len),
-                         void *buf)
+                         void *buf, const char *type)
 {
         int cpu = smp_processor_id();
         u64 word_cycles;
@@ -92,8 +92,8 @@ compare_unaligned_access(void (*word_copy)(void *dst, const void *src, size_t le
 
         /* Don't divide by zero. */
         if (!word_cycles || !byte_cycles) {
-                pr_warn("cpu%d: rdtime lacks granularity needed to measure unaligned access speed\n",
-                        cpu);
+                pr_warn("cpu%d: rdtime lacks granularity needed to measure %s unaligned access speed\n",
+                        cpu, type);
 
                 return -1;
         }
@@ -101,8 +101,9 @@ compare_unaligned_access(void (*word_copy)(void *dst, const void *src, size_t le
         fast = word_cycles < byte_cycles;
 
         ratio = div_u64((byte_cycles * 100), word_cycles);
-        pr_info("cpu%d: Ratio of byte access time to unaligned word access is %d.%02d, unaligned accesses are %s\n",
+        pr_info("cpu%d: %s unaligned word access speed is %d.%02dx byte access speed (%s)\n",
                 cpu,
+                type,
                 ratio / 100, ratio % 100,
                 fast ? "fast" : "slow");
@@ -121,7 +122,8 @@ static int check_unaligned_access(struct page *page)
                 return 0;
 
         ret = compare_unaligned_access(__riscv_copy_words_unaligned,
-                                       __riscv_copy_bytes_unaligned, buf);
+                                       __riscv_copy_bytes_unaligned,
+                                       buf, "scalar");
         if (ret < 0)
                 return 0;
 
@@ -291,13 +293,8 @@ static int riscv_offline_cpu(unsigned int cpu)
 static void check_vector_unaligned_access(struct work_struct *work __always_unused)
 {
         int cpu = smp_processor_id();
-        u64 word_cycles;
-        u64 byte_cycles;
-        int ratio;
         struct page *page;
-        void *dst;
-        void *src;
-        long speed = RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW;
+        int ret;
 
         if (per_cpu(vector_misaligned_access, cpu) != RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN)
                 return;
@@ -308,40 +305,20 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
                 return;
         }
 
-        /* Make an unaligned destination buffer. */
-        dst = (void *)((unsigned long)page_address(page) | 0x1);
-        /* Unalign src as well, but differently (off by 1 + 2 = 3). */
-        src = dst + (MISALIGNED_BUFFER_SIZE / 2);
-        src += 2;
-
         kernel_vector_begin();
 
-        word_cycles = measure_cycles(__riscv_copy_vec_words_unaligned,
-                                     dst, src, MISALIGNED_COPY_SIZE);
-
-        byte_cycles = measure_cycles(__riscv_copy_vec_bytes_unaligned,
-                                     dst, src, MISALIGNED_COPY_SIZE);
+        ret = compare_unaligned_access(__riscv_copy_vec_words_unaligned,
+                                       __riscv_copy_vec_bytes_unaligned,
+                                       page_address(page), "vector");
         kernel_vector_end();
 
-        /* Don't divide by zero. */
*/ - if (!word_cycles || !byte_cycles) { - pr_warn("cpu%d: rdtime lacks granularity needed to measure unaligned vec= tor access speed\n", - cpu); - + if (ret < 0) goto free; - } =20 - if (word_cycles < byte_cycles) - speed =3D RISCV_HWPROBE_MISALIGNED_VECTOR_FAST; - - ratio =3D div_u64((byte_cycles * 100), word_cycles); - pr_info("cpu%d: Ratio of vector byte access time to vector unaligned word= access is %d.%02d, unaligned accesses are %s\n", - cpu, - ratio / 100, - ratio % 100, - (speed =3D=3D RISCV_HWPROBE_MISALIGNED_VECTOR_FAST) ? "fast" : "slow"); - - per_cpu(vector_misaligned_access, cpu) =3D speed; + if (ret) + per_cpu(vector_misaligned_access, cpu) =3D RISCV_HWPROBE_MISALIGNED_VECT= OR_FAST; + else + per_cpu(vector_misaligned_access, cpu) =3D RISCV_HWPROBE_MISALIGNED_VECT= OR_SLOW; =20 free: __free_pages(page, MISALIGNED_BUFFER_ORDER); --=20 2.47.3