From: Chuan Liu <chuan.liu@amlogic.com>
The cycle count register has a 20-bit effective width, but the driver
only utilizes 16 bits. This reduces the sampling window when measuring
high-frequency clocks, resulting in (slightly) degraded measurement
accuracy.
The input clock signal path from gate (Controlled by MSR_RUN) to internal
sampling circuit in clk-measure has a propagation delay requirement: 24
clock cycles must elapse after mux selection before sampling.
The measurement circuit employs single-edge sampling for clock frequency
detection, resulting in a ±1 cycle count error within the measurement window.
+1 cycle: 3 rising edges captured in 2-cycle measurement window.
    __    __    __
 __↑  |__↑  |__↑  |__
  ^             ^
-1 cycle: 2 rising edges captured in 3-cycle measurement window.
    __    __    __
 __↑  |__↑  |__↑  |__↑
    ^               ^
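
The ±1 count error can be reproduced with a small model. This is a minimal sketch, not driver code: `count_rising_edges()` is a hypothetical helper that assumes an ideal clock with rising edges at 0, period, 2 * period, ... in arbitrary time ticks, and counts the edges that fall inside a window opening at `start`.

```c
#include <stdint.h>

/*
 * Hypothetical model (not part of the driver): count the rising edges of an
 * ideal clock that fall inside the half-open window [start, start + window).
 * Rising edges are assumed to occur at 0, period, 2 * period, ...
 * The caller must pass a non-zero period.
 */
static uint32_t count_rising_edges(uint32_t period, uint32_t start,
				   uint32_t window)
{
	uint32_t end = start + window;
	/* Index of the first edge at or after 'start': ceil(start / period). */
	uint32_t first = (start + period - 1) / period;
	/* Index of the first edge at or after 'end': ceil(end / period). */
	uint32_t last = (end + period - 1) / period;

	return last - first;
}
```

With a period of 10 ticks, a window just over two cycles that opens on an edge sees three edges, while a window just under three cycles that opens right after an edge sees only two. The count therefore depends on the phase between the window and the clock, not only on the frequency.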
Change-Id: If367c013fe2a8d0c8f5f06888bb8f30a1e46b927
Signed-off-by: Chuan Liu <chuan.liu@amlogic.com>
---
Improve measurement accuracy by increasing the bit width of the cycle
counter register and adding delay during measurement.
The 800μs delay between enabling the input clock gate and activating
sampling is determined by the minimum sampling frequency of 30kHz (the
lowest commonly used frequency in applications is 32.768kHz).
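
As a rough illustration of where the 800μs figure comes from, the sketch below converts the 24-cycle requirement into microseconds for a given input frequency. `msr_settle_delay_us()` and the two macros are hypothetical names used only for this example; the patch itself always sleeps for the fixed worst-case value, since the input frequency is what is being measured.

```c
#include <stdint.h>

/* Cycles that must elapse before sampling (from the commit message). */
#define MSR_SETTLE_CYCLES	24U
/* Lowest input frequency the delay is sized for (from the commit message). */
#define MSR_MIN_FREQ_HZ		30000U

/*
 * Hypothetical helper, not part of the driver: time in microseconds for
 * MSR_SETTLE_CYCLES cycles of a clock running at freq_hz, rounded up.
 * Frequencies below MSR_MIN_FREQ_HZ (including 0) fall back to the
 * worst case the patch assumes.
 */
static uint32_t msr_settle_delay_us(uint32_t freq_hz)
{
	uint64_t cycles_us = (uint64_t)MSR_SETTLE_CYCLES * 1000000U;

	if (freq_hz < MSR_MIN_FREQ_HZ)
		freq_hz = MSR_MIN_FREQ_HZ;

	return (uint32_t)((cycles_us + freq_hz - 1) / freq_hz);
}
```

Under these assumptions a 30kHz input needs the full 800μs, a 32.768kHz input needs about 733μs, and a 24MHz input needs about 1μs.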
Here are the test comparisons based on C3:
Pre-optimization:
cat /sys/kernel/debug/meson-clk-msr/measure_summary
 clock                     rate    precision
---------------------------------------------
 sys_clk              166664063    +/-5208Hz
 axi_clk              499968750    +/-15625Hz
 rtc_clk               23982813    +/-3125Hz
 p20_usb2_ckout       479968750    +/-15625Hz
 eth_mpll_test        499992188    +/-15625Hz
 sys_pll             1919875000    +/-62500Hz
 cpu_clk_div16        119998162    +/-3676Hz
 ts_pll                       0    +/-3125Hz
 fclk_div2            999843750    +/-31250Hz
 fclk_div2p5          799953125    +/-31250Hz
 fclk_div3            666625000    +/-20833Hz
 fclk_div4            499914063    +/-15625Hz
 fclk_div5            399987500    +/-12500Hz
 fclk_div7            285709821    +/-8928Hz
 fclk_50m              49982813    +/-3125Hz
 sys_oscin32k_i           26563    +/-3125Hz
Post-optimization:
cat /sys/kernel/debug/meson-clk-msr/measure_summary
 clock                     rate    precision
---------------------------------------------
 sys_clk              166665625    +/-1562Hz
 axi_clk              499996875    +/-1562Hz
 rtc_clk               24000000    +/-1562Hz
 p20_usb2_ckout       479996875    +/-1562Hz
 eth_mpll_test        499996875    +/-1562Hz
 sys_pll             1919987132    +/-1838Hz
 cpu_clk_div16        119998438    +/-1562Hz
 ts_pll                       0    +/-1562Hz
 fclk_div2            999993750    +/-1562Hz
 fclk_div2p5          799995313    +/-1562Hz
 fclk_div3            666656250    +/-1562Hz
 fclk_div4            499996875    +/-1562Hz
 fclk_div5            399993750    +/-1562Hz
 fclk_div7            285712500    +/-1562Hz
 fclk_50m              49998438    +/-1562Hz
 sys_oscin32k_i           32813    +/-1562Hz
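
The precision column can be related to the counter width with a short sketch. Everything here is illustrative rather than taken from the driver: `msr_best_duration_us()` and `msr_precision_hz()` are hypothetical helpers, `DIV_MAX` of 640 is an assumption inferred from the ±1562Hz figures (1000000 / 640), and treating a count at or above the mask value as an overflow is also an assumption. The window is stepped down by `DIV_STEP` until the expected cycle count fits the counter, and a ±1 count error over a window of `duration` microseconds corresponds to 1000000 / duration Hz.

```c
#include <stdint.h>

#define DIV_MIN		32U
#define DIV_STEP	32U
#define DIV_MAX		640U	/* assumption: inferred from 1000000 / 640 = 1562 */

/*
 * Hypothetical helper: largest measurement window (in microseconds) for which
 * a clock at freq_hz produces a cycle count below max_count, searching
 * downwards from DIV_MAX in DIV_STEP steps. Returns 0 if no window fits.
 */
static uint32_t msr_best_duration_us(uint64_t freq_hz, uint32_t max_count)
{
	uint32_t duration;

	for (duration = DIV_MAX; duration >= DIV_MIN; duration -= DIV_STEP) {
		uint64_t count = freq_hz * duration / 1000000U;

		if (count < max_count)
			return duration;
	}

	return 0;
}

/* Frequency error in Hz caused by a one-count error over duration_us. */
static uint32_t msr_precision_hz(uint32_t duration_us)
{
	return 1000000U / duration_us;
}
```

Under these assumptions, sys_pll at roughly 1.92GHz only fits a 32μs window with a 16-bit counter (0xFFFF) but fits a 544μs window with a 20-bit counter (0xFFFFF), and 1000000 / 544 gives the +/-1838Hz shown above. Clocks slow enough to fit the full 640μs window report +/-1562Hz.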
---
drivers/soc/amlogic/meson-clk-measure.c | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/drivers/soc/amlogic/meson-clk-measure.c b/drivers/soc/amlogic/meson-clk-measure.c
index d862e30a244e..82acd8635bf8 100644
--- a/drivers/soc/amlogic/meson-clk-measure.c
+++ b/drivers/soc/amlogic/meson-clk-measure.c
@@ -22,7 +22,7 @@ static DEFINE_MUTEX(measure_lock);
 #define MSR_CLK_SRC		GENMASK(26, 20)
 #define MSR_BUSY		BIT(31)
 
-#define MSR_VAL_MASK		GENMASK(15, 0)
+#define MSR_VAL_MASK		GENMASK(19, 0)
 
 #define DIV_MIN			32
 #define DIV_STEP		32
@@ -805,14 +805,23 @@ static int meson_measure_id(struct meson_msr_id *clk_msr_id,
 	regmap_update_bits(priv->regmap, reg->freq_ctrl, MSR_DURATION,
 			   FIELD_PREP(MSR_DURATION, duration - 1));
 
-	/* Set ID */
-	regmap_update_bits(priv->regmap, reg->freq_ctrl, MSR_CLK_SRC,
-			   FIELD_PREP(MSR_CLK_SRC, clk_msr_id->id));
+	/* Set the clock channel ID and enable the input clock gate. */
+	regmap_update_bits(priv->regmap, reg->freq_ctrl, MSR_CLK_SRC | MSR_RUN,
+			   FIELD_PREP(MSR_CLK_SRC, clk_msr_id->id) | MSR_RUN);
 
-	/* Enable & Start */
-	regmap_update_bits(priv->regmap, reg->freq_ctrl,
-			   MSR_RUN | MSR_ENABLE,
-			   MSR_RUN | MSR_ENABLE);
+	/*
+	 * HACK: The input clock signal path from gate (Controlled by MSR_RUN)
+	 * to internal sampling circuit in clk-measure has a propagation delay
+	 * requirement: 24 clock cycles must elapse after mux selection before
+	 * sampling.
+	 *
+	 * For a 30kHz measurement clock, this translates to an 800μs delay:
+	 * 800us = 24 / 30000Hz.
+	 */
+	fsleep(800);
+
+	/* Enable the internal sampling circuit and start clock measurement. */
+	regmap_update_bits(priv->regmap, reg->freq_ctrl, MSR_ENABLE, MSR_ENABLE);
 
 	ret = regmap_read_poll_timeout(priv->regmap, reg->freq_ctrl,
 				       val, !(val & MSR_BUSY), 10, 10000);
@@ -846,7 +855,7 @@ static int meson_measure_best_id(struct meson_msr_id *clk_msr_id,
 	do {
 		ret = meson_measure_id(clk_msr_id, duration);
 		if (ret >= 0)
-			*precision = (2 * 1000000) / duration;
+			*precision = 1000000 / duration;
 		else
 			duration -= DIV_STEP;
 	} while (duration >= DIV_MIN && ret == -EINVAL);
---
base-commit: 87b480e04af45833deb5af1584694b0077805ea6
change-id: 20250523-optimize_clk-measure_accuracy-9e16ee346dd2
Best regards,
--
Chuan Liu <chuan.liu@amlogic.com>
Hello,

thank you for this patch!

On Thu, Jul 17, 2025 at 5:08 AM Chuan Liu via B4 Relay
<devnull+chuan.liu.amlogic.com@kernel.org> wrote:
>
> From: Chuan Liu <chuan.liu@amlogic.com>
>
> The cycle count register has a 20-bit effective width, but the driver
> only utilizes 16 bits. This reduces the sampling window when measuring
> high-frequency clocks, resulting in (slightly) degraded measurement
> accuracy.
I checked the Meson8 downstream code [0] and it uses 0x000FFFFF to
mask the register value -> this means that old SoCs also have a 20-bit
wide width.

[...]
> Here are the test comparisons based on C3:
I have tested this patch with Meson8b based Odroid-C1:
pre-optimization:
# time cat /sys/kernel/debug/meson-clk-msr/measure_summary | grep -v " 0 "
 clock                     rate    precision
---------------------------------------------
 clk81                159372396    +/-5208Hz
 a9_clk_div16          24000000    +/-3125Hz
 rtc_osc_clk_out          31250    +/-3125Hz
 hdmi_ch0_tmds        146399038    +/-4807Hz
 sar_adc                1140625    +/-3125Hz
 sdhc_rx               94443750    +/-3125Hz
 sdhc_sd               94443750    +/-3125Hz
 pwm_d                849921875    +/-31250Hz
 pwm_c                849921875    +/-31250Hz

real 0m0.102s
user 0m0.005s
sys 0m0.069s

post-optimization:
# time cat /sys/kernel/debug/meson-clk-msr/measure_summary | grep -v " 0 "
 clock                     rate    precision
---------------------------------------------
 clk81                159373438    +/-1562Hz
 a9_clk_div16          12000000    +/-1562Hz
 rtc_osc_clk_out          32813    +/-1562Hz
 hdmi_ch0_tmds        146398438    +/-1562Hz
 sar_adc                1143750    +/-1562Hz
 sdhc_rx               94443750    +/-1562Hz
 sdhc_sd               94443750    +/-1562Hz
 pwm_d                849992188    +/-1562Hz
 pwm_c                849992188    +/-1562Hz

real 0m0.173s
user 0m0.008s
sys 0m0.109s

So there's also an improvement in accuracy. The only downside I'm
seeing is that it takes 75% extra time for the measurement. For me
this is irrelevant since we use this for debugging.

[...]
> +	/*
> +	 * HACK: The input clock signal path from gate (Controlled by MSR_RUN)
> +	 * to internal sampling circuit in clk-measure has a propagation delay
> +	 * requirement: 24 clock cycles must elapse after mux selection before
> +	 * sampling.
> +	 *
> +	 * For a 30kHz measurement clock, this translates to an 800μs delay:
> +	 * 800us = 24 / 30000Hz.
> +	 */
> +	fsleep(800);
What is needed to make this not a HACK anymore? Is there a register
that we can poll for the number of clock cycles that have passed?

Best regards,
Martin
Hi Martin,

On 7/17/2025 11:43 PM, Martin Blumenstingl wrote:
> [...]
>> +	/*
>> +	 * HACK: The input clock signal path from gate (Controlled by MSR_RUN)
>> +	 * to internal sampling circuit in clk-measure has a propagation delay
>> +	 * requirement: 24 clock cycles must elapse after mux selection before
>> +	 * sampling.
>> +	 *
>> +	 * For a 30kHz measurement clock, this translates to an 800μs delay:
>> +	 * 800us = 24 / 30000Hz.
>> +	 */
>> +	fsleep(800);
> What is needed to make this not a HACK anymore? Is there a register
> that we can poll for the number of clock cycles that have passed?

The required delay duration is frequency-dependent on the measurement
clock source. The current 800μs delay is calculated based on a
minimum input clock frequency of 30kHz. At higher input frequencies,
this delay could be proportionally reduced. Applying a uniform 800μs
delay therefore appears overly conservative for general use cases.

The IP currently lacks a status register to detect whether the input
clock signal has successfully propagated to the internal measurement
circuitry.

The design of this IP has been maintained for many years. From a
hardware design perspective, there is room for optimization in this
signal propagation delay. Future IP updates may not require such a
long delay.

>
> Best regards,
> Martin
On Fri, Jul 18, 2025 at 8:14 AM Chuan Liu <chuan.liu@amlogic.com> wrote:
[...]
> > What is needed to make this not a HACK anymore? Is there a register
> > that we can poll for the number of clock cycles that have passed?
>
> The required delay duration is frequency-dependent on the measurement
> clock source. The current 800μs delay is calculated based on a
> minimum input clock frequency of 30kHz. At higher input frequencies,
> this delay could be proportionally reduced. Applying a uniform 800μs
> delay therefore appears overly conservative for general use cases.
>
> The IP currently lacks a status register to detect whether the input
> clock signal has successfully propagated to the internal measurement
> circuitry.
>
> The design of this IP has been maintained for many years. From a
> hardware design perspective, there is room for optimization in this
> signal propagation delay. Future IP updates may not require such a
> long delay.
Thanks for the detailed description. To me this doesn't seem like a
"hack" then, it's just something that's needed to interface with the
hardware correctly.
My suggestion is to replace the word "HACK" with "NOTE".

What do you think?

Best regards,
Martin
On 7/22/2025 4:16 AM, Martin Blumenstingl wrote:
> [...]
> Thanks for the detailed description. To me this doesn't seem like a
> "hack" then, it's just something that's needed to interface with the
> hardware correctly.
> My suggestion is to replace the word "HACK" with "NOTE".
>
> What do you think?

OK, thanks for your suggestion. I'll replace it with “NOTE” in the
next version.

>
> Best regards,
> Martin