From nobody Sun Feb 8 06:58:50 2026
From: Yi Sun
To: dave.hansen@intel.com, mingo@kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: sohil.mehta@intel.com, ak@linux.intel.com, ilpo.jarvinen@linux.intel.com, heng.su@intel.com, tony.luck@intel.com, yi.sun@linux.intel.com, yu.c.chen@intel.com, Yi Sun
Subject: [PATCH v7 1/3] x86/fpu: Measure the Latency of XSAVE and XRSTOR
Date: Thu, 21 Sep 2023 14:28:58 +0800
Message-Id: <20230921062900.864679-2-yi.sun@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230921062900.864679-1-yi.sun@intel.com>
References: <20230921062900.864679-1-yi.sun@intel.com>
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Add two trace points, x86_fpu_latency_xsave and x86_fpu_latency_xrstor.
The latency they report shows when XSAVE/XRSTOR are getting more or less
expensive, and the RFBM (requested-feature bitmap) and XINUSE values they
dump help to figure out the reason.

Calculate the latency of each of the XSAVE and XRSTOR instructions within
a single trace event. Another option considered was to have two separate
trace events marking the start and finish of XSAVE/XRSTOR, with the
latency calculated in user space from the two trace points, but the trace
function itself added significant overhead. In internal testing, the
single trace point option implemented here avoided much of that overhead.

Use trace_clock() to calculate the latency. It is based on cpu_clock(),
which allows at most ~1 jiffy of jitter between CPUs.

CONFIG_X86_DEBUG_FPU is required. When it is disabled, the compiler gets
rid of all the extra cruft. If both of the configs are enabled,
tracepoint_enabled() is reduced to a static check of whether tracing is
enabled.
Thus, the fast path has only two additional static checks.

Since trace points can be enabled dynamically, a trace event can be
enabled concurrently while the code is checking
tracepoint_enabled(trace_event). Hence a single noisy result,
'trace_clock() - (-1)', may be reported at the moment the
x86_fpu_latency_* trace points are enabled. Leave the noise in rather than
add conditions around the x86_fpu_latency_* calls, because that is not
worth it for a one-time outlier. It is easy to filter out with the
consuming script added later in this series or with another user space
tool.

The trace log looks like the following:
  x86_fpu_latency_xsave: x86/fpu: latency:100 RFBM:0x202e7 XINUSE:0x202
  x86_fpu_latency_xrstor: x86/fpu: latency:99 RFBM:0x202e7 XINUSE:0x202

Reviewed-by: Sohil Mehta
Reviewed-by: Tony Luck
Signed-off-by: Yi Sun

diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
index 4645a6334063..0640fe79edf3 100644
--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -89,6 +89,43 @@ DEFINE_EVENT(x86_fpu, x86_fpu_xstate_check_failed,
 	TP_ARGS(fpu)
 );
 
+#if defined(CONFIG_X86_DEBUG_FPU)
+DECLARE_EVENT_CLASS(x86_fpu_latency,
+	TP_PROTO(struct fpstate *fpstate, u64 latency),
+	TP_ARGS(fpstate, latency),
+
+	TP_STRUCT__entry(
+		__field(struct fpstate *, fpstate)
+		__field(u64, latency)
+		__field(u64, rfbm)
+		__field(u64, xinuse)
+	),
+
+	TP_fast_assign(
+		__entry->fpstate = fpstate;
+		__entry->latency = latency;
+		__entry->rfbm = fpstate->xfeatures;
+		__entry->xinuse = fpstate->regs.xsave.header.xfeatures;
+	),
+
+	TP_printk("x86/fpu: latency:%lld RFBM:0x%llx XINUSE:0x%llx",
+		__entry->latency,
+		__entry->rfbm,
+		__entry->xinuse
+	)
+);
+
+DEFINE_EVENT(x86_fpu_latency, x86_fpu_latency_xsave,
+	TP_PROTO(struct fpstate *fpstate, u64 latency),
+	TP_ARGS(fpstate, latency)
+);
+
+DEFINE_EVENT(x86_fpu_latency, x86_fpu_latency_xrstor,
+	TP_PROTO(struct fpstate *fpstate, u64 latency),
+	TP_ARGS(fpstate, latency)
+);
+#endif
+
 #undef TRACE_INCLUDE_PATH
 #define TRACE_INCLUDE_PATH asm/trace/
 #undef TRACE_INCLUDE_FILE
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index a4ecb04d8d64..aa997fb86537 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -5,6 +5,9 @@
 #include
 #include
 #include
+#include
+
+#include
 
 #ifdef CONFIG_X86_64
 DECLARE_PER_CPU(u64, xfd_state);
@@ -113,7 +116,7 @@ static inline u64 xfeatures_mask_independent(void)
  * original instruction which gets replaced. We need to use it here as the
  * address of the instruction where we might get an exception at.
  */
-#define XSTATE_XSAVE(st, lmask, hmask, err)	\
+#define __XSTATE_XSAVE(st, lmask, hmask, err)	\
 	asm volatile(ALTERNATIVE_3(XSAVE,	\
 		XSAVEOPT, X86_FEATURE_XSAVEOPT,	\
 		XSAVEC, X86_FEATURE_XSAVEC,	\
@@ -130,7 +133,7 @@ static inline u64 xfeatures_mask_independent(void)
  * Use XRSTORS to restore context if it is enabled. XRSTORS supports compact
  * XSAVE area format.
  */
-#define XSTATE_XRESTORE(st, lmask, hmask)	\
+#define __XSTATE_XRESTORE(st, lmask, hmask)	\
 	asm volatile(ALTERNATIVE(XRSTOR,	\
 		XRSTORS, X86_FEATURE_XSAVES)	\
 		"\n"	\
@@ -140,6 +143,35 @@ static inline u64 xfeatures_mask_independent(void)
 		: "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		: "memory")
 
+#if defined(CONFIG_X86_DEBUG_FPU)
+#define XSTATE_XSAVE(fps, lmask, hmask, err)	\
+	do {	\
+		struct fpstate *f = fps;	\
+		u64 tc = -1;	\
+		if (tracepoint_enabled(x86_fpu_latency_xsave))	\
+			tc = trace_clock();	\
+		__XSTATE_XSAVE(&f->regs.xsave, lmask, hmask, err);	\
+		if (tracepoint_enabled(x86_fpu_latency_xsave))	\
+			trace_x86_fpu_latency_xsave(f, trace_clock() - tc);\
+	} while (0)
+
+#define XSTATE_XRESTORE(fps, lmask, hmask)	\
+	do {	\
+		struct fpstate *f = fps;	\
+		u64 tc = -1;	\
+		if (tracepoint_enabled(x86_fpu_latency_xrstor))	\
+			tc = trace_clock();	\
+		__XSTATE_XRESTORE(&f->regs.xsave, lmask, hmask);	\
+		if (tracepoint_enabled(x86_fpu_latency_xrstor))	\
+			trace_x86_fpu_latency_xrstor(f, trace_clock() - tc);\
+	} while (0)
+#else
+#define XSTATE_XSAVE(fps, lmask, hmask, err)	\
+	__XSTATE_XSAVE(&(fps)->regs.xsave, lmask, hmask, err)
+#define XSTATE_XRESTORE(fps, lmask, hmask)	\
+	__XSTATE_XRESTORE(&(fps)->regs.xsave, lmask, hmask)
+#endif
+
 #if defined(CONFIG_X86_64) && defined(CONFIG_X86_DEBUG_FPU)
 extern void xfd_validate_state(struct fpstate *fpstate, u64 mask, bool rstor);
 #else
@@ -184,7 +216,7 @@ static inline void os_xsave(struct fpstate *fpstate)
 	WARN_ON_FPU(!alternatives_patched);
 	xfd_validate_state(fpstate, mask, false);
 
-	XSTATE_XSAVE(&fpstate->regs.xsave, lmask, hmask, err);
+	XSTATE_XSAVE(fpstate, lmask, hmask, err);
 
 	/* We should never fault when copying to a kernel buffer: */
 	WARN_ON_FPU(err);
@@ -201,7 +233,7 @@ static inline void os_xrstor(struct fpstate *fpstate, u64 mask)
 	u32 hmask = mask >> 32;
 
 	xfd_validate_state(fpstate, mask, true);
-	XSTATE_XRESTORE(&fpstate->regs.xsave, lmask, hmask);
+	XSTATE_XRESTORE(fpstate, lmask, hmask);
 }
 
 /* Restore of supervisor state. Does not require XFD */
@@ -211,7 +243,7 @@ static inline void os_xrstor_supervisor(struct fpstate *fpstate)
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 
-	XSTATE_XRESTORE(&fpstate->regs.xsave, lmask, hmask);
+	XSTATE_XRESTORE(fpstate, lmask, hmask);
 }
 
 /*
-- 
2.34.1

From nobody Sun Feb 8 06:58:50 2026
From: Yi Sun
To: dave.hansen@intel.com, mingo@kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: sohil.mehta@intel.com, ak@linux.intel.com, ilpo.jarvinen@linux.intel.com, heng.su@intel.com, tony.luck@intel.com, yi.sun@linux.intel.com, yu.c.chen@intel.com, Yi Sun
Subject: [PATCH v7 2/3] tools/testing/fpu: Add script to consume trace log of xsave latency
Date: Thu, 21 Sep 2023 14:28:59 +0800
Message-Id: <20230921062900.864679-3-yi.sun@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230921062900.864679-1-yi.sun@intel.com>
References: <20230921062900.864679-1-yi.sun@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
X-Mailing-List: linux-kernel@vger.kernel.org

Consume the trace log dumped by the trace points x86_fpu_latency_xsave and
x86_fpu_latency_xrstor, and calculate the latency range for each RFBM and
XINUSE combination, including the min, max, average and 97% tail latency.

The 97% tail latency is the average over the lowest 97% of the samples in
each combination. It removes the unreasonable data introduced by
interrupts or other noise. Experimental code that disabled interrupts
before the latency calculation showed that this 3% tail is filtered out.

Use sqlite3 to keep the data statistics efficient and concise.
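The 97% average described above can be sketched without a database. This is
a minimal illustration, not the script's implementation: `trimmed_avg` is a
hypothetical helper name, and it assumes one integer latency per line on
stdin. It keeps round(N * 0.97) of the sorted samples, which mirrors the
rounding the script performs with bc.

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch): average of the lowest 97%
# of the latency samples read from stdin, one integer per line.
trimmed_avg() {
    sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            # Number of samples to keep: round(NR * 0.97)
            keep = int(NR * 0.97 + 0.5)
            for (i = 1; i <= keep; i++) sum += v[i]
            # Round the average to the nearest whole number
            printf "%d\n", int(sum / keep + 0.5)
        }'
}
```

For example, feeding nineteen samples of 100 and one sample of 5000 keeps
nineteen samples and drops the outlier, whereas the plain average of all
twenty samples would be 345.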

The output looks like the following:

EVENTs                  RFBM     XINUSE  lat_min  lat_max  lat_avg  lat_avg(97%)
----------------------  -------  ------  -------  -------  -------  ------------
x86_fpu_latency_xrstor  0x206e7  0x0     364      364      364      364
x86_fpu_latency_xrstor  0x206e7  0x202   112      1152     300      276
x86_fpu_latency_xsave   0x206e7  0x202   80       278      141      137
x86_fpu_latency_xsave   0x206e7  0x246   108      234      180      177

The XSAVE/XRSTOR latency trace log can be obtained in two ways:

1. Generated by kernel debugfs
   echo 1 > /sys/kernel/debug/tracing/events/x86_fpu/enable
   cat /sys/kernel/debug/tracing/trace_pipe > trace-log

2. Generated by a helper tool like 'trace-cmd'
   trace-cmd record -e x86_fpu -F
   trace-cmd report > trace-log

Reviewed-by: Tony Luck
Signed-off-by: Yi Sun

diff --git a/tools/testing/fpu/xsave-latency-trace.sh b/tools/testing/fpu/xsave-latency-trace.sh
new file mode 100755
index 000000000000..d45563984fd6
--- /dev/null
+++ b/tools/testing/fpu/xsave-latency-trace.sh
@@ -0,0 +1,227 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# (c) 2022 Yi Sun
+
+trace_log=$1
+trace_lat_log=".trace_lat_log"
+db_name="db_trace"
+db_file="${db_name}.db"
+table_raw="t_trace"
+table_tail="t_trace_tail"
+table_results="t_results"
+events="x86_fpu_latency_xsave|x86_fpu_latency_xrstor"
+
+# The regex for the trace log. The rough pattern:
+# (proc) (No.cpu) (flags) (timestamp): (tracepoint): latency:(123) RFBM:0x(123) XINUSE:0x(123)$
+# Fold the regex into 3 parts making it easier to read.
+regex1="([^\ ]*)[[:space:]]*\[([0-9]+)\][[:space:]]*(.....\ )?[[:space:]]*"
+regex2="([0-9.]*):[[:space:]]*([^\ :]*):.*latency:([0-9]*)[[:space:]]*"
+regex3="RFBM:(0x[0-9a-f]*)[[:space:]]*XINUSE:(0x[0-9a-f]*)$"
+
+function usage() {
+	echo "This script consumes the tracepoint data, and dumps out the"
+	echo "latency ranges for each RFBM combination."
+	echo "Usage:"
+	echo "$0 "
+	echo "  trace-log:"
+	echo "    Either generated by Kernel sysfs:"
+	echo "      echo 1 > /sys/kernel/debug/tracing/events/x86_fpu/enable"
+	echo "      cat /sys/kernel/debug/tracing/trace_pipe > trace-log"
+	echo ""
+	echo "    Or generate by helper tool like 'trace-cmd':"
+	echo "      trace-cmd record -e x86_fpu"
+	echo "      trace-cmd report > trace-log"
+}
+
+# Check the dependent tools
+# {@}: a list of third-part tools
+function check_packages() {
+	for pack in "$@"; do
+		which $pack >& /dev/null
+		if [[ $? != 0 ]]; then
+			echo "Please install $pack before running this script."
+			exit 1
+		fi
+	done
+}
+
+# Run SQL command with sqlite3
+# ${*}: SQL command fed to sqlite3
+function SQL_CMD() {
+	sqlite3 $db_file "$*"
+}
+
+# Run SQL command with sqlite3 and format the output with headers and column.
+# ${*}: SQL command fed to sqlite3
+function SQL_CMD_HEADER() {
+	sqlite3 -column -header $db_file "$*"
+}
+
+# Create a table in the DB
+# ${1}： name of the table
+function create_table() {
+	if [[ "$1" == "" ]]; then
+		echo "Empty table name!"
+		exit 1
+	fi
+	SQL_CMD "create table $1 (
+		id INTEGER PRIMARY KEY AUTOINCREMENT,
+		process TEXT,
+		cpu INT,
+		timestamp FLOAT,
+		event_name TEXT,
+		lat INT,
+		RFBM INT,
+		XINUSE INT);"
+}
+
+# Round to the nearest whole number
+# ${1}: a float number
+# Output: integer
+function round() {
+	echo "scale=0; ($1+0.5)/1" | bc
+}
+
+# Insert a record in the trace table
+#
+# process cpu timestamp event_name lat RFBM XINUSE
+# $2      $3  $4        $5         $6  $7   $8
+
+function insert_line() {
+	if [[ "$1" == "" ]]; then
+		echo "Empty table name!"
+		exit 1
+	fi
+	SQL_CMD "INSERT INTO $1 (process, cpu, timestamp, event_name, lat, RFBM, XINUSE)
+		VALUES (\"$2\", $3, $4, \"$5\", $6, $7, $8);"
+}
+
+# Show the results of the trace statistics
+function get_latency_stat() {
+	SQL_CMD "create table $table_results (
+		id INTEGER PRIMARY KEY AUTOINCREMENT,
+		event_name TEXT,
+		RFBM INT,
+		XINUSE INT,
+		lat_min INT,
+		lat_max INT,
+		lat_avg INT,
+		lat_tail_avg INT);"
+
+	for((i=0; i<$cnt; i++));do
+		event_name=`get_comb_item $table_raw $i event_name`
+		RFBM=`get_comb_item $table_raw $i RFBM`
+		XINUSE=`get_comb_item $table_raw $i XINUSE`
+		lat_min=`get_comb_item $table_raw $i min\(lat\)`
+		lat_max=`get_comb_item $table_raw $i max\(lat\)`
+		lat_avg=`get_comb_item $table_raw $i avg\(lat\)`
+		lat_tail_avg=`get_comb_item $table_tail $i avg\(lat\)`
+
+		lat_avg=`round $lat_avg`
+		lat_tail_avg=`round $lat_tail_avg`
+
+		SQL_CMD "INSERT INTO $table_results
+			(event_name, RFBM,XINUSE, lat_min, lat_max, lat_avg, lat_tail_avg)
+			VALUES (\"$event_name\", $RFBM, $XINUSE, $lat_min, $lat_max,
+			$lat_avg, $lat_tail_avg);"
+	done
+
+	SQL_CMD_HEADER "select event_name[EVENTs],printf('0x%x',RFBM)[RFBM],
+			printf('0x%x',XINUSE)[XINUSE],lat_min,lat_max,lat_avg,
+			lat_tail_avg[lat_avg(97%)]
+			from $table_results;"
+}
+
+# Get the count of the combination of event_name, RFBM, XINUSE amount all lat trace records
+function get_combs_cnt() {
+	SQL_CMD "SELECT event_name, RFBM, XINUSE from $table_raw
+		group by event_name,RFBM,XINUSE;" | wc -l
+}
+
+# Get a specified combination from a table
+# ${1}: name of table
+# ${2}: the order of the combination of event_name, RFBM, XINUSE
+# ${3}: the items which are wanted to be shown
+function get_comb_item() {
+	table=$1
+	cnt=$2
+	col=$3
+	SQL_CMD "SELECT $col from $table group by event_name,RFBM,XINUSE limit $cnt,1;"
+}
+
+# Get count of the records in a given table
+# ${1}: name of the table
+function get_rows_cnt() {
+	table=$1
+	SQL_CMD "SELECT count(*) from $table;"
+}
+
+# Generate a new table from the raw trace table removing 3% tail traces.
+function gen_tail_lat() {
+	cnt=`get_combs_cnt`
+	create_table $table_tail
+
+	for((i=0; i<$cnt; i++));do
+		create_table t$i
+		event_name=`get_comb_item $table_raw $i event_name`
+		RFBM=`get_comb_item $table_raw $i RFBM`
+		XINUSE=`get_comb_item $table_raw $i XINUSE`
+
+		SQL_CMD "insert into t$i(process,cpu,timestamp,event_name,lat,RFBM,XINUSE)
+			select process,cpu,timestamp,event_name,lat,RFBM,XINUSE
+			from $table_raw where event_name=\"$event_name\" and RFBM=$RFBM and
+			XINUSE=$XINUSE ORDER BY lat ASC;"
+
+		row=`get_rows_cnt t$i`
+		row=`echo "scale=0; ($row*0.97 + 0.5)/1" | bc`
+
+		SQL_CMD "insert into $table_tail
+			(process,cpu,timestamp,event_name,lat,RFBM,XINUSE)
+			select process,cpu,timestamp,event_name,lat,RFBM,XINUSE
+			from t$i limit 0,$row;"
+	done
+
+}
+
+if [[ ! -e "$trace_log" || $# != 1 ]];then
+	usage
+	exit 1
+fi
+
+# Check dependency
+# Make sure having following packages
+check_packages sqlite3 bc wc cut
+
+# Filter trace log keeping latency related lines only
+grep -E "$events" $trace_log > $trace_lat_log
+cnt_lines=`wc -l $trace_lat_log | cut -d' ' -f1`
+# Remove the old db file if it existed before creating
+[[ -f $db_file ]] && rm -rf $db_file
+
+create_table $table_raw
+
+# Read each line from the temp file and insert into the table
+i=0
+while IFS= read -r line;
+do
+	((i = i + 1))
+	echo -ne "(${i}/$cnt_lines) Importing trace log into database!\r"
+	if [[ "$line" =~ ${regex1}${regex2}${regex3} ]]; then
+		pname=${BASH_REMATCH[1]}
+		cpu=${BASH_REMATCH[2]}
+		ts=${BASH_REMATCH[4]}
+		ename=${BASH_REMATCH[5]}
+		lat=${BASH_REMATCH[6]}
+		((rfbm=${BASH_REMATCH[7]}))
+		((xinuse=${BASH_REMATCH[8]}))
+
+		insert_line $table_raw $pname $cpu $ts $ename $lat $rfbm $xinuse
+	fi
+done < $trace_lat_log
+
+gen_tail_lat
+get_latency_stat
+
+# Cleanup
+rm -rf $trace_lat_log $db_file
-- 
2.34.1

From nobody Sun Feb 8 06:58:50 2026
From: Yi Sun
To: dave.hansen@intel.com, mingo@kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: sohil.mehta@intel.com, ak@linux.intel.com, ilpo.jarvinen@linux.intel.com, heng.su@intel.com, tony.luck@intel.com, yi.sun@linux.intel.com, yu.c.chen@intel.com, Yi Sun
Subject: [PATCH v7 3/3] tools/testing/fpu: Add a 'count' column.
Date: Thu, 21 Sep 2023 14:29:00 +0800
Message-Id: <20230921062900.864679-4-yi.sun@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230921062900.864679-1-yi.sun@intel.com>
References: <20230921062900.864679-1-yi.sun@intel.com>
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Show the total number of records for each event-RFBM-XINUSE combination,
so that users can identify which data is noise and focus on the useful
data.

Signed-off-by: Yi Sun

diff --git a/tools/testing/fpu/xsave-latency-trace.sh b/tools/testing/fpu/xsave-latency-trace.sh
index d45563984fd6..b2f7c3d0dd65 100755
--- a/tools/testing/fpu/xsave-latency-trace.sh
+++ b/tools/testing/fpu/xsave-latency-trace.sh
@@ -99,11 +99,14 @@ function insert_line() {
 
 # Show the results of the trace statistics
 function get_latency_stat() {
+	cnt=`get_combs_cnt`
+
 	SQL_CMD "create table $table_results (
 		id INTEGER PRIMARY KEY AUTOINCREMENT,
 		event_name TEXT,
 		RFBM INT,
 		XINUSE INT,
+		CNT INT,
 		lat_min INT,
 		lat_max INT,
 		lat_avg INT,
@@ -121,14 +124,18 @@ function get_latency_stat() {
 		lat_avg=`round $lat_avg`
 		lat_tail_avg=`round $lat_tail_avg`
 
+		count=`SQL_CMD "SELECT count(*) from $table_raw
+			where event_name=\"$event_name\" and RFBM=$RFBM and
+			XINUSE=$XINUSE;"`
+
 		SQL_CMD "INSERT INTO $table_results
-			(event_name, RFBM,XINUSE, lat_min, lat_max, lat_avg, lat_tail_avg)
-			VALUES (\"$event_name\", $RFBM, $XINUSE, $lat_min, $lat_max,
+			(event_name,RFBM,XINUSE, CNT, lat_min, lat_max, lat_avg, lat_tail_avg)
+			VALUES (\"$event_name\", $RFBM, $XINUSE, $count, $lat_min, $lat_max,
 			$lat_avg, $lat_tail_avg);"
 	done
 
 	SQL_CMD_HEADER "select event_name[EVENTs],printf('0x%x',RFBM)[RFBM],
-			printf('0x%x',XINUSE)[XINUSE],lat_min,lat_max,lat_avg,
+			printf('0x%x',XINUSE)[XINUSE],CNT,lat_min,lat_max,lat_avg,
 			lat_tail_avg[lat_avg(97%)]
 			from $table_results;"
 }
-- 
2.34.1
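
The import loop in xsave-latency-trace.sh depends on the regex capture
groups lining up with the trace line fields. The sketch below shows that
mapping on a single line. It is an illustration under assumptions: `re` is
a stricter stand-in for regex1..regex3 rather than the regex the script
uses, `parse_trace_line` is a hypothetical helper, and the process, CPU,
flags and timestamp prefix of the sample line is made up (only the
"latency: RFBM: XINUSE:" tail follows the format shown in patch 1).

```shell
#!/bin/bash
# Stand-in regex (an assumption, not the script's own regex1..regex3):
# (proc) [cpu] (flags)? (timestamp): (event): ... latency:(n) RFBM:(hex) XINUSE:(hex)
re='^[[:space:]]*([^[:space:]]+)[[:space:]]+\[([0-9]+)\][[:space:]]+([^[:space:]]+[[:space:]]+)?'
re+='([0-9.]+):[[:space:]]+([^[:space:]:]+):.*latency:([0-9]+)[[:space:]]+'
re+='RFBM:(0x[0-9a-f]+)[[:space:]]+XINUSE:(0x[0-9a-f]+)$'

# Hypothetical helper: print "event latency rfbm xinuse" for one trace line.
parse_trace_line() {
    [[ "$1" =~ $re ]] || return 1
    # Group 3 is the optional flags field, so the timestamp is group 4
    # and the event name is group 5, as in the script's import loop.
    local event=${BASH_REMATCH[5]} lat=${BASH_REMATCH[6]}
    # Arithmetic expansion turns the hex bitmaps into decimal integers,
    # which is how the script stores RFBM and XINUSE in its INT columns.
    echo "$event $lat $(( ${BASH_REMATCH[7]} )) $(( ${BASH_REMATCH[8]} ))"
}
```

Calling `parse_trace_line "bash-1234 [003] d..2. 100.123456: x86_fpu_latency_xsave: x86/fpu: latency:100 RFBM:0x202e7 XINUSE:0x202"`
prints the event name, the latency, and the two bitmaps in decimal; a line
that does not match returns a non-zero status instead of inserting a
partial record.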