From: Zhao Liu
To: Michael Tokarev, Laurent Vivier, Philippe Mathieu-Daudé, Paolo Bonzini,
    Thomas Huth, Richard Henderson, Darren Kenny, Alex Bennée,
    Alexander Bulekov, qemu-devel@nongnu.org
Cc: qemu-trivial@nongnu.org, Zhenyu Wang, Zhao Liu, Yongwei Ma,
    Samuel Tardieu
Subject: [PATCH v3] scripts/checkpatch: Support codespell checking
Date: Fri, 5 Jan 2024 16:38:48 +0800
Message-Id: <20240105083848.267192-1-zhao1.liu@linux.intel.com>

From: Zhao Liu

Add two options, --codespell and --codespellfile, that enable
dictionary-based spelling checks. The implementation is copied from
checkpatch.pl in the Linux kernel.

By default, the check uses the dictionary at
"/usr/share/codespell/dictionary.txt". If no dictionary exists at that
path, checkpatch looks for the dictionary shipped with python3's
codespell package. This requires python3 to be reachable through the
$PATH environment variable and codespell to be installed, e.g. with
"pip install codespell".

Tested-by: Yongwei Ma
Tested-by: Samuel Tardieu
Signed-off-by: Zhao Liu
Tested-by: Thomas Huth
---
Changes since v2:
 * Fix the code style. (Samuel)

v2: https://lore.kernel.org/qemu-devel/20231215103448.3822284-1-zhao1.liu@linux.intel.com/

Changes since v1:
 * Drop the default dictionary "spelling.txt" and just support optional
   spelling check via --codespell and --codespellfile.
   (Thomas)

v1: https://lore.kernel.org/qemu-devel/20231204082917.2430223-1-zhao1.liu@linux.intel.com/
---
 scripts/checkpatch.pl | 125 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 105 insertions(+), 20 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 6e4100d2a41c..702689507412 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -35,6 +35,9 @@ my $summary_file = 0;
 my $root;
 my %debug;
 my $help = 0;
+my $codespell = 0;
+my $codespellfile = "/usr/share/codespell/dictionary.txt";
+my $user_codespellfile = "";
 
 sub help {
 	my ($exitcode) = @_;
@@ -66,6 +69,9 @@ Options:
                              is all off)
   --test-only=WORD           report only warnings/errors containing WORD
                              literally
+  --codespell                Use the codespell dictionary for spelling/typos
+                             (default: $codespellfile)
+  --codespellfile            Use this codespell dictionary
   --color[=WHEN]             Use colors 'always', 'never', or only when output
                              is a terminal ('auto'). Default is 'auto'.
   -h, --help, --version      display this help and exit
@@ -85,28 +91,50 @@ foreach (@ARGV) {
 }
 
 GetOptions(
-	'q|quiet+'      => \$quiet,
-	'tree!'         => \$tree,
-	'signoff!'      => \$chk_signoff,
-	'patch!'        => \$chk_patch,
-	'branch!'       => \$chk_branch,
-	'emacs!'        => \$emacs,
-	'terse!'        => \$terse,
-	'f|file!'       => \$file,
-	'strict!'       => \$no_warnings,
-	'root=s'        => \$root,
-	'summary!'      => \$summary,
-	'mailback!'     => \$mailback,
-	'summary-file!' => \$summary_file,
-
-	'debug=s'       => \%debug,
-	'test-only=s'   => \$tst_only,
-	'color=s'       => \$color,
-	'no-color'      => sub { $color = 'never'; },
-	'h|help'        => \$help,
-	'version'       => \$help
+	'q|quiet+'		=> \$quiet,
+	'tree!'			=> \$tree,
+	'signoff!'		=> \$chk_signoff,
+	'patch!'		=> \$chk_patch,
+	'branch!'		=> \$chk_branch,
+	'emacs!'		=> \$emacs,
+	'terse!'		=> \$terse,
+	'f|file!'		=> \$file,
+	'strict!'		=> \$no_warnings,
+	'root=s'		=> \$root,
+	'summary!'		=> \$summary,
+	'mailback!'		=> \$mailback,
+	'summary-file!'		=> \$summary_file,
+	'debug=s'		=> \%debug,
+	'test-only=s'		=> \$tst_only,
+	'codespell!'		=> \$codespell,
+	'codespellfile=s'	=> \$user_codespellfile,
+	'color=s'		=> \$color,
+	'no-color'		=> sub { $color = 'never'; },
+	'h|help'		=> \$help,
+	'version'		=> \$help
 ) or help(1);
 
+if ($user_codespellfile) {
+	# Use the user provided codespell file unconditionally
+	$codespellfile = $user_codespellfile;
+} elsif (!(-f $codespellfile)) {
+	# If /usr/share/codespell/dictionary.txt is not present, try to find it
+	# under codespell's install directory: /data/dictionary.txt
+	if (($codespell || $help) && which("python3") ne "") {
+		my $python_codespell_dict = << "EOF";
+
+import os.path as op
+import codespell_lib
+codespell_dir = op.dirname(codespell_lib.__file__)
+codespell_file = op.join(codespell_dir, 'data', 'dictionary.txt')
+print(codespell_file, end='')
+EOF
+
+		my $codespell_dict = `python3 -c "$python_codespell_dict" 2> /dev/null`;
+		$codespellfile = $codespell_dict if (-f $codespell_dict);
+	}
+}
+
 help(0) if ($help);
 
 my $exit = 0;
@@ -337,6 +365,36 @@ our @typeList = (
 	qr{guintptr},
 );
 
+# Load common spelling mistakes and build regular expression list.
+my $misspellings;
+my %spelling_fix;
+
+if ($codespell) {
+	if (open(my $spelling, '<', $codespellfile)) {
+		while (<$spelling>) {
+			my $line = $_;
+
+			$line =~ s/\s*\n?$//g;
+			$line =~ s/^\s*//g;
+
+			next if ($line =~ m/^\s*#/);
+			next if ($line =~ m/^\s*$/);
+			next if ($line =~ m/, disabled/i);
+
+			$line =~ s/,.*$//;
+
+			my ($suspect, $fix) = split(/->/, $line);
+
+			$spelling_fix{$suspect} = $fix;
+		}
+		close($spelling);
+	} else {
+		warn "No codespell typos will be found - file '$codespellfile': $!\n";
+	}
+}
+
+$misspellings = join("|", sort keys %spelling_fix) if keys %spelling_fix;
+
 # This can be modified by sub possible.  Since it can be empty, be careful
 # about regexes that always match, because they can cause infinite loops.
 our @modifierList = (
@@ -477,6 +535,18 @@ sub top_of_kernel_tree {
 	return 1;
 }
 
+sub which {
+	my ($bin) = @_;
+
+	foreach my $path (split(/:/, $ENV{PATH})) {
+		if (-e "$path/$bin") {
+			return "$path/$bin";
+		}
+	}
+
+	return "";
+}
+
 sub expand_tabs {
 	my ($str) = @_;
 
@@ -1585,6 +1655,21 @@ sub process {
 			WARN("8-bit UTF-8 used in possible commit log\n" . $herecurr);
 		}
 
+# Check for various typo / spelling mistakes
+		if (defined($misspellings) &&
+		    ($in_commit_log || $line =~ /^(?:\+|Subject:)/i)) {
+			while ($rawline =~ /(?:^|[^\w\-'`])($misspellings)(?:[^\w\-'`]|$)/gi) {
+				my $typo = $1;
+				my $blank = copy_spacing($rawline);
+				my $ptr = substr($blank, 0, $-[1]) . "^" x length($typo);
+				my $hereptr = "$hereline$ptr\n";
+				my $typo_fix = $spelling_fix{lc($typo)};
+				$typo_fix = ucfirst($typo_fix) if ($typo =~ /^[A-Z]/);
+				$typo_fix = uc($typo_fix) if ($typo =~ /^[A-Z]+$/);
+				WARN("'$typo' may be misspelled - perhaps '$typo_fix'?\n" . $hereptr);
+			}
+		}
+
 # ignore non-hunk lines and lines being removed
 		next if (!$hunk_line || $line =~ /^-/);
 
-- 
2.34.1
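
Note for readers: the dictionary handling in this patch can be tried
outside checkpatch.pl. The standalone Python sketch below mirrors the
Perl logic under stated assumptions and is not part of the patch. The
names load_fixes and find_typos are hypothetical. It assumes dictionary
lines of the form "typo->fix[, more fixes]" as parsed by the patch, and,
unlike the Perl code, it escapes dictionary words before building the
regular expression.

```python
import re


def load_fixes(lines):
    """Parse 'typo->fix' dictionary lines into a {typo: fix} mapping."""
    fixes = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Skip entries marked as disabled, as the patch does.
        if re.search(r", disabled", line, re.IGNORECASE):
            continue
        # Keep only the first suggestion: drop everything after the first comma.
        line = re.sub(r",.*$", "", line)
        suspect, _, fix = line.partition("->")
        fixes[suspect] = fix
    return fixes


def find_typos(text, fixes):
    """Return (typo, suggested_fix) pairs found in text."""
    if not fixes:
        return []
    # Deviation from the patch: words are escaped before joining.
    alternation = "|".join(re.escape(word) for word in sorted(fixes))
    pattern = re.compile(
        r"(?:^|[^\w\-'`])(" + alternation + r")(?:[^\w\-'`]|$)",
        re.IGNORECASE,
    )
    found = []
    for match in pattern.finditer(text):
        typo = match.group(1)
        fix = fixes[typo.lower()]
        # Mirror the capitalisation of the typo in the suggestion.
        if re.match(r"[A-Z]", typo):
            fix = fix[:1].upper() + fix[1:]
        if re.fullmatch(r"[A-Z]+", typo):
            fix = fix.upper()
        found.append((typo, fix))
    return found
```

For example, with a dictionary containing only "teh->the",
find_typos("Teh quick fix for TEH bug", fixes) reports "Teh" with the
suggestion "The" and "TEH" with the suggestion "THE". A word such as
"tehran" is not reported, because the match must be delimited on both
sides.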