From: Guido De Rossi
To: LKML
Subject: [PATCH 1/2] get_maintainer: rewrite in Python
Date: Wed, 1 Apr 2026 16:47:22 +0200
Message-ID: <20260401144723.44406-2-guido.derossi91@gmail.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260401144723.44406-1-guido.derossi91@gmail.com>
References: <20260401144723.44406-1-guido.derossi91@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Add scripts/get_maintainer.py, a full-parity Python 3.6+ rewrite of
scripts/get_maintainer.pl. This is the first step toward deprecating
Perl in the kernel scripts, following the precedent set by
scripts/kernel-doc.py.

The Python version is a drop-in replacement that produces identical
output across all major operation modes: file lookups, patch parsing,
section listing, interactive mode, and self-test validation.

Benchmark comparison (10 iterations, wall time):

  Mode                          Perl     Python
  Single file (--nogit)         7.96s    8.00s
  Default (with git-fallback)   7.93s    8.09s
  Patch (--nogit)               2.60s    2.22s
  Self-test (sections, 1x)      2.69s    1.52s

Performance optimizations over a naive port:

- Pre-parsed (type, value) tuples avoid repeated regex matching on the
  ~24,000 typevalue entries during section traversal
- Literal prefix extraction from F:/X: glob patterns enables fast
  string-based short-circuit before regex compilation, reducing regex
  compilations from ~9,500 to ~170 per invocation (98% reduction)
- Lazy regex compilation ensures only patterns that survive the prefix
  check are compiled
- Pre-compiled regex for the hot-path type:value line format
- Self-test pattern matching uses pre-compiled regexes

Key implementation details:

- All 45+ CLI options supported with Perl-compatible --foo/--nofoo
  negation via argparse
- .get_maintainer.conf and .get_maintainer.ignore file loading
- Full VCS integration (git/hg) with command templates using
  str.format() instead of Perl eval-based interpolation
- RFC822 email validation ported from the Perl regex builder
- Mailmap support with all 4 entry formats
- Interactive terminal menu with all commands
- stdlib only (no external dependencies), single self-contained file

Signed-off-by: Guido De Rossi
---
 scripts/get_maintainer.py | 2393 +++++++++++++++++++++++++++++++++++++
 1 file changed, 2393 insertions(+)
 create mode 100755 scripts/get_maintainer.py

diff --git a/scripts/get_maintainer.py b/scripts/get_maintainer.py
new file mode 100755
index 000000000000..cc1dab4d4f64
--- /dev/null
+++ b/scripts/get_maintainer.py
@@ -0,0 +1,2393 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+#
+# Print selected MAINTAINERS information for
+# the files modified in a patch or for a file
+#
+# usage: get_maintainer.py [OPTIONS]
+#        get_maintainer.py [OPTIONS] -f
+#
+# Python rewrite of get_maintainer.pl
+
+import argparse
+import os
+import re
+import shlex
+import shutil
+import subprocess
+import sys
+
+V = '0.26'
+
+# ---- constants ----
+
+penguin_chief = [
+    "Linus Torvalds:torvalds@linux-foundation.org",
+]
+
+penguin_chief_names = []
+for chief in penguin_chief:
+    m = re.match(r'^(.*?):(.*)', chief)
+    if m:
+        penguin_chief_names.append(m.group(1))
+
+penguin_chiefs = "(" + "|".join(re.escape(n) for n in penguin_chief_names) + ")"
+
+signature_tags = [
+    "Signed-off-by:",
+    "Reviewed-by:",
+    "Acked-by:",
+]
+
+signature_pattern = "(" + "|".join(re.escape(t) for t in signature_tags) + ")"
+
+rfc822_lwsp = r"(?:(?:\r\n)?[ \t])"
+rfc822_char = r'[\000-\377]'
+
+# Pre-compiled patterns for hot paths
+_re_type_value = re.compile(r'^([A-Z]):\s*(.*)')
+_re_blank_line = re.compile(r'^\s*$')
+_re_comment_line = re.compile(r'^\s*#')
+
+# ---- VCS command templates ----
+
+VCS_cmds_git = {
+    "available": "git",
+    "find_signers_cmd":
+        'git log --no-color --follow --since={email_git_since} '
+        '--numstat --no-merges '
+        '--format="GitCommit: %H%n'
+        'GitAuthor: %an <%ae>%n'
+        'GitDate: %aD%n'
+        'GitSubject: %s%n'
+        '%b%n"'
+        ' -- {file}',
+    "find_commit_signers_cmd":
+        'git log --no-color '
+        '--numstat '
+        '--format="GitCommit: %H%n'
+        'GitAuthor: %an <%ae>%n'
+        'GitDate: %aD%n'
+        'GitSubject: %s%n'
+        '%b%n"'
+        ' -1 {commit}',
+    "find_commit_author_cmd":
+        'git log --no-color '
+        '--numstat '
+        '--format="GitCommit: %H%n'
+        'GitAuthor: %an <%ae>%n'
+        'GitDate: %aD%n'
+        'GitSubject: %s%n"'
+        ' -1 {commit}',
+    "blame_range_cmd": "git blame -l -L {diff_start},+{diff_length} {file}",
+    "blame_file_cmd": "git blame -l {file}",
+    "commit_pattern": r"^GitCommit: ([0-9a-f]{40})",
+    "blame_commit_pattern": r"^([0-9a-f]+) ",
+    "author_pattern": r"^GitAuthor: (.*)",
+    "subject_pattern": r"^GitSubject: (.*)",
+    "stat_pattern": r"^(\d+)\t(\d+)\t{file}$",
+    "file_exists_cmd": "git ls-files {file}",
+    "list_files_cmd": "git ls-files {file}",
+}
+
+VCS_cmds_hg = {
+    "available": "hg",
+    "find_signers_cmd":
+        "hg log --date={email_hg_since} "
+        "--template='HgCommit: {{node}}\\n"
+        "HgAuthor: {{author}}\\n"
+        "HgSubject: {{desc}}\\n'"
+        " -- {file}",
+    "find_commit_signers_cmd":
+        "hg log "
+        "--template='HgSubject: {{desc}}\\n'"
+        " -r {commit}",
+    "find_commit_author_cmd":
+        "hg log "
+        "--template='HgCommit: {{node}}\\n"
+        "HgAuthor: {{author}}\\n"
+        "HgSubject: {{desc|firstline}}\\n'"
+        " -r {commit}",
+    "blame_range_cmd": "",
+    "blame_file_cmd": "hg blame -n {file}",
+    "commit_pattern": r"^HgCommit: ([0-9a-f]{40})",
+    "blame_commit_pattern": r"^([ 0-9a-f]+):",
+    "author_pattern": r"^HgAuthor: (.*)",
+    "subject_pattern": r"^HgSubject: (.*)",
+    "stat_pattern": r"^(\d+)\t(\d+)\t{file}$",
+    "file_exists_cmd": "hg files {file}",
+    "list_files_cmd": "hg manifest -R {file}",
+}
+
+# ---- global state ----
+
+P = ""
+cur_path = ""
+lk_path = "./"
+
+# Options (with defaults matching the Perl script)
+email = 1
+email_usename = 1
+email_maintainer = 1
+email_reviewer = 1
+email_fixes = 1
+email_list = 1
+email_moderated_list = 1
+email_subscriber_list = 0
+email_git_penguin_chiefs = 0
+email_git = 0
+email_git_all_signature_types = 0
+email_git_blame = 0
+email_git_blame_signatures = 1
+email_git_fallback = 1
+email_git_min_signatures = 1
+email_git_max_maintainers = 5
+email_git_min_percent = 5
+email_git_since = "1-year-ago"
+email_hg_since = "-365"
+interactive = 0
+email_remove_duplicates = 1
+email_use_mailmap = 1
+output_multiline = 1
+output_separator = ", "
+output_roles = 0
+output_rolestats = 1
+output_substatus = None
+output_section_maxlen = 50
+scm = 0
+tree = 1
+web = 0
+bug = 0
+subsystem_opt = 0
+status_opt = 0
+letters = ""
+keywords = 1
+keywords_in_file = 0
+sections = 0
+email_file_emails = 0
+from_filename = 0
+pattern_depth = 0
+self_test = None
+version = 0
+find_maintainer_files = 0
+maintainer_path = None
+vcs_used = 0
+
+# Mutable state
+files = []
+fixes = []
+range_list = []
+keyword_tvi = []
+file_emails = []
+
+commit_author_hash = {}
+commit_signer_hash = {}
+
+typevalue = []
+# Pre-parsed (type_char, value_string) tuples parallel to typevalue.
+# For non-type lines, type_char is None.
+typevalue_parsed = []
+# Lazily compiled regexes for F:/X: patterns, keyed by typevalue index.
+# Sentinel value _UNCOMPILED means the pattern has not been compiled yet.
+_UNCOMPILED = object()
+typevalue_compiled = {}
+# Literal prefix for each F:/X: pattern -- fast string check before regex.
+typevalue_prefix = {}
+keyword_hash = {}
+mfiles = []
+self_test_info = []
+
+mailmap_data = None
+
+email_hash_name = {}
+email_hash_address = {}
+email_to = []
+hash_list_to = {}
+list_to = []
+scm_list = []
+web_list = []
+bug_list = []
+subsystem_list = []
+status_list = []
+substatus_list = []
+deduplicate_name_hash = {}
+deduplicate_address_hash = {}
+
+ignore_emails = []
+
+VCS_cmds = {}
+
+printed_novcs = False
+
+# ---- utility functions ----
+
+def which(bin_name):
+    path = shutil.which(bin_name)
+    return path if path else ""
+
+def which_conf(conf):
+    for path_dir in [".", os.environ.get("HOME", ""), ".scripts"]:
+        p = os.path.join(path_dir, conf)
+        if os.path.exists(p):
+            return p
+    return ""
+
+def top_of_kernel_tree(lk):
+    if lk and not lk.endswith("/"):
+        lk += "/"
+    checks_f = ["COPYING", "CREDITS", "Kbuild", "Makefile", "README"]
+    checks_e = ["MAINTAINERS"]
+    checks_d = ["Documentation", "arch", "include", "drivers", "fs",
+                "init", "ipc", "kernel", "lib", "scripts"]
+    for f in checks_f:
+        if not os.path.isfile(lk + f):
+            return False
+    for f in checks_e:
+        if not os.path.exists(lk + f):
+            return False
+    for d in checks_d:
+        if not os.path.isdir(lk + d):
+            return False
+    return True
+
+def uniq(lst):
+    seen = set()
+    result = []
+    for x in lst:
+        if x not in seen:
+            seen.add(x)
+            result.append(x)
+    return result
+
+def sort_and_uniq(lst):
+    seen = set()
+    result = []
+    for x in sorted(lst):
+        if x not in seen:
+            seen.add(x)
+            result.append(x)
+    return result
+
+# ---- email functions ----
+
+def escape_name(name):
+    if re.search(r'[^\w \-]', name, re.ASCII | re.IGNORECASE):
+        name = name.replace('\\', '\\\\')
+        name = name.replace('"', '\\"')
+        name = '"' + name + '"'
+    return name
+
+def parse_email(formatted_email):
+    name = ""
+    address = ""
+    m = re.match(r'^([^<]+)<(.+@.*)>.*$', formatted_email)
+    if m:
+        name = m.group(1)
+        address = m.group(2)
+    else:
+        m = re.match(r'^\s*<(.+@\S*)>.*$', formatted_email)
+        if m:
+            address = m.group(1)
+        else:
+            m = re.match(r'^(.+@\S*).*$', formatted_email)
+            if m:
+                address = m.group(1)
+    name = name.strip()
+    name = name.strip('"')
+    name = escape_name(name)
+    address = address.strip()
+    return (name, address)
+
+def format_email(name, address, usename):
+    name = name.strip().strip('"')
+    name = escape_name(name)
+    address = address.strip()
+    if usename:
+        if name == "":
+            return address
+        else:
+            return "{} <{}>".format(name, address)
+    else:
+        return address
+
+# ---- RFC822 validation ----
+
+_rfc822re = None
+
+def make_rfc822re():
+    specials = r'()<>@,;:\\".\\[\\]'
+    controls = r'\000-\037\177'
+
+    dtext = r"[^\[\]\r\\]"
+    domain_literal = r"\[(?:" + dtext + r"|\\\\.)* \]" + rfc822_lwsp + "*"
+
+    quoted_string = r'"(?:[^"\r\\]|\\\\.|' + rfc822_lwsp + r')*"' + rfc822_lwsp + "*"
+
+    atom = r"[^" + specials + " " + controls + r"]+(?:" + rfc822_lwsp + r"+|\Z|(?=[\\[\"" + specials + r"]))"
+    word = r"(?:" + atom + r"|" + quoted_string + r")"
+    localpart = word + r"(?:\." + rfc822_lwsp + r"*" + word + r")*"
+
+    sub_domain = r"(?:" + atom + r"|" + domain_literal + r")"
+    domain = sub_domain + r"(?:\." + rfc822_lwsp + r"*" + sub_domain + r")*"
+
+    addr_spec = localpart + r"@" + rfc822_lwsp + r"*" + domain
+
+    phrase = word + r"*"
+    route = r"(?:@" + domain + r"(?:,@" + rfc822_lwsp + r"*" + domain + r")*:" + rfc822_lwsp + r"*)"
+    route_addr = r"\<" + rfc822_lwsp + r"*" + route + r"?" + addr_spec + r"\>" + rfc822_lwsp + r"*"
+    mailbox = r"(?:" + addr_spec + r"|" + phrase + route_addr + r")"
+
+    group = phrase + r":" + rfc822_lwsp + r"*(?:" + mailbox + r"(?:,\s*" + mailbox + r")*)?;\s*"
+    address = r"(?:" + mailbox + r"|" + group + r")"
+
+    return rfc822_lwsp + r"*" + address
+
+def rfc822_strip_comments(s):
+    while True:
+        new_s = re.sub(
+            r'^((?:[^"\\]|\\.)*(?:"(?:[^"\\]|\\.)*"(?:[^"\\]|\\.)*)*)\((?:[^()\\]|\\.)*\)',
+            r'\1 ', s, count=1, flags=re.DOTALL)
+        if new_s == s:
+            break
+        s = new_s
+    return s
+
+def rfc822_valid(s):
+    global _rfc822re
+    s = rfc822_strip_comments(s)
+    if _rfc822re is None:
+        _rfc822re = make_rfc822re()
+    if re.match(r'^' + _rfc822re + r'$', s) and re.match(r'^' + rfc822_char + r'*$', s):
+        return True
+    return False
+
+def rfc822_validlist(s):
+    global _rfc822re
+    s = rfc822_strip_comments(s)
+    if _rfc822re is None:
+        _rfc822re = make_rfc822re()
+    if re.match(r'^(?:' + _rfc822re + r')?(?:,(?:' + _rfc822re + r')?)*$', s) and \
+       re.match(r'^' + rfc822_char + r'*$', s):
+        result = []
+        for m in re.finditer(r'(?:^|,' + rfc822_lwsp + r'*)(' + _rfc822re + r')', s):
+            result.append(m.group(1))
+        return result
+    return []
+
+# ---- mailmap functions ----
+
+def read_mailmap():
+    global mailmap_data, email_use_mailmap, lk_path
+    mailmap_data = {"names": {}, "addresses": {}}
+
+    if not email_use_mailmap or not os.path.isfile(lk_path + ".mailmap"):
+        return
+
+    try:
+        with open(lk_path + ".mailmap", "r", encoding="utf-8") as f:
+            for line in f:
+                line = re.sub(r'#.*$', '', line)  # strip comments
+                line = line.strip()
+                if not line:
+                    continue
+
+                # name1 name2
+                m = re.match(r'^(.+)<([^>]+)>\s*(.+)\s*<([^>]+)>$', line)
+                if m:
+                    real_name = m.group(1).rstrip()
+                    real_address = m.group(2)
+                    wrong_name = m.group(3).rstrip()
+                    wrong_address = m.group(4)
+                    real_name, real_address = parse_email("{} <{}>".format(real_name, real_address))
+                    wrong_name, wrong_address = parse_email("{} <{}>".format(wrong_name, wrong_address))
+                    wrong_email = format_email(wrong_name, wrong_address, 1)
+                    mailmap_data["names"][wrong_email] = real_name
+                    mailmap_data["addresses"][wrong_email] = real_address
+                    continue
+
+                # name1
+                m = re.match(r'^(.+)<([^>]+)>\s*<([^>]+)>$', line)
+                if m:
+                    real_name = m.group(1).rstrip()
+                    real_address = m.group(2)
+                    wrong_address = m.group(3)
+                    real_name, real_address = parse_email("{} <{}>".format(real_name, real_address))
+                    mailmap_data["names"][wrong_address] = real_name
+                    mailmap_data["addresses"][wrong_address] = real_address
+                    continue
+
+                #
+                m = re.match(r'^<([^>]+)>\s*<([^>]+)>$', line)
+                if m:
+                    real_address = m.group(1)
+                    wrong_address = m.group(2)
+                    mailmap_data["addresses"][wrong_address] = real_address
+                    continue
+
+                # name1
+                m = re.match(r'^([^<]+)<([^>]+)>$', line)
+                if m:
+                    real_name = m.group(1).rstrip()
+                    address = m.group(2)
+                    real_name, address = parse_email("{} <{}>".format(real_name, address))
+                    mailmap_data["names"][address] = real_name
+                    continue
+    except IOError:
+        print("{}: Can't open .mailmap".format(P), file=sys.stderr)
+
+def mailmap_email(line):
+    name, address = parse_email(line)
+    email_str = format_email(name, address, 1)
+    real_name = name
+    real_address = address
+
+    if email_str in mailmap_data["names"] or email_str in mailmap_data["addresses"]:
+        if email_str in mailmap_data["names"]:
+            real_name = mailmap_data["names"][email_str]
+        if email_str in mailmap_data["addresses"]:
+            real_address = mailmap_data["addresses"][email_str]
+    else:
+        if address in mailmap_data["names"]:
+            real_name = mailmap_data["names"][address]
+        if address in mailmap_data["addresses"]:
+            real_address = mailmap_data["addresses"][address]
+
+    return format_email(real_name, real_address, 1)
+
+def mailmap(addresses):
+    mapped = []
+    for line in addresses:
+        mapped.append(mailmap_email(line))
+    if email_use_mailmap:
+        merge_by_realname(mapped)
+    return mapped
+
+def merge_by_realname(emails):
+    address_map = {}
+    for i in range(len(emails)):
+        name, address = parse_email(emails[i])
+        if name in address_map:
+            address = address_map[name]
+            emails[i] = format_email(name, address, 1)
+        else:
+            address_map[name] = address
+
+# ---- MAINTAINERS parsing ----
+
+def read_maintainer_file(filepath):
+    global typevalue, typevalue_parsed, typevalue_compiled
+    global keyword_hash, self_test_info, self_test
+    try:
+        with open(filepath, "r", encoding="utf-8") as f:
+            i = 1
+            for raw_line in f:
+                line = raw_line.rstrip('\n')
+
+                m = _re_type_value.match(line)
+                if m:
+                    typ = m.group(1)
+                    value = m.group(2)
+
+                    if typ in ("F", "X"):
+                        value = value.replace('.', '\\.')
+                        value = value.replace('**', '\x00')
+                        value = value.replace('*', '.*')
+                        value = value.replace('?', '.')
+                        value = value.replace('\x00', '(?:.*)')
+                        if os.path.isdir(value):
+                            if not value.endswith('/'):
+                                value += '/'
+                        # Mark for lazy compilation and extract literal prefix
+                        idx = len(typevalue)
+                        typevalue_compiled[idx] = _UNCOMPILED
+                        # Extract the literal prefix before any regex metachar
+                        prefix = []
+                        for ch in value:
+                            if ch in r'.*+?()[]{}|\\^$':
+                                break
+                            prefix.append(ch)
+                        typevalue_prefix[idx] = ''.join(prefix)
+                    elif typ == "K":
+                        keyword_hash[len(typevalue)] = value
+
+                    typevalue.append("{}:{}".format(typ, value))
+                    typevalue_parsed.append((typ, value))
+                elif not _re_blank_line.match(line) and not _re_comment_line.match(line):
+                    typevalue.append(line)
+                    typevalue_parsed.append((None, line))
+
+                if self_test is not None:
+                    self_test_info.append({"file": filepath, "linenr": i, "line": line})
+                i += 1
+    except IOError:
+        print("{}: Can't open MAINTAINERS file '{}'".format(P, filepath), file=sys.stderr)
+        sys.exit(1)
+
+def read_all_maintainer_files():
+    global mfiles, lk_path, maintainer_path, find_maintainer_files
+    path = lk_path + "MAINTAINERS"
+    if maintainer_path is not None:
+        path = maintainer_path
+        path = os.path.expanduser(path)
+
+    if os.path.isdir(path):
+        if not path.endswith('/'):
+            path += '/'
+        if find_maintainer_files:
+            for root, dirs, fnames in os.walk(path):
+                # skip .git
+                dirs[:] = [d for d in dirs if d != '.git']
+                for fname in fnames:
+                    if fname == "MAINTAINERS":
+                        mfiles.append(os.path.join(root, fname))
+        else:
+            for fname in os.listdir(path):
+                if not fname.startswith('.'):
+                    mfiles.append(path + fname)
+    elif os.path.isfile(path):
+        mfiles.append(path)
+    else:
+        print("{}: MAINTAINER file not found '{}'".format(P, path), file=sys.stderr)
+        sys.exit(1)
+
+    if len(mfiles) == 0:
+        print("{}: No MAINTAINER files found in '{}'".format(P, path), file=sys.stderr)
+        sys.exit(1)
+
+    for filepath in mfiles:
+        read_maintainer_file(filepath)
+
+def maintainers_in_file(filepath):
+    global file_emails, email_file_emails
+    if re.search(r'\bMAINTAINERS$', filepath):
+        return
+    if os.path.isfile(filepath) and (email_file_emails or filepath.endswith('.yaml')):
+        try:
+            with open(filepath, 'r', encoding='utf-8', errors='replace') as f:
+                text = f.read()
+            poss_addr = re.findall(
+                r"[\w\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]?[A-Za-z0-9_\.\+-]+@[A-Za-z0-9\.\-]+\.[A-Za-z0-9]+[\)\>\}]?",
+                text)
+            file_emails.extend(clean_file_emails(poss_addr))
+        except IOError:
+            pass
+
+# ---- section navigation ----
+
+def find_first_section():
+    index = 0
+    while index < len(typevalue_parsed):
+        if typevalue_parsed[index][0] is not None:
+            break
+        index += 1
+    return index
+
+def find_starting_index(index):
+    while index > 0:
+        if typevalue_parsed[index][0] is None:
+            break
+        index -= 1
+    return index
+
+def find_ending_index(index):
+    while index < len(typevalue_parsed):
+        if typevalue_parsed[index][0] is None:
+            break
+        index += 1
+    return index
+
+def get_subsystem_name(index):
+    start = find_starting_index(index)
+    sub = typevalue[start]
+    if output_section_maxlen and len(sub) > output_section_maxlen:
+        sub = sub[:output_section_maxlen - 3].rstrip() + "..."
+    return sub
+
+# ---- file matching ----
+
+def _get_compiled(idx):
+    """Lazily compile and cache the regex for typevalue entry at idx."""
+    compiled = typevalue_compiled.get(idx)
+    if compiled is _UNCOMPILED:
+        value = typevalue_parsed[idx][1]
+        try:
+            compiled = re.compile(r'^' + value)
+        except re.error:
+            compiled = None
+        typevalue_compiled[idx] = compiled
+    return compiled
+
+def file_match_pattern(filepath, pattern, compiled=None):
+    if compiled is None:
+        try:
+            compiled = re.compile(r'^' + pattern)
+        except re.error:
+            return False
+    if pattern.endswith("/"):
+        if compiled.search(filepath):
+            return True
+    else:
+        if compiled.search(filepath):
+            s1 = filepath.count('/')
+            s2 = pattern.count('/')
+            if s1 == s2 or '(?:' in pattern:
+                return True
+    return False
+
+# ---- category/role functions ----
+
+def get_maintainer_role(index):
+    start = find_starting_index(index)
+    end = find_ending_index(index)
+    role = "maintainer"
+    sub = get_subsystem_name(index)
+    sts = "unknown"
+
+    for i in range(start + 1, end):
+        typ, value = typevalue_parsed[i]
+        if typ == "S":
+            sts = value
+
+    sts = sts.lower()
+    if sts == "buried alive in reporters":
+        role = "chief penguin"
+
+    return role + ":" + sub
+
+def get_list_role(index):
+    sub = get_subsystem_name(index)
+    if sub == "THE REST":
+        sub = ""
+    return sub
+
+def add_categories(index, suffix):
+    global email_to, hash_list_to, list_to, scm_list, web_list, bug_list
+    global subsystem_list, status_list, substatus_list
+
+    start = find_starting_index(index)
+    end = find_ending_index(index)
+
+    sub = typevalue[start]
+    subsystem_list.append(sub)
+    sts = "Unknown"
+
+    for i in range(start + 1, end):
+        ptype, pvalue = typevalue_parsed[i]
+        if ptype is not None:
+            if ptype == "L":
+                list_address = pvalue
+                list_additional = ""
+                list_role = get_list_role(i)
+
+                if list_role:
+                    list_role = ":" + list_role
+
+                m2 = re.match(r'^(\S+)\s+(.*)$', list_address)
+                if m2:
+                    list_address = m2.group(1)
+                    list_additional = m2.group(2)
+
+                if re.search(r'subscribers-only', list_additional):
+                    if email_subscriber_list:
+                        if list_address.lower() not in hash_list_to:
+                            hash_list_to[list_address.lower()] = 1
+                            list_to.append([list_address,
+                                            "subscriber list{}{}".format(list_role, suffix)])
+                else:
+                    if email_list:
+                        if list_address.lower() not in hash_list_to:
+                            if re.search(r'moderated', list_additional):
+                                if email_moderated_list:
+                                    hash_list_to[list_address.lower()] = 1
+                                    list_to.append([list_address,
+                                                    "moderated list{}{}".format(list_role, suffix)])
+                            else:
+                                hash_list_to[list_address.lower()] = 1
+                                list_to.append([list_address,
+                                                "open list{}{}".format(list_role, suffix)])
+
+            elif ptype == "M":
+                if email_maintainer:
+                    role = get_maintainer_role(i)
+                    push_email_addresses(pvalue, role + suffix)
+            elif ptype == "R":
+                if email_reviewer:
+                    subs = get_subsystem_name(i)
+                    push_email_addresses(pvalue, "reviewer:" + subs + suffix)
+            elif ptype == "T":
+                scm_list.append(pvalue + suffix)
+            elif ptype == "W":
+                web_list.append(pvalue + suffix)
+            elif ptype == "B":
+                bug_list.append(pvalue + suffix)
+            elif ptype == "S":
+                status_list.append(pvalue + suffix)
+                sts = pvalue
+
+    if sub != "THE REST" and sts != "Maintained":
+        substatus_list.append("{} status: {}{}".format(sub, sts, suffix))
+
+# ---- email management ----
+
+def email_inuse(name, address):
+    if name == "" and address == "":
+        return True
+    if name != "" and name.lower() in email_hash_name:
+        return True
+    if address != "" and address.lower() in email_hash_address:
+        return True
+    return False
+
+def push_email_address(line, role):
+    global email_to, email_hash_name, email_hash_address
+    name, address = parse_email(line)
+    if address == "":
+        return False
+
+    if not email_remove_duplicates:
+        email_to.append([format_email(name, address, email_usename), role])
+    elif not email_inuse(name, address):
+        email_to.append([format_email(name, address, email_usename), role])
+        if name != "":
+            email_hash_name[name.lower()] = email_hash_name.get(name.lower(), 0) + 1
+        email_hash_address[address.lower()] = email_hash_address.get(address.lower(), 0) + 1
+
+    return True
+
+def push_email_addresses(address, role):
+    if rfc822_valid(address):
+        push_email_address(address, role)
+    else:
+        addr_list = rfc822_validlist(address)
+        if addr_list:
+            for entry in addr_list:
+                push_email_address(entry, role)
+        else:
+            if not push_email_address(address, role):
+                print("Invalid MAINTAINERS address: '{}'".format(address), file=sys.stderr)
+
+def add_role(line, role):
+    global email_to
+    name, address = parse_email(line)
+    email_str = format_email(name, address, email_usename)
+
+    for entry in email_to:
+        if email_remove_duplicates:
+            entry_name, entry_address = parse_email(entry[0])
+            if (name == entry_name or address == entry_address) and \
+               (role == "" or role not in entry[1]):
+                if entry[1] == "":
+                    entry[1] = role
+                else:
+                    entry[1] = "{},{}".format(entry[1], role)
+        else:
+            if email_str == entry[0] and \
+               (role == "" or role not in entry[1]):
+                if entry[1] == "":
+                    entry[1] = role
+                else:
+                    entry[1] = "{},{}".format(entry[1], role)
+
+def deduplicate_email(email_addr):
+    global deduplicate_name_hash, deduplicate_address_hash
+    name, address = parse_email(email_addr)
+    email_addr = format_email(name, address, 1)
+    email_addr = mailmap_email(email_addr)
+
+    if not email_remove_duplicates:
+        return email_addr
+
+    name, address = parse_email(email_addr)
+    matched = False
+
+    if name != "" and name.lower() in deduplicate_name_hash:
+        name = deduplicate_name_hash[name.lower()][0]
+        address = deduplicate_name_hash[name.lower()][1]
+        matched = True
+    elif address.lower() in deduplicate_address_hash:
+        name = deduplicate_address_hash[address.lower()][0]
+        address = deduplicate_address_hash[address.lower()][1]
+        matched = True
+
+    if not matched:
+        deduplicate_name_hash[name.lower()] = [name, address]
+        deduplicate_address_hash[address.lower()] = [name, address]
+
+    email_addr = format_email(name, address, 1)
+    email_addr = mailmap_email(email_addr)
+    return email_addr
+
+def ignore_email_address(address):
+    for ig in ignore_emails:
+        if ig == address:
+            return True
+    return False
+
+def clean_file_emails(raw_emails):
+    fmt_emails = []
+    for em in raw_emails:
+        em = re.sub(
+            r'[\(\<\{]?([A-Za-z0-9_\.\+-]+@[A-Za-z0-9\.\-]+)[\)\>\}]?',
+            r'<\1>', em)
+        name, address = parse_email(em)
+
+        # Strip quotes
+        if name.startswith('"') and name.endswith('"'):
+            name = name[1:-1]
+
+        # Split into name-like parts
+        nw = re.split(r"[^\w\'\,\.\+\-]", name)
+        nw = [w for w in nw if w and not re.match(r"^[\'\,\.\+\-]$", w)]
+
+        if len(nw) > 2:
+            first = nw[-3]
+            middle = nw[-2]
+            last = nw[-1]
+            if ((len(first) == 1 and re.match(r'\w', first)) or
+                (len(first) == 2 and first.endswith("."))) or \
+               (len(middle) == 1 or
+                (len(middle) == 2 and middle.endswith("."))):
+                name = "{} {} {}".format(first, middle, last)
+            else:
+                name = "{} {}".format(middle, last)
+        else:
+            name = " ".join(nw)
+
+        if name and name[-1] in (',', '.'):
+            name = name[:-1]
+        if name and name[0] in (',', '.'):
+            name = name[1:]
+
+        fmt_emails.append(format_email(name, address, email_usename))
+    return fmt_emails
+
+# ---- VCS integration ----
+
+def git_execute_cmd(cmd):
+    try:
+        output = subprocess.run(cmd, shell=True, capture_output=True,
+                                text=True, encoding='utf-8', errors='replace')
+        lines = output.stdout.lstrip().split('\n')
+        # strip leading whitespace from each line (matching Perl)
+        lines = [l.lstrip() for l in lines]
+        return lines
+    except Exception:
+        return []
+
+def hg_execute_cmd(cmd):
+    try:
+        output = subprocess.run(cmd, shell=True, capture_output=True,
+                                text=True, encoding='utf-8', errors='replace')
+        return output.stdout.split('\n')
+    except Exception:
+        return []
+
+def vcs_exists():
+    global VCS_cmds, vcs_used, printed_novcs
+    # Try git
+    if which("git") and os.path.exists(".git"):
+        VCS_cmds = VCS_cmds_git
+        vcs_used = 1
+        return 1
+    # Try hg
+    if which("hg") and os.path.isdir(".hg"):
+        VCS_cmds = VCS_cmds_hg
+        vcs_used = 2
+        return 2
+    VCS_cmds = {}
+    if not printed_novcs and email_git:
+        print("{}: No supported VCS found. Add --nogit to options?".format(P), file=sys.stderr)
+        print("Using a git repository produces better results.", file=sys.stderr)
+        print("Try Linus Torvalds' latest git repository using:", file=sys.stderr)
+        print("git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git", file=sys.stderr)
+        printed_novcs = True
+    return 0
+
+def vcs_is_git():
+    vcs_exists()
+    return vcs_used == 1
+
+def vcs_is_hg():
+    return vcs_used == 2
+
+def _vcs_exec(cmd):
+    if vcs_used == 1:
+        return git_execute_cmd(cmd)
+    elif vcs_used == 2:
+        return hg_execute_cmd(cmd)
+    return []
+
+def _interpolate_cmd(cmd_template, **kwargs):
+    """Substitute variables in a VCS command template."""
+    result = cmd_template
+    for key, val in kwargs.items():
+        result = result.replace('{' + key + '}', str(val))
+    return result
+
+def extract_formatted_signatures(signature_lines):
+    types = []
+    signers = []
+    for line in signature_lines:
+        # extract type (everything before first ':')
+        m = re.match(r'\s*(.*?):.*', line)
+        typ = m.group(1) if m else ""
+        types.append(typ)
+
+        # extract signer (everything after first ':')
+        m = re.match(r'\s*.*?:\s*(.+)\s*', line)
+        signer = m.group(1).strip() if m else ""
+        signer = deduplicate_email(signer)
+        signers.append(signer)
+
+    return (types, signers)
+
+def vcs_find_signers(cmd, filepath):
+    global signature_pattern
+    lines = _vcs_exec(cmd)
+
+    pattern = VCS_cmds["commit_pattern"]
+    author_pattern = VCS_cmds["author_pattern"]
+    stat_pattern = VCS_cmds["stat_pattern"]
+
+    if filepath:
+        stat_pattern = stat_pattern.replace('{file}', re.escape(filepath))
+    else:
+        stat_pattern = stat_pattern.replace('{file}', '')
+
+    commits = sum(1 for l in lines if re.search(pattern, l))
+    authors = [l for l in lines if re.search(author_pattern, l)]
+    signatures = [l for l in lines if re.search(
+        r'^[ \t]*' + signature_pattern + r'.*@.*$', l)]
+    stats = [l for l in lines if re.search(stat_pattern, l)]
+
+    if not signatures:
+        return (0, signatures, authors, stats)
+
+    if interactive:
+        save_commits_by_author(lines)
+        save_commits_by_signer(lines)
+
+    if not email_git_penguin_chiefs:
+        signatures = [s for s in signatures if not re.search(penguin_chiefs, s, re.IGNORECASE)]
+
+    _, authors_list = extract_formatted_signatures(authors)
+    _, signers_list = extract_formatted_signatures(signatures)
+
+    return (commits, signers_list, authors_list, stats)
+
+def vcs_find_author(cmd):
+    lines = _vcs_exec(cmd)
+
+    if not email_git_penguin_chiefs:
+        lines = [l for l in lines if not re.search(penguin_chiefs, l, re.IGNORECASE)]
+
+    if not lines:
+        return []
+
+    authors = []
+    author_pattern = VCS_cmds["author_pattern"]
+    for line in lines:
+        m = re.search(author_pattern, line)
+        if m:
+            author = m.group(1)
+            name, address = parse_email(author)
+            author = format_email(name, address, 1)
+            authors.append(author)
+
+    if interactive:
+        save_commits_by_author(lines)
+        save_commits_by_signer(lines)
+
+    return authors
+
+def vcs_save_commits(cmd):
+    lines = _vcs_exec(cmd)
+    commits = []
+    blame_pattern = VCS_cmds["blame_commit_pattern"]
+    for line in lines:
+        m = re.search(blame_pattern, line)
+        if m:
+            commits.append(m.group(1))
+    return commits
+
+def vcs_blame(filepath):
+    global range_list
+    commits = []
+    if not os.path.isfile(filepath):
+        return commits
+
+    if range_list and VCS_cmds.get("blame_range_cmd", "") == "":
+        cmd = _interpolate_cmd(VCS_cmds["blame_file_cmd"], file=shlex.quote(filepath))
+        all_commits = vcs_save_commits(cmd)
+
+        for file_range_diff in range_list:
+            m = re.match(r'(.+):(.+):(.+)', file_range_diff)
+            if not m:
+                continue
+            diff_file = m.group(1)
+            diff_start = int(m.group(2))
+            diff_length = int(m.group(3))
+            if filepath != diff_file:
+                continue
+            for i in range(diff_start, diff_start + diff_length):
+                if i < len(all_commits):
+                    commits.append(all_commits[i])
+    elif range_list:
+        for file_range_diff in range_list:
+            m = re.match(r'(.+):(.+):(.+)', file_range_diff)
+            if not m:
+                continue
+            diff_file = m.group(1)
+            diff_start = m.group(2)
+            diff_length = m.group(3)
+            if filepath != diff_file:
+                continue
+            cmd = _interpolate_cmd(VCS_cmds["blame_range_cmd"],
+                                   file=shlex.quote(filepath),
+                                   diff_start=diff_start,
+                                   diff_length=diff_length)
+            commits.extend(vcs_save_commits(cmd))
+    else:
+        cmd = _interpolate_cmd(VCS_cmds["blame_file_cmd"], file=shlex.quote(filepath))
+        commits = vcs_save_commits(cmd)
+
+    commits = [c.lstrip('^') for c in commits]
+    return commits
+
+def vcs_file_exists(filepath):
+    v = vcs_exists()
+    if not v:
+        return False
+    cmd = _interpolate_cmd(VCS_cmds["file_exists_cmd"], file=shlex.quote(filepath))
+    cmd += " 2>&1"
+    result = _vcs_exec(cmd)
+    # Check if any non-empty output
+    return any(line.strip() for line in result) if result else False
+
+def vcs_list_files(filepath):
+    v = vcs_exists()
+    if not v:
+        return []
+    cmd = _interpolate_cmd(VCS_cmds["list_files_cmd"], file=shlex.quote(filepath))
+    return _vcs_exec(cmd)
+
+def vcs_assign(role,
divisor, lines): + if not lines: + return + + if divisor <=3D 0: + print("Bad divisor in vcs_assign: {}".format(divisor), file=3Dsys.= stderr) + divisor =3D 1 + + lines =3D mailmap(lines) + + if not lines: + return + + lines.sort() + + # uniq -c + counts =3D {} + for l in lines: + counts[l] =3D counts.get(l, 0) + 1 + + count =3D 0 + for line in sorted(counts.keys(), key=3Dlambda x: counts[x], reverse= =3DTrue): + sign_offs =3D counts[line] + percent =3D sign_offs * 100 / divisor + if percent > 100: + percent =3D 100 + if ignore_email_address(line): + continue + count +=3D 1 + if sign_offs < email_git_min_signatures or \ + count > email_git_max_maintainers or \ + percent < email_git_min_percent: + break + push_email_address(line, '') + if output_rolestats: + fmt_percent =3D "{:.0f}".format(percent) + add_role(line, "{}:{}/{}=3D{}%".format(role, sign_offs, diviso= r, fmt_percent)) + else: + add_role(line, role) + +def vcs_file_signoffs(filepath): + global vcs_used + + vcs_used =3D vcs_exists() + if not vcs_used: + return + + cmd =3D _interpolate_cmd(VCS_cmds["find_signers_cmd"], + email_git_since=3Demail_git_since, + email_hg_since=3Demail_hg_since, + file=3Dshlex.quote(filepath)) + + commits, signers, authors, stats =3D vcs_find_signers(cmd, filepath) + + for i in range(len(signers)): + signers[i] =3D deduplicate_email(signers[i]) + + vcs_assign("commit_signer", commits, signers) + vcs_assign("authored", commits, authors) + + if len(authors) =3D=3D len(stats): + stat_pattern =3D VCS_cmds["stat_pattern"] + stat_pattern =3D stat_pattern.replace('{file}', re.escape(filepath= )) + + added =3D 0 + deleted =3D 0 + for i in range(len(stats)): + m =3D re.search(stat_pattern, stats[i]) + if m: + added +=3D int(m.group(1)) + deleted +=3D int(m.group(2)) + + tmp_authors =3D uniq(authors) + tmp_authors =3D [deduplicate_email(a) for a in tmp_authors] + tmp_authors =3D uniq(tmp_authors) + + list_added =3D [] + list_deleted =3D [] + for author in tmp_authors: + auth_added =3D 0 + 
auth_deleted =3D 0 + for i in range(len(stats)): + if author =3D=3D deduplicate_email(authors[i]): + m =3D re.search(stat_pattern, stats[i]) + if m: + auth_added +=3D int(m.group(1)) + auth_deleted +=3D int(m.group(2)) + list_added.extend([author] * auth_added) + list_deleted.extend([author] * auth_deleted) + + vcs_assign("added_lines", added, list_added) + vcs_assign("removed_lines", deleted, list_deleted) + +def vcs_file_blame(filepath): + global vcs_used + vcs_used =3D vcs_exists() + if not vcs_used: + return + + all_commits =3D vcs_blame(filepath) + commits =3D uniq(all_commits) + total_commits =3D len(commits) + total_lines =3D len(all_commits) + + signers =3D [] + + if email_git_blame_signatures: + if vcs_is_hg(): + commit =3D " -r ".join(commits) + cmd =3D _interpolate_cmd(VCS_cmds["find_commit_signers_cmd"], = commit=3Dcommit) + _, commit_signers, _, _ =3D vcs_find_signers(cmd, filepath) + signers.extend(commit_signers) + else: + for commit in commits: + cmd =3D _interpolate_cmd(VCS_cmds["find_commit_signers_cmd= "], commit=3Dcommit) + _, commit_signers, _, _ =3D vcs_find_signers(cmd, filepath) + signers.extend(commit_signers) + + if from_filename: + if output_rolestats: + blame_signers =3D [] + if vcs_is_hg(): + u_commits =3D sorted(uniq(commits)) + commit =3D " -r ".join(u_commits) + cmd =3D _interpolate_cmd(VCS_cmds["find_commit_author_cmd"= ], commit=3Dcommit) + lines =3D _vcs_exec(cmd) + if not email_git_penguin_chiefs: + lines =3D [l for l in lines if not re.search(penguin_c= hiefs, l, re.IGNORECASE)] + if lines: + author_pattern =3D VCS_cmds["author_pattern"] + for line in lines: + m =3D re.search(author_pattern, line) + if m: + author =3D deduplicate_email(m.group(1)) + signers.append(author) + if interactive: + save_commits_by_author(lines) + save_commits_by_signer(lines) + else: + for commit in commits: + cmd =3D _interpolate_cmd(VCS_cmds["find_commit_author_= cmd"], commit=3Dcommit) + author_list =3D vcs_find_author(cmd) + if not author_list: + 
continue + formatted_author =3D deduplicate_email(author_list[0]) + cnt =3D sum(1 for c in all_commits if commit in c) + blame_signers.extend([formatted_author] * cnt) + + if blame_signers: + vcs_assign("authored lines", total_lines, blame_signers) + + signers =3D [deduplicate_email(s) for s in signers] + vcs_assign("commits", total_commits, signers) + else: + signers =3D [deduplicate_email(s) for s in signers] + vcs_assign("modified commits", total_commits, signers) + +def vcs_add_commit_signers(commit, desc): + if not vcs_exists(): + return + + cmd =3D _interpolate_cmd(VCS_cmds["find_commit_signers_cmd"], commit= =3Dcommit) + commit_count, commit_signers, commit_authors, stats =3D vcs_find_signe= rs(cmd, "") + + for i in range(len(commit_signers)): + commit_signers[i] =3D deduplicate_email(commit_signers[i]) + + vcs_assign(desc, 1, commit_signers) + +# ---- interactive commit tracking ---- + +def save_commits_by_author(lines): + global commit_author_hash + authors =3D [] + commits =3D [] + subjects =3D [] + author_pattern =3D VCS_cmds["author_pattern"] + commit_pattern =3D VCS_cmds["commit_pattern"] + subject_pattern =3D VCS_cmds["subject_pattern"] + + for line in lines: + m =3D re.search(author_pattern, line) + if m: + author =3D deduplicate_email(m.group(1)) + authors.append(author) + m =3D re.search(commit_pattern, line) + if m: + commits.append(m.group(1)) + m =3D re.search(subject_pattern, line) + if m: + subjects.append(m.group(1)) + + for i in range(len(authors)): + if i >=3D len(commits) or i >=3D len(subjects): + break + if authors[i] not in commit_author_hash: + commit_author_hash[authors[i]] =3D [] + exists =3D False + for ref in commit_author_hash[authors[i]]: + if ref[0] =3D=3D commits[i] and ref[1] =3D=3D subjects[i]: + exists =3D True + break + if not exists: + commit_author_hash[authors[i]].append([commits[i], subjects[i]= ]) + +def save_commits_by_signer(lines): + global commit_signer_hash, signature_pattern + commit =3D "" + subject =3D "" + 
commit_pattern =3D VCS_cmds["commit_pattern"] + subject_pattern =3D VCS_cmds["subject_pattern"] + + for line in lines: + m =3D re.search(commit_pattern, line) + if m: + commit =3D m.group(1) + m =3D re.search(subject_pattern, line) + if m: + subject =3D m.group(1) + if re.search(r'^[ \t]*' + signature_pattern + r'.*@.*$', line): + sig_types, sig_signers =3D extract_formatted_signatures([line]) + if sig_signers: + typ =3D sig_types[0] + signer =3D deduplicate_email(sig_signers[0]) + if signer not in commit_signer_hash: + commit_signer_hash[signer] =3D [] + exists =3D False + for ref in commit_signer_hash[signer]: + if ref[0] =3D=3D commit and ref[1] =3D=3D subject and = ref[2] =3D=3D typ: + exists =3D True + break + if not exists: + commit_signer_hash[signer].append([commit, subject, ty= p]) + +# ---- range checks ---- + +def range_is_maintained(start, end): + for i in range(start, end): + typ, value =3D typevalue_parsed[i] + if typ =3D=3D 'S': + if re.search(r'maintain|support', value, re.IGNORECASE): + return True + return False + +def range_has_maintainer(start, end): + for i in range(start, end): + typ, value =3D typevalue_parsed[i] + if typ =3D=3D 'M': + return True + return False + +# ---- core orchestration ---- + +def get_maintainers(): + global email_hash_name, email_hash_address, commit_author_hash, commit= _signer_hash + global email_to, hash_list_to, list_to, scm_list, web_list, bug_list + global subsystem_list, status_list, substatus_list + global deduplicate_name_hash, deduplicate_address_hash + global signature_pattern, keyword_tvi, file_emails + + email_hash_name =3D {} + email_hash_address =3D {} + commit_author_hash =3D {} + commit_signer_hash =3D {} + email_to =3D [] + hash_list_to =3D {} + list_to =3D [] + scm_list =3D [] + web_list =3D [] + bug_list =3D [] + subsystem_list =3D [] + status_list =3D [] + substatus_list =3D [] + deduplicate_name_hash =3D {} + deduplicate_address_hash =3D {} + + if email_git_all_signature_types: + signature_pattern 
=3D r"(.+?)[Bb][Yy]:" + else: + signature_pattern =3D "(" + "|".join(re.escape(t) for t in signatu= re_tags) + ")" + + exact_pattern_match_hash =3D {} + + for filepath in files: + hash_map =3D {} + tvi =3D find_first_section() + while tvi < len(typevalue): + start =3D find_starting_index(tvi) + end =3D find_ending_index(tvi) + exclude =3D False + + # Check excluded patterns + for i in range(start, end): + typ, value =3D typevalue_parsed[i] + if typ =3D=3D 'X': + prefix =3D typevalue_prefix.get(i, '') + if prefix and not filepath.startswith(prefix): + continue + compiled =3D _get_compiled(i) + if compiled is not None and file_match_pattern(filepat= h, value, compiled): + exclude =3D True + break + + if not exclude: + for i in range(start, end): + typ, value =3D typevalue_parsed[i] + if typ is None: + continue + if typ =3D=3D 'F': + prefix =3D typevalue_prefix.get(i, '') + if prefix and not filepath.startswith(prefix): + continue + compiled =3D _get_compiled(i) + if compiled is not None and file_match_pattern(fil= epath, value, compiled): + value_pd =3D value.count('/') + file_pd =3D filepath.count('/') + if not value.endswith('/'): + value_pd +=3D 1 + if re.match(r'^(\.\*|\(\?:\.\*\))', value): + value_pd =3D -1 + if value_pd >=3D file_pd and \ + range_is_maintained(start, end) and \ + range_has_maintainer(start, end): + exact_pattern_match_hash[filepath] =3D 1 + if pattern_depth =3D=3D 0 or \ + (file_pd - value_pd) < pattern_depth: + hash_map[tvi] =3D value_pd + elif typ =3D=3D 'N': + try: + if re.search(value, filepath, re.VERBOSE): + hash_map[tvi] =3D 0 + except re.error: + pass + + tvi =3D end + 1 + + for line_idx in sorted(hash_map.keys(), key=3Dlambda x: hash_map[x= ], reverse=3DTrue): + add_categories(line_idx, "") + if sections: + start =3D find_starting_index(line_idx) + end =3D find_ending_index(line_idx) + for i in range(start, end): + line =3D typevalue[i] + if re.match(r'^[FX]:', line): + # Restore file patterns + line =3D re.sub(r'([^\\])\.([^\*])', 
r'\1?\2', lin= e) + line =3D re.sub(r'([^\\])\.$', r'\1?', line) + line =3D line.replace('\\.', '.') + line =3D line.replace('(?:.*)', '**') + line =3D line.replace('.*', '*') + m2 =3D re.match(r'^([A-Z]):(.*)', line) + if m2: + line =3D "{}:\t{}".format(m2.group(1), m2.group(2)) + if letters =3D=3D "" or re.search(m2.group(1), let= ters, re.IGNORECASE): + print(line) + else: + # Section header lines are always printed + print(line) + print() + + maintainers_in_file(filepath) + + if keywords: + kw_tvi =3D sort_and_uniq(keyword_tvi) + for line_idx in kw_tvi: + if line_idx in keyword_hash: + add_categories(line_idx, ":Keyword:" + keyword_hash[line_i= dx]) + + for em in email_to + list_to: + em[0] =3D deduplicate_email(em[0]) + + for filepath in files: + if email and \ + (email_git or + (email_git_fallback and + not filepath.endswith('MAINTAINERS') and + filepath not in exact_pattern_match_hash)): + vcs_file_signoffs(filepath) + if email and email_git_blame: + vcs_file_blame(filepath) + + if email: + for chief in penguin_chief: + m =3D re.match(r'^(.*?):(.*)', chief) + if m: + email_address =3D format_email(m.group(1), m.group(2), ema= il_usename) + if email_git_penguin_chiefs: + email_to.append([email_address, 'chief penguin']) + else: + email_to[:] =3D [e for e in email_to + if not re.search(re.escape(email_addres= s), e[0])] + + for em in file_emails: + em =3D mailmap_email(em) + name, address =3D parse_email(em) + tmp_email =3D format_email(name, address, email_usename) + push_email_address(tmp_email, '') + add_role(tmp_email, 'in file') + + for fix in fixes: + vcs_add_commit_signers(fix, "blamed_fixes") + + to =3D [] + if email or email_list: + if email: + to.extend(email_to) + if email_list: + to.extend(list_to) + + if interactive: + to =3D interactive_get_maintainers(to) + + return to + +# ---- output functions ---- + +def merge_email(entries): + lines =3D [] + saw =3D {} + for entry in entries: + address =3D entry[0] + role =3D entry[1] + if address not in 
saw: + if output_roles: + lines.append("{} ({})".format(address, role)) + else: + lines.append(address) + saw[address] =3D 1 + return lines + +def output(parms): + if output_multiline: + for line in parms: + print(line) + else: + print(output_separator.join(parms)) + +# ---- interactive mode ---- + +def interactive_get_maintainers(to_list): + global interactive, output_rolestats, output_roles, output_substatus + global email_git, email_git_fallback, email_git_blame + global email_git_blame_signatures, email_git_min_signatures + global email_git_max_maintainers, email_git_min_percent + global email_git_since, email_hg_since, email_git_all_signature_types + global email_file_emails, email_remove_duplicates, email_use_mailmap + global keywords, pattern_depth + + vcs_exists() + + selected =3D {} + authored =3D {} + signed =3D {} + count =3D 0 + maintained =3D False + + for entry in to_list: + if re.match(r'^(maintainer|supporter)', entry[1], re.IGNORECASE): + maintained =3D True + selected[count] =3D True + authored[count] =3D False + signed[count] =3D False + count +=3D 1 + + done =3D False + print_options =3D False + redraw =3D True + + while not done: + count =3D len(to_list) + if redraw: + sys.stderr.write("\n{:1s} {:2s} {:65s}".format("*", "#", "emai= l/list and role:stats")) + if email_git or (email_git_fallback and not maintained) or ema= il_git_blame: + sys.stderr.write("auth sign") + sys.stderr.write("\n") + for idx, entry in enumerate(to_list): + em =3D entry[0] + role =3D entry[1] + sel =3D "*" if selected.get(idx, False) else "" + ca =3D commit_author_hash.get(em, []) + cs =3D commit_signer_hash.get(em, []) + auth_count =3D len(ca) + sign_count =3D len(cs) + sys.stderr.write("{:1s} {:2d} {:65s}".format(sel, idx + 1,= em)) + if auth_count > 0 or sign_count > 0: + sys.stderr.write("{:4d} {:4d}".format(auth_count, sign= _count)) + sys.stderr.write("\n {}\n".format(role)) + if authored.get(idx, False): + for ref in ca: + sys.stderr.write(" Author: 
{}\n".format(ref[1]= )) + if signed.get(idx, False): + for ref in cs: + sys.stderr.write(" {}: {}\n".format(ref[2], re= f[1])) + + if print_options: + print_options =3D False + if vcs_exists(): + date_ref =3D email_hg_since if vcs_is_hg() else email_git_= since + sys.stderr.write(""" +Version Control options: +g use git history [{}] +gf use git-fallback [{}] +b use git blame [{}] +bs use blame signatures [{}] +c# minimum commits [{}] +%# min percent [{}] +d# history to use [{}] +x# max maintainers [{}] +t all signature types [{}] +m use .mailmap [{}] +""".format(email_git, email_git_fallback, email_git_blame, + email_git_blame_signatures, email_git_min_signatures, + email_git_min_percent, date_ref, + email_git_max_maintainers, email_git_all_signature_types, + email_use_mailmap)) + + sys.stderr.write(""" +Additional options: +0 toggle all +tm toggle maintainers +tg toggle git entries +tl toggle open list entries +ts toggle subscriber list entries +f emails in file [{}] +k keywords in file [{}] +r remove duplicates [{}] +p# pattern match depth [{}] +""".format(email_file_emails, keywords, email_remove_duplicates, pattern_d= epth)) + + sys.stderr.write( + "\n#(toggle), A#(author), S#(signed) *(all), ^(none), O(option= s), Y(approve): ") + + try: + inp =3D input() + except EOFError: + break + + redraw =3D True + rerun =3D False + wishes =3D re.split(r'[, ]+', inp) + + for nr in wishes: + nr =3D nr.lower().strip() + if not nr: + continue + sel =3D nr[0] + rest =3D nr[1:] + val =3D 0 + m =3D re.match(r'^(\d+)$', rest) + if m: + val =3D int(m.group(1)) + + if sel =3D=3D "y": + interactive =3D 0 + done =3D True + output_rolestats =3D 0 + output_roles =3D 0 + output_substatus =3D 0 + break + elif re.match(r'^\d+$', nr) and 0 < int(nr) <=3D count: + # "0" and out-of-range numbers fall through + num =3D int(nr) + selected[num - 1] =3D not selected.get(num - 1, False) + elif sel in ('*', '^'): + toggle =3D (sel =3D=3D '*') + for i in range(count): + selected[i] =3D toggle + elif sel =3D=3D '0': + for i in range(count): +
selected[i] =3D not selected.get(i, False) + elif sel =3D=3D 't': + if rest =3D=3D 'm': + for i in range(count): + if re.match(r'^(maintainer|supporter)', to_list[i]= [1], re.IGNORECASE): + selected[i] =3D not selected.get(i, False) + elif rest =3D=3D 'g': + for i in range(count): + if re.match(r'^(author|commit|signer)', to_list[i]= [1], re.IGNORECASE): + selected[i] =3D not selected.get(i, False) + elif rest =3D=3D 'l': + for i in range(count): + if re.match(r'^(open list)', to_list[i][1], re.IGN= ORECASE): + selected[i] =3D not selected.get(i, False) + elif rest =3D=3D 's': + for i in range(count): + if re.match(r'^(subscriber list)', to_list[i][1], = re.IGNORECASE): + selected[i] =3D not selected.get(i, False) + else: + # 't' alone toggles all signature types + email_git_all_signature_types =3D not email_git_all_si= gnature_types + rerun =3D True + elif sel =3D=3D 'a': + if val > 0 and val <=3D count: + authored[val - 1] =3D not authored.get(val - 1, False) + elif rest in ('*', '^'): + toggle =3D (rest =3D=3D '*') + for i in range(count): + authored[i] =3D toggle + elif sel =3D=3D 's': + if val > 0 and val <=3D count: + signed[val - 1] =3D not signed.get(val - 1, False) + elif rest in ('*', '^'): + toggle =3D (rest =3D=3D '*') + for i in range(count): + signed[i] =3D toggle + elif sel =3D=3D 'o': + print_options =3D True + redraw =3D True + elif sel =3D=3D 'g': + if rest =3D=3D 'f': + email_git_fallback =3D not email_git_fallback + else: + email_git =3D not email_git + rerun =3D True + elif sel =3D=3D 'b': + if rest =3D=3D 's': + email_git_blame_signatures =3D not email_git_blame_sig= natures + else: + email_git_blame =3D not email_git_blame + rerun =3D True + elif sel =3D=3D 'c': + if val > 0: + email_git_min_signatures =3D val + rerun =3D True + elif sel =3D=3D 'x': + if val > 0: + email_git_max_maintainers =3D val + rerun =3D True + elif sel =3D=3D '%': + if rest !=3D "" and val >=3D 0: + email_git_min_percent =3D val + rerun =3D True + elif sel =3D=3D 'd': 
+ if vcs_is_git(): + email_git_since =3D rest + elif vcs_is_hg(): + email_hg_since =3D rest + rerun =3D True + elif sel =3D=3D 'f': + email_file_emails =3D not email_file_emails + rerun =3D True + elif sel =3D=3D 'r': + email_remove_duplicates =3D not email_remove_duplicates + rerun =3D True + elif sel =3D=3D 'm': + email_use_mailmap =3D not email_use_mailmap + read_mailmap() + rerun =3D True + elif sel =3D=3D 'k': + keywords =3D not keywords + rerun =3D True + elif sel =3D=3D 'p': + if rest !=3D "" and val >=3D 0: + pattern_depth =3D val + rerun =3D True + elif sel in ('h', '?'): + sys.stderr.write(""" +Interactive mode allows you to select the various maintainers, submitters, +commit signers and mailing lists that could be CC'd on a patch. + +Any *'d entry is selected. + +If you have git or hg installed, you can choose to summarize the commit +history of files in the patch. Also, each line of the current file can +be matched to its commit author and that commits signers with blame. + +Various knobs exist to control the length of time for active commit +tracking, the maximum number of commit authors and signers to add, +and such. + +Enter selections at the prompt until you are satisfied that the selected +maintainers are appropriate. You may enter multiple selections separated +by either commas or spaces. 
+ +""") + else: + sys.stderr.write("invalid option: '{}'\n".format(nr)) + redraw =3D False + + if rerun: + if email_git_blame: + sys.stderr.write("git-blame can be very slow, please have = patience...") + return get_maintainers() + + # drop not selected entries + new_emailto =3D [] + for i in range(len(to_list)): + if selected.get(i, False): + new_emailto.append(to_list[i]) + return new_emailto + +# ---- self-test mode ---- + +def do_self_test(): + lsfiles =3D vcs_list_files(lk_path) + good_links =3D [] + bad_links =3D [] + section_headers =3D [] + index =3D 0 + + for x in self_test_info: + index +=3D 1 + + # Section header duplication and missing section content + if (self_test =3D=3D "" or "sections" in self_test) and \ + re.match(r'^\S[^:]', x["line"]) and \ + index < len(self_test_info) and \ + re.match(r'^([A-Z]):\s*\S', self_test_info[index]["line"]): + has_S =3D False + has_F =3D False + has_ML =3D False + sts =3D "" + if any(h.startswith(x["line"]) for h in section_headers): + print("{}:{}: warning: duplicate section header\t{}".forma= t( + x["file"], x["linenr"], x["line"])) + else: + section_headers.append(x["line"]) + nextline =3D index + while nextline < len(self_test_info): + m =3D re.match(r'^([A-Z]):\s*(\S.*)', self_test_info[nextl= ine]["line"]) + if not m: + break + typ =3D m.group(1) + val =3D m.group(2) + if typ =3D=3D "S": + has_S =3D True + sts =3D val + elif typ in ("F", "N"): + has_F =3D True + elif typ in ("M", "R", "L"): + has_ML =3D True + nextline +=3D 1 + if not has_ML and not re.search(r'orphan|obsolete', sts, re.IG= NORECASE): + print("{}:{}: warning: section without email address\t{}".= format( + x["file"], x["linenr"], x["line"])) + if not has_S: + print("{}:{}: warning: section without status \t{}".format( + x["file"], x["linenr"], x["line"])) + if not has_F: + print("{}:{}: warning: section without file pattern\t{}".f= ormat( + x["file"], x["linenr"], x["line"])) + + m =3D _re_type_value.match(x["line"]) + if not m: + continue + + 
typ =3D m.group(1) + value =3D m.group(2) + + # Filename pattern matching + if typ in ("F", "X") and (self_test =3D=3D "" or "patterns" in sel= f_test): + val =3D value + val =3D val.replace('.', '\\.') + val =3D val.replace('**', '\x00') + val =3D val.replace('*', '.*') + val =3D val.replace('?', '.') + val =3D val.replace('\x00', '(?:.*)') + if os.path.isdir(val): + if not val.endswith('/'): + val +=3D '/' + try: + pat =3D re.compile(r'^' + val) + except re.error: + continue + if not any(pat.search(f) for f in lsfiles): + print("{}:{}: warning: no file matches\t{}".format( + x["file"], x["linenr"], x["line"])) + + # Link reachability + elif typ in ("W", "Q", "B") and \ + re.match(r'^https?:', value) and \ + (self_test =3D=3D "" or "links" in self_test): + if value in good_links: + continue + isbad =3D False + if value in bad_links: + isbad =3D True + else: + ret =3D subprocess.run( + ["wget", "--spider", "-q", "--no-check-certificate", + "--timeout", "10", "--tries", "1", value], + capture_output=3DTrue) + if ret.returncode =3D=3D 0: + good_links.append(value) + else: + bad_links.append(value) + isbad =3D True + if isbad: + print("{}:{}: warning: possible bad link\t{}".format( + x["file"], x["linenr"], x["line"])) + + # SCM reachability + elif typ =3D=3D "T" and (self_test =3D=3D "" or "scm" in self_test= ): + if value in good_links: + continue + isbad =3D False + if value in bad_links: + isbad =3D True + elif not re.match(r'^(?:git|quilt|hg)\s+\S', value): + print("{}:{}: warning: malformed entry\t{}".format( + x["file"], x["linenr"], x["line"])) + else: + m2 =3D re.match(r'^git\s+(\S+)(\s+([^\(]+\S+))?', value) + if m2: + url =3D m2.group(1) + branch =3D m2.group(3) if m2.group(3) else "" + ret =3D subprocess.run( + 'git ls-remote --exit-code -h "{}" {} > /dev/null = 2>&1'.format(url, branch), + shell=3DTrue, capture_output=3DTrue) + if ret.returncode =3D=3D 0: + good_links.append(value) + else: + bad_links.append(value) + isbad =3D True + else: + m2 =3D 
re.match(r'^(?:quilt|hg)\s+(https?:\S+)', value) + if m2: + url =3D m2.group(1) + ret =3D subprocess.run( + ["wget", "--spider", "-q", "--no-check-certifi= cate", + "--timeout", "10", "--tries", "1", url], + capture_output=3DTrue) + if ret.returncode =3D=3D 0: + good_links.append(value) + else: + bad_links.append(value) + isbad =3D True + if isbad: + print("{}:{}: warning: possible bad link\t{}".format( + x["file"], x["linenr"], x["line"])) + +# ---- usage ---- + +def usage(): + print("""usage: {} [options] patchfile + {} [options] -f file|directory +version: {} + +MAINTAINER field selection options: + --email =3D> print email address(es) if any + --git =3D> include recent git *-by: signers + --git-all-signature-types =3D> include signers regardless of signature= type + or use only {} signers (default: {}) + --git-fallback =3D> use git when no exact MAINTAINERS pattern (default= : {}) + --git-chief-penguins =3D> include {} + --git-min-signatures =3D> number of signatures required (default: {}) + --git-max-maintainers =3D> maximum maintainers to add (default: {}) + --git-min-percent =3D> minimum percentage of commits required (default= : {}) + --git-blame =3D> use git blame to find modified commits for patch or f= ile + --git-blame-signatures =3D> when used with --git-blame, also include a= ll commit signers + --git-since =3D> git history to use (default: {}) + --hg-since =3D> hg history to use (default: {}) + --interactive =3D> display a menu (mostly useful if used with the --gi= t option) + --m =3D> include maintainer(s) if any + --r =3D> include reviewer(s) if any + --n =3D> include name 'Full Name ' + --l =3D> include list(s) if any + --moderated =3D> include moderated lists(s) if any (default: true) + --s =3D> include subscriber only list(s) if any (default: false) + --remove-duplicates =3D> minimize duplicate email names/addresses + --roles =3D> show roles (role:subsystem, git-signer, list, etc...) 
+ --rolestats =3D> show roles and statistics (commits/total_commits, %) + --substatus =3D> show subsystem status if not Maintained (default: mat= ch --roles when output is tty)" + --file-emails =3D> add email addresses found in -f file (default: 0 (o= ff)) + --fixes =3D> for patches, add signatures of commits with 'Fixes: <commit>' (default: 1 (on)) + --scm =3D> print SCM tree(s) if any + --status =3D> print status if any + --subsystem =3D> print subsystem name if any + --web =3D> print website(s) if any + --bug =3D> print bug reporting info if any + +Output type options: + --separator [, ] =3D> separator for multiple entries on 1 line + using --separator also sets --nomultiline if --separator is not [, ] + --multiline =3D> print 1 entry per line + +Other options: + --pattern-depth =3D> Number of pattern directory traversals (default: 0 = (all)) + --keywords =3D> scan patch for keywords (default: {}) + --keywords-in-file =3D> scan file for keywords (default: {}) + --sections =3D> print all of the subsystem sections with pattern matches + --letters =3D> print all matching 'letter' types from all matching secti= ons + --mailmap =3D> use .mailmap file (default: {}) + --no-tree =3D> run without a kernel tree + --self-test =3D> show potential issues with MAINTAINERS file content + --version =3D> show version + --help =3D> show this help information + +Default options: + [--email --tree --nogit --git-fallback --m --r --n --l --multiline + --pattern-depth=3D0 --remove-duplicates --rolestats --keywords] + +Notes: + Using "-f directory" may give unexpected results: + Used with "--git", git signators for _all_ files in and below + directory are examined as git recurses directories. + Any specified X: (exclude) pattern matches are _not_ ignored. + Used with "--nogit", directory is used as a pattern match, + no individual file within the directory or subdirectory + is matched.
+ Used with "--git-blame", does not iterate all files in directory + Using "--git-blame" is slow and may add old committers and authors + that are no longer active maintainers to the output. + Using "--roles" or "--rolestats" with git send-email --cc-cmd or any + other automated tools that expect only ["name"] <email address> + may not work because of additional output after <email address>. + Using "--rolestats" and "--git-blame" shows the #/total=3D% commits, + not the percentage of the entire file authored. # of commits is + not a good measure of amount of code authored. 1 major commit may + contain a thousand lines, 5 trivial commits may modify a single line. + If git is not installed, but mercurial (hg) is installed and an .hg + repository exists, the following options apply to mercurial: + --git, + --git-min-signatures, --git-max-maintainers, --git-min-percent, = and + --git-blame + Use --hg-since not --git-since to control date selection + File ".get_maintainer.conf", if it exists in the linux kernel source root + directory, can change whatever get_maintainer defaults are desired. + Entries in this file can be any command line argument. + This file is prepended to any additional command line arguments. + Multiple lines and # comments are allowed. + Most options have both positive and negative forms. + The negative forms for --<foo> are --no<foo> and --no-<foo>.
+""".format(P, P, V, signature_pattern, email_git_all_signature_types, + email_git_fallback, penguin_chiefs, + email_git_min_signatures, email_git_max_maintainers, + email_git_min_percent, email_git_since, email_hg_since, + keywords, keywords_in_file, email_use_mailmap)) + +# ---- argument parsing helpers ---- + +def add_negatable_flag(parser, name, dest, default, short=3DNone): + """Add --foo/--nofoo/--no-foo flags (Perl Getopt::Long style negation)= .""" + names =3D ['--' + name] + if short: + names.insert(0, short) + parser.add_argument(*names, dest=3Ddest, action=3D'store_true', defaul= t=3Ddefault) + parser.add_argument('--no' + name, '--no-' + name, + dest=3Ddest, action=3D'store_false') + +# ---- main ---- + +def main(): + global P, cur_path, lk_path + global email, email_usename, email_maintainer, email_reviewer, email_f= ixes + global email_list, email_moderated_list, email_subscriber_list + global email_git_penguin_chiefs, email_git, email_git_all_signature_ty= pes + global email_git_blame, email_git_blame_signatures, email_git_fallback + global email_git_min_signatures, email_git_max_maintainers, email_git_= min_percent + global email_git_since, email_hg_since, interactive + global email_remove_duplicates, email_use_mailmap + global output_multiline, output_separator, output_roles, output_rolest= ats + global output_substatus, output_section_maxlen + global scm, tree, web, bug, subsystem_opt, status_opt + global letters, keywords, keywords_in_file, sections + global email_file_emails, from_filename, pattern_depth + global self_test, version, find_maintainer_files, maintainer_path + global files, fixes, range_list, keyword_tvi, file_emails + global ignore_emails + + P =3D sys.argv[0] + cur_path =3D os.getcwd() + '/' + + # Load config file + conf =3D which_conf(".get_maintainer.conf") + conf_args =3D [] + if conf and os.path.isfile(conf): + try: + with open(conf, 'r', encoding=3D'utf-8') as f: + for line in f: + line =3D line.rstrip('\n').strip() + line 
=3D re.sub(r'\s+', ' ', line) + if not line or line.startswith('#'): + continue + for word in line.split(): + if word.startswith('#'): + break + conf_args.append(word) + except IOError: + print("{}: Can't find a readable .get_maintainer.conf file".fo= rmat(P), file=3Dsys.stderr) + + # Load ignore file + ignore_file =3D which_conf(".get_maintainer.ignore") + if ignore_file and os.path.isfile(ignore_file): + try: + with open(ignore_file, 'r', encoding=3D'utf-8') as f: + for line in f: + line =3D line.strip() + line =3D re.sub(r'#.*$', '', line).strip() + if not line: + continue + if rfc822_valid(line): + ignore_emails.append(line) + except IOError: + pass + + # Check for --self-test exclusivity + all_args =3D conf_args + sys.argv[1:] + if len(all_args) > 1: + for arg in all_args: + if re.match(r'^-{1,2}self-test', arg): + print("{}: using --self-test does not allow any other opti= on or argument".format(P), + file=3Dsys.stderr) + sys.exit(1) + + # Build argument parser + parser =3D argparse.ArgumentParser(add_help=3DFalse) + + add_negatable_flag(parser, 'email', 'email', True) + add_negatable_flag(parser, 'git', 'email_git', False) + add_negatable_flag(parser, 'git-all-signature-types', 'email_git_all_s= ignature_types', False) + add_negatable_flag(parser, 'git-blame', 'email_git_blame', False) + add_negatable_flag(parser, 'git-blame-signatures', 'email_git_blame_si= gnatures', True) + add_negatable_flag(parser, 'git-fallback', 'email_git_fallback', True) + add_negatable_flag(parser, 'git-chief-penguins', 'email_git_penguin_ch= iefs', False) + parser.add_argument('--git-min-signatures', type=3Dint, default=3D1) + parser.add_argument('--git-max-maintainers', type=3Dint, default=3D5) + parser.add_argument('--git-min-percent', type=3Dint, default=3D5) + parser.add_argument('--git-since', default=3D"1-year-ago") + parser.add_argument('--hg-since', default=3D"-365") + add_negatable_flag(parser, 'interactive', 'interactive', False, short= =3D'-i') + 
add_negatable_flag(parser, 'remove-duplicates', 'email_remove_duplicat= es', True) + add_negatable_flag(parser, 'mailmap', 'email_use_mailmap', True) + add_negatable_flag(parser, 'm', 'email_maintainer', True) + add_negatable_flag(parser, 'r', 'email_reviewer', True) + add_negatable_flag(parser, 'n', 'email_usename', True) + add_negatable_flag(parser, 'l', 'email_list', True) + add_negatable_flag(parser, 'fixes', 'email_fixes', True) + add_negatable_flag(parser, 'moderated', 'email_moderated_list', True) + add_negatable_flag(parser, 's', 'email_subscriber_list', False) + add_negatable_flag(parser, 'multiline', 'output_multiline', True) + add_negatable_flag(parser, 'roles', 'output_roles', False) + add_negatable_flag(parser, 'rolestats', 'output_rolestats', True) + parser.add_argument('--separator', default=3D", ") + add_negatable_flag(parser, 'subsystem', 'subsystem', False) + add_negatable_flag(parser, 'status', 'status', False) + add_negatable_flag(parser, 'substatus', 'output_substatus_flag', None) + add_negatable_flag(parser, 'scm', 'scm', False) + add_negatable_flag(parser, 'tree', 'tree', True) + add_negatable_flag(parser, 'web', 'web', False) + add_negatable_flag(parser, 'bug', 'bug', False) + parser.add_argument('--letters', default=3D"") + parser.add_argument('--pattern-depth', type=3Dint, default=3D0) + add_negatable_flag(parser, 'keywords', 'keywords', True, short=3D'-k') + add_negatable_flag(parser, 'keywords-in-file', 'keywords_in_file', Fal= se) + add_negatable_flag(parser, 'sections', 'sections', False) + add_negatable_flag(parser, 'file-emails', 'email_file_emails', False) + parser.add_argument('-f', '--file', dest=3D'from_filename', action=3D'= store_true', default=3DFalse) + parser.add_argument('--find-maintainer-files', action=3D'store_true', = default=3DFalse) + parser.add_argument('--mpath', '--maintainer-path', dest=3D'maintainer= _path', default=3DNone) + parser.add_argument('--self-test', dest=3D'self_test', nargs=3D'?', co= nst=3D'', 
default=3DNone) + parser.add_argument('-v', '--version', dest=3D'show_version', action= =3D'store_true', default=3DFalse) + parser.add_argument('-h', '--help', '--usage', dest=3D'show_help', act= ion=3D'store_true', default=3DFalse) + parser.add_argument('files', nargs=3D'*') + + # Handle negatable --fe alias + parser.add_argument('--fe', dest=3D'email_file_emails', action=3D'stor= e_true') + parser.add_argument('--kf', dest=3D'keywords_in_file', action=3D'store= _true') + + try: + args =3D parser.parse_args(conf_args + sys.argv[1:]) + except SystemExit: + sys.exit(1) + + if args.show_help: + usage() + sys.exit(0) + + if args.show_version: + print("{} {}".format(P, V)) + sys.exit(0) + + # Apply parsed args to globals + email =3D 1 if args.email else 0 + email_git =3D 1 if args.email_git else 0 + email_git_all_signature_types =3D 1 if args.email_git_all_signature_ty= pes else 0 + email_git_blame =3D 1 if args.email_git_blame else 0 + email_git_blame_signatures =3D 1 if args.email_git_blame_signatures el= se 0 + email_git_fallback =3D 1 if args.email_git_fallback else 0 + email_git_penguin_chiefs =3D 1 if args.email_git_penguin_chiefs else 0 + email_git_min_signatures =3D args.git_min_signatures + email_git_max_maintainers =3D args.git_max_maintainers + email_git_min_percent =3D args.git_min_percent + email_git_since =3D args.git_since + email_hg_since =3D args.hg_since + interactive =3D 1 if args.interactive else 0 + email_remove_duplicates =3D 1 if args.email_remove_duplicates else 0 + email_use_mailmap =3D 1 if args.email_use_mailmap else 0 + email_maintainer =3D 1 if args.email_maintainer else 0 + email_reviewer =3D 1 if args.email_reviewer else 0 + email_usename =3D 1 if args.email_usename else 0 + email_list =3D 1 if args.email_list else 0 + email_fixes =3D 1 if args.email_fixes else 0 + email_moderated_list =3D 1 if args.email_moderated_list else 0 + email_subscriber_list =3D 1 if args.email_subscriber_list else 0 + output_multiline =3D 1 if 
args.output_multiline else 0 + output_roles =3D 1 if args.output_roles else 0 + output_rolestats =3D 1 if args.output_rolestats else 0 + output_separator =3D args.separator + subsystem_opt =3D 1 if args.subsystem else 0 + status_opt =3D 1 if args.status else 0 + scm =3D 1 if args.scm else 0 + tree =3D 1 if args.tree else 0 + web =3D 1 if args.web else 0 + bug =3D 1 if args.bug else 0 + letters =3D args.letters + pattern_depth =3D args.pattern_depth + keywords =3D 1 if args.keywords else 0 + keywords_in_file =3D 1 if args.keywords_in_file else 0 + sections =3D 1 if args.sections else 0 + email_file_emails =3D 1 if args.email_file_emails else 0 + from_filename =3D 1 if args.from_filename else 0 + find_maintainer_files =3D 1 if args.find_maintainer_files else 0 + maintainer_path =3D args.maintainer_path + self_test =3D args.self_test + + # Handle substatus special logic + if args.output_substatus_flag is None: + output_substatus =3D None + else: + output_substatus =3D 1 if args.output_substatus_flag else 0 + + if self_test is not None: + read_all_maintainer_files() + do_self_test() + sys.exit(0) + + if output_separator !=3D ", ": + output_multiline =3D 0 + if interactive: + output_rolestats =3D 1 + if output_rolestats: + output_roles =3D 1 + + if output_substatus is None: + output_substatus =3D 1 if (email and output_roles and sys.stdout.i= satty()) else 0 + + if sections or letters !=3D "": + sections =3D 1 + email =3D 0 + email_list =3D 0 + scm =3D 0 + status_opt =3D 0 + subsystem_opt =3D 0 + web =3D 0 + bug =3D 0 + keywords =3D 0 + keywords_in_file =3D 0 + interactive =3D 0 + else: + selections =3D email + scm + status_opt + subsystem_opt + web + bug + if selections =3D=3D 0: + print("{}: Missing required option: email, scm, status, subsys= tem, web or bug".format(P), + file=3Dsys.stderr) + sys.exit(1) + + if email and \ + (email_maintainer + email_reviewer + + email_list + email_subscriber_list + + email_git + email_git_penguin_chiefs + email_git_blame) =3D=3D 0: 
+ print("{}: Please select at least 1 email option".format(P), file= =3Dsys.stderr) + sys.exit(1) + + if tree and not top_of_kernel_tree(lk_path): + print("{}: The current directory does not appear to be " + "a linux kernel source tree.".format(P), file=3Dsys.stderr) + sys.exit(1) + + # Read MAINTAINERS + read_all_maintainer_files() + + # Read mailmap + read_mailmap() + + # Process input files + input_files =3D args.files if args.files else [] + if not input_files and not sys.stdin.isatty(): + input_files =3D ["&STDIN"] + elif not input_files: + print("{}: missing patchfile or -f file - use --help if necessary"= .format(P), + file=3Dsys.stderr) + sys.exit(1) + + for filepath in input_files: + if filepath !=3D "&STDIN": + filepath =3D os.path.normpath(filepath) + if os.path.isdir(filepath): + if not filepath.endswith('/'): + filepath +=3D '/' + elif not os.path.isfile(filepath): + print("{}: file '{}' not found".format(P, filepath), file= =3Dsys.stderr) + sys.exit(1) + + file_in_vcs =3D None + if from_filename and vcs_exists(): + file_in_vcs =3D vcs_file_exists(filepath) + if not file_in_vcs: + print("{}: file '{}' not found in version control".format(= P, filepath), file=3Dsys.stderr) + + if from_filename or (filepath !=3D "&STDIN" and + (file_in_vcs if file_in_vcs is not None else = vcs_file_exists(filepath))): + # strip absolute or lk_path prefix + if filepath.startswith(cur_path): + filepath =3D filepath[len(cur_path):] + if filepath.startswith(lk_path) and lk_path !=3D "./": + filepath =3D filepath[len(lk_path):] + files.append(filepath) + if filepath !=3D "MAINTAINERS" and os.path.isfile(filepath) an= d keywords and keywords_in_file: + try: + with open(filepath, 'r', encoding=3D'utf-8', errors=3D= 'replace') as f: + text =3D f.read() + for line_idx in keyword_hash: + try: + if re.search(keyword_hash[line_idx], text, re.= VERBOSE): + keyword_tvi.append(line_idx) + except re.error: + pass + except IOError: + pass + else: + file_cnt =3D len(files) + lastfile =3D 
None + + if filepath =3D=3D "&STDIN": + patch =3D sys.stdin + else: + try: + patch =3D open(filepath, 'r', encoding=3D'utf-8', erro= rs=3D'replace') + except IOError: + print("{}: Can't open {}: ".format(P, filepath), file= =3Dsys.stderr) + sys.exit(1) + + patch_prefix =3D "" # Parsing the intro + + for patch_line in patch: + m =3D re.match(r'^ mode change [0-7]+ =3D> [0-7]+ (\S+)\s*= $', patch_line) + if m: + files.append(m.group(1)) + continue + m =3D re.match(r'^rename (?:from|to) (\S+)\s*$', patch_lin= e) + if m: + files.append(m.group(1)) + continue + m =3D re.match(r'^diff --git a/(\S+) b/(\S+)\s*$', patch_l= ine) + if m: + files.append(m.group(1)) + files.append(m.group(2)) + continue + m =3D re.match(r'^Fixes:\s+([0-9a-fA-F]{6,40})', patch_lin= e) + if m: + if email_fixes: + fixes.append(m.group(1)) + continue + m =3D re.match(r'^\+\+\+\s+(\S+)', patch_line) or \ + re.match(r'^---\s+(\S+)', patch_line) + if m: + filename =3D m.group(1) + filename =3D re.sub(r'^[^/]*/', '', filename) + filename =3D filename.rstrip('\n') + lastfile =3D filename + files.append(filename) + patch_prefix =3D "^[+-].*" + continue + m =3D re.match(r'^@@ -(\d+),(\d+)', patch_line) + if m: + if email_git_blame and lastfile: + range_list.append("{}:{}:{}".format(lastfile, m.gr= oup(1), m.group(2))) + continue + if keywords: + for line_idx in keyword_hash: + try: + if re.search(patch_prefix + keyword_hash[line_= idx], + patch_line, re.VERBOSE): + keyword_tvi.append(line_idx) + except re.error: + pass + + if filepath !=3D "&STDIN": + patch.close() + + if file_cnt =3D=3D len(files): + print("{}: file '{}' doesn't appear to be a patch. 
" + "Add -f to options?".format(P, filepath), file=3Dsys= .stderr) + + files[:] =3D sort_and_uniq(files) + + file_emails[:] =3D uniq(file_emails) + fixes[:] =3D uniq(fixes) + + maintainers =3D get_maintainers() + if maintainers: + maintainers =3D merge_email(maintainers) + output(maintainers) + + if scm: + scm_out =3D uniq(scm_list) + output(scm_out) + + if output_substatus: + ss =3D uniq(substatus_list) + output(ss) + + if status_opt: + st =3D uniq(status_list) + output(st) + + if subsystem_opt: + ss =3D uniq(subsystem_list) + output(ss) + + if web: + w =3D uniq(web_list) + output(w) + + if bug: + b =3D uniq(bug_list) + output(b) + +if __name__ =3D=3D "__main__": + main() --=20 2.52.0 From nobody Wed Apr 1 20:38:04 2026 Received: from mail-wm1-f45.google.com (mail-wm1-f45.google.com [209.85.128.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CEC402E717B for ; Wed, 1 Apr 2026 14:48:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.45 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775054891; cv=none; b=g7eRHuY0UrP93Fs/mtaohfSlQdl1OkCALFIiqW1CvJOXvJcx6xZ3oSA8ZTt0ad0rZv4akAXkADMz5/G/oxAcieH7PwZVEAc1y8tq+msSd0y/+5bZB/xSHHxoLtQy2wV8QITLUIzfq3GflHN/k9RaCWkDu59S9DAoP8TkYAXDE18= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775054891; c=relaxed/simple; bh=1a0j+/2frOMbReA062D1KtmtEeL79FEdNZKmMUEzwN4=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UvxTIUj+/AiO4t0EnIBgdMQqT6Cr1SQmg8RE0XqBsrub9qepuOE0wu4C4LFSvhU0QvvEdhGUtAvbZNRTkGaqrUuXtBAs4G5d/+XLCoBvUZK+2bQfB2Axihdaq2pTClBDwXfgydU5fhi0UXpF/KNTYoK6Vd/vZ0MTl1360ohyycY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com 
header.b=NH9cP3O0; arc=none smtp.client-ip=209.85.128.45
From: Guido De Rossi
To: LKML
Subject: [PATCH 2/2] checkpatch: rewrite in Python
Date: Wed, 1 Apr 2026 16:47:23 +0200
Message-ID: <20260401144723.44406-3-guido.derossi91@gmail.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260401144723.44406-1-guido.derossi91@gmail.com>
References: <20260401144723.44406-1-guido.derossi91@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add scripts/checkpatch.py, a Python 3.6+ rewrite of scripts/checkpatch.pl.
This continues the effort to deprecate Perl in kernel scripts, following
scripts/get_maintainer.py.

The Python version implements the core checking infrastructure and the
most commonly triggered checks. It is designed as a drop-in replacement
with identical CLI options and output format.

Implemented checks include:
 - Commit message validation (sign-off, fixes tag, line length,
   diff content, gerrit change-id, git commit references)
 - Whitespace (trailing, DOS endings, tabs vs spaces, space before tab)
 - SPDX license tags
 - Line length limits with URL/string exceptions
 - Code style (brace placement, spacing around operators, function
   parentheses, if/while/for spacing)
 - API usage (volatile, printk levels, BUG variants, deprecated APIs,
   strcpy/strlcpy/strncpy, udelay/msleep, jiffies comparison)
 - Type checks (new typedefs, sizeof usage, CamelCase)
 - File checks (execute permissions, embedded filename, FSF address)
 - Spelling/typo detection via spelling.txt and optional codespell

Checks requiring full C statement context analysis (ctx_statement_block,
annotate_values, operator spacing, macro analysis, brace balancing) are
scaffolded but simplified. These represent the remaining checks and will
be completed incrementally.

Benchmark comparison (wall time):

  Mode                     Perl     Python   Speedup
  ---------------------------------------------------
  core.c (10.9k lines)     9.0s     4.1s     2.2x
  dev.c (13.3k lines)      11.7s    4.8s     2.4x
  super.c (7.6k lines)     7.3s     3.3s     2.2x
  5 files (~50k lines)     40.7s    16.6s    2.5x
  Patch mode (1 commit)    2.8s     2.7s     1.0x
  Git mode (5 commits)     14.3s    11.8s    1.2x

Signed-off-by: Guido De Rossi
---
 scripts/checkpatch.py | 2417 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2417 insertions(+)
 create mode 100755 scripts/checkpatch.py

diff --git a/scripts/checkpatch.py b/scripts/checkpatch.py
new file mode 100755
index 000000000000..712aa373c03c
--- /dev/null
+++ b/scripts/checkpatch.py
@@ -0,0 +1,2417 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+#
+# (c) 2001, Dave Jones.
(the file handling bit) +# (c) 2005, Joel Schopp (the ugly bit) +# (c) 2007,2008, Andy Whitcroft (new conditions, test sui= te) +# (c) 2008-2010 Andy Whitcroft +# +# Python rewrite of scripts/checkpatch.pl + +import argparse +import os +import re +import shutil +import subprocess +import sys + +P =3D os.path.basename(sys.argv[0]) +D =3D os.path.dirname(os.path.abspath(sys.argv[0])) + +V =3D '0.32' + +# ---- Global options with defaults ---- +quiet =3D 0 +verbose =3D False +verbose_messages =3D {} +verbose_emitted =3D {} +tree =3D True +chk_signoff =3D True +chk_fixes_tag =3D True +chk_patch =3D True +tst_only =3D None +emacs =3D False +terse =3D False +showfile =3D False +file_mode =3D False +git_mode =3D False +git_commits =3D {} +check =3D False +check_orig =3D False +summary =3D True +mailback =3D False +summary_file =3D False +show_types =3D False +list_types =3D False +fix =3D False +fix_inplace =3D False +root =3D None +gitroot =3D os.environ.get('GIT_DIR', '.git') +debug =3D {} +camelcase =3D {} +use_type =3D {} +use_list =3D [] +ignore_type =3D {} +ignore_list =3D [] +max_line_length =3D 100 +min_conf_desc_length =3D 4 +spelling_file =3D os.path.join(D, 'spelling.txt') +codespell =3D False +codespellfile =3D '/usr/share/codespell/dictionary.txt' +user_codespellfile =3D '' +conststructsfile =3D os.path.join(D, 'const_structs.checkpatch') +docsfile =3D os.path.join(D, '..', 'Documentation', 'dev-tools', 'checkpat= ch.rst') +typedefsfile =3D None +color =3D 'auto' +allow_c99_comments =3D True +git_command =3D 'export LANGUAGE=3Den_US.UTF-8; git' +tabsize =3D 8 +CONFIG_ =3D 'CONFIG_' +configuration_file =3D '.checkpatch.conf' + +maybe_linker_symbol =3D {} + +# ---- ANSI color codes ---- +RED =3D '\033[31m' +YELLOW =3D '\033[33m' +GREEN =3D '\033[32m' +BLUE =3D '\033[34m' +RESET =3D '\033[0m' + +# ---- Regex patterns (matching Perl exactly) ---- + +Ident =3D r'[A-Za-z_][A-Za-z\d_]*(?:\s*\#\#\s*[A-Za-z_][A-Za-z\d_]*)*' +Storage =3D r'(?:extern|static|asmlinkage)' 
+Sparse =3D r'(?:__user|__kernel|__force|__iomem|__must_check|__kprobes|__r= ef|__refconst|__refdata|__rcu|__private)' +InitAttributePrefix =3D r'__(?:mem|cpu|dev|net_|)' +InitAttributeData =3D InitAttributePrefix + r'(?:initdata\b)' +InitAttributeConst =3D InitAttributePrefix + r'(?:initconst\b)' +InitAttributeInit =3D InitAttributePrefix + r'(?:init\b)' +InitAttribute =3D f'(?:{InitAttributeData}|{InitAttributeConst}|{InitAttri= buteInit})' + +Attribute =3D (r'(?:const|volatile|__percpu|__nocast|__safe|__bitwise|__pa= cked__|__packed2__|' + r'__naked|__maybe_unused|__always_unused|__noreturn|__used|__= cold|__pure|' + r'__noclone|__deprecated|__read_mostly|__ro_after_init|__kpro= bes|' + + InitAttribute + r'|' + r'__aligned\s*\(.*\)|____cacheline_aligned|____cacheline_alig= ned_in_smp|' + r'____cacheline_internodealigned_in_smp|__weak|' + r'__alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\))') + +Inline =3D r'(?:inline|__always_inline|noinline|__inline|__inline__)' +Member =3D f'(?:->{Ident}|\\.{Ident}|\\[[^\\]]*\\])' +Lval =3D f'(?:{Ident}(?:{Member})*)' + +Int_type =3D r'(?:[iI])?(?:llu|ull|ll|lu|ul|l|u)' +Binary =3D r'(?:[iI])?0[bB][01]+(?:' + Int_type + r')?' +Hex =3D r'(?:[iI])?0[xX][0-9a-fA-F]+(?:' + Int_type + r')?' +Int =3D r'[0-9]+(?:' + Int_type + r')?' +Octal =3D r'0[0-7]+(?:' + Int_type + r')?' +String =3D r'(?:\b[Lu])?"[X\t]*"' +Float_hex =3D r'(?:[iI])?0[xX][0-9a-fA-F]+[pP]-?[0-9]+[fFlL]?' +Float_dec =3D r'(?:[iI])?(?:[0-9]+\.[0-9]*|[0-9]*\.[0-9]+)(?:[eE]-?[0-9]+)= ?[fFlL]?' +Float_int =3D r'(?:[iI])?[0-9]+[eE]-?[0-9]+[fFlL]?' 
+Float =3D f'(?:{Float_hex}|{Float_dec}|{Float_int})' +Constant =3D f'(?:{Float}|{Binary}|{Octal}|{Hex}|{Int})' +Assignment =3D r'(?:\*=3D|/=3D|%=3D|\+=3D|-=3D|<<=3D|>>=3D|&=3D|\^=3D|\|= =3D|=3D)' +Compare =3D r'(?:<=3D|>=3D|=3D=3D|!=3D|<|(?<!-)>)' +Arithmetic =3D r'(?:\+|-|\*|/|%)' +Operators =3D r'(?:<=3D|>=3D|=3D=3D|!=3D|=3D>|->|<<|>>|<|>|!|~|&&|\|\||,|\= ^|\+\+|--|&|\||' + Arithmetic + r')' + +c90_Keywords =3D r'(?:do|for|while|if|else|return|goto|continue|switch|def= ault|case|break)' + +NON_ASCII_UTF8 =3D (r'(?:[\xC2-\xDF][\x80-\xBF]' + r'|\xE0[\xA0-\xBF][\x80-\xBF]' + r'|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}' + r'|\xED[\x80-\x9F][\x80-\xBF]' + r'|\xF0[\x90-\xBF][\x80-\xBF]{2}' + r'|[\xF1-\xF3][\x80-\xBF]{3}' + r'|\xF4[\x80-\x8F][\x80-\xBF]{2})') +UTF8 =3D r'(?:[\x09\x0A\x0D\x20-\x7E]|' + NON_ASCII_UTF8 + r')' + +typeC99Typedefs =3D r'(?:__)?(?:[us]_?)?int_?(?:8|16|32|64)_t' +typeOtherOSTypedefs =3D r'(?:u_(?:char|short|int|long)|u(?:nchar|short|int= |long))' +typeKernelTypedefs =3D r'(?:(?:__)?(?:u|s|be|le)(?:8|16|32|64)|atomic_t)' +typeStdioTypedefs =3D r'(?:FILE)' +typeTypedefs =3D f'(?:{typeC99Typedefs}\\b|{typeOtherOSTypedefs}\\b|{typeK= ernelTypedefs}\\b|{typeStdioTypedefs}\\b)' + +zero_initializer =3D r'(?:(?:0[xX])?0+(?:' + Int_type + r')?|NULL|false)\b' + +logFunctions =3D (r'(?:printk(?:_ratelimited|_once|_deferred_once|_deferre= d|)|' + r'(?:[a-z0-9]+_){1,2}(?:printk|emerg|alert|crit|err|warnin= g|warn|notice|info|debug|dbg|vdbg|devel|cont|WARN)(?:_ratelimited|_once|)|' + r'TP_printk|WARN(?:_RATELIMIT|_ONCE|)|panic|' + r'MODULE_[A-Z_]+|seq_vprintf|seq_printf|seq_puts)') + +allocFunctions =3D (r'(?:(?:(?:devm_)?(?:kv|k|v)[czm]alloc(?:_array)?(?:_n= ode)?|' + r'kstrdup(?:_const)?|kmemdup(?:_nul)?)|' + r'(?:\w+)?alloc_skb(?:_ip_align)?|dma_alloc_coherent)') + +signature_tags =3D (r'(?:Signed-off-by:|Co-developed-by:|Acked-by:|Tested-= by:|' + r'Reviewed-by:|Reported-by:|Suggested-by:|To:|Cc:)') + +link_tags =3D ['Link', 'Closes'] +link_tags_search =3D '(?:' +
'|'.join(t + ':' for t in link_tags) + ')' +link_tags_print =3D ' or '.join("'" + t + ":'" for t in link_tags) + +tracing_logging_tags =3D (r'(?:[=3D\-]*>|<[=3D\-]*|\[|\]|start|called|ente= red|entry|enter|in|' + r'inside|here|begin|exit|end|done|leave|completed|= out|return|[\.\!:\s]*)') + +dev_id_types =3D r'\b[a-z]\w*_device_id\b' + +obsolete_archives =3D (r'(?:freedesktop\.org/archives/dri-devel|' + r'lists\.infradead\.org|lkml\.org|mail-archive\.com|' + r'mailman\.alsa-project\.org/pipermail|marc\.info|' + r'ozlabs\.org/pipermail|spinics\.net)') + +typeListMisordered =3D [ + r'char\s+(?:un)?signed', + r'int\s+(?:(?:un)?signed\s+)?short\s', + r'int\s+short(?:\s+(?:un)?signed)', + r'short\s+int(?:\s+(?:un)?signed)', + r'(?:un)?signed\s+int\s+short', + r'short\s+(?:un)?signed', + r'long\s+int\s+(?:un)?signed', + r'int\s+long\s+(?:un)?signed', + r'long\s+(?:un)?signed\s+int', + r'int\s+(?:un)?signed\s+long', + r'int\s+(?:un)?signed', + r'int\s+long\s+long\s+(?:un)?signed', + r'long\s+long\s+int\s+(?:un)?signed', + r'long\s+long\s+(?:un)?signed\s+int', + r'long\s+long\s+(?:un)?signed', + r'long\s+(?:un)?signed', +] + +typeList =3D [ + r'void', + r'(?:(?:un)?signed\s+)?char', + r'(?:(?:un)?signed\s+)?short\s+int', + r'(?:(?:un)?signed\s+)?short', + r'(?:(?:un)?signed\s+)?int', + r'(?:(?:un)?signed\s+)?long\s+int', + r'(?:(?:un)?signed\s+)?long\s+long\s+int', + r'(?:(?:un)?signed\s+)?long\s+long', + r'(?:(?:un)?signed\s+)?long', + r'(?:un)?signed', + r'float', + r'double', + r'bool', + f'struct\\s+{Ident}', + f'union\\s+{Ident}', + f'enum\\s+{Ident}', + f'{Ident}_t', + f'{Ident}_handler', + f'{Ident}_handler_fn', +] + typeListMisordered + +C90_int_types =3D (r'(?:long\s+long\s+int\s+(?:un)?signed|long\s+long\s+(?= :un)?signed\s+int|' + r'long\s+long\s+(?:un)?signed|(?:(?:un)?signed\s+)?long\s= +long\s+int|' + r'(?:(?:un)?signed\s+)?long\s+long|int\s+long\s+long\s+(?= :un)?signed|' + r'int\s+(?:(?:un)?signed\s+)?long\s+long|' + 
r'long\s+int\s+(?:un)?signed|long\s+(?:un)?signed\s+int|' + r'long\s+(?:un)?signed|(?:(?:un)?signed\s+)?long\s+int|' + r'(?:(?:un)?signed\s+)?long|int\s+long\s+(?:un)?signed|' + r'int\s+(?:(?:un)?signed\s+)?long|' + r'int\s+(?:un)?signed|(?:(?:un)?signed\s+)?int)') + +typeListFile =3D [] +typeListWithAttr =3D typeList + [ + f'struct\\s+{InitAttribute}\\s+{Ident}', + f'union\\s+{InitAttribute}\\s+{Ident}', +] + +modifierList =3D [r'fastcall'] +modifierListFile =3D [] + +mode_permission_funcs =3D [ + ("module_param", 3), + ("module_param_(?:array|named|string)", 4), + ("module_param_array_named", 5), + ("debugfs_create_(?:file|u8|u16|u32|u64|x8|x16|x32|x64|size_t|atomic_t= |bool|blob|regset32|u32_array)", 2), + ("proc_create(?:_data|)", 2), + ("(?:CLASS|DEVICE|SENSOR|SENSOR_DEVICE|IIO_DEVICE)_ATTR", 2), + ("IIO_DEV_ATTR_[A-Z_]+", 1), + ("SENSOR_(?:DEVICE_|)ATTR_2", 2), + ("SENSOR_TEMPLATE(?:_2|)", 3), + ("__ATTR", 2), +] + +word_pattern =3D r'\b[A-Z]?[a-z]{2,}\b' + +mode_perms_search =3D '(?:' + '|'.join(e[0] for e in mode_permission_funcs= ) + ')' + +deprecated_apis =3D { + "kmap": "kmap_local_page", + "kunmap": "kunmap_local", + "kmap_atomic": "kmap_local_page", + "kunmap_atomic": "kunmap_local", + "DEFINE_IDR": "DEFINE_XARRAY", + "idr_init": "xa_init", + "idr_init_base": "xa_init_flags", +} + +deprecated_apis_search =3D '(?:' + '|'.join(re.escape(k) for k in deprecat= ed_apis) + ')' + +mode_perms_world_writable =3D r'(?:S_IWUGO|S_IWOTH|S_IRWXUGO|S_IALLUGO|0[0= -7][0-7][2367])' + +mode_permission_string_types =3D { + "S_IRWXU": 0o700, "S_IRUSR": 0o400, "S_IWUSR": 0o200, "S_IXUSR": 0o100, + "S_IRWXG": 0o070, "S_IRGRP": 0o040, "S_IWGRP": 0o020, "S_IXGRP": 0o010, + "S_IRWXO": 0o007, "S_IROTH": 0o004, "S_IWOTH": 0o002, "S_IXOTH": 0o001, + "S_IRWXUGO": 0o777, "S_IRUGO": 0o444, "S_IWUGO": 0o222, "S_IXUGO": 0o1= 11, +} + +single_mode_perms_string_search =3D '(?:' + '|'.join(mode_permission_strin= g_types.keys()) + ')' +multi_mode_perms_string_search =3D 
single_mode_perms_string_search + r'(?:= \s*\|\s*' + single_mode_perms_string_search + r')*' + +allowed_asm_includes =3D r'(?:irq|memory|time|reboot)' + +allow_repeated_words =3D {'add', 'added', 'bad', 'be'} + +# ---- Dynamically built type patterns ---- +Modifier =3D '' +BasicType =3D '' +NonptrType =3D '' +NonptrTypeMisordered =3D '' +NonptrTypeWithAttr =3D '' +Type =3D '' +TypeMisordered =3D '' +Declare =3D '' +DeclareMisordered =3D '' + +def build_types(): + global Modifier, BasicType, NonptrType, NonptrTypeMisordered, NonptrTy= peWithAttr + global Type, TypeMisordered, Declare, DeclareMisordered + + mods =3D '(?:' + '|'.join(modifierList + modifierListFile) + ')' + all_types =3D '(?:' + '|'.join(typeList + typeListFile) + ')' + mis =3D '(?:' + '|'.join(typeListMisordered) + ')' + allWithAttr =3D '(?:' + '|'.join(typeListWithAttr) + ')' + + Modifier =3D f'(?:{Attribute}|{Sparse}|{mods})' + BasicType =3D f'(?:{typeTypedefs}\\b|(?:{all_types})\\b)' + NonptrType =3D (f'(?:(?:{Modifier}\\s+|const\\s+)*' + f'(?:(?:typeof|__typeof__)\\s*\\([^\\)]*\\)|{typeTypedef= s}\\b|(?:{all_types})\\b)' + f'(?:\\s+{Modifier}|\\s+const)*)') + NonptrTypeMisordered =3D (f'(?:(?:{Modifier}\\s+|const\\s+)*' + f'(?:(?:{mis})\\b)' + f'(?:\\s+{Modifier}|\\s+const)*)') + NonptrTypeWithAttr =3D (f'(?:(?:{Modifier}\\s+|const\\s+)*' + f'(?:(?:typeof|__typeof__)\\s*\\([^\\)]*\\)|{typ= eTypedefs}\\b|(?:{allWithAttr})\\b)' + f'(?:\\s+{Modifier}|\\s+const)*)') + Type =3D (f'(?:{NonptrType}' + f'(?:(?:\\s|\\*|\\[\\])+\\s*const|(?:\\s|\\*\\s*(?:const\\s*)?= |\\[\\])+|(?:\\s*\\[\\s*\\])+){{0,4}}' + f'(?:\\s+{Inline}|\\s+{Modifier})*)') + TypeMisordered =3D (f'(?:{NonptrTypeMisordered}' + f'(?:(?:\\s|\\*|\\[\\])+\\s*const|(?:\\s|\\*\\s*(?:c= onst\\s*)?|\\[\\])+|(?:\\s*\\[\\s*\\])+){{0,4}}' + f'(?:\\s+{Inline}|\\s+{Modifier})*)') + Declare =3D f'(?:(?:{Storage}\\s+(?:{Inline}\\s+)?)?{Type})' + DeclareMisordered =3D f'(?:(?:{Storage}\\s+(?:{Inline}\\s+)?)?{TypeMis= ordered})' + +build_types() + 
+Typecast =3D f'(?:\\s*(?:\\(\\s*{NonptrType}\\s*\\)){{0,1}}\\s*)' + +# These require recursive regex which Python doesn't have natively. +# We approximate balanced_parens with a non-recursive depth-limited versio= n. +balanced_parens =3D r'(\((?:[^()]*|\((?:[^()]*|\([^()]*\))*\))*\))' +LvalOrFunc =3D f'(?:(?:[&*]\\s*)?{Lval})\\s*(?:{balanced_parens}{{0,1}})\\= s*' +FuncArg =3D f'(?:{Typecast}{{0,1}}(?:{LvalOrFunc}|{Constant}|{String}))' + +declaration_macros =3D (f'(?:(?:{Storage}\\s+)?(?:[A-Z_][A-Z0-9]*_){{0,2}}= (?:DEFINE|DECLARE)(?:_[A-Z0-9]+){{1,6}}\\s*\\(|' + f'(?:{Storage}\\s+)?[HLP]?LIST_HEAD\\s*\\(|' + r'(?:SKCIPHER_REQUEST|SHASH_DESC|AHASH_REQUEST)_ON_S= TACK\s*\(|' + f'(?:{Storage}\\s+)?(?:XA_STATE|XA_STATE_ORDER)\\s*\= \()') + +# Comment character used to sanitize lines +COMMENT_CHAR =3D chr(1) # \x01 used like Perl's $; + +# ---- Utility functions ---- + +def which(bin_name): + path =3D shutil.which(bin_name) + return path if path else "" + +def which_conf(conf): + for path_dir in ['.', os.environ.get('HOME', ''), '.scripts']: + p =3D os.path.join(path_dir, conf) + if os.path.exists(p): + return p + return "" + +def top_of_kernel_tree(root_path): + checks =3D ["COPYING", "CREDITS", "Kbuild", "MAINTAINERS", "Makefile", + "README", "Documentation", "arch", "include", "drivers", + "fs", "init", "ipc", "kernel", "lib", "scripts"] + for c in checks: + if not os.path.exists(os.path.join(root_path, c)): + return False + return True + +def trim(s): + return s.strip() + +def ltrim(s): + return s.lstrip() + +def rtrim(s): + return s.rstrip() + +def expand_tabs(s): + res =3D '' + n =3D 0 + for c in s: + if c =3D=3D '\t': + res +=3D ' ' + n +=3D 1 + while n % tabsize !=3D 0: + res +=3D ' ' + n +=3D 1 + else: + res +=3D c + n +=3D 1 + return res + +def copy_spacing(s): + return re.sub(r'[^\t]', ' ', s) + +def line_stats(line): + line =3D line[1:] if line else '' # drop diff marker + line =3D expand_tabs(line) + m =3D re.match(r'^(\s*)', line) + white =3D 
len(m.group(1)) if m else 0 + return (len(line), white) + +def cat_vet(vet): + res =3D '' + for c in vet: + if c =3D=3D '\t': + res +=3D c + elif ord(c) < 32 or ord(c) =3D=3D 127: + res +=3D '^' + chr(ord(c) + 64) if ord(c) < 127 else '^?' + else: + res +=3D c + res +=3D '$' + return res + +def tabify(leading): + spaces =3D ' ' * tabsize + while spaces in leading: + leading =3D leading.replace(spaces, '\t', 1) + return leading + +def string_find_replace(string, find, replace): + return re.sub(find, replace, string) + +def deparenthesize(string): + if string is None: + return "" + while re.match(r'^\s*\(.*\)\s*$', string, re.DOTALL): + string =3D re.sub(r'^\s*\(\s*', '', string) + string =3D re.sub(r'\s*\)\s*$', '', string) + string =3D re.sub(r'\s+', ' ', string) + return string + +def get_edit_distance(str1, str2): + str1 =3D str1.lower().replace('-', '') + str2 =3D str2.lower().replace('-', '') + len1, len2 =3D len(str1), len(str2) + d =3D [[0] * (len2 + 1) for _ in range(len1 + 1)] + for i in range(len1 + 1): + d[i][0] =3D i + for j in range(len2 + 1): + d[0][j] =3D j + for i in range(1, len1 + 1): + for j in range(1, len2 + 1): + if str1[i-1] =3D=3D str2[j-1]: + d[i][j] =3D d[i-1][j-1] + else: + d[i][j] =3D 1 + min(d[i][j-1], d[i-1][j], d[i-1][j-1]) + return d[len1][len2] + +def find_standard_signature(sign_off): + standard =3D ['Signed-off-by:', 'Co-developed-by:', 'Acked-by:', 'Test= ed-by:', + 'Reviewed-by:', 'Reported-by:', 'Suggested-by:'] + for sig in standard: + if get_edit_distance(sign_off, sig) <=3D 2: + return sig + return "" + +def perms_to_octal(string): + string =3D string.strip() + if re.match(r'^\s*0[0-7]{3}\s*$', string): + return string.strip() + to_val =3D 0 + for m in re.finditer(r'\b(' + single_mode_perms_string_search + r')\b'= , string): + match =3D m.group(1) + if match in mode_permission_string_types: + to_val |=3D mode_permission_string_types[match] + return f'{to_val:04o}' + +def is_userspace(realfile): + return 
bool(re.match(r'^tools/', realfile) or re.match(r'^scripts/', r= ealfile)) + +def exclude_global_initialisers(realfile): + return bool(re.match(r'^tools/testing/selftests/bpf/progs/.*\.c$', rea= lfile) or + re.match(r'^samples/bpf/.*_kern\.c$', realfile) or + re.search(r'/bpf/.*\.bpf\.c$', realfile)) + +# ---- Sanitize line (replace comments/strings with placeholders) ---- + +_sanitise_quote =3D '' + +def sanitise_line_reset(in_comment=3DFalse): + global _sanitise_quote + _sanitise_quote =3D '*/' if in_comment else '' + +def sanitise_line(line): + global _sanitise_quote + if not line: + return line + + res =3D list(line[0]) # copy diff marker + rest =3D line[1:] if len(line) > 1 else '' + + off =3D 0 + length =3D len(rest) + while off < length: + c =3D rest[off] + + # Block comments + if _sanitise_quote =3D=3D '' and off + 1 < length and rest[off:off= +2] =3D=3D '/*': + _sanitise_quote =3D '*/' + res.append(COMMENT_CHAR) + res.append(COMMENT_CHAR) + off +=3D 2 + continue + if _sanitise_quote =3D=3D '*/' and off + 1 < length and rest[off:o= ff+2] =3D=3D '*/': + _sanitise_quote =3D '' + res.append(COMMENT_CHAR) + res.append(COMMENT_CHAR) + off +=3D 2 + continue + if _sanitise_quote =3D=3D '' and off + 1 < length and rest[off:off= +2] =3D=3D '//': + _sanitise_quote =3D '//' + res.append('/') + res.append('/') + off +=3D 2 + continue + + # Escaped chars in strings + if _sanitise_quote in ("'", '"') and c =3D=3D '\\' and off + 1 < l= ength: + res.append('X') + res.append('X') + off +=3D 2 + continue + + # Quotes + if c in ("'", '"'): + if _sanitise_quote =3D=3D '': + _sanitise_quote =3D c + res.append(c) + off +=3D 1 + continue + elif _sanitise_quote =3D=3D c: + _sanitise_quote =3D '' + + # Replace content + if _sanitise_quote =3D=3D '*/' and c !=3D '\t': + res.append(COMMENT_CHAR) + elif _sanitise_quote =3D=3D '//' and c !=3D '\t': + res.append(COMMENT_CHAR) + elif _sanitise_quote and _sanitise_quote not in ('*/', '//') and c= !=3D '\t': + res.append('X') + else: + 
res.append(c) + off +=3D 1 + + if _sanitise_quote =3D=3D '//': + _sanitise_quote =3D '' + + result =3D ''.join(res) + + # Clean up #include paths + m =3D re.match(r'^.\s*\#\s*include\s+<(.*)>', result) + if m: + clean =3D 'X' * len(m.group(1)) + result =3D re.sub(r'<.*>', f'<{clean}>', result, count=3D1) + else: + m =3D re.match(r'^.\s*\#\s*(?:error|warning)\s+(.*)\b', result) + if m: + clean =3D 'X' * len(m.group(1)) + result =3D re.sub(r'(\#\s*(?:error|warning)\s+).*', r'\g<1>' += clean, result, count=3D1) + + if allow_c99_comments: + m =3D re.search(r'(//.*$)', result) + if m: + repl =3D COMMENT_CHAR * len(m.group(1)) + result =3D result[:m.start(1)] + repl + result[m.end(1):] + + return result + +def get_quoted_string(line, rawline): + if not line or not rawline: + return "" + m =3D re.search(String, line) + if not m: + return "" + return rawline[m.start():m.end()] + +# ---- Reporting functions ---- + +report_list =3D [] +cnt_lines =3D 0 +cnt_error =3D 0 +cnt_warn =3D 0 +cnt_chk =3D 0 +clean =3D 1 +prefix =3D '' +rpt_cleaners =3D 0 + +def show_type(msg_type): + msg_type =3D msg_type.upper() + if use_type: + return msg_type in use_type + return msg_type not in ignore_type + +def report(level, msg_type, msg): + global prefix, report_list + if not show_type(msg_type): + return False + if tst_only and tst_only not in msg: + return False + + output =3D '' + if color_enabled: + if level =3D=3D 'ERROR': + output +=3D RED + elif level =3D=3D 'WARNING': + output +=3D YELLOW + else: + output +=3D GREEN + + output +=3D prefix + level + ':' + if show_types: + if color_enabled: + output +=3D BLUE + output +=3D msg_type + ':' + if color_enabled: + output +=3D RESET + output +=3D ' ' + msg + '\n' + + if showfile: + lines =3D output.split('\n', 2) + if len(lines) > 2: + output =3D lines[0] + '\n' + '\n'.join(lines[2:]) + + if terse: + output =3D output.split('\n')[0] + '\n' + + if verbose and msg_type in verbose_messages and msg_type not in verbos= e_emitted: + output +=3D 
verbose_messages[msg_type] + '\n\n' + verbose_emitted[msg_type] =3D True + + report_list.append(output) + return True + +def ERROR(msg_type, msg): + global clean, cnt_error + if report("ERROR", msg_type, msg): + clean =3D 0 + cnt_error +=3D 1 + return True + return False + +def WARN(msg_type, msg): + global clean, cnt_warn + if report("WARNING", msg_type, msg): + clean =3D 0 + cnt_warn +=3D 1 + return True + return False + +def CHK(msg_type, msg): + global clean, cnt_chk + if check and report("CHECK", msg_type, msg): + clean =3D 0 + cnt_chk +=3D 1 + return True + return False + +# ---- Fix tracking ---- + +fixed =3D [] +fixed_inserted =3D [] +fixed_deleted =3D [] +fixlinenr =3D -1 + +def fix_insert_line(linenr, line): + fixed_inserted.append({'LINENR': linenr, 'LINE': line}) + +def fix_delete_line(linenr, line): + fixed_deleted.append({'LINENR': linenr, 'LINE': line}) + +def fixup_current_range(lines_list, idx, offset, length): + if idx < len(lines_list): + m =3D re.match(r'^(@@ -\d+,\d+ \+)(\d+),(\d+)( @@)', lines_list[id= x]) + if m: + new_o =3D int(m.group(2)) + offset + new_l =3D int(m.group(3)) + length + lines_list[idx] =3D f'{m.group(1)}{new_o},{new_l}{m.group(4)}' + +def fix_inserted_deleted_lines(lines_ref, inserted_ref, deleted_ref): + range_last =3D 0 + delta_offset =3D 0 + old_linenr =3D 0 + new_linenr =3D 0 + next_insert =3D 0 + next_delete =3D 0 + result =3D [] + + for old_line in lines_ref: + save =3D True + line =3D old_line + + if re.match(r'^(?:\+\+\+|---)\s+\S+', line): + delta_offset =3D 0 + elif re.match(r'^@@ -\d+,\d+ \+\d+,\d+ @@', line): + range_last =3D new_linenr + fixup_current_range(result, range_last, delta_offset, 0) if re= sult else None + + while next_delete < len(deleted_ref) and deleted_ref[next_delete][= 'LINENR'] =3D=3D old_linenr: + next_delete +=3D 1 + save =3D False + delta_offset -=3D 1 + + while next_insert < len(inserted_ref) and inserted_ref[next_insert= ]['LINENR'] =3D=3D old_linenr: + 
result.append(inserted_ref[next_insert]['LINE']) + next_insert +=3D 1 + new_linenr +=3D 1 + delta_offset +=3D 1 + + if save: + result.append(line) + new_linenr +=3D 1 + + old_linenr +=3D 1 + + return result + +# ---- Context analysis functions ---- + +def raw_line(linenr, cnt, rawlines): + offset =3D linenr - 1 + cnt +=3D 1 + line =3D None + while cnt > 0: + if offset >=3D len(rawlines): + return None + line =3D rawlines[offset] + offset +=3D 1 + if line is not None and line.startswith('-'): + continue + cnt -=3D 1 + return line + +def get_stat_real(linenr, lc, rawlines): + stat_real =3D raw_line(linenr, 0, rawlines) + if stat_real is None: + return "" + for count in range(linenr + 1, lc + 1): + rl =3D raw_line(count, 0, rawlines) + if rl is not None: + stat_real +=3D '\n' + rl + return stat_real + +def get_stat_here(linenr, cnt, here, rawlines): + herectx =3D here + '\n' + for n in range(cnt): + rl =3D raw_line(linenr, n, rawlines) + if rl is not None: + herectx +=3D rl + '\n' + return herectx + +def ctx_has_comment(first_line, end_line, rawlines): + """Check if there's a comment in the context around end_line.""" + # Check current, previous, and next lines for // comments + for idx in [end_line - 1, end_line - 2, end_line]: + if 0 <=3D idx < len(rawlines): + m =3D re.search(r'//.*$', rawlines[idx]) + if m: + return True + # Check for inline /* */ comment + if 0 <=3D end_line - 1 < len(rawlines): + if re.search(r'/\*.*\*/', rawlines[end_line - 1]): + return True + # Check for block comment in context + in_comment =3D False + for ln in range(first_line - 1, end_line): + if 0 <=3D ln < len(rawlines): + if '/*' in rawlines[ln]: + in_comment =3D True + if in_comment: + return True + if '*/' in rawlines[ln]: + in_comment =3D False + return False + +def ctx_statement_block(linenr, remain, off, lines, rawlines): + """Extract a statement block from the source.""" + line =3D linenr - 1 + blk =3D '' + soff =3D off + coff =3D off - 1 + coff_set =3D False + loff =3D 0 + ptype 
=3D '' + level =3D 0 + stack =3D [] + p =3D None + length =3D 0 + + while True: + if not stack: + stack =3D [('', 0)] + + if off >=3D length: + while remain > 0: + if line >=3D len(lines): + break + if lines[line] is not None and lines[line].startswith('-'): + line +=3D 1 + continue + remain -=3D 1 + loff =3D length + blk +=3D (lines[line] if lines[line] is not None else '') = + '\n' + length =3D len(blk) + line +=3D 1 + break + + if off >=3D length: + break + + if level =3D=3D 0 and re.match(r'^.\s*#\s*define', blk[off:]): + level +=3D 1 + ptype =3D '#' + + p_prev =3D p + p =3D blk[off] if off < length else '' + remainder =3D blk[off:] + + # Handle nested #if/#else + if re.match(r'^#\s*(?:ifndef|ifdef|if)\s', remainder): + stack.append((ptype, level)) + elif re.match(r'^#\s*(?:else|elif)\b', remainder): + if len(stack) >=3D 2: + ptype, level =3D stack[-2] + elif re.match(r'^#\s*endif\b', remainder): + if stack: + ptype, level =3D stack.pop() + + if level =3D=3D 0 and p =3D=3D ';': + break + + # else detection + if (level =3D=3D 0 and not coff_set and + (p_prev is None or re.match(r'[\s}+]', str(p_prev))) and + re.match(r'^(else)(?:\s|{)', remainder) and + not re.match(r'^else\s+if\b', remainder)): + m =3D re.match(r'^(else)', remainder) + if m: + coff =3D off + len(m.group(1)) - 1 + coff_set =3D True + + if (ptype =3D=3D '' or ptype =3D=3D '(') and p =3D=3D '(': + level +=3D 1 + ptype =3D '(' + if ptype =3D=3D '(' and p =3D=3D ')': + level -=3D 1 + ptype =3D '(' if level !=3D 0 else '' + if level =3D=3D 0 and coff < soff: + coff =3D off + coff_set =3D True + + if (ptype =3D=3D '' or ptype =3D=3D '{') and p =3D=3D '{': + level +=3D 1 + ptype =3D '{' + if ptype =3D=3D '{' and p =3D=3D '}': + level -=3D 1 + ptype =3D '{' if level !=3D 0 else '' + if level =3D=3D 0: + if off + 1 < length and blk[off + 1] =3D=3D ';': + off +=3D 1 + break + + if ptype =3D=3D '#' and p =3D=3D '\n' and p_prev !=3D '\\': + level -=3D 1 + ptype =3D '' + off +=3D 1 + break + + off +=3D 1 + + 
if off =3D=3D length: + loff =3D length + 1 + line +=3D 1 + remain -=3D 1 + + statement =3D blk[soff:off + 1] if off + 1 <=3D len(blk) else blk[soff= :] + condition =3D blk[soff:coff + 1] if coff + 1 <=3D len(blk) else blk[so= ff:] + + return (statement, condition, line, max(remain + 1, 0), off - loff + 1= , level) + +def statement_lines(stmt): + stmt =3D re.sub(r'(?:^|\n).', '\n', stmt) + stmt =3D stmt.strip() + return stmt.count('\n') + 1 + +def statement_rawlines(stmt): + return stmt.count('\n') + 1 + +def statement_block_size(stmt): + stmt =3D re.sub(r'(?:^|\n).', '\n', stmt) + stmt =3D re.sub(r'^\s*\{', '', stmt) + stmt =3D re.sub(r'\}\s*$', '', stmt) + stmt =3D stmt.strip() + stmt_lines_count =3D stmt.count('\n') + 1 + stmt_stmts =3D stmt.count(';') + return max(stmt_lines_count, stmt_stmts) + +def pos_last_openparen(line): + opens =3D line.count('(') + closes =3D line.count(')') + if opens =3D=3D 0 or closes >=3D opens: + return -1 + last_open =3D 0 + for pos, c in enumerate(line): + if c =3D=3D '(': + last_open =3D pos + return len(expand_tabs(line[:last_open])) + 1 + +# ---- Value annotation (simplified from Perl) ---- + +av_preprocessor =3D False +av_pending =3D '_' +av_paren_type =3D ['E'] +av_pend_colon =3D 'O' + +def annotate_reset(): + global av_preprocessor, av_pending, av_paren_type, av_pend_colon + av_preprocessor =3D False + av_pending =3D '_' + av_paren_type =3D ['E'] + av_pend_colon =3D 'O' + +def annotate_values(stream, vtype): + global av_preprocessor, av_pending, av_paren_type, av_pend_colon + res =3D '' + var =3D '_' * len(stream) + var_list =3D list(var) + cur =3D stream + + while cur: + if not av_paren_type: + av_paren_type =3D ['E'] + + consumed =3D None + + m =3D re.match(r'^(\s+)', cur) + if m: + consumed =3D m.group(1) + if '\n' in consumed and av_preprocessor: + if av_paren_type: + vtype =3D av_paren_type.pop() + av_preprocessor =3D False + if consumed is None: + m =3D re.match(r'^(\(\s*' + Type + r'\s*)\)', cur) + if m and av_pending 
=3D=3D '_': + consumed =3D m.group(1) + av_paren_type.append(vtype) + vtype =3D 'c' + if consumed is None: + m =3D re.match(r'^(' + Type + r')\s*(?:' + Ident + r'|,|\)|\(|= \s*$)', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'T' + if consumed is None: + m =3D re.match(r'^(' + Modifier + r')\s*', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'T' + if consumed is None: + m =3D re.match(r'^(\#\s*define\s*' + Ident + r')(\(?)', cur) + if m: + consumed =3D m.group(1) + av_preprocessor =3D True + av_paren_type.append(vtype) + if m.group(2): + av_pending =3D 'N' + vtype =3D 'E' + if consumed is None: + m =3D re.match(r'^(\#\s*(?:undef\s*' + Ident + r'|include\b))'= , cur) + if m: + consumed =3D m.group(1) + av_preprocessor =3D True + av_paren_type.append(vtype) + if consumed is None: + m =3D re.match(r'^(\#\s*(?:ifdef|ifndef|if))', cur) + if m: + consumed =3D m.group(1) + av_preprocessor =3D True + av_paren_type.append(vtype) + av_paren_type.append(vtype) + vtype =3D 'E' + if consumed is None: + m =3D re.match(r'^(\#\s*(?:else|elif))', cur) + if m: + consumed =3D m.group(1) + av_preprocessor =3D True + if av_paren_type: + av_paren_type.append(av_paren_type[-1]) + vtype =3D 'E' + if consumed is None: + m =3D re.match(r'^(\#\s*endif)', cur) + if m: + consumed =3D m.group(1) + av_preprocessor =3D True + if av_paren_type: + av_paren_type.pop() + av_paren_type.append(vtype) + vtype =3D 'E' + if consumed is None: + m =3D re.match(r'^(\\\n)', cur) + if m: + consumed =3D m.group(1) + if consumed is None: + m =3D re.match(r'^(__attribute__)\s*\(?', cur) + if m: + consumed =3D m.group(1) + av_pending =3D vtype + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(sizeof)\s*(\()?', cur) + if m: + consumed =3D m.group(1) + if m.group(2): + av_pending =3D 'V' + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(if|while|for)\b', cur) + if m: + consumed =3D m.group(1) + av_pending =3D 'E' + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(case)', cur) 
+ if m: + consumed =3D m.group(1) + av_pend_colon =3D 'C' + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(return|else|goto|typeof|__typeof__)\b', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(\()', cur) + if m: + consumed =3D m.group(1) + av_paren_type.append(av_pending) + av_pending =3D '_' + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(\))', cur) + if m: + consumed =3D m.group(1) + new_type =3D av_paren_type.pop() if av_paren_type else '_' + if new_type !=3D '_': + vtype =3D new_type + if consumed is None: + m =3D re.match(r'^(' + Ident + r')\s*\(', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'V' + av_pending =3D 'V' + if consumed is None: + m =3D re.match(r'^(' + Ident + r'\s*):(?:\s*\d+\s*(,|=3D|;))?'= , cur) + if m: + consumed =3D m.group(1) + if m.group(2) and vtype in ('C', 'T'): + av_pend_colon =3D 'B' + elif vtype =3D=3D 'E': + av_pend_colon =3D 'L' + vtype =3D 'V' + if consumed is None: + m =3D re.match(r'^(' + Ident + r'|' + Constant + r')', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'V' + if consumed is None: + m =3D re.match(r'^(' + Assignment + r')', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(;|\{|\})', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'E' + av_pend_colon =3D 'O' + if consumed is None: + m =3D re.match(r'^(,)', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'C' + if consumed is None: + m =3D re.match(r'^(\?)', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(:)', cur) + if m: + consumed =3D m.group(1) + idx =3D len(res) + if idx < len(var_list): + var_list[idx] =3D av_pend_colon + if av_pend_colon in ('C', 'L'): + vtype =3D 'E' + else: + vtype =3D 'N' + av_pend_colon =3D 'O' + if consumed is None: + m =3D re.match(r'^(\[)', cur) + if m: + consumed =3D m.group(1) + vtype =3D 'N' + if consumed is None: + m =3D 
re.match(r'^(-(?![->])|\+(?!\+)|\*|&&|&)', cur) + if m: + consumed =3D m.group(1) + variant =3D 'B' if vtype =3D=3D 'V' else 'U' + idx =3D len(res) + if idx < len(var_list): + var_list[idx] =3D variant + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(' + Operators + r')', cur) + if m: + consumed =3D m.group(1) + if consumed not in ('++', '--'): + vtype =3D 'N' + if consumed is None: + m =3D re.match(r'^(.)', cur) + if m: + consumed =3D m.group(1) + + if consumed: + cur =3D cur[len(consumed):] + res +=3D vtype * len(consumed) + + return (res, ''.join(var_list)) + +# ---- Spelling ---- + +misspellings =3D None +spelling_fix =3D {} + +def load_spelling(): + global misspellings, spelling_fix + if os.path.isfile(spelling_file): + try: + with open(spelling_file, 'r', encoding=3D'utf-8', errors=3D're= place') as f: + for line in f: + line =3D line.strip() + if not line or line.startswith('#'): + continue + parts =3D line.split('||', 1) + if len(parts) =3D=3D 2: + spelling_fix[parts[0]] =3D parts[1] + except IOError: + print(f"No typos will be found - file '{spelling_file}': not r= eadable", file=3Dsys.stderr) + + if codespell and os.path.isfile(codespellfile): + try: + with open(codespellfile, 'r', encoding=3D'utf-8', errors=3D're= place') as f: + for line in f: + line =3D line.strip() + if not line or line.startswith('#') or ', disabled' in= line.lower(): + continue + line =3D line.split(',')[0] + parts =3D line.split('->', 1) + if len(parts) =3D=3D 2: + spelling_fix[parts[0]] =3D parts[1] + except IOError: + print(f"No codespell typos will be found - file '{codespellfil= e}': not readable", file=3Dsys.stderr) + + if spelling_fix: + misspellings =3D '|'.join(sorted(spelling_fix.keys())) + +# ---- Const structs ---- + +const_structs =3D None + +def load_const_structs(): + global const_structs + if os.path.isfile(conststructsfile): + words =3D [] + try: + with open(conststructsfile, 'r', encoding=3D'utf-8', errors=3D= 'replace') as f: + for line in f: + line =3D 
line.strip() + if not line or line.startswith('#') or ' ' in line: + continue + words.append(line) + except IOError: + pass + if words: + const_structs =3D '|'.join(words) + +# ---- Git helpers ---- + +def git_is_single_file(filename): + if not which("git") or not os.path.exists(gitroot): + return False + try: + output =3D subprocess.run(f'{git_command} ls-files -- {filename}', + shell=3DTrue, capture_output=3DTrue, text= =3DTrue).stdout + count =3D output.count('\n') + return count =3D=3D 1 and output.strip() =3D=3D filename + except Exception: + return False + +def git_commit_info(commit, cid, desc): + if not which("git") or not os.path.exists(gitroot): + return (cid, desc) + try: + output =3D subprocess.run( + f"{git_command} log --no-color --format=3D'%H %s' -1 {commit}", + shell=3DTrue, stdout=3Dsubprocess.PIPE, stderr=3Dsubprocess.STDOUT, text=3DTrue).stdout.strip() + lines =3D output.split('\n') + except Exception: + return (cid, desc) + + if not lines: + return (cid, desc) + + if 'error: short SHA1' in lines[0] and 'is ambiguous' in lines[0]: + pass + elif 'fatal: ambiguous argument' in lines[0] or 'fatal: bad object' in= lines[0]: + cid =3D None + else: + cid =3D lines[0][:12] + desc =3D lines[0][41:] if len(lines[0]) > 41 else '' + + return (cid, desc) + +# ---- Maintained/obsolete check ---- + +maintained_status =3D {} + +def is_maintained_obsolete(filename): + global maintained_status + if not tree or not root: + return False + gm_script =3D os.path.join(root, 'scripts', 'get_maintainer.pl') + if not os.path.exists(gm_script): + gm_script =3D os.path.join(root, 'scripts', 'get_maintainer.py') + if not os.path.exists(gm_script): + return False + + if filename not in maintained_status: + try: + if gm_script.endswith('.py'): + cmd =3D f'python3 {gm_script} --status --nom --nol --nogit= --nogit-fallback -f {filename}' + else: + cmd =3D f'perl {gm_script} --status --nom --nol --nogit --= nogit-fallback -f {filename}' + maintained_status[filename] =3D 
subprocess.run( + cmd, shell=3DTrue, capture_output=3DTrue, text=3DTrue).std= out + except Exception: + return False + + return bool(re.search(r'obsolete', maintained_status.get(filename, '')= , re.IGNORECASE)) + +def is_SPDX_License_valid(license_str): + if not tree or not which("python3") or not root: + return True + spdxcheck =3D os.path.join(root, 'scripts', 'spdxcheck.py') + if not os.path.isfile(spdxcheck) or not os.path.exists(gitroot): + return True + try: + result =3D subprocess.run( + f'cd "{os.path.abspath(root)}"; echo "{license_str}" | scripts= /spdxcheck.py -', + shell=3DTrue, capture_output=3DTrue, text=3DTrue) + return result.stdout =3D=3D "" + except Exception: + return True + +# ---- Verbose docs loading ---- + +def load_docs(): + global verbose_messages + if not os.path.isfile(docsfile): + return + try: + with open(docsfile, 'r', encoding=3D'utf-8', errors=3D'replace') a= s f: + msg_type =3D '' + desc =3D '' + in_desc =3D False + for raw_line_text in f: + line =3D raw_line_text.rstrip() + m =3D re.match(r'^\s*\*\*(.+)\*\*$', line) + if m: + if desc: + verbose_messages[msg_type] =3D desc.strip() + msg_type =3D m.group(1) + desc =3D '' + in_desc =3D True + elif in_desc: + if re.match(r'^(?:\s{4,}|$)', line): + desc +=3D re.sub(r'^\s{4}', '', line) + '\n' + else: + if desc: + verbose_messages[msg_type] =3D desc.strip() + msg_type =3D '' + desc =3D '' + in_desc =3D False + if desc: + verbose_messages[msg_type] =3D desc.strip() + except IOError: + pass + +# ---- List types ---- + +def do_list_types(): + """List all message types by scanning our own source.""" + try: + with open(os.path.abspath(sys.argv[0]), 'r', encoding=3D'utf-8') a= s f: + text =3D f.read() + except IOError: + return + + types =3D {} + for m in re.finditer(r'(?:(\bCHK|\bWARN|\bERROR)\s*\()\s*["\']([^"\']+= )["\']', text): + level =3D m.group(1) + msg_type =3D m.group(2) + if msg_type in types: + if types[msg_type] !=3D level: + types[msg_type] +=3D ',' + level + else: + 
types[msg_type] =3D level + + print("#\tMessage type\n") + if color_enabled: + print(f" ( Color coding: {RED}ERROR{RESET} | {YELLOW}WARNING{RESET= } | {GREEN}CHECK{RESET} | Multiple levels / Undetermined )\n") + + count =3D 0 + for msg_type in sorted(types.keys()): + count +=3D 1 + display =3D msg_type + if color_enabled: + level =3D types[msg_type] + if level =3D=3D 'ERROR': + display =3D RED + msg_type + RESET + elif level =3D=3D 'WARN': + display =3D YELLOW + msg_type + RESET + elif level =3D=3D 'CHK': + display =3D GREEN + msg_type + RESET + print(f"{count}\t{display}") + if verbose and msg_type in verbose_messages: + msg =3D verbose_messages[msg_type].replace('\n', '\n\t') + print(f"\t{msg}\n") + +# ---- Main process function ---- + +def process(filename): + global rawlines, lines, fixed, fixed_inserted, fixed_deleted, fixlinenr + global report_list, cnt_lines, cnt_error, cnt_warn, cnt_chk, clean + global prefix, rpt_cleaners, check, modifierListFile, typeListFile + + linenr =3D 0 + prevline =3D "" + prevrawline =3D "" + stashline =3D "" + stashrawline =3D "" + length =3D 0 + indent =3D 0 + previndent =3D 0 + stashindent =3D 0 + + clean =3D 1 + signoff =3D 0 + fixes_tag =3D 0 + is_revert =3D False + needs_fixes_tag =3D "" + author =3D '' + authorsignoff =3D 0 + author_sob =3D '' + is_patch =3D False + in_header_lines =3D 0 if file_mode else 1 + in_commit_log =3D False + has_patch_separator =3D False + has_commit_log =3D False + commit_log_lines =3D 0 + commit_log_possible_stack_dump =3D False + commit_log_long_line =3D False + commit_log_has_diff =3D False + reported_maintainer_file =3D False + non_utf8_charset =3D False + last_git_commit_id_linenr =3D -1 + last_blank_line =3D 0 + last_coalesced_string_linenr =3D -1 + + report_list =3D [] + cnt_lines =3D 0 + cnt_error =3D 0 + cnt_warn =3D 0 + cnt_chk =3D 0 + + realfile =3D '' + realline =3D 0 + realcnt =3D 0 + here =3D '' + context_function =3D None + in_comment =3D False + first_line =3D 0 + p1_prefix =3D '' 
+ prev_values =3D 'E' + + suppress_ifbraces =3D {} + suppress_whiletrailers =3D {} + suppress_export =3D {} + suppress_statement =3D 0 + + signatures =3D {} + setup_docs_list =3D [] + setup_docs =3D False + camelcase_file_seeded =3D False + checklicenseline =3D 1 + + emitted_corrupt =3D 0 + + # Pre-scan: sanitize lines and build lines[] + sanitise_line_reset() + lines_local =3D [] + local_fixed =3D [] + + for raw_idx, rawline_text in enumerate(rawlines): + line_text =3D rawline_text + + if fix: + local_fixed.append(rawline_text) + + if re.match(r'^\+\+\+\s+(\S+)', rawline_text): + setup_docs =3D False + m =3D re.match(r'^\+\+\+\s+(\S+)', rawline_text) + if m and re.search(r'Documentation/admin-guide/kernel-paramete= rs\.txt$', m.group(1)): + setup_docs =3D True + + m =3D re.match(r'^@@ -\d+(?:,\d+)? \+(\d+)(,(\d+))? @@', rawline_t= ext) + if m: + realline =3D int(m.group(1)) - 1 + realcnt =3D int(m.group(3)) + 1 if m.group(2) else 2 + in_comment =3D False + + # Guess if we're in a comment + edge =3D None + cnt =3D realcnt + for ln in range(raw_idx + 1, len(rawlines)): + if rawlines[ln].startswith('-'): + continue + cnt -=3D 1 + if cnt <=3D 0: + break + m2 =3D re.search(r'(/\*|\*/)', rawlines[ln]) + if m2 and not re.search(r'"[^"]*(?:/\*|\*/)[^"]*"', rawlin= es[ln]): + edge =3D m2.group(1) + break + + if edge =3D=3D '*/': + in_comment =3D True + + if edge is None and raw_idx + 1 < len(rawlines): + if re.match(r'^.\s*(?:\*\*+| \*)(?:\s|$)', rawlines[raw_id= x + 1] if raw_idx + 1 < len(rawlines) else ''): + in_comment =3D True + + sanitise_line_reset(in_comment) + elif realcnt and re.match(r'^(?:\+| |$)', rawline_text): + line_text =3D sanitise_line(rawline_text) + + lines_local.append(line_text) + + if realcnt > 1: + if re.match(r'^(?:\+| |$)', line_text): + realcnt -=3D 1 + else: + realcnt =3D 0 + + if setup_docs and line_text.startswith('+'): + setup_docs_list.append(line_text) + + lines =3D lines_local + if fix: + fixed[:] =3D local_fixed + + # Main processing 
loop + prefix_local =3D '' + realcnt =3D 0 + linenr =3D 0 + fixlinenr =3D -1 + hunk_line =3D False + + for line_idx, line in enumerate(lines): + linenr =3D line_idx + 1 + fixlinenr +=3D 1 + + sline =3D line.replace(COMMENT_CHAR, ' ') if line else '' + rawline =3D rawlines[line_idx] if line_idx < len(rawlines) else '' + + # Check mode change / rename / patch start + if not in_commit_log: + if (re.match(r'^ mode change [0-7]+ =3D> [0-7]+ \S+\s*$', line= or '') or + re.match(r'^rename (?:from|to) \S+\s*$', line or '') or + re.match(r'^diff --git a/[\w/._\-]+ b/\S+\s*$', line or ''= )): + is_patch =3D True + + # Extract line range + if not in_commit_log: + m =3D re.match(r'^@@ -\d+(?:,\d+)? \+(\d+)(,(\d+))? @@(.*)', l= ine or '') + if m: + context_text =3D m.group(4) + is_patch =3D True + first_line =3D linenr + 1 + realline =3D int(m.group(1)) - 1 + realcnt =3D int(m.group(3)) + 1 if m.group(2) else 2 + annotate_reset() + prev_values =3D 'E' + suppress_ifbraces =3D {} + suppress_whiletrailers =3D {} + suppress_export =3D {} + suppress_statement =3D 0 + m2 =3D re.search(r'\b(\w+)\s*\(', context_text or '') + context_function =3D m2.group(1) if m2 else None + continue + + # Track lines in the hunk + if re.match(r'^( |\+|$)', line or ''): + realline +=3D 1 + if realcnt: + realcnt -=3D 1 + length, indent =3D line_stats(rawline) + prevline, stashline =3D stashline, line + previndent, stashindent =3D stashindent, indent + prevrawline, stashrawline =3D stashrawline, rawline + elif realcnt =3D=3D 1: + realcnt =3D 0 + + hunk_line =3D (realcnt !=3D 0) + + if not file_mode: + here =3D f"#{linenr}: " + else: + here =3D f"#{realline}: " + + # Extract filename + found_file =3D False + m =3D re.match(r'^diff --git.*?(\S+)$', line or '') + if m: + realfile =3D m.group(1) + if not file_mode: + realfile =3D re.sub(r'^[^/]*/', '', realfile) + in_commit_log =3D False + found_file =3D True + else: + m =3D re.match(r'^\+\+\+\s+(\S+)', line or '') + if m: + realfile =3D m.group(1) + if not 
file_mode: + realfile =3D re.sub(r'^[^/]*/', '', realfile) + in_commit_log =3D False + p1_prefix =3D re.match(r'^([^/]*/)', m.group(1)) + p1_prefix =3D p1_prefix.group(1) if p1_prefix else '' + + if re.match(r'^include/asm/', realfile): + ERROR("MODIFIED_INCLUDE_ASM", + "do not modify files in include/asm, change arch= itecture specific files in arch/<architecture>/include/asm\n" + f"{here}{ra= wline}\n") + found_file =3D True + + # Set prefix for error reporting + if showfile: + prefix =3D f"{realfile}:{realline}: " + elif emacs: + prefix =3D f"{filename}:{realline if file_mode else linenr}: " + else: + prefix =3D '' + + if found_file: + if is_maintained_obsolete(realfile): + WARN("OBSOLETE", + f"{realfile} is marked as 'obsolete' in the MAINTAINE= RS hierarchy. No unnecessary modifications please.\n") + if re.match(r'^(?:drivers/net/|net/|drivers/staging/)', realfi= le): + check =3D True + else: + check =3D check_orig + checklicenseline =3D 1 + continue + + here +=3D f"FILE: {realfile}:{realline}:" if realcnt else '' + + hereline =3D f"{here}\n{rawline}\n" + herecurr =3D f"{here}\n{rawline}\n" + hereprev =3D f"{here}\n{prevrawline}\n{rawline}\n" + + if realcnt: + cnt_lines +=3D 1 + + # ---- Commit log checks ---- + if in_commit_log: + if line and not re.match(r'^\s*$', line): + commit_log_lines +=3D 1 + elif has_commit_log and commit_log_lines < 2: + WARN("COMMIT_MESSAGE", + "Missing commit description - Add an appropriate one\n") + commit_log_lines =3D 2 + + # Check for diff in commit message + if (in_commit_log and not commit_log_has_diff and line and + (re.match(r'^\s+diff\b.*a/([\w/]+)', line) or + re.match(r'^\s*(?:---\s+a/|\+\+\+\s+b/)', line) or + re.match(r'^\s*@@ -\d+,\d+ \+\d+,\d+ @@', line))): + ERROR("DIFF_IN_COMMIT_MSG", + "Avoid using diff content in the commit message - patch(= 1) might not work\n" + herecurr) + commit_log_has_diff =3D True + + # Incorrect file permissions + if line and re.match(r'^new (file )?mode.*[7531]\d{0,2}$', line): + permhere =3D here + 
f"FILE: {realfile}\n" + if not re.match(r'^scripts/', realfile) and not re.search(r'\.= (py|pl|awk|sh)$', realfile): + ERROR("EXECUTE_PERMISSIONS", + "do not set execute permissions for source files\n" = + permhere) + + # Check for From: + if line and re.match(r'^From:\s*(.*)', line, re.IGNORECASE): + author =3D re.match(r'^From:\s*(.*)', line, re.IGNORECASE).gro= up(1) + + # Check for signoff + if line and re.match(r'^\s*signed-off-by:\s*(.*)', line, re.IGNORE= CASE): + signoff +=3D 1 + in_commit_log =3D False + + # Check for patch separator + if line =3D=3D '---': + has_patch_separator =3D True + in_commit_log =3D False + + # MAINTAINERS update check + if line and re.match(r'^\s*MAINTAINERS\s*\|', line): + reported_maintainer_file =3D True + + # Check if it's the start of commit log + if in_header_lines and realfile =3D=3D '': + if rawline and not (re.match(r'^\s+(?:\S|$)', rawline) or + re.match(r'^(?:commit\b|from\b|[\w-]+:)', = rawline, re.IGNORECASE)): + in_header_lines =3D 0 + in_commit_log =3D True + has_commit_log =3D True + + # Check for Fixes: tag + if (not in_header_lines and line and + re.match(r'^\s*(fixes:?)\s*(?:commit\s*)?([0-9a-f]{5,40})', li= ne, re.IGNORECASE)): + fixes_tag =3D 1 + m =3D re.match(r'^\s*(fixes:?)\s*(?:commit\s*)?([0-9a-f]{5,40}= )', line, re.IGNORECASE) + tag =3D m.group(1) + orig_commit =3D m.group(2) + # Simplified Fixes: tag check + if tag !=3D "Fixes:": + cid, ctitle =3D git_commit_info(orig_commit, '0123456789ab= ', 'commit title') + if cid is not None: + fixed_str =3D f'Fixes: {cid} ("{ctitle}")' + WARN("BAD_FIXES_TAG", + f"Please use correct Fixes: style 'Fixes: <12+ ch= ars of sha1> (\"\")' - ie: '{fixed_str}'\n" + herecurr) + + # Check for Gerrit Change-Id + if realfile =3D=3D '' and not has_patch_separator and line and re.= match(r'^\s*change-id:', line, re.IGNORECASE): + ERROR("GERRIT_CHANGE_ID", + "Remove Gerrit Change-Id's before submitting upstream\n"= + herecurr) + + # Check commit log line length + if 
(in_commit_log and not commit_log_long_line and line and + len(line) > 75 and + not re.match(r'^\s*[a-zA-Z0-9_/\.]+\s+\|\s+\d+', line) and + not re.match(r'^\s*(?:[\w.\-+]*/)+[\w.\-+]+:', line) and + not re.match(r'^\s*(?:Fixes:|https?:|' + link_tags_search + '|= ' + signature_tags + r')', line, re.IGNORECASE) and + not commit_log_possible_stack_dump): + WARN("COMMIT_LOG_LONG_LINE", + f"Prefer a maximum 75 chars per line (possible unwrapped = commit description?)\n" + herecurr) + commit_log_long_line =3D True + + # Check for This reverts commit + if not in_header_lines and not is_patch and line and re.match(r'^T= his reverts commit', line): + is_revert =3D True + + # Bug/crash indicators + if (not in_header_lines and not is_patch and line and + re.search(r'((?:(?:BUG: K\.|UB)SAN: |Call Trace:|stable@|syzka= ller))', line)): + needs_fixes_tag =3D re.search(r'((?:(?:BUG: K\.|UB)SAN: |Call = Trace:|stable@|syzkaller))', line).group(1) + + # Check for lines starting with # + if in_commit_log and line and line.startswith('#'): + WARN("COMMIT_COMMENT_SYMBOL", + "Commit log lines starting with '#' are dropped by git as= comments\n" + herecurr) + + # ignore non-hunk lines and lines being removed + if not hunk_line or (line and line.startswith('-')): + continue + + # ---- Whitespace checks ---- + + # DOS line endings + if line and line.startswith('+') and '\r' in rawline: + herevet =3D f"{here}\n{cat_vet(rawline)}\n" + ERROR("DOS_LINE_ENDINGS", "DOS line endings\n" + herevet) + # Trailing whitespace + elif rawline and (re.match(r'^\+.*\S\s+$', rawline) or re.match(r'= ^\+\s+$', rawline)): + herevet =3D f"{here}\n{cat_vet(rawline)}\n" + ERROR("TRAILING_WHITESPACE", "trailing whitespace\n" + herevet) + rpt_cleaners =3D 1 + + # Check we are in a valid source file + if not re.search(r'\.(h|c|rs|s|S|sh|dtsi|dts)$', realfile): + continue + + # SPDX check + if realline =3D=3D checklicenseline: + if rawline and rawline.startswith('+'): + if re.search(r'SPDX-License-Identifier:', 
rawline): + m =3D re.search(r'(SPDX-License-Identifier: .*)', rawl= ine) + if m: + spdx_license =3D m.group(1) + if not is_SPDX_License_valid(spdx_license): + WARN("SPDX_LICENSE_TAG", + f"'{spdx_license}' is not supported in LI= CENSES/...\n" + herecurr) + + # Line length check + if line and line.startswith('+') and length > max_line_length: + msg_type =3D "LONG_LINE" + + # Skip URLs + if re.search(r'\b[a-z][\w.+\-]*://\S+', rawline, re.IGNORECASE= ): + msg_type =3D "" + # Skip strings + elif re.match(r'^\+\s*' + String + r'\s*(?:\s*|,|\)\s*;)\s*$',= line): + msg_type =3D "" + elif re.match(r'^\+\s*#\s*define\s+\w+\s+' + String + r'$', li= ne): + msg_type =3D "" + + if msg_type and show_type("LONG_LINE") and show_type(msg_type): + msg_level =3D CHK if file_mode else WARN + msg_level(msg_type, + f"line length of {length} exceeds {max_line_lengt= h} columns\n" + herecurr) + + # Check we are in a valid C source file + if not re.search(r'\.(h|c|pl|dtsi|dts)$', realfile): + continue + + # Code indent (tabs vs spaces) + # At the beginning of a line any tabs must come first and anything + # more than tabsize spaces must use tabs + if rawline and (re.match(r'^\+\s* \t\s*\S', rawline) or + re.match(r'^\+ {' + str(tabsize) + r',}\s*', rawli= ne)): + herevet =3D f"{here}\n{cat_vet(rawline)}\n" + ERROR("CODE_INDENT", "code indent should use tabs where possib= le\n" + herevet) + rpt_cleaners =3D 1 + + # Space before tab + if rawline and rawline.startswith('+') and ' \t' in rawline: + herevet =3D f"{here}\n{cat_vet(rawline)}\n" + WARN("SPACE_BEFORE_TAB", "please, no space before tabs\n" + he= revet) + + # Check we are in a valid C source file + if not re.search(r'\.(h|c)$', realfile): + continue + + # Function start detection + if (sline and re.match(r'^\+\{\s*$', sline) and + prevline and re.match(r'^\+(?:(?:(?:' + Storage + r'|' + Inlin= e + r')\s*)*\s*' + Type + r'\s*)?(' + Ident + r')\(', prevline)): + m =3D re.match(r'^\+(?:(?:(?:' + Storage + r'|' + Inline + r')= \s*)*\s*' 
+ Type + r'\s*)?(' + Ident + r')\(', prevline) + if m: + context_function =3D m.group(1) + + # Function end + if sline and re.match(r'^\+\}\s*$', sline): + context_function =3D None + + # ---- Simple pattern-based checks ---- + + # printk should use KERN_* levels + if line and re.search(r'\bprintk\s*\(\s*(?!KERN_[A-Z]+\b)', line): + WARN("PRINTK_WITHOUT_KERN_LEVEL", + "printk() should include KERN_<LEVEL> facility level\n" += herecurr) + + # ENOSYS + if line and re.search(r'\bENOSYS\b', line): + WARN("ENOSYS", + "ENOSYS means 'invalid syscall nr' and nothing else\n" + = herecurr) + + # ENOTSUPP + if not file_mode and line and re.search(r'\bENOTSUPP\b', line): + WARN("ENOTSUPP", + "ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP\n"= + herecurr) + + # BUG + if line and re.search(r'\b(?!AA_|BUILD_|IDA_|KVM_|RWLOCK_|snd_|SPI= N_)(?:[a-zA-Z_]*_)?BUG(?:_ON)?(?:_[A-Z_]+)?\s*\(', line): + msg_level =3D CHK if file_mode else WARN + msg_level("AVOID_BUG", + "Do not crash the kernel unless it is absolutely una= voidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BU= G() or variants\n" + herecurr) + + # LINUX_VERSION_CODE + if line and re.search(r'\bLINUX_VERSION_CODE\b', line): + WARN("LINUX_VERSION_CODE", + "LINUX_VERSION_CODE should be avoided, code should be for= the version to which it is merged\n" + herecurr) + + # volatile + if line and re.search(r'\bvolatile\b', line) and not re.search(r'\= b(?:__asm__|asm)\s+(?:__volatile__|volatile)\b', line): + WARN("VOLATILE", + "Use of volatile is usually wrong: see Documentation/proc= ess/volatile-considered-harmful.rst\n" + herecurr) + + # trace_printk + if line: + m =3D re.search(r'\b(trace_printk|trace_puts|ftrace_vprintk)\s= *\(', line) + if m: + WARN("TRACE_PRINTK", + f"Do not use {m.group(1)}() in production code (this = can be ignored if built only with a debug config option)\n" + herecurr) + + # printk_ratelimit + if line and re.search(r'\bprintk_ratelimit\s*\(', line): + 
WARN("PRINTK_RATELIMITED", + "Prefer printk_ratelimited or pr_<level>_ratelimited to p= rintk_ratelimit\n" + herecurr) + + # udelay + if line: + m =3D re.search(r'\budelay\s*\(\s*(\d+)\s*\)', line) + if m: + delay =3D int(m.group(1)) + if delay >=3D 10: + CHK("USLEEP_RANGE", + "usleep_range is preferred over udelay; see functi= on description of usleep_range() and udelay().\n" + herecurr) + if delay > 2000: + WARN("LONG_UDELAY", + "long udelay - prefer mdelay; see function descri= ption of mdelay().\n" + herecurr) + + # msleep + if line: + m =3D re.search(r'\bmsleep\s*\((\d+)\);', line) + if m and int(m.group(1)) < 20: + WARN("MSLEEP", + "msleep < 20ms can sleep for up to 20ms; see function= description of msleep().\n" + herecurr) + + # jiffies comparison + if line and re.search(r'\bjiffies\s*' + Compare + r'|' + Compare += r'\s*jiffies\b', line): + WARN("JIFFIES_COMPARISON", + "Comparing jiffies is almost always wrong; prefer time_af= ter, time_before and friends\n" + herecurr) + + # strcpy/strlcpy/strncpy + if line and re.search(r'\bstrcpy\s*\(', line) and not is_userspace= (realfile): + WARN("STRCPY", + "Prefer strscpy over strcpy - see: https://github.com/KSP= P/linux/issues/88\n" + herecurr) + if line and re.search(r'\bstrlcpy\s*\(', line) and not is_userspac= e(realfile): + WARN("STRLCPY", + "Prefer strscpy over strlcpy - see: https://github.com/KS= PP/linux/issues/89\n" + herecurr) + if line and re.search(r'\bstrncpy\s*\(', line) and not is_userspac= e(realfile): + WARN("STRNCPY", + "Prefer strscpy, strscpy_pad, or __nonstring over strncpy= - see: https://github.com/KSPP/linux/issues/90\n" + herecurr) + + # yield() + if line and re.search(r'\byield\s*\(\s*\)', line): + WARN("YIELD", + "Using yield() is generally wrong. 
See yield() kernel-doc= (sched/core.c)\n" + herecurr) + + # __FUNCTION__ + if line and re.search(r'\b__FUNCTION__\b', line): + WARN("USE_FUNC", + "__func__ should be used instead of gcc specific __FUNCTI= ON__\n" + herecurr) + + # __DATE__, __TIME__, __TIMESTAMP__ + if line: + for m in re.finditer(r'\b(__(?:DATE|TIME|TIMESTAMP)__)\b', lin= e): + ERROR("DATE_TIME", + f"Use of the '{m.group(1)}' macro makes the build no= n-deterministic\n" + herecurr) + + # #if 0 + if line and re.match(r'^.\s*\#\s*if\s+0\b', line): + WARN("IF_0", + "Consider removing the code enclosed by this #if 0 and it= s #endif\n" + herecurr) + + # #if 1 + if line and re.match(r'^.\s*\#\s*if\s+1\b', line): + WARN("IF_1", + "Consider removing the #if 1 and its #endif\n" + herecurr) + + # sizeof without parenthesis + if line: + m =3D re.search(r'\bsizeof\s+((?:\*\s*|)' + Lval + r'|' + Type= + r'(?:\s+' + Lval + r'|))', line) + if m: + WARN("SIZEOF_PARENTHESIS", + f"sizeof {m.group(1)} should be sizeof({m.group(1).st= rip()})\n" + herecurr) + + # sizeof(&) + if line and re.search(r'\bsizeof\s*\(\s*&', line): + WARN("SIZEOF_ADDRESS", "sizeof(& should be avoided\n" + herecu= rr) + + # spinlock_t + if line and re.match(r'^.\s*\bstruct\s+spinlock\s+\w+\s*;', line): + WARN("USE_SPINLOCK_T", + "struct spinlock should be spinlock_t\n" + herecurr) + + # new typedefs + if (line and re.search(r'\btypedef\s', line) and + not re.search(r'\btypedef\s+' + Type + r'\s*\(\s*\*?' 
+ Ident = + r'\s*\)\s*\(', line) and + not re.search(r'\btypedef\s+' + Type + r'\s+' + Ident + r'\s*\= (', line) and + not re.search(r'\b' + typeTypedefs + r'\b', line) and + not re.search(r'\b__bitwise\b', line)): + WARN("NEW_TYPEDEFS", "do not add new typedefs\n" + herecurr) + + # in_atomic + if line and re.search(r'\bin_atomic\s*\(', line): + if re.match(r'^drivers/', realfile): + ERROR("IN_ATOMIC", "do not use in_atomic in drivers\n" + h= erecurr) + elif not re.match(r'^kernel/', realfile): + WARN("IN_ATOMIC", + "use of in_atomic() is incorrect outside core kernel = code\n" + herecurr) + + # NR_CPUS + if (line and re.search(r'\bNR_CPUS\b', line) and + not re.search(r'^\s*#\s*if\b.*\bNR_CPUS\b', line) and + not re.search(r'^\s*#\s*define\b.*\bNR_CPUS\b', line) and + not re.search(r'\bNR_CPUS[^\]]*\]', line)): + WARN("NR_CPUS", + "usage of NR_CPUS is often wrong - consider using cpu_pos= sible(), num_possible_cpus(), for_each_possible_cpu(), etc\n" + herecurr) + + # deprecated APIs + if line: + m =3D re.search(r'\b(' + deprecated_apis_search + r')\b\s*\(',= line) + if m: + api =3D m.group(1) + new_api =3D deprecated_apis.get(api, '') + WARN("DEPRECATED_API", + f"Deprecated use of '{api}', prefer '{new_api}' inste= ad\n" + herecurr) + + # const_structs + if (const_structs and line and + not re.search(r'\bconst\b', line) and + re.search(r'\bstruct\s+(' + const_structs + r')\b(?!\s*\{)', l= ine)): + m =3D re.search(r'\bstruct\s+(' + const_structs + r')\b', line) + WARN("CONST_STRUCT", + f"struct {m.group(1)} should normally be const\n" + herec= urr) + + # multiple semicolons + if line and re.search(r';\s*;\s*$', line): + WARN("ONE_SEMICOLON", + "Statements terminations use 1 semicolon\n" + herecurr) + + # spaces between function name and ( + if line: + for m in re.finditer(r'(' + Ident + r')\s+\(', line): + name =3D m.group(1) + if name in ('if', 'for', 'while', 'switch', 'return', 'cas= e', + 'volatile', '__volatile__', '__attribute__', '= format', + 
'__extension__', 'asm', '__asm__', 'scoped_gua= rd'): + continue + ctx_before =3D line[:m.start()] + if re.match(r'^.\s*#\s*define\s*$', ctx_before): + continue + if re.match(r'^.\s*#\s*elif\s*$', ctx_before + name): + continue + WARN("SPACING", + f"space prohibited between function name and open par= enthesis '('\n" + herecurr) + + # space before open brace + if line and (re.search(r'\(.*\)\{', line) and not re.search(r'\(' = + Type + r'\)\{', line)): + ERROR("SPACING", + "space required before the open brace '{'\n" + herecurr) + if line and re.search(r'\b(?:else|do)\{', line): + ERROR("SPACING", + "space required before the open brace '{'\n" + herecurr) + + # space after close brace + if line and re.search(r'\}(?!(?:,|;|\)|\}))\S', line): + ERROR("SPACING", + "space required after that close brace '}'\n" + herecurr) + + # Need space before ( after if/while etc + if line and re.search(r'\b(if|while|for|switch)\(', line): + ERROR("SPACING", + "space required before the open parenthesis '('\n" + her= ecurr) + + # return errno should be negative + if sline: + m =3D re.search(r'\breturn(?:\s*\(+\s*|\s+)(E[A-Z]+)(?:\s*\)+\= s*|\s*)[;:,]', sline) + if m: + name =3D m.group(1) + if name not in ('EOF', 'ERROR') and not name.startswith('E= POLL'): + WARN("USE_NEGATIVE_ERRNO", + f"return of an errno should typically be negative= (ie: return -{name})\n" + herecurr) + + # Malformed #include + if rawline: + m =3D re.match(r'^.\s*#\s*include\s+[<"](.*)[">]', rawline) + if m: + if '//' in m.group(1): + ERROR("MALFORMED_INCLUDE", + "malformed #include filename\n" + herecurr) + + # CamelCase (simplified) + if line and line.startswith('+'): + for m in re.finditer(r'\b(' + Ident + r')\b', line): + word =3D m.group(1) + if (re.search(r'[A-Z][a-z]|[a-z][A-Z]', word) and + word !=3D '_Generic' and + not re.match(r'^(?:[A-Z]+_){1,5}[A-Z]{1,3}[a-z]', word= ) and + not re.match(r'^(?:Clear|Set|TestClear|TestSet|)Page[A= -Z]', word) and + not re.match(r'^ETHTOOL_LINK_MODE_', word)): + if 
word not in camelcase: + camelcase[word] =3D True + CHK("CAMELCASE", f"Avoid CamelCase: <{word}>\n" + = herecurr) + + # Embedded filename + if rawline and realfile and re.search(r'^\+.*\b' + re.escape(realf= ile) + r'\b', rawline): + WARN("EMBEDDED_FILENAME", + "It's generally not useful to have the filename in the fi= le\n" + herecurr) + + # FSF mailing address + if rawline and (re.search(r'\bwrite to the Free', rawline, re.IGNO= RECASE) or + re.search(r'\b675\s+Mass\s+Ave', rawline, re.IGNOR= ECASE) or + re.search(r'\b59\s+Temple\s+Pl', rawline, re.IGNOR= ECASE) or + re.search(r'\b51\s+Franklin\s+St', rawline, re.IGN= ORECASE)): + msg_level =3D CHK if file_mode else ERROR + msg_level("FSF_MAILING_ADDRESS", + "Do not include the paragraph about writing to the F= ree Software Foundation's mailing address from the sample GPL notice. The F= SF has changed addresses in the past, and may do so again. Linux already in= cludes a copy of the GPL.\n" + f"{here}\n{cat_vet(rawline)}\n") + + # Spelling check + if (misspellings and line and + (in_commit_log or line.startswith('+') or re.match(r'^Subject:= ', line, re.IGNORECASE))): + for m in re.finditer(r'(?:^|[^\w\-\'`])(' + misspellings + r')= (?:[^\w\-\'`]|$)', rawline, re.IGNORECASE): + typo =3D m.group(1) + typo_fix_val =3D spelling_fix.get(typo.lower(), '') + if typo_fix_val: + if typo[0].isupper(): + typo_fix_val =3D typo_fix_val[0].upper() + typo_fi= x_val[1:] + if typo.isupper(): + typo_fix_val =3D typo_fix_val.upper() + msg_level =3D CHK if file_mode else WARN + msg_level("TYPO_SPELLING", + f"'{typo}' may be misspelled - perhaps '{typ= o_fix_val}'?\n" + herecurr) + + # ---- End of file checks ---- + + if not rawlines: + return 0 + + if mailback and (clean =3D=3D 1 or not is_patch): + return 0 + + if not chk_patch and not is_patch: + return 0 + + if not is_patch and not re.search(r'cover-letter\.patch$', filename): + ERROR("NOT_UNIFIED_DIFF", + "Does not appear to be a unified-diff format patch\n") + + if is_patch 
and has_commit_log and chk_fixes_tag: + if needs_fixes_tag and not is_revert and not fixes_tag: + WARN("MISSING_FIXES_TAG", + f"The commit message has '{needs_fixes_tag}', perhaps it = also needs a 'Fixes:' tag?\n") + + if is_patch and has_commit_log and chk_signoff: + if signoff =3D=3D 0: + ERROR("MISSING_SIGN_OFF", + "Missing Signed-off-by: line(s)\n") + + # Print report + output =3D ''.join(report_list) + if output: + sys.stdout.write(output) + + if summary and not (clean =3D=3D 1 and quiet >=3D 1): + if summary_file: + sys.stdout.write(f"{filename} ") + chk_str =3D f"{cnt_chk} checks, " if check else "" + print(f"total: {cnt_error} errors, {cnt_warn} warnings, {chk_str}{= cnt_lines} lines checked") + + if quiet =3D=3D 0: + if not clean and not fix: + print("\nNOTE: For some of the reported defects, checkpatch ma= y be able to\n" + " mechanically convert to the typical style using -= -fix or --fix-inplace.") + if rpt_cleaners: + print("\nNOTE: Whitespace errors detected.\n" + " You may wish to use scripts/cleanpatch or scripts= /cleanfile") + + if clean =3D=3D 0 and fix and fixed !=3D rawlines: + newfile =3D filename + if not fix_inplace: + newfile +=3D '.EXPERIMENTAL-checkpatch-fixes' + try: + with open(newfile, 'w', encoding=3D'utf-8') as f: + linecount =3D 0 + for fixed_line in fixed: + linecount +=3D 1 + if file_mode: + if linecount > 3: + f.write(re.sub(r'^\+', '', fixed_line) + '\n') + else: + f.write(fixed_line + '\n') + if not quiet: + print(f"\nWrote EXPERIMENTAL --fix correction(s) to '{newf= ile}'\n\n" + "Do _NOT_ trust the results written to this file.\n" + "Do _NOT_ submit these changes without inspecting th= em for correctness.\n\n" + "This EXPERIMENTAL file is simply a convenience to h= elp rewrite patches.\n" + "No warranties, expressed or implied...") + except IOError as e: + print(f"{P}: Can't open {newfile} for write: {e}", file=3Dsys.= stderr) + + if quiet =3D=3D 0: + print() + if clean =3D=3D 1: + print(f"{vname} has no obvious style 
problems and is ready for= submission.") + else: + print(f"{vname} has style problems, please review.") + + return clean + +# ---- CLI + main ---- + +color_enabled =3D False + +def hash_save_array_words(hash_ref, array_ref): + for item in array_ref: + for word in item.split(','): + word =3D word.strip().upper() + if word and not word.startswith('#'): + hash_ref[word] =3D hash_ref.get(word, 0) + 1 + +def main(): + global P, quiet, verbose, tree, chk_signoff, chk_fixes_tag, chk_patch + global tst_only, emacs, terse, showfile, file_mode, git_mode + global check, check_orig, summary, mailback, summary_file + global show_types, list_types, fix, fix_inplace, root, debug + global use_type, ignore_type, max_line_length, min_conf_desc_length + global codespell, codespellfile, user_codespellfile, typedefsfile + global color, allow_c99_comments, tabsize, CONFIG_ + global color_enabled, rawlines, lines, vname, gitroot + global use_list, ignore_list + + # Load config file + conf =3D which_conf(configuration_file) + conf_args =3D [] + if conf and os.path.isfile(conf): + try: + with open(conf, 'r', encoding=3D'utf-8') as f: + for line in f: + line =3D line.strip() + line =3D re.sub(r'\s+', ' ', line) + if not line or line.startswith('#'): + continue + for word in line.split(): + if word.startswith('#'): + break + conf_args.append(word) + except IOError: + pass + + parser =3D argparse.ArgumentParser(add_help=3DFalse) + parser.add_argument('-q', '--quiet', action=3D'count', default=3D0) + parser.add_argument('-v', '--verbose', action=3D'store_true', default= =3DFalse) + parser.add_argument('--tree', action=3D'store_true', default=3DTrue) + parser.add_argument('--no-tree', dest=3D'tree', action=3D'store_false') + parser.add_argument('--signoff', action=3D'store_true', default=3DTrue) + parser.add_argument('--no-signoff', dest=3D'signoff', action=3D'store_= false') + parser.add_argument('--fixes-tag', action=3D'store_true', default=3DTr= ue) + parser.add_argument('--no-fixes-tag', 
dest=3D'fixes_tag', action=3D'st= ore_false') + parser.add_argument('--patch', action=3D'store_true', default=3DTrue) + parser.add_argument('--no-patch', dest=3D'patch', action=3D'store_fals= e') + parser.add_argument('--emacs', action=3D'store_true', default=3DFalse) + parser.add_argument('--terse', action=3D'store_true', default=3DFalse) + parser.add_argument('--showfile', action=3D'store_true', default=3DFal= se) + parser.add_argument('-f', '--file', dest=3D'file_mode', action=3D'stor= e_true', default=3DFalse) + parser.add_argument('-g', '--git', dest=3D'git_mode', action=3D'store_= true', default=3DFalse) + parser.add_argument('--subjective', '--strict', dest=3D'strict', actio= n=3D'store_true', default=3DFalse) + parser.add_argument('--list-types', action=3D'store_true', default=3DF= alse) + parser.add_argument('--types', action=3D'append', default=3D[]) + parser.add_argument('--ignore', action=3D'append', default=3D[]) + parser.add_argument('--show-types', action=3D'store_true', default=3DF= alse) + parser.add_argument('--max-line-length', type=3Dint, default=3D100) + parser.add_argument('--min-conf-desc-length', type=3Dint, default=3D4) + parser.add_argument('--tab-size', type=3Dint, default=3D8) + parser.add_argument('--root', default=3DNone) + parser.add_argument('--summary', action=3D'store_true', default=3DTrue) + parser.add_argument('--no-summary', dest=3D'summary', action=3D'store_= false') + parser.add_argument('--mailback', action=3D'store_true', default=3DFal= se) + parser.add_argument('--summary-file', action=3D'store_true', default= =3DFalse) + parser.add_argument('--fix', action=3D'store_true', default=3DFalse) + parser.add_argument('--fix-inplace', action=3D'store_true', default=3D= False) + parser.add_argument('--debug', action=3D'append', default=3D[]) + parser.add_argument('--test-only', default=3DNone) + parser.add_argument('--codespell', action=3D'store_true', default=3DFa= lse) + parser.add_argument('--codespellfile', default=3D'') + 
parser.add_argument('--typedefsfile', default=3DNone) + parser.add_argument('--color', nargs=3D'?', const=3D'auto', default=3D= 'auto') + parser.add_argument('--no-color', dest=3D'color', action=3D'store_cons= t', const=3D'never') + parser.add_argument('--nocolor', dest=3D'color', action=3D'store_const= ', const=3D'never') + parser.add_argument('--kconfig-prefix', default=3D'CONFIG_') + parser.add_argument('-h', '--help', '--version', action=3D'store_true'= , dest=3D'show_help', default=3DFalse) + parser.add_argument('files', nargs=3D'*') + + args =3D parser.parse_args(conf_args + sys.argv[1:]) + + if args.show_help: + print(f"Usage: {P} [OPTION]... [FILE]...") + print(f"Version: {V}") + print(""" +Options: + -q, --quiet quiet + -v, --verbose verbose mode + --no-tree run without a kernel tree + --no-signoff do not check for 'Signed-off-by' line + --patch treat FILE as patchfile (default) + --emacs emacs compile window format + --terse one line per report + -f, --file treat FILE as regular source file + -g, --git treat FILE as a single commit or git revision= range + --subjective, --strict enable more subjective tests + --list-types list the possible message types + --types TYPE(,TYPE2...) show only these comma separated message types + --ignore TYPE(,TYPE2...) 
ignore various comma separated message types + --show-types show the specific message type in the output + --max-line-length=3Dn set the maximum line length (default 100) + --root=3DPATH PATH to the kernel tree root + --fix EXPERIMENTAL - create fix file + --fix-inplace EXPERIMENTAL - fix in place + --codespell Use the codespell dictionary + --color[=3DWHEN] Use colors 'always', 'never', or 'auto' (de= fault) + -h, --help, --version display this help and exit""") + sys.exit(0) + + quiet =3D args.quiet + verbose =3D args.verbose + tree =3D args.tree + chk_signoff =3D args.signoff + chk_fixes_tag =3D args.fixes_tag + chk_patch =3D args.patch + emacs =3D args.emacs + terse =3D args.terse + showfile =3D args.showfile + file_mode =3D args.file_mode + git_mode =3D args.git_mode + check =3D args.strict + summary =3D args.summary + mailback =3D args.mailback + summary_file =3D args.summary_file + fix =3D args.fix + fix_inplace =3D args.fix_inplace + show_types =3D args.show_types + list_types =3D args.list_types + max_line_length =3D args.max_line_length + min_conf_desc_length =3D args.min_conf_desc_length + tabsize =3D args.tab_size + root =3D args.root + tst_only =3D args.test_only + codespell =3D args.codespell + typedefsfile =3D args.typedefsfile + CONFIG_ =3D args.kconfig_prefix + color =3D args.color + + if args.codespellfile: + codespellfile =3D args.codespellfile + user_codespellfile =3D args.codespellfile + + # Process debug + for d in args.debug: + if '=3D' in d: + k, v =3D d.split('=3D', 1) + debug[k] =3D v + + # Process types/ignore + hash_save_array_words(ignore_type, args.ignore) + hash_save_array_words(use_type, args.types) + + # Color setup + if color in ('0', '1'): + color_enabled =3D (color =3D=3D '0') # inverted like Perl + elif color.lower() =3D=3D 'always': + color_enabled =3D True + elif color.lower() =3D=3D 'never': + color_enabled =3D False + elif color.lower() =3D=3D 'auto': + color_enabled =3D sys.stdout.isatty() + else: + print(f"{P}: Invalid 
color mode: {color}", file=3Dsys.stderr) + sys.exit(1) + + if verbose: + load_docs() + if list_types: + do_list_types() + sys.exit(0) + + if fix_inplace: + fix =3D True + check_orig =3D check + + if git_mode and (file_mode or fix): + print(f"{P}: --git cannot be used with --file or --fix", file=3Dsy= s.stderr) + sys.exit(1) + if verbose and terse: + print(f"{P}: --verbose cannot be used with --terse", file=3Dsys.st= derr) + sys.exit(1) + + if terse: + emacs =3D True + quiet +=3D 1 + + if tabsize < 2: + print(f"{P}: Invalid TAB size: {tabsize}", file=3Dsys.stderr) + sys.exit(1) + + # Find kernel tree root + if tree: + if root: + if not top_of_kernel_tree(root): + print(f"{P}: {root}: --root does not point at a valid tree= ", file=3Dsys.stderr) + sys.exit(1) + else: + if top_of_kernel_tree('.'): + root =3D '.' + else: + # Try to find root from script location + script_dir =3D os.path.dirname(os.path.abspath(sys.argv[0]= )) + parent =3D os.path.dirname(script_dir) + if top_of_kernel_tree(parent): + root =3D parent + + if not root: + print("Must be run from the top-level dir. of a kernel tree", = file=3Dsys.stderr) + sys.exit(2) + + if file_mode: + chk_signoff =3D False + chk_fixes_tag =3D False + + allow_c99_comments =3D 'C99_COMMENT_TOLERANCE' not in ignore_type + + # Load spelling data + load_spelling() + load_const_structs() + + # Handle input files + input_files =3D args.files if args.files else ['-'] + + # Handle git mode + if git_mode: + if not os.path.exists(gitroot): + print(f"{P}: No git repository found", file=3Dsys.stderr) + sys.exit(1) + commits =3D [] + for commit_expr in input_files: + if re.match(r'^(.*)-(\d+)$', commit_expr): + m =3D re.match(r'^(.*)-(\d+)$', commit_expr) + git_range =3D f"-{m.group(2)} {m.group(1)}" + elif '..' 
in commit_expr: + git_range =3D commit_expr + else: + git_range =3D f"-1 {commit_expr}" + try: + output =3D subprocess.run( + f"{git_command} log --no-color --no-merges --pretty=3D= format:'%H %s' {git_range}", + shell=3DTrue, capture_output=3DTrue, text=3DTrue).stdo= ut + for line in output.split('\n'): + m =3D re.match(r'^([0-9a-fA-F]{40}) (.*)$', line) + if m: + sha1 =3D m.group(1) + subject =3D m.group(2) + commits.insert(0, sha1) + git_commits[sha1] =3D subject + except Exception: + pass + + if not commits: + print(f"{P}: no git commits after extraction!", file=3Dsys.std= err) + sys.exit(1) + input_files =3D commits + + exit_code =3D 0 + + for filename in input_files: + rawlines =3D [] + is_git_file =3D git_is_single_file(filename) + old_file_mode =3D file_mode + if is_git_file: + file_mode =3D True + + if git_mode: + try: + proc =3D subprocess.run(f"git format-patch -M --stdout -1 = {filename}", + shell=3DTrue, capture_output=3DTrue,= text=3DTrue) + rawlines =3D proc.stdout.rstrip('\n').split('\n') + except Exception as e: + print(f"{P}: {filename}: git format-patch failed - {e}", f= ile=3Dsys.stderr) + sys.exit(1) + elif file_mode: + try: + proc =3D subprocess.run(f"diff -u /dev/null {filename}", + shell=3DTrue, capture_output=3DTrue,= text=3DTrue) + rawlines =3D proc.stdout.rstrip('\n').split('\n') + except Exception as e: + print(f"{P}: {filename}: diff failed - {e}", file=3Dsys.st= derr) + sys.exit(1) + elif filename =3D=3D '-': + rawlines =3D [line.rstrip('\n') for line in sys.stdin] + else: + try: + with open(filename, 'r', encoding=3D'utf-8', errors=3D'rep= lace') as f: + rawlines =3D [line.rstrip('\n') for line in f] + except IOError as e: + print(f"{P}: {filename}: open failed - {e}", file=3Dsys.st= derr) + sys.exit(1) + + if filename =3D=3D '-': + vname =3D 'Your patch' + elif git_mode: + vname =3D f'Commit {filename[:12]} ("{git_commits.get(filename= , "")}")' + else: + vname =3D filename + + # Check Subject line for vname + for rl in rawlines: + m 
=3D re.match(r'^Subject:\s+(.+)', rl, re.IGNORECASE) + if m and filename =3D=3D '-': + vname =3D f'"{m.group(1)}"' + + if len(input_files) > 1 and quiet =3D=3D 0: + print('-' * len(vname)) + print(vname) + print('-' * len(vname)) + + if not process(filename): + exit_code =3D 1 + + rawlines =3D [] + lines =3D [] + fixed =3D [] + modifierListFile =3D [] + typeListFile =3D [] + build_types() + if is_git_file: + file_mode =3D old_file_mode + + if not quiet: + if use_type: + print("\nNOTE: Used message types: " + ' '.join(sorted(use_typ= e.keys()))) + if ignore_type: + print("\nNOTE: Ignored message types: " + ' '.join(sorted(igno= re_type.keys()))) + if exit_code: + print("\nNOTE: If any of the errors are false positives, pleas= e report\n" + " them to the maintainer, see CHECKPATCH in MAINTAI= NERS.") + + sys.exit(exit_code) + +if __name__ =3D=3D '__main__': + main() --=20 2.52.0