From nobody Wed Dec 17 05:33:22 2025
From: Mauro Carvalho Chehab
To: Linux Doc Mailing List, Jonathan Corbet
Cc: Mauro Carvalho Chehab, linux-kernel@vger.kernel.org
Subject: [PATCH v3 01/33] scripts/kernel-doc: rename it to scripts/kernel-doc.pl
Date: Tue, 8 Apr 2025 18:09:04 +0800

In preparation for deprecating scripts/kernel-doc in favor of a
new version written in Python, rename it to scripts/kernel-doc.pl.

Signed-off-by: Mauro Carvalho Chehab
---
 scripts/{kernel-doc => kernel-doc.pl} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename scripts/{kernel-doc => kernel-doc.pl} (100%)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc.pl
similarity index 100%
rename from scripts/kernel-doc
rename to scripts/kernel-doc.pl
-- 
2.49.0

From nobody Wed Dec 17 05:33:22 2025
From: Mauro Carvalho Chehab
To: Linux Doc Mailing List, Jonathan Corbet
Cc: Mauro Carvalho Chehab, linux-kernel@vger.kernel.org
Subject: [PATCH v3 02/33] scripts/kernel-doc: add a symlink to the Perl version of kernel-doc
Date: Tue, 8 Apr 2025 18:09:05 +0800

Preserve the kernel-doc name, pointing it at the current Perl version.

Signed-off-by: Mauro Carvalho Chehab
---
 scripts/kernel-doc | 1 +
 1 file changed, 1 insertion(+)
 create mode 120000 scripts/kernel-doc

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
new file mode 120000
index 000000000000..f175155c1e66
--- /dev/null
+++ b/scripts/kernel-doc
@@ -0,0 +1 @@
+kernel-doc.pl
\ No newline at end of file
-- 
2.49.0

From nobody Wed Dec 17 05:33:22 2025
From: Mauro Carvalho Chehab
To: Linux Doc Mailing List, Jonathan Corbet
Cc: Mauro Carvalho Chehab, "Gustavo A. R. Silva", Kees Cook,
    linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 03/33] scripts/kernel-doc.py: add a Python parser
Date: Tue, 8 Apr 2025 18:09:06 +0800
Message-ID: <2fa671a9fb08d03a376a42d46cc0b1d3aab4ae3f.1744106241.git.mchehab+huawei@kernel.org>

Maintaining kernel-doc has been a challenge, as there aren't many
Perl developers among maintainers. Also, the logic there is too
complex. Having lots of global variables and using pure functions
doesn't help.

Rewrite the script in Python, placing most global variables
inside classes. This should help maintain the script in the long
term. It also allows better integration with the kernel-doc Sphinx
extension in the future.

I opted to keep this version as close as possible to what we
have already in Perl. There are some differences though:

1. There is one regular expression that required a rewrite:

	/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/

   This one uses two features that aren't available in the native
   Python regular expression module (re):

	- recursive patterns: (?1)
	- atomic grouping: (?>...)

   It was rewritten to use a much simpler regular expression:

	/\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;/

   Extra care should be taken when validating this script, as such
   a replacement might cause some regressions.

2. The filters are now applied only during output generation.
   In particular, the "nosymbol" argument is only handled there.

   It means that, if the same file is processed twice for
   different symbols, the warnings will be duplicated.

   I opted for this behavior as it allows the Sphinx extension
   to read the file(s) only once, and apply the filtering only
   when producing the ReST output. This will hopefully help to
   speed up doc generation.

3. This version can handle multiple files and multiple directories.
   So, if one just wants to produce a big output with everything
   inside a file, this could be done with:

	$ time ./scripts/kernel-doc.py -man . 2>/dev/null >new

	real	0m54.592s
	user	0m53.345s
	sys	0m0.997s

4. I tried to replicate as much as possible the same arguments
   from kernel-doc, with about the same behavior, for the
   command line parameters starting with a single dash (-parameter).

   I also added one-letter aliases for each parameter, and a
   --parameter form (sometimes with a better name).

5. There are some subtle differences in how Perl handles certain
   regular expressions. In particular, the qr operator, which
   compiles a regular expression, also works as a non-capturing
   group. It means that some regexes like this one:

	my $type1 = qr{[\w\s]+};

   need to be mapped as:

	type1 = r'(?:[\w\s]+)?'

Signed-off-by: Mauro Carvalho Chehab
---

TODO:

- in this RFC, the man output doesn't yet match the output of kernel-doc.
  The ReST output matches, except for some whitespace and suppressed
  empty sections;

- this version lacks support for -W parameters: it will just output
  all warnings;

- all classes are in the same file. During development it is easier to
  have everything in a single file, but I plan to split the classes
  into separate files for the final version, to help maintain the
  script.

Signed-off-by: Mauro Carvalho Chehab
---
 scripts/kernel-doc.py | 2832 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2832 insertions(+)
 create mode 100755 scripts/kernel-doc.py

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
new file mode 100755
index 000000000000..114f3699bf7c
--- /dev/null
+++ b/scripts/kernel-doc.py
@@ -0,0 +1,2832 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab.
+#
+# pylint: disable=R0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R0917,R1702
+# pylint: disable=C0302,C0103,C0301
+# pylint: disable=C0116,C0115,W0511,W0613
+#
+# Converted from the kernel-doc script originally written in Perl
+# under GPLv2, copyrighted since 1998 by the following authors:
+#
+#    Aditya Srivastava
+#    Akira Yokosawa
+#    Alexander A. Klimov
+#    Alexander Lobakin
+#    André Almeida
+#    Andy Shevchenko
+#    Anna-Maria Behnsen
+#    Armin Kuster
+#    Bart Van Assche
+#    Ben Hutchings
+#    Borislav Petkov
+#    Chen-Yu Tsai
+#    Coco Li
+#    Conchúr Navid
+#    Daniel Santos
+#    Danilo Cesar Lemes de Paula
+#    Dan Luedtke
+#    Donald Hunter
+#    Gabriel Krisman Bertazi
+#    Greg Kroah-Hartman
+#    Harvey Harrison
+#    Horia Geanta
+#    Ilya Dryomov
+#    Jakub Kicinski
+#    Jani Nikula
+#    Jason Baron
+#    Jason Gunthorpe
+#    Jérémy Bobbio
+#    Johannes Berg
+#    Johannes Weiner
+#    Jonathan Cameron
+#    Jonathan Corbet
+#    Jonathan Neuschäfer
+#    Kamil Rytarowski
+#    Kees Cook
+#    Laurent Pinchart
+#    Levin, Alexander (Sasha Levin)
+#    Linus Torvalds
+#    Lucas De Marchi
+#    Mark Rutland
+#    Markus Heiser
+#    Martin Waitz
+#    Masahiro Yamada
+#    Matthew Wilcox
+#    Mauro Carvalho Chehab
+#    Michal Wajdeczko
+#    Michael Zucchi
+#    Mike Rapoport
+#    Niklas Söderlund
+#    Nishanth Menon
+#    Paolo Bonzini
+#    Pavan Kumar Linga
+#    Pavel Pisa
+#    Peter Maydell
+#    Pierre-Louis Bossart
+#    Randy Dunlap
+#    Richard Kennedy
+#    Rich Walker
+#    Rolf Eike Beer
+#    Sakari Ailus
+#    Silvio Fricke
+#    Simon Huggins
+#    Tim Waugh
+#    Tomasz Warniełło
+#    Utkarsh Tripathi
+#    valdis.kletnieks@vt.edu
+#    Vegard Nossum
+#    Will Deacon
+#    Yacine Belkadi
+#    Yujie Liu
+
+# TODO: implement warning filtering
+
+"""
+kernel_doc
+==========
+
+Print formatted kernel documentation to stdout
+
+Read C language source or header FILEs, extract embedded
+documentation comments, and print formatted documentation
+to standard output.
+
+The documentation comments are identified by the "/**"
+opening comment mark.
+
+See Documentation/doc-guide/kernel-doc.rst for the
+documentation comment syntax.
+"""
+
+import argparse
+import logging
+import os
+import re
+import sys
+
+from datetime import datetime
+from pprint import pformat
+
+from dateutil import tz
+
+# Local cache for regular expressions
+re_cache = {}
+
+
+class Re:
+    """
+    Helper class to simplify regex declaration and usage.
+
+    It calls re.compile for a given pattern. It also allows adding
+    regular expressions and define sub at class init time.
+
+    Regular expressions can be cached via an argument, helping to speed up
+    searches.
+    """
+
+    def _add_regex(self, string, flags):
+        if string in re_cache:
+            self.regex = re_cache[string]
+        else:
+            self.regex = re.compile(string, flags=flags)
+
+            if self.cache:
+                re_cache[string] = self.regex
+
+    def __init__(self, string, cache=True, flags=0):
+        self.cache = cache
+        self.last_match = None
+
+        self._add_regex(string, flags)
+
+    def __str__(self):
+        return self.regex.pattern
+
+    def __add__(self, other):
+        return Re(str(self) + str(other), cache=self.cache or other.cache,
+                  flags=self.regex.flags | other.regex.flags)
+
+    def match(self, string):
+        self.last_match = self.regex.match(string)
+        return self.last_match
+
+    def search(self, string):
+        self.last_match = self.regex.search(string)
+        return self.last_match
+
+    def findall(self, string):
+        return self.regex.findall(string)
+
+    def split(self, string):
+        return self.regex.split(string)
+
+    def sub(self, sub, string, count=0):
+        return self.regex.sub(sub, string, count=count)
+
+    def group(self, num):
+        return self.last_match.group(num)
+
+#
+# Regular expressions used to parse kernel-doc markups at KernelDoc class.
+#
+# Let's declare them in lowercase outside any class to make it easier to
+# convert from the Perl script.
+#
+# As those are evaluated at the beginning, no need to cache them
+#
+
+
+# Allow whitespace at end of comment start.
+doc_start = Re(r'^/\*\*\s*$', cache=False)
+
+doc_end = Re(r'\*/', cache=False)
+doc_com = Re(r'\s*\*\s*', cache=False)
+doc_com_body = Re(r'\s*\* ?', cache=False)
+doc_decl = doc_com + Re(r'(\w+)', cache=False)
+
+# @params and a strictly limited set of supported section names
+# Specifically:
+#   Match @word:
+#         @...:
+#         @{section-name}:
+# while trying to not match literal block starts like "example::"
+#
+doc_sect = doc_com + \
+            Re(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?|examples?)\s*:([^:].*)?$',
+                flags=re.I, cache=False)
+
+doc_content = doc_com_body + Re(r'(.*)', cache=False)
+doc_block = doc_com + Re(r'DOC:\s*(.*)?', cache=False)
+doc_inline_start = Re(r'^\s*/\*\*\s*$', cache=False)
+doc_inline_sect = Re(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=False)
+doc_inline_end = Re(r'^\s*\*/\s*$', cache=False)
+doc_inline_oneline = Re(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cache=False)
+function_pointer = Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
+attribute = Re(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)",
+               flags=re.I | re.S, cache=False)
+
+# match expressions used to find embedded type information
+type_constant = Re(r"\b``([^\`]+)``\b", cache=False)
+type_constant2 = Re(r"\%([-_*\w]+)", cache=False)
+type_func = Re(r"(\w+)\(\)", cache=False)
+type_param = Re(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+type_param_ref = Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+
+# Special RST handling for func ptr params
+type_fp_param = Re(r"\@(\w+)\(\)", cache=False)
+
+# Special RST handling for structs with func ptr params
+type_fp_param2 = Re(r"\@(\w+->\S+)\(\)", cache=False)
+
+type_env = Re(r"(\$\w+)", cache=False)
+type_enum = Re(r"\&(enum\s*([_\w]+))", cache=False)
+type_struct = Re(r"\&(struct\s*([_\w]+))", cache=False)
+type_typedef = Re(r"\&(typedef\s*([_\w]+))", cache=False)
+type_union = Re(r"\&(union\s*([_\w]+))", cache=False)
+type_member = Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
+type_fallback = Re(r"\&([_\w]+)", cache=False)
+type_member_func = type_member + Re(r"\(\)", cache=False)
+
+export_symbol = Re(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cache=False)
+export_symbol_ns = Re(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"\S+"\)\s*', cache=False)
+
+class KernelDoc:
+    # Parser states
+    STATE_NORMAL = 0                # normal code
+    STATE_NAME = 1                  # looking for function name
+    STATE_BODY_MAYBE = 2            # body - or maybe more description
+    STATE_BODY = 3                  # the body of the comment
+    STATE_BODY_WITH_BLANK_LINE = 4  # the body which has a blank line
+    STATE_PROTO = 5                 # scanning prototype
+    STATE_DOCBLOCK = 6              # documentation block
+    STATE_INLINE = 7                # gathering doc outside main block
+
+    st_name = [
+        "NORMAL",
+        "NAME",
+        "BODY_MAYBE",
+        "BODY",
+        "BODY_WITH_BLANK_LINE",
+        "PROTO",
+        "DOCBLOCK",
+        "INLINE",
+    ]
+
+    # Inline documentation state
+    STATE_INLINE_NA = 0     # not applicable ($state != STATE_INLINE)
+    STATE_INLINE_NAME = 1   # looking for member name (@foo:)
+    STATE_INLINE_TEXT = 2   # looking for member documentation
+    STATE_INLINE_END = 3    # done
+    STATE_INLINE_ERROR = 4  # error - Comment without header was found.
+                            # Spit a warning as it's not
+                            # proper kernel-doc and ignore the rest.
+
+    st_inline_name = [
+        "",
+        "_NAME",
+        "_TEXT",
+        "_END",
+        "_ERROR",
+    ]
+
+    # Section names
+
+    section_default = "Description"  # default section
+    section_intro = "Introduction"
+    section_context = "Context"
+    section_return = "Return"
+
+    undescribed = "-- undescribed --"
+
+    def __init__(self, config, fname):
+        """Initialize internal variables"""
+
+        self.fname = fname
+        self.config = config
+
+        # Initial state for the state machines
+        self.state = self.STATE_NORMAL
+        self.inline_doc_state = self.STATE_INLINE_NA
+
+        # Store entry currently being processed
+        self.entry = None
+
+        # Place all potential outputs into an array
+        self.entries = []
+
+    def show_warnings(self, dtype, declaration_name):
+        # TODO: implement it
+
+        return True
+
+    # TODO: rename to emit_message
+    def emit_warning(self, ln, msg, warning=True):
+        """Emit a message"""
+
+        if warning:
+            self.config.log.warning("%s:%d %s", self.fname, ln, msg)
+        else:
+            self.config.log.info("%s:%d %s", self.fname, ln, msg)
+
+    def dump_section(self, start_new=True):
+        """
+        Dumps section contents to arrays/hashes intended for that purpose.
+        """
+
+        name = self.entry.section
+        contents = self.entry.contents
+
+        if type_param.match(name):
+            name = type_param.group(1)
+
+            self.entry.parameterdescs[name] = contents
+            self.entry.parameterdesc_start_lines[name] = self.entry.new_start_line
+
+            self.entry.sectcheck += name + " "
+            self.entry.new_start_line = 0
+
+        elif name == "@...":
+            name = "..."
+            self.entry.parameterdescs[name] = contents
+            self.entry.sectcheck += name + " "
+            self.entry.parameterdesc_start_lines[name] = self.entry.new_start_line
+            self.entry.new_start_line = 0
+
+        else:
+            if name in self.entry.sections and self.entry.sections[name] != "":
+                # Only warn on user-specified duplicate section names
+                if name != self.section_default:
+                    self.emit_warning(self.entry.new_start_line,
+                                      f"duplicate section name '{name}'\n")
+                self.entry.sections[name] += contents
+            else:
+                self.entry.sections[name] = contents
+                self.entry.sectionlist.append(name)
+                self.entry.section_start_lines[name] = self.entry.new_start_line
+                self.entry.new_start_line = 0
+
+#        self.config.log.debug("Section: %s : %s", name, pformat(vars(self.entry)))
+
+        if start_new:
+            self.entry.section = self.section_default
+            self.entry.contents = ""
+
+    # TODO: rename it to store_declaration
+    def output_declaration(self, dtype, name, **args):
+        """
+        Stores the entry into an entry array.
+
+        The actual output and output filters will be handled elsewhere
+        """
+
+        # The implementation here is different than the original kernel-doc:
+        # instead of checking for output filters or actually output anything,
+        # it just stores the declaration content at self.entries, as the
+        # output will happen on a separate class.
+        #
+        # For now, we're keeping the same name of the function just to make
+        # easier to compare the source code of both scripts
+
+        if "declaration_start_line" not in args:
+            args["declaration_start_line"] = self.entry.declaration_start_line
+
+        args["type"] = dtype
+
+        self.entries.append((name, args))
+
+        self.config.log.debug("Output: %s:%s = %s", dtype, name, pformat(args))
+
+    def reset_state(self, ln):
+        """
+        Ancillary routine to create a new entry. It initializes all
+        variables used by the state machine.
+        """
+
+        self.entry = argparse.Namespace
+
+        self.entry.contents = ""
+        self.entry.function = ""
+        self.entry.sectcheck = ""
+        self.entry.struct_actual = ""
+        self.entry.prototype = ""
+
+        self.entry.parameterlist = []
+        self.entry.parameterdescs = {}
+        self.entry.parametertypes = {}
+        self.entry.parameterdesc_start_lines = {}
+
+        self.entry.section_start_lines = {}
+        self.entry.sectionlist = []
+        self.entry.sections = {}
+
+        self.entry.anon_struct_union = False
+
+        self.entry.leading_space = None
+
+        # State flags
+        self.state = self.STATE_NORMAL
+        self.inline_doc_state = self.STATE_INLINE_NA
+        self.entry.brcount = 0
+
+        self.entry.in_doc_sect = False
+        self.entry.declaration_start_line = ln
+
+    def push_parameter(self, ln, decl_type, param, dtype,
+                       org_arg, declaration_name):
+        if self.entry.anon_struct_union and dtype == "" and param == "}":
+            return  # Ignore the ending }; from anonymous struct/union
+
+        self.entry.anon_struct_union = False
+
+        param = Re(r'[\[\)].*').sub('', param, count=1)
+
+        if dtype == "" and param.endswith("..."):
+            if Re(r'\w\.\.\.$').search(param):
+                # For named variable parameters of the form `x...`,
+                # remove the dots
+                param = param[:-3]
+            else:
+                # Handles unnamed variable parameters
+                param = "..."
+
+            if param not in self.entry.parameterdescs or \
+                not self.entry.parameterdescs[param]:
+
+                self.entry.parameterdescs[param] = "variable arguments"
+
+        elif dtype == "" and (not param or param == "void"):
+            param = "void"
+            self.entry.parameterdescs[param] = "no arguments"
+
+        elif dtype == "" and param in ["struct", "union"]:
+            # Handle unnamed (anonymous) union or struct
+            dtype = param
+            param = "{unnamed_" + param + "}"
+            self.entry.parameterdescs[param] = "anonymous\n"
+            self.entry.anon_struct_union = True
+
+        # Handle cache group enforcing variables: they do not need
+        # to be described in header files
+        elif "__cacheline_group" in param:
+            # Ignore __cacheline_group_begin and __cacheline_group_end
+            return
+
+        # Warn if parameter has no description
+        # (but ignore ones starting with # as these are not parameters
+        # but inline preprocessor statements)
+        if param not in self.entry.parameterdescs and not param.startswith("#"):
+            self.entry.parameterdescs[param] = self.undescribed
+
+            if self.show_warnings(dtype, declaration_name) and "." not in param:
+                if decl_type == 'function':
+                    dname = f"{decl_type} parameter"
+                else:
+                    dname = f"{decl_type} member"
+
+                self.emit_warning(ln,
+                                  f"{dname} '{param}' not described in '{declaration_name}'")
+
+        # Strip spaces from param so that it is one continuous string on
+        # parameterlist. This fixes a problem where check_sections()
+        # cannot find a parameter like "addr[6 + 2]" because it actually
+        # appears as "addr[6", "+", "2]" on the parameter list.
+        # However, it's better to maintain the param string unchanged for
+        # output, so just weaken the string compare in check_sections()
+        # to ignore "[blah" in a parameter string.
+
+        self.entry.parameterlist.append(param)
+        org_arg = Re(r'\s\s+').sub(' ', org_arg, count=1)
+        self.entry.parametertypes[param] = org_arg
+
+    def save_struct_actual(self, actual):
+        """
+        Strip all spaces from the actual param so that it looks like
+        one string item.
+        """
+
+        actual = Re(r'\s*').sub("", actual, count=1)
+
+        self.entry.struct_actual += actual + " "
+
+    def create_parameter_list(self, ln, decl_type, args, splitter, declaration_name):
+
+        # temporarily replace all commas inside function pointer definition
+        arg_expr = Re(r'(\([^\),]+),')
+        while arg_expr.search(args):
+            args = arg_expr.sub(r"\1#", args)
+
+        for arg in args.split(splitter):
+            # Strip comments
+            arg = Re(r'\/\*.*\*\/').sub('', arg)
+
+            # Ignore argument attributes
+            arg = Re(r'\sPOS0?\s').sub(' ', arg)
+
+            # Strip leading/trailing spaces
+            arg = arg.strip()
+            arg = Re(r'\s+').sub(' ', arg, count=1)
+
+            if arg.startswith('#'):
+                # Treat preprocessor directive as a typeless variable just to fill
+                # corresponding data structures "correctly". Catch it later in
+                # output_* subs.
+
+                # Treat preprocessor directive as a typeless variable
+                self.push_parameter(ln, decl_type, arg, "",
+                                    "", declaration_name)
+
+            elif Re(r'\(.+\)\s*\(').search(arg):
+                # Pointer-to-function
+
+                arg = arg.replace('#', ',')
+
+                r = Re(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)')
+                if r.match(arg):
+                    param = r.group(1)
+                else:
+                    self.emit_warning(ln, f"Invalid param: {arg}")
+                    param = arg
+
+                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+                self.save_struct_actual(param)
+                self.push_parameter(ln, decl_type, param, dtype,
+                                    arg, declaration_name)
+
+            elif Re(r'\(.+\)\s*\[').search(arg):
+                # Array-of-pointers
+
+                arg = arg.replace('#', ',')
+                r = Re(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)')
+                if r.match(arg):
+                    param = r.group(1)
+                else:
+                    self.emit_warning(ln, f"Invalid param: {arg}")
+                    param = arg
+
+                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+
+                self.save_struct_actual(param)
+                self.push_parameter(ln, decl_type, param, dtype,
+                                    arg, declaration_name)
+
+            elif arg:
+                arg = Re(r'\s*:\s*').sub(":", arg)
+                arg = Re(r'\s*\[').sub('[', arg)
+
+                args = Re(r'\s*,\s*').split(arg)
+                if args[0] and '*' in args[0]:
+                    args[0] = re.sub(r'(\*+)\s*', r' \1', args[0])
+
+                first_arg = []
+                r = Re(r'^(.*\s+)(.*?\[.*\].*)$')
+                if args[0] and r.match(args[0]):
+                    args.pop(0)
+                    first_arg.extend(r.group(1))
+                    first_arg.append(r.group(2))
+                else:
+                    first_arg = Re(r'\s+').split(args.pop(0))
+
+                args.insert(0, first_arg.pop())
+                dtype = ' '.join(first_arg)
+
+                for param in args:
+                    if Re(r'^(\*+)\s*(.*)').match(param):
+                        r = Re(r'^(\*+)\s*(.*)')
+                        if not r.match(param):
+                            self.emit_warning(ln, f"Invalid param: {param}")
+                            continue
+
+                        param = r.group(1)
+
+                        self.save_struct_actual(r.group(2))
+                        self.push_parameter(ln, decl_type, r.group(2),
+                                            f"{dtype} {r.group(1)}",
+                                            arg, declaration_name)
+
+                    elif Re(r'(.*?):(\w+)').search(param):
+                        r = Re(r'(.*?):(\w+)')
+                        if not r.match(param):
+                            self.emit_warning(ln, f"Invalid param: {param}")
+                            continue
+
+                        if dtype != "":  # Skip unnamed bit-fields
+                            self.save_struct_actual(r.group(1))
+                            self.push_parameter(ln, decl_type, r.group(1),
+                                                f"{dtype}:{r.group(2)}",
+                                                arg, declaration_name)
+                    else:
+                        self.save_struct_actual(param)
+                        self.push_parameter(ln, decl_type, param, dtype,
+                                            arg, declaration_name)
+
+    def check_sections(self, ln, decl_name, decl_type, sectcheck, prmscheck):
+        sects = sectcheck.split()
+        prms = prmscheck.split()
+        err = False
+
+        for sx in range(len(sects)):                  # pylint: disable=C0200
+            err = True
+            for px in range(len(prms)):               # pylint: disable=C0200
+                prm_clean = prms[px]
+                prm_clean = Re(r'\[.*\]').sub('', prm_clean)
+                prm_clean = attribute.sub('', prm_clean)
+
+                # ignore array size in a parameter string;
+                # however, the original param string may contain
+                # spaces, e.g.:  addr[6 + 2]
+                # and this appears in @prms as "addr[6" since the
+                # parameter list is split at spaces;
+                # hence just ignore "[..." for the sections check;
+                prm_clean = Re(r'\[.*').sub('', prm_clean)
+
+                if prm_clean == sects[sx]:
+                    err = False
+                    break
+
+            if err:
+                if decl_type == 'function':
+                    dname = f"{decl_type} parameter"
+                else:
+                    dname = f"{decl_type} member"
+
+                self.emit_warning(ln,
+                                  f"Excess {dname} '{sects[sx]}' description in '{decl_name}'")
+
+    def check_return_section(self, ln, declaration_name, return_type):
+
+        if not self.config.wreturn:
+            return
+
+        # Ignore an empty return type (It's a macro)
+        # Ignore functions with a "void" return type (but not "void *")
+        if not return_type or Re(r'void\s*\w*\s*$').search(return_type):
+            return
+
+        if not self.entry.sections.get("Return", None):
+            self.emit_warning(ln,
+                              f"No description found for return value of '{declaration_name}'")
+
+    def dump_struct(self, ln, proto):
+        """
+        Store an entry for a struct or union
+        """
+
+        type_pattern = r'(struct|union)'
+
+        qualifiers = [
+            "__attribute__",
+            "__packed",
+            "__aligned",
+            "____cacheline_aligned_in_smp",
+            "____cacheline_aligned",
+        ]
+
+        definition_body = r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) + ")?"
+        struct_members = Re(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\})([^\{\}\;]*)(\;)')
+
+        # Extract struct/union definition
+        members = None
+        declaration_name = None
+        decl_type = None
+
+        r = Re(type_pattern + r'\s+(\w+)\s*' + definition_body)
+        if r.search(proto):
+            decl_type = r.group(1)
+            declaration_name = r.group(2)
+            members = r.group(3)
+        else:
+            r = Re(r'typedef\s+' + type_pattern + r'\s*' + definition_body + r'\s*(\w+)\s*;')
+
+            if r.search(proto):
+                decl_type = r.group(1)
+                declaration_name = r.group(3)
+                members = r.group(2)
+
+        if not members:
+            self.emit_warning(ln, f"{proto} error: Cannot parse struct or union!")
+            self.config.errors += 1
+            return
+
+        if self.entry.identifier != declaration_name:
+            self.emit_warning(ln,
+                              f"expecting prototype for {decl_type} {self.entry.identifier}. Prototype was for {decl_type} {declaration_name} instead\n")
+            return
+
+        args_pattern =r'([^,)]+)'
+
+        sub_prefixes = [
+            (Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I), ''),
+            (Re(r'\/\*\s*private:.*', re.S| re.I), ''),
+
+            # Strip comments
+            (Re(r'\/\*.*?\*\/', re.S), ''),
+
+            # Strip attributes
+            (attribute, ' '),
+            (Re(r'\s*__aligned\s*\([^;]*\)', re.S), ' '),
+            (Re(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '),
+            (Re(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '),
+            (Re(r'\s*__packed\s*', re.S), ' '),
+            (Re(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '),
+            (Re(r'\s*____cacheline_aligned_in_smp', re.S), ' '),
+            (Re(r'\s*____cacheline_aligned', re.S), ' '),
+
+            # Unwrap struct_group() based on this definition:
+            # __struct_group(TAG, NAME, ATTRS, MEMBERS...)
+            # which has variants like: struct_group(NAME, MEMBERS...)
+
+            (Re(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('),
+            (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_GROUP('),
+            (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r'struct \1 \2; STRUCT_GROUP('),
+            (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROUP('),
+
+            # This is incompatible with Python re, as it uses:
+            #  recursive patterns ((?1)) and atomic grouping ((?>...)):
+            #   '\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;'
+            # Let's see if this works instead:
+            (Re(r'\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;', re.S), r'\1'),
+
+            # Replace macros
+            (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
+            (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'),
+            (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'),
+            (Re(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'),
+            (Re(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
+            (Re(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
+            (Re(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\1 \2[]'),
+            (Re(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S), r'dma_addr_t \1'),
+            (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S), r'__u32 \1'),
+        ]
+
+        for search, sub in sub_prefixes:
+            members = search.sub(sub, members)
+
+        # Keeps the original declaration as-is
+        declaration = members
+
+        # Split nested struct/union elements
+        #
+        # This loop was simpler in the original kernel-doc Perl version, as
+        #   while ($members =~ m/$struct_members/) { ... }
+        # reads the 'members' string on each iteration.
+        #
+        # Python behavior is different: it parses 'members' only once,
+        # creating a list of tuples from the first iteration.
+        #
+        # In other words, this won't get nested structs.
+        #
+        # So, we need to have an extra loop in Python to work around such
+        # re limitation.
+ + while True: + tuples =3D struct_members.findall(members) + if not tuples: + break + + for t in tuples: + newmember =3D "" + maintype =3D t[0] + s_ids =3D t[5] + content =3D t[3] + + oldmember =3D "".join(t) + + for s_id in s_ids.split(','): + s_id =3D s_id.strip() + + newmember +=3D f"{maintype} {s_id}; " + s_id =3D Re(r'[:\[].*').sub('', s_id) + s_id =3D Re(r'^\s*\**(\S+)\s*').sub(r'\1', s_id) + + for arg in content.split(';'): + arg =3D arg.strip() + + if not arg: + continue + + r =3D Re(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)') + if r.match(arg): + # Pointer-to-function + dtype =3D r.group(1) + name =3D r.group(2) + extra =3D r.group(3) + + if not name: + continue + + if not s_id: + # Anonymous struct/union + newmember +=3D f"{dtype}{name}{extra}; " + else: + newmember +=3D f"{dtype}{s_id}.{name}{extr= a}; " + + else: + arg =3D arg.strip() + # Handle bitmaps + arg =3D Re(r':\s*\d+\s*').sub('', arg) + + # Handle arrays + arg =3D Re(r'\[.*\]').sub('', arg) + + # Handle multiple IDs + arg =3D Re(r'\s*,\s*').sub(',', arg) + + + r =3D Re(r'(.*)\s+([\S+,]+)') + + if r.search(arg): + dtype =3D r.group(1) + names =3D r.group(2) + else: + newmember +=3D f"{arg}; " + continue + + for name in names.split(','): + name =3D Re(r'^\s*\**(\S+)\s*').sub(r'\1',= name).strip() + + if not name: + continue + + if not s_id: + # Anonymous struct/union + newmember +=3D f"{dtype} {name}; " + else: + newmember +=3D f"{dtype} {s_id}.{name}= ; " + + members =3D members.replace(oldmember, newmember) + + # Ignore other nested elements, like enums + members =3D re.sub(r'(\{[^\{\}]*\})', '', members) + + self.create_parameter_list(ln, decl_type, members, ';', + declaration_name) + self.check_sections(ln, declaration_name, decl_type, + self.entry.sectcheck, self.entry.struct_actual) + + # Adjust declaration for better display + declaration =3D Re(r'([\{;])').sub(r'\1\n', declaration) + declaration =3D Re(r'\}\s+;').sub('};', declaration) + + # Better handle inlined enums + while True: + r =3D 
Re(r'(enum\s+\{[^\}]+),([^\n])') + if not r.search(declaration): + break + + declaration =3D r.sub(r'\1,\n\2', declaration) + + def_args =3D declaration.split('\n') + level =3D 1 + declaration =3D "" + for clause in def_args: + + clause =3D clause.strip() + clause =3D Re(r'\s+').sub(' ', clause, count=3D1) + + if not clause: + continue + + if '}' in clause and level > 1: + level -=3D 1 + + if not Re(r'^\s*#').match(clause): + declaration +=3D "\t" * level + + declaration +=3D "\t" + clause + "\n" + if "{" in clause and "}" not in clause: + level +=3D 1 + + self.output_declaration(decl_type, declaration_name, + struct=3Ddeclaration_name, + module=3Dself.entry.modulename, + definition=3Ddeclaration, + parameterlist=3Dself.entry.parameterlist, + parameterdescs=3Dself.entry.parameterdescs, + parametertypes=3Dself.entry.parametertypes, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpose) + + def dump_enum(self, ln, proto): + + # Ignore members marked private + proto =3D Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=3Dr= e.S).sub('', proto) + proto =3D Re(r'\/\*\s*private:.*}', flags=3Dre.S).sub('}', proto) + + # Strip comments + proto =3D Re(r'\/\*.*?\*\/', flags=3Dre.S).sub('', proto) + + # Strip #define macros inside enums + proto =3D Re(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=3Dre= .S).sub('', proto) + + members =3D None + declaration_name =3D None + + r =3D Re(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;') + if r.search(proto): + declaration_name =3D r.group(2) + members =3D r.group(1).rstrip() + else: + r =3D Re(r'enum\s+(\w*)\s*\{(.*)\}') + if r.match(proto): + declaration_name =3D r.group(1) + members =3D r.group(2).rstrip() + + if not members: + self.emit_warning(ln, f"{proto}: error: Cannot parse enum!") + self.config.errors +=3D 1 + return + + if self.entry.identifier !=3D declaration_name: + if self.entry.identifier =3D=3D "": + self.emit_warning(ln, + f"{proto}: wrong kernel-doc 
identifier o= n prototype") + else: + self.emit_warning(ln, + f"expecting prototype for enum {self.ent= ry.identifier}. Prototype was for enum {declaration_name} instead") + return + + if not declaration_name: + declaration_name =3D "(anonymous)" + + member_set =3D set() + + members =3D Re(r'\([^;]*?[\)]').sub('', members) + + for arg in members.split(','): + if not arg: + continue + arg =3D Re(r'^\s*(\w+).*').sub(r'\1', arg) + self.entry.parameterlist.append(arg) + if arg not in self.entry.parameterdescs: + self.entry.parameterdescs[arg] =3D self.undescribed + if self.show_warnings("enum", declaration_name): + self.emit_warning(ln, + f"Enum value '{arg}' not described i= n enum '{declaration_name}'") + member_set.add(arg) + + for k in self.entry.parameterdescs: + if k not in member_set: + if self.show_warnings("enum", declaration_name): + self.emit_warning(ln, + f"Excess enum value '%{k}' descripti= on in '{declaration_name}'") + + self.output_declaration('enum', declaration_name, + enum=3Ddeclaration_name, + module=3Dself.config.modulename, + parameterlist=3Dself.entry.parameterlist, + parameterdescs=3Dself.entry.parameterdescs, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpose) + + def dump_declaration(self, ln, prototype): + if self.entry.decl_type =3D=3D "enum": + self.dump_enum(ln, prototype) + return + + if self.entry.decl_type =3D=3D "typedef": + self.dump_typedef(ln, prototype) + return + + if self.entry.decl_type in ["union", "struct"]: + self.dump_struct(ln, prototype) + return + + # TODO: handle other types + self.output_declaration(self.entry.decl_type, prototype, + entry=3Dself.entry) + + def dump_function(self, ln, prototype): + + func_macro =3D False + return_type =3D '' + decl_type =3D 'function' + + # Prefixes that would be removed + sub_prefixes =3D [ + (r"^static +", "", 0), + (r"^extern +", "", 0), + (r"^asmlinkage +", "", 0), + (r"^inline +", "", 0), + (r"^__inline__ +", "", 
0), + (r"^__inline +", "", 0), + (r"^__always_inline +", "", 0), + (r"^noinline +", "", 0), + (r"^__FORTIFY_INLINE +", "", 0), + (r"__init +", "", 0), + (r"__init_or_module +", "", 0), + (r"__deprecated +", "", 0), + (r"__flatten +", "", 0), + (r"__meminit +", "", 0), + (r"__must_check +", "", 0), + (r"__weak +", "", 0), + (r"__sched +", "", 0), + (r"_noprof", "", 0), + (r"__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +", "", 0), + (r"__(?:re)?alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\) +", "", = 0), + (r"__diagnose_as\s*\(\s*\S+\s*(?:,\s*\d+\s*)*\) +", "", 0), + (r"DECL_BUCKET_PARAMS\s*\(\s*(\S+)\s*,\s*(\S+)\s*\)", r"\1, \2= ", 0), + (r"__attribute_const__ +", "", 0), + + # It seems that Python support for re.X is broken: + # At least for me (Python 3.13), this didn't work +# (r""" +# __attribute__\s*\(\( +# (?: +# [\w\s]+ # attribute name +# (?:\([^)]*\))? # attribute arguments +# \s*,? # optional comma at the end +# )+ +# \)\)\s+ +# """, "", re.X), + + # So, remove whitespaces and comments from it + (r"__attribute__\s*\(\((?:[\w\s]+(?:\([^)]*\))?\s*,?)+\)\)\s+"= , "", 0), + ] + + for search, sub, flags in sub_prefixes: + prototype =3D Re(search, flags).sub(sub, prototype) + + # Macros are a special case, as they change the prototype format + new_proto =3D Re(r"^#\s*define\s+").sub("", prototype) + if new_proto !=3D prototype: + is_define_proto =3D True + prototype =3D new_proto + else: + is_define_proto =3D False + + # Yes, this truly is vile. We are looking for: + # 1. Return type (may be nothing if we're looking at a macro) + # 2. Function name + # 3. Function parameters. + # + # All the while we have to watch out for function pointer paramete= rs + # (which IIRC is what the two sections are for), C types (these + # regexps don't even start to express all the possibilities), and + # so on. 
+ # + # If you mess with these regexps, it's a good idea to check that + # the following functions' documentation still comes out right: + # - parport_register_device (function pointer parameters) + # - atomic_set (macro) + # - pci_match_device, __copy_to_user (long return type) + + name =3D r'[a-zA-Z0-9_~:]+' + prototype_end1 =3D r'[^\(]*' + prototype_end2 =3D r'[^\{]*' + prototype_end =3D fr'\(({prototype_end1}|{prototype_end2})\)' + + # Besides compiling, Perl qr{[\w\s]+} works as a non-capturing gro= up. + # So, this needs to be mapped in Python with (?:...)? or (?:...)+ + + type1 =3D r'(?:[\w\s]+)?' + type2 =3D r'(?:[\w\s]+\*+)+' + + found =3D False + + if is_define_proto: + r =3D Re(r'^()(' + name + r')\s+') + + if r.search(prototype): + return_type =3D '' + declaration_name =3D r.group(2) + func_macro =3D True + + found =3D True + + if not found: + patterns =3D [ + rf'^()({name})\s*{prototype_end}', + rf'^({type1})\s+({name})\s*{prototype_end}', + rf'^({type2})\s*({name})\s*{prototype_end}', + ] + + for p in patterns: + r =3D Re(p) + + if r.match(prototype): + + return_type =3D r.group(1) + declaration_name =3D r.group(2) + args =3D r.group(3) + + self.create_parameter_list(ln, decl_type, args, ',', + declaration_name) + + found =3D True + break + if not found: + self.emit_warning(ln, + f"cannot understand function prototype: '{pr= ototype}'") + return + + if self.entry.identifier !=3D declaration_name: + self.emit_warning(ln, + f"expecting prototype for {self.entry.identi= fier}(). 
Prototype was for {declaration_name}() instead") + return + + prms =3D " ".join(self.entry.parameterlist) + self.check_sections(ln, declaration_name, "function", + self.entry.sectcheck, prms) + + self.check_return_section(ln, declaration_name, return_type) + + if 'typedef' in return_type: + self.output_declaration(decl_type, declaration_name, + function=3Ddeclaration_name, + typedef=3DTrue, + module=3Dself.config.modulename, + functiontype=3Dreturn_type, + parameterlist=3Dself.entry.parameterlist, + parameterdescs=3Dself.entry.parameterdescs, + parametertypes=3Dself.entry.parametertypes, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpose, + func_macro=3Dfunc_macro) + else: + self.output_declaration(decl_type, declaration_name, + function=3Ddeclaration_name, + typedef=3DFalse, + module=3Dself.config.modulename, + functiontype=3Dreturn_type, + parameterlist=3Dself.entry.parameterlist, + parameterdescs=3Dself.entry.parameterdescs, + parametertypes=3Dself.entry.parametertypes, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpose, + func_macro=3Dfunc_macro) + + def dump_typedef(self, ln, proto): + typedef_type =3D r'((?:\s+[\w\*]+\b){1,8})\s*' + typedef_ident =3D r'\*?\s*(\w\S+)\s*' + typedef_args =3D r'\s*\((.*)\);' + + typedef1 =3D Re(r'typedef' + typedef_type + r'\(' + typedef_ident = + r'\)' + typedef_args) + typedef2 =3D Re(r'typedef' + typedef_type + typedef_ident + typede= f_args) + + # Strip comments + proto =3D Re(r'/\*.*?\*/', flags=3Dre.S).sub('', proto) + + # Parse function typedef prototypes + for r in [typedef1, typedef2]: + if not r.match(proto): + continue + + return_type =3D r.group(1).strip() + declaration_name =3D r.group(2) + args =3D r.group(3) + + if self.entry.identifier !=3D declaration_name: + self.emit_warning(ln, + f"expecting prototype for typedef {self.= entry.identifier}. 
Prototype was for typedef {declaration_name} instead\n") + return + + decl_type =3D 'function' + self.create_parameter_list(ln, decl_type, args, ',', declarati= on_name) + + self.output_declaration(decl_type, declaration_name, + function=3Ddeclaration_name, + typedef=3DTrue, + module=3Dself.entry.modulename, + functiontype=3Dreturn_type, + parameterlist=3Dself.entry.parameterlist, + parameterdescs=3Dself.entry.parameterdescs, + parametertypes=3Dself.entry.parametertypes, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpose) + return + + # Handle nested parentheses or brackets + r =3D Re(r'(\(*.\)\s*|\[*.\]\s*);$') + while r.search(proto): + proto =3D r.sub('', proto) + + # Parse simple typedefs + r =3D Re(r'typedef.*\s+(\w+)\s*;') + if r.match(proto): + declaration_name =3D r.group(1) + + if self.entry.identifier !=3D declaration_name: + self.emit_warning(ln, f"expecting prototype for typedef {s= elf.entry.identifier}. Prototype was for typedef {declaration_name} instead= \n") + return + + self.output_declaration('typedef', declaration_name, + typedef=3Ddeclaration_name, + module=3Dself.entry.modulename, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpose) + return + + self.emit_warning(ln, "error: Cannot parse typedef!") + self.config.errors +=3D 1 + + @staticmethod + def process_export(function_table, line): + """ + process EXPORT_SYMBOL* tags + + This method is called both internally and externally, so, it + doesn't use self. + """ + + if export_symbol.search(line): + symbol =3D export_symbol.group(2) + function_table.add(symbol) + + if export_symbol_ns.search(line): + symbol =3D export_symbol_ns.group(2) + function_table.add(symbol) + + def process_normal(self, ln, line): + """ + STATE_NORMAL: looking for the /** to begin everything. 
+ """ + + if not doc_start.match(line): + return + + # start a new entry + self.reset_state(ln + 1) + self.entry.in_doc_sect =3D False + + # next line is always the function name + self.state =3D self.STATE_NAME + + def process_name(self, ln, line): + """ + STATE_NAME: Looking for the "name - description" line + """ + + if doc_block.search(line): + self.entry.new_start_line =3D ln + + if not doc_block.group(1): + self.entry.section =3D self.section_intro + else: + self.entry.section =3D doc_block.group(1) + + self.state =3D self.STATE_DOCBLOCK + return + + if doc_decl.search(line): + self.entry.identifier =3D doc_decl.group(1) + self.entry.is_kernel_comment =3D False + + decl_start =3D str(doc_com) # comment block asterisk + fn_type =3D r"(?:\w+\s*\*\s*)?" # type (for non-functions) + parenthesis =3D r"(?:\(\w*\))?" # optional parenthesis on fu= nction + decl_end =3D r"(?:[-:].*)" # end of the name part + + # test for pointer declaration type, foo * bar() - desc + r =3D Re(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_end}= ?$") + if r.search(line): + self.entry.identifier =3D r.group(1) + + # Test for data declaration + r =3D Re(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)") + if r.search(line): + self.entry.decl_type =3D r.group(1) + self.entry.identifier =3D r.group(2) + self.entry.is_kernel_comment =3D True + else: + # Look for foo() or static void foo() - description; + # or misspelt identifier + + r1 =3D Re(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesis}\s= *{decl_end}?$") + r2 =3D Re(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenthesis= }\s*{decl_end}$") + + for r in [r1, r2]: + if r.search(line): + self.entry.identifier =3D r.group(1) + self.entry.decl_type =3D "function" + + r =3D Re(r"define\s+") + self.entry.identifier =3D r.sub("", self.entry.ide= ntifier) + self.entry.is_kernel_comment =3D True + break + + self.entry.identifier =3D self.entry.identifier.strip(" ") + + self.state =3D self.STATE_BODY + + # if there's no @param blocks need to set up 
default section h= ere + self.entry.section =3D self.section_default + self.entry.new_start_line =3D ln + 1 + + r =3D Re("[-:](.*)") + if r.search(line): + # strip leading/trailing/multiple spaces + self.entry.descr =3D r.group(1).strip(" ") + + r =3D Re(r"\s+") + self.entry.descr =3D r.sub(" ", self.entry.descr) + self.entry.declaration_purpose =3D self.entry.descr + self.state =3D self.STATE_BODY_MAYBE + else: + self.entry.declaration_purpose =3D "" + + if not self.entry.is_kernel_comment: + self.emit_warning(ln, + f"This comment starts with '/**', but is= n't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst\n{li= ne}") + self.state =3D self.STATE_NORMAL + + if not self.entry.declaration_purpose and self.config.wshort_d= esc: + self.emit_warning(ln, + f"missing initial short description on l= ine:\n{line}") + + if not self.entry.identifier and self.entry.decl_type !=3D "en= um": + self.emit_warning(ln, + f"wrong kernel-doc identifier on line:\n= {line}") + self.state =3D self.STATE_NORMAL + + if self.config.verbose: + self.emit_warning(ln, + f"Scanning doc for {self.entry.decl_type= } {self.entry.identifier}", + warning=3DFalse) + + return + + # Failed to find an identifier. Emit a warning + self.emit_warning(ln, f"Cannot find identifier on line:\n{line}") + + def process_body(self, ln, line): + """ + STATE_BODY and STATE_BODY_MAYBE: the bulk of a kerneldoc comment. 
+ """ + + if self.state =3D=3D self.STATE_BODY_WITH_BLANK_LINE: + r =3D Re(r"\s*\*\s?\S") + if r.match(line): + self.dump_section() + self.entry.section =3D self.section_default + self.entry.new_start_line =3D line + self.entry.contents =3D "" + + if doc_sect.search(line): + self.entry.in_doc_sect =3D True + newsection =3D doc_sect.group(1) + + if newsection.lower() in ["description", "context"]: + newsection =3D newsection.title() + + # Special case: @return is a section, not a param description + if newsection.lower() in ["@return", "@returns", + "return", "returns"]: + newsection =3D "Return" + + # Perl kernel-doc has a check here for contents before section= s. + # the logic there is always false, as in_doc_sect variable is + # always true. So, just don't implement Wcontents_before_secti= ons + + # .title() + newcontents =3D doc_sect.group(2) + if not newcontents: + newcontents =3D "" + + if self.entry.contents.strip("\n"): + self.dump_section() + + self.entry.new_start_line =3D ln + self.entry.section =3D newsection + self.entry.leading_space =3D None + + self.entry.contents =3D newcontents.lstrip() + if self.entry.contents: + self.entry.contents +=3D "\n" + + self.state =3D self.STATE_BODY + return + + if doc_end.search(line): + if self.entry.contents.strip("\n"): + self.dump_section() + + # Look for doc_com + + doc_end: + r =3D Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/') + if r.match(line): + self.emit_warning(ln, f"suspicious ending line: {line}") + + self.entry.prototype =3D "" + self.entry.new_start_line =3D ln + 1 + + self.state =3D self.STATE_PROTO + return + + if doc_content.search(line): + cont =3D doc_content.group(1) + + if cont =3D=3D "": + if self.entry.section =3D=3D self.section_context: + self.dump_section() + + self.entry.new_start_line =3D ln + self.state =3D self.STATE_BODY + else: + if self.entry.section !=3D self.section_default: + self.state =3D self.STATE_BODY_WITH_BLANK_LINE + else: + self.state =3D self.STATE_BODY + + self.entry.contents +=3D 
"\n" + + elif self.state =3D=3D self.STATE_BODY_MAYBE: + + # Continued declaration purpose + self.entry.declaration_purpose =3D self.entry.declaration_= purpose.rstrip() + self.entry.declaration_purpose +=3D " " + cont + + r =3D Re(r"\s+") + self.entry.declaration_purpose =3D r.sub(' ', + self.entry.declarat= ion_purpose) + + else: + if self.entry.section.startswith('@') or \ + self.entry.section =3D=3D self.section_context: + if self.entry.leading_space is None: + r =3D Re(r'^(\s+)') + if r.match(cont): + self.entry.leading_space =3D len(r.group(1)) + else: + self.entry.leading_space =3D 0 + + # Double-check if leading space are realy spaces + pos =3D 0 + for i in range(0, self.entry.leading_space): + if cont[i] !=3D " ": + break + pos +=3D 1 + + cont =3D cont[pos:] + + # NEW LOGIC: + # In case it is different, update it + if self.entry.leading_space !=3D pos: + self.entry.leading_space =3D pos + + self.entry.contents +=3D cont + "\n" + return + + # Unknown line, ignore + self.emit_warning(ln, f"bad line: {line}") + + def process_inline(self, ln, line): + """STATE_INLINE: docbook comments within a prototype.""" + + if self.inline_doc_state =3D=3D self.STATE_INLINE_NAME and \ + doc_inline_sect.search(line): + self.entry.section =3D doc_inline_sect.group(1) + self.entry.new_start_line =3D ln + + self.entry.contents =3D doc_inline_sect.group(2).lstrip() + if self.entry.contents !=3D "": + self.entry.contents +=3D "\n" + + self.inline_doc_state =3D self.STATE_INLINE_TEXT + # Documentation block end */ + return + + if doc_inline_end.search(line): + if self.entry.contents not in ["", "\n"]: + self.dump_section() + + self.state =3D self.STATE_PROTO + self.inline_doc_state =3D self.STATE_INLINE_NA + return + + if doc_content.search(line): + if self.inline_doc_state =3D=3D self.STATE_INLINE_TEXT: + self.entry.contents +=3D doc_content.group(1) + "\n" + if not self.entry.contents.strip(" ").rstrip("\n"): + self.entry.contents =3D "" + + elif self.inline_doc_state =3D=3D 
self.STATE_INLINE_NAME: + self.emit_warning(ln, + f"Incorrect use of kernel-doc format: {l= ine}") + + self.inline_doc_state =3D self.STATE_INLINE_ERROR + + def syscall_munge(self, ln, proto): + """ + Handle syscall definitions + """ + + is_void =3D False + + # Strip newlines/CR's + proto =3D re.sub(r'[\r\n]+', ' ', proto) + + # Check if it's a SYSCALL_DEFINE0 + if 'SYSCALL_DEFINE0' in proto: + is_void =3D True + + # Replace SYSCALL_DEFINE with correct return type & function name + proto =3D Re(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto) + + r =3D Re(r'long\s+(sys_.*?),') + if r.search(proto): + proto =3D proto.replace(',', '(', count=3D1) + elif is_void: + proto =3D proto.replace(')', '(void)', count=3D1) + + # Now delete all of the odd-numbered commas in the proto + # so that argument types & names don't have a comma between them + count =3D 0 + length =3D len(proto) + + if is_void: + length =3D 0 # skip the loop if is_void + + for ix in range(length): + if proto[ix] =3D=3D ',': + count +=3D 1 + if count % 2 =3D=3D 1: + proto =3D proto[:ix] + ' ' + proto[ix+1:] + + return proto + + def tracepoint_munge(self, ln, proto): + """ + Handle tracepoint definitions + """ + + tracepointname =3D None + tracepointargs =3D None + + # Match tracepoint name based on different patterns + r =3D Re(r'TRACE_EVENT\((.*?),') + if r.search(proto): + tracepointname =3D r.group(1) + + r =3D Re(r'DEFINE_SINGLE_EVENT\((.*?),') + if r.search(proto): + tracepointname =3D r.group(1) + + r =3D Re(r'DEFINE_EVENT\((.*?),(.*?),') + if r.search(proto): + tracepointname =3D r.group(2) + + if tracepointname: + tracepointname =3D tracepointname.lstrip() + + r =3D Re(r'TP_PROTO\((.*?)\)') + if r.search(proto): + tracepointargs =3D r.group(1) + + if not tracepointname or not tracepointargs: + self.emit_warning(ln, + f"Unrecognized tracepoint format:\n{proto}\n= ") + else: + proto =3D f"static inline void trace_{tracepointname}({tracepo= intargs})" + self.entry.identifier =3D 
f"trace_{self.entry.identifier}" + + return proto + + def process_proto_function(self, ln, line): + """Ancillary routine to process a function prototype""" + + # strip C99-style comments to end of line + r =3D Re(r"\/\/.*$", re.S) + line =3D r.sub('', line) + + if Re(r'\s*#\s*define').match(line): + self.entry.prototype =3D line + elif line.startswith('#'): + # Strip other macros like #ifdef/#ifndef/#endif/... + pass + else: + r =3D Re(r'([^\{]*)') + if r.match(line): + self.entry.prototype +=3D r.group(1) + " " + + if '{' in line or ';' in line or Re(r'\s*#\s*define').match(line): + # strip comments + r =3D Re(r'/\*.*?\*/') + self.entry.prototype =3D r.sub('', self.entry.prototype) + + # strip newlines/cr's + r =3D Re(r'[\r\n]+') + self.entry.prototype =3D r.sub(' ', self.entry.prototype) + + # strip leading spaces + r =3D Re(r'^\s+') + self.entry.prototype =3D r.sub('', self.entry.prototype) + + # Handle self.entry.prototypes for function pointers like: + # int (*pcs_config)(struct foo) + + r =3D Re(r'^(\S+\s+)\(\s*\*(\S+)\)') + self.entry.prototype =3D r.sub(r'\1\2', self.entry.prototype) + + if 'SYSCALL_DEFINE' in self.entry.prototype: + self.entry.prototype =3D self.syscall_munge(ln, + self.entry.proto= type) + + r =3D Re(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT') + if r.search(self.entry.prototype): + self.entry.prototype =3D self.tracepoint_munge(ln, + self.entry.pr= ototype) + + self.dump_function(ln, self.entry.prototype) + self.reset_state(ln) + + def process_proto_type(self, ln, line): + """Ancillary routine to process a type""" + + # Strip newlines/cr's. + line =3D Re(r'[\r\n]+', re.S).sub(' ', line) + + # Strip leading spaces + line =3D Re(r'^\s+', re.S).sub('', line) + + # Strip trailing spaces + line =3D Re(r'\s+$', re.S).sub('', line) + + # Strip C99-style comments to the end of the line + line =3D Re(r"\/\/.*$", re.S).sub('', line) + + # To distinguish preprocessor directive from regular declaration l= ater. 
+ if line.startswith('#'): + line +=3D ";" + + r =3D Re(r'([^\{\};]*)([\{\};])(.*)') + while True: + if r.search(line): + if self.entry.prototype: + self.entry.prototype +=3D " " + self.entry.prototype +=3D r.group(1) + r.group(2) + + self.entry.brcount +=3D r.group(2).count('{') + self.entry.brcount -=3D r.group(2).count('}') + + self.entry.brcount =3D max(self.entry.brcount, 0) + + if r.group(2) =3D=3D ';' and self.entry.brcount =3D=3D 0: + self.dump_declaration(ln, self.entry.prototype) + self.reset_state(ln) + break + + line =3D r.group(3) + else: + self.entry.prototype +=3D line + break + + def process_proto(self, ln, line): + """STATE_PROTO: reading a function/whatever prototype.""" + + if doc_inline_oneline.search(line): + self.entry.section =3D doc_inline_oneline.group(1) + self.entry.contents =3D doc_inline_oneline.group(2) + + if self.entry.contents !=3D "": + self.entry.contents +=3D "\n" + self.dump_section(start_new=3DFalse) + + elif doc_inline_start.search(line): + self.state =3D self.STATE_INLINE + self.inline_doc_state =3D self.STATE_INLINE_NAME + + elif self.entry.decl_type =3D=3D 'function': + self.process_proto_function(ln, line) + + else: + self.process_proto_type(ln, line) + + def process_docblock(self, ln, line): + """STATE_DOCBLOCK: within a DOC: block.""" + + if doc_end.search(line): + self.dump_section() + self.output_declaration("doc", None, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, = module=3Dself.config.modulename) + self.reset_state(ln) + + elif doc_content.search(line): + self.entry.contents +=3D doc_content.group(1) + "\n" + + def run(self): + """ + Open and process each line of a C source file. + he parsing is controlled via a state machine, and the line is pass= ed + to a different process function depending on the state. The process + function may update the state as needed. 
+ """ + + cont =3D False + prev =3D "" + prev_ln =3D None + + try: + with open(self.fname, "r", encoding=3D"utf8", + errors=3D"backslashreplace") as fp: + for ln, line in enumerate(fp): + + line =3D line.expandtabs().strip("\n") + + # Group continuation lines on prototypes + if self.state =3D=3D self.STATE_PROTO: + if line.endswith("\\"): + prev +=3D line.removesuffix("\\") + cont =3D True + + if not prev_ln: + prev_ln =3D ln + + continue + + if cont: + ln =3D prev_ln + line =3D prev + line + prev =3D "" + cont =3D False + prev_ln =3D None + + self.config.log.debug("%d %s%s: %s", + ln, self.st_name[self.state], + self.st_inline_name[self.inline_= doc_state], + line) + + # TODO: not all states allow EXPORT_SYMBOL*, so this + # can be optimized later on to speedup parsing + self.process_export(self.config.function_table, line) + + # Hand this line to the appropriate state handler + if self.state =3D=3D self.STATE_NORMAL: + self.process_normal(ln, line) + elif self.state =3D=3D self.STATE_NAME: + self.process_name(ln, line) + elif self.state in [self.STATE_BODY, self.STATE_BODY_M= AYBE, + self.STATE_BODY_WITH_BLANK_LINE]: + self.process_body(ln, line) + elif self.state =3D=3D self.STATE_INLINE: # scanning = for inline parameters + self.process_inline(ln, line) + elif self.state =3D=3D self.STATE_PROTO: + self.process_proto(ln, line) + elif self.state =3D=3D self.STATE_DOCBLOCK: + self.process_docblock(ln, line) + except OSError: + self.config.log.error(f"Error: Cannot open file {self.fname}") + self.config.errors +=3D 1 + + +class GlobSourceFiles: + """ + Parse C source code file names and directories via an Interactor. + + """ + + def __init__(self, srctree=3DNone, valid_extensions=3DNone): + """ + Initialize valid extensions with a tuple. + + If not defined, assume default C extensions (.c and .h) + + It would be possible to use python's glob function, but it is + very slow, and it is not interactive. 
So, it would wait to read all + directories before actually do something. + + So, let's use our own implementation. + """ + + if not valid_extensions: + self.extensions =3D (".c", ".h") + else: + self.extensions =3D valid_extensions + + self.srctree =3D srctree + + def _parse_dir(self, dirname): + """Internal function to parse files recursively""" + + with os.scandir(dirname) as obj: + for entry in obj: + name =3D os.path.join(dirname, entry.name) + + if entry.is_dir(): + yield from self._parse_dir(name) + + if not entry.is_file(): + continue + + basename =3D os.path.basename(name) + + if not basename.endswith(self.extensions): + continue + + yield name + + def parse_files(self, file_list, file_not_found_cb): + for fname in file_list: + if self.srctree: + f =3D os.path.join(self.srctree, fname) + else: + f =3D fname + + if os.path.isdir(f): + yield from self._parse_dir(f) + elif os.path.isfile(f): + yield f + elif file_not_found_cb: + file_not_found_cb(fname) + + +class KernelFiles(): + + def parse_file(self, fname): + + doc =3D KernelDoc(self.config, fname) + doc.run() + + return doc + + def process_export_file(self, fname): + try: + with open(fname, "r", encoding=3D"utf8", + errors=3D"backslashreplace") as fp: + for line in fp: + KernelDoc.process_export(self.config.function_table, l= ine) + + except IOError: + print(f"Error: Cannot open fname {fname}", fname=3Dsys.stderr) + self.config.errors +=3D 1 + + def file_not_found_cb(self, fname): + self.config.log.error("Cannot find file %s", fname) + self.config.errors +=3D 1 + + def __init__(self, files=3DNone, verbose=3DFalse, out_style=3DNone, + werror=3DFalse, wreturn=3DFalse, wshort_desc=3DFalse, + wcontents_before_sections=3DFalse, + logger=3DNone, modulename=3DNone, export_file=3DNone): + """Initialize startup variables and parse all files""" + + + if not verbose: + verbose =3D bool(os.environ.get("KBUILD_VERBOSE", 0)) + + if not modulename: + modulename =3D "Kernel API" + + dt =3D datetime.now() + if 
os.environ.get("KBUILD_BUILD_TIMESTAMP", None): + # use UTC TZ + to_zone =3D tz.gettz('UTC') + dt =3D dt.astimezone(to_zone) + + if not werror: + kcflags =3D os.environ.get("KCFLAGS", None) + if kcflags: + match =3D re.search(r"(\s|^)-Werror(\s|$)", kcflags) + if match: + werror =3D True + + # reading this variable is for backwards compat just in case + # someone was calling it with the variable from outside the + # kernel's build system + kdoc_werror =3D os.environ.get("KDOC_WERROR", None) + if kdoc_werror: + werror =3D kdoc_werror + + # Set global config data used on all files + self.config =3D argparse.Namespace + + self.config.verbose =3D verbose + self.config.werror =3D werror + self.config.wreturn =3D wreturn + self.config.wshort_desc =3D wshort_desc + self.config.wcontents_before_sections =3D wcontents_before_sections + self.config.modulename =3D modulename + + self.config.function_table =3D set() + self.config.source_map =3D {} + + if not logger: + self.config.log =3D logging.getLogger("kernel-doc") + else: + self.config.log =3D logger + + self.config.kernel_version =3D os.environ.get("KERNELVERSION", + "unknown kernel versio= n") + self.config.src_tree =3D os.environ.get("SRCTREE", None) + + self.out_style =3D out_style + self.export_file =3D export_file + + # Initialize internal variables + + self.config.errors =3D 0 + self.results =3D [] + + self.file_list =3D files + self.files =3D set() + + def parse(self): + """ + Parse all files + """ + + glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) + + # Let's use a set here to avoid duplicating files + + for fname in glob.parse_files(self.file_list, self.file_not_found_= cb): + if fname in self.files: + continue + + self.files.add(fname) + + res =3D self.parse_file(fname) + self.results.append((res.fname, res.entries)) + + if not self.files: + sys.exit(1) + + # If a list of export files was provided, parse EXPORT_SYMBOL* + # from the ones not already parsed + + if self.export_file: + files =3D
self.files + + glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) + + for fname in glob.parse_files(self.export_file, + self.file_not_found_cb): + if fname not in files: + files.add(fname) + + self.process_export_file(fname) + + def out_msg(self, fname, name, arg): + # TODO: filter out unwanted parts + + return self.out_style.msg(fname, name, arg) + + def msg(self, enable_lineno=3DFalse, export=3DFalse, internal=3DFalse, + symbol=3DNone, nosymbol=3DNone): + + function_table =3D self.config.function_table + + if symbol: + for s in symbol: + function_table.add(s) + + # Output none mode: only warnings will be shown + if not self.out_style: + return + + self.out_style.set_config(self.config) + + self.out_style.set_filter(export, internal, symbol, nosymbol, + function_table, enable_lineno) + + for fname, arg_tuple in self.results: + for name, arg in arg_tuple: + if self.out_msg(fname, name, arg): + ln =3D arg.get("ln", 0) + dtype =3D arg.get('type', "") + + self.config.log.warning("%s:%d Can't handle %s", + fname, ln, dtype) + + +class OutputFormat: + # output mode. + OUTPUT_ALL =3D 0 # output all symbols and doc sections + OUTPUT_INCLUDE =3D 1 # output only specified symbols + OUTPUT_EXPORTED =3D 2 # output exported symbols + OUTPUT_INTERNAL =3D 3 # output non-exported symbols + + # Virtual member to be overridden in the inherited classes + highlights =3D [] + + def __init__(self): + """Declare internal vars and set mode to OUTPUT_ALL""" + + self.out_mode =3D self.OUTPUT_ALL + self.enable_lineno =3D None + self.nosymbol =3D {} + self.symbol =3D None + self.function_table =3D set() + self.config =3D None + + def set_config(self, config): + self.config =3D config + + def set_filter(self, export, internal, symbol, nosymbol, function_tabl= e, + enable_lineno): + """ + Initialize filter variables according to the requested mode. + + Only one choice is valid between export, internal and symbol. + + The nosymbol filter can be used on all modes.
+ """ + + self.enable_lineno =3D enable_lineno + + if symbol: + self.out_mode =3D self.OUTPUT_INCLUDE + function_table =3D symbol + elif export: + self.out_mode =3D self.OUTPUT_EXPORTED + elif internal: + self.out_mode =3D self.OUTPUT_INTERNAL + else: + self.out_mode =3D self.OUTPUT_ALL + + if nosymbol: + self.nosymbol =3D set(nosymbol) + + if function_table: + self.function_table =3D function_table + + def highlight_block(self, block): + """ + Apply the RST highlights to a sub-block of text. + """ + + for r, sub in self.highlights: + block =3D r.sub(sub, block) + + return block + + def check_doc(self, name): + """Check if DOC should be output""" + + if self.out_mode =3D=3D self.OUTPUT_ALL: + return True + + if self.out_mode =3D=3D self.OUTPUT_INCLUDE: + if name in self.nosymbol: + return False + + if name in self.function_table: + return True + + return False + + def check_declaration(self, dtype, name): + if name in self.nosymbol: + return False + + if self.out_mode =3D=3D self.OUTPUT_ALL: + return True + + if self.out_mode in [ self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED ]: + if name in self.function_table: + return True + + if self.out_mode =3D=3D self.OUTPUT_INTERNAL: + if dtype !=3D "function": + return True + + if name not in self.function_table: + return True + + return False + + def check_function(self, fname, name, args): + return True + + def check_enum(self, fname, name, args): + return True + + def check_typedef(self, fname, name, args): + return True + + def msg(self, fname, name, args): + + dtype =3D args.get('type', "") + + if dtype =3D=3D "doc": + self.out_doc(fname, name, args) + return False + + if not self.check_declaration(dtype, name): + return False + + if dtype =3D=3D "function": + self.out_function(fname, name, args) + return False + + if dtype =3D=3D "enum": + self.out_enum(fname, name, args) + return False + + if dtype =3D=3D "typedef": + self.out_typedef(fname, name, args) + return False + + if dtype in ["struct", "union"]: + 
self.out_struct(fname, name, args) + return False + + # Warn if some type requires an output logic + self.config.log.warning("doesn't know how to output '%s' block", + dtype) + + return True + + # Virtual methods to be overridden by inherited classes + def out_doc(self, fname, name, args): + pass + + def out_function(self, fname, name, args): + pass + + def out_enum(self, fname, name, args): + pass + + def out_typedef(self, fname, name, args): + pass + + def out_struct(self, fname, name, args): + pass + + +class RestFormat(OutputFormat): + # """Consts and functions used by ReST output""" + + highlights =3D [ + (type_constant, r"``\1``"), + (type_constant2, r"``\1``"), + + # Note: need to escape () to avoid func matching later + (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"), + (type_member, r":c:type:`\1\2\3 <\1>`"), + (type_fp_param, r"**\1\\(\\)**"), + (type_fp_param2, r"**\1\\(\\)**"), + (type_func, r"\1()"), + (type_enum, r":c:type:`\1 <\2>`"), + (type_struct, r":c:type:`\1 <\2>`"), + (type_typedef, r":c:type:`\1 <\2>`"), + (type_union, r":c:type:`\1 <\2>`"), + + # in rst this can refer to any type + (type_fallback, r":c:type:`\1`"), + (type_param_ref, r"**\1\2**") + ] + blankline =3D "\n" + + sphinx_literal =3D Re(r'^[^.].*::$', cache=3DFalse) + sphinx_cblock =3D Re(r'^\.\.\ +code-block::', cache=3DFalse) + + def __init__(self): + """ + Creates class variables. + + Not really mandatory, but it is a good coding style and makes + pylint happy. + """ + + super().__init__() + self.lineprefix =3D "" + + def print_lineno(self, ln): + """Outputs a line number""" + + if self.enable_lineno and ln: + print(f".. LINENO {ln}") + + def output_highlight(self, args): + input_text =3D args + output =3D "" + in_literal =3D False + litprefix =3D "" + block =3D "" + + for line in input_text.strip("\n").split("\n"): + + # If we're in a literal block, see if we should drop out of it. + # Otherwise, pass the line straight through unmunged.
+ if in_literal: + if line.strip(): # If the line is not blank + # If this is the first non-blank line in a literal blo= ck, + # figure out the proper indent. + if not litprefix: + r =3D Re(r'^(\s*)') + if r.match(line): + litprefix =3D '^' + r.group(1) + else: + litprefix =3D "" + + output +=3D line + "\n" + elif not Re(litprefix).match(line): + in_literal =3D False + else: + output +=3D line + "\n" + else: + output +=3D line + "\n" + + # Not in a literal block (or just dropped out) + if not in_literal: + block +=3D line + "\n" + if self.sphinx_literal.match(line) or self.sphinx_cblock.m= atch(line): + in_literal =3D True + litprefix =3D "" + output +=3D self.highlight_block(block) + block =3D "" + + # Handle any remaining block + if block: + output +=3D self.highlight_block(block) + + # Print the output with the line prefix + for line in output.strip("\n").split("\n"): + print(self.lineprefix + line) + + def out_section(self, args, out_reference=3DFalse): + """ + Outputs a block section. + + This could use some work; it's used to output the DOC: sections, a= nd + starts by putting out the name of the doc section itself, but that + tends to duplicate a header already in the template file. + """ + + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + section_start_lines =3D args.get('section_start_lines', {}) + + for section in sectionlist: + # Skip sections that are in the nosymbol_table + if section in self.nosymbol: + continue + + if not self.out_mode =3D=3D self.OUTPUT_INCLUDE: + if out_reference: + print(f".. 
_{section}:\n") + + if not self.symbol: + print(f'{self.lineprefix}**{section}**\n') + + self.print_lineno(section_start_lines.get(section, 0)) + self.output_highlight(sections[section]) + print() + print() + + def out_doc(self, fname, name, args): + if not self.check_doc(name): + return + + self.out_section(args, out_reference=3DTrue) + + def out_function(self, fname, name, args): + + oldprefix =3D self.lineprefix + signature =3D "" + + func_macro =3D args.get('func_macro', False) + if func_macro: + signature =3D args['function'] + else: + if args.get('functiontype'): + signature =3D args['functiontype'] + " " + signature +=3D args['function'] + " (" + + parameterlist =3D args.get('parameterlist', []) + parameterdescs =3D args.get('parameterdescs', {}) + parameterdesc_start_lines =3D args.get('parameterdesc_start_lines'= , {}) + + ln =3D args.get('ln', 0) + + count =3D 0 + for parameter in parameterlist: + if count !=3D 0: + signature +=3D ", " + count +=3D 1 + dtype =3D args['parametertypes'].get(parameter, "") + + if function_pointer.search(dtype): + signature +=3D function_pointer.group(1) + parameter + fun= ction_pointer.group(3) + else: + signature +=3D dtype + + if not func_macro: + signature +=3D ")" + + if args.get('typedef') or not args.get('functiontype'): + print(f".. c:macro:: {args['function']}\n") + + if args.get('typedef'): + self.print_lineno(ln) + print(" **Typedef**: ", end=3D"") + self.lineprefix =3D "" + self.output_highlight(args.get('purpose', "")) + print("\n\n**Syntax**\n") + print(f" ``{signature}``\n") + else: + print(f"``{signature}``\n") + else: + print(f".. c:function:: {signature}\n") + + if not args.get('typedef'): + self.print_lineno(ln) + self.lineprefix =3D " " + self.output_highlight(args.get('purpose', "")) + print() + + # Put descriptive text into a container (HTML
) to help set + # function prototypes apart + self.lineprefix =3D " " + + if parameterlist: + print(".. container:: kernelindent\n") + print(f"{self.lineprefix}**Parameters**\n") + + for parameter in parameterlist: + parameter_name =3D Re(r'\[.*').sub('', parameter) + dtype =3D args['parametertypes'].get(parameter, "") + + if dtype: + print(f"{self.lineprefix}``{dtype}``") + else: + print(f"{self.lineprefix}``{parameter}``") + + self.print_lineno(parameterdesc_start_lines.get(parameter_name= , 0)) + + self.lineprefix =3D " " + if parameter_name in parameterdescs and \ + parameterdescs[parameter_name] !=3D KernelDoc.undescribed: + + self.output_highlight(parameterdescs[parameter_name]) + print() + else: + print(f"{self.lineprefix}*undescribed*\n") + self.lineprefix =3D " " + + self.out_section(args) + self.lineprefix =3D oldprefix + + def out_enum(self, fname, name, args): + + oldprefix =3D self.lineprefix + name =3D args.get('enum', '') + parameterlist =3D args.get('parameterlist', []) + parameterdescs =3D args.get('parameterdescs', {}) + ln =3D args.get('ln', 0) + + print(f"\n\n.. c:enum:: {name}\n") + + self.print_lineno(ln) + self.lineprefix =3D " " + self.output_highlight(args.get('purpose', '')) + print() + + print(".. container:: kernelindent\n") + outer =3D self.lineprefix + " " + self.lineprefix =3D outer + " " + print(f"{outer}**Constants**\n") + + for parameter in parameterlist: + print(f"{outer}``{parameter}``") + + if parameterdescs.get(parameter, '') !=3D KernelDoc.undescribe= d: + self.output_highlight(parameterdescs[parameter]) + else: + print(f"{self.lineprefix}*undescribed*\n") + print() + + self.lineprefix =3D oldprefix + self.out_section(args) + + def out_typedef(self, fname, name, args): + + oldprefix =3D self.lineprefix + name =3D args.get('typedef', '') + ln =3D args.get('ln', 0) + + print(f"\n\n.. 
c:type:: {name}\n") + + self.print_lineno(ln) + self.lineprefix =3D " " + + self.output_highlight(args.get('purpose', '')) + + print() + + self.lineprefix =3D oldprefix + self.out_section(args) + + def out_struct(self, fname, name, args): + + name =3D args.get('struct', "") + purpose =3D args.get('purpose', "") + declaration =3D args.get('definition', "") + dtype =3D args.get('type', "struct") + ln =3D args.get('ln', 0) + + parameterlist =3D args.get('parameterlist', []) + parameterdescs =3D args.get('parameterdescs', {}) + parameterdesc_start_lines =3D args.get('parameterdesc_start_lines'= , {}) + + print(f"\n\n.. c:{dtype}:: {name}\n") + + self.print_lineno(ln) + + oldprefix =3D self.lineprefix + self.lineprefix +=3D " " + + self.output_highlight(purpose) + print() + + print(".. container:: kernelindent\n") + print(f"{self.lineprefix}**Definition**::\n") + + self.lineprefix =3D self.lineprefix + " " + + declaration =3D declaration.replace("\t", self.lineprefix) + + print(f"{self.lineprefix}{dtype} {name}" + ' {') + print(f"{declaration}{self.lineprefix}" + "};\n") + + self.lineprefix =3D " " + print(f"{self.lineprefix}**Members**\n") + for parameter in parameterlist: + if not parameter or parameter.startswith("#"): + continue + + parameter_name =3D parameter.split("[", maxsplit=3D1)[0] + + if parameterdescs.get(parameter_name) =3D=3D KernelDoc.undescr= ibed: + continue + + self.print_lineno(parameterdesc_start_lines.get(parameter_name= , 0)) + + print(f"{self.lineprefix}``{parameter}``") + + self.lineprefix =3D " " + self.output_highlight(parameterdescs[parameter_name]) + self.lineprefix =3D " " + + print() + + print() + + self.lineprefix =3D oldprefix + self.out_section(args) + + +class ManFormat(OutputFormat): + """Consts and functions used by man pages output""" + + highlights =3D ( + (type_constant, r"\1"), + (type_constant2, r"\1"), + (type_func, r"\\fB\1\\fP"), + (type_enum, r"\\fI\1\\fP"), + (type_struct, r"\\fI\1\\fP"), + (type_typedef, r"\\fI\1\\fP"), + 
(type_union, r"\\fI\1\\fP"), + (type_param, r"\\fI\1\\fP"), + (type_param_ref, r"\\fI\1\2\\fP"), + (type_member, r"\\fI\1\2\3\\fP"), + (type_fallback, r"\\fI\1\\fP") + ) + blankline =3D "" + + def __init__(self): + """ + Creates class variables. + + Not really mandatory, but it is a good coding style and makes + pylint happy. + """ + + super().__init__() + + dt =3D datetime.now() + if os.environ.get("KBUILD_BUILD_TIMESTAMP", None): + # use UTC TZ + to_zone =3D tz.gettz('UTC') + dt =3D dt.astimezone(to_zone) + + self.man_date =3D dt.strftime("%B %Y") + + def output_highlight(self, block): + + contents =3D self.highlight_block(block) + + if isinstance(contents, list): + contents =3D "\n".join(contents) + + for line in contents.strip("\n").split("\n"): + line =3D Re(r"^\s*").sub("", line) + + if line and line[0] =3D=3D ".": + print("\\&" + line) + else: + print(line) + + def out_doc(self, fname, name, args): + module =3D args.get('module') + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + + print(f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual"= LINUX') + + for section in sectionlist: + print(f'.SH "{section}"') + self.output_highlight(sections.get(section)) + + def out_function(self, fname, name, args): + """output function in man""" + + parameterlist =3D args.get('parameterlist', []) + parameterdescs =3D args.get('parameterdescs', {}) + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + + print(f'.TH "{args['function']}" 9 "{args['function']}" "{self.man= _date}" "Kernel Hacker\'s Manual" LINUX') + + print(".SH NAME") + print(f"{args['function']} \\- {args['purpose']}") + + print(".SH SYNOPSIS") + if args.get('functiontype', ''): + print(f'.B "{args['functiontype']}" {args['function']}') + else: + print(f'.B "{args['function']}') + + count =3D 0 + parenth =3D "(" + post =3D "," + + for parameter in parameterlist: + if count =3D=3D len(parameterlist) - 1: + post =3D ");" + + dtype 
=3D args['parametertypes'].get(parameter, "") + if function_pointer.match(dtype): + # Pointer-to-function + print(f'".BI "{parenth}{function_pointer.group(1)}" " ") (= {function_pointer.group(2)}){post}"') + else: + dtype =3D Re(r'([^\*])$').sub(r'\1 ', dtype) + + print(f'.BI "{parenth}{dtype}" "{post}"') + count +=3D 1 + parenth =3D "" + + if parameterlist: + print(".SH ARGUMENTS") + + for parameter in parameterlist: + parameter_name =3D re.sub(r'\[.*', '', parameter) + + print(f'.IP "{parameter}" 12') + self.output_highlight(parameterdescs.get(parameter_name, "")) + + for section in sectionlist: + print(f'.SH "{section.upper()}"') + self.output_highlight(sections[section]) + + def out_enum(self, fname, name, args): + + name =3D args.get('enum', '') + parameterlist =3D args.get('parameterlist', []) + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + + print(f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_= date}" "API Manual" LINUX') + + print(".SH NAME") + print(f"enum {args['enum']} \\- {args['purpose']}") + + print(".SH SYNOPSIS") + print(f"enum {args['enum']}" + " {") + + count =3D 0 + for parameter in parameterlist: + print(f'.br\n.BI " {parameter}"') + if count =3D=3D len(parameterlist) - 1: + print("\n};") + else: + print(", \n.br") + + count +=3D 1 + + print(".SH Constants") + + for parameter in parameterlist: + parameter_name =3D Re(r'\[.*').sub('', parameter) + print(f'.IP "{parameter}" 12') + self.output_highlight(args['parameterdescs'].get(parameter_nam= e, "")) + + for section in sectionlist: + print(f'.SH "{section}"') + self.output_highlight(sections[section]) + + def out_typedef(self, fname, name, args): + module =3D args.get('module') + typedef =3D args.get('typedef') + purpose =3D args.get('purpose') + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + + print(f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "API Manual= " LINUX') + + print(".SH NAME") + 
print(f"typedef {typedef} \\- {purpose}") + + for section in sectionlist: + print(f'.SH "{section}"') + self.output_highlight(sections.get(section)) + + def out_struct(self, fname, name, args): + module =3D args.get('module') + struct_type =3D args.get('type') + struct_name =3D args.get('struct') + purpose =3D args.get('purpose') + definition =3D args.get('definition') + sectionlist =3D args.get('sectionlist', []) + parameterlist =3D args.get('parameterlist', []) + sections =3D args.get('sections', {}) + parameterdescs =3D args.get('parameterdescs', {}) + + print(f'.TH "{module}" 9 "{struct_type} {struct_name}" "{self.man_= date}" "API Manual" LINUX') + + print(".SH NAME") + print(f"{struct_type} {struct_name} \\- {purpose}") + + # Replace tabs with two spaces and handle newlines + declaration =3D definition.replace("\t", " ") + declaration =3D Re(r"\n").sub('"\n.br\n.BI "', declaration) + + print(".SH SYNOPSIS") + print(f"{struct_type} {struct_name} " + "{" +"\n.br") + print(f'.BI "{declaration}\n' + "};\n.br\n") + + print(".SH Members") + for parameter in parameterlist: + if parameter.startswith("#"): + continue + + parameter_name =3D re.sub(r"\[.*", "", parameter) + + if parameterdescs.get(parameter_name) =3D=3D KernelDoc.undescr= ibed: + continue + + print(f'.IP "{parameter}" 12') + self.output_highlight(parameterdescs.get(parameter_name)) + + for section in sectionlist: + print(f'.SH "{section}"') + self.output_highlight(sections.get(section)) + + +# Command line interface + + +DESC =3D """ +Read C language source or header FILEs, extract embedded documentation com= ments, +and print formatted documentation to standard output. + +The documentation comments are identified by the "/**" opening comment mar= k. + +See Documentation/doc-guide/kernel-doc.rst for the documentation comment s= yntax. +""" + +EXPORT_FILE_DESC =3D """ +Specify an additional FILE in which to look for EXPORT_SYMBOL information. + +May be used multiple times. 
+""" + +EXPORT_DESC =3D """ +Only output documentation for the symbols that have been +exported using EXPORT_SYMBOL() and related macros in any input +FILE or -export-file FILE. +""" + +INTERNAL_DESC =3D """ +Only output documentation for the symbols that have NOT been +exported using EXPORT_SYMBOL() and related macros in any input +FILE or -export-file FILE. +""" + +FUNCTION_DESC =3D """ +Only output documentation for the given function or DOC: section +title. All other functions and DOC: sections are ignored. + +May be used multiple times. +""" + +NOSYMBOL_DESC =3D """ +Exclude the specified symbol from the output documentation. + +May be used multiple times. +""" + +FILES_DESC =3D """ +Header and C source files to be parsed. +""" + +WARN_CONTENTS_BEFORE_SECTIONS_DESC =3D """ +Warns if there are contents before sections (deprecated). + +This option is kept just for backward-compatibility, but it does nothing, +neither here nor at the original Perl script. +""" + + +def main(): + """Main program""" + + parser =3D argparse.ArgumentParser(formatter_class=3Dargparse.RawTextH= elpFormatter, + description=3DDESC) + + # Normal arguments + + parser.add_argument("-v", "-verbose", "--verbose", action=3D"store_tru= e", + help=3D"Verbose output, more warnings and other in= formation.") + + parser.add_argument("-d", "-debug", "--debug", action=3D"store_true", + help=3D"Enable debug messages") + + parser.add_argument("-M", "-modulename", "--modulename", + help=3D"Allow setting a module name at the output.= ") + + parser.add_argument("-l", "-enable-lineno", "--enable_lineno", + action=3D"store_true", + help=3D"Enable line number output (only in ReST mo= de)") + + # Arguments to control the warning behavior + + parser.add_argument("-Wreturn", "--wreturn", action=3D"store_true", + help=3D"Warns about the lack of a return markup on= functions.") + + parser.add_argument("-Wshort-desc", "-Wshort-description", "--wshort-d= esc", + action=3D"store_true", + help=3D"Warns if initial 
short description is miss= ing") + + parser.add_argument("-Wcontents-before-sections", + "--wcontents-before-sections", action=3D"store_tru= e", + help=3DWARN_CONTENTS_BEFORE_SECTIONS_DESC) + + parser.add_argument("-Wall", "--wall", action=3D"store_true", + help=3D"Enable all types of warnings") + + parser.add_argument("-Werror", "--werror", action=3D"store_true", + help=3D"Treat warnings as errors.") + + parser.add_argument("-export-file", "--export-file", action=3D'append', + help=3DEXPORT_FILE_DESC) + + # Output format mutually-exclusive group + + out_group =3D parser.add_argument_group("Output format selection (mutu= ally exclusive)") + + out_fmt =3D out_group.add_mutually_exclusive_group() + + out_fmt.add_argument("-m", "-man", "--man", action=3D"store_true", + help=3D"Output troff manual page format.") + out_fmt.add_argument("-r", "-rst", "--rst", action=3D"store_true", + help=3D"Output reStructuredText format (default).= ") + out_fmt.add_argument("-N", "-none", "--none", action=3D"store_true", + help=3D"Do not output documentation, only warning= s.") + + # Output selection mutually-exclusive group + + sel_group =3D parser.add_argument_group("Output selection (mutually ex= clusive)") + sel_mut =3D sel_group.add_mutually_exclusive_group() + + sel_mut.add_argument("-e", "-export", "--export", action=3D'store_true= ', + help=3DEXPORT_DESC) + + sel_mut.add_argument("-i", "-internal", "--internal", action=3D'store_= true', + help=3DINTERNAL_DESC) + + sel_mut.add_argument("-s", "-function", "--symbol", action=3D'append', + help=3DFUNCTION_DESC) + + # This one is valid for all 3 types of filter + parser.add_argument("-n", "-nosymbol", "--nosymbol", action=3D'append', + help=3DNOSYMBOL_DESC) + + parser.add_argument("files", metavar=3D"FILE", + nargs=3D"+", help=3DFILES_DESC) + + args =3D parser.parse_args() + + if args.wall: + args.wreturn =3D True + args.wshort_desc =3D True + args.wcontents_before_sections =3D True + + if not args.debug: + level =3D logging.INFO + 
else: + level =3D logging.DEBUG + + if args.man: + out_style =3D ManFormat() + elif args.none: + out_style =3D None + else: + out_style =3D RestFormat() + + logging.basicConfig(level=3Dlevel, format=3D"%(levelname)s: %(message)= s") + + kfiles =3D KernelFiles(files=3Dargs.files, verbose=3Dargs.verbose, + out_style=3Dout_style, werror=3Dargs.werror, + wreturn=3Dargs.wreturn, wshort_desc=3Dargs.wshort= _desc, + wcontents_before_sections=3Dargs.wcontents_before= _sections, + modulename=3Dargs.modulename, + export_file=3Dargs.export_file) + + kfiles.parse() + + kfiles.msg(enable_lineno=3Dargs.enable_lineno, export=3Dargs.export, + internal=3Dargs.internal, symbol=3Dargs.symbol, + nosymbol=3Dargs.nosymbol) + + +# Call main method +if __name__ =3D=3D "__main__": + main() --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 12129265CDC; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=XXZ/3gE25KN1/zn3rfKAG7cbjYo+suR6JDepf4Hqyxe8+ugMzivkMCek+fM/EmEzB4FHe6Y/dFQK6D1HOsvFufzY0ZHzUHfJsvSOG597ucXZUrtGQiCKtNSRCqMDHDcnAEy/wAe5CQXSNxnVXdOAezAXmr5JjsTMknwPptRhlMk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=Uh81BCZUvcOm9n0xK8VTsj936JVZ+CIeEHT1iPNOQVA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XXUZB6j7ieKWOtG8Aal66EvNgaKvyDIkkQtqsSbiwjNPIQvxQgXOdjtGvza9y2IkdquDlMnugCIB2m3M41QzBvWe6VT+p9lqbfv3vT3IvrYQqJSz1nlbPISjYq2u32GD9OQ/Uf3mFIwMEbhZ8MKeftHuKZtcy1akdItLTiESnXE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org 
header.i=@kernel.org header.b=RNzL2Oun; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="RNzL2Oun" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9B5E0C4CEEA; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106995; bh=Uh81BCZUvcOm9n0xK8VTsj936JVZ+CIeEHT1iPNOQVA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RNzL2Ounydg9MrIFLUQN/vTk+EgaZ3DZEDYf5d1zi4daJxZYFPQZmvQiz45UPI/jR 5g/gysGbFGTlqKph9p8KFxyF+SNYyYJBJm+4G8m+GE6cz+ScM+fWd6q4ISwbBM2Md1 tR/IhmunVX4Qp1AIuBDl0jM5M3PByUg/YSxsefzeGw250c/+bC5u3O6fMI3HN4aZ2i OONiuk0TewAxruxpCV3CRVbF3R7NrKKSuwzsNhbhvV5BXPUvN8za15qPj57ha4NRGM QfCOHiS8dyLsJCN8p1t9HAspGRhqI1olrsy+NZO1/o8gBtHxWf9kaTypqQl4Ow42ux +wXGR5VrGyOog== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVR-05cN; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 04/33] scripts/kernel-doc.py: output warnings the same way as kerneldoc Date: Tue, 8 Apr 2025 18:09:07 +0800 Message-ID: <559f0ad9e6fecfcbb3cc38b6097463bd38d58629.1744106241.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Add a formatter to logging to produce outputs in a similar way to kernel-doc. This should help making it more compatible with existing scripts. 
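For illustration only (this is not part of the patch): a minimal, self-contained sketch of the approach, assuming a hypothetical class name CapitalizedLevelFormatter and a hypothetical logger name "kdoc-format-demo". It attaches a logging.Formatter subclass that capitalizes the level name, so a message comes out as "Warning: ..." rather than "WARNING: ...".

```python
import io
import logging


class CapitalizedLevelFormatter(logging.Formatter):
    """Hypothetical formatter: emits 'Warning' instead of 'WARNING'."""

    def format(self, record):
        # Rewrite the level name before the standard formatting runs.
        record.levelname = record.levelname.capitalize()
        return super().format(record)


def demo():
    """Log one warning into a string buffer and return what was written."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        CapitalizedLevelFormatter("%(levelname)s: %(message)s"))

    # A dedicated, non-propagating logger keeps the demo isolated.
    logger = logging.getLogger("kdoc-format-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)

    logger.warning("%s:%d %s", "foo.c", 42, "missing description")

    logger.removeHandler(handler)
    return stream.getvalue()


if __name__ == "__main__":
    print(demo(), end="")
```

The sketch uses a named logger and an in-memory stream only so that it can be run in isolation; the patch itself configures the root logger with a plain StreamHandler.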
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index 114f3699bf7c..8625209d6293 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -2715,6 +2715,11 @@ neither here nor at the original Perl script. """ =20 =20 +class MsgFormatter(logging.Formatter): + def format(self, record): + record.levelname =3D record.levelname.capitalize() + return logging.Formatter.format(self, record) + def main(): """Main program""" =20 @@ -2799,10 +2804,19 @@ def main(): args.wshort_desc =3D True args.wcontents_before_sections =3D True =20 + logger =3D logging.getLogger() + if not args.debug: - level =3D logging.INFO + logger.setLevel(logging.INFO) else: - level =3D logging.DEBUG + logger.setLevel(logging.DEBUG) + + formatter =3D MsgFormatter('%(levelname)s: %(message)s') + + handler =3D logging.StreamHandler() + handler.setFormatter(formatter) + + logger.addHandler(handler) =20 if args.man: out_style =3D ManFormat() @@ -2811,8 +2825,6 @@ def main(): else: out_style =3D RestFormat() =20 - logging.basicConfig(level=3Dlevel, format=3D"%(levelname)s: %(message)= s") - kfiles =3D KernelFiles(files=3Dargs.files, verbose=3Dargs.verbose, out_style=3Dout_style, werror=3Dargs.werror, wreturn=3Dargs.wreturn, wshort_desc=3Dargs.wshort= _desc, --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9373A2673A0; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; 
b=AcKa0Qt12sPyas9LRgyFMkOiCv/Dur0OdJ4obZQ9MbZcp5YH2hwa8BlX04ZIU9JOBdWQI07IpXqpPIdvZPMANlAXjWeE6HV/SgZJM7izlvGiV3jCIGzhyC0/MdQHc3h75aLExYigb2FmtdHXNpEsYf2iqqTAIlyAPyAw78lvXN4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=qNbb8ty5fRnxFkcUXw+miskt6Tu11OYOQ5Ju3o0WAzI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qKdm3S8m8v+faCjE8GGEbemvyW4xm6MRxxKG53ckho6uSUtQSGgKtvEc9w1//4yR+5GfsjphvnGZxz7uUOdqHPdVZsfarrxdgSZXyOsyPseUrHgSQdYshVoiH8Ga3HgRRH3JQUIm8Z4BymZ1JLFiNlU9aiblnXlfBVv3rDn1MFg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Igstavse; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Igstavse" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4CE98C4CEEC; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=qNbb8ty5fRnxFkcUXw+miskt6Tu11OYOQ5Ju3o0WAzI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=IgstavseRL2eOlVa2rHZE9QalA4Z2S7D8SROqhmrZv0E701E2uwwLmXoXKaCs5Vfz uYtoc7DOWTlOhni/I/gAeTwiI5KXuN2mMt0GMUDeIse+V46Ks8k5CuLTM1f5OHnoO1 /NwS/fdnwyo3ykZ+lFeNigTuxQe56WC7griisuckf9ltjV+13HZIwrl+dfhU4WoxEx 2QC5uvn23yjDENalxqKgXDeYCnLs3HPgrscz1/vrG3Snk/YbnMU8Z1ibKEkwHS+scl t7m8BP44dIjzvdI1YGwb3ydhl/9BrKQa+m4dc0wxAwx9Pe0UHjmp41nBXgXrXg7lFn x7FA93jrD0hhQ== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVU-0C02; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 05/33] scripts/kernel-doc.py: better handle empty sections Date: Tue, 8 Apr 2025 18:09:08 +0800 Message-ID: 
<1b057092a48ba61d92a411f4f6d505b802913785.1744106241.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" While doing the conversion, we opted to skip empty sections (description, return), but this makes harder to see the differences between kernel-doc (Perl) and kernel-doc.py. Also, the logic doesn't always work properly. So, change the way this is done by adding an extra step to remove such sections, doing it only for Return and Description. Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 31 ++++++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index 8625209d6293..90808d538de7 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -317,6 +317,19 @@ class KernelDoc: name =3D self.entry.section contents =3D self.entry.contents =20 + # TODO: we can prevent dumping empty sections here with: + # + # if self.entry.contents.strip("\n"): + # if start_new: + # self.entry.section =3D self.section_default + # self.entry.contents =3D "" + # + # return + # + # But, as we want to be producing the same output of the + # venerable kernel-doc Perl tool, let's just output everything, + # at least for now + if type_param.match(name): name =3D type_param.group(1) =20 @@ -373,6 +386,19 @@ class KernelDoc: =20 args["type"] =3D dtype =20 + # TODO: use colletions.OrderedDict + + sections =3D args.get('sections', {}) + sectionlist =3D args.get('sectionlist', []) + + # Drop empty sections + # TODO: improve it to emit warnings + for section in [ "Description", "Return" ]: + if section in sectionlist: + if not sections[section].rstrip(): + del sections[section] + sectionlist.remove(section) + 
self.entries.append((name, args)) =20 self.config.log.debug("Output: %s:%s =3D %s", dtype, name, pformat= (args)) @@ -476,7 +502,7 @@ class KernelDoc: # to ignore "[blah" in a parameter string. =20 self.entry.parameterlist.append(param) - org_arg =3D Re(r'\s\s+').sub(' ', org_arg, count=3D1) + org_arg =3D Re(r'\s\s+').sub(' ', org_arg) self.entry.parametertypes[param] =3D org_arg =20 def save_struct_actual(self, actual): @@ -1384,8 +1410,7 @@ class KernelDoc: return =20 if doc_end.search(line): - if self.entry.contents.strip("\n"): - self.dump_section() + self.dump_section() =20 # Look for doc_com + + doc_end: r =3D Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/') --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8238C266F13; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=bu0WXTEINjA5I9O3yf8720CHHy+i+yLX66n1aguJ6F9z7jNf16n+2LmNlm1zExapuI/+RHOaUcKNTqkX5NXhXl4fHz6F+GEcJLMVgLIoq59o7EE15TzV+Kv62SgiWIXAaf4XJrDLOlqI1bLwY+L/Aq3wJl+iKfsjQCjuPz5zPEM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=IVWVxy8Iyjwhz+B3KhlfqTCOeB9G5zFT4ISB8zvAsy0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kaFBBPOkG2GsGiPk6GPirYZ85j/3w02n3eW18+f+0PdWT+JrNURfL7Iitn9QmPk8Qn+TkCT19msugnrKnp5lfBV9Rt+e9ieWlk5BwzlVragPSpJrVYrg5lgHuh+CRxdImdarj7cNF0FIjvPlKjKSQTelWJzHnxn80PNeMOYyv1Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Gm0b+wwa; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Gm0b+wwa" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 88246C4AF09; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=IVWVxy8Iyjwhz+B3KhlfqTCOeB9G5zFT4ISB8zvAsy0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Gm0b+wwaqvH6p3e9SqV8JdmIAGE2MY0aWGNdCCxDR50kszF6Pxl30cX1ohUVLycCK RE2GnkHSPsQPCvif2wL0D/ujd3wVOzwPPMhvbkRp9hk9Vou9f8tf6n35Q1TEp2nn/w 7TLqwEAREIMmaviHXdKIA7Mm3jeTUcysl9j6Wm7FcktqWEcMISHNpyZGLEG3C66qr5 G2VshhxFwMJWUdrJnYFaPPBfD3grsd7gXpjztdK3aaBu+bSnT0lV+MQ71ZywR5Bo9X APl2zZ/ZzFaoIZgw0UdCneb+XMGMv4NtHmOqv8Bl7UIcn3ZdQunQ9XfIcIoQAWVzzI DSk+AyGVRSXWQ== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVX-0HaZ; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 06/33] scripts/kernel-doc.py: properly handle struct_group macros Date: Tue, 8 Apr 2025 18:09:09 +0800 Message-ID: <74dee485f70b7ce85e90496bfdd360283a677a58.1744106241.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Handling nested parentheses with regular expressions is not an easy task. It is even harder with Python's re module, which supports only a subset of regular expression syntax and lacks more advanced features such as recursive patterns and atomic grouping. We could use the Python regex module instead, but the resulting regular expressions would still be very hard to understand. So, instead, add logic to properly match delimiters.
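As an illustration only, the delimiter-matching idea can be sketched as below. This is a simplified, hypothetical version and not the NestedMatch class added by this patch: the names PAIRS, DELIMS, find_balanced and unwrap are made up for the example, and unlike the patch it does not drop a trailing ';'. It assumes the search regex ends at an opening delimiter.

```python
import re

# Hypothetical names for illustration; not taken from the patch.
PAIRS = {'(': ')', '[': ']', '{': '}'}
DELIMS = re.compile(r'[(){}\[\]]')


def find_balanced(prefix, text):
    """Yield (start, body_start, end) for each match of `prefix` whose
    final character is an open delimiter with a properly paired close."""
    for m in prefix.finditer(text):
        opener = text[m.end() - 1]
        if opener not in PAIRS:
            continue

        # Stack of close delimiters we still expect to see.
        stack = [PAIRS[opener]]

        for d in DELIMS.finditer(text, m.end()):
            ch = d.group()
            if ch in PAIRS:
                stack.append(PAIRS[ch])
            elif ch == stack[-1]:
                stack.pop()
                if not stack:
                    yield m.start(), m.end(), d.end()
                    break
            else:
                break   # mismatched close delimiter: skip this match


def unwrap(prefix, text):
    """Replace each balanced `prefix ... close` span by its inner content."""
    out = []
    cur = 0
    for start, body_start, end in find_balanced(prefix, text):
        if start < cur:
            continue    # nested inside a span that was already replaced
        out.append(text[cur:start])
        out.append(text[body_start:end - 1])
        cur = end
    out.append(text[cur:])
    return ''.join(out)
```

With the input "STRUCT_GROUP(int a; void (*cb)(int);) int b;" and the prefix r'\bSTRUCT_GROUP\(', unwrap() keeps the function-pointer member intact, whereas a plain pattern such as \bSTRUCT_GROUP\(([^\)]+)\) stops at the first ')' it finds.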
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 220 ++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 213 insertions(+), 7 deletions(-) diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index 90808d538de7..fb96d42d287c 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -167,6 +167,172 @@ class Re: def group(self, num): return self.last_match.group(num) =20 +class NestedMatch: + """ + Finding nested delimiters is hard with regular expressions. It is + even harder on Python with its normal re module, as there are several + advanced regular expressions that are missing. + + This is the case of this pattern: + + '\\bSTRUCT_GROUP(\\(((?:(?>[^)(]+)|(?1))*)\\))[^;]*;' + + which is used to properly match open/close parenthesis of the + string search STRUCT_GROUP(), + + Add a class that counts pairs of delimiters, using it to match and + replace nested expressions. + + The original approach was suggested by: + https://stackoverflow.com/questions/5454322/python-how-to-match-ne= sted-parentheses-with-regex + + Although I re-implemented it to make it more generic and match 3 types + of delimiters. The logic checks if delimiters are paired. If not, it + will ignore the search string. + """ + + # TODO: + # Right now, regular expressions to match it are defined only up to + # the start delimiter, e.g.: + # + # \bSTRUCT_GROUP\( + # + # is similar to: STRUCT_GROUP\((.*)\) + # except that the content inside the match group is delimiter's aligne= d. + # + # The content inside parenthesis are converted into a single replace + # group (e.g. r`\1'). + # + # It would be nice to change such definition to support multiple + # match groups, allowing a regex equivalent to. 
+ # + # FOO\((.*), (.*), (.*)\) + # + # it is probably easier to define it not as a regular expression, but + # with some lexical definition like: + # + # FOO(arg1, arg2, arg3) + + + DELIMITER_PAIRS =3D { + '{': '}', + '(': ')', + '[': ']', + } + + RE_DELIM =3D re.compile(r'[\{\}\[\]\(\)]') + + def _search(self, regex, line): + """ + Finds paired blocks for a regex that ends with a delimiter. + + The suggestion of using finditer to match pairs came from: + https://stackoverflow.com/questions/5454322/python-how-to-match-ne= sted-parentheses-with-regex + but I ended using a different implementation to align all three ty= pes + of delimiters and seek for an initial regular expression. + + The algorithm seeks for open/close paired delimiters and place them + into a stack, yielding a start/stop position of each match when t= he + stack is zeroed. + + The algorithm shoud work fine for properly paired lines, but will + silently ignore end delimiters that preceeds an start delimiter. + This should be OK for kernel-doc parser, as unaligned delimiters + would cause compilation errors. So, we don't need to rise exceptio= ns + to cover such issues. + """ + + stack =3D [] + + for match_re in regex.finditer(line): + start =3D match_re.start() + offset =3D match_re.end() + + d =3D line[offset -1] + if d not in self.DELIMITER_PAIRS: + continue + + end =3D self.DELIMITER_PAIRS[d] + stack.append(end) + + for match in self.RE_DELIM.finditer(line[offset:]): + pos =3D match.start() + offset + + d =3D line[pos] + + if d in self.DELIMITER_PAIRS: + end =3D self.DELIMITER_PAIRS[d] + + stack.append(end) + continue + + # Does the end delimiter match what it is expected? + if stack and d =3D=3D stack[-1]: + stack.pop() + + if not stack: + yield start, offset, pos + 1 + break + + def search(self, regex, line): + """ + This is similar to re.search: + + It matches a regex that it is followed by a delimiter, + returning occurrences only if all delimiters are paired. 
+ """ + + for t in self._search(regex, line): + + yield line[t[0]:t[2]] + + def sub(self, regex, sub, line, count=3D0): + """ + This is similar to re.sub: + + It matches a regex that it is followed by a delimiter, + replacing occurrences only if all delimiters are paired. + + if r'\1' is used, it works just like re: it places there the + matched paired data with the delimiter stripped. + + If count is different than zero, it will replace at most count + items. + """ + out =3D "" + + cur_pos =3D 0 + n =3D 0 + + found =3D False + for start, end, pos in self._search(regex, line): + out +=3D line[cur_pos:start] + + # Value, ignoring start/end delimiters + value =3D line[end:pos - 1] + + # replaces \1 at the sub string, if \1 is used there + new_sub =3D sub + new_sub =3D new_sub.replace(r'\1', value) + + out +=3D new_sub + + # Drop end ';' if any + if line[pos] =3D=3D ';': + pos +=3D 1 + + cur_pos =3D pos + n +=3D 1 + + if count and count >=3D n: + break + + # Append the remaining string + l =3D len(line) + out +=3D line[cur_pos:l] + + return out + # # Regular expressions used to parse kernel-doc markups at KernelDoc class. # @@ -738,22 +904,49 @@ class KernelDoc: (Re(r'\s*____cacheline_aligned_in_smp', re.S), ' '), (Re(r'\s*____cacheline_aligned', re.S), ' '), =20 - # Unwrap struct_group() based on this definition: + # Unwrap struct_group macros based on this definition: # __struct_group(TAG, NAME, ATTRS, MEMBERS...) # which has variants like: struct_group(NAME, MEMBERS...) + # Only MEMBERS arguments require documentation. + # + # Parsing them happens on two steps: + # + # 1. drop struct group arguments that aren't at MEMBERS, + # storing them as STRUCT_GROUP(MEMBERS) + # + # 2. remove STRUCT_GROUP() ancillary macro. 
+ # + # The original logic used to remove STRUCT_GROUP() using an + # advanced regex: + # + # \bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*; + # + # with two patterns that are incompatible with + # Python re module, as it has: + # + # - a recursive pattern: (?1) + # - an atomic grouping: (?>...) + # + # I tried a simpler version: but it didn't work either: + # \bSTRUCT_GROUP\(([^\)]+)\)[^;]*; + # + # As it doesn't properly match the end parenthesis on some cas= es. + # + # So, a better solution was crafted: there's now a NestedMatch + # class that ensures that delimiters after a search are proper= ly + # matched. So, the implementation to drop STRUCT_GROUP() will = be + # handled in separate. =20 (Re(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('), (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_G= ROUP('), (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r's= truct \1 \2; STRUCT_GROUP('), (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROU= P('), =20 - # This is incompatible with Python re, as it uses: - # recursive patterns ((?1)) and atomic grouping ((?>...)): - # '\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;' - # Let's see if this works instead: - (Re(r'\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;', re.S), r'\1'), - # Replace macros + # + # TODO: it is better to also move those to the NestedMatch log= ic, + # to ensure that parenthesis will be properly matched. + (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S),= r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'), (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DE= CLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'), (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pat= tern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'), @@ -765,9 +958,22 @@ class KernelDoc: (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S)= , r'__u32 \1'), ] =20 + # Regexes here are guaranteed to have the end limiter matching + # the start delimiter. 
Yet, right now, only one replace group + # is allowed. + + sub_nested_prefixes =3D [ + (re.compile(r'\bSTRUCT_GROUP\('), r'\1'), + ] + for search, sub in sub_prefixes: members =3D search.sub(sub, members) =20 + nested =3D NestedMatch() + + for search, sub in sub_nested_prefixes: + members =3D nested.sub(search, sub, members) + # Keeps the original declaration as-is declaration =3D members =20 --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 700BB8488; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=M9icrKLcM7g90BkM7Dxdyj948QREmJV//9Y0dLpx6XviELa+C/hfdX+MRLrL47yd5xl8JjNDhKvUsXNGwojk70y/wXzfmj3UQjHMKXTDw8SYqWrKz2A2O71XQ0Yw4al/P+OHpefQY6z0EtGxBw6h/DPbmn6BfQX5ZipHPUzEFas= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=jm1K5mJ5d8V7brXcbsIGHEwX3cLBPOTXA1hQuPqkDsM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bzPyeDP2vKzMKS79WipLjUcuHy5cZn2LvXE6XSIcQjCalEB1eyBpaCge8IaBJmwkxRFVCIikIk9BF8iAxT++StmEP+eav7Y7qU92iipvoh9zhQUo+BVUty4Zw+K9PGwR4C8R0sptXbBafkxWu1vlyCjTp5pQJZ7mo74Q+SHDZ8g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=N99jDxdh; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="N99jDxdh" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8668CC4CEE8; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=kernel.org; s=k20201202; t=1744106996; bh=jm1K5mJ5d8V7brXcbsIGHEwX3cLBPOTXA1hQuPqkDsM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=N99jDxdhmp5Jp09jRqlSEXoDu6pio6TYLfh2MPXQiOQqf5aZ4Qi4PcZRVkFvYuttI a9DnZm3uFR1LocXcbp72RwkB8scr7R3cmNzPGul9vKGiOUTPKFWwI2NebTDU4bp2tg rULKfyqW4QDPTS0Bz88nOaRZyba/bfzAMjB0u1kNBIeajYVokhcqZjjDko+Zgqz0tk BInIDNtE9EbyxTarLOeaTaATMQDFQD6lkdqdrAAUhYG7JtLegKWASgqDWOemUpfwlq LfhXBS/N89DipMhjKCj03o8Xn/cLhaCV3jrrYmbxqYlfKh4Uwko3wAKBtBrG7veczH 0BRAGYOqDIpXA== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVa-0Neo; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 07/33] scripts/kernel-doc.py: move regex methods to a separate file Date: Tue, 8 Apr 2025 18:09:10 +0800 Message-ID: <64f96b6744435b51894bb4ab7612851d9d054190.1744106241.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" In preparation for letting the kerneldoc Sphinx extension import Python libraries, move the regex ancillary classes to a separate file.
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 223 +---------------------------- scripts/lib/kdoc/kdoc_re.py | 272 ++++++++++++++++++++++++++++++++++++ 2 files changed, 277 insertions(+), 218 deletions(-) create mode 100755 scripts/lib/kdoc/kdoc_re.py diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index fb96d42d287c..7f00c8c86a78 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -110,228 +110,15 @@ from pprint import pformat =20 from dateutil import tz =20 -# Local cache for regular expressions -re_cache =3D {} +# Import Python modules =20 +LIB_DIR =3D "lib/kdoc" +SRC_DIR =3D os.path.dirname(os.path.realpath(__file__)) =20 -class Re: - """ - Helper class to simplify regex declaration and usage, +sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR)) =20 - It calls re.compile for a given pattern. It also allows adding - regular expressions and define sub at class init time. +from kdoc_re import Re, NestedMatch =20 - Regular expressions can be cached via an argument, helping to speedup - searches. 
- """ - - def _add_regex(self, string, flags): - if string in re_cache: - self.regex =3D re_cache[string] - else: - self.regex =3D re.compile(string, flags=3Dflags) - - if self.cache: - re_cache[string] =3D self.regex - - def __init__(self, string, cache=3DTrue, flags=3D0): - self.cache =3D cache - self.last_match =3D None - - self._add_regex(string, flags) - - def __str__(self): - return self.regex.pattern - - def __add__(self, other): - return Re(str(self) + str(other), cache=3Dself.cache or other.cach= e, - flags=3Dself.regex.flags | other.regex.flags) - - def match(self, string): - self.last_match =3D self.regex.match(string) - return self.last_match - - def search(self, string): - self.last_match =3D self.regex.search(string) - return self.last_match - - def findall(self, string): - return self.regex.findall(string) - - def split(self, string): - return self.regex.split(string) - - def sub(self, sub, string, count=3D0): - return self.regex.sub(sub, string, count=3Dcount) - - def group(self, num): - return self.last_match.group(num) - -class NestedMatch: - """ - Finding nested delimiters is hard with regular expressions. It is - even harder on Python with its normal re module, as there are several - advanced regular expressions that are missing. - - This is the case of this pattern: - - '\\bSTRUCT_GROUP(\\(((?:(?>[^)(]+)|(?1))*)\\))[^;]*;' - - which is used to properly match open/close parenthesis of the - string search STRUCT_GROUP(), - - Add a class that counts pairs of delimiters, using it to match and - replace nested expressions. - - The original approach was suggested by: - https://stackoverflow.com/questions/5454322/python-how-to-match-ne= sted-parentheses-with-regex - - Although I re-implemented it to make it more generic and match 3 types - of delimiters. The logic checks if delimiters are paired. If not, it - will ignore the search string. 
- """ - - # TODO: - # Right now, regular expressions to match it are defined only up to - # the start delimiter, e.g.: - # - # \bSTRUCT_GROUP\( - # - # is similar to: STRUCT_GROUP\((.*)\) - # except that the content inside the match group is delimiter's aligne= d. - # - # The content inside parenthesis are converted into a single replace - # group (e.g. r`\1'). - # - # It would be nice to change such definition to support multiple - # match groups, allowing a regex equivalent to. - # - # FOO\((.*), (.*), (.*)\) - # - # it is probably easier to define it not as a regular expression, but - # with some lexical definition like: - # - # FOO(arg1, arg2, arg3) - - - DELIMITER_PAIRS =3D { - '{': '}', - '(': ')', - '[': ']', - } - - RE_DELIM =3D re.compile(r'[\{\}\[\]\(\)]') - - def _search(self, regex, line): - """ - Finds paired blocks for a regex that ends with a delimiter. - - The suggestion of using finditer to match pairs came from: - https://stackoverflow.com/questions/5454322/python-how-to-match-ne= sted-parentheses-with-regex - but I ended using a different implementation to align all three ty= pes - of delimiters and seek for an initial regular expression. - - The algorithm seeks for open/close paired delimiters and place them - into a stack, yielding a start/stop position of each match when t= he - stack is zeroed. - - The algorithm shoud work fine for properly paired lines, but will - silently ignore end delimiters that preceeds an start delimiter. - This should be OK for kernel-doc parser, as unaligned delimiters - would cause compilation errors. So, we don't need to rise exceptio= ns - to cover such issues. 
- """ - - stack =3D [] - - for match_re in regex.finditer(line): - start =3D match_re.start() - offset =3D match_re.end() - - d =3D line[offset -1] - if d not in self.DELIMITER_PAIRS: - continue - - end =3D self.DELIMITER_PAIRS[d] - stack.append(end) - - for match in self.RE_DELIM.finditer(line[offset:]): - pos =3D match.start() + offset - - d =3D line[pos] - - if d in self.DELIMITER_PAIRS: - end =3D self.DELIMITER_PAIRS[d] - - stack.append(end) - continue - - # Does the end delimiter match what it is expected? - if stack and d =3D=3D stack[-1]: - stack.pop() - - if not stack: - yield start, offset, pos + 1 - break - - def search(self, regex, line): - """ - This is similar to re.search: - - It matches a regex that it is followed by a delimiter, - returning occurrences only if all delimiters are paired. - """ - - for t in self._search(regex, line): - - yield line[t[0]:t[2]] - - def sub(self, regex, sub, line, count=3D0): - """ - This is similar to re.sub: - - It matches a regex that it is followed by a delimiter, - replacing occurrences only if all delimiters are paired. - - if r'\1' is used, it works just like re: it places there the - matched paired data with the delimiter stripped. - - If count is different than zero, it will replace at most count - items. - """ - out =3D "" - - cur_pos =3D 0 - n =3D 0 - - found =3D False - for start, end, pos in self._search(regex, line): - out +=3D line[cur_pos:start] - - # Value, ignoring start/end delimiters - value =3D line[end:pos - 1] - - # replaces \1 at the sub string, if \1 is used there - new_sub =3D sub - new_sub =3D new_sub.replace(r'\1', value) - - out +=3D new_sub - - # Drop end ';' if any - if line[pos] =3D=3D ';': - pos +=3D 1 - - cur_pos =3D pos - n +=3D 1 - - if count and count >=3D n: - break - - # Append the remaining string - l =3D len(line) - out +=3D line[cur_pos:l] - - return out =20 # # Regular expressions used to parse kernel-doc markups at KernelDoc class. 
diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py new file mode 100755 index 000000000000..512b6521e79d --- /dev/null +++ b/scripts/lib/kdoc/kdoc_re.py @@ -0,0 +1,272 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# Copyright(c) 2025: Mauro Carvalho Chehab . + +""" +Regular expression ancillary classes. + +Those help caching regular expressions and do matching for kernel-doc. +""" + +import re + +# Local cache for regular expressions +re_cache =3D {} + + +class Re: + """ + Helper class to simplify regex declaration and usage, + + It calls re.compile for a given pattern. It also allows adding + regular expressions and define sub at class init time. + + Regular expressions can be cached via an argument, helping to speedup + searches. + """ + + def _add_regex(self, string, flags): + """ + Adds a new regex or re-use it from the cache. + """ + + if string in re_cache: + self.regex =3D re_cache[string] + else: + self.regex =3D re.compile(string, flags=3Dflags) + + if self.cache: + re_cache[string] =3D self.regex + + def __init__(self, string, cache=3DTrue, flags=3D0): + """ + Compile a regular expression and initialize internal vars. + """ + + self.cache =3D cache + self.last_match =3D None + + self._add_regex(string, flags) + + def __str__(self): + """ + Return the regular expression pattern. + """ + return self.regex.pattern + + def __add__(self, other): + """ + Allows adding two regular expressions into one. 
+ """ + + return Re(str(self) + str(other), cache=3Dself.cache or other.cach= e, + flags=3Dself.regex.flags | other.regex.flags) + + def match(self, string): + """ + Handles a re.match storing its results + """ + + self.last_match =3D self.regex.match(string) + return self.last_match + + def search(self, string): + """ + Handles a re.search storing its results + """ + + self.last_match =3D self.regex.search(string) + return self.last_match + + def findall(self, string): + """ + Alias to re.findall + """ + + return self.regex.findall(string) + + def split(self, string): + """ + Alias to re.split + """ + + return self.regex.split(string) + + def sub(self, sub, string, count=3D0): + """ + Alias to re.sub + """ + + return self.regex.sub(sub, string, count=3Dcount) + + def group(self, num): + """ + Returns the group results of the last match + """ + + return self.last_match.group(num) + + +class NestedMatch: + """ + Finding nested delimiters is hard with regular expressions. It is + even harder on Python with its normal re module, as there are several + advanced regular expressions that are missing. + + This is the case of this pattern: + + '\\bSTRUCT_GROUP(\\(((?:(?>[^)(]+)|(?1))*)\\))[^;]*;' + + which is used to properly match open/close parenthesis of the + string search STRUCT_GROUP(), + + Add a class that counts pairs of delimiters, using it to match and + replace nested expressions. + + The original approach was suggested by: + https://stackoverflow.com/questions/5454322/python-how-to-match-ne= sted-parentheses-with-regex + + Although I re-implemented it to make it more generic and match 3 types + of delimiters. The logic checks if delimiters are paired. If not, it + will ignore the search string. + """ + + # TODO: + # Right now, regular expressions to match it are defined only up to + # the start delimiter, e.g.: + # + # \bSTRUCT_GROUP\( + # + # is similar to: STRUCT_GROUP\((.*)\) + # except that the content inside the match group is delimiter's aligne= d. 
+ # + # The content inside parenthesis are converted into a single replace + # group (e.g. r`\1'). + # + # It would be nice to change such definition to support multiple + # match groups, allowing a regex equivalent to. + # + # FOO\((.*), (.*), (.*)\) + # + # it is probably easier to define it not as a regular expression, but + # with some lexical definition like: + # + # FOO(arg1, arg2, arg3) + + DELIMITER_PAIRS =3D { + '{': '}', + '(': ')', + '[': ']', + } + + RE_DELIM =3D re.compile(r'[\{\}\[\]\(\)]') + + def _search(self, regex, line): + """ + Finds paired blocks for a regex that ends with a delimiter. + + The suggestion of using finditer to match pairs came from: + https://stackoverflow.com/questions/5454322/python-how-to-match-ne= sted-parentheses-with-regex + but I ended using a different implementation to align all three ty= pes + of delimiters and seek for an initial regular expression. + + The algorithm seeks for open/close paired delimiters and place them + into a stack, yielding a start/stop position of each match when t= he + stack is zeroed. + + The algorithm shoud work fine for properly paired lines, but will + silently ignore end delimiters that preceeds an start delimiter. + This should be OK for kernel-doc parser, as unaligned delimiters + would cause compilation errors. So, we don't need to rise exceptio= ns + to cover such issues. + """ + + stack =3D [] + + for match_re in regex.finditer(line): + start =3D match_re.start() + offset =3D match_re.end() + + d =3D line[offset - 1] + if d not in self.DELIMITER_PAIRS: + continue + + end =3D self.DELIMITER_PAIRS[d] + stack.append(end) + + for match in self.RE_DELIM.finditer(line[offset:]): + pos =3D match.start() + offset + + d =3D line[pos] + + if d in self.DELIMITER_PAIRS: + end =3D self.DELIMITER_PAIRS[d] + + stack.append(end) + continue + + # Does the end delimiter match what it is expected? 
+ if stack and d =3D=3D stack[-1]: + stack.pop() + + if not stack: + yield start, offset, pos + 1 + break + + def search(self, regex, line): + """ + This is similar to re.search: + + It matches a regex that it is followed by a delimiter, + returning occurrences only if all delimiters are paired. + """ + + for t in self._search(regex, line): + + yield line[t[0]:t[2]] + + def sub(self, regex, sub, line, count=3D0): + """ + This is similar to re.sub: + + It matches a regex that it is followed by a delimiter, + replacing occurrences only if all delimiters are paired. + + if r'\1' is used, it works just like re: it places there the + matched paired data with the delimiter stripped. + + If count is different than zero, it will replace at most count + items. + """ + out =3D "" + + cur_pos =3D 0 + n =3D 0 + + for start, end, pos in self._search(regex, line): + out +=3D line[cur_pos:start] + + # Value, ignoring start/end delimiters + value =3D line[end:pos - 1] + + # replaces \1 at the sub string, if \1 is used there + new_sub =3D sub + new_sub =3D new_sub.replace(r'\1', value) + + out +=3D new_sub + + # Drop end ';' if any + if line[pos] =3D=3D ';': + pos +=3D 1 + + cur_pos =3D pos + n +=3D 1 + + if count and count >=3D n: + break + + # Append the remaining string + l =3D len(line) + out +=3D line[cur_pos:l] + + return out --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C0BA4266579; Tue, 8 Apr 2025 10:09:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106999; cv=none; 
b=muu6dFYggnrbUWzJxmJfk0whiNquK+PEV8juBG0DM9eBNj4mjvyjsYvFgjnJmK7qA+WmQdiTpk6A7qxi8Syu285A7YmYXTNvzC0JBEkP9Zq8c8TYGkzy02/2ePoym5DlLxBIMkoRb72zEbQZ5ucFp9m+QMGXZJcg8mDyiNIrEKU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106999; c=relaxed/simple; bh=F8pKPq7VopAvbcnhMqVHrebIqO/8qwK6q2nGyxBl+cs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UhQX3XbnxgcueiOFztSOFwwOH4ZZ+KEsH0V5/bMorG9Iotvrt1f30eXVWfHBX5sS6oLndclG03rRYumnLZzifuLJHaX+Aothc9ND2cBLO1RkFAPGt43gK3gF6LOv4RY2PvgLeGkHzrBo2XUGnizOZtdkTjRfLoniQ7Uf1Su/LVM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=B/jTytkN; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="B/jTytkN" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D6F82C4AF09; Tue, 8 Apr 2025 10:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106999; bh=F8pKPq7VopAvbcnhMqVHrebIqO/8qwK6q2nGyxBl+cs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=B/jTytkN0Qnm8btEjNfbMOfM0qiQ1r8f/0y0AHyV6D7DofIQ1T6lyPXHJrN55V2jv tGToDO06vOtkysHB5pLL21YGmjn+zl5xs4vmdYyp4DTq9J0fdDCPbTaeCR0AK/jl0b +B0xVjydXBPYuj5DWmYQOEJelU9f2Jl+Q6L7ziVzCqCZTiuVe65im1RTUEXl2NXuI9 AeRTDmGunXWOeFQNslPEzz1tx/XszLp6xHv9VII+GnvxON4jyQcCFeC3uLQjZ2S/lc 3X3l9ppUv/u1tnpqci0EUgjiUmomzV1gd3LYOTjhy+eyDAFTHBPl+9ybexqXJtaqqg OjdaRTmZ4OYig== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVd-0WIz; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , "Gustavo A. R. 
Silva" , Kees Cook , Sean Anderson , linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v3 08/33] scripts/kernel-doc.py: move KernelDoc class to a separate file Date: Tue, 8 Apr 2025 18:09:11 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" In preparation for letting the kerneldoc Sphinx extension import Python libraries, move the KernelDoc class to a separate file. Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 1634 +----------------------------- scripts/lib/kdoc/kdoc_parser.py | 1690 +++++++++++++++++++++++++++++++ 2 files changed, 1692 insertions(+), 1632 deletions(-) create mode 100755 scripts/lib/kdoc/kdoc_parser.py diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index 7f00c8c86a78..f030a36a165b 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -117,53 +117,15 @@ SRC_DIR =3D os.path.dirname(os.path.realpath(__file__= )) =20 sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR)) =20 -from kdoc_re import Re, NestedMatch +from kdoc_parser import KernelDoc, type_param +from kdoc_re import Re =20 - -# -# Regular expressions used to parse kernel-doc markups at KernelDoc class. -# -# Let's declare them in lowercase outside any class to make easier to -# convert from the python script. -# -# As those are evaluated at the beginning, no need to cache them -# - - -# Allow whitespace at end of comment start. 
-doc_start =3D Re(r'^/\*\*\s*$', cache=3DFalse) - -doc_end =3D Re(r'\*/', cache=3DFalse) -doc_com =3D Re(r'\s*\*\s*', cache=3DFalse) -doc_com_body =3D Re(r'\s*\* ?', cache=3DFalse) -doc_decl =3D doc_com + Re(r'(\w+)', cache=3DFalse) - -# @params and a strictly limited set of supported section names -# Specifically: -# Match @word: -# @...: -# @{section-name}: -# while trying to not match literal block starts like "example::" -# -doc_sect =3D doc_com + \ - Re(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?= |examples?)\s*:([^:].*)?$', - flags=3Dre.I, cache=3DFalse) - -doc_content =3D doc_com_body + Re(r'(.*)', cache=3DFalse) -doc_block =3D doc_com + Re(r'DOC:\s*(.*)?', cache=3DFalse) -doc_inline_start =3D Re(r'^\s*/\*\*\s*$', cache=3DFalse) -doc_inline_sect =3D Re(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=3DFalse) -doc_inline_end =3D Re(r'^\s*\*/\s*$', cache=3DFalse) -doc_inline_oneline =3D Re(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cac= he=3DFalse) function_pointer =3D Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=3DFalse) -attribute =3D Re(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)", - flags=3Dre.I | re.S, cache=3DFalse) =20 # match expressions used to find embedded type information type_constant =3D Re(r"\b``([^\`]+)``\b", cache=3DFalse) type_constant2 =3D Re(r"\%([-_*\w]+)", cache=3DFalse) type_func =3D Re(r"(\w+)\(\)", cache=3DFalse) -type_param =3D Re(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=3DFalse) type_param_ref =3D Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cac= he=3DFalse) =20 # Special RST handling for func ptr params @@ -181,1598 +143,6 @@ type_member =3D Re(r"\&([_\w]+)(\.|->)([_\w]+)", cac= he=3DFalse) type_fallback =3D Re(r"\&([_\w]+)", cache=3DFalse) type_member_func =3D type_member + Re(r"\(\)", cache=3DFalse) =20 -export_symbol =3D Re(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cac= he=3DFalse) -export_symbol_ns =3D Re(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"= \S+"\)\s*', cache=3DFalse) - -class 
KernelDoc: - # Parser states - STATE_NORMAL =3D 0 # normal code - STATE_NAME =3D 1 # looking for function name - STATE_BODY_MAYBE =3D 2 # body - or maybe more description - STATE_BODY =3D 3 # the body of the comment - STATE_BODY_WITH_BLANK_LINE =3D 4 # the body which has a blank line - STATE_PROTO =3D 5 # scanning prototype - STATE_DOCBLOCK =3D 6 # documentation block - STATE_INLINE =3D 7 # gathering doc outside main block - - st_name =3D [ - "NORMAL", - "NAME", - "BODY_MAYBE", - "BODY", - "BODY_WITH_BLANK_LINE", - "PROTO", - "DOCBLOCK", - "INLINE", - ] - - # Inline documentation state - STATE_INLINE_NA =3D 0 # not applicable ($state !=3D STATE_INLINE) - STATE_INLINE_NAME =3D 1 # looking for member name (@foo:) - STATE_INLINE_TEXT =3D 2 # looking for member documentation - STATE_INLINE_END =3D 3 # done - STATE_INLINE_ERROR =3D 4 # error - Comment without header was found. - # Spit a warning as it's not - # proper kernel-doc and ignore the rest. - - st_inline_name =3D [ - "", - "_NAME", - "_TEXT", - "_END", - "_ERROR", - ] - - # Section names - - section_default =3D "Description" # default section - section_intro =3D "Introduction" - section_context =3D "Context" - section_return =3D "Return" - - undescribed =3D "-- undescribed --" - - def __init__(self, config, fname): - """Initialize internal variables""" - - self.fname =3D fname - self.config =3D config - - # Initial state for the state machines - self.state =3D self.STATE_NORMAL - self.inline_doc_state =3D self.STATE_INLINE_NA - - # Store entry currently being processed - self.entry =3D None - - # Place all potential outputs into an array - self.entries =3D [] - - def show_warnings(self, dtype, declaration_name): - # TODO: implement it - - return True - - # TODO: rename to emit_message - def emit_warning(self, ln, msg, warning=3DTrue): - """Emit a message""" - - if warning: - self.config.log.warning("%s:%d %s", self.fname, ln, msg) - else: - self.config.log.info("%s:%d %s", self.fname, ln, msg) - - def 
dump_section(self, start_new=3DTrue): - """ - Dumps section contents to arrays/hashes intended for that purpose. - """ - - name =3D self.entry.section - contents =3D self.entry.contents - - # TODO: we can prevent dumping empty sections here with: - # - # if self.entry.contents.strip("\n"): - # if start_new: - # self.entry.section =3D self.section_default - # self.entry.contents =3D "" - # - # return - # - # But, as we want to be producing the same output of the - # venerable kernel-doc Perl tool, let's just output everything, - # at least for now - - if type_param.match(name): - name =3D type_param.group(1) - - self.entry.parameterdescs[name] =3D contents - self.entry.parameterdesc_start_lines[name] =3D self.entry.new_= start_line - - self.entry.sectcheck +=3D name + " " - self.entry.new_start_line =3D 0 - - elif name =3D=3D "@...": - name =3D "..." - self.entry.parameterdescs[name] =3D contents - self.entry.sectcheck +=3D name + " " - self.entry.parameterdesc_start_lines[name] =3D self.entry.new_= start_line - self.entry.new_start_line =3D 0 - - else: - if name in self.entry.sections and self.entry.sections[name] != =3D "": - # Only warn on user-specified duplicate section names - if name !=3D self.section_default: - self.emit_warning(self.entry.new_start_line, - f"duplicate section name '{name}'\n") - self.entry.sections[name] +=3D contents - else: - self.entry.sections[name] =3D contents - self.entry.sectionlist.append(name) - self.entry.section_start_lines[name] =3D self.entry.new_st= art_line - self.entry.new_start_line =3D 0 - -# self.config.log.debug("Section: %s : %s", name, pformat(vars(self= .entry))) - - if start_new: - self.entry.section =3D self.section_default - self.entry.contents =3D "" - - # TODO: rename it to store_declaration - def output_declaration(self, dtype, name, **args): - """ - Stores the entry into an entry array. 
- - The actual output and output filters will be handled elsewhere - """ - - # The implementation here is different than the original kernel-do= c: - # instead of checking for output filters or actually output anythi= ng, - # it just stores the declaration content at self.entries, as the - # output will happen on a separate class. - # - # For now, we're keeping the same name of the function just to make - # easier to compare the source code of both scripts - - if "declaration_start_line" not in args: - args["declaration_start_line"] =3D self.entry.declaration_star= t_line - - args["type"] =3D dtype - - # TODO: use colletions.OrderedDict - - sections =3D args.get('sections', {}) - sectionlist =3D args.get('sectionlist', []) - - # Drop empty sections - # TODO: improve it to emit warnings - for section in [ "Description", "Return" ]: - if section in sectionlist: - if not sections[section].rstrip(): - del sections[section] - sectionlist.remove(section) - - self.entries.append((name, args)) - - self.config.log.debug("Output: %s:%s =3D %s", dtype, name, pformat= (args)) - - def reset_state(self, ln): - """ - Ancillary routine to create a new entry. It initializes all - variables used by the state machine. 
- """ - - self.entry =3D argparse.Namespace - - self.entry.contents =3D "" - self.entry.function =3D "" - self.entry.sectcheck =3D "" - self.entry.struct_actual =3D "" - self.entry.prototype =3D "" - - self.entry.parameterlist =3D [] - self.entry.parameterdescs =3D {} - self.entry.parametertypes =3D {} - self.entry.parameterdesc_start_lines =3D {} - - self.entry.section_start_lines =3D {} - self.entry.sectionlist =3D [] - self.entry.sections =3D {} - - self.entry.anon_struct_union =3D False - - self.entry.leading_space =3D None - - # State flags - self.state =3D self.STATE_NORMAL - self.inline_doc_state =3D self.STATE_INLINE_NA - self.entry.brcount =3D 0 - - self.entry.in_doc_sect =3D False - self.entry.declaration_start_line =3D ln - - def push_parameter(self, ln, decl_type, param, dtype, - org_arg, declaration_name): - if self.entry.anon_struct_union and dtype =3D=3D "" and param =3D= =3D "}": - return # Ignore the ending }; from anonymous struct/union - - self.entry.anon_struct_union =3D False - - param =3D Re(r'[\[\)].*').sub('', param, count=3D1) - - if dtype =3D=3D "" and param.endswith("..."): - if Re(r'\w\.\.\.$').search(param): - # For named variable parameters of the form `x...`, - # remove the dots - param =3D param[:-3] - else: - # Handles unnamed variable parameters - param =3D "..." 
- - if param not in self.entry.parameterdescs or \ - not self.entry.parameterdescs[param]: - - self.entry.parameterdescs[param] =3D "variable arguments" - - elif dtype =3D=3D "" and (not param or param =3D=3D "void"): - param =3D "void" - self.entry.parameterdescs[param] =3D "no arguments" - - elif dtype =3D=3D "" and param in ["struct", "union"]: - # Handle unnamed (anonymous) union or struct - dtype =3D param - param =3D "{unnamed_" + param + "}" - self.entry.parameterdescs[param] =3D "anonymous\n" - self.entry.anon_struct_union =3D True - - # Handle cache group enforcing variables: they do not need - # to be described in header files - elif "__cacheline_group" in param: - # Ignore __cacheline_group_begin and __cacheline_group_end - return - - # Warn if parameter has no description - # (but ignore ones starting with # as these are not parameters - # but inline preprocessor statements) - if param not in self.entry.parameterdescs and not param.startswith= ("#"): - self.entry.parameterdescs[param] =3D self.undescribed - - if self.show_warnings(dtype, declaration_name) and "." not in = param: - if decl_type =3D=3D 'function': - dname =3D f"{decl_type} parameter" - else: - dname =3D f"{decl_type} member" - - self.emit_warning(ln, - f"{dname} '{param}' not described in '{d= eclaration_name}'") - - # Strip spaces from param so that it is one continuous string on - # parameterlist. This fixes a problem where check_sections() - # cannot find a parameter like "addr[6 + 2]" because it actually - # appears as "addr[6", "+", "2]" on the parameter list. - # However, it's better to maintain the param string unchanged for - # output, so just weaken the string compare in check_sections() - # to ignore "[blah" in a parameter string. 
- - self.entry.parameterlist.append(param) - org_arg =3D Re(r'\s\s+').sub(' ', org_arg) - self.entry.parametertypes[param] =3D org_arg - - def save_struct_actual(self, actual): - """ - Strip all spaces from the actual param so that it looks like - one string item. - """ - - actual =3D Re(r'\s*').sub("", actual, count=3D1) - - self.entry.struct_actual +=3D actual + " " - - def create_parameter_list(self, ln, decl_type, args, splitter, declara= tion_name): - - # temporarily replace all commas inside function pointer definition - arg_expr =3D Re(r'(\([^\),]+),') - while arg_expr.search(args): - args =3D arg_expr.sub(r"\1#", args) - - for arg in args.split(splitter): - # Strip comments - arg =3D Re(r'\/\*.*\*\/').sub('', arg) - - # Ignore argument attributes - arg =3D Re(r'\sPOS0?\s').sub(' ', arg) - - # Strip leading/trailing spaces - arg =3D arg.strip() - arg =3D Re(r'\s+').sub(' ', arg, count=3D1) - - if arg.startswith('#'): - # Treat preprocessor directive as a typeless variable just= to fill - # corresponding data structures "correctly". Catch it late= r in - # output_* subs. 
- - # Treat preprocessor directive as a typeless variable - self.push_parameter(ln, decl_type, arg, "", - "", declaration_name) - - elif Re(r'\(.+\)\s*\(').search(arg): - # Pointer-to-function - - arg =3D arg.replace('#', ',') - - r =3D Re(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)') - if r.match(arg): - param =3D r.group(1) - else: - self.emit_warning(ln, f"Invalid param: {arg}") - param =3D arg - - dtype =3D Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r= '\1', arg) - self.save_struct_actual(param) - self.push_parameter(ln, decl_type, param, dtype, - arg, declaration_name) - - elif Re(r'\(.+\)\s*\[').search(arg): - # Array-of-pointers - - arg =3D arg.replace('#', ',') - r =3D Re(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+= \s*\]\s*)*\)') - if r.match(arg): - param =3D r.group(1) - else: - self.emit_warning(ln, f"Invalid param: {arg}") - param =3D arg - - dtype =3D Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r= '\1', arg) - - self.save_struct_actual(param) - self.push_parameter(ln, decl_type, param, dtype, - arg, declaration_name) - - elif arg: - arg =3D Re(r'\s*:\s*').sub(":", arg) - arg =3D Re(r'\s*\[').sub('[', arg) - - args =3D Re(r'\s*,\s*').split(arg) - if args[0] and '*' in args[0]: - args[0] =3D re.sub(r'(\*+)\s*', r' \1', args[0]) - - first_arg =3D [] - r =3D Re(r'^(.*\s+)(.*?\[.*\].*)$') - if args[0] and r.match(args[0]): - args.pop(0) - first_arg.extend(r.group(1)) - first_arg.append(r.group(2)) - else: - first_arg =3D Re(r'\s+').split(args.pop(0)) - - args.insert(0, first_arg.pop()) - dtype =3D ' '.join(first_arg) - - for param in args: - if Re(r'^(\*+)\s*(.*)').match(param): - r =3D Re(r'^(\*+)\s*(.*)') - if not r.match(param): - self.emit_warning(ln, f"Invalid param: {param}= ") - continue - - param =3D r.group(1) - - self.save_struct_actual(r.group(2)) - self.push_parameter(ln, decl_type, r.group(2), - f"{dtype} {r.group(1)}", - arg, declaration_name) - - elif Re(r'(.*?):(\w+)').search(param): - r =3D Re(r'(.*?):(\w+)') - if not r.match(param): - 
self.emit_warning(ln, f"Invalid param: {param}= ") - continue - - if dtype !=3D "": # Skip unnamed bit-fields - self.save_struct_actual(r.group(1)) - self.push_parameter(ln, decl_type, r.group(1), - f"{dtype}:{r.group(2)}", - arg, declaration_name) - else: - self.save_struct_actual(param) - self.push_parameter(ln, decl_type, param, dtype, - arg, declaration_name) - - def check_sections(self, ln, decl_name, decl_type, sectcheck, prmschec= k): - sects =3D sectcheck.split() - prms =3D prmscheck.split() - err =3D False - - for sx in range(len(sects)): # pylint: disable=3D= C0200 - err =3D True - for px in range(len(prms)): # pylint: disable=3D= C0200 - prm_clean =3D prms[px] - prm_clean =3D Re(r'\[.*\]').sub('', prm_clean) - prm_clean =3D attribute.sub('', prm_clean) - - # ignore array size in a parameter string; - # however, the original param string may contain - # spaces, e.g.: addr[6 + 2] - # and this appears in @prms as "addr[6" since the - # parameter list is split at spaces; - # hence just ignore "[..." 
for the sections check; - prm_clean =3D Re(r'\[.*').sub('', prm_clean) - - if prm_clean =3D=3D sects[sx]: - err =3D False - break - - if err: - if decl_type =3D=3D 'function': - dname =3D f"{decl_type} parameter" - else: - dname =3D f"{decl_type} member" - - self.emit_warning(ln, - f"Excess {dname} '{sects[sx]}' descripti= on in '{decl_name}'") - - def check_return_section(self, ln, declaration_name, return_type): - - if not self.config.wreturn: - return - - # Ignore an empty return type (It's a macro) - # Ignore functions with a "void" return type (but not "void *") - if not return_type or Re(r'void\s*\w*\s*$').search(return_type): - return - - if not self.entry.sections.get("Return", None): - self.emit_warning(ln, - f"No description found for return value of '= {declaration_name}'") - - def dump_struct(self, ln, proto): - """ - Store an entry for an struct or union - """ - - type_pattern =3D r'(struct|union)' - - qualifiers =3D [ - "__attribute__", - "__packed", - "__aligned", - "____cacheline_aligned_in_smp", - "____cacheline_aligned", - ] - - definition_body =3D r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) = + ")?" - struct_members =3D Re(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\= })([^\{\}\;]*)(\;)') - - # Extract struct/union definition - members =3D None - declaration_name =3D None - decl_type =3D None - - r =3D Re(type_pattern + r'\s+(\w+)\s*' + definition_body) - if r.search(proto): - decl_type =3D r.group(1) - declaration_name =3D r.group(2) - members =3D r.group(3) - else: - r =3D Re(r'typedef\s+' + type_pattern + r'\s*' + definition_bo= dy + r'\s*(\w+)\s*;') - - if r.search(proto): - decl_type =3D r.group(1) - declaration_name =3D r.group(3) - members =3D r.group(2) - - if not members: - self.emit_warning(ln, f"{proto} error: Cannot parse struct or = union!") - self.config.errors +=3D 1 - return - - if self.entry.identifier !=3D declaration_name: - self.emit_warning(ln, - f"expecting prototype for {decl_type} {self.= entry.identifier}. 
Prototype was for {decl_type} {declaration_name} instead= \n") - return - - args_pattern =3Dr'([^,)]+)' - - sub_prefixes =3D [ - (Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I), = ''), - (Re(r'\/\*\s*private:.*', re.S| re.I), ''), - - # Strip comments - (Re(r'\/\*.*?\*\/', re.S), ''), - - # Strip attributes - (attribute, ' '), - (Re(r'\s*__aligned\s*\([^;]*\)', re.S), ' '), - (Re(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '), - (Re(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '), - (Re(r'\s*__packed\s*', re.S), ' '), - (Re(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '), - (Re(r'\s*____cacheline_aligned_in_smp', re.S), ' '), - (Re(r'\s*____cacheline_aligned', re.S), ' '), - - # Unwrap struct_group macros based on this definition: - # __struct_group(TAG, NAME, ATTRS, MEMBERS...) - # which has variants like: struct_group(NAME, MEMBERS...) - # Only MEMBERS arguments require documentation. - # - # Parsing them happens on two steps: - # - # 1. drop struct group arguments that aren't at MEMBERS, - # storing them as STRUCT_GROUP(MEMBERS) - # - # 2. remove STRUCT_GROUP() ancillary macro. - # - # The original logic used to remove STRUCT_GROUP() using an - # advanced regex: - # - # \bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*; - # - # with two patterns that are incompatible with - # Python re module, as it has: - # - # - a recursive pattern: (?1) - # - an atomic grouping: (?>...) - # - # I tried a simpler version: but it didn't work either: - # \bSTRUCT_GROUP\(([^\)]+)\)[^;]*; - # - # As it doesn't properly match the end parenthesis on some cas= es. - # - # So, a better solution was crafted: there's now a NestedMatch - # class that ensures that delimiters after a search are proper= ly - # matched. So, the implementation to drop STRUCT_GROUP() will = be - # handled in separate. 
- - (Re(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('), - (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_G= ROUP('), - (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r's= truct \1 \2; STRUCT_GROUP('), - (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROU= P('), - - # Replace macros - # - # TODO: it is better to also move those to the NestedMatch log= ic, - # to ensure that parenthesis will be properly matched. - - (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S),= r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'), - (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DE= CLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'), - (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pat= tern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'), - (Re(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_= pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'), - (Re(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_patt= ern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'), - (Re(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_= pattern + r'\)', re.S), r'\2 *\1'), - (Re(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*'= + args_pattern + r'\)', re.S), r'\1 \2[]'), - (Re(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S= ), r'dma_addr_t \1'), - (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S)= , r'__u32 \1'), - ] - - # Regexes here are guaranteed to have the end limiter matching - # the start delimiter. Yet, right now, only one replace group - # is allowed. 
- - sub_nested_prefixes =3D [ - (re.compile(r'\bSTRUCT_GROUP\('), r'\1'), - ] - - for search, sub in sub_prefixes: - members =3D search.sub(sub, members) - - nested =3D NestedMatch() - - for search, sub in sub_nested_prefixes: - members =3D nested.sub(search, sub, members) - - # Keeps the original declaration as-is - declaration =3D members - - # Split nested struct/union elements - # - # This loop was simpler at the original kernel-doc perl version, as - # while ($members =3D~ m/$struct_members/) { ... } - # reads 'members' string on each interaction. - # - # Python behavior is different: it parses 'members' only once, - # creating a list of tuples from the first interaction. - # - # On other words, this won't get nested structs. - # - # So, we need to have an extra loop on Python to override such - # re limitation. - - while True: - tuples =3D struct_members.findall(members) - if not tuples: - break - - for t in tuples: - newmember =3D "" - maintype =3D t[0] - s_ids =3D t[5] - content =3D t[3] - - oldmember =3D "".join(t) - - for s_id in s_ids.split(','): - s_id =3D s_id.strip() - - newmember +=3D f"{maintype} {s_id}; " - s_id =3D Re(r'[:\[].*').sub('', s_id) - s_id =3D Re(r'^\s*\**(\S+)\s*').sub(r'\1', s_id) - - for arg in content.split(';'): - arg =3D arg.strip() - - if not arg: - continue - - r =3D Re(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)') - if r.match(arg): - # Pointer-to-function - dtype =3D r.group(1) - name =3D r.group(2) - extra =3D r.group(3) - - if not name: - continue - - if not s_id: - # Anonymous struct/union - newmember +=3D f"{dtype}{name}{extra}; " - else: - newmember +=3D f"{dtype}{s_id}.{name}{extr= a}; " - - else: - arg =3D arg.strip() - # Handle bitmaps - arg =3D Re(r':\s*\d+\s*').sub('', arg) - - # Handle arrays - arg =3D Re(r'\[.*\]').sub('', arg) - - # Handle multiple IDs - arg =3D Re(r'\s*,\s*').sub(',', arg) - - - r =3D Re(r'(.*)\s+([\S+,]+)') - - if r.search(arg): - dtype =3D r.group(1) - names =3D r.group(2) - else: - newmember +=3D 
f"{arg}; " - continue - - for name in names.split(','): - name =3D Re(r'^\s*\**(\S+)\s*').sub(r'\1',= name).strip() - - if not name: - continue - - if not s_id: - # Anonymous struct/union - newmember +=3D f"{dtype} {name}; " - else: - newmember +=3D f"{dtype} {s_id}.{name}= ; " - - members =3D members.replace(oldmember, newmember) - - # Ignore other nested elements, like enums - members =3D re.sub(r'(\{[^\{\}]*\})', '', members) - - self.create_parameter_list(ln, decl_type, members, ';', - declaration_name) - self.check_sections(ln, declaration_name, decl_type, - self.entry.sectcheck, self.entry.struct_actual) - - # Adjust declaration for better display - declaration =3D Re(r'([\{;])').sub(r'\1\n', declaration) - declaration =3D Re(r'\}\s+;').sub('};', declaration) - - # Better handle inlined enums - while True: - r =3D Re(r'(enum\s+\{[^\}]+),([^\n])') - if not r.search(declaration): - break - - declaration =3D r.sub(r'\1,\n\2', declaration) - - def_args =3D declaration.split('\n') - level =3D 1 - declaration =3D "" - for clause in def_args: - - clause =3D clause.strip() - clause =3D Re(r'\s+').sub(' ', clause, count=3D1) - - if not clause: - continue - - if '}' in clause and level > 1: - level -=3D 1 - - if not Re(r'^\s*#').match(clause): - declaration +=3D "\t" * level - - declaration +=3D "\t" + clause + "\n" - if "{" in clause and "}" not in clause: - level +=3D 1 - - self.output_declaration(decl_type, declaration_name, - struct=3Ddeclaration_name, - module=3Dself.entry.modulename, - definition=3Ddeclaration, - parameterlist=3Dself.entry.parameterlist, - parameterdescs=3Dself.entry.parameterdescs, - parametertypes=3Dself.entry.parametertypes, - sectionlist=3Dself.entry.sectionlist, - sections=3Dself.entry.sections, - purpose=3Dself.entry.declaration_purpose) - - def dump_enum(self, ln, proto): - - # Ignore members marked private - proto =3D Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=3Dr= e.S).sub('', proto) - proto =3D Re(r'\/\*\s*private:.*}', 
flags=3Dre.S).sub('}', proto) - - # Strip comments - proto =3D Re(r'\/\*.*?\*\/', flags=3Dre.S).sub('', proto) - - # Strip #define macros inside enums - proto =3D Re(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=3Dre= .S).sub('', proto) - - members =3D None - declaration_name =3D None - - r =3D Re(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;') - if r.search(proto): - declaration_name =3D r.group(2) - members =3D r.group(1).rstrip() - else: - r =3D Re(r'enum\s+(\w*)\s*\{(.*)\}') - if r.match(proto): - declaration_name =3D r.group(1) - members =3D r.group(2).rstrip() - - if not members: - self.emit_warning(ln, f"{proto}: error: Cannot parse enum!") - self.config.errors +=3D 1 - return - - if self.entry.identifier !=3D declaration_name: - if self.entry.identifier =3D=3D "": - self.emit_warning(ln, - f"{proto}: wrong kernel-doc identifier o= n prototype") - else: - self.emit_warning(ln, - f"expecting prototype for enum {self.ent= ry.identifier}. Prototype was for enum {declaration_name} instead") - return - - if not declaration_name: - declaration_name =3D "(anonymous)" - - member_set =3D set() - - members =3D Re(r'\([^;]*?[\)]').sub('', members) - - for arg in members.split(','): - if not arg: - continue - arg =3D Re(r'^\s*(\w+).*').sub(r'\1', arg) - self.entry.parameterlist.append(arg) - if arg not in self.entry.parameterdescs: - self.entry.parameterdescs[arg] =3D self.undescribed - if self.show_warnings("enum", declaration_name): - self.emit_warning(ln, - f"Enum value '{arg}' not described i= n enum '{declaration_name}'") - member_set.add(arg) - - for k in self.entry.parameterdescs: - if k not in member_set: - if self.show_warnings("enum", declaration_name): - self.emit_warning(ln, - f"Excess enum value '%{k}' descripti= on in '{declaration_name}'") - - self.output_declaration('enum', declaration_name, - enum=3Ddeclaration_name, - module=3Dself.config.modulename, - parameterlist=3Dself.entry.parameterlist, - parameterdescs=3Dself.entry.parameterdescs, - 
sectionlist=3Dself.entry.sectionlist, - sections=3Dself.entry.sections, - purpose=3Dself.entry.declaration_purpose) - - def dump_declaration(self, ln, prototype): - if self.entry.decl_type =3D=3D "enum": - self.dump_enum(ln, prototype) - return - - if self.entry.decl_type =3D=3D "typedef": - self.dump_typedef(ln, prototype) - return - - if self.entry.decl_type in ["union", "struct"]: - self.dump_struct(ln, prototype) - return - - # TODO: handle other types - self.output_declaration(self.entry.decl_type, prototype, - entry=3Dself.entry) - - def dump_function(self, ln, prototype): - - func_macro =3D False - return_type =3D '' - decl_type =3D 'function' - - # Prefixes that would be removed - sub_prefixes =3D [ - (r"^static +", "", 0), - (r"^extern +", "", 0), - (r"^asmlinkage +", "", 0), - (r"^inline +", "", 0), - (r"^__inline__ +", "", 0), - (r"^__inline +", "", 0), - (r"^__always_inline +", "", 0), - (r"^noinline +", "", 0), - (r"^__FORTIFY_INLINE +", "", 0), - (r"__init +", "", 0), - (r"__init_or_module +", "", 0), - (r"__deprecated +", "", 0), - (r"__flatten +", "", 0), - (r"__meminit +", "", 0), - (r"__must_check +", "", 0), - (r"__weak +", "", 0), - (r"__sched +", "", 0), - (r"_noprof", "", 0), - (r"__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +", "", 0), - (r"__(?:re)?alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\) +", "", = 0), - (r"__diagnose_as\s*\(\s*\S+\s*(?:,\s*\d+\s*)*\) +", "", 0), - (r"DECL_BUCKET_PARAMS\s*\(\s*(\S+)\s*,\s*(\S+)\s*\)", r"\1, \2= ", 0), - (r"__attribute_const__ +", "", 0), - - # It seems that Python support for re.X is broken: - # At least for me (Python 3.13), this didn't work -# (r""" -# __attribute__\s*\(\( -# (?: -# [\w\s]+ # attribute name -# (?:\([^)]*\))? # attribute arguments -# \s*,? 
# optional comma at the end -# )+ -# \)\)\s+ -# """, "", re.X), - - # So, remove whitespaces and comments from it - (r"__attribute__\s*\(\((?:[\w\s]+(?:\([^)]*\))?\s*,?)+\)\)\s+"= , "", 0), - ] - - for search, sub, flags in sub_prefixes: - prototype =3D Re(search, flags).sub(sub, prototype) - - # Macros are a special case, as they change the prototype format - new_proto =3D Re(r"^#\s*define\s+").sub("", prototype) - if new_proto !=3D prototype: - is_define_proto =3D True - prototype =3D new_proto - else: - is_define_proto =3D False - - # Yes, this truly is vile. We are looking for: - # 1. Return type (may be nothing if we're looking at a macro) - # 2. Function name - # 3. Function parameters. - # - # All the while we have to watch out for function pointer paramete= rs - # (which IIRC is what the two sections are for), C types (these - # regexps don't even start to express all the possibilities), and - # so on. - # - # If you mess with these regexps, it's a good idea to check that - # the following functions' documentation still comes out right: - # - parport_register_device (function pointer parameters) - # - atomic_set (macro) - # - pci_match_device, __copy_to_user (long return type) - - name =3D r'[a-zA-Z0-9_~:]+' - prototype_end1 =3D r'[^\(]*' - prototype_end2 =3D r'[^\{]*' - prototype_end =3D fr'\(({prototype_end1}|{prototype_end2})\)' - - # Besides compiling, Perl qr{[\w\s]+} works as a non-capturing gro= up. - # So, this needs to be mapped in Python with (?:...)? or (?:...)+ - - type1 =3D r'(?:[\w\s]+)?' 
- type2 =3D r'(?:[\w\s]+\*+)+' - - found =3D False - - if is_define_proto: - r =3D Re(r'^()(' + name + r')\s+') - - if r.search(prototype): - return_type =3D '' - declaration_name =3D r.group(2) - func_macro =3D True - - found =3D True - - if not found: - patterns =3D [ - rf'^()({name})\s*{prototype_end}', - rf'^({type1})\s+({name})\s*{prototype_end}', - rf'^({type2})\s*({name})\s*{prototype_end}', - ] - - for p in patterns: - r =3D Re(p) - - if r.match(prototype): - - return_type =3D r.group(1) - declaration_name =3D r.group(2) - args =3D r.group(3) - - self.create_parameter_list(ln, decl_type, args, ',', - declaration_name) - - found =3D True - break - if not found: - self.emit_warning(ln, - f"cannot understand function prototype: '{pr= ototype}'") - return - - if self.entry.identifier !=3D declaration_name: - self.emit_warning(ln, - f"expecting prototype for {self.entry.identi= fier}(). Prototype was for {declaration_name}() instead") - return - - prms =3D " ".join(self.entry.parameterlist) - self.check_sections(ln, declaration_name, "function", - self.entry.sectcheck, prms) - - self.check_return_section(ln, declaration_name, return_type) - - if 'typedef' in return_type: - self.output_declaration(decl_type, declaration_name, - function=3Ddeclaration_name, - typedef=3DTrue, - module=3Dself.config.modulename, - functiontype=3Dreturn_type, - parameterlist=3Dself.entry.parameterlist, - parameterdescs=3Dself.entry.parameterdescs, - parametertypes=3Dself.entry.parametertypes, - sectionlist=3Dself.entry.sectionlist, - sections=3Dself.entry.sections, - purpose=3Dself.entry.declaration_purpose, - func_macro=3Dfunc_macro) - else: - self.output_declaration(decl_type, declaration_name, - function=3Ddeclaration_name, - typedef=3DFalse, - module=3Dself.config.modulename, - functiontype=3Dreturn_type, - parameterlist=3Dself.entry.parameterlist, - parameterdescs=3Dself.entry.parameterdescs, - parametertypes=3Dself.entry.parametertypes, - sectionlist=3Dself.entry.sectionlist, - 
sections=3Dself.entry.sections, - purpose=3Dself.entry.declaration_purpose, - func_macro=3Dfunc_macro) - - def dump_typedef(self, ln, proto): - typedef_type =3D r'((?:\s+[\w\*]+\b){1,8})\s*' - typedef_ident =3D r'\*?\s*(\w\S+)\s*' - typedef_args =3D r'\s*\((.*)\);' - - typedef1 =3D Re(r'typedef' + typedef_type + r'\(' + typedef_ident = + r'\)' + typedef_args) - typedef2 =3D Re(r'typedef' + typedef_type + typedef_ident + typede= f_args) - - # Strip comments - proto =3D Re(r'/\*.*?\*/', flags=3Dre.S).sub('', proto) - - # Parse function typedef prototypes - for r in [typedef1, typedef2]: - if not r.match(proto): - continue - - return_type =3D r.group(1).strip() - declaration_name =3D r.group(2) - args =3D r.group(3) - - if self.entry.identifier !=3D declaration_name: - self.emit_warning(ln, - f"expecting prototype for typedef {self.= entry.identifier}. Prototype was for typedef {declaration_name} instead\n") - return - - decl_type =3D 'function' - self.create_parameter_list(ln, decl_type, args, ',', declarati= on_name) - - self.output_declaration(decl_type, declaration_name, - function=3Ddeclaration_name, - typedef=3DTrue, - module=3Dself.entry.modulename, - functiontype=3Dreturn_type, - parameterlist=3Dself.entry.parameterlist, - parameterdescs=3Dself.entry.parameterdescs, - parametertypes=3Dself.entry.parametertypes, - sectionlist=3Dself.entry.sectionlist, - sections=3Dself.entry.sections, - purpose=3Dself.entry.declaration_purpose) - return - - # Handle nested parentheses or brackets - r =3D Re(r'(\(*.\)\s*|\[*.\]\s*);$') - while r.search(proto): - proto =3D r.sub('', proto) - - # Parse simple typedefs - r =3D Re(r'typedef.*\s+(\w+)\s*;') - if r.match(proto): - declaration_name =3D r.group(1) - - if self.entry.identifier !=3D declaration_name: - self.emit_warning(ln, f"expecting prototype for typedef {s= elf.entry.identifier}. 
Prototype was for typedef {declaration_name} instead= \n") - return - - self.output_declaration('typedef', declaration_name, - typedef=3Ddeclaration_name, - module=3Dself.entry.modulename, - sectionlist=3Dself.entry.sectionlist, - sections=3Dself.entry.sections, - purpose=3Dself.entry.declaration_purpose) - return - - self.emit_warning(ln, "error: Cannot parse typedef!") - self.config.errors +=3D 1 - - @staticmethod - def process_export(function_table, line): - """ - process EXPORT_SYMBOL* tags - - This method is called both internally and externally, so, it - doesn't use self. - """ - - if export_symbol.search(line): - symbol =3D export_symbol.group(2) - function_table.add(symbol) - - if export_symbol_ns.search(line): - symbol =3D export_symbol_ns.group(2) - function_table.add(symbol) - - def process_normal(self, ln, line): - """ - STATE_NORMAL: looking for the /** to begin everything. - """ - - if not doc_start.match(line): - return - - # start a new entry - self.reset_state(ln + 1) - self.entry.in_doc_sect =3D False - - # next line is always the function name - self.state =3D self.STATE_NAME - - def process_name(self, ln, line): - """ - STATE_NAME: Looking for the "name - description" line - """ - - if doc_block.search(line): - self.entry.new_start_line =3D ln - - if not doc_block.group(1): - self.entry.section =3D self.section_intro - else: - self.entry.section =3D doc_block.group(1) - - self.state =3D self.STATE_DOCBLOCK - return - - if doc_decl.search(line): - self.entry.identifier =3D doc_decl.group(1) - self.entry.is_kernel_comment =3D False - - decl_start =3D str(doc_com) # comment block asterisk - fn_type =3D r"(?:\w+\s*\*\s*)?" # type (for non-functions) - parenthesis =3D r"(?:\(\w*\))?" 
# optional parenthesis on fu= nction - decl_end =3D r"(?:[-:].*)" # end of the name part - - # test for pointer declaration type, foo * bar() - desc - r =3D Re(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_end}= ?$") - if r.search(line): - self.entry.identifier =3D r.group(1) - - # Test for data declaration - r =3D Re(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)") - if r.search(line): - self.entry.decl_type =3D r.group(1) - self.entry.identifier =3D r.group(2) - self.entry.is_kernel_comment =3D True - else: - # Look for foo() or static void foo() - description; - # or misspelt identifier - - r1 =3D Re(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesis}\s= *{decl_end}?$") - r2 =3D Re(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenthesis= }\s*{decl_end}$") - - for r in [r1, r2]: - if r.search(line): - self.entry.identifier =3D r.group(1) - self.entry.decl_type =3D "function" - - r =3D Re(r"define\s+") - self.entry.identifier =3D r.sub("", self.entry.ide= ntifier) - self.entry.is_kernel_comment =3D True - break - - self.entry.identifier =3D self.entry.identifier.strip(" ") - - self.state =3D self.STATE_BODY - - # if there's no @param blocks need to set up default section h= ere - self.entry.section =3D self.section_default - self.entry.new_start_line =3D ln + 1 - - r =3D Re("[-:](.*)") - if r.search(line): - # strip leading/trailing/multiple spaces - self.entry.descr =3D r.group(1).strip(" ") - - r =3D Re(r"\s+") - self.entry.descr =3D r.sub(" ", self.entry.descr) - self.entry.declaration_purpose =3D self.entry.descr - self.state =3D self.STATE_BODY_MAYBE - else: - self.entry.declaration_purpose =3D "" - - if not self.entry.is_kernel_comment: - self.emit_warning(ln, - f"This comment starts with '/**', but is= n't a kernel-doc comment. 
Refer Documentation/doc-guide/kernel-doc.rst\n{li= ne}") - self.state =3D self.STATE_NORMAL - - if not self.entry.declaration_purpose and self.config.wshort_d= esc: - self.emit_warning(ln, - f"missing initial short description on l= ine:\n{line}") - - if not self.entry.identifier and self.entry.decl_type !=3D "en= um": - self.emit_warning(ln, - f"wrong kernel-doc identifier on line:\n= {line}") - self.state =3D self.STATE_NORMAL - - if self.config.verbose: - self.emit_warning(ln, - f"Scanning doc for {self.entry.decl_type= } {self.entry.identifier}", - warning=3DFalse) - - return - - # Failed to find an identifier. Emit a warning - self.emit_warning(ln, f"Cannot find identifier on line:\n{line}") - - def process_body(self, ln, line): - """ - STATE_BODY and STATE_BODY_MAYBE: the bulk of a kerneldoc comment. - """ - - if self.state =3D=3D self.STATE_BODY_WITH_BLANK_LINE: - r =3D Re(r"\s*\*\s?\S") - if r.match(line): - self.dump_section() - self.entry.section =3D self.section_default - self.entry.new_start_line =3D line - self.entry.contents =3D "" - - if doc_sect.search(line): - self.entry.in_doc_sect =3D True - newsection =3D doc_sect.group(1) - - if newsection.lower() in ["description", "context"]: - newsection =3D newsection.title() - - # Special case: @return is a section, not a param description - if newsection.lower() in ["@return", "@returns", - "return", "returns"]: - newsection =3D "Return" - - # Perl kernel-doc has a check here for contents before section= s. - # the logic there is always false, as in_doc_sect variable is - # always true. 
So, just don't implement Wcontents_before_secti= ons - - # .title() - newcontents =3D doc_sect.group(2) - if not newcontents: - newcontents =3D "" - - if self.entry.contents.strip("\n"): - self.dump_section() - - self.entry.new_start_line =3D ln - self.entry.section =3D newsection - self.entry.leading_space =3D None - - self.entry.contents =3D newcontents.lstrip() - if self.entry.contents: - self.entry.contents +=3D "\n" - - self.state =3D self.STATE_BODY - return - - if doc_end.search(line): - self.dump_section() - - # Look for doc_com + + doc_end: - r =3D Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/') - if r.match(line): - self.emit_warning(ln, f"suspicious ending line: {line}") - - self.entry.prototype =3D "" - self.entry.new_start_line =3D ln + 1 - - self.state =3D self.STATE_PROTO - return - - if doc_content.search(line): - cont =3D doc_content.group(1) - - if cont =3D=3D "": - if self.entry.section =3D=3D self.section_context: - self.dump_section() - - self.entry.new_start_line =3D ln - self.state =3D self.STATE_BODY - else: - if self.entry.section !=3D self.section_default: - self.state =3D self.STATE_BODY_WITH_BLANK_LINE - else: - self.state =3D self.STATE_BODY - - self.entry.contents +=3D "\n" - - elif self.state =3D=3D self.STATE_BODY_MAYBE: - - # Continued declaration purpose - self.entry.declaration_purpose =3D self.entry.declaration_= purpose.rstrip() - self.entry.declaration_purpose +=3D " " + cont - - r =3D Re(r"\s+") - self.entry.declaration_purpose =3D r.sub(' ', - self.entry.declarat= ion_purpose) - - else: - if self.entry.section.startswith('@') or \ - self.entry.section =3D=3D self.section_context: - if self.entry.leading_space is None: - r =3D Re(r'^(\s+)') - if r.match(cont): - self.entry.leading_space =3D len(r.group(1)) - else: - self.entry.leading_space =3D 0 - - # Double-check if leading space are realy spaces - pos =3D 0 - for i in range(0, self.entry.leading_space): - if cont[i] !=3D " ": - break - pos +=3D 1 - - cont =3D cont[pos:] - - # NEW LOGIC: 
- # In case it is different, update it - if self.entry.leading_space !=3D pos: - self.entry.leading_space =3D pos - - self.entry.contents +=3D cont + "\n" - return - - # Unknown line, ignore - self.emit_warning(ln, f"bad line: {line}") - - def process_inline(self, ln, line): - """STATE_INLINE: docbook comments within a prototype.""" - - if self.inline_doc_state =3D=3D self.STATE_INLINE_NAME and \ - doc_inline_sect.search(line): - self.entry.section =3D doc_inline_sect.group(1) - self.entry.new_start_line =3D ln - - self.entry.contents =3D doc_inline_sect.group(2).lstrip() - if self.entry.contents !=3D "": - self.entry.contents +=3D "\n" - - self.inline_doc_state =3D self.STATE_INLINE_TEXT - # Documentation block end */ - return - - if doc_inline_end.search(line): - if self.entry.contents not in ["", "\n"]: - self.dump_section() - - self.state =3D self.STATE_PROTO - self.inline_doc_state =3D self.STATE_INLINE_NA - return - - if doc_content.search(line): - if self.inline_doc_state =3D=3D self.STATE_INLINE_TEXT: - self.entry.contents +=3D doc_content.group(1) + "\n" - if not self.entry.contents.strip(" ").rstrip("\n"): - self.entry.contents =3D "" - - elif self.inline_doc_state =3D=3D self.STATE_INLINE_NAME: - self.emit_warning(ln, - f"Incorrect use of kernel-doc format: {l= ine}") - - self.inline_doc_state =3D self.STATE_INLINE_ERROR - - def syscall_munge(self, ln, proto): - """ - Handle syscall definitions - """ - - is_void =3D False - - # Strip newlines/CR's - proto =3D re.sub(r'[\r\n]+', ' ', proto) - - # Check if it's a SYSCALL_DEFINE0 - if 'SYSCALL_DEFINE0' in proto: - is_void =3D True - - # Replace SYSCALL_DEFINE with correct return type & function name - proto =3D Re(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto) - - r =3D Re(r'long\s+(sys_.*?),') - if r.search(proto): - proto =3D proto.replace(',', '(', count=3D1) - elif is_void: - proto =3D proto.replace(')', '(void)', count=3D1) - - # Now delete all of the odd-numbered commas in the proto - # so that 
argument types & names don't have a comma between them - count =3D 0 - length =3D len(proto) - - if is_void: - length =3D 0 # skip the loop if is_void - - for ix in range(length): - if proto[ix] =3D=3D ',': - count +=3D 1 - if count % 2 =3D=3D 1: - proto =3D proto[:ix] + ' ' + proto[ix+1:] - - return proto - - def tracepoint_munge(self, ln, proto): - """ - Handle tracepoint definitions - """ - - tracepointname =3D None - tracepointargs =3D None - - # Match tracepoint name based on different patterns - r =3D Re(r'TRACE_EVENT\((.*?),') - if r.search(proto): - tracepointname =3D r.group(1) - - r =3D Re(r'DEFINE_SINGLE_EVENT\((.*?),') - if r.search(proto): - tracepointname =3D r.group(1) - - r =3D Re(r'DEFINE_EVENT\((.*?),(.*?),') - if r.search(proto): - tracepointname =3D r.group(2) - - if tracepointname: - tracepointname =3D tracepointname.lstrip() - - r =3D Re(r'TP_PROTO\((.*?)\)') - if r.search(proto): - tracepointargs =3D r.group(1) - - if not tracepointname or not tracepointargs: - self.emit_warning(ln, - f"Unrecognized tracepoint format:\n{proto}\n= ") - else: - proto =3D f"static inline void trace_{tracepointname}({tracepo= intargs})" - self.entry.identifier =3D f"trace_{self.entry.identifier}" - - return proto - - def process_proto_function(self, ln, line): - """Ancillary routine to process a function prototype""" - - # strip C99-style comments to end of line - r =3D Re(r"\/\/.*$", re.S) - line =3D r.sub('', line) - - if Re(r'\s*#\s*define').match(line): - self.entry.prototype =3D line - elif line.startswith('#'): - # Strip other macros like #ifdef/#ifndef/#endif/... 
- pass - else: - r =3D Re(r'([^\{]*)') - if r.match(line): - self.entry.prototype +=3D r.group(1) + " " - - if '{' in line or ';' in line or Re(r'\s*#\s*define').match(line): - # strip comments - r =3D Re(r'/\*.*?\*/') - self.entry.prototype =3D r.sub('', self.entry.prototype) - - # strip newlines/cr's - r =3D Re(r'[\r\n]+') - self.entry.prototype =3D r.sub(' ', self.entry.prototype) - - # strip leading spaces - r =3D Re(r'^\s+') - self.entry.prototype =3D r.sub('', self.entry.prototype) - - # Handle self.entry.prototypes for function pointers like: - # int (*pcs_config)(struct foo) - - r =3D Re(r'^(\S+\s+)\(\s*\*(\S+)\)') - self.entry.prototype =3D r.sub(r'\1\2', self.entry.prototype) - - if 'SYSCALL_DEFINE' in self.entry.prototype: - self.entry.prototype =3D self.syscall_munge(ln, - self.entry.proto= type) - - r =3D Re(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT') - if r.search(self.entry.prototype): - self.entry.prototype =3D self.tracepoint_munge(ln, - self.entry.pr= ototype) - - self.dump_function(ln, self.entry.prototype) - self.reset_state(ln) - - def process_proto_type(self, ln, line): - """Ancillary routine to process a type""" - - # Strip newlines/cr's. - line =3D Re(r'[\r\n]+', re.S).sub(' ', line) - - # Strip leading spaces - line =3D Re(r'^\s+', re.S).sub('', line) - - # Strip trailing spaces - line =3D Re(r'\s+$', re.S).sub('', line) - - # Strip C99-style comments to the end of the line - line =3D Re(r"\/\/.*$", re.S).sub('', line) - - # To distinguish preprocessor directive from regular declaration l= ater. 
- if line.startswith('#'): - line +=3D ";" - - r =3D Re(r'([^\{\};]*)([\{\};])(.*)') - while True: - if r.search(line): - if self.entry.prototype: - self.entry.prototype +=3D " " - self.entry.prototype +=3D r.group(1) + r.group(2) - - self.entry.brcount +=3D r.group(2).count('{') - self.entry.brcount -=3D r.group(2).count('}') - - self.entry.brcount =3D max(self.entry.brcount, 0) - - if r.group(2) =3D=3D ';' and self.entry.brcount =3D=3D 0: - self.dump_declaration(ln, self.entry.prototype) - self.reset_state(ln) - break - - line =3D r.group(3) - else: - self.entry.prototype +=3D line - break - - def process_proto(self, ln, line): - """STATE_PROTO: reading a function/whatever prototype.""" - - if doc_inline_oneline.search(line): - self.entry.section =3D doc_inline_oneline.group(1) - self.entry.contents =3D doc_inline_oneline.group(2) - - if self.entry.contents !=3D "": - self.entry.contents +=3D "\n" - self.dump_section(start_new=3DFalse) - - elif doc_inline_start.search(line): - self.state =3D self.STATE_INLINE - self.inline_doc_state =3D self.STATE_INLINE_NAME - - elif self.entry.decl_type =3D=3D 'function': - self.process_proto_function(ln, line) - - else: - self.process_proto_type(ln, line) - - def process_docblock(self, ln, line): - """STATE_DOCBLOCK: within a DOC: block.""" - - if doc_end.search(line): - self.dump_section() - self.output_declaration("doc", None, - sectionlist=3Dself.entry.sectionlist, - sections=3Dself.entry.sections, = module=3Dself.config.modulename) - self.reset_state(ln) - - elif doc_content.search(line): - self.entry.contents +=3D doc_content.group(1) + "\n" - - def run(self): - """ - Open and process each line of a C source file. - he parsing is controlled via a state machine, and the line is pass= ed - to a different process function depending on the state. The process - function may update the state as needed. 
- """ - - cont =3D False - prev =3D "" - prev_ln =3D None - - try: - with open(self.fname, "r", encoding=3D"utf8", - errors=3D"backslashreplace") as fp: - for ln, line in enumerate(fp): - - line =3D line.expandtabs().strip("\n") - - # Group continuation lines on prototypes - if self.state =3D=3D self.STATE_PROTO: - if line.endswith("\\"): - prev +=3D line.removesuffix("\\") - cont =3D True - - if not prev_ln: - prev_ln =3D ln - - continue - - if cont: - ln =3D prev_ln - line =3D prev + line - prev =3D "" - cont =3D False - prev_ln =3D None - - self.config.log.debug("%d %s%s: %s", - ln, self.st_name[self.state], - self.st_inline_name[self.inline_= doc_state], - line) - - # TODO: not all states allow EXPORT_SYMBOL*, so this - # can be optimized later on to speedup parsing - self.process_export(self.config.function_table, line) - - # Hand this line to the appropriate state handler - if self.state =3D=3D self.STATE_NORMAL: - self.process_normal(ln, line) - elif self.state =3D=3D self.STATE_NAME: - self.process_name(ln, line) - elif self.state in [self.STATE_BODY, self.STATE_BODY_M= AYBE, - self.STATE_BODY_WITH_BLANK_LINE]: - self.process_body(ln, line) - elif self.state =3D=3D self.STATE_INLINE: # scanning = for inline parameters - self.process_inline(ln, line) - elif self.state =3D=3D self.STATE_PROTO: - self.process_proto(ln, line) - elif self.state =3D=3D self.STATE_DOCBLOCK: - self.process_docblock(ln, line) - except OSError: - self.config.log.error(f"Error: Cannot open file {self.fname}") - self.config.errors +=3D 1 - - class GlobSourceFiles: """ Parse C source code file names and directories via an Interactor. diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py new file mode 100755 index 000000000000..3ce116595546 --- /dev/null +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -0,0 +1,1690 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# Copyright(c) 2025: Mauro Carvalho Chehab . 
+# +# pylint: disable=3DC0301,C0302,R0904,R0912,R0913,R0914,R0915,R0917,R1702 + +""" +kdoc_parser +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Read a C language source or header FILE and extract embedded +documentation comments +""" + +import argparse +import re +from pprint import pformat + +from kdoc_re import NestedMatch, Re + + +# +# Regular expressions used to parse kernel-doc markups at KernelDoc class. +# +# Let's declare them in lowercase outside any class to make easier to +# convert from the python script. +# +# As those are evaluated at the beginning, no need to cache them +# + +# Allow whitespace at end of comment start. +doc_start =3D Re(r'^/\*\*\s*$', cache=3DFalse) + +doc_end =3D Re(r'\*/', cache=3DFalse) +doc_com =3D Re(r'\s*\*\s*', cache=3DFalse) +doc_com_body =3D Re(r'\s*\* ?', cache=3DFalse) +doc_decl =3D doc_com + Re(r'(\w+)', cache=3DFalse) + +# @params and a strictly limited set of supported section names +# Specifically: +# Match @word: +# @...: +# @{section-name}: +# while trying to not match literal block starts like "example::" +# +doc_sect =3D doc_com + \ + Re(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?= |examples?)\s*:([^:].*)?$', + flags=3Dre.I, cache=3DFalse) + +doc_content =3D doc_com_body + Re(r'(.*)', cache=3DFalse) +doc_block =3D doc_com + Re(r'DOC:\s*(.*)?', cache=3DFalse) +doc_inline_start =3D Re(r'^\s*/\*\*\s*$', cache=3DFalse) +doc_inline_sect =3D Re(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=3DFalse) +doc_inline_end =3D Re(r'^\s*\*/\s*$', cache=3DFalse) +doc_inline_oneline =3D Re(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cac= he=3DFalse) +attribute =3D Re(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)", + flags=3Dre.I | re.S, cache=3DFalse) + +export_symbol =3D Re(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cac= he=3DFalse) +export_symbol_ns =3D Re(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"= \S+"\)\s*', cache=3DFalse) + +type_param =3D Re(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=3DFalse) + + 
+class KernelDoc: + """ + Read a C language source or header FILE and extract embedded + documentation comments. + """ + + # Parser states + STATE_NORMAL =3D 0 # normal code + STATE_NAME =3D 1 # looking for function name + STATE_BODY_MAYBE =3D 2 # body - or maybe more description + STATE_BODY =3D 3 # the body of the comment + STATE_BODY_WITH_BLANK_LINE =3D 4 # the body which has a blank line + STATE_PROTO =3D 5 # scanning prototype + STATE_DOCBLOCK =3D 6 # documentation block + STATE_INLINE =3D 7 # gathering doc outside main block + + st_name =3D [ + "NORMAL", + "NAME", + "BODY_MAYBE", + "BODY", + "BODY_WITH_BLANK_LINE", + "PROTO", + "DOCBLOCK", + "INLINE", + ] + + # Inline documentation state + STATE_INLINE_NA =3D 0 # not applicable ($state !=3D STATE_INLINE) + STATE_INLINE_NAME =3D 1 # looking for member name (@foo:) + STATE_INLINE_TEXT =3D 2 # looking for member documentation + STATE_INLINE_END =3D 3 # done + STATE_INLINE_ERROR =3D 4 # error - Comment without header was found. + # Spit a warning as it's not + # proper kernel-doc and ignore the rest. 
+ + st_inline_name =3D [ + "", + "_NAME", + "_TEXT", + "_END", + "_ERROR", + ] + + # Section names + + section_default =3D "Description" # default section + section_intro =3D "Introduction" + section_context =3D "Context" + section_return =3D "Return" + + undescribed =3D "-- undescribed --" + + def __init__(self, config, fname): + """Initialize internal variables""" + + self.fname =3D fname + self.config =3D config + + # Initial state for the state machines + self.state =3D self.STATE_NORMAL + self.inline_doc_state =3D self.STATE_INLINE_NA + + # Store entry currently being processed + self.entry =3D None + + # Place all potential outputs into an array + self.entries =3D [] + + def show_warnings(self, dtype, declaration_name): # pylint: disable= =3DW0613 + """ + Allow filtering out warnings + """ + + # TODO: implement it + + return True + + # TODO: rename to emit_message + def emit_warning(self, ln, msg, warning=3DTrue): + """Emit a message""" + + if warning: + self.config.log.warning("%s:%d %s", self.fname, ln, msg) + else: + self.config.log.info("%s:%d %s", self.fname, ln, msg) + + def dump_section(self, start_new=3DTrue): + """ + Dumps section contents to arrays/hashes intended for that purpose. + """ + + name =3D self.entry.section + contents =3D self.entry.contents + + # TODO: we can prevent dumping empty sections here with: + # + # if self.entry.contents.strip("\n"): + # if start_new: + # self.entry.section =3D self.section_default + # self.entry.contents =3D "" + # + # return + # + # But, as we want to be producing the same output of the + # venerable kernel-doc Perl tool, let's just output everything, + # at least for now + + if type_param.match(name): + name =3D type_param.group(1) + + self.entry.parameterdescs[name] =3D contents + self.entry.parameterdesc_start_lines[name] =3D self.entry.new_= start_line + + self.entry.sectcheck +=3D name + " " + self.entry.new_start_line =3D 0 + + elif name =3D=3D "@...": + name =3D "..." 
+ self.entry.parameterdescs[name] =3D contents + self.entry.sectcheck +=3D name + " " + self.entry.parameterdesc_start_lines[name] =3D self.entry.new_= start_line + self.entry.new_start_line =3D 0 + + else: + if name in self.entry.sections and self.entry.sections[name] != =3D "": + # Only warn on user-specified duplicate section names + if name !=3D self.section_default: + self.emit_warning(self.entry.new_start_line, + f"duplicate section name '{name}'\n") + self.entry.sections[name] +=3D contents + else: + self.entry.sections[name] =3D contents + self.entry.sectionlist.append(name) + self.entry.section_start_lines[name] =3D self.entry.new_st= art_line + self.entry.new_start_line =3D 0 + +# self.config.log.debug("Section: %s : %s", name, pformat(vars(self= .entry))) + + if start_new: + self.entry.section =3D self.section_default + self.entry.contents =3D "" + + # TODO: rename it to store_declaration + def output_declaration(self, dtype, name, **args): + """ + Stores the entry into an entry array. + + The actual output and output filters will be handled elsewhere + """ + + # The implementation here is different than the original kernel-do= c: + # instead of checking for output filters or actually output anythi= ng, + # it just stores the declaration content at self.entries, as the + # output will happen on a separate class. 
+ # + # For now, we're keeping the same name of the function just to make it + # easier to compare the source code of both scripts + + if "declaration_start_line" not in args: + args["declaration_start_line"] =3D self.entry.declaration_star= t_line + + args["type"] =3D dtype + + # TODO: use collections.OrderedDict + + sections =3D args.get('sections', {}) + sectionlist =3D args.get('sectionlist', []) + + # Drop empty sections + # TODO: improve it to emit warnings + for section in ["Description", "Return"]: + if section in sectionlist: + if not sections[section].rstrip(): + del sections[section] + sectionlist.remove(section) + + self.entries.append((name, args)) + + self.config.log.debug("Output: %s:%s =3D %s", dtype, name, pformat= (args)) + + def reset_state(self, ln): + """ + Ancillary routine to create a new entry. It initializes all + variables used by the state machine. + """ + + self.entry =3D argparse.Namespace + + self.entry.contents =3D "" + self.entry.function =3D "" + self.entry.sectcheck =3D "" + self.entry.struct_actual =3D "" + self.entry.prototype =3D "" + + self.entry.parameterlist =3D [] + self.entry.parameterdescs =3D {} + self.entry.parametertypes =3D {} + self.entry.parameterdesc_start_lines =3D {} + + self.entry.section_start_lines =3D {} + self.entry.sectionlist =3D [] + self.entry.sections =3D {} + + self.entry.anon_struct_union =3D False + + self.entry.leading_space =3D None + + # State flags + self.state =3D self.STATE_NORMAL + self.inline_doc_state =3D self.STATE_INLINE_NA + self.entry.brcount =3D 0 + + self.entry.in_doc_sect =3D False + self.entry.declaration_start_line =3D ln + + def push_parameter(self, ln, decl_type, param, dtype, + org_arg, declaration_name): + """ + Store parameters and their descriptions at self.entry. 
+ """ + + if self.entry.anon_struct_union and dtype =3D=3D "" and param =3D= =3D "}": + return # Ignore the ending }; from anonymous struct/union + + self.entry.anon_struct_union =3D False + + param =3D Re(r'[\[\)].*').sub('', param, count=3D1) + + if dtype =3D=3D "" and param.endswith("..."): + if Re(r'\w\.\.\.$').search(param): + # For named variable parameters of the form `x...`, + # remove the dots + param =3D param[:-3] + else: + # Handles unnamed variable parameters + param =3D "..." + + if param not in self.entry.parameterdescs or \ + not self.entry.parameterdescs[param]: + + self.entry.parameterdescs[param] =3D "variable arguments" + + elif dtype =3D=3D "" and (not param or param =3D=3D "void"): + param =3D "void" + self.entry.parameterdescs[param] =3D "no arguments" + + elif dtype =3D=3D "" and param in ["struct", "union"]: + # Handle unnamed (anonymous) union or struct + dtype =3D param + param =3D "{unnamed_" + param + "}" + self.entry.parameterdescs[param] =3D "anonymous\n" + self.entry.anon_struct_union =3D True + + # Handle cache group enforcing variables: they do not need + # to be described in header files + elif "__cacheline_group" in param: + # Ignore __cacheline_group_begin and __cacheline_group_end + return + + # Warn if parameter has no description + # (but ignore ones starting with # as these are not parameters + # but inline preprocessor statements) + if param not in self.entry.parameterdescs and not param.startswith= ("#"): + self.entry.parameterdescs[param] =3D self.undescribed + + if self.show_warnings(dtype, declaration_name) and "." not in = param: + if decl_type =3D=3D 'function': + dname =3D f"{decl_type} parameter" + else: + dname =3D f"{decl_type} member" + + self.emit_warning(ln, + f"{dname} '{param}' not described in '{d= eclaration_name}'") + + # Strip spaces from param so that it is one continuous string on + # parameterlist. 
This fixes a problem where check_sections() + # cannot find a parameter like "addr[6 + 2]" because it actually + # appears as "addr[6", "+", "2]" on the parameter list. + # However, it's better to maintain the param string unchanged for + # output, so just weaken the string compare in check_sections() + # to ignore "[blah" in a parameter string. + + self.entry.parameterlist.append(param) + org_arg =3D Re(r'\s\s+').sub(' ', org_arg) + self.entry.parametertypes[param] =3D org_arg + + def save_struct_actual(self, actual): + """ + Strip all spaces from the actual param so that it looks like + one string item. + """ + + actual =3D Re(r'\s*').sub("", actual, count=3D1) + + self.entry.struct_actual +=3D actual + " " + + def create_parameter_list(self, ln, decl_type, args, + splitter, declaration_name): + """ + Creates a list of parameters, storing them at self.entry. + """ + + # temporarily replace all commas inside function pointer definition + arg_expr =3D Re(r'(\([^\),]+),') + while arg_expr.search(args): + args =3D arg_expr.sub(r"\1#", args) + + for arg in args.split(splitter): + # Strip comments + arg =3D Re(r'\/\*.*\*\/').sub('', arg) + + # Ignore argument attributes + arg =3D Re(r'\sPOS0?\s').sub(' ', arg) + + # Strip leading/trailing spaces + arg =3D arg.strip() + arg =3D Re(r'\s+').sub(' ', arg, count=3D1) + + if arg.startswith('#'): + # Treat preprocessor directive as a typeless variable just= to fill + # corresponding data structures "correctly". Catch it late= r in + # output_* subs. 
+ + # Treat preprocessor directive as a typeless variable + self.push_parameter(ln, decl_type, arg, "", + "", declaration_name) + + elif Re(r'\(.+\)\s*\(').search(arg): + # Pointer-to-function + + arg =3D arg.replace('#', ',') + + r =3D Re(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)') + if r.match(arg): + param =3D r.group(1) + else: + self.emit_warning(ln, f"Invalid param: {arg}") + param =3D arg + + dtype =3D Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r= '\1', arg) + self.save_struct_actual(param) + self.push_parameter(ln, decl_type, param, dtype, + arg, declaration_name) + + elif Re(r'\(.+\)\s*\[').search(arg): + # Array-of-pointers + + arg =3D arg.replace('#', ',') + r =3D Re(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+= \s*\]\s*)*\)') + if r.match(arg): + param =3D r.group(1) + else: + self.emit_warning(ln, f"Invalid param: {arg}") + param =3D arg + + dtype =3D Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r= '\1', arg) + + self.save_struct_actual(param) + self.push_parameter(ln, decl_type, param, dtype, + arg, declaration_name) + + elif arg: + arg =3D Re(r'\s*:\s*').sub(":", arg) + arg =3D Re(r'\s*\[').sub('[', arg) + + args =3D Re(r'\s*,\s*').split(arg) + if args[0] and '*' in args[0]: + args[0] =3D re.sub(r'(\*+)\s*', r' \1', args[0]) + + first_arg =3D [] + r =3D Re(r'^(.*\s+)(.*?\[.*\].*)$') + if args[0] and r.match(args[0]): + args.pop(0) + first_arg.extend(r.group(1)) + first_arg.append(r.group(2)) + else: + first_arg =3D Re(r'\s+').split(args.pop(0)) + + args.insert(0, first_arg.pop()) + dtype =3D ' '.join(first_arg) + + for param in args: + if Re(r'^(\*+)\s*(.*)').match(param): + r =3D Re(r'^(\*+)\s*(.*)') + if not r.match(param): + self.emit_warning(ln, f"Invalid param: {param}= ") + continue + + param =3D r.group(1) + + self.save_struct_actual(r.group(2)) + self.push_parameter(ln, decl_type, r.group(2), + f"{dtype} {r.group(1)}", + arg, declaration_name) + + elif Re(r'(.*?):(\w+)').search(param): + r =3D Re(r'(.*?):(\w+)') + if not r.match(param): + 
self.emit_warning(ln, f"Invalid param: {param}= ") + continue + + if dtype !=3D "": # Skip unnamed bit-fields + self.save_struct_actual(r.group(1)) + self.push_parameter(ln, decl_type, r.group(1), + f"{dtype}:{r.group(2)}", + arg, declaration_name) + else: + self.save_struct_actual(param) + self.push_parameter(ln, decl_type, param, dtype, + arg, declaration_name) + + def check_sections(self, ln, decl_name, decl_type, sectcheck, prmschec= k): + """ + Check for errors inside sections, emitting warnings if not found + parameters are described. + """ + + sects =3D sectcheck.split() + prms =3D prmscheck.split() + err =3D False + + for sx in range(len(sects)): # pylint: disable=3D= C0200 + err =3D True + for px in range(len(prms)): # pylint: disable=3D= C0200 + prm_clean =3D prms[px] + prm_clean =3D Re(r'\[.*\]').sub('', prm_clean) + prm_clean =3D attribute.sub('', prm_clean) + + # ignore array size in a parameter string; + # however, the original param string may contain + # spaces, e.g.: addr[6 + 2] + # and this appears in @prms as "addr[6" since the + # parameter list is split at spaces; + # hence just ignore "[..." for the sections check; + prm_clean =3D Re(r'\[.*').sub('', prm_clean) + + if prm_clean =3D=3D sects[sx]: + err =3D False + break + + if err: + if decl_type =3D=3D 'function': + dname =3D f"{decl_type} parameter" + else: + dname =3D f"{decl_type} member" + + self.emit_warning(ln, + f"Excess {dname} '{sects[sx]}' descripti= on in '{decl_name}'") + + def check_return_section(self, ln, declaration_name, return_type): + """ + If the function doesn't return void, warns about the lack of a + return description. 
+ """ + + if not self.config.wreturn: + return + + # Ignore an empty return type (It's a macro) + # Ignore functions with a "void" return type (but not "void *") + if not return_type or Re(r'void\s*\w*\s*$').search(return_type): + return + + if not self.entry.sections.get("Return", None): + self.emit_warning(ln, + f"No description found for return value of '= {declaration_name}'") + + def dump_struct(self, ln, proto): + """ + Store an entry for a struct or union + """ + + type_pattern =3D r'(struct|union)' + + qualifiers =3D [ + "__attribute__", + "__packed", + "__aligned", + "____cacheline_aligned_in_smp", + "____cacheline_aligned", + ] + + definition_body =3D r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) = + ")?" + struct_members =3D Re(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\= })([^\{\}\;]*)(\;)') + + # Extract struct/union definition + members =3D None + declaration_name =3D None + decl_type =3D None + + r =3D Re(type_pattern + r'\s+(\w+)\s*' + definition_body) + if r.search(proto): + decl_type =3D r.group(1) + declaration_name =3D r.group(2) + members =3D r.group(3) + else: + r =3D Re(r'typedef\s+' + type_pattern + r'\s*' + definition_bo= dy + r'\s*(\w+)\s*;') + + if r.search(proto): + decl_type =3D r.group(1) + declaration_name =3D r.group(3) + members =3D r.group(2) + + if not members: + self.emit_warning(ln, f"{proto} error: Cannot parse struct or = union!") + self.config.errors +=3D 1 + return + + if self.entry.identifier !=3D declaration_name: + self.emit_warning(ln, + f"expecting prototype for {decl_type} {self.= entry.identifier}.
Prototype was for {decl_type} {declaration_name} instead= \n") + return + + args_pattern =3D r'([^,)]+)' + + sub_prefixes =3D [ + (Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I), = ''), + (Re(r'\/\*\s*private:.*', re.S | re.I), ''), + + # Strip comments + (Re(r'\/\*.*?\*\/', re.S), ''), + + # Strip attributes + (attribute, ' '), + (Re(r'\s*__aligned\s*\([^;]*\)', re.S), ' '), + (Re(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '), + (Re(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '), + (Re(r'\s*__packed\s*', re.S), ' '), + (Re(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '), + (Re(r'\s*____cacheline_aligned_in_smp', re.S), ' '), + (Re(r'\s*____cacheline_aligned', re.S), ' '), + + # Unwrap struct_group macros based on this definition: + # __struct_group(TAG, NAME, ATTRS, MEMBERS...) + # which has variants like: struct_group(NAME, MEMBERS...) + # Only MEMBERS arguments require documentation. + # + # Parsing them happens on two steps: + # + # 1. drop struct group arguments that aren't at MEMBERS, + # storing them as STRUCT_GROUP(MEMBERS) + # + # 2. remove STRUCT_GROUP() ancillary macro. + # + # The original logic used to remove STRUCT_GROUP() using an + # advanced regex: + # + # \bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*; + # + # with two patterns that are incompatible with + # Python re module, as it has: + # + # - a recursive pattern: (?1) + # - an atomic grouping: (?>...) + # + # I tried a simpler version: but it didn't work either: + # \bSTRUCT_GROUP\(([^\)]+)\)[^;]*; + # + # As it doesn't properly match the end parenthesis on some cas= es. + # + # So, a better solution was crafted: there's now a NestedMatch + # class that ensures that delimiters after a search are proper= ly + # matched. So, the implementation to drop STRUCT_GROUP() will = be + # handled in separate. 
+ + (Re(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('), + (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_GR= OUP('), + (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r'st= ruct \1 \2; STRUCT_GROUP('), + (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROUP= ('), + + # Replace macros + # + # TODO: it is better to also move those to the NestedMatch log= ic, + # to ensure that parenthesis will be properly matched. + + (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S),= r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'), + (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DEC= LARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'), + (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pat= tern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'), + (Re(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_= pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'), + (Re(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_patt= ern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'), + (Re(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_= pattern + r'\)', re.S), r'\2 *\1'), + (Re(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*'= + args_pattern + r'\)', re.S), r'\1 \2[]'), + (Re(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S= ), r'dma_addr_t \1'), + (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S)= , r'__u32 \1'), + ] + + # Regexes here are guaranteed to have the end limiter matching + # the start delimiter. Yet, right now, only one replace group + # is allowed. 
+ + sub_nested_prefixes =3D [ + (re.compile(r'\bSTRUCT_GROUP\('), r'\1'), + ] + + for search, sub in sub_prefixes: + members =3D search.sub(sub, members) + + nested =3D NestedMatch() + + for search, sub in sub_nested_prefixes: + members =3D nested.sub(search, sub, members) + + # Keeps the original declaration as-is + declaration =3D members + + # Split nested struct/union elements + # + # This loop was simpler at the original kernel-doc perl version, as + # while ($members =3D~ m/$struct_members/) { ... } + # reads 'members' string on each interaction. + # + # Python behavior is different: it parses 'members' only once, + # creating a list of tuples from the first interaction. + # + # On other words, this won't get nested structs. + # + # So, we need to have an extra loop on Python to override such + # re limitation. + + while True: + tuples =3D struct_members.findall(members) + if not tuples: + break + + for t in tuples: + newmember =3D "" + maintype =3D t[0] + s_ids =3D t[5] + content =3D t[3] + + oldmember =3D "".join(t) + + for s_id in s_ids.split(','): + s_id =3D s_id.strip() + + newmember +=3D f"{maintype} {s_id}; " + s_id =3D Re(r'[:\[].*').sub('', s_id) + s_id =3D Re(r'^\s*\**(\S+)\s*').sub(r'\1', s_id) + + for arg in content.split(';'): + arg =3D arg.strip() + + if not arg: + continue + + r =3D Re(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)') + if r.match(arg): + # Pointer-to-function + dtype =3D r.group(1) + name =3D r.group(2) + extra =3D r.group(3) + + if not name: + continue + + if not s_id: + # Anonymous struct/union + newmember +=3D f"{dtype}{name}{extra}; " + else: + newmember +=3D f"{dtype}{s_id}.{name}{extr= a}; " + + else: + arg =3D arg.strip() + # Handle bitmaps + arg =3D Re(r':\s*\d+\s*').sub('', arg) + + # Handle arrays + arg =3D Re(r'\[.*\]').sub('', arg) + + # Handle multiple IDs + arg =3D Re(r'\s*,\s*').sub(',', arg) + + r =3D Re(r'(.*)\s+([\S+,]+)') + + if r.search(arg): + dtype =3D r.group(1) + names =3D r.group(2) + else: + newmember +=3D 
f"{arg}; " + continue + + for name in names.split(','): + name =3D Re(r'^\s*\**(\S+)\s*').sub(r'\1',= name).strip() + + if not name: + continue + + if not s_id: + # Anonymous struct/union + newmember +=3D f"{dtype} {name}; " + else: + newmember +=3D f"{dtype} {s_id}.{name}= ; " + + members =3D members.replace(oldmember, newmember) + + # Ignore other nested elements, like enums + members =3D re.sub(r'(\{[^\{\}]*\})', '', members) + + self.create_parameter_list(ln, decl_type, members, ';', + declaration_name) + self.check_sections(ln, declaration_name, decl_type, + self.entry.sectcheck, self.entry.struct_actual) + + # Adjust declaration for better display + declaration =3D Re(r'([\{;])').sub(r'\1\n', declaration) + declaration =3D Re(r'\}\s+;').sub('};', declaration) + + # Better handle inlined enums + while True: + r =3D Re(r'(enum\s+\{[^\}]+),([^\n])') + if not r.search(declaration): + break + + declaration =3D r.sub(r'\1,\n\2', declaration) + + def_args =3D declaration.split('\n') + level =3D 1 + declaration =3D "" + for clause in def_args: + + clause =3D clause.strip() + clause =3D Re(r'\s+').sub(' ', clause, count=3D1) + + if not clause: + continue + + if '}' in clause and level > 1: + level -=3D 1 + + if not Re(r'^\s*#').match(clause): + declaration +=3D "\t" * level + + declaration +=3D "\t" + clause + "\n" + if "{" in clause and "}" not in clause: + level +=3D 1 + + self.output_declaration(decl_type, declaration_name, + struct=3Ddeclaration_name, + module=3Dself.entry.modulename, + definition=3Ddeclaration, + parameterlist=3Dself.entry.parameterlist, + parameterdescs=3Dself.entry.parameterdescs, + parametertypes=3Dself.entry.parametertypes, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpose) + + def dump_enum(self, ln, proto): + """ + Stores an enum inside self.entries array. 
+ """ + + # Ignore members marked private + proto =3D Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=3Dr= e.S).sub('', proto) + proto =3D Re(r'\/\*\s*private:.*}', flags=3Dre.S).sub('}', proto) + + # Strip comments + proto =3D Re(r'\/\*.*?\*\/', flags=3Dre.S).sub('', proto) + + # Strip #define macros inside enums + proto =3D Re(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=3Dre= .S).sub('', proto) + + members =3D None + declaration_name =3D None + + r =3D Re(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;') + if r.search(proto): + declaration_name =3D r.group(2) + members =3D r.group(1).rstrip() + else: + r =3D Re(r'enum\s+(\w*)\s*\{(.*)\}') + if r.match(proto): + declaration_name =3D r.group(1) + members =3D r.group(2).rstrip() + + if not members: + self.emit_warning(ln, f"{proto}: error: Cannot parse enum!") + self.config.errors +=3D 1 + return + + if self.entry.identifier !=3D declaration_name: + if self.entry.identifier =3D=3D "": + self.emit_warning(ln, + f"{proto}: wrong kernel-doc identifier o= n prototype") + else: + self.emit_warning(ln, + f"expecting prototype for enum {self.ent= ry.identifier}. 
Prototype was for enum {declaration_name} instead") + return + + if not declaration_name: + declaration_name =3D "(anonymous)" + + member_set =3D set() + + members =3D Re(r'\([^;]*?[\)]').sub('', members) + + for arg in members.split(','): + if not arg: + continue + arg =3D Re(r'^\s*(\w+).*').sub(r'\1', arg) + self.entry.parameterlist.append(arg) + if arg not in self.entry.parameterdescs: + self.entry.parameterdescs[arg] =3D self.undescribed + if self.show_warnings("enum", declaration_name): + self.emit_warning(ln, + f"Enum value '{arg}' not described i= n enum '{declaration_name}'") + member_set.add(arg) + + for k in self.entry.parameterdescs: + if k not in member_set: + if self.show_warnings("enum", declaration_name): + self.emit_warning(ln, + f"Excess enum value '%{k}' descripti= on in '{declaration_name}'") + + self.output_declaration('enum', declaration_name, + enum=3Ddeclaration_name, + module=3Dself.config.modulename, + parameterlist=3Dself.entry.parameterlist, + parameterdescs=3Dself.entry.parameterdescs, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpose) + + def dump_declaration(self, ln, prototype): + """ + Stores a data declaration inside self.entries array. + """ + + if self.entry.decl_type =3D=3D "enum": + self.dump_enum(ln, prototype) + return + + if self.entry.decl_type =3D=3D "typedef": + self.dump_typedef(ln, prototype) + return + + if self.entry.decl_type in ["union", "struct"]: + self.dump_struct(ln, prototype) + return + + # TODO: handle other types + self.output_declaration(self.entry.decl_type, prototype, + entry=3Dself.entry) + + def dump_function(self, ln, prototype): + """ + Stores a function or function macro inside self.entries array.
+ """ + + func_macro =3D False + return_type =3D '' + decl_type =3D 'function' + + # Prefixes that would be removed + sub_prefixes =3D [ + (r"^static +", "", 0), + (r"^extern +", "", 0), + (r"^asmlinkage +", "", 0), + (r"^inline +", "", 0), + (r"^__inline__ +", "", 0), + (r"^__inline +", "", 0), + (r"^__always_inline +", "", 0), + (r"^noinline +", "", 0), + (r"^__FORTIFY_INLINE +", "", 0), + (r"__init +", "", 0), + (r"__init_or_module +", "", 0), + (r"__deprecated +", "", 0), + (r"__flatten +", "", 0), + (r"__meminit +", "", 0), + (r"__must_check +", "", 0), + (r"__weak +", "", 0), + (r"__sched +", "", 0), + (r"_noprof", "", 0), + (r"__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +", "", 0), + (r"__(?:re)?alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\) +", "", = 0), + (r"__diagnose_as\s*\(\s*\S+\s*(?:,\s*\d+\s*)*\) +", "", 0), + (r"DECL_BUCKET_PARAMS\s*\(\s*(\S+)\s*,\s*(\S+)\s*\)", r"\1, \2= ", 0), + (r"__attribute_const__ +", "", 0), + + # It seems that Python support for re.X is broken: + # At least for me (Python 3.13), this didn't work +# (r""" +# __attribute__\s*\(\( +# (?: +# [\w\s]+ # attribute name +# (?:\([^)]*\))? # attribute arguments +# \s*,? # optional comma at the end +# )+ +# \)\)\s+ +# """, "", re.X), + + # So, remove whitespaces and comments from it + (r"__attribute__\s*\(\((?:[\w\s]+(?:\([^)]*\))?\s*,?)+\)\)\s+"= , "", 0), + ] + + for search, sub, flags in sub_prefixes: + prototype =3D Re(search, flags).sub(sub, prototype) + + # Macros are a special case, as they change the prototype format + new_proto =3D Re(r"^#\s*define\s+").sub("", prototype) + if new_proto !=3D prototype: + is_define_proto =3D True + prototype =3D new_proto + else: + is_define_proto =3D False + + # Yes, this truly is vile. We are looking for: + # 1. Return type (may be nothing if we're looking at a macro) + # 2. Function name + # 3. Function parameters. 
+ # + # All the while we have to watch out for function pointer paramete= rs + # (which IIRC is what the two sections are for), C types (these + # regexps don't even start to express all the possibilities), and + # so on. + # + # If you mess with these regexps, it's a good idea to check that + # the following functions' documentation still comes out right: + # - parport_register_device (function pointer parameters) + # - atomic_set (macro) + # - pci_match_device, __copy_to_user (long return type) + + name =3D r'[a-zA-Z0-9_~:]+' + prototype_end1 =3D r'[^\(]*' + prototype_end2 =3D r'[^\{]*' + prototype_end =3D fr'\(({prototype_end1}|{prototype_end2})\)' + + # Besides compiling, Perl qr{[\w\s]+} works as a non-capturing gro= up. + # So, this needs to be mapped in Python with (?:...)? or (?:...)+ + + type1 =3D r'(?:[\w\s]+)?' + type2 =3D r'(?:[\w\s]+\*+)+' + + found =3D False + + if is_define_proto: + r =3D Re(r'^()(' + name + r')\s+') + + if r.search(prototype): + return_type =3D '' + declaration_name =3D r.group(2) + func_macro =3D True + + found =3D True + + if not found: + patterns =3D [ + rf'^()({name})\s*{prototype_end}', + rf'^({type1})\s+({name})\s*{prototype_end}', + rf'^({type2})\s*({name})\s*{prototype_end}', + ] + + for p in patterns: + r =3D Re(p) + + if r.match(prototype): + + return_type =3D r.group(1) + declaration_name =3D r.group(2) + args =3D r.group(3) + + self.create_parameter_list(ln, decl_type, args, ',', + declaration_name) + + found =3D True + break + if not found: + self.emit_warning(ln, + f"cannot understand function prototype: '{pr= ototype}'") + return + + if self.entry.identifier !=3D declaration_name: + self.emit_warning(ln, + f"expecting prototype for {self.entry.identi= fier}(). 
Prototype was for {declaration_name}() instead") + return + + prms =3D " ".join(self.entry.parameterlist) + self.check_sections(ln, declaration_name, "function", + self.entry.sectcheck, prms) + + self.check_return_section(ln, declaration_name, return_type) + + if 'typedef' in return_type: + self.output_declaration(decl_type, declaration_name, + function=3Ddeclaration_name, + typedef=3DTrue, + module=3Dself.config.modulename, + functiontype=3Dreturn_type, + parameterlist=3Dself.entry.parameterli= st, + parameterdescs=3Dself.entry.parameterd= escs, + parametertypes=3Dself.entry.parametert= ypes, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpo= se, + func_macro=3Dfunc_macro) + else: + self.output_declaration(decl_type, declaration_name, + function=3Ddeclaration_name, + typedef=3DFalse, + module=3Dself.config.modulename, + functiontype=3Dreturn_type, + parameterlist=3Dself.entry.parameterli= st, + parameterdescs=3Dself.entry.parameterd= escs, + parametertypes=3Dself.entry.parametert= ypes, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpo= se, + func_macro=3Dfunc_macro) + + def dump_typedef(self, ln, proto): + """ + Stores a typedef inside self.entries array. 
+ """ + + typedef_type =3D r'((?:\s+[\w\*]+\b){1,8})\s*' + typedef_ident =3D r'\*?\s*(\w\S+)\s*' + typedef_args =3D r'\s*\((.*)\);' + + typedef1 =3D Re(r'typedef' + typedef_type + r'\(' + typedef_ident = + r'\)' + typedef_args) + typedef2 =3D Re(r'typedef' + typedef_type + typedef_ident + typede= f_args) + + # Strip comments + proto =3D Re(r'/\*.*?\*/', flags=3Dre.S).sub('', proto) + + # Parse function typedef prototypes + for r in [typedef1, typedef2]: + if not r.match(proto): + continue + + return_type =3D r.group(1).strip() + declaration_name =3D r.group(2) + args =3D r.group(3) + + if self.entry.identifier !=3D declaration_name: + self.emit_warning(ln, + f"expecting prototype for typedef {self.= entry.identifier}. Prototype was for typedef {declaration_name} instead\n") + return + + decl_type =3D 'function' + self.create_parameter_list(ln, decl_type, args, ',', declarati= on_name) + + self.output_declaration(decl_type, declaration_name, + function=3Ddeclaration_name, + typedef=3DTrue, + module=3Dself.entry.modulename, + functiontype=3Dreturn_type, + parameterlist=3Dself.entry.parameterli= st, + parameterdescs=3Dself.entry.parameterd= escs, + parametertypes=3Dself.entry.parametert= ypes, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpo= se) + return + + # Handle nested parentheses or brackets + r =3D Re(r'(\(*.\)\s*|\[*.\]\s*);$') + while r.search(proto): + proto =3D r.sub('', proto) + + # Parse simple typedefs + r =3D Re(r'typedef.*\s+(\w+)\s*;') + if r.match(proto): + declaration_name =3D r.group(1) + + if self.entry.identifier !=3D declaration_name: + self.emit_warning(ln, f"expecting prototype for typedef {s= elf.entry.identifier}. 
Prototype was for typedef {declaration_name} instead= \n") + return + + self.output_declaration('typedef', declaration_name, + typedef=3Ddeclaration_name, + module=3Dself.entry.modulename, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, + purpose=3Dself.entry.declaration_purpo= se) + return + + self.emit_warning(ln, "error: Cannot parse typedef!") + self.config.errors +=3D 1 + + @staticmethod + def process_export(function_table, line): + """ + process EXPORT_SYMBOL* tags + + This method is called both internally and externally, so, it + doesn't use self. + """ + + if export_symbol.search(line): + symbol =3D export_symbol.group(2) + function_table.add(symbol) + + if export_symbol_ns.search(line): + symbol =3D export_symbol_ns.group(2) + function_table.add(symbol) + + def process_normal(self, ln, line): + """ + STATE_NORMAL: looking for the /** to begin everything. + """ + + if not doc_start.match(line): + return + + # start a new entry + self.reset_state(ln + 1) + self.entry.in_doc_sect =3D False + + # next line is always the function name + self.state =3D self.STATE_NAME + + def process_name(self, ln, line): + """ + STATE_NAME: Looking for the "name - description" line + """ + + if doc_block.search(line): + self.entry.new_start_line =3D ln + + if not doc_block.group(1): + self.entry.section =3D self.section_intro + else: + self.entry.section =3D doc_block.group(1) + + self.state =3D self.STATE_DOCBLOCK + return + + if doc_decl.search(line): + self.entry.identifier =3D doc_decl.group(1) + self.entry.is_kernel_comment =3D False + + decl_start =3D str(doc_com) # comment block asterisk + fn_type =3D r"(?:\w+\s*\*\s*)?" # type (for non-functions) + parenthesis =3D r"(?:\(\w*\))?" 
# optional parenthesis on fu= nction + decl_end =3D r"(?:[-:].*)" # end of the name part + + # test for pointer declaration type, foo * bar() - desc + r =3D Re(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_end}= ?$") + if r.search(line): + self.entry.identifier =3D r.group(1) + + # Test for data declaration + r =3D Re(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)") + if r.search(line): + self.entry.decl_type =3D r.group(1) + self.entry.identifier =3D r.group(2) + self.entry.is_kernel_comment =3D True + else: + # Look for foo() or static void foo() - description; + # or misspelt identifier + + r1 =3D Re(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesis}\s= *{decl_end}?$") + r2 =3D Re(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenthesis= }\s*{decl_end}$") + + for r in [r1, r2]: + if r.search(line): + self.entry.identifier =3D r.group(1) + self.entry.decl_type =3D "function" + + r =3D Re(r"define\s+") + self.entry.identifier =3D r.sub("", self.entry.ide= ntifier) + self.entry.is_kernel_comment =3D True + break + + self.entry.identifier =3D self.entry.identifier.strip(" ") + + self.state =3D self.STATE_BODY + + # if there's no @param blocks need to set up default section h= ere + self.entry.section =3D self.section_default + self.entry.new_start_line =3D ln + 1 + + r =3D Re("[-:](.*)") + if r.search(line): + # strip leading/trailing/multiple spaces + self.entry.descr =3D r.group(1).strip(" ") + + r =3D Re(r"\s+") + self.entry.descr =3D r.sub(" ", self.entry.descr) + self.entry.declaration_purpose =3D self.entry.descr + self.state =3D self.STATE_BODY_MAYBE + else: + self.entry.declaration_purpose =3D "" + + if not self.entry.is_kernel_comment: + self.emit_warning(ln, + f"This comment starts with '/**', but is= n't a kernel-doc comment. 
Refer Documentation/doc-guide/kernel-doc.rst\n{li= ne}") + self.state =3D self.STATE_NORMAL + + if not self.entry.declaration_purpose and self.config.wshort_d= esc: + self.emit_warning(ln, + f"missing initial short description on l= ine:\n{line}") + + if not self.entry.identifier and self.entry.decl_type !=3D "en= um": + self.emit_warning(ln, + f"wrong kernel-doc identifier on line:\n= {line}") + self.state =3D self.STATE_NORMAL + + if self.config.verbose: + self.emit_warning(ln, + f"Scanning doc for {self.entry.decl_type= } {self.entry.identifier}", + warning=3DFalse) + + return + + # Failed to find an identifier. Emit a warning + self.emit_warning(ln, f"Cannot find identifier on line:\n{line}") + + def process_body(self, ln, line): + """ + STATE_BODY and STATE_BODY_MAYBE: the bulk of a kernel-doc comment. + """ + + if self.state =3D=3D self.STATE_BODY_WITH_BLANK_LINE: + r =3D Re(r"\s*\*\s?\S") + if r.match(line): + self.dump_section() + self.entry.section =3D self.section_default + self.entry.new_start_line =3D ln + self.entry.contents =3D "" + + if doc_sect.search(line): + self.entry.in_doc_sect =3D True + newsection =3D doc_sect.group(1) + + if newsection.lower() in ["description", "context"]: + newsection =3D newsection.title() + + # Special case: @return is a section, not a param description + if newsection.lower() in ["@return", "@returns", + "return", "returns"]: + newsection =3D "Return" + + # Perl kernel-doc has a check here for contents before section= s. + # The logic there is always false, as in_doc_sect variable is + # always true.
So, just don't implement Wcontents_before_secti= ons + + # .title() + newcontents =3D doc_sect.group(2) + if not newcontents: + newcontents =3D "" + + if self.entry.contents.strip("\n"): + self.dump_section() + + self.entry.new_start_line =3D ln + self.entry.section =3D newsection + self.entry.leading_space =3D None + + self.entry.contents =3D newcontents.lstrip() + if self.entry.contents: + self.entry.contents +=3D "\n" + + self.state =3D self.STATE_BODY + return + + if doc_end.search(line): + self.dump_section() + + # Look for doc_com + + doc_end: + r =3D Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/') + if r.match(line): + self.emit_warning(ln, f"suspicious ending line: {line}") + + self.entry.prototype =3D "" + self.entry.new_start_line =3D ln + 1 + + self.state =3D self.STATE_PROTO + return + + if doc_content.search(line): + cont =3D doc_content.group(1) + + if cont =3D=3D "": + if self.entry.section =3D=3D self.section_context: + self.dump_section() + + self.entry.new_start_line =3D ln + self.state =3D self.STATE_BODY + else: + if self.entry.section !=3D self.section_default: + self.state =3D self.STATE_BODY_WITH_BLANK_LINE + else: + self.state =3D self.STATE_BODY + + self.entry.contents +=3D "\n" + + elif self.state =3D=3D self.STATE_BODY_MAYBE: + + # Continued declaration purpose + self.entry.declaration_purpose =3D self.entry.declaration_= purpose.rstrip() + self.entry.declaration_purpose +=3D " " + cont + + r =3D Re(r"\s+") + self.entry.declaration_purpose =3D r.sub(' ', + self.entry.declarat= ion_purpose) + + else: + if self.entry.section.startswith('@') or \ + self.entry.section =3D=3D self.section_context: + if self.entry.leading_space is None: + r =3D Re(r'^(\s+)') + if r.match(cont): + self.entry.leading_space =3D len(r.group(1)) + else: + self.entry.leading_space =3D 0 + + # Double-check that the leading spaces are really spaces + pos =3D 0 + for i in range(0, self.entry.leading_space): + if cont[i] !=3D " ": + break + pos +=3D 1 + + cont =3D cont[pos:] + + # NEW LOGIC:
+ # In case it is different, update it + if self.entry.leading_space !=3D pos: + self.entry.leading_space =3D pos + + self.entry.contents +=3D cont + "\n" + return + + # Unknown line, ignore + self.emit_warning(ln, f"bad line: {line}") + + def process_inline(self, ln, line): + """STATE_INLINE: docbook comments within a prototype.""" + + if self.inline_doc_state =3D=3D self.STATE_INLINE_NAME and \ + doc_inline_sect.search(line): + self.entry.section =3D doc_inline_sect.group(1) + self.entry.new_start_line =3D ln + + self.entry.contents =3D doc_inline_sect.group(2).lstrip() + if self.entry.contents !=3D "": + self.entry.contents +=3D "\n" + + self.inline_doc_state =3D self.STATE_INLINE_TEXT + # Documentation block end */ + return + + if doc_inline_end.search(line): + if self.entry.contents not in ["", "\n"]: + self.dump_section() + + self.state =3D self.STATE_PROTO + self.inline_doc_state =3D self.STATE_INLINE_NA + return + + if doc_content.search(line): + if self.inline_doc_state =3D=3D self.STATE_INLINE_TEXT: + self.entry.contents +=3D doc_content.group(1) + "\n" + if not self.entry.contents.strip(" ").rstrip("\n"): + self.entry.contents =3D "" + + elif self.inline_doc_state =3D=3D self.STATE_INLINE_NAME: + self.emit_warning(ln, + f"Incorrect use of kernel-doc format: {l= ine}") + + self.inline_doc_state =3D self.STATE_INLINE_ERROR + + def syscall_munge(self, ln, proto): # pylint: disable=3DW0613 + """ + Handle syscall definitions + """ + + is_void =3D False + + # Strip newlines/CR's + proto =3D re.sub(r'[\r\n]+', ' ', proto) + + # Check if it's a SYSCALL_DEFINE0 + if 'SYSCALL_DEFINE0' in proto: + is_void =3D True + + # Replace SYSCALL_DEFINE with correct return type & function name + proto =3D Re(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto) + + r =3D Re(r'long\s+(sys_.*?),') + if r.search(proto): + proto =3D proto.replace(',', '(', count=3D1) + elif is_void: + proto =3D proto.replace(')', '(void)', count=3D1) + + # Now delete all of the odd-numbered commas in 
the proto + # so that argument types & names don't have a comma between them + count =3D 0 + length =3D len(proto) + + if is_void: + length =3D 0 # skip the loop if is_void + + for ix in range(length): + if proto[ix] =3D=3D ',': + count +=3D 1 + if count % 2 =3D=3D 1: + proto =3D proto[:ix] + ' ' + proto[ix + 1:] + + return proto + + def tracepoint_munge(self, ln, proto): + """ + Handle tracepoint definitions + """ + + tracepointname =3D None + tracepointargs =3D None + + # Match tracepoint name based on different patterns + r =3D Re(r'TRACE_EVENT\((.*?),') + if r.search(proto): + tracepointname =3D r.group(1) + + r =3D Re(r'DEFINE_SINGLE_EVENT\((.*?),') + if r.search(proto): + tracepointname =3D r.group(1) + + r =3D Re(r'DEFINE_EVENT\((.*?),(.*?),') + if r.search(proto): + tracepointname =3D r.group(2) + + if tracepointname: + tracepointname =3D tracepointname.lstrip() + + r =3D Re(r'TP_PROTO\((.*?)\)') + if r.search(proto): + tracepointargs =3D r.group(1) + + if not tracepointname or not tracepointargs: + self.emit_warning(ln, + f"Unrecognized tracepoint format:\n{proto}\n= ") + else: + proto =3D f"static inline void trace_{tracepointname}({tracepo= intargs})" + self.entry.identifier =3D f"trace_{self.entry.identifier}" + + return proto + + def process_proto_function(self, ln, line): + """Ancillary routine to process a function prototype""" + + # strip C99-style comments to end of line + r =3D Re(r"\/\/.*$", re.S) + line =3D r.sub('', line) + + if Re(r'\s*#\s*define').match(line): + self.entry.prototype =3D line + elif line.startswith('#'): + # Strip other macros like #ifdef/#ifndef/#endif/... 
+ pass + else: + r =3D Re(r'([^\{]*)') + if r.match(line): + self.entry.prototype +=3D r.group(1) + " " + + if '{' in line or ';' in line or Re(r'\s*#\s*define').match(line): + # strip comments + r =3D Re(r'/\*.*?\*/') + self.entry.prototype =3D r.sub('', self.entry.prototype) + + # strip newlines/cr's + r =3D Re(r'[\r\n]+') + self.entry.prototype =3D r.sub(' ', self.entry.prototype) + + # strip leading spaces + r =3D Re(r'^\s+') + self.entry.prototype =3D r.sub('', self.entry.prototype) + + # Handle self.entry.prototypes for function pointers like: + # int (*pcs_config)(struct foo) + + r =3D Re(r'^(\S+\s+)\(\s*\*(\S+)\)') + self.entry.prototype =3D r.sub(r'\1\2', self.entry.prototype) + + if 'SYSCALL_DEFINE' in self.entry.prototype: + self.entry.prototype =3D self.syscall_munge(ln, + self.entry.proto= type) + + r =3D Re(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT') + if r.search(self.entry.prototype): + self.entry.prototype =3D self.tracepoint_munge(ln, + self.entry.pr= ototype) + + self.dump_function(ln, self.entry.prototype) + self.reset_state(ln) + + def process_proto_type(self, ln, line): + """Ancillary routine to process a type""" + + # Strip newlines/cr's. + line =3D Re(r'[\r\n]+', re.S).sub(' ', line) + + # Strip leading spaces + line =3D Re(r'^\s+', re.S).sub('', line) + + # Strip trailing spaces + line =3D Re(r'\s+$', re.S).sub('', line) + + # Strip C99-style comments to the end of the line + line =3D Re(r"\/\/.*$", re.S).sub('', line) + + # To distinguish preprocessor directive from regular declaration l= ater. 
+ if line.startswith('#'): + line +=3D ";" + + r =3D Re(r'([^\{\};]*)([\{\};])(.*)') + while True: + if r.search(line): + if self.entry.prototype: + self.entry.prototype +=3D " " + self.entry.prototype +=3D r.group(1) + r.group(2) + + self.entry.brcount +=3D r.group(2).count('{') + self.entry.brcount -=3D r.group(2).count('}') + + self.entry.brcount =3D max(self.entry.brcount, 0) + + if r.group(2) =3D=3D ';' and self.entry.brcount =3D=3D 0: + self.dump_declaration(ln, self.entry.prototype) + self.reset_state(ln) + break + + line =3D r.group(3) + else: + self.entry.prototype +=3D line + break + + def process_proto(self, ln, line): + """STATE_PROTO: reading a function/whatever prototype.""" + + if doc_inline_oneline.search(line): + self.entry.section =3D doc_inline_oneline.group(1) + self.entry.contents =3D doc_inline_oneline.group(2) + + if self.entry.contents !=3D "": + self.entry.contents +=3D "\n" + self.dump_section(start_new=3DFalse) + + elif doc_inline_start.search(line): + self.state =3D self.STATE_INLINE + self.inline_doc_state =3D self.STATE_INLINE_NAME + + elif self.entry.decl_type =3D=3D 'function': + self.process_proto_function(ln, line) + + else: + self.process_proto_type(ln, line) + + def process_docblock(self, ln, line): + """STATE_DOCBLOCK: within a DOC: block.""" + + if doc_end.search(line): + self.dump_section() + self.output_declaration("doc", None, + sectionlist=3Dself.entry.sectionlist, + sections=3Dself.entry.sections, module= =3Dself.config.modulename) + self.reset_state(ln) + + elif doc_content.search(line): + self.entry.contents +=3D doc_content.group(1) + "\n" + + def run(self): + """ + Open and process each line of a C source file. + The parsing is controlled via a state machine, and the line is pass= ed + to a different process function depending on the state. The process + function may update the state as needed.
+ """ + + cont =3D False + prev =3D "" + prev_ln =3D None + + try: + with open(self.fname, "r", encoding=3D"utf8", + errors=3D"backslashreplace") as fp: + for ln, line in enumerate(fp): + + line =3D line.expandtabs().strip("\n") + + # Group continuation lines on prototypes + if self.state =3D=3D self.STATE_PROTO: + if line.endswith("\\"): + prev +=3D line.removesuffix("\\") + cont =3D True + + if not prev_ln: + prev_ln =3D ln + + continue + + if cont: + ln =3D prev_ln + line =3D prev + line + prev =3D "" + cont =3D False + prev_ln =3D None + + self.config.log.debug("%d %s%s: %s", + ln, self.st_name[self.state], + self.st_inline_name[self.inline_= doc_state], + line) + + # TODO: not all states allow EXPORT_SYMBOL*, so this + # can be optimized later on to speedup parsing + self.process_export(self.config.function_table, line) + + # Hand this line to the appropriate state handler + if self.state =3D=3D self.STATE_NORMAL: + self.process_normal(ln, line) + elif self.state =3D=3D self.STATE_NAME: + self.process_name(ln, line) + elif self.state in [self.STATE_BODY, self.STATE_BODY_M= AYBE, + self.STATE_BODY_WITH_BLANK_LINE]: + self.process_body(ln, line) + elif self.state =3D=3D self.STATE_INLINE: # scanning = for inline parameters + self.process_inline(ln, line) + elif self.state =3D=3D self.STATE_PROTO: + self.process_proto(ln, line) + elif self.state =3D=3D self.STATE_DOCBLOCK: + self.process_docblock(ln, line) + except OSError: + self.config.log.error(f"Error: Cannot open file {self.fname}") + self.config.errors +=3D 1 --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9AF142673BD; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=c6j+564gTXm7E4bL5UwgJfGocnjyGQ4sF99Q6bXUOlE6vUm42X+wAaABRlJH/V9JpPeLGOAyFyk+NyQ4Yq/M7BIjpPeu+pZjWZxIXWZ5myuQOIZGPWsEP4aTvVCQgaSEQg7owbZGCWCEfwIoIdmCrqeOTBvvPNoq35hcTkYKFeA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=MVRuaiqzwNT41sqHa7MqwAn2eO+Z5qjUAPM1cDdmHsA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nKLzf6ZYiuhTDi0+GLvsBlZbfPtVuriod9Y+fPjWTv9KrbARLXkK0/tTT2MYuskQHAu7obznHLbbt27o5XnW/KKRxzNyhgYDLrzSM9Oo4c0nUsmM663JFgjlKdnzlHE43Dcmc0pBCgggwTIZm+Zu5D6exVdKvRijv+fKud8wIMM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=AUz/Ozyf; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="AUz/Ozyf" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D7AD3C4CEF5; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=MVRuaiqzwNT41sqHa7MqwAn2eO+Z5qjUAPM1cDdmHsA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=AUz/OzyfgFxI4AJ3JQd72trrR1cJ+kLo7xlTZLaAgQTQZ+qmFvZxcCU7hhlw6SF4V 39WoFttJMesOd5HRcIqijDrd4pHbl75n/UUo0XE3EVYGOGVM1TCHq8sF5lk+7RQV2p HQ2NwC1flRfxxcG17wuIMx2PgaIUpyWk8ipLlbFnYeg/JvwELncppfZoq0xRYtwqHd 3c4MppF9oi1WZno/9Ynapzpj62bt7/pT6HBZPnHQscXWT28csOEossGphkq7edFdQm 0IrbmcWQ7yyP9dDlJvSpnCfS+ebJlCd6NSpL8NrSjr51Htrpe9aoyqFClYjjNy0EIY MroabyA47rexw== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVg-0fct; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 09/33] scripts/kernel-doc.py: move KernelFiles class to a separate 
file Date: Tue, 8 Apr 2025 18:09:12 +0800 Message-ID: <80bc855e128a9ff0a11df5afe9ba71775dfc9a0f.1744106241.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" The KernelFiles class is the main dispatcher which parses each source file. In preparation for letting the kerneldoc Sphinx extension import Python libraries, move it to a separate file. Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 220 +-------------------------- scripts/lib/kdoc/kdoc_files.py | 270 +++++++++++++++++++++++++++++++++ 2 files changed, 271 insertions(+), 219 deletions(-) create mode 100755 scripts/lib/kdoc/kdoc_files.py diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index f030a36a165b..d09ada2d862a 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -119,6 +119,7 @@ sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR)) =20 from kdoc_parser import KernelDoc, type_param from kdoc_re import Re +from kdoc_files import KernelFiles =20 function_pointer =3D Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=3DFalse) =20 @@ -143,225 +144,6 @@ type_member =3D Re(r"\&([_\w]+)(\.|->)([_\w]+)", cach= e=3DFalse) type_fallback =3D Re(r"\&([_\w]+)", cache=3DFalse) type_member_func =3D type_member + Re(r"\(\)", cache=3DFalse) =20 -class GlobSourceFiles: - """ - Parse C source code file names and directories via an Interactor. - - """ - - def __init__(self, srctree=3DNone, valid_extensions=3DNone): - """ - Initialize valid extensions with a tuple. - - If not defined, assume default C extensions (.c and .h) - - It would be possible to use python's glob function, but it is - very slow, and it is not interactive. So, it would wait to read all - directories before actually do something.
- - So, let's use our own implementation. - """ - - if not valid_extensions: - self.extensions =3D (".c", ".h") - else: - self.extensions =3D valid_extensions - - self.srctree =3D srctree - - def _parse_dir(self, dirname): - """Internal function to parse files recursively""" - - with os.scandir(dirname) as obj: - for entry in obj: - name =3D os.path.join(dirname, entry.name) - - if entry.is_dir(): - yield from self._parse_dir(name) - - if not entry.is_file(): - continue - - basename =3D os.path.basename(name) - - if not basename.endswith(self.extensions): - continue - - yield name - - def parse_files(self, file_list, file_not_found_cb): - for fname in file_list: - if self.srctree: - f =3D os.path.join(self.srctree, fname) - else: - f =3D fname - - if os.path.isdir(f): - yield from self._parse_dir(f) - elif os.path.isfile(f): - yield f - elif file_not_found_cb: - file_not_found_cb(fname) - - -class KernelFiles(): - - def parse_file(self, fname): - - doc =3D KernelDoc(self.config, fname) - doc.run() - - return doc - - def process_export_file(self, fname): - try: - with open(fname, "r", encoding=3D"utf8", - errors=3D"backslashreplace") as fp: - for line in fp: - KernelDoc.process_export(self.config.function_table, l= ine) - - except IOError: - print(f"Error: Cannot open fname {fname}", fname=3Dsys.stderr) - self.config.errors +=3D 1 - - def file_not_found_cb(self, fname): - self.config.log.error("Cannot find file %s", fname) - self.config.errors +=3D 1 - - def __init__(self, files=3DNone, verbose=3DFalse, out_style=3DNone, - werror=3DFalse, wreturn=3DFalse, wshort_desc=3DFalse, - wcontents_before_sections=3DFalse, - logger=3DNone, modulename=3DNone, export_file=3DNone): - """Initialize startup variables and parse all files""" - - - if not verbose: - verbose =3D bool(os.environ.get("KBUILD_VERBOSE", 0)) - - if not modulename: - modulename =3D "Kernel API" - - dt =3D datetime.now() - if os.environ.get("KBUILD_BUILD_TIMESTAMP", None): - # use UTC TZ - to_zone =3D 
tz.gettz('UTC') - dt =3D dt.astimezone(to_zone) - - if not werror: - kcflags =3D os.environ.get("KCFLAGS", None) - if kcflags: - match =3D re.search(r"(\s|^)-Werror(\s|$)/", kcflags) - if match: - werror =3D True - - # reading this variable is for backwards compat just in case - # someone was calling it with the variable from outside the - # kernel's build system - kdoc_werror =3D os.environ.get("KDOC_WERROR", None) - if kdoc_werror: - werror =3D kdoc_werror - - # Set global config data used on all files - self.config =3D argparse.Namespace - - self.config.verbose =3D verbose - self.config.werror =3D werror - self.config.wreturn =3D wreturn - self.config.wshort_desc =3D wshort_desc - self.config.wcontents_before_sections =3D wcontents_before_sections - self.config.modulename =3D modulename - - self.config.function_table =3D set() - self.config.source_map =3D {} - - if not logger: - self.config.log =3D logging.getLogger("kernel-doc") - else: - self.config.log =3D logger - - self.config.kernel_version =3D os.environ.get("KERNELVERSION", - "unknown kernel versio= n'") - self.config.src_tree =3D os.environ.get("SRCTREE", None) - - self.out_style =3D out_style - self.export_file =3D export_file - - # Initialize internal variables - - self.config.errors =3D 0 - self.results =3D [] - - self.file_list =3D files - self.files =3D set() - - def parse(self): - """ - Parse all files - """ - - glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) - - # Let's use a set here to avoid duplicating files - - for fname in glob.parse_files(self.file_list, self.file_not_found_= cb): - if fname in self.files: - continue - - self.files.add(fname) - - res =3D self.parse_file(fname) - self.results.append((res.fname, res.entries)) - - if not self.files: - sys.exit(1) - - # If a list of export files was provided, parse EXPORT_SYMBOL* - # from the ones not already parsed - - if self.export_file: - files =3D self.files - - glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) - - for 
fname in glob.parse_files(self.export_file, - self.file_not_found_cb): - if fname not in files: - files.add(fname) - - self.process_export_file(fname) - - def out_msg(self, fname, name, arg): - # TODO: filter out unwanted parts - - return self.out_style.msg(fname, name, arg) - - def msg(self, enable_lineno=3DFalse, export=3DFalse, internal=3DFalse, - symbol=3DNone, nosymbol=3DNone): - - function_table =3D self.config.function_table - - if symbol: - for s in symbol: - function_table.add(s) - - # Output none mode: only warnings will be shown - if not self.out_style: - return - - self.out_style.set_config(self.config) - - self.out_style.set_filter(export, internal, symbol, nosymbol, - function_table, enable_lineno) - - for fname, arg_tuple in self.results: - for name, arg in arg_tuple: - if self.out_msg(fname, name, arg): - ln =3D arg.get("ln", 0) - dtype =3D arg.get('type', "") - - self.config.log.warning("%s:%d Can't handle %s", - fname, ln, dtype) - =20 class OutputFormat: # output mode. diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py new file mode 100755 index 000000000000..8bcdc7ead984 --- /dev/null +++ b/scripts/lib/kdoc/kdoc_files.py @@ -0,0 +1,270 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# Copyright(c) 2025: Mauro Carvalho Chehab . +# +# pylint: disable=3DR0903,R0913,R0914,R0917 + +# TODO: implement warning filtering + +""" +Parse kernel-doc tags on multiple kernel source files. +""" + +import argparse +import logging +import os +import re +import sys +from datetime import datetime + +from dateutil import tz + +from kdoc_parser import KernelDoc + + +class GlobSourceFiles: + """ + Parse C source code file names and directories via an iterator. + """ + + def __init__(self, srctree=3DNone, valid_extensions=3DNone): + """ + Initialize valid extensions with a tuple.
+ + If not defined, assume default C extensions (.c and .h) + + It would be possible to use python's glob function, but it is + very slow, and it is not iterative. So, it would wait to read all + directories before actually doing anything. + + So, let's use our own implementation. + """ + + if not valid_extensions: + self.extensions =3D (".c", ".h") + else: + self.extensions =3D valid_extensions + + self.srctree =3D srctree + + def _parse_dir(self, dirname): + """Internal function to parse files recursively""" + + with os.scandir(dirname) as obj: + for entry in obj: + name =3D os.path.join(dirname, entry.name) + + if entry.is_dir(): + yield from self._parse_dir(name) + + if not entry.is_file(): + continue + + basename =3D os.path.basename(name) + + if not basename.endswith(self.extensions): + continue + + yield name + + def parse_files(self, file_list, file_not_found_cb): + """ + Define an iterator to parse all source files from file_list, + handling directories if any + """ + + for fname in file_list: + if self.srctree: + f =3D os.path.join(self.srctree, fname) + else: + f =3D fname + + if os.path.isdir(f): + yield from self._parse_dir(f) + elif os.path.isfile(f): + yield f + elif file_not_found_cb: + file_not_found_cb(fname) + + +class KernelFiles(): + """ + Parse kernel-doc tags on multiple kernel source files. + """ + + def parse_file(self, fname): + """ + Parse a single Kernel source. + """ + + doc =3D KernelDoc(self.config, fname) + doc.run() + + return doc + + def process_export_file(self, fname): + """ + Parses EXPORT_SYMBOL* macros from a single Kernel source file. + """ + try: + with open(fname, "r", encoding=3D"utf8", + errors=3D"backslashreplace") as fp: + for line in fp: + KernelDoc.process_export(self.config.function_table, l= ine) + + except IOError: + print(f"Error: Cannot open file {fname}", file=3Dsys.stderr) + self.config.errors +=3D 1 + + def file_not_found_cb(self, fname): + """ + Callback to warn if a file was not found.
+ """ + + self.config.log.error("Cannot find file %s", fname) + self.config.errors +=3D 1 + + def __init__(self, files=3DNone, verbose=3DFalse, out_style=3DNone, + werror=3DFalse, wreturn=3DFalse, wshort_desc=3DFalse, + wcontents_before_sections=3DFalse, + logger=3DNone, modulename=3DNone, export_file=3DNone): + """ + Initialize startup variables + """ + + if not verbose: + verbose =3D bool(os.environ.get("KBUILD_VERBOSE", 0)) + + if not modulename: + modulename =3D "Kernel API" + + dt =3D datetime.now() + if os.environ.get("KBUILD_BUILD_TIMESTAMP", None): + # use UTC TZ + to_zone =3D tz.gettz('UTC') + dt =3D dt.astimezone(to_zone) + + if not werror: + kcflags =3D os.environ.get("KCFLAGS", None) + if kcflags: + match =3D re.search(r"(\s|^)-Werror(\s|$)", kcflags) + if match: + werror =3D True + + # reading this variable is for backwards compat just in case + # someone was calling it with the variable from outside the + # kernel's build system + kdoc_werror =3D os.environ.get("KDOC_WERROR", None) + if kdoc_werror: + werror =3D kdoc_werror + + # Set global config data used on all files + self.config =3D argparse.Namespace() + + self.config.verbose =3D verbose + self.config.werror =3D werror + self.config.wreturn =3D wreturn + self.config.wshort_desc =3D wshort_desc + self.config.wcontents_before_sections =3D wcontents_before_sections + self.config.modulename =3D modulename + + self.config.function_table =3D set() + self.config.source_map =3D {} + + if not logger: + self.config.log =3D logging.getLogger("kernel-doc") + else: + self.config.log =3D logger + + self.config.kernel_version =3D os.environ.get("KERNELVERSION", + "unknown kernel versio= n") + self.config.src_tree =3D os.environ.get("SRCTREE", None) + + self.out_style =3D out_style + self.export_file =3D export_file + + # Initialize internal variables + + self.config.errors =3D 0 + self.results =3D [] + + self.file_list =3D files + self.files =3D set() + + def parse(self): + """ + Parse all
files + """ + + glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) + + # Let's use a set here to avoid duplicating files + + for fname in glob.parse_files(self.file_list, self.file_not_found_= cb): + if fname in self.files: + continue + + self.files.add(fname) + + res =3D self.parse_file(fname) + self.results.append((res.fname, res.entries)) + + if not self.files: + sys.exit(1) + + # If a list of export files was provided, parse EXPORT_SYMBOL* + # from the ones not already parsed + + if self.export_file: + files =3D self.files + + glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) + + for fname in glob.parse_files(self.export_file, + self.file_not_found_cb): + if fname not in files: + files.add(fname) + + self.process_export_file(fname) + + def out_msg(self, fname, name, arg): + """ + Output messages from a file name using the output style filtering. + + If the output type was not handled by the styler, return True. + """ + + # NOTE: we can add rules here to filter out unwanted parts, + # although OutputFormat.msg already does that. + + return self.out_style.msg(fname, name, arg) + + def msg(self, enable_lineno=3DFalse, export=3DFalse, internal=3DFalse, + symbol=3DNone, nosymbol=3DNone): + """ + Iterates over the kernel-doc results and outputs messages.
+ """ + + function_table =3D self.config.function_table + + if symbol: + for s in symbol: + function_table.add(s) + + # Output none mode: only warnings will be shown + if not self.out_style: + return + + self.out_style.set_config(self.config) + + self.out_style.set_filter(export, internal, symbol, nosymbol, + function_table, enable_lineno) + + for fname, arg_tuple in self.results: + for name, arg in arg_tuple: + if self.out_msg(fname, name, arg): + ln =3D arg.get("ln", 0) + dtype =3D arg.get('type', "") + + self.config.log.warning("%s:%d Can't handle %s", + fname, ln, dtype) --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DF91F267728; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106997; cv=none; b=brTIRGU1ujm3IZga3dF7nbU+YsSmPGOFv9QC8gTvR84z/kPdXSxrEvDBQRnWvBL2q4yeCrtVAW3DqeuGMtJo9/bWB4ivOOkqUKHseM1q1yNuvpQprB39LJgmes/s0heMPFUWL4zqkp4cwviwyUF8Baa0dEJpnfsTF4K4d+QNg8k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106997; c=relaxed/simple; bh=WC+f6MF8GfFGr87diUnskmJPC7KBcMzgHUcQy1jHo0Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TfER6g9h/HBF667pWSKEroHTmqedRx8xIFu108JlSUubVQ1K4OUPb5RKLjH3VAFZbKdkeRb7kWd3LYHwkUE7lMyNpC72OXHK/H5APzEGGQpLf1zSbkU7pCU6/cQJXHvkJJ2Ag6A42yrXLxZld7IBIur0P5pProd61s/+XubL9gk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JLSz1ik1; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org 
header.b="JLSz1ik1" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DDE73C4CEF8; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=WC+f6MF8GfFGr87diUnskmJPC7KBcMzgHUcQy1jHo0Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JLSz1ik1QaYVzyAQ9lMi3N82ZZgS2mb8I5/hWo0dcA4vTQiKV41iork2xAUjen4qF BCgXyrUvDK7SO7d/znIRxTAZjBFSdYoloCi71nUZJX1hy/9Mn5TWDVFA3MLqP7NiOm p5gUmhklTj3Wpmz8rZG+92h+h1ei4vFzAUmBwnlIpYYCxOxacOc+892BvrDBDC4hkp bdiI87FzfiVfPZSb1yBVcnNdQtDd3qVCfaAjjuJ5hIEfJyz3qOkTs3gPM1sEOydrVT yirH0LIbEIhtW7AnYvdaFVQpMC7eTvC+LV66enIT2hCHCHKk7mMCHmzkRUN1mk8T47 Mzs53J8mzCFTQ== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVj-0nD6; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 10/33] scripts/kernel-doc.py: move output classes to a separate file Date: Tue, 8 Apr 2025 18:09:13 +0800 Message-ID: <81087eff25d11c265019a8631f7fc8d3904795d0.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" In preparation for letting the kerneldoc Sphinx extension import Python libraries, move the kernel-doc output logic to a separate file.
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 727 +------------------------------ scripts/lib/kdoc/kdoc_output.py | 736 ++++++++++++++++++++++++++++++++ 2 files changed, 739 insertions(+), 724 deletions(-) create mode 100755 scripts/lib/kdoc/kdoc_output.py diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index d09ada2d862a..abff78e9160f 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -2,9 +2,7 @@ # SPDX-License-Identifier: GPL-2.0 # Copyright(c) 2025: Mauro Carvalho Chehab . # -# pylint: disable=3DR0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R0917,= R1702 -# pylint: disable=3DC0302,C0103,C0301 -# pylint: disable=3DC0116,C0115,W0511,W0613 +# pylint: disable=3DC0103 # # Converted from the kernel-doc script originally written in Perl # under GPLv2, copyrighted since 1998 by the following authors: @@ -102,14 +100,8 @@ documentation comment syntax. import argparse import logging import os -import re import sys =20 -from datetime import datetime -from pprint import pformat - -from dateutil import tz - # Import Python modules =20 LIB_DIR =3D "lib/kdoc" @@ -117,721 +109,8 @@ SRC_DIR =3D os.path.dirname(os.path.realpath(__file__= )) =20 sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR)) =20 -from kdoc_parser import KernelDoc, type_param -from kdoc_re import Re -from kdoc_files import KernelFiles - -function_pointer =3D Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=3DFalse) - -# match expressions used to find embedded type information -type_constant =3D Re(r"\b``([^\`]+)``\b", cache=3DFalse) -type_constant2 =3D Re(r"\%([-_*\w]+)", cache=3DFalse) -type_func =3D Re(r"(\w+)\(\)", cache=3DFalse) -type_param_ref =3D Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cac= he=3DFalse) - -# Special RST handling for func ptr params -type_fp_param =3D Re(r"\@(\w+)\(\)", cache=3DFalse) - -# Special RST handling for structs with func ptr params -type_fp_param2 =3D Re(r"\@(\w+->\S+)\(\)", cache=3DFalse) - -type_env =3D Re(r"(\$\w+)", 
cache=3DFalse) -type_enum =3D Re(r"\&(enum\s*([_\w]+))", cache=3DFalse) -type_struct =3D Re(r"\&(struct\s*([_\w]+))", cache=3DFalse) -type_typedef =3D Re(r"\&(typedef\s*([_\w]+))", cache=3DFalse) -type_union =3D Re(r"\&(union\s*([_\w]+))", cache=3DFalse) -type_member =3D Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=3DFalse) -type_fallback =3D Re(r"\&([_\w]+)", cache=3DFalse) -type_member_func =3D type_member + Re(r"\(\)", cache=3DFalse) - - -class OutputFormat: - # output mode. - OUTPUT_ALL =3D 0 # output all symbols and doc sections - OUTPUT_INCLUDE =3D 1 # output only specified symbols - OUTPUT_EXPORTED =3D 2 # output exported symbols - OUTPUT_INTERNAL =3D 3 # output non-exported symbols - - # Virtual member to be overriden at the inherited classes - highlights =3D [] - - def __init__(self): - """Declare internal vars and set mode to OUTPUT_ALL""" - - self.out_mode =3D self.OUTPUT_ALL - self.enable_lineno =3D None - self.nosymbol =3D {} - self.symbol =3D None - self.function_table =3D set() - self.config =3D None - - def set_config(self, config): - self.config =3D config - - def set_filter(self, export, internal, symbol, nosymbol, function_tabl= e, - enable_lineno): - """ - Initialize filter variables according with the requested mode. - - Only one choice is valid between export, internal and symbol. - - The nosymbol filter can be used on all modes. - """ - - self.enable_lineno =3D enable_lineno - - if symbol: - self.out_mode =3D self.OUTPUT_INCLUDE - function_table =3D symbol - elif export: - self.out_mode =3D self.OUTPUT_EXPORTED - elif internal: - self.out_mode =3D self.OUTPUT_INTERNAL - else: - self.out_mode =3D self.OUTPUT_ALL - - if nosymbol: - self.nosymbol =3D set(nosymbol) - - if function_table: - self.function_table =3D function_table - - def highlight_block(self, block): - """ - Apply the RST highlights to a sub-block of text. 
- """ - - for r, sub in self.highlights: - block =3D r.sub(sub, block) - - return block - - def check_doc(self, name): - """Check if DOC should be output""" - - if self.out_mode =3D=3D self.OUTPUT_ALL: - return True - - if self.out_mode =3D=3D self.OUTPUT_INCLUDE: - if name in self.nosymbol: - return False - - if name in self.function_table: - return True - - return False - - def check_declaration(self, dtype, name): - if name in self.nosymbol: - return False - - if self.out_mode =3D=3D self.OUTPUT_ALL: - return True - - if self.out_mode in [ self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED ]: - if name in self.function_table: - return True - - if self.out_mode =3D=3D self.OUTPUT_INTERNAL: - if dtype !=3D "function": - return True - - if name not in self.function_table: - return True - - return False - - def check_function(self, fname, name, args): - return True - - def check_enum(self, fname, name, args): - return True - - def check_typedef(self, fname, name, args): - return True - - def msg(self, fname, name, args): - - dtype =3D args.get('type', "") - - if dtype =3D=3D "doc": - self.out_doc(fname, name, args) - return False - - if not self.check_declaration(dtype, name): - return False - - if dtype =3D=3D "function": - self.out_function(fname, name, args) - return False - - if dtype =3D=3D "enum": - self.out_enum(fname, name, args) - return False - - if dtype =3D=3D "typedef": - self.out_typedef(fname, name, args) - return False - - if dtype in ["struct", "union"]: - self.out_struct(fname, name, args) - return False - - # Warn if some type requires an output logic - self.config.log.warning("doesn't now how to output '%s' block", - dtype) - - return True - - # Virtual methods to be overridden by inherited classes - def out_doc(self, fname, name, args): - pass - - def out_function(self, fname, name, args): - pass - - def out_enum(self, fname, name, args): - pass - - def out_typedef(self, fname, name, args): - pass - - def out_struct(self, fname, name, args): - pass - - 
-class RestFormat(OutputFormat): - # """Consts and functions used by ReST output""" - - highlights =3D [ - (type_constant, r"``\1``"), - (type_constant2, r"``\1``"), - - # Note: need to escape () to avoid func matching later - (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"), - (type_member, r":c:type:`\1\2\3 <\1>`"), - (type_fp_param, r"**\1\\(\\)**"), - (type_fp_param2, r"**\1\\(\\)**"), - (type_func, r"\1()"), - (type_enum, r":c:type:`\1 <\2>`"), - (type_struct, r":c:type:`\1 <\2>`"), - (type_typedef, r":c:type:`\1 <\2>`"), - (type_union, r":c:type:`\1 <\2>`"), - - # in rst this can refer to any type - (type_fallback, r":c:type:`\1`"), - (type_param_ref, r"**\1\2**") - ] - blankline =3D "\n" - - sphinx_literal =3D Re(r'^[^.].*::$', cache=3DFalse) - sphinx_cblock =3D Re(r'^\.\.\ +code-block::', cache=3DFalse) - - def __init__(self): - """ - Creates class variables. - - Not really mandatory, but it is a good coding style and makes - pylint happy. - """ - - super().__init__() - self.lineprefix =3D "" - - def print_lineno (self, ln): - """Outputs a line number""" - - if self.enable_lineno and ln: - print(f".. LINENO {ln}") - - def output_highlight(self, args): - input_text =3D args - output =3D "" - in_literal =3D False - litprefix =3D "" - block =3D "" - - for line in input_text.strip("\n").split("\n"): - - # If we're in a literal block, see if we should drop out of it. - # Otherwise, pass the line straight through unmunged. - if in_literal: - if line.strip(): # If the line is not blank - # If this is the first non-blank line in a literal blo= ck, - # figure out the proper indent. 
- if not litprefix: - r =3D Re(r'^(\s*)') - if r.match(line): - litprefix =3D '^' + r.group(1) - else: - litprefix =3D "" - - output +=3D line + "\n" - elif not Re(litprefix).match(line): - in_literal =3D False - else: - output +=3D line + "\n" - else: - output +=3D line + "\n" - - # Not in a literal block (or just dropped out) - if not in_literal: - block +=3D line + "\n" - if self.sphinx_literal.match(line) or self.sphinx_cblock.m= atch(line): - in_literal =3D True - litprefix =3D "" - output +=3D self.highlight_block(block) - block =3D "" - - # Handle any remaining block - if block: - output +=3D self.highlight_block(block) - - # Print the output with the line prefix - for line in output.strip("\n").split("\n"): - print(self.lineprefix + line) - - def out_section(self, args, out_reference=3DFalse): - """ - Outputs a block section. - - This could use some work; it's used to output the DOC: sections, a= nd - starts by putting out the name of the doc section itself, but that - tends to duplicate a header already in the template file. - """ - - sectionlist =3D args.get('sectionlist', []) - sections =3D args.get('sections', {}) - section_start_lines =3D args.get('section_start_lines', {}) - - for section in sectionlist: - # Skip sections that are in the nosymbol_table - if section in self.nosymbol: - continue - - if not self.out_mode =3D=3D self.OUTPUT_INCLUDE: - if out_reference: - print(f".. 
_{section}:\n") - - if not self.symbol: - print(f'{self.lineprefix}**{section}**\n') - - self.print_lineno(section_start_lines.get(section, 0)) - self.output_highlight(sections[section]) - print() - print() - - def out_doc(self, fname, name, args): - if not self.check_doc(name): - return - - self.out_section(args, out_reference=3DTrue) - - def out_function(self, fname, name, args): - - oldprefix =3D self.lineprefix - signature =3D "" - - func_macro =3D args.get('func_macro', False) - if func_macro: - signature =3D args['function'] - else: - if args.get('functiontype'): - signature =3D args['functiontype'] + " " - signature +=3D args['function'] + " (" - - parameterlist =3D args.get('parameterlist', []) - parameterdescs =3D args.get('parameterdescs', {}) - parameterdesc_start_lines =3D args.get('parameterdesc_start_lines'= , {}) - - ln =3D args.get('ln', 0) - - count =3D 0 - for parameter in parameterlist: - if count !=3D 0: - signature +=3D ", " - count +=3D 1 - dtype =3D args['parametertypes'].get(parameter, "") - - if function_pointer.search(dtype): - signature +=3D function_pointer.group(1) + parameter + fun= ction_pointer.group(3) - else: - signature +=3D dtype - - if not func_macro: - signature +=3D ")" - - if args.get('typedef') or not args.get('functiontype'): - print(f".. c:macro:: {args['function']}\n") - - if args.get('typedef'): - self.print_lineno(ln) - print(" **Typedef**: ", end=3D"") - self.lineprefix =3D "" - self.output_highlight(args.get('purpose', "")) - print("\n\n**Syntax**\n") - print(f" ``{signature}``\n") - else: - print(f"``{signature}``\n") - else: - print(f".. c:function:: {signature}\n") - - if not args.get('typedef'): - self.print_lineno(ln) - self.lineprefix =3D " " - self.output_highlight(args.get('purpose', "")) - print() - - # Put descriptive text into a container (HTML
) to help set - # function prototypes apart - self.lineprefix =3D " " - - if parameterlist: - print(".. container:: kernelindent\n") - print(f"{self.lineprefix}**Parameters**\n") - - for parameter in parameterlist: - parameter_name =3D Re(r'\[.*').sub('', parameter) - dtype =3D args['parametertypes'].get(parameter, "") - - if dtype: - print(f"{self.lineprefix}``{dtype}``") - else: - print(f"{self.lineprefix}``{parameter}``") - - self.print_lineno(parameterdesc_start_lines.get(parameter_name= , 0)) - - self.lineprefix =3D " " - if parameter_name in parameterdescs and \ - parameterdescs[parameter_name] !=3D KernelDoc.undescribed: - - self.output_highlight(parameterdescs[parameter_name]) - print() - else: - print(f"{self.lineprefix}*undescribed*\n") - self.lineprefix =3D " " - - self.out_section(args) - self.lineprefix =3D oldprefix - - def out_enum(self, fname, name, args): - - oldprefix =3D self.lineprefix - name =3D args.get('enum', '') - parameterlist =3D args.get('parameterlist', []) - parameterdescs =3D args.get('parameterdescs', {}) - ln =3D args.get('ln', 0) - - print(f"\n\n.. c:enum:: {name}\n") - - self.print_lineno(ln) - self.lineprefix =3D " " - self.output_highlight(args.get('purpose', '')) - print() - - print(".. container:: kernelindent\n") - outer =3D self.lineprefix + " " - self.lineprefix =3D outer + " " - print(f"{outer}**Constants**\n") - - for parameter in parameterlist: - print(f"{outer}``{parameter}``") - - if parameterdescs.get(parameter, '') !=3D KernelDoc.undescribe= d: - self.output_highlight(parameterdescs[parameter]) - else: - print(f"{self.lineprefix}*undescribed*\n") - print() - - self.lineprefix =3D oldprefix - self.out_section(args) - - def out_typedef(self, fname, name, args): - - oldprefix =3D self.lineprefix - name =3D args.get('typedef', '') - ln =3D args.get('ln', 0) - - print(f"\n\n.. 
c:type:: {name}\n") - - self.print_lineno(ln) - self.lineprefix =3D " " - - self.output_highlight(args.get('purpose', '')) - - print() - - self.lineprefix =3D oldprefix - self.out_section(args) - - def out_struct(self, fname, name, args): - - name =3D args.get('struct', "") - purpose =3D args.get('purpose', "") - declaration =3D args.get('definition', "") - dtype =3D args.get('type', "struct") - ln =3D args.get('ln', 0) - - parameterlist =3D args.get('parameterlist', []) - parameterdescs =3D args.get('parameterdescs', {}) - parameterdesc_start_lines =3D args.get('parameterdesc_start_lines'= , {}) - - print(f"\n\n.. c:{dtype}:: {name}\n") - - self.print_lineno(ln) - - oldprefix =3D self.lineprefix - self.lineprefix +=3D " " - - self.output_highlight(purpose) - print() - - print(".. container:: kernelindent\n") - print(f"{self.lineprefix}**Definition**::\n") - - self.lineprefix =3D self.lineprefix + " " - - declaration =3D declaration.replace("\t", self.lineprefix) - - print(f"{self.lineprefix}{dtype} {name}" + ' {') - print(f"{declaration}{self.lineprefix}" + "};\n") - - self.lineprefix =3D " " - print(f"{self.lineprefix}**Members**\n") - for parameter in parameterlist: - if not parameter or parameter.startswith("#"): - continue - - parameter_name =3D parameter.split("[", maxsplit=3D1)[0] - - if parameterdescs.get(parameter_name) =3D=3D KernelDoc.undescr= ibed: - continue - - self.print_lineno(parameterdesc_start_lines.get(parameter_name= , 0)) - - print(f"{self.lineprefix}``{parameter}``") - - self.lineprefix =3D " " - self.output_highlight(parameterdescs[parameter_name]) - self.lineprefix =3D " " - - print() - - print() - - self.lineprefix =3D oldprefix - self.out_section(args) - - -class ManFormat(OutputFormat): - """Consts and functions used by man pages output""" - - highlights =3D ( - (type_constant, r"\1"), - (type_constant2, r"\1"), - (type_func, r"\\fB\1\\fP"), - (type_enum, r"\\fI\1\\fP"), - (type_struct, r"\\fI\1\\fP"), - (type_typedef, r"\\fI\1\\fP"), - 
(type_union, r"\\fI\1\\fP"), - (type_param, r"\\fI\1\\fP"), - (type_param_ref, r"\\fI\1\2\\fP"), - (type_member, r"\\fI\1\2\3\\fP"), - (type_fallback, r"\\fI\1\\fP") - ) - blankline =3D "" - - def __init__(self): - """ - Creates class variables. - - Not really mandatory, but it is a good coding style and makes - pylint happy. - """ - - super().__init__() - - dt =3D datetime.now() - if os.environ.get("KBUILD_BUILD_TIMESTAMP", None): - # use UTC TZ - to_zone =3D tz.gettz('UTC') - dt =3D dt.astimezone(to_zone) - - self.man_date =3D dt.strftime("%B %Y") - - def output_highlight(self, block): - - contents =3D self.highlight_block(block) - - if isinstance(contents, list): - contents =3D "\n".join(contents) - - for line in contents.strip("\n").split("\n"): - line =3D Re(r"^\s*").sub("", line) - - if line and line[0] =3D=3D ".": - print("\\&" + line) - else: - print(line) - - def out_doc(self, fname, name, args): - module =3D args.get('module') - sectionlist =3D args.get('sectionlist', []) - sections =3D args.get('sections', {}) - - print(f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual"= LINUX') - - for section in sectionlist: - print(f'.SH "{section}"') - self.output_highlight(sections.get(section)) - - def out_function(self, fname, name, args): - """output function in man""" - - parameterlist =3D args.get('parameterlist', []) - parameterdescs =3D args.get('parameterdescs', {}) - sectionlist =3D args.get('sectionlist', []) - sections =3D args.get('sections', {}) - - print(f'.TH "{args['function']}" 9 "{args['function']}" "{self.man= _date}" "Kernel Hacker\'s Manual" LINUX') - - print(".SH NAME") - print(f"{args['function']} \\- {args['purpose']}") - - print(".SH SYNOPSIS") - if args.get('functiontype', ''): - print(f'.B "{args['functiontype']}" {args['function']}') - else: - print(f'.B "{args['function']}') - - count =3D 0 - parenth =3D "(" - post =3D "," - - for parameter in parameterlist: - if count =3D=3D len(parameterlist) - 1: - post =3D ");" - - dtype 
=3D args['parametertypes'].get(parameter, "") - if function_pointer.match(dtype): - # Pointer-to-function - print(f'".BI "{parenth}{function_pointer.group(1)}" " ") (= {function_pointer.group(2)}){post}"') - else: - dtype =3D Re(r'([^\*])$').sub(r'\1 ', dtype) - - print(f'.BI "{parenth}{dtype}" "{post}"') - count +=3D 1 - parenth =3D "" - - if parameterlist: - print(".SH ARGUMENTS") - - for parameter in parameterlist: - parameter_name =3D re.sub(r'\[.*', '', parameter) - - print(f'.IP "{parameter}" 12') - self.output_highlight(parameterdescs.get(parameter_name, "")) - - for section in sectionlist: - print(f'.SH "{section.upper()}"') - self.output_highlight(sections[section]) - - def out_enum(self, fname, name, args): - - name =3D args.get('enum', '') - parameterlist =3D args.get('parameterlist', []) - sectionlist =3D args.get('sectionlist', []) - sections =3D args.get('sections', {}) - - print(f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_= date}" "API Manual" LINUX') - - print(".SH NAME") - print(f"enum {args['enum']} \\- {args['purpose']}") - - print(".SH SYNOPSIS") - print(f"enum {args['enum']}" + " {") - - count =3D 0 - for parameter in parameterlist: - print(f'.br\n.BI " {parameter}"') - if count =3D=3D len(parameterlist) - 1: - print("\n};") - else: - print(", \n.br") - - count +=3D 1 - - print(".SH Constants") - - for parameter in parameterlist: - parameter_name =3D Re(r'\[.*').sub('', parameter) - print(f'.IP "{parameter}" 12') - self.output_highlight(args['parameterdescs'].get(parameter_nam= e, "")) - - for section in sectionlist: - print(f'.SH "{section}"') - self.output_highlight(sections[section]) - - def out_typedef(self, fname, name, args): - module =3D args.get('module') - typedef =3D args.get('typedef') - purpose =3D args.get('purpose') - sectionlist =3D args.get('sectionlist', []) - sections =3D args.get('sections', {}) - - print(f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "API Manual= " LINUX') - - print(".SH NAME") - 
print(f"typedef {typedef} \\- {purpose}") - - for section in sectionlist: - print(f'.SH "{section}"') - self.output_highlight(sections.get(section)) - - def out_struct(self, fname, name, args): - module =3D args.get('module') - struct_type =3D args.get('type') - struct_name =3D args.get('struct') - purpose =3D args.get('purpose') - definition =3D args.get('definition') - sectionlist =3D args.get('sectionlist', []) - parameterlist =3D args.get('parameterlist', []) - sections =3D args.get('sections', {}) - parameterdescs =3D args.get('parameterdescs', {}) - - print(f'.TH "{module}" 9 "{struct_type} {struct_name}" "{self.man_= date}" "API Manual" LINUX') - - print(".SH NAME") - print(f"{struct_type} {struct_name} \\- {purpose}") - - # Replace tabs with two spaces and handle newlines - declaration =3D definition.replace("\t", " ") - declaration =3D Re(r"\n").sub('"\n.br\n.BI "', declaration) - - print(".SH SYNOPSIS") - print(f"{struct_type} {struct_name} " + "{" +"\n.br") - print(f'.BI "{declaration}\n' + "};\n.br\n") - - print(".SH Members") - for parameter in parameterlist: - if parameter.startswith("#"): - continue - - parameter_name =3D re.sub(r"\[.*", "", parameter) - - if parameterdescs.get(parameter_name) =3D=3D KernelDoc.undescr= ibed: - continue - - print(f'.IP "{parameter}" 12') - self.output_highlight(parameterdescs.get(parameter_name)) - - for section in sectionlist: - print(f'.SH "{section}"') - self.output_highlight(sections.get(section)) - - -# Command line interface - +from kdoc_files import KernelFiles # pylint: disable= =3DC0413 +from kdoc_output import RestFormat, ManFormat # pylint: disable= =3DC0413 =20 DESC =3D """ Read C language source or header FILEs, extract embedded documentation com= ments, diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py new file mode 100755 index 000000000000..24e40b3e7d1d --- /dev/null +++ b/scripts/lib/kdoc/kdoc_output.py @@ -0,0 +1,736 @@ +#!/usr/bin/env python3 +# 
SPDX-License-Identifier: GPL-2.0 +# Copyright(c) 2025: Mauro Carvalho Chehab . +# +# pylint: disable=3DC0301,R0911,R0912,R0913,R0914,R0915,R0917 + +# TODO: implement warning filtering + +""" +Implement output filters to print kernel-doc documentation. + +The implementation uses a virtual base class (OutputFormat) which +contains a dispatches to virtual methods, and some code to filter +out output messages. + +The actual implementation is done on one separate class per each type +of output. Currently, there are output classes for ReST and man/troff. +""" + +import os +import re +from datetime import datetime + +from dateutil import tz + +from kdoc_parser import KernelDoc, type_param +from kdoc_re import Re + + +function_pointer =3D Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=3DFalse) + +# match expressions used to find embedded type information +type_constant =3D Re(r"\b``([^\`]+)``\b", cache=3DFalse) +type_constant2 =3D Re(r"\%([-_*\w]+)", cache=3DFalse) +type_func =3D Re(r"(\w+)\(\)", cache=3DFalse) +type_param_ref =3D Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cac= he=3DFalse) + +# Special RST handling for func ptr params +type_fp_param =3D Re(r"\@(\w+)\(\)", cache=3DFalse) + +# Special RST handling for structs with func ptr params +type_fp_param2 =3D Re(r"\@(\w+->\S+)\(\)", cache=3DFalse) + +type_env =3D Re(r"(\$\w+)", cache=3DFalse) +type_enum =3D Re(r"\&(enum\s*([_\w]+))", cache=3DFalse) +type_struct =3D Re(r"\&(struct\s*([_\w]+))", cache=3DFalse) +type_typedef =3D Re(r"\&(typedef\s*([_\w]+))", cache=3DFalse) +type_union =3D Re(r"\&(union\s*([_\w]+))", cache=3DFalse) +type_member =3D Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=3DFalse) +type_fallback =3D Re(r"\&([_\w]+)", cache=3DFalse) +type_member_func =3D type_member + Re(r"\(\)", cache=3DFalse) + + +class OutputFormat: + # output mode. 
+ OUTPUT_ALL =3D 0 # output all symbols and doc sections + OUTPUT_INCLUDE =3D 1 # output only specified symbols + OUTPUT_EXPORTED =3D 2 # output exported symbols + OUTPUT_INTERNAL =3D 3 # output non-exported symbols + + # Virtual member to be overriden at the inherited classes + highlights =3D [] + + def __init__(self): + """Declare internal vars and set mode to OUTPUT_ALL""" + + self.out_mode =3D self.OUTPUT_ALL + self.enable_lineno =3D None + self.nosymbol =3D {} + self.symbol =3D None + self.function_table =3D set() + self.config =3D None + + def set_config(self, config): + self.config =3D config + + def set_filter(self, export, internal, symbol, nosymbol, function_tabl= e, + enable_lineno): + """ + Initialize filter variables according with the requested mode. + + Only one choice is valid between export, internal and symbol. + + The nosymbol filter can be used on all modes. + """ + + self.enable_lineno =3D enable_lineno + + if symbol: + self.out_mode =3D self.OUTPUT_INCLUDE + function_table =3D symbol + elif export: + self.out_mode =3D self.OUTPUT_EXPORTED + elif internal: + self.out_mode =3D self.OUTPUT_INTERNAL + else: + self.out_mode =3D self.OUTPUT_ALL + + if nosymbol: + self.nosymbol =3D set(nosymbol) + + if function_table: + self.function_table =3D function_table + + def highlight_block(self, block): + """ + Apply the RST highlights to a sub-block of text. 
+ """ + + for r, sub in self.highlights: + block =3D r.sub(sub, block) + + return block + + def check_doc(self, name): + """Check if DOC should be output""" + + if self.out_mode =3D=3D self.OUTPUT_ALL: + return True + + if self.out_mode =3D=3D self.OUTPUT_INCLUDE: + if name in self.nosymbol: + return False + + if name in self.function_table: + return True + + return False + + def check_declaration(self, dtype, name): + if name in self.nosymbol: + return False + + if self.out_mode =3D=3D self.OUTPUT_ALL: + return True + + if self.out_mode in [self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED]: + if name in self.function_table: + return True + + if self.out_mode =3D=3D self.OUTPUT_INTERNAL: + if dtype !=3D "function": + return True + + if name not in self.function_table: + return True + + return False + + def check_function(self, fname, name, args): + return True + + def check_enum(self, fname, name, args): + return True + + def check_typedef(self, fname, name, args): + return True + + def msg(self, fname, name, args): + + dtype =3D args.get('type', "") + + if dtype =3D=3D "doc": + self.out_doc(fname, name, args) + return False + + if not self.check_declaration(dtype, name): + return False + + if dtype =3D=3D "function": + self.out_function(fname, name, args) + return False + + if dtype =3D=3D "enum": + self.out_enum(fname, name, args) + return False + + if dtype =3D=3D "typedef": + self.out_typedef(fname, name, args) + return False + + if dtype in ["struct", "union"]: + self.out_struct(fname, name, args) + return False + + # Warn if some type requires an output logic + self.config.log.warning("doesn't now how to output '%s' block", + dtype) + + return True + + # Virtual methods to be overridden by inherited classes + def out_doc(self, fname, name, args): + pass + + def out_function(self, fname, name, args): + pass + + def out_enum(self, fname, name, args): + pass + + def out_typedef(self, fname, name, args): + pass + + def out_struct(self, fname, name, args): + pass + + 
+class RestFormat(OutputFormat): + # """Consts and functions used by ReST output""" + + highlights =3D [ + (type_constant, r"``\1``"), + (type_constant2, r"``\1``"), + + # Note: need to escape () to avoid func matching later + (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"), + (type_member, r":c:type:`\1\2\3 <\1>`"), + (type_fp_param, r"**\1\\(\\)**"), + (type_fp_param2, r"**\1\\(\\)**"), + (type_func, r"\1()"), + (type_enum, r":c:type:`\1 <\2>`"), + (type_struct, r":c:type:`\1 <\2>`"), + (type_typedef, r":c:type:`\1 <\2>`"), + (type_union, r":c:type:`\1 <\2>`"), + + # in rst this can refer to any type + (type_fallback, r":c:type:`\1`"), + (type_param_ref, r"**\1\2**") + ] + blankline =3D "\n" + + sphinx_literal =3D Re(r'^[^.].*::$', cache=3DFalse) + sphinx_cblock =3D Re(r'^\.\.\ +code-block::', cache=3DFalse) + + def __init__(self): + """ + Creates class variables. + + Not really mandatory, but it is a good coding style and makes + pylint happy. + """ + + super().__init__() + self.lineprefix =3D "" + + def print_lineno(self, ln): + """Outputs a line number""" + + if self.enable_lineno and ln: + print(f".. LINENO {ln}") + + def output_highlight(self, args): + input_text =3D args + output =3D "" + in_literal =3D False + litprefix =3D "" + block =3D "" + + for line in input_text.strip("\n").split("\n"): + + # If we're in a literal block, see if we should drop out of it. + # Otherwise, pass the line straight through unmunged. + if in_literal: + if line.strip(): # If the line is not blank + # If this is the first non-blank line in a literal blo= ck, + # figure out the proper indent. 
+ if not litprefix: + r =3D Re(r'^(\s*)') + if r.match(line): + litprefix =3D '^' + r.group(1) + else: + litprefix =3D "" + + output +=3D line + "\n" + elif not Re(litprefix).match(line): + in_literal =3D False + else: + output +=3D line + "\n" + else: + output +=3D line + "\n" + + # Not in a literal block (or just dropped out) + if not in_literal: + block +=3D line + "\n" + if self.sphinx_literal.match(line) or self.sphinx_cblock.m= atch(line): + in_literal =3D True + litprefix =3D "" + output +=3D self.highlight_block(block) + block =3D "" + + # Handle any remaining block + if block: + output +=3D self.highlight_block(block) + + # Print the output with the line prefix + for line in output.strip("\n").split("\n"): + print(self.lineprefix + line) + + def out_section(self, args, out_reference=3DFalse): + """ + Outputs a block section. + + This could use some work; it's used to output the DOC: sections, a= nd + starts by putting out the name of the doc section itself, but that + tends to duplicate a header already in the template file. + """ + + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + section_start_lines =3D args.get('section_start_lines', {}) + + for section in sectionlist: + # Skip sections that are in the nosymbol_table + if section in self.nosymbol: + continue + + if not self.out_mode =3D=3D self.OUTPUT_INCLUDE: + if out_reference: + print(f".. 
_{section}:\n") + + if not self.symbol: + print(f'{self.lineprefix}**{section}**\n') + + self.print_lineno(section_start_lines.get(section, 0)) + self.output_highlight(sections[section]) + print() + print() + + def out_doc(self, fname, name, args): + if not self.check_doc(name): + return + + self.out_section(args, out_reference=3DTrue) + + def out_function(self, fname, name, args): + + oldprefix =3D self.lineprefix + signature =3D "" + + func_macro =3D args.get('func_macro', False) + if func_macro: + signature =3D args['function'] + else: + if args.get('functiontype'): + signature =3D args['functiontype'] + " " + signature +=3D args['function'] + " (" + + parameterlist =3D args.get('parameterlist', []) + parameterdescs =3D args.get('parameterdescs', {}) + parameterdesc_start_lines =3D args.get('parameterdesc_start_lines'= , {}) + + ln =3D args.get('ln', 0) + + count =3D 0 + for parameter in parameterlist: + if count !=3D 0: + signature +=3D ", " + count +=3D 1 + dtype =3D args['parametertypes'].get(parameter, "") + + if function_pointer.search(dtype): + signature +=3D function_pointer.group(1) + parameter + fun= ction_pointer.group(3) + else: + signature +=3D dtype + + if not func_macro: + signature +=3D ")" + + if args.get('typedef') or not args.get('functiontype'): + print(f".. c:macro:: {args['function']}\n") + + if args.get('typedef'): + self.print_lineno(ln) + print(" **Typedef**: ", end=3D"") + self.lineprefix =3D "" + self.output_highlight(args.get('purpose', "")) + print("\n\n**Syntax**\n") + print(f" ``{signature}``\n") + else: + print(f"``{signature}``\n") + else: + print(f".. c:function:: {signature}\n") + + if not args.get('typedef'): + self.print_lineno(ln) + self.lineprefix =3D " " + self.output_highlight(args.get('purpose', "")) + print() + + # Put descriptive text into a container (HTML
) to help set + # function prototypes apart + self.lineprefix =3D " " + + if parameterlist: + print(".. container:: kernelindent\n") + print(f"{self.lineprefix}**Parameters**\n") + + for parameter in parameterlist: + parameter_name =3D Re(r'\[.*').sub('', parameter) + dtype =3D args['parametertypes'].get(parameter, "") + + if dtype: + print(f"{self.lineprefix}``{dtype}``") + else: + print(f"{self.lineprefix}``{parameter}``") + + self.print_lineno(parameterdesc_start_lines.get(parameter_name= , 0)) + + self.lineprefix =3D " " + if parameter_name in parameterdescs and \ + parameterdescs[parameter_name] !=3D KernelDoc.undescribed: + + self.output_highlight(parameterdescs[parameter_name]) + print() + else: + print(f"{self.lineprefix}*undescribed*\n") + self.lineprefix =3D " " + + self.out_section(args) + self.lineprefix =3D oldprefix + + def out_enum(self, fname, name, args): + + oldprefix =3D self.lineprefix + name =3D args.get('enum', '') + parameterlist =3D args.get('parameterlist', []) + parameterdescs =3D args.get('parameterdescs', {}) + ln =3D args.get('ln', 0) + + print(f"\n\n.. c:enum:: {name}\n") + + self.print_lineno(ln) + self.lineprefix =3D " " + self.output_highlight(args.get('purpose', '')) + print() + + print(".. container:: kernelindent\n") + outer =3D self.lineprefix + " " + self.lineprefix =3D outer + " " + print(f"{outer}**Constants**\n") + + for parameter in parameterlist: + print(f"{outer}``{parameter}``") + + if parameterdescs.get(parameter, '') !=3D KernelDoc.undescribe= d: + self.output_highlight(parameterdescs[parameter]) + else: + print(f"{self.lineprefix}*undescribed*\n") + print() + + self.lineprefix =3D oldprefix + self.out_section(args) + + def out_typedef(self, fname, name, args): + + oldprefix =3D self.lineprefix + name =3D args.get('typedef', '') + ln =3D args.get('ln', 0) + + print(f"\n\n.. 
c:type:: {name}\n") + + self.print_lineno(ln) + self.lineprefix =3D " " + + self.output_highlight(args.get('purpose', '')) + + print() + + self.lineprefix =3D oldprefix + self.out_section(args) + + def out_struct(self, fname, name, args): + + name =3D args.get('struct', "") + purpose =3D args.get('purpose', "") + declaration =3D args.get('definition', "") + dtype =3D args.get('type', "struct") + ln =3D args.get('ln', 0) + + parameterlist =3D args.get('parameterlist', []) + parameterdescs =3D args.get('parameterdescs', {}) + parameterdesc_start_lines =3D args.get('parameterdesc_start_lines'= , {}) + + print(f"\n\n.. c:{dtype}:: {name}\n") + + self.print_lineno(ln) + + oldprefix =3D self.lineprefix + self.lineprefix +=3D " " + + self.output_highlight(purpose) + print() + + print(".. container:: kernelindent\n") + print(f"{self.lineprefix}**Definition**::\n") + + self.lineprefix =3D self.lineprefix + " " + + declaration =3D declaration.replace("\t", self.lineprefix) + + print(f"{self.lineprefix}{dtype} {name}" + ' {') + print(f"{declaration}{self.lineprefix}" + "};\n") + + self.lineprefix =3D " " + print(f"{self.lineprefix}**Members**\n") + for parameter in parameterlist: + if not parameter or parameter.startswith("#"): + continue + + parameter_name =3D parameter.split("[", maxsplit=3D1)[0] + + if parameterdescs.get(parameter_name) =3D=3D KernelDoc.undescr= ibed: + continue + + self.print_lineno(parameterdesc_start_lines.get(parameter_name= , 0)) + + print(f"{self.lineprefix}``{parameter}``") + + self.lineprefix =3D " " + self.output_highlight(parameterdescs[parameter_name]) + self.lineprefix =3D " " + + print() + + print() + + self.lineprefix =3D oldprefix + self.out_section(args) + + +class ManFormat(OutputFormat): + """Consts and functions used by man pages output""" + + highlights =3D ( + (type_constant, r"\1"), + (type_constant2, r"\1"), + (type_func, r"\\fB\1\\fP"), + (type_enum, r"\\fI\1\\fP"), + (type_struct, r"\\fI\1\\fP"), + (type_typedef, r"\\fI\1\\fP"), + 
(type_union, r"\\fI\1\\fP"), + (type_param, r"\\fI\1\\fP"), + (type_param_ref, r"\\fI\1\2\\fP"), + (type_member, r"\\fI\1\2\3\\fP"), + (type_fallback, r"\\fI\1\\fP") + ) + blankline =3D "" + + def __init__(self): + """ + Creates class variables. + + Not really mandatory, but it is a good coding style and makes + pylint happy. + """ + + super().__init__() + + dt =3D datetime.now() + if os.environ.get("KBUILD_BUILD_TIMESTAMP", None): + # use UTC TZ + to_zone =3D tz.gettz('UTC') + dt =3D dt.astimezone(to_zone) + + self.man_date =3D dt.strftime("%B %Y") + + def output_highlight(self, block): + + contents =3D self.highlight_block(block) + + if isinstance(contents, list): + contents =3D "\n".join(contents) + + for line in contents.strip("\n").split("\n"): + line =3D Re(r"^\s*").sub("", line) + + if line and line[0] =3D=3D ".": + print("\\&" + line) + else: + print(line) + + def out_doc(self, fname, name, args): + module =3D args.get('module') + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + + print(f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual"= LINUX') + + for section in sectionlist: + print(f'.SH "{section}"') + self.output_highlight(sections.get(section)) + + def out_function(self, fname, name, args): + """output function in man""" + + parameterlist =3D args.get('parameterlist', []) + parameterdescs =3D args.get('parameterdescs', {}) + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + + print(f'.TH "{args['function']}" 9 "{args['function']}" "{self.man= _date}" "Kernel Hacker\'s Manual" LINUX') + + print(".SH NAME") + print(f"{args['function']} \\- {args['purpose']}") + + print(".SH SYNOPSIS") + if args.get('functiontype', ''): + print(f'.B "{args['functiontype']}" {args['function']}') + else: + print(f'.B "{args['function']}') + + count =3D 0 + parenth =3D "(" + post =3D "," + + for parameter in parameterlist: + if count =3D=3D len(parameterlist) - 1: + post =3D ");" + + dtype 
=3D args['parametertypes'].get(parameter, "") + if function_pointer.match(dtype): + # Pointer-to-function + print(f'".BI "{parenth}{function_pointer.group(1)}" " ") (= {function_pointer.group(2)}){post}"') + else: + dtype =3D Re(r'([^\*])$').sub(r'\1 ', dtype) + + print(f'.BI "{parenth}{dtype}" "{post}"') + count +=3D 1 + parenth =3D "" + + if parameterlist: + print(".SH ARGUMENTS") + + for parameter in parameterlist: + parameter_name =3D re.sub(r'\[.*', '', parameter) + + print(f'.IP "{parameter}" 12') + self.output_highlight(parameterdescs.get(parameter_name, "")) + + for section in sectionlist: + print(f'.SH "{section.upper()}"') + self.output_highlight(sections[section]) + + def out_enum(self, fname, name, args): + + name =3D args.get('enum', '') + parameterlist =3D args.get('parameterlist', []) + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + + print(f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_= date}" "API Manual" LINUX') + + print(".SH NAME") + print(f"enum {args['enum']} \\- {args['purpose']}") + + print(".SH SYNOPSIS") + print(f"enum {args['enum']}" + " {") + + count =3D 0 + for parameter in parameterlist: + print(f'.br\n.BI " {parameter}"') + if count =3D=3D len(parameterlist) - 1: + print("\n};") + else: + print(", \n.br") + + count +=3D 1 + + print(".SH Constants") + + for parameter in parameterlist: + parameter_name =3D Re(r'\[.*').sub('', parameter) + print(f'.IP "{parameter}" 12') + self.output_highlight(args['parameterdescs'].get(parameter_nam= e, "")) + + for section in sectionlist: + print(f'.SH "{section}"') + self.output_highlight(sections[section]) + + def out_typedef(self, fname, name, args): + module =3D args.get('module') + typedef =3D args.get('typedef') + purpose =3D args.get('purpose') + sectionlist =3D args.get('sectionlist', []) + sections =3D args.get('sections', {}) + + print(f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "API Manual= " LINUX') + + print(".SH NAME") + 
print(f"typedef {typedef} \\- {purpose}") + + for section in sectionlist: + print(f'.SH "{section}"') + self.output_highlight(sections.get(section)) + + def out_struct(self, fname, name, args): + module =3D args.get('module') + struct_type =3D args.get('type') + struct_name =3D args.get('struct') + purpose =3D args.get('purpose') + definition =3D args.get('definition') + sectionlist =3D args.get('sectionlist', []) + parameterlist =3D args.get('parameterlist', []) + sections =3D args.get('sections', {}) + parameterdescs =3D args.get('parameterdescs', {}) + + print(f'.TH "{module}" 9 "{struct_type} {struct_name}" "{self.man_= date}" "API Manual" LINUX') + + print(".SH NAME") + print(f"{struct_type} {struct_name} \\- {purpose}") + + # Replace tabs with two spaces and handle newlines + declaration =3D definition.replace("\t", " ") + declaration =3D Re(r"\n").sub('"\n.br\n.BI "', declaration) + + print(".SH SYNOPSIS") + print(f"{struct_type} {struct_name} " + "{" + "\n.br") + print(f'.BI "{declaration}\n' + "};\n.br\n") + + print(".SH Members") + for parameter in parameterlist: + if parameter.startswith("#"): + continue + + parameter_name =3D re.sub(r"\[.*", "", parameter) + + if parameterdescs.get(parameter_name) =3D=3D KernelDoc.undescr= ibed: + continue + + print(f'.IP "{parameter}" 12') + self.output_highlight(parameterdescs.get(parameter_name)) + + for section in sectionlist: + print(f'.SH "{section}"') + self.output_highlight(sections.get(section)) --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8AD6C267388; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; 
cv=none; b=VEdGWzVKsirjkzpNzKLpAm2wcCSE2qNO7POaFqCgkWiOFr5NTFr5O+KAZvJTDrnZES8luxpAlqE9acForKV8Vl4aqjvnJHGGtc/PkBlBI2AhjI1hCMjbbNqU5YKtNC1kz3HzmPQdD+ldNU9eVVJSyk7UP8tHB+oR+iA50v/tFGs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=dSMMgREF1qhbEmSVC78ny/E/UlXH1qJJJkrWXWIPDo0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Kc4H7VWUiiLxgZSbGYDi5LlDFf1oLp6640Mw42oEZQNSpN4uo44+stjIDrGBBxamYtJ1JGf3O3trI1e3oKrvtIU8I/xUY5X05aCa7DkGoxua+pk8icWYcthsbEAWlqZKJFCdcqg8YDBn+5w/wQcsv9t02fKt+iEFpOjetNi5Bps= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=DwFbS+x6; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="DwFbS+x6" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D2717C4CEF2; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=dSMMgREF1qhbEmSVC78ny/E/UlXH1qJJJkrWXWIPDo0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DwFbS+x6UeuEIBsIJdPLXkrg5miGY0kKtbMnhsA+i0UUGujsJNFXIuFvbBHYLKZqe dg//qeQMBBJwtbmW0R4/Oz4a7I8qK9dXZlB4wVEosEhQPXIDjn5oK6u0CRxYY1flkC avJ5wEHf4mBK4owNSd2HxbjvT5bgOtgYFmlVwjgyqMlpcR41/DrAfdHSb0zBya97aC 4YgkPllxmCQZyFg3s0koUUIC9xar+Jzl0ZZDRWJIHkdQw+ppNah+aQ+7XxbXxeesY8 0LVnaQzCNxaSYOwrW8ddBBFOE78GY75WXq4hDTIjnEYfF6ziIpfl1zdy5MRj8MlSvy PeQ2OtyzS26AA== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVm-0u1s; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 11/33] scripts/kernel-doc.py: convert message output to an interactor Date: Tue, 8 Apr 2025 18:09:14 +0800 Message-ID: 
<557304c8458f1fb4aa2e833f4bdaff953094ddcb.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Instead of directly printing output messages, change kdoc classes to return an iterator with the output message, letting the actual display happen in the command-line script. Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 9 +- scripts/lib/kdoc/kdoc_files.py | 15 ++- scripts/lib/kdoc/kdoc_output.py | 171 ++++++++++++++++---------------- 3 files changed, 104 insertions(+), 91 deletions(-) diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index abff78e9160f..63efec4b3f4b 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -283,9 +283,12 @@ def main(): =20 kfiles.parse() =20 - kfiles.msg(enable_lineno=3Dargs.enable_lineno, export=3Dargs.export, - internal=3Dargs.internal, symbol=3Dargs.symbol, - nosymbol=3Dargs.nosymbol) + for t in kfiles.msg(enable_lineno=3Dargs.enable_lineno, export=3Dargs.= export, + internal=3Dargs.internal, symbol=3Dargs.symbol, + nosymbol=3Dargs.nosymbol): + msg =3D t[1] + if msg: + print(msg) =20 =20 # Call main method diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py index 8bcdc7ead984..817ed98b2727 100755 --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -229,9 +229,10 @@ class KernelFiles(): =20 def out_msg(self, fname, name, arg): """ - Output messages from a file name using the output style filtering. + Return output messages from a file name using the output style + filtering. =20 - If output type was not handled by the syler, return False. + If output type was not handled by the syler, return None.
""" =20 # NOTE: we can add rules here to filter out unwanted parts, @@ -242,7 +243,8 @@ class KernelFiles(): def msg(self, enable_lineno=3DFalse, export=3DFalse, internal=3DFalse, symbol=3DNone, nosymbol=3DNone): """ - Interacts over the kernel-doc results and output messages. + Interacts over the kernel-doc results and output messages, + returning kernel-doc markups on each interaction """ =20 function_table =3D self.config.function_table @@ -261,10 +263,15 @@ class KernelFiles(): function_table, enable_lineno) =20 for fname, arg_tuple in self.results: + msg =3D "" for name, arg in arg_tuple: - if self.out_msg(fname, name, arg): + msg +=3D self.out_msg(fname, name, arg) + + if msg is None: ln =3D arg.get("ln", 0) dtype =3D arg.get('type', "") =20 self.config.log.warning("%s:%d Can't handle %s", fname, ln, dtype) + if msg: + yield fname, msg diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index 24e40b3e7d1d..fda07049ecf7 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -71,6 +71,8 @@ class OutputFormat: self.function_table =3D set() self.config =3D None =20 + self.data =3D "" + def set_config(self, config): self.config =3D config =20 @@ -157,37 +159,38 @@ class OutputFormat: return True =20 def msg(self, fname, name, args): + self.data =3D "" =20 dtype =3D args.get('type', "") =20 if dtype =3D=3D "doc": self.out_doc(fname, name, args) - return False + return self.data =20 if not self.check_declaration(dtype, name): - return False + return self.data =20 if dtype =3D=3D "function": self.out_function(fname, name, args) - return False + return self.data =20 if dtype =3D=3D "enum": self.out_enum(fname, name, args) - return False + return self.data =20 if dtype =3D=3D "typedef": self.out_typedef(fname, name, args) - return False + return self.data =20 if dtype in ["struct", "union"]: self.out_struct(fname, name, args) - return False + return self.data =20 # Warn if some type requires an output logic 
self.config.log.warning("doesn't now how to output '%s' block", dtype) =20 - return True + return None =20 # Virtual methods to be overridden by inherited classes def out_doc(self, fname, name, args): @@ -248,7 +251,7 @@ class RestFormat(OutputFormat): """Outputs a line number""" =20 if self.enable_lineno and ln: - print(f".. LINENO {ln}") + self.data +=3D f".. LINENO {ln}\n" =20 def output_highlight(self, args): input_text =3D args @@ -295,7 +298,7 @@ class RestFormat(OutputFormat): =20 # Print the output with the line prefix for line in output.strip("\n").split("\n"): - print(self.lineprefix + line) + self.data +=3D self.lineprefix + line + "\n" =20 def out_section(self, args, out_reference=3DFalse): """ @@ -317,15 +320,15 @@ class RestFormat(OutputFormat): =20 if not self.out_mode =3D=3D self.OUTPUT_INCLUDE: if out_reference: - print(f".. _{section}:\n") + self.data +=3D f".. _{section}:\n\n" =20 if not self.symbol: - print(f'{self.lineprefix}**{section}**\n') + self.data +=3D f'{self.lineprefix}**{section}**\n\n' =20 self.print_lineno(section_start_lines.get(section, 0)) self.output_highlight(sections[section]) - print() - print() + self.data +=3D "\n" + self.data +=3D "\n" =20 def out_doc(self, fname, name, args): if not self.check_doc(name): @@ -368,42 +371,42 @@ class RestFormat(OutputFormat): signature +=3D ")" =20 if args.get('typedef') or not args.get('functiontype'): - print(f".. c:macro:: {args['function']}\n") + self.data +=3D f".. c:macro:: {args['function']}\n\n" =20 if args.get('typedef'): self.print_lineno(ln) - print(" **Typedef**: ", end=3D"") + self.data +=3D " **Typedef**: " self.lineprefix =3D "" self.output_highlight(args.get('purpose', "")) - print("\n\n**Syntax**\n") - print(f" ``{signature}``\n") + self.data +=3D "\n\n**Syntax**\n\n" + self.data +=3D f" ``{signature}``\n\n" else: - print(f"``{signature}``\n") + self.data +=3D f"``{signature}``\n\n" else: - print(f".. c:function:: {signature}\n") + self.data +=3D f".. 
c:function:: {signature}\n\n" =20 if not args.get('typedef'): self.print_lineno(ln) self.lineprefix =3D " " self.output_highlight(args.get('purpose', "")) - print() + self.data +=3D "\n" =20 # Put descriptive text into a container (HTML
<div>) to help set # function prototypes apart self.lineprefix =3D " " =20 if parameterlist: - print(".. container:: kernelindent\n") - print(f"{self.lineprefix}**Parameters**\n") + self.data +=3D ".. container:: kernelindent\n\n" + self.data +=3D f"{self.lineprefix}**Parameters**\n\n" =20 for parameter in parameterlist: parameter_name =3D Re(r'\[.*').sub('', parameter) dtype =3D args['parametertypes'].get(parameter, "") =20 if dtype: - print(f"{self.lineprefix}``{dtype}``") + self.data +=3D f"{self.lineprefix}``{dtype}``\n" else: - print(f"{self.lineprefix}``{parameter}``") + self.data +=3D f"{self.lineprefix}``{parameter}``\n" =20 self.print_lineno(parameterdesc_start_lines.get(parameter_name= , 0)) =20 @@ -412,9 +415,9 @@ class RestFormat(OutputFormat): parameterdescs[parameter_name] !=3D KernelDoc.undescribed: =20 self.output_highlight(parameterdescs[parameter_name]) - print() + self.data +=3D "\n" else: - print(f"{self.lineprefix}*undescribed*\n") + self.data +=3D f"{self.lineprefix}*undescribed*\n\n" self.lineprefix =3D " " =20 self.out_section(args) @@ -428,26 +431,26 @@ class RestFormat(OutputFormat): parameterdescs =3D args.get('parameterdescs', {}) ln =3D args.get('ln', 0) =20 - print(f"\n\n.. c:enum:: {name}\n") + self.data +=3D f"\n\n.. c:enum:: {name}\n\n" =20 self.print_lineno(ln) self.lineprefix =3D " " self.output_highlight(args.get('purpose', '')) - print() + self.data +=3D "\n" =20 - print(".. container:: kernelindent\n") + self.data +=3D ".. 
container:: kernelindent\n\n" outer =3D self.lineprefix + " " self.lineprefix =3D outer + " " - print(f"{outer}**Constants**\n") + self.data +=3D f"{outer}**Constants**\n\n" =20 for parameter in parameterlist: - print(f"{outer}``{parameter}``") + self.data +=3D f"{outer}``{parameter}``\n" =20 if parameterdescs.get(parameter, '') !=3D KernelDoc.undescribe= d: self.output_highlight(parameterdescs[parameter]) else: - print(f"{self.lineprefix}*undescribed*\n") - print() + self.data +=3D f"{self.lineprefix}*undescribed*\n\n" + self.data +=3D "\n" =20 self.lineprefix =3D oldprefix self.out_section(args) @@ -458,14 +461,14 @@ class RestFormat(OutputFormat): name =3D args.get('typedef', '') ln =3D args.get('ln', 0) =20 - print(f"\n\n.. c:type:: {name}\n") + self.data +=3D f"\n\n.. c:type:: {name}\n\n" =20 self.print_lineno(ln) self.lineprefix =3D " " =20 self.output_highlight(args.get('purpose', '')) =20 - print() + self.data +=3D "\n" =20 self.lineprefix =3D oldprefix self.out_section(args) @@ -482,7 +485,7 @@ class RestFormat(OutputFormat): parameterdescs =3D args.get('parameterdescs', {}) parameterdesc_start_lines =3D args.get('parameterdesc_start_lines'= , {}) =20 - print(f"\n\n.. c:{dtype}:: {name}\n") + self.data +=3D f"\n\n.. c:{dtype}:: {name}\n\n" =20 self.print_lineno(ln) =20 @@ -490,20 +493,20 @@ class RestFormat(OutputFormat): self.lineprefix +=3D " " =20 self.output_highlight(purpose) - print() + self.data +=3D "\n" =20 - print(".. container:: kernelindent\n") - print(f"{self.lineprefix}**Definition**::\n") + self.data +=3D ".. 
container:: kernelindent\n\n" + self.data +=3D f"{self.lineprefix}**Definition**::\n\n" =20 self.lineprefix =3D self.lineprefix + " " =20 declaration =3D declaration.replace("\t", self.lineprefix) =20 - print(f"{self.lineprefix}{dtype} {name}" + ' {') - print(f"{declaration}{self.lineprefix}" + "};\n") + self.data +=3D f"{self.lineprefix}{dtype} {name}" + ' {' + "\n" + self.data +=3D f"{declaration}{self.lineprefix}" + "};\n\n" =20 self.lineprefix =3D " " - print(f"{self.lineprefix}**Members**\n") + self.data +=3D f"{self.lineprefix}**Members**\n\n" for parameter in parameterlist: if not parameter or parameter.startswith("#"): continue @@ -515,15 +518,15 @@ class RestFormat(OutputFormat): =20 self.print_lineno(parameterdesc_start_lines.get(parameter_name= , 0)) =20 - print(f"{self.lineprefix}``{parameter}``") + self.data +=3D f"{self.lineprefix}``{parameter}``\n" =20 self.lineprefix =3D " " self.output_highlight(parameterdescs[parameter_name]) self.lineprefix =3D " " =20 - print() + self.data +=3D "\n" =20 - print() + self.data +=3D "\n" =20 self.lineprefix =3D oldprefix self.out_section(args) @@ -576,19 +579,19 @@ class ManFormat(OutputFormat): line =3D Re(r"^\s*").sub("", line) =20 if line and line[0] =3D=3D ".": - print("\\&" + line) + self.data +=3D "\\&" + line + "\n" else: - print(line) + self.data +=3D line + "\n" =20 def out_doc(self, fname, name, args): module =3D args.get('module') sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', {}) =20 - print(f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual"= LINUX') + self.data +=3D f'.TH "{module}" 9 "{module}" "{self.man_date}" "AP= I Manual" LINUX' + "\n" =20 for section in sectionlist: - print(f'.SH "{section}"') + self.data +=3D f'.SH "{section}"' + "\n" self.output_highlight(sections.get(section)) =20 def out_function(self, fname, name, args): @@ -599,16 +602,16 @@ class ManFormat(OutputFormat): sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', 
{}) =20 - print(f'.TH "{args['function']}" 9 "{args['function']}" "{self.man= _date}" "Kernel Hacker\'s Manual" LINUX') + self.data +=3D f'.TH "{args['function']}" 9 "{args['function']}" "= {self.man_date}" "Kernel Hacker\'s Manual" LINUX' + "\n" =20 - print(".SH NAME") - print(f"{args['function']} \\- {args['purpose']}") + self.data +=3D ".SH NAME\n" + self.data +=3D f"{args['function']} \\- {args['purpose']}\n" =20 - print(".SH SYNOPSIS") + self.data +=3D ".SH SYNOPSIS\n" if args.get('functiontype', ''): - print(f'.B "{args['functiontype']}" {args['function']}') + self.data +=3D f'.B "{args['functiontype']}" {args['function']= }' + "\n" else: - print(f'.B "{args['function']}') + self.data +=3D f'.B "{args['function']}' + "\n" =20 count =3D 0 parenth =3D "(" @@ -621,25 +624,25 @@ class ManFormat(OutputFormat): dtype =3D args['parametertypes'].get(parameter, "") if function_pointer.match(dtype): # Pointer-to-function - print(f'".BI "{parenth}{function_pointer.group(1)}" " ") (= {function_pointer.group(2)}){post}"') + self.data +=3D f'".BI "{parenth}{function_pointer.group(1)= }" " ") ({function_pointer.group(2)}){post}"' + "\n" else: dtype =3D Re(r'([^\*])$').sub(r'\1 ', dtype) =20 - print(f'.BI "{parenth}{dtype}" "{post}"') + self.data +=3D f'.BI "{parenth}{dtype}" "{post}"' + "\n" count +=3D 1 parenth =3D "" =20 if parameterlist: - print(".SH ARGUMENTS") + self.data +=3D ".SH ARGUMENTS\n" =20 for parameter in parameterlist: parameter_name =3D re.sub(r'\[.*', '', parameter) =20 - print(f'.IP "{parameter}" 12') + self.data +=3D f'.IP "{parameter}" 12' + "\n" self.output_highlight(parameterdescs.get(parameter_name, "")) =20 for section in sectionlist: - print(f'.SH "{section.upper()}"') + self.data +=3D f'.SH "{section.upper()}"' + "\n" self.output_highlight(sections[section]) =20 def out_enum(self, fname, name, args): @@ -649,33 +652,33 @@ class ManFormat(OutputFormat): sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', {}) =20 - 
print(f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_= date}" "API Manual" LINUX') + self.data +=3D f'.TH "{args['module']}" 9 "enum {args['enum']}" "{= self.man_date}" "API Manual" LINUX' + "\n" =20 - print(".SH NAME") - print(f"enum {args['enum']} \\- {args['purpose']}") + self.data +=3D ".SH NAME\n" + self.data +=3D f"enum {args['enum']} \\- {args['purpose']}\n" =20 - print(".SH SYNOPSIS") - print(f"enum {args['enum']}" + " {") + self.data +=3D ".SH SYNOPSIS\n" + self.data +=3D f"enum {args['enum']}" + " {\n" =20 count =3D 0 for parameter in parameterlist: - print(f'.br\n.BI " {parameter}"') + self.data +=3D f'.br\n.BI " {parameter}"' + "\n" if count =3D=3D len(parameterlist) - 1: - print("\n};") + self.data +=3D "\n};\n" else: - print(", \n.br") + self.data +=3D ", \n.br\n" =20 count +=3D 1 =20 - print(".SH Constants") + self.data +=3D ".SH Constants\n" =20 for parameter in parameterlist: parameter_name =3D Re(r'\[.*').sub('', parameter) - print(f'.IP "{parameter}" 12') + self.data +=3D f'.IP "{parameter}" 12' + "\n" self.output_highlight(args['parameterdescs'].get(parameter_nam= e, "")) =20 for section in sectionlist: - print(f'.SH "{section}"') + self.data +=3D f'.SH "{section}"' + "\n" self.output_highlight(sections[section]) =20 def out_typedef(self, fname, name, args): @@ -685,13 +688,13 @@ class ManFormat(OutputFormat): sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', {}) =20 - print(f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "API Manual= " LINUX') + self.data +=3D f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "A= PI Manual" LINUX' + "\n" =20 - print(".SH NAME") - print(f"typedef {typedef} \\- {purpose}") + self.data +=3D ".SH NAME\n" + self.data +=3D f"typedef {typedef} \\- {purpose}\n" =20 for section in sectionlist: - print(f'.SH "{section}"') + self.data +=3D f'.SH "{section}"' + "\n" self.output_highlight(sections.get(section)) =20 def out_struct(self, fname, name, args): @@ -705,20 +708,20 @@ class 
ManFormat(OutputFormat): sections =3D args.get('sections', {}) parameterdescs =3D args.get('parameterdescs', {}) =20 - print(f'.TH "{module}" 9 "{struct_type} {struct_name}" "{self.man_= date}" "API Manual" LINUX') + self.data +=3D f'.TH "{module}" 9 "{struct_type} {struct_name}" "{= self.man_date}" "API Manual" LINUX' + "\n" =20 - print(".SH NAME") - print(f"{struct_type} {struct_name} \\- {purpose}") + self.data +=3D ".SH NAME\n" + self.data +=3D f"{struct_type} {struct_name} \\- {purpose}\n" =20 # Replace tabs with two spaces and handle newlines declaration =3D definition.replace("\t", " ") declaration =3D Re(r"\n").sub('"\n.br\n.BI "', declaration) =20 - print(".SH SYNOPSIS") - print(f"{struct_type} {struct_name} " + "{" + "\n.br") - print(f'.BI "{declaration}\n' + "};\n.br\n") + self.data +=3D ".SH SYNOPSIS\n" + self.data +=3D f"{struct_type} {struct_name} " + "{" + "\n.br\n" + self.data +=3D f'.BI "{declaration}\n' + "};\n.br\n\n" =20 - print(".SH Members") + self.data +=3D ".SH Members\n" for parameter in parameterlist: if parameter.startswith("#"): continue @@ -728,9 +731,9 @@ class ManFormat(OutputFormat): if parameterdescs.get(parameter_name) =3D=3D KernelDoc.undescr= ibed: continue =20 - print(f'.IP "{parameter}" 12') + self.data +=3D f'.IP "{parameter}" 12' + "\n" self.output_highlight(parameterdescs.get(parameter_name)) =20 for section in sectionlist: - print(f'.SH "{section}"') + self.data +=3D f'.SH "{section}"' + "\n" self.output_highlight(sections.get(section)) --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 120C5265613; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; 
s=arc-20240116; t=1744106996; cv=none; b=j/f1rmOiN639XvE9vbcADkkfnpWtlC0t324KK5mt6CspsnTBm8vto6Ti21kPK4Lm3nNDdxwyrcG6Nw+sK8K5omKJqhHJh8nv3DZ5bevqJdmzK/0g6VK9aheumd8IFkOkR6btDeuHJ9DY00f2L64UEz+mMKzYYV7Zh1hr8WL4Wc0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=vqPL8En8t9ALAvXP8ScDduZLWEBitge16Njw+N4oQik=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OTZEcSsCPZpPK9SyJMLyY3Lmm2m2wUdTXy0ZhS2THWTTWS6MIrrJK4jnXbbpQZpZ2ncUDJRMzmDf2BMl3lssKceA3VYtoloVhPEF7Nv8lsc4Ih3YDbP7oYz60pC8xVdVU+ERAKR8zb+o23iOBp/701uyVSE7ax/jISOo9ULVnaA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=GNsNjyYH; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="GNsNjyYH" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B24CAC4CEED; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106995; bh=vqPL8En8t9ALAvXP8ScDduZLWEBitge16Njw+N4oQik=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GNsNjyYHu8tCNH1nFRuwN045M6XHrc/jJ0+NcWmD3ixZhX9cHJtkzCppzBxq1fLbi 9Bn0pmq/KYK7AO2N4kQ/uBSA5cIQT8Et0iNRUzistlPiOpeZIiU230ty3mjwLftztQ rFN/nEGvoewP1W91rQYB2DGY6AKkTVsSNL167YFtx2/ssNPWvNBKj2Ni2y7kK0hM3f OqlrsQOxn2LqBtrBjcYADjBLPlAdZcc5ofjZVRQPZ1WYb3uHmowk17wvS9CaHgnclA QOsivMtWkxEdgaqgDkwbLW57dV+tcWA9yVtdPWbP6HbkzPEapw6MQ7KkiaCL6fXIlj UxHKZDN8iIndQ== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVp-0zSQ; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 12/33] scripts/kernel-doc.py: move file lists to the parser function Date: Tue, 8 Apr 2025 18:09:15 +0800 
Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Instead of setting file lists at __init__ time, move them to the actual parsing function. This allows adding more files to be parsed at run time, by calling the parse function multiple times. With the new approach, the export_files logic was rewritten to avoid parsing EXPORT_SYMBOL twice for partial matches. Please note that, with this logic, the same file can still be read twice when export_file is used. Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 7 +++---- scripts/lib/kdoc/kdoc_files.py | 37 ++++++++++++++++------------------ 2 files changed, 20 insertions(+), 24 deletions(-) diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index 63efec4b3f4b..e258a9df7f78 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -274,14 +274,13 @@ def main(): else: out_style =3D RestFormat() =20 - kfiles =3D KernelFiles(files=3Dargs.files, verbose=3Dargs.verbose, + kfiles =3D KernelFiles(verbose=3Dargs.verbose, out_style=3Dout_style, werror=3Dargs.werror, wreturn=3Dargs.wreturn, wshort_desc=3Dargs.wshort= _desc, wcontents_before_sections=3Dargs.wcontents_before= _sections, - modulename=3Dargs.modulename, - export_file=3Dargs.export_file) + modulename=3Dargs.modulename) =20 - kfiles.parse() + kfiles.parse(args.files, export_file=3Dargs.export_file) =20 for t in kfiles.msg(enable_lineno=3Dargs.enable_lineno, export=3Dargs.= export, internal=3Dargs.internal, symbol=3Dargs.symbol, diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py index 817ed98b2727..47dab46c89fe 100755 --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -124,7 +124,7 @@ class KernelFiles(): self.config.log.error("Cannot find file 
%s", fname) self.config.errors +=3D 1 =20 - def __init__(self, files=3DNone, verbose=3DFalse, out_style=3DNone, + def __init__(self, verbose=3DFalse, out_style=3DNone, werror=3DFalse, wreturn=3DFalse, wshort_desc=3DFalse, wcontents_before_sections=3DFalse, logger=3DNone, modulename=3DNone, export_file=3DNone): @@ -181,51 +181,48 @@ class KernelFiles(): self.config.src_tree =3D os.environ.get("SRCTREE", None) =20 self.out_style =3D out_style - self.export_file =3D export_file =20 # Initialize internal variables =20 self.config.errors =3D 0 self.results =3D [] =20 - self.file_list =3D files self.files =3D set() + self.export_files =3D set() =20 - def parse(self): + def parse(self, file_list, export_file=3DNone): """ Parse all files """ =20 glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) =20 - # Let's use a set here to avoid duplicating files + # Prevent parsing the same file twice to speedup parsing and + # avoid reporting errors multiple times =20 - for fname in glob.parse_files(self.file_list, self.file_not_found_= cb): + for fname in glob.parse_files(file_list, self.file_not_found_cb): if fname in self.files: continue =20 - self.files.add(fname) - res =3D self.parse_file(fname) + self.results.append((res.fname, res.entries)) - - if not self.files: - sys.exit(1) + self.files.add(fname) =20 # If a list of export files was provided, parse EXPORT_SYMBOL* - # from the ones not already parsed + # from files that weren't fully parsed =20 - if self.export_file: - files =3D self.files + if not export_file: + return =20 - glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) + self.export_files |=3D self.files =20 - for fname in glob.parse_files(self.export_file, - self.file_not_found_cb): - if fname not in files: - files.add(fname) + glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) =20 - self.process_export_file(fname) + for fname in glob.parse_files(export_file, self.file_not_found_cb): + if fname not in self.export_files: + 
self.process_export_file(fname) + self.export_files.add(fname) =20 def out_msg(self, fname, name, arg): """ --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1220026656B; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=XQ8wa1M+RqnQroM1l0gHLHVbEQBrVQ1WA6+W7f1YS1juzLGREhAduVuLQcDsRG3q6XDN06pkeRE6BFGSYiQPPXdEJk2Gij1Nw1ZjueXyc5vVn3WJU2pOHrC/t20pIvdlMrDSGx4qcb54mSIVwRA9xD+EqaxSN6iqxEmGnQJzy9g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=sXOUNajyjwmKqQnv7hchPEAJzSeCddrnAGb/R4Jp03A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Q/wvKQljIjIeQQdnF4maTmzYVjRXfVVuzw078Ako8BXZT5of/+p0g4VxTuQ5h9rNHo45ap43/OqNxy72xZRosof96DsVP1M5jxqmUpe1WMV+h0LRhEoOKXA8HNgDof+U4uxcLqJXFsE8eoTfkk+Z0QlHR9xF5PlGIyzfuPH+PXU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=q4KMl5bj; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="q4KMl5bj" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B2F30C4CEEE; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106995; bh=sXOUNajyjwmKqQnv7hchPEAJzSeCddrnAGb/R4Jp03A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=q4KMl5bjRaaKJn/p13ksfffYbLrbb6+ox85Ny7/4DS+TDQliar6MgtnbVpHqJ7rQx O3PFEVy35I6VSWegv+E6jqJMEN/mw3WcmhlZdYZcf/MdWzS1l8uDxC56rPAY23JgsT 
2jys5zF90nrMT4rjqFmhVcd8tYl09weWdetlZPPylEPKxdGHGPw3YR2UAB3ZM6hCRY CVNTH1m/LG90XJA3YilT+38Kye5v5a2HrAWg+KWilqfjB9DsDf3asjmO8C8Nwqgeyy xFocvIs15VukbSGmur8QR50pSAS9XOcGRrPIbKm8g0uVRUO9ALeqjWtodEgeEJxppe MfB7gnigEGoZw== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVs-15KS; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 13/33] scripts/kernel-doc.py: implement support for -no-doc-sections Date: Tue, 8 Apr 2025 18:09:16 +0800 Message-ID: <06b18a32142b44d5ba8b41ac64a76c02b03b4969.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" The venerable kernel-doc Perl script has a number of options that aren't properly documented. Among them, there is -no-doc-sections, which is used by the Sphinx extension. Implement support for it. 
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 8 ++++++-- scripts/lib/kdoc/kdoc_files.py | 5 +++-- scripts/lib/kdoc/kdoc_output.py | 7 ++++++- 3 files changed, 15 insertions(+), 5 deletions(-) diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index e258a9df7f78..90aacd17499a 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -239,10 +239,13 @@ def main(): sel_mut.add_argument("-s", "-function", "--symbol", action=3D'append', help=3DFUNCTION_DESC) =20 - # This one is valid for all 3 types of filter + # Those are valid for all 3 types of filter parser.add_argument("-n", "-nosymbol", "--nosymbol", action=3D'append', help=3DNOSYMBOL_DESC) =20 + parser.add_argument("-D", "-no-doc-sections", "--no-doc-sections", + action=3D'store_true', help=3D"Don't outputt DOC s= ections") + parser.add_argument("files", metavar=3D"FILE", nargs=3D"+", help=3DFILES_DESC) =20 @@ -284,7 +287,8 @@ def main(): =20 for t in kfiles.msg(enable_lineno=3Dargs.enable_lineno, export=3Dargs.= export, internal=3Dargs.internal, symbol=3Dargs.symbol, - nosymbol=3Dargs.nosymbol): + nosymbol=3Dargs.nosymbol, + no_doc_sections=3Dargs.no_doc_sections): msg =3D t[1] if msg: print(msg) diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py index 47dab46c89fe..4c04546a74fe 100755 --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -238,7 +238,7 @@ class KernelFiles(): return self.out_style.msg(fname, name, arg) =20 def msg(self, enable_lineno=3DFalse, export=3DFalse, internal=3DFalse, - symbol=3DNone, nosymbol=3DNone): + symbol=3DNone, nosymbol=3DNone, no_doc_sections=3DFalse): """ Interacts over the kernel-doc results and output messages, returning kernel-doc markups on each interaction @@ -257,7 +257,8 @@ class KernelFiles(): self.out_style.set_config(self.config) =20 self.out_style.set_filter(export, internal, symbol, nosymbol, - function_table, enable_lineno) + function_table, enable_lineno, + no_doc_sections) =20 for 
fname, arg_tuple in self.results: msg =3D "" diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index fda07049ecf7..a246d213523c 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -70,6 +70,7 @@ class OutputFormat: self.symbol =3D None self.function_table =3D set() self.config =3D None + self.no_doc_sections =3D False =20 self.data =3D "" =20 @@ -77,7 +78,7 @@ class OutputFormat: self.config =3D config =20 def set_filter(self, export, internal, symbol, nosymbol, function_tabl= e, - enable_lineno): + enable_lineno, no_doc_sections): """ Initialize filter variables according with the requested mode. =20 @@ -87,6 +88,7 @@ class OutputFormat: """ =20 self.enable_lineno =3D enable_lineno + self.no_doc_sections =3D no_doc_sections =20 if symbol: self.out_mode =3D self.OUTPUT_INCLUDE @@ -117,6 +119,9 @@ class OutputFormat: def check_doc(self, name): """Check if DOC should be output""" =20 + if self.no_doc_sections: + return False + if self.out_mode =3D=3D self.OUTPUT_ALL: return True =20 --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C8607267708; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=pYyFzf9zKVN1op4i2r5jVABJTCYxsabV7FN86htwzFJSstIFQkvI6YmTN3yHrbKuBV8l54YV3ajZUY60GH7gpmIithzHd23mPCjyOnOyzd+EX5QwE+My4YoqO9JcNlc9CfQZLCRS7d3Zk66g4zgloIzfpGrUS1DojjHVDMwAugk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=M/5483xsneVac2QnbA1ZZyNS5r11vVv9aaft8Yje1UA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: 
MIME-Version; b=nTgw9QNpvhpyRIxelN+pA5UV4jebcgFtRdaLMKT3ChcKvVi/0HLERdQyUXjMHEv9IVk7ujzfGF9su0i0OoP9KJM7bbMva/LPAmMiE2zovabpgfHsh9y4r5xx736FPCjgYzbnqTqL2Q8SyGpCBL9l/5DPCvGQyYw9mLCbBcfHG88= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=H4N3UPvh; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="H4N3UPvh" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 628B9C4CEF1; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=M/5483xsneVac2QnbA1ZZyNS5r11vVv9aaft8Yje1UA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=H4N3UPvhIOT3p3fhRQOi5DCB5j1bnGfsVRxncUvcuUnsJ2TbOOVo4J+xCIJsTuwOk 8wtiv1UK16DJr1bVGzFbwCZpi0rNXMWohwjHZqaGjo8q+m9VLVTUX4I7aNglWBTll+ SDztmfh5e08Od6Askc/ObQREySXVfqgKvAbZrpzDCXqAbVFf/8zHMIKF4tJ+v/bO5i Xfg5UDnnyMBvgl/C4NVr0OMJkbF7lSDx8yGROtYHfZqzU8glU6UEi/kTEuhCAAR1rb qng21hz+FBLKxBPEPARQDTN+M1vxD/BCqwQXm2wla42GzxCJ5RsSGq8vEIZGCyW72O FqpiGXZg3dMKg== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVv-1B4E; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Sean Anderson , linux-kernel@vger.kernel.org Subject: [PATCH v3 14/33] scripts/kernel-doc.py: fix line number output Date: Tue, 8 Apr 2025 18:09:17 +0800 Message-ID: <5182a531d14b5fe9e1fc5da5f9dae05d66852a60.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" With the Python version, the actual output
happens after parsing, from records stored at self.entries. Ensure that line numbers will be properly stored there and that they'll produce the desired results at the ReST output. Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_output.py | 13 +++++++------ scripts/lib/kdoc/kdoc_parser.py | 21 +++++++++++++++++---- 2 files changed, 24 insertions(+), 10 deletions(-) diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index a246d213523c..6a7187980bec 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -255,7 +255,8 @@ class RestFormat(OutputFormat): def print_lineno(self, ln): """Outputs a line number""" =20 - if self.enable_lineno and ln: + if self.enable_lineno and ln is not None: + ln +=3D 1 self.data +=3D f".. LINENO {ln}\n" =20 def output_highlight(self, args): @@ -358,7 +359,7 @@ class RestFormat(OutputFormat): parameterdescs =3D args.get('parameterdescs', {}) parameterdesc_start_lines =3D args.get('parameterdesc_start_lines'= , {}) =20 - ln =3D args.get('ln', 0) + ln =3D args.get('declaration_start_line', 0) =20 count =3D 0 for parameter in parameterlist: @@ -375,11 +376,11 @@ class RestFormat(OutputFormat): if not func_macro: signature +=3D ")" =20 + self.print_lineno(ln) if args.get('typedef') or not args.get('functiontype'): self.data +=3D f".. c:macro:: {args['function']}\n\n" =20 if args.get('typedef'): - self.print_lineno(ln) self.data +=3D " **Typedef**: " self.lineprefix =3D "" self.output_highlight(args.get('purpose', "")) @@ -434,7 +435,7 @@ class RestFormat(OutputFormat): name =3D args.get('enum', '') parameterlist =3D args.get('parameterlist', []) parameterdescs =3D args.get('parameterdescs', {}) - ln =3D args.get('ln', 0) + ln =3D args.get('declaration_start_line', 0) =20 self.data +=3D f"\n\n.. 
c:enum:: {name}\n\n" =20 @@ -464,7 +465,7 @@ class RestFormat(OutputFormat): =20 oldprefix =3D self.lineprefix name =3D args.get('typedef', '') - ln =3D args.get('ln', 0) + ln =3D args.get('declaration_start_line', 0) =20 self.data +=3D f"\n\n.. c:type:: {name}\n\n" =20 @@ -484,7 +485,7 @@ class RestFormat(OutputFormat): purpose =3D args.get('purpose', "") declaration =3D args.get('definition', "") dtype =3D args.get('type', "struct") - ln =3D args.get('ln', 0) + ln =3D args.get('declaration_start_line', 0) =20 parameterlist =3D args.get('parameterlist', []) parameterdescs =3D args.get('parameterdescs', {}) diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index 3ce116595546..e8c86448d6b5 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -276,7 +276,7 @@ class KernelDoc: self.entry.brcount =3D 0 =20 self.entry.in_doc_sect =3D False - self.entry.declaration_start_line =3D ln + self.entry.declaration_start_line =3D ln + 1 =20 def push_parameter(self, ln, decl_type, param, dtype, org_arg, declaration_name): @@ -806,8 +806,10 @@ class KernelDoc: parameterlist=3Dself.entry.parameterlist, parameterdescs=3Dself.entry.parameterdescs, parametertypes=3Dself.entry.parametertypes, + parameterdesc_start_lines=3Dself.entry.par= ameterdesc_start_lines, sectionlist=3Dself.entry.sectionlist, sections=3Dself.entry.sections, + section_start_lines=3Dself.entry.section_s= tart_lines, purpose=3Dself.entry.declaration_purpose) =20 def dump_enum(self, ln, proto): @@ -882,8 +884,10 @@ class KernelDoc: module=3Dself.config.modulename, parameterlist=3Dself.entry.parameterlist, parameterdescs=3Dself.entry.parameterdescs, + parameterdesc_start_lines=3Dself.entry.par= ameterdesc_start_lines, sectionlist=3Dself.entry.sectionlist, sections=3Dself.entry.sections, + section_start_lines=3Dself.entry.section_s= tart_lines, purpose=3Dself.entry.declaration_purpose) =20 def dump_declaration(self, ln, prototype): @@ -1054,8 +1058,10 @@ 
class KernelDoc: parameterlist=3Dself.entry.parameterli= st, parameterdescs=3Dself.entry.parameterd= escs, parametertypes=3Dself.entry.parametert= ypes, + parameterdesc_start_lines=3Dself.entry= .parameterdesc_start_lines, sectionlist=3Dself.entry.sectionlist, sections=3Dself.entry.sections, + section_start_lines=3Dself.entry.secti= on_start_lines, purpose=3Dself.entry.declaration_purpo= se, func_macro=3Dfunc_macro) else: @@ -1067,8 +1073,10 @@ class KernelDoc: parameterlist=3Dself.entry.parameterli= st, parameterdescs=3Dself.entry.parameterd= escs, parametertypes=3Dself.entry.parametert= ypes, + parameterdesc_start_lines=3Dself.entry= .parameterdesc_start_lines, sectionlist=3Dself.entry.sectionlist, sections=3Dself.entry.sections, + section_start_lines=3Dself.entry.secti= on_start_lines, purpose=3Dself.entry.declaration_purpo= se, func_macro=3Dfunc_macro) =20 @@ -1112,8 +1120,10 @@ class KernelDoc: parameterlist=3Dself.entry.parameterli= st, parameterdescs=3Dself.entry.parameterd= escs, parametertypes=3Dself.entry.parametert= ypes, + parameterdesc_start_lines=3Dself.entry= .parameterdesc_start_lines, sectionlist=3Dself.entry.sectionlist, sections=3Dself.entry.sections, + section_start_lines=3Dself.entry.secti= on_start_lines, purpose=3Dself.entry.declaration_purpo= se) return =20 @@ -1136,6 +1146,7 @@ class KernelDoc: module=3Dself.entry.modulename, sectionlist=3Dself.entry.sectionlist, sections=3Dself.entry.sections, + section_start_lines=3Dself.entry.secti= on_start_lines, purpose=3Dself.entry.declaration_purpo= se) return =20 @@ -1168,7 +1179,7 @@ class KernelDoc: return =20 # start a new entry - self.reset_state(ln + 1) + self.reset_state(ln) self.entry.in_doc_sect =3D False =20 # next line is always the function name @@ -1281,7 +1292,7 @@ class KernelDoc: if r.match(line): self.dump_section() self.entry.section =3D self.section_default - self.entry.new_start_line =3D line + self.entry.new_start_line =3D ln self.entry.contents =3D "" =20 if 
doc_sect.search(line): @@ -1619,7 +1630,9 @@ class KernelDoc: self.dump_section() self.output_declaration("doc", None, sectionlist=3Dself.entry.sectionlist, - sections=3Dself.entry.sections, module= =3Dself.config.modulename) + sections=3Dself.entry.sections, + section_start_lines=3Dself.entry.secti= on_start_lines, + module=3Dself.config.modulename) self.reset_state(ln) =20 elif doc_content.search(line): --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1E9E2267F55; Tue, 8 Apr 2025 10:09:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106999; cv=none; b=Jg8uV4uyR6oCoOg+NtPZWYZLMekb2ueIEKqeBd1sLHM9pgYF7kknLmhYy6KfW0H7SVhw73lhHvd2XkNXBgz/+fgZptjovWM0PPalrQxf7mwGhLhlGDnCnGQMyohQ6etFh+C/UMk+oyd42S8e3y5z4zvwDvBZQGyyk2FOFR2rCgo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106999; c=relaxed/simple; bh=iHwJIWf+EpApsxQknvppsFpmD/7QpWJ0mtVTdwRXEsw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JRSItDdCi6bqV44pkRt6VsWf0ZoIcHA/vcXfDq3q5OF6YTCRu3T2ThATCG2oksluDEMsMvs/hk6JxyQXJa4sdxoAJs0p8emx5a0JBmyup0dcoIpy16iirqZITnF5CeQq8DdSWCKcJMIwuk34RdR0yVdUB+ywRDCpBMjZkrvYkSY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=o8YedC33; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="o8YedC33" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8373DC4CEE5; Tue, 8 Apr 2025 10:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106997; bh=iHwJIWf+EpApsxQknvppsFpmD/7QpWJ0mtVTdwRXEsw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=o8YedC33i8tsSrK0yDhUBlA6CUcSx6KQiR3tVpiqhdxWtXw4Qim5VVCBRFoX/jZb3 mAiY4M3nbxvwkTqLEg2VrLlIV3JnejtUJm+ur/xIoa04lYFnTNyZ1hCXoe25A40Z04 77FyeQVkpkzCy4EUd7sIiu7slY83WQQO3NVRiOWqXZqbN175T74gBWI0oVoZPIlR3M IJgaCIYUWzZimla1cZsHXZJMATxORKiCIpzG1vtl9ckQIlxSpkbcg260KskzoV5pn9 WTKzQrJ2HS0rMIkJPRDfUjU+finLARH7fCxeE3j2m/SDEK/4VViJITmTROj95l2ecp 0HWYVtjiRD4Aw== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RVy-1HDy; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Sean Anderson , linux-kernel@vger.kernel.org Subject: [PATCH v3 15/33] scripts/kernel-doc.py: fix handling of doc output check Date: Tue, 8 Apr 2025 18:09:18 +0800 Message-ID: <6d8b77af85295452c0191863ea1041f4195aeaaf.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" The filtering logic was looking for the DOC name to check for symbols, but that name is stored only inside a section. Add it to output_declaration, as it is quicker and easier to check the declaration name than to look inside each section. While here, make sure that the output for both ReST and man after filtering is similar to what the Perl version of kernel-doc produces. 
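The reordered DOC filter described above can be sketched roughly as follows. This is a standalone illustration, not the code from this patch: the function takes its state as arguments instead of reading it from self, and the numeric values of OUTPUT_ALL and OUTPUT_INCLUDE are assumptions made only so the example runs.

```python
# Hypothetical mode constants; the real values live in OutputFormat.
OUTPUT_ALL = 0
OUTPUT_INCLUDE = 1


def check_doc(name, out_mode, nosymbol=(), function_table=(),
              no_doc_sections=False):
    """Return True if the DOC block called `name` should be output."""
    if no_doc_sections:
        return False

    # The exclusion list is consulted first, whatever the output mode is.
    if name in nosymbol:
        return False

    if out_mode == OUTPUT_ALL:
        return True

    if out_mode == OUTPUT_INCLUDE:
        # Only DOC blocks that were explicitly requested are output.
        return name in function_table

    return False
```

With this ordering, a DOC block listed in nosymbol is dropped even when everything else is being output, which is the behaviour the hunk in check_doc() moves towards.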
Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_output.py | 29 ++++++++++++----------------- scripts/lib/kdoc/kdoc_parser.py | 3 ++- 2 files changed, 14 insertions(+), 18 deletions(-) diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index 6a7187980bec..7a945dd80c9b 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -122,13 +122,13 @@ class OutputFormat: if self.no_doc_sections: return False =20 + if name in self.nosymbol: + return False + if self.out_mode =3D=3D self.OUTPUT_ALL: return True =20 if self.out_mode =3D=3D self.OUTPUT_INCLUDE: - if name in self.nosymbol: - return False - if name in self.function_table: return True =20 @@ -154,15 +154,6 @@ class OutputFormat: =20 return False =20 - def check_function(self, fname, name, args): - return True - - def check_enum(self, fname, name, args): - return True - - def check_typedef(self, fname, name, args): - return True - def msg(self, fname, name, args): self.data =3D "" =20 @@ -306,7 +297,7 @@ class RestFormat(OutputFormat): for line in output.strip("\n").split("\n"): self.data +=3D self.lineprefix + line + "\n" =20 - def out_section(self, args, out_reference=3DFalse): + def out_section(self, args, out_docblock=3DFalse): """ Outputs a block section. =20 @@ -325,7 +316,7 @@ class RestFormat(OutputFormat): continue =20 if not self.out_mode =3D=3D self.OUTPUT_INCLUDE: - if out_reference: + if out_docblock: self.data +=3D f".. 
_{section}:\n\n" =20 if not self.symbol: @@ -339,8 +330,7 @@ class RestFormat(OutputFormat): def out_doc(self, fname, name, args): if not self.check_doc(name): return - - self.out_section(args, out_reference=3DTrue) + self.out_section(args, out_docblock=3DTrue) =20 def out_function(self, fname, name, args): =20 @@ -583,8 +573,10 @@ class ManFormat(OutputFormat): =20 for line in contents.strip("\n").split("\n"): line =3D Re(r"^\s*").sub("", line) + if not line: + continue =20 - if line and line[0] =3D=3D ".": + if line[0] =3D=3D ".": self.data +=3D "\\&" + line + "\n" else: self.data +=3D line + "\n" @@ -594,6 +586,9 @@ class ManFormat(OutputFormat): sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', {}) =20 + if not self.check_doc(name): + return + self.data +=3D f'.TH "{module}" 9 "{module}" "{self.man_date}" "AP= I Manual" LINUX' + "\n" =20 for section in sectionlist: diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index e8c86448d6b5..74b311c8184c 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -1198,6 +1198,7 @@ class KernelDoc: else: self.entry.section =3D doc_block.group(1) =20 + self.entry.identifier =3D self.entry.section self.state =3D self.STATE_DOCBLOCK return =20 @@ -1628,7 +1629,7 @@ class KernelDoc: =20 if doc_end.search(line): self.dump_section() - self.output_declaration("doc", None, + self.output_declaration("doc", self.entry.identifier, sectionlist=3Dself.entry.sectionlist, sections=3Dself.entry.sections, section_start_lines=3Dself.entry.secti= on_start_lines, --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2F40F266590; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=n9kVSAWYf4bhooi4w4XqLyUgkEPDDXj/s+f3S/BsyFq26G8LAAANHC34kgigorVW69LNAB/83rrKIO+OqEYLaiVLFYmOIfVWsamdR57GGOi4BliduvT9KIM4m/0eNBVKynO6mA0FHQuG9/SYiWN5caFahIGZefERWSbr247Oc/Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=cI1iCiRg3MPksW4t2Wew3d0GBmSUuTKJXieNw32ATT8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jpiCwp+hUEB0I3SpNSgustZ/lyv22c+GXOCKAMq4tJyZg0g0nydAeMR0o/VhBGVbLhUrXIWDDsUZbZtflWZG3bqao5ZleKm6Pk0YDJ2or/GdRExym1Qu3TGpNvsByUO1pbQ7MHB6SdCh6DkrglmHq4YIhqiIBReZHoo71j5L3Os= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=o5glxogZ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="o5glxogZ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DAC9CC4CEF6; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106995; bh=cI1iCiRg3MPksW4t2Wew3d0GBmSUuTKJXieNw32ATT8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=o5glxogZIRtpzR+o7+qTzO1Zw6Cw9Yp1qHXOMFuYdFJ6Q6NXr4qcQakuCfnyv/odt OixqZ5GsL0vkWMzYeMVDeXUWNa82bt4gnUiLc2LaKVgqh8tZfCNrGdrkm04i7SKO1E pScw7Hje5rfDrg8/XSOwGcnYAWrSzVhWzq6mw0g55FIfNsnwB9BFSFo+LIMgxXOjTl MPo52p9mBoW5DnEDkxThlfeCvTIneuaxA6mAx3nHpA2EeAvgKBsxRtd9aD4IEPD05f fP75tG2REPtHlcTnMlcgtuBL81Fbbl6IVG5p8Xqxy+wDKC9oGko2xkvlH03jdfIMcu pzJcW2sxAvz8A== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RW1-1NGw; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: 
[PATCH v3 16/33] scripts/kernel-doc.py: properly handle out_section for ReST Date: Tue, 8 Apr 2025 18:09:19 +0800 Message-ID: <935d00c6a7c45b25a8be72fad6183fe5a8476cd2.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" DOC sections are output differently in include mode. Handle that difference properly. Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_output.py | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index 7a945dd80c9b..d0c8cedb0ea5 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -315,12 +315,12 @@ class RestFormat(OutputFormat): if section in self.nosymbol: continue =20 - if not self.out_mode =3D=3D self.OUTPUT_INCLUDE: - if out_docblock: + if out_docblock: + if not self.out_mode =3D=3D self.OUTPUT_INCLUDE: self.data +=3D f".. 
_{section}:\n\n" - - if not self.symbol: self.data +=3D f'{self.lineprefix}**{section}**\n\n' + else: + self.data +=3D f'{self.lineprefix}**{section}**\n\n' =20 self.print_lineno(section_start_lines.get(section, 0)) self.output_highlight(sections[section]) --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C15472676FF; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=D89HpGvttcCLcayzAx5+73wwzd1s0hPqcKIvfgKi0w2F/20LhaDBXaxAut53bHWaYckzC5FJtpO72p+Vi4VCWo8IxqGDPyyio1rEuekuvbLL6pE744XdjevNcYbLgzSeHdN2Llm5y4s70lIwsXpF3iIwQU3cVkPTKeKL4LsqrlU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=ap4JRnRfz9etkeP/FGAkkiE80UWtsVBLm1sZD3/QPKE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=U+PcAbemn9/563HDm8fjfIp/AIJaPfNgxn4qLJjtYcN4f6t1St1+x5FHT3oNl/5T2v7KM4sd0a2Qhq0OV1NJC14sQc1G521KGUc+CP/y9E7VKsIBSp5sqyCt2nlohrF0eQZ7jfr0NzY/NcK96IbTjkad6V1A7KUqkFZyl3tTycQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=NgmCOUrs; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="NgmCOUrs" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 689F8C4AF0C; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=ap4JRnRfz9etkeP/FGAkkiE80UWtsVBLm1sZD3/QPKE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=NgmCOUrsCQtM6L8JEe1XQ5hOwQZbW5TzfAB35TRvNNTXBKQ9oJ6jeTu7n0Agp8CXs qqLVYkn3aWUGpnS3g8qwJ0NQKiLs9iuY4M34ohATyqOh4gYPgTwxkmFbWWFiUFtAJs S7OnZi9Wp49dhVFCYw/1+MUERASqeDa1tbzgfl7httFPz5WzTsxl8R7KID7uU6Bn95 32zo76yQFcJk5G3rRkQ4KnxCbn1hHbypgX1io2Rgy3BRaKHF2VMq42kuCg/+GAEZB/ Dulyy/m2nXtANgLdI3LOorgEbTXUiG6FMLnLBzZBuHKw6hMlF7MHSGjYn+k4mrh4cF JKMIpLyxgMTRg== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RW4-1SvQ; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Sean Anderson , linux-kernel@vger.kernel.org Subject: [PATCH v3 17/33] scripts/kernel-doc.py: postpone warnings to the output plugin Date: Tue, 8 Apr 2025 18:09:20 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" We don't want warnings displayed for symbols that weren't output. So, postpone printing warnings to the output plugin, where symbol output is validated. 
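The deferral described above can be sketched as a queue that the parser fills and the output side drains. This is a minimal illustration under assumptions, not the patch itself: queue_message() and output_symbol() are hypothetical names, the state is passed as plain arguments, and only the shape of the (warning, log_msg) tuples and of the flush loop follows the diff.

```python
import logging


def queue_message(pending, fname, ln, msg, warning=True):
    """Parser side: remember the message instead of logging it right away."""
    pending.append((warning, f"{fname}:{ln} {msg}"))


def out_warnings(log, pending):
    """Output side: log every queued message at its original level."""
    for warning, log_msg in pending:
        if warning:
            log.warning(log_msg)
        else:
            log.info(log_msg)


def output_symbol(log, name, pending, nosymbol=()):
    """Flush the queued messages only when the symbol is really output."""
    if name in nosymbol:
        # Filtered out: its warnings are dropped along with the symbol.
        return False

    out_warnings(log, pending)
    return True
```

The point of the design is that a symbol rejected by the filter never reaches out_warnings(), so its messages are never printed.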
Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_output.py | 24 +++++++++++++++---- scripts/lib/kdoc/kdoc_parser.py | 41 ++++++++++++++++----------------- 2 files changed, 39 insertions(+), 26 deletions(-) diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index d0c8cedb0ea5..6582d1f64d1e 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -116,7 +116,16 @@ class OutputFormat: =20 return block =20 - def check_doc(self, name): + def out_warnings(self, args): + warnings =3D args.get('warnings', []) + + for warning, log_msg in warnings: + if warning: + self.config.log.warning(log_msg) + else: + self.config.log.info(log_msg) + + def check_doc(self, name, args): """Check if DOC should be output""" =20 if self.no_doc_sections: @@ -126,19 +135,22 @@ class OutputFormat: return False =20 if self.out_mode =3D=3D self.OUTPUT_ALL: + self.out_warnings(args) return True =20 if self.out_mode =3D=3D self.OUTPUT_INCLUDE: if name in self.function_table: + self.out_warnings(args) return True =20 return False =20 - def check_declaration(self, dtype, name): + def check_declaration(self, dtype, name, args): if name in self.nosymbol: return False =20 if self.out_mode =3D=3D self.OUTPUT_ALL: + self.out_warnings(args) return True =20 if self.out_mode in [self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED]: @@ -147,9 +159,11 @@ class OutputFormat: =20 if self.out_mode =3D=3D self.OUTPUT_INTERNAL: if dtype !=3D "function": + self.out_warnings(args) return True =20 if name not in self.function_table: + self.out_warnings(args) return True =20 return False @@ -163,7 +177,7 @@ class OutputFormat: self.out_doc(fname, name, args) return self.data =20 - if not self.check_declaration(dtype, name): + if not self.check_declaration(dtype, name, args): return self.data =20 if dtype =3D=3D "function": @@ -328,7 +342,7 @@ class RestFormat(OutputFormat): self.data +=3D "\n" =20 def out_doc(self, fname, name, args): - if not 
self.check_doc(name): + if not self.check_doc(name, args): return self.out_section(args, out_docblock=3DTrue) =20 @@ -586,7 +600,7 @@ class ManFormat(OutputFormat): sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', {}) =20 - if not self.check_doc(name): + if not self.check_doc(name, args): return =20 self.data +=3D f'.TH "{module}" 9 "{module}" "{self.man_date}" "AP= I Manual" LINUX' + "\n" diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index 74b311c8184c..3698ef625367 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -131,23 +131,23 @@ class KernelDoc: # Place all potential outputs into an array self.entries =3D [] =20 - def show_warnings(self, dtype, declaration_name): # pylint: disable= =3DW0613 - """ - Allow filtering out warnings - """ - - # TODO: implement it - - return True - # TODO: rename to emit_message def emit_warning(self, ln, msg, warning=3DTrue): """Emit a message""" =20 + log_msg =3D f"{self.fname}:{ln} {msg}" + + if self.entry: + # Delegate warning output to output logic, as this way it + # will report warnings/info only for symbols that are output + + self.entry.warnings.append((warning, log_msg)) + return + if warning: - self.config.log.warning("%s:%d %s", self.fname, ln, msg) + self.config.log.warning(log_msg) else: - self.config.log.info("%s:%d %s", self.fname, ln, msg) + self.config.log.info(log_msg) =20 def dump_section(self, start_new=3DTrue): """ @@ -221,10 +221,9 @@ class KernelDoc: # For now, we're keeping the same name of the function just to make # easier to compare the source code of both scripts =20 - if "declaration_start_line" not in args: - args["declaration_start_line"] =3D self.entry.declaration_star= t_line - + args["declaration_start_line"] =3D self.entry.declaration_start_li= ne args["type"] =3D dtype + args["warnings"] =3D self.entry.warnings =20 # TODO: use colletions.OrderedDict =20 @@ -257,6 +256,8 @@ class KernelDoc: 
self.entry.struct_actual =3D "" self.entry.prototype =3D "" =20 + self.entry.warnings =3D [] + self.entry.parameterlist =3D [] self.entry.parameterdescs =3D {} self.entry.parametertypes =3D {} @@ -328,7 +329,7 @@ class KernelDoc: if param not in self.entry.parameterdescs and not param.startswith= ("#"): self.entry.parameterdescs[param] =3D self.undescribed =20 - if self.show_warnings(dtype, declaration_name) and "." not in = param: + if "." not in param: if decl_type =3D=3D 'function': dname =3D f"{decl_type} parameter" else: @@ -868,16 +869,14 @@ class KernelDoc: self.entry.parameterlist.append(arg) if arg not in self.entry.parameterdescs: self.entry.parameterdescs[arg] =3D self.undescribed - if self.show_warnings("enum", declaration_name): - self.emit_warning(ln, - f"Enum value '{arg}' not described i= n enum '{declaration_name}'") + self.emit_warning(ln, + f"Enum value '{arg}' not described in en= um '{declaration_name}'") member_set.add(arg) =20 for k in self.entry.parameterdescs: if k not in member_set: - if self.show_warnings("enum", declaration_name): - self.emit_warning(ln, - f"Excess enum value '%{k}' descripti= on in '{declaration_name}'") + self.emit_warning(ln, + f"Excess enum value '%{k}' description i= n '{declaration_name}'") =20 self.output_declaration('enum', declaration_name, enum=3Ddeclaration_name, --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F29B8265CDA; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; 
b=L6uqUt3FhQSkrlHNpxf+2D426Of8I462GEjo/dutuxja7Oh0+V0SLfEYi++EdaS6SOTgWyppnM9l2SHHUts26GGQhg4Yh9cFrPoK/anc1wSVs0Nu9gNPCTyj8qLuS5meul5xLQZtUYGiBdZ0+xyTNn7SSs7z/wle3mNnwJwLbKE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=baQs6T5V+0EX/ZOEtJqTKbtbHsBX/BZE6fQSRnnW5Ds=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=eUblle8drpjP7AABMD+3qJnidsW7K8YUFpAcRqyV6CPpL6gmeTcqOirTpCWH69KsOsmLiehJBOeRCM8UsGak4CxTxqhQexizDNGUDM+GmRrjmDJnm2+PRNcrwvBQwFB2FTKY9mJ9Y2nhgK4YGpy7Qn02tB9UyOrfaF7POSl+FLU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=l1VTJSY9; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="l1VTJSY9" Received: by smtp.kernel.org (Postfix) with ESMTPSA id CF0ECC4CEF3; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106995; bh=baQs6T5V+0EX/ZOEtJqTKbtbHsBX/BZE6fQSRnnW5Ds=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=l1VTJSY9Nq7BRcVEWTJZYBqaQXMTl/mZeGB5CVFI/yoR3kWF4ErufeyUfnPt37gPG ATNn/JrAYHiKYQw1oPenXILPwHWehAwZLxmBplVzaTCRdIUadBuTCB2ds6scUkRNF1 0rZmtuDMmMgYCDOWRNbbxrMKtcAZXSBWR97kFoo0ZXHd7F8lxPxsqkxSs+GZ5qjwBe ULmLel6OtD9MuHjy4qAbIA8Ui4HDSlKjVbzmZQXW3IyEB5HCsFfF2eoFXB0W2i226T DXnIgyAlKyruzYUmvqf2ZF8WvNBg6ESUrgeUxp2u8PAVFIJbC2FVz7Ap6mZqGC5mK/ cvUwmf4SWrm8Q== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RW7-1YBO; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 18/33] docs: add a .pylintrc file with sys path for docs scripts Date: Tue, 8 Apr 2025 18:09:21 +0800 Message-ID: 
<7b3c8a932c50ae52ce4c848676602b46d1d4a8f9.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" The docs scripts that are used by Documentation/sphinx are using scripts/lib/* directories to place classes that will be used by both extensions and scripts. When pylint is used, it needs to identify the path where such scripts are, otherwise it will bail out. Add a simple RC file placing the location of such files. Signed-off-by: Mauro Carvalho Chehab --- .pylintrc | 2 ++ 1 file changed, 2 insertions(+) create mode 100644 .pylintrc diff --git a/.pylintrc b/.pylintrc new file mode 100644 index 000000000000..30b8ae1659f8 --- /dev/null +++ b/.pylintrc @@ -0,0 +1,2 @@ +[MASTER] +init-hook=3D'import sys; sys.path +=3D ["scripts/lib/kdoc", "scripts/lib/a= bi"]' --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9A63E2673B8; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=mNbRKmI2QvKitEswzW4Xupx7NCWrl7LB6TF3kYze0LqyGOP3IeHjHIyCg0XCFOP+EJWuOKcyAD5s1ovMQD7Faci0Mqz49PTTjsFde2+47JiyKPUBco0sGUOWt362I+g30JTv8M3P77vBv4kERRMU9UYwAy+469eQeKzydho/hUA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=5NIsmVCuy0Ps73FY9SsRdlOH9IrLgYYVlKEUKddXlno=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=ZAck7/qdVT7bbjo8gn8UIDfl7sqxVj8JlRJvWm4tFdp8FI41r9ISHQoulazwA3oojcev1OKP3UIiOsfbaIw1375WiGXvUAfGvhdqgC9T+u1WKCfS3EKcV6bUAqTK+k8tRi4VImFe9oZR3f/p26DT87a75nJdBtCrJpegp26l5rY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=nRC9cHDV; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="nRC9cHDV" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6B064C4CEEA; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=5NIsmVCuy0Ps73FY9SsRdlOH9IrLgYYVlKEUKddXlno=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nRC9cHDV9zHdrcvINwNoFFY0cfM0R31zuO686CM/9sneLkEO7j3EHhvS1LL6GTD24 Z/qFyNnDQJgO27iRqE08pZmwZ0Q9F47TRLuX5rA/0CrgE7KauKdGHSn0BYWahNuWI3 /dXpg7P76+5KY+yTftnHjBa6rjz7qm31nYVFvKAskDfWsmOL5ZVB8QtSVIO4OXPWF5 L1FSs+sEd5IQUQ7E1xMqRQyf3W/fRnjnavYByusFu2l7hJepqrNnmXiV6av6ihOX7a W81p0/u+CxuwuZz34hQWlhu692cSzoUk9pV9v1g+vD7WiJK+RFP4g8Ii7UQeBb364g QyD/kPIvkPVzA== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWA-1e8a; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Kees Cook , linux-kernel@vger.kernel.org Subject: [PATCH v3 19/33] docs: sphinx: kerneldoc: verbose kernel-doc command if V=1 Date: Tue, 8 Apr 2025 18:09:22 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" It is useful to know what kernel-doc command was used during document build time, as it allows one to check the output the same way as 
Sphinx extension does. Signed-off-by: Mauro Carvalho Chehab --- Documentation/sphinx/kerneldoc.py | 34 +++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/Documentation/sphinx/kerneldoc.py b/Documentation/sphinx/kerne= ldoc.py index 39ddae6ae7dd..d206eb2be10a 100644 --- a/Documentation/sphinx/kerneldoc.py +++ b/Documentation/sphinx/kerneldoc.py @@ -43,6 +43,29 @@ from sphinx.util import logging =20 __version__ =3D '1.0' =20 +def cmd_str(cmd): + """ + Helper function to output a command line that can be used to produce + the same records via command line. Helpful to debug troubles at the + script. + """ + + cmd_line =3D "" + + for w in cmd: + if w =3D=3D "" or " " in w: + esc_cmd =3D "'" + w + "'" + else: + esc_cmd =3D w + + if cmd_line: + cmd_line +=3D " " + esc_cmd + continue + else: + cmd_line =3D esc_cmd + + return cmd_line + class KernelDocDirective(Directive): """Extract kernel-doc comments from the specified file""" required_argument =3D 1 @@ -57,6 +80,7 @@ class KernelDocDirective(Directive): } has_content =3D False logger =3D logging.getLogger('kerneldoc') + verbose =3D 0 =20 def run(self): env =3D self.state.document.settings.env @@ -65,6 +89,13 @@ class KernelDocDirective(Directive): filename =3D env.config.kerneldoc_srctree + '/' + self.arguments[0] export_file_patterns =3D [] =20 + verbose =3D os.environ.get("V") + if verbose: + try: + self.verbose =3D int(verbose) + except ValueError: + pass + # Tell sphinx of the dependency env.note_dependency(os.path.abspath(filename)) =20 @@ -104,6 +135,9 @@ class KernelDocDirective(Directive): =20 cmd +=3D [filename] =20 + if self.verbose >=3D 1: + print(cmd_str(cmd)) + try: self.logger.verbose("calling kernel-doc '%s'" % (" ".join(cmd)= )) =20 --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) 
by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8F4D7267394; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=HEcdRhtJtbKrCEJKOSKT/LDu0KectXskWQf1pdYEO78+8rqQusZVUiq6uzMdWG5kAreobcnyOkz5JvoQRWOdWQzF3RmqePX5yb6plqnZIi4P0Was0S+CqJvvRh9h7rm/qv8zhwUbr8Zu+eB/IGFJFdskcdgu+pKOWzwgCnFRfFk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=acYo2Td1sfM4zaQAyEu4vloX6YMzSDuGC17yA6ctkig=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=V84xQ48byOpexOXhOFHBjLRKOHpjgM09mU53Y7FbujzX7MDUKoKyuVYFMyjIXV6bFoOopBNE4aKZbvZ7CBIF+pyPJa1QowONit0VxVIvSRHz9cSxEFBsSWLhJrTTtzvLgGIu62BjCP7Dl15IxOEMjaNTktqzOEnBSvScrhnvfp0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=niGp3+t/; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="niGp3+t/" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 678E0C4CEF7; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=acYo2Td1sfM4zaQAyEu4vloX6YMzSDuGC17yA6ctkig=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=niGp3+t/hhCMwYHaz99dZ11PjRoZkuTGDKzekH7Su4v4j8l+9JraPAlKPwSNHoa6v Kndg6wB2+4rp5KosB43J1Nkb4LnUYdz0VV0xg0Y5Q6sBpxglRO1YSsru6Agy60AF3s U+aLnCklJ9uv+egJRceGYxRBS2b77BHXseNpzI+lLhtNZiFCH0qwhUArJss7u3o/Cz AuDFwBqtV7W2kcmKezO6ogCzlqE+xozzhFk3SECP1luqZbfPpqNMCTA8Rm4MVsqDUZ +dVHtHJAmu9Ud/2Qkek3ee7lnfUTbVhUpvkq+VHuCEe4qbiE4vDqXlD1x/pa7uMP8l huUfhQSCwZ14g== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWD-1kFW; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro 
Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Kees Cook , linux-kernel@vger.kernel.org Subject: [PATCH v3 20/33] docs: sphinx: kerneldoc: ignore "\" characters from options Date: Tue, 8 Apr 2025 18:09:23 +0800 Message-ID: <4c652d6c57b20500c135b95294e554d9e9a97f42.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Documentation/driver-api/infiniband.rst has a kernel-doc tag with "\" characters at the end: .. kernel-doc:: drivers/infiniband/ulp/iser/iscsi_iser.c :functions: iscsi_iser_pdu_alloc iser_initialize_task_headers \ iscsi_iser_task_init iscsi_iser_mtask_xmit iscsi_iser_task_xmit \ iscsi_iser_cleanup_task iscsi_iser_check_protection \ iscsi_iser_conn_create iscsi_iser_conn_bind \ iscsi_iser_conn_start iscsi_iser_conn_stop \ iscsi_iser_session_destroy iscsi_iser_session_create \ iscsi_iser_set_param iscsi_iser_ep_connect iscsi_iser_ep_poll \ iscsi_iser_ep_disconnect This is not handled well, as the "\" characters are simply stored inside the Sphinx options. While the underlying problem deserves a proper fix, it is better to relax the kerneldoc.py extension so that it silently strips "\" from the end of strings, as otherwise this may cause trouble when preparing the arguments passed to kernel-doc. 
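The stripping described above can be sketched as a small helper. clean_identifiers() is a hypothetical name used only for illustration; the patch applies the same rstrip/strip steps inline, inside the loops that build the kernel-doc command line.

```python
def clean_identifiers(option_value):
    """Split an option string and drop line-continuation backslashes."""
    cleaned = []

    for ident in option_value.split():
        # Remove trailing "\" characters, then any leftover whitespace.
        ident = ident.rstrip("\\").strip()
        if not ident:
            # The token was only a backslash: nothing to pass on.
            continue

        cleaned.append(ident)

    return cleaned
```

A token made only of "\" collapses to an empty string and is skipped, so it never turns into an empty argument on the kernel-doc command line.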
Signed-off-by: Mauro Carvalho Chehab --- Documentation/sphinx/kerneldoc.py | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/Documentation/sphinx/kerneldoc.py b/Documentation/sphinx/kerne= ldoc.py index d206eb2be10a..344789ed9ea2 100644 --- a/Documentation/sphinx/kerneldoc.py +++ b/Documentation/sphinx/kerneldoc.py @@ -118,6 +118,10 @@ class KernelDocDirective(Directive): identifiers =3D self.options.get('identifiers').split() if identifiers: for i in identifiers: + i =3D i.rstrip("\\").strip() + if not i: + continue + cmd +=3D ['-function', i] else: cmd +=3D ['-no-doc-sections'] @@ -126,9 +130,17 @@ class KernelDocDirective(Directive): no_identifiers =3D self.options.get('no-identifiers').split() if no_identifiers: for i in no_identifiers: + i =3D i.rstrip("\\").strip() + if not i: + continue + cmd +=3D ['-nosymbol', i] =20 for pattern in export_file_patterns: + pattern =3D pattern.rstrip("\\").strip() + if not pattern: + continue + for f in glob.glob(env.config.kerneldoc_srctree + '/' + patter= n): env.note_dependency(os.path.abspath(f)) cmd +=3D ['-export-file', f] --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EF9BD2676CC; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106997; cv=none; b=Fa0+FyT+6VVTRmORbEmRCnKXU65xOsN7FB45p6H0FIo4Wmzbalb0+bGpiNxKxubql04mSMjthe03EEyGpAh6nLvTjxVeAU8XasiNJAgkC6IO1gyNSqNge/ORNvLRbtmav7iUieGaoyebSSpS6/T/7deMyQtIsb2rXsRkIEzRvPk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106997; c=relaxed/simple; bh=U8UoBsYILY5xqJ321naGG+BcFzDB5+hrgzzoiF2VkSo=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=q7OJS7IRyqYpLTRBRTEkWdo/E/CcuoZrqndqMZ3RnIuGVpnqMrPFVWguwtXEfZ19QPHYeJcrTo6i/UQLAPnk/IYwSf3iKHYJlKDBVw3lep0hxediXt+xnbuwGWlSst4cBe7Af49qa00Tq//urS7vq0TFcCuI/1A+shddGkbjfXM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=rcOE7OfE; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="rcOE7OfE" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C8249C4CEF2; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=U8UoBsYILY5xqJ321naGG+BcFzDB5+hrgzzoiF2VkSo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rcOE7OfEnieJBl+Dh4TzJjVksZuBjv4swHs2xJ/RgtmarrCnRoG0xxaT0g72EZMnk TPeyvfqnMUs7R5z0uqwMB+jseH+9kvaW0hoBYzPa+Xgni6kanVQhLtRSkfEZBjcfhV QnK6nqczbewWFNgb95N2OQSLrFjPiJAG/3fa72Luk0mFH4oVPPe29O4L3i6RHRretT eTja0OLeUMjUVyGON8iMtgynz208cU9UJpo82fEunPAxxEkJrB8SBk3PpXi+VYRi8x YIyLk9/aHxxXoV1YLBAfKEdMV4VhGWvsXU2dAInmOO/MQ2JGSnoiUoVNcqJHABl9+5 sPua4WDcKjTXw== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWG-1qN7; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 21/33] docs: sphinx: kerneldoc: use kernel-doc.py script Date: Tue, 8 Apr 2025 18:09:24 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Switch to the new version when producing documentation. 
Signed-off-by: Mauro Carvalho Chehab --- Documentation/Makefile | 2 +- Documentation/conf.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/Makefile b/Documentation/Makefile index 63094646df28..c022b97c487e 100644 --- a/Documentation/Makefile +++ b/Documentation/Makefile @@ -60,7 +60,7 @@ endif #HAVE_LATEXMK # Internal variables. PAPEROPT_a4 =3D -D latex_paper_size=3Da4 PAPEROPT_letter =3D -D latex_paper_size=3Dletter -KERNELDOC =3D $(srctree)/scripts/kernel-doc +KERNELDOC =3D $(srctree)/scripts/kernel-doc.py KERNELDOC_CONF =3D -D kerneldoc_srctree=3D$(srctree) -D kerneldoc_bin=3D$= (KERNELDOC) ALLSPHINXOPTS =3D $(KERNELDOC_CONF) $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) ifneq ($(wildcard $(srctree)/.config),) diff --git a/Documentation/conf.py b/Documentation/conf.py index 3dad1f90b098..b126f6760b5f 100644 --- a/Documentation/conf.py +++ b/Documentation/conf.py @@ -540,7 +540,7 @@ pdf_documents =3D [ # kernel-doc extension configuration for running Sphinx directly (e.g. by = Read # the Docs). In a normal build, these are supplied from the Makefile via c= ommand # line arguments. -kerneldoc_bin =3D '../scripts/kernel-doc' +kerneldoc_bin =3D '../scripts/kernel-doc.py' kerneldoc_srctree =3D '..' 
=20 # ------------------------------------------------------------------------= ------ --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 35CF626659F; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=bVxvGIrEH++vb0oTe1RmUhtzJrr7eZ+4zcJZhwVZB1OFbCHA2SzMdT1zpotPXJFtsjcRvndweC2qPwqNo6D08Yzjfz9Qq7UJD62oBsEBGQDYV7EtCvvYIgUiuYD7FoUe6sWqmk3xN0IbyVcfY+EkHrWDzihn5F4e9YEm3dR7v6k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=wYGITVSD3W1nJ6cH/x1z0CW1HgVp+nvFB/0wOau6qSs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=q86KNJVwSrsdgnHBE8xvQ0uDKFVayNWY1GlKhwZS0bjB1BP4l5DWZLx/X4P4hEjtuTTlre4fRN+MmepXnmDwGCtnLD0QEHRdByndsX5elbeOb6dIbwGCz2Iowcgr3eiJkQk/rnK7y1zeBzXvzTs381aSvp8pzi1HBGe17HxpDBA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=qc1rV2j/; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="qc1rV2j/" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D0B6DC4CEF4; Tue, 8 Apr 2025 10:09:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106995; bh=wYGITVSD3W1nJ6cH/x1z0CW1HgVp+nvFB/0wOau6qSs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=qc1rV2j/Y8IU6+tjvkXSoLq3yPpo4PGn4CdgLFOnGpm3mjNZTt1TOhbSTOPOt4hmn U4+JWDtBFxzUVslkf16YLWvXepBp984AgNOsLiRX2q5gRtGWh/kBIWKG1wXWLAK6T4 
ENG2wa/34Q3ShpBKFkbN2S+bSXvaoPtqagrLn/Kl1b7Gq48+cktLurVm6wITug/1J7 /hpYJjdrlkO25NxTxeuyw3WvCvKF18FnniA99F+kEn2QzvbxDFRAGpjmDE/2MeuKih Bsdg6MY3mJV+gNqtBjXMUs93BZXCTBDmK9lb6h2dOpb9WhoW0TAeR3fp+g+6XZmuEU KgO85f/kQmUrQ== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWJ-1vxW; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 22/33] scripts/kernel-doc.py: Set an output format for --none Date: Tue, 8 Apr 2025 18:09:25 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8"

Now that warnings output is deferred to the output plugin, we need to have an output style for --none as well. So, use the OutputFormat base class in such cases.
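A rough sketch of the fallback, assuming heavily simplified stand-ins for the real classes: the OutputFormat and RestFormat below are toy versions written for illustration, pick_style() is a hypothetical helper, and the directive string is only sample output.

```python
class OutputFormat:
    """Toy stand-in for the base output class: no formatted output."""

    def msg(self, name):
        # The base class produces nothing for a symbol
        return None


class RestFormat(OutputFormat):
    """Toy stand-in for a real output plugin."""

    def msg(self, name):
        return f".. c:function:: {name}\n"


def pick_style(out_style=None):
    """Hypothetical helper mirroring the fallback added by the patch."""
    if out_style is None:
        # --none: fall back to the base class instead of keeping None
        out_style = OutputFormat()
    return out_style


print(pick_style().msg("foo"))              # base class: no output text
print(pick_style(RestFormat()).msg("foo"))  # plugin: formatted output
```

The point of the design is that callers never need to test for None: there is always an object on which the warning and message hooks can be invoked.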
Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_files.py | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py index 4c04546a74fe..dd3dbe87520b 100755 --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -20,6 +20,7 @@ from datetime import datetime from dateutil import tz =20 from kdoc_parser import KernelDoc +from kdoc_output import OutputFormat =20 =20 class GlobSourceFiles: @@ -138,6 +139,9 @@ class KernelFiles(): if not modulename: modulename =3D "Kernel API" =20 + if out_style is None: + out_style =3D OutputFormat() + dt =3D datetime.now() if os.environ.get("KBUILD_BUILD_TIMESTAMP", None): # use UTC TZ --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C853C267707; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=TdZzZymH5frMaO4WQ+gE50tfsObG8gSflWGIQsyyBl/NaE/3013vPywnDUDPOknysqKgMCI84obQP7niv3UCLxJx4dWGEDNcXZs5PWOsMMY3fLnfY2wMJJKYyXLuXVLQdWHvkB/76uy/Xr/du0zbOFktOB5D0UsiTULOxt37Y/k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=2gtlbokD3wIvcyskgL2DYRWExsd4tWZg94PvP3UaS7o=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jIDaJU3CR0MvkaVhSI+lMKdmsXAfQDujhy+CcF9zEvzgFA4lHjuhiKCtu/MTq3VvWWsJ09YWs8PTAXjDFYAdDOyzSkRqZFEfwSIdJcxO8GpQWLeKvv2CHBBY+bHgRPLyS+dHXamvqp5RzrVmQow5m2XA7bt+ThHKKs9JQRF1nRs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=omwK458e; arc=none 
smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="omwK458e" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 64672C4CEEE; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=2gtlbokD3wIvcyskgL2DYRWExsd4tWZg94PvP3UaS7o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=omwK458eL+K/pQKW9eEpQfOOETKxlPJV6K8dd2yoiSFjxcGhsCrcJaaAgOR53ZiFz /W8Z2sAgUDlxxRpNWi2piA5sa1dSSN0lZCT3kqpk2i6PyIIvGjCjOH11LbeztfKxy3 o29u5nX8adDLX6z/QLpOM6jHHxt/24tMOHfjrRuiyYB4pLrdMDzBtNuNLnr/hVp+Yt QW3ff+eSLG5nOk5bvdvodwH3QQLmO6TYFWishnksf689pvfVj7Hu0BuzcnEHbOcxN5 IkmobPrZcNPFmPKqHTtL83My7YBBiBjzoEgSmGdExpEVpfqoUMacluLYtdGI5JqvoH hGJMqDEumRO1A== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWM-21WJ; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Sean Anderson , linux-kernel@vger.kernel.org Subject: [PATCH v3 23/33] scripts/kernel-doc.py: adjust some coding style issues Date: Tue, 8 Apr 2025 18:09:26 +0800 Message-ID: <0f9d5473105e4c09c6c41e3db72cc63f1d4d55f9.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Make pylint happier by adding some missing documentation and addressing a couple of pylint warnings. 
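One of the changes in the diff below replaces "pass" with a docstring in the do-nothing hook methods. A small sketch of why that works, using toy classes that are not the real kdoc_output.py code: a docstring alone is a valid function body, and such a method simply returns None.

```python
class OutputFormat:
    """Toy base class: hooks do nothing unless a subclass overrides them."""

    def out_doc(self, fname, name, args):
        """Outputs a DOC block"""
        # No "pass" needed: the docstring is the whole body


class EchoFormat(OutputFormat):
    """Hypothetical subclass, for illustration only."""

    def out_doc(self, fname, name, args):
        """Outputs a DOC block as a single line"""
        return f"DOC: {name} ({fname})"


print(OutputFormat().out_doc("a.c", "Overview", {}))
print(EchoFormat().out_doc("a.c", "Overview", {}))
```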
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 12 ++++---- scripts/lib/kdoc/kdoc_files.py | 4 +-- scripts/lib/kdoc/kdoc_output.py | 50 ++++++++++++++++++++++++++------- scripts/lib/kdoc/kdoc_parser.py | 30 +++++--------------- scripts/lib/kdoc/kdoc_re.py | 3 +- 5 files changed, 57 insertions(+), 42 deletions(-) mode change 100755 =3D> 100644 scripts/lib/kdoc/kdoc_files.py diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index 90aacd17499a..eca7e34f9d03 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -2,7 +2,7 @@ # SPDX-License-Identifier: GPL-2.0 # Copyright(c) 2025: Mauro Carvalho Chehab . # -# pylint: disable=3DC0103 +# pylint: disable=3DC0103,R0915 # # Converted from the kernel-doc script originally written in Perl # under GPLv2, copyrighted since 1998 by the following authors: @@ -165,6 +165,8 @@ neither here nor at the original Perl script. =20 =20 class MsgFormatter(logging.Formatter): + """Helper class to format warnings on a similar way to kernel-doc.pl""" + def format(self, record): record.levelname =3D record.levelname.capitalize() return logging.Formatter.format(self, record) @@ -241,7 +243,7 @@ def main(): =20 # Those are valid for all 3 types of filter parser.add_argument("-n", "-nosymbol", "--nosymbol", action=3D'append', - help=3DNOSYMBOL_DESC) + help=3DNOSYMBOL_DESC) =20 parser.add_argument("-D", "-no-doc-sections", "--no-doc-sections", action=3D'store_true', help=3D"Don't outputt DOC s= ections") @@ -286,9 +288,9 @@ def main(): kfiles.parse(args.files, export_file=3Dargs.export_file) =20 for t in kfiles.msg(enable_lineno=3Dargs.enable_lineno, export=3Dargs.= export, - internal=3Dargs.internal, symbol=3Dargs.symbol, - nosymbol=3Dargs.nosymbol, - no_doc_sections=3Dargs.no_doc_sections): + internal=3Dargs.internal, symbol=3Dargs.symbol, + nosymbol=3Dargs.nosymbol, + no_doc_sections=3Dargs.no_doc_sections): msg =3D t[1] if msg: print(msg) diff --git a/scripts/lib/kdoc/kdoc_files.py 
b/scripts/lib/kdoc/kdoc_files.py old mode 100755 new mode 100644 index dd3dbe87520b..e2221db7022a --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -4,8 +4,6 @@ # # pylint: disable=3DR0903,R0913,R0914,R0917 =20 -# TODO: implement warning filtering - """ Parse lernel-doc tags on multiple kernel source files. """ @@ -128,7 +126,7 @@ class KernelFiles(): def __init__(self, verbose=3DFalse, out_style=3DNone, werror=3DFalse, wreturn=3DFalse, wshort_desc=3DFalse, wcontents_before_sections=3DFalse, - logger=3DNone, modulename=3DNone, export_file=3DNone): + logger=3DNone, modulename=3DNone): """ Initialize startup variables and parse all files """ diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index 6582d1f64d1e..7f84bf12f1e1 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -2,9 +2,7 @@ # SPDX-License-Identifier: GPL-2.0 # Copyright(c) 2025: Mauro Carvalho Chehab . # -# pylint: disable=3DC0301,R0911,R0912,R0913,R0914,R0915,R0917 - -# TODO: implement warning filtering +# pylint: disable=3DC0301,R0902,R0911,R0912,R0913,R0914,R0915,R0917 =20 """ Implement output filters to print kernel-doc documentation. @@ -52,6 +50,11 @@ type_member_func =3D type_member + Re(r"\(\)", cache=3DF= alse) =20 =20 class OutputFormat: + """ + Base class for OutputFormat. If used as-is, it means that only + warnings will be displayed. + """ + # output mode. OUTPUT_ALL =3D 0 # output all symbols and doc sections OUTPUT_INCLUDE =3D 1 # output only specified symbols @@ -75,6 +78,10 @@ class OutputFormat: self.data =3D "" =20 def set_config(self, config): + """ + Setup global config variables used by both parser and output. + """ + self.config =3D config =20 def set_filter(self, export, internal, symbol, nosymbol, function_tabl= e, @@ -117,6 +124,10 @@ class OutputFormat: return block =20 def out_warnings(self, args): + """ + Output warnings for identifiers that will be displayed. 
+ """ + warnings =3D args.get('warnings', []) =20 for warning, log_msg in warnings: @@ -146,6 +157,11 @@ class OutputFormat: return False =20 def check_declaration(self, dtype, name, args): + """ + Checks if a declaration should be output or not based on the + filtering criteria. + """ + if name in self.nosymbol: return False =20 @@ -169,6 +185,10 @@ class OutputFormat: return False =20 def msg(self, fname, name, args): + """ + Handles a single entry from kernel-doc parser + """ + self.data =3D "" =20 dtype =3D args.get('type', "") @@ -203,24 +223,25 @@ class OutputFormat: return None =20 # Virtual methods to be overridden by inherited classes + # At the base class, those do nothing. def out_doc(self, fname, name, args): - pass + """Outputs a DOC block""" =20 def out_function(self, fname, name, args): - pass + """Outputs a function""" =20 def out_enum(self, fname, name, args): - pass + """Outputs an enum""" =20 def out_typedef(self, fname, name, args): - pass + """Outputs a typedef""" =20 def out_struct(self, fname, name, args): - pass + """Outputs a struct""" =20 =20 class RestFormat(OutputFormat): - # """Consts and functions used by ReST output""" + """Consts and functions used by ReST output""" =20 highlights =3D [ (type_constant, r"``\1``"), @@ -265,6 +286,11 @@ class RestFormat(OutputFormat): self.data +=3D f".. 
LINENO {ln}\n" =20 def output_highlight(self, args): + """ + Outputs a C symbol that may require being converted to ReST using + the self.highlights variable + """ + input_text =3D args output =3D "" in_literal =3D False @@ -579,6 +605,10 @@ class ManFormat(OutputFormat): self.man_date =3D dt.strftime("%B %Y") =20 def output_highlight(self, block): + """ + Outputs a C symbol that may require being highlighted with + self.highlights variable using troff syntax + """ =20 contents =3D self.highlight_block(block) =20 @@ -601,7 +631,7 @@ class ManFormat(OutputFormat): sections =3D args.get('sections', {}) =20 if not self.check_doc(name, args): - return + return =20 self.data +=3D f'.TH "{module}" 9 "{module}" "{self.man_date}" "AP= I Manual" LINUX' + "\n" =20 diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index 3698ef625367..dcb9515fc40b 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -131,7 +131,7 @@ class KernelDoc: # Place all potential outputs into an array self.entries =3D [] =20 - # TODO: rename to emit_message + # TODO: rename to emit_message after removal of kernel-doc.pl def emit_warning(self, ln, msg, warning=3DTrue): """Emit a message""" =20 @@ -157,19 +157,6 @@ class KernelDoc: name =3D self.entry.section contents =3D self.entry.contents =20 - # TODO: we can prevent dumping empty sections here with: - # - # if self.entry.contents.strip("\n"): - # if start_new: - # self.entry.section =3D self.section_default - # self.entry.contents =3D "" - # - # return - # - # But, as we want to be producing the same output of the - # venerable kernel-doc Perl tool, let's just output everything, - # at least for now - if type_param.match(name): name =3D type_param.group(1) =20 @@ -205,7 +192,7 @@ class KernelDoc: self.entry.section =3D self.section_default self.entry.contents =3D "" =20 - # TODO: rename it to store_declaration + # TODO: rename it to store_declaration after removal of kernel-doc.pl def 
output_declaration(self, dtype, name, **args): """ Stores the entry into an entry array. @@ -225,13 +212,13 @@ class KernelDoc: args["type"] =3D dtype args["warnings"] =3D self.entry.warnings =20 - # TODO: use colletions.OrderedDict + # TODO: use colletions.OrderedDict to remove sectionlist =20 sections =3D args.get('sections', {}) sectionlist =3D args.get('sectionlist', []) =20 # Drop empty sections - # TODO: improve it to emit warnings + # TODO: improve empty sections logic to emit warnings for section in ["Description", "Return"]: if section in sectionlist: if not sections[section].rstrip(): @@ -636,7 +623,9 @@ class KernelDoc: =20 # Replace macros # - # TODO: it is better to also move those to the NestedMatch log= ic, + # TODO: use NestedMatch for FOO($1, $2, ...) matches + # + # it is better to also move those to the NestedMatch logic, # to ensure that parenthesis will be properly matched. =20 (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S),= r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'), @@ -906,7 +895,6 @@ class KernelDoc: self.dump_struct(ln, prototype) return =20 - # TODO: handle other types self.output_declaration(self.entry.decl_type, prototype, entry=3Dself.entry) =20 @@ -1680,10 +1668,6 @@ class KernelDoc: self.st_inline_name[self.inline_= doc_state], line) =20 - # TODO: not all states allow EXPORT_SYMBOL*, so this - # can be optimized later on to speedup parsing - self.process_export(self.config.function_table, line) - # Hand this line to the appropriate state handler if self.state =3D=3D self.STATE_NORMAL: self.process_normal(ln, line) diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py index 512b6521e79d..d28485ff94d6 100755 --- a/scripts/lib/kdoc/kdoc_re.py +++ b/scripts/lib/kdoc/kdoc_re.py @@ -131,7 +131,8 @@ class NestedMatch: will ignore the search string. 
""" =20 - # TODO: + # TODO: make NestedMatch handle multiple match groups + # # Right now, regular expressions to match it are defined only up to # the start delimiter, e.g.: # --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F2F7E267B61; Tue, 8 Apr 2025 10:09:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106998; cv=none; b=pFHfccR8zy3GgxV9d8+fwIdnKHd4ubeMX8YZGTfJUYTuG17aFiNhfRbM0ovHfb6GiNl138kjzDkG2W2h3alo1pe+prlCsteZxNpPNLQKEze4614oJ01bhm5VYKNAJKbcJ5dyngX3BtjA1RjRNMsJ5PxYtwtuTPPq0f2CM6GlLM8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106998; c=relaxed/simple; bh=xGGaublStiDF9Xu+i44mFNAhZua59ZKfxDpXyFMS7D0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=F0DxuJS9FcDlA8rA5XGhStUNUsM7oaVTrWeW5PUx+eZwMB+NQA+Qd2ESaByxAhof4kicPkbmHcDfANmy7eFn0q45Du9BrMheMIeBtIlS6GPVGQRTUwLV2Bemx0VvuPfw1KSh32Ylf16Ol7Hn9YLUdjHcIPDxGbAxW2/1vyL/5KU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ZhD+3w03; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ZhD+3w03" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4F236C4CEEB; Tue, 8 Apr 2025 10:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106997; bh=xGGaublStiDF9Xu+i44mFNAhZua59ZKfxDpXyFMS7D0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZhD+3w036SUvdKmRhUVh1HWc4aPzTeYp56frEDd/+oJPAQfTNtGXeJoSCJZaFwW8t 
/K6EjrmxihfhlglwMUh7MkaiT+XzKIY1FfwU/RDhmHslsSAqhFesF1xn7OBxYsrf52 oBHnV3w+EPP2VtysvaA4A6O5kuLUCLR8JfoqD51MHGQ8n/DzWQZOMoxrmpSVqkLDI4 38DM4X0DqrjpCngoWAac0T1sMlrG9zxPMFyLdqIUbg8vOLQT33NgNNMOINpDsGMb30 gf8IqxrBtn+TCIoCcvWt0ckL4a8jyjq6X6nKV7W655TI1kQcKHqwJJLGQFyGLWL0hh lRhF2E89+UNnQ== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWP-28PO; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Sean Anderson , linux-kernel@vger.kernel.org Subject: [PATCH v3 24/33] scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13 Date: Tue, 8 Apr 2025 18:09:27 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8"

- passing count to str.replace() as a keyword argument is accepted only since Python 3.13;
- before Python 3.12, f-string dict arguments can't use the same quote delimiter as the main string.
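A short sketch of the portable spellings. To my knowledge, str.replace() only accepts count as a keyword since Python 3.13, and reusing the outer quote inside an f-string expression only became legal with PEP 701 in Python 3.12. The sample strings are made up, and the stdlib re.sub() is used here in place of the Re wrapper from kdoc_re.py that the diff uses.

```python
import re

proto = "long sys_foo, int a, int b)"

# Portable: re.sub() has accepted count as a keyword for a long time
fixed = re.sub(",", "(", proto, count=1)

# Also portable: str.replace() with count passed positionally
same = proto.replace(",", "(", 1)

args = {"function": "foo"}

# Portable: the inner quotes differ from the f-string's own delimiter.
# f'.B "{args['function']}"' would be a SyntaxError on older versions.
line = f'.B "{args["function"]}"'

print(fixed)
print(same == fixed)
print(line)
```

Note that the pattern passed to re.sub() is a regular expression, so a literal ")" has to be escaped, as the diff does with r'\)'.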
Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_output.py | 8 ++++---- scripts/lib/kdoc/kdoc_parser.py | 4 ++-- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index 7f84bf12f1e1..e0ed79e4d985 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -647,16 +647,16 @@ class ManFormat(OutputFormat): sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', {}) =20 - self.data +=3D f'.TH "{args['function']}" 9 "{args['function']}" "= {self.man_date}" "Kernel Hacker\'s Manual" LINUX' + "\n" + self.data +=3D f'.TH "{args["function"]}" 9 "{args["function"]}" "= {self.man_date}" "Kernel Hacker\'s Manual" LINUX' + "\n" =20 self.data +=3D ".SH NAME\n" self.data +=3D f"{args['function']} \\- {args['purpose']}\n" =20 self.data +=3D ".SH SYNOPSIS\n" if args.get('functiontype', ''): - self.data +=3D f'.B "{args['functiontype']}" {args['function']= }' + "\n" + self.data +=3D f'.B "{args["functiontype"]}" {args["function"]= }' + "\n" else: - self.data +=3D f'.B "{args['function']}' + "\n" + self.data +=3D f'.B "{args["function"]}' + "\n" =20 count =3D 0 parenth =3D "(" @@ -697,7 +697,7 @@ class ManFormat(OutputFormat): sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', {}) =20 - self.data +=3D f'.TH "{args['module']}" 9 "enum {args['enum']}" "{= self.man_date}" "API Manual" LINUX' + "\n" + self.data +=3D f'.TH "{args["module"]}" 9 "enum {args["enum"]}" "{= self.man_date}" "API Manual" LINUX' + "\n" =20 self.data +=3D ".SH NAME\n" self.data +=3D f"enum {args['enum']} \\- {args['purpose']}\n" diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index dcb9515fc40b..e48ed128ca04 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -1444,9 +1444,9 @@ class KernelDoc: =20 r =3D Re(r'long\s+(sys_.*?),') if r.search(proto): - proto =3D proto.replace(',', 
'(', count=3D1) + proto =3D Re(',').sub('(', proto, count=3D1) elif is_void: - proto =3D proto.replace(')', '(void)', count=3D1) + proto =3D Re(r'\)').sub('(void)', proto, count=3D1) =20 # Now delete all of the odd-numbered commas in the proto # so that argument types & names don't have a comma between them --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D3F67267712; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=t0fNUxdRX4AwJddGz/EyoC0+OqjlIc+O0sdWOmHevGNQLiLlJYDwg47fBZxO/WQvN8Hb8v37DU565DrfDQeKEUwpFGOAKPCn+VgVpUnt05BS7fUUwQ1soH9ahgMQcYEZS2dSh3IIoPF9WBm4iNug8GwVuQGBKpv8mzl01f8tHQ4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=YknnQNXKSY9zOMID4CuZkIJaoPwVVvWlyNEsLX0P8r8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=t3im8NncTwFNEwEWdCdEPvQINVsCcVx/sCNK04VrWiP/uJITgd5hoq9lW4cwWFXBu733CJqdEWGz2WLXoBntIX+Dc9MKxWKEkPrzEu4IuuEtCE+3rKDlpXsPy4wdC3oHJN1h6WTKlOkUI39GxEo9ZGOjJtQUGj9dJbB2rzWQw2U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=g2G+KxNA; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="g2G+KxNA" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6E812C4CEF9; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=YknnQNXKSY9zOMID4CuZkIJaoPwVVvWlyNEsLX0P8r8=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=g2G+KxNABV6Nfy9T38lDVc5vC5/3iAXlzZaC5GsDDVeumyylarlckWZIKMhRk+EyO kSItWJ/d+G7OBFcRQ8owgmQUEhhLepX2nYiudTLysUb0L4BeM7ypo/q9SZwTadmVT3 ZNIZtEGjLU6kUYvEyOYm8exKaLeLZYC9fZTWqkoUBiyLsqEqTdRFFbS1FW1SEvMZMZ +qNi0JDAw6DQ0u7b6ioaxm5/EhwREHoxf5xqJGRA1F3Z86MR8vS3cQX5op3SoE9FAa jeHzQMbKh5mOsr49/0doJlGDppFfsC2PcCvzjPBuJg3von5EM2QFTryBt4VIeZv37D QdQ9CK1iM8jFw== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWS-2DvZ; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Sean Anderson , linux-kernel@vger.kernel.org Subject: [PATCH v3 25/33] scripts/kernel-doc.py: move modulename to man class Date: Tue, 8 Apr 2025 18:09:28 +0800 Message-ID: <583085e3885b0075d16ef9961b4f2ad870f30a55.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Only man output requires a modulename. Move its definition to the man class. 
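The move can be pictured with a simplified sketch. These are toy classes rather than the real kdoc_output.py code, and header() is a hypothetical helper added only to show where the module name ends up: the man formatter owns it, so the parser no longer has to pass it along with every declaration.

```python
class OutputFormat:
    """Toy base class: knows nothing about module names."""

    def __init__(self):
        self.data = ""


class ManFormat(OutputFormat):
    """Toy man-page formatter that owns the module name."""

    def __init__(self, modulename):
        super().__init__()
        self.modulename = modulename

    def header(self, title, date):
        """Hypothetical helper: build a .TH line like the ones in the diff."""
        return f'.TH "{self.modulename}" 9 "{title}" "{date}" "API Manual" LINUX\n'


# "Kernel API" is the argparse default that the patch sets for --modulename
fmt = ManFormat(modulename="Kernel API")
print(fmt.header("enum foo", "April 2025"), end="")
```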
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 6 +++--- scripts/lib/kdoc/kdoc_files.py | 6 +----- scripts/lib/kdoc/kdoc_output.py | 12 ++++++------ scripts/lib/kdoc/kdoc_parser.py | 9 +-------- 4 files changed, 11 insertions(+), 22 deletions(-) diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index eca7e34f9d03..6a6bc81efd31 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -186,6 +186,7 @@ def main(): help=3D"Enable debug messages") =20 parser.add_argument("-M", "-modulename", "--modulename", + default=3D"Kernel API", help=3D"Allow setting a module name at the output.= ") =20 parser.add_argument("-l", "-enable-lineno", "--enable_lineno", @@ -273,7 +274,7 @@ def main(): logger.addHandler(handler) =20 if args.man: - out_style =3D ManFormat() + out_style =3D ManFormat(modulename=3Dargs.modulename) elif args.none: out_style =3D None else: @@ -282,8 +283,7 @@ def main(): kfiles =3D KernelFiles(verbose=3Dargs.verbose, out_style=3Dout_style, werror=3Dargs.werror, wreturn=3Dargs.wreturn, wshort_desc=3Dargs.wshort= _desc, - wcontents_before_sections=3Dargs.wcontents_before= _sections, - modulename=3Dargs.modulename) + wcontents_before_sections=3Dargs.wcontents_before= _sections) =20 kfiles.parse(args.files, export_file=3Dargs.export_file) =20 diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py index e2221db7022a..5a6e92e34d05 100644 --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -126,7 +126,7 @@ class KernelFiles(): def __init__(self, verbose=3DFalse, out_style=3DNone, werror=3DFalse, wreturn=3DFalse, wshort_desc=3DFalse, wcontents_before_sections=3DFalse, - logger=3DNone, modulename=3DNone): + logger=3DNone): """ Initialize startup variables and parse all files """ @@ -134,9 +134,6 @@ class KernelFiles(): if not verbose: verbose =3D bool(os.environ.get("KBUILD_VERBOSE", 0)) =20 - if not modulename: - modulename =3D "Kernel API" - if out_style is None: out_style =3D 
OutputFormat() =20 @@ -168,7 +165,6 @@ class KernelFiles(): self.config.wreturn =3D wreturn self.config.wshort_desc =3D wshort_desc self.config.wcontents_before_sections =3D wcontents_before_sections - self.config.modulename =3D modulename =20 self.config.function_table =3D set() self.config.source_map =3D {} diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index e0ed79e4d985..8be69245c0d0 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -586,7 +586,7 @@ class ManFormat(OutputFormat): ) blankline =3D "" =20 - def __init__(self): + def __init__(self, modulename): """ Creates class variables. =20 @@ -595,6 +595,7 @@ class ManFormat(OutputFormat): """ =20 super().__init__() + self.modulename =3D modulename =20 dt =3D datetime.now() if os.environ.get("KBUILD_BUILD_TIMESTAMP", None): @@ -626,14 +627,13 @@ class ManFormat(OutputFormat): self.data +=3D line + "\n" =20 def out_doc(self, fname, name, args): - module =3D args.get('module') sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', {}) =20 if not self.check_doc(name, args): return =20 - self.data +=3D f'.TH "{module}" 9 "{module}" "{self.man_date}" "AP= I Manual" LINUX' + "\n" + self.data +=3D f'.TH "{self.modulename}" 9 "{self.modulename}" "{s= elf.man_date}" "API Manual" LINUX' + "\n" =20 for section in sectionlist: self.data +=3D f'.SH "{section}"' + "\n" @@ -697,7 +697,7 @@ class ManFormat(OutputFormat): sectionlist =3D args.get('sectionlist', []) sections =3D args.get('sections', {}) =20 - self.data +=3D f'.TH "{args["module"]}" 9 "enum {args["enum"]}" "{= self.man_date}" "API Manual" LINUX' + "\n" + self.data +=3D f'.TH "{self.modulename}" 9 "enum {args["enum"]}" "= {self.man_date}" "API Manual" LINUX' + "\n" =20 self.data +=3D ".SH NAME\n" self.data +=3D f"enum {args['enum']} \\- {args['purpose']}\n" @@ -727,7 +727,7 @@ class ManFormat(OutputFormat): self.output_highlight(sections[section]) =20 def out_typedef(self, 
fname, name, args): - module =3D args.get('module') + module =3D self.modulename typedef =3D args.get('typedef') purpose =3D args.get('purpose') sectionlist =3D args.get('sectionlist', []) @@ -743,7 +743,7 @@ class ManFormat(OutputFormat): self.output_highlight(sections.get(section)) =20 def out_struct(self, fname, name, args): - module =3D args.get('module') + module =3D self.modulename struct_type =3D args.get('type') struct_name =3D args.get('struct') purpose =3D args.get('purpose') diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index e48ed128ca04..f923600561f8 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -791,7 +791,6 @@ class KernelDoc: =20 self.output_declaration(decl_type, declaration_name, struct=3Ddeclaration_name, - module=3Dself.entry.modulename, definition=3Ddeclaration, parameterlist=3Dself.entry.parameterlist, parameterdescs=3Dself.entry.parameterdescs, @@ -869,7 +868,6 @@ class KernelDoc: =20 self.output_declaration('enum', declaration_name, enum=3Ddeclaration_name, - module=3Dself.config.modulename, parameterlist=3Dself.entry.parameterlist, parameterdescs=3Dself.entry.parameterdescs, parameterdesc_start_lines=3Dself.entry.par= ameterdesc_start_lines, @@ -1040,7 +1038,6 @@ class KernelDoc: self.output_declaration(decl_type, declaration_name, function=3Ddeclaration_name, typedef=3DTrue, - module=3Dself.config.modulename, functiontype=3Dreturn_type, parameterlist=3Dself.entry.parameterli= st, parameterdescs=3Dself.entry.parameterd= escs, @@ -1055,7 +1052,6 @@ class KernelDoc: self.output_declaration(decl_type, declaration_name, function=3Ddeclaration_name, typedef=3DFalse, - module=3Dself.config.modulename, functiontype=3Dreturn_type, parameterlist=3Dself.entry.parameterli= st, parameterdescs=3Dself.entry.parameterd= escs, @@ -1102,7 +1098,6 @@ class KernelDoc: self.output_declaration(decl_type, declaration_name, function=3Ddeclaration_name, typedef=3DTrue, - 
module=3Dself.entry.modulename, functiontype=3Dreturn_type, parameterlist=3Dself.entry.parameterli= st, parameterdescs=3Dself.entry.parameterd= escs, @@ -1130,7 +1125,6 @@ class KernelDoc: =20 self.output_declaration('typedef', declaration_name, typedef=3Ddeclaration_name, - module=3Dself.entry.modulename, sectionlist=3Dself.entry.sectionlist, sections=3Dself.entry.sections, section_start_lines=3Dself.entry.secti= on_start_lines, @@ -1619,8 +1613,7 @@ class KernelDoc: self.output_declaration("doc", self.entry.identifier, sectionlist=3Dself.entry.sectionlist, sections=3Dself.entry.sections, - section_start_lines=3Dself.entry.secti= on_start_lines, - module=3Dself.config.modulename) + section_start_lines=3Dself.entry.secti= on_start_lines) self.reset_state(ln) =20 elif doc_content.search(line): --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9664C2673A8; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=EDWInrrHuolpXx8nkS6u7FVFZnQfi8gHNMivj+wSf8IDRUsN3/wLpLQxKKzOfnUOvM4ge9HrluMg32PhRYbmaJTkTJ5l8cs9uL7IP3dTd+Ks8M7QxXOq6lJZVgT1QLwFEmyMz4wcpkxKVVdQqEcvlNtRGnQTsjyaGd++Kv4FpaE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=3BDYTh1/WNFZ05pM+UlQyBSQkq+sjQ/vD8b342dkHcE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=cUxoDF+3Ls8YVtdIwjfnJQGEHvpLkbdQoXMCAksOn/fzvrgYZg1+/4XFR0BtlwR2IUPZmfLtbdCMDFKHDMbZDq8MOfKeD/917F7VElg4J52sKBSBKt9PAGONcy0MurY6SgkSxKeiTcSQ69T/9ri4bbAa0t0c887qg1PauwSPSAU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass 
(2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=TNwaozQz; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="TNwaozQz" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3A2C9C4CEED; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=3BDYTh1/WNFZ05pM+UlQyBSQkq+sjQ/vD8b342dkHcE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TNwaozQz1eXfwIoxIgOj+N3Vr+rRFutWbt2QVglR9yv+EwvV3WitJfjr7kjWvq0hK jb15LLcItUowJ7z+bcU/atRXB16+Q07OcvFbNEIg9njPbvbL0ZA1Yma+G3iGujFmeI o4sEjgDiX9Lmqbob+3iFuqvaL20xQAe3jdATP5lTYDIAueHcWG8Te/x9BBeI8rBTU2 spWJpf9F4uvZUXHW4mxPIeBa0qcLx1xbPgpXjYIvryKiCTHKpEoiJsjerQElNYDAuk PZ3V6DJy/KR3wm0I9SqONSsBfS9KWbyw/jXtwvQ0fBsGaXS5iVKkfn1b2z1VB2+i1Q sKmNy76XNEn3A== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWV-2Jmx; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 26/33] scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP Date: Tue, 8 Apr 2025 18:09:29 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" The logic that handles KBUILD_BUILD_TIMESTAMP is wrong: it never parses the value of the variable and only converts the current time to UTC. It also adds a dependency on a third-party module (dateutil). Fix it.
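The fallback chain used for the timestamp can be tried on its own. The snippet below is a minimal sketch, assuming the format list from this patch; man_date() is a hypothetical helper name used only for illustration, not a function in kdoc_output.py.

```python
import os
from datetime import datetime

# Formats tried in order; copied from the date_formats list in this patch.
DATE_FORMATS = [
    "%a %b %d %H:%M:%S %Z %Y",
    "%a %b %d %H:%M:%S %Y",
    "%Y-%m-%d",
    "%b %d %Y",
    "%B %d %Y",
    "%m %d %Y",
]


def man_date(tstamp=None, formats=DATE_FORMATS):
    """Hypothetical helper: format tstamp as "Month Year".

    Falls back to the current date when tstamp is empty or matches
    none of the known formats.
    """
    dt = None
    if tstamp:
        for fmt in formats:
            try:
                dt = datetime.strptime(tstamp, fmt)
                break
            except ValueError:
                # Not this format; try the next one
                pass

    if not dt:
        dt = datetime.now()

    return dt.strftime("%B %Y")


# Same lookup as in the patch: unset or unparsable values use today's date
print(man_date(os.environ.get("KBUILD_BUILD_TIMESTAMP")))
print(man_date("2025-04-08"))
```

The month name produced by "%B" depends on the active locale, so the exact output string may vary between build hosts.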
Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_files.py | 9 --------- scripts/lib/kdoc/kdoc_output.py | 28 +++++++++++++++++++++------- 2 files changed, 21 insertions(+), 16 deletions(-) diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py index 5a6e92e34d05..e52a6d05237e 100644 --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -13,9 +13,6 @@ import logging import os import re import sys -from datetime import datetime - -from dateutil import tz =20 from kdoc_parser import KernelDoc from kdoc_output import OutputFormat @@ -137,12 +134,6 @@ class KernelFiles(): if out_style is None: out_style =3D OutputFormat() =20 - dt =3D datetime.now() - if os.environ.get("KBUILD_BUILD_TIMESTAMP", None): - # use UTC TZ - to_zone =3D tz.gettz('UTC') - dt =3D dt.astimezone(to_zone) - if not werror: kcflags =3D os.environ.get("KCFLAGS", None) if kcflags: diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index 8be69245c0d0..eb013075da84 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -19,8 +19,6 @@ import os import re from datetime import datetime =20 -from dateutil import tz - from kdoc_parser import KernelDoc, type_param from kdoc_re import Re =20 @@ -586,6 +584,15 @@ class ManFormat(OutputFormat): ) blankline =3D "" =20 + date_formats =3D [ + "%a %b %d %H:%M:%S %Z %Y", + "%a %b %d %H:%M:%S %Y", + "%Y-%m-%d", + "%b %d %Y", + "%B %d %Y", + "%m %d %Y", + ] + def __init__(self, modulename): """ Creates class variables. 
@@ -597,11 +604,18 @@ class ManFormat(OutputFormat): super().__init__() self.modulename =3D modulename =20 - dt =3D datetime.now() - if os.environ.get("KBUILD_BUILD_TIMESTAMP", None): - # use UTC TZ - to_zone =3D tz.gettz('UTC') - dt =3D dt.astimezone(to_zone) + dt =3D None + tstamp =3D os.environ.get("KBUILD_BUILD_TIMESTAMP") + if tstamp: + for fmt in self.date_formats: + try: + dt =3D datetime.strptime(tstamp, fmt) + break + except ValueError: + pass + + if not dt: + dt =3D datetime.now() =20 self.man_date =3D dt.strftime("%B %Y") =20 --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C3983267705; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=q+agoVV3aQr2iZGgwngIOEe9j8XZN/XWYXN8GQqEfcEVFBlA7PyrioMdoQpXbuJ2KSDT3pXUGiOx43tQu7PQIPpOvDzvg59yafhMCd91l3zHuZkxXMzpJguMS71PsU4A/tG3hI/HIftFDtOLgSiUtd5Eo4C3UvZyR2dnwxePSTI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=qRmSVZIAGM1/RGxEsAh4Fq+fWp94UwK53McyfcPS6fE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=O2KeKNOdWNQIAfm+ef1bH94N8TE3fNzol2pcvK338d5JKWULsiGGnoZlWhAepy2qPp8yGAs5UiMwkXLcqnqRZXslq7Nyw8K38myFDC4vzIn+Bvh0yuM1YegAGIELkOR2AerqBmKcAIgNAKU8PfzptLJtLAtqZX3yu4VDKxpkk6A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pOHaSnkx; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pOHaSnkx" Received: by 
smtp.kernel.org (Postfix) with ESMTPSA id 9EDBCC4AF0B; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=qRmSVZIAGM1/RGxEsAh4Fq+fWp94UwK53McyfcPS6fE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pOHaSnkxciH1EXpW2XiFYKBr4vj3uciw3tr3/6z80UYYvaLO6unq7zmlL11EWBPP5 P6mEbZ0pulR2BW+kR3mo9B27DWsh/8yL6XkqkzWsY8SwN0zmlIRDYsf4d+9Y92fRQ/ /LMpsiPcuoHX2ArXcushrwGCoQVoYh3PDyVXLc3CY/UwEzR0UEZBPe/gzTANDzTELd F2uEgBpE19bc2yr3tE2l+vrU1dLQF4KF7ooLArVETNIXQh6R+gbcBafyARxRuDL6tg NJE1fgpQ7b6drvj2lzEie0kEHURccbMpEtIm7q/uAd+zQBDxTGB26yvQBitgp8z6Bz wE9E6gX6/oDzg== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWY-2Pkg; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Sean Anderson , linux-kernel@vger.kernel.org Subject: [PATCH v3 27/33] scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency Date: Tue, 8 Apr 2025 18:09:30 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" str.removesuffix() was added in Python 3.9. As the line is already known to end with a backslash, rstrip() gives the same result whenever there is a single trailing backslash (note that it strips every trailing backslash, while removesuffix() removes only one). It is also shorter. So, use it.
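The difference between the two calls only shows up when a line ends with more than one backslash. The snippet below is a small illustration, not part of the patch; strip_continuation() is a hypothetical helper showing one way to drop exactly one trailing character on Python versions older than 3.9.

```python
def strip_continuation(line):
    """Hypothetical helper: drop exactly one trailing backslash, if any."""
    return line[:-1] if line.endswith("\\") else line


single = "int foo(int a, \\"      # one trailing backslash
double = "#define SEP \\\\"       # two trailing backslashes

for text in (single, double):
    # rstrip() removes *all* trailing backslashes;
    # strip_continuation() removes only the last one.
    print(repr(text.rstrip("\\")), repr(strip_continuation(text)))
```

For the prototype continuation lines handled by the parser, a single trailing backslash is the expected case, so both approaches agree there.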
Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_parser.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index f923600561f8..77e8bfeccc8e 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -1641,7 +1641,7 @@ class KernelDoc: # Group continuation lines on prototypes if self.state =3D=3D self.STATE_PROTO: if line.endswith("\\"): - prev +=3D line.removesuffix("\\") + prev +=3D line.rstrip("\\") cont =3D True =20 if not prev_ln: --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C93FA26770C; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=S/9QI85oCRPB8NPvRTrLCpzFLYFjkwb7btBIIWS08UwM0U4bxHRO55kKwBCb71BF19m/zQrXIge0RjMidskNvSFteNiBgN2d17DGMEA2sIuazWk/A6H9h/X9Ja3wnNpI7AArq2fPiQOAJjm8gpWNTTLNnHIgGaIM/9RwTSbpTjE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=lPtO1RqP5P/ODzspQp2bbNZnI7e6MrRf9tkdCuK9ZV8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jCaiIFgGk18e+04yCUY1jQ8qfBJir9CDpDM6+N5vf53RIplTDMYI6SsctpSjzxvq6whxPozKNMDKaJAbXDg0+62iVSkuV0o9bFLsQ0wCSxunYvFq7XT3w5kEmw+JNkWzfRnvoGbLu+DlZ8cVzFaPSTmNG3MKNQP+OnQXm5kPBN8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=UkxOToIJ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org 
header.b="UkxOToIJ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A2CA1C4CEEF; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=lPtO1RqP5P/ODzspQp2bbNZnI7e6MrRf9tkdCuK9ZV8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UkxOToIJclfShZ90nFctm4gtpZLshGPh16Wso5iAg9o5NSYP2ZF8KD7iVIAFI7y2a +qTEQVBJQZVKrsvZCAwjfvSVJH08yNDV73fOFpVDLLe3MnYA22PaXD5AxHq6ILKiyg +FS3zK5/BvOTXf8Y7Ll5WmzBrXTvG3yHxdp+1ziVJTup4y1yVmhM7lj+h3IaJJswUM nB1f6rHUGf76eDQXhdN2TrIQ9gev2a/T0sFyslLLwLOKxlniDljpIfCM86tql+yS3q GLM2xr4ZKR/pgklX5hM9OIKKXaKAKBQs/Qpg2tuNBT5U7N3VAZ9l1I/3DAi9WybVuk 23N/mlpceHShA== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWb-2Vh2; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Sean Anderson , linux-kernel@vger.kernel.org Subject: [PATCH v3 28/33] scripts/kernel-doc.py: Properly handle Werror and exit codes Date: Tue, 8 Apr 2025 18:09:31 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" The original kernel-doc script has logic to treat warnings as errors and, when in verbose mode, to report the number of warnings found. Implement it to be fully compatible with the original script.
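The order of the checks matters for the final exit status. The snippet below is a minimal sketch that mirrors the sequence added to main() in this patch as a side-effect-free function; exit_status() and its return convention are hypothetical, chosen only so the decision table is easy to inspect.

```python
def exit_status(error_count, werror=False, verbose=False, none=False):
    """Hypothetical helper: return (status, message) in the patch's order.

    status is what would be handed to sys.exit(); message is what
    would be printed first, or None when nothing is printed.
    """
    if not error_count:
        return 0, None

    # -Werror: every warning counts as an error
    if werror:
        return error_count, f"{error_count} warnings as errors"

    message = f"{error_count} errors" if verbose else None

    # "none" output mode reports success even when warnings were seen
    if none:
        return 0, message

    return error_count, message


print(exit_status(0))
print(exit_status(3, werror=True))
print(exit_status(3, verbose=True, none=True))
print(exit_status(3))
```

One caveat, stated as an observation rather than as part of the patch: POSIX process exit statuses are truncated to 8 bits, so a caller that passes a large count straight to sys.exit() may want to clamp it first.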
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 18 ++++++++++++++++-- scripts/lib/kdoc/kdoc_files.py | 12 ++++++++++-- scripts/lib/kdoc/kdoc_output.py | 8 +++----- scripts/lib/kdoc/kdoc_parser.py | 15 ++++++--------- 4 files changed, 35 insertions(+), 18 deletions(-) diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index 6a6bc81efd31..2f2fad813024 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -78,8 +78,6 @@ # Yacine Belkadi # Yujie Liu =20 -# TODO: implement warning filtering - """ kernel_doc =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D @@ -295,6 +293,22 @@ def main(): if msg: print(msg) =20 + error_count =3D kfiles.errors + if not error_count: + sys.exit(0) + + if args.werror: + print(f"{error_count} warnings as errors") + sys.exit(error_count) + + if args.verbose: + print(f"{error_count} errors") + + if args.none: + sys.exit(0) + + sys.exit(error_count) + =20 # Call main method if __name__ =3D=3D "__main__": diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py index e52a6d05237e..182d9ed58a72 100644 --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -12,7 +12,6 @@ import argparse import logging import os import re -import sys =20 from kdoc_parser import KernelDoc from kdoc_output import OutputFormat @@ -109,7 +108,7 @@ class KernelFiles(): KernelDoc.process_export(self.config.function_table, l= ine) =20 except IOError: - print(f"Error: Cannot open fname {fname}", fname=3Dsys.stderr) + self.config.log.error("Error: Cannot open fname %s", fname) self.config.errors +=3D 1 =20 def file_not_found_cb(self, fname): @@ -262,3 +261,12 @@ class KernelFiles(): fname, ln, dtype) if msg: yield fname, msg + + @property + def errors(self): + """ + Return a count of the number of warnings found, including + the ones displayed while interacting over self.msg. 
+ """ + + return self.config.errors diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index eb013075da84..e9b4d0093084 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -128,11 +128,9 @@ class OutputFormat: =20 warnings =3D args.get('warnings', []) =20 - for warning, log_msg in warnings: - if warning: - self.config.log.warning(log_msg) - else: - self.config.log.info(log_msg) + for log_msg in warnings: + self.config.log.warning(log_msg) + self.config.errors +=3D 1 =20 def check_doc(self, name, args): """Check if DOC should be output""" diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index 77e8bfeccc8e..43e6ffbdcc2c 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -137,17 +137,18 @@ class KernelDoc: =20 log_msg =3D f"{self.fname}:{ln} {msg}" =20 + if not warning: + self.config.log.info(log_msg) + return + if self.entry: # Delegate warning output to output logic, as this way it # will report warnings/info only for symbols that are output =20 - self.entry.warnings.append((warning, log_msg)) + self.entry.warnings.append(log_msg) return =20 - if warning: - self.config.log.warning(log_msg) - else: - self.config.log.info(log_msg) + self.config.log.warning(log_msg) =20 def dump_section(self, start_new=3DTrue): """ @@ -556,7 +557,6 @@ class KernelDoc: =20 if not members: self.emit_warning(ln, f"{proto} error: Cannot parse struct or = union!") - self.config.errors +=3D 1 return =20 if self.entry.identifier !=3D declaration_name: @@ -831,7 +831,6 @@ class KernelDoc: =20 if not members: self.emit_warning(ln, f"{proto}: error: Cannot parse enum!") - self.config.errors +=3D 1 return =20 if self.entry.identifier !=3D declaration_name: @@ -1132,7 +1131,6 @@ class KernelDoc: return =20 self.emit_warning(ln, "error: Cannot parse typedef!") - self.config.errors +=3D 1 =20 @staticmethod def process_export(function_table, line): @@ -1677,4 +1675,3 @@ 
class KernelDoc: self.process_docblock(ln, line) except OSError: self.config.log.error(f"Error: Cannot open file {self.fname}") - self.config.errors +=3D 1 --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 69A60266EEC; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=ilPN4DCWEqIaHmReB+kmX2W2jXHhW2J5Fe7vQ4IXobXJrE3RpXtPjPpBQClCZoAHQPg2WI5exnuK+IgAXE2uhobJSsBDs2VmjRY0jr9PQDEbCFjuzoxinSWtAfHEOLtaIhbyWukni2LeOjgW0B+3otYqCLQ54f8oQdqq82o3GoM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=8mtu7ff1BYjYpQYtuu/yi0zDvECr4GwUYx9p5Sr6akM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=m5VSW8rkOO/QPXQ6LdAkxOgTnrfO6E10J3ZOM4OGCxo5BbODc+uuLQtlUILLfomkEa3ofhWiilPyfXe+NzfENLIvJh8eXtP8EStZE0UtqSoH4FxII1XEoE50Y7ltqNr7ZUBvQY656Scobt0ahNqugm264weCDMbCi0TrBS+LL5M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=OmEQ1D2f; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="OmEQ1D2f" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 43C07C4CEEB; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; bh=8mtu7ff1BYjYpQYtuu/yi0zDvECr4GwUYx9p5Sr6akM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OmEQ1D2f6IOfetUNNJwiOwfQADQuSAq4bwssGOfyBOWFTqc2vBKTSwR7xwl2r4ePg 
iRxR9HFjxY5uE0ugC0+CiIRXuaEQ988594i/e9cSiDftlRedLFfJ+x7KwyPHa3wJQ0 QzaXsGhmMyHZBHA7EyRdFdka/lmFXq5GfB/iEHFj/plHaOB7NWMba++NzcCdq6tNEW Q0YcxwEb9KeZnuj4LW0yBfPlQZW7HxWZfVJ7pAuQy73A/eRW8X9wxoxOLwv6qaQaR2 WVpoqKKFJpVWHvtiDl9cNdJevV4DHba+VpE9QIgSni7mGD5Y32RsFeYCIG0AJ62Gwo B6toKZ0vU0h2g== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWe-2bvq; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 29/33] scripts/kernel-doc: switch to use kernel-doc.py Date: Tue, 8 Apr 2025 18:09:32 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Now that all features are in place, change the kernel-doc alias to point to kernel-doc.py. 
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/kernel-doc b/scripts/kernel-doc index f175155c1e66..3b6ef807791a 120000 --- a/scripts/kernel-doc +++ b/scripts/kernel-doc @@ -1 +1 @@ -kernel-doc.pl \ No newline at end of file +kernel-doc.py \ No newline at end of file --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 55A8F266B79; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; cv=none; b=FA91j/QZogALv4TT0JS03P0bwiIn+RpjVAguWwsIRXTNV12jDZeqQmJzQHu06LGpSUBq4Ziol/agYY7ou+MYfQ3WSjekknzkmAwYaE2ofxfUtON0vkqDOqz86sEsIXAHBN7TlrTYc1dQBvu8cG5Wg8ZoxYyFF7o4qPO653SlyqM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106996; c=relaxed/simple; bh=aBedWHeAMyQ8zeUoubzCtMfcMsOOxpIGXbGY+jZZDSw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=C5HP1awks00amMWLlc2KAlqymO55XwAbr7/hxusHXpF7ofP9jdPs4TH3MraaZqPF/94rj2gPQOxHyWNoWB9Ut8urmsclMq7IuVpylVtvgRSo/m/b58NWS8vrwXNc9D4V5cugi+aDVTOPGbpZw8DMO3LTzCdrnEGJdCD652DzkPE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JEYCwj16; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="JEYCwj16" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 34236C4CEE5; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106996; 
bh=aBedWHeAMyQ8zeUoubzCtMfcMsOOxpIGXbGY+jZZDSw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JEYCwj16pty+1zVFLjGUgd172Bt2qI8nOm+muagoXOU2g9SMYKI+u7zA+Ycm2EYdJ 1eElT+BxpHWXLBY/CMsI9Ir/os6R6Jn0eOl5NPBF/ldycTFFC9T7lBuM/IVmG6k/5M 7wMnowfKCOPVvAIYCA7kTH41xUX0ATszG9uiCqCTcw8r4K51uVx0nH5eMb8lDcnAss AJ+bzO0fmmHh1WfFPHcOSHwXBglC//mC8Tsdm8A22ik7BfEpGk0bdCgOmXWhwQbkUS T0KGshfgOBSU/V+IB+ghPWkC6ssa/8DeLOboAWiBwGDCyDRqPX4FP1414QSOvl8NS2 QG5NcBC7+XTxA== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWh-2he4; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: [PATCH v3 30/33] scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname Date: Tue, 8 Apr 2025 18:09:33 +0800 Message-ID: <9f5c0ff2568f34532ca99465fb378241d831d39f.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" For the kerneldoc Sphinx extension, it is useful to display the parsed results of a single file only. Change the logic at KernelFiles.msg() to allow such usage.
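The change amounts to keeping the results in a dict keyed by file name and letting the caller pick which keys to walk. The snippet below is a simplified sketch of that idea under stated assumptions: iter_messages() is a hypothetical stand-in for KernelFiles.msg(), and each entry already holds its output text, while the real code builds it through out_msg().

```python
def iter_messages(results, filenames=None):
    """Hypothetical stand-in: yield (fname, text) for the selected files.

    results maps a file name to a list of (name, text) entries.
    When filenames is empty, all files are walked in sorted order.
    """
    if not filenames:
        filenames = sorted(results.keys())

    for fname in filenames:
        msg = "".join(text for _name, text in results[fname])
        if msg:
            yield fname, msg


# Illustrative data only; these file and symbol names are made up
results = {
    "b.c": [("bar", ".. c:function:: bar\n")],
    "a.c": [("foo", ".. c:function:: foo\n"), ("baz", "")],
}

print(list(iter_messages(results)))
print(list(iter_messages(results, filenames=["b.c"])))
```

A caller such as a Sphinx directive could parse a set of files once and then request the output of one file at a time.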
Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_files.py | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py index 182d9ed58a72..527ab9117268 100644 --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -95,7 +95,7 @@ class KernelFiles(): doc =3D KernelDoc(self.config, fname) doc.run() =20 - return doc + return doc.entries =20 def process_export_file(self, fname): """ @@ -173,7 +173,7 @@ class KernelFiles(): # Initialize internal variables =20 self.config.errors =3D 0 - self.results =3D [] + self.results =3D {} =20 self.files =3D set() self.export_files =3D set() @@ -189,13 +189,9 @@ class KernelFiles(): # avoid reporting errors multiple times =20 for fname in glob.parse_files(file_list, self.file_not_found_cb): - if fname in self.files: - continue - - res =3D self.parse_file(fname) - - self.results.append((res.fname, res.entries)) - self.files.add(fname) + if fname not in self.files: + self.results[fname] =3D self.parse_file(fname) + self.files.add(fname) =20 # If a list of export files was provided, parse EXPORT_SYMBOL* # from files that weren't fully parsed @@ -226,7 +222,8 @@ class KernelFiles(): return self.out_style.msg(fname, name, arg) =20 def msg(self, enable_lineno=3DFalse, export=3DFalse, internal=3DFalse, - symbol=3DNone, nosymbol=3DNone, no_doc_sections=3DFalse): + symbol=3DNone, nosymbol=3DNone, no_doc_sections=3DFalse, + filenames=3DNone): """ Interacts over the kernel-doc results and output messages, returning kernel-doc markups on each interaction @@ -248,9 +245,12 @@ class KernelFiles(): function_table, enable_lineno, no_doc_sections) =20 - for fname, arg_tuple in self.results: + if not filenames: + filenames =3D sorted(self.results.keys()) + + for fname in filenames: msg =3D "" - for name, arg in arg_tuple: + for name, arg in self.results[fname]: msg +=3D self.out_msg(fname, name, arg) =20 if msg is None: 
--=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 76B90267B07; Tue, 8 Apr 2025 10:09:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106997; cv=none; b=QHRPMo0tiZVWM1VDfRYJ2niDRwF6L1C6NNTMXQk5JU/qd8JWKnWHm5jXrVl0wkI+CTZfotMuJvnUkytfRm56ItIWheZBWT1Bj2uWSkJ+wGIFv5PBryI25NnIRzhoMN9ju1jg6v9kIbwm5TbJUMdX2zDGmApcCI4vfYLDcOWI6aQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106997; c=relaxed/simple; bh=fECQ0wT3cwyILMq52vt+FKrJbJo2Gu0emu+5UkOVdgM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Bgm6n0IR67XbAYdw7QZx7V27Z/5FoMFhRzCR5HtkolAN89nG2D3pCs2idgsyV1E9sXQhdwNRefi1voLwKCDMo5uh+4QvH6JxksO6rqd4z4q4yRDlv6CLfcecEmYF9pEPeH5vLdpwXkW53oXOe2/yWBQK9Em+vO9vw24oe9ma46M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=VVtGygnS; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="VVtGygnS" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9F90CC4AF0D; Tue, 8 Apr 2025 10:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106997; bh=fECQ0wT3cwyILMq52vt+FKrJbJo2Gu0emu+5UkOVdgM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VVtGygnSL21S0Bb+6WFI3FSyVeXdlkaF4bB6gh8jFc3yLM/+Uv4zPZD/oSgzwXl/L TSjXMfLCi+MUcIubTqXDSCLySusbS9xFeL+tgzHo883+J/PtVak5yLqpmWVPV8Fgze +mvbFuupjjadPYwNFtPoR/W381sapvryTeUQfVTQDbOl9b1lu9lH8TQInyGxU3j70J 
G4ovXGNaiqyPD2PvNWYARsPbJLS3W8iz+ibCYVdehk3Qvrtl7GG/1Vqqpg+LWFwomt c82a5k0ESFEvB/4mrypW7h9+SdHrTzHFL8VazOMXxeuxUAwAIOb3GYodMmEFZYiwTK 6tRH5gV1EVoSg== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWk-2nBW; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , Sean Anderson , linux-kernel@vger.kernel.org Subject: [PATCH v3 31/33] scripts/kernel_doc.py: better handle exported symbols Date: Tue, 8 Apr 2025 18:09:34 +0800 Message-ID: <6a69ba8d2b7ee6a6427abb53e60d09bd4d3565ee.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Change the logic that detects internal/external symbols so that it can be reused when called via the Sphinx extension. While here, remove an unused self.config var and make it clearer that self.config variables are read-only. This helps to allow handling multiple files in parallel if ever needed.
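With exports stored per file, the set of symbols used for filtering can be rebuilt for each file that is output. The snippet below is a minimal sketch of that step, assuming an export_table dict of the shape introduced by this patch (file name to set of exported symbols); build_function_table() is a hypothetical helper, not a method of KernelFiles.

```python
def build_function_table(export_table, fname, export_file=None,
                         symbol=None, export=False, internal=False):
    """Hypothetical helper: collect the symbols used to filter fname.

    export_table maps a file name to the set of symbols it exports.
    """
    function_table = set()

    if internal or export:
        # Assumption: default to the file's own exports when no
        # export file list is given. A local variable keeps that
        # default per file instead of carrying it to the next one.
        files = export_file or [fname]

        for f in files:
            function_table |= export_table[f]

    if symbol:
        function_table.update(symbol)

    return function_table


# Illustrative data only; these file and symbol names are made up
export_table = {"a.c": {"foo"}, "b.c": {"bar"}}

print(build_function_table(export_table, "a.c", export=True))
print(build_function_table(export_table, "a.c", export_file=["b.c"],
                           internal=True))
```

Whether the per-file default used here matches the behaviour intended by the patch is an assumption; the sketch only illustrates how the table is assembled from the per-file export sets.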
Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.py | 2 +- scripts/lib/kdoc/kdoc_files.py | 142 +++++++++++++++++--------------- scripts/lib/kdoc/kdoc_output.py | 9 +- scripts/lib/kdoc/kdoc_parser.py | 52 ++++++++++-- 4 files changed, 125 insertions(+), 80 deletions(-) diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py index 2f2fad813024..12ae66f40bd7 100755 --- a/scripts/kernel-doc.py +++ b/scripts/kernel-doc.py @@ -287,7 +287,7 @@ def main(): =20 for t in kfiles.msg(enable_lineno=3Dargs.enable_lineno, export=3Dargs.= export, internal=3Dargs.internal, symbol=3Dargs.symbol, - nosymbol=3Dargs.nosymbol, + nosymbol=3Dargs.nosymbol, export_file=3Dargs.expor= t_file, no_doc_sections=3Dargs.no_doc_sections): msg =3D t[1] if msg: diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py index 527ab9117268..dd003feefd1b 100644 --- a/scripts/lib/kdoc/kdoc_files.py +++ b/scripts/lib/kdoc/kdoc_files.py @@ -68,6 +68,9 @@ class GlobSourceFiles: handling directories if any """ =20 + if not file_list: + return + for fname in file_list: if self.srctree: f =3D os.path.join(self.srctree, fname) @@ -84,40 +87,70 @@ class GlobSourceFiles: =20 class KernelFiles(): """ - Parse lernel-doc tags on multiple kernel source files. + Parse kernel-doc tags on multiple kernel source files. + + There are two type of parsers defined here: + - self.parse_file(): parses both kernel-doc markups and + EXPORT_SYMBOL* macros; + - self.process_export_file(): parses only EXPORT_SYMBOL* macros. """ =20 + def warning(self, msg): + """Ancillary routine to output a warning and increment error count= """ + + self.config.log.warning(msg) + self.errors +=3D 1 + + def error(self, msg): + """Ancillary routine to output an error and increment error count"= "" + + self.config.log.error(msg) + self.errors +=3D 1 + def parse_file(self, fname): """ Parse a single Kernel source. 
""" =20 + # Prevent parsing the same file twice if results are cached + if fname in self.files: + return + doc =3D KernelDoc(self.config, fname) - doc.run() + export_table, entries =3D doc.parse_kdoc() =20 - return doc.entries + self.export_table[fname] =3D export_table + + self.files.add(fname) + self.export_files.add(fname) # parse_kdoc() already check exp= orts + + self.results[fname] =3D entries =20 def process_export_file(self, fname): """ Parses EXPORT_SYMBOL* macros from a single Kernel source file. """ - try: - with open(fname, "r", encoding=3D"utf8", - errors=3D"backslashreplace") as fp: - for line in fp: - KernelDoc.process_export(self.config.function_table, l= ine) - - except IOError: - self.config.log.error("Error: Cannot open fname %s", fname) - self.config.errors +=3D 1 + + # Prevent parsing the same file twice if results are cached + if fname in self.export_files: + return + + doc =3D KernelDoc(self.config, fname) + export_table =3D doc.parse_export() + + if not export_table: + self.error(f"Error: Cannot check EXPORT_SYMBOL* on {fname}") + export_table =3D set() + + self.export_table[fname] =3D export_table + self.export_files.add(fname) =20 def file_not_found_cb(self, fname): """ Callback to warn if a file was not found. """ =20 - self.config.log.error("Cannot find file %s", fname) - self.config.errors +=3D 1 + self.error(f"Cannot find file {fname}") =20 def __init__(self, verbose=3DFalse, out_style=3DNone, werror=3DFalse, wreturn=3DFalse, wshort_desc=3DFalse, @@ -147,7 +180,9 @@ class KernelFiles(): if kdoc_werror: werror =3D kdoc_werror =20 - # Set global config data used on all files + # Some variables are global to the parser logic as a whole as they= are + # used to send control configuration to KernelDoc class. As such, + # those variables are read-only inside the KernelDoc. 
self.config =3D argparse.Namespace =20 self.config.verbose =3D verbose @@ -156,27 +191,25 @@ class KernelFiles(): self.config.wshort_desc =3D wshort_desc self.config.wcontents_before_sections =3D wcontents_before_sections =20 - self.config.function_table =3D set() - self.config.source_map =3D {} - if not logger: self.config.log =3D logging.getLogger("kernel-doc") else: self.config.log =3D logger =20 - self.config.kernel_version =3D os.environ.get("KERNELVERSION", - "unknown kernel versio= n'") + self.config.warning =3D self.warning + self.config.src_tree =3D os.environ.get("SRCTREE", None) =20 + # Initialize variables that are internal to KernelFiles + self.out_style =3D out_style =20 - # Initialize internal variables - - self.config.errors =3D 0 + self.errors =3D 0 self.results =3D {} =20 self.files =3D set() self.export_files =3D set() + self.export_table =3D {} =20 def parse(self, file_list, export_file=3DNone): """ @@ -185,28 +218,11 @@ class KernelFiles(): =20 glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) =20 - # Prevent parsing the same file twice to speedup parsing and - # avoid reporting errors multiple times - for fname in glob.parse_files(file_list, self.file_not_found_cb): - if fname not in self.files: - self.results[fname] =3D self.parse_file(fname) - self.files.add(fname) - - # If a list of export files was provided, parse EXPORT_SYMBOL* - # from files that weren't fully parsed - - if not export_file: - return - - self.export_files |=3D self.files - - glob =3D GlobSourceFiles(srctree=3Dself.config.src_tree) + self.parse_file(fname) =20 for fname in glob.parse_files(export_file, self.file_not_found_cb): - if fname not in self.export_files: - self.process_export_file(fname) - self.export_files.add(fname) + self.process_export_file(fname) =20 def out_msg(self, fname, name, arg): """ @@ -223,32 +239,35 @@ class KernelFiles(): =20 def msg(self, enable_lineno=3DFalse, export=3DFalse, internal=3DFalse, symbol=3DNone, nosymbol=3DNone, 
no_doc_sections=3DFalse, - filenames=3DNone): + filenames=3DNone, export_file=3DNone): """ Interacts over the kernel-doc results and output messages, returning kernel-doc markups on each interaction """ =20 - function_table =3D self.config.function_table - - if symbol: - for s in symbol: - function_table.add(s) - - # Output none mode: only warnings will be shown - if not self.out_style: - return - self.out_style.set_config(self.config) =20 - self.out_style.set_filter(export, internal, symbol, nosymbol, - function_table, enable_lineno, - no_doc_sections) - if not filenames: filenames =3D sorted(self.results.keys()) =20 for fname in filenames: + function_table =3D set() + + if internal or export: + if not export_file: + export_file =3D [fname] + + for f in export_file: + function_table |=3D self.export_table[f] + + if symbol: + for s in symbol: + function_table.add(s) + + self.out_style.set_filter(export, internal, symbol, nosymbol, + function_table, enable_lineno, + no_doc_sections) + msg =3D "" for name, arg in self.results[fname]: msg +=3D self.out_msg(fname, name, arg) @@ -261,12 +280,3 @@ class KernelFiles(): fname, ln, dtype) if msg: yield fname, msg - - @property - def errors(self): - """ - Return a count of the number of warnings found, including - the ones displayed while interacting over self.msg. 
- """ - - return self.config.errors diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index e9b4d0093084..c352b7f8d3fd 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -69,7 +69,7 @@ class OutputFormat: self.enable_lineno =3D None self.nosymbol =3D {} self.symbol =3D None - self.function_table =3D set() + self.function_table =3D None self.config =3D None self.no_doc_sections =3D False =20 @@ -94,10 +94,10 @@ class OutputFormat: =20 self.enable_lineno =3D enable_lineno self.no_doc_sections =3D no_doc_sections + self.function_table =3D function_table =20 if symbol: self.out_mode =3D self.OUTPUT_INCLUDE - function_table =3D symbol elif export: self.out_mode =3D self.OUTPUT_EXPORTED elif internal: @@ -108,8 +108,6 @@ class OutputFormat: if nosymbol: self.nosymbol =3D set(nosymbol) =20 - if function_table: - self.function_table =3D function_table =20 def highlight_block(self, block): """ @@ -129,8 +127,7 @@ class OutputFormat: warnings =3D args.get('warnings', []) =20 for log_msg in warnings: - self.config.log.warning(log_msg) - self.config.errors +=3D 1 + self.config.warning(log_msg) =20 def check_doc(self, name, args): """Check if DOC should be output""" diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index 43e6ffbdcc2c..33f00c77dd5f 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -1133,21 +1133,25 @@ class KernelDoc: self.emit_warning(ln, "error: Cannot parse typedef!") =20 @staticmethod - def process_export(function_table, line): + def process_export(function_set, line): """ process EXPORT_SYMBOL* tags =20 - This method is called both internally and externally, so, it - doesn't use self. + This method doesn't use any variable from the class, so declare it + with a staticmethod decorator. """ =20 + # Note: it accepts only one EXPORT_SYMBOL* per line, as having + # multiple export lines would violate Kernel coding style. 
+ if export_symbol.search(line): symbol =3D export_symbol.group(2) - function_table.add(symbol) + function_set.add(symbol) + return =20 if export_symbol_ns.search(line): symbol =3D export_symbol_ns.group(2) - function_table.add(symbol) + function_set.add(symbol) =20 def process_normal(self, ln, line): """ @@ -1617,17 +1621,39 @@ class KernelDoc: elif doc_content.search(line): self.entry.contents +=3D doc_content.group(1) + "\n" =20 - def run(self): + def parse_export(self): + """ + Parses EXPORT_SYMBOL* macros from a single Kernel source file. + """ + + export_table =3D set() + + try: + with open(self.fname, "r", encoding=3D"utf8", + errors=3D"backslashreplace") as fp: + + for line in fp: + self.process_export(export_table, line) + + except IOError: + return None + + return export_table + + def parse_kdoc(self): """ Open and process each line of a C source file. - he parsing is controlled via a state machine, and the line is pass= ed + The parsing is controlled via a state machine, and the line is pas= sed to a different process function depending on the state. The process function may update the state as needed. + + Besides parsing kernel-doc tags, it also parses export symbols. """ =20 cont =3D False prev =3D "" prev_ln =3D None + export_table =3D set() =20 try: with open(self.fname, "r", encoding=3D"utf8", @@ -1659,6 +1685,16 @@ class KernelDoc: self.st_inline_name[self.inline_= doc_state], line) =20 + # This is an optimization over the original script. + # There, when export_file was used for the same file, + # it was read twice. Here, we use the already-existing + # loop to parse exported symbols as well. + # + # TODO: It should be noticed that not all states are + # needed here. On a future cleanup, process export only + # at the states that aren't handling comment markups. 
+ self.process_export(export_table, line) + # Hand this line to the appropriate state handler if self.state =3D=3D self.STATE_NORMAL: self.process_normal(ln, line) @@ -1675,3 +1711,5 @@ class KernelDoc: self.process_docblock(ln, line) except OSError: self.config.log.error(f"Error: Cannot open file {self.fname}") + + return export_table, self.entries --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA4E426560E; Tue, 8 Apr 2025 10:09:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106999; cv=none; b=hyWhJQnTpJjakaLBlE7w6SBry/PLeF3QapWsx/ntzVyRRao3FB7FxbzxnyddqnZhjXLeR0UI1+JL5mdUlCTMSJm+lPf6xFAH/gkkLnR/+S+MDdZrZYmU8n9B676gHrmESn8HX71INjDw2/hI8goqnQcWaGujSKcaqjtCNftVlW0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106999; c=relaxed/simple; bh=SszucMXEdc/sou+xc8alSD8raZAiM5Q3h/x3KVgAi8Q=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=NCicaTa2QRjAk2M/YvS0EhRUwV0yofkOL0ZNsCXJoLDbQQktiKRS5D+fEWa5fm2w2NoWGcG29T3HuIiilq/y6IIwdQ1wkfEsUTnzaf5GapPDk4D+vH3Q+ZTMS/Y45ACIJ4kEa6wwDu+Rdfc4wQaF5OQQNpkiW2Z0XFBxNvJatCs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=oPtVmZJv; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="oPtVmZJv" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9FE03C4CEF4; Tue, 8 Apr 2025 10:09:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106999; 
bh=SszucMXEdc/sou+xc8alSD8raZAiM5Q3h/x3KVgAi8Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=oPtVmZJvNmTIb7w7HHWkz9iDfi6d02/FAg3jtko4psFSyMRVfVfnyC6kL0KfAnb0Q 0VIB/Xaz794TvZ/658FddqQhEvV64zboUdJEpjNFxZmZ5R5gaV7W39j3GcExZdlNRK n+E1/4RNwu36sKUHLQw21mNtj0Q3upm8Wn4sUuGB99h14gaeQo3hs0bZ9Q5grbjglk ACF3xo8+pZcY+z2P2vAlE6Owaa87fHoEaYFJRSuelF/eP6yDEXhccjKfYAqp7frlRj JYUXCaTl0Wjt579tvKtITfGLRwhoeg9/YdkSHZZC4W83q9V47wPGomZQCqFbYBD9Gy XmrZrwsj1hlfw== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWn-2ujm; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Mauro Carvalho Chehab , "Gustavo A. R. Silva" , Kees Cook , Sean Anderson , linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v3 32/33] scripts/kernel-doc.py: Rename the kernel doc Re class to KernRe Date: Tue, 8 Apr 2025 18:09:35 +0800 Message-ID: <4e095ecd5235a3e811ddcf5bad4cfb92f1da0a4a.1744106242.git.mchehab+huawei@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" Using just "Re" makes it harder to distinguish from the native "re" module. So, let's rename it.
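For readers unfamiliar with the wrapper, here is a rough sketch of how the renamed class is used next to the standard "re" module. It is a simplified stand-in written for illustration only, not the code in scripts/lib/kdoc/kdoc_re.py: the class body, the cache dictionary and the exported_name() helper are assumptions. Only the call shapes seen in this patch (the cache= and flags= arguments, "+" concatenation, search(), match(), group(), sub() and split()) are taken from the diff.

```python
import re


class KernRe:
    """Hypothetical, simplified stand-in for the class in kdoc_re.py."""

    _cache = {}  # assumption: compiled patterns shared between instances

    def __init__(self, pattern, cache=True, flags=0):
        self.pattern = pattern
        self.flags = flags
        self.last_match = None

        key = (pattern, flags)
        if cache and key in KernRe._cache:
            self.regex = KernRe._cache[key]
        else:
            self.regex = re.compile(pattern, flags)
            if cache:
                KernRe._cache[key] = self.regex

    def __add__(self, other):
        # Concatenate two patterns, as in: doc_com + KernRe(r'(\w+)')
        return KernRe(self.pattern + other.pattern, cache=False,
                      flags=self.flags | other.flags)

    def match(self, string):
        self.last_match = self.regex.match(string)
        return self.last_match

    def search(self, string):
        self.last_match = self.regex.search(string)
        return self.last_match

    def group(self, num):
        # Group from the most recent successful match()/search()
        return self.last_match.group(num)

    def sub(self, repl, string, count=0):
        return self.regex.sub(repl, string, count=count)

    def split(self, string):
        return self.regex.split(string)


# Pattern copied from kdoc_parser.py as shown in this patch
export_symbol = KernRe(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*',
                       cache=False)

doc_com = KernRe(r'\s*\*\s*', cache=False)
doc_decl = doc_com + KernRe(r'(\w+)', cache=False)


def exported_name(line):
    """Hypothetical helper: return the exported symbol on a line, if any."""
    if export_symbol.search(line):
        return export_symbol.group(2)
    return None
```

With the old name, a call such as Re(r'\s+').sub(' ', arg) differed from re.sub(r'\s+', ' ', arg) only by the case of one letter, even though the wrapper keeps match state and the module does not. KernRe(r'\s+').sub(' ', arg) makes the distinction visible at the call site.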
Signed-off-by: Mauro Carvalho Chehab --- scripts/lib/kdoc/kdoc_output.py | 50 +++--- scripts/lib/kdoc/kdoc_parser.py | 264 ++++++++++++++++---------------- scripts/lib/kdoc/kdoc_re.py | 4 +- 3 files changed, 159 insertions(+), 159 deletions(-) diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output= .py index c352b7f8d3fd..86102e628d91 100755 --- a/scripts/lib/kdoc/kdoc_output.py +++ b/scripts/lib/kdoc/kdoc_output.py @@ -20,31 +20,31 @@ import re from datetime import datetime =20 from kdoc_parser import KernelDoc, type_param -from kdoc_re import Re +from kdoc_re import KernRe =20 =20 -function_pointer =3D Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=3DFalse) +function_pointer =3D KernRe(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=3DF= alse) =20 # match expressions used to find embedded type information -type_constant =3D Re(r"\b``([^\`]+)``\b", cache=3DFalse) -type_constant2 =3D Re(r"\%([-_*\w]+)", cache=3DFalse) -type_func =3D Re(r"(\w+)\(\)", cache=3DFalse) -type_param_ref =3D Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cac= he=3DFalse) +type_constant =3D KernRe(r"\b``([^\`]+)``\b", cache=3DFalse) +type_constant2 =3D KernRe(r"\%([-_*\w]+)", cache=3DFalse) +type_func =3D KernRe(r"(\w+)\(\)", cache=3DFalse) +type_param_ref =3D KernRe(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)",= cache=3DFalse) =20 # Special RST handling for func ptr params -type_fp_param =3D Re(r"\@(\w+)\(\)", cache=3DFalse) +type_fp_param =3D KernRe(r"\@(\w+)\(\)", cache=3DFalse) =20 # Special RST handling for structs with func ptr params -type_fp_param2 =3D Re(r"\@(\w+->\S+)\(\)", cache=3DFalse) +type_fp_param2 =3D KernRe(r"\@(\w+->\S+)\(\)", cache=3DFalse) =20 -type_env =3D Re(r"(\$\w+)", cache=3DFalse) -type_enum =3D Re(r"\&(enum\s*([_\w]+))", cache=3DFalse) -type_struct =3D Re(r"\&(struct\s*([_\w]+))", cache=3DFalse) -type_typedef =3D Re(r"\&(typedef\s*([_\w]+))", cache=3DFalse) -type_union =3D Re(r"\&(union\s*([_\w]+))", cache=3DFalse) -type_member =3D 
Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=3DFalse) -type_fallback =3D Re(r"\&([_\w]+)", cache=3DFalse) -type_member_func =3D type_member + Re(r"\(\)", cache=3DFalse) +type_env =3D KernRe(r"(\$\w+)", cache=3DFalse) +type_enum =3D KernRe(r"\&(enum\s*([_\w]+))", cache=3DFalse) +type_struct =3D KernRe(r"\&(struct\s*([_\w]+))", cache=3DFalse) +type_typedef =3D KernRe(r"\&(typedef\s*([_\w]+))", cache=3DFalse) +type_union =3D KernRe(r"\&(union\s*([_\w]+))", cache=3DFalse) +type_member =3D KernRe(r"\&([_\w]+)(\.|->)([_\w]+)", cache=3DFalse) +type_fallback =3D KernRe(r"\&([_\w]+)", cache=3DFalse) +type_member_func =3D type_member + KernRe(r"\(\)", cache=3DFalse) =20 =20 class OutputFormat: @@ -257,8 +257,8 @@ class RestFormat(OutputFormat): ] blankline =3D "\n" =20 - sphinx_literal =3D Re(r'^[^.].*::$', cache=3DFalse) - sphinx_cblock =3D Re(r'^\.\.\ +code-block::', cache=3DFalse) + sphinx_literal =3D KernRe(r'^[^.].*::$', cache=3DFalse) + sphinx_cblock =3D KernRe(r'^\.\.\ +code-block::', cache=3DFalse) =20 def __init__(self): """ @@ -299,14 +299,14 @@ class RestFormat(OutputFormat): # If this is the first non-blank line in a literal blo= ck, # figure out the proper indent. 
if not litprefix: - r =3D Re(r'^(\s*)') + r =3D KernRe(r'^(\s*)') if r.match(line): litprefix =3D '^' + r.group(1) else: litprefix =3D "" =20 output +=3D line + "\n" - elif not Re(litprefix).match(line): + elif not KernRe(litprefix).match(line): in_literal =3D False else: output +=3D line + "\n" @@ -429,7 +429,7 @@ class RestFormat(OutputFormat): self.data +=3D f"{self.lineprefix}**Parameters**\n\n" =20 for parameter in parameterlist: - parameter_name =3D Re(r'\[.*').sub('', parameter) + parameter_name =3D KernRe(r'\[.*').sub('', parameter) dtype =3D args['parametertypes'].get(parameter, "") =20 if dtype: @@ -626,7 +626,7 @@ class ManFormat(OutputFormat): contents =3D "\n".join(contents) =20 for line in contents.strip("\n").split("\n"): - line =3D Re(r"^\s*").sub("", line) + line =3D KernRe(r"^\s*").sub("", line) if not line: continue =20 @@ -680,7 +680,7 @@ class ManFormat(OutputFormat): # Pointer-to-function self.data +=3D f'".BI "{parenth}{function_pointer.group(1)= }" " ") ({function_pointer.group(2)}){post}"' + "\n" else: - dtype =3D Re(r'([^\*])$').sub(r'\1 ', dtype) + dtype =3D KernRe(r'([^\*])$').sub(r'\1 ', dtype) =20 self.data +=3D f'.BI "{parenth}{dtype}" "{post}"' + "\n" count +=3D 1 @@ -727,7 +727,7 @@ class ManFormat(OutputFormat): self.data +=3D ".SH Constants\n" =20 for parameter in parameterlist: - parameter_name =3D Re(r'\[.*').sub('', parameter) + parameter_name =3D KernRe(r'\[.*').sub('', parameter) self.data +=3D f'.IP "{parameter}" 12' + "\n" self.output_highlight(args['parameterdescs'].get(parameter_nam= e, "")) =20 @@ -769,7 +769,7 @@ class ManFormat(OutputFormat): =20 # Replace tabs with two spaces and handle newlines declaration =3D definition.replace("\t", " ") - declaration =3D Re(r"\n").sub('"\n.br\n.BI "', declaration) + declaration =3D KernRe(r"\n").sub('"\n.br\n.BI "', declaration) =20 self.data +=3D ".SH SYNOPSIS\n" self.data +=3D f"{struct_type} {struct_name} " + "{" + "\n.br\n" diff --git a/scripts/lib/kdoc/kdoc_parser.py 
b/scripts/lib/kdoc/kdoc_parser= .py index 33f00c77dd5f..f60722bcc687 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -16,7 +16,7 @@ import argparse import re from pprint import pformat =20 -from kdoc_re import NestedMatch, Re +from kdoc_re import NestedMatch, KernRe =20 =20 # @@ -29,12 +29,12 @@ from kdoc_re import NestedMatch, Re # =20 # Allow whitespace at end of comment start. -doc_start =3D Re(r'^/\*\*\s*$', cache=3DFalse) +doc_start =3D KernRe(r'^/\*\*\s*$', cache=3DFalse) =20 -doc_end =3D Re(r'\*/', cache=3DFalse) -doc_com =3D Re(r'\s*\*\s*', cache=3DFalse) -doc_com_body =3D Re(r'\s*\* ?', cache=3DFalse) -doc_decl =3D doc_com + Re(r'(\w+)', cache=3DFalse) +doc_end =3D KernRe(r'\*/', cache=3DFalse) +doc_com =3D KernRe(r'\s*\*\s*', cache=3DFalse) +doc_com_body =3D KernRe(r'\s*\* ?', cache=3DFalse) +doc_decl =3D doc_com + KernRe(r'(\w+)', cache=3DFalse) =20 # @params and a strictly limited set of supported section names # Specifically: @@ -44,22 +44,22 @@ doc_decl =3D doc_com + Re(r'(\w+)', cache=3DFalse) # while trying to not match literal block starts like "example::" # doc_sect =3D doc_com + \ - Re(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?= |examples?)\s*:([^:].*)?$', + KernRe(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|no= tes?|examples?)\s*:([^:].*)?$', flags=3Dre.I, cache=3DFalse) =20 -doc_content =3D doc_com_body + Re(r'(.*)', cache=3DFalse) -doc_block =3D doc_com + Re(r'DOC:\s*(.*)?', cache=3DFalse) -doc_inline_start =3D Re(r'^\s*/\*\*\s*$', cache=3DFalse) -doc_inline_sect =3D Re(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=3DFalse) -doc_inline_end =3D Re(r'^\s*\*/\s*$', cache=3DFalse) -doc_inline_oneline =3D Re(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cac= he=3DFalse) -attribute =3D Re(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)", +doc_content =3D doc_com_body + KernRe(r'(.*)', cache=3DFalse) +doc_block =3D doc_com + KernRe(r'DOC:\s*(.*)?', cache=3DFalse) +doc_inline_start =3D 
KernRe(r'^\s*/\*\*\s*$', cache=3DFalse) +doc_inline_sect =3D KernRe(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=3DF= alse) +doc_inline_end =3D KernRe(r'^\s*\*/\s*$', cache=3DFalse) +doc_inline_oneline =3D KernRe(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$',= cache=3DFalse) +attribute =3D KernRe(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)", flags=3Dre.I | re.S, cache=3DFalse) =20 -export_symbol =3D Re(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cac= he=3DFalse) -export_symbol_ns =3D Re(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"= \S+"\)\s*', cache=3DFalse) +export_symbol =3D KernRe(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*',= cache=3DFalse) +export_symbol_ns =3D KernRe(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,= \s*"\S+"\)\s*', cache=3DFalse) =20 -type_param =3D Re(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=3DFalse) +type_param =3D KernRe(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=3DFalse) =20 =20 class KernelDoc: @@ -278,10 +278,10 @@ class KernelDoc: =20 self.entry.anon_struct_union =3D False =20 - param =3D Re(r'[\[\)].*').sub('', param, count=3D1) + param =3D KernRe(r'[\[\)].*').sub('', param, count=3D1) =20 if dtype =3D=3D "" and param.endswith("..."): - if Re(r'\w\.\.\.$').search(param): + if KernRe(r'\w\.\.\.$').search(param): # For named variable parameters of the form `x...`, # remove the dots param =3D param[:-3] @@ -335,7 +335,7 @@ class KernelDoc: # to ignore "[blah" in a parameter string. =20 self.entry.parameterlist.append(param) - org_arg =3D Re(r'\s\s+').sub(' ', org_arg) + org_arg =3D KernRe(r'\s\s+').sub(' ', org_arg) self.entry.parametertypes[param] =3D org_arg =20 def save_struct_actual(self, actual): @@ -344,7 +344,7 @@ class KernelDoc: one string item. 
""" =20 - actual =3D Re(r'\s*').sub("", actual, count=3D1) + actual =3D KernRe(r'\s*').sub("", actual, count=3D1) =20 self.entry.struct_actual +=3D actual + " " =20 @@ -355,20 +355,20 @@ class KernelDoc: """ =20 # temporarily replace all commas inside function pointer definition - arg_expr =3D Re(r'(\([^\),]+),') + arg_expr =3D KernRe(r'(\([^\),]+),') while arg_expr.search(args): args =3D arg_expr.sub(r"\1#", args) =20 for arg in args.split(splitter): # Strip comments - arg =3D Re(r'\/\*.*\*\/').sub('', arg) + arg =3D KernRe(r'\/\*.*\*\/').sub('', arg) =20 # Ignore argument attributes - arg =3D Re(r'\sPOS0?\s').sub(' ', arg) + arg =3D KernRe(r'\sPOS0?\s').sub(' ', arg) =20 # Strip leading/trailing spaces arg =3D arg.strip() - arg =3D Re(r'\s+').sub(' ', arg, count=3D1) + arg =3D KernRe(r'\s+').sub(' ', arg, count=3D1) =20 if arg.startswith('#'): # Treat preprocessor directive as a typeless variable just= to fill @@ -379,63 +379,63 @@ class KernelDoc: self.push_parameter(ln, decl_type, arg, "", "", declaration_name) =20 - elif Re(r'\(.+\)\s*\(').search(arg): + elif KernRe(r'\(.+\)\s*\(').search(arg): # Pointer-to-function =20 arg =3D arg.replace('#', ',') =20 - r =3D Re(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)') + r =3D KernRe(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)') if r.match(arg): param =3D r.group(1) else: self.emit_warning(ln, f"Invalid param: {arg}") param =3D arg =20 - dtype =3D Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r= '\1', arg) + dtype =3D KernRe(r'([^\(]+\(\*?)\s*' + re.escape(param)).s= ub(r'\1', arg) self.save_struct_actual(param) self.push_parameter(ln, decl_type, param, dtype, arg, declaration_name) =20 - elif Re(r'\(.+\)\s*\[').search(arg): + elif KernRe(r'\(.+\)\s*\[').search(arg): # Array-of-pointers =20 arg =3D arg.replace('#', ',') - r =3D Re(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+= \s*\]\s*)*\)') + r =3D KernRe(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[= \w]+\s*\]\s*)*\)') if r.match(arg): param =3D r.group(1) else: 
self.emit_warning(ln, f"Invalid param: {arg}") param =3D arg =20 - dtype =3D Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r= '\1', arg) + dtype =3D KernRe(r'([^\(]+\(\*?)\s*' + re.escape(param)).s= ub(r'\1', arg) =20 self.save_struct_actual(param) self.push_parameter(ln, decl_type, param, dtype, arg, declaration_name) =20 elif arg: - arg =3D Re(r'\s*:\s*').sub(":", arg) - arg =3D Re(r'\s*\[').sub('[', arg) + arg =3D KernRe(r'\s*:\s*').sub(":", arg) + arg =3D KernRe(r'\s*\[').sub('[', arg) =20 - args =3D Re(r'\s*,\s*').split(arg) + args =3D KernRe(r'\s*,\s*').split(arg) if args[0] and '*' in args[0]: args[0] =3D re.sub(r'(\*+)\s*', r' \1', args[0]) =20 first_arg =3D [] - r =3D Re(r'^(.*\s+)(.*?\[.*\].*)$') + r =3D KernRe(r'^(.*\s+)(.*?\[.*\].*)$') if args[0] and r.match(args[0]): args.pop(0) first_arg.extend(r.group(1)) first_arg.append(r.group(2)) else: - first_arg =3D Re(r'\s+').split(args.pop(0)) + first_arg =3D KernRe(r'\s+').split(args.pop(0)) =20 args.insert(0, first_arg.pop()) dtype =3D ' '.join(first_arg) =20 for param in args: - if Re(r'^(\*+)\s*(.*)').match(param): - r =3D Re(r'^(\*+)\s*(.*)') + if KernRe(r'^(\*+)\s*(.*)').match(param): + r =3D KernRe(r'^(\*+)\s*(.*)') if not r.match(param): self.emit_warning(ln, f"Invalid param: {param}= ") continue @@ -447,8 +447,8 @@ class KernelDoc: f"{dtype} {r.group(1)}", arg, declaration_name) =20 - elif Re(r'(.*?):(\w+)').search(param): - r =3D Re(r'(.*?):(\w+)') + elif KernRe(r'(.*?):(\w+)').search(param): + r =3D KernRe(r'(.*?):(\w+)') if not r.match(param): self.emit_warning(ln, f"Invalid param: {param}= ") continue @@ -477,7 +477,7 @@ class KernelDoc: err =3D True for px in range(len(prms)): # pylint: disable=3D= C0200 prm_clean =3D prms[px] - prm_clean =3D Re(r'\[.*\]').sub('', prm_clean) + prm_clean =3D KernRe(r'\[.*\]').sub('', prm_clean) prm_clean =3D attribute.sub('', prm_clean) =20 # ignore array size in a parameter string; @@ -486,7 +486,7 @@ class KernelDoc: # and this appears in @prms as "addr[6" 
since the # parameter list is split at spaces; # hence just ignore "[..." for the sections check; - prm_clean =3D Re(r'\[.*').sub('', prm_clean) + prm_clean =3D KernRe(r'\[.*').sub('', prm_clean) =20 if prm_clean =3D=3D sects[sx]: err =3D False @@ -512,7 +512,7 @@ class KernelDoc: =20 # Ignore an empty return type (It's a macro) # Ignore functions with a "void" return type (but not "void *") - if not return_type or Re(r'void\s*\w*\s*$').search(return_type): + if not return_type or KernRe(r'void\s*\w*\s*$').search(return_type= ): return =20 if not self.entry.sections.get("Return", None): @@ -535,20 +535,20 @@ class KernelDoc: ] =20 definition_body =3D r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) = + ")?" - struct_members =3D Re(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\= })([^\{\}\;]*)(\;)') + struct_members =3D KernRe(type_pattern + r'([^\{\};]+)(\{)([^\{\}]= *)(\})([^\{\}\;]*)(\;)') =20 # Extract struct/union definition members =3D None declaration_name =3D None decl_type =3D None =20 - r =3D Re(type_pattern + r'\s+(\w+)\s*' + definition_body) + r =3D KernRe(type_pattern + r'\s+(\w+)\s*' + definition_body) if r.search(proto): decl_type =3D r.group(1) declaration_name =3D r.group(2) members =3D r.group(3) else: - r =3D Re(r'typedef\s+' + type_pattern + r'\s*' + definition_bo= dy + r'\s*(\w+)\s*;') + r =3D KernRe(r'typedef\s+' + type_pattern + r'\s*' + definitio= n_body + r'\s*(\w+)\s*;') =20 if r.search(proto): decl_type =3D r.group(1) @@ -567,21 +567,21 @@ class KernelDoc: args_pattern =3D r'([^,)]+)' =20 sub_prefixes =3D [ - (Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I), = ''), - (Re(r'\/\*\s*private:.*', re.S | re.I), ''), + (KernRe(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.= I), ''), + (KernRe(r'\/\*\s*private:.*', re.S | re.I), ''), =20 # Strip comments - (Re(r'\/\*.*?\*\/', re.S), ''), + (KernRe(r'\/\*.*?\*\/', re.S), ''), =20 # Strip attributes (attribute, ' '), - (Re(r'\s*__aligned\s*\([^;]*\)', re.S), ' '), - 
(Re(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '), - (Re(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '), - (Re(r'\s*__packed\s*', re.S), ' '), - (Re(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '), - (Re(r'\s*____cacheline_aligned_in_smp', re.S), ' '), - (Re(r'\s*____cacheline_aligned', re.S), ' '), + (KernRe(r'\s*__aligned\s*\([^;]*\)', re.S), ' '), + (KernRe(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '), + (KernRe(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '), + (KernRe(r'\s*__packed\s*', re.S), ' '), + (KernRe(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '), + (KernRe(r'\s*____cacheline_aligned_in_smp', re.S), ' '), + (KernRe(r'\s*____cacheline_aligned', re.S), ' '), =20 # Unwrap struct_group macros based on this definition: # __struct_group(TAG, NAME, ATTRS, MEMBERS...) @@ -616,10 +616,10 @@ class KernelDoc: # matched. So, the implementation to drop STRUCT_GROUP() will = be # handled in separate. =20 - (Re(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('), - (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_GR= OUP('), - (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r'st= ruct \1 \2; STRUCT_GROUP('), - (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROUP= ('), + (KernRe(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP(= '), + (KernRe(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUC= T_GROUP('), + (KernRe(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), = r'struct \1 \2; STRUCT_GROUP('), + (KernRe(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_G= ROUP('), =20 # Replace macros # @@ -628,15 +628,15 @@ class KernelDoc: # it is better to also move those to the NestedMatch logic, # to ensure that parenthesis will be properly matched. 
=20 - (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S),= r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'), - (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DEC= LARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'), - (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pat= tern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'), - (Re(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_= pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'), - (Re(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_patt= ern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'), - (Re(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_= pattern + r'\)', re.S), r'\2 *\1'), - (Re(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*'= + args_pattern + r'\)', re.S), r'\1 \2[]'), - (Re(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S= ), r'dma_addr_t \1'), - (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S)= , r'__u32 \1'), + (KernRe(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re= .S), r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'), + (KernRe(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r= 'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'), + (KernRe(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args= _pattern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'), + (KernRe(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + a= rgs_pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'), + (KernRe(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_= pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'), + (KernRe(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + a= rgs_pattern + r'\)', re.S), r'\2 *\1'), + (KernRe(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',= \s*' + args_pattern + r'\)', re.S), r'\1 \2[]'), + (KernRe(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', = re.S), r'dma_addr_t \1'), + (KernRe(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + 
r'\)', r= e.S), r'__u32 \1'), ] =20 # Regexes here are guaranteed to have the end limiter matching @@ -689,8 +689,8 @@ class KernelDoc: s_id =3D s_id.strip() =20 newmember +=3D f"{maintype} {s_id}; " - s_id =3D Re(r'[:\[].*').sub('', s_id) - s_id =3D Re(r'^\s*\**(\S+)\s*').sub(r'\1', s_id) + s_id =3D KernRe(r'[:\[].*').sub('', s_id) + s_id =3D KernRe(r'^\s*\**(\S+)\s*').sub(r'\1', s_id) =20 for arg in content.split(';'): arg =3D arg.strip() @@ -698,7 +698,7 @@ class KernelDoc: if not arg: continue =20 - r =3D Re(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)') + r =3D KernRe(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)= ') if r.match(arg): # Pointer-to-function dtype =3D r.group(1) @@ -717,15 +717,15 @@ class KernelDoc: else: arg =3D arg.strip() # Handle bitmaps - arg =3D Re(r':\s*\d+\s*').sub('', arg) + arg =3D KernRe(r':\s*\d+\s*').sub('', arg) =20 # Handle arrays - arg =3D Re(r'\[.*\]').sub('', arg) + arg =3D KernRe(r'\[.*\]').sub('', arg) =20 # Handle multiple IDs - arg =3D Re(r'\s*,\s*').sub(',', arg) + arg =3D KernRe(r'\s*,\s*').sub(',', arg) =20 - r =3D Re(r'(.*)\s+([\S+,]+)') + r =3D KernRe(r'(.*)\s+([\S+,]+)') =20 if r.search(arg): dtype =3D r.group(1) @@ -735,7 +735,7 @@ class KernelDoc: continue =20 for name in names.split(','): - name =3D Re(r'^\s*\**(\S+)\s*').sub(r'\1',= name).strip() + name =3D KernRe(r'^\s*\**(\S+)\s*').sub(r'= \1', name).strip() =20 if not name: continue @@ -757,12 +757,12 @@ class KernelDoc: self.entry.sectcheck, self.entry.struct_actual) =20 # Adjust declaration for better display - declaration =3D Re(r'([\{;])').sub(r'\1\n', declaration) - declaration =3D Re(r'\}\s+;').sub('};', declaration) + declaration =3D KernRe(r'([\{;])').sub(r'\1\n', declaration) + declaration =3D KernRe(r'\}\s+;').sub('};', declaration) =20 # Better handle inlined enums while True: - r =3D Re(r'(enum\s+\{[^\}]+),([^\n])') + r =3D KernRe(r'(enum\s+\{[^\}]+),([^\n])') if not r.search(declaration): break =20 @@ -774,7 +774,7 @@ class KernelDoc: for clause in def_args: =20 
clause =3D clause.strip() - clause =3D Re(r'\s+').sub(' ', clause, count=3D1) + clause =3D KernRe(r'\s+').sub(' ', clause, count=3D1) =20 if not clause: continue @@ -782,7 +782,7 @@ class KernelDoc: if '}' in clause and level > 1: level -=3D 1 =20 - if not Re(r'^\s*#').match(clause): + if not KernRe(r'^\s*#').match(clause): declaration +=3D "\t" * level =20 declaration +=3D "\t" + clause + "\n" @@ -807,24 +807,24 @@ class KernelDoc: """ =20 # Ignore members marked private - proto =3D Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=3Dr= e.S).sub('', proto) - proto =3D Re(r'\/\*\s*private:.*}', flags=3Dre.S).sub('}', proto) + proto =3D KernRe(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags= =3Dre.S).sub('', proto) + proto =3D KernRe(r'\/\*\s*private:.*}', flags=3Dre.S).sub('}', pro= to) =20 # Strip comments - proto =3D Re(r'\/\*.*?\*\/', flags=3Dre.S).sub('', proto) + proto =3D KernRe(r'\/\*.*?\*\/', flags=3Dre.S).sub('', proto) =20 # Strip #define macros inside enums - proto =3D Re(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=3Dre= .S).sub('', proto) + proto =3D KernRe(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags= =3Dre.S).sub('', proto) =20 members =3D None declaration_name =3D None =20 - r =3D Re(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;') + r =3D KernRe(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;') if r.search(proto): declaration_name =3D r.group(2) members =3D r.group(1).rstrip() else: - r =3D Re(r'enum\s+(\w*)\s*\{(.*)\}') + r =3D KernRe(r'enum\s+(\w*)\s*\{(.*)\}') if r.match(proto): declaration_name =3D r.group(1) members =3D r.group(2).rstrip() @@ -847,12 +847,12 @@ class KernelDoc: =20 member_set =3D set() =20 - members =3D Re(r'\([^;]*?[\)]').sub('', members) + members =3D KernRe(r'\([^;]*?[\)]').sub('', members) =20 for arg in members.split(','): if not arg: continue - arg =3D Re(r'^\s*(\w+).*').sub(r'\1', arg) + arg =3D KernRe(r'^\s*(\w+).*').sub(r'\1', arg) self.entry.parameterlist.append(arg) if arg not in self.entry.parameterdescs: 
self.entry.parameterdescs[arg] =3D self.undescribed @@ -947,10 +947,10 @@ class KernelDoc: ] =20 for search, sub, flags in sub_prefixes: - prototype =3D Re(search, flags).sub(sub, prototype) + prototype =3D KernRe(search, flags).sub(sub, prototype) =20 # Macros are a special case, as they change the prototype format - new_proto =3D Re(r"^#\s*define\s+").sub("", prototype) + new_proto =3D KernRe(r"^#\s*define\s+").sub("", prototype) if new_proto !=3D prototype: is_define_proto =3D True prototype =3D new_proto @@ -987,7 +987,7 @@ class KernelDoc: found =3D False =20 if is_define_proto: - r =3D Re(r'^()(' + name + r')\s+') + r =3D KernRe(r'^()(' + name + r')\s+') =20 if r.search(prototype): return_type =3D '' @@ -1004,7 +1004,7 @@ class KernelDoc: ] =20 for p in patterns: - r =3D Re(p) + r =3D KernRe(p) =20 if r.match(prototype): =20 @@ -1071,11 +1071,11 @@ class KernelDoc: typedef_ident =3D r'\*?\s*(\w\S+)\s*' typedef_args =3D r'\s*\((.*)\);' =20 - typedef1 =3D Re(r'typedef' + typedef_type + r'\(' + typedef_ident = + r'\)' + typedef_args) - typedef2 =3D Re(r'typedef' + typedef_type + typedef_ident + typede= f_args) + typedef1 =3D KernRe(r'typedef' + typedef_type + r'\(' + typedef_id= ent + r'\)' + typedef_args) + typedef2 =3D KernRe(r'typedef' + typedef_type + typedef_ident + ty= pedef_args) =20 # Strip comments - proto =3D Re(r'/\*.*?\*/', flags=3Dre.S).sub('', proto) + proto =3D KernRe(r'/\*.*?\*/', flags=3Dre.S).sub('', proto) =20 # Parse function typedef prototypes for r in [typedef1, typedef2]: @@ -1109,12 +1109,12 @@ class KernelDoc: return =20 # Handle nested parentheses or brackets - r =3D Re(r'(\(*.\)\s*|\[*.\]\s*);$') + r =3D KernRe(r'(\(*.\)\s*|\[*.\]\s*);$') while r.search(proto): proto =3D r.sub('', proto) =20 # Parse simple typedefs - r =3D Re(r'typedef.*\s+(\w+)\s*;') + r =3D KernRe(r'typedef.*\s+(\w+)\s*;') if r.match(proto): declaration_name =3D r.group(1) =20 @@ -1195,12 +1195,12 @@ class KernelDoc: decl_end =3D r"(?:[-:].*)" # end of the name part 
=20 # test for pointer declaration type, foo * bar() - desc - r =3D Re(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_end}= ?$") + r =3D KernRe(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_= end}?$") if r.search(line): self.entry.identifier =3D r.group(1) =20 # Test for data declaration - r =3D Re(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)") + r =3D KernRe(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)= ") if r.search(line): self.entry.decl_type =3D r.group(1) self.entry.identifier =3D r.group(2) @@ -1209,15 +1209,15 @@ class KernelDoc: # Look for foo() or static void foo() - description; # or misspelt identifier =20 - r1 =3D Re(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesis}\s= *{decl_end}?$") - r2 =3D Re(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenthesis= }\s*{decl_end}$") + r1 =3D KernRe(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesi= s}\s*{decl_end}?$") + r2 =3D KernRe(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenth= esis}\s*{decl_end}$") =20 for r in [r1, r2]: if r.search(line): self.entry.identifier =3D r.group(1) self.entry.decl_type =3D "function" =20 - r =3D Re(r"define\s+") + r =3D KernRe(r"define\s+") self.entry.identifier =3D r.sub("", self.entry.ide= ntifier) self.entry.is_kernel_comment =3D True break @@ -1230,12 +1230,12 @@ class KernelDoc: self.entry.section =3D self.section_default self.entry.new_start_line =3D ln + 1 =20 - r =3D Re("[-:](.*)") + r =3D KernRe("[-:](.*)") if r.search(line): # strip leading/trailing/multiple spaces self.entry.descr =3D r.group(1).strip(" ") =20 - r =3D Re(r"\s+") + r =3D KernRe(r"\s+") self.entry.descr =3D r.sub(" ", self.entry.descr) self.entry.declaration_purpose =3D self.entry.descr self.state =3D self.STATE_BODY_MAYBE @@ -1272,7 +1272,7 @@ class KernelDoc: """ =20 if self.state =3D=3D self.STATE_BODY_WITH_BLANK_LINE: - r =3D Re(r"\s*\*\s?\S") + r =3D KernRe(r"\s*\*\s?\S") if r.match(line): self.dump_section() self.entry.section =3D self.section_default @@ -1318,7 +1318,7 @@ class KernelDoc: 
self.dump_section() =20 # Look for doc_com + + doc_end: - r =3D Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/') + r =3D KernRe(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/') if r.match(line): self.emit_warning(ln, f"suspicious ending line: {line}") =20 @@ -1351,7 +1351,7 @@ class KernelDoc: self.entry.declaration_purpose =3D self.entry.declaration_= purpose.rstrip() self.entry.declaration_purpose +=3D " " + cont =20 - r =3D Re(r"\s+") + r =3D KernRe(r"\s+") self.entry.declaration_purpose =3D r.sub(' ', self.entry.declarat= ion_purpose) =20 @@ -1359,7 +1359,7 @@ class KernelDoc: if self.entry.section.startswith('@') or \ self.entry.section =3D=3D self.section_context: if self.entry.leading_space is None: - r =3D Re(r'^(\s+)') + r =3D KernRe(r'^(\s+)') if r.match(cont): self.entry.leading_space =3D len(r.group(1)) else: @@ -1436,13 +1436,13 @@ class KernelDoc: is_void =3D True =20 # Replace SYSCALL_DEFINE with correct return type & function name - proto =3D Re(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto) + proto =3D KernRe(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto) =20 - r =3D Re(r'long\s+(sys_.*?),') + r =3D KernRe(r'long\s+(sys_.*?),') if r.search(proto): - proto =3D Re(',').sub('(', proto, count=3D1) + proto =3D KernRe(',').sub('(', proto, count=3D1) elif is_void: - proto =3D Re(r'\)').sub('(void)', proto, count=3D1) + proto =3D KernRe(r'\)').sub('(void)', proto, count=3D1) =20 # Now delete all of the odd-numbered commas in the proto # so that argument types & names don't have a comma between them @@ -1469,22 +1469,22 @@ class KernelDoc: tracepointargs =3D None =20 # Match tracepoint name based on different patterns - r =3D Re(r'TRACE_EVENT\((.*?),') + r =3D KernRe(r'TRACE_EVENT\((.*?),') if r.search(proto): tracepointname =3D r.group(1) =20 - r =3D Re(r'DEFINE_SINGLE_EVENT\((.*?),') + r =3D KernRe(r'DEFINE_SINGLE_EVENT\((.*?),') if r.search(proto): tracepointname =3D r.group(1) =20 - r =3D Re(r'DEFINE_EVENT\((.*?),(.*?),') + r =3D KernRe(r'DEFINE_EVENT\((.*?),(.*?),') if 
r.search(proto): tracepointname =3D r.group(2) =20 if tracepointname: tracepointname =3D tracepointname.lstrip() =20 - r =3D Re(r'TP_PROTO\((.*?)\)') + r =3D KernRe(r'TP_PROTO\((.*?)\)') if r.search(proto): tracepointargs =3D r.group(1) =20 @@ -1501,43 +1501,43 @@ class KernelDoc: """Ancillary routine to process a function prototype""" =20 # strip C99-style comments to end of line - r =3D Re(r"\/\/.*$", re.S) + r =3D KernRe(r"\/\/.*$", re.S) line =3D r.sub('', line) =20 - if Re(r'\s*#\s*define').match(line): + if KernRe(r'\s*#\s*define').match(line): self.entry.prototype =3D line elif line.startswith('#'): # Strip other macros like #ifdef/#ifndef/#endif/... pass else: - r =3D Re(r'([^\{]*)') + r =3D KernRe(r'([^\{]*)') if r.match(line): self.entry.prototype +=3D r.group(1) + " " =20 - if '{' in line or ';' in line or Re(r'\s*#\s*define').match(line): + if '{' in line or ';' in line or KernRe(r'\s*#\s*define').match(li= ne): # strip comments - r =3D Re(r'/\*.*?\*/') + r =3D KernRe(r'/\*.*?\*/') self.entry.prototype =3D r.sub('', self.entry.prototype) =20 # strip newlines/cr's - r =3D Re(r'[\r\n]+') + r =3D KernRe(r'[\r\n]+') self.entry.prototype =3D r.sub(' ', self.entry.prototype) =20 # strip leading spaces - r =3D Re(r'^\s+') + r =3D KernRe(r'^\s+') self.entry.prototype =3D r.sub('', self.entry.prototype) =20 # Handle self.entry.prototypes for function pointers like: # int (*pcs_config)(struct foo) =20 - r =3D Re(r'^(\S+\s+)\(\s*\*(\S+)\)') + r =3D KernRe(r'^(\S+\s+)\(\s*\*(\S+)\)') self.entry.prototype =3D r.sub(r'\1\2', self.entry.prototype) =20 if 'SYSCALL_DEFINE' in self.entry.prototype: self.entry.prototype =3D self.syscall_munge(ln, self.entry.proto= type) =20 - r =3D Re(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT') + r =3D KernRe(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT') if r.search(self.entry.prototype): self.entry.prototype =3D self.tracepoint_munge(ln, self.entry.pr= ototype) @@ -1549,22 +1549,22 @@ class KernelDoc: """Ancillary routine to 
process a type""" =20 # Strip newlines/cr's. - line =3D Re(r'[\r\n]+', re.S).sub(' ', line) + line =3D KernRe(r'[\r\n]+', re.S).sub(' ', line) =20 # Strip leading spaces - line =3D Re(r'^\s+', re.S).sub('', line) + line =3D KernRe(r'^\s+', re.S).sub('', line) =20 # Strip trailing spaces - line =3D Re(r'\s+$', re.S).sub('', line) + line =3D KernRe(r'\s+$', re.S).sub('', line) =20 # Strip C99-style comments to the end of the line - line =3D Re(r"\/\/.*$", re.S).sub('', line) + line =3D KernRe(r"\/\/.*$", re.S).sub('', line) =20 # To distinguish preprocessor directive from regular declaration l= ater. if line.startswith('#'): line +=3D ";" =20 - r =3D Re(r'([^\{\};]*)([\{\};])(.*)') + r =3D KernRe(r'([^\{\};]*)([\{\};])(.*)') while True: if r.search(line): if self.entry.prototype: diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py index d28485ff94d6..e81695b273bf 100755 --- a/scripts/lib/kdoc/kdoc_re.py +++ b/scripts/lib/kdoc/kdoc_re.py @@ -14,7 +14,7 @@ import re re_cache =3D {} =20 =20 -class Re: +class KernRe: """ Helper class to simplify regex declaration and usage, =20 @@ -59,7 +59,7 @@ class Re: Allows adding two regular expressions into one. 
""" =20 - return Re(str(self) + str(other), cache=3Dself.cache or other.cach= e, + return KernRe(str(self) + str(other), cache=3Dself.cache or other.= cache, flags=3Dself.regex.flags | other.regex.flags) =20 def match(self, string): --=20 2.49.0 From nobody Wed Dec 17 05:33:22 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03F25267B64; Tue, 8 Apr 2025 10:09:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106998; cv=none; b=UXUcdljZxt7g969D51qe2yyVe8ukDUXHp4k0gm8RfTeAeIih6Fb2M3N58GnzK9Aqp3JXW8g0Br4sn0lb6hkmuR12G50F5Lpg43HdLrDLHWRsQFo07sGvYgpjdmTYTVrfHSgrnBl5Y/04d/laLVIK08PhgTU9YxaDvetQou9QYBY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744106998; c=relaxed/simple; bh=QcroL1J/9cJK+bUU6dPs9SthSAPzqPI/VmikKkYQHtU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=F07OJGz14moWJOHRXll5PRxSp3EsAqJm86CT0XpFckPQTrRziuZZnzFM8sQN8Edbc0ZC+keXeeRMjeDA1sN/GOoZSNG6877GpsgFgYq31iyjOy56EMje/rom7DmXpxytUgRMK4mkeSJTuv6BP0ZrrervZd1jTInx93TWET5Evp4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=oVs2nqUe; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="oVs2nqUe" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D5FD9C4CEEA; Tue, 8 Apr 2025 10:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744106997; bh=QcroL1J/9cJK+bUU6dPs9SthSAPzqPI/VmikKkYQHtU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=oVs2nqUeylUWX925gqfZfO1q0jHpNa0IoeNIuqo2yaA4TLW7Md1l3o4yvaDUxsX92 5yt9pk1V3UKvvXynp+f0ULtGwOmA5sCt85y7FNMO5dXaY6xHL+g9my6NNTVZmOoy9f UE3iBB0VyARBb9zMZFM/kk5OPQIT7DRFFgmvy40rGJPxzF4RwTgX1q64dsXVFNMnmR jP31Nan6uY/SFrigGTCzRqr67eXFhFaPeh7CjfI4Wd+xIvKxk/81oJKhxS0uC/6Kw9 1IASIfa3OLbRLI3VIr04tWl30hUI7YMryalIFKP7vOul1l/YxZkgtpNgwY+XYVET7e JFDNJsJzFKtrw== Received: from mchehab by mail.kernel.org with local (Exim 4.98.2) (envelope-from ) id 1u25tt-00000008RWq-30YM; Tue, 08 Apr 2025 18:09:49 +0800 From: Mauro Carvalho Chehab To: Linux Doc Mailing List , Jonathan Corbet Cc: Sean Anderson , "Mauro Carvalho Chehab" , Russell King , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH v3 33/33] scripts: kernel-doc: fix parsing function-like typedefs (again) Date: Tue, 8 Apr 2025 18:09:36 +0800 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: Mauro Carvalho Chehab Content-Type: text/plain; charset="utf-8" From: Sean Anderson Typedefs like typedef struct phylink_pcs *(*pcs_xlate_t)(const u64 *args); have a typedef_type that ends with a * and therefore has no word boundary. Add an extra clause for the final group of the typedef_type so we only require a word boundary if we match a word. 
[mchehab: modify also kernel-doc.py, as we're deprecating the perl version] Fixes: 7d2c6b1edf79 ("scripts: kernel-doc: fix parsing function-like typede= fs") Signed-off-by: Sean Anderson Signed-off-by: Mauro Carvalho Chehab --- scripts/kernel-doc.pl | 2 +- scripts/lib/kdoc/kdoc_parser.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/kernel-doc.pl b/scripts/kernel-doc.pl index af6cf408b96d..5db23cbf4eb2 100755 --- a/scripts/kernel-doc.pl +++ b/scripts/kernel-doc.pl @@ -1325,7 +1325,7 @@ sub dump_enum($$) { } } =20 -my $typedef_type =3D qr { ((?:\s+[\w\*]+\b){1,8})\s* }x; +my $typedef_type =3D qr { ((?:\s+[\w\*]+\b){0,7}\s+(?:\w+\b|\*+))\s* }x; my $typedef_ident =3D qr { \*?\s*(\w\S+)\s* }x; my $typedef_args =3D qr { \s*\((.*)\); }x; =20 diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser= .py index f60722bcc687..4f036c720b36 100755 --- a/scripts/lib/kdoc/kdoc_parser.py +++ b/scripts/lib/kdoc/kdoc_parser.py @@ -1067,7 +1067,7 @@ class KernelDoc: Stores a typedef inside self.entries array. """ =20 - typedef_type =3D r'((?:\s+[\w\*]+\b){1,8})\s*' + typedef_type =3D r'((?:\s+[\w\*]+\b){0,7}\s+(?:\w+\b|\*+))\s*' typedef_ident =3D r'\*?\s*(\w\S+)\s*' typedef_args =3D r'\s*\((.*)\);' =20 --=20 2.49.0
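Note on patch 33/33: the word-boundary problem described in the commit message can be reproduced outside the kernel tree. The sketch below is only an illustration, not part of the patch. It assumes plain Python `re` in place of the in-tree KernRe wrapper. The pattern strings are copied from the kdoc_parser.py hunk, but the helper names (`build_typedef1`, `parse`) and the second sample prototype (`cb_t`) are hypothetical. It builds the `typedef1` expression with the old and the new `typedef_type` and compares what each one captures.

```python
import re

# Pattern fragments copied from the kdoc_parser.py hunk above.
# Only typedef_type differs between the old and the new version.
OLD_TYPEDEF_TYPE = r'((?:\s+[\w\*]+\b){1,8})\s*'
NEW_TYPEDEF_TYPE = r'((?:\s+[\w\*]+\b){0,7}\s+(?:\w+\b|\*+))\s*'
TYPEDEF_IDENT = r'\*?\s*(\w\S+)\s*'
TYPEDEF_ARGS = r'\s*\((.*)\);'


def build_typedef1(typedef_type):
    """Hypothetical helper: assemble typedef1 the same way the parser does."""
    return re.compile(r'typedef' + typedef_type + r'\(' + TYPEDEF_IDENT
                      + r'\)' + TYPEDEF_ARGS)


def parse(regex, proto):
    """Hypothetical helper: return (type, identifier, args) or None."""
    m = regex.match(proto)
    if m is None:
        return None
    return tuple(group.strip() for group in m.groups())


old_re = build_typedef1(OLD_TYPEDEF_TYPE)
new_re = build_typedef1(NEW_TYPEDEF_TYPE)

# Prototype quoted in the commit message: the return type ends with '*',
# so there is no word boundary after the last token of typedef_type.
PROTO = "typedef struct phylink_pcs *(*pcs_xlate_t)(const u64 *args);"

# Made-up prototype whose return type ends with a word; used only to check
# that the new pattern does not change the common case.
SIMPLE = "typedef int (*cb_t)(void *data);"

print("old:", parse(old_re, PROTO))
print("new:", parse(new_re, PROTO))
print("simple, old:", parse(old_re, SIMPLE))
print("simple, new:", parse(new_re, SIMPLE))
```

With the old pattern, every repetition of the group must end at `\b`, so a trailing ` *` followed by `(` can never be consumed and the function-typedef match fails. The new pattern makes the last token either a word followed by `\b` or a run of `*`, so a boundary is required only when a word was matched.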