docs: kdoc: rework the BODY* processing states

[PATCH 9/9] docs: kdoc: finish disentangling the BODY and SPECIAL_SECTION states

Posted by Jonathan Corbet 3 months, 2 weeks ago

Move the last SPECIAL_SECTION special case into the proper handler
function, getting rid of more if/then/else logic.  The leading-space
tracking was tightened up a bit in the move.  Add some comments describing
what is going on.

No changes to the generated output.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 scripts/lib/kdoc/kdoc_parser.py | 80 ++++++++++++++++++++-------------
 1 file changed, 48 insertions(+), 32 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index a6ee8bac378d..3557c512c85a 100644
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -1405,10 +1405,53 @@ class KernelDoc:
         """
         STATE_SPECIAL_SECTION: a section ending with a blank line
         """
+        #
+        # If we have hit a blank line (only the " * " marker), then this
+        # section is done.
+        #
         if KernRe(r"\s*\*\s*$").match(line):
             self.entry.begin_section(ln, dump = True)
+            self.entry.contents += '\n'
             self.state = state.BODY
-        self.process_body(ln, line)
+            return
+        #
+        # Not a blank line, look for the other ways to end the section.
+        #
+        if self.is_new_section(ln, line) or self.is_comment_end(ln, line):
+            return
+        #
+        # OK, we should have a continuation of the text for this section.
+        #
+        if doc_content.search(line):
+            cont = doc_content.group(1)
+            #
+            # If the lines of text after the first in a special section have
+            # leading white space, we need to trim it out or Sphinx will get
+            # confused.  For the second line (the None case), see what we
+            # find there and remember it.
+            #
+            if self.entry.leading_space is None:
+                r = KernRe(r'^(\s+)')
+                if r.match(cont):
+                    self.entry.leading_space = len(r.group(1))
+                else:
+                    self.entry.leading_space = 0
+            #
+            # Otherwise, before trimming any leading chars, be *sure*
+            # that they are white space.  We should maybe warn if this
+            # isn't the case.
+            #
+            for i in range(0, self.entry.leading_space):
+                if cont[i] != " ":
+                    self.entry.leading_space = i
+                    break
+            #
+            # Add the trimmed result to the section and we're done.
+            #
+            self.entry.contents += cont[self.entry.leading_space:] + '\n'
+        else:
+            # Unknown line, ignore
+            self.emit_msg(ln, f"bad line: {line}")
 
     def process_body(self, ln, line):
         """
@@ -1419,37 +1462,10 @@ class KernelDoc:
 
         if doc_content.search(line):
             cont = doc_content.group(1)
-
-            if cont == "":
-                    self.entry.contents += "\n"
-            else:
-                if self.state == state.SPECIAL_SECTION:
-                    if self.entry.leading_space is None:
-                        r = KernRe(r'^(\s+)')
-                        if r.match(cont):
-                            self.entry.leading_space = len(r.group(1))
-                        else:
-                            self.entry.leading_space = 0
-
-                    # Double-check if leading space are realy spaces
-                    pos = 0
-                    for i in range(0, self.entry.leading_space):
-                        if cont[i] != " ":
-                            break
-                        pos += 1
-
-                    cont = cont[pos:]
-
-                    # NEW LOGIC:
-                    # In case it is different, update it
-                    if self.entry.leading_space != pos:
-                        self.entry.leading_space = pos
-
-                self.entry.contents += cont + "\n"
-            return
-
-        # Unknown line, ignore
-        self.emit_msg(ln, f"bad line: {line}")
+            self.entry.contents += cont + "\n"
+        else:
+            # Unknown line, ignore
+            self.emit_msg(ln, f"bad line: {line}")
 
     def process_inline(self, ln, line):
         """STATE_INLINE: docbook comments within a prototype."""
-- 
2.49.0
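
For readers following along, the leading-space handling that this patch moves
into the SPECIAL_SECTION handler can be tried out in isolation. What follows
is only a minimal sketch, not the kernel code: trim_continuation() is a
hypothetical helper name, the standard "re" module stands in for the kernel's
KernRe wrapper, and the min() bound on the loop is an extra guard that the
patch itself does not have.

```python
import re


def trim_continuation(cont, leading_space):
    """Trim the remembered indentation from one continuation line.

    Hypothetical standalone helper mirroring the logic in the patch.
    Returns (trimmed_text, updated_leading_space).
    """
    # First continuation line (the None case): measure its leading
    # white space and remember it.
    if leading_space is None:
        m = re.match(r'^(\s+)', cont)
        leading_space = len(m.group(1)) if m else 0
    # Before trimming, be sure the characters really are spaces; if not,
    # shrink the remembered width to the first non-space position.
    # Assumption: the min() guard is added here for short lines only.
    for i in range(min(leading_space, len(cont))):
        if cont[i] != " ":
            leading_space = i
            break
    return cont[leading_space:], leading_space


if __name__ == "__main__":
    space = None
    out = []
    for text in ["   first continuation", "   second line", " less indented"]:
        trimmed, space = trim_continuation(text, space)
        out.append(trimmed)
    print(out)    # ['first continuation', 'second line', 'less indented']
    print(space)  # 1
```

Note that, as in the patch, the remembered width only ever shrinks: once a
less-indented line is seen, later lines are trimmed by the smaller amount.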

Re: [PATCH 9/9] docs: kdoc: finish disentangling the BODY and SPECIAL_SECTION states

Posted by Mauro Carvalho Chehab 3 months, 2 weeks ago

On Sat, 21 Jun 2025 14:35:12 -0600
Jonathan Corbet <corbet@lwn.net> wrote:

> Move the last SPECIAL_SECTION special case into the proper handler
> function, getting rid of more if/then/else logic.  The leading-space
> tracking was tightened up a bit in the move.  Add some comments describing
> what is going on.
> 
> No changes to the generated output.

LGTM.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

Thanks,
Mauro

[PATCH 1/9] docs: kdoc: Make body_with_blank_line parsing more flexible
[PATCH 2/9] docs: kdoc: consolidate the "begin section" logic
[PATCH 3/9] docs: kdoc: separate out the handling of the declaration phase
[PATCH 4/9] docs: kdoc: split out the special-section state
[PATCH 5/9] docs: kdoc: coalesce the new-section handling
[PATCH 6/9] docs: kdoc: rework the handling of SPECIAL_SECTION
[PATCH 7/9] docs: kdoc: coalesce the end-of-comment processing
[PATCH 8/9] docs: kdoc: Add some comments to process_decl()
[PATCH 9/9] docs: kdoc: finish disentangling the BODY and SPECIAL_SECTION states