[PATCH v4] qapi: Add documentation format validation

Vladimir Sementsov-Ogievskiy posted 1 patch 2 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20251031115517.79032-1-vsementsov@yandex-team.ru
Maintainers: Markus Armbruster <armbru@redhat.com>, Michael Roth <michael.roth@amd.com>
There is a newer version of this series
scripts/qapi/parser.py | 44 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 43 insertions(+), 1 deletion(-)
[PATCH v4] qapi: Add documentation format validation
Posted by Vladimir Sementsov-Ogievskiy 2 weeks ago
Add explicit validation for QAPI documentation formatting rules:

1. Lines must not exceed 70 columns in width (including '# ' prefix)
2. Sentences must be separated by two spaces

Example sections and literal :: blocks (seldom case) are excluded, we
don't require them to be <= 70, that would be too restrictive. Anyway,
they share common 80-columns recommendations (not requirements).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---

Hi all!

v4: apply suggestions by Markus:
 - smart regexps
 - simpler error messages
 - hack to move cursor at the place of error
 - support :: blocks

This is based on
[PATCH 0/8] A QAPI schema doc markup fix, and style cleanup
Based-on: <20251031094751.2817932-1-armbru@redhat.com>

 scripts/qapi/parser.py | 44 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
index 9fbf80a541..2c244a3608 100644
--- a/scripts/qapi/parser.py
+++ b/scripts/qapi/parser.py
@@ -108,6 +108,10 @@ def __init__(self,
         self.exprs: List[QAPIExpression] = []
         self.docs: List[QAPIDoc] = []
 
+        # State for tracking qmp-example blocks and simple
+        # :: literal blocks.
+        self._literal_mode = False
+
         # Showtime!
         self._parse()
 
@@ -423,12 +427,50 @@ def get_doc_line(self) -> Optional[str]:
             if self.val != '##':
                 raise QAPIParseError(
                     self, "junk after '##' at end of documentation comment")
+            self._literal_mode = False
             return None
         if self.val == '#':
             return ''
         if self.val[1] != ' ':
             raise QAPIParseError(self, "missing space after #")
-        return self.val[2:].rstrip()
+
+        line = self.val[2:].rstrip()
+
+        if re.match(r'(\.\. +qmp-example)? *::$', line):
+            self._literal_mode = True
+        elif self._literal_mode and line and not line.startswith(' '):
+            # ReST directives stop at first non-blank non-indented line
+            self._literal_mode = False
+
+        if not self._literal_mode:
+            self._validate_doc_line_format(line)
+
+        return line
+
+    def _validate_doc_line_format(self, line: str) -> None:
+        """
+        Validate documentation format rules for a single line:
+        1. Lines should not exceed 70 columns
+        2. Sentences should be separated by two spaces
+        """
+        full_line_length = len(line) + 2  # "# " = 2 characters
+        if full_line_length > 70:
+            # Skip URL lines - they can't be broken
+            if re.match(r' *(https?|ftp)://[^ ]*$', line):
+                pass
+            else:
+                raise QAPIParseError(
+                    self, "documentation line exceeds 70 columns"
+                )
+
+        single_space_pattern = r'(\be\.g\.|^ *\d\.|([.!?])) [A-Z0-9(]'
+        for m in list(re.finditer(single_space_pattern, line)):
+            if not m.group(2):
+                continue
+            # HACK so the error message points to the offending spot
+            self.pos = self.line_pos + 2 + m.start(2) + 1
+            raise QAPIParseError(
+                 self, "Use two spaces between sentences")
 
     @staticmethod
     def _match_at_name_colon(string: str) -> Optional[Match[str]]:
-- 
2.48.1
Re: [PATCH v4] qapi: Add documentation format validation
Posted by Markus Armbruster 1 week, 6 days ago
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:

> Add explicit validation for QAPI documentation formatting rules:
>
> 1. Lines must not exceed 70 columns in width (including '# ' prefix)
> 2. Sentences must be separated by two spaces
>
> Example sections and literal :: blocks (seldom case) are excluded, we
> don't require them to be <= 70, that would be too restrictive. Anyway,
> they share common 80-columns recommendations (not requirements).
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
>
> Hi all!
>
> v4: apply suggestions by Markus:
>  - smart regexps
>  - simpler error messages
>  - hack to move cursor at the place of error
>  - support :: blocks
>
> This is based on
> [PATCH 0/8] A QAPI schema doc markup fix, and style cleanup
> Based-on: <20251031094751.2817932-1-armbru@redhat.com>
>
>  scripts/qapi/parser.py | 44 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 43 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> index 9fbf80a541..2c244a3608 100644
> --- a/scripts/qapi/parser.py
> +++ b/scripts/qapi/parser.py
> @@ -108,6 +108,10 @@ def __init__(self,
>          self.exprs: List[QAPIExpression] = []
>          self.docs: List[QAPIDoc] = []
>  
> +        # State for tracking qmp-example blocks and simple
> +        # :: literal blocks.
> +        self._literal_mode = False
> +
>          # Showtime!
>          self._parse()
>  
> @@ -423,12 +427,50 @@ def get_doc_line(self) -> Optional[str]:
>              if self.val != '##':
>                  raise QAPIParseError(
>                      self, "junk after '##' at end of documentation comment")
> +            self._literal_mode = False
>              return None
>          if self.val == '#':
>              return ''
>          if self.val[1] != ' ':
>              raise QAPIParseError(self, "missing space after #")
> -        return self.val[2:].rstrip()
> +
> +        line = self.val[2:].rstrip()
> +
> +        if re.match(r'(\.\. +qmp-example)? *::$', line):
> +            self._literal_mode = True

This doesn't match the contracted form of literal blocks

    lorem ipsum ::
        dolor sit amet

We don't use this form right now.  We can worry about matching it when
we do.

> +        elif self._literal_mode and line and not line.startswith(' '):
> +            # ReST directives stop at first non-blank non-indented line
> +            self._literal_mode = False

This can miss the end of the literal block when the line with '::' is
indented.  To reproduce ...

> +
> +        if not self._literal_mode:
> +            self._validate_doc_line_format(line)

... tack a debug print here

           else:
               print('@@@', line)

and run

    $ pyvenv/bin/python3 /work/armbru/qemu/scripts/qapi-gen.py -o docs/qapi-firmware/ ../docs/interop/firmware.json

Relevant part of input:

    # @executable: Identifies the firmware executable.  The @mode
    #     indicates whether there will be an associated NVRAM template
    #     present.  The preferred corresponding QEMU command line options
    #     are
    #
    #     ::
    #
    #         -drive if=none,id=pflash0,readonly=on,file=@executable.@filename,format=@executable.@format
    #         -machine pflash0=pflash0
    #
    #     or equivalent -blockdev instead of -drive.  When @mode is
    #     @combined the executable must be cloned before use and
    #     configured with readonly=off.  With QEMU versions older than
    #     4.0, you have to use
    #
    #     ::
    #
    #         -drive if=pflash,unit=0,readonly=on,file=@executable.@filename,format=@executable.@format
    #

Relevant part of output:

    @@@     ::
    @@@         -drive if=none,id=pflash0,readonly=on,file=@executable.@filename,format=@executable.@format
    @@@         -machine pflash0=pflash0
    @@@     or equivalent -blockdev instead of -drive.  When @mode is
    @@@     @combined the executable must be cloned before use and
    @@@     configured with readonly=off.  With QEMU versions older than
    @@@     4.0, you have to use
    @@@     ::
    @@@         -drive if=pflash,unit=0,readonly=on,file=@executable.@filename,format=@executable.@format

Save the indentation of the line containing the '::'.  A line with less
indentation ends the literal block.

> +
> +        return line
> +
> +    def _validate_doc_line_format(self, line: str) -> None:
> +        """
> +        Validate documentation format rules for a single line:
> +        1. Lines should not exceed 70 columns
> +        2. Sentences should be separated by two spaces
> +        """
> +        full_line_length = len(line) + 2  # "# " = 2 characters
> +        if full_line_length > 70:
> +            # Skip URL lines - they can't be broken
> +            if re.match(r' *(https?|ftp)://[^ ]*$', line):
> +                pass
> +            else:
> +                raise QAPIParseError(
> +                    self, "documentation line exceeds 70 columns"
> +                )
> +
> +        single_space_pattern = r'(\be\.g\.|^ *\d\.|([.!?])) [A-Z0-9(]'
> +        for m in list(re.finditer(single_space_pattern, line)):
> +            if not m.group(2):
> +                continue
> +            # HACK so the error message points to the offending spot
> +            self.pos = self.line_pos + 2 + m.start(2) + 1

I have an idea for a non-hacky solution.  I'll post it when ready.
Until then, don't worry about it.

> +            raise QAPIParseError(
> +                 self, "Use two spaces between sentences")

Note this check is somewhat prone to false positives.
@single_space_pattern matches 'e.g.' to avoid the false positive
'e.g. FLAT' in block-core.json.  The same could happen for other
abbreviations.

We could add more sophisticated heuristics to reduce the risk of false
positives.  Meh.  I'd rather KISS for now.  We can deal with the problem
once we have it.

We may want to add a hint, though.  Something like:

               raise QAPIParseError(
                    self, "Use two spaces between sentences\n"
                    "If this not the end of a sentence, please report the bug")

>  
>      @staticmethod
>      def _match_at_name_colon(string: str) -> Optional[Match[str]]:

Negative tests would be nice, one for each new error.  Feel free to ask
me to write them.
Re: [PATCH v4] qapi: Add documentation format validation
Posted by Vladimir Sementsov-Ogievskiy 1 week, 6 days ago
On 31.10.25 19:07, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
> 
>> Add explicit validation for QAPI documentation formatting rules:
>>
>> 1. Lines must not exceed 70 columns in width (including '# ' prefix)
>> 2. Sentences must be separated by two spaces
>>
>> Example sections and literal :: blocks (seldom case) are excluded, we
>> don't require them to be <= 70, that would be too restrictive. Anyway,
>> they share common 80-columns recommendations (not requirements).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>>
>> Hi all!
>>
>> v4: apply suggestions by Markus:
>>   - smart regexps
>>   - simpler error messages
>>   - hack to move cursor at the place of error
>>   - support :: blocks
>>
>> This is based on
>> [PATCH 0/8] A QAPI schema doc markup fix, and style cleanup
>> Based-on: <20251031094751.2817932-1-armbru@redhat.com>
>>
>>   scripts/qapi/parser.py | 44 +++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 43 insertions(+), 1 deletion(-)
>>
>> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
>> index 9fbf80a541..2c244a3608 100644
>> --- a/scripts/qapi/parser.py
>> +++ b/scripts/qapi/parser.py
>> @@ -108,6 +108,10 @@ def __init__(self,
>>           self.exprs: List[QAPIExpression] = []
>>           self.docs: List[QAPIDoc] = []
>>   
>> +        # State for tracking qmp-example blocks and simple
>> +        # :: literal blocks.
>> +        self._literal_mode = False
>> +
>>           # Showtime!
>>           self._parse()
>>   
>> @@ -423,12 +427,50 @@ def get_doc_line(self) -> Optional[str]:
>>               if self.val != '##':
>>                   raise QAPIParseError(
>>                       self, "junk after '##' at end of documentation comment")
>> +            self._literal_mode = False
>>               return None
>>           if self.val == '#':
>>               return ''
>>           if self.val[1] != ' ':
>>               raise QAPIParseError(self, "missing space after #")
>> -        return self.val[2:].rstrip()
>> +
>> +        line = self.val[2:].rstrip()
>> +
>> +        if re.match(r'(\.\. +qmp-example)? *::$', line):
>> +            self._literal_mode = True
> 
> This doesn't match the contracted form of literal blocks
> 
>      lorem ipsum ::
>          dolor sit amet
> 
> We don't use this form right now.  We can worry about matching it when
> we do.
> 
>> +        elif self._literal_mode and line and not line.startswith(' '):
>> +            # ReST directives stop at first non-blank non-indented line
>> +            self._literal_mode = False
> 
> This can miss the end of the literal block when the line with '::' is
> indented.  To reproduce ...
> 
>> +
>> +        if not self._literal_mode:
>> +            self._validate_doc_line_format(line)
> 
> ... tack a debug print here
> 
>             else:
>                 print('@@@', line)
> 
> and run
> 
>      $ pyvenv/bin/python3 /work/armbru/qemu/scripts/qapi-gen.py -o docs/qapi-firmware/ ../docs/interop/firmware.json
> 
> Relevant part of input:
> 
>      # @executable: Identifies the firmware executable.  The @mode
>      #     indicates whether there will be an associated NVRAM template
>      #     present.  The preferred corresponding QEMU command line options
>      #     are
>      #
>      #     ::
>      #
>      #         -drive if=none,id=pflash0,readonly=on,file=@executable.@filename,format=@executable.@format
>      #         -machine pflash0=pflash0
>      #
>      #     or equivalent -blockdev instead of -drive.  When @mode is
>      #     @combined the executable must be cloned before use and
>      #     configured with readonly=off.  With QEMU versions older than
>      #     4.0, you have to use
>      #
>      #     ::
>      #
>      #         -drive if=pflash,unit=0,readonly=on,file=@executable.@filename,format=@executable.@format
>      #
> 
> Relevant part of output:
> 
>      @@@     ::
>      @@@         -drive if=none,id=pflash0,readonly=on,file=@executable.@filename,format=@executable.@format
>      @@@         -machine pflash0=pflash0
>      @@@     or equivalent -blockdev instead of -drive.  When @mode is
>      @@@     @combined the executable must be cloned before use and
>      @@@     configured with readonly=off.  With QEMU versions older than
>      @@@     4.0, you have to use
>      @@@     ::
>      @@@         -drive if=pflash,unit=0,readonly=on,file=@executable.@filename,format=@executable.@format
> 
> Save the indentation of the line containing the '::'.  A line with less
> indentation ends the literal block.
> 
>> +
>> +        return line
>> +
>> +    def _validate_doc_line_format(self, line: str) -> None:
>> +        """
>> +        Validate documentation format rules for a single line:
>> +        1. Lines should not exceed 70 columns
>> +        2. Sentences should be separated by two spaces
>> +        """
>> +        full_line_length = len(line) + 2  # "# " = 2 characters
>> +        if full_line_length > 70:
>> +            # Skip URL lines - they can't be broken
>> +            if re.match(r' *(https?|ftp)://[^ ]*$', line):
>> +                pass
>> +            else:
>> +                raise QAPIParseError(
>> +                    self, "documentation line exceeds 70 columns"
>> +                )
>> +
>> +        single_space_pattern = r'(\be\.g\.|^ *\d\.|([.!?])) [A-Z0-9(]'
>> +        for m in list(re.finditer(single_space_pattern, line)):
>> +            if not m.group(2):
>> +                continue
>> +            # HACK so the error message points to the offending spot
>> +            self.pos = self.line_pos + 2 + m.start(2) + 1
> 
> I have an idea for a non-hacky solution.  I'll post it when ready.
> Until then, don't worry about it.
> 
>> +            raise QAPIParseError(
>> +                 self, "Use two spaces between sentences")
> 
> Note this check is somewhat prone to false positives.
> @single_space_pattern matches 'e.g.' to avoid the false positive
> 'e.g. FLAT' in block-core.json.  The same could happen for other
> abbreviations.
> 
> We could add more sophisticated heuristics to reduce the risk of false
> positives.  Meh.  I'd rather KISS for now.  We can deal with the problem
> once we have it.
> 
> We may want to add a hint, though.  Something like:
> 
>                 raise QAPIParseError(
>                      self, "Use two spaces between sentences\n"
>                      "If this not the end of a sentence, please report the bug")
> 
>>   
>>       @staticmethod
>>       def _match_at_name_colon(string: str) -> Optional[Match[str]]:
> 
> Negative tests would be nice, one for each new error.  Feel free to ask
> me to write them.
> 

No problem, it was interesting to look, how qapi-schema tests are done.

Hmm, while making the tests I thought: it would be more convenient to have all
err, json and out things in one file, like

     @json
     ##
     # @foo:
     #
     # This line has exactly 71 characters, including spaces and punctuation!
     ##
     { 'command': 'foo' }
     @out
     @err
     doc-bad-long-line.json:4:1: documentation line exceeds 70 columns

Or something like this. That will simplify creating a new test process from
copying (and modifying) three files to copying one.

(just thinking out loud, don't mind me)


v5 is sent. Tests are small and simple, I decided not split them out to separate
patches.

-- 
Best regards,
Vladimir