docs: Update our kernel-doc script to the kernel's new Python one

[PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one

Posted by Peter Maydell 3 months ago

Earlier this year, the Linux kernel's kernel-doc script was rewritten
from the old Perl version into a shiny and hopefully more maintainable
Python version. This commit series updates our copy of this script
to the latest kernel version. I have tested it by comparing the
generated HTML documentation and checking that there are no
unexpected changes.

Luckily we are carrying very few local modifications to the Perl
script, so this is fairly straightforward. The structure of the
patchset is:
 * a minor update to the kerneldoc.py Sphinx extension so it
   will work with both old and new kernel-doc script output
 * a fix to a doc comment markup error that I noticed while comparing
   the HTML output from the two versions of the script
 * import the new Python script, unmodified from the kernel's version
   (conveniently the kernel calls it kernel-doc.py, so it doesn't
   clash with the existing script)
 * make the changes to that library code that correspond to the
   two local QEMU-specific changes we carry
 * tell sphinx to use the Python version
 * delete the Perl script (I have put a diff of our local mods
   to the Perl script in the commit message of this commit, for
   posterity)

The diffstat looks big, but almost all of it is "import the
kernel's new script that we trust and don't need to review in
detail" and "delete the old script".

My immediate motivation for doing this update is that I noticed
that the submitter of https://gitlab.com/qemu-project/qemu/-/issues/3077
is using a Perl that complains about a construct in the Perl script.
That prompted me to check whether the kernel folks had already fixed
it, and it turned out that they had, by rewriting the whole thing :-)
More generally, if we don't do this update, then we're effectively
going to drift down the same path we did with checkpatch.pl, where
we have our own version that diverges from the kernel's version
and we have to maintain it ourselves.

We should also update the Sphinx plugin itself (i.e.
docs/sphinx/kerneldoc.py), but because I did not need to do
that to update the main kernel-doc script, I have left that as
a separate todo item.

Testing
-------

I looked at the HTML output of the old kernel-doc script versus the
new one, using the following diff command which mechanically excludes
a couple of "same minor change" everywhere diffs, and eyeballing the
resulting ~150 lines of diff.

diff -w  -I '^<div class="kernelindent docutils container">$' -I '^</div>$' -I '^<p><strong>Definition</strong>' -r -u -x searchindex.js build/x86/docs-old-kerneldoc/manual build/x86/docs/manual
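
For readers who want to try the same kind of comparison without GNU
diff, here is a rough Python sketch. It is an approximation, not what
was run above: diff -I suppresses hunks whose changed lines all match
a pattern, whereas this sketch simply drops matching lines from both
inputs before diffing. The function names are made up for
illustration; only the three patterns come from the command above.

```python
import difflib
import re

# Patterns copied from the -I options of the diff command above.
IGNORE_PATTERNS = [
    r'^<div class="kernelindent docutils container">$',
    r'^</div>$',
    r'^<p><strong>Definition</strong>',
]


def strip_ignored(lines, patterns=IGNORE_PATTERNS):
    """Drop every line that matches one of the ignore patterns."""
    compiled = [re.compile(p) for p in patterns]
    return [ln for ln in lines if not any(c.search(ln) for c in compiled)]


def filtered_diff(old_lines, new_lines):
    """Return unified-diff lines for the two inputs after filtering."""
    return list(difflib.unified_diff(strip_ignored(old_lines),
                                     strip_ignored(new_lines),
                                     fromfile="old", tofile="new",
                                     lineterm=""))
```

An empty result means the two inputs differ only in the ignored lines.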

The HTML changes are:

(1) some paras now have ID tags, eg:
-<p><strong>Functions operating on arrays of bits</strong></p>
+<p id="functions-operating-on-arrays-of-bits"><strong>Functions operating on arrays of bits</strong></p>

(2) Some extra named <div>s, eg:
+<div class="kernelindent docutils container">
 <p><strong>Parameters</strong></p>
 <dl class="simple">
 <dt><code class="docutils literal notranslate"><span class="pre">long</span> <span class="pre">nr</span></code></dt><dd><p>the bit to set</p>
@@ -144,12 +145,14 @@
 <dt><code class="docutils literal notranslate"><span class="pre">unsigned</span> <span class="pre">long</span> <span class="pre">*addr</span></code></dt><dd><p>the address to start counting from</p>
 </dd>
 </dl>
+</div>

(3) The new version correctly parses the multi-line Return: block for
the memory_translate_iotlb() doc comment. You can see that the
old HTML here had dt/dd markup, and it mis-renders in the HTML at
https://www.qemu.org/docs/master/devel/memory.html#c.memory_translate_iotlb

 <p><strong>Return</strong></p>
-<dl class="simple">
-<dt>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated</dt><dd><p>addr.  The MemoryRegion must not be
 accessed after rcu_read_unlock.
+<p>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated
+addr.  The MemoryRegion must not be accessed after rcu_read_unlock.
 On failure, return NULL, setting <strong>errp</strong> with error.</p>
-</dd>
-</dl>
+</div>

"Definition" sections now get output with a trailing colon:

-<p><strong>Definition</strong></p>
+<div class="kernelindent docutils container">
+<p><strong>Definition</strong>:</p>

This seems like it might be a bug in kernel-doc since the Parameters,
Return, etc sections don't get the trailing colon. I don't think it's
important enough to worry about.

thanks
-- PMM

Peter Maydell (8):
  docs/sphinx/kerneldoc.py: Handle new LINENO syntax
  tests/qtest/libqtest.h: Remove stray space from doc comment
  scripts: Import Python kerneldoc from Linux kernel
  scripts/kernel-doc: strip QEMU_ from function definitions
  scripts/kernel-doc: tweak for QEMU coding standards
  scripts/kerneldoc: Switch to the Python kernel-doc script
  scripts/kernel-doc: Delete the old Perl kernel-doc script
  MAINTAINERS: Put kernel-doc under the "docs build machinery" section

 MAINTAINERS                     |    2 +
 docs/conf.py                    |    4 +-
 docs/sphinx/kerneldoc.py        |    7 +-
 tests/qtest/libqtest.h          |    2 +-
 .editorconfig                   |    2 +-
 scripts/kernel-doc              | 2442 -------------------------------
 scripts/kernel-doc.py           |  325 ++++
 scripts/lib/kdoc/kdoc_files.py  |  291 ++++
 scripts/lib/kdoc/kdoc_item.py   |   42 +
 scripts/lib/kdoc/kdoc_output.py |  749 ++++++++++
 scripts/lib/kdoc/kdoc_parser.py | 1670 +++++++++++++++++++++
 scripts/lib/kdoc/kdoc_re.py     |  270 ++++
 12 files changed, 3355 insertions(+), 2451 deletions(-)
 delete mode 100755 scripts/kernel-doc
 create mode 100755 scripts/kernel-doc.py
 create mode 100644 scripts/lib/kdoc/kdoc_files.py
 create mode 100644 scripts/lib/kdoc/kdoc_item.py
 create mode 100644 scripts/lib/kdoc/kdoc_output.py
 create mode 100644 scripts/lib/kdoc/kdoc_parser.py
 create mode 100644 scripts/lib/kdoc/kdoc_re.py

-- 
2.43.0

Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one

Posted by Paolo Bonzini 2 months, 3 weeks ago

On 8/14/25 19:13, Peter Maydell wrote:
> Earlier this year, the Linux kernel's kernel-doc script was rewritten
> from the old Perl version into a shiny and hopefully more maintainable
> Python version. This commit series updates our copy of this script
> to the latest kernel version. I have tested it by comparing the
> generated HTML documentation and checking that there are no
> unexpected changes.
> 
> Luckily we are carrying very few local modifications to the Perl
> script, so this is fairly straightforward. The structure of the
> patchset is:
>   * a minor update to the kerneldoc.py Sphinx extension so it
>     will work with both old and new kernel-doc script output
>   * a fix to a doc comment markup error that I noticed while comparing
>     the HTML output from the two versions of the script
>   * import the new Python script, unmodified from the kernel's version
>     (conveniently the kernel calls it kernel-doc.py, so it doesn't
>     clash with the existing script)
>   * make the changes to that library code that correspond to the
>     two local QEMU-specific changes we carry
>   * tell sphinx to use the Python version
>   * delete the Perl script (I have put a diff of our local mods
>     to the Perl script in the commit message of this commit, for
>     posterity)
> 
> The diffstat looks big, but almost all of it is "import the
> kernel's new script that we trust and don't need to review in
> detail" and "delete the old script".
> 
> My immediate motivation for doing this update is that I noticed
> that the submitter of https://gitlab.com/qemu-project/qemu/-/issues/3077
> is using a Perl that complains about a construct in the perl script,
> which prompted me to check if the kernel folks had already fixed
> it, which it turned out that they had, by rewriting the whole thing :-)
> More generally, if we don't do this update, then we're effectively
> going to drift down the same path we did with checkpatch.pl, where
> we have our own version that diverges from the kernel's version
> and we have to maintain it ourselves.

Yep - for checkpatch.pl that makes sense, since we have more differences 
in what we test and we have backported the most pressing parser fixes, 
but kerneldoc has no reason to diverge.

Thanks for doing this!  For the whole series...

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo

> We should also update the Sphinx plugin itself (i.e.
> docs/sphinx/kerneldoc.py), but because I did not need to do
> that to update the main kernel-doc script, I have left that as
> a separate todo item.
> 
> Testing
> -------
> 
> I looked at the HTML output of the old kernel-doc script versus the
> new one, using the following diff command which mechanically excludes
> a couple of "same minor change" everywhere diffs, and eyeballing the
> resulting ~150 lines of diff.
> 
> diff -w  -I '^<div class="kernelindent docutils container">$' -I '^</div>$' -I '^<p><strong>Definition</strong>' -r -u -x searchindex.js build/x86/docs-old-kerneldoc/manual build/x86/docs/manual
> 
> The HTML changes are:
> 
> (1) some paras now have ID tags, eg:
> -<p><strong>Functions operating on arrays of bits</strong></p>
> +<p id="functions-operating-on-arrays-of-bits"><strong>Functions operating on arrays of bits</strong></p>
> 
> (2) Some extra named <div>s, eg:
> +<div class="kernelindent docutils container">
>   <p><strong>Parameters</strong></p>
>   <dl class="simple">
>   <dt><code class="docutils literal notranslate"><span class="pre">long</span> <span class="pre">nr</span></code></dt><dd><p>the bit to set</p>
> @@ -144,12 +145,14 @@
>   <dt><code class="docutils literal notranslate"><span class="pre">unsigned</span> <span class="pre">long</span> <span class="pre">*addr</span></code></dt><dd><p>the address to start counting from</p>
>   </dd>
>   </dl>
> +</div>
> 
> (3) The new version correctly parses the multi-line Return: block for
> the memory_translate_iotlb() doc comment. You can see that the
> old HTML here had dt/dd markup, and it mis-renders in the HTML at
> https://www.qemu.org/docs/master/devel/memory.html#c.memory_translate_iotlb
> 
>   <p><strong>Return</strong></p>
> -<dl class="simple">
> -<dt>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated</dt><dd><p>addr.  The MemoryRegion must not be
>   accessed after rcu_read_unlock.
> +<p>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated
> +addr.  The MemoryRegion must not be accessed after rcu_read_unlock.
>   On failure, return NULL, setting <strong>errp</strong> with error.</p>
> -</dd>
> -</dl>
> +</div>
> 
> "Definition" sections now get output with a trailing colon:
> 
> -<p><strong>Definition</strong></p>
> +<div class="kernelindent docutils container">
> +<p><strong>Definition</strong>:</p>
> 
> This seems like it might be a bug in kernel-doc since the Parameters,
> Return, etc sections don't get the trailing colon. I don't think it's
> important enough to worry about.
> 
> thanks
> -- PMM
> 
> Peter Maydell (8):
>    docs/sphinx/kerneldoc.py: Handle new LINENO syntax
>    tests/qtest/libqtest.h: Remove stray space from doc comment
>    scripts: Import Python kerneldoc from Linux kernel
>    scripts/kernel-doc: strip QEMU_ from function definitions
>    scripts/kernel-doc: tweak for QEMU coding standards
>    scripts/kerneldoc: Switch to the Python kernel-doc script
>    scripts/kernel-doc: Delete the old Perl kernel-doc script
>    MAINTAINERS: Put kernel-doc under the "docs build machinery" section
> 
>   MAINTAINERS                     |    2 +
>   docs/conf.py                    |    4 +-
>   docs/sphinx/kerneldoc.py        |    7 +-
>   tests/qtest/libqtest.h          |    2 +-
>   .editorconfig                   |    2 +-
>   scripts/kernel-doc              | 2442 -------------------------------
>   scripts/kernel-doc.py           |  325 ++++
>   scripts/lib/kdoc/kdoc_files.py  |  291 ++++
>   scripts/lib/kdoc/kdoc_item.py   |   42 +
>   scripts/lib/kdoc/kdoc_output.py |  749 ++++++++++
>   scripts/lib/kdoc/kdoc_parser.py | 1670 +++++++++++++++++++++
>   scripts/lib/kdoc/kdoc_re.py     |  270 ++++
>   12 files changed, 3355 insertions(+), 2451 deletions(-)
>   delete mode 100755 scripts/kernel-doc
>   create mode 100755 scripts/kernel-doc.py
>   create mode 100644 scripts/lib/kdoc/kdoc_files.py
>   create mode 100644 scripts/lib/kdoc/kdoc_item.py
>   create mode 100644 scripts/lib/kdoc/kdoc_output.py
>   create mode 100644 scripts/lib/kdoc/kdoc_parser.py
>   create mode 100644 scripts/lib/kdoc/kdoc_re.py
>

Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one

Posted by Jonathan Cameron 3 months ago

On Thu, 14 Aug 2025 18:13:15 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> Earlier this year, the Linux kernel's kernel-doc script was rewritten
> from the old Perl version into a shiny and hopefully more maintainable
> Python version. This commit series updates our copy of this script
> to the latest kernel version. I have tested it by comparing the
> generated HTML documentation and checking that there are no
> unexpected changes.
> 
> Luckily we are carrying very few local modifications to the Perl
> script, so this is fairly straightforward. The structure of the
> patchset is:
>  * a minor update to the kerneldoc.py Sphinx extension so it
>    will work with both old and new kernel-doc script output
>  * a fix to a doc comment markup error that I noticed while comparing
>    the HTML output from the two versions of the script
>  * import the new Python script, unmodified from the kernel's version
>    (conveniently the kernel calls it kernel-doc.py, so it doesn't
>    clash with the existing script)
>  * make the changes to that library code that correspond to the
>    two local QEMU-specific changes we carry
>  * tell sphinx to use the Python version
>  * delete the Perl script (I have put a diff of our local mods
>    to the Perl script in the commit message of this commit, for
>    posterity)
> 
> The diffstat looks big, but almost all of it is "import the
> kernel's new script that we trust and don't need to review in
> detail" and "delete the old script".

Given Mauro is somewhat active in qemu as well, +CC for information
if nothing else.

Jonathan



> 
> My immediate motivation for doing this update is that I noticed
> that the submitter of https://gitlab.com/qemu-project/qemu/-/issues/3077
> is using a Perl that complains about a construct in the perl script,
> which prompted me to check if the kernel folks had already fixed
> it, which it turned out that they had, by rewriting the whole thing :-)
> More generally, if we don't do this update, then we're effectively
> going to drift down the same path we did with checkpatch.pl, where
> we have our own version that diverges from the kernel's version
> and we have to maintain it ourselves.
> 
> We should also update the Sphinx plugin itself (i.e.
> docs/sphinx/kerneldoc.py), but because I did not need to do
> that to update the main kernel-doc script, I have left that as
> a separate todo item.
> 
> Testing
> -------
> 
> I looked at the HTML output of the old kernel-doc script versus the
> new one, using the following diff command which mechanically excludes
> a couple of "same minor change" everywhere diffs, and eyeballing the
> resulting ~150 lines of diff.
> 
> diff -w  -I '^<div class="kernelindent docutils container">$' -I '^</div>$' -I '^<p><strong>Definition</strong>' -r -u -x searchindex.js build/x86/docs-old-kerneldoc/manual build/x86/docs/manual
> 
> The HTML changes are:
> 
> (1) some paras now have ID tags, eg:
> -<p><strong>Functions operating on arrays of bits</strong></p>
> +<p id="functions-operating-on-arrays-of-bits"><strong>Functions operating on arrays of bits</strong></p>
> 
> (2) Some extra named <div>s, eg:
> +<div class="kernelindent docutils container">
>  <p><strong>Parameters</strong></p>
>  <dl class="simple">
>  <dt><code class="docutils literal notranslate"><span class="pre">long</span> <span class="pre">nr</span></code></dt><dd><p>the bit to set</p>
> @@ -144,12 +145,14 @@
>  <dt><code class="docutils literal notranslate"><span class="pre">unsigned</span> <span class="pre">long</span> <span class="pre">*addr</span></code></dt><dd><p>the address to start counting from</p>
>  </dd>
>  </dl>
> +</div>
> 
> (3) The new version correctly parses the multi-line Return: block for
> the memory_translate_iotlb() doc comment. You can see that the
> old HTML here had dt/dd markup, and it mis-renders in the HTML at
> https://www.qemu.org/docs/master/devel/memory.html#c.memory_translate_iotlb
> 
>  <p><strong>Return</strong></p>
> -<dl class="simple">
> -<dt>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated</dt><dd><p>addr.  The MemoryRegion must not be
>  accessed after rcu_read_unlock.
> +<p>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated
> +addr.  The MemoryRegion must not be accessed after rcu_read_unlock.
>  On failure, return NULL, setting <strong>errp</strong> with error.</p>
> -</dd>
> -</dl>
> +</div>
> 
> "Definition" sections now get output with a trailing colon:
> 
> -<p><strong>Definition</strong></p>
> +<div class="kernelindent docutils container">
> +<p><strong>Definition</strong>:</p>
> 
> This seems like it might be a bug in kernel-doc since the Parameters,
> Return, etc sections don't get the trailing colon. I don't think it's
> important enough to worry about.
> 
> thanks
> -- PMM
> 
> Peter Maydell (8):
>   docs/sphinx/kerneldoc.py: Handle new LINENO syntax
>   tests/qtest/libqtest.h: Remove stray space from doc comment
>   scripts: Import Python kerneldoc from Linux kernel
>   scripts/kernel-doc: strip QEMU_ from function definitions
>   scripts/kernel-doc: tweak for QEMU coding standards
>   scripts/kerneldoc: Switch to the Python kernel-doc script
>   scripts/kernel-doc: Delete the old Perl kernel-doc script
>   MAINTAINERS: Put kernel-doc under the "docs build machinery" section
> 
>  MAINTAINERS                     |    2 +
>  docs/conf.py                    |    4 +-
>  docs/sphinx/kerneldoc.py        |    7 +-
>  tests/qtest/libqtest.h          |    2 +-
>  .editorconfig                   |    2 +-
>  scripts/kernel-doc              | 2442 -------------------------------
>  scripts/kernel-doc.py           |  325 ++++
>  scripts/lib/kdoc/kdoc_files.py  |  291 ++++
>  scripts/lib/kdoc/kdoc_item.py   |   42 +
>  scripts/lib/kdoc/kdoc_output.py |  749 ++++++++++
>  scripts/lib/kdoc/kdoc_parser.py | 1670 +++++++++++++++++++++
>  scripts/lib/kdoc/kdoc_re.py     |  270 ++++
>  12 files changed, 3355 insertions(+), 2451 deletions(-)
>  delete mode 100755 scripts/kernel-doc
>  create mode 100755 scripts/kernel-doc.py
>  create mode 100644 scripts/lib/kdoc/kdoc_files.py
>  create mode 100644 scripts/lib/kdoc/kdoc_item.py
>  create mode 100644 scripts/lib/kdoc/kdoc_output.py
>  create mode 100644 scripts/lib/kdoc/kdoc_parser.py
>  create mode 100644 scripts/lib/kdoc/kdoc_re.py
>

Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one

Posted by Mauro Carvalho Chehab 3 months ago

Hi Peter/Jonathan,

Em Fri, 15 Aug 2025 10:11:09 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> escreveu:

> On Thu, 14 Aug 2025 18:13:15 +0100
> Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> > Earlier this year, the Linux kernel's kernel-doc script was rewritten
> > from the old Perl version into a shiny and hopefully more maintainable
> > Python version. This commit series updates our copy of this script
> > to the latest kernel version. I have tested it by comparing the
> > generated HTML documentation and checking that there are no
> > unexpected changes.

Nice! Yeah, I had a branch here doing something similar for QEMU, 
but got sidetracked by other things and didn't have time to address
a couple of issues. I'm glad you found the time for it.

> > Luckily we are carrying very few local modifications to the Perl
> > script, so this is fairly straightforward. The structure of the
> > patchset is:
> >  * a minor update to the kerneldoc.py Sphinx extension so it
> >    will work with both old and new kernel-doc script output
> >  * a fix to a doc comment markup error that I noticed while comparing
> >    the HTML output from the two versions of the script
> >  * import the new Python script, unmodified from the kernel's version
> >    (conveniently the kernel calls it kernel-doc.py, so it doesn't
> >    clash with the existing script)

> >  * make the changes to that library code that correspond to the
> >    two local QEMU-specific changes we carry

To make it easier to maintain and keep in sync with Kernel upstream,
perhaps we can change Kernel upstream to make it easier for QEMU
to have a class override for the kdoc parser, allowing it to just
sync with Linux upstream while keeping its own set of rules in a
separate file.

An RFC along those lines is welcome. Otherwise, I'll try to spare some
time to think of a good way of doing that.
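
As a rough sketch of what such an override could look like: the class
and method names below are hypothetical and are not the real kdoc
parser API, and the QEMU_ rule is only an assumed stand-in for the
kind of local tweak the series carries.

```python
import re


class BaseProtoCleaner:
    """Hypothetical stand-in for an upstream parser class (not real kdoc)."""

    def project_cleanups(self, proto):
        # Hook that a downstream project may override; upstream does nothing.
        return proto

    def clean_prototype(self, proto):
        proto = self.project_cleanups(proto)
        # Generic tidying: collapse runs of whitespace into single spaces.
        return re.sub(r"\s+", " ", proto).strip()


class QemuProtoCleaner(BaseProtoCleaner):
    """Hypothetical QEMU override that could live in a separate file."""

    def project_cleanups(self, proto):
        # Assumed rule: drop QEMU_-prefixed macro tokens from prototypes.
        return re.sub(r"\bQEMU_[A-Z_]+\s*", "", proto)
```

With a hook like this, the upstream file could be synced unmodified
and only the subclass would need local maintenance.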

> >  * tell sphinx to use the Python version
> >  * delete the Perl script (I have put a diff of our local mods
> >    to the Perl script in the commit message of this commit, for
> >    posterity)
> > 
> > The diffstat looks big, but almost all of it is "import the
> > kernel's new script that we trust and don't need to review in
> > detail" and "delete the old script".  

One thing worth noting is that Jonathan Corbet is currently
doing several cleanups of the Python script: simplifying some
regular expressions, avoiding them when str.replace() does the job,
and adding comments. The end goal is to make it easier for developers
to understand the code and help maintain it.
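
To illustrate the str.replace() point with a made-up example (the
input string is not taken from the kernel script): when the text to
remove is a fixed literal, both forms give the same result, and the
plain-string one needs no pattern syntax.

```python
import re

text = "struct foo __packed bar"

# Regex version: works, but the reader has to parse a pattern.
via_regex = re.sub(r"__packed ", "", text)

# Plain-string version: same result for a fixed literal.
via_replace = text.replace("__packed ", "")
```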

So, it is probably worth backporting Linux upstream changes after
the end of the Kernel 6.17 cycle.

> 
> Given Mauro is somewhat active in qemu as well, +CC for information
> if nothing else.
> 
> Jonathan
> 
> 
> 
> > 
> > My immediate motivation for doing this update is that I noticed
> > that the submitter of https://gitlab.com/qemu-project/qemu/-/issues/3077
> > is using a Perl that complains about a construct in the perl script,
> > which prompted me to check if the kernel folks had already fixed
> > it, which it turned out that they had, by rewriting the whole thing :-)
> > More generally, if we don't do this update, then we're effectively
> > going to drift down the same path we did with checkpatch.pl, where
> > we have our own version that diverges from the kernel's version
> > and we have to maintain it ourselves.
> > 
> > We should also update the Sphinx plugin itself (i.e.
> > docs/sphinx/kerneldoc.py), but because I did not need to do
> > that to update the main kernel-doc script, I have left that as
> > a separate todo item.

The Kernel Sphinx plugin after the change is IMHO (*) a lot cleaner
than before, and handles kernel-doc warnings better, as they now
use the Sphinx logger class.

(*) I'm a little biased when talking about it, as I made the
    changes there too ;-)

-

Btw, one important point to note: if you picked the latest version
of kernel-doc, it currently requires at least Python 3.6 (3.7 is the
recommended minimum). It does check that, silently bailing out
if Python < 3.6.

With Python 3.6, it emits a warning, as the parameter order for
structs and functions won't match the original order: the script
assumes the 3.7+ dict behavior where insertion order is preserved.

So, in the QEMU build instructions, I would add a notice asking for at
least 3.7 to build docs.
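
A minimal sketch of the thresholds described above. This is not the
script's actual check; the function name and return strings are
invented for illustration. The second half shows the dict behavior
the script relies on.

```python
import sys


def python_support_level(version=None):
    """Classify a (major, minor) Python version per the thresholds above."""
    if version is None:
        version = sys.version_info[:2]
    if version < (3, 6):
        return "unsupported"   # the script bails out here
    if version < (3, 7):
        return "warn"          # dict order not guaranteed by the language
    return "ok"


# On 3.7+ a plain dict keeps keys in insertion order, which is what
# preserves the declared order of parameters.
params = {}
for name in ["nr", "addr", "size"]:
    params[name] = "description of " + name
```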

> > 
> > Testing
> > -------
> > 
> > I looked at the HTML output of the old kernel-doc script versus the
> > new one, using the following diff command which mechanically excludes
> > a couple of "same minor change" everywhere diffs, and eyeballing the
> > resulting ~150 lines of diff.
> > 
> > diff -w  -I '^<div class="kernelindent docutils container">$' -I '^</div>$' -I '^<p><strong>Definition</strong>' -r -u -x searchindex.js build/x86/docs-old-kerneldoc/manual build/x86/docs/manual
> > 
> > The HTML changes are:
> > 
> > (1) some paras now have ID tags, eg:
> > -<p><strong>Functions operating on arrays of bits</strong></p>
> > +<p id="functions-operating-on-arrays-of-bits"><strong>Functions operating on arrays of bits</strong></p>
> > 
> > (2) Some extra named <div>s, eg:
> > +<div class="kernelindent docutils container">
> >  <p><strong>Parameters</strong></p>
> >  <dl class="simple">
> >  <dt><code class="docutils literal notranslate"><span class="pre">long</span> <span class="pre">nr</span></code></dt><dd><p>the bit to set</p>
> > @@ -144,12 +145,14 @@
> >  <dt><code class="docutils literal notranslate"><span class="pre">unsigned</span> <span class="pre">long</span> <span class="pre">*addr</span></code></dt><dd><p>the address to start counting from</p>
> >  </dd>
> >  </dl>
> > +</div>
> > 
> > (3) The new version correctly parses the multi-line Return: block for
> > the memory_translate_iotlb() doc comment. You can see that the
> > old HTML here had dt/dd markup, and it mis-renders in the HTML at
> > https://www.qemu.org/docs/master/devel/memory.html#c.memory_translate_iotlb
> > 
> >  <p><strong>Return</strong></p>
> > -<dl class="simple">
> > -<dt>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated</dt><dd><p>addr.  The MemoryRegion must not be
> >  accessed after rcu_read_unlock.
> > +<p>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated
> > +addr.  The MemoryRegion must not be accessed after rcu_read_unlock.
> >  On failure, return NULL, setting <strong>errp</strong> with error.</p>
> > -</dd>
> > -</dl>
> > +</div>
> > 
> > "Definition" sections now get output with a trailing colon:
> > 
> > -<p><strong>Definition</strong></p>
> > +<div class="kernelindent docutils container">
> > +<p><strong>Definition</strong>:</p>
> > 
> > This seems like it might be a bug in kernel-doc since the Parameters,
> > Return, etc sections don't get the trailing colon. I don't think it's
> > important enough to worry about.
> > 
> > thanks
> > -- PMM
> > 
> > Peter Maydell (8):
> >   docs/sphinx/kerneldoc.py: Handle new LINENO syntax
> >   tests/qtest/libqtest.h: Remove stray space from doc comment
> >   scripts: Import Python kerneldoc from Linux kernel
> >   scripts/kernel-doc: strip QEMU_ from function definitions
> >   scripts/kernel-doc: tweak for QEMU coding standards
> >   scripts/kerneldoc: Switch to the Python kernel-doc script
> >   scripts/kernel-doc: Delete the old Perl kernel-doc script
> >   MAINTAINERS: Put kernel-doc under the "docs build machinery" section

I'll review the actual patches later.

> > 
> >  MAINTAINERS                     |    2 +
> >  docs/conf.py                    |    4 +-
> >  docs/sphinx/kerneldoc.py        |    7 +-
> >  tests/qtest/libqtest.h          |    2 +-
> >  .editorconfig                   |    2 +-
> >  scripts/kernel-doc              | 2442 -------------------------------
> >  scripts/kernel-doc.py           |  325 ++++
> >  scripts/lib/kdoc/kdoc_files.py  |  291 ++++
> >  scripts/lib/kdoc/kdoc_item.py   |   42 +
> >  scripts/lib/kdoc/kdoc_output.py |  749 ++++++++++
> >  scripts/lib/kdoc/kdoc_parser.py | 1670 +++++++++++++++++++++
> >  scripts/lib/kdoc/kdoc_re.py     |  270 ++++
> >  12 files changed, 3355 insertions(+), 2451 deletions(-)
> >  delete mode 100755 scripts/kernel-doc
> >  create mode 100755 scripts/kernel-doc.py
> >  create mode 100644 scripts/lib/kdoc/kdoc_files.py
> >  create mode 100644 scripts/lib/kdoc/kdoc_item.py
> >  create mode 100644 scripts/lib/kdoc/kdoc_output.py
> >  create mode 100644 scripts/lib/kdoc/kdoc_parser.py
> >  create mode 100644 scripts/lib/kdoc/kdoc_re.py
> >   
> 



Thanks,
Mauro

Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one

Posted by Peter Maydell 3 months ago

On Fri, 15 Aug 2025 at 10:39, Mauro Carvalho Chehab
<mchehab+huawei@kernel.org> wrote:
>
> Hi Peter/Jonathan,
>
> Em Fri, 15 Aug 2025 10:11:09 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> escreveu:
>
> > On Thu, 14 Aug 2025 18:13:15 +0100
> > Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > > Earlier this year, the Linux kernel's kernel-doc script was rewritten
> > > from the old Perl version into a shiny and hopefully more maintainable
> > > Python version. This commit series updates our copy of this script
> > > to the latest kernel version. I have tested it by comparing the
> > > generated HTML documentation and checking that there are no
> > > unexpected changes.
>
> Nice! Yeah, I had a branch here doing something similar for QEMU,
> but got sidetracked by other things and didn't have time to address
> a couple of issues. I'm glad you find the time for it.
>
> > > Luckily we are carrying very few local modifications to the Perl
> > > script, so this is fairly straightforward. The structure of the
> > > patchset is:
> > >  * a minor update to the kerneldoc.py Sphinx extension so it
> > >    will work with both old and new kernel-doc script output
> > >  * a fix to a doc comment markup error that I noticed while comparing
> > >    the HTML output from the two versions of the script
> > >  * import the new Python script, unmodified from the kernel's version
> > >    (conveniently the kernel calls it kernel-doc.py, so it doesn't
> > >    clash with the existing script)
>
> > >  * make the changes to that library code that correspond to the
> > >    two local QEMU-specific changes we carry
>
> To make it easier to maintain and keep in sync with Kernel upstream,
> perhaps we can try to change Kernel upstream to make easier for QEMU
> to have a class override for the kdoc parser, allowing it to just
> sync with Linux upstream, while having its own set of rules on a
> separate file.

Mmm, this would certainly be nice, but at least so far we haven't
needed to make extensive changes, luckily (you can see how small
our local adjustments are here).

> > >  * tell sphinx to use the Python version
> > >  * delete the Perl script (I have put a diff of our local mods
> > >    to the Perl script in the commit message of this commit, for
> > >    posterity)
> > >
> > > The diffstat looks big, but almost all of it is "import the
> > > kernel's new script that we trust and don't need to review in
> > > detail" and "delete the old script".
>
> One thing that should be noticed is that Jonathan Corbet is currently
> doing several cleanups at the Python script, simplifying some
> regular expressions, avoiding them when str.replace() does the job
> and adding comments. The end goal is to make it easier for developers
> to understand and help maintaining its code.
>
> So, it is probably worth backporting Linux upstream changes after
> the end of Kernel 6.17 cycle.

Thanks for the heads-up on that one. A further sync should
be straightforward after this one, I expect.

> > > We should also update the Sphinx plugin itself (i.e.
> > > docs/sphinx/kerneldoc.py), but because I did not need to do
> > > that to update the main kernel-doc script, I have left that as
> > > a separate todo item.
>
> The Kernel Sphinx plugin after the change is IMHO (*) a lot cleaner
> than before, and hendles better kernel-doc warnings, as they are now
> using Sphinx logger class.

Also as much as anything else it's just nice for us not to
diverge if we can avoid it.

Incidentally, I'm curious if the kernel docs see problems
with docutils 0.22 -- we had a report about problems there,
at least some of which seem to be because the way kerneldoc.py
adds its rST output is triggering the new docutils to complain
if the added code doesn't have a consistent title style
hierarchy: https://sourceforge.net/p/docutils/bugs/508/
(It looks like they're trying to address this on the docutils side;
we might or might not adjust on our side too by fixing up the
title styles if that's not too awkward for us.)

> Btw, one important point to notice: if you picked the latest version
> of kernel-doc, it currently requires at least Python 3.6 (3.7 is the
> recommended minimal one). It does check that, silently bailing out
> if Python < 3.6.

QEMU already requires Python 3.9 or better; our configure checks:

check_py_version() {
    # We require python >= 3.9.
    # NB: a True python conditional creates a non-zero return code (Failure)
    "$1" -c 'import sys; sys.exit(sys.version_info < (3,9))'
}
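
For anyone puzzled by the "NB" comment: sys.exit() treats a bool as
an integer, so a True comparison becomes exit status 1 (failure in
shell terms) and False becomes 0. Below is a minimal Python sketch of
the same check; the helper name meets_minimum is made up for
illustration and is not part of configure.

```python
import subprocess
import sys


def meets_minimum(python_exe, minimum=(3, 9)):
    """Return True if python_exe reports a version >= minimum.

    Hypothetical helper mirroring the configure one-liner.
    """
    # sys.exit(True) exits with status 1, sys.exit(False) with status 0,
    # so "version too old" maps to a non-zero (failing) exit status.
    check = f"import sys; sys.exit(sys.version_info < {minimum!r})"
    result = subprocess.run([python_exe, "-c", check])
    return result.returncode == 0


if __name__ == "__main__":
    print(meets_minimum(sys.executable))
```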

Thanks for the confirmation that the kernel is being more
conservative on python requirements than we are; I did
wonder about this but merely assumed you probably were
rather than specifically checking :-)


On this minor output change:

> > > "Definition" sections now get output with a trailing colon:
> > >
> > > -<p><strong>Definition</strong></p>
> > > +<div class="kernelindent docutils container">
> > > +<p><strong>Definition</strong>:</p>
> > >
> > > This seems like it might be a bug in kernel-doc since the Parameters,
> > > Return, etc sections don't get the trailing colon. I don't think it's
> > > important enough to worry about.

is the extra colon intentional, or do you agree that it's
a bug? You can see it in the kernel docs output at e.g.
https://docs.kernel.org/core-api/workqueue.html#c.workqueue_attrs

where in the documentation of struct workqueue_attrs,
"Definition:" gets a colon but the corresponding "Members"
and "Description" don't.  (Also "Description" is out-dented
there when it probably should not be, but that's separate.)

thanks
-- PMM

Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one

Posted by Mauro Carvalho Chehab 3 months ago

On Fri, 15 Aug 2025 11:10:05 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Fri, 15 Aug 2025 at 10:39, Mauro Carvalho Chehab
> <mchehab+huawei@kernel.org> wrote:
> >
> > Hi Peter/Jonathan,
> >
> > On Fri, 15 Aug 2025 10:11:09 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >  
> > > On Thu, 14 Aug 2025 18:13:15 +0100
> > > Peter Maydell <peter.maydell@linaro.org> wrote:
> > >  
> > > > Earlier this year, the Linux kernel's kernel-doc script was rewritten
> > > > from the old Perl version into a shiny and hopefully more maintainable
> > > > Python version. This commit series updates our copy of this script
> > > > to the latest kernel version. I have tested it by comparing the
> > > > generated HTML documentation and checking that there are no
> > > > unexpected changes.  
> >
> > Nice! Yeah, I had a branch here doing something similar for QEMU,
> > but got sidetracked by other things and didn't have time to address
> > a couple of issues. I'm glad you found the time for it.
> >  
> > > > Luckily we are carrying very few local modifications to the Perl
> > > > script, so this is fairly straightforward. The structure of the
> > > > patchset is:
> > > >  * a minor update to the kerneldoc.py Sphinx extension so it
> > > >    will work with both old and new kernel-doc script output
> > > >  * a fix to a doc comment markup error that I noticed while comparing
> > > >    the HTML output from the two versions of the script
> > > >  * import the new Python script, unmodified from the kernel's version
> > > >    (conveniently the kernel calls it kernel-doc.py, so it doesn't
> > > >    clash with the existing script)  
> >  
> > > >  * make the changes to that library code that correspond to the
> > > >    two local QEMU-specific changes we carry  
> >
> > To make it easier to maintain and keep in sync with Kernel upstream,
> > perhaps we can try to change Kernel upstream to make it easier for QEMU
> > to have a class override for the kdoc parser, allowing it to just
> > sync with Linux upstream, while having its own set of rules in a
> > separate file.  
> 
> Mmm, this would certainly be nice, but at least so far we haven't
> needed to make extensive changes, luckily (you can see how small
> our local adjustments are here).

I just reviewed the series. IMO, if you create a class override for
RestOutput, as I suggested, there will be just a single line
of difference:

	            (r"QEMU_[A-Z_]+ +", "", 0),

Not sure about others, but, from my side, I don't mind picking up a
patch like that in Kernel upstream, if it doesn't cause any
regressions there (unlikely, but I didn't check).
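
To illustrate the override idea, here is a rough sketch. The class
names, the prototype_subs attribute and clean_prototype() are all
hypothetical stand-ins, not the actual kdoc API, and treating the
third tuple element as regex flags is an assumption; only the
QEMU_ tuple itself is taken from the series.

```python
import re


class BaseOutput:
    """Hypothetical stand-in for the class synced from upstream."""

    # (pattern, replacement, flags) tuples applied to each prototype.
    # The single upstream-style entry here is only an example.
    prototype_subs = [
        (r"__always_inline +", "", 0),
    ]

    def clean_prototype(self, proto):
        for pattern, repl, flags in self.prototype_subs:
            proto = re.sub(pattern, repl, proto, flags=flags)
        return proto


class QemuOutput(BaseOutput):
    """Hypothetical QEMU override living in a separate file."""

    # Extend the inherited list rather than editing the synced code.
    prototype_subs = BaseOutput.prototype_subs + [
        (r"QEMU_[A-Z_]+ +", "", 0),
    ]


if __name__ == "__main__":
    print(QemuOutput().clean_prototype("QEMU_WARN_UNUSED_RESULT int foo(void)"))
```

With a layout along these lines, the synced file stays byte-identical
to upstream and the QEMU-specific rule lives only in the subclass.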

> > > >  * tell sphinx to use the Python version
> > > >  * delete the Perl script (I have put a diff of our local mods
> > > >    to the Perl script in the commit message of this commit, for
> > > >    posterity)
> > > >
> > > > The diffstat looks big, but almost all of it is "import the
> > > > kernel's new script that we trust and don't need to review in
> > > > detail" and "delete the old script".  
> >
> > One thing that should be noticed is that Jonathan Corbet is currently
> > doing several cleanups of the Python script, simplifying some
> > regular expressions, avoiding them when str.replace() does the job
> > and adding comments. The end goal is to make it easier for developers
> > to understand and help maintain its code.
> >
> > So, it is probably worth backporting Linux upstream changes after
> > the end of Kernel 6.17 cycle.  
> 
> Thanks for the heads-up on that one. A further sync should
> be straightforward after this one, I expect.

Yeah, it sounds so.

> > > > We should also update the Sphinx plugin itself (i.e.
> > > > docs/sphinx/kerneldoc.py), but because I did not need to do
> > > > that to update the main kernel-doc script, I have left that as
> > > > a separate todo item.  
> >
> > The Kernel Sphinx plugin after the change is IMHO (*) a lot cleaner
> > than before, and handles kernel-doc warnings better, as they now
> > use the Sphinx logger class.  
> 
> Also as much as anything else it's just nice for us not to
> diverge if we can avoid it.
> 
> Incidentally, I'm curious if the kernel docs see problems
> with docutils 0.22 -- we had a report about problems there,
> at least some of which seem to be because the way kerneldoc.py
> adds its rST output is triggering the new docutils to complain
> if the added code doesn't have a consistent title style
> hierarchy: https://sourceforge.net/p/docutils/bugs/508/
> (It looks like they're trying to address this on the docutils side;
> we might or might not adjust on our side too by fixing up the
> title styles if that's not too awkward for us.)

I have only tested building with docutils 0.17 up to 0.21.2, and it
worked fine for all of them. 0.22 was released on 2025-07-29; I haven't
tested it yet, nor am I aware of anyone complaining about it on the
Kernel MLs.

Btw, I wrote an upstream script to test building docs with different
Sphinx and docutils versions.

It is under:
	scripts/test_doc_build.py

It probably makes sense to port it to QEMU and add it to CI. Most of
the logic is independent of the Kernel. The only part that would
require adjustments is the logic in _handle_version() that creates
the make commands to clean docs and build html.

> 
> > Btw, one important point to notice: if you picked the latest version
> > of kernel-doc, it currently requires at least Python 3.6 (3.7 is the
> > recommended minimal one). It does check that, silently bailing out
> > if Python < 3.6.  
> 
> QEMU already requires Python 3.9 or better; our configure checks:
> 
> check_py_version() {
>     # We require python >= 3.9.
>     # NB: a True python conditional creates a non-zero return code (Failure)
>     "$1" -c 'import sys; sys.exit(sys.version_info < (3,9))'
> }

Great!

> Thanks for the confirmation that the kernel is being more
> conservative on python requirements than we are; I did
> wonder about this but merely assumed you probably were
> rather than specifically checking :-)

Heh, an early change in the 6.17 cycle inadvertently made it require
3.9 ;-)

We ended up changing it to preserve 3.7+ support, as we wanted to
ensure it would build on openSUSE Leap.

> On this minor output change:
> 
> > > > "Definition" sections now get output with a trailing colon:
> > > >
> > > > -<p><strong>Definition</strong></p>
> > > > +<div class="kernelindent docutils container">
> > > > +<p><strong>Definition</strong>:</p>
> > > >
> > > > This seems like it might be a bug in kernel-doc since the Parameters,
> > > > Return, etc sections don't get the trailing colon. I don't think it's
> > > > important enough to worry about.  
> 
> is the extra colon intentional, or do you agree that it's
> a bug? You can see it in the kernel docs output at e.g.
> https://docs.kernel.org/core-api/workqueue.html#c.workqueue_attrs
> 
> where in the documentation of struct workqueue_attrs,
> "Definition:" gets a colon but the corresponding "Members"
> and "Description" don't.

This one predates kernel-doc.py, as it also exists in the Perl version:

	$ grep Definition scripts/kernel-doc.pl
	    print $lineprefix . "**Definition**::\n\n";

It seems this was added in this upstream commit:

commit eaf710ceb5ae284778a87c0d0f2348c19e3e4751
Author: Jonathan Corbet <corbet@lwn.net>
Date:   Fri Sep 30 11:52:09 2022 -0600

    docs: improve the HTML formatting of kerneldoc comments
    
    Make a few changes to cause functions documented by kerneldoc to stand out
    better in the rendered documentation.  Specifically, change kernel-doc to
    put the description section into a ".. container::" section, then add a bit
    of CSS to indent that section relative to the function prototype (or struct
    or enum definition).  Tweak a few other CSS parameters while in the
    neighborhood to improve the formatting.
    
    Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org>
    Signed-off-by: Jonathan Corbet <corbet@lwn.net>

While I don't mind much either way, IMO the best would be to drop
the extra ":" at the end.

Feel free to submit a Kernel patch upstream dropping it from 
scripts/lib/kdoc/kdoc_output.py.
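
One thing to keep in mind for such a patch: in reStructuredText a
paragraph ending in "::" also introduces the literal block that
follows, and the visible colon comes from that rule. The sketch
below models only that rule; the function name is made up, and it
assumes the emitted line is "**Definition**::" as in the grep output
above.

```python
def render_paragraph_ending(line):
    """Model how rST shows a paragraph line that ends in '::'.

    Hypothetical helper, not docutils code.
    """
    if not line.endswith("::"):
        return line
    head = line[:-2]
    if head.strip() == "":
        # '::' on its own: the paragraph is dropped from the output.
        return ""
    if head[-1].isspace():
        # 'text ::' -> both colons are removed.
        return head.rstrip()
    # 'text::' -> one colon stays visible.
    return head + ":"


if __name__ == "__main__":
    print(render_paragraph_ending("**Definition**::"))
    print(render_paragraph_ending("**Definition** ::"))
```

So simply deleting the "::" would also lose the literal block; putting
whitespace before it, or moving it to its own paragraph, should keep
the block while hiding the colon.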

> (Also "Description" is out-dented
> there when it probably should not be, but that's separate.)

Yeah, indenting Description makes sense to me. 

Thanks,
Mauro