[PATCH 2/7] docs: kdoc: micro-optimize KernRe

Jonathan Corbet posted 7 patches 3 months, 1 week ago
There is a newer version of this series
Posted by Jonathan Corbet 3 months, 1 week ago
Switch KernRe::add_regex() to a try..except block to avoid looking up each
regex twice.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 scripts/lib/kdoc/kdoc_re.py | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py
index e81695b273bf..a467cd2f160b 100644
--- a/scripts/lib/kdoc/kdoc_re.py
+++ b/scripts/lib/kdoc/kdoc_re.py
@@ -29,12 +29,10 @@ class KernRe:
         """
         Adds a new regex or re-use it from the cache.
         """
-
-        if string in re_cache:
+        try:
             self.regex = re_cache[string]
-        else:
+        except KeyError:
             self.regex = re.compile(string, flags=flags)
-
             if self.cache:
                 re_cache[string] = self.regex
 
-- 
2.49.0
Re: [PATCH 2/7] docs: kdoc: micro-optimize KernRe
Posted by Mauro Carvalho Chehab 3 months ago
On Tue,  1 Jul 2025 14:57:25 -0600
Jonathan Corbet <corbet@lwn.net> wrote:

> Switch KernRe::add_regex() to a try..except block to avoid looking up each
> regex twice.
> 
> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
> ---
>  scripts/lib/kdoc/kdoc_re.py | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py
> index e81695b273bf..a467cd2f160b 100644
> --- a/scripts/lib/kdoc/kdoc_re.py
> +++ b/scripts/lib/kdoc/kdoc_re.py
> @@ -29,12 +29,10 @@ class KernRe:
>          """
>          Adds a new regex or re-use it from the cache.
>          """
> -
> -        if string in re_cache:
> +        try:
>              self.regex = re_cache[string]
> -        else:
> +        except KeyError:
>              self.regex = re.compile(string, flags=flags)
> -

Hmm... The reason I opted for this particular way of checking is that
I expected a membership check against the dict's hash table to be
faster than letting the lookup fail and raise an exception.
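
A quick way to test that expectation in isolation is a timeit
micro-benchmark. The sketch below is a toy under stated assumptions:
the cache dict and the three lookup_*() helpers are made up for
illustration and do not come from kdoc_re.py. It times a hit and a
miss for the membership test, the try/except form and dict.get():

```python
import timeit

# Toy cache with a single entry; stands in for re_cache.
cache = {"hit": object()}


def lookup_in(key):
    # Membership test followed by a second lookup on a hit.
    if key in cache:
        return cache[key]
    return None


def lookup_try(key):
    # Single lookup; a miss pays for raising and catching KeyError.
    try:
        return cache[key]
    except KeyError:
        return None


def lookup_get(key):
    # Single lookup; a miss returns None without an exception.
    return cache.get(key)


def bench(number=200_000):
    """Return {(function name, 'hit' or 'miss'): seconds for `number` calls}."""
    results = {}
    for func in (lookup_in, lookup_try, lookup_get):
        for key in ("hit", "miss"):
            results[(func.__name__, key)] = timeit.timeit(
                lambda: func(key), number=number
            )
    return results


if __name__ == "__main__":
    for (name, key), seconds in bench().items():
        print(f"{name:12s} {key:4s} {seconds:.4f}s")
```

The numbers depend on the Python version and on the hit/miss ratio, so
they only hint at what a full kernel-doc run would show.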

Btw, one easy way to check how much it affects performance
(if any) would be to run it in "rogue" mode with:

	$ time ./scripts/kernel-doc.py -N .

This will run kernel-doc.py on all files in the entire kernel
tree, only reporting problems. If you want to make changes like
this that might introduce performance regressions, I suggest
running it once, just to fill the disk caches, and then running
it again before/after such changes.
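
The warm-up-then-measure procedure can also be scripted. A minimal
sketch, assuming a hypothetical helper time_command() (not part of the
kernel tree) that runs a command untimed first and then records the
wall-clock time of each measured run:

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, warmups=1, runs=3):
    """Run cmd `warmups` times untimed, then return the wall-clock
    seconds of `runs` timed executions."""

    def run_once():
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)

    for _ in range(warmups):
        run_once()  # fills the disk caches

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)
    return samples


if __name__ == "__main__" and len(sys.argv) > 1:
    samples = time_command(sys.argv[1:])
    print(f"min {min(samples):.3f}s  "
          f"median {statistics.median(samples):.3f}s  "
          f"max {max(samples):.3f}s")
```

Saved under a name of your choice and called from the top of the tree
with "./scripts/kernel-doc.py -N ." as arguments, it reports the spread
across runs. Unlike time, it measures wall-clock time only, not
user/sys.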

Anyway, I took such measurements before and after your patch.
The difference was not relevant: just one second:

original code:

real	1m20,839s
user	1m19,594s
sys	0m0,998s

after your change:

real	1m21,805s
user	1m20,612s
sys	0m0,929s

I don't mind it being one second slower, but this is hardly
a micro-optimization ;-)

-

Disclaimer: a one-second difference here could be due to
some other background process on this laptop.
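
To put the figures in proportion, here is a small sketch that converts
the comma-decimal time output to seconds and prints the deltas. The
helper parse_time() is made up for this example:

```python
def parse_time(value):
    """Convert a `time` value such as '1m20,839s' to seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds.replace(",", "."))


# Measurements quoted in this message.
before = {"real": "1m20,839s", "user": "1m19,594s", "sys": "0m0,998s"}
after = {"real": "1m21,805s", "user": "1m20,612s", "sys": "0m0,929s"}

for field in before:
    old = parse_time(before[field])
    new = parse_time(after[field])
    delta = new - old
    print(f"{field}: {delta:+.3f}s ({100 * delta / old:+.2f}%)")
```

For these two runs the real time differs by 0.966s, roughly 1.2% of
the total, which is well within what background noise could explain.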

Regards,
Mauro
Re: [PATCH 2/7] docs: kdoc: micro-optimize KernRe
Posted by Jonathan Corbet 3 months ago
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:

> Hmm... The reason I opted for this particular way of checking is that
> I expected a membership check against the dict's hash table to be
> faster than letting the lookup fail and raise an exception.

Raising an exception is not quite a "crash" and, if the caching is doing
any good, it should be ... exceptional.  That pattern is often shown as
a better way to do conditional dict lookups, so I've tended to follow
it, even though I'm not a big fan of exceptions in general.

> Btw, one easy way to check how much it affects performance
> (if any) would be to run it in "rogue" mode with:
>
> 	$ time ./scripts/kernel-doc.py -N .
>
> This will run kernel-doc.py on all files in the entire kernel
> tree, only reporting problems. If you want to make changes like
> this that might introduce performance regressions, I suggest
> running it once, just to fill the disk caches, and then running
> it again before/after such changes.
>
> Anyway, I took such measurements before and after your patch.
> The difference was not relevant: just one second:
>
> original code:
>
> real	1m20,839s
> user	1m19,594s
> sys	0m0,998s
>
> after your change:
>
> real	1m21,805s
> user	1m20,612s
> sys	0m0,929s
>
> I don't mind it being one second slower, but this is hardly
> a micro-optimization ;-)

Docs builds generally went slightly faster for me, but that is always a
noisy signal.

Anyway, I am not tied to this patch and can drop it.  Or I suppose I
could just redo it with .get(), which avoids both the double lookup and
the exception.
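
A minimal sketch of the .get() variant, using a hypothetical CachedRe
class as a cut-down stand-in for KernRe. Only the caching logic is
reproduced, and an eventual patch may well look different:

```python
import re

# Module-level cache keyed on the pattern string, as in the diff.
re_cache = {}


class CachedRe:
    """Hypothetical stand-in for KernRe, reduced to the caching logic."""

    def __init__(self, string, cache=True, flags=0):
        self.cache = cache
        self.add_regex(string, flags)

    def add_regex(self, string, flags):
        """Add a new regex or reuse it from the cache."""
        # One dict lookup; a miss yields None instead of raising KeyError.
        self.regex = re_cache.get(string)
        if self.regex is None:
            self.regex = re.compile(string, flags=flags)
            if self.cache:
                re_cache[string] = self.regex
```

Two CachedRe(r"\d+") instances end up sharing one compiled pattern,
while cache=False leaves re_cache untouched.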

Thanks,

jon
Re: [PATCH 2/7] docs: kdoc: micro-optimize KernRe
Posted by Mauro Carvalho Chehab 3 months ago
On Thu, 03 Jul 2025 12:14:57 -0600
Jonathan Corbet <corbet@lwn.net> wrote:

> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> 
> > Hmm... The reason I opted for this particular way of checking is that
> > I expected a membership check against the dict's hash table to be
> > faster than letting the lookup fail and raise an exception.
> 
> Raising an exception is not quite a "crash" and, if the caching is doing
> any good, it should be ... exceptional.  That pattern is often shown as
> a better way to do conditional dict lookups, so I've tended to follow
> it, even though I'm not a big fan of exceptions in general.
> 
> > Btw, one easy way to check how much it affects performance
> > (if any) would be to run it in "rogue" mode with:
> >
> > 	$ time ./scripts/kernel-doc.py -N .
> >
> > This will run kernel-doc.py on all files in the entire kernel
> > tree, only reporting problems. If you want to make changes like
> > this that might introduce performance regressions, I suggest
> > running it once, just to fill the disk caches, and then running
> > it again before/after such changes.
> >
> > Anyway, I took such measurements before and after your patch.
> > The difference was not relevant: just one second:
> >
> > original code:
> >
> > real	1m20,839s
> > user	1m19,594s
> > sys	0m0,998s
> >
> > after your change:
> >
> > real	1m21,805s
> > user	1m20,612s
> > sys	0m0,929s
> >
> > I don't mind it being one second slower, but this is hardly
> > a micro-optimization ;-)
> 
> Docs builds generally went slightly faster for me, but that is always a
> noisy signal.

Maybe it is just noise. When I ran the test, I executed the script
a couple of times first, just to ensure that the disk cache wouldn't
affect it too much.

The advantage of running just kernel-doc without Sphinx is that we
avoid the doctree cache and other things that would add too much
randomness to the build time.

> Anyway, I am not tied to this patch and can drop it.  Or I suppose I
> could just redo it with .get(), which avoids both the double lookup and
> the exception.

I'm fine with .get().

Thanks,
Mauro