[PATCH v2 1/2] get_maintainer: add patch-only keyword-matching

Justin Stitt posted 2 patches 2 years, 2 months ago
[PATCH v2 1/2] get_maintainer: add patch-only keyword-matching
Posted by Justin Stitt 2 years, 2 months ago
Add the "D:" type which behaves the same as "K:" but will only match
content present in a patch file.

To illustrate:

Imagine this entry in MAINTAINERS:

NEW REPUBLIC
M: Han Solo <hansolo@rebelalliance.co>
W: https://www.jointheresistance.org
D: \bstrncpy\b

Our maintainer, Han, will only be added to the recipients if a patch
file is passed to get_maintainer (like what b4 does):
$ ./scripts/get_maintainer.pl 0004-some-change.patch

If the above patch has a `strncpy` present in the subject, commit log or
diff then Han will be to/cc'd.

However, in the event of a file from the tree given like:
$ ./scripts/get_maintainer.pl ./lib/string.c

Han will not be noisily to/cc'd (like a K: type would in this
circumstance)

Signed-off-by: Justin Stitt <justinstitt@google.com>
---
 MAINTAINERS               |  5 +++++
 scripts/get_maintainer.pl | 12 ++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index b19995690904..94e431daa7c2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -59,6 +59,11 @@ Descriptions of section entries and preferred order
 	      matches patches or files that contain one or more of the words
 	      printk, pr_info or pr_err
 	   One regex pattern per line.  Multiple K: lines acceptable.
+  D: *Diff content regex* (perl extended) pattern match that applies only to
+     patches and not entire files (e.g. when using the get_maintainers.pl
+     script).
+     Usage same as K:.
+
 
 Maintainers List
 ----------------
diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index ab123b498fd9..a3e697926ddd 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -342,6 +342,7 @@ if ($tree && !top_of_kernel_tree($lk_path)) {
 
 my @typevalue = ();
 my %keyword_hash;
+my %patch_keyword_hash;
 my @mfiles = ();
 my @self_test_info = ();
 
@@ -369,8 +370,10 @@ sub read_maintainer_file {
 		    $value =~ s@([^/])$@$1/@;
 		}
 	    } elsif ($type eq "K") {
-		$keyword_hash{@typevalue} = $value;
-	    }
+          $keyword_hash{@typevalue} = $value;
+	    } elsif ($type eq "D") {
+          $patch_keyword_hash{@typevalue} = $value;
+      }
 	    push(@typevalue, "$type:$value");
 	} elsif (!(/^\s*$/ || /^\s*\#/)) {
 	    push(@typevalue, $line);
@@ -607,6 +610,11 @@ foreach my $file (@ARGV) {
 			push(@keyword_tvi, $line);
 		    }
 		}
+    foreach my $line(keys %patch_keyword_hash) {
+      if ($patch_line =~ m/${patch_prefix}$patch_keyword_hash{$line}/x) {
+        push(@keyword_tvi, $line);
+      }
+    }
 	    }
 	}
 	close($patch);

-- 
2.42.0.582.g8ccd20d70d-goog
Re: [PATCH v2 1/2] get_maintainer: add patch-only keyword-matching
Posted by Joe Perches 2 years, 2 months ago
On Thu, 2023-09-28 at 04:23 +0000, Justin Stitt wrote:
> Add the "D:" type which behaves the same as "K:" but will only match
> content present in a patch file.
> 
> To illustrate:
> 
> Imagine this entry in MAINTAINERS:
> 
> NEW REPUBLIC
> M: Han Solo <hansolo@rebelalliance.co>
> W: https://www.jointheresistance.org
> D: \bstrncpy\b
> 
> Our maintainer, Han, will only be added to the recipients if a patch
> file is passed to get_maintainer (like what b4 does):
> $ ./scripts/get_maintainer.pl 0004-some-change.patch
> 
> If the above patch has a `strncpy` present in the subject, commit log or
> diff then Han will be to/cc'd.
> 
> However, in the event of a file from the tree given like:
> $ ./scripts/get_maintainer.pl ./lib/string.c
> 
> Han will not be noisily to/cc'd (like a K: type would in this
> circumstance)
> 
> Signed-off-by: Justin Stitt <justinstitt@google.com>
> ---
>  MAINTAINERS               |  5 +++++
>  scripts/get_maintainer.pl | 12 ++++++++++--
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b19995690904..94e431daa7c2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -59,6 +59,11 @@ Descriptions of section entries and preferred order
>  	      matches patches or files that contain one or more of the words
>  	      printk, pr_info or pr_err
>  	   One regex pattern per line.  Multiple K: lines acceptable.
> +  D: *Diff content regex* (perl extended) pattern match that applies only to
> +     patches and not entire files (e.g. when using the get_maintainers.pl
> +     script).
> +     Usage same as K:.
> +
>  
>  Maintainers List
>  ----------------
> diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> index ab123b498fd9..a3e697926ddd 100755
> --- a/scripts/get_maintainer.pl
> +++ b/scripts/get_maintainer.pl
> @@ -342,6 +342,7 @@ if ($tree && !top_of_kernel_tree($lk_path)) {
>  
>  my @typevalue = ();
>  my %keyword_hash;
> +my %patch_keyword_hash;
>  my @mfiles = ();
>  my @self_test_info = ();
>  
> @@ -369,8 +370,10 @@ sub read_maintainer_file {
>  		    $value =~ s@([^/])$@$1/@;
>  		}
>  	    } elsif ($type eq "K") {
> -		$keyword_hash{@typevalue} = $value;
> -	    }
> +          $keyword_hash{@typevalue} = $value;
> +	    } elsif ($type eq "D") {
> +          $patch_keyword_hash{@typevalue} = $value;
> +      }
>  	    push(@typevalue, "$type:$value");
>  	} elsif (!(/^\s*$/ || /^\s*\#/)) {
>  	    push(@typevalue, $line);
> @@ -607,6 +610,11 @@ foreach my $file (@ARGV) {
>  			push(@keyword_tvi, $line);
>  		    }
>  		}
> +    foreach my $line(keys %patch_keyword_hash) {
> +      if ($patch_line =~ m/${patch_prefix}$patch_keyword_hash{$line}/x) {
> +        push(@keyword_tvi, $line);
> +      }
> +    }
>  	    }
>  	}
>  	close($patch);
> 


My opinion: Nack.

I think something like this would be better
as it avoids duplication of K and D content.
---
 scripts/get_maintainer.pl | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index ab123b498fd9..07e7d744cadb 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -57,6 +57,7 @@ my $subsystem = 0;
 my $status = 0;
 my $letters = "";
 my $keywords = 1;
+my $keywords_in_file = 0;
 my $sections = 0;
 my $email_file_emails = 0;
 my $from_filename = 0;
@@ -272,6 +273,7 @@ if (!GetOptions(
 		'letters=s' => \$letters,
 		'pattern-depth=i' => \$pattern_depth,
 		'k|keywords!' => \$keywords,
+		'kf|keywords-in-file!' => \$keywords_in_file,
 		'sections!' => \$sections,
 		'fe|file-emails!' => \$email_file_emails,
 		'f|file' => \$from_filename,
@@ -318,6 +320,7 @@ if ($sections || $letters ne "") {
     $subsystem = 0;
     $web = 0;
     $keywords = 0;
+    $keywords_in_file = 0;
     $interactive = 0;
 } else {
     my $selections = $email + $scm + $status + $subsystem + $web;
@@ -548,16 +551,14 @@ foreach my $file (@ARGV) {
 	$file =~ s/^\Q${cur_path}\E//;	#strip any absolute path
 	$file =~ s/^\Q${lk_path}\E//;	#or the path to the lk tree
 	push(@files, $file);
-	if ($file ne "MAINTAINERS" && -f $file && $keywords) {
+	if ($file ne "MAINTAINERS" && -f $file && $keywords && $keywords_in_file) {
 	    open(my $f, '<', $file)
 		or die "$P: Can't open $file: $!\n";
 	    my $text = do { local($/) ; <$f> };
 	    close($f);
-	    if ($keywords) {
-		foreach my $line (keys %keyword_hash) {
-		    if ($text =~ m/$keyword_hash{$line}/x) {
-			push(@keyword_tvi, $line);
-		    }
+	    foreach my $line (keys %keyword_hash) {
+		if ($text =~ m/$keyword_hash{$line}/x) {
+		    push(@keyword_tvi, $line);
 		}
 	    }
 	}
@@ -1076,6 +1077,7 @@ Output type options:
 Other options:
   --pattern-depth => Number of pattern directory traversals (default: 0 (all))
   --keywords => scan patch for keywords (default: $keywords)
+  --keywords-in-file => scan file for keywords (default: $keywords_in_file)
   --sections => print all of the subsystem sections with pattern matches
   --letters => print all matching 'letter' types from all matching sections
   --mailmap => use .mailmap file (default: $email_use_mailmap)
@@ -1086,7 +1088,7 @@ Other options:
 
 Default options:
   [--email --tree --nogit --git-fallback --m --r --n --l --multiline
-   --pattern-depth=0 --remove-duplicates --rolestats]
+   --pattern-depth=0 --remove-duplicates --rolestats --keywords]
 
 Notes:
   Using "-f directory" may give unexpected results:
Re: [PATCH v2 1/2] get_maintainer: add patch-only keyword-matching
Posted by Justin Stitt 2 years, 2 months ago
On Thu, Sep 28, 2023 at 1:46 PM Joe Perches <joe@perches.com> wrote:
>
> On Thu, 2023-09-28 at 04:23 +0000, Justin Stitt wrote:
> > Add the "D:" type which behaves the same as "K:" but will only match
> > content present in a patch file.
> >
> > To illustrate:
> >
> > Imagine this entry in MAINTAINERS:
> >
> > NEW REPUBLIC
> > M: Han Solo <hansolo@rebelalliance.co>
> > W: https://www.jointheresistance.org
> > D: \bstrncpy\b
> >
> > Our maintainer, Han, will only be added to the recipients if a patch
> > file is passed to get_maintainer (like what b4 does):
> > $ ./scripts/get_maintainer.pl 0004-some-change.patch
> >
> > If the above patch has a `strncpy` present in the subject, commit log or
> > diff then Han will be to/cc'd.
> >
> > However, in the event of a file from the tree given like:
> > $ ./scripts/get_maintainer.pl ./lib/string.c
> >
> > Han will not be noisily to/cc'd (like a K: type would in this
> > circumstance)
> >
> > Signed-off-by: Justin Stitt <justinstitt@google.com>
> > ---
> >  MAINTAINERS               |  5 +++++
> >  scripts/get_maintainer.pl | 12 ++++++++++--
> >  2 files changed, 15 insertions(+), 2 deletions(-)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index b19995690904..94e431daa7c2 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -59,6 +59,11 @@ Descriptions of section entries and preferred order
> >             matches patches or files that contain one or more of the words
> >             printk, pr_info or pr_err
> >          One regex pattern per line.  Multiple K: lines acceptable.
> > +  D: *Diff content regex* (perl extended) pattern match that applies only to
> > +     patches and not entire files (e.g. when using the get_maintainers.pl
> > +     script).
> > +     Usage same as K:.
> > +
> >
> >  Maintainers List
> >  ----------------
> > diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> > index ab123b498fd9..a3e697926ddd 100755
> > --- a/scripts/get_maintainer.pl
> > +++ b/scripts/get_maintainer.pl
> > @@ -342,6 +342,7 @@ if ($tree && !top_of_kernel_tree($lk_path)) {
> >
> >  my @typevalue = ();
> >  my %keyword_hash;
> > +my %patch_keyword_hash;
> >  my @mfiles = ();
> >  my @self_test_info = ();
> >
> > @@ -369,8 +370,10 @@ sub read_maintainer_file {
> >                   $value =~ s@([^/])$@$1/@;
> >               }
> >           } elsif ($type eq "K") {
> > -             $keyword_hash{@typevalue} = $value;
> > -         }
> > +          $keyword_hash{@typevalue} = $value;
> > +         } elsif ($type eq "D") {
> > +          $patch_keyword_hash{@typevalue} = $value;
> > +      }
> >           push(@typevalue, "$type:$value");
> >       } elsif (!(/^\s*$/ || /^\s*\#/)) {
> >           push(@typevalue, $line);
> > @@ -607,6 +610,11 @@ foreach my $file (@ARGV) {
> >                       push(@keyword_tvi, $line);
> >                   }
> >               }
> > +    foreach my $line(keys %patch_keyword_hash) {
> > +      if ($patch_line =~ m/${patch_prefix}$patch_keyword_hash{$line}/x) {
> > +        push(@keyword_tvi, $line);
> > +      }
> > +    }
> >           }
> >       }
> >       close($patch);
> >
>
>
> My opinion: Nack.
>
> I think something like this would be better
> as it avoids duplication of K and D content.

If I understand correctly, this puts the onus on the get_maintainer users
to select the right argument whereas adding "D:", albeit with some
duplicate code, allows maintainers themselves to decide in exactly
which context they receive mail.

Adding a command line flag means sometimes K: is treated one
way and sometimes treated a different way depending on
the specific incantation.

This could all be a moot point, though, as I believe Konstantin
is trying to separate out the whole idea of a patch-sender needing
to specify the recipients of a patch. Instead some middleware would
capture mail and delegate automatically based on some queries
set up by maintainers.

Exciting idea, I wonder what the progress is on that?

Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org>

[1]: https://lore.kernel.org/all/20230726-june-mocha-ad6809@meerkat/

> ---
>  scripts/get_maintainer.pl | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> index ab123b498fd9..07e7d744cadb 100755
> --- a/scripts/get_maintainer.pl
> +++ b/scripts/get_maintainer.pl
> @@ -57,6 +57,7 @@ my $subsystem = 0;
>  my $status = 0;
>  my $letters = "";
>  my $keywords = 1;
> +my $keywords_in_file = 0;
>  my $sections = 0;
>  my $email_file_emails = 0;
>  my $from_filename = 0;
> @@ -272,6 +273,7 @@ if (!GetOptions(
>                 'letters=s' => \$letters,
>                 'pattern-depth=i' => \$pattern_depth,
>                 'k|keywords!' => \$keywords,
> +               'kf|keywords-in-file!' => \$keywords_in_file,
>                 'sections!' => \$sections,
>                 'fe|file-emails!' => \$email_file_emails,
>                 'f|file' => \$from_filename,
> @@ -318,6 +320,7 @@ if ($sections || $letters ne "") {
>      $subsystem = 0;
>      $web = 0;
>      $keywords = 0;
> +    $keywords_in_file = 0;
>      $interactive = 0;
>  } else {
>      my $selections = $email + $scm + $status + $subsystem + $web;
> @@ -548,16 +551,14 @@ foreach my $file (@ARGV) {
>         $file =~ s/^\Q${cur_path}\E//;  #strip any absolute path
>         $file =~ s/^\Q${lk_path}\E//;   #or the path to the lk tree
>         push(@files, $file);
> -       if ($file ne "MAINTAINERS" && -f $file && $keywords) {
> +       if ($file ne "MAINTAINERS" && -f $file && $keywords && $keywords_in_file) {
>             open(my $f, '<', $file)
>                 or die "$P: Can't open $file: $!\n";
>             my $text = do { local($/) ; <$f> };
>             close($f);
> -           if ($keywords) {
> -               foreach my $line (keys %keyword_hash) {
> -                   if ($text =~ m/$keyword_hash{$line}/x) {
> -                       push(@keyword_tvi, $line);
> -                   }
> +           foreach my $line (keys %keyword_hash) {
> +               if ($text =~ m/$keyword_hash{$line}/x) {
> +                   push(@keyword_tvi, $line);
>                 }
>             }
>         }
> @@ -1076,6 +1077,7 @@ Output type options:
>  Other options:
>    --pattern-depth => Number of pattern directory traversals (default: 0 (all))
>    --keywords => scan patch for keywords (default: $keywords)
> +  --keywords-in-file => scan file for keywords (default: $keywords_in_file)
>    --sections => print all of the subsystem sections with pattern matches
>    --letters => print all matching 'letter' types from all matching sections
>    --mailmap => use .mailmap file (default: $email_use_mailmap)
> @@ -1086,7 +1088,7 @@ Other options:
>
>  Default options:
>    [--email --tree --nogit --git-fallback --m --r --n --l --multiline
> -   --pattern-depth=0 --remove-duplicates --rolestats]
> +   --pattern-depth=0 --remove-duplicates --rolestats --keywords]
>
>  Notes:
>    Using "-f directory" may give unexpected results:
>
Re: [PATCH v2 1/2] get_maintainer: add patch-only keyword-matching
Posted by Joe Perches 2 years, 2 months ago
On Thu, 2023-09-28 at 14:03 +0900, Justin Stitt wrote:
> On Thu, Sep 28, 2023 at 1:46 PM Joe Perches <joe@perches.com> wrote:
> > 
> > On Thu, 2023-09-28 at 04:23 +0000, Justin Stitt wrote:
> > > Add the "D:" type which behaves the same as "K:" but will only match
> > > content present in a patch file.
[]
> > > My opinion: Nack.
> > 
> > I think something like this would be better
> > as it avoids duplication of K and D content.
> 
> If I understand correctly, this puts the onus on the get_maintainer users
> to select the right argument whereas adding "D:", albeit with some
> duplicate code, allows maintainers themselves to decide in exactly
> which context they receive mail.

Maybe, but I doubt it'll be significantly different.


> This could all be a moot point, though, as I believe Konstantin
> is trying to separate out the whole idea of a patch-sender needing
> to specify the recipients of a patch.

As I understand it, by using get_maintainer.

Re: [PATCH v2 1/2] get_maintainer: add patch-only keyword-matching
Posted by Konstantin Ryabitsev 2 years, 2 months ago
On Wed, Sep 27, 2023 at 10:08:33PM -0700, Joe Perches wrote:
> > This could all be a moot point, though, as I believe Konstantin
> > is trying to separate out the whole idea of a patch-sender needing
> > to specify the recipients of a patch.
> 
> As I understand it, by using get_maintainer.

Correct, we will ultimately still defer to get_maintainer to figure out who
needs to be added.

-K