[PATCH v4] checkpatch: warn for non-standard fixes tag style

Niklas Söderlund posted 1 patch 3 years, 6 months ago
There is a newer version of this series
Documentation/dev-tools/checkpatch.rst |  8 +++++
scripts/checkpatch.pl                  | 41 ++++++++++++++++++++++++++
2 files changed, 49 insertions(+)
[PATCH v4] checkpatch: warn for non-standard fixes tag style
Posted by Niklas Söderlund 3 years, 6 months ago
Add a warning for fixes tags that does not fall in line with the
standards specified by the community.

Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
---
* Changes since v3
- Add test that title in tag match title of commit referenced by sha1.

* Changes since v2
- Change the pattern to match on 'fixes:?' to catch more malformed
  fixes tags.

* Changes since v1
- Update the documentation wording and add mention one cause of the
  message can be that email program splits the tag over multiple lines.
---
 Documentation/dev-tools/checkpatch.rst |  8 +++++
 scripts/checkpatch.pl                  | 41 ++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/Documentation/dev-tools/checkpatch.rst b/Documentation/dev-tools/checkpatch.rst
index b52452bc2963..8c8456a3bd18 100644
--- a/Documentation/dev-tools/checkpatch.rst
+++ b/Documentation/dev-tools/checkpatch.rst
@@ -612,6 +612,14 @@ Commit message
 
     See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
 
+  **BAD_FIXES_TAG**
+    The Fixes: tag is malformed or does not fall in line with the standards
+    specified by the community. This can occur if the tag have been split into
+    multiple lines (e.g., when pasted in email program with word wrapping
+    enabled).
+
+    See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
+
 
 Comparison style
 ----------------
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 79e759aac543..ac7ae2e4a1d8 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3140,6 +3140,47 @@ sub process {
 			}
 		}
 
+# Check Fixes: styles is correct
+		if (!$in_header_lines && $line =~ /^fixes:?/i) {
+			my $orig_commit = "";
+			my $id = "0123456789ab";
+			my $title = "commit title";
+			my $tag_case = 1;
+			my $tag_space = 1;
+			my $id_length = 1;
+			my $id_case = 1;
+			my $title_has_quotes = 0;
+
+			if ($line =~ /(fixes:?)\s+([0-9a-f]{5,})\s+($balanced_parens)/i) {
+				my $tag = $1;
+				$orig_commit = $2;
+				$title = $3;
+
+				$tag_case = 0 if $tag eq "Fixes:";
+				$tag_space = 0 if ($line =~ /^fixes:? [0-9a-f]{5,} ($balanced_parens)/i);
+
+				$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12}$/i);
+				$id_case = 0 if ($orig_commit !~ /[A-F]/);
+
+				# Always strip leading/trailing parens then double quotes if existing
+				$title = substr($title, 1, -1);
+				if ($title =~ /^".*"$/) {
+					$title = substr($title, 1, -1);
+					$title_has_quotes = 1;
+				}
+			}
+
+			my ($cid, $ctitle) = git_commit_info($orig_commit, $id,
+							     $title);
+
+			if ($ctitle ne $title || $tag_case || $tag_space ||
+			    $id_length || $id_case || !$title_has_quotes) {
+				WARN("BAD_FIXES_TAG",
+				     "Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr);
+
+			}
+		}
+
 # Check email subject for common tools that don't need to be mentioned
 		if ($in_header_lines &&
 		    $line =~ /^Subject:.*\b(?:checkpatch|sparse|smatch)\b[^:]/i) {
-- 
2.37.3

Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
Posted by Philippe Schenker 3 years, 6 months ago
Hi Niklas, 

Thanks for adding me to cc. I will also add Stephen, as he also sent
some comments on my submission the exact same problem. I'm supportive of
your code as it has the nice advantage of suggesting the right format of
the tag if it might be wrong. However it seems lot of stuff is slightly
duplicated and lots of lines could be left away simplifying it greatly.
I don't want to hold anything up anyway so I'm fine with it, but will
stillleave some comments of things I think should be improved.

Please see my comments inline.


On Thu, 2022-09-08 at 18:44 +0200, Niklas Söderlund wrote:
> Add a warning for fixes tags that does not fall in line with the
> standards specified by the community.
> 
> Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> Reviewed-by: Simon Horman <simon.horman@corigine.com>
> Reviewed-by: Louis Peens <louis.peens@corigine.com>
> ---
> * Changes since v3
> - Add test that title in tag match title of commit referenced by sha1.
> 
> * Changes since v2
> - Change the pattern to match on 'fixes:?' to catch more malformed
>   fixes tags.
> 
> * Changes since v1
> - Update the documentation wording and add mention one cause of the
>   message can be that email program splits the tag over multiple
> lines.
> ---
>  Documentation/dev-tools/checkpatch.rst |  8 +++++
>  scripts/checkpatch.pl                  | 41
> ++++++++++++++++++++++++++
>  2 files changed, 49 insertions(+)
> 
> diff --git a/Documentation/dev-tools/checkpatch.rst
> b/Documentation/dev-tools/checkpatch.rst
> index b52452bc2963..8c8456a3bd18 100644
> --- a/Documentation/dev-tools/checkpatch.rst
> +++ b/Documentation/dev-tools/checkpatch.rst
> @@ -612,6 +612,14 @@ Commit message
>  
>      See:
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
>  
> +  **BAD_FIXES_TAG**
> +    The Fixes: tag is malformed or does not fall in line with the
> standards
> +    specified by the community. This can occur if the tag have been
> split into
> +    multiple lines (e.g., when pasted in email program with word
> wrapping
> +    enabled).
> +
> +    See:
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
> +
>  
>  Comparison style
>  ----------------
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 79e759aac543..ac7ae2e4a1d8 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -3140,6 +3140,47 @@ sub process {
>                         }
>                 }
>  
> +# Check Fixes: styles is correct
> +               if (!$in_header_lines && $line =~ /^fixes:?/i) {

I would check all lines that start with fixes, even if there is
whitespace in front (and then failing later on...)

if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {


> +                       my $orig_commit = "";
> +                       my $id = "0123456789ab";
> +                       my $title = "commit title";
> +                       my $tag_case = 1;
> +                       my $tag_space = 1;
> +                       my $id_length = 1;
> +                       my $id_case = 1;
> +                       my $title_has_quotes = 0;
> +
> +                       if ($line =~ /(fixes:?)\s+([0-9a-
> f]{5,})\s+($balanced_parens)/i) {

If we check also the fixes: lines that begin with whitespace this would
be a good space to check that we do not want any whitespace in front of
Fixes: tag.

/(\s*fixes:?)\s+([0-9a-f]{5,})\s+($balanced_parens)/i) {

> +                               my $tag = $1;
> +                               $orig_commit = $2;
> +                               $title = $3;
> +
> +                               $tag_case = 0 if $tag eq "Fixes:";
> +                               $tag_space = 0 if ($line =~ /^fixes:?
> [0-9a-f]{5,} ($balanced_parens)/i);
> +
> +                               $id_length = 0 if ($orig_commit =~
> /^[0-9a-f]{12}$/i);

I suggest we borrow the patter that is also used in "Check for git id
commit length and improperly formed commit descriptions". This has the
reason as checking strictly for 12 chars is at the moment right but as
linux grows 13 chars will eventually come.

$id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,40}$/i);
 

> +                               $id_case = 0 if ($orig_commit !~ /[A-
> F]/);
> +
> +                               # Always strip leading/trailing parens
> then double quotes if existing
> +                               $title = substr($title, 1, -1);
> +                               if ($title =~ /^".*"$/) {
> +                                       $title = substr($title, 1, -
> 1);
> +                                       $title_has_quotes = 1;
> +                               }
> +                       }
> +
> +                       my ($cid, $ctitle) =
> git_commit_info($orig_commit, $id,
> +                                                            $title);
> +
> +                       if ($ctitle ne $title || $tag_case ||
> $tag_space ||
> +                           $id_length || $id_case ||
> !$title_has_quotes) {
> +                               WARN("BAD_FIXES_TAG",
> +                                    "Please use correct Fixes: style
> 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid
> (\"$ctitle\")'\n" . $herecurr);
> +
> +                       }
> +               }
> +
>  # Check email subject for common tools that don't need to be
> mentioned
>                 if ($in_header_lines &&
>                     $line =~
> /^Subject:.*\b(?:checkpatch|sparse|smatch)\b[^:]/i) {

Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
Posted by niklas.soderlund@corigine.com 3 years, 6 months ago
Hi Philippe,

On 2022-09-08 17:49:14 +0000, Philippe Schenker wrote:
> Hi Niklas, 
> 
> Thanks for adding me to cc. I will also add Stephen, as he also sent
> some comments on my submission the exact same problem. I'm supportive of
> your code as it has the nice advantage of suggesting the right format of
> the tag if it might be wrong. However it seems lot of stuff is slightly
> duplicated and lots of lines could be left away simplifying it greatly.
> I don't want to hold anything up anyway so I'm fine with it, but will
> stillleave some comments of things I think should be improved.

I agree the LoC could be reduced, I try to mimic the style from the 
"Check for git id commit length and improperly formed commit 
descriptions" check. As there is some overlap maybe one day someone 
cleverer then me can figure out how to share code between them.

> > +# Check Fixes: styles is correct
> > +               if (!$in_header_lines && $line =~ /^fixes:?/i) {
> 
> I would check all lines that start with fixes, even if there is
> whitespace in front (and then failing later on...)
> 
> if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {

Good point, will do so in v5.

> If we check also the fixes: lines that begin with whitespace this 
> would be a good space to check that we do not want any whitespace in 
> front of Fixes: tag.
> 
> /(\s*fixes:?)\s+([0-9a-f]{5,})\s+($balanced_parens)/i) {

Ditto.

> > +                               $id_length = 0 if ($orig_commit =~
> > /^[0-9a-f]{12}$/i);
> 
> I suggest we borrow the patter that is also used in "Check for git id
> commit length and improperly formed commit descriptions". This has the
> reason as checking strictly for 12 chars is at the moment right but as
> linux grows 13 chars will eventually come.
> 
> $id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,40}$/i);

This one bothers me a bit. I did do that before I sent out v1 but 
changed my mind. The reason being that the documentation asks explicitly 
for 12 chars [1]. I have no preference on keeping it strictly 12 or 
allowing anything in the 12 to 40 range, but i do think the check should 
reflect whats in the documentation. If we change this maybe we also need 
to update the documentation?

One argument to keep it strict is that when Linux the need 13 or more 
characters the documentation will need to be updated and it is natural 
that the script to check that the style documented is followed is 
updated at the same time.

1.  https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes

-- 
Kind Regards,
Niklas Söderlund
Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
Posted by Philippe Schenker 3 years, 6 months ago
On Fri, 2022-09-09 at 09:40 +0200, niklas.soderlund@corigine.com wrote:
> Hi Philippe,
> 
> On 2022-09-08 17:49:14 +0000, Philippe Schenker wrote:
> > Hi Niklas, 
> > 
> > Thanks for adding me to cc. I will also add Stephen, as he also sent
> > some comments on my submission the exact same problem. I'm
> > supportive of
> > your code as it has the nice advantage of suggesting the right
> > format of
> > the tag if it might be wrong. However it seems lot of stuff is
> > slightly
> > duplicated and lots of lines could be left away simplifying it
> > greatly.
> > I don't want to hold anything up anyway so I'm fine with it, but
> > will
> > stillleave some comments of things I think should be improved.
> 
> I agree the LoC could be reduced, I try to mimic the style from the 
> "Check for git id commit length and improperly formed commit 
> descriptions" check. As there is some overlap maybe one day someone 
> cleverer then me can figure out how to share code between them.
> 
> > > +# Check Fixes: styles is correct
> > > +               if (!$in_header_lines && $line =~ /^fixes:?/i) {
> > 
> > I would check all lines that start with fixes, even if there is
> > whitespace in front (and then failing later on...)
> > 
> > if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {
> 
> Good point, will do so in v5.
> 
> > If we check also the fixes: lines that begin with whitespace this 
> > would be a good space to check that we do not want any whitespace in
> > front of Fixes: tag.
> > 
> > /(\s*fixes:?)\s+([0-9a-f]{5,})\s+($balanced_parens)/i) {
> 
> Ditto.
> 
> > > +                               $id_length = 0 if ($orig_commit =~
> > > /^[0-9a-f]{12}$/i);
> > 
> > I suggest we borrow the patter that is also used in "Check for git
> > id
> > commit length and improperly formed commit descriptions". This has
> > the
> > reason as checking strictly for 12 chars is at the moment right but
> > as
> > linux grows 13 chars will eventually come.
> > 
> > $id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,40}$/i);
> 
> This one bothers me a bit. I did do that before I sent out v1 but 
> changed my mind. The reason being that the documentation asks
> explicitly 
> for 12 chars [1]. I have no preference on keeping it strictly 12 or 
> allowing anything in the 12 to 40 range, but i do think the check
> should 
> reflect whats in the documentation. If we change this maybe we also
> need 
> to update the documentation?
> 
> One argument to keep it strict is that when Linux the need 13 or more 
> characters the documentation will need to be updated and it is natural
> that the script to check that the style documented is followed is 
> updated at the same time.

In the end I'm fine with both variants. Up to you, I just would not have
made the whole check too strict as until now there was no check at all.
But the fact we are now even checking one space character it is also
fine for me that we limit this to strictly 12 characters.

> 
> 1. 
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
> 

Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
Posted by Joe Perches 3 years, 6 months ago
On Fri, 2022-09-09 at 09:40 +0200, niklas.soderlund@corigine.com wrote:
> On 2022-09-08 17:49:14 +0000, Philippe Schenker wrote:
> > Thanks for adding me to cc. I will also add Stephen, as he also sent
> > some comments on my submission the exact same problem. I'm supportive of
> > your code as it has the nice advantage of suggesting the right format of
> > the tag if it might be wrong. However it seems lot of stuff is slightly
> > duplicated and lots of lines could be left away simplifying it greatly.

It's not very possible to reduce the line count here.
I mentioned the same thing in my first reply.

> > > +# Check Fixes: styles is correct
> > > +               if (!$in_header_lines && $line =~ /^fixes:?/i) {
> > 
> > I would check all lines that start with fixes, even if there is
> > whitespace in front (and then failing later on...)
> > 
> > if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {

I think that's a poor idea.

You should really review git history for lines that start with fixes
and look at the number of false positives that would give.

Try this grep:

$ git log -100000 --no-merges --grep="^\s*fixes" -i --format=email -P | \
  grep -P -i "^\s*fixes)" | \
  grep -P -v "^Fixes: [0-9a-f]{12,}'
[...]

That is a greater than 10% false positive rate.

I think it's better to make sure that there is a likely SHA1 of some
minimum length after the fixes line.

And a relatively common defect is to have the word "commit" after fixes.

e.g.:

Fixes commit 554c0a3abf216 ("staging: Add rtl8723bs sdio wifi driver").
Fixes commit a2c60d42d97c ("Add files for new driver - part 16").

So maybe:

	if (!$in_header_lines &&
	    $line =~ /^\s*fixes:?\s*(?:commit\s*)?[0-9a-f]{5,}\b/i)
Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
Posted by niklas.soderlund@corigine.com 3 years, 6 months ago
Hi Joe,

Thanks for your feedback.

On 2022-09-09 10:57:10 -0700, Joe Perches wrote:
> I think it's better to make sure that there is a likely SHA1 of some
> minimum length after the fixes line.
> 
> And a relatively common defect is to have the word "commit" after fixes.
> 
> e.g.:
> 
> Fixes commit 554c0a3abf216 ("staging: Add rtl8723bs sdio wifi driver").
> Fixes commit a2c60d42d97c ("Add files for new driver - part 16").
> 
> So maybe:
> 
> 	if (!$in_header_lines &&
> 	    $line =~ /^\s*fixes:?\s*(?:commit\s*)?[0-9a-f]{5,}\b/i)

I'm sorry I just sent out v6 and missed this comment, which I think is a 
great idea. I will let v6 stew for a day or two in case there is other 
feedback and then spin a v7 with this addressed.

-- 
Kind Regards,
Niklas Söderlund
Re: [PATCH v4] checkpatch: warn for non-standard fixes tag style
Posted by Joe Perches 3 years, 6 months ago
On Fri, 2022-09-09 at 10:57 -0700, Joe Perches wrote:
> On Fri, 2022-09-09 at 09:40 +0200, niklas.soderlund@corigine.com wrote:
> > On 2022-09-08 17:49:14 +0000, Philippe Schenker wrote:
[]
> > > I would check all lines that start with fixes, even if there is
> > > whitespace in front (and then failing later on...)
> > > 
> > > if (!$in_header_lines && $line =~ /^\s*fixes:?/i) {
> 
> I think that's a poor idea.
> 
> You should really review git history for lines that start with fixes
> and look at the number of false positives that would give.
> 
> Try this grep:
> 
> $ git log -100000 --no-merges --grep="^\s*fixes" -i --format=email -P | \
>   grep -P -i "^\s*fixes)" | \
>   grep -P -v "^Fixes: [0-9a-f]{12,}'
> [...]
> 
> That is a greater than 10% false positive rate.

One day I'll be better at typing grep commands in the editor...

Try:

$ git log -100000 --no-merges --grep="^\s*fixes" -i --format=email -P | \
  grep -P -i "^\s*fixes" | \
  grep -P -v "^Fixes: [0-9a-f]{12,}"