From: Sasha Levin
To: workflows@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Sasha Levin
Subject: [PATCH] verify_pull_requests: initial pull request sanitizer
Date: Sat, 12 Apr 2025 08:29:11 -0400
Message-Id: <20250412122911.327134-1-sashal@kernel.org>

I'm continuing to evolve my work on the linus-next integration branch,
and this seemed like another useful tool.

Verify that either the sender of the pull request or one of the signers
of each commit is listed as a maintainer for the subsystem the patches
are destined for.

This provides us with two things:

1. An audit of the correctness of the MAINTAINERS file, and an
opportunity to correct it and add missing "tribal knowledge" (folks who
are the de-facto maintainers, but are not listed in MAINTAINERS).

2. Verification that inadvertent changes are not included in a pull
request.

Below is example output from the tool. Note that pull request #3
triggers a warning because Jens isn't listed as a maintainer for
drivers/nvme/ even though he is sending pull requests for it.

$ ./scripts/verify_pull_requests.sh --days 1
Number of pull requests in the last 1 day(s): 5
Processing pull requests...
Pull request #1: http://lore.kernel.org/all/CAH2r5mt3CCXVEwdsrqPe1VE+xebPSh2k4Wg5Zqqp_OCm+m7cPQ@mail.gmail.com/
  Sender: Steve French
  Repository: git://git.samba.org/sfrench/cifs-2.6.git
  Branch/Tag: tags/v6.15-rc1-smb3-client-fixes
  Fetching: git fetch "git://git.samba.org/sfrench/cifs-2.6.git" "tags/v6.15-rc1-smb3-client-fixes"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 10 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits
------------------------
Pull request #2: http://lore.kernel.org/all/20250411181650.GA372618@bhelgaas/
  Sender: Bjorn Helgaas
  Repository: git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git
  Branch/Tag: tags/pci-v6.15-fixes-1
  Fetching: git fetch "git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git" "tags/pci-v6.15-fixes-1"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 1 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits
------------------------
Pull request #3: http://lore.kernel.org/all/8d3e5d98-09b1-4274-af25-124c91342b7a@kernel.dk/
  Sender: Jens Axboe
  Repository: git://git.kernel.dk/linux.git
  Branch/Tag: tags/block-6.15-20250411
  Fetching: git fetch "git://git.kernel.dk/linux.git" "tags/block-6.15-20250411"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 13 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits
  ⚠️ Warning: Sender is NOT listed as maintainer for these commits (but a signer is):
    - 70289ae5cac4d nvmet-fc: put ref when assoc->del_work is already scheduled
    - b0b26ad0e1943 nvmet-fc: take tgtport reference only once
    - 1a909565733ed nvmet-fc: update tgtport ref per assoc
    - 88517565b5929 nvmet-fc: inline nvmet_fc_free_hostport
    - aeaa0913a6994 nvmet-fc: inline nvmet_fc_delete_assoc
    - 72511b1dc4147 nvmet-fcloop: add ref counting to lport
    - f22c458f9495f nvmet-fcloop: replace kref with refcount
    - 2b5f0c5bc819a nvmet-fcloop: swap list_add_tail arguments
------------------------
Pull request #4: http://lore.kernel.org/all/Z_kntkZxksOfGwpt@8bytes.org/
  Sender: Joerg Roedel
  Repository: git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux.git
  Branch/Tag: tags/iommu-fixes-v6.15-rc1
  Fetching: git fetch "git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux.git" "tags/iommu-fixes-v6.15-rc1"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 9 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits
------------------------
Pull request #5: http://lore.kernel.org/all/CAJZ5v0iEn-Lyic6zxDehxF1HHfNfg11_S7COMsHnZeQ+TzZAsA@mail.gmail.com/
  Sender: "Rafael J. Wysocki"
  Repository: git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
  Branch/Tag: acpi-6.15-rc2
  Fetching: git fetch "git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git" "tags/acpi-6.15-rc2"
  Fetch: ✅ Successfully fetched
  Checking maintainer status for 3 commit(s)...
  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits

Signed-off-by: Sasha Levin
---
 scripts/verify_pull_requests.sh | 393 ++++++++++++++++++++++++++++++++
 1 file changed, 393 insertions(+)
 create mode 100755 scripts/verify_pull_requests.sh

diff --git a/scripts/verify_pull_requests.sh b/scripts/verify_pull_requests.sh
new file mode 100755
index 0000000000000..3dd6492a71d2f
--- /dev/null
+++ b/scripts/verify_pull_requests.sh
@@ -0,0 +1,393 @@
+#!/bin/bash
+#set -x
+
+# Default number of days to search
+days=1
+
+# Parse command line arguments
+while [ "$#" -gt 0 ]; do
+    case "$1" in
+        --days)
+            shift
+            if [[ "$1" =~ ^[0-9]+$ ]]; then
+                days="$1"
+            else
+                echo "Error: --days requires a numeric argument"
+                exit 1
+            fi
+            ;;
+        *)
+            echo "Unknown option: $1"
+            echo "Usage: $0 [--days N]"
+            exit 1
+            ;;
+    esac
+    shift
+done
+
+URL="https://lore.kernel.org/all/?q=s:%22GIT+PULL%22+AND+t:torvalds+AND+rt:${days}.day.ago...+AND+NOT+s:re:&x=A"
+
+temp_file=$(mktemp)
+curl -s "$URL" > "$temp_file"
+
+count=$(grep -c "" "$temp_file")
+echo "Number of pull requests in the last ${days} day(s): $count"
+
+# Extract message URLs and filter out query parameters and #related links
+message_urls=$(grep -o "http://lore.kernel.org/all/[^\"]*" "$temp_file" | grep -v "\\?" | grep -v "#related")
+
+echo "Processing pull requests..."
+
+count=0
+while read -r message_url; do
+    count=$((count + 1))
+    echo "Pull request #$count: $message_url"
+
+    message_content=$(mktemp)
+    curl -s -L "$message_url" > "$message_content"
+
+    email_content=$(cat "$message_content")
+
+    # Extract and clean sender information
+    from_line=$(echo "$email_content" | grep -o "From:.*" | head -1)
+    from_line=$(echo "$from_line" | sed 's/<//g' | sed 's/"/"/g' | sed 's/"/"/g')
+
+    if [[ "$from_line" =~ From:[[:space:]]+(.*)[[:space:]]+\<([^>]+)\> ]]; then
+        sender_name="${BASH_REMATCH[1]}"
+        sender_email="${BASH_REMATCH[2]}"
+        sender_name=$(echo "$sender_name" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+        sender_email=$(echo "$sender_email" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+        echo "  Sender: $sender_name <$sender_email>"
+    else
+        echo "  Sender: $(echo "$from_line" | sed 's/From: //')"
+    fi
+
+    found_repo=false
+    repo=""
+    branch=""
+
+    # Try extraction methods in order of preference
+
+    # 1. Extract repo from HTML links
+    html_href_lines=$(echo "$email_content" | grep -n '([[:space:]]*([[:alnum:]/_.-]+)) ]]; then
+                    branch="${BASH_REMATCH[2]}"
+                    echo "  Repository: $repo"
+                    echo "  Branch/Tag: $branch"
+                    found_repo=true
+                    break
+                else
+                    next_line_num=$((line_num + 1))
+                    next_line=$(echo "$email_content" | sed -n "${next_line_num}p")
+                    next_line=$(echo "$next_line" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$//')
+
+                    if [[ $next_line =~ ^[[:alnum:]/_.-]+$ ]]; then
+                        branch="$next_line"
+                        echo "  Repository: $repo"
+                        echo "  Branch/Tag: $branch"
+                        found_repo=true
+                        break
+                    elif [ "$found_repo" = false ]; then
+                        repo_no_branch=$repo
+                        line_no_branch=$line
+                    fi
+                fi
+            fi
+        done <<< "$html_href_lines"
+    fi
+
+    # 2. Extract repo from plain text if not found in HTML
+    if [ "$found_repo" = false ]; then
+        repo_lines=$(echo "$email_content" | grep -n -i "git://\|https://git\|git@" | grep -v "href=")
+
+        if [ -n "$repo_lines" ]; then
+            while read -r numbered_line; do
+                line_num=$(echo "$numbered_line" | cut -d: -f1)
+                line=$(echo "$numbered_line" | cut -d: -f2-)
+
+                if [[ $line =~ (git://|ssh://git|https://git|git@)[^[:space:]]+(/[^[:space:]]+)+ ]]; then
+                    repo="${BASH_REMATCH[0]}"
+                    repo=$(echo "$repo" | sed 's/[,.\\]$//' | sed 's/[[:space:]]*$//')
+
+                    if [[ $line =~ $repo[[:space:]]+([[:alnum:]/_.-]+) ]]; then
+                        branch="${BASH_REMATCH[1]}"
+                        echo "  Repository: $repo"
+                        echo "  Branch/Tag: $branch"
+                        found_repo=true
+                        break
+                    else
+                        next_line_num=$((line_num + 1))
+                        next_line=$(echo "$email_content" | sed -n "${next_line_num}p")
+                        next_line=$(echo "$next_line" | sed 's/^[[:space:]]*//' | sed 's/[[:space:]]*$//')
+
+                        if [[ $next_line =~ ^[[:alnum:]/_.-]+$ ]]; then
+                            branch="$next_line"
+                            echo "  Repository: $repo"
+                            echo "  Branch/Tag: $branch"
+                            found_repo=true
+                            break
+                        elif [ "$found_repo" = false ]; then
+                            repo_no_branch=$repo
+                            line_no_branch=$line
+                        fi
+                    fi
+                fi
+            done <<< "$repo_lines"
+        fi
+    fi
+
+    # 3. Try "available in the Git repository at:" section
+    if [ "$found_repo" = false ]; then
+        main_repo_section=$(echo "$email_content" | grep -A 10 "available in the Git repository at")
+
+        if [ -n "$main_repo_section" ]; then
+            if [[ $main_repo_section =~ href=\"([^\"]*gitlab[^\"]*|[^\"]*git[^\"]*|[^\"]*kernel\.org[^\"]*) ]]; then
+                repo="${BASH_REMATCH[1]}"
+                echo "  Repository: $repo"
+                found_repo=true
+
+                tags_line=$(echo "$main_repo_section" | grep -o "tags/[[:alnum:]/_.-]*" | head -1)
+                if [ -n "$tags_line" ]; then
+                    branch="$tags_line"
+                    echo "  Branch/Tag: $branch"
+                fi
+            fi
+        fi
+    fi
+
+    # 4. Use repo without branch if that's all we found
+    if [ "$found_repo" = false ] && [ -n "${repo_no_branch:-}" ]; then
+        repo="$repo_no_branch"
+        echo "  Repository: $repo"
+        echo "  Context: $line_no_branch"
+        found_repo=true
+    fi
+
+    if [ "$found_repo" = false ]; then
+        echo "  No repository URL found in this pull request."
+    else
+        # Convert ssh URLs to git URLs for verification
+        verification_repo="$repo"
+
+        # Handle different git URL formats for kernel.org
+        if [[ "$verification_repo" =~ ^ssh://git@gitolite\.kernel\.org(.*) ]]; then
+            verification_repo="git://git.kernel.org${BASH_REMATCH[1]}"
+            echo "  Using git URL for verification: $verification_repo"
+        fi
+
+        if [[ "$verification_repo" =~ ^git@gitolite\.kernel\.org:(.*) ]]; then
+            verification_repo="git://git.kernel.org/${BASH_REMATCH[1]}"
+            echo "  Using git URL for verification: $verification_repo"
+        fi
+
+        if [ -n "$verification_repo" ] && [ -n "$branch" ]; then
+            # Try fetching, first with tags/ prefix if needed
+            fetch_ref="$branch"
+            if [[ ! "$branch" =~ ^(refs/|tags/) ]] && [[ ! "$branch" =~ ^remotes/ ]]; then
+                fetch_ref="tags/$branch"
+            fi
+
+            echo "  Fetching: git fetch \"$verification_repo\" \"$fetch_ref\""
+            if git fetch "$verification_repo" "$fetch_ref" 2>/dev/null; then
+                echo "  Fetch: ✅ Successfully fetched"
+
+                # Check if there are any commits to verify
+                commit_hashes=$(git rev-list --no-merges origin/master..FETCH_HEAD 2>/dev/null)
+
+                if [ -z "$commit_hashes" ]; then
+                    echo "  ℹ️ No new commits found. Pull request likely already merged."
+                else
+                    total_commits=$(echo "$commit_hashes" | wc -l)
+                    echo "  Checking maintainer status for $total_commits commit(s)..."
+
+                    # Array to store problematic commits
+                    problematic_commits=()
+                    # Array to store commits where sender is not maintainer but a signer is
+                    sender_not_maintainer_commits=()
+
+                    # Check each commit silently
+                    while read -r commit_hash; do
+                        [ -z "$commit_hash" ] && continue
+
+                        commit_msg=$(git log -1 --pretty=format:"%h %s" "$commit_hash")
+
+                        if [ -f "scripts/get_maintainer.pl" ]; then
+                            maintainers=$(git show "$commit_hash" | ./scripts/get_maintainer.pl)
+                            signoffs=$(git show -s --format=%b "$commit_hash" | grep -i "Signed-off-by:" | sed 's/^[[:space:]]*Signed-off-by:[[:space:]]*//')
+
+                            valid_maintainer=false
+                            sender_is_maintainer=false
+
+                            # Check if sender is a maintainer
+                            if echo "$maintainers" | grep -q "$sender_email" || echo "$maintainers" | grep -q "$sender_name"; then
+                                valid_maintainer=true
+                                sender_is_maintainer=true
+                            else
+                                # Check if any signoff person is a maintainer
+                                while read -r signoff; do
+                                    [ -z "$signoff" ] && continue
+
+                                    # Extract name and email from signoff
+                                    if [[ "$signoff" =~ (.*)[[:space:]]+\<([^>]+)\> ]]; then
+                                        signer_name="${BASH_REMATCH[1]}"
+                                        signer_email="${BASH_REMATCH[2]}"
+                                        signer_name=$(echo "$signer_name" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+                                        signer_email=$(echo "$signer_email" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+
+                                        if echo "$maintainers" | grep -q "$signer_email" || echo "$maintainers" | grep -q "$signer_name"; then
+                                            valid_maintainer=true
+                                            break
+                                        fi
+                                    fi
+                                done <<< "$signoffs"
+                            fi
+
+                            # Add to problematic commits if no valid maintainer found
+                            if [ "$valid_maintainer" = false ]; then
+                                problematic_commits+=("$commit_msg")
+                            # Track commits where sender is not a maintainer but a signer is
+                            elif [ "$sender_is_maintainer" = false ]; then
+                                sender_not_maintainer_commits+=("$commit_msg")
+                            fi
+                        fi
+                    done <<< "$commit_hashes"
+
+                    # Display results based on problematic commits
+                    if [ ${#problematic_commits[@]} -eq 0 ]; then
+                        echo "  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits"
+
+                        # Add warning if we found commits where sender is not a maintainer
+                        if [ ${#sender_not_maintainer_commits[@]} -gt 0 ]; then
+                            echo "  ⚠️ Warning: Sender is NOT listed as maintainer for these commits (but a signer is):"
+                            for commit in "${sender_not_maintainer_commits[@]}"; do
+                                echo "    - $commit"
+                            done
+                        fi
+                    else
+                        echo "  ❌ Maintainer verification: Neither sender nor any signers are listed as maintainers for these commits:"
+                        for commit in "${problematic_commits[@]}"; do
+                            echo "    - $commit"
+                        done
+                    fi
+                fi
+            else
+                # Try without tags/ prefix if the first attempt failed
+                if [[ "$fetch_ref" == tags/* ]]; then
+                    fetch_ref="${branch}"
+                    echo "  Fetching: git fetch \"$verification_repo\" \"$fetch_ref\""
+                    if git fetch "$verification_repo" "$fetch_ref" 2>/dev/null; then
+                        echo "  Fetch: ✅ Successfully fetched"
+
+                        # Check if there are any commits to verify
+                        commit_hashes=$(git rev-list --no-merges origin/master..FETCH_HEAD 2>/dev/null)
+
+                        if [ -z "$commit_hashes" ]; then
+                            echo "  ℹ️ No new commits found. Pull request likely already merged."
+                        else
+                            total_commits=$(echo "$commit_hashes" | wc -l)
+                            echo "  Checking maintainer status for $total_commits commit(s)..."
+
+                            # Array to store problematic commits
+                            problematic_commits=()
+                            # Array to store commits where sender is not maintainer but a signer is
+                            sender_not_maintainer_commits=()
+
+                            # Check each commit silently
+                            while read -r commit_hash; do
+                                [ -z "$commit_hash" ] && continue
+
+                                commit_msg=$(git log -1 --pretty=format:"%h %s" "$commit_hash")
+
+                                if [ -f "scripts/get_maintainer.pl" ]; then
+                                    maintainers=$(git show "$commit_hash" | ./scripts/get_maintainer.pl)
+                                    signoffs=$(git show -s --format=%b "$commit_hash" | grep -i "Signed-off-by:" | sed 's/^[[:space:]]*Signed-off-by:[[:space:]]*//')
+
+                                    valid_maintainer=false
+                                    sender_is_maintainer=false
+
+                                    # Check if sender is a maintainer
+                                    if echo "$maintainers" | grep -q "$sender_email" || echo "$maintainers" | grep -q "$sender_name"; then
+                                        valid_maintainer=true
+                                        sender_is_maintainer=true
+                                    else
+                                        # Check if any signoff person is a maintainer
+                                        while read -r signoff; do
+                                            [ -z "$signoff" ] && continue
+
+                                            # Extract name and email from signoff
+                                            if [[ "$signoff" =~ (.*)[[:space:]]+\<([^>]+)\> ]]; then
+                                                signer_name="${BASH_REMATCH[1]}"
+                                                signer_email="${BASH_REMATCH[2]}"
+                                                signer_name=$(echo "$signer_name" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+                                                signer_email=$(echo "$signer_email" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
+
+                                                if echo "$maintainers" | grep -q "$signer_email" || echo "$maintainers" | grep -q "$signer_name"; then
+                                                    valid_maintainer=true
+                                                    break
+                                                fi
+                                            fi
+                                        done <<< "$signoffs"
+                                    fi
+
+                                    # Add to problematic commits if no valid maintainer found
+                                    if [ "$valid_maintainer" = false ]; then
+                                        problematic_commits+=("$commit_msg")
+                                    # Track commits where sender is not a maintainer but a signer is
+                                    elif [ "$sender_is_maintainer" = false ]; then
+                                        sender_not_maintainer_commits+=("$commit_msg")
+                                    fi
+                                fi
+                            done <<< "$commit_hashes"
+
+                            # Display results based on problematic commits
+                            if [ ${#problematic_commits[@]} -eq 0 ]; then
+                                echo "  ✅ Maintainer verification: Sender or a signer is listed as maintainer for all commits"
+
+                                # Add warning if we found commits where sender is not a maintainer
+                                if [ ${#sender_not_maintainer_commits[@]} -gt 0 ]; then
+                                    echo "  ⚠️ Warning: Sender is NOT listed as maintainer for these commits (but a signer is):"
+                                    for commit in "${sender_not_maintainer_commits[@]}"; do
+                                        echo "    - $commit"
+                                    done
+                                fi
+                            else
+                                echo "  ❌ Maintainer verification: Neither sender nor any signers are listed as maintainers for these commits:"
+                                for commit in "${problematic_commits[@]}"; do
+                                    echo "    - $commit"
+                                done
+                            fi
+                        fi
+                    else
+                        echo "  Fetch: ❌ Failed to fetch"
+                    fi
+                else
+                    echo "  Fetch: ❌ Failed to fetch"
+                fi
+            fi
+        elif [ -n "$verification_repo" ]; then
+            # If we only have the repository but no branch/tag, just verify the repository exists
+            echo "  Verifying: git ls-remote --exit-code \"$verification_repo\""
+            if git ls-remote --exit-code "$verification_repo" > /dev/null 2>&1; then
+                echo "  Verification: ✅ Repository exists"
+            else
+                echo "  Verification: ❌ Could not access repository"
+            fi
+        fi
+    fi
+
+    rm "$message_content"
+
+    echo "------------------------"
+done <<< "$message_urls"
+
+rm "$temp_file"
-- 
2.39.5