[Qemu-devel] [PATCH v3] scripts: use git archive in archive-source

Gerd Hoffmann posted 1 patch 5 years, 2 months ago
Test docker-clang@ubuntu passed
Test docker-mingw@fedora passed
Test asan passed
Test checkpatch failed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20190131130016.17337-1-kraxel@redhat.com
There is a newer version of this series
[Qemu-devel] [PATCH v3] scripts: use git archive in archive-source
Posted by Gerd Hoffmann 5 years, 2 months ago
Use git archive to create tarballs of qemu and submodules instead of
cloning the repository and the submodules.  This is an order of magnitude
faster because it doesn't fetch the submodules from the internet each
time the script runs.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 scripts/archive-source.sh | 63 ++++++++++++++++++++---------------------------
 1 file changed, 27 insertions(+), 36 deletions(-)

diff --git a/scripts/archive-source.sh b/scripts/archive-source.sh
index 6eed2a29bd..38d53986d7 100755
--- a/scripts/archive-source.sh
+++ b/scripts/archive-source.sh
@@ -19,8 +19,8 @@ if test $# -lt 1; then
 fi
 
 tar_file=$(realpath "$1")
-list_file="${tar_file}.list"
-vroot_dir="${tar_file}.vroot"
+sub_file=$(mktemp "${tar_file%.tar}.sub.XXXXXXXX.tar")
+sub_tdir=$(mktemp -d "${tar_file%.tar}.sub.XXXXXXXX")
 
 # We want a predictable list of submodules for builds, that is
 # independent of what the developer currently has initialized
@@ -28,7 +28,7 @@ vroot_dir="${tar_file}.vroot"
 # different to the host OS.
 submodules="dtc ui/keycodemapdb tests/fp/berkeley-softfloat-3 tests/fp/berkeley-testfloat-3"
 
-trap "status=$?; rm -rf \"$list_file\" \"$vroot_dir\"; exit \$status" 0 1 2 3 15
+trap "status=$?; rm -rf \"$sub_file\" \"$sub_tdir\" ; exit \$status" 0 1 2 3 15
 
 if git diff-index --quiet HEAD -- &>/dev/null
 then
@@ -36,38 +36,29 @@ then
 else
     HEAD=$(git stash create)
 fi
-git clone --shared . "$vroot_dir"
-test $? -ne 0 && error "failed to clone into '$vroot_dir'"
-
-cd "$vroot_dir"
-test $? -ne 0 && error "failed to change into '$vroot_dir'"
-
-git checkout $HEAD
-test $? -ne 0 && error "failed to checkout $HEAD revision"
-
+git archive --format tar $HEAD > "$tar_file"
+test $? -ne 0 && error "failed to archive qemu"
 for sm in $submodules; do
-    git submodule update --init $sm
-    test $? -ne 0 && error "failed to init submodule $sm"
+	status="$(git submodule status "$sm")"
+	smhash="${status# }"
+	smhash="${smhash#+}"
+	smhash="${smhash#-}"
+	smhash="${smhash%% *}"
+	smdir="$sm"
+	case "$status" in
+	    -*)
+		smdir="$sub_tdir/$sm"
+		smurl="$(git config -f .gitmodules submodule.${sm}.url)"
+		echo "NOTICE: using temporary clone for submodule $sm"
+		git clone "$smurl" "$smdir"
+		test $? -ne 0 && error "failed to clone submodule $sm"
+		;;
+	    +*)
+		echo "WARNING: submodule $sm is out of sync"
+		;;
+	esac
+	(cd $smdir; git archive --format tar --prefix "$sm/" $smhash) > "$sub_file"
+	test $? -ne 0 && error "failed to archive submodule $sm ($smhash)"
+	tar --concatenate --file "$tar_file" "$sub_file"
+	test $? -ne 0 && error "failed append submodule $sm to $tar_file"
 done
-
-if test -n "$submodules"; then
-    {
-        git ls-files || error "git ls-files failed"
-        for sm in $submodules; do
-            (cd $sm; git ls-files) | sed "s:^:$sm/:"
-            if test "${PIPESTATUS[*]}" != "0 0"; then
-                error "git ls-files in submodule $sm failed"
-            fi
-        done
-    } | grep -x -v $(for sm in $submodules; do echo "-e $sm"; done) > "$list_file"
-else
-    git ls-files > "$list_file"
-fi
-
-if test $? -ne 0; then
-    error "failed to generate list file"
-fi
-
-tar -cf "$tar_file" -T "$list_file" || error "failed to create tar file"
-
-exit 0
-- 
2.9.3


Re: [Qemu-devel] [PATCH v3] scripts: use git archive in archive-source
Posted by Eric Blake 5 years, 2 months ago
On 1/31/19 7:00 AM, Gerd Hoffmann wrote:
> Use git archive to create tarballs of qemu and submodules instead of
> cloning the repository and the submodules.  This is an order of magnitude
> faster because it doesn't fetch the submodules from the internet each
> time the script runs.
> 
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>  scripts/archive-source.sh | 63 ++++++++++++++++++++---------------------------
>  1 file changed, 27 insertions(+), 36 deletions(-)
> 
> diff --git a/scripts/archive-source.sh b/scripts/archive-source.sh
> index 6eed2a29bd..38d53986d7 100755
> --- a/scripts/archive-source.sh
> +++ b/scripts/archive-source.sh
> @@ -19,8 +19,8 @@ if test $# -lt 1; then
>  fi
>  
>  tar_file=$(realpath "$1")
> -list_file="${tar_file}.list"
> -vroot_dir="${tar_file}.vroot"
> +sub_file=$(mktemp "${tar_file%.tar}.sub.XXXXXXXX.tar")
> +sub_tdir=$(mktemp -d "${tar_file%.tar}.sub.XXXXXXXX")

mktemp is not specified by POSIX; and FreeBSD man pages for mktemp
suggest that if you don't use XXXXXX as the suffix, you are not
guaranteed correct behavior.  Are you sure this is portable enough?  Do
you need both a temp file and dir, or can you create the file name of
your choice inside a temp dir, where only the dir has to have a
randomized name?
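
A minimal sketch of the single-temp-directory variant asked about above.
It assumes a mktemp that accepts -d with a template ending in XXXXXX
(available on GNU and BSD systems, though still not POSIX).  The example
tar_file value and the name submodule.tar are illustrative only and are
not taken from the patch:

```shell
#!/bin/sh
# Hypothetical sketch: only the directory name is randomized; the
# temporary tar file gets a fixed name inside that directory.

# Stand-in for tar_file=$(realpath "$1") in the real script.
tar_file="${TMPDIR:-/tmp}/qemu-example.tar"

# The template ends in XXXXXX, the form the BSD man page documents.
sub_tdir=$(mktemp -d "${tar_file%.tar}.sub.XXXXXX") || exit 1

# Single quotes defer expansion of $? and $sub_tdir until the trap fires.
trap 'status=$?; rm -rf "$sub_tdir"; exit $status' 0 1 2 3 15

# Fixed (assumed) file name inside the randomized directory.
sub_file="$sub_tdir/submodule.tar"
: > "$sub_file"
```

With this layout a single rm -rf of the directory cleans up both the
temporary tar file and any temporary clone stored next to it.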

>  
>  # We want a predictable list of submodules for builds, that is
>  # independent of what the developer currently has initialized
> @@ -28,7 +28,7 @@ vroot_dir="${tar_file}.vroot"
>  # different to the host OS.
>  submodules="dtc ui/keycodemapdb tests/fp/berkeley-softfloat-3 tests/fp/berkeley-testfloat-3"
>  
> -trap "status=$?; rm -rf \"$list_file\" \"$vroot_dir\"; exit \$status" 0 1 2 3 15
> +trap "status=$?; rm -rf \"$sub_file\" \"$sub_tdir\" ; exit \$status" 0 1 2 3 15
>  
>  if git diff-index --quiet HEAD -- &>/dev/null
>  then
> @@ -36,38 +36,29 @@ then
>  else
>      HEAD=$(git stash create)
>  fi
> -git clone --shared . "$vroot_dir"
> -test $? -ne 0 && error "failed to clone into '$vroot_dir'"
> -
> -cd "$vroot_dir"
> -test $? -ne 0 && error "failed to change into '$vroot_dir'"
> -
> -git checkout $HEAD
> -test $? -ne 0 && error "failed to checkout $HEAD revision"
> -
> +git archive --format tar $HEAD > "$tar_file"
> +test $? -ne 0 && error "failed to archive qemu"
>  for sm in $submodules; do
> -    git submodule update --init $sm
> -    test $? -ne 0 && error "failed to init submodule $sm"
> +	status="$(git submodule status "$sm")"
> +	smhash="${status# }"
> +	smhash="${smhash#+}"
> +	smhash="${smhash#-}"

These three lines can be consolidated into one:
smhash=${status#[ +-]}
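
For illustration, a small sketch of how that bracket pattern behaves on
lines shaped like "git submodule status" output.  The helper name
parse_smhash and the hash values are made up for the example:

```shell
#!/bin/sh
# Hypothetical helper: extract the commit hash from one line of
# "git submodule status" output (state character, hash, space, path).
parse_smhash() {
    smhash=${1#[ +-]}       # drop one leading ' ', '+' or '-'
    smhash=${smhash%% *}    # keep only the text before the first space
    printf '%s\n' "$smhash"
}

# Made-up sample lines, one for each state character.
parse_smhash " 0123456789abcdef0123456789abcdef01234567 dtc (v0.0)"
parse_smhash "+89abcdef0123456789abcdef0123456789abcdef ui/keycodemapdb (x)"
parse_smhash "-fedcba9876543210fedcba9876543210fedcba98 tests/fp/example"
```

The '-' is placed last inside the brackets so that it is taken
literally rather than as part of a range.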

> +	smhash="${smhash%% *}"
> +	smdir="$sm"
> +	case "$status" in
> +	    -*)
> +		smdir="$sub_tdir/$sm"
> +		smurl="$(git config -f .gitmodules submodule.${sm}.url)"
> +		echo "NOTICE: using temporary clone for submodule $sm"
> +		git clone "$smurl" "$smdir"
> +		test $? -ne 0 && error "failed to clone submodule $sm"

I know we don't want to affect the developer's normal checkout, but is
it worth storing the temporary clone in a specifically-named
subdirectory of their checkout instead of in a randomly-generated mktemp
transient location, so that we can reuse results from a previous
archive-source run?

> +		;;
> +	    +*)
> +		echo "WARNING: submodule $sm is out of sync"
> +		;;
> +	esac
> +	(cd $smdir; git archive --format tar --prefix "$sm/" $smhash) > "$sub_file"
> +	test $? -ne 0 && error "failed to archive submodule $sm ($smhash)"
> +	tar --concatenate --file "$tar_file" "$sub_file"
> +	test $? -ne 0 && error "failed append submodule $sm to $tar_file"
>  done

Overall, though, the idea seems reasonable.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH v3] scripts: use git archive in archive-source
Posted by Gerd Hoffmann 5 years, 2 months ago
  Hi,

> >  tar_file=$(realpath "$1")
> > -list_file="${tar_file}.list"
> > -vroot_dir="${tar_file}.vroot"
> > +sub_file=$(mktemp "${tar_file%.tar}.sub.XXXXXXXX.tar")
> > +sub_tdir=$(mktemp -d "${tar_file%.tar}.sub.XXXXXXXX")
> 
> mktemp is not specified by POSIX; and FreeBSD man pages for mktemp
> suggest that if you don't use XXXXXX as the suffix, you are not
> guaranteed correct behavior.  Are you sure this is portable enough?  Do
> you need both a temp file and dir, or can you create the file name of
> your choice inside a temp dir, where only the dir has to have a
> randomized name?

Yes, storing the temp tar in the temp dir should work.

> > +	status="$(git submodule status "$sm")"
> > +	smhash="${status# }"
> > +	smhash="${smhash#+}"
> > +	smhash="${smhash#-}"
> 
> These three lines can be consolidated into one:
> smhash=${status#[ +-]}

Ah, cool.  Learned a new trick.

> > +		smdir="$sub_tdir/$sm"
> > +		smurl="$(git config -f .gitmodules submodule.${sm}.url)"
> > +		echo "NOTICE: using temporary clone for submodule $sm"
> > +		git clone "$smurl" "$smdir"
> > +		test $? -ne 0 && error "failed to clone submodule $sm"
> 
> I know we don't want to affect the developer's normal checkout, but is
> it worth storing the temporary clone in a specifically-named
> subdirectory of their checkout instead

Hmm, we could do "git submodule init + git archive + git submodule
deinit".  With git storing a bare repo in .git/modules/$submodule these
days (and not deleting it on deinit) that'll effectively cache things.
It's a (temporary) modification of the checkout though.

Alternatively we could clone to $HOME/.cache/$somewhere, similar to the
vm tests which store downloads (and soon vm images too) below
$HOME/.cache/qemu-vm/.
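
A rough sketch of what the $HOME/.cache variant might look like.
Nothing here is from the patch: the directory name qemu-archive-source,
both helper names, and the choice of a mirror clone are assumptions
made purely for illustration:

```shell
#!/bin/sh
# Hypothetical helpers for caching submodule clones below $HOME/.cache.

# Print the (assumed) cache location for one submodule path.
sm_cache_dir() {
    printf '%s/.cache/qemu-archive-source/%s\n' "$HOME" "$1"
}

# Clone on first use, fetch on later runs.
# Arguments: submodule path, submodule URL.
sm_cache_update() {
    dir=$(sm_cache_dir "$1")
    if test -d "$dir"; then
        # A mirror clone has a fetch refspec, so this updates all refs.
        git -C "$dir" fetch --quiet origin
    else
        mkdir -p "$(dirname "$dir")" &&
            git clone --quiet --mirror "$2" "$dir"
    fi
}
```

The archive step could then run against the cached repository, for
example with git -C "$(sm_cache_dir "$sm")" archive --format tar
--prefix "$sm/" "$smhash", leaving the developer's checkout untouched.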

cheers,
  Gerd


Re: [Qemu-devel] [PATCH v3] scripts: use git archive in archive-source
Posted by no-reply@patchew.org 5 years, 2 months ago
Patchew URL: https://patchew.org/QEMU/20190131130016.17337-1-kraxel@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20190131130016.17337-1-kraxel@redhat.com
Subject: [Qemu-devel] [PATCH v3] scripts: use git archive in archive-source
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]      patchew/20190123092538.8004-1-kbastian@mail.uni-paderborn.de -> patchew/20190123092538.8004-1-kbastian@mail.uni-paderborn.de
Switched to a new branch 'test'
cfe9664 scripts: use git archive in archive-source

=== OUTPUT BEGIN ===
ERROR: code indent should never use tabs
#57: FILE: scripts/archive-source.sh:42:
+^Istatus="$(git submodule status "$sm")"$

ERROR: code indent should never use tabs
#58: FILE: scripts/archive-source.sh:43:
+^Ismhash="${status# }"$

ERROR: code indent should never use tabs
#59: FILE: scripts/archive-source.sh:44:
+^Ismhash="${smhash#+}"$

ERROR: code indent should never use tabs
#60: FILE: scripts/archive-source.sh:45:
+^Ismhash="${smhash#-}"$

ERROR: code indent should never use tabs
#61: FILE: scripts/archive-source.sh:46:
+^Ismhash="${smhash%% *}"$

ERROR: code indent should never use tabs
#62: FILE: scripts/archive-source.sh:47:
+^Ismdir="$sm"$

ERROR: code indent should never use tabs
#63: FILE: scripts/archive-source.sh:48:
+^Icase "$status" in$

ERROR: code indent should never use tabs
#64: FILE: scripts/archive-source.sh:49:
+^I    -*)$

ERROR: code indent should never use tabs
#65: FILE: scripts/archive-source.sh:50:
+^I^Ismdir="$sub_tdir/$sm"$

ERROR: code indent should never use tabs
#66: FILE: scripts/archive-source.sh:51:
+^I^Ismurl="$(git config -f .gitmodules submodule.${sm}.url)"$

ERROR: code indent should never use tabs
#67: FILE: scripts/archive-source.sh:52:
+^I^Iecho "NOTICE: using temporary clone for submodule $sm"$

ERROR: code indent should never use tabs
#68: FILE: scripts/archive-source.sh:53:
+^I^Igit clone "$smurl" "$smdir"$

ERROR: code indent should never use tabs
#69: FILE: scripts/archive-source.sh:54:
+^I^Itest $? -ne 0 && error "failed to clone submodule $sm"$

ERROR: code indent should never use tabs
#70: FILE: scripts/archive-source.sh:55:
+^I^I;;$

ERROR: code indent should never use tabs
#71: FILE: scripts/archive-source.sh:56:
+^I    +*)$

ERROR: code indent should never use tabs
#72: FILE: scripts/archive-source.sh:57:
+^I^Iecho "WARNING: submodule $sm is out of sync"$

ERROR: code indent should never use tabs
#73: FILE: scripts/archive-source.sh:58:
+^I^I;;$

ERROR: code indent should never use tabs
#74: FILE: scripts/archive-source.sh:59:
+^Iesac$

WARNING: line over 80 characters
#75: FILE: scripts/archive-source.sh:60:
+       (cd $smdir; git archive --format tar --prefix "$sm/" $smhash) > "$sub_file"

ERROR: code indent should never use tabs
#75: FILE: scripts/archive-source.sh:60:
+^I(cd $smdir; git archive --format tar --prefix "$sm/" $smhash) > "$sub_file"$

ERROR: code indent should never use tabs
#76: FILE: scripts/archive-source.sh:61:
+^Itest $? -ne 0 && error "failed to archive submodule $sm ($smhash)"$

ERROR: code indent should never use tabs
#77: FILE: scripts/archive-source.sh:62:
+^Itar --concatenate --file "$tar_file" "$sub_file"$

ERROR: code indent should never use tabs
#78: FILE: scripts/archive-source.sh:63:
+^Itest $? -ne 0 && error "failed append submodule $sm to $tar_file"$

total: 22 errors, 1 warnings, 58 lines checked

Commit cfe96645bc47 (scripts: use git archive in archive-source) has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1
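
The tab errors above can be spotted locally before posting.  The
following is a minimal sketch with made-up helper names; it only
approximates the checkpatch rule by matching lines whose leading
whitespace contains a tab, and is not how checkpatch.pl implements it:

```shell
#!/bin/sh
# Hypothetical helpers: find lines whose indentation contains a tab.
tab=$(printf '\t')

# Print "line-number:content" for each offending line.
find_tab_indent() {
    grep -n "^ *${tab}" "$1"
}

# Print the number of offending lines (0 when the file is clean).
count_tab_indent() {
    grep -c "^ *${tab}" "$1"
}
```

For example, count_tab_indent scripts/archive-source.sh prints how
many lines would need their indentation converted to spaces.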


The full log is available at
http://patchew.org/logs/20190131130016.17337-1-kraxel@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com