[PATCH] tests/qemu-iotests: Rework the checks and spots using GNU sed

Thomas Huth posted 1 patch 2 years, 2 months ago
Failed in applying to current master (apply log)
There is a newer version of this series
tests/check-block.sh             | 12 ------
tests/qemu-iotests/271           |  2 +-
tests/qemu-iotests/common.filter | 74 ++++++++++++++++----------------
tests/qemu-iotests/common.rc     | 45 +++++++++----------
4 files changed, 61 insertions(+), 72 deletions(-)
[PATCH] tests/qemu-iotests: Rework the checks and spots using GNU sed
Posted by Thomas Huth 2 years, 2 months ago
Instead of failing the iotests if GNU sed is not available (or skipping
them completely in the check-block.sh script), it would be better to
simply skip the bash-based tests that rely on GNU sed, so that the other
tests could still be run. Thus we now explicitely use "gsed" (either as
direct program or as a wrapper around "sed" if it's the GNU version)
in the spots that rely on the GNU sed behavior. Then we also remove the
sed checks from the check-block.sh script, so that "make check-block"
can now be run on systems without GNU sed, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 I've checked that this works fine with:
 make vm-build-netbsd TARGET_LIST=x86_64-softmmu BUILD_TARGET=check-block
 make vm-build-freebsd TARGET_LIST=x86_64-softmmu BUILD_TARGET=check-block
 and with the macOS targets in our CI.

 tests/check-block.sh             | 12 ------
 tests/qemu-iotests/271           |  2 +-
 tests/qemu-iotests/common.filter | 74 ++++++++++++++++----------------
 tests/qemu-iotests/common.rc     | 45 +++++++++----------
 4 files changed, 61 insertions(+), 72 deletions(-)

diff --git a/tests/check-block.sh b/tests/check-block.sh
index 720a46bc36..af0c574812 100755
--- a/tests/check-block.sh
+++ b/tests/check-block.sh
@@ -52,18 +52,6 @@ if LANG=C bash --version | grep -q 'GNU bash, version [123]' ; then
     skip "bash version too old ==> Not running the qemu-iotests."
 fi
 
-if ! (sed --version | grep 'GNU sed') > /dev/null 2>&1 ; then
-    if ! command -v gsed >/dev/null 2>&1; then
-        skip "GNU sed not available ==> Not running the qemu-iotests."
-    fi
-else
-    # Double-check that we're not using BusyBox' sed which says
-    # that "This is not GNU sed version 4.0" ...
-    if sed --version | grep -q 'not GNU sed' ; then
-        skip "BusyBox sed not supported ==> Not running the qemu-iotests."
-    fi
-fi
-
 cd tests/qemu-iotests
 
 # QEMU_CHECK_BLOCK_AUTO is used to disable some unstable sub-tests
diff --git a/tests/qemu-iotests/271 b/tests/qemu-iotests/271
index 2775b4d130..c7c2cadda0 100755
--- a/tests/qemu-iotests/271
+++ b/tests/qemu-iotests/271
@@ -896,7 +896,7 @@ _make_test_img -o extended_l2=on 1M
 # Second and third writes in _concurrent_io() are independent and may finish in
 # different order. So, filter offset out to match both possible variants.
 _concurrent_io     | $QEMU_IO | _filter_qemu_io | \
-    $SED -e 's/\(20480\|40960\)/OFFSET/'
+    sed -e 's/\(20480\|40960\)/OFFSET/'
 _concurrent_verify | $QEMU_IO | _filter_qemu_io
 
 # success, all done
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 935217aa65..a3b1708adc 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -21,58 +21,58 @@
 
 _filter_date()
 {
-    $SED -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/yyyy-mm-dd hh:mm:ss/'
+    gsed -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/yyyy-mm-dd hh:mm:ss/'
 }
 
 _filter_vmstate_size()
 {
-    $SED -r -e 's/[0-9. ]{5} [KMGT]iB/     SIZE/' \
+    gsed -r -e 's/[0-9. ]{5} [KMGT]iB/     SIZE/' \
             -e 's/[0-9. ]{5} B/   SIZE/'
 }
 
 _filter_generated_node_ids()
 {
-    $SED -re 's/\#block[0-9]{3,}/NODE_NAME/'
+    gsed -re 's/\#block[0-9]{3,}/NODE_NAME/'
 }
 
 _filter_qom_path()
 {
-    $SED -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g'
+    gsed -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g'
 }
 
 # replace occurrences of the actual TEST_DIR value with TEST_DIR
 _filter_testdir()
 {
-    $SED -e "s#$TEST_DIR/#TEST_DIR/#g" \
-         -e "s#$SOCK_DIR/#SOCK_DIR/#g" \
-         -e "s#SOCK_DIR/fuse-#TEST_DIR/#g"
+    sed -e "s#$TEST_DIR/#TEST_DIR/#g" \
+        -e "s#$SOCK_DIR/#SOCK_DIR/#g" \
+        -e "s#SOCK_DIR/fuse-#TEST_DIR/#g"
 }
 
 # replace occurrences of the actual IMGFMT value with IMGFMT
 _filter_imgfmt()
 {
-    $SED -e "s#$IMGFMT#IMGFMT#g"
+    sed -e "s#$IMGFMT#IMGFMT#g"
 }
 
 # Replace error message when the format is not supported and delete
 # the output lines after the first one
 _filter_qemu_img_check()
 {
-    $SED -e '/allocated.*fragmented.*compressed clusters/d' \
-        -e 's/qemu-img: This image format does not support checks/No errors were found on the image./' \
-        -e '/Image end offset: [0-9]\+/d'
+    gsed -e '/allocated.*fragmented.*compressed clusters/d' \
+         -e 's/qemu-img: This image format does not support checks/No errors were found on the image./' \
+         -e '/Image end offset: [0-9]\+/d'
 }
 
 # Removes \r from messages
 _filter_win32()
 {
-    $SED -e 's/\r//g'
+    gsed -e 's/\r//g'
 }
 
 # sanitize qemu-io output
 _filter_qemu_io()
 {
-    _filter_win32 | $SED -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \
+    _filter_win32 | gsed -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \
         -e "s/: line [0-9][0-9]*:  *[0-9][0-9]*\( Aborted\| Killed\)/:\1/" \
         -e "s/qemu-io> //g"
 }
@@ -80,7 +80,7 @@ _filter_qemu_io()
 # replace occurrences of QEMU_PROG with "qemu"
 _filter_qemu()
 {
-    $SED -e "s#\\(^\\|(qemu) \\)$(basename $QEMU_PROG):#\1QEMU_PROG:#" \
+    gsed -e "s#\\(^\\|(qemu) \\)$(basename $QEMU_PROG):#\1QEMU_PROG:#" \
         -e 's#^QEMU [0-9]\+\.[0-9]\+\.[0-9]\+ monitor#QEMU X.Y.Z monitor#' \
         -e 's#I/O error#Input/output error#' \
         -e $'s#\r##' # QEMU monitor uses \r\n line endings
@@ -90,41 +90,41 @@ _filter_qemu()
 _filter_qmp()
 {
     _filter_win32 | \
-    $SED -e 's#\("\(micro\)\?seconds": \)[0-9]\+#\1 TIMESTAMP#g' \
-        -e 's#^{"QMP":.*}$#QMP_VERSION#' \
-        -e '/^    "QMP": {\s*$/, /^    }\s*$/ c\' \
-        -e '    QMP_VERSION'
+    gsed -e 's#\("\(micro\)\?seconds": \)[0-9]\+#\1 TIMESTAMP#g' \
+         -e 's#^{"QMP":.*}$#QMP_VERSION#' \
+         -e '/^    "QMP": {\s*$/, /^    }\s*$/ c\' \
+         -e '    QMP_VERSION'
 }
 
 # readline makes HMP command strings so long that git complains
 _filter_hmp()
 {
-    $SED -e $'s/^\\((qemu) \\)\\?.*\e\\[D/\\1/g' \
-        -e $'s/\e\\[K//g'
+    gsed -e $'s/^\\((qemu) \\)\\?.*\e\\[D/\\1/g' \
+         -e $'s/\e\\[K//g'
 }
 
 # replace block job offset
 _filter_block_job_offset()
 {
-    $SED -e 's/, "offset": [0-9]\+,/, "offset": OFFSET,/'
+    sed -e 's/, "offset": [0-9]\+,/, "offset": OFFSET,/'
 }
 
 # replace block job len
 _filter_block_job_len()
 {
-    $SED -e 's/, "len": [0-9]\+,/, "len": LEN,/g'
+    sed -e 's/, "len": [0-9]\+,/, "len": LEN,/g'
 }
 
 # replace actual image size (depends on the host filesystem)
 _filter_actual_image_size()
 {
-    $SED -s 's/\("actual-size":\s*\)[0-9]\+/\1SIZE/g'
+    gsed -s 's/\("actual-size":\s*\)[0-9]\+/\1SIZE/g'
 }
 
 # Filename filters for qemu-img create
 _filter_img_create_filenames()
 {
-    $SED \
+    sed \
         -e "s#$REMOTE_TEST_DIR#TEST_DIR#g" \
         -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \
         -e "s#$TEST_DIR#TEST_DIR#g" \
@@ -142,7 +142,7 @@ _do_filter_img_create()
     # precedes ", fmt=") and the options part ($options, which starts
     # with "fmt=")
     # (And just echo everything before the first "^Formatting")
-    readarray formatting_line < <($SED -e 's/, fmt=/\n/')
+    readarray formatting_line < <(gsed -e 's/, fmt=/\n/')
 
     filename_part=${formatting_line[0]}
     unset formatting_line[0]
@@ -169,11 +169,11 @@ _do_filter_img_create()
     options=$(
         echo "$options" \
         | tr '\n' '\0' \
-        | $SED -e 's/ \([a-z0-9_.-]*\)=/\n\1=/g' \
+        | gsed -e 's/ \([a-z0-9_.-]*\)=/\n\1=/g' \
         | grep -a -e '^fmt' -e '^size' -e '^backing' -e '^preallocation' \
                   -e '^encryption' "${grep_data_file[@]}" \
         | _filter_img_create_filenames \
-        | $SED \
+        | sed \
             -e 's/^\(fmt\)/0-\1/' \
             -e 's/^\(size\)/1-\1/' \
             -e 's/^\(backing\)/2-\1/' \
@@ -181,9 +181,9 @@ _do_filter_img_create()
             -e 's/^\(encryption\)/4-\1/' \
             -e 's/^\(preallocation\)/8-\1/' \
         | LC_ALL=C sort \
-        | $SED -e 's/^[0-9]-//' \
+        | sed -e 's/^[0-9]-//' \
         | tr '\n\0' ' \n' \
-        | $SED -e 's/^ *$//' -e 's/ *$//'
+        | sed -e 's/^ *$//' -e 's/ *$//'
     )
 
     if [ -n "$options" ]; then
@@ -209,7 +209,7 @@ _filter_img_create()
 
 _filter_img_create_size()
 {
-    $SED -e "s# size=[0-9]\\+# size=SIZE#g"
+    sed -e "s# size=[0-9]\\+# size=SIZE#g"
 }
 
 _filter_img_info()
@@ -223,7 +223,7 @@ _filter_img_info()
 
     discard=0
     regex_json_spec_start='^ *"format-specific": \{'
-    $SED -e "s#$REMOTE_TEST_DIR#TEST_DIR#g" \
+    gsed -e "s#$REMOTE_TEST_DIR#TEST_DIR#g" \
         -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \
         -e "s#$TEST_DIR#TEST_DIR#g" \
         -e "s#$SOCK_DIR#SOCK_DIR#g" \
@@ -285,7 +285,7 @@ _filter_qemu_img_map()
         data_file_filter=(-e "s#$data_file_pattern#\\1#")
     fi
 
-    $SED -e 's/\([0-9a-fx]* *[0-9a-fx]* *\)[0-9a-fx]* */\1/g' \
+    sed -e 's/\([0-9a-fx]* *[0-9a-fx]* *\)[0-9a-fx]* */\1/g' \
         -e 's/"offset": [0-9]\+/"offset": OFFSET/g' \
         -e 's/Mapped to *//' \
         "${data_file_filter[@]}" \
@@ -299,7 +299,7 @@ _filter_nbd()
     # receive callbacks sometimes, making them unreliable.
     #
     # Filter out the TCP port number since this changes between runs.
-    $SED -e '/nbd\/.*\.c:/d' \
+    sed -e '/nbd\/.*\.c:/d' \
         -e 's#127\.0\.0\.1:[0-9]*#127.0.0.1:PORT#g' \
         -e "s#?socket=$SOCK_DIR#?socket=SOCK_DIR#g" \
         -e 's#\(foo\|PORT/\?\|.sock\): Failed to .*$#\1#'
@@ -336,14 +336,14 @@ sys.stdout.write(result)'
 
 _filter_authz_check_tls()
 {
-    $SED -e 's/TLS x509 authz check for .* is denied/TLS x509 authz check for DISTINGUISHED-NAME is denied/'
+    sed -e 's/TLS x509 authz check for .* is denied/TLS x509 authz check for DISTINGUISHED-NAME is denied/'
 }
 
 _filter_qcow2_compression_type_bit()
 {
-    $SED -e 's/\(incompatible_features\s\+\)\[3\(, \)\?/\1[/' \
-         -e 's/\(incompatible_features.*\), 3\]/\1]/' \
-         -e 's/\(incompatible_features.*\), 3\(,.*\)/\1\2/'
+    sed -e 's/\(incompatible_features\s\+\)\[3\(, \)\?/\1[/' \
+        -e 's/\(incompatible_features.*\), 3\]/\1]/' \
+        -e 's/\(incompatible_features.*\), 3\(,.*\)/\1\2/'
 }
 
 # make sure this script returns success
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 9885030b43..3bfd94c2e0 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -17,17 +17,28 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 #
 
-SED=
-for sed in sed gsed; do
-    ($sed --version | grep 'GNU sed') > /dev/null 2>&1
-    if [ "$?" -eq 0 ]; then
-        SED=$sed
-        break
-    fi
-done
-if [ -z "$SED" ]; then
-    echo "$0: GNU sed not found"
-    exit 1
+# bail out, setting up .notrun file
+_notrun()
+{
+    echo "$*" >"$OUTPUT_DIR/$seq.notrun"
+    echo "$seq not run: $*"
+    status=0
+    exit
+}
+
+if ! command -v gsed >/dev/null 2>&1; then
+    if sed --version 2>&1 | grep -v 'not GNU sed' | grep 'GNU sed' > /dev/null;
+    then
+        gsed()
+        {
+            sed "$@"
+        }
+    else
+        gsed()
+        {
+            _notrun "GNU sed not available"
+        }
+    fi
 fi
 
 dd()
@@ -722,16 +733,6 @@ _img_info()
         done
 }
 
-# bail out, setting up .notrun file
-#
-_notrun()
-{
-    echo "$*" >"$OUTPUT_DIR/$seq.notrun"
-    echo "$seq not run: $*"
-    status=0
-    exit
-}
-
 # bail out, setting up .casenotrun file
 # The function _casenotrun() is used as a notifier. It is the
 # caller's responsibility to make skipped a particular test.
@@ -920,7 +921,7 @@ _require_working_luks()
     IMGFMT='luks' _rm_test_img "$file"
 
     if [ $status != 0 ]; then
-        reason=$(echo "$output" | grep "$file:" | $SED -e "s#.*$file: *##")
+        reason=$(echo "$output" | grep "$file:" | sed -e "s#.*$file: *##")
         if [ -z "$reason" ]; then
             reason="Failed to create a LUKS image"
         fi
-- 
2.27.0


Re: [PATCH] tests/qemu-iotests: Rework the checks and spots using GNU sed
Posted by Eric Blake 2 years, 2 months ago
On Tue, Feb 15, 2022 at 02:20:31PM +0100, Thomas Huth wrote:
> Instead of failing the iotests if GNU sed is not available (or skipping
> them completely in the check-block.sh script), it would be better to
> simply skip the bash-based tests that rely on GNU sed, so that the other
> tests could still be run. Thus we now explicitely use "gsed" (either as
> direct program or as a wrapper around "sed" if it's the GNU version)
> in the spots that rely on the GNU sed behavior. Then we also remove the
> sed checks from the check-block.sh script, so that "make check-block"
> can now be run on systems without GNU sed, too.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  I've checked that this works fine with:
>  make vm-build-netbsd TARGET_LIST=x86_64-softmmu BUILD_TARGET=check-block
>  make vm-build-freebsd TARGET_LIST=x86_64-softmmu BUILD_TARGET=check-block
>  and with the macOS targets in our CI.
> 
>  tests/check-block.sh             | 12 ------
>  tests/qemu-iotests/271           |  2 +-
>  tests/qemu-iotests/common.filter | 74 ++++++++++++++++----------------
>  tests/qemu-iotests/common.rc     | 45 +++++++++----------
>  4 files changed, 61 insertions(+), 72 deletions(-)
> 

> +++ b/tests/qemu-iotests/271
> @@ -896,7 +896,7 @@ _make_test_img -o extended_l2=on 1M
>  # Second and third writes in _concurrent_io() are independent and may finish in
>  # different order. So, filter offset out to match both possible variants.
>  _concurrent_io     | $QEMU_IO | _filter_qemu_io | \
> -    $SED -e 's/\(20480\|40960\)/OFFSET/'
> +    sed -e 's/\(20480\|40960\)/OFFSET/'

Looks like a portable sed script, so 'sed' instead of 'gsed' here is fine.

>  _concurrent_verify | $QEMU_IO | _filter_qemu_io
>  
>  # success, all done
> diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
> index 935217aa65..a3b1708adc 100644
> --- a/tests/qemu-iotests/common.filter
> +++ b/tests/qemu-iotests/common.filter
> @@ -21,58 +21,58 @@
>  
>  _filter_date()
>  {
> -    $SED -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/yyyy-mm-dd hh:mm:ss/'
> +    gsed -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/yyyy-mm-dd hh:mm:ss/'

GNU sed recommends spelling it 'sed -E', not 'sed -r', when using
extended regex.  Older POSIX did not support either spelling, but the
next version will require -E, as many implementations have it now:
https://www.austingroupbugs.net/view.php?id=528

Other than the fact that this was easier to write with ERE, I'm not
seeing any other GNU-only extensions in use here; but I'd recommend
that as long as we're touching the line, we spell it 'gsed -Ee'
instead of -re (here, and in several other places).

>  _filter_qom_path()
>  {
> -    $SED -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g'
> +    gsed -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g'

Here, it is our use of \+ that is a GNU sed extension, although it is
fairly easy (but verbose) to translate that one to portable sed
(<PAT>\+ is the same as <PAT><PAT>*).  So gsed is correct.  On the
other hand, the use of [[0-9]\+\] looks ugly - it probably does NOT
match what we meant (we have a bracket expression '[...]' that matches
the 11 characters [ and 0-9, then '\+' to match that bracket
expression 1 or more times, then '\]' which in its context is
identical to ']' to match a closing ], since only opening [ needs \
escaping for literal treatment).  What we probably meant is:

'/Attached to:/s/\device\[[0-9][0-9]*]/device[N]/g'

at which point normal sed would do.

>  }
>  
>  # replace occurrences of the actual TEST_DIR value with TEST_DIR
>  _filter_testdir()
>  {
> -    $SED -e "s#$TEST_DIR/#TEST_DIR/#g" \
> -         -e "s#$SOCK_DIR/#SOCK_DIR/#g" \
> -         -e "s#SOCK_DIR/fuse-#TEST_DIR/#g"
> +    sed -e "s#$TEST_DIR/#TEST_DIR/#g" \
> +        -e "s#$SOCK_DIR/#SOCK_DIR/#g" \
> +        -e "s#SOCK_DIR/fuse-#TEST_DIR/#g"

And this one indeed looks portable to POSIX (unless $TEST_DIR contains
weird stuff by accident).

>  # Removes \r from messages
>  _filter_win32()
>  {
> -    $SED -e 's/\r//g'
> +    gsed -e 's/\r//g'

Yep, \r is another GNU sed extension.

>  }
>  
>  # sanitize qemu-io output
>  _filter_qemu_io()
>  {
> -    _filter_win32 | $SED -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \
> +    _filter_win32 | gsed -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \
>          -e "s/: line [0-9][0-9]*:  *[0-9][0-9]*\( Aborted\| Killed\)/:\1/" \
>          -e "s/qemu-io> //g"

I'm not seeing anything specific to GNU sed in this (long) sed script;
can we relax this one to plain 'sed'?  Use of s#some/text## might be
easier than having to s/some\/text//, but that's not essential.

>  }
> @@ -80,7 +80,7 @@ _filter_qemu_io()
>  # replace occurrences of QEMU_PROG with "qemu"
>  _filter_qemu()
>  {
> -    $SED -e "s#\\(^\\|(qemu) \\)$(basename $QEMU_PROG):#\1QEMU_PROG:#" \
> +    gsed -e "s#\\(^\\|(qemu) \\)$(basename $QEMU_PROG):#\1QEMU_PROG:#" \
>          -e 's#^QEMU [0-9]\+\.[0-9]\+\.[0-9]\+ monitor#QEMU X.Y.Z monitor#' \

More uses of \+ explaining why gsed is nicer.

>          -e 's#I/O error#Input/output error#' \
>          -e $'s#\r##' # QEMU monitor uses \r\n line endings
> @@ -90,41 +90,41 @@ _filter_qemu()
>  _filter_qmp()
>  {
>      _filter_win32 | \
> -    $SED -e 's#\("\(micro\)\?seconds": \)[0-9]\+#\1 TIMESTAMP#g' \
> -        -e 's#^{"QMP":.*}$#QMP_VERSION#' \
> -        -e '/^    "QMP": {\s*$/, /^    }\s*$/ c\' \
> -        -e '    QMP_VERSION'
> +    gsed -e 's#\("\(micro\)\?seconds": \)[0-9]\+#\1 TIMESTAMP#g' \
> +         -e 's#^{"QMP":.*}$#QMP_VERSION#' \
> +         -e '/^    "QMP": {\s*$/, /^    }\s*$/ c\' \
> +         -e '    QMP_VERSION'

In addition to the \+, this one has a c\ command split across two -e
parameters.  Not portable to really ancient sed, but recently
standardized by POSIX:
https://www.austingroupbugs.net/view.php?id=262.  I'm happy with
requiring gsed instead of trying to rewrite \+ and assuming that -e
'c\' -e 'text' is portable.

>  }
>  
>  # readline makes HMP command strings so long that git complains
>  _filter_hmp()
>  {
> -    $SED -e $'s/^\\((qemu) \\)\\?.*\e\\[D/\\1/g' \
> -        -e $'s/\e\\[K//g'
> +    gsed -e $'s/^\\((qemu) \\)\\?.*\e\\[D/\\1/g' \
> +         -e $'s/\e\\[K//g'

\e is indeed GNU sed.  There are other was to spell ESC in portable
sed, but not worth the bother compared to just using gsed.

> @@ -142,7 +142,7 @@ _do_filter_img_create()
>      # precedes ", fmt=") and the options part ($options, which starts
>      # with "fmt=")
>      # (And just echo everything before the first "^Formatting")
> -    readarray formatting_line < <($SED -e 's/, fmt=/\n/')
> +    readarray formatting_line < <(gsed -e 's/, fmt=/\n/')

This one looks like it should work with plain 'sed'.

>  
>      filename_part=${formatting_line[0]}
>      unset formatting_line[0]
> @@ -169,11 +169,11 @@ _do_filter_img_create()
>      options=$(
>          echo "$options" \
>          | tr '\n' '\0' \
> -        | $SED -e 's/ \([a-z0-9_.-]*\)=/\n\1=/g' \
> +        | gsed -e 's/ \([a-z0-9_.-]*\)=/\n\1=/g' \

And this one.

>          | grep -a -e '^fmt' -e '^size' -e '^backing' -e '^preallocation' \
>                    -e '^encryption' "${grep_data_file[@]}" \
>          | _filter_img_create_filenames \
> -        | $SED \
> +        | sed \
>              -e 's/^\(fmt\)/0-\1/' \
>              -e 's/^\(size\)/1-\1/' \
>              -e 's/^\(backing\)/2-\1/' \
> @@ -181,9 +181,9 @@ _do_filter_img_create()
>              -e 's/^\(encryption\)/4-\1/' \
>              -e 's/^\(preallocation\)/8-\1/' \

Missing context here, but also probably safe for plain 'sed'.

>          | LC_ALL=C sort \
> -        | $SED -e 's/^[0-9]-//' \
> +        | sed -e 's/^[0-9]-//' \
>          | tr '\n\0' ' \n' \
> -        | $SED -e 's/^ *$//' -e 's/ *$//'
> +        | sed -e 's/^ *$//' -e 's/ *$//'
>      )
>  
>      if [ -n "$options" ]; then
> @@ -209,7 +209,7 @@ _filter_img_create()
>  
>  _filter_img_create_size()
>  {
> -    $SED -e "s# size=[0-9]\\+# size=SIZE#g"
> +    sed -e "s# size=[0-9]\\+# size=SIZE#g"

The use of "\\+" here either needs gsed, or respelling as [0-9][0-9]*.

>  }
>  
>  _filter_img_info()
> @@ -223,7 +223,7 @@ _filter_img_info()
>  
>      discard=0
>      regex_json_spec_start='^ *"format-specific": \{'
> -    $SED -e "s#$REMOTE_TEST_DIR#TEST_DIR#g" \
> +    gsed -e "s#$REMOTE_TEST_DIR#TEST_DIR#g" \
>          -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \
>          -e "s#$TEST_DIR#TEST_DIR#g" \
>          -e "s#$SOCK_DIR#SOCK_DIR#g" \

I didn't check context for whether this one needs to be gsed, or could
be plain sed.

> +++ b/tests/qemu-iotests/common.rc
> @@ -17,17 +17,28 @@
>  # along with this program.  If not, see <http://www.gnu.org/licenses/>.
>  #
>  
> -SED=
> -for sed in sed gsed; do
> -    ($sed --version | grep 'GNU sed') > /dev/null 2>&1
> -    if [ "$?" -eq 0 ]; then
> -        SED=$sed
> -        break
> -    fi
> -done
> -if [ -z "$SED" ]; then
> -    echo "$0: GNU sed not found"
> -    exit 1
> +# bail out, setting up .notrun file
> +_notrun()
> +{
> +    echo "$*" >"$OUTPUT_DIR/$seq.notrun"
> +    echo "$seq not run: $*"
> +    status=0
> +    exit
> +}
> +
> +if ! command -v gsed >/dev/null 2>&1; then
> +    if sed --version 2>&1 | grep -v 'not GNU sed' | grep 'GNU sed' > /dev/null;
> +    then
> +        gsed()
> +        {
> +            sed "$@"
> +        }
> +    else
> +        gsed()
> +        {
> +            _notrun "GNU sed not available"
> +        }
> +    fi
>  fi

This one looks good.

I found one or two issues that need to be fixed, and a couple of
"might as well improve them while touching the line anyway", but
overall I like where this is headed.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Re: [PATCH] tests/qemu-iotests: Rework the checks and spots using GNU sed
Posted by Thomas Huth 2 years, 2 months ago
On 15/02/2022 23.10, Eric Blake wrote:
> On Tue, Feb 15, 2022 at 02:20:31PM +0100, Thomas Huth wrote:
>> Instead of failing the iotests if GNU sed is not available (or skipping
>> them completely in the check-block.sh script), it would be better to
>> simply skip the bash-based tests that rely on GNU sed, so that the other
>> tests could still be run. Thus we now explicitely use "gsed" (either as
>> direct program or as a wrapper around "sed" if it's the GNU version)
>> in the spots that rely on the GNU sed behavior. Then we also remove the
>> sed checks from the check-block.sh script, so that "make check-block"
>> can now be run on systems without GNU sed, too.
...
>> diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
>> index 935217aa65..a3b1708adc 100644
>> --- a/tests/qemu-iotests/common.filter
>> +++ b/tests/qemu-iotests/common.filter
>> @@ -21,58 +21,58 @@
>>   
>>   _filter_date()
>>   {
>> -    $SED -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/yyyy-mm-dd hh:mm:ss/'
>> +    gsed -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/yyyy-mm-dd hh:mm:ss/'
> 
> GNU sed recommends spelling it 'sed -E', not 'sed -r', when using
> extended regex.  Older POSIX did not support either spelling, but the
> next version will require -E, as many implementations have it now:
> https://www.austingroupbugs.net/view.php?id=528

Thanks for the pointer ... I originally checked "man 1p sed" on
my system and did not see -r or -E in there, so I thought that
this must be really something specific to GNU sed. But now that
you've mentioned this, I just double-checked the build environments
that we support with QEMU, and seems like -E should be supported
everywhere:

macOS 11:

$ sed --help
sed: illegal option -- -
usage: sed script [-Ealnru] [-i extension] [file ...]
	sed [-Ealnu] [-i extension] [-e script] ... [-f script_file] ... [file ...]


NetBSD 9.2:

$ sed --help
sed: unknown option -- -
Usage:  sed [-aElnru] command [file ...]
         sed [-aElnru] [-e command] [-f command_file] [-I[extension]]
             [-i[extension]] [file ...]


FreeBSD 12.3:

$ sed --help
sed: illegal option -- -
usage: sed script [-Ealnru] [-i extension] [file ...]
	sed [-Ealnu] [-i extension] [-e script] ... [-f script_file] ... [file ...]


OpenBSD 7.0:

$ sed --help
sed: unknown option -- -
usage: sed [-aEnru] [-i[extension]] command [file ...]
        sed [-aEnru] [-e command] [-f command_file] [-i[extension]] [file ...]


Illumos:

Has -E according to https://illumos.org/man/1/sed


Busybox:

Has -E according to https://www.commandlinux.com/man-page/man1/busybox.1.html


Haiku:

Seems to use GNU sed, so -E is available.


We likely never will run the iotests on Windows, so I did not check
those (but I assume MSYS and friends are using GNU sed, too).


So I think it should be safe to change these spots that are
using "-r" to "sed -E" (and not use gsed here).

> Other than the fact that this was easier to write with ERE, I'm not
> seeing any other GNU-only extensions in use here; but I'd recommend
> that as long as we're touching the line, we spell it 'gsed -Ee'
> instead of -re (here, and in several other places).
> 
>>   _filter_qom_path()
>>   {
>> -    $SED -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g'
>> +    gsed -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g'
> 
> Here, it is our use of \+ that is a GNU sed extension, although it is
> fairly easy (but verbose) to translate that one to portable sed
> (<PAT>\+ is the same as <PAT><PAT>*).  So gsed is correct.  On the
> other hand, the use of [[0-9]\+\] looks ugly - it probably does NOT
> match what we meant (we have a bracket expression '[...]' that matches
> the 11 characters [ and 0-9, then '\+' to match that bracket
> expression 1 or more times, then '\]' which in its context is
> identical to ']' to match a closing ], since only opening [ needs \
> escaping for literal treatment).  What we probably meant is:
> 
> '/Attached to:/s/\device\[[0-9][0-9]*]/device[N]/g'
> 
> at which point normal sed would do.

Ok ... but I'd prefer to clean such spots rather in a second step,
to make sure not to introduce bugs and to make the review easier.

>>   # Removes \r from messages
>>   _filter_win32()
>>   {
>> -    $SED -e 's/\r//g'
>> +    gsed -e 's/\r//g'
> 
> Yep, \r is another GNU sed extension.
> 
>>   }
>>   
>>   # sanitize qemu-io output
>>   _filter_qemu_io()
>>   {
>> -    _filter_win32 | $SED -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \
>> +    _filter_win32 | gsed -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \
>>           -e "s/: line [0-9][0-9]*:  *[0-9][0-9]*\( Aborted\| Killed\)/:\1/" \
>>           -e "s/qemu-io> //g"
> 
> I'm not seeing anything specific to GNU sed in this (long) sed script;
> can we relax this one to plain 'sed'?  Use of s#some/text## might be
> easier than having to s/some\/text//, but that's not essential.

If I switch that to plain sed, I'm getting errors like this on NetBSD:

--- /home/qemu/qemu-test.is2SLq/src/tests/qemu-iotests/046.out
+++ 11296-046.out.bad
@@ -66,6 +66,7 @@
  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  wrote 65536/65536 bytes at offset 2031616
  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=6442450944 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT

  == Some concurrent requests touching the same cluster ==

So I'll keep gsed here for now - we need it for _filter_qemu_io
anyway since it's calling _filter_win32 that currently needs
gsed, too.

>> @@ -142,7 +142,7 @@ _do_filter_img_create()
>>       # precedes ", fmt=") and the options part ($options, which starts
>>       # with "fmt=")
>>       # (And just echo everything before the first "^Formatting")
>> -    readarray formatting_line < <($SED -e 's/, fmt=/\n/')
>> +    readarray formatting_line < <(gsed -e 's/, fmt=/\n/')
> 
> This one looks like it should work with plain 'sed'.

Using normal sed does not really work for me here. For example
with NetBSD, I get errors like this:

--- /home/qemu/qemu-test.cSYvEb/src/tests/qemu-iotests/027.out
+++ 13694-027.out.bad
@@ -1,5 +1,5 @@
  QA output created by 027
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
+Formatting 'TEST_DIR/t.IMGFMT'nIMGFMT cluster_size=65536 extended_l2=off compression_type=zlib size=134217728 lazy_refcounts=off refcount_bits=16, fmt=

  == writing first cluster to populate metadata ==
  wrote 65536/65536 bytes at offset 65536

>> @@ -209,7 +209,7 @@ _filter_img_create()
>>   
>>   _filter_img_create_size()
>>   {
>> -    $SED -e "s# size=[0-9]\\+# size=SIZE#g"
>> +    sed -e "s# size=[0-9]\\+# size=SIZE#g"
> 
> The use of "\\+" here either needs gsed, or respelling as [0-9][0-9]*.

I'll change it to gsed for now ... we can update the \+ spots in a
second patch later if we think that it helps to make the iotests run
on more systems.

>>   }
>>   
>>   _filter_img_info()
>> @@ -223,7 +223,7 @@ _filter_img_info()
>>   
>>       discard=0
>>       regex_json_spec_start='^ *"format-specific": \{'
>> -    $SED -e "s#$REMOTE_TEST_DIR#TEST_DIR#g" \
>> +    gsed -e "s#$REMOTE_TEST_DIR#TEST_DIR#g" \
>>           -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \
>>           -e "s#$TEST_DIR#TEST_DIR#g" \
>>           -e "s#$SOCK_DIR#SOCK_DIR#g" \
> 
> I didn't check context for whether this one needs to be gsed, or could
> be plain sed.

Complete statement looks like this:

  gsed -e "s#$REMOTE_TEST_DIR#TEST_DIR#g" \
    -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \
    -e "s#$TEST_DIR#TEST_DIR#g" \
    -e "s#$SOCK_DIR#SOCK_DIR#g" \
    -e "s#$IMGFMT#IMGFMT#g" \
    -e 's#nbd://127.0.0.1:[0-9]\\+$#TEST_DIR/t.IMGFMT#g' \
    -e 's#nbd+unix:///\??socket=SOCK_DIR/nbd#TEST_DIR/t.IMGFMT#g'\
    -e 's#SOCK_DIR/fuse-#TEST_DIR/#g' \
    -e "/encrypted: yes/d" \
    -e "/cluster_size: [0-9]\\+/d" \
    -e "/table_size: [0-9]\\+/d" \
    -e "/compat: '[^']*'/d" \
    -e "/compat6: \\(on\\|off\\)/d" \
    -e "s/cid: [0-9]\+/cid: XXXXXXXXXX/" \
    -e "/static: \\(on\\|off\\)/d" \
    -e "/zeroed_grain: \\(on\\|off\\)/d" \
    -e "/subformat: '[^']*'/d" \
    -e "/adapter_type: '[^']*'/d" \
    -e "/hwversion: '[^']*'/d" \
    -e "/lazy_refcounts: \\(on\\|off\\)/d" \
    -e "/extended_l2=\\(on\\|off\\)/d" \
    -e "/block_size: [0-9]\\+/d" \
    -e "/block_state_zero: \\(on\\|off\\)/d" \
    -e "/log_size: [0-9]\\+/d" \
    -e "s/iters: [0-9]\\+/iters: 1024/" \
    -e 's/\(compression type: \)\(zlib\|zstd\)/\1COMPRESSION_TYPE/' \
    -e "s/uuid: [-a-f0-9]\\+/uuid: 00000000-0000-0000-0000-000000000000/" | \

There are some \\+ in here, so I think this needs gsed for now.

>> +++ b/tests/qemu-iotests/common.rc
>> @@ -17,17 +17,28 @@
>>   # along with this program.  If not, see <http://www.gnu.org/licenses/>.
>>   #
>>   
>> -SED=
>> -for sed in sed gsed; do
>> -    ($sed --version | grep 'GNU sed') > /dev/null 2>&1
>> -    if [ "$?" -eq 0 ]; then
>> -        SED=$sed
>> -        break
>> -    fi
>> -done
>> -if [ -z "$SED" ]; then
>> -    echo "$0: GNU sed not found"
>> -    exit 1
>> +# bail out, setting up .notrun file
>> +_notrun()
>> +{
>> +    echo "$*" >"$OUTPUT_DIR/$seq.notrun"
>> +    echo "$seq not run: $*"
>> +    status=0
>> +    exit
>> +}
>> +
>> +if ! command -v gsed >/dev/null 2>&1; then
>> +    if sed --version 2>&1 | grep -v 'not GNU sed' | grep 'GNU sed' > /dev/null;
>> +    then
>> +        gsed()
>> +        {
>> +            sed "$@"
>> +        }
>> +    else
>> +        gsed()
>> +        {
>> +            _notrun "GNU sed not available"
>> +        }
>> +    fi
>>   fi
> 
> This one looks good.
> 
> I found one or two issues that need to be fixed, and a couple of
> "might as well improve them while touching the line anyway", but
> overall I like where this is headed.

Thanks a lot of your review and suggestions, I'll respin a v2 with the updates...

  Thomas


Re: [PATCH] tests/qemu-iotests: Rework the checks and spots using GNU sed
Posted by Eric Blake 2 years, 2 months ago
On Wed, Feb 16, 2022 at 12:39:06PM +0100, Thomas Huth wrote:
> > > -    $SED -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/yyyy-mm-dd hh:mm:ss/'
> > > +    gsed -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/yyyy-mm-dd hh:mm:ss/'
> > 
> > GNU sed recommends spelling it 'sed -E', not 'sed -r', when using
> > extended regex.  Older POSIX did not support either spelling, but the
> > next version will require -E, as many implementations have it now:
> > https://www.austingroupbugs.net/view.php?id=528
> 
> Thanks for the pointer ... I originally checked "man 1p sed" on
> my system and did not see -r or -E in there, so I thought that
> this must be really something specific to GNU sed. But now that
> you've mentioned this, I just double-checked the build environments
> that we support with QEMU, and seems like -E should be supported
> everywhere:

Yay.

> 
> So I think it should be safe to change these spots that are
> using "-r" to "sed -E" (and not use gsed here).
> 
> > Other than the fact that this was easier to write with ERE, I'm not
> > seeing any other GNU-only extensions in use here; but I'd recommend
> > that as long as we're touching the line, we spell it 'gsed -Ee'
> > instead of -re (here, and in several other places).
> > 
> > >   _filter_qom_path()
> > >   {
> > > -    $SED -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g'
> > > +    gsed -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g'
> > 
> > Here, it is our use of \+ that is a GNU sed extension, although it is
> > fairly easy (but verbose) to translate that one to portable sed
> > (<PAT>\+ is the same as <PAT><PAT>*).  So gsed is correct.

Then again, since we claim 'sed -E' is portable, we can get the +
operator everywhere by using ERE instead of BRE (and with fewer
leaning toothpicks, another reason I like ERE better than BRE).

On the
> > other hand, the use of [[0-9]\+\] looks ugly - it probably does NOT
> > match what we meant (we have a bracket expression '[...]' that matches
> > the 11 characters [ and 0-9, then '\+' to match that bracket
> > expression 1 or more times, then '\]' which in its context is
> > identical to ']' to match a closing ], since only opening [ needs \
> > escaping for literal treatment).  What we probably meant is:
> > 
> > '/Attached to:/s/\device\[[0-9][0-9]*]/device[N]/g'
> > 
> > at which point normal sed would do.
> 
> Ok ... but I'd prefer to clean such spots rather in a second step,
> to make sure not to introduce bugs and to make the review easier.

Yeah, fixing bugs and cleaning up consistent use of sed/gsed/$SED are
worth separating.

> > >   _filter_qemu_io()
> > >   {
> > > -    _filter_win32 | $SED -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \
> > > +    _filter_win32 | gsed -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \
> > >           -e "s/: line [0-9][0-9]*:  *[0-9][0-9]*\( Aborted\| Killed\)/:\1/" \
> > >           -e "s/qemu-io> //g"
> > 
> > I'm not seeing anything specific to GNU sed in this (long) sed script;
> > can we relax this one to plain 'sed'?  Use of s#some/text## might be
> > easier than having to s/some\/text//, but that's not essential.
> 
> If I switch that to plain sed, I'm getting errors like this on NetBSD:
> 
> --- /home/qemu/qemu-test.is2SLq/src/tests/qemu-iotests/046.out
> +++ 11296-046.out.bad
> @@ -66,6 +66,7 @@
>  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>  wrote 65536/65536 bytes at offset 2031616
>  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=6442450944 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT

Huh; not sure what happened that I didn't see.  But I trust your tests
as a more canonical version of "it worked on this platform's sed" than
my "I don't see anything blantantly non-POSIX" ;)

> 
>  == Some concurrent requests touching the same cluster ==
> 
> So I'll keep gsed here for now - we need it for _filter_qemu_io
> anyway since it's calling _filter_win32 that currently needs
> gsed, too.

Yeah, I think your patch is big enough to prove there are places where
it really is easier to rely on gsed than to try and be portable.

> 
> > > @@ -142,7 +142,7 @@ _do_filter_img_create()
> > >       # precedes ", fmt=") and the options part ($options, which starts
> > >       # with "fmt=")
> > >       # (And just echo everything before the first "^Formatting")
> > > -    readarray formatting_line < <($SED -e 's/, fmt=/\n/')
> > > +    readarray formatting_line < <(gsed -e 's/, fmt=/\n/')
> > 
> > This one looks like it should work with plain 'sed'.
> 
> Using normal sed does not really work for me here. For example
> with NetBSD, I get errors like this:
> 
> --- /home/qemu/qemu-test.cSYvEb/src/tests/qemu-iotests/027.out
> +++ 13694-027.out.bad
> @@ -1,5 +1,5 @@
>  QA output created by 027
> -Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
> +Formatting 'TEST_DIR/t.IMGFMT'nIMGFMT cluster_size=65536 extended_l2=off compression_type=zlib size=134217728 lazy_refcounts=off refcount_bits=16, fmt=

Hmm.  I had to go and re-read POSIX.  Okay, POSIX says that
's/...\n.../.../' is required to match a newline in the pattern space,
but for the substitution, \n is not required to work, and instead, you
would write:
s/.../\
/
to portably substitute a literal newline into the output.  But that is
unwieldy in a script, so using gsed is indeed the best approach.

> > I found one or two issues that need to be fixed, and a couple of
> > "might as well improve them while touching the line anyway", but
> > overall I like where this is headed.
> 
> Thanks a lot of your review and suggestions, I'll respin a v2 with the updates...

Looking forward to it.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org