[PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports

Eric Blake posted 1 patch 2 years, 2 months ago
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20220215171838.2651387-1-eblake@redhat.com
Maintainers: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>, Markus Armbruster <armbru@redhat.com>, Hanna Reitz <hreitz@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Eric Blake <eblake@redhat.com>
docs/interop/nbd.txt                       |   1 +
docs/tools/qemu-nbd.rst                    |   3 +-
qapi/block-export.json                     |  34 +++-
include/block/nbd.h                        |   3 +-
blockdev-nbd.c                             |   5 +
nbd/server.c                               |  27 ++-
MAINTAINERS                                |   1 +
tests/qemu-iotests/tests/nbd-multiconn     | 188 +++++++++++++++++++++
tests/qemu-iotests/tests/nbd-multiconn.out | 112 ++++++++++++
9 files changed, 363 insertions(+), 11 deletions(-)
create mode 100755 tests/qemu-iotests/tests/nbd-multiconn
create mode 100644 tests/qemu-iotests/tests/nbd-multiconn.out
[PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Eric Blake 2 years, 2 months ago
According to the NBD spec, a server advertising
NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
not see any cache inconsistencies: when properly separated by a single
flush, actions performed by one client will be visible to another
client, regardless of which client did the flush.  We satisfy these
conditions in qemu when our block layer is backed by the local
filesystem (by virtue of the semantics of fdatasync(), and the fact
that qemu itself is not buffering writes beyond flushes).  It is
harder to state whether we satisfy these conditions for network-based
protocols, so the safest course of action is to allow users to opt-in
to advertising multi-conn.  We may later tweak defaults to advertise
by default when the block layer can confirm that the underlying
protocol driver is cache consistent between multiple writers, but for
now, this at least allows savvy users (such as virt-v2v or nbdcopy) to
explicitly start qemu-nbd or qemu-storage-daemon with multi-conn
advertisement in a known-safe setup where the client end can then
benefit from parallel clients.

Note, however, that we don't want to advertise MULTI_CONN when we know
that a second client cannot connect (for historical reasons, qemu-nbd
defaults to a single connection while nbd-server-add and QMP commands
default to unlimited connections; but we already have existing means
to let either style of NBD server creation alter those defaults).  The
harder part of this patch is setting up an iotest to demonstrate
behavior of multiple NBD clients to a single server.  It might be
possible with parallel qemu-io processes, but concisely managing that
in shell is painful.  I found it easier to do by relying on the libnbd
project's nbdsh, which means this test will be skipped on platforms
where that is not available.

Signed-off-by: Eric Blake <eblake@redhat.com>
Fixes: https://bugzilla.redhat.com/1708300
---

v1 was in Aug 2021 [1], with further replies in Sep [2] and Oct [3].

[1] https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg04900.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00038.html
[3] https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06744.html

Since then, I've tweaked the QAPI to mention 7.0 (instead of 6.2), and
reworked the logic so that default behavior is unchanged for now
(advertising multi-conn on a writable export requires opt-in during
the command line or QMP, but remains default for a readonly export).
I've also expanded the amount of testing done in the new iotest.

 docs/interop/nbd.txt                       |   1 +
 docs/tools/qemu-nbd.rst                    |   3 +-
 qapi/block-export.json                     |  34 +++-
 include/block/nbd.h                        |   3 +-
 blockdev-nbd.c                             |   5 +
 nbd/server.c                               |  27 ++-
 MAINTAINERS                                |   1 +
 tests/qemu-iotests/tests/nbd-multiconn     | 188 +++++++++++++++++++++
 tests/qemu-iotests/tests/nbd-multiconn.out | 112 ++++++++++++
 9 files changed, 363 insertions(+), 11 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/nbd-multiconn
 create mode 100644 tests/qemu-iotests/tests/nbd-multiconn.out

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index bdb0f2a41aca..6c99070b99c8 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -68,3 +68,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
 * 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports,
 NBD_CMD_FLAG_FAST_ZERO
 * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
+* 7.0: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports
diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index 6031f9689312..1de785524c36 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified.
 .. option:: -e, --shared=NUM

   Allow up to *NUM* clients to share the device (default
-  ``1``), 0 for unlimited. Safe for readers, but for now,
-  consistency is not guaranteed between multiple writers.
+  ``1``), 0 for unlimited.

 .. option:: -t, --persistent

diff --git a/qapi/block-export.json b/qapi/block-export.json
index f183522d0d2c..0a27e8ee84f9 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -21,7 +21,9 @@
 #             recreated on the fly while the NBD server is active.
 #             If missing, it will default to denying access (since 4.0).
 # @max-connections: The maximum number of connections to allow at the same
-#                   time, 0 for unlimited. (since 5.2; default: 0)
+#                   time, 0 for unlimited. Setting this to 1 also stops
+#                   the server from advertising multiple client support
+#                   (since 5.2; default: 0)
 #
 # Since: 4.2
 ##
@@ -50,7 +52,9 @@
 #             recreated on the fly while the NBD server is active.
 #             If missing, it will default to denying access (since 4.0).
 # @max-connections: The maximum number of connections to allow at the same
-#                   time, 0 for unlimited. (since 5.2; default: 0)
+#                   time, 0 for unlimited. Setting this to 1 also stops
+#                   the server from advertising multiple client support
+#                   (since 5.2; default: 0).
 #
 # Returns: error if the server is already running.
 #
@@ -79,6 +83,26 @@
 { 'struct': 'BlockExportOptionsNbdBase',
   'data': { '*name': 'str', '*description': 'str' } }

+##
+# @NbdExportMultiConn:
+#
+# Possible settings for advertising NBD multiple client support.
+#
+# @off: Do not advertise multiple clients.
+#
+# @on: Allow multiple clients (for writable clients, this is only safe
+#      if the underlying BDS is cache-consistent, such as when backed
+#      by the raw file driver); ignored if the NBD server was set up
+#      with max-connections of 1.
+#
+# @auto: Behaves like @off if the export is writable, and @on if the
+#        export is read-only.
+#
+# Since: 7.0
+##
+{ 'enum': 'NbdExportMultiConn',
+  'data': ['off', 'on', 'auto'] }
+
 ##
 # @BlockExportOptionsNbd:
 #
@@ -95,11 +119,15 @@
 #                    the metadata context name "qemu:allocation-depth" to
 #                    inspect allocation details. (since 5.2)
 #
+# @multi-conn: Controls whether multiple client support is advertised, if the
+#              server's max-connections is not 1. (default auto, since 7.0)
+#
 # Since: 5.2
 ##
 { 'struct': 'BlockExportOptionsNbd',
   'base': 'BlockExportOptionsNbdBase',
-  'data': { '*bitmaps': ['str'], '*allocation-depth': 'bool' } }
+  'data': { '*bitmaps': ['str'], '*allocation-depth': 'bool',
+            '*multi-conn': 'NbdExportMultiConn' } }

 ##
 # @BlockExportOptionsVhostUserBlk:
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 78d101b77488..138b9857dbcb 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2020 Red Hat, Inc.
+ *  Copyright (C) 2016-2022 Red Hat, Inc.
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device
@@ -346,6 +346,7 @@ void nbd_client_put(NBDClient *client);

 void nbd_server_is_qemu_nbd(bool value);
 bool nbd_server_is_running(void);
+int nbd_server_max_connections(void);
 void nbd_server_start(SocketAddress *addr, const char *tls_creds,
                       const char *tls_authz, uint32_t max_connections,
                       Error **errp);
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index bdfa7ed3a5a9..0df222e868d8 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -44,6 +44,11 @@ bool nbd_server_is_running(void)
     return nbd_server || is_qemu_nbd;
 }

+int nbd_server_max_connections(void)
+{
+    return nbd_server ? nbd_server->max_connections : -1;
+}
+
 static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
 {
     nbd_client_put(client);
diff --git a/nbd/server.c b/nbd/server.c
index 9fb2f264023e..f17aacc8139f 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2021 Red Hat, Inc.
+ *  Copyright (C) 2016-2022 Red Hat, Inc.
  *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
  *
  *  Network Block Device Server Side
@@ -1641,7 +1641,7 @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
     int64_t size;
     uint64_t perm, shared_perm;
     bool readonly = !exp_args->writable;
-    bool shared = !exp_args->writable;
+    bool multi_conn;
     strList *bitmaps;
     size_t i;
     int ret;
@@ -1679,6 +1679,23 @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
         return size;
     }

+    /*
+     * Determine whether to advertise multi-conn.  Default is auto,
+     * which resolves to on for read-only and off for writable.  But
+     * if the server has max-connections 1, that forces the flag off.
+     */
+    if (!arg->has_multi_conn) {
+        arg->multi_conn = NBD_EXPORT_MULTI_CONN_AUTO;
+    }
+    if (nbd_server_max_connections() == 1) {
+        arg->multi_conn = NBD_EXPORT_MULTI_CONN_OFF;
+    }
+    if (arg->multi_conn == NBD_EXPORT_MULTI_CONN_AUTO) {
+        multi_conn = readonly;
+    } else {
+        multi_conn = arg->multi_conn == NBD_EXPORT_MULTI_CONN_ON;
+    }
+
     /* Don't allow resize while the NBD server is running, otherwise we don't
      * care what happens with the node. */
     blk_get_perm(blk, &perm, &shared_perm);
@@ -1692,11 +1709,11 @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
     exp->description = g_strdup(arg->description);
     exp->nbdflags = (NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_FLUSH |
                      NBD_FLAG_SEND_FUA | NBD_FLAG_SEND_CACHE);
+    if (multi_conn) {
+        exp->nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
+    }
     if (readonly) {
         exp->nbdflags |= NBD_FLAG_READ_ONLY;
-        if (shared) {
-            exp->nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
-        }
     } else {
         exp->nbdflags |= (NBD_FLAG_SEND_TRIM | NBD_FLAG_SEND_WRITE_ZEROES |
                           NBD_FLAG_SEND_FAST_ZERO);
diff --git a/MAINTAINERS b/MAINTAINERS
index 9814580975c5..24dd800ae3f8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3334,6 +3334,7 @@ F: qemu-nbd.*
 F: blockdev-nbd.c
 F: docs/interop/nbd.txt
 F: docs/tools/qemu-nbd.rst
+F: tests/qemu-iotests/tests/*nbd*
 T: git https://repo.or.cz/qemu/ericb.git nbd
 T: git https://src.openvz.org/scm/~vsementsov/qemu.git nbd

diff --git a/tests/qemu-iotests/tests/nbd-multiconn b/tests/qemu-iotests/tests/nbd-multiconn
new file mode 100755
index 000000000000..0261552f32f2
--- /dev/null
+++ b/tests/qemu-iotests/tests/nbd-multiconn
@@ -0,0 +1,188 @@
+#!/usr/bin/env bash
+# group: rw auto quick
+#
+# Test that qemu-nbd MULTI_CONN works
+#
+# Copyright (C) 2018-2022 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+    _cleanup_qemu
+    rm -f "$SOCK_DIR/nbd"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+cd ..
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+. ./common.nbd
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+_require_command QEMU_NBD
+
+# Parallel client connections are easier with libnbd than qemu-io:
+if ! (nbdsh --version) >/dev/null 2>&1; then
+    _notrun "nbdsh utility required, skipped this test"
+fi
+
+echo
+echo "=== Initial image setup ==="
+echo
+
+_make_test_img 4M
+$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | _filter_qemu_io
+_launch_qemu 2> >(_filter_nbd)
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
+  "arguments":{"driver":"qcow2", "node-name":"n",
+    "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"
+export nbd_unix_socket
+
+echo
+echo "=== Default nbd server settings ==="
+echo
+# Default allows for unlimited connections, readonly images advertise
+# multi-conn, and writable images do not
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
+  "arguments":{"addr":{"type":"unix",
+    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
+    "name":"r"}}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
+    "name":"w", "writable":true}}' "return"
+nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
+assert h.can_multi_conn()
+h.shutdown()
+print("nbdsh passed")'
+nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
+assert not h.can_multi_conn()
+h.shutdown()
+print("nbdsh passed")'
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
+  "arguments":{"id":"r"}}' "DELETED"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
+  "arguments":{"id":"w"}}' "DELETED"
+
+echo
+echo "=== Per-export overrides of defaults ==="
+echo
+# Can explicitly disable multi-conn for readonly image, and explicitly
+# enable it for writable image
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
+    "name":"r", "multi-conn":"off"}}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
+    "name":"w", "multi-conn":"on", "writable":true}}' "return"
+nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
+assert not h.can_multi_conn()
+h.shutdown()
+print("nbdsh passed")'
+nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
+assert h.can_multi_conn()
+h.shutdown()
+print("nbdsh passed")'
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
+  "arguments":{"id":"r"}}' "DELETED"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
+  "arguments":{"id":"w"}}' "DELETED"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return"
+
+echo
+echo "=== Limit nbd server max-connections ==="
+echo
+# When max-connections is 1, no images advertise multi-conn, even when
+# explicitly requested per export
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
+  "arguments":{"max-connections":1, "addr":{"type":"unix",
+    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
+    "name":"r", "multi-conn":"on"}}' "return"
+nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
+assert not h.can_multi_conn()
+h.shutdown()
+print("nbdsh passed")'
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
+  "arguments":{"id":"r"}}' "DELETED"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
+    "name":"w", "multi-conn":"on", "writable":true}}' "return"
+nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
+assert not h.can_multi_conn()
+h.shutdown()
+print("nbdsh passed")'
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
+  "arguments":{"id":"w"}}' "DELETED"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return"
+
+echo
+echo "=== Demonstrate parallel writers ==="
+echo
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
+  "arguments":{"addr":{"type":"unix",
+    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
+    "name":"", "multi-conn":"on", "writable":true}}' "return"
+
+nbdsh -c '
+import os
+sock = os.getenv("nbd_unix_socket")
+h = []
+
+for i in range(3):
+  h.append(nbd.NBD())
+  h[i].connect_unix(sock)
+  assert h[i].can_multi_conn()
+
+buf1 = h[0].pread(1024 * 1024, 0)
+if buf1 != b"\x01" * 1024 * 1024:
+  print("Unexpected initial read")
+buf2 = b"\x03" * 1024 * 1024
+h[1].pwrite(buf2, 0)
+h[2].flush()
+buf1 = h[0].pread(1024 * 1024, 0)
+if buf1 == buf2:
+  print("Flush appears to be consistent across connections")
+
+for i in range(3):
+  h[i].shutdown()
+'
+
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
+  "arguments":{"id":"w"}}' "DELETED"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return"
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"quit"}' "return"
+wait=yes _cleanup_qemu
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/tests/nbd-multiconn.out b/tests/qemu-iotests/tests/nbd-multiconn.out
new file mode 100644
index 000000000000..a06428e1536a
--- /dev/null
+++ b/tests/qemu-iotests/tests/nbd-multiconn.out
@@ -0,0 +1,112 @@
+QA output created by nbd-multiconn
+
+=== Initial image setup ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4194304
+wrote 2097152/2097152 bytes at offset 0
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 2097152/2097152 bytes at offset 2097152
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"execute":"qmp_capabilities"}
+{"return": {}}
+{"execute":"blockdev-add",
+  "arguments":{"driver":"IMGFMT", "node-name":"n",
+    "file":{"driver":"file", "filename":"TEST_DIR/t.IMGFMT"}}}
+{"return": {}}
+
+=== Default nbd server settings ===
+
+{"execute":"nbd-server-start",
+  "arguments":{"addr":{"type":"unix",
+    "data":{"path":"SOCK_DIR/qemu-nbd.sock"}}}}
+{"return": {}}
+{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
+    "name":"r"}}
+{"return": {}}
+{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
+    "name":"w", "writable":true}}
+{"return": {}}
+nbdsh passed
+nbdsh passed
+{"execute":"block-export-del",
+  "arguments":{"id":"r"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "r"}}
+{"execute":"block-export-del",
+  "arguments":{"id":"w"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
+
+=== Per-export overrides of defaults ===
+
+{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
+    "name":"r", "multi-conn":"off"}}
+{"return": {}}
+{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
+    "name":"w", "multi-conn":"on", "writable":true}}
+{"return": {}}
+nbdsh passed
+nbdsh passed
+{"execute":"block-export-del",
+  "arguments":{"id":"r"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "r"}}
+{"execute":"block-export-del",
+  "arguments":{"id":"w"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
+{"execute":"nbd-server-stop"}
+{"return": {}}
+
+=== Limit nbd server max-connections ===
+
+{"execute":"nbd-server-start",
+  "arguments":{"max-connections":1, "addr":{"type":"unix",
+    "data":{"path":"SOCK_DIR/qemu-nbd.sock"}}}}
+{"return": {}}
+{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
+    "name":"r", "multi-conn":"on"}}
+{"return": {}}
+nbdsh passed
+{"execute":"block-export-del",
+  "arguments":{"id":"r"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "r"}}
+{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
+    "name":"w", "multi-conn":"on", "writable":true}}
+{"return": {}}
+nbdsh passed
+{"execute":"block-export-del",
+  "arguments":{"id":"w"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
+{"execute":"nbd-server-stop"}
+{"return": {}}
+
+=== Demonstrate parallel writers ===
+
+{"execute":"nbd-server-start",
+  "arguments":{"addr":{"type":"unix",
+    "data":{"path":"SOCK_DIR/qemu-nbd.sock"}}}}
+{"return": {}}
+{"execute":"block-export-add",
+  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
+    "name":"", "multi-conn":"on", "writable":true}}
+{"return": {}}
+Flush appears to be consistent across connections
+{"execute":"block-export-del",
+  "arguments":{"id":"w"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
+{"execute":"nbd-server-stop"}
+{"return": {}}
+{"execute":"quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+*** done
-- 
2.35.1


Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Vladimir Sementsov-Ogievskiy 2 years, 2 months ago
15.02.2022 20:18, Eric Blake wrote:
> According to the NBD spec, a server advertising
> NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
> not see any cache inconsistencies: when properly separated by a single
> flush, actions performed by one client will be visible to another
> client, regardless of which client did the flush.  We satisfy these
> conditions in qemu when our block layer is backed by the local
> filesystem (by virtue of the semantics of fdatasync(), and the fact
> that qemu itself is not buffering writes beyond flushes).  It is
> harder to state whether we satisfy these conditions for network-based
> protocols, so the safest course of action is to allow users to opt-in
> to advertising multi-conn.  We may later tweak defaults to advertise
> by default when the block layer can confirm that the underlying
> protocol driver is cache consistent between multiple writers, but for
> now, this at least allows savvy users (such as virt-v2v or nbdcopy) to
> explicitly start qemu-nbd or qemu-storage-daemon with multi-conn
> advertisement in a known-safe setup where the client end can then
> benefit from parallel clients.
> 
> Note, however, that we don't want to advertise MULTI_CONN when we know

Here we change existing default behavior. Pre-patch MULTI_CONN is still advertised for readonly export even when connections number is limited to 1.

Still may be it's a good change? Let's then note it here.. Could it break some existing clients? Hard to imagine.

Otherwise patch looks good to me, nonsignificant notes below.

> that a second client cannot connect (for historical reasons, qemu-nbd
> defaults to a single connection while nbd-server-add and QMP commands
> default to unlimited connections; but we already have existing means
> to let either style of NBD server creation alter those defaults).  The
> harder part of this patch is setting up an iotest to demonstrate
> behavior of multiple NBD clients to a single server.  It might be
> possible with parallel qemu-io processes, but concisely managing that
> in shell is painful.  I found it easier to do by relying on the libnbd
> project's nbdsh, which means this test will be skipped on platforms
> where that is not available.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Fixes: https://bugzilla.redhat.com/1708300
> ---
> 

[..]

> +##
> +# @NbdExportMultiConn:
> +#
> +# Possible settings for advertising NBD multiple client support.
> +#
> +# @off: Do not advertise multiple clients.
> +#
> +# @on: Allow multiple clients (for writable clients, this is only safe
> +#      if the underlying BDS is cache-consistent, such as when backed
> +#      by the raw file driver); ignored if the NBD server was set up
> +#      with max-connections of 1.
> +#
> +# @auto: Behaves like @off if the export is writable, and @on if the
> +#        export is read-only.
> +#
> +# Since: 7.0
> +##
> +{ 'enum': 'NbdExportMultiConn',
> +  'data': ['off', 'on', 'auto'] }

Probably we should use generic OnOffAuto type that we already have.. But in your way documentation looks better.

> +
>   ##
>   # @BlockExportOptionsNbd:
>   #

[..]

> +
> +_make_test_img 4M
> +$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | _filter_qemu_io
> +_launch_qemu 2> >(_filter_nbd)
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
> +  "arguments":{"driver":"qcow2", "node-name":"n",
> +    "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"
> +export nbd_unix_socket
> +

[..]

> +nbdsh -c '
> +import os
> +sock = os.getenv("nbd_unix_socket")

Just interested why you pass it through environment and no simply do '... sock = "'"$nbd_unix_socket"'"... '

> +h = []
> +
> +for i in range(3):
> +  h.append(nbd.NBD())

Looks like Python:) And if something looks like Python, it's Python, we know. Should I say about PEP8 that recommends 4 whitespaces?)

> +  h[i].connect_unix(sock)
> +  assert h[i].can_multi_conn()
> +
> +buf1 = h[0].pread(1024 * 1024, 0)
> +if buf1 != b"\x01" * 1024 * 1024:
> +  print("Unexpected initial read")
> +buf2 = b"\x03" * 1024 * 1024
> +h[1].pwrite(buf2, 0)
> +h[2].flush()
> +buf1 = h[0].pread(1024 * 1024, 0)
> +if buf1 == buf2:
> +  print("Flush appears to be consistent across connections")
> +
> +for i in range(3):
> +  h[i].shutdown()
> +'

-- 
Best regards,
Vladimir

Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Markus Armbruster 2 years, 2 months ago
Eric Blake <eblake@redhat.com> writes:

> According to the NBD spec, a server advertising
> NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
> not see any cache inconsistencies: when properly separated by a single
> flush, actions performed by one client will be visible to another
> client, regardless of which client did the flush.  We satisfy these
> conditions in qemu when our block layer is backed by the local
> filesystem (by virtue of the semantics of fdatasync(), and the fact
> that qemu itself is not buffering writes beyond flushes).  It is
> harder to state whether we satisfy these conditions for network-based
> protocols, so the safest course of action is to allow users to opt-in
> to advertising multi-conn.  We may later tweak defaults to advertise
> by default when the block layer can confirm that the underlying
> protocol driver is cache consistent between multiple writers, but for
> now, this at least allows savvy users (such as virt-v2v or nbdcopy) to
> explicitly start qemu-nbd or qemu-storage-daemon with multi-conn
> advertisement in a known-safe setup where the client end can then
> benefit from parallel clients.
>
> Note, however, that we don't want to advertise MULTI_CONN when we know
> that a second client cannot connect (for historical reasons, qemu-nbd
> defaults to a single connection while nbd-server-add and QMP commands
> default to unlimited connections; but we already have existing means
> to let either style of NBD server creation alter those defaults).  The
> harder part of this patch is setting up an iotest to demonstrate
> behavior of multiple NBD clients to a single server.  It might be
> possible with parallel qemu-io processes, but concisely managing that
> in shell is painful.  I found it easier to do by relying on the libnbd
> project's nbdsh, which means this test will be skipped on platforms
> where that is not available.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Fixes: https://bugzilla.redhat.com/1708300
> ---
>
> v1 was in Aug 2021 [1], with further replies in Sep [2] and Oct [3].
>
> [1] https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg04900.html
> [2] https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00038.html
> [3] https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06744.html
>
> Since then, I've tweaked the QAPI to mention 7.0 (instead of 6.2), and
> reworked the logic so that default behavior is unchanged for now
> (advertising multi-conn on a writable export requires opt-in during
> the command line or QMP, but remains default for a readonly export).
> I've also expanded the amount of testing done in the new iotest.
>
>  docs/interop/nbd.txt                       |   1 +
>  docs/tools/qemu-nbd.rst                    |   3 +-
>  qapi/block-export.json                     |  34 +++-
>  include/block/nbd.h                        |   3 +-
>  blockdev-nbd.c                             |   5 +
>  nbd/server.c                               |  27 ++-
>  MAINTAINERS                                |   1 +
>  tests/qemu-iotests/tests/nbd-multiconn     | 188 +++++++++++++++++++++
>  tests/qemu-iotests/tests/nbd-multiconn.out | 112 ++++++++++++
>  9 files changed, 363 insertions(+), 11 deletions(-)
>  create mode 100755 tests/qemu-iotests/tests/nbd-multiconn
>  create mode 100644 tests/qemu-iotests/tests/nbd-multiconn.out
>
> diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
> index bdb0f2a41aca..6c99070b99c8 100644
> --- a/docs/interop/nbd.txt
> +++ b/docs/interop/nbd.txt
> @@ -68,3 +68,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
>  * 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports,
>  NBD_CMD_FLAG_FAST_ZERO
>  * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
> +* 7.0: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports
> diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
> index 6031f9689312..1de785524c36 100644
> --- a/docs/tools/qemu-nbd.rst
> +++ b/docs/tools/qemu-nbd.rst
> @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified.
>  .. option:: -e, --shared=NUM
>
>    Allow up to *NUM* clients to share the device (default
> -  ``1``), 0 for unlimited. Safe for readers, but for now,
> -  consistency is not guaranteed between multiple writers.
> +  ``1``), 0 for unlimited.
>
>  .. option:: -t, --persistent
>
> diff --git a/qapi/block-export.json b/qapi/block-export.json
> index f183522d0d2c..0a27e8ee84f9 100644
> --- a/qapi/block-export.json
> +++ b/qapi/block-export.json
> @@ -21,7 +21,9 @@
>  #             recreated on the fly while the NBD server is active.
>  #             If missing, it will default to denying access (since 4.0).
>  # @max-connections: The maximum number of connections to allow at the same
> -#                   time, 0 for unlimited. (since 5.2; default: 0)
> +#                   time, 0 for unlimited. Setting this to 1 also stops
> +#                   the server from advertising multiple client support
> +#                   (since 5.2; default: 0)
>  #
>  # Since: 4.2
>  ##
> @@ -50,7 +52,9 @@
>  #             recreated on the fly while the NBD server is active.
>  #             If missing, it will default to denying access (since 4.0).
>  # @max-connections: The maximum number of connections to allow at the same
> -#                   time, 0 for unlimited. (since 5.2; default: 0)
> +#                   time, 0 for unlimited. Setting this to 1 also stops
> +#                   the server from advertising multiple client support
> +#                   (since 5.2; default: 0).
>  #
>  # Returns: error if the server is already running.
>  #
> @@ -79,6 +83,26 @@
>  { 'struct': 'BlockExportOptionsNbdBase',
>    'data': { '*name': 'str', '*description': 'str' } }
>
> +##
> +# @NbdExportMultiConn:
> +#
> +# Possible settings for advertising NBD multiple client support.
> +#
> +# @off: Do not advertise multiple clients.
> +#
> +# @on: Allow multiple clients (for writable clients, this is only safe
> +#      if the underlying BDS is cache-consistent, such as when backed
> +#      by the raw file driver); ignored if the NBD server was set up
> +#      with max-connections of 1.
> +#
> +# @auto: Behaves like @off if the export is writable, and @on if the
> +#        export is read-only.
> +#
> +# Since: 7.0
> +##
> +{ 'enum': 'NbdExportMultiConn',
> +  'data': ['off', 'on', 'auto'] }

Have you considered using OnOffAuto from common.json?

Duplicating it as NbdExportMultiConn lets you document the values right
in the enum.  If you reuse it, you have to document them where the enum
is used, i.e. ...

> +
>  ##
>  # @BlockExportOptionsNbd:
>  #
> @@ -95,11 +119,15 @@
>  #                    the metadata context name "qemu:allocation-depth" to
>  #                    inspect allocation details. (since 5.2)
>  #
> +# @multi-conn: Controls whether multiple client support is advertised, if the
> +#              server's max-connections is not 1. (default auto, since 7.0)
> +#

... here.

>  # Since: 5.2
>  ##
>  { 'struct': 'BlockExportOptionsNbd',
>    'base': 'BlockExportOptionsNbdBase',
> -  'data': { '*bitmaps': ['str'], '*allocation-depth': 'bool' } }
> +  'data': { '*bitmaps': ['str'], '*allocation-depth': 'bool',
> +            '*multi-conn': 'NbdExportMultiConn' } }
>
>  ##
>  # @BlockExportOptionsVhostUserBlk:

[...]


Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Eric Blake 2 years, 2 months ago
On Wed, Feb 16, 2022 at 02:44:52PM +0100, Markus Armbruster wrote:
> >
> > +##
> > +# @NbdExportMultiConn:
> > +#
> > +# Possible settings for advertising NBD multiple client support.
> > +#
> > +# @off: Do not advertise multiple clients.
> > +#
> > +# @on: Allow multiple clients (for writable clients, this is only safe
> > +#      if the underlying BDS is cache-consistent, such as when backed
> > +#      by the raw file driver); ignored if the NBD server was set up
> > +#      with max-connections of 1.
> > +#
> > +# @auto: Behaves like @off if the export is writable, and @on if the
> > +#        export is read-only.
> > +#
> > +# Since: 7.0
> > +##
> > +{ 'enum': 'NbdExportMultiConn',
> > +  'data': ['off', 'on', 'auto'] }
> 
> Have you considered using OnOffAuto from common.json?

Sounds good to me.

> 
> Duplicating it as NbdExportMultiConn lets you document the values right
> in the enum.  If you reuse it, you have to document them where the enum
> is used, i.e. ...
> 
> > +
> >  ##
> >  # @BlockExportOptionsNbd:
> >  #
> > @@ -95,11 +119,15 @@
> >  #                    the metadata context name "qemu:allocation-depth" to
> >  #                    inspect allocation details. (since 5.2)
> >  #
> > +# @multi-conn: Controls whether multiple client support is advertised, if the
> > +#              server's max-connections is not 1. (default auto, since 7.0)
> > +#
> 
> ... here.

Yep, that's a good change to make for v3.  I'll do it.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Nir Soffer 2 years, 2 months ago
On Tue, Feb 15, 2022 at 7:22 PM Eric Blake <eblake@redhat.com> wrote:

> According to the NBD spec, a server advertising
> NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
> not see any cache inconsistencies: when properly separated by a single
> flush, actions performed by one client will be visible to another
> client, regardless of which client did the flush.  We satisfy these
> conditions in qemu when our block layer is backed by the local
> filesystem (by virtue of the semantics of fdatasync(), and the fact
> that qemu itself is not buffering writes beyond flushes).  It is
> harder to state whether we satisfy these conditions for network-based
> protocols, so the safest course of action is to allow users to opt-in
> to advertising multi-conn.  We may later tweak defaults to advertise
> by default when the block layer can confirm that the underlying
> protocol driver is cache consistent between multiple writers, but for
> now, this at least allows savvy users (such as virt-v2v or nbdcopy) to
> explicitly start qemu-nbd or qemu-storage-daemon with multi-conn
> advertisement in a known-safe setup where the client end can then
> benefit from parallel clients.
>

It makes sense, and will be used by oVirt. Actually we are already using
multiple connections for writing about 2 years, based on your promise
that if every client writes to district  areas this is safe.

Note, however, that we don't want to advertise MULTI_CONN when we know
> that a second client cannot connect (for historical reasons, qemu-nbd
> defaults to a single connection while nbd-server-add and QMP commands
> default to unlimited connections; but we already have existing means
> to let either style of NBD server creation alter those defaults).  The
> harder part of this patch is setting up an iotest to demonstrate
> behavior of multiple NBD clients to a single server.  It might be
> possible with parallel qemu-io processes, but concisely managing that
> in shell is painful.  I found it easier to do by relying on the libnbd
> project's nbdsh, which means this test will be skipped on platforms
> where that is not available.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Fixes: https://bugzilla.redhat.com/1708300
> ---
>
> v1 was in Aug 2021 [1], with further replies in Sep [2] and Oct [3].
>
> [1] https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg04900.html
> [2] https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00038.html
> [3] https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06744.html
>
> Since then, I've tweaked the QAPI to mention 7.0 (instead of 6.2), and
> reworked the logic so that default behavior is unchanged for now
> (advertising multi-conn on a writable export requires opt-in during
> the command line or QMP, but remains default for a readonly export).
> I've also expanded the amount of testing done in the new iotest.
>
>  docs/interop/nbd.txt                       |   1 +
>  docs/tools/qemu-nbd.rst                    |   3 +-
>  qapi/block-export.json                     |  34 +++-
>  include/block/nbd.h                        |   3 +-
>  blockdev-nbd.c                             |   5 +
>  nbd/server.c                               |  27 ++-
>  MAINTAINERS                                |   1 +
>  tests/qemu-iotests/tests/nbd-multiconn     | 188 +++++++++++++++++++++
>  tests/qemu-iotests/tests/nbd-multiconn.out | 112 ++++++++++++
>  9 files changed, 363 insertions(+), 11 deletions(-)
>  create mode 100755 tests/qemu-iotests/tests/nbd-multiconn
>  create mode 100644 tests/qemu-iotests/tests/nbd-multiconn.out
>
> diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
> index bdb0f2a41aca..6c99070b99c8 100644
> --- a/docs/interop/nbd.txt
> +++ b/docs/interop/nbd.txt
> @@ -68,3 +68,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:",
> NBD_CMD_CACHE
>  * 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports,
>  NBD_CMD_FLAG_FAST_ZERO
>  * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
> +* 7.0: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports
> diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
> index 6031f9689312..1de785524c36 100644
> --- a/docs/tools/qemu-nbd.rst
> +++ b/docs/tools/qemu-nbd.rst
> @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified.
>  .. option:: -e, --shared=NUM
>
>    Allow up to *NUM* clients to share the device (default
> -  ``1``), 0 for unlimited. Safe for readers, but for now,
> -  consistency is not guaranteed between multiple writers.
> +  ``1``), 0 for unlimited.
>

Removing the note means that now consistency is guaranteed between
multiple writers, no?

Or maybe we want to mention here that consistency depends on the protocol
and users can opt in, or refer to the section where this is discussed?

 .. option:: -t, --persistent
>
> diff --git a/qapi/block-export.json b/qapi/block-export.json
> index f183522d0d2c..0a27e8ee84f9 100644
> --- a/qapi/block-export.json
> +++ b/qapi/block-export.json
> @@ -21,7 +21,9 @@
>  #             recreated on the fly while the NBD server is active.
>  #             If missing, it will default to denying access (since 4.0).
>  # @max-connections: The maximum number of connections to allow at the same
> -#                   time, 0 for unlimited. (since 5.2; default: 0)
> +#                   time, 0 for unlimited. Setting this to 1 also stops
> +#                   the server from advertising multiple client support
> +#                   (since 5.2; default: 0)
>  #
>  # Since: 4.2
>  ##
> @@ -50,7 +52,9 @@
>  #             recreated on the fly while the NBD server is active.
>  #             If missing, it will default to denying access (since 4.0).
>  # @max-connections: The maximum number of connections to allow at the same
> -#                   time, 0 for unlimited. (since 5.2; default: 0)
> +#                   time, 0 for unlimited. Setting this to 1 also stops
> +#                   the server from advertising multiple client support
> +#                   (since 5.2; default: 0).
>  #
>  # Returns: error if the server is already running.
>  #
> @@ -79,6 +83,26 @@
>  { 'struct': 'BlockExportOptionsNbdBase',
>    'data': { '*name': 'str', '*description': 'str' } }
>
> +##
> +# @NbdExportMultiConn:
> +#
> +# Possible settings for advertising NBD multiple client support.
> +#
> +# @off: Do not advertise multiple clients.
> +#
> +# @on: Allow multiple clients (for writable clients, this is only safe
> +#      if the underlying BDS is cache-consistent, such as when backed
> +#      by the raw file driver); ignored if the NBD server was set up
> +#      with max-connections of 1.
> +#
> +# @auto: Behaves like @off if the export is writable, and @on if the
> +#        export is read-only.
> +#
> +# Since: 7.0
> +##
> +{ 'enum': 'NbdExportMultiConn',
> +  'data': ['off', 'on', 'auto'] }
>

Are we going to have --multi-con=(on|off|auto)?


> +
>  ##
>  # @BlockExportOptionsNbd:
>  #
> @@ -95,11 +119,15 @@
>  #                    the metadata context name "qemu:allocation-depth" to
>  #                    inspect allocation details. (since 5.2)
>  #
> +# @multi-conn: Controls whether multiple client support is advertised, if
> the
> +#              server's max-connections is not 1. (default auto, since
> 7.0)
> +#
>  # Since: 5.2
>  ##
>  { 'struct': 'BlockExportOptionsNbd',
>    'base': 'BlockExportOptionsNbdBase',
> -  'data': { '*bitmaps': ['str'], '*allocation-depth': 'bool' } }
> +  'data': { '*bitmaps': ['str'], '*allocation-depth': 'bool',
> +            '*multi-conn': 'NbdExportMultiConn' } }
>
>  ##
>  # @BlockExportOptionsVhostUserBlk:
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 78d101b77488..138b9857dbcb 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -1,5 +1,5 @@
>  /*
> - *  Copyright (C) 2016-2020 Red Hat, Inc.
> + *  Copyright (C) 2016-2022 Red Hat, Inc.
>   *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
>   *
>   *  Network Block Device
> @@ -346,6 +346,7 @@ void nbd_client_put(NBDClient *client);
>
>  void nbd_server_is_qemu_nbd(bool value);
>  bool nbd_server_is_running(void);
> +int nbd_server_max_connections(void);
>  void nbd_server_start(SocketAddress *addr, const char *tls_creds,
>                        const char *tls_authz, uint32_t max_connections,
>                        Error **errp);
> diff --git a/blockdev-nbd.c b/blockdev-nbd.c
> index bdfa7ed3a5a9..0df222e868d8 100644
> --- a/blockdev-nbd.c
> +++ b/blockdev-nbd.c
> @@ -44,6 +44,11 @@ bool nbd_server_is_running(void)
>      return nbd_server || is_qemu_nbd;
>  }
>
> +int nbd_server_max_connections(void)
> +{
> +    return nbd_server ? nbd_server->max_connections : -1;
> +}
>

-1 is a little bit strange for a limit, maybe 1 is a better default when
we nbd_server == NULL? When can this happen?


> +
>  static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
>  {
>      nbd_client_put(client);
> diff --git a/nbd/server.c b/nbd/server.c
> index 9fb2f264023e..f17aacc8139f 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1,5 +1,5 @@
>  /*
> - *  Copyright (C) 2016-2021 Red Hat, Inc.
> + *  Copyright (C) 2016-2022 Red Hat, Inc.
>   *  Copyright (C) 2005  Anthony Liguori <anthony@codemonkey.ws>
>   *
>   *  Network Block Device Server Side
> @@ -1641,7 +1641,7 @@ static int nbd_export_create(BlockExport *blk_exp,
> BlockExportOptions *exp_args,
>      int64_t size;
>      uint64_t perm, shared_perm;
>      bool readonly = !exp_args->writable;
> -    bool shared = !exp_args->writable;
> +    bool multi_conn;
>      strList *bitmaps;
>      size_t i;
>      int ret;
> @@ -1679,6 +1679,23 @@ static int nbd_export_create(BlockExport *blk_exp,
> BlockExportOptions *exp_args,
>          return size;
>      }
>
> +    /*
> +     * Determine whether to advertise multi-conn.  Default is auto,
> +     * which resolves to on for read-only and off for writable.  But
> +     * if the server has max-connections 1, that forces the flag off.
>

Looks good, this can be enabled automatically based on the driver
if we want, so users using auto will can upgrade to multi-con automatically.


> +     */
> +    if (!arg->has_multi_conn) {
> +        arg->multi_conn = NBD_EXPORT_MULTI_CONN_AUTO;
> +    }
> +    if (nbd_server_max_connections() == 1) {

+        arg->multi_conn = NBD_EXPORT_MULTI_CONN_OFF;
> +    }

+    if (arg->multi_conn == NBD_EXPORT_MULTI_CONN_AUTO) {
> +        multi_conn = readonly;
> +    } else {
> +        multi_conn = arg->multi_conn == NBD_EXPORT_MULTI_CONN_ON;
> +    }
>

This part is a little bit confusing - we do:
- initialize args->multi_con if it has not value
- set the temporary multi_con based now initialized args->multi_con

I think it will be nicer to separate arguments parsing, so there is no need
to initialize it here or have arg->has_multi_conn, but I did not check how
other arguments are handled.


> +
>      /* Don't allow resize while the NBD server is running, otherwise we
> don't
>       * care what happens with the node. */
>      blk_get_perm(blk, &perm, &shared_perm);
> @@ -1692,11 +1709,11 @@ static int nbd_export_create(BlockExport *blk_exp,
> BlockExportOptions *exp_args,
>      exp->description = g_strdup(arg->description);
>      exp->nbdflags = (NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_FLUSH |
>                       NBD_FLAG_SEND_FUA | NBD_FLAG_SEND_CACHE);
> +    if (multi_conn) {
> +        exp->nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
> +    }
>      if (readonly) {
>          exp->nbdflags |= NBD_FLAG_READ_ONLY;
> -        if (shared) {
> -            exp->nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
> -        }
>      } else {
>          exp->nbdflags |= (NBD_FLAG_SEND_TRIM | NBD_FLAG_SEND_WRITE_ZEROES
> |
>                            NBD_FLAG_SEND_FAST_ZERO);v
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9814580975c5..24dd800ae3f8 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3334,6 +3334,7 @@ F: qemu-nbd.*
>  F: blockdev-nbd.c
>  F: docs/interop/nbd.txt
>  F: docs/tools/qemu-nbd.rst
> +F: tests/qemu-iotests/tests/*nbd*
>  T: git https://repo.or.cz/qemu/ericb.git nbd
>  T: git https://src.openvz.org/scm/~vsementsov/qemu.git nbd
>
> diff --git a/tests/qemu-iotests/tests/nbd-multiconn
> b/tests/qemu-iotests/tests/nbd-multiconn
> new file mode 100755
> index 000000000000..0261552f32f2
> --- /dev/null
> +++ b/tests/qemu-iotests/tests/nbd-multiconn
> @@ -0,0 +1,188 @@
> +#!/usr/bin/env bash
> +# group: rw auto quick
> +#
> +# Test that qemu-nbd MULTI_CONN works
> +#
> +# Copyright (C) 2018-2022 Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +seq="$(basename $0)"
> +echo "QA output created by $seq"
> +
> +status=1 # failure is the default!
> +
> +_cleanup()
> +{
> +    _cleanup_test_img
> +    _cleanup_qemu
> +    rm -f "$SOCK_DIR/nbd"
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +cd ..
> +. ./common.rc
> +. ./common.filter
> +. ./common.qemu
> +. ./common.nbd
> +
> +_supported_fmt qcow2
> +_supported_proto file
> +_supported_os Linux
> +_require_command QEMU_NBD
> +
> +# Parallel client connections are easier with libnbd than qemu-io:
> +if ! (nbdsh --version) >/dev/null 2>&1; then
> +    _notrun "nbdsh utility required, skipped this test"
> +fi
> +
> +echo
> +echo "=== Initial image setup ==="
> +echo
> +
> +_make_test_img 4M
> +$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | _filter_qemu_io
> +_launch_qemu 2> >(_filter_nbd)
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
> +  "arguments":{"driver":"qcow2", "node-name":"n",
> +    "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"
> +export nbd_unix_socket
> +
> +echo
> +echo "=== Default nbd server settings ==="
> +echo
> +# Default allows for unlimited connections, readonly images advertise
> +# multi-conn, and writable images do not
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
> +  "arguments":{"addr":{"type":"unix",
> +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
> +    "name":"r"}}' "return"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> +    "name":"w", "writable":true}}' "return"
> +nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
> +assert h.can_multi_conn()
> +h.shutdown()
> +print("nbdsh passed")'
> +nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
> +assert not h.can_multi_conn()
> +h.shutdown()
> +print("nbdsh passed")'
>

Mixing of shell and python is very confusing. Wouldn't it be much cleaner
to write the test in python?


> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
> +  "arguments":{"id":"r"}}' "DELETED"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
> +  "arguments":{"id":"w"}}' "DELETED"
> +
> +echo
> +echo "=== Per-export overrides of defaults ==="
> +echo
> +# Can explicitly disable multi-conn for readonly image, and explicitly
> +# enable it for writable image
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
> +    "name":"r", "multi-conn":"off"}}' "return"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> +    "name":"w", "multi-conn":"on", "writable":true}}' "return"
> +nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
> +assert not h.can_multi_conn()
> +h.shutdown()
> +print("nbdsh passed")'
> +nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
> +assert h.can_multi_conn()
> +h.shutdown()
> +print("nbdsh passed")'
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
> +  "arguments":{"id":"r"}}' "DELETED"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
> +  "arguments":{"id":"w"}}' "DELETED"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return"
> +
> +echo
> +echo "=== Limit nbd server max-connections ==="
> +echo
> +# When max-connections is 1, no images advertise multi-conn, even when
> +# explicitly requested per export
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
> +  "arguments":{"max-connections":1, "addr":{"type":"unix",
> +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
> +    "name":"r", "multi-conn":"on"}}' "return"
> +nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
> +assert not h.can_multi_conn()
> +h.shutdown()
> +print("nbdsh passed")'
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
> +  "arguments":{"id":"r"}}' "DELETED"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> +    "name":"w", "multi-conn":"on", "writable":true}}' "return"
> +nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
> +assert not h.can_multi_conn()
> +h.shutdown()
> +print("nbdsh passed")'
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
> +  "arguments":{"id":"w"}}' "DELETED"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return"
> +
> +echo
> +echo "=== Demonstrate parallel writers ==="
> +echo
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
> +  "arguments":{"addr":{"type":"unix",
> +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> +    "name":"", "multi-conn":"on", "writable":true}}' "return"
> +
> +nbdsh -c '
> +import os
> +sock = os.getenv("nbd_unix_socket")
> +h = []
> +
> +for i in range(3):
> +  h.append(nbd.NBD())
> +  h[i].connect_unix(sock)
> +  assert h[i].can_multi_conn()
>

This is somewhat C in python, maybe:

    handles = [nbd.NBD() for _ in range(3)]

    for h in handles:
        h.connect_unix(sock)
        assert h.can_multi_con()

Since assert will abort the test, and we don't handle
exceptions, failure will not shutdown the connections
but it should not matter for the purpose of a test.


> +
> +buf1 = h[0].pread(1024 * 1024, 0)
> +if buf1 != b"\x01" * 1024 * 1024:
> +  print("Unexpected initial read")
>

Not clear when we initialize the buffer to \x01 - is this the qemu-io above?


> +buf2 = b"\x03" * 1024 * 1024
> +h[1].pwrite(buf2, 0)
> +h[2].flush()
> +buf1 = h[0].pread(1024 * 1024, 0)
> +if buf1 == buf2:
> +  print("Flush appears to be consistent across connections")
>

buf1 was the initial content, buf2 is the new content, but now we override
but1 to check that the right was flushed. It will be be better to use
different
names like inittial_data, updated_data, current_data.

This block is the most important part of the test, showing that we can write
in one connection, flush in the second, and read the written block in the
third.
Maybe add a comment about this? I think it will help someone else trying
to understand what this part is trying to test.


> +
> +for i in range(3):
> +  h[i].shutdown()
>

for h in handles:
    h.shutdown()


> +'
> +
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-del",
> +  "arguments":{"id":"w"}}' "DELETED"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-stop"}' "return"
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"quit"}' "return"
> +wait=yes _cleanup_qemu
> +
> +# success, all done
> +echo '*** done'
> +rm -f $seq.full
> +status=0
> diff --git a/tests/qemu-iotests/tests/nbd-multiconn.out
> b/tests/qemu-iotests/tests/nbd-multiconn.out
> new file mode 100644
> index 000000000000..a06428e1536a
> --- /dev/null
> +++ b/tests/qemu-iotests/tests/nbd-multiconn.out
> @@ -0,0 +1,112 @@
> +QA output created by nbd-multiconn
> +
> +=== Initial image setup ===
> +
> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4194304
> +wrote 2097152/2097152 bytes at offset 0
> +2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +wrote 2097152/2097152 bytes at offset 2097152
> +2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +{"execute":"qmp_capabilities"}
> +{"return": {}}
> +{"execute":"blockdev-add",
> +  "arguments":{"driver":"IMGFMT", "node-name":"n",
> +    "file":{"driver":"file", "filename":"TEST_DIR/t.IMGFMT"}}}
> +{"return": {}}
> +
> +=== Default nbd server settings ===
> +
> +{"execute":"nbd-server-start",
> +  "arguments":{"addr":{"type":"unix",
> +    "data":{"path":"SOCK_DIR/qemu-nbd.sock"}}}}
> +{"return": {}}
> +{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
> +    "name":"r"}}
> +{"return": {}}
> +{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> +    "name":"w", "writable":true}}
> +{"return": {}}
> +nbdsh passed
> +nbdsh passed
> +{"execute":"block-export-del",
> +  "arguments":{"id":"r"}}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "r"}}
> +{"execute":"block-export-del",
> +  "arguments":{"id":"w"}}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
> +
> +=== Per-export overrides of defaults ===
> +
> +{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
> +    "name":"r", "multi-conn":"off"}}
> +{"return": {}}
> +{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> +    "name":"w", "multi-conn":"on", "writable":true}}
> +{"return": {}}
> +nbdsh passed
> +nbdsh passed
> +{"execute":"block-export-del",
> +  "arguments":{"id":"r"}}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "r"}}
> +{"execute":"block-export-del",
> +  "arguments":{"id":"w"}}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
> +{"execute":"nbd-server-stop"}
> +{"return": {}}
> +
> +=== Limit nbd server max-connections ===
> +
> +{"execute":"nbd-server-start",
> +  "arguments":{"max-connections":1, "addr":{"type":"unix",
> +    "data":{"path":"SOCK_DIR/qemu-nbd.sock"}}}}
> +{"return": {}}
> +{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
> +    "name":"r", "multi-conn":"on"}}
> +{"return": {}}
> +nbdsh passed
> +{"execute":"block-export-del",
> +  "arguments":{"id":"r"}}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "r"}}
> +{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> +    "name":"w", "multi-conn":"on", "writable":true}}
> +{"return": {}}
> +nbdsh passed
> +{"execute":"block-export-del",
> +  "arguments":{"id":"w"}}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
> +{"execute":"nbd-server-stop"}
> +{"return": {}}
> +
> +=== Demonstrate parallel writers ===
> +
> +{"execute":"nbd-server-start",
> +  "arguments":{"addr":{"type":"unix",
> +    "data":{"path":"SOCK_DIR/qemu-nbd.sock"}}}}
> +{"return": {}}
> +{"execute":"block-export-add",
> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> +    "name":"", "multi-conn":"on", "writable":true}}
> +{"return": {}}
> +Flush appears to be consistent across connections
> +{"execute":"block-export-del",
> +  "arguments":{"id":"w"}}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
> +{"execute":"nbd-server-stop"}
> +{"return": {}}
> +{"execute":"quit"}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
>

Nothing about the contents here says anything about the actual test
except the "Flush appears..." line.


> +*** done
> --
> 2.35.1
>

Looks good, feel free to ignore the style comments and suggestion to rewrite
the test in python.

Nir
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Eric Blake 2 years, 2 months ago
On Tue, Feb 15, 2022 at 09:23:36PM +0200, Nir Soffer wrote:
> On Tue, Feb 15, 2022 at 7:22 PM Eric Blake <eblake@redhat.com> wrote:
> 
> > According to the NBD spec, a server advertising
> > NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
> > not see any cache inconsistencies: when properly separated by a single
> > flush, actions performed by one client will be visible to another
> > client, regardless of which client did the flush.  We satisfy these
> > conditions in qemu when our block layer is backed by the local
> > filesystem (by virtue of the semantics of fdatasync(), and the fact
> > that qemu itself is not buffering writes beyond flushes).  It is
> > harder to state whether we satisfy these conditions for network-based
> > protocols, so the safest course of action is to allow users to opt-in
> > to advertising multi-conn.  We may later tweak defaults to advertise
> > by default when the block layer can confirm that the underlying
> > protocol driver is cache consistent between multiple writers, but for
> > now, this at least allows savvy users (such as virt-v2v or nbdcopy) to
> > explicitly start qemu-nbd or qemu-storage-daemon with multi-conn
> > advertisement in a known-safe setup where the client end can then
> > benefit from parallel clients.
> >
> 
> It makes sense, and will be used by oVirt. Actually we are already using
> multiple connections for writing about 2 years, based on your promise
> that if every client writes to district  areas this is safe.

I presume s/district/distinct/, but yes, I'm glad we're finally trying
to make the code match existing practice ;)

> > +++ b/docs/tools/qemu-nbd.rst
> > @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified.
> >  .. option:: -e, --shared=NUM
> >
> >    Allow up to *NUM* clients to share the device (default
> > -  ``1``), 0 for unlimited. Safe for readers, but for now,
> > -  consistency is not guaranteed between multiple writers.
> > +  ``1``), 0 for unlimited.
> >
> 
> Removing the note means that now consistency is guaranteed between
> multiple writers, no?
> 
> Or maybe we want to mention here that consistency depends on the protocol
> and users can opt in, or refer to the section where this is discussed?

Yeah, a link to the QAPI docs where multi-conn is documented might be
nice, except I'm not sure the best way to do that in our sphinx
documentation setup.

> > +##
> > +# @NbdExportMultiConn:
> > +#
> > +# Possible settings for advertising NBD multiple client support.
> > +#
> > +# @off: Do not advertise multiple clients.
> > +#
> > +# @on: Allow multiple clients (for writable clients, this is only safe
> > +#      if the underlying BDS is cache-consistent, such as when backed
> > +#      by the raw file driver); ignored if the NBD server was set up
> > +#      with max-connections of 1.
> > +#
> > +# @auto: Behaves like @off if the export is writable, and @on if the
> > +#        export is read-only.
> > +#
> > +# Since: 7.0
> > +##
> > +{ 'enum': 'NbdExportMultiConn',
> > +  'data': ['off', 'on', 'auto'] }
> >
> 
> Are we going to have --multi-con=(on|off|auto)?

Oh. The QMP command (which is immediately visible through
nbd-server-add/block-storage-add to qemu and qemu-storage-daemon)
gains "multi-conn":"on", but you may be right that qemu-nbd would want
a command line option (either that, or we accellerate our plans that
qsd should replace qemu-nbd).

> > +++ b/blockdev-nbd.c
> > @@ -44,6 +44,11 @@ bool nbd_server_is_running(void)
> >      return nbd_server || is_qemu_nbd;
> >  }
> >
> > +int nbd_server_max_connections(void)
> > +{
> > +    return nbd_server ? nbd_server->max_connections : -1;
> > +}
> >
> 
> -1 is a little bit strange for a limit, maybe 1 is a better default when
> we nbd_server == NULL? When can this happen?

In qemu, if you haven't used the QMP command 'nbd-server-start' yet.
In qemu-nbd, always (per the nbd_server_is_running function just
above).  My iotest only covered the qemu/qsd side, not the qemu-nbd
side, so it looks like I need a v3...

> > +++ b/nbd/server.c

> > +    /*
> > +     * Determine whether to advertise multi-conn.  Default is auto,
> > +     * which resolves to on for read-only and off for writable.  But
> > +     * if the server has max-connections 1, that forces the flag off.
> >
> 
> Looks good, this can be enabled automatically based on the driver
> if we want, so users using auto will can upgrade to multi-con automatically.

Yes, that's part of why I made it a tri-state with a default of 'auto'.

> 
> 
> > +     */
> > +    if (!arg->has_multi_conn) {
> > +        arg->multi_conn = NBD_EXPORT_MULTI_CONN_AUTO;
> > +    }
> > +    if (nbd_server_max_connections() == 1) {
> 
> +        arg->multi_conn = NBD_EXPORT_MULTI_CONN_OFF;
> > +    }
> 
> +    if (arg->multi_conn == NBD_EXPORT_MULTI_CONN_AUTO) {
> > +        multi_conn = readonly;
> > +    } else {
> > +        multi_conn = arg->multi_conn == NBD_EXPORT_MULTI_CONN_ON;
> > +    }
> >
> 
> This part is a little bit confusing - we do:
> - initialize args->multi_con if it has not value
> - set the temporary multi_con based now initialized args->multi_con
> 
> I think it will be nicer to separate arguments parsing, so there is no need
> to initialize it here or have arg->has_multi_conn, but I did not check how
> other arguments are handled.

arg->has_multi_conn is fallout from the fact that our QAPI must remain
back-compat. If it is false, the user did not pass "multi-conn":...,
so we want the default value of "auto".  Once we've established the
default, then we force multi-conn off if we know we are limited to one
client (although as you pointed out, nbd_server_max_connections()
needs to be tested with qemu-nbd).  Then, it's easier to resolve the
tri-state enum variable into the bool of what we actually advertise to
the NBD client.

> > +++ b/tests/qemu-iotests/tests/nbd-multiconn
> > @@ -0,0 +1,188 @@
> > +#!/usr/bin/env bash
> > +# group: rw auto quick
> > +#
> > +# Test that qemu-nbd MULTI_CONN works
> > +#
> > +echo
> > +echo "=== Initial image setup ==="
> > +echo
> > +
> > +_make_test_img 4M
> > +$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | _filter_qemu_io
> > +_launch_qemu 2> >(_filter_nbd)
> > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return"
> > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
> > +  "arguments":{"driver":"qcow2", "node-name":"n",
> > +    "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"

I'm not the best at writing python iotests; I welcome a language
translation of this aspect.  But the shell code for
_launch_qemu/_send_qemu_cmd was already pretty nice for setting up a
long-running background qemu process where I can pass in QMP commands
at will, then interact from other processes.

> > +export nbd_unix_socket
> > +
> > +echo
> > +echo "=== Default nbd server settings ==="
> > +echo
> > +# Default allows for unlimited connections, readonly images advertise
> > +# multi-conn, and writable images do not
> > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
> > +  "arguments":{"addr":{"type":"unix",
> > +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
> > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> > +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
> > +    "name":"r"}}' "return"
> > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> > +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> > +    "name":"w", "writable":true}}' "return"
> > +nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
> > +assert h.can_multi_conn()
> > +h.shutdown()
> > +print("nbdsh passed")'
> > +nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
> > +assert not h.can_multi_conn()
> > +h.shutdown()
> > +print("nbdsh passed")'
> >
> 
> Mixing of shell and python is very confusing. Wouldn't it be much cleaner
> to write the test in python?

Here, nbdsh -c 'python snippet' is used as a shell command line
parameter.  Writing python code to call out to a system() command
where one of the arguments to that command is a python script snippet
is going to be just as annoying as writing it in bash.

> > +echo
> > +echo "=== Demonstrate parallel writers ==="
> > +echo
> > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
> > +  "arguments":{"addr":{"type":"unix",
> > +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
> > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> > +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> > +    "name":"", "multi-conn":"on", "writable":true}}' "return"
> > +
> > +nbdsh -c '
> > +import os
> > +sock = os.getenv("nbd_unix_socket")
> > +h = []
> > +
> > +for i in range(3):
> > +  h.append(nbd.NBD())
> > +  h[i].connect_unix(sock)
> > +  assert h[i].can_multi_conn()
> >
> 
> This is somewhat C in python, maybe:
> 
>     handles = [nbd.NBD() for _ in range(3)]
> 
>     for h in handles:
>         h.connect_unix(sock)
>         assert h.can_multi_con()
> 
> Since assert will abort the test, and we don't handle
> exceptions, failure will not shutdown the connections
> but it should not matter for the purpose of a test.

As I said, I'm open to python suggestions :)  I like your approach.

> 
> 
> > +
> > +buf1 = h[0].pread(1024 * 1024, 0)
> > +if buf1 != b"\x01" * 1024 * 1024:
> > +  print("Unexpected initial read")
> >
> 
> Not clear when we initialize the buffer to \x01 - is this the qemu-io above?

Yes, when the qcow2 file was initially created.

> 
> 
> > +buf2 = b"\x03" * 1024 * 1024
> > +h[1].pwrite(buf2, 0)
> > +h[2].flush()
> > +buf1 = h[0].pread(1024 * 1024, 0)
> > +if buf1 == buf2:
> > +  print("Flush appears to be consistent across connections")
> >
> 
> buf1 was the initial content, buf2 is the new content, but now we override
> but1 to check that the right was flushed. It will be be better to use
> different
> names like inittial_data, updated_data, current_data.

Can do.

> 
> This block is the most important part of the test, showing that we can write
> in one connection, flush in the second, and read the written block in the
> third.
> Maybe add a comment about this? I think it will help someone else trying
> to understand what this part is trying to test.

Can do.

> > +{"execute":"block-export-add",
> > +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> > +    "name":"", "multi-conn":"on", "writable":true}}
> > +{"return": {}}
> > +Flush appears to be consistent across connections
> > +{"execute":"block-export-del",
> > +  "arguments":{"id":"w"}}
> > +{"return": {}}
> > +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> > "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
> > +{"execute":"nbd-server-stop"}
> > +{"return": {}}
> > +{"execute":"quit"}
> > +{"return": {}}
> > +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> > "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
> >
> 
> Nothing about the contents here says anything about the actual test
> except the "Flush appears..." line.

Yeah, it's a lot of QMP debugging (to show what commands were run in
setting up the server), and less verbose in the nbdsh side.  Do I need
to add more output during the nbdsh that uses multiple connections?

> 
> 
> > +*** done
> > --
> > 2.35.1
> >
> 
> Looks good, feel free to ignore the style comments and suggestion to rewrite
> the test in python.

The style comments are nice, the rewriting is going to be harder for
me (but I'll accept help).  At any rate, getting qemu-nbd to be
feature-compatible may require a v3 anyway.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Vladimir Sementsov-Ogievskiy 2 years, 2 months ago
16.02.2022 02:24, Eric Blake wrote:
> On Tue, Feb 15, 2022 at 09:23:36PM +0200, Nir Soffer wrote:
>> On Tue, Feb 15, 2022 at 7:22 PM Eric Blake <eblake@redhat.com> wrote:
>>
>>> According to the NBD spec, a server advertising
>>> NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
>>> not see any cache inconsistencies: when properly separated by a single
>>> flush, actions performed by one client will be visible to another
>>> client, regardless of which client did the flush.  We satisfy these
>>> conditions in qemu when our block layer is backed by the local
>>> filesystem (by virtue of the semantics of fdatasync(), and the fact
>>> that qemu itself is not buffering writes beyond flushes).  It is
>>> harder to state whether we satisfy these conditions for network-based
>>> protocols, so the safest course of action is to allow users to opt-in
>>> to advertising multi-conn.  We may later tweak defaults to advertise
>>> by default when the block layer can confirm that the underlying
>>> protocol driver is cache consistent between multiple writers, but for
>>> now, this at least allows savvy users (such as virt-v2v or nbdcopy) to
>>> explicitly start qemu-nbd or qemu-storage-daemon with multi-conn
>>> advertisement in a known-safe setup where the client end can then
>>> benefit from parallel clients.
>>>
>>
>> It makes sense, and will be used by oVirt. Actually we are already using
>> multiple connections for writing about 2 years, based on your promise
>> that if every client writes to district  areas this is safe.
> 
> I presume s/district/distinct/, but yes, I'm glad we're finally trying
> to make the code match existing practice ;)
> 
>>> +++ b/docs/tools/qemu-nbd.rst
>>> @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified.
>>>   .. option:: -e, --shared=NUM
>>>
>>>     Allow up to *NUM* clients to share the device (default
>>> -  ``1``), 0 for unlimited. Safe for readers, but for now,
>>> -  consistency is not guaranteed between multiple writers.
>>> +  ``1``), 0 for unlimited.
>>>
>>
>> Removing the note means that now consistency is guaranteed between
>> multiple writers, no?
>>
>> Or maybe we want to mention here that consistency depends on the protocol
>> and users can opt in, or refer to the section where this is discussed?
> 
> Yeah, a link to the QAPI docs where multi-conn is documented might be
> nice, except I'm not sure the best way to do that in our sphinx
> documentation setup.
> 
>>> +##
>>> +# @NbdExportMultiConn:
>>> +#
>>> +# Possible settings for advertising NBD multiple client support.
>>> +#
>>> +# @off: Do not advertise multiple clients.
>>> +#
>>> +# @on: Allow multiple clients (for writable clients, this is only safe
>>> +#      if the underlying BDS is cache-consistent, such as when backed
>>> +#      by the raw file driver); ignored if the NBD server was set up
>>> +#      with max-connections of 1.
>>> +#
>>> +# @auto: Behaves like @off if the export is writable, and @on if the
>>> +#        export is read-only.
>>> +#
>>> +# Since: 7.0
>>> +##
>>> +{ 'enum': 'NbdExportMultiConn',
>>> +  'data': ['off', 'on', 'auto'] }
>>>
>>
>> Are we going to have --multi-con=(on|off|auto)?
> 
> Oh. The QMP command (which is immediately visible through
> nbd-server-add/block-storage-add to qemu and qemu-storage-daemon)
> gains "multi-conn":"on", but you may be right that qemu-nbd would want
> a command line option (either that, or we accellerate our plans that
> qsd should replace qemu-nbd).
> 
>>> +++ b/blockdev-nbd.c
>>> @@ -44,6 +44,11 @@ bool nbd_server_is_running(void)
>>>       return nbd_server || is_qemu_nbd;
>>>   }
>>>
>>> +int nbd_server_max_connections(void)
>>> +{
>>> +    return nbd_server ? nbd_server->max_connections : -1;
>>> +}
>>>
>>
>> -1 is a little bit strange for a limit, maybe 1 is a better default when
>> we nbd_server == NULL? When can this happen?
> 
> In qemu, if you haven't used the QMP command 'nbd-server-start' yet.
> In qemu-nbd, always (per the nbd_server_is_running function just
> above).  My iotest only covered the qemu/qsd side, not the qemu-nbd
> side, so it looks like I need a v3...
> 
>>> +++ b/nbd/server.c
> 
>>> +    /*
>>> +     * Determine whether to advertise multi-conn.  Default is auto,
>>> +     * which resolves to on for read-only and off for writable.  But
>>> +     * if the server has max-connections 1, that forces the flag off.
>>>
>>
>> Looks good, this can be enabled automatically based on the driver
>> if we want, so users using auto will can upgrade to multi-con automatically.
> 
> Yes, that's part of why I made it a tri-state with a default of 'auto'.
> 
>>
>>
>>> +     */
>>> +    if (!arg->has_multi_conn) {
>>> +        arg->multi_conn = NBD_EXPORT_MULTI_CONN_AUTO;
>>> +    }
>>> +    if (nbd_server_max_connections() == 1) {
>>
>> +        arg->multi_conn = NBD_EXPORT_MULTI_CONN_OFF;
>>> +    }
>>
>> +    if (arg->multi_conn == NBD_EXPORT_MULTI_CONN_AUTO) {
>>> +        multi_conn = readonly;
>>> +    } else {
>>> +        multi_conn = arg->multi_conn == NBD_EXPORT_MULTI_CONN_ON;
>>> +    }
>>>
>>
>> This part is a little bit confusing - we do:
>> - initialize args->multi_con if it has not value
>> - set the temporary multi_con based now initialized args->multi_con
>>
>> I think it will be nicer to separate arguments parsing, so there is no need
>> to initialize it here or have arg->has_multi_conn, but I did not check how
>> other arguments are handled.
> 
> arg->has_multi_conn is fallout from the fact that our QAPI must remain
> back-compat. If it is false, the user did not pass "multi-conn":...,
> so we want the default value of "auto".  Once we've established the
> default, then we force multi-conn off if we know we are limited to one
> client (although as you pointed out, nbd_server_max_connections()
> needs to be tested with qemu-nbd).  Then, it's easier to resolve the
> tri-state enum variable into the bool of what we actually advertise to
> the NBD client.
> 
>>> +++ b/tests/qemu-iotests/tests/nbd-multiconn
>>> @@ -0,0 +1,188 @@
>>> +#!/usr/bin/env bash
>>> +# group: rw auto quick
>>> +#
>>> +# Test that qemu-nbd MULTI_CONN works
>>> +#
>>> +echo
>>> +echo "=== Initial image setup ==="
>>> +echo
>>> +
>>> +_make_test_img 4M
>>> +$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | _filter_qemu_io
>>> +_launch_qemu 2> >(_filter_nbd)
>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return"
>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
>>> +  "arguments":{"driver":"qcow2", "node-name":"n",
>>> +    "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"
> 
> I'm not the best at writing python iotests; I welcome a language
> translation of this aspect. 



Let me try:)


#!/usr/bin/env python3

import os
import iotests
import nbd
from iotests import qemu_img_create, qemu_io_silent


disk = os.path.join(iotests.test_dir, 'disk')
size = '4M'
nbd_sock = os.path.join(iotests.test_dir, 'nbd_sock')
nbd_uri = 'nbd+unix:///{}?socket=' + nbd_sock


class TestNbdMulticonn(iotests.QMPTestCase):
     def setUp(self):
         assert qemu_img_create('-f', iotests.imgfmt, disk, size) == 0
         assert qemu_io_silent('-c', 'w -P 1 0 2M',
                               '-c', 'w -P 2 2M 2M', disk) == 0

         self.vm = iotests.VM()
         self.vm.launch()
         result = self.vm.qmp('blockdev-add', {
             'driver': 'qcow2',
             'node-name': 'n',
             'file': {'driver': 'file', 'filename': disk}
         })
         self.assert_qmp(result, 'return', {})

     def tearDown(self):
         self.vm.shutdown()
         os.remove(disk)
         os.remove(nbd_sock)

     def server_start(self, max_connections=None):
         args = {
             'addr': {
                 'type': 'unix',
                 'data': {'path': nbd_sock}
             }
         }
         if max_connections is not None:
             args['max-connections'] = max_connections

         result = self.vm.qmp('nbd-server-start', args)
         self.assert_qmp(result, 'return', {})

     def export_add(self, name, writable=None, multi_conn=None):
         args = {
             'type': 'nbd',
             'id': name,
             'node-name': 'n',
             'name': name,
         }
         if writable is not None:
             args['writable'] = writable
         if multi_conn is not None:
             args['multi-conn'] = multi_conn

         result = self.vm.qmp('block-export-add', args)
         self.assert_qmp(result, 'return', {})

     def check_multi_conn(self, export_name, multi_conn):
         cl = nbd.NBD()
         cl.connect_uri(nbd_uri.format(export_name))
         self.assertEqual(cl.can_multi_conn(), multi_conn)
         cl.shutdown()

     def test_default_settings(self):
         self.server_start()
         self.export_add('r')
         self.export_add('w', writable=True)
         self.check_multi_conn('r', True)
         self.check_multi_conn('w', False)

     def test_multiconn_option(self):
         self.server_start()
         self.export_add('r', multi_conn='off')
         self.export_add('w', writable=True, multi_conn='on')
         self.check_multi_conn('r', False)
         self.check_multi_conn('w', True)

     def test_limited_connections(self):
         self.server_start(max_connections=1)
         self.export_add('r', multi_conn='on')
         self.export_add('w', writable=True, multi_conn='on')
         self.check_multi_conn('r', False)
         self.check_multi_conn('w', False)

     def test_parallel_writes(self):
         self.server_start()
         self.export_add('w', writable=True, multi_conn='on')

         clients = [nbd.NBD() for _ in range(3)]
         for c in clients:
             c.connect_uri(nbd_uri.format('w'))
             assert c.can_multi_conn()

         buf1 = clients[0].pread(1024 * 1024, 0)
         self.assertEqual(buf1, b"\x01" * 1024 * 1024)

         buf2 = b"\x03" * 1024 * 1024
         clients[1].pwrite(buf2, 0)
         clients[2].flush()
         buf1 = clients[0].pread(1024 * 1024, 0)

         self.assertEqual(buf1, buf2)

         for i in range(3):
             clients[i].shutdown()


if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2'])







> But the shell code for
> _launch_qemu/_send_qemu_cmd was already pretty nice for setting up a
> long-running background qemu process where I can pass in QMP commands
> at will, then interact from other processes.
> 
>>> +export nbd_unix_socket
>>> +
>>> +echo
>>> +echo "=== Default nbd server settings ==="
>>> +echo
>>> +# Default allows for unlimited connections, readonly images advertise
>>> +# multi-conn, and writable images do not
>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
>>> +  "arguments":{"addr":{"type":"unix",
>>> +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
>>> +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
>>> +    "name":"r"}}' "return"
>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
>>> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
>>> +    "name":"w", "writable":true}}' "return"
>>> +nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
>>> +assert h.can_multi_conn()
>>> +h.shutdown()
>>> +print("nbdsh passed")'
>>> +nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
>>> +assert not h.can_multi_conn()
>>> +h.shutdown()
>>> +print("nbdsh passed")'
>>>
>>
>> Mixing of shell and python is very confusing. Wouldn't it be much cleaner
>> to write the test in python?
> 
> Here, nbdsh -c 'python snippet' is used as a shell command line
> parameter.  Writing python code to call out to a system() command
> where one of the arguments to that command is a python script snippet
> is going to be just as annoying as writing it in bash.
> 
>>> +echo
>>> +echo "=== Demonstrate parallel writers ==="
>>> +echo
>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
>>> +  "arguments":{"addr":{"type":"unix",
>>> +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
>>> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
>>> +    "name":"", "multi-conn":"on", "writable":true}}' "return"
>>> +
>>> +nbdsh -c '
>>> +import os
>>> +sock = os.getenv("nbd_unix_socket")
>>> +h = []
>>> +
>>> +for i in range(3):
>>> +  h.append(nbd.NBD())
>>> +  h[i].connect_unix(sock)
>>> +  assert h[i].can_multi_conn()
>>>
>>
>> This is somewhat C in python, maybe:
>>
>>      handles = [nbd.NBD() for _ in range(3)]
>>
>>      for h in handles:
>>          h.connect_unix(sock)
>>          assert h.can_multi_con()
>>
>> Since assert will abort the test, and we don't handle
>> exceptions, failure will not shutdown the connections
>> but it should not matter for the purpose of a test.
> 
> As I said, I'm open to python suggestions :)  I like your approach.
> 
>>
>>
>>> +
>>> +buf1 = h[0].pread(1024 * 1024, 0)
>>> +if buf1 != b"\x01" * 1024 * 1024:
>>> +  print("Unexpected initial read")
>>>
>>
>> Not clear when we initialize the buffer to \x01 - is this the qemu-io above?
> 
> Yes, when the qcow2 file was initially created.
> 
>>
>>
>>> +buf2 = b"\x03" * 1024 * 1024
>>> +h[1].pwrite(buf2, 0)
>>> +h[2].flush()
>>> +buf1 = h[0].pread(1024 * 1024, 0)
>>> +if buf1 == buf2:
>>> +  print("Flush appears to be consistent across connections")
>>>
>>
>> buf1 was the initial content, buf2 is the new content, but now we override
>> but1 to check that the right was flushed. It will be be better to use
>> different
>> names like inittial_data, updated_data, current_data.
> 
> Can do.
> 
>>
>> This block is the most important part of the test, showing that we can write
>> in one connection, flush in the second, and read the written block in the
>> third.
>> Maybe add a comment about this? I think it will help someone else trying
>> to understand what this part is trying to test.
> 
> Can do.
> 
>>> +{"execute":"block-export-add",
>>> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
>>> +    "name":"", "multi-conn":"on", "writable":true}}
>>> +{"return": {}}
>>> +Flush appears to be consistent across connections
>>> +{"execute":"block-export-del",
>>> +  "arguments":{"id":"w"}}
>>> +{"return": {}}
>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
>>> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
>>> +{"execute":"nbd-server-stop"}
>>> +{"return": {}}
>>> +{"execute":"quit"}
>>> +{"return": {}}
>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
>>> "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
>>>
>>
>> Nothing about the contents here says anything about the actual test
>> except the "Flush appears..." line.
> 
> Yeah, it's a lot of QMP debugging (to show what commands were run in
> setting up the server), and less verbose in the nbdsh side.  Do I need
> to add more output during the nbdsh that uses multiple connections?
> 
>>
>>
>>> +*** done
>>> --
>>> 2.35.1
>>>
>>
>> Looks good, feel free to ignore the style comments and suggestion to rewrite
>> the test in python.
> 
> The style comments are nice, the rewriting is going to be harder for
> me (but I'll accept help).  At any rate, getting qemu-nbd to be
> feature-compatible may require a v3 anyway.
> 


-- 
Best regards,
Vladimir

Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Nir Soffer 2 years, 2 months ago
On Wed, Feb 16, 2022 at 10:08 AM Vladimir Sementsov-Ogievskiy
<vsementsov@virtuozzo.com> wrote:
>
> 16.02.2022 02:24, Eric Blake wrote:
> > On Tue, Feb 15, 2022 at 09:23:36PM +0200, Nir Soffer wrote:
> >> On Tue, Feb 15, 2022 at 7:22 PM Eric Blake <eblake@redhat.com> wrote:
> >>
> >>> According to the NBD spec, a server advertising
> >>> NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
> >>> not see any cache inconsistencies: when properly separated by a single
> >>> flush, actions performed by one client will be visible to another
> >>> client, regardless of which client did the flush.  We satisfy these
> >>> conditions in qemu when our block layer is backed by the local
> >>> filesystem (by virtue of the semantics of fdatasync(), and the fact
> >>> that qemu itself is not buffering writes beyond flushes).  It is
> >>> harder to state whether we satisfy these conditions for network-based
> >>> protocols, so the safest course of action is to allow users to opt-in
> >>> to advertising multi-conn.  We may later tweak defaults to advertise
> >>> by default when the block layer can confirm that the underlying
> >>> protocol driver is cache consistent between multiple writers, but for
> >>> now, this at least allows savvy users (such as virt-v2v or nbdcopy) to
> >>> explicitly start qemu-nbd or qemu-storage-daemon with multi-conn
> >>> advertisement in a known-safe setup where the client end can then
> >>> benefit from parallel clients.
> >>>
> >>
> >> It makes sense, and will be used by oVirt. Actually we are already using
> >> multiple connections for writing about 2 years, based on your promise
> >> that if every client writes to district  areas this is safe.
> >
> > I presume s/district/distinct/, but yes, I'm glad we're finally trying
> > to make the code match existing practice ;)
> >
> >>> +++ b/docs/tools/qemu-nbd.rst
> >>> @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified.
> >>>   .. option:: -e, --shared=NUM
> >>>
> >>>     Allow up to *NUM* clients to share the device (default
> >>> -  ``1``), 0 for unlimited. Safe for readers, but for now,
> >>> -  consistency is not guaranteed between multiple writers.
> >>> +  ``1``), 0 for unlimited.
> >>>
> >>
> >> Removing the note means that now consistency is guaranteed between
> >> multiple writers, no?
> >>
> >> Or maybe we want to mention here that consistency depends on the protocol
> >> and users can opt in, or refer to the section where this is discussed?
> >
> > Yeah, a link to the QAPI docs where multi-conn is documented might be
> > nice, except I'm not sure the best way to do that in our sphinx
> > documentation setup.
> >
> >>> +##
> >>> +# @NbdExportMultiConn:
> >>> +#
> >>> +# Possible settings for advertising NBD multiple client support.
> >>> +#
> >>> +# @off: Do not advertise multiple clients.
> >>> +#
> >>> +# @on: Allow multiple clients (for writable clients, this is only safe
> >>> +#      if the underlying BDS is cache-consistent, such as when backed
> >>> +#      by the raw file driver); ignored if the NBD server was set up
> >>> +#      with max-connections of 1.
> >>> +#
> >>> +# @auto: Behaves like @off if the export is writable, and @on if the
> >>> +#        export is read-only.
> >>> +#
> >>> +# Since: 7.0
> >>> +##
> >>> +{ 'enum': 'NbdExportMultiConn',
> >>> +  'data': ['off', 'on', 'auto'] }
> >>>
> >>
> >> Are we going to have --multi-con=(on|off|auto)?
> >
> > Oh. The QMP command (which is immediately visible through
> > nbd-server-add/block-storage-add to qemu and qemu-storage-daemon)
> > gains "multi-conn":"on", but you may be right that qemu-nbd would want
> > a command line option (either that, or we accellerate our plans that
> > qsd should replace qemu-nbd).
> >
> >>> +++ b/blockdev-nbd.c
> >>> @@ -44,6 +44,11 @@ bool nbd_server_is_running(void)
> >>>       return nbd_server || is_qemu_nbd;
> >>>   }
> >>>
> >>> +int nbd_server_max_connections(void)
> >>> +{
> >>> +    return nbd_server ? nbd_server->max_connections : -1;
> >>> +}
> >>>
> >>
> >> -1 is a little bit strange for a limit, maybe 1 is a better default when
> >> we nbd_server == NULL? When can this happen?
> >
> > In qemu, if you haven't used the QMP command 'nbd-server-start' yet.
> > In qemu-nbd, always (per the nbd_server_is_running function just
> > above).  My iotest only covered the qemu/qsd side, not the qemu-nbd
> > side, so it looks like I need a v3...
> >
> >>> +++ b/nbd/server.c
> >
> >>> +    /*
> >>> +     * Determine whether to advertise multi-conn.  Default is auto,
> >>> +     * which resolves to on for read-only and off for writable.  But
> >>> +     * if the server has max-connections 1, that forces the flag off.
> >>>
> >>
> >> Looks good, this can be enabled automatically based on the driver
> >> if we want, so users using auto will can upgrade to multi-con automatically.
> >
> > Yes, that's part of why I made it a tri-state with a default of 'auto'.
> >
> >>
> >>
> >>> +     */
> >>> +    if (!arg->has_multi_conn) {
> >>> +        arg->multi_conn = NBD_EXPORT_MULTI_CONN_AUTO;
> >>> +    }
> >>> +    if (nbd_server_max_connections() == 1) {
> >>
> >> +        arg->multi_conn = NBD_EXPORT_MULTI_CONN_OFF;
> >>> +    }
> >>
> >> +    if (arg->multi_conn == NBD_EXPORT_MULTI_CONN_AUTO) {
> >>> +        multi_conn = readonly;
> >>> +    } else {
> >>> +        multi_conn = arg->multi_conn == NBD_EXPORT_MULTI_CONN_ON;
> >>> +    }
> >>>
> >>
> >> This part is a little bit confusing - we do:
> >> - initialize args->multi_con if it has not value
> >> - set the temporary multi_con based now initialized args->multi_con
> >>
> >> I think it will be nicer to separate arguments parsing, so there is no need
> >> to initialize it here or have arg->has_multi_conn, but I did not check how
> >> other arguments are handled.
> >
> > arg->has_multi_conn is fallout from the fact that our QAPI must remain
> > back-compat. If it is false, the user did not pass "multi-conn":...,
> > so we want the default value of "auto".  Once we've established the
> > default, then we force multi-conn off if we know we are limited to one
> > client (although as you pointed out, nbd_server_max_connections()
> > needs to be tested with qemu-nbd).  Then, it's easier to resolve the
> > tri-state enum variable into the bool of what we actually advertise to
> > the NBD client.
> >
> >>> +++ b/tests/qemu-iotests/tests/nbd-multiconn
> >>> @@ -0,0 +1,188 @@
> >>> +#!/usr/bin/env bash
> >>> +# group: rw auto quick
> >>> +#
> >>> +# Test that qemu-nbd MULTI_CONN works
> >>> +#
> >>> +echo
> >>> +echo "=== Initial image setup ==="
> >>> +echo
> >>> +
> >>> +_make_test_img 4M
> >>> +$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | _filter_qemu_io
> >>> +_launch_qemu 2> >(_filter_nbd)
> >>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return"
> >>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
> >>> +  "arguments":{"driver":"qcow2", "node-name":"n",
> >>> +    "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"
> >
> > I'm not the best at writing python iotests; I welcome a language
> > translation of this aspect.
>
>
>
> Let me try:)

Thanks! This is much nicer and will be easier to maintain.

>
>
> #!/usr/bin/env python3
>
> import os
> import iotests
> import nbd
> from iotests import qemu_img_create, qemu_io_silent
>
>
> disk = os.path.join(iotests.test_dir, 'disk')
> size = '4M'
> nbd_sock = os.path.join(iotests.test_dir, 'nbd_sock')
> nbd_uri = 'nbd+unix:///{}?socket=' + nbd_sock
>
>
> class TestNbdMulticonn(iotests.QMPTestCase):
>      def setUp(self):
>          assert qemu_img_create('-f', iotests.imgfmt, disk, size) == 0
>          assert qemu_io_silent('-c', 'w -P 1 0 2M',
>                                '-c', 'w -P 2 2M 2M', disk) == 0
>
>          self.vm = iotests.VM()
>          self.vm.launch()
>          result = self.vm.qmp('blockdev-add', {
>              'driver': 'qcow2',
>              'node-name': 'n',
>              'file': {'driver': 'file', 'filename': disk}
>          })
>          self.assert_qmp(result, 'return', {})
>
>      def tearDown(self):
>          self.vm.shutdown()
>          os.remove(disk)
>          os.remove(nbd_sock)
>
>      def server_start(self, max_connections=None):
>          args = {
>              'addr': {
>                  'type': 'unix',
>                  'data': {'path': nbd_sock}
>              }
>          }
>          if max_connections is not None:
>              args['max-connections'] = max_connections
>
>          result = self.vm.qmp('nbd-server-start', args)
>          self.assert_qmp(result, 'return', {})
>
>      def export_add(self, name, writable=None, multi_conn=None):
>          args = {
>              'type': 'nbd',
>              'id': name,
>              'node-name': 'n',
>              'name': name,
>          }
>          if writable is not None:
>              args['writable'] = writable
>          if multi_conn is not None:
>              args['multi-conn'] = multi_conn
>
>          result = self.vm.qmp('block-export-add', args)
>          self.assert_qmp(result, 'return', {})
>
>      def check_multi_conn(self, export_name, multi_conn):
>          cl = nbd.NBD()
>          cl.connect_uri(nbd_uri.format(export_name))
>          self.assertEqual(cl.can_multi_conn(), multi_conn)
>          cl.shutdown()

The test will be more clear and the code more reusable if the helper
was doing only connect/disconnect.

@contextmanager
def open_nbd(nbd_uri, export_name):
     h = nbd.NBD()
     h.connect_uri(nbd_uri.format(export_name))
     try:
         yield h
     finally:
         h.shutdown()

Any test that need nbd handle can do:

    with open_nbd(nbd_uri, export_name) as h:
        use the handle...

Good example when we can use this is the block status cache test,
using complicated qemu-img commands instead of opening
a client and calling block_status a few times.

And this also cleans up at the end of the test so failure
will not affect the next test.

The helper can live in the iotest module instead of inventing it for
every new test.

>
>      def test_default_settings(self):
>          self.server_start()
>          self.export_add('r'))
>          self.export_add('w', writable=True)
>          self.check_multi_conn('r', True)
>          self.check_multi_conn('w', False)

With the helper suggested, this test will be:

    with open_nbd(nbd_uri, "r") as h:
        self.assertTrue(h.can_multi_conn())

    with open_nbd(nbd_uri, "w") as h:
        self.assertFalse(h.can_multi_conn())

Now you don't need to check what check_multicon() is doing when
reading the tests, and it is very clear what open_nbd() does based
on the name and usage as context manager.

>
>      def test_multiconn_option(self):
>          self.server_start()
>          self.export_add('r', multi_conn='off')
>          self.export_add('w', writable=True, multi_conn='on')

It will be more natural to use:

    self.start_server()
    self.add_export(...)

In C the other way is more natural because you don't have namespaces
or classes.

>          self.check_multi_conn('r', False)
>          self.check_multi_conn('w', True)

And I think you agree since you did not use:

    self.multi_con_check(...)

>
>      def test_limited_connections(self):
>          self.server_start(max_connections=1)
>          self.export_add('r', multi_conn='on')
>          self.export_add('w', writable=True, multi_conn='on')
>          self.check_multi_conn('r', False)
>          self.check_multi_conn('w', False)
>
>      def test_parallel_writes(self):
>          self.server_start()
>          self.export_add('w', writable=True, multi_conn='on')
>
>          clients = [nbd.NBD() for _ in range(3)]
>          for c in clients:
>              c.connect_uri(nbd_uri.format('w'))
>              assert c.can_multi_conn()
>
>          buf1 = clients[0].pread(1024 * 1024, 0)
>          self.assertEqual(buf1, b"\x01" * 1024 * 1024)
>
>          buf2 = b"\x03" * 1024 * 1024
>          clients[1].pwrite(buf2, 0)
>          clients[2].flush()
>          buf1 = clients[0].pread(1024 * 1024, 0)
>
>          self.assertEqual(buf1, buf2)
>
>          for i in range(3):
>              clients[i].shutdown()
>
>
> if __name__ == '__main__':
>      iotests.main(supported_fmts=['qcow2'])
>
> > But the shell code for
> > _launch_qemu/_send_qemu_cmd was already pretty nice for setting up a
> > long-running background qemu process where I can pass in QMP commands
> > at will, then interact from other processes.
> >
> >>> +export nbd_unix_socket
> >>> +
> >>> +echo
> >>> +echo "=== Default nbd server settings ==="
> >>> +echo
> >>> +# Default allows for unlimited connections, readonly images advertise
> >>> +# multi-conn, and writable images do not
> >>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
> >>> +  "arguments":{"addr":{"type":"unix",
> >>> +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
> >>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> >>> +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
> >>> +    "name":"r"}}' "return"
> >>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> >>> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> >>> +    "name":"w", "writable":true}}' "return"
> >>> +nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
> >>> +assert h.can_multi_conn()
> >>> +h.shutdown()
> >>> +print("nbdsh passed")'
> >>> +nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
> >>> +assert not h.can_multi_conn()
> >>> +h.shutdown()
> >>> +print("nbdsh passed")'
> >>>
> >>
> >> Mixing of shell and python is very confusing. Wouldn't it be much cleaner
> >> to write the test in python?
> >
> > Here, nbdsh -c 'python snippet' is used as a shell command line
> > parameter.  Writing python code to call out to a system() command
> > where one of the arguments to that command is a python script snippet
> > is going to be just as annoying as writing it in bash.
> >
> >>> +echo
> >>> +echo "=== Demonstrate parallel writers ==="
> >>> +echo
> >>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
> >>> +  "arguments":{"addr":{"type":"unix",
> >>> +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
> >>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
> >>> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> >>> +    "name":"", "multi-conn":"on", "writable":true}}' "return"
> >>> +
> >>> +nbdsh -c '
> >>> +import os
> >>> +sock = os.getenv("nbd_unix_socket")
> >>> +h = []
> >>> +
> >>> +for i in range(3):
> >>> +  h.append(nbd.NBD())
> >>> +  h[i].connect_unix(sock)
> >>> +  assert h[i].can_multi_conn()
> >>>
> >>
> >> This is somewhat C in python, maybe:
> >>
> >>      handles = [nbd.NBD() for _ in range(3)]
> >>
> >>      for h in handles:
> >>          h.connect_unix(sock)
> >>          assert h.can_multi_con()
> >>
> >> Since assert will abort the test, and we don't handle
> >> exceptions, failure will not shutdown the connections
> >> but it should not matter for the purpose of a test.
> >
> > As I said, I'm open to python suggestions :)  I like your approach.
> >
> >>
> >>
> >>> +
> >>> +buf1 = h[0].pread(1024 * 1024, 0)
> >>> +if buf1 != b"\x01" * 1024 * 1024:
> >>> +  print("Unexpected initial read")
> >>>
> >>
> >> Not clear when we initialize the buffer to \x01 - is this the qemu-io above?
> >
> > Yes, when the qcow2 file was initially created.
> >
> >>
> >>
> >>> +buf2 = b"\x03" * 1024 * 1024
> >>> +h[1].pwrite(buf2, 0)
> >>> +h[2].flush()
> >>> +buf1 = h[0].pread(1024 * 1024, 0)
> >>> +if buf1 == buf2:
> >>> +  print("Flush appears to be consistent across connections")
> >>>
> >>
> >> buf1 was the initial content, buf2 is the new content, but now we override
> >> but1 to check that the right was flushed. It will be be better to use
> >> different
> >> names like inittial_data, updated_data, current_data.
> >
> > Can do.
> >
> >>
> >> This block is the most important part of the test, showing that we can write
> >> in one connection, flush in the second, and read the written block in the
> >> third.
> >> Maybe add a comment about this? I think it will help someone else trying
> >> to understand what this part is trying to test.
> >
> > Can do.
> >
> >>> +{"execute":"block-export-add",
> >>> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
> >>> +    "name":"", "multi-conn":"on", "writable":true}}
> >>> +{"return": {}}
> >>> +Flush appears to be consistent across connections
> >>> +{"execute":"block-export-del",
> >>> +  "arguments":{"id":"w"}}
> >>> +{"return": {}}
> >>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> >>> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
> >>> +{"execute":"nbd-server-stop"}
> >>> +{"return": {}}
> >>> +{"execute":"quit"}
> >>> +{"return": {}}
> >>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> >>> "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
> >>>
> >>
> >> Nothing about the contents here says anything about the actual test
> >> except the "Flush appears..." line.
> >
> > Yeah, it's a lot of QMP debugging (to show what commands were run in
> > setting up the server), and less verbose in the nbdsh side.  Do I need
> > to add more output during the nbdsh that uses multiple connections?
> >
> >>
> >>
> >>> +*** done
> >>> --
> >>> 2.35.1
> >>>
> >>
> >> Looks good, feel free to ignore the style comments and suggestion to rewrite
> >> the test in python.
> >
> > The style comments are nice, the rewriting is going to be harder for
> > me (but I'll accept help).  At any rate, getting qemu-nbd to be
> > feature-compatible may require a v3 anyway.
> >
>
>
> --
> Best regards,
> Vladimir
>


Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Vladimir Sementsov-Ogievskiy 2 years, 2 months ago
16.02.2022 20:14, Nir Soffer wrote:
> On Wed, Feb 16, 2022 at 10:08 AM Vladimir Sementsov-Ogievskiy
> <vsementsov@virtuozzo.com> wrote:
>>
>> 16.02.2022 02:24, Eric Blake wrote:
>>> On Tue, Feb 15, 2022 at 09:23:36PM +0200, Nir Soffer wrote:
>>>> On Tue, Feb 15, 2022 at 7:22 PM Eric Blake <eblake@redhat.com> wrote:
>>>>
>>>>> According to the NBD spec, a server advertising
>>>>> NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
>>>>> not see any cache inconsistencies: when properly separated by a single
>>>>> flush, actions performed by one client will be visible to another
>>>>> client, regardless of which client did the flush.  We satisfy these
>>>>> conditions in qemu when our block layer is backed by the local
>>>>> filesystem (by virtue of the semantics of fdatasync(), and the fact
>>>>> that qemu itself is not buffering writes beyond flushes).  It is
>>>>> harder to state whether we satisfy these conditions for network-based
>>>>> protocols, so the safest course of action is to allow users to opt-in
>>>>> to advertising multi-conn.  We may later tweak defaults to advertise
>>>>> by default when the block layer can confirm that the underlying
>>>>> protocol driver is cache consistent between multiple writers, but for
>>>>> now, this at least allows savvy users (such as virt-v2v or nbdcopy) to
>>>>> explicitly start qemu-nbd or qemu-storage-daemon with multi-conn
>>>>> advertisement in a known-safe setup where the client end can then
>>>>> benefit from parallel clients.
>>>>>
>>>>
>>>> It makes sense, and will be used by oVirt. Actually we are already using
>>>> multiple connections for writing about 2 years, based on your promise
>>>> that if every client writes to district  areas this is safe.
>>>
>>> I presume s/district/distinct/, but yes, I'm glad we're finally trying
>>> to make the code match existing practice ;)
>>>
>>>>> +++ b/docs/tools/qemu-nbd.rst
>>>>> @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified.
>>>>>    .. option:: -e, --shared=NUM
>>>>>
>>>>>      Allow up to *NUM* clients to share the device (default
>>>>> -  ``1``), 0 for unlimited. Safe for readers, but for now,
>>>>> -  consistency is not guaranteed between multiple writers.
>>>>> +  ``1``), 0 for unlimited.
>>>>>
>>>>
>>>> Removing the note means that now consistency is guaranteed between
>>>> multiple writers, no?
>>>>
>>>> Or maybe we want to mention here that consistency depends on the protocol
>>>> and users can opt in, or refer to the section where this is discussed?
>>>
>>> Yeah, a link to the QAPI docs where multi-conn is documented might be
>>> nice, except I'm not sure the best way to do that in our sphinx
>>> documentation setup.
>>>
>>>>> +##
>>>>> +# @NbdExportMultiConn:
>>>>> +#
>>>>> +# Possible settings for advertising NBD multiple client support.
>>>>> +#
>>>>> +# @off: Do not advertise multiple clients.
>>>>> +#
>>>>> +# @on: Allow multiple clients (for writable clients, this is only safe
>>>>> +#      if the underlying BDS is cache-consistent, such as when backed
>>>>> +#      by the raw file driver); ignored if the NBD server was set up
>>>>> +#      with max-connections of 1.
>>>>> +#
>>>>> +# @auto: Behaves like @off if the export is writable, and @on if the
>>>>> +#        export is read-only.
>>>>> +#
>>>>> +# Since: 7.0
>>>>> +##
>>>>> +{ 'enum': 'NbdExportMultiConn',
>>>>> +  'data': ['off', 'on', 'auto'] }
>>>>>
>>>>
>>>> Are we going to have --multi-con=(on|off|auto)?
>>>
>>> Oh. The QMP command (which is immediately visible through
>>> nbd-server-add/block-storage-add to qemu and qemu-storage-daemon)
>>> gains "multi-conn":"on", but you may be right that qemu-nbd would want
>>> a command line option (either that, or we accellerate our plans that
>>> qsd should replace qemu-nbd).
>>>
>>>>> +++ b/blockdev-nbd.c
>>>>> @@ -44,6 +44,11 @@ bool nbd_server_is_running(void)
>>>>>        return nbd_server || is_qemu_nbd;
>>>>>    }
>>>>>
>>>>> +int nbd_server_max_connections(void)
>>>>> +{
>>>>> +    return nbd_server ? nbd_server->max_connections : -1;
>>>>> +}
>>>>>
>>>>
>>>> -1 is a little bit strange for a limit, maybe 1 is a better default when
>>>> we nbd_server == NULL? When can this happen?
>>>
>>> In qemu, if you haven't used the QMP command 'nbd-server-start' yet.
>>> In qemu-nbd, always (per the nbd_server_is_running function just
>>> above).  My iotest only covered the qemu/qsd side, not the qemu-nbd
>>> side, so it looks like I need a v3...
>>>
>>>>> +++ b/nbd/server.c
>>>
>>>>> +    /*
>>>>> +     * Determine whether to advertise multi-conn.  Default is auto,
>>>>> +     * which resolves to on for read-only and off for writable.  But
>>>>> +     * if the server has max-connections 1, that forces the flag off.
>>>>>
>>>>
>>>> Looks good, this can be enabled automatically based on the driver
>>>> if we want, so users using auto will can upgrade to multi-con automatically.
>>>
>>> Yes, that's part of why I made it a tri-state with a default of 'auto'.
>>>
>>>>
>>>>
>>>>> +     */
>>>>> +    if (!arg->has_multi_conn) {
>>>>> +        arg->multi_conn = NBD_EXPORT_MULTI_CONN_AUTO;
>>>>> +    }
>>>>> +    if (nbd_server_max_connections() == 1) {
>>>>
>>>> +        arg->multi_conn = NBD_EXPORT_MULTI_CONN_OFF;
>>>>> +    }
>>>>
>>>> +    if (arg->multi_conn == NBD_EXPORT_MULTI_CONN_AUTO) {
>>>>> +        multi_conn = readonly;
>>>>> +    } else {
>>>>> +        multi_conn = arg->multi_conn == NBD_EXPORT_MULTI_CONN_ON;
>>>>> +    }
>>>>>
>>>>
>>>> This part is a little bit confusing - we do:
>>>> - initialize args->multi_con if it has not value
>>>> - set the temporary multi_con based now initialized args->multi_con
>>>>
>>>> I think it will be nicer to separate arguments parsing, so there is no need
>>>> to initialize it here or have arg->has_multi_conn, but I did not check how
>>>> other arguments are handled.
>>>
>>> arg->has_multi_conn is fallout from the fact that our QAPI must remain
>>> back-compat. If it is false, the user did not pass "multi-conn":...,
>>> so we want the default value of "auto".  Once we've established the
>>> default, then we force multi-conn off if we know we are limited to one
>>> client (although as you pointed out, nbd_server_max_connections()
>>> needs to be tested with qemu-nbd).  Then, it's easier to resolve the
>>> tri-state enum variable into the bool of what we actually advertise to
>>> the NBD client.
>>>
>>>>> +++ b/tests/qemu-iotests/tests/nbd-multiconn
>>>>> @@ -0,0 +1,188 @@
>>>>> +#!/usr/bin/env bash
>>>>> +# group: rw auto quick
>>>>> +#
>>>>> +# Test that qemu-nbd MULTI_CONN works
>>>>> +#
>>>>> +echo
>>>>> +echo "=== Initial image setup ==="
>>>>> +echo
>>>>> +
>>>>> +_make_test_img 4M
>>>>> +$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | _filter_qemu_io
>>>>> +_launch_qemu 2> >(_filter_nbd)
>>>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return"
>>>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
>>>>> +  "arguments":{"driver":"qcow2", "node-name":"n",
>>>>> +    "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"
>>>
>>> I'm not the best at writing python iotests; I welcome a language
>>> translation of this aspect.
>>
>>
>>
>> Let me try:)
> 
> Thanks! This is much nicer and will be easier to maintain.
> 
>>
>>
>> #!/usr/bin/env python3
>>
>> import os
>> import iotests
>> import nbd
>> from iotests import qemu_img_create, qemu_io_silent
>>
>>
>> disk = os.path.join(iotests.test_dir, 'disk')
>> size = '4M'
>> nbd_sock = os.path.join(iotests.test_dir, 'nbd_sock')
>> nbd_uri = 'nbd+unix:///{}?socket=' + nbd_sock
>>
>>
>> class TestNbdMulticonn(iotests.QMPTestCase):
>>       def setUp(self):
>>           assert qemu_img_create('-f', iotests.imgfmt, disk, size) == 0
>>           assert qemu_io_silent('-c', 'w -P 1 0 2M',
>>                                 '-c', 'w -P 2 2M 2M', disk) == 0
>>
>>           self.vm = iotests.VM()
>>           self.vm.launch()
>>           result = self.vm.qmp('blockdev-add', {
>>               'driver': 'qcow2',
>>               'node-name': 'n',
>>               'file': {'driver': 'file', 'filename': disk}
>>           })
>>           self.assert_qmp(result, 'return', {})
>>
>>       def tearDown(self):
>>           self.vm.shutdown()
>>           os.remove(disk)
>>           os.remove(nbd_sock)
>>
>>       def server_start(self, max_connections=None):
>>           args = {
>>               'addr': {
>>                   'type': 'unix',
>>                   'data': {'path': nbd_sock}
>>               }
>>           }
>>           if max_connections is not None:
>>               args['max-connections'] = max_connections
>>
>>           result = self.vm.qmp('nbd-server-start', args)
>>           self.assert_qmp(result, 'return', {})
>>
>>       def export_add(self, name, writable=None, multi_conn=None):
>>           args = {
>>               'type': 'nbd',
>>               'id': name,
>>               'node-name': 'n',
>>               'name': name,
>>           }
>>           if writable is not None:
>>               args['writable'] = writable
>>           if multi_conn is not None:
>>               args['multi-conn'] = multi_conn
>>
>>           result = self.vm.qmp('block-export-add', args)
>>           self.assert_qmp(result, 'return', {})
>>
>>       def check_multi_conn(self, export_name, multi_conn):
>>           cl = nbd.NBD()
>>           cl.connect_uri(nbd_uri.format(export_name))
>>           self.assertEqual(cl.can_multi_conn(), multi_conn)
>>           cl.shutdown()
> 
> The test will be more clear and the code more reusable if the helper
> was doing only connect/disconnect.
> 
> @contextmanager
> def open_nbd(nbd_uri, export_name):
>       h = nbd.NBD()
>       h.connect_uri(nbd_uri.format(export_name))
>       try:
>           yield h
>       finally:
>           h.shutdown()
> 
> Any test that need nbd handle can do:
> 
>      with open_nbd(nbd_uri, export_name) as h:
>          use the handle...
> 
> Good example when we can use this is the block status cache test,
> using complicated qemu-img commands instead of opening
> a client and calling block_status a few times.
> 
> And this also cleans up at the end of the test so failure
> will not affect the next test.
> 
> The helper can live in the iotest module instead of inventing it for
> every new test.

Yes that sounds good.

> 
>>
>>       def test_default_settings(self):
>>           self.server_start()
>>           self.export_add('r'))
>>           self.export_add('w', writable=True)
>>           self.check_multi_conn('r', True)
>>           self.check_multi_conn('w', False)
> 
> With the helper suggested, this test will be:
> 
>      with open_nbd(nbd_uri, "r") as h:
>          self.assertTrue(h.can_multi_conn())
> 
>      with open_nbd(nbd_uri, "w") as h:
>          self.assertFalse(h.can_multi_conn())
> 
> Now you don't need to check what check_multicon() is doing when
> reading the tests, and it is very clear what open_nbd() does based
> on the name and usage as context manager.

Agree

> 
>>
>>       def test_multiconn_option(self):
>>           self.server_start()
>>           self.export_add('r', multi_conn='off')
>>           self.export_add('w', writable=True, multi_conn='on')
> 
> It will be more natural to use:
> 
>      self.start_server()
>      self.add_export(...)
> 
> In C the other way is more natural because you don't have namespaces
> or classes.
> 
>>           self.check_multi_conn('r', False)
>>           self.check_multi_conn('w', True)
> 
> And I think you agree since you did not use:
> 
>      self.multi_con_check(...)
> 

Yes)

>>
>>       def test_limited_connections(self):
>>           self.server_start(max_connections=1)
>>           self.export_add('r', multi_conn='on')
>>           self.export_add('w', writable=True, multi_conn='on')
>>           self.check_multi_conn('r', False)
>>           self.check_multi_conn('w', False)
>>
>>       def test_parallel_writes(self):
>>           self.server_start()
>>           self.export_add('w', writable=True, multi_conn='on')
>>
>>           clients = [nbd.NBD() for _ in range(3)]
>>           for c in clients:
>>               c.connect_uri(nbd_uri.format('w'))
>>               assert c.can_multi_conn()
>>
>>           buf1 = clients[0].pread(1024 * 1024, 0)
>>           self.assertEqual(buf1, b"\x01" * 1024 * 1024)
>>
>>           buf2 = b"\x03" * 1024 * 1024
>>           clients[1].pwrite(buf2, 0)
>>           clients[2].flush()
>>           buf1 = clients[0].pread(1024 * 1024, 0)
>>
>>           self.assertEqual(buf1, buf2)
>>
>>           for i in range(3):
>>               clients[i].shutdown()
>>
>>
>> if __name__ == '__main__':
>>       iotests.main(supported_fmts=['qcow2'])
>>
>>> But the shell code for
>>> _launch_qemu/_send_qemu_cmd was already pretty nice for setting up a
>>> long-running background qemu process where I can pass in QMP commands
>>> at will, then interact from other processes.
>>>
>>>>> +export nbd_unix_socket
>>>>> +
>>>>> +echo
>>>>> +echo "=== Default nbd server settings ==="
>>>>> +echo
>>>>> +# Default allows for unlimited connections, readonly images advertise
>>>>> +# multi-conn, and writable images do not
>>>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
>>>>> +  "arguments":{"addr":{"type":"unix",
>>>>> +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
>>>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
>>>>> +  "arguments":{"type":"nbd", "id":"r", "node-name":"n",
>>>>> +    "name":"r"}}' "return"
>>>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
>>>>> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
>>>>> +    "name":"w", "writable":true}}' "return"
>>>>> +nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
>>>>> +assert h.can_multi_conn()
>>>>> +h.shutdown()
>>>>> +print("nbdsh passed")'
>>>>> +nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
>>>>> +assert not h.can_multi_conn()
>>>>> +h.shutdown()
>>>>> +print("nbdsh passed")'
>>>>>
>>>>
>>>> Mixing of shell and python is very confusing. Wouldn't it be much cleaner
>>>> to write the test in python?
>>>
>>> Here, nbdsh -c 'python snippet' is used as a shell command line
>>> parameter.  Writing python code to call out to a system() command
>>> where one of the arguments to that command is a python script snippet
>>> is going to be just as annoying as writing it in bash.
>>>
>>>>> +echo
>>>>> +echo "=== Demonstrate parallel writers ==="
>>>>> +echo
>>>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
>>>>> +  "arguments":{"addr":{"type":"unix",
>>>>> +    "data":{"path":"'"$nbd_unix_socket"'"}}}}' "return"
>>>>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-export-add",
>>>>> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
>>>>> +    "name":"", "multi-conn":"on", "writable":true}}' "return"
>>>>> +
>>>>> +nbdsh -c '
>>>>> +import os
>>>>> +sock = os.getenv("nbd_unix_socket")
>>>>> +h = []
>>>>> +
>>>>> +for i in range(3):
>>>>> +  h.append(nbd.NBD())
>>>>> +  h[i].connect_unix(sock)
>>>>> +  assert h[i].can_multi_conn()
>>>>>
>>>>
>>>> This is somewhat C in python, maybe:
>>>>
>>>>       handles = [nbd.NBD() for _ in range(3)]
>>>>
>>>>       for h in handles:
>>>>           h.connect_unix(sock)
>>>>           assert h.can_multi_con()
>>>>
>>>> Since assert will abort the test, and we don't handle
>>>> exceptions, failure will not shutdown the connections
>>>> but it should not matter for the purpose of a test.
>>>
>>> As I said, I'm open to python suggestions :)  I like your approach.
>>>
>>>>
>>>>
>>>>> +
>>>>> +buf1 = h[0].pread(1024 * 1024, 0)
>>>>> +if buf1 != b"\x01" * 1024 * 1024:
>>>>> +  print("Unexpected initial read")
>>>>>
>>>>
>>>> Not clear when we initialize the buffer to \x01 - is this the qemu-io above?
>>>
>>> Yes, when the qcow2 file was initially created.
>>>
>>>>
>>>>
>>>>> +buf2 = b"\x03" * 1024 * 1024
>>>>> +h[1].pwrite(buf2, 0)
>>>>> +h[2].flush()
>>>>> +buf1 = h[0].pread(1024 * 1024, 0)
>>>>> +if buf1 == buf2:
>>>>> +  print("Flush appears to be consistent across connections")
>>>>>
>>>>
>>>> buf1 was the initial content, buf2 is the new content, but now we override
>>>> but1 to check that the right was flushed. It will be be better to use
>>>> different
>>>> names like inittial_data, updated_data, current_data.
>>>
>>> Can do.
>>>
>>>>
>>>> This block is the most important part of the test, showing that we can write
>>>> in one connection, flush in the second, and read the written block in the
>>>> third.
>>>> Maybe add a comment about this? I think it will help someone else trying
>>>> to understand what this part is trying to test.
>>>
>>> Can do.
>>>
>>>>> +{"execute":"block-export-add",
>>>>> +  "arguments":{"type":"nbd", "id":"w", "node-name":"n",
>>>>> +    "name":"", "multi-conn":"on", "writable":true}}
>>>>> +{"return": {}}
>>>>> +Flush appears to be consistent across connections
>>>>> +{"execute":"block-export-del",
>>>>> +  "arguments":{"id":"w"}}
>>>>> +{"return": {}}
>>>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
>>>>> "event": "BLOCK_EXPORT_DELETED", "data": {"id": "w"}}
>>>>> +{"execute":"nbd-server-stop"}
>>>>> +{"return": {}}
>>>>> +{"execute":"quit"}
>>>>> +{"return": {}}
>>>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
>>>>> "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
>>>>>
>>>>
>>>> Nothing about the contents here says anything about the actual test
>>>> except the "Flush appears..." line.
>>>
>>> Yeah, it's a lot of QMP debugging (to show what commands were run in
>>> setting up the server), and less verbose in the nbdsh side.  Do I need
>>> to add more output during the nbdsh that uses multiple connections?
>>>
>>>>
>>>>
>>>>> +*** done
>>>>> --
>>>>> 2.35.1
>>>>>
>>>>
>>>> Looks good, feel free to ignore the style comments and suggestion to rewrite
>>>> the test in python.
>>>
>>> The style comments are nice, the rewriting is going to be harder for
>>> me (but I'll accept help).  At any rate, getting qemu-nbd to be
>>> feature-compatible may require a v3 anyway.
>>>
>>
>>
>> --
>> Best regards,
>> Vladimir
>>
> 


-- 
Best regards,
Vladimir

Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Eric Blake 2 years, 1 month ago
On Wed, Feb 16, 2022 at 07:14:58PM +0200, Nir Soffer wrote:
> > > I'm not the best at writing python iotests; I welcome a language
> > > translation of this aspect.
> >
> >
> >
> > Let me try:)
> 
> Thanks! This is much nicer and will be easier to maintain.
> 
> >
> >
> > #!/usr/bin/env python3
> >
> > import os
> > import iotests
> > import nbd

What happens here if libnbd is not installed?  Is there a way to make
the test gracefully skip instead of error out?

> >      def check_multi_conn(self, export_name, multi_conn):
> >          cl = nbd.NBD()
> >          cl.connect_uri(nbd_uri.format(export_name))
> >          self.assertEqual(cl.can_multi_conn(), multi_conn)
> >          cl.shutdown()
> 
> The test will be more clear and the code more reusable if the helper
> was doing only connect/disconnect.
> 
> @contextmanager
> def open_nbd(nbd_uri, export_name):
>      h = nbd.NBD()
>      h.connect_uri(nbd_uri.format(export_name))

This can throw an exception if the connect fails; should it be inside
the try?  I'm comparing to libnbd/python/t/220-opt-list.py, which also
sets up a context manager.

>      try:
>          yield h
>      finally:
>          h.shutdown()
> 
> Any test that need nbd handle can do:
> 
>     with open_nbd(nbd_uri, export_name) as h:
>         use the handle...
> 
> Good example when we can use this is the block status cache test,
> using complicated qemu-img commands instead of opening
> a client and calling block_status a few times.
> 
> And this also cleans up at the end of the test so failure
> will not affect the next test.
> 
> The helper can live in the iotest module instead of inventing it for
> every new test.

Moving it into iotest makes the question about what to do if libnbd is
not installed even more important; while we could perhaps catch and
deal with a failed 'import' for this test, skipping the iotest module
due to a failed import has knock-on effects to all other iotests even
when they don't care if libnbd is installed.

> 
> >
> >      def test_default_settings(self):
> >          self.server_start()
> >          self.export_add('r'))
> >          self.export_add('w', writable=True)
> >          self.check_multi_conn('r', True)
> >          self.check_multi_conn('w', False)
> 
> With the helper suggested, this test will be:
> 
>     with open_nbd(nbd_uri, "r") as h:
>         self.assertTrue(h.can_multi_conn())
> 
>     with open_nbd(nbd_uri, "w") as h:
>         self.assertFalse(h.can_multi_conn())
> 
> Now you don't need to check what check_multicon() is doing when
> reading the tests, and it is very clear what open_nbd() does based
> on the name and usage as context manager.

Yes, I like that aspect of a context manager.

> 
> >
> >      def test_multiconn_option(self):
> >          self.server_start()
> >          self.export_add('r', multi_conn='off')
> >          self.export_add('w', writable=True, multi_conn='on')
> 
> It will be more natural to use:
> 
>     self.start_server()
>     self.add_export(...)
> 
> In C the other way is more natural because you don't have namespaces
> or classes.
> 
> >          self.check_multi_conn('r', False)
> >          self.check_multi_conn('w', True)
> 
> And I think you agree since you did not use:
> 
>     self.multi_con_check(...)
>

Useful naming advice.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Eric Blake 2 years, 1 month ago
On Wed, Feb 16, 2022 at 11:08:06AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 16.02.2022 02:24, Eric Blake wrote:

> > > > +++ b/tests/qemu-iotests/tests/nbd-multiconn
> > > > @@ -0,0 +1,188 @@
> > > > +#!/usr/bin/env bash
> > > > +# group: rw auto quick
> > > > +#
> > > > +# Test that qemu-nbd MULTI_CONN works
> > > > +#
> > > > +echo
> > > > +echo "=== Initial image setup ==="
> > > > +echo
> > > > +
> > > > +_make_test_img 4M
> > > > +$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | _filter_qemu_io
> > > > +_launch_qemu 2> >(_filter_nbd)
> > > > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return"
> > > > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
> > > > +  "arguments":{"driver":"qcow2", "node-name":"n",
> > > > +    "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"
> > 
> > I'm not the best at writing python iotests; I welcome a language
> > translation of this aspect.
> 
> 
> 
> Let me try:)
> 
> 
> #!/usr/bin/env python3
> 
> import os
> import iotests
> import nbd
> from iotests import qemu_img_create, qemu_io_silent
> 
> 
> disk = os.path.join(iotests.test_dir, 'disk')
> size = '4M'
> nbd_sock = os.path.join(iotests.test_dir, 'nbd_sock')
> nbd_uri = 'nbd+unix:///{}?socket=' + nbd_sock

...

Thanks; I'm playing with this (and the improvements suggested in
followup messages) in preparation for a v3 posting.

> > > > +nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c '
> > > > +assert h.can_multi_conn()
> > > > +h.shutdown()
> > > > +print("nbdsh passed")'
> > > > +nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c '
> > > > +assert not h.can_multi_conn()
> > > > +h.shutdown()
> > > > +print("nbdsh passed")'
> > > > 
> > > 
> > > Mixing of shell and python is very confusing. Wouldn't it be much cleaner
> > > to write the test in python?
> > 
> > Here, nbdsh -c 'python snippet' is used as a shell command line
> > parameter.  Writing python code to call out to a system() command
> > where one of the arguments to that command is a python script snippet
> > is going to be just as annoying as writing it in bash.

Then again, since libnbd already includes python bindings, we wouldn't
have to detour through nbdsh, but just use the python bindings
directly (and I see that your translation did that).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Richard W.M. Jones 2 years, 2 months ago
On Tue, Feb 15, 2022 at 05:24:14PM -0600, Eric Blake wrote:
> Oh. The QMP command (which is immediately visible through
> nbd-server-add/block-storage-add to qemu and qemu-storage-daemon)
> gains "multi-conn":"on", but you may be right that qemu-nbd would want
> a command line option (either that, or we accellerate our plans that
> qsd should replace qemu-nbd).

I really hope there will always be something called "qemu-nbd"
that acts like qemu-nbd.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/


Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Posted by Nir Soffer 2 years, 2 months ago
On Wed, Feb 16, 2022 at 12:13 PM Richard W.M. Jones <rjones@redhat.com>
wrote:

> On Tue, Feb 15, 2022 at 05:24:14PM -0600, Eric Blake wrote:
> > Oh. The QMP command (which is immediately visible through
> > nbd-server-add/block-storage-add to qemu and qemu-storage-daemon)
> > gains "multi-conn":"on", but you may be right that qemu-nbd would want
> > a command line option (either that, or we accellerate our plans that
> > qsd should replace qemu-nbd).
>
> I really hope there will always be something called "qemu-nbd"
> that acts like qemu-nbd.
>

I share this hope. Most projects I work on are based on qemu-nbd.

However in oVirt use case, we want to provide an NBD socket for clients to
allow direct
access to disks. One of the issues we need to solve for this is having a
way to tell if the
qemu-nbd is active, so we can terminate idle transfers.

The way we do this with the ovirt-imageio server is to query the status of
the transfer, and
use the idle time (time since last request) and active status (has inflight
requests) to detect
a stale transfer that should be terminated. An example use case is a
process on a remote
host that started an image transfer, and killed or crashed in the middle of
the transfer
without cleaning up properly.

To be more specific, every request to the imageio server (read, write,
flush, zero, options)
updates a timestamp in the transfer state. When we get the status we report
the time since
that timestamp was updated.

Additionally we keep and report the number of inflight requests, so we can
tell the case when
requests are blocked on inaccessible storage (e.g. non responsive NFS).

We don't have a way to do this with qemu-nbd, but I guess that using
qemu-storage-daemon
when we have qmp access will make such monitoring possible.

Nir