From: Claudio Fontana
To: Daniel P . Berrangé
Subject: [libvirt RFCv11 15/33] multifd-helper: new helper for parallel save/restore
Date: Tue, 7 Jun 2022 11:19:18 +0200
Message-Id: <20220607091936.7948-16-cfontana@suse.de>
In-Reply-To: <20220607091936.7948-1-cfontana@suse.de>
References: <20220607091936.7948-1-cfontana@suse.de>
Cc: libvir-list@redhat.com, Claudio Fontana, "Dr . David Alan Gilbert", Anthony Iliopoulos

For the save direction, this helper listens on a unix socket
which QEMU connects to for multifd migration to a file.

For the restore direction, this helper connects to a unix socket
QEMU listens at for multifd migration from a file.

The file descriptor is passed as a command line parameter,
and interleaved channels are used to allow reading/writing
to different parts of the file depending on the channel used.

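To illustrate the interleaving described above, here is a minimal sketch
of how a channel index could map to file offsets. It is not taken from
this patch: the actual layout is decided by virFileDiskCopyChannel(),
which is not part of this message. The round-robin block placement, the
block size SKETCH_BLOCK_SIZE, and the names sketch_clia_size() and
sketch_channel_offset() are assumptions made for illustration only. The
parts grounded in the patch are that each channel gets one 8-byte length
slot in the Channel Length Indicators Area (CLIA) and that this area is
rounded up to an alignment boundary before the data starts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block size; the value used by the real helper may differ. */
#define SKETCH_BLOCK_SIZE ((int64_t)1024 * 1024)

/*
 * Size of the per-channel length area: one 8-byte slot per channel,
 * rounded up to 'align' (assumed here to be a power of two).
 */
static size_t
sketch_clia_size(int nchannels, size_t align)
{
    assert(nchannels > 0 && align > 0 && (align & (align - 1)) == 0);
    return ((size_t)nchannels * 8 + align - 1) & ~(align - 1);
}

/*
 * File offset of the k-th block handled by channel 'idx', assuming blocks
 * are assigned round-robin to the channels starting at 'data_off'.
 */
static int64_t
sketch_channel_offset(int64_t data_off, int idx, int nchannels, int64_t k)
{
    assert(nchannels > 0 && idx >= 0 && idx < nchannels && k >= 0);
    return data_off + (k * nchannels + idx) * SKETCH_BLOCK_SIZE;
}
```

Under these assumptions, with four channels and data starting at offset
4096, channel 3 would place its third block (k = 2) at
4096 + (2 * 4 + 3) * 1 MiB, so no two channels ever touch the same
region of the file.
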
Signed-off-by: Claudio Fontana
---
 po/POTFILES               |   1 +
 src/util/meson.build      |  16 ++
 src/util/multifd-helper.c | 359 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 376 insertions(+)
 create mode 100644 src/util/multifd-helper.c

diff --git a/po/POTFILES b/po/POTFILES
index faaba53c8f..97ecbb0ead 100644
--- a/po/POTFILES
+++ b/po/POTFILES
@@ -241,6 +241,7 @@ src/storage_file/storage_source.c
 src/storage_file/storage_source_backingstore.c
 src/test/test_driver.c
 src/util/iohelper.c
+src/util/multifd-helper.c
 src/util/viralloc.c
 src/util/virarptable.c
 src/util/viraudit.c
diff --git a/src/util/meson.build b/src/util/meson.build
index 07ae94631c..2e08ed8745 100644
--- a/src/util/meson.build
+++ b/src/util/meson.build
@@ -179,6 +179,11 @@ io_helper_sources = [
   'virfile.c',
 ]
 
+multifd_helper_sources = [
+  'multifd-helper.c',
+  'virfile.c',
+]
+
 virt_util_lib = static_library(
   'virt_util',
   [
@@ -220,6 +225,17 @@ if conf.has('WITH_LIBVIRTD')
       libutil_dep,
     ],
   }
+  virt_helpers += {
+    'name': 'libvirt_multifd_helper',
+    'sources': [
+      files(multifd_helper_sources),
+      dtrace_gen_headers,
+    ],
+    'deps': [
+      acl_dep,
+      libutil_dep,
+    ],
+  }
 endif
 
 util_inc_dir = include_directories('.')
diff --git a/src/util/multifd-helper.c b/src/util/multifd-helper.c
new file mode 100644
index 0000000000..6d26e2210a
--- /dev/null
+++ b/src/util/multifd-helper.c
@@ -0,0 +1,359 @@
+/*
+ * multifd-helper.c: listens on Unix socket to perform I/O to multiple files
+ *
+ * Copyright (C) 2022 SUSE LLC
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library.  If not, see
+ * .
+ *
+ * This has been written to support QEMU multifd migration to file,
+ * allowing better use of cpu resources to speed up the save/restore.
+ */
+
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "virfile.h"
+#include "virerror.h"
+#include "virstring.h"
+#include "virgettext.h"
+
+#define VIR_FROM_THIS VIR_FROM_STORAGE
+
+typedef struct _multiFdConnData multiFdConnData;
+struct _multiFdConnData {
+    int idx;
+    int nchannels;
+    int clientfd;
+    int filefd;
+    int oflags;
+    const char *sun_path;
+    const char *disk_path;
+    off_t total;
+
+    GThread *tid;
+};
+
+typedef struct _multiFdThreadArgs multiFdThreadArgs;
+struct _multiFdThreadArgs {
+    int nchannels;
+    multiFdConnData *conn;     /* contains main fd + nchannels */
+    const char *sun_path;      /* unix socket name to use for the server */
+    const char *disk_path;     /* disk pathname */
+    struct sockaddr_un serv_addr;
+
+    off_t total;
+};
+
+static gpointer clientThreadFunc(void *a)
+{
+    multiFdConnData *c = a;
+    c->total = virFileDiskCopyChannel(c->filefd, c->disk_path, c->clientfd, c->sun_path,
+                                      c->idx, c->nchannels, c->total);
+    return &c->total;
+}
+
+static off_t waitClientThreads(multiFdConnData *conn, int n)
+{
+    int idx;
+    off_t total = 0;
+
+    for (idx = 0; idx < n; idx++) {
+        multiFdConnData *c = &conn[idx];
+        off_t *ctotal;
+
+        ctotal = g_thread_join(c->tid);
+        if (*ctotal < (off_t)0) {
+            total = -1;
+        } else if (total >= 0) {
+            total += *ctotal;
+        }
+        if (VIR_CLOSE(c->clientfd) < 0) {
+            total = -1;
+        }
+    }
+    return total;
+}
+
+static gpointer loadThreadFunc(void *a)
+{
+    multiFdThreadArgs *args = a;
+    int idx;
+    args->total = -1;
+
+    for (idx = 0; idx < args->nchannels; idx++) {
+        /* Perform outgoing connections */
+        multiFdConnData *c = &args->conn[idx];
+        c->clientfd = socket(AF_UNIX, SOCK_STREAM, 0);
+        if (c->clientfd < 0) {
+            virReportSystemError(errno, "%s", _("loadThread: socket() failed"));
+            goto cleanup;
+        }
+        if (connect(c->clientfd, (const struct sockaddr *)&args->serv_addr,
+                    sizeof(struct sockaddr_un)) < 0) {
+            virReportSystemError(errno, "%s", _("loadThread: connect() failed"));
+            goto cleanup;
+        }
+        c->tid = g_thread_new("libvirt_multifd_load", &clientThreadFunc, c);
+    }
+    args->total = waitClientThreads(args->conn, args->nchannels);
+
+ cleanup:
+    for (idx = 0; idx < args->nchannels; idx++) {
+        multiFdConnData *c = &args->conn[idx];
+        VIR_FORCE_CLOSE(c->clientfd);
+    }
+    return &args->total;
+}
+
+static gpointer saveThreadFunc(void *a)
+{
+    multiFdThreadArgs *args = a;
+    int idx;
+    const char buf[1] = {'R'};
+    int sockfd;
+
+    args->total = -1;
+
+    if ((sockfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
+        virReportSystemError(errno, "%s", _("saveThread: socket() failed"));
+        return &args->total;
+    }
+    unlink(args->sun_path);
+    if (bind(sockfd, (struct sockaddr *)&args->serv_addr, sizeof(args->serv_addr)) < 0) {
+        virReportSystemError(errno, "%s", _("saveThread: bind() failed"));
+        goto cleanup;
+    }
+    if (listen(sockfd, args->nchannels) < 0) {
+        virReportSystemError(errno, "%s", _("saveThread: listen() failed"));
+        goto cleanup;
+    }
+    /* signal that the server is ready */
+    if (safewrite(STDOUT_FILENO, &buf, 1) != 1) {
+        virReportSystemError(errno, "%s", _("saveThread: safewrite failed"));
+        goto cleanup;
+    }
+    for (idx = 0; idx < args->nchannels; idx++) {
+        /* Wait for incoming connection. */
+        multiFdConnData *c = &args->conn[idx];
+        if ((c->clientfd = accept(sockfd, NULL, NULL)) < 0) {
+            virReportSystemError(errno, "%s", _("saveThread: accept() failed"));
+            goto cleanup;
+        }
+        c->tid = g_thread_new("libvirt_multifd_save", &clientThreadFunc, c);
+    }
+    args->total = waitClientThreads(args->conn, args->nchannels);
+
+ cleanup:
+    for (idx = 0; idx < args->nchannels; idx++) {
+        multiFdConnData *c = &args->conn[idx];
+        VIR_FORCE_CLOSE(c->clientfd);
+    }
+    if (VIR_CLOSE(sockfd) < 0)
+        args->total = -1;
+    return &args->total;
+}
+
+static int readCLIA(int disk_fd, int nchannels, multiFdConnData *conn)
+{
+    int idx;
+    g_autofree void *base = NULL; /* Location to be freed */
+    size_t buflen = virFileDirectAlign(nchannels * 8);
+    int64_t *buf = virFileDirectBufferNew(&base, buflen);
+    ssize_t got = saferead(disk_fd, buf, buflen);
+
+    if (got < buflen)
+        return -1;
+
+    for (idx = 0; idx < nchannels; idx++) {
+        multiFdConnData *c = &conn[idx];
+        c->total = buf[idx];
+    }
+    return 0;
+}
+
+static int writeCLIA(int disk_fd, int nchannels, multiFdConnData *conn)
+{
+    int idx;
+    g_autofree void *base = NULL; /* Location to be freed */
+    size_t buflen = virFileDirectAlign(nchannels * 8);
+    int64_t *buf = virFileDirectBufferNew(&base, buflen);
+
+    for (idx = 0; idx < nchannels; idx++) {
+        multiFdConnData *c = &conn[idx];
+        buf[idx] = c->total;
+    }
+    if (safewrite(disk_fd, buf, buflen) < buflen)
+        return -1;
+    return 0;
+}
+
+static const char *program_name;
+
+G_GNUC_NORETURN static void
+usage(int status)
+{
+    if (status) {
+        fprintf(stderr, _("%s: try --help for more details"), program_name);
+    } else {
+        fprintf(stderr, _("Usage: %s UNIX_SOCNAME DISK_PATHNAME N MAINFD"), program_name);
+    }
+    exit(status);
+}
+
+int
+main(int argc, char **argv)
+{
+    GThread *tid;
+    GThreadFunc func;
+    multiFdThreadArgs args = { 0 };
+    multiFdConnData *mc;
+    int idx;
+    off_t clia_off, data_off, *total;
+
+    program_name = argv[0];
+    if (virGettextInitialize() < 0 ||
+        virErrorInitialize() < 0) {
+        fprintf(stderr, _("%s: initialization failed\n"), program_name);
+        exit(EXIT_FAILURE);
+    }
+
+    if (argc > 1 && STREQ(argv[1], "--help"))
+        usage(EXIT_SUCCESS);
+    if (argc < 5)
+        usage(EXIT_FAILURE);
+
+    args.sun_path = argv[1];
+    args.disk_path = argv[2];
+    if (virStrToLong_i(argv[3], NULL, 10, &args.nchannels) < 0) {
+        fprintf(stderr, _("%s: malformed number of channels N %s\n"), program_name, argv[3]);
+        usage(EXIT_FAILURE);
+    }
+    /* consider the main channel as just another channel */
+    args.nchannels += 1;
+    args.conn = g_new0(multiFdConnData, args.nchannels);
+
+    /* set main channel connection data */
+    mc = &args.conn[0];
+    mc->idx = 0;
+    mc->nchannels = args.nchannels;
+    if (virStrToLong_i(argv[4], NULL, 10, &mc->filefd) < 0) {
+        fprintf(stderr, _("%s: malformed MAINFD %s\n"), program_name, argv[4]);
+        usage(EXIT_FAILURE);
+    }
+
+#ifndef F_GETFL
+# error "multifd-helper requires F_GETFL parameter of fcntl"
+#endif
+
+    mc->oflags = fcntl(mc->filefd, F_GETFL);
+    mc->sun_path = args.sun_path;
+    mc->disk_path = args.disk_path;
+    clia_off = lseek(mc->filefd, 0, SEEK_CUR);
+    if (clia_off < 0) {
+        fprintf(stderr, _("%s: failed to seek %s\n"), program_name, args.disk_path);
+        exit(EXIT_FAILURE);
+    }
+    if ((mc->oflags & O_ACCMODE) == O_RDONLY) {
+        func = loadThreadFunc;
+        /* set totals from the Channel Length Indicators Area */
+        if (readCLIA(mc->filefd, args.nchannels, args.conn) < 0) {
+            fprintf(stderr, _("%s: failed to read CLIA\n"), program_name);
+            exit(EXIT_FAILURE);
+        }
+    } else {
+        func = saveThreadFunc;
+        /* skip Channel Length Indicators Area */
+        if (lseek(mc->filefd, virFileDirectAlign(args.nchannels * 8), SEEK_CUR) < 0) {
+            fprintf(stderr, _("%s: failed to seek %s\n"), program_name, args.disk_path);
+            exit(EXIT_FAILURE);
+        }
+        mc->total = 0;
+    }
+    if ((data_off = lseek(mc->filefd, 0, SEEK_CUR)) < 0) {
+        fprintf(stderr, _("%s: failed to seek %s\n"), program_name, args.disk_path);
+        exit(EXIT_FAILURE);
+    }
+
+    /* initialize channels */
+    for (idx = 1; idx < args.nchannels; idx++) {
+        multiFdConnData *c = &args.conn[idx];
+        c->idx = idx;
+        c->nchannels = args.nchannels;
+        c->oflags = mc->oflags & ~(O_TRUNC | O_CREAT);
+        c->filefd = open(args.disk_path, c->oflags);
+        if (c->filefd < 0) {
+            fprintf(stderr, _("%s: failed to open %s\n"), program_name, args.disk_path);
+            exit(EXIT_FAILURE);
+        }
+        c->sun_path = args.sun_path;
+        c->disk_path = args.disk_path;
+        if (mc->total == 0)
+            c->total = 0;
+        if (lseek(c->filefd, data_off, SEEK_SET) < 0) {
+            fprintf(stderr, _("%s: failed to seek %s\n"), program_name, args.disk_path);
+            exit(EXIT_FAILURE);
+        }
+    }
+
+    /* initialize server address structure */
+    memset(&args.serv_addr, 0, sizeof(args.serv_addr));
+    args.serv_addr.sun_family = AF_UNIX;
+    virStrcpyStatic(args.serv_addr.sun_path, args.sun_path);
+
+    tid = g_thread_new("libvirt_multifd_func", func, &args);
+
+    total = g_thread_join(tid);
+    if (*total < 0) {
+        exit(EXIT_FAILURE);
+    }
+    if (func == saveThreadFunc) {
+        /* write CLIA */
+        if (lseek(mc->filefd, clia_off, SEEK_SET) < 0) {
+            fprintf(stderr, _("%s: failed to seek %s\n"), program_name, args.disk_path);
+            exit(EXIT_FAILURE);
+        }
+        /* set totals into the Channel Length Indicators Area */
+        if (writeCLIA(mc->filefd, args.nchannels, args.conn) < 0) {
+            fprintf(stderr, _("%s: failed to write CLIA\n"), program_name);
+            exit(EXIT_FAILURE);
+        }
+        if (lseek(mc->filefd, 0, SEEK_END) < 0) {
+            fprintf(stderr, _("%s: failed to seek %s\n"), program_name, args.disk_path);
+            exit(EXIT_FAILURE);
+        }
+        if (virFileDataSync(mc->filefd) < 0) {
+            if (errno != EINVAL && errno != EROFS) {
+                fprintf(stderr, _("%s: failed to fsyncdata %s\n"), program_name, args.disk_path);
+                exit(EXIT_FAILURE);
+            }
+        }
+    }
+    /* close up */
+    for (idx = 0; idx < args.nchannels; idx++) {
+        multiFdConnData *c = &args.conn[idx];
+        if (VIR_CLOSE(c->filefd) < 0) {
+            fprintf(stderr, _("%s: failed to close %s\n"), program_name, args.disk_path);
+            exit(EXIT_FAILURE);
+        }
+    }
+    exit(EXIT_SUCCESS);
+}
-- 
2.26.2