From: Jiri Denemark <jdenemar@redhat.com>
To: devel@lists.libvirt.org
Subject: [libvirt PATCH] qemu: Add support for /dev/userfaultfd
Date: Thu, 8 Feb 2024 16:30:38 +0100

/dev/userfaultfd device is preferred over userfaultfd syscall for
post-copy migrations. Unless qemu driver is configured to disable mount
namespace or to forbid access to /dev/userfaultfd in cgroup_device_acl,
we will copy it to the limited /dev filesystem QEMU will have access to
and label it appropriately.
So in the default configuration post-copy migration will be allowed even
without enabling vm.unprivileged_userfaultfd sysctl.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---

Notes:
    The question is what should we do with the
    src/qemu/postcopy-migration.sysctl file which is installed by
    libvirt.spec to /usr/lib/sysctl.d/60-qemu-postcopy-migration.conf by
    default. The file is now useless and should ideally be removed, but only
    when the host kernel is new enough to support /dev/userfaultfd

 src/qemu/qemu.conf.in              |  3 +-
 src/qemu/qemu_cgroup.c             |  1 +
 src/qemu/qemu_process.c            | 38 +++++++++++++++++++++++++
 src/qemu/qemu_security.c           | 45 ++++++++++++++++++++++++++++++
 src/qemu/qemu_security.h           |  5 ++++
 src/qemu/test_libvirtd_qemu.aug.in |  1 +
 6 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu.conf.in b/src/qemu/qemu.conf.in
index 34025a02ef..f406df8749 100644
--- a/src/qemu/qemu.conf.in
+++ b/src/qemu/qemu.conf.in
@@ -565,7 +565,8 @@
 #cgroup_device_acl = [
 #    "/dev/null", "/dev/full", "/dev/zero",
 #    "/dev/random", "/dev/urandom",
-#    "/dev/ptmx", "/dev/kvm"
+#    "/dev/ptmx", "/dev/kvm",
+#    "/dev/userfaultfd"
 #]
 #
 # RDMA migration requires the following extra files to be added to the list:
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 47402b3750..5a5ba763a0 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -41,6 +41,7 @@ const char *const defaultDeviceACL[] = {
     "/dev/null", "/dev/full", "/dev/zero",
     "/dev/random", "/dev/urandom",
     "/dev/ptmx", "/dev/kvm",
+    "/dev/userfaultfd",
     NULL,
 };
 #define DEVICE_PTY_MAJOR 136
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 0a6c18a671..6e51d6586b 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2882,6 +2882,40 @@ qemuProcessStartManagedPRDaemon(virDomainObj *vm)
 }
 
 
+static int
+qemuProcessAllowPostCopyMigration(virDomainObj *vm)
+{
+    qemuDomainObjPrivate *priv = vm->privateData;
+    virQEMUDriver *driver = priv->driver;
+    g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
+    const char *const *devices = (const char *const *) cfg->cgroupDeviceACL;
+    const char *uffd = "/dev/userfaultfd";
+    int rc;
+
+    if (!virFileExists(uffd)) {
+        VIR_DEBUG("%s is not supported by the host", uffd);
+        return 0;
+    }
+
+    if (!devices)
+        devices = defaultDeviceACL;
+
+    if (!g_strv_contains(devices, uffd)) {
+        VIR_DEBUG("%s is not allowed by device ACL", uffd);
+        return 0;
+    }
+
+    VIR_DEBUG("Labeling %s in mount namespace", uffd);
+    if ((rc = qemuSecurityDomainSetMountNSPathLabel(driver, vm, uffd)) < 0)
+        return -1;
+
+    if (rc == 1)
+        VIR_DEBUG("Mount namespace is not enabled, leaving %s as is", uffd);
+
+    return 0;
+}
+
+
 static int
 qemuProcessInitPasswords(virQEMUDriver *driver,
                          virDomainObj *vm,
@@ -7802,6 +7836,10 @@ qemuProcessLaunch(virConnectPtr conn,
         qemuProcessStartManagedPRDaemon(vm) < 0)
         goto cleanup;
 
+    VIR_DEBUG("Setting up permissions to allow post-copy migration");
+    if (qemuProcessAllowPostCopyMigration(vm) < 0)
+        goto cleanup;
+
     VIR_DEBUG("Setting domain security labels");
     if (qemuSecuritySetAllLabel(driver,
                                 vm,
diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c
index 8bcef14d08..4aaa863ae9 100644
--- a/src/qemu/qemu_security.c
+++ b/src/qemu/qemu_security.c
@@ -615,6 +615,51 @@ qemuSecurityDomainRestorePathLabel(virQEMUDriver *driver,
 }
 
 
+/**
+ * qemuSecurityDomainSetMountNSPathLabel:
+ *
+ * Label given path in mount namespace. If mount namespace is not enabled,
+ * nothing is labeled at all.
+ *
+ * Because the label is only applied in mount namespace, there's no need to
+ * restore it.
+ *
+ * Returns 0 on success,
+ *         1 when mount namespace is not enabled,
+ *         -1 on error.
+ */
+int
+qemuSecurityDomainSetMountNSPathLabel(virQEMUDriver *driver,
+                                      virDomainObj *vm,
+                                      const char *path)
+{
+    int ret = -1;
+
+    if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) {
+        VIR_DEBUG("Not labeling '%s': mount namespace disabled for domain '%s'",
+                  path, vm->def->name);
+        return 1;
+    }
+
+    if (virSecurityManagerTransactionStart(driver->securityManager) < 0)
+        goto cleanup;
+
+    if (virSecurityManagerDomainSetPathLabel(driver->securityManager,
+                                             vm->def, path, false) < 0)
+        goto cleanup;
+
+    if (virSecurityManagerTransactionCommit(driver->securityManager,
+                                            vm->pid, false) < 0)
+        goto cleanup;
+
+    ret = 0;
+
+ cleanup:
+    virSecurityManagerTransactionAbort(driver->securityManager);
+    return ret;
+}
+
+
 /**
  * qemuSecurityCommandRun:
  * @driver: the QEMU driver
diff --git a/src/qemu/qemu_security.h b/src/qemu/qemu_security.h
index 10f11771b4..41da33debc 100644
--- a/src/qemu/qemu_security.h
+++ b/src/qemu/qemu_security.h
@@ -110,6 +110,11 @@ int qemuSecurityDomainRestorePathLabel(virQEMUDriver *driver,
                                        virDomainObj *vm,
                                        const char *path);
 
+int
+qemuSecurityDomainSetMountNSPathLabel(virQEMUDriver *driver,
+                                      virDomainObj *vm,
+                                      const char *path);
+
 int qemuSecurityCommandRun(virQEMUDriver *driver,
                            virDomainObj *vm,
                            virCommand *cmd,
diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in
index e4cfde6cc7..b97e6de11e 100644
--- a/src/qemu/test_libvirtd_qemu.aug.in
+++ b/src/qemu/test_libvirtd_qemu.aug.in
@@ -67,6 +67,7 @@ module Test_libvirtd_qemu =
     { "5" = "/dev/urandom" }
     { "6" = "/dev/ptmx" }
     { "7" = "/dev/kvm" }
+    { "8" = "/dev/userfaultfd" }
 }
 { "save_image_format" = "raw" }
 { "dump_image_format" = "raw" }
-- 
2.43.0
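
For readers who want to reason about the device ACL check in qemuProcessAllowPostCopyMigration() in isolation, here is a minimal standalone sketch. It is an illustration under stated assumptions, not the libvirt code: default_acl, acl_contains() and uffd_allowed_by_acl() are hypothetical names, plain strcmp() stands in for g_strv_contains(), and a NULL list models an unset cgroup_device_acl that falls back to the built-in defaults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for defaultDeviceACL as it looks after this patch. */
static const char *const default_acl[] = {
    "/dev/null", "/dev/full", "/dev/zero",
    "/dev/random", "/dev/urandom",
    "/dev/ptmx", "/dev/kvm",
    "/dev/userfaultfd",
    NULL,
};

/* Return true if path appears in the NULL-terminated list acl. */
static bool
acl_contains(const char *const *acl, const char *path)
{
    assert(acl != NULL);
    assert(path != NULL);

    for (size_t i = 0; acl[i] != NULL; i++) {
        if (strcmp(acl[i], path) == 0)
            return true;
    }
    return false;
}

/*
 * Decide whether /dev/userfaultfd may be exposed to the domain.
 * configured == NULL models an unset cgroup_device_acl (assumption of
 * this sketch), in which case the default list is consulted.
 */
static bool
uffd_allowed_by_acl(const char *const *configured)
{
    const char *const *acl = configured ? configured : default_acl;

    return acl_contains(acl, "/dev/userfaultfd");
}
```

With an unset ACL the device is allowed, while an explicit list that omits "/dev/userfaultfd" forbids it, which is how an administrator can opt out as described in the commit message.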