From nobody Sun Apr 12 04:22:08 2026
From: Cédric Le Goater
To: qemu-devel@nongnu.org
Cc: Alex Williamson, Avihai Horon, Markus Armbruster, Cédric Le Goater
Subject: [PULL 1/5] vfio/migration: Send VFIO_MIGRATION event before PRE_COPY_P2P transition
Date: Wed, 18 Feb 2026 14:59:59 +0100
Message-ID: <20260218140003.1554502-2-clg@redhat.com>
In-Reply-To: <20260218140003.1554502-1-clg@redhat.com>
References: <20260218140003.1554502-1-clg@redhat.com>

From: Avihai Horon

The VFIO_MIGRATION event notifies users when a VFIO device transitions
to a new state. One use case for this event is to prevent timeouts for
RDMA connections to the migrated device.
In this case, an external management application (not libvirt) consumes
the events and disables the RDMA timeout mechanism when receiving the
event for PRE_COPY_P2P state, which indicates that the device is
non-responsive. This is essential because RDMA connections typically
have very low timeouts (tens of milliseconds), which can be far below
migration downtime.

However, under heavy resource utilization, the device transition to
PRE_COPY_P2P can take hundreds of milliseconds to complete. Since the
VFIO_MIGRATION event is currently sent only after the transition
completes, it arrives too late, after RDMA connections have already
timed out.

To address this, send an additional "prepare" event immediately before
initiating the PRE_COPY_P2P transition. This guarantees timely event
delivery regardless of how long the actual state transition takes.

Signed-off-by: Avihai Horon
Acked-by: Markus Armbruster
Reviewed-by: Cédric Le Goater
Link: https://lore.kernel.org/qemu-devel/20260202173406.13979-1-avihaih@nvidia.com
Signed-off-by: Cédric Le Goater
---
 qapi/vfio.json      | 13 +++++++++++--
 hw/vfio/migration.c | 26 +++++++++++++++++++-------
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/qapi/vfio.json b/qapi/vfio.json
index a1a9c5b673d8e2f9a1a7e5cc7ad518639320e976..17b604687128c7a0a4b53441d108b78b6bd0343b 100644
--- a/qapi/vfio.json
+++ b/qapi/vfio.json
@@ -11,7 +11,13 @@
 ##
 # @QapiVfioMigrationState:
 #
-# An enumeration of the VFIO device migration states.
+# An enumeration of the VFIO device migration states. In addition to
+# the regular states, there are prepare states (with 'prepare' suffix)
+# which indicate that the device is just about to transition to the
+# corresponding state. Note that seeing a prepare state for state X
+# doesn't guarantee that the next state will be X, as the state
+# transition can fail and the device may transition to a different
+# state instead.
 #
 # @stop: The device is stopped.
 #
@@ -32,11 +38,14 @@
 #     tracking its internal state and its internal state is available
 #     for reading.
 #
+# @pre-copy-p2p-prepare: The device is just about to move to
+#     pre-copy-p2p state. (since 11.0)
+#
 # Since: 9.1
 ##
 { 'enum': 'QapiVfioMigrationState',
   'data': [ 'stop', 'running', 'stop-copy', 'resuming', 'running-p2p',
-            'pre-copy', 'pre-copy-p2p' ] }
+            'pre-copy', 'pre-copy-p2p', 'pre-copy-p2p-prepare' ] }
 
 ##
 # @VFIO_MIGRATION:
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index b4695030c7295f318faf1d12ac48ba951aa943c7..4bd8e24699cc584924e82197b61ed558cb0399ec 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -68,7 +68,7 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
 }
 
 static QapiVfioMigrationState
-mig_state_to_qapi_state(enum vfio_device_mig_state state)
+mig_state_to_qapi_state(enum vfio_device_mig_state state, bool prepare)
 {
     switch (state) {
     case VFIO_DEVICE_STATE_STOP:
@@ -84,15 +84,17 @@ mig_state_to_qapi_state(enum vfio_device_mig_state state)
     case VFIO_DEVICE_STATE_PRE_COPY:
         return QAPI_VFIO_MIGRATION_STATE_PRE_COPY;
     case VFIO_DEVICE_STATE_PRE_COPY_P2P:
-        return QAPI_VFIO_MIGRATION_STATE_PRE_COPY_P2P;
+        return prepare ? QAPI_VFIO_MIGRATION_STATE_PRE_COPY_P2P_PREPARE :
+                         QAPI_VFIO_MIGRATION_STATE_PRE_COPY_P2P;
     default:
         g_assert_not_reached();
     }
 }
 
-static void vfio_migration_send_event(VFIODevice *vbasedev)
+static void vfio_migration_send_event(VFIODevice *vbasedev,
+                                      enum vfio_device_mig_state state,
+                                      bool prepare)
 {
-    VFIOMigration *migration = vbasedev->migration;
     DeviceState *dev = vbasedev->dev;
     g_autofree char *qom_path = NULL;
     Object *obj;
@@ -106,8 +108,8 @@ static void vfio_migration_send_event(VFIODevice *vbasedev)
     g_assert(obj);
     qom_path = object_get_canonical_path(obj);
 
-    qapi_event_send_vfio_migration(
-        dev->id, qom_path, mig_state_to_qapi_state(migration->device_state));
+    qapi_event_send_vfio_migration(dev->id, qom_path,
+                                   mig_state_to_qapi_state(state, prepare));
 }
 
 static void vfio_migration_set_device_state(VFIODevice *vbasedev,
@@ -119,7 +121,7 @@ static void vfio_migration_set_device_state(VFIODevice *vbasedev,
                               mig_state_to_str(state));
 
     migration->device_state = state;
-    vfio_migration_send_event(vbasedev);
+    vfio_migration_send_event(vbasedev, state, false);
 }
 
 int vfio_migration_set_state(VFIODevice *vbasedev,
@@ -146,6 +148,16 @@ int vfio_migration_set_state(VFIODevice *vbasedev,
         return 0;
     }
 
+    /*
+     * Send a prepare event before initiating the PRE_COPY_P2P transition to
+     * ensure timely event delivery regardless of how long the state transition
+     * takes.
+     */
+    if (new_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) {
+        vfio_migration_send_event(vbasedev, VFIO_DEVICE_STATE_PRE_COPY_P2P,
+                                  true);
+    }
+
     feature->argsz = sizeof(buf);
     feature->flags =
         VFIO_DEVICE_FEATURE_SET | VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE;
-- 
2.53.0
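
The event ordering that patch 1/5 introduces can be illustrated with a small standalone sketch. This is not the QEMU code: the enums, the in-memory event log, and the transition callback below are hypothetical stand-ins, and the failure path is simplified (the real code may move the device to a different state and report that instead). It only shows that the "prepare" event is recorded before the possibly slow transition starts, and that a failed transition leaves a prepare event with no matching completion event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and QAPI state enums. */
enum dev_state {
    DEV_STATE_RUNNING,
    DEV_STATE_PRE_COPY,
    DEV_STATE_PRE_COPY_P2P
};

enum event_state {
    EV_RUNNING,
    EV_PRE_COPY,
    EV_PRE_COPY_P2P,
    EV_PRE_COPY_P2P_PREPARE
};

/* In-memory log that replaces the real event channel in this sketch. */
#define MAX_EVENTS 8
static enum event_state event_log[MAX_EVENTS];
static size_t event_count;

/* Map a device state to an event state; only PRE_COPY_P2P has a prepare form. */
static enum event_state to_event_state(enum dev_state state, bool prepare)
{
    switch (state) {
    case DEV_STATE_RUNNING:
        return EV_RUNNING;
    case DEV_STATE_PRE_COPY:
        return EV_PRE_COPY;
    case DEV_STATE_PRE_COPY_P2P:
        return prepare ? EV_PRE_COPY_P2P_PREPARE : EV_PRE_COPY_P2P;
    }
    return EV_RUNNING;
}

static void send_event(enum dev_state state, bool prepare)
{
    assert(event_count < MAX_EVENTS);
    event_log[event_count++] = to_event_state(state, prepare);
}

/* Callbacks that simulate the device transition succeeding or failing. */
static int transition_ok(enum dev_state state)
{
    (void)state;
    return 0;
}

static int transition_fail(enum dev_state state)
{
    (void)state;
    return -1;
}

/*
 * Emit the prepare event first, then run the transition, then emit the
 * completion event only if the transition succeeded.
 */
static int set_state(enum dev_state *cur, enum dev_state new_state,
                     int (*do_transition)(enum dev_state))
{
    int ret;

    if (new_state == DEV_STATE_PRE_COPY_P2P) {
        send_event(new_state, true);
    }

    ret = do_transition(new_state);
    if (ret) {
        return ret; /* simplified: the state is left unchanged here */
    }

    *cur = new_state;
    send_event(new_state, false);
    return 0;
}
```

A consumer that disables RDMA timeouts on the prepare event therefore has to cope with the completion event never arriving, which matches the caveat in the QAPI documentation added by the patch.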
From: Cédric Le Goater
To: qemu-devel@nongnu.org
Cc: Alex Williamson, Ankit Agrawal, Shameer Kolothum, Cédric Le Goater
Subject: [PULL 2/5] hw/vfio: sort and validate sparse mmap regions by offset
Date: Wed, 18 Feb 2026 15:00:00 +0100
Message-ID: <20260218140003.1554502-3-clg@redhat.com>
In-Reply-To: <20260218140003.1554502-1-clg@redhat.com>
References: <20260218140003.1554502-1-clg@redhat.com>

From: Ankit Agrawal

Sort sparse mmap regions by offset during region setup to ensure a
predictable mapping order, avoid overlaps, and properly handle the gaps
between sub-regions. Add validation to detect overlapping sparse
regions early during setup, before any mapping operations begin.

The sorting is performed on the sub-region ranges during
vfio_setup_region_sparse_mmaps(). This also ensures that subsequent
mapping code can rely on sub-regions being in ascending offset order.

This is preparatory work for alignment adjustments needed to support
hugepfnmap on systems where device memory (e.g., Grace-based systems)
may have non-power-of-2 sizes.

cc: Alex Williamson
Reviewed-by: Alex Williamson
Reviewed-by: Shameer Kolothum
Signed-off-by: Ankit Agrawal
Reviewed-by: Cédric Le Goater
Link: https://lore.kernel.org/qemu-devel/20260217153010.408739-2-ankita@nvidia.com
Signed-off-by: Cédric Le Goater
---
 hw/vfio/region.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/region.c b/hw/vfio/region.c
index ab39d77574ccef0ca8d393435979db97bcbd882b..8fbc98918f5f7c76c5e2f99956b54b96d0908fd6 100644
--- a/hw/vfio/region.c
+++ b/hw/vfio/region.c
@@ -149,6 +149,19 @@ static const MemoryRegionOps vfio_region_ops = {
     },
 };
 
+static int vfio_mmap_compare_offset(const void *a, const void *b)
+{
+    const VFIOMmap *mmap_a = a;
+    const VFIOMmap *mmap_b = b;
+
+    if (mmap_a->offset < mmap_b->offset) {
+        return -1;
+    } else if (mmap_a->offset > mmap_b->offset) {
+        return 1;
+    }
+    return 0;
+}
+
 static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
                                           struct vfio_region_info *info)
 {
@@ -182,6 +195,35 @@ static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
     region->nr_mmaps = j;
     region->mmaps = g_realloc(region->mmaps, j * sizeof(VFIOMmap));
 
+    /*
+     * Sort sparse mmaps by offset to ensure proper handling of gaps
+     * and predictable mapping order in vfio_region_mmap().
+     */
+    if (region->nr_mmaps > 1) {
+        qsort(region->mmaps, region->nr_mmaps, sizeof(VFIOMmap),
+              vfio_mmap_compare_offset);
+
+        /*
+         * Validate that sparse regions don't overlap after sorting.
+         */
+        for (i = 1; i < region->nr_mmaps; i++) {
+            off_t prev_end = region->mmaps[i - 1].offset +
+                             region->mmaps[i - 1].size;
+            if (prev_end > region->mmaps[i].offset) {
+                error_report("%s: overlapping sparse mmap regions detected "
+                             "in region %d: [0x%"PRIx64"-0x%"PRIx64"] overlaps "
+                             "with [0x%"PRIx64"-0x%"PRIx64"]",
+                             __func__, region->nr, region->mmaps[i - 1].offset,
+                             prev_end - 1, region->mmaps[i].offset,
+                             region->mmaps[i].offset + region->mmaps[i].size - 1);
+                g_free(region->mmaps);
+                region->mmaps = NULL;
+                region->nr_mmaps = 0;
+                return -EINVAL;
+            }
+        }
+    }
+
     return 0;
 }
 
@@ -213,11 +255,13 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
 
             ret = vfio_setup_region_sparse_mmaps(region, info);
 
-            if (ret) {
+            if (ret == -ENODEV) {
                 region->nr_mmaps = 1;
                 region->mmaps = g_new0(VFIOMmap, region->nr_mmaps);
                 region->mmaps[0].offset = 0;
                 region->mmaps[0].size = region->size;
+            } else if (ret) {
+                return ret;
             }
         }
     }
-- 
2.53.0
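
The sort-then-validate step from patch 2/5 can be tried outside QEMU with a minimal sketch. The `Range` type and the function names below are hypothetical stand-ins for `VFIOMmap` and the patched helpers, and plain `uint64_t` fields are assumed for offset and size. The sketch sorts by offset with `qsort()` and then checks each neighbouring pair once, which is enough to find any overlap in a sorted array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for VFIOMmap, reduced to the fields the check needs. */
typedef struct {
    uint64_t offset;
    uint64_t size;
} Range;

/*
 * qsort() comparator ordering ranges by ascending offset. Explicit
 * comparisons are used because a 64-bit difference does not fit in the
 * int return value.
 */
static int range_compare_offset(const void *a, const void *b)
{
    const Range *ra = a;
    const Range *rb = b;

    if (ra->offset < rb->offset) {
        return -1;
    } else if (ra->offset > rb->offset) {
        return 1;
    }
    return 0;
}

/*
 * Sort the ranges in place, then report whether any two overlap.
 * Returns 0 when the ranges are disjoint and -1 on the first overlap.
 */
static int ranges_sort_and_validate(Range *r, size_t n)
{
    size_t i;

    assert(r != NULL || n == 0);

    if (n > 1) {
        qsort(r, n, sizeof(*r), range_compare_offset);
    }

    for (i = 1; i < n; i++) {
        /* Exclusive end of the previous range. */
        uint64_t prev_end = r[i - 1].offset + r[i - 1].size;

        if (prev_end > r[i].offset) {
            return -1;
        }
    }
    return 0;
}
```

Ranges that merely touch, such as one ending at 0x1000 and the next starting at 0x1000, pass the check, because the comparison uses an exclusive end and a strict `>`.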
From: Cédric Le Goater
To: qemu-devel@nongnu.org
Cc: Alex Williamson, Ankit Agrawal, Cedric Le Goater
Subject: [PULL 3/5] vfio: Add Error ** parameter to vfio_region_setup()
Date: Wed, 18 Feb 2026 15:00:01 +0100
Message-ID: <20260218140003.1554502-4-clg@redhat.com>
In-Reply-To: <20260218140003.1554502-1-clg@redhat.com>
References: <20260218140003.1554502-1-clg@redhat.com>

From: Ankit Agrawal

Add an Error **errp parameter to vfio_region_setup() and
vfio_setup_region_sparse_mmaps() to allow proper error handling instead
of just returning error codes. The functions set errors via
error_setg() when a failure occurs.

Suggested-by: Cedric Le Goater
Signed-off-by: Ankit Agrawal
Reviewed-by: Cédric Le Goater
Link: https://lore.kernel.org/qemu-devel/20260217153010.408739-3-ankita@nvidia.com
Signed-off-by: Cédric Le Goater
---
 hw/vfio/vfio-region.h |  2 +-
 hw/vfio/display.c     |  6 +++---
 hw/vfio/pci.c         |  3 +--
 hw/vfio/region.c      | 20 +++++++++++---------
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/hw/vfio/vfio-region.h b/hw/vfio/vfio-region.h
index ede6e0c8f992caa3292d280474c90ddef27eb3dd..9b21d4ee5ba16f8c05be83c75d1c7a6ad4cf8370 100644
--- a/hw/vfio/vfio-region.h
+++ b/hw/vfio/vfio-region.h
@@ -38,7 +38,7 @@ void vfio_region_write(void *opaque, hwaddr addr,
 uint64_t vfio_region_read(void *opaque, hwaddr addr,
                           unsigned size);
 int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
-                      int index, const char *name);
+                      int index, const char *name, Error **errp);
 int vfio_region_mmap(VFIORegion *region);
 void vfio_region_mmaps_set_enabled(VFIORegion *region, bool enabled);
 void vfio_region_unmap(VFIORegion *region);
diff --git a/hw/vfio/display.c b/hw/vfio/display.c
index faacd9019a558ea35a8730696472e967d4196919..5a42a6f7a29e77ec2489f942457221aed8118d0b 100644
--- a/hw/vfio/display.c
+++ b/hw/vfio/display.c
@@ -446,13 +446,13 @@ static void vfio_display_region_update(void *opaque)
 
     if (!dpy->region.buffer.size) {
         /* mmap region */
+        Error *error = NULL;
         ret = vfio_region_setup(OBJECT(vdev), &vdev->vbasedev,
                                 &dpy->region.buffer,
                                 plane.region_index,
-                                "display");
+                                "display", &error);
         if (ret != 0) {
-            error_report("%s: vfio_region_setup(%d): %s",
-                         __func__, plane.region_index, strerror(-ret));
+            error_report_err(error);
             goto err;
         }
         ret = vfio_region_mmap(&dpy->region.buffer);
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 36d8fbe872eb9c0e2493b9710fc29e326dc1ec52..c89f3fbea348e780b7662c7d6893d9d778aae678 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3056,11 +3056,10 @@ bool vfio_pci_populate_device(VFIOPCIDevice *vdev, Error **errp)
         char *name = g_strdup_printf("%s BAR %d", vbasedev->name, i);
 
         ret = vfio_region_setup(OBJECT(vdev), vbasedev,
-                                &vdev->bars[i].region, i, name);
+                                &vdev->bars[i].region, i, name, errp);
         g_free(name);
 
         if (ret) {
-            error_setg_errno(errp, -ret, "failed to get region %d info", i);
             return false;
         }
 
diff --git a/hw/vfio/region.c b/hw/vfio/region.c
index 8fbc98918f5f7c76c5e2f99956b54b96d0908fd6..d464eadf9c048e29981da8af48f8f86933a98ad5 100644
--- a/hw/vfio/region.c
+++ b/hw/vfio/region.c
@@ -163,7 +163,8 @@ static int vfio_mmap_compare_offset(const void *a, const void *b)
 }
 
 static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
-                                          struct vfio_region_info *info)
+                                          struct vfio_region_info *info,
+                                          Error **errp)
 {
     struct vfio_info_cap_header *hdr;
     struct vfio_region_info_cap_sparse_mmap *sparse;
@@ -210,12 +211,12 @@ static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
             off_t prev_end = region->mmaps[i - 1].offset +
                              region->mmaps[i - 1].size;
             if (prev_end > region->mmaps[i].offset) {
-                error_report("%s: overlapping sparse mmap regions detected "
-                             "in region %d: [0x%"PRIx64"-0x%"PRIx64"] overlaps "
-                             "with [0x%"PRIx64"-0x%"PRIx64"]",
-                             __func__, region->nr, region->mmaps[i - 1].offset,
-                             prev_end - 1, region->mmaps[i].offset,
-                             region->mmaps[i].offset + region->mmaps[i].size - 1);
+                error_setg(errp, "%s: overlapping sparse mmap regions detected "
+                           "in region %d: [0x%"PRIx64"-0x%"PRIx64"] overlaps "
+                           "with [0x%"PRIx64"-0x%"PRIx64"]",
+                           __func__, region->nr, region->mmaps[i - 1].offset,
+                           prev_end - 1, region->mmaps[i].offset,
+                           region->mmaps[i].offset + region->mmaps[i].size - 1);
                 g_free(region->mmaps);
                 region->mmaps = NULL;
                 region->nr_mmaps = 0;
@@ -228,13 +229,14 @@ static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
 }
 
 int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
-                      int index, const char *name)
+                      int index, const char *name, Error **errp)
 {
     struct vfio_region_info *info = NULL;
     int ret;
 
     ret = vfio_device_get_region_info(vbasedev, index, &info);
     if (ret) {
+        error_setg_errno(errp, -ret, "failed to get region %d info", index);
         return ret;
     }
 
@@ -253,7 +255,7 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
         if (!vbasedev->no_mmap &&
             region->flags & VFIO_REGION_INFO_FLAG_MMAP) {
 
-            ret = vfio_setup_region_sparse_mmaps(region, info, errp);
+            ret = vfio_setup_region_sparse_mmaps(region, info, errp);
 
             if (ret == -ENODEV) {
                 region->nr_mmaps = 1;
-- 
2.53.0
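
The control flow that patches 2/5 and 3/5 establish together (treat -ENODEV as "no sparse capability, fall back to one mapping", propagate every other failure along with an error object) can be sketched in plain C. This does not use QEMU's Error API: `SketchError`, `sketch_error_set()`, the `scenario` parameter and both setup functions are hypothetical names made up for illustration, and only the errno convention is taken from the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for an error object. */
typedef struct {
    char msg[128];
} SketchError;

/* Store a message in *errp when the caller supplied somewhere to put it. */
static void sketch_error_set(SketchError **errp, const char *what, int index)
{
    SketchError *err;

    if (!errp) {
        return;
    }
    err = malloc(sizeof(*err));
    if (!err) {
        return;
    }
    snprintf(err->msg, sizeof(err->msg), "%s (region %d)", what, index);
    *errp = err;
}

/*
 * Simulated sparse-capability parser.
 * scenario 0: capability present and valid.
 * scenario 1: capability absent, reported as -ENODEV without an error.
 * otherwise:  capability invalid, reported as -EINVAL with an error.
 */
static int setup_sparse(int scenario, int index, SketchError **errp)
{
    switch (scenario) {
    case 0:
        return 0;
    case 1:
        return -ENODEV;
    default:
        sketch_error_set(errp, "overlapping sparse mmap regions", index);
        return -EINVAL;
    }
}

/* Caller: fall back on -ENODEV, propagate anything else unchanged. */
static int region_setup(int scenario, int index, int *nr_mmaps,
                        SketchError **errp)
{
    int ret = setup_sparse(scenario, index, errp);

    if (ret == -ENODEV) {
        *nr_mmaps = 1; /* single mapping that covers the whole region */
        return 0;
    } else if (ret) {
        return ret;    /* errp was already filled in by the callee */
    }

    *nr_mmaps = 2;     /* pretend the capability listed two areas */
    return 0;
}
```

The point of the split is that the callee attaches the message where the detail is known, so callers such as the PCI and display paths in the patch only need to forward or report the error they receive.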
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=x+FmMN620Ef1QiHzVSuMnoiNS27P80CU44pJ5h3ooyU=; b=Bmq7X/8K+rCqygZLEGecPMB7LqBB9VdzyX6bkWouQ7MOA7b6rGayUXJC12bPCnNGPLe5eU hl7wMQG2ySUxKD98nJVtkyLxvQaBHqefeT818mUFajlCOJ9oB9R/0Y5wNsi2AyoE4MnJzJ hvBX6G49yhyQpErL18/fndL7/XdXeBA= X-MC-Unique: rl99xc6eOnGevHOj0iG1Bw-1 X-Mimecast-MFC-AGG-ID: rl99xc6eOnGevHOj0iG1Bw_1771423217 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Ankit Agrawal , Shameer Kolothum , Jason Gunthorpe , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 4/5] hw/vfio: align mmap to power-of-2 of region size for hugepfnmap Date: Wed, 18 Feb 2026 15:00:02 +0100 Message-ID: <20260218140003.1554502-5-clg@redhat.com> In-Reply-To: <20260218140003.1554502-1-clg@redhat.com> References: <20260218140003.1554502-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.043, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: qemu development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZohoMail-DKIM: pass (identity @redhat.com)
X-ZM-MESSAGEID: 1771423292166154100

From: Ankit Agrawal

On Grace-based systems such as GB200, device memory is exposed as a BAR
but the actual mappable size is not power-of-2 aligned. The previous
algorithm aligned each sparse mmap area based on its individual size
using ctz64(), which prevented efficient huge page usage by the kernel.

Adjust VFIO region mapping alignment to use the next power-of-2 of the
total region size and place the sparse subregions at their appropriate
offsets. This improves the chance of getting a huge-page-aligned base
address, allowing the kernel to use larger page sizes for the VMA.

This enables the use of PMD-level huge pages, which can significantly
improve memory access performance and reduce TLB pressure for large
device memory regions.

With this change:
- Create a single aligned base mapping for the entire region
- Base the alignment on pow2ceil(region->size), capped at 1GiB
- Unmap gaps between sparse regions
- Use MAP_FIXED to overlay sparse mmap areas at their offsets

Example VMA for device memory of size 0x2F00F00000 on GB200:

Before (misaligned, no hugepfnmap):
ff88ff000000-ffb7fff00000 rw-s 400000000000 00:06 727 /dev/vfio/devices/vfio1

After (aligned to 1GiB boundary, hugepfnmap enabled):
ff8ac0000000-ffb9c0f00000 rw-s 400000000000 00:06 727 /dev/vfio/devices/vfio1

Requires sparse regions to be sorted by offset (done in previous patch)
to correctly identify and handle gaps.
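
For readers who want to try the idea outside QEMU, the steps listed in
this message can be sketched as standalone C. This is only an
illustration under stated assumptions, not the patch itself: the names
sparse_area, pow2ceil_u64, region_align, reserve_aligned and map_sparse
are hypothetical, sizes and offsets are assumed to be multiples of the
host page size, and any mappable file descriptor stands in for the VFIO
device fd.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical stand-in for one sparse mmap area (offset within region). */
struct sparse_area {
    size_t offset;
    size_t size;
    void *addr;
};

/* Smallest power of two >= v; plays the role of pow2ceil() here. */
static uint64_t pow2ceil_u64(uint64_t v)
{
    uint64_t p = 1;

    assert(v != 0 && v <= (1ULL << 63));
    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Alignment derived from the whole region size, capped at 1GiB. */
static size_t region_align(uint64_t region_size)
{
    const uint64_t cap = 1ULL << 30;
    uint64_t a = pow2ceil_u64(region_size);

    return (size_t)(a < cap ? a : cap);
}

/*
 * Reserve size + align bytes of PROT_NONE address space, then trim the
 * head and tail so that exactly size bytes remain at an aligned address.
 * No memory is committed, so size may exceed host RAM.
 */
static void *reserve_aligned(size_t size, size_t align)
{
    uint8_t *base = mmap(NULL, size + align, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    uint8_t *aligned;
    size_t head;

    if (base == MAP_FAILED) {
        return MAP_FAILED;
    }
    aligned = (uint8_t *)(((uintptr_t)base + align - 1) &
                          ~((uintptr_t)align - 1));
    head = (size_t)(aligned - base);
    if (head) {
        munmap(base, head);
    }
    munmap(aligned + size, align - head); /* tail is always >= 1 byte */
    return aligned;
}

/*
 * Overlay areas (sorted by ascending offset) onto the reserved window
 * with MAP_FIXED, unmapping the gaps between them. Returns 0 on success.
 * On failure returns -1 after dropping the not-yet-mapped remainder;
 * areas mapped earlier are left for the caller to unmap.
 */
static int map_sparse(uint8_t *window, size_t region_size,
                      struct sparse_area *areas, size_t n,
                      int fd, off_t fd_offset, int prot)
{
    size_t covered = 0;

    for (size_t i = 0; i < n; i++) {
        if (areas[i].offset > covered) {
            munmap(window + covered, areas[i].offset - covered);
        }
        areas[i].addr = mmap(window + areas[i].offset, areas[i].size, prot,
                             MAP_SHARED | MAP_FIXED, fd,
                             fd_offset + (off_t)areas[i].offset);
        if (areas[i].addr == MAP_FAILED) {
            munmap(window + areas[i].offset,
                   region_size - areas[i].offset);
            return -1;
        }
        covered = areas[i].offset + areas[i].size;
    }
    if (covered < region_size) {
        munmap(window + covered, region_size - covered);
    }
    return 0;
}
```

A caller would compute region_align(size), pass it to reserve_aligned(),
and hand the returned window to map_sparse(). Because the alignment comes
from the whole region rather than from each area, every area keeps the
same offset relative to one large aligned base, which is the property the
kernel needs before it can pick a larger page size for the VMA.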
cc: Alex Williamson
Reviewed-by: Alex Williamson
Reviewed-by: Shameer Kolothum
Suggested-by: Jason Gunthorpe
Signed-off-by: Ankit Agrawal
Reviewed-by: Cédric Le Goater
Link: https://lore.kernel.org/qemu-devel/20260217153010.408739-4-ankita@nvidia.com
Signed-off-by: Cédric Le Goater
---
 hw/vfio/region.c | 86 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 59 insertions(+), 27 deletions(-)

diff --git a/hw/vfio/region.c b/hw/vfio/region.c
index d464eadf9c048e29981da8af48f8f86933a98ad5..47fdc2df349b65c6be6c9605b7a38a4e367f0475 100644
--- a/hw/vfio/region.c
+++ b/hw/vfio/region.c
@@ -344,8 +344,11 @@ static bool vfio_region_create_dma_buf(VFIORegion *region, Error **errp)
 
 int vfio_region_mmap(VFIORegion *region)
 {
-    int i, ret, prot = 0;
+    void *map_base, *map_align;
     Error *local_err = NULL;
+    int i, ret, prot = 0;
+    off_t map_offset = 0;
+    size_t align;
     char *name;
     int fd;
 
@@ -356,41 +359,61 @@ int vfio_region_mmap(VFIORegion *region)
     prot |= region->flags & VFIO_REGION_INFO_FLAG_READ ? PROT_READ : 0;
     prot |= region->flags & VFIO_REGION_INFO_FLAG_WRITE ? PROT_WRITE : 0;
 
-    for (i = 0; i < region->nr_mmaps; i++) {
-        size_t align = MIN(1ULL << ctz64(region->mmaps[i].size), 1 * GiB);
-        void *map_base, *map_align;
+    /*
+     * Align the mmap for more efficient mapping in the kernel. Ideally
+     * we'd know the PMD and PUD mapping sizes to use as discrete alignment
+     * intervals, but we don't. As of Linux v6.19, the largest PUD size
+     * supporting huge pfnmap is 1GiB (ARCH_SUPPORTS_PUD_PFNMAP is only set
+     * on x86_64).
+     *
+     * Align by power-of-two of the size of the entire region - capped
+     * by 1G - and place the sparse subregions at their appropriate offset.
+     * This will get maximum alignment.
+     *
+     * NB. qemu_memalign() and friends actually allocate memory, whereas
+     * the region size here can exceed host memory, therefore we manually
+     * create an oversized anonymous mapping and clean it up for alignment.
+     */
 
-        /*
-         * Align the mmap for more efficient mapping in the kernel. Ideally
-         * we'd know the PMD and PUD mapping sizes to use as discrete alignment
-         * intervals, but we don't. As of Linux v6.12, the largest PUD size
-         * supporting huge pfnmap is 1GiB (ARCH_SUPPORTS_PUD_PFNMAP is only set
-         * on x86_64). Align by power-of-two size, capped at 1GiB.
-         *
-         * NB. qemu_memalign() and friends actually allocate memory, whereas
-         * the region size here can exceed host memory, therefore we manually
-         * create an oversized anonymous mapping and clean it up for alignment.
-         */
-        map_base = mmap(0, region->mmaps[i].size + align, PROT_NONE,
-                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
-        if (map_base == MAP_FAILED) {
-            ret = -errno;
-            goto no_mmap;
-        }
+    align = MIN(pow2ceil(region->size), 1 * GiB);
 
-        fd = vfio_device_get_region_fd(region->vbasedev, region->nr);
+    map_base = mmap(0, region->size + align, PROT_NONE,
+                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+    if (map_base == MAP_FAILED) {
+        ret = -errno;
+        trace_vfio_region_mmap_fault(memory_region_name(region->mem), -1,
+                                     region->fd_offset,
+                                     region->fd_offset + region->size - 1, ret);
+        return ret;
+    }
+
+    fd = vfio_device_get_region_fd(region->vbasedev, region->nr);
 
-        map_align = (void *)ROUND_UP((uintptr_t)map_base, (uintptr_t)align);
-        munmap(map_base, map_align - map_base);
-        munmap(map_align + region->mmaps[i].size,
-               align - (map_align - map_base));
+    map_align = (void *)ROUND_UP((uintptr_t)map_base, (uintptr_t)align);
+    munmap(map_base, map_align - map_base);
+    munmap(map_align + region->size,
+           align - (map_align - map_base));
 
-        region->mmaps[i].mmap = mmap(map_align, region->mmaps[i].size, prot,
+    /*
+     * Regions should already be sorted by vfio_setup_region_sparse_mmaps().
+     * This is critical for the following algorithm which relies on range
+     * offsets being in ascending order.
+     */
+    for (i = 0; i < region->nr_mmaps; i++) {
+        munmap(map_align + map_offset, region->mmaps[i].offset - map_offset);
+        region->mmaps[i].mmap = mmap(map_align + region->mmaps[i].offset,
+                                     region->mmaps[i].size, prot,
                                      MAP_SHARED | MAP_FIXED, fd,
                                      region->fd_offset +
                                      region->mmaps[i].offset);
         if (region->mmaps[i].mmap == MAP_FAILED) {
             ret = -errno;
+            /*
+             * Only unmap the rest of the region. Any mmaps that were successful
+             * will be unmapped in no_mmap.
+             */
+            munmap(map_align + region->mmaps[i].offset,
+                   region->size - region->mmaps[i].offset);
             goto no_mmap;
         }
 
@@ -408,6 +431,15 @@ int vfio_region_mmap(VFIORegion *region)
                                region->mmaps[i].offset,
                                region->mmaps[i].offset +
                                region->mmaps[i].size - 1);
+
+        map_offset = region->mmaps[i].offset + region->mmaps[i].size;
+    }
+
+    /*
+     * Unmap the rest of the region not covered by sparse mmap.
+     */
+    if (map_offset < region->size) {
+        munmap(map_align + map_offset, region->size - map_offset);
     }
 
     if (!vfio_region_create_dma_buf(region, &local_err)) {
-- 
2.53.0

From nobody Sun Apr 12 04:22:08 2026
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com
ARC-Seal: i=1; a=rsa-sha256; t=1771423293; cv=none; d=zohomail.com; s=zohoarc; b=irSEEJQD29+WVNtoLqLA3zjDZ9n3i7m84+/UC/mgpCUbdGgG7M7lC+9e9htLF35BQTq0Kdcx2t+vO1fIT+JpX1z0FVcnfpEJ0mTqBlR1d5MtctvOQWACOg3ApfYJ7ZxRYUvWoFs35Q98rPT2W5E3L+xGHtfxUJa93UVPXHleZX0=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1771423293; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To;
 bh=OOee9YxXmCfpONqwsL6eQDaAixD1oOSK0bCHKeQK5ns=; b=BnB4x++ZHU5I+M0IeefZb4ScLUHc1EvBocQIcC0kqJ23BRxzNhwuZCK+OkzMlz7JdS8myvDktDEVcV6fKucC0CtbEXrWX4ew7mK2lT+l6YVPezJ3P8fwVh+6wq2pzQhNteXKyARbYjuxI+lTAvAsIfMXNBn9aX+aEAzd8tPgyTI=
ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none)
Return-Path:
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?=
To: qemu-devel@nongnu.org
Cc: Alex Williamson, Vivek Kasireddy, =?UTF-8?q?C=C3=A9dric=20Le=20Goater?=
Subject: [PULL 5/5] vfio: Document vfio_device_get_region_info()
Date: Wed, 18 Feb 2026 15:00:03 +0100
Message-ID: <20260218140003.1554502-6-clg@redhat.com>
In-Reply-To: <20260218140003.1554502-1-clg@redhat.com>
References: <20260218140003.1554502-1-clg@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.043, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
 DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: qemu development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZohoMail-DKIM: pass (identity @redhat.com)
X-ZM-MESSAGEID: 1771423296009154100

From: Vivek Kasireddy

Add documentation for vfio_device_get_region_info() and clarify the
expectations around its usage.

Cc: Alex Williamson
Cc: Cédric Le Goater
Reviewed-by: Cédric Le Goater
Signed-off-by: Vivek Kasireddy
Link: https://lore.kernel.org/qemu-devel/20260210070155.1176081-8-vivek.kasireddy@intel.com
Signed-off-by: Cédric Le Goater
---
 include/hw/vfio/vfio-device.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index 35a5ec6d9224df515c354fe50fcf7c80c1241c8c..828a31c006ec9fc372760fee023da31b0e7acc4b 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -275,6 +275,19 @@ bool vfio_device_get_host_iommu_quirk_bypass_ro(VFIODevice *vbasedev,
 int vfio_device_get_feature(VFIODevice *vbasedev,
                             struct vfio_device_feature *feature);
 
+/**
+ * Return the region info for a given region index. The region info includes
+ * details such as size, offset, and capabilities. Note that the returned
+ * info pointer is either a cached copy or newly allocated by
+ * vfio_device_get_region_info(), so the caller is not expected to allocate
+ * or free it.
+ *
+ * @vbasedev: #VFIODevice to use
+ * @index: region index
+ * @info: pointer to store the region info
+ *
+ * Returns 0 on success or a negative value on error.
+ */
 int vfio_device_get_region_info(VFIODevice *vbasedev, int index,
                                 struct vfio_region_info **info);
 int vfio_device_get_region_info_type(VFIODevice *vbasedev, uint32_t type,
-- 
2.53.0
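
The ownership rule in the comment above (the callee hands out a cached or
freshly allocated pointer, and the caller neither allocates nor frees it)
can be illustrated with a small standalone mock. This is only a sketch
under stated assumptions: demo_device, demo_region_info,
demo_get_region_info and demo_device_cleanup are hypothetical names, the
field values are placeholders, and none of it reflects how QEMU actually
stores region info.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_NUM_REGIONS 4

/* Hypothetical, simplified stand-in for struct vfio_region_info. */
struct demo_region_info {
    uint32_t index;
    uint64_t size;
    uint64_t offset;
};

/* Hypothetical device that owns a lazily filled per-region cache. */
struct demo_device {
    struct demo_region_info *cache[DEMO_NUM_REGIONS];
};

/*
 * Return the info for region 'index' through *info. The pointer is owned
 * by the device: it is allocated on first use, reused on later calls, and
 * must not be freed by the caller. Returns 0 or a negative errno value.
 */
static int demo_get_region_info(struct demo_device *dev, int index,
                                struct demo_region_info **info)
{
    assert(dev && info);

    if (index < 0 || index >= DEMO_NUM_REGIONS) {
        return -EINVAL;
    }
    if (!dev->cache[index]) {
        struct demo_region_info *fresh = calloc(1, sizeof(*fresh));

        if (!fresh) {
            return -ENOMEM;
        }
        fresh->index = (uint32_t)index;
        fresh->size = 0x1000;                  /* placeholder value */
        fresh->offset = (uint64_t)index << 40; /* placeholder value */
        dev->cache[index] = fresh;
    }
    *info = dev->cache[index];
    return 0;
}

/* The device, not the caller, releases every cached entry. */
static void demo_device_cleanup(struct demo_device *dev)
{
    for (int i = 0; i < DEMO_NUM_REGIONS; i++) {
        free(dev->cache[i]);
        dev->cache[i] = NULL;
    }
}
```

In this mock, two lookups of the same index return the same pointer, and
the only free() happens in demo_device_cleanup(). A caller that freed the
returned pointer would leave a dangling entry in the cache, which is the
kind of misuse the new documentation is meant to prevent.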