From: William Henderson
To: "qemu-devel@nongnu.org"
CC: Thanos Makatos, John Levon, "alex.williamson@redhat.com", "stefanha@gmail.com", "philmd@linaro.org", "clg@redhat.com", "jag.raman@oracle.com", William Henderson
Subject: [RFC PATCH] vfio-user: add live migration to vfio-user protocol specification
Date: Tue, 18 Jul 2023 09:42:16 +0000
Message-ID: <20230718094150.110183-1-william.henderson@nutanix.com>

This patch adds live migration to the vfio-user specification, based on the
new VFIO migration interface introduced in the kernel here:

https://lore.kernel.org/all/20220224142024.147653-10-yishaih@nvidia.com/

We differ from the VFIO protocol in that, while VFIO transfers migration data
using a file descriptor, we simply use the already-established vfio-user socket
with two additional commands, VFIO_USER_MIG_DATA_READ and
VFIO_USER_MIG_DATA_WRITE, which have stream semantics.
We also don't use P2P states as we don't yet have a use-case for them,
although this may change in the future.

This patch should be applied on the previous pending patch which introduces
the vfio-user protocol:

https://lists.nongnu.org/archive/html/qemu-devel/2023-06/msg06567.html

Signed-off-by: William Henderson
---
 docs/devel/vfio-user.rst | 413 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 396 insertions(+), 17 deletions(-)

diff --git a/docs/devel/vfio-user.rst b/docs/devel/vfio-user.rst
index 0d96477a68..f433579db0 100644
--- a/docs/devel/vfio-user.rst
+++ b/docs/devel/vfio-user.rst
@@ -4,7 +4,7 @@ vfio-user Protocol Specification
 ********************************
 
 --------------
-Version_ 0.9.1
+Version_ 0.9.2
 --------------
 
 .. contents:: Table of Contents
@@ -366,6 +366,9 @@ Name                                    Command    Request Direction
 ``VFIO_USER_DMA_WRITE``                 12         server -> client
 ``VFIO_USER_DEVICE_RESET``              13         client -> server
 ``VFIO_USER_REGION_WRITE_MULTI``        15         client -> server
+``VFIO_USER_DEVICE_FEATURE``            16         client -> server
+``VFIO_USER_MIG_DATA_READ``             17         client -> server
+``VFIO_USER_MIG_DATA_WRITE``            18         client -> server
 ======================================  =========  =================
 
 Header
@@ -508,26 +511,10 @@ Capabilities:
 |                    |         | valid simultaneously. Optional, with a         |
 |                    |         | value of 65535 (64k-1).                        |
 +--------------------+---------+------------------------------------------------+
-| migration          | object  | Migration capability parameters. If missing    |
-|                    |         | then migration is not supported by the sender. |
-+--------------------+---------+------------------------------------------------+
 | write_multiple     | boolean | ``VFIO_USER_REGION_WRITE_MULTI`` messages      |
 |                    |         | are supported if the value is ``true``.        |
 +--------------------+---------+------------------------------------------------+
 
-The migration capability contains the following name/value pairs:
-
-+-----------------+--------+--------------------------------------------------+
-| Name            | Type   | Description                                      |
-+=================+========+==================================================+
-| pgsize          | number | Page size of dirty pages bitmap. The smallest    |
-|                 |        | between the client and the server is used.       |
-+-----------------+--------+--------------------------------------------------+
-| max_bitmap_size | number | Maximum bitmap size in ``VFIO_USER_DIRTY_PAGES`` |
-|                 |        | and ``VFIO_DMA_UNMAP`` messages. Optional,       |
-|                 |        | with a default value of 256MB.                   |
-+-----------------+--------+--------------------------------------------------+
-
 Reply
 ^^^^^
 
@@ -1468,6 +1455,398 @@ Reply
 
 * *wr_cnt* is the number of device writes completed.
 
+``VFIO_USER_DEVICE_FEATURE``
+----------------------------
+
+This command is analogous to ``VFIO_DEVICE_FEATURE``. It is used to get, set, or
+probe feature data of the device.
+
+Request
+^^^^^^^
+
+The request payload for this message is a structure of the following format.
+
++-------+--------+--------------------------------+
+| Name  | Offset | Size                           |
++=======+========+================================+
+| argsz | 0      | 4                              |
++-------+--------+--------------------------------+
+| flags | 4      | 4                              |
++-------+--------+--------------------------------+
+|       | +---------+---------------------------+ |
+|       | | Bit     | Definition                | |
+|       | +=========+===========================+ |
+|       | | 0 to 15 | Feature bits              | |
+|       | +---------+---------------------------+ |
+|       | | 16      | VFIO_DEVICE_FEATURE_GET   | |
+|       | +---------+---------------------------+ |
+|       | | 17      | VFIO_DEVICE_FEATURE_SET   | |
+|       | +---------+---------------------------+ |
+|       | | 18      | VFIO_DEVICE_FEATURE_PROBE | |
+|       | +---------+---------------------------+ |
++-------+--------+--------------------------------+
+| data  | 8      | variable                       |
++-------+--------+--------------------------------+
+
+* *argsz* is the size of the above structure.
+
+* *flags* defines the action to be performed by the server and upon which
+  feature:
+
+  * The feature bits are the least significant 16 bits of the flags field, and
+    can be accessed using the ``VFIO_DEVICE_FEATURE_MASK`` bit mask.
+
+  * ``VFIO_DEVICE_FEATURE_GET`` instructs the server to get the data for the
+    given feature.
+
+  * ``VFIO_DEVICE_FEATURE_SET`` instructs the server to set the feature data to
+    that given in the ``data`` field of the payload.
+
+  * ``VFIO_DEVICE_FEATURE_PROBE`` instructs the server to probe for feature
+    support. If ``VFIO_DEVICE_FEATURE_GET`` and/or ``VFIO_DEVICE_FEATURE_SET``
+    are also set, the probe will only return success if all of the indicated
+    methods are supported.
+
+  ``VFIO_DEVICE_FEATURE_GET`` and ``VFIO_DEVICE_FEATURE_SET`` are mutually
+  exclusive, except for use with ``VFIO_DEVICE_FEATURE_PROBE``.
+
+* *data* is specific to the particular feature. It is not used for probing.
+
+This part of the request is analogous to VFIO's ``struct vfio_device_feature``.
+
+Reply
+^^^^^
+
+The reply payload must be the same as the request payload for setting or
+probing a feature. For getting a feature's data, the data is added in the data
+section and its length is added to ``argsz``.
+
+Device Features
+^^^^^^^^^^^^^^^
+
+The only device features supported by vfio-user are those related to migration,
+although this may change in the future. They are a subset of those supported in
+the VFIO implementation of the Linux kernel.
+
++----------------------------------------+-------+
+| Name                                   | Value |
++========================================+=======+
+| VFIO_DEVICE_FEATURE_MIGRATION          | 1     |
++----------------------------------------+-------+
+| VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE   | 2     |
++----------------------------------------+-------+
+| VFIO_DEVICE_FEATURE_DMA_LOGGING_START  | 6     |
++----------------------------------------+-------+
+| VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP   | 7     |
++----------------------------------------+-------+
+| VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT | 8     |
++----------------------------------------+-------+
+
+``VFIO_DEVICE_FEATURE_MIGRATION``
+"""""""""""""""""""""""""""""""""
+
+This feature indicates that the device can support the migration API through
+``VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE``. If ``GET`` succeeds, the ``RUNNING``
+and ``ERROR`` states are always supported. Support for additional states is
+indicated via the flags field; at least ``VFIO_MIGRATION_STOP_COPY`` must be
+set.
+
+There is no data field of the request message.
+
+The data field of the reply message is structured as follows:
+
++-------+--------+---------------------------+
+| Name  | Offset | Size                      |
++=======+========+===========================+
+| flags | 0      | 8                         |
++-------+--------+---------------------------+
+|       | +-----+--------------------------+ |
+|       | | Bit | Definition               | |
+|       | +=====+==========================+ |
+|       | | 0   | VFIO_MIGRATION_STOP_COPY | |
+|       | +-----+--------------------------+ |
+|       | | 1   | VFIO_MIGRATION_P2P       | |
+|       | +-----+--------------------------+ |
+|       | | 2   | VFIO_MIGRATION_PRE_COPY  | |
+|       | +-----+--------------------------+ |
++-------+--------+---------------------------+
+
+These flags are interpreted in the same way as VFIO.
+
+``VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE``
+""""""""""""""""""""""""""""""""""""""""
+
+Upon ``VFIO_DEVICE_FEATURE_SET``, execute a migration state change on the VFIO
+device. The new state is supplied in ``device_state``. The state transition must
+fully complete before the reply is sent.
+
+The data field of the reply message, as well as the ``SET`` request message, is
+structured as follows:
+
++--------------+--------+------+
+| Name         | Offset | Size |
++==============+========+======+
+| device_state | 0      | 4    |
++--------------+--------+------+
+| data_fd      | 4      | 4    |
++--------------+--------+------+
+
+* *device_state* is the current state of the device (for ``GET``) or the
+  state to transition to (for ``SET``). It is defined by the
+  ``vfio_device_mig_state`` enum as detailed below. These states are the states
+  of the device migration Finite State Machine.
+
++--------------------------------+-------+---------------------------------------------------------------------+
+| Name                           | State | Description                                                         |
++================================+=======+=====================================================================+
+| VFIO_DEVICE_STATE_ERROR        | 0     | The device has failed and must be reset.                            |
++--------------------------------+-------+---------------------------------------------------------------------+
+| VFIO_DEVICE_STATE_STOP         | 1     | The device does not change the internal or external state.          |
++--------------------------------+-------+---------------------------------------------------------------------+
+| VFIO_DEVICE_STATE_RUNNING      | 2     | The device is running normally.                                     |
++--------------------------------+-------+---------------------------------------------------------------------+
+| VFIO_DEVICE_STATE_STOP_COPY    | 3     | The device internal state can be read out.                          |
++--------------------------------+-------+---------------------------------------------------------------------+
+| VFIO_DEVICE_STATE_RESUMING     | 4     | The device is stopped and is loading a new internal state.          |
++--------------------------------+-------+---------------------------------------------------------------------+
+| VFIO_DEVICE_STATE_RUNNING_P2P  | 5     | (not used in vfio-user)                                             |
++--------------------------------+-------+---------------------------------------------------------------------+
+| VFIO_DEVICE_STATE_PRE_COPY     | 6     | The device is running normally but tracking internal state changes. |
++--------------------------------+-------+---------------------------------------------------------------------+
+| VFIO_DEVICE_STATE_PRE_COPY_P2P | 7     | (not used in vfio-user)                                             |
++--------------------------------+-------+---------------------------------------------------------------------+
+
+* *data_fd* is unused in vfio-user, as the ``VFIO_USER_MIG_DATA_READ`` and
+  ``VFIO_USER_MIG_DATA_WRITE`` messages are used instead for migration data
+  transport.
+
+Direct State Transitions
+""""""""""""""""""""""""
+
+The device migration FSM is a Mealy machine, so actions are taken upon the arcs
+between FSM states. The following transitions need to be supported by the
+server, a subset of those defined in ````
+(``enum vfio_device_mig_state``).
+
+* ``RUNNING -> STOP``, ``STOP_COPY -> STOP``: Stop the operation of the device.
+  The ``STOP_COPY`` arc terminates the data transfer session.
+
+* ``RESUMING -> STOP``: Terminate the data transfer session. Complete processing
+  of the migration data. Stop the operation of the device. If the delivered data
+  is found to be incomplete, inconsistent, or otherwise invalid, fail the
+  ``SET`` command and optionally transition to the ``ERROR`` state.
+
+* ``PRE_COPY -> RUNNING``: Terminate the data transfer session. The device is
+  now fully operational.
+
+* ``STOP -> RUNNING``: Start the operation of the device.
+
+* ``RUNNING -> PRE_COPY``, ``STOP -> STOP_COPY``: Begin the process of saving
+  the device state. The device operation is unchanged, but data transfer begins.
+  ``PRE_COPY`` and ``STOP_COPY`` are referred to as the "saving group" of
+  states.
+
+* ``PRE_COPY -> STOP_COPY``: Continue to transfer migration data, but stop
+  device operation.
+
+* ``STOP -> RESUMING``: Start the process of restoring the device state. The
+  internal device state may be changed to prepare the device to receive the
+  migration data.
+
+The ``STOP_COPY -> PRE_COPY`` transition is explicitly not allowed and should
+return an error if requested.
+
+``ERROR`` cannot be specified as a device state, but any transition request can
+be failed and then move the state into ``ERROR`` if the server was unable to
+execute the requested arc AND was unable to restore the device into any valid
+state. To recover from ``ERROR``, ``VFIO_USER_DEVICE_RESET`` must be used to
+return back to ``RUNNING``.
+
+If ``PRE_COPY`` is not supported, arcs touching it are removed.
+
+Complex State Transitions
+"""""""""""""""""""""""""
+
+The remaining possible transitions are to be implemented as combinations of the
+above FSM arcs. As there are multiple paths, the path should be selected based
+on the following rules:
+
+* Select the shortest path.
+
+* The path cannot have saving group states as interior arcs, only start/end
+  states.
+
+``VFIO_DEVICE_FEATURE_DMA_LOGGING_START`` / ``VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Upon ``VFIO_DEVICE_FEATURE_SET``, start/stop DMA logging. These features can
+also be probed to determine whether the device supports DMA logging.
+
+When DMA logging is started, a range of IOVAs to monitor is provided and the
+device can optimize its logging to cover only the IOVA range given. Only DMA
+writes are logged.
+
+The data field of the ``SET`` request is structured as follows:
+
++------------+--------+----------+
+| Name       | Offset | Size     |
++============+========+==========+
+| page_size  | 0      | 8        |
++------------+--------+----------+
+| num_ranges | 8      | 4        |
++------------+--------+----------+
+| reserved   | 12     | 4        |
++------------+--------+----------+
+| ranges     | 16     | variable |
++------------+--------+----------+
+
+* *page_size* hints what tracking granularity the device should try to achieve.
+  If the device cannot do the hinted page size then it's the driver's choice
+  which page size to pick based on its support. On output the device will return
+  the page size it selected.
+
+* *num_ranges* is the number of IOVA ranges to monitor. A value of zero
+  indicates that all writes should be logged.
+
+* *ranges* is an array of ``vfio_user_device_feature_dma_logging_range``
+  entries:
+
++--------+--------+------+
+| Name   | Offset | Size |
++========+========+======+
+| iova   | 0      | 8    |
++--------+--------+------+
+| length | 8      | 8    |
++--------+--------+------+
+
+  * *iova* is the base IO virtual address
+  * *length* is the length of the range to log
+
+Upon success, the response data field will be the same as the request, unless
+the page size was changed, in which case this will be reflected in the response.
+
+``VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT``
+""""""""""""""""""""""""""""""""""""""""""
+
+Upon ``VFIO_DEVICE_FEATURE_GET``, returns the dirty bitmap for a specific IOVA
+range. This operation is only valid if logging of dirty pages has been
+previously started by setting ``VFIO_DEVICE_FEATURE_DMA_LOGGING_START``.
+
+The data field of the request is structured as follows:
+
++-----------+--------+------+
+| Name      | Offset | Size |
++===========+========+======+
+| iova      | 0      | 8    |
++-----------+--------+------+
+| length    | 8      | 8    |
++-----------+--------+------+
+| page_size | 16     | 8    |
++-----------+--------+------+
+
+* *iova* is the base IO virtual address
+
+* *length* is the length of the range
+
+* *page_size* is the unit of granularity of the bitmap, and must be a power of
+  two. It doesn't have to match the value given to
+  ``VFIO_DEVICE_FEATURE_DMA_LOGGING_START`` because the driver will format its
+  internal logging to match the reporting page size possibly by replicating bits
+  if the internal page size is lower than requested
+
+The data field of the response is identical, except with the bitmap added on
+the end at offset 24.
+
+The mapping of IOVA to bits is given by:
+
+``bitmap[(addr - iova)/page_size] & (1ULL << (addr % 64))``
+
+``VFIO_USER_MIG_DATA_READ``
+---------------------------
+
+This command is used to read data from the source migration server while it is
+in a saving group state (``PRE_COPY`` or ``STOP_COPY``).
+
+This command, and ``VFIO_USER_MIG_DATA_WRITE``, are used in place of the
+``data_fd`` file descriptor in ````
+(``struct vfio_device_feature_mig_state``) to enable all data transport to use
+the single already-established UNIX socket. Hence, the migration data is
+treated like a stream, so the client must continue reading until no more
+migration data remains.
+
+Request
+^^^^^^^
+
+The request payload for this message is a structure of the following format.
+
++-------+--------+------+
+| Name  | Offset | Size |
++=======+========+======+
+| argsz | 0      | 4    |
++-------+--------+------+
+| size  | 4      | 4    |
++-------+--------+------+
+
+* *argsz* is the size of the above structure.
+
+* *size* is the size of the migration data to read.
+
+Reply
+^^^^^
+
+The reply payload for this message is a structure of the following format.
+
++-------+--------+----------+
+| Name  | Offset | Size     |
++=======+========+==========+
+| argsz | 0      | 4        |
++-------+--------+----------+
+| size  | 4      | 4        |
++-------+--------+----------+
+| data  | 8      | variable |
++-------+--------+----------+
+
+* *argsz* is the size of the above structure.
+
+* *size* indicates the size of returned migration data. If this is less than the
+  requested size, there is no more migration data to read.
+
+* *data* contains the migration data.
+
+``VFIO_USER_MIG_DATA_WRITE``
+----------------------------
+
+This command is used to write data to the destination migration server while it
+is in the ``RESUMING`` state.
+
+As above, this replaces the ``data_fd`` file descriptor for transport of
+migration data, and as such, the migration data is treated like a stream.
+
+Request
+^^^^^^^
+
+The request payload for this message is a structure of the following format.
+
++-------+--------+----------+
+| Name  | Offset | Size     |
++=======+========+==========+
+| argsz | 0      | 4        |
++-------+--------+----------+
+| size  | 4      | 4        |
++-------+--------+----------+
+| data  | 8      | variable |
++-------+--------+----------+
+
+* *argsz* is the size of the above structure.
+
+* *size* is the size of the migration data to be written.
+
+* *data* contains the migration data.
+
+Reply
+^^^^^
+
+There is no reply payload for this message.
 
 Appendices
 ==========
-- 
2.22.3
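
Illustration only, not part of the patch: the payload layouts described
in the spec text can be sketched in a few lines of Python. The sketch
packs a VFIO_USER_DEVICE_FEATURE payload that sets the device state to
STOP_COPY, and drains migration data by repeating
VFIO_USER_MIG_DATA_READ until a short reply arrives. The helper names
(pack_device_feature, pack_mig_state, read_all_migration_data,
read_chunk) are hypothetical. Little-endian byte order, counting the
data bytes in argsz on a SET request, and sending 0 for the unused
data_fd are assumptions rather than statements from the spec. The
vfio-user message header and the socket transport are left out.

```python
import struct

# Values taken from the tables in the spec text.
VFIO_DEVICE_FEATURE_MASK = 0xFFFF  # feature bits 0 to 15
VFIO_DEVICE_FEATURE_GET = 1 << 16
VFIO_DEVICE_FEATURE_SET = 1 << 17
VFIO_DEVICE_FEATURE_PROBE = 1 << 18
VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE = 2
VFIO_DEVICE_STATE_STOP_COPY = 3


def pack_device_feature(feature, action, data=b""):
    """Pack a VFIO_USER_DEVICE_FEATURE payload: argsz, flags, then data.

    Assumptions: little-endian fields, and argsz counts the data bytes.
    """
    flags = (feature & VFIO_DEVICE_FEATURE_MASK) | action
    return struct.pack("<II", 8 + len(data), flags) + data


def pack_mig_state(device_state):
    """Pack the MIG_DEVICE_STATE data field (device_state, data_fd).

    data_fd is unused in vfio-user; sending 0 is an assumption.
    """
    return struct.pack("<II", device_state, 0)


def read_all_migration_data(read_chunk, chunk_size):
    """Drain the migration stream.

    read_chunk(size) is a hypothetical callable standing in for one
    VFIO_USER_MIG_DATA_READ round trip; it returns the reply's data bytes.
    A reply shorter than the requested size means no more data remains.
    """
    out = bytearray()
    while True:
        chunk = read_chunk(chunk_size)
        out += chunk
        if len(chunk) < chunk_size:
            return bytes(out)


if __name__ == "__main__":
    request = pack_device_feature(
        VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE,
        VFIO_DEVICE_FEATURE_SET,
        pack_mig_state(VFIO_DEVICE_STATE_STOP_COPY),
    )
    print(request.hex())
```

Passing the transport in as a callable keeps the stream rule ("a short
read ends the transfer") separate from any socket handling. Note that
when the total amount of data is an exact multiple of the chunk size,
the loop issues one extra read that returns zero bytes.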