Date: Thu, 27 Jun 2019 15:12:52 +0200
From: Wolfgang Bumiller
To: qemu-devel@nongnu.org
Cc: Paolo Bonzini, "Dr. David Alan Gilbert"
Message-ID: <20190627131252.GA14795@olga.proxmox.com>
Subject: [Qemu-devel] balloon config change seems to break live migration from 3.0.1 to 4.0

While testing with 4.0 we've run into issues with live migration from 3.0.1 to 4.0 when a balloon device was involved.
We'd see the following error on the destination:

    qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10 read: a1 device: 1 cmask: ff wmask: c0 w1cmask:0
    qemu-system-x86_64: Failed to load PCIDevice:config
    qemu-system-x86_64: Failed to load virtio-balloon:virtio
    qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-balloon'
    qemu-system-x86_64: load of migration failed: Invalid argument

After looking through the commits I noticed that the PCI config sent for the balloon device comes from include/standard-headers/linux/virtio_balloon.h, and that this struct changed size between 3.1 and 4.0.

As a "guess" I tried reverting that change (I commented out the last two fields and the accesses to them in virtio_balloon_get_config() in hw/virtio/virtio-balloon.c), and with that the migration seems to go through successfully.

I've since also rebuilt qemu without our patches (tags v3.0.1 and v4.0.0) and also tried with master (since dgilbert mentioned on IRC that he remembered the issue and that there may have been a fix around), but got the same result.

Posting here now as dgilbert requested on IRC.
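The first error line can be checked by hand. The sketch below is only an illustration: it assumes the destination compares each config-space byte as (read ^ device) & cmask & ~wmask & ~w1cmask and rejects any nonzero result. The names unexpected_bits and check_reported_byte are made up for this sketch and are not QEMU functions. The 0xe0 write mask in the second assertion is also an assumption, chosen as what a 32-byte I/O BAR would plausibly use.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bits of one config-space byte that differ between the incoming stream
 * ("read") and the destination device ("device") and that are neither
 * guest-writable (wmask) nor write-1-to-clear (w1cmask).
 * ASSUMPTION: this mirrors the comparison behind the "Bad config data"
 * message; the function name is hypothetical.
 */
uint8_t unexpected_bits(uint8_t read, uint8_t device, uint8_t cmask,
                        uint8_t wmask, uint8_t w1cmask)
{
    return (uint8_t)((read ^ device) & cmask & ~wmask & ~w1cmask);
}

/* Values copied from the error message for i=0x10 (the first BAR byte). */
void check_reported_byte(void)
{
    /* 0xa1 ^ 0x01 = 0xa0; removing the writable bits 0xc0 leaves 0x20. */
    assert(unexpected_bits(0xa1, 0x01, 0xff, 0xc0, 0x00) == 0x20);

    /* ASSUMPTION: with a write mask of 0xe0 the same byte would pass. */
    assert(unexpected_bits(0xa1, 0x01, 0xff, 0xe0, 0x00) == 0x00);
}
```

With the logged values the only offending bit is 0x20. That would be consistent with the source using a BAR address aligned to 32 bytes where the destination expects 64-byte alignment, though this is an interpretation and not something the log states directly.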
Here are the commands used to start qemu:

Source:
    /usr/bin/kvm \
      -name randomclone \
      -chardev 'socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait' \
      -mon 'chardev=qmp,mode=control' \
      -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
      -mon 'chardev=qmp-event,mode=control' \
      -pidfile /var/run/qemu-server/101.pid \
      -daemonize \
      -smbios 'type=1,uuid=f3ab31f6-ca7d-469c-bf51-547fd9bbd2d9' \
      -smp '4,sockets=1,cores=4,maxcpus=4' \
      -nodefaults \
      -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
      -vnc unix:/var/run/qemu-server/101.vnc,password \
      -cpu host,+pcid,+spec-ctrl,+ssbd,+pdpe1gb,+kvm_pv_unhalt,+kvm_pv_eoi \
      -m 4096 \
      -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
      -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
      -device 'vmgenid,guid=fb282779-7056-4f1d-96bb-70f578294e45' \
      -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
      -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
      -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
      -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
      -iscsi 'initiator-name=iqn.1993-08.org.debian:01:856d32b504d' \
      -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' \
      -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' \
      -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
      -drive 'file=rbd:rbd/vm-101-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/rbd.keyring,if=none,id=drive-scsi0,discard=on,format=raw,cache=none,aio=native,detect-zeroes=unmap' \
      -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' \
      -netdev 'type=tap,id=net0,ifname=tap101i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
      -device 'virtio-net-pci,mac=4E:5D:50:75:4D:ED,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
      -machine 'type=pc' \
      -enable-kvm

Destination:
    /usr/bin/kvm \
      -name randomclone \
      -chardev socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait \
      -mon chardev=qmp,mode=control \
      -chardev socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5 \
      -mon chardev=qmp-event,mode=control \
      -pidfile /var/run/qemu-server/101.pid \
      -smbios type=1,uuid=f3ab31f6-ca7d-469c-bf51-547fd9bbd2d9 \
      -smp 4,sockets=1,cores=4,maxcpus=4 \
      -nodefaults \
      -boot menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg \
      -vnc unix:/var/run/qemu-server/101.vnc,password \
      -cpu host,+pcid,+spec-ctrl,+ssbd,+pdpe1gb,+kvm_pv_unhalt,+kvm_pv_eoi \
      -m 4096 \
      -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e \
      -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f \
      -device vmgenid,guid=fb282779-7056-4f1d-96bb-70f578294e45 \
      -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 \
      -device usb-tablet,id=tablet,bus=uhci.0,port=1 \
      -device VGA,id=vga,bus=pci.0,addr=0x2 \
      -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 \
      -iscsi initiator-name=iqn.1993-08.org.debian:01:ee4e4a566b \
      -drive if=none,id=drive-ide2,media=cdrom,aio=threads \
      -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 \
      -device virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5 \
      -drive file=rbd:rbd/vm-101-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/rbd.keyring,if=none,id=drive-scsi0,discard=on,format=raw,cache=none,aio=native,detect-zeroes=unmap \
      -device scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100 \
      -netdev type=tap,id=net0,ifname=tap101i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on \
      -device virtio-net-pci,mac=4E:5D:50:75:4D:ED,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300 \
      -machine type=pc-i440fx-3.0 \
      -enable-kvm \
      -incoming tcp:10.9.2.106:9989 \
      -S

This is the exact test change I made which seems to work around it, but a proper fix would be nicer. Not sure how, though.

---8<---
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index d96e4aa96f..8d631d67a8 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -623,16 +623,16 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
     config.num_pages = cpu_to_le32(dev->num_pages);
     config.actual = cpu_to_le32(dev->actual);
 
-    if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
-        config.free_page_report_cmd_id =
-                       cpu_to_le32(dev->free_page_report_cmd_id);
-    } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
-        config.free_page_report_cmd_id =
-                       cpu_to_le32(VIRTIO_BALLOON_CMD_ID_STOP);
-    } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_DONE) {
-        config.free_page_report_cmd_id =
-                       cpu_to_le32(VIRTIO_BALLOON_CMD_ID_DONE);
-    }
+    //if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
+    //    config.free_page_report_cmd_id =
+    //                   cpu_to_le32(dev->free_page_report_cmd_id);
+    //} else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
+    //    config.free_page_report_cmd_id =
+    //                   cpu_to_le32(VIRTIO_BALLOON_CMD_ID_STOP);
+    //} else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_DONE) {
+    //    config.free_page_report_cmd_id =
+    //                   cpu_to_le32(VIRTIO_BALLOON_CMD_ID_DONE);
+    //}
 
     trace_virtio_balloon_get_config(config.num_pages, config.actual);
     memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
index 9375ca2a70..86aca75972 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -48,9 +48,9 @@ struct virtio_balloon_config {
 	/* Number of pages we've actually got in balloon. */
 	uint32_t actual;
 	/* Free page report command id, readonly by guest */
-	uint32_t free_page_report_cmd_id;
-	/* Stores PAGE_POISON if page poisoning is in use */
-	uint32_t poison_val;
+	//uint32_t free_page_report_cmd_id;
+	///* Stores PAGE_POISON if page poisoning is in use */
+	//uint32_t poison_val;
 };
 
 #define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */
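
To make the size change in the diff above concrete, here is a minimal sketch of the two layouts and of what they might mean for the legacy I/O BAR. The struct names, round_up_pow2 and assumed_io_bar_size are hypothetical names for this sketch. The sizing rule itself (a 20-byte legacy header plus the device config, rounded up to a power of two) is an assumption and should be checked against the virtio-pci code before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout with the two trailing fields commented out, as in the diff. */
struct balloon_config_short {
    uint32_t num_pages;
    uint32_t actual;
};

/* Layout with the two trailing fields present. */
struct balloon_config_long {
    uint32_t num_pages;
    uint32_t actual;
    uint32_t free_page_report_cmd_id;
    uint32_t poison_val;
};

_Static_assert(sizeof(struct balloon_config_short) == 8, "two 32-bit fields");
_Static_assert(sizeof(struct balloon_config_long) == 16, "four 32-bit fields");

/* Round n up to the next power of two (helper written for this sketch). */
uint32_t round_up_pow2(uint32_t n)
{
    uint32_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/*
 * ASSUMPTION: the legacy I/O BAR covers a 20-byte header (no MSI-X)
 * followed by the device config, rounded up to a power of two.
 */
#define ASSUMED_LEGACY_HEADER_SIZE 20u

uint32_t assumed_io_bar_size(size_t config_len)
{
    return round_up_pow2(ASSUMED_LEGACY_HEADER_SIZE + (uint32_t)config_len);
}
```

Under these assumptions the short layout gives a 32-byte BAR (20 + 8 = 28, rounded up) and the long layout a 64-byte BAR (20 + 16 = 36, rounded up). That would fit the wmask of c0 in the error message, but it remains a guess until confirmed in the code.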