A virtio-mem device manages a memory region in guest physical address
space, represented as a single (currently large) memory region in QEMU,
mapped into system memory address space. Before the guest is allowed to use
memory blocks, it must coordinate with the hypervisor (plug blocks). After
a reboot, all memory is usually unplugged - when the guest comes up, it
detects the virtio-mem device and selects memory blocks to plug (based on
resize requests from the hypervisor).
Memory hot(un)plug consists of (un)plugging memory blocks via a virtio-mem
device (triggered by the guest). When unplugging blocks, we discard the
memory - similar to memory balloon inflation. In contrast to memory
ballooning, we always know which memory blocks a guest may actually use -
especially during a reboot, after a crash, or after kexec (and during
hibernation as well). Guests agree not to access unplugged memory again,
especially not via DMA.
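To illustrate what "discarding" means on the hypervisor side, here is a
minimal, hypothetical sketch (not the code from this series): for anonymous
guest RAM, the backing pages of an unplugged block can be returned to the
host kernel with madvise(MADV_DONTNEED), while the virtual mapping itself
stays in place.

```c
#include <sys/mman.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: free the host pages backing an unplugged block of
 * anonymous guest RAM.  The mapping remains valid; re-accessing the range
 * later faults in fresh zero pages (file-backed memory would need
 * fallocate(FALLOC_FL_PUNCH_HOLE) instead).
 */
int discard_ram_block(void *ram_base, uint64_t offset, uint64_t size)
{
    if (madvise((uint8_t *)ram_base + offset, size, MADV_DONTNEED) != 0) {
        perror("madvise(MADV_DONTNEED)");
        return -1;
    }
    return 0;
}
```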
The issue with vfio is that it cannot deal with random discards - for this
reason, virtio-mem and vfio can currently only be used mutually exclusively.
In particular, vfio would currently map the whole memory region (with
possibly only few or no plugged blocks), resulting in all pages getting
pinned and therefore in a higher memory consumption than expected (rendering
virtio-mem basically useless in these environments).
To make vfio work nicely with virtio-mem, we have to map only the plugged
blocks, and map/unmap properly when plugging/unplugging blocks (including
discarding of RAM when unplugging). We achieve that by using a new notifier
mechanism that communicates changes.
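As a rough sketch of the idea (the names below are made up for illustration
and do not match the actual RamDiscardMgr API added by this series), the
mechanism boils down to a listener that is told about population and discard
events at block granularity, so a consumer like vfio can map or unmap exactly
the affected range:

```c
#include <stdint.h>

/*
 * Hypothetical listener interface: a consumer (e.g. vfio) registers it
 * with the device managing the RAM region and gets notified whenever a
 * sub-range is populated (plugged) or about to be discarded (unplugged).
 */
typedef struct DiscardListener DiscardListener;

struct DiscardListener {
    /* map the now-populated range for DMA; may fail */
    int (*notify_populate)(DiscardListener *dl, uint64_t offset,
                           uint64_t size);
    /* unmap the range before its memory is discarded; must not fail */
    void (*notify_discard)(DiscardListener *dl, uint64_t offset,
                           uint64_t size);
};

/* registration also replays all currently populated ranges (illustrative) */
int discard_listener_register(DiscardListener *dl);
void discard_listener_unregister(DiscardListener *dl);
```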
It's important to map memory at the granularity at which we could see
unmaps again (-> the virtio-mem block size) - so when, e.g., plugging 100 MB
of consecutive memory with a block size of 2 MB, we need 50 mappings. When
unmapping, we can use a single vfio_unmap call for the applicable range.
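To make the arithmetic concrete, here is a hedged, self-contained sketch of
that map/unmap pattern; map_one_block() and unmap_range() are hypothetical
stand-ins, not the real vfio helpers:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define MiB (1024ULL * 1024)

/* hypothetical stand-ins for the real vfio DMA map/unmap helpers */
static int map_one_block(uint64_t iova, uint64_t size)
{
    printf("map   iova=0x%" PRIx64 " size=0x%" PRIx64 "\n", iova, size);
    return 0;
}

static void unmap_range(uint64_t iova, uint64_t size)
{
    printf("unmap iova=0x%" PRIx64 " size=0x%" PRIx64 "\n", iova, size);
}

int main(void)
{
    const uint64_t block_size = 2 * MiB;    /* virtio-mem block size */
    const uint64_t plugged    = 100 * MiB;  /* consecutively plugged range */

    /* plugging: one mapping per block -> 100 MiB / 2 MiB = 50 mappings */
    for (uint64_t off = 0; off < plugged; off += block_size) {
        if (map_one_block(off, block_size)) {
            return 1;
        }
    }

    /* unplugging: a single unmap can cover the whole applicable range */
    unmap_range(0, plugged);
    return 0;
}
```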
We expect the block size of virtio-mem devices used with vfio to be fairly
large in the future (e.g., 128 MB, 1 GB, ...), configured by the user, so
that we don't run out of mappings and to improve hot(un)plug performance;
the exact value will depend on the setup.
More info regarding virtio-mem can be found at:
https://virtio-mem.gitlab.io/
v5 is located at:
git@github.com:davidhildenbrand/qemu.git virtio-mem-vfio-v5
v4 -> v5:
- "vfio: Support for RamDiscardMgr in the !vIOMMU case"
-- Added more assertions for granularity vs. iommu supported pagesize
- "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
-- Fix accounting of mappings
- "vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus"
-- Fence off SPAPR and add some comments regarding future support.
-- Tweak patch description
- Rebase and retest
v3 -> v4:
- "vfio: Query and store the maximum number of DMA mappings
-- Limit the patch to querying and storing only
-- Renamed to "vfio: Query and store the maximum number of possible DMA
mappings"
- "vfio: Support for RamDiscardMgr in the !vIOMMU case"
-- Remove sanity checks / warning the user
- "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
-- Perform sanity checks by looking at the number of memslots and all
registered RamDiscardMgr sections
- Rebase and retest
- Reshuffled the patches slightly
v2 -> v3:
- Rebased + retested
- Fixed some typos
- Added RB's
v1 -> v2:
- "memory: Introduce RamDiscardMgr for RAM memory regions"
-- Fix some errors in the documentation
-- Make register_listener() notify about populated parts and
unregister_listener() notify about discarding populated parts, to
simplify future locking inside virtio-mem, when handling requests via a
separate thread.
- "vfio: Query and store the maximum number of DMA mappings"
-- Query number of mappings and track mappings (except for vIOMMU)
- "vfio: Support for RamDiscardMgr in the !vIOMMU case"
-- Adapt to RamDiscardMgr changes and warn via generic DMA reservation
- "vfio: Support for RamDiscardMgr in the vIOMMU case"
-- Use vmstate priority to handle migration dependencies
RFC -> v1:
- VFIO migration code. Due to missing kernel support, I cannot really test
if that part works.
- Understand/test/document vIOMMU implications, also regarding migration
- Nicer ram_block_discard_disable/require handling.
- s/SparseRAMHandler/RamDiscardMgr/, refactorings, cleanups, documentation,
testing, ...
David Hildenbrand (11):
memory: Introduce RamDiscardMgr for RAM memory regions
virtio-mem: Factor out traversing unplugged ranges
virtio-mem: Implement RamDiscardMgr interface
vfio: Support for RamDiscardMgr in the !vIOMMU case
vfio: Query and store the maximum number of possible DMA mappings
vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr
vfio: Support for RamDiscardMgr in the vIOMMU case
softmmu/physmem: Don't use atomic operations in
ram_block_discard_(disable|require)
softmmu/physmem: Extend ram_block_discard_(require|disable) by two
discard types
virtio-mem: Require only coordinated discards
vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus
hw/vfio/common.c | 348 +++++++++++++++++++++++++++++++--
hw/virtio/virtio-mem.c | 347 ++++++++++++++++++++++++++++----
include/exec/memory.h | 249 ++++++++++++++++++++++-
include/hw/vfio/vfio-common.h | 13 ++
include/hw/virtio/virtio-mem.h | 3 +
include/migration/vmstate.h | 1 +
softmmu/memory.c | 22 +++
softmmu/physmem.c | 108 +++++++---
8 files changed, 1007 insertions(+), 84 deletions(-)
--
2.29.2
On Thu, Jan 21, 2021 at 12:05:29PM +0100, David Hildenbrand wrote:
> [...]
>
> To make vfio work nicely with virtio-mem, we have to map only the plugged
> blocks, and map/unmap properly when plugging/unplugging blocks (including
> discarding of RAM when unplugging). We achieve that by using a new notifier
> mechanism that communicates changes.

series

Acked-by: Michael S. Tsirkin <mst@redhat.com>

virtio bits

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

This needs to go through vfio tree I assume.
On 27.01.21 13:45, Michael S. Tsirkin wrote:
> series
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>
> virtio bits
>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>
> This needs to go through vfio tree I assume.

Thanks Michael.

@Alex, what are your suggestions?

--
Thanks,

David / dhildenb
On 08.02.21 09:28, David Hildenbrand wrote:
> On 27.01.21 13:45, Michael S. Tsirkin wrote:
>> This needs to go through vfio tree I assume.
>
> Thanks Michael.
>
> @Alex, what are your suggestions?

Gentle ping.

--
Thanks,

David / dhildenb
On Mon, 15 Feb 2021 15:03:43 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 08.02.21 09:28, David Hildenbrand wrote:
>> @Alex, what are your suggestions?
>
> Gentle ping.

Sorry for the delay.  It looks to me like patches 1, 8, and 9 are
Memory API that are still missing an Ack from Paolo.  I'll toss in my
A-b+R-b for patches 6 and 7.  I don't see that this necessarily needs
to go in through vfio, I'm more than happy if someone else wants to
grab it.  Thanks,

Alex
On 16.02.21 19:33, Alex Williamson wrote:
> Sorry for the delay.  It looks to me like patches 1, 8, and 9 are
> Memory API that are still missing an Ack from Paolo.  I'll toss in my
> A-b+R-b for patches 6 and 7.  I don't see that this necessarily needs
> to go in through vfio, I'm more than happy if someone else wants to
> grab it.  Thanks,

Thanks, I assume patch #11 is fine with you as well?

@Paolo, it would be great if I can get your feedback on patch #1. I have
more stuff coming up that will reuse RamDiscardMgr (i.e., for better
migration handling and better guest memory dump handling).

--
Thanks,

David / dhildenb
On Tue, 16 Feb 2021 19:49:09 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 16.02.21 19:33, Alex Williamson wrote:
>> Sorry for the delay.  It looks to me like patches 1, 8, and 9 are
>> Memory API that are still missing an Ack from Paolo.
>
> Thanks, I assume patch #11 is fine with you as well?

Yep, sent my acks for it as well.  Thanks,

Alex