[PATCH v3 14/22] microvm: use 2G split unconditionally

Gerd Hoffmann posted 22 patches 5 years, 7 months ago
Maintainers: Peter Maydell <peter.maydell@linaro.org>, Sergio Lopez <slp@redhat.com>, Igor Mammedov <imammedo@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, Richard Henderson <rth@twiddle.net>, Shannon Zhao <shannon.zhaosl@gmail.com>
There is a newer version of this series
[PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Gerd Hoffmann 5 years, 7 months ago
Looks like the logiv was copied over from q35.

q35 does this for backward compatibility, there is no reason to do this
on microvm though.  So split @ 2G unconditionally.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 hw/i386/microvm.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 867d3d652145..b8f0d3283758 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -170,23 +170,9 @@ static void microvm_memory_init(MicrovmMachineState *mms)
     MemoryRegion *ram_below_4g, *ram_above_4g;
     MemoryRegion *system_memory = get_system_memory();
     FWCfgState *fw_cfg;
-    ram_addr_t lowmem;
+    ram_addr_t lowmem = 0x80000000; /* 2G */
     int i;
 
-    /*
-     * Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory
-     * and 256 Mbytes for PCI Express Enhanced Configuration Access Mapping
-     * also known as MMCFG).
-     * If it doesn't, we need to split it in chunks below and above 4G.
-     * In any case, try to make sure that guest addresses aligned at
-     * 1G boundaries get mapped to host addresses aligned at 1G boundaries.
-     */
-    if (machine->ram_size >= 0xb0000000) {
-        lowmem = 0x80000000;
-    } else {
-        lowmem = 0xb0000000;
-    }
-
     /*
      * Handle the machine opt max-ram-below-4g.  It is basically doing
      * min(qemu limit, user limit).
-- 
2.18.4
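
The effect of the change can be sketched outside QEMU as follows. This is a minimal standalone illustration, not the code in hw/i386/microvm.c: `struct ram_split`, `lowmem_old`, `lowmem_new` and `split_ram` are hypothetical names, and the sketch assumes that whatever does not fit below the split point is placed at 4 GiB, as on pc/q35.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Hypothetical helper type, not part of QEMU. */
struct ram_split {
    uint64_t below_4g;  /* bytes mapped starting at guest address 0 */
    uint64_t above_4g;  /* bytes mapped starting at guest address 4 GiB */
};

/* Old behaviour (the removed hunk): the split point depends on RAM size. */
static uint64_t lowmem_old(uint64_t ram_size)
{
    return ram_size >= 0xb0000000ULL ? 0x80000000ULL : 0xb0000000ULL;
}

/* New behaviour from the patch: always split at 2 GiB. */
static uint64_t lowmem_new(uint64_t ram_size)
{
    (void)ram_size;  /* no longer consulted */
    return 0x80000000ULL;
}

/* Assumed placement rule: fill [0, lowmem) first, put the rest above 4 GiB. */
static struct ram_split split_ram(uint64_t ram_size, uint64_t lowmem)
{
    struct ram_split s;

    assert(lowmem <= 4 * GiB);
    s.below_4g = ram_size < lowmem ? ram_size : lowmem;
    s.above_4g = ram_size - s.below_4g;
    return s;
}
```

Under these assumptions a guest with 2.5 GiB of RAM used to get all of it below 4G (split point 0xb0000000), whereas with the patch it gets 2 GiB low and the remaining 512 MiB at 4 GiB.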


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Igor Mammedov 5 years, 7 months ago
On Wed, 20 May 2020 15:19:55 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

> Looks like the logiv was copied over from q35.
> 
> q35 does this for backward compatibility, there is no reason to do this
> on microvm though.  So split @ 2G unconditionally.
> 
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  hw/i386/microvm.c | 16 +---------------
>  1 file changed, 1 insertion(+), 15 deletions(-)
> 
> diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
> index 867d3d652145..b8f0d3283758 100644
> --- a/hw/i386/microvm.c
> +++ b/hw/i386/microvm.c
> @@ -170,23 +170,9 @@ static void microvm_memory_init(MicrovmMachineState *mms)
>      MemoryRegion *ram_below_4g, *ram_above_4g;
>      MemoryRegion *system_memory = get_system_memory();
>      FWCfgState *fw_cfg;
> -    ram_addr_t lowmem;
> +    ram_addr_t lowmem = 0x80000000; /* 2G */
>      int i;
>  
> -    /*
> -     * Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory
> -     * and 256 Mbytes for PCI Express Enhanced Configuration Access Mapping
> -     * also known as MMCFG).
> -     * If it doesn't, we need to split it in chunks below and above 4G.
> -     * In any case, try to make sure that guest addresses aligned at
> -     * 1G boundaries get mapped to host addresses aligned at 1G boundaries.
> -     */
> -    if (machine->ram_size >= 0xb0000000) {
> -        lowmem = 0x80000000;
> -    } else {
> -        lowmem = 0xb0000000;
> -    }
> -
>      /*
>       * Handle the machine opt max-ram-below-4g.  It is basically doing
>       * min(qemu limit, user limit).


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Igor Mammedov 5 years, 7 months ago
On Wed, 20 May 2020 15:19:55 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

> Looks like the logiv was copied over from q35.
> 
> q35 does this for backward compatibility, there is no reason to do this
> on microvm though.  So split @ 2G unconditionally.

Not related to your ACPI rework, just an idea for the future of microvm.

I wonder whether we should carry over all this fixed RAM layout legacy from pc/q35,
with a bunch of knobs to tweak it (along with the complicated logic).

Can we just reuse pc-dimms for main RAM and let users specify the RAM layout the way they wish?


> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>  hw/i386/microvm.c | 16 +---------------
>  1 file changed, 1 insertion(+), 15 deletions(-)
> 
> diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
> index 867d3d652145..b8f0d3283758 100644
> --- a/hw/i386/microvm.c
> +++ b/hw/i386/microvm.c
> @@ -170,23 +170,9 @@ static void microvm_memory_init(MicrovmMachineState *mms)
>      MemoryRegion *ram_below_4g, *ram_above_4g;
>      MemoryRegion *system_memory = get_system_memory();
>      FWCfgState *fw_cfg;
> -    ram_addr_t lowmem;
> +    ram_addr_t lowmem = 0x80000000; /* 2G */
>      int i;
>  
> -    /*
> -     * Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory
> -     * and 256 Mbytes for PCI Express Enhanced Configuration Access Mapping
> -     * also known as MMCFG).
> -     * If it doesn't, we need to split it in chunks below and above 4G.
> -     * In any case, try to make sure that guest addresses aligned at
> -     * 1G boundaries get mapped to host addresses aligned at 1G boundaries.
> -     */
> -    if (machine->ram_size >= 0xb0000000) {
> -        lowmem = 0x80000000;
> -    } else {
> -        lowmem = 0xb0000000;
> -    }
> -
>      /*
>       * Handle the machine opt max-ram-below-4g.  It is basically doing
>       * min(qemu limit, user limit).


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Gerd Hoffmann 5 years, 6 months ago
On Thu, May 21, 2020 at 11:29:21AM +0200, Igor Mammedov wrote:
> On Wed, 20 May 2020 15:19:55 +0200
> Gerd Hoffmann <kraxel@redhat.com> wrote:
> 
> > Looks like the logiv was copied over from q35.
> > 
> > q35 does this for backward compatibility, there is no reason to do this
> > on microvm though.  So split @ 2G unconditionally.
> 
> not related to your ACPI rework, but just an idea for future of microvm
> 
> I wonder if we should carry over all this fixed RAM layout legacy from pc/q35
> with a bunch of knobs to tweak it (along with complicated logic).

Well, I think we can (should) drop max-ram-below-4g too.  There is
no reason to use that with microvm, other than shooting yourself in
the foot (by making mmio overlap with ram).

With that being gone too there isn't much logic left ...

take care,
  Gerd


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Igor Mammedov 5 years, 6 months ago
On Mon, 25 May 2020 13:45:08 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

> On Thu, May 21, 2020 at 11:29:21AM +0200, Igor Mammedov wrote:
> > On Wed, 20 May 2020 15:19:55 +0200
> > Gerd Hoffmann <kraxel@redhat.com> wrote:
> >   
> > > Looks like the logiv was copied over from q35.
> > > 
> > > q35 does this for backward compatibility, there is no reason to do this
> > > on microvm though.  So split @ 2G unconditionally.  
> > 
> > not related to your ACPI rework, but just an idea for future of microvm
> > 
> > I wonder if we should carry over all this fixed RAM layout legacy from pc/q35
> > with a bunch of knobs to tweak it (along with complicated logic).  
> 
> Well, I think we can (should) drop max-ram-below-4g too.  There is
> no reason to use that with microvm, other that shooting yourself into
> the foot (by making mmio overlap with ram).
> 
> With that being gone too there isn't much logic left ...

I wonder if we need the 2G split for microvm at all?
Can we map one big contiguous blob from GPA 0 and overlay the BIOS and other x86 TOLUD stuff on top?


> take care,
>   Gerd
> 


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Gerd Hoffmann 5 years, 6 months ago
> > Well, I think we can (should) drop max-ram-below-4g too.  There is
> > no reason to use that with microvm, other that shooting yourself into
> > the foot (by making mmio overlap with ram).
> > 
> > With that being gone too there isn't much logic left ...
> 
> I wonder if we need 2G split for microvm at all?
> Can we map 1 contiguous big blob from 0 GPA and overlay bios & other x86 TOLUD stuff?

I think it would work, but it has some drawbacks:

  (1) we lose a bit of memory.
  (2) we lose a gigabyte page.
  (3) we wouldn't have guard pages (unused address space)
      between ram and mmio space.

take care,
  Gerd


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Igor Mammedov 5 years, 6 months ago
On Tue, 26 May 2020 06:48:39 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

> > > Well, I think we can (should) drop max-ram-below-4g too.  There is
> > > no reason to use that with microvm, other that shooting yourself into
> > > the foot (by making mmio overlap with ram).
> > > 
> > > With that being gone too there isn't much logic left ...  
> > 
> > I wonder if we need 2G split for microvm at all?
> > Can we map 1 contiguous big blob from 0 GPA and overlay bios & other x86 TOLUD stuff?  
> 
> I think it would work, but it has some drawbacks:
> 
>   (1) we loose a bit of memory.
          It's probably not big enough to care about; we do a similar overlay mapping on pc/q35
          at the beginning of RAM.
>   (2) we loose a gigabyte page.
          I'm not sure what exactly we lose in this case.
          Let's assume we allocate the guest 5G of contiguous RAM using 1G huge pages.
          In this case I'd think that on the host side the MMIO overlay won't affect the RAM blob.
          On the guest side the page tables will be fragmented due to the MMIO holes, but the guest
          could still use smaller huge pages in the fragmented area and 1G pages where there is no fragmentation.

>   (3) we wouldn't have guard pages (unused address space) between
>       between ram and mmio space.
           If it's the holes' mmio, then do we really need them (access is going to be terminated
           either in always-valid RAM or in a valid mmio hole)?
> 
> take care,
>   Gerd
> 
> 


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Paolo Bonzini 5 years, 6 months ago
On 27/05/20 14:25, Igor Mammedov wrote:
>>   (2) we loose a gigabyte page.
>           I'm not sure waht exactly we loose in this case.
>           Lets assume we allocating guest 5G of continuous RAM using 1G huge pages,
>           in this case I'd think that on host side MMIO overlay won't affect RAM blob
>           on guest side pagetables will be fragmented due to MMIO holes, but guest still
>           could use huge pages smaller ones in fragmented area and 1G where there is no fragmentation.

Access to the 3G-4G area would not be able to use 1G EPT pages.

But why use 2G split instead of 3G?  There's only very little MMIO and
no PCI hole (including no huge MMCONFIG BAR) on microvm.

Paolo
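
The 1G-page argument can be illustrated with a small standalone check. It is a sketch under stated assumptions, not KVM or QEMU code: a 1 GiB-aligned guest-physical frame is treated as eligible for a single 1G mapping only if it is wholly RAM and no MMIO hole is overlaid on it. Host-side alignment and backing are ignored. `struct range`, `gig_frame_ok` and the MMIO address used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Half-open guest-physical range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

static bool contains(struct range outer, struct range inner)
{
    return outer.start <= inner.start && inner.end <= outer.end;
}

static bool overlaps(struct range a, struct range b)
{
    return a.start < b.end && b.start < a.end;
}

/*
 * True if the 1 GiB frame number idx lies entirely inside the RAM range
 * and is not touched by the MMIO hole, i.e. nothing forces it to be
 * broken up into smaller mappings in this simplified model.
 */
static bool gig_frame_ok(unsigned idx, struct range ram, struct range hole)
{
    struct range frame = { idx * GiB, (idx + 1) * GiB };

    assert(ram.start <= ram.end && hole.start <= hole.end);
    return contains(ram, frame) && !overlaps(frame, hole);
}
```

With a contiguous 5 GiB blob and a hole assumed at 0xfeb00000..4G, frame 3 (3G-4G) fails the check while frames 2 and 4 pass. With low RAM ending at 2 GiB or at 0xb0000000, frame 2 (2G-3G) fails; with low RAM ending at 3 GiB it passes, which is the gain from a 3G split.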


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Gerd Hoffmann 5 years, 6 months ago
  Hi,

> But why use 2G split instead of 3G?  There's only very little MMIO and
> no PCI hole (including no huge MMCONFIG BAR) on microvm.

Yes, we can go for 3G, we are indeed not short on address space ;)

take care,
  Gerd


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Igor Mammedov 5 years, 6 months ago
On Wed, 27 May 2020 15:06:28 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 27/05/20 14:25, Igor Mammedov wrote:
> >>   (2) we loose a gigabyte page.  
> >           I'm not sure waht exactly we loose in this case.
> >           Lets assume we allocating guest 5G of continuous RAM using 1G huge pages,
> >           in this case I'd think that on host side MMIO overlay won't affect RAM blob
> >           on guest side pagetables will be fragmented due to MMIO holes, but guest still
> >           could use huge pages smaller ones in fragmented area and 1G where there is no fragmentation.  
> 
> Access to the 3G-4G area would not be able to use 1G EPT pages.
Could it use 2 MiB pages instead of 1 GiB?
Do we really care about a 1 gigabyte huge page in microvm's intended use case?
(fast-starting VMs for microservices like FaaS, which are unlikely to use much memory to begin with)

> But why use 2G split instead of 3G?  There's only very little MMIO and
> no PCI hole (including no huge MMCONFIG BAR) on microvm.
> 
> Paolo
> 


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Paolo Bonzini 5 years, 6 months ago
On 27/05/20 16:26, Igor Mammedov wrote:
> On Wed, 27 May 2020 15:06:28 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> On 27/05/20 14:25, Igor Mammedov wrote:
>>>>   (2) we loose a gigabyte page.  
>>>           I'm not sure waht exactly we loose in this case.
>>>           Lets assume we allocating guest 5G of continuous RAM using 1G huge pages,
>>>           in this case I'd think that on host side MMIO overlay won't affect RAM blob
>>>           on guest side pagetables will be fragmented due to MMIO holes, but guest still
>>>           could use huge pages smaller ones in fragmented area and 1G where there is no fragmentation.  
>>
>> Access to the 3G-4G area would not be able to use 1G EPT pages.
> Could it use 2Mb pages instead of 1Gb?

Yes, probably a mix of 2 MiB pages and 4 KiB pages around the memslot
splits.

> Do we really care about 1 gigabyte huge page in microvm intended usecase?
> (fast starting VMs for microservices like FaaS, which unlikely would use much memory to begin with)

I honestly don't think it's measurable, but at least in theory we care
because such workloads could have more TLB misses (relative to the
execution time) than long-lasting VMs.

Paolo


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Igor Mammedov 5 years, 6 months ago
On Wed, 27 May 2020 16:26:46 +0200
Igor Mammedov <imammedo@redhat.com> wrote:

> On Wed, 27 May 2020 15:06:28 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > On 27/05/20 14:25, Igor Mammedov wrote:  
> > >>   (2) we loose a gigabyte page.    
> > >           I'm not sure waht exactly we loose in this case.
> > >           Lets assume we allocating guest 5G of continuous RAM using 1G huge pages,
> > >           in this case I'd think that on host side MMIO overlay won't affect RAM blob
> > >           on guest side pagetables will be fragmented due to MMIO holes, but guest still
> > >           could use huge pages smaller ones in fragmented area and 1G where there is no fragmentation.    
> > 
> > Access to the 3G-4G area would not be able to use 1G EPT pages.  
> Could it use 2Mb pages instead of 1Gb?
> Do we really care about 1 gigabyte huge page in microvm intended usecase?
> (fast starting VMs for microservices like FaaS, which unlikely would use much memory to begin with)

My interest in having a single memory region is the possibility of a drop-in conversion to [nv|pc-]dimm later on
without breaking ABI. (I'm not sure that we actually need it, though.)


> > But why use 2G split instead of 3G?  There's only very little MMIO and
> > no PCI hole (including no huge MMCONFIG BAR) on microvm.
> > 
> > Paolo
> >   
> 
> 


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Gerd Hoffmann 5 years, 6 months ago
  Hi,

> >   (1) we loose a bit of memory.
>           it's probably not a big enough to care about, we do similar ovarlay mapping on pc/q35
>           at the beginning of RAM

Yes, shouldn't be too much.

> >   (2) we loose a gigabyte page.
>           I'm not sure waht exactly we loose in this case.

The 1G page for 0xc0000000 -> 0xffffffff (as explained by Paolo).

> >   (3) we wouldn't have guard pages (unused address space) between
> >       between ram and mmio space.
>            if it's holes' mmio,then do we really need them (access is going to be terminated
>            either in always valid RAM or in valid mmio hole)?

Not required, but more robust.  Less likely that the guest touches mmio
by accident.

I'd expect it also requires some e820 hacks.

cheers,
  Gerd
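
To make the "lose a bit of memory" and e820 points concrete, here is a minimal sketch of what a single contiguous blob with an overlaid hole would imply. It is an illustration only and not how QEMU builds its e820 table: `struct range` and `ram_minus_hole` are hypothetical, and a single hole is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open guest-physical range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

/*
 * RAM is one blob [0, ram_size) with one hole overlaid on top of it.
 * Writes the RAM ranges the guest could still be told about (at most
 * two) into out[] and returns how many were written.  *hidden receives
 * the number of RAM bytes shadowed by the hole, i.e. the lost memory.
 */
static size_t ram_minus_hole(uint64_t ram_size, struct range hole,
                             struct range out[2], uint64_t *hidden)
{
    size_t n = 0;
    uint64_t lo, hi;

    assert(hole.start <= hole.end);
    /* Clamp the hole to the RAM blob. */
    lo = hole.start < ram_size ? hole.start : ram_size;
    hi = hole.end < ram_size ? hole.end : ram_size;
    *hidden = hi - lo;

    if (lo > 0) {
        out[n++] = (struct range){ 0, lo };
    }
    if (hi < ram_size) {
        out[n++] = (struct range){ hi, ram_size };
    }
    return n;
}
```

For a 5 GiB blob with a hole covering 3G-4G this yields two RAM ranges, 0-3G and 4G-5G, with 1 GiB hidden behind the hole. A real hole would be much smaller, but the guest-visible map still has to be carved up around it.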


Re: [PATCH v3 14/22] microvm: use 2G split unconditionally
Posted by Philippe Mathieu-Daudé 5 years, 7 months ago
On 5/20/20 3:19 PM, Gerd Hoffmann wrote:
> Looks like the logiv was copied over from q35.

Typo 'logiv' -> 'logic'.

> 
> q35 does this for backward compatibility, there is no reason to do this
> on microvm though.  So split @ 2G unconditionally.

Yes please!

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> 
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>   hw/i386/microvm.c | 16 +---------------
>   1 file changed, 1 insertion(+), 15 deletions(-)
> 
> diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
> index 867d3d652145..b8f0d3283758 100644
> --- a/hw/i386/microvm.c
> +++ b/hw/i386/microvm.c
> @@ -170,23 +170,9 @@ static void microvm_memory_init(MicrovmMachineState *mms)
>       MemoryRegion *ram_below_4g, *ram_above_4g;
>       MemoryRegion *system_memory = get_system_memory();
>       FWCfgState *fw_cfg;
> -    ram_addr_t lowmem;
> +    ram_addr_t lowmem = 0x80000000; /* 2G */
>       int i;
>   
> -    /*
> -     * Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory
> -     * and 256 Mbytes for PCI Express Enhanced Configuration Access Mapping
> -     * also known as MMCFG).
> -     * If it doesn't, we need to split it in chunks below and above 4G.
> -     * In any case, try to make sure that guest addresses aligned at
> -     * 1G boundaries get mapped to host addresses aligned at 1G boundaries.
> -     */
> -    if (machine->ram_size >= 0xb0000000) {
> -        lowmem = 0x80000000;
> -    } else {
> -        lowmem = 0xb0000000;
> -    }
> -
>       /*
>        * Handle the machine opt max-ram-below-4g.  It is basically doing
>        * min(qemu limit, user limit).
>