[Qemu-devel] [PATCH] q35: split memory at 2G

Gerd Hoffmann posted 1 patch 4 years, 10 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20190528204838.21568-1-kraxel@redhat.com
Maintainers: Richard Henderson <rth@twiddle.net>, Eduardo Habkost <ehabkost@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, "Michael S. Tsirkin" <mst@redhat.com>
There is a newer version of this series
[Qemu-devel] [PATCH] q35: split memory at 2G
Posted by Gerd Hoffmann 4 years, 10 months ago
Original q35 behavior was to split memory 2.75 GB, leaving space for the
mmconfig bar at 0xb0000000 and pci I/O window starting at 0xc0000000.

Note: Those machine types have been removed from the qemu codebase
meanwhile because they could not be live-migrated so there was little
value in keeping them around.

With the effort to allow for gigabyte-alignment of guest memory that
behavior was changed:  The split was moved to 2G, but only in case the
memory didn't fit below 2.75 GB.

So today the address space between 2G and 2,75G is not used for guest
memory in typical use cases, where the guest memory sized at a power of
two or a gigabyte number.  But if you configure your guest with some odd
amout of memory (such as 2.5G) the address space is used.

This patch removes that oddity for 4.1+ machine types.  The memory is
splitted at 2G no matter what.

Cc: László Érsek <lersek@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 include/hw/i386/pc.h | 1 +
 hw/i386/pc.c         | 1 +
 hw/i386/pc_q35.c     | 7 ++++++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 43df7230a22b..d88179a3b21e 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -120,6 +120,7 @@ typedef struct PCMachineClass {
 
     /* RAM / address space compat: */
     bool gigabyte_align;
+    bool gigabyte_split;
     bool has_reserved_memory;
     bool enforce_aligned_dimm;
     bool broken_reserved_end;
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2632b73f800b..828eeb36e398 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2719,6 +2719,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     pcmc->smbios_defaults = true;
     pcmc->smbios_uuid_encoded = true;
     pcmc->gigabyte_align = true;
+    pcmc->gigabyte_split = true;
     pcmc->has_reserved_memory = true;
     pcmc->kvmclock_enabled = true;
     pcmc->enforce_aligned_dimm = true;
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 37dd350511a9..266671a9d544 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -143,8 +143,10 @@ static void pc_q35_init(MachineState *machine)
      * If it doesn't, we need to split it in chunks below and above 4G.
      * In any case, try to make sure that guest addresses aligned at
      * 1G boundaries get mapped to host addresses aligned at 1G boundaries.
+     *
+     * qemu 4.1+ machines: split at 2G unconditionally (gigabyte_split = true)
      */
-    if (machine->ram_size >= 0xb0000000) {
+    if (machine->ram_size >= 0xb0000000 || pcmc->gigabyte_split) {
         lowmem = 0x80000000;
     } else {
         lowmem = 0xb0000000;
@@ -376,8 +378,11 @@ DEFINE_Q35_MACHINE(v4_1, "pc-q35-4.1", NULL,
 
 static void pc_q35_4_0_machine_options(MachineClass *m)
 {
+    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+
     pc_q35_4_1_machine_options(m);
     m->alias = NULL;
+    pcmc->gigabyte_split = false;
     compat_props_add(m->compat_props, hw_compat_4_0, hw_compat_4_0_len);
     compat_props_add(m->compat_props, pc_compat_4_0, pc_compat_4_0_len);
 }
-- 
2.18.1
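
For readers following the arithmetic, the split decision in the pc_q35.c hunk can be sketched as a standalone C function. This is an illustrative sketch, not QEMU code: `q35_split`, `struct ram_split` and the parameter names are made up for this note. Only the two constants and the `if` condition come from the patch. How the remainder is divided below and above 4G is an assumption based on the surrounding comment, and any user-configured cap on low memory is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch only. */
struct ram_split {
    uint64_t below_4g;  /* guest RAM mapped below the split point */
    uint64_t above_4g;  /* guest RAM mapped starting at 4G */
};

/* Mirrors the condition in the patched hunk; all names are illustrative. */
static inline struct ram_split q35_split(uint64_t ram_size, bool gigabyte_split)
{
    uint64_t lowmem;
    struct ram_split s;

    if (ram_size >= 0xb0000000ULL || gigabyte_split) {
        lowmem = 0x80000000ULL;   /* split at 2 GB */
    } else {
        lowmem = 0xb0000000ULL;   /* split at 2.75 GB */
    }

    /* Assumed handling: RAM beyond the split point moves above 4G. */
    if (ram_size >= lowmem) {
        s.below_4g = lowmem;
        s.above_4g = ram_size - lowmem;
    } else {
        s.below_4g = ram_size;
        s.above_4g = 0;
    }
    return s;
}
```

With a 2.5 GB guest (0xa0000000), `q35_split(0xa0000000, false)` keeps all RAM below 4G, while `q35_split(0xa0000000, true)` puts 2 GB below the split and the remaining 0x20000000 bytes above 4G.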


Re: [Qemu-devel] [PATCH] q35: split memory at 2G
Posted by Eric Blake 4 years, 10 months ago
On 5/28/19 3:48 PM, Gerd Hoffmann wrote:
> Original q35 behavior was to split memory 2.75 GB, leaving space for the

s/memory/memory at/

> mmconfig bar at 0xb0000000 and pci I/O window starting at 0xc0000000.
> 
> Note: Those machine types have been removed from the qemu codebase
> meanwhile because they could not be live-migrated so there was little
> value in keeping them around.
> 
> With the effort to allow for gigabyte-alignment of guest memory that
> behavior was changed:  The split was moved to 2G, but only in case the
> memory didn't fit below 2.75 GB.
> 
> So today the address space between 2G and 2,75G is not used for guest
> memory in typical use cases, where the guest memory sized at a power of

s/memory/memory is/

> two or a gigabyte number.  But if you configure your guest with some odd
> amout of memory (such as 2.5G) the address space is used.

s/amout/amount/

> 
> This patch removes that oddity for 4.1+ machine types.  The memory is
> splitted at 2G no matter what.

s/splitted/split/

> 
> Cc: László Érsek <lersek@redhat.com>
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>  include/hw/i386/pc.h | 1 +
>  hw/i386/pc.c         | 1 +
>  hw/i386/pc_q35.c     | 7 ++++++-
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH] q35: split memory at 2G
Posted by Paolo Bonzini 4 years, 10 months ago
On 28/05/19 22:48, Gerd Hoffmann wrote:
> Original q35 behavior was to split memory 2.75 GB, leaving space for the
> mmconfig bar at 0xb0000000 and pci I/O window starting at 0xc0000000.
> 
> Note: Those machine types have been removed from the qemu codebase
> meanwhile because they could not be live-migrated so there was little
> value in keeping them around.
> 
> With the effort to allow for gigabyte-alignment of guest memory that
> behavior was changed:  The split was moved to 2G, but only in case the
> memory didn't fit below 2.75 GB.
> 
> So today the address space between 2G and 2,75G is not used for guest
> memory in typical use cases, where the guest memory sized at a power of
> two or a gigabyte number.  But if you configure your guest with some odd
> amout of memory (such as 2.5G) the address space is used.

Wasn't it done to ensure pre-PAE OSes could use as much memory as
possible?  (If you run pre-PAE OSes with more RAM than can fit below 4G,
you can just reduce the amount of memory and get all the 2.75G).

Paolo

Re: [Qemu-devel] [PATCH] q35: split memory at 2G
Posted by Michael S. Tsirkin 4 years, 10 months ago
On Wed, May 29, 2019 at 03:21:16AM +0200, Paolo Bonzini wrote:
> On 28/05/19 22:48, Gerd Hoffmann wrote:
> > Original q35 behavior was to split memory 2.75 GB, leaving space for the
> > mmconfig bar at 0xb0000000 and pci I/O window starting at 0xc0000000.
> > 
> > Note: Those machine types have been removed from the qemu codebase
> > meanwhile because they could not be live-migrated so there was little
> > value in keeping them around.
> > 
> > With the effort to allow for gigabyte-alignment of guest memory that
> > behavior was changed:  The split was moved to 2G, but only in case the
> > memory didn't fit below 2.75 GB.
> > 
> > So today the address space between 2G and 2,75G is not used for guest
> > memory in typical use cases, where the guest memory sized at a power of
> > two or a gigabyte number.  But if you configure your guest with some odd
> > amout of memory (such as 2.5G) the address space is used.
> 
> Wasn't it done to ensure pre-PAE OSes could use as much memory as
> possible?  (If you run pre-PAE OSes with more RAM than can fit below 4G,
> you can just reduce the amount of memory and get all the 2.75G).
> 
> Paolo

Absolutely. Gerd is just saying the configuration is rare enough that
it's not worth worrying about. I don't know myself - why do
we bother making this change? What's the advantage?

-- 
MST

Re: [Qemu-devel] [PATCH] q35: split memory at 2G
Posted by Gerd Hoffmann 4 years, 10 months ago
On Tue, May 28, 2019 at 10:49:55PM -0400, Michael S. Tsirkin wrote:
> On Wed, May 29, 2019 at 03:21:16AM +0200, Paolo Bonzini wrote:
> > On 28/05/19 22:48, Gerd Hoffmann wrote:
> > > Original q35 behavior was to split memory 2.75 GB, leaving space for the
> > > mmconfig bar at 0xb0000000 and pci I/O window starting at 0xc0000000.
> > > 
> > > Note: Those machine types have been removed from the qemu codebase
> > > meanwhile because they could not be live-migrated so there was little
> > > value in keeping them around.
> > > 
> > > With the effort to allow for gigabyte-alignment of guest memory that
> > > behavior was changed:  The split was moved to 2G, but only in case the
> > > memory didn't fit below 2.75 GB.
> > > 
> > > So today the address space between 2G and 2,75G is not used for guest
> > > memory in typical use cases, where the guest memory sized at a power of
> > > two or a gigabyte number.  But if you configure your guest with some odd
> > > amout of memory (such as 2.5G) the address space is used.
> > 
> > Wasn't it done to ensure pre-PAE OSes could use as much memory as
> > possible?  (If you run pre-PAE OSes with more RAM than can fit below 4G,
> > you can just reduce the amount of memory and get all the 2.75G).
> > 
> > Paolo
> 
> Absolutely. Gerd is just saying the configuration is rare enough that
> it's not worth worrying about. I don't know myself - why do
> we bother making this change? What's the advantage?

Some ovmf versions place the mmconfig @ 2G.  Which works fine in 99% of
the cases, but with memory sizes between 2G and 2.75G it doesn't.
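
The failing range can be checked with a few lines of C. This is only a sketch under stated assumptions: the helper names are hypothetical, the low-RAM computation copies the pre-4.1 condition from the patch context, and the mmconfig window is assumed to start exactly at 2G as described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MMCONFIG_BASE 0x80000000ULL  /* 2G, where some ovmf versions put it */

/* End of guest RAM below 4G on pre-4.1 q35 (hypothetical helper). */
static inline uint64_t old_q35_lowmem_top(uint64_t ram_size)
{
    uint64_t lowmem = ram_size >= 0xb0000000ULL ? 0x80000000ULL
                                                : 0xb0000000ULL;
    return ram_size < lowmem ? ram_size : lowmem;
}

/* True when low RAM extends past the start of the mmconfig window. */
static inline bool ram_overlaps_mmconfig(uint64_t ram_size)
{
    return old_q35_lowmem_top(ram_size) > MMCONFIG_BASE;
}
```

Under these assumptions the check is true only for RAM sizes strictly between 2G and 2.75G: a 2.5G guest overlaps, while 2G, 2.75G and 4G guests do not.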

cheers,
  Gerd


Re: [Qemu-devel] [PATCH] q35: split memory at 2G
Posted by Laszlo Ersek 4 years, 10 months ago
On 05/29/19 06:47, Gerd Hoffmann wrote:
> On Tue, May 28, 2019 at 10:49:55PM -0400, Michael S. Tsirkin wrote:
>> On Wed, May 29, 2019 at 03:21:16AM +0200, Paolo Bonzini wrote:
>>> On 28/05/19 22:48, Gerd Hoffmann wrote:
>>>> Original q35 behavior was to split memory 2.75 GB, leaving space for the
>>>> mmconfig bar at 0xb0000000 and pci I/O window starting at 0xc0000000.
>>>>
>>>> Note: Those machine types have been removed from the qemu codebase
>>>> meanwhile because they could not be live-migrated so there was little
>>>> value in keeping them around.
>>>>
>>>> With the effort to allow for gigabyte-alignment of guest memory that
>>>> behavior was changed:  The split was moved to 2G, but only in case the
>>>> memory didn't fit below 2.75 GB.
>>>>
>>>> So today the address space between 2G and 2,75G is not used for guest
>>>> memory in typical use cases, where the guest memory sized at a power of
>>>> two or a gigabyte number.  But if you configure your guest with some odd
>>>> amout of memory (such as 2.5G) the address space is used.
>>>
>>> Wasn't it done to ensure pre-PAE OSes could use as much memory as
>>> possible?  (If you run pre-PAE OSes with more RAM than can fit below 4G,
>>> you can just reduce the amount of memory and get all the 2.75G).
>>>
>>> Paolo
>>
>> Absolutely. Gerd is just saying the configuration is rare enough that
>> it's not worth worrying about. I don't know myself - why do
>> we bother making this change? What's the advantage?
> 
> Some ovmf versions place the mmconfig @ 2G.  Which works fine in 99% of
> the cases, but with memory sizes between 2G and 2.75G it doesn't.

Here are the stages of PCIEXBAR placement in OVMF:

#1 Commit 7b8fe63561b4 ("OvmfPkg: PlatformPei: enable PCIEXBAR (aka
MMCONFIG / ECAM) on Q35", 2016-03-10): places the PCIEXBAR at 2GB.
Behaves according to your description.

#2 Commit 75136b29541b ("OvmfPkg/PlatformPei: reorder the 32-bit PCI
window vs. the PCIEXBAR on q35", 2019-05-16): made for the 1% of cases
(according to your description) where the previous logic would fail,
namely with RAM between 2G and 2.75G. Places the PCIEXBAR at
0xE000_0000, and the 32-bit MMIO window below it. Unfortunately, this
causes a regression for end-users, because this ordering of areas
triggers a bug in QEMU's ACPI generator somewhere. The 32-bit MMIO
window will be clamped *above* 0xF000_0000, when it should actually end
at 0xE000_0000. Causes confusion for some guest OSes (the mildest
symptom is PCI resource reassignment).

#3 Patch series linked in
<https://bugzilla.tianocore.org/show_bug.cgi?id=1859#c1>: restores the
original order (so as to pacify the ACPI generator in QEMU). Places the
PCIEXBAR at 0xB000_0000. Places the 32-bit MMIO window at 0xC000_0000.
This wastes a bit of 32-bit MMIO space, but it has the best
compatibility. The cases under which the waste occurs are: (a) pre-4.1
Q35 machine types with low RAM size *outside* of 2GB..2.75GB, and (b)
4.1+ Q35 machine types (with this patch applied), regardless of low RAM
size. Effectively this patch set returns to the first variant, except it
bumps the PCIEXBAR base from 2GB to 2.75GB.
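
To compare the three placements side by side, here is a small C sketch. The table and function names are hypothetical; the base addresses are the ones listed in stages #1 through #3, and the only thing modelled is whether low RAM ending at `lowmem_top` stays below the PCIEXBAR base. The 32-bit MMIO window and the ACPI issue from #2 are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCIEXBAR base per stage, copied from the list above. */
static const uint64_t pciexbar_base[3] = {
    0x80000000ULL,  /* #1: 2GB */
    0xE0000000ULL,  /* #2: above the 32-bit MMIO window */
    0xB0000000ULL,  /* #3: 2.75GB */
};

/*
 * stage is 1..3. Returns true if low RAM ending at lowmem_top
 * does not reach into the PCIEXBAR for that stage.
 */
static inline bool pciexbar_above_ram(int stage, uint64_t lowmem_top)
{
    return lowmem_top <= pciexbar_base[stage - 1];
}
```

For low RAM ending at 0xA000_0000 (a 2.5GB guest on a pre-4.1 machine type), only stage #1 fails this check. With low RAM capped at 2GB, as on 4.1+ machine types with this patch, all three pass.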

Once all pre-4.1 Q35 machine types can be considered obsolete, we can
return OVMF to option#1 again. Until then, it's best to fix the host
side and the guest side both, for best compatibility for end-users.
(This is generally what we do, i.e. fix both sides, when there is a
host-guest disagreement.)

So, for this patch:

Acked-by: Laszlo Ersek <lersek@redhat.com>

I'll also push (soon) the edk2 patches for option#3 above.

(

Things I haven't discussed here: (a) why we'd like to continue
specifying the PCIEXBAR base in OVMF as a build-time constant (because
making it dynamic might introduce complications for module dispatch
order), (b) MTRR aspects (both the PCIEXBAR and the 32-bit MMIO window
should be marked UC through variable MTRRs, and while SeaBIOS currently
ignores the first, I wouldn't like to, in OVMF). All of options #1
through #3 can be made to work correctly wrt. variable MTRRs, but *some*
OVMF patches are necessary for any of those.

)

Thanks
Laszlo

Re: [Qemu-devel] [PATCH] q35: split memory at 2G
Posted by Gerd Hoffmann 4 years, 10 months ago
On Wed, May 29, 2019 at 03:21:16AM +0200, Paolo Bonzini wrote:
> On 28/05/19 22:48, Gerd Hoffmann wrote:
> > Original q35 behavior was to split memory 2.75 GB, leaving space for the
> > mmconfig bar at 0xb0000000 and pci I/O window starting at 0xc0000000.
> > 
> > Note: Those machine types have been removed from the qemu codebase
> > meanwhile because they could not be live-migrated so there was little
> > value in keeping them around.
> > 
> > With the effort to allow for gigabyte-alignment of guest memory that
> > behavior was changed:  The split was moved to 2G, but only in case the
> > memory didn't fit below 2.75 GB.
> > 
> > So today the address space between 2G and 2,75G is not used for guest
> > memory in typical use cases, where the guest memory sized at a power of
> > two or a gigabyte number.  But if you configure your guest with some odd
> > amout of memory (such as 2.5G) the address space is used.
> 
> Wasn't it done to ensure pre-PAE OSes could use as much memory as
> possible?  (If you run pre-PAE OSes with more RAM than can fit below 4G,
> you can just reduce the amount of memory and get all the 2.75G).

Well, those guests are better served with 'pc' where we don't need
address space for mmconfig and you can get 3.5G with no trouble and even
a bit more with extra tweaks (see longish comment in hw/i386/pc_piix.c
explaining all the memory handling options).

cheers,
  Gerd