[Qemu-devel] [PATCH] hw/ppc/spapr.c: do not adjust rma_size with KVM RADIX in ppc_spapr_init

Daniel Henrique Barboza posted 1 patch 6 years, 10 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20170628194731.15742-1-danielhb@linux.vnet.ibm.com
Test FreeBSD passed
Test checkpatch passed
Test docker passed
Test s390x passed
hw/ppc/spapr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[Qemu-devel] [PATCH] hw/ppc/spapr.c: do not adjust rma_size with KVM RADIX in ppc_spapr_init
Posted by Daniel Henrique Barboza 6 years, 10 months ago
In ppc_spapr_init when setting rma_size we have the following verification:

    if (rma_alloc_size && (rma_alloc_size < node0_size)) {
        spapr->rma_size = rma_alloc_size;
    } else {
        spapr->rma_size = node0_size;

        /* With KVM, we don't actually know whether KVM supports an
         * unbounded RMA (PR KVM) or is limited by the hash table size
         * (HV KVM using VRMA), so we always assume the latter
         *
         * In that case, we also limit the initial allocations for RTAS
         * etc... to 256M since we have no way to know what the VRMA size
         * is going to be as it depends on the size of the hash table
         * isn't determined yet.
         */
        if (kvm_enabled()) {
            spapr->vrma_adjust = 1;
            spapr->rma_size = MIN(spapr->rma_size, 0x10000000);
        }

This code (and the comment that precedes it) is taking constraints and conditions
related to KVM HPT guests and filtering them with "if (kvm_enabled())". Note that
this also means that, for KVM RADIX guests, we'll change rma_size and set the
vrma_adjust flag as well.

The flag vrma_adjust is used inside 'spapr_setup_hpt_and_vrma', which in turn is
called from ppc_spapr_reset as follows:

    if (kvm_enabled() && kvmppc_has_cap_mmu_radix()) {
        /* If using KVM with radix mode available, VCPUs can be started
         * without a HPT because KVM will start them in radix mode.
         * Set the GR bit in PATB so that we know there is no HPT. */
        spapr->patb_entry = PATBE1_GR;
    } else {
        spapr_setup_hpt_and_vrma(spapr);
    }

In short, when running a KVM HPT guest, rma_size is shrinked, the flag vrma_adjust
is set and later on spapr_setup_hpt_and_vrma() is called to make the proper
adjustments. When running a KVM RADIX guest no post adjustment is made, leaving
rma_size with the value MIN(spapr->rma_size, 0x10000000).

This changed rma_size value is causing the code to populate more memory nodes
in 'spapr_populate_memory', which in turn results in differences in the memory
layout at SLOF init (alloc_top and rmo_top). At first this isn't causing bugs or
guest misbehavior in case of KVM RADIX - the problem is that this is happening
due to KVM HPT code.

This patch changes the if conditional inside ppc_spapr_init to:

        if (kvm_enabled() && !kvmppc_has_cap_mmu_radix()) {
            spapr->vrma_adjust = 1;
            spapr->rma_size = MIN(spapr->rma_size, 0x10000000);
        }

To restrict the rma changes only to KVM HPT guests. If we need to do
adjustments for RADIX we should either do it explicitly in its own code
or make it clearer that a common code applies to both HPT and RADIX.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7d9af75..117ea9d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2164,7 +2164,7 @@ static void ppc_spapr_init(MachineState *machine)
          * is going to be as it depends on the size of the hash table
          * isn't determined yet.
          */
-        if (kvm_enabled()) {
+        if (kvm_enabled() && !kvmppc_has_cap_mmu_radix()) {
             spapr->vrma_adjust = 1;
             spapr->rma_size = MIN(spapr->rma_size, 0x10000000);
         }
-- 
2.9.4


Re: [Qemu-devel] [PATCH] hw/ppc/spapr.c: do not adjust rma_size with KVM RADIX in ppc_spapr_init
Posted by David Gibson 6 years, 9 months ago
On Wed, Jun 28, 2017 at 04:47:31PM -0300, Daniel Henrique Barboza wrote:
> In ppc_spapr_init when setting rma_size we have the following verification:
> 
>     if (rma_alloc_size && (rma_alloc_size < node0_size)) {
>         spapr->rma_size = rma_alloc_size;
>     } else {
>         spapr->rma_size = node0_size;
> 
>         /* With KVM, we don't actually know whether KVM supports an
>          * unbounded RMA (PR KVM) or is limited by the hash table size
>          * (HV KVM using VRMA), so we always assume the latter
>          *
>          * In that case, we also limit the initial allocations for RTAS
>          * etc... to 256M since we have no way to know what the VRMA size
>          * is going to be as it depends on the size of the hash table
>          * isn't determined yet.
>          */
>         if (kvm_enabled()) {
>             spapr->vrma_adjust = 1;
>             spapr->rma_size = MIN(spapr->rma_size, 0x10000000);
>         }
> 
> This code (and the comment that precedes it) is taking constraints and conditions
> related to KVM HPT guests and filtering them with "if (kvm_enabled())". Note that
> this also means that, for KVM RADIX guests, we'll change rma_size and set the
> vrma_adjust flag as well.
> 
> The flag vrma_adjust is used inside 'spapr_setup_hpt_and_vrma', which in turn is
> called from ppc_spapr_reset as follows:
> 
>     if (kvm_enabled() && kvmppc_has_cap_mmu_radix()) {
>         /* If using KVM with radix mode available, VCPUs can be started
>          * without a HPT because KVM will start them in radix mode.
>          * Set the GR bit in PATB so that we know there is no HPT. */
>         spapr->patb_entry = PATBE1_GR;
>     } else {
>         spapr_setup_hpt_and_vrma(spapr);
>     }
> 
> In short, when running a KVM HPT guest, rma_size is shrinked, the flag vrma_adjust
> is set and later on spapr_setup_hpt_and_vrma() is called to make the proper
> adjustments. When running a KVM RADIX guest no post adjustment is made, leaving
> rma_size with the value MIN(spapr->rma_size, 0x10000000).
> 
> This changed rma_size value is causing the code to populate more memory nodes
> in 'spapr_populate_memory', which in turn results in differences in the memory
> layout at SLOF init (alloc_top and rmo_top). At first this isn't causing bugs or
> guest misbehavior in case of KVM RADIX - the problem is that this is happening
> due to KVM HPT code.
> 
> This patch changes the if conditional inside ppc_spapr_init to:
> 
>         if (kvm_enabled() && !kvmppc_has_cap_mmu_radix()) {
>             spapr->vrma_adjust = 1;
>             spapr->rma_size = MIN(spapr->rma_size, 0x10000000);
>         }
> 
> To restrict the rma changes only to KVM HPT guests. If we need to do
> adjustments for RADIX we should either do it explicitly in its own code
> or make it clearer that a common code applies to both HPT and RADIX.
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>

This doesn't seem right to me, on a few levels.

First, if the guest is RPT, then AFAIK, the whole concept of an RMA is
basically meaningless - so we should be reflecting that throughout not
just removing one adjustment to it.

We probably need some cleanup of the existing code here - at the
moment these RMA adjustments make guest-visible changes based on
whether we're KVM or not, which is not an ideal situation at all.

> ---
>  hw/ppc/spapr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 7d9af75..117ea9d 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2164,7 +2164,7 @@ static void ppc_spapr_init(MachineState *machine)
>           * is going to be as it depends on the size of the hash table
>           * isn't determined yet.
>           */
> -        if (kvm_enabled()) {
> +        if (kvm_enabled() && !kvmppc_has_cap_mmu_radix()) {

In addition, this looks like the wrong test.  This tests if KVM is
_capable_ of doing radix, not if we actually _are_ doing radix.  At
the moment an RPT host can't run an HPT guest, but I believe we're
planning to change that at some point.

>              spapr->vrma_adjust = 1;
>              spapr->rma_size = MIN(spapr->rma_size, 0x10000000);
>          }

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson