Subject: [PATCH] spapr: Rework hash<->radix transitions at CAS
From: Greg Kurz
To: David Gibson
Cc: Alexey Kardashevskiy, Laurent Vivier, qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Date: Thu, 13 Feb 2020 16:38:38 +0100
Message-Id: <158160831807.3339719.7059822505220975954.stgit@bahia.lan>

Until the CAS negotiation is over, an HPT can be allocated on three
different paths:

1) during machine reset if the host doesn't support radix,

2) during CAS if the guest wants hash and doesn't support HPT resizing,
   in which case we pre-emptively resize the HPT to accommodate maxram,

3) during CAS if no CAS reboot was requested, the guest wants hash but
   we're currently configured for radix.

Depending on the various combinations of host or guest MMU support,
HPT resizing guest support and the possibility of a CAS reboot, it
is quite hard to know which of these allocates the HPT that will
be ultimately used by the guest that wants to do hash. Also, some of
them have bugs:

- 2) calls spapr_reallocate_hpt() instead of spapr_setup_hpt_and_vrma()
  and thus doesn't update the VRMA size, even though we've just extended
  the HPT. Not sure what issues this can cause,

- 3) doesn't check for HPT resizing support and will always allocate a
  small HPT based on the initial RAM size. This caps the total amount of
  RAM the guest can see, especially if maxram is much higher than the
  initial RAM.

We only support guests that do CAS and we already assume that the HPT
isn't being used when we do the pre-emptive resizing at CAS. It thus
seems reasonable to only allocate the HPT at the end of CAS, when no
CAS reboot was requested.

Consolidate the logic so that we only create the HPT during 3), i.e.
when we're done with the CAS reboot cycles, and ensure HPT resizing is
taken into account. This fixes the radix->hash transition for all cases.

The guest can theoretically call CAS several times, without a CAS reboot
in between. Linux guests don't do that, but better safe than sorry: let's
ensure we can also handle the symmetrical hash->radix transition
correctly, by freeing the HPT and setting the GR bit in PATE. A helper is
introduced for the latter since this is already what we do during
machine reset when going for radix.

As a bonus, this removes one user of spapr->cas_reboot, which we want to
get rid of in the future.

Signed-off-by: Greg Kurz
---
 hw/ppc/spapr.c         |   25 +++++++++++++++-----
 hw/ppc/spapr_hcall.c   |   59 ++++++++++++++++++++----------------------------
 include/hw/ppc/spapr.h |    1 +
 3 files changed, 44 insertions(+), 41 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 828e2cc1359a..88bc0e4e3ca1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1573,9 +1573,19 @@ void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr)
 {
     int hpt_shift;
 
+    /*
+     * HPT resizing is a bit of a special case, because when enabled
+     * we assume an HPT guest will support it until it says it
+     * doesn't, instead of assuming it won't support it until it says
+     * it does. Strictly speaking that approach could break for
+     * guests which don't make a CAS call, but those are so old we
+     * don't care about them. Without that assumption we'd have to
+     * make at least a temporary allocation of an HPT sized for max
+     * memory, which could be impossibly difficult under KVM HV if
+     * maxram is large.
+     */
     if ((spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED)
-        || (spapr->cas_reboot
-            && !spapr_ovec_test(spapr->ov5_cas, OV5_HPT_RESIZE))) {
+        || !spapr_ovec_test(spapr->ov5_cas, OV5_HPT_RESIZE)) {
         hpt_shift = spapr_hpt_shift_for_ramsize(MACHINE(spapr)->maxram_size);
     } else {
         uint64_t current_ram_size;
@@ -1604,6 +1614,12 @@ static int spapr_reset_drcs(Object *child, void *opaque)
     return 0;
 }
 
+void spapr_reset_patb_entry(SpaprMachineState *spapr)
+{
+    spapr->patb_entry = PATE1_GR;
+    spapr_set_all_lpcrs(LPCR_HR | LPCR_UPRT, LPCR_HR | LPCR_UPRT);
+}
+
 static void spapr_machine_reset(MachineState *machine)
 {
     SpaprMachineState *spapr = SPAPR_MACHINE(machine);
@@ -1624,10 +1640,7 @@ static void spapr_machine_reset(MachineState *machine)
          * without a HPT because KVM will start them in radix mode.
          * Set the GR bit in PATE so that we know there is no HPT.
          */
-        spapr->patb_entry = PATE1_GR;
-        spapr_set_all_lpcrs(LPCR_HR | LPCR_UPRT, LPCR_HR | LPCR_UPRT);
-    } else {
-        spapr_setup_hpt_and_vrma(spapr);
+        spapr_reset_patb_entry(spapr);
     }
 
     qemu_devices_reset();
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index b8bb66b5c0d4..57ddf0fa6d05 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1677,6 +1677,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
     bool raw_mode_supported = false;
     bool guest_xive;
     CPUState *cs;
+    int maxshift = spapr_hpt_shift_for_ramsize(MACHINE(spapr)->maxram_size);
 
     /* CAS is supposed to be called early when only the boot vCPU is active. */
     CPU_FOREACH(cs) {
@@ -1739,36 +1740,6 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
 
     guest_xive = spapr_ovec_test(ov5_guest, OV5_XIVE_EXPLOIT);
 
-    /*
-     * HPT resizing is a bit of a special case, because when enabled
-     * we assume an HPT guest will support it until it says it
-     * doesn't, instead of assuming it won't support it until it says
-     * it does. Strictly speaking that approach could break for
-     * guests which don't make a CAS call, but those are so old we
-     * don't care about them. Without that assumption we'd have to
-     * make at least a temporary allocation of an HPT sized for max
-     * memory, which could be impossibly difficult under KVM HV if
-     * maxram is large.
-     */
-    if (!guest_radix && !spapr_ovec_test(ov5_guest, OV5_HPT_RESIZE)) {
-        int maxshift = spapr_hpt_shift_for_ramsize(MACHINE(spapr)->maxram_size);
-
-        if (spapr->resize_hpt == SPAPR_RESIZE_HPT_REQUIRED) {
-            error_report(
-                "h_client_architecture_support: Guest doesn't support HPT resizing, but resize-hpt=required");
-            exit(1);
-        }
-
-        if (spapr->htab_shift < maxshift) {
-            /* Guest doesn't know about HPT resizing, so we
-             * pre-emptively resize for the maximum permitted RAM. At
-             * the point this is called, nothing should have been
-             * entered into the existing HPT */
-            spapr_reallocate_hpt(spapr, maxshift, &error_fatal);
-            push_sregs_to_kvm_pr(spapr);
-        }
-    }
-
     /* NOTE: there are actually a number of ov5 bits where input from the
      * guest is always zero, and the platform/QEMU enables them independently
      * of guest input. To model these properly we'd want some sort of mask,
@@ -1806,6 +1777,12 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
             error_report("Guest requested unavailable MMU mode (hash).");
             exit(EXIT_FAILURE);
         }
+        if (spapr->resize_hpt == SPAPR_RESIZE_HPT_REQUIRED &&
+            !spapr_ovec_test(ov5_guest, OV5_HPT_RESIZE)) {
+            error_report(
+                "h_client_architecture_support: Guest doesn't support HPT resizing, but resize-hpt=required");
+            exit(1);
+        }
     }
     spapr->cas_pre_isa3_guest = !spapr_ovec_test(ov1_guest, OV1_PPC_3_00);
     spapr_ovec_cleanup(ov1_guest);
@@ -1838,11 +1815,23 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
         void *fdt;
         SpaprDeviceTreeUpdateHeader hdr = { .version_id = 1 };
 
-        /* If spapr_machine_reset() did not set up a HPT but one is necessary
-         * (because the guest isn't going to use radix) then set it up here. */
-        if ((spapr->patb_entry & PATE1_GR) && !guest_radix) {
-            /* legacy hash or new hash: */
-            spapr_setup_hpt_and_vrma(spapr);
+        if (!guest_radix) {
+            /*
+             * Either spapr_machine_reset() did not set up a HPT but one
+             * is necessary (because the guest isn't going to use radix),
+             * or the guest doesn't know about HPT resizing and we need to
+             * pre-emptively resize for the maximum permitted RAM. Set it
+             * up here. At the point this is called, nothing should have
+             * been entered into the existing HPT.
+             */
+            if (spapr->patb_entry & PATE1_GR || spapr->htab_shift < maxshift) {
+                /* legacy hash or new hash: */
+                spapr_setup_hpt_and_vrma(spapr);
+                push_sregs_to_kvm_pr(spapr);
+            }
+        } else {
+            spapr_free_hpt(spapr);
+            spapr_reset_patb_entry(spapr);
         }
 
         if (fdt_bufsize < sizeof(hdr)) {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 09110961a589..9d88b5596481 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -919,4 +919,5 @@ void spapr_check_pagesize(SpaprMachineState *spapr, hwaddr pagesize,
 
 void spapr_set_all_lpcrs(target_ulong value, target_ulong mask);
 hwaddr spapr_get_rtas_addr(void);
+void spapr_reset_patb_entry(SpaprMachineState *spapr);
 #endif /* HW_SPAPR_H */
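
Note for reviewers: the end-of-CAS decision made by the last spapr_hcall.c hunk can be modelled in isolation. The following is a minimal standalone sketch, not QEMU code. The enum, the function name cas_hpt_action() and its plain bool/int parameters are hypothetical stand-ins for guest_radix, (spapr->patb_entry & PATE1_GR), spapr->htab_shift and maxshift as used in the hunk.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; none of these exist in QEMU. */
enum cas_hpt_action {
    CAS_HPT_KEEP,   /* hash guest, existing HPT already covers maxram */
    CAS_HPT_SETUP,  /* hash guest, HPT (and VRMA) must be set up now  */
    CAS_HPT_FREE    /* radix guest, free the HPT and set GR in PATE   */
};

/*
 * Mirrors the branch structure of the final CAS hunk:
 *   guest_radix - guest negotiated radix in this CAS call
 *   pate_gr     - GR bit currently set in PATE, i.e. no HPT exists
 *   htab_shift  - log2 size of the current HPT (assumed 0 when none)
 *   maxshift    - log2 size of an HPT sized for maxram
 */
static enum cas_hpt_action cas_hpt_action(bool guest_radix, bool pate_gr,
                                          int htab_shift, int maxshift)
{
    if (guest_radix) {
        /* hash->radix (or radix->radix): no HPT should survive */
        return CAS_HPT_FREE;
    }
    if (pate_gr || htab_shift < maxshift) {
        /* radix->hash, or an HPT too small for maxram */
        return CAS_HPT_SETUP;
    }
    return CAS_HPT_KEEP;
}
```

In the patch itself, CAS_HPT_SETUP corresponds to spapr_setup_hpt_and_vrma() followed by push_sregs_to_kvm_pr(), and CAS_HPT_FREE to spapr_free_hpt() followed by spapr_reset_patb_entry(). The size actually chosen on the setup path is decided inside spapr_setup_hpt_and_vrma() and is not modelled here.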