From nobody Tue Feb 10 00:22:26 2026
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Dave Hansen, Sean Christopherson,
	Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka,
	Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini,
	Ingo Molnar, Dario Faggioli, Mike Rapoport, David Hildenbrand,
	Mel Gorman, marcelo.cerri@canonical.com, tim.gardner@canonical.com,
	khalid.elmously@canonical.com, philip.cox@canonical.com,
	aarcange@redhat.com, peterx@redhat.com, x86@kernel.org,
	linux-mm@kvack.org, linux-coco@lists.linux.dev,
	linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov", Mike Rapoport
Subject: [PATCHv12 1/9] mm: Add support for unaccepted memory
Date: Fri, 19 May 2023 02:14:26 +0300
Message-Id: <20230518231434.26080-2-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.3
In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>
References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>

UEFI Specification version 2.9 introduces the concept of memory
acceptance. Some Virtual Machine platforms, such as Intel TDX or AMD
SEV-SNP, require memory to be accepted before it can be used by the
guest. Accepting happens via a protocol specific to the Virtual Machine
platform.

There are several ways the kernel can deal with unaccepted memory:

 1. Accept all the memory during boot. It is easy to implement and it
    has no runtime cost once the system is booted. The downside is a
    very long boot time.

    Acceptance can be parallelized across multiple CPUs to keep it
    manageable (i.e. via DEFERRED_STRUCT_PAGE_INIT), but it tends to
    saturate memory bandwidth and does not scale beyond that point.

 2. Accept a block of memory on first use. It requires more
    infrastructure and changes in the page allocator to make it work,
    but it provides good boot time.

    On-demand memory acceptance means latency spikes every time the
    kernel steps onto a new memory block. The spikes go away once the
    workload's data set size stabilizes or all memory has been accepted.

 3. Accept all memory in the background. Introduce a thread (or
    multiple) that gets memory accepted proactively. It minimizes the
    time the system experiences latency spikes on memory allocation
    while keeping boot time low.

    This approach cannot function on its own. It is an extension of #2:
    background memory acceptance requires a functional scheduler, but
    the page allocator may need to tap into unaccepted memory before
    that.

    The downside of the approach is that these threads also steal CPU
    cycles and memory bandwidth from the user's workload and may hurt
    user experience.

The patch implements #1 and #2 for now. #2 is the default. Some
workloads may want to use #1 with accept_memory=eager on the kernel
command line. #3 can be implemented later based on user demand.

Support of unaccepted memory requires a few changes in core-mm code:

  - memblock has to accept memory on allocation;

  - page allocator has to accept memory on the first allocation of the
    page;

The memblock change is trivial.

The page allocator is modified to accept pages. New memory gets accepted
before putting pages on free lists. It is done lazily: only accept new
pages when we run out of already accepted memory. The memory gets
accepted until the high watermark is reached.

EFI code will provide two helpers if the platform supports unaccepted
memory:

 - accept_memory() makes a range of physical addresses accepted.

 - range_contains_unaccepted_memory() checks whether anything within the
   range of physical addresses requires acceptance.

Signed-off-by: Kirill A. Shutemov
Acked-by: Mike Rapoport # memblock
Reviewed-by: Vlastimil Babka
---
 drivers/base/node.c    |   7 ++
 fs/proc/meminfo.c      |   5 ++
 include/linux/mm.h     |  19 +++++
 include/linux/mmzone.h |   8 ++
 mm/memblock.c          |   9 +++
 mm/mm_init.c           |   7 ++
 mm/page_alloc.c        | 173 +++++++++++++++++++++++++++++++++++++++++
 mm/vmstat.c            |   3 +
 8 files changed, 231 insertions(+)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index b46db17124f3..655975946ef6 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -448,6 +448,9 @@ static ssize_t node_read_meminfo(struct device *dev,
 			     "Node %d ShmemPmdMapped: %8lu kB\n"
 			     "Node %d FileHugePages: %8lu kB\n"
 			     "Node %d FilePmdMapped: %8lu kB\n"
+#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+			     "Node %d Unaccepted: %8lu kB\n"
 #endif
 			     ,
 			     nid, K(node_page_state(pgdat, NR_FILE_DIRTY)),
@@ -477,6 +480,10 @@ static ssize_t node_read_meminfo(struct device *dev,
 			     nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED)),
 			     nid, K(node_page_state(pgdat, NR_FILE_THPS)),
 			     nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED))
+#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+			     ,
+			     nid, K(sum_zone_node_page_state(nid, NR_UNACCEPTED))
 #endif
 			    );
 	len += hugetlb_report_node_meminfo(buf, len, nid);
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index b43d0bd42762..8dca4d6d96c7 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -168,6 +168,11 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 		    global_zone_page_state(NR_FREE_CMA_PAGES));
 #endif
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	show_val_kb(m, "Unaccepted: ",
+		    global_zone_page_state(NR_UNACCEPTED));
+#endif
+
 	hugetlb_report_meminfo(m);
 
 	arch_report_meminfo(m);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 27ce77080c79..d9174d464348 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3816,4 +3816,23 @@ madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
 }
 #endif
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+
+bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end);
+void accept_memory(phys_addr_t start, phys_addr_t end);
+
+#else
+
+static inline bool range_contains_unaccepted_memory(phys_addr_t start,
+						    phys_addr_t end)
+{
+	return false;
+}
+
+static inline void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+}
+
+#endif
+
 #endif /* _LINUX_MM_H */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index a4889c9d4055..6c1c2fc13017 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -143,6 +143,9 @@ enum zone_stat_item {
 	NR_ZSPAGES,		/* allocated in zsmalloc */
 #endif
 	NR_FREE_CMA_PAGES,
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	NR_UNACCEPTED,
+#endif
 	NR_VM_ZONE_STAT_ITEMS };
 
 enum node_stat_item {
@@ -910,6 +913,11 @@ struct zone {
 	/* free areas of different sizes */
 	struct free_area	free_area[MAX_ORDER + 1];
 
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	/* Pages to be accepted. All pages on the list are MAX_ORDER */
+	struct list_head	unaccepted_pages;
+#endif
+
 	/* zone flags, see below */
 	unsigned long		flags;
 
diff --git a/mm/memblock.c b/mm/memblock.c
index 3feafea06ab2..50b921119600 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1436,6 +1436,15 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 		 */
 		kmemleak_alloc_phys(found, size, 0);
 
+	/*
+	 * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP,
+	 * require memory to be accepted before it can be used by the
+	 * guest.
+	 *
+	 * Accept the memory of the allocated buffer.
+	 */
+	accept_memory(found, found + size);
+
 	return found;
 }
 
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 7f7f9c677854..1cfc08e25f93 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1375,6 +1375,10 @@ static void __meminit zone_init_free_lists(struct zone *zone)
 		INIT_LIST_HEAD(&zone->free_area[order].free_list[t]);
 		zone->free_area[order].nr_free = 0;
 	}
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	INIT_LIST_HEAD(&zone->unaccepted_pages);
+#endif
 }
 
 void __meminit init_currently_empty_zone(struct zone *zone,
@@ -1960,6 +1964,9 @@ static void __init deferred_free_range(unsigned long pfn,
 		return;
 	}
 
+	/* Accept chunks smaller than MAX_ORDER upfront */
+	accept_memory(PFN_PHYS(pfn), PFN_PHYS(pfn + nr_pages));
+
 	for (i = 0; i < nr_pages; i++, page++, pfn++) {
 		if (pageblock_aligned(pfn))
 			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 47421bedc12b..d239fba3f31c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -387,6 +387,12 @@ EXPORT_SYMBOL(nr_node_ids);
 EXPORT_SYMBOL(nr_online_nodes);
 #endif
 
+static bool page_contains_unaccepted(struct page *page, unsigned int order);
+static void accept_page(struct page *page, unsigned int order);
+static bool try_to_accept_memory(struct zone *zone, unsigned int order);
+static inline bool has_unaccepted_memory(void);
+static bool __free_unaccepted(struct page *page);
+
 int page_group_by_mobility_disabled __read_mostly;
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
@@ -1481,6 +1487,13 @@ void __free_pages_core(struct page *page, unsigned int order)
 
 	atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
 
+	if (page_contains_unaccepted(page, order)) {
+		if (order == MAX_ORDER && __free_unaccepted(page))
+			return;
+
+		accept_page(page, order);
+	}
+
 	/*
 	 * Bypass PCP and place fresh pages right to the tail, primarily
 	 * relevant for memory onlining.
@@ -3159,6 +3172,9 @@ static inline long __zone_watermark_unusable_free(struct zone *z,
 	if (!(alloc_flags & ALLOC_CMA))
 		unusable_free += zone_page_state(z, NR_FREE_CMA_PAGES);
 #endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	unusable_free += zone_page_state(z, NR_UNACCEPTED);
+#endif
 
 	return unusable_free;
 }
@@ -3458,6 +3474,11 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 				       gfp_mask)) {
 			int ret;
 
+			if (has_unaccepted_memory()) {
+				if (try_to_accept_memory(zone, order))
+					goto try_this_zone;
+			}
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 			/*
 			 * Watermark failed for this zone, but see if we can
@@ -3510,6 +3531,11 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 
 			return page;
 		} else {
+			if (has_unaccepted_memory()) {
+				if (try_to_accept_memory(zone, order))
+					goto try_this_zone;
+			}
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 			/* Try again if zone has deferred pages */
 			if (deferred_pages_enabled()) {
@@ -7215,3 +7241,150 @@ bool has_managed_dma(void)
 	return false;
 }
 #endif /* CONFIG_ZONE_DMA */
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+
+/* Counts number of zones with unaccepted pages. */
+static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages);
+
+static bool lazy_accept = true;
+
+static int __init accept_memory_parse(char *p)
+{
+	if (!strcmp(p, "lazy")) {
+		lazy_accept = true;
+		return 0;
+	} else if (!strcmp(p, "eager")) {
+		lazy_accept = false;
+		return 0;
+	} else {
+		return -EINVAL;
+	}
+}
+early_param("accept_memory", accept_memory_parse);
+
+static bool page_contains_unaccepted(struct page *page, unsigned int order)
+{
+	phys_addr_t start = page_to_phys(page);
+	phys_addr_t end = start + (PAGE_SIZE << order);
+
+	return range_contains_unaccepted_memory(start, end);
+}
+
+static void accept_page(struct page *page, unsigned int order)
+{
+	phys_addr_t start = page_to_phys(page);
+
+	accept_memory(start, start + (PAGE_SIZE << order));
+}
+
+static bool try_to_accept_memory_one(struct zone *zone)
+{
+	unsigned long flags;
+	struct page *page;
+	bool last;
+
+	if (list_empty(&zone->unaccepted_pages))
+		return false;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	page = list_first_entry_or_null(&zone->unaccepted_pages,
+					struct page, lru);
+	if (!page) {
+		spin_unlock_irqrestore(&zone->lock, flags);
+		return false;
+	}
+
+	list_del(&page->lru);
+	last = list_empty(&zone->unaccepted_pages);
+
+	__mod_zone_freepage_state(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
+	__mod_zone_page_state(zone, NR_UNACCEPTED, -MAX_ORDER_NR_PAGES);
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	accept_page(page, MAX_ORDER);
+
+	__free_pages_ok(page, MAX_ORDER, FPI_TO_TAIL);
+
+	if (last)
+		static_branch_dec(&zones_with_unaccepted_pages);
+
+	return true;
+}
+
+static bool try_to_accept_memory(struct zone *zone, unsigned int order)
+{
+	long to_accept;
+	int ret = false;
+
+	/* How much to accept to get to high watermark? */
+	to_accept = high_wmark_pages(zone) -
+		    (zone_page_state(zone, NR_FREE_PAGES) -
+		    __zone_watermark_unusable_free(zone, order, 0));
+
+	/* Accept at least one page */
+	do {
+		if (!try_to_accept_memory_one(zone))
+			break;
+		ret = true;
+		to_accept -= MAX_ORDER_NR_PAGES;
+	} while (to_accept > 0);
+
+	return ret;
+}
+
+static inline bool has_unaccepted_memory(void)
+{
+	return static_branch_unlikely(&zones_with_unaccepted_pages);
+}
+
+static bool __free_unaccepted(struct page *page)
+{
+	struct zone *zone = page_zone(page);
+	unsigned long flags;
+	bool first = false;
+
+	if (!lazy_accept)
+		return false;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	first = list_empty(&zone->unaccepted_pages);
+	list_add_tail(&page->lru, &zone->unaccepted_pages);
+	__mod_zone_freepage_state(zone, MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
+	__mod_zone_page_state(zone, NR_UNACCEPTED, MAX_ORDER_NR_PAGES);
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	if (first)
+		static_branch_inc(&zones_with_unaccepted_pages);
+
+	return true;
+}
+
+#else
+
+static bool page_contains_unaccepted(struct page *page, unsigned int order)
+{
+	return false;
+}
+
+static void accept_page(struct page *page, unsigned int order)
+{
+}
+
+static bool try_to_accept_memory(struct zone *zone, unsigned int order)
+{
+	return false;
+}
+
+static inline bool has_unaccepted_memory(void)
+{
+	return false;
+}
+
+static bool __free_unaccepted(struct page *page)
+{
+	BUILD_BUG();
+	return false;
+}
+
+#endif /* CONFIG_UNACCEPTED_MEMORY */
diff --git a/mm/vmstat.c b/mm/vmstat.c
index c28046371b45..282349cabf01 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1180,6 +1180,9 @@ const char * const vmstat_text[] = {
 	"nr_zspages",
 #endif
 	"nr_free_cma",
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	"nr_unaccepted",
+#endif
 
 	/* enum numa_stat_item counters */
 #ifdef CONFIG_NUMA
-- 
2.39.3
From nobody Tue Feb 10 00:22:26 2026
From: Tom Lendacky
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	"Kirill A. Shutemov", "H. Peter Anvin", Michael Roth, Joerg Roedel,
	Dionna Glaze, Andy Lutomirski, Peter Zijlstra
Subject: [PATCH 1/6] x86/sev: Fix calculation of end address based on number of pages
Date: Fri, 19 May 2023 11:24:12 -0500
X-Mailer: git-send-email 2.40.0
In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>
References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>

When calculating an end address based on an unsigned int number of
pages, any value greater than or equal to 0x100000 loses its upper bits
when it is shifted left by PAGE_SHIFT bits (0x100000 itself becomes 0),
resulting in an invalid end address. Change the number of pages variable
in various routines from an unsigned int to an unsigned long to
calculate the end address correctly.
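
The truncation described above can be reproduced in a few lines of
user-space C. This is only an illustrative sketch, not code from the
patch: DEMO_PAGE_SHIFT, end_addr_narrow(), end_addr_wide() and
demo_end_addr() are hypothetical names, and the sketch assumes 4 KiB
pages (a shift of 12) and a 32-bit unsigned int, as on x86-64.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages, i.e. a shift of 12. */
#define DEMO_PAGE_SHIFT 12

/* Page count held in 32 bits: the shift is evaluated in 32 bits. */
static uint64_t end_addr_narrow(uint64_t vaddr, uint32_t npages)
{
	/* Bits shifted above bit 31 are lost before the addition. */
	return vaddr + (npages << DEMO_PAGE_SHIFT);
}

/* Page count held in 64 bits: the shift keeps the upper bits. */
static uint64_t end_addr_wide(uint64_t vaddr, uint64_t npages)
{
	return vaddr + (npages << DEMO_PAGE_SHIFT);
}

/* Hypothetical self-check comparing both variants. */
static void demo_end_addr(void)
{
	const uint64_t vaddr = 0x1000;

	/* 0x100000 pages (4 GiB): the 32-bit shift wraps to 0. */
	assert(end_addr_narrow(vaddr, 0x100000) == vaddr);
	/* 0x100001 pages: only the low 32 bits of the size survive. */
	assert(end_addr_narrow(vaddr, 0x100001) == vaddr + 0x1000);
	/* With a 64-bit page count the full 4 GiB size is added. */
	assert(end_addr_wide(vaddr, 0x100000) == vaddr + 0x100000000ULL);
}
```

With the narrow variant the end address equals the start address for a
4 GiB range, so a loop bounded by it would cover nothing. Widening the
page count before the shift avoids this, which is the effect of the
type change in this patch.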

Fixes: 5e5ccff60a29 ("x86/sev: Add helper for validating pages in early enc attribute changes")
Fixes: dc3f3d2474b8 ("x86/mm: Validate memory when changing the C-bit")
Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/sev.h | 16 ++++++++--------
 arch/x86/kernel/sev.c      | 14 +++++++-------
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 13dc2a9d23c1..7ca5c9ec8b52 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -192,12 +192,12 @@ struct snp_guest_request_ioctl;
 
 void setup_ghcb(void);
 void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
-					 unsigned int npages);
+					 unsigned long npages);
 void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
-					unsigned int npages);
+					unsigned long npages);
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
-void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
-void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
+void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
+void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
 void __init __noreturn snp_abort(void);
@@ -212,12 +212,12 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
 static inline void setup_ghcb(void) { }
 static inline void __init
-early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
 static inline void __init
-early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
 static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
-static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
-static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
+static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
 static inline void snp_set_wakeup_secondary_cpu(void) { }
 static inline bool snp_init(struct boot_params *bp) { return false; }
 static inline void snp_abort(void) { }
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index b031244d6d2d..108bbae59c35 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -645,7 +645,7 @@ static u64 __init get_jump_table_addr(void)
 	return ret;
 }
 
-static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool validate)
+static void pvalidate_pages(unsigned long vaddr, unsigned long npages, bool validate)
 {
 	unsigned long vaddr_end;
 	int rc;
@@ -662,7 +662,7 @@ static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool valid
 	}
 }
 
-static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
+static void __init early_set_pages_state(unsigned long paddr, unsigned long npages, enum psc_op op)
 {
 	unsigned long paddr_end;
 	u64 val;
@@ -701,7 +701,7 @@ static void __init early_set_pages_state(unsigned long paddr, unsigned int npage
 }
 
 void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
-					 unsigned int npages)
+					 unsigned long npages)
 {
 	/*
 	 * This can be invoked in early boot while running identity mapped, so
@@ -723,7 +723,7 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
 }
 
 void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
-					unsigned int npages)
+					unsigned long npages)
 {
 	/*
 	 * This can be invoked in early boot while running identity mapped, so
@@ -879,7 +879,7 @@ static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
 		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
 }
 
-static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
+static void set_pages_state(unsigned long vaddr, unsigned long npages, int op)
 {
 	unsigned long vaddr_end, next_vaddr;
 	struct snp_psc_desc *desc;
@@ -904,7 +904,7 @@ static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
 	kfree(desc);
 }
 
-void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
+void snp_set_memory_shared(unsigned long vaddr, unsigned long npages)
 {
 	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
 		return;
@@ -914,7 +914,7 @@ void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
 	set_pages_state(vaddr, npages, SNP_PAGE_STATE_SHARED);
 }
 
-void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
+void snp_set_memory_private(unsigned long vaddr, unsigned long npages)
 {
 	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
 		return;
-- 
2.40.0
From nobody Tue Feb 10 00:22:26 2026
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Dave Hansen, Sean Christopherson,
	Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka,
	Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini,
	Ingo Molnar, Dario Faggioli, Mike Rapoport, David Hildenbrand,
	Mel Gorman, marcelo.cerri@canonical.com, tim.gardner@canonical.com,
	khalid.elmously@canonical.com, philip.cox@canonical.com,
	aarcange@redhat.com, peterx@redhat.com, x86@kernel.org,
	linux-mm@kvack.org, linux-coco@lists.linux.dev,
	linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov", Borislav Petkov
Subject: [PATCHv12 2/9] efi/x86: Get full memory map in allocate_e820()
Date: Fri, 19 May 2023 02:14:27 +0300
Message-Id: <20230518231434.26080-3-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.3
In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>
References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>

Currently allocate_e820() is only interested in the size of the map and
the size of a memory descriptor to determine how many e820 entries the
kernel needs.

UEFI Specification version 2.9 introduces a new memory type --
unaccepted memory. To track unaccepted memory, the kernel needs to
allocate a bitmap. The size of the bitmap depends on the maximum
physical address present in the system. A full memory map is required
to find the maximum address.

Modify allocate_e820() to get a full memory map.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Borislav Petkov
Acked-by: Ard Biesheuvel
Reviewed-by: Tom Lendacky
---
 drivers/firmware/efi/libstub/x86-stub.c | 26 +++++++++++--------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index a0bfd31358ba..fff81843169c 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -681,28 +681,24 @@ static efi_status_t allocate_e820(struct boot_params *params,
 				  struct setup_data **e820ext,
 				  u32 *e820ext_size)
 {
-	unsigned long map_size, desc_size, map_key;
+	struct efi_boot_memmap *map;
 	efi_status_t status;
-	__u32 nr_desc, desc_version;
+	__u32 nr_desc;
 
-	/* Only need the size of the mem map and size of each mem descriptor */
-	map_size = 0;
-	status = efi_bs_call(get_memory_map, &map_size, NULL, &map_key,
-			     &desc_size, &desc_version);
-	if (status != EFI_BUFFER_TOO_SMALL)
-		return (status != EFI_SUCCESS) ? status : EFI_UNSUPPORTED;
+	status = efi_get_memory_map(&map, false);
+	if (status != EFI_SUCCESS)
+		return status;
 
-	nr_desc = map_size / desc_size + EFI_MMAP_NR_SLACK_SLOTS;
-
-	if (nr_desc > ARRAY_SIZE(params->e820_table)) {
-		u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table);
+	nr_desc = map->map_size / map->desc_size;
+	if (nr_desc > ARRAY_SIZE(params->e820_table) - EFI_MMAP_NR_SLACK_SLOTS) {
+		u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table) +
+				 EFI_MMAP_NR_SLACK_SLOTS;
 
 		status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size);
-		if (status != EFI_SUCCESS)
-			return status;
 	}
 
-	return EFI_SUCCESS;
+	efi_bs_call(free_pool, map);
+	return status;
 }
 
 struct exit_boot_struct {
-- 
2.39.3
From nobody Tue Feb 10 00:22:26 2026
with ESMTP id 55721C77B75 for ; Fri, 19 May 2023 16:24:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230412AbjESQYz (ORCPT ); Fri, 19 May 2023 12:24:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51514 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230417AbjESQYr (ORCPT ); Fri, 19 May 2023 12:24:47 -0400 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2084.outbound.protection.outlook.com [40.107.244.84]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 634E6DF for ; Fri, 19 May 2023 09:24:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=I9J10nNIi727LGM+fSG6SVJNycm0XCsyehjU8UxRETNMWQlVijJiWPVgo81iEzIc5VBdnOY3GeyHBcJqCEeOP1ONJOfrsOQTUfMExCVI8UUiaU2c6KL8oEWHqZgt9dYWGp0tkwwrv6hVsijqIr5KIXwXgIO7f61sN7ajuo4kj4KajXCrYwlKr+E0BTd0YxVToh7xg/78xifT04BPG/1roHW09vEuqaApNZjzS1ZFj9DCSKqAaUukBl4DycjrdnbO1SKTorqNeUR+l5uZ6meuBuTzV9r5pzoSg7lRCH2AlI8pMggUIdqk/FNn5peH5LCXJZwH+HWjArXb14EkgzIXSQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=XMQqH5m4QXLuZBkFFVr61vvEYj4Js0xBdEZitHzZIe0=; b=FAx49FJqz7ybkN6dUucvz3+ZPiYiIXZx4RNZYKWpx2S3ChGmBG9+QaasHr4nHx/HKy52OyWUTM8uW/aM9PE5xC1SdUFs7DiP+dX69q+RvFN4cXwf5FVhN8UMS+3U1OkFweMY11oI4CYbYWDw2Z9T7UKb6Nrf1SAqDJDjYMLDFsi/TJ2w9/fgowWjyveDehLOPi1rm6P68NccL984OKeOidQOxHfD5jjASX3k9XTdn1Wxo137pYxSzZltJYGWSQ4M3wzf4OtqDns+4VZKA6zSJSHWqFl+sTsCkqxcN+FKFwe8AH8NNVHhdyPJJHysGqX+mBXjsyausp4X3wxnuSCDKQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not 
signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=XMQqH5m4QXLuZBkFFVr61vvEYj4Js0xBdEZitHzZIe0=; b=ozAjbzRBOOjvs8xKXsYKKTLZpqwg8lY7kyChsO4HXy+6QiXYwU96YRzPkd8P6gU3CIKUSuSdybJlp88q7VbXx8xsxYM7rudZa6RX8MgWLwNTDxW25Bz724VwT9uF77j1DLSEW/ObmgU0e/eFEM3DGLyb8x0xxl8MD3gNXLKF9bw= Received: from MW2PR16CA0007.namprd16.prod.outlook.com (2603:10b6:907::20) by DM6PR12MB4044.namprd12.prod.outlook.com (2603:10b6:5:21d::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6411.21; Fri, 19 May 2023 16:24:41 +0000 Received: from CO1NAM11FT020.eop-nam11.prod.protection.outlook.com (2603:10b6:907:0:cafe::79) by MW2PR16CA0007.outlook.office365.com (2603:10b6:907::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6411.21 via Frontend Transport; Fri, 19 May 2023 16:24:41 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1NAM11FT020.mail.protection.outlook.com (10.13.174.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6411.21 via Frontend Transport; Fri, 19 May 2023 16:24:40 +0000 Received: from tlendack-t1.amdoffice.net (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Fri, 19 May 2023 11:24:38 -0500 From: Tom Lendacky To: , CC: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "Kirill A. 
Shutemov" , "H. Peter Anvin" , Michael Roth , Joerg Roedel , Dionna Glaze , Andy Lutomirski , Peter Zijlstra Subject: [PATCH 2/6] x86/sev: Put PSC struct on the stack in prep for unaccepted memory support Date: Fri, 19 May 2023 11:24:13 -0500 Message-ID: <12e07432b4f4db949ef7ca5b93c0b2adb5b80a1b.1684513457.git.thomas.lendacky@amd.com> X-Mailer: git-send-email 2.40.0 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1NAM11FT020:EE_|DM6PR12MB4044:EE_ X-MS-Office365-Filtering-Correlation-Id: c36180e3-56fa-4c65-57a6-08db58858c51 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 0AFpn7HstYonnYPhzfPL7FrsB/qE+fFcM0uSP1RAC6CwVP6I9DuLVzNomjoY5EY32yeiWga3ScaHn36+URdpMjVeeIlcOSzZnqymBIGSdje/Z0GDG04MxYUt9EpTaciTfP0RZsbXJGQ5CEf804iVI/qdh4vP6InxOov/FwEaXa8t6D+XOlwYp8N1lNfBLwOwH6YUhIjOYFYmXdlmsPcSZ8evfz2iiA1HokAHKx42CMaeaafm/MjbKTvtBTG3TiqL2qrD6WmJF0oAJIwGEFqrM0DUW58porZYix/+Oyj8256bF2JxZg0IGa3SPJnKB8d7xTFgvZLdksVJB+brA8WFe4Q2XmWVgVXmpBMJUcxomEo2CiYISbWoyCZX+zSsgY1FBjFqhNzjruTh0yA6lOE2dtiYxE63SjGKBaiNGayyiFsPGFTfNHIYI7dwm5hNbqQmxl2TvW2O/ptH9v2SaEF8x+uu4LEQqdvQN7+Tn/PehH3k4WXH6307783SEVkeFmnTdMZR1m4IcSLLsidmvGQAwVB3Eir/tgx/E5Vq6tkIRll1V6D1xKDNbiXP/emCVbHXaWgnJeKiH/zh6zl4yp3zWPiYYvtvcNYHcj7JR9pR5oKZS00xj1YsNQlYkHzrVXcmkl3a120S0nVgi+R8Q8hiCwdtSGDpRur24Nrmu7+TDxpDShpDsUTYPJdsViDUdtmaNf+tgHT1H4ENeW8lPrVoPvBMmVilXG+rAu5EiUHNq6YkRetV73ia8+Uuy7BVGUyl X-Forefront-Antispam-Report: 
CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230028)(4636009)(396003)(346002)(136003)(39860400002)(376002)(451199021)(46966006)(40470700004)(36840700001)(36860700001)(83380400001)(81166007)(82740400003)(356005)(40460700003)(36756003)(8936002)(5660300002)(8676002)(7416002)(41300700001)(82310400005)(40480700001)(86362001)(4326008)(316002)(70206006)(70586007)(426003)(966005)(47076005)(16526019)(110136005)(26005)(186003)(336012)(2906002)(6666004)(478600001)(2616005)(54906003)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 19 May 2023 16:24:40.9686 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: c36180e3-56fa-4c65-57a6-08db58858c51 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT020.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR12MB4044 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" In advance of providing support for unaccepted memory, switch from using kmalloc() for allocating the Page State Change (PSC) structure to using a local variable that lives on the stack. This is needed to avoid a possible recursive call into set_pages_state() if the kmalloc() call requires (more) memory to be accepted, which would result in a hang. The current size of the PSC struct is 2,032 bytes. To make the struct more stack friendly, reduce the number of PSC entries from 253 down to 64, resulting in a size of 520 bytes. 
This is a nice compromise on struct size and total PSC requests while still allowing parallel PSC operations across vCPUs. If the reduction in PSC entries results in any kind of performance issue (that is not seen at the moment), use of a larger static PSC struct, with fallback to the smaller stack version, can be investigated. For more background info on this decision, see the subthread in the Link: tag below. Signed-off-by: Tom Lendacky Link: https://lore.kernel.org/lkml/658c455c40e8950cb046dd885dd19dc1c52d060a= .1659103274.git.thomas.lendacky@amd.com --- arch/x86/include/asm/sev-common.h | 9 +++++++-- arch/x86/kernel/sev.c | 10 ++-------- 2 files changed, 9 insertions(+), 10 deletions(-) diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-c= ommon.h index 0759af9b1acf..b463fcbd4b90 100644 --- a/arch/x86/include/asm/sev-common.h +++ b/arch/x86/include/asm/sev-common.h @@ -106,8 +106,13 @@ enum psc_op { #define GHCB_HV_FT_SNP BIT_ULL(0) #define GHCB_HV_FT_SNP_AP_CREATION BIT_ULL(1) =20 -/* SNP Page State Change NAE event */ -#define VMGEXIT_PSC_MAX_ENTRY 253 +/* + * SNP Page State Change NAE event + * The VMGEXIT_PSC_MAX_ENTRY determines the size of the PSC structure, w= hich + * is a local stack variable in set_pages_state(). Do not increase this = value + * without evaluating the impact to stack usage. 
+ */ +#define VMGEXIT_PSC_MAX_ENTRY 64 =20 struct psc_hdr { u16 cur_entry; diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index 108bbae59c35..7b0144acd7bf 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -882,11 +882,7 @@ static void __set_pages_state(struct snp_psc_desc *dat= a, unsigned long vaddr, static void set_pages_state(unsigned long vaddr, unsigned long npages, int= op) { unsigned long vaddr_end, next_vaddr; - struct snp_psc_desc *desc; - - desc =3D kmalloc(sizeof(*desc), GFP_KERNEL_ACCOUNT); - if (!desc) - panic("SNP: failed to allocate memory for PSC descriptor\n"); + struct snp_psc_desc desc; =20 vaddr =3D vaddr & PAGE_MASK; vaddr_end =3D vaddr + (npages << PAGE_SHIFT); @@ -896,12 +892,10 @@ static void set_pages_state(unsigned long vaddr, unsi= gned long npages, int op) next_vaddr =3D min_t(unsigned long, vaddr_end, (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr); =20 - __set_pages_state(desc, vaddr, next_vaddr, op); + __set_pages_state(&desc, vaddr, next_vaddr, op); =20 vaddr =3D next_vaddr; } - - kfree(desc); } =20 void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) --=20 2.40.0 From nobody Tue Feb 10 00:22:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22CF1C7EE29 for ; Thu, 18 May 2023 23:15:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230116AbjERXPB (ORCPT ); Thu, 18 May 2023 19:15:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33798 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230226AbjERXOx (ORCPT ); Thu, 18 May 2023 19:14:53 -0400 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 069E9E49; Thu, 18 May 2023 16:14:50 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1684451691; x=1715987691; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=n9SDPh3Ob0T0sTasYeFOdu7CKW9d+5L1TvbhmirjRWA=; b=IBro/QScJDoCK/D+1DrA5X952KQVzbXIAePVcJud7mxeRvvm+1FBXUrl TcPgDEOoWW7Lc9qvpox5+xx6vrvQp2zpLKgCGK7GKBZFHGRivA0TxqZc4 jKKx2UJgpv41HA/WT0ze4UYMxxKTdJdQBNzy1xh6TNET5dxtWrHE9pCeN BliGB9mGMHGZ/g6wRvYMZSDVMxJIyFi+KtprGFNJOt2ran6P4rQvXtqQu 7qvzFXefqRUBUYvSiHYt96dysLLBSwtj7F+RlWz5ajr7HrXtdyhhiAWXi Jk3ynCT2nffhORLi2CNM/tISKl8imV5j6YYy+aIrVc+6k0UtPMtjik9JM A==; X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="380431692" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="380431692" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:14:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="846669537" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="846669537" Received: from rkiyama-mobl1.amr.corp.intel.com (HELO box.shutemov.name) ([10.251.222.16]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:14:42 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 344EA10DFCE; Fri, 19 May 2023 02:14:40 +0300 (+03) From: "Kirill A. 
Shutemov" To: Borislav Petkov , Andy Lutomirski , Dave Hansen , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv12 3/9] efi/libstub: Implement support for unaccepted memory Date: Fri, 19 May 2023 02:14:28 +0300 Message-Id: <20230518231434.26080-4-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" UEFI Specification version 2.9 introduces the concept of memory acceptance: some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, require memory to be accepted before it can be used by the guest. Accepting happens via a protocol specific to the Virtual Machine platform. Accepting memory is costly, and it makes the VMM allocate memory for the accepted guest physical address range. It's better to postpone memory acceptance until the memory is needed. This lowers boot time and reduces memory overhead. The kernel needs to know what memory has been accepted. Firmware communicates this information via the memory map: a new memory type -- EFI_UNACCEPTED_MEMORY -- indicates such memory.
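As a rough illustration of the paragraph above, the sketch below scans a memory map for descriptors of the unaccepted type and reports the lowest start and highest end address they cover. It is a minimal sketch, not the stub's code: `struct mem_desc` and `find_unaccepted_span()` are hypothetical simplifications of the real descriptor handling, and only the type value 15 is taken from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UEFI memory type values; 15 is the one this patch adds to efi.h. */
#define EFI_CONVENTIONAL_MEMORY  7
#define EFI_UNACCEPTED_MEMORY    15
#define EFI_PAGE_SIZE            4096ULL

/* Hypothetical, simplified stand-in for an EFI memory descriptor. */
struct mem_desc {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t num_pages;
};

/*
 * Find the lowest start and highest end of all unaccepted descriptors.
 * Returns 1 and fills *start and *end if any unaccepted memory exists,
 * otherwise returns 0 and leaves both untouched.
 */
static int find_unaccepted_span(const struct mem_desc *map, size_t nr,
				uint64_t *start, uint64_t *end)
{
	uint64_t lo = UINT64_MAX, hi = 0;
	size_t i;

	assert(map != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		uint64_t s, e;

		/* Only the unaccepted type matters for tracking. */
		if (map[i].type != EFI_UNACCEPTED_MEMORY)
			continue;

		s = map[i].phys_addr;
		e = s + map[i].num_pages * EFI_PAGE_SIZE;
		if (s < lo)
			lo = s;
		if (e > hi)
			hi = e;
	}

	/* No unaccepted descriptor was seen. */
	if (lo == UINT64_MAX)
		return 0;

	*start = lo;
	*end = hi;
	return 1;
}
```

A caller would pass the descriptor array obtained from firmware. A return value of 0 means nothing needs tracking, so no extra bookkeeping has to be allocated.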
Range-based tracking works fine for firmware, but it gets bulky for the kernel: e820 (or whatever the arch uses) has to be modified on every page acceptance. This fragments the table, and the e820 table has a limited number of entries. Another option is to mark such memory as usable in e820 and track whether each range has been accepted in a bitmap. One bit in the bitmap represents a naturally aligned power-of-2-sized region of address space, called a unit. For x86, the unit size is 2MiB: 4k of the bitmap is enough to track 64GiB of physical address space. In the worst-case scenario -- a huge hole in the middle of the address space -- it needs 256MiB to handle 4PiB of the address space. Any unaccepted memory that is not aligned to unit_size gets accepted upfront. The bitmap is allocated and constructed in the EFI stub and passed down to the kernel via an EFI configuration table. allocate_e820() allocates the bitmap if unaccepted memory is present, sized according to the unaccepted region. Signed-off-by: Kirill A.
Shutemov Reviewed-by: Ard Biesheuvel Reviewed-by: Tom Lendacky --- arch/x86/boot/compressed/Makefile | 1 + arch/x86/boot/compressed/mem.c | 9 + arch/x86/include/asm/efi.h | 2 + drivers/firmware/efi/Kconfig | 14 ++ drivers/firmware/efi/efi.c | 1 + drivers/firmware/efi/libstub/Makefile | 2 + drivers/firmware/efi/libstub/bitmap.c | 41 +++ drivers/firmware/efi/libstub/efistub.h | 6 + drivers/firmware/efi/libstub/find.c | 43 ++++ .../firmware/efi/libstub/unaccepted_memory.c | 234 ++++++++++++++++++ drivers/firmware/efi/libstub/x86-stub.c | 13 + include/linux/efi.h | 12 +- 12 files changed, 377 insertions(+), 1 deletion(-) create mode 100644 arch/x86/boot/compressed/mem.c create mode 100644 drivers/firmware/efi/libstub/bitmap.c create mode 100644 drivers/firmware/efi/libstub/find.c create mode 100644 drivers/firmware/efi/libstub/unaccepted_memory.c diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/M= akefile index 6b6cfe607bdb..cc4978123c30 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -107,6 +107,7 @@ endif =20 vmlinux-objs-$(CONFIG_ACPI) +=3D $(obj)/acpi.o vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) +=3D $(obj)/tdx.o $(obj)/tdcall.o +vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) +=3D $(obj)/mem.o =20 vmlinux-objs-$(CONFIG_EFI) +=3D $(obj)/efi.o vmlinux-objs-$(CONFIG_EFI_MIXED) +=3D $(obj)/efi_mixed.o diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c new file mode 100644 index 000000000000..67594fcb11d9 --- /dev/null +++ b/arch/x86/boot/compressed/mem.c @@ -0,0 +1,9 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "error.h" + +void arch_accept_memory(phys_addr_t start, phys_addr_t end) +{ + /* Platform-specific memory-acceptance call goes here */ + error("Cannot accept memory"); +} diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h index 419280d263d2..8b4be7cecdb8 100644 --- a/arch/x86/include/asm/efi.h +++ b/arch/x86/include/asm/efi.h @@ -31,6 +31,8 @@ extern 
unsigned long efi_mixed_mode_stack_pa; =20 #define ARCH_EFI_IRQ_FLAGS_MASK X86_EFLAGS_IF =20 +#define EFI_UNACCEPTED_UNIT_SIZE PMD_SIZE + /* * The EFI services are called through variadic functions in many cases. T= hese * functions are implemented in assembler and support only a fixed number = of diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig index 043ca31c114e..231f1c70d1db 100644 --- a/drivers/firmware/efi/Kconfig +++ b/drivers/firmware/efi/Kconfig @@ -269,6 +269,20 @@ config EFI_COCO_SECRET virt/coco/efi_secret module to access the secrets, which in turn allows userspace programs to access the injected secrets. =20 +config UNACCEPTED_MEMORY + bool + depends on EFI_STUB + help + Some Virtual Machine platforms, such as Intel TDX, require + some memory to be "accepted" by the guest before it can be used. + This mechanism helps prevent malicious hosts from making changes + to guest memory. + + UEFI specification v2.9 introduced EFI_UNACCEPTED_MEMORY memory type. + + This option adds support for unaccepted memory and makes such memory + usable by the kernel. 
+ config EFI_EMBEDDED_FIRMWARE bool select CRYPTO_LIB_SHA256 diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index abeff7dc0b58..7dce06e419c5 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -843,6 +843,7 @@ static __initdata char memory_type_name[][13] =3D { "MMIO Port", "PAL Code", "Persistent", + "Unaccepted", }; =20 char * __init efi_md_typeattr_format(char *buf, size_t size, diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/l= ibstub/Makefile index 3abb2b357482..16d64a34d1e1 100644 --- a/drivers/firmware/efi/libstub/Makefile +++ b/drivers/firmware/efi/libstub/Makefile @@ -96,6 +96,8 @@ CFLAGS_arm32-stub.o :=3D -DTEXT_OFFSET=3D$(TEXT_OFFSET) zboot-obj-$(CONFIG_RISCV) :=3D lib-clz_ctz.o lib-ashldi3.o lib-$(CONFIG_EFI_ZBOOT) +=3D zboot.o $(zboot-obj-y) =20 +lib-$(CONFIG_UNACCEPTED_MEMORY) +=3D unaccepted_memory.o bitmap.o find.o + extra-y :=3D $(lib-y) lib-y :=3D $(patsubst %.o,%.stub.o,$(lib-y)) =20 diff --git a/drivers/firmware/efi/libstub/bitmap.c b/drivers/firmware/efi/l= ibstub/bitmap.c new file mode 100644 index 000000000000..5c9bba0d549b --- /dev/null +++ b/drivers/firmware/efi/libstub/bitmap.c @@ -0,0 +1,41 @@ +#include + +void __bitmap_set(unsigned long *map, unsigned int start, int len) +{ + unsigned long *p =3D map + BIT_WORD(start); + const unsigned int size =3D start + len; + int bits_to_set =3D BITS_PER_LONG - (start % BITS_PER_LONG); + unsigned long mask_to_set =3D BITMAP_FIRST_WORD_MASK(start); + + while (len - bits_to_set >=3D 0) { + *p |=3D mask_to_set; + len -=3D bits_to_set; + bits_to_set =3D BITS_PER_LONG; + mask_to_set =3D ~0UL; + p++; + } + if (len) { + mask_to_set &=3D BITMAP_LAST_WORD_MASK(size); + *p |=3D mask_to_set; + } +} + +void __bitmap_clear(unsigned long *map, unsigned int start, int len) +{ + unsigned long *p =3D map + BIT_WORD(start); + const unsigned int size =3D start + len; + int bits_to_clear =3D BITS_PER_LONG - (start % BITS_PER_LONG); + unsigned long 
mask_to_clear =3D BITMAP_FIRST_WORD_MASK(start); + + while (len - bits_to_clear >=3D 0) { + *p &=3D ~mask_to_clear; + len -=3D bits_to_clear; + bits_to_clear =3D BITS_PER_LONG; + mask_to_clear =3D ~0UL; + p++; + } + if (len) { + mask_to_clear &=3D BITMAP_LAST_WORD_MASK(size); + *p &=3D ~mask_to_clear; + } +} diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/= libstub/efistub.h index 67d5a20802e0..8659a01664b8 100644 --- a/drivers/firmware/efi/libstub/efistub.h +++ b/drivers/firmware/efi/libstub/efistub.h @@ -1133,4 +1133,10 @@ const u8 *__efi_get_smbios_string(const struct efi_s= mbios_record *record, void efi_remap_image(unsigned long image_base, unsigned alloc_size, unsigned long code_size); =20 +efi_status_t allocate_unaccepted_bitmap(__u32 nr_desc, + struct efi_boot_memmap *map); +void process_unaccepted_memory(u64 start, u64 end); +void accept_memory(phys_addr_t start, phys_addr_t end); +void arch_accept_memory(phys_addr_t start, phys_addr_t end); + #endif diff --git a/drivers/firmware/efi/libstub/find.c b/drivers/firmware/efi/lib= stub/find.c new file mode 100644 index 000000000000..4e7740d28987 --- /dev/null +++ b/drivers/firmware/efi/libstub/find.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include + +/* + * Common helper for find_next_bit() function family + * @FETCH: The expression that fetches and pre-processes each word of bitm= ap(s) + * @MUNGE: The expression that post-processes a word containing found bit = (may be empty) + * @size: The bitmap size in bits + * @start: The bitnumber to start searching at + */ +#define FIND_NEXT_BIT(FETCH, MUNGE, size, start) \ +({ \ + unsigned long mask, idx, tmp, sz =3D (size), __start =3D (start); \ + \ + if (unlikely(__start >=3D sz)) \ + goto out; \ + \ + mask =3D MUNGE(BITMAP_FIRST_WORD_MASK(__start)); \ + idx =3D __start / BITS_PER_LONG; \ + \ + for (tmp =3D (FETCH) & mask; !tmp; tmp =3D (FETCH)) { \ + if ((idx + 1) * BITS_PER_LONG >=3D sz) \ + goto 
out; \ + idx++; \ + } \ + \ + sz =3D min(idx * BITS_PER_LONG + __ffs(MUNGE(tmp)), sz); \ +out: \ + sz; \ +}) + +unsigned long _find_next_bit(const unsigned long *addr, unsigned long nbit= s, unsigned long start) +{ + return FIND_NEXT_BIT(addr[idx], /* nop */, nbits, start); +} + +unsigned long _find_next_zero_bit(const unsigned long *addr, unsigned long= nbits, + unsigned long start) +{ + return FIND_NEXT_BIT(~addr[idx], /* nop */, nbits, start); +} diff --git a/drivers/firmware/efi/libstub/unaccepted_memory.c b/drivers/fir= mware/efi/libstub/unaccepted_memory.c new file mode 100644 index 000000000000..fd6a3195c68f --- /dev/null +++ b/drivers/firmware/efi/libstub/unaccepted_memory.c @@ -0,0 +1,234 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include +#include "efistub.h" + +static struct efi_unaccepted_memory *unaccepted_table; + +/* + * Decompressor needs to initialize the variable to cover cases when the t= able + * is not allocated by EFI stub or EFI stub copied the kernel image with + * efi_relocate_kernel() before the variable is set. + * + * It must be call before the first usage of accept_memory() by decompress= or. 
+ */ +void set_unaccepted_table(struct efi_unaccepted_memory *table) +{ + unaccepted_table =3D table; +} + +efi_status_t allocate_unaccepted_bitmap(__u32 nr_desc, + struct efi_boot_memmap *map) +{ + efi_guid_t unaccepted_table_guid =3D LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID; + u64 unaccepted_start =3D ULLONG_MAX, unaccepted_end =3D 0, bitmap_size; + efi_status_t status; + int i; + + /* Check if the table is already installed */ + unaccepted_table =3D get_efi_config_table(unaccepted_table_guid); + if (unaccepted_table) { + if (unaccepted_table->version !=3D 1) { + efi_err("Unknown version of unaccepted memory table\n"); + return EFI_UNSUPPORTED; + } + return EFI_SUCCESS; + } + + /* Check if there's any unaccepted memory and find the max address */ + for (i =3D 0; i < nr_desc; i++) { + efi_memory_desc_t *d; + unsigned long m =3D (unsigned long)map->map; + + d =3D efi_early_memdesc_ptr(m, map->desc_size, i); + if (d->type !=3D EFI_UNACCEPTED_MEMORY) + continue; + + unaccepted_start =3D min(unaccepted_start, d->phys_addr); + unaccepted_end =3D max(unaccepted_end, + d->phys_addr + d->num_pages * PAGE_SIZE); + } + + if (unaccepted_start =3D=3D ULLONG_MAX) + return EFI_SUCCESS; + + unaccepted_start =3D round_down(unaccepted_start, + EFI_UNACCEPTED_UNIT_SIZE); + unaccepted_end =3D round_up(unaccepted_end, EFI_UNACCEPTED_UNIT_SIZE); + + /* + * If unaccepted memory is present, allocate a bitmap to track what + * memory has to be accepted before access. + * + * One bit in the bitmap represents 2MiB in the address space: + * A 4k bitmap can track 64GiB of physical address space. + * + * In the worst case scenario -- a huge hole in the middle of the + * address space -- It needs 256MiB to handle 4PiB of the address + * space. + * + * The bitmap will be populated in setup_e820() according to the memory + * map after efi_exit_boot_services(). 
+ */ + bitmap_size =3D DIV_ROUND_UP(unaccepted_end - unaccepted_start, + EFI_UNACCEPTED_UNIT_SIZE * BITS_PER_BYTE); + + status =3D efi_bs_call(allocate_pool, EFI_LOADER_DATA, + sizeof(*unaccepted_table) + bitmap_size, + (void **)&unaccepted_table); + if (status !=3D EFI_SUCCESS) { + efi_err("Failed to allocate unaccepted memory config table\n"); + return status; + } + + unaccepted_table->version =3D 1; + unaccepted_table->unit_size =3D EFI_UNACCEPTED_UNIT_SIZE; + unaccepted_table->phys_base =3D unaccepted_start; + unaccepted_table->size =3D bitmap_size; + memset(unaccepted_table->bitmap, 0, bitmap_size); + + status =3D efi_bs_call(install_configuration_table, + &unaccepted_table_guid, unaccepted_table); + if (status !=3D EFI_SUCCESS) { + efi_bs_call(free_pool, unaccepted_table); + efi_err("Failed to install unaccepted memory config table!\n"); + } + + return status; +} + +/* + * The accepted memory bitmap only works at unit_size granularity. Take + * unaligned start/end addresses and either: + * 1. Accepts the memory immediately and in its entirety + * 2. Accepts unaligned parts, and marks *some* aligned part unaccepted + * + * The function will never reach the bitmap_set() with zero bits to set. + */ +void process_unaccepted_memory(u64 start, u64 end) +{ + u64 unit_size =3D unaccepted_table->unit_size; + u64 unit_mask =3D unaccepted_table->unit_size - 1; + u64 bitmap_size =3D unaccepted_table->size; + + /* + * Ensure that at least one bit will be set in the bitmap by + * immediately accepting all regions under 2*unit_size. This is + * imprecise and may immediately accept some areas that could + * have been represented in the bitmap. But, results in simpler + * code below + * + * Consider case like this (assuming unit_size =3D=3D 2MB): + * + * | 4k | 2044k | 2048k | + * ^ 0x0 ^ 2MB ^ 4MB + * + * Only the first 4k has been accepted. The 0MB->2MB region can not be + * represented in the bitmap. The 2MB->4MB region can be represented in + * the bitmap. 
But, the 0MB->4MB region is <2*unit_size and will be + * immediately accepted in its entirety. + */ + if (end - start < 2 * unit_size) { + arch_accept_memory(start, end); + return; + } + + /* + * No matter how the start and end are aligned, at least one unaccepted + * unit_size area will remain to be marked in the bitmap. + */ + + /* Immediately accept a <unit_size piece at the beginning: */ + if (start & unit_mask) { + arch_accept_memory(start, round_up(start, unit_size)); + start =3D round_up(start, unit_size); + } + + /* Immediately accept a <unit_size piece at the end: */ + if (end & unit_mask) { + arch_accept_memory(round_down(end, unit_size), end); + end =3D round_down(end, unit_size); + } + + /* + * Accept the part of the range that lies before phys_base and cannot + * be recorded in the bitmap. + */ + if (start < unaccepted_table->phys_base) { + arch_accept_memory(start, + min(unaccepted_table->phys_base, end)); + start =3D unaccepted_table->phys_base; + } + + /* Nothing to record */ + if (end < unaccepted_table->phys_base) + return; + + /* Translate to offsets from the beginning of the bitmap */ + start -=3D unaccepted_table->phys_base; + end -=3D unaccepted_table->phys_base; + + /* Accept memory that doesn't fit into bitmap */ + if (end > bitmap_size * unit_size * BITS_PER_BYTE) { + unsigned long phys_start, phys_end; + + phys_start =3D bitmap_size * unit_size * BITS_PER_BYTE + + unaccepted_table->phys_base; + phys_end =3D end + unaccepted_table->phys_base; + + arch_accept_memory(phys_start, phys_end); + end =3D bitmap_size * unit_size * BITS_PER_BYTE; + } + + /* + * 'start' and 'end' are now both unit_size-aligned. + * Record the range as being unaccepted: + */ + bitmap_set(unaccepted_table->bitmap, + start / unit_size, (end - start) / unit_size); +} + +void accept_memory(phys_addr_t start, phys_addr_t end) +{ + unsigned long range_start, range_end; + unsigned long bitmap_size; + u64 unit_size; + + if (!unaccepted_table) + return; + + unit_size =3D unaccepted_table->unit_size; + + /* + * Only care for the part of the range that is represented + * in the bitmap.
+ */ + if (start < unaccepted_table->phys_base) + start =3D unaccepted_table->phys_base; + if (end < unaccepted_table->phys_base) + return; + + /* Translate to offsets from the beginning of the bitmap */ + start -=3D unaccepted_table->phys_base; + end -=3D unaccepted_table->phys_base; + + /* Make sure not to overrun the bitmap */ + if (end > unaccepted_table->size * unit_size * BITS_PER_BYTE) + end =3D unaccepted_table->size * unit_size * BITS_PER_BYTE; + + range_start =3D start / unit_size; + bitmap_size =3D DIV_ROUND_UP(end, unit_size); + + for_each_set_bitrange_from(range_start, range_end, + unaccepted_table->bitmap, bitmap_size) { + unsigned long phys_start, phys_end; + + phys_start =3D range_start * unit_size + unaccepted_table->phys_base; + phys_end =3D range_end * unit_size + unaccepted_table->phys_base; + + arch_accept_memory(phys_start, phys_end); + bitmap_clear(unaccepted_table->bitmap, + range_start, range_end - range_start); + } +} diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi= /libstub/x86-stub.c index fff81843169c..8d17cee8b98e 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -613,6 +613,16 @@ setup_e820(struct boot_params *params, struct setup_da= ta *e820ext, u32 e820ext_s e820_type =3D E820_TYPE_PMEM; break; =20 + case EFI_UNACCEPTED_MEMORY: + if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY)) { + efi_warn_once( +"The system has unaccepted memory, but kernel does not support it\nConsid= er enabling CONFIG_UNACCEPTED_MEMORY\n"); + continue; + } + e820_type =3D E820_TYPE_RAM; + process_unaccepted_memory(d->phys_addr, + d->phys_addr + PAGE_SIZE * d->num_pages); + break; default: continue; } @@ -697,6 +707,9 @@ static efi_status_t allocate_e820(struct boot_params *p= arams, status =3D alloc_e820ext(nr_e820ext, e820ext, e820ext_size); } =20 + if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY) && status =3D=3D EFI_SUCCESS) + status =3D allocate_unaccepted_bitmap(nr_desc, map); + 
efi_bs_call(free_pool, map); return status; } diff --git a/include/linux/efi.h b/include/linux/efi.h index 7aa62c92185f..29cc622910da 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -108,7 +108,8 @@ typedef struct { #define EFI_MEMORY_MAPPED_IO_PORT_SPACE 12 #define EFI_PAL_CODE 13 #define EFI_PERSISTENT_MEMORY 14 -#define EFI_MAX_MEMORY_TYPE 15 +#define EFI_UNACCEPTED_MEMORY 15 +#define EFI_MAX_MEMORY_TYPE 16 =20 /* Attribute values: */ #define EFI_MEMORY_UC ((u64)0x0000000000000001ULL) /* uncached */ @@ -417,6 +418,7 @@ void efi_native_runtime_setup(void); #define LINUX_EFI_MOK_VARIABLE_TABLE_GUID EFI_GUID(0xc451ed2b, 0x9694, 0x4= 5d3, 0xba, 0xba, 0xed, 0x9f, 0x89, 0x88, 0xa3, 0x89) #define LINUX_EFI_COCO_SECRET_AREA_GUID EFI_GUID(0xadf956ad, 0xe98c, 0x48= 4c, 0xae, 0x11, 0xb5, 0x1c, 0x7d, 0x33, 0x64, 0x47) #define LINUX_EFI_BOOT_MEMMAP_GUID EFI_GUID(0x800f683f, 0xd08b, 0x423a, = 0xa2, 0x93, 0x96, 0x5c, 0x3c, 0x6f, 0xe2, 0xb4) +#define LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID EFI_GUID(0xd5d1de3c, 0x105c, 0= x44f9, 0x9e, 0xa9, 0xbc, 0xef, 0x98, 0x12, 0x00, 0x31) =20 #define RISCV_EFI_BOOT_PROTOCOL_GUID EFI_GUID(0xccd15fec, 0x6f73, 0x4eec,= 0x83, 0x95, 0x3e, 0x69, 0xe4, 0xb9, 0x40, 0xbf) =20 @@ -534,6 +536,14 @@ struct efi_boot_memmap { efi_memory_desc_t map[]; }; =20 +struct efi_unaccepted_memory { + u32 version; + u32 unit_size; + u64 phys_base; + u64 size; + unsigned long bitmap[]; +}; + /* * Architecture independent structure for describing a memory map for the * benefit of efi_memmap_init_early(), and for passing context between --=20 2.39.3 From nobody Tue Feb 10 00:22:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC764C77B7A for ; Fri, 19 May 2023 16:25:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229451AbjESQZL 
(ORCPT ); Fri, 19 May 2023 12:25:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51642 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230426AbjESQY4 (ORCPT ); Fri, 19 May 2023 12:24:56 -0400 Received: from NAM02-SN1-obe.outbound.protection.outlook.com (mail-sn1nam02on2082.outbound.protection.outlook.com [40.107.96.82]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 666FAC1 for ; Fri, 19 May 2023 09:24:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Sb14FCvh8duijrLk6DyLDeLI976c9Dr+7vPp3N7AMXeGocPCnuxuzZ1AjcZYV2i7XxWisSNa4lRBIlrDdOfQJPKstp3BzCr8FQXsHprEY7hDVKdCcqzZU/NSuFNHTrZo1xM5rvgvCiye3kNBibgMtjmJFi0y/sKwRzGvx6Bzr4bNpyd+/BsRVoK0ggDWnEb5R0gG2ffx7NtlJieCHcoTMosyULZEOlcby1eqONSN9Y+LxfSGiHceCBKKcpUTphSa9ZzX7NpDGzDnHfJ5vmucMIapRYLUoaWWcJrS2bvzbpEswjX47GJ3Gxg232mn2kJGb/2aVKyW7mq9n+TPVfJWLw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=adTuFZ8iQMiM83q2NY8/acoI9AbzAM/9fd4rMUYoVsk=; b=em1acvEUOOtwDdP/JrdfKcF1vn1l5G/1sXa499m1kcXxKhe2uA/UwWSWOwjLjm7ugljO7hLF2HoE8AsRj7qXAFB0+bQ71bijHKCtf5Vs/Jc7YybjZIm/CsQ0aeL062QNYfgzflfnZvlnYr//uMfuCYXVwnhuDkBTB1TUuqquQyBV6hiAucwAWtIDyWwWSzg7E5rbqrKxGm4QWA6888X95Aux9HBzRR5kXxXmL4MyxSEej/uRdHfSeziJmx4y07b7UA8WNFmB6qbZsgjGCXki3yzlcgOrhm8+8j4srbS1jAYGak4/ClI2cJ2wfH48s+JSAa/vuvt31i1c1KSUXwqlOQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=adTuFZ8iQMiM83q2NY8/acoI9AbzAM/9fd4rMUYoVsk=; b=COXpEhbZB8PgQs2QSXWTSdK2je5LsvK9VH/o8OlfdFEHDB21SEf6rbxO44JgYnPu1eFOuruYtYoJIAl6vlViKJxhnk+XTft0rTFosQ+iviiTBYgmRMBpOX6n11qu2B79Q113xcgekk0Y428NQftO9re7H9wUL0Yj7q+ws2+qCjQ= Received: from MW4PR04CA0037.namprd04.prod.outlook.com (2603:10b6:303:6a::12) by CYYPR12MB8656.namprd12.prod.outlook.com (2603:10b6:930:c3::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6411.19; Fri, 19 May 2023 16:24:50 +0000 Received: from CO1NAM11FT082.eop-nam11.prod.protection.outlook.com (2603:10b6:303:6a:cafe::de) by MW4PR04CA0037.outlook.office365.com (2603:10b6:303:6a::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6411.19 via Frontend Transport; Fri, 19 May 2023 16:24:50 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1NAM11FT082.mail.protection.outlook.com (10.13.175.224) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6411.22 via Frontend Transport; Fri, 19 May 2023 16:24:50 +0000 Received: from tlendack-t1.amdoffice.net (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Fri, 19 May 2023 11:24:48 -0500 From: Tom Lendacky To: , CC: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "Kirill A. Shutemov" , "H. 
Peter Anvin" , Michael Roth , Joerg Roedel , Dionna Glaze , Andy Lutomirski , Peter Zijlstra Subject: [PATCH 3/6] x86/sev: Allow for use of the early boot GHCB for PSC requests Date: Fri, 19 May 2023 11:24:14 -0500 Message-ID: <49e0e596d71de92150f5d27b773d738eaf7dd68b.1684513457.git.thomas.lendacky@amd.com> X-Mailer: git-send-email 2.40.0 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1NAM11FT082:EE_|CYYPR12MB8656:EE_ X-MS-Office365-Filtering-Correlation-Id: c9b157a7-8088-4237-21b6-08db588591eb X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: lmTzUGgB70iKmQ9jfRtHovD61UPCrVQL1s3yKWpkJy6agqzc7OfOS9UotXltCvftMTTvv2R9oQmVzbkN2xKwFYXzLuwOCJvHtu0DPcQS04cQD/6TpZSNyiEKuuZTs6L99axrqUqXIsFdcJUUPfIcOOnEZHZYHU0k89TDnpJNMBmDwC6/2BRBp66/goALUiyx0arPXM+WhTZ4MtI1hH0mCkDvtnmtTxUkOiEv8puzuRploswD32EG8H9RYBYD4SXqWI8hMD0KEE/+oKMdZpVLgU0/5jGC+tYHwuxO3RvgnhcV0uTnwdecn+skNTTPgG17uIcIk29t3lmWB3lsrb9njlUh2jTiJFe+VtwjMBvNM5QXiGy09Ut+wwQWpcWDQ8taulZt/xXyn0wwUB+vY1UL6701FYuC76weSOOSx1U978mwjbJKUkp2PNKgS7z55bH5khgZzZJX5mypBX5pAeTDPZHv6exuIuS51UmF/DhB/tjhNcgyfy7Lt6IfwyZ7bb2hMohg9bgW+83+xUuYOjWH3BFIuHCb6wVaIyY/cDqq46rae9u0BQ1LPc+pIz+Ch0duP2tnWJFnNVxXyo+AR++mQ8/rM/fHk2j1rMucexVJn856Fyh4jNLwEZp3fEY13CXSShCsz+9Q2/a/uUOrDXMENNw4qTKKOEuQV5cj9qiAhQfNALZoAijd/3X4AztyxUNiYYspVR4gktj6ZtKlLYZhWlDpfA5c4WYFx3EZKzklbQAogESWpr+gCMHLzzJwBDU+hbaw15bfkNdAXUBvenkOUw== X-Forefront-Antispam-Report: 
CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230028)(4636009)(396003)(39860400002)(136003)(376002)(346002)(451199021)(36840700001)(46966006)(40470700004)(316002)(2906002)(478600001)(41300700001)(4326008)(8676002)(8936002)(54906003)(110136005)(7416002)(6666004)(70206006)(70586007)(5660300002)(26005)(40460700003)(186003)(82740400003)(16526019)(356005)(40480700001)(47076005)(2616005)(83380400001)(36860700001)(36756003)(82310400005)(86362001)(336012)(426003)(81166007)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 19 May 2023 16:24:50.3651 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: c9b157a7-8088-4237-21b6-08db588591eb X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT082.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CYYPR12MB8656 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Using a GHCB for a page state change (as opposed to the MSR protocol) allows for multiple pages to be processed in a single request. In preparation for early PSC requests in support of unaccepted memory, update the invocation of vmgexit_psc() to be able to use the early boot GHCB and not just the per-CPU GHCB structure. In order to use the proper GHCB (early boot vs per-CPU), set a flag that indicates when the per-CPU GHCBs are available and registered. For APs, the per-CPU GHCBs are created before they are started and registered upon startup, so this flag can be used globally for the BSP and APs instead of creating a per-CPU flag. 
This will allow for a significant reduction in the number of MSR protocol page state change requests when accepting memory. Signed-off-by: Tom Lendacky --- arch/x86/kernel/sev.c | 61 +++++++++++++++++++++++++++---------------- 1 file changed, 38 insertions(+), 23 deletions(-) diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index 7b0144acd7bf..973756c89dac 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -119,7 +119,19 @@ static DEFINE_PER_CPU(struct sev_es_save_area *, sev_v= msa); =20 struct sev_config { __u64 debug : 1, - __reserved : 63; + + /* + * A flag used by __set_pages_state() that indicates when the + * per-CPU GHCB has been created and registered and thus can be + * used by the BSP instead of the early boot GHCB. + * + * For APs, the per-CPU GHCB is created before they are started + * and registered upon startup, so this flag can be used globally + * for the BSP and APs. + */ + ghcbs_initialized : 1, + + __reserved : 62; }; =20 static struct sev_config sev_cfg __read_mostly; @@ -662,7 +674,7 @@ static void pvalidate_pages(unsigned long vaddr, unsign= ed long npages, bool vali } } =20 -static void __init early_set_pages_state(unsigned long paddr, unsigned lon= g npages, enum psc_op op) +static void early_set_pages_state(unsigned long paddr, unsigned long npage= s, enum psc_op op) { unsigned long paddr_end; u64 val; @@ -756,26 +768,13 @@ void __init snp_prep_memory(unsigned long paddr, unsi= gned int sz, enum psc_op op WARN(1, "invalid memory op %d\n", op); } =20 -static int vmgexit_psc(struct snp_psc_desc *desc) +static int vmgexit_psc(struct ghcb *ghcb, struct snp_psc_desc *desc) { int cur_entry, end_entry, ret =3D 0; struct snp_psc_desc *data; - struct ghcb_state state; struct es_em_ctxt ctxt; - unsigned long flags; - struct ghcb *ghcb; =20 - /* - * __sev_get_ghcb() needs to run with IRQs disabled because it is using - * a per-CPU GHCB. 
- */ - local_irq_save(flags); - - ghcb =3D __sev_get_ghcb(&state); - if (!ghcb) { - ret =3D 1; - goto out_unlock; - } + vc_ghcb_invalidate(ghcb); =20 /* Copy the input desc into GHCB shared buffer */ data =3D (struct snp_psc_desc *)ghcb->shared_buffer; @@ -832,20 +831,18 @@ static int vmgexit_psc(struct snp_psc_desc *desc) } =20 out: - __sev_put_ghcb(&state); - -out_unlock: - local_irq_restore(flags); - return ret; } =20 static void __set_pages_state(struct snp_psc_desc *data, unsigned long vad= dr, unsigned long vaddr_end, int op) { + struct ghcb_state state; struct psc_hdr *hdr; struct psc_entry *e; + unsigned long flags; unsigned long pfn; + struct ghcb *ghcb; int i; =20 hdr =3D &data->hdr; @@ -875,8 +872,20 @@ static void __set_pages_state(struct snp_psc_desc *dat= a, unsigned long vaddr, i++; } =20 - if (vmgexit_psc(data)) + local_irq_save(flags); + + if (sev_cfg.ghcbs_initialized) + ghcb =3D __sev_get_ghcb(&state); + else + ghcb =3D boot_ghcb; + + if (!ghcb || vmgexit_psc(ghcb, data)) sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); + + if (sev_cfg.ghcbs_initialized) + __sev_put_ghcb(&state); + + local_irq_restore(flags); } =20 static void set_pages_state(unsigned long vaddr, unsigned long npages, int= op) @@ -884,6 +893,10 @@ static void set_pages_state(unsigned long vaddr, unsig= ned long npages, int op) unsigned long vaddr_end, next_vaddr; struct snp_psc_desc desc; =20 + /* Use the MSR protocol when a GHCB is not available. 
*/ + if (!boot_ghcb) + return early_set_pages_state(__pa(vaddr), npages, op); + vaddr =3D vaddr & PAGE_MASK; vaddr_end =3D vaddr + (npages << PAGE_SHIFT); =20 @@ -1261,6 +1274,8 @@ void setup_ghcb(void) if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) snp_register_per_cpu_ghcb(); =20 + sev_cfg.ghcbs_initialized =3D true; + return; } =20 --=20 2.40.0 From nobody Tue Feb 10 00:22:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3585CC7EE2E for ; Thu, 18 May 2023 23:15:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230357AbjERXPD (ORCPT ); Thu, 18 May 2023 19:15:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33816 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230373AbjERXOz (ORCPT ); Thu, 18 May 2023 19:14:55 -0400 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7C92E6E; Thu, 18 May 2023 16:14:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1684451693; x=1715987693; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BuLaVtTf5RKQWpr3anFnaNnaMQK4KaotupS5o/iFyhQ=; b=IO5kue1Hkr9uJpm4HUiGHAZ9t4tixBmYNGVPHeIULKnrQaUD+2LN8WFm aclgC0/bgik8BMDc/R3aGAF19C57jJi1ZWUpnW2PuEjUruIDc56/IPykU 8keoZ5fz+WNvJceZ02tX1A8Blt+1ADBcBoeg8FU3u+U76hL+c1YFO5hQl SJjDGSXELXigapmhxnaRSw+x7i6dBVpjogU1ZFnM39Qcd1apfe3bdrf8b FVstrMe/meJxIkXMzz8oVxBdSB3ChYNRCwblr0JXyX41sCwBok9j6VnGx lI1EOlizZLyXk9/9Y6xYpCFp8exx+lV8dug//jFt0t4zY9sscC5MUodk8 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="380431748" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="380431748" Received: from fmsmga001.fm.intel.com 
([10.253.24.23]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:14:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="846669541" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="846669541" Received: from rkiyama-mobl1.amr.corp.intel.com (HELO box.shutemov.name) ([10.251.222.16]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:14:42 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 3F46A10DFCF; Fri, 19 May 2023 02:14:40 +0300 (+03) From: "Kirill A. Shutemov" To: Borislav Petkov , Andy Lutomirski , Dave Hansen , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" , Liam Merwick Subject: [PATCHv12 4/9] x86/boot/compressed: Handle unaccepted memory Date: Fri, 19 May 2023 02:14:29 +0300 Message-Id: <20230518231434.26080-5-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The firmware will pre-accept the memory used to run the stub. But, the stub is responsible for accepting the memory into which it decompresses the main kernel. Accept memory just before decompression starts. 
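How a physical range such as the decompression target maps onto the unaccepted-memory bitmap can be sketched in standalone C. This is an illustration under stated assumptions, not the stub's code: struct toy_unaccepted, toy_accept_range() and the fixed 64-unit bitmap are hypothetical stand-ins for the EFI table, accept_memory() and the variable-size bitmap, and counting units stands in for the platform-specific arch_accept_memory() call.

```c
#include <stdint.h>

/* Hypothetical toy table: one bit per unit_size bytes, bit set = unaccepted. */
struct toy_unaccepted {
	uint64_t phys_base;	/* physical address covered by bit 0 */
	uint64_t unit_size;	/* bytes represented by one bit */
	uint64_t bitmap;	/* 64 units only, to keep the sketch short */
};

/*
 * Accept every still-unaccepted unit overlapping [start, end) and clear its
 * bit. Returns the number of units accepted; real code would invoke the
 * platform-specific accept routine instead of counting.
 */
static unsigned int toy_accept_range(struct toy_unaccepted *t,
				     uint64_t start, uint64_t end)
{
	uint64_t limit = t->phys_base + 64 * t->unit_size;
	unsigned int accepted = 0;
	uint64_t first, last, i;

	/* Only the part of the range represented in the bitmap matters. */
	if (start < t->phys_base)
		start = t->phys_base;
	if (end > limit)
		end = limit;
	if (start >= end)
		return 0;

	first = (start - t->phys_base) / t->unit_size;
	/* Round the end up so a partially covered unit is accepted whole. */
	last = (end - t->phys_base + t->unit_size - 1) / t->unit_size;

	for (i = first; i < last; i++) {
		if (t->bitmap & (1ULL << i)) {
			t->bitmap &= ~(1ULL << i);
			accepted++;
		}
	}

	return accepted;
}
```

Calling the helper twice on the same range accepts nothing the second time, because the bits were already cleared; that is the property that makes accepting just before decompression safe to repeat.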
The stub is also responsible for choosing a physical address in which to place the decompressed kernel image. The KASLR mechanism will randomize this physical address. Since the accepted memory region is relatively small, KASLR would be quite ineffective if it only used the pre-accepted area (EFI_CONVENTIONAL_MEMORY). Ensure that KASLR randomizes among the entire physical address space by also including EFI_UNACCEPTED_MEMORY. Signed-off-by: Kirill A. Shutemov Reviewed-by: Liam Merwick Reviewed-by: Tom Lendacky Reviewed-by: Ard Biesheuvel --- arch/x86/boot/compressed/efi.h | 10 +++++++++ arch/x86/boot/compressed/kaslr.c | 35 +++++++++++++++++++++----------- arch/x86/boot/compressed/mem.c | 23 +++++++++++++++++++++ arch/x86/boot/compressed/misc.c | 7 +++++++ arch/x86/boot/compressed/misc.h | 6 ++++++ 5 files changed, 69 insertions(+), 12 deletions(-) diff --git a/arch/x86/boot/compressed/efi.h b/arch/x86/boot/compressed/efi.h index 7db2f41b54cd..866c0af8b5b9 100644 --- a/arch/x86/boot/compressed/efi.h +++ b/arch/x86/boot/compressed/efi.h @@ -16,6 +16,7 @@ typedef guid_t efi_guid_t __aligned(__alignof__(u32)); #define ACPI_TABLE_GUID EFI_GUID(0xeb9d2d30, 0x2d88, 0x11d3, 0x9a, 0x1= 6, 0x00, 0x90, 0x27, 0x3f, 0xc1, 0x4d) #define ACPI_20_TABLE_GUID EFI_GUID(0x8868e871, 0xe4f1, 0x11d3, 0xbc, 0= x22, 0x00, 0x80, 0xc7, 0x3c, 0x88, 0x81) #define EFI_CC_BLOB_GUID EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54= , 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42) +#define LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID EFI_GUID(0xd5d1de3c, 0x105c, 0= x44f9, 0x9e, 0xa9, 0xbc, 0xef, 0x98, 0x12, 0x00, 0x31) =20 #define EFI32_LOADER_SIGNATURE "EL32" #define EFI64_LOADER_SIGNATURE "EL64" @@ -32,6 +33,7 @@ typedef struct { } efi_table_hdr_t; =20 #define EFI_CONVENTIONAL_MEMORY 7 +#define EFI_UNACCEPTED_MEMORY 15 =20 #define EFI_MEMORY_MORE_RELIABLE \ ((u64)0x0000000000010000ULL) /* higher reliability */ @@ -104,6 +106,14 @@ struct efi_setup_data { u64 reserved[8]; }; =20 +struct efi_unaccepted_memory { + u32 
version; + u32 unit_size; + u64 phys_base; + u64 size; + unsigned long bitmap[]; +}; + static inline int efi_guidcmp (efi_guid_t left, efi_guid_t right) { return memcmp(&left, &right, sizeof (efi_guid_t)); diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/ka= slr.c index 454757fbdfe5..749f0fe7e446 100644 --- a/arch/x86/boot/compressed/kaslr.c +++ b/arch/x86/boot/compressed/kaslr.c @@ -672,6 +672,28 @@ static bool process_mem_region(struct mem_vector *regi= on, } =20 #ifdef CONFIG_EFI + +/* + * Only EFI_CONVENTIONAL_MEMORY and EFI_UNACCEPTED_MEMORY (if supported) a= re + * guaranteed to be free. + * + * It is more conservative in picking free memory than the EFI spec allows: + * + * According to the spec, EFI_BOOT_SERVICES_{CODE|DATA} are also free memo= ry + * and thus available to place the kernel image into, but in practice ther= e's + * firmware where using that memory leads to crashes. + */ +static inline bool memory_type_is_free(efi_memory_desc_t *md) +{ + if (md->type =3D=3D EFI_CONVENTIONAL_MEMORY) + return true; + + if (md->type =3D=3D EFI_UNACCEPTED_MEMORY) + return IS_ENABLED(CONFIG_UNACCEPTED_MEMORY); + + return false; +} + /* * Returns true if we processed the EFI memmap, which we prefer over the E= 820 * table if it is available. @@ -716,18 +738,7 @@ process_efi_entries(unsigned long minimum, unsigned lo= ng image_size) for (i =3D 0; i < nr_desc; i++) { md =3D efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i); =20 - /* - * Here we are more conservative in picking free memory than - * the EFI spec allows: - * - * According to the spec, EFI_BOOT_SERVICES_{CODE|DATA} are also - * free memory and thus available to place the kernel image into, - * but in practice there's firmware where using that memory leads - * to crashes. - * - * Only EFI_CONVENTIONAL_MEMORY is guaranteed to be free. 
- */ - if (md->type !=3D EFI_CONVENTIONAL_MEMORY) + if (!memory_type_is_free(md)) continue; =20 if (efi_soft_reserve_enabled() && diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c index 67594fcb11d9..87372b96d613 100644 --- a/arch/x86/boot/compressed/mem.c +++ b/arch/x86/boot/compressed/mem.c @@ -1,9 +1,32 @@ // SPDX-License-Identifier: GPL-2.0-only =20 #include "error.h" +#include "misc.h" =20 void arch_accept_memory(phys_addr_t start, phys_addr_t end) { /* Platform-specific memory-acceptance call goes here */ error("Cannot accept memory"); } + +void init_unaccepted_memory(void) +{ + guid_t guid =3D LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID; + struct efi_unaccepted_memory *unaccepted_table; + unsigned long cfg_table_pa; + unsigned int cfg_table_len; + int ret; + + ret =3D efi_get_conf_table(boot_params, &cfg_table_pa, &cfg_table_len); + if (ret) + error("EFI config table not found."); + + unaccepted_table =3D (void *)efi_find_vendor_table(boot_params, + cfg_table_pa, + cfg_table_len, + guid); + if (unaccepted_table->version !=3D 1) + error("Unknown version of unaccepted memory table\n"); + + set_unaccepted_table(unaccepted_table); +} diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/mis= c.c index 014ff222bf4b..36535a3753f5 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -455,6 +455,13 @@ asmlinkage __visible void *extract_kernel(void *rmode,= memptr heap, #endif =20 debug_putstr("\nDecompressing Linux... "); + + if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY)) { + debug_putstr("Accepting memory... 
"); + init_unaccepted_memory(); + accept_memory(__pa(output), __pa(output) + needed_size); + } + __decompress(input_data, input_len, NULL, NULL, output, output_len, NULL, error); entry_offset =3D parse_elf(output); diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/mis= c.h index 2f155a0e3041..e1a0b49e0ed2 100644 --- a/arch/x86/boot/compressed/misc.h +++ b/arch/x86/boot/compressed/misc.h @@ -247,4 +247,10 @@ static inline unsigned long efi_find_vendor_table(stru= ct boot_params *bp, } #endif /* CONFIG_EFI */ =20 +void init_unaccepted_memory(void); + +/* Implemented in EFI stub */ +void set_unaccepted_table(struct efi_unaccepted_memory *table); +void accept_memory(phys_addr_t start, phys_addr_t end); + #endif /* BOOT_COMPRESSED_MISC_H */ --=20 2.39.3 From nobody Tue Feb 10 00:22:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28C4BC77B7A for ; Fri, 19 May 2023 16:25:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230467AbjESQZP (ORCPT ); Fri, 19 May 2023 12:25:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51828 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230094AbjESQZJ (ORCPT ); Fri, 19 May 2023 12:25:09 -0400 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2047.outbound.protection.outlook.com [40.107.244.47]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 24043E54 for ; Fri, 19 May 2023 09:25:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=ZcVM+mOc44rMOeLEHRGoYx9rdvOdQa/smUNa/4E9KV+IJeIY39eu+PyC/mkGSzb/t4PzwQNd6SzL3Itv+Jj3zSusgETTSdLQZsufH2yQVu//6nKU68CzpOPsSEA1Lqs4s0oe588ajfdnhxRAYhoPEe3Av6bq1sj7R0FMK1pk/J5IBKinfDZ2WuBrZCRmrWSiBIbpvdm2XtxYaABYAF2nLj9xLk1cuPGLANBxgp3iAgB/D1bUEpMpoGbl0XvcSwQHhtJ9Ga91IW1cgRRNInZMubx72RLJU3mwLj7eaH5fxfkU40GPa5dCahWPXHbyxMFktGs6yHrqsjgBXVD6DbAFCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=blqlGSo1u7qyUOO+G4REHTh7X/rBN0iZH9kWG9Z8W9U=; b=CYn6hdCKgUPxxm6rVGrsc9nSBpb/i4FOyEzLd0fdpxjWHGMIYsAYOA+ifsl5LqjZzTP+ZG2zNGNz4jnecGTxEcliZGnWiciKAifIGZy9kG8mP+BWEJcygDJm7hUiyzg1CvpV538F5QsxosGzqtdajc3wvQAhntnR1E8IL2nlZLCBF9wdjDkLx+PJzLRe//34xfVTPETOKIWUfRMg88AnHA6VJQhkzKAhLKKlZsVEtfw6seOYipTOvkc/WhHWB+vP/WYey3aDaysOc+WALESDUeXjp7l2Y7yDkuZce+kwXAWhzMEtg/cDFZXKK1azBg3Z9QI5lpv7XuP34P/fy1L7Wg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=blqlGSo1u7qyUOO+G4REHTh7X/rBN0iZH9kWG9Z8W9U=; b=c0bm19etJOSc5E7mwdVRCCTQVjMv+TpQTcIFeWuTEzlsxmtoVRkjkR40gBn4jBPMyxBrmncW9kro33/UcpjbrJffq5WP+hkt/fJ74OSTHG9vZy1GZIQPf2n3tY8IhmoeEjReGneOa7cU4lbFtM9tabgCoBF0qbj9o++tEcS3Gx8= Received: from MW4PR04CA0203.namprd04.prod.outlook.com (2603:10b6:303:86::28) by CH3PR12MB7548.namprd12.prod.outlook.com (2603:10b6:610:144::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6411.19; Fri, 19 May 2023 16:25:00 +0000 Received: from 
CO1NAM11FT058.eop-nam11.prod.protection.outlook.com (2603:10b6:303:86:cafe::4f) by MW4PR04CA0203.outlook.office365.com (2603:10b6:303:86::28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6411.21 via Frontend Transport; Fri, 19 May 2023 16:24:59 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1NAM11FT058.mail.protection.outlook.com (10.13.174.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6411.21 via Frontend Transport; Fri, 19 May 2023 16:24:59 +0000 Received: from tlendack-t1.amdoffice.net (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Fri, 19 May 2023 11:24:57 -0500 From: Tom Lendacky To: , CC: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "Kirill A. Shutemov" , "H. 
Peter Anvin" , Michael Roth , Joerg Roedel , Dionna Glaze , Andy Lutomirski , Peter Zijlstra Subject: [PATCH 4/6] x86/sev: Use large PSC requests if applicable Date: Fri, 19 May 2023 11:24:15 -0500 Message-ID: X-Mailer: git-send-email 2.40.0 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1NAM11FT058:EE_|CH3PR12MB7548:EE_ X-MS-Office365-Filtering-Correlation-Id: 12ecbcf4-cbc7-4b8a-3b97-08db58859763 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: szENTl/GmerTTlxyoxr3onBiHsWk6iFPMAqGdcnPlDu+37xbteXNru7RkWb9Ga5khWri4aSnH49EwvhAvhWwIYyCdO6gx1KBPXI+XsOzaZehbGOC5X/fgWxRT+ioFppUiVZUbo/lkJ4uFbK6Pjgg8g8oOd7Lgxg1fd+P0w1V1+gCAgK008b14LyRrmi5wHjBzvwafRv/GBqAbwsm4Ab69jBjiu/rEDoPhuP+oPkLsv1tX3QS7WwzGpi5ZpFbq59MqxyrASuvOI+kDgh1Zl8D9G7fXtw6A6KXYtyk4DCEOyuM6JH6Wsz09/2t5U9L5HdE/gmMaGr4E8c53+HUvFzTD1hMeVDxwY4vLnAQyWeqOlAuqyBYFWyl1fuNrZsePwBduDVu/vEdqVt54Ul87KlLM/i8qxUOS9yVBTyF90kk8OerO/WTHPjFO9Xx6hI6BGohKsoL+zu2/0cH5V+bck6//e2TU/51yDUIYqm9YouGidfEokYywz+E+YUeFphTWzf7xKubMoe+f1hAEDonPfzSyd8Sci7/qTL8gJRiCZo+NM1O1P/nUIjNcbr28qumcLEoa/FyO8QVSSN/nTT1cPxhUrcwZc0NJwrs7j3UpPfPzWSYNS+iLgdCiMzzl9DoRx65KI6DC8hHRWe+dq88hsWRXNKYyWs+9l5KC6hBV988BaJhTCwpt7VfqlI/h+c+jKp/5pBBt26GhkBc8TU5JwcCF+S5WeRLZHdkcAVG0aBUvyJEYBiyJcdHjOQUpyAPofyTNERIfkGfvWoaZtJTW/XSxw== X-Forefront-Antispam-Report: 
CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230028)(4636009)(39860400002)(136003)(376002)(396003)(346002)(451199021)(36840700001)(46966006)(40470700004)(70206006)(478600001)(70586007)(5660300002)(2906002)(8936002)(316002)(8676002)(54906003)(110136005)(4326008)(41300700001)(6666004)(7416002)(26005)(40460700003)(36860700001)(356005)(16526019)(186003)(82740400003)(336012)(47076005)(40480700001)(426003)(36756003)(2616005)(83380400001)(86362001)(82310400005)(81166007)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 19 May 2023 16:24:59.5408 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 12ecbcf4-cbc7-4b8a-3b97-08db58859763 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT058.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH3PR12MB7548 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" In advance of providing support for unaccepted memory, request 2M Page State Change (PSC) requests when the address range allows for it. By using a 2M page size, more PSC operations can be handled in a single request to the hypervisor. The hypervisor will determine if it can accommodate the larger request by checking the mapping in the nested page table. If mapped as a large page, then the 2M page request can be performed, otherwise the 2M page request will be broken down into 512 4K page requests. This is still more efficient than having the guest perform multiple PSC requests in order to process the 512 4K pages. 
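The saving in PSC entries can be illustrated with a small standalone C sketch. It is not part of this patch: count_psc_entries() and count_4k_entries() are hypothetical helpers, and the rule "use a 2M entry when the address is 2M-aligned and at least 2M remains" is an assumption made for the illustration.

```c
#include <stdint.h>

#define TOY_PAGE_SIZE_4K	4096ULL
#define TOY_PAGE_SIZE_2M	(2ULL * 1024 * 1024)

/*
 * Hypothetical helper: count the PSC entries needed to cover [start, end),
 * using a 2M entry whenever the current address is 2M-aligned and at least
 * 2M remains, and a 4K entry otherwise. Both bounds are assumed to be
 * 4K-aligned.
 */
static uint64_t count_psc_entries(uint64_t start, uint64_t end)
{
	uint64_t entries = 0;

	while (start < end) {
		if (!(start & (TOY_PAGE_SIZE_2M - 1)) &&
		    end - start >= TOY_PAGE_SIZE_2M)
			start += TOY_PAGE_SIZE_2M;
		else
			start += TOY_PAGE_SIZE_4K;
		entries++;
	}

	return entries;
}

/* Baseline: entries needed when only 4K entries are used. */
static uint64_t count_4k_entries(uint64_t start, uint64_t end)
{
	return (end - start) / TOY_PAGE_SIZE_4K;
}
```

For an aligned 2M range the sketch needs one entry where the 4K-only baseline needs 512. A range that starts one 4K page past a 2M boundary gets no benefit until it reaches the next boundary.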
In conjunction with the 2M PSC requests, attempt to perform the associated PVALIDATE instruction for the page using the 2M page size. If PVALIDATE fails with a size mismatch, then fall back to validating 512 4K pages. To do this, page validation is modified to work with the PSC structure and not just a virtual address range. Signed-off-by: Tom Lendacky --- arch/x86/include/asm/sev.h | 4 ++ arch/x86/kernel/sev.c | 125 ++++++++++++++++++++++++------------- 2 files changed, 84 insertions(+), 45 deletions(-) diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h index 7ca5c9ec8b52..e21e1c5397c1 100644 --- a/arch/x86/include/asm/sev.h +++ b/arch/x86/include/asm/sev.h @@ -80,11 +80,15 @@ extern void vc_no_ghcb(void); extern void vc_boot_ghcb(void); extern bool handle_vc_boot_ghcb(struct pt_regs *regs); =20 +/* PVALIDATE return codes */ +#define PVALIDATE_FAIL_SIZEMISMATCH 6 + /* Software defined (when rFlags.CF =3D 1) */ #define PVALIDATE_FAIL_NOUPDATE 255 =20 /* RMP page size */ #define RMP_PG_SIZE_4K 0 +#define RMP_PG_SIZE_2M 1 =20 #define RMPADJUST_VMSA_PAGE_BIT BIT(16) =20 diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index 973756c89dac..8802a75e1c20 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -657,32 +657,58 @@ static u64 __init get_jump_table_addr(void) return ret; } =20 -static void pvalidate_pages(unsigned long vaddr, unsigned long npages, boo= l validate) +static void pvalidate_pages(struct snp_psc_desc *desc) { - unsigned long vaddr_end; + struct psc_entry *e; + unsigned long vaddr; + unsigned int size; + unsigned int i; + bool validate; int rc; =20 - vaddr =3D vaddr & PAGE_MASK; - vaddr_end =3D vaddr + (npages << PAGE_SHIFT); + for (i =3D 0; i <=3D desc->hdr.end_entry; i++) { + e =3D &desc->entries[i]; + + vaddr =3D (unsigned long)pfn_to_kaddr(e->gfn); + size =3D e->pagesize ? RMP_PG_SIZE_2M : RMP_PG_SIZE_4K; + validate =3D (e->operation =3D=3D SNP_PAGE_STATE_PRIVATE) ? 
true : false; + + rc =3D pvalidate(vaddr, size, validate); + if (rc =3D=3D PVALIDATE_FAIL_SIZEMISMATCH && size =3D=3D RMP_PG_SIZE_2M)= { + unsigned long vaddr_end =3D vaddr + PMD_SIZE; + + for (; vaddr < vaddr_end; vaddr +=3D PAGE_SIZE) { + rc =3D pvalidate(vaddr, RMP_PG_SIZE_4K, validate); + if (rc) + break; + } + } =20 - while (vaddr < vaddr_end) { - rc =3D pvalidate(vaddr, RMP_PG_SIZE_4K, validate); if (WARN(rc, "Failed to validate address 0x%lx ret %d", vaddr, rc)) sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE); - - vaddr =3D vaddr + PAGE_SIZE; } } =20 -static void early_set_pages_state(unsigned long paddr, unsigned long npage= s, enum psc_op op) +static void early_set_pages_state(unsigned long vaddr, unsigned long paddr, + unsigned long npages, enum psc_op op) { unsigned long paddr_end; u64 val; + int ret; + + vaddr =3D vaddr & PAGE_MASK; =20 paddr =3D paddr & PAGE_MASK; paddr_end =3D paddr + (npages << PAGE_SHIFT); =20 while (paddr < paddr_end) { + if (op =3D=3D SNP_PAGE_STATE_SHARED) { + /* Page validation must be rescinded before changing to shared */ + ret =3D pvalidate(vaddr, RMP_PG_SIZE_4K, false); + if (WARN(ret, "Failed to validate address 0x%lx ret %d", paddr, ret)) + goto e_term; + } + /* * Use the MSR protocol because this function can be called before * the GHCB is established. 
@@ -703,7 +729,15 @@ static void early_set_pages_state(unsigned long paddr,= unsigned long npages, enu paddr, GHCB_MSR_PSC_RESP_VAL(val))) goto e_term; =20 - paddr =3D paddr + PAGE_SIZE; + if (op =3D=3D SNP_PAGE_STATE_PRIVATE) { + /* Page validation must be performed after changing to private */ + ret =3D pvalidate(vaddr, RMP_PG_SIZE_4K, true); + if (WARN(ret, "Failed to validate address 0x%lx ret %d", paddr, ret)) + goto e_term; + } + + vaddr +=3D PAGE_SIZE; + paddr +=3D PAGE_SIZE; } =20 return; @@ -728,10 +762,7 @@ void __init early_snp_set_memory_private(unsigned long= vaddr, unsigned long padd * Ask the hypervisor to mark the memory pages as private in the RMP * table. */ - early_set_pages_state(paddr, npages, SNP_PAGE_STATE_PRIVATE); - - /* Validate the memory pages after they've been added in the RMP table. */ - pvalidate_pages(vaddr, npages, true); + early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_PRIVATE); } =20 void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long= paddr, @@ -746,11 +777,8 @@ void __init early_snp_set_memory_shared(unsigned long = vaddr, unsigned long paddr if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED)) return; =20 - /* Invalidate the memory pages before they are marked shared in the RMP t= able. */ - pvalidate_pages(vaddr, npages, false); - /* Ask hypervisor to mark the memory pages shared in the RMP table. 
*/ - early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED); + early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED); } =20 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc= _op op) @@ -834,10 +862,11 @@ static int vmgexit_psc(struct ghcb *ghcb, struct snp_= psc_desc *desc) return ret; } =20 -static void __set_pages_state(struct snp_psc_desc *data, unsigned long vad= dr, - unsigned long vaddr_end, int op) +static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned= long vaddr, + unsigned long vaddr_end, int op) { struct ghcb_state state; + bool use_large_entry; struct psc_hdr *hdr; struct psc_entry *e; unsigned long flags; @@ -851,27 +880,37 @@ static void __set_pages_state(struct snp_psc_desc *da= ta, unsigned long vaddr, memset(data, 0, sizeof(*data)); i =3D 0; =20 - while (vaddr < vaddr_end) { - if (is_vmalloc_addr((void *)vaddr)) + while (vaddr < vaddr_end && i < ARRAY_SIZE(data->entries)) { + hdr->end_entry =3D i; + + if (is_vmalloc_addr((void *)vaddr)) { pfn =3D vmalloc_to_pfn((void *)vaddr); - else + use_large_entry =3D false; + } else { pfn =3D __pa(vaddr) >> PAGE_SHIFT; + use_large_entry =3D true; + } =20 e->gfn =3D pfn; e->operation =3D op; - hdr->end_entry =3D i; =20 - /* - * Current SNP implementation doesn't keep track of the RMP page - * size so use 4K for simplicity. 
- */ - e->pagesize =3D RMP_PG_SIZE_4K; + if (use_large_entry && IS_ALIGNED(vaddr, PMD_SIZE) && + (vaddr_end - vaddr) >=3D PMD_SIZE) { + e->pagesize =3D RMP_PG_SIZE_2M; + vaddr +=3D PMD_SIZE; + } else { + e->pagesize =3D RMP_PG_SIZE_4K; + vaddr +=3D PAGE_SIZE; + } =20 - vaddr =3D vaddr + PAGE_SIZE; e++; i++; } =20 + /* Page validation must be rescinded before changing to shared */ + if (op =3D=3D SNP_PAGE_STATE_SHARED) + pvalidate_pages(data); + local_irq_save(flags); =20 if (sev_cfg.ghcbs_initialized) @@ -879,6 +918,7 @@ static void __set_pages_state(struct snp_psc_desc *data= , unsigned long vaddr, else ghcb =3D boot_ghcb; =20 + /* Invoke the hypervisor to perform the page state changes */ if (!ghcb || vmgexit_psc(ghcb, data)) sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); =20 @@ -886,29 +926,28 @@ static void __set_pages_state(struct snp_psc_desc *da= ta, unsigned long vaddr, __sev_put_ghcb(&state); =20 local_irq_restore(flags); + + /* Page validation must be performed after changing to private */ + if (op =3D=3D SNP_PAGE_STATE_PRIVATE) + pvalidate_pages(data); + + return vaddr; } =20 static void set_pages_state(unsigned long vaddr, unsigned long npages, int= op) { - unsigned long vaddr_end, next_vaddr; struct snp_psc_desc desc; + unsigned long vaddr_end; =20 /* Use the MSR protocol when a GHCB is not available. */ if (!boot_ghcb) - return early_set_pages_state(__pa(vaddr), npages, op); + return early_set_pages_state(vaddr, __pa(vaddr), npages, op); =20 vaddr =3D vaddr & PAGE_MASK; vaddr_end =3D vaddr + (npages << PAGE_SHIFT); =20 - while (vaddr < vaddr_end) { - /* Calculate the last vaddr that fits in one struct snp_psc_desc. 
*/ - next_vaddr =3D min_t(unsigned long, vaddr_end, - (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr); - - __set_pages_state(&desc, vaddr, next_vaddr, op); - - vaddr =3D next_vaddr; - } + while (vaddr < vaddr_end) + vaddr =3D __set_pages_state(&desc, vaddr, vaddr_end, op); } =20 void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) @@ -916,8 +955,6 @@ void snp_set_memory_shared(unsigned long vaddr, unsigne= d long npages) if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) return; =20 - pvalidate_pages(vaddr, npages, false); - set_pages_state(vaddr, npages, SNP_PAGE_STATE_SHARED); } =20 @@ -927,8 +964,6 @@ void snp_set_memory_private(unsigned long vaddr, unsign= ed long npages) return; =20 set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE); - - pvalidate_pages(vaddr, npages, true); } =20 static int snp_set_vmsa(void *va, bool vmsa) --=20 2.40.0 From nobody Tue Feb 10 00:22:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 829A8C77B75 for ; Fri, 19 May 2023 16:25:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230373AbjESQZl (ORCPT ); Fri, 19 May 2023 12:25:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52384 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229711AbjESQZi (ORCPT ); Fri, 19 May 2023 12:25:38 -0400 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on20605.outbound.protection.outlook.com [IPv6:2a01:111:f400:fe59::605]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3AEED171E for ; Fri, 19 May 2023 09:25:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=cYkSCImi+aqxX4pjTxZeiaXO6prF1dAwHB6SzMW1HHu8BkhpjyQF323xV8inXrvL5T+GIW1o1hhdl0bB6pXw/MGWXN87aDYD0YpYK0kLvQEUWnRS/D9nRCOXejkwmDlHXsfJPI2ZDbawEJ8wuTF2zfBryVPuwdXNDm4evbQO5d6mQV84sEafkhP/9mbW+cQE0YxlzWUD9p1oPh6tzYZe8cOP+AVMUznxarwqSSK/FUC1Xpo9lEK/HZUZ06KFNi68M/JCPTI/lTo161DlLJ6RRY5sKPzJs8QlFF5xWnxCUx+cLYYutxs/hSnlZlS5iO0hROHKyz1pHNZLp5MFQdfipA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MRl98XliT95INHRWHlCFvP+XcTziK7Bl8ILB8nXUGFo=; b=QlXCHxysixqyjC49NMHx8F96lE4P4vP8xxumZx1+KLlOoElhLfHC7/RIWeqOfsGtxQCxVpu56GuKnbpzH6F3Ev3a7SSN0VpddJZt65ni53hYhZHbMx7gRukHNKTxELM2IigxAuKUjSoKnb5AFcsyRQUu3zayZe0r00et6EQll0B6JyWaK+0ze2LX8N/7wB/4iWm4PoZ7nXmtYtnxtng6slr5fJMP00CToFQTpWsHcaVBePyCc8+0FjiF30GPu+bwmtfu+aUF4Q2DiFQdy/QdPsfxdY8CYj1whqOj2voEf81/c/7vRsMVit+SO9FaUSmnLiEsPVTMxBU11KB/QyjSRQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=MRl98XliT95INHRWHlCFvP+XcTziK7Bl8ILB8nXUGFo=; b=FoC68mvC4ymU+P0vroxuuPNP91GjxOmNTHbROR/tik/syNTjuzsacQZ9w8yCBosFb7Vv1gdZrldvfmPt/A5b5TQmTkjAKEYNLRklk5lQ/1V3oZ8BWkbsOhPzPpXFjz/1kQTl7+FvhxTZMsOjSkY/wduP0+7rg3ukGVNAevoaSSo= Received: from MW2PR2101CA0026.namprd21.prod.outlook.com (2603:10b6:302:1::39) by MN0PR12MB6002.namprd12.prod.outlook.com (2603:10b6:208:37e::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6387.30; Fri, 19 May 2023 16:25:12 +0000 Received: from 
CO1NAM11FT010.eop-nam11.prod.protection.outlook.com (2603:10b6:302:1:cafe::dc) by MW2PR2101CA0026.outlook.office365.com (2603:10b6:302:1::39) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6433.7 via Frontend Transport; Fri, 19 May 2023 16:25:12 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1NAM11FT010.mail.protection.outlook.com (10.13.175.88) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6411.21 via Frontend Transport; Fri, 19 May 2023 16:25:11 +0000 Received: from tlendack-t1.amdoffice.net (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Fri, 19 May 2023 11:25:07 -0500 From: Tom Lendacky To: , CC: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "Kirill A. Shutemov" , "H. 
Peter Anvin" , Michael Roth , Joerg Roedel , Dionna Glaze , Andy Lutomirski , Peter Zijlstra Subject: [PATCH 5/6] x86/sev: Add SNP-specific unaccepted memory support Date: Fri, 19 May 2023 11:24:16 -0500 Message-ID: <262831c7c519fedb1bdb3b51bf3019d49e68ff9e.1684513457.git.thomas.lendacky@amd.com> X-Mailer: git-send-email 2.40.0 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1NAM11FT010:EE_|MN0PR12MB6002:EE_ X-MS-Office365-Filtering-Correlation-Id: b8be37b3-58a1-48e7-71eb-08db58859ecf X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: VDQkAp83j9w1hkr7sYih/eWIBTSA5p+tfsct5yEejNTVuFG9AOGbwLtPRxrrFj/mQGJ1mUoDWuwpl9+uvnkVO577EvlUEg4ksQhnvU2eMmRjCl0dXHvqz5wL2b6pettyoyHEZRc3vtZJzFLO27ynC+JAsoFP4xd7F2aEJVb+so8OE6AcBcZ9nr6k5vtlcVuDbHzHvH848niEIZf0XpnHWCcA3eQMuKNs2vQwaV72jVHmBdM3bIlYpA3W1Z1zyxg9uODQNjnURwA3+FEaRVLBNVLwse0LbTFWQKMII5I5xGeUhufywqICiQgzTBKjALVAJiHkkmomopATNlSUz2z2em6Rm7DktiCMIRcYTxupVbsiplktgQwogNq9TMFVUWOlUCVGdIo4eBEWNcAj7xrbN+wgdbzydJ/opPgMt2S28dS/BU7kLDynT41razMFaHoSrSl9zhKMP5zEW0/DVmLoWX3qVF6KtL5W4efV4QF3lOrlH0ZSzxig1PPM6y3DDbpWltdv55BaTbg9CVqkHzxTBWXD0LS/GG+u/CMpFiT1QVtGyt8G4SqIKzwPpuEDlfrwMXD7vqSY8NWl6+HfUE8QpFKKN5WRdUoF9eCnQw9KDyrPMu755hW4/NAmOkrnpxmJO6qnbIHlhbCTj0i2wEWSK8fSEQuhETBxxsBt/QN80zMIi0hJUzvlKmP+bcMsHICZ6wdDGerTierhICCr8XTAd7EaFOnICgSje4TuxdlC4gp1dln7Elwt+x9un8lXmPA9ZXfewMnvbuTa4n8wjh+Cwg== X-Forefront-Antispam-Report: 
CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230028)(4636009)(346002)(396003)(376002)(136003)(39860400002)(451199021)(36840700001)(40470700004)(46966006)(82310400005)(83380400001)(47076005)(426003)(336012)(36860700001)(81166007)(2906002)(82740400003)(316002)(41300700001)(356005)(4326008)(70586007)(478600001)(5660300002)(70206006)(40480700001)(2616005)(40460700003)(26005)(54906003)(8676002)(86362001)(16526019)(8936002)(30864003)(7416002)(110136005)(186003)(36756003)(6666004)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 19 May 2023 16:25:11.9784 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: b8be37b3-58a1-48e7-71eb-08db58859ecf X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT010.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: MN0PR12MB6002 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add SNP-specific hooks to the unaccepted memory support in the boot path (__accept_memory()) and the core kernel (accept_memory()) in order to support booting SNP guests when unaccepted memory is present. Without this support, SNP guests will fail to boot and/or panic() when unaccepted memory is present in the EFI memory map. The process of accepting memory under SNP involves invoking the hypervisor to perform a page state change for the page to private memory and then issuing a PVALIDATE instruction to accept the page. 
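The ordering described above (page state change to private first, PVALIDATE second) can be pictured with a small toy model. This is only a sketch under stated assumptions: toy_page, toy_psc_private, toy_pvalidate, toy_accept and toy_usable are hypothetical names, and the rule that the modeled PVALIDATE fails on a page that has not been made private is a simplification chosen for the example, not a description of the hardware behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the state of one page; not the real RMP entry layout. */
struct toy_page {
	bool assigned_private;	/* set by the modeled page state change */
	bool validated;		/* set by the modeled PVALIDATE */
};

/* Modeled page state change: the hypervisor marks the page private. */
int toy_psc_private(struct toy_page *pg)
{
	assert(pg != NULL);
	pg->assigned_private = true;
	pg->validated = false;	/* toy assumption: starts out unvalidated */
	return 0;
}

/* Modeled PVALIDATE: in this toy it only succeeds on a private page. */
int toy_pvalidate(struct toy_page *pg)
{
	assert(pg != NULL);
	if (!pg->assigned_private)
		return -1;
	pg->validated = true;
	return 0;
}

/* Accepting a page: page state change first, validation second. */
int toy_accept(struct toy_page *pg)
{
	if (toy_psc_private(pg))
		return -1;
	return toy_pvalidate(pg);
}

/* In the toy model, a page is usable only once both steps are done. */
bool toy_usable(const struct toy_page *pg)
{
	return pg->assigned_private && pg->validated;
}
```

In this model, calling toy_pvalidate() on a fresh page fails, while toy_accept() leaves the page usable, which mirrors the order of vmgexit_psc() followed by pvalidate_pages() in the patch.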
Since the boot path and the core kernel paths perform similar operations, move the pvalidate_pages() and vmgexit_psc() functions into sev-shared.c to avoid code duplication. Create the new header file arch/x86/boot/compressed/sev.h because adding the function declaration to any of the existing SEV related header files pulls in too many other header files, causing the build to fail. Signed-off-by: Tom Lendacky --- arch/x86/Kconfig | 2 + arch/x86/boot/compressed/mem.c | 3 + arch/x86/boot/compressed/sev.c | 54 ++++++++++- arch/x86/boot/compressed/sev.h | 23 +++++ arch/x86/include/asm/sev.h | 3 + arch/x86/include/asm/unaccepted_memory.h | 3 + arch/x86/kernel/sev-shared.c | 103 +++++++++++++++++++++ arch/x86/kernel/sev.c | 112 +++-------------------- 8 files changed, 204 insertions(+), 99 deletions(-) create mode 100644 arch/x86/boot/compressed/sev.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5c72067c06d4..b9c451f75d5e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1543,11 +1543,13 @@ config X86_MEM_ENCRYPT config AMD_MEM_ENCRYPT bool "AMD Secure Memory Encryption (SME) support" depends on X86_64 && CPU_SUP_AMD + depends on EFI_STUB select DMA_COHERENT_POOL select ARCH_USE_MEMREMAP_PROT select INSTRUCTION_DECODER select ARCH_HAS_CC_PLATFORM select X86_MEM_ENCRYPT + select UNACCEPTED_MEMORY help Say yes to enable support for the encryption of system memory. 
This requires an AMD processor that supports Secure Memory diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c index 8df3d988ae69..c8f2353f6894 100644 --- a/arch/x86/boot/compressed/mem.c +++ b/arch/x86/boot/compressed/mem.c @@ -3,6 +3,7 @@ #include "error.h" #include "misc.h" #include "tdx.h" +#include "sev.h" #include =20 /* @@ -36,6 +37,8 @@ void arch_accept_memory(phys_addr_t start, phys_addr_t en= d) /* Platform-specific memory-acceptance call goes here */ if (early_is_tdx_guest()) tdx_accept_memory(start, end); + else if (sev_snp_enabled()) + snp_accept_memory(start, end); else error("Cannot accept memory: unknown platform\n"); } diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index 014b89c89088..09dc8c187b3c 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -115,7 +115,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ct= xt, /* Include code for early handlers */ #include "../../kernel/sev-shared.c" =20 -static inline bool sev_snp_enabled(void) +bool sev_snp_enabled(void) { return sev_status & MSR_AMD64_SEV_SNP_ENABLED; } @@ -181,6 +181,58 @@ static bool early_setup_ghcb(void) return true; } =20 +static phys_addr_t __snp_accept_memory(struct snp_psc_desc *desc, + phys_addr_t pa, phys_addr_t pa_end) +{ + struct psc_hdr *hdr; + struct psc_entry *e; + unsigned int i; + + hdr =3D &desc->hdr; + memset(hdr, 0, sizeof(*hdr)); + + e =3D desc->entries; + + i =3D 0; + while (pa < pa_end && i < VMGEXIT_PSC_MAX_ENTRY) { + hdr->end_entry =3D i; + + e->gfn =3D pa >> PAGE_SHIFT; + e->operation =3D SNP_PAGE_STATE_PRIVATE; + if (IS_ALIGNED(pa, PMD_SIZE) && (pa_end - pa) >=3D PMD_SIZE) { + e->pagesize =3D RMP_PG_SIZE_2M; + pa +=3D PMD_SIZE; + } else { + e->pagesize =3D RMP_PG_SIZE_4K; + pa +=3D PAGE_SIZE; + } + + e++; + i++; + } + + if (vmgexit_psc(boot_ghcb, desc)) + sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); + + pvalidate_pages(desc); + + return pa; +} + +void 
snp_accept_memory(phys_addr_t start, phys_addr_t end) +{ + struct snp_psc_desc desc =3D {}; + unsigned int i; + phys_addr_t pa; + + if (!boot_ghcb && !early_setup_ghcb()) + sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); + + pa =3D start; + while (pa < end) + pa =3D __snp_accept_memory(&desc, pa, end); +} + void sev_es_shutdown_ghcb(void) { if (!boot_ghcb) diff --git a/arch/x86/boot/compressed/sev.h b/arch/x86/boot/compressed/sev.h new file mode 100644 index 000000000000..fc725a981b09 --- /dev/null +++ b/arch/x86/boot/compressed/sev.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * AMD SEV header for early boot related functions. + * + * Author: Tom Lendacky + */ + +#ifndef BOOT_COMPRESSED_SEV_H +#define BOOT_COMPRESSED_SEV_H + +#ifdef CONFIG_AMD_MEM_ENCRYPT + +bool sev_snp_enabled(void); +void snp_accept_memory(phys_addr_t start, phys_addr_t end); + +#else + +static inline bool sev_snp_enabled(void) { return false; } +static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) {= } + +#endif + +#endif diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h index e21e1c5397c1..86e1296e87f5 100644 --- a/arch/x86/include/asm/sev.h +++ b/arch/x86/include/asm/sev.h @@ -206,6 +206,7 @@ void snp_set_wakeup_secondary_cpu(void); bool snp_init(struct boot_params *bp); void __init __noreturn snp_abort(void); int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, str= uct snp_guest_request_ioctl *rio); +void snp_accept_memory(phys_addr_t start, phys_addr_t end); #else static inline void sev_es_ist_enter(struct pt_regs *regs) { } static inline void sev_es_ist_exit(void) { } @@ -229,6 +230,8 @@ static inline int snp_issue_guest_request(u64 exit_code= , struct snp_req_data *in { return -ENOTTY; } + +static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) {= } #endif =20 #endif diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/as= m/unaccepted_memory.h index 
72b354f992bb..ed3fcd3ac9dd 100644 --- a/arch/x86/include/asm/unaccepted_memory.h +++ b/arch/x86/include/asm/unaccepted_memory.h @@ -3,12 +3,15 @@ =20 #include #include +#include =20 static inline void arch_accept_memory(phys_addr_t start, phys_addr_t end) { /* Platform-specific memory-acceptance call goes here */ if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) { tdx_accept_memory(start, end); + } else if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) { + snp_accept_memory(start, end); } else { panic("Cannot accept memory: unknown platform\n"); } diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c index 3a5b0c9c4fcc..be312db48a49 100644 --- a/arch/x86/kernel/sev-shared.c +++ b/arch/x86/kernel/sev-shared.c @@ -12,6 +12,9 @@ #ifndef __BOOT_COMPRESSED #define error(v) pr_err(v) #define has_cpuflag(f) boot_cpu_has(f) +#else +#undef WARN +#define WARN(condition, format...) (!!(condition)) #endif =20 /* I/O parameters for CPUID-related helpers */ @@ -991,3 +994,103 @@ static void __init setup_cpuid_table(const struct cc_= blob_sev_info *cc_info) cpuid_ext_range_max =3D fn->eax; } } + +static void pvalidate_pages(struct snp_psc_desc *desc) +{ + struct psc_entry *e; + unsigned long vaddr; + unsigned int size; + unsigned int i; + bool validate; + int rc; + + for (i =3D 0; i <=3D desc->hdr.end_entry; i++) { + e =3D &desc->entries[i]; + + vaddr =3D (unsigned long)pfn_to_kaddr(e->gfn); + size =3D e->pagesize ? RMP_PG_SIZE_2M : RMP_PG_SIZE_4K; + validate =3D (e->operation =3D=3D SNP_PAGE_STATE_PRIVATE) ? 
true : false; + + rc =3D pvalidate(vaddr, size, validate); + if (rc =3D=3D PVALIDATE_FAIL_SIZEMISMATCH && size =3D=3D RMP_PG_SIZE_2M)= { + unsigned long vaddr_end =3D vaddr + PMD_SIZE; + + for (; vaddr < vaddr_end; vaddr +=3D PAGE_SIZE) { + rc =3D pvalidate(vaddr, RMP_PG_SIZE_4K, validate); + if (rc) + break; + } + } + + if (rc) { + WARN(1, "Failed to validate address 0x%lx ret %d", vaddr, rc); + sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE); + } + } +} + +static int vmgexit_psc(struct ghcb *ghcb, struct snp_psc_desc *desc) +{ + int cur_entry, end_entry, ret =3D 0; + struct snp_psc_desc *data; + struct es_em_ctxt ctxt; + + vc_ghcb_invalidate(ghcb); + + /* Copy the input desc into GHCB shared buffer */ + data =3D (struct snp_psc_desc *)ghcb->shared_buffer; + memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof= (*desc))); + + /* + * As per the GHCB specification, the hypervisor can resume the guest + * before processing all the entries. Check whether all the entries + * are processed. If not, then keep retrying. Note, the hypervisor + * will update the data memory directly to indicate the status, so + * reference the data->hdr everywhere. + * + * The strategy here is to wait for the hypervisor to change the page + * state in the RMP table before guest accesses the memory pages. If the + * page state change was not successful, then later memory access will + * result in a crash. + */ + cur_entry =3D data->hdr.cur_entry; + end_entry =3D data->hdr.end_entry; + + while (data->hdr.cur_entry <=3D data->hdr.end_entry) { + ghcb_set_sw_scratch(ghcb, (u64)__pa(data)); + + /* This will advance the shared buffer data points to. */ + ret =3D sev_es_ghcb_hv_call(ghcb, &ctxt, SVM_VMGEXIT_PSC, 0, 0); + + /* + * Page State Change VMGEXIT can pass error code through + * exit_info_2. 
+ */ + if (WARN(ret || ghcb->save.sw_exit_info_2, + "SNP: PSC failed ret=3D%d exit_info_2=3D%llx\n", + ret, ghcb->save.sw_exit_info_2)) { + ret =3D 1; + goto out; + } + + /* Verify that reserved bit is not set */ + if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n"))= { + ret =3D 1; + goto out; + } + + /* + * Sanity check that entry processing is not going backwards. + * This will happen only if hypervisor is tricking us. + */ + if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_en= try, +"SNP: PSC processing going backward, end_entry %d (got %d) cur_entry %d (g= ot %d)\n", + end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) { + ret =3D 1; + goto out; + } + } + +out: + return ret; +} diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index 8802a75e1c20..ea2546e5130f 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -657,38 +657,6 @@ static u64 __init get_jump_table_addr(void) return ret; } =20 -static void pvalidate_pages(struct snp_psc_desc *desc) -{ - struct psc_entry *e; - unsigned long vaddr; - unsigned int size; - unsigned int i; - bool validate; - int rc; - - for (i =3D 0; i <=3D desc->hdr.end_entry; i++) { - e =3D &desc->entries[i]; - - vaddr =3D (unsigned long)pfn_to_kaddr(e->gfn); - size =3D e->pagesize ? RMP_PG_SIZE_2M : RMP_PG_SIZE_4K; - validate =3D (e->operation =3D=3D SNP_PAGE_STATE_PRIVATE) ? 
true : false; - - rc =3D pvalidate(vaddr, size, validate); - if (rc =3D=3D PVALIDATE_FAIL_SIZEMISMATCH && size =3D=3D RMP_PG_SIZE_2M)= { - unsigned long vaddr_end =3D vaddr + PMD_SIZE; - - for (; vaddr < vaddr_end; vaddr +=3D PAGE_SIZE) { - rc =3D pvalidate(vaddr, RMP_PG_SIZE_4K, validate); - if (rc) - break; - } - } - - if (WARN(rc, "Failed to validate address 0x%lx ret %d", vaddr, rc)) - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE); - } -} - static void early_set_pages_state(unsigned long vaddr, unsigned long paddr, unsigned long npages, enum psc_op op) { @@ -796,72 +764,6 @@ void __init snp_prep_memory(unsigned long paddr, unsig= ned int sz, enum psc_op op WARN(1, "invalid memory op %d\n", op); } =20 -static int vmgexit_psc(struct ghcb *ghcb, struct snp_psc_desc *desc) -{ - int cur_entry, end_entry, ret =3D 0; - struct snp_psc_desc *data; - struct es_em_ctxt ctxt; - - vc_ghcb_invalidate(ghcb); - - /* Copy the input desc into GHCB shared buffer */ - data =3D (struct snp_psc_desc *)ghcb->shared_buffer; - memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof= (*desc))); - - /* - * As per the GHCB specification, the hypervisor can resume the guest - * before processing all the entries. Check whether all the entries - * are processed. If not, then keep retrying. Note, the hypervisor - * will update the data memory directly to indicate the status, so - * reference the data->hdr everywhere. - * - * The strategy here is to wait for the hypervisor to change the page - * state in the RMP table before guest accesses the memory pages. If the - * page state change was not successful, then later memory access will - * result in a crash. - */ - cur_entry =3D data->hdr.cur_entry; - end_entry =3D data->hdr.end_entry; - - while (data->hdr.cur_entry <=3D data->hdr.end_entry) { - ghcb_set_sw_scratch(ghcb, (u64)__pa(data)); - - /* This will advance the shared buffer data points to. 
*/ - ret =3D sev_es_ghcb_hv_call(ghcb, &ctxt, SVM_VMGEXIT_PSC, 0, 0); - - /* - * Page State Change VMGEXIT can pass error code through - * exit_info_2. - */ - if (WARN(ret || ghcb->save.sw_exit_info_2, - "SNP: PSC failed ret=3D%d exit_info_2=3D%llx\n", - ret, ghcb->save.sw_exit_info_2)) { - ret =3D 1; - goto out; - } - - /* Verify that reserved bit is not set */ - if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n"))= { - ret =3D 1; - goto out; - } - - /* - * Sanity check that entry processing is not going backwards. - * This will happen only if hypervisor is tricking us. - */ - if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_en= try, -"SNP: PSC processing going backward, end_entry %d (got %d) cur_entry %d (g= ot %d)\n", - end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) { - ret =3D 1; - goto out; - } - } - -out: - return ret; -} - static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned= long vaddr, unsigned long vaddr_end, int op) { @@ -966,6 +868,20 @@ void snp_set_memory_private(unsigned long vaddr, unsig= ned long npages) set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE); } =20 +void snp_accept_memory(phys_addr_t start, phys_addr_t end) +{ + unsigned long vaddr; + unsigned int npages; + + if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) + return; + + vaddr =3D (unsigned long)__va(start); + npages =3D (end - start) >> PAGE_SHIFT; + + set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE); +} + static int snp_set_vmsa(void *va, bool vmsa) { u64 attrs; --=20 2.40.0 From nobody Tue Feb 10 00:22:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D177C7EE30 for ; Thu, 18 May 2023 23:16:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230450AbjERXPj (ORCPT ); Thu, 18 
May 2023 19:15:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33880 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230226AbjERXPC (ORCPT ); Thu, 18 May 2023 19:15:02 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1B2DE49; Thu, 18 May 2023 16:15:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1684451700; x=1715987700; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rBInWBXCqa9zaNha3sJK2PKA62i2doD1PS1LbLbl/Do=; b=D927NiBV0ZH0nu/c13W7zxsWsKBY6MO+LFiTrrmDWQ5zNdF70ljVRnau AUxU2kjoy78ZtR/evthNWjLlRaVgVyeUO+X11vcouC9nMjgKGTzHwxjIl pfuXTXv6sh1dbmSmr2iATkvQO9f4Wp18a50Lq8EtfoZq70tnNW+IFc3ot iBcSSYCm5lhg0AxdNLdiYMMt6hAKSJAFtKIELt9cuW2nEzYY2lH1H435q wjU7ZUZm1+UcyTs4JFkLhenZst8UOxg+HH3C6AOWgQ/S2uOpVtakFvXsj 8yoDZSa+0iNSr3pyHdqK/KFZVqS0nzxvUIFz+LNt6E8lyr1jqStben+6s g==; X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="341652104" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="341652104" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:14:59 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="772065101" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="772065101" Received: from rkiyama-mobl1.amr.corp.intel.com (HELO box.shutemov.name) ([10.251.222.16]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:14:52 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 4B24010DFD0; Fri, 19 May 2023 02:14:40 +0300 (+03) From: "Kirill A. 
Shutemov" To: Borislav Petkov , Andy Lutomirski , Dave Hansen , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv12 5/9] efi: Add unaccepted memory support Date: Fri, 19 May 2023 02:14:30 +0300 Message-Id: <20230518231434.26080-6-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

efi_config_parse_tables() reserves the memory that holds the unaccepted memory configuration table so that it won't be reused by the page allocator.

Core-mm requires a few helpers to support unaccepted memory:

 - accept_memory() checks the range of addresses against the bitmap and accepts memory if needed.

 - range_contains_unaccepted_memory() checks if anything within the range requires acceptance.

Architectural code has to provide efi_get_unaccepted_table(), which returns a pointer to the unaccepted memory configuration table.

arch_accept_memory() handles the arch-specific part of memory acceptance.

Signed-off-by: Kirill A.
Shutemov
Reviewed-by: Ard Biesheuvel
Reviewed-by: Tom Lendacky
---
 arch/x86/platform/efi/efi.c              |   3 +
 drivers/firmware/efi/Makefile            |   1 +
 drivers/firmware/efi/efi.c               |  25 ++++++
 drivers/firmware/efi/unaccepted_memory.c | 103 +++++++++++++++++++++++
 include/linux/efi.h                      |   1 +
 5 files changed, 133 insertions(+)
 create mode 100644 drivers/firmware/efi/unaccepted_memory.c

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index f3f2d87cce1b..e9f99c56f3ce 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -96,6 +96,9 @@ static const unsigned long * const efi_tables[] = {
 #ifdef CONFIG_EFI_COCO_SECRET
 	&efi.coco_secret,
 #endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	&efi.unaccepted,
+#endif
 };
 
 u64 efi_setup; /* efi setup_data physical address */
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index b51f2a4c821e..e489fefd23da 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -41,3 +41,4 @@ obj-$(CONFIG_EFI_CAPSULE_LOADER) += capsule-loader.o
 obj-$(CONFIG_EFI_EARLYCON) += earlycon.o
 obj-$(CONFIG_UEFI_CPER_ARM) += cper-arm.o
 obj-$(CONFIG_UEFI_CPER_X86) += cper-x86.o
+obj-$(CONFIG_UNACCEPTED_MEMORY) += unaccepted_memory.o
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 7dce06e419c5..d817e7afd266 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -50,6 +50,9 @@ struct efi __read_mostly efi = {
 #ifdef CONFIG_EFI_COCO_SECRET
 	.coco_secret = EFI_INVALID_TABLE_ADDR,
 #endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	.unaccepted = EFI_INVALID_TABLE_ADDR,
+#endif
 };
 EXPORT_SYMBOL(efi);
 
@@ -605,6 +608,9 @@ static const efi_config_table_type_t common_tables[] __initconst = {
 #ifdef CONFIG_EFI_COCO_SECRET
 	{LINUX_EFI_COCO_SECRET_AREA_GUID, &efi.coco_secret, "CocoSecret" },
 #endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	{LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID, &efi.unaccepted, "Unaccepted" },
+#endif
 #ifdef CONFIG_EFI_GENERIC_STUB
 	{LINUX_EFI_SCREEN_INFO_TABLE_GUID, &screen_info_table },
 #endif
@@ -759,6 +765,25 @@ int __init efi_config_parse_tables(const efi_config_table_t *config_tables,
 		}
 	}
 
+	if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY) &&
+	    efi.unaccepted != EFI_INVALID_TABLE_ADDR) {
+		struct efi_unaccepted_memory *unaccepted;
+
+		unaccepted = early_memremap(efi.unaccepted, sizeof(*unaccepted));
+		if (unaccepted) {
+			unsigned long size;
+
+			if (unaccepted->version == 1) {
+				size = sizeof(*unaccepted) + unaccepted->size;
+				memblock_reserve(efi.unaccepted, size);
+			} else {
+				efi.unaccepted = EFI_INVALID_TABLE_ADDR;
+			}
+
+			early_memunmap(unaccepted, sizeof(*unaccepted));
+		}
+	}
+
 	return 0;
 }
 
diff --git a/drivers/firmware/efi/unaccepted_memory.c b/drivers/firmware/efi/unaccepted_memory.c
new file mode 100644
index 000000000000..bb91c41f76fb
--- /dev/null
+++ b/drivers/firmware/efi/unaccepted_memory.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include
+#include
+#include
+#include
+
+/* Protects unaccepted memory bitmap */
+static DEFINE_SPINLOCK(unaccepted_memory_lock);
+
+void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+	struct efi_unaccepted_memory *unaccepted;
+	unsigned long range_start, range_end;
+	unsigned long flags;
+	u64 unit_size;
+
+	if (efi.unaccepted == EFI_INVALID_TABLE_ADDR)
+		return;
+
+	unaccepted = efi_get_unaccepted_table();
+	if (!unaccepted)
+		return;
+
+	unit_size = unaccepted->unit_size;
+
+	/*
+	 * Only care for the part of the range that is represented
+	 * in the bitmap.
+	 */
+	if (start < unaccepted->phys_base)
+		start = unaccepted->phys_base;
+	if (end < unaccepted->phys_base)
+		return;
+
+	/* Translate to offsets from the beginning of the bitmap */
+	start -= unaccepted->phys_base;
+	end -= unaccepted->phys_base;
+
+	/* Make sure not to overrun the bitmap */
+	if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
+		end = unaccepted->size * unit_size * BITS_PER_BYTE;
+
+	range_start = start / unit_size;
+
+	spin_lock_irqsave(&unaccepted_memory_lock, flags);
+	for_each_set_bitrange_from(range_start, range_end, unaccepted->bitmap,
+				   DIV_ROUND_UP(end, unit_size)) {
+		unsigned long phys_start, phys_end;
+		unsigned long len = range_end - range_start;
+
+		phys_start = range_start * unit_size + unaccepted->phys_base;
+		phys_end = range_end * unit_size + unaccepted->phys_base;
+
+		arch_accept_memory(phys_start, phys_end);
+		bitmap_clear(unaccepted->bitmap, range_start, len);
+	}
+	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+}
+
+bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
+{
+	struct efi_unaccepted_memory *unaccepted;
+	unsigned long flags;
+	bool ret = false;
+	u64 unit_size;
+
+	unaccepted = efi_get_unaccepted_table();
+	if (!unaccepted)
+		return false;
+
+	unit_size = unaccepted->unit_size;
+
+	/*
+	 * Only care for the part of the range that is represented
+	 * in the bitmap.
+	 */
+	if (start < unaccepted->phys_base)
+		start = unaccepted->phys_base;
+	if (end < unaccepted->phys_base)
+		return false;
+
+	/* Translate to offsets from the beginning of the bitmap */
+	start -= unaccepted->phys_base;
+	end -= unaccepted->phys_base;
+
+	/* Make sure not to overrun the bitmap */
+	if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
+		end = unaccepted->size * unit_size * BITS_PER_BYTE;
+
+	spin_lock_irqsave(&unaccepted_memory_lock, flags);
+	while (start < end) {
+		if (test_bit(start / unit_size, unaccepted->bitmap)) {
+			ret = true;
+			break;
+		}
+
+		start += unit_size;
+	}
+	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+
+	return ret;
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 29cc622910da..9864f9c00da2 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -646,6 +646,7 @@ extern struct efi {
 	unsigned long tpm_final_log; /* TPM2 Final Events Log table */
 	unsigned long mokvar_table; /* MOK variable config table */
 	unsigned long coco_secret; /* Confidential computing secret table */
+	unsigned long unaccepted; /* Unaccepted memory table */
 
 	efi_get_time_t *get_time;
 	efi_set_time_t *set_time;
-- 
2.39.3

From nobody Tue Feb 10 00:22:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43A22C77B7A for ; Fri, 19 May 2023 16:25:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230405AbjESQZu (ORCPT ); Fri, 19 May 2023 12:25:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52500 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230469AbjESQZr (ORCPT ); Fri, 19 May 2023 12:25:47 -0400 Received: from NAM04-DM6-obe.outbound.protection.outlook.com (mail-dm6nam04on2066.outbound.protection.outlook.com [40.107.102.66])
by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 66B08173C for ; Fri, 19 May 2023 09:25:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=PxKL7yJ+25GB7ofqOF1Oy5QVd6DOu+gTBz+zctG36L41QS7mwCzhgo94UKB2S6Hi0vhWspH1nD+Bc8A8MyhIZEKH6/9bLukexQWGPdKJv0NF1nLNGa+PMEyIrUVKcqCshmU93i/GUcm7tACYV2dWhkryUGPQ2yWscQEqqFNV8DKFd3iPJdaDnO1pZ8/uG8q94HtpP7qAqJz4OF3ajtJHBlOORAYqTuJifqaFBebJfogWGXjtocvuhHuQ3ECsmYRD9/Mg8zQnkvRr3lC2qAwUrtWA7MNQWyTVZ3EWo8cw3bInF+J/ViWvxKNnUY/3u95hcoHxuT7D0t7T8iRVV6WiXA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MH9GCEWsgAt9YKaKenSWcFZ6c7NWvrzv26GHiVvz04o=; b=I1jADgBA3efdLAMGwbKbUKv3vizQTTR5oMQGhQmUpbYx059mbJ00FnVwPbPequJXu1mQi8qbxbte1/ixA9Miyb8+w2Zrb0dAeAgd9pPDJEJZz0fmV8EKZsdRDWqkZjZUyBEll5i4aTK1KAoQ5YIFMAhReNu/y7g4n7EqQOVulO0P8Wl/kSc1/vtvupUZJDrfxKhL2Sws8bAoziefXVmKz5GAur1/szfE2f5MHif6+zmo3nn9IAOBZNZfctZbPzNf6pyVY8u6P4o7p5DvUuSoFQSywiC2v4mXNl2wfuot5AWesD48Eh8WLWMa4tE0iKuFDxAzyPItJzj1vLr3J7/lDg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=MH9GCEWsgAt9YKaKenSWcFZ6c7NWvrzv26GHiVvz04o=; b=Bz3uTo7Lp8Pyi/yOvvGwfvHq/o5Hr/N6Z1LDYCNwORx5cPUmP4RVUu0rvHmo+EiFLoEAtkc4adfnjIbjXkw60NedaRO4dxDpZkHQl4LE55QqJ9G2nyc+KZXbD0OEYlYUQp0b8bCNJVjeXIaj7EaGC37giK/jB6M7IC+j4XxVwDk= Received: from MW4PR03CA0327.namprd03.prod.outlook.com (2603:10b6:303:dd::32) by DM4PR12MB8572.namprd12.prod.outlook.com 
(2603:10b6:8:17d::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6411.21; Fri, 19 May 2023 16:25:20 +0000 Received: from CO1NAM11FT080.eop-nam11.prod.protection.outlook.com (2603:10b6:303:dd:cafe::ad) by MW4PR03CA0327.outlook.office365.com (2603:10b6:303:dd::32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6411.21 via Frontend Transport; Fri, 19 May 2023 16:25:20 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1NAM11FT080.mail.protection.outlook.com (10.13.174.99) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6411.21 via Frontend Transport; Fri, 19 May 2023 16:25:20 +0000 Received: from tlendack-t1.amdoffice.net (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Fri, 19 May 2023 11:25:17 -0500 From: Tom Lendacky To: , CC: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "Kirill A. Shutemov" , "H. Peter Anvin" , Michael Roth , Joerg Roedel , Dionna Glaze , Andy Lutomirski , Peter Zijlstra , Ard Biescheuvel , "Min M. Xu" , Gerd Hoffmann , James Bottomley , Tom Lendacky , Jiewen Yao , Erdem Aktas , "Kirill A. 
Shutemov" Subject: [PATCH 6/6] x86/efi: Safely enable unaccepted memory in UEFI Date: Fri, 19 May 2023 11:24:17 -0500 Message-ID: <430269747cb961c7e70288b5808925f4fa4c544d.1684513457.git.thomas.lendacky@amd.com> X-Mailer: git-send-email 2.40.0 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1NAM11FT080:EE_|DM4PR12MB8572:EE_ X-MS-Office365-Filtering-Correlation-Id: 04c1b3d1-eeb3-43a3-621a-08db5885a3aa X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: kuDlCP9iubqIn85+NAyFj5Fw92YaMKoJ/bkRP2ev0YII2EeG5yYUkfIee8S5BtpRcn9M7vqoUmySUvX6Shyxedqclo+NwjL+DpGU5hxebB2howkPR0Wcwrx8iOH0QTocO1FCelkIRqJAv0he0T22E9mvPL7KXKlRWiA1zIHSmbRynYNWg4+EscMyEjkcDpEPy3c0ALOPAbAJQ+w1e8UKebZ097A81gam5/jUcbDKIILHk+ldmYmLsZ6hifDN0BzSuL+uzu8puGPkweRaLtFXNbpqUI3aIs68CwdAx6hhxElXGETSnFWrOnK8Axu52aTIunMrwW+yjKVxWLo5//uJ0WeoWGFloF86LtENeGf5ecgv7fGkJf8zwe7tChFmCXtgSzhHcLJxNof2PbEhIPj/mhJsXpmJopdU7CxNWmfKmuxk23BHJOkzdgSoyPH9PwwkqaoKy5h6sazUjn8ovia+Q+Q5f/LQUSpBPLvsxc9lo1pJBoVt4Bz5a/Q6y0lPKvpnbnHBtbwg9CQfrw89VXchBYy8PJwY/Z/aS7lYsvNwzMlEydcDRLeq2q4hLX0goZwxa7tq3ggT4U6VMRnBYzosX7mlzYZ4RbCE8bTGWDmhZEnukLzq7ht9EridjD3T6J6K09OB+IHHm4l/DpNFBIp7qPOCJ4QCgHVgrvHUA5ci0La5BgxW1Phz48fsBadRncjloxiFpOCyqmuOtJr5/sa94jHuZ3ITts59za/db8SGQSX8jtooCDVcJhYhQjTt5gpiJGkRXCz/QRC48ixrLt/K0Q== X-Forefront-Antispam-Report: 
CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230028)(4636009)(396003)(346002)(136003)(39860400002)(376002)(451199021)(40470700004)(36840700001)(46966006)(36860700001)(83380400001)(81166007)(82740400003)(356005)(40460700003)(36756003)(8936002)(5660300002)(8676002)(7416002)(41300700001)(82310400005)(40480700001)(86362001)(4326008)(316002)(70206006)(70586007)(426003)(47076005)(16526019)(110136005)(26005)(186003)(336012)(2906002)(478600001)(2616005)(54906003)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 19 May 2023 16:25:20.1398 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 04c1b3d1-eeb3-43a3-621a-08db5885a3aa X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT080.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM4PR12MB8572 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Dionna Glaze The UEFI v2.9 specification includes a new memory type to be used in environments where the OS must accept memory that is provided from its host. Before the introduction of this memory type, all memory was accepted eagerly in the firmware. In order for the firmware to safely stop accepting memory on the OS's behalf, the OS must affirmatively indicate support to the firmware. This is only a problem for AMD SEV-SNP, since Linux has had support for it since 5.19. 
The other technology that can make use of unaccepted memory, Intel TDX, does not yet have Linux support, so it can strictly require unaccepted memory support as a dependency of CONFIG_TDX and not require communication with the firmware.

Enabling unaccepted memory requires calling a 0-argument enablement protocol before ExitBootServices. This call is only made if the kernel is compiled with UNACCEPTED_MEMORY=y.

This protocol will be removed after the end of life of the first LTS that includes it, in order to give firmware implementations an expiration date for it. When the protocol is removed, firmware will strictly infer that a SEV-SNP VM is running an OS that supports the unaccepted memory type. At the earliest convenience, when unaccepted memory support is added to Linux, SEV-SNP may take a strict dependence on it. After the firmware removes support for the protocol, this patch should be reverted.

[tl: address some checkscript warnings]

Cc: Ard Biesheuvel
Cc: "Min M. Xu"
Cc: Gerd Hoffmann
Cc: James Bottomley
Cc: Tom Lendacky
Cc: Jiewen Yao
Cc: Erdem Aktas
Cc: "Kirill A.
Shutemov"
Cc: Dave Hansen
Cc: Borislav Petkov
Signed-off-by: Dionna Glaze
Signed-off-by: Tom Lendacky
---
 drivers/firmware/efi/libstub/x86-stub.c | 36 +++++++++++++++++++++++++
 include/linux/efi.h                     |  3 +++
 2 files changed, 39 insertions(+)

diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index 8d17cee8b98e..e2193dbe1f66 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -26,6 +26,17 @@ const efi_dxe_services_table_t *efi_dxe_table;
 u32 image_offset __section(".data");
 static efi_loaded_image_t *image = NULL;
 
+typedef union sev_memory_acceptance_protocol sev_memory_acceptance_protocol_t;
+union sev_memory_acceptance_protocol {
+	struct {
+		efi_status_t (__efiapi * allow_unaccepted_memory)(
+			sev_memory_acceptance_protocol_t *);
+	};
+	struct {
+		u32 allow_unaccepted_memory;
+	} mixed_mode;
+};
+
 static efi_status_t
 preserve_pci_rom_image(efi_pci_io_protocol_t *pci, struct pci_setup_rom **__rom)
 {
@@ -310,6 +321,29 @@ setup_memory_protection(unsigned long image_base, unsigned long image_size)
 #endif
 }
 
+static void setup_unaccepted_memory(void)
+{
+	efi_guid_t mem_acceptance_proto = OVMF_SEV_MEMORY_ACCEPTANCE_PROTOCOL_GUID;
+	sev_memory_acceptance_protocol_t *proto;
+	efi_status_t status;
+
+	if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY))
+		return;
+
+	/*
+	 * Enable unaccepted memory before calling exit boot services in order
+	 * for the UEFI to not accept all memory on EBS.
+	 */
+	status = efi_bs_call(locate_protocol, &mem_acceptance_proto, NULL,
+			     (void **)&proto);
+	if (status != EFI_SUCCESS)
+		return;
+
+	status = efi_call_proto(proto, allow_unaccepted_memory);
+	if (status != EFI_SUCCESS)
+		efi_err("Memory acceptance protocol failed\n");
+}
+
 static const efi_char16_t apple[] = L"Apple";
 
 static void setup_quirks(struct boot_params *boot_params,
@@ -908,6 +942,8 @@ asmlinkage unsigned long efi_main(efi_handle_t handle,
 
 	setup_quirks(boot_params, bzimage_addr, buffer_end - buffer_start);
 
+	setup_unaccepted_memory();
+
 	status = exit_boot(boot_params, handle);
 	if (status != EFI_SUCCESS) {
 		efi_err("exit_boot() failed!\n");
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 9864f9c00da2..8c5abcf70a05 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -437,6 +437,9 @@ void efi_native_runtime_setup(void);
 #define DELLEMC_EFI_RCI2_TABLE_GUID EFI_GUID(0x2d9f28a2, 0xa886, 0x456a, 0x97, 0xa8, 0xf1, 0x1e, 0xf2, 0x4f, 0xf4, 0x55)
 #define AMD_SEV_MEM_ENCRYPT_GUID EFI_GUID(0x0cf29b71, 0x9e51, 0x433a, 0xa3, 0xb7, 0x81, 0xf3, 0xab, 0x16, 0xb8, 0x75)
 
+/* OVMF protocol GUIDs */
+#define OVMF_SEV_MEMORY_ACCEPTANCE_PROTOCOL_GUID EFI_GUID(0xc5a010fe, 0x38a7, 0x4531, 0x8a, 0x4a, 0x05, 0x00, 0xd2, 0xfd, 0x16, 0x49)
+
 typedef struct {
 	efi_guid_t guid;
 	u64 table;
-- 
2.40.0

From nobody Tue Feb 10 00:22:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BADCCC7EE32 for ; Thu, 18 May 2023 23:16:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231173AbjERXPr (ORCPT ); Thu, 18 May 2023 19:15:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33906 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230410AbjERXPD (ORCPT );
Thu, 18 May 2023 19:15:03 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 84177E52; Thu, 18 May 2023 16:15:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1684451702; x=1715987702; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aHz7WO7wShdd679ArCfVvIALDE990d9+23AGlVsKS9M=; b=kPQpt/hQp2zOvYWKoaHJ27sf1IXWTzGjlVfEnsvE7eaChQsO1dfDEXr9 1A0jVrSi/xwGoVI/jTs0g3lsSmUUw3a49SOpjYpu2Q1We6lNLyb3QzUMk m8PcMuiMrAYH7zYYpJm+innrkmLhcRAFFFqpxsGABkmYRzAy7HJeSmfYM xDFMvJqeIufWrmrY4Bu75Ckmbv2e5cl/4CtDiXYlPqH1t6+dzrNJxOX2j to0/tOm+vhVLHKRFKR9FnOKVPM67Pv29O/iVGDOGguzhDyUv78bzcEb9S /2eVlt0zWAz6U0zGdTBgRLOiP5lXz38p3OdLuznX3jyoK+8RCay/YDz8r A==; X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="341652146" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="341652146" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:15:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="772065105" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="772065105" Received: from rkiyama-mobl1.amr.corp.intel.com (HELO box.shutemov.name) ([10.251.222.16]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:14:52 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 5530610DFD2; Fri, 19 May 2023 02:14:40 +0300 (+03) From: "Kirill A. 
Shutemov" To: Borislav Petkov , Andy Lutomirski , Dave Hansen , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" , Dave Hansen Subject: [PATCHv12 6/9] efi/unaccepted: Avoid load_unaligned_zeropad() stepping into unaccepted memory Date: Fri, 19 May 2023 02:14:31 +0300 Message-Id: <20230518231434.26080-7-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" load_unaligned_zeropad() can lead to unwanted loads across page boundaries. The unwanted loads are typically harmless. But, they might be made to totally unrelated or even unmapped memory. load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now #VE) to recover from these unwanted loads. But, this approach does not work for unaccepted memory. For TDX, a load from unaccepted memory will not lead to a recoverable exception within the guest. The guest will exit to the VMM where the only recourse is to terminate the guest. There are two parts to fix this issue and comprehensively avoid access to unaccepted memory. Together these ensure that an extra "guard" page is accepted in addition to the memory that needs to be used. 1. 
Implicitly extend the range_contains_unaccepted_memory(start, end) checks up to end+unit_size if 'end' is aligned on a unit_size boundary.

2. Implicitly extend accept_memory(start, end) to end+unit_size if 'end' is aligned on a unit_size boundary.

Side note: This leads to something strange. Pages which were accepted at boot, marked by the firmware as accepted, and which will never _need_ to be accepted might be on the unaccepted_pages list. This is a cue to ensure that the next page is accepted before 'page' can be used.

This is an actual, real-world problem which was discovered during TDX testing.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Dave Hansen
Reviewed-by: Ard Biesheuvel
Reviewed-by: Tom Lendacky
---
 drivers/firmware/efi/unaccepted_memory.c | 35 ++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/drivers/firmware/efi/unaccepted_memory.c b/drivers/firmware/efi/unaccepted_memory.c
index bb91c41f76fb..3d1ca60916dd 100644
--- a/drivers/firmware/efi/unaccepted_memory.c
+++ b/drivers/firmware/efi/unaccepted_memory.c
@@ -37,6 +37,34 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
 	start -= unaccepted->phys_base;
 	end -= unaccepted->phys_base;
 
+	/*
+	 * load_unaligned_zeropad() can lead to unwanted loads across page
+	 * boundaries. The unwanted loads are typically harmless. But, they
+	 * might be made to totally unrelated or even unmapped memory.
+	 * load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now
+	 * #VE) to recover from these unwanted loads.
+	 *
+	 * But, this approach does not work for unaccepted memory. For TDX, a
+	 * load from unaccepted memory will not lead to a recoverable exception
+	 * within the guest. The guest will exit to the VMM where the only
+	 * recourse is to terminate the guest.
+	 *
+	 * There are two parts to fix this issue and comprehensively avoid
+	 * access to unaccepted memory. Together these ensure that an extra
+	 * "guard" page is accepted in addition to the memory that needs to be
+	 * used:
+	 *
+	 * 1. Implicitly extend the range_contains_unaccepted_memory(start, end)
+	 *    checks up to end+unit_size if 'end' is aligned on a unit_size
+	 *    boundary.
+	 *
+	 * 2. Implicitly extend accept_memory(start, end) to end+unit_size if
+	 *    'end' is aligned on a unit_size boundary. (immediately following
+	 *    this comment)
+	 */
+	if (!(end % unit_size))
+		end += unit_size;
+
 	/* Make sure not to overrun the bitmap */
 	if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
 		end = unaccepted->size * unit_size * BITS_PER_BYTE;
@@ -84,6 +112,13 @@ bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
 	start -= unaccepted->phys_base;
 	end -= unaccepted->phys_base;
 
+	/*
+	 * Also consider the unaccepted state of the *next* page. See fix #1 in
+	 * the comment on load_unaligned_zeropad() in accept_memory().
+	 */
+	if (!(end % unit_size))
+		end += unit_size;
+
 	/* Make sure not to overrun the bitmap */
 	if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
 		end = unaccepted->size * unit_size * BITS_PER_BYTE;
-- 
2.39.3

From nobody Tue Feb 10 00:22:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2BBFC7EE2F for ; Thu, 18 May 2023 23:16:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230497AbjERXPo (ORCPT ); Thu, 18 May 2023 19:15:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33908 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230373AbjERXPE (ORCPT ); Thu, 18 May 2023 19:15:04 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 41EFEE48; Thu, 18 May 2023 16:15:03
-0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1684451703; x=1715987703; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Eu57mUO9ct2ICW4Etq/GrS2vGWKQMEDS8IVWtzUCpBo=; b=T/cA6gaKE29FzKyoMbYf0Sz4dLwK480DBZn0L41Z/a1mcKozRafGdXA7 WVMFBrKwxHU6VEMR9oB/6uYYlesl+P2u/SvUSPFUaQ/U+Uh8ocw34lugf fQlroM/sUIMpd7SDBTK1cmyEgrGjcHcOn4V1oHAF1IkxxUSeN1XH0pWA+ /re9TaBXpr+V5xyaZuHU8FJ9vGIyP6ilkLEKm3EikTVzJGNA+JOf90l+6 Ja9wbMgm/UABuWXBGb8vkhCEJfn4f3gnntFd6xIR+28v6XBryPm/GOq0J wBcXwmA7IAJW2/6NpJ6jlQx/Ph8NTF4pnTPE8sdH0OvqMvaMcxlUwZNzo w==; X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="341652161" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="341652161" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:15:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="772065107" X-IronPort-AV: E=Sophos;i="6.00,175,1681196400"; d="scan'208";a="772065107" Received: from rkiyama-mobl1.amr.corp.intel.com (HELO box.shutemov.name) ([10.251.222.16]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 16:14:52 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 6149C10DFD4; Fri, 19 May 2023 02:14:40 +0300 (+03) From: "Kirill A. 
Shutemov" To: Borislav Petkov , Andy Lutomirski , Dave Hansen , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Dario Faggioli , Mike Rapoport , David Hildenbrand , Mel Gorman , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" , Dave Hansen Subject: [PATCHv12 7/9] x86/tdx: Make _tdx_hypercall() and __tdx_module_call() available in boot stub Date: Fri, 19 May 2023 02:14:32 +0300 Message-Id: <20230518231434.26080-8-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Memory acceptance requires a hypercall and one or multiple module calls. Make helpers for the calls available in boot stub. It has to accept memory where kernel image and initrd are placed. Signed-off-by: Kirill A. 
Shutemov
Reviewed-by: Dave Hansen
---
 arch/x86/coco/tdx/tdx.c           | 32 -------------------
 arch/x86/include/asm/shared/tdx.h | 51 +++++++++++++++++++++++++++++++
 arch/x86/include/asm/tdx.h        | 19 ------------
 3 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index e146b599260f..e6f4c2758a68 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -14,20 +14,6 @@
 #include
 #include
 
-/* TDX module Call Leaf IDs */
-#define TDX_GET_INFO 1
-#define TDX_GET_VEINFO 3
-#define TDX_GET_REPORT 4
-#define TDX_ACCEPT_PAGE 6
-#define TDX_WR 8
-
-/* TDCS fields. To be used by TDG.VM.WR and TDG.VM.RD module calls */
-#define TDCS_NOTIFY_ENABLES 0x9100000000000010
-
-/* TDX hypercall Leaf IDs */
-#define TDVMCALL_MAP_GPA 0x10001
-#define TDVMCALL_REPORT_FATAL_ERROR 0x10003
-
 /* MMIO direction */
 #define EPT_READ 0
 #define EPT_WRITE 1
@@ -51,24 +37,6 @@
 
 #define TDREPORT_SUBTYPE_0 0
 
-/*
- * Wrapper for standard use of __tdx_hypercall with no output aside from
- * return code.
- */
-static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15)
-{
-	struct tdx_hypercall_args args = {
-		.r10 = TDX_HYPERCALL_STANDARD,
-		.r11 = fn,
-		.r12 = r12,
-		.r13 = r13,
-		.r14 = r14,
-		.r15 = r15,
-	};
-
-	return __tdx_hypercall(&args);
-}
-
 /* Called from __tdx_hypercall() for unrecoverable failure */
 noinstr void __tdx_hypercall_failed(void)
 {
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 2631e01f6e0f..1ff0ee822961 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -10,6 +10,20 @@
 #define TDX_CPUID_LEAF_ID 0x21
 #define TDX_IDENT "IntelTDX "
 
+/* TDX module Call Leaf IDs */
+#define TDX_GET_INFO 1
+#define TDX_GET_VEINFO 3
+#define TDX_GET_REPORT 4
+#define TDX_ACCEPT_PAGE 6
+#define TDX_WR 8
+
+/* TDCS fields. To be used by TDG.VM.WR and TDG.VM.RD module calls */
+#define TDCS_NOTIFY_ENABLES 0x9100000000000010
+
+/* TDX hypercall Leaf IDs */
+#define TDVMCALL_MAP_GPA 0x10001
+#define TDVMCALL_REPORT_FATAL_ERROR 0x10003
+
 #ifndef __ASSEMBLY__
 
 /*
@@ -37,8 +51,45 @@ struct tdx_hypercall_args {
 u64 __tdx_hypercall(struct tdx_hypercall_args *args);
 u64 __tdx_hypercall_ret(struct tdx_hypercall_args *args);
 
+/*
+ * Wrapper for standard use of __tdx_hypercall with no output aside from
+ * return code.
+ */
+static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15)
+{
+	struct tdx_hypercall_args args = {
+		.r10 = TDX_HYPERCALL_STANDARD,
+		.r11 = fn,
+		.r12 = r12,
+		.r13 = r13,
+		.r14 = r14,
+		.r15 = r15,
+	};
+
+	return __tdx_hypercall(&args);
+}
+
+
 /* Called from __tdx_hypercall() for unrecoverable failure */
 void __tdx_hypercall_failed(void);
 
+/*
+ * Used in __tdx_module_call() to gather the output registers' values of the
+ * TDCALL instruction when requesting services from the TDX module. This is a
+ * software only structure and not part of the TDX module/VMM ABI
+ */
+struct tdx_module_output {
+	u64 rcx;
+	u64 rdx;
+	u64 r8;
+	u64 r9;
+	u64 r10;
+	u64 r11;
+};
+
+/* Used to communicate with the TDX module */
+u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
+		      struct tdx_module_output *out);
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_X86_SHARED_TDX_H */
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 28d889c9aa16..234197ec17e4 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -20,21 +20,6 @@
 
 #ifndef __ASSEMBLY__
 
-/*
- * Used to gather the output registers values of the TDCALL and SEAMCALL
- * instructions when requesting services from the TDX module.
- *
- * This is a software only structure and not part of the TDX module/VMM ABI.
- */
-struct tdx_module_output {
-	u64 rcx;
-	u64 rdx;
-	u64 r8;
-	u64 r9;
-	u64 r10;
-	u64 r11;
-};
-
 /*
  * Used by the #VE exception handler to gather the #VE exception
  * info from the TDX module. This is a software only structure
@@ -55,10 +40,6 @@ struct ve_info {
 
 void __init tdx_early_init(void);
 
-/* Used to communicate with the TDX module */
-u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
-		      struct tdx_module_output *out);
-
 void tdx_get_ve_info(struct ve_info *ve);
 
 bool tdx_handle_virt_exception(struct pt_regs *regs, struct ve_info *ve);
-- 
2.39.3

From nobody Tue Feb 10 00:22:26 2026
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Dave Hansen, Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Dario Faggioli, Mike Rapoport, David Hildenbrand, Mel Gorman, marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov", Dave Hansen
Subject: [PATCHv12 8/9] x86/tdx: Refactor try_accept_one()
Date: Fri, 19 May 2023 02:14:33 +0300
Message-Id: <20230518231434.26080-9-kirill.shutemov@linux.intel.com>
In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>
References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>

Rework try_accept_one() to return the accepted size instead of modifying
'start' inside the helper. This makes 'start' an in-only argument and
streamlines the code on the caller side.

Signed-off-by: Kirill A. Shutemov
Suggested-by: Borislav Petkov
Reviewed-by: Dave Hansen
---
 arch/x86/coco/tdx/tdx.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index e6f4c2758a68..0d5fe6e24e45 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -713,18 +713,18 @@ static bool tdx_cache_flush_required(void)
 	return true;
 }
 
-static bool try_accept_one(phys_addr_t *start, unsigned long len,
-			   enum pg_level pg_level)
+static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
+				    enum pg_level pg_level)
 {
 	unsigned long accept_size = page_level_size(pg_level);
 	u64 tdcall_rcx;
 	u8 page_size;
 
-	if (!IS_ALIGNED(*start, accept_size))
-		return false;
+	if (!IS_ALIGNED(start, accept_size))
+		return 0;
 
 	if (len < accept_size)
-		return false;
+		return 0;
 
 	/*
 	 * Pass the page physical address to the TDX module to accept the
@@ -743,15 +743,14 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len,
 		page_size = 2;
 		break;
 	default:
-		return false;
+		return 0;
 	}
 
-	tdcall_rcx = *start | page_size;
+	tdcall_rcx = start | page_size;
 	if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
-		return false;
+		return 0;
 
-	*start += accept_size;
-	return true;
+	return accept_size;
 }
 
 /*
@@ -788,21 +787,22 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 	 */
 	while (start < end) {
 		unsigned long len = end - start;
+		unsigned long accept_size;
 
 		/*
 		 * Try larger accepts first. It gives chance to VMM to keep
-		 * 1G/2M SEPT entries where possible and speeds up process by
-		 * cutting number of hypercalls (if successful).
+		 * 1G/2M Secure EPT entries where possible and speeds up
+		 * process by cutting number of hypercalls (if successful).
 		 */
 
-		if (try_accept_one(&start, len, PG_LEVEL_1G))
-			continue;
-
-		if (try_accept_one(&start, len, PG_LEVEL_2M))
-			continue;
-
-		if (!try_accept_one(&start, len, PG_LEVEL_4K))
+		accept_size = try_accept_one(start, len, PG_LEVEL_1G);
+		if (!accept_size)
+			accept_size = try_accept_one(start, len, PG_LEVEL_2M);
+		if (!accept_size)
+			accept_size = try_accept_one(start, len, PG_LEVEL_4K);
+		if (!accept_size)
 			return false;
+		start += accept_size;
 	}
 
 	return true;
-- 
2.39.3

From nobody Tue Feb 10 00:22:26 2026
From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Dave Hansen, Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Dario Faggioli, Mike Rapoport, David Hildenbrand, Mel Gorman, marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com, peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv12 9/9] x86/tdx: Add unaccepted memory support
Date: Fri, 19 May 2023 02:14:34 +0300
Message-Id: <20230518231434.26080-10-kirill.shutemov@linux.intel.com>
In-Reply-To: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>
References: <20230518231434.26080-1-kirill.shutemov@linux.intel.com>

Hook up TDX-specific code to accept memory.

Accepting the memory is the same process as converting memory from
shared to private: the kernel notifies the VMM with the MAP_GPA hypercall
and then accepts pages with the ACCEPT_PAGE module call.

The implementation in the core kernel uses tdx_enc_status_changed(). It
is already used for converting memory to shared and back for I/O
transactions.

The boot stub provides its own implementation of tdx_accept_memory(). It
is similar in structure to tdx_enc_status_changed(), but only cares about
converting memory to private.

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/Kconfig                         |  2 +
 arch/x86/boot/compressed/Makefile        |  2 +-
 arch/x86/boot/compressed/error.c         | 19 +++++
 arch/x86/boot/compressed/error.h         |  1 +
 arch/x86/boot/compressed/mem.c           | 33 +++++++-
 arch/x86/boot/compressed/tdx-shared.c    |  2 +
 arch/x86/boot/compressed/tdx.c           | 37 +++++++++
 arch/x86/coco/tdx/Makefile               |  2 +-
 arch/x86/coco/tdx/tdx-shared.c           | 95 ++++++++++++++++++++++++
 arch/x86/coco/tdx/tdx.c                  | 86 +--------------------
 arch/x86/include/asm/shared/tdx.h        |  2 +
 arch/x86/include/asm/tdx.h               |  2 +
 arch/x86/include/asm/unaccepted_memory.h | 23 ++++++
 13 files changed, 219 insertions(+), 87 deletions(-)
 create mode 100644 arch/x86/boot/compressed/tdx-shared.c
 create mode 100644 arch/x86/coco/tdx/tdx-shared.c
 create mode 100644 arch/x86/include/asm/unaccepted_memory.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 53bab123a8ee..5c72067c06d4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -884,9 +884,11 @@ config INTEL_TDX_GUEST
 	bool "Intel TDX (Trust Domain Extensions) - Guest Support"
 	depends on X86_64 && CPU_SUP_INTEL
 	depends on X86_X2APIC
+	depends on EFI_STUB
 	select ARCH_HAS_CC_PLATFORM
 	select X86_MEM_ENCRYPT
 	select X86_MCE
+	select UNACCEPTED_MEMORY
 	help
 	  Support running as a guest under Intel TDX. Without this support,
 	  the guest kernel can not boot or run under TDX.
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index cc4978123c30..b13a58021086 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -106,7 +106,7 @@ ifdef CONFIG_X86_64
 endif
 
 vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
-vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o
+vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o $(obj)/tdx-shared.o
 vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/mem.o
 
 vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
diff --git a/arch/x86/boot/compressed/error.c b/arch/x86/boot/compressed/error.c
index c881878e56d3..5313c5cb2b80 100644
--- a/arch/x86/boot/compressed/error.c
+++ b/arch/x86/boot/compressed/error.c
@@ -22,3 +22,22 @@ void error(char *m)
 	while (1)
 		asm("hlt");
 }
+
+/* EFI libstub provides vsnprintf() */
+#ifdef CONFIG_EFI_STUB
+void panic(const char *fmt, ...)
+{
+	static char buf[1024];
+	va_list args;
+	int len;
+
+	va_start(args, fmt);
+	len = vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+
+	if (len && buf[len - 1] == '\n')
+		buf[len - 1] = '\0';
+
+	error(buf);
+}
+#endif
diff --git a/arch/x86/boot/compressed/error.h b/arch/x86/boot/compressed/error.h
index 1de5821184f1..86fe33b93715 100644
--- a/arch/x86/boot/compressed/error.h
+++ b/arch/x86/boot/compressed/error.h
@@ -6,5 +6,6 @@
 
 void warn(char *m);
 void error(char *m) __noreturn;
+void panic(const char *fmt, ...) __noreturn __cold;
 
 #endif /* BOOT_COMPRESSED_ERROR_H */
diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c
index 87372b96d613..0108c97399a5 100644
--- a/arch/x86/boot/compressed/mem.c
+++ b/arch/x86/boot/compressed/mem.c
@@ -2,11 +2,42 @@
 
 #include "error.h"
 #include "misc.h"
+#include "tdx.h"
+#include
+
+/*
+ * accept_memory() and process_unaccepted_memory() called from EFI stub which
+ * runs before decompresser and its early_tdx_detect().
+ *
+ * Enumerate TDX directly from the early users.
+ */
+static bool early_is_tdx_guest(void)
+{
+	static bool once;
+	static bool is_tdx;
+
+	if (!IS_ENABLED(CONFIG_INTEL_TDX_GUEST))
+		return false;
+
+	if (!once) {
+		u32 eax, sig[3];
+
+		cpuid_count(TDX_CPUID_LEAF_ID, 0, &eax,
+			    &sig[0], &sig[2], &sig[1]);
+		is_tdx = !memcmp(TDX_IDENT, sig, sizeof(sig));
+		once = true;
+	}
+
+	return is_tdx;
+}
 
 void arch_accept_memory(phys_addr_t start, phys_addr_t end)
 {
 	/* Platform-specific memory-acceptance call goes here */
-	error("Cannot accept memory");
+	if (early_is_tdx_guest())
+		tdx_accept_memory(start, end);
+	else
+		error("Cannot accept memory: unknown platform\n");
 }
 
 void init_unaccepted_memory(void)
diff --git a/arch/x86/boot/compressed/tdx-shared.c b/arch/x86/boot/compressed/tdx-shared.c
new file mode 100644
index 000000000000..5ac43762fe13
--- /dev/null
+++ b/arch/x86/boot/compressed/tdx-shared.c
@@ -0,0 +1,2 @@
+#include "error.h"
+#include "../../coco/tdx/tdx-shared.c"
diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c
index 2d81d3cc72a1..d073764eaa50 100644
--- a/arch/x86/boot/compressed/tdx.c
+++ b/arch/x86/boot/compressed/tdx.c
@@ -9,6 +9,9 @@
 #include
 
 #include
+#include
+
+static u64 cc_mask;
 
 /* Called from __tdx_hypercall() for unrecoverable failure */
 void __tdx_hypercall_failed(void)
@@ -16,6 +19,38 @@ void __tdx_hypercall_failed(void)
 	error("TDVMCALL failed. TDX module bug?");
 }
 
+static u64 get_cc_mask(void)
+{
+	struct tdx_module_output out;
+	unsigned int gpa_width;
+
+	/*
+	 * TDINFO TDX module call is used to get the TD execution environment
+	 * information like GPA width, number of available vcpus, debug mode
+	 * information, etc. More details about the ABI can be found in TDX
+	 * Guest-Host-Communication Interface (GHCI), section 2.4.2 TDCALL
+	 * [TDG.VP.INFO].
+	 *
+	 * The GPA width that comes out of this call is critical. TDX guests
+	 * can not meaningfully run without it.
+	 */
+	if (__tdx_module_call(TDX_GET_INFO, 0, 0, 0, 0, &out))
+		error("TDCALL GET_INFO failed (Buggy TDX module!)\n");
+
+	gpa_width = out.rcx & GENMASK(5, 0);
+
+	/*
+	 * The highest bit of a guest physical address is the "sharing" bit.
+	 * Set it for shared pages and clear it for private pages.
+	 */
+	return BIT_ULL(gpa_width - 1);
+}
+
+u64 cc_mkdec(u64 val)
+{
+	return val & ~cc_mask;
+}
+
 static inline unsigned int tdx_io_in(int size, u16 port)
 {
 	struct tdx_hypercall_args args = {
@@ -70,6 +105,8 @@ void early_tdx_detect(void)
 	if (memcmp(TDX_IDENT, sig, sizeof(sig)))
 		return;
 
+	cc_mask = get_cc_mask();
+
 	/* Use hypercalls instead of I/O instructions */
 	pio_ops.f_inb = tdx_inb;
 	pio_ops.f_outb = tdx_outb;
diff --git a/arch/x86/coco/tdx/Makefile b/arch/x86/coco/tdx/Makefile
index 46c55998557d..2c7dcbf1458b 100644
--- a/arch/x86/coco/tdx/Makefile
+++ b/arch/x86/coco/tdx/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 
-obj-y += tdx.o tdcall.o
+obj-y += tdx.o tdx-shared.o tdcall.o
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
new file mode 100644
index 000000000000..ee74f7bbe806
--- /dev/null
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -0,0 +1,95 @@
+#include
+#include
+
+static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
+				    enum pg_level pg_level)
+{
+	unsigned long accept_size = page_level_size(pg_level);
+	u64 tdcall_rcx;
+	u8 page_size;
+
+	if (!IS_ALIGNED(start, accept_size))
+		return 0;
+
+	if (len < accept_size)
+		return 0;
+
+	/*
+	 * Pass the page physical address to the TDX module to accept the
+	 * pending, private page.
+	 *
+	 * Bits 2:0 of RCX encode page size: 0 - 4K, 1 - 2M, 2 - 1G.
+	 */
+	switch (pg_level) {
+	case PG_LEVEL_4K:
+		page_size = 0;
+		break;
+	case PG_LEVEL_2M:
+		page_size = 1;
+		break;
+	case PG_LEVEL_1G:
+		page_size = 2;
+		break;
+	default:
+		return 0;
+	}
+
+	tdcall_rcx = start | page_size;
+	if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
+		return 0;
+
+	return accept_size;
+}
+
+bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end, bool enc)
+{
+	if (!enc) {
+		/* Set the shared (decrypted) bits: */
+		start |= cc_mkdec(0);
+		end |= cc_mkdec(0);
+	}
+
+	/*
+	 * Notify the VMM about page mapping conversion. More info about ABI
+	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
+	 * section "TDG.VP.VMCALL"
+	 */
+	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
+		return false;
+
+	/* private->shared conversion requires only MapGPA call */
+	if (!enc)
+		return true;
+
+	/*
+	 * For shared->private conversion, accept the page using
+	 * TDX_ACCEPT_PAGE TDX module call.
+	 */
+	while (start < end) {
+		unsigned long len = end - start;
+		unsigned long accept_size;
+
+		/*
+		 * Try larger accepts first. It gives chance to VMM to keep
+		 * 1G/2M Secure EPT entries where possible and speeds up
+		 * process by cutting number of hypercalls (if successful).
+		 */
+
+		accept_size = try_accept_one(start, len, PG_LEVEL_1G);
+		if (!accept_size)
+			accept_size = try_accept_one(start, len, PG_LEVEL_2M);
+		if (!accept_size)
+			accept_size = try_accept_one(start, len, PG_LEVEL_4K);
+		if (!accept_size)
+			return false;
+		start += accept_size;
+	}
+
+	return true;
+}
+
+void tdx_accept_memory(phys_addr_t start, phys_addr_t end)
+{
+	if (!tdx_enc_status_changed_phys(start, end, true))
+		panic("Accepting memory failed: %#llx-%#llx\n", start, end);
+}
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 0d5fe6e24e45..32501277ef84 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -713,46 +713,6 @@ static bool tdx_cache_flush_required(void)
 	return true;
 }
 
-static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
-				    enum pg_level pg_level)
-{
-	unsigned long accept_size = page_level_size(pg_level);
-	u64 tdcall_rcx;
-	u8 page_size;
-
-	if (!IS_ALIGNED(start, accept_size))
-		return 0;
-
-	if (len < accept_size)
-		return 0;
-
-	/*
-	 * Pass the page physical address to the TDX module to accept the
-	 * pending, private page.
-	 *
-	 * Bits 2:0 of RCX encode page size: 0 - 4K, 1 - 2M, 2 - 1G.
-	 */
-	switch (pg_level) {
-	case PG_LEVEL_4K:
-		page_size = 0;
-		break;
-	case PG_LEVEL_2M:
-		page_size = 1;
-		break;
-	case PG_LEVEL_1G:
-		page_size = 2;
-		break;
-	default:
-		return 0;
-	}
-
-	tdcall_rcx = start | page_size;
-	if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
-		return 0;
-
-	return accept_size;
-}
-
 /*
  * Inform the VMM of the guest's intent for this physical page: shared with
  * the VMM or private to the guest. The VMM is expected to change its mapping
@@ -761,51 +721,9 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
 static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 {
 	phys_addr_t start = __pa(vaddr);
-	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
+	phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE);
 
-	if (!enc) {
-		/* Set the shared (decrypted) bits: */
-		start |= cc_mkdec(0);
-		end |= cc_mkdec(0);
-	}
-
-	/*
-	 * Notify the VMM about page mapping conversion. More info about ABI
-	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
-	 * section "TDG.VP.VMCALL"
-	 */
-	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
-		return false;
-
-	/* private->shared conversion requires only MapGPA call */
-	if (!enc)
-		return true;
-
-	/*
-	 * For shared->private conversion, accept the page using
-	 * TDX_ACCEPT_PAGE TDX module call.
-	 */
-	while (start < end) {
-		unsigned long len = end - start;
-		unsigned long accept_size;
-
-		/*
-		 * Try larger accepts first. It gives chance to VMM to keep
-		 * 1G/2M Secure EPT entries where possible and speeds up
-		 * process by cutting number of hypercalls (if successful).
-		 */
-
-		accept_size = try_accept_one(start, len, PG_LEVEL_1G);
-		if (!accept_size)
-			accept_size = try_accept_one(start, len, PG_LEVEL_2M);
-		if (!accept_size)
-			accept_size = try_accept_one(start, len, PG_LEVEL_4K);
-		if (!accept_size)
-			return false;
-		start += accept_size;
-	}
-
-	return true;
+	return tdx_enc_status_changed_phys(start, end, enc);
 }
 
 void __init tdx_early_init(void)
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 1ff0ee822961..95fbe7376694 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -91,5 +91,7 @@ struct tdx_module_output {
 u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
 		      struct tdx_module_output *out);
 
+void tdx_accept_memory(phys_addr_t start, phys_addr_t end);
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_X86_SHARED_TDX_H */
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 234197ec17e4..3a7340ad9a4b 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -50,6 +50,8 @@ bool tdx_early_handle_ve(struct pt_regs *regs);
 
 int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport);
 
+bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end, bool enc);
+
 #else
 
 static inline void tdx_early_init(void) { };
diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h
new file mode 100644
index 000000000000..72b354f992bb
--- /dev/null
+++ b/arch/x86/include/asm/unaccepted_memory.h
@@ -0,0 +1,23 @@
+#ifndef _ASM_X86_UNACCEPTED_MEMORY_H
+#define _ASM_X86_UNACCEPTED_MEMORY_H
+
+#include
+#include
+
+static inline void arch_accept_memory(phys_addr_t start, phys_addr_t end)
+{
+	/* Platform-specific memory-acceptance call goes here */
+	if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
+		tdx_accept_memory(start, end);
+	} else {
+		panic("Cannot accept memory: unknown platform\n");
+	}
+}
+
+static inline struct efi_unaccepted_memory *efi_get_unaccepted_table(void)
+{
+	if (efi.unaccepted == EFI_INVALID_TABLE_ADDR)
+		return NULL;
+	return __va(efi.unaccepted);
+}
+#endif
-- 
2.39.3
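
Illustration (not part of the series): the "try larger accepts first" walk
in tdx_enc_status_changed_phys() can be modelled in user space. The sketch
below is a simplification under stated assumptions, not kernel code:
fake_accept_page(), struct accept_stats and accept_range() are hypothetical
names, the stub only counts calls where the patch issues the
TDX_ACCEPT_PAGE module call through __tdx_module_call(), and the local
enum pg_level is renumbered so that each level equals its RCX page-size
encoding (0, 1, 2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified levels; values double as the RCX bits 2:0 page-size encoding. */
enum pg_level { PG_LEVEL_4K, PG_LEVEL_2M, PG_LEVEL_1G, PG_LEVEL_NUM };

static const uint64_t level_size[PG_LEVEL_NUM] = {
	[PG_LEVEL_4K] = 1ULL << 12,
	[PG_LEVEL_2M] = 1ULL << 21,
	[PG_LEVEL_1G] = 1ULL << 30,
};

/* Hypothetical bookkeeping: number of accept calls issued per level. */
struct accept_stats {
	unsigned long calls[PG_LEVEL_NUM];
};

/*
 * Hypothetical stand-in for the TDX_ACCEPT_PAGE module call. It only
 * records which page size was requested and always reports success (0).
 */
static uint64_t fake_accept_page(uint64_t rcx, struct accept_stats *st)
{
	uint64_t page_size = rcx & 0x7;	/* bits 2:0: 0 - 4K, 1 - 2M, 2 - 1G */

	assert(page_size < PG_LEVEL_NUM);
	st->calls[page_size]++;
	return 0;
}

/* Returns the accepted size, or 0 if this level cannot be used at 'start'. */
static uint64_t try_accept_one(uint64_t start, uint64_t len,
			       enum pg_level level, struct accept_stats *st)
{
	uint64_t accept_size = level_size[level];

	if (start & (accept_size - 1))	/* start not aligned to this level */
		return 0;
	if (len < accept_size)		/* remaining range is too short */
		return 0;
	if (fake_accept_page(start | (uint64_t)level, st))
		return 0;
	return accept_size;
}

/* Walks [start, end), preferring 1G, then 2M, then 4K accepts. */
bool accept_range(uint64_t start, uint64_t end, struct accept_stats *st)
{
	assert(start <= end);

	while (start < end) {
		uint64_t len = end - start;
		uint64_t accept_size;

		accept_size = try_accept_one(start, len, PG_LEVEL_1G, st);
		if (!accept_size)
			accept_size = try_accept_one(start, len, PG_LEVEL_2M, st);
		if (!accept_size)
			accept_size = try_accept_one(start, len, PG_LEVEL_4K, st);
		if (!accept_size)
			return false;	/* e.g. start is not 4K aligned */
		start += accept_size;
	}
	return true;
}
```

Calling accept_range(0x3FFFF000, 0x80201000, &st) on a zeroed struct
accept_stats takes one 4K step up to the 1G boundary, then one 1G step, one
2M step and a final 4K step: four calls in place of 262658 separate 4K
accepts. Returning the accepted size, as patch 8/9 does, is what lets the
caller keep 'start' as a plain value and advance it in one place.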