From: Ashish Kalra <ashish.kalra@amd.com>
Use configfs as an interface to re-enable RMP optimizations at runtime
When SNP guests are launched, RMPUPDATE disables the corresponding
RMPOPT optimizations. Therefore, an interface is required to manually
re-enable RMP optimizations, as no mechanism currently exists to do so
during SNP guest cleanup.
Also select CONFIG_CONFIGFS_FS when host SEV or SNP support is enabled.
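With this patch applied, the optimization pass can be re-triggered from
userspace through the new configfs attributes. For example (assuming
configfs is mounted at the conventional /sys/kernel/config):

```shell
# Mount configfs if it is not already mounted
mount -t configfs none /sys/kernel/config

# Show usage hints and whether an optimization pass is running
cat /sys/kernel/config/rmpopt/description
cat /sys/kernel/config/rmpopt/action

# Kick off the RMP re-optimization pass (only "1" is accepted)
echo 1 > /sys/kernel/config/rmpopt/action
```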
Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/kvm/Kconfig | 1 +
arch/x86/virt/svm/sev.c | 79 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 80 insertions(+)
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index d916bd766c94..8fb21893ec8c 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -164,6 +164,7 @@ config KVM_AMD_SEV
select HAVE_KVM_ARCH_GMEM_PREPARE
select HAVE_KVM_ARCH_GMEM_INVALIDATE
select HAVE_KVM_ARCH_GMEM_POPULATE
+ select CONFIGFS_FS
help
Provides support for launching encrypted VMs which use Secure
Encrypted Virtualization (SEV), Secure Encrypted Virtualization with
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 713afcc2fab3..0f71a045e4aa 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -20,6 +20,7 @@
#include <linux/amd-iommu.h>
#include <linux/nospec.h>
#include <linux/kthread.h>
+#include <linux/configfs.h>
#include <asm/sev.h>
#include <asm/processor.h>
@@ -146,6 +147,10 @@ struct rmpopt_socket_config {
int current_node_idx;
};
+#define RMPOPT_CONFIGFS_NAME "rmpopt"
+
+static atomic_t rmpopt_in_progress = ATOMIC_INIT(0);
+
#undef pr_fmt
#define pr_fmt(fmt) "SEV-SNP: " fmt
@@ -581,6 +586,9 @@ static int rmpopt_kthread(void *__unused)
cond_resched();
}
+ /* Clear in_progress flag before going to sleep */
+ atomic_set(&rmpopt_in_progress, 0);
+
set_current_state(TASK_INTERRUPTIBLE);
schedule();
}
@@ -595,6 +603,75 @@ static void rmpopt_all_physmem(void)
wake_up_process(rmpopt_task);
}
+static ssize_t rmpopt_action_show(struct config_item *item, char *page)
+{
+ return sprintf(page, "RMP optimization in progress: %s\n",
+ atomic_read(&rmpopt_in_progress) == 1 ? "Yes" : "No");
+}
+
+static ssize_t rmpopt_action_store(struct config_item *item,
+ const char *page, size_t count)
+{
+ int in_progress_flag, ret;
+ unsigned int action;
+
+ ret = kstrtouint(page, 10, &action);
+ if (ret)
+ return ret;
+
+ if (action == 1) {
+ /* perform RMP re-optimizations */
+ in_progress_flag = atomic_cmpxchg(&rmpopt_in_progress, 0, 1);
+ if (!in_progress_flag)
+ rmpopt_all_physmem();
+ } else {
+ return -EINVAL;
+ }
+
+ return count;
+}
+
+static ssize_t rmpopt_description_show(struct config_item *item, char *page)
+{
+ return sprintf(page, "[RMPOPT]\n\necho 1 > action to perform RMP optimization.\n");
+}
+
+CONFIGFS_ATTR(rmpopt_, action);
+CONFIGFS_ATTR_RO(rmpopt_, description);
+
+static struct configfs_attribute *rmpopt_attrs[] = {
+ &rmpopt_attr_action,
+ &rmpopt_attr_description,
+ NULL,
+};
+
+static const struct config_item_type rmpopt_config_type = {
+ .ct_attrs = rmpopt_attrs,
+ .ct_owner = THIS_MODULE,
+};
+
+static struct configfs_subsystem rmpopt_configfs = {
+ .su_group = {
+ .cg_item = {
+ .ci_namebuf = RMPOPT_CONFIGFS_NAME,
+ .ci_type = &rmpopt_config_type,
+ },
+ },
+ .su_mutex = __MUTEX_INITIALIZER(rmpopt_configfs.su_mutex),
+};
+
+static int rmpopt_configfs_setup(void)
+{
+ int ret;
+
+ config_group_init(&rmpopt_configfs.su_group);
+ ret = configfs_register_subsystem(&rmpopt_configfs);
+ if (ret)
+ pr_err("Error %d while registering subsystem %s\n", ret, RMPOPT_CONFIGFS_NAME);
+
+ return ret;
+}
+
static void __configure_rmpopt(void *val)
{
u64 rmpopt_base = ((u64)val & PUD_MASK) | MSR_AMD64_RMPOPT_ENABLE;
@@ -770,6 +847,8 @@ static __init void configure_and_enable_rmpopt(void)
*/
rmpopt_all_physmem();
+ rmpopt_configfs_setup();
+
free_cpumask:
free_cpumask_var(primary_threads_cpulist);
}
--
2.43.0
On 2/17/26 12:11, Ashish Kalra wrote:
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> Use configfs as an interface to re-enable RMP optimizations at runtime
>
> When SNP guests are launched, RMPUPDATE disables the corresponding
> RMPOPT optimizations. Therefore, an interface is required to manually
> re-enable RMP optimizations, as no mechanism currently exists to do so
> during SNP guest cleanup.

Is this like a proof-of-concept to poke the hardware and show it works?
Or, is this intended to be the way that folks actually interact with
SEV-SNP optimization in real production scenarios?

Shouldn't freeing SEV-SNP memory back to the system do this
automatically? Worst case, keep a 1-bit-per-GB bitmap of memory that's
been freed and schedule_work() to run in 1 or 10 or 100 seconds. That
should batch things up nicely enough. No?

I can't fathom that users don't want this to be done automatically for
them.

Is the optimization scan really expensive or something? 1GB of memory
should have a small number of megabytes of metadata to scan.
Hello Dave,

On 2/17/2026 4:19 PM, Dave Hansen wrote:
> On 2/17/26 12:11, Ashish Kalra wrote:
>> From: Ashish Kalra <ashish.kalra@amd.com>
>>
>> Use configfs as an interface to re-enable RMP optimizations at runtime
>>
>> When SNP guests are launched, RMPUPDATE disables the corresponding
>> RMPOPT optimizations. Therefore, an interface is required to manually
>> re-enable RMP optimizations, as no mechanism currently exists to do so
>> during SNP guest cleanup.
>
> Is this like a proof-of-concept to poke the hardware and show it works?
> Or, is this intended to be the way that folks actually interact with
> SEV-SNP optimization in real production scenarios?
>
> Shouldn't freeing SEV-SNP memory back to the system do this
> automatically? Worst case, keep a 1-bit-per-GB bitmap of memory that's
> been freed and schedule_work() to run in 1 or 10 or 100 seconds. That
> should batch things up nicely enough. No?

Actually, the RMPOPT implementation is going to be a multi-phased
development.

In the first phase (which is this patch series) we enable RMPOPT
globally and let RMPUPDATE(s) slowly switch it off over time as SNP
guests spin up. Then, in phase 2, once 1GB hugetlb is in place, we
enable re-issuing of RMPOPT during 1GB page cleanup.

So automatic re-issuing of RMPOPT will be done when SNP guests are shut
down, as part of SNP guest cleanup, once 1GB hugetlb support (for
guest_memfd) has been merged.

Currently, i.e., as part of this patch series, there is no mechanism to
re-issue RMPOPT automatically as part of SNP guest cleanup, so this
support exists for doing it manually at runtime via configfs.

I will describe this multi-phased RMPOPT implementation plan in the
cover letter for the next revision of this patch series.

Thanks,
Ashish

>
> I can't fathom that users don't want this to be done automatically for
> them.
>
> Is the optimization scan really expensive or something? 1GB of memory
> should have a small number of megabytes of metadata to scan.
On 2/17/2026 9:34 PM, Kalra, Ashish wrote:
> Hello Dave,
>
> On 2/17/2026 4:19 PM, Dave Hansen wrote:
>> On 2/17/26 12:11, Ashish Kalra wrote:
>>> From: Ashish Kalra <ashish.kalra@amd.com>
>>>
>>> Use configfs as an interface to re-enable RMP optimizations at runtime
>>>
>>> When SNP guests are launched, RMPUPDATE disables the corresponding
>>> RMPOPT optimizations. Therefore, an interface is required to manually
>>> re-enable RMP optimizations, as no mechanism currently exists to do so
>>> during SNP guest cleanup.
>>
>> Is this like a proof-of-concept to poke the hardware and show it works?
>> Or, is this intended to be the way that folks actually interact with
>> SEV-SNP optimization in real production scenarios?
>>
>> Shouldn't freeing SEV-SNP memory back to the system do this
>> automatically? Worst case, keep a 1-bit-per-GB bitmap of memory that's
>> been freed and schedule_work() to run in 1 or 10 or 100 seconds. That
>> should batch things up nicely enough. No?

And there is a cost associated with re-enabling the optimizations for
all system RAM (even though it runs as a background kernel thread
executing RMPOPT on different 1GB regions in parallel, with inline
cond_resched() calls), so we don't want to run this periodically.

In the case of running SNP guests, such a scheduled/periodic run would
conflict with the RMPUPDATE(s) being executed to assign guest pages and
mark them as private. The hardware does handle this race condition,
where one CPU is executing RMPOPT on a region while another is changing
one of the pages in that region to assigned via RMPUPDATE: it ensures
that after the RMPUPDATE completes, the CPU that did the RMPOPT will
see the region as un-optimized.

Once 1GB hugetlb support (for guest_memfd) has been merged, however, it
will be straightforward to plumb this into the 1GB hugetlb cleanup
path.

Thanks,
Ashish

> Actually, the RMPOPT implementation is going to be a multi-phased
> development.
>
> In the first phase (which is this patch series) we enable RMPOPT
> globally and let RMPUPDATE(s) slowly switch it off over time as SNP
> guests spin up. Then, in phase 2, once 1GB hugetlb is in place, we
> enable re-issuing of RMPOPT during 1GB page cleanup.
>
> So automatic re-issuing of RMPOPT will be done when SNP guests are
> shut down, as part of SNP guest cleanup, once 1GB hugetlb support (for
> guest_memfd) has been merged.
>
> Currently, i.e., as part of this patch series, there is no mechanism
> to re-issue RMPOPT automatically as part of SNP guest cleanup, so this
> support exists for doing it manually at runtime via configfs.
>
> I will describe this multi-phased RMPOPT implementation plan in the
> cover letter for the next revision of this patch series.
>
>> I can't fathom that users don't want this to be done automatically for
>> them.
>>
>> Is the optimization scan really expensive or something? 1GB of memory
>> should have a small number of megabytes of metadata to scan.
On 2/17/26 19:34, Kalra, Ashish wrote:
...
> Currently, i.e., as part of this patch series, there is no mechanism
> to re-issue RMPOPT automatically as part of SNP guest cleanup, so this
> support exists for doing it manually at runtime via configfs.

I think you need a mechanism that re-enables RMP optimizations
automatically for this feature to go upstream. It's just dead code
otherwise, and we don't merge dead code. A configfs hack doesn't really
count.