From: Ashish Kalra <ashish.kalra@amd.com>
Use configfs as an interface to re-enable RMP optimizations at runtime
When SNP guests are launched, RMPUPDATE disables the corresponding
RMPOPT optimizations. Therefore, an interface is required to manually
re-enable RMP optimizations, as no mechanism currently exists to do so
during SNP guest cleanup.
Also select CONFIG_CONFIGFS_FS when host SEV or SNP support is enabled.
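With this patch applied, the optimization pass can be re-triggered from
userspace through the new configfs attributes. For example (assuming
configfs is mounted at the conventional /sys/kernel/config):

```shell
# Mount configfs if it is not already mounted
mount -t configfs none /sys/kernel/config

# Show usage hints and whether an optimization pass is running
cat /sys/kernel/config/rmpopt/description
cat /sys/kernel/config/rmpopt/action

# Kick off the RMP re-optimization pass (only "1" is accepted)
echo 1 > /sys/kernel/config/rmpopt/action
```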
Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
arch/x86/kvm/Kconfig | 1 +
arch/x86/virt/svm/sev.c | 79 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 80 insertions(+)
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index d916bd766c94..8fb21893ec8c 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -164,6 +164,7 @@ config KVM_AMD_SEV
select HAVE_KVM_ARCH_GMEM_PREPARE
select HAVE_KVM_ARCH_GMEM_INVALIDATE
select HAVE_KVM_ARCH_GMEM_POPULATE
+ select CONFIGFS_FS
help
Provides support for launching encrypted VMs which use Secure
Encrypted Virtualization (SEV), Secure Encrypted Virtualization with
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 713afcc2fab3..0f71a045e4aa 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -20,6 +20,7 @@
#include <linux/amd-iommu.h>
#include <linux/nospec.h>
#include <linux/kthread.h>
+#include <linux/configfs.h>
#include <asm/sev.h>
#include <asm/processor.h>
@@ -146,6 +147,10 @@ struct rmpopt_socket_config {
int current_node_idx;
};
+#define RMPOPT_CONFIGFS_NAME "rmpopt"
+
+static atomic_t rmpopt_in_progress = ATOMIC_INIT(0);
+
#undef pr_fmt
#define pr_fmt(fmt) "SEV-SNP: " fmt
@@ -581,6 +586,9 @@ static int rmpopt_kthread(void *__unused)
cond_resched();
}
+ /* Clear in_progress flag before going to sleep */
+ atomic_set(&rmpopt_in_progress, 0);
+
set_current_state(TASK_INTERRUPTIBLE);
schedule();
}
@@ -595,6 +603,75 @@ static void rmpopt_all_physmem(void)
wake_up_process(rmpopt_task);
}
+static ssize_t rmpopt_action_show(struct config_item *item, char *page)
+{
+ return sprintf(page, "RMP optimization in progress: %s\n",
+ atomic_read(&rmpopt_in_progress) == 1 ? "Yes" : "No");
+}
+
+static ssize_t rmpopt_action_store(struct config_item *item,
+ const char *page, size_t count)
+{
+ int in_progress_flag, ret;
+ unsigned int action;
+
+ ret = kstrtouint(page, 10, &action);
+ if (ret)
+ return ret;
+
+ if (action == 1) {
+ /* perform RMP re-optimizations */
+ in_progress_flag = atomic_cmpxchg(&rmpopt_in_progress, 0, 1);
+ if (!in_progress_flag)
+ rmpopt_all_physmem();
+ } else {
+ return -EINVAL;
+ }
+
+ return count;
+}
+
+static ssize_t rmpopt_description_show(struct config_item *item, char *page)
+{
+ return sprintf(page, "[RMPOPT]\n\necho 1 > action to perform RMP optimization.\n");
+}
+
+CONFIGFS_ATTR(rmpopt_, action);
+CONFIGFS_ATTR_RO(rmpopt_, description);
+
+static struct configfs_attribute *rmpopt_attrs[] = {
+ &rmpopt_attr_action,
+ &rmpopt_attr_description,
+ NULL,
+};
+
+static const struct config_item_type rmpopt_config_type = {
+ .ct_attrs = rmpopt_attrs,
+ .ct_owner = THIS_MODULE,
+};
+
+static struct configfs_subsystem rmpopt_configfs = {
+ .su_group = {
+ .cg_item = {
+ .ci_namebuf = RMPOPT_CONFIGFS_NAME,
+ .ci_type = &rmpopt_config_type,
+ },
+ },
+ .su_mutex = __MUTEX_INITIALIZER(rmpopt_configfs.su_mutex),
+};
+
+static int rmpopt_configfs_setup(void)
+{
+ int ret;
+
+ config_group_init(&rmpopt_configfs.su_group);
+ ret = configfs_register_subsystem(&rmpopt_configfs);
+ if (ret)
+ pr_err("Error %d while registering subsystem %s\n", ret, RMPOPT_CONFIGFS_NAME);
+
+ return ret;
+}
+
static void __configure_rmpopt(void *val)
{
u64 rmpopt_base = ((u64)val & PUD_MASK) | MSR_AMD64_RMPOPT_ENABLE;
@@ -770,6 +847,8 @@ static __init void configure_and_enable_rmpopt(void)
*/
rmpopt_all_physmem();
+ rmpopt_configfs_setup();
+
free_cpumask:
free_cpumask_var(primary_threads_cpulist);
}
--
2.43.0
On 2/17/26 12:11, Ashish Kalra wrote:
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> Use configfs as an interface to re-enable RMP optimizations at runtime
>
> When SNP guests are launched, RMPUPDATE disables the corresponding
> RMPOPT optimizations. Therefore, an interface is required to manually
> re-enable RMP optimizations, as no mechanism currently exists to do so
> during SNP guest cleanup.

Is this like a proof-of-concept to poke the hardware and show it works?
Or, is this intended to be the way that folks actually interact with
SEV-SNP optimization in real production scenarios?

Shouldn't freeing SEV-SNP memory back to the system do this
automatically? Worst case, keep a 1-bit-per-GB bitmap of memory that's
been freed and schedule_work() to run in 1 or 10 or 100 seconds. That
should batch things up nicely enough. No?

I can't fathom that users don't want this to be done automatically for
them.

Is the optimization scan really expensive or something? 1GB of memory
should have a small number of megabytes of metadata to scan.
Hello Dave,

On 2/17/2026 4:19 PM, Dave Hansen wrote:
> On 2/17/26 12:11, Ashish Kalra wrote:
>> From: Ashish Kalra <ashish.kalra@amd.com>
>>
>> Use configfs as an interface to re-enable RMP optimizations at runtime
>>
>> When SNP guests are launched, RMPUPDATE disables the corresponding
>> RMPOPT optimizations. Therefore, an interface is required to manually
>> re-enable RMP optimizations, as no mechanism currently exists to do so
>> during SNP guest cleanup.
>
> Is this like a proof-of-concept to poke the hardware and show it works?
> Or, is this intended to be the way that folks actually interact with
> SEV-SNP optimization in real production scenarios?
>
> Shouldn't freeing SEV-SNP memory back to the system do this
> automatically? Worst case, keep a 1-bit-per-GB bitmap of memory that's
> been freed and schedule_work() to run in 1 or 10 or 100 seconds. That
> should batch things up nicely enough. No?

Actually, the RMPOPT implementation is going to be a multi-phased
development.

In the first phase (which is this patch series) we enable RMPOPT
globally and let RMPUPDATE(s) slowly switch it off over time as SNP
guests spin up. Then, in phase 2, once 1GB hugetlb is in place, we
enable re-issuing of RMPOPT during 1GB page cleanup.

So automatic re-issuing of RMPOPT will be done when SNP guests are shut
down, as part of SNP guest cleanup, once 1GB hugetlb support (for
guest_memfd) has been merged.

Currently, i.e., as part of this patch series, there is no mechanism to
re-issue RMPOPT automatically as part of SNP guest cleanup, so this
support exists for doing it manually at runtime via configfs.

I will describe this multi-phased RMPOPT implementation plan in the
cover letter for the next revision of this patch series.

Thanks,
Ashish

>
> I can't fathom that users don't want this to be done automatically for
> them.
>
> Is the optimization scan really expensive or something? 1GB of memory
> should have a small number of megabytes of metadata to scan.
On 2/17/2026 9:34 PM, Kalra, Ashish wrote:
> Hello Dave,
>
> On 2/17/2026 4:19 PM, Dave Hansen wrote:
>> On 2/17/26 12:11, Ashish Kalra wrote:
>>> From: Ashish Kalra <ashish.kalra@amd.com>
>>>
>>> Use configfs as an interface to re-enable RMP optimizations at runtime
>>>
>>> When SNP guests are launched, RMPUPDATE disables the corresponding
>>> RMPOPT optimizations. Therefore, an interface is required to manually
>>> re-enable RMP optimizations, as no mechanism currently exists to do so
>>> during SNP guest cleanup.
>>
>> Is this like a proof-of-concept to poke the hardware and show it works?
>> Or, is this intended to be the way that folks actually interact with
>> SEV-SNP optimization in real production scenarios?
>>
>> Shouldn't freeing SEV-SNP memory back to the system do this
>> automatically? Worst case, keep a 1-bit-per-GB bitmap of memory that's
>> been freed and schedule_work() to run in 1 or 10 or 100 seconds. That
>> should batch things up nicely enough. No?

And there is a cost associated with re-enabling the optimizations for
all system RAM (even though it runs as a background kernel thread
executing RMPOPT on different 1GB regions in parallel, with inline
cond_resched() calls), so we don't want to run this periodically.

In the case of running SNP guests, such a scheduled/periodic run would
conflict with the RMPUPDATE(s) being executed to assign guest pages and
mark them as private. The hardware does handle this race condition,
where one CPU is executing RMPOPT on a region while another is changing
one of the pages in that region to assigned via RMPUPDATE: it ensures
that after the RMPUPDATE completes, the CPU that did the RMPOPT will
see the region as un-optimized.

Once 1GB hugetlb support (for guest_memfd) has been merged, however, it
will be straightforward to plumb this into the 1GB hugetlb cleanup
path.

Thanks,
Ashish

> Actually, the RMPOPT implementation is going to be a multi-phased
> development.
>
> In the first phase (which is this patch series) we enable RMPOPT
> globally and let RMPUPDATE(s) slowly switch it off over time as SNP
> guests spin up. Then, in phase 2, once 1GB hugetlb is in place, we
> enable re-issuing of RMPOPT during 1GB page cleanup.
>
> So automatic re-issuing of RMPOPT will be done when SNP guests are
> shut down, as part of SNP guest cleanup, once 1GB hugetlb support (for
> guest_memfd) has been merged.
>
> Currently, i.e., as part of this patch series, there is no mechanism
> to re-issue RMPOPT automatically as part of SNP guest cleanup, so this
> support exists for doing it manually at runtime via configfs.
>
> I will describe this multi-phased RMPOPT implementation plan in the
> cover letter for the next revision of this patch series.
>
>> I can't fathom that users don't want this to be done automatically for
>> them.
>>
>> Is the optimization scan really expensive or something? 1GB of memory
>> should have a small number of megabytes of metadata to scan.
On 2/17/26 19:34, Kalra, Ashish wrote:
...
> Currently, i.e., as part of this patch series, there is no mechanism
> to re-issue RMPOPT automatically as part of SNP guest cleanup, so this
> support exists for doing it manually at runtime via configfs.

I think you need a mechanism that re-enables RMP optimizations
automatically for this feature to go upstream. It's just dead code
otherwise, and we don't merge dead code. A configfs hack doesn't really
count.