When hibernating with data center dGPUs, a large amount of VRAM data is
moved to shmem during dev_pm_ops.prepare(). These shmem pages consume so
much system memory that not enough free memory is left to create the
hibernation image, so hibernation fails and aborts.

After dev_pm_ops.prepare(), call shrink_all_memory() to force shmem pages
out to swap and reclaim them, so that there is enough system memory for
the hibernation image and fewer pages need to be copied into the image.

This patch can only flush and free about half of the shmem pages. It would
be better to flush and free more of them, or even all of them, so that
fewer pages are copied to the hibernation image and the overall
hibernation time is reduced.

Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com>
---
 kernel/power/hibernate.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 10a01af63a80..913a298c1d01 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -370,6 +370,17 @@ static int create_image(int platform_mode)
 	return error;
 }
 
+static void shrink_shmem_memory(void)
+{
+	struct sysinfo info;
+	unsigned long pages, freed;
+
+	si_meminfo(&info);
+	pages = info.sharedram;
+	freed = shrink_all_memory(pages);
+	pr_debug("requested to reclaim %lu pages, freed %lu pages\n", pages, freed);
+}
+
 /**
  * hibernation_snapshot - Quiesce devices and create a hibernation image.
  * @platform_mode: If set, use platform driver to prepare for the transition.
@@ -411,6 +422,8 @@ int hibernation_snapshot(int platform_mode)
 		goto Thaw;
 	}
 
+	shrink_shmem_memory();
+
 	suspend_console();
 	pm_restrict_gfp_mask();
 
--
2.43.5
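
To get a feel for how large the reclaim request in this patch would be on a given machine, the shared-memory counter can also be read from userspace. The following is a minimal sketch, not part of the patch, assuming Linux with glibc. The helper names field_to_pages() and report_shmem_vs_swap() are hypothetical. In the kernel, si_meminfo() reports sharedram in pages, while sysinfo(2) reports it in units of mem_unit bytes, so the sketch converts before comparing against free swap.

```c
#include <stdio.h>
#include <sys/sysinfo.h>
#include <unistd.h>

/*
 * Convert a sysinfo(2) memory field, which is reported in units of
 * mem_unit bytes, into a page count. (Hypothetical helper.)
 */
unsigned long long field_to_pages(unsigned long field,
				  unsigned int mem_unit,
				  long page_size)
{
	return (unsigned long long)field * mem_unit /
	       (unsigned long long)page_size;
}

/*
 * Print the current shmem size and free swap, both in pages.
 * Returns 0 on success and -1 if the values could not be read.
 * (Hypothetical helper.)
 */
int report_shmem_vs_swap(FILE *out)
{
	struct sysinfo info;
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned long long shmem_pages, free_swap_pages;

	if (page_size <= 0 || sysinfo(&info) != 0)
		return -1;

	shmem_pages = field_to_pages(info.sharedram, info.mem_unit, page_size);
	free_swap_pages = field_to_pages(info.freeswap, info.mem_unit, page_size);

	fprintf(out, "shmem: %llu pages, free swap: %llu pages\n",
		shmem_pages, free_swap_pages);

	/* Shmem pages are swap-backed, so they need swap space to be reclaimed. */
	if (shmem_pages > free_swap_pages)
		fprintf(out, "free swap is smaller than shmem\n");

	return 0;
}
```

Calling report_shmem_vs_swap(stdout) from a small main() prints the current values. Note that the VRAM eviction described in the commit message happens inside prepare(), so numbers read from userspace beforehand only show the baseline, not the peak the patch is dealing with.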
On Mon, Jun 30, 2025 at 12:41 PM Samuel Zhang <guoqing.zhang@amd.com> wrote:
>
> When hibernate with data center dGPUs, huge number of VRAM data will be
> moved to shmem during dev_pm_ops.prepare(). These shmem pages take a lot
> of system memory so that there's no enough free memory for creating the
> hibernation image. This will cause hibernation fail and abort.
>
> After dev_pm_ops.prepare(), call shrink_all_memory() to force move shmem
> pages to swap disk and reclaim the pages, so that there's enough system
> memory for hibernation image and less pages needed to copy to the image.
>
> This patch can only flush and free about half shmem pages. It will be
> better to flush and free more pages, even all of shmem pages, so that
> there're less pages to be copied to the hibernation image and the overall
> hibernation time can be reduced.
>
> Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com>
> ---
>  kernel/power/hibernate.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index 10a01af63a80..913a298c1d01 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -370,6 +370,17 @@ static int create_image(int platform_mode)
>  	return error;
>  }
>
> +static void shrink_shmem_memory(void)
> +{
> +	struct sysinfo info;
> +	unsigned long pages, freed;
> +

Please add a comment explaining what is going on here.

> +	si_meminfo(&info);
> +	pages = info.sharedram;
> +	freed = shrink_all_memory(pages);
> +	pr_debug("requested to reclaim %lu pages, freed %lu pages\n", pages, freed);

This message will be hard to decode without any context.

> +}
> +
>  /**
>   * hibernation_snapshot - Quiesce devices and create a hibernation image.
>   * @platform_mode: If set, use platform driver to prepare for the transition.
> @@ -411,6 +422,8 @@ int hibernation_snapshot(int platform_mode)
>  		goto Thaw;
>  	}
>

Please add a comment explaining why the below is done.

> +	shrink_shmem_memory();
> +
>  suspend_console();
>  pm_restrict_gfp_mask();
>
> --
> 2.43.5
>
>
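
One possible way to address the review comments is sketched below. This is an illustration only, not the author's follow-up. To keep it compilable outside the kernel, struct sysinfo, si_meminfo(), shrink_all_memory() and pr_debug() are replaced by hypothetical userspace stand-ins, and the stub_* variables and the "frees half" behaviour are assumptions modelled on the commit message. Only shrink_shmem_memory() itself mirrors the patch, with a comment and a more descriptive debug message added.

```c
#include <stdio.h>

/*
 * Hypothetical userspace stand-ins (assumptions, not kernel code), so the
 * sketch can be compiled and run without the real kernel functions.
 */
struct sysinfo {
	unsigned long sharedram;	/* only the field this sketch uses */
};

static unsigned long stub_shmem_pages = 1000;	/* pretend shmem size, in pages */
static unsigned long stub_last_request;		/* last reclaim target seen */
static unsigned long stub_last_freed;		/* pages the stub claimed to free */

static void si_meminfo(struct sysinfo *val)
{
	val->sharedram = stub_shmem_pages;
}

/* Pretend that about half of the request is freed, as the commit message reports. */
static unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
{
	stub_last_request = nr_to_reclaim;
	stub_last_freed = nr_to_reclaim / 2;
	return stub_last_freed;
}

#define pr_debug(...) fprintf(stderr, __VA_ARGS__)

/*
 * Device prepare() callbacks (for example, those of data center dGPU
 * drivers) may have moved large amounts of VRAM data into shmem. Ask
 * reclaim to push as many of those pages as possible out to swap, so that
 * enough free memory is left for the hibernation image and fewer pages
 * have to be copied into it.
 */
static void shrink_shmem_memory(void)
{
	struct sysinfo info;
	unsigned long nr_shmem_pages, nr_freed_pages;

	si_meminfo(&info);
	nr_shmem_pages = info.sharedram;
	nr_freed_pages = shrink_all_memory(nr_shmem_pages);
	pr_debug("shmem reclaim after prepare(): requested %lu pages, freed %lu pages\n",
		 nr_shmem_pages, nr_freed_pages);
}
```

At the call site in hibernation_snapshot(), a short comment such as "reclaim shmem pages populated by device prepare() callbacks before creating the image" would cover the third review remark.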