RE: [PATCH v7 0/4] RISC-V Hibernation Support

JeeHeng Sia posted 4 patches 1 year ago
There is a newer version of this series
RE: [PATCH v7 0/4] RISC-V Hibernation Support
Posted by JeeHeng Sia 1 year ago

> -----Original Message-----
> From: Andrew Jones <ajones@ventanamicro.com>
> Sent: Monday, March 27, 2023 9:14 PM
> To: JeeHeng Sia <jeeheng.sia@starfivetech.com>
> Cc: paul.walmsley@sifive.com; palmer@dabbelt.com; aou@eecs.berkeley.edu; linux-riscv@lists.infradead.org;
> linux-kernel@vger.kernel.org; Leyfoon Tan <leyfoon.tan@starfivetech.com>; Mason Huo <mason.huo@starfivetech.com>
> Subject: Re: [PATCH v7 0/4] RISC-V Hibernation Support
> 
> On Thu, Mar 23, 2023 at 12:56:00PM +0800, Sia Jee Heng wrote:
> > This series adds RISC-V hibernation (suspend-to-disk) support.
> > Low-level arch functions were created to support hibernation.
> > swsusp_arch_suspend() relies on code from __cpu_suspend_enter() to write
> > the CPU state onto the stack, then calls swsusp_save() to save the memory
> > image.
> >
> > An arch-specific hibernation header is implemented and is used by the
> > arch_hibernation_header_restore() and arch_hibernation_header_save()
> > functions. The arch-specific hibernation header consists of satp, hartid,
> > and the cpu_resume address. The kernel build version also needs to be
> > saved into the hibernation image header to make sure that only the same
> > kernel is restored on resume.
> >
> > swsusp_arch_resume() creates a temporary page table that covers only
> > the linear map. It copies the restore code to a 'safe' page, then starts to
> > restore the memory image. Once completed, it restores the original
> > kernel's page table. It then calls into __hibernate_cpu_resume()
> > to restore the CPU context. Finally, it follows the normal hibernation
> > path back to the hibernation core.
> >
> > To enable hibernation/suspend to disk on RISC-V, the config options below
> > need to be enabled:
> > - CONFIG_ARCH_HIBERNATION_HEADER
> > - CONFIG_ARCH_HIBERNATION_POSSIBLE
> >
> > At a high level, this series includes the following changes:
> > 1) Make suspend_save_csrs() and suspend_restore_csrs() public
> >    functions, as they are common to
> >    suspend/hibernation. (patch 1)
> > 2) Refactor the common code in the __cpu_resume_enter() function and
> >    __hibernate_cpu_resume() function. The common code is used by
> >    hibernation and suspend. (patch 2)
> > 3) Enhance kernel_page_present() function to support huge pages. (patch 3)
> > 4) Add arch/riscv low level functions to support
> >    hibernation/suspend to disk. (patch 4)
> >
> > The above patches are based on kernel v6.3-rc3 and were tested on the
> > StarFive VF2 SBC board and QEMU.
> > ACPI platform mode is not supported in this series.
> >
> 
> I tested this on QEMU, but, FYI, I had to use a raw backing file for
> the swap disk, rather than a qcow2 backing file, otherwise it didn't
> resume. It's probably worth looking into why that is.
Thanks for your time. The raw file format is closer to an actual physical disk. I can look into the qcow2 format for QEMU in the near future, but it shouldn't block this patch series from being upstreamed.
> 
> Thanks,
> drew
RE: [PATCH v7 0/4] RISC-V Hibernation Support
Posted by JeeHeng Sia 12 months ago

> -----Original Message-----
> From: JeeHeng Sia
> Sent: Tuesday, March 28, 2023 2:37 PM
> To: 'Andrew Jones' <ajones@ventanamicro.com>
> Cc: paul.walmsley@sifive.com; palmer@dabbelt.com; aou@eecs.berkeley.edu; linux-riscv@lists.infradead.org;
> linux-kernel@vger.kernel.org; Leyfoon Tan <leyfoon.tan@starfivetech.com>; Mason Huo <mason.huo@starfivetech.com>
> Subject: RE: [PATCH v7 0/4] RISC-V Hibernation Support
> 
> 
> 
> > -----Original Message-----
> > From: Andrew Jones <ajones@ventanamicro.com>
> > Sent: Monday, March 27, 2023 9:14 PM
> > To: JeeHeng Sia <jeeheng.sia@starfivetech.com>
> > Cc: paul.walmsley@sifive.com; palmer@dabbelt.com; aou@eecs.berkeley.edu; linux-riscv@lists.infradead.org;
> > linux-kernel@vger.kernel.org; Leyfoon Tan <leyfoon.tan@starfivetech.com>; Mason Huo <mason.huo@starfivetech.com>
> > Subject: Re: [PATCH v7 0/4] RISC-V Hibernation Support
> >
> >
> > I tested this on QEMU, but, FYI, I had to use a raw backing file for
> > the swap disk, rather than a qcow2 backing file, otherwise it didn't
> > resume. It's probably worth looking into why that is.
> Thanks for your time. The raw file format is closer to the actual physical disk. Although I can look into the qcow2 format for QEMU in
> the near future, it shouldn't be a blocking factor for this patch series to be upstreamed.

FYI, I managed to reproduce the hibernation issue that Andrew reported. The resume failed while retrieving pages from the disk, specifically in the swap_read_page() function in kernel/power/swap.c and the snapshot_write_next() function in kernel/power/snapshot.c. I found that adding a delay to these functions (by adding a printk) allowed the page retrieval to progress further. From this exercise, I have begun to suspect a coherency-handling issue between the hibernation core and the QEMU qcow2 driver. I will add it to my action-item (AR) list and will investigate the issue in the near future.
> >
> > Thanks,
> > drew
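
For readers following the thread, the arch-specific hibernation header described in the cover letter (satp, hartid, the cpu_resume address, plus the kernel build version used to reject a mismatched kernel) can be pictured with a small userspace model. This is only a sketch under stated assumptions: the struct name, field names, buffer length, and both helper functions are hypothetical and are not taken from the patches, which may lay the header out differently.

```c
#include <stdint.h>
#include <string.h>

/* Userspace model only. All names below are illustrative assumptions,
 * not the identifiers used in the patch series. */
#define MODEL_VERSION_LEN 65

struct hib_header_model {
    char     kernel_version[MODEL_VERSION_LEN]; /* build string of the hibernated kernel */
    uint64_t hartid;                            /* hart that wrote the image */
    uint64_t saved_satp;                        /* satp value of the hibernated kernel */
    uint64_t cpu_resume_addr;                   /* where to jump to restore the CPU context */
};

/* Fill the header at hibernate time. */
static void hib_header_save(struct hib_header_model *hdr, const char *version,
                            uint64_t hartid, uint64_t satp, uint64_t resume_addr)
{
    /* Zeroing first guarantees the version string stays NUL-terminated. */
    memset(hdr, 0, sizeof(*hdr));
    strncpy(hdr->kernel_version, version, MODEL_VERSION_LEN - 1);
    hdr->hartid = hartid;
    hdr->saved_satp = satp;
    hdr->cpu_resume_addr = resume_addr;
}

/* Return 0 if the image was written by the same kernel build, -1 otherwise. */
static int hib_header_restore(const struct hib_header_model *hdr,
                              const char *running_version)
{
    if (strncmp(hdr->kernel_version, running_version, MODEL_VERSION_LEN - 1) != 0)
        return -1;
    return 0;
}
```

In this model, a resume attempt by a kernel with a different build string is refused before any of the saved values (satp, hartid, resume address) are used, which mirrors the intent stated in the cover letter.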