[PATCH 2/6] hw/ppc: Trigger Fadump boot if fadump is registered

Aditya Gupta posted 6 patches 11 months, 3 weeks ago
Maintainers: Nicholas Piggin <npiggin@gmail.com>, Daniel Henrique Barboza <danielhb413@gmail.com>, Harsh Prateek Bora <harshpb@linux.ibm.com>
There is a newer version of this series
[PATCH 2/6] hw/ppc: Trigger Fadump boot if fadump is registered
Posted by Aditya Gupta 11 months, 3 weeks ago
According to PAPR:

    R1–7.3.30–3. When the platform receives an ibm,os-term RTAS call, or
    on a system reset without an ibm,nmi-interlock RTAS call, if the
    platform has a dump structure registered through the
    ibm,configure-kernel-dump call, the platform must process each
    registered kernel dump section as required and, when available,
    present the dump structure information to the operating system
    through the “ibm,kernel-dump” property, updated with status for each
    dump section, until the dump has been invalidated through the
    ibm,configure-kernel-dump RTAS call.

If Fadump has been registered, trigger an Fadump boot (memory preserving
boot), if QEMU recieves a 'ibm,os-term' rtas call.

Implementing the fadump boot as:
    * pause all vcpus (will save registers later)
    * preserve memory regions specified by fadump
    * do a memory preserving reboot (GUEST_RESET in QEMU doesn't clear
      the memory)

Memory regions registered by fadump will be handled in a later patch.

Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
---
 hw/ppc/spapr_rtas.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index eebdf13b1552..01c82375f03d 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -342,6 +342,43 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
 }
 
 struct fadump_metadata fadump_metadata;
+bool is_next_boot_fadump;
+
+static void trigger_fadump_boot(target_ulong spapr_retcode)
+{
+    /*
+     * In PowerNV, SBE stops all clocks for cores, do similar to it
+     * QEMU's nearest equivalent is 'pause_all_vcpus'
+     * See 'stopClocksS0' in SBE source code for more info on SBE part
+     */
+    pause_all_vcpus();
+
+    if (true /* TODO: Preserve memory registered for fadump */) {
+        /* Failed to preserve the registered memory regions */
+        rtas_st(spapr_retcode, 0, RTAS_OUT_HW_ERROR);
+
+        /* Cause a reboot */
+        qemu_system_guest_panicked(NULL);
+        return;
+    }
+
+    /* Mark next boot as fadump boot */
+    is_next_boot_fadump = true;
+
+    /* Reset fadump_registered for next boot */
+    fadump_metadata.fadump_registered = false;
+    fadump_metadata.fadump_dump_active = true;
+
+    /* Then do a guest reset */
+    /*
+     * Requirement:
+     * This guest reset should not clear the memory (which is
+     * the case when this is merged)
+     */
+    qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
+
+    rtas_st(spapr_retcode, 0, RTAS_OUT_SUCCESS);
+}
 
 /* Papr Section 7.4.9 ibm,configure-kernel-dump RTAS call */
 static __attribute((unused)) void rtas_configure_kernel_dump(PowerPCCPU *cpu,
@@ -449,6 +486,11 @@ static void rtas_ibm_os_term(PowerPCCPU *cpu,
     target_ulong msgaddr = rtas_ld(args, 0);
     char msg[512];
 
+    if (fadump_metadata.fadump_registered) {
+        /* If fadump boot works, control won't come back here */
+        return trigger_fadump_boot(rets);
+    }
+
     cpu_physical_memory_read(msgaddr, msg, sizeof(msg) - 1);
     msg[sizeof(msg) - 1] = 0;
 
-- 
2.48.1


Re: [PATCH 2/6] hw/ppc: Trigger Fadump boot if fadump is registered
Posted by Harsh Prateek Bora 11 months, 1 week ago

On 2/17/25 12:47, Aditya Gupta wrote:
> According to PAPR:
> 
>      R1–7.3.30–3. When the platform receives an ibm,os-term RTAS call, or
>      on a system reset without an ibm,nmi-interlock RTAS call, if the
>      platform has a dump structure registered through the
>      ibm,configure-kernel-dump call, the platform must process each
>      registered kernel dump section as required and, when available,
>      present the dump structure information to the operating system
>      through the “ibm,kernel-dump” property, updated with status for each
>      dump section, until the dump has been invalidated through the
>      ibm,configure-kernel-dump RTAS call.
> 
> If Fadump has been registered, trigger an Fadump boot (memory preserving
> boot), if QEMU recieves a 'ibm,os-term' rtas call.
> 
> Implementing the fadump boot as:
>      * pause all vcpus (will save registers later)
>      * preserve memory regions specified by fadump

Although mentioned later, but needs to call out here as not implemented
in this patch. Ideally, all the prep work patches should be introduced
earlier before enabling the trigger.

>      * do a memory preserving reboot (GUEST_RESET in QEMU doesn't clear
>        the memory)
> 
> Memory regions registered by fadump will be handled in a later patch.
> 
> Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
> ---
>   hw/ppc/spapr_rtas.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 42 insertions(+)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index eebdf13b1552..01c82375f03d 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -342,6 +342,43 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
>   }
>   
>   struct fadump_metadata fadump_metadata;
> +bool is_next_boot_fadump;
> +
> +static void trigger_fadump_boot(target_ulong spapr_retcode)
> +{
> +    /*
> +     * In PowerNV, SBE stops all clocks for cores, do similar to it
> +     * QEMU's nearest equivalent is 'pause_all_vcpus'
> +     * See 'stopClocksS0' in SBE source code for more info on SBE part
> +     */
> +    pause_all_vcpus();
> +
> +    if (true /* TODO: Preserve memory registered for fadump */) {
> +        /* Failed to preserve the registered memory regions */

Instead of this, it is better to introduce the dummy stub here now which 
can be populated in a later patch. That also helps in avoiding code 
changes in this hunk in future patch.

For eg:

static bool fadump_preserved_mem(void)
{
     return false; /* TBD */
}

...

if (!fadump_preserve_mem()) {
  ...
}

> +        rtas_st(spapr_retcode, 0, RTAS_OUT_HW_ERROR);
> +
> +        /* Cause a reboot */
> +        qemu_system_guest_panicked(NULL);
> +        return;
> +    }
> +
> +    /* Mark next boot as fadump boot */
> +    is_next_boot_fadump = true;
> +
> +    /* Reset fadump_registered for next boot */
> +    fadump_metadata.fadump_registered = false;
> +    fadump_metadata.fadump_dump_active = true;
> +
> +    /* Then do a guest reset */
> +    /*
> +     * Requirement:
> +     * This guest reset should not clear the memory (which is
> +     * the case when this is merged)
> +     */
> +    qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
> +
> +    rtas_st(spapr_retcode, 0, RTAS_OUT_SUCCESS);
> +}
>   
>   /* Papr Section 7.4.9 ibm,configure-kernel-dump RTAS call */
>   static __attribute((unused)) void rtas_configure_kernel_dump(PowerPCCPU *cpu,
> @@ -449,6 +486,11 @@ static void rtas_ibm_os_term(PowerPCCPU *cpu,
>       target_ulong msgaddr = rtas_ld(args, 0);
>       char msg[512];
>   
> +    if (fadump_metadata.fadump_registered) {
> +        /* If fadump boot works, control won't come back here */
> +        return trigger_fadump_boot(rets);
> +    }
> +
>       cpu_physical_memory_read(msgaddr, msg, sizeof(msg) - 1);
>       msg[sizeof(msg) - 1] = 0;
>   

Re: [PATCH 2/6] hw/ppc: Trigger Fadump boot if fadump is registered
Posted by Aditya Gupta 11 months, 1 week ago
On 04/03/25 14:51, Harsh Prateek Bora wrote:
>
>
> On 2/17/25 12:47, Aditya Gupta wrote:
>> According to PAPR:
>>
>>      R1–7.3.30–3. When the platform receives an ibm,os-term RTAS 
>> call, or
>>      on a system reset without an ibm,nmi-interlock RTAS call, if the
>>      platform has a dump structure registered through the
>>      ibm,configure-kernel-dump call, the platform must process each
>>      registered kernel dump section as required and, when available,
>>      present the dump structure information to the operating system
>>      through the “ibm,kernel-dump” property, updated with status for 
>> each
>>      dump section, until the dump has been invalidated through the
>>      ibm,configure-kernel-dump RTAS call.
>>
>> If Fadump has been registered, trigger an Fadump boot (memory preserving
>> boot), if QEMU recieves a 'ibm,os-term' rtas call.
>>
>> Implementing the fadump boot as:
>>      * pause all vcpus (will save registers later)
>>      * preserve memory regions specified by fadump
>
> Although mentioned later, but needs to call out here as not implemented
> in this patch. Ideally, all the prep work patches should be introduced
> earlier before enabling the trigger.
>
Got it, will try rearranging the code. Though with current code, the 
trigger won't be called as fadump will not get registered (as of this 
patch the rtas call was not exposed to the kernel, this will likely 
change in v2).
>>      * do a memory preserving reboot (GUEST_RESET in QEMU doesn't clear
>>        the memory)
>>
>> Memory regions registered by fadump will be handled in a later patch.
>>
>> Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
>> ---
>>   hw/ppc/spapr_rtas.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 42 insertions(+)
>>
>> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
>> index eebdf13b1552..01c82375f03d 100644
>> --- a/hw/ppc/spapr_rtas.c
>> +++ b/hw/ppc/spapr_rtas.c
>> @@ -342,6 +342,43 @@ static void 
>> rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
>>   }
>>     struct fadump_metadata fadump_metadata;
>> +bool is_next_boot_fadump;
>> +
>> +static void trigger_fadump_boot(target_ulong spapr_retcode)
>> +{
>> +    /*
>> +     * In PowerNV, SBE stops all clocks for cores, do similar to it
>> +     * QEMU's nearest equivalent is 'pause_all_vcpus'
>> +     * See 'stopClocksS0' in SBE source code for more info on SBE part
>> +     */
>> +    pause_all_vcpus();
>> +
>> +    if (true /* TODO: Preserve memory registered for fadump */) {
>> +        /* Failed to preserve the registered memory regions */
>
> Instead of this, it is better to introduce the dummy stub here now 
> which can be populated in a later patch. That also helps in avoiding 
> code changes in this hunk in future patch.
>
> For eg:
>
> static bool fadump_preserved_mem(void)
> {
>     return false; /* TBD */
> }
>
> ...
>
> if (!fadump_preserve_mem()) {
>  ...
> }

Thanks, makes sense. I will do it this way.


Thanks,

- Aditya Gupta

>
>> +        rtas_st(spapr_retcode, 0, RTAS_OUT_HW_ERROR);
>> +
>> +        /* Cause a reboot */
>> +        qemu_system_guest_panicked(NULL);
>> +        return;
>> +    }
>> +
>> +    /* Mark next boot as fadump boot */
>> +    is_next_boot_fadump = true;
>> +
>> +    /* Reset fadump_registered for next boot */
>> +    fadump_metadata.fadump_registered = false;
>> +    fadump_metadata.fadump_dump_active = true;
>> +
>> +    /* Then do a guest reset */
>> +    /*
>> +     * Requirement:
>> +     * This guest reset should not clear the memory (which is
>> +     * the case when this is merged)
>> +     */
>> +    qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
>> +
>> +    rtas_st(spapr_retcode, 0, RTAS_OUT_SUCCESS);
>> +}
>>     /* Papr Section 7.4.9 ibm,configure-kernel-dump RTAS call */
>>   static __attribute((unused)) void 
>> rtas_configure_kernel_dump(PowerPCCPU *cpu,
>> @@ -449,6 +486,11 @@ static void rtas_ibm_os_term(PowerPCCPU *cpu,
>>       target_ulong msgaddr = rtas_ld(args, 0);
>>       char msg[512];
>>   +    if (fadump_metadata.fadump_registered) {
>> +        /* If fadump boot works, control won't come back here */
>> +        return trigger_fadump_boot(rets);
>> +    }
>> +
>>       cpu_physical_memory_read(msgaddr, msg, sizeof(msg) - 1);
>>       msg[sizeof(msg) - 1] = 0;

Re: [PATCH 2/6] hw/ppc: Trigger Fadump boot if fadump is registered
Posted by Nicholas Piggin 11 months, 2 weeks ago
On Mon Feb 17, 2025 at 5:17 PM AEST, Aditya Gupta wrote:
> According to PAPR:
>
>     R1–7.3.30–3. When the platform receives an ibm,os-term RTAS call, or
>     on a system reset without an ibm,nmi-interlock RTAS call, if the
>     platform has a dump structure registered through the
>     ibm,configure-kernel-dump call, the platform must process each
>     registered kernel dump section as required and, when available,
>     present the dump structure information to the operating system
>     through the “ibm,kernel-dump” property, updated with status for each
>     dump section, until the dump has been invalidated through the
>     ibm,configure-kernel-dump RTAS call.
>
> If Fadump has been registered, trigger an Fadump boot (memory preserving
> boot), if QEMU recieves a 'ibm,os-term' rtas call.
>
> Implementing the fadump boot as:
>     * pause all vcpus (will save registers later)
>     * preserve memory regions specified by fadump
>     * do a memory preserving reboot (GUEST_RESET in QEMU doesn't clear
>       the memory)
>
> Memory regions registered by fadump will be handled in a later patch.
>
> Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
> ---
>  hw/ppc/spapr_rtas.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
>
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index eebdf13b1552..01c82375f03d 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -342,6 +342,43 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
>  }
>  
>  struct fadump_metadata fadump_metadata;
> +bool is_next_boot_fadump;

Here's another one for spapr state.

> +
> +static void trigger_fadump_boot(target_ulong spapr_retcode)
> +{
> +    /*
> +     * In PowerNV, SBE stops all clocks for cores, do similar to it
> +     * QEMU's nearest equivalent is 'pause_all_vcpus'
> +     * See 'stopClocksS0' in SBE source code for more info on SBE part
> +     */

Can probably remove this comment here.

> +    pause_all_vcpus();
> +
> +    if (true /* TODO: Preserve memory registered for fadump */) {

If you're adding half the code to preserve memory but never actually
calling it anyway, you don't need the pause_all_vcpus() call either.

Again I would rather not adding unused code to the patches if possible.
If you're really not able to find a nice way to split and add
incrementally then okay, but try to take another look if possible.


> +        /* Failed to preserve the registered memory regions */
> +        rtas_st(spapr_retcode, 0, RTAS_OUT_HW_ERROR);
> +
> +        /* Cause a reboot */
> +        qemu_system_guest_panicked(NULL);
> +        return;
> +    }
> +
> +    /* Mark next boot as fadump boot */
> +    is_next_boot_fadump = true;
> +
> +    /* Reset fadump_registered for next boot */
> +    fadump_metadata.fadump_registered = false;
> +    fadump_metadata.fadump_dump_active = true;
> +
> +    /* Then do a guest reset */
> +    /*
> +     * Requirement:
> +     * This guest reset should not clear the memory (which is
> +     * the case when this is merged)
> +     */
> +    qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);

Seems reasonable. What is the actual mechanism that clears the machine
RAM anyway? I'm not able to find it...

Thanks,
Nick

> +
> +    rtas_st(spapr_retcode, 0, RTAS_OUT_SUCCESS);
> +}
>  
>  /* Papr Section 7.4.9 ibm,configure-kernel-dump RTAS call */
>  static __attribute((unused)) void rtas_configure_kernel_dump(PowerPCCPU *cpu,
> @@ -449,6 +486,11 @@ static void rtas_ibm_os_term(PowerPCCPU *cpu,
>      target_ulong msgaddr = rtas_ld(args, 0);
>      char msg[512];
>  
> +    if (fadump_metadata.fadump_registered) {
> +        /* If fadump boot works, control won't come back here */
> +        return trigger_fadump_boot(rets);
> +    }
> +
>      cpu_physical_memory_read(msgaddr, msg, sizeof(msg) - 1);
>      msg[sizeof(msg) - 1] = 0;
>  
Re: [PATCH 2/6] hw/ppc: Trigger Fadump boot if fadump is registered
Posted by Aditya Gupta 11 months, 2 weeks ago
On 27/02/25 08:44, Nicholas Piggin wrote:
> On Mon Feb 17, 2025 at 5:17 PM AEST, Aditya Gupta wrote:
>> According to PAPR:
>>
>>      R1–7.3.30–3. When the platform receives an ibm,os-term RTAS call, or
>>      on a system reset without an ibm,nmi-interlock RTAS call, if the
>>      platform has a dump structure registered through the
>>      ibm,configure-kernel-dump call, the platform must process each
>>      registered kernel dump section as required and, when available,
>>      present the dump structure information to the operating system
>>      through the “ibm,kernel-dump” property, updated with status for each
>>      dump section, until the dump has been invalidated through the
>>      ibm,configure-kernel-dump RTAS call.
>>
>> If Fadump has been registered, trigger an Fadump boot (memory preserving
>> boot), if QEMU recieves a 'ibm,os-term' rtas call.
>>
>> Implementing the fadump boot as:
>>      * pause all vcpus (will save registers later)
>>      * preserve memory regions specified by fadump
>>      * do a memory preserving reboot (GUEST_RESET in QEMU doesn't clear
>>        the memory)
>>
>> Memory regions registered by fadump will be handled in a later patch.
>>
>> Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
>> ---
>>   hw/ppc/spapr_rtas.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 42 insertions(+)
>>
>> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
>> index eebdf13b1552..01c82375f03d 100644
>> --- a/hw/ppc/spapr_rtas.c
>> +++ b/hw/ppc/spapr_rtas.c
>> @@ -342,6 +342,43 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
>>   }
>>   
>>   struct fadump_metadata fadump_metadata;
>> +bool is_next_boot_fadump;
> Here's another one for spapr state.
Sure, will add.
>> +
>> +static void trigger_fadump_boot(target_ulong spapr_retcode)
>> +{
>> +    /*
>> +     * In PowerNV, SBE stops all clocks for cores, do similar to it
>> +     * QEMU's nearest equivalent is 'pause_all_vcpus'
>> +     * See 'stopClocksS0' in SBE source code for more info on SBE part
>> +     */
> Can probably remove this comment here.
Sure.
>> +    pause_all_vcpus();
>> +
>> +    if (true /* TODO: Preserve memory registered for fadump */) {
> If you're adding half the code to preserve memory but never actually
> calling it anyway, you don't need the pause_all_vcpus() call either.
>
> Again I would rather not adding unused code to the patches if possible.
> If you're really not able to find a nice way to split and add
> incrementally then okay, but try to take another look if possible.
>
Yes all this is unused. Will take another look to see how I can arrange it.
>> +        /* Failed to preserve the registered memory regions */
>> +        rtas_st(spapr_retcode, 0, RTAS_OUT_HW_ERROR);
>> +
>> +        /* Cause a reboot */
>> +        qemu_system_guest_panicked(NULL);
>> +        return;
>> +    }
>> +
>> +    /* Mark next boot as fadump boot */
>> +    is_next_boot_fadump = true;
>> +
>> +    /* Reset fadump_registered for next boot */
>> +    fadump_metadata.fadump_registered = false;
>> +    fadump_metadata.fadump_dump_active = true;
>> +
>> +    /* Then do a guest reset */
>> +    /*
>> +     * Requirement:
>> +     * This guest reset should not clear the memory (which is
>> +     * the case when this is merged)
>> +     */
>> +    qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
> Seems reasonable. What is the actual mechanism that clears the machine
> RAM anyway? I'm not able to find it...

I didn't find any too. There is a devices reset which happens at this 
guest_reset, during which some devices do clear their memory registers, 
eg. 'pnv_psi_reset', since it clears it's regs it's like it's cleared 
it's memory region.

There were few like that which cleared the data they pass in their 
memory regions, but nothing clearing the whole RAM/whole memory regions 
as such.


- Aditya G

>
> Thanks,
> Nick
>
>> +
>> +    rtas_st(spapr_retcode, 0, RTAS_OUT_SUCCESS);
>> +}
>>   
>>   /* Papr Section 7.4.9 ibm,configure-kernel-dump RTAS call */
>>   static __attribute((unused)) void rtas_configure_kernel_dump(PowerPCCPU *cpu,
>> @@ -449,6 +486,11 @@ static void rtas_ibm_os_term(PowerPCCPU *cpu,
>>       target_ulong msgaddr = rtas_ld(args, 0);
>>       char msg[512];
>>   
>> +    if (fadump_metadata.fadump_registered) {
>> +        /* If fadump boot works, control won't come back here */
>> +        return trigger_fadump_boot(rets);
>> +    }
>> +
>>       cpu_physical_memory_read(msgaddr, msg, sizeof(msg) - 1);
>>       msg[sizeof(msg) - 1] = 0;
>>