[PATCH 03/16] spapr: Explain purpose of ->fwnmi_migration_blocker more clearly

Markus Armbruster posted 16 patches 4 years, 6 months ago
[PATCH 03/16] spapr: Explain purpose of ->fwnmi_migration_blocker more clearly
Posted by Markus Armbruster 4 years, 6 months ago
spapr_mce_req_event() makes an effort to prevent migration from
degrading the reporting of FWNMIs.  It adds a migration blocker when
it receives one, and deletes it when it's done handling it.  This is a
best effort.

Commit 2500fb423a "migration: Include migration support for machine
check handling" tried to explain this in a comment.  Rewrite the
comment for clarity, and reposition it to make it clear it applies to
all failure modes, not just "migration already in progress".

Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Aravinda Prasad <arawinda.p@gmail.com>
Cc: Ganesh Goudar <ganeshgr@linux.ibm.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 hw/ppc/spapr_events.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index a8f2cc6bdc..7d6876f12d 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -911,16 +911,17 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
         }
     }
 
+    /*
+     * Try to block migration while FWNMI is being handled, so the
+     * machine check handler runs where the information passed to it
+     * actually makes sense.  This shouldn't actually block migration,
+     * only delay it slightly, assuming migration is retried.  If the
+     * attempt to block fails, carry on.  Unfortunately, it always
+     * fails when running with -only-migrate.  A proper interface to
+     * delay migration completion for a bit could avoid that.
+     */
     ret = migrate_add_blocker(spapr->fwnmi_migration_blocker, NULL);
     if (ret == -EBUSY) {
-        /*
-         * We don't want to abort so we let the migration to continue.
-         * In a rare case, the machine check handler will run on the target.
-         * Though this is not preferable, it is better than aborting
-         * the migration or killing the VM. It is okay to call
-         * migrate_del_blocker on a blocker that was not added (which the
-         * nmi-interlock handler would do when it's called after this).
-         */
         warn_report("Received a fwnmi while migration was in progress");
     }
 
-- 
2.31.1


Re: [PATCH 03/16] spapr: Explain purpose of ->fwnmi_migration_blocker more clearly
Posted by David Gibson 4 years, 6 months ago
On Tue, Jul 20, 2021 at 02:53:55PM +0200, Markus Armbruster wrote:
> spapr_mce_req_event() makes an effort to prevent migration from
> degrading the reporting of FWNMIs.  It adds a migration blocker when
> it receives one, and deletes it when it's done handling it.  This is a
> best effort.
> 
> Commit 2500fb423a "migration: Include migration support for machine
> check handling" tried to explain this in a comment.  Rewrite the
> comment for clarity, and reposition it to make it clear it applies to
> all failure modes, not just "migration already in progress".
> 
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Aravinda Prasad <arawinda.p@gmail.com>
> Cc: Ganesh Goudar <ganeshgr@linux.ibm.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>

Acked-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  hw/ppc/spapr_events.c | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index a8f2cc6bdc..7d6876f12d 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -911,16 +911,17 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>          }
>      }
>  
> +    /*
> +     * Try to block migration while FWNMI is being handled, so the
> +     * machine check handler runs where the information passed to it
> +     * actually makes sense.  This shouldn't actually block migration,
> +     * only delay it slightly, assuming migration is retried.  If the
> +     * attempt to block fails, carry on.  Unfortunately, it always
> +     * fails when running with -only-migrate.  A proper interface to
> +     * delay migration completion for a bit could avoid that.
> +     */
>      ret = migrate_add_blocker(spapr->fwnmi_migration_blocker, NULL);
>      if (ret == -EBUSY) {
> -        /*
> -         * We don't want to abort so we let the migration to continue.
> -         * In a rare case, the machine check handler will run on the target.
> -         * Though this is not preferable, it is better than aborting
> -         * the migration or killing the VM. It is okay to call
> -         * migrate_del_blocker on a blocker that was not added (which the
> -         * nmi-interlock handler would do when it's called after this).
> -         */
>          warn_report("Received a fwnmi while migration was in progress");
>      }
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson