[edk2-devel] [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events

Ankur Arora posted 9 patches 3 years, 9 months ago
There is a newer version of this series
[edk2-devel] [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events
Posted by Ankur Arora 3 years, 9 months ago
Process fw_remove events in QemuCpuhpCollectApicIds() and collect
corresponding APIC IDs for CPUs that are being hot-unplugged.

In addition, we now ignore CPUs which only have remove set. These
CPUs haven't been processed by OSPM yet.

This is based on the QEMU hot-unplug protocol documented here:
  https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/

Also define QEMU_CPUHP_STAT_EJECTED while we are at it.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---

Notes:
    I'm treating events (insert=1, fw_remove=1) below as invalid (return
    EFI_PROTOCOL_ERROR, which ends up as an assert), but I'm not sure
    that is correct:
    
         if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
           //
           // The "insert" event guarantees the "enabled" status; plus it excludes
    -      // the "remove" event.
    +      // the "fw_remove" event.
           //
           if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
    -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
    +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
             DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
               "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
               CpuStatus));
    
    QEMU's handling in cpu_hotplug_rd() can return both of these:
    
    cpu_hotplug_rd() {
       ...
       case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
    	val |= cdev->cpu ? 1 : 0;
    	val |= cdev->is_inserting ? 2 : 0;
    	val |= cdev->is_removing  ? 4 : 0;
    	val |= cdev->fw_remove  ? 16 : 0;
       ...
    }
    and I don't see any code that treats is_inserting and is_removing as
    exclusive.

    One specific case where this looks it might be a problem is if the user
    unplugs a CPU and right after that plugs it.
    
    As part of the unplug handling, the ACPI AML would, in the scan loop,
    asynchronously trigger the notify, which would do the OS unplug, set
    "fw_remove" and then call the SMI_CMD.
    
    The subsequent plug could then come and set the "insert" bit.
    
    Assuming what I'm describing could happen, I'm not sure what's the right
    handling: QEMU could treat these bits as exclusive and then OVMF could
    justifiably treat it as a protocol error?

 OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h |  2 ++
 OvmfPkg/CpuHotplugSmm/QemuCpuhp.c                 | 29 +++++++++++++++++++----
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
index a34a6d3fae61..692e3072598c 100644
--- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
+++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
@@ -34,6 +34,8 @@
 #define QEMU_CPUHP_STAT_ENABLED                BIT0
 #define QEMU_CPUHP_STAT_INSERT                 BIT1
 #define QEMU_CPUHP_STAT_REMOVE                 BIT2
+#define QEMU_CPUHP_STAT_EJECTED                BIT3
+#define QEMU_CPUHP_STAT_FW_REMOVE              BIT4
 
 #define QEMU_CPUHP_RW_CMD_DATA               0x8
 
diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
index 8d4a6693c8d6..f871e50c377b 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
@@ -245,10 +245,10 @@ QemuCpuhpCollectApicIds (
     if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
       //
       // The "insert" event guarantees the "enabled" status; plus it excludes
-      // the "remove" event.
+      // the "fw_remove" event.
       //
       if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
-          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
         DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
           "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
           CpuStatus));
@@ -260,12 +260,31 @@ QemuCpuhpCollectApicIds (
 
       ExtendIds   = PluggedApicIds;
       ExtendCount = PluggedCount;
-    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
-      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
-        CurrentSelector));
+    } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
+      //
+      // "fw_remove" event guarantees "enabled".
+      //
+      if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
+        DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
+          "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
+          CpuStatus));
+        return EFI_PROTOCOL_ERROR;
+      }
+
+      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
+        __FUNCTION__, CurrentSelector));
 
       ExtendIds   = ToUnplugApicIds;
       ExtendCount = ToUnplugCount;
+    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+      //
+      // Let the OSPM deal with the "remove" event.
+      //
+      DEBUG ((DEBUG_INFO, "%a: CurrentSelector=%u: remove (ignored)\n",
+        __FUNCTION__, CurrentSelector));
+
+      CurrentSelector++;
+      continue;
     } else {
       DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
         __FUNCTION__, CurrentSelector));
-- 
2.9.3



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#70876): https://edk2.groups.io/g/devel/message/70876
Mute This Topic: https://groups.io/mt/80199952/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events
Posted by Laszlo Ersek 3 years, 9 months ago
On 01/29/21 01:59, Ankur Arora wrote:
> Process fw_remove events in QemuCpuhpCollectApicIds() and collect
> corresponding APIC IDs for CPUs that are being hot-unplugged.
>
> In addition, we now ignore CPUs which only have remove set. These
> CPUs haven't been processed by OSPM yet.
>
> This is based on the QEMU hot-unplug protocol documented here:
>   https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/
>
> Also define QEMU_CPUHP_STAT_EJECTED while we are at it.

(1) Please move the addition of QEMU_CPUHP_STAT_EJECTED to patch 8
("OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection"), where you
first use it.

>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>
> Notes:
>     I'm treating events (insert=1, fw_remove=1) below as invalid (return
>     EFI_PROTOCOL_ERROR, which ends up as an assert), but I'm not sure
>     that is correct:
>
>          if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
>            //
>            // The "insert" event guarantees the "enabled" status; plus it excludes
>     -      // the "remove" event.
>     +      // the "fw_remove" event.
>            //
>            if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
>     -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>     +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>              DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>                "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>                CpuStatus));
>
>     QEMU's handling in cpu_hotplug_rd() can return both of these:
>
>     cpu_hotplug_rd() {
>        ...
>        case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
>     	val |= cdev->cpu ? 1 : 0;
>     	val |= cdev->is_inserting ? 2 : 0;
>     	val |= cdev->is_removing  ? 4 : 0;
>     	val |= cdev->fw_remove  ? 16 : 0;
>        ...
>     }
>     and I don't see any code that treats is_inserting and is_removing as
>     exclusive.
>
>     One specific case where this looks it might be a problem is if the user
>     unplugs a CPU and right after that plugs it.
>
>     As part of the unplug handling, the ACPI AML would, in the scan loop,
>     asynchronously trigger the notify, which would do the OS unplug, set
>     "fw_remove" and then call the SMI_CMD.
>
>     The subsequent plug could then come and set the "insert" bit.
>
>     Assuming what I'm describing could happen, I'm not sure what's the right
>     handling: QEMU could treat these bits as exclusive and then OVMF could
>     justifiably treat it as a protocol error?

I'm OK with the related part of your patch (i.e., returning
EFI_PROTOCOL_ERROR for (insert=1, fw_remove=1)).

>
>  OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h |  2 ++
>  OvmfPkg/CpuHotplugSmm/QemuCpuhp.c                 | 29 +++++++++++++++++++----
>  2 files changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
> index a34a6d3fae61..692e3072598c 100644
> --- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
> +++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
> @@ -34,6 +34,8 @@
>  #define QEMU_CPUHP_STAT_ENABLED                BIT0
>  #define QEMU_CPUHP_STAT_INSERT                 BIT1
>  #define QEMU_CPUHP_STAT_REMOVE                 BIT2
> +#define QEMU_CPUHP_STAT_EJECTED                BIT3
> +#define QEMU_CPUHP_STAT_FW_REMOVE              BIT4
>
>  #define QEMU_CPUHP_RW_CMD_DATA               0x8
>
> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
> index 8d4a6693c8d6..f871e50c377b 100644
> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
> @@ -245,10 +245,10 @@ QemuCpuhpCollectApicIds (
>      if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
>        //
>        // The "insert" event guarantees the "enabled" status; plus it excludes
> -      // the "remove" event.
> +      // the "fw_remove" event.
>        //
>        if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
> -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
> +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>          DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>            "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>            CpuStatus));
> @@ -260,12 +260,31 @@ QemuCpuhpCollectApicIds (
>
>        ExtendIds   = PluggedApicIds;
>        ExtendCount = PluggedCount;
> -    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
> -      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
> -        CurrentSelector));
> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
> +      //
> +      // "fw_remove" event guarantees "enabled".
> +      //
> +      if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
> +        DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
> +          "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
> +          CpuStatus));
> +        return EFI_PROTOCOL_ERROR;
> +      }
> +
> +      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
> +        __FUNCTION__, CurrentSelector));
>
>        ExtendIds   = ToUnplugApicIds;
>        ExtendCount = ToUnplugCount;
> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
> +      //
> +      // Let the OSPM deal with the "remove" event.
> +      //
> +      DEBUG ((DEBUG_INFO, "%a: CurrentSelector=%u: remove (ignored)\n",
> +        __FUNCTION__, CurrentSelector));

(2) Please downgrade this debug mask from DEBUG_INFO to DEBUG_VERBOSE.

(If you want your OVMF build to emit DEBUG_VERBOSE messages to the log,
you can set PcdDebugPrintErrorLevel to 0x8040004F in the DSC file --
DEBUG_VERBOSE has value 0x00400000.)

> +
> +      CurrentSelector++;
> +      continue;

(3) This change is logically correct; however I request a different
implementation, as I indicated here:

  https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg06737.html
  msgid: <926113ec-8fa1-7d3b-ff3f-f1eda692e83d@redhat.com>

Namely:

(3a) On this branch, please set both "ExtendIds" and "ExtendCount" to
NULL, replacing the currently proposed "CurrentSelector" increment and
the "continue" statement.

(3b) Locate the section of code that starts with the comment "Save the
APIC ID of the CPU with the pending event...", and make it conditional
like this:

    ASSERT ((ExtendIds == NULL) == (ExtendCount == NULL));
    if (ExtendIds != NULL) {
      ...
    }

(3c) and then simply proceed to the end of the loop body, where we
increment "CurrentSelector" already.


Here's why I'm asing for this: with your proposed v6 patch, the loop
body would receive a "CurrentSelector" increment operation that did not
explain itself. And I'd really like to keep *any* "CurrentSelector"
increment operation explained by the comment that we currently have at
the end of the loop body:

     //
     // We've processed the CPU with (known) pending events, but we must never
     // clear events. Therefore we need to advance past this CPU manually;
     // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
     // selected CPU.
     //

Keeping up that "well-explained" status would require one of two
options:

- copy the comment into the new branch (duplicating the comment) just
  before you add the new "CurrentSelector" increment operation, or

- make sure we have just one spot where we increment "CurrentSelector",
  and preserve the comment there.

The second option looks much better to me, so that's what I'm asking
for.

If we didn't have that big comment on the increment, your solution would
be just fine, but said comment is really important IMO.

Thanks!
Laszlo

>      } else {
>        DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
>          __FUNCTION__, CurrentSelector));
>



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#70916): https://edk2.groups.io/g/devel/message/70916
Mute This Topic: https://groups.io/mt/80199952/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events
Posted by Ankur Arora 3 years, 9 months ago
On 2021-01-29 6:18 p.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Process fw_remove events in QemuCpuhpCollectApicIds() and collect
>> corresponding APIC IDs for CPUs that are being hot-unplugged.
>>
>> In addition, we now ignore CPUs which only have remove set. These
>> CPUs haven't been processed by OSPM yet.
>>
>> This is based on the QEMU hot-unplug protocol documented here:
>>    https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/
>>
>> Also define QEMU_CPUHP_STAT_EJECTED while we are at it.
> 
> (1) Please move the addition of QEMU_CPUHP_STAT_EJECTED to patch 8
> ("OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection"), where you
> first use it.
> 
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>
>> Notes:
>>      I'm treating events (insert=1, fw_remove=1) below as invalid (return
>>      EFI_PROTOCOL_ERROR, which ends up as an assert), but I'm not sure
>>      that is correct:
>>
>>           if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
>>             //
>>             // The "insert" event guarantees the "enabled" status; plus it excludes
>>      -      // the "remove" event.
>>      +      // the "fw_remove" event.
>>             //
>>             if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
>>      -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>>      +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>>               DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>>                 "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>>                 CpuStatus));
>>
>>      QEMU's handling in cpu_hotplug_rd() can return both of these:
>>
>>      cpu_hotplug_rd() {
>>         ...
>>         case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
>>      	val |= cdev->cpu ? 1 : 0;
>>      	val |= cdev->is_inserting ? 2 : 0;
>>      	val |= cdev->is_removing  ? 4 : 0;
>>      	val |= cdev->fw_remove  ? 16 : 0;
>>         ...
>>      }
>>      and I don't see any code that treats is_inserting and is_removing as
>>      exclusive.
>>
>>      One specific case where this looks it might be a problem is if the user
>>      unplugs a CPU and right after that plugs it.
>>
>>      As part of the unplug handling, the ACPI AML would, in the scan loop,
>>      asynchronously trigger the notify, which would do the OS unplug, set
>>      "fw_remove" and then call the SMI_CMD.
>>
>>      The subsequent plug could then come and set the "insert" bit.
>>
>>      Assuming what I'm describing could happen, I'm not sure what's the right
>>      handling: QEMU could treat these bits as exclusive and then OVMF could
>>      justifiably treat it as a protocol error?
> 
> I'm OK with the related part of your patch (i.e., returning
> EFI_PROTOCOL_ERROR for (insert=1, fw_remove=1)).
> 
>>
>>   OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h |  2 ++
>>   OvmfPkg/CpuHotplugSmm/QemuCpuhp.c                 | 29 +++++++++++++++++++----
>>   2 files changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
>> index a34a6d3fae61..692e3072598c 100644
>> --- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
>> +++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
>> @@ -34,6 +34,8 @@
>>   #define QEMU_CPUHP_STAT_ENABLED                BIT0
>>   #define QEMU_CPUHP_STAT_INSERT                 BIT1
>>   #define QEMU_CPUHP_STAT_REMOVE                 BIT2
>> +#define QEMU_CPUHP_STAT_EJECTED                BIT3
>> +#define QEMU_CPUHP_STAT_FW_REMOVE              BIT4
>>
>>   #define QEMU_CPUHP_RW_CMD_DATA               0x8
>>
>> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> index 8d4a6693c8d6..f871e50c377b 100644
>> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> @@ -245,10 +245,10 @@ QemuCpuhpCollectApicIds (
>>       if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
>>         //
>>         // The "insert" event guarantees the "enabled" status; plus it excludes
>> -      // the "remove" event.
>> +      // the "fw_remove" event.
>>         //
>>         if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
>> -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>> +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>>           DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>>             "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>>             CpuStatus));
>> @@ -260,12 +260,31 @@ QemuCpuhpCollectApicIds (
>>
>>         ExtendIds   = PluggedApicIds;
>>         ExtendCount = PluggedCount;
>> -    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>> -      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
>> -        CurrentSelector));
>> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>> +      //
>> +      // "fw_remove" event guarantees "enabled".
>> +      //
>> +      if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
>> +        DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>> +          "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>> +          CpuStatus));
>> +        return EFI_PROTOCOL_ERROR;
>> +      }
>> +
>> +      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
>> +        __FUNCTION__, CurrentSelector));
>>
>>         ExtendIds   = ToUnplugApicIds;
>>         ExtendCount = ToUnplugCount;
>> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>> +      //
>> +      // Let the OSPM deal with the "remove" event.
>> +      //
>> +      DEBUG ((DEBUG_INFO, "%a: CurrentSelector=%u: remove (ignored)\n",
>> +        __FUNCTION__, CurrentSelector));
> 
> (2) Please downgrade this debug mask from DEBUG_INFO to DEBUG_VERBOSE.
> 
> (If you want your OVMF build to emit DEBUG_VERBOSE messages to the log,
> you can set PcdDebugPrintErrorLevel to 0x8040004F in the DSC file --
> DEBUG_VERBOSE has value 0x00400000.)
> 
>> +
>> +      CurrentSelector++;
>> +      continue;
> 
> (3) This change is logically correct; however I request a different
> implementation, as I indicated here:
> 
>    https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg06737.html
>    msgid: <926113ec-8fa1-7d3b-ff3f-f1eda692e83d@redhat.com>
> 
> Namely:
> 
> (3a) On this branch, please set both "ExtendIds" and "ExtendCount" to
> NULL, replacing the currently proposed "CurrentSelector" increment and
> the "continue" statement.
> 
> (3b) Locate the section of code that starts with the comment "Save the
> APIC ID of the CPU with the pending event...", and make it conditional
> like this:
> 
>      ASSERT ((ExtendIds == NULL) == (ExtendCount == NULL));
>      if (ExtendIds != NULL) {
>        ...
>      }
> 
> (3c) and then simply proceed to the end of the loop body, where we
> increment "CurrentSelector" already.
> 
> 
> Here's why I'm asing for this: with your proposed v6 patch, the loop
> body would receive a "CurrentSelector" increment operation that did not
> explain itself. And I'd really like to keep *any* "CurrentSelector"
> increment operation explained by the comment that we currently have at
> the end of the loop body:
> 
>       //
>       // We've processed the CPU with (known) pending events, but we must never
>       // clear events. Therefore we need to advance past this CPU manually;
>       // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
>       // selected CPU.
>       //
> 
> Keeping up that "well-explained" status would require one of two
> options:
> 
> - copy the comment into the new branch (duplicating the comment) just
>    before you add the new "CurrentSelector" increment operation, or
> 
> - make sure we have just one spot where we increment "CurrentSelector",
>    and preserve the comment there.
> 
> The second option looks much better to me, so that's what I'm asking
> for.
> 
> If we didn't have that big comment on the increment, your solution would
> be just fine, but said comment is really important IMO.

Yeah you are right, that comment is quite useful to keep in mind. Will fix.

Also acking the other review comments here (including the EJECTED -> EJECT
change.)

Thanks
Ankur

> 
> Thanks!
> Laszlo
> 
>>       } else {
>>         DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
>>           __FUNCTION__, CurrentSelector));
>>
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#71047): https://edk2.groups.io/g/devel/message/71047
Mute This Topic: https://groups.io/mt/80199952/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events
Posted by Laszlo Ersek 3 years, 9 months ago
On 01/30/21 03:18, Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Process fw_remove events in QemuCpuhpCollectApicIds() and collect
>> corresponding APIC IDs for CPUs that are being hot-unplugged.
>>
>> In addition, we now ignore CPUs which only have remove set. These
>> CPUs haven't been processed by OSPM yet.
>>
>> This is based on the QEMU hot-unplug protocol documented here:
>>   https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/
>>
>> Also define QEMU_CPUHP_STAT_EJECTED while we are at it.
> 
> (1) Please move the addition of QEMU_CPUHP_STAT_EJECTED to patch 8
> ("OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection"), where you
> first use it.

(4) Apologies for the bikeshedding, but I also suggest that we call the
macro "QEMU_CPUHP_STAT_EJECT", rather than "_EJECTED".

Reason: QEMU documents this bit (on write) as "initiates device eject";
in other words, it's not a status, but a signal (or request) from the
guest code to QEMU.

Thanks
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#70917): https://edk2.groups.io/g/devel/message/70917
Mute This Topic: https://groups.io/mt/80199952/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-