UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
This commit will reduce the interval of the AP status check event from 100
milliseconds to 10 milliseconds.
(I searched the history of the 100ms interval, it seems no comment or log
message was mentioned for the choice of this value. Looks like the value
is selected by experience.)
The purpose is to reduce the response time when the BSP calls below
EFI_MP_SERVICES_PROTOCOL services in a non-blocking manner:
* StartupAllAPs()
* StartupThisAP()
Reducing the check interval will benefit the performance for the case when
the BSP uses WaitForEvent() or uses CheckEvent() in a loop to wait for
AP(s) to complete the task, especially when the task can be finished
considerably fast on AP(s).
An example is within function CpuFeaturesInitialize() under
UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c,
where BSP will perform the same task with APs and requires all the
processors to finish the task before BSP proceeds to its next task.
Impact:
A. The impact is minimal when there is no non-blocking calls of the
StartupAllAPs/StartupThisAp MP services, because the check function
CheckAndUpdateApsStatus() will return directly when there is no
registered wait event (i.e. no non-blocking request).
B. There will be a performance tradeoff when BSP continues to proceed
other tasks after submitting a non-blocking StartupAllAPs/StartupThisAP
request. If the AP status check takes a good portion of the shortened
interval, BSP will have less time slice working on its own task before
all the APs complete their tasks.
My investigation for Impact B is that it is a rare scenario in the edk2
code base.
Unitests:
A. OS boot successfully.
B. System (with 24 threads) boot time reduced. Almost all the saved time
comes from the above-mentioned case in CpuFeaturesInitialize().
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Star Zeng <star.zeng@intel.com>
Signed-off-by: Hao A Wu <hao.a.wu@intel.com>
---
UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
index a987c32109..9ba886e8ed 100644
--- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
@@ -15,7 +15,7 @@
#include <Protocol/Timer.h>
-#define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (100))
+#define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (10))
#define AP_SAFE_STACK_SIZE 128
CPU_MP_DATA *mCpuMpData = NULL;
--
2.12.0.windows.1
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#55846): https://edk2.groups.io/g/devel/message/55846
Mute This Topic: https://groups.io/mt/71925352/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-
Hi Hao, On 03/13/20 14:22, Hao A Wu wrote: > This commit will reduce the interval of the AP status check event from 100 > milliseconds to 10 milliseconds. > > (I searched the history of the 100ms interval, it seems no comment or log > message was mentioned for the choice of this value. Looks like the value > is selected by experience.) > > The purpose is to reduce the response time when the BSP calls below > EFI_MP_SERVICES_PROTOCOL services in a non-blocking manner: > > * StartupAllAPs() > * StartupThisAP() > > Reducing the check interval will benefit the performance for the case when > the BSP uses WaitForEvent() or uses CheckEvent() in a loop to wait for > AP(s) to complete the task, especially when the task can be finished > considerably fast on AP(s). > > An example is within function CpuFeaturesInitialize() under > UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c, > where BSP will perform the same task with APs and requires all the > processors to finish the task before BSP proceeds to its next task. > > Impact: > A. The impact is minimal when there is no non-blocking calls of the > StartupAllAPs/StartupThisAp MP services, because the check function > CheckAndUpdateApsStatus() will return directly when there is no > registered wait event (i.e. no non-blocking request). > > B. There will be a performance tradeoff when BSP continues to proceed > other tasks after submitting a non-blocking StartupAllAPs/StartupThisAP > request. If the AP status check takes a good portion of the shortened > interval, BSP will have less time slice working on its own task before > all the APs complete their tasks. > > My investigation for Impact B is that it is a rare scenario in the edk2 > code base. > > Unitests: > A. OS boot successfully. > B. System (with 24 threads) boot time reduced. Almost all the saved time > comes from the above-mentioned case in CpuFeaturesInitialize(). > > Cc: Eric Dong <eric.dong@intel.com> > Cc: Ray Ni <ray.ni@intel.com> > Cc: Laszlo Ersek <lersek@redhat.com> > Cc: Michael D Kinney <michael.d.kinney@intel.com> > Cc: Star Zeng <star.zeng@intel.com> > Signed-off-by: Hao A Wu <hao.a.wu@intel.com> > --- > UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > index a987c32109..9ba886e8ed 100644 > --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > @@ -15,7 +15,7 @@ > > #include <Protocol/Timer.h> > > -#define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (100)) > +#define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (10)) > #define AP_SAFE_STACK_SIZE 128 > > CPU_MP_DATA *mCpuMpData = NULL; > The use case is valid, IMO. And the commit message is helpful. But I really think this constant should be PCD. Here's why I think a platform might want to control it: - The best (finest) possible resolution for timer events is platform dependent, IIUC. The duration of the "idle tick" is platform-specific. And, it likely makes no sense to set AP_CHECK_INTERVAL to a duration that's around, or under, what the arch timer resolution allows for. - In the other direction, CheckAndUpdateApsStatus() contains a loop that counts up to CpuMpData->CpuCount. In a very large system (hundreds or maybe thousands of APs) this function may have non-negligible cost. I suggest introducing a PCD for this (measured in msecs) and using 100 msecs as the default. Thanks Laszlo -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#55850): https://edk2.groups.io/g/devel/message/55850 Mute This Topic: https://groups.io/mt/71925352/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
> The use case is valid, IMO. And the commit message is helpful. > > But I really think this constant should be PCD. Here's why I think a > platform might want to control it: > > - The best (finest) possible resolution for timer events is platform > dependent, IIUC. The duration of the "idle tick" is platform-specific. > And, it likely makes no sense to set AP_CHECK_INTERVAL to a duration > that's around, or under, what the arch timer resolution allows for. > > - In the other direction, CheckAndUpdateApsStatus() contains a loop that > counts up to CpuMpData->CpuCount. In a very large system (hundreds or > maybe thousands of APs) this function may have non-negligible cost. > > I suggest introducing a PCD for this (measured in msecs) and using 100 > msecs as the default. Laszlo, Adding a PCD means platform integrators need to consider which value to set. Most of the time, they may just use the default PCD value. Then, why not we add the PCD later when a real case is met? Thanks, Ray -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#55860): https://edk2.groups.io/g/devel/message/55860 Mute This Topic: https://groups.io/mt/71925352/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On 03/16/20 02:37, Ni, Ray wrote: >> The use case is valid, IMO. And the commit message is helpful. >> >> But I really think this constant should be PCD. Here's why I think a >> platform might want to control it: >> >> - The best (finest) possible resolution for timer events is platform >> dependent, IIUC. The duration of the "idle tick" is platform-specific. >> And, it likely makes no sense to set AP_CHECK_INTERVAL to a duration >> that's around, or under, what the arch timer resolution allows for. >> >> - In the other direction, CheckAndUpdateApsStatus() contains a loop that >> counts up to CpuMpData->CpuCount. In a very large system (hundreds or >> maybe thousands of APs) this function may have non-negligible cost. >> >> I suggest introducing a PCD for this (measured in msecs) and using 100 >> msecs as the default. > > Laszlo, > Adding a PCD means platform integrators need to consider which value to set. > Most of the time, they may just use the default PCD value. > Then, why not we add the PCD later when a real case is met? The patch changes existent behavior; it is not for a newly introduced feature. Because most platforms are not in the edk2 tree, we don't know what platforms could be regressed by increasing the polling frequency tenfold. (And remember that the polling action has O(n) cost, where "n" is the logical processor count.) Under your suggestion, the expression "real case is met" amounts to "someone reports a regression" (possibly after the next stable tag, even). I don't think that's a good idea. In particular, the patch is motivated by RegisterCpuFeaturesLib -- the CpuFeaturesInitialize() function -- on some platform(s) that Hao uses. But there are platforms that don't use RegisterCpuFeaturesLib, and still use MpInitLib. Thanks, Laszlo -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#56101): https://edk2.groups.io/g/devel/message/56101 Mute This Topic: https://groups.io/mt/71925352/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
> > Laszlo, > > Adding a PCD means platform integrators need to consider which value to set. > > Most of the time, they may just use the default PCD value. > > Then, why not we add the PCD later when a real case is met? > > The patch changes existent behavior; it is not for a newly introduced > feature. > > Because most platforms are not in the edk2 tree, we don't know what > platforms could be regressed by increasing the polling frequency > tenfold. (And remember that the polling action has O(n) cost, where "n" > is the logical processor count.) > > Under your suggestion, the expression "real case is met" amounts to > "someone reports a regression" (possibly after the next stable tag, > even). I don't think that's a good idea. > In particular, the patch is motivated by RegisterCpuFeaturesLib -- the > CpuFeaturesInitialize() function -- on some platform(s) that Hao uses. > But there are platforms that don't use RegisterCpuFeaturesLib, and still > use MpInitLib. OK. I agree with your suggestion. -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#56109): https://edk2.groups.io/g/devel/message/56109 Mute This Topic: https://groups.io/mt/71925352/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On 3/23/20 9:37 AM, Ni, Ray wrote: >>> Laszlo, >>> Adding a PCD means platform integrators need to consider which value to set. >>> Most of the time, they may just use the default PCD value. >>> Then, why not we add the PCD later when a real case is met? >> >> The patch changes existent behavior; it is not for a newly introduced >> feature. >> >> Because most platforms are not in the edk2 tree, we don't know what >> platforms could be regressed by increasing the polling frequency >> tenfold. (And remember that the polling action has O(n) cost, where "n" >> is the logical processor count.) >> >> Under your suggestion, the expression "real case is met" amounts to >> "someone reports a regression" (possibly after the next stable tag, >> even). I don't think that's a good idea. >> In particular, the patch is motivated by RegisterCpuFeaturesLib -- the >> CpuFeaturesInitialize() function -- on some platform(s) that Hao uses. >> But there are platforms that don't use RegisterCpuFeaturesLib, and still >> use MpInitLib. > > OK. I agree with your suggestion. Thank you. I agree with Laszlo: MP initialization is tricky to scale, and platform dependent. My group builds real machines with thousands of APs, and we have had to do various tweaks to the MP init. code. Having a PCD to adjust this timeout will be very useful. Thanks, -- Brian J. Johnson Enterprise X86 Lab Hewlett Packard Enterprise brian.johnson@hpe.com -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#56112): https://edk2.groups.io/g/devel/message/56112 Mute This Topic: https://groups.io/mt/71925352/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
> -----Original Message----- > From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Brian > J. Johnson > Sent: Tuesday, March 24, 2020 12:39 AM > To: devel@edk2.groups.io; Ni, Ray; Laszlo Ersek; Wu, Hao A; > rfc@edk2.groups.io > Cc: Dong, Eric; Kinney, Michael D; Zeng, Star > Subject: Re: [edk2-devel] [RFC][PATCH v1] UefiCpuPkg/MpInitLib DXE: Reduce > AP status check interval > > On 3/23/20 9:37 AM, Ni, Ray wrote: > >>> Laszlo, > >>> Adding a PCD means platform integrators need to consider which value to > set. > >>> Most of the time, they may just use the default PCD value. > >>> Then, why not we add the PCD later when a real case is met? > >> > >> The patch changes existent behavior; it is not for a newly introduced > >> feature. > >> > >> Because most platforms are not in the edk2 tree, we don't know what > >> platforms could be regressed by increasing the polling frequency > >> tenfold. (And remember that the polling action has O(n) cost, where "n" > >> is the logical processor count.) > >> > >> Under your suggestion, the expression "real case is met" amounts to > >> "someone reports a regression" (possibly after the next stable tag, > >> even). I don't think that's a good idea. > >> In particular, the patch is motivated by RegisterCpuFeaturesLib -- the > >> CpuFeaturesInitialize() function -- on some platform(s) that Hao uses. > >> But there are platforms that don't use RegisterCpuFeaturesLib, and still > >> use MpInitLib. > > > > OK. I agree with your suggestion. > > Thank you. I agree with Laszlo: MP initialization is tricky to scale, > and platform dependent. My group builds real machines with thousands of > APs, and we have had to do various tweaks to the MP init. code. Having > a PCD to adjust this timeout will be very useful. Thanks all for the feedbacks. Please grant me some time to prepare a new version of the patch. Best Regards, Hao Wu > > Thanks, > -- > Brian J. Johnson > Enterprise X86 Lab > > Hewlett Packard Enterprise > brian.johnson@hpe.com > > > -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#56121): https://edk2.groups.io/g/devel/message/56121 Mute This Topic: https://groups.io/mt/71925352/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
© 2016 - 2024 Red Hat, Inc.