[edk2-devel] [PATCH] BaseTools: Cannot store library cache of different arch together

Steven Shi posted 1 patch 5 days ago
Failed in applying to current master (apply log)
BaseTools/Source/Python/build/build.py | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)

[edk2-devel] [PATCH] BaseTools: Cannot store library cache of different arch together

Posted by Steven Shi 5 days ago
https://bugzilla.tianocore.org/show_bug.cgi?id=1895

Build cache cannot store cache for the same library modules
in different arch together. E.g. Both the below IA32 and X64
arch BaseLib caches should exist after build Ovmf3264, but now
only the one in X64 arch exist.
The reason is the current Basetool use a set() to same all
library AutoGen objects, but the different arch lib AutoGen
objects have same __hash_ value which comes from the lib
MetaFile(The path of module file):
    def __hash__(self):
        return hash(self.MetaFile)

So the different arch lib AutoGen objects are duplicated one
to the set() and only one can exist. This is why the Basetool
can only store one arch cache for library.
This patch avoid to use the set() as the container to save the
library and module objects because the objects might have same
__hash__ value.

Cc: Liming Gao <liming.gao@intel.com>
Cc: Bob Feng <bob.c.feng@intel.com>
Cc: Christian Rodriguez <christian.rodriguez@intel.com>
Signed-off-by: Steven Shi <steven.shi@intel.com>
---
 BaseTools/Source/Python/build/build.py | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/BaseTools/Source/Python/build/build.py b/BaseTools/Source/Python/build/build.py
index 0855d4561c..f7a79cbbab 100644
--- a/BaseTools/Source/Python/build/build.py
+++ b/BaseTools/Source/Python/build/build.py
@@ -2203,21 +2203,14 @@ class Build():
             RemoveDirectory(os.path.dirname(GlobalData.gDatabasePath), True)
 
     def CreateAsBuiltInf(self):
-        all_lib_set = set()
-        all_mod_set = set()
         for Module in self.BuildModules:
             Module.CreateAsBuiltInf()
-            all_mod_set.add(Module)
+            for lib in Module.LibraryAutoGenList:
+                lib.CreateAsBuiltInf(True)
         for Module in self.HashSkipModules:
             Module.CreateAsBuiltInf(True)
-            all_mod_set.add(Module)
-        for Module in all_mod_set:
             for lib in Module.LibraryAutoGenList:
-                all_lib_set.add(lib)
-        for lib in all_lib_set:
-            lib.CreateAsBuiltInf(True)
-        all_lib_set.clear()
-        all_mod_set.clear()
+                lib.CreateAsBuiltInf(True)
         self.BuildModules = []
         self.HashSkipModules = []
     ## Do some clean-up works when error occurred
-- 
2.17.1.windows.2


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#42171): https://edk2.groups.io/g/devel/message/42171
Mute This Topic: https://groups.io/mt/32013233/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [PATCH] BaseTools: Cannot store library cache of different arch together

Posted by Christian Rodriguez 5 days ago
The change looks good overall, but I'm concerned about the performance if there are multiple of the same library used by different modules. In this case the library will be copied once per arch per module. I'm not sure if it would make a giant impact, but just something to think about because you would be doing extra IO accesses for each duplicate library autogen.

Possible suggestion:
Is it possible to change the hash to include the arch? Or you can store a unique tuple pair like (lib.arch, lib) in the set to reduce the libraries being copied to a unique subset.

Thanks,
Christian

>-----Original Message-----
>From: Shi, Steven
>Sent: Monday, June 10, 2019 10:53 PM
>To: devel@edk2.groups.io
>Cc: Gao, Liming <liming.gao@intel.com>; Feng, Bob C
><bob.c.feng@intel.com>; Rodriguez, Christian
><christian.rodriguez@intel.com>
>Subject: [PATCH] BaseTools: Cannot store library cache of different arch
>together
>
>https://bugzilla.tianocore.org/show_bug.cgi?id=1895
>
>Build cache cannot store cache for the same library modules in different arch
>together. E.g. Both the below IA32 and X64 arch BaseLib caches should exist
>after build Ovmf3264, but now only the one in X64 arch exist.
>The reason is the current Basetool use a set() to same all library AutoGen
>objects, but the different arch lib AutoGen objects have same __hash_ value
>which comes from the lib MetaFile(The path of module file):
>    def __hash__(self):
>        return hash(self.MetaFile)
>
>So the different arch lib AutoGen objects are duplicated one to the set() and
>only one can exist. This is why the Basetool can only store one arch cache for
>library.
>This patch avoid to use the set() as the container to save the library and
>module objects because the objects might have same __hash__ value.
>
>Cc: Liming Gao <liming.gao@intel.com>
>Cc: Bob Feng <bob.c.feng@intel.com>
>Cc: Christian Rodriguez <christian.rodriguez@intel.com>
>Signed-off-by: Steven Shi <steven.shi@intel.com>
>---
> BaseTools/Source/Python/build/build.py | 13 +++----------
> 1 file changed, 3 insertions(+), 10 deletions(-)
>
>diff --git a/BaseTools/Source/Python/build/build.py
>b/BaseTools/Source/Python/build/build.py
>index 0855d4561c..f7a79cbbab 100644
>--- a/BaseTools/Source/Python/build/build.py
>+++ b/BaseTools/Source/Python/build/build.py
>@@ -2203,21 +2203,14 @@ class Build():
>             RemoveDirectory(os.path.dirname(GlobalData.gDatabasePath), True)
>
>     def CreateAsBuiltInf(self):
>-        all_lib_set = set()
>-        all_mod_set = set()
>         for Module in self.BuildModules:
>             Module.CreateAsBuiltInf()
>-            all_mod_set.add(Module)
>+            for lib in Module.LibraryAutoGenList:
>+                lib.CreateAsBuiltInf(True)
>         for Module in self.HashSkipModules:
>             Module.CreateAsBuiltInf(True)
>-            all_mod_set.add(Module)
>-        for Module in all_mod_set:
>             for lib in Module.LibraryAutoGenList:
>-                all_lib_set.add(lib)
>-        for lib in all_lib_set:
>-            lib.CreateAsBuiltInf(True)
>-        all_lib_set.clear()
>-        all_mod_set.clear()
>+                lib.CreateAsBuiltInf(True)
>         self.BuildModules = []
>         self.HashSkipModules = []
>     ## Do some clean-up works when error occurred
>--
>2.17.1.windows.2


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#42209): https://edk2.groups.io/g/devel/message/42209
Mute This Topic: https://groups.io/mt/32013233/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [PATCH] BaseTools: Cannot store library cache of different arch together

Posted by Steven Shi 4 days ago
Hi Christian,

For the extra IO accesses for duplicated library, I plan to introduce the CopyFileOnChange() function to solve it. Below is the CopyFileOnChange() BZ, and I haven't sent its patch yet. The CopyFileOnChange() will ensure only once IO store/restore access for each library. To avoid duplicated library access is critical not only for the performance, but also for the writing data race in multi-threads.





Basetool need CopyFileOnChange function to avoid cache file writing race in multi-thread build

https://bugzilla.tianocore.org/show_bug.cgi?id=1894





Thanks



Steven Shi

Intel\SSG\FID\Firmware Infrastructure



> -----Original Message-----

> From: Rodriguez, Christian

> Sent: Tuesday, June 11, 2019 11:41 PM

> To: Shi, Steven <steven.shi@intel.com>; devel@edk2.groups.io

> Cc: Gao, Liming <liming.gao@intel.com>; Feng, Bob C <bob.c.feng@intel.com>

> Subject: RE: [PATCH] BaseTools: Cannot store library cache of different arch

> together

>

> The change looks good overall, but I'm concerned about the performance if

> there are multiple of the same library used by different modules. In this case

> the library will be copied once per arch per module. I'm not sure if it would

> make a giant impact, but just something to think about because you would be

> doing extra IO accesses for each duplicate library autogen.

>

> Possible suggestion:

> Is it possible to change the hash to include the arch? Or you can store a

> unique tuple pair like (lib.arch, lib) in the set to reduce the libraries being

> copied to a unique subset.

>

> Thanks,

> Christian

>

> >-----Original Message-----

> >From: Shi, Steven

> >Sent: Monday, June 10, 2019 10:53 PM

> >To: devel@edk2.groups.io<mailto:devel@edk2.groups.io>

> >Cc: Gao, Liming <liming.gao@intel.com<mailto:liming.gao@intel.com>>; Feng, Bob C

> ><bob.c.feng@intel.com<mailto:bob.c.feng@intel.com>>; Rodriguez, Christian

> ><christian.rodriguez@intel.com<mailto:christian.rodriguez@intel.com>>

> >Subject: [PATCH] BaseTools: Cannot store library cache of different arch

> >together

> >

> >https://bugzilla.tianocore.org/show_bug.cgi?id=1895

> >

> >Build cache cannot store cache for the same library modules in different arch

> >together. E.g. Both the below IA32 and X64 arch BaseLib caches should exist

> >after build Ovmf3264, but now only the one in X64 arch exist.

> >The reason is the current Basetool use a set() to same all library AutoGen

> >objects, but the different arch lib AutoGen objects have same __hash_ value

> >which comes from the lib MetaFile(The path of module file):

> >    def __hash__(self):

> >        return hash(self.MetaFile)

> >

> >So the different arch lib AutoGen objects are duplicated one to the set() and

> >only one can exist. This is why the Basetool can only store one arch cache for

> >library.

> >This patch avoid to use the set() as the container to save the library and

> >module objects because the objects might have same __hash__ value.

> >

> >Cc: Liming Gao <liming.gao@intel.com<mailto:liming.gao@intel.com>>

> >Cc: Bob Feng <bob.c.feng@intel.com<mailto:bob.c.feng@intel.com>>

> >Cc: Christian Rodriguez <christian.rodriguez@intel.com<mailto:christian.rodriguez@intel.com>>

> >Signed-off-by: Steven Shi <steven.shi@intel.com<mailto:steven.shi@intel.com>>

> >---

> > BaseTools/Source/Python/build/build.py | 13 +++----------

> > 1 file changed, 3 insertions(+), 10 deletions(-)

> >

> >diff --git a/BaseTools/Source/Python/build/build.py

> >b/BaseTools/Source/Python/build/build.py

> >index 0855d4561c..f7a79cbbab 100644

> >--- a/BaseTools/Source/Python/build/build.py

> >+++ b/BaseTools/Source/Python/build/build.py

> >@@ -2203,21 +2203,14 @@ class Build():

> >             RemoveDirectory(os.path.dirname(GlobalData.gDatabasePath), True)

> >

> >     def CreateAsBuiltInf(self):

> >-        all_lib_set = set()

> >-        all_mod_set = set()

> >         for Module in self.BuildModules:

> >             Module.CreateAsBuiltInf()

> >-            all_mod_set.add(Module)

> >+            for lib in Module.LibraryAutoGenList:

> >+                lib.CreateAsBuiltInf(True)

> >         for Module in self.HashSkipModules:

> >             Module.CreateAsBuiltInf(True)

> >-            all_mod_set.add(Module)

> >-        for Module in all_mod_set:

> >             for lib in Module.LibraryAutoGenList:

> >-                all_lib_set.add(lib)

> >-        for lib in all_lib_set:

> >-            lib.CreateAsBuiltInf(True)

> >-        all_lib_set.clear()

> >-        all_mod_set.clear()

> >+                lib.CreateAsBuiltInf(True)

> >         self.BuildModules = []

> >         self.HashSkipModules = []

> >     ## Do some clean-up works when error occurred

> >--

> >2.17.1.windows.2



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#42244): https://edk2.groups.io/g/devel/message/42244
Mute This Topic: https://groups.io/mt/32013233/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-