[RFC] docs/misra: List files in Xen originated from external sources

Michal Orzel posted 1 patch 3 weeks, 1 day ago
Test gitlab-ci passed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20221116092032.4423-1-michal.orzel@amd.com
docs/misra/external-files.txt | 70 +++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
create mode 100644 docs/misra/external-files.txt
[RFC] docs/misra: List files in Xen originated from external sources
Posted by Michal Orzel 3 weeks, 1 day ago
Add a file with a table listing files in the Xen codebase that originated
from external sources (e.g. Linux kernel). This is done in order to
improve traceability, help with the review process and act as a base for
listing files to exclude from MISRA checkers.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
---
Some time ago, together with Stefano, we shared a spreadsheet [1] to gather
a feedback about external files in Xen. This RFC patch is moving this
discussion to the mailing list to gather some more comments. The main request
for the maintainers is to check if there are any files missing, check for the
correctness of the already populated entries and decide what to do with the
entries with a question mark.

[1] https://cryptpad.fr/sheet/#/2/sheet/edit/G5WWo5TI71n58OLEG9TCu95j/
---
 docs/misra/external-files.txt | 70 +++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 docs/misra/external-files.txt

diff --git a/docs/misra/external-files.txt b/docs/misra/external-files.txt
new file mode 100644
index 000000000000..69ff233da456
--- /dev/null
+++ b/docs/misra/external-files.txt
@@ -0,0 +1,70 @@
+External files in Xen
+=====================
+
+The following table lists files present in the Xen codebase that originated
+from external sources e.g. Linux kernel. The purpose of this table is to:
+ - keep track of the external files in the Xen codebase,
+ - help with the review process (e.g. changes done to the files that did not
+   diverge, shall be first submitted to the external projects and then
+   backported to Xen),
+ - act as a base for listing files to exclude from MISRA checkers.
+
+NOTES:
+1) The files shall be listed in alphabetical order.
+2) The origin of the files derived from the projects other than Linux, shall
+   be specified within the () placed next to the path.
+3) The table shall be updated when importing new files from external projects.
+4) At the moment, only the source files are listed in the table.
+
+| Relative path to xen/                     | Diverged from | Subject to       |
+|                                           | origin? [Y/N] | backports? [Y/N] |
++-------------------------------------------+---------------+------------------+
+| arch/arm/arm64/cpufeature.c               | N             | Y                |
+| arch/arm/arm64/insn.c                     | N             | Y                |
+| arch/arm/arm64/lib/find_next_bit.c        | N             | Y                |
+| arch/x86/acpi/boot.c                      | Y             | ?                |
+| arch/x86/acpi/lib.c                       | ?             | ?                |
+| arch/x86/cpu/mcheck/mce-apei.c            | N             | Y                |
+| arch/x86/cpu/mcheck/non-fatal.c           | ?             | Y                |
+| arch/x86/cpu/mtrr/*                       | Y             | N                |
+| arch/x86/cpu/amd.c                        | Y             | N                |
+| arch/x86/cpu/centaur.c                    | Y             | N                |
+| arch/x86/cpu/common.c                     | Y             | N                |
+| arch/x86/cpu/hygon.c                      | Y             | N                |
+| arch/x86/cpu/intel_cacheinfo.c            | Y             | Y                |
+| arch/x86/cpu/intel.c                      | Y             | N                |
+| arch/x86/cpu/mwait-idle.c                 | Y             | Y                |
+| arch/x86/genapic/*                        | Y             | N                |
+| arch/x86/x86_64/mmconf-fam10h.c           | N             | Y                |
+| arch/x86/dmi_scan.c                       | Y             | ?                |
+| arch/x86/mpparse.c                        | Y             | ?                |
+| arch/x86/srat.c                           | Y             | N                |
+| common/libfdt/* (libfdt)                  | N             | Y                |
+| common/lz4/decompress.c                   | N             | Y                |
+| common/ubsan/ubsan.c                      | Y             | Y                |
+| common/xz/*                               | N             | Y                |
+| common/zstd/*                             | N             | Y                |
+| common/bitmap.c                           | N             | Y                |
+| common/bunzip2.c                          | N             | Y                |
+| common/earlycpio.c                        | N             | Y                |
+| common/inflate.c                          | N             | Y                |
+| common/radix-tree.c                       | N             | Y                |
+| common/un*.c                              | N             | Y                |
+| crypto/rijndael.c (OpenBSD)               | N             | Y                |
+| crypto/vmac.c (public domain)             | N             | Y                |
+| drivers/acpi/apei/*                       | N             | Y                |
+| drivers/acpi/tables/*                     | N             | Y                |
+| drivers/acpi/utilities/*                  | N             | Y                |
+| drivers/acpi/hwregs.c                     | N             | Y                |
+| drivers/acpi/numa.c                       | ?             | Y                |
+| drivers/acpi/osl.c                        | Y             | Y                |
+| drivers/acpi/reboot.c                     | N             | Y                |
+| drivers/acpi/tables.c                     | ?             | Y                |
+| drivers/passthrough/arm/smmu.c            | Y             | N                |
+| drivers/passthrough/arm/smmu-v3.c         | Y             | N                |
+| lib/list-sort.c                           | N             | Y                |
+| lib/mem*.c                                | N             | Y                |
+| lib/rbtree.c                              | N             | Y                |
+| lib/str*.c                                | N             | Y                |
+| lib/xxhash32.c                            | N             | Y                |
+| lib/xxhash64.c                            | N             | Y                |
-- 
2.25.1
Re: [RFC] docs/misra: List files in Xen originated from external sources
Posted by Jan Beulich 3 weeks ago
On 16.11.2022 10:20, Michal Orzel wrote:
> --- /dev/null
> +++ b/docs/misra/external-files.txt
> @@ -0,0 +1,70 @@
> +External files in Xen
> +=====================
> +
> +The following table lists files present in the Xen codebase that originated
> +from external sources e.g. Linux kernel. The purpose of this table is to:
> + - keep track of the external files in the Xen codebase,
> + - help with the review process (e.g. changes done to the files that did not
> +   diverge, shall be first submitted to the external projects and then
> +   backported to Xen),
> + - act as a base for listing files to exclude from MISRA checkers.
> +
> +NOTES:
> +1) The files shall be listed in alphabetical order.

But then you don't?

> +2) The origin of the files derived from the projects other than Linux, shall
> +   be specified within the () placed next to the path.

Might it be more generally useful to have another column, then not only
stating the project but also the path it lives at there (or lived at the
time of cloning)?

> +3) The table shall be updated when importing new files from external projects.
> +4) At the moment, only the source files are listed in the table.
> +
> +| Relative path to xen/                     | Diverged from | Subject to       |
> +|                                           | origin? [Y/N] | backports? [Y/N] |
> ++-------------------------------------------+---------------+------------------+
> +| arch/arm/arm64/cpufeature.c               | N             | Y                |
> +| arch/arm/arm64/insn.c                     | N             | Y                |
> +| arch/arm/arm64/lib/find_next_bit.c        | N             | Y                |
> +| arch/x86/acpi/boot.c                      | Y             | ?                |
> +| arch/x86/acpi/lib.c                       | ?             | ?                |

This was simply split off from boot.c, which I'd be inclined to take to
mean Y in the "diverged" column. In the backports column I'd prefer to
keep ? for both, or any other indicator taken to mean "maybe / uncertain".

What about arch/x86/acpi/cpufreq/* and other stuff in arch/x86/acpi/?

> +| arch/x86/cpu/mcheck/mce-apei.c            | N             | Y                |
> +| arch/x86/cpu/mcheck/non-fatal.c           | ?             | Y                |

Even before disappearing in 2.6.32 the file was different from Linux'es,
simply because we don't have anything like schedule_delayed_work(). So
it's pretty clearly Y for "diverged".

> +| arch/x86/cpu/mtrr/*                       | Y             | N                |
> +| arch/x86/cpu/amd.c                        | Y             | N                |
> +| arch/x86/cpu/centaur.c                    | Y             | N                |
> +| arch/x86/cpu/common.c                     | Y             | N                |
> +| arch/x86/cpu/hygon.c                      | Y             | N                |
> +| arch/x86/cpu/intel_cacheinfo.c            | Y             | Y                |
> +| arch/x86/cpu/intel.c                      | Y             | N                |
> +| arch/x86/cpu/mwait-idle.c                 | Y             | Y                |
> +| arch/x86/genapic/*                        | Y             | N                |
> +| arch/x86/x86_64/mmconf-fam10h.c           | N             | Y                |
> +| arch/x86/dmi_scan.c                       | Y             | ?                |
> +| arch/x86/mpparse.c                        | Y             | ?                |

Like above I'd like to keep ? (or alike) here, as neither Y nor N are
fully accurate.

> +| arch/x86/srat.c                           | Y             | N                |

What about common/cpu.c?

> +| common/libfdt/* (libfdt)                  | N             | Y                |
> +| common/lz4/decompress.c                   | N             | Y                |
> +| common/ubsan/ubsan.c                      | Y             | Y                |
> +| common/xz/*                               | N             | Y                |
> +| common/zstd/*                             | N             | Y                |
> +| common/bitmap.c                           | N             | Y                |
> +| common/bunzip2.c                          | N             | Y                |
> +| common/earlycpio.c                        | N             | Y                |
> +| common/inflate.c                          | N             | Y                |

What about common/notifier.c?

> +| common/radix-tree.c                       | N             | Y                |

What about common/rcupdate.c? (Stopping at this in this regard:
It's unclear by what criteria you have gone. Even as simple an
indicator as "Copyright (C) ... Linus Torvalds" was apparently not
used. Similarly mentioning criteria for considering a file
"diverged" would be very helpful to spell out, even if there's
likely some fuzziness involved there.)

> +| common/un*.c                              | N             | Y                |
> +| crypto/rijndael.c (OpenBSD)               | N             | Y                |
> +| crypto/vmac.c (public domain)             | N             | Y                |
> +| drivers/acpi/apei/*                       | N             | Y                |

I'm not sure of the N here.

> +| drivers/acpi/tables/*                     | N             | Y                |
> +| drivers/acpi/utilities/*                  | N             | Y                |
> +| drivers/acpi/hwregs.c                     | N             | Y                |
> +| drivers/acpi/numa.c                       | ?             | Y                |

Y

> +| drivers/acpi/osl.c                        | Y             | Y                |
> +| drivers/acpi/reboot.c                     | N             | Y                |
> +| drivers/acpi/tables.c                     | ?             | Y                |

Y

What about drivers/cpufreq/*, drivers/char/ehci-dbgp.c,
drivers/char/xhci-dbc.c, and drivers/video/font*? What about some of
the stuff under tools/, especially tools/kconfig/?

Jan
Re: [RFC] docs/misra: List files in Xen originated from external sources
Posted by Michal Orzel 3 weeks ago
Hi Jan,

Thanks for looking at it.

On 17/11/2022 11:03, Jan Beulich wrote:
> 
> 
> On 16.11.2022 10:20, Michal Orzel wrote:
>> --- /dev/null
>> +++ b/docs/misra/external-files.txt
>> @@ -0,0 +1,70 @@
>> +External files in Xen
>> +=====================
>> +
>> +The following table lists files present in the Xen codebase that originated
>> +from external sources e.g. Linux kernel. The purpose of this table is to:
>> + - keep track of the external files in the Xen codebase,
>> + - help with the review process (e.g. changes done to the files that did not
>> +   diverge, shall be first submitted to the external projects and then
>> +   backported to Xen),
>> + - act as a base for listing files to exclude from MISRA checkers.
>> +
>> +NOTES:
>> +1) The files shall be listed in alphabetical order.
> 
> But then you don't?
True, it is alphabetical order with directories having a precedence.

> 
>> +2) The origin of the files derived from the projects other than Linux, shall
>> +   be specified within the () placed next to the path.
> 
> Might it be more generally useful to have another column, then not only
> stating the project but also the path it lives at there (or lived at the
> time of cloning)?
We though about it. Would be a good idea but can be quite challenging for files
that appeared in Xen before switching to git (difficult to establish the time of cloning in such cases).

> 
>> +3) The table shall be updated when importing new files from external projects.
>> +4) At the moment, only the source files are listed in the table.
>> +
>> +| Relative path to xen/                     | Diverged from | Subject to       |
>> +|                                           | origin? [Y/N] | backports? [Y/N] |
>> ++-------------------------------------------+---------------+------------------+
>> +| arch/arm/arm64/cpufeature.c               | N             | Y                |
>> +| arch/arm/arm64/insn.c                     | N             | Y                |
>> +| arch/arm/arm64/lib/find_next_bit.c        | N             | Y                |
>> +| arch/x86/acpi/boot.c                      | Y             | ?                |
>> +| arch/x86/acpi/lib.c                       | ?             | ?                |
> 
> This was simply split off from boot.c, which I'd be inclined to take to
> mean Y in the "diverged" column. In the backports column I'd prefer to
> keep ? for both, or any other indicator taken to mean "maybe / uncertain".
> 
> What about arch/x86/acpi/cpufreq/* and other stuff in arch/x86/acpi/?
> 
>> +| arch/x86/cpu/mcheck/mce-apei.c            | N             | Y                |
>> +| arch/x86/cpu/mcheck/non-fatal.c           | ?             | Y                |
> 
> Even before disappearing in 2.6.32 the file was different from Linux'es,
> simply because we don't have anything like schedule_delayed_work(). So
> it's pretty clearly Y for "diverged".
> 
>> +| arch/x86/cpu/mtrr/*                       | Y             | N                |
>> +| arch/x86/cpu/amd.c                        | Y             | N                |
>> +| arch/x86/cpu/centaur.c                    | Y             | N                |
>> +| arch/x86/cpu/common.c                     | Y             | N                |
>> +| arch/x86/cpu/hygon.c                      | Y             | N                |
>> +| arch/x86/cpu/intel_cacheinfo.c            | Y             | Y                |
>> +| arch/x86/cpu/intel.c                      | Y             | N                |
>> +| arch/x86/cpu/mwait-idle.c                 | Y             | Y                |
>> +| arch/x86/genapic/*                        | Y             | N                |
>> +| arch/x86/x86_64/mmconf-fam10h.c           | N             | Y                |
>> +| arch/x86/dmi_scan.c                       | Y             | ?                |
>> +| arch/x86/mpparse.c                        | Y             | ?                |
> 
> Like above I'd like to keep ? (or alike) here, as neither Y nor N are
> fully accurate.
> 
>> +| arch/x86/srat.c                           | Y             | N                |
> 
> What about common/cpu.c?
> 
>> +| common/libfdt/* (libfdt)                  | N             | Y                |
>> +| common/lz4/decompress.c                   | N             | Y                |
>> +| common/ubsan/ubsan.c                      | Y             | Y                |
>> +| common/xz/*                               | N             | Y                |
>> +| common/zstd/*                             | N             | Y                |
>> +| common/bitmap.c                           | N             | Y                |
>> +| common/bunzip2.c                          | N             | Y                |
>> +| common/earlycpio.c                        | N             | Y                |
>> +| common/inflate.c                          | N             | Y                |
> 
> What about common/notifier.c?
> 
>> +| common/radix-tree.c                       | N             | Y                |
> 
> What about common/rcupdate.c? (Stopping at this in this regard:
> It's unclear by what criteria you have gone. Even as simple an
> indicator as "Copyright (C) ... Linus Torvalds" was apparently not
Please see [1]

> used. Similarly mentioning criteria for considering a file
> "diverged" would be very helpful to spell out, even if there's
> likely some fuzziness involved there.)

We would need to pre-define some criteria to avoid having a long justifications.
Any ideas?

> 
>> +| common/un*.c                              | N             | Y                |
>> +| crypto/rijndael.c (OpenBSD)               | N             | Y                |
>> +| crypto/vmac.c (public domain)             | N             | Y                |
>> +| drivers/acpi/apei/*                       | N             | Y                |
> 
> I'm not sure of the N here.
> 
>> +| drivers/acpi/tables/*                     | N             | Y                |
>> +| drivers/acpi/utilities/*                  | N             | Y                |
>> +| drivers/acpi/hwregs.c                     | N             | Y                |
>> +| drivers/acpi/numa.c                       | ?             | Y                |
> 
> Y
> 
>> +| drivers/acpi/osl.c                        | Y             | Y                |
>> +| drivers/acpi/reboot.c                     | N             | Y                |
>> +| drivers/acpi/tables.c                     | ?             | Y                |
> 
> Y
> 
> What about drivers/cpufreq/*, drivers/char/ehci-dbgp.c,
> drivers/char/xhci-dbc.c, and drivers/video/font*? What about some of
> the stuff under tools/, especially tools/kconfig/?

[1]
For the first shot, the criteria was to list files using different coding style than Xen,
especially the ones using tabs instead of spaces. As I indicated before, the list may not be
completed, hence a gentle ask to list the missing ones. Some of the files you mentioned
use Xen coding style + there is no information in the git history that they originated from
external sources. This is why, the maintainers who are the addressee of this RFC should have
a better knowledge of the origin of such files.

As for the files under tools/, FWICS they are being filtered-out from MISRA checks, hence I
did not list them.

> 
> Jan

~Michal
Re: [RFC] docs/misra: List files in Xen originated from external sources
Posted by Jan Beulich 3 weeks ago
On 17.11.2022 11:39, Michal Orzel wrote:
> On 17/11/2022 11:03, Jan Beulich wrote:
>> On 16.11.2022 10:20, Michal Orzel wrote:
>>> --- /dev/null
>>> +++ b/docs/misra/external-files.txt
>>> @@ -0,0 +1,70 @@
>>> +External files in Xen
>>> +=====================
>>> +
>>> +The following table lists files present in the Xen codebase that originated
>>> +from external sources e.g. Linux kernel. The purpose of this table is to:
>>> + - keep track of the external files in the Xen codebase,
>>> + - help with the review process (e.g. changes done to the files that did not
>>> +   diverge, shall be first submitted to the external projects and then
>>> +   backported to Xen),
>>> + - act as a base for listing files to exclude from MISRA checkers.
>>> +
>>> +NOTES:
>>> +1) The files shall be listed in alphabetical order.
>>
>> But then you don't?
> True, it is alphabetical order with directories having a precedence.

Which is kind of surprising and, at least for me, confusing.

>>> +2) The origin of the files derived from the projects other than Linux, shall
>>> +   be specified within the () placed next to the path.
>>
>> Might it be more generally useful to have another column, then not only
>> stating the project but also the path it lives at there (or lived at the
>> time of cloning)?
> We though about it. Would be a good idea but can be quite challenging for files
> that appeared in Xen before switching to git (difficult to establish the time of cloning in such cases).
> 
>>
>>> +3) The table shall be updated when importing new files from external projects.
>>> +4) At the moment, only the source files are listed in the table.
>>> +
>>> +| Relative path to xen/                     | Diverged from | Subject to       |
>>> +|                                           | origin? [Y/N] | backports? [Y/N] |
>>> ++-------------------------------------------+---------------+------------------+
>>> +| arch/arm/arm64/cpufeature.c               | N             | Y                |
>>> +| arch/arm/arm64/insn.c                     | N             | Y                |
>>> +| arch/arm/arm64/lib/find_next_bit.c        | N             | Y                |
>>> +| arch/x86/acpi/boot.c                      | Y             | ?                |
>>> +| arch/x86/acpi/lib.c                       | ?             | ?                |
>>
>> This was simply split off from boot.c, which I'd be inclined to take to
>> mean Y in the "diverged" column. In the backports column I'd prefer to
>> keep ? for both, or any other indicator taken to mean "maybe / uncertain".
>>
>> What about arch/x86/acpi/cpufreq/* and other stuff in arch/x86/acpi/?
>>
>>> +| arch/x86/cpu/mcheck/mce-apei.c            | N             | Y                |
>>> +| arch/x86/cpu/mcheck/non-fatal.c           | ?             | Y                |
>>
>> Even before disappearing in 2.6.32 the file was different from Linux'es,
>> simply because we don't have anything like schedule_delayed_work(). So
>> it's pretty clearly Y for "diverged".
>>
>>> +| arch/x86/cpu/mtrr/*                       | Y             | N                |
>>> +| arch/x86/cpu/amd.c                        | Y             | N                |
>>> +| arch/x86/cpu/centaur.c                    | Y             | N                |
>>> +| arch/x86/cpu/common.c                     | Y             | N                |
>>> +| arch/x86/cpu/hygon.c                      | Y             | N                |
>>> +| arch/x86/cpu/intel_cacheinfo.c            | Y             | Y                |
>>> +| arch/x86/cpu/intel.c                      | Y             | N                |
>>> +| arch/x86/cpu/mwait-idle.c                 | Y             | Y                |
>>> +| arch/x86/genapic/*                        | Y             | N                |
>>> +| arch/x86/x86_64/mmconf-fam10h.c           | N             | Y                |
>>> +| arch/x86/dmi_scan.c                       | Y             | ?                |
>>> +| arch/x86/mpparse.c                        | Y             | ?                |
>>
>> Like above I'd like to keep ? (or alike) here, as neither Y nor N are
>> fully accurate.
>>
>>> +| arch/x86/srat.c                           | Y             | N                |
>>
>> What about common/cpu.c?
>>
>>> +| common/libfdt/* (libfdt)                  | N             | Y                |
>>> +| common/lz4/decompress.c                   | N             | Y                |
>>> +| common/ubsan/ubsan.c                      | Y             | Y                |
>>> +| common/xz/*                               | N             | Y                |
>>> +| common/zstd/*                             | N             | Y                |
>>> +| common/bitmap.c                           | N             | Y                |
>>> +| common/bunzip2.c                          | N             | Y                |
>>> +| common/earlycpio.c                        | N             | Y                |
>>> +| common/inflate.c                          | N             | Y                |
>>
>> What about common/notifier.c?
>>
>>> +| common/radix-tree.c                       | N             | Y                |
>>
>> What about common/rcupdate.c? (Stopping at this in this regard:
>> It's unclear by what criteria you have gone. Even as simple an
>> indicator as "Copyright (C) ... Linus Torvalds" was apparently not
> Please see [1]
> 
>> used. Similarly mentioning criteria for considering a file
>> "diverged" would be very helpful to spell out, even if there's
>> likely some fuzziness involved there.)
> 
> We would need to pre-define some criteria to avoid having a long justifications.
> Any ideas?

Well, changing just #include-s to fit Xen's model shouldn't count as
divergence. But coding style conversion already may. I'm afraid
criteria here depend very much on the purpose, and hence I don't
feel qualified to suggest any.

>>> +| common/un*.c                              | N             | Y                |
>>> +| crypto/rijndael.c (OpenBSD)               | N             | Y                |
>>> +| crypto/vmac.c (public domain)             | N             | Y                |
>>> +| drivers/acpi/apei/*                       | N             | Y                |
>>
>> I'm not sure of the N here.
>>
>>> +| drivers/acpi/tables/*                     | N             | Y                |
>>> +| drivers/acpi/utilities/*                  | N             | Y                |
>>> +| drivers/acpi/hwregs.c                     | N             | Y                |
>>> +| drivers/acpi/numa.c                       | ?             | Y                |
>>
>> Y
>>
>>> +| drivers/acpi/osl.c                        | Y             | Y                |
>>> +| drivers/acpi/reboot.c                     | N             | Y                |
>>> +| drivers/acpi/tables.c                     | ?             | Y                |
>>
>> Y
>>
>> What about drivers/cpufreq/*, drivers/char/ehci-dbgp.c,
>> drivers/char/xhci-dbc.c, and drivers/video/font*? What about some of
>> the stuff under tools/, especially tools/kconfig/?
> 
> [1]
> For the first shot, the criteria was to list files using different coding style than Xen,
> especially the ones using tabs instead of spaces. As I indicated before, the list may not be
> completed, hence a gentle ask to list the missing ones. Some of the files you mentioned
> use Xen coding style + there is no information in the git history that they originated from
> external sources. This is why, the maintainers who are the addressee of this RFC should have
> a better knowledge of the origin of such files.

Hmm. Please forgive me being blunt, but to me this then looks like
offloading work to people who shouldn't be required to invest
meaningful amounts of time. But maybe that's just me viewing it this
way ... Yet this is particularly relevant if ...

> As for the files under tools/, FWICS they are being filtered-out from MISRA checks, hence I
> did not list them.

... the goal here then indeed is use for MISRA alone. I did e.g. ask
whether it wouldn't be worthwhile to more precisely describe the
origin of files because at some point in the past it was also
proposed to arrange for some more automatic monitoring of changes
being applied at their origins for files we have cloned. Which
obviously first of all requires establishing an association between
our files and their origins.

Jan
Re: [RFC] docs/misra: List files in Xen originated from external sources
Posted by Stefano Stabellini 3 weeks ago
On Thu, 17 Nov 2022, Jan Beulich wrote:
> On 17.11.2022 11:39, Michal Orzel wrote:
> > On 17/11/2022 11:03, Jan Beulich wrote:
> >> On 16.11.2022 10:20, Michal Orzel wrote:
> >>> --- /dev/null
> >>> +++ b/docs/misra/external-files.txt
> >>> @@ -0,0 +1,70 @@
> >>> +External files in Xen
> >>> +=====================
> >>> +
> >>> +The following table lists files present in the Xen codebase that originated
> >>> +from external sources e.g. Linux kernel. The purpose of this table is to:
> >>> + - keep track of the external files in the Xen codebase,
> >>> + - help with the review process (e.g. changes done to the files that did not
> >>> +   diverge, shall be first submitted to the external projects and then
> >>> +   backported to Xen),
> >>> + - act as a base for listing files to exclude from MISRA checkers.
> >>> +
> >>> +NOTES:
> >>> +1) The files shall be listed in alphabetical order.
> >>
> >> But then you don't?
> > True, it is alphabetical order with directories having a precedence.
> 
> Which is kind of surprising and, at least for me, confusing.
> 
> >>> +2) The origin of the files derived from the projects other than Linux, shall
> >>> +   be specified within the () placed next to the path.
> >>
> >> Might it be more generally useful to have another column, then not only
> >> stating the project but also the path it lives at there (or lived at the
> >> time of cloning)?
> > We though about it. Would be a good idea but can be quite challenging for files
> > that appeared in Xen before switching to git (difficult to establish the time of cloning in such cases).
> > 
> >>
> >>> +3) The table shall be updated when importing new files from external projects.
> >>> +4) At the moment, only the source files are listed in the table.
> >>> +
> >>> +| Relative path to xen/                     | Diverged from | Subject to       |
> >>> +|                                           | origin? [Y/N] | backports? [Y/N] |
> >>> ++-------------------------------------------+---------------+------------------+
> >>> +| arch/arm/arm64/cpufeature.c               | N             | Y                |
> >>> +| arch/arm/arm64/insn.c                     | N             | Y                |
> >>> +| arch/arm/arm64/lib/find_next_bit.c        | N             | Y                |
> >>> +| arch/x86/acpi/boot.c                      | Y             | ?                |
> >>> +| arch/x86/acpi/lib.c                       | ?             | ?                |
> >>
> >> This was simply split off from boot.c, which I'd be inclined to take to
> >> mean Y in the "diverged" column. In the backports column I'd prefer to
> >> keep ? for both, or any other indicator taken to mean "maybe / uncertain".
> >>
> >> What about arch/x86/acpi/cpufreq/* and other stuff in arch/x86/acpi/?
> >>
> >>> +| arch/x86/cpu/mcheck/mce-apei.c            | N             | Y                |
> >>> +| arch/x86/cpu/mcheck/non-fatal.c           | ?             | Y                |
> >>
> >> Even before disappearing in 2.6.32 the file was different from Linux'es,
> >> simply because we don't have anything like schedule_delayed_work(). So
> >> it's pretty clearly Y for "diverged".
> >>
> >>> +| arch/x86/cpu/mtrr/*                       | Y             | N                |
> >>> +| arch/x86/cpu/amd.c                        | Y             | N                |
> >>> +| arch/x86/cpu/centaur.c                    | Y             | N                |
> >>> +| arch/x86/cpu/common.c                     | Y             | N                |
> >>> +| arch/x86/cpu/hygon.c                      | Y             | N                |
> >>> +| arch/x86/cpu/intel_cacheinfo.c            | Y             | Y                |
> >>> +| arch/x86/cpu/intel.c                      | Y             | N                |
> >>> +| arch/x86/cpu/mwait-idle.c                 | Y             | Y                |
> >>> +| arch/x86/genapic/*                        | Y             | N                |
> >>> +| arch/x86/x86_64/mmconf-fam10h.c           | N             | Y                |
> >>> +| arch/x86/dmi_scan.c                       | Y             | ?                |
> >>> +| arch/x86/mpparse.c                        | Y             | ?                |
> >>
> >> Like above I'd like to keep ? (or alike) here, as neither Y nor N are
> >> fully accurate.
> >>
> >>> +| arch/x86/srat.c                           | Y             | N                |
> >>
> >> What about common/cpu.c?
> >>
> >>> +| common/libfdt/* (libfdt)                  | N             | Y                |
> >>> +| common/lz4/decompress.c                   | N             | Y                |
> >>> +| common/ubsan/ubsan.c                      | Y             | Y                |
> >>> +| common/xz/*                               | N             | Y                |
> >>> +| common/zstd/*                             | N             | Y                |
> >>> +| common/bitmap.c                           | N             | Y                |
> >>> +| common/bunzip2.c                          | N             | Y                |
> >>> +| common/earlycpio.c                        | N             | Y                |
> >>> +| common/inflate.c                          | N             | Y                |
> >>
> >> What about common/notifier.c?
> >>
> >>> +| common/radix-tree.c                       | N             | Y                |
> >>
> >> What about common/rcupdate.c? (Stopping at this in this regard:
> >> It's unclear by what criteria you have gone. Even as simple an
> >> indicator as "Copyright (C) ... Linus Torvalds" was apparently not
> > Please see [1]
> > 
> >> used. Similarly mentioning criteria for considering a file
> >> "diverged" would be very helpful to spell out, even if there's
> >> likely some fuzziness involved there.)
> > 
> > We would need to pre-define some criteria to avoid having a long justifications.
> > Any ideas?
> 
> Well, changing just #include-s to fit Xen's model shouldn't count as
> divergence. But coding style conversion already may. I'm afraid
> criteria here depend very much on the purpose, and hence I don't
> feel qualified to suggest any.

Hi Jan,

These two columns are not for MISRA's benefit. They are for our own
benefit as maintainers of this code. We can define them the way we want
to.

MISRA doesn't allow us to make any exceptions to our coding guidelines
for files originating from external sources (unless they are a proper
library we link against, I don't think that even libfdt qualifies from
what I understand.) We'll have to figure out what to do about that, but
it is not something this patch is trying to solve. It is just trying to
identify the external files.

So the two columns are just for us as maintainers. It is only to help
us, not to help with MISRA or with safety. So if you think we should
word the first column differently, or even remove the first column
entirely, we could.

Maybe a better criteria for the first column would be: "do we accept
changes to this file?" (direct changes, not backports)


> >>> +| common/un*.c                              | N             | Y                |
> >>> +| crypto/rijndael.c (OpenBSD)               | N             | Y                |
> >>> +| crypto/vmac.c (public domain)             | N             | Y                |
> >>> +| drivers/acpi/apei/*                       | N             | Y                |
> >>
> >> I'm not sure of the N here.
> >>
> >>> +| drivers/acpi/tables/*                     | N             | Y                |
> >>> +| drivers/acpi/utilities/*                  | N             | Y                |
> >>> +| drivers/acpi/hwregs.c                     | N             | Y                |
> >>> +| drivers/acpi/numa.c                       | ?             | Y                |
> >>
> >> Y
> >>
> >>> +| drivers/acpi/osl.c                        | Y             | Y                |
> >>> +| drivers/acpi/reboot.c                     | N             | Y                |
> >>> +| drivers/acpi/tables.c                     | ?             | Y                |
> >>
> >> Y
> >>
> >> What about drivers/cpufreq/*, drivers/char/ehci-dbgp.c,
> >> drivers/char/xhci-dbc.c, and drivers/video/font*? What about some of
> >> the stuff under tools/, especially tools/kconfig/?
> > 
> > [1]
> > For the first shot, the criteria was to list files using different coding style than Xen,
> > especially the ones using tabs instead of spaces. As I indicated before, the list may not be
> > completed, hence a gentle ask to list the missing ones. Some of the files you mentioned
> > use Xen coding style + there is no information in the git history that they originated from
> > external sources. This is why, the maintainers who are the addressee of this RFC should have
> > a better knowledge of the origin of such files.
> 
> Hmm. Please forgive me being blunt, but to me this then looks like
> offloading work to people who shouldn't be required to invest
> meaningful amounts of time. But maybe that's just me viewing it this
> way ... Yet this is particularly relevant if ...
> 
> > As for the files under tools/, FWICS they are being filtered-out from MISRA checks, hence I
> > did not list them.
> 
> ... the goal here then indeed is use for MISRA alone. I did e.g. ask
> whether it wouldn't be worthwhile to more precisely describe the
> origin of files because at some point in the past it was also
> proposed to arrange for some more automatic monitoring of changes
> being applied at their origins for files we have cloned. Which
> obviously first of all requires establishing an association between
> our files and their origins.

One of the goals has certainly to do with MISRA, but I think we would
want to know which files are not conforming to the Xen coding style and
coding guidelines anyway? For example, we need it to automate coding
style checks for new patches with scripts like checkpatch.

And you are right, also adding the origin of the files to help with
backports and monitoring is a great idea.

On the extra work: some of us in the community have been around for a
long time. Without having to do any research, there are things you might
remember on top of your head. If you don't remember, that's fine and we
can try to do some investigation/archeology.
Re: [RFC] docs/misra: List files in Xen originated from external sources
Posted by Michal Orzel 1 day, 9 hours ago

On 17/11/2022 22:15, Stefano Stabellini wrote:
> 
> 
> On Thu, 17 Nov 2022, Jan Beulich wrote:
>> On 17.11.2022 11:39, Michal Orzel wrote:
>>> On 17/11/2022 11:03, Jan Beulich wrote:
>>>> On 16.11.2022 10:20, Michal Orzel wrote:
>>>>> --- /dev/null
>>>>> +++ b/docs/misra/external-files.txt
>>>>> @@ -0,0 +1,70 @@
>>>>> +External files in Xen
>>>>> +=====================
>>>>> +
>>>>> +The following table lists files present in the Xen codebase that originated
>>>>> +from external sources e.g. Linux kernel. The purpose of this table is to:
>>>>> + - keep track of the external files in the Xen codebase,
>>>>> + - help with the review process (e.g. changes done to the files that did not
>>>>> +   diverge, shall be first submitted to the external projects and then
>>>>> +   backported to Xen),
>>>>> + - act as a base for listing files to exclude from MISRA checkers.
>>>>> +
>>>>> +NOTES:
>>>>> +1) The files shall be listed in alphabetical order.
>>>>
>>>> But then you don't?
>>> True, it is alphabetical order with directories having a precedence.
>>
>> Which is kind of surprising and, at least for me, confusing.
>>
>>>>> +2) The origin of the files derived from the projects other than Linux, shall
>>>>> +   be specified within the () placed next to the path.
>>>>
>>>> Might it be more generally useful to have another column, then not only
>>>> stating the project but also the path it lives at there (or lived at the
>>>> time of cloning)?
>>> We though about it. Would be a good idea but can be quite challenging for files
>>> that appeared in Xen before switching to git (difficult to establish the time of cloning in such cases).
>>>
>>>>
>>>>> +3) The table shall be updated when importing new files from external projects.
>>>>> +4) At the moment, only the source files are listed in the table.
>>>>> +
>>>>> +| Relative path to xen/                     | Diverged from | Subject to       |
>>>>> +|                                           | origin? [Y/N] | backports? [Y/N] |
>>>>> ++-------------------------------------------+---------------+------------------+
>>>>> +| arch/arm/arm64/cpufeature.c               | N             | Y                |
>>>>> +| arch/arm/arm64/insn.c                     | N             | Y                |
>>>>> +| arch/arm/arm64/lib/find_next_bit.c        | N             | Y                |
>>>>> +| arch/x86/acpi/boot.c                      | Y             | ?                |
>>>>> +| arch/x86/acpi/lib.c                       | ?             | ?                |
>>>>
>>>> This was simply split off from boot.c, which I'd be inclined to take to
>>>> mean Y in the "diverged" column. In the backports column I'd prefer to
>>>> keep ? for both, or any other indicator taken to mean "maybe / uncertain".
>>>>
>>>> What about arch/x86/acpi/cpufreq/* and other stuff in arch/x86/acpi/?
>>>>
>>>>> +| arch/x86/cpu/mcheck/mce-apei.c            | N             | Y                |
>>>>> +| arch/x86/cpu/mcheck/non-fatal.c           | ?             | Y                |
>>>>
>>>> Even before disappearing in 2.6.32 the file was different from Linux'es,
>>>> simply because we don't have anything like schedule_delayed_work(). So
>>>> it's pretty clearly Y for "diverged".
>>>>
>>>>> +| arch/x86/cpu/mtrr/*                       | Y             | N                |
>>>>> +| arch/x86/cpu/amd.c                        | Y             | N                |
>>>>> +| arch/x86/cpu/centaur.c                    | Y             | N                |
>>>>> +| arch/x86/cpu/common.c                     | Y             | N                |
>>>>> +| arch/x86/cpu/hygon.c                      | Y             | N                |
>>>>> +| arch/x86/cpu/intel_cacheinfo.c            | Y             | Y                |
>>>>> +| arch/x86/cpu/intel.c                      | Y             | N                |
>>>>> +| arch/x86/cpu/mwait-idle.c                 | Y             | Y                |
>>>>> +| arch/x86/genapic/*                        | Y             | N                |
>>>>> +| arch/x86/x86_64/mmconf-fam10h.c           | N             | Y                |
>>>>> +| arch/x86/dmi_scan.c                       | Y             | ?                |
>>>>> +| arch/x86/mpparse.c                        | Y             | ?                |
>>>>
>>>> Like above I'd like to keep ? (or alike) here, as neither Y nor N are
>>>> fully accurate.
>>>>
>>>>> +| arch/x86/srat.c                           | Y             | N                |
>>>>
>>>> What about common/cpu.c?
>>>>
>>>>> +| common/libfdt/* (libfdt)                  | N             | Y                |
>>>>> +| common/lz4/decompress.c                   | N             | Y                |
>>>>> +| common/ubsan/ubsan.c                      | Y             | Y                |
>>>>> +| common/xz/*                               | N             | Y                |
>>>>> +| common/zstd/*                             | N             | Y                |
>>>>> +| common/bitmap.c                           | N             | Y                |
>>>>> +| common/bunzip2.c                          | N             | Y                |
>>>>> +| common/earlycpio.c                        | N             | Y                |
>>>>> +| common/inflate.c                          | N             | Y                |
>>>>
>>>> What about common/notifier.c?
>>>>
>>>>> +| common/radix-tree.c                       | N             | Y                |
>>>>
>>>> What about common/rcupdate.c? (Stopping at this in this regard:
>>>> It's unclear by what criteria you have gone. Even as simple an
>>>> indicator as "Copyright (C) ... Linus Torvalds" was apparently not
>>> Please see [1]
>>>
>>>> used. Similarly mentioning criteria for considering a file
>>>> "diverged" would be very helpful to spell out, even if there's
>>>> likely some fuzziness involved there.)
>>>
>>> We would need to pre-define some criteria to avoid having a long justifications.
>>> Any ideas?
>>
>> Well, changing just #include-s to fit Xen's model shouldn't count as
>> divergence. But coding style conversion already may. I'm afraid
>> criteria here depend very much on the purpose, and hence I don't
>> feel qualified to suggest any.
> 
> Hi Jan,
> 
> These two columns are not for MISRA's benefit. They are for our own
> benefit as maintainers of this code. We can define them the way we want
> to.
> 
> MISRA doesn't allow us to make any exceptions to our coding guidelines
> for files originating from external sources (unless they are a proper
> library we link against, I don't think that even libfdt qualifies from
> what I understand.) We'll have to figure out what to do about that, but
> it is not something this patch is trying to solve. It is just trying to
> identify the external files.
> 
> So the two columns are just for us as maintainers. It is only to help
> us, not to help with MISRA or with safety. So if you think we should
> word the first column differently, or even remove the first column
> entirely, we could.
> 
> Maybe a better criteria for the first column would be: "do we accept
> changes to this file?" (direct changes, not backports)
> 
> 
>>>>> +| common/un*.c                              | N             | Y                |
>>>>> +| crypto/rijndael.c (OpenBSD)               | N             | Y                |
>>>>> +| crypto/vmac.c (public domain)             | N             | Y                |
>>>>> +| drivers/acpi/apei/*                       | N             | Y                |
>>>>
>>>> I'm not sure of the N here.
>>>>
>>>>> +| drivers/acpi/tables/*                     | N             | Y                |
>>>>> +| drivers/acpi/utilities/*                  | N             | Y                |
>>>>> +| drivers/acpi/hwregs.c                     | N             | Y                |
>>>>> +| drivers/acpi/numa.c                       | ?             | Y                |
>>>>
>>>> Y
>>>>
>>>>> +| drivers/acpi/osl.c                        | Y             | Y                |
>>>>> +| drivers/acpi/reboot.c                     | N             | Y                |
>>>>> +| drivers/acpi/tables.c                     | ?             | Y                |
>>>>
>>>> Y
>>>>
>>>> What about drivers/cpufreq/*, drivers/char/ehci-dbgp.c,
>>>> drivers/char/xhci-dbc.c, and drivers/video/font*? What about some of
>>>> the stuff under tools/, especially tools/kconfig/?
>>>
>>> [1]
>>> For the first shot, the criteria was to list files using different coding style than Xen,
>>> especially the ones using tabs instead of spaces. As I indicated before, the list may not be
>>> completed, hence a gentle ask to list the missing ones. Some of the files you mentioned
>>> use Xen coding style + there is no information in the git history that they originated from
>>> external sources. This is why, the maintainers who are the addressee of this RFC should have
>>> a better knowledge of the origin of such files.
>>
>> Hmm. Please forgive me being blunt, but to me this then looks like
>> offloading work to people who shouldn't be required to invest
>> meaningful amounts of time. But maybe that's just me viewing it this
>> way ... Yet this is particularly relevant if ...
>>
>>> As for the files under tools/, FWICS they are being filtered-out from MISRA checks, hence I
>>> did not list them.
>>
>> ... the goal here then indeed is use for MISRA alone. I did e.g. ask
>> whether it wouldn't be worthwhile to more precisely describe the
>> origin of files because at some point in the past it was also
>> proposed to arrange for some more automatic monitoring of changes
>> being applied at their origins for files we have cloned. Which
>> obviously first of all requires establishing an association between
>> our files and their origins.
> 
> One of the goals has certainly to do with MISRA, but I think we would
> want to know which files are not conforming to the Xen coding style and
> coding guidelines anyway? For example, we need it to automate coding
> style checks for new patches with scripts like checkpatch.
> 
> And you are right, also adding the origin of the files to help with
> backports and monitoring is a great idea.
> 
> On the extra work: some of us in the community have been around for a
> long time. Without having to do any research, there are things you might
> remember on top of your head. If you don't remember, that's fine and we
> can try to do some investigation/archeology.

I managed to extend the list with additional 21 files (+alphabetization done)
thanks to Jan's suggestions, checking for Linux copyright and using a black magic
(e.g. for common/cpu.c it was almost impossible to deduct that it originated from Linux).
The table looks as follows for now (I put ? in the second column for new files):

| arch/arm/arm64/cpufeature.c               | N             | Y                |
| arch/arm/arm64/insn.c                     | N             | Y                |
| arch/arm/arm64/lib/find_next_bit.c        | N             | Y                |
| arch/x86/acpi/boot.c                      | Y             | ?                |
| arch/x86/acpi/cpu_idle.c                  | Y             | ?                |
| arch/x86/acpi/cpufreq/cpufreq.c           | Y             | ?                |
| arch/x86/acpi/cpuidle_menu.c              | Y             | ?                |
| arch/x86/acpi/lib.c                       | Y             | ?                |
| arch/x86/acpi/power.c                     | Y             | ?                |
| arch/x86/cpu/amd.c                        | Y             | N                |
| arch/x86/cpu/centaur.c                    | Y             | N                |
| arch/x86/cpu/common.c                     | Y             | N                |
| arch/x86/cpu/hygon.c                      | Y             | N                |
| arch/x86/cpu/intel.c                      | Y             | N                |
| arch/x86/cpu/intel_cacheinfo.c            | Y             | Y                |
| arch/x86/cpu/mcheck/mce-apei.c            | N             | Y                |
| arch/x86/cpu/mcheck/non-fatal.c           | Y             | Y                |
| arch/x86/cpu/mtrr/*                       | Y             | N                |
| arch/x86/cpu/mwait-idle.c                 | Y             | Y                |
| arch/x86/delay.c                          | Y             | ?                |
| arch/x86/dmi_scan.c                       | Y             | ?                |
| arch/x86/domain.c                         | Y             | ?                |
| arch/x86/genapic/*                        | Y             | N                |
| arch/x86/i387.c                           | Y             | ?                |
| arch/x86/irq.c                            | Y             | ?                |
| arch/x86/mpparse.c                        | Y             | ?                |
| arch/x86/srat.c                           | Y             | N                |
| arch/x86/time.c                           | Y             | ?                |
| arch/x86/traps.c                          | Y             | ?                |
| arch/x86/usercopy.c                       | Y             | ?                |
| arch/x86/x86_64/mmconf-fam10h.c           | N             | Y                |
| common/bitmap.c                           | N             | Y                |
| common/bunzip2.c                          | N             | Y                |
| common/cpu.c                              | Y             | ?                |
| common/earlycpio.c                        | N             | Y                |
| common/inflate.c                          | N             | Y                |
| common/libfdt/* (libfdt)                  | N             | Y                |
| common/lz4/decompress.c                   | N             | Y                |
| common/notifier.c                         | Y             | ?                |
| common/radix-tree.c                       | N             | Y                |
| common/rcupdate.c                         | Y             | ?                |
| common/softirq.c                          | Y             | ?                |
| common/tasklet.c                          | Y             | ?                |
| common/ubsan/ubsan.c                      | Y             | Y                |
| common/un*.c                              | N             | Y                |
| common/vsprintf.c                         | Y             | ?                |
| common/xz/*                               | N             | Y                |
| common/zstd/*                             | N             | Y                |
| crypto/rijndael.c (OpenBSD)               | N             | Y                |
| crypto/vmac.c (public domain)             | N             | Y                |
| drivers/acpi/apei/*                       | Y             | Y                |
| drivers/acpi/hwregs.c                     | N             | Y                |
| drivers/acpi/numa.c                       | Y             | Y                |
| drivers/acpi/osl.c                        | Y             | Y                |
| drivers/acpi/reboot.c                     | N             | Y                |
| drivers/acpi/tables.c                     | Y             | Y                |
| drivers/acpi/tables/*                     | N             | Y                |
| drivers/acpi/utilities/*                  | N             | Y                |
| drivers/char/ehci-dbgp.c                  | Y             | ?                |
| drivers/char/xhci-dbc.c                   | Y             | ?                |
| drivers/cpufreq/*                         | Y             | ?                |
| drivers/passthrough/arm/smmu-v3.c         | Y             | N                |
| drivers/passthrough/arm/smmu.c            | Y             | N                |
| drivers/video/font_*                      | Y             | ?                |
| lib/list-sort.c                           | N             | Y                |
| lib/mem*.c                                | N             | Y                |
| lib/rbtree.c                              | N             | Y                |
| lib/str*.c                                | N             | Y                |
| lib/xxhash32.c                            | N             | Y                |
| lib/xxhash64.c                            | N             | Y                |

~Michal