[PATCH v6 6/9] docs: add doxygen preprocessor and related files

Luca Fancellu posted 9 patches 3 years, 6 months ago
There is a newer version of this series
[PATCH v6 6/9] docs: add doxygen preprocessor and related files
Posted by Luca Fancellu 3 years, 6 months ago
Add preprocessor called by doxygen before parsing headers,
it will include in every header a doxygen_include.h file
that provides missing defines and includes that are
usually passed by the compiler.

Add doxy_input.list that is a text file containing the
relative path to the source code file to be parsed by
doxygen. The path sould be relative to the xen folder:
E.g. xen/include/public/grant_table.h

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
 docs/xen-doxygen/doxy-preprocessor.py | 110 ++++++++++++++++++++++++++
 docs/xen-doxygen/doxy_input.list      |   0
 docs/xen-doxygen/doxygen_include.h.in |  32 ++++++++
 3 files changed, 142 insertions(+)
 create mode 100755 docs/xen-doxygen/doxy-preprocessor.py
 create mode 100644 docs/xen-doxygen/doxy_input.list
 create mode 100644 docs/xen-doxygen/doxygen_include.h.in

diff --git a/docs/xen-doxygen/doxy-preprocessor.py b/docs/xen-doxygen/doxy-preprocessor.py
new file mode 100755
index 0000000000..496899d8e6
--- /dev/null
+++ b/docs/xen-doxygen/doxy-preprocessor.py
@@ -0,0 +1,110 @@
+#!/usr/bin/python3
+#
+# Copyright (c) 2021, Arm Limited.
+#
+# SPDX-License-Identifier: GPL-2.0
+#
+
+import os, sys, re
+
+
+# Variables that holds the preprocessed header text
+output_text = ""
+header_file_name = ""
+
+# Variables to enumerate the anonymous structs/unions
+anonymous_struct_count = 0
+anonymous_union_count = 0
+
+
+def error(text):
+    sys.stderr.write("{}\n".format(text))
+    sys.exit(1)
+
+
+def write_to_output(text):
+    sys.stdout.write(text)
+
+
+def insert_doxygen_header(text):
+    # Here the strategy is to insert the #include <doxygen_include.h> in the
+    # first line of the header
+    abspath = os.path.dirname(os.path.abspath(__file__))
+    text += "#include \"{}/doxygen_include.h\"\n".format(abspath)
+
+    return text
+
+
+def enumerate_anonymous(match):
+    global anonymous_struct_count
+    global anonymous_union_count
+
+    if "struct" in match.group(1):
+        label = "anonymous_struct_%d" % anonymous_struct_count
+        anonymous_struct_count += 1
+    else:
+        label = "anonymous_union_%d" % anonymous_union_count
+        anonymous_union_count += 1
+
+    return match.group(1) + " " + label + " {"
+
+
+def manage_anonymous_structs_unions(text):
+    # Match anonymous unions/structs with this pattern:
+    # struct/union {
+    #     [...]
+    #
+    # and substitute it in this way:
+    #
+    # struct anonymous_struct_# {
+    #     [...]
+    # or
+    # union anonymous_union_# {
+    #     [...]
+    # where # is a counter starting from zero, different between structs and
+    # unions
+    #
+    # We don't count anonymous union/struct that are part of a typedef because
+    # they don't create any issue for doxygen
+    text = re.sub(
+        "(?<!typedef\s)(struct|union)\s+?\{",
+        enumerate_anonymous,
+        text,
+        flags=re.S
+    )
+
+    return text
+
+
+def main(argv):
+    global output_text
+    global header_file_name
+
+    if len(argv) != 1:
+        error("Script called without arguments!")
+
+    header_file_name = argv[0]
+
+    # Open header file
+    input_header_file = open(header_file_name, 'r')
+    # Read all lines
+    lines = input_header_file.readlines()
+
+    # Inject config.h and some defines in the current header, during compilation
+    # this job is done by the -include argument passed to the compiler.
+    output_text = insert_doxygen_header(output_text)
+
+    # Load file content in a variable
+    for line in lines:
+        output_text += "{}".format(line)
+
+    # Try to get rid of any anonymous union/struct
+    output_text = manage_anonymous_structs_unions(output_text)
+
+    # Final stage of the preprocessor, print the output to stdout
+    write_to_output(output_text)
+
+
+if __name__ == "__main__":
+    main(sys.argv[1:])
+    sys.exit(0)
diff --git a/docs/xen-doxygen/doxy_input.list b/docs/xen-doxygen/doxy_input.list
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/docs/xen-doxygen/doxygen_include.h.in b/docs/xen-doxygen/doxygen_include.h.in
new file mode 100644
index 0000000000..df284f3931
--- /dev/null
+++ b/docs/xen-doxygen/doxygen_include.h.in
@@ -0,0 +1,32 @@
+/*
+ * Doxygen include header
+ * It supplies the xen/include/xen/config.h that is included using the -include
+ * argument of the compiler in Xen Makefile.
+ * Other macros are defined because they are usually provided by the compiler.
+ *
+ * Copyright (C) 2021 ARM Limited
+ *
+ * Author: Luca Fancellu <luca.fancellu@arm.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include "@XEN_BASE@/xen/include/xen/config.h"
+
+#if defined(CONFIG_X86_64)
+
+#define __x86_64__ 1
+
+#elif defined(CONFIG_ARM_64)
+
+#define __aarch64__ 1
+
+#elif defined(CONFIG_ARM_32)
+
+#define __arm__ 1
+
+#else
+
+#error Architecture not supported/recognized.
+
+#endif
-- 
2.17.1


Re: [PATCH v6 6/9] docs: add doxygen preprocessor and related files
Posted by Stefano Stabellini 3 years, 5 months ago
On Mon, 10 May 2021, Luca Fancellu wrote:
> Add preprocessor called by doxygen before parsing headers,
> it will include in every header a doxygen_include.h file
> that provides missing defines and includes that are
> usually passed by the compiler.
> 
> Add doxy_input.list that is a text file containing the
> relative path to the source code file to be parsed by
> doxygen. The path sould be relative to the xen folder:
> E.g. xen/include/public/grant_table.h
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
>  docs/xen-doxygen/doxy-preprocessor.py | 110 ++++++++++++++++++++++++++
>  docs/xen-doxygen/doxy_input.list      |   0
>  docs/xen-doxygen/doxygen_include.h.in |  32 ++++++++
>  3 files changed, 142 insertions(+)
>  create mode 100755 docs/xen-doxygen/doxy-preprocessor.py
>  create mode 100644 docs/xen-doxygen/doxy_input.list
>  create mode 100644 docs/xen-doxygen/doxygen_include.h.in
> 
> diff --git a/docs/xen-doxygen/doxy-preprocessor.py b/docs/xen-doxygen/doxy-preprocessor.py
> new file mode 100755
> index 0000000000..496899d8e6
> --- /dev/null
> +++ b/docs/xen-doxygen/doxy-preprocessor.py
> @@ -0,0 +1,110 @@
> +#!/usr/bin/python3
> +#
> +# Copyright (c) 2021, Arm Limited.
> +#
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +
> +import os, sys, re
> +
> +
> +# Variables that holds the preprocessed header text
> +output_text = ""
> +header_file_name = ""
> +
> +# Variables to enumerate the anonymous structs/unions
> +anonymous_struct_count = 0
> +anonymous_union_count = 0
> +
> +
> +def error(text):
> +    sys.stderr.write("{}\n".format(text))
> +    sys.exit(1)
> +
> +
> +def write_to_output(text):
> +    sys.stdout.write(text)
> +
> +
> +def insert_doxygen_header(text):
> +    # Here the strategy is to insert the #include <doxygen_include.h> in the
> +    # first line of the header
> +    abspath = os.path.dirname(os.path.abspath(__file__))
> +    text += "#include \"{}/doxygen_include.h\"\n".format(abspath)
> +
> +    return text
> +
> +
> +def enumerate_anonymous(match):
> +    global anonymous_struct_count
> +    global anonymous_union_count
> +
> +    if "struct" in match.group(1):
> +        label = "anonymous_struct_%d" % anonymous_struct_count
> +        anonymous_struct_count += 1
> +    else:
> +        label = "anonymous_union_%d" % anonymous_union_count
> +        anonymous_union_count += 1
> +
> +    return match.group(1) + " " + label + " {"
> +
> +
> +def manage_anonymous_structs_unions(text):
> +    # Match anonymous unions/structs with this pattern:
> +    # struct/union {
> +    #     [...]
> +    #
> +    # and substitute it in this way:
> +    #
> +    # struct anonymous_struct_# {
> +    #     [...]
> +    # or
> +    # union anonymous_union_# {
> +    #     [...]
> +    # where # is a counter starting from zero, different between structs and
> +    # unions
> +    #
> +    # We don't count anonymous union/struct that are part of a typedef because
> +    # they don't create any issue for doxygen
> +    text = re.sub(
> +        "(?<!typedef\s)(struct|union)\s+?\{",
> +        enumerate_anonymous,
> +        text,
> +        flags=re.S
> +    )

My python is a bit rusty but I thought this is really clever!

One question: given that anonymous_struct_count is local per file being
processed, it always starts at 0 for each header. I think that is
actually better from a documentation readability point of view.

However, is it possible that Doxygen gets confused in a case where we
can multiple "struct anonymous_struct_0", e.g. one for grant_table.h,
one for event_channel.h, etc. ?

Re: [PATCH v6 6/9] docs: add doxygen preprocessor and related files
Posted by Luca Fancellu 3 years, 5 months ago

> On 23 Jun 2021, at 23:03, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Mon, 10 May 2021, Luca Fancellu wrote:
>> Add preprocessor called by doxygen before parsing headers,
>> it will include in every header a doxygen_include.h file
>> that provides missing defines and includes that are
>> usually passed by the compiler.
>> 
>> Add doxy_input.list that is a text file containing the
>> relative path to the source code file to be parsed by
>> doxygen. The path sould be relative to the xen folder:
>> E.g. xen/include/public/grant_table.h
>> 
>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>> ---
>> docs/xen-doxygen/doxy-preprocessor.py | 110 ++++++++++++++++++++++++++
>> docs/xen-doxygen/doxy_input.list      |   0
>> docs/xen-doxygen/doxygen_include.h.in |  32 ++++++++
>> 3 files changed, 142 insertions(+)
>> create mode 100755 docs/xen-doxygen/doxy-preprocessor.py
>> create mode 100644 docs/xen-doxygen/doxy_input.list
>> create mode 100644 docs/xen-doxygen/doxygen_include.h.in
>> 
>> diff --git a/docs/xen-doxygen/doxy-preprocessor.py b/docs/xen-doxygen/doxy-preprocessor.py
>> new file mode 100755
>> index 0000000000..496899d8e6
>> --- /dev/null
>> +++ b/docs/xen-doxygen/doxy-preprocessor.py
>> @@ -0,0 +1,110 @@
>> +#!/usr/bin/python3
>> +#
>> +# Copyright (c) 2021, Arm Limited.
>> +#
>> +# SPDX-License-Identifier: GPL-2.0
>> +#
>> +
>> +import os, sys, re
>> +
>> +
>> +# Variables that holds the preprocessed header text
>> +output_text = ""
>> +header_file_name = ""
>> +
>> +# Variables to enumerate the anonymous structs/unions
>> +anonymous_struct_count = 0
>> +anonymous_union_count = 0
>> +
>> +
>> +def error(text):
>> +    sys.stderr.write("{}\n".format(text))
>> +    sys.exit(1)
>> +
>> +
>> +def write_to_output(text):
>> +    sys.stdout.write(text)
>> +
>> +
>> +def insert_doxygen_header(text):
>> +    # Here the strategy is to insert the #include <doxygen_include.h> in the
>> +    # first line of the header
>> +    abspath = os.path.dirname(os.path.abspath(__file__))
>> +    text += "#include \"{}/doxygen_include.h\"\n".format(abspath)
>> +
>> +    return text
>> +
>> +
>> +def enumerate_anonymous(match):
>> +    global anonymous_struct_count
>> +    global anonymous_union_count
>> +
>> +    if "struct" in match.group(1):
>> +        label = "anonymous_struct_%d" % anonymous_struct_count
>> +        anonymous_struct_count += 1
>> +    else:
>> +        label = "anonymous_union_%d" % anonymous_union_count
>> +        anonymous_union_count += 1
>> +
>> +    return match.group(1) + " " + label + " {"
>> +
>> +
>> +def manage_anonymous_structs_unions(text):
>> +    # Match anonymous unions/structs with this pattern:
>> +    # struct/union {
>> +    #     [...]
>> +    #
>> +    # and substitute it in this way:
>> +    #
>> +    # struct anonymous_struct_# {
>> +    #     [...]
>> +    # or
>> +    # union anonymous_union_# {
>> +    #     [...]
>> +    # where # is a counter starting from zero, different between structs and
>> +    # unions
>> +    #
>> +    # We don't count anonymous union/struct that are part of a typedef because
>> +    # they don't create any issue for doxygen
>> +    text = re.sub(
>> +        "(?<!typedef\s)(struct|union)\s+?\{",
>> +        enumerate_anonymous,
>> +        text,
>> +        flags=re.S
>> +    )

Hi Stefano,

> 
> My python is a bit rusty but I thought this is really clever!
> 
> One question: given that anonymous_struct_count is local per file being
> processed, it always starts at 0 for each header. I think that is
> actually better from a documentation readability point of view.
> 
> However, is it possible that Doxygen gets confused in a case where we
> can multiple "struct anonymous_struct_0", e.g. one for grant_table.h,
> one for event_channel.h, etc. ?

Yes this is a very good point, I did some experiment and it can happen, however so far we didn’t notice any
problem because all the anonymous union/struct were part of other data structure, in that case there is no
problem at all because we have upper_data_structure::anonymous_{struct/union}_0/1/2…

So given the fact that is difficult to have clash, I would say we can handle any future case separately.

Having a global numbering can solve the issue but because the script is called separately for each header,
Implementing it will involve some changes, there should be a file to maintain the number across invocation
and so on.

Let me know what do you think about that and if in your opinion we can proceed the way it is now.

Cheers,

Luca




Re: [PATCH v6 6/9] docs: add doxygen preprocessor and related files
Posted by Stefano Stabellini 3 years, 5 months ago
On Thu, 1 Jul 2021, Luca Fancellu wrote:
> > On 23 Jun 2021, at 23:03, Stefano Stabellini <sstabellini@kernel.org> wrote:
> > 
> > On Mon, 10 May 2021, Luca Fancellu wrote:
> >> Add preprocessor called by doxygen before parsing headers,
> >> it will include in every header a doxygen_include.h file
> >> that provides missing defines and includes that are
> >> usually passed by the compiler.
> >> 
> >> Add doxy_input.list that is a text file containing the
> >> relative path to the source code file to be parsed by
> >> doxygen. The path sould be relative to the xen folder:
> >> E.g. xen/include/public/grant_table.h
> >> 
> >> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> >> ---
> >> docs/xen-doxygen/doxy-preprocessor.py | 110 ++++++++++++++++++++++++++
> >> docs/xen-doxygen/doxy_input.list      |   0
> >> docs/xen-doxygen/doxygen_include.h.in |  32 ++++++++
> >> 3 files changed, 142 insertions(+)
> >> create mode 100755 docs/xen-doxygen/doxy-preprocessor.py
> >> create mode 100644 docs/xen-doxygen/doxy_input.list
> >> create mode 100644 docs/xen-doxygen/doxygen_include.h.in
> >> 
> >> diff --git a/docs/xen-doxygen/doxy-preprocessor.py b/docs/xen-doxygen/doxy-preprocessor.py
> >> new file mode 100755
> >> index 0000000000..496899d8e6
> >> --- /dev/null
> >> +++ b/docs/xen-doxygen/doxy-preprocessor.py
> >> @@ -0,0 +1,110 @@
> >> +#!/usr/bin/python3
> >> +#
> >> +# Copyright (c) 2021, Arm Limited.
> >> +#
> >> +# SPDX-License-Identifier: GPL-2.0
> >> +#
> >> +
> >> +import os, sys, re
> >> +
> >> +
> >> +# Variables that holds the preprocessed header text
> >> +output_text = ""
> >> +header_file_name = ""
> >> +
> >> +# Variables to enumerate the anonymous structs/unions
> >> +anonymous_struct_count = 0
> >> +anonymous_union_count = 0
> >> +
> >> +
> >> +def error(text):
> >> +    sys.stderr.write("{}\n".format(text))
> >> +    sys.exit(1)
> >> +
> >> +
> >> +def write_to_output(text):
> >> +    sys.stdout.write(text)
> >> +
> >> +
> >> +def insert_doxygen_header(text):
> >> +    # Here the strategy is to insert the #include <doxygen_include.h> in the
> >> +    # first line of the header
> >> +    abspath = os.path.dirname(os.path.abspath(__file__))
> >> +    text += "#include \"{}/doxygen_include.h\"\n".format(abspath)
> >> +
> >> +    return text
> >> +
> >> +
> >> +def enumerate_anonymous(match):
> >> +    global anonymous_struct_count
> >> +    global anonymous_union_count
> >> +
> >> +    if "struct" in match.group(1):
> >> +        label = "anonymous_struct_%d" % anonymous_struct_count
> >> +        anonymous_struct_count += 1
> >> +    else:
> >> +        label = "anonymous_union_%d" % anonymous_union_count
> >> +        anonymous_union_count += 1
> >> +
> >> +    return match.group(1) + " " + label + " {"
> >> +
> >> +
> >> +def manage_anonymous_structs_unions(text):
> >> +    # Match anonymous unions/structs with this pattern:
> >> +    # struct/union {
> >> +    #     [...]
> >> +    #
> >> +    # and substitute it in this way:
> >> +    #
> >> +    # struct anonymous_struct_# {
> >> +    #     [...]
> >> +    # or
> >> +    # union anonymous_union_# {
> >> +    #     [...]
> >> +    # where # is a counter starting from zero, different between structs and
> >> +    # unions
> >> +    #
> >> +    # We don't count anonymous union/struct that are part of a typedef because
> >> +    # they don't create any issue for doxygen
> >> +    text = re.sub(
> >> +        "(?<!typedef\s)(struct|union)\s+?\{",
> >> +        enumerate_anonymous,
> >> +        text,
> >> +        flags=re.S
> >> +    )
> 
> Hi Stefano,
> 
> > 
> > My python is a bit rusty but I thought this is really clever!
> > 
> > One question: given that anonymous_struct_count is local per file being
> > processed, it always starts at 0 for each header. I think that is
> > actually better from a documentation readability point of view.
> > 
> > However, is it possible that Doxygen gets confused in a case where we
> > can multiple "struct anonymous_struct_0", e.g. one for grant_table.h,
> > one for event_channel.h, etc. ?
> 
> Yes this is a very good point, I did some experiment and it can happen, however so far we didn’t notice any
> problem because all the anonymous union/struct were part of other data structure, in that case there is no
> problem at all because we have upper_data_structure::anonymous_{struct/union}_0/1/2…
> 
> So given the fact that is difficult to have clash, I would say we can handle any future case separately.
> 
> Having a global numbering can solve the issue but because the script is called separately for each header,
> Implementing it will involve some changes, there should be a file to maintain the number across invocation
> and so on.
> 
> Let me know what do you think about that and if in your opinion we can proceed the way it is now.

I think anonymous union/struct are always part of another data struct in
our headers, so we shouldn't have a problem. I think we can get away
without a global numbering system. Maybe you should add a note about
this limitation in the commit message or an in-code comment.