[RFC PATCH v1] imagebuilder: Add a script to check the sanity of device tree

Ayan Kumar Halder posted 1 patch 1 month, 4 weeks ago
Failed in applying to current master (apply log)
README.md            | 13 +++++++++++++
scripts/dt_sanity.py | 33 +++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
create mode 100644 scripts/dt_sanity.py
[RFC PATCH v1] imagebuilder: Add a script to check the sanity of device tree
Posted by Ayan Kumar Halder 1 month, 4 weeks ago
Xen gives a panic if certain nodes are not present in the device tree. In order
to prevent this panic, scripts/dt_sanity.py is written so that it checks if the
node/s are present. If the node/s are not present, the script gives an error.

User is expected to run the script against the device tree before booting Xen
with dtb.

Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
---

Hi,
 
In some of the discussions with the safety experts and upstream folks, one issue
that kept coming up is there are lots of ‘faulty system configuration’ and
‘impossible conditions’ checks in Xen.  While these conditions can rarely occur,
Xen would panic if any of such condition does occur.
 
For example, during bootup, Xen parses the device tree .
It checks if the device tree nodes are present for timer, interrupt-controller,
memory, cpu, etc. If these nodes are not present, Xen panics.
 
As part of safety certification, we have 3 aims :-
1. We want to reduce the instances where Xen can panic. This is to improve the
robustness.

2. We need to define a safe state when a fault is triggered in Xen. As faults
(like the one mentioned here) are triggered during boot time and it is due to
incorrect system configuration in device tree, it is hard to define a safe state.

3. Avoid validating all the instances of system configuration errors. By having
an external tool, we push the responsibility to the system integrator. The system
integrator needs to run the tool to validate all the properties that Xen checks
for. This can be a justification for the coverage gap for those checks in Xen.
 
Thus, I have come up with the attached python script. In the script, we parse the
device tree to check if the nodes with the compatible properties (as specified in
config file) are present. If not, the script will throw an error.
 
 README.md            | 13 +++++++++++++
 scripts/dt_sanity.py | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)
 create mode 100644 scripts/dt_sanity.py

diff --git a/README.md b/README.md
index 7b68cf5..413de3f 100644
--- a/README.md
+++ b/README.md
@@ -456,3 +456,16 @@ This section defines config file debug options
 
 - DBG_FDT_PRINT_CHOSEN specifies that U-Boot script command to print DT "chosen"
   node will be added to the boot script.
+
+## dt_sanity.py
+
+This script parses xen device tree source and checks if the required nodes are
+present. If not, the script gives an error.
+
+To use it, first write a config file like `config` where you can keep the
+compatible strings to be checked:
+
+```
+arm,gic-v3
+arm,armv8-timer
+```
diff --git a/scripts/dt_sanity.py b/scripts/dt_sanity.py
new file mode 100644
index 0000000..171947f
--- /dev/null
+++ b/scripts/dt_sanity.py
@@ -0,0 +1,33 @@
+import argparse
+from pydevicetree import Devicetree
+import sys
+
+def load_compatible_strings(config_path):
+    with open(config_path, 'r') as file:
+        return [line.strip() for line in file if line.strip()]
+
+def check_compatible_nodes(dts_path):
+    # Parse the DTS file
+    tree = Devicetree.parseFile(dts_path)
+
+    # Search nodes for compatible properties in the global array
+    for compatible in compatible_strings:
+        nodes = tree.match(compatible)
+        if len(nodes) == 0:
+            print(f"Error: Node with compatible '{compatible}' not found.")
+            sys.exit(1)
+
+if __name__ == "__main__":
+    # Set up argument parser
+    parser = argparse.ArgumentParser(description="Check for Xen specific nodes in a DTS file.")
+    parser.add_argument("dts_file", help="Path to the DTS file")
+    parser.add_argument("config_file", help="Path to the configuration file with compatible strings")
+
+    # Parse arguments
+    args = parser.parse_args()
+
+    # Load compatible strings from the config file
+    compatible_strings = load_compatible_strings(args.config_file)
+
+    # Use the provided DTS file path
+    check_compatible_nodes(args.dts_file)
-- 
2.25.1


Re: [RFC PATCH v1] imagebuilder: Add a script to check the sanity of device tree
Posted by Ayan Kumar Halder 1 month, 1 week ago
Hi,

After the FMEA discussion at Xen summit, I want to put a clarification.

On 01/09/2025 13:31, Ayan Kumar Halder wrote:
> Xen gives a panic if certain nodes are not present in the device tree. In order
> to prevent this panic, scripts/dt_sanity.py is written so that it checks if the
> node/s are present. If the node/s are not present, the script gives an error.
>
> User is expected to run the script against the device tree before booting Xen
> with dtb.
>
> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
> ---
>
> Hi,
>   
> In some of the discussions with the safety experts and upstream folks, one issue
> that kept coming up is there are lots of ‘faulty system configuration’ and
> ‘impossible conditions’ checks in Xen.  While these conditions can rarely occur,
> Xen would panic if any of such condition does occur.
>   
> For example, during bootup, Xen parses the device tree .
> It checks if the device tree nodes are present for timer, interrupt-controller,
> memory, cpu, etc. If these nodes are not present, Xen panics.
>   
> As part of safety certification, we have 3 aims :-
> 1. We want to reduce the instances where Xen can panic. This is to improve the
> robustness.
>
> 2. We need to define a safe state when a fault is triggered in Xen. As faults
> (like the one mentioned here) are triggered during boot time and it is due to
> incorrect system configuration in device tree, it is hard to define a safe state.
>
> 3. Avoid validating all the instances of system configuration errors. By having
> an external tool, we push the responsibility to the system integrator. The system
> integrator needs to run the tool to validate all the properties that Xen checks
> for. This can be a justification for the coverage gap for those checks in Xen.

This isn't true as we will have tests to validate all possible system 
configuration errors. However, I want to use this script as a 
'prevention' mechanism to check for the sanity of device tree (which can 
be offloaded to the system integrator). There could be many of such 
errors arising from misconfiguration in device tree. Wherever possible, 
we will use a script or we can explain how to identify these errors 
before Xen boots.

We want to convey that certain failures in Xen are not possible (or 
there is atleast some mitigation) if the user has read the FMEA or 
safety manual or public documents.

- Ayan


Re: [RFC PATCH v1] imagebuilder: Add a script to check the sanity of device tree
Posted by Orzel, Michal 1 month, 4 weeks ago

On 01/09/2025 14:31, Ayan Kumar Halder wrote:
> Xen gives a panic if certain nodes are not present in the device tree. In order
> to prevent this panic, scripts/dt_sanity.py is written so that it checks if the
> node/s are present. If the node/s are not present, the script gives an error.
> 
> User is expected to run the script against the device tree before booting Xen
> with dtb.
I would expect the script to know what to look for in a passed dtb. Otherwise
it's just a script that searches for the list of specified compatibilities by
the user. In other words, a simple grep would also do the job. Compatibility is
not everything, there are other things that would prevent Xen from booting.
Also, often times the dtb is modified by the bootloader before passing control
over to Xen (i.e. it may differ from the base dtb).

~Michal
Re: [RFC PATCH v1] imagebuilder: Add a script to check the sanity of device tree
Posted by Ayan Kumar Halder 1 month, 4 weeks ago
Hi Michal,

On 01/09/2025 14:17, Orzel, Michal wrote:
>
> On 01/09/2025 14:31, Ayan Kumar Halder wrote:
>> Xen gives a panic if certain nodes are not present in the device tree. In order
>> to prevent this panic, scripts/dt_sanity.py is written so that it checks if the
>> node/s are present. If the node/s are not present, the script gives an error.
>>
>> User is expected to run the script against the device tree before booting Xen
>> with dtb.
> I would expect the script to know what to look for in a passed dtb. Otherwise
> it's just a script that searches for the list of specified compatibilities by
> the user. In other words, a simple grep would also do the job.

I agree. The main idea is to check that a dt node with 
'compatible="arm,gic-v3"' property is present. We can enhance the script 
as we go forward.

> Compatibility is
> not everything, there are other things that would prevent Xen from booting.

Agreed. The rationale here is to shift the responsibility to the system 
integrator to run some validation so that Xen does not catch these errors.

We wish to provide some sample tools to achieve this aim.

Another category of errors are the unsupported register configurations. 
For eg gicv3_info.nr_lrs can be supported with a max value of 16. So if 
we have a baremetal program that user can execute before running Xen, 
then the user can catch these errors.

If we are able to catch these errors outside of Xen, then we can claim 
such panic() will not be triggered in Xen. There will be certain 
assumptions to make like ....

> Also, often times the dtb is modified by the bootloader before passing control
> over to Xen (i.e. it may differ from the base dtb).

bootloader is not expected to modify the nodes which are sanitised by 
the external tool.

- Ayan
Re: [RFC PATCH v1] imagebuilder: Add a script to check the sanity of device tree
Posted by Ayan Kumar Halder 1 month, 3 weeks ago
Hi,

On 01/09/2025 14:51, Ayan Kumar Halder wrote:
> Hi Michal,
>
> On 01/09/2025 14:17, Orzel, Michal wrote:
>>
>> On 01/09/2025 14:31, Ayan Kumar Halder wrote:
>>> Xen gives a panic if certain nodes are not present in the device 
>>> tree. In order
>>> to prevent this panic, scripts/dt_sanity.py is written so that it 
>>> checks if the
>>> node/s are present. If the node/s are not present, the script gives 
>>> an error.
>>>
>>> User is expected to run the script against the device tree before 
>>> booting Xen
>>> with dtb.
>>
One thing I forgot to mention is that as part of safety certification, 
we do need to do "Failure mode and error analysis". This means 
describing the scenarios in which Xen can fail to perform its regular 
functionality and coming up with prevention, detection and mitigation 
measures.

One can argue that the panics caused by system misconfiguration, are the 
most straightforward of all the errors. However, we do need to define 
prevention mechanisms to avoid these panics. For this particular 
failure, the prevention mechanism can be described as manually looking 
into the device tree to ensure that the nodes expected by Xen, are 
present. The script aims to provide a better alternative.

This script is not meant to catch all possible panics. However we do 
want to have such scripts and utilities wherever possible, and document 
them as part of our FMEA.

May be a safety expert can comment if the approach makes sense.

- Ayan
Re: [RFC PATCH v1] imagebuilder: Add a script to check the sanity of device tree
Posted by Andrew Cooper 1 month, 4 weeks ago
On 01/09/2025 1:31 pm, Ayan Kumar Halder wrote:
> diff --git a/scripts/dt_sanity.py b/scripts/dt_sanity.py
> new file mode 100644
> index 0000000..171947f
> --- /dev/null
> +++ b/scripts/dt_sanity.py
> @@ -0,0 +1,33 @@

Shebang

> +import argparse
> +from pydevicetree import Devicetree

pydevicetree isn't part of the standard library.  Readme should note the
external dependency.  Also it should be separate from stdlib imports.

>  sys
> +
> +def load_compatible_strings(config_path):
> +    with open(config_path, 'r') as file:
> +        return [line.strip() for line in file if line.strip()]

You should choose a different name from file as it shadows a keyword.

Also, you want something more like this

with open() as f:
    for l in f.readlines():
        l.strip()
        if l:
            yield l

but you really ought to also consider supporting comments in this config
file.

> +
> +def check_compatible_nodes(dts_path):
> +    # Parse the DTS file
> +    tree = Devicetree.parseFile(dts_path)
> +
> +    # Search nodes for compatible properties in the global array
> +    for compatible in compatible_strings:
> +        nodes = tree.match(compatible)
> +        if len(nodes) == 0:

if not nodes:

> +            print(f"Error: Node with compatible '{compatible}' not found.")

f-strings are not supported in our minimum version of python.  Use % or
.format().

~Andrew