README.md | 13 +++++++++++++ scripts/dt_sanity.py | 33 +++++++++++++++++++++++++++++++++ 2 files changed, 46 insertions(+) create mode 100644 scripts/dt_sanity.py
Xen gives a panic if certain nodes are not present in the device tree. In order
to prevent this panic, scripts/dt_sanity.py is written so that it checks if the
node/s are present. If the node/s are not present, the script gives an error.
User is expected to run the script against the device tree before booting Xen
with dtb.
Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
---
Hi,
In some of the discussions with the safety experts and upstream folks, one issue
that kept coming up is there are lots of ‘faulty system configuration’ and
‘impossible conditions’ checks in Xen. While these conditions can rarely occur,
Xen would panic if any of such condition does occur.
For example, during bootup, Xen parses the device tree .
It checks if the device tree nodes are present for timer, interrupt-controller,
memory, cpu, etc. If these nodes are not present, Xen panics.
As part of safety certification, we have 3 aims :-
1. We want to reduce the instances where Xen can panic. This is to improve the
robustness.
2. We need to define a safe state when a fault is triggered in Xen. As faults
(like the one mentioned here) are triggered during boot time and it is due to
incorrect system configuration in device tree, it is hard to define a safe state.
3. Avoid validating all the instances of system configuration errors. By having
an external tool, we push the responsibility to the system integrator. The system
integrator needs to run the tool to validate all the properties that Xen checks
for. This can be a justification for the coverage gap for those checks in Xen.
Thus, I have come up with the attached python script. In the script, we parse the
device tree to check if the nodes with the compatible properties (as specified in
config file) are present. If not, the script will throw an error.
README.md | 13 +++++++++++++
scripts/dt_sanity.py | 33 +++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
create mode 100644 scripts/dt_sanity.py
diff --git a/README.md b/README.md
index 7b68cf5..413de3f 100644
--- a/README.md
+++ b/README.md
@@ -456,3 +456,16 @@ This section defines config file debug options
- DBG_FDT_PRINT_CHOSEN specifies that U-Boot script command to print DT "chosen"
node will be added to the boot script.
+
+## dt_sanity.py
+
+This script parses xen device tree source and checks if the required nodes are
+present. If not, the script gives an error.
+
+To use it, first write a config file like `config` where you can keep the
+compatible strings to be checked:
+
+```
+arm,gic-v3
+arm,armv8-timer
+```
diff --git a/scripts/dt_sanity.py b/scripts/dt_sanity.py
new file mode 100644
index 0000000..171947f
--- /dev/null
+++ b/scripts/dt_sanity.py
@@ -0,0 +1,33 @@
+import argparse
+from pydevicetree import Devicetree
+import sys
+
+def load_compatible_strings(config_path):
+ with open(config_path, 'r') as file:
+ return [line.strip() for line in file if line.strip()]
+
+def check_compatible_nodes(dts_path):
+ # Parse the DTS file
+ tree = Devicetree.parseFile(dts_path)
+
+ # Search nodes for compatible properties in the global array
+ for compatible in compatible_strings:
+ nodes = tree.match(compatible)
+ if len(nodes) == 0:
+ print(f"Error: Node with compatible '{compatible}' not found.")
+ sys.exit(1)
+
+if __name__ == "__main__":
+ # Set up argument parser
+ parser = argparse.ArgumentParser(description="Check for Xen specific nodes in a DTS file.")
+ parser.add_argument("dts_file", help="Path to the DTS file")
+ parser.add_argument("config_file", help="Path to the configuration file with compatible strings")
+
+ # Parse arguments
+ args = parser.parse_args()
+
+ # Load compatible strings from the config file
+ compatible_strings = load_compatible_strings(args.config_file)
+
+ # Use the provided DTS file path
+ check_compatible_nodes(args.dts_file)
--
2.25.1
Hi, After the FMEA discussion at Xen summit, I want to put a clarification. On 01/09/2025 13:31, Ayan Kumar Halder wrote: > Xen gives a panic if certain nodes are not present in the device tree. In order > to prevent this panic, scripts/dt_sanity.py is written so that it checks if the > node/s are present. If the node/s are not present, the script gives an error. > > User is expected to run the script against the device tree before booting Xen > with dtb. > > Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com> > --- > > Hi, > > In some of the discussions with the safety experts and upstream folks, one issue > that kept coming up is there are lots of ‘faulty system configuration’ and > ‘impossible conditions’ checks in Xen. While these conditions can rarely occur, > Xen would panic if any of such condition does occur. > > For example, during bootup, Xen parses the device tree . > It checks if the device tree nodes are present for timer, interrupt-controller, > memory, cpu, etc. If these nodes are not present, Xen panics. > > As part of safety certification, we have 3 aims :- > 1. We want to reduce the instances where Xen can panic. This is to improve the > robustness. > > 2. We need to define a safe state when a fault is triggered in Xen. As faults > (like the one mentioned here) are triggered during boot time and it is due to > incorrect system configuration in device tree, it is hard to define a safe state. > > 3. Avoid validating all the instances of system configuration errors. By having > an external tool, we push the responsibility to the system integrator. The system > integrator needs to run the tool to validate all the properties that Xen checks > for. This can be a justification for the coverage gap for those checks in Xen. This isn't true as we will have tests to validate all possible system configuration errors. However, I want to use this script as a 'prevention' mechanism to check for the sanity of device tree (which can be offloaded to the system integrator). There could be many of such errors arising from misconfiguration in device tree. Wherever possible, we will use a script or we can explain how to identify these errors before Xen boots. We want to convey that certain failures in Xen are not possible (or there is atleast some mitigation) if the user has read the FMEA or safety manual or public documents. - Ayan
On 01/09/2025 14:31, Ayan Kumar Halder wrote: > Xen gives a panic if certain nodes are not present in the device tree. In order > to prevent this panic, scripts/dt_sanity.py is written so that it checks if the > node/s are present. If the node/s are not present, the script gives an error. > > User is expected to run the script against the device tree before booting Xen > with dtb. I would expect the script to know what to look for in a passed dtb. Otherwise it's just a script that searches for the list of specified compatibilities by the user. In other words, a simple grep would also do the job. Compatibility is not everything, there are other things that would prevent Xen from booting. Also, often times the dtb is modified by the bootloader before passing control over to Xen (i.e. it may differ from the base dtb). ~Michal
Hi Michal, On 01/09/2025 14:17, Orzel, Michal wrote: > > On 01/09/2025 14:31, Ayan Kumar Halder wrote: >> Xen gives a panic if certain nodes are not present in the device tree. In order >> to prevent this panic, scripts/dt_sanity.py is written so that it checks if the >> node/s are present. If the node/s are not present, the script gives an error. >> >> User is expected to run the script against the device tree before booting Xen >> with dtb. > I would expect the script to know what to look for in a passed dtb. Otherwise > it's just a script that searches for the list of specified compatibilities by > the user. In other words, a simple grep would also do the job. I agree. The main idea is to check that a dt node with 'compatible="arm,gic-v3"' property is present. We can enhance the script as we go forward. > Compatibility is > not everything, there are other things that would prevent Xen from booting. Agreed. The rationale here is to shift the responsibility to the system integrator to run some validation so that Xen does not catch these errors. We wish to provide some sample tools to achieve this aim. Another category of errors are the unsupported register configurations. For eg gicv3_info.nr_lrs can be supported with a max value of 16. So if we have a baremetal program that user can execute before running Xen, then the user can catch these errors. If we are able to catch these errors outside of Xen, then we can claim such panic() will not be triggered in Xen. There will be certain assumptions to make like .... > Also, often times the dtb is modified by the bootloader before passing control > over to Xen (i.e. it may differ from the base dtb). bootloader is not expected to modify the nodes which are sanitised by the external tool. - Ayan
Hi, On 01/09/2025 14:51, Ayan Kumar Halder wrote: > Hi Michal, > > On 01/09/2025 14:17, Orzel, Michal wrote: >> >> On 01/09/2025 14:31, Ayan Kumar Halder wrote: >>> Xen gives a panic if certain nodes are not present in the device >>> tree. In order >>> to prevent this panic, scripts/dt_sanity.py is written so that it >>> checks if the >>> node/s are present. If the node/s are not present, the script gives >>> an error. >>> >>> User is expected to run the script against the device tree before >>> booting Xen >>> with dtb. >> One thing I forgot to mention is that as part of safety certification, we do need to do "Failure mode and error analysis". This means describing the scenarios in which Xen can fail to perform its regular functionality and coming up with prevention, detection and mitigation measures. One can argue that the panics caused by system misconfiguration, are the most straightforward of all the errors. However, we do need to define prevention mechanisms to avoid these panics. For this particular failure, the prevention mechanism can be described as manually looking into the device tree to ensure that the nodes expected by Xen, are present. The script aims to provide a better alternative. This script is not meant to catch all possible panics. However we do want to have such scripts and utilities wherever possible, and document them as part of our FMEA. May be a safety expert can comment if the approach makes sense. - Ayan
On 01/09/2025 1:31 pm, Ayan Kumar Halder wrote:
> diff --git a/scripts/dt_sanity.py b/scripts/dt_sanity.py
> new file mode 100644
> index 0000000..171947f
> --- /dev/null
> +++ b/scripts/dt_sanity.py
> @@ -0,0 +1,33 @@
Shebang
> +import argparse
> +from pydevicetree import Devicetree
pydevicetree isn't part of the standard library. Readme should note the
external dependency. Also it should be separate from stdlib imports.
> sys
> +
> +def load_compatible_strings(config_path):
> + with open(config_path, 'r') as file:
> + return [line.strip() for line in file if line.strip()]
You should choose a different name from file as it shadows a keyword.
Also, you want something more like this
with open() as f:
for l in f.readlines():
l.strip()
if l:
yield l
but you really ought to also consider supporting comments in this config
file.
> +
> +def check_compatible_nodes(dts_path):
> + # Parse the DTS file
> + tree = Devicetree.parseFile(dts_path)
> +
> + # Search nodes for compatible properties in the global array
> + for compatible in compatible_strings:
> + nodes = tree.match(compatible)
> + if len(nodes) == 0:
if not nodes:
> + print(f"Error: Node with compatible '{compatible}' not found.")
f-strings are not supported in our minimum version of python. Use % or
.format().
~Andrew
© 2016 - 2025 Red Hat, Inc.