From: Luis Augenstein
To: nathan@kernel.org, nsc@kernel.org
Cc: linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org,
 akpm@linux-foundation.org, gregkh@linuxfoundation.org,
 maximilian.huber@tngtech.com, Luis Augenstein
Subject: [PATCH v2 07/14] tools/sbom: add JSON-LD serialization
Date: Tue, 20 Jan 2026 12:53:45 +0100
Message-Id: <20260120115352.10910-8-luis.augenstein@tngtech.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20260120115352.10910-1-luis.augenstein@tngtech.com>
References:
 <20260120115352.10910-1-luis.augenstein@tngtech.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add infrastructure to serialize an SPDX graph as a JSON-LD document.

NamespaceMaps in the SPDX document are converted to custom prefixes in
the @context field of the JSON-LD output. The SBOM tool uses
NamespaceMaps solely to shorten SPDX IDs, avoiding repetition of full
namespace URIs by using short prefixes.

Co-developed-by: Maximilian Huber
Signed-off-by: Maximilian Huber
Signed-off-by: Luis Augenstein
---
 tools/sbom/Makefile                           |  3 +-
 tools/sbom/sbom.py                            | 52 +++++++++++++++++
 tools/sbom/sbom/config.py                     | 56 +++++++++++++++++++
 tools/sbom/sbom/path_utils.py                 | 11 ++++
 tools/sbom/sbom/spdx_graph/__init__.py        |  7 +++
 .../sbom/sbom/spdx_graph/build_spdx_graphs.py | 36 ++++++++++++
 .../sbom/sbom/spdx_graph/spdx_graph_model.py  | 36 ++++++++++++
 7 files changed, 200 insertions(+), 1 deletion(-)
 create mode 100644 tools/sbom/sbom/path_utils.py
 create mode 100644 tools/sbom/sbom/spdx_graph/__init__.py
 create mode 100644 tools/sbom/sbom/spdx_graph/build_spdx_graphs.py
 create mode 100644 tools/sbom/sbom/spdx_graph/spdx_graph_model.py

diff --git a/tools/sbom/Makefile b/tools/sbom/Makefile
index b4ef72b30960..f907f427bf8d 100644
--- a/tools/sbom/Makefile
+++ b/tools/sbom/Makefile
@@ -32,7 +32,8 @@ $(SBOM_TARGETS) &: $(SBOM_DEPS)
 		--src-tree $(srctree) \
 		--obj-tree $(objtree) \
 		--roots-file $(SBOM_ROOTS_FILE) \
-		--output-directory $(objtree)
+		--output-directory $(objtree) \
+		--generate-spdx
 
 	@rm $(SBOM_ROOTS_FILE)
 
diff --git a/tools/sbom/sbom.py b/tools/sbom/sbom.py
index 25d912a282de..426521ade460 100644
--- a/tools/sbom/sbom.py
+++ b/tools/sbom/sbom.py
@@ -6,13 +6,18 @@
 Compute software bill of materials in SPDX format describing a kernel build.
 """
 
+import json
 import logging
 import os
 import sys
 import time
+import uuid
 import sbom.sbom_logging as sbom_logging
 from sbom.config import get_config
 from sbom.path_utils import is_relative_to
+from sbom.spdx import JsonLdSpdxDocument, SpdxIdGenerator
+from sbom.spdx.core import CreationInfo, SpdxDocument
+from sbom.spdx_graph import SpdxIdGeneratorCollection, build_spdx_graphs
 from sbom.cmd_graph import CmdGraph
 
 
@@ -56,10 +61,57 @@ def main():
                 f.write("\n".join(str(file_path) for file_path in used_files))
             logging.debug(f"Successfully saved {used_files_path}")
 
+    if config.generate_spdx is False:
+        return
+
+    # Build SPDX Documents
+    logging.debug("Start generating SPDX graph based on cmd graph")
+    start_time = time.time()
+
+    # The real uuid will be generated based on the content of the SPDX graphs
+    # to ensure that the same SPDX document is always assigned the same uuid.
+    PLACEHOLDER_UUID = "00000000-0000-0000-0000-000000000000"
+    spdx_id_base_namespace = f"{config.spdxId_prefix}{PLACEHOLDER_UUID}/"
+    spdx_id_generators = SpdxIdGeneratorCollection(
+        base=SpdxIdGenerator(prefix="p", namespace=spdx_id_base_namespace),
+        source=SpdxIdGenerator(prefix="s", namespace=f"{spdx_id_base_namespace}source/"),
+        build=SpdxIdGenerator(prefix="b", namespace=f"{spdx_id_base_namespace}build/"),
+        output=SpdxIdGenerator(prefix="o", namespace=f"{spdx_id_base_namespace}output/"),
+    )
+
+    spdx_graphs = build_spdx_graphs(
+        cmd_graph,
+        spdx_id_generators,
+        config,
+    )
+    spdx_id_uuid = uuid.uuid5(
+        uuid.NAMESPACE_URL,
+        "".join(
+            json.dumps(element.to_dict()) for spdx_graph in spdx_graphs.values() for element in spdx_graph.to_list()
+        ),
+    )
+    logging.debug(f"Generated SPDX graph in {time.time() - start_time} seconds")
+
     # Report collected warnings and errors in case of failure
     warning_summary = sbom_logging.summarize_warnings()
     error_summary = sbom_logging.summarize_errors()
 
+    if not sbom_logging.has_errors() or config.write_output_on_error:
+        for kernel_sbom_kind, spdx_graph in spdx_graphs.items():
+            spdx_graph_objects = spdx_graph.to_list()
+            # Add warning and error summary to creation info comment
+            creation_info = next(element for element in spdx_graph_objects if isinstance(element, CreationInfo))
+            creation_info.comment = "\n".join([warning_summary, error_summary]).strip()
+            # Replace Placeholder uuid with real uuid for spdxIds
+            spdx_document = next(element for element in spdx_graph_objects if isinstance(element, SpdxDocument))
+            for namespaceMap in spdx_document.namespaceMap:
+                namespaceMap.namespace = namespaceMap.namespace.replace(PLACEHOLDER_UUID, str(spdx_id_uuid))
+            # Serialize SPDX graph to JSON-LD
+            spdx_doc = JsonLdSpdxDocument(graph=spdx_graph_objects)
+            save_path = os.path.join(config.output_directory, config.spdx_file_names[kernel_sbom_kind])
+            spdx_doc.save(save_path, config.prettify_json)
+            logging.debug(f"Successfully saved {save_path}")
+
     if warning_summary:
         logging.warning(warning_summary)
     if error_summary:
diff --git a/tools/sbom/sbom/config.py b/tools/sbom/sbom/config.py
index 39e556a4c53b..0985457c3cae 100644
--- a/tools/sbom/sbom/config.py
+++ b/tools/sbom/sbom/config.py
@@ -3,11 +3,18 @@
 
 import argparse
 from dataclasses import dataclass
+from enum import Enum
 import os
 from typing import Any
 from sbom.path_utils import PathStr
 
 
+class KernelSpdxDocumentKind(Enum):
+    SOURCE = "source"
+    BUILD = "build"
+    OUTPUT = "output"
+
+
 @dataclass
 class KernelSbomConfig:
     src_tree: PathStr
@@ -19,6 +26,13 @@ class KernelSbomConfig:
     root_paths: list[PathStr]
     """List of paths to root outputs (relative to obj_tree) to base the SBOM on."""
 
+    generate_spdx: bool
+    """Whether to generate SPDX SBOM documents. If False, no SPDX files are created."""
+
+    spdx_file_names: dict[KernelSpdxDocumentKind, str]
+    """If `generate_spdx` is True, defines the file names for each SPDX SBOM kind
+    (source, build, output) to store on disk."""
+
     generate_used_files: bool
     """Whether to generate a flat list of all source files used in the build.
     If False, no used-files document is created."""
@@ -38,6 +52,12 @@ class KernelSbomConfig:
     write_output_on_error: bool
     """Whether to write output documents even if errors occur."""
 
+    spdxId_prefix: str
+    """Prefix to use for all SPDX element IDs."""
+
+    prettify_json: bool
+    """Whether to pretty-print generated SPDX JSON documents."""
+
 
 def _parse_cli_arguments() -> dict[str, Any]:
     """
@@ -72,6 +92,15 @@ def _parse_cli_arguments() -> dict[str, Any]:
         "--roots-file",
         help="Path to a file containing the root paths (one per line). Cannot be used together with --roots.",
     )
+    parser.add_argument(
+        "--generate-spdx",
+        action="store_true",
+        default=False,
+        help=(
+            "Whether to create sbom-source.spdx.json, sbom-build.spdx.json and "
+            "sbom-output.spdx.json documents (default: False)"
+        ),
+    )
     parser.add_argument(
         "--generate-used-files",
         action="store_true",
@@ -119,6 +148,20 @@ def _parse_cli_arguments() -> dict[str, Any]:
         ),
     )
 
+    # SPDX specific options
+    spdx_group = parser.add_argument_group("SPDX options", "Options for customizing SPDX document generation")
+    spdx_group.add_argument(
+        "--spdxId-prefix",
+        default="urn:spdx.dev:",
+        help="The prefix to use for all spdxId properties. (default: urn:spdx.dev:)",
+    )
+    spdx_group.add_argument(
+        "--prettify-json",
+        action="store_true",
+        default=False,
+        help="Whether to pretty print the generated spdx.json documents (default: False)",
+    )
+
     args = vars(parser.parse_args())
     return args
 
@@ -144,6 +187,7 @@ def get_config() -> KernelSbomConfig:
         root_paths = args["roots"]
     _validate_path_arguments(src_tree, obj_tree, root_paths)
 
+    generate_spdx = args["generate_spdx"]
     generate_used_files = args["generate_used_files"]
     output_directory = os.path.realpath(args["output_directory"])
     debug = args["debug"]
@@ -151,19 +195,31 @@ def get_config() -> KernelSbomConfig:
     fail_on_unknown_build_command = not args["do_not_fail_on_unknown_build_command"]
     write_output_on_error = args["write_output_on_error"]
 
+    spdxId_prefix = args["spdxId_prefix"]
+    prettify_json = args["prettify_json"]
+
     # Hardcoded config
+    spdx_file_names = {
+        KernelSpdxDocumentKind.SOURCE: "sbom-source.spdx.json",
+        KernelSpdxDocumentKind.BUILD: "sbom-build.spdx.json",
+        KernelSpdxDocumentKind.OUTPUT: "sbom-output.spdx.json",
+    }
     used_files_file_name = "sbom.used-files.txt"
 
     return KernelSbomConfig(
         src_tree=src_tree,
         obj_tree=obj_tree,
         root_paths=root_paths,
+        generate_spdx=generate_spdx,
+        spdx_file_names=spdx_file_names,
         generate_used_files=generate_used_files,
         used_files_file_name=used_files_file_name,
         output_directory=output_directory,
         debug=debug,
         fail_on_unknown_build_command=fail_on_unknown_build_command,
         write_output_on_error=write_output_on_error,
+        spdxId_prefix=spdxId_prefix,
+        prettify_json=prettify_json,
     )
 
 
diff --git a/tools/sbom/sbom/path_utils.py b/tools/sbom/sbom/path_utils.py
new file mode 100644
index 000000000000..d28d67b25398
--- /dev/null
+++ b/tools/sbom/sbom/path_utils.py
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+import os
+
+PathStr = str
+"""Filesystem path represented as a plain string for better performance than pathlib.Path."""
+
+
+def is_relative_to(path: PathStr, base: PathStr) -> bool:
+    return os.path.commonpath([path, base]) == base
diff --git a/tools/sbom/sbom/spdx_graph/__init__.py b/tools/sbom/sbom/spdx_graph/__init__.py
new file mode 100644
index 000000000000..3557b1d51bf9
--- /dev/null
+++ b/tools/sbom/sbom/spdx_graph/__init__.py
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+from .build_spdx_graphs import build_spdx_graphs
+from .spdx_graph_model import SpdxIdGeneratorCollection
+
+__all__ = ["build_spdx_graphs", "SpdxIdGeneratorCollection"]
diff --git a/tools/sbom/sbom/spdx_graph/build_spdx_graphs.py b/tools/sbom/sbom/spdx_graph/build_spdx_graphs.py
new file mode 100644
index 000000000000..bb3db4e423da
--- /dev/null
+++ b/tools/sbom/sbom/spdx_graph/build_spdx_graphs.py
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+
+from typing import Protocol
+
+from sbom.config import KernelSpdxDocumentKind
+from sbom.cmd_graph import CmdGraph
+from sbom.path_utils import PathStr
+from sbom.spdx_graph.spdx_graph_model import SpdxGraph, SpdxIdGeneratorCollection
+
+
+class SpdxGraphConfig(Protocol):
+    obj_tree: PathStr
+    src_tree: PathStr
+
+
+def build_spdx_graphs(
+    cmd_graph: CmdGraph,
+    spdx_id_generators: SpdxIdGeneratorCollection,
+    config: SpdxGraphConfig,
+) -> dict[KernelSpdxDocumentKind, SpdxGraph]:
+    """
+    Builds SPDX graphs (output, source, and build) based on a cmd dependency graph.
+    If the source and object trees are identical, no dedicated source graph can be created.
+    In that case the source files are added to the build graph instead.
+
+    Args:
+        cmd_graph: The dependency graph of a kernel build.
+        spdx_id_generators: Collection of SPDX ID generators.
+        config: Configuration options.
+
+    Returns:
+        Dictionary of SPDX graphs
+    """
+    return {}
diff --git a/tools/sbom/sbom/spdx_graph/spdx_graph_model.py b/tools/sbom/sbom/spdx_graph/spdx_graph_model.py
new file mode 100644
index 000000000000..682194d4362a
--- /dev/null
+++ b/tools/sbom/sbom/spdx_graph/spdx_graph_model.py
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+from dataclasses import dataclass
+from sbom.spdx.core import CreationInfo, SoftwareAgent, SpdxDocument, SpdxObject
+from sbom.spdx.software import Sbom
+from sbom.spdx.spdxId import SpdxIdGenerator
+
+
+@dataclass
+class SpdxGraph:
+    """Represents the complete graph of a single SPDX document."""
+
+    spdx_document: SpdxDocument
+    agent: SoftwareAgent
+    creation_info: CreationInfo
+    sbom: Sbom
+
+    def to_list(self) -> list[SpdxObject]:
+        return [
+            self.spdx_document,
+            self.agent,
+            self.creation_info,
+            self.sbom,
+            *self.sbom.element,
+        ]
+
+
+@dataclass
+class SpdxIdGeneratorCollection:
+    """Holds SPDX ID generators for different document types to ensure globally unique SPDX IDs."""
+
+    base: SpdxIdGenerator
+    source: SpdxIdGenerator
+    build: SpdxIdGenerator
+    output: SpdxIdGenerator
-- 
2.34.1
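
Note on the placeholder UUID used in sbom.py: the idea can be shown in
isolation. The snippet below is a minimal sketch and not code from this
patch. It assumes plain dicts in place of the SpdxObject.to_dict() output,
and content_uuid() and finalize_namespace() are hypothetical helper names
introduced only for illustration. It shows that uuid.uuid5() over the
serialized elements yields the same UUID for the same content, and that the
placeholder can afterwards be swapped out of a namespace string.

```python
import json
import uuid

# Same placeholder value as in the patch.
PLACEHOLDER_UUID = "00000000-0000-0000-0000-000000000000"


def content_uuid(elements: list[dict]) -> uuid.UUID:
    """Hypothetical helper: derive a name-based UUID from serialized elements."""
    payload = "".join(json.dumps(element) for element in elements)
    return uuid.uuid5(uuid.NAMESPACE_URL, payload)


def finalize_namespace(namespace: str, real_uuid: uuid.UUID) -> str:
    """Hypothetical helper: replace the placeholder with the content-derived UUID."""
    return namespace.replace(PLACEHOLDER_UUID, str(real_uuid))


# Toy stand-ins for SPDX elements (assumption: the real tool serializes
# SpdxObject instances via to_dict()).
elements = [
    {"type": "software_File", "spdxId": "s:1", "name": "init/main.c"},
    {"type": "software_File", "spdxId": "o:1", "name": "vmlinux"},
]
namespace = f"urn:spdx.dev:{PLACEHOLDER_UUID}/source/"

first = content_uuid(elements)
second = content_uuid(elements)  # same input, so the same UUID
final_namespace = finalize_namespace(namespace, first)
print(final_namespace)
```

In the patch, the replacement is applied to the NamespaceMap entries of
the SpdxDocument rather than to a bare string, which appears to rely on the
element IDs being stored in prefixed form.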