From: Luis Augenstein
To: nathan@kernel.org, nsc@kernel.org
Cc: linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, gregkh@linuxfoundation.org, maximilian.huber@tngtech.com, Luis Augenstein
Subject: [PATCH v3 07/14] tools/sbom: add JSON-LD serialization
Date: Mon, 26 Jan 2026 20:32:57 +0100
Message-Id: <20260126193304.320916-8-luis.augenstein@tngtech.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20260126193304.320916-1-luis.augenstein@tngtech.com>
References:
 <20260126193304.320916-1-luis.augenstein@tngtech.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add infrastructure to serialize an SPDX graph as a JSON-LD document.
NamespaceMaps in the SPDX document are converted to custom prefixes
in the @context field of the JSON-LD output. The SBOM tool uses
NamespaceMaps solely to shorten SPDX IDs, avoiding repetition of full
namespace URIs by using short prefixes.

Co-developed-by: Maximilian Huber
Signed-off-by: Maximilian Huber
Signed-off-by: Luis Augenstein
---
 tools/sbom/Makefile                           |  3 +-
 tools/sbom/sbom.py                            | 52 +++++++++++++++++
 tools/sbom/sbom/config.py                     | 56 +++++++++++++++++++
 tools/sbom/sbom/path_utils.py                 | 11 ++++
 tools/sbom/sbom/spdx_graph/__init__.py        |  7 +++
 .../sbom/sbom/spdx_graph/build_spdx_graphs.py | 36 ++++++++++++
 .../sbom/sbom/spdx_graph/spdx_graph_model.py  | 36 ++++++++++++
 7 files changed, 200 insertions(+), 1 deletion(-)
 create mode 100644 tools/sbom/sbom/path_utils.py
 create mode 100644 tools/sbom/sbom/spdx_graph/__init__.py
 create mode 100644 tools/sbom/sbom/spdx_graph/build_spdx_graphs.py
 create mode 100644 tools/sbom/sbom/spdx_graph/spdx_graph_model.py

diff --git a/tools/sbom/Makefile b/tools/sbom/Makefile
index cc4a632533ba..1fef44cede46 100644
--- a/tools/sbom/Makefile
+++ b/tools/sbom/Makefile
@@ -33,7 +33,8 @@ $(SBOM_TARGETS) &: $(SBOM_DEPS)
 		--src-tree $(srctree) \
 		--obj-tree $(objtree) \
 		--roots-file $(SBOM_ROOTS_FILE) \
-		--output-directory $(objtree)
+		--output-directory $(objtree) \
+		--generate-spdx
 
 	@rm $(SBOM_ROOTS_FILE)
 
diff --git a/tools/sbom/sbom.py b/tools/sbom/sbom.py
index 25d912a282de..426521ade460 100644
--- a/tools/sbom/sbom.py
+++ b/tools/sbom/sbom.py
@@ -6,13 +6,18 @@
 Compute software bill of materials in SPDX format describing a kernel build.
 """
 
+import json
 import logging
 import os
 import sys
 import time
+import uuid
 import sbom.sbom_logging as sbom_logging
 from sbom.config import get_config
 from sbom.path_utils import is_relative_to
+from sbom.spdx import JsonLdSpdxDocument, SpdxIdGenerator
+from sbom.spdx.core import CreationInfo, SpdxDocument
+from sbom.spdx_graph import SpdxIdGeneratorCollection, build_spdx_graphs
 from sbom.cmd_graph import CmdGraph
 
 
@@ -56,10 +61,57 @@ def main():
                 f.write("\n".join(str(file_path) for file_path in used_files))
             logging.debug(f"Successfully saved {used_files_path}")
 
+    if config.generate_spdx is False:
+        return
+
+    # Build SPDX Documents
+    logging.debug("Start generating SPDX graph based on cmd graph")
+    start_time = time.time()
+
+    # The real uuid will be generated based on the content of the SPDX graphs
+    # to ensure that the same SPDX document is always assigned the same uuid.
+    PLACEHOLDER_UUID = "00000000-0000-0000-0000-000000000000"
+    spdx_id_base_namespace = f"{config.spdxId_prefix}{PLACEHOLDER_UUID}/"
+    spdx_id_generators = SpdxIdGeneratorCollection(
+        base=SpdxIdGenerator(prefix="p", namespace=spdx_id_base_namespace),
+        source=SpdxIdGenerator(prefix="s", namespace=f"{spdx_id_base_namespace}source/"),
+        build=SpdxIdGenerator(prefix="b", namespace=f"{spdx_id_base_namespace}build/"),
+        output=SpdxIdGenerator(prefix="o", namespace=f"{spdx_id_base_namespace}output/"),
+    )
+
+    spdx_graphs = build_spdx_graphs(
+        cmd_graph,
+        spdx_id_generators,
+        config,
+    )
+    spdx_id_uuid = uuid.uuid5(
+        uuid.NAMESPACE_URL,
+        "".join(
+            json.dumps(element.to_dict()) for spdx_graph in spdx_graphs.values() for element in spdx_graph.to_list()
+        ),
+    )
+    logging.debug(f"Generated SPDX graph in {time.time() - start_time} seconds")
+
     # Report collected warnings and errors in case of failure
     warning_summary = sbom_logging.summarize_warnings()
     error_summary = sbom_logging.summarize_errors()
 
+    if not sbom_logging.has_errors() or config.write_output_on_error:
+        for kernel_sbom_kind, spdx_graph in spdx_graphs.items():
+            spdx_graph_objects = spdx_graph.to_list()
+            # Add warning and error summary to creation info comment
+            creation_info = next(element for element in spdx_graph_objects if isinstance(element, CreationInfo))
+            creation_info.comment = "\n".join([warning_summary, error_summary]).strip()
+            # Replace Placeholder uuid with real uuid for spdxIds
+            spdx_document = next(element for element in spdx_graph_objects if isinstance(element, SpdxDocument))
+            for namespaceMap in spdx_document.namespaceMap:
+                namespaceMap.namespace = namespaceMap.namespace.replace(PLACEHOLDER_UUID, str(spdx_id_uuid))
+            # Serialize SPDX graph to JSON-LD
+            spdx_doc = JsonLdSpdxDocument(graph=spdx_graph_objects)
+            save_path = os.path.join(config.output_directory, config.spdx_file_names[kernel_sbom_kind])
+            spdx_doc.save(save_path, config.prettify_json)
+            logging.debug(f"Successfully saved {save_path}")
+
     if warning_summary:
         logging.warning(warning_summary)
     if error_summary:
diff --git a/tools/sbom/sbom/config.py b/tools/sbom/sbom/config.py
index 39e556a4c53b..0985457c3cae 100644
--- a/tools/sbom/sbom/config.py
+++ b/tools/sbom/sbom/config.py
@@ -3,11 +3,18 @@
 
 import argparse
 from dataclasses import dataclass
+from enum import Enum
 import os
 from typing import Any
 from sbom.path_utils import PathStr
 
 
+class KernelSpdxDocumentKind(Enum):
+    SOURCE = "source"
+    BUILD = "build"
+    OUTPUT = "output"
+
+
 @dataclass
 class KernelSbomConfig:
     src_tree: PathStr
@@ -19,6 +26,13 @@ class KernelSbomConfig:
     root_paths: list[PathStr]
     """List of paths to root outputs (relative to obj_tree) to base the SBOM on."""
 
+    generate_spdx: bool
+    """Whether to generate SPDX SBOM documents. If False, no SPDX files are created."""
+
+    spdx_file_names: dict[KernelSpdxDocumentKind, str]
+    """If `generate_spdx` is True, defines the file names for each SPDX SBOM kind
+    (source, build, output) to store on disk."""
+
     generate_used_files: bool
     """Whether to generate a flat list of all source files used in the build.
     If False, no used-files document is created."""
@@ -38,6 +52,12 @@ class KernelSbomConfig:
     write_output_on_error: bool
     """Whether to write output documents even if errors occur."""
 
+    spdxId_prefix: str
+    """Prefix to use for all SPDX element IDs."""
+
+    prettify_json: bool
+    """Whether to pretty-print generated SPDX JSON documents."""
+
 
 def _parse_cli_arguments() -> dict[str, Any]:
     """
@@ -72,6 +92,15 @@ def _parse_cli_arguments() -> dict[str, Any]:
         "--roots-file",
         help="Path to a file containing the root paths (one per line). Cannot be used together with --roots.",
     )
+    parser.add_argument(
+        "--generate-spdx",
+        action="store_true",
+        default=False,
+        help=(
+            "Whether to create sbom-source.spdx.json, sbom-build.spdx.json and "
+            "sbom-output.spdx.json documents (default: False)"
+        ),
+    )
     parser.add_argument(
         "--generate-used-files",
         action="store_true",
@@ -119,6 +148,20 @@ def _parse_cli_arguments() -> dict[str, Any]:
         ),
     )
 
+    # SPDX specific options
+    spdx_group = parser.add_argument_group("SPDX options", "Options for customizing SPDX document generation")
+    spdx_group.add_argument(
+        "--spdxId-prefix",
+        default="urn:spdx.dev:",
+        help="The prefix to use for all spdxId properties. (default: urn:spdx.dev:)",
+    )
+    spdx_group.add_argument(
+        "--prettify-json",
+        action="store_true",
+        default=False,
+        help="Whether to pretty print the generated spdx.json documents (default: False)",
+    )
+
     args = vars(parser.parse_args())
     return args
 
@@ -144,6 +187,7 @@ def get_config() -> KernelSbomConfig:
         root_paths = args["roots"]
     _validate_path_arguments(src_tree, obj_tree, root_paths)
 
+    generate_spdx = args["generate_spdx"]
     generate_used_files = args["generate_used_files"]
     output_directory = os.path.realpath(args["output_directory"])
     debug = args["debug"]
@@ -151,19 +195,31 @@ def get_config() -> KernelSbomConfig:
     fail_on_unknown_build_command = not args["do_not_fail_on_unknown_build_command"]
     write_output_on_error = args["write_output_on_error"]
 
+    spdxId_prefix = args["spdxId_prefix"]
+    prettify_json = args["prettify_json"]
+
     # Hardcoded config
+    spdx_file_names = {
+        KernelSpdxDocumentKind.SOURCE: "sbom-source.spdx.json",
+        KernelSpdxDocumentKind.BUILD: "sbom-build.spdx.json",
+        KernelSpdxDocumentKind.OUTPUT: "sbom-output.spdx.json",
+    }
     used_files_file_name = "sbom.used-files.txt"
 
     return KernelSbomConfig(
         src_tree=src_tree,
         obj_tree=obj_tree,
         root_paths=root_paths,
+        generate_spdx=generate_spdx,
+        spdx_file_names=spdx_file_names,
         generate_used_files=generate_used_files,
         used_files_file_name=used_files_file_name,
         output_directory=output_directory,
         debug=debug,
         fail_on_unknown_build_command=fail_on_unknown_build_command,
         write_output_on_error=write_output_on_error,
+        spdxId_prefix=spdxId_prefix,
+        prettify_json=prettify_json,
     )
 
 
diff --git a/tools/sbom/sbom/path_utils.py b/tools/sbom/sbom/path_utils.py
new file mode 100644
index 000000000000..d28d67b25398
--- /dev/null
+++ b/tools/sbom/sbom/path_utils.py
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+import os
+
+PathStr = str
+"""Filesystem path represented as a plain string for better performance than pathlib.Path."""
+
+
+def is_relative_to(path: PathStr, base: PathStr) -> bool:
+    return os.path.commonpath([path, base]) == base
diff --git a/tools/sbom/sbom/spdx_graph/__init__.py b/tools/sbom/sbom/spdx_graph/__init__.py
new file mode 100644
index 000000000000..3557b1d51bf9
--- /dev/null
+++ b/tools/sbom/sbom/spdx_graph/__init__.py
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+from .build_spdx_graphs import build_spdx_graphs
+from .spdx_graph_model import SpdxIdGeneratorCollection
+
+__all__ = ["build_spdx_graphs", "SpdxIdGeneratorCollection"]
diff --git a/tools/sbom/sbom/spdx_graph/build_spdx_graphs.py b/tools/sbom/sbom/spdx_graph/build_spdx_graphs.py
new file mode 100644
index 000000000000..bb3db4e423da
--- /dev/null
+++ b/tools/sbom/sbom/spdx_graph/build_spdx_graphs.py
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+
+from typing import Protocol
+
+from sbom.config import KernelSpdxDocumentKind
+from sbom.cmd_graph import CmdGraph
+from sbom.path_utils import PathStr
+from sbom.spdx_graph.spdx_graph_model import SpdxGraph, SpdxIdGeneratorCollection
+
+
+class SpdxGraphConfig(Protocol):
+    obj_tree: PathStr
+    src_tree: PathStr
+
+
+def build_spdx_graphs(
+    cmd_graph: CmdGraph,
+    spdx_id_generators: SpdxIdGeneratorCollection,
+    config: SpdxGraphConfig,
+) -> dict[KernelSpdxDocumentKind, SpdxGraph]:
+    """
+    Builds SPDX graphs (output, source, and build) based on a cmd dependency graph.
+    If the source and object trees are identical, no dedicated source graph can be created.
+    In that case the source files are added to the build graph instead.
+
+    Args:
+        cmd_graph: The dependency graph of a kernel build.
+        spdx_id_generators: Collection of SPDX ID generators.
+        config: Configuration options.
+
+    Returns:
+        Dictionary of SPDX graphs
+    """
+    return {}
diff --git a/tools/sbom/sbom/spdx_graph/spdx_graph_model.py b/tools/sbom/sbom/spdx_graph/spdx_graph_model.py
new file mode 100644
index 000000000000..682194d4362a
--- /dev/null
+++ b/tools/sbom/sbom/spdx_graph/spdx_graph_model.py
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+from dataclasses import dataclass
+from sbom.spdx.core import CreationInfo, SoftwareAgent, SpdxDocument, SpdxObject
+from sbom.spdx.software import Sbom
+from sbom.spdx.spdxId import SpdxIdGenerator
+
+
+@dataclass
+class SpdxGraph:
+    """Represents the complete graph of a single SPDX document."""
+
+    spdx_document: SpdxDocument
+    agent: SoftwareAgent
+    creation_info: CreationInfo
+    sbom: Sbom
+
+    def to_list(self) -> list[SpdxObject]:
+        return [
+            self.spdx_document,
+            self.agent,
+            self.creation_info,
+            self.sbom,
+            *self.sbom.element,
+        ]
+
+
+@dataclass
+class SpdxIdGeneratorCollection:
+    """Holds SPDX ID generators for different document types to ensure globally unique SPDX IDs."""
+
+    base: SpdxIdGenerator
+    source: SpdxIdGenerator
+    build: SpdxIdGenerator
+    output: SpdxIdGenerator
-- 
2.34.1
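
Note on the placeholder-UUID scheme in sbom.py: the patch builds all SPDX IDs against a fixed placeholder, hashes the serialized graph with uuid.uuid5(), and then swaps the placeholder for the derived UUID in each namespace. The following is a minimal standalone sketch of that idea, not the tool's actual code. The helper names content_uuid and finalize_namespace and the sample element dicts are hypothetical stand-ins for SpdxObject.to_dict() output; only json.dumps, uuid.uuid5 and str.replace are taken from the patch.

```python
import json
import uuid

# Same placeholder value as in the patch.
PLACEHOLDER_UUID = "00000000-0000-0000-0000-000000000000"


def content_uuid(elements: list[dict]) -> uuid.UUID:
    """Derive a deterministic UUID from the serialized elements (hypothetical helper)."""
    payload = "".join(json.dumps(element) for element in elements)
    return uuid.uuid5(uuid.NAMESPACE_URL, payload)


def finalize_namespace(namespace: str, real_uuid: uuid.UUID) -> str:
    """Swap the placeholder for the content-derived UUID (hypothetical helper)."""
    return namespace.replace(PLACEHOLDER_UUID, str(real_uuid))


# Hypothetical elements standing in for SpdxObject.to_dict() results.
elements = [
    {"spdxId": "p:doc", "type": "SpdxDocument"},
    {"spdxId": "p:agent", "type": "SoftwareAgent"},
]

doc_uuid = content_uuid(elements)
namespace = finalize_namespace(f"urn:spdx.dev:{PLACEHOLDER_UUID}/source/", doc_uuid)
print(namespace)
```

Because the hash input is the JSON text, the result is stable only as long as serialization is stable. json.dumps keeps dict insertion order, so passing sort_keys=True would be one way to make the UUID independent of the order in which keys are inserted, if that ever becomes a concern.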