[PATCH v2 00/14] Add SPDX SBOM generation tool

Luis Augenstein posted 14 patches 2 weeks, 3 days ago
There is a newer version of this series
[PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Luis Augenstein 2 weeks, 3 days ago
This patch series introduces a Python-based tool for generating SBOM
documents in the SPDX 3.0.1 format for kernel builds.

A Software Bill of Materials (SBOM) describes the individual components
of a software product. For the kernel, the goal is to describe the
distributable build outputs (typically the kernel image and modules),
the source files involved in producing these outputs, and the build
process that connects the source and output files.

To achieve this, the SBOM tool generates three SPDX documents:

- sbom-output.spdx.json
  Describes the final build outputs together with high-level
  build metadata.

- sbom-source.spdx.json
  Describes all source files involved in the build, including
  licensing information and additional file metadata.

- sbom-build.spdx.json
  Describes the entire build process, linking source files
  from the source SBOM to output files in the output SBOM.

The sbom tool is optional and runs only when CONFIG_SBOM is enabled. It
is invoked after the build, once all output artifacts have been
generated. Starting from the kernel image and modules as root nodes,
the tool reconstructs the dependency graph up to the original source
files. Build dependencies are primarily derived from the .cmd files
generated by Kbuild, which record the full command used to build
each output file.
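
As a rough illustration of what "derived from the .cmd files" can mean,
here is a minimal sketch. It is not the parser from this series. It
assumes the usual Kbuild layout of a .cmd file (a `savedcmd_<target>`
line, a `source_<target>` line and a backslash-continued
`deps_<target>` list). The sample content and the function name
`parse_cmd_file` are made up for the example.

```python
import re

# Illustrative .cmd content; real files are much longer.
SAMPLE_CMD = r"""savedcmd_init/main.o := gcc -Wp,-MMD,init/.main.o.d -c -o init/main.o init/main.c

source_init/main.o := init/main.c

deps_init/main.o := \
    $(wildcard include/config/SMP) \
  include/linux/types.h \
  include/linux/init.h \

init/main.o: $(deps_init/main.o)

$(deps_init/main.o):
"""


def parse_cmd_file(text):
    """Return (saved command, source file, header deps) from .cmd text."""
    # Older kernels use "cmd_<target>", newer ones "savedcmd_<target>".
    savedcmd = re.search(r"^(?:saved)?cmd_\S+ := (.+)$", text, re.M)
    source = re.search(r"^source_\S+ := (\S+)$", text, re.M)
    deps = []
    # The deps list is a run of lines that each end in a backslash.
    block = re.search(r"^deps_\S+ := \\\n((?:.*\\\n)*)", text, re.M)
    if block:
        for line in block.group(1).splitlines():
            entry = line.rstrip("\\").strip()
            # Skip config wildcards; they are not files that get compiled in.
            if entry and not entry.startswith("$(wildcard"):
                deps.append(entry)
    return (
        savedcmd.group(1) if savedcmd else None,
        source.group(1) if source else None,
        deps,
    )
```

Walking the graph then amounts to repeating this for every input that
has its own .cmd file, until only source files remain.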

Currently, the tool only supports x86 and arm64 architectures.

Co-developed-by: Maximilian Huber <maximilian.huber@tngtech.com>
Signed-off-by: Maximilian Huber <maximilian.huber@tngtech.com>
Signed-off-by: Luis Augenstein <luis.augenstein@tngtech.com>
---
Changes in v2:
- regenerate sbom documents when build configuration changes
---
Luis Augenstein (14):
  tools/sbom: integrate tool in make process
  tools/sbom: setup sbom logging
  tools/sbom: add command parsers
  tools/sbom: add cmd graph generation
  tools/sbom: add additional dependency sources for cmd graph
  tools/sbom: add SPDX classes
  tools/sbom: add JSON-LD serialization
  tools/sbom: add shared SPDX elements
  tools/sbom: collect file metadata
  tools/sbom: add SPDX output graph
  tools/sbom: add SPDX source graph
  tools/sbom: add SPDX build graph
  tools/sbom: add unit tests for command parsers
  tools/sbom: add unit tests for SPDX-License-Identifier parsing

 .gitignore                                    |   1 +
 MAINTAINERS                                   |   6 +
 Makefile                                      |  15 +-
 lib/Kconfig.debug                             |   9 +
 tools/Makefile                                |   3 +-
 tools/sbom/Makefile                           |  42 ++
 tools/sbom/README                             | 208 ++++++
 tools/sbom/sbom.py                            | 129 ++++
 tools/sbom/sbom/__init__.py                   |   0
 tools/sbom/sbom/cmd_graph/__init__.py         |   7 +
 tools/sbom/sbom/cmd_graph/cmd_file.py         | 149 ++++
 tools/sbom/sbom/cmd_graph/cmd_graph.py        |  46 ++
 tools/sbom/sbom/cmd_graph/cmd_graph_node.py   | 142 ++++
 tools/sbom/sbom/cmd_graph/deps_parser.py      |  52 ++
 .../sbom/cmd_graph/hardcoded_dependencies.py  |  83 +++
 tools/sbom/sbom/cmd_graph/incbin_parser.py    |  42 ++
 tools/sbom/sbom/cmd_graph/savedcmd_parser.py  | 664 ++++++++++++++++++
 tools/sbom/sbom/config.py                     | 335 +++++++++
 tools/sbom/sbom/environment.py                | 164 +++++
 tools/sbom/sbom/path_utils.py                 |  11 +
 tools/sbom/sbom/sbom_logging.py               |  88 +++
 tools/sbom/sbom/spdx/__init__.py              |   7 +
 tools/sbom/sbom/spdx/build.py                 |  17 +
 tools/sbom/sbom/spdx/core.py                  | 182 +++++
 tools/sbom/sbom/spdx/serialization.py         |  56 ++
 tools/sbom/sbom/spdx/simplelicensing.py       |  20 +
 tools/sbom/sbom/spdx/software.py              |  71 ++
 tools/sbom/sbom/spdx/spdxId.py                |  36 +
 tools/sbom/sbom/spdx_graph/__init__.py        |   7 +
 .../sbom/sbom/spdx_graph/build_spdx_graphs.py |  82 +++
 tools/sbom/sbom/spdx_graph/kernel_file.py     | 310 ++++++++
 .../sbom/spdx_graph/shared_spdx_elements.py   |  32 +
 .../sbom/sbom/spdx_graph/spdx_build_graph.py  | 317 +++++++++
 .../sbom/sbom/spdx_graph/spdx_graph_model.py  |  36 +
 .../sbom/sbom/spdx_graph/spdx_output_graph.py | 188 +++++
 .../sbom/sbom/spdx_graph/spdx_source_graph.py | 126 ++++
 tools/sbom/tests/__init__.py                  |   0
 tools/sbom/tests/cmd_graph/__init__.py        |   0
 .../tests/cmd_graph/test_savedcmd_parser.py   | 383 ++++++++++
 tools/sbom/tests/spdx_graph/__init__.py       |   0
 .../sbom/tests/spdx_graph/test_kernel_file.py |  32 +
 41 files changed, 4096 insertions(+), 2 deletions(-)
 create mode 100644 tools/sbom/Makefile
 create mode 100644 tools/sbom/README
 create mode 100644 tools/sbom/sbom.py
 create mode 100644 tools/sbom/sbom/__init__.py
 create mode 100644 tools/sbom/sbom/cmd_graph/__init__.py
 create mode 100644 tools/sbom/sbom/cmd_graph/cmd_file.py
 create mode 100644 tools/sbom/sbom/cmd_graph/cmd_graph.py
 create mode 100644 tools/sbom/sbom/cmd_graph/cmd_graph_node.py
 create mode 100644 tools/sbom/sbom/cmd_graph/deps_parser.py
 create mode 100644 tools/sbom/sbom/cmd_graph/hardcoded_dependencies.py
 create mode 100644 tools/sbom/sbom/cmd_graph/incbin_parser.py
 create mode 100644 tools/sbom/sbom/cmd_graph/savedcmd_parser.py
 create mode 100644 tools/sbom/sbom/config.py
 create mode 100644 tools/sbom/sbom/environment.py
 create mode 100644 tools/sbom/sbom/path_utils.py
 create mode 100644 tools/sbom/sbom/sbom_logging.py
 create mode 100644 tools/sbom/sbom/spdx/__init__.py
 create mode 100644 tools/sbom/sbom/spdx/build.py
 create mode 100644 tools/sbom/sbom/spdx/core.py
 create mode 100644 tools/sbom/sbom/spdx/serialization.py
 create mode 100644 tools/sbom/sbom/spdx/simplelicensing.py
 create mode 100644 tools/sbom/sbom/spdx/software.py
 create mode 100644 tools/sbom/sbom/spdx/spdxId.py
 create mode 100644 tools/sbom/sbom/spdx_graph/__init__.py
 create mode 100644 tools/sbom/sbom/spdx_graph/build_spdx_graphs.py
 create mode 100644 tools/sbom/sbom/spdx_graph/kernel_file.py
 create mode 100644 tools/sbom/sbom/spdx_graph/shared_spdx_elements.py
 create mode 100644 tools/sbom/sbom/spdx_graph/spdx_build_graph.py
 create mode 100644 tools/sbom/sbom/spdx_graph/spdx_graph_model.py
 create mode 100644 tools/sbom/sbom/spdx_graph/spdx_output_graph.py
 create mode 100644 tools/sbom/sbom/spdx_graph/spdx_source_graph.py
 create mode 100644 tools/sbom/tests/__init__.py
 create mode 100644 tools/sbom/tests/cmd_graph/__init__.py
 create mode 100644 tools/sbom/tests/cmd_graph/test_savedcmd_parser.py
 create mode 100644 tools/sbom/tests/spdx_graph/__init__.py
 create mode 100644 tools/sbom/tests/spdx_graph/test_kernel_file.py

-- 
2.34.1
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Greg KH 2 weeks, 3 days ago
On Tue, Jan 20, 2026 at 12:53:38PM +0100, Luis Augenstein wrote:
> This patch series introduces a Python-based tool for generating SBOM
> documents in the SPDX 3.0.1 format for kernel builds.
> 
> A Software Bill of Materials (SBOM) describes the individual components
> of a software product. For the kernel, the goal is to describe the
> distributable build outputs (typically the kernel image and modules),
> the source files involved in producing these outputs, and the build
> process that connects the source and output files.
> 
> To achieve this, the SBOM tool generates three SPDX documents:
> 
> - sbom-output.spdx.json
>   Describes the final build outputs together with high-level
>   build metadata.
> 
> - sbom-source.spdx.json
>   Describes all source files involved in the build, including
>   licensing information and additional file metadata.
> 
> - sbom-build.spdx.json
>   Describes the entire build process, linking source files
>   from the source SBOM to output files in the output SBOM.
> 
> The sbom tool is optional and runs only when CONFIG_SBOM is enabled. It
> is invoked after the build, once all output artifacts have been
> generated. Starting from the kernel image and modules as root nodes,
> the tool reconstructs the dependency graph up to the original source
> files. Build dependencies are primarily derived from the .cmd files
> generated by Kbuild, which record the full command used to build
> each output file.
> 
> Currently, the tool only supports x86 and arm64 architectures.
> 
> Co-developed-by: Maximilian Huber <maximilian.huber@tngtech.com>
> Signed-off-by: Maximilian Huber <maximilian.huber@tngtech.com>
> Signed-off-by: Luis Augenstein <luis.augenstein@tngtech.com>
> ---
> Changes in v2:
> - regenerate sbom documents when build configuration changes

I'm still getting:

	make[3]: Nothing to be done for 'sbom'.

When rebuilding the kernel and nothing needs to be done for the sbom.
That message should not be there, right?

thanks,

greg k-h
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Luis Augenstein 2 weeks, 3 days ago
> I'm still getting:
>
> 	make[3]: Nothing to be done for 'sbom'.
>
> When rebuilding the kernel and nothing needs to be done for the sbom.
> That message should not be there, right?

Ah, you mean the message should always be suppressed.
Sorry, I thought your concern was just that this message appeared when
the SBOM should have been regenerated.
With the changes in v2, the SBOM should now regenerate correctly
whenever the build configuration changes.
I will include the additional suppression of the make message in the
next version.

Best,
Luis

-- 
Luis Augenstein * luis.augenstein@tngtech.com * +49-152-25275761
TNG Technology Consulting GmbH, Beta-Str. 13, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Thomas Endres
Aufsichtsratsvorsitzender: Christoph Stock
Sitz: Unterföhring * Amtsgericht München * HRB 135082
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Miguel Ojeda 2 weeks, 2 days ago
On Wed, Jan 21, 2026 at 6:55 AM Luis Augenstein
<luis.augenstein@tngtech.com> wrote:
>
> The sbom tool is optional and runs only when CONFIG_SBOM is enabled. It
> is invoked after the build, once all output artifacts have been
> generated. Starting from the kernel image and modules as root nodes,
> the tool reconstructs the dependency graph up to the original source
> files. Build dependencies are primarily derived from the .cmd files
> generated by Kbuild, which record the full command used to build
> each output file.
>
> Currently, the tool only supports x86 and arm64 architectures.

I am out of the loop, and I don't know the requirements here, but what
kind of approaches were considered for this?

Parsing the `.cmd`s seems like a bit of an ad-hoc / after-the-fact approach, and
from a very cursory look at the patches, it seems to require a fair
amount of hardcoding, e.g. it seems we may need to list every
generator tool in `SINGLE_COMMAND_PARSERS`?

Now, if this is meant to be best-effort and cover the most important
parts, it may be fine -- again, I don't know the requirements here.
But if it is meant to accurately match everything, then it will
require keeping those lists in sync with Kbuild, right?

Hmm... I feel like changing the build system itself (whether at the
Kbuild level or even a customized Make itself if needed) to record
this information would be conceptually simpler / more elegant, even if
changing Kbuild itself can sometimes be quite a challenge.

In addition, why does this need to be a `CONFIG_` option? Should this
be a separate tool or at most a target that supports whatever config
happens to be, rather than part of the config itself?

Thanks!

Cheers,
Miguel
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Luis Augenstein 2 weeks, 1 day ago
> it seems to require a fair amount of hardcoding, e.g.
> it seems we may need to list every generator tool in
> `SINGLE_COMMAND_PARSERS`?

Yes. Optimally, the cmd files would contain the full list of input
files, such that parsing the commands would no longer be necessary.
However, that was considered out of scope for this project.

> But if it is meant to accurately match everything, then it will
> require keeping those lists in sync with Kbuild, right?

Yes. The goal should be to keep the parser functions complete, that is,
to add new ones as soon as unsupported commands are discovered. It is
quite likely that the current list of parser functions is not complete.
When unsupported commands are encountered, KernelSbom is still able to
generate as much of the SBOM as possible, as explained in the last
section of the README.

> Unknown Build Commands
> ----------------------
>
> Because the kernel supports a wide range of configurations and versions,
> KernelSbom may encounter build commands in `.cmd` files that it does
> not yet support. By default, KernelSbom will fail if an unknown build
> command is encountered.
>
> If you still wish to generate SPDX documents despite unsupported
> commands, you can use the `--do-not-fail-on-unknown-build-command`
> option. KernelSbom will continue and produce the documents, although
> the resulting SBOM will be incomplete.
>
> This option should only be used when the missing portion of the
> dependency graph is small and an incomplete SBOM is acceptable for
> your use case.
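
To make the behaviour described in that README section concrete, here
is a minimal sketch of a parser registry with a strict and a lenient
mode. It is only an illustration: the registry `PARSERS`, the helper
functions and the `fail_on_unknown` parameter are hypothetical and only
loosely modelled on the idea behind `SINGLE_COMMAND_PARSERS`. The two
command shapes it handles are simplifying assumptions.

```python
import re
import shlex


class UnknownBuildCommandError(Exception):
    """Raised when no parser matches a build command."""


def _inputs_of_compile(cmd):
    # Assumption: source inputs are the tokens ending in .c or .S.
    return [t for t in shlex.split(cmd) if t.endswith((".c", ".S"))]


def _inputs_of_objcopy(cmd):
    # Assumption: the command has the form "objcopy [options] <in> <out>".
    return [shlex.split(cmd)[-2]]


# Hypothetical registry: one (pattern, parser) pair per supported tool.
PARSERS = [
    (re.compile(r"^\S*gcc\s"), _inputs_of_compile),
    (re.compile(r"^\S*objcopy\s"), _inputs_of_objcopy),
]


def parse_inputs(cmd, fail_on_unknown=True):
    """Return the input files of a build command."""
    for pattern, parser in PARSERS:
        if pattern.search(cmd):
            return parser(cmd)
    if fail_on_unknown:
        raise UnknownBuildCommandError(cmd)
    # Lenient mode: this edge of the graph is lost, the SBOM is incomplete.
    return []
```

In the lenient mode an unknown command contributes no inputs, which is
why the resulting graph (and SBOM) can only be incomplete.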


> In addition, why does this need to be a `CONFIG_` option? Should this
> be a separate tool or at most a target that supports whatever config
> happens to be, rather than part of the config itself?

The main reason to run the SBOM tool within the main make process is to
gain direct access to the make/environment variables used during the
build. The `KERNEL_BUILD_VARIABLES_ALLOWLIST` defines which environment
variables should be included in the SBOM if they are available. When the
tool is run outside of the main build, this information is no longer
accessible.
We are looking for a better place for the CONFIG_SBOM option though, as
`lib/Kconfig.debug` may not be the most appropriate place?
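
For illustration, the allowlist filtering mentioned above could look
roughly like the following sketch. The variable names in `ALLOWLIST`
and the function `collect_build_variables` are assumptions for the
example; the actual contents of `KERNEL_BUILD_VARIABLES_ALLOWLIST` may
differ.

```python
import os

# Hypothetical allowlist; the real list in the series may differ.
ALLOWLIST = ("ARCH", "CROSS_COMPILE", "LLVM", "KBUILD_BUILD_TIMESTAMP")


def collect_build_variables(environ=None, allowlist=ALLOWLIST):
    """Return only the allowlisted variables that are actually set."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in allowlist if name in env}
```

Called from within the make process, `os.environ` contains the
exported make variables; called later from a plain shell, most of them
are simply absent.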

Best,
Luis

Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Miguel Ojeda 1 week, 5 days ago
On Thu, Jan 22, 2026 at 9:32 PM Luis Augenstein
<luis.augenstein@tngtech.com> wrote:
>
> The main reason to run the SBOM tool within the main make process is to
> gain direct access to the make/environment variables used during the
> build. The `KERNEL_BUILD_VARIABLES_ALLOWLIST` defines which environment
> variables should be included in the SBOM if they are available. When the
> tool is run outside of the main build, this information is no longer
> accessible.

I was not suggesting to take it out of `make` completely if the
environment is needed, but rather have the user call the target (which
could still depend on the kernel build like you have it now).

For instance, for generating the rust-analyzer configuration, we want
to have the environment too, so we have a Make target that users call
when they need it, rather than making it a configuration of the
kernel.

Now, I can understand there may be other reasons (please see my reply to Greg).

Thanks!

Cheers,
Miguel
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Luis Augenstein 1 week, 5 days ago
> I was not suggesting to take it out of `make` completely if the
> environment is needed, but rather have the user call the target (which
> could still depend on the kernel build like you have it now).

Thanks for clarifying.
I will try that.

Best,
Luis

Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Greg KH 2 weeks, 2 days ago
On Thu, Jan 22, 2026 at 07:18:18AM +0100, Miguel Ojeda wrote:
> On Wed, Jan 21, 2026 at 6:55 AM Luis Augenstein
> <luis.augenstein@tngtech.com> wrote:
> >
> > The sbom tool is optional and runs only when CONFIG_SBOM is enabled. It
> > is invoked after the build, once all output artifacts have been
> > generated. Starting from the kernel image and modules as root nodes,
> > the tool reconstructs the dependency graph up to the original source
> > files. Build dependencies are primarily derived from the .cmd files
> > generated by Kbuild, which record the full command used to build
> > each output file.
> >
> > Currently, the tool only supports x86 and arm64 architectures.
> 
> I am out of the loop, and I don't know the requirements here, but what
> kind of approaches were considered for this?

Lots of different attempts, usually using bpf and other run-time tracing
tools.  But it was determined that we already have this info in our
build dependency files, so parsing them was picked.

> Parsing the `.cmd`s seems like a bit of an ad-hoc / after-the-fact approach, and
> from a very cursory look at the patches, it seems to require a fair
> amount of hardcoding, e.g. it seems we may need to list every
> generator tool in `SINGLE_COMMAND_PARSERS`?

If you know of a better way, that would be great!

> Now, if this is meant to be best-effort and cover the most important
> parts, it may be fine -- again, I don't know the requirements here.
> But if it is meant to accurately match everything, then it will
> require keeping those lists in sync with Kbuild, right?

It should match everything, and yes, it will require keeping things in
sync.

> Hmm... I feel like changing the build system itself (whether at the
> Kbuild level or even a customized Make itself if needed) to record
> this information would be conceptually simpler / more elegant, even if
> changing Kbuild itself can sometimes be quite a challenge.

Changing kbuild would be great too, if you know of a way we can get that
info out of it.

> In addition, why does this need to be a `CONFIG_` option? Should this
> be a separate tool or at most a target that supports whatever config
> happens to be, rather than part of the config itself?

It should be part of the kernel build process, and generated as part of
it as it will want to go into some packages directly.  Having to run the
build "again" is probably not a good idea (i.e. do you want to modify
all the distro rpm scripts?)

thanks,

greg k-h
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Miguel Ojeda 1 week, 5 days ago
On Thu, Jan 22, 2026 at 7:35 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> Lots of different attempts, usually using bpf and other run-time tracing
> tools.  But it was determined that we already have this info in our
> build dependency files, so parsing them was picked.
>
> If you know of a better way, that would be great!

Yes, if I understand correctly, then this should be done on the build
system side (i.e. I don't see how BPF/tracing could achieve this, so
maybe I am missing something), but what I meant is that there are
several ways to do this on the build system side.

One is this kind of post-processing after the build, which is easier
in that it avoids touching Kbuild and can be written in something like
Python, which always helps. The downside (my worry) is that it
introduces yet another layer to Kbuild.

My first instinct would have been to try to see if the build system
itself could already give us what is built while it gets built (i.e.
just like it outputs the `cmd` files). So I wondered if that was
considered.

> Changing kbuild would be great too, if you know of a way we can get that
> info out of it.

It depends on what is needed, but Kbuild of course knows about input
and output files and dependencies, so I was thinking of outputting
that information in an easier format instead of having to parse
command lines from the `cmd` files.

> It should be part of the kernel build process, and generated as part of
> it as it will want to go into some packages directly.  Having to run the
> build "again" is probably not a good idea (i.e. do you want to modify
> all the distro rpm scripts?)

Even with `CONFIG_SBOM`, they will need to modify at least their
kernel configuration, and perhaps more if they want to save the SBOM
files differently, e.g. in another package etc. So I am not sure it is
much different for any distro from adding a word to their `make`
line.

Now, I understand it may be easier to tell users to "just turn one
more config", and perhaps it looks more "integrated" to them, but I
mainly asked because, to me, the SBOM is orthogonal to the kernel
configuration.

In other words, I would have expected to be able to get an SBOM for
any build, without having to modify the kernel configuration at all.
After all, the kernel image should not change at all whether there is
an SBOM or not. We also do not do that for some other big "globally
orthogonal" things that involve generating extra files, like
documentation.

I hope that helps somehow...

Cheers,
Miguel
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Greg KH 1 week, 5 days ago
On Sun, Jan 25, 2026 at 04:20:40PM +0100, Miguel Ojeda wrote:
> On Thu, Jan 22, 2026 at 7:35 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > Lots of different attempts, usually using bpf and other run-time tracing
> > tools.  But it was determined that we already have this info in our
> > build dependency files, so parsing them was picked.
> >
> > If you know of a better way, that would be great!
> 
> Yes, if I understand correctly, then this should be done on the build
> system side (i.e. I don't see how BPF/tracing could achieve this, so
> maybe I am missing something), but what I meant is that there are
> several ways to do this on the build system side.

for a horrible hack of an example of how you can do this using
bpf/tracing, see this "fun" thing that I use every so often:
	https://github.com/gregkh/gregkh-linux/blob/master/scripts/trace_kernel_build.sh
it uses bpftrace to inject a script and then do a build and then
post-process the output.  Not something you should ever do on a "real"
build system :)

> One is this kind of post-processing after the build, which is easier
> in that it avoids touching Kbuild and can be written in something like
> Python, which always helps. The downside (my worry) is that it
> introduces yet another layer to Kbuild.

That's what is happening here, it's post-processing the build files to
detect the dependency graph it already knows about.

> My first instinct would have been to try to see if the build system
> itself could already give us what is built while it gets built (i.e.
> just like it outputs the `cmd` files). So I wondered if that was
> considered.

cmake can do this, that's what Zephyr uses, but we don't use cmake for
kernel builds.  I know the gnu toolchain developers have talked about
adding this to make/gcc/whatever in the past, and I thought Red Hat was
funding them to do that, but it seems to have never gone anywhere and
it's been years since I last heard from them.

> > Changing kbuild would be great too, if you know of a way we can get that
> > info out of it.
> 
> It depends on what is needed, but Kbuild of course knows about input
> and output files and dependencies, so I was thinking of outputting
> that information in an easier format instead of having to parse
> command lines from the `cmd` files.

"all" we need is the list of files that are used to make the resulting
kernel image and modules.  Given that the kernel build is
self-contained, and does not pull in anything from outside of its tree
(well, with the exception of some rust things I think), we should be ok.

And kbuild already encodes this information in the cmd files, for the
most part (there are corner cases and exceptions which the developers
here have gone to great lengths to track down and document in the
scripts.)  So 99% of the info is there already, which is why the cmd
files are used for parsing, no need to re-create that info in
yet-another-format, right?

> > It should be part of the kernel build process, and generated as part of
> > it as it will want to go into some packages directly.  Having to run the
> > build "again" is probably not a good idea (i.e. do you want to modify
> > all the distro rpm scripts?)
> 
> Even with `CONFIG_SBOM`, they will need to modify at least their
> kernel configuration, and perhaps more if they want to save the SBOM
> files differently, e.g. in another package etc. So I am not sure it is
> much different for any distro from adding a word to their `make`
> line.

Let's stick with a config option for now please.  If the distros who
will need/want this decide to do it in a different way, they can send
patches :)

For now, this should be sufficient.

> Now, I understand it may be easier to tell users to "just turn one
> more config", and perhaps it looks more "integrated" to them, but I
> mainly asked because, to me, the SBOM is orthogonal to the kernel
> configuration.

It's a build-time output, just like debugging symbols are, and
documentation.  Ok, documentation is a separate build target, and "to
the side" of the source build, but you get the idea :)

> In other words, I would have expected to be able to get an SBOM for
> any build, without having to modify the kernel configuration at all.
> After all, the kernel image should not change at all whether there is
> an SBOM or not. We also do not do that for some other big "globally
> orthogonal" things that involve generating extra files, like
> documentation.

The SBOM is directly tied to the kernel configuration in that it needs
to know the config in order to determine exactly what files were used to
generate the resulting binaries.  That's what the SBOM is documenting,
not "all of the files in the tarball", but just "these are the files
that are required to build the binaries".  Which is a tiny subset of the
overall files in the tree, and is really, all that the target system
cares about.

thanks,

greg k-h
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Miguel Ojeda 1 week, 5 days ago
On Sun, Jan 25, 2026 at 4:34 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> for a horrible hack of an example of how you can do this using
> bpf/tracing, see this "fun" thing that I use every so often:
>         https://github.com/gregkh/gregkh-linux/blob/master/scripts/trace_kernel_build.sh
> it uses bpftrace to inject a script and then do a build and then
> post-process the output.  Not something you should ever do on a "real"
> build system :)

Oof... So you really meant tracing the builder. :)

> That's what is happening here, it's post-processing the build files to
> detect the dependency graph it already knows about.

Yeah, I am aware -- I mentioned both sides to explain the upsides and
my worry about hardcoding all this stuff out of band.

> cmake can do this, that's what Zephyr uses, but we don't use cmake for
> kernel builds.  I know the gnu toolchain developers have talked about
> adding this to make/gcc/whatever in the past, and I thought Red Hat was
> funding them to do that, but it seems to have never gone anywhere and
> it's been years since I last heard from them.

Well, Make can ""do"" things like that, in the sense that we can
program it, which is essentially what Kbuild does -- it "hacks" the
usual Make graph to do extra stuff on top.

i.e. Kbuild and friends are the ones writing the `.cmd` files and
running custom filtering and so on, not Make, and just like we abuse
Make to do that, in principle we could encode and output more
information (if that would help).

To be clear, I am not sure exactly what information is needed --
when I was Cc'd for the Rust bit, I noticed it was parsing the command
line to try to guess more deps (?), which seemed odd and I wondered
whether we could provide that (even if it requires additions) so that
we don't need to parse those.

> And kbuild already encodes this information in the cmd files, for the
> most part (there are corner cases and exceptions which the developers
> here have gone to great lengths to track down and document in the
> scripts.)

By "corner cases and exceptions", I assume you mean the hardcoded ones
(not the command line parsing), which I hadn't noticed yet.

Those aren't really documented from what I can see? It is just the
list of cases, which we will also have to maintain.

I also see the `.incbin` now, which is even more hardcoding, but I see
Makefiles explicitly adding the dependency on their side, which is
closer to what I am saying: that it would be better to add the
dependencies (or whatever information is needed) in the build system
side.

In other words, we could make those generate a `.cmd` file or similar,
rather than hardcode it in the script.

I guess my question to Luis et al. is: for things like `.incbin` and
the hardcoded dependencies, is there a reason to avoid declaring any
missing dependencies or to generate the `.cmd` files to begin with?

The patches say, for instance:

    Some files in the kernel build process are not tracked by the .cmd
    dependency mechanism.
    Parsing these dependencies programmatically is too complex for the
    scope of this project.
    Therefore, this function provides manually defined dependencies to
    be added to the build graph.

And my point is precisely that we should be parsing neither Makefiles
nor command lines, if at all possible. Instead, if there are
missing dependencies, we should fix them; and if there are missing
`.cmd`s (i.e. dependency information not saved) you need, we could add
those, and so on.

> So 99% of the info is there already, which is why the cmd
> files are used for parsing, no need to re-create that info in
> yet-another-format, right?

Yeah, the extra information may be just in `deps_` in the `.cmd`, or
it could be an extra variable there or whatever is needed, i.e. no
need for a new format.

i.e. what I was trying to avoid was the out-of-band hardcoding as much
as possible.

> Let's stick with a config option for now please.  If the distros who
> will need/want this decide to do it in a different way, they can send
> patches :)

In case it wasn't clear: for the config bit, it wouldn't be a big
change -- it would just require removing ~10 lines unless I am missing
something.

But if this was already discussed with users or you think it will be
easier etc., then fine, I won't press. :)

> It's a build-time output, just like debugging symbols are, and
> documentation.  Ok, documentation is a separate build target, and "to
> the side" of the source build, but you get the idea :)

Just in case: debugging symbols are different -- they change the
actual build (e.g. flags), and even the actual image (unless requested
to be separate). So those make sense as a config option.

(We also have other targets that work like docs, i.e. it is not just
docs. But fine... :)

> The SBOM is directly tied to the kernel configuration in that it needs
> to know the config in order to determine exactly what files were used to
> generate the resulting binaries.  That's what the SBOM is documenting,
> not "all of the files in the tarball", but just "these are the files
> that are required to build the binaries".  Which is a tiny subset of the
> overall files in the tree, and is really, all that the target system
> cares about.

To clarify, I didn't suggest we document the files in the tarball, nor
that the kernel config doesn't influence the SBOM contents.

What I am saying is that whether an SBOM is generated or not is
orthogonal to the kernel configuration, and as a user I would have
expected to be able to obtain one in my build without having to change
any configuration.

I hope that helps.

Cheers,
Miguel
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Luis Augenstein 1 week, 4 days ago
>> Let's stick with a config option for now please. If the distros who
>> will need/want this decide to do it in a different way, they can send
>> patches :)
>
> In case it wasn't clear: for the config bit, it wouldn't be a big
> change -- it would just require removing ~10 lines unless I am missing
> something.
>
> But if this was already discussed with users or you think it will be
> easier etc., then fine, I won't press. :)

Thanks for this suggestion.
I explored this approach in v3 where I removed the CONFIG_SBOM option
and instead introduced the `make sbom` target to invoke the sbom tool.
This way the default `all` target is not changed at all. The `sbom`
target depends on `all` such that `make sbom` can be used to build the
kernel if not present and generate the SBOM afterwards. The environment
variables picked up by the SBOM remain the same as in v2. I like that
this solves the need to find a good place for a CONFIG_SBOM option since
the previous location in lib/Kconfig.debug was not optimal.
Let me know what you think.

> To be clear, I am not sure exactly what information is needed --
> when I was Cc'd for the Rust bit, I noticed it was parsing the command
> line to try to guess more deps (?), which seemed odd and I wondered
> whether we could provide that (even if it requires additions) so that
> we don't need to parse those.
> [...]
> In other words, we could make those generate a `.cmd` file or similar,
> rather than hardcode it in the script.
>
> I guess my question to Luis et al. is: for things like `.incbin` and
> the hardcoded dependencies, is there a reason to avoid declaring any
> missing dependencies or to generate the `.cmd` files to begin with?

I agree that it would likely be a cleaner solution if the `.cmd` files
directly contained all dependencies. Ideally, the SBOM tool would only
need to consume dependency information from the existing `.cmd` files,
without parsing build commands or collecting additional dependencies
that are not tracked by the `.cmd` mechanism.
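
As a rough illustration of that ideal case (a sketch only, not the code
from this series), reading the `deps_<target>` variable out of a `.cmd`
file could look like the following. The function name and the sample
file contents are made up for the example, and dropping the
`$(wildcard include/config/...)` entries is an assumption about what an
SBOM consumer would want:

```python
import re


def parse_cmd_deps(cmd_text: str) -> list[str]:
    """Return the file entries of the `deps_<target> := ...` variable."""
    # Join backslash-continued lines so the variable sits on one logical line.
    joined = re.sub(r"\\\n", " ", cmd_text)
    for line in joined.splitlines():
        match = re.match(r"deps_\S+\s*:=\s*(.*)", line)
        if match:
            # Assumption: $(wildcard include/config/...) entries track config
            # options rather than input files, so they are dropped here.
            value = re.sub(r"\$\(wildcard [^)]*\)", "", match.group(1))
            return value.split()
    return []


# Hypothetical, heavily shortened .cmd contents for illustration.
SAMPLE_CMD = (
    "savedcmd_init/main.o := gcc -c -o init/main.o init/main.c\n"
    "\n"
    "source_init/main.o := init/main.c\n"
    "\n"
    "deps_init/main.o := \\\n"
    "  $(wildcard include/config/SMP) \\\n"
    "  include/linux/types.h \\\n"
    "  include/linux/init.h \\\n"
    "\n"
    "init/main.o: $(deps_init/main.o)\n"
)

print(parse_cmd_deps(SAMPLE_CMD))
```

Anything that is not listed in `deps_` (for example files pulled in via
`.incbin`) is invisible to a parser like this, which is where the
command-line parsing and the hardcoded list come from.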

However, as mentioned previously, adapting the `.cmd` file generation
was considered out of scope for this project. Early on, we decided to
focus on an isolated tool that works with the information currently
available to keep the scope manageable. Improving dependency coverage at
the Kbuild level would certainly be a worthwhile follow-up, but it was
not something we could realistically address within this effort.

Best,
Luis

-- 
Luis Augenstein * luis.augenstein@tngtech.com * +49-152-25275761
TNG Technology Consulting GmbH, Beta-Str. 13, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Thomas Endres
Aufsichtsratsvorsitzender: Christoph Stock
Sitz: Unterföhring * Amtsgericht München * HRB 135082
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Nathan Chancellor 1 week, 3 days ago
Hi Luis, Greg, and Miguel,

Sorry for not having any input up until this point, as I felt this was
not going to be ready for 6.20/7.0 and I wanted to focus on getting
things ready for that release (on top of other work). Some high-level
comments based on what has been discussed so far follow, as it was going
to be hard to reply inline to everything. I will try to take a closer
look at v3 in the next couple of weeks but I might not get to it until
after the merge window closes.

I agree with Miguel that if there is any information that we can add to
the .cmd files or another file generated by Kbuild to avoid hard coding
things while preprocessing, it should be pursued, as we should be making
the build system work for us. We already have some prior art with
post-processing Kbuild files, like scripts/clang-tools/gen_compile_commands.py,
so I am not too worried about that. At the same time, I do like how self
contained the implementation currently is, as it is just there available
for people to use if they want it but it impacts nothing if it is not
being used. It also makes it an easier maintenance burden in the
immediate term, as I would like this to be shown as useful to various
entities before it starts to entangle itself into the build system.

I think getting rid of CONFIG_SBOM in favor of just an sbom make target
is a good direction. If we really wanted some sort of configuration
option, I think it should only mean "generate an SBOM by default" and
nothing more but I worry about this getting turned on via compile
testing and causing issues. At that point, it feels like whatever entity
wants this information can just add 'make sbom' to their build system
since they may have to control the outputs beyond the simple "all" make
target.

I wonder if it would be better for this to live within scripts/ instead
of tools/, as that should allow it to be integrated into the build
process a little bit more naturally, such as using $(Q) instead of @,
$(PYTHON3) instead of the bare python3, being able to access
CONFIG_MODULES directly, and cleaning up the actual implementation of
the sbom target in Makefile.

Cheers,
Nathan
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Luis Augenstein 4 days, 17 hours ago
Hi Nathan,

> I wonder if it would be better for this to live within scripts/ instead
> of tools/, as that should allow it to be integrated into the build
> process a little bit more naturally, such as using $(Q) instead of @,
> $(PYTHON3) instead of the bare python3, being able to access
> CONFIG_MODULES directly, and cleaning up the actual implementation of
> the sbom target in Makefile.

Thanks, I wasn’t aware that targets under scripts/ have access to more
Make variables by default. During development, we didn’t have strong
reasons for choosing either tools/ or scripts/. I’m happy to move it to
scripts/ if that is the preferred location.

Regarding $(Q) and $(PYTHON3), I noticed that these variables are
actually available within the tools/ directory as well, so we could
switch to using them even if we stay under tools/.

CONFIG_MODULES and src_tree, on the other hand, need to be passed
explicitly when staying in tools/, whereas they would be available by
default under scripts/ in which case we could simply invoke the script via:
```Makefile
PHONY += sbom
sbom: all
	$(Q)$(MAKE) $(build)=scripts/sbom
```

So yes, I think it makes sense to move it to scripts then.

Best,
Luis
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Nathan Chancellor 4 days, 9 hours ago
Hi Luis,

On Mon, Feb 02, 2026 at 05:28:39PM +0100, Luis Augenstein wrote:
> > I wonder if it would be better for this to live within scripts/ instead
> > of tools/, as that should allow it to be integrated into the build
> > process a little bit more naturally, such as using $(Q) instead of @,
> > $(PYTHON3) instead of the bare python3, being able to access
> > CONFIG_MODULES directly, and cleaning up the actual implementation of
> > the sbom target in Makefile.
> 
> Thanks, I wasn’t aware that targets under scripts/ have access to more
> Make variables by default. During development, we didn’t have strong

I think this is a byproduct of being fully within Kbuild at that point,
rather than in the tools/ build system.

> reasons for choosing either tools/ or scripts/. I’m happy to move it to
> scripts/ if that is the preferred location.

Yes please. If this tool is designed to run within and parse Kbuild, it
should live fully within Kbuild, as the "tools build system" comment in
Makefile added by Masahiro in commit 6e6ef2da3a28 ("Makefile: add
comment to discourage tools/* addition for kernel builds") notes (even
though this is not a C program so the hostprogs comment is irrelevant
here). scripts/sbom seems entirely reasonable to me.

> Regarding $(Q) and $(PYTHON3), I noticed that these variables are
> actually available within the tools/ directory as well, so we could
> switch to using them even if we stay under tools/.

Ah, good to know. I do not delve into the tools build system all too
much.

> CONFIG_MODULES and src_tree, on the other hand, need to be passed
> explicitly when staying in tools/, whereas they would be available by
> default under scripts/ in which case we could simply invoke the script via:
> ```Makefile
> PHONY += sbom
> sbom: all
> 	$(Q)$(MAKE) $(build)=scripts/sbom
> ```
> 
> So yes, I think it makes sense to move it to scripts then.

Yeah, that looks much cleaner to me. I suspect scripts/sbom/Makefile
could be cleaned up a little bit as a result of that move as well.

Also, two other comments I forgot to bring up:

1. With the movement out of tools/, I think the README should become a
   proper Documentation file so that its contents are more discoverable.
   That should probably be separate from the change that adds the
   initial SBOM scaffolding in Kbuild to help with review.

2. This depends on having a clean initial build tree (either empty
   directory or 'clean' as a make target) due to needing to parse the
   .cmd files, which could be stale if someone builds a kernel, changes
   their config, and rebuilds, right? This should be documented since I
   do not think it is possible to do something like what Masahiro did in
   commit 3d32285fa995 ("kbuild: wire up the build rule of
   compile_commands.json to Makefile") because of the drawback that it
   misses too many things.

Cheers,
Nathan
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Luis Augenstein 3 days, 19 hours ago
Hi Nathan,

> 2. This depends on having a clean initial build tree (either empty
>    directory or 'clean' as a make target) due to needing to parse the
>    .cmd files, which could be stale if someone builds a kernel, changes
>    their config, and rebuilds, right? This should be documented since I
>    do not think it is possible to do something like what Masahiro did in
>    commit 3d32285fa995 ("kbuild: wire up the build rule of
>    compile_commands.json to Makefile") because of the drawback that it
>    misses too many things.

There might be edge cases, but in general stale .cmd files should not be
an issue.

The script does not scan the build tree for .cmd files. It starts from a
set of root build artifacts (kernel image and .ko modules listed in
modules.order). From these roots, it parses the corresponding .cmd files
to discover the immediate dependencies, and then recursively processes
the .cmd files of those dependencies, effectively walking the entire
dependency graph up to the individual source files.

Stale .cmd files should not be referenced as dependencies by the root
artifacts and therefore not be part of the resulting dependency graph.
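
To make the argument concrete, here is a minimal sketch of such a walk.
It is not the implementation from the series; `collect_reachable`, the
`deps_of` callback, and the toy graph are hypothetical stand-ins for
"parse the .cmd file of this path and return its dependencies":

```python
from collections.abc import Callable, Iterable


def collect_reachable(
    roots: Iterable[str],
    deps_of: Callable[[str], Iterable[str]],
) -> set[str]:
    """Return every path reachable from `roots` by following `deps_of`."""
    seen: set[str] = set()
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in seen:
            continue  # already visited; also guards against cycles
        seen.add(path)
        stack.extend(deps_of(path))
    return seen


# Hypothetical dependency data standing in for parsed .cmd files.
GRAPH = {
    "vmlinux": ["init/main.o"],
    "init/main.o": ["init/main.c", "include/linux/init.h"],
    # Left over from an earlier config; nothing references it any more.
    "drivers/old/stale.o": ["drivers/old/stale.c"],
}

reachable = collect_reachable(["vmlinux"], lambda p: GRAPH.get(p, []))
print(sorted(reachable))
```

The stale object still has an entry in the toy graph, but because no
root leads to it, it never shows up in the result.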

Best,
Luis
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Nathan Chancellor 3 days, 13 hours ago
On Tue, Feb 03, 2026 at 03:41:42PM +0100, Luis Augenstein wrote:
> Hi Nathan,
> 
> > 2. This depends on having a clean initial build tree (either empty
> >    directory or 'clean' as a make target) due to needing to parse the
> >    .cmd files, which could be stale if someone builds a kernel, changes
> >    their config, and rebuilds, right? This should be documented since I
> >    do not think it is possible to do something like what Masahiro did in
> >    commit 3d32285fa995 ("kbuild: wire up the build rule of
> >    compile_commands.json to Makefile") because of the drawback that it
> >    misses too many things.
> 
> There might be edge cases, but in general stale .cmd files should not be
> an issue.
> 
> The script does not scan the build tree for .cmd files. It starts from a
> set of root build artifacts (kernel image and .ko modules listed in
> modules.order). From these roots, it parses the corresponding .cmd files
> to discover the immediate dependencies, and then recursively processes
> the .cmd files of those dependencies, effectively walking the entire
> dependency graph up to the individual source files.
> 
> Stale .cmd files should not be referenced as dependencies by the root
> artifacts and therefore not be part of the resulting dependency graph.

Ah okay, thanks for the explanation! I have not had a chance to review
the actual Python implementation yet. It sounds very similar to the
approach taken by Masahiro for compile_commands.json but by looking at
the .cmd files recursively from the root artifacts.

Cheers,
Nathan
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Miguel Ojeda 1 week, 5 days ago
On Sun, Jan 25, 2026 at 4:20 PM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> After all, the kernel image should not change at all whether there is
> an SBOM or not.

Well, I guess the SBOM could be saved into the kernel itself (and
perhaps retrieved in different ways, e.g. at runtime), in which case
an option definitely makes sense.

Cheers,
Miguel
Re: [PATCH v2 00/14] Add SPDX SBOM generation tool
Posted by Greg KH 1 week, 5 days ago
On Sun, Jan 25, 2026 at 04:33:28PM +0100, Miguel Ojeda wrote:
> On Sun, Jan 25, 2026 at 4:20 PM Miguel Ojeda
> <miguel.ojeda.sandonis@gmail.com> wrote:
> >
> > After all, the kernel image should not change at all whether there is
> > an SBOM or not.
> 
> Well, I guess the SBOM could be saved into the kernel itself (and
> perhaps retrieved in different ways, e.g. at runtime), in which case
> an option definitely makes sense.

Ick, let's not dump the HUGE sbom json file inside the binary kernel
image itself, unless you want to do something like /proc/sbom.gz?

That would be funny, but a big waste of memory :)

greg k-h