Documentation: Provide guidelines for kernel development tools

[PATCH] Documentation: Provide guidelines for kernel development tools

Posted by Dave Hansen 3 months, 1 week ago

In the last few years, the capabilities of coding tools have exploded.
As those capabilities have expanded, contributors and maintainers have
more and more questions about how and when to apply those
capabilities.

The shiny new AI tools (chatbots, coding assistants and more) are
impressive.  Add new Documentation to guide contributors on how to
best use kernel development tools, new and old.

Note, though, there are fundamentally no new or unique rules in this
new document. It clarifies expectations that the kernel community has
had for many years. For example, researchers are already asked to
disclose the tools they use to find issues in
Documentation/process/researcher-guidelines.rst. This new document
just reiterates existing best practices for development tooling.

In short: Please show your work and make sure your contribution is
easy to review.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>

--

This document was a collaborative effort from all the members of
the TAB. I just reformatted it into .rst and wrote the changelog.
---
 Documentation/process/development-tools.rst | 92 +++++++++++++++++++++
 1 file changed, 92 insertions(+)
 create mode 100644 Documentation/process/development-tools.rst

diff --git a/Documentation/process/development-tools.rst b/Documentation/process/development-tools.rst
new file mode 100644
index 0000000000000..ab6596cc595ac
--- /dev/null
+++ b/Documentation/process/development-tools.rst
@@ -0,0 +1,92 @@
+============================================
+Kernel Guidelines for Tool Generated Content
+============================================
+
+Purpose
+=======
+
+Kernel contributors have been using tooling to generate contributions
+for a long time. These tools are constantly becoming more capable and
+undoubtedly improve developer productivity. At the same time, reviewer
+and maintainer bandwidth is a very scarce resource. Understanding
+which portions of a contribution come from humans versus tools is
+critical to maintain those resources and keep kernel development
+healthy.
+
+The goal here is to clarify community expectations around tools. This
+lets everyone become more productive while also maintaining high
+degrees of trust between submitters and reviewers.
+
+Out of Scope
+============
+
+These guidelines do not apply to tools that make trivial tweaks to
+preexisting content. Nor do they pertain to AI tooling that helps with
+menial tasks. Some examples:
+
+ - Spelling and grammar fix ups, like rephrasing to imperative voice
+ - Typing aids like identifier completion, common boilerplate or
+   trivial pattern completion
+ - Purely mechanical transformations like variable renaming
+ - Reformatting, like running scripts/Lindent.
+
+Even if your tool use is out of scope, you should still consider
+whether telling the reviewer about the tool you used would make your
+contribution easier to review.
+
+In Scope
+========
+
+These guidelines apply when a meaningful amount of content in a kernel
+contribution was not written by a person in the Signed-off-by chain,
+but was instead created by a tool.
+
+Some examples:
+ - “checkpatch.pl --fix” output, or any tool suggested fix.
+ - coccinelle scripts
+ - ChatGPT generated a new function in your patch to sort list entries.
+ - A .c file in the patch was originally generated by Gemini but cleaned
+   up by hand.
+ - The changelog was generated by handing the patch to a generative AI
+   tool and asking it to write the changelog.
+ - The changelog was translated from another language.
+ - Detection of a problem is also a part of the development process; if
+   a tool was used to find a problem addressed by a change, that should
+   be noted in the changelog. This not only gives credit where it is
+   due, it also helps fellow developers find out about these tools.
+
+If in doubt, choose transparency and assume these guidelines apply to
+your contribution.
+
+Guidelines
+==========
+
+First, read the Developer's Certificate of Origin:
+``Documentation/process/submitting-patches.rst``. Its rules are simple
+and have been in place for a long time. They have covered many
+tool-generated contributions.
+
+Second, when making a contribution, be transparent about the origin of
+content in cover letters and changelogs. You can be more transparent
+by adding information like this:
+
+ - What tools were used?
+ - The input to the tools you used, like the coccinelle source script.
+ - If code was largely generated from a single or short set of
+   prompts, include those prompts in the commit log. For longer
+   sessions, include a summary of the prompts and the nature of
+   resulting assistance.
+ - Which portions of the content were affected by that tool?
+
+As with all contributions, individual maintainers have discretion to
+choose how they handle the contribution. For example, they might:
+
+ - Treat it just like any other contribution
+ - Reject it outright
+ - Review the contribution with extra scrutiny
+ - Suggest a better prompt instead of suggesting specific code changes
+ - Ask for some other special steps, like asking the contributor to
+   elaborate on how the tool or model was trained
+ - Ask the submitter to explain in more detail about the contribution
+   so that the maintainer can feel comfortable that the submitter fully
+   understands how the code works.
-- 
2.34.1

Re: [PATCH] Documentation: Provide guidelines for kernel development tools

Posted by Miguel Ojeda 3 months, 1 week ago

Hi Dave,

Some more nits and suggestions below...

On Mon, Oct 27, 2025 at 9:13 PM Dave Hansen <dave.hansen@linux.intel.com> wrote:
>
> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>

No big deal either way, but for the commit you could use ojeda@kernel.org

> + - Spelling and grammar fix ups, like rephrasing to imperative voice

Some bullet points have periods at the end, others don't.

> + - Reformatting, like running scripts/Lindent.

You could perhaps add

    Lindent, ``clang-format`` or ``rustfmt``

or similar (the Rust one is particularly relevant because we enforce
that one treewide unlike the C one).

> + - “checkpatch.pl --fix” output, or any tool suggested fix.

We should use a code span for this one.

Also, should it be "tool-suggested", i.e. with a hyphen? (not a native speaker)
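
As a rough sketch only, with both changes applied (and assuming the hyphenated spelling is the one we want), the bullet could look something like this:

```rst
 - ``checkpatch.pl --fix`` output, or any tool-suggested fix.
```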

> + - coccinelle scripts

Coccinelle

> + - ChatGPT generated a new function in your patch to sort list entries.
> + - A .c file in the patch was originally generated by Gemini but cleaned
> +   up by hand.

Like Jon, I also thought about using just "LLM" here or perhaps
mentioning "open" models. On the other hand, it is clear that commercial
models are already in use, e.g. Gemini already appears in the commit
log and Claude on the mailing list.

I hope that helps -- thanks for sending this!

Cheers,
Miguel

Re: [PATCH] Documentation: Provide guidelines for kernel development tools

Posted by Sasha Levin 3 months, 1 week ago

On Tue, Oct 28, 2025 at 04:29:14PM +0100, Miguel Ojeda wrote:
>On Mon, Oct 27, 2025 at 9:13 PM Dave Hansen <dave.hansen@linux.intel.com> wrote:
>> + - ChatGPT generated a new function in your patch to sort list entries.
>> + - A .c file in the patch was originally generated by Gemini but cleaned
>> +   up by hand.
>
>Like Jon, I also thought about using just "LLM" here or perhaps
>mentioning "open" models. On the other hand, it is clear that commercial
>models are already in use, e.g. Gemini already appears in the commit
>log and Claude on the mailing list.

I *think* that this was based on the experience[1] Kees had with LLMs, and my
thinking was that this example would match what a developer would write if
asked to document their use of tooling.

And yes, we shouldn't mention proprietary brand names in kernel docs, so I
fully agree with Jon and Miguel here. We should however encourage contributors
to list the actual tools/LLMs they used, as this could be interesting down the
road...

[1] https://hachyderm.io/@kees/114907228284590439

-- 
Thanks,
Sasha

Re: [PATCH] Documentation: Provide guidelines for kernel development tools

Posted by Jonathan Corbet 3 months, 1 week ago

Dave Hansen <dave.hansen@linux.intel.com> writes:

> In the last few years, the capabilities of coding tools have exploded.
> As those capabilities have expanded, contributors and maintainers have
> more and more questions about how and when to apply those
> capabilities.
>
> The shiny new AI tools (chatbots, coding assistants and more) are
> impressive.  Add new Documentation to guide contributors on how to
> best use kernel development tools, new and old.
>
> Note, though, there are fundamentally no new or unique rules in this
> new document. It clarifies expectations that the kernel community has
> had for many years. For example, researchers are already asked to
> disclose the tools they use to find issues in
> Documentation/process/researcher-guidelines.rst. This new document
> just reiterates existing best practices for development tooling.
>
> In short: Please show your work and make sure your contribution is
> easy to review.
>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Theodore Ts'o <tytso@mit.edu>
> Cc: Sasha Levin <sashal@kernel.org>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Kees Cook <kees@kernel.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
> Cc: Shuah Khan <shuah@kernel.org>
>
> --
>
> This document was a collaborative effort from all the members of
> the TAB. I just reformatted it into .rst and wrote the changelog.

Generally seems good to me, but I have a few nits.

> ---
>  Documentation/process/development-tools.rst | 92 +++++++++++++++++++++
>  1 file changed, 92 insertions(+)
>  create mode 100644 Documentation/process/development-tools.rst

You didn't add it to index.rst, so it won't be part of the docs build.
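
For illustration, the missing piece would be an entry in one of the toctrees in Documentation/process/index.rst, roughly like the sketch below. Which toctree it belongs in and the neighbouring entry shown are assumptions, not something taken from the patch:

```rst
.. Assumed placement: the neighbouring entry is illustrative only.

.. toctree::
   :maxdepth: 1

   submitting-patches
   development-tools
```

The entry is the file name without the .rst suffix, so it would need to follow any rename of the file.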

"development-tools" is a fairly generic file name that doesn't really
tell readers what they might find within it.  Maybe this shed is better
named "generated-content.rst" or something like that?

> diff --git a/Documentation/process/development-tools.rst b/Documentation/process/development-tools.rst
> new file mode 100644
> index 0000000000000..ab6596cc595ac
> --- /dev/null
> +++ b/Documentation/process/development-tools.rst
> @@ -0,0 +1,92 @@
> +============================================
> +Kernel Guidelines for Tool Generated Content
> +============================================
> +
> +Purpose
> +=======
> +
> +Kernel contributors have been using tooling to generate contributions
> +for a long time. These tools are constantly becoming more capable and
> +undoubtedly improve developer productivity. At the same time, reviewer
> +and maintainer bandwidth is a very scarce resource. Understanding
> +which portions of a contribution come from humans versus tools is
> +critical to maintain those resources and keep kernel development
> +healthy.
> +
> +The goal here is to clarify community expectations around tools. This
> +lets everyone become more productive while also maintaining high
> +degrees of trust between submitters and reviewers.
> +
> +Out of Scope
> +============
> +
> +These guidelines do not apply to tools that make trivial tweaks to
> +preexisting content. Nor do they pertain to AI tooling that helps with
> +menial tasks. Some examples:
> +
> + - Spelling and grammar fix ups, like rephrasing to imperative voice
> + - Typing aids like identifier completion, common boilerplate or
> +   trivial pattern completion
> + - Purely mechanical transformations like variable renaming
> + - Reformatting, like running scripts/Lindent.
> +
> +Even if your tool use is out of scope, you should still consider
> +whether telling the reviewer about the tool you used would make your
> +contribution easier to review.
> +
> +In Scope
> +========
> +
> +These guidelines apply when a meaningful amount of content in a kernel
> +contribution was not written by a person in the Signed-off-by chain,
> +but was instead created by a tool.
> +
> +Some examples:
> + - “checkpatch.pl --fix” output, or any tool suggested fix.
> + - coccinelle scripts
> + - ChatGPT generated a new function in your patch to sort list entries.
> + - A .c file in the patch was originally generated by Gemini but cleaned
> +   up by hand.

Might we want to use some sort of generic term rather than listing
specific proprietary systems here?

> + - The changelog was generated by handing the patch to a generative AI
> +   tool and asking it to write the changelog.
> + - The changelog was translated from another language.
> + - Detection of a problem is also a part of the development process; if
> +   a tool was used to find a problem addressed by a change, that should
> +   be noted in the changelog. This not only gives credit where it is
> +   due, it also helps fellow developers find out about these tools.
> +
> +If in doubt, choose transparency and assume these guidelines apply to
> +your contribution.
> +
> +Guidelines
> +==========
> +
> +First, read the Developer's Certificate of Origin:
> +``Documentation/process/submitting-patches.rst``. Its rules are simple
> +and have been in place for a long time. They have covered many
> +tool-generated contributions.

I'd drop the ``literal`` formatting so that the automatic
cross-reference magic can happen.
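
A minimal sketch of what that could look like, assuming the docs build links a plain Documentation/ path on its own once the literal markers are gone:

```rst
.. No literal markers around the path, so the build can link it (assumed behaviour).

First, read the Developer's Certificate of Origin:
Documentation/process/submitting-patches.rst
```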

> +
> +Second, when making a contribution, be transparent about the origin of
> +content in cover letters and changelogs. You can be more transparent
> +by adding information like this:
> +
> + - What tools were used?
> + - The input to the tools you used, like the coccinelle source script.
> + - If code was largely generated from a single or short set of
> +   prompts, include those prompts in the commit log. For longer
> +   sessions, include a summary of the prompts and the nature of
> +   resulting assistance.
> + - Which portions of the content were affected by that tool?
> +
> +As with all contributions, individual maintainers have discretion to
> +choose how they handle the contribution. For example, they might:
> +
> + - Treat it just like any other contribution
> + - Reject it outright
> + - Review the contribution with extra scrutiny
> + - Suggest a better prompt instead of suggesting specific code changes
> + - Ask for some other special steps, like asking the contributor to
> +   elaborate on how the tool or model was trained
> + - Ask the submitter to explain in more detail about the contribution
> +   so that the maintainer can feel comfortable that the submitter fully
> +   understands how the code works.
> -- 
> 2.34.1

Like I said, nits; the policy side seems to align with how the
discussions have gone.

Thanks,

jon