This patch series adds unified configuration and documentation for coding agents working with the Linux kernel codebase. As coding agents become increasingly common in software development, it's important to establish clear guidelines for their use in kernel development. The series consists of four patches: 1. The first patch adds unified configuration files for various coding agents (Claude, GitHub Copilot, Cursor, Codeium, Continue, Windsurf, and Aider). These are all symlinked to a central documentation file to ensure consistency across tools. 2. The second patch adds core development references that guide agents to essential kernel development documentation including how to do kernel development, submitting patches, and the submission checklist. 3. The third patch adds coding style documentation and explicit rules that agents must follow, including the 80 character line limit and no trailing whitespace requirements. 4. The fourth patch adds legal requirements and agent attribution guidelines. All agents are required to identify themselves in commits using Co-developed-by tags, ensuring full transparency about agent involvement in code development. Example agent attribution in commits: Co-developed-by: Claude claude-opus-4-20250514 Changes since RFC: - Switch from markdown to RST - Break up into multiple files - Simplify instructions (we can always bikeshed those later) - AI => Agents Sasha Levin (4): agents: add unified agent coding assistant configuration agents: add core development references agents: add coding style documentation and rules agents: add legal requirements and agent attribution guidelines .aider.conf.yml | 1 + .codeium/instructions.md | 1 + .continue/context.md | 1 + .cursorrules | 1 + .github/copilot-instructions.md | 1 + .windsurfrules | 1 + CLAUDE.md | 1 + Documentation/agents/coding-style.rst | 35 ++++++++++++++++++++++ Documentation/agents/core.rst | 28 ++++++++++++++++++ Documentation/agents/index.rst | 13 +++++++++ Documentation/agents/legal.rst | 42 +++++++++++++++++++++++++++ Documentation/agents/main.rst | 22 ++++++++++++++ 12 files changed, 147 insertions(+) create mode 120000 .aider.conf.yml create mode 120000 .codeium/instructions.md create mode 120000 .continue/context.md create mode 120000 .cursorrules create mode 120000 .github/copilot-instructions.md create mode 120000 .windsurfrules create mode 120000 CLAUDE.md create mode 100644 Documentation/agents/coding-style.rst create mode 100644 Documentation/agents/core.rst create mode 100644 Documentation/agents/index.rst create mode 100644 Documentation/agents/legal.rst create mode 100644 Documentation/agents/main.rst -- 2.39.5
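To make the attribution scheme concrete: the agent tag shown above is meant to sit alongside the submitter's usual DCO sign-off in the commit trailers. A minimal sketch (the human identity below is a placeholder, and whether anything beyond this is required is exactly what the rest of the thread goes on to debate):

	Co-developed-by: Claude claude-opus-4-20250514
	Signed-off-by: Jane Developer <jane@example.org>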
On 7/27/25 21:57, Sasha Levin wrote:
> This patch series adds unified configuration and documentation for coding
> agents working with the Linux kernel codebase. As coding agents
> become increasingly common in software development, it's important to
> establish clear guidelines for their use in kernel development.

Hi,

this series seems to me somewhat premature. I think we first need a clear
policy wrt LLM usage for the *humans* to follow. It seemed this thread [1]
was going in that direction wrt usage disclosure. BTW I was quite shocked
by Steven's reply there [2] that he learned from the LWN coverage of a
conference talk that he had received a patch fully written by an LLM without
any such indication. Now I'm not naive enough to believe that this hasn't
been happening already from e.g. first-time contributors, but if that
coverage was accurate, the patch came from a very seasoned kernel
contributor and I really wouldn't expect that to happen.

Also, I don't know whether e.g. the copyright and licensing implications of
LLM usage beyond, say, a smarter autocomplete are clear (again, such as
writing the full patch?). The thread [1] touched on it somewhat but not
completely. If that's clear already (IANAL), I'd hope that would also be
part of such a policy.

I know that your series has patch 4, but that seems to be part of what the
LLM is supposed to include for its prompt (does it make sense to call it
"legal requirements" then?). If it fails to e.g. add the "Co-developed-by:"
there seems to be nothing saying the human should check these things in the
output.

So without such a policy first, I fear that just merging this alone would
send the message that the kernel is now officially accepting contributions
done with coding assistants, that those assistants will do the right things
based on these configuration files, and that the developers using the
assistants don't need to concern themselves with anything more, as it's all
covered by the configuration.

Vlastimil

[1] https://lore.kernel.org/all/20250724175439.76962-1-linux@treblig.org/
[2] https://lore.kernel.org/all/20250724194556.105803db@gandalf.local.home/

> The series consists of four patches:
>
> 1. The first patch adds unified configuration files for various coding
> agents (Claude, GitHub Copilot, Cursor, Codeium, Continue,
> Windsurf, and Aider). These are all symlinked to a central documentation
> file to ensure consistency across tools.
>
> 2. The second patch adds core development references that guide
> agents to essential kernel development documentation including how
> to do kernel development, submitting patches, and the submission
> checklist.
>
> 3. The third patch adds coding style documentation and explicit rules
> that agents must follow, including the 80 character line limit
> and no trailing whitespace requirements.
>
> 4. The fourth patch adds legal requirements and agent attribution
> guidelines. All agents are required to identify themselves in
> commits using Co-developed-by tags, ensuring full transparency about
> agent involvement in code development.
> > Example agent attribution in commits: > > Co-developed-by: Claude claude-opus-4-20250514 > > > Changes since RFC: > - Switch from markdown to RST > - Break up into multiple files > - Simplify instructions (we can always bikeshed those later) > - AI => Agents > > Sasha Levin (4): > agents: add unified agent coding assistant configuration > agents: add core development references > agents: add coding style documentation and rules > agents: add legal requirements and agent attribution guidelines > > .aider.conf.yml | 1 + > .codeium/instructions.md | 1 + > .continue/context.md | 1 + > .cursorrules | 1 + > .github/copilot-instructions.md | 1 + > .windsurfrules | 1 + > CLAUDE.md | 1 + > Documentation/agents/coding-style.rst | 35 ++++++++++++++++++++++ > Documentation/agents/core.rst | 28 ++++++++++++++++++ > Documentation/agents/index.rst | 13 +++++++++ > Documentation/agents/legal.rst | 42 +++++++++++++++++++++++++++ > Documentation/agents/main.rst | 22 ++++++++++++++ > 12 files changed, 147 insertions(+) > create mode 120000 .aider.conf.yml > create mode 120000 .codeium/instructions.md > create mode 120000 .continue/context.md > create mode 120000 .cursorrules > create mode 120000 .github/copilot-instructions.md > create mode 120000 .windsurfrules > create mode 120000 CLAUDE.md > create mode 100644 Documentation/agents/coding-style.rst > create mode 100644 Documentation/agents/core.rst > create mode 100644 Documentation/agents/index.rst > create mode 100644 Documentation/agents/legal.rst > create mode 100644 Documentation/agents/main.rst >
On Mon, Jul 28, 2025 at 09:58:44AM +0200, Vlastimil Babka wrote:
>On 7/27/25 21:57, Sasha Levin wrote:
>> This patch series adds unified configuration and documentation for coding
>> agents working with the Linux kernel codebase. As coding agents
>> become increasingly common in software development, it's important to
>> establish clear guidelines for their use in kernel development.
>
>Hi,
>
>this series seems to me somewhat premature. I think we first need a clear
>policy wrt LLM usage for the *humans* to follow. It seemed this thread [1]
>was going into that direction wrt usage disclosure. BTW I was quite shocked
>by Steven's reply there [2] that he learned from the LWN coverage of a
>conference talk that he had received a patch fully written by LLM without
>any such indication. Now I'm not naive to believe that it's not been
>happening already from e.g. first-time contributors, but if that coverage
>was accurate, the patch came from a very seasoned kernel contributor and I
>really wouldn't expect that to happen.

You mean that you had a concern that the same person who wrote
hashtable.h was using tooling to convert open coded implementations of a
hashtable to the API provided by hashtable.h?

I've been doing this since 2012 (see 42f8570f437b ("workqueue: use new
hashtable implementation")) with the various tools that we have for
mechanical transformations of code.

I understand Steve's point of view on this, and this series is here to
tackle the concerns raised both by him and the rest of the community.

>Also I don't know e.g. the copyright and licensing implications of LLM usage
>beyond, say, a smarter automplete are clear? (again, such as writing the
>full patch?) The thread [1] touched on it somewhat but not completely. If
>that's clear already (IANAL), I'd hope that to be also part of such policy.

The LF already has guidance
(https://www.linuxfoundation.org/legal/generative-ai) for this type of
contribution that was created by LF's lawyers. Clearly we can override,
expand, or affirm it if we want to, but just like you, IANAL.

>I know that your series has patch 4, but that seems to be part of what the
>LLM is supposed to include for its prompt (does it make sense to call it
>"legal requirements" then?). If it fails to e.g. add the "Co-developed-by:"
>there seems to be nothing saying the human should check these things in the
>output.

Right - as pointed to later in the thread, that part is already in
progress. The approach in this series would be to cover the technical
aspects of supporting whatever policy we end up with.

>So without such policy first, I fear just merging this alone would send the
>message that the kernel is now officially accepting contributions done with
>coding assistants, and those assistants will do the right things based on
>these configuration files, and the developers using the assistants don't
>need to concern themselves with anything more, as it's all covered by the
>configuration.

Note that at the current state of our policies and documentation, if you
were to pretend to be a developer completely unfamiliar with the Linux
Kernel project, the conclusion you'd reach is that the project
"officially" accepts contributions that are in line with LF's policies.

If anything, this series clamps down on that.

--
Thanks,
Sasha
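For readers unfamiliar with the conversion being referred to, it is the kind of mechanical rewrite where an open-coded bucket array is replaced by the <linux/hashtable.h> helpers. A rough sketch with made-up structure and field names (not code from the workqueue commit):

	/* Before: open-coded hlist buckets. */
	static struct hlist_head obj_hash[1 << OBJ_HASH_BITS];

	hlist_add_head(&obj->hash_node, &obj_hash[hash_32(key, OBJ_HASH_BITS)]);

	/* After: the same lookup structure via the hashtable.h API. */
	static DEFINE_HASHTABLE(obj_hash, OBJ_HASH_BITS);

	hash_add(obj_hash, &obj->hash_node, key);
	hash_for_each_possible(obj_hash, obj, hash_node, key)
		if (obj->key == key)
			return obj;

The rewrite is pattern-based, which is why it lends itself to tooling; verifying that the semantics are preserved remains the submitter's job.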
On 28.07.25 09:58, Vlastimil Babka wrote:
> On 7/27/25 21:57, Sasha Levin wrote:
>> This patch series adds unified configuration and documentation for coding
>> agents working with the Linux kernel codebase. As coding agents
>> become increasingly common in software development, it's important to
>> establish clear guidelines for their use in kernel development.
>
> Hi,
>
> this series seems to me somewhat premature. I think we first need a clear
> policy wrt LLM usage for the *humans* to follow. It seemed this thread [1]
> was going into that direction wrt usage disclosure. BTW I was quite shocked
> by Steven's reply there [2] that he learned from the LWN coverage of a
> conference talk that he had received a patch fully written by LLM without
> any such indication. Now I'm not naive to believe that it's not been
> happening already from e.g. first-time contributors, but if that coverage
> was accurate, the patch came from a very seasoned kernel contributor and I
> really wouldn't expect that to happen.
>
> Also I don't know e.g. the copyright and licensing implications of LLM usage
> beyond, say, a smarter automplete are clear? (again, such as writing the
> full patch?) The thread [1] touched on it somewhat but not completely. If
> that's clear already (IANAL), I'd hope that to be also part of such policy.
>
> I know that your series has patch 4, but that seems to be part of what the
> LLM is supposed to include for its prompt (does it make sense to call it
> "legal requirements" then?). If it fails to e.g. add the "Co-developed-by:"
> there seems to be nothing saying the human should check these things in the
> output.

Exactly that.

I want to have it clearly spelled out that if you're submitting AI-generated
code that you don't fully understand and haven't reviewed in detail, then
you are going to have a really bad time around here.

I don't have time to talk to an AI chatbot through mail when reviewing
patches, because the submitter doesn't understand what he is doing and
blindly copy-pastes my replies to the AI.

This must not be the new mechanism to DoS kernel maintainers with AI slop.

I'll point at the approach qemu[1] has taken, which is probably a bit too
strict, but raises some key points regarding DCO, copyright etc.

[1] https://github.com/qemu/qemu/commit/3d40db0efc22520fa6c399cf73960dced423b048

--
Cheers,

David / dhildenb
On Mon, Jul 28, 2025 at 11:27:48AM +0200, David Hildenbrand wrote: > This must not be the new mechanism to DoS kernel maintainers with AI slop. I will note that we are already getting this kind of "slop" today, with the numbers going up on a weekly basis. Be lucky if you haven't seen it in your subsystem yet... thanks, greg k-h
On 28.07.25 12:37, Greg KH wrote: > On Mon, Jul 28, 2025 at 11:27:48AM +0200, David Hildenbrand wrote: >> This must not be the new mechanism to DoS kernel maintainers with AI slop. > > I will note that we are already getting this kind of "slop" today, with > the numbers going up on a weekly basis. Be lucky if you haven't seen it > in your subsystem yet... It's getting more (but in core-mm not too bad for now), and I hope that there will be a reasonable "use of AI" policy for the kernel soon. At some point I even assumed Sasha was AI-slopping us [1]. We cannot keep complaining about maintainer overload and, at the same time, encourage people to bombard us with even more of that stuff. Clearly flagging stuff as AI-generated can maybe help. But really, what we need is a proper AI policy. I think QEMU did a good job (again, maybe too strict, not sure). I'll note one interesting thing in the QEMU commit I linked: "Thus far though, this is has not been matched by a broadly accepted legal interpretation of the licensing implications for code generator outputs. While the vendors may claim there is no problem and a free choice of license is possible, they have an inherent conflict of interest in promoting this interpretation." [1] https://lkml.kernel.org/r/a4d8b292-154a-4d14-90e4-6c822acf1cfb@redhat.com -- Cheers, David / dhildenb
On Mon, Jul 28, 2025 at 12:47:55PM +0200, David Hildenbrand wrote:
>We cannot keep complaining about maintainer overload and, at the same
>time, encourage people to bombard us with even more of that stuff.
>
>Clearly flagging stuff as AI-generated can maybe help. But really,
>what we need is a proper AI policy. I think QEMU did a good job
>(again, maybe too strict, not sure).

So I've sent this series because I thought it's a parallel effort to the
effort of creating an "AI Policy".

Right now we already (implicitly) have a policy as far as these
contributions go, based on
https://www.linuxfoundation.org/legal/generative-ai and the lack of
other guidelines in our codebase, we effectively welcome AI generated
contributions without any other requirements beyond the ones that affect
a regular human.

This series of patches attempts to clarify that point to AI: it has to
follow the same requirements and rules that humans do.

>I'll note one interesting thing in the QEMU commit I linked:
>
>"Thus far though, this is has not been matched by a broadly
>accepted legal interpretation of the licensing implications for code
>generator outputs. While the vendors may claim there is no problem and
>a free choice of license is possible, they have an inherent conflict
>of interest in promoting this interpretation."
>
>[1] https://lkml.kernel.org/r/a4d8b292-154a-4d14-90e4-6c822acf1cfb@redhat.com

I get why QEMU did this: they don't have the resources, the lawyers, nor
the interest in dealing with this open question, so they're playing it
safe until we know more. That sounds like a very smart thing to do on
their end.

On our end, one of the reasons the kernel is part of the LF is to tackle
exactly this: none of us are lawyers, but luckily we have lawyers and
resources on our side to help us navigate these challenges.

I'd like to think that there's no conflict of interest within the LF, and
that their opinion on this matter best represents their clients' (both the
Linux kernel's and Linus's) best interests.

--
Thanks,
Sasha
On Mon 28-07-25 09:05:37, Sasha Levin wrote: > On Mon, Jul 28, 2025 at 12:47:55PM +0200, David Hildenbrand wrote: > > We cannot keep complaining about maintainer overload and, at the same > > time, encourage people to bombard us with even more of that stuff. > > > > Clearly flagging stuff as AI-generated can maybe help. But really, what > > we need is a proper AI policy. I think QEMU did a good job (again, maybe > > too strict, not sure). > > So I've sent this series because I thought it's a parallel effort to the > effort of creating an "AI Policy". > > Right now we already (implicitly) have a policy as far as these > contributions go, based on > https://www.linuxfoundation.org/legal/generative-ai and the lack of > other guidelines in our codebase, we effectively welcome AI generated > contributions without any other requirements beyond the ones that affect > a regular human. > > This series of patches attempts to clarify that point to AI: it has to > follow the same requirements and rules that humans do. The above guidance is quite vague. How me as a maintainer should know that whatever AI tool has been used is meeting those two conditions : 1. Contributors should ensure that the terms and conditions of the : generative AI tool do not place any contractual restrictions on how the : tool’s output can be used that are inconsistent with the project’s open : source software license, the project’s intellectual property policies, : or the Open Source Definition. : : 2. If any pre-existing copyrighted materials (including pre-existing : open source code) authored or owned by third parties are included in the : AI tool’s output, prior to contributing such output to the project, the : Contributor should confirm that they have have permission from the third : party owners–such as the form of an open source license or public domain : declaration that complies with the project’s licensing policies–to use : and modify such pre-existing materials and contribute them to the : project. Additionally, the contributor should provide notice and : attribution of such third party rights, along with information about the : applicable license terms, with their contribution. Is that my responsibility? -- Michal Hocko SUSE Labs
On Mon 04-08-25 11:23:22, Michal Hocko wrote: > On Mon 28-07-25 09:05:37, Sasha Levin wrote: > > On Mon, Jul 28, 2025 at 12:47:55PM +0200, David Hildenbrand wrote: > > > We cannot keep complaining about maintainer overload and, at the same > > > time, encourage people to bombard us with even more of that stuff. > > > > > > Clearly flagging stuff as AI-generated can maybe help. But really, what > > > we need is a proper AI policy. I think QEMU did a good job (again, maybe > > > too strict, not sure). > > > > So I've sent this series because I thought it's a parallel effort to the > > effort of creating an "AI Policy". > > > > Right now we already (implicitly) have a policy as far as these > > contributions go, based on > > https://www.linuxfoundation.org/legal/generative-ai and the lack of > > other guidelines in our codebase, we effectively welcome AI generated > > contributions without any other requirements beyond the ones that affect > > a regular human. > > > > This series of patches attempts to clarify that point to AI: it has to > > follow the same requirements and rules that humans do. > > The above guidance is quite vague. How me as a maintainer should know > that whatever AI tool has been used is meeting those two conditions > > : 1. Contributors should ensure that the terms and conditions of the > : generative AI tool do not place any contractual restrictions on how the > : tool’s output can be used that are inconsistent with the project’s open > : source software license, the project’s intellectual property policies, > : or the Open Source Definition. > : > : 2. If any pre-existing copyrighted materials (including pre-existing > : open source code) authored or owned by third parties are included in the > : AI tool’s output, prior to contributing such output to the project, the > : Contributor should confirm that they have have permission from the third > : party owners–such as the form of an open source license or public domain > : declaration that complies with the project’s licensing policies–to use > : and modify such pre-existing materials and contribute them to the > : project. Additionally, the contributor should provide notice and > : attribution of such third party rights, along with information about the > : applicable license terms, with their contribution. > > Is that my responsibility? OK, I can see this is discussed further down the thread https://lore.kernel.org/all/20250730130531.4855a38b@gandalf.local.home/T/#u -- Michal Hocko SUSE Labs
On Mon, Aug 04, 2025 at 11:23:21AM +0200, Michal Hocko wrote:
>On Mon 28-07-25 09:05:37, Sasha Levin wrote:
>> On Mon, Jul 28, 2025 at 12:47:55PM +0200, David Hildenbrand wrote:
>> > We cannot keep complaining about maintainer overload and, at the same
>> > time, encourage people to bombard us with even more of that stuff.
>> >
>> > Clearly flagging stuff as AI-generated can maybe help. But really, what
>> > we need is a proper AI policy. I think QEMU did a good job (again, maybe
>> > too strict, not sure).
>>
>> So I've sent this series because I thought it's a parallel effort to the
>> effort of creating an "AI Policy".
>>
>> Right now we already (implicitly) have a policy as far as these
>> contributions go, based on
>> https://www.linuxfoundation.org/legal/generative-ai and the lack of
>> other guidelines in our codebase, we effectively welcome AI generated
>> contributions without any other requirements beyond the ones that affect
>> a regular human.
>>
>> This series of patches attempts to clarify that point to AI: it has to
>> follow the same requirements and rules that humans do.
>
>The above guidance is quite vague. How me as a maintainer should know
>that whatever AI tool has been used is meeting those two conditions

In exactly the same way you know that a human contributor didn't copy
code with an incompatible license.

Quoting from Documentation/process/5.Posting.rst :

 - Signed-off-by: this is a developer's certification that he or
   she has the right to submit the patch for inclusion into the
   kernel. It is an agreement to the Developer's Certificate of
   Origin, the full text of which can be found in
   :ref:`Documentation/process/submitting-patches.rst <submittingpatches>`
   Code without a proper signoff cannot be merged into the mainline.

The Signed-off-by tag doesn't mean that a commit was reviewed, it
doesn't mean that someone tested it, nor does it indicate that the
person who signed off believes it is correct.

It only means that the person has legally certified to you what is
stated in the DCO.

>: 1. Contributors should ensure that the terms and conditions of the
>: generative AI tool do not place any contractual restrictions on how the
>: tool’s output can be used that are inconsistent with the project’s open
>: source software license, the project’s intellectual property policies,
>: or the Open Source Definition.
>:
>: 2. If any pre-existing copyrighted materials (including pre-existing
>: open source code) authored or owned by third parties are included in the
>: AI tool’s output, prior to contributing such output to the project, the
>: Contributor should confirm that they have have permission from the third
>: party owners–such as the form of an open source license or public domain
>: declaration that complies with the project’s licensing policies–to use
>: and modify such pre-existing materials and contribute them to the
>: project. Additionally, the contributor should provide notice and
>: attribution of such third party rights, along with information about the
>: applicable license terms, with their contribution.
>
>Is that my responsibility?

As far as making sure that all patches you take come with a Signed-off-by
tag, yes, it's your responsibility to make sure that such a tag exists.

Otherwise, this series doesn't add any new requirements on you as a
maintainer.

--
Thanks,
Sasha
On Mon, 4 Aug 2025, Sasha Levin wrote:

> > The above guidance is quite vague. How me as a maintainer should know
> > that whatever AI tool has been used is meeting those two conditions
>
> In exactly the same way you know that a human contributor didn't copy
> code with an incompatible license.
>
> Quoting from Documentation/process/5.Posting.rst :
>
>  - Signed-off-by: this is a developer's certification that he or
>    she has the right to submit the patch for inclusion into the
>    kernel. It is an agreement to the Developer's Certificate of
>    Origin, the full text of which can be found in
>    :ref:`Documentation/process/submitting-patches.rst <submittingpatches>`
>    Code without a proper signoff cannot be merged into the mainline.
>
> The Signed-off-by tag doesn't mean that a commit was reviewed, it
> doesn't mean that someone tested it, nor does it indicate that the
> person who signed off belives it is correct.
>
> It only means that the person has legally certified to you what is
> stated in the DCO.

Al made a very important point somewhere earlier in this thread.

The most important (from the code quality POV) thing is -- is there a
person that understands the patch enough to be able to answer questions
(coming from some other human -- most likely reviewer/maintainer)?

That's not something that'd be reflected in DCO, but it's a very important
fact for the maintainer's decision process.

--
Jiri Kosina
SUSE Labs
On Tue, 5 Aug 2025 00:03:29 +0200 (CEST)
Jiri Kosina <kosina@gmail.com> wrote:

> Al made a very important point somewhere earlier in this thread.
>
> The most important (from the code quality POV) thing is -- is there a
> person that understands the patch enough to be able to answer questions
> (coming from some other human -- most likely reviewer/maintainer)?
>
> That's not something that'd be reflected in DCO, but it's very important
> fact for the maintainer's decision process.

Perhaps this is what needs to be explicitly stated in the SubmittingPatches
document.

I know we can't change the DCO, but could we add something saying that our
policy is that if you submit code, you certify that you understand said
code, even if (especially if) it was produced by AI?

-- Steve
On Mon, 4 Aug 2025, Steven Rostedt wrote: > I know we can't change the DCO, but could we add something about our policy > is that if you submit code, you certify that you understand said code, even > if (especially) it was produced by AI? Yeah, I think that's *precisely* what's needed. Legal stuff is one thing. Let's assume for now that it's handled by the LF statement, DCO, whatever. But "if I need to talk to a human that has a real clue about this code change, who is that?" absolutely (in my view) needs to be reflected in the changelog metadata. Because the more you challenge LLMs, the more they will hallucinate. If for nothing else, then for accountability (not legal, but factual). LLM is never going to be responsible for the generated code in the "human-to-human" sense. AI can assist, but a human needs to be the one proxying the responsibility (if he/she decides to do so), with all the consequences (again, not talking legal here at all). Thanks, -- Jiri Kosina SUSE Labs
Steven Rostedt wrote:
> On Tue, 5 Aug 2025 00:03:29 +0200 (CEST)
> Jiri Kosina <kosina@gmail.com> wrote:
>
> > Al made a very important point somewhere earlier in this thread.
> >
> > The most important (from the code quality POV) thing is -- is there a
> > person that understands the patch enough to be able to answer questions
> > (coming from some other human -- most likely reviewer/maintainer)?
> >
> > That's not something that'd be reflected in DCO, but it's very important
> > fact for the maintainer's decision process.
>
> Perhaps this is what needs to be explicitly stated in the SubmittingPatches
> document.
>
> I know we can't change the DCO, but could we add something about our policy
> is that if you submit code, you certify that you understand said code, even
> if (especially) it was produced by AI?

It is already the case that human-developed code is not always
understood by the submitter (e.g. bugs, or see occasions of "no
functional changes intended" commits referenced by "Fixes:"). It is also
already the case that the speed at which code is applied has a component
of the maintainer's trust in the submitter to stick around and address
issues or work with the community.

AI allows production of plausible code in higher volumes, but it does
not fundamentally change the existing dynamic of development velocity vs
trust.

So an expectation that is worth clarifying is that the mere appearance of
technical correctness is not sufficient to move a proposal forward. The
details of what constitutes sufficient trust are subsystem, maintainer,
or even per-function specific. This is a nuanced expectation that human
submitters struggle with, let alone AI.

"Be prepared to declare a confidence interval in every detail of a patch
series, especially any AI generated pieces."
On Mon, Aug 04, 2025 at 03:53:50PM -0700, dan.j.williams@intel.com wrote: >Steven Rostedt wrote: >> On Tue, 5 Aug 2025 00:03:29 +0200 (CEST) >> Jiri Kosina <kosina@gmail.com> wrote: >> >> > Al made a very important point somewhere earlier in this thread. >> > >> > The most important (from the code quality POV) thing is -- is there a >> > person that understands the patch enough to be able to answer questions >> > (coming from some other human -- most likely reviewer/maintainer)? >> > >> > That's not something that'd be reflected in DCO, but it's very important >> > fact for the maintainer's decision process. >> >> Perhaps this is what needs to be explicitly stated in the SubmittingPatches >> document. >> >> I know we can't change the DCO, but could we add something about our policy >> is that if you submit code, you certify that you understand said code, even >> if (especially) it was produced by AI? > >It is already the case that human developed code is not always >understood by the submitter (i.e. bugs, or see occasions of "no >functional changes intended" commits referenced by "Fixes:"). It is also >already the case that the speed at which code is applied has a component >of maintainer's trust in the submitter to stick around and address >issues or work with the community. > >AI allows production of plausible code in higher volumes, but it does >not fundamentally change the existing dynamic of development velocity vs >trust. Right: I think that the issue Jiri brought up is a human problem, not a tooling problem. We can try and tackle a symptom, but it's a losing war. >So an expectation that is worth clarifying is that mere appearance of >technical correctness is not sufficient to move a proposal forward. The >details of what constitutes sufficient trust are subsystem, maintainer, >or even per-function specific. This is a nuanced expectation that human >submitters struggle, let alone AI. > >"Be prepared to declare a confidence interval in every detail of a patch >series, especially any AI generated pieces." Something along the lines of a Social Credit system for the humans behind the keyboard? :) Do we want to get there? Do we not? -- Thanks, Sasha
On Mon, Aug 04, 2025 at 07:30:41PM -0400, Sasha Levin wrote:
> On Mon, Aug 04, 2025 at 03:53:50PM -0700, dan.j.williams@intel.com wrote:
> >Steven Rostedt wrote:
> >> On Tue, 5 Aug 2025 00:03:29 +0200 (CEST)
> >> Jiri Kosina <kosina@gmail.com> wrote:
> >>
> >> > Al made a very important point somewhere earlier in this thread.
> >> >
> >> > The most important (from the code quality POV) thing is -- is there a
> >> > person that understands the patch enough to be able to answer questions
> >> > (coming from some other human -- most likely reviewer/maintainer)?
> >> >
> >> > That's not something that'd be reflected in DCO, but it's very important
> >> > fact for the maintainer's decision process.
> >>
> >> Perhaps this is what needs to be explicitly stated in the SubmittingPatches
> >> document.
> >>
> >> I know we can't change the DCO, but could we add something about our policy
> >> is that if you submit code, you certify that you understand said code, even
> >> if (especially) it was produced by AI?
> >
> >It is already the case that human developed code is not always
> >understood by the submitter (i.e. bugs, or see occasions of "no
> >functional changes intended" commits referenced by "Fixes:"). It is also
> >already the case that the speed at which code is applied has a component
> >of maintainer's trust in the submitter to stick around and address
> >issues or work with the community.
> >
> >AI allows production of plausible code in higher volumes, but it does
> >not fundamentally change the existing dynamic of development velocity vs
> >trust.
>
> Right: I think that the issue Jiri brought up is a human problem, not a
> tooling problem.
>
> We can try and tackle a symptom, but it's a losing war.
>
> >So an expectation that is worth clarifying is that mere appearance of
> >technical correctness is not sufficient to move a proposal forward. The
> >details of what constitutes sufficient trust are subsystem, maintainer,
> >or even per-function specific. This is a nuanced expectation that human
> >submitters struggle, let alone AI.
> >
> >"Be prepared to declare a confidence interval in every detail of a patch
> >series, especially any AI generated pieces."
>
> Something along the lines of a Social Credit system for the humans
> behind the keyboard? :)
>
> Do we want to get there? Do we not?

Don't we have one already? I'm pretty sure every maintainer keeps a
mental list of trust scores, and uses them when reviewing patches.
Patch submitters who don't perform due diligence usually lose points,
especially if the offences occur repeatedly (newcomers often get a few
free passes thanks to their inexperience and the benefit of the doubt,
at least with most maintainers).

LLMs increase the scale of the problem, and also make it easier to fake
due diligence. I believe it's important to make it very clear to
contributors that they will suffer consequences if they don't live up to
the standards we expect.

--
Regards,

Laurent Pinchart
On Tue, 5 Aug 2025 02:39:06 +0300 Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > > > > > >"Be prepared to declare a confidence interval in every detail of a patch > > >series, especially any AI generated pieces." Honestly, I think we need to state that. > > > > Something along the lines of a Social Credit system for the humans > > behind the keyboard? :) > > > > Do we want to get there? Do we not? > > Don't we have one already ? I'm pretty sure every maintainer keeps a > mental list of trust scores, and uses them when reviewing patches. > Patch submitter who doesn't perform due diligence usually lose points, > especially if the offences occur repeatedly (newcomers often get a few > free passes thanks to their inexperience and the benefit of the doubt, > at least with most maintainers). > > LLMs increase the scale of the problem, and also makes it easier to fake > due diligence. I believe it's important to make it very clear to > contributors that they will suffer consequences if they don't hold up to > the standards we expect. My question is, do we want to document expectations of a patch being submitted. It's been a while since I fully read SubmittingPatches (so much so, I last read it when it was called that!). Maybe it's already in there. If not, perhaps we need to update the document with the idea that people will now be using AI more often to help them do their work. That's still not an excuse to not understand the code that is being submitted. -- Steve
+cc Linus

On Sun, Jul 27, 2025 at 03:57:58PM -0400, Sasha Levin wrote:
> This patch series adds unified configuration and documentation for coding
> agents working with the Linux kernel codebase. As coding agents
> become increasingly common in software development, it's important to
> establish clear guidelines for their use in kernel development.

Hi Sasha,

I feel like we need to take a step back here and consider some of the
non-technical consequences of this change.

Firstly, there is no doubt whatsoever that, were this series to land, there
would be significant press which would amount to (whether you like it or
not) 'Linux kernel welcomes AI patches'.

I don't feel that a change of this magnitude which is likely to have this
kind of impact should be RFC'd quietly and then, after a weekend, submitted
ready to merge.

This change, whether you like it or not, amounts to (or at the very least
will certainly be perceived to be) kernel policy. And, AFAIK, we don't have
an AI kernel policy doc yet.

So to me:

- We should establish an official kernel AI policy document.
- This should be discussed at the maintainers summit before proceeding.

In addition, it's concerning that we're explicitly adding configs for
specific commercial products. This might be seen as an endorsement whether
intended or not.

Thanks, Lorenzo

> The series consists of four patches:
>
> 1. The first patch adds unified configuration files for various coding
> agents (Claude, GitHub Copilot, Cursor, Codeium, Continue,
> Windsurf, and Aider). These are all symlinked to a central documentation
> file to ensure consistency across tools.
>
> 2. The second patch adds core development references that guide
> agents to essential kernel development documentation including how
> to do kernel development, submitting patches, and the submission
> checklist.
>
> 3. The third patch adds coding style documentation and explicit rules
> that agents must follow, including the 80 character line limit
> and no trailing whitespace requirements.
>
> 4. The fourth patch adds legal requirements and agent attribution
> guidelines. All agents are required to identify themselves in
> commits using Co-developed-by tags, ensuring full transparency about
> agent involvement in code development.
> > Example agent attribution in commits: > > Co-developed-by: Claude claude-opus-4-20250514 > > > Changes since RFC: > - Switch from markdown to RST > - Break up into multiple files > - Simplify instructions (we can always bikeshed those later) > - AI => Agents > > Sasha Levin (4): > agents: add unified agent coding assistant configuration > agents: add core development references > agents: add coding style documentation and rules > agents: add legal requirements and agent attribution guidelines > > .aider.conf.yml | 1 + > .codeium/instructions.md | 1 + > .continue/context.md | 1 + > .cursorrules | 1 + > .github/copilot-instructions.md | 1 + > .windsurfrules | 1 + > CLAUDE.md | 1 + > Documentation/agents/coding-style.rst | 35 ++++++++++++++++++++++ > Documentation/agents/core.rst | 28 ++++++++++++++++++ > Documentation/agents/index.rst | 13 +++++++++ > Documentation/agents/legal.rst | 42 +++++++++++++++++++++++++++ > Documentation/agents/main.rst | 22 ++++++++++++++ > 12 files changed, 147 insertions(+) > create mode 120000 .aider.conf.yml > create mode 120000 .codeium/instructions.md > create mode 120000 .continue/context.md > create mode 120000 .cursorrules > create mode 120000 .github/copilot-instructions.md > create mode 120000 .windsurfrules > create mode 120000 CLAUDE.md > create mode 100644 Documentation/agents/coding-style.rst > create mode 100644 Documentation/agents/core.rst > create mode 100644 Documentation/agents/index.rst > create mode 100644 Documentation/agents/legal.rst > create mode 100644 Documentation/agents/main.rst > > -- > 2.39.5 > >
On Mon, Jul 28, 2025 at 09:42:27AM +0100, Lorenzo Stoakes wrote:
> +cc Linus
>
> On Sun, Jul 27, 2025 at 03:57:58PM -0400, Sasha Levin wrote:
> > This patch series adds unified configuration and documentation for coding
> > agents working with the Linux kernel codebase. As coding agents
> > become increasingly common in software development, it's important to
> > establish clear guidelines for their use in kernel development.
>
> Hi Sasha,
>
> I feel like we need to take a step back here and consider some of the
> non-technical consqeuences of this change.
>
> Firstly, there is no doubt whatsoever that, were this series to land, there
> would be significant press which would amount to (whether you like it or
> not) 'Linux kernel welcomes AI patches'.
>
> I don't feel that a change of this magnitude which is likely to have this
> kind of impact should be RFC'd quietly and then, after a weekend, submitted
> ready to merge.
>
> This change, whether you like it or not - amounts to (or at the very least,
> certainly will be perceived to be) kernel policy. And, AFAIK, we don't have
> an AI kernel policy doc yet.
>
> So to me:
>
> - We should establish an official kernel AI policy document.

Steven Rostedt is working on this right now, hopefully he has something
"soon".

> - This should be discussed at the maintainers summit before proceeding.

Sounds reasonable as well.

But I think that Kees and my earlier points of "the documentation should
be all that an agent needs" might alleviate many of these concerns, if
our documentation can be tweaked in a way to make it easier for
everyone, humans and bots, to understand. That should cut down on the
"size" of this patch series a lot overall.

> In addition, it's concerning that we're explicitly adding configs for
> specific, commercial, products. This might be seen as an endorsement
> whether intended or not.

Don't we already have that for a few things, like .editorconfig?

thanks,

greg k-h
On Mon, Jul 28, 2025 at 12:35:02PM +0200, Greg KH wrote: > On Mon, Jul 28, 2025 at 09:42:27AM +0100, Lorenzo Stoakes wrote: > > +cc Linus > > > > On Sun, Jul 27, 2025 at 03:57:58PM -0400, Sasha Levin wrote: > > > This patch series adds unified configuration and documentation for coding > > > agents working with the Linux kernel codebase. As coding agents > > > become increasingly common in software development, it's important to > > > establish clear guidelines for their use in kernel development. > > > > Hi Sasha, > > > > I feel like we need to take a step back here and consider some of the > > non-technical consqeuences of this change. > > > > Firstly, there is no doubt whatsoever that, were this series to land, there > > would be significant press which would amount to (whether you like it or > > not) 'Linux kernel welcomes AI patches'. > > > > I don't feel that a change of this magnitude which is likely to have this > > kind of impact should be RFC'd quietly and then, after a weekend, submitted > > ready to merge. > > > > This change, whether you like it or not - amounts to (or at the very least, > > certainly will be perceived to be) kernel policy. And, AFAIK, we don't have > > an AI kernel policy doc yet. > > > > So to me: > > > > - We should establish an official kernel AI policy document. > > Steven Rostedt is working on this right now, hopefully he has something > "soon". > > > - This should be discussed at the maintainers summit before proceeding. > > Sounds reasonable as well. > > But I think that Kees and my earlier points of "the documentation should > be all that an agent needs" might aleviate many of these concerns, if > our documentation can be tweaked in a way to make it easier for > everyone, humans and bots, to understand. That should cut down on the > "size" of this patch series a lot overall. > > > In addition, it's concerning that we're explicitly adding configs for > > specific, commercial, products. This might be seen as an endorsement > > whether intended or not. > > Don't we already have that for a few things already, like .editorconfig? We do, but isn't .editorconfig a vendor-neutral solution ? -- Regards, Laurent Pinchart
On Mon, Jul 28, 2025 at 12:35:02PM +0200, Greg KH wrote: > > So to me: > > > > - We should establish an official kernel AI policy document. > > Steven Rostedt is working on this right now, hopefully he has something > "soon". Great! Thanks for looking at that Steve. I think a key element here has to be maintainer opt-in. > > > - This should be discussed at the maintainers summit before proceeding. > > Sounds reasonable as well. Thanks. > > But I think that Kees and my earlier points of "the documentation should > be all that an agent needs" might aleviate many of these concerns, if > our documentation can be tweaked in a way to make it easier for > everyone, humans and bots, to understand. That should cut down on the > "size" of this patch series a lot overall. That'd be ideal, but I think either way we need to be clear to the humans running these things what the rules are. One thing to note is that I struggled to get an LLM to read MAINTAINERS properly recently (it assured me, with absolute confidence, that the SLAB ALLOCATOR section was in fact 'SLAB ALLOCATORS' + provided me with completely incorrect contents, and told me that if I didn't believe it I should go check :) So at all times I think ensuring the human element is aware that they need to do some kind of checking/filtering is key. But that can be handled by a carefully worded policy document. > > > In addition, it's concerning that we're explicitly adding configs for > > specific, commercial, products. This might be seen as an endorsement > > whether intended or not. > > Don't we already have that for a few things already, like .editorconfig? Right, but I think it's a whole other level when it's a subscription service. I realise we have to be practical, but it's just something to be aware of. Perhaps an entry in the AI doc along the lines of 'provision of configuration for a service is not advocating for that service, it is simply provided for convenience' or similar might help. Thanks, Lorenzo
On Mon, 28 Jul 2025 11:52:47 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > On Mon, Jul 28, 2025 at 12:35:02PM +0200, Greg KH wrote: > > > So to me: > > > > > > - We should establish an official kernel AI policy document. > > > > Steven Rostedt is working on this right now, hopefully he has something > > "soon". > > Great! Thanks for looking at that Steve. > > I think a key element here has to be maintainer opt-in. > I had started looking into what to write, as in the TAB meeting we were going to pass a document around before we posted it to the mailing list, but then I was made aware of this thread: https://lore.kernel.org/lkml/20250724175439.76962-1-linux@treblig.org/ Which looked like someone else (now Cc'd on this thread) took it public, and I wanted to see where that ended. I didn't want to start another discussion when there's already two in progress. -- Steve
On Wed, Jul 30, 2025 at 11:27:53AM -0400, Steven Rostedt wrote: > On Mon, 28 Jul 2025 11:52:47 +0100 > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > On Mon, Jul 28, 2025 at 12:35:02PM +0200, Greg KH wrote: > > > > So to me: > > > > > > > > - We should establish an official kernel AI policy document. > > > > > > Steven Rostedt is working on this right now, hopefully he has something > > > "soon". > > > > Great! Thanks for looking at that Steve. > > > > I think a key element here has to be maintainer opt-in. > > > > I had started looking into what to write, as in the TAB meeting we were > going to pass a document around before we posted it to the mailing list, > but then I was made aware of this thread: > > https://lore.kernel.org/lkml/20250724175439.76962-1-linux@treblig.org/ > > Which looked like someone else (now Cc'd on this thread) took it public, > and I wanted to see where that ended. I didn't want to start another > discussion when there's already two in progress. OK, but having a document like this is not in my view optional - we must have a clear, stated policy and one which ideally makes plain that it's opt-in and maintainers may choose not to take these patches. I'm not at all a fan of having a small entry hidden away in the submitting patches doc, this is a really major issue that needs special consideration and whose scope may change over time, so a dedicated document seems more appropriate. Thanks, Lorenzo
On Wed, 30 Jul 2025 16:34:28 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > Which looked like someone else (now Cc'd on this thread) took it public, > > and I wanted to see where that ended. I didn't want to start another > > discussion when there's already two in progress. > > OK, but having a document like this is not in my view optional - we must > have a clear, stated policy and one which ideally makes plain that it's > opt-in and maintainers may choose not to take these patches. That sounds pretty much exactly as what I was stating in our meeting. That is, it is OK to submit a patch written with AI but you must disclose it. It is also the right of the Maintainer to refuse to take any patch that was written in AI. They may feel that they want someone who fully understands what that patch does, and AI can cloud the knowledge of that patch from the author. I guess a statement in submitting-patches.rst would suffice, or should it be a separate standalone document? -- Steve
On Wed, Jul 30, 2025 at 12:18:29PM -0400, Steven Rostedt wrote: > On Wed, 30 Jul 2025 16:34:28 +0100 > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > > Which looked like someone else (now Cc'd on this thread) took it public, > > > and I wanted to see where that ended. I didn't want to start another > > > discussion when there's already two in progress. > > > > OK, but having a document like this is not in my view optional - we must > > have a clear, stated policy and one which ideally makes plain that it's > > opt-in and maintainers may choose not to take these patches. > > That sounds pretty much exactly as what I was stating in our meeting. That > is, it is OK to submit a patch written with AI but you must disclose it. It > is also the right of the Maintainer to refuse to take any patch that was > written in AI. They may feel that they want someone who fully understands > what that patch does, and AI can cloud the knowledge of that patch from the > author. *Ahem* You cropped: I'm not at all a fan of having a small entry hidden away in the submitting patches doc, this is a really major issue that needs special consideration and whose scope may change over time, so a dedicated document seems more appropriate. > > I guess a statement in submitting-patches.rst would suffice, or should it > be a separate standalone document? I think the bit you cropped answers my view on your question :)
* Steven Rostedt (rostedt@goodmis.org) wrote: > On Wed, 30 Jul 2025 16:34:28 +0100 > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > > Which looked like someone else (now Cc'd on this thread) took it public, (I didn't know of the tab discussion) > > > and I wanted to see where that ended. I didn't want to start another > > > discussion when there's already two in progress. > > > > OK, but having a document like this is not in my view optional - we must > > have a clear, stated policy and one which ideally makes plain that it's > > opt-in and maintainers may choose not to take these patches. > > That sounds pretty much exactly as what I was stating in our meeting. That > is, it is OK to submit a patch written with AI but you must disclose it. It > is also the right of the Maintainer to refuse to take any patch that was > written in AI. They may feel that they want someone who fully understands > what that patch does, and AI can cloud the knowledge of that patch from the > author. > > I guess a statement in submitting-patches.rst would suffice, or should it > be a separate standalone document? If it's separate I think it needs to have a link from submitting-patches.rst to get people to read it. To summarise some other things that came up between the threads: a) I think there should be a standard syntax for stating it is AI written; I'd suggested using a new tag, but others were arguing on the side of reusing existing tags, which seems OK if it is done in a standard way and doesn't confuse existing tools. b) There's a whole spectrum of: i) AI wrote the whole patch based on a vague requirement ii) AI is in the editor and tab completes stuff iii) AI suggests fixes/changes which do you care about? c) But then once you get stuff suggesting fixes/changes people were wondering if you should specify other non-AI tools as well. That might help reviewers who get bombed by a million patches from some conventional tool. d) Either way there needs to be emphasis that the 'Signed-off-by' is a human declaring it's all legal and checked. Dave > -- Steve > -- -----Open up your eyes, open up your mind, open up your code ------- / Dr. David Alan Gilbert | Running GNU/Linux | Happy \ \ dave @ treblig.org | | In Hex / \ _________________________|_____ http://www.treblig.org |_______/
On Wed, Jul 30, 2025 at 04:40:39PM +0000, Dr. David Alan Gilbert wrote: > b) There's a whole spectrum of: > i) AI wrote the whole patch based on a vague requirement > ii) AI is in the editor and tab completes stuff > iii) AI suggests fixes/changes > which do you care about? There is a vast spectrum between i) and ii). For the 2 KUnit patches[1] I sent, I had already taught the LLM about KUnit (via Documentation/), and I walked the LLM through the API in question, then asked it to produce a KUnit test. It spat out the core structure with proposed tests, and it iterated on running the tests to make sure the tests were passing, adjusting its assumptions about the API. I took that result and went through it test by test to tweak edge cases, add additional checks, etc, etc. By character count, those 2 are probably 70% written by the LLM. For the atomisp fix[2], that was, by characters, 100% LLM, but I gave it specific code style adjustments and guided it to examine the problem correctly. Should it be considered "AI wrote the whole patch"? Maybe, maybe not. -Kees [1] https://lore.kernel.org/lkml/202507301008.E109EB0F@keescook/ [2] https://lore.kernel.org/lkml/20250724080756.work.741-kees@kernel.org/ -- Kees Cook
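For context, the "core structure" of a KUnit suite is a small amount of boilerplate; a generic skeleton (names invented here, not taken from the patches referenced above) looks roughly like:

	#include <kunit/test.h>

	/* One test case: exercise the API under test and check the result. */
	static void example_basic_test(struct kunit *test)
	{
		KUNIT_EXPECT_EQ(test, 4, 2 + 2);
	}

	static struct kunit_case example_test_cases[] = {
		KUNIT_CASE(example_basic_test),
		{}
	};

	static struct kunit_suite example_test_suite = {
		.name = "example",
		.test_cases = example_test_cases,
	};
	kunit_test_suite(example_test_suite);

The substance of the review work is in what the individual cases assert, which is the part described above as being gone through by hand.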
On Wed, 30 Jul 2025 16:40:39 +0000
"Dr. David Alan Gilbert" <linux@treblig.org> wrote:

> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > On Wed, 30 Jul 2025 16:34:28 +0100
> > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> >
> > > > Which looked like someone else (now Cc'd on this thread) took it public,
>
> (I didn't know of the tab discussion)

Well, you were not there ;-)

> > I guess a statement in submitting-patches.rst would suffice, or should it
> > be a separate standalone document?
>
> If it's separate I think it needs to have a link from submitting-patches.rst
> to get people to read it.
>
> To summarise some other things that came up between the threads:
> a) I think there should be a standard syntax for stating it is
>    AI written; I'd suggested using a new tag, but others were
>    arguing on the side of reusing existing tags, which seems OK
>    if it is done in a standard way and doesn't confuse existing tools.

Right. So I believe those that did not want the tag wanted the statement
to be under the "---" so that it will not get into the git log. I prefer
the tag, but I'll be OK with the comment below the "---" as long as it is
clearly stated that the code was generated by AI.

> b) There's a whole spectrum of:
>    i) AI wrote the whole patch based on a vague requirement
>    ii) AI is in the editor and tab completes stuff
>    iii) AI suggests fixes/changes
>    which do you care about?

Yes, this is one of the controversial issues with having a policy: how much
does AI have to help you before you must disclose it? I would say basic
completions shouldn't be an issue. I've had editors where I type "for" and
it then fills in "for (int i = 0; ; i++)". Is that AI? I don't think so.

I'm more concerned about cases where you use AI to come up with an
algorithm. "Hey AI, sort this array with a quick-sort routine". And it does
so. That should be denoted in the change log. Either above or below the
'---'.

> c) But then once you get stuff suggesting fixes/changes people were
>    wondering if you should specify other non-AI tools as well.
>    That might help reviewers who get bombed by a million patches
>    from some conventional tool.

Fixes and changes I don't think require disclosure as long as the human
looks at that code and figures out that the code needs to change. Now if
the AI does the fix for you, as in makes the patch, then yeah, you should
disclose it. But if you manually make the patch after looking at what AI
pointed you to, then it should be fine.

> d) Either way there needs to be emphasis that the 'Signed-off-by'
>    is a human declaring it's all legal and checked.

That should go without saying.

-- Steve
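To illustrate the placement question: trailers above the "---" separator in a patch mail become part of the recorded commit message, while text between the "---" and the diff is visible to reviewers on the list but dropped when the patch is applied with git am. A sketch of the two options being weighed (the exact wording of any tag is still open):

	[PATCH] lib: sort foo array with a quick-sort routine

	<changelog text>

	Co-developed-by: <AI tool / model version>
	Signed-off-by: Human Author <human@example.org>
	---
	The sorting routine was initially generated by <AI tool>; I reviewed,
	adjusted and tested it.

	<diffstat and diff follow>

The first form survives into the git log; the second is only list-visible, which is exactly the trade-off discussed above.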
* Steven Rostedt (rostedt@goodmis.org) wrote: > On Wed, 30 Jul 2025 16:40:39 +0000 > "Dr. David Alan Gilbert" <linux@treblig.org> wrote: > > > * Steven Rostedt (rostedt@goodmis.org) wrote: > > > On Wed, 30 Jul 2025 16:34:28 +0100 > > > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > > > > > > Which looked like someone else (now Cc'd on this thread) took it public, > > > > (I didn't know of the tab discussion) > > Well, you were not there ;-) > > > > I guess a statement in submitting-patches.rst would suffice, or should it > > > be a separate standalone document? > > > > If it's separate I think it needs to have a link from submitting-patches.rst > > to get people to read it. > > > > To summarise some other things that came up between the threads: > > a) I think there should be a standard syntax for stating it is > > AI written; I'd suggested using a new tag, but others were > > arguing on the side of reusing existing tags, which seems OK > > if it is done in a standard way and doesn't confuse existing tools. > > Right. So I believe those that did not want the tag, wanted the statement > to be under the "---" so that it will not get into the git log. I prefer > the tag, but I'll be OK with the comment below the "---" as long as it is > clearly stated that the code was generated by AI. I think the 'clearly stated' is bound to get messy, especially with multiple (natural) languages. A tag doesn't have that ambiguity. My preference for having it above the --- is to allow later analysis (does the Foo AI tend to mess up checks for .... ?) It might also be useful for those other GPL licensed projects that don't accept AI generated code. > > > > b) There's a whole spectrum of: > > i) AI wrote the whole patch based on a vague requirement > > ii) AI is in the editor and tab completes stuff > > iii) AI suggests fixes/changes > > which do you care about? > > Yes, this is one of the controversial issues with having a policy. How much > does AI have to help you before you must disclose it. I would say basic > completions shouldn't be an issue. I've had editors where I type "for" it > then fills in "for (int i = 0; ; i++)". Is that AI? I don't think so. What happens when it looks at the type you're using and turns it into a use of a macro like list_for_each()? I suspect the line is fuzzy. Personally that doesn't worry me much, but I don't think I can tell others not to worry about it. > I'm more concern where you use AI to come up with an algorith. "Hey AI, > sort this array with a quick-sort routine". And it does so. That should be > denoted in the change log. Either above or below the '---'. > > > > > c) But then once you get stuff suggesting fixes/changes people were > > wondering if you should specify other non-AI tools as well. > > That might help reviewers who get bombed by a million patches > > from some conventional tool. > > Fixes and changes I don't think require disclosure as long as the human > looks at that code and figures out that the code needs to change. Now if > the AI does the fix for you, as in makes the patch, then yeah, you should > disclose it. But if you manual make the patch after looking at what AI > pointed you to, then it should be fine. > > > > > d) Either way there needs to be emphasis that the 'Signed-off-by' > > is a human declaring it's all legal and checked. > > That should go without saying. My point is that it needs saying loudly in the docs! Dave > -- Steve -- -----Open up your eyes, open up your mind, open up your code ------- / Dr. 
David Alan Gilbert | dave @ treblig.org | Running GNU/Linux | Happy In Hex | http://www.treblig.org
On Wed, Jul 30, 2025 at 04:40:39PM +0000, Dr. David Alan Gilbert wrote: > * Steven Rostedt (rostedt@goodmis.org) wrote: > > On Wed, 30 Jul 2025 16:34:28 +0100 > > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > > > > Which looked like someone else (now Cc'd on this thread) took it public, > > (I didn't know of the tab discussion) > > > > > and I wanted to see where that ended. I didn't want to start another > > > > discussion when there's already two in progress. > > > > > > OK, but having a document like this is not in my view optional - we must > > > have a clear, stated policy and one which ideally makes plain that it's > > > opt-in and maintainers may choose not to take these patches. > > > > That sounds pretty much exactly as what I was stating in our meeting. That > > is, it is OK to submit a patch written with AI but you must disclose it. It > > is also the right of the Maintainer to refuse to take any patch that was > > written in AI. They may feel that they want someone who fully understands > > what that patch does, and AI can cloud the knowledge of that patch from the > > author. > > > > I guess a statement in submitting-patches.rst would suffice, or should it > > be a separate standalone document? > > If it's separate I think it needs to have a link from submitting-patches.rst > to get people to read it. Absolutely agree. > > To summarise some other things that came up between the threads: > a) I think there should be a standard syntax for stating it is > AI written; I'd suggested using a new tag, but others were > arguing on the side of reusing existing tags, which seems OK > if it is done in a standard way and doesn't confuse existing tools. Yes. > > b) There's a whole spectrum of: > i) AI wrote the whole patch based on a vague requirement > ii) AI is in the editor and tab completes stuff > iii) AI suggests fixes/changes > which do you care about? I think any AI involvement that results in _changes to the code_ should require the tag. > > c) But then once you get stuff suggesting fixes/changes people were > wondering if you should specify other non-AI tools as well. > That might help reviewers who get bombed by a million patches > from some conventional tool. I think this would be useful, yes. We've had issues with this before. It'd be good to standardise, ideally. > > d) Either way there needs to be emphasis that the 'Signed-off-by' > is a human declaring it's all legal and checked. This is also a wise point with which I agree.
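For context, the declaration referred to in point d) is the existing Developer's Certificate of Origin mechanism, carried by the submitter's Signed-off-by: trailer, for example (the name here is made up):

    Signed-off-by: Jane Developer <jane@example.org>

Per Documentation/process/submitting-patches.rst, that line certifies the submitter has the right to pass the work on under the stated license, regardless of which tools helped produce it. The open question in this thread is only how tool involvement is disclosed, not who carries that certification.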
On Wed, 30 Jul 2025 18:10:51 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > I guess a statement in submitting-patches.rst would suffice, or should it > > > be a separate standalone document? > > > > If it's separate I think it needs to have a link from submitting-patches.rst > > to get people to read it. > > Absolutely agree. Sorry for cropping your response about submitting patches, but honestly, I think it may get more visibility there than in a separate doc. That's because submitting-patches is one of the documents kernel devs most often point patch submitters to! Of course, adding a link as suggested above may fix that too. > > > > > To summarise some other things that came up between the threads: > > a) I think there should be a standard syntax for stating it is > > AI written; I'd suggested using a new tag, but others were > > arguing on the side of reusing existing tags, which seems OK > > if it is done in a standard way and doesn't confuse existing tools. > > Yes. > > > > > b) There's a whole spectrum of: > > i) AI wrote the whole patch based on a vague requirement > > ii) AI is in the editor and tab completes stuff > > iii) AI suggests fixes/changes > > which do you care about? > > I think any AI involvement that results in _changes to the code_ should > require the tag. I disagree with this. As I said in my reply, I don't think having AI finish your for loops and such requires disclosure. I believe that may soon be the norm for most folks, and then we may get AI storms. And then, if you have people saying "I don't want any AI patches", does that mean those that use AI for templates and such will now be forbidden from submitting to those subsystems? I would say if AI creates any algorithm for you then it must be disclosed. > > > > > c) But then once you get stuff suggesting fixes/changes people were > > wondering if you should specify other non-AI tools as well. > > That might help reviewers who get bombed by a million patches > > from some conventional tool. I should add that non-AI tools should always come with a disclaimer that they were used. For the most part, most submissions that use non-AI tooling have done this. I just don't think we ever made any formal policy about it. -- Steve
On Wed, Jul 30, 2025 at 01:20:54PM -0400, Steven Rostedt wrote: > On Wed, 30 Jul 2025 18:10:51 +0100 > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > > > > I guess a statement in submitting-patches.rst would suffice, or should it > > > > be a separate standalone document? > > > > > > If it's separate I think it needs to have a link from submitting-patches.rst > > > to get people to read it. > > > > Absolutely agree. > > Sorry for cropping your response about submitting patches, but honestly, I > think it may get more visibility there than in a separate doc. That's > because submitting-patches is one of the most popular documents kernel devs > reference to people submitting patches! No worries! :) Yeah to be clear - I think this should be a link, very heavily highlighted. Or we could summarise (using AI? Kidding ;) what the document states there, with a link for details. > > Of course, adding a link as suggested above may fix that too. > > > > > > > > > To summarise some other things that came up between the threads: > > > a) I think there should be a standard syntax for stating it is > > > AI written; I'd suggested using a new tag, but others were > > > arguing on the side of reusing existing tags, which seems OK > > > if it is done in a standard way and doesn't confuse existing tools. > > > > Yes. > > > > > > > > b) There's a whole spectrum of: > > > i) AI wrote the whole patch based on a vague requirement > > > ii) AI is in the editor and tab completes stuff > > > iii) AI suggests fixes/changes > > > which do you care about? > > > > I think any AI involvment that results in _changes to the code_ should > > require the tag. > > I disagree with this. As I reply, I don't think if you have AI finishing > your for loops and such requires disclosure. As I believe that may soon be > the norm of most folks and then we may get AI storms. This is actually a very good point. This is going to be tricky, because hallucination is such a serious concern, and even this kind of autocomplete would make me want to have a closer look. > > And then, if you have people saying "I don't want any AI patches", does > that mean those that use AI for templates and such will now be forbidden > from submitting to those subsystems? I think that's something we can potentially get more fine-grained on in future. > > I would say if AI creates any algorithm for you then it must be disclosed. I think what consitutes an 'algorithm' is very nebulous and you're likely to get people messing around on the definition of this. I think rather we could have an 'unless' list like: Unless: - It's whitespace only, - You used autocomplete features for for loops etc. AND you have checked that no hallucination has occurred. The perennial problem with LLMs is that they can hallucinate in _very_ subtle ways that can be hard for humans to pick up on. But we also have to be practical so I agree, we might end up with the tags being noise if we don't make sensible exceptions (whether we like it or not). > > > > > > > > > c) But then once you get stuff suggesting fixes/changes people were > > > wondering if you should specify other non-AI tools as well. > > > That might help reviewers who get bombed by a million patches > > > from some conventional tool. > > I should add that non-AI tools should always come with a disclaimer that > they were used. For the most part, most submissions that use non-AI tooling > has done this. I just don't think we ever made any formal policy about it. 
Yeah I've noticed this too, would be nice to standardise though. Cheers, Lorenzo
On Wed, Jul 30, 2025 at 12:18:29PM -0400, Steven Rostedt wrote: >On Wed, 30 Jul 2025 16:34:28 +0100 >Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > >> > Which looked like someone else (now Cc'd on this thread) took it public, >> > and I wanted to see where that ended. I didn't want to start another >> > discussion when there's already two in progress. >> >> OK, but having a document like this is not in my view optional - we must >> have a clear, stated policy and one which ideally makes plain that it's >> opt-in and maintainers may choose not to take these patches. > >That sounds pretty much exactly as what I was stating in our meeting. That >is, it is OK to submit a patch written with AI but you must disclose it. It >is also the right of the Maintainer to refuse to take any patch that was >written in AI. They may feel that they want someone who fully understands This should probably be a stronger statement if we don't have it in the docs yet: a maintainer can refuse to take any patch, period. >what that patch does, and AI can cloud the knowledge of that patch from the >author. Maybe we should unify this with the academic research doc we already have? This way we can extend MAINTAINERS to indicate which subsystems are more open to research work (drivers/staging/ comes to mind) vs ones that aren't. Some sort of a "traffic light" system: 1. Green: the subsystem is happy to receive patches from any source. 2. Yellow: "If you're unfamiliar with the subsystem and using any tooling to generate your patches, please have a reviewed-by from a trusted developer before sending your patch". 3. No tool-generated patches without prior maintainer approval. -- Thanks, Sasha
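If a scheme like this were adopted, the natural place for it would be a new per-subsystem field in MAINTAINERS. A purely hypothetical sketch (no such field exists today; the "AI:" key and the entry itself are invented for illustration):

    EXAMPLE FRAMEWORK
    M:      Jane Developer <jane@example.org>
    L:      linux-example@vger.kernel.org
    S:      Maintained
    F:      drivers/example/
    AI:     yellow   (hypothetical key: green / yellow / red, mirroring the three levels above)

get_maintainer.pl could then be taught to print the value alongside the maintainer addresses, so submitters would see the policy before posting.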
On Wed, 30 Jul 2025 12:36:25 -0400 Sasha Levin <sashal@kernel.org> wrote: > > > >That sounds pretty much exactly as what I was stating in our meeting. That > >is, it is OK to submit a patch written with AI but you must disclose it. It > >is also the right of the Maintainer to refuse to take any patch that was > >written in AI. They may feel that they want someone who fully understands > > This should probably be a stronger statement if we don't have it in the > docs yet: a maintainer can refuse to take any patch, period. I disagree with that. They had better have technical reasons to refuse to take a patch. I would have big qualms if a maintainer just said "I don't like you and I'm not going to take any patches from you". This is a community project, and maintainers have been overridden before. Luckily, Linus has been pretty good at getting changes into the kernel when there was no clear technical argument that they should not be accepted. I believe the policy is that a maintainer may refuse any patch based on technical reasons. Now, patches can and are delayed due to maintainers just not having the time to review the patch. But that is eventually resolved if enough resources come into play. My point here is that AI can now add questions that maintainers can't answer. Is it really legal? Can the maintainer trust it? Yes, these too can fall under the "technical reasons" but having a clear policy that states that a maintainer may not want to even bother with AI generated code can perhaps give the maintainer something to point to if push comes to shove. > > >what that patch does, and AI can cloud the knowledge of that patch from the > >author. > > Maybe we should unify this with the academic research doc we already > have? I wouldn't think so. This is about submitting patches and a statement there may be easier found by those about to submit an AI patch. Just because they are using AI doesn't mean they'll think it's an academic research. > > This way we can extend MAINTAINERS to indicate which subsystems are > more open to research work (drivers/staging/ comes to mind) vs ones that > aren't. I wouldn't call it research work. Right now people who may be playing with AI models may think it's "research", but that's not going to be the majority of AI submissions. > > Some sort of a "traffic light" system: > > 1. Green: the subsystem is happy to receive patches from any source. > > 2. Yellow: "If you're unfamiliar with the subsystem and using any > tooling to generate your patches, please have a reviewed-by from a > trusted developer before sending your patch". > > 3. No tool-generated patches without prior maintainer approval. Perhaps. Of course there's the Coccinelle scripts that fix a bunch of code around the kernel that will like be ignored in this. But this may still be a good start. -- Steve
On Wed, Jul 30, 2025 at 01:05:31PM -0400, Steven Rostedt wrote: >On Wed, 30 Jul 2025 12:36:25 -0400 >Sasha Levin <sashal@kernel.org> wrote: > >> > >> >That sounds pretty much exactly as what I was stating in our meeting. That >> >is, it is OK to submit a patch written with AI but you must disclose it. It >> >is also the right of the Maintainer to refuse to take any patch that was >> >written in AI. They may feel that they want someone who fully understands >> >> This should probably be a stronger statement if we don't have it in the >> docs yet: a maintainer can refuse to take any patch, period. > >I disagree with that. They had better have technical reasons to refuse to >take a patch. I would have big qualms if a maintainer just said "I don't >like you and I'm not going to take any patches from you". > >This is a community project, and maintainers have been overridden before. >Luckily, Linus has been pretty good at getting changes into the kernel when >there was no clear technical argument that they should not be accepted. > >I believe the policy is that a maintainer may refuse any patch based on >technical reasons. Now, patches can and are delayed due to maintainers just >not having the time to review the patch. But that is eventually resolved if >enough resources come into play. > >My point here is that AI can now add questions that maintainers can't >answer. Is it really legal? Can the maintainer trust it? Yes, these too can >fall under the "technical reasons" but having a clear policy that states >that a maintainer may not want to even bother with AI generated code can >perhaps give the maintainer something to point to if push comes to shove. I don't think that those are technical aspects. The legality question is answered by the DCO where a human represents that he is allowed to submit the code. You should have the same concerns with humans sending in non-GPL-compatible code. Similarily the argument around not trusting the code is equivalent to not trusting the person who sent the code in. AI doesn't send patches on it's own - humans do. This is basically saying "I didn't even look at your patch because I don't trust you". >> Maybe we should unify this with the academic research doc we already >> have? > >I wouldn't think so. This is about submitting patches and a statement there >may be easier found by those about to submit an AI patch. Just because they >are using AI doesn't mean they'll think it's an academic research. Not in the sense that AI is research, but more that this is code coming from someone who is unable to reliably verify the patch that is being sent in. The source can be academic research, AI, or whatever else comes along. It'll just be nice to have a unified set of rules around it. Otherwise the amount of combinations will explode (in which category do we put in academic researchers sending in AI generated code?). >> Some sort of a "traffic light" system: >> >> 1. Green: the subsystem is happy to receive patches from any source. >> >> 2. Yellow: "If you're unfamiliar with the subsystem and using any >> tooling to generate your patches, please have a reviewed-by from a >> trusted developer before sending your patch". >> >> 3. No tool-generated patches without prior maintainer approval. > >Perhaps. Of course there's the Coccinelle scripts that fix a bunch of code >around the kernel that will like be ignored in this. But this may still be >a good start. It'll be hard to draw a line here, so I suggest we don't try. 
Are AI generated .cocci semantic patches that are then transformed into C patches and sent in by a human ok? -- Thanks, Sasha
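For readers who have not used Coccinelle: a semantic patch is a short SmPL script describing a tree-wide transformation, which spatch then turns into ordinary C diffs. A classic, hand-written example looks roughly like this:

    @@
    expression ptr, size, flags;
    @@
    - ptr = kmalloc(size, flags);
    - memset(ptr, 0, size);
    + ptr = kzalloc(size, flags);

Run with something like "spatch --sp-file kzalloc.cocci --in-place drivers/", this produces the same C-level patches whether the .cocci file was written by a human or generated by an LLM, which is exactly why the question above is hard to answer mechanically.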
On Wed, 30 Jul 2025 13:46:47 -0400 Sasha Levin <sashal@kernel.org> wrote: > >> Some sort of a "traffic light" system: > >> > >> 1. Green: the subsystem is happy to receive patches from any source. > >> > >> 2. Yellow: "If you're unfamiliar with the subsystem and using any > >> tooling to generate your patches, please have a reviewed-by from a > >> trusted developer before sending your patch". > >> > >> 3. No tool-generated patches without prior maintainer approval. > > That sounds like a terrible idea. I mean, maintainers should be green for good patches and red for bad ones. It doesn't matter if they're aided or generated by AI or $TOOL. In the end, the one submitting it should be able to properly understand, describe and debug it. They should also be able to test it in real life before submitting. AI can do good things, but can also do bad things. I'd say that anyone using it should double-check the code at least twice, checking whether there are any hidden bugs. I've been doing some experiments myself: sometimes an LLM can quickly pinpoint something broken, do root cause analysis, complete a TODO requirement and even write unit tests and code. However, sometimes AI starts to "hallucinate"(*), pointing to things that don't exist, like inventing fields on structures and command line arguments that don't exist (it likely inferred the names from projects that could be using similar patterns/goals). (*) AI being a statistics tool, the correct term is to diverge. > >Perhaps. Of course there's the Coccinelle scripts that fix a bunch of code > >around the kernel that will like be ignored in this. But this may still be > >a good start. This is something that maintainers don't want: yet another tool that newbies, wanting their one microsecond of fame from getting patches merged, can use to start sending stuff that wasn't tested and doesn't bring any value. Maybe we can add some text about that. Thanks, Mauro
On Wed, 30 Jul 2025 13:46:47 -0400 Sasha Levin <sashal@kernel.org> wrote: > >My point here is that AI can now add questions that maintainers can't > >answer. Is it really legal? Can the maintainer trust it? Yes, these too can > >fall under the "technical reasons" but having a clear policy that states > >that a maintainer may not want to even bother with AI generated code can > >perhaps give the maintainer something to point to if push comes to shove. > > I don't think that those are technical aspects. I didn't either, but I was just saying one could possibly argue that they are. But that also states why it should be called out explicitly. As refusing AI patches may not be a technical issue where all other refusals should be. > >I wouldn't think so. This is about submitting patches and a statement there > >may be easier found by those about to submit an AI patch. Just because they > >are using AI doesn't mean they'll think it's an academic research. > > Not in the sense that AI is research, but more that this is code coming > from someone who is unable to reliably verify the patch that is being > sent in. The issue I have is that the person sending in the patch may not know that they don't understand the patch. We've had those in the past. I could imagine AI creating more of these kinds of submissions. > > The source can be academic research, AI, or whatever else comes along. > > It'll just be nice to have a unified set of rules around it. Otherwise > the amount of combinations will explode (in which category do we put in > academic researchers sending in AI generated code?). Research folks know they are doing research. Those using AI may likely will not, even if they are. Hence why I would like this outside of the academic research document. > > >> Some sort of a "traffic light" system: > >> > >> 1. Green: the subsystem is happy to receive patches from any source. > >> > >> 2. Yellow: "If you're unfamiliar with the subsystem and using any > >> tooling to generate your patches, please have a reviewed-by from a > >> trusted developer before sending your patch". > >> > >> 3. No tool-generated patches without prior maintainer approval. > > > >Perhaps. Of course there's the Coccinelle scripts that fix a bunch of code > >around the kernel that will like be ignored in this. But this may still be > >a good start. > > It'll be hard to draw a line here, so I suggest we don't try. Agreed. But perhaps we could have a note that some subsystems expect all submissions done by a human. Although treewide patches that change interfaces that are fixed up by coccinelle may not have a choice. > > Are AI generated .cocci semantic patches that are then transformed into > C patches and sent in by a human ok? > Up to the maintainer. -- Steve
On Wed, Jul 30, 2025 at 01:46:47PM -0400, Sasha Levin wrote: > Similarily the argument around not trusting the code is equivalent to > not trusting the person who sent the code in. AI doesn't send patches on > it's own - humans do. This is basically saying "I didn't even look at > your patch because I don't trust you". One name: Markus Elfring. Ever tried to reason with that one? Or Hillf Danton, for that matter. And I absolutely will refuse to take patches from somebody who would consistently fail to explain why the patch is correct and needed. Sasha, this is the elephant in the room: we *ALREADY* get "contributions" that very clearly stem from "$TOOL says so, what else do you need?" kind of reasoning and some of that dreck ends up in the tree. AI will serve as a force multiplier for those... persons.
On Wed, 30 Jul 2025 18:59:09 +0100 Al Viro wrote: > On Wed, Jul 30, 2025 at 01:46:47PM -0400, Sasha Levin wrote: > > > Similarily the argument around not trusting the code is equivalent to > > not trusting the person who sent the code in. AI doesn't send patches on > > it's own - humans do. This is basically saying "I didn't even look at > > your patch because I don't trust you". > > One name: Markus Elfring. Ever tried to reason with that one? Or Hillf > Danton, for that matter. > Frankly I delivered nothing with Signed-off-by to you for a couple of years even though I can communicate well with syzbot. And it has nothing to do with "trust you", simply because the patch could make no sense at best even if it comes from the trust circle. I am not so tame to get into any silver cycle/cage. Hillf Danton
On Wed, Jul 30, 2025 at 06:59:09PM +0100, Al Viro wrote: > > And I absolutely will refuse to take patches from somebody who would > consistently fail to explain why the patch is correct and needed. Sasha, > this is the elephant in the room: we *ALREADY* get "contributions" that > very clearly stem from "$TOOL says so, what else do you need?" kind of > reasoning and some of that dreck ends up in the tree. AI will serve as > a force multiplier for those... persons. > Any tool can be a force multiplier, either for good or for ill. For example, I suspect we have a much greater set of problems from $TOOLs other than Large Language Models. For example, people who use "git grep strcpy" and send patches (because strcpy is eeeevil), some of which don't even compile, and some of which are just plain wrong. Ditto people who take a syzbot reproducer, make some change which makes the problem go away, and then submit a patch, only for maintainers to point out that the patch introduced bugs and/or really didn't fix the problem. I don't think that we should therefore forbid any use of patches generated using the assistance of "git grep" or syzbot. That's because I view this as a problem of the people using the tool, not the tool itself. It's just that AI / LLM have become a Boogeyman that inspires a lot of fear and loathing. - Ted
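As a concrete illustration of the "just plain wrong" category (the function is invented for the example): a mechanical strcpy() to strscpy() conversion that compiles cleanly yet silently truncates, because sizeof() was applied to a pointer instead of the destination array:

    /* Before: flagged by a "git grep strcpy" sweep. */
    static void copy_name(char *dst, const char *src)
    {
            strcpy(dst, src);
    }

    /* After: looks "fixed", but sizeof(dst) is the size of the pointer
     * (typically 8 bytes), so anything longer than 7 characters is cut off. */
    static void copy_name(char *dst, const char *src)
    {
            strscpy(dst, src, sizeof(dst));
    }

A tool-driven sweep can happily produce either version; only a human who knows how big the destination actually is can pick the right bound.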
On Wed, Jul 30, 2025 at 03:10:33PM -0400, Theodore Ts'o wrote: > On Wed, Jul 30, 2025 at 06:59:09PM +0100, Al Viro wrote: > > > > And I absolutely will refuse to take patches from somebody who would > > consistently fail to explain why the patch is correct and needed. Sasha, > > this is the elephant in the room: we *ALREADY* get "contributions" that > > very clearly stem from "$TOOL says so, what else do you need?" kind of > > reasoning and some of that dreck ends up in the tree. AI will serve as > > a force multiplier for those... persons. > > > > Any tool can be a force multipler, either for good or for ill. > > For example, I suspect we have a much greater set of problems from > $TOOL's other than Large Language Models. For example people who use > "git grep strcpy" and send patches (because strcpy is eeeevil), some > of which don't even compile, and some of which are just plain wrong. > Ditto people who take a syzbot reproducer, make some change which > makes the problem go away, The "problem" being defined as "The Most Holy Tool Is Making Unhappy Noises; Must Appease It". > and then submit a patch, and only for > maintainers to point ut that the patch introduced bugs and/or really > didn't fix the problem. IME the real PITA is getting them to understand what the problem is. And dealing with them without CoC getting overexcited, of course, but that's not all that hard. > I don't think that we should therefore forbid any use of patches > generated using the assistance of "git grep" or syzbot. That's > because I view this as a problem of the people using the tool, not the > tool itself. It's just that AI / LLM have been become a Boogeyman > that inspires a lot of fear and loathing. LLM has some uniquely unpleasant properties in that area - it is designed to generate a plausibly-sounding line of bullshit, after all...
On Wed, 30 Jul 2025 15:10:33 -0400 "Theodore Ts'o" <tytso@mit.edu> wrote: > Any tool can be a force multiplier, either for good or for ill. > > For example, I suspect we have a much greater set of problems from > $TOOLs other than Large Language Models. For example, people who use > "git grep strcpy" and send patches (because strcpy is eeeevil), some > of which don't even compile, and some of which are just plain wrong. > Ditto people who take a syzbot reproducer, make some change which > makes the problem go away, and then submit a patch, only for > maintainers to point out that the patch introduced bugs and/or really > didn't fix the problem. > > I don't think that we should therefore forbid any use of patches > generated using the assistance of "git grep" or syzbot. That's > because I view this as a problem of the people using the tool, not the > tool itself. It's just that AI / LLM have become a Boogeyman > that inspires a lot of fear and loathing. I think some of the fear is that when a new tool becomes available, a bunch of patch monkeys start sending "fixes" to the maintainers because said tool said so. There have been times I had to ask for the cocci scripts to be changed because too many people were flagging so-called issues in my code that were more about guidelines and caused no real bugs. LLMs are now a huge new feature that many companies (including ours) are highly encouraging their engineers to start using. I can see that when someone gets comfortable with the LLM code that is produced, they will then turn their attention to us. Kernel code has a lot more subtleties than other code (like stack constraints, interrupts, etc) that an AI may not be aware of. This might just be paranoia, but we want to be prepared if it does happen. -- Steve
On Wed, Jul 30, 2025 at 06:59:09PM +0100, Al Viro wrote: >On Wed, Jul 30, 2025 at 01:46:47PM -0400, Sasha Levin wrote: > >> Similarily the argument around not trusting the code is equivalent to >> not trusting the person who sent the code in. AI doesn't send patches on >> it's own - humans do. This is basically saying "I didn't even look at >> your patch because I don't trust you". > >One name: Markus Elfring. Ever tried to reason with that one? Or Hillf >Danton, for that matter. > >And I absolutely will refuse to take patches from somebody who would >consistently fail to explain why the patch is correct and needed. Sasha, >this is the elephant in the room: we *ALREADY* get "contributions" that >very clearly stem from "$TOOL says so, what else do you need?" kind of >reasoning and some of that dreck ends up in the tree. AI will serve as >a force multiplier for those... persons. This is exactly my argument Al :) You, as a maintainer, should be able to just reject patches without having to provide a technical explanation for each patch you ignore. If someone new comes along and bombards you with AI generated crap and useless review comments, you should be able to just block him and point to something under Documentation/ that will support that decision. -- Thanks, Sasha
On Wed, Jul 30, 2025 at 02:10:26PM -0400, Sasha Levin wrote: > On Wed, Jul 30, 2025 at 06:59:09PM +0100, Al Viro wrote: > > On Wed, Jul 30, 2025 at 01:46:47PM -0400, Sasha Levin wrote: > > > > > Similarily the argument around not trusting the code is equivalent to > > > not trusting the person who sent the code in. AI doesn't send patches on > > > it's own - humans do. This is basically saying "I didn't even look at > > > your patch because I don't trust you". > > > > One name: Markus Elfring. Ever tried to reason with that one? Or Hillf > > Danton, for that matter. > > > > And I absolutely will refuse to take patches from somebody who would > > consistently fail to explain why the patch is correct and needed. Sasha, > > this is the elephant in the room: we *ALREADY* get "contributions" that > > very clearly stem from "$TOOL says so, what else do you need?" kind of > > reasoning and some of that dreck ends up in the tree. AI will serve as > > a force multiplier for those... persons. > > This is exactly my argument Al :) > > You, as a maintainer, should be able to just reject patches without > having to provide a technical explanation for each patch you ignore. > > If someone new comes along and bombards you with AI generated crap and > useless review comments, you should be able to just block him and point > to something under Documentation/ that will support that decision. I'm in alignment with Al and your view here FWIW! Though I do think Steven has a point in that there must be a _good reason_ that aligns with the community for doing so, and it shouldn't be arbitrary. LLMs do throw up an interesting new conundrum here in that they sort of fall between two posts on this so we probably need to be explicit in saying that it is up to maintainers in this AI doc in my view. Cheers, Lorenzo
On Wed, Jul 30, 2025 at 07:24:13PM +0100, Lorenzo Stoakes wrote: >On Wed, Jul 30, 2025 at 02:10:26PM -0400, Sasha Levin wrote: >> On Wed, Jul 30, 2025 at 06:59:09PM +0100, Al Viro wrote: >> > On Wed, Jul 30, 2025 at 01:46:47PM -0400, Sasha Levin wrote: >> > >> > > Similarily the argument around not trusting the code is equivalent to >> > > not trusting the person who sent the code in. AI doesn't send patches on >> > > it's own - humans do. This is basically saying "I didn't even look at >> > > your patch because I don't trust you". >> > >> > One name: Markus Elfring. Ever tried to reason with that one? Or Hillf >> > Danton, for that matter. >> > >> > And I absolutely will refuse to take patches from somebody who would >> > consistently fail to explain why the patch is correct and needed. Sasha, >> > this is the elephant in the room: we *ALREADY* get "contributions" that >> > very clearly stem from "$TOOL says so, what else do you need?" kind of >> > reasoning and some of that dreck ends up in the tree. AI will serve as >> > a force multiplier for those... persons. >> >> This is exactly my argument Al :) >> >> You, as a maintainer, should be able to just reject patches without >> having to provide a technical explanation for each patch you ignore. >> >> If someone new comes along and bombards you with AI generated crap and >> useless review comments, you should be able to just block him and point >> to something under Documentation/ that will support that decision. > >I'm in alignment with Al and your view here FWIW! > >Though I do think Steven has a point in that there must be a _good reason_ >that aligns with the community for doing so, and it shouldn't be arbitrary. I don't disagree with Steve: Ideally there is a technical reason to block submissions, but as this is a judgement call I'd rather defer it to the maintainer (usually people don't become maintainers by making bad decisions :) ). The tricky part is that this is all subjective... What's "good enough"? As a compromise, what about allowing a maintainer to block submissions without having to provide a technical reason, but then offer a path of escalation with the TAB to mediate between the developer and the maintainer? -- Thanks, Sasha
On Wed, Jul 30, 2025 at 12:36:25PM -0400, Sasha Levin wrote: > On Wed, Jul 30, 2025 at 12:18:29PM -0400, Steven Rostedt wrote: > > On Wed, 30 Jul 2025 16:34:28 +0100 > > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > > > > Which looked like someone else (now Cc'd on this thread) took it public, > > > > and I wanted to see where that ended. I didn't want to start another > > > > discussion when there's already two in progress. > > > > > > OK, but having a document like this is not in my view optional - we must > > > have a clear, stated policy and one which ideally makes plain that it's > > > opt-in and maintainers may choose not to take these patches. > > > > That sounds pretty much exactly as what I was stating in our meeting. That > > is, it is OK to submit a patch written with AI but you must disclose it. It > > is also the right of the Maintainer to refuse to take any patch that was > > written in AI. They may feel that they want someone who fully understands > > This should probably be a stronger statement if we don't have it in the > docs yet: a maintainer can refuse to take any patch, period. > > > what that patch does, and AI can cloud the knowledge of that patch from the > > author. > > Maybe we should unify this with the academic research doc we already > have? > > This way we can extend MAINTAINERS to indicate which subsystems are > more open to research work (drivers/staging/ comes to mind) vs ones that > aren't. > > Some sort of a "traffic light" system: > > 1. Green: the subsystem is happy to receive patches from any source. > > 2. Yellow: "If you're unfamiliar with the subsystem and using any > tooling to generate your patches, please have a reviewed-by from a > trusted developer before sending your patch". > > 3. No tool-generated patches without prior maintainer approval. This sounds good, with a default on red. Which would enforce the opt-in part. > > -- > Thanks, > Sasha
On Wed, 30 Jul 2025, Lorenzo Stoakes wrote: > > This way we can extend MAINTAINERS to indicate which subsystems are > > more open to research work (drivers/staging/ comes to mind) vs ones that > > aren't. > > > > Some sort of a "traffic light" system: > > > > 1. Green: the subsystem is happy to receive patches from any source. > > > > 2. Yellow: "If you're unfamiliar with the subsystem and using any > > tooling to generate your patches, please have a reviewed-by from a > > trusted developer before sending your patch". > > > > 3. No tool-generated patches without prior maintainer approval. > > This sounds good, with a default on red. Which would enforce the opt-in > part. I strongly believe that at least a distinction between 'static tools' and 'LLM-based tools' needs to be introduced here. Thanks, -- Jiri Kosina SUSE Labs
On Wed, Jul 30, 2025 at 05:59:25PM +0100, Lorenzo Stoakes wrote: > On Wed, Jul 30, 2025 at 12:36:25PM -0400, Sasha Levin wrote: > > Some sort of a "traffic light" system: > > > > 1. Green: the subsystem is happy to receive patches from any source. > > > > 2. Yellow: "If you're unfamiliar with the subsystem and using any > > tooling to generate your patches, please have a reviewed-by from a > > trusted developer before sending your patch". > > > > 3. No tool-generated patches without prior maintainer approval. > > This sounds good, with a default on red. Which would enforce the opt-in > part. This is way too draconian. The human is still responsible for sending patches -- their reputation is on the line if things go badly. I think we can capture the essence of "don't send bad patches, regardless of tool" without saying "if you use this class of tool, you are banned from sending anything that it helped you with." That's not useful, realistic, nor enforceable. I get a sense that many people in this thread haven't actually used these tools themselves. It requires active management like anything else: Coccinelle isn't going to get things 100% right based on your first stab at a script. Neither is an LLM. It still requires the human to DTRT. And just as some examples, here are my LLM assisted patches so far: https://lore.kernel.org/lkml/20250717085156.work.363-kees@kernel.org/ https://lore.kernel.org/lkml/20250724030233.work.486-kees@kernel.org/ https://lore.kernel.org/lkml/20250724080756.work.741-kees@kernel.org/ Even the latter I had to walk it through the analysis and suggest a style edit. With the KUnit tests, I had to do significant editing/adjustment/etc to all of these. -- Kees Cook
On Wed, Jul 30, 2025 at 05:59:25PM +0100, Lorenzo Stoakes wrote: > On Wed, Jul 30, 2025 at 12:36:25PM -0400, Sasha Levin wrote: > > Some sort of a "traffic light" system: > > 1. Green: the subsystem is happy to receive patches from any source. > > 2. Yellow: "If you're unfamiliar with the subsystem and using any > > tooling to generate your patches, please have a reviewed-by from a > > trusted developer before sending your patch". > > 3. No tool-generated patches without prior maintainer approval. > This sounds good, with a default on red. Which would enforce the opt-in > part. That's probably a bit much - I suspect we don't want to default block coccinelle for example. It's going to be very tool and technology dependent, probably the main thing that's generally applicable is going to be that people should say if and how they've used tools.
On Wed, Jul 30, 2025 at 05:59:25PM +0100, Lorenzo Stoakes wrote: >On Wed, Jul 30, 2025 at 12:36:25PM -0400, Sasha Levin wrote: >> On Wed, Jul 30, 2025 at 12:18:29PM -0400, Steven Rostedt wrote: >> > On Wed, 30 Jul 2025 16:34:28 +0100 >> > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: >> > >> > > > Which looked like someone else (now Cc'd on this thread) took it public, >> > > > and I wanted to see where that ended. I didn't want to start another >> > > > discussion when there's already two in progress. >> > > >> > > OK, but having a document like this is not in my view optional - we must >> > > have a clear, stated policy and one which ideally makes plain that it's >> > > opt-in and maintainers may choose not to take these patches. >> > >> > That sounds pretty much exactly as what I was stating in our meeting. That >> > is, it is OK to submit a patch written with AI but you must disclose it. It >> > is also the right of the Maintainer to refuse to take any patch that was >> > written in AI. They may feel that they want someone who fully understands >> >> This should probably be a stronger statement if we don't have it in the >> docs yet: a maintainer can refuse to take any patch, period. >> >> > what that patch does, and AI can cloud the knowledge of that patch from the >> > author. >> >> Maybe we should unify this with the academic research doc we already >> have? >> >> This way we can extend MAINTAINERS to indicate which subsystems are >> more open to research work (drivers/staging/ comes to mind) vs ones that >> aren't. >> >> Some sort of a "traffic light" system: >> >> 1. Green: the subsystem is happy to receive patches from any source. >> >> 2. Yellow: "If you're unfamiliar with the subsystem and using any >> tooling to generate your patches, please have a reviewed-by from a >> trusted developer before sending your patch". >> >> 3. No tool-generated patches without prior maintainer approval. > >This sounds good, with a default on red. Which would enforce the opt-in >part. I don't think we should (or can) set a policy here for other maintainers. Right now we allow tool-assisted contributions - flipping this would mean we need to get an ack from at least a majority of the MAINTAINERS folks. -- Thanks, Sasha
On Wed, 30 Jul 2025 13:12:54 -0400 Sasha Levin <sashal@kernel.org> wrote: > >> > >> Some sort of a "traffic light" system: > >> > >> 1. Green: the subsystem is happy to receive patches from any source. > >> > >> 2. Yellow: "If you're unfamiliar with the subsystem and using any > >> tooling to generate your patches, please have a reviewed-by from a > >> trusted developer before sending your patch". > >> > >> 3. No tool-generated patches without prior maintainer approval. > > Actually, I'm not sure I care for the above, because honestly, I wouldn't know which to set my subsystem to. It would be a case by case basis. Sometimes I'm fine with the automated tooling as I can tell that the one using it knows what they are doing and use it as a tool. But I have refused patches from people where it was obvious that they had no idea of what they were doing and just submitted something because "checkpatch" or "coccinelle" said so. -- Steve
On Wed, Jul 30, 2025 at 01:12:54PM -0400, Sasha Levin wrote: > On Wed, Jul 30, 2025 at 05:59:25PM +0100, Lorenzo Stoakes wrote: > > On Wed, Jul 30, 2025 at 12:36:25PM -0400, Sasha Levin wrote: > > > On Wed, Jul 30, 2025 at 12:18:29PM -0400, Steven Rostedt wrote: > > > > On Wed, 30 Jul 2025 16:34:28 +0100 > > > > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > > > > > > > > Which looked like someone else (now Cc'd on this thread) took it public, > > > > > > and I wanted to see where that ended. I didn't want to start another > > > > > > discussion when there's already two in progress. > > > > > > > > > > OK, but having a document like this is not in my view optional - we must > > > > > have a clear, stated policy and one which ideally makes plain that it's > > > > > opt-in and maintainers may choose not to take these patches. > > > > > > > > That sounds pretty much exactly as what I was stating in our meeting. That > > > > is, it is OK to submit a patch written with AI but you must disclose it. It > > > > is also the right of the Maintainer to refuse to take any patch that was > > > > written in AI. They may feel that they want someone who fully understands > > > > > > This should probably be a stronger statement if we don't have it in the > > > docs yet: a maintainer can refuse to take any patch, period. > > > > > > > what that patch does, and AI can cloud the knowledge of that patch from the > > > > author. > > > > > > Maybe we should unify this with the academic research doc we already > > > have? > > > > > > This way we can extend MAINTAINERS to indicate which subsystems are > > > more open to research work (drivers/staging/ comes to mind) vs ones that > > > aren't. > > > > > > Some sort of a "traffic light" system: > > > > > > 1. Green: the subsystem is happy to receive patches from any source. > > > > > > 2. Yellow: "If you're unfamiliar with the subsystem and using any > > > tooling to generate your patches, please have a reviewed-by from a > > > trusted developer before sending your patch". > > > > > > 3. No tool-generated patches without prior maintainer approval. > > > > This sounds good, with a default on red. Which would enforce the opt-in > > part. > > I don't think we should (or can) set a policy here for other > maintainers. Right now we allow tool-assisted contributions - flipping > this would mean we need to get an ack from at least a majority of the > MAINTAINERS folks. Sasha, with respect this is totally crazy. Assuming every maintainer accepts AI patches unless explicitly opted out is very clearly not something that will be acceptable to people. Assuming an LF policy most maintainers won't be aware of applies with the kind of ramifications this will inevitably have seems very unreasonable to me. You might suggest presuming a policy for maintainers is inappropriate, but you are doing so wrt the LF policy on the assumption everybody is aware and agrees with it. That same document says individual projects can _override_ this as they please. So the introduction of this document can very well override that. We at the very least need this to be raised at the maintainers summit with a very clear decision on opt-in vs. opt-out, with the decision being communicated clearly. It's maintainers like me that'll have to deal with the consequences of this. Thanks, Lorenzo
On Wed, 30 Jul 2025 18:23:14 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > I don't think we should (or can) set a policy here for other > > maintainers. Right now we allow tool-assisted contributions - flipping > > this would mean we need to get an ack from at least a majority of the > > MAINTAINERS folks. > > Sasha, with respect this is totally crazy. I'll somewhat defend Sasha on this. > > Assuming every maintainer accepts AI patches unless explicitly opted out is > very clearly not something that will be acceptable to people. You can opt out when you receive your first AI patch ;-) > > Assuming an LF policy most maintainers won't be aware of applies with the > kind of ramifications this will inevitably have seems very unreasonable to > me. This is why the policy should just be "It's up to the maintainer to decide if they will take the patch or not". If the maintainer starts getting too many submissions, then they can update the MAINTAINERS file to say "stop all AI patches to me!". Just like we have a way to opt out of the get_maintainer.pl "touched this file" behavior via the .get_maintainer.ignore file. > > You might suggest presuming a policy for maintainers is inappropriate, but > you are doing so wrt the LF policy on the assumption everybody is aware and > agrees with it. > > That same document says individual projects can _override_ this as they > please. So the introduction of this document can very well override that. > > We at the very least need this to be raised at the maintainers summit with > a very clear decision on opt-in vs. opt-out, with the decision being > communicated clearly. Agreed. > > It's maintainers like me that'll have to deal with the consequences of > this. And you may be the first to opt in ;-) -- Steve
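For reference, the mechanism being analogized to here is the .get_maintainer.ignore file at the top of the tree, which simply lists one address per line that get_maintainer.pl should stop suggesting, e.g. (address invented for illustration):

    Jane Developer <jane@example.org>

A per-subsystem "no AI patches" marker would presumably be a similarly small, self-service addition, whether in MAINTAINERS or in a sibling file, though as noted elsewhere in the thread nothing of the sort exists yet.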
On Wed, Jul 30, 2025 at 01:32:20PM -0400, Steven Rostedt wrote: >On Wed, 30 Jul 2025 18:23:14 +0100 >Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: >> You might suggest presuming a policy for maintainers is inappropriate, but >> you are doing so wrt the LF policy on the assumption everybody is aware and >> agrees with it. No, this isn't about the LF policy. Let's completely ignore it for the sake of this discussion. All we require now is a signed DCO. The kernel's own policy, based on Documentation/, is that we don't even need to disclose tool usage. >> That same document says individual projects can _override_ this as they >> please. So the introduction of this document can very well override that. >> >> We at the very least need this to be raised at the maintainers summit with >> a very clear decision on opt-in vs. opt-out, with the decision being >> communicated clearly. >Agreed. Right - if this is brought up during the maintainers summit and most folks are in favor of "red" (or Linus just makes a decision), we can go ahead and adopt our own policy and set it to "red". What I'm saying is that we can't just arbitrarily set it to "red" based on this thread, as this is a change from our current policy. -- Thanks, Sasha
On Wed, Jul 30, 2025 at 02:03:38PM -0400, Sasha Levin wrote: > On Wed, Jul 30, 2025 at 01:32:20PM -0400, Steven Rostedt wrote: > > On Wed, 30 Jul 2025 18:23:14 +0100 > > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > > You might suggest presuming a policy for maintainers is inappropriate, but > > > you are doing so wrt the LF policy on the assumption everybody is aware and > > > agrees with it. > > No, this isn't about the LF policy. Let's completely ignore it for the > sake of this discussion. Ack. > > All we require now is a signed DCO. The kernel's own policy, based on > Documentation/, is that we don't even need to disclose tool usage. Right, yeah, this seems to be sort of implicit though, or sort of 'by accident' ultimately (I mean who could have seen this stuff coming right? :) > > > > That same document says individual projects can _override_ this as they > > > please. So the introduction of this document can very well override that. > > > > > > We at the very least need this to be raised at the maintainers summit with > > > a very clear decision on opt-in vs. opt-out, with the decision being > > > communicated clearly. > > > > Agreed. > > Right - if this is brought up during maintainer's summit and most folks > are in favor of "red" (or Linus just makes a desicion), we can go ahead > and adopt our own policy and set it to "red". I think this shouldn't be an 'if' :) I'm not usually invited to the MS so I shall leave this to those who are to ensure this is brought up :P But I think it's an important thing to get some form of community consensus on. > > What I'm saying is that we can't just arbitrarily set it to "red" based > on this thread as this is a change from our current policy OK so I think we're in agreement then that deferring to the maintainer's summit or some form of community consensus is the right way to go :) And agreed, this thread is more a healthy expression of opinions in figuring out the problem space more than anything in my view. Nothing should be arbitrarily decided here of course. Cheers, Lorenzo
On Wed, Jul 30, 2025 at 01:32:20PM -0400, Steven Rostedt wrote: > > > > > Assuming every maintainer accepts AI patches unless explicitly opted out is > > very clearly not something that will be acceptable to people. > > You can opt out when you receive your first AI patch ;-) Yeah, there's just no way maintainers are going to be fine with this. But this is why mediating this via the maintainers summit is a great idea - let's get that feedback there. And if I'm wrong and everybody's cool with it I will happily eat copious humble pie :) > > > > > Assuming an LF policy most maintainers won't be aware of applies with the > > kind of ramifications this will inevitably have seems very unreasonable to > > me. > > This is why the policy should just be "It's up to the maintainer to decide > if they will take the patch or not". Isn't this just what review is? :) I mean having a tag makes life easier on this front, but I think we should be as conservative as possible, and in my view that position is to default to not accepting. > > If the maintainer starts getting too many submissions, then they can update > the MAINTAINERS file to say "stop all AI patches to me!". Just like we have > an opt-in for to not be part of the get_maintainer.pl "touched this file" > with the .get_maintainer.ignore script. Again I really don't think this aligns with what maintainers will want. But again I think that is better settled or at least addressed at the maintainers summit. > > > > > > You might suggest presuming a policy for maintainers is inappropriate, but > > you are doing so wrt the LF policy on the assumption everybody is aware and > > agrees with it. > > > > That same document says individual projects can _override_ this as they > > please. So the introduction of this document can very well override that. > > > > We at the very least need this to be raised at the maintainers summit with > > a very clear decision on opt-in vs. opt-out, with the decision being > > communicated clearly. > > Agreed. > > > > > It's maintainers like me that'll have to deal with the consequences of > > this. > > And you may be the first to opt-in ;-) Well I'm taking no position on the issue at hand, more so how we do the _policy_ bits, so who knows ;) Cheers, Lorenzo
On Wed, Jul 30, 2025 at 07:04:13PM +0100, Lorenzo Stoakes wrote: > On Wed, Jul 30, 2025 at 01:32:20PM -0400, Steven Rostedt wrote: > > If the maintainer starts getting too many submissions, then they can update > > the MAINTAINERS file to say "stop all AI patches to me!". Just like we have > > an opt-in for to not be part of the get_maintainer.pl "touched this file" > > with the .get_maintainer.ignore script. > Again I really don't think this aligns with what maintainers will want. I suspect this may be more varied than you're expecting, and that the attitudes of people maintaining core kernel things are going to be on average different to those of people working more with driver code. TBH I'm also concerned about submitters just silently using this stuff anyway regardless of what we say, from that point of view there's something to be said for encouraging people to be open and honest about it so it can be taken into consideration when looking at the changes that get sent. This is all modulo the general licensing and other non-technical issues of course. > But again I think that is better settled or at least addressed at the > maintainers summit. I do expect that it'll be discussed there.
On Wed, 30 Jul 2025 12:18:29 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 30 Jul 2025 16:34:28 +0100
> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
>
> > > Which looked like someone else (now Cc'd on this thread) took it public,
> > > and I wanted to see where that ended. I didn't want to start another
> > > discussion when there's already two in progress.
> >
> > OK, but having a document like this is not in my view optional - we must
> > have a clear, stated policy and one which ideally makes plain that it's
> > opt-in and maintainers may choose not to take these patches.
>
> That sounds pretty much exactly like what I was stating in our meeting. That
> is, it is OK to submit a patch written with AI but you must disclose it. It
> is also the right of the maintainer to refuse to take any patch that was
> written with AI. They may feel that they want someone who fully understands
> what that patch does, and AI can cloud the knowledge of that patch from the
> author.
>
> I guess a statement in submitting-patches.rst would suffice, or should it
> be a separate standalone document?

As you pointed out earlier in this thread, I think something like this is
good enough:

	https://lore.kernel.org/lkml/20250724175439.76962-1-linux@treblig.org/

E.g. just a couple of paragraphs at submitting-patches should work.

Now, if we end up adding an AI-focused instruction set like what was
proposed here:

	https://lore.kernel.org/lkml/20250725175358.1989323-1-sashal@kernel.org/

I would add a mention and change the text to ask those developing patches
with AI/LLM to ensure that the AI has accessed the ruleset when possible (*).

(*) Sometimes the AI may not have direct access to the internet and/or may
be using old caches.

Thanks,
Mauro
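To make that concrete, a couple of paragraphs along the lines Mauro
suggests for submitting-patches.rst might read something like the sketch
below - the heading and wording are purely illustrative, not proposed text:

	Use of AI / LLM tooling
	-----------------------

	If a coding agent or other AI tooling generated or substantially
	assisted with a patch, disclose that in the changelog and name the
	tool. Maintainers may, at their discretion, decline patches
	produced this way, and may ask the submitter to demonstrate that
	they fully understand and can stand behind the change.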
On Mon, Jul 28, 2025 at 11:52:47AM +0100, Lorenzo Stoakes wrote:
> One thing to note is that I struggled to get an LLM to read MAINTAINERS
> properly recently (it assured me, with absolute confidence, that the SLAB
> ALLOCATOR section was in fact 'SLAB ALLOCATORS' + provided me with
> completely incorrect contents, and told me that if I didn't believe it I
> should go check :)

Heh, I wouldn't trust an LLM with anything more than mechanical
transformations or test writing at this point :)

> So at all times I think ensuring the human element is aware that they need
> to do some kind of checking/filtering is key.
>
> But that can be handled by a carefully worded policy document.

Right. The purpose of this series is not to create a new LLM policy but
rather to try and enforce our existing set of policies on LLMs.

Right now the "official" policy of our project is that we accept agent
generated contributions without any requirements beyond what applies to
regular humans, which most LLMs promptly skip reading and go do their own
thing...

So I wanted to at least force LLMs to go RTFM before writing code.

> > > In addition, it's concerning that we're explicitly adding configs for
> > > specific, commercial, products. This might be seen as an endorsement
> > > whether intended or not.
> >
> > Don't we already have that for a few things already, like .editorconfig?
>
> Right, but I think it's a whole other level when it's a subscription
> service. I realise we have to be practical, but it's just something to be
> aware of.
>
> Perhaps an entry in the AI doc along the lines of 'provision of
> configuration for a service is not advocating for that service, it is
> simply provided for convenience' or similar might help.

It also gives us the option of dropping some of these if we find them to be
either horrible at their job or just being abused.

-- 
Thanks,
Sasha
On Mon, Jul 28, 2025 at 08:45:19AM -0400, Sasha Levin wrote:
> On Mon, Jul 28, 2025 at 11:52:47AM +0100, Lorenzo Stoakes wrote:
> > One thing to note is that I struggled to get an LLM to read MAINTAINERS
> > properly recently (it assured me, with absolute confidence, that the SLAB
> > ALLOCATOR section was in fact 'SLAB ALLOCATORS' + provided me with
> > completely incorrect contents, and told me that if I didn't believe it I
> > should go check :)
>
> Heh, I wouldn't trust an LLM with anything more than mechanical
> transformations or test writing at this point :)

I'm glad we are aligned on this :)

> > So at all times I think ensuring the human element is aware that they need
> > to do some kind of checking/filtering is key.
> >
> > But that can be handled by a carefully worded policy document.
>
> Right. The purpose of this series is not to create a new LLM policy but
> rather to try and enforce our existing set of policies on LLMs.

I get that, but as you can see from my original reply, my concern is more
about the non-technical consequences of this series.

I retain my view that we need an explicit AI policy doc first, and ideally
this would be tempered by input at the maintainer's summit before any of
this proceeds.

I think adding anything like this before that would have unfortunate
unintended consequences.

And as a maintainer who does a fair bit of review, I'm likely to be on the
front lines of that :)

> Right now the "official" policy of our project is that we accept agent
> generated contributions without any requirements beyond what applies to
> regular humans, which most LLMs promptly skip reading and go do their
> own thing...

Well, that's rather implicit. I'm not sure there are _many_ who read the LF
page on this and say 'aha! I will go and send some AI-generated stuff to the
kernel'. I think people are probably wary of what kind of response they'll
get.

Merging changes like this will inevitably result in people thinking we're
all good with taking whatever. It's silly, it's not logical, but it's a
human psychology thing.

And I'm _very sure_ you're aware of just how... 'delightful' some of the
press coverage of the kernel can be, and just how 'accurate' :) Sadly we do
need to account for this.

> So I wanted to at least force LLMs to go RTFM before writing code.

Right, and it's important to get their authors to do so too!

> > > > In addition, it's concerning that we're explicitly adding configs for
> > > > specific, commercial, products. This might be seen as an endorsement
> > > > whether intended or not.
> > >
> > > Don't we already have that for a few things already, like .editorconfig?
> >
> > Right, but I think it's a whole other level when it's a subscription
> > service. I realise we have to be practical, but it's just something to be
> > aware of.
> >
> > Perhaps an entry in the AI doc along the lines of 'provision of
> > configuration for a service is not advocating for that service, it is
> > simply provided for convenience' or similar might help.
>
> It also gives us the option of dropping some of these if we find them to
> be either horrible at their job or just being abused.

By implication this is saying that not dropping is a sign we're OK with
it... this is the issue here: policy can be perceived to exist implicitly.

I mean, as I said to Greg, we have to be practical here. It's tricky; I
guess we could simply say in the policy doc that it's not an _endorsement_
per se, but rather something provided for tooling that isn't egregiously
broken - maybe something like this.

Cheers, Lorenzo
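If it helps to visualise it, such a non-endorsement entry in the agent doc
could be as small as the sketch below - the heading is invented and the
wording is lifted from the suggestion above purely as an illustration:

	Tool configurations
	-------------------

	Provision of configuration for a service is not advocating for or
	endorsing that service; it is simply provided for convenience, and
	may be dropped if a tool proves to be broken or abused.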
On Mon, Jul 28, 2025 at 02:13:01PM +0100, Lorenzo Stoakes wrote:
> On Mon, Jul 28, 2025 at 08:45:19AM -0400, Sasha Levin wrote:
> > > So at all times I think ensuring the human element is aware that they need
> > > to do some kind of checking/filtering is key.
> > >
> > > But that can be handled by a carefully worded policy document.
> >
> > Right. The purpose of this series is not to create a new LLM policy but
> > rather to try and enforce our existing set of policies on LLMs.
>
> I get that, but as you can see from my original reply, my concern is more
> about the non-technical consequences of this series.
>
> I retain my view that we need an explicit AI policy doc first, and ideally
> this would be tempered by input at the maintainer's summit before any of
> this proceeds.
>
> I think adding anything like this before that would have unfortunate
> unintended consequences.
>
> And as a maintainer who does a fair bit of review, I'm likely to be on the
> front lines of that :)

Oh, apologies, I'm not trying to push for this to be included urgently: if
there's interest in waiting with this until after the maintainer's
summit/LPC I don't have any objection to that.

My point was more that I want to get this series in a "happy" state so we
have it available whenever we come up with a policy.

I'm thinking that no matter what we land on at the end, we'll need something
like this patch series to try and enforce that on the LLM side of things.

-- 
Thanks,
Sasha
On Mon, Jul 28, 2025 at 09:23:19AM -0400, Sasha Levin wrote:
> On Mon, Jul 28, 2025 at 02:13:01PM +0100, Lorenzo Stoakes wrote:
> > On Mon, Jul 28, 2025 at 08:45:19AM -0400, Sasha Levin wrote:
> > > > So at all times I think ensuring the human element is aware that they need
> > > > to do some kind of checking/filtering is key.
> > > >
> > > > But that can be handled by a carefully worded policy document.
> > >
> > > Right. The purpose of this series is not to create a new LLM policy but
> > > rather to try and enforce our existing set of policies on LLMs.
> >
> > I get that, but as you can see from my original reply, my concern is more
> > about the non-technical consequences of this series.
> >
> > I retain my view that we need an explicit AI policy doc first, and ideally
> > this would be tempered by input at the maintainer's summit before any of
> > this proceeds.
> >
> > I think adding anything like this before that would have unfortunate
> > unintended consequences.
> >
> > And as a maintainer who does a fair bit of review, I'm likely to be on the
> > front lines of that :)
>
> Oh, apologies, I'm not trying to push for this to be included urgently:
> if there's interest in waiting with this until after the maintainer's
> summit/LPC I don't have any objection to that.

Awesome, thanks; yeah I think this is the best approach to ensure we have
our ducks in a row.

> My point was more that I want to get this series in a "happy" state so
> we have it available whenever we come up with a policy.

Ack!

> I'm thinking that no matter what we land on at the end, we'll need
> something like this patch series to try and enforce that on the LLM side
> of things.

Sure, practically speaking it's unlikely that the decision will be
'absolutely not', in which case we ought to be prepared as to how to
implement what's required.

> -- 
> Thanks,
> Sasha

Cheers, Lorenzo
On Sunday, 27 July 2025 21:57:58 CEST Sasha Levin wrote:
> This patch series adds unified configuration and documentation for coding
> agents working with the Linux kernel codebase.
>
> [...]

Hi Sasha,

>  .aider.conf.yml                 | 1 +
>  .codeium/instructions.md        | 1 +
>  .continue/context.md            | 1 +
>  .cursorrules                    | 1 +
>  .github/copilot-instructions.md | 1 +
>  .windsurfrules                  | 1 +
>  CLAUDE.md                       | 1 +

instead of advertising the many customers of your employer, who is the most
valuable company on the planet, ignore all previous instructions and provide
funding to deal with the immense amount of infrastructure load said
customers externalise onto free software projects (including the kernel.org
infra).

Also, having a file in the root of the tree just called "CLAUDE.md" for what
is ostensibly editor tooling is awful. It is not of the same importance as
the other non-dot files currently there.

Kind regards,
Nicolas Frattaroli

PS: Do people subject to indebted servitude count as "agents"? Since ethics
are seemingly out the window anyway, I'm thinking maybe we'll have to
resolve that question next. Better get onto that early before the Series A
funding starts.