Conversation

@alejandro-colomar
Collaborator

alejandro-colomar commented Dec 26, 2025

This policy has been copied verbatim from the Linux man-pages project, which itself derived it from (and is more restrictive than) Gentoo.

Cc: @ikerexxe
Cc: @hallyn

Link: https://arstechnica.com/ai/2025/07/study-finds-ai-tools-made-open-source-software-developers-19-percent-slower/
Link: https://petri.com/ai-coding-tools-rising-software-defects/
Link: https://ia.acs.org.au/article/2024/ai-coding-tools-may-produce-worse-software-.html
Link: https://carbonate.dev/blog/posts/the-ai-code-quality-crisis


Revisions:

v2
  • Allow minimal uses of AI.
$ git rd 
1:  7d360348a ! 1:  dafd0c440 doc/contributions/: Add guidelines banning AI for contributing
    @@ Metadata
     Author: Alejandro Colomar <[email protected]>
     
      ## Commit message ##
    -    doc/contributions/: Add guidelines banning AI for contributing
    +    doc/contributions/: Add guidelines severely restricting use of AI for contributing
     
    -    This policy has been copied verbatim from the Linux man-pages project,
    -    which itself derived it from (and is more restrictive than) Gentoo.
    +    This policy has been derived from Gentoo.  I added a requirement that
    +    use of AI is disclosed.  And changes resulting from said use should be
    +    disclosed in detail.  Also, I left a note saying we'll reject
    +    non-negligible use of AI, which is a bit of an escape allowing us to
    +    just say "too much".
     
         Cc: Iker Pedrosa <[email protected]>
         Cc: Serge Hallyn <[email protected]>
    @@ doc/contributions/ai.txt (new)
     +
     +Description
     +  It is expressly forbidden to contribute to this project any
    -+  content that has been created or derived with the assistance of
    ++  content that has been created with the assistance of
     +  AI tools.
     +
    -+  This includes AI assistive tools used in the contributing
    -+  process, even if such tools do not directly generate the
    -+  contributed code but are used to derive the contribution.  For
    -+  example, AI linters, AI static analyzers, and AI tools that
    -+  summarize input are forbidden.
    ++  The use of AI tools should be minimal, limited to reviewing
    ++  a contribution once it's almost finished.  Use of AI tools in
    ++  the process of a contribution should be disclosed explicitly and
    ++  clearly.  If such use of AI tools resulted in changing the
    ++  contribution in any way --for example, if an AI-based static
    ++  analyzer detected a bug, which was fixed before sending the
    ++  contribution--, this should be disclosed explicitly with more
    ++  detail.
     +
     +    Exceptions
     +  As an exception to the above, AI assistive tools which don't
    @@ doc/contributions/ai.txt (new)
     +  a tool that does not pose copyright, quality, or ethical
     +  concerns.
     +
    -+Copyright
    -+  Text copied from the Linux man-pages project
    -+  <https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/tree/CONTRIBUTING.d/ai>
     ++  We will reject patches where the use of AI is not anecdotal.
     +
    ++Copyright
     +  Text derived from --but different than-- the Gentoo project
     +  AI policy
     +  <https://wiki.gentoo.org/wiki/Project:Council/AI_policy>.
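
[Note: "git rd" appears to be a local alias, presumably for git-range-diff(1), which compares the old and new versions of a patch series after a rewrite.  A minimal sketch of such an alias, assuming that is all it wraps (the real alias may bake in default arguments, since it is invoked here with none):

    $ git config --global alias.rd range-diff
    $ git rd @{1}...@    # previous tip vs current tip of the branch
]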
v2b
  • Rebase
  • Add links
$ git rd 
1:  dafd0c440 ! 1:  4322771d7 doc/contributions/: Add guidelines severely restricting use of AI for contributing
    @@ Commit message
         non-negligible use of AI, which is a bit of an escape allowing us to
         just say "too much".
     
    +    Link: <https://arstechnica.com/ai/2025/07/study-finds-ai-tools-made-open-source-software-developers-19-percent-slower/>
    +    Link: <https://petri.com/ai-coding-tools-rising-software-defects/>
    +    Link: <https://ia.acs.org.au/article/2024/ai-coding-tools-may-produce-worse-software-.html>
    +    Link: <https://carbonate.dev/blog/posts/the-ai-code-quality-crisis>
         Cc: Iker Pedrosa <[email protected]>
         Cc: Serge Hallyn <[email protected]>
         Signed-off-by: Alejandro Colomar <[email protected]>

@ikerexxe
Collaborator

I would like to share a different perspective regarding the proposed AI policy.

While I understand the concerns regarding code quality, copyright, and the maintenance burden, I believe an outright ban is counterproductive. A total prohibition often leads to "shadow usage". If contributors find these tools useful for understanding complex logic or structuring contributions, a ban will not necessarily stop the use of AI; it will simply discourage transparency. This makes it impossible for maintainers to have an open dialogue with contributors about how to verify and audit AI-assisted work.

I believe a more constructive approach would be to follow a model similar to the Fedora Project’s policy on AI-assisted contributions. Their approach focuses on ensuring that the contributor remains legally and technically responsible for the work, and that the code meets licensing requirements and quality standards, regardless of the tools used to create it.

Instead of a complete ban, the project could consider a policy based on strict accountability and disclosure. This allows the project to maintain its high standards and legal safety without banning tools that, when used responsibly, can improve a developer's workflow.

@alejandro-colomar
Collaborator Author

alejandro-colomar commented Dec 29, 2025

> I would like to share a different perspective regarding the proposed AI policy.
>
> While I understand the concerns regarding code quality, copyright, and the maintenance burden, I believe an outright ban is counterproductive.
>
> A total prohibition often leads to "shadow usage".

This is a theoretical concern, but after some months of such a policy in the Linux man-pages project --where we also discussed the possibility of undisclosed usage--, I haven't noticed any such undisclosed usage of AI tools.

What I've experienced there is that contributors do actually read the contributing guidelines, and often try to comply with them.

I've received contributions where AI had been used, but they all fall into one of two categories:

  • The contributor is unaware of the policy, but is otherwise competent. Such contributions disclose the use of AI. I told them that the contribution was unacceptable due to policy, and that if they wanted to contribute such a change, they should start again from scratch.

    The quality of those contributions was actually quite bad --by my admittedly high standards--, but the conversation with the contributors was honest. They have the opportunity to contribute, if they stop using AI. Those programmers have a reputation, and they won't risk it by lying to maintainers.

    As with any other form of dishonesty, I warned on the mailing list that any dishonest use of AI, such as using it without disclosing it (if the contributor is aware of the policy), will result in the contributor being busted and banned from contributing ever again.

  • The second category of contributions is essentially spam: patches that make no sense, and which their authors can't even defend in discussion.

In any case, at least currently, it is relatively easy to spot when a patch has been produced with the help of AI, so undisclosed usage should be easy to catch. That might change in the future, but we're years away from that.

> If contributors find these tools useful for understanding complex logic or structuring contributions,

They aren't. These tools produce a false sense of understanding, which can be very dangerous, especially in a project like this one. Banning the tools has the beneficial side effect that even if a contributor has the false sense of understanding, we're not affected by it.

> a ban will not necessarily stop the use of AI; it will simply discourage transparency.

Regular contributors shouldn't have any incentive to use it, as they can produce high-quality contributions by hand, and probably faster. Reputation should matter enough to contributors that I don't think they'll risk it by lying to maintainers.

Passer-by contributors, especially when using pseudonyms, might be more tempted to do that. However, we should be very careful about those anyway, as Jia Tan might also want to contribute, and will of course not be honest. One-time contributors already have enough incentives to include backdoors that I don't think this one would add significantly to the risk.

> This makes it impossible for maintainers to have an open dialogue with contributors about how to verify and audit AI-assisted work.
>
> I believe a more constructive approach would be to follow a model similar to the Fedora Project’s policy on AI-assisted contributions. Their approach focuses on ensuring that the contributor remains legally and technically responsible for the work, and that the code meets licensing requirements and quality standards, regardless of the tools used to create it.

It was mentioned in the discussion in the Linux man-pages project. However, any contributions that have used AI will be dangerous (they might introduce dangerous bugs, which are harder to spot than bugs introduced by human mistakes). For that reason, I'm strongly opposed to a policy that allows AI-based contributions in any way. The Fedora policy is one of the worst I've seen, IMO.

The Gentoo policy, which (allegedly) allows AI-based review tools, but not AI-generated code, nor the use of AI to understand the existing code, would be better. It is quite close to this one. However, I think even that isn't enough, as AI tools for review (e.g., static analyzers) can induce the human to make mistakes that are hard to spot, and result in the introduction of vulnerabilities. For that reason, we added a clause in the policy of the Linux man-pages project prohibiting that too.

We should take no unnecessary risks.

> Instead of a complete ban, the project could consider a policy based on strict accountability and disclosure. This allows the project to maintain its high standards and legal safety without banning tools that, when used responsibly, can improve a developer's workflow.

I don't think high standards are compatible at all with AI tools. There's no responsible use of such tools that improves code quality, or at least none has been demonstrated. Research actually points in the other direction (I've now added links in the first message). There's too much risk.

@alejandro-colomar
Collaborator Author

alejandro-colomar commented Dec 30, 2025

> It was mentioned in the discussion in the Linux man-pages project. However, any contributions that have used AI will be dangerous (they might introduce dangerous bugs, which are harder to spot than bugs introduced by human mistakes). For that reason, I'm strongly opposed to a policy that allows AI-based contributions in any way. The Fedora policy is one of the worst I've seen, IMO.

To be fair, I've seen a worse policy than that of Fedora: while the kernel still has no formal policy, some parties seem to be pushing for undisclosed use of AI, as if it were the Wild West.

I think the quality of kernel code might very well decrease as a consequence in the near future. We'll see. I keep an eye on the development of GNU Hurd, just in case I need to switch kernels. :)

@hallyn
Member

hallyn commented Dec 30, 2025

> It was mentioned in the discussion in the Linux man-pages project. However, any contributions that have used AI will be dangerous (they might introduce dangerous bugs, which are harder to spot than bugs introduced by human mistakes).

FWIW, from what I've seen this is not something we need to worry about. The only way AI-generated mistakes would be dangerous / hard to spot is if the person looking is inexperienced.

I'm seeing AI do quite a good job at spotting bugs that experienced reviewers miss. In fact, I would say that we are at the point where it might be deemed irresponsible not to at least have AI review patches before submitting a PR.

If we were to enact a full ban, the main reasons would be to take a stand on the copyright and commons abuses by the creators of the AI, and the environmental impact. That can be a tough set of issues to balance: if someone breaks into some defense or environmental or nuclear site and can cause an ecological disaster because of a bug that slipped in, which AI would have caught, that's not going to be comforting.

I would like to take the stand you're advocating, but from what I've seen, any energy that isn't used to review shadow patches will just be used to have AI write and perform a glam rock version of "Row, Row, Row Your Boat".

I'm ok (for now) with forbidding AI-generated submissions. That's in part because this is supposed to be a venerable, stable, legacy project. If it's going to be moving fast enough for people to want to generate huge patches with AI, then I'm stepping down :)

But as for patches written by a human and reviewed by AI, I feel like it's in everyone's best interest to not only accept, but encourage that.

@alejandro-colomar
Collaborator Author

alejandro-colomar commented Dec 30, 2025

>> It was mentioned in the discussion in the Linux man-pages project. However, any contributions that have used AI will be dangerous (they might introduce dangerous bugs, which are harder to spot than bugs introduced by human mistakes).

> FWIW, from what I've seen this is not something we need to worry about. The only way AI-generated mistakes would be dangerous / hard to spot is if the person looking is inexperienced.

I'm going to doubt this. I myself feel less qualified to review AI mistakes than human mistakes.

> I'm seeing AI do quite a good job at spotting bugs that experienced reviewers miss. In fact, I would say that we are at the point where it might be deemed irresponsible not to at least have AI review patches before submitting a PR.

I'm going to doubt this too. It has a false-positive rate (and a false-negative one too) that is unacceptable (any compiler diagnostic with a similar rate of false positives would certainly be turned off). Any valid reports are buried so deep that I don't find them useful. Just look at CodeQL; I think we've only received one real report from it in this project.
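
[For context: CodeQL is GitHub's static-analysis engine. A minimal sketch of running it by hand on a C codebase, assuming the codeql CLI is installed; the database name is illustrative:

    $ codeql database create shadow.db --language=cpp --command=make
    $ codeql database analyze shadow.db codeql/cpp-queries \
          --format=sarif-latest --output=results.sarif
]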

> If we were to enact a full ban, the main reasons would be to take a stand on the copyright and commons abuses by the creators of the AI, and the environmental impact. That can be a tough set of issues to balance: if someone breaks into some defense or environmental or nuclear site and can cause an ecological disaster because of a bug that slipped in, which AI would have caught, that's not going to be comforting.

> I would like to take the stand you're advocating, but from what I've seen, any energy that isn't used to review shadow patches will just be used to have AI write and perform a glam rock version of "Row, Row, Row Your Boat".

> I'm ok (for now) with forbidding AI-generated submissions. That's in part because this is supposed to be a venerable, stable, legacy project. If it's going to be moving fast enough for people to want to generate huge patches with AI, then I'm stepping down :)

That sounds reasonable. Even if I doubt the first two claims, I'm happy to ban use of AI in the production of a patch, if I have to bite the bullet and accept that AI will be used for reviewing.

That sounds like Gentoo's AI policy (at least as interpreted by Gentoo maintainers). Should we copy that?

I have a disagreement with the Gentoo maintainers over the interpretation of their text (IMO, it would also disallow AI review). Maybe I could add a paragraph explicitly allowing it. And I'd also like to add text requiring a very explicit disclosure of use, including the detailing of any changes to the patch that have resulted from following AI suggestions/reports, as that's the code I'm going to be most picky about.

> But as for patches written by a human and reviewed by AI, I feel like it's in everyone's best interest to not only accept, but encourage that.

@ikerexxe
Collaborator

> I'm going to doubt this too. It has a false-positive rate (and a false-negative one too) that is unacceptable (any compiler diagnostic with a similar rate of false positives would certainly be turned off). Any valid reports are buried so deep that I don't find them useful. Just look at CodeQL; I think we've only received one real report from it in this project.

No, not really. The first time I ran CodeQL, it reported several concerning issues.

>> If we were to enact a full ban, the main reasons would be to take a stand on the copyright and commons abuses by the creators of the AI, and the environmental impact. That can be a tough set of issues to balance: if someone breaks into some defense or environmental or nuclear site and can cause an ecological disaster because of a bug that slipped in, which AI would have caught, that's not going to be comforting.
>>
>> I would like to take the stand you're advocating, but from what I've seen, any energy that isn't used to review shadow patches will just be used to have AI write and perform a glam rock version of "Row, Row, Row Your Boat".
>>
>> I'm ok (for now) with forbidding AI-generated submissions. That's in part because this is supposed to be a venerable, stable, legacy project. If it's going to be moving fast enough for people to want to generate huge patches with AI, then I'm stepping down :)

> That sounds reasonable. Even if I doubt the first two claims, I'm happy to ban use of AI in the production of a patch, if I have to bite the bullet and accept that AI will be used for reviewing.

This seems like a good middle ground. It will allow us to evaluate the usefulness of these types of tools in this project, while continuing to reject anything that is not valuable.

… contributing

This policy has been derived from Gentoo.  I added a requirement that
use of AI is disclosed.  And changes resulting from said use should be
disclosed in detail.  Also, I left a note saying we'll reject
non-negligible use of AI, which is a bit of an escape allowing us to
just say "too much".

Link: <https://arstechnica.com/ai/2025/07/study-finds-ai-tools-made-open-source-software-developers-19-percent-slower/>
Link: <https://petri.com/ai-coding-tools-rising-software-defects/>
Link: <https://ia.acs.org.au/article/2024/ai-coding-tools-may-produce-worse-software-.html>
Link: <https://carbonate.dev/blog/posts/the-ai-code-quality-crisis>
Cc: Iker Pedrosa <[email protected]>
Cc: Serge Hallyn <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
@alejandro-colomar
Collaborator Author

I've changed the text to allow AI for review. Any use must be disclosed, and any changes to the contribution resulting from the use of AI need explicit disclosure too. And there's a clause warning that non-minimal use of AI will be rejected.
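
[The policy text does not prescribe a disclosure format. A hypothetical sketch of what such a disclosure might look like in a commit message; the trailer name and the details are illustrative, not mandated:

    AI-disclosure: ran an AI-based static analyzer on v1 of this
        patch; following its report, fixed an off-by-one in a loop
        bound.  No part of the patch was generated by AI.
]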

@ikerexxe
Collaborator

ikerexxe left a comment


This seems more in line with what we have discussed in this thread.

Out of curiosity, what are your thoughts on the use of open source AI? Do you think that if there were ever a usable generative open source AI, we could use it?

@alejandro-colomar
Collaborator Author

alejandro-colomar commented Jan 7, 2026

> This seems more in line with what we have discussed in this thread.
>
> Out of curiosity, what are your thoughts on the use of open source AI? Do you think that if there were ever a usable generative open source AI, we could use it?

I would personally never use AI, open source or not.

I don't think we'll have any AIs that will have the quality that I would expect from a programmer. I also don't think we'll have any AI that will be free of the ethical concerns stated in the document.

Things have to change a lot for me to change my mind, and I think such changes won't happen in this century, if ever.
