doc/contributions/: Add guidelines banning AI for contributing #1453
Conversation
I would like to share a different perspective regarding the proposed AI policy. While I understand the concerns regarding code quality, copyright, and the maintenance burden, I believe an outright ban is counterproductive.

A total prohibition often leads to "shadow usage". If contributors find these tools useful for understanding complex logic or structuring contributions, a ban will not necessarily stop the use of AI; it will simply discourage transparency. This makes it impossible for maintainers to have an open dialogue with contributors about how to verify and audit AI-assisted work.

I believe a more constructive approach would be to follow a model similar to the Fedora Project's policy on AI-assisted contributions. Their approach focuses on ensuring that the contributor remains legally and technically responsible for the work, and that the code meets licensing requirements and quality standards, regardless of the tools used to create it.

Instead of a complete ban, the project could consider a policy based on strict accountability and disclosure. This allows the project to maintain its high standards and legal safety without banning tools that, when used responsibly, can improve a developer's workflow.
This is a theoretical concern, but after some months of such a policy in the Linux man-pages project --where we also discussed the possibility of undisclosed usage-- I haven't noticed any such undisclosed usage of AI tools. What I've experienced there is that contributors do actually read the contributing guidelines, and often try to comply with them. I've received contributions where AI had been used, but they all fell into one of two categories:
In any case, at least currently, it is relatively easy to spot when a patch has been produced with the help of AI. That might change in the future, but we're years away from that. So, undisclosed usage of AI tools should be easy to spot.
They aren't. These tools produce a false sense of understanding, which can be very dangerous, especially in a project like this one. Banning the tools has the beneficial side effect that even if a contributor has the false sense of understanding, we're not affected by it.
The regular contributors shouldn't have any incentive to use it, as they can produce high-quality contributions by hand, and probably faster. Reputation should matter enough to them that I don't think they'll risk it by lying to maintainers. The passer-by contributors, especially when using pseudonyms, might be more tempted to do that. However, we should be very careful about those anyway, as Jia Tan might also want to contribute, and will of course not be honest. There are already enough incentives for one-time contributors to try to include backdoors that I don't think this one would be significant.
It was mentioned in the discussion in the Linux man-pages project. However, any contributions that have used AI will be dangerous: they might introduce bugs that are harder to spot than those introduced by human mistakes. For that reason, I'm strongly opposed to a policy that allows AI-based contributions in any way. The Fedora policy is one of the worst I've seen, IMO. The Gentoo policy, which (allegedly) allows AI-based review tools, but not AI-generated code, nor the use of AI to understand the existing code, would be better; it is quite close to this one. However, I think even that isn't enough, as AI review tools (e.g., static analyzers) can induce a human to make mistakes that are hard to spot, and result in the introduction of vulnerabilities. For that reason, we added a clause in the policy of the Linux man-pages project prohibiting that too. We should take no unnecessary risks.
I don't think high standards are compatible at all with AI tools. There's no responsible use of such tools that improves code quality, or at least none has been demonstrated. Research actually points in the other direction (I've now added links in the first message). There's too much risk.
To be fair, I've seen a worse policy than that of Fedora: while the kernel still has no formal policy, some parties seem to be pushing for undisclosed use of AI, as if it were the Wild West. I think the quality of kernel code might very well decrease as a consequence in the near future. We'll see. I keep an eye on the development of GNU Hurd, just in case I need to switch kernels. :)
FWIW, from what I've seen, this is not something we need to worry about. I'm seeing AI do quite a good job at spotting bugs that experienced reviewers miss. If we were to enact a full ban, the main reason would be to take a stand. I would like to take the stand you're advocating, but from what I've seen, I'm ok (for now) with forbidding AI-generated submissions. But patches written by a human and reviewed by AI, I feel like it's in our interest to allow.
I'm going to doubt this. I myself feel less qualified to review AI mistakes than human mistakes.
I'm going to doubt this too. These tools have a false-positive rate (and a false-negative rate too) that is unacceptable; any compiler diagnostic with a similar rate of false positives would certainly be turned off. Any valid findings are buried so deep that I don't find them useful. Just look at CodeQL: I think we've only received one real report from it in this project.
That sounds reasonable. Even if I doubt the first two claims, I'm happy to ban the use of AI in the production of a patch, if I have to bite the bullet and accept that AI will be used for reviewing. That sounds like Gentoo's AI policy (at least as interpreted by Gentoo maintainers). Should we copy that? I have a disagreement with the Gentoo maintainers over the interpretation of their text (IMO, it would also disallow AI review), so maybe I could add a paragraph explicitly allowing it. I'd also like to add text requiring a very explicit disclosure of use, including details of any changes to the patch that resulted from following AI suggestions/reports, as that's the code I'm going to be most picky about.
No, not really. The first time I ran CodeQL, it reported several concerning issues.
This seems like a good middle ground. It will allow us to evaluate the usefulness of these types of tools in this project, while continuing to reject anything that is not valuable.
Force-pushed from 7d36034 to dafd0c4:

doc/contributions/: Add guidelines banning AI for contributing

This policy has been derived from Gentoo. I added a requirement that use of AI is disclosed. And changes resulting from said use should be disclosed in detail. Also, I left a note saying we'll reject non-negligible use of AI, which is a bit of an escape allowing us to just say "too much".

Link: <https://arstechnica.com/ai/2025/07/study-finds-ai-tools-made-open-source-software-developers-19-percent-slower/>
Link: <https://petri.com/ai-coding-tools-rising-software-defects/>
Link: <https://ia.acs.org.au/article/2024/ai-coding-tools-may-produce-worse-software-.html>
Link: <https://carbonate.dev/blog/posts/the-ai-code-quality-crisis>
Cc: Iker Pedrosa <[email protected]>
Cc: Serge Hallyn <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
Force-pushed from dafd0c4 to 4322771.
I've changed the text to allow AI for review. Any use must be disclosed, and any changes to the contribution resulting from use of AI need to be explicitly disclosed too. And there's a clause warning that non-minimal use of AI will be rejected.
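For illustration only, a minimal sketch of what such a disclosure might look like in a commit message; the trailer names here are hypothetical, since the policy text doesn't mandate a specific format:

    subsystem: Short description of the change

    AI-disclosure: An LLM-based review tool was run on this patch.
    AI-changes: Following its report, a missing bounds check was
        added in the second hunk.

Putting the disclosure in commit-message trailers would keep it greppable, in the same spirit as Signed-off-by.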
ikerexxe left a comment:
This seems more in line with what we have discussed in this thread.
Out of curiosity, what are your thoughts on the use of open source AI? Do you think that if there were ever a usable generative open source AI, we could use it?
I would personally never use AI, open source or not. I don't think we'll have any AI that has the quality I would expect from a programmer. I also don't think we'll have any AI that is free of the ethical concerns stated in the document. Things would have to change a lot for me to change my mind, and I don't think such changes will happen in this century, if ever.
This policy has been copied verbatim from the Linux man-pages project, which itself derived it from (and is more restrictive than) Gentoo.
Cc: @ikerexxe
Cc: @hallyn
Link: https://arstechnica.com/ai/2025/07/study-finds-ai-tools-made-open-source-software-developers-19-percent-slower/
Link: https://petri.com/ai-coding-tools-rising-software-defects/
Link: https://ia.acs.org.au/article/2024/ai-coding-tools-may-produce-worse-software-.html
Link: https://carbonate.dev/blog/posts/the-ai-code-quality-crisis
Revisions:
v2
v2b