mr-wendel 2 hours ago [-]
My two cents: I've been coding practically my entire life, but a few years back I sustained a pretty significant and lasting injury to my wrists. As such, I have very little tolerance for typing. It's been quite a problem and made full time work impossible.
With the advent of LLMs, AI-autocomplete, and agent-based development workflows, my ability to deliver reliable, high-quality code is restored and (arguably) better. Personally, I love the "hallucinations" as they help me fine-tune my prompts, base instructions, and reinforce intentionality; e.g. is that >really< the right solution/suggestion to accept? It's like pair programming without a battle of ego.
When analyzing problems, I think you have to look at both upsides and downsides. Folks have done well to debate the many, many downsides of AI and this tends to dominate the conversation. Probably that's a good thing.
But, on the flip side, I personally advocate hard for AI from the point-of-view on accessibility. I know (more-or-less) exactly what output I'm aiming for and control that obsessively, but it's AI and my voice at the helm instead of my fingertips.
I also think it incorrect to look at it from a perspective of "does the good outweigh the bad?". Relevant, yes, but utilitarian arguments often lead to counter-intuitive results and end up amplifying the problems they seek to solve.
I'd MUCH rather see a holistic embrace and integration of these tools into our ecosystems. Telling people "no AI!" (even if very well defined on what that means) is toothless against people with little regard for making the world (or just one specific repo) a better place.
moduspol 1 hours ago [-]
> But, on the flip side, I personally advocate hard for AI from the point-of-view on accessibility. I know (more-or-less) exactly what output I'm aiming for and control that obsessively, but it's AI and my voice at the helm instead of my fingertips.
This is the technique I've picked up and got the most from over the past few months. I don't give it hard, high-level problems and then review a giant set of changes to figure it out. I give it the technical solution I was already going to implement anyway, and then have it generate the code I otherwise would have written.
It cuts back dramatically on the review fatigue because I already know exactly what I'm expecting to see, so my reviews are primarily focused on the deviations from that.
distances 31 minutes ago [-]
This, and I curate a tree of MD docs per topic to define the expected structure. It is supposed to output code that looks exactly like my code. If not, I manually edit it and perhaps update the docs.
This is how I've found myself to be productive with the tools, or since productivity is hard to measure, at least it's still a fun way to work. I do not need to type everything but I want a very exact outcome nonetheless.
ok_dad 30 minutes ago [-]
The only issue to bear in mind is that visual inspection is only about 85% accurate at its limit. I was responsible for incoming inspection at a medical device factory and visual inspection was the least reliable test for components that couldn’t be inspected for anything else. We always preferred to use machines (like a big CMM) where possible.
I also use LLM assistance, and I love it because it helps my ADHD brain get stuff done, but I definitely miss stuff that I wouldn’t miss by myself. It’s usually fairly simple mistakes to fix later but I still miss them initially.
I’ve been having luck with LLM reviewers though.
VorpalWay 21 minutes ago [-]
I'm in a very similar situation: I have RSI and smarter-autocomplete style AI is a godsend. Unlike you I haven't found more complex AI (agent mode) particularly useful though for what I do (hard realtime C++ and Rust). So I avoid that. Plus it takes away the fun part of coding for me. (The journey matters more than the destination.)
The accessibility angle is really important here. What we need is a way to stop people who make contributions they don't understand and/or cannot vouch that they authored (the license question is still very murky, and no, what the US Supreme Court said doesn't matter here in the EU). This is difficult though.
BeetleB 18 minutes ago [-]
Similar story, albeit not so extreme. I have similar ergonomic issues that crop up from time to time. My programming is not so impacted (spend more time thinking than typing, etc), but things like email, documentation, etc can be brutal (a lot more computer usage vs programming).
My simple solution: I use Whisper to transcribe my text, and feed the output to an LLM for cleanup (custom prompt). It's fantastic. Way better than stuff like Dragon. Now I get frustrated with transcribing using Google's default mechanism on Android - so inaccurate!
But the ability to take notes, dictate emails, etc using Whisper + LLM is invaluable. I likely would refuse to work for a company that won't let me put IP into an LLM.
Similarly, I take a lot of notes on paper, and would have to type them up. Tedious and painful. I switched to reading my notes aloud and use the above system to transcribe. Still painful. I recently realized Gemini will do a great job just reading my notes. So now I simply convert my notes to a photo and send to Gemini.
I categorize all my expenses. I have receipts from grocery stores where I highlight items into categories. You can imagine it's painful to enter that into a financial SW. I'm going to play with getting Gemini to look at the photo of the receipt and categorize and add up the categories for me.
All of these are cool applications on their own, but when you realize they're also improving your health ... clear win.
why_at 34 minutes ago [-]
>Personally, I love the "hallucinations" as they help me fine-tune my prompts, base instructions, and reinforce intentionality
This reads almost like satire of an AI power user. Why would you like it when an LLM makes things up? Because you get to write more prompts? Wouldn't it be better if it just didn't do that?
It's like saying "I love getting stuck in traffic because I get to drive longer!"
Sorry but that one sentence really stuck out to me
walthamstow 13 minutes ago [-]
You've worked with people before, haven't you? Sometimes they make stuff up, or misremember stuff. Sometimes people who do this are brilliant and you end up learning a lot from them.
mr-wendel 15 minutes ago [-]
I appreciate the feedback.
I like it because I have no expectation of perfection-- out of others, myself, and especially not AI. I expect "good enough" and work upwards from there, and with (most) things, I find AI to be better than good enough.
QuercusMax 30 minutes ago [-]
A few years ago I was in a place where I couldn't type on a computer keyboard for more than a few minutes without significant pain, and I fortunately had shifted into a role where I could oversee a bunch of junior engineers mostly via text chat (phone keyboard didn't hurt my hands as much) and occasional video/voice chat.
I'm much better now after tons of rehab work (no surgery, thankfully), but I don't have the stamina to type as much as I used to. I was always a heavy IDE user and a very fast coder, but I've moved platforms too many times and lost my muscle memory. A year ago I found the AI tools to be basically time-wasters, but now I can be as productive as before without incurring significant pain.
Joel_Mckay 23 minutes ago [-]
The premise LLM are "AI" is false, but are good at problems like context search, and isomorphic plagiarism.
Relying on public and chat users' markdown data to sell to other users without compensation carries liabilities and raises a number of issues:
1. Copyright: LLM generated content can't be assigned copyright (USA), and thus may contaminate licensing agreements. It is likely public-domain, but also may conflict with GPL/LGPL when stolen IP bleeds through weak obfuscation. The risk has zero precedent cases so far (the Disney case slightly differs), but is likely a legal liability waiting to surface eventually.
2. Workmanship: All software is terrible, but some of it is useful. People that don't care about black-box obfuscated generated content, are also a maintenance and security liability. Seriously, folks should just retire if they can't be arsed to improve readable source tree structure.
3. Repeatability: As the models started consuming other LLM content, the behavioral vectors often also change the content output. Humans know when they don't know something, but an LLM will inject utter random nonsense every time. More importantly, the energy cost to get that error rate lower balloons exponentially.
4. Psychology: People do not think critically when something seems right 80% of the time. The LLM accuracy depends mostly on stealing content, but it stops working when there is nothing left to commit theft of service on. The web is now >53% slop and growing. Only the human user chat data is worth stealing now.
5. Manipulation: The frequency of bad bots AstroTurf forums with poisoned discourse is biasing the delusional. Some react emotionally instead of engaging the community in good faith, or shill hard for their cult of choice.
6. Sustainability: FOSS like all ecosystems is vulnerable to peer review exhaustion like the recent xz CVE fiasco. The LLM hidden hostile agent problem is currently impossible to solve, and thus cannot be trusted in hostile environments.
7. Ethics: Every LLM ruined town economic simulations, nuked humanity 94% of the time in every war game, and encouraged the delusional to kill IRL
While I am all for assistive technologies like better voice recognition, TTS, and individual computer-user interfaces, most will draw a line at slop code, and branch to a less chaotic source tree to work on.
I think it is hilarious some LLM proponents immediately assume everyone also has no clue how these models are implemented. =3
This is a bit of a straw man. The harms of AI in OSS are not from people needing accessibility tooling.
mr-wendel 33 minutes ago [-]
I disagree. I've done nothing to argue that the harm isn't real, downplayed it, nor misrepresented it.
I do agree that at large, the theoretical upsides of accessibility are almost certainly completely overshadowed by obvious downsides of AI. At least, for now anyway. Accessibility is a single instance of the general argument that "of course there are major upsides to using AI", and there's a good chance the future only gets brighter.
My point, essentially, is that I think this is (yet another) area in life where you can't solve the problem by saying "don't do it", and enforcing it is cost-prohibitive. Saying "no AI!" isn't going to stop PR spam. It's not going to stop slop code. What is it going to stop (see edit)? "Bad" people won't care, and "good" people (who use/depend-on AI) will contribute less.
Thus I think we need to focus on developing robust systems around integrating AI. Certainly I'd love to see people adopt responsible disclosure policies as a starting point.
--
[edit] -- To answer some of my own question, there are obvious legal concerns that frequently come up. I have my opinions, but as in many legal matters, especially around IP, the water is murky, opinions are strongly held at both extremes, and all too often having to fight a legal battle at all is immediately a loss regardless of outcome.
johnnyanmac 2 minutes ago [-]
[delayed]
DonsDiscountGas 34 minutes ago [-]
It's absolutely not a straw man, because OP and people like OP will be affected by any policy which limits or bans LLMs. Whether or not the policy writer intended it. So he deserves a voice.
johnnyanmac 1 minutes ago [-]
[delayed]
glenstein 1 hours ago [-]
Fantastic point. I do think there was a bit of an over correction toward AI hostility because capitalism, and for good reason, but it did almost make it taboo to talk about legitimate use cases that are not related to bad AI use cases like instigating nuclear wars in war game simulations.
I think the ugly unspoken truth whether Mozilla or Debian or someone else, is that there are going to be plausible and valuable use cases and that AI as a paradigm is going to be a hard problem the same way that presiding over, say, a justice system is a hard problem (stay with me). What I mean is it can have a legitimate purpose but be prone to abuse and it's a matter of building in institutional safeguards and winning people's trust while never fully being able to eliminate risk.
It's easy for someone to roll their eyes at the idea that there's utility, but accessibility is a perfect and clear-eyed use case, one that makes it harder to simply default to hedonic skepticism against any and all AI applications. I actually think it could have huge implications for leveling the playing field in the browser wars for my particular pet issue.
vladms 2 hours ago [-]
Very reasonable stance. I see reviewing and accepting a PR is a question of trust - you trust the submitter to have done the most he can for the PR to be correct and useful.
Something might be required now as some people might think that just asking an LLM is "the most he can do", but it's not about using AI, it's about being aware and responsible about using it.
rustyhancock 2 hours ago [-]
Important though we generally assume few bad actors.
But like the XZ attack, we kind of have to assume that advanced persistent threats are a reality for FOSS too.
I can envisage a Sybil attack where several seemingly disparate contributors are actually one actor building a backdoor.
Right now we have a disparity in that many contributors can use LLMs but the receiving projects aren't able to review them as effectively with LLMs.
LLM generated content often (perhaps by definition) seems acceptable to LLMs. This is the critical issue.
If we had means of effectively assessing PRs objectively that would make this moot.
I wonder if this is a whole new class of issue. Is judging a PR harder than making one? It seems so right now.
vladms 2 hours ago [-]
> Is judging a PR harder than making one?
Depends on the assumptions. If you assume good intent of the submitter and you spend time to explain what he should improve, why something is not good, etc, then it's a lot of effort. If you assume bad intent, you can just reject with something like "too large review from unproven user, please contribute something smaller first".
Yes, we might need to take things a bit slower, and build relations to the people you collaborate with in order to have some trust (this can also be attacked, but this was already possible).
PowerfulWizard 26 minutes ago [-]
On judging vs. making, also someone has to take time away from development to do code review. If the code being reviewed is written by someone who is involved and interested then at least there's a benefit to training and consensus building in discussing the code and the project in the review phase. The time and energy of developers who are qualified to review is quite possibly the bottleneck on development speed too so wasting review time will slow down development.
For AI generated code, if previous PRs aren't loaded into context then there's no lasting benefit from the time taken to review and it's a blank slate each time. I think ultimately it can be solved with workflow changes (i.e. AI written code should be attributed to the AI in VCS, the full trace and manual edits should be visible for review, all human input prompts to the AI should be browsable during review without having to scroll 10k lines of AI reasoning.)
delichon 2 hours ago [-]
> I see reviewing and accepting a PR is a question of trust
I think that's backwards, at least as far as accepting a PR. Better that all code is reviewed as if it is probably a carefully thought out Trojan horse from a dedicated enemy until proven otherwise.
jajuuka 1 hours ago [-]
That's the key part in all this. Reviewing PR needs to be a rock solid process that can catch errors. Human or AI generated.
sothatsit 1 hours ago [-]
Concerns about the wasting of maintainer’s time, onboarding, or copyright, are of great interest to me from a policy perspective. But I find some of the debate around the quality of AI contributions to be odd.
Quality should always be the responsibility of the person submitting changes. Whether a person used LLMs should not be a large concern if someone is acting in good-faith. If they submitted bad code, having used AI is not a valid excuse.
Policies restricting AI-use might counterintuitively have a negative effect on contribution quality if they hinder good contributors while bad contributors ignore the restrictions. That said, restrictions for non-quality reasons, like copyright concerns, might still make sense.
qsera 26 minutes ago [-]
> If they submitted bad code...
The core issue is that it takes a large amount of effort to even assess this, because LLM generated code looks good superficially.
It is said that static FP languages make it hard to implement something if you don't really understand what you are implementing. Dynamically typed languages make it easier to implement something when you don't fully understand what you are implementing.
LLMs take this to another level, as they enable one to implement something with zero understanding of what they are implementing.
sothatsit 12 minutes ago [-]
The people likely to submit low-effort contributions are also the people most likely to ignore policies restricting AI usage.
The people following the policies are the most likely to use AI responsibly and not submit low-effort contributions.
Attempting to restrict AI to improve contribution quality might counterintuitively have the opposite effect by hindering good contributors while bad contributors ignore the restrictions.
I’m more interested in how we might allow people to build trust so that reviewers can positively spend time on their contributions, whilst avoiding wasting reviewers' time on drive-by contributors. This seems like a hard problem.
dormento 3 minutes ago [-]
I wonder if the right call wouldn't be to impose a LOC limit on contributions (sensibly chosen for the combination of language/framework/toolset).
sothatsit 20 seconds ago [-]
I quite like this direction. Limit new contributors to small contributions, and then relax restrictions as more of their contributions are accepted.
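A minimal sketch of that idea, assuming a per-contributor count of accepted PRs is available. The names `Contributor`, `loc_limit`, and `within_limit`, and the numbers (50 lines to start, doubling per accepted PR, capped at 2000), are all hypothetical and would need tuning per project:

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    name: str
    accepted_prs: int = 0  # previously merged contributions (assumed to be tracked somewhere)


def loc_limit(contributor: Contributor, base_limit: int = 50,
              growth: int = 2, cap: int = 2000) -> int:
    """Maximum changed lines allowed: grows with each accepted PR, up to a cap."""
    return min(cap, base_limit * growth ** contributor.accepted_prs)


def within_limit(contributor: Contributor, changed_lines: int) -> bool:
    """True if a PR of this size is small enough for this contributor."""
    return changed_lines <= loc_limit(contributor)
```

A brand-new contributor is held to 50 changed lines here, while someone with three accepted PRs gets 400. Whether doubling is the right growth curve is a judgment call this sketch does not settle.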
IshKebab 44 minutes ago [-]
It should be the responsibility of the person submitting changes. The problem is AI apparently makes it easy for people to shirk that responsibility.
sothatsit 20 minutes ago [-]
Trusted contributors using LLMs do not cause this problem though. It is the larger volume of low-effort contributions causing this problem, and those contributors are the most likely to ignore the policies.
Therefore, policies restricting AI-use on the basis of avoiding low-quality contributions are probably hurting more than they’re helping.
qsera 32 minutes ago [-]
> people to shirk that responsibility.
Actually not shirk, but just transfer it to reviewers.
SamuelAdams 3 hours ago [-]
My question on AI generated contributions and content in general: on a long enough timeline, with ever improving advancements in AI, how can people reliably tell the difference between human and AI generated efforts?
Sure now it is easy, but in 3-10 years AI will get significantly better. It is a lot like the audio quality of an MP3 recording. It is not perfect (lossless audio is better), but for the majority of users it is "good enough".
At a certain point AI generated content, PR's, etc will be good enough for humans to accept it as "human". What happens then, when even the best checks and balances are fooled?
lich_king 3 hours ago [-]
> My question on AI generated contributions and content in general: on a long enough timeline, with ever improving advancements in AI, how can people reliably tell the difference between human and AI generated efforts?
Can you reliably tell that the contributor is truly the author of the patch and that they aren't working for a company that asserts copyright on that code? No, but it's probably still a good idea to have a policy that says "you can't do that", and you should be on the lookout for obvious violations.
It's the same story here. If you do nothing, you invite problems. If you do something, you won't stop every instance, but you're on stronger footing if it ever blows up.
Of course, the next question is whether AI-generated code that matches or surpasses human quality is even a problem. But right now, it's academic: most of the AI submissions received by open source projects are low quality. And if it improves, some projects might still have issues with it on legal (copyright) or ideological grounds, and that's their prerogative.
sheepscreek 3 hours ago [-]
Precisely. “AI” contributions should be seen as an extension of the individual. If anything, they could ask that the account belong to a person and not be a second bot only account. Basically, a person’s own reputation should be on the line.
SlinkyOnStairs 2 hours ago [-]
Reputation isn't very relevant here. Yes, for established well known FOSS developers, their reputation will tank if they put out sloppy PRs and people will just ignore them.
But the projects aren't drowning under PRs from reputable people. They're drowning in drive-by PRs from people with no reputation to speak of. Even if you outright ban their account, they'll just spin up a new one and try again.
Blocking AI submissions serves as a heuristic to reduce this flood of PRs, because the alternative is to ban submissions from people without reputation, and that'd be very harmful to open source.
And AI cannot be the solution here, because open source projects have no funds. Asking maintainers to fork over $200/month for "AI code reviews" just kills the project.
dudeinhawaii 52 minutes ago [-]
I don't see why we can't have AI powered reviews as a verification of truth and trust score modifier. Let me explain.
1. You lay out a policy stating that all code, especially AI code, has to be written to a high quality level and have been reviewed for issues prior to submission.
2. Given that even the fastest AI models do a great job of code reviews, you set up an agent using Codex-Spark or Sonnet, etc. to scan submissions for a few different dimensions (maintainability, security, etc).
3. If a submission comes through that fails review, that's a strong indication that the submitter hasn't put even the lowest effort into reviewing their own code. Especially since most AI models will flag similar issues. Knock their trust score down and supply feedback.
3a. If the submitter never acts on the feedback - close the submission and knock the trust score down even more.
3b. If the submitter acts on the feedback - boost trust score slightly. We now have a self-reinforcing loop that pushes thoughtful submitters to screen their own code. (Or ai models to iterate and improve their own code)
4. Submission passes and trust score of submitter meets some minimal threshold. Queued for human review pending prioritization.
I haven't put much thought into this but it seems like you could design a system such that "clout chasing" or "bot submissions" would be forced to either deliver something useful or give up _and_ lose enough trust score that you can safely shadowban them.
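Steps 3, 3a, 3b, and 4 above can be sketched roughly as follows. This is only an illustration: the automated review itself is left out and represented by a boolean, and every name and number here (`Submitter`, `record_review`, `record_followup`, the penalties, the threshold) is a hypothetical choice rather than part of any existing tool:

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    NEEDS_CHANGES = "needs_changes"            # step 3: failed automated review
    CLOSED = "closed"                          # step 3a: feedback ignored
    AWAITING_RE_REVIEW = "awaiting_re_review"  # step 3b: feedback acted on
    QUEUED_FOR_HUMAN = "queued_for_human"      # step 4: passed, trust is sufficient
    HELD = "held_low_trust"                    # passed, but trust is below threshold


@dataclass
class Submitter:
    name: str
    trust: float = 0.0


# Hypothetical tuning values; a real project would have to calibrate these.
FAIL_PENALTY = 1.0
IGNORED_PENALTY = 2.0
FIX_BONUS = 0.5
HUMAN_REVIEW_THRESHOLD = -1.0


def record_review(submitter: Submitter, passed: bool) -> Outcome:
    """Steps 3 and 4: apply the result of the automated review."""
    if not passed:
        submitter.trust -= FAIL_PENALTY
        return Outcome.NEEDS_CHANGES
    if submitter.trust >= HUMAN_REVIEW_THRESHOLD:
        return Outcome.QUEUED_FOR_HUMAN
    return Outcome.HELD


def record_followup(submitter: Submitter, acted_on_feedback: bool) -> Outcome:
    """Steps 3a and 3b: react to how the submitter handled the feedback."""
    if acted_on_feedback:
        submitter.trust += FIX_BONUS
        return Outcome.AWAITING_RE_REVIEW
    submitter.trust -= IGNORED_PENALTY
    return Outcome.CLOSED
```

With these values, a submitter who fails once and then fixes the issues ends at -0.5 and still reaches human review on the next passing submission. One who ignores the feedback drops to -3.0 and is held even if a later submission passes.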
SlinkyOnStairs 9 minutes ago [-]
The immediate problem is just cost. Open Source has no money, so any fancy AI solution is off the table immediately.
In terms of your plan though, you're just building a generative adversarial network here. Automated review is relatively easy to "attack".
Yet human contributors don't put up with having to game an arbitrary score system. StackOverflow imploded in no small part because of it.
hombre_fatal 2 hours ago [-]
Well, the problem you just outlined is a reputation (+ UI) problem: why are contributions from unknown contributors shown at the same level as PRs from known quality contributors, for example?
We need to rethink some UX design and processes here, not pretend low quality people are going to follow your "no low quality pls i'm serious >:(" rules. Rather, design the processes against low quality.
Also, we're in a new world where code-change PRs are trivial, and the hard part isn't writing code anymore but generating the spec. Maybe we don't even allow PRs anymore except for trusted contributors, and everyone else can only create an issue and help refine a plan there from which the code impl is derived?
You know, even before LLMs, it would have been pretty cool if we had a better process around deliberating and collaborating around a plan before the implementation step of any non-trivial code change. Changing code in a PR with no link to discussion around what the impl should actually look like always did feel like the cart before the horse.
SlinkyOnStairs 2 hours ago [-]
In the long distant past of 4-5 years ago, it simply wasn't a problem. Few projects were overwhelmed with PRs to begin with.
And for the major projects where there was a flood of PRs, it was fairly easy to identify if someone knew what they were talking about by looking at their language; Correct use of jargon, especially domain-specific jargon.
The broader reason why "unknown contributor" PRs were held in high regard is that, outside of some specific incidents (thank you, DigitalOcean and your stupid tshirts), the odds were pretty good of a drive by PR coming from someone who identified a problem in your software by using it. Those are incredibly valuable PRs, especially as the work of diagnosing the problem generally also identifies the solution.
It's very hard to design a UX that impedes clueless fools spamming PRs but not the occasional random person who finds a sincere issue and has the time to identify and fix it, but not to commit to permanent project contribution.
> and the hard part isn't writing code anymore but generating the spec
My POV: This is a bunch of crap and always has been.
Any sufficiently detailed specification is code. And the cost of writing such a specification is the cost of writing code. Every time "low code" has been tried, it doesn't work for this very reason.
e.g. The work of a ticket "Create a product category for 'Lime'" consists not of adding a database entry and typing in the word 'Lime', it consists of the human work of calling your client and asking whether it should go under Fruit or Cement.
bombcar 2 hours ago [-]
Because until now, unknown contributors either submitted obvious junk which could be closed by even an unskilled moderator (I've done triage work for OS projects before) or they submitted something that was workable and a good start.
The latter is where you get all known contributors from! So if you close off unknown contributors the project will eventually stagnate and die.
bityard 2 hours ago [-]
> because the alternative is to ban submissions from people without reputation, and that'd be very harmful to open source.
Hmmm, no? That's actually very common in open source. Maybe "banning" isn't the right word, but lots of projects don't accept random drive-by submissions and never have. Debian is a perfect example, you are very unlikely to get a nontrivial patch or package into Debian unless you have some kind of interaction or rapport with a package maintainer, or commit to the process of building trust to become a maintainer yourself.
I have seen high profile GitHub projects that summarily close PRs if you didn't raise the bug/feature as an issue or join their discord first.
SlinkyOnStairs 1 hours ago [-]
Setting aside "make an issue first" because those too are flooded with LLMs.
> you are very unlikely to get a nontrivial patch or package into Debian unless you have some kind of interaction or rapport with a package maintainer
I did mean the "trivial" patches as well, as often it's a lot of these small little fixes to single issues that improve software quality overall.
But yes, it's true that it's not uncommon for projects to refuse outside PRs.
This already causes massive amounts of friction and contributes (heh) heavily to what makes Open Source such a pain in the ass to use.
Conversely, many popular "good" open source libraries rely extensively on this inflow of small contributions to become comprehensively good.
And so it's a tradeoff. Forcing all open source into refusing drive-by PRs will have costs. What makes sense for major security-sensitive projects with large resources doesn't make sense for others.
It's not that we won't have open source at all. It's that it'll just be worse and encourage further fragmentation. e.g. One doesn't build a good .ZIP library by carefully reading the specification, you get it by collecting a million little examples of weird zip files in the wild breaking your code.
lich_king 53 minutes ago [-]
> Precisely. “AI” contributions should be seen as an extension of the individual.
That's an OK view to hold, but I'll point out two things. First, it's not how the tech is usually wielded to interact with open-source software. Second, your worldview is at odds with the owners of this technology: the main reason why so much money is being poured into AI coding is that it's seen by investors as a replacement for the individual.
aerodexis 2 hours ago [-]
Interesting argument for AI ethics in general. It takes the form of "guns don't kill people - people kill people".
glhaynes 2 hours ago [-]
An argument that I have some sympathy for, while still being moderately+ in favor of gun control (here in the USA where I'm a citizen).
It seems that gun control—though imperfect—in regions that have implemented it has had a good bit of success and the legitimate/non-harmful capabilities lost seem worth it to me in trade for the gains. (Reasonable people can disagree here!)
Whereas it seems to me that if we accept the proposition that the vast majority of code in the future is going to be written by AI (and I do), these valuable projects that are taking hard-line stances against it are going to find themselves either having to retreat from that position or facing insurmountable difficulties in staying relevant while holding to their stance.
estebank 2 hours ago [-]
> these valuable projects that are taking hard-line stances against it are going to find themselves either having to retreat from that position or facing insurmountable difficulties in staying relevant while holding to their stance.
It is the conservative position: it will be easier to walk back the policy and start accepting AI produced code some time down the road when its benefits are clearer than it will be to excise AI produced code from years prior if there's a technical or social reason to do that.
Even if the promise of AI is fulfilled and projects that don't use it are comparatively smaller, that doesn't mean there's no value in that, in the same way that people still make furniture in wood with traditional methods today even if a company can make the same widget cheaper in an almost fully automated way.
datsci_est_2015 2 hours ago [-]
> It seems that gun control—though imperfect—in regions that have implemented it has had a good bit of success and the legitimate/non-harmful capabilities lost seem worth it to me in trade for the gains.
This is even true despite the fact that there are bad actors only a few minutes drive away in many cases (Chicago->Indiana border, for example).
jazzyjackson 2 hours ago [-]
Unfortunately ChatGPT turned “text continuation” into “separate entity you can talk to”
aerodexis 45 minutes ago [-]
The desire to anthropomorphize LLMs is super interesting. People naturally anthropomorphize technology (even printers: "why are you not working!?"). It's a natural and useful heuristic. However, I can easily see how chatGPT would want to intensify this tendency in order to sell the technology's "agency" and the promise that it can solve all your problems. However, since it's a heuristic, it papers over a lot of details that one would do well to understand.
(as an aside - this reminds me of the trend of Object Oriented Ontology that specifically /tried/ to imbue agency onto large-scale phenomena that were difficult to understand discretely. I remember "global warming" being one of those things - and I can see now how this philosophy would have done more to obscure the dominion of experts wrt that topic)
spogbiper 1 hours ago [-]
It's actually the bullets
dataflow 2 hours ago [-]
I don't think any side on the issue of gun ownership has ever claimed that statement is false, so I'm not sure what your point is.
simianwords 4 minutes ago [-]
With improvements, we wouldn't even talk about code. Just designs and features!
nancyminusone 2 hours ago [-]
Of course you can tell. If someone suddenly submits a mountainous pile of code out of nowhere that claims to fix every problem, you can make a reasonable estimate that the author used AI. It's then equally reasonable to suggest said author might not have taken the requisite time and detail to understand the scope of the problem.
This is the basis of the argument - it doesn't matter if you use AI or not, but it does matter if you know what you're doing or not.
mrbungie 3 hours ago [-]
The same way niche/luxury products and services compare to fast/cheap ones: they are made with focus and intent that goes against the statistical average, which also normally would take more time and effort to make.
McDonald's cooks ~great~ (edit: fair enough, decent) burgers when measured objectively, but people still go to more niche burger restaurants because they want something different and made with more care.
That's not to say that a human can't use AI with intent, but then AI becomes another tool and not an autonomous code generating agent.
AlexandrB 2 hours ago [-]
> McDonalds cooks great burgers when measured objectively
Wait, what? In what world are McDonalds burgers "great"? They're cheap. Maybe even a good value. But that's not the same as great.
bombcar 2 hours ago [-]
They are consistent and decent, though arguably some are even good (though everyone usually has a preferred fast food destination).
Some of the best burgers I've ever had came from fast food.
mrbungie 2 hours ago [-]
Fair enough, I should've said borderline decent.
Jleagle 3 hours ago [-]
Isn't your prediction a good thing? People prefer humans currently as they are better but if AI is just as good, doesn't that just mean more good PRs?
coldpie 3 hours ago [-]
> but if AI is just as good, doesn't that just mean more good PRs?
If you believe the outputs of LLMs are derivative products of the materials the LLMs were trained on (which is a position I lean towards myself, but I also understand the viewpoint of those who disagree), then no, that's not a good thing, because it would be a license violation to accept those derived products without following the original material's license terms, such as attribution and copyleft terms. You are now party to violating the original materials' copyright by accepting AI generated code. That's ethically dubious, even if those original authors may have a hard time bringing a court case against you.
graemep 3 hours ago [-]
> If you believe the outputs of LLMs are derivative products of the materials the LLMs were trained on
In that case a lot of proprietary software is in breach of copyleft licences. It's probably by far the commonest breach.
> You are now party to violating the original materials' copyright by accepting AI generated code. That's ethically dubious
That is arguable. Is it always ethically dubious to breach a law? If not, why is it ethically dubious to breach this law in this particular way?
coldpie 2 hours ago [-]
> In that case a lot of proprietary software is in breach of copyleft licences. It's probably by far the commonest breach.
Sure, but this doesn't really seem relevant to the conversation. Someone else violating software license terms doesn't justify me (or Debian, in the case of TFA) doing so.
> Is it always ethically dubious to breach a law?
I'm not really concerned with the law, here. I think it is ethically dubious to use someone else's work without compensating them in the manner they declared. Copyright law happens to be the method we've used for a couple hundred years to standardize the discussion about that compensation, and sometimes enforce it. Breaching the law doesn't really enter into the conversation, except as a way our society agrees to hold everyone to a minimum ethical standard.
graemep 47 minutes ago [-]
> I'm not really concerned with the law, here. I think it is ethically dubious to use someone else's work without compensating them in the manner they declared.
OK, that is reasonable. I do not think copyright is a good mechanism though, and I think the need to compensate depends on multiple factors depending on what you use a work for and under what circumstances.
iLoveOncall 2 hours ago [-]
> but in 3-10 years AI will get significantly better
Crystal ball or time machine?
pjerem 2 hours ago [-]
Crystal ball, maybe, but 3 years ago, the AI generated classes with empty methods containing "// implement logic here" and now, AI is generating whole stack applications that run from the first try.
Past performance does not guarantee future results, of course. But acting like AI is now magically going to stagnate is also a really bold bet.
bigstrat2003 57 minutes ago [-]
> now, AI is generating whole stack applications that run from the first try
I sincerely doubt that, because it still can't even generate a few hundred line script that runs on the first try. I would know, I just tried yesterday. The first attempt was using hallucinated APIs and while I did get it to work eventually, I don't think it can one shot a complex application if it can't one shot a simple script.
IMO, AI has already stagnated and isn't significantly better than it was 3 years ago. I don't see how it's supposed to get better still when the improvement has already stopped.
pjerem 48 minutes ago [-]
What tool did you use?
I routinely generate applications for my personal use using OpenCode + Claude Sonnet/Opus.
Yesterday I generated an app for my son to learn multiplication tables using spaced repetition algorithm and score keeping. It took me like 5 minutes.
Of course if you use ChatGPT it will not work but there is no way Claude Code/Open Code with any modern model isn't able to generate a one hundred line script on the first try.
wadim 3 hours ago [-]
Why accept PRs in this case, if the maintainers themselves can ask their favorite LLM to implement a feature/fix an issue?
FrojoS 2 hours ago [-]
Because it might require time consuming testing, iterations, documentation etc.
If everything the maintainer wants can (hypothetically) be one-shotted, then there is no need to accept PRs at all. Just allow forks in case of open source.
theptip 3 hours ago [-]
Obviously - it takes effort to hone the idea/spec, and it takes time to validate the result. Code being free doesn’t make a kernel patch free, though it would make it cheaper.
hombre_fatal 3 hours ago [-]
You say "on a long enough timeline", but you already can't tell today in the hands of someone who knows what they're doing.
I think a lot of anti-LLM opinions just come from interacting with the lowest effort LLM slop and someone not realizing that it's really a problem with a low value person behind it.
It's why "no AI allowed" is pointless; high value contributors won't follow it because they know how to use it productively and they know there's no way for you to tell, and low value people never cared about wasting your time with low effort output, so the rule is performative.
e.g. If you tell me AI isn't allowed because it writes bad code, then you're clearly not talking to someone who uses AI to plan, specify, and implement high quality code.
datsci_est_2015 2 hours ago [-]
> It's why "no AI allowed" is pointless … If you tell me AI isn't allowed because it writes bad code
I disagree that the rule is pointless, and your last point is a strawman. AI is disallowed because it’s the manner in which the would-be contributors are attempting to contribute to these projects. It’s a proxy rule.
Unfortunately for AI maximalists, code is more than just letters on the screen. There needs to be human understanding, and if you’re not a core contributor who’s proven you’re willing to stick around when shit hits the fan, a +3000 PR is a liability, not an asset.
Maybe there needs to be something like the MMORPG concept of “Dragon Kill Points (DKP)”, where you’re not entitled to loot (contribution) until you’ve proven that you give a shit.
bombcar 2 hours ago [-]
> Unfortunately for AI maximalists, code is more than just letters on the screen. There needs to be human understanding, and if you’re not a core contributor who’s proven you’re willing to stick around when shit hits the fan, a +3000 PR is a liability, not an asset.
This isn't necessarily true; I've seen some projects absorb a PR of roughly that size, and after the smoke tests and other standard development stuff, the original PR author basically disappeared.
It added a feature he wanted, he tested and coded it, and got it in.
datsci_est_2015 55 minutes ago [-]
So because some projects can absorb some PRs of a certain size, all projects should be able to absorb PRs of that same size?
This anecdotal argument is a dead end. The nuance is clear: not all software is the same, and not all edits to software are the same.
ApolloFortyNine 28 minutes ago [-]
> So because some projects can absorb some PRs of a certain size, all projects should be able to absorb PRs of that same size?
Your argument has nothing to do with AI and more to do with PR size and 'fire and forget' feature merges. That's what the commenter you're responding to is pointing out.
darkwater 2 hours ago [-]
> and if you’re not a core contributor who’s proven you’re willing to stick around when shit hits the fan, a +3000 PR is a liability, not an asset.
And in the context of high-value contributors that GP was mentioning, they are never going to land a +3000 PR because they know there is going to be a human reviewer on the other side.
sigseg1v 42 minutes ago [-]
Vibe coded slop is a 50 DKP minus of course
fwip 5 minutes ago [-]
> high value contributors won't follow it
High-value contributors follow the rules and social mores of the community they are contributing to. If they intentionally deceive others, they are not high-value.
nananana9 2 hours ago [-]
I don't see an issue here. You keep using AI to create high value contributions in the projects that accept it, I will keep not using it in mine, and we can see who wins out in 10 years.
beepbooptheory 2 hours ago [-]
But then why have any contributions at all?
Like it's been years and years now, if all this is true, you'd think there would be more of a paradigm shift? I'm happy I guess waiting for Godot like everyone else, but the shadows are getting a little long now, people are starting to just repeat the same things over and over.
Like, I am so tired now, it's causing such messes everywhere. Can all the best things about AI be manifest soon? Is there a timeline?
Like what can I take so that I can see the brave new world just out of reach? Where can I go? If I could just even taste the mindset of the true believer for a moment, I feel like it would be a reprieve.
lpcvoid 3 hours ago [-]
All LLM-output is slop. There's no good LLM output. It's stolen code, stolen literature, stolen media condensed into the greatest heist of the 21st century. Perfect capitalism - big LLM companies don't need to pay royalties to humans, while selling access to a service which generates monthly revenue.
hombre_fatal 3 hours ago [-]
Whether it trained on real world "stolen" code is an implementation detail. A controversial one, but it isn't a supporting argument for whether it can write high quality, functional code or not.
jacquesm 2 hours ago [-]
Sorry, but no, that is not a detail, that is a major sticking point for me.
mikkupikku 2 hours ago [-]
I'm fine with calling all LLM outputs slop, but I'll draw the line at asserting there's no good LLM output. LLM output is good when it works, and we can easily verify that a lot of code from LLMs does work. That the code LLMs output is derived from copyrighted works is neither here nor there. First of all, ALL creative work is derivative. Secondly, IP is absurd horse shit and we never should have humored the premise of it being treated like real property.
__alexs 2 hours ago [-]
I came from a poor background and stole pretty much all the textbooks I used to learn programming as a kid. I also stole all the music I listened to while studying them. Is everything I write slop for the same reason?
lpcvoid 2 hours ago [-]
No. You're a human, who went through real life experiences. You learned, developed as a human being. You made mistakes and grew from them. You did what you have to do to advance. What you output has intrinsic value because of all this. I argue that even when you roll your face on your keyboard, the output is more valuable than ten pages of slop output from an LLM, since it's human, with all the history, experience, emotions and character which came before it.
the_biot 2 hours ago [-]
A quote from Neuromancer comes to mind:
"But I ain't likely to write you no poem, if you follow me. Your AI, it just might. But it ain't no way human."
__alexs 2 hours ago [-]
The Neo-Victorian perspective of The Diamond Age is not a luxury most of us are going to be able to afford unfortunately.
sigbottle 2 hours ago [-]
I don't know why this got downvoted. I've already been so frustrated by HN LIDAR mindsets but holy shit.
Human society exists because we value humans, full stop. The easiest way to "solve" all of humanity's problems is to simply say that humans aren't valuable. Sometimes it feels like we're conceding a ridiculous amount of ground on that basic principle every year - one more human value gone because it "doesn't matter", so hey, we've obviously made progress!
bigstrat2003 50 minutes ago [-]
Agreed. I think that sometimes people on HN lose sight of what is actually important, which is human flourishing. The other day there was someone arguing that the best thing to do to fix loneliness problems in society is to remove the human need for socializing. Which... is certainly one way to fix the problem, I guess, but completely misses the point. The point is not to fix a mismatch between essential human desires and what we can attain, the point is to work on fulfilling those desires! Just something that goes with nerd autism, I guess.
sieep 3 hours ago [-]
Well put. I'm gonna start parroting this talking point more from now on.
ronsor 2 hours ago [-]
And I thought being a stochastic parrot was limited to LLMs, but apparently they learned it from somewhere...
BoredPositron 3 hours ago [-]
Intent matters. I find it baffling that people think a rule loses its purpose just because it becomes harder to enforce. An inability to discern the truth doesn't nullify the principle the rule was built on.
retired 47 minutes ago [-]
Fork it to Slobian and let the clankers go to town creating, approving and merging pull requests by themselves. Look at the install base to see what people prefer.
MintPaw 21 minutes ago [-]
An interesting concept that stood out to me. Committing the prompts instead of the resulting code only.
Is it really true that LLMs are non-deterministic? I thought if you used the exact input and seed with the temperature set to 0 you would get the same output. It would actually be interesting to probe the commit prompts to see how slight variants performed.
LelouBil 18 minutes ago [-]
> I thought if you used the exact input and seed with the temperature set to 0 you would get the same output.
I think there can also be differences on different hardware, and also usually temperature is set higher than zero because it produces more "useful/interesting" outputs.
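A toy sketch of the sampling step being discussed, assuming made-up logits and a hypothetical `sample_token` helper (this is not any vendor's actual decoder). It shows that temperature 0 reduces to a deterministic argmax, that a fixed seed makes temperature > 0 repeatable, and that a tiny numeric difference in the logits, of the kind different hardware might plausibly produce, can still change the greedy pick.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick an index from `logits`; temperature 0 means greedy argmax."""
    if temperature == 0:
        # Greedy decoding: no randomness involved, first maximum wins.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.9, 0.5]  # hypothetical next-token scores

# Temperature 0: the seed is irrelevant, the result is always the argmax.
greedy = [sample_token(logits, 0, random.Random(seed)) for seed in range(5)]

# Temperature 1 with the same seed each time: random, but repeatable.
same_seed = [sample_token(logits, 1.0, random.Random(42)) for _ in range(5)]

# A tiny perturbation of one logit flips the greedy choice.
perturbed = [2.0, 2.0 + 1e-6, 0.5]
flipped = sample_token(perturbed, 0, random.Random(0))
```

In this toy setup the sampling itself is reproducible; any remaining variation has to come from the logits, which is where hardware and batching differences would enter.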
theptip 3 hours ago [-]
> disclosure if "a significant portion of the contribution is taken from a tool without manual modification", and labeling of such contributions with "a clear disclaimer or a machine-readable tag like '[AI-Generated]'".
Quixotic, unworkable, pointless. It’s fundamentally impossible (at least without a level of surveillance that would obviously be unacceptable) to prove the “artisanal hand-crafted human code” label.
> contributors should "fully understand" their submissions and would be accountable for the contributions, "including vouching for the technical merit, security, license compliance, and utility of their submissions".
This is in the right direction.
I think the missing link is around formalizing the reputation system; this exists for senior contributors but the on-ramp for new contributors is currently not working.
Perhaps bots should ruthlessly triage un-vouched submissions until the actor has proven a good-faith ability to deliver meaningful results. (Or the principal has staked / donated real money to the foundation to prove they are serious.)
I think the real problem here is the flood of low-effort slop, not AI tooling itself. In the hands of a responsible contributor LLMs are already providing big wins to many. (See antirez’s posts for example, if you are skeptical.)
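A rough sketch of what such bot triage could look like, purely as an illustration. The `Submission` type, the `vouched` set, and the 200-line threshold are all assumptions made up for this example, not anything Debian or any other project actually uses.

```python
from dataclasses import dataclass

@dataclass
class Submission:
    author: str
    lines_changed: int

def triage(sub, vouched, max_unvouched_lines=200):
    """Return 'review' or 'defer' for a submission (hypothetical policy)."""
    if sub.author in vouched:
        # Vouched contributors go straight to human review.
        return "review"
    if sub.lines_changed > max_unvouched_lines:
        # Large changes from unproven authors are deferred, not rejected:
        # the author can come back with something smaller first.
        return "defer"
    return "review"

vouched = {"alice"}  # hypothetical set of contributors someone has vouched for
decisions = [
    triage(Submission("alice", 3000), vouched),
    triage(Submission("mallory", 3000), vouched),
    triage(Submission("newbie", 40), vouched),
]
```

Note that nothing here asks whether AI was used; the filter keys only on reputation and size, which are things a project can actually observe.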
hananova 1 hours ago [-]
> Quixotic, unworkable, pointless. It’s fundamentally impossible (at least without a level of surveillance that would obviously be unacceptable) to prove the “artisanal hand-crafted human code” label.
Difficulty of enforcing is a detail. Since the rule exists, it can be used when detection is done. And importantly it means that ignoring the rule means you’re intentionally defrauding the project.
jruohonen 2 hours ago [-]
Debian has always been Debian and thus there are these purist opinions, but perhaps my take too would be something along the "one-strike-and-you-are-out" kind of a policy (i.e., you submit slop without being able to explain your submission in any way) already followed in some projects:
This is like trying to stop spam by banning emails that send you spam.
They can spin up LLM-backed contributors faster than you can ban them.
ApolloFortyNine 24 minutes ago [-]
Banning AI would hardly stop that, the LLM contributors would simply claim they're not AI.
Hence why banning AI contributions is meaningless, you literally only punish 'good' actors.
jruohonen 1 hours ago [-]
If the situation becomes that bad, I agree with you; otherwise, I don't see that as a problem.
techwizrd 2 hours ago [-]
I agree. If the real concern is the flood of low-effort slop, unmaintainable patches, accidental code reuse, or licensing violations, then the process should target those directly. The useful work is improving review and triage so those problems get filtered out early. The genie is already out of the bottle with AI tooling, so broad “no AI” rules feel like a reaction to the tool and do not seem especially useful or enforceable.
hombre_fatal 2 hours ago [-]
Aside, that's a fun read/format, like reading about judges arguing how to interpret a law or debating whether a law is constitutional.
3012846 3 hours ago [-]
Again you can see which developers are owned by corporations and which are not. There is no free software any longer.
fidorka 3 hours ago [-]
What do you mean?
est31 2 hours ago [-]
I think it's a complicated issue.
A lot of low quality AI contributions arrive using free tiers of these AI models, the output of which is pretty crap. On the other hand, if you max out the model configs, i.e. get "the best money can buy", then those models are actually quite useful and powerful.
OSS should not miss out on the power LLMs can unleash. Talking about the maxed out versions of the newest models only, i.e. stuff like Claude 4.5+ and Gemini 3, so developments of the last 5 months.
But at the same time, maintainers should not have to review code written by a low quality model (and the high quality models, for now, are all closed, although I heard good things about Minmax 2.5 but I haven't tried it).
Given how hard it is to tell which model made a specific output, without doing an actual review, I think it would make most sense to have a rule restricting AI access to trusted contributors only, i.e. maintainers as a start, and maybe some trusted group of contributors where you know that they use the expensive but useful models, and not the cheap but crap models.
ACCount37 44 minutes ago [-]
It's the difference between raw LLM output vs LLM output that was tweaked, reviewed and validated by a competent developer.
Both can look like the same exact type of AI-generated code. But one is a broken useless piece of shit and the other actually does what it claims to do.
The problem is just how hard it is to differentiate the two at a glance.
bombcar 1 hours ago [-]
The tacit understanding of all these is that the valued contributors can use AI as long as they can "defend the code" if you will, because AI used lightly and in that way would be indistinguishable from knuthkode.
The problem is having an unwritten rule is sometimes worse than a written one, even if it "works".
This is the technique I've picked up and got the most from over the past few months. I don't give it hard, high-level problems and then review a giant set of changes to figure it out. I give it the technical solution I was already going to implement anyway, and then have it generate the code I otherwise would have written.
It cuts back dramatically on the review fatigue because I already know exactly what I'm expecting to see, so my reviews are primarily focused on the deviations from that.
This is how I've found myself to be productive with the tools, or since productivity is hard to measure, at least it's still a fun way to work. I do not need to type everything but I want a very exact outcome nonetheless.
I also use LLM assistance, and I love it because it helps my ADHD brain get stuff done, but I definitely miss stuff that I wouldn’t miss by myself. It’s usually fairly simple mistakes to fix later but I still miss them initially.
I’ve been having luck with LLM reviewers though.
The accessibility angle is really important here. What we need is a way to stop people who make contributions they don't understand and/or cannot vouch they are the author of (the license question is still very murky, and no, what the US Supreme Court said doesn't matter here in the EU). This is difficult though.
My simple solution: I use Whisper to transcribe my text, and feed the output to an LLM for cleanup (custom prompt). It's fantastic. Way better than stuff like Dragon. Now I get frustrated with transcribing using Google's default mechanism on Android - so inaccurate!
But the ability to take notes, dictate emails, etc using Whisper + LLM is invaluable. I likely would refuse to work for a company that won't let me put IP into an LLM.
Similarly, I take a lot of notes on paper, and would have to type them up. Tedious and painful. I switched to reading my notes aloud and use the above system to transcribe. Still painful. I recently realized Gemini will do a great job just reading my notes. So now I simply convert my notes to a photo and send to Gemini.
I categorize all my expenses. I have receipts from grocery stores where I highlight items into categories. You can imagine it's painful to enter that into a financial SW. I'm going to play with getting Gemini to look at the photo of the receipt and categorize and add up the categories for me.
All of these are cool applications on their own, but when you realize they're also improving your health ... clear win.
This reads almost like satire of an AI power user. Why would you like it when an LLM makes things up? Because you get to write more prompts? Wouldn't it be better if it just didn't do that?
It's like saying "I love getting stuck in traffic because I get to drive longer!"
Sorry but that one sentence really stuck out to me
I like it because I have no expectation of perfection-- out of others, myself, and especially not AI. I expect "good enough" and work upwards from there, and with (most) things, I find AI to be better than good enough.
I'm much better now after tons of rehab work (no surgery, thankfully), but I don't have the stamina to type as much as I used to. I was always a heavy IDE user and a very fast coder, but I've moved platforms too many times and lost my muscle memory. A year ago I found the AI tools to be basically time-wasters, but now I can be as productive as before without incurring significant pain.
Relying on public data and chat users' markdown data, which is then sold to other users without compensation, raises a number of issues:
1. Copyright: LLM generated content can't be assigned copyright (USA), and thus may contaminate licensing agreements. It is likely public-domain, but also may conflict with GPL/LGPL when stolen IP bleeds through weak obfuscation. The risk has zero precedent cases so far (the Disney case slightly differs), but is likely a legal liability waiting to surface eventually.
2. Workmanship: All software is terrible, but some of it is useful. People that don't care about black-box obfuscated generated content, are also a maintenance and security liability. Seriously, folks should just retire if they can't be arsed to improve readable source tree structure.
3. Repeatability: As the models started consuming other LLM content, the behavioral vectors often also change the content output. Humans know when they don't know something, but an LLM will inject utter random nonsense every time. More importantly, the energy cost to get that error rate lower balloons exponentially.
4. Psychology: People do not think critically when something seems right 80% of the time. The LLM accuracy depends mostly on stealing content, but it stops working when there is nothing left to commit theft of service on. The web is now >53% slop and growing. Only the human user chat data is worth stealing now.
5. Manipulation: The frequency with which bad bots astroturf forums with poisoned discourse is biasing the delusional. Some react emotionally instead of engaging the community in good faith, or shill hard for their cult of choice.
6. Sustainability: FOSS like all ecosystems is vulnerable to peer review exhaustion like the recent xz CVE fiasco. The LLM hidden hostile agent problem is currently impossible to solve, and thus cannot be trusted in hostile environments.
7. Ethics: Every LLM ruined town economic simulations, nuked humanity 94% of the time in every war game, and encouraged the delusional to kill IRL.
I am all for assistive technologies like better voice recognition, TTS, and individualized computer-user interfaces, but most will draw a line at slop code, and branch to a less chaotic source tree to work on.
I think it is hilarious some LLM proponents immediately assume everyone also has no clue how these models are implemented. =3
"A Day in the Life of an Ensh*ttificator "
https://www.youtube.com/watch?v=T4Upf_B9RLQ
I do agree that at large, the theoretical upsides of accessibility are almost certainly completely overshadowed by obvious downsides of AI. At least, for now anyway. Accessibility is a single instance of the general argument that "of course there are major upsides to using AI", and there's a good chance the future only gets brighter.
My point, essentially, is that I think this is (yet another) area in life where you can't solve the problem by saying "don't do it", and enforcing it is cost-prohibitive. Saying "no AI!" isn't going to stop PR spam. It's not going to stop slop code. What is it going to stop (see edit)? "Bad" people won't care, and "good" people (who use/depend-on AI) will contribute less.
Thus I think we need to focus on developing robust systems around integrating AI. Certainly I'd love to see people adopt responsible disclosure policies as a starting point.
--
[edit] -- To answer some of my own question, there are obvious legal concerns that frequently come up. I have my opinions, but as in many legal matters, especially around IP, the water is murky and opinions are strongly held at both extremes, and all too often having to fight a legal battle *at all* is immediately a loss regardless of outcome.
I think the ugly unspoken truth whether Mozilla or Debian or someone else, is that there are going to be plausible and valuable use cases and that AI as a paradigm is going to be a hard problem the same way that presiding over, say, a justice system is a hard problem (stay with me). What I mean is it can have a legitimate purpose but be prone to abuse and it's a matter of building in institutional safeguards and winning people's trust while never fully being able to eliminate risk.
It's easy for someone to roll their eyes at the idea that there's utility, but accessibility is a perfect and clear-eyed use case that makes it harder to simply default to hedonic skepticism against any and all AI applications. I actually think it could have huge implications for leveling the playing field in the browser wars for my particular pet issue.
Something might be required now as some people might think that just asking an LLM is "the most he can do", but it's not about using AI, it's about being aware and responsible about using it.
But like the XZ attack, we kind of have to assume that advanced persistent threats are a reality for FOSS too.
I can envisage a Sybil attack where several seemingly disparate contributors are actually one actor building a backdoor.
Right now we have a disparity in that many contributors can use LLMs but the receiving projects aren't able to review them as effectively with LLMs.
LLM generated content often (perhaps by definition) seems acceptable to LLMs. This is the critical issue.
If we had means of effectively assessing PRs objectively that would make this moot.
I wonder if this is a whole new class of issue. Is judging a PR harder than making one? It seems so right now.
Depends on the assumptions. If you assume good intent of the submitter and you spend time to explain what he should improve, why something is not good, etc, then it's a lot of effort. If you assume bad intent, you can just reject with something like "too large review from unproven user, please contribute something smaller first".
Yes, we might need to take things a bit slower, and build relations to the people you collaborate with in order to have some trust (this can also be attacked, but this was already possible).
For AI generated code, if previous PRs aren't loaded into context then there's no lasting benefit from the time taken to review and it's a blank slate each time. I think ultimately it can be solved with workflow changes (i.e. AI written code should be attributed to the AI in VCS, the full trace and manual edits should be visible for review, all human input prompts to the AI should be browsable during review without having to scroll 10k lines of AI reasoning.)
I think that's backwards, at least as far as accepting a PR. Better that all code is reviewed as if it is probably a carefully thought out Trojan horse from a dedicated enemy until proven otherwise.
Quality should always be the responsibility of the person submitting changes. Whether a person used LLMs should not be a large concern if someone is acting in good-faith. If they submitted bad code, having used AI is not a valid excuse.
Policies restricting AI-use might counterintuitively have a negative effect on contribution quality if they hinder good contributors while bad contributors ignore the restrictions. That said, restrictions for non-quality reasons, like copyright concerns, might still make sense.
The core issue is that it takes a large amount of effort to even assess this, because LLM generated code looks good superficially.
It is said that static FP languages make it hard to implement something if you don't really understand what you are implementing. Dynamically typed languages make it easier to implement something when you don't fully understand what you are implementing.
LLMs take this to another level when they enable one to implement something with zero understanding of what they are implementing.
The people following the policies are the most likely to use AI responsibly and not submit low-effort contributions.
Attempting to restrict AI to improve contribution quality might counterintuitively have the opposite effect by hindering good contributors while bad contributors ignore the restrictions.
I’m more interested in how we might allow people to build trust so that reviewers can positively spend time on their contributions, whilst avoiding wasting reviewers' time on drive-by contributors. This seems like a hard problem.
Therefore, policies restricting AI-use on the basis of avoiding low-quality contributions are probably hurting more than they’re helping.
Actually not shrink, but just transfer it to reviewers.
Sure now it is easy, but in 3-10 years AI will get significantly better. It is a lot like the audio quality of an MP3 recording. It is not perfect (lossless audio is better), but for the majority of users it is "good enough".
At a certain point AI generated content, PRs, etc will be good enough for humans to accept it as "human". What happens then, when even the best checks and balances are fooled?
Can you reliably tell that the contributor is truly the author of the patch and that they aren't working for a company that asserts copyright on that code? No, but it's probably still a good idea to have a policy that says "you can't do that", and you should be on the lookout for obvious violations.
It's the same story here. If you do nothing, you invite problems. If you do something, you won't stop every instance, but you're on stronger footing if it ever blows up.
Of course, the next question is whether AI-generated code that matches or surpasses human quality is even a problem. But right now, it's academic: most of the AI submissions received by open source projects are low quality. And if it improves, some projects might still have issues with it on legal (copyright) or ideological grounds, and that's their prerogative.
But the projects aren't drowning under PRs from reputable people. They're drowning in drive-by PRs from people with no reputation to speak of. Even if you outright ban their account, they'll just spin up a new one and try again.
Blocking AI submissions serves as a heuristic to reduce this flood of PRs, because the alternative is to ban submissions from people without reputation, and that'd be very harmful to open source.
And AI cannot be the solution here, because open source projects have no funds. Asking maintainers to fork over $200/month for "AI code reviews" just kills the project.
1. You lay out a policy stating that all code, especially AI code, has to be written to a high quality level and have been reviewed for issues prior to submission.
2. Given that even the fastest AI models do a great job of code reviews, you set up an agent using Codex-Spark or Sonnet, etc to scan submissions for a few different dimensions (maintainability, security, etc).
3. If a submission comes through that fails review, that's a strong indication that the submitter hasn't put even the lowest effort into reviewing their own code. Especially since most AI models will flag similar issues. Knock their trust score down and supply feedback.
3a. If the submitter never acts on the feedback - close the submission and knock the trust score down even more.
3b. If the submitter acts on the feedback - boost trust score slightly. We now have a self-reinforcing loop that pushes thoughtful submitters to screen their own code. (Or AI models to iterate and improve their own code.)
4. Submission passes and trust score of submitter meets some minimal threshold. Queued for human review pending prioritization.
I haven't put much thought into this but it seems like you could design a system such that "clout chasing" or "bot submissions" would be forced to either deliver something useful or give up _and_ lose enough trust score that you can safely shadowban them.
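The numbered steps above can be sketched in a few lines of Python. This is a minimal sketch, not anything the commenter specified: the `Submitter` class, the function names, and every penalty, bonus, and threshold value are hypothetical, and the automated review itself is reduced to a boolean supplied by the caller.

```python
from dataclasses import dataclass

# Hypothetical values; the comment gives no concrete numbers.
REVIEW_FAIL_PENALTY = 2        # step 3: failed automated review
IGNORED_FEEDBACK_PENALTY = 3   # step 3a: feedback never acted on
ACTED_ON_FEEDBACK_BONUS = 1    # step 3b: feedback acted on
HUMAN_REVIEW_THRESHOLD = 0     # step 4: minimum trust to reach a human
SHADOWBAN_THRESHOLD = -10      # below this, submissions are ignored


@dataclass
class Submitter:
    name: str
    trust: int = 0


def record_review(submitter: Submitter, passed_automated_review: bool) -> str:
    """Steps 3 and 4: penalize a failed review, otherwise gate on trust."""
    if not passed_automated_review:
        submitter.trust -= REVIEW_FAIL_PENALTY
        return "feedback_sent"
    if submitter.trust >= HUMAN_REVIEW_THRESHOLD:
        return "queued_for_human_review"
    return "held_low_trust"


def record_feedback_outcome(submitter: Submitter, acted_on_feedback: bool) -> str:
    """Steps 3a and 3b: adjust trust based on whether feedback was addressed."""
    if acted_on_feedback:
        submitter.trust += ACTED_ON_FEEDBACK_BONUS
        return "resubmitted"
    submitter.trust -= IGNORED_FEEDBACK_PENALTY
    return "closed"


def is_shadowbanned(submitter: Submitter) -> bool:
    """True once repeated low-effort submissions have drained the score."""
    return submitter.trust <= SHADOWBAN_THRESHOLD
```

With these made-up numbers, a submitter who fails review once and then fixes the issues ends at -1, so a later passing submission is still held until more trust is earned. Whether that is too strict is exactly the kind of tuning the replies below argue about.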
In terms of your plan though, you're just building a generative adversarial network here. Automated review is relatively easy to "attack".
Yet human contributors don't put up with having to game an arbitrary score system. StackOverflow imploded in no small part because of it.
We need to rethink some UX design and processes here, not pretend low quality people are going to follow your "no low quality pls i'm serious >:(" rules. Rather, design the processes against low quality.
Also, we're in a new world where code-change PRs are trivial, and the hard part isn't writing code anymore but generating the spec. Maybe we don't even allow PRs anymore except for trusted contributors, and everyone else can only create an issue and help refine a plan there from which the code impl is derived?
You know, even before LLMs, it would have been pretty cool if we had a better process around deliberating and collaborating around a plan before the implementation step of any non-trivial code change. Changing code in a PR with no link to discussion around what the impl should actually look like always did feel like the cart before the horse.
And for the major projects where there was a flood of PRs, it was fairly easy to identify if someone knew what they were talking about by looking at their language: correct use of jargon, especially domain-specific jargon.
The broader reason why "unknown contributor" PRs were held in high regard is that, outside of some specific incidents (thank you, DigitalOcean and your stupid tshirts), the odds were pretty good of a drive by PR coming from someone who identified a problem in your software by using it. Those are incredibly valuable PRs, especially as the work of diagnosing the problem generally also identifies the solution.
It's very hard to design a UX that impedes clueless fools spamming PRs but not the occasional random person who finds sincere issues and has the time to identify (and fix) them, but not to commit to permanent project contribution.
> and the hard part isn't writing code anymore but generating the spec
My POV: This is a bunch of crap and always has been.
Any sufficiently detailed specification is code. And the cost of writing such a specification is the cost of writing code. Every time "low code" has been tried, it doesn't work for this very reason.
e.g. The work of a ticket "Create a product category for 'Lime'" consists not of adding a database entry and typing in the word 'Lime', it consists of the human work of calling your client and asking whether it should go under Fruit or Cement.
The latter is where you get all known contributors from! So if you close off unknown contributors the project will eventually stagnate and die.
Hmmm, no? That's actually very common in open source. Maybe "banning" isn't the right word, but lots of projects don't accept random drive-by submissions and never have. Debian is a perfect example, you are very unlikely to get a nontrivial patch or package into Debian unless you have some kind of interaction or rapport with a package maintainer, or commit to the process of building trust to become a maintainer yourself.
I have seen high profile GitHub projects that summarily close PRs if you didn't raise the bug/feature as an issue or join their discord first.
> you are very unlikely to get a nontrivial patch or package into Debian unless you have some kind of interaction or rapport with a package maintainer
I did mean the "trivial" patches as well, as often it's a lot of these small little fixes to single issues that improve software quality overall.
But yes, it's true that it's not uncommon for projects to refuse outside PRs.
This already causes massive amounts of friction and contributes (heh) heavily to what makes Open Source such a pain in the ass to use.
Conversely, many popular "good" open source libraries rely extensively on this inflow of small contributions to become comprehensively good.
And so it's a tradeoff. Forcing all open source into refusing drive-by PRs will have costs. What makes sense for major security-sensitive projects with large resources doesn't make sense for others.
It's not that we won't have open source at all. It's that it'll just be worse and encourage further fragmentation. e.g. One doesn't build a good .ZIP library by carefully reading the specification, you get it by collecting a million little examples of weird zip files in the wild breaking your code.
That's an OK view to hold, but I'll point out two things. First, it's not how the tech is usually wielded to interact with open-source software. Second, your worldview is at odds with the owners of this technology: the main reason why so much money is being poured into AI coding is that it's seen by investors as a replacement for the individual.
It seems that gun control—though imperfect—in regions that have implemented it has had a good bit of success and the legitimate/non-harmful capabilities lost seem worth it to me in trade for the gains. (Reasonable people can disagree here!)
Whereas it seems to me that if we accept the proposition that the vast majority of code in the future is going to be written by AI (and I do), these valuable projects that are taking hard-line stances against it are going to find themselves either having to retreat from that position or facing insurmountable difficulties in staying relevant while holding to their stance.
It is the conservative position: it will be easier to walk back the policy and start accepting AI produced code some time down the road when its benefits are clearer than it will be to excise AI produced code from years prior if there's a technical or social reason to do that.
Even if the promise of AI is fulfilled and projects that don't use it are comparatively smaller, that doesn't mean there's no value in that, in the same way that people still make furniture in wood with traditional methods today even if a company can make the same widget cheaper in an almost fully automated way.
This is even true despite the fact that there are bad actors only a few minutes drive away in many cases (Chicago->Indiana border, for example).
(as an aside - this reminds me of the trend of Object Oriented Ontology that specifically /tried/ to imbue agency onto large-scale phenomena that were difficult to understand discretely. I remember "global warming" being one of those things - and I can see now how this philosophy would have done more to obscure the dominion of experts wrt that topic)
This is the basis of the argument - it doesn't matter if you use AI or not, but it does matter if you know what you're doing or not.
McDonalds cooks ~great~ (edit: fair enough, decent) burgers when measured objectively, but people still go to more niche burger restaurants because they want something different and made with more care.
That's not to say that a human can't use AI with intent, but then AI becomes another tool and not an autonomous code generating agent.
Wait, what? In what world are McDonalds burgers "great"? They're cheap. Maybe even a good value. But that's not the same as great.
Some of the best burgers I've ever had came from fast food.
If you believe the outputs of LLMs are derivative products of the materials the LLMs were trained on (which is a position I lean towards myself, but I also understand the viewpoint of those who disagree), then no, that's not a good thing, because it would be a license violation to accept those derived products without following the original material's license terms, such as attribution and copyleft terms. You are now party to violating the original materials' copyright by accepting AI generated code. That's ethically dubious, even if those original authors may have a hard time bringing a court case against you.
In that case a lot of proprietary software is in breach of copyleft licences. It's probably by far the commonest breach.
> You are now party to violating the original materials' copyright by accepting AI generated code. That's ethically dubious
That is arguable. Is it always ethically dubious to breach a law? If not, why is it ethically dubious to breach this law in this particular way?
Sure, but this doesn't really seem relevant to the conversation. Someone else violating software license terms doesn't justify me (or Debian, in the case of TFA) doing so.
> Is it always ethically dubious to breach a law?
I'm not really concerned with the law, here. I think it is ethically dubious to use someone else's work without compensating them in the manner they declared. Copyright law happens to be the method we've used for a couple hundred years to standardize the discussion about that compensation, and sometimes enforce it. Breaching the law doesn't really enter into the conversation, except as a way our society agrees to hold everyone to a minimum ethical standard.
OK, that is reasonable. I do not think copyright is a good mechanism though, and I think the need to compensate depends on multiple factors depending on what you use a work for and under what circumstances.
Crystal ball or time machine?
Past performance does not guarantee future results, of course. But acting like AI is now magically going to stagnate is also a really bold bet.
I sincerely doubt that, because it still can't even generate a few hundred line script that runs on the first try. I would know, I just tried yesterday. The first attempt was using hallucinated APIs and while I did get it to work eventually, I don't think it can one shot a complex application if it can't one shot a simple script.
IMO, AI has already stagnated and isn't significantly better than it was 3 years ago. I don't see how it's supposed to get better still when the improvement has already stopped.
I routinely generate applications for my personal use using OpenCode + Claude Sonnet/Opus.
Yesterday I generated an app for my son to learn multiplication tables using spaced repetition algorithm and score keeping. It took me like 5 minutes.
Of course if you use ChatGPT it will not work but there is no way Claude Code/Open Code with any modern model isn't able to generate a one hundred line script on the first try.
If everything the maintainer wants can (hypothetically) be one-shotted, then there is no need to accept PRs at all. Just allow forks in case of open source.
I think a lot of anti-LLM opinions just come from interacting with the lowest effort LLM slop and someone not realizing that it's really a problem with a low value person behind it.
It's why "no AI allowed" is pointless; high value contributors won't follow it because they know how to use it productively and they know there's no way for you to tell, and low value people never cared about wasting your time with low effort output, so the rule is performative.
e.g. If you tell me AI isn't allowed because it writes bad code, then you're clearly not talking to someone who uses AI to plan, specify, and implement high quality code.
I disagree that the rule is pointless, and your last point is a strawman. AI is disallowed because it’s the manner in which the would-be contributors are attempting to contribute to these projects. It’s a proxy rule.
Unfortunately for AI maximalists, code is more than just letters on the screen. There needs to be human understanding, and if you’re not a core contributor who’s proven you’re willing to stick around when shit hits the fan, a +3000 PR is a liability, not an asset.
Maybe there needs to be something like the MMORPG concept of “Dragon Kill Points (DKP)”, where you’re not entitled to loot (contribution) until you’ve proven that you give a shit.
This isn't necessarily true; I've seen some projects absorb a PR of roughly that size, and after the smoke tests and other standard development stuff, the original PR author basically disappeared.
It added a feature he wanted, he tested and coded it, and got it in.
This anecdotal argument is a dead end. The nuance is clear: not all software is the same, and not all edits to software are the same.
Your argument has nothing to do with AI and more to do with PR size and 'fire and forget' feature merges. That's what the commenter you're responding to is pointing out.
And in the context of high-value contributors that GP was mentioning, they are never going to land a +3000 PR because they know there is going to be a human reviewer on the other side.
High-value contributors follow the rules and social mores of the community they are contributing to. If they intentionally deceive others, they are not high-value.
Like it's been years and years now, if all this is true, you'd think there would be more of a paradigm shift? I'm happy I guess waiting for Godot like everyone else, but the shadows are getting a little long now, people are starting to just repeat the same things over and over.
Like, I am so tired now, it's causing such messes everywhere. Can all the best things about AI be manifest soon? Is there a timeline?
Like what can I take so that I can see the brave new world just out of reach? Where can I go? If I could just even taste the mindset of the true believer for a moment, I feel like it would be a reprieve.
Human society exists because we value humans, full stop. The easiest way to "solve" all of humanity's problems is to simply say that humans aren't valuable. Sometimes it feels like we're conceding a ridiculous amount of ground on that basic principle every year - one more human value gone because it "doesn't matter", so hey, we've obviously made progress!
Is it really true that LLMs are non-deterministic? I thought if you used the exact input and seed with the temperature set to 0 you would get the same output. It would actually be interesting to probe the commit prompts to see how slight variants performed.
I think there can also be differences on different hardware, and also usually temperature is set higher than zero because it produces more "useful/interesting" outputs.
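For the seed and temperature question above, a toy sampler makes the distinction concrete. This is a sketch under stated assumptions, not how any particular LLM provider implements sampling: `sample_token` is a hypothetical helper, the logits are made up, and Python's `random.Random` stands in for the model's RNG.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick a token index from raw logits (hypothetical helper).

    temperature == 0 is treated as greedy decoding (argmax), which
    ignores the RNG entirely. Otherwise the logits are divided by the
    temperature and sampled from the resulting softmax distribution.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def run(seed, temperature, steps=20):
    """Sample `steps` tokens from fixed, made-up logits with a seeded RNG."""
    logits = [1.0, 3.0, 2.0]
    rng = random.Random(seed)
    return [sample_token(logits, temperature, rng) for _ in range(steps)]
```

In this sketch `run(42, 1.0)` returns the same sequence every time it is called, and `run(seed, 0)` always picks index 1 regardless of seed. The hardware point in the reply sits outside the sketch: floating-point sums computed in a different order can shift logits slightly, so near-ties may resolve differently even with greedy decoding.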
Quixotic, unworkable, pointless. It’s fundamentally impossible (at least without a level of surveillance that would obviously be unacceptable) to prove the “artisanal hand-crafted human code” label.
> contributors should "fully understand" their submissions and would be accountable for the contributions, "including vouching for the technical merit, security, license compliance, and utility of their submissions".
This is in the right direction.
I think the missing link is around formalizing the reputation system; this exists for senior contributors but the on-ramp for new contributors is currently not working.
Perhaps bots should ruthlessly triage un-vouched submissions until the actor has proven a good-faith ability to deliver meaningful results. (Or the principal has staked / donated real money to the foundation to prove they are serious.)
I think the real problem here is the flood of low-effort slop, not AI tooling itself. In the hands of a responsible contributor LLMs are already providing big wins to many. (See antirez’s posts for example, if you are skeptical.)
Difficulty of enforcing is a detail. Since the rule exists, it can be used when detection is done. And importantly it means that ignoring the rule means you’re intentionally defrauding the project.
https://news.ycombinator.com/item?id=47109952
They can spin up LLM-backed contributors faster than you can ban them.
Hence why banning AI contributions is meaningless, you literally only punish 'good' actors.
A lot of low quality AI contributions arrive using free tiers of these AI models, the output of which is pretty crap. On the other hand, if you max out the model configs, i.e. get "the best money can buy", then those models are actually quite useful and powerful.
OSS should not miss out on the power LLMs can unleash. Talking about the maxed out versions of the newest models only, i.e. stuff like Claude 4.5+ and Gemini 3, so developments of the last 5 months.
But at the same time, maintainers should not have to review code written by a low quality model (and the high quality models, for now, are all closed, although I heard good things about Minmax 2.5 but I haven't tried it).
Given how hard it is to tell which model made a specific output, without doing an actual review, I think it would make most sense to have a rule restricting AI access to trusted contributors only, i.e. maintainers as a start, and maybe some trusted group of contributors where you know that they use the expensive but useful models, and not the cheap but crap models.
Both can look like the same exact type of AI-generated code. But one is a broken useless piece of shit and the other actually does what it claims to do.
The problem is just how hard it is to differentiate the two at a glance.
The problem is having an unwritten rule is sometimes worse than a written one, even if it "works".