> Parallel semantic analysis has been an explicitly planned feature of the Zig compiler for a long time, and it has heavily influenced the design of the self-hosted Zig compiler. However, implementing this feature correctly has implications not only for the compiler implementation, but for the Zig language itself! Therefore, to implement this feature without an avalanche of bugs and inconsistencies, we need to make language changes.
A 3000-line LLM commit is not that.
I feel like if their goal is to prioritize contributors over contributions, it'd also logically follow that they should try to have descriptions where possible? Just to make exploring any set of changes and learning easier? Looked it over briefly, no Markdown or similar doc changes there either.
I mean the changes can be amazing, it's just that adding some description of what they are in more detail, alongside the considerations during development, for new folks or anyone wanting to learn from good code would also be due diligence.
Edit: Okay, I set the bar too high here with "best human developer" and vague "good AI processes". My bad. Yes, LLMs are not quite there yet.
We're already at the point talking about best vs. best.
We definitely are not close to that point though and it's unclear if/when we will get there.
If I do the latter and submit a PR to something like Zig, I'll be certainly caught doing it and rightfully chastised. If I do the former, my PR will be better without anybody besides myself having any way of knowing how it got better. Probably I do something in between when I contribute to open-source these days.
Blanket banning all of these seems like a bad idea to me. It actively gates people like myself from contributing, because I respect these people and projects that much. It feels like I would be doing something they find disgusting if my work has touched an LLM and I obviously don't want to do that to people I respect. But I do not presume to have any say on the Zig project's decisions. Their point of preferring human contact is superb, frankly. Probably a different kind of problem in an open-source project staffed with a lot of remote working people, where human contact is scarce.
Because the pro-group are whining that the policy is preventing the merge, when in actual fact even if the policy did not exist, the PR is crap anyway.
In this case it isn't the blocker - the fact that the dev took the time to read the PR in detail, comment on it, and provide reasons why it could not be merged makes it very clear to me that the policy wasn't the blocker.
If they were going to enforce the policy for this PR, they wouldn't have bothered to read it. The only reason to read it is to see if the policy is waived for this specific PR.
As the Zig maintainer so patiently explained, no amount of "polish" can fix the PR because it is misaligned to the correctness that they require.
IOW, that PR is so far off the reservation, unless it is completely rewritten, it won't be accepted.
Rewriting PRs with LLMs is cheap, but often the output is no better than the previous revision (fixing one issue only to cause another one is very common IME). And reviewing each revision of the PR is not cheap.
I've had good experiences with people submitting AI generated PRs who then actually take the time to understand what's going on and fix issues (either by hand or with a targeted LLM generated fix) that are brought up in review. But it's incredibly frustrating when you spend an hour reviewing something only to have someone throw your review comments directly back at the LLM and have it generate something new that requires another hour of review.
In this case it looks like the answer is "Yes"; the PR was not dismissed immediately, it was first examined in great detail!
Why would the maintainer expend effort on something that was going to be rejected anyway?
I don't understand this PoV - have you ever come across a policy in any environment that wasn't subject to case-by-case exceptions?
Even in highly regulated environments (banking/fintech, insurance, medical, etc.), policies are subject to exceptions and exemptions, done on a case-by-case basis.
The notion, in this specific case, that "well they rejected it because of policy" is clearly nonsense and I don't understand why people are pushing this so hard when the explanation of why an exemption can't be made for this specific PR is public, accessible and, I feel, already public knowledge.
A healthy contributor community is more important than mere code performance, quantity of features or lines of code, etc.
The same argument applies to open source itself. Why use someone's project when you can just have the robot write your own? It's especially true if the open source project was vibe coded. AI and technology in general makes personalization cheap and affordable. Whereas earlier you had to use something that was mass produced to be satisfactory for everyone, now you have the hope of getting something that's outstanding for just you. It also stimulates the labor economy, because you have lots of people everywhere reinventing open source projects with their LLMs.
I've been thinking about this a bunch recently, and I've realized that the thing I value most in software now isn't robust tests or thorough documentation - an LLM can spit those out in a few minutes. It's usage. I want to use software which other people have used before me. I want them to have encountered the bugs and sharp edges and sanded them down.
The sanding down you refer to is what generates those tests and documentation.
It may be able to spit out text that purports to be that, in a few minutes. But for most software, an LLM will not be able to spit out robust tests - let alone useful documentation. (And documentation which just replicates the parameter names and types is thorough...ly useless.)
So it's just the fact that others have already gone through the motions before I did. That's it really. I suppose in commercial settings, this is even more true and perhaps extends to compliance.
I regularly do both when trying to use a library, especially one unfamiliar to me.
Lolz! I haven’t encountered “code that institutions had been keeping to themselves” that got even remotely close to OSS in quality.
I have worked for several decades at many companies, located in many countries, on a few continents, from startups to some of the biggest companies in their fields. Therefore I have seen many proprietary programs.
On average, proprietary programs are not better than open-source programs, but usually worse, because they are reviewed by fewer people and because frequently the programmers who write them may be stressed by having to meet unrealistic timelines for the projects.
Proprietary programs have greater quantity, not quality, because they are written by a greater number of programmers working full-time on them, while much work on open-source projects is done in spare time by people occupied with something else.
Many proprietary programs can do things which cannot be done by open-source programs, but only because of access to documentation that is kept secret in the hope of preventing competition.
While lawyers, and other people who do not understand how research and development is really done, put a lot of weight on the so-called "intellectual property" of a company, which they believe to be embodied in things like the source code of proprietary programs or the design files for some hardware, the reality is that I have not seen anything of substantial value in this so-called IP anywhere. Everywhere, what was really valuable in the know-how of the company was not the final implementation that could be read in some source code, but the knowledge about the many other solutions that had been tried before and had worked worse or not at all. This knowledge was too frequently not written down in any documentation. Knowing which are the dead ends is a great productivity boost for an experienced team, because any recent graduate could list many alternative ways of solving a problem, but most of them would not be the right choice in certain specific circumstances.
The whole point of having a civilization is that most things in life can be made someone else's problem and you can focus on doing one thing well. If I'm a dentist or if I run a muffler shop, there are only so many hours in a day, so I'd probably rather pay a SaaS vendor than learn vibecoding and then be stuck supervising a weird, high-maintenance underling that may or may not build me the app with the features I need (and that I might not be able to articulate clearly). There are exceptions, but they're just that, exceptions. If a vendor is reasonable and makes a competent product, I'll gladly pay.
The same goes for open source... even if an LLM could reliably create a brand new operating system from scratch, would I really want it to? I don't want to maintain an OS. I don't want to be in charge of someone who maintains an OS. I don't necessarily trust myself to have a coherent vision for an OS in the first place!
The Zig project is certainly far beyond such capability.
And you indeed get a lot of wheel reinvention by LLMs because that is now cheap to do. So rather than using some obscure thing on Github (like my stuff), it's easier to just generate what you need. I've noticed this with my own choices in dependencies as well. I tend to just go with what the LLM suggests unless I have a very good reason not to.
Maybe this will be a real problem in a couple years though.
So centralizing that common work is a benefit of open-source just as much with LLMs as it was before.
As someone who recently started using OpenSCAD for a project I find this attitude quite irritating. You certainly did not "have to" use popular tools.
The OpenSCAD example is particularly illuminating because it's fussy and frustrating and clearly tuned towards a few specific maintainers; there's a ton of things I'd like changed. But I would never trust an LLM to do it! "Oh the output looks fine, cool" is not enough for a CAD program. "Oh, there are a lot of tests, cool" great, I have no idea what a thorough CAD test suite looks like. I would be a reckless idiot if I asked Claude to make me a custom SCAD program... unless I put in a counterproductive amount of work. So I'm fine with OpenSCAD.
I am also sincerely baffled as to how this stimulates the "labor economy." The most obvious objection is that Anthropic seems to be the only party here getting any form of economic benefit: the open-source maintainers are just plain screwed unless they compromise quality for productivity, and the LLM users are trading high-quality tooling built by people who understand the problem for shitty tooling built by a robot, in exchange for uncompensated labor. It only stimulates the "labor economy" in a Bizarro Keynesian sense, digging up glass bottles that someone forgot to put the money in.
I have seen at least 4 completely busted vibe-coded Rust SQLite clones in the last three months, happily used by people who think they don't need to worry their pretty little heads with routine matters like database design. It's a solved problem and Claude is on the case! In fact unlike those stooopid human SQLite developers, Claude made it multithreaded! So fucking depressing.
You definitely need to have a strong sense of code design though. The AIs are not up to writing clean code at project scale on their own, yet.
Not trying to denigrate the work here, as such. But this certainly didn't convince me that using AI to replace OpenSCAD (or any other major open-source project) is a good idea. The LLMs still aren't even close to being able to pull it off.
Civilization isn't monotonic. People keep solving the same problems over and over again, telling the same stories with a different twist. For example in 1964 having a GUI work environment with a light pen as your mouse was a solved problem on IBM System/360. They had tools similar to CAD. So why don't we all just use that rather than make the same mistakes again. Each time a new way of doing things comes out, people get an opportunity to rewrite everything.
Because it is incredibly expensive to write a replacement for semi-complex software? Good luck asking frontier models to write a replacement for Zig, Docker, VSCode, etc.
Iff it is doable, then it would be worth considering as an alternative.
> It also stimulates the labor economy, because you have lots of people everywhere reinventing open source projects with their LLMs.
not sure what you mean by that
That's a fair thing to ask, though it seems like people will arrive at very different conclusions there.
While I haven't codified it anywhere, the policy I would like is for issues and PR descriptions to have no LLMs - there is no reason to ban code completely though IMO. I would say that would be pro human-communication and a stance I would like a lot.
perhaps that's what the maintainers should be doing after all. it still takes time and tokens, though; neither is free.
I'd personally rather have the maintainers spend the time writing as much docs and specs as possible so the future LLMs have strong guardrails. zig's policy will be completely outdated in a couple years, for better or worse. someone will take bun's fork, add a codegen improvement here, add a linker improvement there and suddenly you'll have a better, faster zig outside of zig.
Someone forking it and making it better with AI is a possibility. If that happens, we will know it was better for the project for the maintainers to just review the code. If that happens, they can probably become maintainers in the fork. Or maybe they don't like that work and could just go do something else.
This already happens to some degree on large software projects with corporate backing (Web engines, compilers, etc.), where it is often not trivial to start contributing as an independent individual.
Reasonable people can disagree on whether one approach is inherently better than the other, as ultimately they seem to be optimising for different goals.
If I have a test harness, and LLM workflow setup, it is easier to just write new code myself. I am not giving away my "secret sauce". And I will not have a debate "why this simple feature needs 1000 new tests...", and two days just to make a full release build.
For merge I have to do 99% of work anyway (analyze, autotest, build, smoke, regression test). I usually merge smaller commits just to be polite (and not to look like one man show), but there is no way to accept large refactoring!
I think this is a great policy by the Zig team.
Doing manual reviews of everything is very labor intensive and not scalable. However, AIs are pretty good at doing code reviews and verifying adherence to guard rails, contributor guidelines, and other rules. It's not perfect, but it's an underused tool. Both by reviewers and contributors. If your contribution obviously doesn't comply with the guidelines, it should be rejected automatically. The word "obviously" here translates into "easy to detect with some AI system".
Projects should be using a lot of scrutiny for contributions by new contributors. And most of that scrutiny should be automated. They should reserve their attention for things that make it past automated checks for contribution quality, contributor reputability, adherence to whatever rules are in place, etc. Reputability is a good way to ensure that contributions from reputable sources get priority. If your reputation is not great, you should expect more scrutiny and a lower priority.
Until the contributions are cheap and correct, you need valuable contributors more than you need the contributions.
Your point would be valid when we get to a point of contributions all being both correct and cheap. Right now they are only cheap.
It takes like 5 minutes to spot garbage PRs manually. An LLM can flood you with a wall of text where only half of the stuff makes sense. Also, they can't really spot bad architecture. It's a compiler in an unpopular language, don't forget that.
The real bottleneck when you want to grow is connecting with the right people. An LLM is not helping with that if you want to build a community. When you use an LLM to skip the need to understand a problem, how are you ever going to get a reputation that I can trust?
The post is not about reputation; it is about seeing how people respond and work with you in a community.
EDIT: I see that you frame it as a help and a tool and sure it might work, but I feel like it is just another obstacle.
I suggest we also automate the distribution and the use of software with AI as well, and then just all go to the beach and sip on some cocktails or something.
Or in other words: Good luck with that.
For those who are pissed because a large OSS project isn't accepting LLM generated slop: Fuck off!
https://ziggit.dev/t/bun-s-zig-fork-got-4x-faster-compilatio...
>There’s the 4x speedup claimed by the Bun team, already available on Zig 0.16.0!
>Each [incremental] update is taking less than 0.4s, compared to the 120+ seconds taken to rebuild with LLVM. In other words, incremental updates are over 300 times faster on this codebase than fresh LLVM builds are. In comparison, an enhancement capped at a 4x improvement is pretty abysmal. [..] Again, this feature is available in Zig 0.16.0—you can use it!
That's exactly the sketchy part here. They turned down known, working and tested, code that came from a partner (bun) due to this policy. Code that 4x'd compile speed.
A general ban makes sense based on their rationalization ("contributor poker"[0]). A total and inflexible ban can lead to a worse outcome for everyone though.
If a senior, experienced, contributor vouches for the code it shouldn't matter if they hand crafted it on stone tablets, generated it with yarrow sticks, or used gpt-3.
No; they turned it down because the vibe-coded PR was crap.
> The rewritten type resolution semantics were designed to avoid these issues, but Bun’s Zig fork does not incorporate the changes (and has not otherwise solved the design problems), which means their parallelized semantic analysis implementation will exhibit non-deterministic behavior. That’s pretty much a non-starter for most serious developers: you don’t want your compilation to randomly fail with a nonsense error 30% of the time.
The flip side of that is that if such a contributor vouches for code that turns out to be poor-quality, this should severely damage their reputation. I've found far too many "senior" developers will give AI a pass on poor coding practices.
> Put more simply, we are going to make these enhancements, but hacking them in for a flashy headline isn’t a good outcome for our users. Instead we’re approaching the problem with the care it deserves, so that when we ultimately ship it, we don’t cause regressions.
These exact changes are already on the roadmap and Bun’s PR is rushing ahead.
The conclusion does not follow from the premises. They are assuming fully autonomous agents submitting PRs, not using LLMs as tools like Bun did. As always, the most ethical thing to do is to just ignore any anti-LLM policies and not disclose anything.
How does this have anything to do with ethics? It's their project, not yours; they can reject your PR for whatever reason, including you using LLMs for developing that PR. Also they're not assuming autonomous agents submitting PRs. They're saying that they do not accept PRs where any part of the thinking process was outsourced to an LLM.
Even if you disagree with their opinion, the ethical thing to do is to not interact and move on. Not to try to sneak in your LLM-assisted PRs without the maintainers' consent.
"Unfortunately the reality of LLM-based contributions has been mostly negative for us, from an increase in background noise due to worthless drive-by PRs full of hallucinations (that wouldn’t even compile, let alone pass CI), to insane 10 thousand line long first time PRs. In-between we also received plenty of PRs that looked fine on the surface, some of which explicitly claimed to not have made use of LLMs, but where follow-up discussions immediately made it clear that the author was sneakily consulting an LLM and regurgitating its mistake-filled replies to us."
Why are they often so desperate to lie and non-consensually harass others with their vibing rather than be honest about it? Why do they think they are "helping" with hallucinated rubbish that can't even build?
I use LLMs. It is not difficult to: ethically disclose your use, double check all of your work, ensure things compile without errors, not lie to others, not ask it to generate ten paragraphs of rubbish when the answer is one sentence, and respect the project's guidelines. But for so many people this seems like an impossible task.
Because they can't tell the difference between what the machine is outputting, and what people have built. All they see is the superficial resemblance (long lines of incomprehensible code) and the reward that the people writing the code have got, and want that reward too.
AI is absolutely terrible for people like that, as it's the perfect enabler.