I'm not (just) being glib. That earlier article displays some introspection and thoughtful consideration of an old debate. The writing style is clearly personal, human.
Today's post is not so much. It has LLM fingerprints on it. It's longer, there are more words. But it doesn't strike me as having the same thoughtful consideration in it. I would venture to guess that the author tried to come up with some new angles on the news of the Claude Code leak, because it's a hot topic, and jotted some notes, and then let an LLM flesh it out.
Writing styles of course change over time, but looking at these two posts side by side, the difference is stark.
I made a commitment to write more this year and put my thoughts out quicker than I used to, so that’s likely the primary reason it’s not as deep of a piece of writing as the post you’re referencing. But I do want to note that this wasn’t written using AI, it just wasn’t intended to be as rich of a post.
The reason it came out longer is that I’ve honestly been thinking about these ideas for a while, and there is so much to say about this subject. I didn’t have any particular intention of hopping on a news cycle, but once I started writing the juices were flowing and I found myself coming up with five separate but interrelated thoughts around this story that I thought were worth sharing.
The first known use in English comes from a 1658 translation of a 1657 letter by Blaise Pascal:
> Je n’ai fait celle-ci plus longue que parce que je n’ai pas eu le loisir de la faire plus courte.
translated to
> I had not made this longer then the rest, but that I had not the leisure to make it shorter then it is.
(note the archaic then)
This was a popular piece of wit at the time.
Mark Twain wrote something similar about two hundred years later:
> You'll have to excuse my lengthiness - the reason I dread writing letters is because I am so apt to get to slinging wisdom & forget to let up. Thus much precious time is lost.
But it's still quite different.
There is a great article about this one on quoteinvestigator! https://quoteinvestigator.com/2012/04/28/shorter-letter/
If you have a strategy for jotting down (or dictating) notes while walking about, I would be curious how you manage it. I spend plenty of time walking outside and tend to get ideas that seem (at the time) worth exploring further, most of which have evaporated from my mind by the time I get back home, or even before I can get my phone out to jot down the keywords that would help me recall the details later.
Cannot even imagine how someone would manage both walking and writing at the same time.
Apropos of nothing, this astonishes me to no end. The ergonomics of 1) using a phone keyboard for anything but a word or two and 2) doing so while walking pretty much guarantee that I'd need half a day to recover if I attempted the same.
With the exception of things that places like HN seem to consider worth reading, which is why I'm looking through the comments on this and other posts to find recommendations.
We're becoming wary due to the abuse of AI and the proliferation of sloppy content, but also because we often have trouble distinguishing authentic content from slop.
Another feature of this AI era that I hate.
“This is AI” seems to be just an evolution of other thought-terminating clichés, where the negative conditioning associated with something is used in an abusive and manipulative way to evade a challenge, or the truth itself. It is a common tactic of abusive people: the “beyond the pale” moralizing.
What is interesting and has possibly bled over from heavy LLM use by the author is the style of simplistic bullet point titles for the argument with filler in between. It does read like they wrote the 5 bullet points then added the other text (by hand).
Speaking for myself, I long for the day I can dump the comparatively garbage experience of Claude Code for something more enjoyable and OSS like OpenCode. But the fact is that it is simply not economically viable to do so.
So the PMF is not really for Claude Code alone -- it is for Claude Code + Claude Max.
There's even a GUI called claudia for a piecemeal extraction with a PRD.
https://github.com/kristopolous/Claudette
I've got a web, rust and tkinter version (for fun) right now just making sure this approach works.
The answer is... Mostly...
Enjoy
Seems like it would be a nightmare to provide evidence of which parts of a half-million-line codebase were written by humans if no one bothered to track it.
Code doesn't matter IN THE EARLY DAYS.
This is similar to what I've observed over 25 years in the industry. In a startup, the code doesn't really matter; the market fit does.
But as time goes on your codebase has to mature, or else you end up using more and more resources on maintenance rather than innovation.
Is it possible to start with something of this size that's vibe coded and refactor your way into something resembling a human codebase?
> This is similar to what I've observed over 25 years in the industry. In a startup, the code doesn't really matter; the market fit does.
> But as time goes on your codebase has to mature, or else you end up using more and more resources on maintenance rather than innovation.
Counterpoint: Code does matter, in the early days too!
It matters more after you have PMF, but that doesn't mean it doesn't matter pre-PMF.
After all, the code is a step-by-step list of instructions on solving a specific pain point for a specific target market.
1. Find a potential customer who's excited about the idea of what you're going to build.
2. Build just enough to make them a mostly happy, paying customer while you secure more customers.
3. Now that you have a few customers, you have a better idea of where your architecture and business flow doesn't fit their needs.
4. Adapt to this reality, and make things robust enough that you're not spending too much time on customer support.
In less than four years the AI coding workflow has been overhauled at least twice: from chat interfaces (ChatGPT) to editor integration (Cursor), then to CLI agent harnesses (CC/Codex). It would be crazy to assume that harnesses are the end of that evolution.
Except, apparently, Anthropic - who are doing their darndest to get everyone onboard their tools as a moat. Apparently that's the only strategy for AI stickiness.
Claude Code 3.0 (and other agent tools) are not expected to be mature. They'll all be obsolete in two or three years, replaced by the next generation of AI tools. Everyone knows that.
And so on and on and on.
One of AI's promises was mature software.
Claude Code is strictly worse than e.g. OpenCode in my experience. Not much to see in the app’s code except how it authenticates itself…
Sure I try and use all my subscription allowance with CC on side tasks, etc. but I still end up burning a bunch of API tokens (via OpenRouter) for more serious work (even the UI and ability to quickly review what the agent has done/is doing is vastly inferior in CC).
What they have done is got me experimenting with cheaper models from other providers with those API credits.
Given the output speed, it's practically impossible for developers to keep up, which directly impacts maintenance: the knowledge that previously resided in-house is now becoming dependent on having codebases pre-processed by LLMs.
I hope in the near future local LLMs will gain traction and provide an alternative, otherwise we are in the risky path where businesses are over-reliant on a few big companies.
But now everything is, "ship as fast as is humanly possible, literally" from management, and "garbage Claude-written PRs" from devs. Trying to maintain sanity over my monorepo is impossible.
We have nearly a century of examples of "somebody who only mostly understands the system making a breaking change," and now we've decided, "what the hell, this thing is called Claude, so it can wreak havoc for as long as corporate decides."
But you can use AI to improve your codebase too. Plus models are only going to get smarter from here (or stay the same).
If you're dealing with functionality that can be split into microfeatures/microservices, then anything you need right now can potentially be vibe-coded, even on the fly (and deleted afterwards). Single-use code.
>But as time goes on your codebase has to mature, or else you end up using more and more resources on maintenance rather than innovation.
Maintenance is a tremendous resource sink in enterprise software. Solving it, or even just making it avoidable (maybe Anthropic goes that way and leads the others), would be a huge revolution.
- building functionalities as components that are swappable on a whim requires a level of careful thought, abstraction and architecture that essentially is the exact opposite to ai slop
- in this day and age we still don't make software for its own sake, and whoever is financing it doesn't generally require such levels of functional flexibility (the physical-world requirements driving the coding aren't nearly volatile enough to justify it)
- this comes loaded with the implication that "stuff needs to work": if you are developing software that manages inventory, orders, resources, ... you just can't take the chance of corrupting your customers' data or disrupting their business processes. Shipping faster than you can test, with no accountability and no oversight, is a solution to a problem I've personally never encountered in the wild
That is really only for humans. Why do we need this careful thought, abstraction, and architecture? Because otherwise the required code becomes an unmanageable pile of spaghetti handling a myriad of edge cases, abstraction leaks, and unexpected side effects. The human brain can't manage it. AI can, or at least soon will be able to. It will just be a large pile of AI slop.
It may also happen that AI will start generating good component-based architecture if forced to minimize, or improve in some other measurable way, its slop.
Have you seen the code generated by AI? These things converge on the "1 million lines to make an API call" pattern. They're a lot of things, but certainly not "micro".
The product hasn't been around long enough to decide whether such an approach is "sustainable". It is currently in a hype state and needs more time for that hype to die down and the true value to show up, as well as to see whether it becomes the 9th circle of hell to keep in working order.
I have come to the conclusion that we just do not know yet. There is a part of me that believes there is a point somewhere on the grand scale where code quality genuinely does not matter if the outcome is reliably and deterministically achieved. (As an image, I like to think of WALL-E literally compressing garbage into a cube.)
This would ignore maintenance costs (time and effort inclusive.) Those matter to an established user base (people do not love change in my experience, even if it solves the problem better.)
On the other hand, maybe software is meant to be highly personal and not widely general. For instance, I have had more fun in the past two years than the entire 15 years of coding before it, simply building small custom-fitted tools for exactly what I need. I aimed to please an audience of one. I have also done this for others. Code quality has not mattered all that much, if at all. It will be interesting to see where things go.
But there are also quite a lot of confident "code quality" fluff claims that have nothing to do with maintainability, robustness, or performance. Fairly often, a guy claiming "the code is garbage" is not actually concerned with code quality so much as with asserting dominance, or is confusing his own preference with quality.
It's not. My favorite example: due to vibe coding overload literally nobody knows what configuration options OpenClaw now supports. (Not even other LLM's.)
Their "solution" is to build a chat bot LLM that will attempt to configure OpenClaw for you, and hope for the best, fingers crossed. Yes, really.
My setup is very simple too, just two agents, some MD files, and discord. Nothing else. These people using it for real work or managing their email and texts are in for a rough ride.
Non-trivial things tend to be much more sensitive to code quality in my experience, and will by necessity be kept around for longer and thus be much more sensitive to maintenance issues.
If you are a serious software developer, then you can probably explain both what your code does (i.e. what spec it implements) and how it does it (what does it call, what algorithms does it use, what properties does it rely on?). With the advent of LLMs, people have started to accept not having a clue about the "how", and I fear that we are also starting to sacrifice the "what".

Unless our LLMs get context windows large enough to hold the source of the full software stack, including LLM-generated dependencies, I think that sacrificing the "what" is going to lead to disaster. APIs will be designed and modified with only the use cases that fit in the modifying agent's context window in mind, with little regard for downstream consequences and stability of behavior, because there is not even a definition of what the behavior is supposed to be.
I hear this narrative being pushed quite a bit, and it makes my spidey senses tingle every time. Secure programs are a subset of correct programs, and to write and maintain correct programs you need to have a quality mindset.
A 0-day doesn't care if it's in a part of your computer you consider trivial or not.
Mind you, I'm not using LLMs for professional programming since I prefer knowing everything inside and out in the code that I work on, but I have tried a bunch of different modes of use (spec-driven + entire implementation by Opus 4.6, latest Codex and Composer 2, and entirely "vibecoded", as well as minor changes) and can say that for trivial in-house things it's actually usable.
Do I prefer to rewrite it entirely manually if I want something that I actually like? Yes. Do I think that not everything needs to be treated that way if you just want an initial version you can tinker with? Also yes.
I was replying to the statement that "maybe code quality is really not that important for trivial things", not whether LLM's are good at analysis nor not.
Thanks for the link though, looks like an interesting talk!
Fixed it for you.
Seems like the phrase "clean room" is the new "nonplussed"... how does this make any sense?
[^1]: https://bsky.app/profile/mergesort.me/post/3mihhaliils2y
Then use Anthropic's own argument that LLM output is original work and thus not subject to copyright.
Does this still count as clean-room? Or what if the model wasn't the same exact one, but one trained the same way on the same input material, which Anthropic never owned?
This is going to be a decade of very interesting, and probably often hypocritical lawsuits.
If one person writes the spec from the implementation and then also writes the new implementation, it is not clean-room design.
There are other details of course (is the old code in the training data?) but I'm not trying to weigh in on the argument one way or the other.
Sure, the weights are where the real value lives, but if the quality is so lax they leak their whole codebase, maybe they are just lucky they didn’t leak customer data or the model weights? If that did happen, the entire business might evaporate overnight.
Actually wait, it's worse than that. The product works, demo looks great. Then someone opens the network tab and ... yeah. "Quality doesn't matter" really just means nothing caught fire yet.
Its creators clearly don't care about the efficiency of how it is built, which translates directly into how it runs.
This blog post is effectively apologizing for all this being alright, since at least they got product-market fit. Except Anthropic is never going to go back and clean up the mess once (if) they become profitable.
I doubt anyone will like how things will be in 5 years time if this trend of releasing badly engineered spaghetti continues.
Code quality tends to have an impact on more than just aesthetics - and Claude Code certainly feels like a buggy mess from an end user's perspective.
Of course people still use Claude Code, but that is certainly because of the underlying models first and foremost. Most products don't have such a moat and would not see nearly as much tolerance from end users. If the Max subscriptions could be used with other harnesses, I am sure Anthropic would have to compete harder on the quality of the harness (to be fair, most AI-based tooling seems pretty alpha these days, but eventually things will stabilize).
Polish is not everything, clearly, but it is a factor, and I feel Claude Code is maybe the worst example to use here, as it doesn't at all generalize to most other products.
That being said, if you're just beginning and looking for your market fit, or pitching to investors with a flashy demo, it doesn't need to be an architectural miracle, in fact it will waste your time.
An extra one (3): we are getting super lenient with major failures, treating services that have only one 9 on the reliability charts as the norm.
1. The code is garbage and this means the end of software.
Now try maintaining it.
2. Code doesn’t matter (the same point restated).
No, we shouldn’t accept garbage code that breaks e.g. login as an acceptable cost of business.
3. It’s about product market fit.
OK, but what happens after product market fit when your code is hot garbage that nobody understands?
4. Anthropic can’t defend the copyright of their leaked code.
This I agree with and they are hoist by their own petard. Would anyone want the garbage though?
5. This leak doesn’t matter
I agree with the author but for different reasons - the value is the models, which are incredibly expensive to train, not the badly written scaffold surrounding it.
We also should not mistake current market value for use value.
Unlike the author who seems to have fully signed up for the LLM hype train I don’t see this as meaning code is dead, it’s an illustration of where fully relying on generative AI will take you - to a garbage unmaintainable mess which must be a nightmare to work with for humans or LLMs.
I feel the author is just stating the obvious: code quality has very little to do with whether a product succeeds
Wut? The value in the ecosystem is the model. Harnesses are simple. Great models work nearly identically in every harness
I tried to build my own harness once. The amount of work required is incredible: from how external memory is managed per session to the techniques for saving on the context window. For example, you do not want the llm to read in whole files; instead you give it the capability to read chunks from offsets, but then you have to decide what should stay in context and what should be pruned.
After that you have to start designing the think - plan - generate - evaluate pipeline. A learning moment for me here was to split out the evaluation of the work, because the same llm that did the work should not evaluate itself; it introduces a bias. Then you realize you need subagents too, and start wondering how their context will be handled (maybe return a summarized version to the main llm?).
And then you have to start thinking about integration with mcp servers and how the llm should invoke things like tools, prompts, and resources from each mcp. I learned that llms, especially the smaller ones, tend to hiccup and return malformed json.
At some point I started wondering about throwing everything out and just looking at PydanticAI or LangChain or LangGraph or Microsoft AutoGen to operate everything between the llm and the mcps. It's quite difficult to make something like this work well, especially for long-horizon tasks.
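To make the "read chunks from offsets" point concrete, here's a minimal sketch of such a tool. The function name, chunk size, and return shape are my own assumptions, not from any particular harness; the idea is just that the model only ever pulls in the slice of a file it asks for:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    offset: int   # byte offset where this chunk starts
    text: str     # decoded chunk contents
    eof: bool     # True if the file ends within this chunk

def read_chunk(path: str, offset: int = 0, max_bytes: int = 4096) -> Chunk:
    """Tool exposed to the model: read at most `max_bytes` starting at
    `offset`, so whole files never land in the context window."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(max_bytes + 1)  # read one extra byte to detect EOF
    eof = len(data) <= max_bytes
    data = data[:max_bytes]
    return Chunk(offset=offset, text=data.decode("utf-8", errors="replace"), eof=eof)
```

The harness registers this as a tool and the model requests successive offsets; the pruning decision (which already-read chunks stay in context) is then a separate policy on top.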
I agree that good models have more value because a harness can't magically make a bad model good, but there's a lot that would be inordinately difficult without a proper harness.
Keeping models on rails is still important, if not essential. Great models might behave similarly in the same harness, but I suppose the value prop is that they wouldn't behave as well on the same task without a good harness.
The harness matters A LOT.
The model is the engine, the harness is the driver and chassis. Even the best top of the line engine in a shitty car driven by a bad driver won't win any races.
Seems wrong. Devs will whine, moan, and nitpick about even free software, but they can understand failure modes, navigate around bugs, and file issues on GitHub. The quality bar is 10-100x higher amongst non-techno-savvy folks and enterprise users who are paying for your software. They're far more “picky”.
Seriously, if Anthropic were like oAI and let you use their subscription plans with any agent harness, how many users would CC instantly start bleeding? They're #39 on Terminal-Bench, and they get beaten by a harness that provides a single tool: tmux. You can literally get better results by giving Opus 4.6 only a tmux session and having it do everything with bash commands.
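The single-tool harness really is that small. A toy sketch, with the model call stubbed out (`call_model` and the action tuples are my own invention; a real harness would call a provider API and parse its tool-call format):

```python
import subprocess

def run_shell(cmd: str, timeout: int = 60) -> str:
    """The one and only tool: run a shell command, return combined output."""
    proc = subprocess.run(cmd, shell=True, capture_output=True,
                          text=True, timeout=timeout)
    return proc.stdout + proc.stderr

def agent_loop(call_model, task: str, max_steps: int = 20) -> str:
    """call_model(transcript) returns ("run", cmd) or ("done", answer).
    Everything the model has seen so far lives in `transcript`."""
    transcript = [f"TASK: {task}"]
    for _ in range(max_steps):
        action, payload = call_model(transcript)
        if action == "done":
            return payload
        output = run_shell(payload)
        transcript.append(f"$ {payload}\n{output}")
    return "step budget exhausted"
```

Everything else (tmux panes, context trimming, safety limits) is refinement on top of this loop; the point is that the model, not the harness, does nearly all the work.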
It seems premature to make sweeping claims about code quality, especially since the main reason to desire a well architected codebase is for development over the long haul.
Yes, exactly. Products.
It seems like I, and all the engineers I've known, have always had this established dichotomy: engineers, who want to write good code and to think a lot about user needs, and project managers/executives/salespeople, who want to make the non-negative numbers on accounting documents larger.
The truth is that to write "good software," you do need to take care, review code, not single-shot vibe code, and not let LLMs run rampant. The other truth is that good software is not necessarily a good product; the converse is also true: a bad product doesn't necessarily mean bad software. However, there's not really a correlation, as this article points out: terrible software can be a great product! In fact, if writing terrible software lets you shit out more features more quickly, you'll probably come out ahead in the business world over someone carefully writing good software but releasing more slowly. That's because the priorities and incentives in the business world are often in contradiction to the priorities and incentives in the human world.
I think this is hard to grasp for those of us who have been taught our whole lives that money is a good scorekeeper for quality and efficacy. In reality it's absolutely not. Money is Disney bucks recording who's doing Disney World in the most optimal way. Outside of Disney World, your optimal in-park behavior is often suboptimal for out-of-park needs. The problem is we've mistaken Disney World for all of reality, or, let Walt Disney enclose our globe within the boundaries of his park.
> The object which labor produces confronts it as something alien, as a power independent of the producer.
Most corporations never give code a single thought.
In the race to market, quality always suffers, and with such high stakes, it should surprise no one that AI companies are vibe-coding their own slop.
First, the Twitter quote is standard toxic clapback nonsense. Gambling makes billions and does not add any value. Even Facebook can argue it adds more value than gambling, so this one is a dud.
People use Claude Code because of Claude the model, not Claude the harness. Cursor, or a hacked-up agent loop using Opus or whatever, is about as good. The magic is in the model, not the harness. This isn't to say the harness doesn't do anything.
The other bit this misses is that yes, the product matters more than the code, and if the product burns battery/RAM/etc. doing nothing because the AI wrote crappy code, or something leaks or has a security issue, then that impacts the product.
This just validates my theory that open-sourcing old code that people have sentimental attachments to, and that you won't ever make money off of again, is actually a terrible idea.
Everything about this leak is a long list of arguments why you shouldn't ever open source anything.
We, the developer community, have really dropped the ball here.
One way is that the law applies to everybody equally. That is how it has worked, imperfectly, in democratic countries for many years.
There is another way of working, where the law is not blind. Laws are applied based on who is affected. This is what big tech and the ultra-rich have been advocating for: the law applies differently to nobility and aristocrats than to the working class.
So, for all these big tech companies the law is clear: I can copy from you, you cannot copy from me.
(That is horrifying in case that anyone needs me to spell it out)
Nobody, not even Anthropic, is arguing that they should be able to host other people's paid content for free. The crux of their fair-use defense is that models are transformative works, just like parodies or book reviews, and hence should be treated as fair use.
You can't just take a pile of books (no pun intended) and turn that into Claude in a day with 30 lines of Python, there's a lot of work and know-how on the Anthropic side that goes into making a good LLM.
That’s a cynical view, but unfortunately it seems true in many cases, especially for corporate law.
It is an affirmative defense; you have to be able to argue the merits. If you publish their source code, they are allowed to come after you whether they have previously used fair use or not. It's fact-specific and determined case by case.
Anthropic won half of their fair use argument in the billion dollar settlement, but lost the other half.
You can say you're just using their code to train your own models, just like they did, and they will correctly point out that how you obtained the code also matters and you will lose just like they did.
[1] https://www.congress.gov/crs-product/LSB10922
If anything, this is a question of whether you owe royalties to the owner of IP you consumed in your life since it became part of and trained your mind, identity, and outputs too.
According to IP owners, ever since things were digitized you technically own nothing: you simply paid for an authorization to use any given IP for as long as the owner allows and you keep paying. So pay your monthly meat-AI bill for all the IP your mind has been trained on.
https://arstechnica.com/tech-policy/2025/02/meta-torrented-o...
Did they actually? Someone can go to prison for 5 years for that.
Fact 1: AI generated code has no copyright, so the Digital Millennium Copyright Act does not apply.
Fact 2: Misrepresenting your copyright ownership under the DMCA is felony perjury.
Fact 3: The existence of undercover.ts in the leak is grounds to void any copyright claims on whatever human written code might have existed in Claude Code. You have a DUTY TO DISCLOSE any AI generated code in your copyrighted work. undercover.ts HIDES DISCLOSURE to FRAUDULENTLY claim all the code is human written when it is not.
Given the current administration has a bone to pick with Anthropic, it was a VERY BAD IDEA for them to send false DMCA takedowns to github. Someone at Anthropic may be the very first ever to go to prison under that section of the DMCA.
Good luck!
Yes.
https://x.com/theo/status/2039411851919057339
https://github.com/github/dmca/blob/master/2026/03/2026-03-3...
https://github.com/nirholas/claude-code
https://www.congress.gov/crs-product/LSB10922