You're almost "locked in" to using more AI on top of it then. It may also make it harder to give estimates to non-technical staff on how long it'd take to make a change or implement a new feature
* subtle footguns
* hallucinations
* things that were poorly or incompletely expressed in the prompt and ended up implemented incorrectly
* poor performance or security bugs
other things (probably correctable by fine-tuning the prompt and the context):
* lots of redundancy
* comments that are insulting to the intelligence (e.g., "here we instantiate a class")
* ...
not to mention reduced human understanding of the system and where it might break or how this implementation is likely to behave. All of this will come back to bite during maintenance.
I remember the general consensus on this _not even two years ago_ being that the code should speak for itself and that comments harm more than help.
This matters less when agentic tools are doing the maintenance, I suppose, but the backslide in this practice is interesting.
Saying that function "getUserByName" fetches a user by name is redundant. Saying that a certain method is called because of a quirk in a legacy system is important.
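A toy example of the distinction (invented here, not from anyone's codebase): the first comment just restates the name, the second records a "why" the code can't express.

```python
# Toy illustration: "noise" comment vs. "why" comment. The legacy-CRM quirk
# below is invented purely to show the kind of thing worth writing down.
USERS = {"ADA LOVELACE  ": {"id": 1}}  # pretend legacy data: upper-cased, padded

def get_user_by_name_redundant(name):
    """Fetches a user by name."""  # restates the function name: pure noise
    return USERS.get(name)

def get_user_by_name(name):
    # Why: the legacy import upper-cases names and pads them to a fixed width,
    # so normalise both sides or lookups silently miss existing users.
    key = name.strip().upper()
    return next((u for n, u in USERS.items() if n.strip().upper() == key), None)

print(get_user_by_name("Ada Lovelace"))  # -> {'id': 1}
```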
I regularly implement financial calculations. Not only do I leave comments everywhere, I tend to create a markdown file next to the function, to summarise and explain the context around the calculation. Just plain english, what it's supposed to do, the high level steps, etc.
It wasn't an entirely bad idea, because comments carry a high maintenance cost. They usually need to be rewritten when nearby code is edited, and they sometimes need to be rewritten when remote code is edited - a form of coupling which can't be checked by the compiler. It's easy to squander this high cost by writing comments which are more noise than signal.
However, there's plenty of useful information which can only be communicated using prose. "Avoid unnecessary comments" is a very good suggestion, but I think a lot of people over-corrected, distorting the message into "never write comments" or "comments are a code smell".
If that was the consensus, it was wrong. There are valuable kinds of comments (whys, warnings, etc) that code can never say.
But in general I agree with your point.
This is a poor metric as soon as you reach a scale where you've hired an additional engineer, where 10% annual employee turnover reflects > 1 employee, much less the scale where a layoff is possible.
It's also only a hope as soon as you have dependencies that you don't directly manage like community libraries.
Reminds me of my last job where the team that pushed React Native into the codebase were the ones providing the metrics for "how well" React Native was going. Ain't no chance they'd ever provide bad numbers.
I watched a lot of stuff burn. It was horrifying. We are nearly there again.
They actually hire more junior developers
"Uhh .. to adopt AI better they're hiring more junior developers!"
Because the latter would still be indicative of AI hurting entry level hiring since it may signal that other firms are not really willing to hire a full time entry level employee whose job may be obsoleted by AI, and paying for a consultant from IBM may be a lower risk alternative in case AI doesn't pan out.
Source: current (full time) staff consultant at a third party cloud consulting firm and former consultant (full time) at Amazon.
I can’t fault you for not knowing AWS ProServe exists. I didn’t know either until a recruiter reached out to me.
~ Monty Python, The Meaning of Life (1983), on The Machine that Goes Ping.
https://www.cohenmilstein.com/case-study/ibm-age-discriminat...
A large number of vets can now choose to reapply for their old job (or similar job) at a fraction of the price with their pension/benefits reduced and the vets in low cost centers now become the SMEs. In many places in the company they were not taken seriously due to both internal politics, but also quite a bit of performative "output" that either didn't do anything or had to be redone.
Nothing to do with AI - everything to do with Arvind Krishna. One of the reasons the market loves him, but the tech community doesn't necessarily take IBM seriously.
Sounds like business as usual to me, with a little sensationalization.
LLMs can be very useful tools and will probably lead to measurable productivity increases in the future, but in their current state they are not capable of replacing most knowledge workers. Remember, even computers as a whole didn't measurably impact the economy for years after their adoption. The real world is a messy place and hard to predict!
Which measure? When folks say something is more "efficient": flying is more time-efficient, but you trade away other kinds of efficiency. Efficiency, like productivity, needs a second word attached to communicate anything.
More productive by what measure? Lines of code (a weak one)? Features shipped? Bugs fixed? Time saved for the company? Time saved for the client? Shareholder value (lame)?
I don't know the answer but this year (2026) I'm gonna see if LLM is better at tax prep than my 10yr CPA. So that test is my time vs $6k USD.
The most recent BLS figure, for the last quarter of '25, was an annualized rate of 5.4%.
The historic annual average is around 2%.
It’s a bit early to draw a conclusion from this. Also it’s not an absolute measure. GDP per hour worked. So, to cut through any proxy factors or intermediating signals you’d really need to know how many hours were worked, which I don’t have to hand.
That said, in general macro sense, assuming hours worked does not decrease, productivity +% and gdp +% are two of the fundamental factors required for real world wage gains.
If you're looking for signals in either direction on AI's influence on the economy, these are #s to watch, among others. The Federal Reserve, via the Chair's remarks after each meeting, is (IMO) one of the most convenient places to get very fresh hard #s combined with cogent analysis, and usually some Q&A from the business press asking at least some of the questions I'd want to ask.
If you follow these fairly accessible speeches after meetings, you'll occasionally see how things in them end up as themes in the stories that pop up here weeks or months later.
[1] https://www.oecd.org/en/topics/sub-issues/measuring-producti...
I get that it takes a long time to make software, but people were making big promises a year ago and I think it's time to start expecting some results.
Also weekend hackathon events have completely/drastically changed as an experience in the last 2-3 years (expectations and also feature-set/polish of working code by the end of the weekend).
And as another example, you see people producing CUDA kernels and MLX ports as an individual (with AI) way more these days (compared to 1-2 years ago), like this: https://huggingface.co/blog/custom-cuda-kernels-agent-skills
January numbers are out and there were fewer games launched this January than last.
I wrote a python DHCP server which connects with proxmox server to hand out stable IPs as long as the VM / container exists in proxmox.
Not via MAC but basically via VM ID ( or name)
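Roughly the shape of it, as a sketch rather than the actual code (the Proxmox endpoint and API-token header format here are my assumptions about the standard Proxmox VE REST API, and the host/token are placeholders):

```python
# Sketch: derive a stable IP per Proxmox guest from its VMID, so the lease
# survives MAC changes. Endpoint/auth are assumptions; host and token are fake.
import ipaddress
import requests

PROXMOX = "https://proxmox.example:8006"              # placeholder host
TOKEN = "PVEAPIToken=dhcp@pve!issuer=<secret-uuid>"   # placeholder API token
SUBNET = ipaddress.ip_network("10.0.20.0/24")

def stable_ip_for(vmid: int) -> str:
    # Deterministic mapping, e.g. VMID 105 -> 10.0.20.105 (wraps to stay in the host range).
    host = (vmid - 1) % (SUBNET.num_addresses - 2) + 1
    return str(SUBNET.network_address + host)

def current_guests() -> dict[int, str]:
    # Cluster-wide list of VMs and containers, keyed by VMID.
    r = requests.get(f"{PROXMOX}/api2/json/cluster/resources",
                     params={"type": "vm"},
                     headers={"Authorization": TOKEN},
                     verify=False)
    r.raise_for_status()
    return {g["vmid"]: g.get("name", "") for g in r.json()["data"]}

if __name__ == "__main__":
    for vmid, name in sorted(current_guests().items()):
        print(f"{vmid:>5} {name:<20} -> {stable_ip_for(vmid)}")
```

The DHCP wire protocol itself (DISCOVER/OFFER/ACK) is the fiddly part and is left out of this sketch.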
Then you start asking questions like, does the button for each of the features actually do the thing? Are there any race conditions? Are there inputs that cause it to segfault or deadlock? Are the libraries it uses being maintained by anyone or are they full of security vulnerabilities? Is the code itself full of security vulnerabilities? What happens if you have more than 100 users at once? If the user sets some preferences, does it actually save them somewhere, and then load them back properly on the next run? If the preferences are sensitive, where is it saving them and who has access to it?
It's way easier to get code that runs than code that works.
Or to put it another way, AI is pretty good at writing the first 90% of the code:
"The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time." — Tom Cargill, Bell LabsHave you ever looked for, say, WisprFlow alternatives? I had to compare like 10 extremely similar solutions. Apps have no moat nowadays.
That's happening all over the place.
I’d argue the majority use AI this way. The minority “10x” workers who are using it to churn through more tasks are the motivated ones driving real business value being added - but let’s be honest, in a soulless enterprise 9-5 these folks are few and far between.
Why are there fewer games launched on Steam this January than last?
Even if models stopped improving today, it'd take years before we see the full effects of people slowly gaining the skills needed to leverage them.
There doesn't need to be any "magic" there. Just clearly state your requirements. And start by asking the model to plan out the changes and write a markdown file with a plan first (I prefer this over e.g. Claude Code's plan mode, because I like to keep that artefact), including planning out tests.
If a colleague of yours who isn't intimately familiar with the project could produce that plan from your request without needing to ask follow-up questions (though free to spend time digging through the code), you've done pretty well.
You can go overboard with agents to assist in reviewing the code, running tests etc. as well, but that's the second 90%. The first 90% is just to write a coherent request for a plan, read the plan, ask for revisions until it makes sense, and tell it to implement it.
Nothing new here. Getting users to clearly state their requirements has always been like pulling teeth. Incomplete sentences and all.
If the people you are teaching are developers, they should know better. But I'm not all that surprised if many of them don't. People will be people.
Once people have had the experience of being a lead and having to pass tasks to other developers a few times, most seem to develop this skill at least to a basic level, but even then it's often informal and they don't get enough practice documenting the details in one go, say by improving a ticket.
But the big models have come a long way in this regard. Claude + Opus especially. You can build something with a super small prompt and keep hammering it with fix prompts until you get what you want. It's not efficient, but it's doable, and it's much better than having to write a full spec, which is where we were not even half a year ago.
LOL: especially with Claude this was only in 1 out of 10 cases?
Claude output is usually (near) production ready on the first prompt if you precisely describe where you are, what you want, how to get there, and what the result should be.
I've worked with a few folks who have been given AI tools (like a designer who has never coded in his life, or a video/content creator) who have absolutely taken off with creating web apps and various little tools and process improvements for themselves, just by vibecoding what they wanted. The key with both these individuals is high agency, curiosity, and motivation. That was innate; the AI tooling just gave them the external means to realise what they wanted to do with more ease.
These kinds of folks are not the majority, and we’re still early into this technological revolution imo (models are improving on a regular basis).
In summary, we've given the masses "intelligence", but creativity and motivation stay the same.
E.g., AI is a big multiplier, but that doesn't mean it will translate to "more" in the way people think.
But my guess would be: games are closed source and need physics, which AI is bad at.
It's like trying to make fusion happen only by spending more money. It helps, but it doesn't fundamentally solve the pace of true innovation.
I've been saying for years now that the next AI breakthrough could come from big tech, but it's just as likely to come from a smart kid with a whiteboard.
It comes from the company best equipped with capital and infra.
If some university invents a new approach, one of the nimble hyperscalers / foundation model companies will gobble it up.
This is why capital is being spent. That is the only thing that matters: positioning to take advantage of the adoption curve.
Individuals make mistakes in air traffic control towers, but as a cumulative outcome it's a scandal if airplanes collide midair. Even in contested airspace.
The current infrastructure never gets there. There is no improvement path from MCP to air traffic control.
It's hard work and patience and math.
The "limits of AI" bit is just smokescreen.
Firing seniors:
> Just a week after his comments, however, IBM announced it would cut thousands of workers by the end of the year as it shifts focus to high-growth software and AI areas. A company spokesperson told Fortune at the time that the round of layoffs would impact a relatively low single-digit percentage of the company’s global workforce, and when combined with new hiring, would leave IBM’s U.S. headcount roughly flat.
New workers will use AI:
> While she admitted that many of the responsibilities that previously defined entry-level jobs can now be automated, IBM has since rewritten its roles across sectors to account for AI fluency. For example, software engineers will spend less time on routine coding—and more on interacting with customers, and HR staffers will work more on intervening with chatbots, rather than having to answer every question.
Obviously they want new workers to use AI but I don't really see anything to suggest they're so successful with AI that they're firing all their seniors and hiring juniors to be meatbags for LLMs.
I suspect the gap is that you don't know enough about IBM's business model.
When something doesn't make sense, a very common cause is a lack of context: many things can be extremely sensible for a business to do; things which appear insane from an outsider's point of view.
https://www.sciencefocus.com/science/is-water-wet https://centreforinquiry.ca/keiths-conundrums-is-water-wet https://www.theguardian.com/notesandqueries/query/0,5753,-17... http://scienceline.ucsb.edu/getkey.php?key=6097 https://parknotes.substack.com/p/is-water-wet-or-does-it-jus...
...etc. Turns out, it's not a solved question!
If my boss asked me a question like this my reply would be "exactly what you told me to build, check jira".
If you want to know if I'm more productive - look at the metrics. Isn't that what you pay Atlassian for? Maybe you could ask their AI...
If someone can use AI to make a $50,000/year project in three months, then someone else can also do so.
Obviously some people hype and lie. But also obviously some people DID succeed at SEO/affiliate marketing/dropshipping etc. AI resembles those areas in that the entry barrier is low.
To get actual reports you often need to look to open source. Simon Willison details how he used it extensively and he has real projects. And here Mitchell Hashimoto, creator of Ghostty, details how he uses it: https://mitchellh.com/writing/my-ai-adoption-journey
Update: OP posted their own project however. Looks nice!
It involves a whole raft of complex agents + code they've written, but that code and the agents were written by AI over a very short span of time. And as much as I'd like to stroke my own ego and assume it's one of a kind, realistically if I can do it, someone else can too.
I run an eComm business and have built multiple software tools that each save the business $1000+ per month, in measurable wage savings/reductions in misfires.
What used to take a month or so can now be spat out in less than a week, and the tools are absolutely fit for purpose.
It's arguably more than that, since I used to have to spread that month of work over 3-6 months (working part time while also doing daily tasks at the warehouse), but now can just take a week WFH and come back with a notable productivity gain.
I will say, to give credit to the anti-AI-hype crowd, that I make sure to roll the critical parts of the software by hand (things like the actual calculations that tell us what price to list an item at, for example). I did try to vibecode too much once and it backfired.
But things like UIs, task managers for web apps, simple API calls to print a courier label, all done with vibes.
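To make the split concrete, here is an invented illustration (not my real pricing code): the critical calculation stays small, hand-written, and tested, while the UI and label-printing glue around it gets vibecoded.

```python
# Invented pricing rule, only to illustrate keeping the critical math
# hand-written and covered by a check before generated code is allowed to call it.
from decimal import Decimal, ROUND_HALF_UP

def list_price(cost: Decimal, margin: Decimal, fee_rate: Decimal) -> Decimal:
    """Price an item so the target margin on cost survives the marketplace fee."""
    price = cost * (1 + margin) / (1 - fee_rate)
    return price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# The sort of check worth writing yourself: 30% margin on a $10 cost at a 13% fee.
assert list_price(Decimal("10.00"), Decimal("0.30"), Decimal("0.13")) == Decimal("14.94")
```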
Stop arguing on HN and get to building.
This would have been an unpleasant task of a couple of days before, possibly more. I had been putting it off because scouring datasheets and register maps and startup behavior is not fun.
It didn’t know how to troubleshoot the startup successfully itself, though. I had to advise it on a debugging strategy with sentinel values to bisect. But then once explained it fixed the defects and succeeded.
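The strategy was roughly this shape; the snippet below is a toy Python illustration of the sentinel idea, not the actual firmware or debugging session.

```python
# Toy sketch: mark progress with a sentinel at each init stage, so when startup
# dies you can see the last stage reached and narrow the search to what follows.
# On real hardware the "log" would be a scratch register or memory that
# survives the hang/reset.
SENTINEL_LOG = []

def checkpoint(tag):
    SENTINEL_LOG.append(tag)

def init_clocks(): pass
def init_power(): pass
def init_peripheral():
    raise RuntimeError("misconfigured register")  # stand-in for the hidden defect

def startup():
    checkpoint("A:clocks"); init_clocks()
    checkpoint("B:power"); init_power()
    checkpoint("C:peripheral"); init_peripheral()
    checkpoint("D:done")

try:
    startup()
except RuntimeError:
    pass
print("last sentinel reached:", SENTINEL_LOG[-1])  # -> C:peripheral, so dig there
```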
LLMs struggle in large codebases and the benefit is much smaller now. But that capability is growing fast, and not everything software developers do is large.
Are you worried by any of those claims about SaaS being dead because of AI? lol
I've found AI to be a big productivity boost for myself, but I don't really use it to generate much actual code. Maybe it could do more for me, idk, but I also don't feel like I'm being left behind. I actually enjoy writing code, but hate most other programming tasks so it's been nice to just focus on what I like. Feels good to have it generate a UI skeleton for me so I can just fill out the styles and stuff. Or figure out stupid build config and errors. Etc etc.
Anyways congrats on the product. I know a lot of people are negative about productivity claims and I'm certainly skeptical of a lot of them too, but if you asked most programmers 5 years ago if a super-autocomplete which could generate working code snippets and debug issues in a project would boost productivity everyone would say yes lol. People are annoyed that its overhyped, but there should still be room for reasonable hype imo.
For me, I always had the ideas and even as a competent engineer, the speed of development annoyed me.
I think folks get annoyed when their reality doesn't match other people's claims. But I have friends who aren't engineers who have launched successful SaaS products. I don't know if it's jealousy or what but people are quite passionate about how it doesn't have productivity gains.
Hell, I remember Intellisense in Visual Studio being a big boon for me. Now I can have tasks asynchronous, even if not faster, it frees up my time.
Is the business 3 months old now?
Certainly they didn’t mean 1000 junior positions were cut. So what they really want to say is that they cut senior positions as a way of saving cost/make profit in the age of AI? Totally contrary to what other companies believe? Sounds quite insane to me!
The job is essentially changing from "You have to know what to say, and say it" to "make sure the AI says what you know to be right"
https://www.ibm.com/careers/search?field_keyword_18[0]=Entry...
Total: 240
United States: 25
India: 29
Canada: 15
Not because it's wrong, but because it risks initiating the collapse of the AI bubble and the whole "AI is gonna replace all skilled work, any day now, just give us another billion".
Seems like IBM can no longer wait for that day.
> Some executives and economists argue that younger workers are a better investment for companies in the midst of technological upheaval.
Why Replacing Developers with AI is Going Horribly Wrong https://m.youtube.com/watch?v=WfjGZCuxl-U&pp=ygUvV2h5IHJlcGx...
A bunch of big companies took big bets on this hype and got burned badly.
The "learn to code" saga has run its course. Coder is the new factory worker job where I live, a commodity.
E.g. If you cut hiring from say 1,000 a year to 10 and now are 'tripling' it to 30 then that's still a nothingburger.
Think about the economy and the AI children
You work with junior devs that have those abilities? Because I certainly don't.
And about the pulling in devs - you can actually go to indeed.com and filter out listings for co-founders and CTOs. Usually equity only, or barely any pay. Since they're used to getting code for free. No real CTO/Senior dev will touch anything like that.
For every vibe-coded product, there are a hundred more clones. It's just a red ocean.
https://www.anthropic.com/engineering/building-c-compiler
Like, I'm sure it's just laundering gcc's source at some level, but if Claude can handle making a compiler, either we have to reframe a compiler as "not serious", or, well, come up with a different definition for what entails "serious" code.
You need a highly refined sense of “smell” and intuition about architecture and data design, but if you give good specifications and clear design goals and architectural guidance, it’s like managing a small team but 12x faster iteration.
I sometimes am surprised with feature scope or minor execution details but usually whenever I drill down I’m seeing what I expected to see, even more so than with humans.
If I didn’t have the 4 decades of engineering and management experience I wouldn’t be able to get anything near the quality or productivity.
It’s an ideal tool for seasoned devs with experience shipping with a team. I can do the work of a team of 5 in this type of highly technical greenfield engineering, and I’m shipping better code with stellar documentation… and it’s also a lot less stressful because of the lack of interpersonal dynamics.
But… there’s no way I would give this to a person without technical management experience and expect the same results, because the specification and architectural work is critical, and the ability to see the code you know someone else is writing and understand the mistakes they will probably make if you don’t warn them away from it is the most important skillset here.
In a lot of ways I do fear that we could be pulling up the ladder, but if we completely rethink what it means to be a developer we could teach with an emphasis on architecture, data structures, and code/architecture intuition we might be able to prepare people to step into the role.
Otherwise we will end up with a lot of garbage code that mostly works most of the time and breaks in diabolically sinister ways.
Whether it's a hot new language, LLMs, or some new framework, juniors like to dive right in, because the promise of getting a competitive edge against people much more experienced than you is too tantalizing. You really want it to be true.
Note: I'm not taking any particular side of the "Juniors are F**d" vs "no they're not" argument.
When I read these takes I wonder what kind of companies some of you have been working for. I say this as someone who has been using Opus 4.6 and GPT-Codex-5.3 daily.
I think the "senior developer" title inflation created a bubble of developers who coasted by playing the ticket-productivity game, where even small tasks could be turned into points and sprints and charts and graphs, such that busy work looked like a lot of work was being done.
Ahh, what could possibly go wrong!
It always baffles me when someone wants to only think about the code as if it exists in a vacuum. (Although for junior engineers it’s a bit more acceptable than for senior engineers).
Anyone who's worked in a "bikeshed-sensitive" part of the stack knows how quickly things go off the rails when such customers get direct access to an engineer. Think being a full-stack dev who constantly gets requests about button colors while you're trying to get the database set up.
Customers bikeshed WAY less than those two categories.
Customers want to save money and see projects finished. That anyone can reason with.
Someone inside the company trying to climb the corporate ladder? Different story.
When a customer starts saying “we need to build X”, first ask what the actual problem is etc. It takes actual effort, and you need to speak their language (understand the domain).
But if you have a PM in the middle, now you just start playing telephone and I don’t believe that’s great for anyone involved.
Otherwise, you never freelanced on the cheap.
I am certain that I went through the same problems you did in the past; maybe I just have a different way of dealing with them, or maybe I had even worse problems than you did but I have a different frame of comparison. We never stopped to compare notes.
All I'm saying is: for me dealing with business owners, end-users, CEOs and CTOs was always way easier than dealing with proxies. That's all.
And I'm certain you haven't, if you've really never wanted a layer of separation from certain clients over behavioral issues that got in the way of the actual work. And I'm male, so I'm sure I still have it better than people whose experiences I've only heard about third-hand in my industry.
I don't see it as a cheap attack. Any teacher would love to be in a classroom exclusively made up of motivated honors students so they can focus on teaching and nurturing. Instead, most teachers tend to become parental proxies without the authority to actually discipline children. So they see a chair fly and at best they need to hope a principal handles it. But sometimes the kid is back in class the next day.
Its envy more than anything else.
Having had to support many of these systems for sales or automation or video production pipelines, I can say that as soon as you dig under the covers you realize they are a hot mess of amateur code that _barely_ functions as long as you don't breathe on it too hard.
Software engineering is in an entirely nascent stage. That the industry could even put forward ideas like "move fast and break things" is extreme evidence of this. We know how to handle this challenge of deep technical knowledge interfacing with domain specific knowledge in almost every other industry. Coders were once cowboys, now we're in the Upton Sinclair version of the industry, and soon we'll enter into regular honest professional engineering like every other new technology ultimately has.
I’ll happily pay up to $2k/month for it if I was left with no choice, but I don’t think it will ever get that expensive since you can run models locally and it could have the same result.
That being said, my outputs are similarish in the big picture. When I get something done, I typically don’t have the energy to keep going to get it to 2x or 3x because the cognitive load is about the same.
However I get a lot of time freed up which is amazing because I’m able to play golf 3-4 times a week which would have been impossible without AI.
Productive? Yes. Time saved? Yes. Overall outputs? Similar.
There’s so many varieties, specialized to different tasks or simply different in performance.
Maybe we'll get to a one-size-fits-all at some point, but for now trying out a few can pay off. It also starts to build a better sense of the ecosystem as a whole.
For running them: if you have an Nvidia GPU w/ 8GB of vram you’re probably able to run a bunch— quantized. It gets a bit esoteric when you start getting into quantization varieties but generally speaking you should find out the sort of integer & float math your gpu has optimized support for and then choose the largest quantized model that corresponds to support and still fits in vram. Most often that’s what will perform the best in both speed and quality, unless you need to run more than 1 model at a time.
To give you a reference point on model choice, performance, gpu, etc: one of my systems runs with an nvidia 4080 w/ 16GB VRAM. Using Qwen 3 Coder 30B, heavily quantized, I can get about 60 tokens per second.
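If you want a starting point in code rather than the CLI, a minimal sketch with llama-cpp-python looks something like this (the GGUF file name is a placeholder; pick a quantization that fits your VRAM and tune n_gpu_layers to match):

```python
# Minimal local-inference sketch using llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="qwen3-coder-30b-q4_k_m.gguf",  # placeholder: any quantized GGUF you have
    n_gpu_layers=-1,   # offload as many layers as fit; reduce on smaller GPUs
    n_ctx=8192,        # context window; larger costs more VRAM
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that reverses a linked list."}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```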
https://github.com/ggml-org/llama.cpp/discussions/15396 a guide for running gpt-oss on llama-server, with settings for various amounts of GPU memory, from 8GB on up
Not having to worry about token limits is surprisingly cognitively freeing. I don’t have to worry about having a perfect prompt.
Plenty of people are still ambitious and being successful.