Well, I know why. I just wanted to be snarky. It's just that trying to hide the actual price is getting a bit old. Just tell me that generating this much code will cost me $10.
- Not everyone uses dollars.
- The price of credits in some currency could change after you bought them.
- The price of credits could be different for different customers (commercial, educational, partners, etc.)
- They can ban trading of credits or let them expire
Even for a single standalone LLM that's the case, and the 'agentic' layers thrown on top just make that problem exponentially worse.
One would need to switch away from LLMs entirely to fix this problem.
And now it feels like they are gamifying the compute we use for work, for all the same reasons.
If you have some left over that you can’t spend, it feels like you’ve “wasted” them.
The answer is so that they can charge different prices per credit. If you buy low amounts, they can charge one price. If you buy in bulk, they can offer a discount. The usage is the same, but they can differentiate price per usage to give people a more favorable price if they are better customers.
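To make the mechanism concrete, here is a toy sketch of tiered credit pricing. Every number (tier cutoffs and per-credit prices) is made up for illustration; the point is only that the same credit buys the same usage while the price per credit varies with purchase size.

```python
# Toy illustration of tiered credit pricing (all numbers hypothetical).
# Larger bundles get a lower price per credit, so identical usage
# costs different buyers different amounts.

def price_per_credit(bundle_size: int) -> float:
    """Return a hypothetical per-credit price for a bundle of credits."""
    if bundle_size >= 10_000:   # bulk tier
        return 0.008
    if bundle_size >= 1_000:    # mid tier
        return 0.009
    return 0.010                # small purchases pay list price

def bundle_cost(bundle_size: int) -> float:
    """Total cost of a bundle at its tier's per-credit price."""
    return round(bundle_size * price_per_credit(bundle_size), 2)

print(bundle_cost(500))     # 5.0  -- small buyer
print(bundle_cost(10_000))  # 80.0 -- bulk buyer pays 20% less per credit
```

Decoupling "credits consumed" from "dollars paid" is exactly what makes this kind of price discrimination possible, and also what makes the real dollar cost hard to see.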
Is there anything wrong with that?
That's not true.
First of all, there's no dollar amount tied to how many credits you get for a subscription.
Second, if you look at the prices for bundles of _extra_ credits and then do some math on the Codex rate card, you'll see that there's no way they would work out to be the same or similar.
I don't understand what you mean here; their official comms say:
Customers on existing Plus, Pro and Enterprise/Edu plans should continue to use the legacy rate card. We’ll migrate you to the new rates in the upcoming weeks.
To me, anyway, that means that GP was exactly right: they'll give the $20 subscriptions $20 worth of credits, and the $200 subscriptions $200 worth of credits. That is what the "New Rates" are!

I think it would be more rational to discount a subscription (standard is about 10% in most industries) vs. PAYG, and I agree in principle with your assertion: they haven't specified what the discount is on credits bought in a subscription plan. But there is no indication that they are going to continue allowing thousands of dollars of credits on a $200/m plan.
My guess would be a 10% (or similar) discount if you buy a subscription.
Now I'm going to have to find the new best deal.
It's very variable. Recently I'm noticing it's more reliable, but there was a stretch where it was nearly unusable some days.
I guess I won't complain for the price and YMMV.
Unfortunately gemini as a coding agent is a steaming useless pile. They have no right selling it, cheap open weight Chinese models are better at this point.
It's not stupid, it's just incompetent at tool use and makes bad mistakes. It constantly gets itself into weird dysfunctional loops when doing basic things like editing files.
I'm not sure what GOOG employees are using internally, but I hope they're not being saddled with Gemini 3.1. It's miles behind.
Antigravity wants me to switch IDEs, and I'm not going to do that.
The only way I can do serious development with Gemini models is with other tooling (Cline, etc) that requires API based access which isn't available as part of the subscription.
It gets worse than that, though. Most harnesses built to handle Codex and Claude cannot handle Gemini 3.1 correctly. Google has trained Gemini 3.1 to return different JSON keys than most harnesses expect, resulting in awful output and outright failures. (Based on my perusing multiple harness GitHub issues after Gemini 3.1 came out.)
You could probably be costing Google literally thousands if all 6 members were spamming video and image generation and Antigravity.
There are a few complaints online about the same thing happening to multiple users.
Otherwise Antigravity has been great.
In the last month they have all clamped down quite heavily. I used to be able to deep-dive into a subject, or fix a small Python project, multiple times per day on the free Web UIs.
Claude, this morning, modified a small Python project for me and that single act exhausted all my free usage for the day. In the past I could do multiple projects per day without issue.
Same with ChatGPT. Gemini at least doesn't go full-on "You can use this again at 11:00 AM", but it does fall back to a model that works very poorly.
Grok and Mistral I don't really use that much, but Grok's coding isn't that bad. The problem is that it's not such a good fit for deep-diving a topic, because it performs a web search before answering anything, which makes it slow.
Mistral tends to run out of steam very quickly in a conversation. Never tried code on it though.
I need to try the command line version.
My next step is going to be evaluating open and local models to see if they are sufficiently close to par with frontier models.
My hope is that the end of seat-based pricing comes with this tech cycle. I was looking for a document-signing provider that doesn't charge a monthly fee; I only need a few docs a year.
If you have an M-series processor, I would recommend ditching Ollama because it performs slowly. We get double or triple the tok/s using omlx or vmlx, respectively, but vmlx doesn't have extensive support for some models like gpt-oss.
Opencode was able to create the library as well. It just took about 2x longer.
Next week I will be trying qwopus 27b.
And I just subscribed for a year's worth of Claude... Terrible timing I guess. Do you know if the open models are viable?
The infrastructure build out just can't keep up with it.
Of course it is. I started a (commercial) product in Jan, on track for in-field testing at the end of April.
Of course, it's not my full-time job, so I've only been working on it after hours, but, with the exception of two functions, everything else is hand-coded.
I rubber-ducked with AI, but they never wrote the product for me (other than those two functions which I felt too lazy to copy from an existing project and fixup to work in the new project).
This isn't your problem; this is management's problem for cutting headcount, or not caring about the things that people wanted.
As it isn't your problem, paint it bright pink and move on.
That said: competition will soon kick in.
I had my hopes up to switch to local but my first few passes didn’t pan out with that so far. But I’m optimistic it’ll land soon.
I think I need to lower my ambitions too. I got my hopes up since AI can do everything, but how long it takes to do it right can really drag on.
Dangerous too, of course. So many times I’ve had subtle unexpected side effects. But it’s all about pinning things down well, and that’s what we’re all still figuring out.
Some people are turning out slop. I was really excited to try and make some impressive shit. My whole life has been dedicated to trying to embody what Apple preached in the early days.
I knew this was coming, but I thought I had a little more time to try and get them over the finish line, ya know?
Maintenance by hand might be achievable, but it’s extremely hard when you’ve built something really big.
I’ve only got so much savings left to live on.
I’m not saying anyone owes me anything, but we all need to pivot, and I’m a lot less sure my pivot is going to work out now.
Based on what, exactly?
It's very easy to claim some software would've taken you months to make, but this is ridiculous. Estimating project duration is well known to be impossible in this field. A few years ago you'd get laughed out the room for making such predictions.
> I’ve only got so much savings left to live on.
Respectfully, what are you doing here?
Yeah sure, the Apple dream. But supposing AI did in fact make you this legendary 100x developer, it would do the same for everyone else, including those with significantly more resources. You'd still be run out of the market by those with bigger budgets or more marketing, and end up penniless all the same.
I would strongly recommend you not put all your proverbial eggs in this basket.
I’m not ready to unveil the thing I alluded to, it’s important to me that it’s good and polished. But I’ve done quite well so far developing in Swift, Rust, Go, and coming up with marketing and design — things I definitely couldn’t do by hand without a lot more time and effort.
https://poolometer.com/ Is one of the things I’m almost ready to call ready. So much domain expertise or tedious math involved — I simply wouldn’t have bothered on my own, pre-AI
I agree it’s a huge existential risk that everyone is also amazing. So far that’s not true. I get hung up on a lot of little quirks, like getting Dolby Vision to play properly on Apple Silicon without Vulkan. Something I accomplished after about 2 weeks of relentless determination.
To be clear, I’m just trying to answer your questions honestly. I understand the situation. It’s almost to my benefit the harder it is for non-software-engineers. But in our current reality, when I’m not launched yet, it’s more stress.
This confuses me - did you leave your job to cosplay as an EM, using LLMs to build your products? If not, then your savings don't matter.
In that sense your analogy is kinda good. I totally agree the current situation is like getting my solo startup funded and subsidized… but with only like 4 months of runway now that the prices are skyrocketing, vs ~2+ years for a typical YC venture.
IOW, you are no further behind nor further ahead than your competitors compared to 1 week ago, 1 month ago, 1 year ago and 1 decade ago.
Everyone has the same tools you have. The only advantage you get is if you make your own tools (I did that, and pre-AI, was able to modify my LoB WebApps at a rate of 1x new API endpoint, tested and pushed to production, every 15m).
With the hidden reasoning tokens and tool calls, I have no idea how many tokens I typically use per message. I would guess maybe a quarter of that, which would make the new pricing cheaper.
Then I realized I was an idiot and this was magic.
But it now seems more like an introductory offer to use the API, as opposed to an alternative product / way to use their API product.
I thought it would get increasingly expensive, like say the $200 plan becomes $400.
Switching these plans to API metering doesn’t feel like it’s a separate product anymore?
Ultimately, we need to know the true cost of this technology to evaluate how effectively or ineffectively it can displace the workforce that existed before it.
Two examples:
- https://www.msn.com/en-us/money/other/three-years-after-tria...
- https://record.umich.edu/articles/public-school-investment-r...
If I recall correctly, Ed Zitron noted in a recent article that one of the horsemen of his AI-pocalypse would be price hikes from providers.
At any rate, this observation is not unique to Ed, lots of people have made the same conclusion that the math doesn’t add up from a business profitability perspective.
Did you mean instead "The articles are okay if overly wordy"?
Hot take, but really it's more of an observation than a take: We saw this exact response in Blockchain & crypto circles a few years ago. (Though HN wasn't quite as culturally "central" to those)
Economic Bubbles are subject to the Tinkerbell Effect. They exist so long as people exist in them, and collapse when either 1) They become so financially unsustainable as to collapse, having consumed all the money the economy could possibly give them, or 2) People stop believing in the bubble and stop feeding it money.
In this regard, the statement "NFTs are stupid" was not merely ridiculing those who bought them, but a direct attack on the bubble and those invested in it. And this is something the people involved in the bubble understand instinctively, even if they aren't consciously aware of it. (There's a psychological mechanism to that, but it's not relevant)
So consequently, they react aggressively to dissent. They seek to enforce their narrative, because not doing so is a threat to the bubble and their financial interests.
---
AI's not much different to that. It's clearly a bubble to everyone including the AI execs saying it out loud.
And people react aggressively to dissent like Ed's, because if the wider public stops believing in AI's future, the bubble bursts. They'll stop tolerating datacenter construction, they'll sell their Nvidia shares, they'll demand regulators restrict AI.
(And to those who can feel their aggression rising reading this comment. Hi, yes. I see you. If I were wrong, nothing I said would matter. You'd be wasting your time engaging with it, history would simply prove me wrong. But by all means, type up that reply or click that button.)
[I'm an AI-doomer myself, but I am an AI-doomer because by and large this stuff increasingly works, not because it doesn't.]
That said, Ed Zitron still does a lot of useful research into the economics of the industry and I also believe that continued progress in AI can disrupt the world (for better and for worse) while the economics propping up all the frontier model providers can also implode spectacularly.
Some people talk about how AI doom comes about either way because it could take all of our jobs OR crash the economy when the current bubble bursts. But as an uber-AI-doomer I happen to think there is a very real possibility of a double downside (for the labor class, at least) where both of those things can happen at the same time!
As I see it, the only thing close to a moat is CC for Anthropic, and since it is a big ol' fucking mess that is a) apparently now beyond the ability of any current SOTA LLM to fix, and b) understood by absolutely no human, I'd say it's not much of a moat. The other agents will catch up sooner rather than later.
The other providers? I don't see a moat. We jump ship at the drop of a hat.
I'm still running local LLMs and finding perfectly acceptable code gen.
Yeah, but do we even need them? Non-SOTA is still pretty damn good; remember last year, pre-SOTA? How many people were boasting 10x - 100x productivity increases using the end-2025 models?
So the non-SOTA models support doing 10 hours of work in 1 hour. Many people would be fine with that. Fine enough that they aren't going to spring for a SOTA model that cuts the 10 hours to 0.5 hours; they're just going to use the cheap models to cut the 10 hours down to 1 hour.
But let's not cry for the founders, they managed to get away with tons of money. The problem is for the fools holding the bag.
I've also heard that, we're near the end of the exponential.
Jesus, the spin on this message is making me dizzy.
They finally try to stop running at a loss, and you see that as "they've been so successful"?
Here's how I see it: they all ran out of money trying to build a moat, and now realise that they are commodity sellers. What sort of profit do you think they need to make per token at current usage (which is served at below cost)?
How are they going to get there when less-highly-capitalised providers are already getting popular?
That's the problem: these small businesses are writing code, models from last year are good enough for them, and as a small business they can easily shell out for hardware to self-host.
The minute businesses take up AI for their business processes, the will to buy each employee a subscription is going to go the way of the dodo.
Gemini burned me too many times but maybe the situation has improved since.
I find it sad that some people are already at the point where "My only options are to leave it as spaghetti or pay for another LLM to fix it". Already their skills are atrophied.
> This format replaces average per-message estimates for your plan with a direct mapping between token usage and credits. It is most useful when you want a clearer view of how input, cached input, and output affect credit consumption.
For home projects, I almost exclusively use the web chat interface to code. I haven't done anything large yet so I will iterate and get the web chat to update code, print out the code that I copy and paste.
How does this differ in terms of pricing than Codex?
But Gemini's API based usage also has a free tier and if that doesn't work for you (they train on your data) and you've never signed up before you get several hundred dollars in free credits that expire after 90 days. 3 months of free access is a pretty good deal.
The idea, as far as I can tell from all the pro-AI developers, was that it will never explode, and the performance will continue increasing so the slop they write today doesn't need maintenance, because when that time comes around there will be smarter models that can clean it up.
If the providers are tightening the screws now (and they are all doing it at the same time), it tells me that either:
1. They are out of runway and need to run inference at a profit.
or
2. They think that this is as good as it is going to get, so the best time to tighten the screws is right now.
Just spitballing.
Unlikely that they all decided to do this within weeks of each other. Still, like you said, you were spit-balling, not asserting :-)
> This format replaces average per-message estimates with a direct mapping between token usage and credits.
It's to replace the opaque, per-message calculation, not the subscription plan.
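The quoted "direct mapping" can be sketched as a simple rate card. All the per-million-token rates below are invented for illustration (real rates come from the provider's published rate card); the shape of the calculation is the point: credits are a linear function of input, cached input, and output tokens.

```python
# Hypothetical token -> credit rate card, in the spirit of the
# "direct mapping" quoted above. Rates are made up for illustration.

RATES_PER_MTOK = {
    "input": 1.0,          # credits per 1M fresh input tokens (assumed)
    "cached_input": 0.25,  # cached input is usually discounted (assumed)
    "output": 4.0,         # output tokens typically cost the most (assumed)
}

def credits_used(input_tok: int, cached_tok: int, output_tok: int) -> float:
    """Credits consumed by one message, given its token counts."""
    usage = {"input": input_tok, "cached_input": cached_tok, "output": output_tok}
    total = sum(n / 1_000_000 * RATES_PER_MTOK[k] for k, n in usage.items())
    return round(total, 6)

# A message with 20k fresh input, 80k cached input, and 5k output tokens:
print(credits_used(20_000, 80_000, 5_000))  # 0.06
```

Unlike an average per-message estimate, this makes it obvious why a long-context message costs more than a short one, and why cache hits matter.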