Years ago I hired an Upwork contractor to port v1.5.3 to Go as best he could. He did a great job and it served us well; however, it was far, far from perfect and couldn't pass most of the JS test suite. Worst of all, it had several recursion bugs that could segfault on bad expressions.
That was the now-deprecated implementation at
https://github.com/blues/jsonata-go
Early in 2025 I used Claude Code and Codex to do a proper, compliant port that passes the full set of tests and is safe. It was most certainly not a trivial task for AI, as many nuances of JSONata syntax derive from its JS roots.
Regardless, it was a great experience, and here's the 2.0.6 AI port, along with a Go exerciser that lets you flip back and forth between the implementations. We did a seamless migration and it's been running beautifully in prod in Blues' Notehub for quite a while - as a core transformation capability used by customers in our JSON message pipeline.
JSONata is too tied to JavaScript. Looking back, we should have slightly altered the spec and written some code mods. We didn't have customers bringing their existing JSONata expressions over, so they wouldn't have noticed the differences.
The original architecture choice and price almost gave me a brain aneurysm, but the "build it with AI" solution is also under-considered.
This looks like a perfect candidate for existing, high-quality, high-performance, production-grade solutions such as quamina (independent successor to aws/event-ruler, and ancestor to quamina-rs).
There's going to be a lot of "we were doing something stupid and we solved it by doing something stupid with AI [LLM code]" in our near future. :-|
That's a win for human engineers, not AI.
Jokes aside, we will probably see everyone doing this, trying to keep human hands off the code, because they corrupt and AI does not.
Joke jokes aside, why did we even write code ourselves before AI?
the first question that comes to mind is: who takes care of this now?
You had a dependency on an open source project. Now your translated copy (fork?) is yours to maintain - 13k lines of Go. How do you make sure it stays updated? Is this maintenance factored in?
I know nothing about JSONata or the problem it solves, but I took a look at the repo and there are 15 PRs and 150 open issues.
For this case, where it's used as an internal filtering engine, I expect the goal is fixing bugs that show up and occasionally adding a feature that's needed by this organization.
Even if we assume a clean and bug free port, and no compatibility required moving forward, and a scope that doesn't involve security risks, that's already non trivial, since it's a codebase no one has context of.
Probably not $500k worth of maintenance (because wtf were they doing in the first place), but I don't buy placing the current cost at 0.
They just seemed to fix their technology choices and got the benefits.
There are existing Go versions of JSONata, so in theory this could have been achieved with those libraries too. There's nothing written about why the existing libraries weren't good enough and why a new one needed to be written. Usually you need to do some due diligence in this area, but there's no mention of it in this post.
In order to measure the real efficiency, gnata should've been benchmarked against the existing Go libraries. For all we know, the AI implementation is much slower.
The benchmarks in the blog are also weird. The measurement is done within the app, but you're meant to measure the calls within the library itself (e.g. calling the JS version in its isolated benchmark vs the Go version in its isolated benchmark). So we don't actually know what the real performance of the AI-written version is.
The only benefit, again, is that they fixed their existing bad technology choice, and based on what is observed, with a lesser bad technology choice. Then it's layered with clickbait marketing titles for others to read.
I'll probably need to expect more of these types of posts in the future.
The only one I found (jsonata-go) is a port of JSONata 1.x, while the gnata library they've published is compatible with the 2.x syntax. Guess that's why.
Maybe I’m out of touch, but I cannot fathom this level of cost for custom lambda functions operating on JSON objects.
I do have some questions like:
* Did they estimate cost savings based on peak capacity, as though it were running 24x7x365?
* Did they use auto scaling to keep costs low?
* Were they wasting capacity by running a single-threaded app (Node-based) on multi-CPU hardware? (My guess is no, but anything is possible)
It is, by orders of magnitude, larger than any deployment that I have been a part of in my work experience, as a 10-year data scientist/Python developer.
>The reference implementation is JavaScript, whereas our pipeline is in Go. So for years we’ve been running a fleet of jsonata-js pods on Kubernetes - Node.js processes that our Go services call over RPC. That meant that for every event (and expression) we had to serialize, send over the network, evaluate, serialize the result, and finally send it back.
But either way, we're talking $25k/mo. That's not even remotely difficult to believe.
But no, the post is talking about just RPC calls on k8s pods running Docker images. For $300k/year of savings there, their compute bill should be well above $100M/year.
Perhaps if it were Google-scale events for billions of users daily, paired with the poorest, most inefficient processing engine, no caching layer, and very badly written rules, maybe it would be possible.
Feels like it is just an SEO article designed to catch readers' attention.
I highly doubt the issue was serialization latency, unless they were doing something stupid like reserializing the same payload over and over again.
I have no idea if they are doing orders of magnitude more processing, but I crunch through 60GB of JSON data in about 3000 files regularly on my local 20-thread machine using nodejs workers to do deep and sometimes complicated queries and data manipulation. It's not exactly lightning fast, but it's free and it crunches through any task in about 3 or 4 minutes or less.
The main cost is downloading the compressed files from S3, but if I really wanted to I could process it all in AWS. It also could go much faster on better hardware. If I have a really big task I want done quickly, I can start up dozens or hundreds of EC2 instances to run the task, and it would take practically no time at all... seconds. Still has to be cheaper than what they were doing.
But it's common for engineers to blow insane amounts of money unnecessarily on inefficient solutions for "reasons". Sort of reminds me of SaaSes offering 100 concurrent "serverless" WS connections for like $50/month - and some devs buy into this nonsense.
The bad engineering part is writing your own replacement for something that already exists. As other commenters here have noted, there were already two separate implementations of JSONata in Go. Why spend $400 to have Claude rewrite something when you can just use an already existing, already supported library?
The last release of jsonata was mid-2025, and there hadn't been new features between the 2022 release and that latest one, so it's likely those other ports are fine.
But it's a relatively simple tool, from the looks of it. It seems like there are many competitors, some already written in Go.
It's kind of weird that they waited so long to do this. Why even need AI? This looks like the sort of thing you could port by hand in less than a week (possibly even in a day).
> it must have that architecture for a reason, we don't enough knowledge about it to touch it, etc.
That or they simply haven't had the time, cost can creep up over time. 300k is a lot though. Especially for just 200 replicas.
Seems wildly inefficient. I also don't understand why you wouldn't just bundle these with the application in question. Have the Go service and Node.js service in the same pod/container. It could even use sockets; RPC between them should be pretty much instant (sub-ms).
1. People are going to come in and vibe code a replacement for some shitty component in a morning. They aren’t going to take time to verify and understand the code.
2. The new code will fix most of the problems with the original component, but it will have a whole new set of issues.
3. People will use AI to fix the bugs, but they won’t take the time to understand the fixes or the regression tests that they tell AI to add.
4. The new system will get so complicated that it’s hard for even AI to work on it. The “test suite” will be so full of redundant and nonsensical tests that the run time will be too high to meaningfully guide AI. And even in the cases where AI does use it, many of the tests just reimplement the code under test inside the test (Claude does this about 25% of the time, based on what I’ve seen, if you don’t catch it).
5. Goto 1
This is the same cycle I’ve seen in 90% of companies I’ve worked at, it will just be on a faster cadence.
And that is how we’ll get to a place where we output 100x lines of code, and spend 2x developers salaries on tokens, with little meaningful impact on the outside world.
I used AI to refactor several of my own "move fast and break things" projects and it worked absolutely GREAT. So if that's what you're concerned about, you're not seeing where the puck is going.
The trick is to make a good plan first, and not to rewrite your entire codebase all at once. But that advice is older than all of my kids combined.
I'm obviously projecting from my own experience, but it echoes so clearly how power can be wielded without actual insight, with an almost arrogant: "OK, all very nice, but the ROI...?"
The article seems to come from a company with stellar engineering, so maybe it doesn't apply to this case. But the tone I imagine from that comment still stands out to me - precisely because of the mature engineering.
Of course ROI is important, and a company exists to produce it. I'm extrapolating from something tiny and thinking of the Boeing culture shift: https://news.ycombinator.com/item?id=25677848
In short, why can't good engineering just be good engineering fostered with trust and then profits?
> I don't know what to think. These blog articles are supposed to be a showcase of engineering expertise, but bragging about having AI vibecode a replacement for a critical part of your system that was questionably designed and costing as much as a fully-loaded FTE per year raises a lot of other questions.
Which cumulatively means a competent developer could probably port it in less than one day.
They almost certainly spent longer working out how to deploy and integrate the original JS and ironing out the problems, than it would have taken to port it in the first place.
That’s sad.
And then they definitely spent much longer making their optimised fast path for simple expressions. Which they probably wouldn’t have bothered with if they had just ported the whole thing.
As for trying things like embedding V8… this is getting ridiculous.
I strongly suspect no one had actually looked at the code, but had just assumed all along that it was much more complex than it actually was.
The entire thing is a tragedy.
There's confidence and there's barking mad delusion.
Here's the reality.
I once ported 50k loc from Java to Go. Here are details: https://blog.kowalczyk.info/article/19f2fe97f06a47c3b1f118fd...
Java => Go is easier than JavaScript => Go because the languages are more similar. That was a very line-by-line port.
Because I was paid by hour I took detailed notes.
I spent 601 hours to port it.
50k / 601 = 83 lines ported per hour, 665 per 8-hour day, but really 500 per day at 6 working hours. No one does a sustained 8 hours of writing code daily.
I would consider that very fast, and yet it's an order of magnitude slower than your 5.5k per day.
10x is not a mis-estimation, it's a full blown delusion.
But above everything else, this is a great example of how much JavaScript inefficiency actually costs us, as humanity. How many companies burn money like this?
It is a security product, so unless they want to deal with the exfiltration charges on the data, it's probably better to keep it in AWS. That's the nasty double-edged sword of "cloud", and how we're all getting locked in.
All the bits on their own seem to make perfect sense, but it's become apparent that the orchestra has been blindfolded and given noise-canceling headphones.
See Next.js: over a decade of iterative development, countless vulnerabilities discovered internally and externally, which got patched with tribal knowledge acquired by core contributors and security reviewers.
Now Joe shows off that he rewrote it with Vite at its core, for just 1,100 dollars' worth of tokens. Performance improvement and no licensing liability.
Outcome: more money for Nvidia, and even more money into the pockets of your next hackers.
But it is very boringly stable, which means I can't tell the world my wartime stories and write a blog about it.
I'm very curious why this didn't help more. That was my first thought. Maybe they didn't get the result they wanted immediately so gave up before evaluating this fully?
No doubt, this approach would work reasonably well for machines with plenty of RAM, but I can see why it becomes a bottleneck when scaled to N instances. RAM is expensive, and when you multiply those 50 extra megabytes by N (50 MB across 200 pods is an extra 10 GB), total costs climb quickly.
That is what made this exaggerated cost even possible.
Or: the peter principle.
If it does work I'll do a Show HN in a few months. One thing I always do with LLM-code though is review every single line (mainly because I'm particular with formatting). disc.sh is gonna be the domain when I launch the marketing site.
This makes me wonder: for reimplementation projects like this that aren't lucky enough to have super-extensive test suites, how good are LLMs at taking existing codebases and writing tests for every single piece of logic, every code path? So that you can then do a "cleanish-room" reimplementation in a different language (or even the same language) using those tests?
Obviously the easy part is getting the LLMs to write lots of tests, which it is then trivial to iterate on until they all pass on the original code. The hard parts are verifying that the tests cover all possible code paths and edge cases, and reliably triggering certain internal code paths.
I've been successful at using Claude Code this way:
1. get it to generate code for complex data structures in a separate library project
2. use the code inside a complex existing project (no LLM here)
3. then find a bug in the project, with some fuzzy clues as to causes
4. tell CC about the bug and ask it to generate intensive test cases in the direction of the fuzzy clues
5. get the test cases to reproduce the bug and then CC to fix it by itself
6. take the new code back to the full project and see the issue fixed
All this using C++. I've been a pretty intensive developer for ~35 years. I've done this kind of thing by hand a million times - not any more. We really live in the future now.
My use case is a bit different. I wanted JSONata as the query language to query Flatbuffers data (via schema introspection) in Rust, due to its terseness and expressiveness, which is a great combination for AI generated queries.
The AI generated code can still introduce subtle bugs that lead to incorrect behaviour.
One example of this is the introduction of functions into the codebase (by AI) that have bugs but no corresponding tests.
EDIT: correct quotation characters
It feels like I'm getting gaslighted.
I use AI at work with C#/Python - it's fine. It can write some glue code, sometimes even pretty well. But I have to hand-hold it a lot.
My own project is in Swift. Boy, AI can't handle Apple quirks - multiple iterations, code that doesn't compile or is missing crucial pieces (there are navigation links but no navigation stack).
I'm trying to be not picky. I want AI to do my job. But it's so far away.
Am I alone, or is everyone else rewriting Linux in Rust over a weekend?
I'm the author of the blog post. I'm honestly loving the discussion this is generating (including the less flattering comments here). I'll try to answer some of the assumptions I've seen, hopefully it clears a few things.
First off - some numbers. We're a near real-time cybersecurity platform, and we ingest tens of billions of raw events daily from thousands of different endpoints across SaaS. Additionally, a significant subset of our customers are quite large (think Fortune 500 and up). For the engine, that means a few things:
- It was designed to be dynamic by nature, so that both out-of-the-box and user-defined expressions evaluate seamlessly.
- Schemas, of which there are thousands, vary wildly, since they are received from external sources - often with little documentation.
- A matching expression needs to be alerted on immediately, as these are critical to business safety (no use triggering an alert on a breached account a day later).
- Endpoints change and break on a near-weekly basis, so being able to update expressions on the fly is integral to the process, and should not require changes by the dev team.
Now to answer some questions:
- Why JSONata: others have mentioned it here, but it is a fantastic and expressive framework with a very detailed spec. It fits naturally into a system that is primarily NOT maintained by engineers, but instead by analysts and end-users that often have little coding expertise.
- Why not a pre-existing library: believe me, we tried that first. None actually match the reference spec reliably. We tried multiple Go, Rust and even Java implementations. They all broke on multiple existing expressions, and were not reliably maintained.
- Why JSON at all (and not a normalized pipeline): we have one! Our main flow is much more of a classic ELT, with strongly-defined schemas and distributed processing engines (i.e. Spark). It ingests quite a lot more traffic than gnata does, and is obviously more efficient at scale. However, we have different processes for separate use-cases, as I suspect most of the organizations you work at do as well.
- Why Go and not Java/JS/Rust: well, because that's our backend. The rule engine is not JUST for evaluating JSONata expressions. There are many layers involving different aspects of the system, one of which is gnata. A matching event must pass all these layers before it even gets to the evaluation part. Unless we rewrote our entire backend in JS, no other language would have really mitigated the problem.
Finally, regarding the $300k/year cost (which many here seem to be horrified by) - it seems I wasn't clear enough in the blog. 200 pods was not the entire fleet, and it was not statically set. It was a single cluster at peak time. We have multiple clusters, each with their own traffic patterns and auto-scaling configurations. The total cost was $25k/month when summed as a whole.
Being slightly defensive here, but that really is not that dramatic a number when you take into account the business requirements to get such a flexible system up and running (with low latency). And yes, it was a cost sink we were aware of, but as others have mentioned - business ROI is just as important as pure dollar cost. It is a core feature that our customers rely on heavily, and changing its base infrastructure was neither trivial nor cost-effective in human-hours. AI completely changed that, and so I took it as a challenge to see how far it could go. gnata was the result.
Your original architecture was a kludge to start with, it was a self-inflicted wound. This is probably the craziest part:
> We’d tried a few things over the years - optimizing expressions, output caching, and even embedding V8 directly into Go (to avoid the network hop).
I know hindsight is 20/20 - but still, you made the wrong decision at the start, and then you kept digging the hole deeper and deeper. Hopefully a good lesson for everyone working with microservices.
To end on a more positive note, I think this (porting code to other languages/platforms) is one use-case where AI code generation really shines, and will be of immense value in the future. Great reporting, just let's not confuse code generation with architectural decisions.
Having said that, my opinion is still that the previous solution had valid business merit. Though inefficient, the fact that it was infinitely scalable and the only limit was pure dollar cost is pretty valuable. It enables business stakeholders/managers to objectively quantify the value of the feature (for X dollars we get Y business, scaling linearly). I've worked on many systems where this was not at all the case, and there was a hard limit at some point where the feature simply shut down.
So, then, what do you estimate the actual savings of the transition to be, taking into account only the component in question and its actual resource needs? (i.e. not simply projecting based on a linear multiple of peak utilization).
I'm going to be a little harsh here, and please forgive me: intellectual dishonesty, especially when the hard numbers are easily determinable, is something I've denied engineers' promotions for. It's genuinely impressive that you've saved the company money, but $500k/year based on peak projections is a very different number than, say, $100k/year in actual resources saved over the full course of it.
I wonder whether this was your first attempt to solve this issue with LLMs, and this was the time you finally felt they were good enough for the job. Did you try doing this switch earlier on, for example last year when Claude Code was released?
However, after it came out, it suddenly behaved close to what they marketed it as. So it was my first real end-to-end project with AI in the front seat. Though design-wise it is nowhere near perfect, and I was holding its hand the entire way through.
> This was costing us ~$300K/year in compute
Wooof. As soon as that kind of spend hit my radar for this sort of service I would have given my most autistic and senior engineer a private office and the sole task of eliminating this from the stack.
At any point did anyone step back and ask if jsonata was the right tool in the first place? I cannot make any judgements here without seeing real world examples of the rules themselves and the ways that they are leveraged. Is this policy language intentionally JSON for portability with other systems, or for editing by end users?
This article does not do much to improve their standing.
The thing is, if it took them a day with AI it would’ve been _at most_ a week without it. So why did they wait? Someone is not being responsible with the company funds.
I think the better question is whether it’s avoidable. I share the concern but is there a real alternative? “Say no to AI!” is fine until your competitors decide they don’t share your concerns. Or at least not enough to stop using it.
My god. But I am happy that they finally realised their error and put it right.
You used to have two problems. Now you have three.
Did you know that you can pass numbers up to 2 billion in 4 constant bytes instead of as a dynamic string averaging 20 bytes? Also, fun fact, you can cut your packets in half by not repeating the names of your variables in every packet; you can instead use a positional system where the position determines the variable and its type.
And you can do all of this with pre AI technology!
Neat trick huh?
And up to 4 billion if you're not bothered about those pesky negative nancies!
So I don't see there is any point in the article.
Since at least November 2022, everyone in the software industry knows that "it is possible to save money with AI rewrites".
I found nothing I don't already know from the article.
The only way this specific article gained attention is with the number in the headline.
Your physical form is destructively read into data, sent via radio signal, and reconstructed on the other end. Is it still you? Did you teleport, or did you die in the fancy paper shredder/fax machine?
If vibe code is never fully reviewed and edited, then it's not "alive" and effectively zombie code?
Also, I have to comment on the many commenters who spent time researching existing Go implementations just to question everything, because "AI bad". I don't know how much enterprise experience the average HN commenter has these days, but it's not usually easy to simply swap a library in a production system like that, especially when the replacement lib is outdated and unmaintained (which is the case here). I remember a couple of times I was tasked with migrating a core library in a production system only to see everything fall apart in unexpected ways the moment it touched real data. Anyway, the case here seems even simpler: the existing Go libs, apart from being unmaintained and obscure, don't support the current features of JSONata 2.x, which gnata does. Period.
The article should have anticipated such criticism and explained this in more detail - that's my feedback to the authors. But congrats anyway; this is one of the best use cases for current AI coding agents.
> The reference implementation is JavaScript, whereas our pipeline is in Go. So for years we’ve been running a fleet of jsonata-js pods on Kubernetes - Node.js processes that our Go services call over RPC. That meant that for every event (and expression) we had to serialize, send over the network, evaluate, serialize the result, and finally send it back.
> This was costing us ~$300K/year in compute, and the number kept growing as more customers and detection rules were added.
For something so core to the business, I'm baffled that they let it get to the point where it was costing $300K per year.
The fact that this only took $400 of Claude tokens to completely rewrite makes it even more baffling. I can make $400 of Claude tokens disappear quickly in a large codebase. If they rewrote the entire thing with $400 of Claude tokens it couldn't have been that big. Within the range of something that engineers could have easily migrated by hand in a reasonable time. Those same engineers will have to review and understand all of the AI-generated code now and then improve it, which will take time too.
I don't know what to think. These blog articles are supposed to be a showcase of engineering expertise, but bragging about having AI vibecode a replacement for a critical part of your system that was questionably designed and costing as much as a fully-loaded FTE per year raises a lot of other questions.
> For something so core to the business, I'm baffled that they let it get to the point where it was costing $300K per year.
And this, this is the core/true/insightful story the executives will never hear about.
In the world of manufacturing this is known as a gain-sharing plan. Not sure I'd call it common, but it certainly isn't unheard of
That takes a lot of engineer hours to set up and maintain. This architecture didn't just happen, it took a lot of FTE hours to get it working and keep it that way.
If you’re skipping 8 $300k projects a year that could be done by one fully-burdened $400k developer, something is wrong.
Over the years of running these I think the key is to keep the cluster config manual and then you just deploy your YAMLs from a repo with hydration of secrets or whatever.
No, because you can use that 300k to solve some real problem instead of literally lighting it on fire.
(Hell, just give employees avocado toasts or pingpong tables instead.)
I was also bothered by this:
"February 2026" is just way too specific. It feels like a PR/marketing team wrote it. It acts like a jump scare in the post for any normie programmer. https://www.anthropic.com/news/claude-opus-4-6
The big coding model moments in recent recollection, IMO, were something like:
- Sonnet 3.5 update in October 2024: ability to generate actually-working code using context from a codebase became genuinely feasible.
- Claude 4 release in May 2025: big tool calling improvements meant that agentic editors like Claude Code could operate on a noticeably longer leash without falling apart.
- Gemini 3 Pro, Claude 4.5, GPT 5.2 in Nov/Dec 2025: with some caveats these were a pretty major jump in the difficulty and scale of tasks that coding assistants are able to handle, working on much more complex projects over longer time scales without supervision, and testing their own work effectively.
[0] https://github.com/anthropics/claude-code/issues/11447
I think a lot of the noise about letting Claude run for very extended periods involves relatively greenfield projects where the AI is going to be using tools and patterns and choices that are heavily represented in training data (unless you tell it not to), which I think are more likely to result in a codebase that lends itself to ongoing AI work. People also just exaggerate and talk about the one time doing that actually worked vs the 37 times Claude required more handholding.
The bigger problem I see with the "leave it running for the weekend" type work is that, even if it doesn't get caught up on something trivial like tabs vs spaces (glad we're keeping that one alive in the AI era, lol), it will accumulate bad decisions about project structure/architecture/design that become really annoying to untie, and that amount to a flavor of technical debt that makes it harder for agents themselves to continue to make forward progress. Lots of insidious little things: creating giant files that eventually create context problems, duplicating important methods willy nilly and modifying them independently so their implementations drift apart, writing tests that are..."designed to pass" in a way that creates a false sense of confidence when they're passing, and "forest for the trees" kind of issues where the AI gets the logic right inside a crucial method so it looks good at a glance, but it misses some kind of bigger picture flaw in the way the rest of the code actually uses that method.
1: https://marginlab.ai/trackers/claude-code-historical-perform...
(They still wanted to go ahead with the migration, but that's a different story.)
Yeah I would too lol. During Covid I found myself in the odd situation of developing a new Access DB product and man was it miserable.
A human writes some poor but working code that's supposed to be a demo; it goes to production 9 times out of 10.
Then it becomes critical infrastructure.
Then management cannot understand why something working needs a rewrite because there's no tangible numbers attached to it. The timeless classic developer problem.
We were here ^^^^ up to 2024-2025.
Now, with LLMs, you can at least come up with a vibe coded, likely correct, likely faster, solution in a morning, that management won't moan at you about.
LLMs will only ever be as good as an average programmer, and average programmers usually get stuff wrong.
What do you base this claim on?
> average programmers usually get stuff wrong.
All programmers get stuff wrong.
They might be different bugs and technical debt than the original, so it might take you long enough to run into them that the engineer who did it can take the credit for solving the original problem without taking the blame for the new ones.
That seems unlikely.
I agree. But most of the time the people responsible for the codebase/architecture do not want those questions raised. AI is a greatly appreciated emergency exit for those situations. Apparently.
I don't know if that matches my experience. I've seen plenty of places where the dev teams complain about tech debt and other kludges costing too much, slowing them down and causing other problems, but management don't want to "waste time re-writing working code".
But now that management read on linkedin they can jump on the AI bandwagon by having the team use AI to fix tech debt, there's suddenly time to work on it.
The original is ~10k lines of JS + a few hundred for a test harness. You can probably oneshot this with a $20/month Codex subscription and not even use up your daily allowance.
In this case AI allowed the developer to make a change that the organisation would not have allowed. Regular rewrites don't let you signal to investors that you are AI ready/ascendant/agentic (whatever the latest AI hype term is) so would have been blocked. But, an AI rewrite.
AI benefits rely on these good engineers having 5, 10, 20 years of experience pre-AI designing (and fully, thoroughly understanding) these systems. What's going to happen to that engineering skill after 15 years of AI use?
You build something that's a dirty hack but it works, then your company grows, and nobody ever gets around to replacing it.
I was at a place spending over $4 million a year on redshift basically because someone had slapped together some bad (but effective!) queries when the company was new, and then they grew, and so many things had been built on top they were terrified to touch anything underneath.
Most startups didn’t care (to a point) because at that point in their lifecycle, the information they needed to get from those queries (and actions they could take based on it, like which customers were likely to convert and worth spending sales time on, etc) was more important than the money spent on the insane redshift clusters.
The mantra was almost always some version of, “just do it now, as fast as possible, and if we’re still alive in a year we’ll optimize then.”
The practical key point: if you want to do a large migration, have a very good and extensive test suite that Claude is not allowed to change during the migration. Then Claude is extremely impressive and accurate at migrating your codebase and needs minimal handholding. If you don't have a test suite, Claude will be freewheeling all the way. I just did an extensive migration project, and I should have focused on the test suite much more.
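A minimal sketch of what a "frozen" harness can look like in Go: the cases live in a JSON fixture shared with (or exported from) the original implementation, and the agent is allowed to touch the port but never the fixture. Everything here is hypothetical — `evaluate` stands in for the ported engine's entry point, and the fixture would normally be a checked-in file rather than an inline string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Case mirrors one entry of the shared JSON test fixture.
type Case struct {
	Expr     string `json:"expr"`
	Input    string `json:"input"`
	Expected string `json:"expected"`
}

// fixture would normally be read from a file that the migration
// agent is forbidden from editing; inlined here to stay self-contained.
const fixture = `[
  {"expr": "upper", "input": "hello", "expected": "HELLO"},
  {"expr": "upper", "input": "go", "expected": "GO"}
]`

// evaluate is a placeholder for the real ported evaluator.
func evaluate(expr, input string) string {
	if expr == "upper" {
		out := []rune(input)
		for i, r := range out {
			if r >= 'a' && r <= 'z' {
				out[i] = r - 32
			}
		}
		return string(out)
	}
	return ""
}

func main() {
	var cases []Case
	if err := json.Unmarshal([]byte(fixture), &cases); err != nil {
		panic(err)
	}
	passed := 0
	for _, c := range cases {
		if evaluate(c.Expr, c.Input) == c.Expected {
			passed++
		}
	}
	fmt.Printf("%d/%d passed\n", passed, len(cases))
}
```

The design point is that the expected values come from the reference implementation, not from the port, so the agent can't "fix" a failing case by adjusting the test.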
Will they? What makes you think so? If no one cared to improve it when it cost $300k/year, no one will care about it now that it's cheaper.
If the system is simple enough someone might take enough time to understand and verify the test suite to the point where they can keep adding regression tests to it and maybe mostly call it done.
They probably won’t do this though (based on the situation the company was in in the first place) and people will have Claude fix it and write tests that no one verified. And in a while the test suite will be so full of tests that reimplement the code instead of testing it that it will be mostly useless.
Then someone else will come in and vibe code a replacement that won’t have the bugs the current system does but will have a whole new set.
And the cycle will continue.
The same cycle that I’ve seen in the bottom 80% of companies I’ve worked for, just faster.
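The "tests that reimplement the code instead of testing it" failure mode is worth making concrete. A hypothetical sketch: a 10% discount was intended, but the implementation has a typo; a tautological test generated by reading the code encodes the bug, while a test pinned to the spec catches it.

```go
package main

import "fmt"

// Hypothetical function under test: a 10% discount was intended,
// but the implementation has a typo (0.09 instead of 0.9).
func discount(price float64) float64 { return price * 0.09 }

func main() {
	// Tautological test: derived from the implementation itself,
	// so it reproduces the bug and passes anyway.
	fmt.Println("tautological:", discount(200) == 200*0.09)

	// Pinned regression test: the expected value comes from the
	// spec (10% off 200 is 180), so the bug is caught.
	fmt.Println("pinned:", discount(200) == 180.0)
}
```

A suite full of the first kind is the "mostly useless" state described above: it asserts that the code does whatever the code does.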
AI is not a junior developer, as some analogise, but Rain Man. An ultra-autistic entity that can chew through way more logical conditions than you.
As long as you can describe the bug well, AI will likely fix it. Logs help.
Let me give you specific example.
Here's a fix made by claude to my SumatraPDF: https://github.com/sumatrapdfreader/sumatrapdf/commit/a571c0...
I have a crash reporting system that sends me crash information in text file: callstack of crashed thread, basic os info and logs of this execution.
The way I (well, claude) fixed this bug is: I said "analyze crash report: <paste crash report>" and it does it in under a minute.
Recently I've fixed at least 30 bugs with this process (you can view recent checkins).
Those are crashes I found hard to fix even though, by human standards, I'm both an expert developer and an expert Windows API developer.
But I'm not an autistic machine that can just connect the dots between how every Windows API works and how it ties to the callstack and the information from the log, in under a minute.
I feel like you’re probably just a worse engineer than you think you are if you needed Claude for this.
How about you try to make a change to SumatraPDF code base.
Let's see how good of an engineer you are when you actually have to write a line of C++ code in complex codebase as opposed to commenting on a check in with an explanation of the issue and a fix.
Claude fixed this crash in a minute: https://gist.github.com/kjk/d22af052499f70a45708c311eef201ff
Why don't you tell me, smart man, what the fix is and how long it took you to figure it out.
If you can do it in less than a day, then we can talk about how much better an engineer you are than me.
Doubt they'd have a blog post to write about that, though.
The use of ai agents allowed them to shrink the problem down to the point where it was small enough to fit in their free time and not interrupt their assigned work.
I get that it's fun and there's personal satisfaction in it, but it just reinforces to management that they don't need to care about allocating resources to optimisation, the problem will just take care of itself for free.
Should it be this way? No. Is it this way in practice? Unfortunately often.
This also explains this blog post
For the managers, it's about a bonus. For engineers it's the existential question of future hirability: every future employer will love the candidate with experience in operating a $500k/a cluster. The guy who wrote a library that got linked into a service... Yeah, that's the kind they already have, not interested, move along.
Then you look at it and you're like "Jesus! What the fuck, I meant to have this be a stop-gap". I've done as bad when at near 100% duty-cycle. Often you're targeting just the primary thing that's blocking some revenue and if you get caught yak-shaving you're screwed. A year ago, I did one of these things because I was in the middle of two projects that were blocking a potential hundred-million in revenue.
A year down the line, Claude Opus 4.6 could have live-solved it. But Claude of that time would have required some time and attention and I was doing something else.
That engineering team is some 15 people strong and the company is at $400m+ revenue. If you saw the code, you'd wonder why anyone would have done something like this.
0: I once did this because some inscrutable code/library was tying us to an old runtime so I just encapsulated it in HTTP and moved it into a service.
Kubernetes, App Engine, and Beanstalk are all huge money sinks.
Managed services like Cloud Datastore and Firestore all tend to accrue a lot of cost once you have a good-sized app.
These are quick to start when you don't have any traffic. Once traffic comes, the cost goes up drastically.
You can always do better running your own services.
It was "A few iterations and some 7 hours later - 13,000 lines of Go with 1,778 passing test cases."
edit: saw the total raise not the incremental 30MM
...but this is a VC funded AI startup; the product might still be burning VC money on each customer even after optimizing it.
We can deliver feature X for you - incrementally broken down into sub-features x1, x2, x3 over a period of Y weeks/months
The other way to do this would be to build a custom integration on top of your existing APIs and beta test it alongside the customer, bill them accordingly and eventually merge the changes into the main platform, once you can guarantee stability.
But, both these methods will sound boring to VC funded companies as they are under constant pressure from VCs to show something in their weekly graphs - meaningful or not.
https://github.com/RecoLabs/gnata
I have no idea what JSONata is. It seems it is not THAT hard to rewrite in Go, just very tedious, and it would cost more than 400 USD in developer time.
But, venture funding does create a lot of weird inefficiencies which vary from company to company.
The reason it matters is (1) because it’s directly relevant to profitability projections, i.e. cost per customer, and (2) because management looks at those numbers and sees potential headcount.