The consequences of task switching in supervisory programming
117 points by bigwheels 3 days ago | 51 comments

simonw 20 hours ago
This cognitive debt bit from the linked article by Margaret-Anne Storey at https://margaretstorey.com/blog/2026/02/09/cognitive-debt/ is fantastic:

> But by weeks 7 or 8, one team hit a wall. They could no longer make even simple changes without breaking something unexpected. When I met with them, the team initially blamed technical debt: messy code, poor architecture, hurried implementations. But as we dug deeper, the real problem emerged: no one on the team could explain why certain design decisions had been made or how different parts of the system were supposed to work together. The code might have been messy, but the bigger issue was that the theory of the system, their shared understanding, had fragmented or disappeared entirely. They had accumulated cognitive debt faster than technical debt, and it paralyzed them.

reply
appplication 20 hours ago
This was essentially my experience vibe coding a web app. I got great results initially and made it quite far quickly, but over time velocity slowed dramatically due to exactly this cognitive debt. I took my time, did a ground-up rewrite manually, and made way faster progress toward a much more stable app.

You could argue LLMs let me learn enough about the product I was trying to build that the second rewrite was faster and better informed, and that's probably true to some degree, but it was also quite a few weeks down the drain.

reply
Mavvie 18 hours ago
That makes sense, but surely there's a middle ground somewhere between "AI does everything including architecture" and writing everything by hand?
reply
layer8 8 hours ago
I wonder about that. A general experience in software engineering is that abstractions are always leaky and that details always end up mattering, or at least that it’s very hard to predict which details will end up mattering. So there may not be a threshold below which cognitive debt isn’t an issue.
reply
simonw 6 hours ago
> So there may not be a threshold below which cognitive debt isn’t an issue.

That's my hunch too.

The problem isn't "I don't understand how the code works", it's "I don't understand what my product does deeply enough to make good decisions about it".

No amount of AI assistance is going to fill that hole. You gotta pay down your cognitive debt and build a robust enough mental model that you can reason about your product.

reply
layer8 5 hours ago
I wouldn’t use the term “product” here. Apart from most software being projects, not products, what I was getting at is that details and design decisions matter at all levels of software. You might have a robust mental model of your product as a product, and about what it does, but that doesn’t mean that you have a good mental model of what’s going on in some sub-sub-sub-module deep within its bowels. Software design has a fractal quality to it, and cognitive debt can accumulate at the ostensibly mundane implementation-detail level as well as at the domain-conceptual level. If you replace “product” by “module”, I would agree.
reply
simonw 5 hours ago
I think of that as the law of leaky abstractions - https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-a... - the more abstractions there are between you and how things actually work, the more chance there is that something will go wrong at a layer you're not familiar with.

I think of cognitive debt as more of a product design challenge - but yeah, it certainly overlaps with abstraction debt.

reply
appplication 18 hours ago
Of course! The original attempt wasn’t really AI doing everything. I was writing much of the code but letting AI drive general patterns since I was unfamiliar with web dev. Now, it’s also not entirely without AI, but I am very much steering the ship and my usage of AI is more “low context chat” than “agentic”. IMO it’s a more functional way to interface with AI for anyone with solid engineering skills.
reply
r_lee 18 hours ago
I think the sweet spot is to make the initial stuff yourself and then extend or modify somewhat with LLMs

it acts as a guide for the LLM too, so it doesn't have to come up with everything on its own; the existing code sets the style and design choices and keeps things consistent, I'd say

reply
elcritch 16 hours ago
For more complex projects I find this pattern very helpful. The last two gens of SOTA models have become rather good at following existing code patterns.

If you have a solid architecture they can be almost prescient in their ability to modify things. However, they're a bit like Taylor series expansions: they're only accurate out so far from the known basis. Hmm, or control theory, where you have stable and unstable regimes.
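
To spell the analogy out (standard Taylor remainder, nothing exotic):

    f(x) ≈ f(a) + f'(a)(x-a) + ... + f^(N)(a)/N! * (x-a)^N

The approximation is only trustworthy while the remainder, which scales like |x-a|^(N+1), stays small; drift too far from the expansion point and it quietly falls apart, much like a model extrapolating past the patterns it has seen.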

reply
mattmanser 15 hours ago
I think it's closer to "doing everything by hand" than you'd expect.

For me, anyway.

I design as I code, the architecture becomes more obvious as I fill in the detail.

So getting AI to do bits, really means getting AI to do the really easy bits.

reply
svara 15 hours ago
> So getting AI to do bits, really means getting AI to do the really easy bits.

As someone who gets quickly bored with repetitive work, this is big though.

reply
mpbart 8 hours ago
This is essentially the definition of complexity that Ousterhout argues for in the book A Philosophy of Software Design. I highly recommend reading it if you haven't; it's very good.
reply
ReptileMan 14 hours ago
I spend half my prompts making Codex explain why and what it is doing. Another 40% is reducing the size of the code base and optimizing. Only 10-ish percent is new development.
reply
phren0logy 20 hours ago
I'm not a coder, I'm a medical doctor. I see some interesting parallels between how medical students sort themselves into specialties by cognitive style and this new rift in programming with LLMs.

Some people like the deep work, some like managing a steady rain of chaos. There's no one right answer. But I'll tell you that my classmates who are happy as nephrologists are very different to the ones that are happy as transplant surgeons.

reply
sebmellen 8 hours ago
By that token, I’m curious to know what your specialty is and why you chose it.
reply
cadamsdotcom 2 hours ago
If after 7 or 8 weeks you can't change the software, it's wise to start putting tests in that retroactively document how things work / worked, and why.

The more the test suite grows, the more you & future agents will be able to consult it to understand how things should work - but also, why.

Imagine a test case that covers some non-compliant API response from a third party. The commit it's tied to, the date the test was added, all of that becomes metadata... and the fact it's executable means your agent can't undo that fix without something very visible in the PR.
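
A minimal sketch of the idea (pytest; the vendor quirk, field name, and helper are all invented for illustration):

    # Vendor X sometimes returns `total` as a string like "$12.50"
    # instead of a number, violating their own spec.
    def normalize_invoice(payload: dict) -> dict:
        total = payload["total"]
        if isinstance(total, str):
            total = float(total.lstrip("$"))
        return {"total": total}

    def test_tolerates_string_total_from_vendor_x():
        # Regression test pinning the fix for the non-compliant
        # response above. The commit it arrived in, and its date,
        # record *why* the string branch exists - an agent that
        # deletes it breaks this test visibly in the PR.
        assert normalize_invoice({"total": "$12.50"}) == {"total": 12.5}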

reply
evmar 22 hours ago
The term he’s searching for is possibly “intellectual control”: https://www.georgefairbanks.com/ieee-software-v36-n1-jan-201...
reply
tomsmithtld 11 hours ago
The cognitive debt framing clicks for me. I've been using Claude Code on a Laravel project, and the thing that actually keeps velocity up isn't the AI getting better, it's me writing tests first. Sounds boring, but when the LLM generates code against an existing test suite it basically has to follow your architecture.

You keep the mental model because you wrote the tests, and the AI keeps the codebase coherent because it has concrete constraints to work against. Without that, the codebase drifts into this weird state where technically everything works but you can't reason about any of it.
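
Roughly the shape of it, as a toy pytest example (DiscountService and the numbers are invented; the same idea applies with Laravel's test suite):

    # Written before asking the LLM for an implementation - the test
    # fixes the interface and the architectural seam up front.
    def test_discount_is_capped_at_the_configured_max():
        svc = DiscountService(max_rate=0.5)
        assert svc.apply(price=100.0, rate=0.8) == 50.0  # capped at 50%
        assert svc.apply(price=100.0, rate=0.2) == 80.0  # under the cap

    # This part is what the agent then generates against the test;
    # a hand-written reference version so the example runs.
    class DiscountService:
        def __init__(self, max_rate: float):
            self.max_rate = max_rate

        def apply(self, price: float, rate: float) -> float:
            return price * (1.0 - min(rate, self.max_rate))
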
reply
themafia 21 hours ago
> a third of them were instantly converted to being very pro-LLM. That suggests that practical experience

I wasn't aware one could get 'practical experience' "instantly." I would assume that their instant change of heart owes more to other factors. Perhaps concern over the source of their next paycheck? You have admitted you just "forced" them to do this. Isn't the question then, why didn't they do it before? Shouldn't you answer that before you prognosticate?

> that junior developers will still be needed, if nothing else because they are open-minded about LLMs

You're broadcasting, to me, that you understand all of the above perfectly, yet instead of acknowledging it, you're planning on taking advantage of it.

> I think the equivalent of cruft is ignorance

Exceedingly ironic.

> Will two-pizza teams shrink to one-pizza teams

The language you use to describe work, workers, and overcoming challenges is too depressing to continue with. You have become everything we hated about this profession.

reply
simonw 20 hours ago
If you haven't experienced a post-November-2025 coding agent before, and someone coaches you through how to one-shot prompt it into solving a difficult problem in your own codebase that you are deeply familiar with, I can see how you might be an almost instant convert.

(Based on your comment history I'm guessing you haven't experienced this yourself yet.)

reply
svara 15 hours ago
You're right, and I enjoy using coding agents too. I've built some things with them I wouldn't have otherwise.

However, it's been a full quarter now since November 2025.

Based on facts on the ground, i.e. the rate and quality of new software and features we observe, the change has been nowhere near as dramatic as your comment would suggest.

It seems to me that a possible explanation is that people get very excited about massive speedups in specific tasks, but the bottleneck of the system immediately shifts somewhere else (e.g., human capacity for learning, team coordination costs, communication delays).

reply
simonw 6 hours ago
That "full quarter" included the Christmas holidays for many people, during which not a lot of work gets done.

I think it's a bit early to expect to see huge visible output from these new tools. A lot of people are still spinning up on them - learning to use a coding agent effectively takes months.

And for people who are spun up, there's a lot more to shipping new features and products than writing the code. I expect we'll start to see companies ship features to customers that benefited from Opus 4.5/4.6 and Codex 5.2/5.3 over the next few months, but I'm not surprised there hasn't been a huge swell in stuff-that-shipped in just the ~10 weeks since those models became available.

There is one notable example that's captured the zeitgeist: https://github.com/openclaw/openclaw had its first commit on November 25th 2025, 3 months later it's had more than 10,000 commits from 600 contributors, attracted 196,000 stars and (kind-of) been featured in a Super Bowl commercial (apparently that's what the AI.com thing was, if anyone could get the page to load - https://x.com/kris/status/2020663711015514399 )

reply
icedchai 19 hours ago
This rings true for me. Up until the end of 2025 I had my doubts. I haven't fully adopted AI, but I am using it for several side projects where I normally would not have made much progress. The output w/Claude Code is solid.
reply
themafia 17 hours ago
The challenges I take on were selected because I enjoy solving them and because very few, if any, people have taken the time to work on them already. As such I have no desire to "one-shot" a solution, and I have serious doubts that any model trained on existing code could output anything useful or anything that truly fits the design of the system. These projects are written for style, to explore ideas, and to gain experience. Inviting an LLM in out of laziness is completely the opposite of my intentions.

The only other code that I write is for a handful of industry specific products that are not challenging in any way to code but are fun to design for the specific needs of my users and are informed by their incredible feedback from the field. The time and effort to play games with an LLM prompt would have effectively zero value here and again is the opposite of what makes these products great enough to be sold by word of mouth alone.

Aside from all of this, I have no desire to pay a subscription to a service that requires me to submit all of my code to their engine for output. Given their models' apparent fondness for taking copyrighted code and passing it off as their own, I would not put it past them to play games behind my back with my work.

Finally I see no new "AI billionaires" suddenly rising out of the field and I see no "AI heavy" companies suddenly increasing their profits, productivity or quality in any way. I hear what you are saying, and you're certainly not alone in saying it, but I see zero evidence that it's actually meaningful in the real world software market.

reply
wiz21c 15 hours ago
I would be very happy to solve problems that "very few, if any, people have taken the time to work on already."

My experience (as someone who works with a team of PhDs) is that code is about 30% of what we do, and of that, 75% is "trivial things" (building charts, quickly designing apps to process information, etc.). Out of that 75%, AI certainly helps us at least 50% of the time (and amazes me 10% of the time :-))

> I see no new "AI billionaires" suddenly rising out of the field and I see no "AI heavy" companies suddenly increasing their profits, productivity or quality in any way.

Exactly what I was telling myself yesterday. That's rather out of step with the media coverage.

reply
djkivi 8 hours ago
We need a new AI "Code Panther".
reply
habinero 20 hours ago
I have literally heard this exact vague phrase about every single stupid model that has come out, plus more than a few companies.

So far it's all been endless unfounded FOMO hype by people who have something to sell or podcasts to be on. I am so tired of it.

reply
simonw 20 hours ago
Ask around and see if you can find anyone you know who's experienced the November 2025 effect. Claude Code / Codex with GPT-5.1+ or Opus 4.5+ really did make a material difference - they flipped the script from "can write code that often works" to "can write code that almost always works".

I know you'll dismiss that as the same old crap you've heard before, but it's pretty widely observed now.

reply
geraneum 17 hours ago
I’ve been living this experience and using the latest models at work throughout this time. The failure modes of LLMs have not fundamentally changed. The makers are not terribly transparent about what exactly they change in each model release, the way you know what changed in, e.g., a new Django version. But there has not been a paradigm shift. I believe/guess (from the outside) that the big change you think you’re experiencing could be the result of many things: better post-training processes (RLHF) that make models run a predefined set of commands, like always running tests, or other marginal improvements to the models and a focus on programming tasks. To be clear, these improvements are welcome and useful, just not the groundbreaking change some claim.
reply
ej88 17 hours ago
the perimeter of the tasks the LLMs can handle continuously expands at a pretty steady pace

a year ago they could easily one shot full stack features in my hobby next.js apps but imploded in my work codebase

as of opus 4.6 they can now one shot full features in a complex js/go data streaming & analysis tool but implode in low latency voice synthesis systems (...for now...)

just depends on how you're using it (skill issues are a thing) and what you're working on

reply
kalessin 23 hours ago
I like the idea of "cognitive debt" vs "technical debt".
reply
jongjong 24 hours ago
Part of me feels like LLMs will struggle to architect code properly, no matter how good they get.

Software engineering is different from programming. Other kinds of engineers often ridiculed software engineers as "not real engineers" because mainstream engineers never had to build arbitrarily complex software systems from scratch. They have never experienced the cascading issues which often happen when trying to make changes to complex software systems. Their brief exposure to programming during their university days gave them a glimpse into programming but not software engineering. They think they understand it but they don't.

Other engineers think that they're the only ones wrestling with the laws of nature.

They're wrong. Software engineering involves wrestling with entropy itself. In some ways, it's an even purer form of engineering. Software engineering struggles against the most fundamental forces and requires reasoning skills of the highest order.

I think software engineers will be among the last of the white collar professions to be automated besides the ones which have legal protections like lawyers, judges, politicians, accountants, pilots... where a human is required to provide a layer of accountability. Though I think lawyers will be reduced to being "official human stamping machines" before software engineers are reduced to mere Product Owners.

reply
Swizec 23 hours ago
> Though I think lawyers will be reduced to being "official human stamping machines" before software engineers are reduced to mere Product Owners

GeLLMan Amnesia – AI can fully automate every profession except the ones I’m deeply familiar with.

I’m a software engineer who wears the product owner hat a lot these days, there’s no way AI will automate this any time soon. Too much peopling and accountability.

reply
Herring 22 hours ago
Don't be so sure about that. These days I'm already finding it 100x easier and more informative to have complicated, charged discussions (e.g. immigration) with Gemini than with actual people. It's night and day. Accountability might be solvable too, maybe escrow, and pay me if you waste my time. Or Amazon-like reviews.
reply
Aurornis 21 hours ago
> These days I'm already finding it 100x easier/informative to have complicated charged discussions (eg immigration) with Gemini than with actual people.

It’s scary how quickly people start to mistake LLMs appeasing them for actual conversation.

Discussing something with an LLM isn’t equivalent to having a conversation with a person. It’s just a text generator trained to show you what you want to see.

reply
Herring 21 hours ago
Don't assume everyone is like you. I'm an early adopter and I know how to scaffold it. I generally love the bleeding edge in all things, and I'm increasingly sure it's an actual talent to be able to quickly adapt to unfamiliar things (this includes not making assumptions).
reply
paulryanrogers 21 hours ago
What do you feel like you get out of discussions with Gemini about politically charged topics like immigration?
reply
Herring 21 hours ago
I think that's obvious from the discussion so far? It broadens my horizons.

YMMV. Ask the bot for supporting evidence, and follow up on google/wikipedia.

reply
KittenInABox 21 hours ago
Would you be willing to share the logs of a nuanced conversation you've had with Gemini?
reply
Herring 20 hours ago
I'm not sure. Most of it is not even on the logs, it's followed up elsewhere.

You can try something like this on Gemini 3 Pro:

> Break down aspects of the economy by amenability to state control high/medium/low, based on what we see in successful economies. Include a rationale and supporting evidence/counterexamples. Present it in 3 tables.

It should give you dozens of things you can look up. It might mention successful Singapore and Vienna-style public housing. Some nice videos on that on Youtube.

Online discussions are usually at the level of "[Flagged] Communism bad".

reply
linkregister 15 hours ago
I have the luxury of a few friends capable of discussing complex military, political, and social issues who are able to hold nuanced views backed by evidence.

Because of that good fortune, it hasn't occurred to me to use an LLM to organize information for these topics. I appreciate your sharing your approach and I look forward to trying this use case of LLMs.

reply
Herring 2 hours ago
Cheers, mate.
reply
metadat 23 hours ago
With the requisite planning steps Codex and Claude are already coming up with better architecture and design than I can.

I've been doing this for more than 25 years.

reply
_se 6 hours ago
Brother, you are just outing yourself as being incompetent here. That is embarrassing.
reply
whattheheckheck 22 hours ago
What's the most complicated thing its designed so far for you
reply
Herring 22 hours ago
Not that guy, but for me it's something like Tensorflow/Pytorch. A domain-specific language for a scientific application, Python API with a Rust core for very fast/safe calculations. It has all kinds of bells & whistles you'd want, like automatic differentiation, lazy evaluation, provenance, serialization, etc. Occasionally dips down to raw pointer work too. It's easy to test, so AI excels at this type of thing.
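
For a flavor of why that kind of core is so easy to test, a toy reverse-mode autodiff node might look like this (a micrograd-style sketch with invented names, not the actual code):

    class Var:
        # One node in the computation graph; `parents` holds
        # (input_node, local_gradient) pairs for the chain rule.
        def __init__(self, value, parents=()):
            self.value = value
            self.grad = 0.0
            self._parents = parents

        def __add__(self, other):
            return Var(self.value + other.value,
                       ((self, 1.0), (other, 1.0)))

        def __mul__(self, other):
            return Var(self.value * other.value,
                       ((self, other.value), (other, self.value)))

        def backward(self, seed=1.0):
            # Naive recursion; fine for a toy. Real systems
            # topologically sort the graph instead.
            self.grad += seed
            for parent, local in self._parents:
                parent.backward(seed * local)

    # d(x*y + x)/dx = y + 1 = 4 and d(x*y + x)/dy = x = 2 at x=2, y=3
    x, y = Var(2.0), Var(3.0)
    z = x * y + x
    z.backward()
    assert (x.grad, y.grad) == (4.0, 2.0)

Every rule at that granularity boils down to a one-line assert, which is exactly why AI excels at this type of thing.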
reply
sebmellen 23 hours ago
Beautifully expressed… you missed doctors in your list of white collar professions, but I’m sure surgeons and pilots will outlive all of us from an AI resilience standpoint.
reply
jongjong 23 hours ago
Ah yes 100%, doctors have legal moats too.

It's kind of terrifying to think that all professions are going to have to shift away from value creation to pure politics to survive.

I have a feeling that big tech companies will be legally forced to pay royalties to software engineers. Once software engineers stop applying their reasoning skills to solving real problems and start vengefully focusing it on politics, we're going to corrupt the whole system in our favor. We have enough collective knowledge to frame such corruption as moral in the context of an already corrupt system.

Either software engineers will create regulatory moats for themselves or there will be a more broad political movement like communism. I've met many people working deep in the critical systems which underpin our society who are full-blown communists.

reply
linkregister 4 hours ago
Software engineering as a field has exceptionally low worker solidarity. This is largely because talent and productivity are so stratified that even the median engineer produces an order of magnitude less value than the p95. Furthermore, enough opportunities exist (for engineers to become capitalists by founding, joining, or investing in an enterprise early enough to capture its upside) to credibly convince individuals that they may achieve this.

Software engineers in aggregate will happily automate themselves out of a job. Legal cartels like the AMA and the ABA will persist. It would take years of strong threats to software engineers' livelihoods to compel them to support a cartel of their own. Even the first step for their regulatory capture, credentialization, is rejected, as enough autodidacts without degrees practice in the field.

Essentially, too many software engineers view themselves as temporarily embarrassed millionaires rather than workers who need to band together. Automation in the field is happening faster than individuals' minds will change.

reply
whattheheckheck 22 hours ago
Openclaw as your own personal political advisor is pretty cool
reply