Constraint Decay: The Fragility of LLM Agents in Back End Code Generation
14 points by wek 3 hours ago | 3 comments

jdlshore 20 minutes ago
“Our systematic study exposes a phenomenon of constraint decay in LLM-based coding agents. While current models excel at unconstrained generation, their performance drops when forced to navigate explicit architectural rules. For end-users, this dichotomy implies that agents are reliable for rapid prototyping but remain unreliable for production-grade backend development.”

One major weakness of this study is that they didn’t fully test frontier models for cost reasons, so the specific performance results should be taken with a grain of salt. But the overall conclusion that models degrade when both behavior and architecture must be correct is interesting, and something to keep an eye on.

maxbond 22 minutes ago
Reminds me of the recent paper about delegating document editing tasks to LLMs across different disciplines [1]. That paper found that programming was the only discipline in which most LLMs could perform long-horizon tasks without accumulating errors & corrupting the document.

I've only read the abstract of this one so far, but it seems like this paper has zoomed in on programming with greater fidelity and shown a similar phenomenon. Not over long task horizons, though; more like "long style horizons" made up of larger sets of structural constraints.

[1] https://arxiv.org/abs/2604.15597

Discussion: https://news.ycombinator.com/item?id=48073246

gkfasdfasdf 17 minutes ago
Odd that they used GPT-5.2 and not GPT-5.2-codex, i.e. the one optimized for coding agent tasks.