Cognitive Debt: When Velocity Exceeds Comprehension

2026-02-28 15:39 · www.rockoder.com

A systems analysis of how AI-assisted development creates a gap between output speed and understanding, and why organizations cannot see it happening.

The engineer shipped seven features in a single sprint. DORA metrics looked immaculate. The promotion packet practically wrote itself.

Six months later, an architectural change required modifying those features. No one on the team could explain why certain components existed or how they interacted. The engineer who built them stared at her own code like a stranger’s.

Code has become cheaper to produce than to perceive.

The Comprehension Lag

When an engineer writes code manually, two parallel processes occur. The first is production: characters appear in files, tests get written, systems change. The second is absorption: mental models form, edge cases become intuitive, architectural relationships solidify into understanding. These processes are coupled. The act of typing forces engagement. The friction of implementation creates space for reasoning.

AI-assisted development decouples these processes. A prompt generates hundreds of lines in seconds. The engineer reviews, adjusts, iterates. Output accelerates. But absorption cannot accelerate proportionally. The cognitive work of truly understanding what was built, why it was built that way, and how it relates to everything else remains bounded by human processing speed.

This gap between output velocity and comprehension velocity is cognitive debt.

Unlike technical debt, which surfaces through system failures or maintenance costs, cognitive debt remains invisible to velocity metrics. The code works. The tests pass. The features ship. The deficit exists only in the minds of the engineers who built the system, manifesting as uncertainty about their own work.

The debt is not truly invisible. It eventually appears in reliability metrics: Mean Time to Recovery stretches longer, Change Failure Rate creeps upward. But these are lagging indicators, separated by months from the velocity metrics that drive quarterly decisions. By the time MTTR signals a problem, the comprehension deficit has already compounded.
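For concreteness, the two reliability metrics named above are simple ratios. The sketch below computes them from invented incident and deploy data; none of these numbers come from any real system.

```python
from datetime import timedelta

# Hypothetical incident and deploy data, invented for illustration.
incidents = [                      # time from detection to recovery
    timedelta(minutes=12),
    timedelta(hours=3, minutes=40),
    timedelta(minutes=25),
]
deploys_total = 120                # deploys in the period
deploys_causing_failure = 9        # deploys that triggered an incident or rollback

# Mean Time to Recovery: average incident duration.
mttr = sum(incidents, timedelta()) / len(incidents)

# Change Failure Rate: share of deploys that led to a failure.
change_failure_rate = deploys_causing_failure / deploys_total

print(f"MTTR: {mttr}")                     # 1:25:40 for this data
print(f"CFR:  {change_failure_rate:.1%}")  # 7.5% for this data
```

Whatever the exact formulas, the point stands: both numbers only move after incidents have already happened, months behind the velocity metrics they trail.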

What Organizations Actually Measure

Engineering performance systems evolved to measure observable outputs. Story points completed. Features shipped. Commits merged. Review turnaround time. These metrics emerged from an era when output and comprehension were tightly coupled, when shipping something implied understanding something.

The metrics never measured comprehension directly because comprehension was assumed. An engineer who shipped a feature was presumed to understand that feature. The presumption held because the production process itself forced understanding.

That presumption no longer holds. An engineer can now ship features while maintaining only surface familiarity with their implementation. The features work. The metrics register success. The organizational knowledge that would traditionally accumulate alongside those features simply does not form at the same rate.

Performance calibration committees see velocity improvements. They do not see comprehension deficits. They cannot, because no artifact of the organizational measurement system captures that dimension.

The Reviewer’s Dilemma

The discussion of cognitive debt typically focuses on the engineer who generates code. The more acute problem sits with the engineer who reviews it.

Code review evolved as a quality gate. A senior engineer examines a junior engineer’s work, catching errors, suggesting improvements, transferring knowledge. The rate-limiting factor was always the junior engineer’s output speed. Senior engineers could review faster than juniors could produce.

AI-assisted development inverts this relationship. A junior engineer can now generate code faster than a senior engineer can critically audit it. The volume of generated code exceeds the bandwidth available for deep review. Something has to give, and typically it is review depth.

The reviewer faces an impossible choice. Maintain previous review standards and become a bottleneck that negates the velocity gains AI provides. Or approve code at the rate it arrives and hope the tests catch what the review missed. Most choose the latter, often unconsciously, because organizational pressure favors throughput.

This is where cognitive debt compounds fastest. The author’s comprehension deficit might be recoverable through later engagement with the code. The reviewer’s comprehension deficit propagates: they approved code they do not fully understand, which now carries implicit endorsement. The organizational assumption that reviewed code is understood code no longer holds.

The Burnout Pattern

Engineers working extensively with AI tools report a specific form of exhaustion that differs from traditional burnout. Traditional burnout emerges from sustained cognitive load, from having too much to hold in mind while solving complex problems. The new pattern emerges from something closer to cognitive disconnection.

The work happens quickly. Progress is visible. But the engineer experiences a persistent sense of not quite grasping their own output. They can execute, but explanation requires reconstruction. They can modify, but prediction becomes unreliable. The system they built feels slightly foreign even as it functions correctly.

This creates a distinctive psychological state: high output combined with low confidence. Engineers produce more while feeling less certain about what they have produced. In organizations that stack-rank based on visible output, this creates pressure to continue generating despite the growing uncertainty.

The engineer who pauses to deeply understand what they built falls behind in velocity metrics. The engineer who prioritizes throughput over comprehension meets their quarterly objectives. The incentive structure selects for the behavior that accelerates cognitive debt accumulation.

When Organizational Memory Fails

Knowledge in engineering organizations exists in two forms. The first is explicit: documentation, design documents, recorded decisions. The second is tacit: understanding held in the minds of people who built and maintained systems over time. Tacit knowledge cannot be fully externalized because much of it exists as intuition, pattern recognition, and contextual judgment that formed through direct engagement with the work.

When the people who built a system leave or rotate to new projects, tacit knowledge walks out with them. Organizations traditionally replenished this knowledge through the normal process of engineering work. New engineers building on existing systems developed their own tacit understanding through the friction of implementation.

AI-assisted development potentially short-circuits this replenishment mechanism. If new engineers can generate working modifications without developing deep comprehension, they never form the tacit knowledge that would traditionally accumulate. The organization loses knowledge not just through attrition but through insufficient formation.

This creates a delayed failure mode. The system continues to function. New features continue to ship. But the reservoir of people who truly understand the system gradually depletes. When circumstances eventually require that understanding, when something breaks in an unexpected way or requirements change in a way that demands architectural reasoning, the organization discovers the deficit.

How the Debt Compounds

Three failure modes emerge as cognitive debt accumulates.

The first involves the reversal of a normally reliable heuristic. Engineers typically trust code that has been in production for years. If it survived that long, it probably works. The longer code exists without causing problems, the more confidence it earns. AI-generated code inverts this pattern. The longer it remains untouched, the more dangerous it becomes, because the context window of the humans around it has closed completely. Code that was barely understood when written becomes entirely opaque after the people who wrote it have moved on.

The second failure mode surfaces during incidents. An alert fires at 3:00 AM. The on-call engineer opens a system they did not build, generated by tools they did not supervise, documented in ways that assume familiarity they do not possess. They are debugging a black box written by a black box. What would have been a ten-minute fix when someone understood the system becomes a four-hour forensic investigation when no one does. Multiply this across enough incidents and the aggregate cost exceeds whatever velocity gains AI-assisted development provided.

The third failure mode operates on a longer timescale. Junior engineers who rely primarily on AI-assisted development never develop the intuition that comes from manual implementation. They ship features without forming the scar tissue that informs architectural judgment. The organization is effectively trading its pipeline of future Staff Engineers for this quarter’s feature delivery. The cost does not appear in current headcount models because the people who would have become senior architects five years from now are not yet absent. They simply never form.

The Director’s View

From the perspective of engineering leadership, AI-assisted development presents as productivity gain. Teams ship faster. Roadmaps compress. Headcount discussions become more favorable. These are the observable signals that propagate upward through organizational reporting structures.

The cognitive debt accumulating in those teams does not present as a signal. There is no metric for “engineers who can explain their own code without re-reading it.” There is no dashboard for “organizational comprehension depth.” The concept does not fit into quarterly business review formats or headcount justification narratives.

Directors make decisions based on observable signals. When those signals uniformly indicate success, the decision to double down on the approach that produced those signals is rational within the information environment available to leadership. The decision is not wrong given the data. The data is incomplete.

Where This Model Breaks

The cognitive debt framing does not apply uniformly across all engineering work. Some tasks genuinely are mechanical. Some codebases genuinely benefit from rapid iteration without deep architectural understanding. Some features genuinely do not require the level of comprehension that would traditionally form through manual implementation.

The model also assumes that comprehension was previously forming at adequate rates. This assumption may be generous. Engineers have always varied in how deeply they understood their own work. The distribution may simply be shifting rather than a new phenomenon emerging.

Additionally, tooling and documentation practices may evolve to partially close the comprehension gap. If organizations develop methods for capturing and transmitting the understanding that AI-assisted development fails to form organically, the debt may prove manageable rather than accumulative.

The Measurement Problem

The fundamental challenge is that organizations cannot optimize for what they cannot measure. Velocity is measurable. Comprehension is not, or at least not through any mechanism that currently feeds into performance evaluation, promotion decisions, or headcount planning.

Until comprehension becomes legible to organizational decision-making systems, the incentive structure will continue to favor velocity. Engineers who prioritize understanding over output will appear less productive than peers who prioritize output over understanding. Performance calibration will reward the behavior that accumulates debt faster.

This is not a failure of individual managers or engineers. It is a measurement system designed for an era when production and comprehension were coupled, operating in an era when that coupling no longer holds. The system is optimizing correctly for what it measures. What it measures no longer captures what matters.

The gap will eventually manifest. Whether through maintenance costs that exceed projections, through incidents that require understanding no one possesses, or through new requirements that expose the brittleness of systems built without deep comprehension. The timing and form of manifestation remain uncertain. The underlying dynamic does not.


Comments

  • By Klaster_1 2026-02-28 17:54 (4 replies)

    The article very much resonates with my experience over the past several months.

    The project I work on has been steadily growing for years, but the number of engineers taking care of it has stayed the same or even declined a bit. Most features are isolated and left untouched for months unless something comes up.

    So far, I've managed the growing scope by relying on tests more and more. Then I switched to exclusively developing against a simulator. Checking changes against the real system became rare and more involved; when you have to check, it's usually the gnarliest parts.

    Last year, I noticed I can no longer answer questions about several features: despite working on those for a couple of months and reviewing PRs, I barely hold the details in my head soon afterwards. And this was all before coding agents penetrated deep into our process.

    With agents, I noticed exactly what the article talks about. Reviewing PRs feels even more detached; I have to exert deliberate effort because tacit knowledge of the context hasn't formed yet, and you have to review more than before. The stuff goes in one ear and out the other. My teammates report a similar experience.

    Currently, we are trying various approaches to deal with that, but it's still too early to tell. We now commit agent plans alongside code so as to maybe not lose insights gained during development. Tasks with vague requirements, most of which we'd previously understand implicitly, are now a bottleneck, because typing requirements into an agent for planning immediately surfaces issues you'd otherwise only think of during backlog grooming. Skill MDs are often tacit knowledge dumps we previously kept distributed in less formal ways. Agents are forcing us to up our process game and discipline, and real people benefit from that too. As the article mentioned, I am looking forward to tools picking up some of that slack.

    One other thing that surprised me was that my eng manager was seemingly oblivious to my ongoing complaints about growing cognitive load and confusion rate. It's as if the concept was alien to them, or they couldn't comprehend that other people handle that at a different capacity than they do.

    • By datsci_est_2015 2026-02-28 18:23

      > One other thing that surprised me was that my eng manager was seemingly oblivious to my ongoing complaints about growing cognitive load and confusion rate.

      Engineering managers in my experience (even ones with deep technical backgrounds) often miss the trees for the forest. The best ones go to bat for you, especially once they've verified that they can do something to unblock or support you. But that's still different from being in the terminal or IDE all day.

      Offloading cognitive load is pretty much their entire role.

    • By matsemann 2026-02-28 18:40 (2 replies)

      The way to learn has always been to write things down. Just reading it seldom sticks.

      • By RealityVoid 2026-02-28 19:36 (1 reply)

        Absolutely not. Learning has been to experiment with the thing until you form an effective mental model of it. Writing things down does ab-so-lutely nothing except make you feel good in the moment. Just like listening to a lecture without engaging with the subject matter more deeply.

        Writing things down is important for organisational persistence of information but that is something else.

        • By direwolf20 2026-02-28 19:43 (1 reply)

          Writing is better than reading, but doing is better than writing.

          • By shimman 2026-02-28 19:45 (2 replies)

            How does this apply to coding when the act of writing IS doing? Or do you mean like coding "on your own" versus following a tutorial for example?

            • By coldtea 2026-02-28 20:13

              Means writing code (doing) vs writing documentation / plans / project architecture documents and so on.

            • By direwolf20 2026-02-28 19:58

              Writing code is doing

      • By 0wis 2026-02-28 19:01

        I'm not sure humanity learned nothing before the last 8000 years. It was just very slow. Maybe we will need new ways to learn.

    • By nsvd2 2026-02-28 18:06 (2 replies)

      I think that recording the dialogue with the agent (the prompt, the agent's plan, and the agent's report after implementation) will become increasingly important in the future.

      • By slashdev 2026-02-28 18:21 (3 replies)

        I have this at the bottom of my AGENTS.md:

        You will also add a markdown file to the changelog directory named with the current date and time `date -u +"%Y-%m-%dT%H-%M-%SZ"`, record the prompt, and a brief summary of what changes you made, this should be the same summary you gave the developer in the chat.

        From that I get the prompt and the summary for each change. It's not perfect but it at least adds some context around the commit.
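        A sketch of what that instruction produces (the file contents and prompt text here are invented examples, not from the parent's actual setup):

```shell
#!/bin/sh
# Illustrative sketch of the changelog convention described above:
# one markdown file per change, named with the UTC timestamp,
# recording the prompt and the agent's summary.
set -eu

mkdir -p changelog
ts="$(date -u +"%Y-%m-%dT%H-%M-%SZ")"

# The prompt and summary below are made-up placeholders.
cat > "changelog/${ts}.md" <<'EOF'
## Prompt
Add retry with backoff to the upload client.

## Summary
Wrapped the upload call in a retry loop with exponential backoff;
added a unit test covering the max-attempts case.
EOF
```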

        • By atmosx 2026-02-28 19:21 (1 reply)

          Isn’t the commit message a better place to record the what and why? You might need to feed in some info that the agent doesn’t have access to (“we are developing feature X; this change will do such and such”), but the agent will write a pretty good commit message most of the time. Why do you need a markdown file? Are you releasing new versions of the software for third parties?

          • By Belphemur 2026-02-28 19:37 (1 reply)

            Cheaper and faster retrieval: the file can be added to the context and discovered by the agent.

            It takes more git commands to find the right commit containing the context you want (whether it's you, the human, or the LLM burning too many tokens and too much time) than to just include the right MD file or use grep with proper keywords.

            Moreover, you might need multiple commits to get the full context, while if you ask the LLM to keep the MD file up to date, you have everything together.

            • By atmosx 2026-02-28 19:49

              I doubt you can give more context to an LLM from a README file than 500 properly written commits. Or to a human for that matter.

        • By mort96 2026-02-28 19:34

          How often, in your experience, do people read those auto-generated markdown files? Do you have any empirical data on how useful people find reading other people's agents' auto-generated files?

        • By addaon 2026-02-28 18:52

          How often is it the same summary given to the developer in the chat?

      • By Klaster_1 2026-02-28 18:16 (1 reply)

      Agreed, but current agents don't help with that. I use Copilot, and you can't even dump a session while preserving complete context, including images, tool call results, and subagent outputs. And even if you could, you'd immediately blow up the context trying to ingest it all. This needs supporting tooling, like in today's submission where an agent accesses terabytes of CI logs via ClickHouse.

        • By gopher_space 2026-02-28 18:53

          I've had some luck creating tiny skills that produce summaries. E.g. a current TASK.md is generated from a milestone in PLAN.md, and when work is checked in STATUS.md and README.md are regenerated as needed. AGENTS.md is minimal and shrinking as I spread instructions out to the tools.

          Part of my CI process when creating skills involves setting token caps and comparing usage rates with and without the skill.

    • By bluegatty 2026-02-28 19:15

      We don't have the right abstractions in place to support true AI-driven work. We replaced ourselves, but we don't have the tools to work "1 layer up".

  • By pajtai 2026-02-28 16:14 (19 replies)

    The whole premise of the post, that coders remember what they wrote 6 months ago and why, is flawed.

    We've always had the problem that understanding while writing code is easier than understanding code you've written. This is why, in the pre-AI era, Joel Spolsky wrote: "It's harder to read code than to write it."

    • By Vexs 2026-02-28 17:07 (1 reply)

      I don't remember exactly what I wrote and how the logic works, but I generally remember the broad flow of how things tie together, which makes it easier to drop in on some aspect and understand where it is code-wise.

      • By Verdex 2026-02-28 18:51 (1 reply)

        There's code structure but then there's also code philosophy.

        The worst code bases I have to deal with have either no philosophy or a dozen competing and incompatible philosophies.

        The best are (obviously) written in my battle tested and ultra refined philosophy developed over the last ~25 years.

        But I'm perfectly happy to be working in code bases written even with philosophies that I violently disagree with. Just as long as the singular (or at least compatible) philosophy has a certain maturity and consistency to it.

        • By jfreds 2026-02-28 20:07

          I think this is well put. A cohesive philosophy, even if flawed, is a lot easier to work with than a patchwork of out-of-context “best practices” stitched together by an LLM.

    • By senko 2026-02-28 16:38 (3 replies)

      I recently did some work on a codebase I last touched 4 years ago.

      I didn't remember every line but I still had a very good grasp of how and why it's put together.

      (edit: and no, I don't have some extra good memory)

      • By copperx 2026-02-28 17:09 (3 replies)

        Lucky you. I always go "huh, so I wrote this?". And this was in the pre-AI era.

        • By seba_dos1 2026-02-28 17:16 (1 reply)

          These feelings aren't mutually exclusive. I'm often like "I have no memory of this place" while my name stares at me from git blame, but that doesn't mean my intuition of how it's structured isn't highly likely to be right in such cases.

          • By datsci_est_2015 2026-02-28 18:24

            Like a painter not remembering a specific stroke, but being able to recreate it instantly because of years of muscle memory.

        • By layer8 2026-02-28 20:02

          There is probably a bias here, because you notice the times where the code is unfamiliar more than the times when it’s still familiar. You wouldn’t go “huh” if not remembering was the normal case. If it were, you’d rather go “huh” if exceptionally you do remember.

        • By suzzer99 2026-02-28 19:25

          It feels like that at first, especially as I get older. But I still think it comes back to me a lot quicker if I once understood it than if I was learning it from scratch. Possibly just because I know how I think.

      • By vjvjvjvjghv 2026-02-28 18:31 (1 reply)

        I definitely understand my own code better than what other people wrote, even from 10 years ago. I often see code and think "this makes sense to do it this way". Turns out I wrote it years ago.

      • By SoftTalker 2026-02-28 17:14 (1 reply)

        I find this to be the case if it was something I was deeply involved with.

        Other times, I can make a small change to something that doesn't require much time, and once it's tested and committed, I quickly lose any memory of even having done it.

        • By senko 2026-02-28 17:25

          Yeah I did pour a lot of sweat and thinking into that codebase all those years ago.

          When I do a drive-by edit, I probably don't remember it in a week.

          Which is why the "cognitive debt" from the article is relevant, IMHO. If I just thoroughly review the plan and quickly scan the resulting code, will that have a strong enough imprint on my mind over time?

          I would like to think "yes"; my gut is telling me "no". IMHO the LLMs are now "good enough" for coding. These are hard questions we'll have to grapple with in the next year or two (in the context of AI-assisted software development).

    • By seba_dos1 2026-02-28 17:08

      I juggle between various codebases regularly, some written by me and some not, often come back to things after not even months but years, and in my experience there's very little difference in coming back to a codebase after 6 months or after a week.

      The hard part is to gain familiarity with the project's coding style and high level structure (the "intuition" of where to expect what you're looking for) and this is something that comes back to you with relative ease if you had already put that effort in the past - like a song you used to have memorized in the past, but couldn't recall it now after all these years until you heard the first verse somewhere. And of course, memorizing songs you wrote yourself is much easier, it just kinda happens on its own.

    • By softwaredoug 2026-02-28 16:30 (1 reply)

      If I’m learning for the first time, I think it matters to hand code something. The struggle internalizes critical thinking. How else am I supposed to have “taste”? :)

      I don’t know if this becomes prod code, but I often feel the need to spin up something like a Jupyter notebook and build a solution step by step to ensure I understand.

      Of course I don’t need to understand most silly things in my codebase. But some things I need to reason about carefully.

      • By Vexs 2026-02-28 17:05 (1 reply)

        Almost anything I write in Python I start in Jupyter, just so I can roll it around and see how it feels, which determines how I build it out and, to some degree, how easy it is to fix issues later on.

        With LLM-first coding, this experience is lost.

        • By softwaredoug 2026-02-28 18:28

          Yeah I do that too. I also teach training with Jupyter notebooks (ironically about agents). I still find it invaluable.

    • By zeroonetwothree 2026-02-28 18:05

      I still remember the core architecture of code I wrote 20 years ago at my first job. I can visualize the main classes and how they interact even though I haven’t touched it since then.

      Meanwhile some stuff Claude wrote for me last week I barely remember what it even did at a high level.

    • By Retric 2026-02-28 16:20 (2 replies)

      Harder here doesn’t mean slower. Reading and understanding your own code is way faster than writing and testing it, but it’s not easy.

      AI tools don’t prevent people from understanding the code they are producing, as it wouldn’t actually take that much time, but there’s a natural tendency to avoid hard work. Of course AI code is generally terrible, making the process even more painful, but you were just looking at the context that created it, so you have a leg up.

      • By layer8 2026-02-28 20:13

        The reason it’s hard is exactly because you have to do it in shorter time and without a feedback cycle that has you learn bit by bit, like when you’d write the code yourself. It has some similarity with short-term cramming for an exam, where you will soon forget most of it afterwards, as opposed to when you built up the knowledge and problem-solving exercise over a longer period of time.

      • By forgetfreeman 2026-02-28 16:35

        Certainly AI tools don't prevent anything per se, that's management's job. Deadlines and other forms of time pressure being what they are it's trivial to construct a narrative where developers are producing (and shipping) code significantly faster than the resulting codebase can be fully comprehended.

    • By TallGuyShort 2026-02-28 17:08

      This is also an area where AI can help. Don't just tell it to write your code. Before you get going, have it give you an architectural overview of certain parts you're rusty on, have it summarize changes that have happened since you were familiar, have it look at the bigger picture of what you're about to do and have it critique your design. If you're going to have it help you write code, don't have it ONLY help you write code. Have it help you with all the cognitive load.

    • By iainctduncan 2026-02-28 16:40 (2 replies)

      Oh come on, that is complete nonsense. I can reunderstand complicated code I wrote a year ago far, far faster than complicated code someone else wrote. Especially if I also wrote tests, accompanying notes, and docs. If you can't understand your old code when you come back to it... including looking through your comments and docs and tests... I'm going to say you're doing it wrong. Maybe it takes a while, but it shouldn't be that hard.

      Anyone pretending gen-ai code is understood as well as pre-gen-ai, handwritten code is totally kidding themselves.

      Now, whether the trade off is still worth it is debatable, but that's a different question.

      • By bogzz 2026-02-28 17:53

        The trade-off is worth it in my opinion when you are in a time crunch to deliver a demo, or are asked to test out an idea for a new feature (also in a time crunch).

        The hope being that if the feature were to be kept or the demo fleshed out, developers would need to shape and refactor the project as per newly discovered requirements, or start from scratch having hopefully learnt from the agentic rush.

        To me, it always boils down to LLMs being probabilistic models which can do more of the same that has been done thousands of times, but which also exhibit emergent reasoning-like properties that sometimes allow them to combine patterns. It's not actual reasoning; it's a facsimile of reasoning. The bigger the models and the better the RLHF and fine-tuning, the more useful they become, but my intuition is that LLMs will always asymptotically approach actual reasoning without being able to get there.

        So the notion of no-human-brain-in-the-loop programming is to me, a fool's errand. I do obviously hope I am right here, but we'll see. Ultimately you need accountability and for accountability you need human understanding. Trying to move fast without waiting for comprehension to catch up (which would most likely result in alternate, better approaches to solving the problem at hand) increases entropy and pushes problems further down the road.

    • By bikelang 2026-02-28 17:30

      It’s hard to keep the minutiae in your memory over a long period of time - but I certainly remember the high level details. Patterns, types, interfaces, APIs, architectural decisions. This is why I write comments and have thorough tests - the documentation of the minutiae is critical and gives guardrails when refactoring.

      I absolutely feel the cognitive debt with our codebase at work now. It’s not so much that we are churning out features faster with ai (although that is certainly happening) - but we are tackling much more complex work that previously we would have said No to.

    • By Thanemate 2026-02-28 17:38

      OP talks about the increased frequency of such events happening, and not that this is a new problem.

      For example, handwritten code also tended to be reviewed manually by every other member of the team, so the probability of someone recalling it was higher than with, say, LLM-generated code that was also LLM-reviewed.

    • By barrkel 2026-02-28 17:58

      Understanding other people's code is harder than understanding your own code though.

    • By red_admiral 2026-02-28 17:18

      Even in the past, it was an optimistic assumption that your engineers would still be working for you in a year's time. You need some kind of documentation / instructive testing anyway. And maybe more than one person who understands each bit of the system (bus factor).

    • By fritzo 2026-02-28 19:11

      I recently spent 1.5 weeks fixing a bug I introduced 20 years ago. Can confirm, I have no idea what I was thinking back then.

    • By predkambrij 2026-02-28 18:01

      My experience with Perl. "Write-only" language.

    • By yakattak 2026-02-28 17:35

      The individual details, probably not. But the high level/broad strokes I definitely remember 6+ months later.

    • By AIorNot 2026-02-28 18:22

      Also, the article itself is AI-written or AI-assisted - there's a tendency in AI text to bloviate and expound on irrelevant stuff until it loses the plot.

      AI-written spec docs and documentation have the same problem.

    • By maqp 2026-02-28 17:14

      A lot of bug fixing relies on a mental model of the code. It manifests as rapid "Oh, 100% I know what's causing this" eureka moments. With generated code, that part's gone for good. The "black box written by a black box" framing is spot on: you're completely dependent on an LLM to maintain the codebase. Right now it's not a vendor-lock thing, but I worry it's going to be a monopoly thing. There are going to be 2-3 big companies at most, and with the bubble eventually bursting and investor money drying up, running agents might get a lot more expensive. Who's going to propose the rewrite of thousands of LLM-generated features, especially after the art of programming dies along with the current seniors who burn out or retire?

    • By SpicyLemonZest 2026-02-28 17:10

      I’m very confused by this statement. I routinely answer questions about why we wrote the code we wrote 6 months ago and expect other people to do the same. In my mind that skill is one of the key differences between good and bad developers. Is it really so rare?

    • By empath75 2026-02-28 16:24

      I have been laboriously going through the process of adding documentation, in-code comments explaining the purpose and all the interfaces we expect, and tests, to make it easier for Claude to work with the code - but it also makes it easier for me to work with it.

      Claude often makes a hash of our legacy code, and then I go look at what we had there before it started and think “I don’t even know what I was thinking, why is this even here?”

  • By jasode 2026-02-28 16:55 (3 replies)

    Not to disagree with anything the article talks about but to add some perspective...

    The complaint about "code nobody understands" because of accumulating cognitive debt also happened with hand-written code. E.g. some stories:

    - from https://devblogs.microsoft.com/oldnewthing/20121218-00/?p=58... : >Two of us tried to debug the program to figure out what was going on, but given that this was code written several years earlier by an outside company, and that nobody at Microsoft ever understood how the code worked (much less still understood it), and that most of the code was completely uncommented, we simply couldn’t figure out why the collision detector was not working. Heck, we couldn’t even find the collision detector! We had several million lines of code still to port, so we couldn’t afford to spend days studying the code trying to figure out what obscure floating point rounding error was causing collision detection to fail. We just made the executive decision right there to drop Pinball from the product.

    - and another about the Oracle RDBMS codebase from https://news.ycombinator.com/item?id=18442941

    (That hn thread is big and there are more top-level comments that talk about other ball-of-spaghetti projects besides Oracle.)

    • By bootsmann 2026-02-28 17:25 (1 reply)

      This underlines the argument of the OP, no? The argument presented is that the situation where nobody knows how or why a piece of code was written will happen more often and appear faster with AI.

      • By layer8 2026-02-28 19:52

        Indeed, it’ll just produce legacy code faster. We’d need AI to be much better at reliably maintaining code quality, architecture, and feature-rationale documentation than the average developer in the average software project is. And that may be indistinguishable from AGI.

    • By the_arun 2026-02-28 17:41 (3 replies)

      Probably, we need to start saving prompts in Version Control. Prompts could be the context for both humans & machines.
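      A minimal sketch of what that could look like (the layout and file names here are hypothetical, not an established convention): commit each prompt and its rationale as a small artifact next to the code it produced, so the "why" travels with the repository history.

      ```shell
      # Hypothetical layout: one numbered prompt artifact per feature,
      # versioned in the same repository as the code it generated.
      mkdir -p prompts
      printf '%s\n' \
        'Feature: session-based login' \
        'Why: SSO is not available yet; revisit this when it is.' \
        'Prompt: "Add session-based login using the existing User model."' \
        > prompts/0001-user-auth.md

      git init -q
      git add prompts/0001-user-auth.md
      git -c user.name=demo -c user.email=demo@example.com \
          commit -q -m "feat: user auth (rationale in prompts/0001-user-auth.md)"

      # Later, the rationale can be recovered from history:
      git log --oneline -- prompts/
      ```

      Whether this scales past a handful of features is exactly the open question raised downthread.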

      • By abustamam 2026-02-28 20:45

        I've been doing a version of this in a side project. Instead of saving the prompt directly, I have a roadmap. When implementing features, I tell it to brainstorm an implementation from the roadmap. When fixing a bug, I tell it to brainstorm fixes from the roadmap. There's some back and forth, and then it writes a slice that gets committed. Then I look it over, verify the scope, and it makes a plan (also committed). Then it generates work logs as it codes.

        My prompts are literally "brainstorm next slice" or "brainstorm how to fix this bug" or "talk me through the trade-offs of approach A vs. B", so those prompts aren't meaningful on their own.

        It's quite effective, but I'm a team of one.

      • By lurkshark 2026-02-28 19:58

        I agree with this, I like spec-driven-development tooling partially for this reason. That being said, what I’ve found is often that I don’t include enough of the “why” in my prompt artifacts. The “what” and “how” are pretty well covered but sometimes I find myself looking back at them thinking “Why did I do this?” I’ve started including it but it does sometimes feel weird because I feel like “Why would the LLM ‘care’ about this story?”

      • By layer8 2026-02-28 19:49 (1 reply)

        I wonder how scalable that is. After the twentieth feature has been added, how much connection will the conversation about the first feature still have to the current code? And you’ll need a larger and larger context for the LLM to grok the history; or you’ll have to have it rewrite the history in shorter form, but that runs into the same failure modes that prevent us from just having it maintain complete documentation (obviating the need to keep a history) in the first place.

        • By lurkshark 2026-02-28 20:06

          Things like MemGPT/Letta, ToM-SWE, and Voltropy have made long context documentation pretty manageable. You could probably build some specialized tooling/prompts for development artifacts specifically too. But I’ll be the first to admit this is basically “Throw more agents at the problem”

    • By abustamam 2026-02-28 20:42

      "when I wrote the code, only me and God understood it. Now, only God understands it."

      (attributed to Martin Fowler but I can't find any solid evidence)

HackerNews