Emilio Carrión
When LLMs Generate Thousands of Tokens per Second, What Matters Won't Be the Code
If regenerating is cheaper than maintaining, the rational strategy isn't to care for the code. It's to make it disposable by design and invest in what isn't disposable. The future of the engineer is writing DNA, not code.
Last week I ran an experiment. I took a logistics microservice that's been in production for many years, a service nobody wants to touch because the engineer who designed it is gone and the documentation is, being generous, incomplete. I asked an agent to regenerate it from scratch using only the existing tests and the contract specification.
The agent generated code. Reasonable, with flaws. But the interesting part was what turned out to be missing. The tests covered the happy path. The contracts defined inputs and outputs, but not the business invariants the team had discovered over the years: the prioritization rules during demand spikes, the retry thresholds someone tuned after an incident and never formalized. All of that lived in two people's heads. Not in any artifact.
The agent could regenerate the cells. But it didn't have the DNA.
Google's DORA 2025 report confirms this isn't anecdotal: AI already improves delivery speed, but it still correlates with greater instability. Their conclusion is brutal in its clarity: AI doesn't fix a team, it amplifies what's already there. Teams with good DNA improve. Teams with incomplete DNA break faster.
The inference speed of LLMs is going to turn code into disposable, regenerable material. But if you don't change your working model, that speed won't save you. It buries you. What survives isn't the implementation, but the information that allows you to reconstruct it. The future of the software engineer isn't writing code. It's writing DNA.
What's Coming (and Why It Changes Everything)
Everything I just described is happening today, with models that generate between 50 and 200 tokens per second. With perceptible latencies. With context windows that still fall short.
But what's coming is something else. Locally, with dedicated hardware and optimized models, we're already exceeding 17,000 tokens per second. New chips are pushing the cost per token down at a brutal pace. And these numbers will democratize, just like compute, storage, and bandwidth did before.
Think of it this way. Today an agent takes minutes to generate a module. With thousands of tokens per second, it takes seconds. At that point, generating an implementation from scratch is faster than reading the existing one to check whether it's correct.
The consequences are already visible. Paul Ford already ran the exercise: projects that would have cost $350,000 in 2021, he built alone in a weekend for $150 with Claude Code. And what I see in my teams is consistent: writing code is becoming the easy part. The hard part is knowing what to write, why, and how to tell if it's right.
If regenerating is cheaper than maintaining, the rational strategy isn't to care for the code. It's to make it disposable by design. And shift all investment toward what isn't disposable: the contracts, the tests, the business invariants, the evaluations, the SLAs.
I support seven teams in the online division of one of Spain's largest distribution companies. And I've already seen what happens when a team introduces agents without having improved their tests or contracts: a bigger codebase that fewer people understand. I've seen it more than once. At the scale of millions of daily operations, that's not a theoretical problem. It's an incident waiting its turn.
Chad Fowler calls it regenerative software: systems designed to burn and be reborn without losing their identity. I like the phoenix metaphor, but I think there's a more precise one.
Don't Write Cells. Write DNA.
The cells in your body are constantly being replaced. What persists is the information that says how to rebuild them. Software is entering the same dynamic: the code (the cells) is going to be cheaper and cheaper to generate, to discard, and to regenerate. What defines the system's identity is the information that allows it to be correctly reconstructed: the contracts, the evaluations, the business invariants, the feedback mechanisms. That is your software's DNA.
In the classic model, the code was the asset. An entire industry (estimates, code reviews, sprints, retrospectives) exists because writing code was expensive and dangerous to replace. In the model that's coming, code is consumable cells. What matters is the DNA: the verifiable information that ensures regeneration produces a healthy organism and not a tumor.
Because that's what happens when you generate code without DNA. It's mutation without natural selection. More code, faster, with no mechanism to distinguish correct from incorrect. And accelerated mutation without selection doesn't produce evolution. It produces disease. That's exactly what the DORA data shows: more speed, more instability.
Back to my experiment. Why couldn't the agent regenerate the service reliably? Not because of model limitations. Because the DNA was incomplete. There were no contracts saying "during demand spikes, prioritization follows these rules, not those." The team knew all of that. None of it was written in a way that a machine (or a new engineer) could use.
After that experiment, I started doing an exercise I call "the regeneration test": if tomorrow I had to regenerate this service from scratch, what percentage of the correct behavior would survive using only the artifacts I have? Without counting what people know.
The Three Layers of DNA
I've started thinking about a system's DNA as something with structure. It's not a monolithic block. It's three layers of expression, each with a distinct function (and each connects to a different verification layer):
What the system does. Behavioral tests, API contracts, input/output specs. This is the DNA that expresses itself every time, in every regeneration. The layer most teams have reasonably covered.
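An artifact in this first layer can be as small as an executable contract. A minimal sketch, assuming an invented shipping-quote endpoint (every name and rule here is hypothetical, not from the service in the experiment):

```python
from dataclasses import dataclass

# Hypothetical contract for a shipping-quote handler: the artifact that
# survives any regeneration of the implementation behind it.
@dataclass(frozen=True)
class QuoteRequest:
    weight_kg: float
    express: bool

@dataclass(frozen=True)
class QuoteResponse:
    price_eur: float
    eta_days: int

def check_contract(handler) -> None:
    """Behavioral spec: any implementation must satisfy these checks."""
    r = handler(QuoteRequest(weight_kg=2.0, express=False))
    assert isinstance(r, QuoteResponse)
    assert r.price_eur > 0 and r.eta_days >= 1
    fast = handler(QuoteRequest(weight_kg=2.0, express=True))
    assert fast.eta_days <= r.eta_days  # express is never slower

# A throwaway implementation: regenerate it freely, as long as the check passes.
def handler(req: QuoteRequest) -> QuoteResponse:
    base = 4.0 + 1.5 * req.weight_kg
    return QuoteResponse(price_eur=base * (2 if req.express else 1),
                         eta_days=1 if req.express else 3)

check_contract(handler)
```

The point is the asymmetry: `handler` is a cell, `check_contract` is DNA.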
How it does it well. Performance limits, resilience standards, retry policies, graceful degradation thresholds. This is the DNA that expresses itself under specific conditions: when there's load, when a downstream service fails, when the system operates at the edges. This is where most teams fall short.
In my experiment, this was exactly what failed: there was a retry rule manually tuned after an incident. I converted it into a parameterized property test. It took me one afternoon. That afternoon was worth more than weeks of generated code, because now that rule survives any regeneration. That's thickening the DNA.
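The real rule isn't mine to publish, but the shape of the artifact is. A minimal sketch of a parameterized property test, with an invented backoff policy (exponential, capped, giving up after five attempts) standing in for the tuned rule:

```python
from typing import Optional

# Hypothetical retry policy: the kind of rule that gets tuned after an
# incident and then lives only in someone's head unless it is written down.
MAX_ATTEMPTS = 5
CAP_SECONDS = 30.0

def backoff(attempt: int) -> Optional[float]:
    """Delay before the given attempt (1-based); None means give up."""
    if attempt > MAX_ATTEMPTS:
        return None
    return min(2.0 ** (attempt - 1), CAP_SECONDS)

# Parameterized property test: the invariants hold for every attempt number,
# so they survive any regeneration of the implementation above.
prev = 0.0
for attempt in range(1, MAX_ATTEMPTS + 1):
    delay = backoff(attempt)
    assert delay is not None and 0 < delay <= CAP_SECONDS  # always bounded
    assert delay >= prev                                   # never decreases
    prev = delay
assert backoff(MAX_ATTEMPTS + 1) is None                   # eventually gives up
print("retry invariants hold")
```

Note that the test pins down the properties (bounded, monotone, finite), not the exact delays, which is what lets the implementation be regenerated freely underneath it.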
Why it does it that way. The business decisions that shape the design. The agreements between teams that aren't in any ADR. The heuristics someone learned after an incident at three in the morning. This is the regulatory DNA: it doesn't encode behavior directly, but it determines when and how the other layers express themselves. It's the most valuable layer and the most invisible. It's what Plumbline, an open-source verification experiment, tries to make explicit and verifiable.
I'm not going to pretend I have this figured out. I've been turning it over for months and what I have are more questions than answers. But the questions seem like the right ones to me.
And someone will say: "this sounds like waterfall in disguise." No. I'm not proposing sequencing the complete DNA before starting. I'm proposing being honest about how much of what we call "tacit knowledge" is genuinely uncodifiable and how much is, simply, documentation laziness. That retry rule wasn't tacit. It was an explicit decision nobody bothered to turn into an artifact. In my experience, most of the knowledge teams consider "tacit" is actually explicit knowledge that was never formalized. The genuinely tacit kind exists, but it's a smaller fraction than we like to believe.
The Question That Separates Two Futures
There are two paths, and both are happening right now.
The first: using AI for what I call glorified typing. Writing code faster. More code, more features, more speed. Accelerated mutation without selection. The DORA data tells us how that ends: more fragile systems, the illusion of productivity without the results.
The second: changing the mental model. Treating code as disposable cells. Investing in the DNA: evaluations, contracts, system definitions. Making regeneration produce healthy organisms.
I've mentored many engineers, and what I see is that this transition is redefining what it means to be senior. Those who keep defining their value by the quality of their code are finding that an agent produces comparable code in a fraction of the time. Those who define their value by their ability to judge whether a system is correct, to formalize invariants nobody has written, and to anticipate failure modes are in a position that strengthens with every model improvement. And for juniors the path hasn't disappeared, it has shifted axis: from "learn to write clean code" to "learn to define what it means for a system to work correctly."
I don't trust this argument as a guarantee. I've been wrong before about how this industry would evolve, and I might be wrong now. But it's no longer a theoretical reflection for me. It's something I'm testing every week with my teams, and so far the data backs me up more than it doesn't.
I've been saying for years that code is a means, not an end. Now that idea has concrete consequences: if code is a means, treat it as such. Make it disposable. Invest in what surrounds it, defines it, and verifies it. Don't write cells. Write DNA.
Question for you: Run the regeneration test on a service from your team. If tomorrow you had to regenerate all the code from scratch, with only the tests and documentation you have today, what percentage of the correct behavior would survive? Not the percentage you believe. The one you can prove.
I'd love to know what number you get.
This content was first sent to my newsletter
