Coding agents were supposed to help developers move faster. Now they're being used to build the next generation of coding agents.

OpenAI quietly published a walkthrough last week of how it used Codex, its own coding agent, to build self-improving tax agents for an enterprise customer. The setup is simple in a slightly unsettling way. Codex writes the agent, runs it, watches where it fails, and rewrites it to do better the next time. The humans involved are mostly there to review the work and steer.

It's not a one-off. Pi Labs demonstrated a coding agent that modifies its own prompts and tools as it works. Google pulled CodeMender, its AI security patcher, into a broader agent ecosystem where it finds vulnerabilities, writes the fix, and ships it with limited human review.

And then there's Cognition. After closing a $1B round this week at a $26B valuation, the company said Devin, its own coding agent, now opens roughly 89% of pull requests across its engineering org. So the next version of Devin is, in large part, being written by the current version of Devin.

What's changing alongside the tooling is what a developer actually does all day. Enrique Ibarra, CIO at Mexican insurer GNP Seguros, which runs a thousand-developer engineering org, put it plainly to Forbes: "The human is not writing the code. The human is directing a platform on how to write the code. That's a huge change in paradigm."

That paradigm shift is what makes the loop interesting. If you're a developer at a company using Devin or Codex, your job is shifting toward telling the agent what to build and checking what it produces. If you're a developer at Cognition or OpenAI, your job is to do that on the very tools that are doing the work. The product is improving itself, and you're supervising the improvement.

There are limits worth flagging. A recent analysis of AI research agents found that even the best ones still struggle with the parts of problem-solving that aren't pattern-matching: framing the question, deciding what's worth working on, knowing when an answer is actually wrong. Self-improvement has a ceiling, and nobody yet knows where it is.

But the trajectory is hard to miss. The tools doing the work are increasingly the tools doing the building, and the people in the loop are doing less of the work and more of the directing.

Into the Valley

The interesting question isn't whether AI can write AI. It clearly can. The real question is what happens to the rate of improvement once the agents are improving themselves. If every iteration is faster than the last because the thing doing the iterating is also getting faster, the gap between the labs running these systems and everyone else stops growing in a straight line and starts to compound. That's the part nobody has fully priced in, and it's why a $26B valuation for a company whose own engineers barely write code anymore stops sounding quite so crazy.