Simon Willison, a well-known Python developer and AI coding commentator, recently observed a troubling trend in his own workflow: the line between "vibe coding" and "agentic engineering" is blurring. In a conversation with Joseph Ruscio on Heavybit's High Leverage podcast, Willison described how the two approaches — which he had previously kept firmly separate — are starting to overlap in ways that make him uncomfortable.
The original distinction
Willison originally defined vibe coding as the practice of using AI to generate code without reviewing it. The user may not even know how to program. They ask for something, get a result, and if it works, they move on. If it doesn't, they ask again. Code quality, security, and maintainability are not considerations. Willison's position was that vibe coding is fine for personal tools where bugs only hurt the user, but "grossly irresponsible" for software used by others.
Agentic engineering, by contrast, is what professional software engineers do: they use AI coding tools as amplifiers of their own expertise. They review the generated code, understand security and performance implications, and aim to build higher-quality production systems faster. Willison described relying on his 25 years of experience to guide the tools.
The blur
The problem, Willison realized, is that as coding agents become more reliable, he no longer reviews every line of code they produce, even for production-level work. He gave the example of asking Claude Code to build an API endpoint that runs a SQL query and returns the results as JSON. "It's just going to do it right," he said. "It's not going to mess that up." He adds automated tests and documentation, but he is not reading the code.
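To make the example concrete, here is a minimal sketch of the kind of endpoint he means, assuming Flask and SQLite; the podcast does not name a stack, and the `items` table is hypothetical:

```python
import sqlite3

from flask import Flask, jsonify

app = Flask(__name__)
DB_PATH = "app.db"  # hypothetical database file


@app.route("/api/recent-items")
def recent_items():
    # Open a fresh connection per request; sqlite3.Row lets us
    # convert each result row into a plain dict for JSON output.
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row
    try:
        rows = conn.execute(
            "SELECT id, name, created_at FROM items "
            "ORDER BY created_at DESC LIMIT 20"
        ).fetchall()
        return jsonify([dict(row) for row in rows])
    finally:
        conn.close()


if __name__ == "__main__":
    app.run()
```

This is exactly the sort of well-trodden boilerplate Willison trusts an agent to get right: in his workflow, he would layer automated tests and documentation on top rather than read it line by line.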
This creates a feeling of guilt. Willison compared it to working at a larger organization where another team hands over a service — say, an image resize service — and you use it without reading their code. You treat it as a semi-black box until something breaks. The difference is that human teams have professional reputations and accountability. "Claude Code does not have a professional reputation," Willison noted. "It can't take accountability for what it's done."
The normalization of deviance
Willison identified a risk he describes as the "normalization of deviance," borrowing a term from safety engineering: every time a model writes correct code without close monitoring, the temptation grows to trust it at the wrong moment in the future. The more often the agent proves itself, the harder it becomes to maintain the discipline of review.
Evaluating software has changed
Willison also pointed out that the traditional signals of software quality — a GitHub repository with a hundred commits, a good README, comprehensive tests — are now easy to fake. "I can knock out a git repository with a hundred commits and a beautiful readme and comprehensive tests of every line of code in half an hour," he said. The result looks identical to a project that received genuine care and attention. Even for his own projects, he cannot tell the difference by inspection alone.
His new heuristic: he values actual usage over apparent quality. "If you've got a vibe coded thing which you have used every day for the past two weeks, that's much more valuable to me than something that you've just spat out and hardly even exercised."
Bottlenecks have shifted
Willison noted that the entire software development lifecycle was designed around the assumption that a developer produces a few hundred lines of code per day. If that rate jumps to 2,000 lines per day, both upstream and downstream processes break. He cited a talk by Jenny Wen, design lead at Anthropic, who observed that design processes are built around the cost of getting things wrong — because handing off a bad design to engineers who spend three months building it is catastrophic. If building takes much less time, the design process can afford to be riskier.
Why Willison is not worried about his career
Despite these concerns, Willison is not afraid that AI will replace software engineers. He described his conversations with coding agents as "moon language for the vast majority of human beings." The tools are amplifiers of existing experience. "If you know what you're doing, you can run so much faster with them," he said. But producing software remains "ferociously difficult."
He quoted political commentator Matthew Yglesias, who tweeted: "Five months in, I think I've decided that I don't want to vibecode — I want professionally managed software companies to use AI coding assistance to make more/better/cheaper software products that they sell to me for money." Willison agreed, adding that he would rather hire a plumber than plumb his own house after watching YouTube tutorials.
Bottom line
Willison's key takeaway is that the convergence of vibe coding and agentic engineering is real and happening faster than he expected. The practical response is not to abandon AI coding tools, but to maintain disciplined review practices — and to value real-world usage over surface-level quality signals. For production software, the question is not whether the code looks good, but whether it has been proven to work under real conditions.