
The software industry is collectively hallucinating a familiar fantasy. We visited versions of it in the 2000s with offshoring and again in the 2010s with microservices. Each time, the dream was identical: a silver bullet for developer productivity, a lever managers can pull to make delivery faster, cheaper, and better. Today, that lever is generative AI, and the pitch is seductively simple: If shipping is bottlenecked by writing code, and large language models can write code instantly, then handing developers an LLM should make velocity explode.
But software development has rarely been constrained by typing speed. The bottleneck is almost always everything except typing: deciding what to build, aligning on an approach, integrating it into an ecosystem that already exists, getting it through security and compliance, and then operating what you shipped.
AI can help with syntax, scaffolding, and the drudgery of boilerplate. It can also make a different problem much worse: It makes complexity cheap. So how do we tackle that problem? The answer is platforms. Or paved roads. Or golden paths. Whatever the term, the impact is the same: by giving developers guardrails, we can dramatically improve their productivity across the enterprise.
Production versus productivity
The evidence so far is useful precisely because it refuses to tell a single comforting story. A randomized controlled trial from METR, for example, found that experienced open source developers, working in complex repositories they already knew, took about 19% longer to complete tasks when using AI tools, even while believing they would be faster. In a very different setting, GitHub reported that developers using Copilot completed a specific, isolated programming task substantially faster in a controlled experiment and also felt better about the experience.
So which is it? Is AI a turbocharger or an anchor? The answer is yes, and that ambiguity is the point. Put AI into a healthy system and it can compound speed. Put AI into a fragmented system and it can compound chaos. The outcome depends less on which model you picked and more on the environment you allow that model to operate in. “AI makes developers productive” is not a tool claim—or it shouldn’t be. It is a systems claim.
That environment problem is not new. Years before prompt engineering became a job title, I argued that unfettered developer freedom was already colliding with enterprise reality. Freedom feels like agility until it becomes sprawl, fragmentation, and an integration tax nobody budgeted for. Generative AI does not reverse that dynamic. It accelerates it because it removes the friction that used to slow down bad decisions.
This is where leadership teams keep making the same fundamental error: They confuse production with productivity. If you define productivity as “shipping more code,” AI is the greatest invention in our lifetime. But in production, code is not an asset in isolation. Code is a liability you must secure, observe, maintain, and integrate. Every new service, dependency, framework, and clever abstraction adds surface area, and surface area turns speed into fragility.
AI lowers the cost of creating that surface area to near zero. In the past, bad architectural decisions were limited by how long it took to implement them. Now a junior engineer can generate a sprawling set of services and glue them together with plausible code they do not fully understand because the assistant handled the implementation details. The team will be proud of their speed right up until the first time the system has to be audited, patched, scaled, or handed to a different team.
At that point, the supposed productivity win shows up as an operating cost.
If you want to talk about developer productivity in the AI era, you have to talk about delivery performance. The DORA metrics remain a stubborn reality check because they measure throughput and stability rather than volume: lead time for changes, deployment frequency, change failure rate, and time to restore. The SPACE framework is also useful because it reminds us that productivity is multidimensional, and “feels faster” is not the same as “is faster.” AI often boosts satisfaction early because it removes drudgery. That matters. But satisfaction can coexist with worse performance if teams spend their time validating, debugging, and reworking AI-generated code that is verbose, subtly wrong, or inconsistent with internal standards.

If you want one manager-friendly measure that forces honesty, track the time to compliant deployment: the elapsed time from work being “ready” to actual software running in production with the required security controls, observability, and policy checks.
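To make that measure concrete, here is a minimal sketch in Python of how a delivery report might compute time to compliant deployment alongside a couple of DORA-style numbers. The event names, timestamps, and data source are assumptions for illustration, not a reference implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median

@dataclass
class Change:
    """One unit of work, tracked from 'ready' to a compliant production deploy."""
    ready_at: datetime       # work item approved and ready to ship
    deployed_at: datetime    # running in production
    compliant_at: datetime   # security controls, observability, and policy checks all green
    caused_incident: bool    # did this change trigger a rollback or incident?

def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600

def delivery_report(changes: list[Change]) -> dict[str, float]:
    """Summarize throughput and stability rather than the volume of code produced."""
    return {
        "median_lead_time_hours": median(hours(c.deployed_at - c.ready_at) for c in changes),
        "median_time_to_compliant_deployment_hours": median(hours(c.compliant_at - c.ready_at) for c in changes),
        "change_failure_rate": sum(c.caused_incident for c in changes) / len(changes),
    }

# Two illustrative changes pulled from a hypothetical delivery-tracking system.
changes = [
    Change(datetime(2025, 6, 2, 9), datetime(2025, 6, 2, 15), datetime(2025, 6, 3, 11), False),
    Change(datetime(2025, 6, 4, 10), datetime(2025, 6, 5, 9), datetime(2025, 6, 9, 16), True),
]
print(delivery_report(changes))
```

The gap between the first two numbers is where the “feels faster” illusion hides: code that ships quickly but spends days waiting on security and policy work has not actually made the organization faster.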
This is the part the industry still tries to dance around: AI makes the freedom problem worse. Gergely Orosz argues that as AI writes more of the code, engineers move up the abstraction ladder. The job shifts from writing to reviewing, integrating, and making architectural choices. That sounds like a promotion. Hurray, right? Maybe. In practice, it can be a burden because it assumes a level of systems understanding that is unevenly distributed across a team.
Compounding the problem, when creation becomes cheap, coordination becomes expensive. If you let every team use AI to generate bespoke solutions, you end up with a patchwork quilt of stacks, frameworks, and operational assumptions. It can all look fine in pull requests and unit tests, but what happens when someone has to integrate, secure, and operate it? At that point, the organization slows down, not because developers cannot type, but because the system cannot cohere.
Paved roads and platforms
Forrester’s recent research hits this nail on the head. They argue that architecture communities are the “hidden engine of enterprise agility.” This isn’t about re-establishing the ivory tower architects of the service-oriented architecture era who drew diagrams nobody read. It is about preventing the massive tax of integration workarounds. Forrester suggests that without coordination, architects spend up to 60% of their time just trying to glue disparate systems together rather than innovating. AI, left unchecked, will push that number to 90%.
The solution is not to ban AI, nor is it to let it run wild. The solution is to pave the road. I have written extensively about the need for golden paths. A golden path, or “paved road” in Netflix parlance, is an opinionated, supported route to production. It is a set of composable services, templates, and guardrails that make the right way of building software also the easiest way.
In the AI era, the golden path is non-negotiable. The cognitive load on developers is already too high; asking them to choose libraries, models, prompting strategies, and RAG architectures is a recipe for burnout. Your platform team must standardize the boring parts.
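What standardizing the boring parts can look like in practice is a single, platform-owned bundle of defaults. The sketch below is hypothetical, and every name and value in it is an assumption, but it shows the shape of the decision: the platform team picks the model, the retrieval settings, and the allowed dependencies once, so individual developers do not have to.

```python
# A hypothetical "paved road" bundle maintained by the platform team, so that
# individual developers are not choosing models, prompting strategies, or
# retrieval settings on their own. All names and values are illustrative.
PLATFORM_AI_DEFAULTS = {
    "code_assistant": {
        "model": "internal-approved-code-model",   # vetted once by security and legal
        "system_prompt": "Use the internal service templates and style guide.",
        "allowed_dependencies": ["internal-http", "internal-auth", "internal-logging"],
    },
    "retrieval": {
        "index": "engineering-docs",               # the single blessed knowledge base
        "chunk_size_tokens": 512,
        "top_k": 5,
    },
    "generation_guardrails": {
        "require_template": True,                   # scaffolds must start from a golden path template
        "block_unapproved_licenses": True,
    },
}
```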
Imagine two scenarios. In the first, a developer asks an AI to build a microservice. The AI scans the internet, picks a random framework, and writes code that complies with zero of your company’s security policies. The developer feels fast for 10 minutes, then spends a week fighting the security review.
In the second scenario, the developer is on a golden path. The AI is constrained to use the internal templates. It generates a service that comes pre-wired with the company’s authentication, logging sidecars, and deployment manifests. The code it writes is boring. It is compliant. And it deploys in 10 minutes. In this model, the productivity win didn’t come from the AI’s ability to write code. It came from the platform’s ability to constrain the AI within useful boundaries.
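Here is a minimal sketch of what constraining the AI within useful boundaries might look like at the platform level: a hypothetical guardrail check, runnable in CI or as a post-generation hook, that rejects a scaffold unless it came from an approved internal template and ships with the required auth, logging, and deployment wiring. The template names and required files are invented for illustration.

```python
from pathlib import Path

# Hypothetical paved-road policy: approved internal templates and the
# artifacts every generated service must ship with before it can deploy.
APPROVED_TEMPLATES = {"internal-rest-service", "internal-batch-job"}
REQUIRED_FILES = [
    "service.yaml",                 # deployment manifest wired to the platform
    "auth/middleware.py",           # company authentication baked in
    "observability/logging.yaml",   # logging sidecar configuration
]

def check_golden_path(service_dir: Path) -> list[str]:
    """Return a list of violations; an empty list means the scaffold is on the paved road."""
    violations = []
    marker = service_dir / ".template"
    if not marker.exists() or marker.read_text().strip() not in APPROVED_TEMPLATES:
        violations.append("not generated from an approved internal template")
    for relative in REQUIRED_FILES:
        if not (service_dir / relative).exists():
            violations.append(f"missing required platform artifact: {relative}")
    return violations

# Fail in minutes, not after a week of security review.
problems = check_golden_path(Path("services/payments-api"))
if problems:
    raise SystemExit("off the golden path:\n" + "\n".join(problems))
```

The specific checks matter less than where they live: the platform, not the prompt, decides what compliant looks like.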
The most productive developers of the next decade won’t be the ones with the most freedom. They will be the ones with the best constraints, so they can stop worrying about the plumbing and focus on the problem. If you’re a development lead, your job is to help create constraints that enable, rather than stifle, productivity.

