Measurers are more important than ever

In May, Matthew Prince told Cloudflare’s organization that AI was going to come for the “measurers.” He borrowed the term from Peter Drucker’s 1954 frame, which divides corporate work into builders, sellers, and measurers, the third category being roughly middle management plus finance, audit, compliance, and most of the meeting-heavy coordination between teams. Prince’s argument was that AI eats the measurers first. He laid off about a thousand people on that thesis. The thesis is backwards, and the engineering version of the measurer role is the cleanest place to see why.

The “measurer” Prince has in mind is the person who writes the status report on what others did. That layer is real, and yes, AI is eating it. Good. But Drucker’s category was never just the reporting layer. It also covered the work of deciding what to measure, deciding what the measurements mean, deciding when to change course based on them, and coordinating that change across the system. The reporting layer is the byproduct. The decisions underneath are the work. AI absorbs the byproduct and makes the work that produces it more important, because the system being measured now moves ten times faster.

Engineering is where this is most visible right now. The agent that opens a PR while you’re in another tab has collapsed the cost of implementation. The bottleneck has moved one layer up, to the people deciding what should get implemented, in what order, against what coherence constraints. That work is engineering measurement, and it has never been more important than it is right now.

The measurement work the books were always about

The canonical engineering management books aren’t really about how to write code or even how to ship features. Will Larson’s An Elegant Puzzle, Camille Fournier’s The Manager’s Path, Claire Hughes Johnson’s Scaling People, and Andy Grove’s High Output Management are all about the strong form of measurement: deciding what to look at, what it means, what to do when it doesn’t add up, and how to act on the answer without breaking everything downstream.

Larson talks about teams in four states: falling behind, treading water, repaying debt, and innovating. The whole point of management is to measure which state a team is in and move it deliberately. He writes about how to allocate scarce senior engineers across teams, when to consolidate a team versus split it, how to run a migration that touches a hundred services without breaking the company. None of that gets easier when implementation gets cheaper. It gets more important, because the rate at which the system can change has gone up.

Hughes Johnson’s book is mostly about operating cadence: how meetings actually function and how role descriptions get written. These aren’t things an LLM helps you with. They’re the substrate that determines whether the work an LLM produces actually adds up to a product.

Fournier writes about transitions. IC to manager, manager to manager-of-managers, director to VP. Each transition is about giving up the work you were good at to take responsibility for work that’s harder to measure, and that hasn’t changed, though the IC-to-manager transition is stranger now because the IC work itself looks different than it did three years ago.

Grove, the oldest of the bunch, is the most relevant. High Output Management is fundamentally about leverage. A manager’s output is the output of their own team plus the output of the neighboring teams they influence. The job is to find activities that produce disproportionate output for the time spent. When implementation cost falls by an order of magnitude, the leverage of any single measurement decision a manager makes goes up by roughly the same amount.

What gets harder when throughput goes up

A team that can produce ten times more code per week does not magically produce ten times more value. They produce ten times the surface area. The architecture drifts in more places. Two engineers more often build the same thing independently. A small misalignment about what you’re actually trying to do compounds into weeks of wasted work before anyone notices.

Prioritization gets harder. When the cost of trying something was three weeks, you picked carefully. When it’s three hours, the temptation is to try everything, and the team that gives in ships a museum of half-finished features. Saying no requires actually understanding why one thing matters more than another, and that judgment doesn’t come from the model.

Coherence used to be free. Five engineers with agents can produce code in five different styles, against five different mental models of the system, in a single week. The codebase stays coherent now only if someone is explicitly responsible for it. That role used to be filled implicitly by the rate at which humans could write code, and the rate limit is gone.

Sequencing matters more. The order you do things in was always part of the cost, but it was small compared to the cost of any individual step, and now it’s the other way around. If you build the integration before the schema is stable, or migrate the easy services before noticing the hard one was load-bearing, the cost lands in days instead of being amortized over months.

The performance signal you used to rely on is mostly gone. When output was scarce, you could roughly tell who was a good engineer by how much they shipped. That signal is now noise. Two engineers can ship similar volumes of code with one producing maintainable systems and the other producing a slow-motion incident. Telling the difference requires reading the work and talking to the people downstream. The manager’s eye for quality is the measurement instrument now.

What you hire for changes too. The engineer who was valuable because they could grind through a hard implementation problem is still valuable, just less rare. The rare profile is the engineer who can hold the full system in their head and use agents to build the right pieces of it. Most interview loops don’t measure that yet.

And mentorship is an open problem. Juniors used to grow by writing a lot of code and getting it reviewed. That feedback loop is weaker when most of the code is being written by an agent, and nobody has a good replacement yet. Figuring it out is a management problem.

The work that’s actually different

A lot of what a tech lead used to do is now mostly mechanical. An agent can take a design doc and produce a reasonable task breakdown, or write a first-draft architectural proposal that’s good enough to argue with. The manager who insists on doing all of that by hand is wasting their own leverage.

The parts that aren’t mechanical are where the job is. Deciding whether the proposal is solving the right problem. Deciding whether to ship now or wait for the right abstraction. Those were always where the job was. The mechanical work was a tax that obscured it, and the tax is going away.

The good engineering managers I know aren’t panicking. They were already doing the high-leverage version of the job. They’re just doing more of it, on a faster cycle, with smaller teams that produce more.

Most of the work is project management

If you sit with an engineering manager for a week and write down what they actually do, a lot of it is project management. Not the Microsoft Project sense of the term, the more general one: tracking work in flight, unblocking the team, re-sequencing when something slips, coordinating with the teams downstream of you. The human work that fills books like Sarah Drasner’s Engineering Management for the Rest of Us and Michael Lopp’s Managing Humans matters, but it isn’t what fills the calendar. Project management is what fills the calendar.

This is the work that gets most directly hammered by ten-times-faster throughput. A Gantt chart that was accurate Monday morning is wrong by Tuesday afternoon. Estimation you spent a week on last sprint was calibrated to humans-only velocity and bears almost no relationship to what a fleet of agents can produce against the same scope. Someone wrote a status update yesterday describing a state of the world that no longer exists. The dependency between your team and the platform team that was three weeks out is now three days out, and they didn’t get the memo because nobody told them the schedule moved.

This is the work Prince was pointing at when he said AI would kill the measurers. And AI can do parts of it. It can summarize a PR queue, draft a status email, flag a slipped milestone. What it can’t do, and what gets harder when the cycle time collapses, is decide what to do about any of the things it surfaces. The PR queue grew by 40%; is that good throughput or are you shipping the wrong things? The dependency slipped; do you replan, or wait?

Those are project management decisions. They were always project management decisions. When the cadence was weekly they were tractable for one manager covering a team of six. When the cadence is hourly they aren’t tractable that way. The answer most orgs will reach for is to push the decisions down to the ICs. That partly works, because ICs already make a lot of these calls implicitly. It partly doesn’t, because the cross-team coordination problems that managers exist to handle don’t push down cleanly. Project management needs to become more structured at the manager level, not less, with better instrumentation underneath it and a higher tolerance for re-deciding things you decided yesterday.

Someone has to be accountable for what the agents ship

There’s a structural reason this work isn’t going away that’s separate from the leverage argument. The companies selling you tokens are explicit, in their terms of service, that they aren’t responsible for what the models produce once those outputs land in your systems. They sell tokens. They don’t sell accountability for what the code those models wrote does to your customers when you deploy it on Tuesday.

The accountability has to sit somewhere. It can’t sit with the model, because a model isn’t a legal entity and can’t be held responsible. It can’t sit entirely with the IC who shipped the agent’s output, because they were doing the job the org asked them to do. It has to sit with the person who decided this code was good enough to ship, against whatever standards the org commits to its customers. That’s measurement in the Drucker sense: judging the output of a system against criteria, and being the person who answers when the call turns out wrong.

When the throughput was capped by human writing speed, accountability was tractable, because every line of code had a human author whose name was on the commit. That’s not the world we’re in now. The new world has accountability gaps between the model that wrote the code, the engineer who reviewed it, the manager who approved the merge, and the org that shipped it. Closing those gaps is measurement work, and it’s becoming a board-level concern rather than just an engineering-leadership one.

Meta, Coinbase, and the rest of the public-company moves

Prince’s framing was the easier counterexample to argue with, because it relied on the Drucker word doing more work than it could carry. Meta and Coinbase are harder, because they aren’t making category claims. They’re acting on bets about what an engineering org should look like.

Meta has been converting engineering managers back to ICs and pushing manager-to-report ratios from 1:8 toward 1:50. They’re probably right about a chunk of the manager population. The ones who weren’t doing high-leverage measurement work were always a tax, and that tax gets exposed as soon as the org has to move faster. They’re probably wrong about the load-bearing ones, because stretching the ratio that far means almost no one is doing the prioritization, coherence, and sequencing work, which is exactly where the value compounds in a high-throughput org.

Coinbase went further. In May 2026 they cut roughly 14% of headcount and explicitly replaced “pure managers” with “player-coaches,” manager-IC hybrids expected to do both jobs at once. They announced “AI-native pods,” in some cases single-person teams directing agents across what used to be engineering, design, and PM. Brian Armstrong framed the whole thing as AI-accelerated restructuring. The stock got a brief lift on the announcement and is down roughly 10% in the month since.

That trajectory matters. A lot of the AI-layoff stories in 2026 read more like financial narratives than org-design decisions. Public companies are burning capex on AI at numbers that mostly don’t pencil out yet, and the easiest way to make that capex story add up for shareholders is to subtract people on the operating-expense side. Investors will reward the announcement, but if the underlying productivity story were real you’d expect the reward to compound rather than reverse. We should be careful about reading the recent moves as clean evidence of where engineering management is going. Some of them are just accounting.

What this means for orgs

The companies that figured out their management practices when implementation was expensive built up something that’s about to matter much more than they realized: the way decisions get made and how role boundaries actually work. Stripe didn’t publish An Elegant Puzzle and Scaling People by accident, and that work compounds.

The companies that papered over weak management with raw engineering throughput are more exposed than they used to be. You can no longer hide an unclear strategy behind a team that ships a lot, because the team will ship a lot of the wrong thing very fast, and you’ll learn it was wrong in days instead of quarters.

Beyond that, I don’t actually know what the right structure looks like, and I don’t think anyone else does either. The dynamics are too new. Meta’s flattening, Coinbase’s player-coaches, and the various AI-native pod experiments are live hypotheses on companies large enough to absorb the cost of being wrong for a few years. They may look prescient by 2028, or they may get re-evaluated the way the 2022 metaverse pivot was, and there isn’t enough information yet to tell which.

The direction of the leverage is clearer than the shape of the org. The productivity gain from agents landed on the implementation side, where it’s easy to see, but the effect is felt one layer up, in the measurement work that decides what all that new throughput is actually for. Whoever does that work, whatever the title or org chart looks like, their decisions are now worth far more than they were two years ago, and not nearly enough orgs are treating them as if that’s true. The right staffing levels and reporting structures will get figured out by losing money on the wrong ones.