The AI ROI Gap: What Uber's CTO Couldn't Answer
In April 2026, Uber's CTO Praveen Neppalli Naga gave The Information one of the most honest quotes a public-company technology executive has offered on AI in years. Uber's full-year 2026 AI budget was already gone only four months into the year. The cause was not an infrastructure failure or a vendor contract issue. It was a coding assistant.
"I'm back to the drawing board, because the budget I thought I would need is blown away already."
Uber is a multi-billion-dollar R&D organization with dedicated finance and capacity-planning teams. If a team like that missed the forecast, the likelier explanation is that the forecasting model itself is broken. That is the part worth paying attention to, because this is not just an Uber problem. It is a preview of what happens when enterprise AI adoption scales faster than enterprise measurement systems.
What actually happened
Uber rolled out Claude Code access to engineering teams at the end of 2025 and actively encouraged adoption, including internal leaderboards ranking teams by usage.
The numbers that followed were significant:
- 84 percent of Uber developers were classified as agentic coding users by March 2026
- Claude Code usage roughly doubled within three months
- AI-related costs at Uber have increased approximately sixfold since 2024
- Around 11 percent of live backend code updates are now written by AI systems
Two realities exist at the same time. The productivity gains are real. The budgeting model is broken. Both being true simultaneously is exactly what makes this difficult for finance and technology leaders.
Why the budget model broke
Traditional enterprise software budgeting was designed around predictable per-seat licensing.
Fixed monthly cost. Predictable headcount. Stable forecasting.
Agentic AI tools do not behave like traditional enterprise software at scale. They are usage-based systems. An engineer occasionally using Claude Code for autocomplete consumes a fraction of the resources used by another engineer running multiple autonomous agents across a large refactor workflow.
When a finance team forecasts 5,000 engineering seats at a flat annual rate, they produce a stable estimate. When they attempt to forecast 5,000 engineers operating variable-intensity agentic workflows with adoption actively encouraged through internal competition, they are no longer forecasting. They are guessing.
The Uber example demonstrates what happens when that guess is wrong.
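The gap between the two forecasts can be made concrete with a small calculation. The sketch below is illustrative only: the seat price, the three usage tiers, and the per-engineer monthly costs are invented numbers, not Uber's figures or any vendor's pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageTier:
    """A hypothetical group of engineers with similar usage intensity."""
    name: str
    engineers: int
    monthly_cost_usd: int  # assumed average usage-based cost per engineer


def seat_forecast(seats: int, annual_price_usd: int) -> int:
    """Flat per-seat forecast: one price per seat, no variance."""
    return seats * annual_price_usd


def usage_forecast(tiers: list[UsageTier], months: int = 12) -> int:
    """Usage-based forecast: cost depends on how each tier actually works."""
    return sum(t.engineers * t.monthly_cost_usd * months for t in tiers)


# All figures below are made up for illustration.
TIERS = [
    UsageTier("occasional autocomplete", engineers=3000, monthly_cost_usd=20),
    UsageTier("daily assistant use", engineers=1500, monthly_cost_usd=150),
    UsageTier("multi-agent refactors", engineers=500, monthly_cost_usd=1500),
]

if __name__ == "__main__":
    flat = seat_forecast(seats=5000, annual_price_usd=600)
    variable = usage_forecast(TIERS)
    print(f"per-seat forecast:    ${flat:,}")
    print(f"usage-based forecast: ${variable:,}")
```

With these invented inputs, the per-seat forecast comes to $3,000,000 and the usage-based forecast to $12,420,000, and the 500 heaviest users account for $9,000,000 of it. The exact numbers do not matter. What matters is that the result swings on tier sizes and intensities that a per-seat model never asks about.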
The same pattern appears to be spreading beyond Uber. Organizations adopting AI coding tools increasingly report runaway consumption costs, and some are beginning to introduce usage caps as a default governance mechanism.
The three questions most enterprises cannot answer today
The Uber budget problem is ultimately a symptom of a larger issue. Most enterprises still cannot answer three basic questions that a CFO should be able to answer for any major technology investment.
1. Adoption depth
Who is actually using AI tooling, and where is spending concentrated?
Vendor dashboards show licenses and token usage. They do not show which engineers are power users, which teams are dormant, or whether most spending is being driven by a small percentage of employees. They cannot provide that view because each vendor only sees its own product environment.
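Once spend from every vendor has been merged into one per-user view, the concentration question becomes a short calculation. The following is a minimal sketch, assuming that merge has already happened; the function name, the user IDs, and the 10 percent cutoff are all hypothetical choices for illustration.

```python
def concentration_report(spend_by_user: dict[str, int], top_percent: int = 10) -> dict:
    """Summarize how concentrated AI spend is across licensed users.

    spend_by_user maps a user ID to total AI spend (any currency unit),
    including licensed users with zero spend.
    """
    if not 0 < top_percent <= 100:
        raise ValueError("top_percent must be between 1 and 100")

    ranked = sorted(spend_by_user.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(spend for _, spend in ranked)

    # Ceiling division in integers, so at least one user lands in the top group.
    top_n = max(1, (len(ranked) * top_percent + 99) // 100)
    top_spend = sum(spend for _, spend in ranked[:top_n])

    return {
        "licensed_users": len(ranked),
        "dormant_users": sorted(u for u, s in ranked if s == 0),
        "top_users": [u for u, _ in ranked[:top_n]],
        "top_share": top_spend / total if total else 0.0,
    }


# Hypothetical monthly spend for ten licensed engineers.
SAMPLE_SPEND = {
    "a": 700, "b": 100, "c": 50, "d": 50, "e": 40,
    "f": 30, "g": 20, "h": 10, "i": 0, "j": 0,
}

if __name__ == "__main__":
    print(concentration_report(SAMPLE_SPEND))
```

In this made-up sample, one engineer out of ten drives 70 percent of spend and two licenses are dormant. That is the kind of picture a single vendor's dashboard cannot assemble on its own.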
2. Productivity contribution
Did AI usage actually improve throughput, cycle time, or feature delivery velocity?
The answer exists across code repositories, ticketing systems, CRM workflows, and operational tools. Vendor dashboards cannot correlate those systems. Producing the answer requires joining AI usage telemetry with engineering and operational output data at the team and individual level.
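The join itself is not complicated once the data sits in one place. The sketch below assumes per-team exports already exist: AI session counts from a vendor, and pull request cycle times from a repository host for a period before and after rollout. The field names and sample values are hypothetical, and a join like this shows association, not causation.

```python
from statistics import median


def team_lift(
    ai_sessions_by_team: dict[str, int],
    cycle_hours_before: dict[str, list[float]],
    cycle_hours_after: dict[str, list[float]],
) -> list[dict]:
    """Join AI usage with cycle-time data for teams present in both periods."""
    rows = []
    for team in sorted(cycle_hours_before.keys() & cycle_hours_after.keys()):
        before_samples = cycle_hours_before[team]
        after_samples = cycle_hours_after[team]
        if not before_samples or not after_samples:
            continue  # no basis for comparison
        before = median(before_samples)
        after = median(after_samples)
        rows.append({
            "team": team,
            # Teams missing from the usage export count as zero sessions.
            "ai_sessions": ai_sessions_by_team.get(team, 0),
            "median_cycle_before_h": before,
            "median_cycle_after_h": after,
            # Negative means cycle time got shorter.
            "cycle_time_change": (after - before) / before if before else None,
        })
    return rows


# Hypothetical data for two teams.
SESSIONS = {"payments": 900}
BEFORE = {"payments": [40, 50, 60], "search": [30, 30, 30]}
AFTER = {"payments": [20, 25, 30], "search": [30, 33, 36]}

if __name__ == "__main__":
    for row in team_lift(SESSIONS, BEFORE, AFTER):
        print(row)
```

Even this toy version surfaces the comparison a CFO needs: a heavy-usage team next to a team with no recorded usage, on the same output metric.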
3. Cost per outcome
What does it cost to deliver a pull request, close a ticket, or ship a feature when AI tooling is included in the workflow?
This is the number finance teams eventually care about most. It does not exist in a unified form inside most enterprise stacks today because it requires combining vendor billing data, finance systems, and operational output metrics into a single measurement framework.
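As a rough illustration, cost per outcome is a division once the two inputs are lined up by team. The sketch assumes AI cost per team (vendor billing allocated through finance systems) and delivered units per team (for example, merged pull requests) are already available; the team names and amounts are hypothetical.

```python
from typing import Optional


def cost_per_outcome(
    ai_cost_usd: dict[str, float],
    outcomes: dict[str, int],
) -> dict[str, Optional[float]]:
    """Return AI cost per delivered unit for every team in either input.

    A team with spend but no delivered units gets None rather than a
    misleading zero or a division error.
    """
    result: dict[str, Optional[float]] = {}
    for team in sorted(ai_cost_usd.keys() | outcomes.keys()):
        cost = ai_cost_usd.get(team, 0.0)
        delivered = outcomes.get(team, 0)
        result[team] = cost / delivered if delivered > 0 else None
    return result


# Hypothetical quarter: AI spend in USD and merged pull requests per team.
AI_COST = {"payments": 12000, "search": 3000, "growth": 5000}
MERGED_PRS = {"payments": 240, "search": 150}

if __name__ == "__main__":
    for team, value in cost_per_outcome(AI_COST, MERGED_PRS).items():
        print(team, "n/a" if value is None else f"${value:.2f} per PR")
```

The None case matters as much as the numbers. A team that spends without a measurable delivered unit is exactly what a finance review should surface.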
A leaderboard does not answer these questions. A single-vendor dashboard does not answer them. A quarterly analytics consulting engagement answers them temporarily and becomes outdated almost immediately.
The intelligence layer that has to exist
Productivity output, cost data, and adoption telemetry all live in separate systems owned by different departments.
To answer the three questions above, enterprises need an intelligence layer capable of ingesting signals from all of those systems, normalizing them into comparable units, attaching confidence scores and audit trails to every metric, and exposing the same underlying data through CFO, CTO, and CHRO perspectives.
That is the category Levos is building toward.
Not another dashboard. A Human Capital Operating System that aggregates behavioral signals across the enterprise stack into a deterministic, auditable intelligence layer. AI Impact is one of six signal families in the platform and directly addresses the operational issues exposed by the Uber example.
A defensible AI Impact snapshot requires at minimum:
- Adoption depth: active users, power users, dormant licenses, and concentration ratios sourced from vendor APIs
- Productivity lift: throughput changes, cycle time improvements, code review velocity, and delivery acceleration sourced from Git, Jira, Linear, Salesforce, and related systems
- Quality trace: revert rates, defect escape rates, and test coverage changes comparing AI-assisted work against non-assisted workflows
- Cost per outcome: spending per delivered unit by team, sourced from finance systems and vendor billing
- Concentration risk: identification of spending patterns producing limited measurable output improvement
Every number must trace back to its underlying signal source. Every signal requires an attached confidence score. That is what transforms an internal analytics view into a board-defensible measurement framework.
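One way to picture the traceability requirement is a metric record that cannot be accepted without a source and a confidence score. This is a hypothetical schema written for illustration, not a description of how Levos implements it; the metric names mirror the five components listed above, and the source labels are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    value: float
    source: str        # where the number came from, e.g. "vendor_billing"
    confidence: float  # 0.0 (no trust) to 1.0 (fully verified)


# The five snapshot components from the list above, as hypothetical keys.
REQUIRED = {
    "adoption_depth",
    "productivity_lift",
    "quality_trace",
    "cost_per_outcome",
    "concentration_risk",
}


def validate_snapshot(metrics: list[Metric], required: set[str] = REQUIRED) -> list[str]:
    """Return a list of problems; an empty list means the snapshot is traceable."""
    problems = []
    present = {m.name for m in metrics}
    for missing in sorted(required - present):
        problems.append(f"missing metric: {missing}")
    for m in metrics:
        if not m.source.strip():
            problems.append(f"{m.name}: no source attached")
        if not 0.0 <= m.confidence <= 1.0:
            problems.append(f"{m.name}: confidence outside 0..1")
    return problems


if __name__ == "__main__":
    snapshot = [
        Metric("adoption_depth", 0.84, "vendor_api", 0.9),
        Metric("cost_per_outcome", 50.0, "", 0.7),  # source missing on purpose
    ]
    for problem in validate_snapshot(snapshot):
        print(problem)
```

The design choice is to reject an incomplete snapshot loudly instead of rendering it. A number with no source is easier to catch at ingestion than in a board meeting.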
What this means for CTOs and CFOs
Three practical implications are emerging quickly.
One: AI tooling budgets should be modeled at two to three times initial estimates
Per-seat assumptions are no longer reliable. Organizations increasingly need weekly monitoring and automated budget alerts tied to real usage behavior, not projected licensing assumptions.
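A weekly budget alert can start as a simple run-rate projection. The sketch below assumes spend continues at the average weekly rate observed so far, which is a simplifying assumption; when usage is still accelerating, a linear projection will understate the overrun. Function names, the threshold, and the dollar amounts are hypothetical.

```python
from typing import Optional


def projected_annual_spend(spend_to_date_usd: float, weeks_elapsed: int) -> float:
    """Extrapolate year-end spend from the average weekly rate so far."""
    if not 1 <= weeks_elapsed <= 52:
        raise ValueError("weeks_elapsed must be between 1 and 52")
    return spend_to_date_usd * 52 / weeks_elapsed


def budget_alert(
    annual_budget_usd: float,
    spend_to_date_usd: float,
    weeks_elapsed: int,
    warn_ratio: float = 1.0,
) -> Optional[str]:
    """Return an alert message when projected spend reaches warn_ratio x budget."""
    if annual_budget_usd <= 0:
        raise ValueError("annual_budget_usd must be positive")
    projected = projected_annual_spend(spend_to_date_usd, weeks_elapsed)
    ratio = projected / annual_budget_usd
    if ratio >= warn_ratio:
        return (
            f"ALERT week {weeks_elapsed}: projected ${projected:,.0f} "
            f"is {ratio:.1f}x the ${annual_budget_usd:,.0f} budget"
        )
    return None


if __name__ == "__main__":
    # Hypothetical: $400k spent after 13 weeks against a $1M annual budget.
    print(budget_alert(1_000_000, 400_000, 13))
```

Run weekly against actual billing data, a check like this flags an overrun in the first quarter instead of leaving it to be discovered when the budget is already gone.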
Two: adoption incentives without governance create runaway spending
Internal leaderboards reward usage. Usage is not the same as measurable contribution.
If organizations incentivize adoption without implementing a measurement layer tied to productivity outcomes, they are funding behavior rather than business value.
Three: AI ROI is becoming a financial controls problem
Once AI tooling crosses a meaningful percentage of R&D or operating spend, finance teams need a defensible framework tying that spending to measurable business output.
Without that framework, AI remains a line item funded largely on faith.
The honest part
The productivity gains from agentic AI systems are real. Uber's figure of roughly 11 percent of live backend code updates written by AI systems is significant, and many organizations are already seeing measurable acceleration in engineering throughput.
The issue is not whether enterprises should adopt AI aggressively.
The issue is whether they can scale AI investment responsibly without an intelligence layer that proves where the spending is working, where it is not, and where resources should be reallocated.
Uber will likely solve this problem internally because it has the scale and engineering resources to build sophisticated measurement infrastructure. Most organizations do not.
For everyone else, the gap between "we are using AI" and "we understand what AI is producing financially" is likely to define the next eighteen months of enterprise technology spending.
That gap is the layer Levos is designed to close.
Early access for CTOs and CFOs
Levos is opening early access for organizations that want a defensible AI ROI measurement framework before their next budget cycle.
A 30-day AI Impact pilot using your existing connectors. Live data architecture, not a deck.
Design partner cohort today: 150 to 500 employees. Expanding to 500 to 2,000 in the second half of 2026.
Want to see the AI Impact framework first? See the AI Impact page →
Want the deeper analysis first? Download the case study →