Why did Uber blow through its AI budget so rapidly in early 2026?

Uber's budget model broke because enterprise software budgeting was designed around predictable per-seat licensing. Agentic AI tools like Claude Code are usage-based; an engineer running complex, multi-file agentic refactors consumes drastically more tokens than someone using basic autocomplete. Encouraging adoption through internal leaderboards without real-time financial tracking created a compounding, unforecasted consumption surge.

Why can't native AI vendor consoles or billing dashboards solve this budget problem?

AI vendor consoles see only their own product's seats and token metrics. They are structurally blind to who your actual power users are across multiple tools and which teams are dormant. Most importantly, they have no visibility into your code repositories, ticketing systems, or CRMs, so they cannot show whether token consumption is translating into actual business output.

How does Levos calculate a 'defensible AI ROI' compared to a standard engineering dashboard?

Levos doesn't rely on a single vendor API or static charts. It acts as an independent human capital operating layer that correlates usage telemetry directly with operational delivery output. By joining code commits, cycle times, and defect rates from systems like Git and Jira with financial billing data at the individual engineer and team levels, Levos calculates the precise cost per shipped outcome.

What are the core metrics tracked in the Levos AI Impact framework?

The Levos AI Impact snapshot tracks four vital dimensions simultaneously: Adoption Depth (active vs. dormant licenses and concentration ratios), Productivity Lift (changes in cycle time, throughput, and review velocity), Quality Trace (revert rates on AI-assisted code vs. human-only updates), and Cost per Outcome (exact tool spend per delivered feature or closed ticket).

What governance steps should CTOs take to prevent runaway agentic AI costs?

CTOs should adopt three immediate controls: first, model budget assumptions at 2x to 3x initial flat-rate estimates; second, implement real-time consumption alerts at 50%, 75%, and 90% thresholds rather than waiting for monthly bills; and third, never incentivize pure adoption metrics (like leaderboards) without an independent measurement layer like Levos to verify that increased usage equals valuable contribution.

The AI ROI Gap: What Uber's CTO Couldn't Answer

In April 2026, Uber's CTO Praveen Neppalli Naga gave The Information one of the most honest quotes a public-company technology executive has offered on AI in years. Uber's full-year 2026 AI budget was already gone only four months into the year. The cause was not an infrastructure failure or a vendor contract issue. It was a coding assistant.

"I'm back to the drawing board, because the budget I thought I would need is blown away already."

Uber is a multi-billion-dollar R&D organization with dedicated finance and capacity-planning teams. If they missed the forecast, the forecasting model itself is broken. That is the part worth paying attention to, because this is not just an Uber problem. It is a preview of what happens when enterprise AI adoption scales faster than enterprise measurement systems.

What actually happened

Uber rolled out Claude Code access to engineering teams at the end of 2025 and actively encouraged adoption, including internal leaderboards ranking teams by usage.

The numbers that followed were significant:

84 percent of Uber developers were classified as agentic coding users by March 2026
Claude Code usage roughly doubled within three months
AI-related costs at Uber have increased approximately sixfold since 2024
Around 11 percent of live backend code updates are now written by AI systems

Two realities exist at the same time. The productivity gains are real. The budgeting model is broken. Both being true simultaneously is exactly what makes this difficult for finance and technology leaders.

Why the budget model broke

Traditional enterprise software budgeting was designed around predictable per-seat licensing.

Fixed monthly cost. Predictable headcount. Stable forecasting.

Agentic AI tools do not behave like traditional enterprise software at scale. They are usage-based systems. An engineer occasionally using Claude Code for autocomplete consumes a fraction of the resources used by another engineer running multiple autonomous agents across a large refactor workflow.

When a finance team forecasts 5,000 engineering seats at a flat annual rate, they produce a stable estimate. When they attempt to forecast 5,000 engineers operating variable-intensity agentic workflows with adoption actively encouraged through internal competition, they are no longer forecasting. They are guessing.
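To make the gap between the two forecasting styles concrete, here is a minimal Python sketch. Every number in it is a hypothetical assumption chosen for illustration (the per-seat rate, the three usage segments, and their monthly costs); none of it is Uber's actual data or any vendor's pricing.

```python
def flat_seat_forecast(seats: int, annual_rate_per_seat: float) -> float:
    """Classic per-seat budget: cost scales only with headcount."""
    return seats * annual_rate_per_seat


def usage_based_forecast(
    seats: int, segments: list[tuple[float, float]], months: int = 12
) -> float:
    """Usage-based budget. Each segment is (share of seats, monthly cost per engineer)."""
    if abs(sum(share for share, _ in segments) - 1.0) > 1e-9:
        raise ValueError("segment shares must sum to 1.0")
    return sum(seats * share * monthly_cost * months for share, monthly_cost in segments)


SEATS = 5_000

# Hypothetical flat rate: 480 USD per seat per year.
flat = flat_seat_forecast(SEATS, annual_rate_per_seat=480.0)

# Hypothetical usage mix: most engineers are light users, a few run heavy agentic workflows.
segments = [
    (0.60, 30.0),   # occasional autocomplete-style use
    (0.30, 120.0),  # regular agentic sessions
    (0.10, 600.0),  # multiple autonomous agents on large refactors
]
usage = usage_based_forecast(SEATS, segments)

print(f"flat-seat forecast:   {flat:,.0f} USD")
print(f"usage-based forecast: {usage:,.0f} USD")
print(f"ratio:                {usage / flat:.2f}x")
```

With these made-up inputs the usage-based figure comes out at roughly 2.85 times the flat estimate, and the heaviest 10 percent of engineers account for more than half of it. Shifting a few percent of seats between segments moves the total substantially, which is why the forecast behaves like a guess.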

The Uber example demonstrates what happens when that guess is wrong.

Industry-wide data suggests the same pattern is spreading rapidly. Organizations adopting AI coding tools are increasingly encountering runaway consumption costs and are beginning to introduce usage caps as a default governance mechanism.

The three questions no enterprise can answer today

The Uber budget problem is ultimately a symptom of a larger issue. Most enterprises still cannot answer three basic questions that a CFO should be able to answer for any major technology investment.

1. Adoption depth

Who is actually using AI tooling, and where is spending concentrated?

Vendor dashboards show licenses and token usage. They do not show which engineers are power users, which teams are dormant, or whether most spending is being driven by a small percentage of employees. They cannot provide that view because each vendor only sees its own product environment.
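As a rough illustration of what a cross-vendor adoption view involves, the sketch below merges per-user spend exports from two vendors and computes a concentration ratio plus a dormant-user list. The export format, the user names, the 10 percent cutoff, and the one-dollar dormancy threshold are all assumptions made for the example, not a description of any real vendor API.

```python
import math
from collections import defaultdict


def combine_vendor_spend(*vendor_exports: dict[str, float]) -> dict[str, float]:
    """Sum per-user spend across several vendor exports keyed by user id."""
    totals: dict[str, float] = defaultdict(float)
    for export in vendor_exports:
        for user, spend in export.items():
            totals[user] += spend
    return dict(totals)


def adoption_depth(
    spend_by_user: dict[str, float],
    top_fraction: float = 0.10,   # assumed definition of "power users"
    dormant_below: float = 1.0,   # assumed dormancy threshold in USD
) -> dict:
    """Return the spend share of the top users and the list of dormant users."""
    total = sum(spend_by_user.values())
    ranked = sorted(spend_by_user.values(), reverse=True)
    top_n = max(1, math.ceil(len(ranked) * top_fraction))
    top_share = sum(ranked[:top_n]) / total if total else 0.0
    dormant = sorted(u for u, s in spend_by_user.items() if s < dormant_below)
    return {"top_n": top_n, "top_share": top_share, "dormant": dormant}


# Hypothetical monthly exports from two different AI tools.
vendor_a = {"ana": 900.0, "bo": 50.0, "cy": 0.5, "dee": 40.0, "eli": 60.0, "fay": 0.0}
vendor_b = {"ana": 300.0, "gus": 80.0, "hal": 30.0, "ivy": 20.0, "jo": 19.5}

depth = adoption_depth(combine_vendor_spend(vendor_a, vendor_b))
print(depth)
```

In this toy data one engineer out of ten drives 80 percent of combined spend, a pattern neither vendor export reveals on its own. A real implementation would also need to reconcile user identities across tools and include licensed users who appear in no export at all.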

2. Productivity contribution

Did AI usage actually improve throughput, cycle time, or feature delivery velocity?

The answer exists across code repositories, ticketing systems, CRM workflows, and operational tools. Vendor dashboards cannot correlate those systems. Producing the answer requires joining AI usage telemetry with engineering and operational output data at the team and individual level.
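One simple form of that join can be sketched in Python. The example assumes two hypothetical inputs, token usage per engineer from a vendor export and pull request cycle times per engineer from a Git or ticketing system, and compares median cycle time between heavier and lighter users. The threshold and all values are invented, and a difference between groups shows correlation only, not proof that the tooling caused it.

```python
from statistics import median


def cycle_time_by_usage(
    tokens_by_engineer: dict[str, int],
    cycle_hours_by_engineer: dict[str, list[float]],
    heavy_threshold: int,
) -> dict:
    """Split cycle times into heavy-usage and light-usage groups and take medians."""
    heavy: list[float] = []
    light: list[float] = []
    for engineer, hours in cycle_hours_by_engineer.items():
        tokens = tokens_by_engineer.get(engineer, 0)  # no usage record counts as zero
        (heavy if tokens >= heavy_threshold else light).extend(hours)
    return {
        "heavy_median_hours": median(heavy) if heavy else None,
        "light_median_hours": median(light) if light else None,
    }


# Hypothetical monthly data for three engineers.
tokens = {"ana": 5_000_000, "bo": 200_000}
cycle_hours = {
    "ana": [10, 14, 12],
    "bo": [20, 30],
    "cy": [26, 24],  # has delivery data but no AI usage record
}

result = cycle_time_by_usage(tokens, cycle_hours, heavy_threshold=1_000_000)
print(result)
```

A more careful version would compare each engineer or team against its own baseline from before the rollout, since heavy users may have been faster to begin with.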

3. Cost per outcome

What does it cost to deliver a pull request, close a ticket, or ship a feature when AI tooling is included in the workflow?

This is the number finance teams eventually care about most. It does not exist in a unified form inside most enterprise stacks today because it requires combining vendor billing data, finance systems, and operational output metrics into a single measurement framework.
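The arithmetic itself is simple once the inputs sit side by side. The sketch below assumes two hypothetical per-team inputs for the same period: AI tool spend from billing data and a count of shipped units (merged pull requests here). Team names and figures are illustrative only.

```python
from typing import Optional


def cost_per_outcome(
    spend_by_team: dict[str, float],
    outcomes_by_team: dict[str, int],
) -> dict[str, Optional[float]]:
    """AI tool spend divided by shipped units, per team.

    Teams with spend but no recorded outcomes get None so they are flagged
    for review rather than hidden by a division error.
    """
    result: dict[str, Optional[float]] = {}
    for team, spend in spend_by_team.items():
        shipped = outcomes_by_team.get(team, 0)
        result[team] = spend / shipped if shipped else None
    return result


# Hypothetical monthly figures.
ai_spend_usd = {"payments": 12_000.0, "search": 3_000.0, "infra": 5_000.0}
merged_prs = {"payments": 240, "search": 150}

per_pr = cost_per_outcome(ai_spend_usd, merged_prs)
print(per_pr)
```

The calculation is trivial; the difficulty described above is getting billing and output data attributed to the same team over the same period. The sketch also counts only tool spend, so it understates the fully loaded cost of a shipped unit.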

A leaderboard does not answer these questions. A single-vendor dashboard does not answer them. A quarterly analytics consulting engagement answers them temporarily and becomes outdated almost immediately.

The intelligence layer that has to exist

Productivity output, cost data, and adoption telemetry all live in separate systems owned by different departments.

To answer the three questions above, enterprises need an intelligence layer capable of ingesting signals from all of those systems, normalizing them into comparable units, attaching confidence scores and audit trails to every metric, and exposing the same underlying data through CFO, CTO, and CHRO perspectives.

That is the category Levos is building toward.

Not another dashboard. A Human Capital Operating System that aggregates behavioral signals across the enterprise stack into a deterministic, auditable intelligence layer. AI Impact is one of six signal families in the platform and directly addresses the operational issues exposed by the Uber example.

A defensible AI Impact snapshot requires at minimum:

Adoption depth: active users, power users, dormant licenses, and concentration ratios sourced from vendor APIs
Productivity lift: throughput changes, cycle time improvements, code review velocity, and delivery acceleration sourced from Git, Jira, Linear, Salesforce, and related systems
Quality trace: revert rates, defect escape rates, and test coverage changes comparing AI-assisted work against non-assisted workflows
Cost per outcome: spending per delivered unit by team, sourced from finance systems and vendor billing
Concentration risk: identification of spending patterns producing limited measurable output improvement

Every number must trace back to its underlying signal source. Every signal requires an attached confidence score. That is what transforms an internal analytics view into a board-defensible measurement framework.
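One way to picture a metric that carries its own provenance is a small data structure like the one below. This is an illustrative Python sketch, not Levos's actual data model: the class names, the fields, and the rule that a metric is only as trustworthy as its weakest input are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str        # system the value came from, e.g. "git" or "vendor_billing"
    name: str
    value: float
    confidence: float  # 0.0 to 1.0, how much the ingested value is trusted


@dataclass(frozen=True)
class Metric:
    name: str
    value: float
    inputs: tuple[Signal, ...]

    @property
    def confidence(self) -> float:
        # Assumed rule: a derived metric inherits the lowest input confidence.
        return min(s.confidence for s in self.inputs)

    def audit_trail(self) -> list[str]:
        """One line per input signal, so every number traces back to a source."""
        return [
            f"{s.source}:{s.name}={s.value} (confidence {s.confidence:.2f})"
            for s in self.inputs
        ]


# Hypothetical inputs for one team and one month.
spend = Signal("vendor_billing", "monthly_spend_usd", 12_000.0, 0.95)
prs = Signal("git", "merged_prs", 240.0, 0.80)

metric = Metric("cost_per_merged_pr", spend.value / prs.value, (spend, prs))
print(metric.value, metric.confidence)
for line in metric.audit_trail():
    print(line)
```

Making the records immutable means a reported number cannot drift away from the inputs that produced it, which is the property an auditor would look for.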

What this means for CTOs and CFOs

Three practical implications are emerging quickly.

One: AI tooling budgets should be modeled at two to three times initial estimates

Per-seat assumptions are no longer reliable. Organizations increasingly need weekly monitoring and automated budget alerts tied to real usage behavior, not projected licensing assumptions.
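A minimal sketch of such an alert check follows, using the 50, 75, and 90 percent thresholds recommended in the FAQ at the top of this article. The function names, the budget figure, and the spend values are hypothetical, and a production version would read spend from a billing feed rather than from hard-coded numbers.

```python
ALERT_THRESHOLDS = (0.50, 0.75, 0.90)


def crossed_thresholds(
    previous_spend: float,
    current_spend: float,
    budget: float,
    thresholds: tuple[float, ...] = ALERT_THRESHOLDS,
) -> list[float]:
    """Return thresholds crossed since the last check, so each alert fires once."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    previous_ratio = previous_spend / budget
    current_ratio = current_spend / budget
    return [t for t in thresholds if previous_ratio < t <= current_ratio]


def projected_annual_spend(spend_to_date: float, days_elapsed: int) -> float:
    """Naive straight-line run rate; ignores growth in adoption."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return spend_to_date / days_elapsed * 365


# Hypothetical annual budget and two consecutive weekly readings.
BUDGET = 1_000_000.0
alerts = crossed_thresholds(previous_spend=480_000.0, current_spend=760_000.0, budget=BUDGET)
projection = projected_annual_spend(760_000.0, days_elapsed=100)

print("thresholds crossed:", alerts)
print(f"projected annual spend: {projection:,.0f} USD")
```

Comparing against the previous reading keeps the same alert from firing on every check. The straight-line projection is deliberately conservative: if usage is still growing, as it was in the Uber example, actual spend will outpace it.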

Two: adoption incentives without governance create runaway spending

Internal leaderboards reward usage. Usage is not the same as measurable contribution.

If organizations incentivize adoption without implementing a measurement layer tied to productivity outcomes, they are funding behavior rather than business value.

Three: AI ROI is becoming a financial controls problem

Once AI tooling crosses a meaningful percentage of R&D or operating spend, finance teams need a defensible framework tying that spending to measurable business output.

Without that framework, AI remains a line item funded largely on faith.

The honest part

The productivity gains from agentic AI systems are real. Uber's statistic showing 11 percent of backend code written by AI systems is significant, and many organizations are already seeing measurable acceleration in engineering throughput.

The issue is not whether enterprises should adopt AI aggressively.

The issue is whether they can scale AI investment responsibly without an intelligence layer that proves where the spending is working, where it is not, and where resources should be reallocated.

Uber will likely solve this problem internally because they have the scale and engineering resources to build sophisticated measurement infrastructure. Most organizations do not.

For everyone else, the gap between "we are using AI" and "we understand what AI is producing financially" is likely to define the next eighteen months of enterprise technology spending.

That gap is the layer Levos is designed to close.

Early access for CTOs and CFOs

Levos is opening early access for organizations that want a defensible AI ROI measurement framework before their next budget cycle.

A 30-day AI Impact pilot using your existing connectors. Live data architecture, not a deck.

Design partner cohort today: 150 to 500 employees. Expanding to 500 to 2,000 in the second half of 2026.

Request early access →

Want to see the AI Impact framework first? See the AI Impact page →

Want the deeper analysis first? Download Show Case Study →
