Counting Agents Is Not a Strategy: How CFOs Should Actually Measure AI Investment
Paul Griggs at PwC used a Formula 1 analogy in his March 2026 piece that belongs on every CFO's whiteboard. F1 teams do not win by fielding more cars. They win by fielding the most capable car, with a driver who knows how to use it, and a team that makes the whole system work. The same is true of AI in the enterprise, and it is the exact opposite of how most boards are currently measuring progress.
Right now, the most common AI status update in a board deck looks like this. Number of agents deployed. Number of seats with Copilot. Number of departments using Gemini. Hours saved, usually self-reported in a survey, often by the same employees who are being told their performance review depends on AI adoption.
None of those numbers are wrong. All of them are insufficient. They measure activity, not outcome, and they leave the CFO approving budget without a defensible answer to a simple board question: what did we get for the money?
The four numbers that actually matter
A defensible AI measurement framework, the kind a CFO can present to an audit committee and a board without hedging, has four components. Each one is built from cross-system behavioral data, not from a vendor dashboard.
1. AI Activation Rate, weighted by tool capability
Not how many seats have a license. How many seats are actively using AI to perform measurable work, segmented by tool, role, and team. Copilot at 90 percent license penetration with 22 percent weekly active usage is a very different situation from Copilot at 90 percent license penetration with 78 percent weekly active usage. The first one is a retirement candidate. The second one is a productivity engine. Most CFOs cannot tell which they have.
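A minimal sketch of the segmentation, assuming each licensed seat can be reduced to a tool, a team, and a weekly-active flag. The Seat record, the active_this_week field, and the sample data are hypothetical, and the capability weighting named in the heading is left out because the weights would have to be defined per organization.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Seat(NamedTuple):
    tool: str
    team: str
    active_this_week: bool  # hypothetical flag derived from usage logs


def activation_by_segment(seats: Iterable[Seat]) -> Dict[Tuple[str, str], float]:
    """Return weekly active seats / licensed seats for each (tool, team)."""
    licensed = defaultdict(int)
    active = defaultdict(int)
    for seat in seats:
        key = (seat.tool, seat.team)
        licensed[key] += 1
        if seat.active_this_week:
            active[key] += 1
    return {key: active[key] / licensed[key] for key in licensed}


# Illustrative data only: same tool, two teams, very different activation.
seats = [
    Seat("Copilot", "Sales", True),
    Seat("Copilot", "Sales", False),
    Seat("Copilot", "Sales", False),
    Seat("Copilot", "Sales", False),
    Seat("Copilot", "Engineering", True),
    Seat("Copilot", "Engineering", True),
    Seat("Copilot", "Engineering", True),
    Seat("Copilot", "Engineering", False),
]

rates = activation_by_segment(seats)
for (tool, team), rate in sorted(rates.items()):
    print(f"{tool} / {team}: {rate:.0%} weekly active")
```

The license count is identical for both teams in this toy data. Only the segmented rate shows that one team is using the tool and the other is not.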
2. AI Contribution to Output, separated from baseline
This is the hardest number to produce and the most important one. It is the share of measurable work output (code commits, sales activity, document creation, meeting follow-ups, deal velocity, ticket resolution) that can be attributed to AI assistance with a confidence score attached. Not 100 percent attribution. Most workforce AI is collaborative. But a measurable, auditable share that can be tracked over time and benchmarked against industry medians.
3. AI Cost per Productive Seat
Total AI subscription cost divided by the number of seats actively producing measurable output with AI, not divided by the total seat count. This is the number that exposes the agent count trap.
Run the math on a real example. Take a 1,500-person organization with seven AI subscriptions, paying 240 thousand dollars per year in aggregate, where only 380 seats actually produce measurable AI-assisted output. The effective cost per productive seat is about 632 dollars per year (240,000 divided by 380), against 160 dollars if the same spend is spread across all 1,500 employees. If two of those seven tools are responsible for 92 percent of the productive seats, the other five are the obvious place to start cutting.
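The arithmetic in the example above can be checked in a few lines. This is a sketch using only the figures quoted in the text. The function name is an assumption for illustration, not part of any product API.

```python
def cost_per_productive_seat(annual_cost: float, productive_seats: int) -> float:
    """Total annual AI spend divided by seats producing measurable AI-assisted output."""
    if productive_seats <= 0:
        raise ValueError("productive_seats must be positive")
    return annual_cost / productive_seats


annual_cost = 240_000      # aggregate spend across seven subscriptions
productive_seats = 380     # seats with measurable AI-assisted output
headcount = 1_500          # everyone in the organization

per_productive = cost_per_productive_seat(annual_cost, productive_seats)
per_head = annual_cost / headcount

print(f"Cost per productive seat: ${per_productive:,.2f}")  # $631.58
print(f"Cost per employee:        ${per_head:,.2f}")        # $160.00
```

The gap between the two figures is the point of the metric. Dividing by headcount makes the spend look nearly four times cheaper than it is per seat that actually produces something.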
4. Retirement Candidates with Dollar Amounts
A board cannot act on a chart. It can act on a recommendation that says: terminate Gemini Enterprise across the 480 seats that have not produced measurable output in 90 days, save 138 thousand dollars in the next contract cycle. That is a CFO conversation. That is a board conversation. That is how AI moves from a cost center to a managed investment line.
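A retirement recommendation of that kind reduces to idle seats multiplied by the per-seat price, ranked by dollars saved. The sketch below assumes a hypothetical ToolUsage record. The 287.50 dollar per-seat figure is simply 138,000 divided by 480 from the example above, and the Copilot row is invented for contrast.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToolUsage:
    tool: str
    idle_seats: int               # seats with no measurable output in the last 90 days
    annual_price_per_seat: float  # hypothetical contract price per seat


def retirement_candidates(usages: List[ToolUsage]) -> List[Tuple[str, int, float]]:
    """Return (tool, idle seats, annual savings) for tools with idle seats, largest savings first."""
    recs = [
        (u.tool, u.idle_seats, u.idle_seats * u.annual_price_per_seat)
        for u in usages
        if u.idle_seats > 0
    ]
    return sorted(recs, key=lambda rec: rec[2], reverse=True)


usages = [
    ToolUsage("Gemini Enterprise", idle_seats=480, annual_price_per_seat=287.50),
    ToolUsage("Copilot", idle_seats=0, annual_price_per_seat=360.00),  # invented row
]

recs = retirement_candidates(usages)
for tool, seats, savings in recs:
    print(f"Terminate {tool} across {seats} idle seats, save ${savings:,.0f} per year")
```

In practice the hard part is the input, not the multiplication. Deciding what counts as measurable output in a 90-day window is a policy choice the finance team has to own.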
Why this is not in your current stack
Your AI vendors will not produce this measurement honestly. Microsoft is not going to publish a report that shows Copilot adoption is shallow in your sales team. Google is not going to tell you Gemini is underperforming Claude in your engineering org. Workday does not see inside the productivity tools. Visier sees the survey data but not the behavioral data. Lattice sees the review cycle but not the cross-tool signal.
The measurement layer has to come from a system that is independent of any single AI vendor, ingests behavioral data from all of them, and applies a deterministic confidence score to every signal it produces. That is the architectural argument for a workforce intelligence layer. It is the only honest broker in a stack full of vendors who are also salespeople.
What the framework looks like in practice
A CFO running Levos sees a single AI Navigation surface that consolidates AI Activation Rate, AI Contribution to Output, AI Cost per Productive Seat, and Retirement Candidates into one view. Each metric shows the formula behind it. Every number is traceable to its source signal. Every retirement recommendation has a dollar value attached and an action button that initiates the cancellation workflow when the finance team approves it.
That is not a dashboard. That is a measurement instrument. The difference matters when the board is asking why the AI line item is 380 thousand dollars and what it bought.
Early access for CFOs
Levos is opening early access for mid-market CFOs who want to build a defensible AI measurement framework before the next board cycle.
The first ten design partner CFOs receive direct working sessions with our team to map their existing AI stack, define their measurement framework, and produce the first defensible AI ROI report for their board.
Design partner cohort today: organizations with 150 to 500 employees. Expanding to organizations with 500 to 2,000 employees in the second half of 2026.
Want to see how the CFO surface works first? See the Financial Intelligence view on the Platform page →