Field Notes
AI Strategy After Token Costs Rise: A Systems-Level Approach
As AI use moves from pilots into routine work, teams need systems that route cost, risk, context, and human judgment with more care than a simple agent rollout allows.
AI adoption is moving out of the pilot phase, and the economics are starting to look different. A few tool subscriptions or API experiments are easy to absorb. Daily use scaled across teams is not. Once AI becomes part of routine work, token consumption, model choice, workflow design, and oversight all start to affect the cost and value of implementation.
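To make the scaling effect concrete, here is a back-of-the-envelope sketch in Python. Every input is a hypothetical placeholder: the function name, the headcounts, the request volumes, and the price per million tokens are assumptions for illustration, not figures from any provider or from the reports cited in this piece.

```python
def monthly_token_cost(
    users: int,
    requests_per_user_per_day: int,
    tokens_per_request: int,
    price_per_million_tokens: float,
    working_days: int = 22,
) -> float:
    """Rough monthly spend estimate; all inputs are hypothetical planning numbers."""
    total_tokens = (
        users * requests_per_user_per_day * tokens_per_request * working_days
    )
    return total_tokens / 1_000_000 * price_per_million_tokens


# Hypothetical pilot: a handful of people making short, occasional requests.
pilot = monthly_token_cost(
    users=5,
    requests_per_user_per_day=20,
    tokens_per_request=2_000,
    price_per_million_tokens=3.0,
)

# Hypothetical rollout: more people, more requests, and longer agent-style runs.
rollout = monthly_token_cost(
    users=500,
    requests_per_user_per_day=60,
    tokens_per_request=40_000,
    price_per_million_tokens=3.0,
)

print(f"pilot:   ${pilot:,.2f} per month")
print(f"rollout: ${rollout:,.2f} per month")
print(f"ratio:   {rollout / pilot:,.0f}x")
```

The useful part is the shape of the calculation rather than the numbers: users, request volume, and tokens per request multiply together, so moderate growth in each one compounds quickly.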
This shift is already visible. Axios recently reported that Databricks is launching AI spend controls after seeing companies run up unexpectedly large AI bills, including cases where customers spent tens of millions of dollars in a single month. A recent paper on token consumption in agentic coding tasks found that agents can use far more tokens than simpler AI interactions, and that higher token use does not consistently produce better performance. These are not niche infrastructure concerns. They are early signs that AI strategy has to account for cost, routing, governance, and fit.
At the same time, agent performance on real-world work remains uneven. Agents' Last Exam, a benchmark for long-horizon professional workflows, evaluates agents on economically useful tasks with verifiable outcomes. Its current results show that agents still struggle with the kinds of complex, domain-specific work organizations often want to automate. For AI to live up to the hype around its benefits, autonomous agents need much clearer boundaries, with equal attention paid to infrastructure and outcomes.
For teams already using AI, this creates a practical question: where does AI actually improve the work, and where does it add cost, risk, or coordination burden? Answering that requires looking closely at workflows before making platform decisions. Which tasks are repetitive but low risk? Which ones need context from multiple systems? Which decisions require human judgment? Which outputs need review before they can be used? Which teams need to trust the process for adoption to stick?
Organizations are increasingly focused on platform-agnostic solutions, so that they do not need to lock themselves into one model provider just because that provider has the strongest demo this quarter. In many workflows, the best solution may involve routing different tasks through different systems: a lighter model for classification, a stronger model for synthesis, retrieval for internal knowledge, and human review where judgment or accountability matters. The point is to design the workflow around the task, not force the task into the tool.
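As a rough illustration of that routing idea, the Python sketch below maps a task to an ordered list of steps. The `Task` fields, the `Route` names, and the rules in `plan_route` are all hypothetical; a real router would depend on the team's own task taxonomy, providers, and review policy.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LIGHT_MODEL = "light_model"      # cheaper model, e.g. for classification
    STRONG_MODEL = "strong_model"    # more capable model, e.g. for synthesis
    RETRIEVAL = "retrieval"          # look up internal knowledge first
    HUMAN_REVIEW = "human_review"    # a person signs off on the output


@dataclass(frozen=True)
class Task:
    kind: str                              # hypothetical labels: "classification", "synthesis"
    needs_internal_knowledge: bool = False
    high_stakes: bool = False              # judgment or accountability matters


def plan_route(task: Task) -> list[Route]:
    """Return the ordered steps a task should pass through (illustrative rules only)."""
    steps: list[Route] = []
    if task.needs_internal_knowledge:
        steps.append(Route.RETRIEVAL)
    if task.kind == "classification":
        steps.append(Route.LIGHT_MODEL)
    else:
        # Assumption: anything that is not simple classification gets the stronger model.
        steps.append(Route.STRONG_MODEL)
    if task.high_stakes:
        steps.append(Route.HUMAN_REVIEW)
    return steps


ticket_triage = Task(kind="classification")
contract_summary = Task(kind="synthesis", needs_internal_knowledge=True, high_stakes=True)

print([step.value for step in plan_route(ticket_triage)])
print([step.value for step in plan_route(contract_summary)])
```

Because the routing decision lives in one small function rather than inside a vendor's tool, swapping a provider or tightening the review rule changes a few lines instead of the whole workflow.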
This also changes how teams should think about AI agents. Agents can be valuable when the task is bounded, the input context is reliable, the success criteria are clear, and the cost of failure is manageable. They become harder to justify when the workflow depends on hidden process knowledge, ambiguous handoffs, sensitive data, or cross-functional judgment. In those cases, the better starting point may be a narrower automation, a decision-support layer, or a human-in-the-loop process that improves speed without pretending the whole workflow can run unattended.
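The criteria above can be written down as a simple triage check. This is a minimal Python sketch, not a scoring method from any framework: the `WorkflowProfile` fields mirror the conditions named in the paragraph, and the two-outcome rule in `recommend_starting_point` is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowProfile:
    # Conditions that make an agent easier to justify.
    bounded_task: bool
    reliable_inputs: bool
    clear_success_criteria: bool
    manageable_failure_cost: bool
    # Conditions that make an agent harder to justify.
    hidden_process_knowledge: bool = False
    ambiguous_handoffs: bool = False
    sensitive_data: bool = False
    cross_functional_judgment: bool = False


def recommend_starting_point(profile: WorkflowProfile) -> str:
    """Suggest where to start; a hypothetical rule of thumb, not a verdict."""
    favorable = all([
        profile.bounded_task,
        profile.reliable_inputs,
        profile.clear_success_criteria,
        profile.manageable_failure_cost,
    ])
    warning_signs = any([
        profile.hidden_process_knowledge,
        profile.ambiguous_handoffs,
        profile.sensitive_data,
        profile.cross_functional_judgment,
    ])
    if favorable and not warning_signs:
        return "agent candidate"
    return "start narrower: automation, decision support, or human-in-the-loop"


invoice_matching = WorkflowProfile(True, True, True, True)
vendor_negotiation = WorkflowProfile(
    bounded_task=False,
    reliable_inputs=True,
    clear_success_criteria=False,
    manageable_failure_cost=False,
    cross_functional_judgment=True,
)

print(recommend_starting_point(invoice_matching))
print(recommend_starting_point(vendor_negotiation))
```

The value of a check like this is less the output than the conversation it forces: each field has to be answered by someone who knows the workflow.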
A lightweight AI strategy is a way of avoiding unnecessary complexity while keeping the system adaptable. That can mean improving the use of tools a team already pays for, introducing model routing instead of defaulting every task to a frontier model, creating governance rules for agentic workflows, or building a roadmap that distinguishes between what should be automated now, what should be tested, and what should remain human-led.
The human side matters because AI implementation rarely fails only on technical grounds. Legal, finance, operations, product, research, and leadership may all define value and risk differently. If those teams do not agree on where AI should sit in the workflow, adoption becomes fragile. Human-AI alignment, in practice, means designing systems that people can understand, trust, question, and improve. It also means being clear about where accountability remains with people.
Rising token costs and uneven agent performance are pushing AI strategy toward more disciplined implementation. The useful question is no longer whether a team should "use AI." Most already do. The question is how to route AI through existing systems in a way that improves work, controls cost, protects judgment, and earns buy-in from the people who have to use it.
For many organizations, the next phase of AI will be less about adding more tools and more about making better decisions around the tools already in place. That is where platform-agnostic AI strategy, workflow design, responsible governance, and cross-functional adoption start to matter.
If your team is asking how to move from AI experiments to practical results, the first step is usually a sharper view of the system you already have. I work with teams on AI advisory and strategy, from workflow assessment and vendor evaluation to governance, implementation roadmaps, and practical team workshops. You can also review common questions about AI advisory engagements.