The era of "free" or low-cost artificial intelligence is drawing to a close as the industry shifts from simple Large Language Models (LLMs) that answer queries to autonomous AI agents capable of executing complex workflows. According to a Goldman Sachs report released in late May 2026, this structural shift is expected to push demand for "tokens," the units of text a model reads and generates and the basis on which AI usage is metered, up to 24 times current levels.
From Copilot to Autonomous Agent
To date, AI adoption has primarily followed the "copilot" model. A user provides a prompt, the model generates text or code, and the interaction ends. AI agents, however, function differently. They are designed to solve multi-step problems: planning a strategy, calling external APIs, self-correcting errors, and iterating until a goal is achieved.
This step-by-step, "chain of thought" style of reasoning is expensive. Every planning, tool-calling, and self-correction step consumes tokens, and when an agent works in the background to, for instance, organize an entire business trip or manage a corporate supply chain, the token count grows far faster than the number of results the user sees. Goldman Sachs notes that while a human sees only the final output, "under the hood" the system may have run hundreds of internal reasoning passes to reach that conclusion.
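To make the gap between the two models concrete, here is a minimal sketch in Python. It assumes a simple agent design in which each step re-reads the whole conversation so far before producing its next piece of reasoning. The function names and the token figures are hypothetical illustrations, not numbers from the Goldman Sachs report.

```python
def copilot_tokens(prompt_tokens: int, output_tokens: int) -> int:
    """One prompt in, one answer out: the classic copilot exchange."""
    return prompt_tokens + output_tokens


def agent_tokens(prompt_tokens: int, step_output_tokens: int, steps: int) -> int:
    """Total tokens billed for an agent that loops `steps` times.

    Assumption: every step re-reads the full context accumulated so far
    (the original prompt plus all earlier step outputs) and then appends
    its own output to that context.
    """
    total = 0
    context = prompt_tokens
    for _ in range(steps):
        total += context + step_output_tokens  # input read + output written
        context += step_output_tokens          # context grows for the next step
    return total


if __name__ == "__main__":
    # Hypothetical sizes: a 500-token request and 300 tokens per step.
    single = copilot_tokens(500, 300)
    looped = agent_tokens(500, 300, steps=10)
    print(f"copilot exchange: {single} tokens")
    print(f"10-step agent:    {looped} tokens")
```

Under these assumptions the user still sees one answer, but the ten-step agent is billed for many times the tokens of the single exchange, because each step pays again for everything that came before it.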
Uber and Microsoft: Feeling the Financial Bite
The report specifically highlights companies like Uber and Microsoft, which are at the forefront of implementing these technologies. Microsoft, having integrated Copilot across its Microsoft 365 (formerly Office 365) ecosystem, faces a pricing dilemma. If agent usage grows as projected, the compute cost of serving a heavy user could exceed the standard $30-per-user-per-month subscription fee, making flat-rate plans unprofitable and pushing the tech giant toward usage-based billing.
Uber, meanwhile, uses AI agents to optimize customer service and logistics. Moving to fully autonomous agents that handle complaints and refunds without human intervention offers labor savings but introduces a massive new line item for compute costs. The central question posed by Goldman Sachs is whether the productivity gains can truly offset the ballooning cost of tokenized intelligence.
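The flat-fee dilemma described above comes down to simple arithmetic, sketched here in Python. The $30 monthly fee is the figure cited in this article. The per-token price and the tokens consumed per agent task are hypothetical placeholders chosen only to show the calculation, and the function names are illustrative.

```python
def monthly_compute_cost(tasks: int, tokens_per_task: int,
                         usd_per_million_tokens: float) -> float:
    """Provider-side compute cost for one subscriber in one month."""
    return tasks * tokens_per_task * usd_per_million_tokens / 1_000_000


def break_even_tasks(monthly_fee_usd: float, tokens_per_task: int,
                     usd_per_million_tokens: float) -> float:
    """Number of tasks at which compute cost equals the flat fee."""
    cost_per_task = tokens_per_task * usd_per_million_tokens / 1_000_000
    return monthly_fee_usd / cost_per_task


if __name__ == "__main__":
    FEE = 30.0               # flat subscription fee cited in the article
    PRICE = 5.0              # hypothetical USD per million tokens
    TOKENS_PER_TASK = 21_500 # hypothetical tokens for one multi-step agent task

    limit = break_even_tasks(FEE, TOKENS_PER_TASK, PRICE)
    print(f"break-even at about {limit:.0f} agent tasks per month")

    # A subscriber running 20 agent tasks per working day (22 days).
    heavy = monthly_compute_cost(20 * 22, TOKENS_PER_TASK, PRICE)
    print(f"heavy user costs ${heavy:.2f} against a ${FEE:.2f} fee")
```

With these placeholder inputs, a subscriber who runs more agent tasks than the break-even count costs more to serve than the subscription brings in, which is the situation that makes usage-based billing attractive to the provider.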
The New Economy of Intelligence
The challenge is not merely financial; it is environmental and infrastructural. A 24-fold increase in demand means that data center and energy requirements will test the limits of existing power grids. Already, Nvidia and other semiconductor giants are seeing order backlogs grow, but supply cannot keep pace with the requirements of "Agentic AI."
- Token Inflation: Surging demand could lead to price hikes from model providers like OpenAI, Anthropic, and Google.
- Architectural Efficiency: Companies are under pressure to adopt small language models (SLMs), which are smaller and more specialized than general-purpose LLMs, to curb operational expenses.
- The End of Unlimited: The era of unlimited AI queries for a flat fee appears to be coming to an end.
In conclusion, Goldman Sachs warns that investors must exercise caution. The "promise" of AI-driven full automation comes with a bill that many enterprises might not be prepared to foot. The strategic shift from "what can AI do?" to "what does it cost for AI to do it?" will be the dominant theme for the remainder of 2026.