Can giants afford to burn tokens this quickly?

Silicon Valley giants that once pushed employees to adopt AI aggressively are now struggling to keep up with the bills.

Amazon, after earnestly advising “don’t use AI just for the sake of using AI,” abruptly shut down its internal employee token consumption leaderboard.

Microsoft suddenly revoked most Claude Code licenses and instructed developers to migrate their workflows back to GitHub Copilot CLI.

Just months ago, the situation was entirely different. The more AI employees used, the more advanced they seemed—and the more future-ready their companies appeared.

But once employees actually began using AI with abandon, companies quickly realized that whether productivity had truly increased remained uncertain, while the bills were already ballooning.

Caught between rising token costs and fear of falling behind in the AI race, Silicon Valley is now confronting a dilemma it created itself.

Guys, AI is just too damn useful!

It all started last year. While encouraging employees to embrace AI wasn't new, 2025 saw this trend explode into an unstoppable wave.

The most visible sign? Big tech firms in Silicon Valley began pushing employees to use AI at all costs.

As Julia Liuson, head of Microsoft’s developer tools business, put it: “Using AI is no longer optional—it’s a core competency required at every role and level.”

At the time, she instructed managers in an internal email to factor in employees’ use of internal AI tools—including GitHub Copilot—when evaluating performance.

Amazon simultaneously warned that some roles might shrink due to AI, while telling employees the solution was simple: “embrace AI.”

Last summer, CEO Andy Jassy sent a company-wide email on generative AI. He stated that as the company scaled deployment of generative AI and intelligent agents, fewer people would be needed for certain existing roles; over the next few years, AI-driven efficiency gains were expected to reduce Amazon’s overall workforce.

When addressing how employees should respond, Jassy directly urged them to proactively embrace AI:

“Understand AI, attend workshops and training sessions, use and experiment with AI wherever possible, participate in team brainstorming, think about how to innovate faster and at scale for customers, and figure out how to achieve more with leaner teams.”

This statement can be seen as the public launch of Amazon’s internal AI mobilization campaign.

It wasn’t just a few giants. In 2025, “AI-for-all” became something of a cultural trend.

Shopify introduced the concept of “reflexive AI usage,” declaring it a baseline expectation. “Reflexive” means using AI like a reflex: employees should instinctively consider whether AI can handle a task before taking any action.

The company also required teams to answer one question before requesting additional headcount or resources: Why can’t this work be done by AI?

Duolingo went even further, publicly announcing a shift toward “AI-first.” Wherever AI could do the job, outsourced labor would be avoided. Where AI could replace new hires, hiring would be paused. Employee evaluations would also include metrics on AI utilization.

This trend persisted well into this year.

In March, NVIDIA CEO Jensen Huang publicly stated that if a $500k/year NVIDIA engineer didn’t consume at least $250k worth of AI tokens annually, he’d be “deeply concerned.” When asked whether NVIDIA was prepared to spend roughly $2 billion per year on token purchases for engineering teams, Huang replied: “We’re working on it.”

This wasn’t his first such comment. At an all-hands meeting late last year, Huang berated executives who had previously advised teams to “use less AI,” asking, “Are you crazy?” He then explicitly demanded employees automate every task that could be automated via AI, while assuring them AI wouldn’t take their jobs.

But when it comes to being the most aggressive, Meta stands out.

In November 2025, Meta’s Chief People Officer Janelle Gale announced that starting in 2026, “AI-driven impact” would become a core expectation in employee performance reviews.

By April this year, Meta had launched an internal leaderboard called “Claudeonomics”: tracking token consumption across over 85,000 employees, ranking the top 250, and awarding titles like “Token Legend” and “Cache Master.” Within just 30 days, the leaderboard recorded over 60 trillion tokens consumed.
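To get a feel for the scale of those leaderboard figures, here is a rough back-of-the-envelope sketch in Python. It assumes the 60 trillion tokens were spread evenly across all 85,000 tracked employees over the 30 days, which is certainly not how usage was actually distributed; the function name is illustrative only.

```python
# Figures quoted in the article; both are stated as "over", so treat them as floors.
TOKENS_RECORDED = 60_000_000_000_000  # 60 trillion tokens
EMPLOYEES_TRACKED = 85_000
DAYS = 30


def average_tokens_per_employee_per_day(tokens: int, employees: int, days: int) -> float:
    """Naive mean: total tokens spread evenly over every tracked employee and day."""
    return tokens / employees / days


avg_per_day = average_tokens_per_employee_per_day(
    TOKENS_RECORDED, EMPLOYEES_TRACKED, DAYS
)
print(f"{avg_per_day:,.0f} tokens per employee per day")
```

Under that even-spread assumption, the average works out to roughly 23.5 million tokens per employee per day. With a ranked top 250, the real distribution was presumably far more skewed.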

AI usage had turned into an internal war of attrition.

According to BCG’s 2026 AI Radar report, which surveyed 2,360 corporate executives—including over 600 CEOs—94% of organizations said they would continue investing in AI even if returns weren’t immediate in 2026.

The report forecasts that enterprise AI spending as a percentage of revenue will rise from ~0.8% in 2025 to ~1.7% in 2026, roughly doubling. Some 72% of CEOs said they were already the primary decision-makers on AI strategy; half even believe their job security depends on AI investment delivering results.

For these companies, AI has become a CEO-level bet—one that cannot afford to appear slower than peers.

Can’t afford to play anymore? Silicon Valley giants backtrack

Yet, before the AI usage race even reached full momentum, Silicon Valley giants began hitting the brakes.

The first to contradict themselves was Amazon—the very company that once urged employees to “use and experiment with AI as much as possible.”

At the end of May, Amazon was reported to have shut down an internal leaderboard named “KiroRank,” created by employees to track token consumption during AI tool usage.

According to the Financial Times, some employees began running non-essential tasks through Amazon’s internal AI agent platform, MeshClaw, simply to boost their AI usage metrics. Originally designed to assist with code deployments, email sorting, or Slack interactions, MeshClaw’s purpose shifted once token usage entered the leaderboard game—tasks were no longer about solving real problems but about “gaming the system.”

This behavior even earned a name: “Tokenmaxxing”—the practice of maximizing token consumption at all costs.

Although Amazon didn’t disclose exactly what kinds of meaningless tasks were run, users on related forums have speculated about the tactics. One example:

Run MeshClaw in the background to continuously perform static analysis on source code packages, so that tokens keep accumulating.

On Hacker News, another user claimed that colleagues at their company, after being evaluated on “how many tokens they spent,” began having different AI agents repeatedly pass outputs back and forth in endless loops—because genuinely high-token-demand tasks were scarce.

Eventually, Amazon had to stop the game.

Senior Vice President Dave Treadwell reminded employees internally: “Don’t use AI just because you can. Use AI to solve customer problems, address business challenges, and drive innovation.”

Less than a year after Jassy personally rallied employees to “embrace AI,” Amazon reversed course.

Amazon wasn’t alone. In mid-May this year, Microsoft began revoking most internal Claude Code licenses.

Even smaller firms couldn’t sustain the pace.

Last April, Duolingo CEO Luis von Ahn had declared a shift to “AI-first.” A year later, he admitted the company had dropped AI usage from its evaluation standards.

In a podcast interview, he revealed that employees had asked: “Are we using AI just to look ‘AI-first,’ not because it adds value?”

Ultimately, Duolingo no longer counts AI usage as a formal performance metric. Von Ahn emphasized: what truly matters is whether employees can get work done effectively. AI suits certain tasks—but not all. Companies shouldn’t force employees to use AI where it doesn’t belong.

Of course, companies that once rushed employees to “embrace AI” haven’t abandoned AI altogether.

They’ve simply realized: not using AI is a problem. But employees burning through tokens to chase rankings, bonuses, or self-preservation may be an even more expensive one.

AI is great, but greed isn’t sustainable

Everyone knows building AI is expensive. But using AI this way? That’s surprising.

A prime example: Uber burned through its entire annual AI budget by April. Just months earlier, in December, Uber had opened access to Anthropic’s AI coding tool, Claude Code, to around 5,000 engineers.

As mentioned earlier, Microsoft began canceling most internal Claude Code licenses in May. Internally, they justified this move by citing the need to unify toolchains under GitHub Copilot CLI.

But according to The Verge, this was also a financial decision.

The Claude Code licenses are to be revoked en masse by the end of June, before the current fiscal year ends, to cut operating costs ahead of the new fiscal year.

More importantly, as Microsoft pushes employees back to Copilot CLI, the pricing model for Copilot itself is changing.

In April, GitHub announced that starting June 1, paid plans for enterprise and team users of GitHub Copilot would transition to a usage-based billing model. Previously, customers mainly paid based on subscription tiers and premium request quotas. Under the new model, each plan includes a fixed allotment of GitHub AI Credits, and any usage beyond that is billed according to actual consumption.

How is this cost calculated? Based on input tokens, output tokens, and cached tokens consumed during employee workflows.

GitHub’s official announcement explained that as Copilot takes on more complex agent tasks—such as analysis, modification, and iteration—compute demands vary significantly across tasks, necessitating a pay-per-use model.
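The billing logic described above (a metered cost built from input, output, and cached tokens, with an included allotment and overage charged on top) can be sketched roughly as follows. This is a minimal illustration, not GitHub’s actual formula: the `Rates`, `Usage`, `usage_cost`, and `overage` names are hypothetical, and so are all the prices and the allotment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical USD prices per million tokens; NOT any vendor's real rates."""
    input_per_m: float
    output_per_m: float
    cached_per_m: float


@dataclass(frozen=True)
class Usage:
    """Tokens one employee consumed in a billing period."""
    input_tokens: int
    output_tokens: int
    cached_tokens: int


def usage_cost(usage: Usage, rates: Rates) -> float:
    """Metered cost of one employee's usage, in USD."""
    return (
        usage.input_tokens * rates.input_per_m
        + usage.output_tokens * rates.output_per_m
        + usage.cached_tokens * rates.cached_per_m
    ) / 1_000_000


def overage(usages: list[Usage], rates: Rates, included_usd: float) -> float:
    """Amount billed beyond the plan's included allotment (never negative)."""
    total = sum(usage_cost(u, rates) for u in usages)
    return max(0.0, total - included_usd)


# Illustrative numbers only.
rates = Rates(input_per_m=2.0, output_per_m=10.0, cached_per_m=0.5)
heavy_user = Usage(input_tokens=1_000_000, output_tokens=100_000, cached_tokens=2_000_000)
team_overage = overage([heavy_user, heavy_user], rates, included_usd=5.0)
```

The point of the sketch is that the bill scales with every token employees burn once the included allotment is exhausted, which is exactly what makes a token leaderboard expensive.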

Anthropic has adopted a similar pricing logic.

Currently, Claude Enterprise seat fees only cover platform access—not actual usage. Every token generated by employees using Claude, Claude Code, or Cowork is billed separately at standard API rates.

More directly still, Anthropic’s official documentation explicitly warns enterprises that under the new usage-based pricing model there is no separate allocation of token quotas. One employee’s heavy AI usage won’t reduce others’ available quota; it only increases the organization’s final bill. The old fixed-seat model will gradually transition to this pay-per-use structure upon renewal.

OpenAI’s approach differs slightly. It hasn’t announced a universal switch to token-based billing for all enterprise plans. But in April, it introduced a pay-as-you-go option for Codex within ChatGPT Business and Enterprise teams: companies can avoid fixed seat fees and instead pay only for actual Codex usage.

Meanwhile, calling stronger models incurs significantly higher costs.

GPT-5.5, launched in the API in April, costs twice as much as GPT-5.4: under standard API pricing, both input and output token prices are doubled.

When companies tell employees to “use AI as much as possible,” but vendors charge precisely per call and per token, the situation becomes delicate.

The problem isn’t just cost.

The deeper existential question is: when the entire company races forward with AI, does it actually work?

People have long spotted the logical flaw: judging engineers by token consumption is no different from rating marketers by how much they spend.

While the entire industry uses AI, only a minority have successfully converted that usage into profit.

McKinsey’s “The State of AI in 2025” survey drew on 1,993 corporate respondents. Only 39% reported that AI had affected EBIT (earnings before interest and taxes) at the enterprise level.

McKinsey specifically defined a category of “AI high performers”: respondents who said both that AI had created significant value for their organization and that AI contributed at least 5% of EBIT. Only about 6% of respondents met both criteria.

Additionally, in July last year, research firm METR published a randomized controlled trial. Sixteen experienced open-source developers completed 246 real-world tasks in familiar codebases, with each task randomly assigned to be done either with or without AI tools.

Before the experiment, developers predicted AI would reduce task completion time by 24%.

After using the tools firsthand, they still estimated a 20% speedup.

But the measured result was the opposite: tasks took 19% longer when AI tools were allowed.
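To make the gap between perception and measurement concrete, here is a small Python sketch that applies the three percentages to a single task. The 60-minute baseline is a hypothetical figure chosen for illustration; it does not come from the study.

```python
# Relative changes in task completion time, as quoted from the METR study.
FORECAST = -0.24   # developers' prediction before the experiment
PERCEIVED = -0.20  # developers' own estimate after using the tools
MEASURED = 0.19    # change actually measured in the trial

BASELINE_MINUTES = 60.0  # hypothetical task length, illustration only


def minutes_with_ai(baseline_minutes: float, relative_change: float) -> float:
    """Task duration after applying a relative change to the no-AI baseline."""
    return baseline_minutes * (1 + relative_change)


for label, change in [
    ("forecast", FORECAST),
    ("perceived", PERCEIVED),
    ("measured", MEASURED),
]:
    print(f"{label:>9}: {minutes_with_ai(BASELINE_MINUTES, change):.1f} min")

# How far the developers' own impression sits from the measurement.
perception_gap = minutes_with_ai(BASELINE_MINUTES, MEASURED) - minutes_with_ai(
    BASELINE_MINUTES, PERCEIVED
)
```

On a one-hour task, developers would believe they finished in about 48 minutes while actually taking about 71, a gap of more than 23 minutes that no usage dashboard would reveal.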

This study focused on senior developers working in large open-source codebases they knew well, so it doesn’t show that AI coding tools help no one on any task.

But it does show that employees feeling more efficient with AI doesn’t mean they actually are, and that rising AI usage inside a company doesn’t necessarily bring matching output gains.

Once token consumption becomes a performance KPI and a “commitment detector,” the stage for farce is already set. AI won’t vanish. But the era of treating token consumption as a proxy for transformation progress, and equating “more AI use” with performance success, is likely coming to an end.

Good news indeed.

Original article from WeChat Official Account “LetterAI,” author: Xiao Jinya. Republished with permission from 36Kr.

Source: 36Kr

