State of AI 2026. Part 1
TL;DR
Efficiency drives demand. Inference costs down 99% since 2023, while token usage grew 10x YoY. Agentic workflows consume 10–100x more tokens per task.
Models are not commoditized (yet). 88% of enterprise LLM spend is concentrated in 3 providers. Switching costs are rising due to model-specific prompts and agents.
Agentic AI is deployed, not scaled. End-to-end reliability still low for multi-step workflows. Scaled impact likely post-2027.
Power is the constraint. AI compute demand grows 4–5x annually vs. 35% efficiency gains. Data centers consume 4% of US power today, trending to 6–8% by 2026.
Hyperscaler capex reflects demand. Capex is growing from $443B in 2025 to $600B in 2026. $300B+ cloud backlogs, 1.6% data center vacancy, and 70%+ pre-leasing support continued build-out.
Infra names are call options. High leverage, customer concentration, and tech obsolescence dominate returns, not infrastructure-like stability.
IREN. Thesis holds. 2026 execution and revenue diversification are the key variables.
2026 started strong for AI:
Claude Code is overtaking every previous coding tool. The most proficient developers (e.g. Andrej Karpathy) now claim to code 80% in English and 20% in code, versus 20%/80% just one month ago.
Social media is exploding with CLAWD (the autonomous AI agent sitting on your local machine). It basically works like a PA with access to your computer.
Newer Gemini and ChatGPT models keep advancing even without major release names. Both Google and OpenAI are going to integrate ads (and shopping) into their chatbots soon.
After hearing Crusoe’s CEO and reading the latest blog post from Anthropic’s CEO, I thought it was a good time to sit back and try to understand the next 12-36 months as objectively as possible: first as an AI/infra investor (Part 1), but also as a market and economic participant (Part 2).
I will start with the key questions I am looking to answer in this research note:
Rebound Effect Test (Volume vs Efficiency): Does a 10x increase in model efficiency lead to a 100x increase in usage (Jevons Paradox), or a 10% decrease in compute demand (Deflation)?
The "Commodity" Test: Does Intelligence behave like "Electricity" (a commodity where cheapest wins) or "Consulting" (a service where brand/trust wins)?
The “Agentic” Reality Check: Are we actually moving to "Machine-to-Machine" commerce, or is that just a narrative?
The Energy Hard-Ceiling: Is Power truly the bottleneck, or will new chip architectures save us? Does physical infrastructure become a depreciating asset faster than expected? What's the power arbitrage duration?
The "Capex" Bluff: Are Hyperscalers building because they need to, or because they are terrified of being left behind? Front-loaded or sustainable?
How concentrated will the customer base become? If 3-5 hyperscalers control this value, what is the negotiating power of independent datacenter operators?
How does chip export control policy affect infrastructure demand?
Efficiency creates demand, not deflation
The first framework question, whether efficiency gains lead to Jevons Paradox (more compute) or deflation (less compute), has a clear empirical answer. Jevons dominates decisively.
The cost per million tokens fell from $60 (GPT-4, March 2023) to $0.60 (GPT-4o mini, 2024) to $0.27 (DeepSeek V3 Chat, 2025), a 99%+ decline. Stanford's AI Index 2025 documented a 280-fold inference cost decline from November 2022 to October 2024. Yet total token consumption exploded: OpenRouter processed 10 trillion tokens in 2024 and 100 trillion by mid-2025, a 10x increase despite radically lower costs. OpenAI now processes 6 billion tokens per minute, up 20x in two years.
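The size of the price decline can be checked directly from the quoted figures. The snippet below is a minimal sketch in Python; the function names are illustrative, the prices are the ones cited above, and the comparison is across different models rather than like-for-like.

```python
def pct_decline(old: float, new: float) -> float:
    """Percentage decline going from the old price to the new price."""
    return (1 - new / old) * 100


def fold_change(old: float, new: float) -> float:
    """How many times cheaper the new price is than the old price."""
    return old / new


# USD per million tokens, as quoted in the text.
BASELINE = 60.00  # GPT-4, March 2023
later_prices = {
    "GPT-4o mini (2024)": 0.60,
    "DeepSeek V3 Chat (2025)": 0.27,
}

for model, price in later_prices.items():
    print(
        f"{model}: {pct_decline(BASELINE, price):.2f}% below GPT-4, "
        f"about {fold_change(BASELINE, price):.0f}x cheaper"
    )
```

With these inputs, the drop from $60 to $0.27 works out to roughly 99.55%, or about 222x cheaper.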
The mechanism is clear: cheaper inference enables compute-intensive applications that were previously uneconomical. Agentic AI workflows (multi-step reasoning loops with tool-calling and environmental interaction) consume 10-100x more tokens than single-query interactions. Reasoning token consumption per enterprise organization increased 320x year-over-year. Claude Code went from $17.5M ARR in April 2025 to $400M ARR by July (23x growth in three months) driven by developers running continuous inference loops.
Even DeepSeek's efficiency breakthrough (supposedly training a frontier model for $5.6M, leaving many costs out) validated rather than refuted Jevons: within days of its release, Meta increased 2025 AI spending to $60-65B (+50% YoY), and infrastructure investment accelerated across all hyperscalers. As Satya Nadella tweeted in January 2025: "Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket."
https://openai.com/index/the-state-of-enterprise-ai-2025-report/
AI intelligence resists commoditization, for now
The commodity test asked whether AI is becoming like electricity (cheapest wins) or consulting (brand/trust wins). Despite aggressive price wars (80%+ cuts on flagship models since 2023), differentiation persists and switching costs are rising.
Menlo Ventures' December 2025 survey found 88% of enterprise LLM API spend concentrated in three providers: Anthropic (40%), OpenAI (27%), and Google (21%). This data surprised even Imme, check it out here. Critically, these shares reflect use-case specialization, not just price competition. Anthropic commands 54% market share in coding (vs. OpenAI's 21%) and dominates enterprise API revenue. OpenAI leads consumer mindshare and multimodal capabilities. Google wins on price-performance for existing GCP customers.
Switching costs are increasing, not decreasing. The a16z survey of 100 CIOs found that agentic workflows create prompt and guardrail dependencies that make model changes expensive: "All the prompts have been tuned for OpenAI... changing models takes a lot of engineering time." The share of enterprises using 5+ models simultaneously rose from 29% to 37% in one year, but this reflects task specialization rather than vendor hedging. This is also consistent with my direct experience operating multiple agentic workflows: multiple agents, multiple models, and supplier-tuned prompts.
Financial signals confirm differentiation: Anthropic trades at 43.9x forward revenue versus OpenAI's 31x, and generates $211/monthly user versus OpenAI's $25/weekly user, an 8x monetization efficiency gap suggesting enterprise customers pay premium prices for perceived quality advantages.
However, commoditization pressure is real and growing. Chinese open-source models (DeepSeek) push prices toward zero. For simpler use cases, "all models perform well enough" and pricing becomes decisive. The infrastructure layer (inference compute) already shows commodity dynamics. The model layer may follow within 2-3 years as capabilities converge.
https://menlovc.com/perspective/2025-the-state-of-generative-ai-in-the-enterprise/
Agentic AI is real but unreliable (for now)
The agentic AI reality check reveals genuine production deployment alongside significant reliability limitations. This is not primarily hype, but scaled autonomous operation remains elusive.
Production evidence is substantial:
Salesforce Agentforce: 18,500 deals closed, 1.5M+ support requests handled, $500M ARR with 330% YoY growth
Capital One's Chat Concierge: 55% better lead conversion in multi-agentic workflows
UK Police "Bobbi" agent: Resolving 82% of citizen queries without human escalation
52-75% of enterprises report having AI agents deployed in some capacity
However, the gap between demos and production remains wide:
68% of production agents execute fewer than 10 steps before requiring human intervention
92.5% of agents deliver output to humans, not other systems; true machine-to-machine operation is still rare
A 20-step workflow with 95% per-step success has only 36% end-to-end reliability
40-95% failure rates reported depending on task complexity, with Gartner predicting 40%+ of agentic AI projects will be canceled by 2027
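The 36% figure follows from compounding per-step success across the workflow. The sketch below assumes every step succeeds independently with the same probability, which is a simplification (real agent failures are often correlated); the function names are illustrative.

```python
def end_to_end_success(per_step: float, steps: int) -> float:
    """Probability that all steps succeed, assuming independent, identical steps."""
    return per_step ** steps


def required_per_step(target: float, steps: int) -> float:
    """Per-step success rate needed to hit a target end-to-end success rate."""
    return target ** (1 / steps)


# The example from the text: 20 steps at 95% per-step success.
print(f"20 steps at 95%: {end_to_end_success(0.95, 20):.1%} end-to-end")

# The inverse question: what per-step rate gives 90% end-to-end over 20 steps?
print(f"Per-step rate for 90% over 20 steps: {required_per_step(0.90, 20):.2%}")
```

Under these assumptions, reaching 90% end-to-end reliability on a 20-step workflow requires roughly 99.5% per-step success, which illustrates why small per-step gains matter so much for scaled deployment.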
Machine-to-machine commerce infrastructure is actively being built (Google's AP2 protocol, Coinbase's x402, Visa/Mastercard agent credentials), but adoption remains limited. Amazon's "Buy for Me" and ChatGPT's in-chat checkout represent early experiments, not mainstream commerce.
The infrastructure implication is significant: if agentic AI scales, it will consume orders of magnitude more compute per interaction. This reinforces rather than undermines the datacenter demand thesis, but the timeline for scaled agentic deployment is 2027+ rather than 2026.
Back in 2023 we were having the same “AI is not ready yet” discussion about coding. Only 2.5 years later, Claude Code completes a majority of routine tasks within modern engineering workflows. The economic incentives for scaled agentic commerce are substantial, and I believe Google and OpenAI are investing heavily in this. A native checkout embedded directly within AI chatbots would materially lower friction, unlocking broad adoption and further demand for AI.
Power is the binding constraint through 2030
The power bottleneck thesis is strongly confirmed. AI compute demand is growing 4-5x annually while chip efficiency improves only 35-40% per year and grid interconnection takes 5-7 years in high-demand markets.
Current constraints are severe:
Virginia (PJM): 7-year interconnection delays; PJM capacity auction costs ballooned from $2.2B to $14.7B in one year
Texas (ERCOT): 700% spike in large-load interconnection requests from 2023-2024
National queue backlog: 2,600 GW of generation seeking grid connection; only 20% of 2000-2018 requests reached commercial operation
North American datacenter vacancy: Record-low 1.6%; demand far exceeds supply
Chip efficiency is improving meaningfully but cannot close the gap. NVIDIA's Blackwell (B200) delivers 47% better FLOPS/watt than Hopper (H100). Google's TPU v7 (Ironwood) achieves 2x performance/watt versus v6 and 30x versus original TPU. AWS Trainium3 offers 4x better energy efficiency than Trainium2.
But the math doesn't work: because both figures are growth multipliers, the net gap is their ratio. 4-5x annual demand growth divided by a 1.35-1.40x efficiency improvement still leaves a 2.9-3.7x annual net gap. Power demand for frontier training runs is growing 2.2-2.9x per year, with individual large training runs now requiring 25-300 MW. (Source here)
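One way to reproduce the 2.9-3.7x range is to divide the demand multiplier by the efficiency multiplier, since both are annual growth factors. The sketch below is illustrative only: the function name is hypothetical, and it assumes the 35-40% efficiency gain quoted earlier applies uniformly to all deployed compute.

```python
def net_power_gap(demand_growth: float, efficiency_gain: float) -> float:
    """Annual growth in power needed after netting out efficiency gains.

    Both arguments are multipliers (e.g. 4.0 means 4x, 1.35 means +35%).
    """
    return demand_growth / efficiency_gain


# Best case for the grid: slowest demand growth, fastest efficiency gain.
low = net_power_gap(4.0, 1.40)
# Worst case for the grid: fastest demand growth, slowest efficiency gain.
high = net_power_gap(5.0, 1.35)

print(f"Net annual power gap: {low:.2f}x to {high:.2f}x")
```

With these inputs the range is about 2.86x to 3.70x per year, in line with the 2.9-3.7x quoted above.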
Solutions exist but have long lead times:
Small modular reactors: Google, Amazon, Microsoft have committed $10B+ to SMR partnerships, but first commercial deployment is 2029-2030
Nuclear restarts: Microsoft's Three Mile Island Unit 1 restart is targeted for 2028
On-site generation: Natural gas turbines and microgrids are deploying faster but have environmental tradeoffs
The bottom line: power is the primary bottleneck through 2028-2030, and no combination of chip efficiency and new generation can close the gap without either algorithmic breakthroughs or demand slowdowns.
Hyperscaler capex is demand-driven with a FOMO multiplier
The "Capex Bluff" thesis, that hyperscalers are building from competitive fear rather than genuine demand, has partial validity but is not the dominant explanation.
Evidence for demand-driven investment:
Customer backlogs at record highs: Microsoft $392B commercial RPO (+51% YoY), Google $155B cloud backlog (+46% QoQ), Amazon $195-200B backlog
All three major hyperscalers explicitly cite supply constraints limiting growth (Microsoft CFO: "Demand exceeded supply across workloads")
1.6% datacenter vacancy indicates genuine scarcity
Cloud infrastructure revenue growing 25%+ consistently for 5 consecutive quarters
74% of new datacenter capacity is pre-leased before construction completes
Evidence for fear-amplified spending:
Zuckerberg explicitly stated he would "rather misspend a couple hundred billion dollars" than miss the AI transformation
Pichai acknowledged "elements of irrationality" in current AI investment levels
Revenue/capex ratio is roughly 6%: $400B+ capex generating $25B in AI-specific revenue
Debt financing replacing cash funding for the first time (Meta $30B bond offering, Amazon $12B)
Bank of America projects AI capex will consume 94% of operating cash flows by 2026
The synthesis: hyperscaler investment is predominantly demand-driven (backlogs, vacancy rates, constraint statements) but fear amplifies incremental spending beyond strict ROI calculations. The 2026-2027 window is critical: if enterprise AI revenue doesn't materialize at scale, capex retrenchment is possible. Watch for Q3-Q4 2026 capex guidance revisions as the leading indicator.
Sovereign AI as a Demand Driver
A new vector of demand is Sovereign AI. Nation-states are entering the capex race, treating compute as a strategic national resource.
National Clouds: Countries like France (Mistral + Nvidia partnership), Japan, and the UAE are building "AI Factories" to ensure data sovereignty
Defense & Security: The U.S. government has launched initiatives to secure "AI primacy," further driving demand for domestic infrastructure and influencing export controls on chips to China.
Infrastructure operators: concentrated bets, not stable plays
I focused my analysis on these four AI infrastructure operators (IREN, CoreWeave, Cipher Mining, Applied Digital). Objectively right now they are best understood as leveraged call options on hyperscaler AI capex rather than stable infrastructure investments, until proven otherwise. All face extreme customer concentration and significant execution risk.
CoreWeave (CRWV):
$6 billion debt against $5.1B 2025 revenue guidance
62% of 2024 revenue from Microsoft/OpenAI alone; top 2 customers = 77%
$310M quarterly interest expense consuming operating leverage
Free cash flow deeply negative (-$8B+ LTM)
Debt covenants were recently amended to reduce liquidity requirements, which could be a warning sign
IREN offers a hybrid model with a meaningful hedge:
$9.7B, 5-year Microsoft contract (November 2025) transforms the company but creates binary risk
Bitcoin mining provides revenue floor if AI pivot fails
$5.8B Dell equipment commitment requires ongoing financing
Vertical integration (power, datacenters, GPUs) is a differentiator
100% renewable energy via owned hydro and solar
Cipher Mining (CIFR) pursues the lowest-risk "AI landlord" approach:
$5.5B, 15-year AWS lease (300MW capacity, 2026 delivery)
$3.8B Fluidstack/Google contracts (10-year, 300MW)
Does NOT provide compute services; it leases space and power only, avoiding GPU obsolescence risk
Real estate-like returns = lower upside but better risk-adjusted profile
~$0.017/kWh power cost is exceptional
Applied Digital (APLD) has attractive economics but concentrated counterparty risk:
$11B, 15-year CoreWeave contract (400MW), but this largest customer has its own debt problems
North Dakota location provides $50-60M annual savings per 100MW vs. traditional locations
Targeting REIT conversion, which could command higher multiples
But if CoreWeave struggles, Applied Digital's contracted revenue is at risk
The customer concentration risk is systemic: CoreWeave depends on Microsoft, IREN depends on Microsoft (for now), Applied Digital depends on CoreWeave (which depends on Microsoft), and Cipher depends on AWS. These are not diversified infrastructure plays: they are counterparty bets with infrastructure characteristics.
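The dependency chain described above can be traced mechanically to see where the exposure ultimately lands. This is a minimal sketch, assuming each operator has a single primary counterparty as in the paragraph above (Cipher's Fluidstack/Google contracts, for example, are left out); the names `depends_on` and `counterparty_chain` are illustrative.

```python
from collections import Counter

# Primary counterparty per operator, simplified from the text.
depends_on = {
    "CoreWeave": "Microsoft",
    "IREN": "Microsoft",
    "Applied Digital": "CoreWeave",
    "Cipher Mining": "AWS",
}


def counterparty_chain(company: str, graph: dict[str, str]) -> list[str]:
    """Follow dependencies until reaching a company with no listed counterparty."""
    chain = [company]
    seen = {company}
    while chain[-1] in graph:
        nxt = graph[chain[-1]]
        if nxt in seen:  # guard against circular dependencies
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


# Count how many operators ultimately rely on each end customer.
exposure = Counter(counterparty_chain(op, depends_on)[-1] for op in depends_on)

for op in depends_on:
    print(" -> ".join(counterparty_chain(op, depends_on)))
print(dict(exposure))
```

Under this simplified mapping, three of the four operators trace back to Microsoft, which is the sense in which the concentration risk is systemic rather than company-specific.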
Regulatory and geopolitical landscape creates execution uncertainty
AI regulation is fragmenting at the state level while federal action remains deadlocked:
California SB 53 (effective January 1, 2026) requires frontier AI developers to publish safety frameworks and report incidents within 15 days
New York RAISE Act (signed December 2025) imposes stricter 72-hour incident reporting
Trump administration is pursuing federal preemption via executive order and FTC challenges to state laws
EU AI Act reaches full application by August 2, 2026, with fines up to €35M or 7% of global revenue
Export controls remain unstable:
January 2026 rules permit case-by-case licensing for NVIDIA H200 and AMD MI325X sales to China (with 25% fee)
China may access up to 850,000 H200-equivalents under current thresholds
Huawei's Ascend 950 (Q1 2026) will approach A100 performance but remain 2-5 years behind the cutting edge
Policy reversals (Biden restrictions → Trump loosening) create planning uncertainty for all parties
Datacenter-specific opposition is growing:
230+ organizations called for a national moratorium on new datacenter construction (December 2025)
Water restrictions emerging in Texas, Arizona, California
Investment framework determination
Are IREN/CIFR/APLD infrastructure plays or call options on AI demand?
They are call options, not stable infrastructure plays. The evidence:
Extreme customer concentration (62-77% revenue from top 1-2 customers)
Significant leverage (CoreWeave: 4.85x debt-to-equity; IREN: $5.8B equipment commitment)
Technology obsolescence risk for GPU-based business models (CoreWeave, IREN)
Binary outcomes: if hyperscaler contracts are terminated or not renewed, these companies face existential threats
Short operating history (CoreWeave IPO: March 2025; IREN AI pivot: 2024)
Power arbitrage duration is likely 3-5 years. The grid interconnection queue (2,600 GW backlog, 5-7 year lead times) and SMR deployment timeline (2029-2030) create structural barriers to a rapid build-out of new energy infrastructure. Current cost advantages (Cipher at $0.017/kWh, IREN at $0.036/kWh) should persist through at least 2030.
Framework for Amodei's projections: If AI companies genuinely approach "$3T in revenue per year" valued at "$30T," the infrastructure demand would be extraordinary, far exceeding current projections. However, this scenario implies AI systems that can autonomously generate that value, which would also mean AI capable of dramatically improving its own efficiency. The "country of geniuses" scenario creates both massive demand and massive uncertainty about physical infrastructure depreciation rates.
Key Metrics Watchlist (2026 Outlook)
| Metric | 2025 Actuals | 2026 Forecast | Implication |
| --- | --- | --- | --- |
| Hyperscaler Capex | $443B | $602B+ | Continued hardware bull market; margin pressure for Big Tech. |
| Inference Cost | $0.50/1M tokens | <$0.05/1M tokens | Explosion of "always-on" agentic applications. |
| Agentic GDP | Negligible | $50B - $100B | Emergence of non-human economic actors as a measurable demographic. |
| Data Center Power | 4% of US demand | 6-8% of US demand | Grid strain; rising industrial electricity prices; regulatory backlash. |
Final Verdict: The "AI Bubble" is not a bubble in the traditional financial sense; it is the capital-intensive re-industrialization of the global economy. It will be messy, inefficient, and physically constrained, but it is not stopping.
https://introl.com/blog/hyperscaler-capex-600b-2026-ai-infrastructure-debt-january-2026
The investment case depends on timing and counterparty quality
The AI infrastructure investment thesis rests on three pillars that the evidence supports:
Jevons Paradox is operating: Efficiency gains create more demand, not less. This is well-established.
Power is genuinely constrained: The 4-5x demand growth vs. 1.35x efficiency improvement gap cannot be closed without new power generation arriving in 2029-2030+.
Hyperscaler capex is predominantly demand-driven: $300B+ backlogs and 1.6% vacancy indicate genuine customer demand, though competitive fear adds 10-20% incremental spending.
The infrastructure operator investment cases are not infrastructure investments in the traditional sense. They are:
Leveraged bets on hyperscaler contract durability (CoreWeave, IREN, Applied Digital)
Counterparty risk exposure (Applied Digital → CoreWeave → Microsoft chain)
Technology obsolescence exposure (GPU-based models require continuous reinvestment)
Call options on AI demand sustainability rather than stable returns on invested capital
The lowest-risk approach is the power-first "landlord" model (Cipher Mining, to a lesser extent Applied Digital) that avoids GPU technology risk and earns real estate-like returns.
The highest-risk/highest-reward approach is GPU cloud services (CoreWeave, IREN) where massive revenue growth is offset by massive leverage and obsolescence risk.
Key monitoring signals for the 12-36 month outlook:
Hyperscaler capex guidance revisions (Q3-Q4 2026)
AI services revenue as percentage of total capex (currently 6%)
Agentic AI reliability improvements (end-to-end success must improve for scaled deployment)
Grid interconnection progress and SMR construction milestones
Enterprise AI ROI realization rates
My take
My investment thesis on IREN remains intact. The analysis above reinforces the view that IREN represents a high-reward, execution-driven exposure to AI infrastructure demand rather than a traditional infrastructure asset. The “call option” nature of the investment is unchanged: upside remains significant, while risks are increasingly centered on delivery and counterparty concentration rather than the underlying demand environment.
2026 is a pivotal year. Execution against contracted capacity, progress on revenue diversification, and market perception of IREN’s transition from Bitcoin mining to AI infrastructure will be the key drivers. Risks remain, but they are being actively managed through balance sheet strength, contracted revenue, and operational delivery. Positioning and conviction will continue to be guided by observable progress rather than narrative.