State of AI 2026. Part 2

TL;DR

  1. AI capability is advancing fast. Deployment is not. 95% of enterprise AI pilots deliver zero measurable P&L impact (MIT NANDA, 2025). Klarna's AI-only customer support strategy reversed after a 22% drop in satisfaction. The economy is not ready for what the models can do.

  2. Compute demand is real under both scenarios. Whether displacement is rapid or slow, the path to reliable agentic AI runs through more training, more RL, more iteration. Infrastructure demand persists regardless of timeline.

  3. Slow diffusion is the base case. Software developer job postings are down 35% from January 2020, but BLS projects 17% growth through 2033. The market is adjusting, not collapsing. The lump of labor fallacy remains a fallacy, for now.

  4. Political tail risk is scenario-dependent. Rapid displacement creates backlash that threatens infrastructure investments. Slow diffusion allows adaptation and reduces regulatory risk.

  5. For infrastructure investments: The slow diffusion scenario is paradoxically bullish. If AI isn't reliable enough for enterprise deployment, the solution is more compute. Infrastructure demand persists while political backlash moderates.

Part 1 recap and framing

In Part 1, I laid out the AI infrastructure investment landscape: Jevons Paradox operating decisively (99%+ cost decline, 10x token usage growth), power as the binding constraint through 2030, hyperscaler capex growing from $443B to $600B+, and infrastructure operators functioning as leveraged call options on AI demand rather than stable infrastructure plays.

The conclusion was clear: the AI buildout is real, demand-driven, and physically constrained. It is not a bubble in the traditional financial sense. It is the capital-intensive re-industrialization of the global economy.

But that analysis left an open question. I framed IREN, CoreWeave, Cipher Mining, and Applied Digital as some of the counterparty bets with infrastructure characteristics. The key monitoring signal I flagged was agentic AI reliability improvements, because end-to-end success must improve for scaled deployment. That signal leads directly to the question this piece addresses:

If agentic AI takes longer to scale than headlines suggest, what are the economic cascade effects, and what does that mean for the infrastructure thesis?

This is Part 2. The focus shifts from infrastructure supply to economic demand. From "can we build it?" to "how fast does it actually change the economy?"

The diffusion constraint

Dario Amodei himself distinguishes between two exponentials: the capability of the model and the diffusion of the model into the economy. He calls diffusion "not instant, not slow, much faster than any previous technology but it has its limits."

This distinction is critical. The AI models may be ready. The economy is not.

In Part 1, I showed that 68% of production agents execute fewer than 10 steps before requiring human intervention, and that a 20-step workflow with 95% per-step success has only 36% end-to-end reliability. Salesforce Agentforce closed 18,500 deals and hit $500M ARR, but 92.5% of agents still deliver output to humans, not other systems. True machine-to-machine operation remains rare.
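The 36% figure is just compounding. A minimal sketch of the arithmetic, assuming each step succeeds or fails independently with the same probability (a simplification; real agent failures are often correlated, and the function name is my own):

```python
def end_to_end_success(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return per_step_success ** steps


# How quickly reliability decays as workflows get longer at 95% per step.
for steps in (5, 10, 20, 50):
    rate = end_to_end_success(0.95, steps)
    print(f"{steps:>2} steps at 95% per step: {rate:.1%} end-to-end")
```

At 20 steps this comes out at roughly 36%, which is why long workflows stall even when each individual step looks dependable.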

That was the supply-side view. Now let's look at what happens when enterprises actually try to deploy.

The Klarna case study: customer support and the human-in-the-loop

Customer support would seem to be the easiest white-collar function to automate: repetitive queries, documented procedures, measurable outcomes. If AI can't reliably replace customer support, what does that say about more complex roles?

The facts: In February 2024, Klarna claimed its AI chatbot could do the work of 700 human agents with resolution times of under 2 minutes vs. 11 minutes previously. CEO Sebastian Siemiatkowski declared: "I am of the opinion that AI can already do all of the jobs that we, as humans, do."

The reversal: By mid-2025, Klarna shifted away from its AI-only strategy to a hybrid approach, hiring human agents again after customer satisfaction dropped 22%. Siemiatkowski admitted that making cost the predominant evaluation factor resulted in lower quality.

The lesson: The chatbot successfully handled massive volumes of routine triage, but failed when real customers needed dispute resolution, refund help, or nuanced financial advice. It left "empathetic gaps" no algorithm could fill. This suggests AI is an incredible tool for accelerating productivity, but a complete "drop-in replacement" for human judgment and empathy remains far off.

This is not a defeat for AI. It is a necessary evolution toward a human-in-the-loop model. Think of it as "L1 triage" at scale: the AI handles volume, humans handle complexity. That hybrid model is where value accrues in 2026, not in full replacement.

The 95% failure rate: enterprise AI pilots face severe friction

MIT NANDA's "GenAI Divide" study (July 2025), based on 300+ AI initiatives, 52 organizational interviews, and 153 senior leader surveys, found that 95% of enterprise AI pilots deliver zero measurable P&L impact. Only 5% of custom enterprise AI tools reach production.

While industry critics rightly point out that this study narrowly defines success as direct, short-term P&L impact, often ignoring very real internal efficiency and productivity gains, the underlying friction the report highlights is undeniable. The failure to scale is attributed to "brittle workflows, weak contextual learning, and misalignment with day-to-day operations." Large enterprises take 9 months on average to scale pilots, compared to 90 days for mid-market firms. Bureaucracy itself is a constraint.

Separately, IBM's 2025 CEO Study found 75% of AI projects fail to deliver expected ROI, with 74% of CEOs fearing for their jobs if they can't prove AI is working. In 2025, 42% of companies abandoned most AI initiatives (up from 17% in 2024), per S&P Global Market Intelligence.

These numbers don't mean AI doesn't work. They mean deploying AI at enterprise scale is a fundamentally different problem from building capable models.

Why agentic AI is harder than ChatGPT

In Part 1, I noted that agentic workflows consume 10-100x more tokens than single-query interactions, and that reasoning token consumption per enterprise organization increased 320x YoY. Claude Code went from $17.5M ARR in April 2025 to $400M ARR by July 2025. The demand signal is real.

But distributing a chatbot is trivially easy: send text, receive text. Everyone is already on the internet and mobile. Reaching 100M users in record time proves utility in the chat interface. It does not prove that truly useful agentic AI is fast to implement.

Replacing a full white-collar job (not just answering questions) requires:

1. Deep expertise via reinforcement learning. RL is expensive and requires datacenter-scale compute. The "genius" model must be trained on domain-specific outcomes, not just general language. This is why slow diffusion is bullish for compute demand, not bearish. As Amodei pointed out recently, roughly 50% of contracted compute goes into research (pre-training, RL, etc.).

2. Hooks into enterprise ecosystems. Most enterprise systems are not fully digital or API-friendly. LLMs need clean data and programmable interfaces. Many companies still run on legacy systems, spreadsheets, and human judgment calls that have never been codified.

3. Testing and iteration. Each domain requires custom evaluation, failure mode analysis, and guardrails. The 68% of production agents that execute fewer than 10 steps before requiring human intervention reflects this integration complexity.

4. Organizational change management. As Amodei noted: the leaders of the company who are furthest from the AI revolution have to approve the spending, then explain it to people two levels below. Big companies and even SMBs move slowly. There are still restaurants without online booking.

The implication: even if we have a model that generalizes exceptionally well (a "genius"), giving that genius the tools, API access, reliability, and organizational permission to replace a job is easier said than done.

The SaaS counterargument

If AI coding (the most advanced capability we have now) is already so strong, why don't we have a Salesforce, Hubspot, or Adobe competitor built by AI today?

The answer reveals the diffusion constraint: building enterprise software requires understanding business processes, compliance requirements, customer workflows, integration patterns, support infrastructure, and years of iterative improvement based on real user feedback. AI can write code faster, but it cannot replicate the accumulated organizational knowledge embedded in mature SaaS platforms.

Further, 90% of companies don't have their SaaS well set up for an AI agent to serve. They need bug fixing, business process adaptation, data normalization, and API modernization. This is why "SaaS is dead" is likely wrong: the transition to AI-native enterprise software requires massive investment in the existing SaaS ecosystem, not its replacement.

Software developer employment: the canary that didn't die

Coding is the domain where AI is most capable. In Part 1, I highlighted how Claude Code is transforming developer workflows, with top developers now coding 80% in English and 20% in code. If displacement were happening at scale, software developers would be the first victims. What does the data show?

Job postings down, but not catastrophically. Indeed data shows software development job postings are down 35% from January 2020, the lowest level in five years. But this follows a 2021-2022 boom where postings were 3.5x higher than normal. The market is normalizing, not collapsing.

Long-term projections remain positive. BLS projects 17% growth in demand for software developers, QA analysts, and testers from 2023-2033. This is well above average job growth.

AI creating new roles. AI Engineer roles are up 143% since 2024. Cybersecurity faces a global shortage of 3.5M roles. The composition of tech employment is shifting, but total tech employment has increased 20.7% since 2019.

Even in coding, the domain where AI is strongest, we see a hiring slowdown (driven partly by post-pandemic correction) but not mass displacement. If the most AI-exposed profession shows adjustment rather than collapse, the rapid displacement thesis for other white-collar roles is weakened.

Two scenarios, one compute demand

Scenario A: rapid displacement (Amodei's 1-3 year "country of geniuses")

In this scenario, AI capability breakthroughs enable rapid enterprise adoption. Verifiable tasks (coding, financial analysis, legal research) are automated within 2 years. Entry-level white-collar jobs decline 50% by 2030.

Cascade effects: Consumer spending faces severe downward pressure as displaced workers transition. However, rather than a total cliff-edge collapse, this speed of displacement would likely force aggressive macroeconomic countermeasures. Policymakers would deploy rapid fiscal stabilizers, such as modernizing the tax code to incentivize human-capital retraining (treating it on par with physical equipment expensing) or expanding unemployment safety nets. Still, municipal tax bases would experience high volatility, and commercial real estate would face continued vacancy pressure.

Infrastructure implications: Demand for compute is massive (training, inference, agentic workflows). But regulatory and political risk rises. Datacenters become targets of public anger. The reflexive loop emerges: datacenters enable the technology disrupting communities hosting them. In Part 1, I noted 230+ organizations called for a national moratorium on new datacenter construction, and that was before displacement fears intensified.

Scenario B: slow diffusion, my base case (the lump of labor adapts)

In this scenario, AI capability continues advancing but enterprise adoption remains constrained by integration complexity, organizational inertia, and the 95% pilot failure rate. Displacement occurs but over 10-15 years, not 1-5 years.

Historical pattern holds: Just as the internet and mobile took years to spread despite obvious utility, agentic AI takes time to integrate. New job categories emerge (AI trainers, prompt engineers, AI operations). Workers reskill. The lump of labor fallacy remains a fallacy.

Infrastructure implications: Demand for compute is still massive, possibly even more so. If diffusion is slow because AI isn't reliable enough, the solution is more training, more RL, more specialized models. The path to reliability requires compute. Datacenters remain critical infrastructure with less political backlash because displacement fears don't materialize at headline rates.

The convergent implication: compute demand is real either way

This is the crucial insight for infrastructure investors. Both scenarios require massive compute investment.

If displacement is rapid: Inference demand explodes as AI agents handle millions of tasks. Training demand continues for next-generation models. Jevons Paradox, which I documented in Part 1 (10x token usage growth despite 99% cost decline), operates at even larger scale.

If diffusion is slow: Training demand explodes because current models aren't reliable enough. More RL, more domain-specific training, more iteration is needed to reach enterprise-grade reliability. The path to the "country of geniuses" runs through datacenters regardless of timeline.

The difference is tail risk. Rapid displacement creates political backlash that may threaten infrastructure investments. Slow diffusion allows time for economic adjustment and reduces the risk of regulatory expropriation. For positions in IREN, CoreWeave, Cipher Mining (CIFR), and Applied Digital (APLD), this distinction matters enormously for position sizing and time horizon.

Second-order effects under the slow diffusion scenario

If diffusion is slow (my base case), what happens to the cascade effects?

Consumer spending: pressure but not collapse

Slow diffusion means gradual job transitions rather than a synchronized shock. Displaced workers have time to reskill or move to adjacent roles. Consumer spending faces headwinds (already visible in declining sentiment) but not the cliff-edge collapse that rapid displacement would trigger.

Current evidence supports this: despite AI hype, consumer spending hasn't collapsed. Unemployment remains historically low. The "Rich Economy, Poor People" dynamic is emerging but manageable.

Tax bases: erosion but time to adapt

Slow diffusion gives municipalities and federal government time to redesign tax systems. Brookings' January 2026 framework proposes shifting from labor taxation to consumption taxes, robot services taxes, and AI rent capture. These reforms take years to implement, but slow diffusion provides those years.

The geographic dislocation Amodei warns about ("50% growth in Silicon Valley, current pace elsewhere") still occurs, but over a longer timeline. This creates investment opportunities in AI-adjacent regions while giving non-AI regions time to adapt.

Real estate: already happening, AI is a mixed signal

Office real estate pressure is already severe (18.8% vacancy, San Francisco at 35.8%) but primarily driven by remote work, not AI. AI is actually the only bright spot: AI companies have received $103B in SF venture capital since 2020 and are the primary source of new leasing.

Slow diffusion means this pattern continues: AI hubs (SF, Austin, Seattle) see leasing activity while non-AI regions face continued pressure from remote work. The bifurcation is real but driven by multiple factors, not AI displacement alone.

Public backlash: real but manageable

In Part 1, I documented $64B+ in datacenter projects blocked or delayed in Q1 2025, and $100B thwarted in Q2 2025. But this backlash is primarily about electricity prices and environmental impact, not job displacement. 72% of Americans have AI concerns, but these are about privacy, bias, and safety, not primarily employment.

If displacement fears don't materialize at headline rates, political backlash may moderate. The "job-killing AI" narrative loses power if visible mass unemployment doesn't occur. This reduces regulatory tail risk for infrastructure investors.

Is the "lump of labor" skepticism justified?

Amodei argues AI differs from previous technologies in four ways: speed, cognitive breadth, ability-level slicing, and gap-filling. If he's right, standard economic adjustment mechanisms may fail.

The historical pattern

Federal Reserve Governor Barr (May 2025): "Economists have long been skeptical of mass unemployment from automation. New technologies do eliminate some existing occupations, and not all workers benefit from technological change. But technology also creates new occupations."

The ATM example holds: ATMs were supposed to eliminate bank tellers, but instead led to more branches and more tellers for decades. Every previous automation wave was called "qualitatively different." Every previous prediction of mass technological unemployment proved wrong.

Why "this time is different" arguments may be premature

1. Speed is real, but diffusion is slow. ChatGPT reached 100M users in 2 months. But 95% of enterprise AI pilots fail to reach production. The capability curve is steep; the deployment curve is flat. This mismatch gives economies time to adjust.

2. Cognitive breadth is real, but context matters. AI can write code, analyze documents, and generate content. But it cannot navigate organizational politics, build relationships with clients, or make judgment calls in ambiguous situations. Klarna's reversal shows that even "simple" customer support requires human empathy and discretion.

3. Ability-level slicing is real, but creates new entry points. AI displaces routine junior tasks but creates new junior roles in AI operations, training data curation, and human-AI collaboration. The composition of entry-level work shifts, but entry-level work doesn't disappear.

4. Gap-filling assumes AI capability that doesn't yet exist. The argument that AI will "fill in the gaps" as humans move to new roles assumes AI can do all cognitive work. But the 95% pilot failure rate suggests AI still cannot reliably handle the integration, adaptation, and context-switching that real jobs require.

The equilibration question

The key question for infrastructure investors: will the economy equilibrate before or after political backlash threatens investments?

If diffusion is slow (10-15 years), economic adjustment mechanisms have time to operate. Workers reskill. New industries emerge. Tax systems adapt. Infrastructure investments proceed with manageable regulatory risk.

If diffusion is rapid (1-5 years), adjustment mechanisms fail. Displacement outpaces reskilling. Political backlash targets AI winners. Infrastructure investments face expropriation risk.

The 95% pilot failure rate, Klarna's reversal, and the 35% decline (not collapse) in software job postings all suggest we are in the slow diffusion scenario. The lump of labor skepticism may be premature.

Investment implications

What we know

1. AI capability is advancing rapidly. Amodei's 1-3 year "country of geniuses" timeline may be achievable for model capability, even if deployment lags.

2. Enterprise deployment is constrained. The 95% pilot failure rate, 9-month enterprise scaling times, and the Klarna reversal all indicate integration friction.

3. Compute demand is real regardless of scenario. Rapid deployment requires inference capacity. Slow diffusion requires more training to achieve reliability. As I showed in Part 1, this is why Jevons Paradox operates: the $602B+ hyperscaler capex forecast for 2026 reflects genuine demand, not speculation.

4. Political risk is scenario-dependent. Rapid displacement creates backlash. Slow diffusion allows adaptation and reduces regulatory tail risk.

What we don't know

1. Whether capability breakthroughs will accelerate deployment. If AI agents become 10x more reliable, enterprise adoption could accelerate dramatically.

2. Whether organizational change will lag capability. Even perfect AI cannot deploy faster than organizations can absorb change.

3. Whether political systems will adapt or resist. Datacenter moratoriums and robot tax proposals are early signals; the outcome is uncertain.
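To give a rough sense of what a reliability breakthrough would mean for the first unknown, here is a hedged sketch. It assumes "10x more reliable" means a 10x lower per-step failure rate (5% falling to 0.5%), that steps fail independently, and that a workflow has 20 steps. These are my illustrative assumptions, not measured values, and the function names are hypothetical.

```python
def workflow_success(per_step_failure: float, steps: int) -> float:
    """End-to-end success rate when every step must succeed independently."""
    return (1.0 - per_step_failure) ** steps


def required_step_success(target_end_to_end: float, steps: int) -> float:
    """Per-step success rate needed to hit a target end-to-end success rate."""
    return target_end_to_end ** (1.0 / steps)


STEPS = 20                 # assumed workflow length
BASELINE_FAILURE = 0.05    # 95% per-step success
IMPROVED_FAILURE = BASELINE_FAILURE / 10  # assumed meaning of "10x more reliable"

baseline = workflow_success(BASELINE_FAILURE, STEPS)
improved = workflow_success(IMPROVED_FAILURE, STEPS)
needed = required_step_success(0.90, STEPS)

print(f"Baseline end-to-end success: {baseline:.1%}")
print(f"After 10x lower step failure: {improved:.1%}")
print(f"Per-step success needed for 90% end-to-end: {needed:.2%}")
```

Under these assumptions, cutting per-step failures by 10x moves a 20-step workflow from roughly a one-in-three success rate to roughly nine in ten. That is why I treat per-step reliability as the variable that could collapse the diffusion timeline.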

Practical framework

For infrastructure investments: The slow diffusion scenario is paradoxically bullish. If AI isn't reliable enough for enterprise deployment, the solution is more compute, more training, more RL. Infrastructure demand persists while political backlash moderates.

Key risk to monitor: Capability breakthroughs that enable rapid deployment would be good for demand but increase political/regulatory risk. The optimal scenario for infrastructure is sustained high demand with gradual societal adaptation.

Timeline matters: Near-term (1-3 years), infrastructure investments are relatively safe. Hyperscaler contracts are locked in, as I detailed in Part 1 (Microsoft $392B commercial RPO, Google $155B cloud backlog, Amazon $195-200B backlog). Demand exceeds supply. Longer-term (5-10 years), scenario bifurcation becomes critical. Position sizing should reflect this uncertainty.

Conclusion: embracing uncertainty

The honest answer is: we don't know which scenario will materialize. AI capability is advancing faster than any previous technology. But diffusion into the real economy faces constraints that cannot be overcome by better models alone.

The facts on the ground suggest slow diffusion:

  1. Klarna couldn't fully automate customer support despite aggressive efforts and evolved to a human-in-the-loop model after satisfaction dropped 22%.

  2. 95% of enterprise AI pilots fail to reach production with measurable P&L impact (MIT NANDA, 2025).

  3. 75% of AI projects fail to deliver expected ROI (IBM, 2025).

  4. Software developer job postings are down 35% from January 2020, but BLS projects 17% growth through 2033. The market is adjusting, not collapsing.

  5. Agentic AI replacing full white-collar jobs requires not just capable models but enterprise integration, organizational change, and reliability that does not yet exist at scale.

This is bullish for infrastructure demand. If diffusion is slow because AI isn't reliable enough, the path to reliability runs through more compute, more training, more RL. AI infrastructure companies are positioned to benefit regardless of displacement timeline.

The second-order effects (consumer spending pressure, tax base erosion, political backlash) remain real risks but may unfold over 10-15 years rather than 1-5 years. This gives economic adjustment mechanisms time to operate and reduces regulatory tail risk.

The capital markets will be volatile as participants reach quick conclusions. "SaaS is dead" narratives will compete with "AI is overhyped" narratives. Neither is likely correct. The reality is messier: transformative capability is emerging, but transformation takes time.

For infrastructure investors, the practical advice is: stay positioned for demand, monitor deployment velocity, and don't let either extreme narrative drive decisions. The compute buildout will happen. The displacement timeline remains uncertain.

Key monitoring signals

Diffusion velocity indicators:

  1. Enterprise AI pilot-to-production conversion rates (currently 5%)

  2. Major enterprise AI deployment announcements (not pilots, actual production at scale)

  3. Agentic AI reliability benchmarks (currently 36% end-to-end success for 20-step workflows)

  4. Customer support AI adoption rates and satisfaction scores (Klarna reversal as leading indicator)

Political/regulatory risk indicators:

  1. Datacenter moratorium count by state (14+ states with local moratoriums as of December 2025)

  2. Utility rate case outcomes for datacenter customers

  3. Congressional AI tax/regulation proposals and voting patterns

  4. Entry-level white-collar unemployment rates (currently elevated but not catastrophic)

My take

Part 1 concluded that the AI buildout is real and IREN's thesis holds. Part 2 reinforces that view from a different angle.

The slow diffusion scenario, which the evidence strongly supports, is the best possible environment for AI infrastructure operators. It means sustained compute demand (more training, more RL, more iteration to achieve reliability) combined with moderate political backlash (no visible mass unemployment to trigger regulatory overreaction). The 95% pilot failure rate is not bearish for compute; it is bullish. Every failed pilot represents future demand for better models that require more training.

The risk I watch most closely is a sudden capability breakthrough that collapses the diffusion timeline. That would be extraordinary for demand but could trigger the political backlash and regulatory intervention that threatens the physical infrastructure layer. Position sizing accounts for this tail risk.

For now, the base case is clear: AI infrastructure demand persists, the displacement timeline is longer than headlines suggest, and the compute buildout remains the trade. Stay positioned. Monitor velocity. Don't panic at narratives.