Structural Thesis / 100x Co-Design Paradigm

The real moat is no longer who buys the most chips.

It is who co-designs silicon, kernels, and models as one system.

By May 2026, Anthropic's annualized revenue run-rate had reached $47 billion -- nearly double OpenAI's -- while the company spent roughly one-quarter as much on training compute. The divergence reveals two competing bets on the intelligence age: disciplined enterprise co-design versus capital-intensive vertical integration, both racing toward the same trillion-dollar IPO finish line.

Anthropic $47B ARR / OpenAI $24B ARR / Claude Code $8B ARR

$47B: Anthropic ARR, May 2026

$24B: OpenAI ARR, April 2026

3.5 GW: TPU capacity secured via Google and Broadcom

9 mo: OpenAI Jalapeno design-to-tape-out cycle

Archive Plates

Anthropic responsible scaling policy interface artwork — Anthropic / governance stack

OpenAI research group photograph — OpenAI / scaling culture

Claude Code better together editorial artwork — Claude Code / enterprise wedge

The 100x multiplier in frontier AI no longer comes from buying faster chips in isolation. It comes from the organizational capacity to co-design hardware, low-level systems software, and model architecture as a single integrated system -- a reality illustrated by DeepSeek's Hopper-native MoE and Gemini's TPU lock-in. Anthropic has operationalized this thesis at scale through disciplined TPU partnerships and the fastest-growing enterprise coding agent in history, reaching $47B ARR with materially lower infrastructure burn than OpenAI. OpenAI's Jalapeno ASIC and Public Wealth Fund gambit represent a higher-risk, capital-heavy counter-bet on the same AGI timeline.

In the spring of 2026, while most observers still measured the AGI race by the number of H100s each lab could secure, Anthropic quietly crossed a threshold that inverted the prevailing hierarchy. Its annualized revenue run-rate hit $47 billion in May -- nearly double OpenAI's $24 billion -- and it got there while spending roughly one-quarter as much on training compute. The gap was not luck or marketing. It was the first large-scale proof that the old model of siloed hardware procurement had hit its mathematical ceiling.

01 The Co-Design Math

For years the industry operated in three separate rooms. Hardware teams designed general-purpose accelerators. Systems engineers wrote CUDA kernels and serving frameworks to talk to them. Researchers designed transformer variants in the abstract. Each group optimized for its own metrics. The result, as semiconductor analyst Dylan Patel of SemiAnalysis has argued, was a hard ceiling on returns. A 2x improvement in silicon, a 2x improvement in kernel efficiency, and a 2x improvement in model sparsity compound to, at best, an 8x system-level gain (2 x 2 x 2), and only if none of each layer's gain is lost at the boundary with the others. The 100x thesis states that when those three layers are co-designed -- when the model's expert dimensions are shaped to the exact tile sizes of the silicon, when the kernel's memory access patterns are written for the specific interconnect, when the architecture itself is tuned to the memory hierarchy -- each layer's gain grows because the other two were shaped around it, and the product climbs well past what isolated optimization can reach. The compounding is non-linear because each layer removes friction the others would otherwise have to fight.
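The arithmetic behind the 8x and 100x figures can be sketched in a few lines. This is a minimal illustration, assuming three layers whose speedups multiply in the best case; the function names are hypothetical and not taken from any cited analysis.

```python
import math

def system_gain(layer_gains):
    """Best-case system speedup: the product of the per-layer speedups."""
    return math.prod(layer_gains)

def per_layer_gain_needed(target, layers=3):
    """Uniform per-layer speedup required to reach `target` across `layers`."""
    return target ** (1.0 / layers)

# Three isolated 2x improvements, as in the text.
isolated = system_gain([2.0, 2.0, 2.0])

# What each of the three layers would have to deliver to reach 100x.
needed = per_layer_gain_needed(100.0)

print(f"isolated 2x per layer -> {isolated:.0f}x")
print(f"per-layer gain needed for 100x -> {needed:.2f}x")
```

Read this way, the 100x claim amounts to saying that co-design lifts each layer from roughly 2x to roughly 4.6x, because each layer is shaped around the other two.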

The lock-in is geometric, not contractual.

DeepSeek V3 and V4 provided the clearest public demonstration. The researchers explicitly sized their Mixture-of-Experts dimensions and routing patterns to match the matrix-multiply tile geometry and memory hierarchy of NVIDIA's Hopper architecture. The model ran with extraordinary efficiency on Hopper and later Blackwell. When the same weights were dropped onto Google's TPU v6e -- an objectively powerful accelerator -- performance collapsed. The communication patterns and tensor shapes were misaligned with the TPU's network topology. Conversely, Google's Gemini models are co-optimized for TPU interconnects and memory bandwidth; they lose efficiency when ported to NVIDIA silicon. The vaunted CUDA moat is being partially replaced by a deeper, more structural lock-in: the geometric structure of frontier model architectures themselves now binds customers to specific hardware families.
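One narrow piece of this geometric lock-in, padding waste when tensor dimensions do not divide evenly into hardware tiles, can be sketched as follows. The tile sizes and matrix shapes are hypothetical, chosen only for illustration, and are not published figures for Hopper or any TPU. The sketch also ignores the interconnect and communication effects the paragraph describes.

```python
import math

def tile_utilization(dim, tile):
    """Fraction of useful work along one dimension after padding to a tile multiple."""
    padded = math.ceil(dim / tile) * tile
    return dim / padded

def matmul_utilization(m, n, k, tile_m, tile_n, tile_k):
    """Useful fraction of the padded (m x k) @ (k x n) matmul work."""
    return (tile_utilization(m, tile_m)
            * tile_utilization(n, tile_n)
            * tile_utilization(k, tile_k))

# Hypothetical expert matmul sized as a multiple of the first tile shape.
shape = (2048, 2048, 2048)

aligned = matmul_utilization(*shape, tile_m=128, tile_n=128, tile_k=64)
misaligned = matmul_utilization(*shape, tile_m=384, tile_n=384, tile_k=64)

print(f"aligned tiles:    {aligned:.3f}")
print(f"misaligned tiles: {misaligned:.3f}")
```

The same weights that fill one tile geometry exactly leave part of every tile idle on another, before any network-topology mismatch is counted.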

02 The Compute Shortage Is Structural

This is happening against the backdrop of a structural compute shortage, not a cyclical one. The total addressable market of economically useful AI tasks -- what some analysts call dark GDP -- is expanding faster than the industry can build power-dense capacity. Even with 20 gigawatts of new data center capacity coming online in 2026 and over 30 gigawatts in 2027, demand remains unsatisfied. The mismatch is severe enough that terrestrial power constraints are already pushing serious planning toward space-based data centers, projected to dominate new deployments by 2040. NVIDIA's Jensen Huang has responded by deliberately arming neoclouds and specialized labs rather than letting hyperscalers consolidate all capacity, preserving a multi-polar market where performance and time-to-rack matter more than legacy tenant isolation advantages.

03 Software Is Writing Its Own Infrastructure

The middle layer -- systems software -- has become the most dynamic. Triton, originally developed inside OpenAI, abstracts CUDA complexity while preserving explicit control over parallelism and memory. But the real acceleration is coming from AI itself. Systems such as Meta's KernelEvolve and Stanford's AutoKernel use large language models to profile PyTorch models, extract bottlenecks, generate candidate Triton or custom kernels, and subject them to rigorous verification: smoke tests, shape sweeps, numerical-stability checks, and determinism checks. What once took engineers weeks of trial-and-error now takes hours. On KernelBench, models like DeepSeek R1 have raised their success rate on Level-1 tasks from 12% to 72% through iterative test-time compute. The industry is building the tools to automate its own infrastructure optimization at superhuman speed.
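The four verification stages named above can be sketched as a small harness. This is an illustrative toy, not the pipeline of KernelEvolve, AutoKernel, or KernelBench: it uses pure-Python matrix multiplication as a stand-in for a generated kernel, and every function name here is hypothetical.

```python
import random

def reference_matmul(a, b):
    """Straightforward matmul used as the trusted baseline."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def candidate_matmul(a, b):
    """Stand-in for a generated kernel: same math, different loop order."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):
            aik = a[i][k]
            for j in range(cols):
                out[i][j] += aik * b[k][j]
    return out

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def max_abs_diff(x, y):
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))

def verify(candidate, shapes, tol=1e-9, seed=0):
    """Return a dict mapping each check name to True (passed) or False."""
    checks = ["smoke", "shape_sweep", "numerics", "determinism"]
    # Smoke test: the kernel must at least run on a trivial input.
    try:
        candidate([[1.0]], [[1.0]])
    except Exception:
        return dict.fromkeys(checks, False)
    results = dict.fromkeys(checks, True)
    rng = random.Random(seed)
    for m, k, n in shapes:
        a, b = random_matrix(m, k, rng), random_matrix(k, n, rng)
        out = candidate(a, b)
        # Shape sweep: output dimensions must match for every (m, k, n).
        if len(out) != m or any(len(row) != n for row in out):
            results["shape_sweep"] = False
            continue
        # Numerics: stay within tolerance of the reference result.
        if max_abs_diff(out, reference_matmul(a, b)) > tol:
            results["numerics"] = False
        # Determinism: a second run on the same inputs must agree exactly.
        if candidate(a, b) != out:
            results["determinism"] = False
    return results

report = verify(candidate_matmul, shapes=[(1, 1, 1), (3, 5, 2), (7, 4, 9)])
print(report)
```

In an automated loop, a candidate that fails any check would be rejected and its failure fed back to the generating model as context for the next attempt.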

At the serving layer, SGLang has pushed the throughput-interactivity Pareto frontier with two features: RadixCache, which treats the prompt as a stream and maintains an LRU KV-cache across calls, and Ragged Paged Attention. SGLang-Jax brings the same primitives natively to TPU via XLA. Daily automated sweeps on SemiAnalysis's InferenceX platform -- running across roughly 15 chip types and major frameworks -- now show that at high-interactivity operating points, AMD's Instinct MI355X on SGLang can deliver materially lower cost-per-token than NVIDIA's GB300 NVL72 under equivalent precision and without Multi-Token Prediction. The benchmark makes the 100x thesis visible in real time: the winner is not the fastest raw chip, but the organization that aligns model, serving framework, and silicon most tightly.
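The idea of reusing a cached prompt prefix across calls, with least-recently-used eviction, can be sketched as below. This is a toy model and not SGLang's RadixCache: it scans a flat table instead of walking a radix tree, stores a placeholder instead of KV tensors, and all class and function names are hypothetical.

```python
from collections import OrderedDict

def common_prefix_len(a, b):
    """Number of leading tokens that two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

class PrefixKVCache:
    """Toy prefix cache with LRU eviction over whole token sequences."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # token tuple -> placeholder for KV data

    def longest_prefix(self, tokens):
        """Length of the longest prefix of `tokens` covered by a cached sequence."""
        best_key, best_len = None, 0
        for key in self._entries:
            n = common_prefix_len(key, tokens)
            if n > best_len:
                best_key, best_len = key, n
        if best_key is not None:
            self._entries.move_to_end(best_key)  # mark as recently used
        return best_len

    def insert(self, tokens):
        key = tuple(tokens)
        self._entries[key] = len(key)
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used

def tokens_to_compute(cache, tokens):
    """Tokens that still need prefill after reusing the cached prefix."""
    reused = cache.longest_prefix(tokens)
    cache.insert(tokens)
    return len(tokens) - reused

cache = PrefixKVCache(capacity=2)
system_prompt = [1, 2, 3]
first = tokens_to_compute(cache, system_prompt + [4])      # cold cache
second = tokens_to_compute(cache, system_prompt + [4, 5])  # extends first call
third = tokens_to_compute(cache, system_prompt + [9])      # shares system prompt
print(first, second, third)
```

A radix tree gives the same prefix-matching behavior without scanning every entry, which is what makes the approach practical at serving scale.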

04 Algorithmic Progress Compresses the AGI Timeline

While hardware-software co-design governs physical efficiency, algorithmic progress governs effective compute. Leopold Aschenbrenner's framework quantifies progress in orders of magnitude (OOMs). The jump from GPT-2 (roughly a preschooler) to GPT-4 (roughly a smart high-schooler) required roughly 4.5-6 OOMs of effective compute over four years. Physical compute contributed about 0.5 OOM per year. Algorithmic efficiency contributed another 0.5 OOM per year. Unhobbling -- Chain-of-Thought, RLHF, tool use, agentic scaffolding -- contributed about 2 OOMs. Over four years those three terms sum to about 6 OOMs, the top of the range. The framework expects another 3-6 OOMs of effective compute by 2027, enough in its projection to reach AGI-level capability. The more profound shift is that models capable of autonomous research work will compress a decade of human algorithmic progress into a single year. Hundreds of millions of AI agents running parallel experiments at silicon speed will trigger an intelligence explosion that moves the bottleneck from algorithms to power generation and gigawatt-scale deployment.
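The OOM tally in this paragraph is simple enough to check directly. A minimal sketch, using only the per-year rates and the unhobbling figure quoted above; the function name is hypothetical.

```python
def effective_compute_ooms(years, compute_per_year, algo_per_year, unhobbling):
    """Total orders of magnitude of effective compute over the period."""
    return years * compute_per_year + years * algo_per_year + unhobbling

# GPT-2 to GPT-4: four years at 0.5 OOM/year each for physical compute and
# algorithmic efficiency, plus about 2 OOMs attributed to unhobbling.
total_ooms = effective_compute_ooms(
    years=4, compute_per_year=0.5, algo_per_year=0.5, unhobbling=2.0
)
multiplier = 10 ** total_ooms

print(f"total: {total_ooms:.1f} OOMs, i.e. {multiplier:,.0f}x effective compute")
```

Because OOMs add while the underlying multipliers compound, each extra OOM is another factor of ten in effective compute.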

05 Two Paths to the Same Trillion-Dollar IPO

OpenAI and Anthropic have both concluded that reliance on general-purpose GPUs from hyperscalers is a strategic vulnerability. Their solutions diverge sharply. OpenAI partnered with Broadcom and Celestica to design Jalapeno, a captive inference-only ASIC optimized for the decode phase of autoregressive generation. The chip approaches the EUV reticle limit at about 840 mm², uses a systolic array, and co-packages six to eight HBM3/HBM4 stacks directly on a silicon interposer. Development moved from initial design to tape-out in nine months -- accelerated by OpenAI's own models. The projected payoff is a 50% lower inference cost of ownership. The trade-off is architectural rigidity: if future reasoning models move beyond the specific attention patterns hardcoded into Jalapeno, the chip becomes a sunk-cost anchor. Training remains entirely on NVIDIA GPUs.
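A back-of-the-envelope estimate shows why a decode-focused chip would pack HBM stacks next to the die. Single-stream decoding of a dense model has to read every weight for each generated token, so memory bandwidth caps the token rate. The sketch below assumes exactly that simplification and ignores KV-cache reads and batching. The parameter count and bandwidth are hypothetical, not Jalapeno specifications.

```python
def decode_tokens_per_second(param_count, bytes_per_param, hbm_bytes_per_second):
    """Upper bound on single-stream decode speed when weight reads dominate."""
    bytes_per_token = param_count * bytes_per_param
    return hbm_bytes_per_second / bytes_per_token

# Hypothetical model and memory system, for illustration only.
params = 70e9          # dense parameters
bandwidth = 8e12       # aggregate HBM bandwidth in bytes per second

bound_8bit = decode_tokens_per_second(params, 1, bandwidth)
bound_16bit = decode_tokens_per_second(params, 2, bandwidth)

print(f"8-bit weights:  <= {bound_8bit:.0f} tokens/s per stream")
print(f"16-bit weights: <= {bound_16bit:.0f} tokens/s per stream")
```

Under this bound, doubling bandwidth or halving bytes per weight doubles the ceiling, while adding raw compute does not move it.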

Anthropic took the opposite bet. Rather than assume tape-out risk, it secured approximately 3.5 gigawatts of next-generation TPU capacity through an expanded Google and Broadcom partnership expected online starting 2027. A complex $35-45 billion financing structure has Google backstopping lease payments across five U.S. data centers. Anthropic can match workloads to the silicon best suited for them without nine-month hardware cycles, keeping capital focused on model scaling and commercial distribution. Broadcom sits at the center of both strategies -- co-developing Google's TPU v7, v8ax, and v9 roadmap, Meta's MTIA, and OpenAI's Jalapeno -- making it the quiet kingmaker of custom AI silicon.

The financial divergence is even starker. Anthropic moved from about $1 billion ARR at the end of 2024 to $47 billion by May 2026 -- a 47x multiple in under 18 months -- with 80-85% of revenue coming from high-margin enterprise and developer API usage. Its flagship Claude Code terminal-native agent reached $1 billion ARR within six months of launch, $2.5 billion by February 2026, and about $8 billion by May 2026, by which point it accounted for an estimated 4% of all public GitHub commits globally. Anthropic's training costs are projected to peak at roughly $30 billion in 2028; OpenAI's compute spend is projected at $121 billion in the same year. Anthropic is already on the verge of its first profitable quarter. OpenAI remains deeply unprofitable on broken unit economics that Jalapeno is intended to repair.
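The headline ratios follow directly from the figures quoted in the text. This sketch only restates that arithmetic; the variable names are illustrative, all amounts are in billions of dollars, and the compute comparison uses the 2028 projections because those are the only absolute spend figures given.

```python
# Figures as quoted in the text, in billions of US dollars.
anthropic_arr = 47.0
openai_arr = 24.0
anthropic_arr_end_2024 = 1.0
claude_code_arr = 8.0
anthropic_training_2028 = 30.0
openai_compute_2028 = 121.0

revenue_ratio = anthropic_arr / openai_arr
growth_multiple = anthropic_arr / anthropic_arr_end_2024
claude_code_share = claude_code_arr / anthropic_arr
compute_ratio_2028 = anthropic_training_2028 / openai_compute_2028

print(f"Anthropic / OpenAI ARR:         {revenue_ratio:.2f}x")
print(f"Anthropic ARR growth multiple:  {growth_multiple:.0f}x")
print(f"Claude Code share of ARR:       {claude_code_share:.0%}")
print(f"2028 compute spend ratio:       {compute_ratio_2028:.2f}")
```

The ratios come out just under 2x on revenue and just under one-quarter on projected compute spend, matching the "nearly double" and "one-quarter" framing.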

Interactive Thesis

100x Co-Design Console

01 Silicon: tile geometry / memory hierarchy

02 Kernels: routing / attention / serving

03 Model: experts / tensors / topology

8x to 100x

Co-design: 100x Pareto frontier shift

Co-design

The 100x Hardware-Software Co-Design Multiplier

Improvements made in isolation to silicon, kernels, or architecture compound to at best 8x at the system level (2x per layer). When the three are explicitly co-designed, each layer amplifies the others and the combined gain climbs toward 100x. DeepSeek sized expert dimensions to Hopper tile geometry and memory hierarchy. Gemini exhibits the inverse lock-in on TPU interconnects.

Revenue

Enterprise Revenue Divergence

Anthropic reached $47B ARR by May 2026 while OpenAI sat at $24B. Claude Code alone hit roughly $8B ARR in 12 months and drove an estimated 4% of global public GitHub commits.

Silicon

Vertical Integration Paths

OpenAI: a nine-month Jalapeno inference ASIC with Broadcom and Celestica, projected at 50% lower inference cost but inference-only. Anthropic: 3.5 GW of TPU capacity through Google and Broadcom, with no internal tape-out risk and more training-plus-inference flexibility.

Governance

Governance as Infrastructure

OpenAI proposed a 1-5% equity donation to seed a U.S. Public Wealth Fund. Anthropic built the Long-Term Benefit Trust, whose Class T shares eventually control a board majority that public shareholders cannot override.

Both companies filed confidential S-1 registration statements in June 2026. Anthropic, valued at $965 billion after its $65 billion Series H, is positioned for a late-2026 listing at roughly 20x its $47 billion run-rate. OpenAI, valued at $852 billion, is reportedly considering a 2027 debut amid volatile post-SpaceX-IPO market conditions. SpaceX itself executed an $86 billion IPO and then, four days later, used its public stock to acquire Anysphere, the maker of Cursor, for $60 billion -- instantly gaining deep Fortune 500 developer distribution and a continuous stream of real coding interactions to train future xAI models.

06 Governance as Competitive Infrastructure

As these systems approach AGI, they become national-security assets. OpenAI restructured into a Public Benefit Corporation and proposed donating 1-5% of its equity, as much as $42.6 billion at its $852 billion valuation, directly to the U.S. government to seed a Public Wealth Fund. The fund would create citizen dividends while simultaneously securing political insurance, favorable energy policy, and insulation from antitrust or export restrictions. Anthropic refused any government equity arrangement. Instead it created the Long-Term Benefit Trust -- a Delaware purpose trust holding special Class T shares that, upon hitting fundraising milestones, will elect a growing share of the board, ultimately achieving majority control. The Trust's five financially disinterested trustees, with national security, public policy, and AI safety backgrounds, operate on one-year terms with strict consultation requirements. Public shareholders will hold common stock with no ability to override the Trust's board appointments. This structure aligns with Anthropic's Responsible Scaling Policy but creates a unique risk profile for traditional public-market investors who expect capital to purchase control.

The intelligence explosion will shift the binding constraint from algorithmic discovery to power and gigawatt infrastructure. The organization that most effectively orchestrates co-designed silicon, automated kernel generation, agentic research loops, and politically durable governance will set the pace. Anthropic has demonstrated that disciplined enterprise focus and hardware-software co-design can produce spectacular revenue velocity with cleaner unit economics. OpenAI has bet that vertical integration, captive silicon, and sovereign wealth alignment can overcome higher burn rates and deliver geopolitical dominance. The public markets will now price two very different capital structures and two very different theories of how to win the intelligence age. The next eighteen months will reveal which thesis compounds faster.

Source Spine

Dylan Patel / SemiAnalysis — 100x co-design thesis

Anthropic / Google / Broadcom TPU partnership

OpenAI / Broadcom Jalapeno inference ASIC

FutureSearch / Luminix AI financial modeling

SaaStr — Anthropic vs OpenAI revenue divergence

Observer / AP News / AI Weekly — SpaceX and Cursor

CONTEXT JAMMING

How this site is made.

Antigravity

Claude Opus 4.8

Codex