brianletort.ai

The AI Stack Weekly

Issue 03 · Week 19 of 2026.

Industry brief · ~7 min read · Public sources only

The Bottom Line

Capability has stopped being about parameters. The constraint stack has flipped to power, fiber, and packaging.


W19 is the week the AI compute story stopped being told in parameter counts. No new closed-frontier text flagship shipped — GPT-5.5 and Opus 4.7 still anchor the AA Intelligence Index — yet the most consequential capability move was a 300 MW SpaceX deal that let Anthropic double Claude Code rate limits inside a week. OpenAI shipped a voice trio (GPT-Realtime-2 + Translate + Whisper) with GPT-5-class reasoning baked in, jumping 15 points on Big Bench Audio; xAI moved Grok 4.3 to GA with 1M context at $1.25/$2.50 per Mtok; Google promoted Gemini 3.1 Flash-Lite to GA at $0.25/$1.50.

Meanwhile NVIDIA wrote two equity checks that buy zero transistors: $500M in Corning warrants for a 10x expansion of US optical capacity, and up to $2.1B in IREN warrants alongside a $3.4B 5-year contract underwriting 5 GW of NVIDIA-aligned AI infrastructure. On the same Monday, OpenAI / AMD / Broadcom / Intel / Microsoft / NVIDIA jointly published Multipath Reliable Connection (MRC) as an OCP-released open RDMA transport — already in production at the Stargate Abilene site, Microsoft Fairwater, and OCI — collapsing 100,000-GPU clusters from three or four switching tiers down to two.

The procurement read: capability is being bought with photonics and power purchase agreements, not parameter counts. Boards should reset AI-vendor diligence to ask 'who's your power partner and which fabric standard do you ship on,' not 'what's your AA Index score.' Investors should treat NVIDIA's equity stakes in CoreWeave and now IREN as a circular-financing pattern (a supplier financing the customers who buy its product) that auditors will start flagging. Architects should add MRC compliance and AMD-as-substrate (ZAYA1's open frontier-class MoE trained end-to-end on MI300X this week) to procurement bake-offs that previously assumed NVIDIA + closed-fabric defaults. Operators in PJM territory should plan against 5-7 year time-to-power for new Dominion-territory applications and pre-secure interconnect now.


The three lenses

What moved this week, and what to do about it.

14 events across the flywheel — 5 software, 4 hardware, 5 networking.

Software.

  • OpenAI ships GPT-Realtime-2 + GPT-Realtime-Translate + GPT-Realtime-Whisper, exiting Realtime API beta — first voice model with GPT-5-class reasoning, 128K context, adjustable effort, priced at $32/$64 per Mtok audio (Translate $0.034/min, Whisper $0.017/min); Zillow, Priceline, Deutsche Telekom named as early enterprises

    openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api, ChannelNewsAsia, 9to5Mac, TheNextWeb, MarkTechPost, Latent.Space

  • xAI moves Grok 4.3 to GA on the public API with 1M-token context, native video input, and aggressive pricing of $1.25 / $2.50 per Mtok (~40-60% below Grok 4.20); eight legacy Grok models scheduled for retirement on May 15

    docs.x.ai/developers/models, VentureBeat, Artificial Analysis, help.apiyi.com

  • Anthropic doubles Claude Code 5-hour rate limits across Pro/Max/Team/Enterprise, removes peak-hours throttling for Pro/Max, raises Opus API limits — coordinated with the same-morning announcement of an Anthropic ↔ SpaceX deal: 300+ MW at Colossus 1 (Memphis), ~220K NVIDIA GPUs within the month; Claude Managed Agents ships Multiagent Orchestration (Netflix deployer), Outcomes (+8.4% docx / +10.1% pptx task success), and Dreaming (research preview) at the Code w/ Claude SF event

    anthropic.com/news/higher-limits-spacex, claude.com/blog/new-in-claude-managed-agents, Bloomberg, CNBC, The Verge, Ars Technica, simonwillison.net

  • US Center for AI Standards and Innovation (CAISI, Department of Commerce) signs pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI — completing perimeter coverage of the top-five US labs (OpenAI and Anthropic agreements were already renegotiated); labs will provide early access, including model versions 'with lowered safety protections,' for national-security testing

    BBC News, CNBC, The Hill, neura.market

  • Google promotes Gemini 3.1 Flash-Lite to GA on Gemini Enterprise at $0.25 input / $1.50 output per Mtok with adjustable thinking levels (minimal/low/medium/high) and full multimodal (text, code, image, audio, video, PDF); Gladly, JetBrains, and Astrocade named as enterprise users

    cloud.google.com/blog/products/ai-machine-learning/gemini-3-1-flash-lite-is-now-generally-available, blog.google, deepmind.google/models/gemini/flash-lite, simonwillison.net

What this means

W19 is the week frontier text went into maintenance and capability moved into voice, throughput, and price-band compression. OpenAI's voice trio + Inworld TTS-2 turn 'voice model' into a frontier tier with its own benchmarks (Big Bench Audio, Audio MultiChallenge, Speech Arena Elo) — boards and architects should treat voice/IVR replacement as a 90-day procurement decision, not a 2027 plan. Anthropic's response to a compute deal — doubling paid-seat throughput rather than cutting list price — is the inverse of every other vendor's quarter and signals that the binding constraint on Claude was always GPU supply, not unit economics; operators locked into Claude Code can plan against materially higher concurrent throughput before mid-summer. CAISI completing the top-five-lab pre-deployment perimeter means every US frontier model now ships through a federal capability/risk gate before public release; pair with The Model Pulse for the seven new tree rows this week (gpt-5-5-instant, the Realtime trio, inworld-realtime-tts-2, zaya1-8b, emo-1b14b).
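The price-band compression above can be made concrete with a back-of-envelope cost model. A minimal Python sketch using the per-Mtok rates quoted in this issue (Grok 4.3 at $1.25/$2.50, Gemini 3.1 Flash-Lite at $0.25/$1.50); the request sizes are illustrative assumptions, not vendor workload data:

```python
# Per-request cost from published per-million-token rates (from this issue).
# Request sizes below are illustrative assumptions, not vendor guidance.

PRICES = {  # model -> (input $/Mtok, output $/Mtok)
    "grok-4.3": (1.25, 2.50),
    "gemini-3.1-flash-lite": (0.25, 1.50),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the published per-Mtok rates."""
    in_rate, out_rate = PRICES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Example: a 10K-token prompt with a 1K-token completion.
for model in PRICES:
    print(model, round(request_cost(model, 10_000, 1_000), 4))
```

At these rates a 10K-in / 1K-out request runs about $0.015 on Grok 4.3 and $0.004 on Flash-Lite, the kind of delta that turns tier selection into a routing decision rather than a platform decision.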

Hardware.

  • AMD Q1 2026 8-K: data-center revenue $5.8B (+57% YoY, +7% QoQ), Q2 guide $11.2B (±$300M); reaffirms Helios + Meta 6 GW Instinct deployment shipping H2 2026; Instinct revenue declined sequentially on China business shift

    AMD 8-K (SEC EDGAR), AMD IR press release, CNBC, Reuters earnings transcript

  • SpaceX files for 'Terafab' Texas logic fab in Grimes County — Phase 1 ~$55B, full buildout up to ~$119B, jointly built and operated with Tesla and xAI, with Intel attached for design / fab / packaging on Intel 14A; tax-abatement filing public, hearing scheduled June 3

    Grimes County public filing, CNBC, Tom's Hardware, TechCrunch, The Verge, The Register

  • NVIDIA + IREN strategic partnership: 5-year warrant for up to 30M IREN shares @ $70 (~$2.1B) plus separate $3.4B 5-year AI Cloud contract; up to 5 GW of NVIDIA-aligned AI infrastructure on IREN's pipeline with the 2 GW Sweetwater (TX) campus as flagship — Phase 1 deployment beginning 2027

    IREN press release + Q3 FY26 investor update, GlobeNewswire, Capacity Media, TheNextWeb

  • TSMC April 2026 monthly revenue 6-K: NT$410.73B (~US$13.05B), +17.5% YoY, -1.1% MoM (first MoM dip of 2026); YTD revenue +29.9% YoY through April against a CoWoS line still booked through 2027 — consistent with packaging being the binding constraint, not wafer fab

    TSMC 6-K (SEC), TSMC IR, StockTitan, ETManufacturing, MarketScreener

What this means

The binding constraint has migrated off the die. Two of NVIDIA's biggest W19 checks — Corning warrants (see Networking lens) and IREN warrants — buy no transistors; they buy optical-fiber capacity and 5 GW of deployment runway. AMD's $5.8B data-center quarter (+57% YoY) is the first merchant-GPU read of the new cycle and confirms a credible second supply path with Meta's 6 GW Helios commitment shipping in H2; SpaceX's $55-119B Terafab filing brings a fourth US foundry option onto the radar in a market that had three. TSMC's first MoM dip against +29.9% YoY YTD is consistent with packaging — not wafers — being the rate-limiter on Huang's doubling cadence. For boards, the procurement framework reset is: power, packaging, and photonics now price like the substrate; for investors, watch NVIDIA's equity stakes in customers (CoreWeave, IREN) as a circular-flow risk auditors will start framing. Architects sizing 2027 capacity should pre-bid AMD-substrate options against NVIDIA defaults — ZAYA1 this week is the first arXiv-grade frontier-class MoE trained end-to-end on AMD MI300X.

Networking.

  • OpenAI / AMD / Broadcom / Intel / Microsoft / NVIDIA jointly publish Multipath Reliable Connection (MRC) as an OCP-released open RDMA transport — running in production at the Stargate Abilene site, Microsoft Fairwater, and Oracle Cloud Infrastructure; collapses 100,000+ GPU clusters from three or four switching tiers down to two

    OpenAI press post, AMD blog, NVIDIA blog, Microsoft Azure HPC blog, Broadcom blog, Oracle Cloud Infrastructure blog, OCP MRC v1.0 spec, arXiv 2605.04333 (joint OpenAI / Microsoft / NVIDIA paper)

  • NVIDIA + Corning announce a multi-year supply and technology partnership with up to ~$3.2B in equity exposure ($500M pre-funded share purchase + warrants on 18M shares @ $180), funding a 10x expansion of US optical-connectivity manufacturing, +50% US fiber capacity, three new North Carolina and Texas plants, and 3,000 jobs

    Corning 8-K + Ex 99.1 (SEC EDGAR), NVIDIA Newsroom, CNBC, Bloomberg, Tom's Hardware, RCR Wireless

  • Astera Labs ships the Scorpio X-Series 320-lane Smart Fabric Switch — the industry's largest open memory-semantic scale-up switch, with hardware-accelerated Hypercast and In-Network Compute claiming up to 2x collective-operation speedup across CXL, Ethernet, NVLink Fusion, PCIe, and UALink

    Astera Labs IR release, GlobeNewswire

  • GlobalFoundries unveils SCALE — the first OCI MSA-capable co-packaged-optics platform — with 8λ and 16λ bidirectional DWDM demonstrated on its silicon-photonics PDK, plus 50G and 100G micro-ring modulators and TSV-ready 2.5D / 3D stacking down to sub-45μm copper pitch

    GlobalFoundries press release, Microwave Journal, New Electronics

  • EllaLink lights a new 670 km submarine-cable extension into Nouadhibou, Mauritania — the country's second direct path to European cloud and AI services, EU-co-funded via the Connecting Europe Facility, two fiber pairs scalable to multi-terabit, with a neutral landing station

    Datacenter Dynamics, Connecting Africa, TechAfrica News, Total Telecom

What this means

This is the most consequential AI-fabric week of the year. MRC as an OCP-blessed multi-vendor RDMA transport (already running at Stargate Abilene, Microsoft Fairwater, OCI) commoditizes the inside-the-cluster fabric at the protocol layer the same week NVIDIA pre-allocates the underlying optical supply chain ($3.2B equity exposure to Corning, 10x US optical capacity). Durable value migrates from cage fabric to cross-cage and cross-DC interconnect — the operator category that controls dense metro fiber and meet-me ecosystems wins the next two upgrade cycles. For investors and boards, this is Metcalfe at the standards layer: every additional MRC-conformant NIC, switch, and OCI MSA optical engine multiplies the value of every other connected node, while closed alternatives lose option value by the week. Architects should re-test single-vendor scale-up roadmaps against an open MRC + UEC + ESUN + OCI MSA blueprint stack bookable today; operators should treat the Mauritania landing as a template for how sovereign-AI capacity now follows submarine fiber rather than the other way around.

Capital flow

Money in, revenue out.

4 categories tracked. Capital deployment up in 1 of 4; revenue follows at multiples of 0.18 to 0.5.

The four-category scorecard. Where capital is going in, where revenue is coming out, and how much of it is real. The one chart for the boardroom.

  • Frontier Labs

    OpenAI, Anthropic, Google DeepMind, xAI

    Capital In

    ~$52B

    vs ~$52B

    Revenue Out

    ~$45B

    vs ~$35B

    Burn / Rev

    0.5

    Movement

    Anthropic round still pending close at $850-900B target with $30B+ disclosed run rate (some sources cite ~$45B internal); a May 8 FT report floats potential repricing toward $1T but no primary close. Anthropic's 300 MW SpaceX compute deal (May 6) was the W19 capability event, not the raise.

  • Hyperscaler-Hosted

    Microsoft Azure, Google Cloud, AWS, Meta

    Capital In

    ~$58B

    vs ~$58B

    Revenue Out

    ~$13B

    vs ~$13B

    Burn / Rev

    0.22

    Movement

    Mid-cycle quiet between Q1 prints (W18) and NVIDIA Q1 FY27 (May 20). Forbes coverage May 6 extends the W18 OpenAI-on-Bedrock procurement story into a federal/enterprise narrative; Oracle moves OCI Compute with NVIDIA RTX PRO 6000 Blackwell to GA at $4.50/GPU-hr (May 7). No fresh capex print. Carry W18 priors flat.

  • Neoclouds

    CoreWeave, Crusoe, Nebius, Applied Digital, IREN, Lambda

    Capital In

    ~$28B

    vs ~$18B

    Revenue Out

    ~$3.0B

    vs ~$2.0B

    Burn / Rev

    0.18

    Movement

    The loudest W19 across categories. CoreWeave Q1 print (May 7) discloses $2.08B revenue (+112% YoY) and a record $99.4B revenue backlog; IREN-NVIDIA $3.4B 5-year contract + $2.1B warrant + 5 GW partnership (May 7); Lambda $1B credit facility (4x prior); plus Cipher Mining + Digi Power X + DeepInfra + RadixArk smaller tickets. Capital-in stepped up materially via debt + customer-contract value; revenue out is approaching 1.5x the W18 prior driven by CoreWeave alone.

  • On-Prem / Hybrid

    Enterprise GPU clusters, sovereign and national programs

    Capital In

    ~$42B

    vs ~$42B

    Revenue Out

    Indirect

    vs Indirect

    Burn / Rev

    n/a

    Movement

    Two new UAE platform-tier sovereign initiatives (Core42 + Solutions+, e& 'Agents Factory') in window; no disclosed dollar values. TSMC monthly print (May 8) confirms packaging is binding — squeeze on every on-prem server OEM that ships discrete AI accelerator boards or HBM-heavy SKUs. Carry W18 priors flat.

Burn-to-Revenue is revenue out divided by committed capital in. Lower means more capital is being deployed per dollar of revenue generated.
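Applied to the scorecard's own numbers, the definition is a one-liner. A minimal sketch using the Hyperscaler-Hosted row (~$13B revenue out against ~$58B capital in); note the published ratios for other rows may fold in adjustments not shown here:

```python
def burn_to_revenue(revenue_out_b: float, capital_in_b: float) -> float:
    """Revenue out divided by committed capital in (both in $B)."""
    return revenue_out_b / capital_in_b

# Hyperscaler-Hosted row: ~$13B revenue out vs ~$58B capital in.
ratio = burn_to_revenue(13, 58)
print(round(ratio, 2))  # 0.22, matching the scorecard's 0.22
```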

Signal vs noise

What’s real, what’s noise.

5 claims this week — 3 signal, 2 noise.

Each claim is scored 1–5 on source quality and triangulation. Anything 2 or below is flagged as noise. Where consensus is wrong, we say so.

  • 5 / 5

    OpenAI / AMD / Broadcom / Intel / Microsoft / NVIDIA jointly publish Multipath Reliable Connection (MRC) as an OCP-released open RDMA transport — running in production at the Stargate Abilene site, Microsoft Fairwater, and Oracle Cloud Infrastructure; collapses 100,000+ GPU clusters from three or four switching tiers down to two.

    Sources: OpenAI press post, AMD blog, NVIDIA blog, Microsoft Azure HPC blog, Broadcom blog, Oracle Cloud Infrastructure blog, OCP MRC v1.0 spec PDF, arXiv 2605.04333 (joint OpenAI / Microsoft / NVIDIA paper)

    Real and structural. Six independent primary vendor blogs plus an OCP-published spec plus a peer-reviewed paper plus production deployment make this the most-confirmed networking event of 2026 to date. The procurement implication is durable: closed alternatives lose option value by the week. Read changes only if a major hyperscaler explicitly defects from MRC for its next-gen scale-out fabric — none have signaled intent.

  • 5 / 5

    CoreWeave Q1 2026 revenue backlog reaches $99.4B as of March 31, 2026 — up from $66.8B at YE 2025 (+49%); revenue $2.08B (+112% YoY); $31-35B 2026 capex guide tied entirely to signed customer contracts.

    Sources: CoreWeave Q1 2026 press release (May 7); Morningstar mirror; SEC EDGAR 10-Q filing window May 11-21

    Real and operationally consequential. The disclosure validates the anchor-tenant contract structure that underpins neocloud capex commitments. Read changes if the May 11-21 10-Q reveals contract-amendment language or if Q2 print shows backlog churn or recognition restatement on the 36% expected to convert in 24 months.

  • 5 / 5

    NVIDIA + Corning announce a multi-year supply and technology partnership with up to ~$3.2B in equity exposure ($500M pre-funded share purchase + warrants on 18M shares @ $180), funding a 10x expansion of US optical-connectivity manufacturing, +50% US fiber capacity, and three new NC / TX plants.

    Sources: Corning 8-K + Ex 99.1 (SEC EDGAR), NVIDIA Newsroom, CNBC, Bloomberg, Tom's Hardware, RCR Wireless

    Real. Corning shares closed +12% and NVIDIA +6% on the announcement. The structural read: NVIDIA is now an equity holder in two of its largest neocloud customers (CoreWeave Class A; IREN warrant) AND its largest US optics supplier — auditors and short-side analysts will start framing this as a circular-financing pattern. Read changes if NVIDIA discloses warrant terms and exercise pricing logic on May 20 (Q1 FY27 print) and if similar arrangements with three+ counterparties surface.

  • 2 / 5 — noise

    Anthropic's $50B funding round will reprice toward a $1 trillion valuation as investor demand intensifies (FT report May 8).

    Sources: Financial Times via Tech Startups summary (May 8) — single secondary source layered on the underlying $850-900B reporting

    Likely modest hype layered on a real underlying deal. The $1T number is one FT report being amplified; multi-source reporting still has the round at $850-900B and the deal has been 'within two weeks' of closing for two weeks already. Watch the May 12-22 window for primary close confirmation. Read changes only on Anthropic primary disclosure: $1T = signal, $900B = the FT marker was wishful pricing.

  • 1 / 5 — noise

    Panthalassa's $140M Series B (Thiel-led, May 4) for ocean-powered AI compute validates wave-energy AI inference at sea as a viable architecture.

    Sources: BusinessWire press release May 4, 2026; no customer disclosure, no unit-economics disclosure, commercial deployment targeted 2027

    Mostly narrative at this stage. The $140M is real and the investor list is real, but the technical premise — densely-packed AI accelerators surviving in marine atmosphere with wave-energy power — has well-known thermal, corrosion, vibration, latency, and serviceability problems. No anchor customer, no production node uptime data, no cost/MWh disclosure. Read changes only on a deployed pilot node with measurable inference uptime and per-token economics.

Early warning panel

The levers we monitor.

10 metrics tracked — 5 rising, 0 falling, 5 steady.

Current vs prior period. Each metric has a threshold where the read materially changes — this panel flags the inflection before it lands in headlines.

  • Frontier lab cash position (avg months runway, top 3)

    ~30 mo vs ~30 mo

    Threshold: <18 mo triggers re-rating risk

    What this measures

    Top 3 frontier labs (OpenAI, Anthropic, Google DeepMind) by disclosed runway, computed from cash on hand divided by trailing-12-month operating burn. W19 carries the W18 figure flat: Anthropic's pending $50B round (still not closed by Friday May 8) would extend runway 12-18 months at $850-900B if it prices at the floor, with FT floating $1T as the marginal investor's discipline test. The May 6 Anthropic ↔ SpaceX 300 MW Colossus 1 compute deal is a capacity-extension event that lets the lab use existing balance sheet without waiting for the round. <18 months triggers re-rating risk; <12 months forces consolidation, acquisition, or revenue reset.

  • Hyperscaler capex / AI revenue ratio (top 4 weighted)

    ~5.0-5.2 vs ~5.0-5.2

    Threshold: >6.0 invites investor pushback at next earnings

    What this measures

    Top 4 hyperscalers (MSFT, GOOG, META, AMZN) weighted aggregate of total capex divided by AI-attributable revenue. No fresh capex print in W19 — the next pulse points are Oracle's June print and Q2 FY27 / Q2 CY26 cycle in late July through mid-August. Custom-silicon ramp continues: AWS Trainium + Inferentia crossed $20B annualized (Q1 print); Meta MTIA Gen 2 in broad production; Google TPU v8 separated into training and inference SKUs (both GA later 2026); Microsoft Maia 200 deployed at scale. Ratio crosses the >6.0 threshold inside a quarter if AI revenue growth flinches.

  • CoreWeave revenue backlog

    $99.4B vs ~$87B+ (est)

    Threshold: Conversion velocity matters more than gross figure

    What this measures

    Booked but unrecognized revenue. May 7 audited disclosure: $99.4B as of March 31, 2026 — up from $66.8B at YE 2025 (+49%) and $87B+ in W18 estimates. Backlog growth driven by multiple new Meta agreements (incl. $21B March commitment), multi-year Anthropic agreement, and expanded Cohere / Jane Street / Mistral relationships. Contracted power up >400 MW QoQ to 3.5 GW; ~36% of backlog expected to recognize within 24 months. CoreWeave's $31-35B 2026 capex guide is tied entirely to signed customer contracts per management — the metric to watch now is conversion velocity, not gross figure.

  • NVIDIA Q-over-Q data center revenue

    $62.3B (Q4 FY26) vs $62.3B (Q4 FY26)

    Threshold: Q1 FY27 print May 20 — Goldman Sachs forecasts ~$70B+ DC implied

    What this measures

    Data-center segment revenue, sequential Q-over-Q. Q1 FY27 print scheduled for May 20, 2026 (W21). W19 movement: Goldman Sachs preview note (May 6) raised the bar — total revenue forecast $80.05B (~$2B above Street consensus $78.30B), implying ~$70B+ DC. Cross-read this with NVIDIA's circular-flow risk: equity stakes in CoreWeave ($2B Class A closed Q1) and IREN ($2.1B warrant right, May 7) plus the ~$3.2B Corning supply-equity deal (May 5) make customer-and-supplier-financing-by-the-supplier a topic auditors will start framing on the call.

  • Open vs closed gap on SWE-Bench Pro (coding)

    Closed +6 to +20pp vs Closed +6 to +19pp

    Threshold: Sustained open lead reshapes enterprise procurement

    What this measures

    Coding benchmark differential between top open-weight and top closed model. May 7 leaderboard snapshot: Claude Mythos Preview (gated) 77.8 / Claude Opus 4.7 (Adaptive) 64.3 vs top open Qwen 3.6 Max preview 57.3 / DeepSeek V4 Pro Max 55.4. Top-of-board gap (Mythos vs Qwen) is now ~20pp — at the high end of the W18 +6 to +19pp band. Production-tier gap (Opus 4.7 vs Qwen 3.6 Max) is +7pp — narrower. Bifurcation suggests open-weight is competitive at GA tier but losing ground to frontier preview / research-tier closed models.

  • Sovereign AI commitments (count / aggregate $)

    10 / ~$80B+ vs 8 / ~$80B+

    What this measures

    Two new UAE platform-tier sovereign AI initiatives in W19: Core42 + Solutions+ partnership for MIC Group and Abu Dhabi government entities (May 5), and e&'s 'Agents Factory' sovereign agent platform with the UAE Cybersecurity Council and Open Innovation AI (May 4). Neither came with a disclosed dollar commitment. Aggregate dollar count flat; program count nudged from 8 to 10 — sovereign-AI gravity continues to concentrate in the UAE. Underlying set: UAE Stargate, Core42+, e& Agents Factory, Germany National DC Strategy, France IA, Mistral-Sweden, GMI Japan, IndiaAI, UK AI Growth Zones, Saudi-PIF / HUMAIN.

  • PJM 2026/27 capacity auction price ($/MW-day)

    $329.17 vs $329.17

    Threshold: 11x in 24 months — power is the new binding constraint

    What this measures

    The 2026/27 BRA cleared July 2025 at $329.17/MW-day, unchanged. The 2027/28 BRA cleared Dec 17, 2025 at $333.44/MW-day at the FERC-approved cap; without the cap it would have cleared at $529.80 against the first RTO-wide reliability shortfall in PJM history (procured 6,623 MW below the 1-in-10 reserve standard). Data-center peak-load uplift of ~5,100 MW is the structural driver. No FERC ruling or auction in W19 specifically — carry-forward methodology.

  • Time-to-power, busiest US markets (months)

    60-84 (new PJM); 36-48 (existing PJM queue) vs 36-48 (PJM large >100 MW); ~53 nat'l avg

    What this measures

    Months from new-load interconnection request to energization. The PJM picture is bifurcating in W19. Existing pipeline projects in queue retain the 36-48 month timeline; new large-load applications in Dominion territory now face 5-7 year (60-84 month) windows per gpuleaseindex.com (May 2026) and Virginia Mercury — Dominion's full queue cycle reportedly running up to 15 years. PJM has 25,000 MW of data center projects slated for grid connection plus 75,000 MW in pipeline without energized dates. Architects siting greenfield AI capacity should pre-secure interconnect and assume the 'find power, build there' forcing function applies for any new application this cycle.

  • Cost-per-task, frontier reasoning model

    ~$0.10-$0.15 (effective, with hidden reasoning tokens) vs ~$0.05

    Methodology correction: published rates understate effective cost

    What this measures

    Median cost across the frontier-tier reasoning models for a benchmark complex task. W19 freshening — published per-token rates (o3 $2/$8, o3-pro $20/$80, Sonnet 4 thinking $3/$15, Opus 4 thinking $5/$25) understate true cost because hidden 'thinking' tokens are billed at output rate. Per Awesome Agents (May 2026), hidden reasoning tokens inflate effective cost 5-30x depending on task complexity, putting o3 effective cost on hard math tasks around $0.12+ per query (15K hidden reasoning tokens). The W18 prior of ~$0.05 reflects standard non-thinking inference; frontier reasoning 'true' cost is materially higher. This is a measurement-discipline correction more than a real price move — but it changes agent unit economics. Falling cost still expands the addressable workload set rather than contracting demand (the Jevons signature in AI inference).
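The hidden-token arithmetic is worth making explicit. A minimal sketch using the o3 rates quoted above ($2/$8 per Mtok) and the ~15K hidden-reasoning-token figure; the prompt and visible-output sizes are illustrative assumptions:

```python
# Effective per-query cost once hidden 'thinking' tokens are billed at the
# output rate. Rates are the published figures cited above; prompt and
# visible-output token counts are illustrative assumptions.

def effective_query_cost(input_tokens: int, visible_output_tokens: int,
                         hidden_reasoning_tokens: int,
                         input_rate_mtok: float, output_rate_mtok: float) -> float:
    """Dollar cost of one query with hidden reasoning billed as output."""
    billed_output = visible_output_tokens + hidden_reasoning_tokens
    return (input_tokens / 1e6 * input_rate_mtok
            + billed_output / 1e6 * output_rate_mtok)

# o3 at $2/$8 per Mtok with ~15K hidden reasoning tokens on a hard math task:
cost = effective_query_cost(input_tokens=1_000, visible_output_tokens=500,
                            hidden_reasoning_tokens=15_000,
                            input_rate_mtok=2.0, output_rate_mtok=8.0)
print(round(cost, 3))  # ~0.126, consistent with the ~$0.12+ figure above
```

At 15K hidden tokens the visible output is a rounding error: effective cost is dominated by the reasoning budget, which is why published per-token rates understate agent unit economics.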

  • Custom silicon share of incremental AI compute

    ~33-36% (estimated) vs ~31%

    Threshold: >35% materially compresses merchant GPU pricing

    What this measures

    Approximate share of newly deployed AI compute capacity using custom silicon (TPU, Trainium, MTIA, Maia) versus merchant silicon (NVIDIA, AMD). Multiple fresh data points in or adjacent to window: AWS Trainium + Inferentia $20B+ annualized (Q1 2026) growing triple-digits; Meta MTIA Gen 2 in broad production for recommendation inference; Microsoft Maia 200 deployed at scale; Google split TPU v8 into training and inference SKUs (both GA late 2026). AMD Q1 2026 (May 5) data-center revenue $5.8B (+57% YoY) with Helios + Meta 6 GW shipping H2 2026 means a credible second merchant supply path is reasserting itself on the training side. Custom AI accelerator market growing at ~44.6% CAGR through 2033 vs 16.1% for general-purpose GPUs (Introl analysis). Past 35% the compression is material and re-rates merchant gross margins.

Predictions

What we expect next.

5 predictions for the next 30-90 days, confidence 55%-75%.

Each prediction is falsifiable, time-bounded, and tied to a specific signal we will watch. Future issues score these hit, miss, partial, or pending and build a public track record.

Prediction 01

65%

confidence

Software

At least one Fortune 500 enterprise discloses replacement of an existing voice / IVR platform with a frontier voice model (GPT-Realtime-2, Inworld TTS-2, or competitor) on a contract greater than $25M annualized.

Deadline: By July 31, 2026

Trigger: Enterprise customer-disclosure announcements; Q2 vendor commentary from OpenAI / Anthropic / Inworld; named Zillow / Priceline / Deutsche Telekom contract values.

Prediction 02

70%

confidence

Networking

At least one major hyperscaler other than Microsoft publicly discloses a production AI fabric running MRC at greater than 50,000-GPU scale.

Deadline: By August 31, 2026

Trigger: Google I/O 2026 (May 19-20) fabric session; AWS re:Invent prep coverage; OCP Future Technologies Symposium filings; vendor blog disclosures.

Prediction 03

75%

confidence

Capital

Anthropic's pending raise closes at a final post-money valuation between $850B and $1T, with the lead investor and check size publicly disclosed.

Deadline: By June 15, 2026

Trigger: Anthropic primary disclosure; SEC Form D if applicable; secondary market activity; lead investor naming.

Prediction 04

55%

confidence

Hardware

A second public open-weights frontier-class model trained end-to-end on AMD silicon (MI300X or MI400) is announced, confirming AMD-as-training-substrate is structural rather than a one-off proof.

Deadline: By September 30, 2026

Trigger: Vendor announcements; arXiv preprints; AMD customer-list updates at Advancing AI 2026 (July 22-23); Rackspace + AMD MOU production milestones.

Prediction 05

70%

confidence

Capital

At least one major equity research firm or audit-grade publication explicitly frames NVIDIA's 2026 customer-equity stakes (CoreWeave + IREN, plus any subsequent counterparty) as a circular-financing risk in a written report or audit qualifier.

Deadline: By July 31, 2026

Trigger: NVIDIA Q1 FY27 earnings call (May 20); Q2 audit cycle commentary; sell-side and short-seller report releases.

Track record

Scoring prior predictions.

5 prior predictions: 0 hit, 0 miss, 0 partial, 5 pending. Hit rate —.


Prediction 01

65%

confidence

Software

AWS reports a customer-disclosed Bedrock-resident GPT-5.5 multi-year contract greater than $500M annualized, with the customer named publicly.

Deadline: By July 31, 2026

Trigger: AWS customer-win announcements during the Bedrock GPT-5.5 limited preview; OpenAI partnership commentary in Q2 prints.

Pending: Forbes coverage May 6 extended the Bedrock procurement narrative into the federal/enterprise channel; no specific customer-disclosed >$500M annualized contract surfaced. Watch AWS customer-win announcements during the limited preview before pricing hardens at GA.

Prediction 02

60%

confidence

Capital

The top-4 hyperscaler capex / AI revenue ratio drops below 5.0 for Q2 2026, confirming the W18 directional flip.

Deadline: By August 15, 2026

Trigger: Q2 2026 hyperscaler earnings releases (late July through mid-August).

Pending: Q2 prints not yet in window. Custom-silicon ramp continues to expand the AI revenue base; ratio holding at ~5.0-5.2.

Prediction 03

60%

confidence

Hardware

Samsung HBM4 reaches greater than 25% share of NVIDIA Vera Rubin BOM by Q3 2026 supply data, ending the SK Hynix monopoly on the platform.

Deadline: By September 30, 2026

Trigger: TrendForce / SemiAnalysis Q3 BOM data; NVIDIA Q2 FY27 earnings color on memory supply.

Pending: No fresh share data in W19. Samsung HBM4 mass production for Vera Rubin started Feb 2026 (W18-confirmed); HBM4E sampling Q2. Q3 share data still pending.

Prediction 04

65%

confidence

Networking

At least one major colocation operator other than Equinix reports Q2 2026 interconnect / fabric revenue growth greater than 25% YoY.

Deadline: By August 31, 2026

Trigger: Q2 earnings prints from interconnect operators; Light Reading and DCD coverage.

Pending: Q2 prints not yet in window. Adjacent W19 evidence: Iron Mountain Q1 (Apr 30) data-center revenue +47% YoY; Crown Castle / Zayo $8.5B fiber transaction closed May 1 — both add operator-category proof points but are not Q2 fabric-specific yet.

Prediction 05

60%

confidence

Software

By Q3 2026, at least two frontier labs publicly disclose a separately-priced 'gated cyber' or 'gated security' SKU with revenue commentary, formalizing capability-gating as a product line.

Deadline: By September 30, 2026

Trigger: Q3 earnings disclosures (OpenAI partnership color, Anthropic announcements); vendor security press; AISI / METR cross-vendor evaluations.

Pending: No new gated cyber / security SKU disclosures in W19. The W18 set (OpenAI GPT-5.5-Cyber, Anthropic Claude Security) holds; Mozilla Hacks May 7 publishes a Mythos Preview write-up showing ~20x baseline of Firefox security fixes — capability disclosure, not yet a separately-priced SKU revenue line.

Watchlist

On the radar this week.

5 catalysts to watch, starting May 19-20.

Specific catalysts that would change the read materially. Watching these tells us whether the thesis is strengthening or weakening.

  • May 19-20

    Google I/O 2026

    Likely venue for Gemini 3.2 Flash confirmation (the May 5 leak surfaced in iOS app + AI Studio metadata at $0.25 / $2.00 per Mtok) and any Gemini 3.2 Pro announcement that would force a frontier scorecard refresh. Watch for fabric-side disclosures: Google has been silent through W18-W19 while OpenAI / Microsoft / NVIDIA shipped MRC publicly.

  • May 20

    NVIDIA Q1 FY27 earnings

    First disclosure on Vera Rubin ramp velocity, Samsung HBM4 second-source effect, custom-silicon competitive pressure (AMD Helios + Meta 6 GW, Trainium $225B book), and warrant-stake terms for the Corning, IREN, and CoreWeave equity investments. Goldman preview (May 6) raised the bar to $80.05B total / ~$70B+ DC implied. A beat resets the H2 hyperscaler capex tail; a miss flattens the doubling slope; circular-flow framing gets stress-tested on the call.

  • May 11-21

    CoreWeave 10-Q filing

Audited detail behind the W19 Q1 print: the $99.4B backlog conversion schedule, $31-35B 2026 capex composition, customer concentration disclosure, and NVIDIA's $2B Class A equity terms. The ~36% expected to be recognized within 24 months is the line auditors and short-side analysts will read first. Material differences between management commentary and 10-Q footnotes would re-rate the entire neocloud category.

  • May 12-22

    Anthropic round close

Talks have been 'within two weeks' of close for two weeks. The final price ($850B floor / $900B mid / $1T per the FT) would re-rate the entire frontier-lab valuation curve. A walk-away from the price would be the loudest re-rating signal in a year; a clean close at $1T validates the FT signal as primary.

  • May 19-23

    Microsoft Build 2026 + Computex 2026 prep

    Microsoft Build (May 19-22) is the venue for Azure AI roadmap, Maia 200 numbers, and any production MRC fabric disclosure. Computex 2026 has now slipped to June 2-5 (with Jensen Huang's GTC Taipei keynote June 1) but vendor prep coverage will start in W21 — a bridge week for hardware-and-fabric news cycles.

Companion reads

The rest of the spine.

The AI Stack Weekly is the cross-stack flywheel read. Pair it with the model-and-tree spine and the working framework to get the full picture.

Edits this issue

  • Tree adds: 7 model rows (gpt-5-5-instant, gpt-realtime-2, gpt-realtime-translate, gpt-realtime-whisper, inworld-realtime-tts-2, zaya1-8b, emo-1b14b). See content/llm-tree/changelog.md for full delta.
  • Capital flow categories: Frontier Labs revenue stepped up to ~$45B on Anthropic's run-rate restatement (~$30B disclosed, ~$45B internal). Neoclouds capital stepped up materially to ~$28B on CoreWeave Q1 disclosure + IREN-NVIDIA + Lambda + smaller tickets. Hyperscaler-Hosted and On-Prem / Hybrid carry priors flat.
  • Cost-per-task lever: methodology correction. Published per-token rates understate effective cost by 5-30x because hidden 'thinking' tokens are billed at the output rate. The W19 reading is ~$0.10-$0.15 effective on hard tasks vs the W18 ~$0.05 standard-inference figure. Direction is up, but the underlying price trend on standard inference continues to fall.
  • Time-to-power lever: bifurcation. Existing PJM queue holds 36-48 mo; new Dominion-territory applications now face 5-7 year (60-84 mo) windows. Architects siting new greenfield AI capacity should plan for the higher band.
  • First W19-only weekly with no Q1-print revisions; framework holds.
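
The cost-per-task correction above reduces to one line of arithmetic: effective cost bills hidden thinking tokens at the output rate alongside visible output. A minimal sketch, where the rates and token counts are illustrative assumptions chosen to land in the brief's stated range, not published figures:

```python
def effective_cost_per_task(input_tokens: int,
                            visible_output_tokens: int,
                            thinking_tokens: int,
                            input_rate_per_mtok: float,
                            output_rate_per_mtok: float) -> float:
    """Cost of one task in dollars when hidden 'thinking' tokens
    are billed at the same per-Mtok rate as visible output."""
    mtok = 1_000_000
    return (input_tokens * input_rate_per_mtok
            + (visible_output_tokens + thinking_tokens) * output_rate_per_mtok) / mtok

# Naive read: price a hard task off visible tokens only.
naive = effective_cost_per_task(4_000, 1_000, 0, 1.25, 2.50)

# Effective read: same task, but the model spent 40k hidden reasoning tokens.
effective = effective_cost_per_task(4_000, 1_000, 40_000, 1.25, 2.50)

print(f"naive ${naive:.4f} vs effective ${effective:.4f} "
      f"({effective / naive:.0f}x)")
```

With these assumed numbers the hidden tokens lift the per-task figure from under a cent to roughly $0.11, a ~14x gap, inside the 5-30x understatement band and consistent with the ~$0.10-$0.15 W19 reading on hard tasks.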

About this brief

Compiled from public announcements, SEC filings, earnings transcripts, and official lab and vendor publications. Every quantitative claim is graded 1–5 on source quality. Claims graded 2 or below are flagged as noise. The thesis the brief defends is published separately and updated only when a hypothesis materially changes.

Authorship

Written by Brian Letort. Independent analysis. All sources cited are public. Not investment guidance.

Operate. Publish. Teach.