brianletort.ai
The AI Stack Weekly

Issue 10 · Week 26 of 2026.

Industry brief · ~7 min read · Public sources only

The Bottom Line

The week shifted from model announcements to the physical bottlenecks that decide who can serve agentic demand.

W26 was not a closed-frontier model week. It was an infrastructure-conversion week: the demand signal moved from benchmark tables into memory, inference capacity, optical fabric, and power policy.

Micron's Jun 24 fiscal Q3 print showed AI memory moving from scarcity story to income statement, with HBM4 in high-volume shipments for a lead platform, HBM4 ramping faster than HBM3E, and record data-center margins. Groq raised $650M on Jun 22 to expand an inference cloud that already spans 13 data centers and targets 200MW by end-2027, showing that the neocloud story is shifting from training clusters to low-latency inference operations. NVIDIA's Vera Rubin production and Spectrum-X Ethernet Photonics messaging made 1.6T/CPO fabric part of the default AI-factory bill of materials, while FERC's large-load show-cause orders continued to reprice where gigawatt-scale campuses can be served and who pays for upgrades.

The application and agent layers moved in parallel: Codex/Cursor-style automations and ServiceNow's governed build-agent pattern point to background software work becoming an enterprise control-plane problem, not a chat feature.

Net: boards should read this as a capacity-allocation week. Investors should separate demand winners from power/memory bottleneck owners. Architects should design for multi-provider inference and budgeted agent loops. Operators should lock memory, power, optical fabric, and verification gates before assuming model access converts into production throughput.

Lenses: Software (Jevons) · Hardware (Huang) · Networking (Metcalfe + Gilder)

The three lenses

What moved this week, and what to do about it.

9 events across the flywheel — 3 software, 3 hardware, 3 networking.

Software.

  • Groq raised $650M to expand its AI inference cloud, reporting 13 data centers, more than five million developers, trillions of tokens per week, and a target of 200MW by end-2027

    Groq newsroom; TechCrunch; DCD

  • Codex Automations documentation framed recurring background coding tasks as scheduled runs that report findings to Triage and can execute in isolated worktrees

    OpenAI Developers

  • The public model leaderboard stayed largely unchanged: Claude Opus 4.8 remained the practical available closed leader, GPT-5.5 stayed close behind, and GLM-5.2 kept pressure on cost/performance

    SWE-bench Verified; model leaderboard roundups

What this means

Software's signal was not another frontier release; it was the operating model around serving and delegating work. Architects should shift evaluation from 'which model won' to 'which inference path, automation harness, and verifier can run reliably at scale.'
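One way to picture "inference path, automation harness, and verifier" as a single unit is a loop that falls back across providers, checks each answer, and stops when a spend budget is exhausted. The sketch below is illustrative only: `Provider`, `call_with_fallback`, `budgeted_loop`, and `LoopResult` are hypothetical names, and a provider is assumed to be any callable that returns a response plus its cost in USD.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical interface: a provider takes a prompt and returns
# (text, cost_in_usd), raising an exception if the call fails.
Provider = Callable[[str], Tuple[str, float]]


@dataclass
class LoopResult:
    answer: Optional[str]  # None when no verified answer was produced
    spent_usd: float       # total spend across successful provider calls
    attempts: int          # number of successful provider calls


def call_with_fallback(providers: Sequence[Provider], prompt: str) -> Tuple[str, float]:
    """Try providers in order and return the first successful response."""
    last_error: Optional[Exception] = None
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as err:  # a real system would catch narrower errors
            last_error = err
    raise RuntimeError("all providers failed") from last_error


def budgeted_loop(
    providers: Sequence[Provider],
    prompt: str,
    verify: Callable[[str], bool],
    budget_usd: float,
    max_steps: int = 5,
) -> LoopResult:
    """Retry until the verifier accepts, the budget is spent, or steps run out."""
    spent, attempts = 0.0, 0
    while attempts < max_steps and spent < budget_usd:
        text, cost = call_with_fallback(providers, prompt)
        spent += cost
        attempts += 1
        if verify(text):
            return LoopResult(text, spent, attempts)
    return LoopResult(None, spent, attempts)
```

The budget is checked before each call, so a single expensive call can overshoot it; a stricter harness would reserve the estimated cost up front.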

Hardware.

  • Micron reported record fiscal Q3 results, with HBM4 in high-volume shipments for its lead customer's platform and qualification samples shipped to multiple end customers

    Micron / GlobeNewswire

  • Micron coverage reported HBM4 ramping about twice as fast as HBM3E 12-high, with more than $1B of HBM4 revenue already shipped and 2026 HBM supply fully contracted

    StorageNewsletter; The Next Web

  • Groq's capital raise explicitly funded inference-cloud capacity, reinforcing that hardware demand is spreading from training accelerators to token-serving infrastructure

    Groq newsroom

What this means

Hardware moved from roadmap to allocation. Memory bandwidth and inference capacity are now visible financial bottlenecks, so operators should reserve HBM-backed capacity and evaluate inference-cloud redundancy before promising agent workloads to the business.

Networking.

  • NVIDIA positioned Vera Rubin with Spectrum-X Ethernet Photonics in production, using co-packaged optics and 200Gb/s SerDes as the fabric for million-GPU AI factories

    NVIDIA Newsroom

  • NVIDIA tied Spectrum-X Ethernet Photonics to 5x better power efficiency, longer uptime, and faster deployment versus traditional transceiver networks

    NVIDIA Newsroom

  • FERC's six RTO/ISO show-cause orders kept large-load interconnection, co-location, flexible service, and cost-shift transparency at the center of data-center siting

    Utility Dive; POWER Magazine

What this means

Networking and power are converging into one design constraint: moving tokens at AI-factory scale now depends on optical fabric inside the campus and tariff clarity outside it. Architects should evaluate 1.6T/CPO readiness and power-interconnection risk together, not as separate procurement tracks.

Capital flow

Money in, revenue out.

4 categories tracked. Capital deployment up in 1 of 4; revenue follows at multiples of 0.21 to 0.6.

The four-category scorecard. Where capital is going in, where revenue is coming out, and how much of it is real. The one chart for the boardroom.

  • Frontier Labs (OpenAI, Anthropic, Google DeepMind, xAI)

    Capital In: ~$95B (vs ~$95B)

    Revenue Out: ~$21B (vs ~$21B)

    Burn / Rev: ~1.3x

    Movement: No new frontier-lab financing or flagship GA reset the week; the story moved to inference capacity and open/closed model economics.

  • Hyperscaler-Hosted (Azure-OpenAI, AWS-Anthropic, Google Cloud-Gemini, Oracle-OCI)

    Capital In: ~$187B (vs ~$187B)

    Revenue Out: ~$62B (vs ~$62B)

    Burn / Rev: ~3.0x

    Movement: No new top-four earnings print; hyperscaler read-through came through memory, optical fabric, and large-load policy rather than fresh capex guidance.

  • Neoclouds (CoreWeave, Nscale, Crusoe, Lambda, Fluidstack, IREN)

    Capital In: ~$12.7B (vs ~$12B)

    Revenue Out: ~$5B (vs ~$5B)

    Burn / Rev: ~2.5x

    Movement: Groq's $650M raise moved inference neoclouds from narrative to funded capacity expansion.

  • On-Prem / Hybrid (Enterprise GPU clusters, sovereign and national programs, Cisco / Dell / HPE)

    Capital In: ~$94B (vs ~$94B)

    Revenue Out: ~$36B (vs ~$36B)

    Burn / Rev: ~2.6x

    Movement: FERC large-load reform remained the key on-prem/hybrid catalyst; no new sovereign mega-commitment landed in-window.

Burn-to-Revenue is committed capital divided by revenue. Higher means more capital is going out than revenue is coming in.
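As a quick check on the scorecard arithmetic, dividing Capital In by Revenue Out reproduces the published Burn / Rev figures for three of the four rows. The sketch below uses only the rounded figures printed above; the function and variable names are illustrative.

```python
def capital_to_revenue(capital_in_bn: float, revenue_out_bn: float) -> float:
    """Ratio of capital deployed to revenue, both in billions of USD."""
    if revenue_out_bn <= 0:
        raise ValueError("revenue must be positive to form a ratio")
    return capital_in_bn / revenue_out_bn


# (Capital In, Revenue Out) in $B, copied from the scorecard rows.
scorecard = {
    "Hyperscaler-Hosted": (187.0, 62.0),
    "Neoclouds": (12.7, 5.0),
    "On-Prem / Hybrid": (94.0, 36.0),
}

ratios = {
    name: round(capital_to_revenue(capital, revenue), 1)
    for name, (capital, revenue) in scorecard.items()
}

for name, ratio in ratios.items():
    print(f"{name}: ~{ratio}x")
```

For the Frontier Labs row the same division gives roughly 4.5 rather than the published ~1.3x, so that row presumably uses a burn input other than Capital In.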

Signal vs noise

What’s real, what’s noise.

4 claims this week — 3 signal, 1 noise.

Each claim is scored 1–5 on source quality and triangulation. Anything 2 or below is flagged as noise. Where consensus is wrong, we say so.

  • 5 / 5

    Micron's HBM4 ramp is now a financial and supply-chain signal, not just a roadmap item.

    Sources: Micron fiscal Q3 release; StorageNewsletter

    This is high-grade signal because it is company-reported and tied to revenue, shipments, and margin. Memory allocation remains one of the cleanest ways to measure real AI infrastructure demand.

  • 4 / 5

    Groq's $650M raise shows inference-cloud capacity is attracting growth capital after the training-cloud rush.

    Sources: Groq newsroom; TechCrunch; DCD

    The disclosed data-center footprint, token volume, and 200MW target make this more than venture positioning. The open question is whether Groq can differentiate once NVIDIA-linked inference hardware is broadly available.

  • 4 / 5

    Co-packaged optics is moving from lab narrative into the default Vera Rubin AI-factory architecture.

    Sources: NVIDIA Newsroom

    Vendor claims need discounting, but the production framing and named ecosystem partners make CPO/1.6T readiness a near-term architecture watch item.

  • 2 / 5 — noise

    The weekly model-ranking blog cycle is overstating model-market change this week.

    Sources: SWE-bench Verified; model roundups

    There was no fresh closed-frontier release in-window. Treat leaderboard roundups as useful baselines, not as evidence that procurement posture changed this week.
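The grading rule stated at the top of this section (scores run 1 to 5, and anything at 2 or below is noise) can be sketched in a few lines. The names below are illustrative, and the scores are the four printed above.

```python
from collections import Counter

NOISE_THRESHOLD = 2  # scores at or below this are flagged as noise


def classify(score: int) -> str:
    """Map a 1-5 source-quality score to 'signal' or 'noise'."""
    if not 1 <= score <= 5:
        raise ValueError("score must be between 1 and 5")
    return "noise" if score <= NOISE_THRESHOLD else "signal"


# Scores for this week's four claims, in the order listed.
scores = [5, 4, 4, 2]
tally = Counter(classify(s) for s in scores)
print(f"{len(scores)} claims: {tally['signal']} signal, {tally['noise']} noise")
```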

Early warning panel

The levers we monitor.

10 metrics tracked — 0 rising, 0 falling, 10 steady.

Current vs prior period. Each metric has a threshold where the read materially changes — this panel flags the inflection before it lands in headlines.

  • Frontier lab cash position (avg months runway, top 3)

    ~34-37 mo (flat; no new in-window round) vs ~34-37 mo (flat; no new in-window round)

    Threshold: <18 mo triggers re-rating risk

    What this measures

    Top 3 frontier labs (OpenAI, Anthropic, Google DeepMind) by disclosed runway, with xAI now public inside SpaceX. No new financing event in-window; both OpenAI and Anthropic remain at the confidential-draft-S-1 stage. Capital access stays wide; boards should not assume funding pressure forces near-term commercial concessions.

  • Hyperscaler capex / AI revenue ratio (top 4 weighted)

    ~5.0-5.3 (flat; next top-4 earnings catalyst pending) vs ~5.0-5.3 (Amazon $10B + Google $1.5B added; top-4 guides flat)

    Threshold: >6.0 invites investor pushback at next earnings

    What this measures

    Top 4 hyperscalers (MSFT, GOOG, META, AMZN) weighted aggregate of capex divided by AI-attributable revenue. No new top-4 earnings print in-window; the incremental Amazon and Google campus commitments fit prior ~$725-805B 2026 capex guides. Investors should keep the bubble hypothesis focused on funding and conversion, and now on FERC-driven siting cost, rather than on demand.

  • CoreWeave revenue backlog

    ~$100B reported / ~$131B analyst-estimated by end-Q2 vs ~$100B (Jun 15); Cantor estimate ~$131B by end-Q2

    Threshold: Conversion velocity matters more than gross figure

    What this measures

    Booked but unrecognized revenue. Reporting puts backlog at ~$100B as of mid-June (vs the $99.4B Q1 figure), with Cantor Fitzgerald modeling ~$131B by end-Q2 — analyst-estimated, not company-guided. The official next print is Q2 in early August. Operators should keep watching conversion velocity over the headline figure.

  • NVIDIA Q-over-Q data center revenue

    $75.2B Q1 FY27; Q2 guide $91B, reports Aug 26 vs $75.2B (Q1 FY27); Q2 guide $91B, reports Aug 26

    Threshold: Q2 FY27 guide $91B implies further +21% QoQ

    What this measures

    No within-window NVIDIA event; the next earnings print is Aug 26. The Q2 guide is $91B. In-window context (Supermicro Vera Rubin order availability, all three HBM makers volume-shipping HBM4 12-Hi) supports the ramp; packaging and power remain the binding constraints, not demand.

  • Open vs closed gap on coding (SWE-Bench / agentic)

    Open pressure sustained: GLM-5.2 / DeepSeek V4 Pro remain the cost challengers vs Open closing fast: GLM-5.2 (MIT) reportedly beats GPT-5.5 on long-horizon coding at ~1/6 cost; AA rebased to v4.1

    Threshold: Sustained open lead reshapes enterprise procurement

    What this measures

    The gap narrowed sharply in-window: GLM-5.2's MIT weights reportedly beat GPT-5.5 on several long-horizon coding benchmarks at ~1/6 the cost, and Artificial Analysis rebased its Intelligence Index to v4.1 (agentic), where open leaders sit ~44 and GLM-5.2 took the open lead. With Fable 5 suspended, the available closed leader is Opus 4.8 (AA v4.1 56). Architects should treat open self-host as a live procurement option, not a hedge.

  • Sovereign AI commitments (count / aggregate $)

    ~14 / ~$180B+ (flat; no new drawn sovereign mega-commitment) vs ~14 / ~$180B+ (flat; TensorX/Solstice up-to-$1B EU facility is capacity, not drawn)

    What this measures

    Analyst-curated count of sovereign/national AI-compute commitments. No major new drawn commitment in-window; the only new item is TensorX/Solstice's up-to-$1B EU GPU/data-center financing facility (capacity, not a drawn commitment). Operators with EU workloads should track the facility but price in multi-year build timelines.

  • PJM 2026/27 capacity auction price ($/MW-day)

    $329.17; 2028/29 BRA results expected around Jul 7 vs $329.17; 2028/29 BRA results expected ~July 7 (pending)

    Threshold: 11x in 24 months — power is the new binding constraint

    What this measures

    The 2026/27 BRA cleared at the FERC cap ($329.17); the 2027/28 BRA cleared at $333.44. The 2028/29 auction is still pending, with results expected around July 7, 2026, under a collar floor of $175 and a cap near $325. Architects should not assume near-term price relief; budget capacity at-cap through 2028.

  • Time-to-power, busiest US markets (months)

    60-84; FERC 60-day tariff-response clock is the next catalyst vs 60-84; FERC Jun 18 show-cause may compress large-load study timelines

    What this measures

    Months from new-load interconnection request to energization. FERC's Jun 18 show-cause orders aim to speed large-load studies and clarify cost allocation within 60 days — potentially pro-speed medium-term, but no near-term change. Large-power-transformer lead times remain ~128 weeks. Architects should pre-commit power and long-lead grid equipment before GPU SKUs.

  • Cost-per-task, frontier reasoning model

    ~$0.10-$0.15 effective; inference-cloud competition rising vs ~$0.10-$0.15 (effective); open weights (GLM-5.2) push commodity capability to ~1/6 frontier cost

    GLM-5.2 API ~$1.40/$4.40 per MTok; Grok 4.3 on Bedrock $1.25/$2.50

    What this measures

    Median cost across frontier-tier reasoning models for a benchmark complex task. Open weights drove the ceiling down: GLM-5.2 lists at ~$1.40/$4.40 per MTok (~1/6 of comparable frontier) and Grok 4.3 went GA on Bedrock at $1.25/$2.50. Operators running agents at scale should re-benchmark on cost-per-task and pilot open self-host for routine work.

  • Custom silicon share of incremental AI compute

    ~33-36%; HBM and CPO now more binding than raw accelerator demand vs ~33-36%; J.P. Morgan pegs 2026 custom-ASIC TAM ~$60-70B (Broadcom 80-85%)

    Threshold: >35% materially compresses merchant GPU pricing

    What this measures

    A J.P. Morgan note put the 2026 custom-ASIC market at ~$60-70B with Broadcom at 80-85% and Marvell 10-12%, reinforcing the thesis that ASIC units overtake merchant-GPU units by 2027. Investors with concentrated NVIDIA exposure should keep diversifying into the Broadcom/Marvell co-design duopoly and advanced packaging / power.
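Two pieces of panel arithmetic are easy to reproduce: the quarter-over-quarter growth implied by NVIDIA's guide, and where a metric reported as a range sits against its threshold. This is a minimal sketch with illustrative function names; the inputs are the figures quoted in the panel.

```python
def qoq_growth(prior: float, current: float) -> float:
    """Fractional quarter-over-quarter change from prior to current."""
    if prior <= 0:
        raise ValueError("prior must be positive")
    return (current - prior) / prior


def range_vs_threshold(low: float, high: float, threshold: float) -> str:
    """Place a reported range relative to a 'greater than' threshold."""
    if low > high:
        raise ValueError("low must not exceed high")
    if low > threshold:
        return "above"      # whole range has crossed the threshold
    if high <= threshold:
        return "below"      # whole range is at or under the threshold
    return "straddles"      # threshold falls inside the reported range


# NVIDIA data center revenue: $75.2B in Q1 FY27, $91B guided for Q2.
implied = qoq_growth(75.2, 91.0)
print(f"Implied QoQ growth: {implied:+.0%}")

# Hyperscaler capex / AI revenue: ~5.0-5.3 against a >6.0 threshold.
print(range_vs_threshold(5.0, 5.3, 6.0))

# Custom silicon share: ~33-36% against a >35% threshold.
print(range_vs_threshold(33.0, 36.0, 35.0))
```

On these inputs the custom-silicon range straddles its threshold, which is why that metric can read "steady" while still sitting at the inflection point.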

Predictions

What we expect next.

4 predictions for the next 30-90 days, confidence 57%-72%.

Each prediction is falsifiable, time-bounded, and tied to a specific signal we will watch. Future issues score these hit, miss, partial, or pending and build a public track record.

Prediction 01 · 72% confidence · Hardware

By July 31, 2026, at least one additional memory supplier besides Micron publicly confirms 2026 HBM4 supply is fully allocated or materially price-up for 2027.

Deadline: By July 31, 2026

Trigger: Samsung or SK hynix earnings call, investor presentation, or supply-chain report.

Prediction 02 · 57% confidence · Capital

Groq announces at least one named Fortune 500 or hyperscaler inference-cloud customer by September 30, 2026.

Deadline: By September 30, 2026

Trigger: Groq customer announcement, case study, or partner release.

Prediction 03 · 63% confidence · Networking

A named Vera Rubin partner announces a CPO/Spectrum-X Ethernet Photonics rack or cluster design win by August 31, 2026.

Deadline: By August 31, 2026

Trigger: NVIDIA, OEM, cloud, or networking vendor product/customer announcement.

Prediction 04 · 66% confidence · Software

At least one major enterprise platform ships an admin control specifically for scheduled/background coding or app-building agents by August 31, 2026.

Deadline: By August 31, 2026

Trigger: Product changelog or GA announcement from OpenAI, Cursor, Microsoft, ServiceNow, GitHub, or Atlassian.

Track record

Scoring prior predictions.

6 prior predictions: 0 hit, 0 miss, 0 partial, 6 pending. Hit rate —.

Prediction 01 · 58% confidence · Software

An MIT- or Apache-licensed open-weight model (e.g., GLM-5.2) enters the overall top 5 of the Artificial Analysis Intelligence Index v4.1 — not just the open-weight subset — by August 31, 2026.

Deadline: By August 31, 2026

Trigger: Artificial Analysis Intelligence Index v4.1 leaderboard update.

Pending. GLM-5.2 remains the open challenger, but no new top-5 overall AA confirmation landed in W26.

Prediction 02 · 75% confidence · Hardware

By August 31, 2026, all three HBM makers (SK hynix, Samsung, Micron) confirm HBM fully allocated for 2026 and/or 2027 price increases.

Deadline: By August 31, 2026

Trigger: Earnings calls or supply-chain reporting (TrendForce, Bloomberg/Reuters) from the three memory vendors.

Pending (partial progress). Micron publicly confirmed HBM4 high-volume shipments and strong HBM visibility; confirmation from all three suppliers remains pending.

Prediction 03 · 55% confidence · Networking

A second non-Broadcom vendor (Marvell or Credo) cites a 1.6T or co-packaged-optics production design win by September 30, 2026.

Deadline: By September 30, 2026

Trigger: Earnings call, product release, or customer design-win disclosure.

Pending. NVIDIA's CPO production signal strengthens the setup, but no second non-Broadcom production design win was disclosed this week.

Prediction 04 · 80% confidence · Power

At least one RTO/ISO files a large-load interconnection compliance proposal answering FERC's Jun 18 show-cause orders by the August 17, 2026 deadline.

Deadline: By August 17, 2026

Trigger: FERC docket filings from PJM, MISO, SPP, CAISO, ISO-NE, or NYISO.

Pending. FERC orders are live; the RTO/ISO compliance-response deadline remains August.

Prediction 05 · 68% confidence · Power

A named hyperscaler announces a >1GW behind-the-meter or off-grid generation deal for AI data centers by August 31, 2026.

Deadline: By August 31, 2026

Trigger: Hyperscaler energy/data-center announcement; utility or developer disclosure.

Pending. FERC and hyperscaler power focus continued, but no named >1GW behind-the-meter hyperscaler deal landed in W26.

Prediction 06 · 55% confidence · Capital

At least one major closed lab cuts flagship API prices or ships a cheaper tier by August 31, 2026, in response to open-weight cost pressure.

Deadline: By August 31, 2026

Trigger: Lab pricing page or API changelog (OpenAI, Anthropic, Google).

Pending. Open-weight cost pressure continued, but no major closed-lab flagship price cut landed this week.
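The track-record summary at the top of this section can be reproduced with a small tally. The sketch below is one possible scoring, not the brief's published method: it excludes pending predictions from the denominator and counts a partial as half a hit, and that weight is an assumption. With all six predictions pending there is nothing resolved, so the hit rate is undefined rather than zero.

```python
from collections import Counter
from typing import Iterable, Optional


def hit_rate(statuses: Iterable[str], partial_weight: float = 0.5) -> Optional[float]:
    """Hit rate over resolved predictions; None when nothing has resolved.

    partial_weight is an assumed convention (half credit for a partial).
    """
    counts = Counter(statuses)
    resolved = counts["hit"] + counts["miss"] + counts["partial"]
    if resolved == 0:
        return None
    return (counts["hit"] + partial_weight * counts["partial"]) / resolved


# Status of the six prior predictions as of this issue.
prior = ["pending"] * 6
rate = hit_rate(prior)
print("Hit rate:", "n/a" if rate is None else f"{rate:.0%}")
```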

Watchlist

On the radar this week.

4 catalysts to watch, starting Jul 1-10.

Specific catalysts that would change the read materially. Watching these tells us whether the thesis is strengthening or weakening.

  • Jul 1-10

    PJM 2028/29 capacity auction result

    Another at-cap result would harden the thesis that power cost, not GPU access, is the binding AI-factory constraint in key US markets.

  • Jul 2026

    Gemini 3.5 Pro GA and independent benchmark read

    A real GA would test whether Google's delayed closed-frontier release changes the Opus/GPT/open-weight procurement baseline.

  • Jul-Aug 2026

    RTO/ISO large-load tariff responses

    The first filings will show whether FERC accelerates data-center interconnection or simply moves cost allocation fights into regional proceedings.

  • Q3 2026

    Vera Rubin / HBM4 customer deployment evidence

    Shipping evidence from OEMs or cloud partners would convert HBM4 and CPO claims into deployment timing, capacity, and margin implications.

Companion reads

The rest of the spine.

The AI Stack Weekly is the cross-stack flywheel read. Pair it with the model-and-tree spine and the working framework to get the full picture.

Edits this issue

  • W26 adds Groq's inference-cloud raise, Micron's HBM4 financial signal, NVIDIA CPO/Vera Rubin production context, and FERC large-load follow-through; prior W25 predictions remain mostly pending with one HBM4 partial.

About this brief

Compiled from public announcements, SEC filings, earnings transcripts, and official lab and vendor publications. Every quantitative claim is graded 1–5 on source quality. Claims graded 2 or below are flagged as noise. The thesis the brief defends is published separately and updated only when a hypothesis materially changes.

Authorship

Written by Brian Letort. Independent analysis. All sources cited are public. Not investment guidance.
