Industry
The AI compute stack, read weekly.
What moved this week in AI compute, what it cost, and what it means. Across software, hardware, and networking — with graded sources, falsifiable predictions, and a public track record.
Built for senior decision-makers who need to separate the trend from the headline: boards governing AI capital allocation, investors weighing where the marginal dollar earns its return, AI architects deciding what to procure and what to wait on. Each issue is the weekly read. The companion LLM Evolutionary Tree is a living view of model lineage that the weekly brief annotates as the field branches.
Latest issue
The AI factory became a power-and-fabric problem, not a model-release problem.
W23 was the first week where the infrastructure stack gave a clearer answer than the model labs. NVIDIA used GTC Taipei / Computex to move Vera Rubin from roadmap to production ramp: the platform is in full production, fall/Q3 shipments are planned, and the five-rack AI factory reference now includes Vera Rubin NVL72, Vera CPU, BlueField-4 storage, Spectrum-6 Ethernet and Spectrum-X Ethernet Photonics. Jensen Huang later confirmed that Samsung, SK hynix, and Micron are all qualified and in production for HBM4. That resolved last week's hardware prediction, but it also shifted the bottleneck: the question is no longer whether the next rack exists, but whether power, memory, optical fabric, and operator software can arrive together. On software, the closed frontier was quiet (Gemini 3.5 Pro still had not GA'd by the end of the window), while open weights widened in the efficient-agent layer: JetBrains Mellum2, NVIDIA Cosmos 3, and Holo3.1 all targeted deployable sub-agents, physical-AI reasoning, or local computer-use rather than a monolithic chatbot benchmark. On applications, Microsoft Scout, Salesforce Coworker, ServiceNow Otto, Wordsmith, and Stilta all pointed at the same control-plane fight: governed agents with identities, permissions, and workflow authority. Net: boards should treat AI capacity as an integrated power+fabric+software operating model; investors should stop valuing compute without asking who controls HBM4, optics, and firm power; architects should design for heterogeneous model routing and governed agent identity; operators should budget the AI factory as a system, not a GPU purchase order.
Sibling publication
No new closed frontier shipped; the open/local agent substrate widened underneath it.
W23 did not produce the expected Gemini 3.5 Pro GA or a fresh Anthropic/OpenAI frontier release. That absence is the story: Claude Opus 4.8 remains the public closed-frontier leader for coding and agentic work, while Google kept Pro in the June watch window and the model layer's actual shipping activity moved down-stack. JetBrains released Mellum2, an Apache-2.0 12B/2.5B-active MoE designed for low-latency routing, RAG, summarization, validation, and sub-agent calls; NVIDIA released Cosmos 3 as an open physical-AI omni-model with Nano 16B and Super 64B variants; H Company released Holo3.1 with local computer-use sizes and quantized checkpoints. The procurement implication is sharper than another leaderboard reshuffle: production agent systems are becoming portfolios of models. Keep Opus/GPT/Gemini-class models for high-risk reasoning and codebase-scale orchestration, but push cheap, private, repeated sub-agent work into specialized open/local models. The tree delta therefore adds efficient-agent and physical-AI nodes rather than another general chatbot crown.
The deep read on the software side of the AI stack — lineage, architecture, benchmarks, vendor signals — anchored to the LLM Evolutionary Tree. The AI Stack Weekly above covers the cross-stack flywheel; The Model Pulse drills the model layer.
Third publication
The application layer moved from AI assistants to governed agent identities.
W23's application-layer story was not another SaaS vendor saying 'AI' on an earnings call; it was the shift from chat surfaces to agents that can act with identity, permissions, and auditability. Microsoft introduced Scout as an always-on Autopilot agent with its own governed Entra identity, operating across Microsoft 365, Teams, Outlook, files, local resources, and MCP servers. Salesforce followed its prior earnings momentum with Agentforce Coworker, a headless AI teammate that follows users across Salesforce, Slack, Teams, ChatGPT, Claude, and more while orchestrating CRM actions, Flows, third-party APIs, and specialized agents. ServiceNow pushed the same thesis through Otto: one conversational layer that turns intent into work across the Now Platform, blending Now Assist, Moveworks, AI Experience, and AI Control Tower. At the vertical edge, Wordsmith raised $70M to automate in-house legal operations and Stilta raised $10.5M for patent invalidity/infringement analysis. The buyer decision is now concrete: choose the system that owns agent identity, policy, and workflow state, not the UI with the best demo. Seat-based software that cannot prove governed action will be repriced against agents that complete the job.
The layer above the model — vertical packages from frontier labs, incumbent SaaS counter-attacks, vertical-AI startup signals, and pricing-model shifts. Where the Weekly tracks the cross-stack flywheel and the Pulse drills the model layer, The Application Layer tracks the procurement surface that enterprise buyers actually choose between.
How we read it
Three lenses. One flywheel. A working filter for signal vs noise.
Cheaper inference pulls in more workloads. More workloads need more compute. More compute needs denser fabric. Denser fabric unlocks new architectures, which lower the cost of inference again. Read a week’s news across that loop and the noise sorts itself out: a chip launch is real if it changes power per rack; a model release is real if it changes which workload runs on-prem; a fabric standard is real if it shortens hybrid deployment time.
Every claim is graded 1–5 on source quality. Every prediction is falsifiable, time-bounded, and scored hit or miss in future issues. The framework the publication uses to filter is published separately and revised when evidence demands.
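One way to picture the grading and scoring described above is a small record per prediction plus a hit-rate calculation. This is a minimal sketch under stated assumptions, not the publication's actual schema: the `Prediction` class, its field names, and the `hit_rate` helper are all hypothetical, and the sketch does not assume which end of the 1–5 scale is best.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Prediction:
    """Hypothetical record for one falsifiable, time-bounded prediction."""
    claim: str
    source_grade: int                # source quality on the 1-5 scale
    due: date                        # the time bound on the prediction
    outcome: Optional[bool] = None   # True = hit, False = miss, None = not yet scored

    def __post_init__(self) -> None:
        # Reject grades outside the 1-5 range the text describes.
        if not 1 <= self.source_grade <= 5:
            raise ValueError("source_grade must be between 1 and 5")


def hit_rate(predictions: List[Prediction]) -> Optional[float]:
    """Share of scored predictions that were hits; None if nothing is scored yet."""
    scored = [p for p in predictions if p.outcome is not None]
    if not scored:
        return None
    return sum(1 for p in scored if p.outcome) / len(scored)
```

Unscored predictions are left out of the denominator, so an open prediction neither helps nor hurts the track record until a later issue scores it.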
Read the working framework
Companion view
The AI Futures Studio.
An interactive future-state generator. Set a horizon and the forces that get you there; a grounded model projects the landscape and an AI narrator writes the briefing.
The LLM Evolutionary Tree.
A living lineage of frontier and open-weight models. Updated as new families branch and converge.
The AI Market Reference Architecture.
A NIST-cloud-style taxonomy for AI market boundaries, control points, shared responsibility, and measurable leaders.
The AI Shockwave Timeline.
The events that reset frontier assumptions, with before/after impact deltas across hardware, software, and networking.
For AI tools and agents
The corpus, in shapes AI can read.
Every issue ships as hybrid markdown (YAML frontmatter plus render-faithful body) and JSON alongside the human HTML and PDF. The LLM Evolutionary Tree exposes markdown, JSON, and the raw YAML. A discovery index at /industry/llms.txt enumerates everything; a downloadable Claude Skill bundles the schema, voice rules, and five commands (summarize, audience translate, score predictions, compare weeks, tree delta).
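For a tool consuming the hybrid markdown, the first step is usually to separate the YAML frontmatter from the body. The sketch below assumes the frontmatter is fenced by `---` lines, which is the common convention but is not confirmed by this page; the function name `split_frontmatter` is hypothetical, and the frontmatter is returned as raw text for a YAML parser of your choice.

```python
from typing import Tuple


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Split a markdown document into (frontmatter, body).

    Assumes frontmatter, if present, starts on the first line with '---'
    and ends at the next line that is exactly '---'. If either delimiter
    is missing, the whole text is treated as body.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return frontmatter, body
    # Opening delimiter without a closing one: do not guess, return as body.
    return "", text
```

Returning the frontmatter unparsed keeps the sketch dependency-free; the JSON rendition of each issue is likely the simpler route when structured fields are all that is needed.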
Point your AI tool here
Paste this into Claude, ChatGPT, Gemini, Copilot, or any tool that can read a URL. It learns the corpus and how to navigate it.
Use https://brianletort.ai/industry/llms.txt as a reference source on the AI market. Read it and the endpoints it lists, then help me reason about AI compute, models, vendors, and market direction — citing the issues you use.
Who reads it — and what they get.
- Boards and audit committees. The week’s capex direction relative to disclosed AI revenue, with the levers that signal a re-rating before earnings do.
- Investors and analysts. Where the marginal dollar is earning least and most across frontier labs, hyperscalers, neoclouds, and on-prem — and the falsifiable predictions that test the read.
- CIOs, CTOs, and AI architects. What changed in procurement this week — open vs closed models, merchant vs custom silicon, scale-up vs hybrid fabric — and what to deploy versus what to wait on.
- Operators of AI estates. The catalysts in the next 7–14 days that change unit economics: earnings prints, supply ramps, regulatory milestones, and power-market clearing.
Operate. Publish. Teach.