From Renting Tokens to Owning AI Assets — Part 1: The Rent-vs-Own Question

TL;DR

Rent-everything was the correct default in 2024 and is the wrong default in 2026. Model commoditization, data-gravity pressure, and sustained enterprise volumes have moved the crossover points.
The four lanes — public API, reserved capacity, regional/data-zone, and sovereign/colocated — have fundamentally different unit-economics shapes. A one-lane strategy is not a strategy; it is a default.
The crossover point is not a number. It is a calculation across workload volume, quality floor, and locality pressure. Every workload in the enterprise deserves the calculation; most get it once, at launch, and never again.
What you are buying when you move up the ladder is not cheaper tokens. It is capacity predictability, locality control, and — at the top — an owned asset that compounds.
The portfolio question belongs to the CFO, not the CTO. Lane selection is a capital-allocation decision disguised as an infrastructure decision.

Continue to Part 2: What It Means to Own AI Assets

From Renting Tokens to Owning AI Assets

Part 1 of 2

Part 2: What It Means to Own AI Assets

I brief a university board on AI strategy a few times a year. After my last session, one of the trustees — a sitting public-company CFO — stayed behind at the coffee break and asked me two questions in sequence that I suspect most boards are about to ask their own CFOs over the next eighteen months.

The first: "The AI bill doubled every six months for the last four quarters. What is it going to do next year?"

The second, after the pause the first one produced: "What if we stopped trying to shrink the bill and started asking what we should own?"

That second question is what this post is about. It is also the pivot that separates the organizations that will run profitable AI platforms from the organizations that will spend forever renting the same capabilities from the same six vendors.

The CEO's Guide to Token Economics argued that cost per verified outcome is the number that matters. Data Gravity Meets Token Economics argued that placement is an economic decision. Designing the AI Control Plane and The Enterprise Token Scorecard gave you the architecture and the metrics. This series gives you the portfolio.

Part 1 is the question — the shape of the trade-off, the math of the crossovers, why the right answer was different two years ago, and why it will be different again in two years. Part 2 redefines what "owning AI assets" actually means in an enterprise, and why most people reading that phrase are hearing the wrong half of the answer.

Both parts are written for a CEO, a CFO, and a board member. The portfolio question does not belong to the CTO. It belongs to the people who decide how the company's capital is allocated — because that is what it is.

Why rent-everything was correct in 2024, and is not the default in 2026

Two years ago, the advice was easy. Rent everything.

The models were improving faster than anyone could train against. The vendors were discounting aggressively. The tools to reserve or host or colocate were either immature or expensive or both. Locality was a niche concern outside of a handful of regulated workloads. The math, for almost every enterprise, said the same thing: pay the public-API rate, stay flexible, wait for the market to settle, and revisit in a year.

The math does not say that anymore.

Three things have changed.

Models are commoditizing on the workloads that matter most. Near-frontier open models now match frontier models on 80 to 90 percent of enterprise tasks. The premium for the top tier is real, but it applies to a shrinking share of the workload mix. That means a larger share of the bill is buying capability you do not need — and that is exactly the share that can be moved into cheaper lanes without losing quality.

Data gravity has become the dominant cost. When 93% of enterprise data is created outside the public cloud, hauling that data to a distant region for inference incurs egress fees, latency, compliance friction, and retrieval token inflation, every time. Placement stopped being an architecture concern and started being a line item. The lanes that let you move inference to the data (regional deployments, colocated inference, sovereign private GPU) stopped being exotic.

Enterprise volumes have sustained long enough to reserve. In 2024, almost nobody knew what their AI workload would look like in six months, so reserving capacity felt reckless. Today, the top ten workloads in most large enterprises have been running for two years. Their base volume is stable enough to reserve. The reservation math — flat floor, better unit economics above a commit threshold — now works for workloads that used to look too volatile to commit to.

Put together: the default that was right two years ago is now an inefficient habit. The correct default in 2026 is not "rent everything." It is "figure out which workloads are which, and put each one in the right lane."

Four lanes, four different curves

Interactive chart: Four cost curves, one slider. Four lanes, four different unit-economics shapes, plotted against workload volume. The volume slider shows which lane wins at each workload volume, and how the answer flips as the business scales.

Example reading, slider at 55% volume: Reserved / PTUs is the cheapest lane. Commitment-priced capacity. Higher floor, much flatter curve. Wins when base utilization is predictable enough to commit to; loses to public below the commit, and to owned above it.

The crossover points are the strategic decisions. Missing them is not an accounting mistake — it is a portfolio mistake, and it compounds every quarter the organization sits on the wrong curve.

Move the slider.

The chart above shows the cost curves of the four lanes, plotted against workload volume. Notice what it tells you.

The public API line is straight. Linear. Unbeatable at low volume. But it keeps climbing forever — every request is metered, every token is paid at spot, nothing ever stops the slope.

The reserved / PTU line has a shelf. A higher floor, because you are paying for committed capacity whether you use it or not. But above a certain volume, the slope flattens dramatically — because the commit price has already covered the capacity you are now consuming. Sustained high volume pays for the commit and then some.

The regional / data-zone line tracks public with a slightly steeper slope, lifted by the locality premium. Same shape, same spot-pricing dynamics, with a 10 to 20 percent uplift for residency. Wins when your workloads require in-region processing but you are not ready to commit capacity.

The sovereign / colocated line starts high and stays nearly flat. Fixed cost dominates. The marginal cost of an additional request is almost nothing. Wins only at high, sustained, regulated volume — and loses spectacularly at everything else.

Four curves, four crossover points. The cheapest lane at low volume is not the cheapest lane at medium volume is not the cheapest lane at high volume. The strategic mistake most enterprises make is to pick a lane at launch and never revisit. The strategic opportunity is to do the opposite: hold a portfolio, re-score it every quarter, move workloads across lanes as their volume and their residency profiles mature.
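The shapes described above can be sketched as four straight lines, each with a fixed floor and a per-unit slope. This is a minimal illustration under stated assumptions, not anyone's price list: the `Lane` class, the `cheapest` and `crossover` helpers, and every number below are hypothetical, chosen only so the curves behave the way the text says they do. Reserved capacity is approximated as a floor plus a lower per-unit rate, and the regional lane is modeled as public plus a 15 percent uplift.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Lane:
    name: str
    fixed: float     # monthly cost paid regardless of volume
    marginal: float  # cost per unit of volume (for example, per million tokens)

    def cost(self, volume: float) -> float:
        """Total monthly cost at a given volume for a linear cost curve."""
        return self.fixed + self.marginal * volume


# Hypothetical parameters, picked only to mimic the four shapes in the text.
PUBLIC = Lane("public API", fixed=0.0, marginal=10.0)
RESERVED = Lane("reserved / PTUs", fixed=28_000.0, marginal=3.0)
REGIONAL = Lane("regional / data-zone", fixed=0.0, marginal=11.5)  # public + 15%
SOVEREIGN = Lane("sovereign / colocated", fixed=128_000.0, marginal=0.5)
LANES = (PUBLIC, RESERVED, REGIONAL, SOVEREIGN)


def cheapest(volume: float, lanes=LANES) -> Lane:
    """Lane with the lowest total cost at this volume (cost only, no residency rules)."""
    return min(lanes, key=lambda lane: lane.cost(volume))


def crossover(a: Lane, b: Lane) -> Optional[float]:
    """Volume at which two linear cost curves meet, or None if they never do."""
    if a.marginal == b.marginal:
        return None  # parallel lines never cross
    volume = (b.fixed - a.fixed) / (a.marginal - b.marginal)
    return volume if volume >= 0 else None
```

With these placeholder numbers, `cheapest` returns the public lane at 1,000 units, reserved at 10,000, and sovereign at 50,000, and the public-to-reserved and reserved-to-sovereign crossovers land at 4,000 and 40,000 units. The regional lane never wins on cost alone in this sketch, which matches the text: it earns its place when residency is required, not when the bill is the only input.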

The crossover calculator

The chart is the intuition. The calculator is the decision.

Interactive calculator: Rent or own? The crossover calculator. Three inputs, one recommendation. Adjust workload volume (pilot, team-scale, platform-scale), quality floor, and locality pressure to see which lane the math actually points at, and why.

Example reading, workload volume at 50%: the recommendation is Public API with smart routing (lane selected: Public API, routed). Volume approaching the reservation crossover but not there yet. Cache aggressively, route utility tier where evals allow, keep the option open to reserve next quarter.

Every workload in the enterprise should run through a version of this decision. Not once. Every quarter. The crossover points move, and so should the portfolio.

Adjust the three inputs.

Volume is the easy one. A pilot workload at 2% of what it will be in a year belongs in the public lane; do not reserve capacity for a workload you do not yet understand. A platform-scale workload at 80% utilization is paying public spot prices for capacity that a commitment would already have covered.

Quality floor is less obvious. A high quality floor — the workload genuinely needs frontier capability — means you are stuck on the premium tier, which shifts the break-even point for reservation and the tolerability of the regional premium. A standard or low floor opens up utility-tier options and makes routing cheaper.

Locality is the override. Non-negotiable residency moves the answer up the stack regardless of volume — regional, at minimum; sovereign if the workload volume is high enough to amortize colocated capacity. Trying to save money on a regulated workload by running it through a global public API is not a portfolio mistake; it is a compliance time bomb.

The calculator is a toy. The principle it illustrates is not. Every workload deserves this decision. Most get it once, at launch, and never again. The cost of that habit is a line item on the AI bill that grows every quarter for no reason anyone can defend.
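The three-input decision can be written down as a handful of rules. The sketch below is one hedged reading of the paragraphs above, not the logic behind the original calculator: the `recommend_lane` function, the `Locality` enum, and the `reserve_at` and `own_at` thresholds are all hypothetical, and volume is expressed as a fraction of platform scale. Because the text does not say which direction a high quality floor moves the reservation break-even, the quality floor here only decides whether utility-tier routing is on the table.

```python
from enum import Enum


class Locality(Enum):
    NONE = "none"
    SOFT = "soft"                      # customer promise, regulatory signal
    NON_NEGOTIABLE = "non-negotiable"  # hard residency requirement


def recommend_lane(
    volume: float,
    high_quality_floor: bool,
    locality: Locality,
    *,
    reserve_at: float = 0.6,  # assumed reservation crossover
    own_at: float = 0.85,     # assumed volume needed to amortize owned capacity
) -> str:
    """Toy lane recommendation; volume is a 0..1 fraction of platform scale."""
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0 and 1")

    # Locality is the override: residency moves the answer up the stack.
    if locality is Locality.NON_NEGOTIABLE and volume >= own_at:
        return "sovereign / colocated"
    if locality is not Locality.NONE:
        return "reserved regional" if volume >= reserve_at else "regional / data-zone"

    # No residency pressure: volume decides between renting at spot and reserving.
    if volume >= reserve_at:
        return "reserved / PTUs"

    # Below the crossover, a standard or low quality floor allows utility-tier routing.
    return "public API" if high_quality_floor else "public API, routed"
```

For example, `recommend_lane(0.5, False, Locality.NONE)` returns `"public API, routed"`, the same answer the example reading above shows at 50% volume. Re-running the function each quarter with updated inputs is the point; the thresholds matter less than the habit.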

What you are actually buying

The four lanes are often described as "different prices for the same thing." They are not. They are different products, and the prices reflect that.

Four lanes, eight dimensions

The comparison the portfolio committee should actually be reading. Every lane wins somewhere; every lane loses somewhere. The move is to match the shape of the workload to the shape of the curve.

| Dimension | Public API (rent at spot price) | Reserved / PTUs (rent by commitment) | Regional / data-zone (rent with residency) | Sovereign / colocated (own a governed lane) |
| --- | --- | --- | --- | --- |
| Cost shape | Linear metered | Flat floor + spot overage | Metered + 10–20% uplift | High fixed, near-flat marginal |
| Locality | Global default | Global or regional | Region-bound | Boundary-bound |
| Commitment | None | Monthly / annual | None | Multi-year |
| Balance-sheet treatment | OpEx | OpEx (reservation) | OpEx | CapEx + OpEx |
| Wins when | Low/unpredictable volume | Predictable sustained volume | Soft locality / customer promise | Regulated + high sustained volume |
| Loses to | Reserved at sustained scale | Public at low volume; sovereign under regulation | Reserved regional once volume stabilizes | Every other lane until volume + policy demand it |

Control: Low for public API, rising through Moderate to High for sovereign / colocated.

Time to value: Minutes for public API, Days for reserved / PTUs, Quarters for sovereign / colocated.

A one-lane strategy is not a strategy. The mature enterprise runs a portfolio — different lanes for different workloads, revisited every quarter as the economics change.

Walk the columns. Each lane wins on some dimension and loses on others. The shape of the loss matters.

Public API wins on time-to-value and loses on control. Minutes to first request; effectively zero control over the physical infrastructure. Right answer when optionality is worth more than commitment.

Reserved capacity wins on sustained-volume economics and loses on flexibility. Days to provision, committed for months, then essentially flat-rate up to the commit, with spot overage beyond it. Right answer when base utilization is predictable enough that the commit is capital-efficient.

Regional / data-zone wins on residency story and loses on operational simplicity. Same metered shape as public, with a locality premium. Right answer when customer promises or regulatory signals demand in-region processing but the workload is not yet large enough to reserve or own.

Sovereign / colocated wins on control and audit and loses on time-to-value. Multi-year commitment, CapEx plus OpEx, quarters to provision. Right answer only when regulation is non-negotiable and volume is high enough to amortize the fixed cost. Wrong answer for any workload that does not meet both tests.

Notice what the columns are not describing. Capability is not on the chart. Model quality is not on the chart. The four lanes can all run the same model at effectively the same quality. What differs is the shape of the commercial relationship between the enterprise and the infrastructure. That shape is the portfolio decision.

Why the portfolio question belongs to the CFO

Most enterprises route the rent-vs-own decision through their CTO or their head of platform. That is a mistake. The decision is a capital-allocation decision disguised as an infrastructure decision.

Look at the balance-sheet row on the comparison above. Public API is OpEx. Reserved capacity is OpEx, but a committed OpEx that looks more like a lease. Regional is OpEx. Sovereign / colocated is a mix of CapEx and OpEx — and the CapEx piece, if it is structured right, is depreciable against a multi-year capacity plan.

The tax treatment, the cash timing, the depreciation schedule, the relationship to the operating plan — every one of these is a CFO question. The CTO cannot run the math without finance in the room. The head of platform cannot run the math at all.

The CEOs I talk to who are getting this right are the ones who have moved the rent-vs-own review from an annual architecture meeting to a quarterly portfolio meeting — chaired by the CFO, attended by the CTO, the head of platform, compliance, and the business-domain owners whose workloads are in the mix. The meeting has an agenda and a scorecard. The scorecard is the one from the last post. The agenda is: which workloads moved up the ladder this quarter, which moved down, and which should move next quarter.

Run that meeting four times and the economics start looking different. Run it twelve times and they do not look at all like the bill that used to keep doubling.

What happens when you start asking the rent-vs-own question properly

Three things, in every enterprise I have heard describe this work honestly.

First, the top decile of workloads by volume moves up the lane ladder within two quarters. Not because anyone pushed for it, but because the math made it obvious. The CFO sees the crossover curve, looks at the trailing-twelve-month volume on the top workloads, and asks why they are paying public spot rates on a workload that has been running at consistent high volume for six quarters. Nobody has a good answer. The workloads move to reserved, or to regional reserved, or, in a few cases, to colocated. Unit costs on those workloads drop by 30 to 60 percent inside a quarter.

Second, the bottom half of workloads by volume gets reconsidered. Pilots that have been running at public-API spot rates for eighteen months are not pilots anymore; they are shadow production workloads that nobody is measuring. Some get promoted to the scorecard and rationalized into real ownership. Some get shut down. Either outcome is a better use of capital than "we forgot we were still paying for this."

Third, the language changes. People stop talking about "our AI bill." They start talking about "our AI portfolio." That change in vocabulary is not cosmetic. It is the difference between an enterprise that is renting capability and an enterprise that is allocating capital.
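The CFO's question in the first point, why a workload with consistent high volume for six quarters is still on spot rates, amounts to a simple screen over volume history. A minimal sketch, assuming quarterly volumes per workload are available: the `reservation_candidates` function, the volume floor, and the stability cutoff (a coefficient of variation of 0.15) are hypothetical choices, and only the six-quarter window comes from the text.

```python
from statistics import mean, pstdev


def reservation_candidates(
    quarterly_volumes: dict[str, list[float]],
    *,
    min_quarters: int = 6,       # "consistent high volume for six quarters"
    min_volume: float = 1000.0,  # assumed floor for "high volume"
    max_cv: float = 0.15,        # assumed cutoff for "consistent"
) -> list[str]:
    """Workloads whose most recent quarters were all high-volume and stable."""
    candidates = []
    for name, volumes in quarterly_volumes.items():
        recent = volumes[-min_quarters:]
        if len(recent) < min_quarters:
            continue  # not enough history to commit against
        if min(recent) < min_volume:
            continue  # at least one quarter fell below the volume floor
        average = mean(recent)
        if average == 0:
            continue  # avoid dividing by zero when the floor is set to 0
        # Coefficient of variation: spread relative to the average volume.
        if pstdev(recent) / average <= max_cv:
            candidates.append(name)
    return sorted(candidates)
```

Fed a history in which one workload holds steady near 5,000 units a quarter, another swings between 1,000 and 9,000, and a third is a low-volume pilot, the screen returns only the steady one. The swinging workload and the pilot are the ones the second point is about: they need a decision, not a commitment.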

The leadership move

Stop trying to shrink the AI bill. Start asking what should be owned.

The organizations that will define the next five years of enterprise AI economics are not the ones that squeezed the best public-API contract. They are the ones that decided, workload by workload, which parts of their AI footprint should be rented, which should be reserved, and which should be owned — and then built the portfolio discipline to keep that decision fresh.

Renting tokens is how every enterprise started. Owning assets is how a small number of them will finish.

Part 2 is about what "owning" actually means — and why the most important asset in enterprise AI is the one almost nobody is buying yet.

This is Part 1 of the two-part Rent vs Own series, which sits inside the broader executive token-economics thread. The frame is The CEO's Guide to Token Economics. The placement dimension is Data Gravity Meets Token Economics. The architecture is Designing the AI Control Plane. The metrics are The Enterprise Token Scorecard. Continue to Part 2: What It Means to Own AI Assets.
