TL;DR
- I have not actually stopped using ChatGPT — I have stopped staying in it. ChatGPT is now my fastest route into a workflow, not the workflow itself.
- Four stages, four multipliers I have actually measured on my own work: Chat (3X), Cowork (5X), Build (10X), Automate (30X). They are not interchangeable; they are a ladder.
- The leverage is not in a smarter model. It is in a different posture toward the same model.
- Most leaders are budgeting AI as if there is one stage. The teams pulling ahead in 2026 are budgeting across all four.
Eighteen months ago, my AI workflow was a tab.
I opened ChatGPT in the morning, talked through my hardest task with it, and copied the result into wherever the actual work lived — a deck, a doc, an email, a spreadsheet. By the middle of 2025 that was just the way I worked. It felt like the future. Chat was not broken. Chat is still not broken. The version of me from 2023 who first opened ChatGPT could not have imagined using it any other way.
Then, almost without noticing it, I climbed a ladder I did not know existed. Each rung kept compounding the leverage of the one below it. By the fourth rung, I was getting work done at a pace that made the chat-only version of me look like he was operating on dial-up.
This is the personal field report on what those four rungs are, what each one gave me, and — most importantly — what this means for executives whose AI strategy is still "we bought Copilot, ChatGPT Enterprise, and Claude Enterprise licenses." If you are not technical, do not skip this. The pattern is what matters; the tool names are footnotes.
(A note on timing: as I was finalizing this post, Peter Yang published a LinkedIn piece making a similar argument — the chat era is coming to an end — that has been circulating among AI builders. The shift he describes is the same one I am writing about here, told from the outside. This post is the inside view: the journey itself, rung by rung.)
The journey, in one picture
Four rungs. Same underlying AI. Wildly different leverage.
Whatever rung you are on today, the next one is closer than you think. The hard part is not the technology. The hard part is realizing the rung above you exists.
Let me walk through each one, the way I actually lived it.
Stage 1 — Chat. The 3X stage.
For roughly two years, my AI workflow was a tab. ChatGPT in the morning, Copilot in the afternoon, a Claude window open for the longer asks. I would draft an executive update by writing the brief, pasting it into the model, asking for a first cut, then editing it back into my voice. I would build a deck by talking the outline through with the model, copying back the bullets, and rebuilding them in PowerPoint. I would respond to a difficult email by pasting the thread, asking for three replies of varying tone, and choosing the one closest to what I would have written if I had three uninterrupted hours.
It felt like magic in 2023. By the middle of 2025, it was just the way I worked. The honest measurement, when I tracked it: about three times the throughput on most first-draft tasks. Not because the writing was three times better. Because the time-to-first-draft collapsed. The blank page is the expensive part of writing, and the blank page was simply gone.
But the ceiling was real, and after about a year I started feeling it in three places:
- Every turn was a human turn. I drove every conversation and made every copy, every paste, every edit myself. The AI never touched my actual artifacts — my email, my repo, my deck file, my data. I was the API.
- Context died at the tab. Close the window, open it tomorrow, and the model has no memory of yesterday's project. I rebuilt context with every session.
- The work above the prompt was still mine. Strategy. Synthesis. Choosing what to ship. All of it still lived in my head, and chat could only help me describe it, not do it.
3X is real. 3X is also a floor. There is a rung above it — and most of you are already on it without knowing it.
(Figure: mental model)
Stage 1.5 — Cowork. The 5X stage.
Then chat got hands.
At some point in the last eighteen months, the chat tab stopped being just a chat tab. ChatGPT got Connectors and Custom GPTs and could now search across my Drive and my GitHub. M365 Copilot started living inside Word and Excel and Outlook, drafting in the document I was editing rather than in a sidebar. Claude added Projects and Skills and a Computer Use beta — same conversational interface, with a memory of my current work and the ability to reach into a few of my actual files. Gemini moved into Gmail and Docs.
None of these is a chat tab in the 2023 sense. They look like chat. They feel like chat. But the AI is now touching my apps — reading my email, drafting in my documents, pulling from my Drive — without me copy-pasting the source material in. That is a different posture, and it deserves its own rung.
The category name in 2026 is Cowork. The four products that matter for most enterprise readers:
- Microsoft 365 Copilot / Frontier — Outlook, Word, Excel, PowerPoint, Teams, OneDrive, SharePoint, the Graph. The most-deployed Cowork surface in the enterprise today.
- Claude Projects + Skills + Computer Use — project memory, named skill bundles, and screen/browser access via Computer Use beta.
- ChatGPT Enterprise + Connectors + Custom GPTs — searches across Drive, M365, GitHub, internal corpora; Custom GPTs with actions for specific workflows.
- Gemini in Workspace — Gmail, Drive, Docs, Sheets, Calendar, with the conversation living natively in each surface.
The honest measurement on my own work: about five times the throughput on tasks where the AI's value is the retrieval and the first draft. About 2X beyond raw chat — meaningful, but bounded. The blank page still disappears, and now the source material disappears too: I no longer have to copy/paste my inbox into the prompt, paste the answer into Word, and shuttle context between tools. The AI does that part for me.
But the ceiling is real, and it is the ceiling that explains why the Build rung still matters. Cowork is still chat-driven. I am still typing every turn. The AI helps me draft a Word doc, but I am still in the doc, in conversation, accepting or rejecting suggestions one at a time. The AI is doing more of the work inside my apps, but I am still the one running the work. To get the next 2X, I have to give up the conversation altogether and let an agent run a full loop on my behalf. That is what Build is.
(Figure: mental model)
Stage 2 — Build. The 10X stage.
After about a year in Cowork, the ceiling started to feel real. The AI was in my apps, but I was still in the conversation. The first time I opened Cursor, I asked an agent — using the same family of models I had been chatting with for two years — to fix a broken link on this website. It did. Without me copying or pasting anything. It read my repository, found the bad reference, fixed three other ones I did not know about, ran the linter, opened the diff for me to review, and waited.
I sat there for a minute. The work I had been describing in chat for two years had just been done in front of me. By the same underlying AI.
That was the rung change. I stopped typing drafts and started directing them. I stopped using ChatGPT as a smart colleague stuck on the other side of a glass wall, and started using AI as something that could reach into my actual environment — my files, my terminal, my data, my deploy pipeline — and produce real artifacts I could ship.
The tools at this rung have a category name now: agent IDEs, or what I called "Build" in my analytical post on the three postures. The big four in May 2026 are:
- Cursor — the one I live in. An IDE that gives the agent your codebase, your terminal, your editor, and a kill switch on every action.
- Claude Code — Anthropic's terminal-native cousin. Same posture, different surface.
- Codex CLI — OpenAI's open-source equivalent. Slightly more script-friendly.
- OpenCode — the open-source community variant of the same idea.
What they share is the shift from call-and-response to direct-and-review. You set a goal. The agent runs a loop — think, act, observe, repeat — touching real tools. You read the work product, not the keystrokes.
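That think-act-observe loop is simple enough to sketch. This is a toy illustration of the pattern, not any vendor's actual API — every name here (`AgentRun`, `Step`, the `policy` and `tools` arguments) is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    thought: str      # what the agent decided to do
    action: str       # the tool it invoked (e.g. "edit_file", "run_linter")
    observation: str  # what came back from the real environment

@dataclass
class AgentRun:
    goal: str
    steps: list[Step] = field(default_factory=list)
    done: bool = False

    def loop(self, policy, tools, max_steps=10):
        """Think -> act -> observe, until the goal is met or the budget runs out."""
        for _ in range(max_steps):
            thought, action, args = policy(self.goal, self.steps)
            if action == "finish":
                self.done = True
                break
            observation = tools[action](*args)  # the agent touches real tools
            self.steps.append(Step(thought, action, observation))
        return self  # the human reviews the work product, not the keystrokes
```

The important design point is the last line: the human's interface to the run is the finished artifact plus the step log, not a turn-by-turn conversation.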
A common misread of this rung is that it is "for engineers." It is not. The same shift applies to anything you ship as a deliverable. In the last month I have used Build tools to produce PowerPoint charts directly from the underlying data, run Excel analyses across multi-tab spreadsheets where the agent wrote the formulas and explained the result in plain English, draft a whitepaper from a folder of source material, and assemble an executive brief that pulled facts from three different documents at once — none of it involving a single copy-paste. The agent reaches for the file, builds the chart, generates the slide, writes the narrative, and shows me the diff to approve.
This is the part that surprised me most when I migrated off the chat tab. The old workflow was a relay race: ask ChatGPT for the bullets, copy them out, paste them into PowerPoint, ask again for a chart description, build the chart by hand, ask again to refine the language, paste it back, eyeball the formatting. Each round trip was minutes of friction, and the AI never actually touched the deck. Build collapses the relay into a single direction — "produce the deliverable" — and the agent does it end-to-end. For decks, models, briefs, and reports, that single change is where most of the 10X actually shows up for non-engineers. Code is the easiest example to film. It is not the most valuable use of this rung for most professionals.
The honest measurement on my own work: about ten times the throughput on anything that touches a real system. This website. My homelab scripts. The analyses I now ship instead of outlining. The data pipelines I used to dread. Code that used to take me a Saturday now takes me an hour, because I am no longer the one typing it — I am the one deciding whether what got typed is good enough.
But there is still a ceiling, and after a few months I started feeling it again. The work happens while I am in session. When I close the laptop, the agent stops. The agent is still my colleague — a phenomenally productive one — but the work is still mine to start. There is a rung above this one too.
(Figure: mental model)
Stage 3 — Automate. The 30X stage.
The shift that genuinely surprised me happened on a Tuesday at 2:27 AM. My phone buzzed once. A message: "Auto-remediated the SSL renewal on the lab box. Two services bounced cleanly. No action needed. Next renewal in 89 days."
I read it, said "thanks," and went back to sleep. The work was already done.
For the past few months I have been running a small estate of agents in my homelab using three commercial tools: Hermes, Zeroclaw, and Agent Zero. The estate runs on hardware I own — a GB10 (NVIDIA DGX Spark), an RTX 6000, and an RTX 5090 — fronted by LLM routers that match each task to the right model. The agents watch for triggers. They run on schedules. They handle the routine. They talk to each other when one of them needs help. They reach out to me only when something genuinely requires a human — a permission they do not have, a decision that exceeds their authority, an ambiguity about intent.
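The router piece of that setup is the easiest to picture. A hedged sketch, assuming a simple ordered-rule design — the model names and task fields are illustrative, not what any real LLM router ships with:

```python
# Match each task to the cheapest model that can handle it.
# Rules are checked in order; the first matching predicate wins.
ROUTES = [
    (lambda t: t["kind"] == "classification", "local-small"),       # cheap local model
    (lambda t: t.get("tokens", 0) > 50_000,   "long-context-large"), # big-context jobs
    (lambda t: True,                          "general-medium"),     # fallback
]

def route(task: dict) -> str:
    for predicate, model in ROUTES:
        if predicate(task):
            return model
```

The point of routing is economic, not technical: the estate only pays frontier-model rates for the minority of tasks that actually need them.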
The use cases stretch well past the technical. In the same week the SSL story happened, the estate read an email asking for time on my calendar, opened Outlook on my behalf, drafted the invite with the right attendees, time, and agenda — leaving me with the only step that mattered, hitting send. It monitored a third-party booking site for an opening I had been after, registered me the moment a slot appeared, and confirmed the reservation. After meetings where actions get assigned, the estate transcribes the conversation, extracts the commitments, drops the action items into my inbox, and writes them straight into my digital tracking system so nothing falls through. It drafts the follow-up email from the same transcript and queues it for my review the next morning. And during meetings themselves, it assembles a real-time battlecard — pulling relevant history on every person on the call, the prior decisions, the open questions — so I am never caught unprepared by a topic the room has been working through for years before I joined.
None of these required me to start the work. All of them were standing instructions I had given the estate in plain English — and from then on, the agents either acted on my behalf inside the rules I had laid down, or paged me when they hit a decision I had not delegated.
This is the natural maturation step from Build. The workflows I prototyped manually in Cursor — the script that polls, the function that books, the template that drafts — are the same workflows I have since handed to the estate to run on their own. Build is where I figure out how the work should be done. Automate is where the work then runs without me, and only finds me when it should.
Most nights the estate handles several dozen things. Some nights it wakes me for one. The ratio that matters is not how many things it did. It is how many things it did that I never had to think about.
That is the third rung. The defining shift is not speed. The defining shift is that the work is no longer something I start. It happens on triggers — a webhook, a schedule, an alert breach, a queue depth. The agents pick it up and run. I am paged on exception, not invited to every step.
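The "paged on exception" pattern is simple to write down. A minimal, hypothetical illustration — `Trigger`, `NeedsHuman`, and `page_human` are invented names, not part of any of the tools mentioned above:

```python
from dataclasses import dataclass
from typing import Callable

class NeedsHuman(Exception):
    """Raised when an agent hits a decision it was not delegated."""

@dataclass
class Trigger:
    name: str                       # e.g. "ssl_renewal_due", "webhook:booking"
    handler: Callable[[dict], str]  # returns a summary, or raises NeedsHuman

def dispatch(trigger: Trigger, event: dict, page_human: Callable[[str], None]):
    try:
        summary = trigger.handler(event)
        return ("auto_handled", summary)  # the routine path: never reaches a human
    except NeedsHuman as exc:
        page_human(str(exc))              # the exception path: wake the human
        return ("escalated", str(exc))
```

The asymmetry in this sketch is the whole point of the rung: the human is on the exception path, not the happy path.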
The honest measurement: about thirty times my baseline. Not because any single task is thirty times faster — by Stage 3, the per-task speedups have already happened in Stage 2. The thirty comes from a different math entirely:
- The work runs while I sleep. My day has effectively doubled.
- Most of the routine handling never reaches my attention at all. My focus has effectively multiplied.
- The handful of decisions that do reach me arrive with full context already assembled. I am making the call, not gathering the evidence.
This is the rung most leaders have not yet entered, and the one most enterprises will be living in by the end of 2027. It is also where AI stops being a productivity feature and starts being an operating model.
(Figure: mental model)
What changed underneath each multiplier
Before any executive rolls these numbers into a model — let me be honest about where they come from. These are my measured multipliers on my own tasks, not universal claims. The shape generalizes; the exact multiples will not.
Here is what each rung is actually measuring:
| Stage | What I am measuring | Why the multiplier shows up |
|---|---|---|
| Chat — 3X | Time to first draft on writing-heavy tasks | The blank page disappears |
| Cowork — 5X | Time saved by skipping the copy-paste relay between AI and apps | The AI fetches and drafts inside your apps; you still drive every turn |
| Build — 10X | Wall-clock time on tasks that touch real artifacts | The agent does the typing; I do the deciding |
| Automate — 30X | Total throughput across a 24-hour day | Work runs on triggers, not on me |
The shape that should matter to a CFO is not the exact numbers. It is the asymmetry: each rung roughly doubles or triples the leverage of the one below it, and the cost of climbing is dramatically less than the cost of staying. By Stage 1.5, you are paying for enterprise Cowork licenses (most CFOs already are). By Stage 2, you are paying for an agent IDE seat. By Stage 3, you are paying for an agent estate — and the agent estate is doing work that, billed at human rates, would dwarf what it costs to run.
What you hand to the AI at each phase
There is another way to read this ladder — not by what you get out of it, but by what you hand to the AI on the way in. Each phase is a wider perimeter of trust. The AI is not getting smarter at each rung; I am letting it reach further into my work.
- Phase 1 — I hand it words. I am the bridge. I read the document, copy the relevant passage, paste it into the prompt, copy the answer back, and put it in the deck or the doc myself. The AI never touches my apps; I shuttle the bytes between them. I am inserting AI at every step, manually.
- Phase 1.5 — I hand it parts of my apps, on request. The AI reaches into Outlook, Word, Drive, my code, when I prompt it. The conversation stays mine; the retrieval and drafting do not. I no longer need to copy/paste the source material in.
- Phase 2 — I hand it my apps. The AI gets full access to PowerPoint, Excel, Word, my codebase, my browser, my files. I tell it what artifact I need; the agent opens the right app and builds it end-to-end. I review the artifact, not the keystrokes.
- Phase 3 — I hand it a computer. The AI gets a machine of its own, an inbox, a calendar, a small fleet of sub-agents, and a set of standing instructions in plain English. From there, it runs — and only finds me when it should.
The mental shift at each phase is not technical. It is one of trust. The hard part is not the tooling — every tool on this ladder is sitting on a website, ready to be opened in the next ten minutes. The hard part is being willing to hand the AI a wider perimeter of your work. Most leaders in 2026 are stuck not because their AI is not capable enough. They are stuck because they have not yet decided to hand it the next perimeter — and most are quietly stuck at Phase 1.5 (M365 Copilot or ChatGPT Enterprise rolled out widely), believing that is their AI strategy. Cowork is the most-deployed and least-understood rung on the ladder.
The executive translation
If you are non-technical, here is the part of this post built for you. Three reframes that should reshape how you think about your AI roadmap on Monday morning:
1. Stop asking which model. Start asking which mode your highest-cost employees actually live in.
The model debate — GPT versus Claude versus Gemini — barely moves the needle inside any one rung. The rung itself moves the needle by 2–3X every time you climb. If your most senior people are still living in Stage 1 (a ChatGPT tab on a second monitor) while your competitors have moved them to Stage 2 (a sanctioned agent IDE on their primary work surface), you are losing 3X of leverage on the most expensive payroll line in your company. Forever. Quietly. Compounding.
2. The 3X → 10X step is a tooling decision. The 10X → 30X step is an operating-model decision.
Climbing from Chat to Build is a procurement question — sanction the right surface, write a thin policy on tool access, give your top engineers and analysts a Build-class agent. You can do it in a quarter. Climbing from Build to Automate is a governance question — what triggers can fire an agent without a human, what permissions does each agent hold, how does an agent escalate to a human, what does the audit trail look like at 2 AM. That work takes an organizational decision, not a tool decision. Most enterprises will move quickly through the first step and stall on the second. The ones that do not stall will run differently than the ones that do.
3. If your AI budget is 100% Chat or 100% Cowork licenses, you are funding table stakes and missing the leverage.
Most large enterprises in 2026 are roughly 0/95/5/0 across the four rungs — almost everyone on M365 Copilot or ChatGPT Enterprise (Cowork), almost no one on Cursor (Build), almost no one on agent estates (Automate). That looks like progress because the Cowork rollout was real progress over a Chat-only world. It is also where your competitors will quietly stop. The shape that matters over the next 18 months is something more like 10/45/30/15 — Chat as the floor, Cowork as the volume tier, Build as the leverage tier, Automate as the operating-model tier. The first board meeting where you can show four columns instead of two is the meeting where your AI strategy stops being a vendor list and starts being an operating plan.
The close — start where you are
You do not have to do everything in this post tomorrow. You just have to know which rung you are on, and what the next one looks like.
- If you are reading this on a phone between meetings: open ChatGPT or Copilot tonight on a real task — your hardest email, your messiest brief, your draft of next week's all-hands. Get your 3X. That is your floor, and it is genuinely a floor — not the ceiling most people quietly treat it as.
- If your enterprise has rolled out M365 Copilot, Claude Enterprise, or ChatGPT Enterprise: you are at ~5X, not 3X. That is genuinely a step up — and it is also where most enterprises will quietly stop. The next step is putting a Build surface in front of your top operators. That is where the 10X starts to compound.
- If you have started using Build tools — Cursor, Claude Code, Codex, OpenCode: point them at something you would actually ship. Not a toy. A real deck, a real analysis, a real briefing, a real piece of code — something that touches a real artifact and goes to a real audience. Get your 10X. This is the rung most teams underestimate by about an order of magnitude.
- If you have agents running already: graduate one of them from supervised to triggered. Pick a workflow with a clear signal and a recoverable failure mode. The first thing that runs while you sleep is the moment your operating model changes — and once it changes, it does not change back.
So here is what I have actually learned, after eighteen months of climbing.
I did not stop using ChatGPT. I just stopped staying in it. Today, ChatGPT is my fastest route into a workflow — the place where I shape the question. M365 Copilot or Claude Projects is where I assemble most of the context, with the AI's hands already in my apps. Cursor is where I actually do the work that ships. And the routine work, increasingly, I have already moved past — to the agents that run while I do something else.
It is not about stopping ChatGPT. It is about graduating from it.