Updated May 28, 2026

Why Model Routing Is Now a Marketing Skill

I cut my model bill 64% by routing every job to the right model instead of defaulting to the expensive one. Here is the map: Opus, Sonnet, Haiku, and more.

Opus for planning. Sonnet for writing. M2.7 for grunt work. The marketer who doesn’t route is the marketer paying 8x.

I cut my own model bill by about 64% last month. Same workflows, same outputs. The change wasn’t a clever prompt or a new tool. It was structural: I stopped letting Claude or Codex pick the model, and started routing every job myself.

The cost differential is bigger than you’d guess. Opus 4.7 runs roughly $25 per million output tokens. MiniMax M2.7 is about $0.03 on a token plan. That’s an 800x spread, top to bottom. In between, in cost order: Sonnet 4.6 (around $15), Gemini 2.x Pro (around $12), Haiku 4.5 (around $5), Grok 4.3 (around $2.50, with a 1M context window).

Absolute prices keep dropping. The 800x structure between them doesn’t.
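To sanity-check that spread, here is a minimal Python sketch built from the approximate output prices quoted above. The dictionary and function names are made up for illustration, and the figures are this article's rough numbers, not live pricing.

```python
# Approximate output prices in USD per million tokens, as quoted in this article.
# They are rough and will drift; swap in current numbers before relying on them.
OUTPUT_PRICE_PER_MTOK = {
    "Opus 4.7": 25.00,
    "Sonnet 4.6": 15.00,
    "Gemini 2.x Pro": 12.00,
    "Haiku 4.5": 5.00,
    "Grok 4.3": 2.50,
    "MiniMax M2.7": 0.03,
}


def spread(prices: dict[str, float]) -> float:
    """Ratio of the most expensive price to the cheapest one."""
    return max(prices.values()) / min(prices.values())


def multiple_of(prices: dict[str, float], pricey: str, cheap: str) -> float:
    """How many times more the pricey model costs than the cheap one."""
    return prices[pricey] / prices[cheap]


if __name__ == "__main__":
    print(f"Top-to-bottom spread: about {spread(OUTPUT_PRICE_PER_MTOK):.0f}x")
    print(f"Opus vs Haiku: {multiple_of(OUTPUT_PRICE_PER_MTOK, 'Opus 4.7', 'Haiku 4.5'):.1f}x")
```

The exact ratio lands a little above 800x; the round number is close enough for a routing decision.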

Most of us pick one provider and stay there. Whatever’s the default. Usually the expensive one. That single decision, made once and never revisited, is where the 64% was hiding.

This is one of those routing calls that compounds. It’s the same instinct under the whole AI tool comparison framework: you don’t pick a tool, you route each job to the one that’s best at it. Cost is just one more axis you route on.

Top to bottom by cost (output per million tokens)

Opus 4.7 (about $25). Reasoning, plan reviews, architecture decisions, strategic synthesis across a pile of context. The judgment calls. Don’t reach for it on regular work.

Sonnet 4.6 (about $15). Your workhorse for content. Drafting, editing, anything customer-facing that needs voice and nuance but not deep reasoning. Sonnet wrote most of this newsletter.

Gemini 2.x Pro (about $12). Research, and anything visual. The native multimodal makes it my call for image work: asset audits, brand consistency checks, screenshot review. Its research is Google-native too, so it surfaces signal the others miss.

Haiku 4.5 (about $5). The mechanical parts of a workflow. Tagging emails, categorizing leads, running a quality rubric. Classification, scoring, formatting. Anywhere you’re checking something against a list.

Grok 4.3 (about $2.50, 1M context). SEO and social research. Competitor teardowns, thread-mining, SERP analysis. That big context window is the point: you feed it everything and ask it to surface the patterns.

MiniMax M2.7 (about $0.03 on a yearly token plan). Bulk. Where Haiku felt cheap, this feels close to free. Mass classification, signal scoring, tagging at volume. Anywhere you’re running the same job across thousands of records. You can run it locally on your own machine, too.

What this looks like in practice

I run a content pipeline with six stages: ideation, outline, draft, quality check, voice polish, publication formatting.

Two of those stages run on Haiku. Three on Sonnet. One on Opus.

  • Ideation (Haiku). Generating candidate topics by scanning signals. Mechanical, list-driven, no judgment yet.
  • Outline (Opus). The only Opus step in the chain. Structure decisions, competitor analysis, SEO judgment. This is where reasoning actually matters.
  • Draft (Sonnet). Turning the outline into prose. Voice work, not reasoning.
  • Quality check (Haiku). Scoring against a 6-dimension rubric. Comparison, not creation.
  • Voice polish (Sonnet). Applying MA voice. Pure writing.
  • Publication formatting (Sonnet). Channel-specific formatting for newsletter, blog, social. Pattern application.

One Opus step. The other five run on models that are roughly 1.7x (Sonnet) to 5x (Haiku) cheaper per output token at the prices above, and the output is actually better. I ran it both ways for over a month before I trusted it.
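As a rough illustration, the stage-to-model map above can be priced against an all-Opus run. This is a sketch under loud assumptions: the per-stage output volume (2,000 tokens each) is hypothetical, only output tokens are counted, and the prices are the approximate ones from this article.

```python
# Approximate USD per million output tokens (this article's rough figures).
PRICE = {"opus": 25.0, "sonnet": 15.0, "haiku": 5.0}

# Stage -> model, copied from the list above.
PIPELINE = [
    ("ideation", "haiku"),
    ("outline", "opus"),
    ("draft", "sonnet"),
    ("quality check", "haiku"),
    ("voice polish", "sonnet"),
    ("publication formatting", "sonnet"),
]

# HYPOTHETICAL: every stage emits the same number of output tokens.
# Replace with real usage numbers from your own bill.
ASSUMED_OUTPUT_TOKENS = {stage: 2_000 for stage, _ in PIPELINE}


def run_cost(assignments, tokens, prices):
    """Output-token cost in USD for one pass through the pipeline."""
    return sum(tokens[stage] / 1_000_000 * prices[model]
               for stage, model in assignments)


routed = run_cost(PIPELINE, ASSUMED_OUTPUT_TOKENS, PRICE)
all_opus = run_cost([(stage, "opus") for stage, _ in PIPELINE],
                    ASSUMED_OUTPUT_TOKENS, PRICE)
saving = 1 - routed / all_opus

if __name__ == "__main__":
    print(f"routed: ${routed:.2f}  all-Opus: ${all_opus:.2f}  saving: {saving:.0%}")
```

With equal volume per stage, the routed run costs a little over half of the all-Opus run. Real savings depend on how many tokens each stage actually produces, so treat this as a template, not a forecast.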

The two pitfalls

Opus for everything. You pay 5x and you don’t get 5x back. Opus is overkill for most marketing work, but the bill doesn’t know that.

Haiku for synthesis. The opposite mistake: handing a judgment call to a model built for mechanical work. The test is simple.

  • If the task is “check this against that list” or “format this in this shape,” use Haiku.
  • If it’s “decide what matters here,” use Opus.
  • If it’s “say this well,” use Sonnet.
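The three-way test above fits in a few lines of Python. This is only a sketch: the `JobKind` names and the sample jobs are hypothetical, and the tagging itself is still a human call.

```python
from enum import Enum


class JobKind(Enum):
    CHECK = "check"    # "check this against that list" / "format this in this shape"
    REASON = "reason"  # "decide what matters here"
    WRITE = "write"    # "say this well"


# The routing rule from the three bullets above.
MODEL_FOR = {
    JobKind.CHECK: "Haiku",
    JobKind.REASON: "Opus",
    JobKind.WRITE: "Sonnet",
}


def route(kind: JobKind) -> str:
    """Return the model tier this article suggests for a kind of job."""
    return MODEL_FOR[kind]


if __name__ == "__main__":
    # Hypothetical jobs, tagged by hand.
    jobs = [
        ("score draft against rubric", JobKind.CHECK),
        ("pick next quarter's themes", JobKind.REASON),
        ("rewrite the welcome email", JobKind.WRITE),
    ]
    for name, kind in jobs:
        print(f"{name} -> {route(kind)}")
```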

The 80/20

Roughly 70 to 80% of marketing AI work is mechanical. Lead scoring, email categorization, draft tagging, quality checks, ad reports, SEO audits, content classification. That tier belongs to Haiku and M2.7. Another 15 to 25% is writing and applied research: drafts, replies, polish, competitor teardowns, image review. That tier belongs to Sonnet, Grok, and Gemini.

The last 5% is real strategic reasoning. The plan reviews. The “should we even do this?” calls. The synthesis across 40 documents. That’s where Opus or ChatGPT 5.5 earns its bill.
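To see what that mix does to an average price, here is a back-of-the-envelope sketch. It assumes, hypothetically, that output tokens split the same way the work does (75% mechanical, 20% writing, 5% reasoning, the midpoints of the ranges above) and that Haiku, Sonnet, and Opus stand in for the three tiers at this article's approximate prices.

```python
# Approximate USD per million output tokens (this article's rough figures).
PRICE = {"Haiku": 5.0, "Sonnet": 15.0, "Opus": 25.0}

# HYPOTHETICAL token shares: assumes tokens follow the same split as tasks.
MIX = {"Haiku": 0.75, "Sonnet": 0.20, "Opus": 0.05}


def blended_price(mix, prices):
    """Weighted average price per million output tokens for a workload mix."""
    total_share = sum(mix.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * prices[model] for model, share in mix.items())


blended = blended_price(MIX, PRICE)
saving_vs_all_opus = 1 - blended / PRICE["Opus"]

if __name__ == "__main__":
    print(f"blended: ${blended:.2f} per million output tokens")
    print(f"saving vs all-Opus: {saving_vs_all_opus:.0%}")
```

Under those assumptions the blended price is roughly a third of the all-Opus price. Shift the shares to match your own workload before reading anything into the number.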

Model routing isn’t a feature you turn on. It’s an operational habit, and it sits right at the center of practical AI marketing. The same way you don’t run paid traffic without UTMs on the links, you don’t run an AI workflow without a routing strategy.

Start by listing every task you handed to AI this week. Tag each one: reason, write, or check. Then look at where your money actually went. The mismatches are the savings.
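That audit can be sketched as a short script. Everything in the sample log is hypothetical (task names, token counts, which model was used), and the prices are this article's approximate output prices; the point is the shape of the check, not the numbers.

```python
from dataclasses import dataclass

# Approximate USD per million output tokens (this article's rough figures).
PRICE = {"Opus": 25.0, "Sonnet": 15.0, "Haiku": 5.0}

# Tag -> suggested model, following the reason / write / check split.
RECOMMENDED = {"reason": "Opus", "write": "Sonnet", "check": "Haiku"}


@dataclass
class LoggedTask:
    name: str
    tag: str            # "reason", "write", or "check"
    model_used: str
    output_tokens: int


def cost(model: str, tokens: int) -> float:
    """Output-token cost in USD."""
    return tokens / 1_000_000 * PRICE[model]


def find_overspend(tasks):
    """Tasks that ran on a pricier model than their tag calls for,
    as (task, suggested_model, estimated_saving), largest saving first."""
    rows = []
    for task in tasks:
        target = RECOMMENDED[task.tag]
        saving = cost(task.model_used, task.output_tokens) - cost(target, task.output_tokens)
        if saving > 0:
            rows.append((task, target, saving))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# HYPOTHETICAL week of usage.
WEEK = [
    LoggedTask("tag inbound leads", "check", "Opus", 400_000),
    LoggedTask("draft newsletter", "write", "Opus", 60_000),
    LoggedTask("quarterly plan review", "reason", "Opus", 20_000),
    LoggedTask("score drafts against rubric", "check", "Haiku", 150_000),
]

if __name__ == "__main__":
    for task, target, saving in find_overspend(WEEK):
        print(f"{task.name}: {task.model_used} -> {target}, saves about ${saving:.2f}")
```

This only flags overspend. A task tagged "reason" that ran on Haiku is also a mismatch, but it costs you quality rather than money, so it needs a separate look.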

Pull one model bill apart this weekend

I send one email a week for marketers who’d rather route than overpay. Every issue takes a real workflow apart, names which model I handed each step to, and shows what it cost once the dust settled. If “stop defaulting to the expensive one” landed, the rest of the routing map is waiting in the inbox.
