Claude Code Tips That Actually Save Tokens & Money (2026 Field Guide)

Most "Claude Code tips" articles are wrong. They're SEO filler written by people who opened the tool twice. This one isn't — every tip below is checked against the current docs and changelog, and it leans on the stuff that actually moves the needle: the habits that cut your bill.

There's an interactive quiz at the end if you'd rather find out the hard way how much you're leaving on the table.

The one mental model that explains most of the savings: every message re-sends the entire conversation as input tokens. Context discipline — not shorter prompts — is the dominant lever. Everything below follows from that.

Context is the bill

/clear between unrelated tasks. If you'd open a new document for it, you should clear. This is the single biggest token lever there is — a long all-day session re-bills its early context on every later message.
/compact only when you need continuity. /compact Focus on the auth bug keeps what you name — but compaction itself costs an LLM pass, so prefer /clear for a clean switch.
/context shows exactly what's filling the window: system prompt, CLAUDE.md, MCP tools, file reads, history. Run it first when a session feels heavy.
/usage attributes spend to skills, subagents, plugins, and individual MCP servers. /cost shows the session breakdown.
Course-correct early with Esc. Stopping a wrong path keeps a polluted transcript from being re-sent every turn afterward.

Right-size the model

Haiku for mechanical work (renames, boilerplate), Sonnet for daily coding, Opus for hard reasoning. Haiku is roughly 5× cheaper than Opus on input. /model switches the current session.
Thinking bills as output. Reserve ultrathink (and high /effort) for genuinely hard problems; drop effort for routine work.
Fast mode gives you Opus output at higher speed for snappier iteration.

Protect the cache

Prompt caching can make repeated context ~10× cheaper to re-read. But it's a prefix match — editing CLAUDE.md mid-session, swapping context files, or idling past the TTL invalidates it.
On the API, keep volatile content (timestamps, request IDs) after your last cache breakpoint, or you bust the whole prefix on every call.

The inputs that save searching

@file#10-40 injects exactly those lines — no grep, no full-file read.
! bash prefix runs a shell command inline; # saves a note to memory mid-flow.
/btw answers a quick aside without adding it to history — keeps your task context clean.
Keep CLAUDE.md small. It rides in every prompt; treat it as a lookup table, not a brain dump.

Delegate the verbose stuff

Run tests, log-crunching and wide searches in a subagent — it works in its own context and only the summary returns to your window. Set a subagent's model to Haiku for cheap delegated work.
Move specialized, rarely-needed instructions into a Skill — the description is cheap; the full content only loads when invoked.

Make the machine enforce it

Hooks fire deterministically (CLAUDE.md is only advisory). A PostToolUse hook that auto-formats edited files, or a PreToolUse deny-gate for rm -rf/.env, is worth setting up once.
A hook that greps a 10k-line log for ERROR before Claude sees it saves a huge chunk of context on noisy commands.

The mistakes that cost the most

One all-day mega-session. Opus for trivial edits. Vague prompts that trigger repo-wide scanning. Runaway agent loops with no stop condition. A giant CLAUDE.md. Each one is a quiet, recurring tax — and each is a one-line fix.

Now test yourself. I built an interactive quiz — find out if you're a context ninja or a token bonfire. No login, no email.

Pricing, model details and cache behaviour change often — verify current numbers at code.claude.com/docs before relying on them.

Stop Burning Tokens: A Field Guide to Claude Code