Token Optimization Best Practices - Visual Guide
Level 1 · Quick guide

Six scenes, sixty seconds. The full playbook — calculator, real workflows, pricing — is one click away.

Scene 1 · What rides every prompt

Your question is the smallest piece.

Each turn, Copilot quietly assembles a packet and sends it to the model. The thing you typed is the thin slice on the right. Everything else is automatic — but you pay for all of it.

The packet, segment by segment:
  • SYSTEM: system prompt (fixed)
  • INSTRUCTIONS: copilot-instructions.md
  • TOOLS · MCP: tool & MCP schemas
  • HISTORY: chat history
  • RETRIEVED CONTEXT: retrieved files
  • YOU: your message
Implication: shrinking your own message saves almost nothing. The leverage is in the other five segments.
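
To make the proportions concrete, here is a minimal sketch that computes each segment's share of one packet. The token counts are hypothetical placeholders chosen for illustration, not measurements; substitute figures from your own usage.

```python
# Hypothetical token counts for a single turn (illustrative assumptions only).
packet = {
    "system prompt": 2_000,
    "copilot-instructions.md": 800,
    "tool & MCP schemas": 6_000,
    "chat history": 9_000,
    "retrieved files": 12_000,
    "your message": 200,
}


def segment_shares(segments: dict[str, int]) -> dict[str, float]:
    """Return each segment's share of the total input tokens, in percent."""
    total = sum(segments.values())
    return {name: 100 * tokens / total for name, tokens in segments.items()}


shares = segment_shares(packet)

# Print the segments from largest to smallest share.
for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<26}{pct:5.1f}%")
```

With these assumed numbers the typed message is under 1% of the packet, so even halving it barely changes the total.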
Scene 2 · Try the toggles

Six habits. Turn them on, see where the leverage is.

Each habit below carries a directional impact label (High, Medium, or Low). The benefits stack as you adopt more of them.

Heads up — these are estimates, not guarantees. The High / Medium / Low labels are directional only: a qualitative ranking of each habit's relative leverage, based on patterns commonly seen across engineering teams. They are not benchmarks, not measurements, and not a promised cost reduction for your project. Actual impact depends on your model, workload, codebase size, caching behaviour, and team workflow — and may be higher, lower, or negligible. Always measure on your own usage.

Further reading: GitHub Well-Architected — Managing AI credits · Improving token efficiency in agentic workflows · Prepare for usage-based billing — take action before the transition.
Point at the file you're working on
Send one file or your selection instead of the whole repo.
High impact
Use Auto Mode (recommended)
Let Copilot route each turn to the best-fit model. Auto Mode is becoming the default policy across orgs and applies a documented ~10% token-multiplier discount.
High impact
Keep your custom instructions short and steady
If they don't change between turns, they get cached — input cost drops a lot.
Medium impact
Don't crank reasoning effort to max
Low or medium is fine for most work. Save high for genuinely ambiguous tasks.
Low impact
Turn off tools you're not using on this project
Every enabled tool's description gets sent on every message — even when unused.
High impact
Start a new chat when you switch tasks
Old turns ride along on every new message until you reset.
Medium impact
The takeaway: you don't need all six. Adopt two or three that fit your workflow and the numbers move fast — because each habit cuts a different factor.
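
Why resetting chats matters can be shown with simple arithmetic. The sketch below assumes every turn resends a fixed overhead plus all earlier turns, with no caching or compaction; the function name and all numbers are hypothetical.

```python
def cumulative_input_tokens(turns: int, fixed_overhead: int, tokens_per_turn: int) -> int:
    """Total input tokens across a chat where each turn resends the full history.

    Turn k sends the fixed overhead plus k turns' worth of content
    (the k - 1 earlier turns and the current one).
    """
    return sum(fixed_overhead + k * tokens_per_turn for k in range(1, turns + 1))


# One 20-turn chat versus the same 20 turns split into two fresh 10-turn chats.
one_long_chat = cumulative_input_tokens(turns=20, fixed_overhead=3_000, tokens_per_turn=1_500)
two_short_chats = 2 * cumulative_input_tokens(turns=10, fixed_overhead=3_000, tokens_per_turn=1_500)

print(one_long_chat)    # 375000
print(two_short_chats)  # 225000
```

Under these assumptions history cost grows roughly with the square of the turn count, which is why splitting a session at a task boundary cuts the total even though the fixed overhead is paid the same number of times.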
Scene 3 · The wider map

Zoom out: every customization has a price.

The chat habits in Scene 2 are the half you control by hand. The other half is the customization system itself: instructions, prompts, skills, agents, hooks, and MCP tools. They sort into six tiers by when they load, i.e. how often they get sent to the model. The top tier rides along on every request; the bottom tier may never be sent at all.

New to these? Here is a one-line primer for each.
instructions

Markdown files Copilot reads as standing rules. copilot-instructions.md loads every turn; *.instructions.md with applyTo: loads only when matching files are in context.

e.g. "Always use TypeScript strict mode. Prefer functional components."
prompts

Reusable prompt templates you trigger with /name in chat. Think saved macros for recurring tasks — loaded only when you call them.

e.g. /review-pr, /write-test, /refactor-to-hooks
skills

Bundled know-how (a short description + a longer body). The model reads the description on every turn and pulls in the body only when it decides the skill is relevant.

e.g. a "deploy-to-azure" skill that knows your stack's exact CLI sequence.
agents

Custom chat personas with their own system prompt + tool set, invoked via @name. Subagents are spawned by the model; only their summary returns — cheap context-wise.

e.g. @security-reviewer, @db-migration-expert
hooks

Scripts that run on editor events (save, commit, edit) — outside the model. Never sent to the model; great for lint, format, or auto-running tests.

e.g. run prettier on save, pytest on commit.
MCP tools

External tools (servers) the model can call mid-conversation — databases, GitHub, browsers, your own APIs. Like skills but they do things instead of just adding context.

e.g. query your prod DB, open a PR, fetch a Jira ticket.
  • Every request: copilot-instructions.md, skill descriptions
  • On glob match: *.instructions.md with applyTo:
  • You invoke: prompts (/name), custom agents (@name)
  • Model decides: skill bodies, MCP tool calls
  • On event: hooks (run outside the model)
  • Agent spawns: subagents (only a summary returns)
Read the bar as frequency, not size. It shows how often each kind of customization is sent to the model — not how many tokens it adds. Actual token usage depends on the content size of the file or tool description too. A short copilot-instructions.md sent every turn can easily cost less than a giant skill body the model pulls in occasionally. Use this ladder to decide where to put something; measure your own usage to know what it costs.
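
The frequency-versus-size point can be sketched as an expected-value calculation: average tokens per request is roughly size times how often the item is loaded. The sizes and load probabilities below are hypothetical, and the helper name is an assumption for illustration.

```python
def expected_tokens_per_request(size_tokens: int, load_probability: float) -> float:
    """Average tokens an item adds per request: its size times how often it loads."""
    return size_tokens * load_probability


# A short instructions file sent on every request (probability 1.0).
short_instructions = expected_tokens_per_request(size_tokens=600, load_probability=1.0)

# A large skill body the model pulls in on an assumed one request in four.
big_skill_body = expected_tokens_per_request(size_tokens=8_000, load_probability=0.25)

print(short_instructions)  # 600.0
print(big_skill_body)      # 2000.0
```

In this made-up case the always-on file is the cheaper of the two, which is the reason to weigh both placement and content size.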
Not sure which one to use? Ask who turns it on:
  • Always on → copilot-instructions.md (or *.instructions.md for specific files)
  • You turn it on → prompt (/name) or custom agent (@name)
  • The model turns it on → skill or MCP tool
  • An event turns it on (save, commit, edit) → hook
Scene 4 · One habit, two outcomes

Same task. Different framing.

The single highest-leverage habit is also the easiest: attach the narrowest scope that lets the model answer.

✗ WASTEFUL
"Why is the checkout broken?
Look at #codebase"
✓ TIGHT
"Why is this assertion failing?
#file:cart.test.ts #selection"
Same answer quality, ~7× smaller bill, faster turn. Reach for #codebase only when narrower scopes have actually failed.
Scene 5 · Copy & paste

Two snippets. Drop into any repo.

Both snippets do the same job: give the model less to read, and keep it the same each turn. Save one as a file so it gets cached and reused. Paste the other into chat to point the model at just this task. Steal them.

.github/copilot-instructions.md
Stable, sub-200 lines: the prefix that gets cached on every turn.
# Project: payments-service

Stack: TypeScript · Node 20 · Fastify · Postgres · Vitest
Style: Functional core, async/await, no classes unless modelling
        an entity. Prefer `Result<T,E>` over throwing.

Tests: Vitest, co-located `*.test.ts`. Aim for behavioural
        tests, not implementation. Use builders, not fixtures.

Don't:
- Don't add new dependencies without flagging it.
- Don't generate migrations — I write those by hand.
- Don't touch `src/legacy/*` unless explicitly asked.

Do:
- When fixing a bug, write the failing test first.
- When unsure, ask one clarifying question.
Task kickoff (paste into chat)
Scopes context to just the files and constraints this task needs.
# Task
<one sentence: what should be true when you're done>

# Scope
- Files I think are relevant: #file:... #file:...
- Don't touch: ...

# Constraints
- <any non-obvious behaviour, perf budget, API surface>

# Plan first
Outline your approach in 3–5 bullets before writing code.
If anything is ambiguous, ask one question and stop.
A filled-in example:
# Task
Webhook retries should give up after 5 attempts instead of looping forever.

# Scope
- Files I think are relevant: #file:src/webhooks/retry.ts #file:src/webhooks/retry.test.ts
- Don't touch: src/legacy/*, the queue config.

# Constraints
- Must keep the existing exponential backoff.
- No new dependencies.

# Plan first
Outline your approach in 3–5 bullets before writing code.
If anything is ambiguous, ask one question and stop.
How this saves tokens: the instructions file is identical on every turn, so the provider serves it from cache — you stop paying full price for the same prefix over and over. The kickoff snippet replaces “here’s the whole repo, figure it out” with a narrow, named slice — fewer input tokens, less reasoning effort, tighter answers. Together they hit the two biggest levers: cache the stable stuff, scope the changing stuff.
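
A rough sketch of why a stable prefix matters, assuming the provider caches on an exact leading prefix of the request. Real caches work at token granularity and have their own expiry and minimum-size rules, so treat this as a simplified model; the segment names are hypothetical.

```python
def shared_prefix_len(previous: list[str], current: list[str]) -> int:
    """Count the leading segments that are identical in two consecutive requests."""
    count = 0
    for before, after in zip(previous, current):
        if before != after:
            break
        count += 1
    return count


turn_1 = ["system", "instructions v1", "tools", "turn 1"]

# Same instructions on the next turn: everything already sent is reusable.
turn_2_stable = ["system", "instructions v1", "tools", "turn 1", "turn 2"]

# Instructions edited between turns: only the segment before the edit still matches.
turn_2_edited = ["system", "instructions v2", "tools", "turn 1", "turn 2"]

print(shared_prefix_len(turn_1, turn_2_stable))  # 4
print(shared_prefix_len(turn_1, turn_2_edited))  # 1
```

Under this model, an edit near the front of the packet invalidates everything after it, so the earlier a segment appears, the more it pays to keep it unchanged.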
Want more? Scoped instructions, reusable prompts, and where to read more.

Scoped per-language rules — .github/instructions/typescript.instructions.md

Frontmatter-scoped via applyTo. Only loaded when matching files are in context, so it doesn’t ride every turn.

---
applyTo: "**/*.ts"
---

# TypeScript rules

- Always include explicit return types on exported functions.
- Prefer `readonly` arrays and `Readonly<T>` for function parameters.
- Use discriminated unions over enums.
- Errors at module boundaries return `Result<T, E>`. Throw only in
  top-level HTTP handlers and test setup.
- No `console.log` in committed code — use the shared `logger`.

You can have several: python.instructions.md (applyTo: "**/*.py"), sql.instructions.md (applyTo: "**/migrations/**/*.sql"). Narrow globs > broad ones.
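
To preview which scoped files would apply to a given path, a small approximation using Python's `fnmatch` can help. This is a sketch only: Copilot's actual glob matching may differ (for example, `fnmatch` requires at least one directory before the filename for these patterns), and the mapping simply restates the example globs above.

```python
from fnmatch import fnmatchcase

# Instruction files and their applyTo globs, taken from the examples in this guide.
APPLY_TO = {
    "typescript.instructions.md": "**/*.ts",
    "python.instructions.md": "**/*.py",
    "sql.instructions.md": "**/migrations/**/*.sql",
}


def matching_instruction_files(path: str) -> list[str]:
    """Return the instruction files whose applyTo glob matches the path (approximate)."""
    return sorted(name for name, glob in APPLY_TO.items() if fnmatchcase(path, glob))


print(matching_instruction_files("src/webhooks/retry.ts"))
print(matching_instruction_files("db/migrations/2024/001_init.sql"))
print(matching_instruction_files("README.md"))
```

A path that matches no glob loads none of the scoped files, which is the saving that narrow globs buy.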

On-demand recipe — .github/prompts/review-pr.prompt.md

For bulky reusable how-tos. You invoke it deliberately with /review-pr, so it doesn’t ride every turn.

---
mode: agent
description: Review the current diff against project conventions.
---

You are reviewing the working copy of this repository.

Focus on:
1. Conformance to TypeScript rules in `.github/instructions/typescript.instructions.md`.
2. Test coverage for new behaviors.
3. Error handling — anything thrown that should be a `Result` instead.

Procedure:
- Read `#changes`.
- For each modified file, list issues as `- file:line — issue — suggested fix`.
- End with a 1–2 sentence summary verdict.

Do not propose edits outside the scope of the diff.

Four prompt snippets worth memorizing

Targeted bug fix
Fix the bug in #selection.

Expected: <one sentence>
Actual:   <one sentence>

Don't change unrelated code.
Don't add error handling for cases that can't happen.
Bounded refactor
Refactor #file to extract the validation logic
into a separate function. Keep the public API unchanged.
Run nothing — just propose the diff.
Diff review (cheap mode)
Review #changes.
Flag: bugs, accidental scope creep, missing tests.
Skip: style nits the linter already catches.
Summarize-and-restart
Summarize what we've decided so far in this chat
in <=10 bullet points. I'm going to paste that into
a fresh chat. Don't propose further changes.

Where to read more

Official docs. Behavior changes; these stay current.

Scene 6 · Monday morning

If you take four things away, take these.

Don't memorise rules — internalise principles. These four cover the bulk of token waste; everything else is a refinement.

01

Turn on Auto Mode.

One toggle, set once. Copilot routes each turn to the best-fit model and applies a documented ~10% token-multiplier discount, making it one of the highest-leverage things you can do.

MODEL
02

Send the smallest slice.

Attach the file or selection that matters, not the whole repo. Less for the model to read = fewer input tokens, less reasoning effort, tighter answers.

CONTEXT
03

Keep instructions short & stable.

A sub-200-line copilot-instructions.md that doesn't change between turns gets cached — you stop paying full price for the same prefix every message.

CACHE
04

Reset chats on task switch.

History compounds — every old turn rides along on the new one. Start a new chat when the task changes, or run /compact to summarise and shed tokens without losing the thread.

SESSION
KEEP LEARNING

Go from habits to mastery.

You've seen the levers. Now build the muscle. Start with the official GitHub course — it walks you through the same patterns hands-on, in a real repo, in about an hour.

Start the GitHub Skills course Free · ~1 hour · self-paced
LEVEL 2 · CORE CONTENT

Go deeper: the full playbook awaits.

This page is the quick guide. The advanced scenarios walk through real workflows, an interactive token calculator, the full lever playbook, pricing comparisons, and a live diagram of how context flows. This is not optional reading.

Open advanced scenarios
Interactive · calculator · playbook · pricing · diagram