Level 1 · Quick guide

Token Optimization Best Practices - Visual Guide

Six scenes, sixty seconds. The full playbook — calculator, real workflows, pricing — is one click away.


Scene 1 · What rides every prompt

Your question is the smallest piece.

Each turn, Copilot quietly assembles a packet and sends it to the model. The thing you typed is the thin slice on the right. Everything else is automatic — but you pay for all of it.

SYSTEM: System prompt (fixed)
INSTRUCTIONS: copilot-instructions.md
TOOLS · MCP: Tool & MCP schemas
HISTORY: Chat history
RETRIEVED CONTEXT: Retrieved files
YOU: Your message

Implication: shrinking your own message saves almost nothing. The leverage is in the other five segments.
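To make the proportions concrete, here is a minimal sketch in Python. The token counts are invented placeholders, not measurements of any real Copilot request, and the segment names simply mirror the six labels above; substitute numbers from your own usage.

```python
# Hypothetical token counts for one request packet.
# These numbers are illustrative assumptions, not measurements.
packet = {
    "system": 2000,
    "instructions": 800,
    "tools_mcp": 3000,
    "history": 6000,
    "retrieved_context": 8000,
    "you": 200,
}


def segment_shares(segments: dict[str, int]) -> dict[str, float]:
    """Return each segment's fraction of the total tokens sent."""
    total = sum(segments.values())
    return {name: count / total for name, count in segments.items()}


shares = segment_shares(packet)
for name, share in shares.items():
    print(f"{name:<18} {share:6.1%}")
```

With these placeholder numbers your message is 1% of the packet, so halving it saves about 0.5% of the input, while halving the retrieved context saves 20%.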

Scene 2 · Try the toggles

Six habits. Turn them on, see where the leverage is.

Each switch is one habit, labelled with its typical relative leverage. Benefits climb as you stack habits.

Heads up — these are estimates, not guarantees. The High / Medium / Low labels are directional only: a qualitative ranking of each habit's relative leverage, based on patterns commonly seen across engineering teams. They are not benchmarks, not measurements, and not a promised cost reduction for your project. Actual impact depends on your model, workload, codebase size, caching behaviour, and team workflow — and may be higher, lower, or negligible. Always measure on your own usage.

Further reading: GitHub Well-Architected — Managing AI credits · Improving token efficiency in agentic workflows · Prepare for usage-based billing — take action before the transition.

Point at the file you're working on

Send one file or your selection instead of the whole repo.

High impact

Use Auto Mode (recommended)

Let Copilot route each turn to the best-fit model. Auto Mode is becoming the default policy across orgs and applies a documented ~10% token-multiplier discount.

High impact

Keep your custom instructions short and steady

If they don't change between turns, they get cached — input cost drops a lot.

Medium impact

Don't crank reasoning effort to max

Low or medium is fine for most work. Save high for genuinely ambiguous tasks.

Low impact

Turn off tools you're not using on this project

Every enabled tool's description gets sent on every message — even when unused.

High impact

Start a new chat when you switch tasks

Old turns ride along on every new message until you reset.

Medium impact


The takeaway: you don't need all six. Adopt two or three that fit your workflow and the numbers move fast — because each habit cuts a different factor.

Scene 3 · The wider map

Zoom out: every customization has a price.

The chat habits in Scene 2 are the half you control by hand. The other half is the customization system itself — instructions, prompts, skills, agents, hooks. They sort into six tiers by when they load — i.e. how often they get sent to the model. Top tier rides along on every request; bottom tier may never be sent at all.

New to these? A one-line primer for each:

instructions

Markdown files Copilot reads as standing rules. copilot-instructions.md loads every turn; *.instructions.md with applyTo: loads only when matching files are in context.

e.g. "Always use TypeScript strict mode. Prefer functional components."

prompts

Reusable prompt templates you trigger with /name in chat. Think saved macros for recurring tasks — loaded only when you call them.

e.g. /review-pr, /write-test, /refactor-to-hooks

skills

Bundled know-how (a short description + a longer body). The model reads the description on every turn and pulls in the body only when it decides the skill is relevant.

e.g. a "deploy-to-azure" skill that knows your stack's exact CLI sequence.

agents

Custom chat personas with their own system prompt + tool set, invoked via @name. Subagents are spawned by the model; only their summary returns — cheap context-wise.

e.g. @security-reviewer, @db-migration-expert

hooks

Scripts that run on editor events (save, commit, edit) — outside the model. Never sent to the model; great for lint, format, or auto-running tests.

e.g. run prettier on save, pytest on commit.

MCP tools

External tools (servers) the model can call mid-conversation — databases, GitHub, browsers, your own APIs. Like skills but they do things instead of just adding context.

e.g. query your prod DB, open a PR, fetch a Jira ticket.

Every request → copilot-instructions.md, skill descriptions
On glob match → *.instructions.md with applyTo:
You invoke → Prompts (/name), custom agents (@name)
Model decides → Skill bodies, MCP tool calls
On event → Hooks (run outside the model)
Agent spawns → Subagents (only a summary returns)

Read the bar as frequency, not size. It shows how often each kind of customization is sent to the model — not how many tokens it adds. Actual token usage depends on the content size of the file or tool description too. A short copilot-instructions.md sent every turn can easily cost less than a giant skill body the model pulls in occasionally. Use this ladder to decide where to put something; measure your own usage to know what it costs.
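The frequency-versus-size point can be sketched as a simple product. This is an illustrative Python model, not how Copilot accounts for usage; the `Customization` class, the token sizes, and the send rates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Customization:
    name: str
    tokens: int         # assumed size of the content each time it is sent
    sends_per_100: int  # assumed number of requests out of 100 that include it

    def tokens_per_100_requests(self) -> int:
        """Total tokens this item contributes across 100 requests."""
        return self.tokens * self.sends_per_100


# Hypothetical numbers chosen only to illustrate the trade-off.
items = [
    Customization("copilot-instructions.md (short)", tokens=300, sends_per_100=100),
    Customization("large skill body", tokens=12000, sends_per_100=5),
]

for item in items:
    print(f"{item.name}: {item.tokens_per_100_requests()} tokens per 100 requests")
```

Under these assumptions the always-on file contributes 30,000 tokens per 100 requests and the occasional skill body 60,000, which is why position on the ladder alone does not tell you the cost.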

Not sure which one to use? Ask who turns it on:

Always on → copilot-instructions.md (or *.instructions.md for specific files)
You turn it on → prompt (/name) or custom agent (@name)
The model turns it on → skill or MCP tool
An event turns it on (save, commit, edit) → hook

Scene 4 · One habit, two outcomes

Same task. Different framing.

The single highest-leverage habit is also the easiest: attach the narrowest scope that lets the model answer.

✗ WASTEFUL

"Why is the checkout broken?
Look at #codebase"


✓ TIGHT

"Why is this assertion failing?
#file:cart.test.ts #selection"


Same answer quality, ~7× smaller bill, faster turn. Reach for #codebase only when narrower scopes have actually failed.

Scene 5 · Copy & paste

Two snippets. Drop into any repo.

Both snippets do the same job: give the model less to read, and keep it the same each turn. Save one as a file so it gets cached and reused. Paste the other into chat to point the model at just this task. Steal them.

.github/copilot-instructions.md

Stable, sub-200 lines. This is the prefix that gets cached on every turn.

# Project: payments-service

Stack: TypeScript · Node 20 · Fastify · Postgres · Vitest
Style: Functional core, async/await, no classes unless modelling
        an entity. Prefer `Result<T,E>` over throwing.

Tests: Vitest, co-located `*.test.ts`. Aim for behavioural
        tests, not implementation. Use builders, not fixtures.

Don't:
- Don't add new dependencies without flagging it.
- Don't generate migrations — I write those by hand.
- Don't touch `src/legacy/*` unless explicitly asked.

Do:
- When fixing a bug, write the failing test first.
- When unsure, ask one clarifying question.

Task kickoff (paste into chat)

Scopes context to just the files and constraints this task needs.

# Task
<one sentence: what should be true when you're done>

# Scope
- Files I think are relevant: #file:... #file:...
- Don't touch: ...

# Constraints
- <any non-obvious behaviour, perf budget, API surface>

# Plan first
Outline your approach in 3–5 bullets before writing code.
If anything is ambiguous, ask one question and stop.

Filled-in example

# Task
Webhook retries should give up after 5 attempts instead of looping forever.

# Scope
- Files I think are relevant: #file:src/webhooks/retry.ts #file:src/webhooks/retry.test.ts
- Don't touch: src/legacy/*, the queue config.

# Constraints
- Must keep the existing exponential backoff.
- No new dependencies.

# Plan first
Outline your approach in 3–5 bullets before writing code.
If anything is ambiguous, ask one question and stop.

How this saves tokens: the instructions file is identical on every turn, so the provider serves it from cache — you stop paying full price for the same prefix over and over. The kickoff snippet replaces “here’s the whole repo, figure it out” with a narrow, named slice — fewer input tokens, less reasoning effort, tighter answers. Together they hit the two biggest levers: cache the stable stuff, scope the changing stuff.
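A rough sketch of the caching effect, in Python. It assumes the stable prefix is billed at full price on the first turn and at a reduced rate afterwards; the `cached_rate` of 0.25 is a made-up figure (real cache pricing varies by provider and model), and the model ignores growing history and cache misses.

```python
def input_cost_units(prefix_tokens: int, variable_tokens: int,
                     turns: int, cached_rate: float) -> float:
    """Input cost over a chat, in units of one full-price token.

    The first turn pays full price for the prefix. Every later turn pays
    cached_rate (an assumed fraction of full price) for the prefix and
    full price for the part that changes.
    """
    first_turn = prefix_tokens + variable_tokens
    later_turns = (turns - 1) * (prefix_tokens * cached_rate + variable_tokens)
    return first_turn + later_turns


# Hypothetical chat: 1,000-token stable prefix, 500 changing tokens, 10 turns.
with_cache = input_cost_units(1000, 500, turns=10, cached_rate=0.25)
no_cache = input_cost_units(1000, 500, turns=10, cached_rate=1.0)
print(with_cache, no_cache)
```

With these placeholder values the cached chat costs 8,250 units against 15,000 uncached. Editing the instructions file mid-session would reset the prefix to full price for the next turn, which is the reason to keep it steady.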

Want more? Scoped instructions, reusable prompts, and where to read more.

Scoped per-language rules — `.github/instructions/typescript.instructions.md`

Frontmatter-scoped via applyTo. Only loaded when matching files are in context, so it doesn’t ride every turn.

---
applyTo: "**/*.ts"
---

# TypeScript rules

- Always include explicit return types on exported functions.
- Prefer `readonly` arrays and `Readonly<T>` for function parameters.
- Use discriminated unions over enums.
- Errors at module boundaries return `Result<T, E>`. Throw only in
  top-level HTTP handlers and test setup.
- No `console.log` in committed code — use the shared `logger`.

You can have several: python.instructions.md (applyTo: "**/*.py"), sql.instructions.md (applyTo: "**/migrations/**/*.sql"). Narrow globs > broad ones.
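To see which scoped files a given set of paths would pull in, here is a small Python approximation. It uses the standard library's `fnmatchcase`, whose matching rules are not the same as the editor's glob rules, so treat it as a sketch for reasoning about scope rather than a reproduction of Copilot's behaviour. The `APPLY_TO` table and `instructions_for` helper are hypothetical names.

```python
from fnmatch import fnmatchcase

# Instruction files and their applyTo globs, copied from the examples above.
APPLY_TO = {
    "typescript.instructions.md": "**/*.ts",
    "python.instructions.md": "**/*.py",
    "sql.instructions.md": "**/migrations/**/*.sql",
}


def instructions_for(paths: list[str]) -> list[str]:
    """Return instruction files whose glob matches any path in context.

    Caveat: fnmatchcase only approximates editor globbing. Here '*' also
    matches '/', and '**/' requires at least one directory level, so a
    file at the repository root would not match '**/*.ts'.
    """
    return sorted(
        name
        for name, pattern in APPLY_TO.items()
        if any(fnmatchcase(path, pattern) for path in paths)
    )


print(instructions_for(["src/webhooks/retry.ts"]))
print(instructions_for(["README.md"]))
```

A chat that only touches `src/webhooks/retry.ts` matches the TypeScript rules alone, and one that only touches `README.md` matches none of the three, so those rules add nothing to that request.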

On-demand recipe — `.github/prompts/review-pr.prompt.md`

For bulky reusable how-tos. You invoke it deliberately with /review-pr, so it doesn’t ride every turn.

---
mode: agent
description: Review the current diff against project conventions.
---

You are reviewing the working copy of this repository.

Focus on:
1. Conformance to TypeScript rules in `.github/instructions/typescript.instructions.md`.
2. Test coverage for new behaviors.
3. Error handling — anything thrown that should be a `Result` instead.

Procedure:
- Read `#changes`.
- For each modified file, list issues as `- file:line — issue — suggested fix`.
- End with a 1–2 sentence summary verdict.

Do not propose edits outside the scope of the diff.

Four prompt snippets worth memorizing

Targeted bug fix

Fix the bug in #selection.

Expected: <one sentence>
Actual:   <one sentence>

Don't change unrelated code.
Don't add error handling for cases that can't happen.

Bounded refactor

Refactor #file to extract the validation logic
into a separate function. Keep the public API unchanged.
Run nothing — just propose the diff.

Diff review (cheap mode)

Review #changes.
Flag: bugs, accidental scope creep, missing tests.
Skip: style nits the linter already catches.

Summarize-and-restart

Summarize what we've decided so far in this chat
in <=10 bullet points. I'm going to paste that into
a fresh chat. Don't propose further changes.

Where to read more

Official docs. Behavior changes; these stay current.

Customizing Copilot Chat responses — overview of all customization surfaces.
VS Code · Copilot customization — copilot-instructions.md, *.instructions.md, .prompt.md.
VS Code · Custom chat modes — defining custom agents.
VS Code · MCP servers — enabling and scoping MCP.

Scene 6 · Monday morning

If you take four things away, take these.

Don't memorise rules — internalise principles. These four cover the bulk of token waste; everything else is a refinement.

01

Turn on Auto Mode.

One toggle, set once. Copilot routes each turn to the best-fit model and applies a documented ~10% token-multiplier discount, which makes it one of the highest-leverage changes for the least effort.

MODEL

02

Send the smallest slice.

Attach the file or selection that matters, not the whole repo. Less for the model to read = fewer input tokens, less reasoning effort, tighter answers.

CONTEXT

03

Keep instructions short & stable.

A sub-200-line copilot-instructions.md that doesn't change between turns gets cached — you stop paying full price for the same prefix every message.

CACHE

04

Reset chats on task switch.

History compounds — every old turn rides along on the new one. Start a new chat when the task changes, or run /compact to summarise and shed tokens without losing the thread.

SESSION
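Why history compounds can be shown with a little arithmetic. The Python sketch below assumes every turn adds the same number of tokens and that each request resends all earlier turns in full; it ignores the fixed segments, caching, and /compact, and the 500-token figure is a placeholder.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total conversation tokens sent over a chat that resends all history.

    Turn k carries k * tokens_per_turn tokens (itself plus k - 1 earlier
    turns), so the total is tokens_per_turn * (1 + 2 + ... + turns).
    """
    return tokens_per_turn * turns * (turns + 1) // 2


# Hypothetical workload: 20 turns at 500 tokens each.
one_long_chat = cumulative_input_tokens(20, 500)
two_short_chats = 2 * cumulative_input_tokens(10, 500)
print(one_long_chat, two_short_chats)
```

Under these assumptions one 20-turn chat sends 105,000 conversation tokens, while the same 20 turns split across two chats send 55,000, because the total grows with the square of the chat length.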

KEEP LEARNING

Go from habits to mastery.

You've seen the levers. Now build the muscle. Start with the official GitHub course — it walks you through the same patterns hands-on, in a real repo, in about an hour.

Start the GitHub Skills course Free · ~1 hour · self-paced

Deeper resources

LEVEL 2 · CORE CONTENT

Go deeper: the full playbook awaits.

This page is the quick guide. The advanced scenarios walk through real workflows, an interactive token calculator, the full lever playbook, pricing comparisons, and a live diagram of how context flows. This is not optional reading.

Open advanced scenarios

Interactive · calculator · playbook · pricing · diagram
