The Context Window as a Resource
Every Claude Code session starts with a fresh context window: a finite space (typically 200,000 tokens) that holds all active information, including the system prompt, CLAUDE.md, auto-memory, tool results, your messages, and Claude's responses. As the window fills up, quality degrades: Claude loses track of early parts of the conversation, "forgets" details, and follows instructions less reliably.
The context window is not an elastic bag. It is a scarce resource that must be actively managed.
What Loads Before Your First Message
Before you've typed a single word, several thousand tokens are already loaded into context:
Session start (≈7,000–8,000 tokens)
────────────────────────────────────
System prompt ~4,200 t (always present, invisible)
Auto-memory (MEMORY.md) ~680 t (first 200 lines / 25 KB)
Environment info ~280 t (CWD, OS, git branch)
MCP tools (deferred) ~120 t (names only; schemas load later)
Skill descriptions ~450 t (one-liners only)
~/.claude/CLAUDE.md ~320 t (global settings)
project CLAUDE.md ~1,800 t (project instructions)
Only after all of this does your first message appear. Every file read, every command output, and every response from Claude stacks on top.
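As a quick sanity check on the table above, the startup figures can be summed and compared with the window size. This is a minimal sketch: the token counts are the approximate values quoted in the table, the 200,000-token window is the typical size mentioned earlier, and the names `STARTUP_TOKENS` and `WINDOW_TOKENS` are illustrative only.

```python
# Approximate startup costs, copied from the table above (tokens).
STARTUP_TOKENS = {
    "System prompt": 4_200,
    "Auto-memory (MEMORY.md)": 680,
    "Environment info": 280,
    "MCP tools (deferred)": 120,
    "Skill descriptions": 450,
    "~/.claude/CLAUDE.md": 320,
    "project CLAUDE.md": 1_800,
}

# Assumed window size: the "typically 200,000 tokens" figure from the text.
WINDOW_TOKENS = 200_000

startup_total = sum(STARTUP_TOKENS.values())
startup_share = startup_total / WINDOW_TOKENS

print(f"Startup load: {startup_total:,} tokens")   # Startup load: 7,850 tokens
print(f"Share of window: {startup_share:.1%}")     # Share of window: 3.9%
```

With these figures the session begins at roughly 7,850 tokens, just under 4% of the window, before any conversation has taken place.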
flowchart TD
A["System prompt\n~4 200 t"] --> B["Auto-memory MEMORY.md\n~680 t"]
B --> C["Environment (OS, git)\n~280 t"]
C --> D["MCP tools (names)\n~120 t"]
D --> E["Skills descriptions\n~450 t"]
E --> F["~/.claude/CLAUDE.md\n~320 t"]
F --> G["project CLAUDE.md\n~1 800 t"]
G --> H(["First user message"])
H --> I["File reads\n+1 500–3 000 t / file"]
I --> J["Command output\n+300–1 500 t"]
J --> K["Claude responses\n+500–2 000 t"]
K --> L{"Context filling up?"}
L -- "No" --> H
L -- "Auto" --> M["Auto-compact\n(removes old output → summary)"]
L -- "/compact" --> N["/compact\n(structured summary)"]
L -- "/clear" --> O["/clear\n(clean slate)"]
M --> H
N --> H
style A fill:#6B6964,color:#fff
style B fill:#E8A45C,color:#fff
style C fill:#6B6964,color:#fff
style D fill:#9B7BC4,color:#fff
style E fill:#D4A843,color:#fff
style F fill:#6A9BCC,color:#fff
style G fill:#6A9BCC,color:#fff
style H fill:#558A42,color:#fff
style I fill:#8A8880,color:#fff
style J fill:#A09E96,color:#fff
style K fill:#D97757,color:#fff
style M fill:#D97757,color:#fff
style N fill:#D97757,color:#fff
style O fill:#4A6E8A,color:#fff
How Context Accumulates During a Session
Reading a 500-line file costs roughly 1,500–2,000 tokens. The output of npm test spanning 200 lines adds another ~600. After an hour-long debugging session, context can easily occupy 40–60% of the window — and that's before anything starts getting lost.
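The figures above imply a rough per-line cost that can be used for back-of-the-envelope budgeting. The sketch below is an estimate only: the tokens-per-line ratios are derived from the two examples in this paragraph, the helper name `estimate_tokens` is hypothetical, and real tokenization depends on the content.

```python
# Rough tokens-per-line ratios implied by the examples in this section:
#   500-line file   -> 1,500-2,000 tokens => 3-4 tokens per line
#   200-line output -> ~600 tokens        => ~3 tokens per line
FILE_TOKENS_PER_LINE = (3.0, 4.0)
OUTPUT_TOKENS_PER_LINE = (3.0, 3.0)

WINDOW_TOKENS = 200_000  # assumed typical window size


def estimate_tokens(lines: int, per_line: tuple[float, float]) -> tuple[int, int]:
    """Return a (low, high) token estimate for a given number of lines."""
    low, high = per_line
    return round(lines * low), round(lines * high)


file_low, file_high = estimate_tokens(500, FILE_TOKENS_PER_LINE)
out_low, out_high = estimate_tokens(200, OUTPUT_TOKENS_PER_LINE)

# How many 500-line file reads fit into 40% of the window (worst case)?
budget = WINDOW_TOKENS * 40 // 100
reads_until_40_percent = budget // file_high

print(file_low, file_high)     # 1500 2000
print(out_low, out_high)       # 600 600
print(reads_until_40_percent)  # 40
```

Under these assumptions, about forty full reads of 500-line files are enough to reach the 40% mark on their own, which is why long debugging sessions fill the window faster than expected.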
The hungriest sources:
- File reads: Claude reads them in full, even if only one function is needed
- Command output: logs, test results, diffs
- Hook tool results: injected via additionalContext in PostToolUse
- Search results: grep and glob return every match
To see what is currently taking up space, run the /context command inside the session.
/compact: Smart Compression
/compact is the primary tool for long sessions. The command asks Claude to produce a structured summary of the conversation, which replaces the entire history in context.
What makes it into the summary:
- Your requests and intentions
- Key technical concepts
- Modified files and important code fragments
- Bugs found and how they were fixed
- Unfinished tasks and current progress
What gets lost:
- Full tool output
- Claude's intermediate reasoning
- The verbatim code from files that were read (a "memory" of them remains, but not the exact text)
After /compact, Claude re-reads CLAUDE.md from disk and reinjects it into context. Auto-memory and the system prompt are also restored automatically. The one exception is the skill description list: only the skills you actually used during the session will be retained.
Focused compact. You can tell Claude exactly what to preserve:
/compact focus on the changes made to the authentication API
Alternatively, add a ## Compact Instructions section directly to CLAUDE.md. Those instructions are then applied every time an automatic compact occurs.
## Compact Instructions
When compacting, always preserve:
- The exact names of any modified files
- All TODOs that were mentioned in the conversation
- The current state of tests (passed / failed)
/clear: A Clean Slate
/clear resets the context entirely — starting a fresh session from scratch. Use it when:
- The current task is complete and a fundamentally different one is beginning
- The context has become so cluttered that /compact no longer helps
- You want Claude to take a "fresh look" without the weight of previous discussions
Unlike /compact, /clear creates no summary whatsoever. The conversation is not carried into the new context — only what normally loads at startup: CLAUDE.md, auto-memory, and environment info.
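The difference between the two commands can be pictured with a toy model. This is only an illustration of the behaviour described above, not how Claude Code is implemented: the `Context` class, its fields, and the segment labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Toy model: what the window holds, as labelled segments."""
    startup: list[str] = field(default_factory=lambda: [
        "system prompt", "CLAUDE.md", "auto-memory", "environment info",
    ])
    history: list[str] = field(default_factory=list)

    def compact(self, focus: str = "") -> None:
        # /compact: the whole history is replaced by a single summary entry.
        label = "summary" if not focus else f"summary (focus: {focus})"
        self.history = [label]

    def clear(self) -> None:
        # /clear: no summary is produced; only startup content remains.
        self.history = []


ctx = Context()
ctx.history += ["user: fix login bug", "tool: read src/auth/token.ts", "assistant: patch"]

ctx.compact(focus="authentication API")
print(ctx.history)  # ['summary (focus: authentication API)']

ctx.clear()
print(ctx.history)  # []
print(ctx.startup)  # startup content is still present
```

In both cases the startup segments survive. What differs is whether anything from the conversation is carried forward.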
/rewind: One Step Back
Before every file edit, Claude takes a snapshot (checkpoint). Pressing Esc twice, or running /rewind, lets you return the session to the state it was in before a change: files are restored and the context is rolled back.
This is not git: checkpoints are local and exist only within the current session. They do not cover actions on remote systems (deploys, database writes, outbound API requests), which need their own rollback procedures. For durable version history of your code, use git.
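The snapshot-before-edit idea can be sketched in a few lines. This is a conceptual illustration under simplifying assumptions, not Claude Code's actual mechanism: the `CheckpointStore` class and its methods are hypothetical, and it tracks only local file contents, which is exactly why remote side effects fall outside its reach.

```python
import tempfile
from pathlib import Path


class CheckpointStore:
    """Toy in-memory checkpoints: snapshot a file before each edit."""

    def __init__(self) -> None:
        self._snapshots: list[tuple[Path, str]] = []

    def edit(self, path: Path, new_text: str) -> None:
        # Save the current content first, then apply the edit.
        self._snapshots.append((path, path.read_text()))
        path.write_text(new_text)

    def rewind(self) -> None:
        # Restore the file to its state before the most recent edit.
        path, old_text = self._snapshots.pop()
        path.write_text(old_text)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "config.txt"
    target.write_text("timeout = 30\n")

    store = CheckpointStore()
    store.edit(target, "timeout = 5\n")
    store.rewind()

    restored = target.read_text()
    print(restored, end="")  # timeout = 30
```

A deploy or a database write leaves nothing in such a store to restore, so rewinding the local files does not reverse it.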
Auto-Compact and "Thrashing"
When context approaches its limit, Claude begins automatic compaction: first removing older tool results, then producing a summary analogous to /compact. You will see a "Conversation compacted" message in the terminal.
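The two-stage order described above (drop older tool results first, summarize second) can be modelled roughly as follows. This is a sketch under stated assumptions: the real trigger threshold and summary size are not given in this guide, so `limit` and `summary_tokens` are hypothetical parameters, and `auto_compact` is an illustrative name rather than a Claude Code API.

```python
def auto_compact(items, limit, summary_tokens=1_000):
    """Toy two-stage compaction.

    items: list of (kind, tokens) in chronological order, where kind is
    "tool" for tool results or "chat" for messages.
    Returns the item list after compaction.
    """
    def total(seq):
        return sum(tokens for _, tokens in seq)

    if total(items) <= limit:
        return items  # nothing to do

    # Stage 1: drop tool results, oldest first, until the context fits.
    kept = list(items)
    for item in items:
        if total(kept) <= limit:
            break
        if item[0] == "tool":
            kept.remove(item)
    if total(kept) <= limit:
        return kept

    # Stage 2: replace whatever is left with a single summary.
    return [("summary", summary_tokens)]


history = [("chat", 1_000), ("tool", 6_000), ("chat", 1_000), ("tool", 3_000)]

print(auto_compact(history, limit=8_000))
# [('chat', 1000), ('chat', 1000), ('tool', 3000)]

print(auto_compact(history, limit=1_500))
# [('summary', 1000)]
```

In the first call, removing the oldest tool result is enough. In the second, even the messages alone exceed the limit, so the model falls back to a summary.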
A problem arises when a single file or command output is so large that context fills up again immediately after compaction. Claude will retry a few times, and if the situation doesn't change it will stop with an "auto-compaction stopped" error. What to do:
# Option 1: start a new session and work with smaller files
claude --continue # resume from what was saved
# Option 2: delegate to a subagent
# In the session: "create a subagent that reads file X and returns only what's needed"
Context Hygiene Practices
Many "losing the thread" problems are solved not by commands, but by good habits.
One request, one task. Long sessions that switch between unrelated tasks pollute the context faster than anything else. Finish a task → /clear, then start the next one.
Don't read everything in sight. A vague prompt ("look at the project and fix the bug") forces Claude to read dozens of files. A precise prompt ("bug in src/auth/token.ts, around line 120") saves several thousand tokens.
Subagent for heavy research. If a task requires reading 20 files, hand it off to a subagent. It operates in its own isolated context, and only the final summary comes back, not the entire reading history. See "Subagents and Context Isolation" for details.
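To put rough numbers on that trade-off, the sketch below compares what lands in the main session's context in each case. The per-file range is the one shown in the flowchart earlier in this section. The summary size is a purely hypothetical figure, as are the names `main_context_cost` and `SUBAGENT_SUMMARY_TOKENS`.

```python
# Per-file read cost range from the flowchart in this section (tokens).
FILE_READ_TOKENS = (1_500, 3_000)

# Hypothetical size of the summary a subagent hands back.
SUBAGENT_SUMMARY_TOKENS = 800


def main_context_cost(files: int, delegated: bool) -> tuple[int, int]:
    """Return (low, high) tokens added to the main session's context."""
    if delegated:
        # Only the final summary returns to the main session.
        return SUBAGENT_SUMMARY_TOKENS, SUBAGENT_SUMMARY_TOKENS
    low, high = FILE_READ_TOKENS
    return files * low, files * high


direct = main_context_cost(20, delegated=False)
via_subagent = main_context_cost(20, delegated=True)

print(direct)        # (30000, 60000)
print(via_subagent)  # (800, 800)
```

The subagent still spends tokens on the reads, but it spends them in its own window. The main session pays only for the summary.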
Compact before a complex new phase. If you've finished research and are about to start implementation, it's a good moment for /compact focus on the plan we just created. This "clears the board" while preserving exactly what you need for the next step.
CLAUDE.md instead of in-chat explanations. If you've explained the same rule to Claude multiple times (a naming convention, an architectural constraint), that rule belongs in CLAUDE.md. After /compact, the explanation from the chat will be gone, but CLAUDE.md will be re-read from scratch.
Nested CLAUDE.md and compact. Remember: the project-root CLAUDE.md is restored after a compact, but CLAUDE.md files from subdirectories are not. They are reloaded only the next time Claude reads a file within that subdirectory. This means important instructions are better kept in the root file rather than buried deep in the tree.
See also
- "CLAUDE.md and the Memory System": how the instruction and auto-memory system that loads into context is structured
- "Subagents and Context Isolation": the key tool for research tasks that drain the context window
- "Skills — Portable Capabilities": how to keep procedural instructions out of context until they're actually needed
- "Hooks — Lifecycle Events": hook output also enters context via additionalContext; keeping it compact matters
- "Settings and Configuration Hierarchy": autoMemoryEnabled and other settings that affect what gets loaded into context
- "Model Selection and Thinking Modes": extended thinking consumes more tokens; worth factoring in when planning your context budget