
The Context Window as a Resource

Every Claude Code session starts with a fresh context window: a finite space (typically 200,000 tokens) that holds all active information, including the system prompt, CLAUDE.md, auto-memory, tool results, your messages, and Claude's responses. As the window fills up, quality degrades: Claude loses track of early parts of the conversation, starts "forgetting" details, and follows instructions less reliably.

The context window is not an elastic bag. It is a scarce resource that must be actively managed.


What Loads Before Your First Message

Before you've typed a single word, several thousand tokens are already loaded into context:

Session start (≈7,000–8,000 tokens)
────────────────────────────────────
System prompt               ~4,200 t  (always present, invisible)
Auto-memory (MEMORY.md)       ~680 t  (first 200 lines / 25 KB)
Environment info              ~280 t  (CWD, OS, git branch)
MCP tools (deferred)          ~120 t  (names only; schemas load later)
Skill descriptions            ~450 t  (one-liners only)
~/.claude/CLAUDE.md           ~320 t  (global settings)
project CLAUDE.md           ~1,800 t  (project instructions)

Only after all of this does your first message appear. Every file read, every command output, every response from Claude — all of it stacks on top.
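
To make the startup budget concrete, here is a minimal sketch that adds up the approximate figures from the table above and expresses them as a share of a 200,000-token window. The names `STARTUP_TOKENS`, `startup_total`, and `share_of_window` are illustrative only; the numbers are the article's estimates, not measurements.

```python
# Approximate startup costs in tokens, copied from the table above.
STARTUP_TOKENS = {
    "System prompt": 4200,
    "Auto-memory (MEMORY.md)": 680,
    "Environment info": 280,
    "MCP tools (deferred)": 120,
    "Skill descriptions": 450,
    "~/.claude/CLAUDE.md": 320,
    "project CLAUDE.md": 1800,
}

WINDOW_TOKENS = 200_000  # typical window size quoted in this article


def startup_total(costs: dict[str, int]) -> int:
    """Sum every component that loads before the first user message."""
    return sum(costs.values())


def share_of_window(tokens: int, window: int = WINDOW_TOKENS) -> float:
    """Return the fraction of the window that `tokens` occupies."""
    return tokens / window


if __name__ == "__main__":
    total = startup_total(STARTUP_TOKENS)
    print(f"startup total: {total} tokens")
    print(f"share of window: {share_of_window(total):.1%}")
```

With these figures the total is 7,850 tokens, a little under 4% of the window, spent before any work has started.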

flowchart TD
    A["System prompt\n~4 200 t"] --> B["Auto-memory MEMORY.md\n~680 t"]
    B --> C["Environment (OS, git)\n~280 t"]
    C --> D["MCP tools (names)\n~120 t"]
    D --> E["Skills descriptions\n~450 t"]
    E --> F["~/.claude/CLAUDE.md\n~320 t"]
    F --> G["project CLAUDE.md\n~1 800 t"]
    G --> H(["First user message"])
    H --> I["File reads\n+1 500–3 000 t / file"]
    I --> J["Command output\n+300–1 500 t"]
    J --> K["Claude responses\n+500–2 000 t"]
    K --> L{"Context filling up?"}
    L -- "No" --> H
    L -- "Auto" --> M["Auto-compact\n(removes old output → summary)"]
    L -- "/compact" --> N["/compact\n(structured summary)"]
    L -- "/clear" --> O["/clear\n(clean slate)"]
    M --> H
    N --> H

    style A fill:#6B6964,color:#fff
    style B fill:#E8A45C,color:#fff
    style C fill:#6B6964,color:#fff
    style D fill:#9B7BC4,color:#fff
    style E fill:#D4A843,color:#fff
    style F fill:#6A9BCC,color:#fff
    style G fill:#6A9BCC,color:#fff
    style H fill:#558A42,color:#fff
    style I fill:#8A8880,color:#fff
    style J fill:#A09E96,color:#fff
    style K fill:#D97757,color:#fff
    style M fill:#D97757,color:#fff
    style N fill:#D97757,color:#fff
    style O fill:#4A6E8A,color:#fff
Context window lifecycle: from load to compact

How Context Accumulates During a Session

Reading a 500-line file costs roughly 1,500–2,000 tokens. The output of npm test spanning 200 lines adds another ~600. After an hour-long debugging session, context can easily occupy 40–60% of the window — and that's before anything starts getting lost.
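
The figures above imply roughly 3 to 4 tokens per line. The sketch below turns that into a quick estimator. It assumes the cost scales linearly with line count, which is only a rule of thumb; the function names and the 7,850-token startup default (the sum of the startup table) are illustrative.

```python
# Rough per-line costs implied by the figures above (assumed to scale linearly).
TOKENS_PER_LINE_LOW = 3   # 500 lines -> ~1,500 tokens
TOKENS_PER_LINE_HIGH = 4  # 500 lines -> ~2,000 tokens


def estimate_tokens(line_count: int) -> tuple[int, int]:
    """Return a (low, high) token estimate for reading `line_count` lines."""
    return line_count * TOKENS_PER_LINE_LOW, line_count * TOKENS_PER_LINE_HIGH


def reads_until_share(
    share: float,
    lines_per_read: int,
    startup_tokens: int = 7_850,
    window_tokens: int = 200_000,
) -> int:
    """How many reads of `lines_per_read` lines fit before the window is `share` full.

    Uses the high estimate and ignores messages and responses, so the real
    number is lower.
    """
    budget = int(window_tokens * share) - startup_tokens
    _, high = estimate_tokens(lines_per_read)
    return max(budget // high, 0)


if __name__ == "__main__":
    print(estimate_tokens(500))         # one 500-line file
    print(reads_until_share(0.5, 500))  # reads until the window is half full
```

Under these assumptions, about 46 reads of 500-line files would fill half the window on their own, before counting command output or Claude's responses.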

The hungriest sources:

  • File reads — Claude reads them in full, even if only one function is needed
  • Command output — logs, test results, diffs
  • Hook tool results — injected via additionalContext in PostToolUse
  • Search results — grep and glob return every match
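
Hook output is one of the few sources in this list that you control directly. The sketch below caps the text a PostToolUse hook sends back through additionalContext. The surrounding JSON shape (`hookSpecificOutput`, `hookEventName`) reflects my understanding of the hook output format and should be checked against the hooks reference; `MAX_CHARS`, `trim`, and `build_hook_output` are hypothetical names chosen for this example.

```python
import json

MAX_CHARS = 400  # hypothetical budget for injected context


def trim(text: str, max_chars: int = MAX_CHARS) -> str:
    """Keep the first `max_chars` characters and note how much was dropped."""
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    return text[:max_chars] + f"\n[... {omitted} characters omitted]"


def build_hook_output(note: str) -> str:
    """Build the JSON a PostToolUse hook would print to stdout (assumed shape)."""
    payload = {
        "hookSpecificOutput": {
            "hookEventName": "PostToolUse",
            "additionalContext": trim(note),
        }
    }
    return json.dumps(payload)


if __name__ == "__main__":
    linter_report = "warning: unused import\n" * 200
    print(build_hook_output(linter_report))
```

Truncating by characters is the simplest option; a real hook might instead keep only failing lines or a short summary.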

To see what is currently taking up space, run the /context command inside the session.

Check yourself
You launch a Claude Code session and immediately ask: "Read the README and explain the architecture." Before your message, ~7,000 tokens are already loaded. Claude will read a 300-line README (~900 tokens). What happens to the context as you work, even if you don't write much yourself?

/compact: Smart Compression

/compact is the primary tool for long sessions. The command asks Claude to produce a structured summary of the conversation, which replaces the entire history in context.

What makes it into the summary:

  • Your requests and intentions
  • Key technical concepts
  • Modified files and important code fragments
  • Bugs found and how they were fixed
  • Unfinished tasks and current progress

What gets lost:

  • Full tool output
  • Claude's intermediate reasoning
  • The verbatim code from files that were read (a "memory" of them remains, but not the exact text)

After /compact, Claude re-reads CLAUDE.md from disk and reinjects it into context. Auto-memory and the system prompt are also restored automatically. The one exception is the skill description list: only the skills you actually used during the session will be retained.
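
The rules above can be pictured with a toy model. This is not how Claude Code is implemented; it is a sketch in which the context is a list of items, and `compact` keeps what the text says is restored (system prompt, CLAUDE.md, auto-memory), keeps only the skills that were used, and replaces everything else with one summary item. The `Item` type and the 1,500-token summary size are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    kind: str          # "system", "claude_md", "memory", "skill", "message", "tool_output"
    name: str
    tokens: int
    used: bool = False  # only meaningful for skills


RESTORED_KINDS = {"system", "claude_md", "memory"}


def compact(context: list[Item], summary_tokens: int = 1500) -> list[Item]:
    """Toy /compact: restore fixed items, keep used skills, summarize the rest."""
    restored = [item for item in context if item.kind in RESTORED_KINDS]
    used_skills = [item for item in context if item.kind == "skill" and item.used]
    summary = Item("summary", "conversation summary", summary_tokens)
    return restored + used_skills + [summary]


if __name__ == "__main__":
    before = [
        Item("system", "system prompt", 4200),
        Item("claude_md", "project CLAUDE.md", 1800),
        Item("memory", "MEMORY.md", 680),
        Item("skill", "deploy", 60, used=True),
        Item("skill", "lint", 60),
        Item("message", "debug the authorization issue", 40),
        Item("tool_output", "auth.ts", 1800),
    ]
    for item in compact(before):
        print(item.kind, item.name, item.tokens)
```

In this model the request to debug authorization survives only inside the summary, while the verbatim text of auth.ts and the unused skill description are gone.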

Focused compact. You can tell Claude exactly what to preserve:

/compact focus on the changes made to the authentication API

Alternatively, add a ## Compact Instructions section directly to CLAUDE.md — those instructions will then be applied every time an automatic compact occurs.

## Compact Instructions

When compacting, always preserve:
- The exact names of any modified files
- All TODOs that were mentioned in the conversation
- The current state of tests (passed / failed)
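
If you rely on this section, it can be worth checking that it is actually present in the file. The following is a minimal sketch of such a check; `extract_section` is a hypothetical helper, and it assumes the section is a level-two heading that ends at the next level-one or level-two heading.

```python
from typing import Optional


def extract_section(markdown: str, heading: str = "## Compact Instructions") -> Optional[str]:
    """Return the body under `heading`, or None if the heading is missing."""
    lines = markdown.splitlines()
    start = None
    for index, line in enumerate(lines):
        if line.strip() == heading:
            start = index
            break
    if start is None:
        return None

    body = []
    for line in lines[start + 1:]:
        # Stop at the next top-level or second-level heading.
        if line.startswith("# ") or line.startswith("## "):
            break
        body.append(line)
    return "\n".join(body).strip()


if __name__ == "__main__":
    sample = (
        "# Project\n\n"
        "## Compact Instructions\n\n"
        "When compacting, always preserve:\n"
        "- The exact names of any modified files\n\n"
        "## Testing\n"
        "Run the test suite before committing.\n"
    )
    print(extract_section(sample))
```

A check like this could run in CI or a pre-commit step so the section is not dropped by accident during edits to CLAUDE.md.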

Check yourself
After /compact, which of the following will be available to Claude: (a) the exact code of the auth.ts file it read an hour ago; (b) your request to "debug the authorization issue"; (c) the contents of CLAUDE.md; (d) descriptions of skills that Claude never used?

/clear: A Clean Slate

/clear resets the context entirely — starting a fresh session from scratch. Use it when:

  • The current task is complete and a fundamentally different one is beginning
  • The context has become so cluttered that /compact no longer helps
  • You want Claude to take a "fresh look" without the weight of previous discussions

Unlike /compact, /clear creates no summary whatsoever. The conversation is not carried into the new context — only what normally loads at startup: CLAUDE.md, auto-memory, and environment info.


Quick recall
When should you use /clear instead of /compact?

/rewind: One Step Back

Before every file edit, Claude takes a snapshot (checkpoint). Pressing Esc twice, or running /rewind, returns the session to the state it was in before the last change: files are restored and context is rolled back.

This is not git: checkpoints are local and exist only within the current session. They do not cover actions on remote systems (deploys, database writes, outbound API requests), so those side effects cannot be rewound and need their own rollback procedure. For durable version history, keep using git.
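
The checkpoint idea can be sketched as a small stack of snapshots. This is a toy model under stated assumptions, not Claude Code's implementation: the `Session` class and its methods are hypothetical, files are plain strings in a dictionary, and a "deploy" stands in for any remote side effect.

```python
class Session:
    """Toy session: local files and messages are checkpointed, remote effects are not."""

    def __init__(self, files: dict[str, str]):
        self.files = dict(files)
        self.messages: list[str] = []
        self.remote_log: list[str] = []  # side effects outside the local machine
        self._checkpoints: list[tuple[dict[str, str], list[str]]] = []

    def edit(self, path: str, content: str) -> None:
        # Snapshot local state before every edit, as the text describes.
        self._checkpoints.append((dict(self.files), list(self.messages)))
        self.files[path] = content
        self.messages.append(f"edited {path}")

    def deploy(self, note: str) -> None:
        # Remote action: recorded, but never part of a checkpoint.
        self.remote_log.append(note)

    def rewind(self) -> bool:
        """Restore files and messages to the last checkpoint, if any."""
        if not self._checkpoints:
            return False
        self.files, self.messages = self._checkpoints.pop()
        return True


if __name__ == "__main__":
    session = Session({"auth.ts": "v1"})
    session.edit("auth.ts", "v2")
    session.deploy("release 1")
    session.rewind()
    print(session.files, session.messages, session.remote_log)
```

After the rewind, the file is back to "v1" and the message history is empty again, but "release 1" is still in the remote log.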


Quick recall
How does /rewind work, and what does it roll back?

Auto-Compact and "Thrashing"

When context approaches its limit, Claude begins automatic compaction: first removing older tool results, then producing a summary analogous to /compact. You will see a "Conversation compacted" message in the terminal.

A problem arises when a single file or command output is so large that context fills up again immediately after compaction. Claude will retry a few times, and if the situation doesn't change it will stop with an "auto-compaction stopped" error. What to do:

# Option 1: restart and work with smaller files
claude --continue  # resumes the most recent conversation in this directory

# Option 2: delegate to a subagent
# In the session: "create a subagent that reads file X and returns only what's needed"
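
Why thrashing happens can be seen with simple arithmetic. The sketch below assumes that after compaction the context holds the startup baseline plus a summary, and that whatever is left is the headroom for a single read. The 2,000-token summary, the 7,850-token baseline, and all function names are illustrative assumptions; the real compaction threshold is not stated in this article.

```python
import math


def headroom_after_compact(
    window_tokens: int = 200_000,
    baseline_tokens: int = 7_850,
    summary_tokens: int = 2_000,
) -> int:
    """Tokens left for new content right after a compaction (toy model)."""
    return window_tokens - baseline_tokens - summary_tokens


def is_thrashing(item_tokens: int, headroom: int) -> bool:
    """True when one item alone refills the window after every compaction."""
    return item_tokens > headroom


def chunks_needed(item_tokens: int, headroom: int) -> int:
    """Minimum number of pieces so that each piece fits into the headroom."""
    return math.ceil(item_tokens / headroom)


if __name__ == "__main__":
    room = headroom_after_compact()
    for size in (50_000, 250_000):
        print(size, is_thrashing(size, room), chunks_needed(size, room))
```

In practice each chunk should be much smaller than the theoretical headroom, since Claude still needs room to work on it.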

Check yourself
You are working with a monorepo. The `packages/backend/` directory has its own CLAUDE.md with backend-specific rules. After /compact, Claude starts editing backend files — does it apply the rules from that CLAUDE.md?

Context Hygiene Practices

Many "losing the thread" problems are solved not by commands, but by good habits.

One session, one task. Long sessions that switch between unrelated tasks pollute the context faster than anything else. When a task is finished, run /clear and then start the next one.

Don't read everything in sight. A vague prompt ("look at the project and fix the bug") forces Claude to read dozens of files. A precise prompt ("bug in src/auth/token.ts, around line 120") saves several thousand tokens.

Subagent for heavy research. If a task requires reading 20 files, hand it off to a subagent. It operates in its own isolated context; only the final summary comes back, not the entire reading history. See "Subagents and Context Isolation" for details.
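
A rough comparison shows why delegation pays off. The sketch assumes about 2,000 tokens per file (the upper estimate used earlier in this article), and it invents a 100-token task prompt and a 500-token summary purely for illustration.

```python
def direct_cost(file_count: int, tokens_per_file: int) -> int:
    """Tokens added to the main context when Claude reads every file itself."""
    return file_count * tokens_per_file


def delegated_cost(task_prompt_tokens: int, summary_tokens: int) -> int:
    """Tokens added to the main context when a subagent does the reading.

    Only the task description and the returned summary land in the main
    context; the file contents stay in the subagent's own window.
    """
    return task_prompt_tokens + summary_tokens


if __name__ == "__main__":
    direct = direct_cost(file_count=20, tokens_per_file=2_000)
    delegated = delegated_cost(task_prompt_tokens=100, summary_tokens=500)
    print(f"direct: {direct} tokens, delegated: {delegated} tokens")
```

With these illustrative numbers, reading directly adds 40,000 tokens to the main context, while delegating adds 600.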

Compact before a complex new phase. If you've finished research and are about to start implementation, it's a good moment for /compact focus on the plan we just created. This "clears the board" while preserving exactly what you need for the next step.

CLAUDE.md instead of in-chat explanations. If you've explained the same rule to Claude multiple times (a naming convention, an architectural constraint) — that rule belongs in CLAUDE.md. After /compact, the explanation from the chat will be gone, but CLAUDE.md will be re-read from scratch.

Nested CLAUDE.md and compact. Remember: the project-root CLAUDE.md is restored after a compact, but CLAUDE.md files from subdirectories are not. They are reloaded only the next time Claude reads a file within that subdirectory. This means important instructions are better kept in the root file rather than buried deep in the tree.
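
The reload behavior described above can be modeled in a few lines. This is a toy sketch, not the actual loader: `InstructionLoader` is a hypothetical class that tracks which directories' CLAUDE.md files are currently in context, loads a nested one when a file under that directory is read, and keeps only the root one after a compact.

```python
from pathlib import PurePosixPath


class InstructionLoader:
    """Toy model of which CLAUDE.md files are in context at a given moment."""

    def __init__(self, root: str, nested_dirs: list[str]):
        self.root = PurePosixPath(root)
        # Directories that contain their own CLAUDE.md.
        self.nested = {PurePosixPath(d) for d in nested_dirs}
        self.loaded = {self.root}

    def read_file(self, path: str) -> None:
        """Reading a file pulls in the CLAUDE.md of any nested dir above it."""
        parents = PurePosixPath(path).parents
        for directory in self.nested:
            if directory in parents:
                self.loaded.add(directory)

    def compact(self) -> None:
        """After a compact only the project-root CLAUDE.md is restored."""
        self.loaded = {self.root}


if __name__ == "__main__":
    loader = InstructionLoader(".", ["packages/backend"])
    loader.read_file("packages/backend/src/app.ts")
    loader.compact()
    print(sorted(str(p) for p in loader.loaded))  # root only
    loader.read_file("packages/backend/src/db.ts")
    print(sorted(str(p) for p in loader.loaded))  # root and backend again
```

In this model the backend rules are missing between the compact and the next read inside packages/backend/, which is the window in which root-level instructions are the only ones in effect.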


See also

  • "CLAUDE.md and the Memory System": how the instruction and auto-memory system that loads into context is structured
  • "Subagents and Context Isolation": the key tool for research tasks that drain the context window
  • "Skills: Portable Capabilities": how to keep procedural instructions out of context until they're actually needed
  • "Hooks: Lifecycle Events": hook output also enters context via additionalContext; keeping it compact matters
  • "Settings and Configuration Hierarchy": autoMemoryEnabled and other settings that affect what gets loaded into context
  • "Model Selection and Thinking Modes": extended thinking consumes more tokens; worth factoring in when planning your context budget
