Tool Use, MCP, and Streaming in the API
In the previous article we covered the basic Messages API — how to send messages, read responses, and count tokens. Now we move on to three mechanisms that turn a simple client into an agent: tools (tool use), connecting MCP servers directly from the API, and streaming.
Client-Side Tools: Defining Them with JSON Schema
A tool is a function in your code that the model can call. It is described in the tools request parameter using three required fields:
{
  "name": "get_file_contents",
  "description": "Reads the contents of a text file at the given path. Use when you need to retrieve source code, a config file, or data from the filesystem. Returns a string. Does not work with binary files or URLs.",
  "input_schema": {
    "type": "object",
    "properties": {
      "path": {
        "type": "string",
        "description": "Absolute or relative path to the file"
      }
    },
    "required": ["path"]
  }
}
The most important field is description. This is what the model uses to decide when and how to invoke the tool. Be specific: what it does, when to call it, what it returns, and what it cannot do. A four-sentence description serves the model better than a two-word one.
For tools with complex parameters or format constraints, add input_examples — an array of sample arguments. Each example must conform to input_schema — an invalid example will be rejected by the API with a 400 error.
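As a rough illustration of the input_examples rule, the sketch below checks examples locally before a request is sent. It assumes input_examples sits next to input_schema in the tool definition. check_example is a hypothetical helper that covers only required fields and a few primitive types, not full JSON Schema, and the sample paths are made up.

```python
# Hypothetical helper: a deliberately tiny subset of JSON Schema checks.
# Numeric types are left out to keep the sketch short.
JSON_TYPES = {"string": str, "boolean": bool, "object": dict, "array": list}


def check_example(schema: dict, example: dict) -> list:
    """Return a list of problems; an empty list means the example passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in example:
            problems.append(f"missing required field: {key}")
    properties = schema.get("properties", {})
    for key, value in example.items():
        expected = properties.get(key, {}).get("type")
        if expected in JSON_TYPES and not isinstance(value, JSON_TYPES[expected]):
            problems.append(f"{key}: expected {expected}")
    return problems


tool = {
    "name": "get_file_contents",
    "description": "Reads the contents of a text file at the given path.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
    # Assumed placement; the paths are illustrative.
    "input_examples": [{"path": "src/main.py"}, {"path": "config/settings.toml"}],
}

for example in tool["input_examples"]:
    problems = check_example(tool["input_schema"], example)
    print(example, "ok" if not problems else problems)
```

Running a check like this in tests is cheaper than discovering a malformed example through a rejected request.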
The Tool Use Cycle: Request → Execution → Result
When the model decides to call a tool, the response arrives with stop_reason: "tool_use". The content contains a tool_use block:
{
  "type": "tool_use",
  "id": "toolu_01AbCdEf",
  "name": "get_file_contents",
  "input": {"path": "src/main.py"}
}
Three fields: id (needed for tool_result), name (which function to run), input (arguments). Your code performs the operation, then sends the result in a new user message:
{
  "role": "user",
  "content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01AbCdEf",
      "content": "import anthropic\n\ndef main():\n ..."
    }
  ]
}
Before sending this message, you must append the assistant's response containing the tool_use block to the messages history. The API is stateless, so the full conversation state is entirely in your hands. If the tool failed, pass "is_error": true with the error text in content, and the model will adjust its approach.
Important: in a user message containing results, tool_result blocks must come first, before any text content. Violating this order will return a 400 error.
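One way to make the ordering rule hard to violate is to build the message through a single helper. This is a minimal sketch: build_tool_result_message is a hypothetical name, not part of any SDK, and the follow-up text is illustrative.

```python
from typing import Optional


def build_tool_result_message(results: list, text: Optional[str] = None) -> dict:
    """Build a user message with tool_result blocks first, then optional text."""
    content = []
    for result in results:
        block = {
            "type": "tool_result",
            "tool_use_id": result["tool_use_id"],
            "content": result["content"],
        }
        if result.get("is_error"):
            block["is_error"] = True
        content.append(block)
    # Text is appended last, so it can never precede a tool_result block.
    if text:
        content.append({"type": "text", "text": text})
    return {"role": "user", "content": content}


message = build_tool_result_message(
    [{"tool_use_id": "toolu_01AbCdEf", "content": "import anthropic\n..."}],
    text="Summarize the file.",
)
print([block["type"] for block in message["content"]])
```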
After receiving the tool_result, the model continues generating. If another tool is needed, stop_reason will again be "tool_use". The cycle repeats until "end_turn" is received.
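The cycle can be sketched as a small loop. This is an illustration under assumptions, not the SDK's own agent loop: send is a hypothetical callable that would wrap the real API call and return the response as plain dicts, and fake_send stands in for the API so the example runs offline.

```python
from typing import Callable


def run_tool_loop(send: Callable, messages: list, handlers: dict, max_turns: int = 10) -> dict:
    """Call send until the model stops asking for tools; return the final response."""
    for _ in range(max_turns):
        response = send(messages)
        if response["stop_reason"] != "tool_use":
            return response
        # The API is stateless: record the assistant turn before answering it.
        messages.append({"role": "assistant", "content": response["content"]})
        results = []
        for block in response["content"]:
            if block["type"] != "tool_use":
                continue
            result = {"type": "tool_result", "tool_use_id": block["id"]}
            try:
                result["content"] = handlers[block["name"]](**block["input"])
            except Exception as exc:
                # Report the failure so the model can adjust its approach.
                result["content"] = f"{type(exc).__name__}: {exc}"
                result["is_error"] = True
            results.append(result)
        messages.append({"role": "user", "content": results})
    raise RuntimeError("tool loop did not finish within max_turns")


def fake_send(messages: list) -> dict:
    """Offline stand-in for the API: requests one tool call, then finishes."""
    if isinstance(messages[-1]["content"], str):
        return {
            "stop_reason": "tool_use",
            "content": [{
                "type": "tool_use",
                "id": "toolu_01AbCdEf",
                "name": "get_file_contents",
                "input": {"path": "src/main.py"},
            }],
        }
    return {"stop_reason": "end_turn", "content": [{"type": "text", "text": "Done."}]}


history = [{"role": "user", "content": "What is in src/main.py?"}]
handlers = {"get_file_contents": lambda path: f"contents of {path}"}
final = run_tool_loop(fake_send, history, handlers)
print(final["stop_reason"], len(history))
```

The max_turns cap is a design choice of this sketch: it keeps a model that keeps requesting tools from looping forever.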
sequenceDiagram
    participant App as Your application
    participant API as Claude API
    App->>API: messages + tools
    API-->>App: stop_reason tool_use
    Note over App: Execute name(input)
    App->>API: messages + tool_result
    API-->>App: stop_reason end_turn
    Note over App: Or tool_use again (loop)
Controlling Calls: tool_choice and strict
By default, tool_choice is {"type": "auto"}: the model decides on its own whether to call a tool or respond with text. Three additional options:
| tool_choice | Behavior |
|---|---|
| {"type": "any"} | Must call at least one tool |
| {"type": "tool", "name": "..."} | Only this specific tool |
| {"type": "none"} | Tools are blocked for this response |
With any and tool, the model does not write text before the call — it goes straight to tool_use. Add "strict": true to the tool definition to guarantee that arguments exactly match the schema, with no extra fields and no missing required parameters.
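A small sketch of how these two settings might be combined when assembling a request. with_forced_tool is a hypothetical helper, not an SDK function. It only adds "strict": true to the tool definition, as described above, and does not address any further requirements strict mode may place on the schema.

```python
def with_forced_tool(tools: list, name: str) -> dict:
    """Return request kwargs that force a call to one tool, with strict enabled on it."""
    if name not in {tool["name"] for tool in tools}:
        raise ValueError(f"unknown tool: {name}")
    # Copy each definition so the caller's list is left untouched.
    prepared = [
        {**tool, "strict": True} if tool["name"] == name else dict(tool)
        for tool in tools
    ]
    return {"tools": prepared, "tool_choice": {"type": "tool", "name": name}}


tools = [{
    "name": "get_file_contents",
    "description": "Reads the contents of a text file at the given path.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}]

kwargs = with_forced_tool(tools, "get_file_contents")
print(kwargs["tool_choice"])
```

The returned dict is meant to be merged into the other request parameters, for example with **kwargs.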
In addition to client-side tools, there are server-side tools — these are executed by Anthropic's infrastructure, and you do not need to send a tool_result. They are connected via the same tools parameter, but with a named type:
tools=[{"type": "web_search_20260209", "name": "web_search"}]
MCP Connector: External Servers from the API
In Claude Code, MCP servers are added with the claude mcp add command. In the Messages API, use the mcp_servers parameter (a beta feature requiring the mcp-client-2025-11-20 header):
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-opus-4-8",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Show open PRs"}],
    mcp_servers=[
        {
            "type": "url",
            "url": "https://github-mcp.example.com/sse",
            "name": "github",
            "authorization_token": "ghp_...",
        }
    ],
    tools=[{"type": "mcp_toolset", "mcp_server_name": "github"}],
    betas=["mcp-client-2025-11-20"],
)
The server must be publicly accessible over HTTPS (SSE or Streamable HTTP); stdio servers do not work through this mechanism. The mcp_servers parameter describes the connection, while the tools entry with type mcp_toolset specifies which of the server's tools to include (all by default).
To block dangerous tools (denylist):
{
  "type": "mcp_toolset",
  "mcp_server_name": "github",
  "configs": {"delete_repository": {"enabled": false}}
}
For an allowlist, set "default_config": {"enabled": false}, then enable the tools you need by name via configs. The response contains mcp_tool_use and mcp_tool_result blocks, analogous to the regular ones, plus a server_name field.
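The allowlist variant can be generated from a list of tool names. In this sketch, allowlist_toolset is a hypothetical helper, and the tool names passed to it are illustrative rather than taken from a real server.

```python
def allowlist_toolset(server_name: str, allowed: list) -> dict:
    """Build an mcp_toolset entry that disables everything except the named tools."""
    return {
        "type": "mcp_toolset",
        "mcp_server_name": server_name,
        # Everything is off unless explicitly enabled below.
        "default_config": {"enabled": False},
        "configs": {name: {"enabled": True} for name in allowed},
    }


# Illustrative tool names; a real server defines its own.
toolset = allowlist_toolset("github", ["list_pull_requests", "get_pull_request"])
print(toolset["configs"])
```

An allowlist is the safer default for servers whose tool set can change, because a tool added later stays disabled until it is named here.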
Streaming
By default, the API waits for the complete response and returns it all at once. For long generations this means several seconds of silence before the first character. Setting stream: true switches the response to Server-Sent Events, so output arrives as it is generated.
In the Python SDK, use the .stream() method:
# Assumes client = anthropic.Anthropic()
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    messages=[{"role": "user", "content": "Explain GC in CPython"}],
) as stream:
    # Print text fragments as they arrive
    for chunk in stream.text_stream:
        print(chunk, end="", flush=True)
    # The accumulated message carries the usage counters
    final = stream.get_final_message()

print(f"\nTokens: {final.usage.input_tokens} in / {final.usage.output_tokens} out")
In TypeScript:
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

await client.messages
  .stream({
    model: "claude-sonnet-4-6",
    max_tokens: 2048,
    messages: [{ role: "user", content: "Explain GC in CPython" }],
  })
  .on("text", (t) => process.stdout.write(t));
Streaming with tools. When the model streams a tool call, instead of text_delta you receive input_json_delta events containing a partial JSON fragment of the arguments:
event: content_block_delta
data: {"type":"content_block_delta","index":1,
       "delta":{"type":"input_json_delta","partial_json":"{\"path\": \"/src/"}}
Concatenate the partial_json strings until the content_block_stop event, then parse the JSON. The SDK does this automatically; tool data does not appear in text_stream. Watch the final stop_reason in the message_delta event: if it is "tool_use", the next iteration of the cycle is required.
See also
- Claude API and Anthropic SDK: the basics — Messages API, model families, stop_reason
- Claude Agent SDK: building agents programmatically — a ready-made agent loop without manual tool loop management
- Prompt caching, batches, and cost optimization — caching tool schemas reduces cost
- Model Context Protocol: architecture and fundamentals — what MCP is as an open standard
- Connecting MCP servers in Claude Code — the same, but through the Claude Code interface
- Subagents and context isolation — an alternative orchestration pattern