Tool Use, MCP, and Streaming in the API

In the previous article we covered the basic Messages API — how to send messages, read responses, and count tokens. Now we move on to three mechanisms that turn a simple client into an agent: tools (tool use), connecting MCP servers directly from the API, and streaming.

Client-Side Tools: Defining Them with JSON Schema

A tool is a function in your code that the model can call. It is described in the tools request parameter with three fields (name, description, and input_schema):

{
  "name": "get_file_contents",
  "description": "Reads the contents of a text file at the given path. Use when you need to retrieve source code, a config file, or data from the filesystem. Returns a string. Does not work with binary files or URLs.",
  "input_schema": {
    "type": "object",
    "properties": {
      "path": {
        "type": "string",
        "description": "Absolute or relative path to the file"
      }
    },
    "required": ["path"]
  }
}

The most important field is description. This is what the model uses to decide when and how to invoke the tool. Be specific: what it does, when to call it, what it returns, and what it cannot do. A two-word description serves the model far worse than four specific sentences.

For tools with complex parameters or format constraints, add input_examples — an array of sample arguments. Each example must conform to input_schema — an invalid example will be rejected by the API with a 400 error.
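Since an invalid example costs a round trip and a 400 error, it can be worth checking examples locally before sending the request. The sketch below is a rough pre-flight check, not a full JSON Schema validator: it only verifies required fields and string-typed properties. The helper name check_input_examples is hypothetical.

```python
def check_input_examples(tool: dict) -> list[str]:
    """Return a list of problems found in tool["input_examples"] (empty if none).

    Rough local check only: required fields must be present and
    string-typed properties must hold strings.
    """
    schema = tool["input_schema"]
    properties = schema.get("properties", {})
    required = schema.get("required", [])
    problems = []
    for i, example in enumerate(tool.get("input_examples", [])):
        for key in required:
            if key not in example:
                problems.append(f"example {i}: missing required field '{key}'")
        for key, value in example.items():
            expected = properties.get(key, {}).get("type")
            if expected == "string" and not isinstance(value, str):
                problems.append(f"example {i}: field '{key}' should be a string")
    return problems


tool = {
    "name": "get_file_contents",
    "description": "Reads the contents of a text file at the given path.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
    "input_examples": [{"path": "src/main.py"}, {"path": "/etc/hosts"}],
}

for problem in check_input_examples(tool):
    print(problem)
```

For anything beyond flat schemas, a real JSON Schema validator is the better choice; this sketch only catches the most common mistakes.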

Check yourself

The tool is described in a single line: `"description": "Search files"`. Name the specific problem this causes.

The Tool Use Cycle: Request → Execution → Result

When the model decides to call a tool, the response arrives with stop_reason: "tool_use". The content contains a tool_use block:

{
  "type": "tool_use",
  "id": "toolu_01AbCdEf",
  "name": "get_file_contents",
  "input": {"path": "src/main.py"}
}

Three fields: id (needed for tool_result), name (which function to run), input (arguments). Your code performs the operation, then sends the result in a new user message:

{
  "role": "user",
  "content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01AbCdEf",
      "content": "import anthropic\n\ndef main():\n    ..."
    }
  ]
}

Before sending this message, append the assistant's response containing the tool_use block to the messages history. The API is stateless, so the full conversation state is in your hands. If the tool failed, pass "is_error": true with the error text in content, and the model will adjust its approach.

Important: in a user message containing results, tool_result blocks must come first — before any text content. Violating this order will return a 400 error.
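The two rules above (report failures with is_error, and put tool_result blocks before any text) can be folded into one helper. This is a minimal sketch that assumes tool_use blocks are plain dicts and that handlers is a hypothetical mapping from tool name to a Python callable. The function name build_tool_result_message is also an assumption.

```python
def build_tool_result_message(tool_uses, handlers, follow_up_text=None):
    """Run each requested tool and build the user message that carries the results.

    tool_uses: list of tool_use blocks (dicts with "id", "name", "input").
    handlers: hypothetical mapping of tool name -> Python callable.
    """
    content = []
    for block in tool_uses:
        result = {"type": "tool_result", "tool_use_id": block["id"]}
        try:
            result["content"] = str(handlers[block["name"]](**block["input"]))
        except Exception as exc:
            # Report the failure to the model instead of crashing the loop.
            result["content"] = f"{type(exc).__name__}: {exc}"
            result["is_error"] = True
        content.append(result)
    # tool_result blocks go first; optional text is appended after them.
    if follow_up_text:
        content.append({"type": "text", "text": follow_up_text})
    return {"role": "user", "content": content}
```

An unknown tool name raises KeyError inside the try block, so it also reaches the model as an error result rather than stopping the application.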

After receiving the tool_result, the model continues generating. If another tool is needed, stop_reason will again be "tool_use". The cycle repeats until "end_turn" is received.

sequenceDiagram
    participant App as Your application
    participant API as Claude API
    App->>API: messages + tools
    API-->>App: stop_reason tool_use
    Note over App: Execute name(input)
    App->>API: messages + tool_result
    API-->>App: stop_reason end_turn
    Note over App: Or tool_use again — loop

Agentic tool-use loop: send request, execute tool, return result
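The cycle in the diagram can be sketched as a small driver function. To keep it runnable without network access, the API call is injected: send is a hypothetical callable that takes the message list and returns a response as a plain dict with "stop_reason" and "content" (in real code it would wrap the Messages API call and pass tools, and the SDK returns objects rather than dicts). execute is likewise a hypothetical callable that runs a tool by name.

```python
def run_tool_loop(send, messages, execute, max_turns=10):
    """Drive the request/execute/result cycle until the model stops asking for tools.

    send: hypothetical callable, messages -> response dict.
    execute: hypothetical callable, (name, input) -> result string.
    """
    for _ in range(max_turns):
        response = send(messages)
        # The API is stateless: record the assistant turn before answering it.
        messages.append({"role": "assistant", "content": response["content"]})
        if response["stop_reason"] != "tool_use":
            return response
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": execute(block["name"], block["input"]),
            }
            for block in response["content"]
            if block["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("tool loop did not finish within max_turns")
```

The max_turns cap is a design choice of this sketch, not something the API requires: it stops a model that keeps requesting tools from looping forever.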

Check yourself

You sent a `tool_result` and received a new response. The `stop_reason` is `"tool_use"` again. What is happening and what should you do?

Controlling Calls: tool_choice and strict

The default is tool_choice: {"type": "auto"}: the model decides on its own whether to call a tool or respond with text. Three other options are available:

`tool_choice`	Behavior
`{"type": "any"}`	Must call at least one tool
`{"type": "tool", "name": "..."}`	Only this specific tool
`{"type": "none"}`	Tools are blocked for this response

With any and tool, the model does not write text before the call — it goes straight to tool_use. Add "strict": true to the tool definition to guarantee that arguments exactly match the schema, with no extra fields and no missing required parameters.
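Forcing a single strict tool is a common way to get structured output. The sketch below builds the tools and tool_choice request parameters for that case. The helper name forced_strict_request is hypothetical, and setting additionalProperties to false is an assumption about what strict mode expects from the schema, so verify it against the current documentation.

```python
import copy


def forced_strict_request(tool: dict) -> dict:
    """Return `tools` and `tool_choice` parameters that force one strict tool call."""
    strict_tool = copy.deepcopy(tool)  # leave the caller's definition untouched
    strict_tool["strict"] = True
    # Assumption: strict mode expects a closed object schema, so extra
    # properties are disallowed explicitly.
    strict_tool["input_schema"].setdefault("additionalProperties", False)
    return {
        "tools": [strict_tool],
        "tool_choice": {"type": "tool", "name": tool["name"]},
    }


params = forced_strict_request({
    "name": "get_file_contents",
    "description": "Reads the contents of a text file at the given path.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
})
print(params["tool_choice"])
```

The returned dict can be unpacked into the request alongside model, max_tokens, and messages.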

In addition to client-side tools, there are server-side tools. These are executed by Anthropic's infrastructure, so you do not need to send a tool_result. They are passed in the same tools parameter, but are identified by a predefined type instead of a schema you write:

tools=[{"type": "web_search_20260209", "name": "web_search"}]

MCP Connector: External Servers from the API

In Claude Code, MCP servers are added with the claude mcp add command. In the Messages API, use the mcp_servers parameter (a beta feature requiring the mcp-client-2025-11-20 header):

response = client.beta.messages.create(
    model="claude-opus-4-8",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Show the open PRs"}],
    mcp_servers=[
        {
            "type": "url",
            "url": "https://github-mcp.example.com/sse",
            "name": "github",
            "authorization_token": "ghp_..."
        }
    ],
    tools=[{"type": "mcp_toolset", "mcp_server_name": "github"}],
    betas=["mcp-client-2025-11-20"],
)

The server must be publicly accessible over HTTPS (SSE or Streamable HTTP) — stdio servers do not work through this mechanism. The mcp_servers parameter describes the connection; tools with type mcp_toolset specifies which tools to include (all by default).

To block dangerous tools (denylist):

{
  "type": "mcp_toolset",
  "mcp_server_name": "github",
  "configs": {"delete_repository": {"enabled": false}}
}

For an allowlist, set "default_config": {"enabled": false}, then enable the tools you need by name via configs. The response contains mcp_tool_use and mcp_tool_result blocks, which are analogous to the regular ones, plus a server_name field.
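The allowlist form described above can be generated from a list of tool names. This is a minimal sketch: the helper name mcp_toolset_allowlist and the tool names in the usage example are hypothetical, and the real names depend on what the MCP server exposes.

```python
def mcp_toolset_allowlist(server_name: str, allowed: list[str]) -> dict:
    """Build an mcp_toolset entry that disables every tool except those in `allowed`."""
    return {
        "type": "mcp_toolset",
        "mcp_server_name": server_name,
        "default_config": {"enabled": False},  # everything off by default
        "configs": {name: {"enabled": True} for name in allowed},
    }


# Hypothetical tool names, for illustration only.
toolset = mcp_toolset_allowlist("github", ["list_pull_requests", "get_pull_request"])
print(toolset)
```

An allowlist is usually the safer default for servers you do not control, because tools added to the server later stay disabled until you opt in.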

Streaming

By default, the API waits for the complete response and returns it all at once. For long generations this means several seconds of silence before the first character. Setting stream: true switches the response to Server-Sent Events, so text arrives incrementally as it is generated.

In the Python SDK, use the .stream() method:

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    messages=[{"role": "user", "content": "Explain GC in CPython"}],
) as stream:
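    # text_stream yields only the text deltas as they arrive
    for chunk in stream.text_stream:
        print(chunk, end="", flush=True)
    # Read the accumulated message while the stream context is still open
    final = stream.get_final_message()

print(f"\nTokens: {final.usage.input_tokens} in / {final.usage.output_tokens} out")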

In TypeScript:

await client.messages
  .stream({
    model: "claude-sonnet-4-6",
    max_tokens: 2048,
    messages: [{ role: "user", content: "Explain GC in CPython" }],
  })
  .on("text", (t) => process.stdout.write(t))
  .finalMessage(); // resolves once the stream has completed

Streaming with tools. When the model streams a tool call, instead of text_delta you receive input_json_delta events containing a partial JSON fragment of the arguments:

event: content_block_delta
data: {"type":"content_block_delta","index":1,
       "delta":{"type":"input_json_delta","partial_json":"{\"path\": \"/src/"}}

Concatenate the partial_json strings until the content_block_stop event, then parse the JSON. The SDK does this automatically — tool data does not appear in text_stream. Watch the final stop_reason in the message_delta event: if it is "tool_use", the next iteration of the cycle is required.
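If you consume raw events instead of using the SDK helpers, the accumulation step can be sketched as follows. The sketch assumes the events have already been decoded from SSE into plain dicts shaped like the one shown above. The function name collect_tool_inputs is hypothetical.

```python
import json


def collect_tool_inputs(events):
    """Assemble tool arguments from a sequence of already-decoded stream events.

    Returns {content block index: parsed input dict} for every tool_use block.
    """
    buffers = {}  # block index -> list of partial_json fragments
    inputs = {}
    for event in events:
        kind = event["type"]
        if kind == "content_block_start" and event["content_block"]["type"] == "tool_use":
            buffers[event["index"]] = []
        elif kind == "content_block_delta" and event["delta"]["type"] == "input_json_delta":
            buffers.setdefault(event["index"], []).append(event["delta"]["partial_json"])
        elif kind == "content_block_stop" and event["index"] in buffers:
            # Only now is the concatenated string guaranteed to be complete JSON.
            raw = "".join(buffers.pop(event["index"]))
            inputs[event["index"]] = json.loads(raw) if raw else {}
    return inputs
```

Fragments are buffered per block index because a single response can contain several content blocks, and text blocks are ignored entirely.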

Check yourself

When streaming with tools, a `content_block_delta` event arrives with `"type": "input_json_delta"` and `"partial_json": "{\"path\": \"/src/"`. When is it safe to parse this as a complete JSON object?