
LiteLLM

BerriAI/litellm

LiteLLM is a Python SDK and proxy for calling many LLM providers through one interface.

Forks 9,225
Author BerriAI
Language Python
License NOASSERTION
Synced 2026-06-27

What it is

LiteLLM is for teams that work with several LLM providers rather than a single one. It gives applications a common call interface, so application code depends less on any one provider's API.

The project grew from a practical problem: providers differ in request formats, model names, error formats, rate limits, and pricing.

What is inside

The repository includes a Python SDK, proxy server, routing settings, spend tracking, access rules, provider integrations, and deployment examples.

Proxy mode is especially important because it centralizes keys, limits, logs, and model selection for several products or services.

How it is used

Teams use LiteLLM to switch models, compare providers, limit spend, or give internal teams one address for model calls.
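
The spend-limit idea can be sketched in plain Python. This is a minimal illustration, not LiteLLM's own spend tracking: the SpendTracker class, the BudgetExceeded exception, and the cost figures are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when recording a call would push spend past the limit."""


@dataclass
class SpendTracker:
    # Hypothetical per-team limit in USD; not a LiteLLM setting.
    limit_usd: float
    spent_usd: float = 0.0
    calls: list = field(default_factory=list)

    def record(self, model: str, cost_usd: float) -> None:
        """Record one call's cost; raise if it would exceed the limit."""
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(f"{model}: limit of {self.limit_usd} USD reached")
        self.spent_usd += cost_usd
        self.calls.append((model, cost_usd))

    @property
    def remaining_usd(self) -> float:
        return self.limit_usd - self.spent_usd


# Made-up costs, only to show the bookkeeping.
tracker = SpendTracker(limit_usd=1.00)
tracker.record("gpt-4o-mini", 0.25)
tracker.record("gpt-4o-mini", 0.50)
print(tracker.remaining_usd)
```

In a real deployment the limit would more likely be checked before the call, against an estimated cost, and kept in shared storage rather than in process memory.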

Before adopting it, a team still needs to decide where keys are stored, what gets logged, and how user data is handled.

Strengths and limits

The strength is reduced provider lock-in and centralized cost control.

The main limitation is that it adds another layer on the critical path of every model call, and that layer must be monitored and understood.

In larger teams, LiteLLM becomes a policy point: which models are allowed, where keys live, what limits are set, what requests are logged, and how to switch providers during an outage. Without those rules, a unified model layer becomes opaque.
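
Two of those rules, which models are allowed and how to switch providers during an outage, can be sketched as follows. This is an illustration under stated assumptions, not LiteLLM's routing implementation: call_with_fallback, ModelNotAllowed, fake_call, and the model names are hypothetical, and ConnectionError stands in for whatever error a real provider call would raise.

```python
from typing import Callable, Sequence


class ModelNotAllowed(Exception):
    """Raised when a caller asks for a model outside the allowlist."""


def call_with_fallback(
    models: Sequence[str],
    allowed: set[str],
    call: Callable[[str], str],
) -> tuple[str, str]:
    """Try each model in order and return (model_used, reply).

    `call` is a stand-in for a real provider call; it takes a model name.
    """
    errors = []
    for model in models:
        if model not in allowed:
            raise ModelNotAllowed(model)
        try:
            return model, call(model)
        except ConnectionError as exc:  # treated here as a provider outage
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Fake provider call: the primary model is "down", the backup answers.
def fake_call(model: str) -> str:
    if model == "primary-model":
        raise ConnectionError("provider unavailable")
    return f"reply from {model}"


used, reply = call_with_fallback(
    ["primary-model", "backup-model"],
    allowed={"primary-model", "backup-model"},
    call=fake_call,
)
print(used, reply)
```

Returning the model that actually answered keeps the layer transparent: logs and dashboards can show when traffic moved to the backup.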

Streaming and error behavior matter too. Providers return different formats, so a wrapper must not only send a request but also handle cancellation, timeouts, retries, and partial output carefully.
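
A small sketch shows why partial output needs care when a stream is retried. It uses plain Python generators rather than a real provider stream: collect_stream and flaky_stream are hypothetical names, and TimeoutError stands in for a stalled connection. Cancellation and backoff between attempts are left out to keep the example short.

```python
from typing import Callable, Iterator


def collect_stream(
    open_stream: Callable[[], Iterator[str]],
    max_attempts: int = 3,
) -> tuple[str, bool]:
    """Return (text, complete).

    Each attempt starts a fresh stream. A partial reply from a failed
    attempt is discarded before retrying, so chunks are never mixed
    across attempts. If every attempt fails, the last partial text is
    returned with complete=False so the caller can decide what to show.
    """
    partial = ""
    for _ in range(max_attempts):
        chunks: list[str] = []
        try:
            for chunk in open_stream():
                chunks.append(chunk)
            return "".join(chunks), True
        except TimeoutError:
            partial = "".join(chunks)
    return partial, False


# Fake stream: the first attempt times out after one chunk.
attempts = {"count": 0}


def flaky_stream() -> Iterator[str]:
    attempts["count"] += 1
    yield "Hello"
    if attempts["count"] == 1:
        raise TimeoutError("stream stalled")
    yield ", world"


text, complete = collect_stream(flaky_stream)
print(text, complete)
```

The design choice here is to restart from scratch rather than append to the partial text, since a retried request is not guaranteed to continue where the failed one stopped.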

It is most valuable when treated as infrastructure rather than a convenience import. Once several services depend on it, versioning, access rules, dashboards, and fallback behavior become part of the system design.

That is why many teams evaluate it together with observability and access control, not only by checking whether one model call succeeds locally.

Example

A unified model call

This shows the SDK idea: the app calls the completion function from LiteLLM, and the provider is selected by the model name.

Language: Python
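# Assumes OPENAI_API_KEY is set in the environment for this OpenAI model.
from litellm import completion

# The model name selects the provider; messages use the OpenAI chat format.
response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize this text"}],
)

# The response follows the OpenAI response shape.
print(response.choices[0].message.content)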