ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.1 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
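Taken together, the table's conditions behave like a short-circuiting predicate: the page is eligible for skipping only if every filter passes. A minimal Python sketch, using hypothetical record fields named after the table's columns:

```python
from datetime import datetime, timedelta

# Hypothetical page record; field names mirror the filter table above.
page = {
    "download_http_code": 200,
    "download_stamp": datetime(2026, 4, 6, 13, 5, 20),
    "history_drop_reason": None,
    "fh_dont_index": 0,
    "ml_spam_score": 0,
    "meta_canonical": None,
    "src_unparsed": "https://docs.litellm.ai/docs/",
}

def passes_filters(p, now):
    """Return (ok, failed_filter_name) for the crawl-skip decision."""
    checks = [
        ("HTTP status", p["download_http_code"] == 200),
        ("Age cutoff", p["download_stamp"] > now - timedelta(days=182)),  # ~6 months
        ("History drop", p["history_drop_reason"] is None),
        ("Spam/ban", p["fh_dont_index"] != 1 and p["ml_spam_score"] == 0),
        ("Canonical", p["meta_canonical"] in (None, "", p["src_unparsed"])),
    ]
    for name, ok in checks:
        if not ok:
            return False, name
    return True, None

ok, failed = passes_filters(page, now=datetime(2026, 4, 9))
print(ok, failed)  # True None
```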
| Property | Value |
|---|---|
| URL | https://docs.litellm.ai/docs/ |
| Last Crawled | 2026-04-06 13:05:20 (3 days ago) |
| First Indexed | 2023-08-15 00:44:53 (2 years ago) |
| HTTP Status Code | 200 |
| Meta Title | Getting Started | liteLLM |
| Meta Description | LiteLLM is an open-source library that gives you a single, unified interface to call 100+ LLMs (OpenAI, Anthropic, Vertex AI, Bedrock, and more) using the OpenAI format. |
| Meta Canonical | null |
| Boilerpipe Text | LiteLLM is an open-source library that gives you a single, unified interface to call 100+ LLMs (OpenAI, Anthropic, Vertex AI, Bedrock, and more) using the OpenAI format.
- Call any provider using the same completion() interface, with no re-learning the API for each one
- Consistent output format regardless of which provider or model you use
- Built-in retry / fallback logic across multiple deployments via the Router
- Self-hosted LLM Gateway (Proxy) with virtual keys, cost tracking, and an admin UI

Installation

pip install litellm

To run the full Proxy Server (LLM Gateway):

pip install 'litellm[proxy]'

Quick Start

Make your first LLM call using the provider of your choice (OpenAI, Anthropic, Vertex AI, Bedrock, Ollama, Azure OpenAI):

from litellm import completion
import os

os.environ["OPENAI_API_KEY"] = "your-api-key"

response = completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(response.choices[0].message.content)

Every response follows the OpenAI Chat Completions format, regardless of provider.

Response Format

Non-streaming responses return a ModelResponse object:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thanks for asking."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 12,
    "total_tokens": 25
  }
}

Streaming responses (stream=True) yield ModelResponseStream chunks:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "created": 1677858242,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "delta": {
        "role": "assistant",
        "content": "Hello"
      },
      "finish_reason": null
    }
  ]
}

Full output format reference →
Open in Colab

New to LiteLLM?

Want to get started fast? Head to Tutorials for step-by-step walkthroughs: AI coding tools, agent SDKs, proxy setup, and more.
Need to understand a specific feature? Check Guides for streaming, function calling, prompt caching, and other how-tos.

Choose Your Path

LiteLLM Python SDK

Streaming

Add stream=True to receive chunks as they are generated:

from litellm import completion
import os

os.environ["OPENAI_API_KEY"] = "your-api-key"

for chunk in completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem"}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")

Exception Handling

LiteLLM maps every provider's errors to the OpenAI exception types, so your existing error handling works out of the box:

import litellm

try:
    litellm.completion(
        model="anthropic/claude-instant-1",
        messages=[{"role": "user", "content": "Hey!"}]
    )
except litellm.AuthenticationError as e:
    print(f"Bad API key: {e}")
except litellm.RateLimitError as e:
    print(f"Rate limited: {e}")
except litellm.APIError as e:
    print(f"API error: {e}")

Logging & Observability

Send input/output to Langfuse, MLflow, Helicone, Lunary, and more with a single line:

import litellm

litellm.success_callback = ["langfuse", "mlflow", "helicone"]

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hi!"}]
)

See all observability integrations →

Track Costs & Usage

Use a callback to capture cost per response:

import litellm

def track_cost(kwargs, completion_response, start_time, end_time):
    print("Cost:", kwargs.get("response_cost", 0))

litellm.success_callback = [track_cost]

litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)

Custom callback docs →

LiteLLM Proxy Server (LLM Gateway)

The proxy is a self-hosted OpenAI-compatible gateway. Any client that works with OpenAI works with the proxy, with no code changes needed.

Step 1: Start the proxy (pip or Docker)

litellm --model huggingface/bigcode/starcoder
# Proxy running on http://0.0.0.0:4000

Step 2: Call it with the OpenAI client

import openai

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short poem"}]
)
print(response.choices[0].message.content)

Full proxy quickstart with Docker →

Debugging tool: use /utils/transform_request to inspect exactly what LiteLLM sends to any provider; useful for debugging prompt formatting, header issues, and provider-specific parameters.
Interactive API explorer (Swagger) →

Agent & MCP Gateway

LiteLLM is a unified gateway for LLMs, agents, and MCP, so you don't need a separate agent or MCP gateway. One endpoint for 100+ models, A2A agents, and MCP tools.

What to Explore Next

- Routing & Load Balancing: load balance across deployments and set automatic fallbacks.
- Virtual Keys: manage access, budgets, and rate limits per team or user.
- Spend Tracking: track costs per key, team, and user across all providers.
- Guardrails: add content filtering, PII masking, and safety checks.
- Observability: integrate with Langfuse, MLflow, Helicone, and more.
- Enterprise: SSO/SAML, audit logs, and advanced security for production. |
| Markdown |
[**LiteLLM**](https://docs.litellm.ai/)
[Docs](https://docs.litellm.ai/docs/)[Learn](https://docs.litellm.ai/docs/learn)[Integrations](https://docs.litellm.ai/docs/integrations/)[Enterprise](https://docs.litellm.ai/docs/enterprise)[Changelog](https://docs.litellm.ai/release_notes)[Blog](https://docs.litellm.ai/blog)
# Getting Started

**LiteLLM** is an open-source library that gives you a single, unified interface to call 100+ LLMs (OpenAI, Anthropic, Vertex AI, Bedrock, and more) using the OpenAI format.
- Call any provider using the same `completion()` interface, with no re-learning the API for each one
- Consistent output format regardless of which provider or model you use
- Built-in retry / fallback logic across multiple deployments via the [Router](https://docs.litellm.ai/docs/routing)
- Self-hosted [LLM Gateway (Proxy)](https://docs.litellm.ai/docs/simple_proxy) with virtual keys, cost tracking, and an admin UI
[](https://pypi.org/project/litellm/) [](https://github.com/BerriAI/litellm)
***
## Installation
```
pip install litellm
```
To run the full Proxy Server (LLM Gateway):
```
pip install 'litellm[proxy]'
```
***
## Quick Start
Make your first LLM call using the provider of your choice:
- OpenAI
- Anthropic
- Vertex AI
- Bedrock
- Ollama
- Azure OpenAI
```
from litellm import completion
import os
os.environ["OPENAI_API_KEY"] = "your-api-key"
response = completion(
model="openai/gpt-4o",
messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(response.choices[0].message.content)
```
```
from litellm import completion
import os
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"
response = completion(
model="anthropic/claude-3-5-sonnet-20241022",
messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(response.choices[0].message.content)
```
```
from litellm import completion
import os
# auth: run 'gcloud auth application-default login'
os.environ["VERTEXAI_PROJECT"] = "your-project-id"
os.environ["VERTEXAI_LOCATION"] = "us-central1"
response = completion(
model="vertex_ai/gemini-1.5-pro",
messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(response.choices[0].message.content)
```
```
from litellm import completion
import os
os.environ["AWS_ACCESS_KEY_ID"] = "your-key"
os.environ["AWS_SECRET_ACCESS_KEY"] = "your-secret"
os.environ["AWS_REGION_NAME"] = "us-east-1"
response = completion(
model="bedrock/anthropic.claude-haiku-4-5-20251001:0",
messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(response.choices[0].message.content)
```
```
from litellm import completion
response = completion(
model="ollama/llama3",
messages=[{"role": "user", "content": "Hello, how are you?"}],
api_base="http://localhost:11434"
)
print(response.choices[0].message.content)
```
```
from litellm import completion
import os
os.environ["AZURE_API_KEY"] = "your-key"
os.environ["AZURE_API_BASE"] = "https://your-resource.openai.azure.com"
os.environ["AZURE_API_VERSION"] = "2024-02-01"
response = completion(
model="azure/your-deployment-name",
messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(response.choices[0].message.content)
```
Every response follows the OpenAI Chat Completions format, regardless of provider.
### Response Format
Non-streaming responses return a `ModelResponse` object:
```
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1677858242,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm doing well, thanks for asking."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 12,
"total_tokens": 25
}
}
```
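Since every provider returns this same shape, downstream code can read the result uniformly. A small illustrative sketch that pulls the reply text and token counts out of a response in the documented format (plain dicts here, standing in for the `ModelResponse` object):

```python
# Sample response in the documented OpenAI-compatible shape.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "gpt-4o",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant",
                        "content": "Hello! I'm doing well, thanks for asking."},
            "finish_reason": "stop",
        },
    ],
    "usage": {"prompt_tokens": 13, "completion_tokens": 12, "total_tokens": 25},
}

def reply_text(resp):
    """Content of the first choice, regardless of which provider produced it."""
    return resp["choices"][0]["message"]["content"]

def token_counts(resp):
    """(prompt, completion, total) token counts from the usage block."""
    u = resp["usage"]
    return u["prompt_tokens"], u["completion_tokens"], u["total_tokens"]

print(reply_text(response))    # Hello! I'm doing well, thanks for asking.
print(token_counts(response))  # (13, 12, 25)
```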
Streaming responses (`stream=True`) yield `ModelResponseStream` chunks:
```
{
"id": "chatcmpl-abc123",
"object": "chat.completion.chunk",
"created": 1677858242,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"delta": {
"role": "assistant",
"content": "Hello"
},
"finish_reason": null
}
]
}
```
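Because each chunk carries only a `delta`, a client reassembles the full message by concatenating the `content` fragments; the final chunk typically has an empty delta and a non-null `finish_reason`. A minimal sketch over chunks in the documented shape (plain dicts standing in for `ModelResponseStream`):

```python
# Three chunks in the documented chat.completion.chunk shape.
chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hello"},
                  "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": ", world!"},
                  "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {},  # final chunk: empty delta
                  "finish_reason": "stop"}]},
]

def assemble(stream):
    """Concatenate delta.content fragments into the final message text."""
    parts = []
    for chunk in stream:
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content") or "")
    return "".join(parts)

print(assemble(chunks))  # Hello, world!
```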
[Full output format reference →](https://docs.litellm.ai/docs/completion/output)
Open in Colab
[](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_Getting_Started.ipynb)
***
## New to LiteLLM?
**Want to get started fast?** Head to [Tutorials](https://docs.litellm.ai/docs/tutorials) for step-by-step walkthroughs: AI coding tools, agent SDKs, proxy setup, and more.
**Need to understand a specific feature?** Check [Guides](https://docs.litellm.ai/docs/guides) for streaming, function calling, prompt caching, and other how-tos.
***
## Choose Your Path
[**Python SDK**: Integrate LiteLLM directly into your Python application. Drop-in replacement for the OpenAI client. completion(), embedding(), image\_generation() and more; Router with retry, fallback, and load balancing; OpenAI-compatible exceptions across all providers; observability callbacks (Langfuse, MLflow, Helicone…)](https://docs.litellm.ai/docs/#litellm-python-sdk)
[**Proxy Server (LLM Gateway)**: Self-hosted gateway for platform teams managing LLM access across an organization. Virtual keys with per-key/team/user budgets; centralized logging, guardrails, and caching; admin UI for monitoring and management; drop-in replacement for any OpenAI-compatible client](https://docs.litellm.ai/docs/#litellm-proxy-server-llm-gateway)
***
## LiteLLM Python SDK
### Streaming
Add `stream=True` to receive chunks as they are generated:
```
from litellm import completion
import os
os.environ["OPENAI_API_KEY"] = "your-api-key"
for chunk in completion(
model="openai/gpt-4o",
messages=[{"role": "user", "content": "Write a short poem"}],
stream=True,
):
print(chunk.choices[0].delta.content or "", end="")
```
### Exception Handling
LiteLLM maps every provider's errors to the OpenAI exception types, so your existing error handling works out of the box:
```
import litellm
try:
litellm.completion(
model="anthropic/claude-instant-1",
messages=[{"role": "user", "content": "Hey!"}]
)
except litellm.AuthenticationError as e:
print(f"Bad API key: {e}")
except litellm.RateLimitError as e:
print(f"Rate limited: {e}")
except litellm.APIError as e:
print(f"API error: {e}")
```
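A common pattern on top of this mapping is retrying rate-limited calls with backoff. The sketch below is self-contained and illustrative: `RateLimitError` and `make_flaky` are stand-ins for `litellm.RateLimitError` and a real `litellm.completion` call (litellm itself also exposes retry options such as `num_retries`, so treat this as a sketch, not the library's mechanism):

```python
import time

class RateLimitError(Exception):
    """Stand-in for litellm.RateLimitError in this self-contained sketch."""

def make_flaky(fail_times=2):
    """Stand-in for a provider call: raises RateLimitError N times, then succeeds."""
    state = {"remaining": fail_times}
    def call():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RateLimitError("429: slow down")
        return "ok"
    return call

def with_retries(fn, max_retries=5, base_delay=0.01):
    """Retry fn on rate limits with exponential backoff; re-raise when exhausted."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

print(with_retries(make_flaky()))  # ok
```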
### Logging & Observability
Send input/output to Langfuse, MLflow, Helicone, Lunary, and more with a single line:
```
import litellm
litellm.success_callback = ["langfuse", "mlflow", "helicone"]
response = litellm.completion(
model="gpt-4o",
messages=[{"role": "user", "content": "Hi!"}]
)
```
[See all observability integrations →](https://docs.litellm.ai/docs/observability/agentops_integration)
### Track Costs & Usage
Use a callback to capture cost per response:
```
import litellm
def track_cost(kwargs, completion_response, start_time, end_time):
print("Cost:", kwargs.get("response_cost", 0))
litellm.success_callback = [track_cost]
litellm.completion(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello!"}],
stream=True
)
```
[Custom callback docs →](https://docs.litellm.ai/docs/observability/custom_callback)
***
## LiteLLM Proxy Server (LLM Gateway)
The proxy is a self-hosted OpenAI-compatible gateway. Any client that works with OpenAI works with the proxy, with no code changes needed.

#### Step 1: Start the proxy
- pip
- Docker
```
litellm --model huggingface/bigcode/starcoder
# Proxy running on http://0.0.0.0:4000
```
litellm\_config.yaml
```
model_list:
- model_name: gpt-3.5-turbo
litellm_params:
model: azure/your-deployment
api_base: os.environ/AZURE_API_BASE
api_key: os.environ/AZURE_API_KEY
api_version: "2023-07-01-preview"
```
```
docker run \
-v $(pwd)/litellm_config.yaml:/app/config.yaml \
-e AZURE_API_KEY=your-key \
-e AZURE_API_BASE=https://your-resource.openai.azure.com/ \
-p 4000:4000 \
docker.litellm.ai/berriai/litellm:main-latest \
--config /app/config.yaml --detailed_debug
```
#### Step 2: Call it with the OpenAI client
```
import openai
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")
response = client.chat.completions.create(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content": "Write a short poem"}]
)
print(response.choices[0].message.content)
```
[Full proxy quickstart with Docker →](https://docs.litellm.ai/docs/proxy/docker_quick_start)
Debugging tool
Use [**`/utils/transform_request`**](https://docs.litellm.ai/docs/utils/transform_request) to inspect exactly what LiteLLM sends to any provider; useful for debugging prompt formatting, header issues, and provider-specific parameters.
[Interactive API explorer (Swagger) →](https://litellm-api.up.railway.app/)
***
## Agent & MCP Gateway
LiteLLM is a unified gateway for **LLMs, agents, and MCP**, so you don't need a separate agent or MCP gateway. One endpoint for 100+ models, A2A agents, and MCP tools.
[**A2A Agents**: Add and invoke A2A agents via the LiteLLM gateway.](https://docs.litellm.ai/docs/a2a)
[**MCP Gateway**: Central MCP endpoint with per-key access control.](https://docs.litellm.ai/docs/mcp)
***
## What to Explore Next
[**Routing & Load Balancing**: Load balance across deployments and set automatic fallbacks.](https://docs.litellm.ai/docs/routing-load-balancing)
[**Virtual Keys**: Manage access, budgets, and rate limits per team or user.](https://docs.litellm.ai/docs/proxy/virtual_keys)
[**Spend Tracking**: Track costs per key, team, and user across all providers.](https://docs.litellm.ai/docs/proxy/cost_tracking)
[**Guardrails**: Add content filtering, PII masking, and safety checks.](https://docs.litellm.ai/docs/proxy/guardrails/quick_start)
[**Observability**: Integrate with Langfuse, MLflow, Helicone, and more.](https://docs.litellm.ai/docs/observability/agentops_integration)
[**Enterprise**: SSO/SAML, audit logs, and advanced security for production.](https://docs.litellm.ai/docs/enterprise)
[Next: completion()](https://docs.litellm.ai/docs/completion/input)
LiteLLM Enterprise
SSO/SAML, audit logs, spend tracking, multi-team management, and guardrails, built for production.
[Learn more →](https://docs.litellm.ai/docs/enterprise)
Docs
- [Getting Started](https://docs.litellm.ai/docs/)
Community
- [Discord](https://discord.com/invite/wuPM9dRgDw)
- [Twitter](https://twitter.com/LiteLLM)
More
- [GitHub](https://github.com/BerriAI/litellm/)
Copyright © 2026 liteLLM |
| Readable Markdown | **LiteLLM** is an open-source library that gives you a single, unified interface to call 100+ LLMs (OpenAI, Anthropic, Vertex AI, Bedrock, and more) using the OpenAI format.
- Call any provider using the same `completion()` interface, with no re-learning the API for each one
- Consistent output format regardless of which provider or model you use
- Built-in retry / fallback logic across multiple deployments via the [Router](https://docs.litellm.ai/docs/routing)
- Self-hosted [LLM Gateway (Proxy)](https://docs.litellm.ai/docs/simple_proxy) with virtual keys, cost tracking, and an admin UI
[](https://pypi.org/project/litellm/) [](https://github.com/BerriAI/litellm)
***
## Installation
```
pip install litellm
```
To run the full Proxy Server (LLM Gateway):
```
pip install 'litellm[proxy]'
```
***
## Quick Start
Make your first LLM call using the provider of your choice:
- OpenAI
- Anthropic
- Vertex AI
- Bedrock
- Ollama
- Azure OpenAI
```
from litellm import completion
import os
os.environ["OPENAI_API_KEY"] = "your-api-key"
response = completion(
model="openai/gpt-4o",
messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(response.choices[0].message.content)
```
Every response follows the OpenAI Chat Completions format, regardless of provider.
### Response Format
Non-streaming responses return a `ModelResponse` object:
```
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1677858242,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm doing well, thanks for asking."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 12,
"total_tokens": 25
}
}
```
Streaming responses (`stream=True`) yield `ModelResponseStream` chunks:
```
{
"id": "chatcmpl-abc123",
"object": "chat.completion.chunk",
"created": 1677858242,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"delta": {
"role": "assistant",
"content": "Hello"
},
"finish_reason": null
}
]
}
```
[Full output format reference →](https://docs.litellm.ai/docs/completion/output)
Open in Colab
[](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_Getting_Started.ipynb)
***
## New to LiteLLM?
**Want to get started fast?** Head to [Tutorials](https://docs.litellm.ai/docs/tutorials) for step-by-step walkthroughs: AI coding tools, agent SDKs, proxy setup, and more.
**Need to understand a specific feature?** Check [Guides](https://docs.litellm.ai/docs/guides) for streaming, function calling, prompt caching, and other how-tos.
***
## Choose Your Path
***
## LiteLLM Python SDK
### Streaming
Add `stream=True` to receive chunks as they are generated:
```
from litellm import completion
import os
os.environ["OPENAI_API_KEY"] = "your-api-key"
for chunk in completion(
model="openai/gpt-4o",
messages=[{"role": "user", "content": "Write a short poem"}],
stream=True,
):
print(chunk.choices[0].delta.content or "", end="")
```
### Exception Handling
LiteLLM maps every provider's errors to the OpenAI exception types, so your existing error handling works out of the box:
```
import litellm
try:
litellm.completion(
model="anthropic/claude-instant-1",
messages=[{"role": "user", "content": "Hey!"}]
)
except litellm.AuthenticationError as e:
print(f"Bad API key: {e}")
except litellm.RateLimitError as e:
print(f"Rate limited: {e}")
except litellm.APIError as e:
print(f"API error: {e}")
```
### Logging & Observability
Send input/output to Langfuse, MLflow, Helicone, Lunary, and more with a single line:
```
import litellm
litellm.success_callback = ["langfuse", "mlflow", "helicone"]
response = litellm.completion(
model="gpt-4o",
messages=[{"role": "user", "content": "Hi!"}]
)
```
[See all observability integrations →](https://docs.litellm.ai/docs/observability/agentops_integration)
### Track Costs & Usage
Use a callback to capture cost per response:
```
import litellm
def track_cost(kwargs, completion_response, start_time, end_time):
print("Cost:", kwargs.get("response_cost", 0))
litellm.success_callback = [track_cost]
litellm.completion(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello!"}],
stream=True
)
```
[Custom callback docs →](https://docs.litellm.ai/docs/observability/custom_callback)
***
## LiteLLM Proxy Server (LLM Gateway)
The proxy is a self-hosted OpenAI-compatible gateway. Any client that works with OpenAI works with the proxy, with no code changes needed.

#### Step 1: Start the proxy
- pip
- Docker
```
litellm --model huggingface/bigcode/starcoder
# Proxy running on http://0.0.0.0:4000
```
#### Step 2: Call it with the OpenAI client
```
import openai
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")
response = client.chat.completions.create(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content": "Write a short poem"}]
)
print(response.choices[0].message.content)
```
[Full proxy quickstart with Docker →](https://docs.litellm.ai/docs/proxy/docker_quick_start)
Debugging tool
Use [**`/utils/transform_request`**](https://docs.litellm.ai/docs/utils/transform_request) to inspect exactly what LiteLLM sends to any provider; useful for debugging prompt formatting, header issues, and provider-specific parameters.
[Interactive API explorer (Swagger) →](https://litellm-api.up.railway.app/)
***
## Agent & MCP Gateway
LiteLLM is a unified gateway for **LLMs, agents, and MCP**, so you don't need a separate agent or MCP gateway. One endpoint for 100+ models, A2A agents, and MCP tools.
***
## What to Explore Next
[**Routing & Load Balancing**: Load balance across deployments and set automatic fallbacks.](https://docs.litellm.ai/docs/routing-load-balancing)
[**Virtual Keys**: Manage access, budgets, and rate limits per team or user.](https://docs.litellm.ai/docs/proxy/virtual_keys)
[**Spend Tracking**: Track costs per key, team, and user across all providers.](https://docs.litellm.ai/docs/proxy/cost_tracking)
[**Guardrails**: Add content filtering, PII masking, and safety checks.](https://docs.litellm.ai/docs/proxy/guardrails/quick_start)
[**Observability**: Integrate with Langfuse, MLflow, Helicone, and more.](https://docs.litellm.ai/docs/observability/agentops_integration)
[**Enterprise**: SSO/SAML, audit logs, and advanced security for production.](https://docs.litellm.ai/docs/enterprise) |
| Shard | 11 (laksa) |
| Root Hash | 4050687676692930411 |
| Unparsed URL | ai,litellm!docs,/docs/ s443 |