๐Ÿ•ท๏ธ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 11 (from laksa189)
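The shard is derived from the URL's host, so every page on a host lands on the same shard. The actual hash function and shard count are not shown by the inspector; a minimal sketch of hostname-based sharding, with a hypothetical hash and shard count, might look like:

```python
import hashlib

NUM_SHARDS = 16  # hypothetical: the real shard count is not shown by the inspector


def shard_for_host(host: str) -> int:
    """Map a hostname to a stable shard ID so all URLs on a host co-locate."""
    # MD5 here is only a stand-in for whatever stable hash the crawler uses.
    digest = hashlib.md5(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


print(shard_for_host("docs.litellm.ai"))
```

The key property is determinism: the same host always maps to the same shard, so lookups like the ones below only have to query one shard.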

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:
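A robots.txt check like this one can be approximated with Python's standard `urllib.robotparser`. This sketch parses a cached robots.txt body offline instead of fetching it live; the user-agent string and the rules are placeholders, not the crawler's actual ones:

```python
from urllib.robotparser import RobotFileParser

# Parse a cached robots.txt body (list of lines) rather than fetching it.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

print(rp.can_fetch("ExampleBot", "https://docs.litellm.ai/docs/"))      # True
print(rp.can_fetch("ExampleBot", "https://docs.litellm.ai/private/x"))  # False
```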

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

โ„น๏ธ Skipped - page is already crawled

📄 INDEXABLE · ✅ CRAWLED (3 days ago) · 🤖 ROBOTS ALLOWED

Page Info Filters

| Filter | Status | Condition | Details |
| --- | --- | --- | --- |
| HTTP status | PASS | `download_http_code = 200` | HTTP 200 |
| Age cutoff | PASS | `download_stamp > now() - 6 MONTH` | 0.1 months ago |
| History drop | PASS | `isNull(history_drop_reason)` | No drop reason |
| Spam/ban | PASS | `fh_dont_index != 1 AND ml_spam_score = 0` | ml_spam_score=0 |
| Canonical | PASS | `meta_canonical IS NULL OR = '' OR = src_unparsed` | Not set |
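The five filters combine into a single indexability predicate: a page is indexable only if all of them pass. A sketch in Python, using the column names from the table (the 6 MONTH cutoff approximated as 182 days; the sample row values are taken from the Page Details below):

```python
from datetime import datetime, timedelta


def is_indexable(row: dict, now: datetime) -> bool:
    # One boolean per filter row above; all must pass.
    fresh = row["download_stamp"] > now - timedelta(days=182)  # ~6 MONTH cutoff
    return (
        row["download_http_code"] == 200
        and fresh
        and row.get("history_drop_reason") is None
        and row["fh_dont_index"] != 1
        and row["ml_spam_score"] == 0
        and row.get("meta_canonical") in (None, "", row["src_unparsed"])
    )


row = {
    "download_http_code": 200,
    "download_stamp": datetime(2026, 4, 6, 13, 5, 20),
    "history_drop_reason": None,
    "fh_dont_index": 0,
    "ml_spam_score": 0,
    "meta_canonical": None,
    "src_unparsed": "ai,litellm!docs,/docs/ s443",
}
print(is_indexable(row, datetime(2026, 4, 9)))  # True
```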

Page Details

| Property | Value |
| --- | --- |
| URL | https://docs.litellm.ai/docs/ |
| Last Crawled | 2026-04-06 13:05:20 (3 days ago) |
| First Indexed | 2023-08-15 00:44:53 (2 years ago) |
| HTTP Status Code | 200 |
| Meta Title | Getting Started \| liteLLM |
| Meta Description | LiteLLM is an open-source library that gives you a single, unified interface to call 100+ LLMs — OpenAI, Anthropic, Vertex AI, Bedrock, and more — using the OpenAI format. |
| Meta Canonical | null |
Boilerpipe Text
LiteLLM is an open-source library that gives you a single, unified interface to call 100+ LLMs โ€” OpenAI, Anthropic, Vertex AI, Bedrock, and more โ€” using the OpenAI format. Call any provider using the same completion() interface โ€” no re-learning the API for each one Consistent output format regardless of which provider or model you use Built-in retry / fallback logic across multiple deployments via the Router Self-hosted LLM Gateway (Proxy) with virtual keys, cost tracking, and an admin UI Installation โ€‹ pip install litellm To run the full Proxy Server (LLM Gateway): pip install 'litellm[proxy]' Quick Start โ€‹ Make your first LLM call using the provider of your choice: OpenAI Anthropic Vertex AI Bedrock Ollama Azure OpenAI from litellm import completion import os os . environ [ "OPENAI_API_KEY" ] = "your-api-key" response = completion ( model = "openai/gpt-4o" , messages = [ { "role" : "user" , "content" : "Hello, how are you?" } ] ) print ( response . choices [ 0 ] . message . content ) Every response follows the OpenAI Chat Completions format, regardless of provider. โœ… Response Format โ€‹ Non-streaming responses return a ModelResponse object: { "id" : "chatcmpl-abc123" , "object" : "chat.completion" , "created" : 1677858242 , "model" : "gpt-4o" , "choices" : [ { "index" : 0 , "message" : { "role" : "assistant" , "content" : "Hello! I'm doing well, thanks for asking." } , "finish_reason" : "stop" } ] , "usage" : { "prompt_tokens" : 13 , "completion_tokens" : 12 , "total_tokens" : 25 } } Streaming responses ( stream=True ) yield ModelResponseStream chunks: { "id" : "chatcmpl-abc123" , "object" : "chat.completion.chunk" , "created" : 1677858242 , "model" : "gpt-4o" , "choices" : [ { "index" : 0 , "delta" : { "role" : "assistant" , "content" : "Hello" } , "finish_reason" : null } ] } ๐Ÿ“– Full output format reference โ†’ Open in Colab New to LiteLLM? โ€‹ Want to get started fast? 
Head to Tutorials for step-by-step walkthroughs โ€” AI coding tools, agent SDKs, proxy setup, and more. Need to understand a specific feature? Check Guides for streaming, function calling, prompt caching, and other how-tos. Choose Your Path โ€‹ LiteLLM Python SDK โ€‹ Streaming โ€‹ Add stream=True to receive chunks as they are generated: from litellm import completion import os os . environ [ "OPENAI_API_KEY" ] = "your-api-key" for chunk in completion ( model = "openai/gpt-4o" , messages = [ { "role" : "user" , "content" : "Write a short poem" } ] , stream = True , ) : print ( chunk . choices [ 0 ] . delta . content or "" , end = "" ) Exception Handling โ€‹ LiteLLM maps every provider's errors to the OpenAI exception types โ€” your existing error handling works out of the box: import litellm try : litellm . completion ( model = "anthropic/claude-instant-1" , messages = [ { "role" : "user" , "content" : "Hey!" } ] ) except litellm . AuthenticationError as e : print ( f"Bad API key: { e } " ) except litellm . RateLimitError as e : print ( f"Rate limited: { e } " ) except litellm . APIError as e : print ( f"API error: { e } " ) Logging & Observability โ€‹ Send input/output to Langfuse, MLflow, Helicone, Lunary, and more with a single line: import litellm litellm . success_callback = [ "langfuse" , "mlflow" , "helicone" ] response = litellm . completion ( model = "gpt-4o" , messages = [ { "role" : "user" , "content" : "Hi!" } ] ) ๐Ÿ“– See all observability integrations โ†’ Track Costs & Usage โ€‹ Use a callback to capture cost per response: import litellm def track_cost ( kwargs , completion_response , start_time , end_time ) : print ( "Cost:" , kwargs . get ( "response_cost" , 0 ) ) litellm . success_callback = [ track_cost ] litellm . completion ( model = "gpt-4o" , messages = [ { "role" : "user" , "content" : "Hello!" } ] , stream = True ) ๐Ÿ“– Custom callback docs โ†’ LiteLLM Proxy Server (LLM Gateway) โ€‹ The proxy is a self-hosted OpenAI-compatible gateway. 
Any client that works with OpenAI works with the proxy โ€” no code changes needed. Step 1 โ€” Start the proxy โ€‹ pip Docker litellm --model huggingface/bigcode/starcoder # Proxy running on http://0.0.0.0:4000 Step 2 โ€” Call it with the OpenAI client โ€‹ import openai client = openai . OpenAI ( api_key = "anything" , base_url = "http://0.0.0.0:4000" ) response = client . chat . completions . create ( model = "gpt-3.5-turbo" , messages = [ { "role" : "user" , "content" : "Write a short poem" } ] ) print ( response . choices [ 0 ] . message . content ) ๐Ÿ‘‰ Full proxy quickstart with Docker โ†’ Debugging tool Use /utils/transform_request to inspect exactly what LiteLLM sends to any provider โ€” useful for debugging prompt formatting, header issues, and provider-specific parameters. ๐Ÿ”— Interactive API explorer (Swagger) โ†’ Agent & MCP Gateway โ€‹ LiteLLM is a unified gateway for LLMs, agents, and MCP โ€” you don't need a separate agent or MCP gateway. One endpoint for 100+ models, A2A agents, and MCP tools. What to Explore Next โ€‹ ๐Ÿ”€ Routing & Load Balancing Load balance across deployments and set automatic fallbacks. ๐Ÿ”‘ Virtual Keys Manage access, budgets, and rate limits per team or user. ๐Ÿ“Š Spend Tracking Track costs per key, team, and user across all providers. ๐Ÿ›ก๏ธ Guardrails Add content filtering, PII masking, and safety checks. ๐Ÿ“ก Observability Integrate with Langfuse, MLflow, Helicone, and more. ๐Ÿญ Enterprise SSO/SAML, audit logs, and advanced security for production.
Markdown
[Skip to main content](https://docs.litellm.ai/docs/#__docusaurus_skipToContent_fallback) [**๐Ÿš… LiteLLM**](https://docs.litellm.ai/) [Docs](https://docs.litellm.ai/docs/)[Learn](https://docs.litellm.ai/docs/learn)[Integrations](https://docs.litellm.ai/docs/integrations/)[Enterprise](https://docs.litellm.ai/docs/enterprise)[Changelog](https://docs.litellm.ai/release_notes)[Blog](https://docs.litellm.ai/blog) - [Get Started]() - [Quickstart](https://docs.litellm.ai/docs/) - [Models & Pricing](https://models.litellm.ai/) - [Changelog](https://docs.litellm.ai/release_notes) - [LiteLLM Python SDK](https://docs.litellm.ai/docs/#litellm-python-sdk) - [LiteLLM AI Gateway (Proxy)](https://docs.litellm.ai/docs/simple_proxy) - [Supported Endpoints](https://docs.litellm.ai/docs/supported_endpoints) - [Supported Models & Providers](https://docs.litellm.ai/docs/providers) - [Routing & Load Balancing](https://docs.litellm.ai/docs/routing-load-balancing) - [Load Testing](https://docs.litellm.ai/docs/benchmarks) - [Contributing](https://docs.litellm.ai/docs/extras/contributing_code) - [Extras](https://docs.litellm.ai/docs/sdk_custom_pricing) - [Troubleshooting](https://docs.litellm.ai/docs/troubleshoot/ui_issues) - Get Started - Quickstart On this page # Getting Started ![](https://docs.litellm.ai/assets/ideal-img/hero.1927c06.640.png) **LiteLLM** is an open-source library that gives you a single, unified interface to call 100+ LLMs โ€” OpenAI, Anthropic, Vertex AI, Bedrock, and more โ€” using the OpenAI format. 
- Call any provider using the same `completion()` interface โ€” no re-learning the API for each one - Consistent output format regardless of which provider or model you use - Built-in retry / fallback logic across multiple deployments via the [Router](https://docs.litellm.ai/docs/routing) - Self-hosted [LLM Gateway (Proxy)](https://docs.litellm.ai/docs/simple_proxy) with virtual keys, cost tracking, and an admin UI [![PyPI](https://img.shields.io/pypi/v/litellm.svg)](https://pypi.org/project/litellm/) [![GitHub Stars](https://img.shields.io/github/stars/BerriAI/litellm?style=social)](https://github.com/BerriAI/litellm) *** ## Installation[โ€‹](https://docs.litellm.ai/docs/#installation "Direct link to Installation") ``` pip install litellm ``` To run the full Proxy Server (LLM Gateway): ``` pip install 'litellm[proxy]' ``` *** ## Quick Start[โ€‹](https://docs.litellm.ai/docs/#quick-start "Direct link to Quick Start") Make your first LLM call using the provider of your choice: - OpenAI - Anthropic - Vertex AI - Bedrock - Ollama - Azure OpenAI ``` from litellm import completion import os os.environ["OPENAI_API_KEY"] = "your-api-key" response = completion( model="openai/gpt-4o", messages=[{"role": "user", "content": "Hello, how are you?"}] ) print(response.choices[0].message.content) ``` ``` from litellm import completion import os os.environ["ANTHROPIC_API_KEY"] = "your-api-key" response = completion( model="anthropic/claude-3-5-sonnet-20241022", messages=[{"role": "user", "content": "Hello, how are you?"}] ) print(response.choices[0].message.content) ``` ``` from litellm import completion import os # auth: run 'gcloud auth application-default login' os.environ["VERTEXAI_PROJECT"] = "your-project-id" os.environ["VERTEXAI_LOCATION"] = "us-central1" response = completion( model="vertex_ai/gemini-1.5-pro", messages=[{"role": "user", "content": "Hello, how are you?"}] ) print(response.choices[0].message.content) ``` ``` from litellm import completion import os 
os.environ["AWS_ACCESS_KEY_ID"] = "your-key" os.environ["AWS_SECRET_ACCESS_KEY"] = "your-secret" os.environ["AWS_REGION_NAME"] = "us-east-1" response = completion( model="bedrock/anthropic.claude-haiku-4-5-20251001:0", messages=[{"role": "user", "content": "Hello, how are you?"}] ) print(response.choices[0].message.content) ``` ``` from litellm import completion response = completion( model="ollama/llama3", messages=[{"role": "user", "content": "Hello, how are you?"}], api_base="http://localhost:11434" ) print(response.choices[0].message.content) ``` ``` from litellm import completion import os os.environ["AZURE_API_KEY"] = "your-key" os.environ["AZURE_API_BASE"] = "https://your-resource.openai.azure.com" os.environ["AZURE_API_VERSION"] = "2024-02-01" response = completion( model="azure/your-deployment-name", messages=[{"role": "user", "content": "Hello, how are you?"}] ) print(response.choices[0].message.content) ``` Every response follows the OpenAI Chat Completions format, regardless of provider. โœ… ### Response Format[โ€‹](https://docs.litellm.ai/docs/#response-format "Direct link to Response Format") Non-streaming responses return a `ModelResponse` object: ``` { "id": "chatcmpl-abc123", "object": "chat.completion", "created": 1677858242, "model": "gpt-4o", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Hello! I'm doing well, thanks for asking." 
}, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 13, "completion_tokens": 12, "total_tokens": 25 } } ``` Streaming responses (`stream=True`) yield `ModelResponseStream` chunks: ``` { "id": "chatcmpl-abc123", "object": "chat.completion.chunk", "created": 1677858242, "model": "gpt-4o", "choices": [ { "index": 0, "delta": { "role": "assistant", "content": "Hello" }, "finish_reason": null } ] } ``` ๐Ÿ“– [Full output format reference โ†’](https://docs.litellm.ai/docs/completion/output) Open in Colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_Getting_Started.ipynb) *** ## New to LiteLLM?[โ€‹](https://docs.litellm.ai/docs/#new-to-litellm "Direct link to New to LiteLLM?") **Want to get started fast?** Head to [Tutorials](https://docs.litellm.ai/docs/tutorials) for step-by-step walkthroughs โ€” AI coding tools, agent SDKs, proxy setup, and more. **Need to understand a specific feature?** Check [Guides](https://docs.litellm.ai/docs/guides) for streaming, function calling, prompt caching, and other how-tos. *** ## Choose Your Path[โ€‹](https://docs.litellm.ai/docs/#choose-your-path "Direct link to Choose Your Path") [๐ŸPython SDKIntegrate LiteLLM directly into your Python application. 
Drop-in replacement for the OpenAI client.completion(), embedding(), image\_generation() and more Router with retry, fallback, and load balancing OpenAI-compatible exceptions across all providers Observability callbacks (Langfuse, MLflow, Heliconeโ€ฆ)](https://docs.litellm.ai/docs/#litellm-python-sdk) [๐Ÿ–ฅ๏ธProxy Server (LLM Gateway)Self-hosted gateway for platform teams managing LLM access across an organization.Virtual keys with per-key/team/user budgets Centralized logging, guardrails, and caching Admin UI for monitoring and management Drop-in replacement for any OpenAI-compatible client](https://docs.litellm.ai/docs/#litellm-proxy-server-llm-gateway) *** ## LiteLLM Python SDK[โ€‹](https://docs.litellm.ai/docs/#litellm-python-sdk "Direct link to LiteLLM Python SDK") ### Streaming[โ€‹](https://docs.litellm.ai/docs/#streaming "Direct link to Streaming") Add `stream=True` to receive chunks as they are generated: ``` from litellm import completion import os os.environ["OPENAI_API_KEY"] = "your-api-key" for chunk in completion( model="openai/gpt-4o", messages=[{"role": "user", "content": "Write a short poem"}], stream=True, ): print(chunk.choices[0].delta.content or "", end="") ``` ### Exception Handling[โ€‹](https://docs.litellm.ai/docs/#exception-handling "Direct link to Exception Handling") LiteLLM maps every provider's errors to the OpenAI exception types โ€” your existing error handling works out of the box: ``` import litellm try: litellm.completion( model="anthropic/claude-instant-1", messages=[{"role": "user", "content": "Hey!"}] ) except litellm.AuthenticationError as e: print(f"Bad API key: {e}") except litellm.RateLimitError as e: print(f"Rate limited: {e}") except litellm.APIError as e: print(f"API error: {e}") ``` ### Logging & Observability[โ€‹](https://docs.litellm.ai/docs/#logging--observability "Direct link to Logging & Observability") Send input/output to Langfuse, MLflow, Helicone, Lunary, and more with a single line: ``` import litellm 
litellm.success_callback = ["langfuse", "mlflow", "helicone"] response = litellm.completion( model="gpt-4o", messages=[{"role": "user", "content": "Hi!"}] ) ``` ๐Ÿ“– [See all observability integrations โ†’](https://docs.litellm.ai/docs/observability/agentops_integration) ### Track Costs & Usage[โ€‹](https://docs.litellm.ai/docs/#track-costs--usage "Direct link to Track Costs & Usage") Use a callback to capture cost per response: ``` import litellm def track_cost(kwargs, completion_response, start_time, end_time): print("Cost:", kwargs.get("response_cost", 0)) litellm.success_callback = [track_cost] litellm.completion( model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}], stream=True ) ``` ๐Ÿ“– [Custom callback docs โ†’](https://docs.litellm.ai/docs/observability/custom_callback) *** ## LiteLLM Proxy Server (LLM Gateway)[โ€‹](https://docs.litellm.ai/docs/#litellm-proxy-server-llm-gateway "Direct link to LiteLLM Proxy Server (LLM Gateway)") The proxy is a self-hosted OpenAI-compatible gateway. Any client that works with OpenAI works with the proxy โ€” no code changes needed. 
![LiteLLM Proxy Dashboard](https://github.com/BerriAI/litellm/assets/29436595/47c97d5e-b9be-4839-b28c-43d7f4f10033) #### Step 1 โ€” Start the proxy[โ€‹](https://docs.litellm.ai/docs/#step-1--start-the-proxy "Direct link to Step 1 โ€” Start the proxy") - pip - Docker ``` litellm --model huggingface/bigcode/starcoder # Proxy running on http://0.0.0.0:4000 ``` litellm\_config.yaml ``` model_list: - model_name: gpt-3.5-turbo litellm_params: model: azure/your-deployment api_base: os.environ/AZURE_API_BASE api_key: os.environ/AZURE_API_KEY api_version: "2023-07-01-preview" ``` ``` docker run \ -v $(pwd)/litellm_config.yaml:/app/config.yaml \ -e AZURE_API_KEY=your-key \ -e AZURE_API_BASE=https://your-resource.openai.azure.com/ \ -p 4000:4000 \ docker.litellm.ai/berriai/litellm:main-latest \ --config /app/config.yaml --detailed_debug ``` #### Step 2 โ€” Call it with the OpenAI client[โ€‹](https://docs.litellm.ai/docs/#step-2--call-it-with-the-openai-client "Direct link to Step 2 โ€” Call it with the OpenAI client") ``` import openai client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000") response = client.chat.completions.create( model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Write a short poem"}] ) print(response.choices[0].message.content) ``` ๐Ÿ‘‰ [Full proxy quickstart with Docker โ†’](https://docs.litellm.ai/docs/proxy/docker_quick_start) Debugging tool Use [**`/utils/transform_request`**](https://docs.litellm.ai/docs/utils/transform_request) to inspect exactly what LiteLLM sends to any provider โ€” useful for debugging prompt formatting, header issues, and provider-specific parameters. ๐Ÿ”— [Interactive API explorer (Swagger) โ†’](https://litellm-api.up.railway.app/) *** ## Agent & MCP Gateway[โ€‹](https://docs.litellm.ai/docs/#agent--mcp-gateway "Direct link to Agent & MCP Gateway") LiteLLM is a unified gateway for **LLMs, agents, and MCP** โ€” you don't need a separate agent or MCP gateway. 
One endpoint for 100+ models, A2A agents, and MCP tools. [๐Ÿ”—A2A AgentsAdd and invoke A2A agents via the LiteLLM gateway.](https://docs.litellm.ai/docs/a2a) [๐Ÿ› ๏ธMCP GatewayCentral MCP endpoint with per-key access control.](https://docs.litellm.ai/docs/mcp) *** ## What to Explore Next[โ€‹](https://docs.litellm.ai/docs/#what-to-explore-next "Direct link to What to Explore Next") [๐Ÿ”€Routing & Load BalancingLoad balance across deployments and set automatic fallbacks.](https://docs.litellm.ai/docs/routing-load-balancing) [๐Ÿ”‘Virtual KeysManage access, budgets, and rate limits per team or user.](https://docs.litellm.ai/docs/proxy/virtual_keys) [๐Ÿ“ŠSpend TrackingTrack costs per key, team, and user across all providers.](https://docs.litellm.ai/docs/proxy/cost_tracking) [๐Ÿ›ก๏ธGuardrailsAdd content filtering, PII masking, and safety checks.](https://docs.litellm.ai/docs/proxy/guardrails/quick_start) [๐Ÿ“กObservabilityIntegrate with Langfuse, MLflow, Helicone, and more.](https://docs.litellm.ai/docs/observability/agentops_integration) [๐ŸญEnterpriseSSO/SAML, audit logs, and advanced security for production.](https://docs.litellm.ai/docs/enterprise) [Nextcompletion()](https://docs.litellm.ai/docs/completion/input) - [Installation](https://docs.litellm.ai/docs/#installation) - [Quick Start](https://docs.litellm.ai/docs/#quick-start) - [Response Format](https://docs.litellm.ai/docs/#response-format) - [New to LiteLLM?](https://docs.litellm.ai/docs/#new-to-litellm) - [Choose Your Path](https://docs.litellm.ai/docs/#choose-your-path) - [LiteLLM Python SDK](https://docs.litellm.ai/docs/#litellm-python-sdk) - [Streaming](https://docs.litellm.ai/docs/#streaming) - [Exception Handling](https://docs.litellm.ai/docs/#exception-handling) - [Logging & Observability](https://docs.litellm.ai/docs/#logging--observability) - [Track Costs & Usage](https://docs.litellm.ai/docs/#track-costs--usage) - [LiteLLM Proxy Server (LLM 
Gateway)](https://docs.litellm.ai/docs/#litellm-proxy-server-llm-gateway) - [Agent & MCP Gateway](https://docs.litellm.ai/docs/#agent--mcp-gateway) - [What to Explore Next](https://docs.litellm.ai/docs/#what-to-explore-next) ๐Ÿš… LiteLLM Enterprise SSO/SAML, audit logs, spend tracking, multi-team management, and guardrails โ€” built for production. [Learn more โ†’](https://docs.litellm.ai/docs/enterprise) Docs - [Getting Started](https://docs.litellm.ai/docs/) Community - [Discord](https://discord.com/invite/wuPM9dRgDw) - [Twitter](https://twitter.com/LiteLLM) More - [GitHub](https://github.com/BerriAI/litellm/) Copyright ยฉ 2026 liteLLM
Readable Markdown
**LiteLLM** is an open-source library that gives you a single, unified interface to call 100+ LLMs โ€” OpenAI, Anthropic, Vertex AI, Bedrock, and more โ€” using the OpenAI format. - Call any provider using the same `completion()` interface โ€” no re-learning the API for each one - Consistent output format regardless of which provider or model you use - Built-in retry / fallback logic across multiple deployments via the [Router](https://docs.litellm.ai/docs/routing) - Self-hosted [LLM Gateway (Proxy)](https://docs.litellm.ai/docs/simple_proxy) with virtual keys, cost tracking, and an admin UI [![PyPI](https://img.shields.io/pypi/v/litellm.svg)](https://pypi.org/project/litellm/) [![GitHub Stars](https://img.shields.io/github/stars/BerriAI/litellm?style=social)](https://github.com/BerriAI/litellm) *** ## Installation[โ€‹](https://docs.litellm.ai/docs/#installation "Direct link to Installation") ``` pip install litellm ``` To run the full Proxy Server (LLM Gateway): ``` pip install 'litellm[proxy]' ``` *** ## Quick Start[โ€‹](https://docs.litellm.ai/docs/#quick-start "Direct link to Quick Start") Make your first LLM call using the provider of your choice: - OpenAI - Anthropic - Vertex AI - Bedrock - Ollama - Azure OpenAI ``` from litellm import completion import os os.environ["OPENAI_API_KEY"] = "your-api-key" response = completion( model="openai/gpt-4o", messages=[{"role": "user", "content": "Hello, how are you?"}] ) print(response.choices[0].message.content) ``` Every response follows the OpenAI Chat Completions format, regardless of provider. โœ… ### Response Format[โ€‹](https://docs.litellm.ai/docs/#response-format "Direct link to Response Format") Non-streaming responses return a `ModelResponse` object: ``` { "id": "chatcmpl-abc123", "object": "chat.completion", "created": 1677858242, "model": "gpt-4o", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Hello! I'm doing well, thanks for asking." 
}, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 13, "completion_tokens": 12, "total_tokens": 25 } } ``` Streaming responses (`stream=True`) yield `ModelResponseStream` chunks: ``` { "id": "chatcmpl-abc123", "object": "chat.completion.chunk", "created": 1677858242, "model": "gpt-4o", "choices": [ { "index": 0, "delta": { "role": "assistant", "content": "Hello" }, "finish_reason": null } ] } ``` ๐Ÿ“– [Full output format reference โ†’](https://docs.litellm.ai/docs/completion/output) Open in Colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_Getting_Started.ipynb) *** ## New to LiteLLM?[โ€‹](https://docs.litellm.ai/docs/#new-to-litellm "Direct link to New to LiteLLM?") **Want to get started fast?** Head to [Tutorials](https://docs.litellm.ai/docs/tutorials) for step-by-step walkthroughs โ€” AI coding tools, agent SDKs, proxy setup, and more. **Need to understand a specific feature?** Check [Guides](https://docs.litellm.ai/docs/guides) for streaming, function calling, prompt caching, and other how-tos. 
*** ## Choose Your Path[โ€‹](https://docs.litellm.ai/docs/#choose-your-path "Direct link to Choose Your Path") *** ## LiteLLM Python SDK[โ€‹](https://docs.litellm.ai/docs/#litellm-python-sdk "Direct link to LiteLLM Python SDK") ### Streaming[โ€‹](https://docs.litellm.ai/docs/#streaming "Direct link to Streaming") Add `stream=True` to receive chunks as they are generated: ``` from litellm import completion import os os.environ["OPENAI_API_KEY"] = "your-api-key" for chunk in completion( model="openai/gpt-4o", messages=[{"role": "user", "content": "Write a short poem"}], stream=True, ): print(chunk.choices[0].delta.content or "", end="") ``` ### Exception Handling[โ€‹](https://docs.litellm.ai/docs/#exception-handling "Direct link to Exception Handling") LiteLLM maps every provider's errors to the OpenAI exception types โ€” your existing error handling works out of the box: ``` import litellm try: litellm.completion( model="anthropic/claude-instant-1", messages=[{"role": "user", "content": "Hey!"}] ) except litellm.AuthenticationError as e: print(f"Bad API key: {e}") except litellm.RateLimitError as e: print(f"Rate limited: {e}") except litellm.APIError as e: print(f"API error: {e}") ``` ### Logging & Observability[โ€‹](https://docs.litellm.ai/docs/#logging--observability "Direct link to Logging & Observability") Send input/output to Langfuse, MLflow, Helicone, Lunary, and more with a single line: ``` import litellm litellm.success_callback = ["langfuse", "mlflow", "helicone"] response = litellm.completion( model="gpt-4o", messages=[{"role": "user", "content": "Hi!"}] ) ``` ๐Ÿ“– [See all observability integrations โ†’](https://docs.litellm.ai/docs/observability/agentops_integration) ### Track Costs & Usage[โ€‹](https://docs.litellm.ai/docs/#track-costs--usage "Direct link to Track Costs & Usage") Use a callback to capture cost per response: ``` import litellm def track_cost(kwargs, completion_response, start_time, end_time): print("Cost:", kwargs.get("response_cost", 
0)) litellm.success_callback = [track_cost] litellm.completion( model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}], stream=True ) ``` ๐Ÿ“– [Custom callback docs โ†’](https://docs.litellm.ai/docs/observability/custom_callback) *** ## LiteLLM Proxy Server (LLM Gateway)[โ€‹](https://docs.litellm.ai/docs/#litellm-proxy-server-llm-gateway "Direct link to LiteLLM Proxy Server (LLM Gateway)") The proxy is a self-hosted OpenAI-compatible gateway. Any client that works with OpenAI works with the proxy โ€” no code changes needed. ![LiteLLM Proxy Dashboard](https://github.com/BerriAI/litellm/assets/29436595/47c97d5e-b9be-4839-b28c-43d7f4f10033) #### Step 1 โ€” Start the proxy[โ€‹](https://docs.litellm.ai/docs/#step-1--start-the-proxy "Direct link to Step 1 โ€” Start the proxy") - pip - Docker ``` litellm --model huggingface/bigcode/starcoder # Proxy running on http://0.0.0.0:4000 ``` #### Step 2 โ€” Call it with the OpenAI client[โ€‹](https://docs.litellm.ai/docs/#step-2--call-it-with-the-openai-client "Direct link to Step 2 โ€” Call it with the OpenAI client") ``` import openai client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000") response = client.chat.completions.create( model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Write a short poem"}] ) print(response.choices[0].message.content) ``` ๐Ÿ‘‰ [Full proxy quickstart with Docker โ†’](https://docs.litellm.ai/docs/proxy/docker_quick_start) Debugging tool Use [**`/utils/transform_request`**](https://docs.litellm.ai/docs/utils/transform_request) to inspect exactly what LiteLLM sends to any provider โ€” useful for debugging prompt formatting, header issues, and provider-specific parameters. 
๐Ÿ”— [Interactive API explorer (Swagger) โ†’](https://litellm-api.up.railway.app/) *** ## Agent & MCP Gateway[โ€‹](https://docs.litellm.ai/docs/#agent--mcp-gateway "Direct link to Agent & MCP Gateway") LiteLLM is a unified gateway for **LLMs, agents, and MCP** โ€” you don't need a separate agent or MCP gateway. One endpoint for 100+ models, A2A agents, and MCP tools. *** ## What to Explore Next[โ€‹](https://docs.litellm.ai/docs/#what-to-explore-next "Direct link to What to Explore Next") [๐Ÿ”€Routing & Load BalancingLoad balance across deployments and set automatic fallbacks.](https://docs.litellm.ai/docs/routing-load-balancing) [๐Ÿ”‘Virtual KeysManage access, budgets, and rate limits per team or user.](https://docs.litellm.ai/docs/proxy/virtual_keys) [๐Ÿ“ŠSpend TrackingTrack costs per key, team, and user across all providers.](https://docs.litellm.ai/docs/proxy/cost_tracking) [๐Ÿ›ก๏ธGuardrailsAdd content filtering, PII masking, and safety checks.](https://docs.litellm.ai/docs/proxy/guardrails/quick_start) [๐Ÿ“กObservabilityIntegrate with Langfuse, MLflow, Helicone, and more.](https://docs.litellm.ai/docs/observability/agentops_integration) [๐ŸญEnterpriseSSO/SAML, audit logs, and advanced security for production.](https://docs.litellm.ai/docs/enterprise)
Shard: 11 (laksa)
Root Hash: 4050687676692930411
Unparsed URL: `ai,litellm!docs,/docs/ s443`
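The unparsed URL looks like a host-reversed crawl key: the host labels reversed and comma-joined, a `!` splitting off the subdomain, then the path and a scheme/port tag. A hypothetical reconstruction, assuming the registrable domain is the last two host labels and `s443` means HTTPS on port 443:

```python
def unparsed_key(host: str, path: str) -> str:
    # Reverse the host labels, put "!" after the registrable domain (assumed
    # to be the last two labels), then append the path and an HTTPS port tag.
    labels = host.split(".")[::-1]  # "docs.litellm.ai" -> ["ai", "litellm", "docs"]
    head = ",".join(labels[:2]) + "!" + ",".join(labels[2:])
    return f"{head},{path} s443"


print(unparsed_key("docs.litellm.ai", "/docs/"))  # ai,litellm!docs,/docs/ s443
```

Keys in this shape sort all URLs of a domain (and its subdomains) together, which is why reversed-host formats are common in crawl indexes.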