v1.76.1-stable - Gemini 2.5 Flash Image
Deploy this version

Docker:

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.76.1
```

Pip:

```shell
pip install litellm==1.76.1
```
Key Highlights
- Major Performance Improvements - 6.5x faster LiteLLM Python SDK completion with fastuuid integration.
- New Model Support - Gemini 2.5 Flash Image Preview, Grok Code Fast, and GPT Realtime models
- Enhanced Provider Support - DeepSeek-v3.1 pricing on Fireworks AI, Vercel AI Gateway, and improved Anthropic/GitHub Copilot integration
- MCP Improvements - Better connection testing and SSE MCP tools bug fixes
Major Changes
- Added support for using Gemini 2.5 Flash Image Preview with /chat/completions. 🚨 Warning: if you were using `gemini-2.0-flash-exp-image-generation`, please follow the Gemini Image Generation Migration Guide.
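A minimal proxy config sketch for routing the new model through /chat/completions, assuming the standard `gemini/` provider prefix and the usual `model_list` layout (the alias `gemini-image` is made up for illustration):

```yaml
model_list:
  - model_name: gemini-image                         # alias clients will call
    litellm_params:
      model: gemini/gemini-2.5-flash-image-preview   # new model from this release
      api_key: os.environ/GEMINI_API_KEY             # read key from environment
```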
Performance Improvements
This release includes significant performance optimizations:
- 6.5x faster LiteLLM Python SDK Completion - Major performance boost for completion operations - PR #13990
- fastuuid Integration - 2.1x faster UUID generation with +80 RPS improvement for /chat/completions and other LLM endpoints - PR #13992, PR #14016
- Optimized Request Logging - Don't print request params by default for +50 RPS improvement - PR #14015
- Cache Performance - 21% speedup in `InMemoryCache.evict_cache` and 45% speedup in the `_is_debugging_on` function - PR #14012, PR #13988
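The fastuuid swap described above can be sketched as a drop-in import with a stdlib fallback; this is an illustration of the technique, not LiteLLM's actual wiring (the assumption is that `fastuuid.uuid4()` is call-compatible with `uuid.uuid4()`, which is how the package advertises itself):

```python
# Prefer the Rust-backed fastuuid when available; fall back to the
# stdlib uuid module so the code still runs without the extra dependency.
try:
    from fastuuid import uuid4  # ~2x faster generation per the release notes
except ImportError:
    from uuid import uuid4      # stdlib fallback, same call signature

def new_request_id() -> str:
    """Generate a request ID the way a proxy might tag each incoming call."""
    return str(uuid4())

print(new_request_id())
```

Because both modules expose the same `uuid4()` call, hot paths that stamp every request with an ID pick up the speedup with no other code changes.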
New Models / Updated Models
New Model Support
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Google | gemini-2.5-flash-image-preview | 1M | $0.30 | $2.50 | Chat completions + image generation ($0.039/image) |
| X.AI | xai/grok-code-fast | 256K | $0.20 | $1.50 | Code generation |
| OpenAI | gpt-realtime | 32K | $4.00 | $16.00 | Real-time conversation + audio |
| Vercel AI Gateway | vercel_ai_gateway/openai/o3 | 200K | $2.00 | $8.00 | Advanced reasoning |
| Vercel AI Gateway | vercel_ai_gateway/openai/o3-mini | 200K | $1.10 | $4.40 | Efficient reasoning |
| Vercel AI Gateway | vercel_ai_gateway/openai/o4-mini | 200K | $1.10 | $4.40 | Latest mini model |
| DeepInfra | deepinfra/zai-org/GLM-4.5 | 131K | $0.55 | $2.00 | Chat completions |
| Perplexity | perplexity/codellama-34b-instruct | 16K | $0.35 | $1.40 | Code generation |
| Fireworks AI | fireworks_ai/accounts/fireworks/models/deepseek-v3p1 | 128K | $0.56 | $1.68 | Chat completions |
Additional Models Added: several other Vercel AI Gateway models are also included. See models.litellm.ai for the full list.
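The $/1M-token columns in the table translate directly into a per-request cost estimate. A small sketch of that arithmetic (the helper and its name are illustrative, not part of LiteLLM; prices are copied from the table above):

```python
# (input $/1M tokens, output $/1M tokens) for two models from the table.
PRICES_PER_1M = {
    "xai/grok-code-fast": (0.20, 1.50),
    "gpt-realtime": (4.00, 16.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD from per-million-token pricing."""
    input_price, output_price = PRICES_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# 10k prompt tokens + 2k completion tokens on grok-code-fast ≈ $0.005
print(estimate_cost("xai/grok-code-fast", 10_000, 2_000))
```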
Features
- Google Gemini
- X.AI
- OpenAI
- Fireworks AI
- Added DeepSeek-v3.1 pricing - PR #13958
- DeepInfra
- Fixed reasoning_effort setting for DeepSeek-V3.1 - PR #14053
- GitHub Copilot
- Anthropic
- Nebius
- Expanded provider models and normalized model IDs - PR #13965
- Vertex AI
- Bedrock
- Fixed structure output issues - PR #14005
- OpenRouter
- Added GPT-5 family models pricing - PR #13536
New Provider Support
- Vercel AI Gateway
- New provider support added - PR #13144
- DataRobot
LLM API Endpoints
Features
- Images API
- Responses API
- Azure Passthrough
- Fixed Azure Passthrough request with streaming - PR #13831
Bugs
- General
MCP Gateway
Features
- SSE MCP Tools
- Bug fix for adding SSE MCP tools - improved connection testing when adding MCPs - PR #14048
Management Endpoints / UI
Features
- Team Management
- Allow setting Team Member RPM/TPM limits when creating a team - PR #13943
- UI Improvements
Bugs
- Authentication
- Fixed Virtual keys with llm_api type causing Internal Server Error for /anthropic/* and other LLM passthrough routes - PR #14046
Logging / Guardrail Integrations
Features
- Langfuse OTEL
- Allow using LANGFUSE_OTEL_HOST for configuring host - PR #14013
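A config fragment showing how the new variable slots in alongside the existing Langfuse credentials; the host value and key placeholders are illustrative, only the `LANGFUSE_OTEL_HOST` variable name comes from the release notes:

```shell
export LANGFUSE_OTEL_HOST="https://cloud.langfuse.com"  # new in this release
export LANGFUSE_PUBLIC_KEY="pk-..."                     # placeholder
export LANGFUSE_SECRET_KEY="sk-..."                     # placeholder
```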
- Braintrust
- OpenMeter
- Set user from token user_id for OpenMeter integration - PR #13152
New Guardrail Support
- Noma Security
- Added Noma Security guardrail support - PR #13572
- Pangea
- Updated Pangea Guardrail to support new AIDR endpoint - PR #13160
Performance / Loadbalancing / Reliability improvements
Features
- Caching
- Router
- Refactored router to choose weights by 'weight', 'rpm', 'tpm' in one loop for simple_shuffle - PR #13562
- Logging
Bugs
- Dependencies
- Bumped `orjson` version to "3.11.2" - PR #13969
- Bumped
General Proxy Improvements
Features
- AWS
- Add support for AWS assume_role with a session token - PR #13919
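A config sketch of what an assume-role setup can look like in the proxy config; the parameter names follow LiteLLM's existing `aws_*` conventions, but the model ID and role ARN are placeholders, and the exact session-token plumbing added by the PR is not shown here:

```yaml
model_list:
  - model_name: bedrock-claude
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0  # placeholder model
      aws_role_name: arn:aws:iam::123456789012:role/litellm-role  # placeholder ARN
      aws_session_name: litellm-session
```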
- OCI Provider
- Added oci_key_file as an optional_parameter - PR #14036
- Configuration
- Docker
- Added back supervisor to non-root image - PR #13922
New Contributors
- @ArthurRenault made their first contribution in PR #13922
- @stevenmanton made their first contribution in PR #13919
- @uc4w6c made their first contribution in PR #13914
- @nielsbosma made their first contribution in PR #13573
- @Yuki-Imajuku made their first contribution in PR #13567
- @codeflash-ai[bot] made their first contribution in PR #13988
- @ColeFrench made their first contribution in PR #13978
- @dttran-glo made their first contribution in PR #13969
- @manascb1344 made their first contribution in PR #13965
- @DorZion made their first contribution in PR #13572
- @edwardsamuel made their first contribution in PR #13536
- @blahgeek made their first contribution in PR #13374
- @Deviad made their first contribution in PR #13394
- @XSAM made their first contribution in PR #13775
- @KRRT7 made their first contribution in PR #14012
- @ikaadil made their first contribution in PR #13991
- @timelfrink made their first contribution in PR #13691
- @qidu made their first contribution in PR #13562
- @nagyv made their first contribution in PR #13243
- @xywei made their first contribution in PR #12885
- @ericgtkb made their first contribution in PR #12797
- @NoWall57 made their first contribution in PR #13945
- @lmwang9527 made their first contribution in PR #14050
- @WilsonSunBritten made their first contribution in PR #14042
- @Const-antine made their first contribution in PR #14041
- @dmvieira made their first contribution in PR #14040
- @gotsysdba made their first contribution in PR #14036
- @moshemorad made their first contribution in PR #14005
- @joshualipman123 made their first contribution in PR #13144