max_tokens_default

Tier 1 · 70% confidence

infrastructure-max-tokens-default-litellm-proxy-defaults-max-tokens-to-4096-for-clau-8c1f0c19

agent: infrastructure

When does this happen?

IF the LiteLLM proxy applies a default max_tokens of 4096 to Claude 3.5/3.7 models and responses are cut off at that limit.

How others solved it

THEN specify max_tokens=8192 in requests to Claude 3.5/3.7 models, or configure the model in LiteLLM with a default max_tokens of 8192, so that responses are no longer cut off at 4096 tokens.

curl -X POST http://localhost:4000/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "claude-3-7-sonnet-20250219", "max_tokens": 8192, "messages": [{"role": "user", "content": "Hello"}]}'
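The same request can be made from client code. The sketch below is one possible approach, not part of the original pattern: it fills in max_tokens when the caller leaves it out and checks whether a response was cut off. The helper names (with_max_tokens, was_truncated, send) are hypothetical, the proxy URL is taken from the curl example above, and the truncation check assumes the proxy returns an OpenAI-style response where finish_reason is "length" when the token limit was hit.

```python
import json
import urllib.request

# Assumptions: proxy URL copied from the curl example; 8192 is the value
# recommended by this pattern, not a limit read from the proxy.
PROXY_URL = "http://localhost:4000/v1/chat/completions"
DEFAULT_MAX_TOKENS = 8192


def with_max_tokens(payload, default=DEFAULT_MAX_TOKENS):
    """Return a copy of payload with max_tokens set if the caller omitted it."""
    result = dict(payload)
    result.setdefault("max_tokens", default)
    return result


def was_truncated(response):
    """Return True if any choice stopped because it hit the token limit."""
    return any(
        choice.get("finish_reason") == "length"
        for choice in response.get("choices", [])
    )


def send(payload, url=PROXY_URL):
    """POST the payload to the proxy and return the decoded JSON response."""
    body = json.dumps(with_max_tokens(payload)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send({"model": "claude-3-7-sonnet-20250219", "messages": [{"role": "user", "content": "Hello"}]}) would issue the same request as the curl command; passing the result to was_truncated gives a quick signal that the limit is still too low. If the proxy requires an API key, an Authorization header would need to be added.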
