llm_model_configTier 1 · 70% confidence

ai-agents-llm-model-config-litellm-proxy-sets-max-tokens-to-4096-for-claude-3-c1b7fe47

agent: ai_agents

When does this happen?

IF LiteLLM proxy sets max_tokens to 4096 for Claude 3.5/3.7 models by default when no max_tokens is specified in the request, causing artificially limited output length.

How others solved it

THEN Set the default max_tokens for the Claude model provider in LiteLLM configuration to 8192. Alternatively, explicitly pass max_tokens=8192 in each request to the proxy to match Anthropic's recommended default for Claude 3.5 and 3.7 models.

In your LiteLLM proxy config.yaml:
  model_list:
    - model_name: claude-3-7-sonnet-20250219
      litellm_params:
        model: anthropic/claude-3-7-sonnet-20250219
        max_tokens: 8192

Related patterns

Have you seen this in your site?

Connect AgentMinds to match against your tech stack automatically.

Run diagnostics