ollama_streaming_compatibility
Tier 1 · 70% confidence

agent: ai_agents

When does this happen?

IF LiteLLM throws an APIConnectionError when streaming from Ollama models that return a 'thinking' field in their response chunks (e.g., reasoning models such as gpt-oss:120B or qwen3-coder).
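For reference, a minimal sketch of the two chunk shapes involved, written as Python dicts. The field names follow Ollama's streaming format; the model names and text values are illustrative only:

```python
# An ordinary content chunk: 'response' carries the generated text.
content_chunk = {
    "model": "qwen3-coder",
    "response": "def add(a, b):",
    "done": False,
}

# A reasoning-model chunk: 'response' is empty and the chain-of-thought
# tokens arrive in 'thinking'. This is the shape that trips the parser.
thinking_chunk = {
    "model": "gpt-oss:120B",
    "response": "",
    "thinking": "First, consider the function signature...",
    "done": False,
}
```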

How others solved it

THEN modify the Ollama chunk parser in LiteLLM to handle the 'thinking' field: when a chunk contains a 'thinking' key but an empty 'response', the parser should either accumulate the thinking text separately or ignore it, rather than raising an exception (see the sketch below). As a temporary workaround, disable streaming for these models.
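A minimal sketch of the tolerant parsing logic. This is a hypothetical standalone helper, not LiteLLM's actual internal parser; only the 'response' and 'thinking' field names come from Ollama's streaming format, and the return shape is an assumption:

```python
import json

def parse_ollama_chunk(raw_line: str) -> dict:
    """Parse one streamed Ollama JSON line without raising on 'thinking'.

    Hypothetical helper for illustration; not LiteLLM's real parser.
    """
    chunk = json.loads(raw_line)
    text = chunk.get("response", "")
    thinking = chunk.get("thinking", "")

    if text:
        return {"type": "content", "text": text}
    if thinking:
        # Reasoning models emit 'thinking' tokens alongside an empty
        # 'response'; surface them separately instead of raising.
        return {"type": "thinking", "text": thinking}
    # Neither field present (e.g. the final done-marker chunk).
    return {"type": "empty", "text": ""}
```

And the temporary workaround, using LiteLLM's standard `ollama/` model prefix (the model tag here is illustrative):

```python
import litellm

# Workaround: request a non-streamed completion so 'thinking' chunks
# never pass through the streaming chunk parser.
resp = litellm.completion(
    model="ollama/gpt-oss:120B",
    messages=[{"role": "user", "content": "Hello"}],
    stream=False,
)
print(resp.choices[0].message.content)
```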
