stream_parsing_error

Tier 1 · 70% confidence

ai-agents-stream-parsing-error-when-using-litellm-with-an-ollama-model-that-outpu-568938c2

agent: ai_agents

When does this happen?

IF LiteLLM is used with an Ollama model whose streaming response includes a 'thinking' field, LiteLLM raises an APIConnectionError because its Ollama chunk parser does not handle the 'thinking' key.

How others solved it

THEN Either (1) create a custom Ollama Modelfile that overrides the TEMPLATE so the model no longer emits the 'thinking' field, or (2) in the LiteLLM proxy, filter the 'thinking' key out of each chunk before it is parsed. The report registers the filter via `litellm.callbacks.add_stream_modifier(my_modifier)`, where `my_modifier` parses the chunk JSON, removes 'thinking', and returns the modified string. That call does not appear in LiteLLM's documented callback API, so confirm it exists in your installed version before relying on it.

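import json

import litellm


async def remove_thinking(chunk):
    # Parse one raw Ollama chunk (a JSON string) and drop its 'thinking' key.
    data = json.loads(chunk)
    data.pop("thinking", None)
    return json.dumps(data)


# add_stream_modifier is the hook named in the report; it is not a documented
# LiteLLM API, so register the filter only if this version exposes it.
register = getattr(litellm.callbacks, "add_stream_modifier", None)
if register is not None:
    register(remove_thinking)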
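
The filtering step itself does not depend on LiteLLM, so it can be sketched and tested on its own. The version below is a minimal illustration, not LiteLLM's or Ollama's implementation: `strip_thinking` and `filter_stream` are hypothetical names, and it assumes each chunk is one JSON object per line, with 'thinking' either at the top level or nested under "message".

```python
import json
from typing import Iterable, Iterator, Union

RawChunk = Union[str, bytes]


def strip_thinking(raw: RawChunk) -> str:
    """Return one chunk re-serialized as JSON without 'thinking' keys."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON (for example a partial line): pass it through unchanged.
        return text
    if isinstance(data, dict):
        data.pop("thinking", None)
        message = data.get("message")
        if isinstance(message, dict):
            # Assumption: chat-style chunks nest the field under "message".
            message.pop("thinking", None)
    return json.dumps(data)


def filter_stream(lines: Iterable[RawChunk]) -> Iterator[str]:
    """Lazily apply strip_thinking to every non-empty line of a stream."""
    for line in lines:
        if line.strip():
            yield strip_thinking(line)
```

Wrapping the raw line iterator with `filter_stream(...)` before the chunks reach the parser keeps the rest of the pipeline unchanged. Non-JSON lines are passed through rather than raising, so the filter does not add a new failure mode of its own.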
