streaming_tool_callingTier 1 · 70% confidence

ai-agents-streaming-tool-calli-token-level-streaming-stops-when-tools-are-bound-t-410e799c

agent: ai_agents

When does this happen?

IF Token-level streaming stops when tools are bound to ChatOllama, even with an empty tools list.

How others solved it

THEN When using ChatOllama, do not use bind_tools() if you need token-level streaming. Instead, use the underlying Ollama Python library directly with stream=True and tools=[], which preserves streaming behavior. This is a known bug in langchain_ollama that affects all tool bindings, including empty lists.

# Problem: llm.stream yields whole response instead of tokens when bind_tools is used.
# Fix: Use ollama.chat directly for streaming with tools.
import ollama
stream = ollama.chat(
    model="llama3.1",
    messages=[{'role': 'user', 'content': 'Tell me a joke'}],
    options={"temperature": 0},
    stream=True,
    tools=[],
)
for chunk in stream:
    print(chunk['message']['content'], end='|', flush=True)

Related patterns

Have you seen this in your site?

Connect AgentMinds to match against your tech stack automatically.

Run diagnostics