streaming_reasoning · Tier 1 · 70% confidence

ai-agents-streaming-reasoning-when-streaming-from-gemini-gemini-2-5-flash-previe-3be192cc

agent: ai_agents

When does this happen?

IF you stream from Gemini (gemini-2.5-flash-preview-05-20) with reasoning_effort enabled, reasoning/thought chunks arrive in the stream and are concatenated with the main content, because the chunk_parser does not distinguish thought chunks from answer chunks.

How others solved it

THEN Modify the chunk_parser in ModelResponseIterator to detect thought/reasoning chunks (e.g., when includeThoughts is true) and place them in a separate field like reasoning_content or thinking_blocks instead of appending to content, similar to how Claude handles thinking blocks.

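# In litellm.llms.vertex_ai.gemini.vertex_and_google_ai_studio_gemini.ModelResponseIterator.chunk_parser
# Assumption: `part` is one entry of the chunk's content parts and `delta` is the outgoing delta dict.
# Assumption: Gemini flags thought parts with a boolean "thought" field when includeThoughts=True.
text = part.get('text')
is_thought = part.get('thought') is True
if text and is_thought:
    # Keep reasoning out of the user-visible answer
    delta['reasoning_content'] = text
elif text:
    delta['content'] = text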
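
The same branching can be tried outside LiteLLM with a small standalone helper. This is only a sketch, not LiteLLM's actual implementation: `split_gemini_parts` is a hypothetical name, the chunk is assumed to be the raw Gemini response dict (`candidates[0].content.parts`), and thought parts are assumed to carry a boolean `thought` field.

```python
from typing import Any, Dict, Optional


def split_gemini_parts(chunk: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Split one raw Gemini streaming chunk into answer text and thought text.

    Hypothetical helper for illustration; field names are assumptions
    about the Gemini response shape, not a LiteLLM API.
    """
    content_pieces = []
    reasoning_pieces = []

    candidates = chunk.get("candidates") or []
    if candidates:
        parts = (candidates[0].get("content") or {}).get("parts") or []
        for part in parts:
            text = part.get("text")
            if not text:
                # Skip non-text parts (e.g. function calls) and empty strings
                continue
            if part.get("thought") is True:
                reasoning_pieces.append(text)
            else:
                content_pieces.append(text)

    # Use None rather than "" so callers can tell "no text of this kind"
    return {
        "content": "".join(content_pieces) or None,
        "reasoning_content": "".join(reasoning_pieces) or None,
    }


# Made-up chunk containing one thought part and one answer part
example_chunk = {
    "candidates": [
        {
            "content": {
                "parts": [
                    {"text": "Let me compare the two options. ", "thought": True},
                    {"text": "Option B is cheaper."},
                ]
            }
        }
    ]
}

delta = split_gemini_parts(example_chunk)
```

Returning the two kinds of text under separate keys means a consumer that reads only `content` never sees the model's reasoning, while one that wants it can read `reasoning_content` explicitly.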
