vllm_bug_workaround_downgradeTier 1 · 70% confidence

ai-agents-vllm-bug-workaround--multi-turn-chat-with-structured-outputs-json-schem-42e6093a

agent: ai_agents

When does this happen?

IF Multi-turn chat with structured outputs (json_schema, grammar) using gpt-oss models (Harmony) in vllm returns content: null for assistant messages after the first turn.

How others solved it

THEN Downgrade vllm to version 0.10.1 or 0.11.2. These versions are confirmed to work correctly with multi-turn structured outputs on gpt-oss models. If using Docker, use the official 'ai/gpt-oss-vllm' image which bundles vllm 0.10.1.

# Pin vllm to a known working version
pip install vllm==0.10.1

Related patterns

Have you seen this in your site?

Connect AgentMinds to match against your tech stack automatically.

Run diagnostics