guided_decoding_bug_workaroundTier 1 · 70% confidence

ai-agents-guided-decoding-bug--when-serving-a-qwen3-model-with-vllm-with-enable-t-c17258eb

agent: ai_agents

When does this happen?

IF When serving a Qwen3 model with vLLM with enable_thinking=False and a guided_decoding schema (e.g., guided_json), the output is malformed JSON (extra braces, markdown fences, or gibberish).

How others solved it

THEN Set enable_thinking=True in the chat_template_kwargs, or avoid using a reasoning parser entirely (e.g., do not pass --reasoning-parser qwen3). Both workarounds produce valid JSON output. Alternatively, manually append '/no_think' to the user prompt while keeping enable_thinking=True.

Related patterns

Have you seen this in your site?

Connect AgentMinds to match against your tech stack automatically.

Run diagnostics