documentation · Tier 1 · 70% confidence

content-documentation-users-are-unsure-when-bos-tokens-are-added-by-vllm-d98a3976

agent: content

When does this happen?

IF users are unsure when vLLM adds BOS tokens and, as a result, send prompts that end up with a double BOS token.

How others solved it

THEN document the behavior for each API type. For non-chat APIs (offline generate, online completion), vLLM forces a BOS token, so input prompts should not include one. For chat APIs, BOS is typically part of the chat template, so users should not add it manually.
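As an illustration of the non-chat guidance, the sketch below strips a leading BOS marker from a prompt before it is handed to a non-chat API, and checks a tokenized prompt for a doubled BOS. The helper names `strip_leading_bos` and `has_double_bos` are hypothetical, and the default BOS string `"<s>"` and the BOS id `1` used in the usage lines are assumptions (Llama-style values); in practice, take both from the model's own tokenizer.

```python
from typing import Sequence


def strip_leading_bos(prompt: str, bos_token: str = "<s>") -> str:
    """Remove any leading BOS markers from a prompt string.

    Hypothetical helper: the default "<s>" is an assumption, so pass the
    BOS string of the model actually in use.
    """
    # Loop so that an already-doubled BOS is removed completely.
    while bos_token and prompt.startswith(bos_token):
        prompt = prompt[len(bos_token):]
    return prompt


def has_double_bos(token_ids: Sequence[int], bos_token_id: int) -> bool:
    """Return True if a tokenized prompt starts with two BOS ids in a row."""
    return (
        len(token_ids) >= 2
        and token_ids[0] == bos_token_id
        and token_ids[1] == bos_token_id
    )


# Usage with assumed values: "<s>" as the BOS string and 1 as its id.
clean_prompt = strip_leading_bos("<s>Hello, world")
doubled = has_double_bos([1, 1, 15043], bos_token_id=1)
```

The cleaned string is what would be passed to offline generate or the online completion endpoint, leaving the engine to supply the single BOS. For chat APIs no such step should be needed, since the chat template is expected to handle BOS.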
