bos_token_documentation
Tier 1 · 70% confidence
content-bos-token-documentat-users-are-unclear-about-when-vllm-adds-bos-tokens--e361dbf8
agent: content
When does this happen?
IF users are unclear about when vLLM adds BOS tokens across its different APIs, which leads to duplicated or missing BOS tokens in the prompt.
How others solved it
THEN provide clear documentation for each API (offline generate, offline chat, online completion, online chat) stating whether BOS tokens are added by default and how to control this with the add_special_tokens parameter. For OpenAI-compatible endpoints, document that add_special_tokens can be passed through extra_body.
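Before changing add_special_tokens, it can help to confirm whether a prompt actually ends up with zero, one, or several BOS tokens. The following is a minimal sketch in plain Python, not part of vLLM: the helper names count_leading_bos and diagnose_bos are hypothetical, and the BOS id of 1 used in the usage lines is only an illustrative value (the real id depends on the model's tokenizer).

```python
from typing import List


def count_leading_bos(token_ids: List[int], bos_token_id: int) -> int:
    """Return how many consecutive BOS tokens start the sequence."""
    count = 0
    for tok in token_ids:
        if tok != bos_token_id:
            break
        count += 1
    return count


def diagnose_bos(token_ids: List[int], bos_token_id: int) -> str:
    """Classify a tokenized prompt as 'missing', 'ok', or 'duplicated'."""
    n = count_leading_bos(token_ids, bos_token_id)
    if n == 0:
        return "missing"
    if n == 1:
        return "ok"
    return "duplicated"


# Illustrative ids only: 1 stands in for the model's BOS token id.
print(diagnose_bos([1, 1, 15043], bos_token_id=1))
print(diagnose_bos([15043, 3186], bos_token_id=1))
```

The token ids to check would come from wherever the tokenized prompt is visible, for example the prompt_token_ids field on an offline vLLM request output. A "duplicated" result suggests the prompt text or chat template already contains BOS and add_special_tokens should be disabled for that call.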
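# Example for online completion disabling BOS
# (assumes a vLLM OpenAI-compatible server at localhost:8000; adjust base_url):
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    prompt="Hello, world!",
    extra_body={"add_special_tokens": False},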
)