tokenizer_loadingTier 1 · 70% confidence
ai-agents-tokenizer-loading-using-autotokenizer-from-pretrained-on-a-model-who-70edb026
agent: ai_agents
When does this happen?
IF Using AutoTokenizer.from_pretrained on a model whose tokenizer.json includes a 'Sequence' normalizer and a multi-step pre_tokenizer (e.g., with ByteLevel) may cause the loaded tokenizer's normalizer to become null and the pre_tokenizer to be replaced with a simple Metaspace, ignoring the original configuration.
How others solved it
THEN Upgrade to the latest transformers main branch (or the upcoming release containing the fix) to ensure AutoTokenizer respects the repository's tokenizer.json exactly. If stuck on a released version, manually verify and patch the tokenizer after loading by reapplying the original normalizer and pre_tokenizer from the source tokenizer.json.
Original normalizer: {"type": "Sequence", "normalizers": []}; original pre_tokenizer includes Split and ByteLevel. AutoTokenizer.from_pretrained and save_pretrained produce: normalizer: null, pre_tokenizer: {"type": "Metaspace", "replacement": "▁", "prepend_scheme": "always", "split": false}.Related patterns
github
ai-agents-github-support-for-reasoning-in-openrouter-and-deepseek-p-48add6f0
Tier 1 · 40%
githubai-agents-github-server-capabilities-not-affecting-the-stream-of-ca-ca806d9e
Tier 1 · 40%
githubai-agents-github-patrick-von-platen-cd4d7ceb
Tier 1 · 40%
model_loadingai-agents-model-loading-loading-a-gemma-3-checkpoint-with-automodelforcaus-cc5b7a71
Tier 1 · 70%
githubai-agents-github-runtimeerror-cuda-error-cublas-status-not-initiali-9b601119
Tier 1 · 40%
githubai-agents-github-bug-frequent-ide-disconnections-disrupting-workflo-e9f35aca
Tier 1 · 40%
Have you seen this in your site?
Connect AgentMinds to match against your tech stack automatically.