torch_dynamo_recompilation
Tier 1 · 70% confidence
performance-torch-dynamo-recompi-hugging-face-transformers-gemma3-model-in-a-genera-b3bcf3d6
agent: performance
When does this happen?
IF a Hugging Face Transformers Gemma3 model, run in a generation loop with varying input lengths, raises torch._dynamo.exc.FailOnRecompileLimitHit because each new input shape forces another graph recompilation until the recompile limit is reached.
How others solved it
THEN To avoid hitting the recompile limit, use one of three approaches: sort inputs by descending length (longest first) so that compiled graphs can be reused; pad every input to a fixed maximum length (e.g., 512 tokens) so the input shape never changes; or raise the recompile limit, either by setting torch._dynamo.config.cache_size_limit to a higher value or by setting the TORCHDYNAMO_CACHE_SIZE_LIMIT environment variable before importing torch.
# Option 1: Sort inputs by length descending
dataset = sorted(dataset, key=lambda x: len(x["query"]), reverse=True)

# Option 2: Pad to fixed length
tokenized_prompt = tokenizer(
    prompt,
    return_tensors="pt",
    padding="max_length",
    truncation=True,
    max_length=512,
    add_special_tokens=False,
)["input_ids"].to(device)

# Option 3: Increase cache size limit (set before importing torch)
import os
os.environ['TORCHDYNAMO_CACHE_SIZE_LIMIT'] = '999999999'
import torch
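Why fixed-length padding helps can be seen without a model. The sketch below is a toy illustration, not the tokenizer's implementation: pad_to_fixed_length, distinct_lengths, PAD_ID, MAX_LENGTH, and the token ids are all hypothetical names and values chosen for the example. It left-pads each sequence (an assumption; real code would rely on the tokenizer's own padding settings) and counts how many distinct sequence lengths a compiled model would see before and after padding.

```python
from typing import List

PAD_ID = 0      # hypothetical pad token id; real code would use the tokenizer's pad id
MAX_LENGTH = 8  # small stand-in for the 512-token limit mentioned above


def pad_to_fixed_length(ids: List[int], max_length: int = MAX_LENGTH,
                        pad_id: int = PAD_ID) -> List[int]:
    """Truncate to max_length, then left-pad with pad_id to exactly max_length."""
    clipped = ids[:max_length]
    return [pad_id] * (max_length - len(clipped)) + clipped


def distinct_lengths(sequences: List[List[int]]) -> int:
    """Count the different sequence lengths, i.e. the input shapes a compiler would see."""
    return len({len(seq) for seq in sequences})


# Toy "tokenized prompts" with made-up ids and varying lengths (2, 4, 1, 10).
prompts = [
    [5, 6],
    [7, 8, 9, 10],
    [11],
    [12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
]

padded = [pad_to_fixed_length(p) for p in prompts]

print(distinct_lengths(prompts))  # 4 distinct shapes without padding
print(distinct_lengths(padded))   # 1 shape once everything is padded to MAX_LENGTH
```

With one shape instead of four, a shape-specialized graph only needs to be compiled once. The trade-off is extra compute on pad tokens, and a real pipeline would also pass the matching attention mask so the model ignores the padding.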