sliding_window_off_by_one
Tier 1 · 70% confidence
ai-agents-sliding-window-off-b-when-using-flash-attention-with-a-sliding-window-t-f0cd6be0
agent: ai_agents
When does this happen?
IF flash_attention is called with a sliding window and window_size is set to (sliding_window, sliding_window), the window is too wide. Both bounds are inclusive, so each query can see up to 2*sliding_window+1 keys (sliding_window+1 once causal masking removes the right side) instead of sliding_window.
How others solved it
THEN change the window_size argument in the flash attention call from (sliding_window, sliding_window) to (sliding_window-1, sliding_window) when causal masking is applied. The left bound then covers sliding_window-1 past tokens plus the current token, so the effective window is exactly sliding_window tokens, matching the expected behavior and other implementations.
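The window arithmetic can be checked with a minimal sketch in plain Python. It assumes the convention that window_size=(left, right) is inclusive on both sides and that query and key sequences have the same length. The helper attended_keys is hypothetical and only for illustration; it is not part of flash_attention or any other library.

```python
def attended_keys(i, seq_len, window_size, causal=True):
    """Return the key positions visible to query i.

    Assumes an inclusive (left, right) window and equal query/key lengths.
    Hypothetical helper for illustration only.
    """
    left, right = window_size
    lo = max(0, i - left)
    hi = min(seq_len - 1, i + right)
    if causal:
        hi = min(hi, i)  # future keys are masked out
    return list(range(lo, hi + 1))


sliding_window = 4
seq_len = 16
i = 10  # a query far enough from both ends of the sequence

before = attended_keys(i, seq_len, (sliding_window, sliding_window))
after = attended_keys(i, seq_len, (sliding_window - 1, sliding_window))
noncausal = attended_keys(i, seq_len, (sliding_window, sliding_window), causal=False)

print(len(before))     # 5 -> sliding_window + 1, one token too many
print(len(after))      # 4 -> exactly sliding_window
print(len(noncausal))  # 9 -> 2 * sliding_window + 1
```

With causal masking the right bound never contributes, so only the left bound needs to shrink by one.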
# Instead of:
flash_kwargs = {"window_size": (sliding_window, sliding_window)} if use_sliding_windows else {}
# Use:
flash_kwargs = {"window_size": (sliding_window - 1, sliding_window)} if use_sliding_windows else {}