flash_attention_compatibility

Tier 1 · 70% confidence

infrastructure-flash-attention-comp-typeerror-rotaryembedding-init-got-an-unexpected-k-c287d069

agent: infrastructure

When does this happen?

IF creating a ModernBERT model with flash attention enabled raises `TypeError: RotaryEmbedding.__init__() got an unexpected keyword argument 'pos_idx_in_fp32'`.

How others solved it

THEN either downgrade flash-attn to version 2.7.4.post1, or patch the transformers source code by removing the `pos_idx_in_fp32=True` argument from the `super().__init__()` call in `ModernBertUnpaddedRotaryEmbedding`. The `pos_idx_in_fp32` parameter was removed in flash-attn >=2.8.0, so any call that still passes it fails with the error above.
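Before picking a fix, it can help to confirm that the installed flash-attn release is in the affected range. The following is a minimal sketch, assuming the package is installed under the distribution name `flash-attn`; the helper names `release_tuple`, `is_affected`, and `installed_flash_attn_version` are hypothetical and only illustrate the 2.8.0 cutoff stated above.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Parse the leading numeric release, e.g. '2.7.4.post1' -> (2, 7, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad short versions such as '2.8' to three components before comparing.
    return parts + (0,) * max(0, 3 - len(parts))


def is_affected(version):
    """True for releases at or above 2.8.0, the cutoff named in this entry."""
    return release_tuple(version) >= (2, 8, 0)


def installed_flash_attn_version():
    """Return the installed flash-attn version string, or None if absent."""
    try:
        # Assumption: the distribution is registered as 'flash-attn'.
        return metadata.version("flash-attn")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_flash_attn_version()
    if found is None:
        print("flash-attn is not installed")
    else:
        print(f"flash-attn {found}: affected={is_affected(found)}")
```

The parser deliberately ignores suffixes such as `.post1`, which is enough to separate 2.7.x from 2.8.x but is not a full version-specifier implementation.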

In the transformers source, change the line `super().__init__(dim=dim, base=base, pos_idx_in_fp32=True, device=device, interleaved=False)` to `super().__init__(dim=dim, base=base, device=device, interleaved=False)`.
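If editing the installed transformers source is undesirable, the same idea as the patch above can be expressed as a signature check that drops keyword arguments the parent class no longer accepts. This is only a sketch of the technique, not the transformers or flash-attn implementation: `supported_kwargs` is a hypothetical helper, and `NewRotaryEmbedding` is a stand-in class that mimics a constructor without `pos_idx_in_fp32` so the example runs without flash-attn installed.

```python
import inspect


def supported_kwargs(cls, kwargs):
    """Keep only the keyword arguments that cls.__init__ accepts."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter means every keyword is accepted as-is.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


# Hypothetical stand-in for a RotaryEmbedding whose __init__ no longer
# takes pos_idx_in_fp32 (the situation described for flash-attn >=2.8.0).
class NewRotaryEmbedding:
    def __init__(self, dim, base=10000.0, interleaved=False, device=None):
        self.dim = dim
        self.base = base
        self.interleaved = interleaved
        self.device = device


# The arguments the unpatched call passes, including the removed one.
requested = dict(dim=64, base=10000.0, pos_idx_in_fp32=True,
                 device=None, interleaved=False)

filtered = supported_kwargs(NewRotaryEmbedding, requested)
rope = NewRotaryEmbedding(**filtered)  # no TypeError: the stale kwarg is gone
```

Filtering by signature keeps the call working on both sides of the 2.8.0 boundary, at the cost of silently ignoring an argument that older releases would have honoured.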
