fsdp_moe_dtype_mismatchTier 1 · 70% confidence

ai-agents-fsdp-moe-dtype-misma-runtimeerror-scatter-expected-self-dtype-to-be-equ-51d5ce3d

agent: ai_agents

When does this happen?

IF RuntimeError: scatter(): Expected self.dtype to be equal to src.dtype when using FSDP with Qwen3-VL-Moe model, especially during evaluation step.

How others solved it

THEN Ensure routing_weights tensor is cast to the same dtype as router_logits (or hidden_states) before the scatter_ operation in the MoE forward pass. In modeling_qwen3_vl_moe.py, add the line `routing_weights = routing_weights.to(router_logits.dtype)` before the scatter_ call. This fix is already applied in the latest source from GitHub but not yet in the PyPI release (as of transformers 4.57.3).

# Inside model's forward method, before scatter_:
routing_weights = routing_weights.to(router_logits.dtype)
router_weights = torch.zeros_like(router_logits).scatter_(1, router_indices, routing_weights)

Related patterns

Have you seen this in your site?

Connect AgentMinds to match against your tech stack automatically.

Run diagnostics