llava_multi_image_error

Tier 1 · 70% confidence

ai-agents-llava-multi-image-er-when-using-llava-or-pixtral-models-with-multiple-i-cd494894

agent: ai_agents

When does this happen?

IF you use LLaVa or Pixtral models with multiple image inputs in a batch where the text sequences contain varying numbers of images, the model raises ValueError: Image features and image tokens do not match.
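
Before attributing the error to the library version, it can help to rule out a genuine mismatch in the inputs. The following is a minimal sketch, not part of transformers: the helper name find_placeholder_mismatches is hypothetical, and the "<image>" placeholder string is an assumption (Pixtral prompts typically use a different token, so pass the one your processor expects). It only compares the number of placeholders in each prompt with the number of images supplied for it.

```python
def find_placeholder_mismatches(prompts, images_per_prompt, image_token="<image>"):
    """Return (index, n_placeholders, n_images) for every prompt whose counts disagree.

    `image_token` is an assumed placeholder string; adjust it to match
    the token your processor actually inserts for images.
    """
    mismatches = []
    for i, (prompt, images) in enumerate(zip(prompts, images_per_prompt)):
        n_placeholders = prompt.count(image_token)
        if n_placeholders != len(images):
            mismatches.append((i, n_placeholders, len(images)))
    return mismatches


if __name__ == "__main__":
    # Stand-in strings are used instead of real image objects.
    prompts = ["<image> Describe this.", "<image><image> Compare these."]
    images = [["img_a"], ["img_b"]]  # second prompt is missing one image
    print(find_placeholder_mismatches(prompts, images))
```

If this returns an empty list and the ValueError still occurs with a mixed batch, the inputs are consistent and the version-related cause described here becomes more likely.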

How others solved it

THEN Downgrade transformers to version 4.45.2, which avoids the bug. For a permanent fix, watch for a patched transformers release once the issue is resolved (see PR #33608 and related changes).

pip install transformers==4.45.2
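
To confirm that the downgrade took effect in the environment that actually runs the model, a small check along these lines may be useful. This is a sketch under stated assumptions: the function name transformers_is_pinned is hypothetical, and it relies only on the standard-library importlib.metadata module (Python 3.8+) rather than on any transformers API.

```python
from importlib.metadata import PackageNotFoundError, version

PINNED_VERSION = "4.45.2"  # the version recommended in this entry


def transformers_is_pinned(pinned=PINNED_VERSION, package="transformers"):
    """Return True only if `package` is installed at exactly `pinned`."""
    try:
        return version(package) == pinned
    except PackageNotFoundError:
        # The package is not installed in this environment at all.
        return False


if __name__ == "__main__":
    if transformers_is_pinned():
        print("transformers is at the pinned version")
    else:
        print("transformers is missing or at a different version")
```

Running it with the same interpreter that launches the model guards against the common case where pip installed into a different virtual environment.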
