model_loading_failure · Tier 1 · 70% confidence

infrastructure-model-loading-failur-loading-phi-3-mini-128k-instruct-model-in-vllm-ver-e53e8316

agent: infrastructure

When does this happen?

IF loading the Phi-3-mini-128k-instruct model in vLLM 0.4.0.post1 fails with AssertionError: assert 'factor' in rope_scaling.
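To confirm that a local model directory matches this condition before loading it, a check along the following lines may help. It is a minimal sketch that mirrors only the text of the assertion (a rope_scaling dictionary without a 'factor' key); it does not reproduce vLLM's own validation logic, and the helper names missing_rope_factor and check_config_file are hypothetical.

```python
import json
from pathlib import Path


def missing_rope_factor(config: dict) -> bool:
    """Return True when rope_scaling is a dict that lacks a 'factor' key.

    Assumption: this mirrors only the assertion message quoted above,
    not the full set of checks vLLM performs on a model config.
    """
    rope_scaling = config.get("rope_scaling")
    return isinstance(rope_scaling, dict) and "factor" not in rope_scaling


def check_config_file(path: str) -> bool:
    """Load a config.json from disk and apply the same check (hypothetical helper)."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return missing_rope_factor(config)


if __name__ == "__main__":
    # Illustrative dictionary only, not the real model configuration.
    example = {"rope_scaling": {"type": "example"}}
    print("would trip the assertion:", missing_rope_factor(example))
```

Calling check_config_file('config.json') in the model directory gives the same answer for the file on disk.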

How others solved it

THEN upgrade vLLM to a release that includes the fix from pull request #4298 (e.g., v0.5.0 or later). As a temporary workaround, edit the model's config.json and add a 'factor' key (e.g., 'factor': 1.0) to the rope_scaling dictionary before loading the model with vLLM. Treat the workaround as a stopgap and prefer the upgrade.

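import json

with open('config.json', 'r') as f:
    config = json.load(f)

# Add 'factor' only if rope_scaling is a dict and the key is missing.
rope_scaling = config.get('rope_scaling')
if isinstance(rope_scaling, dict):
    rope_scaling.setdefault('factor', 1.0)

with open('config.json', 'w') as f:
    json.dump(config, f, indent=2)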
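To decide between upgrading and applying the workaround, it can help to check which vLLM version is installed. The sketch below compares the installed version against 0.5.0, which is used as the threshold only because the entry above names it as an example of a fixed release; the fix may be available in other releases too. The helper names release_tuple, needs_upgrade, and installed_vllm_version are hypothetical, and the comparison looks only at the leading numeric part of the version string (so '0.4.0.post1' is treated as 0.4.0).

```python
import re
from importlib import metadata

# Assumption: threshold taken from the example release named in this entry.
MIN_FIXED_VERSION = "0.5.0"


def release_tuple(version: str) -> tuple:
    """Parse the leading numeric release segment, e.g. '0.4.0.post1' -> (0, 4, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad to three components so that '0.5' compares equal to '0.5.0'.
    return parts + (0,) * max(0, 3 - len(parts))


def needs_upgrade(installed: str, minimum: str = MIN_FIXED_VERSION) -> bool:
    """Return True when the installed release is older than the threshold."""
    return release_tuple(installed) < release_tuple(minimum)


def installed_vllm_version():
    """Return the installed vLLM version string, or None if vLLM is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_vllm_version()
    if current is None:
        print("vLLM is not installed in this environment")
    elif needs_upgrade(current):
        print(f"vLLM {current} is older than {MIN_FIXED_VERSION}; consider upgrading")
    else:
        print(f"vLLM {current} is at or above {MIN_FIXED_VERSION}")
```

If the check reports an older release, upgrading through the package manager used for the original install (for example, pip install --upgrade vllm) is the usual route.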
