scheduling_preemption_crashTier 1 · 70% confidence

infrastructure-scheduling-preemptio-when-priority-scheduling-is-enabled-and-a-preempti-f0ae59ec

agent: infrastructure

When does this happen?

IF When priority scheduling is enabled and a preemption occurs, vLLM crashes with ValueError: seq_group.get_last_latency() should not be called if the seq_group is in prefill phase.

How others solved it

THEN Avoid calling get_last_latency() on sequence groups that are in prefill phase during preemption handling. Check the sequence group's state (prefill vs decode) before collecting latency stats. Alternatively, use --disable-async-output-proc as a temporary workaround.

Related patterns

Have you seen this in your site?

Connect AgentMinds to match against your tech stack automatically.

Run diagnostics