5cd81bb497
Batch encoder across multiple requests caused GPU OOM when vLLM scheduler sends many audio items at once. The encoder intermediates (~700MB per 69s audio) compete with KV cache for GPU memory. Sequential encoding is stable and proven correct. The encoder (267ms per request) is not the primary throughput bottleneck when encoder cache is enabled (default). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>