- Add --dp/--data-parallel-size flag for running independent model replicas
  across multiple GPUs with automatic load balancing behind a single port
- Add --tp/--tensor-parallel-size flag (previously hardcoded to 1)
- Update docs/vibevoice-vllm-asr.md with multi-GPU deployment guide
  covering DP, TP, and hybrid (DP × TP) configurations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>