VibeVoice

Author	SHA1	Message	Date
Jianwei Yu	cd945395d4	feat: set nginx workers to 2×dp for optimal HTTP throughput. Nginx worker_processes now defaults to 2×N (where N is the number of DP replicas) instead of 'auto'. This ensures enough HTTP handler processes to fully saturate all GPU backends under heavy concurrent load. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-27 09:16:05 +00:00
Jianwei Yu	e6b65abb9b	fix: auto-tune per-worker env vars in DP mode. Pass VIBEVOICE_FFMPEG_MAX_CONCURRENCY and VLLM_MEDIA_LOADING_THREAD_COUNT to each worker subprocess so they inherit the correct settings regardless of how the container is launched (--skip-deps or not). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-27 07:57:49 +00:00
Jianwei Yu	3817f74d46	feat: nginx-based data parallel for optimal ASR throughput. When --dp N is specified (N > 1), the launcher now starts N independent vLLM processes behind an nginx reverse proxy instead of using vLLM's built-in DP coordinator. This avoids the single-process HTTP bottleneck when handling large base64 audio payloads, achieving near-linear scaling (7.2x with 8 GPUs at 4096 concurrent requests). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-27 07:43:32 +00:00
JianweiYu	9634518ca4	Add data parallel (DP) support to vLLM server launcher. Add --dp/--data-parallel-size flag for running independent model replicas across multiple GPUs with automatic load balancing behind a single port; add --tp/--tensor-parallel-size flag (previously hardcoded to 1); update docs/vibevoice-vllm-asr.md with multi-GPU deployment guide covering DP, TP, and hybrid (DP × TP) configurations. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-24 11:53:31 +00:00
JianweiYu	09ca114fa3	Add Gradio ASR demo with video support and demo audio/video files. Add gradio_asr_demo_api_video.py: Gradio web UI supporting audio/video upload, streaming output, hotwords, and Cloudflare tunnel; add demo/asr_demo/: demo audio and video files for the Gradio interface. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-22 06:11:51 +00:00
YingboHAO	a4add8e52f	fix backend	2026-02-08 09:58:19 +00:00
YingboHAO	1eb04f53a2	Replace install_deps.sh with start_server.py one-click deployment	2026-01-26 07:34:54 +00:00
YingboHAO	4df5b0582f	Add vLLM plugin support for high-performance ASR serving	2026-01-23 17:32:24 +00:00

8 Commits
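
The nginx-based DP setup described in commits 3817f74d46 and cd945395d4 can be illustrated with a minimal sketch. It is not the launcher's actual code: the function name `render_nginx_conf`, the port numbers, the upstream name `vllm_backends`, the `least_conn` balancing policy, and the `worker_connections` and `client_max_body_size` values are all assumptions chosen for illustration. Only the 2×N worker count and the "N backends behind one port" layout come from the commit messages.

```python
def render_nginx_conf(dp: int, listen_port: int = 8000, base_port: int = 8001) -> str:
    """Render a minimal nginx config that balances requests across `dp` backends.

    Hypothetical helper: ports and directive values are illustrative assumptions.
    """
    if dp < 2:
        raise ValueError("the reverse proxy is only used when dp > 1")

    # Per the commit message: worker_processes is 2 x N instead of 'auto'.
    workers = 2 * dp

    # One upstream entry per independent backend process (assumed port layout).
    servers = "\n".join(
        f"        server 127.0.0.1:{base_port + i};" for i in range(dp)
    )

    return f"""worker_processes {workers};
events {{
    worker_connections 4096;
}}
http {{
    # Assumption: disable the body size limit so large base64 audio payloads pass.
    client_max_body_size 0;
    upstream vllm_backends {{
        least_conn;
{servers}
    }}
    server {{
        listen {listen_port};
        location / {{
            proxy_pass http://vllm_backends;
        }}
    }}
}}
"""


if __name__ == "__main__":
    print(render_nginx_conf(4))
```

With `dp=4`, the rendered config declares 8 worker processes and four upstream servers on ports 8001 through 8004, all reachable through the single listening port.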