14 Commits

Author SHA1 Message Date
JianweiYu 09ca114fa3 Add Gradio ASR demo with video support and demo audio/video files
- Add gradio_asr_demo_api_video.py: Gradio web UI supporting audio/video upload,
  streaming output, hotwords, and Cloudflare tunnel
- Add demo/asr_demo/: demo audio and video files for the Gradio interface

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-22 06:11:51 +00:00
YaoyaoChang 0aa8cb4c64 fix default speaker 2026-02-03 00:35:04 -08:00
YaoyaoChang e43c1e2cdb streaming use transformers==4.51.3 2026-02-03 00:30:52 -08:00
ikeshav26 d11d756b61 fix: issues in error handling 2026-01-26 14:18:34 +08:00
DDXDB 1c5dbc4190 Add XPU sdpa Support 2026-01-26 14:00:31 +08:00
ThanhNguyxn 523713e806 fix(demo): add MPS and CPU support for ASR inference demo
- Add MPS device choice and auto-detect MPS availability
- Change default attention implementation to 'auto' with smart fallback
- Auto-detect flash_attention_2 availability on CUDA, fallback to sdpa
- Use sdpa for MPS and CPU devices (flash_attention_2 not supported)
- Use float32 dtype for MPS/CPU devices for better compatibility

Fixes #206
2026-01-26 13:56:11 +08:00
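
The fallback rules listed in commit 523713e806 can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code from the demo itself: the helper name `pick_attention_and_dtype`, its parameters, and the choice of bfloat16 on CUDA are hypothetical. Only the 'auto' default, the flash_attention_2 to sdpa fallback, and float32 on MPS/CPU come from the commit message.

```python
import importlib.util
from typing import Optional, Tuple


def pick_attention_and_dtype(
    device: str,
    requested_attn: str = "auto",
    flash_available: Optional[bool] = None,
) -> Tuple[str, str]:
    """Hypothetical helper: return (attn_implementation, dtype_name) for a device."""
    if flash_available is None:
        # Probe for the flash_attn package without importing it.
        flash_available = importlib.util.find_spec("flash_attn") is not None

    on_cuda = device.startswith("cuda")
    flash_usable = on_cuda and flash_available

    if requested_attn == "auto":
        # Prefer flash_attention_2 on CUDA when installed, otherwise sdpa.
        attn = "flash_attention_2" if flash_usable else "sdpa"
    elif requested_attn == "flash_attention_2" and not flash_usable:
        # An explicit request that cannot be honoured falls back to sdpa.
        attn = "sdpa"
    else:
        attn = requested_attn

    # float32 on MPS/CPU is from the commit message; bfloat16 on CUDA is an assumption.
    dtype_name = "bfloat16" if on_cuda else "float32"
    return attn, dtype_name
```

In a real script the device string would typically come from checks such as `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and the returned attention name would be passed as `attn_implementation` when loading the model. The sketch keeps those out so it runs without PyTorch installed.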
YaoyaoChang ce90a49960 fix env bug 2026-01-21 22:03:52 -08:00
Zhiliang Peng 56cb11e7b2 Add VibeVoice-ASR 2026-01-21 22:18:33 +08:00
YaoyaoChang 4adbe76674 more experimental voices 2025-12-16 04:21:09 -08:00
YaoyaoChang 04d19f8352 add experimental multi-lingual speakers 2025-12-08 08:29:00 -08:00
hydropix 79470ff576 Fix: Remove unnecessary Path() conversion for HuggingFace model IDs
The model_path was being converted to a Path object and then back to string
for from_pretrained() calls. This is unnecessary since HuggingFace accepts
strings directly, and causes issues on Windows where Path() converts forward
slashes to backslashes (e.g., "microsoft/VibeVoice-Realtime-0.5B" becomes
"microsoft\VibeVoice-Realtime-0.5B").

This fix:
- Keeps model_path as a string (no behavior change on Linux/macOS)
- Fixes Windows compatibility for HuggingFace repo IDs
- Removes redundant str() conversions
2025-12-08 10:27:58 +08:00
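
The Windows issue described in commit 79470ff576 can be reproduced on any platform with the pure path classes from `pathlib`. This is a minimal sketch rather than the patched code: the variable names are illustrative, and `PureWindowsPath` stands in for what `Path()` does on a Windows host.

```python
from pathlib import PurePosixPath, PureWindowsPath

# A HuggingFace repo ID, not a filesystem path.
repo_id = "microsoft/VibeVoice-Realtime-0.5B"

# Round-tripping through a Windows path rewrites the separator,
# so the result no longer matches the repo ID.
as_windows = str(PureWindowsPath(repo_id))

# On POSIX-style paths the round trip happens to be harmless.
as_posix = str(PurePosixPath(repo_id))

print(as_windows)
print(as_posix)
```

Keeping `model_path` as a plain string, as the commit does, avoids the round trip entirely and behaves the same on every platform.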
YaoyaoChang 7ea24a4fb9 update 2025-12-04 22:33:57 -08:00
YaoyaoChang 82d5f29842 Fix: Colab downloads occasionally get stuck 2025-12-04 07:20:36 -08:00
YaoyaoChang fc83be5d92 add VibeVoice-Realtime 2025-12-04 05:38:30 -08:00