Zhiliang Peng
4c419978c9
Merge pull request #255 from sd983527/main
...
Add news about VibeVoice ASR Transformers integration
2026-03-06 14:08:47 +08:00
Yan Xia
7e73beec97
Add news about VibeVoice ASR Transformers integration
...
- Added announcement that VibeVoice ASR is now part of Transformers v5.3.0 release
- Linked to the official Hugging Face Transformers release page
- Positioned as the latest news item with today's date
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 13:32:21 +08:00
Li Dong
7ef9dbe300
Merge pull request #247 from Damon-Salvetore/fix/vllm-version-compat
...
fix: vllm-version-stable
2026-02-28 11:12:24 +08:00
Damon-Salvetore
165e17e5ed
fix: vllm-version-stable
2026-02-25 07:30:43 +00:00
Jianwei Yu
1807b858d4
Merge pull request #236 from Damon-Salvetore/main
...
fix backend
2026-02-10 00:07:05 +08:00
YingboHAO
a4add8e52f
fix backend
2026-02-08 09:58:19 +00:00
Jianwei Yu
ce3d40c78f
Merge pull request #233 from Damon-Salvetore/main
...
Add hot words support
2026-02-07 12:32:03 +08:00
YingboHAO
0508c3e86f
fix
2026-02-06 14:38:16 +00:00
YingboHAO
7761242bf3
fix
2026-02-06 05:52:48 +00:00
YingboHAO
bb54f78d0e
feat: add hotwords support for vLLM ASR
2026-02-04 10:33:20 +00:00
YaoyaoChang
0aa8cb4c64
fix default speaker
2026-02-03 00:35:04 -08:00
YaoyaoChang
e43c1e2cdb
streaming use transformers==4.51.3
2026-02-03 00:30:52 -08:00
Jianwei Yu
e16491d65e
Merge pull request #228 from Damon-Salvetore/vllm-1
...
[Fix] Resolve occasional infinite loops during vLLM inference
2026-02-03 10:38:40 +08:00
YingboHAO
e26f1c263f
1
2026-02-02 13:50:27 +00:00
YingboHAO
0055161273
Add test_api_auto_recover.py and test audio files
2026-02-02 13:49:01 +00:00
Zhiliang Peng
b2aee8015c
Delete docs/VibeVoice-ASR-Report.pdf
2026-01-28 19:33:37 +08:00
YaoyaoChang
2ee94fab1d
update ASR architecture figure
2026-01-27 05:11:35 -08:00
YaoyaoChang
3140709188
update README
2026-01-27 21:06:31 +08:00
YaoyaoChang
c435ae05d5
update README
...
Added a section on LoRA fine-tuning to the ASR documentation.
2026-01-27 21:01:40 +08:00
YaoyaoChang
0e1a0d39fd
update README
2026-01-27 20:59:25 +08:00
YaoyaoChang
142a00112e
update ASR README: multilingual
2026-01-27 20:58:10 +08:00
YaoyaoChang
4648c50ea0
update ASR Technical Report link to Arxiv
2026-01-27 12:58:06 +08:00
MLSDCherryPick
cbbdb69474
add VibeVoice-ASR technical report arXiv link
2026-01-27 02:45:16 +00:00
YaoyaoChang
a69e77c036
1. unify env for TTS and ASR; 2. avoid transformers 5.0.0 temporarily
2026-01-26 03:29:02 -08:00
YaoyaoChang
a00f431e14
tts support latest transformers(4.57.6)
2026-01-26 03:28:10 -08:00
Jianwei Yu
c4ee4fe716
Merge pull request #213 from Damon-Salvetore/vllm-1
...
Replace install_deps.sh with start_server.py one-click deployment
2026-01-26 16:49:38 +08:00
YingboHAO
1eb04f53a2
Replace install_deps.sh with start_server.py one-click deployment
2026-01-26 07:34:54 +00:00
ikeshav26
d11d756b61
fix: issues in error handling
2026-01-26 14:18:34 +08:00
YaoyaoChang
0926f242ce
add CONTRIBUTING.md
2026-01-25 22:07:40 -08:00
DDXDB
1c5dbc4190
Add XPU sdpa Support
2026-01-26 14:00:31 +08:00
ThanhNguyxn
523713e806
fix(demo): add MPS and CPU support for ASR inference demo
...
- Add MPS device choice and auto-detect MPS availability
- Change default attention implementation to 'auto' with smart fallback
- Auto-detect flash_attention_2 availability on CUDA, fallback to sdpa
- Use sdpa for MPS and CPU devices (flash_attention_2 not supported)
- Use float32 dtype for MPS/CPU devices for better compatibility
Fixes #206
2026-01-26 13:56:11 +08:00
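The fallback rules listed in the commit above (523713e806) can be sketched roughly as follows. This is an illustrative sketch only, not the code from the demo: the function names `detect_device` and `pick_attention_and_dtype` are hypothetical, and the `bfloat16` default for CUDA is an assumption, since the commit message only states the dtype for MPS and CPU.

```python
import importlib.util


def detect_device():
    """Pick 'cuda', 'mps', or 'cpu' depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # Lets the sketch run on machines without PyTorch installed.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"


def pick_attention_and_dtype(device, attn="auto", flash_available=None):
    """Return (attn_implementation, dtype_name) for a device string.

    Hypothetical helper: mirrors the rules in the commit message only.
    """
    if flash_available is None:
        # flash_attention_2 needs the optional flash_attn package.
        flash_available = importlib.util.find_spec("flash_attn") is not None
    if device.startswith("cuda"):
        if attn == "auto":
            attn = "flash_attention_2" if flash_available else "sdpa"
        # Assumption: bfloat16 on CUDA; the commit does not state this.
        return attn, "bfloat16"
    # MPS and CPU: flash_attention_2 is not supported, so use sdpa + float32.
    return "sdpa", "float32"


if __name__ == "__main__":
    device = detect_device()
    print(device, pick_attention_and_dtype(device))
```

Passing `flash_available` explicitly keeps the decision testable on machines that have no GPU or no `flash_attn` install.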
ThanhNguyxn
5cf026569e
fix: handle torch.dtype serialization in config classes
...
Fixes #199 - Object of type dtype is not JSON serializable
When loading models with torch_dtype as a torch.dtype object (e.g., torch.bfloat16), transformers would fail to serialize the config to JSON for logging purposes, raising TypeError.
This fix:
- Adds _convert_dtype_to_string() helper function to convert torch.dtype objects to their string representation (e.g., 'bfloat16')
- Overrides to_dict() method in VibeVoiceConfig, VibeVoiceASRConfig, and VibeVoiceStreamingConfig to apply this conversion
The fix is backward compatible - string dtype values and None values continue to work as expected.
2026-01-26 13:45:55 +08:00
YaoyaoChang
e67b15f47d
update
2026-01-25 21:41:42 -08:00
MLSDCherryPick
d9068541cf
1
2026-01-25 16:11:02 +00:00
YaoyaoChang
c28e23f80c
update language distribution figure
2026-01-25 00:15:11 -08:00
MLSDCherryPick
81bf8baa89
1
2026-01-25 05:14:39 +00:00
MLSDCherryPick
e4036e46f4
1
2026-01-24 08:28:05 +00:00
Jianwei Yu
3c50e50d18
Merge pull request #203 from Damon-Salvetore/vibevoice-vllm
...
Add vLLM plugin support for high-performance ASR serving
2026-01-24 16:17:10 +08:00
MLSDCherryPick
71356b87dd
Language support
2026-01-24 05:17:26 +00:00
MLSDCherryPick
7d12252de3
Language support
2026-01-24 05:11:34 +00:00
MLSDCherryPick
a3e99daedd
Language support
2026-01-24 05:10:47 +00:00
YingboHAO
04f8bc40b0
Update test_api.py
2026-01-23 17:47:31 +00:00
YingboHAO
4df5b0582f
Add vLLM plugin support for high-performance ASR serving
2026-01-23 17:32:24 +00:00
YaoyaoChang
c0c2af984e
update README for finetuning-asr
2026-01-22 06:20:11 -08:00
Zhiliang Peng
05e1a022e5
Update FT README
...
Clarified the purpose of the toy dataset in the README.
2026-01-22 21:49:47 +08:00
Zhiliang Peng
59c90e7633
Merge pull request #197 from pengzhiliang/vibevoice_asr_ft
...
add VibeVoice-ASR finetuning code
2026-01-22 21:45:35 +08:00
pengzhiliang
8516386ce4
update ft readme
2026-01-22 05:44:34 -08:00
pengzhiliang
cef628e1b5
update ft code
2026-01-22 05:20:25 -08:00
pengzhiliang
db2f1d9ff3
init vibevoice-asr ft
2026-01-22 05:04:33 -08:00
YaoyaoChang
875115c000
update README
2026-01-22 01:28:21 -08:00