VibeVoice

Author	SHA1	Message	Date
copilot-swe-agent[bot]	61ecb098d6	Improve error handling and logging for AudioMediaIO compatibility: add warnings to inform users which compatibility mode is being used; handle both AttributeError and ImportError for better coverage; add __init__ method to inherited class for consistency; provide clear diagnostic messages when patching fails. Co-authored-by: donglixp <1070872+donglixp@users.noreply.github.com>	2026-01-29 02:24:53 +00:00
copilot-swe-agent[bot]	b4cd7c479f	Fix vLLM AudioMediaIO compatibility issue. Add try-except blocks to handle both old and new vLLM versions where AudioMediaIO may not exist or may have been moved. This fixes the AttributeError when using newer vLLM versions. Changes: handle missing AudioMediaIO by creating standalone implementation; add fallback for utils module patching; maintain backward compatibility with older vLLM versions. Co-authored-by: donglixp <1070872+donglixp@users.noreply.github.com>	2026-01-29 02:22:47 +00:00
copilot-swe-agent[bot]	11dd7420ec	Initial plan	2026-01-29 02:19:04 +00:00
Zhiliang Peng	b2aee8015c	Delete docs/VibeVoice-ASR-Report.pdf	2026-01-28 19:33:37 +08:00
YaoyaoChang	2ee94fab1d	update ASR architecture figure	2026-01-27 05:11:35 -08:00
YaoyaoChang	3140709188	update README	2026-01-27 21:06:31 +08:00
YaoyaoChang	c435ae05d5	update README: added a section on LoRA fine-tuning to the ASR documentation.	2026-01-27 21:01:40 +08:00
YaoyaoChang	0e1a0d39fd	update README	2026-01-27 20:59:25 +08:00
YaoyaoChang	142a00112e	update ASR README: multilingual	2026-01-27 20:58:10 +08:00
YaoyaoChang	4648c50ea0	update ASR Technical Report link to Arxiv	2026-01-27 12:58:06 +08:00
MLSDCherryPick	cbbdb69474	add VibeVoice-ASR technical report arXiv link	2026-01-27 02:45:16 +00:00
YaoyaoChang	a69e77c036	1. unify env for TTS and ASR; 2. avoid transformers 5.0.0 temporarily	2026-01-26 03:29:02 -08:00
YaoyaoChang	a00f431e14	tts support latest transformers(4.57.6)	2026-01-26 03:28:10 -08:00
Jianwei Yu	c4ee4fe716	Merge pull request #213 from Damon-Salvetore/vllm-1: Replace install_deps.sh with start_server.py one-click deployment	2026-01-26 16:49:38 +08:00
YingboHAO	1eb04f53a2	Replace install_deps.sh with start_server.py one-click deployment	2026-01-26 07:34:54 +00:00
ikeshav26	d11d756b61	fix: issues in error handling	2026-01-26 14:18:34 +08:00
YaoyaoChang	0926f242ce	add CONTRIBUTING.md	2026-01-25 22:07:40 -08:00
DDXDB	1c5dbc4190	Add XPU sdpa Support	2026-01-26 14:00:31 +08:00
ThanhNguyxn	523713e806	fix(demo): add MPS and CPU support for ASR inference demo. Add MPS device choice and auto-detect MPS availability; change default attention implementation to 'auto' with smart fallback; auto-detect flash_attention_2 availability on CUDA, fallback to sdpa; use sdpa for MPS and CPU devices (flash_attention_2 not supported); use float32 dtype for MPS/CPU devices for better compatibility. Fixes #206	2026-01-26 13:56:11 +08:00
ThanhNguyxn	5cf026569e	fix: handle torch.dtype serialization in config classes. Fixes #199 (Object of type dtype is not JSON serializable). When loading models with torch_dtype as a torch.dtype object (e.g., torch.bfloat16), transformers would fail to serialize the config to JSON for logging purposes, raising TypeError. This fix adds a _convert_dtype_to_string() helper function to convert torch.dtype objects to their string representation (e.g., 'bfloat16'), and overrides the to_dict() method in VibeVoiceConfig, VibeVoiceASRConfig, and VibeVoiceStreamingConfig to apply this conversion. The fix is backward compatible: string dtype values and None values continue to work as expected.	2026-01-26 13:45:55 +08:00
YaoyaoChang	e67b15f47d	update	2026-01-25 21:41:42 -08:00
MLSDCherryPick	d9068541cf	1	2026-01-25 16:11:02 +00:00
YaoyaoChang	c28e23f80c	update language distribution figure	2026-01-25 00:15:11 -08:00
MLSDCherryPick	81bf8baa89	1	2026-01-25 05:14:39 +00:00
MLSDCherryPick	e4036e46f4	1	2026-01-24 08:28:05 +00:00
Jianwei Yu	3c50e50d18	Merge pull request #203 from Damon-Salvetore/vibevoice-vllm: Add vLLM plugin support for high-performance ASR serving	2026-01-24 16:17:10 +08:00
MLSDCherryPick	71356b87dd	Language support	2026-01-24 05:17:26 +00:00
MLSDCherryPick	7d12252de3	Language support	2026-01-24 05:11:34 +00:00
MLSDCherryPick	a3e99daedd	Language support	2026-01-24 05:10:47 +00:00
YingboHAO	04f8bc40b0	Update test_api.py	2026-01-23 17:47:31 +00:00
YingboHAO	4df5b0582f	Add vLLM plugin support for high-performance ASR serving	2026-01-23 17:32:24 +00:00
YaoyaoChang	c0c2af984e	update README for finetuning-asr	2026-01-22 06:20:11 -08:00
Zhiliang Peng	05e1a022e5	Update FT README: clarified the purpose of the toy dataset in the README.	2026-01-22 21:49:47 +08:00
Zhiliang Peng	59c90e7633	Merge pull request #197 from pengzhiliang/vibevoice_asr_ft: add VibeVoice-ASR finetuning code	2026-01-22 21:45:35 +08:00
pengzhiliang	8516386ce4	update ft readme	2026-01-22 05:44:34 -08:00
pengzhiliang	cef628e1b5	update ft code	2026-01-22 05:20:25 -08:00
pengzhiliang	db2f1d9ff3	init vibevoice-asr ft	2026-01-22 05:04:33 -08:00
YaoyaoChang	875115c000	update README	2026-01-22 01:28:21 -08:00
YaoyaoChang	c0d7616e5a	update README	2026-01-22 01:26:44 -08:00
YaoyaoChang	0e0caf2f08	update README	2026-01-22 01:25:30 -08:00
YaoyaoChang	96f8ac6a49	update README	2026-01-22 01:24:58 -08:00
YaoyaoChang	0f8954a600	update README	2026-01-22 01:21:56 -08:00
YaoyaoChang	eb3533d791	update README	2026-01-22 00:51:33 -08:00
YaoyaoChang	5022277022	update README	2026-01-22 00:51:00 -08:00
YaoyaoChang	6c523ec087	update README	2026-01-22 00:49:58 -08:00
YaoyaoChang	883e3acc67	update README	2026-01-22 00:39:49 -08:00
YaoyaoChang	32a7040ce0	restructure README	2026-01-22 00:37:22 -08:00
YaoyaoChang	ce90a49960	fix env bug	2026-01-21 22:03:52 -08:00
MLSDCherryPick	1b6e8b56ea	asr evaluation	2026-01-22 03:44:34 +00:00
MLSDCherryPick	84e469c68e	asr evaluation	2026-01-22 03:43:31 +00:00

74 Commits