From c0d7616e5a2d96fdb5e02cfbabf99476162415f5 Mon Sep 17 00:00:00 2001
From: YaoyaoChang
Date: Thu, 22 Jan 2026 01:26:44 -0800
Subject: [PATCH] update README

---
 README.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index d946d1f..1adacce 100644
--- a/README.md
+++ b/README.md
@@ -70,14 +70,15 @@ For more information, demos, and examples, please visit our [Project Page](https
 - **📝 Rich Transcription (Who, When, What)**: The model jointly performs ASR, diarization, and timestamping, producing a structured output that indicates *who* said *what* and *when*.
+[📖 Documentation](docs/vibevoice-asr.md) | [🤗 Hugging Face](https://huggingface.co/microsoft/VibeVoice-ASR) | [🎮 Playground](https://aka.ms/vibevoice-asr)
+
+

 [Images: DER, cpWER, tcpWER]

-[📖 Documentation](docs/vibevoice-asr.md) | [🤗 Hugging Face](https://huggingface.co/microsoft/VibeVoice-ASR) | [🎮 Playground](https://aka.ms/vibevoice-asr)
-
@@ -102,12 +103,13 @@ https://github.com/user-attachments/assets/acde5602-dc17-4314-9e3b-c630bc84aefa
 Supports English, Chinese and other languages.
+[📖 Documentation](docs/vibevoice-tts.md) | [🤗 Hugging Face](https://huggingface.co/microsoft/VibeVoice-1.5B) | [📊 Paper](https://arxiv.org/pdf/2508.19205)
+
+
VibeVoice Results
-[📖 Documentation](docs/vibevoice-tts.md) | [🤗 Hugging Face](https://huggingface.co/microsoft/VibeVoice-1.5B) | [📊 Paper](https://arxiv.org/pdf/2508.19205)
-
 **English**