From c4352fee6346650191a1362a8210c909006b175d Mon Sep 17 00:00:00 2001
From: YaoyaoChang
Date: Wed, 21 Jan 2026 10:36:27 -0800
Subject: [PATCH] fx

---
 docs/vibevoice-asr.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/vibevoice-asr.md b/docs/vibevoice-asr.md
index 6a59842..7038370 100644
--- a/docs/vibevoice-asr.md
+++ b/docs/vibevoice-asr.md
@@ -2,6 +2,7 @@
 [![Hugging Face](https://img.shields.io/badge/HuggingFace-Collection-orange?logo=huggingface)](https://huggingface.co/microsoft/VibeVoice-ASR)
 [![Live Playground](https://img.shields.io/badge/Live-Playground-green?logo=gradio)](https://aka.ms/vibevoice-asr)
+
 **VibeVoice-ASR** is the latest addition to the **VibeVoice** family. While the original VibeVoice / VibeVoice-Realtime focused on expressive TTS, **VibeVoice-ASR** focuses on understanding long-form speech with high precision and rich metadata. It is a unified speech-to-text model designed to handle **1-hour long-form audio** in a single pass, generating structured transcriptions containing **Who (Speaker), When (Timestamps), and What (Content)**, with support for **User-Customized Context**.