- Add warnings to inform users which compatibility mode is being used
- Handle both AttributeError and ImportError for better coverage
- Add __init__ method to inherited class for consistency
- Provide clear diagnostic messages when patching fails
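A minimal sketch of the lookup pattern described above, not the code in this commit: try each candidate location in order, catch both ImportError and AttributeError, warn about which location was used, and emit a diagnostic listing every failure when nothing matches. The helper name `resolve_first` and the candidate paths passed to it are hypothetical.

```python
import importlib
import warnings


def resolve_first(candidates):
    """Return (object, "module.attr") for the first candidate that resolves.

    `candidates` is a list of (module_name, attr_name) pairs, newest layout
    first. Returns (None, None) when every candidate fails.
    """
    failures = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            obj = getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            # ImportError: the module moved or is absent.
            # AttributeError: the module exists but the name is gone.
            failures.append(f"{module_name}.{attr_name}: {type(exc).__name__}: {exc}")
            continue
        warnings.warn(
            f"compatibility mode: using {module_name}.{attr_name}", stacklevel=2
        )
        return obj, f"{module_name}.{attr_name}"
    warnings.warn("patching skipped; tried:\n  " + "\n  ".join(failures), stacklevel=2)
    return None, None
```

A caller would pass the known locations of the target class for each supported release and fall back to a standalone implementation when the result is `(None, None)`.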
Co-authored-by: donglixp <1070872+donglixp@users.noreply.github.com>
Add try-except blocks so that patching works on both old and new vLLM versions, where AudioMediaIO may not exist or may have been moved. This fixes the AttributeError raised when using newer vLLM versions.
- Handle missing AudioMediaIO by creating standalone implementation
- Add fallback for utils module patching
- Maintain backward compatibility with older vLLM versions
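One way the fallback in the list above could look, as a sketch under stated assumptions rather than the actual patch: build the patched class on top of the upstream base when it exists, build it standalone otherwise, and patch a module attribute only when it is present. `make_audio_io`, `patch_module`, and `PatchedAudioIO` are hypothetical names, and no real vLLM module path is assumed.

```python
import types


def make_audio_io(base=None):
    """Build the patched class, inheriting from `base` when one is available."""
    parents = (base,) if base is not None else ()

    class PatchedAudioIO(*parents):
        def __init__(self, *args, **kwargs):
            if base is not None:
                # Older layout: let the upstream class initialise itself.
                super().__init__(*args, **kwargs)
            # Record which mode is active so callers can log it.
            self.standalone = base is None

    return PatchedAudioIO


def patch_module(module, attr_name, replacement):
    """Replace module.attr_name if it exists; return True when patched."""
    if module is None or not hasattr(module, attr_name):
        return False
    setattr(module, attr_name, replacement)
    return True
```

Returning a boolean from `patch_module` lets the caller decide whether a skipped patch deserves a warning instead of failing at import time.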
Co-authored-by: donglixp <1070872+donglixp@users.noreply.github.com>
- Add MPS device choice and auto-detect MPS availability
- Change default attention implementation to 'auto' with smart fallback
- Auto-detect flash_attention_2 availability on CUDA, fallback to sdpa
- Use sdpa for MPS and CPU devices (flash_attention_2 not supported)
- Use float32 dtype for MPS/CPU devices for better compatibility
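The selection rules above can be sketched as three small functions. This is an illustration, not the committed code: the function names are hypothetical, the CUDA > MPS > CPU priority and the bfloat16 choice on CUDA are assumptions, and availability is passed in as booleans (in practice these would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`).

```python
import importlib.util


def pick_device(cuda_available, mps_available):
    """Choose a device string; the priority order is an assumption."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def resolve_attn(requested, device, flash_installed=None):
    """Resolve 'auto' to a concrete attention implementation."""
    if requested != "auto":
        return requested  # an explicit user choice is left untouched
    if device != "cuda":
        return "sdpa"  # flash_attention_2 is not supported on MPS or CPU
    if flash_installed is None:
        # Detect the flash_attn package without importing it.
        flash_installed = importlib.util.find_spec("flash_attn") is not None
    return "flash_attention_2" if flash_installed else "sdpa"


def pick_dtype_name(device):
    """float32 on MPS/CPU per the list above; bfloat16 on CUDA is an assumption."""
    return "float32" if device in ("mps", "cpu") else "bfloat16"
```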
Fixes #206
Fixes #199 - Object of type dtype is not JSON serializable
When loading models with torch_dtype as a torch.dtype object (e.g.,
torch.bfloat16), transformers would fail to serialize the config to
JSON for logging purposes, raising TypeError.
This fix:
- Adds a _convert_dtype_to_string() helper function to convert torch.dtype
  objects to their string representation (e.g., 'bfloat16')
- Overrides the to_dict() method in VibeVoiceConfig, VibeVoiceASRConfig,
  and VibeVoiceStreamingConfig to apply this conversion
The fix is backward compatible - string dtype values and None values
continue to work as expected.