523713e806
- Add MPS device choice and auto-detect MPS availability
- Change default attention implementation to 'auto' with smart fallback
- Auto-detect flash_attention_2 availability on CUDA, fall back to sdpa
- Use sdpa for MPS and CPU devices (flash_attention_2 not supported)
- Use float32 dtype for MPS/CPU devices for better compatibility

Fixes #206
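
For reviewers, the selection rules described in this commit can be sketched roughly as follows. This is an illustrative sketch, not the code in the diff: the function names (`detect_device`, `resolve_attn_implementation`, `resolve_dtype_name`), the CUDA-before-MPS-before-CPU ordering, the `flash_attn` package check, and the caller-supplied `accelerator_default` dtype are all assumptions made for the example.

```python
import importlib.util
from typing import Optional


def detect_device() -> str:
    """Pick a device string; the cuda > mps > cpu ordering is an assumption."""
    try:
        import torch
    except ImportError:
        return "cpu"  # torch not installed, nothing to probe
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on old torch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def flash_attn_installed() -> bool:
    """Assumed check: treat an importable `flash_attn` package as 'available'."""
    return importlib.util.find_spec("flash_attn") is not None


def resolve_attn_implementation(
    device: str,
    requested: str = "auto",
    flash_available: Optional[bool] = None,
) -> str:
    """Resolve 'auto' to a concrete attention implementation."""
    if requested != "auto":
        return requested  # an explicit user choice is passed through unchanged
    if flash_available is None:
        flash_available = flash_attn_installed()
    # flash_attention_2 is only considered on CUDA; MPS and CPU use sdpa.
    if device.startswith("cuda") and flash_available:
        return "flash_attention_2"
    return "sdpa"


def resolve_dtype_name(device: str, accelerator_default: str) -> str:
    """float32 on MPS/CPU; otherwise whatever dtype the caller already uses."""
    if device in ("mps", "cpu"):
        return "float32"
    return accelerator_default  # hypothetical: the commit does not name this dtype


if __name__ == "__main__":
    device = detect_device()
    print(device, resolve_attn_implementation(device), resolve_dtype_name(device, "bfloat16"))
```

Passing `flash_available` explicitly keeps the fallback logic testable on machines without a GPU or without flash-attn installed.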