
VoxCPM

GitHub Repo · Pretty sure · 48 kHz output claim needs validation
https://github.com/OpenBMB/VoxCPM

Tokenizer-free diffusion TTS that actually ships multilingual synthesis, voice design, and cloning: the rare case where the hype about 'naturalness' might be earned rather than manufactured by marketing.

Slop 15% · Signal 20% · Science 65%

VoxCPM2 is substantive: 2B parameters, 2M+ hours of multilingual training data, a published technical report (arXiv 2509.24650), released weights on HF, and working demos. The tokenizer-free diffusion-autoregressive architecture is a genuine technical choice, not buzzword stacking. Voice design from natural language and controllable cloning are real features, not 'AI-powered' lipstick on CRUD. The production real-time factor (RTF) claims (0.3 on an RTX 4090, 0.13 with Nano-VLLM) are specific and testable. The 30-language coverage is b...
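Since the RTF figures are the most directly checkable claim here, a minimal sketch of how one might test them follows. RTF is synthesis wall-clock time divided by the duration of the audio produced, so an RTF of 0.3 means 10 s of audio takes about 3 s to generate. The `synthesize` callable and `fake_synthesize` are hypothetical placeholders, not the VoxCPM API; swap in whatever inference call the repo actually exposes, and pass the sample rate the model really outputs.

```python
import time
from typing import Callable, Sequence


def rtf_from_timings(synthesis_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: seconds of compute per second of audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return synthesis_seconds / audio_seconds


def measure_rtf(
    synthesize: Callable[[str], Sequence[float]],
    text: str,
    sample_rate: int,
) -> float:
    """Time one call to `synthesize` and return its real-time factor.

    `synthesize` is an assumed interface: it takes text and returns a flat
    sequence of mono audio samples at `sample_rate` Hz.
    """
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return rtf_from_timings(elapsed, len(samples) / sample_rate)


def fake_synthesize(text: str) -> list:
    # Hypothetical stand-in, NOT the VoxCPM API: one second of silence at 16 kHz.
    return [0.0] * 16_000


if __name__ == "__main__":
    # Values below 1.0 mean faster than real time.
    print(measure_rtf(fake_synthesize, "hello", sample_rate=16_000))
```

For a fair comparison against the quoted numbers, one would presumably warm the model up first, average over several utterances of varying length, and run on the same GPU class the claim names.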

7503 stars · Python · 2026-04-09 · 205 days old
