Tag
3 articles
Mistral AI's new TTS model, Voxtral, tackles the 'expressivity gap' in voice AI by combining autoregressive and flow-matching techniques for more emotionally expressive, multilingual speech synthesis.
Google introduces Gemini 3.1 Flash TTS, a new text-to-speech model that enhances speech quality, expressive control, and multilingual generation. This release marks a shift toward more controllable and natural AI voice outputs.
French AI startup Mistral has released Voxtral, its first open-weight text-to-speech model, which supports nine languages and can clone voices from just three seconds of audio.