Tag

#text-to-speech

11 articles

Miso Labs Releases MisoTTS: An 8B Emotive Text-to-Speech Model with Open Weights

Learn to use MisoTTS, an 8B emotive text-to-speech model with open weights, to generate emotionally expressive speech by conditioning on both text and audio context.

Jun 328

Best Text-to-Speech TTS Models in 2026: A Benchmark-Based Comparison

Learn how to create a text-to-speech application using Python and the TTS library, from setting up your environment to generating and customizing speech output.

May 3053

Supertone Releases Supertonic v3: On-Device Text-to-Speech Model with 31-Language Support, Fewer Reading Failures, and Expression Tags

Supertone has released Supertonic v3, an on-device text-to-speech model with 31-language support, improved reading stability, and expressive voice tags.

May 1435

Sakana AI Introduces KAME: A Tandem Speech-to-Speech Architecture That Injects LLM Knowledge in Real Time

Learn to build a basic speech-to-speech conversational AI system that processes voice input, generates intelligent responses, and speaks back to users.

May 252

A Coding Implementation on Deepgram Python SDK for Transcription, Text-to-Speech, Async Audio Processing, and Text Intelligence

This article explains how the Deepgram Python SDK lets developers integrate voice AI capabilities such as transcription, text-to-speech, and asynchronous audio processing into Python applications.

Apr 2464

Google AI Launches Gemini 3.1 Flash TTS: A New Benchmark in Expressive and Controllable AI Voice

Google introduces Gemini 3.1 Flash TTS, a new text-to-speech model that enhances speech quality, expressive control, and multilingual generation. This release marks a shift toward more controllable and natural AI voice outputs.

Apr 15111

Google ships its most expressive Gemini 3.1 text-to-speech model yet with 70+ language support

Learn how to use Google's new Gemini 3.1 Flash Text-to-Speech model to convert text into natural-sounding speech in over 70 languages with precise control over style, pace, and tone.

Apr 1574

Gemini 3.1 Flash TTS: the next generation of expressive AI speech

Google introduces Gemini 3.1 Flash TTS, a new text-to-speech technology that delivers more natural and expressive AI-generated voices. The advancement represents a significant step forward in making AI interactions more human-like and emotionally nuanced.

Apr 1566

A Hands-On Coding Tutorial for Microsoft VibeVoice Covering Speaker-Aware ASR, Real-Time TTS, and Speech-to-Speech Pipelines

Learn what Microsoft VibeVoice is, how it uses AI to understand and generate human speech, and why it's important for the future of voice technology.

Apr 1277

Mistral AI Releases Voxtral TTS: A 4B Open-Weight Streaming Speech Model for Low-Latency Multilingual Voice Generation

Learn how Voxtral TTS works, what it means for developers, and why it's a breakthrough in AI voice technology.

Mar 2895

Fish Audio Releases Fish Audio S2: A New Generation of Expressive Text-to-Speech (TTS) with Absurdly Controllable Emotion

This article explains the new Fish Audio S2 text-to-speech technology and how it creates expressive, emotion-controlled voices.

Mar 1078