Tag

#speech recognition

25 articles

Cohere Transcribe Arabic is an open-source model built for Arabic's toughest transcription problems

Cohere has launched Transcribe Arabic, an open-source speech recognition model that outperforms existing tools in handling Arabic dialects and bilingual speech.

Jul 7

NVIDIA Releases Audex (Nemotron-Labs-Audex-30B-A3B): A Unified Audio-Text LLM That Preserves the Text Intelligence of Its Backbone

Learn how NVIDIA's new AI system Audex combines audio and text processing in one powerful model, preserving text intelligence while adding speech capabilities.

Jul 7

Interfaze Ships diffusion-gemma-asr-small, an Open-Source Diffusion ASR Model Transcribing Six Languages via DiffusionGemma’s Parallel Denoising Decoder

Learn how a new AI model called diffusion-gemma-asr-small uses a 'diffusion' approach to transcribe speech in six languages more efficiently than traditional methods.

Jul 2

tech

Equal AI raised $30M to screen phone calls for Indians who get 20 spam calls a week

Learn how to build a basic AI-powered call screening system using Python and speech recognition technologies, similar to Equal AI's spam call filtering solution.

Jun 11

Here comes new Siri again

Learn to build a basic voice assistant similar to Siri using Python's speech recognition and text-to-speech libraries. This beginner-friendly tutorial teaches you how to create a command-based assistant that listens, understands, and responds to voice commands.

Jun 6

NVIDIA Releases Nemotron 3.5 ASR: A 600M-Parameter Cache-Aware Streaming Model Transcribing 40 Language-Locales in Real Time

This article explains NVIDIA's Nemotron 3.5 ASR, a 600M-parameter streaming speech recognition model that transcribes 40 language-locales in real time using cache-aware optimization techniques.

Jun 5

tech

Vibe coding is coming to your phone

Learn to build a basic voice-controlled assistant app that recognizes spoken commands and responds with text-to-speech output, demonstrating the core technology behind modern voice assistants.

May 20

AI voice startup Vapi hits $500M valuation after winning Amazon Ring over 40 rivals

Learn to build a basic AI voice assistant that can understand spoken questions and respond with intelligent answers using Python and OpenAI's API.

May 12

Voice AI in India is hard. Wispr Flow is betting on it anyway.

Learn how voice AI works and why it's particularly challenging in India's diverse linguistic environment. Discover how companies like Wispr Flow are working to make voice technology more accessible.

May 9

Sakana AI Introduces KAME: A Tandem Speech-to-Speech Architecture That Injects LLM Knowledge in Real Time

Learn to build a basic speech-to-speech conversational AI system that processes voice input, generates intelligent responses, and speaks back to users.

May 2

The best AI dictation apps, tested and ranked

This explainer explores the advanced AI technologies behind modern dictation apps, including transformer architectures, real-time processing, and multimodal learning techniques.

May 2

IBM Releases Two Granite Speech 4.1 2B Models: Autoregressive ASR with Translation and Non-Autoregressive Editing for Fast Inference

IBM has launched two new Granite Speech 4.1 2B models: one autoregressive, for high-accuracy speech recognition with translation, and one non-autoregressive, for fast inference.

Apr 30