Akapulu Labs Research

Voice AI Goes Real Time

Today’s digest spans streaming speech agents, emotional text-to-speech control, robust audio-visual recognition, and a new reference-free way to evaluate ASR. Together, these papers push conversational systems toward more responsive, expressive, and reliable voice interaction.

Audio-Interaction teaser illustrating the next-generation audio-language model concept, with a streaming brain for interaction. From Audio-Interaction.

SpeechLLMs & Voice Agents

TTS & Voice Synthesis

ASR & Audio-Visual Speech