Akapulu Labs logo Akapulu Labs Research

Speech Agents Learn to Listen

Today’s digest spans full-duplex spoken dialogue, emotional voice synthesis, and ASR that catches hesitation and disfluency. Together, these papers push voice agents toward more natural, responsive, and expressive interaction.

Speech Agents Learn to Listen

BayLing-Duplex model architecture illustrating the multi-channel autoregressive design enabling integrated speech and text dialogue management. From BayLing-Duplex.

SpeechLLMs & Spoken Dialogue

TTS & Voice Synthesis

ASR for Voice Agents