Akapulu Labs Research

Speech systems move toward richer dialogue, controllable voices, and stronger robustness

Today’s digest spans full-duplex spoken dialogue, multi-speaker scene generation, controllable TTS, and improved ASR representations. The papers focus on making speech systems more natural, more editable, and more reliable in noisy or complex settings.

ScenA teaser illustrating multi-speaker conversational scenes synthesized from natural language prompts and reference voices with overlapping speech, paralinguistic events, and ambient sound. From ScenA.

SpeechLLMs & Spoken Dialogue

TTS & Voice Synthesis

ASR & Speech Representation