Akapulu Labs Research

Voice, Avatars, and Spoken Reasoning

Today’s digest spans real-time talking avatars, relightable digital humans, universal speech synthesis, and a new look at how speech-language models reason internally. Together they point to more natural, controllable, and interactive conversational AI systems.

"We propose InteractiveAvatar, a real-time streaming audio-driven avatar generation framework that enables intent-aware interaction. InteractiveAvatar interprets user intent to generate contextually relevant actions throughout the dialogue while maintaining long-range visual consistency." From InteractiveAvatar.

Talking Avatars & Interactive Video

Digital Humans & 3D Avatars

TTS & Voice Synthesis

SpeechLLMs & Spoken Reasoning