Akapulu Labs Research

Avatars Speak and See Clearly

Today’s digest spans photorealistic 3D human avatars, identity-preserving video generation, and a unified audio-language model for speech, sounds, and music. Together they point to richer multimodal agents that can see, hear, and render people more naturally.

HumanNOVA teaser figure illustrating photorealistic, universal, and rapid 3D human avatar modeling from a single image. From HumanNOVA.

Digital Humans & 3D Avatars

SpeechLLMs & Audio Understanding