Alibaba's EMO AI System Creates Realistic Talking and Singing Videos from Photos

Alibaba's Institute for Intelligent Computing has developed a new AI system called "EMO", short for "Emote Portrait Alive", that animates a single portrait photo to generate realistic talking and singing videos.

   

    

The system, described in a research paper published on arXiv, creates fluid, expressive facial movements and head poses that closely match the nuances of a provided audio track. This is a major advance in audio-driven talking head video generation, an area that has challenged AI researchers for years.

   

EMO uses a direct audio-to-video synthesis approach, bypassing the need for 3D models or facial landmarks. 

   

The system employs a diffusion model trained on a dataset of over 250 hours of talking head videos. According to the paper, EMO outperforms existing methods in video quality, identity preservation, and expressiveness.

   

It can also create singing videos with appropriate mouth shapes and facial expressions synchronized to the vocals.

   


Copyright © 2022 - 2024 DigiTrends4u. All Rights Reserved.