Users Smitten By Microsoft’s Image To Video Tool

As the race for AI supremacy continues, Microsoft now wants to transform people’s portrait pictures into talking faces or videos with its latest tool, VASA-1.

According to a research paper by the tech giant, Microsoft is taking the AI race to another level with VASA-1, a framework for creating lifelike talking faces of virtual characters with visual affective skills (VAS), all from a portrait.


From portraits to talking faces

Although it is not yet available to the public, the tool takes a single portrait photo and a speech audio clip and produces a hyper-realistic talking face video with precise lip-audio sync, lifelike facial behavior, and naturalistic head movements, all generated in real time.

The tool is still at the research preview stage with the Microsoft Research team, and the demo videos “look impressive.”

While companies like Nvidia and Runway already have similar head movement and lip sync technology, VASA-1 seems “to be of a much higher quality and realism,” with fewer mouth artifacts, according to Tom’s Guide.

This approach to audio-driven animation also resembles the recent Vlogger AI model from Google Research.

According to Microsoft, while all the images in the demonstration examples are synthetic and were created by Dall-E, VASA-1 can still animate a real picture.

The demo shows different people talking with almost natural movements, facial expressions, and eye movements, with “no artifacts around top and bottom of the mouth seen in other tools.”

It also does not require a face-forward, portrait-style image to work.

Microsoft just introduced VASA-1.

It’s a new AI model that can turn 1 photo and 1 piece of audio into a fully lifelike human deepfake.

Wild to drop this right before the election 😬 pic.twitter.com/MuLkZVOKRM

— Rowan Cheung (@rowancheung) April 18, 2024

VASA-1 got people talking

Already, AI enthusiasts seem smitten by the technology, describing it as “wild” and “insane” on the X platform.

“The improvements we’re getting between each release is incredible,” said Linus Ekenstam.

Others believe the world is witnessing a “seismic shift in the way media content is created” and in how it is consumed.

“This is mind blowing, the realism is top notch,” said another enthusiast identified as Sam.

Others, while recognizing the tool’s abilities, think it is somewhat irresponsible of Microsoft to introduce a tool that could easily be misused to create election deepfakes.

“Wild to drop this right before the election,” wrote Rowan Cheung on the X platform.

Another user, Evan Kirstel, paired praise with a warning: “Microsoft Research’s VASA-1 is a game-changer, creating hyper-realistic AI-generated videos from just a photo and audio.”

“The possibilities are endless, from reviving classic cinema legends to personalized media. But let’s stay alert to deepfake risks.”

Already, the world has seen an influx of election deepfakes in which politicians’ voices or images have been manipulated using AI to spread propaganda. About a third of the global population is going to the polls this year.

However, the researchers at Microsoft have indicated that this is just a demonstration and that there are currently no plans to release it publicly or make it available to developers.

How does VASA-1 work?

According to Tom’s Guide, the researchers themselves are surprised at the model’s ability to “perfectly lip-sync to a song, reflecting the words from the singer without issue despite no music being used in the training dataset.”

Additionally, VASA-1 handled different image styles, including historical portraits such as the famous Mona Lisa.

The tool could be used in gaming on the back of its advanced lip-sync abilities. This, experts have said, could be a game changer for immersion.

Additionally, the technology can be instrumental in creating avatars for social media videos, as is the case with firms like Synthesia and HeyGen.

AI-based movies and music video productions can also leverage VASA-1 technology for more realistic videos.

With Microsoft holding a stake in OpenAI, there is a chance that VASA-1 could be part of a “future Copilot Sora integration.”

Source: https://metanews.com/enthusiasts-smitten-by-image-to-video-tool-vasa-1/
