DeepMind Unveils AI-powered Soundtrack Tools

June 20, 2024

Google’s DeepMind debuts V2A, a new AI model that can generate soundtracks and dialogue for videos.

Google’s AI research lab, DeepMind, has unveiled V2A (video-to-audio), a work-in-progress AI model that combines video pixels with natural language text prompts to generate rich soundscapes for the on-screen action.

Compatible with Veo, a text-to-video model the company introduced at the recently concluded Google I/O 2024, V2A can be used to add dramatic music, realistic sound effects and dialogue that matches the tone of the video.

The new V2A model can generate an “unlimited number of soundtracks” for any video and features an optional ‘positive prompt’ and ‘negative prompt’, which can be used to tune the output to your preferences. It also watermarks the generated audio with SynthID technology.

Google’s blog post further revealed that the new model also works with “traditional footage” like silent films and archival material.

DeepMind’s V2A technology takes the video, along with an optional text description of the desired sound, as input and uses a diffusion model trained on a combination of sounds, dialogue transcripts and videos. Because the audio quality depends on the input video, footage with artifacts or distortions that fall outside the model’s training data can produce noticeably distorted output. Google also said it won’t release V2A to the public anytime soon, in order to prevent misuse.
