Lip-Syncing App Wombo Shows the Potential of Deepfakes

Have you seen a Wombo video floating around your social media? Maybe it was Ryu from Street Fighter singing “Witch Doctor” or the last three heads of the US Federal Reserve miming in unison to Rick Astley’s “Never Gonna Give You Up.” Each clip features exaggerated facial expressions and uncanny, sometimes nightmarish, animation. They are stupid, fun, and offer a useful look at the current state of deepfakes.

It’s certainly getting quicker and easier to make AI-generated fakes, but the more convincing they are, the more work is needed. The realistic Tom Cruise deepfakes that went viral on TikTok, for example, required an experienced VFX artist, a top-flight impersonator, and weeks of preparation to pull off. One-click fakes that can be created with zero effort and expertise, by comparison, still look like those made by the Wombo app and will continue to do so for the immediate future. In the short term, at least, deepfakes are going to be obviously fabricated and instant meme-bait.

The Wombo app launched late last month from Canada after a short development process. “Back in August 2020 I had the idea for Wombo while smoking a joint with my roommate on the roof,” app creator and Wombo CEO Ben-Zion Benkhin tells The Verge. Releasing the product was “an enormous joy,” he says. “I’ve been following the AI space, following the meme space, following the deepfake space, and just saw the opportunity to do something cool.” In just a few weeks, Benkhin estimates the app has seen some 2 million downloads.

Wombo is free and easy to use. Just snap a picture of your face or upload an image from your camera roll, and push a button to have the image lip-sync to one of a handful of meme-adjacent songs. The app’s software will work its magic on anything that even vaguely resembles a face and many things that don’t. Although similar apps in the past have been dogged by privacy fears, Benkhin is adamant users’ data is safe. “We take privacy really seriously,” he says. “All the data gets deleted and we don’t share it or send it to anyone else.”

The app’s name comes from esports slang, specifically Super Smash Bros. Melee. “If a player lands like a crazy combination then the casters will start yelling ‘Wombo Combo! Wombo Combo!’” says Benkhin. True to these origins, Wombo has proved particularly popular with gamers who’ve used it to animate characters from titles like League of Legends, Fallout: New Vegas, and Dragon Age. “I did some digging into [the origins of the slang],” says Benkhin, “and apparently there was some pizza place that started all this, where they would put a shit-ton of toppings on all their pizza and call it a Wombo Combo.”

Benkhin says the app works by morphing faces using predefined choreography. He and his team shoot the base video for each song in his studio (“which is really just a room in my apartment”) and then use these to animate each image. “We steal the motions from their face and apply it to your photo,” he says. The app is also an example of the fast-paced world of AI research, where new techniques can become consumer products in a matter of weeks. Benkhin notes that the software is built “on top of existing work” but with subsequent tweaks and improvements that make it “our own proprietary model.”
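
As a rough illustration of what "stealing the motions" from one face and applying them to another could look like, here is a minimal Python sketch. It is not Wombo's method, which is proprietary and not described in detail. It assumes, purely for illustration, that each face is reduced to a list of matching 2D keypoints, and the function name `transfer_motion` and its parameters are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def transfer_motion(
    source_pts: List[Point],
    driving_ref: List[Point],
    driving_frames: List[List[Point]],
) -> List[List[Point]]:
    """Apply a driving performance to a still photo's keypoints.

    Hypothetical simplification: for every frame of the pre-recorded
    (driving) video, each photo keypoint is shifted by however far the
    matching driving keypoint has moved from its reference position.
    """
    animated = []
    for frame in driving_frames:
        moved = [
            (sx + (fx - rx), sy + (fy - ry))
            for (sx, sy), (rx, ry), (fx, fy) in zip(source_pts, driving_ref, frame)
        ]
        animated.append(moved)
    return animated


# Toy example: two mouth-corner keypoints and a single driving frame.
photo = [(10.0, 20.0), (30.0, 20.0)]
reference = [(0.0, 0.0), (5.0, 0.0)]
performance = [[(0.0, 1.0), (5.0, 2.0)]]
frames = transfer_motion(photo, reference, performance)
```

A real system would also have to detect the keypoints and warp the image pixels to follow them, which is where the machine learning comes in. The sketch only shows the core idea of reusing one recorded performance across many photos.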

Currently, Wombo offers just 14 short clips of songs to lip-sync with, but Benkhin says he plans to expand these options soon. When asked whether the app has the proper licenses for the music it uses, he demurs but says the team is working on it.

As with TikTok, though, it seems the reach offered by Wombo could help ameliorate license-holders’ worries about rights. Wombo has already been approached by artists wanting to get their music on the app, says Benkhin, and it’s likely this could offer a revenue stream in addition to the current premium tier (which pays for priority processing and no in-app ads). “It’s going to give [artists] a completely new way of engaging audiences,” he says. “It gives them this new viral marketing tool.”

Wombo is far from the first app to use machine learning to create quick and fun deepfakes. Others include ReFace and FaceApp. But it’s the latest example of what will be an ever-more prominent trend, as deepfake apps become the latest meme templates, allowing users to mash together favourite characters, trending songs, choreographed dances, public figures, and so much more. The future of deepfakes will definitely be memeified.