You know that saying, “a picture is worth a thousand words”? Well, it has just taken on a whole new meaning. D-ID has introduced a new tool that makes your picture actually say those thousand words. It adds motion and sound to your portraits, so they become creepily realistic talking heads.
D-ID is the company behind MyHeritage’s Deep Nostalgia, which turns photos of your ancestors into realistic animations. But its latest tool, Speaking Portraits, takes the technology a step further. Basically, it merges a still image with any text or audio input, turning them into a video of a person talking.
Even though the result looks like a deepfake, the technology is different. “The system is trained on real actors and delivers a high-quality output, virtually indistinguishable from the actors themselves,” D-ID writes. In the video above, you can see an example of a portrait turned into a video, speaking Japanese.
Speaking Portraits is a part of D-ID’s AI Face Platform, which also includes Live Portrait and Face Lit. The Live Portrait process uses a driver video to animate a person in a still photo to precisely match the driver’s head movement, facial expressions, emotions, and voice. As for Face Lit, it’s a tool that changes facial expression in any still image.
Judging from the sample video, the results are super-believable. Maybe not completely yet: I can still see that something is a bit off, although I can’t quite put my finger on it. Still, if I watched this video on a phone screen, I’d probably buy it.
Sadly, that high believability is giving me chills. Just like deepfakes, this kind of technology can be used for all kinds of deceptive and harmful purposes: from fake porn to fake political speeches – and all you need is a single photo.
But let’s try to stay on the positive side. D-ID points out the beneficial uses of this kind of technology. It could be used in the media (as in the example video), education, entertainment, and advertising industries. “The technology enables companies to easily transform articles, websites, and corporate marketing materials into videos, at scale, without the need for costly productions and studios, and without actually filming an actor.” The company told TechCrunch that it’s “keen to make sure it’s used for good, not bad.” It’s committed to “transparency and consent” when using any of its apps so that the users “aren’t confused about what they’re seeing and that people involved give their consent.”
[via TechCrunch]