Sick of dog pictures on social media? Nvidia’s GANimal AI lets you turn them into other animals
Of course, I’m kidding, how can anybody get sick of dog pictures on Facebook?
Nvidia’s research teams have been doing some pretty crazy stuff with AI over the last few years. This latest one is pretty funny as entertainment but quite groundbreaking from a technical standpoint. Nvidia’s GANimal AI lets you remap your pet’s “expression” onto other animals.
It’s a challenging task for computers, although it has been done in the past. Previously, though, it required many images in order to make it work. The GANimal AI manages to do it with just a single photo.
Imagine your Labrador’s smile on a lion or your feline’s finicky smirk on a tiger. Such a leap is easy for humans to perform, with our memories full of images. But the same task has been a tough challenge for computers — until the GANimal.
A team of NVIDIA researchers has defined new AI techniques that give computers enough smarts to see a picture of one animal and recreate its expression and pose on the face of any other creature. The work is powered in part by generative adversarial networks (GANs), an emerging AI technique that pits one neural network against another.
While it’s quite amusing, Nvidia says that the practical applications for such technology include the likes of Hollywood. Filmmakers would be able to have stunt dogs performing for the camera, then remap animals that are more difficult to manage, like tigers, onto their movements.
Nvidia has a web-based demo of the technology online so that you can put your own photos into the system and see what it spits back out. I fed in a couple of dog photos I’ve shot, and it had a little trouble with the remapping on some images.
My first test wasn’t so bad, overall. There’s definitely some weirdness going on with some of them and a couple of the dogs appear to have no eyes anymore. But it sorta figured it out.
The next test I did was a little more challenging. This one wasn’t so much weird as… well, that basset hound definitely looks like Picasso’s dog, let’s put it that way.
A research paper presented at the International Conference on Computer Vision (ICCV) in Seoul, Korea, describes what researchers call FUNIT, which, naturally, stands for a “Few-shot, UNsupervised Image-to-image Translation” algorithm.
FUNIT is powered by a relatively new technology called Generative Adversarial Networks (GANs). The problem with deep learning algorithms before GANs came along was that everything had to be tagged and labeled by humans in order for it to be understood. It was one of the biggest things holding AI back, purely because of the amount of effort it takes to train them. GANs solve this issue by... Well, I’m just going to let Nvidia explain this one.
GANs get around this problem by reducing the amount of data needed to train deep learning algorithms. And they provide a unique way to train deep learning algorithms to create labeled data – images, in most cases – from existing data.
Rather than train a single neural network to recognize pictures, researchers train two competing networks. Extending the cat example, a generator network tries to create pictures of fake cats that look like real cats. A discriminator network examines the cat pictures and tries to determine whether they’re real or fake.
“You can think of this being like a competition between counterfeiters and police,” GAN’s creator, Ian Goodfellow, said about the process. “Counterfeiters want to make fake money and have it look real, and police want to look at any particular bill and determine if it’s fake.”
Essentially it’s two networks working as a pair that are learning from each other. As one gets better at spotting the fakes, the other gets better at creating fakes that are indistinguishable from the real thing.
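To make the counterfeiter-versus-police idea a bit more concrete, here’s a minimal sketch in plain Python. It is not Nvidia’s FUNIT code or anything close to it. It’s a toy I’ve made up for illustration, where the “real” data is just numbers drawn around 4, the generator can only slide random noise left or right by an offset, and the discriminator is a simple logistic classifier. Every name in it (Discriminator, Generator, train, REAL_MEAN and so on) is hypothetical.

```python
import math
import random

REAL_MEAN = 4.0  # assumption: "real" samples are numbers drawn from N(4, 1)


def softplus(t):
    # Numerically stable log(1 + exp(t)).
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


class Discriminator:
    """The 'police': a logistic classifier scoring how real a sample looks."""

    def __init__(self):
        self.w, self.c = 0.0, 0.0

    def logit(self, x):
        return self.w * x + self.c

    def prob_real(self, x):
        return 1.0 / (1.0 + math.exp(-self.logit(x)))

    def loss(self, real, fake):
        # Low when real samples score near 1 and fakes score near 0.
        return (mean(softplus(-self.logit(x)) for x in real)
                + mean(softplus(self.logit(x)) for x in fake))

    def step(self, real, fake, lr):
        # One gradient-descent step on the loss above.
        errs = [(self.prob_real(x) - 1.0, x, len(real)) for x in real]
        errs += [(self.prob_real(x), x, len(fake)) for x in fake]
        grad_w = sum(e * x / n for e, x, n in errs)
        grad_c = sum(e / n for e, x, n in errs)
        self.w -= lr * grad_w
        self.c -= lr * grad_c


class Generator:
    """The 'counterfeiter': shifts noise by a learned offset, G(z) = z + b."""

    def __init__(self):
        self.b = 0.0

    def sample(self, noise):
        return [z + self.b for z in noise]

    def loss(self, noise, disc):
        # Low when the discriminator scores the fakes as real.
        return mean(softplus(-disc.logit(x)) for x in self.sample(noise))

    def step(self, noise, disc, lr):
        # The gradient flows through the discriminator back to the offset b.
        grad_b = mean((disc.prob_real(x) - 1.0) * disc.w
                      for x in self.sample(noise))
        self.b -= lr * grad_b


def train(steps=200, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    disc, gen = Discriminator(), Generator()
    for _ in range(steps):
        real = [rng.gauss(REAL_MEAN, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        disc.step(real, gen.sample(noise), lr)  # police learn to spot fakes
        gen.step(noise, disc, lr)               # counterfeiter adapts
    return disc, gen
```

Calling train() hands back both networks, and gen.b is the offset the generator has learned so far. Don’t expect it to settle neatly on the right answer, though. Even in a toy like this, the two sides can chase each other around rather than converge, which is part of why training real GANs is considered tricky.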
It’s still in its early days, and it definitely needs some work if my tests above are anything to go by, but it offers some interesting promise for the future.
John Aldred is a photographer with over 20 years of experience in the portrait and commercial worlds. He is based in Scotland and has been an early adopter – and occasional beta tester – of almost every digital imaging technology in that time. As well as his creative visual work, John uses 3D printing, electronics and programming to create his own photography and filmmaking tools and consults for a number of brands across the industry.