NVIDIA has been doing a lot of cool stuff with AI. One of those things is GauGAN AI, something of a predecessor to NVIDIA’s Canvas application, which we checked out here when I tried to recreate some of my landscape photographs with it. Well, GauGAN2 is here now and it’s gotten smarter. Way smarter. You don’t just paint coloured pixels where you want things to be anymore. Oh no, now it actually understands what you say!
The GAN part of GauGAN stands for generative adversarial network. Put simply, this is two networks competing against each other. One, the generator, produces something it hopes looks real, while the other, the discriminator, compares that output against real examples and judges whether it's real or fake. The discriminator's feedback then goes back to the generator and tells it how to improve.
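To make that tug-of-war concrete, here's a minimal toy sketch of the idea (not NVIDIA's actual model): the "real data" is just numbers drawn from a Gaussian I made up for the example, the generator is a one-line affine function, and the discriminator is a one-line logistic classifier. The two take alternating gradient steps against each other, exactly the adversarial loop described above, just shrunk down to scalars.

```python
import math
import random

random.seed(0)

# Assumed "real data" distribution for this toy: samples near 4.0.
REAL_MU, REAL_SIGMA = 4.0, 0.5

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Generator: g(z) = a*z + b turns noise z ~ N(0, 1) into a fake sample.
a, b = 1.0, 0.0
# Discriminator: D(x) = sigmoid(w*x + c) estimates P(x is real).
w, c = 0.0, 0.0

lr = 0.01
for step in range(20000):
    z = random.gauss(0.0, 1.0)
    x_real = random.gauss(REAL_MU, REAL_SIGMA)
    x_fake = a * z + b

    # Discriminator step: ascend log D(real) + log(1 - D(fake)).
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    w += lr * ((1.0 - d_real) * x_real - d_fake * x_fake)
    c += lr * ((1.0 - d_real) - d_fake)

    # Generator step: ascend log D(fake) (the non-saturating GAN loss),
    # i.e. nudge its output toward what the discriminator calls "real".
    d_fake = sigmoid(w * x_fake + c)
    grad_out = (1.0 - d_fake) * w   # gradient of log D at the fake sample
    a += lr * grad_out * z
    b += lr * grad_out

fake_mean = b  # E[g(z)] = b, since E[z] = 0
print(f"generated mean drifts toward the real mean {REAL_MU}: {fake_mean:.2f}")
```

The generator starts out producing samples around 0; purely by chasing the discriminator's verdicts, its mean drifts toward the real distribution's mean. That's the same pressure that, scaled up to millions of parameters, teaches GauGAN what a believable landscape looks like.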
That’s a very, very simplified explanation of the process, but it’s essentially how GauGAN works to generate its AI landscapes. And now you can generate them by entering a simple phrase, like the NVIDIA suggestions “sunset at the beach” or “ocean waves hitting rocks on the beach”. Or you can do what I did and see how well it handles “misty forest with mountains in the background and a blue sky”.
Well, this happened…
It’s not quite what I’d envisioned, but it’s pretty close. I certainly didn’t see that many clouds in my head, but the misty forest mountains are spot on. And this was just from a single line of text without actually drawing anything on the screen manually.
You don’t have to choose between painting and describing a scene, though. You can do a combination of both, generating a segmentation map from your description and then fine-tuning it by brushing on extra information and details. NVIDIA says that GauGAN2 is one of the first examples to combine multiple modalities within a single network, making it a powerful image generation tool for creating realistic art with a mix of words and simple sketches.
Ultimately, NVIDIA wants to make it faster and easier for artists to turn their ideas into images. Rather than needing to draw out an entire scene from scratch, they’ll be able to describe a scene and have the AI generate something that users can then use as a starting point for their work. NVIDIA says, “this starting point can then be customized with sketches to make a specific mountain taller or add a couple of trees in the foreground, or clouds in the sky.”
It’s a fascinating piece of technology and I’m looking forward to seeing it improve. Even from my simple tests, GauGAN2 has already come a long way from when I tested Canvas just a couple of months ago.