If you like all things cute and wish you could create videos of them, then Google has the answer for you. Lumiere, the company’s latest text-to-video generator, dropped on Tuesday, and it’s great…if you like pandas driving cars.

Google describes Lumiere as a “Space-Time Diffusion Model for Realistic Video Generation”. I’m not really certain what that means since I thought all video existed in space and time. However, let’s take a closer look.

Google says that the generator is capable of creating videos from both text prompts and still images. It can also create videos in a particular targeted style. This sounds like many of the other young AI video generators out there. But apparently, Lumiere is drastically different in how it works.

According to Google, Lumiere introduces a Space-Time U-Net architecture (ah, that’s the space-time connection then) that generates the entire temporal duration of the video at once, in a single pass through the model. This is in contrast to other AI video models, which generate a few widely spaced keyframes and then fill in the frames between them. That approach makes it hard to keep the whole clip consistent, which is why other video generators can look a bit wobbly. Google hopes to eliminate that problem by making the whole video in one go.
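
If you want a feel for why treating the clip as one block helps, here is a toy sketch in Python. To be clear, this is not Lumiere, not a diffusion model, and not a U-Net; it is only my own illustration with made-up function names. It takes a noisy fake "video" and cleans it up two ways: once with each frame handled in isolation, and once with every pixel allowed to look at its neighbours in both space and time. Then it measures how much the result flickers from frame to frame.

```python
import random

def make_noisy_clip(num_frames, width, seed=0):
    """Toy 'video': each frame is a 1-D row of grey pixels (0.5) plus random noise."""
    rng = random.Random(seed)
    return [[0.5 + rng.uniform(-0.2, 0.2) for _ in range(width)]
            for _ in range(num_frames)]

def smooth_per_frame(clip):
    """Handle every frame on its own: average each pixel with its spatial neighbours only."""
    out = []
    for frame in clip:
        w = len(frame)
        row = []
        for x in range(w):
            window = frame[max(0, x - 1):min(w, x + 2)]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

def smooth_space_time(clip):
    """Handle the clip as one block: average over neighbours in space AND time."""
    n, w = len(clip), len(clip[0])
    out = []
    for t in range(n):
        row = []
        for x in range(w):
            window = [clip[tt][xx]
                      for tt in range(max(0, t - 1), min(n, t + 2))
                      for xx in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

def flicker(clip):
    """Mean absolute pixel change between consecutive frames (lower = steadier)."""
    diffs = [abs(a - b)
             for f0, f1 in zip(clip, clip[1:])
             for a, b in zip(f0, f1)]
    return sum(diffs) / len(diffs)

if __name__ == "__main__":
    clip = make_noisy_clip(num_frames=16, width=32)
    print("per-frame flicker: ", flicker(smooth_per_frame(clip)))
    print("space-time flicker:", flicker(smooth_space_time(clip)))
```

The space-time version should come out noticeably steadier, simply because each output frame shares information with the frames around it. A real model is vastly more sophisticated, but the intuition is the same.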

Lumiere can also make effective cinemagraphs, where just one part of the image moves and the rest stays still. Google is also showing off some fairly impressive examples of wardrobe changes and owls wearing hats.

In the paper, the Google team states that the AI model outputs five-second long 1024×1024 pixel videos, which they describe as “low-resolution.” I think that, given a little time and space (see what I did there?), it will become more and more powerful and probably offer higher resolutions.

Though I do have to agree with Ars Technica when they say that this could be “the most advanced text-to-animal AI video generator yet demonstrated”.

[via Ars Technica]