The future of filmmaking? OpenAI launches Sora, its powerful text-to-video generator

Feb 16, 2024

Dunja Djudjic

Dunja Djudjic is a multi-talented artist based in Novi Sad, Serbia. With 15 years of experience as a photographer, she specializes in capturing the beauty of nature, travel, and fine art. In addition to her photography, Dunja also expresses her creativity through writing, embroidery, and jewelry making.


OpenAI has released its text-to-video generator, Sora, and it’s… scary. Fascinatingly scary, that is. Select artists can already try it out, and we’ve seen some examples that look incredible.

[Related Reading: Google launches Lumiere, a ‘cute and fluffy’ text-to-video generator]

Sora can generate videos up to a minute long while maintaining visual quality and adherence to your prompt. “We’re teaching AI to understand and simulate the physical world in motion,” OpenAI writes, “with the goal of training models that help people solve problems that require real-world interaction.”

With Sora, you can bring scenes to life with multiple characters, precise motion, and painstaking attention to detail. Not only does Sora grasp the user’s request, but it also accounts for how those elements would manifest in reality.

OpenAI claims that Sora’s model boasts a comprehensive understanding of language. This allows it to interpret prompts and bring characters to life accurately. Additionally, Sora can create multiple shots within a single video, perfectly capturing the characters and visual style.

Sora’s drawbacks

OpenAI admits that the current model has its weaknesses. “It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect,” the company notes. “For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.”

The model may also confuse the spatial details of a prompt, mixing up left and right (just like me), and it struggles with precise descriptions of events that unfold over time. It can’t, for instance, accurately follow a specific camera trajectory.

Safety

You may wonder about the safety of these videos, especially considering the amount of false information that floods the internet every day. OpenAI says it’s taking several important safety steps before making Sora publicly available in its products.

  • Expert evaluation: Ahead of launch, OpenAI has enlisted specialists in misinformation, bias, and harmful content (“red teamers”) to rigorously test Sora’s potential for misuse.
  • Misleading content detection: The company is developing tools, such as classifiers, to identify videos generated by Sora and flag potential misinformation. Additionally, it plans to include transparency metadata in future deployments.
  • Leveraging existing safeguards: OpenAI is applying safety methods from DALL-E 3, such as text filters that block harmful prompts and image filters that ensure videos comply with their usage policies.
  • Global collaboration: OpenAI plans to work with policymakers, educators, and artists worldwide to address concerns, explore positive applications, and learn from real-world use to continuously improve safety.

OpenAI also shares some information about the research techniques behind Sora. It can generate entire videos at once or extend already generated videos to make them longer. The model can look ahead many frames at a time, which solves the problem of keeping a subject consistent even when it temporarily goes out of view. You can read more details in OpenAI’s technical report.

Availability

As for availability, you and I will have to wait a bit. As of today, Sora is only available to red teamers assessing critical areas for harm and risk. OpenAI has also granted access to select visual artists, designers, and filmmakers so they can provide feedback on the model and help advance it further. “We’re sharing our research progress early to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon,” the company concludes.

As I mentioned, the examples of Sora’s AI-generated videos are amazing. Some of them are still a little laggy and weird, but many could easily pass as real videos. After all, with how much information we soak up daily and how short our attention spans have become, it can be hard to tell. So, I urge you, as always, to stay critical of the content you see online, especially as technology becomes more advanced and humanity becomes more regressive.

On the other hand, AI-generated videos could contribute to the filmmaking industry, replacing complex special effects and hours of editing. And for the first time, I feel we’ve actually come closer to that.




One response to “The future of filmmaking? OpenAI launches Sora, its powerful text-to-video generator”

  1. Mr. Bean

    Sora also exhibits issues with limbs. No third leg in this case, but the artificial Asian lady seems to have quite a walking disorder. Watch the footage around 0:10. The hopping at 0:28 is also cute. And the leg swap one second later is for sure a cool party trick. Gotta learn this one!

    The beach scene is very amusing. It almost feels like it’s played backwards, in the best Mr. Bean manner.

    Heck, if that’s the future of filmmaking, it is at least better than the crap from MCU or Disney we get to see nowadays.