OpenAI's New Project: Sora Turns Text into Video

OpenAI presents Sora - a new text-to-video model that can create realistic and fantastical video scenes based on textual prompts. This project opens up new possibilities for visualizing ideas, transforming text into moving images without the need for manual video editing.

Sora can generate videos up to a minute long, providing high visual quality and precise adherence to user instructions. The model is trained to understand the physical world in motion, allowing it to create videos that help solve tasks involving real-world interactions.

Access to Sora has already been granted to teams of specialists to assess potential risks and harms, as well as to visual artists, designers, and filmmakers to gather feedback and further improve the model.

It looks incredible! An example of a generated video based on the prompt: “Prompt: A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.”

UPD:
If you think that Sora from OpenAI is just a creative toy similar to DALLE, think again. Sora is a data-driven engine that simulates multiple worlds, whether real or fantastical. The simulator masters complex rendering, "intuitive" physics, long-horizon reasoning, and semantic anchoring, all thanks to specific denoising methods and gradient mathematics.

I wouldn't be surprised if Sora was trained on a large amount of synthetic data using Unreal Engine 5.

More details and examples at the link below.

🔗 Link: https://openai.com/sora

#ai #gpt #llm #text2video #3d

МР