OpenAI has introduced its latest generative artificial intelligence (GAI) tool, Sora, a text-to-video model from the creators of DALL-E and ChatGPT. Capable of transforming a simple still image or brief written prompt into up to a minute of highly realistic video, Sora represents a major step forward in AI-generated media.
A Glimpse into Sora
Announced on February 15, 2024, Sora is not yet available to the general public. OpenAI is currently providing access to a select group of artists and “red-team” hackers, who are testing its potential benefits and identifying harmful applications. Despite its limited release, OpenAI has shared several sample videos generated by Sora through an announcement blog post, a brief technical report, and CEO Sam Altman’s profile on X (formerly Twitter).
What Sora Can Do
Sora can generate videos up to 60 seconds long, with the option to extend this by creating additional sequential clips. This marks a significant improvement over previous GAI tools, which have struggled to maintain consistency from frame to frame and fidelity to the prompt. Sora’s ability to produce coherent, realistic clips sets it apart as a milestone in AI-generated video technology.
The Technology Behind Sora
At its core, Sora is a diffusion model built on a transformer architecture similar to ChatGPT’s. Trained to associate text captions with corresponding video content, Sora starts from visual noise and iteratively removes it until a coherent clip emerges. Unlike image generators that decode text into a grid of still pixels, Sora operates on “spacetime patches,” small blocks of video that span both space and time, to produce complete clips.
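To make the spacetime-patch idea concrete, here is a minimal, purely illustrative sketch in Python. OpenAI has not published Sora’s code, so the patch size, the `patchify` helper, and the toy denoising step below are all assumptions for demonstration; in the real model, a learned transformer, not a hand-written rule, predicts the noise to remove at each step.

```python
import numpy as np

def patchify(video, patch=(2, 4, 4)):
    """Split a video tensor (frames, height, width) into flattened
    spacetime patches: small 3D blocks spanning time and space.
    The patch size here is an arbitrary choice for illustration."""
    t, h, w = video.shape
    pt, ph, pw = patch
    return (video
            .reshape(t // pt, pt, h // ph, ph, w // pw, pw)
            .transpose(0, 2, 4, 1, 3, 5)
            .reshape(-1, pt * ph * pw))  # one row per spacetime patch

def toy_denoise(noisy_patches, steps=10):
    """Stand-in for diffusion sampling: each iteration removes a
    fraction of the noise. A real model would use a trained
    transformer to predict the noise conditioned on the text prompt."""
    x = noisy_patches.copy()
    for _ in range(steps):
        x = x - 0.3 * (x - x.mean(axis=1, keepdims=True))
    return x

# Start from pure noise, as a diffusion sampler does.
noise = np.random.randn(4, 8, 8)
patches = patchify(noise)
clip_patches = toy_denoise(patches)
```

The key design point this sketch captures is that the denoiser operates on patch tokens rather than whole frames, which is what lets a transformer attend jointly across space and time and keep objects consistent between frames.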
OpenAI has been somewhat secretive about the specifics of Sora’s development and training. The company has stated that it relied on licensed and publicly available video content for training, though some experts speculate that synthetic data from video game design programs like Unreal Engine might have been used as well. The model’s impressive capabilities are attributed to massive amounts of training data and billions of program parameters running on substantial computing power.
Toys ‘R’ Us and Sora: A Case Study
Toys ‘R’ Us, recently revived inside Macy’s store locations after closing all of its U.S. stores in 2018, is the first brand to use Sora for advertising. The toy retailer premiered a 66-second ad at Cannes Lions, the annual advertising industry event held on the French Riviera. The ad, created by the brand’s creative partner Native Foreign, showcased Sora’s capabilities and drew mixed reactions online.
Some viewers were excited by the integration of generative AI into commercial work, while others, like writer Mike Drucker, criticized it. Drucker tweeted, “Love this commercial is like, ‘Toys R Us started with the dream of a little boy who wanted to share his imagination with the world. And to show how, we fired our artists and dried Lake Superior using a server farm to generate what that would look like in Stephen King’s nightmares.’”
The Future of Sora
OpenAI has not announced an official release date for Sora, though rumors suggest it may become publicly available by the end of summer. As anticipation builds, the capabilities Sora has already demonstrated point to a promising future for AI-generated video, and brands and creators alike are eager to explore its potential for storytelling and audience engagement.
For a closer look at Sora in action, you can view the Toys ‘R’ Us commercial here: Toys ‘R’ Us Sora Commercial.