The world of video is about to be changed, forever.
OpenAI just announced Sora, an AI model that can create incredibly lifelike video from a simple text prompt.
As former videographers and editors ourselves, we find the technology truly incredible. Even in its 1.0 pre-beta form, the samples OpenAI shared show amazing promise. They should also make those in the world of video afraid for their livelihoods.
With just a sentence or two, the system can create short, high-definition video clips, just as image models like Midjourney or DALL·E do with pictures. However, modeling the physics, shading, movement, and dynamics of video is far harder than generating still images.
Prompt: A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.
Now, it’s important to remember that just one year ago, the state of the art in text-to-video was exemplified by this absolutely horrifying clip of Will Smith eating spaghetti. (YouTube)
One year later, the progress is staggering, and frankly, a bit scary. The model you see below is the worst this technology will ever be, from this point forward.
We think for applications like stock footage, background video, and more, many people in the world of videography may soon be out of a job.
Prompt: Historical footage of California during the gold rush.
“Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”
-OpenAI
Prompt: A litter of golden retriever puppies playing in the snow. Their heads pop out of the snow, covered in snow.
To answer some of the obvious security and safety concerns, OpenAI plans to safeguard the videos with digital ‘watermarks’ showing they were made with AI. Prompt filtering should, in theory, also block hateful or deceptive video footage.
“We’ll be taking several important safety steps ahead of making Sora available in OpenAI’s products. We are working with red teamers — domain experts in areas like misinformation, hateful content, and bias — who will be adversarially testing the model.
We’re also building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora. We plan to include C2PA metadata in the future if we deploy the model in an OpenAI product.”
Prompt: A young man at his 20s is sitting on a piece of cloud in the sky, reading a book.
The Sora model has not been released yet; it remains in a pre-beta phase for testing and safety work.
And while other companies are working toward this as well, OpenAI’s offering is widely seen as the most advanced.
Buckle up.