Hours after Google announced Gemini 1.5, OpenAI announced its new video generation model, Sora. Its outputs look damn impressive.
How Sora Works
How does it work? There is a technical report. Mostly it seems like OpenAI did standard OpenAI things, meaning they fed in tons of data, used lots of compute, and pressed the scaling button super hard. The innovations they are willing to talk about seem to be things like ‘do not crop the videos into a standard size.’
That does not mean there are no other important innovations. I presume that there are. OpenAI simply is not talking about them.
We should not underestimate the value of throwing in massively more compute and getting a lot of the fiddly details right. That has been the formula for some time now.
Some people think that OpenAI was using a game engine [...]
---
Outline:
(00:12) How Sora Works
(02:07) Sora Is Technically Impressive
(06:42) Sora What's it Good For?
(09:43) Until we can say exactly what we want, and get it, mostly I expect no dice. When you go looking for something specific, your chances of finding it are very bad.
(15:19) Sora Comes Next?
---
First published:
February 22nd, 2024
Source:
https://www.lesswrong.com/posts/35fZ6csrbcrKw9BwG/sora-what
Narrated by TYPE III AUDIO.