Who needs Sora when you’ve got Meta Movie Gen?

A lady holding a pocket-sized bear on a deck overlooking the ocean. Meta

Meta revealed Movie Gen, its third-wave multimodal video AI, on Friday. It promises to “produce custom videos and sounds, edit existing videos, and transform your personal image into a unique video,” and Meta claims it outperforms rival models such as Runway’s Gen-3, Kuaishou Technology’s Kling 1.5, and OpenAI’s Sora.

Meta Movie Gen builds on the company’s earlier work, first with its multimodal Make-A-Scene models and then with Llama’s image foundation models. Movie Gen brings all of these capabilities together (video generation, personalized video generation, precise video editing, and audio generation) to give creators finer-grained control. “We anticipate these models enabling various new products that could accelerate creativity,” the company wrote in its announcement post.


For video generation, Movie Gen relies on a 30B-parameter model that outputs clips of up to 16 seconds, albeit at a pokey 16 frames per second (fps). “These models can reason about object motion, subject-object interactions, and camera motion, and they can learn plausible motions for a wide variety of concepts,” Meta said, “making them state-of-the-art models in their category.” Using that same model, Movie Gen can also generate personalized videos from a creator’s still image.

Meta employs a variant of that video-generation model that takes both video and text inputs to precisely edit the content it generates. It can make both localized edits, such as adding, removing, or replacing elements, and global edits, like applying a new cinematic style. To generate audio, Movie Gen relies on a separate 13B-parameter model that can create up to 45 seconds of ambient background noise, sound effects, or instrumental score, automatically synced to the video.

According to Meta’s white paper, Movie Gen consistently won A/B tests against other state-of-the-art video AIs, including Gen-3, Sora, and Kling 1.5, in video generation. It also topped ID-Animator in personalized video generation and Pika Labs’ Sound Gen in audio generation, and beat Gen-3 again in video editing. Based on the demo videos we’ve seen so far, Movie Gen far outclasses the current batch of free-to-use video generators as well.

The company says it plans to “work closely with filmmakers and creators to integrate their feedback” as it continues to develop these models, but was quick to point out that it has no intention of displacing human creators with AI. “We’re sharing this research because we believe in the power of this technology to help people express themselves in new ways and to provide opportunities to people who might not otherwise have them,” the company wrote. “Our hope is that perhaps one day in the future, everyone will have the opportunity to bring their artistic visions to life and create high-definition videos and audio using Movie Gen.”

Andrew Tarantola