
Who needs Sora when you’ve got Meta Movie Gen?

A lady holding a pocket-sized bear on a deck overlooking the ocean. Image: Meta

Meta revealed Movie Gen, its third wave of multimodal video AI, on Friday. It promises to “produce custom videos and sounds, edit existing videos, and transform your personal image into a unique video,” while outperforming similar models like Runway’s Gen-3, Kuaishou Technology’s Kling 1.5, and OpenAI’s Sora.


Meta Movie Gen builds on the company’s earlier work, first with its multimodal Make-A-Scene models and then with its Llama Image foundation models. Movie Gen combines all of these capabilities (video generation, personalized video generation, precise video editing, and audio generation) to give creators finer-grained control. “We anticipate these models enabling various new products that could accelerate creativity,” the company wrote in its announcement post.

For video generation, Movie Gen relies on a 30B-parameter model that outputs up to 16-second clips, albeit at a pokey 16 frames per second (fps). “These models can reason about object motion, subject-object interactions, and camera motion, and they can learn plausible motions for a wide variety of concepts,” Meta said, “making them state-of-the-art models in their category.” Using that same model, Movie Gen can create personalized videos for creators from still images.
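
For context, the arithmetic behind those specs is straightforward. Here’s a quick sketch in Python (the 24 fps comparison is our own illustration, not Meta’s):

```python
# Back-of-the-envelope math for Movie Gen's stated video specs:
# 16-second clips generated at 16 frames per second.
CLIP_SECONDS = 16
NATIVE_FPS = 16

total_frames = CLIP_SECONDS * NATIVE_FPS
print(total_frames)  # 256 frames per clip

# Played back at a standard 24 fps, those same 256 frames
# would run noticeably shorter:
print(total_frames / 24)  # ~10.7 seconds
```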

Meta employs a variant of that video-generation model that takes both video and text inputs to precisely edit the content it generates. It can perform both localized edits, such as adding, removing, or replacing elements, and global edits, like applying a new cinematic style. To generate audio, Movie Gen relies on a separate 13B-parameter model that can create up to 45 seconds of audio (ambient background noise, sound effects, or instrumental scores) while automatically syncing that content to the video.
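
Meta hasn’t published the sync mechanism itself, but the basic bookkeeping is simple: the audio track has to match the clip’s duration. A minimal illustrative sketch of our own, using NumPy and an assumed 48 kHz sample rate (fit_audio_to_clip is a hypothetical helper, not Meta’s code):

```python
import numpy as np

SAMPLE_RATE = 48_000  # assumed audio sample rate

def fit_audio_to_clip(audio: np.ndarray, clip_seconds: float) -> np.ndarray:
    """Trim or zero-pad an audio track to exactly match a clip's length."""
    target = int(round(clip_seconds * SAMPLE_RATE))
    if len(audio) >= target:
        return audio[:target]                # trim overlong audio
    padding = np.zeros(target - len(audio))  # pad short audio with silence
    return np.concatenate([audio, padding])

# A 45-second track comfortably covers a 16-second Movie Gen clip:
track = np.zeros(45 * SAMPLE_RATE)
print(len(fit_audio_to_clip(track, 16)) / SAMPLE_RATE)  # 16.0
```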

According to Meta’s white paper, Movie Gen consistently won out in A/B tests against other state-of-the-art video AIs, including Gen-3, Sora, and Kling 1.5, in the video generation category. It also topped ID-Animator in personalized video generation and Pika Labs Sound Gen in audio generation, and it bested Gen-3 a second time in video editing. Based on the demo videos we’ve seen so far, Movie Gen far outclasses the current batch of free-to-use video generators as well.
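
Results like these are typically reported as a net win rate over pairwise human judgments, i.e., the percentage of wins minus the percentage of losses. A minimal sketch of that calculation (the counts below are invented placeholders, not figures from Meta’s paper):

```python
def net_win_rate(wins: int, losses: int, ties: int) -> float:
    """Return win% minus loss% across all pairwise A/B judgments."""
    total = wins + losses + ties
    return 100.0 * (wins - losses) / total

# Hypothetical: Model A vs. Model B over 1,000 prompt pairs.
print(f"{net_win_rate(wins=480, losses=320, ties=200):+.1f}%")  # +16.0%
```

A positive value means judges preferred the first model more often than the second; zero means a dead heat.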

The company says it plans to “work closely with filmmakers and creators to integrate their feedback” as it continues to develop these models, but was quick to point out that it has no intention of displacing human creators with AI. “We’re sharing this research because we believe in the power of this technology to help people express themselves in new ways and to provide opportunities to people who might not otherwise have them,” the company wrote. “Our hope is that perhaps one day in the future, everyone will have the opportunity to bring their artistic visions to life and create high-definition videos and audio using Movie Gen.”

Meta and Google made AI news this week. Here were the biggest announcements

From Meta's AI-empowered AR glasses to its new Natural Voice Interactions feature to Google's AlphaChip breakthrough and ChromaLock's chatbot-on-a-graphing calculator mod, this week has been packed with jaw-dropping developments in the AI space. Here are a few of the biggest headlines.

Google taught an AI to design computer chips
Deciding how and where all the bits and bobs go into today's leading-edge computer chips is a massive undertaking, often requiring agonizingly precise work before fabrication can even begin. Or it did, at least, before Google released its AlphaChip AI this week. Similar to AlphaFold, which generates potential protein structures for drug discovery, AlphaChip uses reinforcement learning to generate new chip designs in a matter of hours, rather than months. The company has reportedly been using the AI to design layouts for the past three generations of Google’s Tensor Processing Units (TPUs), and is now sharing the technology with companies like MediaTek, which builds chipsets for mobile phones and other handheld devices.

GPT-4: everything you need to know about ChatGPT’s standard AI model

People were in awe when ChatGPT came out, impressed by its natural language abilities as an AI chatbot originally powered by the GPT-3.5 large language model. But when the highly anticipated GPT-4 large language model came out, it blew the lid off what we thought was possible with AI, with some calling it an early glimpse of AGI (artificial general intelligence).
What is GPT-4?
GPT-4 is the latest generation of OpenAI’s language models, with GPT-4o as the most recent specific version. It advances the technology used by ChatGPT, which was previously based on GPT-3.5 but has since been updated. In short, GPT-4 is superior to GPT-3.5. GPT stands for Generative Pre-trained Transformer, a deep learning architecture that uses artificial neural networks to write like a human.

According to OpenAI, this next-generation language model is more advanced than ChatGPT in three key areas: creativity, visual input, and longer context. In terms of creativity, OpenAI says GPT-4 is much better at both creating and collaborating with users on creative projects. Examples of these include music, screenplays, technical writing, and even "learning a user's writing style."
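
For developers, the most direct way to try a GPT-4-class model is OpenAI’s API. A minimal sketch using the official openai Python package (v1+), assuming an OPENAI_API_KEY environment variable is set; the prompt is just an example:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # the latest GPT-4-class model
    messages=[
        {"role": "system", "content": "You are a screenwriting assistant."},
        {"role": "user", "content": "Pitch a one-line logline for a heist film."},
    ],
)
print(response.choices[0].message.content)
```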

This new text-to-video AI looks incredible, and you can try it for free

Eager AI enthusiasts flooded the Luma AI website on Wednesday, facing multi-hour waits to access the company's new free-to-use, high-definition AI video generator, Dream Machine, VentureBeat reports.

What's all the excitement about? Well, the Andreessen Horowitz-backed startup's model promises to generate 120 frames of video in as little as 120 seconds. And based on some of the examples being shared online so far, it's pretty impressive.
