To compete with other artificial intelligence (AI) companies in the fast-evolving field of media generation, Meta has introduced a new model called Movie Gen, which lets users generate realistic-looking videos, complete with sound effects and background music, from text prompts alone.
“We present Movie Gen, a cast of foundation models that generates high-quality, 1080p HD videos with different aspect ratios and synchronized audio. We also show additional capabilities such as precise instruction-based video editing and generation of personalized videos based on a user’s image. Our models set a new state-of-the-art on multiple tasks: text-to-video synthesis, video personalization, video editing, video-to-audio generation, and text-to-audio generation,” Meta said in its research paper.
Movie Gen was reportedly trained on a large dataset of videos and audio clips, and Meta claims that its output is comparable to, or even better than, that of other AI tools. It can also create realistic faces, environments, and animals. The launch follows the lead of Hollywood studios that already use AI in production, for example to generate storyboards or create special effects.
Given the looming risk that AI will be misused for malicious purposes, some skepticism is understandable; actress Scarlett Johansson, for instance, blasted OpenAI over a chatbot voice that she said resembled her own and was used without her consent. Meta, however, says it has gone a step further by partnering with the entertainment industry on the release of Movie Gen and weighing the risks and benefits before making it available.
“By taking a collaborative approach, we want to ensure we’re creating tools that help people enhance their inherent creativity in new ways they may have never dreamed would be possible,” Meta stated in its blog post.