New Llama 4 models put Meta back in AI race against China

Published 7 Apr 2025

Meta released two new Llama 4 versions on April 5, 2025. The latest artificial intelligence (AI) models feature significantly expanded context windows and multimodal capabilities. The company aims to compete with leading AI systems from both U.S. and Chinese tech firms.

The new lineup includes Llama 4 Scout and Llama 4 Maverick, which are available now, while a more powerful Llama 4 Behemoth remains in development. Meta CEO Mark Zuckerberg said these models would power Meta AI across WhatsApp, Messenger, Instagram, and the web.

“Our goal is to build the world’s leading AI, open source it, and make it universally accessible so that everyone in the world benefits,” Zuckerberg said in an Instagram video. “With Llama 4, that is starting to happen.”

Llama 4 Scout can process 10 million tokens at once – roughly 15,000 pages of text in a single interaction. This lets it handle complex tasks like analyzing entire codebases or summarizing multiple documents at once. The model contains 17 billion active parameters with 16 experts and fits on a single NVIDIA H100 GPU.
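The 15,000-page figure can be sanity-checked with a rough back-of-the-envelope calculation. The ratios below are common rules of thumb for English text, not Meta's own figures:

```python
# Rough estimate: how many pages fit in a 10-million-token context window.
# Assumptions (rules of thumb, not Meta's numbers):
TOKENS_PER_WORD = 4 / 3   # ~0.75 English words per token
WORDS_PER_PAGE = 500      # typical single-spaced page

context_tokens = 10_000_000
words = context_tokens / TOKENS_PER_WORD   # ~7.5 million words
pages = words / WORDS_PER_PAGE
print(f"~{pages:,.0f} pages")  # ~15,000 pages
```

The estimate lands on the article's figure, though real token counts vary with language and tokenizer.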

Llama 4 Maverick, described as the “workhorse” model, contains 17 billion active parameters with 128 experts. Meta claims it outperforms competitors like OpenAI’s GPT-4o and Google’s Gemini 2.0 Flash across coding, reasoning, multilingual, and image benchmarks.

[Image: Llama 4 Maverick instruction-tuned benchmarks. Source: Meta]

Both models use a mixture-of-experts (MoE) architecture, which activates only a subset of expert subnetworks for each input rather than the full model, improving efficiency. This cuts costs dramatically, with Maverick’s inference cost estimated at $0.19–$0.49 per million tokens compared to GPT-4o’s $4.38.
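The efficiency gain comes from routing: a small gating layer scores all experts, but only the top-scoring ones actually run, so active parameters stay far below total parameters. A minimal sketch of top-k expert routing follows; this is illustrative only and not Meta's implementation, whose routing details are not public:

```python
import numpy as np

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input x through the top_k experts picked by a gating layer.

    x: (d,) input vector; experts: list of (d, d) weight matrices;
    gate_weights: (num_experts, d) gating matrix. Illustrative sketch only.
    """
    scores = gate_weights @ x                    # one routing score per expert
    top = np.argsort(scores)[-top_k:]            # indices of the top_k experts
    # softmax over only the selected experts' scores
    probs = np.exp(scores[top] - scores[top].max())
    probs /= probs.sum()
    # only the chosen experts run -- all others are skipped entirely
    return sum(p * (experts[i] @ x) for p, i in zip(probs, top))

rng = np.random.default_rng(0)
d, num_experts = 8, 16                           # 16 experts, as in Scout
experts = [rng.standard_normal((d, d)) for _ in range(num_experts)]
gate = rng.standard_normal((num_experts, d))
y = moe_forward(rng.standard_normal(d), experts, gate, top_k=2)
print(y.shape)  # (8,)
```

With 16 experts and top-2 routing, only an eighth of the expert weights are touched per input, which is the mechanism behind the per-token cost advantage the article cites.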

The models represent Meta’s response to growing competition from Chinese AI firms. David Sacks, the White House’s AI and crypto czar, praised Llama 4 on social media, saying, “For the US to win the AI race, we have to win in open source too, and Llama 4 puts us back in the lead.”

Meta also announced that Llama 4 Behemoth, a massive 2-trillion-parameter teacher model, is still being trained. According to company benchmarks, Behemoth already outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks.

While Meta calls these models “open source,” they carry license restrictions: commercial entities with more than 700 million monthly active users must request special permission to use them. The Open Source Initiative has previously argued that such terms place Llama models outside the definition of true open source.

Meta plans to share more about its AI plans at its upcoming LlamaCon conference on April 29, including details about another model called Llama 4 Reasoning.