A Chinese artificial intelligence (AI) startup has sent shockwaves through Silicon Valley by creating an AI model matching industry leaders’ performance. The company, DeepSeek, achieved the feat while spending just $5.6 million, a fraction of what U.S. companies invest.
Founded by former hedge fund manager Liang Wenfeng, DeepSeek released its R1 model this week. In benchmark tests, it performs on par with OpenAI's o1 model. The company achieved this using only 2,048 Nvidia H800 chips, compared with the roughly 10,000 GPUs reportedly used to train ChatGPT.
“DeepSeek aimed for accurate answers rather than detailing every logical step, significantly reducing computing time while maintaining a high level of effectiveness,” said Dimitris Papailiopoulos, principal researcher at Microsoft’s AI Frontiers.
The company has also released six smaller versions of R1 that can be run offline on laptops. DeepSeek even claims one of them can outperform OpenAI’s o1-mini.
The breakthrough comes despite U.S. government export restrictions on Chinese AI hardware. In 2021, Liang began stockpiling Nvidia chips, accumulating over 10,000 units before the restrictions took effect. “The U.S. export control has essentially backed Chinese companies into a corner where they have to be far more efficient with their limited computing resources,” said Matt Sheehan, an AI researcher at the Carnegie Endowment for International Peace.
The news particularly rattled Meta, the company behind the Llama family of AI models. Meta's AI research division reportedly entered "panic mode" after DeepSeek's models outperformed its Llama series. "Adding insult to injury was the 'unknown Chinese company with 5.5 million training budget,'" a Meta staff member posted on Teamblind.
The majority of DeepSeek's team are recent graduates and PhD students from top Chinese universities. The company focuses on fundamental research rather than commercial applications, which has allowed it to attract top talent and offer some of the highest salaries in China's AI industry. DeepSeek has offices in Beijing and Hangzhou.
“Our core technical positions are mostly filled by people who graduated this year or in the past one or two years,” Liang told Chinese media outlet 36Kr in 2023.
However, DeepSeek's rise raises some concerns. The company complies with Chinese state censorship requirements, and questions persist about its training methods. Reports indicate that its models sometimes identify themselves as ChatGPT, suggesting they may have been trained on Western AI outputs without permission. The situation highlights ongoing debates about who sets the rules for AI development and training.
The company's rapid advancement has caught the attention of China's leadership. Liang was recently selected as the only AI leader to attend a meeting with Li Qiang, China's second-most powerful official, where Li urged entrepreneurs to "concentrate efforts to break through key core technologies."
DeepSeek’s rise shows that innovation doesn’t always need big budgets. By focusing on efficiency and open-source models, the company is challenging Silicon Valley’s dominance in AI. Industry experts suggest this could increase competition and innovation in AI development worldwide.