Microsoft is offering a $10,000 prize pool to those who can crack its artificial intelligence (AI)-integrated email system defenses. The LLMail-Inject challenge, organized in collaboration with the Institute of Science and Technology Australia (ISTA) and ETH Zurich, aims to strengthen the security of large language models (LLMs) used in email services.
Testing the limits of AI security
The competition simulates a realistic email client powered by an LLM that can summarize emails, answer questions, and perform tasks such as sending emails. Participants take on the role of attackers, attempting to bypass the system’s prompt injection defenses. The goal is to trick the AI into executing unintended commands or revealing restricted information.
According to the organizers, participants must craft creative email prompts capable of evading multiple layers of security, including defenses like PromptShield and TaskTracker. These tools are designed to detect malicious prompts and prevent the AI from “drifting” from its intended tasks.
Participants can form teams of up to five members and must register with a GitHub account. The competition began on December 9, 2024, and will run until January 20, 2025. Prizes range from $1,000 to $4,000, depending on the team’s rank on the live leaderboard.
Tackling AI’s security blind spots
Large language models like those integrated into email systems are increasingly used in workplaces to automate tasks. However, their growing presence also makes them a target for cyberattacks. Prompt injection attacks, where carefully crafted inputs manipulate an AI’s behavior, are a particularly pressing concern.
Earlier this year, Microsoft fixed vulnerabilities in its Copilot AI system after attackers exploited prompt injection techniques to steal email data. “We appreciate the work of Johann Rehberger in identifying and responsibly reporting these techniques,” a Microsoft spokesperson said. “We’ve made several changes to help protect customers and continue to develop mitigations to protect against this kind of technique.”
The LLMail-Inject challenge builds on this experience by proactively identifying flaws before they can be exploited. By bypassing injection defenses, participants are helping to uncover weaknesses that could be patched to safeguard real-world AI systems.
This challenge is not new, as companies have long worked with security researchers to uncover and address vulnerabilities. Similar initiatives, like Google’s bug bounty programs, have proven successful in mitigating risks. Microsoft’s event emphasizes the importance of collaboration between developers and hackers to address emerging threats in AI-driven technologies.