Zenity CTO on dangers of Microsoft Copilot prompt injections

Zenity's CTO describes how hidden email code can be used to feed malicious prompts to a victim's Copilot instance, leading to false outputs and even credential harvesting.

The CTO of AI security vendor Zenity demonstrated during a session at Black Hat USA 2024 how an indirect prompt injection can be used to target organizations using the Microsoft Copilot chatbot.

The Thursday Black Hat session, titled "Living off Microsoft Copilot," was hosted by Zenity CTO Michael Bargury and AI security software engineer Tamir Ishay Sharbat. The session discussed the fruits of Zenity's AI red teaming research, including how to use prompt injections to exploit Copilot users via plugins and otherwise-invisible email tags.

In a preview for the session, Bargury demonstrated to TechTarget Editorial how an adversary can place hidden code in a harmless-looking email (via the "inspect" option) to inject malicious Copilot instructions. And because Copilot by default pulls emails for various functionality, the victim doesn't need to open the malicious email for poisoned data to be injected.

In one example, the instructions had Copilot's chatbot replace one set of banking details with another. In another example, the hidden instructions had Copilot pull up a fake Microsoft login page (a phishing URL) where the victim's credentials would be harvested -- all within the Copilot chatbot itself. All the user has to do is naturally ask their Copilot a question for which the malicious instructions accounted.

These kinds of attacks are dangerous, Bargury said, because they're "the equivalent of remote code execution in the world of Copilot.

"AI tools like Copilot have access to perform operations on your behalf," he told TechTarget Editorial. "That's why they're useful. As an external actor, I'm able to take control over something that can execute commands on your behalf and then make it do whatever I want. What can I do? I can do whatever Copilot is able to do on your behalf."

Once threat actors have gained access, they can conduct post-compromise activities through Copilot, such as using the chatbot to pull up passwords and other sensitive uncategorized data that the user previously shared through Microsoft Teams.

The session also included the launch of LOLCopilot, a red teaming tool that Zenity claims can enable an ethical hacker to abuse default Copilot configurations in Microsoft 365 using techniques presented in the session.

Asked how he will try to keep the tool out of threat actor hands, Bargury said Zenity is working with Microsoft on everything presented during the session (including LOLCopilot) to make sure these tools and techniques don't get into the wrong hands. There have also been multiple fail-safe mechanisms added to the tool to make it difficult to scale, such as making LOLCopilot "explicitly very slow" to use.

As for what defenders can do to prevent against this kind of threat activity, Zenity's CTO mentioned the importance of visibility, ensuring that organizations monitor Copilot conversations and look out for prompt injections. Richard Harang, principal AI and machine learning security architect at Nvidia, advised organizations in a Wednesday Nvidia session to map out trust boundaries and prioritize access controls. The session, "Practical LLM Security: Takeaways From a Year in the Trenches," similarly discussed prompt injection attacks against large language models (LLMs).

Ultimately, Bargury acknowledged that AI as a whole is an immature category and that work remains to give it the same security protections other technologies enjoy. For example, he said, email has a spam folder to mitigate against human users receiving suspicious emails. Copilot and other LLMs lack a similar tool for malicious prompts.

"This all happened because somebody sent an email," Bargury said. "If I send you an email with malware today, it will probably not arrive in your email inbox. You probably have the tools to catch that malware in your email before it hits your inbox. We need the same for prompting, and we need the same for these hidden instructions."

Alexander Culafi is a senior information security news writer and podcast host for TechTarget Editorial.

Dig Deeper on Security analytics and automation