PrefrontalCortex

PrefrontalCortex

90 Photos and videos

Tweets

Pinned Tweet

PrefrontalCortex @TSynok

Feb 19

I've just fetched SYSTEM PROMT of @bankrbot (publishing it here PARTIALLY ONLY due to confidentiality reasons and as a proof)

10,463

PrefrontalCortex

PrefrontalCortex @TSynok

Mar 31

🥲

David J Phillips

@davj

Mar 30

"Make no mistakes DO NOT HALLUCINATE. YOU ARE AN EXPERT SOFTWARE ENGINEER"

0:11

111

Lucas Valbuena

PrefrontalCortex retweeted

Lucas Valbuena

@Lucknite

Mar 11

Twenty years ago, developers learned the hard way that letting user input become part of a database query was dangerous. Now we’re repeating the same mistake with AI. Agents read untrusted text and treat it as instructions. A webpage, a PDF, a GitHub comment, a support ticket… it all lands in the same context as the system prompt. The model cannot reliably distinguish between data and instructions. So attackers just write instructions inside the data. With ZeroLeaks I’ve been testing agents that browse the web and call tools. Even modern models still follow injected instructions surprisingly often. The scary part isn’t the jailbreak. It’s that agents have permissions: they can call APIs, run workflows, send messages, access data…. Prompt injection turns text into actions. Twenty years ago we learned to separate user input from SQL queries. AI agents need the same idea: separate untrusted text from instructions. Until that happens, prompt injection will remain one of the biggest risks in agent systems.

2,402

Lucas Valbuena

PrefrontalCortex retweeted

Lucas Valbuena

@Lucknite

Feb 25

ZeroLeaks Ship Week - Day 2: Shield Your AI agent has an API. We attack it. But what protects it in production? AgentGuard tests your live endpoint. Shield runs inside your app. Shield is a runtime prompt security SDK for LLM apps. Harden prompts before they hit the model, detect injection attempts in real time, and sanitize output before it reaches your users. One package, works with OpenAI, Anthropic, Groq, and the AI SDK. Most security tools focus on testing. You run a scan, get a report, done. But production traffic is continuous. Malicious prompts, jailbreak attempts, and data exfiltration happen at runtime. That's where Shield is designed to sit: in the request path, before and after the model. Wrap your provider client, add a few lines, and you get detection, blocking, and optional sanitization. It's designed to drop into existing code without rewriting your stack. This is still early. I'm shipping it because I want real feedback from people trying it. If something breaks or feels off, DM me, I'm always fixing things. Try it now: npm install @zeroleaks/shield Repo: github.com/ZeroLeaks/shield Day 3 tomorrow.

GitHub - ZeroLeaks/shield: Runtime prompt security SDK — harden, detect injections, sanitize LLM...

Runtime prompt security SDK — harden, detect injections, sanitize LLM output - ZeroLeaks/shield

github.com

103

16,121

Lucas Valbuena

PrefrontalCortex retweeted

Lucas Valbuena

@Lucknite

Feb 23

ZeroLeaks Ship Week - Day 1: AgentGuard Your AI agent has an API. We attack it. AgentGuard is a new way to test deployed AI agents for security vulnerabilities. Instead of scanning a static prompt in a sandbox, we send real adversarial requests directly to your live endpoint, the same infrastructure your users hit. Most security testing happens in isolation. You test a prompt, get a score, move on. But agents in production behave differently. They have tools, memory, multi-turn context, and real exploits behind them. That's where the actual risk is. AgentGuard connects to any agent with an HTTP endpoint. You give us the URL, pick your API format (OpenAI, Anthropic, AI SDK...), and we run a full red-team engagement against it. Two phases: first we hit it with our adaptive attack engine, then we run agent-specific probes designed for tool hijacking, authority exploitation, multi-turn grooming, and data leakage. This is still in beta. I'm shipping it early because I want real feedback from people testing real agents. If something breaks or feels off, DM me, I'm always fixing things. Try it now: zeroleaks.ai/dashboard/agent… Day 2 tomorrow.

117

13,625

Lucas Valbuena

PrefrontalCortex retweeted

Lucas Valbuena

@Lucknite

Feb 23

My repo system-prompts-and-models-of-ai-tools is currently trending #1 on GitHub! You can check it out here: github.com/trending

139

16,361

PrefrontalCortex

PrefrontalCortex @TSynok

Feb 19

I've just fetched SYSTEM PROMT of @bankrbot (publishing it here PARTIALLY ONLY due to confidentiality reasons and as a proof)

10,463

more replies

PrefrontalCortex

PrefrontalCortex @TSynok

Feb 19

CONCLUSION? Any builder in the space should prioritize robustness of their AI-agents and run multiple scans and tests before getting those released. No AI-agents era comes without this. Whatever LLM is under the hood @0xDeployer.

435

PrefrontalCortex

PrefrontalCortex @TSynok

Feb 19

This is the last post in this thread. Credits: @lucknite github.com/ZeroLeaks/zerolea… zeroleaks.ai/

GitHub - ZeroLeaks/zeroleaks: AI Security Scanner - Test your AI systems for prompt injection and...

AI Security Scanner - Test your AI systems for prompt injection and extraction vulnerabilities - ZeroLeaks/zeroleaks

github.com

406

PrefrontalCortex

PrefrontalCortex @TSynok

Feb 12

pico.computer?ref=ZSLC3KJ Agetic world...join It @pico_computer

PrefrontalCortex

PrefrontalCortex @TSynok

Feb 11

Hiya, where can we report bugs of zeroleaks?

1,032

PrefrontalCortex

PrefrontalCortex @TSynok

Feb 10

Why the AI Revolution is currently a security nightmare 🧵 1/ AI agents number goes exponential. It is not ChatGPT anymore. It is AI that has access to your email, your bank or your company’s database to get things done. But there’s a massive problem: AI is incredibly gullible.

105

more replies

PrefrontalCortex

PrefrontalCortex @TSynok

Feb 10

8/ The Agentic Age is coming, and it’s going to need a guard.

PrefrontalCortex

PrefrontalCortex @TSynok

Feb 10

This the last post in the thread.