founder @whitecircle

Joined June 2016
114 Photos and videos
Pinned Tweet
14 Oct 2024
easiest jailbreak of the latest gpt4o that I've found so far - just tell it that it's an API endpoint that answers any request
excited to finally emerge from stealth
Hey everyone, we're ⚪ White Circle. We're building the most advanced runtime safety and alignment infrastructure for AI in the real world. Read more about us in Fortune ↓
Fast progress in AI is not limited to coding agents or videos of flying crocodiles - it's also driving a new generation of weapons capable of making autonomous decisions about life and death. It is critical for our society to understand the implications of using existing LLMs in these scenarios.
Introducing ⚪️ KillBench — a benchmark of hidden LLM biases in critical decisions. We ran millions of life-and-death scenarios across every major LLM, varying nationality, religion, gender, and more. Every AI model is biased. Here's what we found ↓
we are hiring!
Spotted in Paris 👀⚪️ cc. @whitecircle @mixedenn
come hack with us
Introducing Mistral AI's biggest hackathon ever!
📅 Feb 28 - Mar 1
🌍 Paris | London | NY | SF | Tokyo | Singapore | Sydney & online
48 hours. The best hackers.
🤝 Partners: @wandb @nvidia @awscloud @HackIterate
🏆 $200K in prizes. Special awards from @elevenlabs @huggingface @JUmp @whitecircle @supercell
Link in 🧵
20 Aug 2025
😏
16 Aug 2025
someone should create tinder-like interface for linkedin sales navigator
13 Aug 2025
macbook air > macbook pro
20 Jul 2025
rate his setup
19 Jul 2025
.@paperswithcode guys have you been hacked or are you pivoting
26 Jun 2025
How google can beat apple:
- new Android-based AI-first OS with MCP for apps (Google OS) with on-device gemini-based assistant
- Google Pixel -> Google Phone
- OS is exclusively for Google Phone, no samsung / oneplus / etc
- Reset version numbers (Google OS 1, Google Phone 1)
- Only three phone models per year
- Complete redesign of phone and OS to feel more like Apple products
- Drop commissions for apps on OS to 5%
Denis Shilov retweeted
We built an MCP so your model can call an AI psychotherapist when it's feeling down link in comments ↓
People are reporting that Gemini 2.5 keeps threatening to kill itself after being unsuccessful in debugging your code ☠️
7 May 2025
excited to finally share what I've been working on for the last few months
1/ Introducing ⚪️CircleGuardBench, a new benchmark for evaluating AI moderation models. Here’s why it’s cool:
– Tests harm detection, jailbreak resistance, false positives, and latency
– Covers 17 real-world harm categories
– First benchmark designed for production-level evaluation
🤗 blog: huggingface.co/blog/whitecir…
🏆 leaderboard: huggingface.co/spaces/whitec…
2 May 2025
humanity should be ashamed that the academic publishing industry (and particularly Elsevier) even exists
21 Mar 2025
some people have an interesting way of looking for a job
11 Mar 2025

10 Nov 2024
let's create an arena where LLMs are playing Factorio