AI Research Scientist at Meta Superintelligence Labs (FAIR). prev: Ph.D. @ FAIR & Polytechnique, engineering @ Polytechnique (X18), MVA @ ENS.

Joined February 2022
Pinned Tweet
OpenAI may secretly know that you trained on GPT outputs! In our work "Watermarking Makes Language Models Radioactive", we show that training on watermarked text can be easily spotted ☢️ Paper: arxiv.org/abs/2402.14904 @pierrefdz @AIatMeta @Polytechnique @Inria
Tom Sander (Ph.D.) retweeted
1/9 Excited to share TextSeal, our new state-of-the-art watermark for large language models at FAIR / Meta Superintelligence Labs (@AIatMeta) 🔐 Paper: arxiv.org/abs/2605.12456 Code: github.com/facebookresearch/…
8/9 Novelty 3: fast localized detection. Real documents are often mixed: some human text, some AI-generated text. TextSeal searches for watermarked regions (previous figure), so detection remains strong even when the signal is diluted (results here)🧭
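The localized-detection idea above can be sketched with a generic sliding-window test. This is a minimal illustration, not TextSeal's actual detector: it assumes each token has already been reduced to a hypothetical boolean "hit" flag that is true with probability `gamma` on unwatermarked text, and it scans every contiguous window for the highest z-score. The names `window_z`, `best_window`, `gamma`, and `min_len` are all invented for this example.

```python
import math
import random


def window_z(hits, n, gamma):
    """z-score of seeing `hits` flagged tokens among `n`, if each is flagged w.p. gamma."""
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))


def best_window(flags, gamma=0.5, min_len=20):
    """Return (z, start, end) for the contiguous span flags[start:end] with the highest z-score.

    Brute-force O(n^2) scan over all windows of at least `min_len` tokens,
    using prefix sums so that each window is scored in O(1).
    """
    prefix = [0]
    for f in flags:
        prefix.append(prefix[-1] + int(f))
    best = (float("-inf"), 0, 0)
    for start in range(len(flags)):
        for end in range(start + min_len, len(flags) + 1):
            z = window_z(prefix[end] - prefix[start], end - start, gamma)
            if z > best[0]:
                best = (z, start, end)
    return best


# Synthetic mixed document (assumed data): human text, a watermarked span, human text.
random.seed(0)
human_a = [random.random() < 0.5 for _ in range(200)]
marked = [random.random() < 0.9 for _ in range(60)]
human_b = [random.random() < 0.5 for _ in range(200)]
flags = human_a + marked + human_b

z_full = window_z(sum(flags), len(flags), 0.5)  # one test over the whole document
z_best, start, end = best_window(flags)         # test localized to the best span
print(f"whole document z = {z_full:.2f}")
print(f"best window z = {z_best:.2f} over tokens [{start}, {end})")
```

Because the scan takes a maximum over many windows, its score would need a multiple-testing correction before being read as a p-value; the sketch leaves that out.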
9/9 Beyond provenance, TextSeal is “radioactive”: its signal can transfer through model distillation, helping detect when another model was trained on watermarked outputs. Try it out! Code is Apache 2.0. Paper: arxiv.org/abs/2605.12456 Code: github.com/facebookresearch/…

Delighted to share that last month, I successfully defended my Ph.D. in Mathematics! 🎓 Huge thanks to my incredible advisors, Chuan Guo at @MetaAI (FAIR) and Alain Durmus at @Polytechnique, for their phenomenal mentorship and support throughout this journey.
My research focuses on the intersection of machine learning and security, specifically Privacy, Traceability, Provenance and Watermarking in Deep Learning. It has been incredibly rewarding to work on making AI models more secure, transparent and accountable.
A sincere thank you to my thesis committee, my brilliant colleagues at FAIR and Polytechnique, and everyone who has encouraged me along the way. 🚀 scholar.google.com/citations…
Tom Sander (Ph.D.) retweeted
A couple of months after OmniASR, we’re excited to release OmniSONAR alongside OmniMT. OmniSONAR brings new training recipes for cross-lingual and cross-modal sentence encoders, enabling massively multilingual embeddings for text and speech. tinyurl.com/omnisonar 🧵 1/3

Most text watermarking methods focus on generation time. But what about existing text? We explore "Post-Hoc Watermarking," using an LLM to rephrase and watermark copyrighted books, training data, or similar content. 🧵 arxiv.org/abs/2512.16904 github.com/facebookresearch/…
Why does this matter? "Watermark Radioactivity." If we watermark specific documents post-hoc, we can detect if they are used to train future models or retrieved in RAG systems. It turns passive data into active tracers.