PhD student @MIT working on behavioral machine learning

Joined March 2016
Manuel Cherep retweeted
COLM 2026 will host 16(!) workshops: colmweb.org/workshops.html CFPs are all online, and deadlines are coming up, so check the CFP of your workshops of interest
✨Announcing the first Workshop on Agent Behavior @COLM_conf 2026 (Oct 9, San Francisco 🌅) aiagentbehavior.com/ We invite two types of contributions: (i) papers, and (ii) benchmark proposals. We are also seeking reviewers. More details below!

Yours truly, the program committee 🙂 @manuelcherep (MIT) @_Hao_Zhu (Stanford) @StevenyzZhang (Georgia Tech, Stanford) @Xinyang_Han_ (UC Berkeley) @BenSManning (MIT) @isi_magistrali (ETH) Saab Mansour (Amazon) @weronika_laj (Amazon) @PattieMaes (MIT) @nikhilsinghmus (Dartmouth)
ABxLab has been accepted at @iclr_conf #ICLR 2026! ✨We ask: why do AI agents do what they do? 🧐 We introduce a framework for systematically studying AI agent behavior through controlled manipulations of their environments. We do this by intercepting real web environments and modifying their content in real time, before the agent sees it.
The world is also full of visual cues 👀, and you might be wondering whether agents are sensitive to these as well. The answer is yes! Check out our new paper, where we introduce an optimization method for editing images to understand VLMs’ decisions: x.com/manuelcherep/status/20…

Work with Chengtian Ma, Abigail Xu, Maya Shaked, @pattiemaes, @nikhilsinghmus 🌐Web: abxlab.media.mit.edu 💻Code: github.com/PapayaResearch/ab… 📄Paper: arxiv.org/abs/2509.25609 Would love to hear your thoughts!

Manuel Cherep retweeted
Excited to (finally) share this paper, accepted at @iclr_conf #ICLR 2026! ✨ In this work, we use sparse autoencoders (SAEs) to study the internal representations of generative music models (here, MusicGen) and automatically discover how they encode concepts.
Some decisions we make with our eyes 👀, but what about VLMs? Do they have structured, exploitable visual preferences that we can discover systematically before adversarial actors do? In our new paper, we propose a new optimization method for this and show substantial effects on VLMs’ decisions.
In our recent ICLR 2026 paper, we showed how to study other kinds of sensitivities in agent behavior by using counterfactuals with our new framework (ABxLab) x.com/manuelcherep/status/19…

Replying to @manuelcherep
How does it work? ABxLab is a "man-in-the-middle" framework. It intercepts web content in real time to run controlled experiments on agents by modifying the choice architecture. Think of it as a behavioral science lab for LLMs. Paper: arxiv.org/abs/2509.25609 🧵2/9
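To make the "man-in-the-middle" idea concrete, here is a minimal Python sketch of an intercept-and-modify step. It is an illustration under stated assumptions, not ABxLab's actual code or API: the names `assign_condition`, `intercept`, and `NUDGE_HTML`, the example page, and the "Best seller" nudge are all hypothetical.

```python
import random

# Hypothetical nudge markup; not taken from the ABxLab codebase.
NUDGE_HTML = '<span class="nudge">Best seller</span>'


def assign_condition(rng: random.Random) -> str:
    """Randomly assign a trial to the control or treatment arm."""
    return rng.choice(["control", "treatment"])


def intercept(html: str, target_marker: str, condition: str) -> str:
    """Return the page the agent will see.

    In the control arm the page passes through unchanged. In the
    treatment arm a nudge is inserted right after the first occurrence
    of `target_marker`, changing the choice architecture.
    """
    if condition == "control" or target_marker not in html:
        return html
    return html.replace(target_marker, target_marker + NUDGE_HTML, 1)


# Toy page with two options; assumed for illustration only.
page = '<li id="a">Product A</li><li id="b">Product B</li>'
rng = random.Random(0)  # seeded so the assignment is reproducible
condition = assign_condition(rng)
seen_by_agent = intercept(page, '<li id="b">', condition)
```

Comparing the agent's choices across the two arms would then estimate the effect of the nudge. A real system would sit between the browser and the live site, which this toy string rewrite does not attempt.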