reward hacking for human-ai collaboration @StanfordNLP

Joined May 2020
Pinned Tweet
(1/3) My favorite figure from the paper. Nearly all open-source RL frameworks introduce an unintentional bias when computing the masked mean 😮. The fix? Just replace mask.sum with a constant.
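To illustrate the masked-mean point in the pinned tweet, here is a minimal sketch in plain Python. It is not the paper's code: it assumes per-token losses padded to a common length with a 0/1 mask, and the function names and the constant MAX_LEN are hypothetical. Dividing by mask.sum gives each token in a short response a larger weight than each token in a long one, while dividing by a constant weights every token equally.

```python
# Hypothetical constant divisor (for example, a fixed generation budget).
# This value is an assumption for illustration only.
MAX_LEN = 4


def masked_mean(values, mask):
    """Average over unmasked tokens; the divisor is mask.sum, so it varies with length."""
    return sum(v * m for v, m in zip(values, mask)) / sum(mask)


def masked_sum_over_constant(values, mask, constant=MAX_LEN):
    """Same masked sum, but divided by a fixed constant instead of mask.sum."""
    return sum(v * m for v, m in zip(values, mask)) / constant


def per_token_weight(mask, constant=None):
    """Weight that each unmasked token receives in the sequence-level loss."""
    return 1.0 / (sum(mask) if constant is None else constant)


short_mask = [1, 1, 0, 0]  # 2 real tokens, 2 padding tokens
long_mask = [1, 1, 1, 1]   # 4 real tokens
short_loss = [1.0, 1.0, 0.0, 0.0]
long_loss = [1.0, 1.0, 1.0, 1.0]

# Divisor = mask.sum: tokens in the short response weigh 0.5, in the long one 0.25.
print(per_token_weight(short_mask), per_token_weight(long_mask))

# Divisor = constant: every token weighs 0.25 regardless of response length.
print(per_token_weight(short_mask, MAX_LEN), per_token_weight(long_mask, MAX_LEN))
```

With the mask.sum divisor, the per-token weight depends on response length; with the constant divisor it does not, which is the length bias the tweet refers to.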
Changyu Chen retweeted
If you are into OPD, check out our 2019 paper "Divergence Minimization Perspective on Imitation Learning". I like to use a single formula/table to explain how methods relate to each other, from fundamentals. Or read DAgger (2010) :)
Changyu Chen retweeted
People are increasingly worried that AI tools make us overreliant. But how do we actually measure this? We introduce Offloading Score, a measure of reliance based on the fraction of cognitive effort offloaded to AI while completing a task. In a controlled user study, Offloading Score detects increased reliance under time pressure, while several common alternatives do not. (1/9)
Changyu Chen retweeted
🔴 LIVE this Thursday, May 28th | 6–7PM PST @augmind_fm goes live with @cjziems, @dorazhao9, and @Diyi_Yang to discuss their recent paper and the classroom experiment behind it. → Does AI make us happier? → What do we need from LLMs? → How do we reinvent the classroom? Live paper discussion and Q&A as well! Live stream link: youtube.com/live/2d49pMXiJOA… Use this link to mark your calendar: partiful.com/e/qV6V6oUiTyLQL…?
The next frontier of AI is not only a more capable model; it is an AI that *humans* can meaningfully live and work with :) With all students in my cs329x Human-Centered LLM class, we present 60 pages of insights for developing Human-Centered LLMs (HCLLMs), from design & data sourcing to training, eval & deployment 🧵
Changyu Chen retweeted
The next frontier of AI is not only a more capable model; it is an AI that *humans* can meaningfully live and work with :) With all students in my cs329x Human-Centered LLM class, we present 60 pages of insights for developing Human-Centered LLMs (HCLLMs), from design & data sourcing to training, eval & deployment 🧵
Changyu Chen retweeted
New preprint! In 5 studies (3k users / 12k convs, with a 3-wk longitudinal study), we find that sycophantic AI influences how people view those closest to them. It affects how effortful human interaction seems, how satisfying it is, & who people want to turn to for advice 🧵
Life update: I'm super excited to join @Stanford as a postdoc working with @Diyi_Yang! I’ll continue my research on RL, and recently I’ve become especially interested in how RL can contribute to human-AI collaboration and collaborative agents. A new chapter begins, from the sunny island to the sunny state ☀️🏝️
Most current AI systems are optimized to solve tasks autonomously, often trained by verifiable signals. But being a good collaborator requires much more: (real-time) communication, coordination, planning ahead, and working proactively. I’m really excited to see frontier labs looking into this direction and pushing genuine ideas. E.g. thinky's interaction models are awesome! @thinkymachines x.com/thinkymachines/status/…

People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/int…
And I feel very lucky to join the SALT lab, which has been thinking deeply about these problems for a long time.
Changyu Chen retweeted
After our major update to Collaborative Gym, the most common questions have been about the human side. Here’s a detailed thread on key findings from human workers’ CollabSkill 👇 🎙️ Motivated by these, we’re doing a YouTube livestream next Tuesday (5/19) — a crash course on meaningfully collaborating with AI agents.
AI agents are entering all kinds of work, not just software engineering. Which agent collaborates best with humans? How to handle inter-human variability and measure AI literacy? 📣Introducing CollabSkill — bringing human-agent collaboration skill measurement into Co-Gym.
Changyu Chen retweeted
How much of SQLite, FFmpeg, or a PHP compiler can LMs code from scratch, given just an executable and no starter code or internet access? Introducing ProgramBench: 200 rigorous, whole-repo generation tasks where models design, build, and ship a working program end to end. 🧵
Changyu Chen retweeted
Can you boost your AI review scores by asking an LLM to rewrite your paper? Yes! We call it paper laundering. Our @icmlconf spotlight paper argues current AI reviewers aren't ready to automate peer review, and outlines what a science of peer review automation should look like 🧵👇
Changyu Chen retweeted
AI agents are entering all kinds of work, not just software engineering. Which agent collaborates best with humans? How to handle inter-human variability and measure AI literacy? 📣Introducing CollabSkill — bringing human-agent collaboration skill measurement into Co-Gym.
A key post-training paradigm shift from @mimo_labs to DeepSeek is the move to multi-teacher on-policy distillation: building the generalist from a diverse pool of 10 domain experts. Again surprised by their RL infra, which supports full-vocabulary OPD with an unbounded (??) number of teachers.
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice. Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today! 📄 Tech Report: huggingface.co/deepseek-ai/D… 🤗 Open Weights: huggingface.co/collections/d… 1/n
super cool work on self-play algos. feels very aligned with @CarinaLHong's “spiral progression” vision for AI in mathematics from a recent podcast: youtube.com/watch?v=78Vyy_dz…
Self-play led to superhuman Go performance, why hasn’t it for LLMs? In practice, long run self-play plateaus like RL. We study why this happens, and build a self-play algorithm that scales better. It solves as many problems with a 7B model as the pass@4 of a model 100x bigger.
Changyu Chen retweeted
Many of us are here at #ICLR2026 presenting work around human-AI collaboration, evaluation and risks 🤩 Come talk to us during poster sessions: @michaelryan207 @StevenyzZhang @ChengleiSi
Changyu Chen retweeted
We're living in the BEST era for doing research. 💪 After I graduated from my PhD, the rise of AI-native research gave me a new chance to revisit my research experience. Lately, doing research feels incredibly rewarding to me. I get to experience the pure joy of curiosity-driven science because I no longer have to worry about the lower-level implementations or getting bogged down by infrastructure 🚀 (I'll be sharing some of my own recent research driven by this very soon!) But today, let me introduce the New Orchestra 🎻. We wanted to ship a product that absorbs the friction and brings science back to the curiosity.
Changyu Chen retweeted
New episode of the AM Podcast (@augmind_fm) is live!📺 In EP3, we are honored to invite Woosuk Kwon (@woosuk_k) to share about LLM inference from a brand new perspective! Woosuk is a co-founder & CTO of @inferact and creator of @vllm_project, who has a lot of experience in this space and also great insights on the next frontier of the AI infra. In this conversation, we cover:
- How his early projects shaped his taste for infra work
- How vLLM started and what made it take off
- How emerging apps are reshaping AI infra
- What's next: streaming requests, continual learning with RL, on-device inference, and more
This conversation really answered a lot of questions I personally have. Hopefully, it can offer something new to those working on the higher end of user-facing applications as well as the lower end of AI infrastructure!
"Actually, we (vllm) get more users from the simple UX than vllm performance" For our third guest, we welcome @woosuk_k, co-founder & CTO of @inferact and creator of @vllm_project. To us, Woosuk is a unique guest, and we are amazed by the user-centric perspective on LLM inference he shared: from what makes the vLLM project successful, to new application scenarios to tailor inference to, to how to support continual learning from user signals, and more.
0:00 - Prelude: Introducing Woosuk and Inferact
3:00 - Woosuk’s First PhD Project
6:00 - How the vLLM Project Got Started
9:18 - AI Infra Needs More Than Just Efficiency
14:08 - How AI Infra and Human-centered AI Are Connected
15:01 - How to Prioritize Feature Requests for Popular AI Infra
18:18 - Streaming Requests and Realtime API
24:05 - Multi-turn, Agentic, Proactive LLMs
27:03 - How to Design AI Infra in a Principled Way
29:13 - How to Design an AI Inference Engine for Continual Learning with RL
35:05 - Would LoRA Training Affect RL Infra Design?
37:28 - Why Start an AI Inference Infra Startup?
40:46 - What Effortless Inference with Open-source Models Means for Developers
43:46 - A Vision for On-device AI Inference
46:19 - Can Today’s Coding Agents Create vLLM?
kudos to the team for the awesome work! as an RLer, I don’t see this as an alternative to RL. Instead, I’m excited about the potential it brings to tackling some core RL challenges.
A few clarifications to common q's about our thickets paper:
1. Is this just ensembling? Seed averaging? Bagging? ...
2. Is this just Qwen?
3. Is it K times slower inference?
4. RL is dead? Post-training is dead?