Tweets

Jonathan Berant retweeted

Yair Lewis @_lewisy

May 13

We’re hiring our first team members! Well-funded digital health startup building agentic automation for messy, high-friction healthcare ops. Early traction, big problems, real production workflows. Looking for an insanely strong, product-minded SWE. DM me! (Position is in TLV)


Jonathan Berant retweeted

Yair Lewis @_lewisy

May 13

We're hiring! OK, I've been advised here to write in Hebrew as well. A young startup after a significant seed round. We're building agentic systems for healthcare. We're looking for our first engineers: developers with a strong product orientation who love solving hard problems and are interested in deploying agents in production. Think it sounds interesting? DM me and we'll talk.


Jonathan Berant retweeted

Gal Yona @_galyo

May 12

It’s 2026 and frontier LLMs STILL hallucinate. Why? In our new ICML 2026 Position Paper, we offer a simple diagnosis and a constructive path forward.


Jonathan Berant retweeted

Max Chen @maximillianc_

Mar 6

📣Excited to finally share our latest work on quantifiably adapting model behavior based on unique preferences 📣 We teach language models to adjust their clarification behavior using scalar coefficients and find they can generalize to unseen coefficients at inference time!

Jonathan Berant @JonathanBerant

Mar 6

Newish work (arXived in Dec.): Prompts can be ambiguous, but how to handle ambiguity depends on the context and the user. Sometimes the right thing is to ask a clarifying question, sometimes to give multiple answers, and sometimes to just guess. Can we train models to change their strategy per context?


Jonathan Berant @JonathanBerant

Mar 6


Jonathan Berant @JonathanBerant

Mar 6

Models also generalize to coefficients that never occurred at training time!


Jonathan Berant @JonathanBerant

Mar 6

More generally, training models to respect scalar values that specify a reward in the prompt is useful! There are more results and analyses in the paper, check it out... arxiv.org/abs/2512.04068 With @maximillianc_ @jacobeisenstein @adamjfisch @fantinehuot @rezaa @mlapata

Learning Steerable Clarification Policies with Collaborative Self-play

To handle underspecified or ambiguous queries, AI assistants need a policy for managing their uncertainty to determine (a) when to guess the user intent and answer directly, (b) when to enumerate...

arxiv.org


Jonathan Berant @JonathanBerant

Mar 5

Are AI models effective collaborators, or mere assistants awaiting your next command? (arxiv.org/abs/2602.24188) To find out, we make AI collaborate with itself, in private information games: tasks that require sharing private information, like this chess board ordering task.


Jonathan Berant @JonathanBerant

Mar 5

AI systems are also overconfident, terminating dialogues long before exhausting their turn budget - even after explicit reminders.


Jonathan Berant @JonathanBerant

Mar 5

For more discussion, please see the paper! arxiv.org/abs/2602.24188 @jacobeisenstein bravely led this work (but does not tend to post much research here anymore...) Multi-turn collaboration was definitely a key part of this project! @fantinehuot @adamjfisch and @mlapata

MT-PingEval: Evaluating Multi-Turn Collaboration with Private...

We present a scalable methodology for evaluating language models in multi-turn interactions, using a suite of collaborative games that require effective communication about private information....

arxiv.org


Jonathan Berant retweeted

Ben Bogin @ben_bogin

24 Nov 2025

My team @GoogleAI is looking for a 2026 research intern in Mountain View! I will be hiring for a project aimed at improving tool-using and search agents via RL training and data generation. To apply: google.com/about/careers/app… Feel free to ping me!


Jonathan Berant retweeted

Samuel AMOUYAL @AmouyalSamuel

16 Oct 2025

I had a lot of fun working on this with @JonathanBerant and @aya_meltzer. You can find our paper here: arxiv.org/abs/2510.07141 And by the way, the answer (at least based on the sentence) is yes, you can ignore head injuries. But it's terrible advice.

Comparing Human and Language Models Sentence Processing...

Large language models (LLMs) that fluently converse with humans are a reality - but do LLMs experience human-like processing difficulties? We systematically compare human and LLM sentence...

arxiv.org


Jonathan Berant retweeted

Samuel AMOUYAL @AmouyalSamuel

16 Oct 2025

We have more interesting insights in our paper. We believe this is a really exciting direction for comparing humans and LLMs. Extending our framework to more structures and more LLMs will certainly lead to additional insights!
