Joined October 2024
Pinned Tweet
I will attend #EMNLP2024 in Miami next week! If you are interested in LLM explainability, formal reasoning and/or multilingual NLP, please DM me and connect 😃. I'm ready for a ☕ talk every day! Also, please find me on Nov 13th 10:30-12:00 at poster session 6!
Great work pushing the frontier of multimodal reasoning evals!
Frontier models have become excellent at understanding videos. But what happens when we test them outside the comfort zone of Western, English-centric data? In our #CVPR2026 (Highlight) work, we pushed these models to their limits to see if they can function effectively in diverse global contexts. The results? They are struggling. Work done with @NagraniArsha @skawshik11 @Harman26Singh @dinesh_tewari1 @0xtob @CordeliaSchmid Anelia Angelova @shachi_dave (1/7)
Prasoon Bajpai retweeted
Check out Proactive Co-Creator on @GoogleAIStudio, a human-AI belief alignment demo I vibe coded: aistudio.google.com/apps/bun… 🧠 See & edit the AI's uncertainty via belief graph. It asks clarifying questions before creating! 📷 Try Image ➔ Story ➔ Video. You can even remix it!
Interesting work from @IshaanWatts18 !
Replying to @IshaanWatts18
Optimize pretraining not just for loss, but for robustness to future updates. The "best" base model does not always make the best final model. 📄 More in the paper: scaling results, Hessian analysis, and practical recipes arxiv.org/abs/2605.02105 Huge thanks to my collaborators: @CatherineL11638 @goyalsachin007 @jacspringer @AdtRaghunathan 9/9
Prasoon Bajpai retweeted
🚀 #EACL2026 Sneak Peek Alert 🚀 We're excited to share a paper that we are presenting at #EACL2026 in #Morocco! 📜 Can LLMs Reason over Extended Multilingual Contexts? Towards Long-Context Evaluation Beyond Retrieval over Haystacks 👥 @AmeyHengle @prasNLP Soham Dan @Tanmoy_Chak
Thrilled to see our paper accepted at AISTATS 2026! Grateful to my co-authors, this was a fun deep dive into interpretability, control, and causal prompt edits. 🚀
Thrilled to share that our paper on "Interpreting and Controlling Model Behavior via Constitutions for Atomic Concept Edits" has been accepted at AISTATS 2026! 🚀🚀 Read more about how input mutations can be mapped to interpretable behavioral insights. arxiv.org/abs/2602.00092 🧵
Long-context multi-hop reasoning remains a hard problem in the multilingual landscape. Our work on relevant evals got accepted at EACL!
📢 #LCS2 is coming to Morocco ✈️ Happy to announce that two papers from our lab have been accepted to #EACL2026. Congratulations to all the authors, great start to the year! 🙌 #IITDelhi #EACL2026 #ACLCommunity #NLProc #AIResearch @Tanmoy_Chak
Go apply!
Thrilled to note that we are keeping the tradition of the awesome AI residency program alive in a new avatar: pre-doc researcher program at GDM-Blr -- with some amazing work done by our recent predocs including @gautham_ga_ @pranamyapk @puranjay1412 @sahilgo6801 @swaroopnath6 If you want to join this program, please apply here: google.com/about/careers/app…
New home at @GoogleDeepMind India as Pre-Doctoral Researcher!
10-minute thought-to-blog on 'Society of LLMs' prasoon1207.github.io/blog/2…

Who else is smelling MCTS in the deep research blog? openai.com/index/introducing…

NAACL 2025 🚀 Presenting “Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models” Paper link: arxiv.org/abs/2408.10151
Kicking off the year with a bang -- 4 papers accepted in prestigious venues this month! #ICLR2025 -- LLM compression: We introduce PruneNet, a novel, dataset-free policy learning approach to model pruning, achieving high compression efficiency and performance retention, demonstrated by compressing LLaMA-2-7B with over 80% zero-shot accuracy retention at a 30% compression ratio. @iclr_conf URL: shorturl.at/HEO7O #NAACL2025 -- Investigating multilingual long-context behavior in LLMs: We introduce MLNeedle, the first systematic evaluation of multilingual long-context retrieval in LLMs, revealing significant performance variations across languages and context positions, with insights to guide future evaluations. @naaclmeeting Preprint: lnkd.in/gtRAXjmh NAACL'25 -- Counterspeech evaluation benchmark and metrics: We introduce CSEval, a dataset for evaluating counterspeech across four dimensions and a prompt-based framework using auto-calibrated CoT, offering better alignment with human judgment than traditional metrics. @naaclmeeting Nature Machine Intelligence: In collaboration with AIIMS (All India Institute of Medical Sciences, New Delhi), NIMHANS, Bangalore and other NGOs, we wrote about how GenAI can potentially empower multisectoral suicide prevention efforts, particularly in resource-constrained settings like India. @NatMachIntell
Prasoon Bajpai retweeted
🌟 A New Textbook -- Introduction to Large Language Models 🌟 I am excited to share the release of my new textbook, Introduction to Large Language Models (#LLMs) -- perhaps the first textbook on LLMs. Target Audience: 👉 Students/beginners, looking for a structured starting point to learn LLMs 👉 Teachers, planning to offer a course on LLMs 👉 Industry professionals, seeking to deepen their understanding of LLMs Explore the Book: 🔗 Book Website: tanmoychak.com/llmbook/ 📑 Table of Contents: tanmoychak.com/llmbook/toc.p… 🛒 Available on Amazon: amazon.in/dp/936386474X/ Enhance Your Learning Experience: 👉 Slides & Lecture Videos: Chapter-wise resources -- lcs2-iitd.github.io/ELL881-A… 👉 Exercises & Solutions: Practice with detailed chapter exercises (solutions available on request). 👉 Upcoming @nptel_official Course: Starting January 2025! Preview here: onlinecourses.nptel.ac.in/no… Book Endorsement: 📖 Foreword by Prof. Tim Baldwin @eltimster Endorsements from Prof. Iryna Gurevych @IGurevych and Prof. Pushpak Bhattacharyya #LLMs #Textbook @iitdelhi @WileyIndiaPL @lcs2lab
“Beware the fury of the highly popular knowledge” Does highly popular information cause any internal struggle in LLMs? (1/n)
We also assess this critical limitation through the lens of sensitivity to lexical variations of the queries. We unveil a key weakness in modern LLMs: they are internally sensitive to lexical perturbations when retrieving highly popular facts from their memory.
We also find that LLMs struggle to give proper attention to the parts of queries that are grounded in highly popular entities. Check out the full paper for more key insights, real-world implications and detailed methodology: arxiv.org/abs/2411.10813v1
Prasoon Bajpai retweeted
🧵 on surprising revelations from our study of specialized foundation models (FMs beyond vision/text): after evaluating dozens of scientific & time series FMs we found that most weren’t even competitive with simple supervised models, some with as little as 513 parameters. 1/n
Kickoff #EMNLP2024