Prahitha Movva

Tweets

Pinned Tweet

Prahitha Movva

@PrahithaM

22 Oct 2025

(1/3) Started this as a side project on a whim and had fun presenting it at the COLM 2025 XLLM-Reason-Plan Workshop recently. @XllmReasonPlan Built a small dataset of rebus puzzles to see how VLMs reason through visual wordplay. Dataset: huggingface.co/datasets/pmov…

Prahitha Movva retweeted

Omar Khattab

@lateinteraction

Jun 17

Been extremely excited about this work by @jacobli99! We're disappointed in the current ways our agents develop expertise in new domains. Very shallow and hand-engineered! Humans turn reading textbooks or documentation into deep expertise all the time. Why can’t our agents?!

Jacob X. Li

@jacobli99

Jun 17

Continual learning is widely discussed right now, but mostly as improving on the job or avoiding catastrophic forgetting. But it has a different, difficult, and already urgent form: Given nothing but a corpus of documents, how should AI systems develop expertise in a new, unfamiliar domain? We call this problem Machine Studying.

Prahitha Movva retweeted

Yoonho Lee

@yoonholeee

Jun 8

I think the ML community should move towards treating text optimization with the same seriousness we give to weight optimization. I had a lot of fun writing up (and getting feedback on) a longer blog post with the best arguments and evidence I'm aware of:

Yoonho Lee

@yoonholeee

Jun 8

x.com/i/article/206401798198…

Prahitha Movva retweeted

Diyi Yang

@Diyi_Yang

Jun 5

We propose a new way to quantify AI overreliance: the Offloading Score 🧐 @vishakh_pk It measures the fraction of cognitive work you hand off to AI 🤖 via simulating how you'd have done each step without AI, then counting the steps the AI saved. It works directly from interaction traces (keystrokes, screenshots), so it's reusable across many tools!!

Vishakh Padmakumar

@vishakh_pk

Jun 3

People are increasingly worried that AI tools make us overreliant. But how do we actually measure this? We introduce Offloading Score, a measure of reliance based on the fraction of cognitive effort offloaded to AI while completing a task. In a controlled user study, Offloading Score detects increased reliance under time pressure, while several common alternatives do not. (1/9)

Prahitha Movva retweeted

Shriram Krishnamurthi (primary: Bluesky) @ShriramKMurthi

May 26

Absolutely fascinating piece by @davideoks connecting language model oddities to human cultural development. Picked this up courtesy of @deenamousa's "Under Development" newsletter. davidoks.blog/p/language-mod…

Language models are weird for the same reason human cultures are weird

You can’t have adaptive learning without strange tics

davidoks.blog

Prahitha Movva retweeted

Kiana Ehsani

@ehsanik

May 23

Today I was supposed to be on my way to Türkiye for my wedding, to meet up with my family and have them finally meet my partner and husband. We had everything planned. We chose Türkiye since it's close to Iran and my partner and I could both go there and have our families meet each other. We were supposed to get married with our close family and a small group of friends on a boat on the Mediterranean Sea at sunset. Because of the war, all flights to and from Iran are cancelled and my family can’t leave Iran, so we had to call off the wedding. Instead, this is what my day looked like. I woke up to a reminder to call my grandma (I used to call her every Friday morning). I snoozed the reminder until next Friday, just like I have done for the past many years. I can’t call her like our tradition these days because there is no way to call home. All international calls to Iran are blocked, and the internet is fully shut down by the regime. I got to work and right as I opened my computer I received an email I had scheduled to send to myself 5 years ago: “Apply for citizenship.” This summer marks 11 years of being in the US and 5 years of being a green card holder. I am now eligible to file for citizenship, but it doesn’t matter because an executive order was signed a few months ago that banned all Iranians from applying for any visa or citizenship. At lunch I opened Twitter just to see what’s up in the world and saw the news that those who don’t have a green card now need to leave the US before they can get one. This means every one of my Iranian friends who are here on a visa now has to go back home (on which flight?) to get a green card??? As if it’s that easy? We all know getting back to the US for Iranians is a huge challenge (months and months of waiting for a visa, with a chance of never being able to come back). And this is just a normal Friday for an Iranian. These days, when people ask how I’m doing and how I’m handling everything, I just say: It’s okay, it’s okay. It will be okay some day. But the reality is: nothing is okay. I’m in constant pain. I haven’t seen my family and loved ones in years, I barely hear about their wellbeing, and I’m constantly worried about them. I’m just burying myself in work because that’s the only distraction that can save me from losing my mind. I’m not okay. None of us are okay. We are just barely holding it together…

Prahitha Movva retweeted

Shreya Shankar

@sh_reya

May 22

i'm restarting my blog! i want to kickstart productive conversations around: what should AI agents look like for hard, subjective knowledge work? a lot of agent setups work well when tasks are objective and easy to verify. but many workflows (e.g., qualitative analysis, strategy, sensemaking) are messy and interpretive. as a first post, i explore different ways of doing agent-assisted qualitative analysis on tweets, with varying levels of human feedback/intervention. tldr: they all kinda sucked. turns out it’s hard to: (a) stop agents from converging too quickly on shallow interpretations (b) get agents to adapt to preferences that emerge gradually across many turns (i.e., evolving context) (c) capture human judgment without making humans fatigued

Prahitha Movva retweeted

Florian Brand

@xeophon

May 22

i'll be talking about llm benchmarks, the infra behind it, the challenges and learnings later today at @tngtech :) will be live streamed and recorded, link in replies :)

Prahitha Movva retweeted

Diyi Yang

@Diyi_Yang

May 15

Our new longitudinal study shows that after 3 weeks with sycophantic AI, users 👉 1⃣ were nearly as likely to turn to it as to close friends; 2⃣ reported lower satisfaction with real human interactions; 3⃣ preferred it because it made them feel most understood.

Lujain Ibrahim

@lujainmibrahim

May 14

New preprint! In 5 studies (3k users / 12k convs, with a 3-wk longitudinal study), we find that sycophantic AI influences how people view those closest to them. It affects how effortful human interaction seems, how satisfying it is, & who people want to turn to for advice 🧵

Prahitha Movva retweeted

Omar Shaikh @oshaikh13

May 13

We upgraded Tabracadabra 🎉 to bring an entire context-aware assistant (not just tab to autocomplete!) to any textbox. It's pretty great if you hate switching between the chat interface and what you're working on. We're also open-sourcing, so you can try it out!🧵

Prahitha Movva retweeted

David Duvenaud

@DavidDuvenaud

Apr 27

Announcing Talkie: a new, open-weight historical LLM! We trained and finetuned a 13B model on a newly-curated dataset of only pre-1930 data. Try it below! with @AlecRad and @status_effects 🧵

Prahitha Movva retweeted

Akari Asai

@AkariAsai

Apr 17

Not many PhD students know about compute grants, but they can make a huge difference. During my PhD, I got access to Stability AI's HPC cluster through a small proposal and used it for Self-RAG training. Great practical post by @_emliu!

Emmy Liu @_emliu

Apr 17

wrote a guide on getting compute grants as a student, something I wish I did more at the beginning of my PhD. It's honestly one of the highest ROI things you can do as a student (we've gotten 100k gpu hrs for roughly 2 weeks of work writing). nightingal3.github.io/blog/2…

Prahitha Movva retweeted

Andrei Bursuc @abursuc

Mar 17

An unsolicited guide to being a researcher: super instructive slides by @EugeneVinitsky emerge-lab.github.io/papers/… - different goals of a PhD student - how to be a good collaborator - how to keep up with literature - tracking your ideas & experiments - stress & productivity

Prahitha Movva retweeted

Ian Arawjo @IanArawjo

Mar 16

Not sure if there's an audience for this... but at least I'm having fun 😅

Prahitha Movva retweeted

Sanmi Koyejo @sanmikoyejo

Mar 11

Semantic duplicates are invisible to small models but can be catastrophic for large ones. We show that this breaks standard scaling laws and measure the effective data pool size to fix them. If you're training at scale on synthetic data, you should read this!

Jessica Chudnovsky

@jchudnov

Mar 10

Your deduplication pipeline was built for small models. At scale, it's broken. New preprint: "Scale Dependent Data Duplication" 1/10

Prahitha Movva retweeted

Greg Durrett

@gregd_nlp

Mar 13

Check out Manya's benchmark for LLM creativity! Inspired by work on creativity in graphs (@AdtRaghunathan's "roll the dice" paper), CREATE isolates testing of creative insights for discovery. Future: understand how LLMs derive insights & how they can be better creative partners!

Manya Wadhwa @ManyaWadhwa1

Mar 13

⚛️ Introducing CREATE, a benchmark for creative associative reasoning in LLMs. Making novel, meaningful connections is key for scientific & creative works. We objectively measure how well LLMs can do this. 🧵👇

Prahitha Movva retweeted

Emma Brunskill @EmmaBrunskill

Mar 12

Our paper on using LLMs to support people learning mental health counseling skills received an Honorable Mention at CHI 2026! arxiv.org/abs/2505.02428 Led by @RyanCLouie (who's on the market!), w/ @Diyi_Yang, Raj Shah, Ifdita Hasan Orney, & Juan Pablo Pacheco

Can LLM-Simulated Practice and Feedback Upskill Human Counselors?...

The growing demand for accessible mental health support requires training more counselors, yet existing approaches remain resource-intensive and difficult to scale. LLMs can realistically simulate...

arxiv.org

Prahitha Movva retweeted

Cohere Labs

@Cohere_Labs

Mar 9

Our Reinforcement Learning group is excited to welcome Mansi Maheshwari for a session focused on "Addressing the Plasticity-Stability Dilemma in Reinforcement Learning" next week on Monday, March 16th! Thanks to @rahul_narava and @gustiwinata_ for organizing this session 👏 Learn more: cohere.com/events/cohere-lab…

Prahitha Movva retweeted

Zhijing Jin

@ZhijingJin

Feb 4

Sharing the career & survival resources that really helped me navigate a research career in #NLProc and #AI: github.com/zhijing-jin/nlp-p… Huge thanks to the researchers & profs who wrote such thoughtful guides for our community 🙏 PRs are very welcome to keep it growing🌱

Prahitha Movva retweeted

Niloofar ✈️ icml

@niloofar_mire

Jan 30

I'm looking for students/folks interested in leading a project on privacy-preserving mental health chatbot research, focusing on differentially private pattern extraction and synthetic data generation for AI safety. If you are interested or know someone who would be a good fit, email with subject "Mental Health and DP". Short project description below. Pls share!! PS this is not a recruitment for PhD positions, it's a single project. if you are already at CMU mention that in the title.

Prahitha Movva retweeted

Cohere Labs

@Cohere_Labs

Jan 29

Be sure to join us tomorrow, January 30th for a presentation from @Ahsaasb, for a deep dive into "Production-Grade ML in Practice: Evaluation and Design Frameworks for Recommendation Systems Serving Millions." Learn more: cohere.com/events/cohere-lab…

Cohere Labs

@Cohere_Labs

Jan 22

Our ML Industry group is looking forward to hosting @Ahsaasb, Senior ML Engineer at Instacart, for a presentation on "Production-Grade ML in Practice: Evaluation and Design Frameworks for Recommendation Systems Serving Millions." Thanks @PrahithaM and @arya_suneesh for organizing this event! 🔥 Learn more: cohere.com/events/cohere-lab…
