Asst prof @UUtah · Ex @allen_ai @uwnlp @HD_NLP · she/her 🇭🇷

Joined April 2014
We evaluated CoT faithfulness evaluations & released 𝐁𝐨𝐧𝐚𝐅𝐢𝐝𝐞 so you can test yours too!!
Can we tell when LLMs are being unfaithful in their chains of thought? We evaluated 8 methods claiming to do this, and found that most perform near chance! But evaluating this requires us to have ground-truth labels for CoT faithfulness. How can we obtain these?
Ana Marasović retweeted
First paper of my PhD at the University of Utah, with Prof @Kenneth_Marino. Super excited to finally share what we've been working on at SparkLab. Meet TimeWarp ⏳, a benchmark that tests web agents by sending them back in time through 6 eras of web UI design. Thread 🧵(1/5)
Ana Marasović retweeted
It’s been less than a year since I started my lab (SPARK Lab) at @UUtah, and we already have a ton of new stuff that I can’t wait to talk about soon. Stay tuned for more. I’ll start today by sharing that our updated Computer Use Survey blog has been accepted to ICLR Blogposts 2026. Collaboration with my student @aplycaebous and Utah colleague @anmarasovic.
Ana Marasović retweeted
Wanted to share with the CU community that our updated Computer Use Survey blog has been accepted to ICLR Blogposts 2026. Collaboration with my student @aplycaebous and Utah colleague @anmarasovic.
Ana Marasović retweeted
31 Dec 2025
Happy new year #NAACL! The 2026 election results are here. Congrats🥳 Chair: Anna Rumshisky @arumshisky Secretary: Jessy Li @jessyjli Board members: Muhao Chen @muhao_chen, Francisco (Paco) Guzmán, Ana Marasović @anmarasovic naacl.org/posts/2025-12-28-N… Thank you all for voting!

Ana Marasović retweeted
7 Dec 2025
$1,000,000 to understand how LLMs write code. Announcing: The Martian Interpretability Challenge. Understanding the inner workings of LLMs is the greatest scientific challenge of our age. Let's solve it. Apply here: withmartian.com/prize 🧵👇
Ana Marasović retweeted
19 Nov 2025
Going through statements from NAACL board candidates. Resonating a lot with many statements, and especially love the one from @anmarasovic! naacl.org/elections/2026/ind…
𝙒𝙚'𝙧𝙚 𝙝𝙞𝙧𝙞𝙣𝙜 𝙣𝙚𝙬 𝙛𝙖𝙘𝙪𝙡𝙩𝙮 𝙢𝙚𝙢𝙗𝙚𝙧𝙨! Links below because twitter/x is weird.
If you like mountains, don't think twice! 😁
More cute photos in the good place 🦋
Thrilled to see this work recognized at #EMNLP2025! This framework and approach to measuring CoT faithfulness have been hugely influential for how I think about reasoning evaluation, and I'm so lucky to have worked with such brilliant collaborators. Huge credit to @mtutek
7 Nov 2025
Very honored to be one out of seven outstanding papers at this year's EMNLP :) Huge thanks to my amazing collaborators @fatemehc__ @anmarasovic @boknilev, this would not have been possible without them!
Ana Marasović retweeted
6 Nov 2025
Here Comes Another Bubble (AI Edition)
Check out Martin's talk at #EMNLP2025 today (Wed)! If you care about CoT faithfulness, you 𝘮𝘶𝘴𝘵 read this paper. It introduces the first method for measuring CoT faithfulness that is not purely behavioral, but operates with the internals!
31 Oct 2025
Flying out to @emnlpmeeting soon🇨🇳 I'll present our parametric CoT faithfulness work (arxiv.org/abs/2502.14829) on Wednesday at the second Interpretability session, 16:30-18:00 local time, A104-105. If you're in Suzhou, reach out to talk all things reasoning :)