Tweets

Lucy Lu Wang retweeted

Ai2 @allen_ai

Mar 12

🔎 Deep research agents like Asta ScholarQA and OpenAI Deep Research are transforming how we perform literature review. But how do we know if the way we evaluate them is actually meaningful? Announcing our new paper: “Deep Research, Shallow Evaluation: A Case Study in Meta-Evaluation for Long-Form QA Benchmarks” 🧵

154

12,218

Lucy Lu Wang @lucyluwang

12 Dec 2025

try out our new prototype system! you can ask questions about a paper and the system will answer with both text and figures from the paper. your data will go towards understanding how to better serve diverse visual needs!

Arnavi Chheda-Kothary @arnavic

11 Dec 2025

Ever want to ask questions about a paper, including its figures & tables? 📊📈 Want smoother interactions w/papers on desktop & mobile? Try Paper Figure QA, a new tool from @allen_ai that answers with the original figures, tables, and excerpts from papers: paperfigureqa.allen.ai

A screenshot of a user asking about examples of how a system works in a paper, and the system responding with details about the system along with a relevant figure and caption from the paper.

328

Lucy Lu Wang retweeted

Jihan Yao @jihan_yao

4 Jun 2025

We introduce MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation
✅ Reliable: 94.3% agreement with human judgment
✅ Comprehensive: 4 modality combination × 49 tasks × 937 instructions
🔍 Results and Takeaways:
> GPT-Image-1 from @OpenAI leads image generation at 78.3% accuracy, 13.7% ahead of the next-best model. The top open-source model, BAGEL from #ByteDance, achieves 45.5% accuracy.
> Audio generation is still challenging: Top open-sourced models achieve only 48.7% accuracy in sound (Make-An-Audio 2 from #ByteDance) and 41.9% in music (MusicGen from @AIatMeta).
📜 Paper: arxiv.org/abs/2505.17613v1
🛠️ Code and Evaluation Suite: github.com/yaojh18/MMMG
🥇 Leaderboard: yaojh18.github.io/mmmg-leade…
🧵1/N

13,108

Lucy Lu Wang retweeted

Martin Saveski @msaveski

13 Nov 2024

[Please RT] I’m recruiting PhD students to work with me at @UW! I’m looking for students passionate about developing new *social media algorithms*, both broadly and within the scope of this NSF grant: tinyurl.com/395yfphd More info: faculty.washington.edu/msave… @UW / @UW_iSchool

107

212

26,016

Lucy Lu Wang retweeted

Isabelle Augenstein @IAugenstein

14 Nov 2024

📢 📅 After a long process of soliciting & vetting bids, I'm excited that we've finally been able to reveal the location for #EMNLP2025 -- it'll be at the International Expo Centre, Suzhou, China from 5-9 November 2025. Looking forward to seeing you there! @emnlpmeeting #NLProc

137

16,463

Lucy Lu Wang retweeted

Melanie Walsh @mellymeldubs

11 Nov 2024

I'm recruiting a PhD student to join my group @uw_ischool in 2025-26. If you like the mountains and interdisciplinary research that blends data and culture, this could be a good fit! PhD apps due Dec 2: ischool.uw.edu/programs/phd/… More info about my group: melaniewalsh.org/mentorship

Ph.D. Application Process

Details on how to apply to the Ph.D. in Information Science program.

ischool.uw.edu

120

327

43,967

Lucy Lu Wang retweeted

Anukriti @Anukriti_Kr

16 Oct 2024

📉 Open access papers previously had higher accessibility compliance than closed access papers, but since 2019, we observe a sharp decline in compliance among OA papers (from the same publishers), driving much of the overall drop in PDF accessibility.

A line plot displays the mean normalized total compliance of scholarly PDFs across publishing models (open versus closed-access) from 2014 to 2023, with error bars showing standard deviation. The x-axis denotes the year, while the y-axis shows the proportion of PDFs. Open access papers show a significant decline in compliance after 2019, stabilizing or slightly increasing thereafter. Conversely, compliance for closed-access papers has been gradually improving since 2014.

574

Lucy Lu Wang @lucyluwang

16 Oct 2024

In 2019 when we first did this analysis, PDF accessibility trends were mostly improving, slowly. These 2024 results surprised me, and reflect major shifts in #OA publishing since Plan S and exacerbated by Covid. Has OA mostly been a win? Sure. But not evenly for everyone…

Anukriti @Anukriti_Kr

16 Oct 2024

📢 Crisis alert in academic publishing! Less than 3.2% of scholarly PDFs meet #accessibility standards for blind and low-vision readers, and compliance has dramatically declined since 2019, especially for #OpenAccess papers! What’s going on?👇 Joint w/ @lucyluwang @uw_ischool

Our research paper titled 'Uncovering the New Accessibility Crisis in Scholarly PDFs: Publishing Model and Platform Changes Contribute to Declining Scholarly Document Accessibility in the Last Decade.'

A table showing the percent of papers in our dataset of 19997 PDFs that satisfy each criterion, along with Adobe-6 Compliance. The table has two columns. The left column lists various criteria: Default language, Tagged PDF, Tab order, Appropriate Nesting, Alt-text, Table headers, and Adobe-6 Compliance. The right column shows the corresponding percentage of papers that satisfy each criterion: 17.3% for Default language, 12.6% for Tagged PDF, 6.8% for Tab order, 15.9% for Appropriate Nesting, 8.5% for Alt-text, 13.4% for Table headers, and 3.2% for Adobe-6 Compliance.

1,478

Lucy Lu Wang retweeted

Jihan Yao @jihan_yao

16 Oct 2024

🚀Varying Shades of Wrong: When no correct answers exist, can alignment still unlock better outcome? Introducing wrong-over-wrong alignment, where models learn to prefer "less-wrong" over "more-wrong". Surprisingly, aligning with wrong answers only can lead to correct solutions!

6,482

Lucy Lu Wang retweeted

Lucy Li @lucy3_li

14 Oct 2024

Hi friends, colleagues, followers. I am on the faculty job market! I am a PhD student @BerkeleyISchool @berkeley_ai. I work on NLP, and I believe all language, whether AI- or human-generated, is ✨social and cultural data✨. My work includes: 🧵

389

57,714

Lucy Lu Wang @lucyluwang

11 Oct 2024

today i left a bunch of comments for a collaborator on a grant like “what did you mean here?” and “you should expand upon this” only to realize later that i wrote those sections 😭

5,955

Lucy Lu Wang retweeted

Bingbing Wen @bingbingwen1

9 Oct 2024

🚨Curious how LLMs deal with uncertainty? In our new #EMNLP2024 Findings paper, we dive deep into their ability to abstain from answering when given insufficient or incorrect context in science questions 💡arxiv.org/pdf/2404.12452 Joint work w/ @billghowe @lucyluwang @uw_ischool

8,390

Lucy Lu Wang retweeted

Semantic Scholar Research @ AI2 @ai2_s2research

2 Oct 2024

Replying to @allen_ai

@allen_ai @SemanticScholar is hiring #nlproc #hci #ml #ai researchers for the following positions with target start dates in 2025, apply by *Nov 1* for the 1st rolling deadline.
- Research intern
- Young investigator (Postdoc)
- Research scientist
Apply: job-boards.greenhouse.io/the…

12,470

Lucy Lu Wang @lucyluwang

23 Sep 2024

Come be my colleague! We're hiring TWO tenure-track Assistant Professors at @UW_iSchool in AI, Data Science, and HCI 📊💻👩‍💻🌄 Link to apply: apply.interfolio.com/150031 Feel free to reach out with any questions!

157

19,312

Lucy Lu Wang retweeted

Yue Guo @YueGuo10

20 Sep 2024

Excited to share that our paper on plain language summarization evaluation has been accepted to the #EMNLP2024 main conference! I’ll be in Miami and will have several PhD openings for Fall 2025. Feel free to reach out if you’d like to chat!

Yue Guo @YueGuo10

24 May 2023

(1/n) Announcing 🍎APPLS, a testbed to evaluate metrics for plain language summarization! arxiv.org/abs/2305.14341 Joint work with @tal_august, @GondyLeroyUA, Trevor Cohen @UW BIME, and @lucyluwang #NLProc #SDProc

8,058

Lucy Lu Wang @lucyluwang

22 Aug 2024

RT @maria_antoniak: Sexual harassment is a horrible impediment to academic research, shutting out talented researchers and slowing scientif…

GitHub - maria-antoniak/fight-harassment-in-research

Contribute to maria-antoniak/fight-harassment-in-research development by creating an account on GitHub.

github.com

Lucy Lu Wang retweeted

Chao-Chun (Joe) Hsu @chaochunh

22 Aug 2024

1/ 🎉 Excited to share our #ACL2024 Findings paper on using LLMs to assist with literature review! 📝 "CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support" Please check out our virtual poster session today at 8:15 p.m. PT!

1,773

Lucy Lu Wang @lucyluwang

30 Jul 2024

very excited about this new paper in which we survey work on LLM abstention! please check it out! this was in part motivated by another recent work in which we analyze the limitations of LLMs in abstaining for context-dependent science QA questions: arxiv.org/abs/2404.12452

Bingbing Wen @bingbingwen1

30 Jul 2024

🤔💭To answer or not to answer? We survey research on when language models should abstain in our new paper, "The Art of Refusal." . Thread below! 🧵⬇️ arxiv.org/abs/2407.18418 Joint w/ @jihan_yao @shangbinfeng Chenjun Xu @tsvetshop @billghowe @lucyluwang @uw_ischool @uwcse #nlproc

4,926

Lucy Lu Wang retweeted

Arman Cohan @armancohan

11 Jul 2024

📢 My lab at Yale is looking for a postdoc to work on LLMs/NLP for health. This is a unique collaboration with a health startup focusing on both cutting-edge AI research and real-world impact to tackle domain-specific challenges of LLMs. Pls help spread the word! 1/2 🧵

320

46,804

Lucy Lu Wang retweeted

John Giorgi @johnmgiorgi

16 Jun 2024

On my way to @naaclmeeting 🇲🇽 to present “TOPICAL: TOPIC Pages AutomagicaLly 🪄📄” In this paper, we ask the simple question: can LLMs be used to write high-quality scientific topic pages automatically? 🧵👇 Paper: arxiv.org/abs/2405.01796 Demo: s2-topical.apps.allenai.org [1/7]

TOPICAL: TOPIC Pages AutomagicaLly

Topic pages aggregate useful information about an entity or concept into a single succinct and accessible article. Automated creation of topic pages would enable their rapid curation as...

arxiv.org

2,052