Daniel Weld

Tweets

Daniel Weld @dsweld

May 4

This benchmark for AI scientific capabilities is beautifully thought out - I especially like its clear enumeration of design principles...

Ai2

@allen_ai

Apr 30

New AstaBench results show frontier models making progress on scientific research, but the benchmark remains far from solved. Claude Opus 4.7 leads overall at 58.0%, while GPT-5.5 comes within 5.1 points at less than half the measured cost per problem. 🧵

Daniel Weld retweeted

Zixian Ma@CVPR

@zixianma02

Mar 24

We built MolmoWeb from scratch with Molmo2!!! 💕🌐 It’s not easy to build SOTA web agents out of open source VLMs, when they can be so profitable that very few projects release everything (if anything), esp the datasets 🔑 But, we just released all the MolmoWeb model checkpoints and datasets from Ai2😉 Can’t wait to see what the community builds on top of MolmoWeb!🫡

Ai2

@allen_ai

Mar 24

Today we're releasing MolmoWeb, an open source agent that can navigate and complete tasks in a browser on your behalf. Built on Molmo 2 in 4B & 8B sizes, it sets a new open-weight SOTA across four major web-agent benchmarks & even surpasses agents built on proprietary models. 🧵

Daniel Weld retweeted

Ai2

@allen_ai

Mar 12

🔎 Deep research agents like Asta ScholarQA and OpenAI Deep Research are transforming how we perform literature review. But how do we know if the way we evaluate them is actually meaningful? Announcing our new paper: “Deep Research, Shallow Evaluation: A Case Study in Meta-Evaluation for Long-Form QA Benchmarks” 🧵

Daniel Weld retweeted

Pao Siangliulue @Siangliulue

Mar 13

Are you a researcher in CS or a CS-adjacent field curious about how an AI agent can help you with your research project? Want to try a new tool for your research support in a paid user study ($100, 2 hr)? Limited spot numbers. See details and sign up here: forms.gle/JzLtkAhe7TtvuiwQ8

Project Document Study Interest Form

Hi! 👋 We are researchers at the Allen Institute for Artificial Intelligence (Ai2) exploring AI-powered tools to support researchers as they work on their research projects. We are looking for...

docs.google.com

Daniel Weld retweeted

Ai2

@allen_ai

Feb 25

Can AI predict what scientists will do next—not just one piece, but the whole research process? PreScience is our new model eval for forecasting how science unfolds end-to-end, from how research teams form to a paper's eventual impact. Built with @UChicago, supported by @NSF.

Daniel Weld @dsweld

Feb 4

Truly open scientific question answering - that's good! science.org/content/article/…

Open-source AI program can answer science questions better than humans

Developed by and for academics, OpenScholar aims to improve searches of the ballooning scientific literature

science.org

Daniel Weld retweeted

Ai2

@allen_ai

Jan 28

We’re releasing the Theorizer code and framework, plus a dataset of ~3,000 theories generated by Theorizer across the field of AI/NLP, built from 13,744 source papers. 💻 Code: github.com/allenai/asta-theo… 📝 Technical report: arxiv.org/abs/2601.16282 ✍️ Learn more in our blog: allenai.org/blog/theorizer

GitHub - allenai/asta-theorizer: Staging area for a public release of Theorizer

Staging area for a public release of Theorizer. Contribute to allenai/asta-theorizer development by creating an account on GitHub.

github.com

Daniel Weld @dsweld

Jan 28

I'm so excited by this! Our system is generating some insightful & novel theories (e.g., internally for LM post-training). And it's still getting better!

Ai2

@allen_ai

Jan 28

Introducing Theorizer: Turning thousands of papers into scientific laws 📚➡️📜 Most automated discovery systems focus on experimentation. Theorizer tackles the other half of science: theory building—compressing scattered findings into structured, testable claims. 🧵

Daniel Weld retweeted

Ai2

@allen_ai

Jan 27

Introducing Ai2 Open Coding Agents—starting with SERA, our first-ever coding models. Fast, accessible agents (8B–32B) that adapt to any repo, including private codebases. Train a powerful specialized agent for as little as ~$400, & it works with Claude Code out of the box. 🧵

Daniel Weld @dsweld

Jan 13

Smart analysis of scholarly output when authors adopted LLMs as part of their writing: 1) huge 36% boost in # papers published 2) LLMs mitigate skill disparities, eg native language - enough to shift market share of production toward China bit.ly/4qliJGo @yian_yin

Daniel Weld retweeted

Ai2

@allen_ai

18 Dec 2025

🆕 New in Asta: multi-turn report generation. You can now have back-and-forth conversations with Asta, our agentic platform for scientific research, to refine long-form, fully cited reports instead of relying on single-shot prompts.

Daniel Weld retweeted

Ai2

@allen_ai

12 Dec 2025

🧠 Introducing NeuroDiscoveryBench. Built with @AllenInstitute, it’s the first benchmark for evaluating AI systems like our Asta DataVoyager agent on neuroscience data. The benchmark tests whether AI can truly extract insights from complex brain datasets.

Daniel Weld retweeted

Bodhisattwa Majumder

@mbodhisattwa

25 Nov 2025

#NeurIPS2025 and AI x Science? Some fun announcements are coming up. Stay tuned. Also, our Asta internship application is still open -- apply and mention my name if you'd like to work w me ~

Bodhisattwa Majumder

@mbodhisattwa

11 Nov 2025

🎀 Really excited to start cadence for 2026! At @allen_ai, we are at the intersection of AI making a real impact in accelerating science, with several serious collaborations with domain scientists (Economics, Oncology, Neuroscience, Climate, Epidemiology). If you are passionate about turning the stack upside down to break and build current LLMs to be adaptive for a continual discovery process, join us! If you are interested in my work, such as data-driven discovery, open-ended hypothesis search, test-time adoption for hypothesis generation, causal mechanism discovery using data and literature, mention my name in your application! Past interns might vouch for their experiences, but working with Asta interns for the last two years has been one of my most rewarding journeys at Ai2! 🩷

Daniel Weld retweeted

Ai2

@allen_ai

20 Nov 2025

Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵

Daniel Weld retweeted

Bodhisattwa Majumder

@mbodhisattwa

19 Nov 2025

Plenty of AI-gen papers in ICLR. Wonder why? 🚨 In a preregistered Randomized Controlled Trial, we find: CS authors perceive AI-abstracts as more readable, tend to edit less than their published counterparts. AI-use and its disclosure shape the fabric of collaborative scientific writing. Work led by @hsanchaita & @leadoeun27, advised by @shocheen & yours truly. 1/n

Daniel Weld @dsweld

19 Nov 2025

Impressive deep-research performance by a tiny & open model!

Rulin Shao @RulinShao

18 Nov 2025

🔥Thrilled to introduce DR Tulu-8B, an open long-form Deep Research model that matches OpenAI DR 💪Yes, just 8B! 🚀 The secret? We present Reinforcement Learning with Evolving Rubrics (RLER) for long-form non-verifiable DR tasks! Our rubrics: - co-evolve with the policy model - are grounded on search knowledge 🧵

Daniel Weld @dsweld

6 Nov 2025

The benchmark desiderata alone make this paper worth a read...

Jonathan Bragg @turingmusician

6 Nov 2025

Agent benchmarks don't measure true *AI* advances. We built one that's hard & trustworthy 👉AstaBench tests agents w/ *standardized tools* on 2400 scientific research problems 👉SOTA results across 22 agent *classes* 👉AgentBaselines agents suite 🆕arxiv.org/abs/2510.21652 🧵👇

Daniel Weld @dsweld

28 Oct 2025

Super interesting and well written summary of the incredible progress we’ve made on climate change (and what’s most important to do next) ⭐️⭐️⭐️⭐️⭐️ gatesnotes.com/three-tough-t…

A new approach for the world’s climate strategy | Bill Gates

Bill Gates explains why the world’s climate change strategy should focus on human welfare—even more than temperatures or greenhouse gas emissions.

gatesnotes.com

Daniel Weld retweeted

Ai2

@allen_ai

8 Oct 2025

📊 Today we're releasing data showing which scientific papers our AI research tool Asta cites most frequently. Think of it as creating citation counts for the AI era—tracking which research is actually powering AI answers across thousands of queries. 🧵

Daniel Weld @dsweld

2 Oct 2025

Pretty amazing that this can be done at all, but especially with federated data (crucial given the sensitivity of patient data)!

Ai2

@allen_ai

1 Oct 2025

Introducing Asta DataVoyager—our new AI capability in Asta that turns structured data into transparent, reproducible insights. Built for scientists, grounded in open, inspectable workflows. 🧵
