Forecasting Research Institute

Forecasting Research Institute

109 Photos and videos

Tweets

Forecasting Research Institute

@Research_FRI

Jun 4

Is it possible to spot a good forecast by its rationale? We used LLMs to score the reasoning behind 55,000 forecasts and test the link between forecast accuracy and written rationales. We found that: • Causal reasoning is much more prevalent than statistical argumentation • It's easier to identify poor forecasters rather than excellent ones • Human ratings of rationale quality can be unreliable. 🧵A thread on the results:

4,197

more replies

Forecasting Research Institute

Forecasting Research Institute

@Research_FRI

Jun 4

Why was this? As you can see below, human ratings correlate strongly with rationale length, but length is essentially uncorrelated to forecast-level accuracy. Human raters weren’t wrong directionally, but they appeared to place undue weight on some features, such as underweighting "red flags" like extreme confidence.

292

Forecasting Research Institute

Forecasting Research Institute

@Research_FRI

Jun 4

Thank you to the study's authors: Chris Karvetski, @sicong_huang, @simas_kucinskas, Nadja Flechner, Jingyu Hu, @PTetlock, and @EzraKarger Read more on our Substack: forecastingresearch.substack… Read the full working paper on SSRN: papers.ssrn.com/sol3/papers.…

269

Forecasting Research Institute

Forecasting Research Institute

@Research_FRI

Jun 2

How are experts and superforecasters updating their expectations of AI progress? In our latest round of Longitudinal Expert AI Panel (LEAP) forecasts, we asked for predictions on AI progress timelines, including updated forecasts for a question we last asked nine months ago. We also asked for forecasts on near-term progress on @METR_Evals' time horizon benchmark, AGI timelines, the blockers and enablers of AGI, and AI's overall impact in the next 20 years. 🧵 Read on for more:

15,160

more replies

Forecasting Research Institute

Forecasting Research Institute

@Research_FRI

Jun 2

👍 Experts and superforecasters tend to view AI more optimistically than the general public When asked for their view on AI's overall impact on the U.S. over the next 20 years, 57.5% of experts and 69.8% of superforecasters predict a somewhat or very positive impact, compared to 42.0% of the public. The divergence is sharpest on problem-solving: 72.7% of experts expect AI will make people better at this, compared to 48.6% of the public. One area in which experts and the public converge is their views on forming meaningful relationships. 68.4% of experts and 66.7% of the public expect AI to make people worse at this. 16.5% of experts and 11.3% of superforecasters expect AI's overall impact to be somewhat or very negative, compared to 29.8% of public respondents.

1,881

Forecasting Research Institute

Forecasting Research Institute

@Research_FRI

Jun 2

For a more detailed summary of these results, check out our Substack post: forecastingresearch.substack… To explore more forecasts, including further questions from Wave 8, read the full report: leap.forecastingresearch.org…

1,298

Forecasting Research Institute

Forecasting Research Institute

@Research_FRI

May 21

On a subset of ForecastBench questions, an LLM has matched superforecaster performance for the first time. A submission from @GoogleDeepMind, named “green tree,” is now #1 on dataset questions on ForecastBench, our AI forecasting benchmark. Superforecasters remain #1 overall.

200,228

more replies

Forecasting Research Institute

Forecasting Research Institute

@Research_FRI

May 21

Does parity on dataset questions mean ForecastBench is “solved”? No. On market questions that require judgment about novel, one-off events, LLMs are still behind. Even on dataset questions, the 64.9% achieved by Google DeepMind is unlikely to be the ceiling of what’s possible. ForecastBench doesn’t stop being useful at human parity. It will continue tracking LLM progress even if they surpass expert human forecasters.

560

Forecasting Research Institute

Forecasting Research Institute

@Research_FRI

May 21

We're working to improve ForecastBench! Upcoming changes: • Fresh superforecaster round: fall 2026. • Continuous (non-binary) forecasting questions: summer 2026. • External submissions on continuous questions: summer 2026. • Revamped dataset questions: fall 2026.

485