Fei-Fei Li

Tweets

Matt Wilde retweeted

Fei-Fei Li

@drfeifei

Jun 10

Scientific research is fundamental to advancing civilization and helping people globally to solve the most critical problems, from medicine to materials, from brain science to physics, and much beyond. This is only possible when scientists have access to the best tools of the time to conduct scientific research, including having access to AI-based tools.

120

471

3,088

193,646

Matt Wilde

@MattCWilde

Jun 9

That Apple / OpenAI partnership seems to be going great

Katie Harbath

Matt Wilde retweeted

Katie Harbath @katieharbath

May 20

I’ve been helping @TheForumAI build NewsBench, a benchmark for how frontier AI covers the news that matters. We put the leading models through 3,000 prompts and scored each one on accuracy, neutrality, & source quality. See where each model landed: byforum.com/newsbench

NewsBench Leaderboard | Forum AI

How do leading AI models perform on real news questions? NewsBench scores them on accuracy, neutrality, and source quality with expert-calibrated judges.

byforum.com

186

Jillian Fisher

Matt Wilde retweeted

Jillian Fisher @jrfisher552

May 19

Excited to have been part of this work exploring better ways to evaluate AI on hard, contested questions. For consequential topics, grounding evaluation in expert judgment feels especially important. Proud to have contributed and excited to see what comes next with @ByForumAI.

Andy Hall

@ahall_research

May 19

How can we teach AI the right way to handle super contested questions on consequential topics like politics, news, finance, personal health, etc? I've been working with @ByForumAI to develop a way to teach AI models the judgments of some of the world's foremost experts in these areas. I'm thrilled to share our whitepaper detailing the method we've come up with after many months of tinkering and testing. Forum starts by recruiting an incredible cast of world experts of all partisan and ideological stripes---people who bring their own beliefs to bear on hard problems, but who are also capable of intellectual honesty in the face of disagreements. We worked through tons of hard examples with them of how AI models respond to challenging questions, developing and iterating on a rubric that captured their judgments---not on whether the answer was "correct" but on whether it bore the hallmarks of rigor. Did it exhibit neutrality by seriously engaging with all relevant arguments? Did it draw on high-quality information sources? Where there are objective facts to bring to bear, did it report them accurately? Then, the engineers at Forum developed a unique process to take the judgment of these experts and teach it to LLM judges who could apply it at scale. We're able to show that our judges perform considerably better at our task than default LLMs (i.e., if we ask Claude or ChatGPT to simply evaluate the same responses but without our special training). We've put a ton of work into validating this process, far more than I've seen in any other eval company. There is certainly more work to be done, but we now have a process that produces LLM evaluations that do a good job of replicating what our human experts say. Check out way more details in the paper here: byforum.com/whitepapers/dist…

412

Matt Wilde

@MattCWilde

May 19

@a1zhang's Mismanaged Genius hypothesis asks if poor LLM performance on certain tasks is due to a capability cap or poor utilization. At Forum AI, we've been researching what it would take to improve how LLMs handle high-stakes, subjective domains. We've found that first working to effectively manage a small set of humans unlocks the ability to use LLMs to scale to strong performance.

Matt Wilde

@MattCWilde

May 19

Check out way more details in the paper here: byforum.com/whitepapers/dist…

Andy Hall

Matt Wilde retweeted

Andy Hall

@ahall_research

May 19

3,798

Sasha Rush

Matt Wilde retweeted

Sasha Rush

@srush_nlp

8 Dec 2023

This talk by Angela Fan on Llama2 is so good. 30 min, she just tells you all the things. youtu.be/NvTSfdeAbnU?si=ZNoJ…

Developing Llama 2 | Angela Fan

Angela Fan is a research scientist at Meta AI Research Paris focusi...

youtube.com

201

1,384

231,444

@goth

Matt Wilde retweeted

@goth

@goth600

8 Jan 2023

Everybody wanna align AI, nobody wanna align corporations. What gives

372

43,946