Clémentine Fourrier 🍊 is off till Dec 2026 (🪂)

Pinned Tweet

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂)@clefourrier

3 Dec 2025

Hey twitter! I'm releasing the LLM Evaluation Guidebook v2! Updated, nicer to read, interactive graphics, etc! huggingface.co/spaces/OpenEv… After this, I'm off: I'm taking a sabbatical to go hike with my dogs :D (back @huggingface in Dec *2026*) See you all next year!

166

991

241,607

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Jimmy Lin

@lintool

4 Dec 2025

👀 Introducing a brand new @yupp_ai SVG leaderboard ranking frontier models on the generation of coherent and visually appealing SVGs! Gemini 3 Pro by @GoogleDeepMind takes the crown as the most powerful model! 👏 We’re also releasing a public SVG dataset. Details in 🧵

454

70,109

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

François Chollet

@fchollet

4 Dec 2025

Either you crack general intelligence -- the ability to efficiently acquire arbitrary skills on your own -- or you don't have AGI. A big pile of task-specific skills memorized from handcrafted/generated environments isn't AGI, no matter how big.

Dwarkesh Patel

@dwarkesh_sp

2 Dec 2025

New post: Thoughts on AI progress (Dec 2025) 1. What are we scaling?

103

111

1,189

118,432

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Lucas Atkins

@latkins

4 Dec 2025

[GIF: Got Talent Yes GIF by TV4]

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂)@clefourrier

3 Dec 2025

10,149

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂)@clefourrier

3 Dec 2025

cc @maximelabonne since you wanted an update :P

3,634

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂)@clefourrier

3 Dec 2025

The guide is very beginner friendly, as we go from the basics of tokenization/inference to the nits and tricks of running eval properly, so it's compatible with all levels. Should contain most of what we wrote about evals at HF in a single unified place, with updates ofc :)

6,764

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂)@clefourrier

3 Dec 2025

If you see improvements, I'd love to hear them (within the next 2 days) :) Many thanks to @thibaudfrere for his help on the banner and @gui_penedo for his proofreading! If you've got eval needs, your new PoC is @nathanhabib1011 (with a focus on lighteval)!

4,985

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

elie

@eliebakouch

3 Dec 2025

as a researcher, it makes no sense to compare reasoning vs non reasoning models on benches like the ones in Artificial Analysis without normalizing somehow by cost or output tokens.

non reasoning models (base/instruct) are important for the open ecosystem since research teams and companies will use them to do RL or other things (like synthetic generation) for specific verticals (think cursor/windsurf)

as a user, i get that you don’t care whether the model is reasoning or not, you judge speed, cost, and accuracy (and memory if you want to deploy your model locally)

the only advantage of non reasoning models would be speed/cost because they generate fewer tokens BUT speed and cost also depend on other things like infra
-> for speed see how fast some models get on groq or cerebras
-> for cost models like deepseek are so cheap that there are very few use cases where you'd want to use a non reasoning model anyway

13,344
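The normalization elie asks for could be sketched roughly as follows. This is only an illustration: the `BenchResult` type, the model names, the token counts, and the prices are all hypothetical, and accuracy per dollar is just one of several possible ways to normalize.

```python
from dataclasses import dataclass


@dataclass
class BenchResult:
    """Hypothetical benchmark record; every value below is made up."""
    model: str
    accuracy: float                        # fraction of items solved, 0..1
    output_tokens: int                     # output tokens over the whole benchmark
    usd_per_million_output_tokens: float   # assumed price, for illustration only


def cost_usd(r: BenchResult) -> float:
    # Total output-token cost of running the benchmark once.
    return r.output_tokens / 1_000_000 * r.usd_per_million_output_tokens


def accuracy_per_dollar(r: BenchResult) -> float:
    # One possible cost-normalized score: accuracy divided by run cost.
    return r.accuracy / cost_usd(r)


results = [
    BenchResult("reasoning-model", 0.80, 4_000_000, 2.0),
    BenchResult("instruct-model", 0.60, 500_000, 2.0),
]

for r in results:
    print(f"{r.model}: acc={r.accuracy:.2f} "
          f"cost=${cost_usd(r):.2f} acc/$={accuracy_per_dollar(r):.3f}")
```

With these invented numbers the reasoning model wins on raw accuracy while the instruct model wins once cost is factored in, which is the kind of trade-off a raw leaderboard hides.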

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

merve

@mervenoyann

3 Dec 2025

Mistral has delivered super capable small models but no one is talking about it so here I go

570

31,304

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Lysandre

@LysandreJik

1 Dec 2025

Transformers v5's first release candidate is out 🔥 The biggest release of my life. It's been five years since the last major (v4). From 20 architectures to 400, 20k daily downloads to 3 million. The release is huge, w/ tokenization (no slow tokenizers!), modeling & processing.

572

180,808

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Florian Brand

@xeophon

28 Nov 2025

stop looking at HLE (with tools), most of these mean "has web access". the answers to HLE are easily accessible in ungated mirrors (and prob a dozen other places). the only question is why those agents don't score 100%

Ivan Fioravanti ᯅ

@ivanfioravanti

28 Nov 2025

This 8B beast from NVIDIA is a fine-tuning of Qwen3-8B! 37.1 on Humanity's Last Exam!

147

24,115

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Shaltiel @SShmidman

25 Nov 2025

Okay, but, wait, what reasoning traces should I train on? Excited to share our latest research paper together with @nvidia: Learning to Reason: Training LLMs with GPT-OSS or DeepSeek R1 Reasoning Traces arxiv.org/abs/2511.19333 🧵

Learning to Reason: Training LLMs with GPT-OSS or DeepSeek R1...

Test-time scaling, which leverages additional computation during inference to improve model accuracy, has enabled a new class of Large Language Models (LLMs) that are able to reason through...

arxiv.org

1,570

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Adina Yakup

@AdinaYakup

26 Nov 2025

China just passed the U.S. in open model downloads for the first time 👀 New data from Economies of Open Intelligence, led by the @huggingface policy team & community collaborators, presents some notable observations:

✨ Developer adoption: In 2025, Chinese model developers saw higher global adoption for the first time, driven by the rapid rise of @deepseek_ai and @Alibaba_Qwen.

✨ The “Sino-Multimodal Period” (late 2024–present): China’s share of downloads reached 17.1%, surpassing the U.S., with DeepSeek and Qwen accounting for 14% of recent activity. This period also brings larger, more quantized, and expanding multimodal models such as Wan2.1.

✨ Organizational patterns: China’s open model development is more industry-driven (similar to the U.S.), while the EU has more university, nonprofit, and community-led contributors.

fyi - this analysis is based on 851k models, 200 attributes, and 2.2B downloads.

7,506

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

David Louapre

@dlouapre

26 Nov 2025

Introducing "The Eiffel Tower Llama"! 🗼 Remember Golden Gate Claude? Unfortunately Anthropic's viral demo was shut down after 24h, and key technical details remained hidden. So we recreated it, uncovering key insights on steering LLMs using SAEs ⚒️ Full blog post and live demo 👇

174

63,813

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Sayak Paul

@RisingSayak

26 Nov 2025

Non-natural image gen and editing are difficult tasks. We tested the state of the art at the time, including Nano Banana 1.0 & GPT-image, and all performed quite poorly on StructBench.

Nano Banana 2 (NB2) just dropped, and its improvements strongly validate a direction we studied in StructBench 🤯 It achieves 90 on our image generation tasks, by far the best we’ve seen 🔥

A few months before the release of Nano Banana 2, we introduced StructBench, a benchmark for evaluating models on non-natural images like diagrams, math figures, charts, and documents. Our motivation was simple: today’s image models are overly optimized for aesthetics, but struggle with factuality and structural reasoning. If we want truly unified multimodal models, the training mix needs non-natural data too.

But NB2 still isn’t perfect: we still find failure cases where it misinterprets instructions or misses structural details. Excited to see the field moving toward models that reason as well as they render. Below, we provide some more analysis along with cool results! @GeminiApp @GoogleDeepMind

7,295

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Zuhaitz @zuhaitz_dev

25 Nov 2025

"The most underappreciated legend of the tech industry?" I see posts like this one every day 😭 And, obviously he is a respected professional, but he is far from underappreciated. Check Sophie Wilson. Most people haven't heard about her, but she is the primary architect of the ARM architecture. If you are reading this from your phone, tablet, or a modern MacBook with an M-series chip, you are using a device running on the architecture Wilson designed.

Pallav | RAG Expert @vibingmonk

25 Nov 2025

> created Linux kernel at 21
> built Git because nothing else was good enough
> becomes backbone of servers, Android, cloud, supercomputers
> never chased fame, money, titles, hype
> stays private, consistent, brutally honest for decades
> still reviews patches, still improves Linux
> avoids drama like it's a feature
> influences the entire tech world without even trying
> lives quietly, does the work, no noise

Linus Torvalds is the most underappreciated legend of the tech industry

472

6,815

317,768

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Ahmad Beirami

@abeirami

25 Nov 2025

"Professors definitely deserve to have their names on the papers." I think this take is completely wrong. Financial support does not warrant co-authorship. Bob Gallager (a legendary information theorist who retired from MIT) did not co-author any papers with many of his students because he did not believe that he made an intellectual contribution that warranted co-authorship. The screenshot is from Erdal Arıkan's PhD thesis work that was published in IEEE Trans. Information Theory. Both Erdal and Bob have been honored with the Shannon Award (highest honor in information theory) and they have not co-authored any papers.

Zhipeng Wang 🇺🇦

@PKUWZP

24 Nov 2025

Replying to @shaananc @thegautamkamath

I agree with most of your statement. However, there’s no “simply” leading a group or advising PhD students. Those activities require tremendous effort, both intellectually and financially. Not to mention that in the US, all of PhD students’ funding comes from professors’ grant money. Professors definitely deserve to have their names on the papers.

356

67,639

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Charlie Snell @sea_snell

25 Nov 2025

What happened to adding error bars to evals?

Yuchen Jin

@Yuchenj_UW

24 Nov 2025

Claude Opus 4.5's score on SWE-bench is wild. I like how Anthropic has focused on coding from the beginning. They haven’t released any image or video models. All in the most economically valuable area. Good strategy.

898

117,007
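For readers wondering what an error bar on an eval score could look like in practice, here is a minimal sketch. It assumes each benchmark item is an independent pass/fail trial and uses the normal approximation to the binomial; the function name and the counts (400 of 500) are hypothetical and not taken from any benchmark mentioned above.

```python
import math


def accuracy_with_error_bar(num_correct: int, num_items: int, z: float = 1.96):
    """Return (accuracy, half-width of an approximate 95% interval).

    Assumes independent pass/fail items and a normal approximation,
    which is rough for small benchmarks or scores near 0 or 1.
    """
    if num_items <= 0:
        raise ValueError("num_items must be positive")
    p = num_correct / num_items
    standard_error = math.sqrt(p * (1 - p) / num_items)
    return p, z * standard_error


# Hypothetical run: 400 correct answers out of 500 items.
acc, half_width = accuracy_with_error_bar(400, 500)
print(f"accuracy = {acc:.3f} +/- {half_width:.3f}")
```

On a 500-item benchmark the interval is already about 3.5 points wide on each side, so score gaps smaller than that are hard to tell apart from noise under these assumptions.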

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂)@clefourrier

6 Nov 2025

Know anyone who needs some help to get started with ML, open source and 🤗? We partnered with @TechToTheRescue, a tech for good incubator, & answered all the AI questions their nonprofits had to create an FAQ! github.com/huggingface/faq Come add your Q/As, it's collaborative! 🔥

920

Clémentine Fourrier 🍊 is off till Dec 2026 (🪂) retweeted

Georgia Channing

@cgeorgiaw

24 Nov 2025

incredibly detailed technical blog just dropped on the anatomy of BoltzGen 🧬

made for ML people, but covering everything from molecular representations to diffusion-based generation of protein binders

crazy good interactive visuals 👏👏 @ludocomito

231

21,943