Loïck BOURDOIS


Tweets

Pinned Tweet

Loïck BOURDOIS @BdsLoick

May 28

New blog post on @huggingface! An introduction to Trimming ✂️, a little-known but highly effective model reduction method. We achieved up to 87.24% size reduction while preserving performance 🧵

360

Loïck BOURDOIS @BdsLoick

May 28

From these 16 families, we generated more than 5,500 monolingual models in 124 different languages.

Loïck BOURDOIS @BdsLoick

May 28

Big thanks to my HF Fellows bros for multilingual evaluation @tomaarsen, Bram Vanroy, @christopher, @w00jun_, @mrm8488, @prithivMLmods and to @AI_AlphaEdge for the time dedicated to this project 🙏 Links 👇 Blogpost: huggingface.co/blog/lbourdoi… Models: huggingface.co/spaces/alphae…

Introduction to Trimming ✂

A Blog post by Loïck BOURDOIS on Hugging Face

huggingface.co

365

Loïck BOURDOIS retweeted

Niels Rogge @NielsRogge

May 18

Introducing a revival of PapersWithCode! As @ilyasut said, we're back to the "age of research". Hence, it's important to share research and build on each other's work.
> find SOTA per domain, not just LLMs
> leaderboards
> methods
> all parsed at scale using AI agents.

610

78,923

Loïck BOURDOIS retweeted

tomaarsen @tomaarsen

Apr 9

🌐 I've just released Sentence Transformers v5.4: we're going fully multimodal for embeddings & reranking! Also featuring a modular CrossEncoder, and automatic Flash Attention 2 input flattening. Highlights in 🧵

174

29,003

Loïck BOURDOIS retweeted

みぃ🍵@mithernet

Apr 3

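I'm the author! We propose a new mechanism that removes attention's constraint of "only being able to make relative comparisons". ① First, the easy-to-grasp benefits: ✅ Maintains performance and retrieves information accurately even on texts far longer than those seen in training ✅ Very fast convergence (trainable even with LR=1) ✅ 40% reduction in model size ✅ More than 3x faster inference (continued) arxiv.org/abs/2604.01178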

133

804

87,631

Loïck BOURDOIS @BdsLoick

Mar 2

CuTeDSL is really nice. For those wishing to get into writing kernels in this language, github.com/b-albar/machete can be useful. Boris ALBAR reimplemented Flash Attention, RoPE, RMSnorm, etc. Everything is compatible with HF Transformers (tests on llama3, GLM4.7, Qwen3), TRL, and PEFT/LoRA.

GitHub - b-albar/machete: A framework for megakernels

A framework for megakernels. Contribute to b-albar/machete development by creating an account on GitHub.

github.com

maharshi @maharshii

Mar 1

CuTeDSL is my new favourite thing: I wrote a kernel for RMS norm after learning about layouts, tiling, copying tensors, reductions and so on, especially for inference and it is about 2.13x faster than a triton fused kernel for the given shape.

167

Loïck BOURDOIS retweeted

Basile Terver @BasileTerv987

Feb 4

𝗜𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝗘𝗕-𝗝𝗘𝗣𝗔 ⚡ An open-source library making JEPAs accessible, trainable on a single GPU in hours! 🚀 🔗 Paper: arxiv.org/abs/2602.03604 💻 Code: github.com/facebookresearch/…

657

92,184

Loïck BOURDOIS retweeted

João Maria Janeiro @JoaoMJaneiro

5 Nov 2025

🚨New Paper @AIatMeta 🚨 You want to train a largely multilingual model, but languages keep interfering and you can’t boost performance? Using a dense model is suboptimal when mixing many languages, so what can you do? You can use our new architecture Mixture of Languages! 🧵1/n

2,009

Loïck BOURDOIS retweeted

Lewis Tunstall @_lewtun

30 Oct 2025

We've just published the Smol Training Playbook: a distillation of hard earned knowledge to share exactly what it takes to train SOTA LLMs ⚡️ Featuring our protagonist SmolLM3, we cover:
🧭 Strategy on whether to train your own LLM and burn all your VC money
🪨 Pretraining, aka turning a mountain of text into a fancy auto-completer
🗿 How to sculpt base models with post-training alchemy
🛠️ The underlying infra and how to debug your way out of NCCL purgatory
Highlights from the post-training chapter in the thread 👇

491

142,768

Loïck BOURDOIS retweeted

Shayne Longpre @ShayneRedford

28 Oct 2025

📢 Thrilled to introduce ATLAS 🗺️: scaling laws beyond English, for pretraining, finetuning, and the curse of multilinguality. The largest public, multilingual scaling study to-date; we ran 774 exps (10M-8B params, 400 languages) to answer:
🌍 Are scaling laws different by language?
🧙‍♂️ Can we model the curse of multilinguality?
⚖️ Pretrain from scratch or finetune from multilingual checkpoint?
🔀 Cross-lingual transfer scores for 1444 lang pairs?
1/🧵

154

24,634

Loïck BOURDOIS retweeted

tomaarsen @tomaarsen

22 Oct 2025

🤗 Sentence Transformers is joining @huggingface! 🤗 This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face. I'm super excited about the transfer! Details in 🧵

375

41,519