BigScience Research Workshop

Tweets

Pinned Tweet

BigScience Research Workshop @BigscienceW

12 Jul 2022

BLOOM is here. The largest open-access multilingual language model ever. Read more about it or get it at bigscience.huggingface.co/bl… hf.co/bigscience/bloom

BigScience Research Workshop @BigscienceW

31 Oct 2025

🫶🌸

Obvious @obv_ious

16 Oct 2025

Replying to @obv_ious

Our design for Jean Zay’s new covering translates gradient descent, the mathematical heart of how AI learns, into color and form. We actually studied the loss landscape of the BLOOM model from @BigscienceW, which was trained on Jean Zay, to create the artwork.

BigScience Research Workshop retweeted

Stas Bekman @StasBekman

10 Jul 2025

This is the tech that Tunji Ruwase and I first started working on during @BigscienceW to deal with cluster resizes during BLOOM-176B training. Sam Ade Jacobs, Lev Kurilenko and Masahiro Tanaka then brought it to the finish line, improving the code and publishing a paper and presentation at USENIX ATC 2025. See Minjia's post below for links to the paper, code, etc.

Minjia Zhang @_Minjia_Zhang_

10 Jul 2025

📢 Yesterday at USENIX ATC 2025, Xinyu Lian from UIUC SSAIL Lab presented our paper on Universal Checkpointing (UCP).
UCP is a new distributed checkpointing system designed for today's large-scale DNN training, where models often use complex forms of parallelism, including data, tensor, pipeline, and expert parallelism. Existing checkpointing systems struggle in this setting because they are tightly coupled to specific training strategies (e.g., ZeRO-style data parallelism or 3D model parallelism), which break down when the training configuration needs to change dynamically over time. This makes resilient, fault-tolerant training difficult.
UCP solves this by decoupling distributed checkpointing from parallelism strategies. Our design introduces a unified checkpoint abstraction -- the atomic checkpoint -- and a full pattern-matching-based transformation pipeline, which enables scalable, low-overhead checkpointing with reconfigurable parallelism across arbitrary model sharding strategies. We show that UCP supports state-of-the-art models trained with hybrid 3D/4D parallelism (ZeRO, TP, PP, SP) while incurring less than 0.001% overhead of the total training time.
UCP is fully open-sourced in DeepSpeed. It has been adopted by Microsoft, BigScience, UC Berkeley and others for large-scale model pre-training and fine-tuning, including Phi-3.5-MoE (42B), BLOOM (176B), and many more. It has also been selected for presentation at PyTorch Day 2025 and FMS 2025 (the Future of Memory and Storage).
Big thanks to the amazing collaborators from Microsoft and Snowflake: @samadejacobs, @LevKurilenko, @MasahiroTanaka, @StasBekman, and @TunjiRuwase.
🔗 Project: lnkd.in/gG6j4vJe
📄 Paper: lnkd.in/gUiC5kcR
💻 Code: lnkd.in/g6uS29nH
📚 Tutorial: lnkd.in/gi_zWSWh
#ATC2025 #LLM #Checkpointing #SystemsForML #DeepLearning #DistributedTraining #UIUC #DeepSpeed

BigScience Research Workshop retweeted

Jeff Boudier 🤗 @jeffboudier

12 Jun 2025

4 years ago we were on the brink of AI becoming proprietary and centralized, when OpenAI kept GPT3 closed and VCs started dumping money on researchers. From fully open science, to fully closed, in a matter of months.
It was scary, and 1,000 leading researchers and scientists banded together to show the world that it was possible to do the same work in the open, and build an ecosystem that benefits everyone. That was the @BigscienceW BLOOM project, and it put us back on track to open science, starting with forward-thinking organizations like @Meta releasing OPT.
Look at us now. Open models have not only caught up, they're state of the art now. Not just LLMs, but models for document AI, speech to text, text to speech, generating images and more. We're closing in on 2 million open weight models on @huggingface.
Thanks for the reminder @Thom_Wolf.

BigScience Research Workshop @BigscienceW

23 Aug 2024

🌸❤️

Matthias Gallé @mgalle

23 Aug 2024

Packing for a weekend I found this. It is hard to believe that @BigScienceLLM really happened. The first time I heard of the idea my take was "this is going to be fun... but not going to work". Kudos to @Thom_Wolf for the vision.

BigScience Research Workshop retweeted

clem 🤗 @ClementDelangue

23 Aug 2024

Doesn't get enough credit but IMO paved the way for open-source LLMs!

Matthias Gallé @mgalle

23 Aug 2024

BigScience Research Workshop retweeted

Oxford Internet Institute @oiioxford

19 Jul 2024

DPhil candidate @cailean_osborne shares reflections on the @OpenSourceOrg co-design process to define #opensourceAI and recommends next steps, including improving model safety and supporting more grassroots initiatives like @BigscienceW.

BigScience Research Workshop retweeted

Stas Bekman @StasBekman

2 Jul 2024

The Universal Checkpointing paper is out! arxiv.org/abs/2406.18820 If you remember the @BigscienceW BLOOM-176B training, Tunji Ruwase and I co-invented this technology for Megatron-DeepSpeed so that we could quickly scale the node topology up and down while continuing training. Since then @MSFTDeepSpeed has continued improving on it, and it has now been fully integrated into DeepSpeed. The blog post is here: github.com/microsoft/DeepSpe…

BigScience Research Workshop retweeted

Omar Sanseviero @osanseviero

22 Nov 2023

The top 15 most-liked organizations on @huggingface
1. @StabilityAI 20k likes
2. @AIatMeta 20k
3. @runwayml 11k
4. CompVis 10k
5. @thukeg 7k
6. @BigscienceW 7k
7. @TIIuae 7k
8. @Microsoft 6.5k
9. @GoogleAI 6k
10. @OpenAI 4k
11. @BigCodeProject 4k
12. @MosaicML 4k
13. @UKPLab 3k
14. @AiEleuther 3k
15. @salesforce 3k
huggingface.co/spaces/Pulsar…

BigScience Research Workshop retweeted

Yacine Jernite @YJernite

2 Nov 2023

I respect the caution, but also need to stress that efforts that pursue transparency as an operational value in service of actual inclusion and accountability do exist - see for example the writing on this very topic by @BigscienceW, including its ethical charter. 1/3

Meredith Whittaker @mer__edith

2 Nov 2023

I did not sign this statement, tho I agree “open” AI is not the enemy of “safe” AI.
I can't endorse its premise that “openness” alone will “mitigate current future harms from AI,” nor that it’s an antidote to concentrated power in the AI industry 1/ open.mozilla.org/letter/

BigScience Research Workshop retweeted

Sasha Luccioni, PhD 🦋🌎✨🤗 @SashaMTL

15 Aug 2023

Never thought I'd see the day I'd have a publication in JMLR 🥹 So happy that the BLOOM carbon footprint paper has finally found a home at such an incredible venue! Thank you @shakir_za for being such a great editor, it warms my heart to see your name on this paper 💚

BigScience Research Workshop retweeted

MMitchell @mmitchell_ai

3 Jul 2023

If you wanted to see the fun panel/Q&A we did with Londoners on AI, you can check out the recording here! My preso at the start is also on Open Science, representing @huggingface & @BigscienceW.

Science Gallery London @SciGalleryLon

3 Jul 2023

Couldn't make it along to last week's event with @mmitchell_ai? Head over to our blog to watch Margaret's full presentation plus the lively panel discussion that followed feat. @lara_groves @irini_mirena & @carolinesinders london.sciencegallery.com/bl… @londondataweek #AI4Me

BigScience Research Workshop retweeted

BigCode @BigCodeProject

4 May 2023

Introducing: 💫StarCoder
StarCoder is a 15B LLM for code with 8k context and trained only on permissive data in 80 programming languages. It can be prompted to reach 40% pass@1 on HumanEval and act as a Tech Assistant.
Try it here: shorturl.at/cYZ06r
Release thread🧵

BigScience Research Workshop retweeted

BigCode @BigCodeProject

21 Mar 2023

Join us tomorrow, Wednesday 22nd (6:30 PM - 8:00PM CET) at the @mozillafestival Science Fair to learn more about our work in the open and responsible development of large language models (LLMs) for code. schedule.mozillafestival.org… #Mozfest

BigScience Research Workshop retweeted

Giada Pistilli @GiadaPistilli

16 Mar 2023

As you already know, I am very proud of the collective work that enabled the development of @BigscienceW's ethical charter. Today I am even more proud to announce that it's part of @OECDinnovation's catalog to promote Trustworthy AI: such a milestone! oecd.ai/en/catalogue/tools/b…

BigScience Research Workshop retweeted

Aran Komatsuzaki @arankomatsuzaki

8 Mar 2023

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Documents the data creation and curation efforts of the ROOTS corpus, a 1.6TB dataset used to train BLOOM.
Releases a large initial subset of the corpus data: huggingface.co/bigscience-da…
abs: arxiv.org/abs/2303.03915

BigScience Research Workshop retweeted

Anna Rogers @annargrs

2 Mar 2023

Worried about benchmark data contamination? Studying LLM memorization or attribution? @BigscienceW BLOOM 🌸 now has exact & fuzzy search over full training data! with @olapiktus🏆 @christopher Paulo Villegas @HugoLaurencon @ggdupont @SashaMTL @YJernite arxiv.org/abs/2302.14035 /1

BigScience Research Workshop retweeted

Yong Zheng-Xin @yong_zhengxin

20 Dec 2022

(Repost for corrected arXiv) 🧐What’s the best way to quickly adapt large multilingual language models to new languages? We present our new paper from @BigscienceW 🌸: BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting. 📜 arxiv.org/abs/2212.09535 [1/9]

BigScience Research Workshop retweeted

Max Ryabinin @m_ryabinin

14 Dec 2022

Petals, a system for easy decentralized inference and adaptation of 100B LLMs, is now online!
🌸 Generate text with BLOOM-176B using Colab or a desktop GPU
🔌 Fine-tune large models for your tasks
👥 Help others by contributing your GPUs or host a new swarm
colab.research.google.com/dr…

BigScience Research Workshop retweeted

clem 🤗 @ClementDelangue

14 Nov 2022

The BLOOM paper is out. Looks like it's doing worse than the current GPT3 API in zero-shot generation tasks in English, but better than other open-source LLMs and better than all of them in zero-shot multilingual tasks (which was the main goal). Proud of the work from the community! arxiv.org/abs/2211.05100

BigScience Research Workshop @BigscienceW

4 Nov 2022

Big day today with two papers out! BLOOM carbon footprint at arxiv.org/abs/2211.02001, new models BLOOMZ and mT0 at huggingface.co/bigscience/bl…