A series of foundation models for transparent AI in Europe

Joined June 2024
Pretraining launched! 🚀 Our 9B/10TT baby model is taking its first steps on Leonardo (CINECA). 🐣 Everyone involved is eager to see and share the results of the effort it took to get here. 👀 And we keep pushing hard for the next cycle. 🦾 #goOpenEuroLLM
Wrapping up our 3rd general meeting, hosted by @AISweden in sunny Stockholm ☀️ A full room makes the final decisions before training the first OpenEuroLLM model. Sharing updates, ideas, and future plans. Two more days of tight collaboration. Full speed mode. 🚀 #goOpenEuroLLM
HPLT is one of the datasets we are sharing in our world-readable catalogue across HPCs. Interesting talk!
May 13
We are at #LREC2026 presenting HPLT v3 Datasets: monolingual, parallel, massive, highly curated. For in-depth info and analysis, please join us in room Menorca 1 at 16:20!!!
All ready to share information about #OpenEuroLLM with the #LREC2026 crowd. Let's talk data, infra, evals and open multilingual LLMs together! Come to booth #5 in poster area 1, Elyxir Building. #multingualLLMs #openLLMs #diverseLLMs #safeLLMs
Quite a nice "representation" of the OpenEuroLLM crowd will be at the International Conference on Learning Representations (ICLR) this week. On Friday 24, come to poster "OpenThoughts: Data Recipes for Reasoning Models", work partially supported by our project, and meet us!👋
Also today, learn more about the impact of benchmark contamination by going to the poster of our colleagues from the universities of Helsinki and Turku and the ELLIS Institute Finland.
Experimenting with model-based annotation for better data selection? A candidate to consider is propella-1, a fully open-source multi-property annotator partially funded by #OpenEuroLLM. 🔓 Code, annotations and paper available! arxiv.org/pdf/2602.12414

We released propella-1, a small model for advanced pre-training data annotation 🙃. Work led by @maxidahl within the @OpenEuroLLM project. Link to model annotations for important pre-training datasets below 👇
🎉 One year of OpenEuroLLM! 🇪🇺We’re building Europe’s next-gen open-source LLMs to boost digital sovereignty. More about our achievements and next steps for infrastructure, data, models and evaluation at openeurollm.eu/blog/first-ye…. Year 2 = full speed ahead. 🚀 Go #OpenEuroLLM
First OpenEuroLLM Winter School in collaboration with the @CircleU_eu Alliance 🧑‍🎓and the Nordic Language Processing Laboratory 🧑‍💻 Focus on Multilinguality in LLM Development and Evaluation with speakers from world organisations, academia and industry. wiki.nlpl.eu/Community/train…
Strategic access to EuroHPC resources granted to OpenEuroLLM!!!
- first AI project granted strategic access across multiple EuroHPC centres
- for over 10 million GPU hours
Thanks @EUComission and @EuroHPC_JU!
OpenEuroLLM retweeted
17 Nov 2025
Proud to present the @OpenEuroLLM project and its results so far, with Sampo Pyysalo (@UniTurku) at the 1st Workshop on Open Source Sovereign LLMs in Berlin osfm.info/ Great opportunity to talk to many OS LLM developers! @CharlesUniPRG @hplt_eu
We strongly agree! Let's make it happen! Thanks @EU_Budget & STEP for the support.
6 Nov 2025
Future-proof AI in all EU languages isn’t a dream, it’s OpenEuroLLM 🗣️💬 9 countries, the EU budget & STEP join forces to build transparent, AI Act-compliant tech for Europe’s innovators. Find out how we will turn ambition into action for 2028-2034: europa.eu/!w77nKY
Well done #HPLT, we surely need more high performance datasets for the present and future landscape of multilingual LLMs. See you at #emnlp2025!
5 Nov 2025
The #HPLT crowd is at #EMNLP2025!!! If you are around, please visit our booth to discuss:
- multilingual datasets 🌏
- dataset insights and stats 📊
- dataset performance 🔝
- efficient MT models ⏱️
- and the future of multilingual LLMs 💡
We don't want to miss U!
OpenEuroLLM is completing 2 days of sharing progress and next steps, pursuing the goal of developing strong multilingual foundation models aligned with the European strategic vision & standards. Gathering at BSC, near the MareNostrum 5 supercomputer, made us feel at home. #Barcelona #NLProc
OpenEuroLLM retweeted
LeoLM has since been an inspiration for many other projects (like our DiscoLM 8b, the @occiglot models, and more) and serves as a conceptual baseline for some ideas within the @OpenEuroLLM project to bring strong LLMs to all European languages.
OpenEuroLLM retweeted
21 Aug 2025
Our co-founder's project #LeoLM highlighted by @bmftr_bund. Today, we're continuing what started as a student's side project with @OpenEuroLLM (and more to come). If you want to work on Open Source AI, multilingual applications and AI evaluations as well, we're hiring! 🙂
Nearly two years after release my project LeoLM is being used as a strong justification for the expansion of federal compute funding in Germany. Goes to show how much impact open-source projects can have. Hell yeah @bmftr_bund - thanks for making projects like this possible! 🚀
This was the first, but surely not the last, collab between open-sci, @laion_ai and @openEuroLLM. Establishing baselines and good starting grounds for experiments to create strong open foundation models is important, and I am happy to see it worked out so well.
📢 First release: 38 monolingual reference LLMs (2.15B params) via @hplt_eu #OpenEuroLLM ⚙️ Trained on 100B tokens from HPLT v2 dataset 🌍 Cover EU langs and others ⚙️ Based on LLaMA, trained on #LUMI 📈 Useful for evaluation. Downloads and more info at openeurollm.eu/blog/hplt-oel…
Kick-off successfully completed. Go OpenEuroLLM team! openeurollm.eu/
It's time for transparent AI in Europe. It's time for open LLMs as a robust foundation for developing future private and public AI services. It's time for: OPEN = open-source Euro = under EU regulations, representing EU values LLM = LLMs openeurollm.eu