OpenEuroLLM

Tweets

Prompsit retweeted

OpenEuroLLM @OpenEuroLLM

Jun 9

Wrapping up our 3rd general meeting, hosted by @AISweden in sunny Stockholm ☀️ A full room makes the final decisions before training the first OpenEuroLLM model. Sharing updates, ideas, and future plans. Two more days of tight collaboration. Full speed mode. 🚀 #goOpenEuroLLM

HPLT

Prompsit retweeted

HPLT @hplt_eu

6 Nov 2025

Describing HPLT datasets in depth is an essential part of our commitment as data curators: 🆕HPLT 3.0: Very Large-Scale Multilingual Resources for LLM and MT. Mono- and Bi-lingual Data, Multilingual Evaluation, and Pre-Trained Models: arxiv.org/abs/2511.01066 We are on🔥at #HPLT

HPLT 3.0: Very Large-Scale Multilingual Resources for LLMs and MT....

We present an ongoing initiative to provide open, very large, high-quality, and richly annotated textual datasets for almost 200 languages. At 30 trillion tokens, this is likely the largest...

arxiv.org

HPLT

Prompsit retweeted

HPLT @hplt_eu

5 Nov 2025

The #HPLT crowd is at #EMNLP2025!!! If you are around, please visit our booth to discuss: - multilingual datasets 🌏 - dataset insights and stats 📊 - dataset performance 🔝 - efficient MT models ⏱️ - and the future of multilingual LLMs 💡 We don't want to miss U!

Prompsit

Prompsit @Prompsit

27 Oct 2025

Thank you, #PCUMH, for insisting that we tell people what we do and for always keeping an eye on our progress and achievements. Your support brings us visibility and joys like this one. Thank you!

Parque Científico UMH @PcientificoUMH

14 Oct 2025

📢 #PCUMH is a finalist in the “Disruptores Innovation Awards 2025” by @elespanolcom. 🏆 It has been shortlisted for "Best project driven by technology parks" thanks to the company @Prompsit, part of @OpenEuroLLM. Full story 🔽 parquecientificoumh.es/notic…

Prompsit

Prompsit @Prompsit

7 Jul 2025

It is impossible to forget the day we met Olga Torres, and the smile that made MultiTrainMT much more than a project that succeeded in its results: it brought us together, it made us a family. That smile will always be with us. Rest in peace, dear friend.

MultiTraiNMT @MultiNmt

31 Oct 2019

Kick-off meeting at @UABBarcelona of MultiTrainMT "Machine Translation training for multilingual citizens meeting" @EUErasmusPlus project. Feel free to follow/contact us for further info and/or becoming an associate partner. Anyone interested in the topic is most welcome!

Prompsit

Prompsit @Prompsit

4 Jul 2025

We had a great time at @MTSummit2025 presenting work about HPLT v2 multilingual datasets (v3 coming soon!) and ProMut, an improved DIY platform to teach and learn about MT. Great to be there also to celebrate the Award of Honour to our co-founder, CRO and friend Mikel Forcada! 😍

Prompsit

Prompsit @Prompsit

10 Mar 2025

Prompsit will actively participate in OpenEuroLLM by analysing and curating the open data needed to train the foundational LLM. We are also contributing to multilingual LLM evaluation and dissemination of it all!

OpenEuroLLM @OpenEuroLLM

7 Mar 2025

Kick-off successfully completed. Go OpenEuroLLM team! openeurollm.eu/

HPLT

Prompsit retweeted

HPLT @hplt_eu

28 Feb 2025

We are happy to announce the second release of HPLT bilingual datasets: - 50 English-centric language pairs = 380M parallel sentences (HPLT) 🤩 - 1,275 non-English-centric language pairs = 16.7B parallel sentences (MultiHPLT) 😮 Available at the HPLT dataset catalogue and OPUS.

Prompsit

Prompsit @Prompsit

11 Feb 2025

It was a pleasure to take part in this event. Thank you for the invitation, @PcientificoUMH; we really enjoyed sharing the day with our colleagues from @Prosperabiotech. We have exceptional women scientists and technologists around every corner! 👩‍🔬👩‍💻💪🦾

Parque Científico UMH @PcientificoUMH

10 Feb 2025

This is how the event on women in science and technology went, organised by the #ParqueCientífico of @universidadmh for students of @IES Victoria Kent 🧪🧬 A very special session, promoted by @APTE and #PCUMH, featuring a range of talks and workshops.

Prompsit

Prompsit @Prompsit

6 Feb 2025

We are kicking off February with a new project at @Prompsit 👋 #openeurollm #multilingual #opensource 👋

OpenEuroLLM @OpenEuroLLM

3 Feb 2025

It's time for transparent AI in Europe. It's time for open LLMs as a robust foundation for developing future private and public AI services. It's time for: OPEN = open-source Euro = under EU regulations, representing EU values LLM = LLMs openeurollm.eu

Prompsit

Prompsit @Prompsit

28 Jun 2024

To show you what we are doing in SmartBiC, a project led by @Linguaserve, our @EAMT_2024 poster is worth a thousand words.

Slator

Prompsit retweeted

Slator @slatornews

3 Apr 2024

By harnessing web crawls 🕸️ from Internet Archive and CommonCrawl, researchers 🔎 from @EdinburghUni, @helsinkiuni, @UniOslo, @UniTurku, and @Prompsit unveil new #language resources aimed at enhancing language modeling and #MT training. slator.ch/MassiveMultilingua… @OnadeGibert @graemenail @shaoxiongji @oepen @TiedemannJoerg @ltgoslo

Here’s a ‘Brand-New’ Massive Multilingual Dataset for Machine Translation

Researchers harness web crawls from Internet Archive and CommonCrawl to release new language resources.

slator.com

Rik van Noord

Prompsit retweeted

Rik van Noord @rikvannoord

14 Mar 2024

Happy to share our latest MaCoCu paper, accepted at #LRECCOLING2024 @LrecColing #NLProc 🎉 We have linguists annotate the data *quality* of 4 well-known monolingual corpora (OSCAR, CC100, mC4 and MaCoCu) across 11 European low-resource languages. Link: arxiv.org/pdf/2403.08693.pdf

Parque Científico UMH

Prompsit retweeted

Parque Científico UMH @PcientificoUMH

6 Mar 2024

➡️ @Prompsit, a company based at the #ParqueCientífico of @UniversidadMH, is collaborating on a European project on high-performance language technologies that aims to create various powerful language and translation models. Full story 📌: parquecientificoumh.es/notic…

HPLT

Prompsit retweeted

HPLT @hplt_eu

1 Mar 2024

First datasets, then models! Initial HPLT models (LLMs and MT) are out: hplt-project.org/models, some still running 🏃 We explain what we are doing in the deliverables section: hplt-project.org/deliverable… Meanwhile, we keep cooking IA peta-data-bytes 🥘, enriching, dashboarding 📊

Prompsit

Prompsit @Prompsit

26 Jan 2024

Today we turn 18, doing what we love most at this crossroads between languages and technology. Thank you for your trust. Many happy returns, Prompsit! Heartfelt thanks for your support! Happy birthday to us! 🥳 Thanks for your trust, we'll keep doing our best!

HPLT

Prompsit retweeted

HPLT @hplt_eu

7 Dec 2023

We just published version 1.2 of HPLT datasets. What's new? - we fixed a bug in monolingual dedup, please redownload! 🛠️ - we filtered out very ugly monolingual documents🤮 - we anonymised the bilingual datasets🕵️‍♀️ hplt-project.org/datasets/v1…

HPLT - High Performance Language Technologies

A space that combines petabytes of natural language data with large-scale model training

hplt-project.org

Prompsit

Prompsit @Prompsit

28 Nov 2023

Select, filter, visualize your data (OpusCleaner). Then schedule and train MT and LLMs consistently (OpusTrainer) with them. As part of the HPLT project, we build tools to make it easy. They are open-source and we encourage you to use them.

Clarin.si

Prompsit retweeted

Clarin.si @ClarinSlovenia

5 Jun 2023

We are excited to share with you that we now provide 4 more massive monolingual corpora for under-resourced languages: you can access Icelandic, Ukrainian, Catalan and Greek #MaCoCu web corpora for free from the CLARIN.SI repository 😃

Taja Kuzman Pungeršek

Prompsit retweeted

Taja Kuzman Pungeršek @TajaKuzman

23 May 2023

#MaCoCu crew is in Groningen these days! Walking towards great results of MaCoCu corpora evaluation and new MaCoCu language models for under-resourced languages 😁
