Solving Information Extraction problems. Creators of open-source NLU foundation models (huggingface.co/numind).

Joined June 2022
Pinned Tweet
8 Aug 2024
🎉 🎉🎉 NuMind (YC S22) is OUT! 🎉🎉🎉 After a long R&D phase, we are finally coming out of private beta! 😀 Here is a presentation video & the release blog post: lnkd.in/enpYVaRg, lnkd.in/eWhdk6nD
NuMind retweeted
We are releasing NuExtract3, a 4B open-source (Apache 2.0) reasoning OCR & Structured Extraction VLM 🧠📷📄 NuExtract3 is the first reasoning VLM specialized in both OCR (converting PDFs/Scans/Spreadsheets into Markdown) and structured extraction (document to JSON via a schema). We trained NuExtract3 from Qwen3.5 4B via an SFT phase and an RL phase to give it reasoning abilities, which can be turned on and off. We find that NuExtract3 outperforms similar-size models (<30B) on both the structured extraction and OCR tasks, including specialized OCR models, making it the new reference for open-source document extraction 😀. Congrats to Alexandre Constantin, Sören Dréano, and @NathanFradet on this model! Looking forward to the Pro version :) Available on @huggingface. Links in the reply.
8 Aug 2024
This is just a start. We are working hard to take this baby way further. We firmly believe that this "AI teaching" process is the way to go and will be the way humans create all sorts of advanced AIs.
27 Jun 2024
Our information extraction model NuExtract is out! A lightweight text-to-JSON LLM reaching GPT-4o levels. It is open-source (MIT licence) and available on @huggingface. Check the blog post: numind.ai/blog/nuextract-a-f…
13 Nov 2023
Check out our new open-source foundation model for Entity Recognition (NER): huggingface.co/numind/generi… It lets you create custom NER models with typically 5x (sometimes 10x!) less annotated data than before :) Here is the blog post explaining how we created it: numind.ai/blog/a-foundation-…
13 Nov 2023
In a nutshell, we used GPT-3.5 to annotate a part of the C4 dataset with 80k individual concepts, and then used this dataset to train the foundation model. Research done by Sergei Bogdanov (@serega6678), Alexandre Constantin, and Etienne Bernard (@etiennebcp)
21 Jun 2023
New NuMind blog post, "What are Language Models?", explaining how this fundamental technology works and what we can expect in the near future.
Hi everyone! I am starting a series of blog posts about NLP, ML, and AI. Here is the first one where I talk about how LLMs work, their history, the current state of affairs, and what we can expect next: numind.ai/blog/what-are-larg… Hope that is interesting/useful to some!
13 Sep 2022
Ahah, Brex showed our logo on Times Square. Not sure if it will lead to new customers but that's pretty cool! Thanks @brexHQ ! #GrowWithBrex
2 Aug 2022
Early-access program started. Let us know if you are interested!
Some @numind_ai news: our early-access program is on! So if you have any need for text processing (topic identification, sentiment analysis, information extraction...), contact me - we will give you access to the tool and help you to make sure your NLP project is a success!