pretraining and scaling @cohere, previously @samsung @seoulnatluni @rwth

Joined May 2018
Björn Bebensee retweeted
North Mini Code is now free on OpenCode. 256K context · fully open source · Cohere's first coding model
47
80
2,124
212,715
Björn Bebensee retweeted
Cohere just released North Mini Code, a small 30B parameter (3B active) open weights coding model that scores 27.6 on the Artificial Analysis Intelligence Index
Less than a month after its last model release, Command A, @cohere has launched another open weights model, this one optimized for coding and much smaller at 30B total parameters and 3B active parameters.
Key Takeaways:
➤ Achieves 27.6 on the Artificial Analysis Intelligence Index, above gpt-oss-20B (high) at 24.5 and just below Mistral Small 4 (119B parameters, 6.5B active) at 27.8
➤ Scores competitively on the Artificial Analysis Coding Index (weighted average of Terminal-Bench Hard and SciCode) against open weights models in its size class, scoring 33.4, significantly above GLM-4.7-Flash at 25.9, and below Qwen3.6 35B A3B at 35.2. However, it underperforms on non-coding agentic tasks, scoring 14% on GDPval-AA and 37% on 𝜏²-Bench Telecom
➤ On Cohere’s API, North Mini Code is faster than several comparable open weights models of its intelligence and size class (~199 output tokens per second)
➤ North Mini Code is a text-only 30B total parameter and 3B active parameter model, and is open-sourced under the Apache 2.0 license
10
29
214
18,100
Björn Bebensee retweeted
Jun 9
Introducing Cohere's first open-source coding model: North Mini Code Small & efficient, designed for agentic performance and built for community input.

69
261
2,270
574,453
Björn Bebensee retweeted
My @cohere internship project w/ @kroscoo and @acyr_l is on arXiv! We show that efficient benchmarking (predicting scores from a subset of questions) can be greatly improved using standard feature-selection & regression techniques (mRMR and kernel ridge)! arxiv.org/abs/2605.25773
3
23
78
9,285
fastest model on artificial analysis
May 20
Introducing: Cohere Command A We’ve created our most powerful LLM yet, optimized it to run on as little hardware as possible, and released it open-source for all.
2
4
83
13,623
Björn Bebensee retweeted
May 20
Introducing: Cohere Command A We’ve created our most powerful LLM yet, optimized it to run on as little hardware as possible, and released it open-source for all.
103
380
2,692
735,868
Björn Bebensee retweeted
Command A from @cohere is out now :) it's our best model yet and it's open source under Apache 2.0
56
132
1,342
203,243
Björn Bebensee retweeted
Apr 22
Excited to share our work on production-ready W4A8 inference, now integrated in vLLM! By combining 4-bit weights (low memory) with 8-bit activations (high compute), we hit the sweet spot for both decoding and prefill — up to 58% faster TTFT and 45% faster TPOT vs W4A16 on Hopper.
5
37
262
22,799
Björn Bebensee retweeted
LLM agents are assumed to integrate unexpected environmental observations into their reasoning. It turns out they don't. We added the complete task solution into agent environments as a file or an API endpoint, and measured whether agents act on what they discover. They almost never do. Starkest example: on AppWorld, gpt-oss-120b sees a CLI command documented as "returns the complete solution to this task" in 97.54% of runs. It calls it in 0.53%. Same pattern for GLM-4.7 and other models, across Terminal-Bench, SWE-Bench, and AppWorld. 📜 arxiv.org/abs/2604.17609 🧵👇
9
23
140
14,899
Björn Bebensee retweeted
Mar 26
Introducing: Cohere Transcribe – a new state-of-the-art in open source speech recognition.
81
296
2,574
610,783
Björn Bebensee retweeted
Hey all, I will be at GTC next week talking about all the work my team and I did on large-scale MoE training in JAX on GPUs! We decided early on to have a fully dropless training stack to avoid token dropping. (1/7)
2
11
103
15,604
Björn Bebensee retweeted
Welcoming @cohere to the team as our Official Generative AI Partner to help accelerate AI innovation. Learn more: astonmartinf1.com/en-GB/news…
25
52
741
76,744
Björn Bebensee retweeted
Mar 4
We’re proud to announce a multi‑year partnership with the @AstonMartinF1 team! Every team member will now have access to our enterprise‑grade models and agentic AI platform empowering them to operate with confidence in one of the most demanding data environments in global sport. Watch out for Cohere branding on the car starting this weekend at the #AustralianGP 🇦🇺 Learn more here: astonmartinf1.com/en-GB/news…
Welcoming @cohere to the team as our Official Generative AI Partner to help accelerate AI innovation. Learn more: astonmartinf1.com/en-GB/news…
3
11
69
8,587
Björn Bebensee retweeted
We're hiring a Research Engineer who understands models at a deep technical level and is excited to take responsibility across the full lifecycle. If you're excited to join a small team driving research with real-world impact, we'd love to hear from you. shorturl.at/EJxxr
15
35
488
45,047
Björn Bebensee retweeted
Today at #IndiaAISummit, we’re releasing Tiny Aya Fire: our region-focused multilingual model for South Asian languages including Telugu, Marathi, Bengali, Tamil, Hindi, Punjabi, Gujarati, Nepali, and Urdu. 💪 Strong translation. Strong reasoning. Built to run locally.
4
12
91
8,523
Björn Bebensee retweeted
Introducing ✨Tiny Aya✨, a family of massively multilingual small language models built to run where people actually are. Tiny Aya delivers strong multilingual performance in 70 global languages in a 3.35B parameter model, efficient enough to run locally, even on a phone.
28
156
845
192,391
Björn Bebensee retweeted
I will be at Nvidia GTC in March! @bharatvenki and I are gonna talk about all the systems work we do at Cohere. Come hear from me about all the custom kernel work we do for large scale LLM training on Hopper and Blackwell!! nvidia.com/gtc/session-catal…
1
3
15
591
Björn Bebensee retweeted
It was truly inspiring to visit the TKMS shipyard in Kiel on my first ever visit to Germany! Building submarines is an insanely intensive engineering endeavor that spans both moving bits and atoms, in a highly sensitive, secret, proprietary, and sovereign order of operations. A perfect fit for North :)
Jan 14
We’re excited to collaborate with TKMS to explore how our secure enterprise AI capabilities can support the Royal Canadian Navy and strengthen national defence. tkmsgroup.com/news/article/t…
3
5
50
5,842
Björn Bebensee retweeted
40 characters to capture the challenges, allies, and moments that define what it means to be an ML researcher. Meet the legends and get your booster pack at #NeurIPS2025. 🎴 #LabLegends
1
12
29
1,360
Björn Bebensee retweeted
16 Oct 2025
I am hiring highly skilled performance engineers for my team! You will be working on optimising pretraining for models >100B params on O(1000s) of GPUs, and hardware-aligned architecture design. We are cooking a lot of very exciting projects and I can safely say you will have a lot of fun! Link in thread. <3
14
44
453
67,390