Daniel King

Tweets

Daniel King @danielking36

25 Jul 2025

Last week was my last week at Databricks. I'm so grateful to have worked with many incredible folks at Mosaic and Databricks, and proud of everything we accomplished, released, and learned together. Thank you to everyone who has been part of the journey. And now, a break :)

6,018

Daniel King retweeted

Vitaliy Chiley @vitaliychiley

26 Jun 2024

Replying to @Muhtasham9

A while back we ran into this problem. This was our fix: github.com/mosaicml/llm-foun…

llm-foundry/llmfoundry/callbacks/scheduled_gc_callback.py at main · mosaicml/llm-foundry

LLM training code for Databricks foundation models - mosaicml/llm-foundry

github.com

970

Daniel King retweeted

Mihir Patel @mvpatel2000

22 Mar 2024

🚨New🌟blog✍️ on ⏩ maximizing🌙 FLOPS 🚀 Training large models requires maximizing flops/GPU, especially at scale. Excited to share a few of the cool tricks in thread👀. 1/N

186

48,410

Daniel King @danielking36

1 Feb 2024

It is very fun to be able to collaborate with old friends from AI2 ❤️ Congrats on the launch @kylelostat @mechanicaldirk @i_beltagy @soldni Pete and everyone else!

Mechanical Dirk @mechanicaldirk

1 Feb 2024

Replying to @allen_ai @databricks @jefrankle

@MosaicML was, and continues to be, an amazing partner of the OLMo project. Not just on the compute side! If you look at the OLMo codebase (github.com/allenai/OLMo), you'll see a lot of shared DNA between that and LLM Foundry (github.com/mosaicml/llm-foun…)!

1,486

Daniel King retweeted

Zack Ankner @ZackAnkner

4 Sep 2023

My EMNLP paper got desk-rejected post-rebuttal because I posted it to arxiv 25 minutes after the anonymity deadline. I was optimistic about our reviews, so I spent a whole week while visiting my family writing rebuttals and coding experiments to respond.

Naomi Saphra @nsaphra

4 Sep 2023

Just got a desk reject, post-rebuttals, for a paper being submitted to arxiv <30 min late for the anonymity deadline. I talk about how the ACL embargo policy hurts junior researchers and makes ACL venues less desirable for NLP work. I don’t talk about the pointless NOISE it adds.

180

105,034

Daniel King retweeted

Jacob Portes @JacobianNeuro

28 Jul 2023

If you're at #ICML 🌴on Saturday, make sure to check out the es-fomo.com/ workshop on efficient training of LLMs! @abhi_venigalla and @jefrankle will be at our poster on optimized pretraining of MosaicBERT ⚡️🚄 📜workshop paper: openreview.net/forum?id=WH1S… 🧵

6,879

Daniel King retweeted

Databricks AI Research @DbrxMosaicAI

24 Jul 2023

🌴Aloha from Oahu!🌴 We're at the @icmlconf all week talking about #generativeAI, #llms, #diffusionmodels and our #opensource projects. Come say hello - and grab a lei! #icml2023 #icmlconf 🥥 🏄‍♀️ 🏖

6,464

Daniel King @danielking36

23 Jul 2023

At ICML in Hawaii this week! DM me or come find me at the @MosaicML @databricks booth if you want to chat!

2,250

Daniel King @danielking36

30 Jun 2023

Love these types of plots. @MosaicML we've made them for cloud compute provider, cloud storage provider, and now GPU type as well!

Abhi Venigalla @ml_hardware

30 Jun 2023

Replying to @ml_hardware

And yes, you can switch back and forth between NVIDIA and AMD, even within a single training run. It's Christmas in July!🎄

934

Daniel King retweeted

Databricks AI Research @DbrxMosaicAI

26 Jun 2023

Today, we’re excited to share that MosaicML has agreed to join @Databricks!

601

318,431

Daniel King retweeted

Databricks AI Research @DbrxMosaicAI

22 Jun 2023

Meet MPT-30B, the latest member of @MosaicML's family of open-source, commercially usable models. It's trained on 1T tokens with up to 8k context (even more w/ALiBi) on A100s and *H100s* with big improvements to Instruct and Chat. Take it for a spin on HF! huggingface.co/spaces/mosaic…

116

516

295,578

Daniel King @danielking36

5 May 2023

Incredibly excited to release our open source MPT-7B! The most exciting part of this imo is that the process is all repeatable, and nearly all open source. Between the @MosaicML platform, and our Composer, Streaming, and LLMFoundry libraries, the final run was pretty easy :)

Jonathan Frankle @jefrankle

5 May 2023

MPT is here! Check out our shiny new LLMs, open-source w/commercial license. The base MPT-7B model is 7B params trained on 1T tokens and reaches LLaMA-7B quality. We also created Instruct (commercial), Chat, and (my favorite) StoryWriter-65k variants. 🧵 mosaicml.com/blog/mpt-7b

7,819

Daniel King retweeted

Jonathan Frankle @jefrankle

30 Apr 2023

In the last two weeks, @MosaicML had lots of big news: We trained a 1B/200B token LLM on RedPajama in < 72hrs, Replit used us to train a SOTA code model in < 10 days, we trained SD2 for < $50k, long context BERTs, and perf #'s on H100s. But the biggest news is coming this week 👀

252

70,305

Daniel King @danielking36

29 Apr 2023

Took a quick break from training LLMs with @MosaicML to do a much needed update of scispaCy's entity linker to the latest UMLS release (2022AB)! github.com/allenai/scispacy/…

Release v0.5.2 · allenai/scispacy

This release includes an update of the entity linkers to use the latest UMLS release (2022AB), which includes information about newer entities like COVID-19. In [10]: doc = nlp("COVID-19 is a ...

github.com

11,804

Daniel King retweeted

Databricks AI Research @DbrxMosaicAI

27 Apr 2023

How good are @nvidia H100s actually? In collaboration with @CoreWeave, we benchmarked A100 vs H100 performance for large language model training. Here's what we found: [1/6] mosaicml.com/blog/coreweave-…

213

106,506

Daniel King retweeted

Abhi Venigalla @ml_hardware

27 Apr 2023

Replying to @julien_c @zehavoc @mmitchell_ai @yoavgo @christopher @SashaMTL

What training duration are you aiming for? Even a (30B param, 2T token) run would only cost ~$1.5M in compute. And you can rent short-term “hero run” clusters from us, no need to buy GPUs outright or 1-yr commits. Cost estimate from here: mosaicml.com/blog/gpt-3-qual…

253

Daniel King retweeted

Jonathan Frankle @jefrankle

26 Apr 2023

And now it's < $50k. 🖼️Announcing @MosaicML's diffusion offering 📷We replicated Stable Diffusion 2.0, training from scratch with huge speedup, and we can do it on your data too. Human eval showed the model to be indistinguishable from the original. Blog: mosaicml.com/blog/training-s…

Training Stable Diffusion from Scratch for $50k with MosaicML (Part 2) | Databricks Blog

We've replicated Stable Diffusion 2 for less than $50k, and we've open-sourced the training code so you can too! This is a 3x cost reduction from our last blog post and an 8x reduction from the...

databricks.com

277

66,963

Daniel King retweeted

Mihir Patel @mvpatel2000

26 Apr 2023

Replying to @jefrankle @landanjs @MosaicML @vivek_myers

4. Large model training no longer requires super teams with the right tools. We had 2-4 people working on this for about 1-2 months. @Replit trained their recent 3b code model with 2 people in a week. Great tools will empower small teams. Timelines are accelerating

1,458

Daniel King retweeted

Jonathan Frankle @jefrankle

20 Apr 2023

72 hrs ago, @togethercompute released the RedPajama dataset. Like everyone, we at @MosaicML were very excited about the idea of a fully open-source Llama. So excited, in fact, that we've already trained a 1B model on 200B tokens! It's on HF (Apache2) here: huggingface.co/mosaicml/mpt-…

457

152,539

Daniel King retweeted

Replit ⠕ @Replit

19 Apr 2023

At Replit, we're training our own LLMs. We’ve partnered with @databricks, @huggingface, and @MosaicML to build a full LLM stack. Curious about how? Check out this post by @truerezashabani, our Head of AI. blog.replit.com/llm-training

284

42,061