Founder of Data School

Joined March 2010
Kevin Markham retweeted
Recently, we purchased one of each Anthropic/OpenAI subscription plan and randomly ran long horizon coding tasks until we exhausted the weekly limit. It's widely believed that a $200/month plan maxes out at ~$2000/month worth of tokens (assuming API pricing). However, we found that the subscriptions are actually far more generous. (2/4)
190
574
6,041
3,463,315
Kevin Markham retweeted
PICARD: Data, shields up DATA: Brilliant! Shields can reduce damage we sustain. Not immunity. Not hubris. Just prudence. It's not precaution—it's strategy. [camera shakes] WORF: HULL BREACHES ON NINE DECKS DATA: Here's what happened: you told me to raise shields, and I didn't
305
4,861
50,516
1,385,139
Anyone have a personal contact at a bootcamp or university where they teach #MachineLearning using #Python? I'd love to talk with them about incorporating my book into their curriculum! Feel free to DM me their contact info 🙏 Read the book online (free!): mlbook.dataschool.io
1
8
31
1,817
Floored by the response to my new #Python #MachineLearning book 🤩 Paperback: geni.us/MasterML Ebook: courses.dataschool.io/ebook Read online (free!): mlbook.dataschool.io
9
48
2,307
Kevin Markham retweeted
My book is the #1 New Release in NLP! 🥳 Amazon US put it on sale... for $0.95 off 😂 Get the paperback: geni.us/MasterML Or read online (free!): mlbook.dataschool.io #MachineLearning #Python
5
11
59
3,283
Absolutely thrilled that my book is finally published! 🎉 Paperback: amazon.com/dp/B0GRFPZ768 ebook: courses.dataschool.io/ebook Read online: mlbook.dataschool.io/ Poured my heart & soul into this for 5 years. Hopefully I sell a few copies even though you can read it for free 😂
4
17
113
4,815
The BEST course I took last year runs one FINAL time... and it starts in 4 days ⏰ You'll learn how to build production-ready AI apps from @hugobowne. Includes LIVE instruction & talks from experts, plus $1300 in AI partner credits. Enroll for 25% off: maven.com/hugo-stefan/buildi…
1
2
645
My new book - on sale NEXT WEEK! 🎉 Sign up to get notified when it's available: dataschool.kit.com/mlbook #MachineLearning #Python @scikit_learn
2
3
15
1,841
Final proof copy of my new #MachineLearning book 🎉 Get notified the moment it's available: dataschool.kit.com/mlbook
6
5
112
4,633
Kevin Markham retweeted
31 Dec 2025
Here's my enormous round-up of everything we learned about LLMs in 2025 - the third in my annual series of reviews of the past twelve months simonwillison.net/2025/Dec/3… This year it's divided into 26 sections! This is the table of contents:
102
868
4,872
509,136
Kevin Markham retweeted
Are you a #Python user and a lifelong learner? I've just published my 8th annual list of every Python-related Black Friday / Cyber Monday sale I'm aware of. treyhunner.com/2025/11/pytho…

2
5
17
2,725
VIDEO: How to use top AI models on a budget Want to chat with the best AI models from OpenAI, Claude, and Google without paying $20/month? I'll show you how to use API keys w/ @TypingMindApp to access top models for a fraction of the cost! Find out how: youtube.com/watch?v=wvvTog-F…
2
6
1,148
Kevin Markham retweeted
19 Oct 2025
I love Andrej's clarification that the final AGI recipe includes an RL stage, but we still need new layers of breakthroughs to get there. AGI is still a research problem, not an engineering problem. Scaling compute 100× won't magically make it happen. The lab that invents the next learning paradigm will define the future.
46
45
704
71,143
Kevin Markham retweeted
Everyone should be using Claude Code more. PMs, marketers, designers, founders, parents. Everyone. The trick is to forget that it's called Claude Code and instead think of it as Claude Local or Claude Agent. It's essentially a super-intelligent AI running locally, able to do stuff directly on your computer, from organizing your files and folders to brainstorming domain names, summarizing customer calls, enhancing image quality, creating Linear tickets, and so much more. Here are 50 creative ways non-technical people are using Claude Code in their work and life, to inspire your own thinking. This list includes my own favorite use cases, and many examples y'all shared with me 👇
85
194
2,121
369,470
Kevin Markham retweeted
Excited to release new repo: nanochat! (it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.

It weighs ~8,000 lines of imo quite clean code to:

- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Efficient inference the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.

Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.

My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed). I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved.

Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.
683
3,352
24,135
5,808,891
Kevin Markham retweeted
7 Oct 2025
Vibe coding is irresponsibly building software through dice rolls, not caring what code is produced. What about when engineers at the top of their game use AI tools responsibly to accelerate their work? I propose "vibe engineering"!
383
162
2,117
293,852
Kevin Markham retweeted
This MIT paper just broke my brain. Everyone keeps saying LLMs can't do real logical reasoning. Turns out we've just been teaching them wrong this whole time. These researchers built something called PDDL-INSTRUCT that actually teaches models to think through planning problems step by step. Not just pattern matching - actual logical reasoning. Here's how it works: Phase 1: show the model correct and incorrect plans with explanations. Basic stuff. Phase 2 is where it gets interesting. They make the model generate explicit reasoning for every single action, then use an external verifier to check if each step is logically sound. The numbers are wild. Llama-3-8B jumped from 28% to 94% accuracy on planning benchmarks. That's not incremental improvement - that's a completely different capability emerging. What's smart is they don't trust the model to check its own work. They use VAL, a formal planning verifier, to validate every logical step. When the model screws up, it gets specific feedback about exactly what went wrong. The two-stage training is clever. First stage focuses purely on better reasoning chains. Second stage optimizes for actually solving the problem. This prevents the model from just gaming the metrics. One finding caught my attention - detailed feedback destroys binary feedback. Just telling a model "wrong" vs explaining exactly which preconditions failed makes a huge difference. The gap is especially big on complex problems. This isn't trying to replace symbolic planners. It's teaching neural networks to reason like symbolic planners while keeping external verification. That's actually sustainable. The implications go way beyond planning. Any multi-step reasoning task could benefit from this approach. We might finally be seeing how to teach LLMs structured thinking instead of just sophisticated autocomplete. Makes me wonder what other "impossible" capabilities are just sitting there waiting for the right training approach.
117
695
3,770
293,694