Karpathy announced he was leaving OpenAI 4 days ago.
Today, he released an implementation of the Byte Pair Encoding algorithm behind GPT and most LLMs.
Byte Pair Encoding: "Minimal, clean, educational code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization."
The best part? It's written in 70 lines of pure python.