Entrepreneur software engineer, currently working in payments and on a nocode product. Interested in PLD, Formal methods and AI.

Joined March 2008
99 Photos and videos
Tycho retweeted
24 Jun 2025
Setting up email shouldn't suck. Shoutbox.net makes it painless — you'll be sending in no time. No waiting. No pain. Just send.
1
7
16,112
Tycho retweeted
The LIVE Programming Workshop will be held online this year! So no excuses - submit by July 21. @liveprog2021 [link in reply]
1
6
9
1,773
15 Oct 2024
He @ContaboCom , I cannot reach one of my VPSs again, and it's not even weekend!
1
93
15 Oct 2024
Must say I haven't had downtime during the week yet, but support seems to not respond now either? Can you help?
68
Tycho retweeted
“Make it more epic.” 
EA Chief Strategy officer Mihir Vaidya demonstrates how EA plans to super-charge user-created game content with AI in this concept video "Imagination to Creation" from EA’s Investor Day event in NYC today.
485
108
932
891,102
Tycho retweeted
This is such an amazing talk from Dr. Erik Meijer (@headinthebox, famous for his work on Visual Basic, C#, LINQ, Hack), on how LLMs upended his research, and are changing coding and what developers do. I've clipped some of my fave parts of his talk: - His team found that the specialized models they built to do codegen, and find/fix bugs at Meta were completely outclassed by ChatGPT. They were surprised that ChatGPT could write Hack code, despite it not being used widely outside of Facebook. (@DynamicWebPaige has talked about the phenomenon of Gemini and general-purpose frontier models outperforming and replacing older/smaller specialized models built over the years throughout Google.) Source: youtube.com/watch?v=SsJqmV3W…
15
162
1,081
250,961
Tycho retweeted
30 Aug 2024
Why do all websites look the same? 🤔 Because we'd rather ship now than pixel-push forever! Ship fast, redesign later! 🚀 Here are 20 Next.js component libraries to fuel your next MVP: (thread) 🧵👇 #IndieHackers #BuildInPublic
2
1
5
1,398
Tycho retweeted
LIVE workshop submissions due in a couple weeks, July 7! If you've been thinking about submitting some work, great time to put together a demo / short paper
2
7
22
3,274
Tycho retweeted
🤣
189
695
10,220
742,194
Tycho retweeted
10 Jun 2024
The beauty of math book covers from @doverpubs. 😍
34
606
3,765
365,288
Tycho retweeted
Nmap
8
399
2,287
304,583
Tycho retweeted
5 techniques to fine-tune LLMs, explained visually! Fine-tuning large language models traditionally involved adjusting billions of parameters, demanding significant computational power and resources. However, the development of some innovative methods have transformed this process. Here’s a snapshot of five cutting-edge techniques for finetuning LLMs, each explained visually for easy understanding. LoRA: - Introduce two low-rank matrices, A and B, to work alongside the weight matrix W. -Adjust these matrices instead of the behemoth W, making updates manageable. LoRA-FA (Frozen-A): - Takes LoRA a step further by freezing matrix A. - Only matrix B is tweaked, reducing the activation memory needed. VeRA: - All about efficiency: matrices A and B are fixed and shared across all layers. - Focuses on tiny, trainable scaling vectors in each layer, making it super memory-friendly. Delta-LoRA: - A twist on LoRA: adds the difference (delta) between products of matrices A and B across training steps to the main weight matrix W. - Offers a dynamic yet controlled approach to parameter updates. LoRA : - An optimized variant of LoRA where matrix B gets a higher learning rate. This tweak leads to faster and more effective learning. Credits to @_avichawla for great visualisation! 👏
7
304
1,285
102,946
Tycho retweeted
9 Jun 2024
I recreated an entire GPT architecture in a spreadsheet. It is a nanoGPT designed by @karpathy with about 85000 parameters, small enough to be packed into a spreadsheet file. It is great for learning about how transformer works as it shows all the data and parameters going through a transformer pipeline, and all the calculations are actually working inside those cells. Here is a link to the project where you can download the file and play with it yourself. No coding needed, it is just a spreadsheet. 😉 Spreadsheet is all you need: github.com/dabochen/spreadsh… Also thanks to @BrendanBycroft with his LLM visualisation project which inspired this.
77
749
5,282
741,209
Tycho retweeted
📽️ New 4 hour (lol) video lecture on YouTube: "Let’s reproduce GPT-2 (124M)" youtu.be/l8pRSuU81PU The video ended up so long because it is... comprehensive: we start with empty file and end up with a GPT-2 (124M) model: - first we build the GPT-2 network - then we optimize it to train very fast - then we set up the training run optimization and hyperparameters by referencing GPT-2 and GPT-3 papers - then we bring up model evaluation, and - then cross our fingers and go to sleep. In the morning we look through the results and enjoy amusing model generations. Our "overnight" run even gets very close to the GPT-3 (124M) model. This video builds on the Zero To Hero series and at times references previous videos. You could also see this video as building my nanoGPT repo, which by the end is about 90% similar. Github. The associated GitHub repo contains the full commit history so you can step through all of the code changes in the video, step by step. github.com/karpathy/build-na… Chapters. On a high level Section 1 is building up the network, a lot of this might be review. Section 2 is making the training fast. Section 3 is setting up the run. Section 4 is the results. In more detail: 00:00:00 intro: Let’s reproduce GPT-2 (124M) 00:03:39 exploring the GPT-2 (124M) OpenAI checkpoint 00:13:47 SECTION 1: implementing the GPT-2 nn.Module 00:28:08 loading the huggingface/GPT-2 parameters 00:31:00 implementing the forward pass to get logits 00:33:31 sampling init, prefix tokens, tokenization 00:37:02 sampling loop 00:41:47 sample, auto-detect the device 00:45:50 let’s train: data batches (B,T) → logits (B,T,C) 00:52:53 cross entropy loss 00:56:42 optimization loop: overfit a single batch 01:02:00 data loader lite 01:06:14 parameter sharing wte and lm_head 01:13:47 model initialization: std 0.02, residual init 01:22:18 SECTION 2: Let’s make it fast. GPUs, mixed precision, 1000ms 01:28:14 Tensor Cores, timing the code, TF32 precision, 333ms 01:39:38 float16, gradient scalers, bfloat16, 300ms 01:48:15 torch.compile, Python overhead, kernel fusion, 130ms 02:00:18 flash attention, 96ms 02:06:54 nice/ugly numbers. vocab size 50257 → 50304, 93ms 02:14:55 SECTION 3: hyperpamaters, AdamW, gradient clipping 02:21:06 learning rate scheduler: warmup cosine decay 02:26:21 batch size schedule, weight decay, FusedAdamW, 90ms 02:34:09 gradient accumulation 02:46:52 distributed data parallel (DDP) 03:10:21 datasets used in GPT-2, GPT-3, FineWeb (EDU) 03:23:10 validation data split, validation loss, sampling revive 03:28:23 evaluation: HellaSwag, starting the run 03:43:05 SECTION 4: results in the morning! GPT-2, GPT-3 repro 03:56:21 shoutout to llm.c, equivalent but faster code in raw C/CUDA 03:59:39 summary, phew, build-nanogpt github repo
412
2,169
15,351
1,526,197
Tycho retweeted
Happy Ghostbusters Day!! Ghostbusters 1984 - 2024 40 years old this year! 👻 Atari 2600 video: brokenarcade 🕹️

13
89
545
20,331
Tycho retweeted
What if ChatGPT was made in 1995… for real? 🤯 I was bored for a few days and decided to remake my portfolio in Windows 95 style with a ChatGPT-like AI 😎 It is 100% local, no API calls, no server costs, even for the ChatGPT-like app! Try it out 👉 phuctm97.com
30
5
94
8,229
Tycho retweeted
6 Jun 2024
It's time for the reckoning of tech twitter Introducing Shiptalkers - find out if the people on Twitter actually ship code or if it's just shiptalk Live now, let's see your ratios - starting off with @thdxr who tweets 302% more than he commits
1 Jun 2024
launching a project tomorrow that will either bring balance to or destroy tech twitter let me know if you want to beta test it
119
112
1,291
469,108
Tycho retweeted
How to make a Blazor developer cry in 5 words or less?
51
3
30
21,832
Tycho retweeted
David Attenborough is now narrating my life Here's a GPT-4-vision @elevenlabs python script so you can star in your own Planet Earth:
689
4,416
25,760
3,994,996
Tycho retweeted
My first PhD paper!🎉We learn *diffusion* models for code generation that learn to directly *edit* syntax trees of programs. The result is a system that can incrementally write code, see the execution output, and debug it. 🧵1/n
111
583
5,367
742,439