Vlad ☁️

Vlad ☁️

99 Photos and videos

Tweets

Tycho retweeted

Vlad ☁️ @vlad

24 Jun 2025

Setting up email shouldn't suck. Shoutbox.net makes it painless — you'll be sending in no time. No waiting. No pain. Just send.

16,112

Jonathan Edwards

Tycho retweeted

Jonathan Edwards

@jonathoda

11 Mar 2025

The LIVE Programming Workshop will be held online this year! So no excuses - submit by July 21. @liveprog2021 [link in reply]

1,773

Tycho

Tycho @luyben

15 Oct 2024

He @ContaboCom , I cannot reach one of my VPSs again, and it's not even weekend!

Tycho

Tycho @luyben

15 Oct 2024

Must say I haven't had downtime during the week yet, but support seems to not respond now either? Can you help?

Geoff Keighley

Tycho retweeted

Geoff Keighley

@geoffkeighley

17 Sep 2024

“Make it more epic.”  EA Chief Strategy officer Mihir Vaidya demonstrates how EA plans to super-charge user-created game content with AI in this concept video "Imagination to Creation" from EA’s Investor Day event in NYC today.

4:14

485

108

932

891,102

Gene Kim

Tycho retweeted

Gene Kim

@RealGeneKim

9 Sep 2024

This is such an amazing talk from Dr. Erik Meijer (@headinthebox, famous for his work on Visual Basic, C#, LINQ, Hack), on how LLMs upended his research, and are changing coding and what developers do. I've clipped some of my fave parts of his talk: - His team found that the specialized models they built to do codegen, and find/fix bugs at Meta were completely outclassed by ChatGPT. They were surprised that ChatGPT could write Hack code, despite it not being used widely outside of Facebook. (@DynamicWebPaige has talked about the phenomenon of Gemini and general-purpose frontier models outperforming and replacing older/smaller specialized models built over the years throughout Google.) Source: youtube.com/watch?v=SsJqmV3W…

1:10

162

1,081

250,961

Vlad ☁️

Tycho retweeted

Vlad ☁️ @vlad

30 Aug 2024

Why do all websites look the same? 🤔 Because we'd rather ship now than pixel-push forever! Ship fast, redesign later! 🚀 Here are 20 Next.js component libraries to fuel your next MVP: (thread) 🧵👇 #IndieHackers #BuildInPublic

1,398

Geoffrey Litt

Tycho retweeted

Geoffrey Litt

@geoffreylitt

21 Jun 2024

LIVE workshop submissions due in a couple weeks, July 7! If you've been thinking about submitting some work, great time to put together a demo / short paper

3,274

Vivek Sen

Tycho retweeted

Vivek Sen

@Vivek4real_

18 Jun 2024

🤣

189

695

10,220

742,194

Abakcus

Tycho retweeted

Abakcus

@abakcus

10 Jun 2024

The beauty of math book covers from @doverpubs. 😍

606

3,765

365,288

Security Trybe

Tycho retweeted

Security Trybe

@SecurityTrybe

10 Jun 2024

Nmap

399

2,287

304,583

Akshay 🚀

Tycho retweeted

Akshay 🚀

@akshay_pachaar

9 Jun 2024

5 techniques to fine-tune LLMs, explained visually! Fine-tuning large language models traditionally involved adjusting billions of parameters, demanding significant computational power and resources. However, the development of some innovative methods have transformed this process. Here’s a snapshot of five cutting-edge techniques for finetuning LLMs, each explained visually for easy understanding. LoRA: - Introduce two low-rank matrices, A and B, to work alongside the weight matrix W. -Adjust these matrices instead of the behemoth W, making updates manageable. LoRA-FA (Frozen-A): - Takes LoRA a step further by freezing matrix A. - Only matrix B is tweaked, reducing the activation memory needed. VeRA: - All about efficiency: matrices A and B are fixed and shared across all layers. - Focuses on tiny, trainable scaling vectors in each layer, making it super memory-friendly. Delta-LoRA: - A twist on LoRA: adds the difference (delta) between products of matrices A and B across training steps to the main weight matrix W. - Offers a dynamic yet controlled approach to parameter updates. LoRA : - An optimized variant of LoRA where matrix B gets a higher learning rate. This tweak leads to faster and more effective learning. Credits to @_avichawla for great visualisation! 👏

304

1,285

102,946

Dabo

Tycho retweeted

Dabo

@chendabo

9 Jun 2024

I recreated an entire GPT architecture in a spreadsheet. It is a nanoGPT designed by @karpathy with about 85000 parameters, small enough to be packed into a spreadsheet file. It is great for learning about how transformer works as it shows all the data and parameters going through a transformer pipeline, and all the calculations are actually working inside those cells. Here is a link to the project where you can download the file and play with it yourself. No coding needed, it is just a spreadsheet. 😉 Spreadsheet is all you need: github.com/dabochen/spreadsh… Also thanks to @BrendanBycroft with his LLM visualisation project which inspired this.

749

5,282

741,209

Andrej Karpathy

Tycho retweeted

Andrej Karpathy

@karpathy

9 Jun 2024

📽️ New 4 hour (lol) video lecture on YouTube: "Let’s reproduce GPT-2 (124M)" youtu.be/l8pRSuU81PU The video ended up so long because it is... comprehensive: we start with empty file and end up with a GPT-2 (124M) model: - first we build the GPT-2 network - then we optimize it to train very fast - then we set up the training run optimization and hyperparameters by referencing GPT-2 and GPT-3 papers - then we bring up model evaluation, and - then cross our fingers and go to sleep. In the morning we look through the results and enjoy amusing model generations. Our "overnight" run even gets very close to the GPT-3 (124M) model. This video builds on the Zero To Hero series and at times references previous videos. You could also see this video as building my nanoGPT repo, which by the end is about 90% similar. Github. The associated GitHub repo contains the full commit history so you can step through all of the code changes in the video, step by step. github.com/karpathy/build-na… Chapters. On a high level Section 1 is building up the network, a lot of this might be review. Section 2 is making the training fast. Section 3 is setting up the run. Section 4 is the results. In more detail: 00:00:00 intro: Let’s reproduce GPT-2 (124M) 00:03:39 exploring the GPT-2 (124M) OpenAI checkpoint 00:13:47 SECTION 1: implementing the GPT-2 nn.Module 00:28:08 loading the huggingface/GPT-2 parameters 00:31:00 implementing the forward pass to get logits 00:33:31 sampling init, prefix tokens, tokenization 00:37:02 sampling loop 00:41:47 sample, auto-detect the device 00:45:50 let’s train: data batches (B,T) → logits (B,T,C) 00:52:53 cross entropy loss 00:56:42 optimization loop: overfit a single batch 01:02:00 data loader lite 01:06:14 parameter sharing wte and lm_head 01:13:47 model initialization: std 0.02, residual init 01:22:18 SECTION 2: Let’s make it fast. GPUs, mixed precision, 1000ms 01:28:14 Tensor Cores, timing the code, TF32 precision, 333ms 01:39:38 float16, gradient scalers, bfloat16, 300ms 01:48:15 torch.compile, Python overhead, kernel fusion, 130ms 02:00:18 flash attention, 96ms 02:06:54 nice/ugly numbers. vocab size 50257 → 50304, 93ms 02:14:55 SECTION 3: hyperpamaters, AdamW, gradient clipping 02:21:06 learning rate scheduler: warmup cosine decay 02:26:21 batch size schedule, weight decay, FusedAdamW, 90ms 02:34:09 gradient accumulation 02:46:52 distributed data parallel (DDP) 03:10:21 datasets used in GPT-2, GPT-3, FineWeb (EDU) 03:23:10 validation data split, validation loss, sampling revive 03:28:23 evaluation: HellaSwag, starting the run 03:43:05 SECTION 4: results in the morning! GPT-2, GPT-3 repro 03:56:21 shoutout to llm.c, equivalent but faster code in raw C/CUDA 03:59:39 summary, phew, build-nanogpt github repo

412

2,169

15,351

1,526,197

GamesYouLoved

Tycho retweeted

GamesYouLoved

@gamesyouloved

8 Jun 2024

Happy Ghostbusters Day!! Ghostbusters 1984 - 2024 40 years old this year! 👻 Atari 2600 video: brokenarcade 🕹️

0:35

545

20,331

Minh-Phuc Tran

Tycho retweeted

Minh-Phuc Tran

@phuctm97

6 Jun 2024

What if ChatGPT was made in 1995… for real? 🤯 I was bored for a few days and decided to remake my portfolio in Windows 95 style with a ChatGPT-like AI 😎 It is 100% local, no API calls, no server costs, even for the ChatGPT-like app! Try it out 👉 phuctm97.com

0:48

8,229

Rhys

Tycho retweeted

Rhys

@RhysSullivan

6 Jun 2024

It's time for the reckoning of tech twitter Introducing Shiptalkers - find out if the people on Twitter actually ship code or if it's just shiptalk Live now, let's see your ratios - starting off with @thdxr who tweets 302% more than he commits

0:11

Rhys

@RhysSullivan

1 Jun 2024

launching a project tomorrow that will either bring balance to or destroy tech twitter let me know if you want to beta test it

119

112

1,291

469,108

Mladen Macanović

Tycho retweeted

Mladen Macanović @MladenMacanovic

4 Jun 2024

How to make a Blazor developer cry in 5 words or less?

21,832

Charlie Holtz

Tycho retweeted

Charlie Holtz

@charlieholtz

15 Nov 2023

David Attenborough is now narrating my life Here's a GPT-4-vision @elevenlabs python script so you can star in your own Planet Earth:

1:32

689

4,416

25,760

3,994,996

Shreyas Kapur

Tycho retweeted

Shreyas Kapur @shreyaskapur

3 Jun 2024

My first PhD paper!🎉We learn *diffusion* models for code generation that learn to directly *edit* syntax trees of programs. The result is a system that can incrementally write code, see the execution output, and debug it. 🧵1/n

0:13

111

583

5,367

742,439