building gpu visualizations

Joined April 2026
Pinned Tweet
this is how i wish i learned GPU fundamentals. not a lengthy textbook. not a static image. every concept is an interactive visualization, covering the SM architecture, memory coalescing, synchronization, and more. what concepts do you want to see next? brrrviz.com
wafer.ai stays cooking

brrrviz.com is quite nice
I'm on vacation in Hong Kong and just shipped BrrrViz Chapter 11: Tiling from my hotel room. It's 6 interactive visuals that will help you grasp the concept. It's 1:43am. I'm tired. Hope it helps and go check it out :) brrrviz.com
You launch a million threads and have them queue up to write to one address, one at a time. That's atomicAdd. It's correct. It's also a for-loop on a parallel computer. Switch to a reduction tree and you fix the bottleneck but introduce a new one: by the final step, all but one thread is idle.
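A minimal sketch of the reduction-tree trade-off, simulated in plain Python rather than run on a GPU. The function name `tree_reduce` and the power-of-two input length are assumptions made for illustration. It records how many simulated threads do useful work at each step.

```python
def tree_reduce(values):
    """Sum `values` with a pairwise reduction tree, as one thread block might.

    Returns (total, active_per_step), where active_per_step[i] is the number
    of simulated threads that perform an add in step i.
    """
    data = list(values)
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two length"
    active_per_step = []
    stride = n // 2
    while stride > 0:
        # Threads 0..stride-1 each add their partner's element; the rest idle.
        for tid in range(stride):
            data[tid] += data[tid + stride]
        active_per_step.append(stride)
        stride //= 2
    return data[0], active_per_step


total, active = tree_reduce(range(8))
print(total)   # 28
print(active)  # [4, 2, 1]
```

Eight serialized adds become three steps, but the active thread count halves each step until only one thread is working.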
Four interactive slides walk through the optimizations:
1. shared memory
2. warp packing
3. minimize bank conflicts
4. thread coarsening
Free at brrrviz.com

kyle yu retweeted
great, intuitive resource. worth a few mins playing with as a refresher even if you've been through the fundamentals
this is how i wish i learned GPU fundamentals. not a lengthy textbook. not a static image. every concept is an interactive visualization, covering the SM architecture, memory coalescing, synchronization, and more. what concepts do you want to see next? brrrviz.com
kyle yu retweeted
HOLY JESUS THIS IS AMAZING
Replying to @goyal__pramod
check out brrrviz.com for more gpu visuals 🤙
Most GPU bugs don't crash your program. They just give you the wrong answer. Silently. When thousands of threads try to update the same memory address simultaneously, each one does three things: 📖 read the current value ⚡ execute their computation ✍ write back the result
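The three steps above can be sketched as a deterministic simulation in plain Python. This is not real GPU code. The function names `racy_increment` and `atomic_increment` are hypothetical, and the interleaving shown is an assumed worst case where every thread reads before any thread writes.

```python
def racy_increment(counter, num_threads):
    """Unsynchronized read-modify-write under a worst-case interleaving."""
    reads = [counter for _ in range(num_threads)]  # read the current value
    results = [r + 1 for r in reads]               # execute the computation
    for r in results:                              # write back the result
        counter = r                                # each write clobbers the last
    return counter


def atomic_increment(counter, num_threads):
    """Each read-modify-write completes before the next one begins."""
    for _ in range(num_threads):
        counter = counter + 1
    return counter


print(racy_increment(0, 1000))    # 1
print(atomic_increment(0, 1000))  # 1000
```

Nothing crashes in the racy version. It returns 1 instead of 1000 because 999 updates were silently overwritten.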
The cost: serialization. Threads queue at the address one at a time. The more threads contend for the same location, the more your parallelism collapses into a bottleneck. This is why real GPU kernels accumulate locally in registers first, then do a single atomicAdd at the end.
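A rough sketch of why local accumulation helps, again simulated in plain Python. The names `sum_naive` and `sum_local_first` and the strided work split are assumptions for illustration. The `atomics` counter stands in for the number of serialized atomicAdd operations.

```python
def sum_naive(values):
    """One simulated atomicAdd per element."""
    total, atomics = 0, 0
    for v in values:
        total += v   # stands in for atomicAdd(&total, v)
        atomics += 1
    return total, atomics


def sum_local_first(values, num_threads):
    """Each simulated thread sums a strided slice privately, then adds once."""
    total, atomics = 0, 0
    for tid in range(num_threads):
        local = 0  # stands in for a per-thread register
        for v in values[tid::num_threads]:
            local += v
        total += local  # the single atomicAdd for this thread
        atomics += 1
    return total, atomics


data = list(range(1024))
print(sum_naive(data))            # (523776, 1024)
print(sum_local_first(data, 32))  # (523776, 32)
```

Both versions produce the same sum, but the second one contends for the shared address 32 times instead of 1024.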
Chapter 9 of BrrrViz walks you through both scenarios. brrrviz.com

kyle yu retweeted
Train yourself in inference/kernel engineering. Knowing how to optimize GPU kernels well for inference workloads is worth its weight in gold. Mastering CUDA or Triton, vLLM, SGLang, and TensorRT-LLM is a real plus if you want to stand out in 2026-2027 as an AI/ML Engineer.
Stop tuning the wrong bottleneck. GPU optimization isn’t one ceiling. It’s two: memory bandwidth and peak compute. The roofline plots both, so you see which one limits your kernel.
Memory-bound means your hardware is waiting on data. Fix data movement, locality, and reuse. Compute-bound means the data is there, but the math is slow on the hardware. Lower the precision, use tensor cores, or change the instruction path.
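A small sketch of the roofline calculation in Python. The hardware numbers (`PEAK`, `BW`) are hypothetical and not taken from any real GPU, and the function name `roofline` is an assumption. Attainable throughput is the lower of peak compute and bandwidth times arithmetic intensity.

```python
def roofline(peak_flops, bandwidth, flops, bytes_moved):
    """Classify a kernel from its FLOP count and its memory traffic in bytes."""
    intensity = flops / bytes_moved                      # FLOP per byte
    attainable = min(peak_flops, bandwidth * intensity)  # FLOP/s ceiling
    ridge = peak_flops / bandwidth                       # where the two roofs meet
    bound = "memory-bound" if intensity < ridge else "compute-bound"
    return intensity, attainable, bound


PEAK = 100e12  # hypothetical peak compute: 100 TFLOP/s
BW = 2e12      # hypothetical memory bandwidth: 2 TB/s

# float32 vector add of n elements: n FLOPs, 12n bytes (two loads, one store)
n = 1_000_000
print(roofline(PEAK, BW, n, 12 * n)[2])  # memory-bound

# a kernel doing 200 FLOPs per byte moved sits past the ridge point (50)
print(roofline(PEAK, BW, 200, 1)[2])     # compute-bound
```

With these assumed numbers, the vector add can never exceed roughly 0.17 TFLOP/s however well the arithmetic is tuned, which is why the fix has to target data movement.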
Chasing utilization without this perspective often means optimizing the wrong thing. Knowing where your kernel sits on this diagram helps you pick the right optimizations. Find it in Chapter 3 of BrrrViz 👉 brrrviz.com/

kyle yu retweeted
i struggled a lot with visualizing GPU concepts. brrrviz seems like an incredible place to start understanding them visually.
kyle yu retweeted
life updates:
- panicking as a stupid nervous intern handling aws ec2 instances
- reading modal docs and brrrviz
- studying for end semester exams
- contemplating life choices; should i start over as a physics major?
Dropped a new landing page and announced Act 02: ML Systems. I plan on covering transformer architecture, flash attention, KV cache, speculative decoding, and more. If you've ever wanted to actually understand how to run models fast on the hardware, this is for you.
kyle yu retweeted
go to bed right now
i know the build is almost finished
the eval can wait til morning
the agent will still be failing tomorrow
you won't figure out why it's hallucinating
yes your coworker ships on 4 hrs of sleep
they also hallucinate a lot
off you go