Xander Chin

Tweets

Pinned Tweet

Xander Chin

@XanderChin

12 Nov 2025

hand-controlled boids

0:15

312

1,491

21,029

739,470

Xander Chin retweeted

Reads with Ravi

@readswithravi

Jun 15

This sentence from Carl Jung hits hard. “No matter how isolated you are and how lonely you feel, if you do your work truly and conscientiously, unknown friends will come and seek you.”

847

6,157

98,999

Xander Chin retweeted

Tim @llhtimlam

May 25

Want to see distributed computing explained via Pong? Inspired by TinyTPU and the TinyTapeout workshop at FOSSi, I wrote a paper in under a week that pairs this demo with a proposed next-gen optical I/O chip architecture & a roadmap to prototype it. Read it on GitHub: github.com/llhtimlam/tt_um_l…

GitHub - llhtimlam/tt_um_llhtimlam_DistributedPong

Contribute to llhtimlam/tt_um_llhtimlam_DistributedPong development by creating an account on GitHub.

github.com

106

21,940

Xander Chin retweeted

Satvik Garimella

@satvikgari

May 8

A few months ago, I saw Karpathy build NanoChat in PyTorch, and it made me want to understand how these models work underneath the abstractions. So I decided to try building one myself, but in a different framework: JAX. Here’s how I did it: 🧵

1,708

Xander Chin retweeted

saksham

@sakshambatraa

May 6

reinventing Groq's LPU with @michael_trbo: we got instruction-driven data movement working between SRAM memory blocks and MXM compute!!

0:28

6,535

Xander Chin retweeted

luthira

@luthiraabeykoon

May 2

We implemented @karpathy's MicroGPT fully on FPGA fabric. No GPU. No PyTorch. No CPU inference loop. Just a transformer burned into hardware, generating 50,000 tokens/sec. The model is small, but the idea is not: inference does not have to live only in software 👇

0:26

266

696

7,508

851,204

Xander Chin retweeted

surya

@suryasure05

Apr 14

anyone subletting a 1 bedroom apartment in Toronto this summer?

4,777

Xander Chin retweeted

arjun

@arjunharinath1

Apr 9

Replying to @satvikgari

@satvikgari and I have been building our own version of Nvidia’s Blackwell GPU. We just designed a 4x4 systolic array in Verilog! Here’s a breakdown of how it works and what we learned building it.

0:14

5,655

Xander Chin retweeted

evan

@evanliin

Apr 9

blog blog blog blah blah evanlin.ca/writing/exploring…

What I Learned Building Attention Residuals from Scratch

Naively reimplementing a paper in PyTorch changed how I think about how transformers route information, and about the gap between academic math and physical silicon.

evanlin.ca

101

7,251

Xander Chin retweeted

krupa

@krupaad

Apr 3

bit late to the recruiting cycle, but looking for a summer internship in ML/hardware/inference!! i've been working on CUDA kernel writing, FPGA acceleration, and RTL. would love to find a team doing similar work this summer. dual US/Canada citizen, can relocate anywhere. DMs open :)

264

30,842

Xander Chin retweeted

surya

@suryasure05

Apr 6

wrote an article breaking down the math behind TurboQuant by @GoogleResearch. I walk through a toy example using concrete numbers to show every single operation that goes on under the hood. link below:

0:12

115

927

76,373

Xander Chin

@XanderChin

Apr 3

seriously impressive stuff. give this man a follow

ani

@anirudhbv_ce

Apr 3

I implemented @GoogleResearch's TurboQuant as a CUDA-native compression engine on Blackwell B200. 5x KV cache compression on Qwen 2.5-1.5B, near-lossless attention scores, generating live from compressed memory.

5 custom cuTile CUDA kernels ft:
- fused attention (with QJL corrections)
- online softmax
- on-chip cache decompression
- pipelined TMA loads

Try it out: devtechjr.github.io/turboqua…

s/o @blelbach and the cuTile team at @nvidia for lending me Blackwell GPU access :) cc @sundeep @GavinSherry

4:47

1,032

115,703

Xander Chin retweeted

Satvik Garimella

@satvikgari

Apr 2

Recently @arjunharinath1 and I started building our own version of Nvidia's Blackwell GPU. We built the ALUs and a 4-lane SIMD core in Verilog. Here is a breakdown of how we did it.

181

11,920

Xander Chin retweeted

michael.trbo @michael_trbo

Apr 1

so @sakshambatraa and I are working on re-inventing Groq's LPU from scratch. this last week we implemented the VXM, the LPU's arithmetic unit. here's what we learned

229

12,587

Xander Chin retweeted

saksham

@sakshambatraa

Mar 30

for my next adventure, @michael_trbo and I will be working together to build a tinyLPU! for our first checkpoint, we reinvented the MXM: the language processing unit's matrix multiplication engine. here's how we did it

0:08

6,364

Xander Chin retweeted

saksham

@sakshambatraa

Mar 29

i think one of the problems w the current scene in tech is that a lot of people are attached to what comes with it, such as money, fame, etc. i don’t disagree that we all want to make money, but at some point, is attachment doing justice to yourself? if you’re starting a project because you think it will help you land a certain job, you’re chaining yourself down to the rewards and not your duty. because let’s say you don’t get the job you wanted, will you still cherish the project and the time you put into it? you’re entitled to your duty, but not the rewards that come with it.

1,654

Xander Chin retweeted

surya

@suryasure05

Mar 27

pre-symposium shenanigans

3,632

Xander Chin retweeted

Kenny Guo

@kenivinguo

Mar 25

from getting inspired by symposium last year to presenting on stage this year! live demoing tiny-tpu was insane. this year has been a wild adventure, and it's only getting started

0:28

130

9,262

Xander Chin retweeted

momentum @momentum_place

Mar 23

see how 4 university students reverse engineered Google's most advanced AI chip tomorrow

1:10

646

45,829

Xander Chin retweeted

Allie

@alspee

Mar 21

cheering (and lol’ing) for the pals from tiny-tpu @socraticainfo. this story never gets old 🙌 @XanderChin @suryasure05 @kennykgguo @evanliin

112

4,968

Xander Chin retweeted

luthira

@luthiraabeykoon

Feb 23

We built Talos - a full CNN inference engine running directly on silicon. Every multiply, buffer, and data path lives as real digital logic on the FPGA. This is what deep learning looks like when the model becomes hardware👇

0:25

109

1,202

92,184