Linear Algebra Kernels For The Age Of Research
In other words, GPUs can do more than just matmuls.
Can you make it fast?
Launching a new kernel competition: Linear Algebra Kernels For The Age Of Research.
First problem: batched QR decomposition on B200. Old math, modern hardware.
Prize: rare swag and a hangout in SF.
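
For anyone new to the first problem: a QR decomposition factors a matrix A into Q (orthonormal columns) times R (upper triangular), and "batched" means doing that for many independent matrices at once. Below is a rough CPU sketch in plain Python, using modified Gram-Schmidt as one illustrative method. The names `qr_mgs` and `batched_qr` are made up for this example, it assumes each matrix has at least as many rows as columns and full column rank, and it is not the competition's reference implementation or anything resembling a B200 kernel.

```python
import math


def qr_mgs(a):
    """QR of one m x n matrix (list of rows) via modified Gram-Schmidt.

    Assumes m >= n and full column rank (illustrative sketch only).
    Returns (q, r): q is m x n with orthonormal columns, r is n x n upper triangular.
    """
    m, n = len(a), len(a[0])
    # Work column by column: v[j] holds the (progressively orthogonalized) column j.
    v = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Normalize the current column to get the j-th column of Q.
        r[j][j] = math.sqrt(sum(x * x for x in v[j]))
        q_j = [x / r[j][j] for x in v[j]]
        q_cols.append(q_j)
        # Remove the q_j component from every remaining column.
        for k in range(j + 1, n):
            r[j][k] = sum(qx * vx for qx, vx in zip(q_j, v[k]))
            v[k] = [vx - r[j][k] * qx for vx, qx in zip(v[k], q_j)]
    # Convert Q from a list of columns back to a list of rows.
    q = [[q_cols[j][i] for j in range(n)] for i in range(m)]
    return q, r


def batched_qr(batch):
    """Factor each matrix in the batch independently."""
    return [qr_mgs(a) for a in batch]
```

Every matrix in the batch is independent of the others, which is what makes the problem a natural fit for a GPU: the loop in `batched_qr` is the part a kernel would spread across the hardware.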