Something new coming soon

Joined October 2018
Apr 3
please excuse the silence. we've been cooking up something cool and are excited to share more details soon
Jan 29
NVIDIA just dropped benchmarks showing 4-bit inference loses less than 1 point vs BF16 on most tasks. It's not accuracy per request that you should be measuring. It's tasks completed per dollar. And at that metric, 4-bit wins by a landslide. Read the full blog 👇
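A minimal sketch of the "tasks completed per dollar" metric described above. The function name `tasks_per_dollar` and every number below are hypothetical placeholders chosen for illustration, not figures from the NVIDIA benchmarks or the blog post:

```python
def tasks_per_dollar(success_rate: float, cost_per_request: float) -> float:
    """Expected number of successfully completed tasks per dollar spent."""
    if cost_per_request <= 0:
        raise ValueError("cost_per_request must be positive")
    return success_rate / cost_per_request


# Hypothetical inputs: 4-bit gives up one point of accuracy
# but is assumed to be cheaper to serve per request.
bf16 = tasks_per_dollar(success_rate=0.80, cost_per_request=0.0040)
four_bit = tasks_per_dollar(success_rate=0.79, cost_per_request=0.0015)

print(f"BF16:  {bf16:.0f} tasks per dollar")
print(f"4-bit: {four_bit:.0f} tasks per dollar")
```

Under these assumed inputs, a small accuracy loss is outweighed by the lower cost per request; with real measurements the gap could be larger or smaller.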
Jan 24
1/ Yesterday we announced mdspan-cute: C++23 std::mdspan syntax with CUTLASS cute layouts. One header. Zero overhead. Here's how it works 🧵
Jan 24
7/ Layout algebra is formalized in Lean 4. 26 theorems, 0 sorry. Properties extracted to RapidCheck tests. The art/ directory has 23 SVG visualizations - we drew pictures until we understood.
Jan 24
8/ Check out the code: github.com/weyl-ai/mdspan-cu… Check out the Proofs: github.com/weyl-ai/mdspan-cu… /end

Jan 23
💿 Open Source Release 💿 mdspan-cute: a zero-overhead bridge between C++23 std::mdspan and CUTLASS cute layouts. One header. Swizzled memory. No bank conflicts. Read the blog and check out the repo (links in reply)
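To illustrate why swizzling removes bank conflicts, here is a small Python simulation of a generic XOR swizzle. It is not the mdspan-cute or cute API; it assumes 32 four-byte shared-memory banks and a 32x32 tile of 4-byte elements, and the helper names are made up for this sketch:

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 four-byte banks
TILE = 32       # assumption: 32x32 tile of 4-byte elements


def bank_linear(row: int, col: int) -> int:
    """Bank hit by element (row, col) in a plain row-major layout."""
    return (row * TILE + col) % NUM_BANKS


def bank_swizzled(row: int, col: int) -> int:
    """Bank hit when the column index is XOR-ed with the row index."""
    return (row * TILE + (col ^ row)) % NUM_BANKS


def max_conflict(bank_fn, col: int) -> int:
    """Largest number of threads hitting one bank when thread t reads (t, col)."""
    counts = Counter(bank_fn(row, col) for row in range(TILE))
    return max(counts.values())


linear_worst = max(max_conflict(bank_linear, c) for c in range(TILE))
swizzled_worst = max(max_conflict(bank_swizzled, c) for c in range(TILE))

print("row-major column read, worst bank load:", linear_worst)
print("swizzled column read, worst bank load:", swizzled_worst)
```

In the row-major layout every element of a column maps to the same bank, so a column read serializes. XOR-ing the column with the row permutes each row differently, so the same column read touches each bank once.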
Jan 22
5/ Quantized RoPE already runs in:
→ LLaMA
→ Mistral
→ Most open-source inference stacks
This isn't obscure. It's foundational.
Jan 22
6/ On "bit augmentation": Log/exp is a bijection. Information in = information out. You can't create precision from a reversible transformation. Thermodynamics doesn't allow it.
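One way to read the point above, sketched in Python: if values are quantized to a 4-bit code in the log domain and mapped back with exp, the result still has at most 16 distinct values. The uniform quantizer, the input range, and all names here are assumptions made for the illustration:

```python
import math


def quantize(x: float, lo: float, hi: float, levels: int) -> float:
    """Snap x to one of `levels` evenly spaced points in [lo, hi]."""
    step = (hi - lo) / (levels - 1)
    clamped = min(max(x, lo), hi)
    idx = round((clamped - lo) / step)
    return lo + idx * step


LEVELS = 16  # a 4-bit code has 16 representable values

# Hypothetical inputs: 9000 distinct values in [1, 10).
inputs = [1.0 + 0.001 * i for i in range(9000)]

log_lo, log_hi = math.log(1.0), math.log(10.0)

# Quantize in the log domain, then map back through exp.
decoded = {
    math.exp(quantize(math.log(x), log_lo, log_hi, LEVELS)) for x in inputs
}

print("distinct inputs:", len(set(inputs)))
print("distinct decoded values:", len(decoded))
```

Because exp maps each quantized log value to exactly one output, the round trip changes where the 16 levels sit but cannot add levels.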
Jan 20
1/ Yesterday we announced nix2gpu - a NixOS package for portable GPU containers. Portable containers prevent lock-in to a single inference vendor. Here's why it's a big deal. #Nix #AIInfra
Jan 20
7/ Why it matters: Makes distributed GPU compute easy and deterministic. Philosophy: It's just Linux with libs - complexity is optional. Open-source, MIT-licensed; production-tested on Fleek machines.