Fleek

Tweets

Fleek

@fleek

Apr 3

please excuse the silence. we've been cooking up something cool and are excited to share more details soon

4,272

Fleek

@fleek

Jan 29

NVIDIA just dropped benchmarks showing 4-bit inference loses less than 1 point vs BF16 on most tasks. It's not accuracy per request that you should be measuring. It's tasks completed per dollar. And at that metric, 4-bit wins by a landslide. Read the full blog 👇

x.com/i/article/201692737652…

8,110

Fleek

@fleek

Jan 24

1/ Yesterday we announced mdspan-cute: C++23 std::mdspan syntax with CUTLASS cute layouts. One header. Zero overhead. Here's how it works 🧵

2,824

Fleek

@fleek

Jan 24

7/ Layout algebra is formalized in Lean 4. 26 theorems, 0 sorry. Properties extracted to RapidCheck tests. The art/ directory has 23 SVG visualizations - we drew pictures until we understood.

1,969

Fleek

@fleek

Jan 24

8/ Check out the code: github.com/weyl-ai/mdspan-cu… Check out the proofs: github.com/weyl-ai/mdspan-cu… /end

1,731

Fleek

@fleek

Jan 23

💿 Open Source Release 💿 mdspan-cute: a zero-overhead bridge between C++23 std::mdspan and CUTLASS cute layouts. One header. Swizzled memory. No bank conflicts. Read the blog and check out the repo (links in reply)

2,093

Fleek

@fleek

Jan 23

Read the blog: weyl.ai/plan/mdspan-cute/ Check out the repo: github.com/weyl-ai/mdspan-cu…

mdspan-cute: Zero-Overhead Bridge to CUTLASS | Weyl

C++23 std::mdspan meets CUTLASS cute layouts. One header. Zero cost. 26 theorems. 0 sorry.

weyl.ai

1,383

Fleek

@fleek

Jan 22

5/ Quantized RoPE already runs in:
→ LLaMA
→ Mistral
→ Most open-source inference stacks
This isn't obscure. It's foundational.

651

Fleek

@fleek

Jan 22

6/ On "bit augmentation": Log/exp is a bijection. Information in = information out. You can't create precision from a reversible transformation. Thermodynamics doesn't allow it.

561
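The bijection point in 6/ can be checked with a few lines of Python. This is only a sketch under stated assumptions: the 16 positive levels below are a hypothetical stand-in for a 4-bit code book, not the format discussed in the thread. It shows that passing the levels through log and back through exp leaves exactly as many distinct values as went in.

```python
import math

# Hypothetical 4-bit code book: 16 positive magnitudes (an assumption for illustration).
levels = [float(code + 1) for code in range(16)]

# Move into the log domain, then come back with exp.
log_domain = [math.log(v) for v in levels]
recovered = [math.exp(x) for x in log_domain]

# A bijection keeps distinct inputs distinct: 16 values in, 16 values out.
distinct_in = len(set(levels))
distinct_log = len(set(log_domain))

# The round trip returns the original values up to floating-point rounding.
round_trip_ok = all(
    math.isclose(a, b, rel_tol=1e-12) for a, b in zip(levels, recovered)
)

print(distinct_in, distinct_log, round_trip_ok)
```

The count of distinguishable values is what bounds the information carried, so a reversible change of representation cannot raise it.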

Fleek

@fleek

Jan 20

1/ Yesterday we announced nix2gpu - a NixOS package for portable GPU containers. Portable containers prevent vendor inference lock-in. Here's why it's a big deal. #Nix #AIInfra

1,986

Fleek

@fleek

Jan 20

7/ Why it matters: Makes distributed GPU compute easy and deterministic. Philosophy: It's just Linux with libs - complexity is optional. Open-source, MIT-licensed; production-tested on Fleek machines.

815

Fleek

@fleek

Jan 20

8/ Check out more info on nix2gpu: Full blog: weyl.ai/plan/portable-nix-gp… Repo: github.com/fleek-sh/nix2gpu Quickstart in README - test and send feedback! /End

Ruining GPU Market Owners' Day with the Power of Nix | Weyl

Build containers with nix2gpu that run on any GPU market

weyl.ai

759