Joined May 2011
798 Photos and videos
14 Dec 2025
std::move() not actually moving anything is one of the marvelous things about the C++ standard library
34
31 Oct 2025
Reached 10
28
29 Oct 2025
That is one hell of an optimized kernel
29
Mase retweeted
New fridge magnet: CILANTRO
25
1,782
23,897
244,489
Mase retweeted
21 Sep 2025
She dumped me last night. Not because I don't listen. Not because I'm always on my phone. Not even because I forgot our anniversary (twice). But because, in her exact words: "You only pay attention to the parts of what I say that you think are important."

I stared at her for a moment and realized... She just perfectly described the attention mechanism in transformers. Turns out I wasn't being a bad boyfriend. I was being mathematically optimal.

See, in conversations (and transformers), you don't give equal weight to every word. Some words matter more for understanding context. Attention figures out exactly HOW important each word should be.

Here's the beautiful math:
Attention(Q, K, V) = softmax(QK^T / √d_k) V

Breaking it down:
Q (Query): "What am I looking for?"
K (Key): "What info is available?"
V (Value): "What is that info?"
d_k: Key dimension (for scaling)

Think library analogy: You have a question (Query). Books have titles (Keys) and content (Values). Attention finds which books are most relevant.

Step-by-step with "The cat sat on the mat":

Step 1: Create Q, K, V
Each word → three vectors via learned matrices W_Q, W_K, W_V
For "cat":
Query: "What should I attend to when processing 'cat'?"
Key: "I am 'cat'"
Value: "Here's cat info"

Step 2: Calculate scores
QK^T = how much each word should attend to the others
Processing "sat"? High similarity with "cat" (cats sit) and "mat" (where sitting happens).

Step 3: Scale by √d_k
Dividing by √d_k prevents dot products from getting too large and keeps softmax balanced.

Step 4: Softmax
Converts scores to probabilities that sum to 1. For "sat":
"cat": 0.4 (subject)
"sat": 0.3 (action)
"mat": 0.2 (location)
"on": 0.05 (preposition)
"the": 0.05 (article)

Step 5: Weight values
Multiply each word's value by its attention weight, sum up. Now "sat" knows it's most related to "cat" and "mat".

Multi-Head Magic:
Transformers do this multiple times in parallel:
Head 1: Subject-verb relationships
Head 2: Spatial ("on", "in", "under")
Head 3: Temporal ("before", "after")
Head 4: Semantic similarity
Each head learns different relationship types.

Why This Changed Everything:
Before: RNNs = reading with a flashlight (one word at a time, forget the beginning)
After: Attention = floodlights on the entire sentence with dimmer switches

This is why ChatGPT can:
Remember what you said 50 messages ago
Know "it" refers to something specific
Understand "bank" = money vs river based on context

The Kicker:
Models learn these patterns from data alone. Nobody programmed grammar rules. It figured out language structure just by predicting next words.

Attention is how AI learned to read between the lines. Just like my therapist helped me understand my focus patterns, maybe understanding transformers helps us see how we decide what matters.

Now if only I could implement multi-head attention in dating... 🤖 Still waiting for "scaled dot-product listening" to be invented.
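The five steps in the post above can be sketched in a few lines of plain Python. This is only a minimal, single-head illustration: the function names `softmax` and `attention`, the toy 2-dimensional vectors, and the choice to skip the learned matrices W_Q, W_K, W_V (using Q = K = V directly) are all assumptions made for the example, not anything from the original post.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V are lists of row vectors, one row per token.
    Returns (outputs, weights), one row per query.
    """
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        # Steps 2-3: dot product of this query with every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        # Step 4: turn the scores into weights that sum to 1
        w = softmax(scores)
        # Step 5: weighted sum of the value vectors
        out = [sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Hypothetical toy vectors for three tokens (Step 1 is skipped: no learned projections).
tokens = ["the", "cat", "sat"]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(X, X, X)

for token, row in zip(tokens, weights):
    print(token, [round(p, 3) for p in row])
```

Each printed row shows how one token spreads its attention over all three tokens. A multi-head version would run the same function several times with different projections and concatenate the outputs.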
100
250
2,633
196,136
6 Aug 2025
👀 videoscope.org waitlist is now live #AI #Video

22
1 Aug 2025
1
3
49
Mase retweeted
30 Jul 2025
me in my first Dark Souls playthrough because I didn't understand what humanity was for
18
890
12,406
263,305
30 Jul 2025
26
Mase retweeted
28 Jul 2025
Anyway, well, thanks a lot for the analysis, and what a shame you can't do anything about it, Minister of Labour and Social Economy since January 2020
🗣️ Yolanda Díaz: "I did the shopping two days ago, I bought fruit and it cost me 30 euros; there are many families who can't do this."
24
5,130
27,794
689,497
Mase retweeted
28 Jul 2025
Ok fine. Sure. Whatever. We can save PNGs to birds. I give up. You win world
This is one of the craziest ideas I've ever seen. He converted a drawing of a bird into a spectrogram (PNG -> Soundwave) then played it to a starling, which sang it back, reproducing the PNG. Using the bird's brain as a hard drive with 2mbps read/write speed. youtube.com/watch?si=HMtVdHB…
45
2,609
31,482
923,009
23 Jul 2025
Faster than realtime in 1080p30
34
22 Jul 2025
Notice any difference?
30
21 Jul 2025
Building...
1
3
188
Mase retweeted
You can carbon-date those of us who use docker-compose with a hyphen.
1
1
4
145
3 Jul 2025
Just registered videoscope.org. Stay tuned over the coming months :) #upcoming

35
13 Jun 2025
Looks like things are about to get really, really messy
40
Mase retweeted
29 May 2025
i feel this deeply at times
50
333
3,094
114,425
Mase retweeted
27 May 2025
to avoid burnout at work use the 30-30 rule: after 30 minutes of work, quit your job and disappear into the mountains for 30 years.
462
38,001
284,949
8,165,700