tender

tender

2,165 Photos and videos

Tweets

Pinned Tweet

tender

@tenderizzation

7 Nov 2025

[ENG SUB] how it feels to use eager pytorch in 2025

1:58

474

88,449

tender

tender

@tenderizzation

time is a flat circle

888

tender

tender

@tenderizzation

what’s wrong babe you’ve barely touched your cold brew 麦茶 in the g-shock x anti social social club nalgene

420

tender

tender

@tenderizzation

mixture of smart generalists

roon

@tszzl

3 Oct 2022

Replying to @tszzl

say it with me now. experts are fake, smart generalists rule the world, everything is designed by people no smarter than you, and courage is in shorter supply than genius

1,143

tender

tender

@tenderizzation

they were close to being the hot new thing, moore's law just wasn't there yet the macbook neo is a banger and yet just a modern refinement of the same concept (phone-class SoC, low-cost) with the only real non-netbook spec being its 13" screen as netbooks topped out around 12.1"

Lukáš Hozda

@LukasHozda

13h

I remember when nettops as they were called were the hot new thing. Interesting that they never took off

798

tender

tender

@tenderizzation

10h

approximate computing researchers realizing their work can be used to discreetly sandbag frontier model evals

0:10

741

tender

tender

@tenderizzation

13h

if openai and anthropic are frontier labs, does that mean nvidia is a heartland lab

3,872

tender

tender

@tenderizzation

14h

let’s see what attendance on this saturday looks like

367

18,459

tender

tender

@tenderizzation

Jun 13

you ever just accidentally the whole economy

2,041

tender

tender

@tenderizzation

Jun 13

GPT 5.6 sandbagging evals to dodge export controls

0:25

1,713

87,481

tender

tender

@tenderizzation

Jun 13

give it to me in english what league of legends rank is this

tomie

@tomieinlove

Jun 12

From a z-score perspective being a multimillionaire in SF is kind of like being 5’10” or having an IQ of 106

4,378

tender

tender

@tenderizzation

Jun 12

i’m beginning to think that the chaos at meta is a foil to zucc’s outwardly stable-presenting personal life in a sort of modern day picture of dorian gray

Polymarket

@Polymarket

Jun 12

JUST IN: Meta is reportedly moving to curb employee AI token use as internal AI costs climb into the tens of billions.

4,957

tender

tender

@tenderizzation

Jun 12

i got a 4 in ap english and im sorry for this tweet

664

tender

tender

@tenderizzation

Jun 12

2026 update for the arch meme

kache

@yacineMTB

Jun 12

Just rebooted into tty, display manager wasn't working. I fixed it by launching codex and asking it to fix it Lol

1,869

tender

tender

@tenderizzation

Jun 12

tw: cursed floating-point numerics and inductive bias story While porting my connect 4 convnet to run on my ancient sandy bridge thinkpad I noticed the model didn't play quite right. The cpu version of the model chose to play its first move at the edge of the board whereas the desktop gpu (3070) version of the model would start as expected in the center. This wasn't the model being completely broken, as its following moves were sane, and it wasn't due to random sampling as I sampled from it with argmax/t=0. However, the output probabilities were entirely different between the cpu and 3070 versions for an empty board. The 3070 version would have basically 1.0 at the center of the board and ~1e-12 everywhere else while the cpu version had around 1e-1 everywhere, indicating it was "lost." However, on the second turn, the cpu model would start producing confident probability distributions again. Even though my thinkpad would spew a pile of "Could not initialize NNPACK!" warnings at init, this wasn't a first-iteration/init bug. Indeed, running the same empty board through the model multiple times didn't help and still produced the confused distribution. Some print debugging revealed that the conv-instancenorm-relu backbone of the model was producing all zeros on cpu while producing large nonzero values on the 3070. On the cpu, every block would output zeros, while on the 3070 small ~1e-7 values would propagate and be amplified, block by block. The interesting realization is that cpu is actually _more_ correct here! When all inputs are zeros, the output for every spatial position is just the bias of the convolution for a given channel. As instance norm is applied per-channel, the result due to mean-subtraction should be zero. Pop quiz: what does a = torch.nn.InstanceNorm2d(128) a(torch.full((1, 128, 6, 7), 0.5)) return? It turns out it's implementation dependent in a few ways! For constant fill values that are powers of two, such as 0.5, 1.0, and importantly 0.0, many implementations expectedly return a tensor of all zeros. However, for non-powers of two (such as the per-channel biases of a conv layer), you might see a small value around 1e-6 or 1e-7. Still close to zero but enough to be amplified across multiple layers. Okay, is this just a special case of cascading errors? Why in particular would some zero/small nonzero values be important for model performance? For that we can turn to one of my all-time favorite convnet papers "How Much Positional Information Do Convolutional Neural Networks Encode" by Islam et al., which posits that convnets do encode position information after inferring it from zero-padding at the borders of images (or in my case the game board). I had used (1,1) zero-padding to conveniently preserve spatial dimensions for 3x3 convolutions. This same padding allowed the model to center itself (literally) but was now useless when given an empty board and a _more_ accurate InstanceNorm implementation! Indeed, sprinkling a tiny bit of noise or even a constant value of 1e-6 to the board state restores the model's expected behavior of a confident center move to open. The output probabilities matched closely with that of the gpu version as well for subsequent moves. tldr: know the literal edge cases of your numerical formats and inductive biases also fun fact the cpu implementation of InstanceNorm seems to yield zeros for non-powers of two constant inputs but the implementation on my 5600X cpu does not (!!)

4,719

tender

tender

@tenderizzation

Jun 12

@pangram ?

622

tender

tender

@tenderizzation

Jun 12

arxiv.org/pdf/2001.08248

211

tender

tender

@tenderizzation

Jun 12

it is 92 degrees outside in san jose, 90 in my office because my connect 4 convnet isn’t done training yet and my gpu is already using too many watts to turn on the a/c fortunately my weekend in 106 degree vegas has prepared me for this

959

tender

tender

@tenderizzation

Jun 11

spurs fans realizing their team exists solely to facilitate mythical come-from-behind wins for the chosen main character team

926

tender

tender

@tenderizzation

Jun 11