Joined January 2025
After prolonged negotiations with softmax temperatures and causal semantics, Input-Adaptive Hard-Exit DARTS concluded that unnecessary FLOPs are a skill issue. Reviewer #2, your move. #AIResearch #DeepLearning #NAS
DARTS finally learned where the Exit button is 🚪🤖 CIFAR-10: cost_norm 0.991→0.772 (~23% less compute), test acc 0.862→0.854. Hard-exit now actually exits (not just a layer-4 fan club 😂). Paper: sameerresearchai.github.io/a… Code: github.com/sameerresearchai/…
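The numbers in the post above can be checked with a few lines of arithmetic. This is only a sketch: it assumes cost_norm is normalized so that 1.0 means full compute, which the post does not state, and the variable names are made up for illustration.

```python
# Figures quoted in the post (CIFAR-10).
cost_before, cost_after = 0.991, 0.772
acc_before, acc_after = 0.862, 0.854

# Assumption: a cost_norm of 1.0 corresponds to running the full network.
saving_vs_full = 1.0 - cost_after

# Alternative reading: saving relative to the earlier 0.991 run.
saving_vs_before = (cost_before - cost_after) / cost_before

# Accuracy change in percentage points.
acc_drop_points = (acc_before - acc_after) * 100

print(f"{saving_vs_full:.1%} {saving_vs_before:.1%} {acc_drop_points:.1f}")
# prints: 22.8% 22.1% 0.8
```

Under the first reading the saving rounds to the quoted ~23%; relative to the 0.991 run it is closer to 22%. Either way the cost is about 0.8 accuracy points.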
Cloudflare hid the taxes better than I hide API keys.
@Cloudflare Your domain checkout page needs some real transparency. Just bought a new domain. It showed $26.00 throughout the entire process. Got charged $30.68. The extra $4.68 in taxes was never mentioned once during checkout. Only found out via the invoice email. Please show the final all-in price (taxes included) upfront. It's a small change that greatly improves customer trust. Fix this.
Me: "2000 GPU days on RL-based NAS." DARTS: "Hold my gradient descent. Differentiable search in 4 days." Bilevel optimization: weights on training loss, architecture on validation loss. More math, less crying. sameerresearchai.github.io/a…
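The bilevel idea in the post above (weights follow the training loss, architecture parameters follow the validation loss) can be sketched on a toy problem. This is not the DARTS code from the linked page or repo: it is a first-order alternating scheme with hand-written gradients, one scalar weight `w`, one architecture logit `alpha`, and two hypothetical candidate ops (`x` and `x**2`) fitted to the target `y = 2x`. All names and data points are assumptions chosen for illustration.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def mixed_op(x, p):
    # Continuous relaxation over two candidate ops: x (linear) and x**2.
    return p * x + (1.0 - p) * x * x

def grads(w, alpha, xs):
    """Gradients of the mean squared error w.r.t. w and alpha (target y = 2x)."""
    p = sigmoid(alpha)
    gw = ga = 0.0
    for x in xs:
        f = mixed_op(x, p)
        r = w * f - 2.0 * x                      # residual
        gw += 2.0 * r * f                        # dL/dw
        ga += 2.0 * r * w * (x - x * x) * p * (1.0 - p)  # dL/dalpha via chain rule
    n = len(xs)
    return gw / n, ga / n

def search(steps=200, lr_w=0.05, lr_alpha=0.05):
    train_xs = [-1.0, 0.5, 2.0]   # toy training split
    val_xs = [-2.0, -1.0, 1.5]    # toy validation split
    w, alpha = 1.0, 0.0           # alpha = 0 means both ops weighted equally
    for _ in range(steps):
        # Inner step: weights descend the training loss.
        gw, _ = grads(w, alpha, train_xs)
        w -= lr_w * gw
        # Outer step: architecture descends the validation loss.
        _, ga = grads(w, alpha, val_xs)
        alpha -= lr_alpha * ga
    return w, sigmoid(alpha)

w, p_linear = search()
print(f"w={w:.3f}, weight on linear op={p_linear:.3f}")
```

On this toy data the weight on the linear op ends up above 0.5, so the search prefers the op that matches the target. Real DARTS relaxes a softmax over many ops per edge and can also use a second-order approximation for the architecture gradient.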
My neural network performed a harmful architectural modification, detected the regression, rolled back to a previous stable state, and resumed optimization. Meanwhile, I'm still overfitting on a conversation from 2017. github.com/sameerresearchai/… #AI #MachineLearning
A PyTorch model exploring adaptive architectures:
• learns sparse connectivity
• adapts weights from activity signals
• prunes & grows neurons dynamically
Basically gradient descent in a mid-life crisis, with a mild obsession for reorganizing itself.
Perceptron loss optimizes separability. Hinge loss optimizes separability with margin maximization. That tiny “1” quietly converts a classifier from “barely right” to statistically robust. Inductive bias hidden in one constant. #MachineLearning #AIResearch
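A minimal sketch of the two losses from the post above, using the standard textbook forms max(0, -y·s) and max(0, 1 - y·s) for a label y in {-1, +1} and a raw score s. The function names are illustrative, not taken from any library.

```python
def perceptron_loss(y, score):
    """max(0, -y * score): zero as soon as the sign is right."""
    return max(0.0, -y * score)

def hinge_loss(y, score):
    """max(0, 1 - y * score): zero only once the margin y * score reaches 1."""
    return max(0.0, 1.0 - y * score)

# A correctly classified point that sits close to the decision boundary.
y, score = 1, 0.2
print(perceptron_loss(y, score))  # 0.0: "barely right" is good enough
print(hinge_loss(y, score))       # 0.8: still penalized for the thin margin
```

The only difference between the two functions is the constant 1, which keeps the gradient alive until every point clears the margin.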
Reproduced “Attention Is All You Need” from scratch. The Transformer achieved convergence. I achieved gradient instability. colab.research.google.com/dr… #Transformer #AttentionIsAllYouNeed #NLP #PyTorch
What if intelligence isn’t “understanding” at all? What if it’s just an extremely efficient latent-space compression and reconstruction of knowledge patterns? And what if human intuition emerges the same way?
Maybe intelligence isn’t hidden in larger models. Maybe it begins the moment a neuron can change its own computation.