Machine Learning Engineer

Joined October 2012
My PR #37928 to the JAX library got approved.
- It fixes a sparse autodiff bug involving BCOO sparse arrays, vmap, and reverse-mode gradients.
- In short: a batched sparse matvec gradient could fail because the cotangent shape didn’t match the original sparse data shape.
- The fix uses JAX’s existing _unbroadcast helper to return the correct shape.
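A minimal sketch of the shape problem described above, written in plain NumPy rather than JAX. The `unbroadcast` function here is a hypothetical stand-in for illustration only; it is not JAX's internal `_unbroadcast` helper and not the code from the PR. It assumes the cotangent picked up extra leading (batch) axes that must be summed away so the gradient matches the original data shape.

```python
import numpy as np

def unbroadcast(cotangent, target_shape):
    """Sum `cotangent` over broadcast axes so the result has `target_shape`.

    Hypothetical illustration; not JAX's internal helper.
    """
    target_shape = tuple(target_shape)
    cotangent = np.asarray(cotangent)

    # Leading axes that broadcasting (e.g. batching) added are summed away.
    extra = cotangent.ndim - len(target_shape)
    if extra < 0:
        raise ValueError("cotangent has fewer dimensions than the target")
    if extra > 0:
        cotangent = cotangent.sum(axis=tuple(range(extra)))

    # Axes where the target has size 1 but the cotangent was stretched
    # are summed while keeping the axis.
    stretched = tuple(
        axis
        for axis, (got, want) in enumerate(zip(cotangent.shape, target_shape))
        if want == 1 and got != 1
    )
    if stretched:
        cotangent = cotangent.sum(axis=stretched, keepdims=True)

    if cotangent.shape != target_shape:
        raise ValueError("cotangent cannot be reduced to the target shape")
    return cotangent

# Unbatched sparse values with 3 stored entries (assumed example data).
data = np.array([1.0, 2.0, 3.0])
# A batch of 4 cotangents, one row per batch element.
batched_cotangent = np.ones((4, 3))
# The gradient for `data` must have shape (3,), not (4, 3).
grad = unbroadcast(batched_cotangent, data.shape)
```

The design point is that a gradient must have the same shape as the value it differentiates, so any axis introduced by batching is reduced by summation instead of being returned as-is.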
My PR #8220 was merged into the @huggingface Datasets library.
- It adds support for composed splits in streaming datasets, making split composition work more consistently between streaming and non-streaming dataset loading.
A small but practical fix for ML data pipelines.
India's last 3 T20 captains:
- Borivali (Rohit)
- Chembur (SKY)
- Chembur (Shreyas)
Actively sabotage stability and comfort early in your career; your future self will thank you for it.
They had Claude Opus 4 in 2024 😱 What do they have now that will be released in 2026?
Replying to @AnthropicAI
Each time we release a model, we run the same test: give it code that trains a small AI model, ask the new model to speed it up. It takes a skilled human 4-8 hours to reach 4x faster. In May 2024, Claude Opus 4 averaged a ~3x speedup. This April, Mythos Preview achieved ~52x.
If they had AGI, they wouldn't need to raise money...
Anthropic has confidentially submitted a draft S-1 registration statement to the Securities and Exchange Commission. Pending completion of SEC review, this gives us the option to pursue an initial public offering. Read more: anthropic.com/news/confident…
Claude Opus 4.8 is just Mythos distilled.
Imagine getting hacked and someone plants a nasty fart smell.
What if smell was a file type?
Read this and get ahead of 90% of people in LLMs. Best one so far, by @TheAhmadOsman. People pay thousands to learn this.
Meet retweeted
If you are a mathematician, then you may want to make sure you are sitting down before reading further.
This is where the Muon optimizer was born, right here on X.
New training speed record for @karpathy’s 124M-parameter NanoGPT setup: 3.28 Fineweb validation loss in 3.7B training tokens
Previous record: 5B tokens
Changelog: new optimizer
1/8
How do we make LLMs aware of time?
Sir, you're a CTO. May I ask what the tech debt at your company is like?
RESTful APIs may be dead soon. Instead, web services may expose a single POST entry point for a prompt. Internally, an AI agent may decide how to interpret it and what to do with the data and the database.
I think you should add some observability to your codebase for real-time monitoring and logging.
🚨 We recently discovered that an unauthorized party obtained a token with access to the Grafana Labs GitHub environment, enabling the threat actor to download our codebase. (1/6)
Pratitya - प्रतीत्य Launching soon.
Okay, he's answering all the ML questions well. Throw him a LeetCode hard.
- Ultimate cheatsheet for choosing the right Vector DB for your RAG applications.
- If you're building RAG systems, AI search, copilots, semantic retrieval, or agentic workflows, this will save you hours.
People are inventing insurance from first principles. Good idea though.
May 8
One person can't pay an $8,000 surgery bill. but 8,000 people can pay $1.
the app: someone posts their bill. you give $1. the app prompts: "share this chain." your friend sees it, gives $1, shares. THEIR friend gives $1, shares.
the chain grows. the bill shrinks. everyone gave $1. nobody felt it.
the chain is the virality.
I was given this as a take-home assignment for an AI Engineer interview, with a 4-hour time limit. How would you approach it?