Reader. Geek. Gamer.

Joined November 2009
#DiffusionGemma is not for consumer hardware yet. Q4_K_M gets you only ~16k of context. Q6 tanks at 8080:
#Google's #Gemma 4 Diffusion model asked my Mac for 523GB of RAM. The fix was 1 parameter. Diffusion amplifies rounding noise into different (valid) outputs. AR models are stable maps; diffusion is chaotic. Full 262K context in 44GB: uncategorized.blog/posts/the…
npmDiffWatch running on the #gemma-4-12B model on a Mac. No cloud, no GPU. Caught a malicious test package in 8.8s: postinstall hook → reads process.env → POSTs it out → pipes a remote script into sh. 🛑 malicious · 100% github.com/OffByQuant/npmDif… @ramimacisabird @abh1sek @anantshri
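The chain described above (install hook, environment read, outbound request, remote script piped into a shell) can be approximated with a static check on the manifest. This is a minimal sketch, not npmDiffWatch's actual logic: the function name `flag_install_hooks` and the regular expressions are hypothetical and only mirror the behaviours named in the post.

```python
import json
import re

# Lifecycle hooks that npm may run automatically around an install.
INSTALL_HOOKS = ("preinstall", "install", "postinstall", "prepare")

# Hypothetical patterns, chosen only to mirror the behaviours named in the post.
SUSPICIOUS = {
    "reads process.env": re.compile(r"process\.env"),
    "pipes remote script into a shell": re.compile(r"(curl|wget)[^|;&]*\|\s*(ba)?sh\b"),
    "makes an outbound HTTP request": re.compile(r"https?://"),
}


def flag_install_hooks(manifest_text):
    """Return (hook, reason) pairs for install-time scripts matching a pattern.

    The manifest is parsed as data only; nothing from the package is executed.
    """
    scripts = json.loads(manifest_text).get("scripts", {})
    findings = []
    for hook in INSTALL_HOOKS:
        command = scripts.get(hook, "")
        for reason, pattern in SUSPICIOUS.items():
            if pattern.search(command):
                findings.append((hook, reason))
    return findings
```

A regex pass like this only catches the obvious cases; the post's point is that a local LLM reviews the diff, which a pattern list cannot replace.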
PyDiffWatch: watches the PyPI release firehose, diffs every new version, flags suspicious ones fast enough to report for takedown. ▸never runs what it inspects – packages are data, not code ▸reviews run on a local LLM github.com/OffByQuant/PyDiff… @ramimacisabird @abh1sek @anantshri
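The "packages are data, not code" idea can be illustrated by diffing two versions of a file purely as text. This is a sketch under assumptions, not PyDiffWatch's implementation: `added_lines` is a hypothetical helper, and how the real tool fetches and unpacks releases is not covered here.

```python
import difflib


def added_lines(old_source, new_source):
    """Return lines that appear only in the new version.

    Both versions are handled as plain strings; nothing is imported or run.
    """
    diff = difflib.ndiff(old_source.splitlines(), new_source.splitlines())
    # ndiff prefixes added lines with "+ " and removed lines with "- ".
    return [line[2:] for line in diff if line.startswith("+ ")]
```

Only the added lines would then need review, which keeps the amount of text handed to a local model small.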
Share your inner interests. Like a child. (Only don’t lose it when you grow up) And if you already lost it, dig deep and find it again. “The most personal is the most creative.” Martin Scorsese

Ankit Prateek retweeted
What if you could take three completely different model families… and distill them into one tiny model? 🤯 📜 Paper: arxiv.org/pdf/2605.21699 MOPD (Multi-Teacher On-Policy Distillation) has become a standard procedure in post-training. We already distill multiple specialized variants of the same model into a single set of weights. But what if we could go further - and distill models from entirely different families? Turns out, it is possible. Today we’re releasing a paper on cross-tokenizer distillation - our first steps in this exciting direction. 📄 We distilled Qwen3-4B, Phi-4-Mini, and Llama-3B into Llama-3.2-1B. MMLU jumped from 32.05 → 46.32 when using multiple teachers. 📈 The team is now working on Nemo-RL integration so the community can try this method in their own settings. Plus, we are scaling experiments up. 🚀
~21h grinding a SAST run on Chromium's net/ alongside its full dependency graph. 283M total tokens (~163M DeepSeek, ~120M local LLM). The structural orchestration framework built for deep context caching hit a 97.8% cloud cache-hit rate (only ~3.5M cache misses).
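The cache-hit figure can be sanity-checked from the rounded numbers in the post. The sketch below assumes the rate is measured against all ~163M cloud tokens, which the post does not state explicitly; `cache_hit_rate` is a hypothetical helper.

```python
def cache_hit_rate(total_tokens, miss_tokens):
    """Fraction of tokens served from cache, given total and missed tokens."""
    return 1.0 - miss_tokens / total_tokens


cloud_tokens = 163_000_000  # "~163M DeepSeek", rounded figure from the post
cache_misses = 3_500_000    # "~3.5M cache misses", rounded figure from the post

rate = cache_hit_rate(cloud_tokens, cache_misses)
print(f"{rate:.1%}")
```

With these rounded inputs the result lands within a fraction of a percentage point of the reported 97.8%, so the numbers are mutually consistent.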
Complete global data-flow tracking down to external method bodies, zero brute-force prompt stuffing, and a cloud bill ~$2. Smart scaffolding > massive compute budgets. Lesson learned: Don't scan Chromium again. It's too good. 😂
Called it a discount. Turns out it was the price. #DeepSeek just made the 75% V4-Pro cut permanent. The token economics I wrote about 5 days ago aren't a promo window anymore – they're the floor. Resharing: lnkd.in/gbSaAj-P
33 Million Tokens for $0.25. Just ran a full SAST scan against 1M lines of code for the price of a gumball. The secret? Hybrid Architecture Context Caching. Master: DeepSeek V4-Flash (Orchestrator). Worker: llama.cpp (Local). Full deep dive on LinkedIn: [bit.ly/3Rsu6zy]
Multi-Token Prediction (MTP) is a rare "free lunch" in LLM inference. Just finished benchmarking Qwen 3.6 27B on a single RTX 5090 using llama.cpp. At extreme context scales, MTP roughly DOUBLES generation throughput with 0 quality loss. The raw telemetry 👇 🧵
Is it lossy? No. At temp=0, the main model holds a strict veto: it validates every parallel guess before it hits your screen. You get cosmetic word changes from FP near-ties, but logic, code, and math accuracy remain 100% intact.
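The veto step can be sketched as greedy draft verification: a guessed token is kept only if it equals what the main model would have picked anyway. This is a toy illustration, not llama.cpp's code. `greedy_next` is a hypothetical callable returning the main model's argmax next token, and the sequential loop stands in for what a real engine would check in one batched forward pass.

```python
def verify_draft(context, draft, greedy_next):
    """Accept draft tokens only while they match the main model's greedy choice.

    context:     list of tokens generated so far
    draft:       list of tokens guessed ahead by the prediction heads
    greedy_next: hypothetical callable, tokens -> main model's argmax token
    """
    accepted = []
    for guess in draft:
        target = greedy_next(context + accepted)
        if guess != target:
            # Veto: emit the main model's token and discard the remaining guesses.
            accepted.append(target)
            return accepted
        accepted.append(guess)
    return accepted
```

Because every emitted token is the main model's own greedy choice, the output matches plain greedy decoding in exact arithmetic; the FP near-ties mentioned in the post are where real runs can diverge cosmetically.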
If your framework supports MTP, turn it on. It’s an uncompromised velocity multiplier for repository-wide code reviews and massive doc analysis. Full writeup: linkedin.com/posts/ankitprat… Shoutout to @ggerganov & team for github.com/ggml-org/llama.cp…

Ankit Prateek retweeted
5 Signs of a Genuinely Good Person...
Ankit Prateek retweeted
Claude Code weekly limits are increasing 50%, now through July 13. Live now for all Pro, Max, Team, and seat-based Enterprise users.
Ankit Prateek retweeted
May 11
#GameOfThrones I finally feel satisfied
Ankit Prateek retweeted
This is crazy. The hacker installed a dead-man's switch that will wipe your computer if you revoke the GitHub token they stole from you. Revoking the token is what triggers the wipe.
SECURITY ADVISORY - TanStack npm packages

A supply-chain compromise affecting 42 @tanstack/* packages (84 versions total) was published to npm earlier today at approximately 19:20 and 19:26 UTC. Two malicious versions per package.

Status: ACTIVE - packages are deprecated, npm security engaged, publish path being shut down.
Severity: HIGH - payload exfiltrates AWS, GCP, Kubernetes, and Vault credentials, GitHub tokens, .npmrc contents, and SSH keys.

If you installed any @tanstack/* package between 19:20 and 19:30 UTC today, treat the host as potentially compromised:
• Rotate cloud, GitHub, and SSH credentials immediately
• Audit cloud audit logs for the last several hours
• Pin to a prior known-good version and reinstall from a clean lockfile

Detection - the malicious manifest contains:
"optionalDependencies": { "@tanstack/setup": "github:tanstack/router#79ac49ee..." }
Any version with this entry is compromised. The payload is delivered via a git-resolved optionalDependency whose prepare script runs router_init.js (~2.3 MB, smuggled into each tarball at the package root).

Unpublish is blocked by npm policy for most affected packages due to existing third-party dependents. All 84 versions are being deprecated with a SECURITY warning, and npm security has been engaged to pull tarballs at the registry level.

Full technical breakdown, complete package and version list, and rolling status updates: github.com/TanStack/router/i…

Credit to the security researcher for responsible disclosure.
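The detection rule in the advisory can be turned into a small manifest check. This is a sketch under assumptions: `has_tanstack_setup_marker` is a hypothetical helper, and it matches on the `@tanstack/setup` key plus the `github:tanstack/router#` prefix only, because the commit hash is truncated in the advisory text.

```python
import json


def has_tanstack_setup_marker(manifest_text):
    """Return True if a package.json carries the optionalDependencies marker.

    Assumption: the dependency key and the git source prefix are enough to
    identify the entry; the full commit hash is not checked.
    """
    manifest = json.loads(manifest_text)
    optional = manifest.get("optionalDependencies") or {}
    spec = optional.get("@tanstack/setup", "")
    return isinstance(spec, str) and spec.startswith("github:tanstack/router#")
```

Run it against the `package.json` of each installed @tanstack/* package. A match means the host should be treated as compromised, as the advisory describes; a clean result does not replace checking the advisory's full package and version list.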