Joined February 2024
Pinned Tweet
rentry.org/LocalModelsLinks Various links for ML and local models (not just LLMs), kept fairly up to date. rentry.org/LocalModelsPapers ML papers I've read that I think are interesting. I also keep a text file with all the abstracts at the top of the page for easy searching.
ClaudePlaysPokemon dev is back with Opus 4.7. Def my favorite vibe check, given the very minimal harness. twitch.tv/claudeplayspokemon
Claude has made it through Victory Road for the first time. All hints were removed for this latest iteration as well. It also caught a legendary (Articuno) for the first time.
Woosh: A Sound Effects Foundation Model. From Sony AI. Optimized for sound effects, with a high-quality audio encoder/decoder model, a text-audio alignment model for conditioning, and text-to-audio and video-to-audio generative models. Links below
Came across what could be an interesting benchmark: an old Famicom game called Radical Bomber: Jurai-Kun. It's an asymmetrical board game with 1 runner and 4 chasers. The runner can bomb certain connections and has a limited number of double turns. There are some special blocks too. youtube.com/watch?v=A8mPtwdT…
VoXtream2: Full-stream TTS with dynamic speaking rate control. Combines a distribution matching mechanism over duration states with CFG across conditioning signals to improve controllability and synthesis quality. Runs 4 times faster than real time on a consumer GPU. Links below
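A rough illustration of the CFG part only. Classifier-free guidance usually extrapolates from an unconditional prediction toward a conditional one. The sketch below extends that to several conditioning signals by summing weighted differences, which is one common formulation and an assumption here; the tweet doesn't say how VoXtream2 actually combines its signals. The function name, the weights, the signal names, and the toy vectors are all hypothetical.

```python
from typing import Sequence


def multi_signal_cfg(
    uncond: Sequence[float],
    conds: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> list[float]:
    """Combine one unconditional prediction with several conditional ones.

    Assumed formulation (not taken from the paper):
        guided = uncond + sum_i w_i * (cond_i - uncond)
    """
    if len(conds) != len(weights):
        raise ValueError("need exactly one weight per conditioning signal")
    guided = list(uncond)
    for cond, w in zip(conds, weights):
        if len(cond) != len(uncond):
            raise ValueError("all predictions must have the same length")
        for j, (c, u) in enumerate(zip(cond, uncond)):
            # Push the prediction along the direction this signal suggests.
            guided[j] += w * (c - u)
    return guided


# Toy predictions for two hypothetical signals (speaker prompt, speaking rate).
uncond = [0.0, 1.0]
cond_speaker = [1.0, 1.0]
cond_rate = [0.0, 3.0]
guided = multi_signal_cfg(uncond, [cond_speaker, cond_rate], [2.0, 0.5])
print(guided)
```

A weight of 0 for a signal leaves the unconditional prediction untouched along that direction, so each signal's strength can be tuned separately.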
arxiv.org/abs/2603.13518
herimor.github.io/voxtream2/ (page not live yet)
huggingface.co/herimor (will probably be posted here)
Some interesting papers I keep updated: rentry.org/LocalModelsPapers
Speculative Speculative Decoding. Draft model predicts likely verification outcomes and prepares speculations pre-emptively for them. If the actual verification outcome is in the predicted set, a speculation can be returned immediately, eliminating drafting overhead. Links below
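The lookup step described above can be sketched as a toy. This is only the control flow, under loud assumptions: the outcome encoding (number of accepted draft tokens plus a bonus token), the function names, and the token lists are all hypothetical, and no real model is involved.

```python
from typing import Callable, Hashable

Tokens = list[str]


def next_speculation(
    prepared: dict[Hashable, Tokens],
    actual_outcome: Hashable,
    draft_from_scratch: Callable[[Hashable], Tokens],
) -> tuple[Tokens, bool]:
    """Return the next speculation and whether it came from the prepared set."""
    if actual_outcome in prepared:
        # Hit: the speculation was prepared while verification was running.
        return prepared[actual_outcome], True
    # Miss: fall back to drafting only after the outcome is known.
    return draft_from_scratch(actual_outcome), False


# Speculations prepared ahead of time for the outcomes the draft side
# considers likely. Hypothetical key: (accepted draft tokens, bonus token).
prepared = {
    (4, "the"): ["cat", "sat", "on"],
    (2, "a"): ["dog", "ran", "to"],
}


def slow_draft(outcome: Hashable) -> Tokens:
    # Stand-in for running the draft model after verification finishes.
    return ["<drafted-after-verification>"]


hit_spec, was_hit = next_speculation(prepared, (4, "the"), slow_draft)
miss_spec, was_hit_on_miss = next_speculation(prepared, (0, "but"), slow_draft)
print(hit_spec, was_hit)
print(miss_spec, was_hit_on_miss)
```

The saving comes entirely from the hit path: the more often the real verification outcome lands in the prepared set, the less often drafting sits on the critical path.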
Multi-Head Low-Rank Attention. Novel attention mechanism with native 4-way tensor parallelism support. At 2.9B scale, achieves SOTA performance on perplexity and zero-shot common-sense reasoning benchmarks. 2.8× decoding speedup over MLA. Links below
Aletheia tackles FirstProof autonomously. From DeepMind. Autonomously solved 6 of the 10 problems (2, 5, 7, 8, 9, 10) according to majority expert assessments; notes that Problem 8 was the only one where the experts were not unanimous. Links below
arxiv.org/abs/2602.21201
github.com/google-deepmind/s…
arxiv.org/abs/2602.05192 (FirstProof challenge paper)
daniellitt.com/blog/2026/2/2… (interesting article about FirstProof)
Adam Improves Muon: Adaptive Moment Estimation with Orthogonalized Momentum. Scales orthogonalized momentum using a single adaptive stepsize, preserving orthogonality while improving upon Muon at negligible additional cost. Links below
HiFloat4 Format for Language Model Inference. Packs 64 4-bit elements with 32 bits of shared scaling metadata, averaging 4.5 bits per value. Achieves higher average accuracy than the state-of-the-art NVFP4 format across multiple models and diverse downstream tasks. Links below
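The 4.5 bits per value figure follows directly from the numbers quoted above. A small check, with a hypothetical helper name:

```python
def average_bits_per_value(elements: int, bits_per_element: int, shared_bits: int) -> float:
    """Average storage cost per element when a block shares some metadata bits."""
    total_bits = elements * bits_per_element + shared_bits
    return total_bits / elements


# Figures from the tweet: 64 elements, 4 bits each, 32 shared metadata bits.
hifloat4_bits = average_bits_per_value(elements=64, bits_per_element=4, shared_bits=32)
print(hifloat4_bits)  # 4.5, i.e. (64 * 4 + 32) / 64 = 288 / 64
```

So the shared metadata adds half a bit per value on top of the 4-bit payload.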
arxiv.org/abs/2602.11287
MoEEdit: Efficient and Routing-Stable Knowledge Editing for Mixture-of-Experts LLMs. Reparameterizes expert updates via per-expert null-space projections that keep router inputs invariant and thereby suppress routing shifts. Links below
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos. From NVIDIA. Foundation world model that learns diverse interactions and dexterous controls from 44k hours of egocentric human videos. Links below
arxiv.org/abs/2602.06949
dreamdojo-world.github.io/ (code link not live yet)