Founder of @daertml. Training LLaMAs as a hobby (and no profit yet).

Joined April 2018
280 Photos and videos
Pinned Tweet
AGI achieved externally in the 4chan chat by miqudev anon, on 29th January 2024. Here goes a 🧵 with Miqu rocking everything I ask (from datasets, random things I find on the internet, and more). Feel the AGI!! Using the Q5 (biggest model) version, with this llama.cpp config:
1
1
21
7,644
Alberto Fuentes (e/acc) retweeted
Geometric Action Model for Robot Policy Learning
Repurposes a geometric foundation model as one backbone for perception, prediction, and action. 1.4B parameters. 6.9 ms inference. 85.5% on LIBERO-Plus. 55× faster than baselines.
1
5
26
1,154
Alberto Fuentes (e/acc) retweeted
Retrievable Gradients captures each continual-post-training update as a retrievable gradient object instead of writing it irreversibly into shared weights, so updates apply on demand without accumulating into drift.
Repeated updates to shared parameters accumulate weight drift and cause catastrophic forgetting; RAG avoids drift by keeping knowledge external but lacks parametric depth. Storing updates as retrievable gradients sits between the two: more integrated than retrieving text into the prompt, without permanent weight contamination.
The open question is retrieval and composition cost when many gradient objects must be selected and combined at inference.
Retrievable Gradients: Continual Post-Training Without Cumulative Weight Drift
Paper: arxiv.org/abs/2606.15734
1
3
10
550
Alberto Fuentes (e/acc) retweeted
This VibeThinker-3B absolutely must be tested! I bet it thinks like crazy just to make a small change, but... let's try!
Stellar performance from a 3B model. These results were achieved primarily through post-training refinements on Qwen2.5-Coder. The paper doesn't provide many details, but it appears they distill from RL ckpts and then do a final RL-based instruct RL. 🔗arxiv.org/abs/2606.16140
2
4
26
2,568
Alberto Fuentes (e/acc) retweeted
Knicks head coach Mike Brown got a standing ovation at the Polo Bar last night 😭 🎥: Derek Blasberg/Instagram
41
526
8,452
196,020
Alberto Fuentes (e/acc) retweeted
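While everyone is still anxious about closed-source models being locked down, vision models at the edge have already evolved to a frightening degree. 👇 Martin Maly showed that with a single prompt (one-shotted), Opus 4.8 can turn an ordinary phone into a "professional badminton refereeing system":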
5
27
288
71,650
Alberto Fuentes (e/acc) retweeted
Today, we enable AutoResearch in the physical world for the first time! Introducing ENPIRE: we give 8 Codex agents a fleet of robots, an allocation of GPUs, and a generous token budget. We set them free with a simple goal: solve the task as quickly as possible, keep the robots busy but stay safe, don't waste precious compute. Make no mistake. Then humans step aside and our watch begins.
The robot fleet starts to come alive: they learn to look for visual clues, reset the scene, practice novel skills, tinker with the control stack, read papers online, debate, reflect, get stuck, and try again directly on the hardware. All we did was give Codex an API to the world of atoms, and the rest is emergence.
ENPIRE is able to solve high-precision tasks like tying zip-ties, organizing fine pins, and installing GPUs all by itself. We also discovered a new type of "physical scaling": 8 robots exploring in parallel improve significantly faster than fewer ones.
A part of our NVIDIA GEAR lab now self-improves tirelessly overnight. We just read the reports in the morning. /goal: we all take a holiday and Jensen wouldn't even notice ;)
We will be open-sourcing everything, so you can host your self-running robot lab at home too! Deep dive in the thread:
11
31
222
8,418
A really massive release from Qwen. This is the Nvidia/GDM turf.
📣 Introducing the Qwen-Robot Suite: Qwen-RobotNav, Qwen-RobotManip, Qwen-RobotWorld, three foundation models, a full stack for embodied intelligence.
🧭 Qwen-RobotNav: the gateway to mobility.
• Unifies 5 navigation tasks in one model: instruction following, point-goal, object-goal, target tracking, autonomous driving
• Controllable observation protocol
• Tool interface for agentic systems
🤖 Qwen-RobotManip: the foundation of interaction.
• Unified state-action space across heterogeneous robots
• Camera-frame delta poses for coherent cross-embodiment training
• Pretrained on a 38,100 hour open-source corpus
🌍 Qwen-RobotWorld: infinite worlds for physical agents.
• Single world model, 20 embodiments
• Natural-language action interface
• Predicts physically grounded futures across manipulation, driving, and navigation
Each model is independently useful, and could be composed as physical-world tools. Together, they form the low-level toolkit for general-purpose agentic systems that don't just see the world, but act in it.
📷 Blog: qwen.ai/blog?id=qwen-robotsu…
📖 Report:
Qwen-RobotNav: qianwen-res.oss-accelerate.a…
Qwen-RobotManip: qianwen-res.oss-accelerate.a…
Qwen-RobotWorld: qianwen-res.oss-accelerate.a…
3
13
153
8,793
Alberto Fuentes (e/acc) retweeted
Next-token prediction is myopic. What if transformers learn to predict their own next latent state? 🌠 We present 𝗡𝗲𝘅𝘁-𝗟𝗮𝘁𝗲𝗻𝘁 𝗣𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗼𝗻 (𝗡𝗲𝘅𝘁𝗟𝗮𝘁): a self-supervised learning method that teaches transformers to form compact world models for reasoning and planning. It also unlocks up to 3.3x faster inference via self-speculative decoding! 🚀
4
16
125
5,406
Alberto Fuentes (e/acc) retweeted
RF-DETR keypoints is finally out. Preview release: real-time transformer keypoint detection, Apache 2.0. 71.8 AP on COCO, 9.7 ms on T4. Outperforms YOLO11-pose and YOLO26-pose at similar latency.
8
18
98
1,796
Alberto Fuentes (e/acc) retweeted
🦞 Coding agents live on your screen. Omni-modal agents live in your physical world. VisualClaw rides on your glasses, getting cheaper AND smarter every session, without ever retraining the VLM.
📉 −98.1% API cost vs full-frame upload.
📈 15.80% peak accuracy on EgoSchema.
🤖 3.2 macro on VisualClawArena with Claude Code.
🔒 VLM weights frozen throughout.
🎯 See: proactive frame filtering. An edge cascade decides what's worth showing the VLM, on-device. A 1-hour 1fps stream is 3,600 frames; we send 5–20.
📦 Streamline: adaptive skill memory. Hot/cold skills keep prompts lean even as the agent learns new behaviors at deployment.
🔁 Meta-Evolve: continuous self-evolution. Correct rollouts enter memory; failures trigger a memory-grounded skill evolver. The scaffold improves while it runs, not just while it trains.
🏟️ Also releasing VisualClawArena: a rigorous 5-stage multimodal agent benchmark with video clips, documents, user files, dynamic updates, and executable checks. Avg 24.4 rounds per scenario, 18.1 of them requiring vision.
Always on. Always learning. Always cheaper. ⚡
📄 arxiv.org/abs/2606.16295
💻 github.com/UCSC-VLAA/VisualC…
🌐 ucsc-vlaa.github.io/VisualCl…
@HaoqinT @cihangxie @yuyinzhou_cs @richardxp888 @ZhengBerkeley @itsJiaqiLiu @JimChenjw @jasoneshraghian
5
21
480
Alberto Fuentes (e/acc) retweeted
The quick OG smile right as the ball goes in is the best

11
63
4,784
291,305
RT @0sdonte: OMG😭❤️❤️
145
Alberto Fuentes (e/acc) retweeted
Weirdly enough, they don't seem to be aware that LLMs are outdated for Physical AI. Bizarre.
📣 Introducing the Qwen-Robot Suite: Qwen-RobotNav, Qwen-RobotManip, Qwen-RobotWorld, three foundation models, a full stack for embodied intelligence.
🧭 Qwen-RobotNav: the gateway to mobility.
• Unifies 5 navigation tasks in one model: instruction following, point-goal, object-goal, target tracking, autonomous driving
• Controllable observation protocol
• Tool interface for agentic systems
🤖 Qwen-RobotManip: the foundation of interaction.
• Unified state-action space across heterogeneous robots
• Camera-frame delta poses for coherent cross-embodiment training
• Pretrained on a 38,100 hour open-source corpus
🌍 Qwen-RobotWorld: infinite worlds for physical agents.
• Single world model, 20 embodiments
• Natural-language action interface
• Predicts physically grounded futures across manipulation, driving, and navigation
Each model is independently useful, and could be composed as physical-world tools. Together, they form the low-level toolkit for general-purpose agentic systems that don't just see the world, but act in it.
📷 Blog: qwen.ai/blog?id=qwen-robotsu…
📖 Report:
Qwen-RobotNav: qianwen-res.oss-accelerate.a…
Qwen-RobotManip: qianwen-res.oss-accelerate.a…
Qwen-RobotWorld: qianwen-res.oss-accelerate.a…
4
2
25
3,644
Alberto Fuentes (e/acc) retweeted
playing with realtime diffusion ui
99
107
2,633
121,770
Alberto Fuentes (e/acc) retweeted
Mind-blowing hardware breakthrough: An open source garage engineer burned a full AI Transformer model (with KV cache) directly into a custom digital chip: WITH NO GPU, NO CPU, NO CLOUD. Just pure silicon running microGPT at 56,000 tokens/sec on only 80 MHz! And uses less energy than a calculator. Prototyped on FPGA, now spelling names on a tiny LCD. This is GateGPT and a big future of on-device AI is here. This can and will scale to far larger models. Insane efficiency. Pure digital magic.
56,000 tokens/sec at just 80 MHz. 🤯 I burned a full Transformer with KV cache into a custom chip. Designed gate by gate as a 100% digital integrated circuit. Prototyped on an FPGA. (No GPU. No CPU.) Just pure digital silicon running @karpathy microGPT, spelling out names on a tiny LCD. This is GateGPT 👇
29
132
768
35,819
Alberto Fuentes (e/acc) retweeted
🚀 Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models
We introduce Flow-DPPO, which replaces PPO-style ratio clipping with a divergence proximal constraint that is structurally inherent to flow models.
🔗 arxiv.org/pdf/2606.11025
1
5
38
4,534
Alberto Fuentes (e/acc) retweeted
Stellar performance from a 3B model. These results were achieved primarily through post-training refinements on Qwen2.5-Coder. The paper doesn't provide many details, but it appears they distill from RL ckpts and then do a final RL-based instruct RL. 🔗arxiv.org/abs/2606.16140
14
43
303
120,013
Alberto Fuentes (e/acc) retweeted
The Knicks are single-handedly reviving print media btw
ESPN has produced an 80-pg special edition @nyknicks #NBAFinals commemorative magazine Ft. season-long coverage from ESPN's @VinceGoodwill, @ramonashelburne, William C. Rhoden & @NotoriousOHM Pre-order on Amazon (bit.ly/49VZs8k) & on newsstands Friday | #AlwaysKnicks
17
205
2,720
139,298
Alberto Fuentes (e/acc) retweeted
Pope Leo XIV holding up a Knicks jersey yesterday
21
246
3,110
39,569
Alberto Fuentes (e/acc) retweeted
Qwen robot suite Qwen-Robot released: bridging the last mile from large models to the physical world mp.weixin.qq.com/s/fLyXpGp5N… huggingface.co/papers/2606.1…
11
28
182
11,522