Alberto Fuentes (e/acc)

280 Photos and videos

Tweets

Pinned Tweet

Alberto Fuentes (e/acc) @AlberFuen

29 Jan 2024

AGI achieved externally in the 4chan chat by miqudev anon, on 29th January 2024. Here goes a 🧵with Miqu rocking everything I ask (from datasets, random things I find from the internet and more). Feel the AGI!! Using the Q5 (biggest model) version, with this llama.cpp config:

7,644

Alberto Fuentes (e/acc) retweeted

DailyPapers

@HuggingPapers

Geometric Action Model for Robot Policy Learning Repurposes a geometric foundation model as one backbone for perception, prediction, and action. 1.4B parameters. 6.9 ms inference. 85.5% on LIBERO-Plus. 55× faster than baselines.

1,154

Alberto Fuentes (e/acc) retweeted

Xiuyu Li

@sheriyuo

Retrievable Gradients captures each continual-post-training update as a retrievable gradient object instead of writing it irreversibly into shared weights, so updates apply on demand without accumulating into drift. Repeated updates to shared parameters accumulate weight drift and cause catastrophic forgetting; RAG avoids drift by keeping knowledge external but lacks parametric depth. Storing updates as retrievable gradients sits between the two: more integrated than retrieving text into the prompt, without permanent weight contamination. The open question is retrieval and composition cost when many gradient objects must be selected and combined at inference. Retrievable Gradients: Continual Post-Training Without Cumulative Weight Drift Paper: arxiv.org/abs/2606.15734

550

Alberto Fuentes (e/acc) retweeted

Ivan Fioravanti ᯅ

@ivanfioravanti

This VibeThinker-3B must be tested absolutely! I bet he thinks like crazy to do a small change, but... Let's try!

Francesco Bertolotti @f14bertolotti

11h

Stellar performance from a 3B model. These results were achieved primarily through post-training refinements on Qwen2.5-Coder. The paper doesn't provide many details, but it appears they distill from RL ckpts and then do a final RL-based instruct RL. 🔗arxiv.org/abs/2606.16140

2,568

Alberto Fuentes (e/acc) retweeted

Page Six

@PageSix

Knicks head coach Mike Brown got a standing ovation at the Polo Bar last night 😭 🎥: Derek Blasberg/Instagram

0:11

526

8,452

196,020

Alberto Fuentes (e/acc) retweeted

谁是藏镜人 @VedaAI00

10h

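While everyone is still anxious about closed-source models being locked down, large vision models on the edge have already evolved to a frightening stage. 👇 Martin Maly showed that with just a single prompt (one-shotted), Opus 4.8 can turn an ordinary phone into a "professional badminton refereeing system":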

0:32

288

71,650

Alberto Fuentes (e/acc) retweeted

Jim Fan

@DrJimFan

44m

Today, we enable AutoResearch in the physical world for the first time! Introducing ENPIRE: we give 8 Codex agents a fleet of robots, an allocation of GPUs, and generous token budget. We set them free with a simple goal: solve the task as quickly as possible, keep the robots busy but stay safe, don't waste precious compute. Make no mistake. Then humans step aside and our watch begins. The robot fleet starts to come alive: they learn to look for visual clues, reset the scene, practice novel skills, tinker with control stack, read papers online, debate, reflect, get stuck, and try again directly on the hardware. All we did is to give Codex an API to the world of atoms, and the rest is emergence. ENPIRE is able to solve high-precision tasks like tying zip-ties, organizing fine pins, and installing GPUs all by itself. We also discovered a new type of "physical scaling": 8 robots exploring in parallel improves significantly faster than fewer ones. A part of our NVIDIA GEAR lab now self-improves tirelessly over night. We just read the reports in the morning. /goal: we all take a holiday and Jensen wouldn't even notice ;) We will be open-sourcing everything, so you can host your self-running robot lab at home too! Deep dive in the thread:

1:16

222

8,418

Alberto Fuentes (e/acc) retweeted

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)

@teortaxesTex

A really massive release from Qwen. This is the Nvidia/GDM turf.

Qwen

@Alibaba_Qwen

📣 Introducing the Qwen-Robot Suite — Qwen-RobotNav, Qwen-RobotManip, Qwen-RobotWorld, three foundation models, a full stack for embodied intelligence.
🧭 Qwen-RobotNav — the gateway to mobility.
• Unifies 5 navigation tasks in one model: instruction following, point-goal, object-goal, target tracking, autonomous driving
• Controllable observation protocol
• Tool interface for agentic systems
🤖 Qwen-RobotManip — the foundation of interaction.
• Unified state-action space across heterogeneous robots
• Camera-frame delta poses for coherent cross-embodiment training
• Pretrained on a 38,100 hour open-source corpus
🌍 Qwen-RobotWorld — infinite worlds for physical agents.
• Single world model, 20 embodiments
• Natural-language action interface
• Predicts physically grounded futures across manipulation, driving, and navigation
Each model is independently useful, and could be composed as physical-world tools. Together, they form the low-level toolkit for general-purpose agentic systems that don't just see the world, but act in it.
📷 Blog: qwen.ai/blog?id=qwen-robotsu…
📖 Report:
Qwen-RobotNav: qianwen-res.oss-accelerate.a…
Qwen-RobotManip: qianwen-res.oss-accelerate.a…
Qwen-RobotWorld: qianwen-res.oss-accelerate.a…

153

8,793

Alberto Fuentes (e/acc) retweeted

Jayden Teoh

@jayden_teoh_

Next-token prediction is myopic. What if transformers learn to predict their own next latent state? 🌠 We present 𝗡𝗲𝘅𝘁-𝗟𝗮𝘁𝗲𝗻𝘁 𝗣𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗼𝗻 (𝗡𝗲𝘅𝘁𝗟𝗮𝘁): a self-supervised learning method that teaches transformers to form compact world models for reasoning and planning. It also unlocks up to 3.3x faster inference via self-speculative decoding! 🚀

[Image: illustration of next-latent prediction vs. other predictive mechanisms]

125

5,406

Alberto Fuentes (e/acc) retweeted

SkalskiP

@skalskip92

42m

RF-DETR keypoints is finally out
preview release: real-time transformer keypoint detection
Apache 2.0
71.8 AP on COCO, 9.7ms on T4.
outperforms YOLO11-pose and YOLO26-pose at similar latency

0:05

1,796

Alberto Fuentes (e/acc) retweeted

Huaxiu Yao

@HuaxiuYaoML

🦞 Coding agents live on your screen. Omni-modal agents live in your physical world. VisualClaw rides on your glasses, getting cheaper AND smarter every session, without ever retraining the VLM. 📉 −98.1% API cost vs full-frame upload. 📈 15.80% peak accuracy on EgoSchema. 🤖 3.2 macro on VisualClawArena with Claude Code. 🔒 VLM weights frozen throughout. 🎯 See — proactive frame filtering. An edge cascade decides what's worth showing the VLM, on-device. A 1-hour 1fps stream is 3,600 frames; we send 5–20. 📦 Streamline — adaptive skill memory. Hot/cold skills keep prompts lean even as the agent learns new behaviors at deployment. 🔁 Meta-Evolve — continuous self-evolution. Correct rollouts enter memory; failures trigger a memory-grounded skill evolver. The scaffold improves while it runs, not just while it trains. 🏟️ Also releasing VisualClawArena: a rigorous 5-stage multimodal agent benchmark with video clips, documents, user files, dynamic updates, and executable checks. Avg 24.4 rounds per scenario, 18.1 of them requiring vision. Always on. Always learning. Always cheaper. ⚡ 📄 arxiv.org/abs/2606.16295 💻 github.com/UCSC-VLAA/VisualC… 🌐 ucsc-vlaa.github.io/VisualCl… @HaoqinT @cihangxie @yuyinzhou_cs @richardxp888 @ZhengBerkeley @itsJiaqiLiu @JimChenjw @jasoneshraghian
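The frame-count arithmetic in the post above (a 1-hour, 1 fps stream is 3,600 frames, of which 5–20 are sent) can be checked with a minimal sketch. The function names below are hypothetical and are not part of VisualClaw; the sketch only covers the share of frames uploaded, not the separately reported −98.1% API cost figure.

```python
# Hypothetical helpers for illustration only; not taken from the VisualClaw code.

def frames_in_stream(duration_s: int, fps: int) -> int:
    """Total frames in a stream of the given duration and frame rate."""
    return duration_s * fps


def fraction_sent(sent: int, total: int) -> float:
    """Share of frames that are actually uploaded to the VLM."""
    return sent / total


total = frames_in_stream(duration_s=3600, fps=1)  # 1 hour at 1 fps
low = fraction_sent(5, total)    # fewest frames sent, per the post
high = fraction_sent(20, total)  # most frames sent, per the post

print(f"total frames: {total}")
print(f"share of frames sent: {low:.2%} to {high:.2%}")
```

Under these numbers, well under 1% of the frames reach the VLM; how that maps onto API cost depends on pricing details the post does not give.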

1:44

480

Alberto Fuentes (e/acc) retweeted

Andrew Kuo @earlboykins

16h

The quick OG smile right as the ball goes in is the best

0:31

4,784

291,305

Alberto Fuentes (e/acc) @AlberFuen

RT @0sdonte: OMG😭❤️❤️

145

Alberto Fuentes (e/acc) retweeted

Alexander Doria

@Dorialexander

Weirdly enough, they don't seem to be aware that LLMs are outdated for Physical AI. Bizarre.

Qwen

@Alibaba_Qwen

3,644

Alberto Fuentes (e/acc) retweeted

Ryan Stephen

@Ryan__Stephen

playing with realtime diffusion ui

0:14

107

2,633

121,770

Alberto Fuentes (e/acc) retweeted

Brian Roemmele

@BrianRoemmele

Mind-blowing hardware breakthrough: An open source garage engineer burned a full AI Transformer model (with KV cache) directly into a custom digital chip: WITH NO GPU, NO CPU, NO CLOUD. Just pure silicon running microGPT at 56,000 tokens/sec on only 80 MHz! And uses less energy than a calculator. Prototyped on FPGA, now spelling names on a tiny LCD. This is GateGPT and a big future of on-device AI is here. This can and will scale to far larger models. Insane efficiency. Pure digital magic.

Fabio Guzman

@FGuzmanAI

Jun 13

56,000 tokens/sec at just 80 MHz. 🤯 I burned a full Transformer with KV cache into a custom chip. Designed gate by gate as a 100% digital integrated circuit. Prototyped on a FPGA. (No GPU. No CPU) Just pure digital silicon running @karpathy microGPT, spelling out names on a tiny LCD. This is GateGPT 👇

0:24

132

768

35,819

Alberto Fuentes (e/acc) retweeted

Xiangxin Zhou @NickZhou523786

🚀 Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models We introduce Flow-DPPO, which replaces PPO-style ratio clipping with a divergence proximal constraint that is structurally inherent to flow models. 🔗arxiv.org/pdf/2606.11025

4,534

Alberto Fuentes (e/acc) retweeted

Francesco Bertolotti @f14bertolotti

11h

303

120,013

Alberto Fuentes (e/acc) retweeted

NYKTerry

@NykTerry

21h

The Knicks are single-handedly reviving print media btw

ESPN PR

@ESPNPR

22h

ESPN has produced an 80-pg special edition @nyknicks #NBAFinals commemorative magazine Ft. season-long coverage from ESPN's @VinceGoodwill, @ramonashelburne, William C. Rhoden & @NotoriousOHM Pre-order on Amazon (bit.ly/49VZs8k) & on newsstands Friday | #AlwaysKnicks

205

2,720

139,298

Alberto Fuentes (e/acc) retweeted

New York Basketball

@NBA_NewYork

Pope Leo XIV holding up a Knicks jersey yesterday

246

3,110

39,569

Alberto Fuentes (e/acc) retweeted

Feiteng

@FeitengLi

10h

Qwen robot suite Qwen-Robot released: bridging the last mile from large models to the physical world mp.weixin.qq.com/s/fLyXpGp5N… huggingface.co/papers/2606.1…

182

11,522