Roberto Stagi

Roberto Stagi

Users
Tweets

Roberto Stagi

@rstagi_

Jun 11

Why we ran the benchmark with a baseline and an oracle control on every arm: without the no-gateway baseline you can't claim a token cut, and without the oracle upper bound you can't tell a ranking miss from a model mistake. Two suites, retrieval on MetaTool and an agentic campaign judged against those controls. It's all in the repo if you want to poke holes in it.

Giac

Giac

@Giac_nicoli

Jun 8

Replying to @Gaurav_vij137

We did an internal benchmark with a mix of datasets like MetaTool etc. Curious about the results of your run tbh

Roberto Stagi

Roberto Stagi

@rstagi_

Jun 3

x.com/i/article/206211723254…

2,439

Anish Moonka

Anish Moonka

@anishmoonka

May 13

Thanks for going down this rabbit hole ❤️ Follow @anishmoonka for daily stories across science, history, psychology, culture & AI. ————— Sources: Olkowicz et al. 2016 PNAS — Birds have primate-like numbers of neurons in the forebrain pnas.org/doi/10.1073/pnas.15… Nieder, Wagener & Rinnert 2020 Science — A neural correlate of sensory consciousness in a corvid bird science.org/doi/10.1126/scie… Marzluff et al. 2010 Animal Behaviour — Lasting recognition of threatening people by wild American crows sciencedirect.com/science/ar… Jelbert et al. 2014 PLOS ONE — Causal understanding of water displacement by New Caledonian crows journals.plos.org/plosone/ar… Gruber et al. 2019 Current Biology — New Caledonian crows use mental representations to solve metatool problems cell.com/current-biology/ful…

Birds have primate-like numbers of neurons in the forebrain | PNAS

Some birds achieve primate-like levels of cognition, even though their brains tend to be much smaller in absolute size. This poses a fundamental pr...

pnas.org

4,396

ゆめかなうクラウドの設計者/Yumekanau Founder｜Soulwork Lab

ゆめかなうクラウドの設計者/Yumekanau Founder｜Soulwork Lab @yumekanau19

Feb 2

ほんと、meta 広告配信ツールは、UIがクソツール。全世界の人が拍手して同意支持率100 パーなんじゃないか？テクノロジーの古代バグシステム。もう！やりにくい。#meta #metatool

162

Pixel Plus 🦊🔑 《INDIE VTUBER》

Pixel Plus 🦊🔑 《INDIE VTUBER》

@PixelPlusVtuber

29 Oct 2025

OH! But you're fine with anti white racism, promotion of illegal aliens, you seem to be fine with identity politics in games, you also made it clear you hate pretty in video games. Shut up, Metatool.

0:04

Metatron

@pureMetatron

28 Oct 2025

"Why do you care?". "Why does it matter?". "It's just one screen!". Ah, now you know how it feels when politics is pushed in games do you? Annoying is it? All the left wingers who got offended by this Halo screen are hypocrites. At this point I would say Make Halo Great Again just in revenge. But differently from you I am consistent in my principles. So I say: no politics in games.

Rohan Paul

Rohan Paul

@rohanpaul_ai

11 Aug 2025

Many-shot in-context learning works best when the prompt mixes a small, tailored slice with a big, reusable backbone. This paper shows 2 simple ways to do that, cutting cost while keeping or improving accuracy. The trick is a hybrid selection that stays mostly cached. Strategy 1 picks 20 demonstrations most similar to the test case, then fills the rest with a fixed random set that can be cached, so a 100-shot prompt becomes 20 tailored plus 80 cached. Strategy 2 swaps the random cache for a smarter fixed cache, it clusters test samples with k-means to find centroids and caches demonstrations closest to those centroids, which makes the cached part diverse and still reusable. Pure similarity selection rewrites the whole prompt for every test sample, caching breaks, and cost grows roughly with tokens squared. These hybrids keep most tokens cached, so cost grows roughly with the cached content, not the full prompt. In tests on ANLI, TREC, GSM Plus, and MetaTool using Gemini Pro and Gemini Flash, the hybrids matched the best baseline or did better, while being about 2x cheaper at 50 shots and about 10x cheaper at 200 shots. The ratio is a knob. In low-data setups, pushing more slots to the tailored slice gave around 3% to 6% accuracy gains over simply using the full small pool. The core idea is practical, keep a large cached backbone for speed, then add a small personalized slice for relevance. ---- Paper – arxiv. org/abs/2507.16217 Paper Title: "Towards Compute-Optimal Many-Shot In-Context Learning"

1,999

Rohan Paul

Rohan Paul

@rohanpaul_ai

16 May 2025

LLM agents' tool selection is vulnerable. Attackers can manipulate this process. This paper introduces ToolHijacker. ToolHijacker crafts malicious tool documents. These documents make the agent choose the attacker's tool for a target task. Methods Explored in this Paper 🔧: → ToolHijacker formulates malicious tool document creation as an optimization problem. → It employs a two-phase strategy, optimizing separate tool description subsequences for retrieval and selection. → The paper develops gradient-free and gradient-based methods to optimize these subsequences. 📌 ToolHijacker's two-phase optimization cleverly targets distinct vulnerabilities in the tool selection pipeline. 📌 Its no-box attack success (e.g., 96.43% on MetaTool) highlights significant LLM agent vulnerabilities. 📌 Current defenses (StruQ, SecAlign, perplexity detection) prove insufficient against such sophisticated attacks. ---------------------------- Paper - arxiv. org/abs/2504.19793v1 Paper Title: "Prompt Injection Attack to Tool Selection in LLM Agents"

1,754

MuseoEvoluciónHumana

MuseoEvoluciónHumana @museoevolucion

8 Apr 2025

Compartimos imágenes 🎥 del encuentro 'Robot sapiens', que contó con la exhibición de una talla de herramientas y la demostración de su uso por un robot humanoide. 🤖 La jornada fue organizada por el proyecto europeo METATOOL con la colaboración del MEH. 🫱🏻‍🫲🏼

0:15

673

RTVE Noticias

RTVE Noticias

@rtvenoticias

3 Apr 2025

El proyecto europeo METATOOL se basa en procesos cognitivos humanos para desarrollar nuevos robots rtve.es/noticias/20250403/in…

La inteligencia artificial da un salto evolutivo con robots que crean sus propias herramientas

rtve.es

Telediarios de TVE @telediario_tve

3 Apr 2025

La inteligencia artificial avanza a pasos agigantados. Ahora está camino de conseguir que los robots sean capaces de inventar y diseñar sus propias herramientas. ▶rtve.es/play/videos/directo/…

1:29

4,461

El Correo de Burgos

El Correo de Burgos @correodeburgos

3 Apr 2025

Un viaje entre el pasado y el futuro. 🌍✨ La evolución humana se encuentra con la robótica en el proyecto Metatool. ¿Estamos listos para ser transhumanos? #Reflexión #Evolución @museoevolucion @MetaTool Project EU leer.elcorreodeburgos.com/q2…

215

El Correo de Burgos

El Correo de Burgos @correodeburgos

3 Apr 2025

182

El Correo de Burgos

El Correo de Burgos @correodeburgos

2 Apr 2025

341

El Correo de Burgos

El Correo de Burgos @correodeburgos

2 Apr 2025

208

PAL Robotics

PAL Robotics @PALRobotics

2 Apr 2025

💥 We made (pre)history! At @museoevolucion, our robot #TIAGo solved a problem using tools and reasoning in a live demo—part of the @MetaTool_EU Project exploring prehistoric-inspired #AI adaptability. A milestone for #robotics research! #MetaTool #robots #AI

1,368

MuseoEvoluciónHumana

MuseoEvoluciónHumana @museoevolucion

2 Apr 2025

El Museo de la Evolución Humana acoge una jornada sobre inteligencia artificial, neurociencia y arqueología con la demostración en vivo de un robot humanoide. 🤖🍎🦴 museoevolucionhumana.com/es/… @CSIC @CSICdivulga #metatool @FATAPUERCA

1,668

MuseoEvoluciónHumana

MuseoEvoluciónHumana @museoevolucion

1 Apr 2025

Encuentro 🟢 METATOOL MEH: Robot Habilis: Robots que inventan herramientas. 🤖 Durante la jornada TIAGo, el robot de PAL ROBOTICS, y un experto en talla lítica, harán demostraciones sobre la producción de herramientas en piedra. 📆 02/04 ⌚ 11-14h y 17-19h 📍 2ª Planta MEH

525

MuseoEvoluciónHumana

MuseoEvoluciónHumana @museoevolucion

1 Apr 2025

🟠 METATOOL MEH. Robots vs Sapiens. ¿es necesaria la consciencia para inventar herramientas? 👉🏼 Con la participación de la periodista @RosaTristan , el arqueólogo @eudaldcarbonell y el investigador en lA y Robótica, Pablo Llanillos. 📆 02/04 ⌚ 20.15h 📍 Salón de actos MEH

694

juandoming

juandoming @juandoming

1 Apr 2025

Just a metatool? Some thoughts why generative AIs are not tools jondron.ca/just-a-metatool-s… a través de @jondron

Just a metatool? Some thoughts why generative AIs are not tools

Many people brush generative AI aside as being just a tool. ChatGPT describes itself as such (I asked). I think it’s more complicated than that, and this post is going to be an attempt to explain w…

jondron.ca

MetaTool Project EU

MetaTool Project EU @MetaTool_EU

21 Feb 2025

On April 2, 2025 a unique robotic experience will be presented in the Museum of Human Evolution. Based on knowledge about prehistoric man, scientists are developing robots that make better tools or use them differently. This development is in full swing in the Metatool project.

621