Filter
Exclude
Time range
-
Near
Why we ran the benchmark with a baseline and an oracle control on every arm: without the no-gateway baseline you can't claim a token cut, and without the oracle upper bound you can't tell a ranking miss from a model mistake. Two suites, retrieval on MetaTool and an agentic campaign judged against those controls. It's all in the repo if you want to poke holes in it.
1
1
31
Replying to @Gaurav_vij137
We did an internal benchmark with a mix of datasets like MetaTool etc. Curious about the results of your run tbh
13
Thanks for going down this rabbit hole ❤️ Follow @anishmoonka for daily stories across science, history, psychology, culture & AI. ————— Sources: Olkowicz et al. 2016 PNAS — Birds have primate-like numbers of neurons in the forebrain pnas.org/doi/10.1073/pnas.15… Nieder, Wagener & Rinnert 2020 Science — A neural correlate of sensory consciousness in a corvid bird science.org/doi/10.1126/scie… Marzluff et al. 2010 Animal Behaviour — Lasting recognition of threatening people by wild American crows sciencedirect.com/science/ar… Jelbert et al. 2014 PLOS ONE — Causal understanding of water displacement by New Caledonian crows journals.plos.org/plosone/ar… Gruber et al. 2019 Current Biology — New Caledonian crows use mental representations to solve metatool problems cell.com/current-biology/ful…
4
4,396
ほんと、meta 広告配信ツールは、UIがクソツール。全世界の人が拍手して同意支持率100 パーなんじゃないか?テクノロジーの古代バグシステム。もう!やりにくい。#meta #metatool
2
162
OH! But you're fine with anti white racism, promotion of illegal aliens, you seem to be fine with identity politics in games, you also made it clear you hate pretty in video games. Shut up, Metatool.
28 Oct 2025
"Why do you care?". "Why does it matter?". "It's just one screen!". Ah, now you know how it feels when politics is pushed in games do you? Annoying is it? All the left wingers who got offended by this Halo screen are hypocrites. At this point I would say Make Halo Great Again just in revenge. But differently from you I am consistent in my principles. So I say: no politics in games.
2
70
Many-shot in-context learning works best when the prompt mixes a small, tailored slice with a big, reusable backbone. This paper shows 2 simple ways to do that, cutting cost while keeping or improving accuracy. The trick is a hybrid selection that stays mostly cached. Strategy 1 picks 20 demonstrations most similar to the test case, then fills the rest with a fixed random set that can be cached, so a 100-shot prompt becomes 20 tailored plus 80 cached. Strategy 2 swaps the random cache for a smarter fixed cache, it clusters test samples with k-means to find centroids and caches demonstrations closest to those centroids, which makes the cached part diverse and still reusable. Pure similarity selection rewrites the whole prompt for every test sample, caching breaks, and cost grows roughly with tokens squared. These hybrids keep most tokens cached, so cost grows roughly with the cached content, not the full prompt. In tests on ANLI, TREC, GSM Plus, and MetaTool using Gemini Pro and Gemini Flash, the hybrids matched the best baseline or did better, while being about 2x cheaper at 50 shots and about 10x cheaper at 200 shots. The ratio is a knob. In low-data setups, pushing more slots to the tailored slice gave around 3% to 6% accuracy gains over simply using the full small pool. The core idea is practical, keep a large cached backbone for speed, then add a small personalized slice for relevance. ---- Paper – arxiv. org/abs/2507.16217 Paper Title: "Towards Compute-Optimal Many-Shot In-Context Learning"
2
1
9
1,999
LLM agents' tool selection is vulnerable. Attackers can manipulate this process. This paper introduces ToolHijacker. ToolHijacker crafts malicious tool documents. These documents make the agent choose the attacker's tool for a target task. Methods Explored in this Paper 🔧: → ToolHijacker formulates malicious tool document creation as an optimization problem. → It employs a two-phase strategy, optimizing separate tool description subsequences for retrieval and selection. → The paper develops gradient-free and gradient-based methods to optimize these subsequences. 📌 ToolHijacker's two-phase optimization cleverly targets distinct vulnerabilities in the tool selection pipeline. 📌 Its no-box attack success (e.g., 96.43% on MetaTool) highlights significant LLM agent vulnerabilities. 📌 Current defenses (StruQ, SecAlign, perplexity detection) prove insufficient against such sophisticated attacks. ---------------------------- Paper - arxiv. org/abs/2504.19793v1 Paper Title: "Prompt Injection Attack to Tool Selection in LLM Agents"
4
20
1,754
Compartimos imágenes 🎥 del encuentro 'Robot sapiens', que contó con la exhibición de una talla de herramientas y la demostración de su uso por un robot humanoide. 🤖 La jornada fue organizada por el proyecto europeo METATOOL con la colaboración del MEH. 🫱🏻‍🫲🏼
1
7
673
El proyecto europeo METATOOL se basa en procesos cognitivos humanos para desarrollar nuevos robots rtve.es/noticias/20250403/in…
La inteligencia artificial avanza a pasos agigantados. Ahora está camino de conseguir que los robots sean capaces de inventar y diseñar sus propias herramientas. ▶rtve.es/play/videos/directo/…
1
4,461
Un viaje entre el pasado y el futuro. 🌍✨ La evolución humana se encuentra con la robótica en el proyecto Metatool. ¿Estamos listos para ser transhumanos? #Reflexión #Evolución @museoevolucion @MetaTool Project EU leer.elcorreodeburgos.com/q2…

2
215
Un viaje entre el pasado y el futuro. 🌍✨ La evolución humana se encuentra con la robótica en el proyecto Metatool. ¿Estamos listos para ser transhumanos? #Reflexión #Evolución @museoevolucion @MetaTool Project EU leer.elcorreodeburgos.com/q2…

2
182
Un viaje entre el pasado y el futuro. 🌍✨ La evolución humana se encuentra con la robótica en el proyecto Metatool. ¿Estamos listos para ser transhumanos? #Reflexión #Evolución @museoevolucion @MetaTool Project EU leer.elcorreodeburgos.com/q2…

1
3
341
Un viaje entre el pasado y el futuro. 🌍✨ La evolución humana se encuentra con la robótica en el proyecto Metatool. ¿Estamos listos para ser transhumanos? #Reflexión #Evolución @museoevolucion @MetaTool Project EU leer.elcorreodeburgos.com/q2…

3
208
💥 We made (pre)history! At @museoevolucion, our robot #TIAGo solved a problem using tools and reasoning in a live demo—part of the @MetaTool_EU Project exploring prehistoric-inspired #AI adaptability. A milestone for #robotics research! #MetaTool #robots #AI
3
3
14
1,368
El Museo de la Evolución Humana acoge una jornada sobre inteligencia artificial, neurociencia y arqueología con la demostración en vivo de un robot humanoide. 🤖🍎🦴 museoevolucionhumana.com/es/… @CSIC @CSICdivulga #metatool @FATAPUERCA
2
3
23
1,668
Encuentro 🟢 METATOOL MEH: Robot Habilis: Robots que inventan herramientas. 🤖 Durante la jornada TIAGo, el robot de PAL ROBOTICS, y un experto en talla lítica, harán demostraciones sobre la producción de herramientas en piedra. 📆 02/04 ⌚ 11-14h y 17-19h 📍 2ª Planta MEH
1
4
8
525
🟠 METATOOL MEH. Robots vs Sapiens. ¿es necesaria la consciencia para inventar herramientas? 👉🏼 Con la participación de la periodista @RosaTristan , el arqueólogo @eudaldcarbonell y el investigador en lA y Robótica, Pablo Llanillos. 📆 02/04 ⌚ 20.15h 📍 Salón de actos MEH
6
9
694
On April 2, 2025 a unique robotic experience will be presented in the Museum of Human Evolution. Based on knowledge about prehistoric man, scientists are developing robots that make better tools or use them differently. This development is in full swing in the Metatool project.
2
2
6
621