WildDet3D, Seedance 2.0 by ByteDance, and agentic AI frameworks top this week's research

This week on Hugging Face Daily Papers:
• WildDet3D (238 upvotes) - Breakthrough in promptable 3D detection supporting text, point, and box prompts across 13.5K categories in the wild
• Seedance 2.0 by ByteDance (139 upvotes) - Native multi-modal audio-video generation supporting text, image, audio, and video inputs with up to 15-second outputs
• ClawGUI (137 upvotes) - Complete open-source framework for training, evaluating, and deploying GUI agents with online RL and real device support
• The Past Is Not Past (135 upvotes) - Memory-enhanced dynamic reward shaping that reduces repetitive errors in RL training
• GameWorld (110 upvotes) - Standardized benchmark for evaluating multimodal game agents in browser environments
• RationalRewards (100 upvotes) - Reasoning-based reward models that scale visual generation at both training and test time
• KnowRL with NVIDIA Nemotron (96 upvotes) - Knowledge-guided RL framework achieving state-of-the-art reasoning at 1.5B scale

All models and artifacts available on Hugging Face.
THIS IS MIND BLOWING. Zhejiang University just open-sourced the first complete GUI agent pipeline that trains, evaluates, AND deploys to real phones. It's called ClawGUI. It has 3 modules in 1 framework:
- ClawGUI-RL trains agents on real Android/iOS devices, not just sandboxes
- ClawGUI-Eval reproduces results across 6 benchmarks and 11 models at a 95.8% reproduction rate
- ClawGUI-Agent puts trained agents on your phone through Telegram, Slack, Discord, and 9 more platforms

Their 2B model outperformed Qwen3-VL-32B. 16x smaller. Better results. The gap between AI research and your actual phone just closed.
Apr 15
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
paper: huggingface.co/papers/2604.1…
🚨 Zhejiang University just open sourced a complete framework for training AI agents that control your phone by looking at the screen. Not through APIs. Not through special integrations. Through taps, swipes, and keystrokes, exactly like a human would.

It's called ClawGUI. And it works on Android, HarmonyOS, and iOS out of the box.

Here's why this matters. Every AI agent you've seen demo'd (the ones booking flights, ordering food, sending emails) works through APIs and programmatic access. The app has to be integrated. The developer has to build the connection. Anything without an API is off-limits.

Most apps don't have APIs. Most software on your phone was never designed for AI control. The long tail of applications (the obscure tools, the enterprise software, the legacy apps) is completely inaccessible to current AI agents.

GUI agents fix this by doing what humans do. They look at the screen. They read the interface. They decide where to tap. No API required. Any app that works for a human works for a GUI agent.

ClawGUI is the first complete open-source infrastructure to build, train, evaluate, and deploy these agents, all in one framework. Here's what it actually includes:
→ ClawGUI-RL: First open-source GUI agent reinforcement learning infrastructure with support for both virtual environments and real physical devices simultaneously
→ ClawGUI-Eval: Standardized evaluation pipeline across 6 benchmarks and 11 models, with 95.8% reproduction against official baselines, meaning you can actually trust the numbers
→ ClawGUI-Agent: Deploys trained agents to real devices through 12 chat platforms, so you can chat with your phone agent through WhatsApp, Telegram, Slack, or whatever you already use

Here's the wildest part. They trained ClawGUI-2B, a 2 billion parameter model, entirely within this pipeline. On MobileWorld GUI-Only, it achieves a 17.1% success rate, beating the same-scale baseline by 6 full percentage points. A 2B model. Controlling a phone. Trained end-to-end in an open-source pipeline anyone can reproduce.

Here's why the infrastructure matters more than the benchmark. The reason GUI agents haven't taken off despite years of research isn't capability. It's fragmentation. Training pipelines are closed. Evaluation metrics drift between papers so you can't compare results. Trained models never reach real devices. Every team building GUI agents has been rebuilding the same infrastructure from scratch.

ClawGUI removes that bottleneck entirely. Train, evaluate, and deploy to a real phone from a single open-source framework. No closed pipelines. No proprietary training infrastructure. No results you can't reproduce.

100% Open Source. Model available on Hugging Face now. GitHub link in the comments 👇
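
To make the "look at the screen, decide where to tap" idea concrete, here is a minimal sketch of an observe-decide-act loop. It is not ClawGUI's API: `FakeDevice`, `toy_policy`, `run_agent`, and the dict-based "screen" are all hypothetical stand-ins, and a real GUI agent would feed screenshot pixels to a vision-language model instead of matching labels.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str      # "tap" or "done" in this toy example
    x: int = 0
    y: int = 0


class FakeDevice:
    """Hypothetical stand-in for a phone with a single 'Login' button."""

    def __init__(self):
        self.logged_in = False
        self.log = []  # every action the agent performed

    def screenshot(self):
        # A real agent would get pixels; here the screen is a list of elements.
        if self.logged_in:
            return {"elements": []}
        return {"elements": [{"label": "Login", "x": 120, "y": 640}]}

    def perform(self, action):
        self.log.append(action)
        if action.kind == "tap" and (action.x, action.y) == (120, 640):
            self.logged_in = True


def toy_policy(screen, goal):
    """Hypothetical policy: tap the element whose label matches the goal."""
    for element in screen["elements"]:
        if element["label"].lower() == goal.lower():
            return Action("tap", element["x"], element["y"])
    return Action("done")  # nothing left to do on this screen


def run_agent(device, policy, goal, max_steps=10):
    """Observe, decide, act; returns the number of actions performed."""
    for step in range(max_steps):
        screen = device.screenshot()      # observe
        action = policy(screen, goal)     # decide
        if action.kind == "done":
            return step
        device.perform(action)            # act
    return max_steps


device = FakeDevice()
steps = run_agent(device, toy_policy, "login")
```

The loop never calls an app-specific API: everything the policy knows comes from the observed screen, which is the property the post attributes to GUI agents.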
ClawGUI: A full-stack framework for GUI agents Train with online RL using GiGPO, evaluate with 95.8% reproduction across 6 benchmarks, and deploy to real devices. ClawGUI-2B achieves 17.1% on MobileWorld vs 11.1% baseline.
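
As a quick check on the two figures quoted above (17.1% versus an 11.1% baseline), the sketch below separates the absolute gain in percentage points from the relative gain. The helper name `improvement` is made up for illustration; only the two success rates come from the post.

```python
def improvement(new_pct, base_pct):
    """Return (absolute gain in points, relative gain in percent), rounded to 1 decimal."""
    absolute = new_pct - base_pct
    relative = absolute / base_pct * 100
    return round(absolute, 1), round(relative, 1)


gain = improvement(17.1, 11.1)
print(gain)  # (6.0, 54.1)
```

So the reported result is a 6.0-point absolute gain, which is roughly a 54% relative improvement over the baseline.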
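🔥 Today's Fresh Skills: Top 10 Picks (April 14), curated by agentskillshub.top! Discover 53000 AI Agent tools, updated daily 🚀 Quant-trading agents enter the fray, MATLAB officially joins in, and Claude Skills start to "self-evolve"!!!

1️⃣ 🆕 HKUDS/Vibe-Trading ⭐ 1,795 | 22
Your personal trading agent: Vibe Trading, a quantitative trading agent from a University of Hong Kong team
🔗 github.com/HKUDS/Vibe-Tradin…
🌐 agentskillshub.top/skill/HKU…

2️⃣ 🆕 ZJU-REAL/ClawGUI ⭐ 108 | 42
Full-stack GUI Agent solution from Zhejiang University: online RL training, standardized evaluation, one-click deployment
🔗 github.com/ZJU-REAL/ClawGUI
🌐 agentskillshub.top/skill/ZJU…

3️⃣ 🆕 lucasrosati/claude-code-memory-setup ⭐ 87 | 39
Builds a Claude Code memory system with Obsidian Graph, saving up to 71.5x tokens in a single session
🔗 github.com/lucasrosati/claud…
🌐 agentskillshub.top/skill/luc…

4️⃣ 🆕 alchaincyf/darwin-skill ⭐ 76 | 52
Darwin Skill: lets Claude Code Skills evolve autonomously and optimize themselves automatically, inspired by AutoResearch
🔗 github.com/alchaincyf/darwin…
🌐 agentskillshub.top/skill/alc…

5️⃣ 🆕 matlab/matlab-agentic-toolkit ⭐ 57 | 57
MATLAB's official Agentic Toolkit: brings MATLAB's computing power to AI agents
🔗 github.com/matlab/matlab-age…
🌐 agentskillshub.top/skill/mat…

6️⃣ 🆕 beshuaxian/higgsfield-seedance2-jineng ⭐ 55 | 11
Seedance 2.0 × Higgsfield skill set: 15 Prompt Skills for generating AI video
🔗 github.com/beshuaxian/higgsf…
🌐 agentskillshub.top/skill/bes…

7️⃣ 🆕 Amb2rZhou/intern-clawd ⭐ 50 | 0
Claude Code personal secretary OS: persistent memory, ritualized workflows, schedule management
🔗 github.com/Amb2rZhou/intern-…
🌐 agentskillshub.top/skill/Amb…

8️⃣ 🆕 op7418/Seedance-Product-Video ⭐ 37 | 23
Seedance product video skill from 向阳乔木: generates a product promo video from a single sentence
🔗 github.com/op7418/Seedance-P…
🌐 agentskillshub.top/skill/op7…

9️⃣ 🆕 terancejiang/financial-report-minesweeper ⭐ 37 | 2
Financial report minesweeper: fraud and risk detection for A-share financial reports, based on Tang Chao's (唐朝) methodology
🔗 github.com/terancejiang/fina…
🌐 agentskillshub.top/skill/ter…

🔟 🆕 Hainrixz/cyber-neo ⭐ 26 | 26
Open-source cybersecurity analysis agent: automatically scans projects for vulnerabilities and generates security reports
🔗 github.com/Hainrixz/cyber-ne…
🌐 agentskillshub.top/skill/Hai…

🎯 Today's trend: agent use cases are going vertical fast. With quant trading (Vibe-Trading), A-share financial report analysis, cybersecurity, and product video, Skills are moving from general-purpose tools into specialized domains, and MATLAB's official entry is a landmark event 🔍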
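In under three days, I built agent skills hub (happy holiday to all the goddesses 👀) agentskillshub.top/

Why build this product?
1️⃣ How do you find the right skills among so many? To solve the problem of finding high-quality skills, I built categories, recently updated lists, weekly rankings, and more
2️⃣ Find the right people: the top skills authors
3️⃣ Recommend skill combinations, to give people ideas and room for more imagination

Site: agentskillshub.top/
- Trending
- Skills Masters
- Organization Builders
- Recently Updated
- Top Rated
- Browse by Category
- Scenario Workflows (recommended combinations by scenario)

Open-source project: github.com/zhuyansen/agent-s…

Pipeline:
1. Data collection
2. Data cleaning
3. Quality assessment: combines indicators drawn from papers, GitHub platform metrics, token estimates, and composability into a final weighted score
4. Data display

The front end and back end are pure vibe coding, so please go easy on it. If you hit a problem, open an issue and I'll have Claude Code handle it 👀 Submissions of good skills, skills masters, and scenarios for combining skills are also welcome. A star would be even better. You can subscribe to the weekly newsletter for free to get updates ✉️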
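
The quality-assessment step in the post above mentions a final weighted score over several indicators. A minimal sketch of what such a score could look like follows. The weights, the 0-to-1 signal scale, and the names `WEIGHTS` and `quality_score` are assumptions for illustration, since the post does not give the actual formula.

```python
# Hypothetical weights; the post does not state the real ones.
WEIGHTS = {
    "paper": 0.2,          # grounding in published papers
    "github": 0.4,         # GitHub platform metrics (stars, forks, activity)
    "tokens": 0.2,         # token-cost estimate (higher = cheaper to use)
    "composability": 0.2,  # how well the skill combines with others
}


def quality_score(signals, weights=WEIGHTS):
    """Weighted average of signals in [0, 1]; missing signals count as 0."""
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * signals.get(name, 0.0) for name in weights)
    return weighted / total_weight


example = {"paper": 0.5, "github": 0.8, "tokens": 0.6, "composability": 1.0}
score = quality_score(example)  # about 0.74 with the weights above
```

Dividing by the total weight keeps the score in the same 0-to-1 range even if the weights are later changed and no longer sum to exactly 1.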