I've been trying out Hermes agent and I have a question/some feedback on the browser use implementation.
Using Vercel's agent-browser under the hood is a great call. It's probably the most underrated dev tool Vercel Labs has built.
What I’m less clear on is why the native agent-browser CLI/skill isn’t also a first-class path for the agent alongside Hermes’ browser_* tools.
The wrapped tools expose only a subset of what agent-browser can do, so they feel like a watered-down version of it. Some big features Hermes is missing out on because of this:
- Persistent Chrome profiles / saved login state
- Cookies, localStorage, sessionStorage, saved browser state
- Richer interactions: hover, focus, drag, select, check/uncheck, double-click, wait, semantic find, raw keyboard/mouse
- Tabs/windows/session management
- Auth vault and state management
- File upload/download
- Debugging/DevTools features like network logs, HAR, etc.
@Teknium curious about the decision here