Vercel’s agent-browser is a Rust-based CLI designed specifically for AI agents, replacing Playwright’s screenshot-heavy approach with a lightweight accessibility tree that cuts token costs by 93% and speeds up automation.
Why Playwright Falls Short for AI Agents
Playwright was built for human developers writing test scripts, not for AI-driven workflows. When AI agents use Playwright (or its MCP variant), each step typically involves:
- Capturing a full-page screenshot
- Sending the image to the model
- Waiting for the model to interpret the pixels and decide the next action
This process repeats for every interaction (clicks, form submissions, page navigations), consuming tens of thousands of tokens along the way. Because the model is parsing an image rather than structured data, it often misses details, which makes the workflow slow, expensive, and unreliable.
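To make the scale concrete, here is a rough arithmetic sketch. The per-step token cost and the step count are hypothetical placeholders, not measurements, and applying the article's 93% figure uniformly to every step is an assumption made only for illustration.

```python
# Illustrative arithmetic only: the per-step cost is a hypothetical placeholder,
# and applying the 93% figure to each step is an assumption.

def cost_after_reduction(tokens_before: int, reduction_percent: int) -> int:
    """Return the token cost left after an integer-percent reduction."""
    return tokens_before * (100 - reduction_percent) // 100


def workflow_cost(tokens_per_step: int, steps: int) -> int:
    """Total tokens for a workflow in which every step costs the same."""
    return tokens_per_step * steps


HYPOTHETICAL_SCREENSHOT_STEP = 15_000  # assumed tokens per screenshot step
STEPS = 10                             # assumed number of interactions

screenshot_total = workflow_cost(HYPOTHETICAL_SCREENSHOT_STEP, STEPS)
tree_total = workflow_cost(
    cost_after_reduction(HYPOTHETICAL_SCREENSHOT_STEP, 93), STEPS
)

print(screenshot_total, tree_total)  # 150000 10500
```

Under these assumed inputs, a ten-step workflow drops from 150,000 tokens to 10,500. Real numbers depend on the page, the model, and how images are tokenized.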
How Agent-Browser Works
Agent-browser replaces screenshots with a compact accessibility tree that labels DOM elements with references like @e1: button "Sign in". The AI model selects a reference, and the tool executes the action directly. This approach eliminates unnecessary token consumption and speeds up execution.
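The select-a-reference step can be sketched in a few lines of Python. This is a minimal illustration that assumes snapshot lines look exactly like the example above (@e1: button "Sign in"); the real output format may differ. The Element class, parse_snapshot, and pick_ref are hypothetical helpers, the textbox and link lines are made-up sample data, and pick_ref merely stands in for the model's decision.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Matches lines shaped like the article's example: @e1: button "Sign in"
# (assumed format; real agent-browser output may differ).
REF_LINE = re.compile(r'^\s*(@e\d+):\s+(\w+)\s+"([^"]*)"\s*$')


@dataclass
class Element:
    ref: str   # e.g. "@e1"
    role: str  # e.g. "button"
    name: str  # e.g. "Sign in"


def parse_snapshot(text: str) -> List[Element]:
    """Extract every ref-labelled element, skipping lines that do not match."""
    elements = []
    for line in text.splitlines():
        match = REF_LINE.match(line)
        if match:
            elements.append(Element(*match.groups()))
    return elements


def pick_ref(elements: List[Element], role: str, name: str) -> Optional[str]:
    """Stand-in for the model's choice: first ref with this role and name."""
    for element in elements:
        if element.role == role and element.name == name:
            return element.ref
    return None


# Hypothetical sample snapshot; only the first line comes from the article.
snapshot = '''@e1: button "Sign in"
@e2: textbox "Email"
@e3: link "Forgot password?"'''

elements = parse_snapshot(snapshot)
print(pick_ref(elements, "button", "Sign in"))  # @e1
```

The point of the sketch is that the model only has to return a short string such as @e1, rather than pixel coordinates inferred from an image.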
Key features:
- Accessibility tree with refs: Instead of 2MB PNGs, the model receives structured text like @e1: button "Submit", reducing token usage by 93%.
- Rust-based performance: No Node.js overhead or Playwright runtime. The tool connects directly to a Chrome instance for fast execution.
- Semantic locators: Supports plain-language commands like agent-browser find role button click --name "Submit".
- Screenshots on demand: Only captures pixels when explicitly needed, further reducing token waste.
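For agents that drive the CLI from a script, the semantic-locator command shown in the list can be assembled as an argument list. This is a sketch that uses only the command form quoted above; find_by_role_command is a hypothetical helper, and the assumption that other roles and actions follow the same shape has not been checked against the tool's documentation.

```python
import shlex
from typing import List


def find_by_role_command(role: str, action: str, name: str) -> List[str]:
    """Build argv for: agent-browser find role <role> <action> --name <name>."""
    return ["agent-browser", "find", "role", role, action, "--name", name]


argv = find_by_role_command("button", "click", "Submit")
print(shlex.join(argv))  # agent-browser find role button click --name Submit
```

Once the CLI is installed, the list can be passed to subprocess.run(argv, check=True). Keeping the command as a list rather than a single string avoids shell-quoting problems when an element name contains spaces or quotes.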
Installation and Setup
Agent-browser can be installed globally in seconds. The fastest method is to use Claude Code with the prompt:
install agent-browser globally and run agent-browser install to download Chrome
For manual installation, run:
npm install -g agent-browser
agent-browser install
The second command downloads Chrome for Testing, Google’s official automation build. Mac users can also use Homebrew:
brew install agent-browser && agent-browser install
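A script that depends on the tool can check for the binary before calling it. The sketch below assumes only that the install puts an executable named agent-browser on PATH; is_installed is a hypothetical helper, and the check says nothing about whether Chrome itself was downloaded.

```python
import shutil


def is_installed(binary: str = "agent-browser") -> bool:
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(binary) is not None


if is_installed():
    print("agent-browser found on PATH")
else:
    print("agent-browser not found; try: npm install -g agent-browser")
```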
When to Use It
Agent-browser is the default choice for most AI-driven browser automation tasks, except for one-off screenshot jobs where token cost isn’t a concern. It’s particularly useful for:
- Multi-step workflows (e.g., form submissions, data extraction)
- Projects where token efficiency matters
- Integrations with AI coding assistants like Claude Code
Tradeoffs
While agent-browser is optimized for AI agents, it lacks some of Playwright’s features for human-written tests, such as:
- Detailed debugging tools for manual testers
- Support for non-Chrome browsers (e.g., Firefox, Safari)
- Advanced screenshot customization
For teams already invested in Playwright’s ecosystem, migrating to agent-browser may require rewriting existing test scripts.
Bottom Line
Agent-browser is the first tool built from the ground up for AI agents, not human testers. By replacing screenshots with structured accessibility trees, it slashes token costs and speeds up automation without sacrificing accuracy. With over 31,000 GitHub stars and backing from Vercel Labs, it’s quickly becoming the standard for AI-driven browser interactions.