# Browser

A browser-operating agent that interacts with live web pages using Playwright-based tools with accessibility-tree ref ids.

<DocsBaseSwitcher base="mastra" agent="browser-agent" />

<AgentPreview
  agent="browser-agent"
  framework="mastra"
  inputFields="[
  {
    name: &#x22;task&#x22;,
    label: &#x22;Task&#x22;,
    placeholder:
      &#x22;Go to news.ycombinator.com and list the top 3 story titles.&#x22;,
    type: &#x22;textarea&#x22;,
  },
]"
/>

## Summary [#summary]

The **Browser Agent** completes tasks on the live web by driving a real Chromium
browser. It targets elements through Playwright-based tools that use
accessibility-tree ref ids for stable element targeting, falls back to web
search for quick factual questions, and persists state to Turso (libSQL).
Reach for it to automate flows, scrape interactive pages, or test UIs.

## Installation [#installation]

<CodeTabs>
  <TabsList>
    <TabsTrigger value="cli">
      Command
    </TabsTrigger>

    <TabsTrigger value="manual">
      Manual
    </TabsTrigger>
  </TabsList>

  <TabsContent value="cli">
    ```bash
    npx shadcn@latest add @agentcn/mastra/browser-agent
    ```
  </TabsContent>

  <TabsContent value="manual">
    <Steps>
      <Step>
        Install the following dependencies:
      </Step>

      ```bash
      npm install @mastra/core @mastra/agent-browser @mastra/libsql @mastra/memory @ai-sdk/openai playwright-chromium
      ```

      <Step>
        Copy and paste the following code into your project.
      </Step>

      <ComponentSource src="registry/mastra/browser-agent/config.ts" title="config.ts" />

      <ComponentSource src="registry/mastra/browser-agent/instructions.md" title="instructions.md" />

      <ComponentSource src="registry/mastra/browser-agent/memory.ts" title="memory.ts" />

      <Step>
        Update the import paths to match your project setup.
      </Step>
    </Steps>
  </TabsContent>
</CodeTabs>

## Composition [#composition]

```text
├── config.ts          # Agent config with browser, model, and web_search tool
├── instructions.md    # Comprehensive browser operation instructions
└── memory.ts          # Memory with last 10 messages
```

## Environment Variables [#environment-variables]

```bash
BROWSER_HEADLESS=      # Optional: Set to "false" for visible browser (default: true)
BROWSER_CDP_URL=       # Optional: CDP endpoint for remote browser
TURSO_DATABASE_URL=    # Required: Turso libSQL URL for storage
TURSO_AUTH_TOKEN=      # Optional: Turso auth token
```
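As a rough sketch of how these variables might be read and validated at startup, the helper below takes an environment map and returns typed settings. `readBrowserEnv` and `BrowserEnv` are hypothetical names used for illustration; they are not part of the registry files, and the shipped `config.ts` may read these values differently.

```typescript
// Hypothetical helper for illustration; not part of the registry files.
interface BrowserEnv {
  headless: boolean;
  cdpUrl?: string;
  databaseUrl: string;
  authToken?: string;
}

function readBrowserEnv(env: Record<string, string | undefined>): BrowserEnv {
  const databaseUrl = env.TURSO_DATABASE_URL;
  if (!databaseUrl) {
    // TURSO_DATABASE_URL is the only variable documented as required.
    throw new Error("TURSO_DATABASE_URL is required");
  }

  return {
    // Headless unless explicitly set to "false" (documented default: true).
    headless: env.BROWSER_HEADLESS !== "false",
    // Empty strings are treated as "not set" (an assumption of this sketch).
    cdpUrl: env.BROWSER_CDP_URL || undefined,
    databaseUrl,
    authToken: env.TURSO_AUTH_TOKEN || undefined,
  };
}
```

In a Node.js project this would typically be called as `readBrowserEnv(process.env)`, so a missing database URL fails fast instead of surfacing later as a storage error.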

## How It Works [#how-it-works]

1. **Navigate** - Call `browser_goto` with the target URL
2. **Snapshot** - Call `browser_snapshot` to get the accessibility tree with ref ids
3. **Interact** - Use ref ids from the snapshot for `browser_click`, `browser_type`, etc.
4. **Re-snapshot** - Take a new snapshot after each action to see the updated state
5. **Verify** - Take a final snapshot to confirm results before reporting
6. **Fallback** - Use `web_search` for quick factual questions
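
In practice the agent issues these tool calls itself, guided by `instructions.md`. The sketch below only mirrors the order of steps 1 to 4 in plain TypeScript. The `BrowserTools` and `SnapshotNode` shapes and the `clickByName` helper are assumptions made for illustration; the real tools' inputs and outputs may differ.

```typescript
// Hypothetical shapes for illustration only; the real tool schemas may differ.
interface SnapshotNode {
  ref: string;  // ref id assigned by the snapshot
  role: string; // accessibility role, e.g. "button"
  name: string; // accessible name, e.g. "Submit"
}

interface BrowserTools {
  browser_goto(url: string): Promise<void>;
  browser_snapshot(): Promise<SnapshotNode[]>;
  browser_click(ref: string): Promise<void>;
}

// Navigate, snapshot, click an element found by role and name, then re-snapshot.
async function clickByName(
  tools: BrowserTools,
  url: string,
  role: string,
  name: string,
): Promise<SnapshotNode[]> {
  await tools.browser_goto(url);                 // 1. Navigate
  const before = await tools.browser_snapshot(); // 2. Snapshot

  const target = before.find((n) => n.role === role && n.name === name);
  if (!target) {
    throw new Error(`No ${role} named "${name}" in snapshot`);
  }

  await tools.browser_click(target.ref);         // 3. Interact via ref id

  // 4. Re-snapshot: this sketch assumes refs from `before` may no longer be
  // valid after the click, so callers should only use the returned nodes.
  return tools.browser_snapshot();
}
```

Targeting by ref id rather than by CSS selector is what keeps the interaction tied to the accessibility tree the agent just observed, which is why each action is followed by a fresh snapshot.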

## Customization [#customization]

* **Remote browser.** Set `BROWSER_CDP_URL` to connect to a remote CDP endpoint, such as the Bright Data Browser API.
* **Headless mode.** Set `BROWSER_HEADLESS=false` to run a visible browser during development.
* **Swap models.** Edit the `model` field in `config.ts`.
* **Adjust steps.** Modify `maxSteps` in `config.ts` (default: 100).
