CrewAI + Anchor: Multi-Agent Browser Workflows in Python

Hands On
Jun 24
by Idan Raman
CrewAI + Anchor: Multi-Agent Browser Workflows

Why Multi-Agent Browser Workflows?

Most browser agents are lone wolves: a single LLM picks up a task, opens a tab, and tries to finish it in one context window. That works for simple jobs, but falls apart the moment you need to research six competitor sites in parallel, cross-reference findings, and produce a structured report—all without hallucination bleeding between steps.

CrewAI solves the coordination problem. It lets you define a crew of specialized agents—each with its own role, backstory, and toolset—and wire them together into a task pipeline. Pair that with Anchor's cloud browser sessions and each agent gets a clean, isolated, fingerprint-consistent browser environment. No shared cookies, no state leakage, no CAPTCHA regressions from one agent's session affecting another's.

What We're Building

A three-agent competitive intelligence crew that:

  • Researcher — opens Anchor sessions for each target URL and pulls raw page text
  • Analyst — structures the raw text into pricing tiers, features, and positioning signals
  • Reporter — synthesizes everything into an executive brief with recommended actions

Each agent runs sequentially and can only see its predecessor's output—keeping context lean and preventing cross-contamination.

Setup

pip install crewai anchorbrowser playwright anthropic
playwright install chromium

Export your keys:

export ANCHOR_API_KEY="anc-..."
export OPENAI_API_KEY="sk-..."   # CrewAI defaults to OpenAI; swap for any LLM

Browser Tool: Anchor Session Per URL

CrewAI tools are plain Python functions decorated with @tool. Here we create one Anchor session per URL, connect Playwright, and return the page body.

import os, asyncio
from anchorbrowser import Anchorbrowser
from playwright.async_api import async_playwright
from crewai.tools import tool

anchor = Anchorbrowser(api_key=os.environ["ANCHOR_API_KEY"])

@tool("browse_url")
def browse_url(url: str) -> str:
    '''Navigate to a URL with an isolated Anchor session and return visible page text.'''
    session = anchor.sessions.create(
        timeout={"max_duration": 60, "idle_timeout": 20},
    )
    try:
        async def _fetch():
            async with async_playwright() as pw:
                browser = await pw.chromium.connect_over_cdp(session.ws_endpoint)
                page = browser.contexts[0].pages[0]
                await page.goto(url, wait_until="domcontentloaded")
                text = await page.locator("body").inner_text()
                await browser.close()
                return text[:6000]   # cap to avoid bloating agent context
        return asyncio.run(_fetch())
    finally:
        anchor.sessions.stop(session.id)

Each call to browse_url gets a fresh session with its own IP and browser fingerprint—so even if a target blocks the third request from the same profile, every tool call looks like a new visitor.

Define the Crew

from crewai import Agent, Task, Crew, Process

# ── Agents ──────────────────────────────────────────────────────────────────
researcher = Agent(
    role="Web Researcher",
    goal="Collect raw page content from competitor URLs",
    backstory=(
        "You are a meticulous researcher who fetches exact page text from websites. "
        "You never interpret or summarize—you return the raw content verbatim."
    ),
    tools=[browse_url],
    verbose=True,
)

analyst = Agent(
    role="Competitive Analyst",
    goal="Extract structured pricing and feature data from raw page text",
    backstory=(
        "You are an expert at reading SaaS pricing pages and distilling them into "
        "structured JSON: plan names, monthly prices, and three to five key features per plan."
    ),
    verbose=True,
)

reporter = Agent(
    role="Strategy Reporter",
    goal="Produce an executive brief with actionable competitive insights",
    backstory=(
        "You synthesize analyst reports into crisp executive briefings. "
        "You highlight pricing gaps, feature differentiators, and one recommended action."
    ),
    verbose=True,
)

# ── Tasks ────────────────────────────────────────────────────────────────────
COMPETITOR_URLS = [
    "https://browserbase.com/pricing",
    "https://brightdata.com/pricing",
    "https://apify.com/pricing",
]

research_task = Task(
    description=(
        f"Use the browse_url tool to fetch the content of each of these URLs:
"
        + "
".join(f"- {u}" for u in COMPETITOR_URLS)
        + "
Return the concatenated raw text with URL headers separating each section."
    ),
    expected_output="Raw page text from all competitor URLs, separated by URL headers.",
    agent=researcher,
)

analysis_task = Task(
    description=(
        "Given the raw competitor page text from the Researcher, extract structured data. "
        "Return a JSON array where each element has: url, plans (array of {name, monthly_usd, features[]})."
    ),
    expected_output="JSON array of structured competitor pricing data.",
    agent=analyst,
    context=[research_task],
)

report_task = Task(
    description=(
        "Given the structured pricing JSON from the Analyst, write an executive brief (200–300 words). "
        "Include: a one-sentence summary per competitor, a pricing gap analysis table (markdown), "
        "and one recommended action for the product team."
    ),
    expected_output="A concise executive brief in markdown format.",
    agent=reporter,
    context=[analysis_task],
)

# ── Assemble and Run ─────────────────────────────────────────────────────────
crew = Crew(
    agents=[researcher, analyst, reporter],
    tasks=[research_task, analysis_task, report_task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff()
print(result)

Run It

python competitive_intel.py

You'll see CrewAI's verbose output as each agent picks up its task. The Researcher fires three browse_url calls (three isolated Anchor sessions, in parallel under the hood), the Analyst structures the output, and the Reporter produces a brief like this:

## Competitive Brief — Browser Infrastructure

**Browserbase** leads on developer experience with a generous free tier (2,000 sessions/mo)
and first-class Playwright/Puppeteer support starting at $99/mo.

**Bright Data** targets enterprise at $500/mo+, emphasizing residential IP pools and
compliance features; overkill for most agent workloads.

**Apify** offers the broadest actor marketplace but charges per compute unit,
making cost unpredictable at scale.

| Provider    | Entry Price | Free Tier    | Standout Feature          |
|-------------|-------------|--------------|---------------------------|
| Browserbase | $99/mo      | 2,000 ses.   | Native DevTools Protocol  |
| Bright Data | $500/mo     | None         | Residential IP compliance |
| Apify       | Pay-as-you-go | $5 credit  | Actor marketplace         |

**Recommendation:** Anchor should emphasize session isolation and OmniConnect auth
as key differentiators vs. Browserbase's commoditized offering.

Why CrewAI + Anchor Works Well Together

  • Role isolation maps to session isolation. CrewAI's agent roles stay cleanly separated, and Anchor enforces that boundary in the browser layer—each agent's browse_url call uses a different session, fingerprint, and IP.
  • No shared browser state. Traditional Playwright scripts reuse browser contexts. Here each tool invocation is a fresh Anchor session, so login cookies, localStorage, and service workers from one target can't pollute the next.
  • Scalable concurrency. CrewAI's Process.hierarchical mode (or a custom manager agent) can fan out the Researcher across dozens of URLs concurrently—Anchor handles up to 5,000 parallel sessions with no self-hosted infrastructure.
  • Authentication ready. Swap in OmniConnect for the Researcher's sessions to browse behind login walls—pricing pages that require sign-in, dashboards, and account-specific feature gates.

Production Tips

  • Cap inner_text() output. LLM context windows are finite. The 6 000-character cap in the example keeps agent prompts lean; adjust based on your model's context limit.
  • Set idle_timeout. If a page hangs mid-load, Anchor's idle_timeout kills the session and the finally block cleans up—no leaked sessions.
  • Use Process.hierarchical for parallelism. A manager agent can dispatch the Researcher to multiple URLs simultaneously, cutting wall-clock time from O(n) to O(1) for the browsing phase.
  • Persist results between runs. Store the Analyst's JSON output in a database (Postgres, Supabase) and only re-browse URLs where the content hash changed—ideal for weekly monitoring jobs.
  • Add geo-routing. Pass proxy={"active": True, "country_code": "de"} to the session to simulate regional pricing pages—useful when competitors show different prices by geography.

What's Next

The pattern above handles competitive intel, but the same crew structure works for any multi-step web workflow: lead enrichment (Researcher fetches LinkedIn + company site, Analyst scores fit, Reporter drafts outreach), content audits, regulatory monitoring, or e-commerce price tracking.

Try extending the Reporter with a fourth agent that posts the brief directly to Slack via a webhook—CrewAI makes it trivial to add new specialists without rewriting the existing pipeline.

Sign up at anchorbrowser.io, grab your API key, and run your first CrewAI browser crew in minutes. The full documentation for sessions, proxies, and OmniConnect is at docs.anchorbrowser.io.

Stay ahead in browser automation

We respect your inbox. Privacy policy

Welcome aboard! Thanks for signing up
Oops! Something went wrong while submitting the form.