AG2 (the open-source continuation of Microsoft AutoGen) is a widely used framework for orchestrating multi-agent workflows in which specialized agents collaborate, debate, and delegate work. Pair AG2's conversation-driven model with Anchor's cloud browser infrastructure and you get a research team that can plan, browse the live web, and synthesize findings, all in Python, with no local Chromium setup required.
Why AG2?
Most agent frameworks treat the LLM as a single actor. AG2 structures work as a conversation between agents: a Planner that decomposes goals into browsing steps and a Browser Operator that executes them. This mirrors how human research teams actually work—strategy and execution stay in separate lanes.
AG2's ConversableAgent model also lets you run different models per role. The Planner can use GPT-4o for strategic reasoning; the Browser Operator can use a faster model for mechanical actions, cutting costs on high-volume workflows.
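As a rough sketch of the per-role split, each agent can be given its own llm_config dictionary. The helper name make_llm_config and the two constants are illustrative and not part of AG2. The model names are the ones this article mentions, and the empty-string fallback for the API key is only there so the snippet runs without credentials.

```python
import os

# Illustrative model choices; swap in whatever your account supports.
PLANNER_MODEL = "gpt-4o"
OPERATOR_MODEL = "gpt-4o-mini"


def make_llm_config(model: str) -> dict:
    """Build a minimal llm_config dict for one agent role (hypothetical helper)."""
    return {
        "model": model,
        # Fallback keeps the sketch runnable without credentials.
        "api_key": os.environ.get("OPENAI_API_KEY", ""),
    }


planner_llm_config = make_llm_config(PLANNER_MODEL)
operator_llm_config = make_llm_config(OPERATOR_MODEL)

print(planner_llm_config["model"], operator_llm_config["model"])
```

Each dictionary would then be passed as llm_config= to the matching ConversableAgent, so the expensive model is only used where planning quality matters.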
Setup
pip install ag2 anchorpy openai
export OPENAI_API_KEY="sk-..."
export ANCHOR_API_KEY="your-anchor-api-key"
Defining Browser Tools
AG2 agents call registered Python functions as tools. We define three browser primitives—navigate, extract, and screenshot—backed by a live Anchor session:
import os
import asyncio
import anchorpy
# The ag2 package is imported under the name autogen.
from autogen import ConversableAgent, register_function

anchor = anchorpy.AnchorClient(api_key=os.environ["ANCHOR_API_KEY"])
_session = None  # lazily created browser session shared by all tools


async def get_session():
    global _session
    if _session is None:
        _session = await anchor.sessions.create()
    return _session


async def browser_navigate(url: str) -> str:
    """Navigate the browser to a URL and return the page title."""
    session = await get_session()
    await session.goto(url)
    title = await session.title()
    return f"Navigated to {url}. Page title: {title}"


async def browser_extract(selector: str) -> str:
    """Extract visible text from a CSS selector on the current page."""
    session = await get_session()
    text = await session.inner_text(selector)
    return text[:4000]  # truncate so the result fits in the context window


async def browser_screenshot() -> str:
    """Take a screenshot of the current page."""
    session = await get_session()
    await session.screenshot(path="/tmp/screen.png")
    return "Screenshot saved to /tmp/screen.png"
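Before spending browser minutes, it can help to dry-run the tool logic against a stand-in session. The sketch below is one way to do that. FakeSession is hypothetical and mirrors only the goto, title, and inner_text calls used above. The navigate and extract functions take the session as an argument so they can be exercised without a live Anchor session.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in exposing only the calls the tools above use."""

    def __init__(self, pages: dict):
        self._pages = pages  # url -> {"title": ..., selector: text, ...}
        self._current = None

    async def goto(self, url: str) -> None:
        if url not in self._pages:
            raise ValueError(f"unknown url: {url}")
        self._current = url

    async def title(self) -> str:
        return self._pages[self._current]["title"]

    async def inner_text(self, selector: str) -> str:
        return self._pages[self._current].get(selector, "")


async def navigate(session, url: str) -> str:
    """Same logic as browser_navigate, with the session passed in explicitly."""
    await session.goto(url)
    return f"Navigated to {url}. Page title: {await session.title()}"


async def extract(session, selector: str, limit: int = 4000) -> str:
    """Same logic as browser_extract: read the text, then truncate it."""
    return (await session.inner_text(selector))[:limit]


async def demo():
    pages = {"https://example.com": {"title": "Example", "h1": "Hello" * 2000}}
    session = FakeSession(pages)
    nav = await navigate(session, "https://example.com")
    text = await extract(session, "h1")
    return nav, text


nav_result, text_result = asyncio.run(demo())
print(nav_result)
print(len(text_result))
```

The 10,000-character heading in the fake page exercises the truncation path, which is the part most likely to matter once real pages are involved.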
Building the Research Team
Two agents with distinct system prompts define the division of labor:
llm_config = {"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}

planner = ConversableAgent(
    name="Planner",
    system_message=(
        "You are a research strategist. Break the user's goal into numbered "
        "browsing steps. After the BrowserOperator finishes each step, "
        "synthesize the findings and issue the next instruction. "
        "Reply TERMINATE when research is complete."
    ),
    llm_config=llm_config,
    human_input_mode="NEVER",
)

browser_operator = ConversableAgent(
    name="BrowserOperator",
    system_message=(
        "You are a web research agent with access to a live Chrome browser. "
        "Execute the Planner's instructions using your registered tools and "
        "report findings concisely after each action."
    ),
    llm_config=llm_config,
    human_input_mode="NEVER",
)
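
# In a two-agent chat, a tool call proposed by one agent is run by the agent
# that receives it, so the Planner is registered as the executor.
for fn in (browser_navigate, browser_extract, browser_screenshot):
    register_function(
        fn,
        caller=browser_operator,  # decides when to call the tool
        executor=planner,  # receives the tool call and runs the function
        description=fn.__doc__,
    )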
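The registration loop passes fn.__doc__ as the tool description, and AG2 is understood to derive each tool's parameter schema from the function's type hints. A function whose description lives in a comment rather than a docstring would therefore register with no description at all. The check below is a minimal sketch of a pre-registration guard. check_tool is a hypothetical helper, not an AG2 function.

```python
import inspect


def check_tool(fn) -> list:
    """Return a list of problems that would weaken the tool's description or schema."""
    problems = []
    if not inspect.getdoc(fn):
        problems.append("missing docstring (used as the tool description)")
    sig = inspect.signature(fn)
    for name, param in sig.parameters.items():
        if param.annotation is inspect.Parameter.empty:
            problems.append(f"parameter '{name}' has no type hint")
    if sig.return_annotation is inspect.Signature.empty:
        problems.append("missing return type hint")
    return problems


async def good_tool(url: str) -> str:
    """Navigate to a URL."""
    return url


async def bad_tool(url):
    # A comment is not a docstring, so __doc__ stays None.
    return url


print(check_tool(good_tool))
print(check_tool(bad_tool))
```

Running the check over the tool tuple before the registration loop turns a silent schema problem into an explicit list of things to fix.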
Running a Research Session
result = planner.initiate_chat(
    recipient=browser_operator,
    message=(
        "Research the top three cloud browser platforms for AI agents in 2026. "
        "For each: find the pricing page, list key features, and identify the "
        "target user. Compile a comparison table."
    ),
    max_turns=20,
)
print(result.summary)
The Planner decomposes the goal into steps, the Browser Operator fetches each page and reports structured findings, and the Planner synthesizes results until it emits TERMINATE. A 20-turn cap prevents runaway loops on ambiguous tasks.
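Models do not always emit the keyword on its own, so it may be worth making the stop condition explicit rather than relying on defaults. The predicate below is a sketch of one looser check. The name is_research_done is illustrative, and the function is meant to be passed through ConversableAgent's is_termination_msg parameter on the agent that receives the Planner's messages.

```python
def is_research_done(message: dict) -> bool:
    """Return True when a message's text ends with the TERMINATE keyword."""
    content = message.get("content") or ""
    # Tool-call messages can carry no text, and multimodal content may be a list;
    # this sketch only handles plain strings.
    if not isinstance(content, str):
        return False
    return content.rstrip().endswith("TERMINATE")


print(is_research_done({"content": "Comparison table attached.\nTERMINATE"}))
print(is_research_done({"content": "Do not TERMINATE yet"}))
```

Matching only at the end of the message avoids stopping early when the keyword appears mid-sentence, while max_turns remains the backstop if the keyword never arrives.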
Scaling to Three Agents with GroupChat
When your workflow needs more than two roles—say, a Planner, a Browser Operator, and a Report Writer that formats the final output—AG2's GroupChat routes messages automatically:
# The ag2 package is imported under the name autogen.
from autogen import GroupChat, GroupChatManager

report_writer = ConversableAgent(
    name="ReportWriter",
    system_message=(
        "You receive research findings and format them into a Markdown report "
        "with an executive summary and a comparison table. Do not browse the web."
    ),
    llm_config=llm_config,
    human_input_mode="NEVER",
)

groupchat = GroupChat(
    agents=[planner, browser_operator, report_writer],
    messages=[],
    max_round=30,
)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

planner.initiate_chat(
    manager,
    message="Research developer-facing AI agent platforms and produce a report.",
)
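Automatic routing leaves speaker order to the manager's model. If you would rather constrain who may follow whom, GroupChat accepts a transition graph through its allowed_or_disallowed_speaker_transitions and speaker_transitions_type parameters. The sketch below builds such a graph from agent names. build_transitions and StubAgent are hypothetical helpers, and the self-loop on BrowserOperator is an assumption made so that a tool call can still be executed if the same agent is registered as executor.

```python
class StubAgent:
    """Hypothetical stand-in for a ConversableAgent; only .name is used here."""

    def __init__(self, name: str):
        self.name = name


def build_transitions(agents: list, allowed: dict) -> dict:
    """Turn a name-based graph into the agent-keyed dict GroupChat expects."""
    by_name = {agent.name: agent for agent in agents}
    mentioned = set(allowed) | {n for targets in allowed.values() for n in targets}
    unknown = mentioned - set(by_name)
    if unknown:
        raise KeyError(f"unknown agent names: {sorted(unknown)}")
    return {by_name[src]: [by_name[dst] for dst in targets] for src, targets in allowed.items()}


ALLOWED = {
    "Planner": ["BrowserOperator", "ReportWriter"],
    "BrowserOperator": ["BrowserOperator", "Planner"],  # self-loop is an assumption
    "ReportWriter": ["Planner"],
}

stubs = [StubAgent(name) for name in ALLOWED]
transitions = build_transitions(stubs, ALLOWED)
summary = {src.name: [dst.name for dst in dsts] for src, dsts in transitions.items()}
print(summary)

# With the real agents, the graph would be passed roughly like this:
# GroupChat(
#     agents=[planner, browser_operator, report_writer],
#     messages=[],
#     max_round=30,
#     allowed_or_disallowed_speaker_transitions=transitions,
#     speaker_transitions_type="allowed",
# )
```

Keeping the graph name-based makes it easy to review, and the unknown-name check catches typos before a chat starts.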
Production Tips
- Isolate sessions per conversation. Reset _session = None before each initiate_chat call so one research run does not inherit another's browser state. For runs that execute in parallel, give each conversation its own session object instead of a shared global.
- Add a Critic agent. An additional ConversableAgent with a fact-checking prompt can review the Planner's synthesis before output, reducing hallucinated citations.
- Mix models by role. Run the Planner on gpt-4o and the Browser Operator on gpt-4o-mini to cut costs without sacrificing planning quality.
- Use Anchor's session replay. Every session is recorded by default, so you can replay recordings to debug agent navigation decisions without re-running the full conversation.
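
For the session-isolation tip, one option is to drop the module-level global and bind each conversation's tools to a private session through a closure. This is a minimal sketch under that assumption. make_browser_tools, StubSession, and create_stub_session are hypothetical names, and in real use the create_session argument would be the Anchor session factory used earlier in this article.

```python
import asyncio
from typing import Awaitable, Callable


def make_browser_tools(create_session: Callable[[], Awaitable[object]]):
    """Return (get_session, browser_navigate) closures bound to one private session."""
    state = {"session": None}

    async def get_session():
        # Create the session on first use, then reuse it for this conversation.
        if state["session"] is None:
            state["session"] = await create_session()
        return state["session"]

    async def browser_navigate(url: str) -> str:
        """Navigate the browser to a URL and return the page title."""
        session = await get_session()
        await session.goto(url)
        return f"Navigated to {url}. Page title: {await session.title()}"

    return get_session, browser_navigate


class StubSession:
    """Hypothetical stand-in for a cloud browser session."""

    def __init__(self):
        self.url = None

    async def goto(self, url: str) -> None:
        self.url = url

    async def title(self) -> str:
        return f"title of {self.url}"


async def create_stub_session():
    return StubSession()


async def demo():
    get_a, nav_a = make_browser_tools(create_stub_session)
    get_b, nav_b = make_browser_tools(create_stub_session)
    await nav_a("https://a.example")
    await nav_b("https://b.example")
    session_a, session_b = await get_a(), await get_b()
    return session_a is session_b, session_a.url, session_b.url


shared, url_a, url_b = asyncio.run(demo())
print(shared, url_a, url_b)
```

Each call to make_browser_tools yields a tool set with its own browser state, so two research conversations can register separate tools without touching a shared variable.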



