smolagents + Anchor: Building Production Browser Agents in Python

Hugging Face's smolagents is a minimalist agent library built around one simple idea: agents that think in code. A CodeAgent writes Python snippets to perform actions, which means fewer LLM calls, less prompt engineering, and better composability than JSON-based tool calling.

Combine that with Anchor's managed cloud browsers (session isolation, anti-bot resilience, and built-in fingerprint management) and you have a stack that handles real-world browser automation without running your own browser infrastructure.

Why smolagents?

Code agents reduce steps: writing Python directly cuts task steps by ~30% vs. JSON tool-calling patterns
Model-agnostic: swap between OpenAI, Anthropic, or models hosted on the Hugging Face Hub by changing the model object
Minimal surface area: the core agent logic is roughly 1,000 lines, so it is easy to audit and easy to extend
Built-in tracing: every generated code step is logged, making debugging and replay straightforward

Setup

pip install smolagents anchor-browser playwright
playwright install chromium

export ANCHOR_API_KEY=your_anchor_key
export OPENAI_API_KEY=your_openai_key
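
Before creating a session, it can help to fail fast when either key is missing. A minimal sketch, assuming only the two variable names exported above; require_env is a hypothetical helper, not part of smolagents or the Anchor SDK:

```python
import os


def require_env(*names: str) -> dict[str, str]:
    """Return the requested environment variables, raising if any are unset or empty."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: os.environ[name] for name in names}
```

Calling require_env("ANCHOR_API_KEY", "OPENAI_API_KEY") at the top of the script surfaces a configuration problem before any cloud browser is provisioned.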

Creating an Anchor Session

Anchor provisions a fresh browser with its own fingerprint and residential proxy. The session exposes a WebSocket endpoint that Playwright connects to over CDP.

import os
from anchor_browser import AnchorClient
from playwright.sync_api import sync_playwright

anchor = AnchorClient(api_key=os.environ["ANCHOR_API_KEY"])
session = anchor.sessions.create(
    proxy_country="us",
    options={"adblock": True}
)

# Start Playwright locally, then attach to the remote browser over CDP.
playwright = sync_playwright().start()
browser = playwright.chromium.connect_over_cdp(session.ws_endpoint)
page = browser.contexts[0].pages[0]

Defining Browser Tools

Wrap the Playwright page in typed tools that smolagents can call from generated code:

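import base64

from smolagents import tool

# smolagents builds each tool's schema from the type hints and the
# docstring, so every argument needs an entry under "Args:".

@tool
def navigate(url: str) -> str:
    """Navigate the browser to a URL and return the page title.

    Args:
        url: Absolute URL to open, including the scheme (e.g. https://).
    """
    page.goto(url)
    return page.title()

@tool
def get_text(selector: str) -> str:
    """Extract visible text from one element on the current page.

    Args:
        selector: CSS selector that matches exactly one element.
    """
    return page.locator(selector).inner_text()

@tool
def screenshot() -> str:
    """Take a screenshot of the current page and return it as a base64-encoded PNG."""
    return base64.b64encode(page.screenshot()).decode()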
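
Tool return values are fed back to the model, so a selector that matches a large container can flood the context. One possible guard is to normalize and cap text before returning it. This is a sketch under assumptions: clip and its 4,000-character default are illustrative choices, not part of smolagents.

```python
def clip(text: str, limit: int = 4000) -> str:
    """Collapse runs of whitespace and cap the result at `limit` characters."""
    collapsed = " ".join(text.split())
    if len(collapsed) <= limit:
        return collapsed
    omitted = len(collapsed) - limit
    # Tell the model that text was cut so it can ask for a narrower selector.
    return f"{collapsed[:limit]} ... [{omitted} more characters omitted]"
```

Inside get_text, the return line would then become return clip(page.locator(selector).inner_text()).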

Building the Agent

from smolagents import CodeAgent, OpenAIServerModel

model = OpenAIServerModel(model_id="gpt-4o-mini")

agent = CodeAgent(
    tools=[navigate, get_text, screenshot],
    model=model,
    max_steps=10,
)

Running a Real Task

result = agent.run(
    "Go to news.ycombinator.com, find the top 5 posts about AI agents, "
    "and return their titles and point counts as a JSON list."
)
print(result)
# [
#   {"title": "Show HN: smolagents 2.0 ...", "points": 412},
#   {"title": "Agent frameworks compared ...", "points": 298},
#   ...
# ]

Because the agent writes Python rather than calling predefined steps, it can inspect the page structure at runtime and adapt when the DOM changes—no brittle XPath selectors or recorded click-coordinates.
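
To make that concrete, here is the kind of snippet a CodeAgent might emit in a single step. It is purely illustrative: get_text is replaced by an offline stand-in, the sample page text is invented, and the #hnmain selector and the regular expression are assumptions about the page layout rather than captured agent output.

```python
import json
import re


# Offline stand-in for the real get_text tool; the sample text is invented.
def get_text(selector: str) -> str:
    return (
        "1. Building agents with code (example.com)\n"
        "120 points by alice 3 hours ago | 45 comments\n"
        "2. A post about databases (example.org)\n"
        "80 points by bob 5 hours ago | 12 comments\n"
    )


# What a generated step could look like: fetch text, parse it, filter, serialize.
raw = get_text("#hnmain")
rows = re.findall(r"^\d+\.\s+(.*?)\s+\(.*?\)\n(\d+) points", raw, flags=re.MULTILINE)
posts = [{"title": title, "points": int(points)} for title, points in rows]
ai_posts = [p for p in posts if "agent" in p["title"].lower()][:5]
result = json.dumps(ai_posts)
```

If the parse comes back empty, the agent sees that in its next step and can try a different selector or pattern, which is where the runtime adaptability comes from.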

Swapping the Model

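One line switches to an open-weights model served through Hugging Face's hosted inference API, which helps with cost-sensitive workloads. Note that HfApiModel calls a remote endpoint, so it does not cover air-gapped deployments; for those, smolagents provides TransformersModel, which loads the model locally. Newer smolagents releases expose HfApiModel under the name InferenceClientModel.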

from smolagents import HfApiModel

model = HfApiModel("Qwen/Qwen2.5-72B-Instruct")
agent = CodeAgent(tools=[navigate, get_text, screenshot], model=model)

Production Tips

Reuse sessions: create the Anchor session once and share the page object across tools—don't open a new browser per step
Structured output: return data as a JSON string from your tools so the agent's generated code can parse it with json.loads and work with it as ordinary Python data
Tracing: smolagents logs every generated code step to stdout—pipe it to your observability stack for full replay and cost accounting
Session cleanup: call session.close() in a finally block to release the cloud browser resource promptly
Parallelism: spawn N Anchor sessions simultaneously to run independent agent tasks concurrently—see our parallel agents guide for patterns
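
The session-cleanup tip can be wrapped in a small context manager so the release happens even when the agent raises. A minimal sketch, assuming the session object has the close() method mentioned above; managed_session and its create_session argument are hypothetical names:

```python
from contextlib import contextmanager


@contextmanager
def managed_session(create_session):
    """Yield a session from `create_session()` and always call its close() afterwards."""
    session = create_session()
    try:
        yield session
    finally:
        # Runs on normal exit and on exceptions alike.
        session.close()
```

With the client from earlier, usage would look like with managed_session(lambda: anchor.sessions.create(proxy_country="us")) as session: followed by the Playwright connection and the agent run inside the block.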

What's Next

The smolagents CodeAgent is a natural fit for multi-session parallelism: each agent gets its own isolated Anchor browser, and results are aggregated in the orchestrating process. Start with a single session to validate your tools, then scale out.
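
One way to sketch that fan-out is a thread pool in the orchestrating process. The code below assumes a hypothetical run_task(prompt) callable that creates its own Anchor session, Playwright instance, and agent, runs the prompt, and cleans up; none of those names come from either library.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(run_task, prompts, max_workers=4):
    """Run `run_task(prompt)` for each prompt concurrently.

    Returns a dict mapping each prompt to its result, or to the exception it raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {prompt: pool.submit(run_task, prompt) for prompt in prompts}
        for prompt, future in futures.items():
            try:
                results[prompt] = future.result()
            except Exception as exc:  # keep one failed task from sinking the batch
                results[prompt] = exc
    return results
```

Playwright's sync API is not thread-safe, so each worker should start its own Playwright instance and page rather than sharing the module-level page object used in the single-session examples.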

Ready to build? Get your Anchor API key
