Agents Can Finally Read Any SaaS. Now the Problem Is Staying Logged In.

Technical Dive
Jun 4
by Idan Raman

A shift is under way in browser agent deployments, and most teams building on the new models have not yet put a name to it.

For the last two years, the hard problem was understanding. Models could reason about tasks in the abstract but struggled to interpret real SaaS interfaces—dynamic menus, nested modals, tables that changed structure between renders. Engineers spent as much time wrestling with model misinterpretation as they did building actual agent logic.

That changed in late 2025. Claude 4 and GPT-5 pushed screen-understanding accuracy past 92%. For the first time, browser agents can reliably navigate interfaces they’ve never seen before. The model looks at a screen, understands what’s on it, and acts. For most teams, visual interpretation has stopped being the primary bottleneck.

What’s taken its place is session state.

What “Staying Logged In” Actually Means

Session state is a deceptively simple phrase for a cluster of related problems.

At the surface level, it means authentication: does the agent have valid credentials, and will the application accept them? In production, the problem runs deeper. Authentication systems don’t just check whether a credential is correct; they also evaluate whether the device presenting it looks familiar. An agent that authenticates successfully from a cold browser session on Monday may face a stepped-up verification challenge on Tuesday if its fingerprint has shifted in the meantime. An agent that logs in again on every run will eventually trigger a suspicious-activity flag, even if every individual session was clean.
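To make the Monday-versus-Tuesday scenario concrete, here is a minimal Python sketch of fingerprint drift detection. The attribute list in FINGERPRINT_KEYS and both helper functions are hypothetical, chosen for illustration; real device-profiling systems observe many more signals and do not publish how they weigh them.

```python
import hashlib
import json

# Hypothetical attribute set, assumed for illustration only.
FINGERPRINT_KEYS = ("user_agent", "timezone", "locale", "viewport", "platform")


def fingerprint_digest(attrs: dict) -> str:
    """Hash the tracked attributes into one stable identifier."""
    canonical = json.dumps({k: attrs.get(k) for k in FINGERPRINT_KEYS}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_keys(stored: dict, current: dict) -> list[str]:
    """Return the tracked attributes that differ between two runs."""
    return [k for k in FINGERPRINT_KEYS if stored.get(k) != current.get(k)]


monday = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Chrome/124.0",
    "timezone": "America/New_York",
    "locale": "en-US",
    "viewport": [1440, 900],
    "platform": "Linux",
}
# A fresh container on Tuesday comes up with different defaults.
tuesday = {**monday, "timezone": "UTC", "viewport": [1280, 720]}

print(fingerprint_digest(monday) == fingerprint_digest(tuesday))  # False
print(drifted_keys(monday, tuesday))  # ['timezone', 'viewport']
```

A check like this, run before the agent touches a login page, at least turns a mysterious MFA challenge into a visible diff of what changed in the environment.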

Most teams hit the session state problem at three layers:

  • Authentication continuity: Can the agent reuse a session that’s already authenticated, rather than logging in fresh every time? Most enterprise SaaS will tolerate a handful of cold logins before triggering MFA re-enrollment or account review. In production, where an agent might run hundreds of times a day, “log in fresh each time” isn’t a viable architecture.
  • Fingerprint recognition: Enterprise security systems maintain device profiles. An agent that arrives from a consistent, recognized fingerprint is treated very differently from one that appears as an unknown device on every run. The difference surfaces as MFA challenges, account locks, and session invalidations that are almost impossible to diagnose from the application layer.
  • Behavioral continuity: Beyond authentication and fingerprint, anti-fraud systems score behavioral signals—interaction timing, scroll patterns, the sequence of pages visited in a session. An agent that looks like a brand-new user on every run, regardless of the credentials it carries, accumulates a risk score that causes problems at scale.
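The first of these layers, authentication continuity, can be sketched in a few lines of Python: reuse a stored session while it is still valid, and treat cold logins as a limited budget. This is a sketch under stated assumptions, not a reference implementation. SessionStore, start_run, and fake_login are hypothetical names, and COLD_LOGIN_BUDGET and BUDGET_WINDOW_S are made-up limits, since the real tolerance varies by application and is rarely documented.

```python
import json
import tempfile
import time
from pathlib import Path

# Both limits are assumptions for illustration, not values any vendor publishes.
COLD_LOGIN_BUDGET = 3
BUDGET_WINDOW_S = 24 * 3600


class SessionStore:
    """File-backed record of session cookies and recent cold logins."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self):
        if self.path.exists():
            return json.loads(self.path.read_text())
        return {"cookies": [], "cold_logins": []}

    def save(self, state):
        self.path.write_text(json.dumps(state))

    def reusable_cookies(self, now):
        # Only cookies that have not yet expired are worth presenting again.
        return [c for c in self.load()["cookies"] if c["expires"] > now]

    def reserve_cold_login(self, now):
        # Refuse before touching the login page if the budget is spent.
        state = self.load()
        recent = [t for t in state["cold_logins"] if now - t < BUDGET_WINDOW_S]
        if len(recent) >= COLD_LOGIN_BUDGET:
            raise RuntimeError("cold-login budget exhausted for this window")
        state["cold_logins"] = recent + [now]
        self.save(state)


def start_run(store, cold_login, now=None):
    """Reuse a live session if one exists, otherwise log in and persist it."""
    now = time.time() if now is None else now
    cookies = store.reusable_cookies(now)
    if cookies:
        return "reused", cookies
    store.reserve_cold_login(now)
    fresh = cold_login(now)
    state = store.load()
    state["cookies"] = fresh
    store.save(state)
    return "cold", fresh


def fake_login(now):
    # Stand-in for a real login flow; returns one cookie valid for an hour.
    return [{"name": "sid", "value": "abc", "expires": now + 3600}]


with tempfile.TemporaryDirectory() as tmp:
    store = SessionStore(Path(tmp) / "session.json")
    print(start_run(store, fake_login, now=1000.0)[0])  # cold
    print(start_run(store, fake_login, now=2000.0)[0])  # reused
    print(start_run(store, fake_login, now=9000.0)[0])  # cold
```

Failing loudly when the budget runs out is a deliberate choice here: a stopped agent is easier to recover from than an account pushed into review by repeated fresh logins.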

Why This Is an Infrastructure Problem

The instinct when sessions break in production is to fix it in application code—add a login retry, catch the auth exception, persist the session cookie somewhere. This works until it doesn’t. Every enterprise app handles session state differently. Every anti-fraud vendor updates its scoring model continuously. A hand-maintained session management layer in application code becomes a permanent engineering tax that grows every time a target site updates its auth flow.

The more durable framing is to treat session persistence as a property of the infrastructure the agent runs on, not something each agent re-implements from scratch. When a browser comes with an established identity—consistent fingerprint, reusable authenticated sessions, a persistent profile that enterprise security systems recognize as a known, trusted device—the agent can stay focused on the task instead of fighting to stay logged in.
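One way to picture that split is a store that lives below the agent and hands each run an established identity. The Python sketch below is illustrative only: ProfileRegistry, checkout, and agent_task are hypothetical names, a JSON file stands in for a real browser profile, and none of it describes how any particular product implements persistence.

```python
import json
import tempfile
from contextlib import contextmanager
from pathlib import Path


class ProfileRegistry:
    """Hypothetical infrastructure-side store: one JSON file per browser identity."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    @contextmanager
    def checkout(self, profile_id):
        path = self.root / f"{profile_id}.json"
        lock = self.root / f"{profile_id}.lock"
        # Exclusive create: raises FileExistsError if another run holds this identity.
        lock.touch(exist_ok=False)
        try:
            if path.exists():
                profile = json.loads(path.read_text())
            else:
                profile = {"fingerprint": {}, "cookies": [], "runs": 0}
            yield profile
            # Reached only if the agent finished cleanly; a failed run is not persisted.
            profile["runs"] += 1
            path.write_text(json.dumps(profile))
        finally:
            lock.unlink()


def agent_task(profile):
    # The agent only uses the identity it was handed; it has no persistence logic.
    if not profile["cookies"]:
        profile["cookies"] = [{"name": "sid", "value": "from-login"}]
        return "logged in"
    return "resumed"


with tempfile.TemporaryDirectory() as tmp:
    registry = ProfileRegistry(tmp)
    for _ in range(2):
        with registry.checkout("billing-bot") as profile:
            print(agent_task(profile))  # "logged in" on the first run, then "resumed"
```

The point of the shape is that loading, locking, and saving happen in one place, so every workflow inherits them and no agent codebase carries its own copy.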

This is the exact gap between demos and production. In a demo, you set up the browser environment manually, hand the agent a warm, authenticated session, and it works cleanly. In production, the agent initializes its own environment from zero on every run, and the session management problem reasserts itself every time.

The New Bottleneck Is Tractable

The benchmark improvements that pushed screen-understanding accuracy past 92% effectively moved the browser agent bottleneck out of the model and into the infrastructure beneath it. It used to be: “can the model understand this interface?” Now it’s: “can the infrastructure keep the agent in a stable, trusted, authenticated state while it works?”

That’s a harder problem than it sounds, but it’s also more tractable than the one it replaced. Unlike model accuracy, which required fundamental advances in multimodal reasoning, session state is an infrastructure design problem. It has known solutions. The challenge is building them at the right layer, below the agent, where they can be shared across every workflow instead of re-solved in every codebase.

For teams deploying browser agents against real SaaS at scale, this is the conversation worth having now. The models are ready. The infrastructure is what determines whether production matches the demo.

Learn how Anchor handles persistent sessions →
