How Predicate Labs Works

A verification & control layer for AI agents that operate browsers — rendered DOM snapshots, ordinality, and Jest-style assertions.

"Think Jest for browser agents — deterministic selection, pass/fail verification, explainable failures."

Not HTML parsing. Predicate snapshots the rendered DOM + layout from a live browser after SPA hydration. It works on JS-heavy sites because it captures post-hydration state.

Local-first by default. When the page is unstable, we resnapshot, then reset to checkpoint. Only if snapshot attempts are exhausted do we use optional vision fallback or model escalation.

Most tools help agents load pages or read content.
Predicate Labs helps agents act — and verify — reliably.

The Problem: Agents fail at execution and verification

Modern LLMs are good at deciding what they want to do.
They are unreliable at deciding where to do it — and whether it worked.

Clicking invisible or occluded elements
Guessing between similar buttons
No way to verify actions succeeded
Defaulting to vision loops when DOM structure is available
Non-reproducible behavior across runs
No deterministic way to select "3rd result"
No explicit failure policy (retry/reset/escalate)

Predicate fixes this with semantic snapshots, ordinal selection, and a verification loop (assertions + explicit failure policy).

Failure modes are explicit: resnapshot → reset → fallback → escalate.

Two Ways to Use Predicate

You can use the full runtime, or attach the verifier as a sidecar.

The browser is an adapter; the runtime + verification loop is the product.

AgentRuntime

Full loop • Owns step lifecycle

  • Owns step lifecycle – begin_step, action wrappers, end_step
  • Assertions + explicit failure policy – assert_, assert_done, configurable retries
  • Generates traces + artifacts on failure – full execution history, snapshots, diagnostics

Best for: Building new agents or production pipelines that want deterministic behavior and full control over the execution loop.
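
A minimal sketch of that lifecycle, assuming runtime and agent are constructed as in the full example at the end of this page. The text~ selector syntax follows that example; the no-argument end_step() call and the hook-free agent.act() call are assumptions for brevity, not confirmed signatures.

# Step lifecycle sketch: begin_step -> snapshot -> act -> assert_ -> end_step.
# Assumes `runtime` (AgentRuntime) and `agent` exist as in the full example below;
# end_step() with no arguments is an assumed signature.
from sentience.verification import exists

runtime.begin_step("Add the item to cart")   # open a verified step
await runtime.snapshot(limit=60)             # compact, post-hydration snapshot
agent.act("Click 'Add to Cart'")             # your agent performs the action (hooks omitted)
await runtime.snapshot()                     # re-snapshot after acting
runtime.assert_(exists("text~'Added to cart'"), "cart_updated", required=True)
runtime.end_step()                           # close the step; traces + artifacts are recorded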

PredicateDebugger

Sidecar • Attach to existing agents

  • Attach to an existing Playwright page – works with browser-use, LangChain, PydanticAI
  • Does NOT plan or invent actions – your agent stack drives; Predicate observes
  • Runs snapshots + predicates + artifacts – makes failures diagnosable without rewriting your agent

Best for: Users already using browser-use/LangGraph/etc who want verification + debugging without changing their agent code.

PredicateDebugger Example

from sentience import PredicateDebugger
from sentience.tracing import Tracer, JsonlTraceSink
from sentience.verification import url_contains  # predicate used below

# page is a Playwright Page from your framework (browser-use, LangChain, etc.)
page = agent.get_page()
tracer = Tracer(run_id="run-123", sink=JsonlTraceSink("trace.jsonl"))

dbg = PredicateDebugger.attach(page, tracer=tracer)

# your agent loop runs as usual
await agent.step()

# verification sidecar — Predicate observes + verifies
await dbg.snapshot(goal="verify:post-action")
await dbg.check(url_contains("checkout"), label="on_checkout").eventually()

The debugger captures snapshots and runs predicates without taking over your agent loop. Works with browser-use, LangChain, LangGraph, PydanticAI, and custom frameworks.

Two Paths: Local or Gateway

Predicate works entirely locally with the browser extension, or you can add the optional Gateway for ML-powered reranking.

Local vs Gateway affects ranking/reranking; Runtime vs Debugger affects how you integrate.

Local Mode

Extension-only • Free

  • Rendered DOM snapshots – post-hydration state, not static HTML
  • Ordinal selection – group_key, group_index, dominant_group
  • Layout detection – grid positions, regions, parent/child
  • AgentRuntime or Debugger assertions – assert_, check(), eventually() for verification

Works on SPAs because snapshots are taken from the live rendered page, not static HTML.

Best for: Development, testing, cost-sensitive production, frameworks like browser-use

Gateway Mode

Cloud API • Pro/Enterprise

Everything in Local, plus:

  • ML-powered reranking – ONNX model for optimal element selection
  • Goal-conditioned reranking – improves target ordering when you provide a goal
  • Cloud trace storage – Predicate Studio for team debugging

Best for: Production agents, maximum accuracy, team collaboration, observability

Both Modes Include Full Tracing

Every snapshot recorded
Replay, diff, debug any run
Local JSON or Studio
Confidence + reasons on instability

What's Inside a Semantic Snapshot

Each snapshot contains ~0.6–1.2k tokens per step — enough for your LLM to make deterministic decisions.

With structured snapshots, 3B-class models become viable. Larger models (7B/14B+) still help with planning and recovery, but they're no longer required just to operate a browser.

Example Element in Snapshot

{
  "diagnostics": {
    "confidence": 0.92,
    "reasons": [],
    "requires_vision": false,
    "captcha": {
      "detected": false,
      "confidence": 0.0,
      "provider_hint": "unknown",
      "evidence": {
        "iframe_src_hits": [],
        "selector_hits": [],
        "text_hits": [],
        "url_hits": []
      }
    },
    "modal_detected": false
  },
  "id": 42,
  "role": "button",
  "text": "Add to Cart",
  "importance": 95,
  "bbox": { "x": 320, "y": 480, "width": 120, "height": 40 },
  "in_viewport": true,
  "is_occluded": false,
  "visual_cues": {
    "is_primary": true,
    "is_clickable": true,
    "background_color_name": "green"
  },
  // Ordinal fields for "click 3rd result" queries
  "group_key": "480_main",
  "group_index": 2,
  "in_dominant_group": true,
  // Layout detection
  "layout": {
    "grid_id": 1,
    "grid_pos": { "row_index": 0, "col_index": 2 },
    "region": "main"
  }
}

Snapshot Metadata

{
  "snapshot_confidence": 0.92,
  "stability_reasons": [],  // empty = stable
  // On unstable pages:
  // "stability_reasons": ["dom_unstable", "layout_shifting"]
}
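
These are the fields to gate on before acting. A minimal sketch, assuming snapshot() returns a dict-like object exposing snapshot_confidence and stability_reasons as shown above; the 0.8 threshold and the three-attempt bound are illustrative choices, not SDK defaults.

# Illustrative confidence gate over the metadata fields shown above.
# Assumes `runtime.snapshot()` returns a dict-like object with these keys;
# the threshold (0.8) and attempt bound (3) are arbitrary for the sketch.
snap = await runtime.snapshot()
attempts = 0
while (snap["snapshot_confidence"] < 0.8 or snap["stability_reasons"]) and attempts < 3:
    snap = await runtime.snapshot()   # resnapshot while the page settles
    attempts += 1
# if still unstable here, fall through to your failure policy (reset / vision / escalate)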

Core Fields

  • id – stable element identifier
  • role – button, link, input, etc.
  • text – visible label
  • bbox – exact pixel coordinates

Visibility

  • in_viewport – currently visible
  • is_occluded – covered by overlay
  • importance – relevance score
  • is_primary – main CTA

Ordinal Selection

  • group_key – geometric bucket
  • group_index – position in group
  • in_dominant_group – main content
  • grid_pos – row/column indices

~0.6–1.2k tokens per snapshot — compare to 10–50k+ tokens for vision-based approaches

How It Works (5 Steps + Failure Policy)

1. Your Agent Defines the Goal

Your agent (LLM + logic) decides what it wants to do:

  • "Click the third search result"
  • "Add the item to cart"
  • "Assert the confirmation message appears"

Predicate does not replace planning or reasoning.

2. Predicate SDK or Debugger Controls the Browser

Using the Predicate SDK (Python or TypeScript) or attaching PredicateDebugger to an existing page:

  • launches a real browser (AgentRuntime) or attaches to one (Debugger)
  • navigates pages (waits for SPA hydration)
  • requests a snapshot()

This snapshot is not raw HTML, and not a screenshot by default. It's the rendered DOM + layout signals captured from the live page (after SPA hydration, not static HTML parsing).

3. Browser + WASM Capture Post-Hydration State

A lightweight browser extension captures from the rendered page:

  • post-hydration DOM state (JS-rendered content)
  • element bounding boxes (x, y, w, h)
  • visibility and occlusion at time of action
  • layout structure and stable coordinates

No inference. No guessing. Ground truth from the live browser.

4. Deterministic Actions Execute

Your agent selects a target from the snapshot and executes:

  • click("Add to Cart")
  • type("search input", "query")
  • Use ordinals: group_index=2 for "3rd result"
  • Use grids: row_index=0, col_index=2

Actions execute exactly where intended — no coordinate guessing.
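
For example, "click the 3rd result" reduces to a filter over the ordinal fields rather than a model guess. A minimal sketch, assuming the snapshot exposes its elements as a list of dicts with the fields documented earlier (under an "elements" key, which is an assumption) and that your action wrapper can click by element id.

# Illustrative ordinal selection: "click the 3rd result".
# The "elements" key and the click-by-id phrasing are assumptions; the group fields
# (in_dominant_group, group_index) are the ones documented above.
snap = await runtime.snapshot()
third_result = next(
    el for el in snap["elements"]
    if el["in_dominant_group"] and el["group_index"] == 2   # 0-based: index 2 = 3rd item
)
agent.act(f"Click element id={third_result['id']}")          # or your framework's click-by-id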

Failure Policy: Explicit Recovery

When actions fail or pages are unstable, Predicate follows a deterministic escalation path:

1. Resnapshot – when the DOM is unstable, take a fresh snapshot
2. Reset to checkpoint – on repeated failures, return to a known state
3. Vision fallback (optional) – only when snapshot attempts are exhausted
4. Model escalation (optional) – a bigger local model or cloud API when needed

Assertions remain the verifier throughout — pass or fail, not "maybe."
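
In plain control flow, that ladder can look like the sketch below. Only snapshot() and check().eventually() are calls shown elsewhere on this page; reset_to_checkpoint(), vision_fallback(), and escalate_model() are hypothetical placeholders for whatever your stack provides, and the assumption that eventually() returns a truthy result on pass may not match the SDK.

# Illustrative escalation ladder: resnapshot -> reset -> vision fallback -> model escalation.
# reset_to_checkpoint(), vision_fallback(), and escalate_model() are hypothetical;
# assumes `runtime` is an AgentRuntime as in the full example at the end of this page.
async def verify_with_policy(predicate, label, max_snapshots=3):
    for attempt in range(max_snapshots):
        await runtime.snapshot()                             # 1. resnapshot
        ok = await runtime.check(predicate, label=label).eventually(timeout_s=10)
        if ok:                                               # assumed: truthy result on pass
            return True
        if attempt == 1:
            await reset_to_checkpoint()                      # 2. hypothetical: back to known state
    # 3./4. only after snapshot attempts are exhausted
    return await vision_fallback(predicate) or await escalate_model(predicate)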

5. Verify with Jest-Style Assertions

Jest gives you matchers; Predicate adds stability + semantic targeting + deterministic retries.

Like Jest for web automation — assert expectations, not hope:

  • assert_("Order confirmed") — verify text appears (AgentRuntime)
  • check(predicate).eventually() — verify with retries (Debugger)
  • assert_done("checkout complete") — task completion

Assertions use the same semantic snapshot — deterministic, traceable verification.
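
A compact sketch of the difference between an immediate assertion and a retried check, plus task completion, in the AgentRuntime path. The exists() predicate and the text~ selector follow the full example at the end of this page; assert_done's single-string form follows the usage above, and exact signatures may vary.

# Immediate assertion vs. retried check, plus task completion (AgentRuntime path).
from sentience.verification import exists

await runtime.snapshot()

# Fails the step right away if the predicate does not hold in the current snapshot
runtime.assert_(exists("text~'Order confirmed'"), "order_confirmed", required=True)

# Keeps re-checking until the predicate passes or the timeout expires
await runtime.check(exists("text~'Order confirmed'"), label="order_confirmed").eventually(timeout_s=10)

# Marks the overall task as complete once its goal is verified
runtime.assert_done("checkout complete")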

Want to see this in action?

Run a live example using the Predicate SDK — no setup required.
👉 Try it live

What Makes This Different

Predicate vs Browser Infrastructure

Browser infrastructure gives you a place to run code.

Predicate gives your agent certainty about where to act.

Without grounded action selection, agents still guess.

Predicate vs Scrapers / Read APIs

Scrapers parse static HTML. Predicate snapshots the rendered DOM after SPA hydration.

Scrapers don't tell agents:

  • what is clickable
  • what is visible
  • where it is on screen

Reading ≠ acting. Predicate is for agents that must interact.

Works with browser-use, LangChain, and more

Already using browser-use or another agent framework? Use PredicateDebugger to add verification + tracing without rewriting your agent, or BrowserUseAdapter for full AgentRuntime integration.

View browser-use integration guide

Where Predicate Fits

Predicate complements agent frameworks rather than replacing them. It slots into the verification layer.

1. Agent Framework

Planner/executor: browser-use, LangChain, PydanticAI, custom

2. Predicate

Snapshot + verification + trace: AgentRuntime or PredicateDebugger

3. Browser Adapter

Playwright, CDP, or custom browser connection

What the Agent Actually Receives

Instead of pixels (~10–50k tokens) or raw DOM, your agent gets a compact semantic snapshot:

  • Ranked actionable elements – ~0.6–1.2k tokens per step
  • Ordinal selection fields – group_key, group_index, dominant_group
  • Layout detection – grid positions, regions, parent/child
  • Visibility signals – in_viewport, is_occluded, is_primary

Token Efficiency Comparison:

Vision: 10–50k tokens/step
Raw DOM: 5–20k tokens/step
Predicate: ~0.6–1.2k tokens

Built-In Observability (Traces & Studio)

Every step is recorded automatically — use local JSON traces or Predicate Studio:

snapshots
ranked targets
chosen action
execution result

Why you may see “more steps” in traces: Predicate traces are recorded at the runtime/verification level (snapshots, retries, stabilization polls, assertion attempts). A single high-level agent action can produce multiple trace steps. When a UI says “failed at step 12”, that typically refers to the trace step index, not a benchmark’s “plan steps taken”.

These traces power:

step-by-step replay
visual debugging
determinism diffing
CI-style validation

When something fails, you get a reasoned failure artifact — not a vague LLM apology.
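
Because local traces are plain JSONL (one event per line, as written by JsonlTraceSink), replay and diffing can start with nothing more than reading the file. A minimal sketch; the "step" and "type" field names are assumptions about the trace schema, not a documented format.

# Minimal local-trace inspection. JsonlTraceSink writes one JSON event per line;
# the "step" and "type" keys used here are assumed, not documented.
import json

with open("trace.jsonl") as f:
    events = [json.loads(line) for line in f if line.strip()]

for event in events:
    print(event.get("step"), event.get("type"))   # e.g. snapshot, action, assertion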

When You Should Use Predicate

Predicate is designed for:

  • Agents that must act, not just read
  • Production workflows where retries are expensive
  • Systems that need auditability and replay
  • Teams debugging real-world agent failures
  • Users who want verification without adopting a new runtime
    Use PredicateDebugger as a sidecar

If your agent only reads text, Predicate is unnecessary.

If your agent must click, type, scroll, or submit — Predicate is the missing layer.

If you're already using another framework and just want verification + debugging — use PredicateDebugger.

Try It Live

If you're building agents that must act, Predicate Labs is the missing layer.

Explore interactive SDK examples or test the API directly with real automation scenarios

Apply a permission policy, run agent steps with lifecycle hooks, trace tool calls, and verify a real download deterministically.

# Verification-first runtime demo (Agent acts; Predicate asserts)
from sentience import SentienceBrowser, PermissionPolicy, SentienceAgent
from sentience.llm_provider import OpenAIProvider
from sentience.agent_runtime import AgentRuntime
from sentience.tools import ToolRegistry, ToolContext, register_default_tools
from sentience.verification import exists, download_completed
from sentience.tracing import Tracer, JsonlTraceSink

# Tracing (local JSONL; upload optional)
tracer = Tracer(run_id="demo-run", sink=JsonlTraceSink("trace.jsonl"))

# Startup permission policy avoids non-DOM browser permission bubbles
policy = PermissionPolicy(
    default="clear",
    auto_grant=["geolocation"],
    geolocation={"latitude": 37.7749, "longitude": -122.4194},
)

browser = SentienceBrowser(
    api_key="sk_live_...",
    allowed_domains=["example.com"],
    permission_policy=policy,
)
browser.start()

llm = OpenAIProvider(api_key="sk_openai_...", model="gpt-4o-mini")
agent = SentienceAgent(browser, llm, tracer=tracer)
runtime = AgentRuntime(browser, browser.page, tracer)

# Typed, traceable tools (includes evaluate_js + permission tools)
registry = ToolRegistry()
register_default_tools(registry, runtime)
ctx = ToolContext(runtime)

def on_start(ctx): print("hook start:", ctx.goal)
def on_end(ctx): print("hook end:", ctx.success, ctx.outcome)

browser.page.goto("https://example.com/billing")

# Small introspection (bounded output; no DOM dump)
title = await registry.execute("evaluate_js", {"code": "document.title"}, ctx=ctx)

# Act with lifecycle hooks
agent.act("Sign in if needed", on_step_start=on_start, on_step_end=on_end)

# Verify + download (assert what happened, not what you hope happened)
runtime.begin_step("Download invoice and verify")
await runtime.snapshot(limit=60)
runtime.assert_(exists("role=button text~'Download invoice'"), "has_download_button", required=True)

agent.act("Click 'Download invoice PDF'", on_step_start=on_start, on_step_end=on_end)

await runtime.snapshot()
await runtime.check(download_completed("invoice"), label="invoice_downloaded", required=True).eventually(timeout_s=10)

print("PASS ✓ invoice_downloaded")
await tracer.close(upload=False)
browser.close()

✓ Jest-Style Assertions

Verify outcomes deterministically — assert what happened, not what you hope happened

📸 Stability-Aware Snapshots

Rendered DOM after hydration with confidence scoring — enables deterministic verification

↻ Bounded Retries

Retry verification (not actions) with confidence gating — explainable failures with reason codes

Predicate Labs focuses on execution intelligence. Browser runtimes and navigation engines are intentionally decoupled.