Updated 2026-01-24 - Runtime verification support for agent loops with assertion predicates and task completion tracking.
Think Jest for AI web agents. AgentRuntime provides Jest-style semantic assertions so agents can verify what changed on the page instead of guessing when they're "done".
Related topics
AgentRuntime stays small by design. These pages cover the runtime’s surrounding capabilities.
Predicate provides Jest-style assertions for AI web agents.
Instead of trusting that an agent probably clicked the right thing, Predicate verifies outcomes using structured snapshots from a live, rendered browser (post-SPA hydration).
Predicate does not parse static HTML and does not rely on vision by default.
Predicate snapshots the rendered DOM + layout from a real browser after SPA hydration. This works reliably on JS-heavy applications where static HTML scraping fails.
Deterministic assertions
Reliability & recovery
.eventually() retries with bounded backoff
With structured snapshots, 3B–14B local models are viable. Larger models improve planning and recovery, not DOM parsing.
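To make "retries with bounded backoff" concrete, here is a minimal, self-contained sketch of the idea. It is not the SDK's implementation: the function name `eventually`, its parameters, and the doubling schedule are assumptions chosen for illustration.

```python
import time
from typing import Callable


def eventually(
    check: Callable[[], bool],
    max_attempts: int = 5,
    base_delay_s: float = 0.2,
    max_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry `check` until it passes or attempts run out.

    The delay doubles after each failure but never exceeds `max_delay_s`,
    so the total waiting time is bounded.
    """
    delay = base_delay_s
    for attempt in range(max_attempts):
        if check():
            return True
        if attempt < max_attempts - 1:  # no sleep after the final attempt
            sleep(min(delay, max_delay_s))
            delay *= 2
    return False


# Usage: a check that starts passing on its third call.
calls = {"n": 0}
delays = []


def flaky_check() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


# Record the requested delays instead of really sleeping.
result = eventually(flaky_check, sleep=delays.append)
```

Injecting `sleep` keeps the retry schedule deterministic and easy to inspect, which is the property that matters for reproducible agent runs.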
Debugging Agent Failures
When assertions fail, the Failure Artifact Buffer automatically captures video clips, snapshots, and metadata for post-mortem analysis. Review artifacts locally or upload to Predicate Studio for visual debugging.
CAPTCHA Handling
Predicate detects but does not solve CAPTCHAs — by design. The SDK provides hooks to integrate your preferred resolution strategy: human-in-loop, external solvers, or vision-only verification. Learn how to configure CAPTCHA policies and handlers.
Vision models are used only if structural signals are exhausted — never by default.
Vision is used only after snapshot confidence is exhausted. Assertions remain invariant; only the perception layer changes.
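The separation between perception and assertion can be sketched as follows. Everything here is hypothetical (the `Perception` type, the `perceive` helper, the confidence threshold and retry count); it only illustrates that the assertion stays the same while the source of the observation changes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Perception:
    source: str          # "snapshot" or "vision"
    labels: List[str]    # visible text labels, however they were obtained
    confidence: float    # 0.0 to 1.0


def perceive(
    snapshot: Callable[[], Perception],
    vision: Callable[[], Perception],
    min_confidence: float = 0.6,
    max_snapshot_tries: int = 3,
) -> Perception:
    """Prefer structural snapshots; call vision only after they are exhausted."""
    for _ in range(max_snapshot_tries):
        result = snapshot()
        if result.confidence >= min_confidence:
            return result
    return vision()


def has_label(text: str) -> Callable[[Perception], bool]:
    # The assertion only looks at labels, so it is identical for either source.
    return lambda p: any(text in label for label in p.labels)


vision_calls = []


def good_snapshot() -> Perception:
    return Perception("snapshot", ["Pay now"], 0.9)


def poor_snapshot() -> Perception:
    return Perception("snapshot", [], 0.1)


def vision() -> Perception:
    vision_calls.append(1)
    return Perception("vision", ["Pay now"], 0.7)


check = has_label("Pay")
from_structure = perceive(good_snapshot, vision)   # vision is never called
calls_after_structure = len(vision_calls)
from_vision = perceive(poor_snapshot, vision)      # falls back after 3 tries
```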
The AgentRuntime class provides a thin runtime wrapper that combines:
It's designed for agent verification loops where you need to repeatedly take snapshots, execute actions, and verify results.
New in 2026-01-24: AgentRuntime is now framework-agnostic and accepts any browser implementing the BrowserBackend protocol. This allows integration with browser-use, Playwright, or any CDP-based browser through a single backend parameter.
from browser_use import BrowserSession, BrowserProfile
from predicate import get_extension_dir, exists, url_matches
from predicate.backends import BrowserUseAdapter
from predicate.agent_runtime import AgentRuntime
from predicate.tracing import Tracer, JsonlTraceSink
# 1. Setup browser-use with Predicate extension
profile = BrowserProfile(args=[f"--load-extension={get_extension_dir()}"])
session = BrowserSession(browser_profile=profile)
await session.start()
# 2. Create backend from browser-use session
adapter = BrowserUseAdapter(session)
backend = await adapter.create_backend()
# 3. Navigate using browser-use
page = await session.get_current_page()
await page.goto("https://example.com")
# 4. Create AgentRuntime with backend
sink = JsonlTraceSink("trace.jsonl")
tracer = Tracer(run_id="test-run", sink=sink)
runtime = AgentRuntime(backend=backend, tracer=tracer)
# 5. Use runtime for verification loop
await runtime.snapshot()
runtime.assert_(url_matches(r"example\.com"), label="on_homepage")
runtime.assert_(exists("role=button"), label="has_buttons")
# 6. Check task completion
if runtime.assert_done(exists("text~'Success'"), label="task_complete"):
    print("Task completed!")

# Same setup as above, but with API key for smart element ranking
runtime = AgentRuntime(
    backend=backend,
    tracer=tracer,
    predicate_api_key="sk_pro_xxxxx",  # Enables Gateway refinement
)
# Snapshots now use server-side ML ranking/filtering
await runtime.snapshot()  # Elements are refined by Gateway

If you're using browser-use, you have two common integration patterns:
- BrowserUseAdapter to turn a browser-use session into a Predicate BrowserBackend (shown above in Quick Start).
- PredicateAgent when you want an agent loop with verification "wired in".

This runs step assertions and a done assertion, and emits verification data into traces.
from browser_use.integrations.sentience import PredicateAgent
from predicate.verification import url_contains, exists, all_of
# Define per-step assertions
step_assertions = [
    {
        "predicate": url_contains("example.com"),
        "label": "on_target_site",
        "required": True,
    },
    {
        "predicate": exists("role=button"),
        "label": "has_buttons",
    },
]
# Define task completion assertion
done_assertion = all_of(
    url_contains("/success"),
    exists("text~'Complete'"),
)
agent = PredicateAgent(
    task="Complete the checkout flow",
    llm=llm,
    browser_session=session,
    enable_verification=True,
    step_assertions=step_assertions,
    done_assertion=done_assertion,
    trace_dir="traces",
)
result = await agent.run()
print(result.get("verification"))
PredicateContext builds compact, ranked DOM context blocks for LLMs (semantic geometry instead of full DOM + screenshots). Pair it with AgentRuntime when you want:
For the full browser-use integration guide, see: Browser‑use Integration.
from predicate.backends import PredicateContext
ctx = PredicateContext(
    max_elements=60,
    show_overlay=True,
    top_element_selector={
        "by_importance": 60,
        "from_dominant_group": 15,
        "by_position": 10,
    },
)
state = await ctx.build(session, goal="Click the first search result")
if state:
    print(state.prompt_block)

from predicate.agent_runtime import AgentRuntime
# New: Using backend parameter (recommended)
runtime = AgentRuntime(
    backend=backend,  # Any BrowserBackend implementation
    tracer=tracer,  # Tracer for event emission
    predicate_api_key="sk_...",  # Optional: Pro/Enterprise Gateway refinement
)

Parameters:
- backend - Any browser implementing the BrowserBackend protocol (browser-use, Playwright, CDP-based)
- tracer - Tracer for emitting verification events
- predicate_api_key (optional) - API key for Pro/Enterprise tier Gateway refinement

For existing AsyncPredicateBrowser users, use the factory method:
from predicate import AsyncPredicateBrowser
from predicate.agent_runtime import AgentRuntime
async with AsyncPredicateBrowser() as browser:
    page = await browser.new_page()
    await page.goto("https://example.com")
    # Use factory method for backward compatibility
    runtime = await AgentRuntime.from_sentience_browser(
        browser=browser,
        page=page,
        tracer=tracer,
    )
    await runtime.snapshot()

| Property | Type | Description |
|---|---|---|
| step_id / stepId | string \| null | Current step identifier |
| step_index / stepIndex | number | Current step index (0-based) |
| last_snapshot / lastSnapshot | Snapshot \| null | Most recent snapshot |
| is_task_done / isTaskDone | boolean | Whether task is complete |
The BrowserBackend protocol defines the minimal interface required for browser integration. Any browser framework can work with AgentRuntime by implementing this protocol.
| Method | Description |
|---|---|
| eval(expression) | Execute JavaScript in page context |
| call(fn, args) | Call JavaScript function with arguments |
| get_url() | Get current page URL |
| screenshot_png() | Capture viewport screenshot |
| mouse_click() | Perform mouse click action |
| mouse_move() | Move mouse to coordinates |
| wheel() | Scroll using mouse wheel |
| type_text() | Send keyboard input |
| wait_ready_state() | Wait for document ready state |
| refresh_page_info() | Get viewport and scroll info |
| get_layout_metrics() | Get page layout metrics |
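As a rough illustration of what "implementing this protocol" could look like, the sketch below declares a structural protocol for a subset of the methods in the table and satisfies it with an in-memory fake. The name `MiniBrowserBackend`, the async style, and every parameter and return type are assumptions; the table above lists method names only, so consult the SDK for the real signatures.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class MiniBrowserBackend(Protocol):
    """Illustrative subset of the table above; signatures are assumptions."""

    async def eval(self, expression: str) -> Any: ...
    async def get_url(self) -> str: ...
    async def screenshot_png(self) -> bytes: ...
    async def mouse_click(self, x: float, y: float) -> None: ...


class FakeBackend:
    """In-memory stand-in that satisfies the protocol without a browser."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.clicks = []

    async def eval(self, expression: str) -> Any:
        # A real backend would run JavaScript; this fake knows one expression.
        return self.url if expression == "location.href" else None

    async def get_url(self) -> str:
        return self.url

    async def screenshot_png(self) -> bytes:
        return b"\x89PNG\r\n\x1a\n"  # PNG signature only, not a real image

    async def mouse_click(self, x: float, y: float) -> None:
        self.clicks.append((x, y))


async def describe(backend: MiniBrowserBackend) -> str:
    # Code written against the protocol works with any conforming backend.
    await backend.mouse_click(10, 20)
    return await backend.get_url()


backend = FakeBackend("https://example.com")
current_url = asyncio.run(describe(backend))
```

Because the protocol is structural, `FakeBackend` never inherits from it; having the right methods is enough, which is what lets different browser frameworks plug in through thin adapters.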
The SDK provides built-in backend implementations:
| Backend | Use Case |
|---|---|
| BrowserUseAdapter | For browser-use integration via CDPBackend |
| PlaywrightBackend | For direct Playwright usage |
| CDPBackend | Low-level CDP-based browser control |
Takes a snapshot of the current page state. Updates last_snapshot / lastSnapshot, which is used as context for assertions.
# Take snapshot (required before element assertions)
snap = runtime.snapshot()
print(f"Found {len(snap.elements)} elements")

Returns: Snapshot - Current page state
Begins a new verification step. Generates a new step ID, clears previous assertions, and increments step index.
# Begin a new step
step_id = runtime.begin_step("Navigate to checkout")
print(f"Step ID: {step_id}")
# Or with explicit step index
step_id = runtime.begin_step("Verify cart", step_index=2)

Parameters:
- goal (string) - Description of what this step aims to achieve
- step_index / stepIndex (number, optional) - Explicit step index (auto-increments if omitted)

Returns: string - Generated step ID
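The bookkeeping described for begin_step (new step ID, cleared assertions, incremented or explicit index) can be modeled in a few lines. This is a toy stand-in rather than the SDK's code: the `StepTracker` class and the UUID-based ID format are assumptions.

```python
import uuid
from typing import List, Optional


class StepTracker:
    """Toy model of the step bookkeeping described above (not the SDK's code)."""

    def __init__(self) -> None:
        self.step_id: Optional[str] = None
        self.step_index: int = -1  # becomes 0 on the first begin_step call
        self.goal: Optional[str] = None
        self.assertions: List[dict] = []

    def begin_step(self, goal: str, step_index: Optional[int] = None) -> str:
        self.goal = goal
        self.assertions = []  # assertions never leak across steps
        if step_index is None:
            self.step_index += 1  # auto-increment when no index is given
        else:
            self.step_index = step_index
        self.step_id = str(uuid.uuid4())  # ID format is an assumption
        return self.step_id


tracker = StepTracker()
first = tracker.begin_step("Navigate to checkout")
index_after_first = tracker.step_index
tracker.assertions.append({"label": "on_checkout", "passed": True})
second = tracker.begin_step("Verify cart", step_index=2)
third = tracker.begin_step("Pay")  # continues from the explicit index
```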
Evaluates an assertion predicate against the current snapshot state. Results are accumulated for the step and emitted as verification events.
# URL assertion
url_ok = runtime.assert_(url_contains("checkout"), "on_checkout_page")
# Element assertion
has_btn = runtime.assert_(exists("role=button text~'Pay'"), "has_pay_button")
# Required assertion (gates step success)
ready = runtime.assert_(
    all_of(url_contains("checkout"), exists("role=button")),
    "checkout_ready",
    required=True,
)

Parameters:
- predicate (Predicate) - Predicate function to evaluate
- label (string) - Human-readable label for this assertion
- required (boolean, optional) - If true, gates step success (default: false)

Returns: boolean - True if assertion passed
Asserts task completion with a required assertion. When passed, marks the task as done.
# Check if task goal is achieved
if runtime.assert_done(exists("text~'Order Confirmed'"), "order_placed"):
    print("Order successfully placed!")
    # runtime.is_task_done is now True

Parameters:
- predicate (Predicate) - Predicate function to evaluate
- label (string) - Human-readable label for this assertion

Returns: boolean - True if task is complete
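The relationship between optional assertions, required assertions, and the task-done flag can be sketched with a small stand-in. `MiniRuntime` and `Record` are hypothetical, and predicates are simplified to plain callables over the current URL; only the accumulate-and-gate behavior mirrors the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Record:
    label: str
    passed: bool
    required: bool


@dataclass
class MiniRuntime:
    """Toy model: predicates are plain callables taking the current URL."""

    url: str
    records: List[Record] = field(default_factory=list)
    is_task_done: bool = False

    def assert_(self, predicate: Callable[[str], bool], label: str,
                required: bool = False) -> bool:
        passed = bool(predicate(self.url))
        self.records.append(Record(label, passed, required))
        return passed

    def assert_done(self, predicate: Callable[[str], bool], label: str) -> bool:
        # A done-assertion is a required assertion that also flips the flag.
        passed = self.assert_(predicate, label, required=True)
        if passed:
            self.is_task_done = True
        return passed

    def all_assertions_passed(self) -> bool:
        return all(r.passed for r in self.records)

    def required_assertions_passed(self) -> bool:
        return all(r.passed for r in self.records if r.required)


rt = MiniRuntime(url="https://example.com/checkout")
rt.assert_(lambda url: "checkout" in url, "on_checkout", required=True)
rt.assert_(lambda url: "promo" in url, "has_promo")  # optional; fails
required_ok = rt.required_assertions_passed()  # only an optional check failed
all_ok = rt.all_assertions_passed()
not_yet = rt.assert_done(lambda url: url.endswith("/success"), "order_placed")
rt.url = "https://example.com/success"
done = rt.assert_done(lambda url: url.endswith("/success"), "order_placed")
```

A failed optional assertion is recorded but does not gate the step, while a failed done-assertion leaves `is_task_done` unchanged so the agent loop keeps going.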
from predicate import url_matches, url_contains
# Regex match on URL
runtime.assert_(url_matches(r"https://.*\.example\.com"), "is_https")
# Substring match on URL
runtime.assert_(url_contains("checkout"), "on_checkout")

from predicate import exists, not_exists, element_count
# Element exists (using query syntax)
runtime.assert_(exists("role=button text~'Submit'"), "has_submit")
# Element does not exist
runtime.assert_(not_exists("text~'Error'"), "no_errors")
# Element count check
runtime.assert_(element_count("role=listitem", min=5), "has_items")

from predicate import all_of, any_of
# All conditions must pass
runtime.assert_(
    all_of(
        url_contains("checkout"),
        exists("role=button text~'Pay'"),
        not_exists("text~'Error'"),
    ),
    "checkout_ready",
)
# Any condition must pass
runtime.assert_(
    any_of(
        exists("text~'Success'"),
        exists("text~'Complete'"),
    ),
    "task_done",
)

from predicate import custom
from predicate.verification import AssertContext, AssertOutcome
def my_predicate(ctx: AssertContext) -> AssertOutcome:
    # Custom logic using ctx.snapshot and ctx.url
    has_items = len(ctx.snapshot.elements) > 10
    return AssertOutcome(
        passed=has_items,
        reason="Found sufficient elements" if has_items else "Too few elements",
        details={"element_count": len(ctx.snapshot.elements)},
    )

runtime.assert_(custom(my_predicate), "custom_check")

For expressive, Jest-like assertions, Predicate also ships an Assertion DSL (E, expect, dominant-list queries). DSL expressions compile to predicates; you still pass them into assert/assert_ (or into .check(...).eventually()).
For the full DSL guide and examples, see: Jest‑Style Assertions.
from predicate.asserts import E, expect, in_dominant_list
runtime.assert_(
    expect(E(role="button", text_contains="Submit")).to_be_visible(),
    label="submit_visible",
    required=True,
)
runtime.assert_(
    expect(in_dominant_list().nth(0)).to_exist(),
    label="first_result_exists",
)

# Check if all assertions in current step passed
if runtime.all_assertions_passed():
    print("All assertions passed!")
# Check if all required assertions passed
if runtime.required_assertions_passed():
    print("All required assertions passed!")
# Check if task is done
if runtime.is_task_done:
    print("Task complete!")

Retrieve accumulated assertions for inclusion in trace step_end events:
# Get assertions data for step_end event
assertions_data = runtime.get_assertions_for_step_end()
print(f"Assertions: {assertions_data['assertions']}")
print(f"Task done: {assertions_data.get('task_done', False)}")
print(f"Task label: {assertions_data.get('task_done_label')}")
# Flush and clear assertions for next step
assertions = runtime.flush_assertions()

For multi-task runs, reset the task completion state:
# Reset task_done state for next task
runtime.reset_task_done()

Assertions are automatically emitted as verification events to the tracer, making them visible in Studio timeline.
{
  "type": "verification",
  "data": {
    "kind": "assert",
    "label": "on_checkout_page",
    "passed": true,
    "required": false,
    "reason": "URL contains 'checkout'",
    "details": { "url": "https://example.com/checkout" }
  },
  "step_id": "abc-123"
}

When assert_done() passes:
{
  "type": "verification",
  "data": {
    "kind": "task_done",
    "label": "order_placed",
    "passed": true
  },
  "step_id": "abc-123"
}

from predicate import (
    AgentRuntime,
    PredicateBrowser,
    all_of,
    exists,
    not_exists,
    url_contains,
    url_matches,
)
from predicate.tracer_factory import create_tracer
def main():
    # Setup
    tracer = create_tracer(api_key="sk_...", run_id="verification-demo", upload_trace=False)
    browser = PredicateBrowser(api_key="sk_...", headless=False)
    browser.start()
    try:
        runtime = AgentRuntime(browser, browser.page, tracer)
        # Navigate
        browser.page.goto("https://example.com")
        browser.page.wait_for_load_state("networkidle")
        # Step 1: Verify page loaded
        runtime.begin_step("Verify page loaded correctly")
        runtime.snapshot()
        # Run assertions
        runtime.assert_(url_contains("example.com"), "on_example_domain")
        runtime.assert_(url_matches(r"https://.*example\.com"), "url_is_https")
        runtime.assert_(exists("role=heading"), "has_heading")
        runtime.assert_(not_exists("text~'Error'"), "no_error_message")
        # Combined assertion
        runtime.assert_(
            all_of(url_contains("example"), exists("role=link")),
            "page_fully_ready",
        )
        # Check task completion
        if runtime.assert_done(exists("text~'Example Domain'"), "reached_example_page"):
            print("Task completed!")
        # Summary
        print(f"All passed: {runtime.all_assertions_passed()}")
        print(f"Task complete: {runtime.is_task_done}")
    finally:
        tracer.close(blocking=True)
        browser.close()


if __name__ == "__main__":
    main()

| Predicate | Description |
|---|---|
| url_matches(pattern) / urlMatches(pattern) | URL matches regex pattern |
| url_contains(substring) / urlContains(substring) | URL contains substring |
| exists(query) | Element matching query exists in snapshot |
| not_exists(query) / notExists(query) | No element matching query exists |
| element_count(query, min, max) / elementCount(query, opts) | Element count within range |
| all_of(...predicates) / allOf(...predicates) | All predicates must pass |
| any_of(...predicates) / anyOf(...predicates) | Any predicate must pass |
| custom(fn) | Custom predicate function |
| Property | Type | Description |
|---|---|---|
| passed | boolean | Whether assertion passed |
| reason | string | Human-readable explanation |
| details | object | Additional context data |
| Property | Type | Description |
|---|---|---|
| snapshot | Snapshot \| null | Current page snapshot |
| url | string \| null | Current page URL |
| step_id / stepId | string \| null | Current step identifier |
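To show how predicates, contexts, and outcomes fit together, here is a self-contained sketch of combinators in the spirit of all_of and any_of. The `Context` and `Outcome` dataclasses are simplified stand-ins for AssertContext and AssertOutcome (the snapshot field is omitted), and the reason strings are invented for illustration; the SDK's real combinators may behave differently.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class Outcome:  # stand-in for AssertOutcome
    passed: bool
    reason: str = ""
    details: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Context:  # stand-in for AssertContext (snapshot omitted for brevity)
    url: Optional[str] = None
    step_id: Optional[str] = None


Predicate = Callable[[Context], Outcome]


def url_contains(substring: str) -> Predicate:
    def check(ctx: Context) -> Outcome:
        ok = ctx.url is not None and substring in ctx.url
        verdict = "contains" if ok else "does not contain"
        return Outcome(ok, f"URL {verdict} '{substring}'", {"url": ctx.url})
    return check


def all_of(*predicates: Predicate) -> Predicate:
    def check(ctx: Context) -> Outcome:
        outcomes = [p(ctx) for p in predicates]
        failed = [o.reason for o in outcomes if not o.passed]
        # Surface every failing reason so the trace explains what went wrong.
        return Outcome(not failed, "; ".join(failed) or "all passed")
    return check


def any_of(*predicates: Predicate) -> Predicate:
    def check(ctx: Context) -> Outcome:
        ok = any(p(ctx).passed for p in predicates)
        return Outcome(ok, "at least one passed" if ok else "none passed")
    return check


ctx = Context(url="https://example.com/checkout", step_id="abc-123")
both = all_of(url_contains("example"), url_contains("success"))(ctx)
either = any_of(url_contains("example"), url_contains("success"))(ctx)
```

Treating a predicate as a function from context to outcome is what makes composition cheap: a combinator is just another predicate, so it can be nested or passed wherever a single check is accepted.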
toBeEnabled(): what's different?
Yes, Jest/Playwright can verify state given a selector. Predicate handles the harder parts that come before verification.
What Predicate adds:
- snapshot_confidence) before asserting
- element_not_found, dom_unstable) + nearest-match suggestions
- .eventually() with deterministic backoff; optional vision fallback only when structure fails

Jest is the test runner. Predicate provides the perception + verification layer that makes assertions meaningful for AI agents operating on dynamic web pages.
Jest/Playwright matchers like expect(locator).toBeEnabled() check state at a point in time, given a locator you already have.
Predicate assertions like runtime.assert_(is_enabled(selector)) are stability-aware and include element selection semantics + retry logic.
Conceptual mapping:
- is_enabled(...) ↔ Jest toBeEnabled(), but Predicate finds the element semantically first
- is_checked(...) ↔ Jest toBeChecked()
- text_contains(...) ↔ Jest toContainText()
- exists(...) ↔ Jest toBeVisible() / toBeAttached(), but with snapshot context
- .eventually(...) ↔ Jest waitFor / retry loops, but Predicate retries with confidence gating

Jest verifies what you already found. Predicate helps you find the right element in the first place, then verifies it with full context.
Note: Predicate can be used under Jest as the test runner. Jest is the harness; Predicate is the perception + verification layer.
If you're interested in using Predicate assertions for enterprise QA workflows — pre-release validation, regression testing, and monitoring critical user flows — see AI-Driven QA with Predicate.