The `snapshot()` function captures the current rendered page state and returns a ranked, token-bounded set of interactive elements. It also returns runtime signals you can use for Jest-style verification: layout/ordinality (including `dominant_group_key`), modal/overlay detection (`modal_detected`, `modal_grids`), and diagnostics (stability/confidence, reason codes, and best-effort CAPTCHA / “requires vision” signals).
from predicate import snapshot, SnapshotOptions, SnapshotFilter

# Basic snapshot (uses default options)
snap = snapshot(browser)

# With screenshot and limit
snap = snapshot(browser, SnapshotOptions(
    screenshot=True,
    limit=200
))

# Force local processing (no credits used)
snap = snapshot(browser, SnapshotOptions(use_api=False))

# With filtering
snap = snapshot(browser, SnapshotOptions(
    filter=SnapshotFilter(
        min_area=100,
        allowed_roles=["button", "link"]
    )
))

# Or use dict for filter (also supported)
snap = snapshot(browser, SnapshotOptions(
    filter={"min_area": 100, "allowed_roles": ["button", "link"]}
))

Credit Consumption:
- When `api_key` is provided, this calls the server-side `/v1/snapshot` endpoint, which consumes 1 credit per call (metered billing).
- Use `use_api=False` for local processing (no credits; ranking is best-effort/local-only).

Gateway timeout (API mode):
- When `use_api=true`, the SDK sends `raw_elements` to the Gateway (`POST /v1/snapshot`) and waits for a refined response.
- If you see timeouts (e.g., `ReadTimeout`), increase the Gateway timeout:
  - Python: `SnapshotOptions.gateway_timeout_s` (seconds)
  - TypeScript: `SnapshotOptions.gatewayTimeoutMs` (milliseconds)

Payload Size Limit:
- If a request is too large for the API, use the `limit` option to reduce the number of elements, or use `use_api=False` for local processing (no size limit).

Screenshots:
- When `screenshot: true`, the screenshot is captured locally by the extension.
- In `use_api=true` mode, the SDK does not receive screenshots from the server; it merges the server-ranked elements with the locally captured screenshot.

Python:
- `browser` (PredicateBrowser): Browser instance
- `options` (SnapshotOptions, optional): Snapshot configuration options

SnapshotOptions fields:
- `screenshot` (bool | ScreenshotConfig, optional): Capture screenshot. `True` for PNG, or `{"format": "jpeg", "quality": 80}`. Default: `False`.
- `limit` (int, optional): Maximum number of elements to return. Default: 50. Range: 1-500 (SDK). In API mode, the server caps this value (default cap: 100).
- `filter` (SnapshotFilter | dict, optional): Filter options:
  - `min_area`: Minimum element area in pixels
  - `allowed_roles`: List of roles to include (e.g., `["button", "link"]`)
  - `min_z_index`: Minimum z-index value
- `use_api` (bool, optional): Force server API (`True`) or local extension (`False`). Auto-detects if `None`.
- `gateway_timeout_s` (float, optional): Gateway snapshot timeout in seconds (only relevant when `use_api=true`). Default: 30.
- `show_overlay` (bool, optional): Display visual overlay in browser highlighting detected elements. Default: `False`.
- `goal` (str, optional): Optional goal/task description for ML reranking.

TypeScript:
- `browser` (PredicateBrowser): Browser instance
- `options` (object, optional):
  - `screenshot` (boolean | object): Capture screenshot
  - `limit` (number): Maximum elements to return
  - `filter` (object): Filter options
  - `use_api` (boolean): Force server API or local extension
  - `gatewayTimeoutMs` (number): Gateway snapshot timeout in milliseconds (only relevant when `use_api=true`). Default: 30000.
  - `show_overlay` (boolean): Display visual overlay (default: false)
  - `goal` (string, optional): Optional goal/task description for ML reranking

from predicate import snapshot, SnapshotOptions
# Large pages can take longer to refine server-side.
snap = snapshot(
    browser,
    SnapshotOptions(use_api=True, gateway_timeout_s=60),
)

Returns a Snapshot object with:
- `elements`: List of Element objects (sorted by importance)
- `url`: Current page URL
- `viewport`: Viewport dimensions
- `timestamp`: Snapshot timestamp
- `screenshot`: Base64-encoded image (if requested)
- `dominant_group_key`: Geometric group key for the main content area (may be null)
- `diagnostics`: Stability/debug diagnostics (may be null)
- `modal_detected`: True if a modal/overlay grid was detected (may be null)
- `modal_grids`: Detected modal grids (may be null)
- `ml_rerank`: ML reranking metadata (may be null)

Diagnostics (`snapshot.diagnostics`)

`diagnostics` is best-effort runtime evidence about page stability and “how trustworthy the snapshot is right now”.
Use it to:
- Decide whether to retry a snapshot before acting (e.g., `.eventually()` / bounded retries)

| Field | Type | How to use it |
|---|---|---|
| `confidence` | `number \| null` | A 0..1 stability score. Low confidence typically means the page is still moving (navigation, hydration, modals, DOM churn). Use it as a signal to retry snapshots before acting. |
| `reasons` | `string[]` | Machine-readable reason codes explaining low confidence. Log these and include them in artifacts; this is often the fastest way to debug flaky runs. |
| `metrics` | `object \| null` | Best-effort browser-side metrics used to compute confidence. Useful for diagnosing “why was this unstable?” and for telemetry dashboards. |
| `captcha` | `object \| null` | Detection-only CAPTCHA signal (no solving). Use it to branch to your CAPTCHA handling strategy or fail fast with a clear reason. |
| `requires_vision` | `boolean \| null` | Best-effort recommendation that structure may be insufficient for this page state (e.g., heavy canvas / non-semantic UI). Use it as an escalation signal. |
| `requires_vision_reason` | `string \| null` | Human-readable explanation for why structure is likely insufficient. Include it in traces/artifacts to make failures explainable. |
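
The retry guidance for `confidence` can be sketched as a bounded loop. This is a minimal illustration, not SDK code: `snapshot_when_stable`, its thresholds, the stand-in snapshot objects, and the `"dom_churn"` reason code are all hypothetical. In real use you would pass something like `lambda: snapshot(browser)` as `take_snapshot`.

```python
import time
from types import SimpleNamespace


def snapshot_when_stable(take_snapshot, min_confidence=0.8, max_attempts=3, delay_s=0.5):
    """Call take_snapshot() until diagnostics.confidence is acceptable.

    Returns (snapshot, attempts_used). Thresholds are illustrative defaults.
    """
    snap = None
    for attempt in range(1, max_attempts + 1):
        snap = take_snapshot()
        diagnostics = getattr(snap, "diagnostics", None)
        confidence = getattr(diagnostics, "confidence", None)
        # diagnostics and confidence may be null; with no signal, accept the snapshot.
        if confidence is None or confidence >= min_confidence:
            return snap, attempt
        # Log reason codes so flaky runs stay explainable.
        reasons = getattr(diagnostics, "reasons", [])
        print(f"attempt {attempt}: confidence={confidence} reasons={reasons}")
        if attempt < max_attempts:
            time.sleep(delay_s)
    # Attempts exhausted: return the last snapshot and let the caller decide.
    return snap, max_attempts


# Stand-in snapshots for demonstration only (not produced by the SDK).
_fake_snapshots = iter([
    SimpleNamespace(diagnostics=SimpleNamespace(confidence=0.3, reasons=["dom_churn"])),
    SimpleNamespace(diagnostics=SimpleNamespace(confidence=0.9, reasons=[])),
])
stable_snap, attempts = snapshot_when_stable(lambda: next(_fake_snapshots), delay_s=0.0)
```

Treating a missing `confidence` as acceptable is a design choice of this sketch; a stricter caller could retry in that case as well.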
Metrics (`diagnostics.metrics`)

| Metric | Meaning |
|---|---|
| `ready_state` | Document readyState (e.g., "loading", "interactive", "complete"). |
| `quiet_ms` | How long the page has been “quiet” (no major DOM churn), in milliseconds (best-effort). |
| `node_count` | Approximate DOM node count (best-effort). Useful for “page exploded” diagnostics. |
| `interactive_count` | How many interactive candidates were detected (best-effort). |
| `raw_elements_count` | How many raw elements were captured before filtering (best-effort). |
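
One way to use these metrics is a simple "has the page settled?" check that combines `ready_state` and `quiet_ms`. The helper names and the 500 ms threshold below are assumptions made for illustration; the sketch accepts either a dict or an attribute-style object because the exact shape of `metrics` in your SDK version is not specified here.

```python
def _field(obj, name):
    """Read a field from either a mapping or an attribute-style object."""
    if obj is None:
        return None
    if isinstance(obj, dict):
        return obj.get(name)
    return getattr(obj, name, None)


def page_looks_settled(metrics, min_quiet_ms=500):
    """Return True when the document is complete and has been quiet long enough."""
    ready_state = _field(metrics, "ready_state")
    quiet_ms = _field(metrics, "quiet_ms")
    # Metrics are best-effort: a missing value is treated as "not settled".
    if ready_state != "complete" or quiet_ms is None:
        return False
    return quiet_ms >= min_quiet_ms


settled = page_looks_settled({"ready_state": "complete", "quiet_ms": 1200})
still_loading = page_looks_settled({"ready_state": "interactive", "quiet_ms": 1200})
```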
CAPTCHA (`diagnostics.captcha`)

CAPTCHA diagnostics are detection-only signals:

| Field | Meaning |
|---|---|
| `detected` | True if a CAPTCHA-like pattern was detected. |
| `provider_hint` | Best-effort provider hint (may be null). |
| `confidence` | 0..1 confidence of detection. |
| `evidence` | Best-effort evidence hits (text/selector/iframe/url) to make detections explainable. |
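
Because the signal is detection-only, a caller typically turns it into a fail-fast reason. The sketch below assumes the CAPTCHA diagnostics are available as a dict with the fields in the table; `captcha_failure_reason` and the 0.7 threshold are hypothetical.

```python
def captcha_failure_reason(captcha, min_confidence=0.7):
    """Return a readable reason if a CAPTCHA likely blocks the run, else None."""
    if captcha is None or not captcha.get("detected"):
        return None
    confidence = captcha.get("confidence") or 0.0
    if confidence < min_confidence:
        # Weak detections are ignored here; a caller could log them instead.
        return None
    provider = captcha.get("provider_hint") or "unknown provider"
    return f"CAPTCHA detected ({provider}, confidence={confidence:.2f})"


reason = captcha_failure_reason(
    {"detected": True, "provider_hint": None, "confidence": 0.95, "evidence": {}}
)
if reason is not None:
    print(reason)  # e.g., abort the step or hand off to a CAPTCHA strategy
```

Including `evidence` in traces alongside the reason keeps the detection explainable.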
Each element in `snapshot.elements` has the following properties:

| Property | Type | Description |
|---|---|---|
| `id` | `int` | Unique identifier for clicking/interacting |
| `role` | `str` | Semantic role (button, link, textbox, heading, etc.) |
| `text` | `str \| None` | Visible text content |
| `importance` | `int` | AI importance score (0-1000, higher = more important) |
| `bbox` | `BBox` | Bounding box with x, y, width, height |
| `visual_cues` | `VisualCues` | Visual analysis (`is_primary`, `is_clickable`, `background_color_name`) |
| `in_viewport` | `bool` | Whether element is visible in current viewport |
| `is_occluded` | `bool` | Whether element is covered by another element |
| `z_index` | `int` | CSS z-index value (default: 0) |
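
A common use of these properties is narrowing the list to elements that can actually be clicked right now. The following sketch uses a stand-in dataclass (`ElementStub`, hypothetical) with a subset of the fields above; with the real SDK you would pass `snap.elements` instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElementStub:
    """Stand-in for an SDK Element, limited to the fields used below."""
    id: int
    role: str
    text: Optional[str]
    importance: int
    in_viewport: bool
    is_occluded: bool


def actionable(elements, roles=("button", "link")):
    """Keep visible, unobstructed elements of the given roles, most important first."""
    candidates = [
        e for e in elements
        if e.role in roles and e.in_viewport and not e.is_occluded
    ]
    return sorted(candidates, key=lambda e: e.importance, reverse=True)


elements = [
    ElementStub(1, "button", "Submit", 900, True, False),
    ElementStub(2, "button", "Cancel", 400, True, True),   # covered by another element
    ElementStub(3, "link", "Help", 650, False, False),     # outside the viewport
    ElementStub(4, "link", "Terms", 300, True, False),
]
top = actionable(elements)
```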
These fields are present when `goal` is provided in `SnapshotOptions`:

| Property | Type | Description |
|---|---|---|
| `fused_rank_index` | `int \| None` | 0-based rank after sorting by `importance_fused` |
| `heuristic_index` | `int \| None` | 0-based rank before ML reranking (original heuristic position) |
| `ml_probability` | `float \| None` | Confidence score from ONNX model (0.0 - 1.0) |
| `ml_score` | `float \| None` | Raw logit score from ONNX model (for debugging) |
These fields support position-based selection ("first result", "top item"):

| Property | Type | Description |
|---|---|---|
| `center_x` | `float \| None` | X coordinate of element center (viewport coords) |
| `center_y` | `float \| None` | Y coordinate of element center (viewport coords) |
| `doc_y` | `float \| None` | Y coordinate in document (`center_y` + `scroll_y`) |
| `group_key` | `str \| None` | Geometric bucket key for ordinal grouping |
| `group_index` | `int \| None` | Position within group (0-indexed, sorted by `doc_y`) |
| `in_dominant_group` | `bool \| None` | Whether element is in the dominant group (main content area) |
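
Selecting "the first result" or "the third item" then reduces to filtering on `in_dominant_group` and ordering by `group_index`. The sketch below is an illustration with a stand-in dataclass; `OrdinalStub`, `nth_in_dominant_group`, and the `group_key` values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrdinalStub:
    """Stand-in for an SDK Element, limited to the ordinal fields."""
    id: int
    text: str
    group_key: Optional[str]
    group_index: Optional[int]
    in_dominant_group: Optional[bool]


def nth_in_dominant_group(elements, n=0):
    """Return the n-th (0-based) element of the dominant group, or None."""
    members = [
        e for e in elements
        if e.in_dominant_group and e.group_index is not None
    ]
    members.sort(key=lambda e: e.group_index)
    return members[n] if 0 <= n < len(members) else None


results = [
    OrdinalStub(10, "Sign in", "nav", 0, False),
    OrdinalStub(22, "Second result", "main", 1, True),
    OrdinalStub(21, "First result", "main", 0, True),
    OrdinalStub(23, "Third result", "main", 2, True),
]
first = nth_in_dominant_group(results, 0)
third = nth_in_dominant_group(results, 2)
```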
These fields enable Jest-style assertions for form controls:

| Property | Type | Description |
|---|---|---|
| `name` | `str \| None` | Accessible name/label for controls (distinct from visible text) |
| `value` | `str \| None` | Current value for inputs/textarea/select (may be redacted for PII) |
| `input_type` | `str \| None` | Input type (e.g., "text", "email", "password") |
| `value_redacted` | `bool \| None` | Whether value was redacted for privacy (password/email/tel) |
| `checked` | `bool \| None` | Normalized checked state for checkboxes/radios |
| `disabled` | `bool \| None` | Normalized disabled state |
| `expanded` | `bool \| None` | Normalized expanded state for dropdowns/accordions |
| `aria_checked` | `str \| None` | Raw ARIA checked string (tri-state: "true"/"false"/"mixed") |
| `aria_disabled` | `str \| None` | Raw ARIA disabled string |
| `aria_expanded` | `str \| None` | Raw ARIA expanded string |
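
Two details matter when asserting on these fields: `aria_checked` can be "mixed", which the boolean `checked` cannot express, and a redacted `value` cannot be compared at all. A minimal sketch of both checks follows; `ControlStub`, `checkbox_state`, and `value_equals` are hypothetical helpers, not SDK functions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlStub:
    """Stand-in for an SDK Element, limited to form-control fields."""
    name: Optional[str] = None
    value: Optional[str] = None
    value_redacted: Optional[bool] = None
    checked: Optional[bool] = None
    aria_checked: Optional[str] = None


def checkbox_state(el):
    """Map checked/aria_checked to 'checked', 'unchecked', 'mixed', or 'unknown'."""
    # "mixed" has no boolean equivalent, so the raw ARIA string is checked first.
    if el.aria_checked == "mixed":
        return "mixed"
    if el.checked is True:
        return "checked"
    if el.checked is False:
        return "unchecked"
    return "unknown"


def value_equals(el, expected):
    """Return True/False, or None when the value is redacted or absent."""
    if el.value_redacted or el.value is None:
        return None  # cannot verify; the caller should not treat this as a failure
    return el.value == expected


newsletter = ControlStub(name="Newsletter", checked=True, aria_checked="true")
select_all = ControlStub(name="Select all", checked=False, aria_checked="mixed")
password = ControlStub(name="Password", value=None, value_redacted=True)
city = ControlStub(name="City", value="Berlin", value_redacted=False)
```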

| Property | Type | Description |
|---|---|---|
| `href` | `str \| None` | Hyperlink URL (for link elements) |
| `nearby_text` | `str \| None` | Nearby static text (best-effort, usually for top-ranked elements) |
| `diff_status` | `str \| None` | Diff status: "ADDED", "REMOVED", "MODIFIED", "MOVED" (for diff overlay) |
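
When `diff_status` is populated, a quick tally shows how much of the page changed between snapshots. The sketch below uses stand-in objects and a hypothetical helper name; elements without a status are skipped.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_diff(elements):
    """Count elements per diff_status, ignoring elements with no status."""
    return Counter(e.diff_status for e in elements if e.diff_status is not None)


# Stand-in elements for demonstration only.
changed = [
    SimpleNamespace(id=1, diff_status="ADDED"),
    SimpleNamespace(id=2, diff_status=None),
    SimpleNamespace(id=3, diff_status="MODIFIED"),
    SimpleNamespace(id=4, diff_status="ADDED"),
]
summary = summarize_diff(changed)
```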
When `show_overlay=True`, Predicate displays a visual overlay in the browser highlighting all detected elements.

Color Coding: primary elements (`is_primary=true`) are color-coded separately.

Visual Indicators: the overlay includes each element's importance score.

Use Cases:
# Example: Debug why a button isn't being clicked
import time

# Assumes find is exported by predicate alongside snapshot
from predicate import SnapshotOptions, find, snapshot

browser.goto("https://example.com")
snap = snapshot(browser, SnapshotOptions(show_overlay=True))  # See what's detected
time.sleep(6)  # Wait to inspect the overlay

# Check if your target button is in the results
button = find(snap, "role=button text~'Submit'")
if not button:
    print("❌ Button not found - check the overlay to see what's detected")

When you provide a `goal` parameter in `SnapshotOptions`, the server uses an ONNX-based machine learning model to rerank elements based on relevance to your goal. This can substantially improve element selection accuracy for agent tasks.
ML rerank metadata (`snapshot.ml_rerank`)

When ML reranking is enabled, `snapshot.ml_rerank` provides best-effort metadata about what happened in the server-side rerank pass.

| Field | Type | Meaning |
|---|---|---|
| `enabled` | `boolean` | Whether ML reranking was enabled for this snapshot. |
| `applied` | `boolean` | Whether reranking actually ran (may be false if conditions were not met). |
| `reason` | `string \| null` | Why reranking was applied or skipped (best-effort). |
| `candidate_count` | `number` | How many elements were considered for reranking. |
| `top_probability` | `number \| null` | Confidence of the top-ranked element (0..1). |
| `min_confidence` | `number \| null` | Confidence threshold used (if any). |
| `is_high_confidence` | `boolean \| null` | True if top probability meets the high-confidence threshold. |
| `tags` | `string[]` | Internal labels for debugging and analysis. |
| `error` | `string \| null` | Error message if reranking failed (best-effort). |
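
These fields can gate whether an agent acts on the top-ranked element or falls back to another strategy. The following sketch assumes the metadata is available as a dict with the fields above; `trust_rerank` and its fallback rule are hypothetical, not SDK behavior.

```python
def trust_rerank(ml_rerank):
    """Return True when the rerank pass ran cleanly and its top pick is confident."""
    if ml_rerank is None:
        return False
    if not ml_rerank.get("applied") or ml_rerank.get("error"):
        return False
    high = ml_rerank.get("is_high_confidence")
    if high is not None:
        return bool(high)
    # Fallback when the flag is null: compare top_probability to min_confidence.
    top = ml_rerank.get("top_probability")
    floor = ml_rerank.get("min_confidence")
    return top is not None and floor is not None and top >= floor


confident = trust_rerank({"enabled": True, "applied": True, "is_high_confidence": True})
skipped = trust_rerank({"enabled": True, "applied": False, "reason": "no goal"})
```

When this returns False, logging `reason` and `error` keeps the decision traceable.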
# Trigger ML reranking by providing a goal
snap = snapshot(browser, SnapshotOptions(
    goal="Click the login button",
    limit=50
))

# Elements are now sorted by ML relevance, not just heuristic importance
for element in snap.elements[:5]:
    print(f"[{element.id}] {element.role}: {element.text}")
    # Compare against None so a probability of 0.0 is still reported
    if element.ml_probability is not None:
        print(f"  ML Confidence: {element.ml_probability:.2%}")
        print(f"  Moved from position {element.heuristic_index} → {element.fused_rank_index}")

When ML fields are present:
- Present when `goal` is provided in `SnapshotOptions`
- Present when using `agent.act()` (goals are passed automatically)
- Absent when `goal` is not specified (elements ranked by heuristic importance only)

What the fields mean:
- `fused_rank_index`: Final position after ML + heuristic fusion (0 = most relevant to goal)
- `heuristic_index`: Original position before ML (shows how much ML changed the ranking)
- `ml_probability`: Model's confidence that this element is relevant (0.0-1.0)
- `ml_score`: Raw logit score before softmax (useful for debugging model behavior)
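
To build intuition for how raw logits relate to probabilities, the sketch below applies a standard, numerically stable softmax to a list of `ml_score` values and computes how far an element moved in the ranking. Which candidate set the server normalizes over is not stated here, so the output is not guaranteed to match `ml_probability`; `softmax` and `rank_shift` are illustrative helpers.

```python
import math


def softmax(scores):
    """Convert raw logits to probabilities that sum to 1."""
    if not scores:
        return []
    peak = max(scores)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def rank_shift(heuristic_index, fused_rank_index):
    """Positions gained by reranking (positive means the element moved up)."""
    return heuristic_index - fused_rank_index


probabilities = softmax([2.0, 0.5, -1.0])
moved_up = rank_shift(heuristic_index=7, fused_rank_index=0)
```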