
Agent Tracing & Debugging

NEW in v0.12.0 - Built-in tracing infrastructure for debugging, analyzing, and monitoring agent behavior.

Updated in v0.99.7 - The SDK now auto-emits the complete trace structure (run_start, step_start, snapshot) for seamless Studio visualization.

What is a Trace?

“A trace is to a web agent what a commit history is to a Git repo.”

Why Use Traces?

Traces let you:

  1. Debug failures - Replay exactly what the agent saw and did at each step
  2. Detect loops - Compare snapshot digests to spot repeated page states
  3. Audit LLM decisions - Inspect prompts, responses, and token usage per step
  4. Visualize runs - Load traces into Predicate Studio for a step-by-step timeline

Basic Usage

from predicate import PredicateBrowser, PredicateAgent
from predicate.llm_provider import OpenAIProvider
from predicate.tracing import Tracer, JsonlTraceSink
from predicate.agent_config import AgentConfig

# 1. Create a tracer with JSONL file sink
tracer = Tracer(
    run_id="shopping-bot-run-123",
    sink=JsonlTraceSink("trace.jsonl")
)

# 2. Configure agent behavior (optional)
config = AgentConfig(
    snapshot_limit=50,           # Max elements per snapshot
    temperature=0.0,             # LLM temperature
    max_retries=1,               # Retries on failure
    capture_screenshots=True,    # Include screenshots in traces
    screenshot_format="jpeg",    # jpeg or png
    screenshot_quality=80        # 1-100 for JPEG
)

# 3. Create agent with tracing enabled
browser = PredicateBrowser()
llm = OpenAIProvider(api_key="your-key", model="gpt-4o")
agent = PredicateAgent(browser, llm, tracer=tracer, config=config)

# 4. Use agent normally - all actions are automatically traced
with browser:
    browser.page.goto("https://amazon.com")
    agent.act("Click the search box")
    agent.act("Type 'magic mouse' into search")
    agent.act("Press Enter")

# Trace events are written to trace.jsonl

Trace Events

Each action generates multiple trace events saved to the JSONL file:

Event Types:

run_start - Agent run begins (includes agent type, LLM model, config)
step_start - Agent begins executing a goal (step_id, goal, attempt)
snapshot / snapshot_taken - Page state captured with elements, digests, and diagnostics
llm_called / llm_response - LLM decision made (includes prompt/response hashes, token usage)
action / action_executed - Action executed (click, type, press, finish, navigate)
verification - Assertion or verification result (assert, task_done, captcha)
recovery - Recovery strategy attempted after failure
step_end - Step completed (full StepResult with pre/llm/exec/post/verify)
run_end - Agent run completed (status: success/failure/partial/unknown)
error - Error occurred during step execution

Example trace.jsonl:

{"v":1,"type":"run_start","ts":"2025-12-26T10:00:00.000Z","run_id":"a1b2c3d4-e5f6-7890-abcd-ef1234567890","seq":1,"data":{"agent":"PredicateAgent","llm_model":"gpt-4o","config":{}}}
{"v":1,"type":"step_start","ts":"2025-12-26T10:00:00.100Z","run_id":"a1b2c3d4-e5f6-7890-abcd-ef1234567890","seq":2,"step_id":"b2c3d4e5-f6a7-8901-bcde-f12345678901","data":{"step_id":"b2c3d4e5-f6a7-8901-bcde-f12345678901","step_index":1,"goal":"Click the search box","attempt":0,"pre_url":"https://amazon.com"}}
{"v":1,"type":"snapshot_taken","ts":"2025-12-26T10:00:01.000Z","run_id":"a1b2c3d4-e5f6-7890-abcd-ef1234567890","seq":3,"step_id":"b2c3d4e5-f6a7-8901-bcde-f12345678901","data":{"snapshot_digest":"sha256:abc123...","snapshot_digest_loose":"sha256:def456...","url":"https://amazon.com","element_count":127}}
{"v":1,"type":"llm_called","ts":"2025-12-26T10:00:02.000Z","run_id":"a1b2c3d4-e5f6-7890-abcd-ef1234567890","seq":4,"step_id":"b2c3d4e5-f6a7-8901-bcde-f12345678901","data":{"step_id":"b2c3d4e5-f6a7-8901-bcde-f12345678901","model":"gpt-4o","response_text":"CLICK(42)","response_hash":"sha256:...","usage":{"prompt_tokens":1523,"completion_tokens":12,"total_tokens":1535}}}
{"v":1,"type":"action_executed","ts":"2025-12-26T10:00:03.000Z","run_id":"a1b2c3d4-e5f6-7890-abcd-ef1234567890","seq":5,"step_id":"b2c3d4e5-f6a7-8901-bcde-f12345678901","data":{"kind":"click","element_id":42,"success":true,"outcome":"dom_updated","duration_ms":234}}
{"v":1,"type":"step_end","ts":"2025-12-26T10:00:03.500Z","run_id":"a1b2c3d4-e5f6-7890-abcd-ef1234567890","seq":6,"step_id":"b2c3d4e5-f6a7-8901-bcde-f12345678901","data":{"step_id":"b2c3d4e5-f6a7-8901-bcde-f12345678901","step_index":1,"goal":"Click the search box","attempt":0,"pre":{"url":"https://amazon.com","snapshot_digest":"sha256:abc123..."},"llm":{"response_text":"CLICK(42)","response_hash":"sha256:...","usage":{"prompt_tokens":1523,"completion_tokens":12}},"exec":{"success":true,"outcome":"dom_updated","duration_ms":234},"post":{"url":"https://amazon.com","snapshot_digest":"sha256:xyz789..."},"verify":{"passed":true,"policy":"default"}}}

Snapshot Events for Studio Screenshots

NEW in v0.99.7 - Screenshots are automatically emitted to traces for visualization in Predicate Studio.

Auto-Emit (Default Behavior)

When you call runtime.snapshot(), a snapshot trace event with screenshot_base64 is automatically emitted. This ensures screenshots appear in the Predicate Studio timeline without any extra code.

from predicate.agent_runtime import AgentRuntime

# Default: auto-emit is enabled (emit_trace=True)
snapshot = await runtime.snapshot()  # Automatically emits snapshot trace event

# The trace now includes screenshot_base64 for Studio visualization
# No additional code needed!

Disabling Auto-Emit

If you need manual control over when snapshot events are emitted, disable auto-emit:

# Disable auto-emit for manual control
snapshot = await runtime.snapshot(emit_trace=False)

# Later, manually emit the snapshot when ready
tracer.emit_snapshot(
    snapshot=snapshot,
    step_id=runtime.step_id,
    step_index=runtime.step_index
)

Using emit_snapshot() / emitSnapshot() Directly

The tracer provides a dedicated helper method for emitting snapshot events:

from predicate.tracing import Tracer, JsonlTraceSink

tracer = Tracer(run_id="my-run", sink=JsonlTraceSink("trace.jsonl"))

# Emit snapshot event with screenshot for Studio visualization
tracer.emit_snapshot(
    snapshot=my_snapshot,           # Snapshot object with screenshot attribute
    step_id="step-uuid-123",        # Correlate with step (optional)
    step_index=1,                   # For Studio timeline ordering (optional)
    screenshot_format="jpeg"        # "jpeg" or "png" (default: "jpeg")
)

Snapshot Event Format

The snapshot event includes these key fields for Studio visualization:

{
  "v": 1,
  "type": "snapshot",
  "ts": "2025-12-26T10:00:01.000Z",
  "run_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "seq": 3,
  "step_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
  "data": {
    "url": "https://example.com",
    "element_count": 127,
    "step_index": 1,
    "elements": [...],
    "screenshot_base64": "iVBORw0KGgo...",
    "screenshot_format": "jpeg"
  }
}

Key fields:

  • url - Page URL at snapshot time
  • element_count - Number of elements captured
  • step_index - Orders snapshots on the Studio timeline
  • elements - Serialized element list (ids, roles, text, bounding boxes)
  • screenshot_base64 - Base64-encoded image rendered in Studio
  • screenshot_format - "jpeg" or "png"

When to Use Manual Emit

Manual emission via emit_trace=False is useful when:

  1. Batching snapshots - Taking multiple snapshots but only emitting the final one
  2. Custom processing - Applying PII redaction or other transforms before emission
  3. Conditional emission - Only emitting snapshots that meet certain criteria
  4. Performance optimization - Reducing trace size by selectively emitting snapshots
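
The batching pattern (item 1) can be sketched as follows. A stub tracer stands in for the SDK Tracer so the example is self-contained; with the real SDK you would call runtime.snapshot(emit_trace=False) and tracer.emit_snapshot(...) as shown above.

```python
class StubTracer:
    """Stand-in for the SDK Tracer: records emitted snapshot events."""
    def __init__(self):
        self.events = []

    def emit_snapshot(self, snapshot, step_id=None, step_index=None):
        self.events.append({"type": "snapshot", "step_id": step_id,
                            "step_index": step_index, "data": snapshot})

def take_snapshot(i):
    """Stand-in for runtime.snapshot(emit_trace=False)."""
    return {"url": f"https://example.com/page{i}", "element_count": 10 + i}

tracer = StubTracer()

# Take several snapshots with auto-emit disabled, then emit only the final one
snapshots = [take_snapshot(i) for i in range(3)]
tracer.emit_snapshot(snapshots[-1], step_id="step-1", step_index=1)

print(len(tracer.events))  # 1 - only the final snapshot was emitted
```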

Screenshot Processing

Apply custom processing (e.g., PII redaction) before screenshots are emitted. The screenshot_processor callback receives a base64-encoded image string and must return a processed base64 string.

Optional Dependencies for Image Processing

For custom screenshot processing (PII redaction, blurring, masking), you may need image processing libraries:

# Python: Pillow is optional but recommended for image manipulation
pip install Pillow

# Or install with the vision-local extras
pip install sentienceapi[vision-local]

Note: The SDK automatically converts non-JPEG screenshots to JPEG format during cloud upload. This conversion requires Pillow in Python. If Pillow is not installed, non-JPEG screenshots will be uploaded in their original format with a warning.

Example: PII Redaction with Image Processing

from predicate.tracer_factory import create_tracer
import base64
from io import BytesIO

def redact_pii(screenshot_base64: str) -> str:
    """Redact PII by blurring sensitive regions."""
    try:
        from PIL import Image, ImageFilter
    except ImportError:
        # Pillow not installed - return original
        return screenshot_base64

    # Decode base64 to image
    image_data = base64.b64decode(screenshot_base64)
    img = Image.open(BytesIO(image_data))

    # Example: blur the entire image (replace with targeted redaction)
    blurred = img.filter(ImageFilter.GaussianBlur(radius=5))

    # Re-encode to base64
    output = BytesIO()
    blurred.save(output, format="JPEG", quality=80)
    return base64.b64encode(output.getvalue()).decode("utf-8")

tracer = create_tracer(
    api_key="sk_pro_xxxxx",
    run_id="my-run",
    upload_trace=True,
    screenshot_processor=redact_pii  # Applied to all screenshots
)

Simple Example (No Dependencies)

If you don't need image manipulation, you can still use the processor for logging or validation:

from predicate.tracer_factory import create_tracer

def log_screenshot(screenshot_base64: str) -> str:
    """Log screenshot size without modifying it."""
    size_kb = len(screenshot_base64) * 3 / 4 / 1024  # Approximate decoded size
    print(f"Screenshot captured: ~{size_kb:.1f} KB")
    return screenshot_base64  # Return unchanged

tracer = create_tracer(
    api_key="sk_pro_xxxxx",
    run_id="my-run",
    upload_trace=True,
    screenshot_processor=log_screenshot
)

Complete Trace Structure (Auto-Emit)

NEW in v0.99.7 - The SDK now automatically emits a complete trace structure for proper visualization in Predicate Studio. This includes run_start, step_start, snapshots, and run_end events.

Tracer Factory Auto-Emits run_start

When you create a tracer using create_tracer() / createTracer(), the run_start event is automatically emitted with the metadata you provide. This ensures every trace has a proper starting point for Studio visualization.

from predicate.tracer_factory import create_tracer

# run_start is automatically emitted with metadata
tracer = create_tracer(
    api_key="sk_pro_xxxxx",
    run_id="my-run-123",
    upload_trace=True,
    goal="Add headphones to cart",        # Displayed as trace name in Studio
    agent_type="PredicateAgent",          # Agent identifier
    llm_model="gpt-4o",                   # LLM model used
    start_url="https://amazon.com",       # Starting URL
)
# run_start event is automatically emitted here!

# To disable auto-emit (for manual control):
tracer = create_tracer(
    api_key="sk_pro_xxxxx",
    run_id="my-run-123",
    auto_emit_run_start=False,  # Disable auto-emit
)
# Now you must manually emit:
tracer.emit_run_start("MyAgent", "gpt-4o", {"goal": "Custom goal"})

AgentRuntime Auto-Emits step_start

When you call runtime.begin_step() / runtime.beginStep(), the step_start event is automatically emitted. This ensures each step has a proper starting point in the trace timeline.

from predicate.agent_runtime import AgentRuntime

# step_start is automatically emitted when you begin a step
step_id = runtime.begin_step(
    goal="Click the search button",
    step_index=1,
    pre_url="https://example.com",  # Optional: URL before step
)
# step_start event is automatically emitted here!

# To disable auto-emit (for manual control):
step_id = runtime.begin_step(
    goal="Click the search button",
    step_index=1,
    emit_trace=False,  # Disable auto-emit
)
# Now you must manually emit:
tracer.emit_step_start(
    step_id=step_id,
    step_index=1,
    goal="Click the search button",
    attempt=0,
    pre_url="https://example.com"
)

Emitting run_end

The run_end event must be emitted manually before closing the tracer to signal the run's completion status:

# After all steps are complete, emit run_end
tracer.emit_run_end(
    steps=total_steps,           # Total number of steps executed
    status="success"             # "success", "failure", "partial", or "unknown"
)

# Then close the tracer to upload
tracer.close()

Complete Trace Lifecycle Example

Here's a complete example showing all auto-emit events in action:

from predicate.tracer_factory import create_tracer
from predicate.agent_runtime import AgentRuntime

# 1. Create tracer - run_start is auto-emitted
tracer = create_tracer(
    api_key="sk_pro_xxxxx",
    run_id="complete-example",
    upload_trace=True,
    goal="Search for products",
    agent_type="PredicateAgent",
    llm_model="gpt-4o",
    start_url="https://example.com",
)

# 2. Create runtime with tracer
runtime = AgentRuntime(page=page, tracer=tracer)  # page: an existing Playwright Page

# 3. Begin step - step_start is auto-emitted
step_id = runtime.begin_step(goal="Click search", step_index=1)

# 4. Take snapshot - snapshot event is auto-emitted
snapshot = await runtime.snapshot()

# 5. Execute action and end step
# ... your action execution code ...
runtime.end_step(success=True)

# 6. Emit run_end (manual)
tracer.emit_run_end(steps=1, status="success")

# 7. Close tracer to upload
tracer.close()

Summary of Auto-Emit Behavior

  • run_start - auto-emitted by create_tracer() / createTracer() (enabled by default); opt out with auto_emit_run_start=False / autoEmitRunStart: false
  • step_start - auto-emitted by runtime.begin_step() / runtime.beginStep() (enabled by default); opt out with emit_trace=False / emitTrace: false
  • snapshot - auto-emitted by runtime.snapshot() (enabled by default); opt out with emit_trace=False / emitTrace: false
  • run_end - always manual; call tracer.emit_run_end() yourself

AgentConfig Options

Configure agent behavior with AgentConfig:

from predicate.agent_config import AgentConfig

config = AgentConfig(
    # Snapshot settings
    snapshot_limit=50,              # Max elements to include (default: 50)

    # LLM settings
    temperature=0.0,                # LLM temperature 0.0-1.0 (default: 0.0)
    max_retries=1,                  # Retries on failure (default: 1)

    # Verification
    verify=True,                    # Verify action success (default: True)

    # Screenshot settings
    capture_screenshots=True,       # Capture screenshots (default: True)
    screenshot_format="jpeg",       # "jpeg" or "png" (default: "jpeg")
    screenshot_quality=80           # 1-100 for JPEG (default: 80)
)

agent = PredicateAgent(browser, llm, config=config)

Snapshot Utilities

Snapshot Digests (Loop Detection)

Compute fingerprints to detect when page state hasn't changed:

from predicate import snapshot
from predicate.utils import compute_snapshot_digests

snap = snapshot(browser)

# Compute both strict and loose digests
digests = compute_snapshot_digests(snap.elements)

print(digests["strict"])  # sha256:abc123... (changes if text changes)
print(digests["loose"])   # sha256:def456... (only changes if layout changes)

# Use for loop detection
if current_digest == previous_digest:
    print("Agent is stuck in a loop!")

Digest Types:

  • strict - Hashes element text and layout; changes whenever visible text changes
  • loose - Hashes structure/layout only; ignores text changes, so it is more robust for loop detection on pages with dynamic content
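
A minimal loop detector can be built on these digests by keeping a short history and flagging repeats. A sketch (LoopDetector is illustrative, not an SDK class):

```python
from collections import deque

class LoopDetector:
    """Flag when the same snapshot digest repeats within a short window."""
    def __init__(self, window: int = 3):
        self.recent = deque(maxlen=window)

    def is_stuck(self, digest: str) -> bool:
        stuck = digest in self.recent
        self.recent.append(digest)
        return stuck

detector = LoopDetector(window=3)
print(detector.is_stuck("sha256:abc123"))  # False - first time seen
print(detector.is_stuck("sha256:def456"))  # False
print(detector.is_stuck("sha256:abc123"))  # True - repeated within window
```

Using the loose digest here tolerates minor text changes (timestamps, counters) between otherwise identical page states.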

LLM Prompt Formatting

Format snapshots for LLM consumption:

from predicate.formatting import format_snapshot_for_llm

snap = snapshot(browser)

# Format top 50 elements for LLM context
llm_context = format_snapshot_for_llm(snap, limit=50)

print(llm_context)
# Output:
# [1] <button> "Sign In" {PRIMARY,CLICKABLE} @ (100,50) (Imp:10)
# [2] <input> "Email address" {CLICKABLE} @ (100,100) (Imp:8)
# [3] <link> "Forgot password?" @ (150,140) (Imp:5)

Format Explanation:

  • [1] - Element ID (use in actions, e.g. CLICK(1))
  • <button> - Element role
  • "Sign In" - Visible text or accessible label
  • {PRIMARY,CLICKABLE} - Visual cues
  • @ (100,50) - Page coordinates (x, y)
  • (Imp:10) - Importance score (higher = more prominent)

Cloud Tracing & Screenshots

NEW in v0.12.0+ - Upload traces and screenshots to cloud storage for remote viewing, analysis, and collaboration.

Function Signature

The create_tracer() function signature:

def create_tracer(
    api_key: str | None = None,
    run_id: str | None = None,
    api_url: str | None = None,
    logger: PredicateLogger | None = None,
    upload_trace: bool = False,
    goal: str | None = None,
    agent_type: str | None = None,
    llm_model: str | None = None,
    start_url: str | None = None,
    screenshot_processor: Callable[[str], str] | None = None,
)

Overview

Cloud tracing enables Pro, Builder, Teams, and Enterprise tier users to:

  • Upload traces and screenshots to cloud storage on tracer.close()
  • View and replay runs remotely in Predicate Studio
  • Share traces with teammates for collaboration and analysis

Quick Start

from predicate import PredicateBrowser, PredicateAgent
from predicate.llm_provider import OpenAIProvider
from predicate.tracer_factory import create_tracer

# 1. Create tracer with automatic tier detection
tracer = create_tracer(
    api_key="sk_pro_xxxxx",  # Pro/Builder/Teams/Enterprise tier key
    run_id="shopping-bot-123",  # Gateway requires UUID format
    upload_trace=True,  # Set to True if you want cloud upload
    goal="Buy a laptop from Amazon",
    agent_type="Amazon Shopping Agent",
    llm_model="gpt-4o",
    start_url="https://www.amazon.com",
    screenshot_processor=None, # function for PII redaction
)

# 2. Create agent with tracer
browser = PredicateBrowser(api_key="sk_pro_xxxxx")
llm = OpenAIProvider(api_key="your_openai_key", model="gpt-4o")
agent = PredicateAgent(browser, llm, tracer=tracer)

# 3. Use agent normally - traces automatically uploaded
with browser:
    browser.page.goto("https://amazon.com")
    agent.act("Click the search box")
    agent.act("Type 'wireless mouse' into search")
    agent.act("Press Enter")

# 4. Upload to cloud (happens automatically on close)
tracer.close()  # Uploads trace + screenshots to cloud

Automatic Tier Detection

The create_tracer() function automatically detects your tier and configures the appropriate sink:

Pro/Builder/Teams/Enterprise Tier (with API key and upload_trace=True): events are written to a local trace file during the run and uploaded to cloud storage when tracer.close() is called.

Local-only tracing (opt-out of cloud upload): pass upload_trace=False (the default) to keep the trace in a local JSONL file with no upload.

Free Tier (no API key): the tracer falls back to local JSONL output; cloud upload is unavailable.

Graceful Fallback: if cloud initialization or upload fails, the trace is preserved on disk and recovered on the next run (see Crash Recovery below).

Viewing Traces in Cloud

After uploading, access your traces via:

Best Practices

  1. Always close the tracer:

    try:
        # Your agent code
        pass
    finally:
        tracer.close()  # Ensures upload even on errors
    
  2. Use non-blocking uploads for long-running agents:

    tracer.close(blocking=False)  # Don't wait for upload
    
  3. Set meaningful run IDs and metadata:

    tracer = create_tracer(
        api_key="sk_pro_xxxxx",
        run_id=f"amazon-shopping-{datetime.now().strftime('%Y%m%d-%H%M%S')}",
        upload_trace=True,
        goal="Buy a laptop from Amazon",
        agent_type="Amazon Shopping Agent",
        llm_model="gpt-4o",
        start_url="https://www.amazon.com",
        screenshot_processor=None, # function for PII redaction
    )
    
  4. Enable screenshots for debugging:

    config = AgentConfig(capture_screenshots=True)
    agent = PredicateAgent(browser, llm, tracer=tracer, config=config)
    

Custom Trace Sinks

Implement custom trace storage by extending the TraceSink interface. This allows you to store traces in databases, cloud storage, or any custom backend.

TraceSink Interface

from predicate.tracing import TraceSink

class CustomTraceSink(TraceSink):
    """Skeleton showing the two methods every TraceSink must implement"""

    def emit(self, event_dict: dict) -> None:
        """Write a single trace event"""
        raise NotImplementedError

    def close(self) -> None:
        """Close the sink and flush any pending writes"""
        raise NotImplementedError

Example: Database Trace Sink

Store traces directly in a database:

from predicate.tracing import TraceSink, Tracer
import psycopg2
import json

class DatabaseTraceSink(TraceSink):
    """Store traces in PostgreSQL database"""

    def __init__(self, connection_string: str):
        self.conn = psycopg2.connect(connection_string)
        self.cursor = self.conn.cursor()

        # Create traces table if it doesn't exist
        self.cursor.execute("""
            CREATE TABLE IF NOT EXISTS trace_events (
                id SERIAL PRIMARY KEY,
                run_id TEXT NOT NULL,
                seq INTEGER NOT NULL,
                type TEXT NOT NULL,
                timestamp TIMESTAMPTZ NOT NULL,
                data JSONB NOT NULL,
                created_at TIMESTAMPTZ DEFAULT NOW()
            )
        """)
        self.conn.commit()

    def emit(self, event_dict: dict) -> None:
        """Insert trace event into database"""
        self.cursor.execute("""
            INSERT INTO trace_events (run_id, seq, type, timestamp, data)
            VALUES (%s, %s, %s, %s, %s)
        """, (
            event_dict["run_id"],
            event_dict["seq"],
            event_dict["type"],
            event_dict["ts"],
            json.dumps(event_dict["data"])
        ))
        self.conn.commit()

    def close(self) -> None:
        """Close database connection"""
        self.cursor.close()
        self.conn.close()

# Usage
tracer = Tracer(
    run_id="run-123",
    sink=DatabaseTraceSink("postgresql://user:pass@localhost/traces")
)
agent = PredicateAgent(browser, llm, tracer=tracer)

Example: Cloud Storage Sink

Upload traces to S3, Google Cloud Storage, or other cloud providers:

from predicate.tracing import TraceSink
import boto3
import json
from typing import List

class S3TraceSink(TraceSink):
    """Store traces in AWS S3"""

    def __init__(self, bucket: str, prefix: str = "traces/"):
        self.s3 = boto3.client('s3')
        self.bucket = bucket
        self.prefix = prefix
        self.events: List[dict] = []

    def emit(self, event_dict: dict) -> None:
        """Buffer trace events in memory"""
        self.events.append(event_dict)

    def close(self) -> None:
        """Upload all events to S3"""
        if not self.events:
            return

        run_id = self.events[0]["run_id"]
        key = f"{self.prefix}{run_id}.jsonl"

        # Convert events to JSONL
        jsonl_content = "\n".join(json.dumps(e) for e in self.events)

        # Upload to S3
        self.s3.put_object(
            Bucket=self.bucket,
            Key=key,
            Body=jsonl_content.encode('utf-8'),
            ContentType='application/x-ndjson'
        )
        print(f"✅ Uploaded trace to s3://{self.bucket}/{key}")

# Usage
tracer = Tracer(
    run_id="run-123",
    sink=S3TraceSink(bucket="my-traces", prefix="production/")
)

Example: Multi-Sink (Write to Multiple Destinations)

Write traces to multiple sinks simultaneously:

from predicate.tracing import TraceSink, JsonlTraceSink
from typing import List

class MultiSink(TraceSink):
    """Write to multiple trace sinks simultaneously"""

    def __init__(self, sinks: List[TraceSink]):
        self.sinks = sinks

    def emit(self, event_dict: dict) -> None:
        """Emit to all sinks"""
        for sink in self.sinks:
            sink.emit(event_dict)

    def close(self) -> None:
        """Close all sinks"""
        for sink in self.sinks:
            sink.close()

# Usage: Write to both local file and database
tracer = Tracer(
    run_id="run-123",
    sink=MultiSink([
        JsonlTraceSink("trace.jsonl"),
        DatabaseTraceSink("postgresql://localhost/traces"),
        S3TraceSink("my-traces")
    ])
)

PredicateLogger Interface

Integrate Predicate tracing with your existing logging infrastructure using the PredicateLogger interface.

Logger Interface

from typing import Protocol

class PredicateLogger(Protocol):
    """Protocol for optional logger interface."""

    def info(self, message: str) -> None:
        """Log info message."""
        ...

    def warning(self, message: str) -> None:
        """Log warning message."""
        ...

    def error(self, message: str) -> None:
        """Log error message."""
        ...

Using Python's Built-in Logger

import logging
from datetime import datetime

from predicate import create_tracer

# Use Python's built-in logging module
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Add handler to output to console
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('[%(levelname)s] %(message)s'))
logger.addHandler(handler)

# Create tracer with logger
tracer = create_tracer(
  api_key="sk_pro_xxxxx",
  run_id=f"amazon-shopping-{datetime.now().strftime('%Y%m%d-%H%M%S')}",
  logger=logger,  # Pass standard Python logger (implements the protocol)
  upload_trace=True,
  goal="Buy a laptop from Amazon",
  agent_type="Amazon Shopping Agent",
  llm_model="gpt-4o",
  start_url="https://www.amazon.com",
  screenshot_processor=None, # function for PII redaction
)

# The logger will receive messages like:
# [INFO] Trace file size: 2.45 MB
# [INFO] Screenshot total: 0.00 MB
# [INFO] Trace completion reported to gateway
# [WARNING] Failed to report trace completion: HTTP 500

Custom Logger Implementation

from predicate import create_tracer

class CustomLogger:
    """Custom logger implementation"""

    def info(self, message: str) -> None:
        print(f"[INFO] {message}")
        # Send to monitoring service, file, etc.

    def warning(self, message: str) -> None:
        print(f"[WARN] {message}")
        # Alert team, log to error tracking

    def error(self, message: str) -> None:
        print(f"[ERROR] {message}")
        # Critical alert, page on-call engineer

custom_logger = CustomLogger()
tracer = create_tracer(
    api_key="sk_pro_xxxxx",  # Pro/Builder/Teams/Enterprise tier key
    run_id="shopping-bot-123",  # Gateway requires UUID format
    logger=custom_logger,  # Route SDK messages through the custom logger
    upload_trace=True,  # Set to True if you want cloud upload
    goal="Buy a laptop from Amazon",
    agent_type="Amazon Shopping Agent",
    llm_model="gpt-4o",
    start_url="https://www.amazon.com",
    screenshot_processor=None, # function for PII redaction
)

What Gets Logged

The logger receives the following types of messages:

  • Trace and screenshot file sizes before upload
  • Upload progress and completion reporting to the gateway
  • Warnings when uploads fail or fall back to local storage
  • Crash-recovery notices for orphaned traces found on disk

Benefits:

  • Route SDK diagnostics into your existing logging and monitoring pipeline
  • Alert on upload failures without parsing stdout

Crash Recovery

Traces automatically survive process crashes and are recovered on next SDK initialization.

How It Works

  1. Persistent Cache - Traces are stored in ~/.sentience/traces/pending/ during execution
  2. Process Crash - If your script crashes, traces remain on disk
  3. Automatic Recovery - Next time you create a tracer, orphaned traces are automatically detected
  4. Auto-Upload - Recovered traces are uploaded to cloud (if using Pro/Builder/Teams/Enterprise tier)
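
The recovery scan in steps 3-4 can be approximated with standard-library Python (a sketch; find_orphaned_traces is illustrative - the SDK performs this scan automatically when a tracer is created):

```python
from pathlib import Path

def find_orphaned_traces(pending_dir: Path) -> list:
    """Return trace files left behind by crashed runs, oldest first."""
    if not pending_dir.exists():
        return []
    return sorted(pending_dir.glob("*.jsonl"), key=lambda p: p.stat().st_mtime)

# The SDK's pending cache directory (step 1 above)
orphans = find_orphaned_traces(Path.home() / ".sentience" / "traces" / "pending")
for trace in orphans:
    print(f"Found un-uploaded trace: {trace.name}")
```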

Example: Crash and Recovery

from predicate import create_tracer, PredicateBrowser, PredicateAgent
from predicate.llm_provider import OpenAIProvider

# Run 1: Agent crashes mid-execution
tracer = create_tracer(
    api_key="sk_pro_xxxxx",
    run_id="run-1",
    upload_trace=True,
    goal="Click button on example.com",
    agent_type="Example Agent",
    llm_model="gpt-4o",
    start_url="https://example.com",
    screenshot_processor=None, # function for PII redaction
)
browser = PredicateBrowser(api_key="sk_pro_xxxxx")
llm = OpenAIProvider(api_key="your_key", model="gpt-4o")
agent = PredicateAgent(browser, llm, tracer=tracer)

with browser:
    browser.page.goto("https://example.com")
    agent.act("Click button")  # Process crashes here - trace saved locally

# Run 2: SDK automatically recovers and uploads orphaned trace
tracer = create_tracer(
    api_key="sk_pro_xxxxx",
    run_id="run-2",
    upload_trace=True,
    goal="Continue from previous run",
    agent_type="Example Agent",
    llm_model="gpt-4o",
    start_url="https://example.com",
    screenshot_processor=None, # function for PII redaction
)
# Prints: "⚠️  [Predicate] Found 1 un-uploaded trace(s) from previous runs"
# Prints: "✅ Uploaded orphaned trace: run-1"

browser = PredicateBrowser(api_key="sk_pro_xxxxx")
agent = PredicateAgent(browser, llm, tracer=tracer)
# Continue with run-2...

Recovery Details

Best Practices for Crash Recovery

  1. Always use meaningful run IDs:

    run_id = f"shopping-{datetime.now().isoformat()}"
    tracer = create_tracer(
        api_key="sk_pro_xxxxx",
        run_id=run_id,
        upload_trace=True,
        goal="Buy a laptop from Amazon",
        agent_type="Amazon Shopping Agent",
        llm_model="gpt-4o",
        start_url="https://www.amazon.com",
        screenshot_processor=None, # function for PII redaction
    )
    
  2. Monitor cache directory:

    ls -lh ~/.sentience/traces/pending/
    
  3. Use try/finally for cleanup:

    tracer = create_tracer(
        api_key="sk_pro_xxxxx",
        run_id="run-123",
        upload_trace=True,
        goal="Complete task",
        agent_type="My Agent",
        llm_model="gpt-4o",
        start_url="https://example.com",
        screenshot_processor=None, # function for PII redaction
    )
    try:
        # Agent code
        agent.act("Do something")
    finally:
        tracer.close()  # Ensures upload even on errors
    
  4. Check logs for recovery messages:

    • Look for "Found N un-uploaded trace(s)" messages
    • Verify "Uploaded orphaned trace: [run-id]" confirmations

Non-Blocking Trace Uploads

Upload traces in the background to avoid blocking your script execution.

Blocking vs Non-Blocking

from predicate import create_tracer

tracer = create_tracer(
    api_key="sk_pro_xxxxx",
    run_id="run-123",
    upload_trace=True,
    goal="Example task",
    agent_type="Example Agent",
    llm_model="gpt-4o",
    start_url="https://example.com",
    screenshot_processor=None, # function for PII redaction
)

# ... agent execution ...

# Option 1: Blocking (default) - waits for upload to complete
tracer.close(blocking=True)  # Script pauses here until upload finishes
print("Upload complete!")

# Option 2: Non-blocking - returns immediately
tracer.close(blocking=False)  # Script continues immediately
print("Upload started in background!")
# Script can exit or continue with other work

Progress Callbacks

Monitor upload progress with callbacks:

def progress_callback(uploaded_bytes: int, total_bytes: int):
    percent = (uploaded_bytes / total_bytes) * 100
    print(f"Upload progress: {percent:.1f}% ({uploaded_bytes}/{total_bytes} bytes)")

tracer.close(blocking=True, on_progress=progress_callback)

# Output:
# Upload progress: 25.0% (262144/1048576 bytes)
# Upload progress: 50.0% (524288/1048576 bytes)
# Upload progress: 75.0% (786432/1048576 bytes)
# Upload progress: 100.0% (1048576/1048576 bytes)

Use Cases

Use non-blocking uploads when:

  • Running long-lived or high-throughput agents where shutdown latency matters
  • The script has other work to do after tracing ends

Use blocking uploads when:

  • Running short scripts or CI jobs that exit immediately after close()
  • You need confirmation that the upload succeeded before proceeding

Troubleshooting

"Cloud tracing requires Pro tier"

Cause: Your API key is valid but your account is on the Free tier

Solution: Upgrade to a Pro, Builder, Teams, or Enterprise tier key, or keep upload_trace=False (the default) for local-only tracing.

"Cloud init timeout"

Cause: Network connectivity issue or API service temporarily unavailable

Solution: Check network connectivity and retry. The trace is preserved locally and will be recovered and uploaded on the next run.

"Upload failed: HTTP 500"

Cause: Server error (temporary)

Solution: Retry later. The trace remains on disk and is auto-uploaded the next time you create a tracer (see Crash Recovery).

"Local trace preserved"

Cause: Upload failed but trace is safely stored on disk

Solution: No action needed for data safety. Inspect ~/.sentience/traces/pending/ if desired; the SDK detects and uploads the orphaned trace on the next run.

Trace Schema Reference

All trace events follow this structure:

{
  "v": 1,                                              // Schema version (always 1)
  "type": "event_type",                                // Event type (see table below)
  "ts": "2025-12-26T10:00:00.000Z",                    // ISO 8601 timestamp
  "run_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",   // UUID for the agent run
  "seq": 1,                                            // Monotonically increasing sequence number
  "step_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",  // UUID for the step (optional)
  "data": {...},                                       // Event-specific payload
  "ts_ms": 1766743200000                               // Unix timestamp in milliseconds (optional)
}

⚠️ Warning: Trace formats and internal fields may evolve and are not guaranteed to be stable across versions.
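
Because every event shares this envelope, a trace file can be analyzed with a few lines of standard-library Python. The sketch below sums token usage from llm_called events and counts completed steps (field names follow the schemas in this section; the inline sample lines stand in for a real trace.jsonl):

```python
import json

def summarize_trace(lines):
    """Aggregate token usage and step count from JSONL trace lines."""
    total_tokens = 0
    steps = 0
    for line in lines:
        event = json.loads(line)
        if event["type"] == "llm_called":
            total_tokens += event["data"].get("usage", {}).get("total_tokens", 0)
        elif event["type"] == "step_end":
            steps += 1
    return {"steps": steps, "total_tokens": total_tokens}

# Sample events (normally: lines = open("trace.jsonl"))
sample = [
    '{"v":1,"type":"llm_called","ts":"...","run_id":"r","seq":1,"data":{"usage":{"total_tokens":1535}}}',
    '{"v":1,"type":"step_end","ts":"...","run_id":"r","seq":2,"data":{}}',
]
print(summarize_trace(sample))  # {'steps': 1, 'total_tokens': 1535}
```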

Event-Specific Data Schemas:

run_start - Agent run begins
{
  "agent": "PredicateAgent",
  "llm_model": "gpt-4o",
  "config": {}
}
step_start - Step execution begins
{
  "step_id": "uuid",
  "step_index": 1,
  "goal": "Click the search box",
  "attempt": 0,
  "pre_url": "https://example.com"
}
snapshot / snapshot_taken - Page state captured
{
  "step_id": "uuid",
  "snapshot_id": "uuid",
  "snapshot_digest": "sha256:abc123...",
  "snapshot_digest_loose": "sha256:def456...",
  "url": "https://example.com",
  "element_count": 127,
  "timestamp": "2025-12-26T10:00:01.000Z",
  "diagnostics": {
    "confidence": 0.95,
    "reasons": ["dom_stable", "no_pending_requests"],
    "metrics": {
      "ready_state": "complete",
      "quiet_ms": 500,
      "node_count": 1234,
      "interactive_count": 45
    },
    "captcha": {
      "detected": false,
      "provider_hint": null,
      "confidence": 0.0,
      "evidence": {}
    }
  },
  "elements": [
    {
      "id": 42,
      "role": "button",
      "text": "Search",
      "importance": 950,
      "bbox": {"x": 100, "y": 50, "width": 80, "height": 32},
      "visual_cues": {"is_primary": true, "is_clickable": true},
      "in_viewport": true,
      "is_occluded": false,
      "z_index": 10
    }
  ],
  "screenshot_base64": "iVBORw0KGgo...",
  "screenshot_format": "png"
}
llm_called - LLM query made
{
  "step_id": "uuid",
  "model": "gpt-4o",
  "temperature": 0.0,
  "system_prompt_hash": "sha256:...",
  "user_prompt_hash": "sha256:...",
  "response_text": "CLICK(42)",
  "response_hash": "sha256:...",
  "usage": {
    "prompt_tokens": 1523,
    "completion_tokens": 12,
    "total_tokens": 1535
  }
}
action / action_executed - Action executed
{
  "kind": "click",           // click, type, press, finish, navigate
  "element_id": 42,
  "text": "search query",    // for type action
  "key": "Enter",            // for press action
  "url": "https://...",      // for navigate action
  "raw": "CLICK(42)"
}
verification - Assertion/verification result
{
  "step_id": "uuid",
  "passed": true,
  "kind": "assert",          // assert, task_done, captcha
  "label": "Search box is visible",
  "required": true,
  "reason": "Element found with matching text",
  "details": {},
  "signals": {}
}
recovery - Recovery strategy attempted
{
  "step_id": "uuid",
  "strategy": "retry",
  "attempt": 1
}
step_end - Step completed (full StepResult)
{
  "v": 1,
  "step_id": "uuid",
  "step_index": 1,
  "goal": "Click the search box",
  "attempt": 0,
  "pre": {
    "url": "https://example.com",
    "snapshot_digest": "sha256:abc123...",
    "snapshot_digest_loose": "sha256:def456..."
  },
  "llm": {
    "response_text": "CLICK(42)",
    "response_hash": "sha256:...",
    "usage": {"prompt_tokens": 1523, "completion_tokens": 12, "total_tokens": 1535}
  },
  "action": {
    "kind": "click",
    "element_id": 42,
    "raw": "CLICK(42)"
  },
  "exec": {
    "success": true,
    "outcome": "dom_updated",
    "action": "click",
    "element_id": 42,
    "duration_ms": 234,
    "url_changed": false,
    "bounding_box": {"x": 100, "y": 50, "width": 80, "height": 32}
  },
  "post": {
    "url": "https://example.com",
    "snapshot_digest": "sha256:xyz789...",
    "snapshot_digest_loose": "sha256:uvw123..."
  },
  "verify": {
    "passed": true,
    "policy": "default",
    "signals": {
      "url_changed": false,
      "assertions": [
        {"label": "Element clicked", "passed": true, "required": true}
      ],
      "task_done": false
    }
  },
  "recovery": null
}
run_end - Agent run completed
{
  "steps": 5,
  "status": "success"        // success, failure, partial, unknown
}
error - Error occurred
{
  "step_id": "uuid",
  "attempt": 0,
  "error": "Element not found: id=42"
}
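
When writing a custom sink or post-processing traces, a light envelope check can catch malformed events early. A sketch based on the envelope fields above (validate_envelope is illustrative, not an SDK function):

```python
REQUIRED_FIELDS = ("v", "type", "ts", "run_id", "seq")

def validate_envelope(event: dict) -> list:
    """Return a list of problems with the event envelope (empty if valid)."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in event]
    if event.get("v") not in (None, 1):  # None: already reported as missing
        problems.append(f"unsupported schema version: {event['v']}")
    if "seq" in event and not isinstance(event["seq"], int):
        problems.append("seq must be an integer")
    return problems

ok = {"v": 1, "type": "run_start", "ts": "2025-12-26T10:00:00.000Z",
      "run_id": "a1b2c3d4", "seq": 1, "data": {}}
print(validate_envelope(ok))                     # []
print(validate_envelope({"v": 2, "type": "x"}))  # missing ts/run_id/seq + bad version
```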

Next Steps