
Async API (Python SDK)

NEW in v0.90.17: The async API is now complete. Every SDK function has an async version, covering the core utilities, supporting utilities, agent layer, and developer tools. Each async function lives in its respective module, and async_api serves as a convenient re-export point.

Why Use Async API?

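The async API lets you build high-performance automation on Python's asyncio framework. Browser operations are I/O-bound, so awaiting them lets a single event loop interleave many of them instead of blocking on each one.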
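To make the benefit concrete, here is a minimal sketch that uses only the standard library. fake_page_load is a hypothetical stand-in that just sleeps; it is not an SDK call. The comparison shows the general asyncio behaviour, not measured SDK performance.

```python
import asyncio
import time


async def fake_page_load(url: str, delay: float = 0.1) -> str:
    """Stand-in for an I/O-bound browser call; it only sleeps."""
    await asyncio.sleep(delay)
    return f"loaded {url}"


async def load_sequentially(urls: list[str]) -> list[str]:
    # Each await finishes before the next one starts.
    return [await fake_page_load(url) for url in urls]


async def load_concurrently(urls: list[str]) -> list[str]:
    # gather() schedules every coroutine at once and preserves input order.
    return await asyncio.gather(*(fake_page_load(url) for url in urls))


def timed(coro_factory) -> tuple[list[str], float]:
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_factory())
    return result, time.perf_counter() - start
```

With three URLs, the sequential version takes roughly three delays, while the concurrent version takes roughly one, because the waits overlap.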

AsyncPredicateBrowser

The AsyncPredicateBrowser class provides async context manager support and all the features of the sync PredicateBrowser.

Basic Usage

from predicate.async_api import AsyncPredicateBrowser

# Async context manager (recommended)
async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        # Browser automatically closes when done

# Manual lifecycle
async def manual():
    browser = AsyncPredicateBrowser()
    await browser.start()
    await browser.goto("https://example.com")
    await browser.close()
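One caveat with the manual lifecycle: if goto() raises, close() is never reached. The sketch below shows the usual try/finally guard. FakeBrowser is a hypothetical stand-in that only mirrors the start/goto/close shape shown above, so the example runs without the SDK.

```python
import asyncio


class FakeBrowser:
    """Hypothetical stand-in with the same start/goto/close shape as above."""

    def __init__(self) -> None:
        self.closed = False

    async def start(self) -> None:
        await asyncio.sleep(0)

    async def goto(self, url: str) -> None:
        if not url.startswith("http"):
            raise ValueError(f"unsupported URL: {url}")
        await asyncio.sleep(0)

    async def close(self) -> None:
        self.closed = True


async def visit(browser: FakeBrowser, url: str) -> None:
    await browser.start()
    try:
        await browser.goto(url)
    finally:
        # Runs whether goto() succeeded or raised, so the browser is always released.
        await browser.close()
```

The async context manager form handles cleanup automatically, which is why it is the recommended option.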

With API Key

from predicate.async_api import AsyncPredicateBrowser

async def main():
    async with AsyncPredicateBrowser(api_key="sk_...") as browser:
        await browser.goto("https://example.com")
        # Use Pro/Enterprise features

Custom Viewport

from predicate.async_api import AsyncPredicateBrowser

async def main():
    # Custom viewport size
    async with AsyncPredicateBrowser(
        viewport={"width": 1920, "height": 1080}
    ) as browser:
        await browser.goto("https://example.com")

From Existing Playwright Context

from predicate.async_api import AsyncPredicateBrowser
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        # Create Playwright context
        context = await p.chromium.launch_persistent_context(
            "./user_data",
            headless=False
        )

        # Convert to AsyncPredicateBrowser
        browser = AsyncPredicateBrowser.from_existing(context)

        # Use all Predicate features
        await browser.page.goto("https://example.com")

From Existing Page

from predicate.async_api import AsyncPredicateBrowser
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        context = await browser.new_context()
        page = await context.new_page()

        # Navigate first
        await page.goto("https://example.com")

        # Convert to AsyncPredicateBrowser
        predicate_browser = AsyncPredicateBrowser.from_page(page)

Async Functions

Every core SDK function has an async version, named with an _async suffix for clarity.

snapshot_async()

Capture page snapshot asynchronously:

from predicate.async_api import AsyncPredicateBrowser, snapshot_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")

        # Capture snapshot
        snap = await snapshot_async(browser)

        print(f"Found {len(snap.elements)} elements")
        print(f"Page title: {snap.title}")

Parameters:

click_async()

Click an element asynchronously:

from predicate.async_api import AsyncPredicateBrowser, snapshot_async, click_async, find

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")

        # Find and click
        snap = await snapshot_async(browser)
        button = find(snap, "role=button text~'Submit'")

        if button:
            await click_async(browser, button.id)

Parameters:

type_text_async()

Type text into an input field asynchronously:

from predicate.async_api import AsyncPredicateBrowser, snapshot_async, type_text_async, find

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")

        snap = await snapshot_async(browser)
        email_input = find(snap, "role=textbox text~'email'")

        if email_input:
            await type_text_async(browser, email_input.id, "user@example.com")

Parameters:

press_async()

Press keyboard keys asynchronously:

from predicate.async_api import AsyncPredicateBrowser, press_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")

        # Press Enter
        await press_async(browser, "Enter")

        # Press Escape
        await press_async(browser, "Escape")

        # Keyboard shortcut
        await press_async(browser, "Control+A")

Parameters:

click_rect_async()

Click at specific coordinates asynchronously:

from predicate.async_api import AsyncPredicateBrowser, click_rect_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")

        # Click at coordinates
        await click_rect_async(
            browser,
            x=100,
            y=200,
            width=50,
            height=30
        )

Parameters:

Phase 2A: Core Utilities

NEW in v0.90.17: Async versions of core utility functions for semantic waiting, screenshots, and text search.

wait_for_async()

Wait for an element to appear using semantic queries:

from predicate.async_api import AsyncPredicateBrowser, wait_for_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Wait for element with timeout
        result = await wait_for_async(browser, "role=button", timeout=5.0)
        
        if result:
            print(f"Element found: {result.id}")

Parameters:

screenshot_async()

Capture screenshot asynchronously in PNG or JPEG format:

import base64

from predicate.async_api import AsyncPredicateBrowser, screenshot_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Capture screenshot as JPEG
        data_url = await screenshot_async(browser, format="jpeg", quality=80)
        
        # Save to file: drop the prefix before the comma, then decode the base64 payload
        image_data = base64.b64decode(data_url.split(',')[1])
        with open("screenshot.jpg", "wb") as f:
            f.write(image_data)
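The split(',') step above implies that the return value is a base64 data URL. Under that assumption, a slightly more defensive decoder could look like the sketch below. decode_data_url is a hypothetical helper, not part of the SDK.

```python
import base64


def decode_data_url(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a 'data:<mime>;base64,<payload>' URL")
    # Keep only the MIME type between the "data:" prefix and the ";base64" suffix.
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

Checking the returned MIME type before choosing a file extension avoids saving PNG bytes under a .jpg name.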

Parameters:

find_text_rect_async()

Find text on the page and return pixel coordinates:

from predicate.async_api import AsyncPredicateBrowser, find_text_rect_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Find text and get coordinates
        text_result = await find_text_rect_async(browser, "Sign In")
        
        if text_result:
            print(f"Text found at: x={text_result.x}, y={text_result.y}")
            print(f"Size: {text_result.width}x{text_result.height}")

Parameters:

Returns: Object with x, y, width, height properties, or None if not found

Phase 2B: Supporting Utilities

NEW in v0.90.17: Async versions of supporting functions for content reading, visual overlays, and assertions.

read_async()

Read page content in various formats:

from predicate.async_api import AsyncPredicateBrowser, read_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Read as markdown
        markdown = await read_async(browser, output_format="markdown")
        print(markdown)
        
        # Read as plain text
        text = await read_async(browser, output_format="text")
        
        # Read raw HTML
        html = await read_async(browser, output_format="html")

Parameters:

show_overlay_async() / clear_overlay_async()

Manage visual overlays for debugging:

from predicate.async_api import AsyncPredicateBrowser, snapshot_async, show_overlay_async, clear_overlay_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Take snapshot
        snap = await snapshot_async(browser)
        
        # Show overlay on specific element
        await show_overlay_async(browser, snap, target_element_id=42)
        
        # Clear overlay
        await clear_overlay_async(browser)

Parameters:

expect_async() / ExpectationAsync

Async assertion helpers with fluent API:

from predicate.async_api import AsyncPredicateBrowser, expect_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Assert element is visible
        element = await expect_async(browser, "role=button").to_be_visible()
        
        # Assert element exists
        await expect_async(browser, "role=link").to_exist()
        
        # Assert element contains text
        await expect_async(browser, "role=heading").to_have_text("Welcome")
        
        # Assert query returns N elements
        await expect_async(browser, "role=link").to_have_count(5)

Available Methods:

Phase 2C: Agent Layer

NEW in v0.90.17: Full async implementation of the agent layer for natural language automation.

PredicateAgentAsync

Async agent with observe-think-act loop:

from predicate.async_api import AsyncPredicateBrowser, PredicateAgentAsync
from predicate.llm_provider import OpenAIProvider

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Initialize LLM provider
        llm = OpenAIProvider(api_key="your_key", model="gpt-4o")
        
        # Create async agent
        agent = PredicateAgentAsync(browser, llm)
        
        # Natural language automation
        result = await agent.act("Click the login button")
        result = await agent.act("Type 'user@example.com' into the email field")
        
        # Get token usage statistics
        stats = agent.get_token_stats()
        print(f"Tokens used: {stats['total_tokens']}")

Features:

Parameters:

Phase 2D: Developer Tools

NEW in v0.90.17: Async versions of developer tools for recording and inspection.

RecorderAsync / record_async()

Record actions and generate traces:

from predicate.async_api import AsyncPredicateBrowser, RecorderAsync

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Placeholder: in practice, take the ID from an element found in a snapshot
        element_id = 42

        # Record actions
        async with RecorderAsync(browser, capture_snapshots=True) as recorder:
            await recorder.record_click(element_id)
            await recorder.record_type(element_id, "text")
            
            # Save trace
            recorder.save("trace.json")

Parameters:

InspectorAsync / inspect_async()

Inspect elements and debug interactively:

from predicate.async_api import AsyncPredicateBrowser, InspectorAsync

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Interactive inspection
        async with InspectorAsync(browser) as inspector:
            # Hover elements to see info in console
            # Click elements to see full details
            pass

Parameters:

Pure Functions (No Async Needed)

These functions are pure (no I/O) and don't need async versions:

from predicate.async_api import find, query

# snap is a snapshot previously returned by snapshot_async()

# find() - Returns single element
button = find(snap, "role=button text~'Submit'")

# query() - Returns list of elements
links = query(snap, "role=link")

Complete Example

Here's a full login example that combines the core async functions (snapshot, find, type, and click):

import asyncio
from predicate.async_api import (
    AsyncPredicateBrowser,
    snapshot_async,
    find,
    click_async,
    type_text_async,
    press_async
)

async def login_example():
    """Complete login automation example"""
    async with AsyncPredicateBrowser() as browser:
        # Navigate to login page
        await browser.goto("https://example.com/login")

        # Take snapshot
        snap = await snapshot_async(browser)

        # Find email input
        email_input = find(snap, "role=textbox text~'email'")
        if email_input:
            await type_text_async(browser, email_input.id, "user@example.com")

        # Find password input
        snap = await snapshot_async(browser)
        password_input = find(snap, "role=textbox text~'password'")
        if password_input:
            await type_text_async(browser, password_input.id, "mypassword")

        # Click submit button
        snap = await snapshot_async(browser)
        submit_btn = find(snap, "role=button text~'log in'")
        if submit_btn:
            await click_async(browser, submit_btn.id)

        # Wait for page load
        await asyncio.sleep(2)

        # Verify login success
        snap = await snapshot_async(browser)
        print(f"Page title after login: {snap.title}")

# Run the async function
if __name__ == "__main__":
    asyncio.run(login_example())
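The fixed asyncio.sleep(2) in the example always waits two full seconds, and may still be too short on a slow page. For element waits, wait_for_async (described above) is the SDK's tool. For any other condition, a generic polling helper is one option. The sketch below uses only the standard library. poll_until and its parameters are hypothetical names, not SDK functions.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def poll_until(
    check: Callable[[], Awaitable[Optional[T]]],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> Optional[T]:
    """Call check() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        value = await check()
        if value:
            return value
        if time.monotonic() >= deadline:
            return None  # Timed out without a truthy result.
        await asyncio.sleep(interval)
```

In the login flow, check could take a fresh snapshot and return the result of find() for an element that only exists after login; the query itself depends on the page.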

Complete Phase 2A-2D Example

NEW in v0.90.17: Here's a comprehensive example that combines the new Phase 2A-2C async features (the Phase 2D developer tools are covered in their own section above):

from predicate.async_api import (
    AsyncPredicateBrowser,
    snapshot_async,
    wait_for_async,
    screenshot_async,
    find_text_rect_async,
    read_async,
    show_overlay_async,
    expect_async,
    PredicateAgentAsync
)
from predicate.llm_provider import OpenAIProvider

async def comprehensive_example():
    """Example using Phase 2A-2C features"""
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        
        # Phase 2A: Core Utilities
        # Wait for element
        result = await wait_for_async(browser, "role=button", timeout=5.0)
        
        # Capture screenshot
        data_url = await screenshot_async(browser, format="jpeg", quality=80)
        
        # Find text on page
        text_result = await find_text_rect_async(browser, "Sign In")
        
        # Phase 2B: Supporting Utilities
        # Read page content
        markdown = await read_async(browser, output_format="markdown")
        
        # Show visual overlay
        snap = await snapshot_async(browser)
        await show_overlay_async(browser, snap, target_element_id=42)
        
        # Assertions
        element = await expect_async(browser, "role=button").to_be_visible()
        await expect_async(browser, "role=link").to_have_count(5)
        
        # Phase 2C: Agent Layer
        llm = OpenAIProvider(api_key="your_key", model="gpt-4o")
        agent = PredicateAgentAsync(browser, llm)
        
        # Natural language automation
        result = await agent.act("Click the login button")
        result = await agent.act("Type 'user@example.com' into the email field")
        
        # Token tracking
        stats = agent.get_token_stats()
        print(f"Tokens used: {stats['total_tokens']}")

if __name__ == "__main__":
    import asyncio
    asyncio.run(comprehensive_example())

Concurrent Operations

Run multiple browser tasks concurrently with asyncio.gather:

import asyncio
from predicate.async_api import AsyncPredicateBrowser, snapshot_async

async def scrape_page(url: str):
    """Scrape a single page"""
    async with AsyncPredicateBrowser() as browser:
        await browser.goto(url)
        snap = await snapshot_async(browser)
        return {
            "url": url,
            "title": snap.title,
            "element_count": len(snap.elements)
        }

async def scrape_multiple_pages():
    """Scrape multiple pages concurrently"""
    urls = [
        "https://example.com",
        "https://example.org",
        "https://example.net"
    ]

    # Run all scrapes concurrently
    tasks = [scrape_page(url) for url in urls]
    results = await asyncio.gather(*tasks)

    for result in results:
        print(f"{result['url']}: {result['title']} ({result['element_count']} elements)")

# Run
if __name__ == "__main__":
    asyncio.run(scrape_multiple_pages())
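Each scrape_page call above opens its own browser, so a long URL list would launch many browsers at once. One common way to bound that is asyncio.Semaphore. The sketch below uses only the standard library. gather_limited is a hypothetical helper, not an SDK function.

```python
import asyncio


async def gather_limited(coro_factories, limit: int = 2):
    """Run zero-argument coroutine factories with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(factory):
        # Wait for a free slot before creating and running the coroutine.
        async with semaphore:
            return await factory()

    # return_exceptions=True keeps one failure from discarding the other results.
    return await asyncio.gather(
        *(run_one(factory) for factory in coro_factories),
        return_exceptions=True,
    )
```

With the scraper above, this could be called as gather_limited([lambda u=u: scrape_page(u) for u in urls], limit=2). Failed pages then appear in the result list as exception objects, in the same position as their URL.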

Integration with Async Frameworks

FastAPI

from fastapi import FastAPI
from predicate.async_api import AsyncPredicateBrowser, snapshot_async

app = FastAPI()

@app.get("/scrape")
async def scrape_endpoint(url: str):
    """API endpoint that scrapes a URL"""
    async with AsyncPredicateBrowser() as browser:
        await browser.goto(url)
        snap = await snapshot_async(browser)

        return {
            "url": url,
            "title": snap.title,
            "element_count": len(snap.elements)
        }

aiohttp

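from aiohttp import web
from predicate.async_api import AsyncPredicateBrowser, snapshot_async

async def handle_scrape(request):
    """Handle scrape request"""
    url = request.query.get('url')
    if not url:
        # Reject requests without a target instead of passing None to goto()
        raise web.HTTPBadRequest(text="Missing 'url' query parameter")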

    async with AsyncPredicateBrowser() as browser:
        await browser.goto(url)
        snap = await snapshot_async(browser)

        return web.json_response({
            "url": url,
            "title": snap.title,
            "element_count": len(snap.elements)
        })

app = web.Application()
app.router.add_get('/scrape', handle_scrape)

if __name__ == "__main__":
    web.run_app(app)
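Both endpoints above navigate a browser to a caller-supplied URL, so it may be worth validating that input before calling goto(). The sketch below is one minimal check using only the standard library. is_http_url is a hypothetical helper, and a real service would likely need stricter rules, for example blocking internal hosts.

```python
from urllib.parse import urlsplit


def is_http_url(candidate: str) -> bool:
    """Return True only for absolute http(s) URLs that include a host."""
    try:
        parts = urlsplit(candidate)
    except ValueError:
        # urlsplit raises on malformed input such as an unterminated IPv6 host.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

A handler could return a 400 response whenever is_http_url(url) is False.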

API Organization

v0.90.17 Refactoring:

Benefits

Complete Coverage:

Performance:

Code Quality:

Compatibility:

Testing:

Migration from Sync API

Migrating from sync to async is straightforward:

# Sync API (before)
from predicate import PredicateBrowser, snapshot, find, click

with PredicateBrowser() as browser:
    browser.goto("https://example.com")
    snap = snapshot(browser)
    button = find(snap, "role=button")
    if button:
        click(browser, button.id)

# Async API (after)
from predicate.async_api import AsyncPredicateBrowser, snapshot_async, find, click_async

async def main():
    async with AsyncPredicateBrowser() as browser:
        await browser.goto("https://example.com")
        snap = await snapshot_async(browser)
        button = find(snap, "role=button")
        if button:
            await click_async(browser, button.id)

Key changes:

  1. Import from predicate.async_api instead of predicate
  2. Use AsyncPredicateBrowser instead of PredicateBrowser
  3. Add await before I/O operations (goto, snapshot_async, click_async, etc.)
  4. Add async keyword to function definition
  5. Pure functions (find, query) don't need async
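A common slip during migration is forgetting an await: the call then returns a coroutine object instead of a result. The sketch below shows this with the standard library only. fetch_title is a hypothetical stand-in for any awaited SDK call.

```python
import asyncio
import inspect


async def fetch_title() -> str:
    """Stand-in for an awaited SDK call; it only yields to the event loop."""
    await asyncio.sleep(0)
    return "Example Domain"


async def main() -> str:
    pending = fetch_title()  # No await: this is a coroutine object, not a str.
    assert inspect.iscoroutine(pending)
    return await pending  # Awaiting it produces the actual result.


def run() -> str:
    # Synchronous entry point: asyncio.run() creates the event loop and drives main().
    return asyncio.run(main())
```

The run() function also shows how to start the migrated main() from existing synchronous code.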
