Turn read() output into validated JSON records (schema-first) so agents can extract data reliably instead of scraping brittle HTML.
This page covers:
- read() markdown

Extraction should produce validated data, not “maybe JSON”.
Use extraction when you need structured records (items, prices, metadata) that downstream code can trust.
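To make the difference between validated data and “maybe JSON” concrete, here is a minimal sketch that uses only pydantic and the standard library. The helper name parse_item and the sample strings are hypothetical and not part of the SDK; the sketch only illustrates that a record either matches the schema or is rejected.

```python
import json
from typing import Optional

from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    name: str
    price: str


def parse_item(raw: str) -> Optional[Item]:
    """Hypothetical helper: return a validated Item, or None if raw does not fit the schema."""
    try:
        # json.loads rejects text that is not JSON; Item(...) rejects JSON that misses fields.
        return Item(**json.loads(raw))
    except (json.JSONDecodeError, TypeError, ValidationError):
        return None


good = parse_item('{"name": "Widget", "price": "$9.99"}')
missing_field = parse_item('{"name": "Widget"}')
not_json = parse_item("name: Widget, price: $9.99")
```

Downstream code can then branch on a single condition (a validated record or nothing) instead of probing a loosely shaped dictionary.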
The stable path to structured data is:
1. read()
2. extract(...)

from pydantic import BaseModel
from predicate import read, extract

# Schema for one extracted record.
class Item(BaseModel):
    name: str
    price: str

# browser and llm are assumed to be initialized elsewhere.
md = read(browser, format="markdown")["content"]
result = extract(browser, llm, "Extract item name and price", schema=Item)

# result.data is only meaningful when result.ok is true.
if result.ok:
    print(result.data.name, result.data.price)
else:
    print("extract failed:", result.error)
Extraction can fail for deterministic reasons:
- read() output

When extraction fails, treat it as a normal verification failure:
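One possible way to route a failed extraction through the same path as any other failed check is sketched below. ExtractResult is a hypothetical stand-in that mirrors only the ok, data, and error attributes used earlier on this page; VerificationFailure and require_data are illustrative names, not SDK APIs.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ExtractResult:
    """Hypothetical stand-in for the object returned by extract(...)."""
    ok: bool
    data: Any = None
    error: Optional[str] = None


class VerificationFailure(Exception):
    """Hypothetical exception for a step whose check did not pass."""


def require_data(result: ExtractResult, step: str) -> Any:
    """Return the validated data, or fail the step like any other check."""
    if not result.ok:
        raise VerificationFailure(f"{step}: extract failed: {result.error}")
    return result.data


# A failed extraction surfaces as an ordinary, catchable failure.
failed = ExtractResult(ok=False, error="schema validation failed")
try:
    require_data(failed, "extract_item")
    message = ""
except VerificationFailure as exc:
    message = str(exc)

# A successful extraction passes its data straight through.
item = require_data(ExtractResult(ok=True, data={"name": "Widget"}), "extract_item")
```

Raising one exception type keeps retry and reporting logic in a single place, so extraction failures need no special handling of their own.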