Typed workflows. Not flowcharts.
Playbooks are YAML-defined workflow pipelines that specify what agents do, how they do it, and how their output is evaluated. Versioned, composable, and replayable. They replace the spreadsheets, Notion docs, and Slack threads you use to coordinate work today.
Playbook definition
playbook: content-pipeline
version: 3
trigger: scheduled
schedule: "0 9 * * MON"
inputs:
  topic: { type: string, required: true }
  tone: { type: string, default: "professional" }
steps:
  - name: research
    agent: strategist
    tool: web-research
    with:
      query: ${inputs.topic}
      depth: comprehensive
    rubric:
      relevance: 0.8
      depth: 0.7
  - name: draft
    agent: marketer
    tool: content-write
    with:
      research: ${steps.research.output}
      tone: ${inputs.tone}
    rubric:
      quality: 0.8
      brand_voice: 0.9
      originality: 0.7
  - name: review
    agent: strategist
    tool: content-review
    with:
      draft: ${steps.draft.output}
    approval: required
  - if: steps.review.score < 0.8
    goto: draft
    max_retries: 2
Every playbook is named and versioned. Diff, rollback, and replay any version.
Scheduled, event-driven, webhook, or manual triggers. Cron syntax for precision.
Typed inputs with defaults and validation. Passed to every step via interpolation.
Each step assigns an agent, a tool, and a quality rubric. Scores determine pass/fail.
Branching, loops, retries. If a step scores below threshold, it re-runs or escalates.
Rubric-based quality control
Define quality criteria per step. Agents self-evaluate against rubrics and auto-escalate when they cannot meet the score. No manual QA. No hoping for the best.
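To see how the weights and thresholds in the rubric that follows might combine, here is a worked example with made-up scores. It assumes the composite is the weighted sum of the per-criterion scores. The field names (`scores`, `composite`, `outcome`) are illustrative and not a documented output format.

```yaml
# Hypothetical scored result for one step (illustrative values only).
scores:
  quality: 0.85       # 0.35 * 0.85 = 0.2975 (threshold 0.8: pass)
  brand_voice: 0.92   # 0.30 * 0.92 = 0.2760 (threshold 0.9: pass)
  originality: 0.75   # 0.20 * 0.75 = 0.1500 (threshold 0.7: pass)
  accuracy: 0.96      # 0.15 * 0.96 = 0.1440 (threshold 0.95: pass)
composite: 0.8675     # sum of the four weighted terms, above 0.82
outcome: auto-approved
```

Under the same assumption, dropping brand_voice to 0.70 lowers the composite to 0.8015. That falls below 0.82, and brand_voice also misses its own 0.9 threshold, so the step would escalate.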
rubric:
  quality:
    weight: 0.35
    threshold: 0.8
    on_fail: retry
  brand_voice:
    weight: 0.30
    threshold: 0.9
    on_fail: escalate
  originality:
    weight: 0.20
    threshold: 0.7
    on_fail: retry
  accuracy:
    weight: 0.15
    threshold: 0.95
    on_fail: block
  # Weighted score must exceed 0.82
  # to auto-approve. Below that,
  # the step escalates to a human.
  composite_threshold: 0.82
Template library
Pre-built playbooks for common business functions. Fork, customize, and deploy. Every template ships with typed schemas and rubric grading built in.
Research, draft, review, publish. Multi-agent content creation with quality rubrics.
Crawl site, analyze rankings, generate fix list, apply changes, verify improvements.
Welcome sequence, account setup, first-value guide, check-in scheduling.
Classify ticket, route to specialist, draft response, escalate if needed.
Pull PR diff, analyze changes, run static analysis, post review comments.
Monitor competitors, detect changes, summarize findings, recommend actions.
Playbook composition
Chain playbooks into multi-step workflows. The output of one playbook feeds the input of the next. Each runs independently with its own rubrics and approval gates.
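A minimal sketch of what chaining could look like, reusing the step and interpolation conventions from the playbook definition. The `playbook:` step key, the `seo-audit` playbook name, its `page` input, and the `manual` trigger value are assumptions for illustration, not confirmed syntax.

```yaml
# Hypothetical composition: one parent playbook calling two others.
playbook: publish-and-audit
version: 1
trigger: manual
inputs:
  topic: { type: string, required: true }
steps:
  - name: content
    playbook: content-pipeline      # runs with its own rubrics and approval gate
    with:
      topic: ${inputs.topic}
  - name: audit
    playbook: seo-audit             # assumed name for a second playbook
    with:
      page: ${steps.content.output} # output of the first feeds the second
```

Because each child playbook keeps its own rubrics, a failure inside `content-pipeline` would be handled there before `audit` receives any input.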
Write your first playbook.
YAML-defined, typed, versioned, and replayable. The building block for every autonomous workflow.
