🤖 AGENTS 5 min read

The operator console: where the autopilot's work waits for a signature

Durable runs, an inbox for licensed humans, a signature ceremony for irreversible writes, and an Ops tab with a dead-letter queue. WCAG 2.2 AA, axe-core: 0 violations.

Last post ended with the autopilot pausing at a human checkpoint. Pausing is easy: any workflow engine can stop. The hard questions are operational: where does the case wait, who is allowed to sign it, what do they see before signing, and what happens when the write fails at 2am?

That's what we built through v2.46–v2.63: the operator console (great-cto board → /autopilot.html). It's the Operate-mode surface: the app for the licensed humans the flow escalates to, not for the engineer who wired it.


Durable runs: the signature crosses a process boundary

A run persists to disk and survives restarts. startRun advances the flow to the gate and parks it as awaiting-approval; approve(id, who) resumes it and executes the irreversible write; reject ends the run without executing anything irreversible. Every transition appends to an immutable audit trail.

The v2.43 safety invariant now holds end to end: the 837 claim is submitted only because a coder signed its protecting gate. That is provable across a process boundary, because the approve happens in a different process than the start.
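To make the invariant concrete, here is a minimal sketch of the run lifecycle. Only startRun, approve(id, who), reject, and the awaiting-approval state come from the text above; the Run and AuditEntry shapes, the JSON-in-a-record stand-in for disk, and the submitted array are assumptions for illustration, not great-cto's actual run store.

```typescript
// Sketch only: shapes and storage are hypothetical, not the project's real API.
type RunStatus = "awaiting-approval" | "completed" | "rejected";

interface AuditEntry {
  readonly at: string;
  readonly event: string;
  readonly who: string;
}

interface Run {
  readonly id: string;
  readonly status: RunStatus;
  readonly gatedStep: string;
  readonly audit: readonly AuditEntry[];
}

// Stand-in for files on disk: each caller only ever sees serialized JSON.
const disk: Record<string, string> = {};
// Records the irreversible writes that actually executed.
const submitted: string[] = [];

function save(run: Run): Run {
  disk[run.id] = JSON.stringify(run);
  return run;
}

function load(id: string): Run {
  const raw = disk[id];
  if (raw === undefined) throw new Error(`unknown run ${id}`);
  return JSON.parse(raw) as Run;
}

// Append-only trail: every transition copies the trail and adds one entry.
function transition(run: Run, status: RunStatus, event: string, who: string): Run {
  const entry: AuditEntry = { at: new Date().toISOString(), event, who };
  return save({ ...run, status, audit: [...run.audit, entry] });
}

function startRun(id: string, gatedStep: string): Run {
  const fresh: Run = { id, status: "awaiting-approval", gatedStep, audit: [] };
  return transition(fresh, "awaiting-approval", "parked-at-gate", "system");
}

function approve(id: string, who: string): Run {
  const run = load(id); // state comes from storage, not from memory
  if (run.status !== "awaiting-approval") throw new Error(`run ${id} is ${run.status}`);
  submitted.push(run.gatedStep); // the irreversible write happens only here
  return transition(run, "completed", "approved", who);
}

function reject(id: string, who: string): Run {
  const run = load(id);
  if (run.status !== "awaiting-approval") throw new Error(`run ${id} is ${run.status}`);
  return transition(run, "rejected", "rejected", who);
}
```

Because approve reloads the run from storage before acting, nothing in it depends on the process that called startRun still being alive, which is the property the invariant relies on.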

We demonstrated it live on medical coding: intake → code → NCCI edits (three live connectors) → pause → the coder signs in the inbox → the claim goes out → completed. The reject path submits nothing.

Flows can require several signatures in sequence. Tax needs two: the preparer signs with their PTIN, then the taxpayer signs Form 8879; the IRS e-file fires only after both. The board pushes a notification to the signer the moment a gate opens.
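A sequence of signatures can be modeled as an ordered list of gates where only the first unsigned gate is open. The sketch below illustrates that idea with the tax example; the Gate and SignatureFlow types, the gate names, and the role strings are hypothetical, not the project's schema.

```typescript
// Sketch only: types, gate names, and roles are assumptions for illustration.
interface Gate {
  readonly name: string;
  readonly signerRole: string;
}

interface SignatureFlow {
  readonly gates: readonly Gate[];
  readonly signedBy: readonly string[]; // one signer id per gate already passed
}

// The gate currently waiting for a signature, or undefined once all are signed.
function openGate(flow: SignatureFlow): Gate | undefined {
  return flow.gates[flow.signedBy.length];
}

// Accept a signature only from the role the open gate expects, in order.
function sign(flow: SignatureFlow, role: string, who: string): SignatureFlow {
  const gate = openGate(flow);
  if (gate === undefined) throw new Error("all gates already signed");
  if (gate.signerRole !== role) {
    throw new Error(`gate "${gate.name}" needs a ${gate.signerRole}, got ${role}`);
  }
  return { ...flow, signedBy: [...flow.signedBy, who] };
}

// The irreversible write may fire only when no gate remains open.
function readyToFire(flow: SignatureFlow): boolean {
  return openGate(flow) === undefined;
}

const taxFlow: SignatureFlow = {
  gates: [
    { name: "preparer-signature", signerRole: "preparer" },
    { name: "form-8879", signerRole: "taxpayer" },
  ],
  signedBy: [],
};
```

With this shape, a taxpayer signature offered before the preparer's is refused, and readyToFire stays false until both gates have been passed in order.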


What the signer actually sees

A queue, then a case drawer. The drawer carries everything a decision needs in one panel:

Signing an irreversible write opens a signature ceremony: an alert dialog that names exactly what will execute (the gated step, its blast radius, the gate protecting it) and requires explicit confirmation. No one "accidentally approves" a wire transfer because the button was where their cursor happened to be.

And because humans override machines (that's the point), overrides are logged: sign against the AI recommendation and the divergence is recorded with the case, recommendation, decision, and who. Your regulator will ask. Now there's an answer.
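The override record is small enough to sketch. The four fields are the ones named above; the type names, the in-memory log, and the overrideRate helper are assumptions for illustration.

```typescript
// Sketch only: names and the in-memory log are hypothetical.
type Decision = "approve" | "reject";

interface OverrideRecord {
  readonly caseId: string;
  readonly recommendation: Decision;
  readonly decision: Decision;
  readonly who: string;
}

const overrideLog: OverrideRecord[] = [];

// Log a divergence only when the human decision differs from the AI recommendation.
// Returns true when an override was recorded.
function recordDecision(
  caseId: string,
  recommendation: Decision,
  decision: Decision,
  who: string,
): boolean {
  if (recommendation === decision) return false;
  overrideLog.push({ caseId, recommendation, decision, who });
  return true;
}

// Share of decided cases in which the human overrode the recommendation.
function overrideRate(totalDecided: number): number {
  return totalDecided === 0 ? 0 : overrideLog.length / totalDecided;
}
```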


The routing dial

Not every case deserves a human minute. Admin Settings sets a per-tenant confidence floor: a low-confidence approve is downgraded to escalate, and clean high-confidence cases are flagged auto-eligible. The dial moves as your trust does: start with everything escalated, then widen straight-through as the override rate stays flat.
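The routing rule is a small pure function. This sketch assumes confidence and the floor are numbers in [0, 1] and treats any approve at or above the floor as "clean"; the real console may apply further checks before flagging a case auto-eligible. The route function and its types are hypothetical names.

```typescript
// Sketch only: names, the [0, 1] scale, and the "clean" rule are assumptions.
type Recommendation = "approve" | "reject" | "escalate";

interface Routed {
  readonly action: Recommendation;
  readonly autoEligible: boolean;
}

// floor is the per-tenant confidence floor set by an admin.
function route(recommendation: Recommendation, confidence: number, floor: number): Routed {
  if (recommendation === "approve" && confidence < floor) {
    // A low-confidence approve is downgraded so a human decides.
    return { action: "escalate", autoEligible: false };
  }
  return {
    action: recommendation,
    // Only an approve that clears the floor is flagged for straight-through.
    autoEligible: recommendation === "approve",
  };
}
```

Setting the floor above any confidence the model can report escalates every approve, which matches the "start with everything escalated" posture; lowering it widens straight-through processing.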

Around the queue, the things an operation actually needs:


The Ops tab: because writes fail

The least glamorous tab is the one that earns the trust. For admins and compliance-leads:

Retries never double-submit: an idempotency key, stable per run, is threaded into every write.
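Here is a minimal sketch of why a stable key makes retries safe. It simulates the awkward failure case: the write lands, but the response is lost. FakeProvider, writeWithRetry, and the runId:step key format are assumptions for illustration; the text only says the key is stable per run.

```typescript
// Sketch only: the provider, key format, and retry loop are hypothetical.
class FakeProvider {
  private readonly receipts: Record<string, string> = {};
  public readonly accepted: string[] = []; // payloads that were really written
  public dropNextResponse = false;

  submit(key: string, payload: string): string {
    let receipt = this.receipts[key];
    if (receipt === undefined) {
      // First time this key is seen: perform the write.
      this.accepted.push(payload);
      receipt = `receipt-${this.accepted.length}`;
      this.receipts[key] = receipt;
    }
    if (this.dropNextResponse) {
      this.dropNextResponse = false;
      throw new Error("timeout: write landed but the response was lost");
    }
    return receipt; // a replayed key returns the original receipt
  }
}

// Derived only from run and step, so every retry produces the same key.
function idempotencyKey(runId: string, step: string): string {
  return `${runId}:${step}`;
}

function writeWithRetry(
  provider: FakeProvider,
  runId: string,
  step: string,
  payload: string,
  maxAttempts: number,
): string {
  const key = idempotencyKey(runId, step);
  let lastError: unknown = new Error("no attempts made");
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return provider.submit(key, payload);
    } catch (err) {
      lastError = err; // retry with the same key
    }
  }
  throw lastError;
}
```

The design choice worth noting is that the key is computed from the run, not generated per attempt. A key minted inside the retry loop would make each retry look like a new write.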


Enterprise polish, measured

v2.63 was a full UI/UX pass, and we held it to numbers rather than adjectives:

Accessibility: WCAG 2.2 AA (axe-core: 0 violations, all tabs, both themes)
Themes: light/dark (prefers-color-scheme + persist), white-label accent per tenant
Realtime: SSE pushes a change the instant any run mutates, whether from console, CLI, or webhook
Scale: render cap keeps 500+ case queues smooth
Reliability: durable-runtime e2e across all 25 verticals (start → gate → sign → write), 348/348 lib tests

Multi-tenant scoping means an operator sees only their tenant's queue. Cases export to CSV, because the auditor's tooling is Excel and pretending otherwise helps no one.


Why this matters

"Human in the loop" is usually a checkbox in a pitch deck. Operationally it's a product: an inbox with SLA clocks, a drawer with evidence, a ceremony for the point of no return, override logs, QA sampling, and a dead-letter queue for the night the provider's API was down.

That product is what makes it safe to let the autopilot run the volume. Try it: npx great-cto init, then great-cto board. Screenshots are on the landing page; the run store, runtime, and console are all in the repo.

Also published on
devto