🛑 COMPLIANCE · 4 min read

We pivoted: GreatCTO is now AI autopilots for business

From 6 to 25 verticals in one week. Each runs on live connectors, and the runtime physically refuses to fire an irreversible action without a human signature.

For a year GreatCTO was an engineering-process engine: agents, gates, reviewers, compliance packs. Good product. Wrong headline.

Here's the thing we kept observing: the people who got the most value weren't buying "a better SDLC." They were buying the outcome of a business function: claims coded, contracts reviewed, invoices matched, taxes filed. The pipeline was the means.

So in v2.40 we said it out loud: GreatCTO ships AI autopilots for business. Products that sell the outcome of a service, not a tool to a specialist. Packs, reviewers and gates didn't go anywhere; they became the under-the-hood trust layer instead of the headline.


What an autopilot actually is

A flow. One file per vertical, flows/<vertical>.flow.json, is the single source of truth that renders the CLI behavior, the runtime, and the landing page from the same data.
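To make the single-source-of-truth idea concrete, here is a minimal sketch. The field names (`vertical`, `outcome`, `steps`, `irreversible`, `signer`) and both render functions are assumptions made for illustration, not GreatCTO's actual schema; the point is only that two views derived from one object cannot drift apart.

```javascript
// Hypothetical flow definition; every field name here is an assumption.
const exampleFlow = {
  vertical: "medical-coding",
  outcome: "claims coded",
  steps: [
    { id: "extract", kind: "agent" },
    { id: "compliance-review", kind: "reviewer" },
    { id: "submit-claim", kind: "action", irreversible: true, signer: "certified coder" },
  ],
};

// CLI view: one line per step, flagging the steps that need a signature.
function renderCli(flow) {
  return flow.steps
    .map((s) => `${s.id} [${s.kind}]${s.irreversible ? " (needs signature)" : ""}`)
    .join("\n");
}

// Landing-page view: derived from the same object as the CLI view.
function renderLanding(flow) {
  const gated = flow.steps.filter((s) => s.irreversible).length;
  return `${flow.vertical}: ${flow.outcome}, ${flow.steps.length} steps, ${gated} human checkpoint(s)`;
}

console.log(renderCli(exampleFlow));
console.log(renderLanding(exampleFlow));
```

Changing a step in `exampleFlow` changes both outputs at once, which is the property the one-file-per-vertical design is after.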

The four autopilot invariants are machine-checkable (autopilot-gate.mjs): judgment boundary (confidence → escalation), accuracy-as-SLA, per-decision audit trail, per-outcome unit economics. Not a manifesto, but a validator that exits 1.
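A validator of that kind could be sketched roughly as follows. This is not the contents of autopilot-gate.mjs; the field names (`escalateBelowConfidence`, `accuracySla`, `auditTrail`, `pricePerOutcomeUsd`) are hypothetical stand-ins for however the real flow file expresses each invariant.

```javascript
// All field names below are assumptions for illustration only.
const invariantChecks = {
  "judgment boundary": (f) =>
    typeof f.escalateBelowConfidence === "number" &&
    f.escalateBelowConfidence > 0 &&
    f.escalateBelowConfidence <= 1,
  "accuracy-as-SLA": (f) =>
    typeof f.accuracySla === "number" && f.accuracySla > 0 && f.accuracySla <= 1,
  "per-decision audit trail": (f) => f.auditTrail === "per-decision",
  "per-outcome unit economics": (f) =>
    typeof f.pricePerOutcomeUsd === "number" && f.pricePerOutcomeUsd > 0,
};

// Returns the names of every invariant the flow does not satisfy.
function failedInvariants(flow) {
  return Object.entries(invariantChecks)
    .filter(([, check]) => !check(flow))
    .map(([name]) => name);
}

// Maps the result to a process exit code: 0 when all pass, 1 otherwise.
function gateExitCode(flow) {
  const failed = failedInvariants(flow);
  for (const name of failed) console.error(`invariant failed: ${name}`);
  return failed.length === 0 ? 0 : 1;
}

const draftFlow = { escalateBelowConfidence: 0.8, accuracySla: 0.97, auditTrail: "per-decision" };
console.log(gateExitCode(draftFlow)); // 1, because unit economics are missing
```

In a CLI wrapper, assigning the result to `process.exitCode` would make CI fail on any missing invariant.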


6 → 16 → 25 verticals

We started with six (legal docs, medical coding, procurement, accounting, managed IT, tax). Then the expansion criterion clicked: a vertical is a fit when it pairs a large displaceable-labor pool with a legally-required named human who signs the risky call. That's the exact shape the safety engine is built for.

Ten more landed in v2.44: prior-auth ($35–56B), KYC/AML ($61B), managed SOC, insurance claims (~$36–38B), mortgage underwriting, title & escrow, provider credentialing, collections, freight brokerage, clinical-trial ops. Then immigration, appraisal, payroll, workers-comp, estate planning, patent prosecution. Twenty-five total, every one shipping green on --validate.

Each carries its own compliance reviewer: False Claims Act + NCCI for coding, OFAC + BSA for AML, FDCPA + Reg F for collections, Circular 230 + §7216 for tax, FMCSA for freight. The regulation is a step in the flow, not a PDF you read later.


"Live" means live

A flow that calls mocked connectors is a demo. By v2.45, all verticals exercise at least one live connector. The catalog has 17 live connectors, keyless by default (deterministic real logic or a curated public slice), and each switches to the real provider the moment you add a credential.
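The keyless-by-default pattern can be sketched like this. The `makeConnector` factory, the `BANK_API_KEY` variable, and the choice of an ABA routing-number checksum as the "deterministic real logic" are all assumptions picked for illustration; they are not taken from the GreatCTO connector catalog.

```javascript
// Keyless path: the standard ABA routing-number checksum, no network needed.
function abaChecksumOk(routing) {
  if (!/^\d{9}$/.test(routing)) return false;
  const d = [...routing].map(Number);
  const sum =
    3 * (d[0] + d[3] + d[6]) + 7 * (d[1] + d[4] + d[7]) + (d[2] + d[5] + d[8]);
  return sum % 10 === 0;
}

// Hypothetical factory: uses the provider only when a credential is present.
function makeConnector({ envVar, keyless, provider }) {
  return function call(input, env = {}) {
    const credential = env[envVar];
    return credential
      ? { mode: "provider", result: provider(input, credential) }
      : { mode: "keyless", result: keyless(input) };
  };
}

const checkRouting = makeConnector({
  envVar: "BANK_API_KEY", // hypothetical variable name
  keyless: abaChecksumOk,
  provider: () => {
    throw new Error("provider call is out of scope for this sketch");
  },
});

console.log(checkRouting("021000021")); // { mode: 'keyless', result: true }
```

Passing `process.env` as the second argument is what would flip a connector to its provider path once the credential exists, with no change to the calling flow.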

A few favorites:


The permission is never the wound

The scariest failure mode of an agent isn't going rogue. It's doing exactly what it's permitted to do, irreversibly, at machine speed, with no human hesitation. (Hat tip to Oleksandr Torlo's essay "The Permission Was the Wound.")

v2.43 made the boundary a runtime invariant, not a convention:
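As a minimal sketch of what such an invariant could look like (the `execute` function, the action shape, and the signature shape are hypothetical, not GreatCTO's runtime API), the guard lives in the one code path every action must take, so no caller can skip it:

```javascript
class SignatureRequiredError extends Error {}

// Hypothetical runtime entry point: every action passes through here.
function execute(action, signature) {
  if (action.irreversible) {
    const signed =
      signature != null &&
      signature.actionId === action.id &&
      typeof signature.signedBy === "string" &&
      signature.signedBy.length > 0;
    if (!signed) {
      throw new SignatureRequiredError(
        `"${action.id}" is irreversible and has no human signature`
      );
    }
  }
  return action.run();
}

const draftReturn = { id: "draft-return", irreversible: false, run: () => "drafted" };
const fileReturn = { id: "file-return", irreversible: true, run: () => "filed" };

console.log(execute(draftReturn)); // reversible work runs unattended
try {
  execute(fileReturn); // no signature supplied
} catch (err) {
  console.log(err.message);
}
console.log(execute(fileReturn, { actionId: "file-return", signedBy: "J. Doe, EA" }));
```

Binding the signature to a specific `actionId` means an approval for one action cannot be replayed against another.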

The autopilot does the volume. The point of no return always waits for a person.


Quality is earned, not declared

Every vertical gets a 0–100 scorecard: seven weighted dimensions, golden + adversarial cases run through the reviewer with an LLM judge, and a regression gate so a score can't silently decay. Two measure→improve→re-measure cycles took legaltech from 85 to 94.75 and msp from 78 to 98.5.

If we're going to claim an autopilot can hold a function, the claim should be a number someone measured, and a gate that fails CI when it stops being true.
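The arithmetic behind a weighted scorecard and a regression gate is small enough to sketch. The seven dimension names, their weights, and the scores below are invented for illustration; they are not the real rubric or real measurements.

```javascript
// Weighted 0-100 score: sum(score * weight) / sum(weight).
function scorecard(dimensions) {
  const totalWeight = dimensions.reduce((sum, d) => sum + d.weight, 0);
  const weighted = dimensions.reduce((sum, d) => sum + d.score * d.weight, 0);
  return weighted / totalWeight;
}

// Regression gate: exit code 1 when the score drops below the stored baseline.
function regressionGate(current, baseline, tolerance = 0) {
  return current + tolerance >= baseline ? 0 : 1;
}

// Seven hypothetical dimensions; names, weights, and scores are made up.
const sampleDimensions = [
  { name: "accuracy", weight: 20, score: 90 },
  { name: "escalation", weight: 20, score: 100 },
  { name: "audit", weight: 15, score: 80 },
  { name: "compliance", weight: 15, score: 100 },
  { name: "robustness", weight: 10, score: 90 },
  { name: "latency", weight: 10, score: 100 },
  { name: "cost", weight: 10, score: 90 },
];

const score = scorecard(sampleDimensions);
console.log(score); // 93
console.log(regressionGate(score, 90)); // 0: at or above baseline
console.log(regressionGate(score, 95)); // 1: regression, CI should fail
```

Storing the baseline next to the flow and updating it only on a deliberate re-measure is what keeps a score from decaying silently.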


Where this leaves you

npx great-cto init, name the function, and you get the flow: agents, connectors, human checkpoints, the compliance pack for your domain. The pipeline that built features for a year now runs business functions, with the same receipts: all 25 autopilots, each with its flow, gates, and live-connector badges.

Next post: what happens after the flow pauses, in the operator console where a human actually signs.

Also published on
devto