📋 owasp-llm-top-10

OWASP LLM Top 10, enforced on every PR.

OWASP Top 10 for Large Language Model Applications. The same gate attaches whatever you're building — a reviewer agent reads the rule, your code, and your tests, and blocks the merge if a requirement is missing.

What OWASP LLM Top 10 requires

Concretely, line by line.

What an OWASP LLM Top 10 failure costs

The downside is not theoretical.

Not a regulation — a security baseline. The cost is the incident: prompt-injection data exfiltration, a poisoned tool call, or a jailbroken agent acting with its full permissions.

LLM01 (prompt injection) and LLM06 (sensitive-information disclosure) are the most-exploited in production agent systems — an agent doing exactly what it was tricked into, irreversibly, at machine speed.

The risk is rarely ignorance of the rule; it's the gap between knowing it and enforcing it on every pull request. No input/output validation around the model, unbounded tool permissions, and no guardrail on what an agent can execute without a human: each is a pattern a reviewer can catch in the diff, before it reaches production. That gap is what the ai-security-reviewer, backed by eval fixtures that cover all 10 risks, closes.
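To make two of those patterns concrete (unbounded tool permissions and no human checkpoint), here is a minimal sketch of the kind of guard a reviewer would expect to see around an agent's tool calls. Every name in it (ToolPolicy, run_tool, the tool names) is hypothetical and chosen for illustration; it is not GreatCTO's code or anything prescribed by OWASP.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, Optional


class ToolNotAllowed(Exception):
    """The agent asked for a tool outside its allowlist."""


class ApprovalRequired(Exception):
    """A sensitive tool was requested without a human sign-off."""


@dataclass(frozen=True)
class ToolPolicy:
    # Hypothetical policy object: an explicit allowlist plus the subset of
    # tools that must never run without a named human approver.
    allowed: frozenset[str]
    needs_human: frozenset[str] = frozenset()

    def check(self, tool: str, approved_by: Optional[str] = None) -> None:
        if tool not in self.allowed:
            raise ToolNotAllowed(tool)
        if tool in self.needs_human and not approved_by:
            raise ApprovalRequired(tool)


def run_tool(
    policy: ToolPolicy,
    registry: dict[str, Callable[..., Any]],
    tool: str,
    args: dict[str, Any],
    approved_by: Optional[str] = None,
) -> Any:
    """Run a tool only after the policy check passes."""
    policy.check(tool, approved_by)
    return registry[tool](**args)
```

The point of the shape is that the check sits between the model's request and the side effect, so a missing allowlist or a missing approval path shows up in a diff as a missing call.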

How GreatCTO wires it

Detect → overlay → evidence.

GreatCTO detects your project's archetype, overlays the matching reviewer agent, and attaches the OWASP LLM Top 10 gates. The reviewer reads the rule text, your code, and your tests, then emits a verdict per requirement, with a diff if anything's missing.

DETECT

Archetype + scope

Stack signals, manifests, and README keywords identify the project type. OWASP LLM Top 10 gates attach when the code paths that carry the obligation are in scope.
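As a rough illustration of scoping from manifests and README keywords, the sketch below decides whether LLM-related gates should attach. The signal table, the keyword pattern, and the function name are all assumptions made up for this example; the actual detector may work quite differently.

```python
from __future__ import annotations

import re

# Hypothetical signals: dependency names and README keywords that hint at
# LLM-backed code paths. Illustrative only, not GreatCTO's real rules.
MANIFEST_SIGNALS = {"openai", "anthropic", "langchain"}
README_PATTERN = re.compile(r"\b(agent|llm|prompt|rag)\b", re.IGNORECASE)


def llm_gates_in_scope(manifest_deps: set[str], readme_text: str) -> bool:
    """Return True when a manifest dependency matches a known signal,
    or when the README mentions at least two distinct LLM keywords."""
    has_dep = bool(MANIFEST_SIGNALS & {dep.lower() for dep in manifest_deps})
    keywords = {match.lower() for match in README_PATTERN.findall(readme_text)}
    # One keyword alone is weak evidence ("user agent"), so require two.
    return has_dep or len(keywords) >= 2
```

Requiring two distinct README keywords is one simple way to cut false positives from a word like "agent" appearing in an unrelated context.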

OVERLAY

ai-security-reviewer + eval fixtures cover all 10

The reviewer prompt encodes each requirement above as a check. When a PR touches a relevant code path, the gate fires with the specific check that matters.

EVIDENCE

Audit trail per gate

Each gate decision is logged to .great_cto/gates.log with timestamp, reviewer, verdict, and rationale. Auditors get a tidy CSV; no scramble at audit time.
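The on-disk format of .great_cto/gates.log is not specified here, so the following is only a sketch under a loud assumption: one JSON object per line, with the four fields named above as keys. Under that assumption, turning the log into a CSV for an auditor takes a few lines. The function name is hypothetical.

```python
import csv
import io
import json

# Assumed field names, taken from the description of the log; the real
# log layout may differ.
FIELDS = ["timestamp", "reviewer", "verdict", "rationale"]


def gates_log_to_csv(log_text: str) -> str:
    """Convert JSON-lines gate decisions (assumed format) into CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        writer.writerow(json.loads(line))
    return out.getvalue()
```

Unknown keys are ignored and missing ones are written as empty cells, so a partially filled entry still produces a well-formed row.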

MEMORY

Lessons across audits

When an auditor flags something, the lesson is promoted to ~/.great_cto/decisions.md after the third similar finding and ships in the next project's Step 0.
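The promote-on-third-finding rule can be sketched as a simple counter. This is a hypothetical illustration: LessonTracker is an invented name, and "similar" is reduced here to an exact match on a finding key, which is almost certainly cruder than whatever grouping the tool applies.

```python
from collections import Counter

# From the text: a lesson is promoted on the third similar finding.
PROMOTION_THRESHOLD = 3


class LessonTracker:
    """Count findings per key and flag the moment one reaches the threshold."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()
        self.promoted: list = []

    def record(self, finding_key: str) -> bool:
        """Record one finding; return True only on the call that promotes it."""
        self._counts[finding_key] += 1
        if self._counts[finding_key] == PROMOTION_THRESHOLD:
            self.promoted.append(finding_key)
            return True
        return False
```

Returning True only on the exact crossing keeps the promotion from being written twice when the same finding shows up a fourth or fifth time.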

Covered across every archetype

One reviewer, every project shape.

OWASP LLM Top 10 doesn't care what you're building — and neither does the gate. The same enforcement attaches whether your project is a fintech API, a healthcare app, a marketplace, an MLOps pipeline, or an internal tool:

🌐 web-service 🤖 agent-product 💸 fintech ⚕️ healthcare 🧠 ai-system 🛒 commerce 📱 mobile-app ⌨️ cli-tool 📦 library 🧩 browser-extension 🎮 game ⛓️ web3 📊 data-platform 🛠️ devtools 📡 iot-embedded 🏗️ infra 🏢 enterprise-saas 🔬 mlops 🌊 streaming 🤝 marketplace 📰 cms 📋 regulated 📚 edtech 🏛️ gov-public 🛡️ insurance

Caveats

What GreatCTO does not do.

It does not certify you. OWASP LLM Top 10 compliance requires human accountability — a sign-off, a review, in some cases an external auditor. GreatCTO ships the evidence; you still own the attestation.

It does not substitute for legal review. The reviewer encodes commonly accepted readings of the OWASP list, not your specific jurisdictional interpretation. For high-stakes cases, lawyer involvement is still load-bearing.

It does not eliminate gaps in the requirements list. The list above is the surface area covered programmatically. Override the reviewer prompt in agents/ for your specifics.

Receipts

Don't take my word for it.

01 · ARCHITECTURE

Live state machine

Every box on the diagram is a clickable link to the agent's source on GitHub.

02 · PROOF

One real compliance run

A pack rollout with gates auto-wired. Timeline, costs, artifacts.

03 · AGENTS

Every reviewer on GitHub

The ai-security-reviewer prompt, along with the eval fixtures that cover all 10 risks, is auditable. Read it, override it, fork it.

Install

Wire OWASP LLM Top 10 gates in one command.

$ npx great-cto init

Free, MIT, runs locally. The reviewer agent ships with the npm package — no SaaS portal, no compliance-vendor lock-in.
