OWASP LLM Top 10 compliance, automated in the build pipeline

What OWASP LLM Top 10 requires

Concretely, line by line.

01. prompt-injection defense
02. insecure output handling
03. training-data poisoning
04. model DoS
05. supply-chain
06. sensitive-info disclosure
07. insecure plugin design
08. excessive agency
09. overreliance
10. model theft

What an OWASP LLM Top 10 failure costs

The downside is not theoretical.

Not a regulation — a security baseline. The cost is the incident: prompt-injection data exfiltration, a poisoned tool call, or a jailbroken agent acting with its full permissions.

LLM01 (prompt injection) and LLM06 (sensitive-information disclosure) are the most-exploited in production agent systems — an agent doing exactly what it was tricked into, irreversibly, at machine speed.

The risk is rarely ignorance of the rule; it's the gap between knowing it and enforcing it on every pull request. Missing input/output validation around the model, unbounded tool permissions, and no guardrail on what an agent can execute without a human: each is a pattern a reviewer can catch in the diff, before it reaches production. That gap is what the ai-security-reviewer, with eval fixtures covering all 10 items, closes.
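As an illustration of the guardrail pattern a reviewer would look for in a diff, here is a minimal sketch of a default-deny tool policy with a human-approval step. The tool names, the `ToolCall` type, and the `authorize` function are hypothetical and are not part of GreatCTO.

```python
from dataclasses import dataclass, field

# Hypothetical policy: these tool names and the split between
# read-only and approval-required tools are illustrative assumptions.
READ_ONLY_TOOLS = {"search_docs", "read_ticket"}
NEEDS_HUMAN_APPROVAL = {"send_email", "delete_record"}


@dataclass
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)


def authorize(call: ToolCall, human_approved: bool = False) -> bool:
    """Return True only if the call is allowed under the policy."""
    if call.name in READ_ONLY_TOOLS:
        return True
    if call.name in NEEDS_HUMAN_APPROVAL:
        # Irreversible actions run only after an explicit human sign-off.
        return human_approved
    # Default-deny: a tool that is not listed is never executed.
    return False
```

The default-deny branch is the part that bounds the agent's permissions: adding a new tool requires an explicit policy change that shows up in review.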

How GreatCTO wires it

Detect → overlay → evidence.

GreatCTO detects your project's archetype, overlays the matching reviewer agent, and attaches the OWASP LLM Top 10 gates. The reviewer reads the OWASP LLM Top 10 text, your code, and your tests, then emits a verdict per requirement, with a diff if anything's missing.

DETECT

Archetype + scope

Stack signals, manifests, and README keywords identify the project type. OWASP LLM Top 10 gates attach when the code paths that carry the obligation are in scope.
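The detection step could be approximated as keyword matching over manifests and the README. The sketch below is an assumption about the general shape only: the `SIGNALS` table, its keywords, and `detect_archetypes` are hypothetical, and the text does not describe GreatCTO's actual rules.

```python
# Hypothetical signal table: archetype -> keywords to look for in the
# dependency manifest and in the README. Illustrative values only.
SIGNALS = {
    "agent-product": {
        "manifest": ("langchain", "openai"),
        "readme": ("agent", "tool calling"),
    },
    "fintech": {
        "manifest": ("stripe",),
        "readme": ("payments", "ledger"),
    },
}


def detect_archetypes(manifest_text: str, readme_text: str) -> list[str]:
    """Return the sorted archetypes whose signals appear in either input."""
    manifest = manifest_text.lower()
    readme = readme_text.lower()
    hits = []
    for archetype, signals in SIGNALS.items():
        in_manifest = any(k in manifest for k in signals["manifest"])
        in_readme = any(k in readme for k in signals["readme"])
        if in_manifest or in_readme:
            hits.append(archetype)
    return sorted(hits)
```

Plain substring matching is deliberately naive here; a real detector would likely parse the manifest rather than scan it as text.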

OVERLAY

ai-security-reviewer + eval fixtures cover all 10

The reviewer prompt encodes each requirement above as a check. When a PR touches a relevant code path, the gate fires with the specific check that matters.
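One way to picture "the gate fires with the specific check that matters" is a mapping from changed file paths to checks. This is a sketch under assumptions: the glob patterns, the `CHECKS_BY_PATH` table, and `checks_for` are hypothetical and only illustrate the idea.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from path patterns to OWASP LLM Top 10 checks.
# The patterns are illustrative; a real project would define its own.
CHECKS_BY_PATH = [
    ("*prompts/*", "LLM01 prompt injection"),
    ("*plugins/*", "LLM07 insecure plugin design"),
    ("*tools/*", "LLM08 excessive agency"),
    ("requirements*.txt", "LLM05 supply chain"),
]


def checks_for(changed_files: list[str]) -> list[str]:
    """Return the checks triggered by a PR's changed files, without repeats."""
    fired = []
    for path in changed_files:
        for pattern, check in CHECKS_BY_PATH:
            if fnmatchcase(path, pattern) and check not in fired:
                fired.append(check)
    return fired
```

A PR that only touches documentation would trigger no checks under this table, while one that edits a file under `prompts/` would trigger the prompt-injection check.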

EVIDENCE

Audit trail per gate

Each gate decision is logged to .great_cto/gates.log with timestamp, reviewer, verdict, and rationale. Auditors get a tidy CSV; no scramble at audit time.
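To make the record shape concrete, here is a minimal sketch that builds a gate record with the four fields named above and converts log lines to CSV. The on-disk format of .great_cto/gates.log is not specified in the text; one JSON object per line is an assumption, and `gate_record` and `to_csv` are hypothetical helpers.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Field names taken from the text; the JSON-lines layout is an assumption.
FIELDS = ["timestamp", "reviewer", "verdict", "rationale"]


def gate_record(reviewer: str, verdict: str, rationale: str, now=None) -> dict:
    """Build one gate decision record with a UTC ISO 8601 timestamp."""
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.isoformat(),
        "reviewer": reviewer,
        "verdict": verdict,
        "rationale": rationale,
    }


def to_csv(log_lines: list[str]) -> str:
    """Convert JSON-lines log entries into CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for line in log_lines:
        if line.strip():  # skip blank lines
            writer.writerow(json.loads(line))
    return out.getvalue()
```

Keeping the log append-only and deriving the CSV from it means the export can be regenerated at any time without touching the original evidence.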

MEMORY

Lessons across audits

When an auditor flags something, the lesson promotes to ~/.great_cto/decisions.md after the 3rd similar finding and ships in the next project's Step 0.
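The promotion rule can be sketched as a counter with a threshold. Two assumptions are made here: "similar" findings are grouped by an exact string key, and promotion happens at the moment the third finding is recorded. `LessonTracker` is a hypothetical name, and writing to decisions.md is left out.

```python
from collections import Counter

PROMOTION_THRESHOLD = 3  # from the text: promote after the 3rd similar finding


class LessonTracker:
    """Counts findings by key and promotes a lesson at the threshold."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self.promoted: list[str] = []

    def record(self, finding_key: str) -> bool:
        """Record one finding; return True if it triggers promotion."""
        self.counts[finding_key] += 1
        if self.counts[finding_key] == PROMOTION_THRESHOLD:
            self.promoted.append(finding_key)
            return True
        return False
```

Comparing with `==` rather than `>=` means a lesson is promoted once, not again on every later occurrence of the same finding.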

Covered across every archetype

One reviewer, every project shape.

OWASP LLM Top 10 doesn't care what you're building — and neither does the gate. The same enforcement attaches whether your project is a fintech API, a healthcare app, a marketplace, an MLOps pipeline, or an internal tool:

🌐 web-service 🤖 agent-product 💸 fintech ⚕️ healthcare 🧠 ai-system 🛒 commerce 📱 mobile-app ⌨️ cli-tool 📦 library 🧩 browser-extension 🎮 game ⛓️ web3 📊 data-platform 🛠️ devtools 📡 iot-embedded 🏗️ infra 🏢 enterprise-saas 🔬 mlops 🌊 streaming 🤝 marketplace 📰 cms 📋 regulated 📚 edtech 🏛️ gov-public 🛡️ insurance

Caveats

What GreatCTO does not do.

It does not certify you. OWASP LLM Top 10 compliance requires human accountability — a sign-off, a review, in some cases an external auditor. GreatCTO ships the evidence; you still own the attestation.

It does not substitute for legal review. The reviewer encodes commonly accepted readings of the OWASP LLM Top 10, not your specific jurisdictional interpretation. For high-stakes cases, lawyer involvement is still load-bearing.

It does not eliminate gaps in the requirements list. The list above is the surface area covered programmatically. Override the reviewer prompt in agents/ for your specifics.

Receipts