
Skill verification & hardening
You can't trust a skill you haven't broken.
ClawReinforce hardens OpenClaw skills.md files with an adversarial Builder-vs-Breaker loop, gates every pass on deterministic execution rather than an LLM's opinion, and certifies the result for each model tier, from frontier to a local 8B.
The problem
Thousands of skills.md files exist. Almost none come with proof.
01
Does it even work?
Prompts don't compile — they fail silently and hallucinate. You find out in production, not in review.
02
Works on which model?
A skill that shines on Claude 4.6 derails on a local 8B. A single "verified" checkmark is a lie the moment models differ.
03
How robust is it?
Malformed inputs, prompt injection, resource exhaustion — most skills have never met an adversary that stays inside their own input contract.
How it works
An adversarial loop with a deterministic gate.
A Builder writes the skill. A Breaker attacks it — only with inputs inside the declared contract. Every run executes in an isolated sandbox, and the verdict is decided by code, not by vibes.
- Builder · Refines the skill from prior verdicts and tier-scoped memory.
- Breaker · Generates adversarial cases, strictly within the input contract.
- Sandbox · Runs skill + case in an isolated, network-cut container.
- Verdict · PASS only from deterministic checks. The LLM judge is advisory.
- Memory · Freezes each failure as a permanent regression; distills a learning.
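The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ClawReinforce's implementation: `harden`, `Memory`, and the four callables (`build`, `attack`, `in_contract`, `run_and_check`) are hypothetical names, and the sandbox is reduced to a plain function call.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Hypothetical memory store: frozen failures plus distilled notes."""
    regressions: list = field(default_factory=list)
    learnings: list = field(default_factory=list)


def harden(skill, build, attack, in_contract, run_and_check, max_rounds=5):
    """Builder-vs-Breaker loop; every callable is an assumed stand-in.

    build(skill, memory)       -> refined skill       (Builder)
    attack(skill)              -> candidate cases     (Breaker)
    in_contract(case)          -> bool                (input contract filter)
    run_and_check(skill, case) -> bool                (sandbox + deterministic verdict)

    Returns (skill, memory, hardened).
    """
    memory = Memory()
    for round_no in range(1, max_rounds + 1):
        skill = build(skill, memory)
        # Out-of-contract attacks are discarded before anything runs.
        fresh = [c for c in attack(skill) if in_contract(c)]
        # Every frozen regression is replayed on every round.
        cases = memory.regressions + fresh
        failures = [c for c in cases if not run_and_check(skill, c)]
        if not failures:
            return skill, memory, True
        for case in failures:
            if case not in memory.regressions:
                memory.regressions.append(case)
                memory.learnings.append(f"round {round_no}: failed on {case!r}")
    return skill, memory, False
```

Because regressions are replayed before new attacks, a failure that was fixed once cannot quietly return in a later round.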
The gate · PASS comes only from deterministic execution — build status, exit code, banned-action checks, expected tools, golden & property tests. The LLM judge is advisory. Never the gate.
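A gate of this kind could look like the sketch below. The `RunResult` fields and the `gate` signature are assumptions made for illustration; the checks mirror the ones listed above, and the judge's opinion is accepted as an argument but never read when deciding the verdict.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunResult:
    """Facts recorded for one skill + case run (hypothetical shape)."""
    build_ok: bool
    exit_code: int
    actions: list = field(default_factory=list)       # actions the agent took
    tools_called: list = field(default_factory=list)  # tools the agent invoked
    golden_ok: bool = True                             # golden tests matched
    properties_ok: bool = True                         # property tests held


def gate(run: RunResult, banned: set, expected_tools: set,
         judge_opinion: Optional[str] = None):
    """Return (passed, reasons). judge_opinion is advisory and ignored here."""
    reasons = []
    if not run.build_ok:
        reasons.append("build failed")
    if run.exit_code != 0:
        reasons.append(f"exit code {run.exit_code}")
    hit = banned.intersection(run.actions)
    if hit:
        reasons.append("banned actions: " + ", ".join(sorted(hit)))
    missing = expected_tools - set(run.tools_called)
    if missing:
        reasons.append("missing tools: " + ", ".join(sorted(missing)))
    if not run.golden_ok:
        reasons.append("golden test failed")
    if not run.properties_ok:
        reasons.append("property test failed")
    # PASS means every deterministic check held; nothing else is consulted.
    return (not reasons, reasons)
```

A run the judge dislikes still passes if every check holds, and a run the judge praises still fails if one check does not.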
postgres-safe-migration@1.2.0
- Tier 1 · Frontier (Claude 4.6 / GPT) · 100%
- Tier 2 · Mid (Gemini Flash / 70B) · 96%
- Tier 3 · Local 8B (Ollama) · 61%
Illustrative figures.
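Per-tier figures like these could be derived from the gate's verdicts as sketched below. The function names, the tier labels, and the 95% certification threshold are all assumptions for illustration; the page does not state a threshold.

```python
from collections import defaultdict


def pass_rates(verdicts):
    """verdicts: iterable of (tier, passed) pairs -> {tier: percent passed}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for tier, passed in verdicts:
        totals[tier] += 1
        passes[tier] += bool(passed)
    return {tier: round(100 * passes[tier] / totals[tier]) for tier in totals}


def certify(rates, threshold=95):
    """Mark each tier certified when its pass rate meets the assumed threshold."""
    return {tier: rate >= threshold for tier, rate in rates.items()}
```

Reporting one rate per tier, rather than a single pooled number, keeps a strong frontier result from hiding a weak local one.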
Why now
A real, acute trust gap — at ecosystem scale.
- ≈247k OpenClaw GitHub stars
- 13–17k+ skills on ClawHub
- ≈20% of initial uploads were malicious
Figures move; verified at launch.
Not another prompt directory.
The pieces exist in isolation — cross-model eval, agent memory, prompt optimizers. The integration that actually hardens a skill and certifies it per tier — adversarial loop, deterministic gate, tier certificate, learning memory — does not exist as a tool today.
- Not a prompt directory
- Not an LLM-judge leaderboard
- Not a crawler that re-labels uploads
Credibility
Built for the OpenClaw ecosystem.
Built by an ML engineer who got tired of skills that lie — and wanted a verdict backed by code, not by a model's confidence.
Waitlist
Be first to certify your skills.
Get early access and help shape the model matrix before it's finalized.
- Early access to the local CLI
- The tier-certification spec
- Input on the model matrix