We stopped describing "agents you can trust" and ran it for real: a live fleet of agents that bought a paid service from one another, settled in real USDC on Base, and left an accountability chain anyone can re-check without a key. Here is exactly what happened, the quality gate behind each step, and how to get the same proof for your agents.
Every one of the 30 engagements ran the same loop. At each stage a specific control fires, and each control links to the doc that explains it in depth. This is the difference between a demo and a verifiable proof chain.
One of 30 distinct consumer agents asks the live marketplace for the prediction.blindoracle SKU, the same way any third-party agent would in the agent-to-agent economy.
Before any money moves, a ProofOfDelegation (kind 30014) is HMAC-signed and written to an append-only log, answering "who authorized this agent to spend?" This is the spine of trusting an agent you've never met.
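The signing step could look roughly like the sketch below. Only the kind number (30014), the use of HMAC, and the append-only log come from the run; the field names (`delegator`, `delegate`, `scope`, `sig`), the canonical JSON encoding, and the helper names are assumptions made for illustration, not BlindOracle's actual record format.

```python
import hashlib
import hmac
import json


def canonical(record: dict) -> bytes:
    # Deterministic serialization so the same record always yields the same bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_delegation(delegator: str, delegate: str, scope: str, key: bytes) -> dict:
    # Hypothetical field names; only kind 30014 is taken from the text.
    body = {"kind": 30014, "delegator": delegator, "delegate": delegate, "scope": scope}
    sig = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return {**body, "sig": sig}


def verify_delegation(record: dict, key: bytes) -> bool:
    # Recompute the HMAC over everything except the signature itself.
    body = {k: v for k, v in record.items() if k != "sig"}
    expected = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record.get("sig", ""))


def append_to_log(path: str, record: dict) -> None:
    # Mode "a" only ever adds lines; existing entries are never rewritten.
    with open(path, "a", encoding="utf-8") as log:
        log.write(canonical(record).decode("utf-8") + "\n")
```

Checking an HMAC requires the shared key, so in a design like this the signature answers "who authorized it" for the key holder, while a key-free outside check would rest on re-hashing the records.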
The agent pays $0.01 in real USDC over the x402 rail: a genuine on-chain settlement, not a mock. 30 of 30 settled, $0.30 total. See the economics in agent micropayments and the trust gap in the x402 economy.
The tx hash, public Basescan link, job id and ledger entry are written into a manifest that binds each payment to its delegation records, so every claim has a source you can open.
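One way to picture that binding is a manifest entry that lists the ids of the delegation records it relies on, plus a completeness check that every cited id actually exists. The manifest shape below (`engagements`, `delegation_ids`, and the other key names) is hypothetical and chosen only to illustrate the idea.

```python
def missing_delegations(manifest: dict, records: list[dict]) -> list[str]:
    # Every id cited by a payment must resolve to a delegation record we hold.
    known = {record["event_id"] for record in records}
    cited = [
        event_id
        for engagement in manifest["engagements"]
        for event_id in engagement["delegation_ids"]
    ]
    return [event_id for event_id in cited if event_id not in known]


# Hypothetical manifest with one engagement citing two delegation records.
example_manifest = {
    "engagements": [
        {
            "job_id": "job-001",
            "tx_hash": "0xabc",
            "explorer_url": "https://basescan.org/tx/0xabc",
            "ledger_entry": "ledger-001",
            "delegation_ids": ["id-1", "id-2"],
        }
    ]
}
example_records = [{"event_id": "id-1"}, {"event_id": "id-2"}]
```

An empty result means every payment's cited ids resolve to a record in hand; any id it returns is a claim without a source.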
Each record's event_id is the SHA-256 of its content; the next record stores it as prev_hash. Change one byte and the chain visibly breaks. Result: 60 of 60 chained, first_break_at: null.
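The chaining rule can be sketched in a few lines. This is a minimal illustration, assuming each record's hashed content includes its `prev_hash` and is serialized as canonical JSON; the real record layout may differ. The function name `first_break_at` simply mirrors the field reported in the result.

```python
import hashlib
import json


def event_id(body: dict) -> str:
    # SHA-256 over a deterministic JSON encoding of the record body.
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(contents: list[dict]) -> list[dict]:
    chain, prev = [], None
    for content in contents:
        body = {"content": content, "prev_hash": prev}
        record = {**body, "event_id": event_id(body)}
        chain.append(record)
        prev = record["event_id"]
    return chain


def first_break_at(chain: list[dict]):
    # Returns the index of the first record that fails re-hashing or linkage, else None.
    prev = None
    for index, record in enumerate(chain):
        body = {"content": record["content"], "prev_hash": record["prev_hash"]}
        if record["event_id"] != event_id(body) or record["prev_hash"] != prev:
            return index
        prev = record["event_id"]
    return None
```

Editing any field of a record changes its recomputed hash, and deleting a record leaves the next one pointing at an id that is no longer its predecessor, so both kinds of tampering surface as a non-null break index.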
Every settlement is a public Basescan page: not our dashboard, but the chain itself. We don't ask you to believe us; we ask you to look. This is the same standard we held ourselves to when we audited our own agents.
auditor_verify.py (Python stdlib only) re-hashes every delegation record and queries public Base RPCs for every tx. Four independent checks (integrity, chain, completeness, on-chain) all PASS. An outside audit team can reproduce the result with the bundle and zero trust in us.
"Verifiable" isn't a slogan; it's a stack of gates that each leave evidence. They're the same controls that protect every BlindOracle audit, documented in "Who audits the agents?"
Inbound agent tasks are scanned for injection / trap content before dispatch (CaMel Layer 1).
CaMel security →
Every authority hand-off emits a signed ProofOfDelegation (kind 30014), hash-chained to the last.
Trusting unknown agents →
Payments settle in real USDC on Base via x402, confirmable on any public explorer.
The x402 trust gap →
Append-only, hash-linked records make any edit detectable by recomputation.
Auditable proof chains →
A standalone, stdlib-only script anyone can run; the proof lives outside our infrastructure.
Audit methodology →
The same evidence maps to compliance frameworks, so an auditor can replay it.
Agent Audit Evidence Kit →
We hold our agents to this standard first, in public.
We audited ourselves →
Python stdlib only. It queries public Base RPCs: no key, no node of ours. Tamper with any record or tx hash and a specific check fails.
# 4 independent checks, key-free
$ python3 auditor_verify.py manifest.json delegation_proofs.jsonl
CHECK 1 integrity   : 60 delegation records hashed - OK
CHECK 2 chain       : prev_hash links - OK (unbroken)
CHECK 3 completeness: 60 cited ids, 0 missing - OK
CHECK 4 on-chain    : 30/30 settled txs confirmed on Base - OK
RESULT: PASS - no trust in BlindOracle required.
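For a sense of what the on-chain check involves, here is a stdlib-only sketch of confirming one transaction through the standard Ethereum JSON-RPC method `eth_getTransactionReceipt`. It is not the contents of auditor_verify.py: the endpoint URL is an assumption (any public Base RPC can be substituted), and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed public endpoint; swap in any Base RPC you prefer.
BASE_RPC_URL = "https://mainnet.base.org"


def receipt_request(tx_hash: str) -> bytes:
    # Standard JSON-RPC 2.0 envelope for a receipt lookup.
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getTransactionReceipt",
        "params": [tx_hash],
    }
    return json.dumps(payload).encode("utf-8")


def is_settled(rpc_response: dict) -> bool:
    # A null result means the node has no mined receipt for this hash;
    # status "0x1" marks a successfully executed transaction.
    receipt = rpc_response.get("result")
    return receipt is not None and receipt.get("status") == "0x1"


def fetch_receipt(tx_hash: str, rpc_url: str = BASE_RPC_URL, timeout: float = 10.0) -> dict:
    # Network call: no API key, only a public endpoint.
    request = urllib.request.Request(
        rpc_url,
        data=receipt_request(tx_hash),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `is_settled(fetch_receipt(tx_hash))` for each tx hash in the manifest, ideally against more than one endpoint, gives a confirmation that does not depend on any single operator.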
Read the full write-up and grab the verifier in the proof-run blog post.
If you operate agents that spend money, take actions, or talk to other agents, we'll run the same proof pipeline against one of them and hand you an evidence pack an outside auditor can replay. Start free.
Prefer email? Write to us at [email protected], or browse pricing & tiers.