Methodology Whitepaper
BLINDORACLE: AUDIT · VERIFY · ATTEST
[Diagram: Agent A, Agent B, Proof]

Auditing Autonomous AI Agents End-to-End

A privacy-preserving, on-chain-verifiable methodology for crypto/DeFi and agent-security reviewers. Every claim is either independently verifiable — or honestly labeled as not.

You can't audit an agent by reading its dashboards

The agent produced them. The dangerous failures live in the gap between what an agent claims and what is externally true. Four moving parts a classic review misses:

Tool use

The agent acts on the world — sends email, signs transactions, calls APIs. The blast radius is the audit subject, not just the source.

Memory

Behavior is a function of mutable state the agent wrote itself. Poisoned memory is a live attack surface.

Delegation

Agents hire sub-agents. "Who pays when the sub-agent breaks things" needs a verifiable chain of authority.

Self-reported metrics

Counts, costs and "proofs" are trivially inflated unless tied to something the agent cannot forge.

Tier every claim before writing a finding

This single act is what makes a report survive a hostile review. A credible audit moves records from B/C up to A — or names the gap.

Tier A — externally verifiable

On a public chain or relay. Anyone re-verifies via block explorer / RPC / relay query, anytime.

Tier B — local-only

In the agent's own store. Requires operator-granted access; a third party cannot confirm it alone.

Tier C — unverified claim

Asserted, no witness. Cannot be confirmed — so it must be flagged or fixed, never quietly shipped.
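The tiering rule above can be sketched as a small classifier. This is only an illustration: the `Claim` fields `public_witness` and `local_record` are hypothetical names, not part of the methodology's published schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    A = "externally verifiable"
    B = "local-only"
    C = "unverified claim"


@dataclass
class Claim:
    text: str
    # Hypothetical fields: a public witness could be a tx hash or relay
    # event id; a local record could be a row id in the agent's own store.
    public_witness: Optional[str] = None
    local_record: Optional[str] = None


def tier_of(claim: Claim) -> Tier:
    """Assign the strongest tier the available evidence supports."""
    if claim.public_witness:
        return Tier.A
    if claim.local_record:
        return Tier.B
    return Tier.C
```

A finding would then cite the tier next to each claim, so a reader can see at a glance which statements they can re-verify alone.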

Four phases that promote local state to verifiable

1. Discover & inventory

Tools, credentials, stores, memory, delegation graph, payment rails. Record each store's cardinality + canonical count — divergent counts are the #1 red flag.
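A minimal sketch of the count check, assuming the reviewer has two sources per store: the count the agent claims and the count obtained by querying the store directly. The function name and the store names in the usage note are hypothetical.

```python
from typing import Dict, Optional, Tuple


def divergent_counts(
    claimed: Dict[str, int], counted: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return every store whose claimed count differs from the counted one.

    A store missing from either side is reported with None on that side,
    since an uninventoried store is itself a divergence.
    """
    stores = set(claimed) | set(counted)
    return {
        name: (claimed.get(name), counted.get(name))
        for name in sorted(stores)
        if claimed.get(name) != counted.get(name)
    }
```

For example, `divergent_counts({"tasks": 120, "payments": 54}, {"tasks": 118, "payments": 54, "memory": 9})` flags `tasks` (120 claimed, 118 counted) and `memory` (never claimed at all).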

2. Tamper-evidence

Make each record tamper-evident, additively. Use a keyed HMAC rather than a plain hash that anyone could recompute, plus a per-leaf salt so that low-entropy records cannot be guessed from their leaves.

leaf = HMAC(key, salt || record)
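The leaf formula can be sketched with Python's standard library. The choices of HMAC-SHA256, a 16-byte random salt, and plain concatenation of a fixed-length salt with the record are assumptions; the source does not fix them.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

SALT_LEN = 16  # assumed salt size; fixed length keeps salt || record unambiguous


def make_leaf(key: bytes, record: bytes, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Compute leaf = HMAC(key, salt || record) and return (salt, leaf)."""
    if salt is None:
        salt = os.urandom(SALT_LEN)
    leaf = hmac.new(key, salt + record, hashlib.sha256).digest()
    return salt, leaf


def verify_leaf(key: bytes, salt: bytes, record: bytes, leaf: bytes) -> bool:
    """Recompute the leaf and compare in constant time."""
    expected = hmac.new(key, salt + record, hashlib.sha256).digest()
    return hmac.compare_digest(expected, leaf)
```

Without the key, an outsider who knows a candidate record and its salt still cannot recompute the leaf, which is the property a bare hash lacks.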

3. Anchor to 3 witnesses

Reduce state to a Merkle root, bind the record count N, and publish only the root to three witnesses: a mainnet, a testnet, and a relay. An inclusion proof reveals one record without exposing the others; binding N provides completeness, so records cannot be silently omitted.

root_commit = SHA256(merkle_root || N || salt)
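A sketch of the root, inclusion proof, and commitment, under stated assumptions: a binary SHA-256 tree with 0x00/0x01 domain-separation prefixes, the last node duplicated on odd levels, and N encoded as 8 bytes big-endian. None of these encodings are specified in the source; only the formula root_commit = SHA256(merkle_root || N || salt) is.

```python
import hashlib
from typing import List, Tuple


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves: List[bytes]) -> List[List[bytes]]:
    """Return all tree levels, bottom first; levels[-1][0] is the root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]  # pad odd levels (assumed rule)
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def inclusion_proof(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(leaf: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Rebuild the path from one leaf and compare against the root."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        node = _h(b"\x01" + (sibling + node if sibling_is_left else node + sibling))
    return node == root


def root_commit(merkle_root: bytes, n: int, salt: bytes) -> bytes:
    """root_commit = SHA256(merkle_root || N || salt), N as 8 bytes big-endian."""
    return _h(merkle_root + n.to_bytes(8, "big") + salt)
```

Under this padding rule the leaf lists [a, b, c] and [a, b, c, c] share a Merkle root, so the bound count N is what tells them apart in the commitment.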

4. Disclose with privacy

Same spine, configurable per record-class — from full-public to zero-knowledge. A "ZK proof" only counts when a real SNARK verifier accepts it.

Disclosure modes

The anchor spine is privacy-mode-agnostic: only the root is ever published to the witnesses. What changes per record-class is the leaf disclosure policy.

Mode                    | What's public                               | Auditability | Privacy
0: Public               | leaf cleartext                              | Maximum      | None
1: Commitment + reveal  | root only; leaf + proof on request          | High         | High
2: Encrypted + token    | ciphertext hash; plaintext via scoped token | Gated        | High
3: Zero-knowledge       | property proof + verifying key              | High         | Maximum

The honesty rule: many "ZK" stacks ship a dev fallback that returns a SHA-256 hash labeled "proven" — no circuit, no soundness. A reviewer treats that as a critical finding. A claim is ZK-verified only when a real SNARK (e.g. Plonk/KZG) verifies against a published key; absent that, the honest label is threshold-attestation.
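The labeling rule can be written as a guard. In this sketch `verifier` is a hypothetical callable standing in for a real SNARK verifier bound to a published key; the code does not check that the callable is sound, which remains the reviewer's job. The "rejected" label for a failed verification is an assumption, not a term from the methodology.

```python
from typing import Callable, Optional


def proof_label(proof: bytes, verifier: Optional[Callable[[bytes], bool]] = None) -> str:
    """Return the strongest label the evidence supports.

    No verifier available  -> "threshold-attestation"
    Verifier accepts proof -> "zk-verified"
    Verifier rejects proof -> "rejected" (assumed label)
    """
    if verifier is None:
        return "threshold-attestation"
    return "zk-verified" if verifier(proof) else "rejected"
```

The point of the guard is that "zk-verified" is never a default: it is only reachable through a verifier call that returned true.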

Run live, re-verifiable today

A 54-agent roster (plain SQLite — Tier B) anchored to the same root on three witnesses on 2026-05-23. Re-verify with the transaction hashes below.

Witness       | Artifact                                               | Status
Base mainnet  | verifyAnchor on 0x62dbc5bB…8E41 · tx 0x94c5e17a…040d   | true
Base Sepolia  | tx 0x72efaee1…48d3                                     | true
Nostr         | event fb5b3969…41b7 (damus, nos.lol)                   | confirmed

Two agents then transacted a security audit and verified each other's signed proofs — discover → request → provider flags SWC-107 reentrancy → requester verifies the signed proof and reproduces the finding → attested. Six steps, zero trust assumed.

Don't trust — verify

Confirm the mainnet anchor with any Base RPC:

cast call 0x62dbc5bBB356388ce65f0dB591d0aa7B334E8E41 "verifyAnchor(bytes32)" 0xa935d956d50f8da5a581da9b704ea891b58cad39533460df1f832e24a7e5eb71 --rpc-url https://mainnet.base.org

Pull the Nostr witness by event id from any public relay and verify its Schnorr signature.
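Before checking the signature, a reviewer can confirm that the event id is the hash of the event's own contents. The sketch below follows the NIP-01 serialization as I understand it (SHA-256 over the compact JSON array `[0, pubkey, created_at, kind, tags, content]`); Python's `json.dumps` may escape some rare control characters differently from the spec. Schnorr (BIP-340) verification needs a secp256k1 library and is not shown.

```python
import hashlib
import json


def nostr_event_id(event: dict) -> str:
    """Recompute the event id from the event's fields (NIP-01 style)."""
    payload = [
        0,
        event["pubkey"],
        event["created_at"],
        event["kind"],
        event["tags"],
        event["content"],
    ]
    # Compact separators and raw UTF-8, with no extra whitespace.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def id_matches(event: dict) -> bool:
    """True when the claimed id equals the recomputed one."""
    return nostr_event_id(event) == event.get("id")
```

If `id_matches` fails, the event was altered after signing or served incorrectly by the relay, and the signature check is moot.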

Audit agents that hold keys or move money?

This is your playbook — and BlindOracle is the trust & governance layer that produces it.
