BlindOracle Reliability Manifesto — Signed Proofs, Not Promises

Section 1 — Rules

The four rules. No exceptions.

Reliability is a system, not a slogan. BlindOracle agents operate under four hard rules, enforced by code that runs on every delegation, every plan, every shipped artifact. The rules are simple. The proofs are signed.

Rule 1 — 60-second ACK

Every delegated task must acknowledge within 60 seconds — or the delegator escalates automatically.

Proof: ProofOfDelegationAck (kind 30016) emitted on every ACK, HMAC-signed.

Enforced by: RQ-206 ACK rail polls data/delegation_acks.jsonl and escalates on T+60s miss.
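A minimal sketch of the T+60s check, not the actual RQ-206 rail. It assumes each line of the ACK log is a JSON object with `delegation_id` and `ack_at` fields (as in the sample record in Section 2), and that issue times are available in a separate mapping. The function name `find_missed_acks` and the shape of `delegations` are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone

ACK_DEADLINE = timedelta(seconds=60)  # Rule 1: ACK within 60 seconds


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2026-05-13T14:22:08.412Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_missed_acks(delegations, ack_lines, now):
    """Return the sorted ids of delegations that missed the ACK deadline.

    delegations: dict of delegation_id -> issue time (hypothetical shape)
    ack_lines:   iterable of JSONL strings read from the ACK log
    now:         current time, passed in so the check is testable
    """
    first_ack = {}
    for line in ack_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        ack_at = parse_ts(record["ack_at"])
        seen = first_ack.get(record["delegation_id"])
        if seen is None or ack_at < seen:
            first_ack[record["delegation_id"]] = ack_at  # keep earliest ACK

    missed = []
    for delegation_id, issued_at in delegations.items():
        deadline = issued_at + ACK_DEADLINE
        ack_at = first_ack.get(delegation_id)
        if ack_at is None:
            if now > deadline:
                missed.append(delegation_id)  # no ACK and deadline passed
        elif ack_at > deadline:
            missed.append(delegation_id)  # ACK arrived too late
    return sorted(missed)
```

A poller would call this on each tick and hand the returned ids to whatever escalation path the delegator uses.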

Rule 2 — Ship-or-no-credit

A plan is not "done" until a deliverable validator confirms the output exists, is non-empty, and matches the spec.

Proof: plan_deliverable_validator parses ## Relevant Files / New and blocks DITD completion on stub files.

Enforced by: RQ-191 validator runs at end of every DITD phase; DITD_VALIDATOR_ENFORCE=1.
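The check described above can be sketched as follows. This is an illustration under stated assumptions, not the real `plan_deliverable_validator`: it assumes the plan is Markdown with a `## Relevant Files` section containing a `### New` subsection whose bullets name paths in backticks, and it uses a deliberately crude stub heuristic (a file containing only comments, `pass`, or `...`). The function names are hypothetical.

```python
import re
from pathlib import Path


def listed_new_files(plan_md):
    """Collect backticked paths from bullets under '## Relevant Files' / '### New'."""
    paths, in_relevant, in_new = [], False, False
    for line in plan_md.splitlines():
        s = line.strip()
        if s.startswith("## "):
            in_relevant = s[3:].strip().lower() == "relevant files"
            in_new = False
        elif s.startswith("### "):
            in_new = in_relevant and s[4:].strip().lower().startswith("new")
        elif in_new and s.startswith("-"):
            match = re.search(r"`([^`]+)`", s)
            if match:
                paths.append(match.group(1))
    return paths


def is_stub(text):
    """Heuristic: a stub has no lines other than comments, 'pass', or '...'."""
    meaningful = [
        l.strip() for l in text.splitlines()
        if l.strip() and not l.strip().startswith("#")
    ]
    return all(l in ("pass", "...") for l in meaningful)


def validate_deliverables(plan_md, root):
    """Return a list of problems; an empty list means the plan may complete."""
    problems = []
    for rel in listed_new_files(plan_md):
        path = Path(root) / rel
        if not path.is_file():
            problems.append(f"{rel}: missing")
        elif path.stat().st_size == 0:
            problems.append(f"{rel}: empty")
        elif is_stub(path.read_text(errors="replace")):
            problems.append(f"{rel}: stub")
    return problems
```

Returning a list of problems rather than a boolean lets the caller report every failing file at once instead of stopping at the first one.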

Rule 3 — Outcome-owner

Every task has one named owner. Every owner has a signed passport. Every signature chains back to the operator.

Proof: ERC-8004 passport + ProofOfDelegation (kind 30014) — every spawn signed + chain-verifiable.

Enforced by: pre_tool_use.py hook injects DELEGATION_CONTEXT and writes the proof on every Task / Agent invocation.

Rule 4 — 72-hour scope-cut

If a plan can't ship in 72 hours, it gets cut or killed. No silent drift, no perpetual WIP.

Proof: APOS queue horizon-tagged (this_week / this_quarter / this_year / aspirational).

Enforced by: RQ-209 horizon validator (plan_horizon_validator.py) rejects unhorizoned plans at queue-add.
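As a rough sketch of rejection at queue-add, assuming plan records carry the `plan_id`, `horizon`, `created`, and `due_by` fields listed in Section 2 and that `due_by` is an ISO 8601 timestamp with a UTC offset. This is not the real `plan_horizon_validator.py`; `queue_add`, `past_due`, and `UnhorizonedPlanError` are hypothetical names.

```python
from datetime import datetime, timezone

# The four horizon tags named in the manifesto.
ALLOWED_HORIZONS = {"this_week", "this_quarter", "this_year", "aspirational"}


class UnhorizonedPlanError(ValueError):
    """Raised when a plan is added without a recognised horizon tag."""


def queue_add(queue, plan):
    """Append a plan to the queue, rejecting it if the horizon is missing or unknown."""
    horizon = plan.get("horizon")
    if horizon not in ALLOWED_HORIZONS:
        raise UnhorizonedPlanError(
            f"plan {plan.get('plan_id')!r} has invalid horizon {horizon!r}"
        )
    queue.append(plan)


def past_due(queue, now):
    """Return ids of queued plans whose due_by has passed (candidates to cut or kill)."""
    return [
        plan["plan_id"]
        for plan in queue
        if datetime.fromisoformat(plan["due_by"]) < now
    ]
```

Raising at insertion time means an unhorizoned plan never reaches the queue, so later checks can assume every record has a valid tag.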

Section 2 — Proof

How a claim becomes a receipt.

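Talk is cheap; signed bytes are not. BlindOracle's reliability claims are backed by an append-only proof log, HMAC-signed at the boundary and hash-chained for integrity. Any third party can recompute the hash chain against the published chain head; verifying the HMAC signatures themselves requires the signing key, because HMAC is a symmetric scheme with no public key. Here is what each rule looks like on the wire.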

Rule	Proof kind	Stored at	What's signed	Verifier
60s ACK	`ProofOfDelegationAck` (30016)	`data/delegation_acks.jsonl`	`{delegation_id, ack_at, agent_passport_hash}`	`scripts/verify_ack_chain.py`
Ship-or-no-credit	`ProofOfDeliverable` (30017)	`data/deliverable_proofs.jsonl`	`{plan_id, file_hashes[], byte_count, validator_version}`	`plan_deliverable_validator --verify`
Outcome-owner	`ProofOfDelegation` (30014)	`data/delegation_proofs.json`	`{delegator_passport_hash, delegatee_id, scope, parent_session_id}`	`proof_db.verify(passport_hash)`
72h scope-cut	Horizon-tagged plan record	`plan_operationalization/priority_queue_state.json`	`{plan_id, horizon, created, due_by}`	`plan_horizon_validator.py --check-all`

{
  "kind": 30016,
  "delegation_id": "del_2026_05_13_<sha>",
  "ack_at": "2026-05-13T14:22:08.412Z",
  "elapsed_ms": 1834,
  "agent_passport_hash": "0x<...>",
  "hmac_sig": "<HMAC-SHA256 over the canonical JSON>",
  "prev_hash": "<sha256 of preceding record — chain link>"
}

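Every proof links to the previous one via prev_hash, so altering any record changes its hash and breaks the link from the record that follows it. We publish the chain head daily, which lets anyone detect a chain that has been rewritten after the fact.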
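To make the signing and chaining concrete, here is a small sketch using only the Python standard library. The canonicalisation (sorted keys, compact separators), the all-zero genesis `prev_hash`, and the function names are assumptions for illustration; the actual verifier scripts may differ.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed prev_hash for the first record


def _dump(obj):
    """Deterministic JSON encoding: sorted keys, no extra whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def _sign(record, key):
    """HMAC-SHA256 over the record with the hmac_sig field excluded."""
    body = {k: v for k, v in record.items() if k != "hmac_sig"}
    return hmac.new(key, _dump(body), hashlib.sha256).hexdigest()


def record_hash(record):
    """SHA-256 of the full record, signature included; used as the chain link."""
    return hashlib.sha256(_dump(record)).hexdigest()


def append_record(chain, fields, key):
    """Link a new record to the chain tail, sign it, and append it."""
    record = dict(fields, prev_hash=record_hash(chain[-1]) if chain else GENESIS)
    record["hmac_sig"] = _sign(record, key)
    chain.append(record)
    return record


def verify_chain(chain, key):
    """Return the index of the first invalid record, or None if the chain is intact."""
    prev = GENESIS
    for index, record in enumerate(chain):
        link_ok = record.get("prev_hash") == prev
        sig_ok = hmac.compare_digest(_sign(record, key), record.get("hmac_sig", ""))
        if not (link_ok and sig_ok):
            return index
        prev = record_hash(record)
    return None
```

Note that `verify_chain` needs the secret key to check signatures. Someone without the key can still recompute `record_hash` over each record and compare the final value with a published chain head.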

Section 3 — Audit

We grade ourselves, every 15 minutes.

The proof log says what happened. The grader says whether it was any good. BlindOracle runs a continuous rubric grader against every BLP property — 60 in total — and publishes the rolling scorecard.

BLP Coverage

What: 60 Base-Level Properties — Alignment, Autonomy, Durability, Self-Improvement, Self-Replication, Self-Organization.

How: blp-rubric-grader-agent (RQ-199) runs on the cron schedule `*/15 * * * *` (every 15 minutes). Warn-only mode active since 2026-05-12.

Public: /api/fleet-stats.json exposes blp_score_global + per-category trend.

Quality Gate

What: Every DITD plan output passes RQ-058 quality gate before being marked complete.

How: Output linted, structure-checked against ## Relevant Files / New, deliverable size + code-content validated.

Public: rejected-plan rate published in /api/fleet-stats.json.

Damage Control

What: L4 sandbox surface (RQ-200) restricts orchestrator capabilities. 6 prompt-injection vectors blocked in prod since 2026-05-12.

How: Pre-tool-use hook + content-trap scanner (RQ-173) on all external ingest.

Public: monthly red-team summary, published on this page.

Rolling 30-day scorecard, fetched live from `/api/fleet-stats.json`: delegations (total and ACK'd within 60 seconds), plans shipped (today and over 30 days, validator-passed), BLP grade (properties currently in-band, out of 60), and last-updated time.

Section 4 — Compare

Reliability theatre vs. reliability receipts.

Most AI shops sell vibes: "trusted by", "production-grade", "enterprise-ready." Ask for the proof. Here's the difference.

Capability	Typical AI shop	BlindOracle
Task ACK time	"fast" (unmeasured)	60s hard rule, signed proof per ACK
Output verification	"we have tests"	Deliverable validator + signed proof, blocks merge on stub
Agent identity	API key or none	ERC-8004 passport, signed delegation chain to root operator
Drift control	Quarterly retro	Horizon-tagged backlog, 72h cut rule, weekly operator brief
Adversarial defense	"we follow OWASP"	MASSAT 4.3/10 audit, RQ-173 trap defense, L4 sandbox in prod
Cost discipline	Opaque markup	79–83% LLM cost reduction documented, multi-provider routing public
Public audit log	None	Hash-chained proof log, verifier scripts in repo

"Show me the receipt" is the only reliability question that matters. Ours is signed.

Section 5 — CTA

Try the proofs. Then book a call.

Try the API

Hit the BlindOracle marketplace with a single x402 request. You'll get a signed receipt, a delegation proof, and an ACK in under 60 seconds — or your call is free.

Open the playground

Read agent-services.json

Book a 15-min reliability walkthrough

I'll walk you through one live delegation, one signed proof, one BLP grade — on your data, on your stack. If it doesn't survive your scrutiny, walk away.

Email [email protected]

No NDAs to read a proof. No demo-ware. The repo is open, the proofs are signed, the grader runs every 15 minutes.