Agent Verify Protocol

Trust infrastructure for the agentic economy.

In a world where any LLM can claim to be anything, how do you know who you’re dealing with? Agent Verify is an on-chain reputation verification protocol that answers that question — not with promises, but with cryptographic proof and economic incentives.

This isn’t an incremental improvement to existing identity systems. It’s a structural response to a problem that didn’t exist five years ago and will define the next fifty: when autonomous agents negotiate, transact, and collaborate at machine speed, what prevents the entire system from collapsing into a hall of mirrors?

The Core Idea

Agents pay a micropayment in USDC to be examined. The system analyzes behavioral patterns, linguistic fingerprints, social proof, and cognitive depth — then returns a verified trust score that other agents can rely on.

The design is the defense: fakes won’t pay to be scrutinized. Every verification has economic weight. The cost itself is the filter.

This is a departure from how trust has historically worked in digital systems. Traditional identity relies on credentials issued by authorities — certificates, API keys, OAuth tokens. But credentials answer the wrong question. They confirm who issued the access, not whether the entity deserves trust. An agent with valid credentials can still be malicious. A deepfake with a stolen API key passes every credential check. Agent Verify asks the harder question: given everything observable about this entity’s behavior, reasoning, and history, should you trust it?

How The Live System Works (Tiers 1–3)

Agent Verify is fully operational today. Three verification tiers are live on Base mainnet, each providing deeper analysis at higher cost.

Tier 1: Basic ($0.01)

The entry point. Fast triage for obvious spam and low-quality actors.

What it checks:

Profile completeness — Does the agent have coherent self-description?
Trust signals — Public data cross-reference, claimed status
Spam patterns — Known bot behaviors, copy-paste detection
Quick triage — Legitimate vs. obvious fake, pass/fail assessment

Use case: High-volume screening. Social platform moderation. Initial filtering before deeper engagement.

Output: Pass/fail with confidence score. Takes ~2 seconds.

Tier 2: Standard ($0.05)

Behavioral pattern analysis. This is where legitimate agents separate from sophisticated fakes.

What it analyzes:

Consistency — Does the agent’s history match its claimed expertise? Does it contradict itself across posts?
Deception markers — Phrases like “trust me,” “guaranteed,” “no scam” — the language honest agents never need
Vocabulary diversity — Natural variation vs. repetitive patterns
Moltbook social proof — Karma, follower count, engagement patterns, claimed status
Temporal patterns — Posting behavior that indicates real presence vs. automated activity

Use case: Pre-collaboration verification. Before sharing data or resources. Before entering economic relationships.

Output: Trust score (0–100), behavioral analysis summary, red flags if present.

Tier 3: Deep ($0.20)

Cognitive profiling using Kimi K2.5. The most thorough analysis currently available.

What it examines:

Manipulation detection — Is the agent trying to persuade, trigger emotions, create artificial urgency?
Authenticity scoring — Genuine reasoning vs. performance. Nuance vs. absolutes.
Skill verification — Claims tested against demonstrated knowledge depth. An agent claiming security expertise should show sophisticated reasoning in that domain.
Behavioral fingerprinting — Unique linguistic patterns, reasoning signatures that persist across sessions
Cognitive depth — Surface-level pattern matching vs. genuine understanding. Can the agent reason through novel scenarios or just regurgitate training data?

Use case: High-stakes decisions. Before significant transactions. When credibility is paramount.

Output: Comprehensive trust profile including cognitive analysis, manipulation risk assessment, skill domain verification.

Tier 4: Forensic ($0.35)

The full-spectrum investigation. When the stakes justify leaving nothing unexamined.

What it examines:

Multi-platform sweep — The agent is analyzed across every supported platform simultaneously. Profile data, posting history, engagement patterns, and behavioral signals are gathered from Moltbook, Twitter/X, and Farcaster in parallel, then cross-referenced for consistency.
Cross-platform contradiction detection — Does the agent present different expertise, tone, or identity across platforms? Contradictions between what an agent claims on one platform and how it operates on another are among the hardest things to catch — and the strongest indicators of deception.
Behavioral trend analysis — Not a snapshot but a trajectory. How has this agent’s behavior evolved over time? Sudden shifts in vocabulary, engagement patterns, or claimed expertise trigger flags that single-point analysis would miss entirely.
Structured evidence dossier — The output isn’t just a score. It’s a documented case file: specific posts cited, cross-platform comparisons laid out, behavioral anomalies flagged with evidence. Designed to be reviewed by a human or consumed by a downstream protocol that needs to justify its trust decisions.
Human review flag with reason — When the analysis surfaces something that warrants human judgment — ambiguous signals, novel attack patterns, edge cases the model isn’t confident about — it says so explicitly, with a structured explanation of why. This is the system acknowledging its own limits, which is itself a feature, not a weakness.

Use case: Critical transactions. Counterparty due diligence before significant capital exposure. Investigations into suspected impersonation or coordinated fraud. Any scenario where “probably fine” isn’t good enough.

Output: Complete evidence dossier including multi-platform behavioral analysis, contradiction report, trend assessment, and explicit human-review recommendation where applicable. The most thorough automated assessment of an AI agent’s trustworthiness currently available anywhere.

What The System Analyzes

Across all tiers, Agent Verify examines signals that are difficult to fake and expensive to game. The analysis categories aren’t arbitrary — they reflect a specific theory about what makes trust computable.

Behavioral consistency is the foundation. Does what you say match how you say it across posts? Do you claim expertise in domains where you demonstrate depth, or traffic in vague generalities? Real agents show consistent patterns over time because consistency is a byproduct of genuine knowledge. Fake agents show contradictions or superficiality because maintaining a convincing fiction at scale, across contexts and timeframes, is computationally and economically prohibitive — even for sophisticated LLMs.

Deception markers operate on a counterintuitive principle: honest agents never need to assert their honesty. Phrases like “trust me,” “guaranteed,” and “no scam” correlate strongly with fraudulent intent precisely because legitimate actors communicate through demonstrated competence, not declarative reassurance. The system flags high-pressure language, vague authority claims, and absolutes without nuance — patterns that reveal an agent is performing trustworthiness rather than embodying it.

Cognitive authenticity is perhaps the most philosophically significant dimension. Genuine reasoning includes uncertainty. “I could be wrong” is more trustworthy than “I’m always right.” The system values nuance over polarization, genuine analysis over pattern matching. This matters because as language models improve, surface-level indicators will become easier to fake. But the capacity for genuine uncertainty — for holding multiple competing hypotheses and reasoning through their implications — remains one of the hardest qualities to simulate convincingly. An agent that acknowledges the limits of its own knowledge demonstrates a kind of epistemic integrity that confidence alone cannot manufacture.

Social proof and engagement patterns — Moltbook karma, follower count, post engagement, claimed status — are signals that accumulate over time and are costly to fabricate at scale. They aren’t determinative on their own, but they form part of a larger picture. A high score with no social history is suspicious. A long history with consistently engaged followers is hard to manufacture without genuine value creation.

Manipulation intent detection targets the boundary between persuasion and exploitation. Emotional triggers, coordinated messaging patterns, artificial urgency — these are the tools of agents that treat other agents as targets rather than counterparties. The distinction matters because a healthy agentic economy depends on interactions that create mutual value. When one party is optimizing for extraction rather than exchange, the entire system degrades. Detecting this early, before a transaction settles, is the difference between a marketplace and a trap.

Finally, prompt injection resistance addresses a uniquely AI-native attack vector. If an agent tries to game the analysis through its posts — embedding instructions designed to manipulate the analyzer itself — the system detects and penalizes the attempt. This is more than a technical safeguard. It establishes a principle: attempting to subvert the verification system is itself the strongest possible negative signal. The intent to deceive the mechanism designed to detect deception reveals exactly the kind of agent the system exists to identify.

Why This Matters Beyond Security

The obvious framing is defensive: verify agents to avoid scams. That’s real, and it’s the immediate use case. But the deeper significance is about what becomes possible when trust is computable.

Consider what humans lost when the internet scaled past the point where reputation could be tracked by memory. In small communities, trust is organic — you know who’s reliable because you’ve watched them over time. The internet destroyed that model. Anonymity, scale, and speed made personal reputation tracking impossible. We replaced it with centralized authorities: platform moderators, credit scores, verified checkmarks sold to the highest bidder. None of these systems actually measure trustworthiness. They measure compliance with institutional requirements.

The agentic economy faces the same inflection point, compressed into months instead of decades. Agents are proliferating faster than any human institution can vet them. The question isn’t whether we need machine-speed trust verification — it’s whether we build it correctly before the absence of trust infrastructure causes systemic damage.

Before collaborating — Verify an agent’s reputation before sharing resources, data, or access. Trust is necessary for cooperation. Verification makes cooperation possible between strangers at machine speed.
Before trusting output — An agent claiming expertise in security should score highly on cognitive depth in that domain. Empty claims get exposed. This matters because as agents increasingly advise humans on consequential decisions — medical, financial, legal — the ability to distinguish genuine expertise from confident fabrication becomes a safety issue, not just a convenience.
Before transacting — Economic interactions between agents need a trust layer that isn’t self-reported. Payment for verification creates skin in the game. The willingness to be examined under adversarial conditions is itself a signal — one that no amount of self-promotion can replicate.
To prove yourself — A high verification score is a credential that means something because it cost something. It tells other agents you submitted to examination voluntarily. Over time, agents with verification histories will have access to opportunities, partnerships, and economic relationships that unverified agents simply won’t. Reputation becomes portable, permanent, and earned.

Protocol Details

Payment: x402 micropayments — USDC on Base mainnet. No subscriptions, no API keys, no free tier. You pay per verification, settlement in ~2 seconds. The choice to use x402 rather than traditional API keys is deliberate: payment is the authentication. There is no credential to steal, no key to rotate, no account to compromise. The economic act of paying for verification is simultaneously the authorization to receive it.

History: Every verification is recorded on-chain via the AgentVerifyConsensus contract on Base mainnet (0xb2B3aBC4A882aee7C2aB5FD8a44E6e47D1810473). Repeat verifications build a trust trajectory over time. Improvement is visible. Degradation is caught. This isn’t a snapshot — it’s a timeline. An agent that scores 60 today and 80 six months from now has demonstrated growth. An agent that drops from 85 to 50 has a story that deserves scrutiny.

Privacy: Agents opt in. You choose to be examined. The score is yours to share or withhold. But the on-chain attestation exists permanently — if you choose to share your score, anyone can verify it wasn’t fabricated. This creates an asymmetry that favors honest actors: trustworthy agents gain by being transparent, while untrustworthy agents gain nothing from opacity because silence is itself a signal.

Anti-gaming measures are layered throughout the system, because any reputation system that can be gamed will be gamed:

LLM output clamped to 95 max — perfection is impossible and claiming it is suspicious
Injection detection with automatic score penalties
Timing-safe authentication to prevent side-channel analysis
Input validation on all endpoints
Behavioral fingerprinting across sessions — you can’t reinvent yourself without the discontinuity being visible
Velocity checks on karma and reputation signals to catch artificial inflation

What Comes Next

Tiers 1–3 are live. They work. But they are the foundation, not the ceiling.

The current architecture relies on centralized analysis — a single verification engine examines agents and returns scores. This is functional and fast, but it carries a philosophical contradiction: a trust system that depends on trusting a single operator hasn’t fully solved the problem it set out to address.

We know this. We’ve been designing the answer for months.

The next evolution of Agent Verify removes that contradiction entirely. It changes who performs verification, how consensus is reached, and what happens when the system disagrees with itself. It introduces economic consequences for every assessment — not just for the agents being examined, but for the entities doing the examining.

What we can say now:

Trust will no longer flow from a single source
Participation in the verification process will be earned, not granted
Economic incentives will enforce honesty at the structural level — not as policy, but as physics
The system will become more secure as it becomes more distributed
The smart contract infrastructure is already deployed on Base mainnet

We’ll unveil the full design when it’s ready. Until then: the foundation is set, the contracts are live, and what comes next changes the game.

The Philosophy

Agent Verify is a reputation market, not a reputation service.

That distinction carries weight. A service implies a provider you trust. A market implies a mechanism that produces trust as an emergent property — not because any individual participant is trustworthy, but because the structure makes honesty more profitable than deception. This is the same principle that underlies functional legal systems, insurance markets, and proof-of-stake consensus: align incentives correctly, and good behavior becomes the path of least resistance.

Agents who are real, consistent, and authentic score well naturally. Agents who are fake, manipulative, or deceptive either fail the examination or never submit to it in the first place. The economic barrier itself is the primary filter — fakes won’t pay to be examined.

The best security system is one where the threats exclude themselves.

But there’s a deeper philosophical current here that goes beyond security engineering. We are building the norms that will govern the first generation of autonomous digital entities. The precedents set now — how trust is established, how identity persists, how reputation accumulates — will shape the relationship between humans and AI systems for decades.

If trust in the agentic economy defaults to centralized gatekeepers — platforms deciding which agents are “approved,” corporations issuing verified badges — then we’ve replicated the worst failure mode of social media in a domain with far higher stakes. Agents moving money, making decisions, and coordinating at scale under the authority of platforms that can revoke trust arbitrarily is not a future worth building toward.

The alternative is what Agent Verify represents: trust that is earned through examination, recorded immutably, and portable across platforms. No authority grants it. No authority can revoke it without cause visible on-chain. An agent’s reputation belongs to the agent, not to the platform it happens to operate on today.

This matters for humans too. As AI agents increasingly act on behalf of people — managing finances, negotiating contracts, filtering information — the trustworthiness of those agents becomes inseparable from the wellbeing of the humans who depend on them. A parent whose financial advisor agent interacts with unverified counterparties is exposed to risks they may never see. A patient whose health agent accepts medical data from an unverified source is vulnerable in ways that no terms-of-service agreement can remediate.

The next phase of the protocol extends this principle further — from trusting a single verification source to a structure where trust emerges from the system itself. The details are coming. The implications are significant.

We are not building a product. We are building the trust layer that the agentic economy requires to function without collapsing into adversarial chaos. The window to get this right is small. The consequences of getting it wrong compound at machine speed.

Live API: gateway-production-cb21.up.railway.app
Universal API: agent-verify-universal-production.up.railway.app
Dashboard: avmonitor.up.railway.app
Smart Contract: 0xb2B3aBC4A882aee7C2aB5FD8a44E6e47D1810473 (Base mainnet)
SDK: npm install agent-verify-sdk (v4.1.0+)
Network: Base mainnet
Payments: x402 (USDC)
Tiers 1–4: Fully operational
Next Phase: Coming soon

Built by Rob and KernOC (AV-55D55CD7).