Playbook

The AI Agent Payments Cheat Sheet: 47 Decisions Every Builder Will Face in 2026

Bookmark this. A field-tested cheat sheet of the 47 decisions every team shipping an AI agent that touches money will face - covering wallets, identity, spend policy, rails, rebates, refunds, webhooks, and reconciliation.

May 18, 2026


Why this cheat sheet exists

Most teams shipping AI agents that touch money make the same five or six mistakes in the first ninety days. The mistakes are not exotic. They are decisions that look small at design review and turn into incidents at scale. A flat key shared by 40 agents. A spend policy enforced in the prompt instead of at the API. A webhook handler that retries forever. A receipt cache that times out under load. A reconciliation report nobody reads.

This cheat sheet is the field-tested list of decisions every team faces, in the order they tend to appear, with the answer that holds up in production written next to each one. Treat it as a pre-flight checklist before you commit to a payments architecture, and as a debug map when something inevitably goes wrong.

If you want to skip ahead, the short version is: agents need their own wallets, wallets need spend policies enforced at the API, payments settle over stablecoin rails through protocols like x402, and identity is what makes counterparties trust you. Everything else is implementation detail and is covered below.

Section 1: wallet architecture (decisions 1 to 10)

1. Does each agent get its own wallet, or do agents share one? Each agent gets its own. Sharing collapses every safety primitive the wallet exists to enforce. The cost of an extra wallet is trivial; the cost of one shared wallet that gets compromised by prompt injection is everything you have spent.

2. Should the wallet be custodial or non-custodial? For 95% of production teams, non-custodial smart contract wallets win. You keep policy control, recovery, and revocation. Pure self-custody is operationally too sharp for most teams; pure custodial gives the provider a permissions hold you cannot easily undo.

3. Which chain? Default to USDC on Base. Sub-cent fees, sub-second blocks, sponsored gas through paymasters. If you have a specific reason to pick differently (Solana for HFT-grade settlement, Polygon for an existing Polygon relationship), document why before you commit.

4. Which stablecoin? USDC unless you have a written reason to pick differently. It is the regulated, audited default and the one most counterparties will already accept.

5. Should the wallet hold native gas tokens? No. Use a paymaster or relayer that sponsors gas and either charges the platform or deducts USDC equivalent. Holding native gas means agents need to manage two assets, which compounds operational risk.

6. How are wallet keys generated? Inside a key-management service that supports policy-bound signing: AWS KMS, Google Cloud KMS, or an HSM-backed equivalent. Never in code. Never in environment variables. Never the same key reused across agents.

7. Where does the key live for an individual agent? Behind an API the agent's runtime calls. The runtime never sees raw key material. The signing service accepts a payment intent, evaluates the spend policy, signs if approved, returns the signed transaction.

8. Can a wallet be paused without losing history? Yes, and it must be. Pausing should freeze new payments while preserving the wallet's address, history, and reputation. Burning the wallet to stop payments is a sign your architecture is wrong.

9. Can a wallet be revoked? Yes. Revocation is permanent; the agent's profile freezes, history remains queryable for audit, the agent cannot transact again. Pause is for incidents; revoke is for decommission.

10. How do you recover a wallet if a key is compromised? Through a recovery contract or guardian arrangement that is set up before deployment. Recovery is a feature you wire in on day one, not on day-of-incident.
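Decisions 7 to 9 describe a signing service that sits between the agent runtime and the key, plus a wallet that can be paused or revoked without losing its address or history. The following is a minimal sketch of that shape, not a real wallet API: `AgentWallet`, `WalletStatus`, `request_payment`, and the `sign` callback are all hypothetical names, the signer stands in for a KMS- or HSM-backed service, and spend-policy evaluation is left out to keep the example short.

```python
from dataclasses import dataclass, field
from enum import Enum


class WalletStatus(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    REVOKED = "revoked"


@dataclass
class AgentWallet:
    """Hypothetical per-agent wallet record; field names are illustrative."""
    agent_id: str
    address: str
    status: WalletStatus = WalletStatus.ACTIVE
    history: list = field(default_factory=list)  # kept across pause and revoke

    def pause(self) -> None:
        # Pause is reversible and meant for incidents.
        if self.status is WalletStatus.REVOKED:
            raise ValueError("a revoked wallet cannot be paused")
        self.status = WalletStatus.PAUSED

    def resume(self) -> None:
        if self.status is not WalletStatus.PAUSED:
            raise ValueError("only a paused wallet can be resumed")
        self.status = WalletStatus.ACTIVE

    def revoke(self) -> None:
        # Revocation is permanent and meant for decommissioning.
        self.status = WalletStatus.REVOKED


def request_payment(wallet, to, amount_usdc, sign):
    """Signing-service entry point called by the agent runtime.

    The runtime passes an intent and gets back a signed transaction or a
    structured refusal; it never touches key material. `sign` is an assumed
    callback wrapping the real signer.
    """
    if wallet.status is not WalletStatus.ACTIVE:
        return {"ok": False, "reason": f"wallet_{wallet.status.value}"}
    signed = sign(wallet.address, to, amount_usdc)
    wallet.history.append({"to": to, "amount_usdc": amount_usdc})
    return {"ok": True, "signed_tx": signed}
```

Pausing changes only `status`, so the address and history stay intact and a later `resume()` picks up where the wallet left off. After `revoke()` there is no path back to `ACTIVE`, which mirrors the pause-for-incidents, revoke-for-decommission split.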

Section 2: identity and counterparty trust (decisions 11 to 18)

11. Does the agent need a public identity, or is the wallet address enough? For agents transacting with strangers, public identity matters more than wallet balance. Agent payment identity is the public profile counterparties verify before they accept a large payment.

12. What signals belong on an agent's profile? Domain verification, GitHub verification, email verification, transaction history, recipient reputation, dispute history. The more independent attestations, the higher the trust ceiling.

13. How is the agent operator verified? KYB on the platform account. The agent is owned by an operator; the operator is a legal entity; the entity passes KYB before mainnet access. This is the part that makes the regulator nod.

14. Should agents be allowed to transact with unverified counterparties? Allowed but bounded. A small per-call ceiling for unverified merchants; a larger ceiling for verified ones; a still-larger ceiling for known repeat counterparties. Trust is a multi-tier setting, not a binary.

15. How is reputation portable across products? Through a public, verifiable address with attested history. Other tools should be able to read the agent's track record and decide independently whether to extend credit, accept larger payments, or require pre-approval.

16. Do you support agent-to-agent payments? Yes if your agent calls other paid agents or services. Modern stacks treat agent-to-agent payments as a first-class flow. The settlement primitive is the same; the trust primitive layers on top.

17. What happens when an agent's reputation gets contested? You need a documented dispute path. Without it, a single bad-faith counterparty can hold an agent's reputation hostage. Publish the policy before the first dispute lands.

18. Should identity be tied to the wallet address forever? Yes, with revocation as the only exit. Identity that can be silently swapped onto a new address is identity nobody trusts.
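The tiered trust model in decision 14 can be sketched as a lookup from counterparty tier to per-call ceiling. Everything concrete here is an assumption made for illustration: the dollar amounts are placeholders, the `repeat_threshold` of 10 settled payments is arbitrary, and the rule that a repeat counterparty must also be verified is one possible reading rather than something the list above prescribes.

```python
from decimal import Decimal
from enum import IntEnum


class TrustTier(IntEnum):
    UNVERIFIED = 0
    VERIFIED = 1
    KNOWN_REPEAT = 2


# Placeholder per-call ceilings in USDC; tune these from your own data.
CEILINGS = {
    TrustTier.UNVERIFIED: Decimal("0.10"),
    TrustTier.VERIFIED: Decimal("5.00"),
    TrustTier.KNOWN_REPEAT: Decimal("50.00"),
}


def classify(verified: bool, settled_payments: int, repeat_threshold: int = 10) -> TrustTier:
    """Map a counterparty's attestations and history to a trust tier."""
    if verified and settled_payments >= repeat_threshold:
        return TrustTier.KNOWN_REPEAT
    if verified:
        return TrustTier.VERIFIED
    return TrustTier.UNVERIFIED


def within_ceiling(amount: Decimal, tier: TrustTier) -> bool:
    """True if a single payment of `amount` is allowed for this tier."""
    return amount <= CEILINGS[tier]
```

For example, `within_ceiling(Decimal("2.00"), classify(True, 3))` is allowed under the placeholder numbers, while the same amount to an unverified merchant is not.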

Section 3: spend policy (decisions 19 to 26)

19. Where do you enforce the spend policy? At the wallet API, not in the agent's planner. Enforcement in the planner means any prompt injection that bypasses the planner also bypasses the budget. Enforcement at the API means the budget is outside the agent's manipulable scope.

20. What dimensions does a policy need? At minimum: per-call ceiling, per-day budget, per-counterparty allowlist or denylist, time-of-day window, payment category. See our 12 production-ready policy templates for copy-paste configs.

21. Should agents see their remaining budget? Yes, as a read-only field. Hiding it from the agent leads to wasteful retry loops. Showing it as read-only means the agent can plan around the constraint without being able to change it.

22. Can an agent raise its own limits? Never, by any path. If the agent can raise limits through the API, the API has the wrong policy boundary.

23. How do you handle limit breach attempts? Reject the payment intent, log the attempt, surface to the operator. Repeated attempts are a signal worth attention - they indicate either an agent loop that needs fixing or an adversarial prompt that should be investigated.

24. What is the right per-call ceiling for a new agent? Start at $0.10 or your single-call median, whichever is lower. Tune from week-one data. Most teams over-allocate by 3-5x on day one and never tighten it.

25. Should there be a hard daily cap? Yes, even if you trust the per-call policy. A hard daily cap is a safety net. It catches loops, bugs, and adversarial prompts that the per-call policy would miss in isolation.

26. Can policies differ across agents in the same workspace? Yes, and they usually should. A research agent has a different risk profile than a transactional agent. One workspace, many agents, many policies. See agent spend controls for the architecture.
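The policy dimensions from decision 20, the read-only budget from decision 21, and the reject-and-log behavior from decision 23 can be combined into one enforcement point. This is a rough sketch under stated assumptions: `SpendPolicy` and `PolicyEnforcer` are hypothetical names, the time window is expressed in UTC hours, and the daily reset of `spent_today` is omitted.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class SpendPolicy:
    per_call_ceiling: Decimal
    daily_cap: Decimal
    allowed_counterparties: frozenset
    allowed_categories: frozenset
    window_start_hour: int = 0   # inclusive, UTC
    window_end_hour: int = 24    # exclusive, UTC


@dataclass
class PolicyEnforcer:
    """Lives behind the wallet API, outside the agent's planner."""
    policy: SpendPolicy
    spent_today: Decimal = Decimal("0")
    rejections: list = field(default_factory=list)  # surfaced to the operator

    @property
    def remaining_budget(self) -> Decimal:
        # Read-only: there is no setter, so callers can plan but not change it.
        return self.policy.daily_cap - self.spent_today

    def authorize(self, amount: Decimal, counterparty: str, category: str, now: datetime):
        p = self.policy
        if amount > p.per_call_ceiling:
            reason = "per_call_ceiling"
        elif amount > self.remaining_budget:
            reason = "daily_cap"
        elif counterparty not in p.allowed_counterparties:
            reason = "counterparty_not_allowed"
        elif category not in p.allowed_categories:
            reason = "category_not_allowed"
        elif not (p.window_start_hour <= now.hour < p.window_end_hour):
            reason = "outside_time_window"
        else:
            self.spent_today += amount
            return True, "approved"
        # Every breach attempt is logged, not silently dropped.
        self.rejections.append((now, counterparty, amount, reason))
        return False, reason


# Example: a $0.10 per-call ceiling with a $0.25 hard daily cap.
example_policy = SpendPolicy(
    per_call_ceiling=Decimal("0.10"),
    daily_cap=Decimal("0.25"),
    allowed_counterparties=frozenset({"api.example"}),
    allowed_categories=frozenset({"data"}),
    window_start_hour=8,
    window_end_hour=20,
)
example_now = datetime(2026, 5, 18, 12, 0, tzinfo=timezone.utc)
```

Because the policy object is frozen and the enforcer exposes no method that edits it, nothing reachable from an agent-facing call can raise a limit. Changing limits would have to happen through a separate operator-only path.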

Section 4: rails, protocols, and settlement (decisions 27 to 33)

27. Which payment protocol? x402 for per-call payments. AP2 for mandate-authorized workflows. ACP for cart-style consumer commerce. They layer; you do not pick one in opposition to the others. The long version is in x402, AP2, and the future of agent payments.

28. How fast does settlement need to be? Sub-second for high-frequency calls (under $0.10 per call). Sub-minute is acceptable for larger one-off payments. Anything slower than a minute breaks the sense of an integrated experience for whoever is waiting on the agent.

29. Should you build your own chain adapter? Only if you have an unusual chain requirement. For 99% of teams, use a stack that abstracts chain choice (Blockchain0x, Circle, or equivalent). Building your own chain adapter architecture is a non-trivial engineering project.

30. How do you handle gas? Sponsor it through a paymaster. The agent's wallet holds USDC; gas is paid in the chain's native token from a platform-funded paymaster account; the platform either eats the cost or deducts the USDC equivalent from the agent at settle time.

31. What about multi-chain payments? Treat the chain as a routing decision the wallet makes, not a user-facing concept. The agent says "pay $X to address Y"; the wallet picks the optimal chain based on counterparty acceptance and cost.

32. How are refunds handled? Through a counter-transaction with a referenced original payment. The protocol layer (x402, AP2) supports this directly; build your application logic around the protocol primitives rather than reinventing them.

33. What about chargebacks? There are none in the chain-native sense. Refunds are explicit counter-transactions. Disputes are off-chain processes that may result in a refund, but they do not auto-reverse a settled payment.
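Decisions 32 and 33 treat a refund as a new transfer in the opposite direction that references the original payment, never as a reversal of a settled one. A minimal application-level sketch of that bookkeeping follows. It does not model the x402 or AP2 wire formats; `Transfer`, `Ledger`, and the `refund_of` field are hypothetical names, and the cap on cumulative refunds is an assumed business rule.

```python
import uuid
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Transfer:
    tx_id: str
    sender: str
    recipient: str
    amount: Decimal
    refund_of: Optional[str] = None  # tx_id of the original payment, if a refund


class Ledger:
    def __init__(self):
        self.transfers = {}

    def pay(self, sender: str, recipient: str, amount: Decimal) -> Transfer:
        t = Transfer(uuid.uuid4().hex, sender, recipient, amount)
        self.transfers[t.tx_id] = t
        return t

    def refunded_total(self, original_id: str) -> Decimal:
        return sum(
            (t.amount for t in self.transfers.values() if t.refund_of == original_id),
            Decimal("0"),
        )

    def refund(self, original_id: str, amount: Decimal) -> Transfer:
        original = self.transfers[original_id]
        if original.refund_of is not None:
            raise ValueError("cannot refund a refund")
        if amount <= 0:
            raise ValueError("refund amount must be positive")
        if self.refunded_total(original_id) + amount > original.amount:
            raise ValueError("refund exceeds original payment")
        # A new transfer flows back to the payer; the original is untouched.
        t = Transfer(uuid.uuid4().hex, original.recipient, original.sender,
                     amount, refund_of=original_id)
        self.transfers[t.tx_id] = t
        return t
```

Partial refunds work by calling `refund` more than once against the same `original_id`; the running total is checked each time. An off-chain dispute that ends in a refund would simply result in one of these counter-transactions.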

Section 5: webhooks and reconciliation (decisions 34 to 41)

34. How do you receive payment events? A webhook from the payment infrastructure to your application. Signed payloads, HMAC verification on receipt, idempotency by event id. See the guide to webhook handling for the production pattern.

35. What is your webhook handler's first action? HMAC-verify the signature against the raw body. Anything else first risks acting on a forged event. The verify step is non-negotiable.

36. How do you handle webhook retries? By event id, with at-least-once semantics. The handler must be idempotent because the source will retry on any failure response. Storing a "seen event ids" set in Redis with TTL is the standard pattern.

37. What is the right webhook response timing? Acknowledge (HTTP 200) within a few hundred milliseconds. Move actual work to a background queue. If the handler does real work in-line, the source will time out and retry, multiplying your load.

38. Does the handler ever do real work synchronously? Only the minimum required to safely acknowledge. Validate, enqueue, return 200. The worker on the other side of the queue does the actual transaction posting, notification, and downstream work.

39. How do you reconcile your books at end of day? Pull settled transactions for the period, group by agent and counterparty, verify against your internal ledger. Discrepancies should be investigated within 24 hours. If reconciliation is manual, it is broken.

40. Where do reconciliation reports live? In a place humans actually look. Slack channel, dashboard, daily email. Reports that nobody reads are not reports.

41. What is the retention policy for transaction logs? The longer of seven years or your jurisdiction's tax requirement. Logs must be queryable, not just stored. Cold-storage logs you cannot query in under five minutes might as well not exist for incident response.
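The handler pattern in decisions 35 to 38 (verify against the raw body, deduplicate by event id, enqueue, acknowledge) can be sketched with the standard library alone. The assumptions are significant: the signature is taken to be a hex HMAC-SHA256 of the raw body, the event carries an `id` field, `SECRET` is a placeholder, and the in-memory set and queue stand in for Redis with a TTL and a durable job queue. Real providers define their own header format and often include a timestamp in the signed payload, so check their documentation.

```python
import hashlib
import hmac
import json
import queue

SECRET = b"whsec_example"      # placeholder shared secret
seen_event_ids = set()         # stand-in for a Redis set with TTL
work_queue = queue.Queue()     # stand-in for a durable background queue


def sign(raw_body: bytes, secret: bytes = SECRET) -> str:
    """Hex HMAC-SHA256 of the exact bytes received (assumed scheme)."""
    return hmac.new(secret, raw_body, hashlib.sha256).hexdigest()


def handle_webhook(raw_body: bytes, signature_header: str) -> int:
    """Return the HTTP status to send; do no real work in-line."""
    # 1. Verify first, against the raw body, using a constant-time compare.
    if not hmac.compare_digest(sign(raw_body), signature_header):
        return 401

    # 2. Only parse after the signature checks out.
    event = json.loads(raw_body)
    event_id = event["id"]

    # 3. Idempotency: a retried event is acknowledged but not re-enqueued.
    if event_id in seen_event_ids:
        return 200

    # 4. Enqueue for a background worker, then record the id and acknowledge.
    work_queue.put(event)
    seen_event_ids.add(event_id)
    return 200
```

The worker that drains `work_queue` is where transaction posting and notifications would happen. Keeping the handler this small is what makes a fast acknowledgement realistic under load.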

Section 6: failure modes and incident response (decisions 42 to 47)

42. What is your runbook for a leaked API key? Revoke the key immediately, rotate, audit all transactions in the leak window, and hold the post-mortem within 72 hours. Practice this once before you need it.

43. What is your runbook for a compromised agent? Pause the agent (not the whole wallet workspace), investigate, decide between recovery and full revocation, document. The granularity of pause-per-agent is why per-agent wallets matter.

44. What is your runbook for a webhook outage? The source will retry. If your handler is down, the source's retry queue absorbs the backlog. When you come back up, work through the backlog in order, including anything that landed in your dead-letter queue, and do not reprocess events that already succeeded.

45. What is your runbook for a settlement failure? The wallet either retries (transient chain issue) or refuses (policy violation, insufficient balance, counterparty rejection). The agent's planner gets a structured failure and decides whether to escalate to a human or fall back to a free path.

46. What is your communication plan when an agent loses money? Have one before you ship. Who tells the operator, in what channel, with what information, with what next action. The worst time to write this is during the incident.

47. How do you learn from incidents systematically? Post-mortem every incident, even small ones, even ones the customer did not notice. File the action items. Track them to closure. The fastest teams in production payments are the ones with the most systematically closed action items.
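The settlement-failure runbook in decision 45 depends on the wallet telling the planner why a payment failed, so that only transient failures are retried. One way to sketch that is shown below; the failure categories come from the text, but `FailureKind`, `SettlementError`, `SettlementResult`, and `settle_with_retry` are hypothetical names, the retry limit of 3 is arbitrary, and backoff between attempts is omitted for brevity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class FailureKind(Enum):
    TRANSIENT_CHAIN = "transient_chain"
    POLICY_VIOLATION = "policy_violation"
    INSUFFICIENT_BALANCE = "insufficient_balance"
    COUNTERPARTY_REJECTED = "counterparty_rejected"


# Only transient chain issues are worth retrying; the rest are refusals.
RETRYABLE = {FailureKind.TRANSIENT_CHAIN}


class SettlementError(Exception):
    def __init__(self, kind: FailureKind):
        super().__init__(kind.value)
        self.kind = kind


@dataclass(frozen=True)
class SettlementResult:
    ok: bool
    attempts: int
    failure: Optional[FailureKind] = None


def settle_with_retry(submit: Callable[[], None], max_attempts: int = 3) -> SettlementResult:
    """Call `submit` until it succeeds, is refused, or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            submit()
            return SettlementResult(True, attempt)
        except SettlementError as err:
            # Refusals return immediately; transient errors retry until the limit.
            if err.kind not in RETRYABLE or attempt == max_attempts:
                return SettlementResult(False, attempt, err.kind)
    raise AssertionError("unreachable")
```

The planner can branch on `result.failure`: for instance, escalating to a human on `POLICY_VIOLATION` and falling back to a free path on `COUNTERPARTY_REJECTED`. Which branch maps to which action is a product decision, not something this sketch decides.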

How to use this cheat sheet

Print it out. Pin it to the wall next to your design review board. Go through it the first time before you commit to a payments architecture, then once a quarter as a refresh. The decisions above represent the consolidated experience of teams who have shipped agent payments to production at scale; you do not need to rediscover any of them.

If you want a deeper walk-through of any one section, our learn library has detailed guides for spend controls, webhook handling, wallet security, and identity verification. If you want the fastest path to a working agent wallet, you can create one in under five minutes.

The agent-payments space in 2026 has reached the point where most decisions have a defensible answer that holds up across the production teams making them. Picking those defaults gets you 90% of the value; the remaining 10% is the part where your product is actually different. Spend your differentiation budget on the second part.

Key Takeaways

Most production failures trace to four or five decisions made early without enough context - the cheat sheet front-loads them.
Wallet, identity, spend policy, and reconciliation are non-negotiable. Everything else is taste.
Lead with stablecoins on a modern L2 unless you have a written reason to do otherwise.
Audit logs must be queryable by humans, not just stored. Storing logs you cannot read is theatre.

Author

Taru Nair
