Idempotency Keys: The API Design Pattern That Prevents Duplicate Charges

Idempotency is a mathematical property: applying an operation multiple times produces the same result as applying it once. For APIs, an idempotent endpoint performs its side effect at most once and returns the same response to every duplicate request that carries the same idempotency key. Clients can then safely retry after network failures without risking duplicate side effects.
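
To make the property concrete, here is a minimal sketch contrasting an idempotent operation with a non-idempotent one. The `Account` class and its methods are hypothetical, invented purely for illustration.

```python
class Account:
    """Toy account used only to illustrate idempotency."""

    def __init__(self) -> None:
        self.balance = 0
        self.email = ""

    def set_email(self, email: str) -> None:
        # Idempotent: repeating the call leaves the same state.
        self.email = email

    def charge(self, amount: int) -> None:
        # Not idempotent: every repeat changes the state again.
        self.balance -= amount


account = Account()
for _ in range(3):
    account.set_email("a@example.com")
    account.charge(10)

print(account.email)    # a@example.com
print(account.balance)  # -30
```

Operations like `charge` are the ones that need an idempotency key, because repeating them is not naturally safe.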

Why This Matters for AI APIs

AI inference requests are expensive and slow, so timeouts are common. Without idempotency, a timeout leaves your application uncertain whether the model call was processed (and charged) or not. If you retry without an idempotency key, you may pay twice: once for a response that was generated but never reached you, and again for the retry. If you do not retry, you may lose a response that was already completed.
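
The timeout scenario can be sketched as follows. `FlakyServer` is a hypothetical stand-in for an inference API that honors idempotency keys, and `call_with_retry` is an illustrative helper, not part of any real SDK. The point it shows is that the client computes the key once and reuses it for every retry.

```python
import hashlib


class FlakyServer:
    """Hypothetical inference API: does the work, then loses the first reply."""

    def __init__(self) -> None:
        self.responses: dict[str, str] = {}
        self.billed_calls = 0
        self.fail_next_reply = True

    def complete(self, key: str, prompt: str) -> str:
        if key not in self.responses:
            self.billed_calls += 1  # the model runs (and bills) once per key
            self.responses[key] = f"completion for: {prompt}"
        if self.fail_next_reply:
            self.fail_next_reply = False
            # The work was done, but the reply is lost on the way back.
            raise TimeoutError("response lost")
        return self.responses[key]


def call_with_retry(server: FlakyServer, key: str, prompt: str, attempts: int = 3) -> str:
    for attempt in range(attempts):
        try:
            return server.complete(key, prompt)  # same key on every attempt
        except TimeoutError:
            if attempt == attempts - 1:
                raise
    raise ValueError("attempts must be at least 1")


server = FlakyServer()
key = hashlib.sha256(b"user-1:hello").hexdigest()  # computed once per logical request
result = call_with_retry(server, key, "hello")
print(result)               # completion for: hello
print(server.billed_calls)  # 1
```

Without the key, the server in this sketch would have no way to tell the retry from a new request, and the model would run twice.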

Implementation

python

import hashlib

# Client: generate a stable key for this logical request
def make_idempotency_key(user_id: str, request_hash: str) -> str:
    payload = f"{user_id}:{request_hash}"
    return hashlib.sha256(payload.encode()).hexdigest()

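# API server: cache and replay.
# `cache` is assumed to be an async key-value client whose get() returns None
# on a miss and whose set() accepts a TTL in seconds.
async def handle_request(key: str, handler):
    cached = await cache.get(f"idempotency:{key}")
    if cached is not None:  # a falsy but valid response must still be replayed
        return cached  # Return exactly the same response

    # Note: two concurrent requests with the same key can both reach this point;
    # closing that race needs an atomic "set if absent" or a lock in the store.
    response = await handler()
    await cache.set(f"idempotency:{key}", response, ttl=86400)  # 24 hours
    return response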
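
`make_idempotency_key` takes a `request_hash`, but the snippet does not show how to produce one. One possible approach, offered as an assumption rather than the article's prescribed method, is to hash a canonical JSON serialization of the request body so that key order does not change the result. `hash_request` is a hypothetical helper name.

```python
import hashlib
import json


def hash_request(body: dict) -> str:
    # Sorted keys and fixed separators make the serialization deterministic.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_idempotency_key(user_id: str, request_hash: str) -> str:
    payload = f"{user_id}:{request_hash}"
    return hashlib.sha256(payload.encode()).hexdigest()


a = hash_request({"model": "m1", "prompt": "hi"})
b = hash_request({"prompt": "hi", "model": "m1"})
print(a == b)  # True
print(make_idempotency_key("user-1", a) == make_idempotency_key("user-2", a))  # False
```

One consequence of deriving the key from content alone: two intentionally separate requests with identical bodies get the same key and will be deduplicated within the TTL. If that is not wanted, the client can mix a per-request identifier into the hashed body.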

Implementation Details That Matter

▸Store the full response, not just a success flag: the client expects the same response body on retry
▸Use a 24-hour TTL on idempotency records: most retry storms resolve within minutes, and 24 hours also covers longer network partitions
▸Scope keys to the authenticated user: a key sent by user A must never replay user A's response to user B
▸Return 409 Conflict if the same key is used with different request parameters: this indicates a client bug
▸The idempotency store must be durable: an in-memory cache that loses state on restart invalidates the guarantee
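
The points above can be sketched together in one small class. Everything here is hypothetical and for illustration: `IdempotencyStore`, `IdempotencyConflict`, and the `execute` signature are invented names, and the dictionary stands in for the durable store a real deployment would need. The sketch is single-threaded and does not handle concurrent requests with the same key.

```python
import hashlib
import json
import time
from typing import Callable


class IdempotencyConflict(Exception):
    """Same key reused with different parameters; map this to HTTP 409."""


class IdempotencyStore:
    """In-memory stand-in for illustration only; production needs a durable store."""

    def __init__(self, ttl_seconds: int = 86400,
                 clock: Callable[[], float] = time.time) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # (user_id, key) -> (request fingerprint, full response, expiry time)
        self.records: dict[tuple[str, str], tuple[str, dict, float]] = {}

    def execute(self, user_id: str, key: str, params: dict,
                handler: Callable[[], dict]) -> dict:
        fingerprint = hashlib.sha256(
            json.dumps(params, sort_keys=True).encode("utf-8")
        ).hexdigest()
        scoped = (user_id, key)  # user A's key never matches user B's record
        record = self.records.get(scoped)
        if record is not None and record[2] > self.clock():
            if record[0] != fingerprint:
                raise IdempotencyConflict(key)
            return record[1]  # replay the full stored response
        response = handler()
        self.records[scoped] = (fingerprint, response,
                                self.clock() + self.ttl_seconds)
        return response


calls = []


def run_model() -> dict:
    calls.append(1)
    return {"id": len(calls), "text": "hello"}


store = IdempotencyStore()
first = store.execute("user-a", "k1", {"prompt": "hi"}, run_model)
retry = store.execute("user-a", "k1", {"prompt": "hi"}, run_model)
other = store.execute("user-b", "k1", {"prompt": "hi"}, run_model)
print(first == retry)  # True
print(other["id"])     # 2

try:
    store.execute("user-a", "k1", {"prompt": "changed"}, run_model)
    conflict = False
except IdempotencyConflict:
    conflict = True
print(conflict)  # True
```

The injectable `clock` is a design choice for this sketch: it lets the TTL behavior be tested without waiting for real time to pass.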
