What is AgentTape?¶

AgentTape records what your AI agent says and does, then replays it — so your tests run offline, for free, with zero side effects.

Record once

Run your agent against the real OpenAI API, database, and tools. AgentTape saves every call to a YAML file.
Replay forever

Run the same code with the network turned off. AgentTape serves the saved responses in milliseconds.

The one-minute version¶

AgentTape sits between your code and the outside world. It intercepts external calls — an OpenAI request, a database query, a custom Python tool — and saves the inputs and outputs to a local file called a cassette.

The next time your code runs, AgentTape blocks the real call and returns the saved response instead. Your application can't tell the difference. It thinks it's talking to live services.

import agenttape
from openai import OpenAI

def run_agent():
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Say hi in 3 words"}],
    )
    return resp.choices[0].message.content

# 1. Record — calls the real API once, writes cassettes/hello.yaml
with agenttape.use_cassette("hello", mode="record"):
    print(run_agent())

# 2. Replay — zero network, free, identical output every time
with agenttape.use_cassette("hello", mode="none"):
    print(run_agent())

What happened?

The first block called OpenAI for real and saved the prompt and response into cassettes/hello.yaml. The second block intercepted the same call, matched the prompt against the cassette, and returned the saved response without touching the network.

Why this matters¶

Tests for AI agents are usually slow, flaky, expensive, and dangerous:

Problem	Without AgentTape	With AgentTape
Cost	Every test run burns tokens	Replay is free
Speed	1–5s per LLM call	< 5ms per call
Flakiness	Models drift; outputs vary	Byte-for-byte identical
Side effects	A tool really charges the card	Replayed tools never execute
Offline	No network, no tests	Runs on a plane

The usual alternative is hand-written mocks. But mocks test your assumptions about an API, not the API itself — when the real service changes, your mocks keep passing while production breaks. AgentTape captures the real interaction once, then replays that.

Read the full motivation →

How it works¶

RecordingReplaying

flowchart LR
    A[Your code] --> B[AgentTape]
    B -->|forwards| C[Real world<br/>OpenAI · DB · Stripe]
    C -->|response| B
    B -->|saves| D[(Cassette<br/>YAML)]
    B -->|returns| A

AgentTape forwards the call, waits for the real response, saves both the request and response to the cassette, then hands the response back to your code.

flowchart LR
    A[Your code] --> B[AgentTape]
    B -->|matches request| D[(Cassette<br/>YAML)]
    D -->|saved response| B
    B -->|returns| A
    C[Real world] -.->|never called| B

AgentTape matches the request against the cassette and returns the saved response. The real world is never contacted. If no match is found, it raises an error rather than silently hitting the network.

What AgentTape captures¶

AgentTape doesn't only record LLM calls. It records every boundary your agent crosses:

LLM calls

OpenAI chat completions, responses, and embeddings — captured automatically by the built-in adapter.
Tools

Any Python function you mark with @agenttape.tool — database writes, API calls, payments.
Raw HTTP

Any httpx or requests call, even to services AgentTape has no dedicated adapter for.
Retrieval & memory

Vector-store lookups and agent memory reads/writes, tagged for clarity.

Core guarantees¶

AgentTape's promises

Local-first — no servers, no telemetry, no network during replay.
Deterministic — the same inputs always produce the same recorded output, byte-for-byte.
Zero side effects — a replayed tool never executes for real. Safe for CI.
Fail loud, never silent — an unmatched request raises an error instead of quietly calling the real service.
Git-friendly — cassettes are plain YAML you can read, diff, and hand-edit.
Zero core dependencies — the engine runs on the Python standard library alone.

Where to go next¶

Your First Recording

A guided, five-minute walkthrough from zero to replaying offline.
Quickstart

Already know the idea? Copy-paste your way into an existing project.
Core Concepts

Understand cassettes, the replay engine, determinism, and partial replay.
Python API Reference

Every function, decorator, and class, with signatures and examples.

Summary¶

AgentTape intercepts your agent's external calls — LLM and tool calls.
It saves them to readable YAML cassettes, then replays them deterministically.
Replay is offline, free, fast, and safe from side effects.
Integration is one with block or one decorator — your agent code stays unchanged.