Cassettes¶
A cassette is a plain-YAML recording of every boundary your agent crossed during a session. It's the single source of truth that replay reads from.
What's in a cassette¶
A cassette has run metadata plus an ordered list of interactions — one per boundary crossing (an LLM call, a tool call, an HTTP request).
```yaml
version: '1'
created_at: '2026-06-17T12:00:00.000000'
run_id: bee71bc9-33b8-431b-8012-00a753783931
meta:
  agenttape_version: 0.1.5
  mode: record
  freeze:
    features: [clock, random, uuid]
    base_time: 1781706140.86
    base_iso: '2026-06-17T12:00:00+00:00'
interactions:
  - index: 0
    kind: llm
    boundary: llm
    request:
      endpoint: chat.completions
      model: gpt-5.5
      messages:
        - role: user
          content: What is the weather in London?
    response:
      choices:
        - message:
            content: I'll check the weather tool.
            tool_calls:
              - function: {name: get_weather, arguments: '{"city":"London"}'}
    usage: {total_tokens: 42}
    latency_ms: 640.2
    match_key: 'sha256:9c1f...'
  - index: 1
    kind: tool
    boundary: get_weather
    request:
      name: get_weather
      args: {city: London}
    response: {temp: 15, condition: rainy}
    latency_ms: 88.0
    match_key: 'sha256:1ed9...'
```
The two layers¶
meta — about the session¶
| Field | Meaning |
|---|---|
| `agenttape_version` | The version that recorded it |
| `mode` | The mode used to record |
| `freeze` | The pinned clock/UUID/random state, so replay reproduces it byte-for-byte |
| `tags` | Optional labels you attached |
Top-level version, created_at, and run_id sit beside meta and identify the schema version and this specific run.
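To illustrate why pinning these three sources makes a run reproducible, here is a minimal sketch. It is not AgentTape's implementation: the `FrozenSources` class, the fixed seed, and the one-second clock step are assumptions invented for the example. Only `base_time` and the three feature names come from the cassette above.

```python
import random
import uuid


class FrozenSources:
    """Hypothetical stand-in for pinned clock/random/uuid state."""

    def __init__(self, base_time: float, seed: int = 0, step: float = 1.0):
        self._now = base_time            # pinned starting instant
        self._step = step                # assumed: each read advances the clock by a fixed step
        self._rng = random.Random(seed)  # private, seeded generator

    def time(self) -> float:
        """Return the frozen time, then advance it deterministically."""
        now = self._now
        self._now += self._step
        return now

    def random(self) -> float:
        """Draw from the seeded generator instead of the global one."""
        return self._rng.random()

    def uuid4(self) -> uuid.UUID:
        """Build a version-4 UUID from seeded bits instead of OS entropy."""
        return uuid.UUID(int=self._rng.getrandbits(128), version=4)


def trace(sources: FrozenSources) -> list:
    """Collect a few values that would normally differ between runs."""
    return [sources.time(), sources.time(), sources.random(), str(sources.uuid4())]


first = trace(FrozenSources(base_time=1781706140.86))
second = trace(FrozenSources(base_time=1781706140.86))
print(first == second)
```

Because every value is derived from the pinned starting state, two runs that start from the same state produce identical traces.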
interactions — what happened¶
An ordered list. Each interaction records one boundary crossing:
| Field | Meaning |
|---|---|
| `index` | Position in the run (0-based) |
| `kind` | `llm`, `tool`, `retrieval`, `memory_read`, `memory_write`, or `http` |
| `boundary` | The specific name: a tool function name, or "llm" |
| `request` | The inputs (used to match on replay) |
| `response` or `error` | The output, or a serialized exception |
| `match_key` | A `sha256:` hash of the canonical request |
| `usage`, `latency_ms`, `tags` | Captured metrics and labels |
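If you build tooling on top of cassettes, it can help to model one interaction as a typed record. The sketch below is one possible shape, not a class that AgentTape ships: the `Interaction` dataclass, the `from_mapping` helper, and the rule that exactly one of `response`/`error` is present are assumptions drawn from the table above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# The kinds listed in the table above.
KINDS = {"llm", "tool", "retrieval", "memory_read", "memory_write", "http"}


@dataclass
class Interaction:
    """Hypothetical typed view of one entry in `interactions`."""

    index: int
    kind: str
    boundary: str
    request: dict
    match_key: str
    response: Optional[Any] = None
    error: Optional[Any] = None
    usage: Optional[dict] = None
    latency_ms: Optional[float] = None
    tags: list = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")
        # Assumption: an interaction carries either a response or an error, not both.
        if (self.response is None) == (self.error is None):
            raise ValueError("expected exactly one of response or error")
        if not self.match_key.startswith("sha256:"):
            raise ValueError("match_key should start with 'sha256:'")


def from_mapping(raw: dict) -> Interaction:
    """Build an Interaction from one parsed YAML mapping."""
    return Interaction(**raw)


tool = from_mapping({
    "index": 1,
    "kind": "tool",
    "boundary": "get_weather",
    "request": {"name": "get_weather", "args": {"city": "London"}},
    "response": {"temp": 15, "condition": "rainy"},
    "latency_ms": 88.0,
    "match_key": "sha256:1ed9...",
})
print(tool.kind, tool.boundary)
```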
Why YAML?¶
Because a recording you can't read is a recording you can't trust.
```mermaid
flowchart LR
    A[Agent behavior changes] --> B[Cassette changes]
    B --> C[git diff shows it]
    C --> D[Reviewer sees exactly<br/>what changed in the PR]
```
A binary format would hide drift. Plain YAML makes every behavioral change a reviewable line in a pull request — and lets you hand-edit responses to test edge cases.
Hand-editing cassettes¶
Editing the YAML is a first-class debugging technique. To test how your code handles a malformed LLM response:
- Record a successful interaction.
- Open the cassette.
- Change `response` to the broken shape you want to test, e.g. `content: '{"invalid": json'`.
- Run your test in `mode="none"`.
Your code receives the broken payload and you exercise your error handling — no network, no prompt engineering.
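As an illustration of the kind of error handling such a hand-edited cassette exercises, here is a minimal sketch of application code that parses an LLM reply as JSON. The `parse_structured_reply` function and its return shape are hypothetical and not part of AgentTape.

```python
import json


def parse_structured_reply(content: str) -> dict:
    """Parse an LLM reply expected to be a JSON object; report failure instead of raising."""
    try:
        parsed = json.loads(content)
    except json.JSONDecodeError as exc:
        # This branch is what the hand-edited, malformed response triggers.
        return {"ok": False, "error": f"invalid JSON: {exc.msg}", "raw": content}
    if not isinstance(parsed, dict):
        return {"ok": False, "error": "expected a JSON object", "raw": content}
    return {"ok": True, "data": parsed}


broken = parse_structured_reply('{"invalid": json')
good = parse_structured_reply('{"city": "London"}')
print(broken["ok"], good["ok"])
```

Replaying the edited cassette feeds the first payload to your code deterministically, so a test can assert on the fallback branch.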
Edit responses freely; edit requests carefully
Changing a response (or usage, latency_ms) is safe. Changing a request field changes the match_key, so the recording won't match your code's call anymore — unless you change the code to match. Don't reorder interactions unless your code's call order changed too. See the editing guidelines.
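The asymmetry between the two kinds of edit can be sketched as follows. The canonical form used here (compact JSON with sorted keys) is an assumption for illustration; AgentTape's actual canonicalization may differ, and `request_key` is a hypothetical helper.

```python
import copy
import hashlib
import json


def request_key(request: dict) -> str:
    """Hash a request in an assumed canonical form: compact JSON with sorted keys."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


recorded = {
    "request": {"name": "get_weather", "args": {"city": "London"}},
    "response": {"temp": 15, "condition": "rainy"},
}
original_key = request_key(recorded["request"])

# Editing the response does not touch the request, so the key is unchanged.
edited_response = copy.deepcopy(recorded)
edited_response["response"] = {"temp": -40, "condition": "blizzard"}
same_key = request_key(edited_response["request"]) == original_key

# Editing a request field produces a different key, so the old recording no longer matches.
edited_request = copy.deepcopy(recorded)
edited_request["request"]["args"]["city"] = "Paris"
changed_key = request_key(edited_request["request"]) != original_key

print(same_key, changed_key)
```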
Where cassettes live¶
By default, cassettes live in cassettes/ next to your code; the location is configurable via cassette_dir. Large binary payloads (images, big blobs) are written to a sibling assets directory instead of being inlined, which keeps the YAML readable. Commit cassettes to Git alongside your tests.
FAQ¶
Can I read a cassette without installing AgentTape?
Yes — it's standard YAML. Any YAML parser in any language can read it. That's the basis for building tools on top of AgentTape.
What's a .derived.yaml file?
When you run with live={...} (Partial Replay), AgentTape writes the new run to name.derived.yaml instead of overwriting your original. Diff them to see what changed.
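Any text diff tool works for that comparison. As a minimal sketch, Python's standard `difflib` can produce a unified diff. The file names and the cassette fragments below are made up for the example; in practice you would read both files from disk.

```python
import difflib

# Hypothetical fragments of an original cassette and its derived counterpart.
original = """\
- index: 1
  kind: tool
  boundary: get_weather
  response: {temp: 15, condition: rainy}
"""
derived = """\
- index: 1
  kind: tool
  boundary: get_weather
  response: {temp: 9, condition: cloudy}
"""

diff = list(difflib.unified_diff(
    original.splitlines(keepends=True),
    derived.splitlines(keepends=True),
    fromfile="weather.yaml",          # assumed file name
    tofile="weather.derived.yaml",    # assumed file name
))
print("".join(diff), end="")
```

Lines prefixed with `-` come from the original recording and lines prefixed with `+` come from the derived run.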
What does match_key do?
It's how the replay engine finds the right recording for an incoming request — a hash of the request after volatile fields are dropped. You normally never touch it.
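The lookup idea can be sketched like this. Everything here is an illustration under stated assumptions, not AgentTape's replay engine: the set of volatile fields (`request_id`, `timestamp`), the JSON canonical form, and the `ReplayIndex` class are all hypothetical.

```python
import hashlib
import json

# Assumption: fields that vary between runs and should not affect matching.
VOLATILE_FIELDS = {"request_id", "timestamp"}


def strip_volatile(value):
    """Recursively drop volatile keys from nested dicts and lists."""
    if isinstance(value, dict):
        return {k: strip_volatile(v) for k, v in value.items() if k not in VOLATILE_FIELDS}
    if isinstance(value, list):
        return [strip_volatile(v) for v in value]
    return value


def match_key(request: dict) -> str:
    """Hash the request after volatile fields are removed."""
    canonical = json.dumps(strip_volatile(request), sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ReplayIndex:
    """Serve recorded responses by key, in recorded order for repeated requests."""

    def __init__(self, interactions: list):
        self._by_key = {}
        for item in interactions:
            self._by_key.setdefault(match_key(item["request"]), []).append(item)

    def lookup(self, request: dict):
        queue = self._by_key.get(match_key(request))
        if not queue:
            raise LookupError("no recording matches this request")
        return queue.pop(0)["response"]


index = ReplayIndex([{
    "request": {"name": "get_weather", "args": {"city": "London"}, "request_id": "a1"},
    "response": {"temp": 15, "condition": "rainy"},
}])
# A different request_id still matches, because that field is dropped before hashing.
reply = index.lookup({"name": "get_weather", "args": {"city": "London"}, "request_id": "zz"})
print(reply)
```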
Summary¶
- A cassette = run `meta` + an ordered list of `interactions`, in plain YAML.
- Each interaction records one boundary: its `kind`, `request`, and `response`/`error`.
- YAML keeps cassettes diffable in Git and hand-editable for edge-case testing.
- Edit responses freely; changing requests breaks matching unless code changes too.