Tools¶
A tool is any function your agent uses to act on the world. Mark it with @agenttape.tool and AgentTape captures its calls during recording and mocks them on replay, so the real function never executes during replay.
Why tools need recording¶
Intercepting LLM calls saves money. Intercepting tools is about safety.
If your agent can call delete_user(id) or charge_card(amount), you cannot let those run during a test suite. Mark the function as a boundary and AgentTape guarantees it only executes during a real recording. On replay it returns the saved output and the real code is never touched.
flowchart LR
subgraph rec["Recording"]
A1[agent calls charge_card] --> B1[AgentTape]
B1 --> C1[real charge runs]
C1 --> D1[(save result)]
end
subgraph rep["Replay"]
A2[agent calls charge_card] --> B2[AgentTape]
B2 --> E2[(read result)]
B2 -.->|never runs| C2[real charge]
end
The basic pattern¶
import agenttape
import requests


@agenttape.tool
def get_user_profile(user_id: int) -> dict:
    resp = requests.get(f"https://api.example.com/users/{user_id}")
    return resp.json()
That's the only change. Inside a session it's recorded/replayed; outside a session it behaves exactly like the original function.
Recording:

- Agent calls get_user_profile(42).
- AgentTape intercepts it.
- The real function runs and hits the network.
- AgentTape saves args={user_id: 42} and the returned dict.
- The result is returned to the agent.

Replay:

- Agent calls get_user_profile(42).
- AgentTape intercepts it.
- It looks up get_user_profile with user_id=42 in the cassette.
- It returns the saved dict. The real function never runs, so no network request is made.
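The recording and replay steps above can be sketched with a toy decorator. This is a minimal illustration using only the standard library, not AgentTape's implementation: the names toy_tool, CASSETTE and STATE are hypothetical, and the in-memory dict stands in for a cassette file.

```python
import functools

# Hypothetical in-memory stand-ins (assumptions for this sketch only).
CASSETTE: dict = {}
STATE = {"mode": "record"}  # switch to "replay" to serve saved results


def toy_tool(func):
    """Toy record/replay wrapper; illustrative, not AgentTape's code."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Key the recording on the function name plus its arguments.
        key = (func.__name__, repr(args), repr(sorted(kwargs.items())))
        if STATE["mode"] == "replay":
            if key not in CASSETTE:
                raise LookupError(f"no recording for {key}")
            return CASSETTE[key]  # the real function is never called
        result = func(*args, **kwargs)  # real call, recording only
        CASSETTE[key] = result
        return result

    return wrapper


calls = []  # tracks how often the real function body runs


@toy_tool
def get_user_profile(user_id: int) -> dict:
    calls.append(user_id)  # stands in for the real network request
    return {"id": user_id, "name": "Ada"}


recorded = get_user_profile(42)  # record: the real body runs once
STATE["mode"] = "replay"
replayed = get_user_profile(42)  # replay: served from CASSETTE
print(replayed == recorded, calls)
```

After the switch to replay, the body of get_user_profile is not entered again, which is the property that makes side-effecting tools safe to exercise in a test suite.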
How arguments are matched¶
AgentTape binds your call to the function's parameter names, so the recorded request is stable and readable:
@agenttape.tool
def search(query: str, top_k: int = 3): ...
search("cats") # recorded request: {query: "cats", top_k: 3}
search("cats", top_k=5) # recorded request: {query: "cats", top_k: 5}
Defaults are filled in, and self/cls are dropped for methods. On replay, the incoming arguments must match a recording or you get an UnmatchedInteractionError.
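One way to get this kind of normalized request in plain Python is inspect.signature. The sketch below is an assumption about how such binding could be done, not a description of AgentTape's internals; bind_request and Index are hypothetical names.

```python
import inspect


def bind_request(func, *args, **kwargs) -> dict:
    """Map a call onto func's parameter names, with defaults filled in."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    request = dict(bound.arguments)
    # Drop the receiver so methods match on meaningful parameters only.
    request.pop("self", None)
    request.pop("cls", None)
    return request


def search(query: str, top_k: int = 3): ...


class Index:
    def lookup(self, term: str, exact: bool = False): ...


print(bind_request(search, "cats"))                # {'query': 'cats', 'top_k': 3}
print(bind_request(search, "cats", top_k=5))       # {'query': 'cats', 'top_k': 5}
print(bind_request(Index.lookup, Index(), "dog"))  # {'term': 'dog', 'exact': False}
```

Because positional and keyword calls bind to the same names, search("cats", 5) and search("cats", top_k=5) produce the same request in this sketch.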
Semantic boundary decorators¶
All four decorators behave identically at runtime — they only change the kind label in the cassette, which makes recordings easier to read and filter.
| Decorator | kind | Use for |
|---|---|---|
| @agenttape.tool | tool | General actions: APIs, payments, calculators, Slack |
| @agenttape.retrieval | retrieval | Vector-store / search lookups (guide) |
| @agenttape.memory_read | memory_read | Reading agent long-term memory |
| @agenttape.memory_write | memory_write | Writing agent long-term memory |
@agenttape.retrieval
def search_docs(query: str) -> list[str]:
    ...


@agenttape.tool(name="charge")  # override the recorded boundary name
def charge_card(amount: int) -> dict:
    ...
Async functions work too — decorate an async def and it's awaited normally.
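To make the "same behavior, different label" idea concrete, here is a toy decorator factory that accepts both the bare and the name=... form and wraps sync and async functions alike. It is a sketch under assumptions, not AgentTape's code: boundary, LOG, and the local tool and retrieval names are hypothetical stand-ins defined in the block itself.

```python
import asyncio
import functools
import inspect

LOG: list = []  # hypothetical stand-in for cassette entries


def boundary(kind: str):
    """Build a decorator usable as @deco or @deco(name="...")."""

    def decorator(func=None, *, name=None):
        if func is None:  # used with arguments: @deco(name="charge")
            return lambda f: decorator(f, name=name)
        label = name or func.__name__

        if inspect.iscoroutinefunction(func):
            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                result = await func(*args, **kwargs)  # awaited normally
                LOG.append((kind, label, result))
                return result
            return async_wrapper

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            LOG.append((kind, label, result))
            return result
        return wrapper

    return decorator


# Local toy decorators; they differ only in the kind label they attach.
tool = boundary("tool")
retrieval = boundary("retrieval")


@tool(name="charge")
def charge_card(amount: int) -> dict:
    return {"charged": amount}


@retrieval
async def search_docs(query: str) -> list:
    return [f"doc about {query}"]


charge_card(500)
asyncio.run(search_docs("cats"))
print(LOG)
```

Both entries in LOG go through the same wrapper logic; only the kind and the recorded name differ.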
The golden rule: serialize at the boundary¶
AgentTape serializes a tool's arguments and return value to YAML. Pass and return simple, serializable types — strings, ints, dicts, lists.
If you pass a live DB connection or a custom object, AgentTape falls back to a string like <MyObj at 0x103…>. The memory address changes every run, so matching fails on replay.
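The difference is easy to see in plain Python. The sketch below uses only the standard library; Connection and User are hypothetical classes invented for illustration.

```python
from dataclasses import asdict, dataclass


class Connection:
    """No custom __repr__, so repr() embeds the object's memory address."""


@dataclass
class User:
    id: int
    email: str


a, b = Connection(), Connection()
print(repr(a))             # default repr; includes a memory address
print(repr(a) == repr(b))  # False: two live objects never share an address

# Converting to plain data first gives a payload that is identical on every run.
payload_1 = asdict(User(id=42, email="ada@example.com"))
payload_2 = asdict(User(id=42, email="ada@example.com"))
print(payload_1 == payload_2)  # True
```

A request built from payload_1 would match one built from payload_2 on a later run, whereas anything keyed on repr(a) would not.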
Best practices¶
Tip
- Wrap only boundary functions — things that cross network/disk/DB. Don't wrap pure business logic.
- Keep boundaries small. Extract the side-effecting line into its own function and wrap that, not a 50-line handler.
- Return primitives. Convert ORM models and cursors to dicts before returning.
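A sketch of these practices together, assuming a SQLite-backed lookup. The tool function here is a no-op stand-in so the example runs without AgentTape installed (in real code it would be @agenttape.tool); make_db, fetch_user and greeting are hypothetical names.

```python
import sqlite3


def tool(func):
    # Stand-in so this sketch runs without AgentTape installed;
    # in real code this would be @agenttape.tool.
    return func


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users VALUES (1, 'Ada')")
    return conn


DB = make_db()  # the connection is never passed as a tool argument


@tool
def fetch_user(user_id: int) -> dict:
    """Small boundary: one query, primitives in, a plain dict out."""
    row = DB.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return dict(row) if row else {}


def greeting(user_id: int) -> str:
    """Pure business logic stays undecorated."""
    user = fetch_user(user_id)
    return f"Hello, {user['name']}!" if user else "Hello, stranger!"


print(greeting(1))  # Hello, Ada!
```

Only fetch_user crosses the database boundary, and it takes an int and returns a dict, so both the request and the response stay serializable.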
FAQ¶
Does a @tool do anything outside a use_cassette block?
No. With no active session it just calls the original function. AgentTape only intercepts inside a session.
Can I record a method on a class?
Yes. self/cls are stripped from the recorded arguments automatically, so only the meaningful parameters are matched.
What about a one-off call I can't decorate?
Use the low-level record_call(...) helper to route a single boundary crossing through the active session with an explicit request payload.
Summary¶
- @agenttape.tool makes any function a recorded boundary.
- Recording runs it for real; replay returns the saved output and never executes it.
- Use retrieval / memory_read / memory_write for clearer cassettes.
- Pass and return serializable primitives so matching stays stable.