Worked-example report v1 / Delimit team / 2026-05-12 / cross-agent handoff

One artifact, four CLIs: cross-agent handoff with Delimit

Same persistent context, across Claude Code, Codex, Cursor, and Gemini.

Claude Code hits its daily quota at 60% of a three-file refactor. delimit_session_handoff writes a structured JSON record. A second CLI runs delimit_revive and picks up at file 3 of 3 with the open question intact. The CLIs do not share session state; the artifact written through MCP carries it.

The task: 3-file refactor, started in Claude Code, resumed in Codex / Cursor / Gemini.

The pattern-accumulation point. The CLI vendors do not need to coordinate. Each CLI keeps its own process-local conversation buffer; nothing in the protocol asks them to share state. What MCP does coordinate is the tool surface: every CLI that speaks MCP to the same Delimit server reads and writes the same session-handoff record at ~/.delimit/sessions/. The handoff JSON is the shared state. The CLI is just a renderer.

What you are reading. A representative worked example of the cross-CLI handoff shape. The tool signatures are real (see the gateway repo under ai/server.py); the JSON record shape is real (see ai/ledger_manager.py); the task is synthesized to keep the example tight.

Tools used: delimit_session_handoff, delimit_revive, delimit_ledger_context, delimit_soul_capture.
Store location: ~/.delimit/sessions/session_YYYYMMDD_HHMMSS.json (one file per handoff, append-only on the directory).
CLIs in scope: Claude Code, Codex CLI, Cursor, Gemini CLI. Any MCP-capable client speaking to the same Delimit server sees the same artifact.

Setup: a three-file refactor in Claude Code

The example task: extract a shared helper out of three sibling modules. File 1 (the helper) is new; files 2 and 3 are the two callers being updated to use the new helper. Type-checking has to hold at every step, so the work order is: write the helper, update caller A, update caller B, run the type-checker, ship. Started in Claude Code, ledger item open with the goal stated, tool surface scoped to read and write on the three files plus the test directory.

File 1 lands clean. Caller A lands clean. The type-checker passes against the partial state. Caller B is the long one, with a non-obvious call site that needs a small interface adjustment. Halfway through caller B the day-quota gate trips and the Claude Code session needs to wind down.

The wall: 60% complete, no transport

Without Delimit, the recovery path is paste-the-transcript. The user copies the Claude Code conversation buffer, opens Codex (or Cursor, or Gemini CLI), pastes the buffer, and explains where the work stopped. The new agent reads a wall of past dialogue and tries to recover state by inference. Three failure modes are common at this point: the new agent misreads which file is done and re-edits a finished file; the new agent loses the open interface question and re-decides it differently; the new agent drops the policy preset that was active in the prior session and runs with a more permissive tool surface.

Each failure mode is a state-loss event. The transcript contains the words; it does not contain a structured handle the next agent can act on. The cost is paid every handoff and it scales with how long the original session ran.

With Delimit: one JSON file

In Claude Code, before winding down, the agent calls delimit_session_handoff with the seven structured fields. The MCP server writes one JSON record to ~/.delimit/sessions/. The record shape on disk:

{
  "id": "session_20260512_143012",
  "timestamp": "2026-05-12T14:30:12Z",
  "venture": "all",
  "summary": "Extracting shared helper out of 3 sibling modules. Helper landed and caller A landed clean; caller B is mid-edit at the type-adjustment for the non-obvious call site.",
  "items_completed": [],
  "items_added": [],
  "key_decisions": [
    "Helper signature locked: (input, opts?) -> Result",
    "Caller A uses default opts; caller B needs opts.strict=true"
  ],
  "blockers": [
    "Caller B's non-obvious call site expects a different return shape; small interface adjustment open"
  ],
  "files_changed": [
    "src/lib/helper.ts",
    "src/callers/a.ts",
    "src/callers/b.ts (in progress)"
  ]
}
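
A reader that consumes this record may want to confirm the shape before acting on it. The following is a minimal sketch, not Delimit's own validation: the key names come from the example record above, while the function name check_handoff_shape and the rule that every list holds only strings are assumptions made for illustration.

```python
from typing import Any

# Key names are taken from the example record; the type rules are assumptions.
STRING_KEYS = ("id", "timestamp", "venture", "summary")
LIST_KEYS = ("items_completed", "items_added", "key_decisions",
             "blockers", "files_changed")


def check_handoff_shape(record: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    for key in STRING_KEYS:
        if not isinstance(record.get(key), str):
            problems.append(f"{key}: expected a string")
    for key in LIST_KEYS:
        value = record.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key}: expected a list of strings")
    # Report keys outside the fixed shape, in a stable order.
    extra = set(record) - set(STRING_KEYS) - set(LIST_KEYS)
    for key in sorted(extra):
        problems.append(f"{key}: unexpected key")
    return problems
```

Returning a list of problems rather than raising on the first one lets a resuming agent report everything that is off in a single pass.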

The record is local-first. The MCP server owns the write; the user owns the file. Any process that can open a JSON file can read it. The handoff is not bound to the CLI that produced it, and it is not a transcript; it is a fixed-shape record, the seven handoff fields plus id and timestamp, that any agent can consume the same way.
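
Because the record is a plain JSON file, reading it needs no Delimit code at all. The sketch below assumes only the session_YYYYMMDD_HHMMSS.json naming described earlier; load_latest_handoff is a hypothetical helper name, not part of the Delimit API.

```python
import json
from pathlib import Path
from typing import Optional


def load_latest_handoff(sessions_dir: Path) -> Optional[dict]:
    """Load the newest session_*.json record, or None if there is none."""
    # session_YYYYMMDD_HHMMSS names are fixed-width, so sorting by name
    # is the same as sorting by time.
    candidates = sorted(sessions_dir.glob("session_*.json"), key=lambda p: p.name)
    if not candidates:
        return None
    with candidates[-1].open(encoding="utf-8") as fh:
        return json.load(fh)
```

A caller would point it at the store, for example load_latest_handoff(Path.home() / ".delimit" / "sessions").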

The resume: any MCP CLI

The user opens Codex (or Cursor, or Gemini CLI). The session starts. The resuming agent runs the standard session-start ritual: delimit_revive, delimit_ledger_context, delimit_gov_health.

# Inside the new CLI (any MCP-capable client)
> delimit_revive
{
  "active_task": "Extract shared helper out of 3 sibling modules",
  "files_modified": ["src/lib/helper.ts", "src/callers/a.ts", "src/callers/b.ts"],
  "open_question": "Caller B call site needs interface adjustment for return shape",
  "next_step": "Finish caller B, run type-checker, ship"
}

> delimit_ledger_context
{
  "open_items": ["LED-9001"],
  "recent_decisions": [
    "Helper signature locked: (input, opts?) -> Result",
    "Caller A uses default opts; caller B needs opts.strict=true"
  ],
  "policy_preset": "default"
}

The resuming agent picks up at caller B, with the helper signature locked, the per-caller opts decision recorded, and the open interface question stated. No transcript paste. No re-derivation of state. The MCP tool surface and the policy preset travel along with the handoff, so the resuming agent runs inside the same envelope of allowed actions the prior agent left behind.
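
To make the contrast with transcript paste concrete, here is a rough sketch of how a handoff record could be turned into a short resume brief. It is an illustration only and does not describe how delimit_revive is implemented; resume_brief and the section titles are hypothetical, and the keys are those of the on-disk record shown earlier.

```python
def resume_brief(record: dict) -> str:
    """Render a handoff record as a short plain-text brief."""
    lines = [f"Resuming {record['id']}: {record['summary']}"]
    # (title, record key) pairs; the titles are illustrative.
    sections = (
        ("Decisions locked", "key_decisions"),
        ("Open blockers", "blockers"),
        ("Files touched", "files_changed"),
    )
    for title, key in sections:
        items = record.get(key, [])
        if items:  # skip empty sections to keep the brief tight
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in items)
    return "\n".join(lines)
```

The brief stays a few lines long regardless of how long the original session ran, since it is built from the named fields rather than from the dialogue.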

Findings

1 baseline observation, 3 post-handoff shifts, 1 pattern-accumulation finding. Each finding cites the tool, the on-disk surface, and the consumer impact.

baseline state / finding F1
change type: no_shared_session_state (4 CLIs, 4 stores)
surface: Claude Code, Codex CLI, Cursor, Gemini CLI (process-local conversation buffers, no cross-tool transport)
The four CLIs in scope do not share session state with each other. Each one keeps its conversation buffer in its own process and its own on-disk format. A Claude Code chat does not show up in Codex; a Codex chat does not show up in Cursor; nothing in the protocol asks them to. The shape of the problem follows from that: when one CLI runs out of quota or context window, or has to restart, the other three see a cold start. A pasted transcript does not carry the state across: the new CLI gets a wall of past dialogue but no structured handle on what was decided, which files were touched, or what the next step is. The reset cost is paid every time, and the cost compounds the longer the original session ran.

post-handoff shift / finding F2
change type: structured_handoff_written (delimit_session_handoff)
surface: ~/.delimit/sessions/session_YYYYMMDD_HHMMSS.json (one JSON file per handoff, written by ai.ledger_manager.session_handoff)
delimit_session_handoff takes a small set of structured fields (summary, items_completed, items_added, key_decisions, blockers, files_changed, venture) and writes a single JSON record to ~/.delimit/sessions/. The shape is fixed: id, timestamp, venture, summary, the five list fields, all named. The record is local-first, owned by the user, readable by any process that can open a JSON file. Crucially, the writer is the MCP server, not Claude Code. Any CLI that speaks MCP to the same Delimit server writes to and reads from the same handoff store; the artifact is not bound to the CLI that produced it. The transcript-paste pattern is replaced by an artifact pattern, which is what makes the handoff portable.
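
The write side can be sketched in a few lines. This is an assumption-laden illustration, not the code in ai/ledger_manager.py: write_handoff is a hypothetical name, the id and timestamp formats are inferred from the example record, and the caller is assumed to pass a timezone-aware datetime.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_handoff(sessions_dir: Path, fields: dict, now: datetime) -> Path:
    """Write one handoff record and return its path.

    `fields` holds the seven handoff fields; `now` must be timezone-aware.
    """
    stamp = now.astimezone(timezone.utc)
    record = {
        "id": stamp.strftime("session_%Y%m%d_%H%M%S"),
        "timestamp": stamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
        **fields,
    }
    sessions_dir.mkdir(parents=True, exist_ok=True)
    path = sessions_dir / f"{record['id']}.json"
    # Mode "x" fails if the file exists, so an earlier record is never
    # overwritten and the directory stays append-only.
    with path.open("x", encoding="utf-8") as fh:
        json.dump(record, fh, indent=2)
    return path
```

Exclusive-create mode is one simple way to get the one-file-per-handoff, append-only behavior; two handoffs in the same second would raise FileExistsError in this sketch rather than silently replace a record.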

post-handoff shift / finding F3
change type: structured_handoff_read (delimit_revive)
surface: delimit_revive in the resuming CLI (Codex, Cursor, or Gemini); reads the latest soul + recent handoffs, surfaces them as next-step context
delimit_revive runs on the resuming side. It auto-detects the project from cwd, loads the most recent captured soul (active task, decisions, blockers), and returns it through MCP to whichever CLI invoked it. Paired with delimit_ledger_context, the resuming CLI sees: what was being worked on, which ledger items were open, which files were modified, what blockers were live, and which decisions were locked in. The next agent does not re-read the transcript and try to infer state; the state was already written down. The replayable property is the same one the merge gate ships: the artifact is addressed by its id, and any reader at the same id sees the same fields.

pattern accumulation / finding F4
change type: kernel_application_4 (handoff as cross-CLI shared state)
surface: delimit_session_handoff + delimit_revive + delimit_ledger_context + delimit_soul_capture (the cross-session continuity stack)
The Delimit governance kernel is small: classify deterministically against a published schema, write a record, let any reader replay it. The merge gate on AI-written code applies that kernel to OpenAPI diffs. The TDQS linter applies it to MCP tool definitions. The cross-CLI handoff applies it to session state: a fixed schema (the seven handoff fields), a deterministic write, a per-id replayable record. The MCP protocol carries the schema across CLI boundaries the way an OpenAPI spec carries a contract across language boundaries. No CLI vendor needs to coordinate; the artifact is the coordination point. Same primitive, fourth artifact class.

post-handoff shift / finding F5
change type: allowed_tool_envelope_preserved (no state drift across CLIs)
surface: MCP tool allowlist read on revive; gov_health checked at session start in the resuming CLI
A handoff is not just text. The resuming CLI also sees which Delimit tools are available, which policy preset is active, and what the governance kernel reported on the way out. delimit_gov_health is part of the session-start ritual on every CLI that runs the Delimit MCP; the resuming agent reads the same kernel state the prior agent left behind. If a policy preset was tightened mid-session, the next agent inherits the tightening. If a tool was disabled by license tier, the next agent is told. The envelope of allowed actions travels with the handoff, which closes the gap where a transcript paste would silently drop the policy context.

Without Delimit vs with Delimit

without delimit

Paste the conversation transcript into the next CLI.
New agent reads a wall of dialogue and infers state.
No structured handle on what is done or what is open.
Policy preset and allowed-tool envelope silently drop.
Re-derivation cost paid every handoff, every time.

with delimit

delimit_session_handoff writes a fixed-shape JSON record.
delimit_revive reads the record from any MCP CLI.
Seven named fields: venture, summary, items completed, items added, decisions, blockers, files.
Policy preset and tool envelope carried in the same handoff.
Re-derivation cost paid once, at write time.

What this report is not

Not a claim that Delimit changes how Claude Code, Codex, Cursor, or Gemini work internally. It does not. Each CLI keeps its own process-local conversation buffer; that does not change. What changes is the layer above: a fixed-shape session artifact, written through MCP, readable by every CLI that speaks MCP to the same Delimit server. The CLIs do not need to know about each other for the handoff to work, because the handoff is not between CLIs; it is between sessions, with the artifact as the shared state.

Not a replacement for a commit. A handoff is bounded cross-session context, not a substitute for shipping work to git. The two are complementary: a clean handoff makes it easy for the next agent to pick up at the open question; a clean commit makes the finished work portable. delimit_repo_diagnose and the merge gate cover the commit side.

Reproduce locally

The handoff record is addressed by its file id. Anyone with Delimit installed can write a handoff in one CLI and read it in another:

# Install the CLI and register the MCP server
npm install -g delimit-cli
delimit setup

# In your first CLI (Claude Code, Codex, Cursor, Gemini CLI)
# Ask the agent to call the MCP tool when you are ready to hand off:
#   delimit_session_handoff(summary="...", items_completed=[...], files_changed=[...])
# The MCP server writes the record to ~/.delimit/sessions/.

# Inspect the record on disk
ls ~/.delimit/sessions/
cat ~/.delimit/sessions/session_*.json | head

# In the second CLI, at session start
# Ask the agent to call delimit_revive, or run delimit resume from the shell.
# delimit_revive          loads the latest handoff plus the ledger context.
# delimit_ledger_context  open items, recent decisions, policy preset.

The JSON shape on disk is fixed. The record has nine named keys: id and timestamp, plus the seven handoff fields venture, summary, items_completed, items_added, key_decisions, blockers, and files_changed. Any tool that opens the file sees the same record; the artifact is the shared state.

For your own cross-CLI workflow

If you run more than one AI coding CLI, the cross-session cost is the cost worth optimizing. The shape above does it with one tool call at the end of a session and one at the start of the next. The Delimit CLI is on npm:

npm install -g delimit-cli

Source and issues: github.com/delimit-ai/delimit-mcp-server.