Harbor

Agent Trajectory Format (ATIF)

Understanding and working with the Agent Trajectory Interchange Format

Overview

The Agent Trajectory Interchange Format (ATIF) is a standardized, JSON-based specification for logging the complete interaction history of autonomous LLM agents. ATIF unifies the data requirements of conversational logs, action sequences, and replayable data structures, ensuring collected data is immediately usable across debugging, visualization, Supervised Fine-Tuning (SFT), and Reinforcement Learning (RL) pipelines.

For the complete specification, see the ATIF RFC.

Key Features

ATIF provides a comprehensive format that captures:

  • Complete interaction history: User messages, agent responses, tool executions, and environment feedback
  • Multi-turn conversations: Support for both single-turn tasks and extended conversational interactions
  • LLM metrics: Token usage, costs, logprobs, and other operational metrics
  • Tool calls and observations: Structured logging of agent actions and their results
  • Multi-agent systems: Support for subagent delegation and hierarchical architectures
  • Extensibility: Optional extra fields at all levels for custom metadata

Harbor Support

Harbor provides first-class support for ATIF through:

  1. Pydantic models for type-safe trajectory construction and validation
  2. Trajectory validator that checks trajectory files against the ATIF schema
  3. Automatic trajectory generation by integrated agents

Supported Agents

The following agents in Harbor automatically generate ATIF-compliant trajectories:

  • Terminus-2 - Harbor's reference agent implementation
  • OpenHands - Converts OpenHands event logs to ATIF format
  • Mini-SWE-Agent - Software engineering agent with trajectory support
  • Gemini CLI - Google's Gemini agent interface
  • Claude Code - Anthropic's code agent

OpenHands Example

OpenHands illustrates how Harbor converts an agent-specific format to ATIF. After a run, the OpenHands integration reads the event files the agent wrote and converts them into a standardized ATIF trajectory:

# From harbor/agents/installed/openhands.py
def populate_context_post_run(self, context: AgentContext) -> None:
    """Convert OpenHands events to ATIF trajectory format."""
    # Get the session directory
    session_dir = self._get_session_dir()
    events_dir = session_dir / "events"

    # Convert events to trajectory
    trajectory = self._convert_events_to_trajectory(events_dir)

    # Write trajectory.json using the trajectory model's to_json_dict() method
    trajectory_path = self.logs_dir / "trajectory.json"
    with open(trajectory_path, "w") as f:
        json.dump(trajectory.to_json_dict(), f, indent=2)

    # Populate context from trajectory
    if trajectory.final_metrics:
        context.cost_usd = trajectory.final_metrics.total_cost_usd
        context.n_input_tokens = trajectory.final_metrics.total_prompt_tokens
        context.n_output_tokens = trajectory.final_metrics.total_completion_tokens

The conversion process:

  1. Reads OpenHands event files
  2. Maps events to ATIF steps (system/user/agent)
  3. Converts accumulated metrics to per-step deltas
  4. Creates a complete Trajectory object using Pydantic models
  5. Exports to JSON format
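
Step 3 is the least obvious part of the conversion, so here is a minimal sketch of turning running totals into per-step deltas. It uses plain dictionaries rather than Harbor's models, and the function name `cumulative_to_deltas` and the sample numbers are hypothetical; the actual OpenHands converter may handle this differently.

```python
from typing import Dict, List


def cumulative_to_deltas(snapshots: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Turn running totals into per-step increments (illustrative helper)."""
    deltas: List[Dict[str, int]] = []
    previous: Dict[str, int] = {}
    for snapshot in snapshots:
        # Subtract the previous running total; a key seen for the first time starts from 0.
        deltas.append(
            {key: value - previous.get(key, 0) for key, value in snapshot.items()}
        )
        previous = snapshot
    return deltas


# Hypothetical running totals reported after each of two LLM calls.
cumulative = [
    {"prompt_tokens": 520, "completion_tokens": 80},
    {"prompt_tokens": 1120, "completion_tokens": 124},
]
per_step = cumulative_to_deltas(cumulative)
print(per_step)
```

Each resulting dictionary holds only the tokens attributable to that step, which is the shape a per-step metrics record expects.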

Data Classes

Harbor provides Pydantic models for all ATIF schema components in harbor.models.trajectories:

Core Models

Trajectory - Root-level trajectory object

from harbor.models.trajectories import Trajectory, Agent, Step

trajectory = Trajectory(
    schema_version="ATIF-v1.4",
    session_id="session-123",
    agent=Agent(
        name="my-agent",
        version="1.0.0",
        model_name="claude-3-5-sonnet-20241022"
    ),
    steps=[
        # ... steps
    ]
)

Agent - Agent configuration

from harbor.models.trajectories import Agent

agent = Agent(
    name="openhands",
    version="0.9.0",
    model_name="gpt-4",
    extra={"agent_class": "CodeActAgent"}
)

Step - Individual interaction step

from harbor.models.trajectories import (
    Metrics,
    Observation,
    ObservationResult,
    Step,
    ToolCall,
)

# User step
user_step = Step(
    step_id=1,
    timestamp="2025-01-15T10:30:00Z",
    source="user",
    message="Create a file called hello.txt with 'Hello, world!' as the content."
)

# Agent step with tool calls
agent_step = Step(
    step_id=2,
    timestamp="2025-01-15T10:30:02Z",
    source="agent",
    model_name="claude-3-5-sonnet-20241022",
    message="I'll create the file for you.",
    reasoning_content="The user wants a simple text file. I'll use the file_write tool.",
    tool_calls=[
        ToolCall(
            tool_call_id="call_1",
            function_name="file_write",
            arguments={"path": "hello.txt", "content": "Hello, world!"}
        )
    ],
    observation=Observation(
        results=[
            ObservationResult(
                source_call_id="call_1",
                content="File created successfully"
            )
        ]
    ),
    metrics=Metrics(
        prompt_tokens=520,
        completion_tokens=80,
        cached_tokens=200,
        cost_usd=0.00045
    )
)

ToolCall - Tool/function invocation

from harbor.models.trajectories import ToolCall

tool_call = ToolCall(
    tool_call_id="call_price_1",
    function_name="financial_search",
    arguments={"ticker": "GOOGL", "metric": "price"}
)

Observation - Environment feedback

from harbor.models.trajectories import Observation, ObservationResult

observation = Observation(
    results=[
        ObservationResult(
            source_call_id="call_price_1",
            content="GOOGL is currently trading at $185.35"
        )
    ]
)

Metrics - LLM operational metrics

from harbor.models.trajectories import Metrics

metrics = Metrics(
    prompt_tokens=520,
    completion_tokens=80,
    cached_tokens=200,
    cost_usd=0.00045,
    logprobs=[-0.1, -0.05, -0.02],  # Optional
    completion_token_ids=[1722, 310, 5533]  # Optional
)

FinalMetrics - Trajectory-level aggregate metrics

from harbor.models.trajectories import FinalMetrics

final_metrics = FinalMetrics(
    total_prompt_tokens=1120,
    total_completion_tokens=124,
    total_cached_tokens=200,
    total_cost_usd=0.00078,
    total_steps=3
)
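
The totals above match the sums of the per-step metrics in the complete example later on this page. As a rough sketch of that relationship, the snippet below aggregates per-step values using plain dictionaries. The helper `aggregate_metrics` is hypothetical and not part of Harbor, and counting every step (including user steps) in `total_steps` is an assumption based on that example.

```python
from typing import Dict, List, Optional, Union

Number = Union[int, float]


def aggregate_metrics(step_metrics: List[Optional[Dict[str, Number]]]) -> Dict[str, Number]:
    """Sum per-step metrics into trajectory-level totals (illustrative helper)."""
    totals: Dict[str, Number] = {
        "total_prompt_tokens": 0,
        "total_completion_tokens": 0,
        "total_cached_tokens": 0,
        "total_cost_usd": 0.0,
    }
    for metrics in step_metrics:
        if metrics is None:
            continue  # steps without metrics (for example user steps) add nothing
        totals["total_prompt_tokens"] += metrics.get("prompt_tokens", 0)
        totals["total_completion_tokens"] += metrics.get("completion_tokens", 0)
        totals["total_cached_tokens"] += metrics.get("cached_tokens", 0)
        totals["total_cost_usd"] += metrics.get("cost_usd", 0.0)
    # Round away floating-point noise from adding small dollar amounts.
    totals["total_cost_usd"] = round(totals["total_cost_usd"], 10)
    # Assumption: total_steps counts every step, with or without metrics.
    totals["total_steps"] = len(step_metrics)
    return totals


final = aggregate_metrics([
    None,  # user step
    {"prompt_tokens": 520, "completion_tokens": 80, "cached_tokens": 200, "cost_usd": 0.00045},
    {"prompt_tokens": 600, "completion_tokens": 44, "cost_usd": 0.00033},
])
print(final)
```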

Export to JSON

All models provide a to_json_dict() method for clean JSON export:

from harbor.models.trajectories import Trajectory
import json

# Build trajectory using Pydantic models
trajectory = Trajectory(...)

# Export to JSON (excludes None values by default)
trajectory_dict = trajectory.to_json_dict()

# Write to file
with open("trajectory.json", "w") as f:
    json.dump(trajectory_dict, f, indent=2)

# Include None values if needed
trajectory_dict_full = trajectory.to_json_dict(exclude_none=False)
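
To make the effect of excluding None values concrete, the following sketch drops keys whose value is None from a nested dictionary. It is only an illustration of the resulting output shape: `drop_none` is a hypothetical helper, and how `to_json_dict()` is implemented internally is not shown here.

```python
import json
from typing import Any


def drop_none(value: Any) -> Any:
    """Recursively remove dictionary keys whose value is None (illustrative helper)."""
    if isinstance(value, dict):
        return {key: drop_none(item) for key, item in value.items() if item is not None}
    if isinstance(value, list):
        # List items are kept as-is apart from recursing into them.
        return [drop_none(item) for item in value]
    return value


# A user step in which the optional agent fields were never set.
step = {
    "step_id": 1,
    "source": "user",
    "message": "hi",
    "tool_calls": None,
    "metrics": None,
}
compact = drop_none(step)
print(json.dumps(compact, indent=2))
```

The compact form keeps trajectory files small and avoids null-valued optional fields on steps where they do not apply.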

Validation

Harbor's trajectory validator checks ATIF trajectory files from the command line or from Python.

Command Line

# Validate a trajectory file
python -m harbor.utils.trajectory_validator trajectory.json

Output:

✓ Trajectory is valid: trajectory.json

Or for invalid trajectories:

✗ Trajectory validation failed: trajectory.json

Found 2 error(s):
  - trajectory.steps.0.step_id: expected 1 (sequential from 1), got 0
  - trajectory.agent.name: required field is missing

Programmatic Usage

from harbor.utils.trajectory_validator import TrajectoryValidator

validator = TrajectoryValidator()

# Validate from file path
is_valid = validator.validate("trajectory.json")

# Validate from dict
trajectory_dict = {...}
is_valid = validator.validate(trajectory_dict)

# Validate from JSON string
trajectory_json = '{"schema_version": "ATIF-v1.4", ...}'
is_valid = validator.validate(trajectory_json)

# Check errors
if not is_valid:
    for error in validator.get_errors():
        print(f"Error: {error}")

The validator:

  • Validates against the complete ATIF schema using Pydantic models
  • Checks required fields, types, and constraints
  • Validates sequential step IDs (starting from 1)
  • Validates tool call references in observations
  • Validates ISO 8601 timestamps
  • Ensures agent-only fields are only present on agent steps
  • Collects all errors before returning (not just the first error)
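
A few of the checks listed above can be sketched in plain Python to show how errors are collected rather than raised one at a time. This is not Harbor's implementation: `check_trajectory`, `is_iso8601`, and the `AGENT_ONLY_FIELDS` tuple are assumptions made for illustration, and `datetime.fromisoformat` accepts only a subset of ISO 8601.

```python
from datetime import datetime
from typing import Any, Dict, List

# Assumption: fields treated as agent-only in this sketch.
AGENT_ONLY_FIELDS = ("model_name", "reasoning_content", "tool_calls", "metrics")


def is_iso8601(value: Any) -> bool:
    """Return True if value parses as an ISO 8601 timestamp (subset only)."""
    if not isinstance(value, str):
        return False
    # Older Python versions do not accept a trailing "Z", so normalize it.
    normalized = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        datetime.fromisoformat(normalized)
    except ValueError:
        return False
    return True


def check_trajectory(trajectory: Dict[str, Any]) -> List[str]:
    """Collect every problem found instead of stopping at the first one."""
    errors: List[str] = []
    for index, step in enumerate(trajectory.get("steps", [])):
        path = f"trajectory.steps.{index}"
        expected_id = index + 1
        if step.get("step_id") != expected_id:
            errors.append(
                f"{path}.step_id: expected {expected_id} (sequential from 1), "
                f"got {step.get('step_id')}"
            )
        timestamp = step.get("timestamp")
        if timestamp is not None and not is_iso8601(timestamp):
            errors.append(f"{path}.timestamp: invalid ISO 8601 value {timestamp!r}")
        if step.get("source") != "agent":
            for field in AGENT_ONLY_FIELDS:
                if step.get(field) is not None:
                    errors.append(f"{path}.{field}: only allowed on agent steps")
        # Every observation result that names a call must name one from this step.
        call_ids = {call.get("tool_call_id") for call in step.get("tool_calls") or []}
        observation = step.get("observation") or {}
        for result in observation.get("results", []):
            ref = result.get("source_call_id")
            if ref is not None and ref not in call_ids:
                errors.append(f"{path}.observation: unknown source_call_id {ref!r}")
    return errors


bad = {
    "steps": [
        {"step_id": 0, "source": "user", "timestamp": "yesterday", "message": "hi"},
        {
            "step_id": 2,
            "source": "agent",
            "timestamp": "2025-01-15T10:30:02Z",
            "tool_calls": [{"tool_call_id": "call_1", "function_name": "ls", "arguments": {}}],
            "observation": {"results": [{"source_call_id": "call_9", "content": "..."}]},
        },
    ]
}
errors = check_trajectory(bad)
for error in errors:
    print(error)
```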

Building Custom Trajectories

Here's a complete example of building an ATIF trajectory:

from harbor.models.trajectories import (
    Trajectory,
    Agent,
    Step,
    ToolCall,
    Observation,
    ObservationResult,
    Metrics,
    FinalMetrics,
)
import json

# Build the trajectory
trajectory = Trajectory(
    schema_version="ATIF-v1.4",
    session_id="025B810F-B3A2-4C67-93C0-FE7A142A947A",
    agent=Agent(
        name="my-agent",
        version="1.0.0",
        model_name="claude-3-5-sonnet-20241022",
    ),
    steps=[
        # Step 1: User message
        Step(
            step_id=1,
            timestamp="2025-01-15T10:30:00Z",
            source="user",
            message="What is the current trading price of Alphabet (GOOGL)?",
        ),
        # Step 2: Agent action with tool call
        Step(
            step_id=2,
            timestamp="2025-01-15T10:30:02Z",
            source="agent",
            message="I will search for the current trading price for GOOGL.",
            reasoning_content="The request requires the current stock price. I will execute a tool call to retrieve this information.",
            tool_calls=[
                ToolCall(
                    tool_call_id="call_price_1",
                    function_name="financial_search",
                    arguments={"ticker": "GOOGL", "metric": "price"},
                )
            ],
            observation=Observation(
                results=[
                    ObservationResult(
                        source_call_id="call_price_1",
                        content="GOOGL is currently trading at $185.35",
                    )
                ]
            ),
            metrics=Metrics(
                prompt_tokens=520,
                completion_tokens=80,
                cached_tokens=200,
                cost_usd=0.00045,
            ),
        ),
        # Step 3: Agent response
        Step(
            step_id=3,
            timestamp="2025-01-15T10:30:05Z",
            source="agent",
            message="Alphabet (GOOGL) is trading at $185.35.",
            metrics=Metrics(
                prompt_tokens=600,
                completion_tokens=44,
                cost_usd=0.00033,
            ),
        ),
    ],
    final_metrics=FinalMetrics(
        total_prompt_tokens=1120,
        total_completion_tokens=124,
        total_cached_tokens=200,
        total_cost_usd=0.00078,
        total_steps=3,
    ),
)

# Export to JSON
with open("trajectory.json", "w") as f:
    json.dump(trajectory.to_json_dict(), f, indent=2)

# Validate the trajectory
from harbor.utils.trajectory_validator import validate_trajectory

is_valid = validate_trajectory(trajectory.to_json_dict())
print(f"Trajectory is valid: {is_valid}")

Schema Versions

ATIF follows semantic versioning. The current version is v1.4. Supported versions:

  • ATIF-v1.4 (current) - Added optional prompt_token_ids field for storing prompt token IDs
  • ATIF-v1.3 - Added optional completion_token_ids field for RL training
  • ATIF-v1.2 - Extended observation field to support system steps
  • ATIF-v1.1 - Added optional extra field at root level
  • ATIF-v1.0 - Initial specification
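
When reading trajectories written under different schema versions, it can help to parse the `schema_version` string before relying on newer fields. The sketch below assumes the `ATIF-v<major>.<minor>` pattern seen in the list above; `parse_schema_version` and `supports_completion_token_ids` are hypothetical helpers, not part of Harbor.

```python
import re
from typing import Tuple


def parse_schema_version(value: str) -> Tuple[int, int]:
    """Split a string such as 'ATIF-v1.4' into (major, minor)."""
    match = re.fullmatch(r"ATIF-v(\d+)\.(\d+)", value)
    if match is None:
        raise ValueError(f"Unrecognized ATIF schema version: {value!r}")
    return int(match.group(1)), int(match.group(2))


def supports_completion_token_ids(schema_version: str) -> bool:
    """completion_token_ids was added in ATIF-v1.3, per the version list."""
    return parse_schema_version(schema_version) >= (1, 3)


print(parse_schema_version("ATIF-v1.4"))
print(supports_completion_token_ids("ATIF-v1.2"))
```

Comparing versions as integer tuples avoids the pitfalls of string comparison, where "1.10" would sort before "1.4".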