Skip to content

Agent Observability Dashboards

Intermediate

Real-time visibility into running sub-agents, tool usage, token spend, and skill assignment. Covers hook-based telemetry capture specific to Claude Code multi-agent sessions; for orchestration coordination patterns see multi session coordination, for metrics/evaluation pipelines see llmops.

Key Facts

  • Claude Code exposes 12 hook event types covering the full session and sub-agent lifecycle
  • SubagentStart / SubagentStop are the entry points for per-agent dashboarding; each carries a subagent_type field that maps to the invoked skill
  • Events flow: hooks (Python/bash process) → HTTP collector → SQLite WAL → WebSocket → browser
  • SQLite WAL mode is mandatory for concurrent multi-session writes; default journal mode causes exclusive write locks — see multi session coordination for the pattern
  • Self-hosted dashboards give raw hook events; SaaS tracing tools (LangSmith, Phoenix) require OpenTelemetry instrumentation separate from hooks
  • The gap between hook events and skill/checklist visibility requires a small overlay (~200-400 LOC) — hook events do not natively carry skill metadata or checklist progress

Event Model

12 Hook Event Types

Category Events
Session lifecycle SessionStart, SessionEnd, Stop
Multi-agent SubagentStart, SubagentStop
Tool execution PreToolUse, PostToolUse, PostToolUseFailure
User interaction PermissionRequest, Notification, UserPromptSubmit
Context PreCompact

SubagentStart and SubagentStop bracket each spawned agent's lifetime. PreToolUse / PostToolUse fire for every tool call inside that agent, enabling per-agent tool attribution when correlated by session ID.

Minimal Event Schema

{
  "event_type": "SubagentStart",
  "session_id": "sess_abc123",
  "parent_session_id": "sess_parent456",
  "timestamp": "2026-05-12T14:03:22.411Z",
  "subagent_type": "general-purpose",
  "task_prompt": "Review PR #47 for security issues",
  "model": "claude-opus-4-8",
  "metadata": {
    "hook_event_name": "SubagentStart",
    "app_name": "my-project"
  }
}
{
  "event_type": "PostToolUse",
  "session_id": "sess_abc123",
  "timestamp": "2026-05-12T14:03:45.002Z",
  "tool_name": "Read",
  "tool_input": { "file_path": "/src/auth.py" },
  "tool_result": "...",
  "duration_ms": 84,
  "token_usage": { "input": 1200, "output": 340 }
}

Pipeline

Hooks → Collector → Store → Live UI

Claude agent process
  │  (PreToolUse / PostToolUse / SubagentStart / SubagentStop ...)
Python hook script (~/.claude/settings.json → hooks[])
  │  HTTP POST to local collector
Bun/Node collector (TypeScript)
  │  INSERT INTO events (WAL mode)
SQLite (WAL)
  │  SELECT polling or change-notification
WebSocket server
  │  push to connected browsers
Browser dashboard (Vue 3 / React + Tailwind)

Minimal Hook Script

Hooks receive event data on stdin as JSON. The script reads it, enriches if needed, and POSTs to the collector.

#!/usr/bin/env python3
# ~/.claude/hooks/emit_event.py
# Configured in settings.json under hooks[].command for each event type

import json
import sys
import urllib.request

COLLECTOR_URL = "http://localhost:3001/events"

def main():
    payload = json.load(sys.stdin)
    data = json.dumps(payload).encode()
    req = urllib.request.Request(
        COLLECTOR_URL,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        urllib.request.urlopen(req, timeout=1)
    except Exception:
        pass  # never block the agent

if __name__ == "__main__":
    main()

Hook Registration (settings.json)

{
  "hooks": {
    "SubagentStart": [{ "type": "command", "command": "python ~/.claude/hooks/emit_event.py" }],
    "SubagentStop":  [{ "type": "command", "command": "python ~/.claude/hooks/emit_event.py" }],
    "PreToolUse":    [{ "type": "command", "command": "python ~/.claude/hooks/emit_event.py" }],
    "PostToolUse":   [{ "type": "command", "command": "python ~/.claude/hooks/emit_event.py" }],
    "SessionStart":  [{ "type": "command", "command": "python ~/.claude/hooks/emit_event.py" }],
    "SessionEnd":    [{ "type": "command", "command": "python ~/.claude/hooks/emit_event.py" }]
  }
}

Collector SQLite Schema (minimal)

CREATE TABLE events (
  id          INTEGER PRIMARY KEY AUTOINCREMENT,
  session_id  TEXT NOT NULL,
  parent_id   TEXT,
  event_type  TEXT NOT NULL,
  ts          TEXT NOT NULL,
  payload     TEXT NOT NULL  -- JSON blob
);
PRAGMA journal_mode=WAL;

CREATE INDEX idx_session ON events(session_id);
CREATE INDEX idx_type    ON events(event_type);
CREATE INDEX idx_ts      ON events(ts);

What to Visualize

View Data source Key dimensions
Session tree SubagentStart parent/child linking Agent nesting depth, subagent_type
Per-agent tool timeline PreToolUse / PostToolUse grouped by session_id Tool name, file path, duration_ms
Token spend PostToolUse.token_usage aggregated Input/output per agent, cumulative
Latency heatmap duration_ms across tool calls Tool type vs time
Live pulse All events in rolling window Color-coded by session
Skill assignment overlay SubagentStart.subagent_type → SKILL.md lookup Agent → skill description
Checklist tracker Parse bullet items from task_prompt → match against subsequent tool calls % items resolved

The skill assignment overlay and checklist tracker are not provided by any off-the-shelf dashboard as of May 2026 — both require a thin overlay reading your project's skill definitions.

Tooling Landscape

Self-Hosted (Hook-Native)

Tool Architecture Strengths Gaps
disler/claude-code-hooks-multi-agent-observability Python hooks → Bun TS → SQLite WAL → Vue 3 + Tailwind Closest fit for Claude Code; captures all 12 event types; session color coding; chat transcript viewer; live pulse No skill→agent mapping; no checklist tracker; Vue 3 (not React)
Marc Nuri AI Coding Agent Dashboard Heartbeat model + WebSocket relay + enricher pattern Multi-device; git branch / PR link per session; remote terminal attach; push notifications; workflow templates Not open source (as of May 2026); focused on cross-device orchestration, not per-agent skill visibility

SaaS / General LLM Tracing

Tool Model Self-host Claude Code native Best fit
LangSmith Trace + eval No (cloud) No (LangChain ecosystem) LangGraph production stacks
Langfuse Open-source tracing Yes No (OTel required) General LLM prod, budget-conscious
Arize Phoenix ML-grade, 7 span types Yes No (OTel required) Large-scale production, embedding eval
AgentOps Session replay, time-travel debug No (cloud) Partial Debugging agent regressions
Helicone Drop-in proxy Yes (partial) No (proxies API calls, not hooks) Cost visibility on raw API usage

OTel instrumentation is required for Langfuse/Phoenix/LangSmith. Claude Code hooks produce raw JSON; bridging to OTel spans needs a shim that maps SubagentStartSPAN_KIND_CLIENT and PostToolUse→child spans. No maintained shim exists as of May 2026.

Span Type Mapping (for OTel bridge)

EVENT_TO_OTEL = {
    "SubagentStart":    "AGENT",
    "SubagentStop":     "AGENT",   # end marker
    "PreToolUse":       "TOOL",
    "PostToolUse":      "TOOL",    # end marker + result
    "UserPromptSubmit": "LLM",
    "SessionStart":     "CHAIN",
    "SessionEnd":       "CHAIN",
}

Gotchas

  • Issue: Hook script that raises an exception or times out blocks the agent that fired it. -> Fix: Wrap all network calls in try/except, use a short timeout (1s), and always sys.exit(0) — hooks must never block or crash the parent agent process.

  • Issue: Multiple parallel sub-agents write to SQLite simultaneously without WAL mode, causing database is locked errors and dropped events. -> Fix: Always open the collector database with PRAGMA journal_mode=WAL. WAL allows concurrent readers alongside a single writer; without it, every INSERT takes an exclusive lock and parallel hooks contend. See multi session coordination for the canonical pattern.

  • Issue: PostToolUse fires for the hook script itself if the hook is registered as a Bash tool call, creating recursive event loops. -> Fix: Register hooks using "type": "command" (subprocess invocation), not as tool calls. Hooks fired via command do not themselves trigger PreToolUse/PostToolUse.

  • Issue: subagent_type in SubagentStart carries the agent's declared type (general-purpose, code-review, etc.), not the skill name passed in the task prompt. Skill-to-agent mapping requires parsing the task_prompt field for skill invocation patterns. -> Fix: In the skill overlay, regex-match task_prompt against your skill manifest (e.g. \bcode-review\b) to reconstruct the assignment; do not rely on subagent_type alone.

See Also