Comprehensive Format

Agent Story: Full

For complex agents requiring detailed specifications, multi-skill workflows, and nuanced human collaboration.

Philosophy

An Agent Story extends the User Story paradigm to capture autonomous and semi-autonomous AI behavior. Where User Stories focus on human intent ("As a user, I want..."), Agent Stories must capture emergent behavior, conditional autonomy, and collaborative intelligence.

The format follows a principle of progressive disclosure: the core story remains simple and readable, while structured annotations capture complexity only where it exists.

For simpler agents or early-stage design, see Agent Story Format: Light.

The Core Format

This core remains human-readable and captures the essential narrative. Everything else is annotation.

Core Format
AGENT STORY: [ID]

As [Agent Role],
triggered by [Event],
I [Action/Goal],
so that [Outcome/Value].

Autonomy: [Full | Supervised | Collaborative | Directed]
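
As an illustrative sketch only (the class and field names below are assumptions, not part of the format), the core story maps naturally onto a small data model:

```python
from dataclasses import dataclass
from enum import Enum

class Autonomy(Enum):
    FULL = "Full"
    SUPERVISED = "Supervised"
    COLLABORATIVE = "Collaborative"
    DIRECTED = "Directed"

@dataclass
class AgentStory:
    story_id: str
    role: str      # As [Agent Role]
    trigger: str   # triggered by [Event]
    action: str    # I [Action/Goal]
    outcome: str   # so that [Outcome/Value]
    autonomy: Autonomy

    def render(self) -> str:
        """Render the story in the canonical narrative form."""
        return (
            f"AGENT STORY: {self.story_id}\n\n"
            f"As {self.role},\n"
            f"triggered by {self.trigger},\n"
            f"I {self.action},\n"
            f"so that {self.outcome}.\n\n"
            f"Autonomy: {self.autonomy.value}"
        )

story = AgentStory(
    story_id="CLAIM-001",
    role="a Claims Processing Agent",
    trigger="new insurance claim submission",
    action="assess the claim and route it to resolution",
    outcome="claims are processed with minimal wait time",
    autonomy=Autonomy.SUPERVISED,
)
print(story.render().splitlines()[0])  # AGENT STORY: CLAIM-001
```

Keeping the core this small is the point: everything that follows is optional annotation layered on top.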

Structured Annotations

Add only the annotations relevant to your agent. Each section is optional.

1 Trigger Specification

trigger:
  type: [message | resource_change | schedule | cascade | manual]
  source: [Description of event source]
  conditions: [Optional guard conditions]
  examples:
    - [Concrete example of triggering event]

2 Behavior Model

For agents with defined stages or workflows:

behavior:
  type: [workflow | adaptive | hybrid]

  # For workflow/hybrid types:
  stages:
    - name: [Stage Name]
      purpose: [What this stage accomplishes]
      transitions:
        - to: [Next Stage]
          when: [Condition]

  # For adaptive/hybrid types:
  capabilities:
    - [High-level capability the agent can invoke]

  planning: [none | local | delegated | emergent]

3 Reasoning & Decisions

reasoning:
  strategy: [rule_based | llm_guided | hybrid]

  decision_points:
    - name: [Decision Name]
      inputs: [What information informs this decision]
      approach: [How the decision is made]
      fallback: [What happens if decision fails]

  iteration:
    enabled: [true | false]
    max_attempts: [number]
    retry_conditions: [When to retry]
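
The iteration-plus-fallback pattern above can be sketched in a few lines; the attempt count and exception types are assumptions for illustration:

```python
def decide_with_retry(decide, fallback, max_attempts=3, retryable=(TimeoutError,)):
    """Attempt a decision up to max_attempts times; on exhaustion, fall back."""
    for _ in range(max_attempts):
        try:
            return decide()
        except retryable:
            continue  # a retry_condition was met: try again
    return fallback()  # decision failed: take the declared fallback path

calls = {"n": 0}
def flaky():
    # Stand-in for a decision that hits transient API timeouts twice.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("API timeout")
    return "approved"

print(decide_with_retry(flaky, lambda: "route_to_human"))  # approved
```

Note that the fallback is part of the specification, not an afterthought: every decision point declares what happens when it fails.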

4 Memory & State

memory:
  working:
    - [Ephemeral context maintained during execution]

  persistent:
    - name: [Memory Store Name]
      type: [kb | vector | relational | kv]
      purpose: [Why this memory exists]
      updates: [read_only | append | full_crud]

  learning:
    - type: [feedback_loop | reinforcement | fine_tuning]
      signal: [What triggers learning]

5 Tools & Integrations

tools:
  - name: [Tool/MCP Server Name]
    purpose: [Why the agent uses this]
    permissions: [read | write | execute | admin]
    conditions: [Optional: when tool is available/used]

6 Skills

skills:
  - name: [Skill Name]
    domain: [Knowledge domain this skill operates in]
    proficiencies:
      - [Specific competency within the skill]
    tools_used: [Tools this skill leverages, if any]
    quality_bar: [What competent execution looks like]
    acquired: [built_in | learned | delegated]

Skills are composable units of competency that can be reused across agents. They bundle domain knowledge, behavioral patterns, tool proficiency, and quality standards into a coherent capability.

Skill acquisition types:

  • built_in: Core competency the agent is designed with
  • learned: Acquired through training, feedback, or experience
  • delegated: Performed by calling another agent or service
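
One practical consequence of the acquisition types: a `delegated` skill is performed by calling out to another agent or service rather than executed locally. A hedged sketch (the dispatch function, skill fields, and agent ID are all hypothetical):

```python
def call_agent(agent_id: str, task: str) -> str:
    # Stand-in for a real agent-to-agent call (e.g., over an A2A protocol).
    return f"{agent_id} handled '{task}'"

def perform_skill(skill: dict, task: str) -> str:
    """Dispatch on acquisition type: delegated skills hand off, others run locally."""
    if skill["acquired"] == "delegated":
        return call_agent(skill["delegate"], task)
    return f"{skill['name']} handled '{task}' locally"

local = {"name": "Damage Assessment", "acquired": "built_in"}
remote = {"name": "Fraud Detection", "acquired": "delegated",
          "delegate": "fraud-analysis-agent"}
print(perform_skill(remote, "claim #12345"))
# fraud-analysis-agent handled 'claim #12345'
```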

7 Human Collaboration

human_interaction:
  mode: [in_the_loop | on_the_loop | out_of_loop]

  checkpoints:
    - name: [Checkpoint Name]
      trigger: [When human involvement is required]
      type: [approval | input | review | escalation]
      timeout: [What happens if human doesn't respond]

  escalation:
    conditions: [When to escalate to human]
    channel: [How escalation occurs]

8 Agent Collaboration

collaboration:
  role: [supervisor | worker | peer]

  # For supervisors:
  coordinates:
    - agent: [Worker Agent ID/Type]
      via: [Communication protocol]
      for: [What tasks are delegated]

  # For workers:
  reports_to: [Supervisor Agent ID/Type]

  # For all:
  peers:
    - agent: [Peer Agent ID/Type]
      interaction: [Request/Response | Pub/Sub | Shared State]

9 Acceptance Criteria

acceptance:
  functional:
    - [Observable behavior that indicates success]

  quality:
    - [Non-functional requirements: latency, accuracy, etc.]

  guardrails:
    - [Constraints the agent must never violate]
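
Guardrails differ from the other criteria in kind: they are invariants, so any single violation blocks the action outright rather than lowering a score. A sketch of that evaluation (the guardrail names and claim fields are assumptions for illustration):

```python
guardrails = [
    ("within_policy_limit",
     lambda claim: claim["amount"] <= claim["policy_limit"]),
    ("no_auto_denial",
     lambda claim: claim["decision"] != "deny" or claim["human_reviewed"]),
]

def check_guardrails(claim: dict) -> list[str]:
    """Return the names of violated guardrails (empty list = safe to proceed)."""
    return [name for name, ok in guardrails if not ok(claim)]

claim = {"amount": 12000, "policy_limit": 10000,
         "decision": "approve", "human_reviewed": False}
print(check_guardrails(claim))  # ['within_policy_limit']
```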

Cardinality & Ownership

Understanding how elements relate to each other is critical for proper modeling.

Element Hierarchy

+---------------------------------------------------------------------+
|                            AGENT                                    |
|                                                                     |
|  Owns directly:                                                     |
|  |-- trigger (1..*)      - What activates this agent                |
|  |-- behavior (1)        - How the agent is structured              |
|  |-- memory (0..1)       - Agent-level state and learning           |
|  |-- human_interaction (0..1) - How humans collaborate              |
|  |-- collaboration (0..1)     - How other agents collaborate        |
|  +-- acceptance (1)      - Success criteria for the agent           |
|                                                                     |
|  Composes:                                                          |
|  +-- skills (1..*)       - Competencies the agent has               |
|        |                                                            |
|        |  Each skill owns:                                          |
|        |-- proficiencies (1..*) - What the skill enables            |
|        |-- tools_used (0..*)    - Tools this skill leverages        |
|        |-- quality_bar (1)      - Standard for this skill           |
|        +-- acquired (1)         - How skill was obtained            |
|                                                                     |
|  References (shared resources):                                     |
|  +-- tools (1..*)        - Available to agent, used by skills       |
|                                                                     |
|  Contains:                                                          |
|  +-- reasoning (0..1)    - Can exist at agent OR skill level        |
|                                                                     |
+---------------------------------------------------------------------+

Relationship Rules

Element            Owned By        Cardinality  Notes
-----------------  --------------  -----------  ----------------------------------------------
trigger            Agent           1..*         An agent must have at least one trigger
behavior           Agent           1            One behavior model per agent
tools              Agent           1..*         Declared at agent level, referenced by skills
skills             Agent           1..*         An agent must have at least one skill
proficiencies      Skill           1..*         Each skill must specify what it can do
tools_used         Skill           0..*         Skills reference agent-level tools
quality_bar        Skill           1            Every skill needs a measurable standard
reasoning          Agent or Skill  0..1         Can be defined at agent level or per-skill
memory             Agent           0..1         Shared across all skills
human_interaction  Agent           0..1         Defined at agent level
collaboration      Agent           0..1         Agent-to-agent relationships
acceptance         Agent           1            Agent-level success criteria

Key Distinctions

Tools vs. Skills

  • Tools are resources the agent has access to (MCP servers, APIs, databases)
  • Skills are competencies that use tools to accomplish domain-specific work
  • Tools are declared once at agent level; skills reference which tools they use
  • A tool can be used by multiple skills; a skill can use multiple tools (or none)
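
This ownership rule is mechanically checkable: every tool a skill references must appear in the agent-level declaration. A sketch of such a lint check (function and variable names are illustrative):

```python
def undeclared_tools(agent_tools: set[str],
                     skills: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each skill to any tools it references that the agent never declared."""
    return {
        skill: missing
        for skill, used in skills.items()
        if (missing := [t for t in used if t not in agent_tools])
    }

agent_tools = {"Document Analysis MCP", "Policy System"}
skills = {
    "Damage Assessment": ["Document Analysis MCP", "Policy System"],
    "Customer Communication": ["Customer Communication"],  # never declared
}
print(undeclared_tools(agent_tools, skills))
# {'Customer Communication': ['Customer Communication']}
```

An empty result means every skill's `tools_used` is a subset of the agent's declared tools, which is exactly the relationship the table above requires.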

Agent-level vs. Skill-level Reasoning

  • Agent-level reasoning: Shared decision-making patterns (e.g., "always retry 3 times")
  • Skill-level reasoning: Domain-specific logic (e.g., fraud detection heuristics)
  • If reasoning is the same across skills, define it once at agent level
  • If skills have distinct reasoning approaches, define per-skill

Memory Ownership

  • Memory is always agent-level - skills share access to the same memory stores
  • Skills may read from or write to memory, but don't own separate memory
  • This prevents fragmentation and ensures consistency

Quality Bars vs. Acceptance Criteria

  • Quality bars (skill-level): "This skill performs at X standard"
  • Acceptance criteria (agent-level): "The agent as a whole succeeds when..."
  • Skill quality bars are inputs to agent acceptance criteria

Visual: How Skills Compose an Agent

+---------------------------------------------------------------+
|  Claims Processing Agent                                      |
|                                                               |
|  +-------------+  +-------------+  +-----------------------+  |
|  |   Tools     |  |   Memory    |  |   Human Interaction   |  |
|  | (shared)    |  | (shared)    |  |   (shared)            |  |
|  +------+------+  +------+------+  +-----------+-----------+  |
|         |                |                     |              |
|         v                v                     v              |
|  +------------------------------------------------------------+
|  |                      Skills                                |
|  |  +--------------+ +--------------+ +--------------------+  |
|  |  |   Damage     | |    Fraud     | |     Customer       |  |
|  |  |  Assessment  | |  Detection   | |   Communication    |  |
|  |  |              | |              | |                    |  |
|  |  | uses: Doc    | | uses: Doc    | | uses: Comms Tool   |  |
|  |  | Analysis,    | | Analysis     | |                    |  |
|  |  | Policy Sys   | |              | | triggers: human    |  |
|  |  |              | | writes:      | | checkpoint on      |  |
|  |  |              | | fraud scores | | escalation         |  |
|  |  |              | | to memory    | |                    |  |
|  |  +--------------+ +--------------+ +--------------------+  |
|  +------------------------------------------------------------+
|                                                               |
|  Behavior: Orchestrates skills through workflow stages        |
|  Acceptance: Evaluated at agent level using skill quality bars|
+---------------------------------------------------------------+

Complete Example

Core Story
AGENT STORY: CLAIM-001

As a Claims Processing Agent,
triggered by new insurance claim submission,
I assess the claim, gather required documentation, and route to appropriate resolution,
so that claims are processed accurately with minimal customer wait time.

Autonomy: Supervised

Full Specification
trigger:
  type: message
  source: Claims intake system (A2A from portal agent)
  conditions: Claim type in [auto, property, health]
  examples:
    - "New auto claim #12345 submitted with photos and police report"

behavior:
  type: hybrid
  stages:
    - name: Initial Assessment
      purpose: Categorize claim and determine processing path
      transitions:
        - to: Documentation Gathering
          when: Additional docs needed
        - to: Auto-Approval Check
          when: Claim is straightforward
        - to: Fraud Review
          when: Risk indicators detected

    - name: Documentation Gathering
      purpose: Request and validate required documents
      transitions:
        - to: Auto-Approval Check
          when: All docs received and valid
        - to: Human Review
          when: Doc gathering timeout (72h)

    - name: Auto-Approval Check
      purpose: Determine if claim qualifies for automatic approval
      transitions:
        - to: Resolution
          when: Within auto-approval thresholds
        - to: Human Review
          when: Exceeds thresholds or edge case

    - name: Fraud Review
      purpose: Deep analysis for potential fraud
      transitions:
        - to: Human Review
          when: Analysis complete

    - name: Human Review
      purpose: Adjuster makes final determination
      transitions:
        - to: Resolution
          when: Decision made

    - name: Resolution
      purpose: Execute approval/denial and notify customer

  planning: local

reasoning:
  strategy: hybrid

  decision_points:
    - name: Fraud Risk Assessment
      inputs: Claim history, submission patterns, document metadata
      approach: ML model + rule-based flags
      fallback: Route to human review

    - name: Auto-Approval Eligibility
      inputs: Claim amount, policy limits, documentation completeness
      approach: Policy rules engine
      fallback: Route to human review

  iteration:
    enabled: true
    max_attempts: 3
    retry_conditions: Document validation failures, API timeouts

memory:
  working:
    - Current claim context and gathered documents
    - Conversation history with customer

  persistent:
    - name: Claims Knowledge Base
      type: vector
      purpose: Similar claim retrieval for consistency
      updates: append

    - name: Customer History
      type: relational
      purpose: Policy and claims history lookup
      updates: read_only

  learning:
    - type: feedback_loop
      signal: Adjuster corrections to agent assessments

tools:
  - name: Document Analysis MCP
    purpose: Extract and validate information from uploaded documents
    permissions: read

  - name: Policy System
    purpose: Retrieve policy details and coverage limits
    permissions: read

  - name: Customer Communication
    purpose: Send status updates and document requests
    permissions: execute
    conditions: Outbound communications require template match

skills:
  - name: Damage Assessment
    domain: Insurance claim evaluation
    proficiencies:
      - Interpret photos and repair estimates for auto/property damage
      - Cross-reference damage claims against policy coverage
      - Identify inconsistencies between reported and documented damage
    tools_used: [Document Analysis MCP, Policy System]
    quality_bar: Assessments align with adjuster decisions 90%+ of the time
    acquired: built_in

  - name: Fraud Detection
    domain: Insurance fraud patterns
    proficiencies:
      - Recognize common fraud indicators (staged accidents, inflated claims)
      - Analyze claim patterns across customer history
      - Flag document anomalies (metadata inconsistencies, edited images)
    tools_used: [Document Analysis MCP]
    quality_bar: >85% precision on fraud flags (minimize false positives)
    acquired: learned

  - name: Customer Communication
    domain: Claims customer experience
    proficiencies:
      - Explain claim status and next steps in plain language
      - Request specific documentation with clear instructions
      - De-escalate frustrated customers while maintaining accuracy
    tools_used: [Customer Communication]
    quality_bar: Customer satisfaction score >4.2/5 on agent interactions
    acquired: built_in

  - name: Policy Interpretation
    domain: Insurance policy analysis
    proficiencies:
      - Parse coverage limits, deductibles, and exclusions
      - Apply policy terms to specific claim scenarios
      - Identify coverage gaps or ambiguities requiring human review
    tools_used: [Policy System]
    quality_bar: Coverage determination matches adjuster interpretation 95%+
    acquired: built_in

human_interaction:
  mode: on_the_loop

  checkpoints:
    - name: High-Value Approval
      trigger: Claim amount > $10,000
      type: approval
      timeout: Route to senior adjuster after 24h

    - name: Fraud Escalation
      trigger: Fraud score > 0.7
      type: review
      timeout: Hold claim, alert supervisor

  escalation:
    conditions: Agent confidence < 60%, customer complaint, system error
    channel: Adjuster queue with full context package

collaboration:
  role: worker
  reports_to: Claims Supervisor Agent
  peers:
    - agent: Customer Service Agent
      interaction: Request/Response (customer context handoff)

acceptance:
  functional:
    - Correctly categorizes claim type with 95% accuracy
    - Identifies missing documentation within 5 minutes of submission
    - Routes fraud-risk claims to review 100% of the time

  quality:
    - Initial assessment completes in < 2 minutes
    - Customer receives first contact within 1 hour

  guardrails:
    - Never auto-approve claims exceeding policy limits
    - Never communicate denial without human review
    - All customer PII handled per data retention policy

Writing Guidelines

Start Simple

Begin with just the core story. Add annotations only when:

  • The behavior isn't obvious from the core story
  • There are specific constraints that must be captured
  • Multiple teams need to coordinate on implementation

Use the Right Autonomy Level

Level          Human Role                         Agent Authority
-------------  ---------------------------------  ------------------------------
Full           None during execution              Complete decision authority
Supervised     Monitors, intervenes on exception  Executes within guardrails
Collaborative  Active participant in decisions    Proposes, human confirms
Directed       Initiates and guides each step     Executes specific instructions

Capture Behavior Appropriately

  • Workflow: Use stages when the agent follows a predictable path
  • Adaptive: Use capabilities when the agent decides its approach at runtime
  • Hybrid: Use both when some structure exists but flexibility is needed

Keep Acceptance Criteria Observable

Every criterion should be something you can actually verify. "Processes claims correctly" is not testable. "Correctly categorizes claim type with 95% accuracy" is.
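
Put differently, an observable criterion reduces to a plain assertion over measured outcomes. A sketch, reusing the 95% categorization criterion from the example (the function and sample data are hypothetical):

```python
def categorization_accuracy(predictions: list[str], labels: list[str]) -> float:
    """Fraction of claim-type predictions that match the adjudicated labels."""
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)

preds  = ["auto", "property", "auto", "health"]
labels = ["auto", "property", "auto", "auto"]
acc = categorization_accuracy(preds, labels)
print(acc >= 0.95)  # False: 3/4 = 0.75, so this sample fails the criterion
```

If a criterion can't be expressed as a check like this, it isn't observable yet; rewrite it until it can be.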

Relationship to Other Artifacts

+-------------------------------------------------------------+
|                    Product Vision                           |
+-------------------------------------------------------------+
                              |
                              v
+-------------------------------------------------------------+
|                    Agent Stories                            |
|         (What agents do and why - this document)            |
+-------------------------------------------------------------+
                              |
              +---------------+---------------+
              v               v               v
+-------------------+ +-------------+ +---------------------+
|  Behavior Specs   | | Tool Specs  | |  Integration Specs  |
| (Detailed flows,  | | (MCP server | | (A2A protocols,     |
|  prompts, logic)  | |  contracts) | |  event schemas)     |
+-------------------+ +-------------+ +---------------------+

Agent Stories sit between high-level vision and implementation details. They should be:

  • Stable enough to guide development
  • Abstract enough to allow implementation flexibility
  • Complete enough to enable estimation and planning

Quick Reference Card

+------------------------------------------------------------+
|  AGENT STORY QUICK REFERENCE                               |
+------------------------------------------------------------+
|                                                            |
|  CORE (Required)                                           |
|  -----------------                                         |
|  - Role: What the agent is                                 |
|  - Trigger: What activates it                              |
|  - Action: What it does                                    |
|  - Outcome: Why it matters                                 |
|  - Autonomy: Full|Supervised|Collaborative|Directed        |
|                                                            |
|  ANNOTATIONS (As Needed)                                   |
|  -----------------------                                   |
|  - trigger: Event details, conditions, examples            |
|  - behavior: Stages, capabilities, planning approach       |
|  - reasoning: Decision points, iteration, fallbacks        |
|  - memory: Working, persistent, learning                   |
|  - tools: MCP servers, permissions, conditions             |
|  - skills: Competencies, proficiencies, quality bars       |
|  - human_interaction: Mode, checkpoints, escalation        |
|  - collaboration: Role, coordinates/reports_to, peers      |
|  - acceptance: Functional, quality, guardrails             |
|                                                            |
|  SKILL ACQUISITION TYPES                                   |
|  -----------------------                                   |
|  - built_in: Core competency the agent is designed with    |
|  - learned: Acquired through training or feedback          |
|  - delegated: Performed by another agent or service        |
|                                                            |
|  TIPS                                                      |
|  ----                                                      |
|  * Start with core only, add complexity as needed          |
|  * Make acceptance criteria observable and testable        |
|  * Guardrails are things that must NEVER happen            |
|  * One story = one coherent agent responsibility           |
|  * Skills can be reused across multiple agents             |
|                                                            |
+------------------------------------------------------------+