A step-by-step guide to designing optimized AI agent workflows, producing complete Agent Stories ready for implementation.
Your goal is to design how work should be done with AI agents, not how it's done today. You're taking your Agentic Replica (the current state) and reimagining it as an optimized human-agent workflow.
For each skill in your replica, you'll decide: Should an agent handle this autonomously? Should it augment a human? Or should it remain human-only? Then you'll design exactly how the agent should work, what triggers it needs, what tools it requires, and where humans stay in the loop.
Key principle: This is speculative design. Be bold about what could be automated or augmented, but also realistic about what requires human judgment. You'll validate assumptions later through prototyping.
For each role you're redesigning, work through these five stages. Each produces specific outputs that feed into your Agent Stories.
1. **Classify Tasks**: Automate / Augment / Human
2. **Design Triggers**: What activates agents
3. **Design Skills**: Capabilities & tools
4. **Design Oversight**: Human checkpoints
5. **Define Success**: Acceptance criteria
Take each skill from your Agentic Replica and decide how it should be handled in the future state. Every task falls into one of three categories.
**Automate:** No human involvement during execution. The agent makes all decisions within defined guardrails.
Good candidates: High-frequency, low-judgment tasks with clear rules and predictable outcomes.
**Augment:** Agent prepares options, drafts, or recommendations. A human reviews and decides.
Good candidates: Tasks requiring judgment but with automatable research, drafting, or analysis steps.
**Human-only:** Tasks that should not be delegated to agents, even with oversight.
Examples: Final accountability decisions, relationship-dependent work, ethical judgment calls.
Tip: Start aggressively. Mark more tasks as "automate" than seems realistic; you'll pull things back to "augment" or "human-only" as you work through the design and discover where the challenges are.
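While classifying, it can help to keep the inventory in a simple structure so the categories stay explicit as you revise them. A hypothetical sketch in Python; the `Task` fields and example tasks are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    AUTOMATE = "automate"      # agent acts alone within guardrails
    AUGMENT = "augment"        # agent drafts; human decides
    HUMAN_ONLY = "human-only"  # stays with the human

@dataclass
class Task:
    name: str
    frequency: str   # how often the task occurs, e.g. "daily"
    judgment: str    # how much judgment it requires, e.g. "low"
    category: Category

# Start aggressive: default to AUTOMATE, then pull tasks back to AUGMENT or
# HUMAN_ONLY as later design stages reveal where the challenges are.
tasks = [
    Task("triage inbound tickets", "high-frequency", "low", Category.AUTOMATE),
    Task("draft quarterly report", "quarterly", "medium", Category.AUGMENT),
    Task("final vendor approval", "ad hoc", "high", Category.HUMAN_ONLY),
]
```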
For each task you marked as "automate" or "augment," define exactly what should activate the agent. This maps directly to the trigger block in your Agent Story.
**Message trigger:** Agent activates when it receives a message, email, ticket, or request.
From replica: Look at the "triggers" you documented for each skill. Which were incoming communications?
**Event trigger:** Agent activates when data changes, a file is updated, or a threshold is crossed.
From replica: Which skills were triggered by system events or data changes?
**Schedule trigger:** Agent activates at a specific time or interval (daily, weekly, monthly).
From replica: Which skills happened on a regular schedule?
**Agent trigger:** Agent activates when another agent completes a task or requests collaboration.
From replica: Which skills were triggered by output from other people's work?
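The four trigger types above could be recorded as the trigger block of an Agent Story. A minimal sketch; the field names (`type`, `source`, `condition`) are assumptions, not a fixed format:

```python
# One entry per trigger the skill responds to (values are placeholders).
triggers = [
    {"type": "message",  "source": "support inbox",  "condition": "new ticket arrives"},
    {"type": "event",    "source": "CRM",            "condition": "deal stage changes"},
    {"type": "schedule", "source": "scheduler",      "condition": "every Monday 09:00"},
    {"type": "agent",    "source": "research-agent", "condition": "summary completed"},
]
```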
Transform each automated or augmented task into a detailed skill specification. This is the core of your Agent Story and requires the most thought.
**Domain & proficiencies:** What knowledge area does this skill operate in? What specific things can the agent do within this domain? Be precise; vague proficiencies lead to vague implementations.
From replica: Your skill documentation shows what the human does. Translate those actions into agent capabilities.
**Tools & permissions:** What systems, APIs, or MCP servers does the agent need to execute this skill? What level of access (read, write, execute) is required? Under what conditions can each tool be used?
From replica: Your "tools used" list shows what the human accesses. Determine which the agent needs and at what permission level.
**Reasoning strategy:** How does the agent make decisions? Is it rule-based (if X then Y), LLM-guided (interpret context and decide), or hybrid? Where are the key decision points, and what's the fallback when decisions fail?
From replica: Your "decision points" documentation reveals the logic. Decide which decisions the agent can make autonomously.
**Memory design:** What context does the agent need to maintain during execution (working memory)? What needs to persist across invocations (persistent memory)? Should the agent learn from feedback?
From replica: Your "inputs" documentation shows what information the human gathers. Determine what the agent needs to remember.
**Success criteria:** What does competent execution look like? How will you know if the agent is performing this skill well? This should be measurable.
From replica: Your "outputs" documentation describes what success looks like. Translate to measurable criteria.
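The five dimensions above come together as one skill specification per task. A hypothetical sketch; every key and value here is an illustrative placeholder, not a required format:

```python
skill = {
    "domain": "customer support triage",
    "proficiencies": ["categorize requests", "draft responses", "escalate edge cases"],
    "tools": [
        # access level and usage condition per tool, per the design questions above
        {"name": "ticketing-api", "access": "read/write", "condition": "business hours only"},
    ],
    "reasoning": {"strategy": "hybrid", "fallback": "escalate to human"},
    "memory": {"working": "current ticket thread", "persistent": "customer history"},
    "success": "95% of tickets correctly categorized on first pass",
}
```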
Even autonomous agents need guardrails and checkpoints. Design when and how humans stay in the loop.
| Level | Human Role | Agent Authority |
|---|---|---|
| Full | None during execution | Complete decision authority within guardrails |
| Supervised | Monitors, intervenes on exception | Executes independently but human can override |
| Collaborative | Active participant in decisions | Proposes actions, human confirms before execution |
| Directed | Initiates and guides each step | Executes specific instructions only |
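The four autonomy levels in the table imply different execution paths for the same action. A minimal routing sketch, assuming a callable action and a human-confirmation hook (both hypothetical):

```python
def execute(action, level, human_confirm=None):
    """Route an agent action according to its autonomy level (illustrative only)."""
    if level == "full":
        return action()                  # complete authority within guardrails
    if level == "supervised":
        return action()                  # acts independently; human can override after
    if level in ("collaborative", "directed"):
        if human_confirm(action):        # human confirms before execution
            return action()
        return None                      # human declined; agent does not act
    raise ValueError(f"unknown autonomy level: {level}")
```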
**Guardrails:** What should the agent never do? These aren't preferences; they're hard constraints that must be enforced.
From replica: Your "observed guardrails" documentation shows what the human never does. Translate these into agent constraints. Also consider: what new risks does automation create that the human couldn't create?
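Because guardrails are hard constraints rather than preferences, they can be checked mechanically before any action executes. A sketch under assumed rules (the template-match and spend-limit constraints are hypothetical examples):

```python
GUARDRAILS = [
    # never send customer communication without a matched template
    lambda action: not (action.get("type") == "send_email" and action.get("template") is None),
    # never commit spend above an assumed approval threshold
    lambda action: action.get("amount", 0) <= 500,
]

def check_guardrails(action):
    """Return True only if every hard constraint holds; any violation blocks the action."""
    return all(rule(action) for rule in GUARDRAILS)
```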
How will you know if the agent is working? Define measurable acceptance criteria that can be verified in testing and production.
**Behavioral criteria:** Observable behaviors that indicate the agent works correctly.
Example: "Correctly categorizes incoming requests with 95% accuracy" rather than "handles requests well."
**Quality criteria:** Non-functional requirements like speed, reliability, and user experience.
Example: "Responds within 30 seconds" rather than "responds quickly."
**Guardrail criteria:** Verification that constraints are never violated.
Example: "Never sends customer communication without template match" rather than "follows communication guidelines."
Testability check: For each criterion, ask: "How would I write a test for this?" If you can't imagine the test, the criterion is too vague. Make it more specific.
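The testability check can be made concrete: a specific criterion translates directly into a test, a vague one doesn't. A hypothetical test for the 95%-accuracy criterion above; the `categorize` stub and labeled samples are stand-ins for the real skill and evaluation set:

```python
def categorize(request):
    # Stand-in for the real agent skill under test.
    return "billing" if "invoice" in request.lower() else "general"

labeled_samples = [
    ("Where is my invoice?", "billing"),
    ("How do I reset my password?", "general"),
]

correct = sum(1 for text, label in labeled_samples if categorize(text) == label)
accuracy = correct / len(labeled_samples)
```

Try writing the same thing for "handles requests well" and the vagueness becomes obvious: there is nothing to assert.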
Your completed Agentic Reengineering produces everything needed for the "Agent Story: Full" format. The mapping is direct:
| Reengineering Output | Agent Story Section |
|---|---|
| Task classification (automate) | Agent Scope & Autonomy Level |
| Trigger design | Trigger Specification |
| Skill domain & proficiencies | Skills array |
| Tools & permissions | Tools & Integrations |
| Reasoning strategy | Reasoning & Decisions |
| Memory design | Memory & State |
| Autonomy level & checkpoints | Human Collaboration |
| Agent relationships | Agent Collaboration |
| Success criteria & guardrails | Acceptance Criteria |
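Assembled together, the mapped sections form the skeleton of one Agent Story. A sketch with placeholder values; the key names simply mirror the table above and are not a mandated schema:

```python
agent_story = {
    "agent_scope_and_autonomy": "from task classification",
    "trigger_specification":    "from trigger design",
    "skills":                   "from skill domain & proficiencies",
    "tools_and_integrations":   "from tools & permissions",
    "reasoning_and_decisions":  "from reasoning strategy",
    "memory_and_state":         "from memory design",
    "human_collaboration":      "from autonomy level & checkpoints",
    "agent_collaboration":      "from agent relationships",
    "acceptance_criteria":      "from success criteria & guardrails",
}
```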
Get the copy-paste templates and see a complete worked example showing how a role is redesigned into Agent Stories.
View Templates & Worked Example