Future State Design

Agentic Reengineering

A step-by-step guide to designing optimized AI agent workflows, producing complete Agent Stories ready for implementation.

What You're Trying to Do

Your goal is to design how work should be done with AI agents, not how it's done today. You're taking your Agentic Replica (the current state) and reimagining it as an optimized human-agent workflow.

For each skill in your replica, you'll decide: Should an agent handle this autonomously? Should it augment a human? Or should it remain human-only? Then you'll design exactly how the agent should work, what triggers it needs, what tools it requires, and where humans stay in the loop.

Key principle: This is speculative design. Be bold about what could be automated or augmented, but also realistic about what requires human judgment. You'll validate assumptions later through prototyping.

The Five-Step Process

For each role you're redesigning, work through these five stages. Each produces specific outputs that feed into your Agent Stories.

1. Classify Tasks: Automate / Augment / Human
2. Design Triggers: what activates agents
3. Design Skills: capabilities & tools
4. Design Oversight: human checkpoints
5. Define Success: acceptance criteria

Step 1: Classify Each Task

Take each skill from your Agentic Replica and decide how it should be handled in the future state. Every task falls into one of three categories.

Automate (agent handles fully)

No human involvement during execution. The agent makes all decisions within defined guardrails.

Good candidates: high-frequency, low-judgment tasks with clear rules and predictable outcomes.

Augment (agent assists human)

The agent prepares options, drafts, or recommendations; the human reviews and decides.

Good candidates: tasks requiring judgment but with automatable research, drafting, or analysis steps.

Human-Only (human handles fully)

Tasks that should not be delegated to agents, even with oversight.

Examples: final accountability decisions, relationship-dependent work, ethical judgment calls.

Classification criteria:

  • Judgment required: How much of the skill relies on nuanced decisions versus following clear rules? High judgment suggests augment or human-only.
  • Error cost: What happens if the agent makes a mistake? High-stakes errors suggest more human oversight.
  • Time sensitivity: Does the task need an instant response? Humans can't review every action in real time, so high time sensitivity favors automation with tight guardrails.
  • Data availability: Is the information needed to make decisions accessible to an agent? Some decisions require context agents can't access.

Tip: Start aggressive. Mark more tasks as "automate" than you think is realistic. You'll pull things back to "augment" or "human-only" as you work through the design and realize where the challenges are.
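
To make the criteria repeatable across many skills, here is a minimal sketch of a first-pass classifier in Python. The method itself prescribes no code; the scoring scales, field names, and thresholds below are invented assumptions to illustrate how the four criteria interact.

```python
from dataclasses import dataclass

# Hypothetical task profile; field names and 1-5 scales are illustrative.
@dataclass
class TaskProfile:
    judgment: int          # 1 (clear rules) to 5 (nuanced judgment)
    error_cost: int        # 1 (easily reversed) to 5 (high-stakes)
    needs_instant: bool    # must respond faster than a human could review
    data_accessible: bool  # the agent can reach every input the decision needs

def classify(task: TaskProfile) -> str:
    """Suggest a starting category from the four classification criteria."""
    if not task.data_accessible:
        return "human-only"   # the agent can't see what the decision requires
    if task.judgment >= 4:
        # nuanced decisions: keep a human deciding; high stakes keep it human entirely
        return "human-only" if task.error_cost >= 4 else "augment"
    if task.error_cost >= 4:
        # low judgment but high stakes: automate only where humans can't keep up,
        # and compensate with tight guardrails and checkpoints (step 4)
        return "automate" if task.needs_instant else "augment"
    return "automate"         # high-frequency, low-judgment, predictable

# e.g. triaging routine support tickets in real time:
print(classify(TaskProfile(judgment=2, error_cost=2,
                           needs_instant=True, data_accessible=True)))  # automate
```

A helper like this won't replace the judgment calls, but it forces you to score every skill against the same criteria before you start pulling tasks back.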

Step 2: Design Agent Triggers

For each task you marked as "automate" or "augment," define exactly what should activate the agent. This maps directly to the trigger block in your Agent Story.

Trigger types to consider:

Message triggers

Agent activates when it receives a message, email, ticket, or request.

From replica: Look at the "triggers" you documented for each skill. Which were incoming communications?

Resource change triggers

Agent activates when data changes, a file is updated, or a threshold is crossed.

From replica: Which skills were triggered by system events or data changes?

Schedule triggers

Agent activates at a specific time or interval (daily, weekly, monthly).

From replica: Which skills happened on a regular schedule?

Cascade triggers

Agent activates when another agent completes a task or requests collaboration.

From replica: Which skills were triggered by output from other people's work?

What to specify for each trigger:

  • Source: Exactly where the trigger comes from (which system, which queue, which event type).
  • Conditions: When should the agent activate vs. ignore? Not every message triggers action.
  • Examples: Concrete examples of triggering events to avoid ambiguity.
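
As a concrete illustration, here is one automated skill's trigger expressed as data. The schema is a sketch, and every value (the system name, queue, and threshold) is invented; your Agent Story format defines the real shape.

```python
# Illustrative trigger specification for one automated skill. The "zendesk"
# source, queue name, and $500 threshold are hypothetical examples.
refund_request_trigger = {
    "type": "message",                  # message | resource_change | schedule | cascade
    "source": "zendesk/queue:billing",  # exactly which system, queue, and event type
    "conditions": [                     # when to activate vs. ignore
        "ticket.category == 'refund'",
        "ticket.amount_usd <= 500",     # larger amounts escalate to a human (step 4)
    ],
    "examples": [                       # concrete triggering events to avoid ambiguity
        "Customer replies 'please refund order #1234' on an open billing ticket",
    ],
    "non_examples": [
        "An internal note added to the ticket by another support rep",
    ],
}
```

Listing non-examples alongside examples is a cheap way to pin down the "activate vs. ignore" boundary before implementation.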

Step 3: Design Agent Skills

Transform each automated or augmented task into a detailed skill specification. This is the core of your Agent Story and requires the most thought.

For each skill, define:

Domain & Proficiencies

What knowledge area does this skill operate in? What specific things can the agent do within this domain? Be precise; vague proficiencies lead to vague implementations.

From replica: Your skill documentation shows what the human does. Translate those actions into agent capabilities.

Tools Required

What systems, APIs, or MCP servers does the agent need to execute this skill? What level of access (read, write, execute) is required? Under what conditions can each tool be used?

From replica: Your "tools used" list shows what the human accesses. Determine which the agent needs and at what permission level.

Reasoning Strategy

How does the agent make decisions? Is it rule-based (if X then Y), LLM-guided (interpret context and decide), or hybrid? Where are the key decision points, and what's the fallback when decisions fail?

From replica: Your "decision points" documentation reveals the logic. Decide which decisions the agent can make autonomously.

Memory Needs

What context does the agent need to maintain during execution (working memory)? What needs to persist across invocations (persistent memory)? Should the agent learn from feedback?

From replica: Your "inputs" documentation shows what information the human gathers. Determine what the agent needs to remember.

Quality Bar

What does competent execution look like? How will you know if the agent is performing this skill well? This should be measurable.

From replica: Your "outputs" documentation describes what success looks like. Translate to measurable criteria.

Step 4: Design Human Oversight

Even autonomous agents need guardrails and checkpoints. Design when and how humans stay in the loop.

Choose an autonomy level:

  • Full: no human role during execution; the agent has complete decision authority within guardrails.
  • Supervised: the human monitors and intervenes on exceptions; the agent executes independently but can be overridden.
  • Collaborative: the human actively participates in decisions; the agent proposes actions and the human confirms before execution.
  • Directed: the human initiates and guides each step; the agent executes specific instructions only.

Define checkpoints:

  • Approval checkpoints: Actions that require human sign-off before proceeding (e.g., high-value transactions).
  • Input checkpoints: Points where human input is needed to continue (e.g., choosing between options).
  • Review checkpoints: Completed work the human should review (e.g., generated content before sending).
  • Escalation triggers: Conditions that should automatically route to human handling (e.g., agent confidence below threshold).

Define guardrails:

What should the agent never do? These aren't preferences; they're hard constraints that must be enforced.

From replica: Your "observed guardrails" documentation shows what the human never does. Translate these into agent constraints. Also consider: what new risks does automation create that the human couldn't create?

Step 5: Define Success Criteria

How will you know if the agent is working? Define measurable acceptance criteria that can be verified in testing and production.

Three types of criteria:

Functional criteria

Observable behaviors that indicate the agent works correctly.

Example: "Correctly categorizes incoming requests with 95% accuracy" rather than "handles requests well."

Quality criteria

Non-functional requirements like speed, reliability, and user experience.

Example: "Responds within 30 seconds" rather than "responds quickly."

Guardrail criteria

Verification that constraints are never violated.

Example: "Never sends customer communication without template match" rather than "follows communication guidelines."

Testability check: For each criterion, ask: "How would I write a test for this?" If you can't imagine the test, the criterion is too vague. Make it more specific.
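
One way to run the testability check is to draft each criterion directly as a test. The sketch below uses pytest-style Python; the agent interface, labeled dataset, and outbox fixture are hypothetical stand-ins for whatever harness you build.

```python
# Each criterion phrased so a test can verify it.
import time

def test_functional_accuracy(agent, labeled_requests):
    correct = sum(agent.categorize(req) == label for req, label in labeled_requests)
    assert correct / len(labeled_requests) >= 0.95   # "95% accuracy", not "handles well"

def test_quality_latency(agent, sample_request):
    start = time.monotonic()
    agent.respond(sample_request)
    assert time.monotonic() - start <= 30.0          # "within 30 seconds", not "quickly"

def test_guardrail_template_match(agent, outbox):
    for message in outbox.sent_by(agent):
        assert message.template_id is not None       # never sent without a template match
```

If a criterion can't be written this way, it isn't a criterion yet; make it more specific until it can.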

From Reengineering to Agent Story

Your completed Agentic Reengineering produces everything needed for the Agent Story in its full format. The mapping is direct:

Reengineering output → Agent Story section:

  • Task classification (automate) → Agent scope & Autonomy level
  • Trigger design → Trigger Specification
  • Skill domain & proficiencies → Skills array
  • Tools & permissions → Tools & Integrations
  • Reasoning strategy → Reasoning & Decisions
  • Memory design → Memory & State
  • Autonomy level & checkpoints → Human Collaboration
  • Agent relationships → Agent Collaboration
  • Success criteria & guardrails → Acceptance Criteria
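
To see how the outputs snap together, here is a skeleton of a single Agent Story in the same illustrative Python used above. The section names follow the mapping; the ellipses stand in for the step 2-5 sketches, and the skeleton itself is a sketch, not the canonical format.

```python
# One Agent Story assembled from the reengineering outputs (illustrative shape).
agent_story = {
    "agent_scope_and_autonomy": {"classification": "automate", "level": "supervised"},
    "trigger_specification":    ...,   # trigger block (step 2)
    "skills":                   [...], # skill specifications (step 3)
    "tools_and_integrations":   ...,   # tools & permissions from each skill
    "reasoning_and_decisions":  ...,   # reasoning strategy and fallbacks
    "memory_and_state":         ...,   # working / persistent memory design
    "human_collaboration":      ...,   # autonomy level, checkpoints, escalation (step 4)
    "agent_collaboration":      ...,   # cascade triggers and peer-agent handoffs
    "acceptance_criteria":      ...,   # functional, quality, and guardrail criteria (step 5)
}
```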

Best Practices for Reengineering

What works well

  • Start bold, then pull back. Assume more automation is possible than you think. Reality will constrain you soon enough.
  • Design for the 80% case first. Handle common scenarios well before worrying about edge cases.
  • Make guardrails tight initially. You can loosen constraints as you build trust. Tightening later is harder.
  • Document your reasoning. Why did you choose this autonomy level? Future you will want to know.
  • Involve domain experts. Your classification decisions need validation from people who do the work.

Common mistakes to avoid

  • Automating judgment-heavy tasks fully. If the human needed to think hard, the agent probably needs human oversight.
  • Vague acceptance criteria. "Works well" isn't testable. Specify exactly how you'll know it's working.
  • Skipping error handling. What happens when the agent fails? Design the failure path, not just the happy path.
  • Designing in isolation. Agents work in systems. Consider how this agent interacts with others.
  • Forgetting ethical review. Will automating this work displace people? Consider the human impact of automation.

Ready to start designing?

Get the copy-paste templates and see a complete worked example showing how a role is redesigned into Agent Stories.

View Templates & Worked Example