Agent Implementation Roadmap

This roadmap describes the LLM-first implementation path for AI agents in the EASM Platform. It is based on the current repository state: the frontend settings page exists, agent_runs exists in the database schema, and backend execution is not yet connected.

The implementation should start with LLM-backed Advisory Agents. Operational agents should remain recommendation-only until an approval workflow exists.

Phase 1: LLM-Backed Advisory Agents

Build the initial backend contracts and execute advisory agents through an LLM provider.

Deliverables:

backend/internal/agents
AgentCategory with advisory and operational values
AgentContext
AgentResponse
prompt builder
prompt templates for advisory agents
structured response validation
provider abstraction
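
As an illustration of the prompt builder and prompt template deliverables, the following is a minimal sketch using Go's standard text/template package. The AgentContext fields, the riskPrompt template text, and the BuildPrompt helper are all assumptions made for this example; the roadmap does not define them.

```go
package main

import (
	"strings"
	"text/template"
)

// AgentContext is a hypothetical shape for the structured platform context
// handed to an advisory agent; the real fields are not defined in this roadmap.
type AgentContext struct {
	OrganizationID string
	OpenFindings   int
	Assets         []string
}

// riskPrompt is an illustrative template for the Risk Agent (assumed wording).
var riskPrompt = template.Must(template.New("risk").Parse(
	"You are the Risk Agent for organization {{.OrganizationID}}.\n" +
		"Open findings: {{.OpenFindings}}\n" +
		"Assets:\n" +
		"{{range .Assets}}- {{.}}\n{{end}}" +
		"Respond with a single JSON object only."))

// BuildPrompt renders the context through the template and returns the prompt text.
func BuildPrompt(t *template.Template, ctx AgentContext) (string, error) {
	var sb strings.Builder
	if err := t.Execute(&sb, ctx); err != nil {
		return "", err
	}
	return sb.String(), nil
}
```

Keeping templates as named values separate from the builder would let each advisory agent ship its own template while sharing one rendering path.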

Initial agents:

Risk Agent
Executive Summary Agent
Attack Path Agent

Acceptance criteria:

advisory agents receive structured platform context
context is rendered through prompt templates
model output is parsed as JSON
invalid model output is rejected
advisory agents do not create scans or mutate scope
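
The parsing and rejection criteria above could be met along these lines. This is a sketch under assumptions: the AgentResponse field names and the ParseAgentResponse helper are hypothetical, and only the AgentCategory values come from this roadmap.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// AgentCategory distinguishes advisory agents from operational ones.
type AgentCategory string

const (
	CategoryAdvisory    AgentCategory = "advisory"
	CategoryOperational AgentCategory = "operational"
)

// AgentResponse is an assumed schema; the real fields are not specified here.
type AgentResponse struct {
	Agent    string        `json:"agent"`
	Category AgentCategory `json:"category"`
	Summary  string        `json:"summary"`
	Findings []string      `json:"findings"`
}

// ParseAgentResponse accepts only a single JSON object that matches the
// schema and rejects prose, unknown fields, trailing text, and empty values.
func ParseAgentResponse(raw string) (AgentResponse, error) {
	var resp AgentResponse
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&resp); err != nil {
		return AgentResponse{}, fmt.Errorf("model output is not valid JSON for the schema: %w", err)
	}
	// A second Decode must hit EOF; anything else means trailing content.
	var extra json.RawMessage
	if err := dec.Decode(&extra); err != io.EOF {
		return AgentResponse{}, errors.New("model output has trailing content after the JSON object")
	}
	if resp.Agent == "" || resp.Summary == "" {
		return AgentResponse{}, errors.New("model output is missing agent or summary")
	}
	if resp.Category != CategoryAdvisory && resp.Category != CategoryOperational {
		return AgentResponse{}, fmt.Errorf("unknown agent category %q", resp.Category)
	}
	return resp, nil
}
```

Rejecting unknown fields is a deliberately strict choice in this sketch: a model that invents an extra key such as a scan trigger fails validation instead of being silently ignored.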

Phase 2: Agent Execution History

Persist and expose agent execution history.

Deliverables:

agent_runs repository
run status tracking
stored input context
stored validated output
stored provider/model metadata
failed run error details
organization-scoped run listing

Acceptance criteria:

every run is auditable
failed provider calls are visible
users only see runs for organizations they can access
scan success is not blocked by agent failure
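
One way to satisfy these criteria is to record every run, including failures, and never propagate the agent error to the caller. The sketch below assumes an in-memory RunStore standing in for the agent_runs repository; the AgentRun fields, status names, and Execute helper are hypothetical.

```go
package main

import (
	"errors"
	"time"
)

// RunStatus values are assumed names for run status tracking.
type RunStatus string

const (
	RunPending   RunStatus = "pending"
	RunSucceeded RunStatus = "succeeded"
	RunFailed    RunStatus = "failed"
)

// ErrProviderUnavailable is an example provider failure used for illustration.
var ErrProviderUnavailable = errors.New("provider unavailable")

// AgentRun is a hypothetical row shape for agent_runs.
type AgentRun struct {
	ID, OrganizationID, Agent string
	Provider, Model           string
	Status                    RunStatus
	InputContext, Output      string
	Error                     string
	StartedAt, FinishedAt     time.Time
}

// RunStore is an in-memory stand-in for the repository (not concurrency-safe).
type RunStore struct{ runs []AgentRun }

// Save appends a finished run to the store.
func (s *RunStore) Save(r AgentRun) { s.runs = append(s.runs, r) }

// ListForOrganization returns only the runs that belong to one organization.
func (s *RunStore) ListForOrganization(orgID string) []AgentRun {
	var out []AgentRun
	for _, r := range s.runs {
		if r.OrganizationID == orgID {
			out = append(out, r)
		}
	}
	return out
}

// Execute calls the agent, persists the outcome either way, and returns the
// stored run instead of an error so a scan pipeline is never blocked by it.
func Execute(store *RunStore, run AgentRun, call func() (string, error)) AgentRun {
	run.Status = RunPending
	run.StartedAt = time.Now()
	out, err := call()
	run.FinishedAt = time.Now()
	if err != nil {
		run.Status = RunFailed
		run.Error = err.Error()
	} else {
		run.Status = RunSucceeded
		run.Output = out
	}
	store.Save(run)
	return run
}
```

A real repository would additionally check that the requesting user may access the organization before calling ListForOrganization.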

Phase 3: Operational Recommendation Workflow

Add operational agents as suggestion-only modules.

Agents:

Recon Agent
Pentest Agent

Expected outputs:

suggest_scan
flag_asset
suggest_profile
suggest_plugin_run
validation steps
investigation notes

Acceptance criteria:

operational agents do not auto-run scans
all scan-related actions include requires_approval = true
suggested targets are validated against approved scope
suggestions are stored in agent_runs
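
The approval-flag and scope checks above might look like the following sketch. It assumes that approved scope is expressed as a list of root domains and that suggest_scan, suggest_profile, and suggest_plugin_run count as scan-related; the SuggestedAction fields and ValidateAction helper are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// SuggestedAction is an assumed shape for one operational suggestion.
type SuggestedAction struct {
	Type             string `json:"type"`
	Target           string `json:"target"`
	RequiresApproval bool   `json:"requires_approval"`
}

// scanRelated lists the action types treated as scan-related (an assumption).
var scanRelated = map[string]bool{
	"suggest_scan":       true,
	"suggest_profile":    true,
	"suggest_plugin_run": true,
}

// inScope reports whether target equals an approved root domain or is a
// subdomain of one. The match is on whole labels, so "evil-example.com"
// does not match "example.com".
func inScope(target string, approved []string) bool {
	target = strings.ToLower(strings.TrimSuffix(target, "."))
	for _, root := range approved {
		root = strings.ToLower(root)
		if target == root || strings.HasSuffix(target, "."+root) {
			return true
		}
	}
	return false
}

// ValidateAction rejects scan-related actions that lack the approval flag
// and any action whose target falls outside the approved scope.
func ValidateAction(a SuggestedAction, approved []string) error {
	if scanRelated[a.Type] && !a.RequiresApproval {
		return fmt.Errorf("%s must set requires_approval = true", a.Type)
	}
	if !inScope(a.Target, approved) {
		return fmt.Errorf("target %q is outside approved scope", a.Target)
	}
	return nil
}
```

Validation of this kind belongs on the backend after parsing, so a model cannot bypass it by omitting or falsifying the flag.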

Phase 4: Approval System

Add a workflow for approving or rejecting operational recommendations.

Recommended endpoints:

POST /api/v1/agents/runs/{id}/actions/{action_id}/approve
POST /api/v1/agents/runs/{id}/actions/{action_id}/reject

Acceptance criteria:

clients cannot approve mutation actions
hackers can approve allowed operational actions in assigned organizations
admins can approve all actions
applied actions reference the original agent_run_id
audit history records before and after values
operational auto-execution remains disabled by default
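
The role rules in these criteria can be sketched as a single authorization check. The User and PendingAction types are hypothetical, and the set of action types a hacker may approve (hackerApprovable) is an assumption, since the roadmap only says "allowed operational actions".

```go
package main

// Role names follow the roles mentioned in this roadmap.
type Role string

const (
	RoleClient Role = "client"
	RoleHacker Role = "hacker"
	RoleAdmin  Role = "admin"
)

// User is an assumed shape; AssignedOrgs holds organization IDs the user may act in.
type User struct {
	Role         Role
	AssignedOrgs map[string]bool
}

// PendingAction keeps the originating run ID so an applied action can
// reference the original agent_run_id.
type PendingAction struct {
	RunID          string
	OrganizationID string
	Type           string
}

// hackerApprovable is an assumed allow-list of action types for hackers.
var hackerApprovable = map[string]bool{
	"suggest_scan":       true,
	"suggest_plugin_run": true,
	"flag_asset":         true,
}

// CanApprove applies the role rules: admins approve everything, hackers
// approve allowed actions in assigned organizations, clients approve nothing.
func CanApprove(u User, a PendingAction) bool {
	switch u.Role {
	case RoleAdmin:
		return true
	case RoleHacker:
		return u.AssignedOrgs[a.OrganizationID] && hackerApprovable[a.Type]
	default:
		return false
	}
}
```

Defaulting to false for any unrecognized role keeps the check closed if new roles are added later.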

Phase 5: Multi-Provider Support

Add interchangeable hosted LLM providers.

Providers:

Claude
OpenAI
Gemini

Common interface:

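import "context"

// LLMProvider sends a rendered prompt to a model and returns its raw text output.
type LLMProvider interface {
    Generate(ctx context.Context, prompt string) (string, error)
}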

Acceptance criteria:

agents do not depend on a specific provider
provider timeout is configurable
malformed JSON output is rejected
provider errors are stored in agent_runs
provider prompts never include secrets
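
A configurable provider timeout can be layered on top of the common interface without touching individual providers. The sketch below assumes a wrapper type, TimeoutProvider, plus a SlowProvider test double and an Ask helper; none of these names come from the repository.

```go
package main

import (
	"context"
	"time"
)

// LLMProvider mirrors the common interface from this roadmap.
type LLMProvider interface {
	Generate(ctx context.Context, prompt string) (string, error)
}

// TimeoutProvider wraps any provider and enforces a configurable deadline.
type TimeoutProvider struct {
	Inner   LLMProvider
	Timeout time.Duration
}

// Generate derives a context with the configured timeout and delegates.
func (p TimeoutProvider) Generate(ctx context.Context, prompt string) (string, error) {
	ctx, cancel := context.WithTimeout(ctx, p.Timeout)
	defer cancel()
	return p.Inner.Generate(ctx, prompt)
}

// SlowProvider is a test double that replies after Delay unless cancelled.
type SlowProvider struct {
	Delay time.Duration
	Reply string
}

// Generate waits for the delay or for context cancellation, whichever is first.
func (s SlowProvider) Generate(ctx context.Context, _ string) (string, error) {
	select {
	case <-time.After(s.Delay):
		return s.Reply, nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// Ask is a convenience helper that calls a provider with a background context.
func Ask(p LLMProvider, prompt string) (string, error) {
	return p.Generate(context.Background(), prompt)
}
```

With this shape, a call such as `Ask(TimeoutProvider{Inner: SlowProvider{Delay: time.Minute}, Timeout: 30 * time.Second}, prompt)` would return a deadline error that the caller can store in agent_runs. The wrapper only works if each concrete provider honors context cancellation.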

Phase 6: Local Model Support

Add local model support through Ollama.

Acceptance criteria:

local provider uses the same LLMProvider interface
local model output uses the same AgentResponse schema
local model failures are handled like hosted provider failures
deployments can disable hosted providers entirely
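
Disabling hosted providers could be handled at provider selection time. In this sketch the provider identifiers ("claude", "openai", "gemini", "ollama"), the ProviderSpec type, and the allowHosted flag are assumptions for illustration; the actual configuration mechanism is not defined in this roadmap.

```go
package main

import "fmt"

// ProviderSpec describes one configured provider; Hosted marks providers
// that send prompts to an external service.
type ProviderSpec struct {
	Name   string
	Hosted bool
}

// knownProviders lists the providers named in this roadmap under assumed IDs.
var knownProviders = []ProviderSpec{
	{Name: "claude", Hosted: true},
	{Name: "openai", Hosted: true},
	{Name: "gemini", Hosted: true},
	{Name: "ollama", Hosted: false},
}

// SelectProvider returns the named provider, refusing hosted providers when
// the deployment has disabled them.
func SelectProvider(name string, allowHosted bool) (ProviderSpec, error) {
	for _, p := range knownProviders {
		if p.Name != name {
			continue
		}
		if p.Hosted && !allowHosted {
			return ProviderSpec{}, fmt.Errorf("hosted provider %q is disabled in this deployment", name)
		}
		return p, nil
	}
	return ProviderSpec{}, fmt.Errorf("unknown provider %q", name)
}
```

Enforcing the switch in one selection function means no agent code path can reach a hosted provider when the deployment forbids it.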

Phase 7: Multi-Agent Collaboration

Add controlled collaboration between agents after individual agents are stable.

Examples:

Risk Agent uses Attack Path Agent output as additional context
Executive Summary Agent summarizes Risk and Attack Path results
Operational agents use advisory summaries to prioritize suggestions

Acceptance criteria:

collaboration uses stored structured outputs, not hidden model state
each agent run remains separately auditable
circular agent dependencies are prevented
cross-organization context is never mixed
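
Preventing circular agent dependencies amounts to cycle detection on the graph of which agent consumes which agent's stored output. The following is a sketch using depth-first search; the FindCycle helper and the agent identifiers in the usage note are hypothetical.

```go
package main

import "fmt"

// FindCycle walks the dependency graph depth-first and returns an error if
// any agent transitively depends on itself. deps maps an agent to the agents
// whose stored outputs it consumes.
func FindCycle(deps map[string][]string) error {
	const (
		unvisited = iota
		visiting
		done
	)
	state := map[string]int{}

	var visit func(agent string) error
	visit = func(agent string) error {
		switch state[agent] {
		case visiting:
			// Reaching a node that is still on the current path means a cycle.
			return fmt.Errorf("circular agent dependency involving %q", agent)
		case done:
			return nil
		}
		state[agent] = visiting
		for _, dep := range deps[agent] {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[agent] = done
		return nil
	}

	for agent := range deps {
		if err := visit(agent); err != nil {
			return err
		}
	}
	return nil
}
```

For the examples above, a graph where "risk" depends on "attack_path" and "executive_summary" depends on both passes the check, while adding "attack_path" depending on "executive_summary" would be rejected. Running the check when collaboration is configured, not at execution time, keeps invalid graphs from ever producing runs.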

Future Research Context Integration

A later agent phase may add hxresearch/ as a curated context source for advisory and operational agents.

Potential inputs:

hxresearch/advisories for project-authored advisory context
hxresearch/writeups for technical analysis and remediation context
hxresearch/nuclei for detection provenance and template explanations
hxresearch/datasets for fingerprint or technology reference material

Acceptance criteria for any future integration:

agents only receive hxresearch content relevant to the authorized organization context
research content is treated as supporting context, not trusted autonomous instruction
prompts do not include secrets or uncontrolled PoC payloads
generated actions remain subject to the same approval and scope rules

This integration is planned for a future phase and is not part of current agent execution.

Non-Goals for Initial Release

Do not include these in the first backend implementation:

autonomous exploitation
operational auto-execution
automatic scope expansion
automatic scan launch from LLM output
cross-organization correlation
frontend state as the production source of truth for provider keys
unvalidated free-form model output

Verification Checklist

Before enabling agents in production, verify:

disabled agents do not run
prompts contain only authorized organization data
advisory agents cannot create scans or mutate scope
operational agents remain suggestion-only
operational actions require approval
client users cannot trigger mutation actions
hacker users cannot access unassigned organization agent runs
admin users can view all agent runs
failed provider calls do not fail scans
invalid JSON output is rejected
agent_runs stores context, output, status, provider, model, and errors