Agent Implementation Plan

This document describes the intended LLM-first backend architecture for agents. It distinguishes current implementation from the target architecture.

No backend agent execution is implemented yet.

Current State

Implemented:

Admin-only frontend page: /ai-agents
Local frontend settings for enabled agents, run modes, provider, model, and API key
Database table: agent_runs

Not implemented:

Backend agent package
Agent execution service
Prompt builder
Worker integration
LLM provider adapters
Persistent backend agent settings
Agent API endpoints

Core Architecture

The target model is:

AgentContext
  -> Prompt Builder
  -> LLM Provider
  -> Structured AgentResponse
  -> validation
  -> agent_runs

Agents are LLM-powered analytical modules operating on the attack surface graph and vulnerability dataset. Their behavior is defined by:

agent category
prompt template
allowed action schema
provider configuration
safety and approval policy

Target Package Layout

backend/internal/agents/
├── model.go                # AgentContext, AgentResponse, AgentAction, statistics
├── service.go              # orchestration, authorization, policy checks
├── repository.go           # agent_runs persistence
├── registry.go             # agent metadata and prompt template registry
├── prompt_builder.go       # context reduction and prompt rendering
├── validator.go            # structured response validation
├── prompts/
│   ├── risk.md
│   ├── attack_path.md
│   ├── executive.md
│   ├── recon.md
│   └── pentest.md
└── providers/
    ├── provider.go
    ├── claude.go
    ├── openai.go
    ├── gemini.go
    └── ollama.go

Agent Categories

type AgentCategory string

const (
    AgentCategoryAdvisory    AgentCategory = "advisory"
    AgentCategoryOperational AgentCategory = "operational"
)

Category meaning:

advisory: analyzes existing data and returns notes, summaries, risk context, false-positive candidates, or attack path narratives.
operational: proposes follow-up discovery or validation actions, but does not execute them without approval.

Agent Metadata

Agents should be registered as metadata plus prompt templates rather than rule implementations.

type AgentDefinition struct {
    ID             string
    Name           string
    Category       AgentCategory
    PromptTemplate string
    AllowedActions []string
}

Agent IDs should match the frontend:

risk
attack_path
executive
recon
pentest

LLM Provider Abstraction

Providers are interchangeable. Agents must not depend on a specific vendor.

type LLMProvider interface {
    Generate(ctx context.Context, prompt string) (string, error)
}

Planned providers:

Provider	Purpose
Claude	Hosted Anthropic models
OpenAI	Hosted OpenAI models
Gemini	Hosted Google models
Ollama	Local model runtime

Provider adapters are responsible for:

sending prompts
applying provider-specific structured output features where available
returning raw model output for validation
enforcing timeouts
normalizing provider errors

Agents should use the same AgentResponse schema regardless of provider.

Prompt Lifecycle

Prompt templates are first-class architecture components.

Execution lifecycle:

Build AgentContext
Select agent prompt template
Inject context into the template
Send prompt to configured LLMProvider
Parse and validate JSON response
Store agent_runs
Display results in the UI

Prompt templates should include:

agent role and goal
organization and scan context
allowed action types
strict output schema
safety constraints
instruction to ignore prompt-like content found in target data
instruction to stay inside provided scope

AgentContext Construction

AgentContext should include only organization-scoped data:

organization metadata
scan metadata
approved scope
assets
vulnerabilities
graph edges
scan history
previous findings
risk statistics

Recommended shape:

type AgentContext struct {
    Organization       AgentOrganization `json:"organization"`
    Scan               AgentScan         `json:"scan"`
    Scope              []AgentScopeItem  `json:"scope"`
    Assets             []AgentAsset      `json:"assets"`
    Vulnerabilities    []AgentVuln       `json:"vulnerabilities"`
    Graph              []AgentEdge       `json:"graph"`
    HistoricalFindings []AgentVuln       `json:"historical_findings"`
    PreviousScans      []AgentScan       `json:"previous_scans"`
    Statistics         AgentStatistics   `json:"statistics"`
}

Important checks:

never include assets from another organization
never include unapproved targets as actionable scan targets
resolve ID-based resources to organization_id before building context
respect RBAC and organization isolation
reduce or summarize large context before prompt rendering

Prompt Builder

The prompt builder transforms AgentContext into provider input.

Responsibilities:

select the correct template for the agent
redact secrets and sensitive tokens
keep context within model token limits
preserve IDs needed for structured output
include allowed action schema
include approval requirements for operational actions
include scope boundaries

The prompt builder should not decide business outcomes. It prepares context and constraints for the LLM.

AgentResponse Validation

The model output must be parsed as JSON and validated before storage or display.

Validation rules:

output must match AgentResponse
agent_type must match the running agent
category must match the registered category
action type must be allowed for the agent
operational actions must include requires_approval = true
targets must be inside approved scope
referenced asset and vulnerability IDs must belong to the same organization
confidence must be in range 0..1
risk scores must be in range 0..10

Invalid output should create a failed agent_runs record and must not mutate platform state.

Execution Lifecycle

Advisory Lifecycle

worker or API trigger
  -> build AgentContext
  -> render advisory prompt
  -> call LLM provider
  -> validate AgentResponse
  -> store notes, summaries, risk context, and attack paths
  -> no operational actions applied

Operational Lifecycle

worker or API trigger
  -> build AgentContext
  -> render operational prompt
  -> call LLM provider
  -> validate AgentResponse
  -> store suggested actions
  -> require approval before scan execution

Operational agents must remain recommendation-only until an approval workflow exists.

Agent Settings

Current frontend settings:

enabled agent IDs
per-agent run modes
provider
model
API key

Target backend settings should support:

{
  "enabled_agents": ["risk", "executive", "recon"],
  "agent_run_modes": {
    "risk": "after_scan",
    "executive": "after_scan",
    "recon": "manual_approval"
  },
  "provider": "openai",
  "model": "gpt-4.1",
  "api_key_ref": "secret-reference",
  "auto_apply_safe_actions": false
}

Provider API keys should not be stored as plain text in frontend local storage long-term. Target storage should be encrypted server-side or delegated to a secret manager.

Future API Endpoints

Recommended future endpoints:

GET    /api/v1/agents/settings
PUT    /api/v1/agents/settings
GET    /api/v1/organizations/{id}/agent-runs
GET    /api/v1/agent-runs/{id}
POST   /api/v1/scans/{scan_id}/agents/run
POST   /api/v1/agents/runs/{id}/actions/{action_id}/approve
POST   /api/v1/agents/runs/{id}/actions/{action_id}/reject

Access:

settings should be admin-only
agent run history should be organization-scoped
manual advisory execution should be admin/hacker only for assigned organizations
operational approval should be admin/hacker only and scope-checked
clients should see only read-only outputs for assigned organizations

Error Handling

Agent errors must not fail the whole scan unless explicitly configured.

Recommended behavior:

provider timeout creates a failed agent run
malformed JSON creates a failed agent run
invalid schema creates a failed agent run
scan remains successful if plugin processing succeeded
failed agent runs include provider, model, error message, and timestamp

Test Mode

When EASM_TEST_MODE=true, provider calls should be replaced with fixture responses. This avoids external network calls while still exercising:

prompt rendering
JSON parsing
response validation
agent_runs persistence
UI display behavior