Technical Design — Component Overview

Part of Project Kaze Architecture

High-level design for the 6 MVP platform components. Covers what each component does, its inputs/outputs, how components connect, and key workflows.

Implementation status: Components 1 and 2 are implemented and Component 3 is partially implemented. See plan.md for details.


1. Agent Runtime

Implemented: kaze-runtime (port 4100)

The core execution engine that manages agent lifecycles and task execution.

What It Does

  • Loads agent and skill definitions (YAML + optional TypeScript handlers)
  • Spawns agent instances bound to a tenant
  • Dispatches tasks to agents (one task at a time per agent, actor model)
  • Manages agent lifecycle: initializing → ready → executing → idle → shutdown
  • Enforces per-skill supervision levels (supervised / sampling / autonomous)
  • Routes inter-agent communication (direct calls in MVP, NATS in Phase 2)

Inputs / Outputs

Input | Output
Skill definition (YAML) | Loaded, validated skill ready for composition
Agent definition (YAML) | Running agent instance bound to a tenant
Task request (skill name + input data + initiator) | Task result (output data + metrics + approval status)

Key Workflows

Agent Spawn:

Load agent YAML → Resolve skill definitions → Load TS handlers (if any)
→ Connect to LLM Gateway, Knowledge, Tools → Set state = ready

Task Execution:

Receive task → Check supervision level for this skill
→ If supervised: execute → queue output for human review → wait for approval
→ If sampling: execute → randomly sample X% for review → deliver rest immediately
→ If autonomous: execute → deliver
→ Log everything to Observation Logger
→ Update supervision ramp statistics
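
The branching above can be sketched as a small pure function. This is a minimal illustration, not the runtime's actual code: the names `routeOutput`, `TaskOutcome`, and `sampleRate` are assumptions, and logging and ramp statistics are left out.

```typescript
type SupervisionLevel = "supervised" | "sampling" | "autonomous";

interface TaskOutcome {
  output: string;
  delivery: "delivered" | "pending_review";
}

// Decide what happens to a finished task's output.
// sampleRate is the fraction (0..1) of sampling-level outputs held for review.
// random is injectable so the decision can be tested deterministically.
function routeOutput(
  level: SupervisionLevel,
  output: string,
  sampleRate: number,
  random: () => number = Math.random,
): TaskOutcome {
  if (level === "supervised") {
    return { output, delivery: "pending_review" };
  }
  if (level === "sampling" && random() < sampleRate) {
    return { output, delivery: "pending_review" };
  }
  // Autonomous outputs and unsampled outputs are delivered immediately.
  return { output, delivery: "delivered" };
}
```

Injecting the random source keeps the sampling decision reproducible in tests while production code can rely on the default.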

Supervision Ramp:

Track per-skill stats (success rate, approval rate, total runs)
→ When thresholds met (e.g., 50 runs at 95% approval) → promote to next level
→ If quality drops → demote back
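
A hedged sketch of the promotion and demotion rule follows. The 50-run and 95% figures come from the example above; the `demoteBelow` threshold and all type and function names are hypothetical.

```typescript
const LEVELS = ["supervised", "sampling", "autonomous"] as const;
type RampLevel = (typeof LEVELS)[number];

interface SkillStats {
  totalRuns: number;
  approvedRuns: number;
}

interface RampPolicy {
  minRuns: number;      // e.g. 50
  promoteAt: number;    // e.g. 0.95 approval rate
  demoteBelow: number;  // assumed threshold; the source only says "if quality drops"
}

function nextLevel(current: RampLevel, stats: SkillStats, policy: RampPolicy): RampLevel {
  if (stats.totalRuns === 0) return current;
  const idx = LEVELS.indexOf(current);
  const approvalRate = stats.approvedRuns / stats.totalRuns;
  // Demotion is checked first so a quality drop always wins.
  if (approvalRate < policy.demoteBelow && idx > 0) {
    return LEVELS[idx - 1] ?? current;
  }
  const canPromote = stats.totalRuns >= policy.minRuns && approvalRate >= policy.promoteAt;
  if (canPromote && idx < LEVELS.length - 1) {
    return LEVELS[idx + 1] ?? current;
  }
  return current;
}
```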

Connections

  • Uses: LLM Gateway (for agent reasoning), Knowledge System (for memory), Tool Framework (for external actions), Observation Logger (for all events)
  • Used by: OpenClaw (dispatches tasks from user conversation), Task Scheduler (dispatches cron/event tasks)

2. LLM Gateway

Implemented: kaze-gateway (port 4200)

Abstraction layer between all agents and all LLM providers.

What It Does

  • Provides a unified complete() interface — agents never hold API keys or call providers directly
  • Routes requests to the right provider/model/key based on tenant config
  • Tracks token usage per tenant, per agent, per key
  • Enforces budget limits as hard stops in deterministic code, not through AI reasoning
  • Rate limits per provider to respect API quotas
  • Falls back to alternative providers on failure
  • Supports model hints (fast/balanced/best) that resolve to concrete models per tenant

Inputs / Outputs

Input | Output
Messages + model hint + caller context (tenant, agent) | Completion response + token usage + cost + latency
Texts + caller context | Embedding vectors
Budget query (tenant, agent) | Remaining budget info

Key Workflows

Request Routing:

Agent calls complete(messages, modelHint="balanced", context)
→ Resolve model hint to concrete model via tenant config
→ Look up tenant's key for that provider (Vault); if none is configured, fall back to Speedrun key
→ Check budget → reject if exceeded
→ Check rate limit → queue if throttled
→ Send to provider → return response
→ Log usage (tokens, cost, latency) to budget tracker + Observation Logger
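
The first three routing steps (hint resolution, key selection, budget check) might look like the sketch below. It is illustrative only: `TenantConfig`, `RoutePlan`, and `planRequest` are assumed names, and the rate limiting, provider call, and usage logging are omitted.

```typescript
type ModelHint = "fast" | "balanced" | "best";

interface TenantConfig {
  models: Record<ModelHint, { provider: string; model: string }>;
  providerKeys: Record<string, string>; // provider -> reference to the tenant's own key
  budgetUsd: number;
  spentUsd: number;
}

type RoutePlan =
  | { status: "routed"; provider: string; model: string; keySource: "tenant" | "speedrun" }
  | { status: "rejected"; reason: "budget_exceeded" };

function planRequest(hint: ModelHint, tenant: TenantConfig): RoutePlan {
  // 1. Resolve the hint to a concrete provider/model for this tenant.
  const { provider, model } = tenant.models[hint];
  // 2. Prefer the tenant's own key; otherwise fall back to the Speedrun key.
  const keySource = provider in tenant.providerKeys ? "tenant" : "speedrun";
  // 3. Budget is a plain comparison: no retries, no model involvement.
  if (tenant.spentUsd >= tenant.budgetUsd) {
    return { status: "rejected", reason: "budget_exceeded" };
  }
  return { status: "routed", provider, model, keySource };
}
```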

Fallback Chain:

Provider A fails (rate limit / down / error)
→ Try Provider B (if tenant config allows)
→ Try Provider C
→ After max attempts → return error to agent
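
As a rough sketch, the chain is a loop over the providers the tenant config allows, stopping at the first success. The provider calls are modeled as synchronous functions to keep the example short (a real gateway would await network calls), and `ProviderAttempt` and `completeWithFallback` are hypothetical names.

```typescript
interface ProviderAttempt {
  provider: string;
  call: () => string; // throws on rate limit, outage, or provider error
}

function completeWithFallback(
  chain: ProviderAttempt[],
  maxAttempts: number,
): { provider: string; text: string } {
  const errors: string[] = [];
  for (const { provider, call } of chain.slice(0, maxAttempts)) {
    try {
      return { provider, text: call() };
    } catch (err) {
      // Remember why this provider failed, then move to the next one.
      const message = err instanceof Error ? err.message : String(err);
      errors.push(provider + ": " + message);
    }
  }
  throw new Error("all providers failed (" + errors.join("; ") + ")");
}
```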

Connections

  • Uses: Vault (key retrieval), Observation Logger (usage logging)
  • Used by: Agent Runtime (every LLM call from every agent), Knowledge System (embedding generation, quality gate evaluation)

3. Knowledge System (Mem0 + pgvector)

Partially implemented: kaze-knowledge (port 4300)

MVP scope: Mem0 per-agent episodic memory only. Shared knowledge tiers, quality gates, ABAC, and graph traversal are deferred.

Persistent memory and knowledge layer for all agents.

What It Does

  • Stores 4 types of memory: episodic (events/history), semantic (facts/relationships), procedural (skills/how-to), reflective (insights/learnings)
  • Retrieves relevant memories using tri-factor scoring: recency × importance × relevance
  • Versions all knowledge writes with provenance (who, when, why — git-inspired)
  • Enforces access control: private tier (agent-scoped) and shared tier (vertical-scoped)
  • Quality-gates shared knowledge entries before they're visible to other agents
  • Routes storage: Mem0 for per-agent episodic memory, pgvector for other private memory types and for shared knowledge

Inputs / Outputs

Input | Output
Query (text + memory types + caller context) | Ranked memory entries with scores
Commit (content + memory type + access tier + metadata) | Version ID + acceptance status (accepted / pending review / rejected)
Memory ID | Full entry with version history

Key Workflows

Knowledge Query:

Agent queries "what do we know about Client X's SEO strategy?"
→ Check ABAC: can this agent read from this domain?
→ Search pgvector for semantic similarity (relevance)
→ Score results with tri-factor: recency × importance × relevance
→ Return top-N ranked results
→ Update last_accessed timestamps (for recency decay)
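
The scoring step could be sketched as follows. This uses the product form given in the component summary (recency × importance × relevance); a weighted sum would be a small change. The exponential decay with a one-day half-life, and every name here, are assumptions made for illustration.

```typescript
interface MemoryEntry {
  id: string;
  importance: number;     // 0..1, assigned at write time
  relevance: number;      // 0..1, e.g. vector similarity to the query
  lastAccessedMs: number; // epoch milliseconds
}

// Recency halves every halfLifeMs since the entry was last accessed.
function recency(lastAccessedMs: number, nowMs: number, halfLifeMs: number): number {
  const ageMs = Math.max(0, nowMs - lastAccessedMs);
  return Math.pow(0.5, ageMs / halfLifeMs);
}

function rankMemories(
  entries: MemoryEntry[],
  nowMs: number,
  topN: number,
  halfLifeMs: number = 24 * 60 * 60 * 1000,
): { id: string; score: number }[] {
  return entries
    .map((e) => ({
      id: e.id,
      score: recency(e.lastAccessedMs, nowMs, halfLifeMs) * e.importance * e.relevance,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}
```

With this shape, a slightly less relevant entry that was touched recently can outrank a more relevant one that has gone stale, which is the point of the recency term.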

Knowledge Write (Shared Tier):

Agent commits a new insight to shared vertical knowledge
→ Check ABAC: can this agent write to shared tier?
→ Run quality gate: accuracy check, contradiction detection, duplicate check
→ If passes → accept, version, make visible to vertical
→ If fails → reject with reason, agent can store privately instead

Memory Routing:

Private + episodic → Mem0 (per-agent collection)
Private + other types → pgvector (tenant-scoped)
Shared + any type → pgvector (shared tables, quality-gated)
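
These three rules reduce to a two-line decision. The sketch below assumes the type and backend names shown; only the routing logic itself is taken from the rules above.

```typescript
type MemoryType = "episodic" | "semantic" | "procedural" | "reflective";
type AccessTier = "private" | "shared";
type Backend = "mem0" | "pgvector_tenant" | "pgvector_shared";

function routeMemory(tier: AccessTier, type: MemoryType): Backend {
  // Shared entries always go to the quality-gated shared tables.
  if (tier === "shared") return "pgvector_shared";
  // Private entries split by memory type.
  return type === "episodic" ? "mem0" : "pgvector_tenant";
}
```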

Connections

  • Uses: LLM Gateway (embeddings, quality gate evaluation), Observation Logger
  • Used by: Agent Runtime (agents query/commit during task execution)

4. Tool Integration Framework

Typed, auth-managed access to external services and APIs.

What It Does

  • Defines tools with typed inputs/outputs, auth requirements, and retry policies
  • Manages a registry of available tools, filterable by vertical
  • Resolves auth credentials from Vault at execution time (agents never hold raw keys)
  • Handles retries with exponential backoff for transient failures
  • Rate limits tool calls to respect external API quotas
  • Logs all tool executions for observability

Inputs / Outputs

Input | Output
Tool name + input parameters + caller context | Typed result (success + data) or error (code + retryable flag)
Discovery query (vertical, category) | List of available tools with descriptions

MVP Tools

Tool | Vertical | What It Does
GitHub | V0 Internal Ops | List/create/update issues, PRs, comments
Calendar | V0 Internal Ops | List/create events, find free slots
SEMrush | V1 SEO | Keyword overview, keyword gap, domain analysis
Google Search Console | V1 SEO | Search performance, ranking data
Toddle DB | V2 Toddle | Query/update activities, check data freshness, get embeddings

Key Workflow

Tool Execution:

Agent calls tool("semrush_keyword_overview", { keyword: "..." })
→ Look up tool definition in registry
→ Fetch credentials from Vault (scoped to tenant)
→ Execute with timeout
→ On failure: retry per policy (exponential backoff, max attempts)
→ Log execution to Observation Logger
→ Return typed result to agent
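
The retry step can be illustrated with a small loop. This is a simplified sketch under stated assumptions: the tool call and the sleep are synchronous stand-ins for what would be awaited in practice, the delay doubles from a hypothetical `baseDelayMs`, and registry lookup, Vault access, and timeouts are omitted.

```typescript
interface RetryPolicy {
  maxAttempts: number;
  baseDelayMs: number;
}

type ToolResult<T> =
  | { kind: "success"; data: T }
  | { kind: "error"; code: string; retryable: boolean };

// Delay before retry number `attempt` (1-based): base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

function executeWithRetry<T>(
  run: () => ToolResult<T>,
  policy: RetryPolicy,
  sleep: (ms: number) => void,
): ToolResult<T> {
  let result = run();
  for (let attempt = 1; attempt < policy.maxAttempts; attempt++) {
    // Stop on success or on errors that retrying cannot fix (e.g. bad input).
    if (result.kind === "success" || !result.retryable) return result;
    sleep(backoffDelayMs(attempt, policy.baseDelayMs));
    result = run();
  }
  return result;
}
```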

Connections

  • Uses: Vault (credential retrieval), Observation Logger
  • Used by: Agent Runtime (agents invoke tools during skill execution)

5. Task Scheduler

Cron-based and event-triggered task execution for agents.

What It Does

  • Registers cron schedules (from agent YAML definitions or programmatically)
  • Registers event triggers (agent A completes → fire agent B)
  • Dispatches tasks to Agent Runtime when triggers fire
  • Ensures idempotency (no double-firing on scheduler restart)
  • Tracks execution history per schedule
  • Supports skip-if-running to prevent overlapping executions

Inputs / Outputs

Input | Output
Schedule definition (cron expression + target agent + skill + input) | Registered schedule ID
Event trigger definition (event type + target agent + skill) | Registered trigger ID
Emitted event | Dispatched task (via Agent Runtime)

Key Workflows

Cron Tick (every 10s):

Query schedules where next_run_at <= now
→ For each due schedule:
  → Check idempotency (not already fired for this timestamp)
  → Check skip-if-running (previous task still executing?)
  → Dispatch task to Agent Runtime
  → Update next_run_at
  → Record execution in history
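
One tick of this loop might be sketched as below, with in-memory data in place of the database. The names are hypothetical, a fixed `intervalMs` stands in for evaluating the cron expression, and advancing `nextRunAt` after a skipped run is an assumption.

```typescript
interface Schedule {
  id: string;
  nextRunAt: number;      // epoch milliseconds
  intervalMs: number;     // stand-in for evaluating a cron expression
  skipIfRunning: boolean;
  running: boolean;       // is the previous task still executing?
}

interface HistoryRecord {
  key: string;
  outcome: "dispatched" | "skipped";
}

function cronTick(
  schedules: Schedule[],
  now: number,
  firedKeys: Set<string>,
  dispatch: (s: Schedule) => void,
): HistoryRecord[] {
  const history: HistoryRecord[] = [];
  for (const s of schedules) {
    if (s.nextRunAt > now) continue; // not due yet
    // Idempotency key: one firing per schedule per scheduled timestamp.
    const key = s.id + "@" + s.nextRunAt;
    if (!firedKeys.has(key)) {
      firedKeys.add(key);
      if (s.skipIfRunning && s.running) {
        history.push({ key, outcome: "skipped" });
      } else {
        dispatch(s);
        history.push({ key, outcome: "dispatched" });
      }
    }
    s.nextRunAt += s.intervalMs;
  }
  return history;
}
```

Because the key includes the scheduled timestamp, a restart that replays the same due row finds the key already recorded and only advances `nextRunAt`.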

Event Trigger (MVP — direct callbacks):

Component calls scheduler.emit({ type: "task_completed", ... })
→ Match against registered event triggers
→ Check idempotency key
→ Dispatch matched tasks to Agent Runtime

HA Safety: Multiple scheduler replicas use database-level locking (FOR UPDATE SKIP LOCKED) so only one replica picks up each due schedule.
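
For illustration, a replica could claim due rows with a query shaped like the one built below. `FOR UPDATE SKIP LOCKED` is standard PostgreSQL; the `schedules` table name, the `id` column, and the helper itself are assumptions. The query only provides the guarantee when it runs inside a transaction that stays open until the claimed rows have been updated.

```typescript
interface SqlQuery {
  text: string;
  values: number[];
}

// Builds a parameterized query that locks up to batchSize due schedules.
// Rows already locked by another replica are skipped instead of waited on.
function claimDueSchedulesQuery(batchSize: number): SqlQuery {
  return {
    text: [
      "SELECT id, next_run_at",
      "FROM schedules",
      "WHERE next_run_at <= now()",
      "ORDER BY next_run_at",
      "LIMIT $1",
      "FOR UPDATE SKIP LOCKED",
    ].join("\n"),
    values: [batchSize],
  };
}
```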

Connections

  • Uses: Agent Runtime (dispatches tasks)
  • Used by: Agent definitions (declare cron triggers in YAML), other components (emit events)

6. Observation Logger

Structured logging of all agent activity for debugging, auditing, and future training.

What It Does

  • Records every significant event: agent lifecycle, task execution, LLM calls, tool calls, knowledge operations, supervision decisions, budget warnings
  • Provides a query interface for debugging (trace a task, view agent timeline, aggregate metrics)
  • Batches writes internally (fire-and-forget, never blocks agent execution)
  • Integrates automatically as middleware on LLM Gateway, Tool Executor, and Knowledge Client

Inputs / Outputs

Input | Output
Observation event (type + payload + context) | (fire-and-forget, async write)
Query filter (tenant, agent, task, time range) | Matching events
Task ID | Full execution trace across all agents
Metrics filter (tenant, time range, group-by) | Aggregate metrics (tasks, tokens, costs, error rates)

Event Types

Category | Events
Agent lifecycle | spawned, shutdown, state change
Task execution | started, completed, failed, timeout
LLM calls | start, complete, error (with provider, model, tokens, cost, latency)
Tool calls | start, complete, error (with tool name, duration, retry count)
Knowledge | query, commit (with memory type, domain, result count)
Supervision | review required, decision made (approved/rejected/modified)
Budget | warning (80% threshold), exceeded (hard stop)
Scheduler | cron triggered, event triggered

Key Design Choice

Logging is fire-and-forget with internal batching — events buffer in memory and flush to storage in batches (100 events or every 1 second). This ensures logging never becomes a bottleneck for agent execution. If the database is temporarily unavailable, events buffer up to a limit, dropping oldest debug-level events first.
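
A minimal sketch of such a buffer is shown below. The class and field names are hypothetical, the 1-second flush would come from a timer calling `flush()` (not shown), and dropping the oldest event of any level when no debug events remain is an assumption.

```typescript
type Severity = "debug" | "info" | "error";

interface ObservationEvent {
  type: string;
  severity: Severity;
  payload: unknown;
}

class BatchingLogger {
  private buffer: ObservationEvent[] = [];
  private readonly write: (batch: ObservationEvent[]) => void;
  private readonly batchSize: number;
  private readonly maxBuffered: number;

  constructor(write: (batch: ObservationEvent[]) => void, batchSize = 100, maxBuffered = 10000) {
    this.write = write; // expected to throw when storage is unavailable
    this.batchSize = batchSize;
    this.maxBuffered = maxBuffered;
  }

  // Never throws: callers are not blocked or failed by logging problems.
  log(event: ObservationEvent): void {
    this.buffer.push(event);
    if (this.buffer.length > this.maxBuffered) this.dropOne();
    if (this.buffer.length >= this.batchSize) this.flush();
  }

  flush(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer.slice(0, this.batchSize);
    try {
      this.write(batch);
      this.buffer.splice(0, batch.length);
    } catch (_err) {
      // Storage is down: keep the events buffered and retry on the next flush.
    }
  }

  get pending(): number {
    return this.buffer.length;
  }

  // Drop the oldest debug event, or the oldest event if none are debug-level.
  private dropOne(): void {
    const i = this.buffer.findIndex((e) => e.severity === "debug");
    this.buffer.splice(i >= 0 ? i : 0, 1);
  }
}
```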

Connections

  • Uses: PostgreSQL (event storage)
  • Used by: Every other component (automatic middleware integration)

Cross-Component Integration

As Implemented (MVP)

┌─ kaze-gateway (port 4200) ──────────────────────────────┐
│  POST /llm/generate     → Vercel AI SDK → Gemini/Claude │
│  POST /tools/execute    → credential injection → APIs   │
│  GET  /tools/catalog                                    │
│  Secrets: LLM keys, GitHub token                        │
│  Observability: Langfuse tracing                        │
└──────────────────────────▲──────────────────────────────┘
                           │ HTTP
┌──────────────────────────┴──────────────────────────────┐
│  kaze-runtime (port 4100)                               │
│  VerticalAgent → SubAgent (per-task, per-skill)         │
│  Memory: search before LLM, store after LLM             │
│  Zero secrets — calls gateway + knowledge via HTTP      │
└──────┬───────────────────────────────────▲──────────────┘
       │ HTTP                              │ HTTP
┌──────▼───────────────────────────────────┴──────────────┐
│  kaze-knowledge (port 4300)                             │
│  POST /memory/search    → vector similarity             │
│  POST /memory/add       → Mem0 fact extraction + store  │
│  Own LLM key (Gemini) for fact extraction + embeddings  │
│  Storage: PostgreSQL + pgvector                         │
└─────────────────────────────────────────────────────────┘

Full Design (Target)

                         ┌──────────────────────────────┐
                         │    OpenClaw (Layer 0.5)       │
                         │    User ↔ Agent conversation  │
                         └──────────────┬───────────────┘
                                        │ dispatches tasks
                                        ▼
┌───────────────────────────────────────────────────────────────┐
│                     Agent Runtime (1)                          │
│                                                               │
│  Spawns agents · Dispatches tasks · Manages lifecycle         │
│  Enforces supervision · Routes inter-agent messages           │
│                                                               │
│  Uses: LLM Gateway (2), Knowledge (3), Tools (4), Logger (6) │
└───────┬──────────────┬────────────────┬───────────────────────┘
        │              │                │
   ┌────▼────┐   ┌─────▼──────┐   ┌────▼──────────┐
   │ LLM     │   │ Knowledge  │   │ Tool          │
   │ Gateway │   │ System     │   │ Framework     │
   │   (2)   │   │   (3)      │   │   (4)         │
   │         │   │            │   │               │
   │ Multi-  │   │ Mem0 +     │   │ GitHub,       │
   │ provider│   │ pgvector   │   │ SEMrush,      │
   │ routing │   │ Tri-factor │   │ Calendar,     │
   │ Budget  │   │ ABAC       │   │ Toddle DB     │
   └─────────┘   └────────────┘   └───────────────┘
        │              │                │
        └──────────────┼────────────────┘
                       │ all events logged
                ┌──────▼──────┐    ┌───────────────┐
                │ Observation │    │ Task          │
                │ Logger (6)  │    │ Scheduler (5) │
                │             │    │               │
                │ All events  │    │ Cron + Event  │
                │ Batched     │    │ → dispatch()  │
                │ writes      │    │ to Runtime    │
                └─────────────┘    └───────────────┘

Error Handling Summary

Component | Key Failure Mode | Recovery
Agent Runtime | Agent crash during task | Mark task failed, log error, transition agent to error state. After cooldown, return to ready. 3 consecutive errors → alert ops.
Agent Runtime | Task timeout | Abort via signal, mark timeout, agent returns to ready if healthy.
LLM Gateway | Provider rate limited / down | Fallback to next provider in chain. Queue with backoff. Return error after max attempts.
LLM Gateway | Budget exceeded | Hard reject. No fallback, no retry. Deterministic code check.
Knowledge System | Quality gate rejects shared write | Return rejection reason. Agent can store privately instead.
Knowledge System | Mem0 unavailable | Graceful degradation: buffer episodic writes to Postgres directly. Reads return empty for Mem0-backed memories.
Tool Framework | External API error (retryable) | Exponential backoff per retry policy. After max attempts, return error to agent.
Tool Framework | Auth failure | Re-fetch from Vault (cache may be stale). Retry once. If still failing, return auth error.
Task Scheduler | Missed cron tick (restart) | On startup, scan for overdue schedules. Execute missed runs (up to a 1-hour lookback). Mark older ones as skipped.
Observation Logger | Database write failure | Buffer in memory (up to limit). Drop oldest debug events first. Never block agent execution.

Security Controls Per Component

Each component enforces security boundaries independently — no single component failure should compromise the system.

Component | Security Control | What It Enforces
Agent Runtime | Capability manifest enforcement | Agent can only invoke tools, knowledge domains, and channels declared in its manifest
Agent Runtime | Per-agent resource quotas | Max concurrent tasks, max tool calls per task, max subagent depth
Agent Runtime | Supervision state isolation | Agents cannot read or modify their own supervision statistics
LLM Gateway | Data classification check | Tags on knowledge entries ("safe for LLM" vs "internal only") respected before inclusion in prompts
LLM Gateway | Secret scanning in prompts | Detect and redact credentials/keys before sending to provider
LLM Gateway | Per-tenant request fairness | No single tenant can saturate the provider queue
Knowledge System | Shared knowledge quarantine | Writes to shared tier enter quarantine before becoming visible (configurable: time-based or review-based)
Knowledge System | Write rate limiting | Flag agents that write excessively to shared tier
Knowledge System | Provenance chain | Every shared entry traces to originating observation, not just agent identity
Tool Framework | Egress whitelist | Tool calls validated against per-tenant, per-vertical allowed endpoints
Tool Framework | Output scanning | Detect client-specific data in tool call parameters before sending externally
Observation Logger | Secret redaction | Scan all event payloads for credential patterns before storage
Observation Logger | Append-only audit | Observation events are immutable; no update or delete
Task Scheduler | Idempotency enforcement | Prevents duplicate task dispatch from replay or restart

The full threat model and attack surface analysis are in research/threat-model.md.
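
To make the secret redaction controls concrete, here is a hedged sketch of a payload scrubber. The two patterns (a GitHub personal access token prefix and an HTTP Bearer token) are examples only; a real deployment would need a broader, maintained pattern set, and `redact` is a hypothetical name.

```typescript
// Example credential patterns only; not an exhaustive or authoritative list.
const CREDENTIAL_PATTERNS: RegExp[] = [
  /ghp_[A-Za-z0-9]{36}/g,
  /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g,
];

// Returns a copy of the value with credential-like substrings replaced.
// Walks arrays and plain objects recursively; other values pass through unchanged.
function redact(value: unknown): unknown {
  if (typeof value === "string") {
    return CREDENTIAL_PATTERNS.reduce((s, re) => s.replace(re, "[REDACTED]"), value);
  }
  if (Array.isArray(value)) {
    return value.map(redact);
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) {
      out[k] = redact(v);
    }
    return out;
  }
  return value;
}
```

Returning a copy rather than mutating in place means the caller's original payload stays intact for the task itself while only the stored or transmitted form is scrubbed.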


Phase 2 Migration Notes

  • NATS: The inter-agent message envelope is designed so swapping DirectCallTransport for NatsTransport changes only the transport layer. Agent code, task definitions, and message shapes stay identical.
  • Apache AGE: The knowledge query/commit interface gains a graph traversal retrieval strategy alongside vector search. Same interface, new strategy option.
  • Layer 3 Agents: Supervisor and Quality Monitor agents consume the Observation Logger's event stream. Improvement agents write new versions of agent/skill definitions.
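
The transport swap described in the NATS note can be pictured with a small interface. `DirectCallTransport` is named in the note above; the envelope fields, the `Transport` interface, and the synchronous signatures are assumptions for illustration. A `NatsTransport` would implement the same interface on top of a NATS client and is not shown.

```typescript
interface MessageEnvelope {
  from: string;
  to: string;
  type: string;
  payload: unknown;
}

type Handler = (msg: MessageEnvelope) => void;

// Agents depend only on this interface, never on a concrete transport.
interface Transport {
  subscribe(agentId: string, handler: Handler): void;
  send(msg: MessageEnvelope): void;
}

// MVP transport: in-process delivery via direct function calls.
class DirectCallTransport implements Transport {
  private handlers = new Map<string, Handler>();

  subscribe(agentId: string, handler: Handler): void {
    this.handlers.set(agentId, handler);
  }

  send(msg: MessageEnvelope): void {
    const handler = this.handlers.get(msg.to);
    if (!handler) throw new Error("no agent subscribed as " + msg.to);
    handler(msg);
  }
}
```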