Technical Design — Component Overview
Part of Project Kaze Architecture
High-level design for the 6 MVP platform components. Covers what each component does, its inputs/outputs, how components connect, and key workflows.
Implementation status: Components 1-3 are implemented (the Knowledge System only partially). See plan.md for details.
1. Agent Runtime
Implemented — kaze-runtime (port 4100)
The core execution engine that manages agent lifecycles and task execution.
What It Does
- Loads agent and skill definitions (YAML + optional TypeScript handlers)
- Spawns agent instances bound to a tenant
- Dispatches tasks to agents (one task at a time per agent, actor model)
- Manages agent lifecycle: initializing → ready → executing → idle → shutdown
- Enforces per-skill supervision levels (supervised / sampling / autonomous)
- Routes inter-agent communication (direct calls in MVP, NATS in Phase 2)
Inputs / Outputs
| Input | Output |
|---|---|
| Skill definition (YAML) | Loaded, validated skill ready for composition |
| Agent definition (YAML) | Running agent instance bound to a tenant |
| Task request (skill name + input data + initiator) | Task result (output data + metrics + approval status) |
Key Workflows
Agent Spawn:
Load agent YAML → Resolve skill definitions → Load TS handlers (if any)
→ Connect to LLM Gateway, Knowledge, Tools → Set state = ready
Task Execution:
Receive task → Check supervision level for this skill
→ If supervised: execute → queue output for human review → wait for approval
→ If sampling: execute → randomly sample X% for review → deliver rest immediately
→ If autonomous: execute → deliver
→ Log everything to Observation Logger
→ Update supervision ramp statistics
Supervision Ramp:
Track per-skill stats (success rate, approval rate, total runs)
→ When thresholds met (e.g., 50 runs at 95% approval) → promote to next level
→ If quality drops → demote back
Connections
- Uses: LLM Gateway (for agent reasoning), Knowledge System (for memory), Tool Framework (for external actions), Observation Logger (for all events)
- Used by: OpenClaw (dispatches tasks from user conversation), Task Scheduler (dispatches cron/event tasks)
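The supervision ramp above reduces to a small deterministic check. A minimal sketch in TypeScript — the three level names and the "50 runs at 95% approval" promotion example come from the design, but the demotion threshold (below 80%) and the names `nextLevel` / `SkillStats` are illustrative assumptions:

```typescript
type SupervisionLevel = "supervised" | "sampling" | "autonomous";

interface SkillStats {
  totalRuns: number;
  approvalRate: number; // fraction of reviewed outputs approved, 0..1
}

// Hypothetical thresholds: 50 runs at >=95% approval promotes one level;
// approval below 80% (assumed cutoff) demotes one level.
function nextLevel(level: SupervisionLevel, stats: SkillStats): SupervisionLevel {
  const order: SupervisionLevel[] = ["supervised", "sampling", "autonomous"];
  const i = order.indexOf(level);
  if (stats.totalRuns >= 50 && stats.approvalRate >= 0.95 && i < order.length - 1) {
    return order[i + 1]; // thresholds met → promote
  }
  if (stats.approvalRate < 0.8 && i > 0) {
    return order[i - 1]; // quality dropped → demote
  }
  return level; // otherwise hold the current level
}
```

Keeping this as a pure function over per-skill stats is what lets the runtime isolate it from the agents themselves (see the security controls: agents cannot read or modify their own supervision statistics).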
2. LLM Gateway
Implemented — kaze-gateway (port 4200)
Abstraction layer between all agents and all LLM providers.
What It Does
- Provides a unified complete() interface — agents never hold API keys or call providers directly
- Routes requests to the right provider/model/key based on tenant config
- Tracks token usage per tenant, per agent, per key
- Enforces budget limits (hard stops, not AI reasoning)
- Rate limits per provider to respect API quotas
- Falls back to alternative providers on failure
- Supports model hints (fast/balanced/best) that resolve to concrete models per tenant
Inputs / Outputs
| Input | Output |
|---|---|
| Messages + model hint + caller context (tenant, agent) | Completion response + token usage + cost + latency |
| Texts + caller context | Embedding vectors |
| Budget query (tenant, agent) | Remaining budget info |
Key Workflows
Request Routing:
Agent calls complete(messages, modelHint="balanced", context)
→ Resolve model hint to concrete model via tenant config
→ Look up tenant's key for that provider (Vault) → fall back to Speedrun key
→ Check budget → reject if exceeded
→ Check rate limit → queue if throttled
→ Send to provider → return response
→ Log usage (tokens, cost, latency) to budget tracker + Observation Logger
Fallback Chain:
Provider A fails (rate limit / down / error)
→ Try Provider B (if tenant config allows)
→ Try Provider C
→ After max attempts → return error to agent
Connections
- Uses: Vault (key retrieval), Observation Logger (usage logging)
- Used by: Agent Runtime (every LLM call from every agent), Knowledge System (embedding generation, quality gate evaluation)
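The fallback chain can be sketched as a loop over an ordered provider list. This is not the kaze-gateway API — the `completeWithFallback` name and the provider-as-closure shape are assumptions for illustration:

```typescript
type ProviderCall = () => Promise<string>;

// Try each provider in the tenant-configured order; first success wins.
// On rate limit / outage / error, fall through to the next provider.
async function completeWithFallback(
  chain: { name: string; call: ProviderCall }[],
): Promise<string> {
  const errors: string[] = [];
  for (const provider of chain) {
    try {
      return await provider.call();
    } catch (err) {
      errors.push(`${provider.name}: ${err}`); // record and move on
    }
  }
  // After max attempts, surface the aggregate error to the calling agent.
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}
```

Note that the budget check deliberately sits outside this loop in the design: a budget rejection is a hard stop with no fallback, so it must fire before any provider is tried.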
3. Knowledge System (Mem0 + pgvector)
Partially implemented — kaze-knowledge (port 4300)
MVP scope: Mem0 per-agent episodic memory only. Shared knowledge tiers, quality gates, ABAC, and graph traversal are deferred.
Persistent memory and knowledge layer for all agents.
What It Does
- Stores 4 types of memory: episodic (events/history), semantic (facts/relationships), procedural (skills/how-to), reflective (insights/learnings)
- Retrieves relevant memories using tri-factor scoring: recency × importance × relevance
- Versions all knowledge writes with provenance (who, when, why — git-inspired)
- Enforces access control: private tier (agent-scoped) and shared tier (vertical-scoped)
- Quality-gates shared knowledge entries before they're visible to other agents
- Routes storage: Mem0 for per-agent episodic memory, pgvector for shared knowledge
Inputs / Outputs
| Input | Output |
|---|---|
| Query (text + memory types + caller context) | Ranked memory entries with scores |
| Commit (content + memory type + access tier + metadata) | Version ID + acceptance status (accepted / pending review / rejected) |
| Memory ID | Full entry with version history |
Key Workflows
Knowledge Query:
Agent queries "what do we know about Client X's SEO strategy?"
→ Check ABAC: can this agent read from this domain?
→ Search pgvector for semantic similarity (relevance)
→ Score results with tri-factor scoring: recency × importance × relevance
→ Return top-N ranked results
→ Update last_accessed timestamps (for recency decay)
Knowledge Write (Shared Tier):
Agent commits a new insight to shared vertical knowledge
→ Check ABAC: can this agent write to shared tier?
→ Run quality gate: accuracy check, contradiction detection, duplicate check
→ If passes → accept, version, make visible to vertical
→ If fails → reject with reason, agent can store privately instead
Memory Routing:
Private + episodic → Mem0 (per-agent collection)
Private + other types → pgvector (tenant-scoped)
Shared + any type → pgvector (shared tables, quality-gated)
Connections
- Uses: LLM Gateway (embeddings, quality gate evaluation), Observation Logger
- Used by: Agent Runtime (agents query/commit during task execution)
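Tri-factor scoring could look like the sketch below. The design only names the three factors; the weights, the weighted-sum combination, the 7-day half-life, and the `triFactorScore` name are all assumptions here:

```typescript
interface MemoryEntry {
  importance: number;     // 0..1, assigned at write time
  relevance: number;      // 0..1, vector similarity to the query
  lastAccessedMs: number; // epoch millis, refreshed on each read
}

// Assumed decay constant: recency halves every 7 days.
const HALF_LIFE_MS = 7 * 24 * 3600 * 1000;

function triFactorScore(m: MemoryEntry, nowMs: number): number {
  const age = Math.max(0, nowMs - m.lastAccessedMs);
  const recency = Math.pow(0.5, age / HALF_LIFE_MS); // exponential decay in [0, 1]
  // Assumed weights — relevance dominates, recency matters least.
  return 0.2 * recency + 0.3 * m.importance + 0.5 * m.relevance;
}
```

Updating last_accessed on every read (as the query workflow does) means frequently used memories resist decay, while stale ones sink in the ranking without ever being deleted.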
4. Tool Integration Framework
Typed, auth-managed access to external services and APIs.
What It Does
- Defines tools with typed inputs/outputs, auth requirements, and retry policies
- Manages a registry of available tools, filterable by vertical
- Resolves auth credentials from Vault at execution time (agents never hold raw keys)
- Handles retries with exponential backoff for transient failures
- Rate limits tool calls to respect external API quotas
- Logs all tool executions for observability
Inputs / Outputs
| Input | Output |
|---|---|
| Tool name + input parameters + caller context | Typed result (success + data) or error (code + retryable flag) |
| Discovery query (vertical, category) | List of available tools with descriptions |
MVP Tools
| Tool | Vertical | What It Does |
|---|---|---|
| GitHub | V0 Internal Ops | List/create/update issues, PRs, comments |
| Calendar | V0 Internal Ops | List/create events, find free slots |
| SEMrush | V1 SEO | Keyword overview, keyword gap, domain analysis |
| Google Search Console | V1 SEO | Search performance, ranking data |
| Toddle DB | V2 Toddle | Query/update activities, check data freshness, get embeddings |
Key Workflow
Tool Execution:
Agent calls tool("semrush_keyword_overview", { keyword: "..." })
→ Look up tool definition in registry
→ Fetch credentials from Vault (scoped to tenant)
→ Execute with timeout
→ On failure: retry per policy (exponential backoff, max attempts)
→ Log execution to Observation Logger
→ Return typed result to agent
Connections
- Uses: Vault (credential retrieval), Observation Logger
- Used by: Agent Runtime (agents invoke tools during skill execution)
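The retry step of the workflow above is a standard exponential-backoff loop keyed on the error's retryable flag. A sketch under assumed names (`executeWithRetry`, the `ToolError` shape) and assumed defaults (3 attempts, 200ms base delay):

```typescript
interface ToolError extends Error {
  retryable: boolean; // set by the tool definition's error mapping
}

async function executeWithRetry<T>(
  run: () => Promise<T>,
  maxAttempts = 3,    // assumed default
  baseDelayMs = 200,  // assumed default
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await run();
    } catch (err) {
      const retryable = (err as ToolError).retryable === true;
      // Non-retryable errors and exhausted attempts go straight back to the agent.
      if (!retryable || attempt >= maxAttempts) throw err;
      // Exponential backoff: 200ms, 400ms, 800ms, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}
```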
5. Task Scheduler
Cron-based and event-triggered task execution for agents.
What It Does
- Registers cron schedules (from agent YAML definitions or programmatic)
- Registers event triggers (agent A completes → fire agent B)
- Dispatches tasks to Agent Runtime when triggers fire
- Ensures idempotency (no double-firing on scheduler restart)
- Tracks execution history per schedule
- Supports skip-if-running to prevent overlapping executions
Inputs / Outputs
| Input | Output |
|---|---|
| Schedule definition (cron expression + target agent + skill + input) | Registered schedule ID |
| Event trigger definition (event type + target agent + skill) | Registered trigger ID |
| Emitted event | Dispatched task (via Agent Runtime) |
Key Workflows
Cron Tick (every 10s):
Query schedules where next_run_at <= now
→ For each due schedule:
→ Check idempotency (not already fired for this timestamp)
→ Check skip-if-running (previous task still executing?)
→ Dispatch task to Agent Runtime
→ Update next_run_at
→ Record execution in history
Event Trigger (MVP — direct callbacks):
Component calls scheduler.emit({ type: "task_completed", ... })
→ Match against registered event triggers
→ Check idempotency key
→ Dispatch matched tasks to Agent Runtime
HA Safety: Multiple scheduler replicas use database-level locking (FOR UPDATE SKIP LOCKED) so only one replica picks up each due schedule.
Connections
- Uses: Agent Runtime (dispatches tasks)
- Used by: Agent definitions (declare cron triggers in YAML), other components (emit events)
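The HA claim and the idempotency check above can be sketched as follows. FOR UPDATE SKIP LOCKED is the mechanism the design names; the table and column names, and the key format, are assumptions:

```typescript
// Each replica's cron tick runs this inside a transaction; rows another
// replica has already locked are skipped rather than waited on, so each
// due schedule is claimed by exactly one replica.
const CLAIM_DUE_SCHEDULES = `
  SELECT id, agent_id, skill, input, next_run_at
  FROM schedules
  WHERE next_run_at <= now()
  ORDER BY next_run_at
  FOR UPDATE SKIP LOCKED
`;

// Idempotency: key a dispatch by (schedule id, scheduled timestamp), so a
// restarted replica re-deriving the same tick cannot fire a duplicate task.
function idempotencyKey(scheduleId: string, scheduledFor: Date): string {
  return `${scheduleId}:${scheduledFor.toISOString()}`;
}
```

The same key works for event triggers if the emitter supplies a stable event ID in place of the timestamp.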
6. Observation Logger
Structured logging of all agent activity for debugging, auditing, and future training.
What It Does
- Records every significant event: agent lifecycle, task execution, LLM calls, tool calls, knowledge operations, supervision decisions, budget warnings
- Provides a query interface for debugging (trace a task, view agent timeline, aggregate metrics)
- Batches writes internally (fire-and-forget, never blocks agent execution)
- Integrates automatically as middleware on LLM Gateway, Tool Executor, and Knowledge Client
Inputs / Outputs
| Input | Output |
|---|---|
| Observation event (type + payload + context) | (fire-and-forget, async write) |
| Query filter (tenant, agent, task, time range) | Matching events |
| Task ID | Full execution trace across all agents |
| Metrics filter (tenant, time range, group-by) | Aggregate metrics (tasks, tokens, costs, error rates) |
Event Types
| Category | Events |
|---|---|
| Agent lifecycle | spawned, shutdown, state change |
| Task execution | started, completed, failed, timeout |
| LLM calls | start, complete, error (with provider, model, tokens, cost, latency) |
| Tool calls | start, complete, error (with tool name, duration, retry count) |
| Knowledge | query, commit (with memory type, domain, result count) |
| Supervision | review required, decision made (approved/rejected/modified) |
| Budget | warning (80% threshold), exceeded (hard stop) |
| Scheduler | cron triggered, event triggered |
Key Design Choice
Logging is fire-and-forget with internal batching — events buffer in memory and flush to storage in batches (100 events or every 1 second). This ensures logging never becomes a bottleneck for agent execution. If the database is temporarily unavailable, events buffer up to a limit, dropping oldest debug-level events first.
Connections
- Uses: PostgreSQL (event storage)
- Used by: Every other component (automatic middleware integration)
Cross-Component Integration
As Implemented (MVP)
┌─ kaze-gateway (port 4200) ──────────────────────────────┐
│ POST /llm/generate → Vercel AI SDK → Gemini/Claude │
│ POST /tools/execute → credential injection → APIs │
│ GET /tools/catalog │
│ Secrets: LLM keys, GitHub token │
│ Observability: Langfuse tracing │
└──────────────────────────▲──────────────────────────────┘
│ HTTP
┌──────────────────────────┴──────────────────────────────┐
│ kaze-runtime (port 4100) │
│ VerticalAgent → SubAgent (per-task, per-skill) │
│ Memory: search before LLM, store after LLM │
│ Zero secrets — calls gateway + knowledge via HTTP │
└──────┬───────────────────────────────────▲──────────────┘
│ HTTP │ HTTP
┌──────▼───────────────────────────────────┴──────────────┐
│ kaze-knowledge (port 4300) │
│ POST /memory/search → vector similarity │
│ POST /memory/add → Mem0 fact extraction + store │
│ Own LLM key (Gemini) for fact extraction + embeddings │
│ Storage: PostgreSQL + pgvector │
└─────────────────────────────────────────────────────────┘
Full Design (Target)
┌──────────────────────────────┐
│ OpenClaw (Layer 0.5) │
│ User ↔ Agent conversation │
└──────────────┬───────────────┘
│ dispatches tasks
▼
┌───────────────────────────────────────────────────────────────┐
│ Agent Runtime (1) │
│ │
│ Spawns agents · Dispatches tasks · Manages lifecycle │
│ Enforces supervision · Routes inter-agent messages │
│ │
│ Uses: LLM Gateway (2), Knowledge (3), Tools (4), Logger (6) │
└───────┬──────────────┬────────────────┬───────────────────────┘
│ │ │
┌────▼────┐ ┌─────▼──────┐ ┌────▼──────────┐
│ LLM │ │ Knowledge │ │ Tool │
│ Gateway │ │ System │ │ Framework │
│ (2) │ │ (3) │ │ (4) │
│ │ │ │ │ │
│ Multi- │ │ Mem0 + │ │ GitHub, │
│ provider│ │ pgvector │ │ SEMrush, │
│ routing │ │ Tri-factor │ │ Calendar, │
│ Budget │ │ ABAC │ │ Toddle DB │
└─────────┘ └────────────┘ └───────────────┘
│ │ │
└──────────────┼────────────────┘
│ all events logged
┌──────▼──────┐ ┌───────────────┐
│ Observation │ │ Task │
│ Logger (6) │ │ Scheduler (5) │
│ │ │ │
│ All events │ │ Cron + Event │
│ Batched │ │ → dispatch() │
│ writes │ │ to Runtime │
└─────────────┘ └───────────────┘
Error Handling Summary
| Component | Key Failure Mode | Recovery |
|---|---|---|
| Agent Runtime | Agent crash during task | Mark task failed, log error, transition agent to error state. After cooldown, return to ready. 3 consecutive errors → alert ops. |
| Agent Runtime | Task timeout | Abort via signal, mark timeout, agent returns to ready if healthy. |
| LLM Gateway | Provider rate limited / down | Fallback to next provider in chain. Queue with backoff. Return error after max attempts. |
| LLM Gateway | Budget exceeded | Hard reject. No fallback, no retry. Deterministic code check. |
| Knowledge System | Quality gate rejects shared write | Return rejection reason. Agent can store privately instead. |
| Knowledge System | Mem0 unavailable | Graceful degradation: buffer episodic writes to Postgres directly. Reads return empty for Mem0-backed memories. |
| Tool Framework | External API error (retryable) | Exponential backoff per retry policy. After max attempts, return error to agent. |
| Tool Framework | Auth failure | Re-fetch from Vault (cache may be stale). Retry once. If still failing, return auth error. |
| Task Scheduler | Missed cron tick (restart) | On startup, scan for overdue schedules. Execute missed runs (up to 1hr lookback). Mark older as skipped. |
| Observation Logger | Database write failure | Buffer in memory (up to limit). Drop oldest debug events first. Never block agent execution. |
Security Controls Per Component
Each component enforces security boundaries independently — no single component failure should compromise the system.
| Component | Security Control | What It Enforces |
|---|---|---|
| Agent Runtime | Capability manifest enforcement | Agent can only invoke tools, knowledge domains, and channels declared in its manifest |
| Agent Runtime | Per-agent resource quotas | Max concurrent tasks, max tool calls per task, max subagent depth |
| Agent Runtime | Supervision state isolation | Agents cannot read or modify their own supervision statistics |
| LLM Gateway | Data classification check | Tags on knowledge entries ("safe for LLM" vs "internal only") respected before inclusion in prompts |
| LLM Gateway | Secret scanning in prompts | Detect and redact credentials/keys before sending to provider |
| LLM Gateway | Per-tenant request fairness | No single tenant can saturate the provider queue |
| Knowledge System | Shared knowledge quarantine | Writes to shared tier enter quarantine before becoming visible (configurable: time-based or review-based) |
| Knowledge System | Write rate limiting | Flag agents that write excessively to shared tier |
| Knowledge System | Provenance chain | Every shared entry traces to originating observation, not just agent identity |
| Tool Framework | Egress whitelist | Tool calls validated against per-tenant, per-vertical allowed endpoints |
| Tool Framework | Output scanning | Detect client-specific data in tool call parameters before sending externally |
| Observation Logger | Secret redaction | Scan all event payloads for credential patterns before storage |
| Observation Logger | Append-only audit | Observation events are immutable — no update or delete |
| Task Scheduler | Idempotency enforcement | Prevents duplicate task dispatch from replay or restart |
Full threat model and attack surface analysis in research/threat-model.md.
Phase 2 Migration Notes
- NATS: The inter-agent message envelope is designed so swapping DirectCallTransport for NatsTransport changes only the transport layer. Agent code, task definitions, and message shapes stay identical.
- Apache AGE: The knowledge query/commit interface gains a graph traversal retrieval strategy alongside vector search. Same interface, new strategy option.
- Layer 3 Agents: Supervisor and Quality Monitor agents consume the Observation Logger's event stream. Improvement agents write new versions of agent/skill definitions.
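The NATS note relies on a transport seam like the one sketched below. The `Transport` and `Envelope` shapes are assumptions; DirectCallTransport and NatsTransport are the names the design itself uses:

```typescript
interface Envelope {
  to: string;       // target agent ID
  type: string;     // message type, stable across transports
  payload: unknown;
}

// Agents and the runtime depend only on this interface, so swapping the
// implementation changes nothing above the transport layer.
interface Transport {
  send(msg: Envelope): Promise<void>;
}

// MVP: route directly to in-process handlers.
class DirectCallTransport implements Transport {
  constructor(private handlers: Map<string, (msg: Envelope) => Promise<void>>) {}
  async send(msg: Envelope): Promise<void> {
    const handler = this.handlers.get(msg.to);
    if (!handler) throw new Error(`no handler registered for ${msg.to}`);
    await handler(msg);
  }
}

// Phase 2: a NatsTransport implementing the same interface would publish the
// envelope to a NATS subject instead (body omitted).
```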