System Design
Part of Project Kaze Architecture
Architectural Pattern: Agent-Oriented Architecture
Kaze follows an Agent-Oriented Architecture — a hybrid pattern that borrows from several traditional styles but is fundamentally shaped by the fact that its primary units of computation are intelligent, autonomous agents, not passive services.
Borrowed patterns and their roles in Kaze:
| Pattern | What Kaze borrows | Applied where |
|---|---|---|
| Actor Model | Autonomous entities with private state, message-passing, supervision trees | Agent runtime — each agent is an actor |
| Event-Driven Architecture | Loose coupling via async events, event sourcing for audit | Inter-agent communication via NATS |
| Microservices | Independent deployment, own-your-data, clean API boundaries | Platform services (LLM Gateway, Knowledge Graph, etc.) |
| Cell-Based Architecture | Self-contained isolated deployment units | Each tenant/VPC is a cell |
New to Kaze (no traditional equivalent):
- Components that learn and self-modify their behavior over time
- A governance hierarchy where AI agents supervise other AI agents
- Shared knowledge across agents while maintaining runtime isolation
- A supervision ramp (supervised → sampling → autonomous) as a trust model
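The supervision ramp can be sketched as a small state machine. This is a minimal sketch in Python; the enum names and the one-way promotion step are illustrative, not Kaze's actual implementation:

```python
from enum import Enum


class SupervisionLevel(Enum):
    """Trust levels an agent moves through as it earns autonomy."""
    SUPERVISED = 1   # every output reviewed by a human
    SAMPLING = 2     # a random fraction of outputs reviewed
    AUTONOMOUS = 3   # no routine review; watched by supervisor agents


def promote(level: SupervisionLevel) -> SupervisionLevel:
    """Advance one step along the ramp; autonomous is terminal."""
    order = [SupervisionLevel.SUPERVISED,
             SupervisionLevel.SAMPLING,
             SupervisionLevel.AUTONOMOUS]
    i = order.index(level)
    return order[min(i + 1, len(order) - 1)]
```

In practice promotion would be gated on quality metrics (see Layer 3), not called unconditionally.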
Layer Architecture
Kaze is organized into five layers, from infrastructure at the bottom to governance at the top (an interaction layer sits at 0.5, between infrastructure and execution):
┌─────────────────────────────────────────────────────────────┐
│ KAZE PLATFORM │
│ │
│ Layer 3: GOVERNANCE & SELF-IMPROVEMENT │
│ ┌────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ Supervisor │ │ Quality │ │ Improvement Agent │ │
│ │ Agents │ │ Monitor │ │ (prompt/skill/workflow │ │
│ │ │ │ Agent │ │ optimization) │ │
│ └────────────┘ └──────────────┘ └────────────────────────┘ │
│ │
│ Layer 2: ORCHESTRATION & KNOWLEDGE │
│ ┌──────────────┐ ┌───────────────────────────────────────┐ │
│ │ Orchestrator │ │ Shared Knowledge Graph │ │
│ │ Agents │ │ ├── Vertical knowledge (SEO, CRM..) │ │
│ │ (dynamic │ │ ├── Cross-vertical patterns │ │
│ │ planning) │ │ └── Client-specific context │ │
│ └──────────────┘ └───────────────────────────────────────┘ │
│ │
│ Layer 1: EXECUTION │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ Agent Skills (composable, reusable per vertical) ││
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ ││
│ │ │ Keyword │ │ Content │ │ Lead │ │ Report │ ││
│ │ │ Research │ │ Optimize │ │ Scoring │ │ Generator │ ││
│ │ └──────────┘ └──────────┘ └──────────┘ └────────────┘ ││
│ └─────────────────────────────────────────────────────────┘│
│ │
│ Layer 0.5: INTERACTION │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Conversation Manager │ │
│ │ ┌───────┐ ┌───────┐ ┌──────────┐ ┌────────┐ │ │
│ │ │ Slack │ │ Email │ │ WhatsApp │ │Telegram│ ... │ │
│ │ └───────┘ └───────┘ └──────────┘ └────────┘ │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ Layer 0: PLATFORM INFRASTRUCTURE │
│ ┌──────────┐ ┌──────┐ ┌─────────┐ ┌──────┐ ┌───────────┐ │
│ │ K8s │ │ NATS │ │Postgres │ │Vault │ │ OTel/Prom │ │
│ │ │ │ │ │ │ │ │ │ Grafana │ │
│ └──────────┘ └──────┘ └─────────┘ └──────┘ └───────────┘ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ LLM Gateway (multi-provider, dual-key, budget mgmt) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ ALL CONTAINERIZED · ALL IaC · ANY CLOUD · ANY VPC │
└─────────────────────────────────────────────────────────────┘
Layer Descriptions
Layer 0: Platform Infrastructure
Non-AI infrastructure that provides the runtime foundation. This is traditional software — deterministic, well-understood, battle-tested.
Components:
- Kubernetes — Universal compute runtime. Provides scheduling, scaling, networking, and the deployment abstraction across any cloud.
- NATS — Lightweight message bus for all inter-agent communication. Supports pub/sub, request/reply, and persistent streaming (JetStream). Chosen for portability and minimal operational overhead.
- PostgreSQL — Primary relational datastore. Managed via CloudNativePG operator for Kubernetes-native operation.
- HashiCorp Vault — Secrets management. Stores LLM API keys (both Speedrun-owned and client-provided), agent credentials, and encryption keys.
- OpenTelemetry + Prometheus + Grafana + Loki — Full observability stack. OTel for distributed tracing, Prometheus for metrics, Grafana for visualization, Loki for log aggregation.
- LLM Gateway — Abstraction layer between agents and LLM providers (see Agent Model).
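Subject-based pub/sub is what lets agents stay loosely coupled over NATS. The following is an in-process stand-in that shows the publish/subscribe shape only; it is not the nats-py client, and the subject name is an illustrative assumption:

```python
from collections import defaultdict
from typing import Callable


class SubjectBus:
    """In-process stand-in for NATS subject-based pub/sub."""

    def __init__(self) -> None:
        self._subs: dict[str, list[Callable[[bytes], None]]] = defaultdict(list)

    def subscribe(self, subject: str, handler: Callable[[bytes], None]) -> None:
        # Register a handler for a subject; NATS also supports wildcards,
        # request/reply, and persistence (JetStream), omitted here.
        self._subs[subject].append(handler)

    def publish(self, subject: str, payload: bytes) -> None:
        # Deliver the payload to every subscriber of the subject.
        for handler in self._subs[subject]:
            handler(payload)


bus = SubjectBus()
received: list[bytes] = []
bus.subscribe("agent.seo.tasks", received.append)   # hypothetical subject name
bus.publish("agent.seo.tasks", b'{"task": "keyword-research"}')
```

A real deployment would replace `SubjectBus` with a connection to a NATS server; the agent-facing shape (subscribe to a subject, publish bytes) stays the same.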
Layer 0.5: Interaction Layer
The multi-channel communication layer that enables humans to interact with agents naturally through their existing tools.
Components:
- Conversation Manager — Maintains unified conversation threads across channels. An agent doesn't "think in Slack" or "think in email" — it thinks in tasks and conversations. The channel is a delivery mechanism.
- Channel Adapters — Slack bot, Email agent, WhatsApp agent, Telegram bot, and future integrations. Each adapter translates between the channel's protocol and the Conversation Manager's unified format.
- Approval Flow Engine — Routes approval requests to the appropriate channel based on context, urgency, and client preference.
- Context Persistence — Maintains conversation history across channels so that context flows naturally (e.g., a question asked on WhatsApp can reference a document sent via email).
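The channel-as-delivery-mechanism idea above can be sketched with a unified message type plus one adapter. The field names and the Slack event shape here are assumptions for illustration, not Kaze's actual schema:

```python
from dataclasses import dataclass


@dataclass
class UnifiedMessage:
    """Channel-agnostic format the Conversation Manager works with."""
    thread_id: str   # unified conversation thread, stable across channels
    channel: str     # delivery mechanism only: "slack", "email", ...
    sender: str
    text: str


def from_slack(event: dict, thread_id: str) -> UnifiedMessage:
    """Translate a Slack-style event into the unified format.

    Each channel adapter owns one such translation in each direction.
    """
    return UnifiedMessage(thread_id=thread_id, channel="slack",
                          sender=event["user"], text=event["text"])


msg = from_slack({"user": "U123", "text": "approve the draft"}, thread_id="t-42")
```

Because every adapter normalizes to `UnifiedMessage`, a WhatsApp reply and an email reply land in the same thread and the agent never needs channel-specific logic.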
Layer 1: Execution
Agents that perform actual work. These are composed from reusable skills and operate within a specific vertical.
Key concepts:
- Skills — The atomic reusable unit of agent capability (see Agent Model).
- Agents — Compositions of skills + a role + context. An agent is instantiated from a template, bound to a client, and assigned a supervision level.
- Agent Runtime — The execution environment that hosts agents. Based on the actor model — each agent has private state, a message inbox, and processes one task at a time.
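The actor model described above (private state, a message inbox, one task at a time) can be sketched as a toy using a thread and a queue; this is an illustration of the pattern, not the production runtime:

```python
import queue
import threading


class AgentActor:
    """Minimal actor: private state, a message inbox, serial processing."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._state = {"handled": 0}   # private; never shared directly
        self._inbox: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, msg: dict) -> None:
        # The only way to interact with an actor: put a message in its inbox.
        self._inbox.put(msg)

    def _run(self) -> None:
        while True:
            msg = self._inbox.get()    # blocks; one message at a time
            if msg is None:            # sentinel: shut down
                break
            self._state["handled"] += 1

    def stop(self) -> None:
        self._inbox.put(None)
        self._thread.join()


agent = AgentActor("keyword-research")
agent.send({"task": "analyze"})
agent.send({"task": "report"})
agent.stop()
```

Serial processing per actor is what makes private state safe without locks; concurrency comes from running many actors, not from threads inside one.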
Layer 2: Orchestration & Knowledge
Agents that plan, decompose, and coordinate work across execution agents. Also hosts the shared knowledge graph.
Key concepts:
- Orchestrator Agents — Receive goals, decompose them into subtasks, assign to worker agents. Unlike static DAG workflows, orchestrators reason dynamically about the best approach and can re-plan at runtime if steps fail.
- Router Agents — Direct incoming requests to the appropriate agent or workflow based on intent classification.
- Shared Knowledge Graph — The persistent knowledge layer that agents read from and contribute to (see AI-Native).
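Dynamic planning with runtime re-planning might look like the following sketch. The goal, the subtask names, and the hard-coded plan are stand-ins for what would actually be LLM-driven decomposition:

```python
def decompose(goal: str) -> list[str]:
    """Toy decomposition; in Kaze this step is LLM-driven planning."""
    plans = {
        "improve organic traffic": [
            "keyword-research", "content-optimize", "report-generate"],
    }
    return plans.get(goal, [goal])


def run_plan(goal: str, execute) -> list[str]:
    """Run subtasks, re-planning around failures instead of aborting.

    A static DAG would fail the whole run here; an orchestrator agent
    instead routes the failed step elsewhere and continues.
    """
    done = []
    for task in decompose(goal):
        try:
            execute(task)
            done.append(task)
        except RuntimeError:
            done.append(f"replanned:{task}")   # reassign / retry elsewhere
    return done


def flaky(task: str) -> None:
    # Simulate a worker agent being unavailable for one subtask.
    if task == "content-optimize":
        raise RuntimeError("worker unavailable")


result = run_plan("improve organic traffic", flaky)
```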
Layer 3: Governance & Self-Improvement
Meta-agents that monitor, evaluate, and improve the entire system. This layer is the last to become autonomous and carries the most conservative guardrails.
Key concepts:
- Supervisor Agents — Watch agent fleet health, detect failures, take corrective action. Unlike traditional monitoring that follows static rules, supervisors reason about novel failure modes.
- Quality Monitor Agents — Evaluate agent outputs for quality, catch hallucinations or drift, score task completion. Feed results into the supervision ramp.
- Improvement Agents — Analyze execution patterns, propose prompt refinements, skill updates, and workflow optimizations. All changes go through canary deployment before full rollout.
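Feeding quality scores into the supervision ramp could work roughly like this. The threshold, the window size, and the level names are assumptions for illustration, not Kaze's actual policy:

```python
def next_level(level: str, recent_scores: list[float],
               threshold: float = 0.9, window: int = 20) -> str:
    """Advance the supervision ramp only on sustained high quality.

    Quality Monitor agents produce the scores; this function is the
    (illustrative) policy that moves an agent up or down the ramp.
    """
    ramp = ["supervised", "sampling", "autonomous"]
    if len(recent_scores) < window:
        return level                     # not enough evidence yet
    avg = sum(recent_scores[-window:]) / window
    i = ramp.index(level)
    if avg >= threshold and i < len(ramp) - 1:
        return ramp[i + 1]               # earned more autonomy
    if avg < threshold and i > 0:
        return ramp[i - 1]               # quality drift: step back down
    return level
```

Note that demotion is as important as promotion: an autonomous agent whose quality drifts gets pulled back under sampling or full supervision.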
Agent Hierarchy Summary
| Layer | Role | What lives here | Autonomy level |
|---|---|---|---|
| Layer 3 | Governance | Supervisor, Quality Monitor, Improvement agents | Last to become autonomous — human oversight longest |
| Layer 2 | Orchestration | Orchestrator agents, Router agents, Knowledge Graph | Second to become autonomous |
| Layer 1 | Execution | Worker agents composed of skills | First to go autonomous (per skill, per vertical) |
| Layer 0.5 | Interaction | Conversation Manager, Channel adapters | N/A (infrastructure) |
| Layer 0 | Infrastructure | K8s, NATS, Postgres, Vault, LLM Gateway | N/A (traditional software) |