
System Design

Part of Project Kaze Architecture

Architectural Pattern: Agent-Oriented Architecture

Kaze follows an Agent-Oriented Architecture — a hybrid pattern that borrows from several traditional styles but is fundamentally shaped by the fact that its primary units of computation are intelligent, autonomous agents, not passive services.

Borrowed patterns and their roles in Kaze:

| Pattern | What Kaze borrows | Applied where |
|---|---|---|
| Actor Model | Autonomous entities with private state, message-passing, supervision trees | Agent runtime: each agent is an actor |
| Event-Driven Architecture | Loose coupling via async events, event sourcing for audit | Inter-agent communication via NATS |
| Microservices | Independent deployment, own-your-data, clean API boundaries | Platform services (LLM Gateway, Knowledge Graph, etc.) |
| Cell-Based Architecture | Self-contained isolated deployment units | Each tenant/VPC is a cell |

New to Kaze (no traditional equivalent):

  • Components that learn and self-modify their behavior over time
  • A governance hierarchy where AI agents supervise other AI agents
  • Shared knowledge across agents while maintaining runtime isolation
  • A supervision ramp (supervised → sampling → autonomous) as a trust model
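
The supervision ramp can be pictured as a small state machine. The sketch below is illustrative only: the source names the three stages but not their semantics, so the review rates, the `needs_review` rule, and the `promote`/`demote` helpers are all assumptions.

```python
from enum import Enum


class SupervisionLevel(Enum):
    SUPERVISED = "supervised"   # assumed: every output is reviewed by a human
    SAMPLING = "sampling"       # assumed: a fraction of outputs is reviewed
    AUTONOMOUS = "autonomous"   # assumed: no routine human review


# Hypothetical review rates per level; the source does not specify these.
REVIEW_RATE = {
    SupervisionLevel.SUPERVISED: 1.0,
    SupervisionLevel.SAMPLING: 0.2,
    SupervisionLevel.AUTONOMOUS: 0.0,
}

_ORDER = list(SupervisionLevel)  # Enum iteration preserves definition order


def needs_review(level: SupervisionLevel, task_index: int) -> bool:
    """Deterministically pick which tasks are routed to a human reviewer."""
    rate = REVIEW_RATE[level]
    if rate == 0.0:
        return False
    period = round(1 / rate)  # rate 0.2 means every 5th task is reviewed
    return task_index % period == 0


def promote(level: SupervisionLevel) -> SupervisionLevel:
    """Move one step up the ramp; AUTONOMOUS is the ceiling."""
    return _ORDER[min(_ORDER.index(level) + 1, len(_ORDER) - 1)]


def demote(level: SupervisionLevel) -> SupervisionLevel:
    """Move one step down the ramp; SUPERVISED is the floor."""
    return _ORDER[max(_ORDER.index(level) - 1, 0)]
```

A real ramp would presumably key promotion and demotion on measured output quality rather than on manual calls.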

Layer Architecture

Kaze is organized into five layers, from infrastructure (Layer 0) at the bottom to governance (Layer 3) at the top:

┌─────────────────────────────────────────────────────────────┐
│                        KAZE PLATFORM                         │
│                                                              │
│  Layer 3: GOVERNANCE & SELF-IMPROVEMENT                      │
│  ┌────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│  │ Supervisor │ │ Quality      │ │ Improvement Agent      │ │
│  │ Agents     │ │ Monitor      │ │ (prompt/skill/workflow │ │
│  │            │ │ Agent        │ │  optimization)         │ │
│  └────────────┘ └──────────────┘ └────────────────────────┘ │
│                                                              │
│  Layer 2: ORCHESTRATION & KNOWLEDGE                          │
│  ┌──────────────┐ ┌───────────────────────────────────────┐ │
│  │ Orchestrator │ │ Shared Knowledge Graph                │ │
│  │ Agents       │ │  ├── Vertical knowledge (SEO, CRM..) │ │
│  │ (dynamic     │ │  ├── Cross-vertical patterns         │ │
│  │  planning)   │ │  └── Client-specific context         │ │
│  └──────────────┘ └───────────────────────────────────────┘ │
│                                                              │
│  Layer 1: EXECUTION                                          │
│  ┌─────────────────────────────────────────────────────────┐│
│  │ Agent Skills (composable, reusable per vertical)        ││
│  │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐  ││
│  │ │ Keyword  │ │ Content  │ │ Lead     │ │ Report     │  ││
│  │ │ Research │ │ Optimize │ │ Scoring  │ │ Generator  │  ││
│  │ └──────────┘ └──────────┘ └──────────┘ └────────────┘  ││
│  └─────────────────────────────────────────────────────────┘│
│                                                              │
│  Layer 0.5: INTERACTION                                      │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Conversation Manager                                 │   │
│  │ ┌───────┐ ┌───────┐ ┌──────────┐ ┌────────┐         │   │
│  │ │ Slack │ │ Email │ │ WhatsApp │ │Telegram│  ...    │   │
│  │ └───────┘ └───────┘ └──────────┘ └────────┘         │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  Layer 0: PLATFORM INFRASTRUCTURE                            │
│  ┌──────────┐ ┌──────┐ ┌─────────┐ ┌──────┐ ┌───────────┐ │
│  │ K8s      │ │ NATS │ │Postgres │ │Vault │ │ OTel/Prom │ │
│  │          │ │      │ │         │ │      │ │ Grafana   │ │
│  └──────────┘ └──────┘ └─────────┘ └──────┘ └───────────┘ │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ LLM Gateway (multi-provider, dual-key, budget mgmt) │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  ALL CONTAINERIZED · ALL IaC · ANY CLOUD · ANY VPC          │
└─────────────────────────────────────────────────────────────┘

Layer Descriptions

Layer 0: Platform Infrastructure

Non-AI infrastructure that provides the runtime foundation. This is traditional software — deterministic, well-understood, battle-tested.

Components:

  • Kubernetes — Universal compute runtime. Provides scheduling, scaling, networking, and the deployment abstraction across any cloud.
  • NATS — Lightweight message bus for all inter-agent communication. Supports pub/sub, request/reply, and persistent streaming (JetStream). Chosen for portability and minimal operational overhead.
  • PostgreSQL — Primary relational datastore. Managed via CloudNativePG operator for Kubernetes-native operation.
  • HashiCorp Vault — Secrets management. Stores LLM API keys (both Speedrun-owned and client-provided), agent credentials, and encryption keys.
  • OpenTelemetry + Prometheus + Grafana + Loki — Full observability stack. OTel for distributed tracing, Prometheus for metrics, Grafana for visualization, Loki for log aggregation.
  • LLM Gateway — Abstraction layer between agents and LLM providers (see Agent Model).
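
To make the "dual-key" and budget ideas concrete, here is a minimal sketch of the check a gateway might run before forwarding a request. Everything in it is hypothetical: the class and field names, the rule that a client-provided key takes precedence over the platform-owned one, and the flat per-tenant budget. Provider routing and Vault lookups are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class BudgetExceeded(Exception):
    """Raised when a request would push a tenant past its budget."""


@dataclass
class TenantLLMConfig:
    platform_key: str                  # platform-owned key (would be read from Vault)
    client_key: Optional[str] = None   # client-provided key, if the client supplied one
    budget_usd: float = 0.0
    spent_usd: float = 0.0


@dataclass
class GatewaySketch:
    tenants: Dict[str, TenantLLMConfig] = field(default_factory=dict)

    def select_key(self, tenant_id: str) -> str:
        # Assumption: prefer the client's own key when one exists.
        cfg = self.tenants[tenant_id]
        return cfg.client_key or cfg.platform_key

    def authorize(self, tenant_id: str, estimated_cost_usd: float) -> str:
        """Reserve budget for a request and return the API key to use."""
        cfg = self.tenants[tenant_id]
        if cfg.spent_usd + estimated_cost_usd > cfg.budget_usd:
            raise BudgetExceeded(tenant_id)
        cfg.spent_usd += estimated_cost_usd
        return self.select_key(tenant_id)
```

A rejected request leaves `spent_usd` untouched, so a tenant at its limit can still issue cheaper calls that fit the remaining budget.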

Layer 0.5: Interaction Layer

The multi-channel communication layer that enables humans to interact with agents naturally through their existing tools.

Components:

  • Conversation Manager — Maintains unified conversation threads across channels. An agent doesn't "think in Slack" or "think in email" — it thinks in tasks and conversations. The channel is a delivery mechanism.
  • Channel Adapters — Slack bot, Email agent, WhatsApp agent, Telegram bot, and future integrations. Each adapter translates between the channel's protocol and the Conversation Manager's unified format.
  • Approval Flow Engine — Routes approval requests to the appropriate channel based on context, urgency, and client preference.
  • Context Persistence — Maintains conversation history across channels so that context flows naturally (e.g., a question asked on WhatsApp can reference a document sent via email).
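
The adapter-plus-manager split described above could look roughly like the following. This is a sketch under assumptions: `UnifiedMessage`, the adapter method name, and the simplified raw payload fields are invented for illustration and do not reflect the real channel APIs or Kaze's actual message format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class UnifiedMessage:
    conversation_id: str
    channel: str
    sender: str
    text: str


class ChannelAdapter(Protocol):
    channel: str

    def to_unified(self, raw: dict, conversation_id: str) -> UnifiedMessage: ...


class SlackAdapter:
    channel = "slack"

    def to_unified(self, raw: dict, conversation_id: str) -> UnifiedMessage:
        # Simplified payload; a real Slack event carries many more fields.
        return UnifiedMessage(conversation_id, self.channel, raw["user"], raw["text"])


class EmailAdapter:
    channel = "email"

    def to_unified(self, raw: dict, conversation_id: str) -> UnifiedMessage:
        # Simplified payload; "from" and "body" are illustrative field names.
        return UnifiedMessage(conversation_id, self.channel, raw["from"], raw["body"])


@dataclass
class ConversationManager:
    threads: Dict[str, List[UnifiedMessage]] = field(default_factory=dict)

    def ingest(self, adapter: ChannelAdapter, raw: dict, conversation_id: str) -> UnifiedMessage:
        """Normalize a channel-specific payload and append it to one shared thread."""
        msg = adapter.to_unified(raw, conversation_id)
        self.threads.setdefault(conversation_id, []).append(msg)
        return msg

    def history(self, conversation_id: str) -> List[UnifiedMessage]:
        """Return the cross-channel history for a conversation, oldest first."""
        return list(self.threads.get(conversation_id, []))
```

Because the thread is keyed by conversation rather than by channel, a Slack message and an email land in the same history, which is what lets an agent reference one from the other.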

Layer 1: Execution

Agents that perform actual work. These are composed from reusable skills and operate within a specific vertical.

Key concepts:

  • Skills — The atomic reusable unit of agent capability (see Agent Model).
  • Agents — Compositions of skills + a role + context. An agent is instantiated from a template, bound to a client, and assigned a supervision level.
  • Agent Runtime — The execution environment that hosts agents. Based on the actor model — each agent has private state, a message inbox, and processes one task at a time.
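
As a rough illustration of the actor-style runtime, the sketch below models an agent as a coroutine that drains its own inbox one task at a time. It uses only `asyncio`; the class name, the `(skill_name, payload)` task shape, and the `None` stop sentinel are assumptions, and a production runtime would receive tasks over NATS rather than an in-process queue.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class AgentActor:
    name: str
    skills: Dict[str, Callable[[str], str]]   # skill name -> callable
    inbox: asyncio.Queue = field(default_factory=asyncio.Queue)
    completed: List[Tuple[str, str]] = field(default_factory=list)  # private state

    async def run(self) -> None:
        """Process inbox messages strictly one at a time until told to stop."""
        while True:
            task = await self.inbox.get()
            if task is None:  # hypothetical stop sentinel
                break
            skill_name, payload = task
            result = self.skills[skill_name](payload)
            self.completed.append((skill_name, result))


async def run_demo() -> List[Tuple[str, str]]:
    # The actor is created inside the coroutine so its queue belongs to the running loop.
    agent = AgentActor(
        name="seo-worker",
        skills={"keyword_research": lambda topic: f"keywords for {topic}"},
    )
    worker = asyncio.create_task(agent.run())
    await agent.inbox.put(("keyword_research", "running shoes"))
    await agent.inbox.put(("keyword_research", "trail shoes"))
    await agent.inbox.put(None)
    await worker
    return agent.completed
```

Calling `asyncio.run(run_demo())` runs both tasks in arrival order; other code never touches `completed` directly, only the inbox.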

Layer 2: Orchestration & Knowledge

Agents that plan, decompose, and coordinate work across execution agents. Also hosts the shared knowledge graph.

Key concepts:

  • Orchestrator Agents — Receive goals, decompose them into subtasks, assign to worker agents. Unlike static DAG workflows, orchestrators reason dynamically about the best approach and can re-plan at runtime if steps fail.
  • Router Agents — Direct incoming requests to the appropriate agent or workflow based on intent classification.
  • Shared Knowledge Graph — The persistent knowledge layer that agents read from and contribute to (see AI-Native).
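
The difference between a static DAG and an orchestrator that re-plans can be shown with a small loop. This is a sketch, not Kaze's orchestrator: the planner here is a plain function standing in for LLM-driven reasoning, and `orchestrate`, `demo_planner`, and the step names are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List


def orchestrate(
    goal: str,
    plan_fn: Callable[[str, FrozenSet[str]], List[str]],
    workers: Dict[str, Callable[[str], str]],
    max_replans: int = 2,
) -> Dict[str, str]:
    """Run a plan step by step; on failure, ask the planner for a new plan."""
    failed = set()
    for _attempt in range(max_replans + 1):
        plan = plan_fn(goal, frozenset(failed))
        results = {}
        for step in plan:
            try:
                results[step] = workers[step](goal)
            except Exception:
                failed.add(step)   # remember what broke so the planner can avoid it
                break
        else:
            return results         # every step in this plan succeeded
    raise RuntimeError(f"could not complete goal {goal!r} after {max_replans} re-plans")


def demo_planner(goal: str, failed: FrozenSet[str]) -> List[str]:
    # Hypothetical planner: fall back to cached data if the live crawl failed.
    first = "use_cached_crawl" if "crawl_site" in failed else "crawl_site"
    return [first, "write_report"]
```

For simplicity each re-plan restarts from the first step; a fuller version would presumably carry forward results from steps that already succeeded.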

Layer 3: Governance & Self-Improvement

Meta-agents that monitor, evaluate, and improve the entire system. This layer is the last to become autonomous and carries the most conservative guardrails.

Key concepts:

  • Supervisor Agents — Watch agent fleet health, detect failures, take corrective action. Unlike traditional monitoring that follows static rules, supervisors reason about novel failure modes.
  • Quality Monitor Agents — Evaluate agent outputs for quality, catch hallucinations or drift, score task completion. Feed results into the supervision ramp.
  • Improvement Agents — Analyze execution patterns, propose prompt refinements, skill updates, and workflow optimizations. All changes go through canary deployment before full rollout.
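
One way to read "all changes go through canary deployment" is as a gate that compares quality scores from the candidate change against the current baseline. The sketch below assumes scores in the range 0 to 1 from a quality monitor; the function name, thresholds, and the three outcomes are illustrative choices, not values taken from the source.

```python
from statistics import mean
from typing import Sequence


def canary_decision(
    baseline_scores: Sequence[float],
    canary_scores: Sequence[float],
    min_samples: int = 20,         # hypothetical minimum before judging the canary
    max_regression: float = 0.02,  # hypothetical tolerated drop in mean score
) -> str:
    """Return "continue", "rollback", or "promote" for a canary rollout.

    baseline_scores must be non-empty.
    """
    if len(canary_scores) < min_samples:
        return "continue"   # not enough evidence yet; keep the canary running
    if mean(canary_scores) + max_regression < mean(baseline_scores):
        return "rollback"   # candidate is measurably worse than baseline
    return "promote"
```

A comparison of means is the simplest possible gate; a statistical test or per-task breakdown would be more robust when scores are noisy.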

Agent Hierarchy Summary

| Layer | Role | What lives here | Autonomy level |
|---|---|---|---|
| Layer 3 | Governance | Supervisor, Quality Monitor, Improvement agents | Last to become autonomous; human oversight lasts longest |
| Layer 2 | Orchestration | Orchestrator agents, Router agents, Knowledge Graph | Second to become autonomous |
| Layer 1 | Execution | Worker agents composed of skills | First to go autonomous (per skill, per vertical) |
| Layer 0.5 | Interaction | Conversation Manager, Channel adapters | N/A (infrastructure) |
| Layer 0 | Infrastructure | K8s, NATS, Postgres, Vault, LLM Gateway | N/A (traditional software) |