
Project Kaze — Architecture Document

Status: Draft — Brainstorming Phase · Last Updated: 2026-02-26 · Authors: Speedrun Ventures Founding Team


Document Map

| Document | Contents |
| --- | --- |
| overview.md (this file) | Vision, system architecture, architecture comparison |
| infrastructure.md | Deployment modes, cells, cloud strategy, LLM key management, security controls, observability |
| ai-native.md | AI-native philosophy, self-improvement, knowledge graph, orchestration, agent safety boundaries |
| strategy/product-strategy.md | Verticals, supervision ramp, multi-channel, human-in-the-loop |
| strategy/tradeoffs.md | 9 risks with mitigations, risk summary matrix |
| strategy/decisions.md | Design decisions (D1-D45), technology selection, open questions |
| strategy/mvp.md | MVP scope, 3 verticals, parallel build plan, success criteria |
| technical-design.md | High-level component design for 6 MVP platform components, security controls per component |
| **Research** | |
| research/knowledge-system.md | Academic literature survey, open source tooling, knowledge architecture options |
| research/openclaw-integration.md | OpenClaw codebase analysis, plugin system, Kaze integration architecture |
| research/threat-model.md | Threat actors, 10 attack surfaces, security properties, MVP security scope |
| research/data-rights-knowledge-sharing.md | Legal risk analysis (GDPR, trade secrets), knowledge sharing options, tiered consent model |
| research/scalability-model.md | Performance bottlenecks, scale milestones, component analysis, scaling triggers |
| research/cost-model.md | LLM token costs, infrastructure costs, unit economics, pricing implications |
| research/openfang-comparison.md | Full comparison with OpenFang agent OS — architecture, features, positioning |
| research/frontier-lab-competitive-analysis.md | Competitive analysis if Anthropic/OpenAI/Google build an agent OS — what's defensible vs commodity |

1. Vision & Principles

1.1 Context

Speedrun Ventures is an AI-native venture studio focused on building AI agents and automated workflows for SMEs to optimize their P&L. The studio operates by building massive fleets of AI agents and tooling, along with an orchestration framework for rapid agent development and operations.

1.2 What is Kaze?

Kaze is an operating system for AI agents. It is the foundational platform that enables Speedrun Ventures to:

  • Define new agents from reusable templates and composable skills
  • Integrate with external tools and services
  • Orchestrate agents into complex workflows
  • Monitor, evaluate, and continuously improve agent performance
  • Scale agent fleets across multiple clients and deployment environments

Kaze is not another SaaS platform for deploying LLM wrappers. It is an AI-native system where AI is the core operating layer — AI monitors AI, AI improves AI, and human involvement is minimized to governance and exception handling.

1.3 Core Principles

| Principle | Description |
| --- | --- |
| AI-Native | AI is not a feature bolted onto traditional software. AI agents are the primary units of computation. The system self-monitors, self-improves, and self-heals with minimal human intervention. |
| Security & Privacy First | Every architectural decision prioritizes data isolation, tenant security, and client trust. The system must be deployable in any environment without compromising on security. |
| Cloud-Agnostic | No hard dependencies on any cloud provider. The entire stack runs on any cloud (AWS, GCP, Azure) or on-premises with zero code changes. |
| Vertical-First | Value is created by going deep into well-understood business verticals (SEO, CRM, etc.), not by building a generic horizontal platform. Each vertical creates compounding knowledge. |
| Meet Humans Where They Are | Humans interact with agents through their existing tools (Slack, Email, WhatsApp, Telegram) — not through a specialized dashboard. Communication is natural, not software-operational. |
| Portable by Default | Everything is containerized and defined as Infrastructure as Code. The same artifact deploys in Speedrun's infrastructure or a client's VPC with only configuration differences. |

2. System Architecture

2.1 Architectural Pattern: Agent-Oriented Architecture

Kaze follows an Agent-Oriented Architecture — a hybrid pattern that borrows from several traditional styles but is fundamentally shaped by the fact that its primary units of computation are intelligent, autonomous agents, not passive services.

Borrowed patterns and their roles in Kaze:

| Pattern | What Kaze borrows | Applied where |
| --- | --- | --- |
| Actor Model | Autonomous entities with private state, message-passing, supervision trees | Agent runtime — each agent is an actor |
| Event-Driven Architecture | Loose coupling via async events, event sourcing for audit | Inter-agent communication via NATS |
| Microservices | Independent deployment, own-your-data, clean API boundaries | Platform services (LLM Gateway, Knowledge Graph, etc.) |
| Cell-Based Architecture | Self-contained isolated deployment units | Each tenant/VPC is a cell |

New to Kaze (no traditional equivalent):

  • Components that learn and self-modify their behavior over time
  • A governance hierarchy where AI agents supervise other AI agents
  • Shared knowledge across agents while maintaining runtime isolation
  • A supervision ramp (supervised → sampling → autonomous) as a trust model

2.2 Layer Architecture

Kaze is organized into 5 layers, from infrastructure at the bottom to governance at the top:

┌─────────────────────────────────────────────────────────────┐
│                        KAZE PLATFORM                         │
│                                                              │
│  Layer 3: GOVERNANCE & SELF-IMPROVEMENT                      │
│  ┌────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│  │ Supervisor │ │ Quality      │ │ Improvement Agent      │ │
│  │ Agents     │ │ Monitor      │ │ (prompt/skill/workflow │ │
│  │            │ │ Agent        │ │  optimization)         │ │
│  └────────────┘ └──────────────┘ └────────────────────────┘ │
│                                                              │
│  Layer 2: ORCHESTRATION & KNOWLEDGE                          │
│  ┌──────────────┐ ┌───────────────────────────────────────┐ │
│  │ Orchestrator │ │ Shared Knowledge Graph                │ │
│  │ Agents       │ │  ├── Vertical knowledge (SEO, CRM..) │ │
│  │ (dynamic     │ │  ├── Cross-vertical patterns         │ │
│  │  planning)   │ │  └── Client-specific context         │ │
│  └──────────────┘ └───────────────────────────────────────┘ │
│                                                              │
│  Layer 1: EXECUTION                                          │
│  ┌─────────────────────────────────────────────────────────┐│
│  │ Agent Skills (composable, reusable per vertical)        ││
│  │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐  ││
│  │ │ Keyword  │ │ Content  │ │ Lead     │ │ Report     │  ││
│  │ │ Research │ │ Optimize │ │ Scoring  │ │ Generator  │  ││
│  │ └──────────┘ └──────────┘ └──────────┘ └────────────┘  ││
│  └─────────────────────────────────────────────────────────┘│
│                                                              │
│  Layer 0.5: INTERACTION                                      │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Conversation Manager                                 │   │
│  │ ┌───────┐ ┌───────┐ ┌──────────┐ ┌────────┐         │   │
│  │ │ Slack │ │ Email │ │ WhatsApp │ │Telegram│  ...    │   │
│  │ └───────┘ └───────┘ └──────────┘ └────────┘         │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  Layer 0: PLATFORM INFRASTRUCTURE                            │
│  ┌──────────┐ ┌──────┐ ┌─────────┐ ┌──────┐ ┌───────────┐ │
│  │ K8s      │ │ NATS │ │Postgres │ │Vault │ │ OTel/Prom │ │
│  │          │ │      │ │         │ │      │ │ Grafana   │ │
│  └──────────┘ └──────┘ └─────────┘ └──────┘ └───────────┘ │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ LLM Gateway (multi-provider, dual-key, budget mgmt) │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  ALL CONTAINERIZED · ALL IaC · ANY CLOUD · ANY VPC          │
└─────────────────────────────────────────────────────────────┘

2.3 Layer Descriptions

Layer 0: Platform Infrastructure

Non-AI infrastructure that provides the runtime foundation. This is traditional software — deterministic, well-understood, battle-tested.

Components:

  • Kubernetes — Universal compute runtime. Provides scheduling, scaling, networking, and the deployment abstraction across any cloud.
  • NATS — Lightweight message bus for all inter-agent communication. Supports pub/sub, request/reply, and persistent streaming (JetStream). Chosen for portability and minimal operational overhead.
  • PostgreSQL — Primary relational datastore. Managed via CloudNativePG operator for Kubernetes-native operation.
  • HashiCorp Vault — Secrets management. Stores LLM API keys (both Speedrun-owned and client-provided), agent credentials, and encryption keys.
  • OpenTelemetry + Prometheus + Grafana + Loki — Full observability stack. OTel for distributed tracing, Prometheus for metrics, Grafana for visualization, Loki for log aggregation.
  • LLM Gateway — Abstraction layer between agents and LLM providers (see Section 2.5).
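The pub/sub and request/reply patterns that NATS provides can be pictured with a toy in-memory bus. This is purely an illustrative stand-in: subject names and handler shapes are invented here, and a real deployment would talk to a NATS server through a client library rather than this class.

```python
import asyncio
from collections import defaultdict

class InMemoryBus:
    """Toy stand-in for the two NATS patterns Kaze leans on:
    pub/sub fan-out and request/reply."""
    def __init__(self):
        self._subs = defaultdict(list)   # subject -> list of async handlers

    def subscribe(self, subject, handler):
        self._subs[subject].append(handler)

    async def publish(self, subject, msg):
        # Pub/sub: every subscriber on the subject receives the message.
        for handler in self._subs[subject]:
            await handler(msg)

    async def request(self, subject, msg):
        # Request/reply: the first subscriber's return value is the reply.
        return await self._subs[subject][0](msg)

async def demo():
    bus = InMemoryBus()
    seen = []
    async def audit(msg):                 # pub/sub listener (e.g. event sourcing)
        seen.append(msg)
    async def score(msg):                 # replier (e.g. a worker agent)
        return {"lead": msg, "score": 0.9}
    bus.subscribe("agent.events", audit)
    bus.subscribe("agent.lead-scoring", score)
    await bus.publish("agent.events", "task.started")
    reply = await bus.request("agent.lead-scoring", "acme-inc")
    return seen, reply

seen, reply = asyncio.run(demo())
```

JetStream adds persistence on top of these primitives, which the sketch deliberately omits.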

Layer 0.5: Interaction Layer

The multi-channel communication layer that enables humans to interact with agents naturally through their existing tools.

Components:

  • Conversation Manager — Maintains unified conversation threads across channels. An agent doesn't "think in Slack" or "think in email" — it thinks in tasks and conversations. The channel is a delivery mechanism.
  • Channel Adapters — Slack bot, Email agent, WhatsApp agent, Telegram bot, and future integrations. Each adapter translates between the channel's protocol and the Conversation Manager's unified format.
  • Approval Flow Engine — Routes approval requests to the appropriate channel based on context, urgency, and client preference.
  • Context Persistence — Maintains conversation history across channels so that context flows naturally (e.g., a question asked on WhatsApp can reference a document sent via email).
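One way to make "the channel is a delivery mechanism" concrete is a channel-agnostic message type that every adapter normalizes into. The field names and the `from_slack` adapter below are hypothetical illustrations, not Kaze's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class UnifiedMessage:
    """Hypothetical unified format the Conversation Manager could use."""
    conversation_id: str   # stable across channels, so context follows the thread
    channel: str           # "slack" | "email" | "whatsapp" | "telegram"
    sender: str
    text: str
    references: list = field(default_factory=list)  # cross-channel artifacts
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

def from_slack(event: dict, conversation_id: str) -> UnifiedMessage:
    """Channel adapter sketch: translate a Slack-style event dict
    into the unified format."""
    return UnifiedMessage(
        conversation_id=conversation_id,
        channel="slack",
        sender=event["user"],
        text=event["text"],
    )

msg = from_slack({"user": "U123", "text": "approve the draft"}, "conv-42")
```

Because `conversation_id` is channel-independent, a WhatsApp follow-up can land in the same thread as this Slack message, which is what lets context flow across channels.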

Layer 1: Execution

Agents that perform actual work. These are composed from reusable skills and operate within a specific vertical.

Key concepts:

  • Skills — The atomic reusable unit of agent capability (see Section 2.4).
  • Agents — Compositions of skills + a role + context. An agent is instantiated from a template, bound to a client, and assigned a supervision level.
  • Agent Runtime — The execution environment that hosts agents. Based on the actor model — each agent has private state, a message inbox, and processes one task at a time.
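The actor-model properties named above (private state, a message inbox, one task at a time) can be sketched in a few lines. This is a minimal illustration of the pattern, not Kaze's runtime; class and handler names are invented.

```python
import asyncio

class AgentActor:
    """Minimal actor: private state, an inbox, sequential processing."""
    def __init__(self, name, handler):
        self.name = name
        self._state = {}               # private, never shared with other actors
        self._inbox = asyncio.Queue()  # message inbox
        self._handler = handler

    async def send(self, msg):
        await self._inbox.put(msg)

    async def run(self, n_messages):
        # One task at a time: messages are handled strictly in order.
        for _ in range(n_messages):
            msg = await self._inbox.get()
            await self._handler(self._state, msg)

async def demo():
    async def count_tasks(state, msg):
        state["done"] = state.get("done", 0) + 1
    actor = AgentActor("seo-strategist", count_tasks)
    for task in ["keyword-research", "reporting"]:
        await actor.send(task)
    await actor.run(2)
    return actor._state["done"]

done = asyncio.run(demo())
```

The sequential loop is the point: because the actor never processes two messages concurrently, its private state needs no locking.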

Layer 2: Orchestration & Knowledge

Agents that plan, decompose, and coordinate work across execution agents. Also hosts the shared knowledge graph.

Key concepts:

  • Orchestrator Agents — Receive goals, decompose them into subtasks, assign to worker agents. Unlike static DAG workflows, orchestrators reason dynamically about the best approach and can re-plan at runtime if steps fail.
  • Router Agents — Direct incoming requests to the appropriate agent or workflow based on intent classification.
  • Shared Knowledge Graph — The persistent knowledge layer that agents read from and contribute to (see ai-native.md).
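The difference between a static DAG and a re-planning orchestrator can be shown as a loop that consults a planner again when a step fails. All function names and the toy strategies are illustrative assumptions.

```python
def run_goal(goal, plan, execute, replan, max_replans=2):
    """Illustrative orchestrator loop: decompose a goal into subtasks,
    execute them, and re-plan at runtime when a step fails."""
    subtasks = plan(goal)      # initial decomposition
    attempts = 0
    results = []
    while subtasks:
        task = subtasks.pop(0)
        ok, output = execute(task)
        if ok:
            results.append(output)
        elif attempts < max_replans:
            attempts += 1
            # Re-plan: replace the failed step, keep the remaining plan.
            subtasks = replan(goal, failed=task) + subtasks
        else:
            raise RuntimeError(f"goal abandoned after {attempts} re-plans")
    return results

# Toy strategies: step "b" fails once and is replaced by "b2".
def plan(goal): return ["a", "b"]
def execute(task): return (task != "b", task.upper())
def replan(goal, failed): return [failed + "2"]

results = run_goal("audit site", plan, execute, replan)
```

A static DAG engine would simply fail at step "b"; the loop above recovers because planning is itself a callable step.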

Layer 3: Governance & Self-Improvement

Meta-agents that monitor, evaluate, and improve the entire system. This layer is the last to become autonomous and carries the most conservative guardrails.

Key concepts:

  • Supervisor Agents — Watch agent fleet health, detect failures, take corrective action. Unlike traditional monitoring that follows static rules, supervisors reason about novel failure modes.
  • Quality Monitor Agents — Evaluate agent outputs for quality, catch hallucinations or drift, score task completion. Feed results into the supervision ramp.
  • Improvement Agents — Analyze execution patterns, propose prompt refinements, skill updates, and workflow optimizations. All changes go through canary deployment before full rollout.
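How Quality Monitor scores could feed the supervision ramp can be sketched as a threshold-driven state machine. The thresholds, window, and function name here are illustrative assumptions, not Kaze policy.

```python
RAMP = ["supervised", "sampling", "autonomous"]

def next_level(current, recent_scores, promote_at=0.9, demote_at=0.7):
    """Sketch: average recent quality scores and move one step along
    the supervision ramp. Promotion earns autonomy; a quality drop
    falls back toward closer human oversight."""
    avg = sum(recent_scores) / len(recent_scores)
    i = RAMP.index(current)
    if avg >= promote_at and i < len(RAMP) - 1:
        return RAMP[i + 1]   # promote one step
    if avg < demote_at and i > 0:
        return RAMP[i - 1]   # demote one step
    return current           # hold

level = next_level("supervised", [0.95, 0.92, 0.91])
```

Moving one step at a time (never straight from supervised to autonomous) mirrors the conservative guardrails this layer carries.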

2.4 Agent Skills — The Composable Unit

A skill is the atomic reusable unit of agent capability. Skills are the building blocks from which agents are composed.

Skill definition structure:

```yaml
skill: keyword-research
description: "Research and evaluate keyword opportunities"

inputs:
  - business_context       # What the client does
  - current_rankings       # Optional: existing search positions
  - competitors            # Optional: known competitors

tools_required:
  - semrush_api
  - google_search_console
  - llm                    # Analysis & reasoning

outputs:
  - keyword_opportunities  # Structured list
  - priority_ranking       # Scored and ordered
  - reasoning              # Why these keywords

knowledge_dependencies:
  - seo/domain-concepts
  - seo/best-practices

quality_criteria:
  - relevance_score > 0.8
  - business_alignment check
  - search_volume validation
```

An agent is a composition of skills:

```yaml
agent: seo-strategist
role: "Senior SEO strategist for {client}"
skills:
  - keyword-research
  - competitor-analysis
  - content-optimization
  - reporting
knowledge:
  - seo/*
  - client/{client_id}/business-context
autonomy_level: supervised → sampling → autonomous
```

Skills transfer across verticals where applicable. When expanding from SEO to content marketing, skills like keyword-research and competitor-analysis carry over directly while new vertical-specific skills are built.
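Cross-vertical reuse can be illustrated with a toy in-memory skill registry. The registry shape and the `content-marketing` skill set are hypothetical; the skill names mirror the examples above.

```python
# Hypothetical registry: each skill declares which verticals it serves.
REGISTRY = {
    "keyword-research":     {"verticals": {"seo", "content-marketing"}},
    "competitor-analysis":  {"verticals": {"seo", "content-marketing"}},
    "content-optimization": {"verticals": {"seo"}},
    "editorial-calendar":   {"verticals": {"content-marketing"}},  # new skill
}

def skills_for(vertical):
    """All skills available when composing an agent for a vertical."""
    return sorted(name for name, s in REGISTRY.items()
                  if vertical in s["verticals"])

# Skills that carry over directly when expanding from SEO to content marketing:
reused = set(skills_for("seo")) & set(skills_for("content-marketing"))
```

The intersection is the compounding-knowledge claim in miniature: each new vertical starts with a non-empty skill set.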

2.5 LLM Gateway

A critical platform component that sits between all agents and all LLM providers. No agent ever holds a raw API key or calls a provider directly.

┌──────────┐     ┌──────────────────┐     ┌──────────────┐
│  Agent   │────▶│   LLM Gateway    │────▶│  Anthropic   │
│          │     │                  │────▶│  OpenAI      │
│          │     │  - Key routing   │────▶│  Google      │
│          │     │  - Budget mgmt   │────▶│  Local/Ollama│
│          │     │  - Usage tracking│     └──────────────┘
│          │     │  - Rate limiting │
│          │     │  - Fallback      │
└──────────┘     └──────────────────┘

Responsibilities:

  • Key Routing — Resolves which API key to use per request based on tenant, agent, and provider configuration (see infrastructure.md for dual-key model).
  • Provider Abstraction — Agents call a unified interface (complete(messages, tools, model_hint)). The gateway resolves to a specific provider/model based on config, availability, and cost optimization.
  • Budget Management — Per-key, per-tenant, per-agent token tracking with configurable budget caps and hard stops.
  • Rate Limiting — Respects provider rate limits, queues requests, and distributes load.
  • Fallback — If a provider is down or rate-limited, automatically falls back to alternatives (if policy allows).
  • Model Selection — Can route to the cheapest model that meets the quality bar for a given task type (auto-learned over time).
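These responsibilities compose into one call path, sketched below as a minimal in-memory gateway with key routing, a hard budget stop, and provider fallback. The provider callables, key layout, and budget units are illustrative assumptions, not the gateway's actual implementation.

```python
class BudgetExceeded(Exception):
    pass

class LLMGateway:
    """Sketch of the unified interface agents call instead of providers.
    No agent ever sees a raw API key; the gateway resolves it per request."""
    def __init__(self, providers, keys, budgets):
        self.providers = providers  # name -> callable(messages, key)
        self.keys = keys            # (tenant, provider) -> API key
        self.budgets = budgets      # tenant -> remaining token budget

    def complete(self, tenant, messages, preference):
        if self.budgets.get(tenant, 0) <= 0:
            raise BudgetExceeded(tenant)        # hard stop on budget cap
        for name in preference:                 # fallback order per policy
            key = self.keys.get((tenant, name))
            provider = self.providers.get(name)
            if key is None or provider is None:
                continue                        # no key configured: skip
            try:
                reply, tokens = provider(messages, key)
            except ConnectionError:
                continue                        # provider down: fall back
            self.budgets[tenant] -= tokens      # usage tracking
            return name, reply
        raise RuntimeError("all providers exhausted")

# Stand-in providers: one down, one healthy.
def flaky(messages, key): raise ConnectionError
def stable(messages, key): return ("ok", 10)

gw = LLMGateway(
    providers={"anthropic": flaky, "openai": stable},
    keys={("acme", "anthropic"): "k1", ("acme", "openai"): "k2"},
    budgets={"acme": 100},
)
used, reply = gw.complete("acme", [{"role": "user", "content": "hi"}],
                          preference=["anthropic", "openai"])
```

Rate limiting and learned model selection would slot into the same loop (queueing before the call, reordering `preference` from observed cost/quality), which the sketch omits.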

2.6 Agent Hierarchy Summary

| Layer | Role | What lives here | Autonomy level |
| --- | --- | --- | --- |
| Layer 3 | Governance | Supervisor, Quality Monitor, Improvement agents | Last to become autonomous — human oversight longest |
| Layer 2 | Orchestration | Orchestrator agents, Router agents, Knowledge Graph | Second to autonomy |
| Layer 1 | Execution | Worker agents composed of skills | First to go autonomous (per skill, per vertical) |
| Layer 0.5 | Interaction | Conversation Manager, Channel adapters | N/A (infrastructure) |
| Layer 0 | Infrastructure | K8s, NATS, Postgres, Vault, LLM Gateway | N/A (traditional software) |

3. Architecture Comparison

3.1 Comparison with Traditional Patterns

| Dimension | Monolith | SOA | Microservices | EDA | Actor Model | Kaze |
| --- | --- | --- | --- | --- | --- | --- |
| Primary unit | Module | Service | Service | Event handler | Actor | Agent (intelligent actor) |
| Communication | Function call | ESB | API / events | Events | Messages | Messages + shared knowledge |
| State | Shared DB | Shared-ish | Own DB per svc | Event log | Private per actor | Private + shared knowledge graph |
| Deployment | Single unit | Service groups | Per service | Per handler | Per actor system | Per cell |
| Scaling | Vertical | Service-level | Per service | Per topic | Per actor pool | Per cell + per agent pool |
| Intelligence | None | None | None | None | None | Core design property |
| Self-modification | No | No | No | No | No | Yes (governed) |
| Supervision | N/A | Monitoring | Health checks | Dead letter queue | Supervisor trees | Intelligent supervision hierarchy |

3.2 What Fits and What Doesn't from Each Pattern

Monolithic:

  • Fits: Simple to start, fast iteration, easy to reason about. A valid starting point for the internals of a single cell.
  • Doesn't fit: Can't deploy agents independently, can't scale per-agent, can't distribute across VPCs, can't support multi-tenant isolation.

SOA:

  • Fits: Business-capability oriented thinking (verticals as service domains).
  • Doesn't fit: ESB is a centralized bottleneck and single point of failure. Too rigid for dynamic agent composition. Heavy governance overhead.

Microservices:

  • Fits: Independent deployment, own-your-data, polyglot, clean API boundaries. Ideal for platform services.
  • Doesn't fit: Assumes dumb pipes, smart endpoints. Kaze needs smart pipes (NATS with intelligent routing) AND smart endpoints (agents). Agents are stateful, long-lived, autonomous entities — not stateless request handlers.

Event-Driven Architecture:

  • Fits: Loose coupling, async communication, natural for agent-to-agent messaging, event sourcing for audit trails.
  • Doesn't fit: Doesn't address agent lifecycle, supervision, self-improvement, or knowledge sharing. EDA is a communication pattern, not a complete architecture for intelligent systems.

Actor Model:

  • Fits: Closest traditional match — autonomous entities, message-passing, supervision trees, and the ability to spawn child actors. A natural fit for the agent runtime.
  • Doesn't fit: Traditional actors don't have knowledge sharing, self-improvement, or the governance hierarchy Kaze needs. Erlang supervisors apply deterministic restart policies; Kaze supervisors reason intelligently about failures.

3.3 Key Architectural Tensions

Agent Autonomy vs. System Coherence: Actors and microservices maximize independence. Kaze agents need shared knowledge and coordinated behavior. Resolution: agents are independent in how they work (runtime isolation, own state) but connected through the knowledge graph and event bus for what they know and why they act.

Static Infrastructure vs. Dynamic Agent Topology: Traditional architectures assume you know your services at deploy time. Kaze agents can be spawned dynamically by orchestrator agents — the topology changes at runtime. Resolution: Layer 0 (platform) is static infrastructure deployed via IaC. Layers 1-3 (agents) are dynamic — the agent runtime hosts whatever agents the system needs at any given moment.

Cell Isolation vs. Knowledge Sharing: Cell architecture demands isolation. The vertical knowledge flywheel requires knowledge flow across cells. Resolution: two types of knowledge flow differently — vertical knowledge syncs across cells (opt-in, anonymized, managed centrally), client knowledge never leaves the cell.
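The two-tier knowledge flow resolves into a simple export rule at the cell boundary: only vertical-tier, already-anonymized entries may sync out. The entry shape and field names below are illustrative assumptions.

```python
# Illustrative cross-cell sync filter. Client-tier knowledge never
# leaves the cell; vertical-tier knowledge syncs only once anonymized.
def exportable(entries):
    return [e for e in entries
            if e["tier"] == "vertical" and e.get("anonymized", False)]

entries = [
    {"id": 1, "tier": "vertical", "anonymized": True},   # syncs across cells
    {"id": 2, "tier": "vertical", "anonymized": False},  # blocked until scrubbed
    {"id": 3, "tier": "client"},                         # never leaves the cell
]
out = exportable(entries)
```

Making the filter a default-deny allowlist (anything without an explicit `anonymized: true` stays put) keeps the isolation guarantee even when entries are mislabeled.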