Project Kaze — Architecture Document
Status: Draft — Brainstorming Phase
Last Updated: 2026-02-26
Authors: Speedrun Ventures Founding Team
Document Map
| Document | Contents |
|---|---|
| overview.md (this file) | Vision, system architecture, architecture comparison |
| infrastructure.md | Deployment modes, cells, cloud strategy, LLM key management, security controls, observability |
| ai-native.md | AI-native philosophy, self-improvement, knowledge graph, orchestration, agent safety boundaries |
| strategy/product-strategy.md | Verticals, supervision ramp, multi-channel, human-in-the-loop |
| strategy/tradeoffs.md | 9 risks with mitigations, risk summary matrix |
| strategy/decisions.md | Design decisions (D1-D45), technology selection, open questions |
| strategy/mvp.md | MVP scope, 3 verticals, parallel build plan, success criteria |
| technical-design.md | High-level component design for 6 MVP platform components, security controls per component |
| Research | |
| research/knowledge-system.md | Academic literature survey, open source tooling, knowledge architecture options |
| research/openclaw-integration.md | OpenClaw codebase analysis, plugin system, Kaze integration architecture |
| research/threat-model.md | Threat actors, 10 attack surfaces, security properties, MVP security scope |
| research/data-rights-knowledge-sharing.md | Legal risk analysis (GDPR, trade secrets), knowledge sharing options, tiered consent model |
| research/scalability-model.md | Performance bottlenecks, scale milestones, component analysis, scaling triggers |
| research/cost-model.md | LLM token costs, infrastructure costs, unit economics, pricing implications |
| research/openfang-comparison.md | Full comparison with OpenFang agent OS — architecture, features, positioning |
| research/frontier-lab-competitive-analysis.md | Competitive analysis if Anthropic/OpenAI/Google build an agent OS — what's defensible vs commodity |
1. Vision & Principles
1.1 Context
Speedrun Ventures is an AI-native venture studio focused on building AI agents and automated workflows for SMEs to optimize their P&L. The studio operates by building massive fleets of AI agents and tooling, along with an orchestration framework for rapid agent development and operations.
1.2 What is Kaze?
Kaze is an operating system for AI agents. It is the foundational platform that enables Speedrun Ventures to:
- Define new agents from reusable templates and composable skills
- Integrate with external tools and services
- Orchestrate agents into complex workflows
- Monitor, evaluate, and continuously improve agent performance
- Scale agent fleets across multiple clients and deployment environments
Kaze is not another SaaS platform for deploying LLM wrappers. It is an AI-native system where AI is the core operating layer — AI monitors AI, AI improves AI, and human involvement is minimized to governance and exception handling.
1.3 Core Principles
| Principle | Description |
|---|---|
| AI-Native | AI is not a feature bolted onto traditional software. AI agents are the primary units of computation. The system self-monitors, self-improves, and self-heals with minimal human intervention. |
| Security & Privacy First | Every architectural decision prioritizes data isolation, tenant security, and client trust. The system must be deployable in any environment without compromising on security. |
| Cloud-Agnostic | No hard dependencies on any cloud provider. The entire stack runs on any cloud (AWS, GCP, Azure) or on-premises with zero code changes. |
| Vertical-First | Value is created by going deep into well-understood business verticals (SEO, CRM, etc.), not by building a generic horizontal platform. Each vertical creates compounding knowledge. |
| Meet Humans Where They Are | Humans interact with agents through their existing tools (Slack, Email, WhatsApp, Telegram) — not through a specialized dashboard. Communication is natural, not software-operational. |
| Portable by Default | Everything is containerized and defined as Infrastructure as Code. The same artifact deploys in Speedrun's infrastructure or a client's VPC with only configuration differences. |
2. System Architecture
2.1 Architectural Pattern: Agent-Oriented Architecture
Kaze follows an Agent-Oriented Architecture — a hybrid pattern that borrows from several traditional styles but is fundamentally shaped by the fact that its primary units of computation are intelligent, autonomous agents, not passive services.
Borrowed patterns and their roles in Kaze:
| Pattern | What Kaze borrows | Applied where |
|---|---|---|
| Actor Model | Autonomous entities with private state, message-passing, supervision trees | Agent runtime — each agent is an actor |
| Event-Driven Architecture | Loose coupling via async events, event sourcing for audit | Inter-agent communication via NATS |
| Microservices | Independent deployment, own-your-data, clean API boundaries | Platform services (LLM Gateway, Knowledge Graph, etc.) |
| Cell-Based Architecture | Self-contained isolated deployment units | Each tenant/VPC is a cell |
New to Kaze (no traditional equivalent):
- Components that learn and self-modify their behavior over time
- A governance hierarchy where AI agents supervise other AI agents
- Shared knowledge across agents while maintaining runtime isolation
- A supervision ramp (supervised → sampling → autonomous) as a trust model
2.2 Layer Architecture
Kaze is organized into five layers (numbered 0 through 3, with an interaction layer at 0.5), from infrastructure at the bottom to governance at the top:
┌─────────────────────────────────────────────────────────────┐
│ KAZE PLATFORM │
│ │
│ Layer 3: GOVERNANCE & SELF-IMPROVEMENT │
│ ┌────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ Supervisor │ │ Quality │ │ Improvement Agent │ │
│ │ Agents │ │ Monitor │ │ (prompt/skill/workflow │ │
│ │ │ │ Agent │ │ optimization) │ │
│ └────────────┘ └──────────────┘ └────────────────────────┘ │
│ │
│ Layer 2: ORCHESTRATION & KNOWLEDGE │
│ ┌──────────────┐ ┌───────────────────────────────────────┐ │
│ │ Orchestrator │ │ Shared Knowledge Graph │ │
│ │ Agents │ │ ├── Vertical knowledge (SEO, CRM..) │ │
│ │ (dynamic │ │ ├── Cross-vertical patterns │ │
│ │ planning) │ │ └── Client-specific context │ │
│ └──────────────┘ └───────────────────────────────────────┘ │
│ │
│ Layer 1: EXECUTION │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ Agent Skills (composable, reusable per vertical) ││
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ ││
│ │ │ Keyword │ │ Content │ │ Lead │ │ Report │ ││
│ │ │ Research │ │ Optimize │ │ Scoring │ │ Generator │ ││
│ │ └──────────┘ └──────────┘ └──────────┘ └────────────┘ ││
│ └─────────────────────────────────────────────────────────┘│
│ │
│ Layer 0.5: INTERACTION │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Conversation Manager │ │
│ │ ┌───────┐ ┌───────┐ ┌──────────┐ ┌────────┐ │ │
│ │ │ Slack │ │ Email │ │ WhatsApp │ │Telegram│ ... │ │
│ │ └───────┘ └───────┘ └──────────┘ └────────┘ │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ Layer 0: PLATFORM INFRASTRUCTURE │
│ ┌──────────┐ ┌──────┐ ┌─────────┐ ┌──────┐ ┌───────────┐ │
│ │ K8s │ │ NATS │ │Postgres │ │Vault │ │ OTel/Prom │ │
│ │ │ │ │ │ │ │ │ │ Grafana │ │
│ └──────────┘ └──────┘ └─────────┘ └──────┘ └───────────┘ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ LLM Gateway (multi-provider, dual-key, budget mgmt) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ ALL CONTAINERIZED · ALL IaC · ANY CLOUD · ANY VPC │
└─────────────────────────────────────────────────────────────┘
2.3 Layer Descriptions
Layer 0: Platform Infrastructure
Non-AI infrastructure that provides the runtime foundation. This is traditional software — deterministic, well-understood, battle-tested.
Components:
- Kubernetes — Universal compute runtime. Provides scheduling, scaling, networking, and the deployment abstraction across any cloud.
- NATS — Lightweight message bus for all inter-agent communication. Supports pub/sub, request/reply, and persistent streaming (JetStream). Chosen for portability and minimal operational overhead.
- PostgreSQL — Primary relational datastore. Managed via CloudNativePG operator for Kubernetes-native operation.
- HashiCorp Vault — Secrets management. Stores LLM API keys (both Speedrun-owned and client-provided), agent credentials, and encryption keys.
- OpenTelemetry + Prometheus + Grafana + Loki — Full observability stack. OTel for distributed tracing, Prometheus for metrics, Grafana for visualization, Loki for log aggregation.
- LLM Gateway — Abstraction layer between agents and LLM providers (see Section 2.5).
Layer 0.5: Interaction Layer
The multi-channel communication layer that enables humans to interact with agents naturally through their existing tools.
Components:
- Conversation Manager — Maintains unified conversation threads across channels. An agent doesn't "think in Slack" or "think in email" — it thinks in tasks and conversations. The channel is a delivery mechanism.
- Channel Adapters — Slack bot, Email agent, WhatsApp agent, Telegram bot, and future integrations. Each adapter translates between the channel's protocol and the Conversation Manager's unified format.
- Approval Flow Engine — Routes approval requests to the appropriate channel based on context, urgency, and client preference.
- Context Persistence — Maintains conversation history across channels so that context flows naturally (e.g., a question asked on WhatsApp can reference a document sent via email).
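As a sketch of what a channel adapter translates into, the following shows a hypothetical unified message format and a Slack adapter. The field names and `from_slack_event` helper are illustrative assumptions, not the actual Conversation Manager schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical unified message format -- field names are illustrative,
# not the real Conversation Manager schema.
@dataclass
class UnifiedMessage:
    conversation_id: str   # channel-independent thread id
    channel: str           # "slack", "email", "whatsapp", "telegram", ...
    sender: str
    text: str
    attachments: list = field(default_factory=list)
    received_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def from_slack_event(event: dict, conversation_id: str) -> UnifiedMessage:
    """Example adapter: translate a Slack message event into the unified format."""
    return UnifiedMessage(
        conversation_id=conversation_id,
        channel="slack",
        sender=event["user"],
        text=event["text"],
    )

msg = from_slack_event({"user": "U123", "text": "Approve the Q3 report?"}, "conv-42")
```

Because every adapter normalizes into the same shape, the Conversation Manager can stitch a thread together regardless of which channel each message arrived on.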
Layer 1: Execution
Agents that perform actual work. These are composed from reusable skills and operate within a specific vertical.
Key concepts:
- Skills — The atomic reusable unit of agent capability (see Section 2.4).
- Agents — Compositions of skills + a role + context. An agent is instantiated from a template, bound to a client, and assigned a supervision level.
- Agent Runtime — The execution environment that hosts agents. Based on the actor model — each agent has private state, a message inbox, and processes one task at a time.
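The actor-model properties described above (private state, a message inbox, one task at a time) can be sketched as follows. This is a synchronous toy, assuming in-process queues; the real runtime would be asynchronous and distributed over NATS.

```python
from queue import Queue

# Minimal actor-style agent sketch (illustrative; the production runtime is
# async and distributed). Private state + inbox + one-message-at-a-time.
class AgentActor:
    def __init__(self, name: str):
        self.name = name
        self._state = {"tasks_done": 0}   # private state: never shared directly
        self._inbox = Queue()

    def send(self, message: dict):
        self._inbox.put(message)          # message-passing, not shared memory

    def run_until_empty(self):
        while not self._inbox.empty():
            msg = self._inbox.get()       # process exactly one task at a time
            self._handle(msg)

    def _handle(self, msg: dict):
        self._state["tasks_done"] += 1

agent = AgentActor("seo-strategist")
agent.send({"task": "keyword-research"})
agent.send({"task": "reporting"})
agent.run_until_empty()
```

Serializing on the inbox is what lets each agent reason about its own state without locks; concurrency comes from running many actors, not from threading inside one.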
Layer 2: Orchestration & Knowledge
Agents that plan, decompose, and coordinate work across execution agents. Also hosts the shared knowledge graph.
Key concepts:
- Orchestrator Agents — Receive goals, decompose them into subtasks, assign to worker agents. Unlike static DAG workflows, orchestrators reason dynamically about the best approach and can re-plan at runtime if steps fail.
- Router Agents — Direct incoming requests to the appropriate agent or workflow based on intent classification.
- Shared Knowledge Graph — The persistent knowledge layer that agents read from and contribute to (see ai-native.md).
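A toy sketch of the Router Agent's job: in production the intent classification would be LLM-driven, but a keyword lookup stands in here. The route table and agent names are illustrative assumptions.

```python
# Toy router: keyword lookup standing in for LLM intent classification.
# Route table and agent names are hypothetical.
ROUTES = {
    "keyword": "seo-strategist",
    "lead": "crm-agent",
    "report": "reporting-agent",
}

def route(request: str, default: str = "orchestrator") -> str:
    """Return the name of the agent that should handle the request."""
    text = request.lower()
    for needle, agent in ROUTES.items():
        if needle in text:
            return agent
    return default  # unknown intent: escalate to an orchestrator for planning

assert route("Find keyword opportunities for acme.com") == "seo-strategist"
```

The default branch illustrates the division of labor: routers handle recognizable intents cheaply, while anything novel falls through to an orchestrator that can plan dynamically.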
Layer 3: Governance & Self-Improvement
Meta-agents that monitor, evaluate, and improve the entire system. This layer is the last to become autonomous and carries the most conservative guardrails.
Key concepts:
- Supervisor Agents — Watch agent fleet health, detect failures, take corrective action. Unlike traditional monitoring that follows static rules, supervisors reason about novel failure modes.
- Quality Monitor Agents — Evaluate agent outputs for quality, catch hallucinations or drift, score task completion. Feed results into the supervision ramp.
- Improvement Agents — Analyze execution patterns, propose prompt refinements, skill updates, and workflow optimizations. All changes go through canary deployment before full rollout.
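The supervision ramp that Quality Monitor results feed into can be sketched as a small state machine. The thresholds and score window below are made-up placeholders, not tuned values.

```python
# Supervision ramp as a state machine driven by quality scores.
# Thresholds (0.9 / 0.7) are illustrative placeholders, not tuned values.
RAMP = ["supervised", "sampling", "autonomous"]

def next_level(current: str, recent_scores: list[float],
               promote_at: float = 0.9, demote_at: float = 0.7) -> str:
    """Promote one step on sustained quality; demote on sustained failure."""
    i = RAMP.index(current)
    avg = sum(recent_scores) / len(recent_scores)
    if avg >= promote_at and i < len(RAMP) - 1:
        return RAMP[i + 1]
    if avg < demote_at and i > 0:
        return RAMP[i - 1]   # fall back toward human oversight
    return current

assert next_level("supervised", [0.95, 0.92, 0.94]) == "sampling"
assert next_level("autonomous", [0.5, 0.6]) == "sampling"
```

Promotion moves one step at a time, while demotion is always available — encoding the principle that trust is earned slowly and revoked quickly.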
2.4 Agent Skills — The Composable Unit
A skill is the atomic reusable unit of agent capability. Skills are the building blocks from which agents are composed.
Skill definition structure:
```yaml
skill: keyword-research
description: "Research and evaluate keyword opportunities"
inputs:
  - business_context        # What the client does
  - current_rankings        # Optional: existing search positions
  - competitors             # Optional: known competitors
tools_required:
  - semrush_api
  - google_search_console
  - llm                     # Analysis & reasoning
outputs:
  - keyword_opportunities   # Structured list
  - priority_ranking        # Scored and ordered
  - reasoning               # Why these keywords
knowledge_dependencies:
  - seo/domain-concepts
  - seo/best-practices
quality_criteria:
  - relevance_score > 0.8
  - business_alignment check
  - search_volume validation
```

An agent is a composition of skills:
```yaml
agent: seo-strategist
role: "Senior SEO strategist for {client}"
skills:
  - keyword-research
  - competitor-analysis
  - content-optimization
  - reporting
knowledge:
  - seo/*
  - client/{client_id}/business-context
autonomy_level: supervised → sampling → autonomous
```

Skills transfer across verticals where applicable. When expanding from SEO to content marketing, skills like keyword-research and competitor-analysis carry over directly, while new vertical-specific skills are built.
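One job the platform presumably needs is checking that an agent template is deployable before instantiation. The sketch below validates a composition against a skill registry; the registry contents and `validate_agent` helper are hypothetical.

```python
# Hypothetical composition check: every skill an agent template declares must
# exist in the skill registry, and its tool requirements must be satisfiable.
SKILL_REGISTRY = {
    "keyword-research": {"tools_required": ["semrush_api", "google_search_console", "llm"]},
    "competitor-analysis": {"tools_required": ["semrush_api", "llm"]},
    "reporting": {"tools_required": ["llm"]},
}

def validate_agent(agent: dict, available_tools: set) -> list[str]:
    """Return a list of problems; an empty list means the composition is deployable."""
    problems = []
    for skill_name in agent["skills"]:
        skill = SKILL_REGISTRY.get(skill_name)
        if skill is None:
            problems.append(f"unknown skill: {skill_name}")
            continue
        missing = set(skill["tools_required"]) - available_tools
        if missing:
            problems.append(f"{skill_name} missing tools: {sorted(missing)}")
    return problems

agent = {"name": "seo-strategist", "skills": ["keyword-research", "reporting"]}
issues = validate_agent(agent, {"semrush_api", "google_search_console", "llm"})
assert issues == []
```

Failing fast at template-binding time keeps broken compositions out of the runtime, where failures would be more expensive to diagnose.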
2.5 LLM Gateway
A critical platform component that sits between all agents and all LLM providers. No agent ever holds a raw API key or calls a provider directly.
┌─────────┐     ┌──────────────────┐     ┌──────────────┐
│  Agent  │────▶│   LLM Gateway    │────▶│  Anthropic   │
│         │     │                  │────▶│  OpenAI      │
│         │     │ - Key routing    │────▶│  Google      │
│         │     │ - Budget mgmt    │────▶│  Local/Ollama│
│         │     │ - Usage tracking │     └──────────────┘
│         │     │ - Rate limiting  │
│         │     │ - Fallback       │
└─────────┘     └──────────────────┘

Responsibilities:
- Key Routing — Resolves which API key to use per request based on tenant, agent, and provider configuration (see infrastructure.md for the dual-key model).
- Provider Abstraction — Agents call a unified interface (`complete(messages, tools, model_hint)`). The gateway resolves to a specific provider/model based on config, availability, and cost optimization.
- Budget Management — Per-key, per-tenant, per-agent token tracking with configurable budget caps and hard stops.
- Rate Limiting — Respects provider rate limits, queues requests, and distributes load.
- Fallback — If a provider is down or rate-limited, automatically falls back to alternatives (if policy allows).
- Model Selection — Can route to the cheapest model that meets the quality bar for a given task type (auto-learned over time).
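The budget-cap and fallback responsibilities combine as sketched below. The providers are fakes and the token estimate is deliberately crude; the real gateway would track actual token usage per key/tenant/agent and respect provider rate limits.

```python
# Gateway sketch: hard budget stop + provider fallback. Providers are fakes;
# token accounting here is a crude character count, purely illustrative.
class BudgetExceeded(Exception): pass
class ProviderDown(Exception): pass

class LLMGateway:
    def __init__(self, providers, budget_tokens: int):
        self.providers = providers        # ordered by preference
        self.remaining = budget_tokens    # hard stop when exhausted

    def complete(self, messages, model_hint=None):
        est = sum(len(m) for m in messages)   # stand-in token estimate
        if est > self.remaining:
            raise BudgetExceeded("tenant budget cap reached")
        for provider in self.providers:       # fall back if policy allows
            try:
                reply = provider(messages)
                self.remaining -= est
                return reply
            except ProviderDown:
                continue
        raise ProviderDown("all providers unavailable")

def flaky(messages):   raise ProviderDown()
def healthy(messages): return "ok"

gw = LLMGateway([flaky, healthy], budget_tokens=1000)
assert gw.complete(["hello"]) == "ok"
```

Because the budget check happens before any provider call, a hard stop can never be bypassed by a fallback path.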
2.6 Agent Hierarchy Summary
| Layer | Role | What lives here | Autonomy level |
|---|---|---|---|
| Layer 3 | Governance | Supervisor, Quality Monitor, Improvement agents | Last to become autonomous — human oversight longest |
| Layer 2 | Orchestration | Orchestrator agents, Router agents, Knowledge Graph | Second to autonomy |
| Layer 1 | Execution | Worker agents composed of skills | First to go autonomous (per skill, per vertical) |
| Layer 0.5 | Interaction | Conversation Manager, Channel adapters | N/A (infrastructure) |
| Layer 0 | Infrastructure | K8s, NATS, Postgres, Vault, LLM Gateway | N/A (traditional software) |
3. Architecture Comparison
3.1 Comparison with Traditional Patterns
| Dimension | Monolith | SOA | Microservices | EDA | Actor Model | Kaze |
|---|---|---|---|---|---|---|
| Primary unit | Module | Service | Service | Event handler | Actor | Agent (intelligent actor) |
| Communication | Function call | ESB | API / events | Events | Messages | Messages + shared knowledge |
| State | Shared DB | Shared-ish | Own DB per svc | Event log | Private per actor | Private + shared knowledge graph |
| Deployment | Single unit | Service groups | Per service | Per handler | Per actor system | Per cell |
| Scaling | Vertical | Service-level | Per service | Per topic | Per actor pool | Per cell + per agent pool |
| Intelligence | None | None | None | None | None | Core design property |
| Self-modification | No | No | No | No | No | Yes (governed) |
| Supervision | N/A | Monitoring | Health checks | Dead letter queue | Supervisor trees | Intelligent supervision hierarchy |
3.2 What Fits and What Doesn't from Each Pattern
Monolithic:
- Fits: Simple to start, fast iteration, easy to reason about. A valid starting point for the internals of a single cell.
- Doesn't fit: Can't deploy agents independently, can't scale per-agent, can't distribute across VPCs, can't support multi-tenant isolation.
SOA:
- Fits: Business-capability oriented thinking (verticals as service domains).
- Doesn't fit: ESB is a centralized bottleneck and single point of failure. Too rigid for dynamic agent composition. Heavy governance overhead.
Microservices:
- Fits: Independent deployment, own-your-data, polyglot, clean API boundaries. Ideal for platform services.
- Doesn't fit: Assumes dumb pipes, smart endpoints. Kaze needs smart pipes (NATS with intelligent routing) AND smart endpoints (agents). Agents are stateful, long-lived, autonomous entities — not stateless request handlers.
Event-Driven Architecture:
- Fits: Loose coupling, async communication, natural for agent-to-agent messaging, event sourcing for audit trails.
- Doesn't fit: Doesn't address agent lifecycle, supervision, self-improvement, or knowledge sharing. EDA is a communication pattern, not a complete architecture for intelligent systems.
Actor Model:
- Fits: Closest traditional match — autonomous entities, message-passing, supervision trees, spawn children. Natural fit for agent runtime.
- Doesn't fit: Traditional actors don't have knowledge sharing, self-improvement, or the governance hierarchy Kaze needs. Erlang supervisors apply deterministic restart policies; Kaze supervisors reason intelligently about failures.
3.3 Key Architectural Tensions
Agent Autonomy vs. System Coherence: Actors and microservices maximize independence. Kaze agents need shared knowledge and coordinated behavior. Resolution: agents are independent in how they work (runtime isolation, own state) but connected through the knowledge graph and event bus for what they know and why they act.
Static Infrastructure vs. Dynamic Agent Topology: Traditional architectures assume you know your services at deploy time. Kaze agents can be spawned dynamically by orchestrator agents — the topology changes at runtime. Resolution: Layer 0 (platform) is static infrastructure deployed via IaC. Layers 1-3 (agents) are dynamic — the agent runtime hosts whatever agents the system needs at any given moment.
Cell Isolation vs. Knowledge Sharing: Cell architecture demands isolation. The vertical knowledge flywheel requires knowledge flow across cells. Resolution: two types of knowledge flow differently — vertical knowledge syncs across cells (opt-in, anonymized, managed centrally), client knowledge never leaves the cell.
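The resolution above — vertical knowledge syncs on opt-in and anonymized, client knowledge never leaves the cell — can be sketched as an export filter. The record shape, scope labels, and helper names are illustrative assumptions.

```python
# Sketch of the two knowledge flows. Record shape and "client/..." vs
# "seo/..." scope convention are hypothetical.
def exportable(record: dict, cell_opted_in: bool) -> bool:
    """Decide whether a knowledge record may leave its cell."""
    if record["scope"].startswith("client/"):
        return False                 # client knowledge never leaves the cell
    return cell_opted_in             # vertical knowledge syncs only on opt-in

def anonymize(record: dict) -> dict:
    """Strip client-identifying fields before central sync."""
    return {k: v for k, v in record.items() if k not in ("client_id", "source_cell")}

rec = {"scope": "seo/best-practices", "client_id": "c-7", "body": "..."}
assert exportable(rec, cell_opted_in=True)
assert "client_id" not in anonymize(rec)
assert not exportable({"scope": "client/c-7/context"}, cell_opted_in=True)
```

Keeping the policy in one chokepoint (rather than in each agent) is what makes the isolation property auditable per cell.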