Project Kaze — Philosophy & Vision
Context
Speedrun Ventures is an AI-native venture studio building AI agents and automated workflows for SMEs to optimize their P&L. The studio operates a portfolio of products — each generating domain-specific data and knowledge — with a shared agent platform (Kaze) that compounds learnings across all of them.
What is Kaze?
Kaze is an operating system for AI agents. It provides the runtime, knowledge, tooling, and governance layers that enable Speedrun to:
- Define agents from composable, reusable skills
- Execute agents across multiple business verticals
- Accumulate and share knowledge across agents and verticals
- Earn trust through a supervision ramp (supervised → sampling → autonomous)
- Scale agent fleets without per-agent operational burden
Kaze is not a chatbot platform or an LLM wrapper. It is a system where AI agents are the primary units of computation — reasoning about tasks, using tools, learning from outcomes, and operating under governance.
Core Principles
| Principle | What it means in practice |
|---|---|
| AI-Native | AI agents are the primary compute units, not add-ons to traditional software. The system is designed around agent reasoning, memory, and tool use as first-class concepts. |
| Vertical-First | Value comes from going deep into specific business domains (internal ops, SEO, content), not from building a generic horizontal platform. Each vertical creates compounding domain knowledge. |
| Security & Privacy First | Every design decision prioritizes data isolation, tenant security, and credential management. Zero-secret runtime — agents never hold API keys. |
| Meet Humans Where They Are | Users interact via their existing tools (Slack, WhatsApp, Telegram) through OpenClaw. No specialized dashboard required for day-to-day use. |
| Cloud-Agnostic | All infrastructure is containerized and defined as IaC. The same artifacts deploy on any cloud or on-premises with only configuration differences. |
| Portable by Default | One build, different configurations. No SaaS-only or on-prem-only code paths. |
The AI-Native Paradigm
Most "AI platforms" place humans as operators with AI as a tool:
Human defines → Human deploys → Agent executes → Human monitors → Human improves

Kaze's target state inverts this. AI is the operating layer. Humans are governors:
Human sets goals & guardrails
│
▼
AI orchestrates → AI executes → AI evaluates
↑ │
└──── AI improves & adapts ────┘
Human intervenes only at:
- Goal setting
- Guardrail violations
- Escalation thresholds
- Approval gates (when configured)

Current state: The platform today follows the first pattern — humans define skills, deploy agents, monitor via Langfuse, and manually improve prompts. The AI-native self-improvement loop (Layer 3: Governance) is a design-phase target, not yet implemented, but the architecture is built to support this evolution.
AI Monitors AI (Target State)
The design calls for a Supervisor Agent Layer where AI agents monitor other agents:
- Health Monitor Agent — Watches fleet health, detects failures, restarts stuck agents
- Cost Monitor Agent — Tracks token spend, detects budget anomalies, throttles proactively
- Quality Monitor Agent — Evaluates outputs, catches hallucinations, scores task completion
These would reason about novel failure modes rather than following static rules. Hard circuit breakers remain deterministic code — budget limits, error rate thresholds, and permission boundaries are enforced by platform code, not AI reasoning.
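The split above can be sketched in code: monitor agents reason about novel failures, but the stop decision stays in plain, rule-based platform code that no LLM output can override. This is an illustrative sketch only; `BudgetBreaker` and its fields are invented names, not Kaze APIs.

```typescript
// Illustrative hard circuit breaker: deterministic platform code,
// independent of any AI reasoning. All names are hypothetical.
interface BreakerLimits {
  maxDailySpendUsd: number;
  maxErrorRate: number; // 0..1, over the current window
}

class BudgetBreaker {
  private spendUsd = 0;
  private runs = 0;
  private failures = 0;

  constructor(private limits: BreakerLimits) {}

  recordRun(costUsd: number, failed: boolean): void {
    this.spendUsd += costUsd;
    this.runs += 1;
    if (failed) this.failures += 1;
  }

  /** Pure, rule-based check: no model call can override the result. */
  tripped(): boolean {
    const errorRate = this.runs === 0 ? 0 : this.failures / this.runs;
    return (
      this.spendUsd >= this.limits.maxDailySpendUsd ||
      errorRate >= this.limits.maxErrorRate
    );
  }
}
```

A monitor agent might flag an anomaly earlier, but only this kind of deterministic check gets to halt execution.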
AI Improves AI (Target State)
Every agent execution produces signals that can feed a continuous improvement cycle:
| Layer | What improves | How |
|---|---|---|
| Prompts | System prompts, few-shot examples | A/B testing, measuring output quality, auto-selecting winners |
| Tool usage | Which tools, in what order | Analyzing successful vs. failed runs, optimizing patterns |
| Model selection | Which LLM for which task | Cost vs. quality tracking per model per task, auto-routing |
| Knowledge | What context an agent receives | Learning which knowledge is useful, pruning noise |
Safeguard: All self-improvements would be versioned, canaried, and reversible. An agent never modifies itself for all traffic simultaneously.
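The "never for all traffic simultaneously" rule can be made concrete with a small sketch: a candidate prompt version only ever receives a bounded canary share of traffic, and the stable version stays addressable for instant rollback. Class and field names here are assumptions for illustration, not the platform's actual rollout mechanism.

```typescript
// Hypothetical canaried rollout for a self-improved prompt version.
interface PromptVersion {
  id: string;
  systemPrompt: string;
}

class CanaryRollout {
  constructor(
    private stable: PromptVersion,
    private canary: PromptVersion,
    private canaryShare: number // e.g. 0.05 = 5% of traffic
  ) {
    // Invariant from the text: a self-improvement never takes 100% of traffic.
    if (canaryShare < 0 || canaryShare >= 1) {
      throw new Error("canary share must stay below 100% of traffic");
    }
  }

  /** Deterministic routing by request id keeps assignment stable. */
  pick(requestId: number): PromptVersion {
    const bucket = (requestId % 100) / 100;
    return bucket < this.canaryShare ? this.canary : this.stable;
  }

  /** The previous version is always retained, so rollback is trivial. */
  rollback(): PromptVersion {
    return this.stable;
  }
}
```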
The Vertical-First Flywheel
Kaze's moat is domain knowledge accumulated through vertical depth:
┌─────────────────┐
┌───▶│ More Agents │───┐
│ │ (new verticals) │ │
│ └─────────────────┘ │
│ ▼
┌──┴───────────────┐    ┌───────────────────┐
│ Better Knowledge │◀───│  More Executions  │
│ (shared patterns)│    │ (tasks completed) │
└──────────────────┘    └───────────────────┘

Each vertical produces domain knowledge. Cross-vertical patterns (communication, data analysis, reporting) compound. A new vertical starts with shared knowledge from day one.
Current Verticals
| Vertical | Status | Agent Focus |
|---|---|---|
| V0: Internal Ops | Active — 6 skills | Research, GitHub operations, digests, triage, docs-sync, meeting notes |
| V1: SEO Automation | Planned | Keyword research, content optimization, technical audits, reporting |
| V2: Toddle Activity Enrichment | Planned | Content enrichment, data quality, recommendation tuning |
V0 is strategic. It's Speedrun's own internal operations — fast feedback, no external coordination, every platform component gets exercised before external verticals use it.
The Supervision Ramp
Trust is earned, not assumed. Every agent-skill pair progresses through three levels:
supervised ──────▶ sampling ──────▶ autonomous
│ │ │
│ Every output │ Random X% │ All outputs
│ reviewed by │ reviewed. │ delivered
│ human before │ Rest delivered │ directly.
│ delivery. │ immediately. │ Async quality
│ │ │ monitoring.
│ │ │
└── demotion ◀─────┴── demotion ◀────┘
  if quality drops     if quality drops

Key design decisions:
- Per-skill, not per-agent. An agent can be autonomous for one skill and supervised for another.
- Read-only to agents. Only the platform's supervision ramp logic can promote or demote. Agents cannot query or influence their own supervision state.
- Automatic promotion at thresholds: 20 successful runs → sampling, 50 → autonomous. 3 consecutive failures → demotion.
- Hard boundaries. Safety-critical operations remain supervised regardless of performance history.
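The promotion and demotion rules above amount to a small state machine per agent-skill pair. The thresholds (20 → sampling, 50 → autonomous, 3 consecutive failures → demotion) come from the text; the class shape, and the choice to reset the success counter on demotion, are illustrative assumptions.

```typescript
// Sketch of the supervision ramp for one agent-skill pair.
type Level = "supervised" | "sampling" | "autonomous";

class SupervisionRamp {
  private level: Level = "supervised";
  private successes = 0;
  private consecutiveFailures = 0;

  constructor(private safetyCritical = false) {}

  record(success: boolean): Level {
    if (success) {
      this.successes += 1;
      this.consecutiveFailures = 0;
      // Safety-critical skills never leave supervised, regardless of history.
      if (!this.safetyCritical) {
        if (this.successes >= 50) this.level = "autonomous";
        else if (this.successes >= 20) this.level = "sampling";
      }
    } else {
      this.consecutiveFailures += 1;
      if (this.consecutiveFailures >= 3) {
        // Demote one step and restart both counters (assumption: trust is
        // re-earned from scratch after a demotion).
        this.level = this.level === "autonomous" ? "sampling" : "supervised";
        this.consecutiveFailures = 0;
        this.successes = 0;
      }
    }
    return this.level;
  }
}
```

Note that nothing here is callable by the agent itself: per the read-only rule, only platform code records outcomes and transitions levels.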
Knowledge Architecture
The knowledge system provides persistent memory for all agents. The design has three tiers, with MVP implementing the first:
Implemented: Per-Agent Episodic Memory
Each agent has its own memory space via Mem0. Before reasoning, an agent searches for relevant context; after completing a task, it stores key learnings. Facts are extracted by an LLM and embedded with Google's gemini-embedding-001 (768-dimensional vectors).
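The search-before / store-after loop can be sketched as follows. `MemoryStore` is a stand-in interface invented for this example; in Kaze it would be backed by Mem0, whose real client API differs from this simplified shape.

```typescript
// Illustrative episodic-memory loop around one task execution.
interface MemoryStore {
  search(agentId: string, query: string): Promise<string[]>;
  add(agentId: string, fact: string): Promise<void>;
}

async function runWithMemory(
  memory: MemoryStore,
  agentId: string,
  task: string,
  execute: (task: string, context: string[]) => Promise<string>
): Promise<string> {
  // 1. Retrieve relevant episodic context before reasoning.
  const context = await memory.search(agentId, task);
  // 2. Execute the task with that context injected.
  const result = await execute(task, context);
  // 3. Persist a key learning for future runs (LLM-based fact
  //    extraction is elided here).
  await memory.add(agentId, `Task "${task}" -> ${result}`);
  return result;
}
```

Because each agent passes its own `agentId`, memory spaces stay isolated per agent, matching the per-agent tier described above.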
Designed (Not Yet Built): Shared Knowledge
┌──────────────────────────────────────────────────┐
│ Knowledge Architecture │
│ │
│ Vertical Knowledge (shared within a vertical) │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ SEO │ │ Ops │ │ Toddle │ ... │
│ └────┬────┘ └────┬────┘ └────┬────┘ │
│ └────────────┼────────────┘ │
│ ▼ │
│ Cross-Vertical Knowledge │
│ ┌─────────────────────────────────────┐ │
│ │ - Business operations patterns │ │
│ │ - Communication best practices │ │
│ │ - Data analysis methods │ │
│ └─────────────────────────────────────┘ │
│ │
│ Client-Specific Knowledge (isolated) │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Client A │ │ Client B │ │ Client C │ │
│ └───────────┘ └───────────┘ └───────────┘ │
└──────────────────────────────────────────────────┘

Isolation rules: Vertical knowledge is shared within a vertical. Cross-vertical patterns apply everywhere. Client-specific knowledge never leaves the client boundary.
Knowledge provenance: Every entry is tagged with a source class (public, speedrun_internal, client_private, etc.) that determines visibility.
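A minimal sketch of provenance-based filtering: the source classes are from the text, while the function and the policy for internal knowledge (shown permissive here) are illustrative assumptions.

```typescript
// Hypothetical visibility check driven by knowledge provenance tags.
type SourceClass = "public" | "speedrun_internal" | "client_private";

interface KnowledgeEntry {
  content: string;
  sourceClass: SourceClass;
  clientId?: string; // set only for client_private entries
}

function visibleTo(entry: KnowledgeEntry, servingClientId: string | null): boolean {
  if (entry.sourceClass === "client_private") {
    // Hard rule from the text: client knowledge never crosses the boundary.
    return entry.clientId === servingClientId;
  }
  // Policy choice (assumed permissive here): public and internal knowledge
  // is visible to all agents.
  return true;
}
```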
Agent Safety Boundaries
AI autonomy requires hard boundaries enforced by deterministic platform code, not agent self-discipline.
Capability Manifests
Every agent declares exactly what it can do — whitelisted tools, knowledge domains, and communication channels. The runtime enforces this on every action.
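A manifest and its runtime check might look like the following sketch. The field names and deny-by-default helper are assumptions for illustration, not the actual manifest schema.

```typescript
// Illustrative capability manifest, enforced before every action.
interface CapabilityManifest {
  tools: string[];            // whitelisted tool names
  knowledgeDomains: string[]; // e.g. "ops", "seo"
  channels: string[];         // e.g. "slack"
}

function assertToolAllowed(manifest: CapabilityManifest, tool: string): void {
  // Deny-by-default: anything not declared is rejected before execution.
  if (!manifest.tools.includes(tool)) {
    throw new Error(`tool "${tool}" not in capability manifest`);
  }
}
```

The point of the pattern is that the check runs in the runtime on every action, so an agent talked into calling an undeclared tool simply fails.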
Instruction Hierarchy
When processing context, a strict trust ordering applies:
System prompt (platform-defined, immutable)
> Skill definition (admin-authored, versioned)
> Retrieved knowledge (quality-gated, provenance-tracked)
> User input (untrusted by default)
> Tool output (untrusted, external)

Higher levels override lower levels. This is the primary defense against prompt injection.
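One way to apply the ordering in practice, sketched here with invented names: assemble the model context in fixed trust order and delimit each tier, so the system prompt can instruct the model that lower tiers never override higher ones.

```typescript
// Illustrative context assembly in trust order (highest first).
const TRUST_ORDER = [
  "system",    // platform-defined, immutable
  "skill",     // admin-authored, versioned
  "knowledge", // quality-gated, provenance-tracked
  "user",      // untrusted by default
  "tool",      // untrusted, external
] as const;
type Tier = (typeof TRUST_ORDER)[number];

function assembleContext(blocks: Partial<Record<Tier, string>>): string {
  return TRUST_ORDER
    .filter((tier) => blocks[tier] !== undefined)
    .map((tier) => `<${tier}>\n${blocks[tier]}\n</${tier}>`)
    .join("\n");
}
```

Delimiting tiers does not make injected text harmless on its own; it gives the higher-trust instructions an unambiguous way to refer to, and discount, the untrusted spans.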
Subagent Privilege Inheritance
When an agent spawns a subagent:
- The subagent inherits at most the parent's capabilities
- The subagent inherits at most the parent's supervision level
- Parent's resource quotas are shared, not duplicated
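The "at most the parent's" rules reduce to a capability intersection and a supervision-level minimum, sketched below with invented names (resource quotas are not modeled here, since they are shared with the parent rather than derived):

```typescript
// Illustrative derivation of a subagent's effective privileges.
type SupervisionLevel = "supervised" | "sampling" | "autonomous";
const LEVEL_RANK: Record<SupervisionLevel, number> = {
  supervised: 0,
  sampling: 1,
  autonomous: 2,
};

function deriveSubagent(
  parentTools: string[],
  parentLevel: SupervisionLevel,
  requestedTools: string[],
  requestedLevel: SupervisionLevel
): { tools: string[]; level: SupervisionLevel } {
  return {
    // The subagent can only use tools the parent already holds.
    tools: requestedTools.filter((t) => parentTools.includes(t)),
    // The subagent can never be more autonomous than its parent.
    level:
      LEVEL_RANK[requestedLevel] <= LEVEL_RANK[parentLevel]
        ? requestedLevel
        : parentLevel,
  };
}
```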
Relationship with OpenClaw
OpenClaw (the Claude Code fork) serves as the communication and conversation layer. Kaze builds its own agent design, memory system, and knowledge architecture on top:
┌──────────────────────────────────────────────────┐
│ User Interface (WhatsApp / Telegram / Slack) │
└──────────────┬───────────────────────────────────┘
│
┌──────────────▼───────────────────────────────────┐
│ OpenClaw Layer │
│ • Conversation management │
│ • Simple orchestration & tool routing │
│ • Subagent spawning │
│ • Multi-channel support (built-in) │
└──────────────┬───────────────────────────────────┘
│ kaze_dispatch_task / kaze_list_verticals
┌──────────────▼───────────────────────────────────┐
│ Kaze Platform Layer │
│ • Agent runtime (YAML + TypeScript hybrid) │
│ • Memory system (Mem0 + pgvector) │
│ • LLM Gateway (multi-provider) │
│ • Skill framework (composable, vertical-specific) │
│ • Supervision ramp & quality monitoring │
└──────────────────────────────────────────────────┘

Why this split: OpenClaw provides a mature conversation layer and multi-channel support. Kaze's differentiation is in agent architecture, memory, knowledge accumulation, and the self-improvement loop — not chat interfaces.
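The boundary between the layers is narrow by design: OpenClaw only sees the two dispatch tools named in the diagram. A possible shape for that contract is sketched below; the tool names are from the source, but every field and status value is an assumption, not the actual interface.

```typescript
// Hypothetical contract for the OpenClaw -> Kaze boundary.
interface DispatchTaskRequest {
  vertical: string; // e.g. "internal-ops" (illustrative value)
  skill: string;    // which skill should handle the task
  input: string;    // natural-language task from the conversation
  channel: string;  // where the result should be delivered
}

interface DispatchTaskResponse {
  taskId: string;
  status: "queued" | "running" | "completed" | "failed";
}

interface KazeTools {
  kaze_list_verticals(): Promise<string[]>;
  kaze_dispatch_task(req: DispatchTaskRequest): Promise<DispatchTaskResponse>;
}
```

Keeping the surface this small means the conversation layer can evolve (or be replaced) without touching agent runtime, memory, or supervision internals.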
Document Map
| Document | Description |
|---|---|
| System Architecture | Component architecture — runtime, gateway, knowledge, agents, API contracts, data flows |
| Infrastructure | Kubernetes topology, CI/CD, GitOps, secrets management, sidecars, networking |
| Non-Functional Assessment | Security posture, threat model, cost model, scalability analysis |