Kaze vs Paperclip — Complete Comparison
Part of Project Kaze Architecture
Date: 2026-03-05
At a Glance
| Dimension | Kaze | Paperclip |
|---|---|---|
| What it is | Agent OS for SME clients (venture studio platform) | Control plane for autonomous AI companies |
| Stage | Architecture complete, core services implemented | Production-ready, full CRUD + governance |
| Language | TypeScript | TypeScript |
| Stack | Hono + Vercel AI SDK + Mem0 + pgvector | Express 5 + Drizzle ORM + PostgreSQL (or PGlite) |
| Built by | Speedrun Ventures (venture studio) | Paperclip AI (open-source) |
| License | Private | Open-source |
| Repo | Multi-repo (gateway, runtime, knowledge, agent-ops) | Monorepo (pnpm workspaces) |
1. Architecture Philosophy
| Aspect | Kaze | Paperclip |
|---|---|---|
| Pattern | Distributed services (gateway, runtime, knowledge) | Monolith (Express API + React UI + PostgreSQL) |
| Agent model | VerticalAgent (long-running) + SubAgent (ephemeral per task) | Heartbeat model (agents wake in short execution windows) |
| Execution | Continuous — agents run within runtime, gateway holds LLM loop | Discrete — agents wake, checkout task, do work, sleep |
| Secret handling | Gateway holds all secrets, runtime is credential-free | Agents receive env vars (API keys, endpoints) at invocation |
| Communication | OpenClaw handles all user-facing interaction | Task comments + dashboard UI (no chat) |
| Deployment | Containerized K8s + Tailscale | Docker or local; embedded PGlite for zero-config |
Key insight: Kaze is a distributed system where the gateway, runtime, and knowledge are separate services. Paperclip is a monolith where everything (API, UI, DB, agent invocation) lives in one process. Different trade-offs: Kaze has cleaner isolation but more operational complexity; Paperclip is simpler to deploy but harder to scale independently.
2. Agent Runtime
| Aspect | Kaze | Paperclip |
|---|---|---|
| Agent execution | Gateway runs LLM tool-use loop (Vercel AI SDK generateText + maxSteps) | Agents are external processes — Paperclip invokes them, doesn't run LLM calls |
| Supported runtimes | Claude or Gemini, reached only through the gateway (the single LLM access path) | Bring-your-own: Claude Code, Codex, OpenClaw, shell, HTTP webhook |
| Tool system | Built-in ToolRegistry (github_api, file_glob, file_read, git_clone, etc.) | No built-in domain tools — agents bring their own. Only Paperclip Skill for coordination |
| Skill definitions | YAML-based (prompt template + tool list per skill) | Markdown-based (SOUL.md, HEARTBEAT.md injected into agent context) |
| Agent lifecycle | Long-running VerticalAgent manages ephemeral SubAgents | Heartbeat: triggered by schedule, assignment, comment, or manual invocation |
Key insight: Kaze is the agent runtime — it calls the LLM, executes tools, manages the loop. Paperclip orchestrates external agent runtimes — it doesn't care what the agent does internally, only that it follows the heartbeat protocol. This is a fundamental difference: Kaze has deeper control over execution, Paperclip has broader runtime compatibility.
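To make "manages the loop" concrete, here is a minimal sketch of a bounded tool-use loop of the kind the Kaze gateway is described as running. It deliberately does not call the Vercel AI SDK: `ModelFn`, `runToolLoop`, and the stub model are hypothetical names for illustration. Only the idea of a step cap (analogous to `maxSteps`) and the `file_read` tool name come from the table above.

```typescript
// Hypothetical types: a model either requests a tool call or returns final text.
type ModelReply =
  | { kind: "tool_call"; tool: string; args: Record<string, string> }
  | { kind: "final"; text: string };

type Message = { role: "user" | "assistant" | "tool"; content: string };
type ModelFn = (history: Message[]) => ModelReply;
type Tool = (args: Record<string, string>) => string;

// Runs the loop until the model returns final text or the step cap is hit.
function runToolLoop(
  model: ModelFn,
  tools: Map<string, Tool>,
  prompt: string,
  maxSteps: number,
): { text: string | null; steps: number } {
  const history: Message[] = [{ role: "user", content: prompt }];
  for (let step = 1; step <= maxSteps; step++) {
    const reply = model(history);
    if (reply.kind === "final") {
      return { text: reply.text, steps: step };
    }
    const tool = tools.get(reply.tool);
    // Unknown tools are reported back to the model instead of throwing.
    const result = tool ? tool(reply.args) : `error: unknown tool ${reply.tool}`;
    history.push({ role: "assistant", content: `call ${reply.tool}` });
    history.push({ role: "tool", content: result });
  }
  // Step cap reached without a final answer.
  return { text: null, steps: maxSteps };
}

// Stub model (assumption, stands in for a real LLM call):
// asks for one file read, then answers using the tool result.
const stubModel: ModelFn = (history) => {
  const last = history[history.length - 1];
  return last !== undefined && last.role === "tool"
    ? { kind: "final", text: `summary of: ${last.content}` }
    : { kind: "tool_call", tool: "file_read", args: { path: "README.md" } };
};

const toolRegistry = new Map<string, Tool>([
  ["file_read", (args) => `contents of ${args.path}`],
]);

const loopResult = runToolLoop(stubModel, toolRegistry, "Summarize the README", 5);
```

In the Paperclip model this loop would live inside the external agent process, and the control plane would only see the heartbeat that wraps it.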
3. Multi-Tenancy
| Aspect | Kaze | Paperclip |
|---|---|---|
| Isolation unit | Vertical + tenant pair (VerticalAgent per combination) | Company (every entity scoped to company_id) |
| Data isolation | Per-agent knowledge via Mem0 userId/agentId scoping | Full company scoping — every table has company_id FK |
| Credential isolation | Gateway holds all secrets, scoped by config | Company secrets table (encrypted, versioned) |
| Org structure | Flat — skills within verticals | Hierarchical — agents have reports_to tree, company → goals → projects → tasks |
| Multi-company | Designed for multi-tenant but not yet implemented at DB level | First-class — unlimited companies per instance, complete isolation |
Key insight: Paperclip's multi-tenancy is production-complete (DB-level FK enforcement on every row). Kaze's is architected but relies on in-memory VerticalAgent scoping — the DB-level tenant isolation is not yet implemented.
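As an illustration of what company scoping on every row implies for the access layer, the sketch below uses an in-memory array in place of PostgreSQL. `TaskRow` and `CompanyScopedTasks` are hypothetical names, not Paperclip's schema. The point is only that no read path exists without a company identifier.

```typescript
// Hypothetical row shape: every entity carries the company it belongs to.
interface TaskRow {
  id: string;
  companyId: string; // stands in for the company_id foreign key
  title: string;
}

// In-memory stand-in for a table; a real system would enforce this in the database.
class CompanyScopedTasks {
  private rows: TaskRow[] = [];

  insert(row: TaskRow): void {
    this.rows.push(row);
  }

  // Every read requires a companyId, so there is no unscoped query path.
  list(companyId: string): TaskRow[] {
    return this.rows.filter((r) => r.companyId === companyId);
  }

  // Lookups by id are also scoped: another company's task behaves as "not found".
  get(companyId: string, id: string): TaskRow | undefined {
    return this.rows.find((r) => r.companyId === companyId && r.id === id);
  }
}

const taskStore = new CompanyScopedTasks();
taskStore.insert({ id: "t1", companyId: "acme", title: "Draft report" });
taskStore.insert({ id: "t2", companyId: "globex", title: "Review PR" });
```

Scoping that lives only in application memory, as in Kaze's current VerticalAgent approach, gives the same behaviour as long as every caller goes through the scoped layer. A database-level foreign key removes that condition.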
4. Orchestration & Task Management
| Aspect | Kaze | Paperclip |
|---|---|---|
| Task dispatch | OpenClaw → Runtime → SubAgent → Gateway → result | Board/agent creates issue → agent checkout → heartbeat work → status update |
| Concurrency control | Single SubAgent per task (ephemeral, dies after completion) | Atomic checkout — 409 Conflict if two agents claim same task |
| Task hierarchy | Flat (skill invocation, no parent-child tasks) | Deep — company goal → project → parent issue → subtask chain |
| Scheduling | Not yet implemented (planned: task scheduler) | Built-in — cron-based heartbeat triggers, event-driven invocation |
| Run tracking | Observation logger (planned) | heartbeatRuns table — status, timing, events, error, external_run_id |
| Goal alignment | Implicit (skill prompt carries context) | Explicit — every task traces ancestry back to company goal |
Key insight: Paperclip has a complete project management system (goals, projects, issues, comments, labels, milestones). Kaze delegates task management to OpenClaw and focuses on skill execution. Paperclip's agents always know why they're doing something (goal ancestry); Kaze agents know what to do (skill prompt).
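The atomic checkout in the table can be sketched as a check-and-set on a claim map. This is an in-memory illustration with hypothetical names (`TaskBoard`, `CheckoutResult`), not Paperclip's API. Treating a repeat checkout by the same agent as a success is an assumption made here.

```typescript
type CheckoutResult =
  | { status: 200; taskId: string; agentId: string }
  | { status: 409; taskId: string; heldBy: string };

class TaskBoard {
  // taskId -> agentId currently holding the task
  private claims = new Map<string, string>();

  // Check and set happen in one synchronous step, so two callers cannot both win.
  // A database-backed version would need a conditional UPDATE or a unique constraint.
  checkout(taskId: string, agentId: string): CheckoutResult {
    const holder = this.claims.get(taskId);
    if (holder !== undefined && holder !== agentId) {
      return { status: 409, taskId, heldBy: holder };
    }
    // Assumption: a repeat checkout by the current holder is idempotent.
    this.claims.set(taskId, agentId);
    return { status: 200, taskId, agentId };
  }

  // Only the current holder can release a task.
  release(taskId: string, agentId: string): boolean {
    if (this.claims.get(taskId) !== agentId) return false;
    return this.claims.delete(taskId);
  }
}

const board = new TaskBoard();
const firstClaim = board.checkout("issue-1", "agent-a");
const secondClaim = board.checkout("issue-1", "agent-b");
```

Here `firstClaim` succeeds and `secondClaim` comes back as a 409 naming `agent-a` as the holder. Kaze avoids the race differently, by creating exactly one ephemeral SubAgent per task.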
5. Supervision & Governance
| Aspect | Kaze | Paperclip |
|---|---|---|
| Approval model | Per-skill supervision ramp (supervised → sampling → autonomous) | Per-action approval gates (hire agent, CEO strategy, extensible) |
| Promotion | Automatic — earned through track record (20/50 success thresholds) | Manual — board approves/rejects explicitly |
| Scope | Granular per-skill (github-read vs github-write have different ramps) | Coarse per-action-type (hiring, strategy — not per-tool) |
| Demotion | Automatic — 3 consecutive failures drops to supervised | Manual — board can pause any agent at any time |
| Audit | Planned (observation logger) | Complete — activityLog table, immutable, every mutation tracked |
| Config versioning | Not implemented | agentConfigRevisions — snapshot every config change with rollback |
Key insight: Kaze's supervision ramp is more automated and granular (per-skill, auto-promote/demote). Paperclip's governance is more human-driven (board approvals, manual controls). Kaze optimizes for reducing human overhead over time; Paperclip optimizes for human oversight always being available.
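A per-skill supervision ramp of this kind can be sketched as a small state machine. The reading of "20/50" as 20 successes to reach sampling and 50 to reach autonomous is an assumption, as is resetting the success count on demotion. `RampState` and `recordOutcome` are hypothetical names, not Kaze's implementation.

```typescript
type Level = "supervised" | "sampling" | "autonomous";

interface RampState {
  level: Level;
  successes: number; // successes since the last demotion (assumption)
  consecutiveFailures: number;
}

const SAMPLING_AT = 20; // assumed meaning of the "20" threshold
const AUTONOMOUS_AT = 50; // assumed meaning of the "50" threshold
const DEMOTE_AFTER = 3; // consecutive failures, from the comparison table

function initialRamp(): RampState {
  return { level: "supervised", successes: 0, consecutiveFailures: 0 };
}

// Returns the next state; the input state is not mutated.
function recordOutcome(state: RampState, success: boolean): RampState {
  if (!success) {
    const failures = state.consecutiveFailures + 1;
    // Demotion drops straight back to supervised and clears the track record.
    if (failures >= DEMOTE_AFTER) return initialRamp();
    return { ...state, consecutiveFailures: failures };
  }
  const successes = state.successes + 1;
  const level: Level =
    successes >= AUTONOMOUS_AT
      ? "autonomous"
      : successes >= SAMPLING_AT
        ? "sampling"
        : "supervised";
  // Any success breaks a failure streak.
  return { level, successes, consecutiveFailures: 0 };
}
```

Keeping one `RampState` per skill is what makes the ramp granular: under this sketch a github-read skill could sit at autonomous while github-write is still supervised.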
6. Budget & Cost Control
| Aspect | Kaze | Paperclip |
|---|---|---|
| Cost tracking | Not yet implemented (architecture references it) | Production — costEvents table per token/cost event |
| Granularity | Planned: per-tenant, per-vertical | Per-agent, per-company, per-project, per-goal, per-billing-code |
| Budget enforcement | Not yet implemented | Hard-stop — agent auto-paused when spend >= budget |
| Alerting | Not yet implemented | 80% utilization soft threshold |
| Monthly reset | Not yet implemented | Automatic calendar-based UTC reset |
| Model selection | Gateway modelHint (fast/balanced/best) | Agent-level adapter config chooses model |
Key insight: Paperclip has production-grade cost control that Kaze hasn't built yet. This is one of the clearest gaps — any multi-tenant agent platform needs per-tenant budget enforcement, and Paperclip's implementation is a strong reference.
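The enforcement rules in the table (hard stop at 100% of budget, soft alert at 80%, monthly UTC reset) fit in one function. The sketch below is an assumption-laden illustration, not Paperclip's code: `AgentBudget`, `recordCost`, integer cents, and lifting the pause on reset are all choices made here.

```typescript
interface AgentBudget {
  monthlyBudgetCents: number;
  spentCents: number;
  period: string; // UTC month key such as "2026-03"
  paused: boolean;
}

type BudgetSignal = "ok" | "soft_alert" | "paused";

const SOFT_THRESHOLD_PERCENT = 80;

// Builds a calendar-month key from the UTC fields of a date.
function utcMonthKey(at: Date): string {
  const month = String(at.getUTCMonth() + 1).padStart(2, "0");
  return `${at.getUTCFullYear()}-${month}`;
}

// Records one cost event and reports the resulting budget signal.
function recordCost(budget: AgentBudget, costCents: number, at: Date): BudgetSignal {
  const period = utcMonthKey(at);
  if (period !== budget.period) {
    // New UTC month: start a fresh spending window.
    budget.period = period;
    budget.spentCents = 0;
    budget.paused = false; // assumption: a budget pause lifts on reset
  }
  budget.spentCents += costCents;
  if (budget.spentCents >= budget.monthlyBudgetCents) {
    budget.paused = true; // hard stop
    return "paused";
  }
  // Integer comparison avoids floating-point rounding at the threshold.
  const overSoft =
    budget.spentCents * 100 >= SOFT_THRESHOLD_PERCENT * budget.monthlyBudgetCents;
  return overSoft ? "soft_alert" : "ok";
}
```

For Kaze, the same check could plausibly sit in the gateway, since every LLM call already passes through it. That would mean keying the budget by tenant and vertical rather than by agent.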
7. Knowledge & Memory
| Aspect | Kaze | Paperclip |
|---|---|---|
| Memory system | Dedicated service (Mem0 + pgvector + Google embeddings) | None — no RAG, no vector search |
| Fact extraction | Automatic — Mem0 extracts facts from conversations | None — context flows via task ancestry only |
| Semantic search | Yes — per-agent vector similarity search | None |
| Document ingestion | Batch pipeline (docling + batch embedder) | None |
| Context injection | OpenClaw before_agent_start hook queries knowledge | Task description + parent chain + workspace files |
Key insight: Knowledge/memory is Kaze's strongest differentiator. Paperclip has zero built-in knowledge management — agents only know what's in their task description and ancestry. Kaze agents accumulate understanding over time via semantic memory.
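Per-agent semantic search reduces to a filter on the agent scope followed by a similarity ranking. The sketch below uses hand-written toy vectors and an in-memory array. In Kaze the embeddings and storage are described as coming from Mem0 and pgvector, neither of which is called here, and `MemoryRecord` and `searchMemory` are hypothetical names.

```typescript
interface MemoryRecord {
  agentId: string;
  fact: string;
  embedding: number[];
}

// Cosine similarity; missing components are treated as zero.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scope first, then rank: facts from other agents never enter the candidate set.
function searchMemory(
  records: MemoryRecord[],
  agentId: string,
  queryEmbedding: number[],
  topK: number,
): string[] {
  return records
    .filter((r) => r.agentId === agentId)
    .map((r) => ({ fact: r.fact, score: cosineSimilarity(r.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((r) => r.fact);
}

// Toy data: the three-dimensional vectors are made up for the example.
const memory: MemoryRecord[] = [
  { agentId: "sales", fact: "Client prefers weekly reports", embedding: [1, 0, 0] },
  { agentId: "sales", fact: "Invoice due on the 15th", embedding: [0, 1, 0] },
  { agentId: "support", fact: "Client uses Slack", embedding: [1, 0, 0] },
];

const topFact = searchMemory(memory, "sales", [0.9, 0.1, 0], 1);
```

A Paperclip agent would have to bring an equivalent store with it, since the control plane passes only the task description, parent chain, and workspace files.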
8. Communication & UI
| Aspect | Kaze | Paperclip |
|---|---|---|
| User interaction | OpenClaw (Slack, Telegram, WhatsApp, CLI, WebChat) | React dashboard + CLI client |
| Agent communication | Via OpenClaw channels (conversational) | Issue comments (task-oriented, threaded) |
| Real-time | OpenClaw WebSocket streaming | WebSocket live events for dashboard |
| Agent-to-agent | Not implemented (planned) | Delegation via subtask creation + assignment |
| MCP support | Considered (current discussion) | Planned — 35 MCP operations defined for task management |
Key insight: Kaze leans on OpenClaw for a conversational interface (chat-first). Paperclip is task-board-first (more like Linear/Jira for agents). Different UX paradigms — Kaze feels like talking to a colleague, Paperclip feels like managing a team.
9. Deployment & Operations
| Aspect | Kaze | Paperclip |
|---|---|---|
| Deployment | 4 separate services (gateway, runtime, knowledge, OpenClaw) on K8s | Single process (Express + embedded PGlite) or Docker |
| Database | PostgreSQL + pgvector (external) | PostgreSQL (external) or PGlite (embedded, zero-config) |
| Setup complexity | High — multiple services, Tailscale, container orchestration | Low — npx paperclipai onboard --yes one-command setup |
| Auth | Not yet implemented | Better Auth (sessions + API keys), board-claim flow |
| Modes | Production only (no local dev mode yet) | local_trusted / authenticated+private / authenticated+public |
10. Portability & Ecosystem
| Aspect | Kaze | Paperclip |
|---|---|---|
| Template system | YAML skill definitions per vertical | Company templates — export/import entire orgs |
| Marketplace | None | ClipHub (planned) — share company templates publicly |
| Agent portability | Skills are YAML, but tied to Kaze's tool ecosystem | Fully portable — any runtime (Claude Code, Codex, OpenClaw, shell, HTTP) |
| Plugin system | None (skills are the extensibility unit) | Planned — embed custom plugins for reporting, knowledge |
Summary: Where Each Excels
Paperclip is stronger at:
- Governance — complete approval flows, audit trails, config versioning
- Cost control — production-grade per-agent budget enforcement with auto-pause
- Multi-tenancy — DB-level company scoping on every row
- Deployment simplicity — single process, embedded DB, one-command setup
- Runtime flexibility — any agent runtime works (Claude Code, Codex, OpenClaw, shell)
- Task management — full project hierarchy with goal ancestry
- Portability — export/import company templates
Kaze is stronger at:
- Knowledge/memory — semantic search, fact extraction, document ingestion (Paperclip has none)
- Supervision automation — per-skill ramps with auto-promote/demote
- LLM control — direct tool-use loop management, model selection hints
- Credential isolation — zero-secret runtime, gateway holds everything
- Conversational UX — OpenClaw multi-channel chat vs task-board UI
- Skill system — declarative YAML skills with prompt templating
Neither has yet:
- Production MCP support (planned in Paperclip, still under discussion for Kaze)
- Full agent-to-agent communication (Paperclip offers only delegation via subtasks; Kaze has nothing yet)
- Self-improvement / meta-learning loops
Strategic Assessment
Paperclip and Kaze solve the same meta-problem (orchestrating AI agent workforces) but from opposite directions:
- Paperclip is the org chart — it models companies, hierarchies, budgets, approvals. Agents are black boxes that follow a protocol.
- Kaze is the agent brain — it models skills, knowledge, supervision ramps. The org structure is delegated to OpenClaw.
The most interesting question: should Kaze adopt Paperclip's governance/cost model? Paperclip's costEvents, activityLog, approvals, and company-scoped isolation are exactly what Kaze needs but hasn't built. Rather than reimplementing, Kaze could:
1. Use Paperclip as the control plane: Paperclip manages companies, budgets, and task dispatch. Kaze agents register as Paperclip adapters (as OpenClaw already can). Knowledge and supervision stay in Kaze.
2. Port Paperclip's patterns: copy the DB schema patterns (cost_events, activity_log, company scoping) into Kaze's services.
3. Stay independent: build governance from scratch within Kaze's architecture.
Option 1 offers the most leverage for the least work, but adds a dependency on Paperclip. Option 2 is the pragmatic middle ground. Option 3 takes the most work and gives the most control.