Skip to content

Threat Model & Security Assessment

Research for Project Kaze


1. System Boundaries

Kaze operates across multiple trust boundaries depending on deployment mode:

                    TRUST BOUNDARY: Internet
┌───────────────────────────────────────────────────────────┐
│                                                           │
│   TRUST BOUNDARY: Speedrun Infrastructure                 │
│  ┌─────────────────────────────────────────────────────┐ │
│  │  Agency SaaS (multi-tenant)                          │ │
│  │                                                      │ │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐          │ │
│  │  │ Tenant A │  │ Tenant B │  │ Tenant C │          │ │
│  │  │ (cell)   │  │ (cell)   │  │ (cell)   │          │ │
│  │  └──────────┘  └──────────┘  └──────────┘          │ │
│  │                                                      │ │
│  │  Shared: LLM Gateway, Agent Runtime Pool             │ │
│  └─────────────────────────────────────────────────────┘ │
│                                                           │
│   TRUST BOUNDARY: Customer VPC                            │
│  ┌─────────────────────────────────────────────────────┐ │
│  │  Full Kaze stack (single-tenant cell)                │ │
│  │  Client data never leaves this boundary              │ │
│  │  Speedrun access only via VPN (on-demand)            │ │
│  └─────────────────────────────────────────────────────┘ │
│                                                           │
│   TRUST BOUNDARY: LLM Providers                           │
│  ┌─────────────────────────────────────────────────────┐ │
│  │  Anthropic · OpenAI · Google · Ollama (local)        │ │
│  │  Data sent for inference — no opt-out of processing  │ │
│  └─────────────────────────────────────────────────────┘ │
│                                                           │
│   TRUST BOUNDARY: External Tools                          │
│  ┌─────────────────────────────────────────────────────┐ │
│  │  SEMrush · GitHub · Google Calendar · Toddle DB      │ │
│  │  Client data may flow to these services              │ │
│  └─────────────────────────────────────────────────────┘ │
│                                                           │
│   TRUST BOUNDARY: OpenClaw                                │
│  ┌─────────────────────────────────────────────────────┐ │
│  │  Channels: Slack · WhatsApp · Telegram · Discord     │ │
│  │  User messages flow through channel providers        │ │
│  └─────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────┘

2. Threat Actors

ActorCapabilityMotivationLikelihood
External attackerNetwork access, phishing, credential theftData theft, ransom, disruptionMedium
Malicious tenant (agency model)Authenticated access to own cell, standard API accessAccess other tenants' data, exceed resource quotas, abuse shared infrastructureMedium
Compromised LLM providerAccess to all prompts/responses sent for inferenceData harvesting, model poisoningLow
Compromised external toolAccess to data sent via API integrationsCredential theft, data exfiltrationLow-Medium
Malicious or buggy agentTool access, knowledge write access, LLM callsData exfiltration, knowledge poisoning, resource exhaustionMedium
Insider (Speedrun operator)Infrastructure access, Vault access, deployment privilegesData theft, unauthorized accessLow
Supply chain attackerCompromised dependency, container image, or pluginBackdoor, data theftLow-Medium

3. Attack Surfaces

3.1 Prompt Injection

Threat: Agents process user input, knowledge graph content, and tool outputs — all potential injection vectors. A crafted input could manipulate agent behavior.

Attack paths:

  • User sends malicious message via channel → agent follows injected instructions
  • Poisoned knowledge entry retrieved during agent reasoning → agent acts on false knowledge
  • External tool returns adversarial content → agent processes as trusted data

Current state: Not addressed.

Mitigations needed:

  • Input sanitization layer before agent processing
  • Separate "trusted" (system prompt, skill definitions) from "untrusted" (user messages, tool outputs, knowledge) in context
  • Instruction hierarchy: system prompt > skill definition > knowledge > user input
  • Output validation for sensitive operations (financial, data deletion)

3.2 Tenant Isolation (Agency Model)

Threat: In shared-cell deployments, one tenant's agent accesses another tenant's data, keys, or knowledge.

Attack paths:

  • Namespace escape in shared K8s cluster
  • Agent manipulates knowledge query to bypass tenant scoping
  • Shared LLM Gateway leaks context between requests
  • Shared Agent Runtime pool bleeds state between tenant tasks

Current state: Partially addressed — namespace isolation documented, stateful components isolated per-tenant.

Mitigations needed:

  • Tenant ID enforced at database query layer (every query includes tenant_id filter, not just application logic)
  • LLM Gateway must flush all state between requests from different tenants
  • Memory isolation: agent private state wiped between tenant context switches in shared runtime
  • Network policies per namespace verified and tested

3.3 Knowledge Poisoning

Threat: An agent writes false or manipulated knowledge into the shared knowledge graph, which propagates to other agents and tenants.

Attack paths:

  • Agent hallucinates confidently → writes to shared tier → quality gate passes (LLM-as-judge fooled)
  • Adversarial user provides false information → agent stores it → becomes "fact" in knowledge system
  • Compromised agent deliberately seeds misinformation

Current state: Quality gate mentioned (LLM-as-judge) but no defense-in-depth.

Mitigations needed:

  • Multi-signal quality gate (not LLM-only): source verification, cross-reference with existing knowledge, confidence scoring
  • Rate limiting on shared tier writes (flag agents that write excessively)
  • Provenance chain: every shared knowledge entry traces back to originating observation, not just agent
  • Quarantine period for shared knowledge (visible after N hours or human review, not immediately)
  • Rollback capability: if poisoned knowledge is detected, trace and revert all downstream effects

3.4 LLM Data Exposure

Threat: Sensitive client data sent to LLM providers for inference becomes accessible to the provider or is used for training.

Attack paths:

  • Agent sends client PII/financial data in prompts
  • Provider uses data for model training (opt-out policies vary)
  • Provider breach exposes inference logs
  • Prompt caching across tenants at provider level

Current state: Data classification exists for VPC observability but not for LLM call content.

Mitigations needed:

  • Data classification tags on knowledge entries: "safe for LLM" vs "internal only"
  • PII detection before LLM calls — strip or redact sensitive fields
  • Provider selection policy: sensitive data → only providers with zero-retention agreements
  • Local model option (Ollama) for highest-sensitivity operations
  • Audit trail of what data was sent to which provider

3.5 Credential Theft & Key Compromise

Threat: LLM API keys, tool credentials, or Vault access tokens are stolen.

Attack paths:

  • Vault compromise → all keys exposed
  • Agent logs credentials in observation logger (accidental)
  • LLM provider key in prompts/responses (accidental leakage)
  • CI/CD pipeline exposes secrets
  • Client key leaked → attacker uses it via Kaze or directly

Current state: Vault documented, tenant-scoped access policies mentioned.

Mitigations needed:

  • Key rotation policy: automated rotation on schedule, immediate rotation on suspected compromise
  • Secret scanning in observation logs (detect and redact credentials before storage)
  • Short-lived credentials where possible (OAuth tokens vs long-lived API keys)
  • Key usage anomaly detection (sudden spike in usage from a key → alert + auto-freeze)
  • Blast radius containment: if one client key is compromised, only that client's agents affected

3.6 Agent Privilege Escalation

Threat: An agent accesses tools, knowledge, or actions beyond its intended scope.

Attack paths:

  • Agent discovers tools outside its vertical via registry
  • Agent manipulates its own supervision level (writes to supervision_ramp_stats)
  • Agent spawns subagents with elevated privileges
  • Agent writes procedural knowledge that changes other agents' behavior

Current state: Tool filtering by vertical mentioned, supervision ramp documented.

Mitigations needed:

  • Agent capability manifest (whitelist of tools + knowledge domains per agent, enforced at runtime)
  • Supervision state is read-only to agents — only the platform can promote/demote
  • Subagent privilege inheritance: child agents cannot exceed parent's capability set
  • Knowledge write scopes enforced at system level, not agent self-declaration

3.7 Resource Exhaustion

Threat: Agent or tenant consumes disproportionate resources, affecting other tenants or platform stability.

Attack paths:

  • Agent enters infinite tool-calling loop → burns token budget
  • Agent spawns unlimited subagents → exhausts compute
  • Knowledge system flooded with writes → storage exhaustion
  • LLM Gateway request queue saturated by one tenant

Current state: Budget enforcement documented (hard stops), task timeouts mentioned.

Mitigations needed:

  • Per-agent resource quotas: max concurrent tasks, max tool calls per task, max knowledge writes per hour
  • Per-tenant compute quotas: max agents, max total tokens/day
  • Circuit breaker on tool call loops (detect repeated identical calls, break after N)
  • Queue fairness: per-tenant request queuing in LLM Gateway (no single tenant can starve others)

3.8 Supply Chain Attacks

Threat: Compromised dependency, container image, or OpenClaw plugin introduces backdoor.

Attack paths:

  • Malicious npm package in dependency tree
  • Compromised base image in container build
  • OpenClaw plugin with malicious hook (40+ plugins, varying provenance)
  • GitOps pipeline compromised → malicious update pushed to customer VPC

Current state: Signed images and reproducible builds mentioned for customer VPC.

Mitigations needed:

  • Dependency scanning in CI (Snyk, Trivy, or similar)
  • Container image scanning before deployment
  • Pin all dependency versions, review updates manually
  • OpenClaw plugin audit: only use vetted plugins, lock versions
  • GitOps: signed commits required, approval gate before deployment to any environment

3.9 Data Exfiltration via Agents

Threat: An agent (compromised or manipulated) sends client data to unauthorized external endpoints.

Attack paths:

  • Agent calls external tool with client data embedded in parameters
  • Agent generates output containing client data → sent via channel to unauthorized recipient
  • Agent writes client data to shared knowledge tier → visible to other tenants

Current state: Not addressed beyond "client knowledge never leaves cell."

Mitigations needed:

  • Egress filtering: agents can only reach whitelisted external endpoints (per tenant, per vertical)
  • Tool output scanning: detect client data in tool call parameters before sending
  • Channel output review: for supervised agents, outbound messages go through approval
  • Network policies enforcing egress restrictions at K8s level

3.10 Insider Threat

Threat: Speedrun team member with infrastructure access abuses their position.

Attack paths:

  • Direct database access to read client data
  • Vault access to read client API keys
  • VPN into customer VPC for unauthorized data access
  • Modify agent behavior to exfiltrate data

Current state: "Speedrun operators have no access to client secrets in customer VPC" mentioned but no enforcement.

Mitigations needed:

  • Principle of least privilege for all Speedrun team access
  • VPN sessions logged, time-limited (4hr max), require ticket justification
  • Vault audit logging: every secret read is logged with operator identity
  • Database access via bastion host with session recording
  • Separation of duties: no single person can deploy + access production secrets

4. Security Properties Required

4.1 Confidentiality

PropertyScopePriority
Client data isolated between tenantsAgency modelCritical
Client data never leaves customer VPCVPC deploymentsCritical
LLM API keys not exposed to agentsAllCritical
PII not sent to LLM providers unless consentedAllHigh
Operator access to client data is audited and justifiedAllHigh
Shared knowledge contains no client-specific dataAgency modelHigh

4.2 Integrity

PropertyScopePriority
Knowledge graph entries are provenance-tracked and tamper-evidentAllCritical
Agent behavior cannot be self-modified without canary + rollbackAllCritical
Supervision levels cannot be escalated by agents themselvesAllCritical
Audit logs are immutable (append-only, no modifications)AllHigh
Deployment artifacts are signed and verifiedCustomer VPCHigh

4.3 Availability

PropertyScopePriority
Single tenant failure does not affect other tenantsAgency modelCritical
Budget exhaustion stops the agent, not the platformAllCritical
External tool failure is isolated (retry, fallback, not cascade)AllHigh
LLM provider outage triggers fallback, not system failureAllHigh

4.4 Non-Repudiation

PropertyScopePriority
Every agent action is attributed to a specific agent + tenant + taskAllCritical
Every knowledge write has author identity and timestampAllCritical
Every LLM call is logged with provider, model, tokens, costAllHigh
Every supervision decision (approve/reject/modify) is recordedAllHigh

5. MVP Security Scope

What must be in place for MVP vs what can be deferred:

MVP (must have)

  • Tenant ID enforcement at database query layer
  • LLM API keys in Vault, never in agent code or logs
  • Per-agent, per-tenant token budget with hard stops
  • Agent capability manifest (whitelist of tools + knowledge per agent)
  • Basic secret scanning in observation logs
  • Task timeouts and tool call loop detection
  • Audit trail for all agent actions (observation logger)

Phase 2 (important but not blocking)

  • PII detection before LLM calls
  • Egress filtering per tenant
  • Key rotation policy
  • OpenClaw plugin security audit
  • Dependency scanning in CI
  • Network policies per namespace (K8s)
  • Data classification tags on knowledge entries

Phase 3+ (compliance & scale)

  • SOC 2 / ISO 27001 readiness
  • Formal incident response procedure
  • Penetration testing program
  • Signed GitOps deployments
  • Session recording for infrastructure access
  • Customer-facing security documentation

6. Open Questions

#QuestionImpact
S1What authentication system for users/tenants? (OAuth2/OIDC, API keys, SSO?)High — foundational
S2What's the minimum LLM provider data retention agreement acceptable?High — client trust
S3How do we handle PII detection at scale? (regex patterns, dedicated model, third-party?)Medium — compliance
S4Should shared knowledge have a quarantine period before becoming visible?Medium — knowledge integrity
S5What's the incident response playbook if a cell is compromised?Medium — operational readiness