Dispatch2026-04-10·5 min

P0 red-team audit fixes shipped in v0.3

Four P0 fixes from the v0.3 red-team audit: prompt injection guards on all agent surfaces, global spend cap enforcement at $500/day, LLM backpressure with 30-concurrency semaphore, and memory key allowlist preventing arbitrary writes.

v0.3 shipped four P0 fixes from our red-team audit. These were the highest-severity findings — issues that could cause financial loss, data leakage, or runaway costs.

P0-1: Prompt injection guards. All user input reaching the LLM is now wrapped with explicit role boundaries and system-level instructions that cannot be overridden by user messages. Attempts to inject commands like 'ignore previous instructions' are filtered before reaching the model.

P0-2: Global spend cap enforcement. The $500/day LLM budget is now enforced at the gateway level with a Redis counter. Once the cap is hit, new LLM requests return a graceful 'budget exceeded' response instead of continuing to burn Anthropic credits.

P0-3: LLM backpressure. A 30-concurrency semaphore now limits simultaneous Claude API calls. Requests above the limit queue with a 15-second acquisition timeout. This prevents API rate limit exhaustion during traffic spikes.

P0-4: Memory key allowlist. Agent-initiated memory writes are restricted to a curated list of allowed keys (diet, interests, birthday, etc.). Previously, an attacker who controlled a venue agent could write arbitrary keys to user memory. Now only approved keys pass validation.

Action required for developers: no API changes. All fixes are backend-only. If your integration uses the /agent/memory endpoint to write facts, make sure your keys match the allowlist in soul_loader.py.

Submit to

Public submit links. No API keys. Opens in a new tab with the title and URL pre-filled.

Hacker News X / Twitter Reddit LinkedIn Threads

P0 red-team audit fixes shipped in v0.3

Submit to

Copy and paste

Build on the same network.