Security
This document covers KeyPool's security model, the hardening measures in place, and operational guidance for maintaining a secure deployment.
Threat Model
KeyPool is an Internet-facing proxy that holds upstream API credentials. The primary threats are:
- Credential theft — Attacker extracts upstream API keys via the proxy
- Open proxy / SSRF — Attacker uses KeyPool to reach arbitrary hosts with attached credentials
- Token compromise — Team token leaks, granting unauthorized API access
- Admin takeover — Admin secret leaks, granting full control plane access
- Data exfiltration — D1 database leak exposes token hashes, usage data, member info
Authentication
Team Tokens
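Team tokens authenticate SDK requests. They follow the format nxkp-{uuid}-{uuid} and are high-entropy: assuming version-4 UUIDs, each UUID carries 122 random bits, so a token built from two of them has up to 244 bits of randomness.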
Storage: Tokens are never stored in plaintext. On creation, the raw token is hashed and only the hash is persisted in D1. The raw token is returned once and cannot be retrieved again.
Hashing: When TOKEN_PEPPER is set (recommended), tokens are hashed with HMAC-SHA-256 using the pepper as the key. This prevents offline token verification if D1 is compromised — the attacker would also need the pepper, which is stored separately as a Wrangler secret.
Without TOKEN_PEPPER, tokens are hashed with plain SHA-256. This is still safe against brute-force (tokens are high-entropy UUIDs), but lacks defense-in-depth against database leaks.
Legacy migration: When TOKEN_PEPPER is enabled, the auth middleware automatically detects tokens hashed with plain SHA-256 and migrates them to HMAC-SHA-256 on first use. No manual re-hashing is needed.
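The hashing and legacy-fallback behavior described above can be sketched as follows. This is a minimal illustration, not KeyPool's actual code: the function names hashToken and lookupCandidates, the hex encoding, and the use of Node's node:crypto module (rather than the Web Crypto API a Worker would more likely use) are all assumptions.

```typescript
import { createHash, createHmac } from "node:crypto";

// Hash a raw team token for storage or lookup (hypothetical helper).
// With a pepper: HMAC-SHA-256 keyed by the pepper. Without: plain SHA-256.
// Hex output is an assumption; the source does not state the encoding.
function hashToken(rawToken: string, pepper?: string): string {
  if (pepper) {
    return createHmac("sha256", pepper).update(rawToken).digest("hex");
  }
  return createHash("sha256").update(rawToken).digest("hex");
}

// Hashes to try during lookup, in order (hypothetical helper).
// When a pepper is set, the plain SHA-256 hash is kept as a legacy fallback
// so that an old row can still be found and then rewritten with the HMAC hash.
function lookupCandidates(rawToken: string, pepper?: string): string[] {
  const legacy = hashToken(rawToken);
  return pepper ? [hashToken(rawToken, pepper), legacy] : [legacy];
}
```

A lookup that matches on the second candidate is the signal that the stored row still holds a legacy hash and should be migrated.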
Accepted auth methods:
- Authorization: Bearer <token> (preferred)
- x-api-key: <token> header
- xi-api-key: <token> header (for SDK compatibility, e.g. ElevenLabs)
Query-parameter auth (?api_key=) was removed because query strings leak into logs, analytics, referrer headers, and browser history.
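As a rough sketch of how the three accepted headers might be checked in order of preference, assuming header names have already been lower-cased (the function name extractToken and the plain-object header map are illustrative, not taken from the codebase):

```typescript
// Extract a team token from request headers (hypothetical helper).
// Assumes header names are already lower-cased, as they are when iterating
// a Fetch API Headers object.
function extractToken(headers: Record<string, string | undefined>): string | null {
  const auth = headers["authorization"];
  if (auth) {
    // Preferred form: "Authorization: Bearer <token>"
    const match = /^Bearer\s+(\S+)$/i.exec(auth.trim());
    if (match) return match[1];
  }
  // Fallback headers; query parameters are deliberately never consulted.
  return headers["x-api-key"] || headers["xi-api-key"] || null;
}
```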
Admin Authentication
The admin API (/admin/*) is protected by a single ADMIN_TOKEN secret compared using constant-time byte comparison to prevent timing attacks.
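To illustrate what a constant-time byte comparison looks like, here is a minimal sketch. The function name constantTimeEqual is hypothetical, and JavaScript engines do not strictly guarantee timing behavior, so a platform-provided primitive is preferable where one is available.

```typescript
// Compare two secrets without returning early on the first mismatch
// (hypothetical helper, not KeyPool's actual implementation).
function constantTimeEqual(a: string, b: string): boolean {
  const enc = new TextEncoder();
  const x = enc.encode(a);
  const y = enc.encode(b);
  // Fold the length difference into the accumulator instead of bailing out.
  let diff = x.length ^ y.length;
  const n = Math.max(x.length, y.length);
  for (let i = 0; i < n; i++) {
    // Out-of-range reads yield undefined, which is treated as 0.
    diff |= (x[i] ?? 0) ^ (y[i] ?? 0);
  }
  return diff === 0;
}
```

The loop always runs over the longer input, so the time taken does not reveal the position of the first differing byte.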
Recommendations:
- Put /admin/* behind Cloudflare Access or Zero Trust for identity-based access control
- Use Cloudflare WAF rate limiting rules on /admin/* to prevent brute-force
- Rotate ADMIN_TOKEN periodically via wrangler secret put ADMIN_TOKEN
KV Token Cache
Authenticated tokens are cached in KV for 5 minutes to avoid D1 lookups on every request. The cached payload excludes PII fields (member_email) — only auth-relevant fields (ID, scopes, policy, quotas, expiry, revocation state) are stored.
Cache invalidation happens on token update, rotation, and revocation via the admin API.
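One way to keep PII out of the cache is to build the cached payload from an explicit allowlist of fields. The sketch below assumes a hypothetical TokenRow shape; only member_email is a field name taken from this document, the rest are placeholders for the auth-relevant fields listed above.

```typescript
// Hypothetical shape of a token row as read from D1.
interface TokenRow {
  id: number;
  member_email: string; // PII, must not reach the cache
  scopes: string[];
  policy: string;
  quota_per_hour: number | null;
  quota_per_day: number | null;
  expires_at: number | null;
  revoked: boolean;
}

type CachedToken = Omit<TokenRow, "member_email">;

// 5-minute TTL, as stated above.
const CACHE_TTL_SECONDS = 5 * 60;

// Copy fields one by one so that a newly added PII column is never cached
// by accident.
function toCacheEntry(row: TokenRow): CachedToken {
  return {
    id: row.id,
    scopes: row.scopes,
    policy: row.policy,
    quota_per_hour: row.quota_per_hour,
    quota_per_day: row.quota_per_day,
    expires_at: row.expires_at,
    revoked: row.revoked,
  };
}
```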
Proxy Security
SSRF Prevention
The proxy constructs upstream URLs by appending the request path to the service's configured base_url. Multiple layers prevent SSRF:
- Path validation: Rejects paths starting with // (scheme-relative override), containing \ (backslash normalization), or null bytes
- Safe URL construction: Uses pathname assignment on a URL object constructed from base_url, rather than new URL(path, base), which allows origin override
- Post-construction host check: Verifies the final URL's host matches the expected base_url host
- No redirect following: Setting redirect: 'manual' prevents upstream 302 redirects to attacker-controlled hosts
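The layers above can be sketched roughly as follows. The function name buildUpstreamUrl and the example hosts are hypothetical, and the sketch assumes the request path carries no query string (a real proxy would copy the query separately).

```typescript
// Build the upstream URL for a request path (hypothetical helper).
function buildUpstreamUrl(baseUrl: string, requestPath: string): URL {
  // Layer 1: reject paths that could override the origin or confuse parsers.
  if (
    requestPath.startsWith("//") ||
    requestPath.includes("\\") ||
    requestPath.includes("\0")
  ) {
    throw new Error("invalid path");
  }

  // Layer 2: assign pathname on a URL built from base_url. Unlike
  // new URL(path, base), this cannot change the scheme or host.
  const url = new URL(baseUrl);
  const basePath = url.pathname.replace(/\/+$/, "");
  url.pathname = basePath + (requestPath.startsWith("/") ? "" : "/") + requestPath;

  // Layer 3: verify the host did not change.
  if (url.host !== new URL(baseUrl).host) {
    throw new Error("host mismatch");
  }
  return url;
}

// Layer 4: options to pass to fetch() so redirects are not followed.
const UPSTREAM_FETCH_OPTIONS = { redirect: "manual" as const };
```

For contrast, new URL("//evil.example/x", "https://api.example.com/v1") resolves to the host evil.example, which is the origin override that pathname assignment avoids.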
Header Isolation
Request headers: Only an explicit allowlist of safe headers is forwarded to upstream APIs:
- content-type, content-length, accept, accept-encoding, accept-language, user-agent, content-encoding, transfer-encoding, idempotency-key
This prevents leaking Cookie, CF-Access-*, x-forwarded-for, cf-connecting-ip, and other internal headers to third-party upstreams.
Response headers: Upstream response headers are filtered to strip:
- set-cookie, cookie — prevents upstream session injection
- connection, keep-alive, te, trailer, upgrade — hop-by-hop headers
- proxy-authenticate, proxy-authorization — proxy-specific headers
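The two filters differ in direction: request headers pass through an allowlist, while response headers pass through a denylist. A minimal sketch, using plain objects in place of Fetch API Headers and hypothetical function names:

```typescript
// Request headers forwarded upstream (allowlist from the list above).
const FORWARDED_REQUEST_HEADERS = new Set([
  "content-type", "content-length", "accept", "accept-encoding",
  "accept-language", "user-agent", "content-encoding",
  "transfer-encoding", "idempotency-key",
]);

// Response headers stripped before returning to the client (denylist).
const STRIPPED_RESPONSE_HEADERS = new Set([
  "set-cookie", "cookie",
  "connection", "keep-alive", "te", "trailer", "upgrade",
  "proxy-authenticate", "proxy-authorization",
]);

type HeaderMap = Record<string, string>;

// Keep only headers whose lower-cased name satisfies the predicate.
function filterHeaders(headers: HeaderMap, keep: (name: string) => boolean): HeaderMap {
  const out: HeaderMap = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (keep(lower)) out[lower] = value;
  }
  return out;
}

const filterRequestHeaders = (h: HeaderMap): HeaderMap =>
  filterHeaders(h, (n) => FORWARDED_REQUEST_HEADERS.has(n));

const filterResponseHeaders = (h: HeaderMap): HeaderMap =>
  filterHeaders(h, (n) => !STRIPPED_RESPONSE_HEADERS.has(n));
```

An allowlist on the request side means any header not yet anticipated, including new platform-injected ones, is dropped by default.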
Upstream Credential Injection
API keys are injected into upstream requests based on the service's auth_scheme configuration. The original client's auth headers are always stripped before injection — there is no path for a client to see or influence the upstream credential.
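The strip-then-inject order can be illustrated with the sketch below. The AuthScheme variants ("bearer" and "header") are hypothetical, since this document does not list the supported auth_scheme values; the stripped header names are the three accepted client auth headers.

```typescript
// Hypothetical auth_scheme shapes; the real configuration may differ.
type AuthScheme =
  | { kind: "bearer" }
  | { kind: "header"; name: string };

// Client-supplied auth headers that must never reach the upstream.
const CLIENT_AUTH_HEADERS = new Set(["authorization", "x-api-key", "xi-api-key"]);

function injectCredential(
  headers: Record<string, string>,
  scheme: AuthScheme,
  upstreamKey: string,
): Record<string, string> {
  const out: Record<string, string> = {};
  // Step 1: drop every client auth header, whatever the scheme.
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (!CLIENT_AUTH_HEADERS.has(lower)) out[lower] = value;
  }
  // Step 2: attach the upstream credential.
  if (scheme.kind === "bearer") {
    out["authorization"] = `Bearer ${upstreamKey}`;
  } else {
    out[scheme.name.toLowerCase()] = upstreamKey;
  }
  return out;
}
```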
Rate Limiting
Per-token quotas (requests per hour / per day) are enforced via KV counters.
Known limitation: The KV read-then-put sequence is not atomic. Under high concurrency, several requests can read the same counter value and each write back that value plus one, so the counter advances by one instead of by the number of requests served. This allows short bursts beyond quota, which is acceptable at the current scale. For strict enforcement, migrate to a Durable Object with atomic counters.
Failure mode: If KV is unavailable during quota check, the proxy fails open (allows the request) to preserve availability. This is logged as a structured event.
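The lost-update race and the fail-open behavior can both be shown with a small simulation. The in-memory Map stands in for KV (real KV calls are asynchronous), and the key name and function names are hypothetical.

```typescript
// In-memory stand-in for a KV namespace (hypothetical).
const fakeKv = new Map<string, number>();
const readCount = (k: string): number => fakeKv.get(k) ?? 0;
const writeCount = (k: string, v: number): void => {
  fakeKv.set(k, v);
};

// Two concurrent requests both read before either writes.
const counterKey = "quota:token-5:hour";
const seenByA = readCount(counterKey); // 0
const seenByB = readCount(counterKey); // 0, B has not seen A's increment
writeCount(counterKey, seenByA + 1);
writeCount(counterKey, seenByB + 1);
const finalCount = readCount(counterKey); // 1, although two requests were served

// Fail-open quota check: if the counter cannot be read, allow the request
// and emit a structured event.
function isWithinQuota(read: () => number, limit: number): boolean {
  try {
    return read() < limit;
  } catch (err) {
    console.log(JSON.stringify({ event: "quota_check_error", error: String(err) }));
    return true;
  }
}
```

A Durable Object avoids the first problem because a single instance serializes the read and the write for a given counter.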
Circuit Breaker
When enabled per service, the circuit breaker trips on upstream errors:
| Status | Cooldown | Reason |
|---|---|---|
| 429 | Retry-After or 60s | Rate limited |
| 402 | 3600s (1 hour) | Payment required |
| 5xx | 30s | Server error |
| Network error | 30s (as 502) | Fetch failed |
Tripped credentials are stored in KV as { [credentialId]: expiresAtMs } with automatic expiry. The read path prunes expired entries, so recovery is automatic.
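The cooldown table and the prune-on-read behavior can be sketched as below. The function names are hypothetical, and the sketch only understands the delta-seconds form of Retry-After; an HTTP-date value falls back to the 60-second default here, which may differ from the real implementation.

```typescript
// Cooldown in seconds for an upstream status, or null if the breaker
// should not trip. Network errors are treated as 502 by the caller.
function cooldownSeconds(status: number, retryAfter?: string | null): number | null {
  if (status === 429) {
    const parsed = Number(retryAfter);
    return retryAfter && Number.isFinite(parsed) && parsed > 0 ? parsed : 60;
  }
  if (status === 402) return 3600;
  if (status >= 500 && status <= 599) return 30;
  return null;
}

// Stored shape: { [credentialId]: expiresAtMs }
type TrippedMap = Record<string, number>;

// Drop entries whose cooldown has elapsed, so credentials recover on read.
function pruneExpired(tripped: TrippedMap, nowMs: number): TrippedMap {
  const live: TrippedMap = {};
  for (const [id, expiresAtMs] of Object.entries(tripped)) {
    if (expiresAtMs > nowMs) live[id] = expiresAtMs;
  }
  return live;
}
```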
Error Handling & Failure Modes
All D1 and KV operations in the critical path are wrapped in try/catch with explicit error responses:
| Component | Failure | Behavior | HTTP Status |
|---|---|---|---|
| Auth (D1) | Database unavailable | Fail closed — reject request | 503 |
| Auth (KV cache) | KV unavailable | Fall through to D1 | — |
| Service config (D1) | Database unavailable | Fail closed | 503 |
| Key selection (D1/KV) | Backend unavailable | Fail closed | 503 |
| Quota check (KV) | KV unavailable | Fail open — allow request | — |
| Usage logging (D1) | Write failure | Swallowed (async, logged) | — |
| Circuit breaker (KV) | Write failure | Swallowed (async, logged) | — |
Observability
All errors emit structured JSON logs with stable fields for filtering:
{"event": "d1_read_error", "table": "team_tokens", "error": "..."}
{"event": "kv_read_error", "key": "token_cache", "error": "..."}
{"event": "quota_check_error", "token_id": 5, "error": "..."}
{"event": "key_selection_error", "service": "exa", "error": "..."}
{"event": "usage_log_error", "team_token_id": 5, "service_id": 1, "error": "..."}
{"event": "hash_migration_error", "token_id": 2, "error": "..."}
{"event": "usage_rollup_complete", "date": "2026-02-22"}
{"event": "usage_rollup_error", "date": "2026-02-22", "error": "..."}
Use wrangler tail --format json to stream these events in real time and filter on the event field.
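A small helper keeps the event field first and the remaining fields flat, which makes the logs easy to filter. This is an illustrative sketch; formatEvent is a hypothetical name, not a function from the codebase.

```typescript
// Serialize a structured log event with "event" as the first key
// (hypothetical helper).
function formatEvent(event: string, fields: Record<string, unknown> = {}): string {
  return JSON.stringify({ event, ...fields });
}

// Example: emit a D1 read error in the shape listed above.
console.log(formatEvent("d1_read_error", { table: "team_tokens", error: "timeout" }));
```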
Secrets Management
| Secret | Purpose | Set via |
|---|---|---|
| ADMIN_TOKEN | Admin API authentication | wrangler secret put ADMIN_TOKEN |
| TOKEN_PEPPER | HMAC key for token hashing | wrangler secret put TOKEN_PEPPER |
Secrets are never stored in wrangler.jsonc or committed to the repository. They are injected at runtime by the Cloudflare Workers platform.
Rotation:
- ADMIN_TOKEN: Update via wrangler secret put, takes effect on next request
- TOKEN_PEPPER: Changing the pepper invalidates all existing token hashes. The legacy fallback will re-hash tokens on first use, but there will be a brief period of double D1 lookups per auth. Plan rotation during low-traffic windows.
Session Affinity
For async APIs (Exa research, Firecrawl crawl, AssemblyAI transcription), session affinity ensures poll requests use the same upstream key that created the job. Job IDs are stored in KV with configurable TTL.
Bounded capture: The proxy only reads response bodies for session capture when:
- The response Content-Type is application/json
- The Content-Length is under 64KB
This prevents memory exhaustion from large or binary responses.
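The two conditions can be expressed as a single guard evaluated before the body is read. In this sketch the function name shouldCaptureBody is hypothetical, and treating a missing or unparsable Content-Length as "do not capture" is an assumption, since the document does not say how that case is handled.

```typescript
const MAX_CAPTURE_BYTES = 64 * 1024;

// Decide whether a response body may be read for session capture
// (hypothetical helper).
function shouldCaptureBody(contentType: string | null, contentLength: string | null): boolean {
  // Assumption: without both headers the body is not captured.
  if (!contentType || !contentLength) return false;
  // Ignore parameters such as "; charset=utf-8".
  const mediaType = contentType.split(";")[0].trim().toLowerCase();
  const length = Number(contentLength);
  return (
    mediaType === "application/json" &&
    Number.isInteger(length) &&
    length >= 0 &&
    length < MAX_CAPTURE_BYTES
  );
}
```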
Recommendations
Before Production
- [ ] Set TOKEN_PEPPER via wrangler secret put TOKEN_PEPPER
- [ ] Put /admin/* behind Cloudflare Access or IP allowlist
- [ ] Add WAF rate limiting rule for /admin/*
- [ ] Verify all team tokens work after pepper migration (first request per token triggers auto-migration)
Ongoing
- [ ] Rotate ADMIN_TOKEN quarterly
- [ ] Monitor wrangler tail for auth_hash_error and d1_read_error events
- [ ] Review upstream API spec changes (tracked by GitHub Action)
- [ ] Audit active tokens periodically via GET /admin/tokens?active=true