Security

This document covers KeyPool's security model, the hardening measures in place, and operational guidance for maintaining a secure deployment.

Threat Model

KeyPool is an Internet-facing proxy that holds upstream API credentials. The primary threats are:

Credential theft — Attacker extracts upstream API keys via the proxy
Open proxy / SSRF — Attacker uses KeyPool to reach arbitrary hosts with attached credentials
Token compromise — Team token leaks, granting unauthorized API access
Admin takeover — Admin secret leaks, granting full control plane access
Data exfiltration — D1 database leak exposes token hashes, usage data, member info

Authentication

Team Tokens

Team tokens authenticate SDK requests. They follow the format nxkp-{uuid}-{uuid} and are high-entropy (each version 4 UUID carries 122 bits of randomness).
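As a rough illustration of the format, the sketch below builds and checks a token of the nxkp-{uuid}-{uuid} shape. The names generateTeamToken, looksLikeTeamToken, and TEAM_TOKEN_PATTERN are hypothetical, and the use of crypto.randomUUID() (version 4 UUIDs) is an assumption; KeyPool's actual generator may differ.

```typescript
// Hypothetical sketch; not KeyPool's actual generator.
const UUID = "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}";
const TEAM_TOKEN_PATTERN = new RegExp(`^nxkp-${UUID}-${UUID}$`);

function generateTeamToken(): string {
  // crypto.randomUUID() returns a lowercase version 4 UUID (122 random bits each).
  return `nxkp-${crypto.randomUUID()}-${crypto.randomUUID()}`;
}

function looksLikeTeamToken(value: string): boolean {
  // Cheap shape check that can run before any hashing or database lookup.
  return TEAM_TOKEN_PATTERN.test(value);
}
```

A shape check like this lets malformed credentials be rejected early, though it is no substitute for the hash lookup itself.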

Storage: Tokens are never stored in plaintext. On creation, the raw token is hashed and only the hash is persisted in D1. The raw token is returned once and cannot be retrieved again.

Hashing: When TOKEN_PEPPER is set (recommended), tokens are hashed with HMAC-SHA-256 using the pepper as the key. This prevents offline token verification if D1 is compromised — the attacker would also need the pepper, which is stored separately as a Wrangler secret.

Without TOKEN_PEPPER, tokens are hashed with plain SHA-256. This is still safe against brute-force (tokens are high-entropy UUIDs), but lacks defense-in-depth against database leaks.
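The two hashing modes can be sketched with the Web Crypto API available in Workers. The function name hashToken and the hex output encoding are assumptions made for illustration; the source does not state how KeyPool encodes the stored hash.

```typescript
const encoder = new TextEncoder();

// Hex-encode a digest. The storage encoding is an assumption of this sketch.
function toHex(buf: ArrayBuffer): string {
  return Array.from(new Uint8Array(buf), (b) => b.toString(16).padStart(2, "0")).join("");
}

// Hypothetical helper: HMAC-SHA-256 when a pepper is supplied, plain SHA-256 otherwise.
async function hashToken(rawToken: string, pepper?: string): Promise<string> {
  const data = encoder.encode(rawToken);
  if (!pepper) {
    return toHex(await crypto.subtle.digest("SHA-256", data));
  }
  const key = await crypto.subtle.importKey(
    "raw",
    encoder.encode(pepper),
    { name: "HMAC", hash: "SHA-256" },
    false, // the key is not extractable
    ["sign"],
  );
  return toHex(await crypto.subtle.sign("HMAC", key, data));
}
```

With a pepper, the same token produces a different digest than plain SHA-256, so a leaked database row cannot be checked against a candidate token without the secret.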

Legacy migration: When TOKEN_PEPPER is enabled, the auth middleware automatically detects tokens hashed with plain SHA-256 and migrates them to HMAC-SHA-256 on first use. No manual re-hashing is needed.
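The migrate-on-first-use flow might look roughly like the following. This is a sketch under stated assumptions: the TokenRow type and findAndMigrate function are hypothetical, an in-memory array stands in for the D1 table, and both hashes are assumed to be computed by the caller.

```typescript
// Hypothetical stand-in for a row in the D1 token table.
type TokenRow = { id: number; hash: string };

function findAndMigrate(
  rows: TokenRow[],
  hmacHash: string,   // HMAC-SHA-256 of the presented token, keyed by the pepper
  legacyHash: string, // plain SHA-256 of the presented token
): TokenRow | undefined {
  // First lookup: the current (peppered) hash.
  const current = rows.find((r) => r.hash === hmacHash);
  if (current) return current;

  // Second lookup: the legacy hash. On a hit, upgrade the stored value in place.
  const legacy = rows.find((r) => r.hash === legacyHash);
  if (legacy) legacy.hash = hmacHash;
  return legacy;
}
```

Only tokens that still carry a legacy hash pay for the second lookup, and only once.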

Accepted auth methods:
- Authorization: Bearer <token> (preferred)
- x-api-key: <token> header
- xi-api-key: <token> header (for SDK compatibility, e.g. ElevenLabs)
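A minimal sketch of reading the token from these headers follows. The function name extractToken is hypothetical, and the precedence order (Bearer first, then x-api-key, then xi-api-key) is an assumption based on Bearer being marked as preferred.

```typescript
// Hypothetical helper: returns the presented token, or null if none was sent.
function extractToken(headers: Headers): string | null {
  const auth = headers.get("authorization");
  if (auth) {
    // Accept "Bearer <token>" with a case-insensitive scheme name.
    const match = /^Bearer\s+(.+)$/i.exec(auth.trim());
    if (match) return match[1].trim();
  }
  // Header-based fallbacks. Query parameters are deliberately never consulted.
  return headers.get("x-api-key") ?? headers.get("xi-api-key");
}
```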

Query-parameter auth (?api_key=) was removed because query strings leak into logs, analytics, referrer headers, and browser history.

Admin Authentication

The admin API (/admin/*) is protected by a single ADMIN_TOKEN secret compared using constant-time byte comparison to prevent timing attacks.
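For illustration, a constant-time byte comparison can be written as below. This is a generic sketch of the technique, not KeyPool's code, and the name constantTimeEqual is hypothetical.

```typescript
// Hypothetical helper: compares two strings byte by byte without an early exit.
function constantTimeEqual(a: string, b: string): boolean {
  const enc = new TextEncoder();
  const x = enc.encode(a);
  const y = enc.encode(b);
  // Fold the length difference into the result instead of returning early.
  let diff = x.length ^ y.length;
  const len = Math.max(x.length, y.length);
  for (let i = 0; i < len; i++) {
    // Out-of-range bytes are treated as 0, so every position is always visited.
    diff |= (x[i] ?? 0) ^ (y[i] ?? 0);
  }
  return diff === 0;
}
```

The loop length depends on the longer input, so timing reveals at most the length of the inputs, not how many leading bytes matched. The Workers runtime also offers a non-standard crypto.subtle.timingSafeEqual for equal-length buffers, which may be preferable where available.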

Recommendations:
- Put /admin/* behind Cloudflare Access or Zero Trust for identity-based access control
- Use Cloudflare WAF rate limiting rules on /admin/* to prevent brute-force
- Rotate ADMIN_TOKEN periodically via wrangler secret put ADMIN_TOKEN

KV Token Cache

Authenticated tokens are cached in KV for 5 minutes to avoid D1 lookups on every request. The cached payload excludes PII fields (member_email) — only auth-relevant fields (ID, scopes, policy, quotas, expiry, revocation state) are stored.
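One way to keep PII out of the cache is to build the cached payload from an explicit list of fields. The field names below (other than member_email) and the TOKEN_CACHE binding mentioned in the comment are assumptions for illustration only.

```typescript
// Hypothetical shape of a token row; only member_email is named in this document.
type TokenRecord = {
  id: number;
  member_email: string;
  scopes: string[];
  policy: string;
  hourly_quota: number;
  daily_quota: number;
  expires_at: number | null;
  revoked: boolean;
};

type CachedToken = Omit<TokenRecord, "member_email">;

const TOKEN_CACHE_TTL_SECONDS = 5 * 60; // the 5-minute cache lifetime

// Copy only auth-relevant fields so PII cannot reach KV by accident.
function toCachePayload(record: TokenRecord): CachedToken {
  return {
    id: record.id,
    scopes: record.scopes,
    policy: record.policy,
    hourly_quota: record.hourly_quota,
    daily_quota: record.daily_quota,
    expires_at: record.expires_at,
    revoked: record.revoked,
  };
}

// In a Worker, the payload would then be written with a TTL, for example:
//   await env.TOKEN_CACHE.put(cacheKey, JSON.stringify(payload),
//     { expirationTtl: TOKEN_CACHE_TTL_SECONDS });
// (TOKEN_CACHE and cacheKey are hypothetical names.)
```

Listing fields explicitly means a column added to the table later stays out of the cache until someone opts it in.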

Cache invalidation happens on token update, rotation, and revocation via the admin API.

Proxy Security

SSRF Prevention

The proxy constructs upstream URLs by appending the request path to the service's configured base_url. Multiple layers prevent SSRF:

Path validation — Rejects paths starting with // (scheme-relative override), containing \ (backslash normalization), or null bytes
Safe URL construction — Uses pathname assignment on a URL object constructed from base_url, rather than new URL(path, base) which allows origin override
Post-construction host check — Verifies the final URL's host matches the expected base_url host
No redirect following — redirect: 'manual' prevents upstream 302 redirects to attacker-controlled hosts
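The layers above can be sketched as a single URL builder. This is an illustrative sketch, not KeyPool's implementation: buildUpstreamUrl is a hypothetical name, the example hosts are placeholders, and the request path is assumed to exclude the query string.

```typescript
// Hypothetical helper combining path validation, safe construction, and a host check.
function buildUpstreamUrl(baseUrl: string, requestPath: string): URL {
  // Layer 1: reject scheme-relative paths, backslashes, and null bytes.
  if (requestPath.startsWith("//") || requestPath.includes("\\") || requestPath.includes("\0")) {
    throw new Error("invalid path");
  }

  // Layer 2: assign to pathname instead of calling new URL(path, base),
  // so the path can never replace the origin.
  const base = new URL(baseUrl);
  const url = new URL(base.href);
  const prefix = base.pathname.replace(/\/+$/, "");
  const suffix = requestPath.startsWith("/") ? requestPath : `/${requestPath}`;
  url.pathname = prefix + suffix;

  // Layer 3: verify the host after construction.
  if (url.host !== base.host) {
    throw new Error("host mismatch");
  }
  return url;
}

// Layer 4 applies at fetch time, for example:
//   await fetch(url, { method, headers, body, redirect: "manual" });
```

For example, buildUpstreamUrl("https://api.example.com/v1", "/search") yields https://api.example.com/v1/search, while a path of //evil.example/x is rejected before any URL is built.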

Header Isolation

Request headers: Only an explicit allowlist of safe headers is forwarded to upstream APIs: content-type, content-length, accept, accept-encoding, accept-language, user-agent, content-encoding, transfer-encoding, idempotency-key

This prevents leaking Cookie, CF-Access-*, x-forwarded-for, cf-connecting-ip, and other internal headers to third-party upstreams.

Response headers: Upstream response headers are filtered to strip:
- set-cookie, cookie: prevents upstream session injection
- connection, keep-alive, te, trailer, upgrade: hop-by-hop headers
- proxy-authenticate, proxy-authorization: proxy-specific headers
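Both filters can be sketched with the standard Headers class. The header lists are taken from this section; the function and constant names are hypothetical, and the real implementation may differ.

```typescript
// Request side: an allowlist. Anything not named here is dropped.
const FORWARDED_REQUEST_HEADERS = [
  "content-type", "content-length", "accept", "accept-encoding", "accept-language",
  "user-agent", "content-encoding", "transfer-encoding", "idempotency-key",
];

// Response side: a denylist. Everything else passes through.
const STRIPPED_RESPONSE_HEADERS = new Set([
  "set-cookie", "cookie",
  "connection", "keep-alive", "te", "trailer", "upgrade",
  "proxy-authenticate", "proxy-authorization",
]);

function filterRequestHeaders(incoming: Headers): Headers {
  const out = new Headers();
  for (const name of FORWARDED_REQUEST_HEADERS) {
    const value = incoming.get(name);
    if (value !== null) out.set(name, value);
  }
  return out;
}

function filterResponseHeaders(upstream: Headers): Headers {
  const out = new Headers();
  upstream.forEach((value, name) => {
    if (!STRIPPED_RESPONSE_HEADERS.has(name.toLowerCase())) out.append(name, value);
  });
  return out;
}
```

The asymmetry is deliberate in this sketch: an allowlist on the way out fails safe when a new internal header appears, while a denylist on the way back keeps unknown upstream headers working.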

Upstream Credential Injection

API keys are injected into upstream requests based on the service's auth_scheme configuration. The original client's auth headers are always stripped before injection — there is no path for a client to see or influence the upstream credential.

Rate Limiting

Per-token quotas (requests per hour / per day) are enforced via KV counters.

Known limitation: A KV read followed by a put is not atomic. Under high concurrency, multiple requests can read the same counter value and each write back the same incremented value, so some increments are lost and short bursts can exceed the quota. This is acceptable at the current scale. For strict enforcement, migrate to a Durable Object with atomic counters.

Failure mode: If KV is unavailable during quota check, the proxy fails open (allows the request) to preserve availability. This is logged as a structured event.
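The read-then-put counter and its fail-open behavior can be sketched as follows. The CounterStore interface is a hypothetical stand-in for a KV namespace, and checkQuota and its return shape are assumptions; only the quota_check_error event name comes from this document.

```typescript
// Hypothetical minimal view of a KV namespace.
interface CounterStore {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

async function checkQuota(
  store: CounterStore,
  counterKey: string,
  limit: number,
): Promise<{ allowed: boolean; failedOpen: boolean }> {
  try {
    const current = Number((await store.get(counterKey)) ?? "0");
    if (current >= limit) return { allowed: false, failedOpen: false };
    // Not atomic: concurrent requests can read the same value here and each
    // write current + 1, which is the burst-over-quota limitation noted above.
    await store.put(counterKey, String(current + 1));
    return { allowed: true, failedOpen: false };
  } catch (err) {
    // Fail open: availability wins over strict enforcement, and the event is logged.
    console.log(JSON.stringify({ event: "quota_check_error", error: String(err) }));
    return { allowed: true, failedOpen: true };
  }
}
```

Returning failedOpen separately lets callers count how often the quota was bypassed, which helps when deciding whether a Durable Object migration is worth it.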

Circuit Breaker

When enabled per service, the circuit breaker trips on upstream errors:

Status	Cooldown	Reason
429	`Retry-After` or 60s	Rate limited
402	3600s (1 hour)	Payment required
5xx	30s	Server error
Network error	30s (as 502)	Fetch failed

Tripped credentials are stored in KV as { [credentialId]: expiresAtMs } with automatic expiry. The read path prunes expired entries, so recovery is automatic.
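The cooldown table and the prune-on-read behavior can be sketched as two small functions. The names are hypothetical, and this sketch only understands the delta-seconds form of Retry-After, falling back to 60 seconds for anything else (including HTTP-date values); a network error is assumed to be passed in as status 502.

```typescript
// credentialId -> expiresAtMs, matching the stored shape described above.
type TrippedMap = Record<string, number>;

// Returns the cooldown in seconds, or null if the status should not trip the breaker.
function cooldownSeconds(status: number, retryAfter: string | null): number | null {
  if (status === 429) {
    const parsed = Number(retryAfter);
    return retryAfter !== null && Number.isFinite(parsed) && parsed > 0 ? parsed : 60;
  }
  if (status === 402) return 3600;
  if (status >= 500 && status <= 599) return 30;
  return null;
}

// Read path: drop entries whose cooldown has passed, so recovery needs no extra job.
function pruneExpired(tripped: TrippedMap, nowMs: number): TrippedMap {
  return Object.fromEntries(
    Object.entries(tripped).filter(([, expiresAtMs]) => expiresAtMs > nowMs),
  );
}
```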

Error Handling & Failure Modes

All D1 and KV operations in the critical path are wrapped in try/catch with explicit error responses:

Component	Failure	Behavior	HTTP Status
Auth (D1)	Database unavailable	Fail closed — reject request	503
Auth (KV cache)	KV unavailable	Fall through to D1	—
Service config (D1)	Database unavailable	Fail closed	503
Key selection (D1/KV)	Backend unavailable	Fail closed	503
Quota check (KV)	KV unavailable	Fail open — allow request	—
Usage logging (D1)	Write failure	Swallowed (async, logged)	—
Circuit breaker (KV)	Write failure	Swallowed (async, logged)	—

Observability

All errors emit structured JSON logs with stable fields for filtering:

{"event": "d1_read_error", "table": "team_tokens", "error": "..."}
{"event": "kv_read_error", "key": "token_cache", "error": "..."}
{"event": "quota_check_error", "token_id": 5, "error": "..."}
{"event": "key_selection_error", "service": "exa", "error": "..."}
{"event": "usage_log_error", "team_token_id": 5, "service_id": 1, "error": "..."}
{"event": "hash_migration_error", "token_id": 2, "error": "..."}
{"event": "usage_rollup_complete", "date": "2026-02-22"}
{"event": "usage_rollup_error", "date": "2026-02-22", "error": "..."}

Use wrangler tail --format json to stream these logs in real time, and filter on the event field to isolate a specific error type.

Secrets Management

Secret	Purpose	Set via
`ADMIN_TOKEN`	Admin API authentication	`wrangler secret put ADMIN_TOKEN`
`TOKEN_PEPPER`	HMAC key for token hashing	`wrangler secret put TOKEN_PEPPER`

Secrets are never stored in wrangler.jsonc or committed to the repository. They are injected at runtime by the Cloudflare Workers platform.

Rotation:
- ADMIN_TOKEN: Update via wrangler secret put; the new value takes effect on the next request.
- TOKEN_PEPPER: Changing the pepper invalidates all existing token hashes. The legacy fallback will re-hash tokens on first use, but there will be a brief period of double D1 lookups per auth. Plan rotation during low-traffic windows.

Session Affinity

For async APIs (Exa research, Firecrawl crawl, AssemblyAI transcription), session affinity ensures poll requests use the same upstream key that created the job. Job IDs are stored in KV with configurable TTL.

Bounded capture: The proxy only reads response bodies for session capture when:
- The response Content-Type is application/json
- The Content-Length is under 64KB

This prevents memory exhaustion from large or binary responses.
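The two capture conditions can be expressed as a header-only check that runs before the body is touched. The name shouldCaptureBody is hypothetical, and two details are assumptions of this sketch: parameters such as charset are ignored when matching the content type, and a missing or malformed Content-Length is treated as "do not capture".

```typescript
const MAX_CAPTURE_BYTES = 64 * 1024; // 64KB

// Hypothetical helper: decides from headers alone whether the body may be read.
function shouldCaptureBody(headers: Headers): boolean {
  // Compare only the media type, ignoring parameters like "; charset=utf-8".
  const mediaType = (headers.get("content-type") ?? "").split(";")[0].trim().toLowerCase();
  if (mediaType !== "application/json") return false;

  // Assumption: without a well-formed Content-Length, skip capture entirely.
  const raw = headers.get("content-length");
  if (raw === null || !/^\d+$/.test(raw.trim())) return false;

  return Number(raw) < MAX_CAPTURE_BYTES;
}
```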

Recommendations

Before Production

[ ] Set TOKEN_PEPPER via wrangler secret put TOKEN_PEPPER
[ ] Put /admin/* behind Cloudflare Access or IP allowlist
[ ] Add WAF rate limiting rule for /admin/*
[ ] Verify all team tokens work after pepper migration (first request per token triggers auto-migration)

Ongoing

[ ] Rotate ADMIN_TOKEN quarterly
[ ] Monitor wrangler tail for hash_migration_error and d1_read_error events
[ ] Review upstream API spec changes (tracked by GitHub Action)
[ ] Audit active tokens periodically via GET /admin/tokens?active=true