Gate traffic inside your app based on backend health. Not request counts. Not static rules. Your actual metrics, fleet-wide.
Traditional rate limiters count requests. They have no idea if your origin is healthy or on fire.
You set 100 req/s, but your database is responding in 8 seconds. The rate limiter keeps letting traffic through until the connection pool is exhausted.
The rate limiter reports "success" while your users see an error. It counts requests. It has no idea your backend is drowning.
A POST that triggers a complex join costs 100x a cached GET, but they both count as "1 request." Your limit has no idea which traffic is expensive.
You picked a number that worked in staging. The right limit changes every minute based on query complexity, connection pool state, and what else is running.
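Why a static number fails can be shown in a few lines. This is an illustrative AIMD-style controller, not WaitState's actual policy engine: when observed latency blows past target, the limit is cut multiplicatively; when the backend is healthy, it probes back up additively. All names and constants here are made up for the sketch.

```rust
/// Illustrative only: adapt a request limit to observed backend latency.
/// Cut hard when the backend is slow, probe up gently when it recovers.
fn adjust_limit(current_limit: f64, observed_p99_ms: f64, target_p99_ms: f64) -> f64 {
    if observed_p99_ms > target_p99_ms {
        // Backend is struggling: halve the limit instead of piling on.
        (current_limit * 0.5).max(1.0)
    } else {
        // Backend is healthy: creep the limit back up.
        current_limit + 10.0
    }
}

fn main() {
    let mut limit = 100.0; // the number that worked in staging
    // Database starts answering in 8 seconds against a 250 ms target.
    limit = adjust_limit(limit, 8000.0, 250.0);
    println!("after spike: {limit} req/s"); // 50 req/s
    // Backend recovers; the limit probes upward again.
    limit = adjust_limit(limit, 120.0, 250.0);
    println!("after recovery: {limit} req/s"); // 60 req/s
}
```

The point is not this particular controller: it is that the right limit is a moving target only your live metrics can track.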
When limits trigger, your enterprise customer's checkout gets the same rejection as a free-tier user browsing docs. No way to prioritize what matters.
Each instance enforces its own limit. Scale to 10 pods and your 100 req/s becomes 1,000. Scale down and surviving pods still think they can each do 100.
A closed loop between your backend health and your in-app gating. The SDK runs inside your application.
In-app SDK. No reverse proxy. Each gate call runs synchronously in your process with zero network hops on the hot path.
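What "zero network hops on the hot path" means, as a self-contained sketch: the gate is a plain in-memory check. A token bucket stands in here for the real policy (which the SDK receives from the control plane); the struct and method names are illustrative, not the SDK's API.

```rust
use std::time::Instant;

/// Illustrative in-process gate: a token bucket checked synchronously.
/// No socket, no proxy; the only syscall is reading the clock.
struct Gate {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl Gate {
    fn new(rps: f64) -> Self {
        Gate { capacity: rps, tokens: rps, refill_per_sec: rps, last: Instant::now() }
    }

    /// Returns true if the request may proceed, false if it should be shed.
    fn allow(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.last = now;
        // Refill proportionally to elapsed time, capped at capacity.
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut gate = Gate::new(2.0); // 2 req/s budget
    assert!(gate.allow());
    assert!(gate.allow());
    assert!(!gate.allow()); // third back-to-back request is shed in-process
}
```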
Fail-open by design. Short outages use cached policy. Prolonged outages gracefully step aside so your existing infrastructure takes over. WaitState adds intelligence, not a dependency.
Zero dependencies. Every SDK is compiled from a single Rust core. No runtime deps, no JNI classloaders, no C toolchains on your CI.
Gate before origin. Shed traffic at the edge before it reaches your backend. Fewer wasted compute cycles, lower tail latency.
Same Rust core. Compiled to WASM, runs in Cloudflare Workers, Fastly Compute, Vercel. One binary, every edge.
Sub-millisecond gate. WASM runs at near-native speed. No cold-start penalty on the hot path.
Any language. Talk to the agent from any stack. No SDK embedding required. Gate over localhost HTTP.
DaemonSet or sidecar. One agent per node, shared by all pods. Minimal resource footprint. 10m CPU, 16Mi memory.
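The footprint above can be pinned in the manifest. A minimal sketch only: the image name, labels, and metadata here are placeholders, not the shipped artifact.

```yaml
# Sketch: one agent per node via a DaemonSet, with the small
# resource requests noted above. Image name is a placeholder.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: waitstate-agent
spec:
  selector:
    matchLabels:
      app: waitstate-agent
  template:
    metadata:
      labels:
        app: waitstate-agent
    spec:
      containers:
        - name: agent
          image: example/waitstate-agent:latest # placeholder
          resources:
            requests:
              cpu: 10m
              memory: 16Mi
```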
Apollo Router coprocessor. Built-in GraphQL integration. Tag by operation name, gate before execution.
Not all traffic is equal. Assign weights to your tags. When a reflex rule fires, low-weight traffic gets shed first.
Every gate call carries a tag and a weight, so prioritization is enforced on every single request.
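The shedding order can be sketched in a few lines. This is illustrative, not the SDK's actual algorithm: given per-tag traffic rates and a required reduction, whole tags are dropped lowest-weight-first. Tag names, weights, and rates are made up.

```rust
/// Illustrative: shed tags lowest-weight-first until enough load is dropped.
/// Each entry is (tag, weight, current req/s).
fn tags_to_shed(mut tags: Vec<(&'static str, u32, f64)>, mut reduce_rps: f64) -> Vec<&'static str> {
    // Low-weight traffic goes first.
    tags.sort_by_key(|&(_, weight, _)| weight);
    let mut shed = Vec::new();
    for (name, _, rps) in tags {
        if reduce_rps <= 0.0 {
            break;
        }
        shed.push(name);
        reduce_rps -= rps;
    }
    shed
}

fn main() {
    let tags = vec![
        ("free:docs", 1, 300.0),          // free-tier users browsing docs
        ("pro:api", 10, 500.0),           // paid API traffic
        ("enterprise:checkout", 100, 200.0), // the traffic that matters most
    ];
    // A rule demands dropping 400 req/s: free tier goes first, then pro.
    // Enterprise checkout is never touched.
    let shed = tags_to_shed(tags, 400.0);
    assert_eq!(shed, vec!["free:docs", "pro:api"]);
}
```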
Short outages use cached policy. If the lease expires, the SDK enters safe mode. You choose the strategy:
fixed_rps: throttle to a max req/sec you set
open: allow all traffic (full fail-open)
last_policy: keep enforcing the stale policy

Rules react to built-in metrics like latency and error rate. Report your own custom metrics and trigger rules on anything your backend can measure.
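A minimal model of the three safe-mode strategies, as a sketch rather than the SDK's actual types: once the lease expires, the strategy determines the effective limit, where `None` means unlimited.

```rust
/// Illustrative model of safe-mode behavior when the policy lease expires.
#[derive(Debug)]
enum SafeMode {
    FixedRps(f64), // throttle to a max req/sec you set
    Open,          // allow all traffic (full fail-open)
    LastPolicy,    // keep enforcing the stale policy
}

/// Effective limit during a prolonged outage. `stale_policy_rps` is the
/// last limit received before contact was lost; None means unlimited.
fn effective_limit(mode: &SafeMode, stale_policy_rps: Option<f64>) -> Option<f64> {
    match mode {
        SafeMode::FixedRps(rps) => Some(*rps),
        SafeMode::Open => None,
        SafeMode::LastPolicy => stale_policy_rps,
    }
}

fn main() {
    let stale = Some(200.0); // last policy before the outage
    assert_eq!(effective_limit(&SafeMode::FixedRps(50.0), stale), Some(50.0));
    assert_eq!(effective_limit(&SafeMode::Open, stale), None);
    assert_eq!(effective_limit(&SafeMode::LastPolicy, stale), Some(200.0));
}
```

Whichever strategy you pick, the failure mode is explicit and chosen by you, never an implicit hard dependency on the control plane.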
No rules target the enterprise tier: it passes through at full rate while lower tiers are shed.
A single Rust engine compiles to native SDKs for every major runtime. Same in-memory gate call. Zero-copy hot path. Use the waitstate crate directly in Rust.
Free tier. No credit card. Add the SDK in under 5 minutes.