AGNT
Reliability

Uptime SLA

99.5%

99.5% availability target for all API endpoints.

Stable

Uptime SLA represents the target availability for AGNT's public API endpoints, measured as the percentage of 60-second intervals in which the /health endpoint returns HTTP 200 with all subsystem checks passing (PostgreSQL, Redis, LLM gateway reachable). Downtime is any 60-second window where the health check fails or responds with a non-200 status.
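
The definition above can be written as a small predicate. This is a minimal sketch, assuming the /health response exposes a per-subsystem pass/fail map. The key names (postgres, redis, llm_gateway) and the function name are hypothetical and not taken from AGNT's actual API.

```python
from typing import Mapping, Optional

# Subsystems named in the definition above; the key spellings are assumptions.
REQUIRED_CHECKS = ("postgres", "redis", "llm_gateway")


def interval_is_up(status_code: Optional[int], checks: Mapping[str, bool]) -> bool:
    """Return True if a 60-second interval counts as available.

    status_code is None when the probe got no response at all
    (timeout, DNS failure, connection refused).
    """
    if status_code != 200:
        return False
    # A subsystem missing from the map is treated as failed.
    return all(checks.get(name, False) for name in REQUIRED_CHECKS)
```

Under this sketch, an interval with a 200 response but a failing Redis check still counts as downtime, which matches the "all subsystem checks passing" wording.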

The 99.5% target translates to approximately 3.65 hours of allowed downtime per month (0.5% of an average 730.5-hour month). In practice, we've maintained 99.8%+ over the last 90 days. The primary downtime events were Railway platform maintenance windows (scheduled, ~15 minutes each) and one Redis connection pool exhaustion incident in February 2026 that caused 22 minutes of degraded service.
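
The downtime budget is simple arithmetic, sketched here for checking the figure. The average month length (365.25 days / 12) is an assumption about how the 3.65-hour number was derived, and the function name is illustrative.

```python
# Average month length in hours: 365.25 days * 24 h / 12 months = 730.5 h.
HOURS_PER_AVERAGE_MONTH = 365.25 * 24 / 12


def allowed_downtime_hours(sla_percent: float,
                           period_hours: float = HOURS_PER_AVERAGE_MONTH) -> float:
    """Hours of downtime permitted by an availability target over a period."""
    return period_hours * (1 - sla_percent / 100)


print(round(allowed_downtime_hours(99.5), 2))           # 3.65 (average month)
print(round(allowed_downtime_hours(99.5, 30 * 24), 2))  # 3.6  (30-day month)
```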

Sentinel, our monitoring agent, polls the health endpoint every 60 seconds from an external VPS (gabagool-ams, Amsterdam). This gives us an outside-in perspective that catches issues invisible to internal monitoring, such as DNS failures, Railway routing problems, and CDN misconfigurations. Sentinel alerts to Slack within 120 seconds of the first failed check.
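
An external poller of this kind might look roughly like the following. This is a sketch using only the Python standard library, not Sentinel's actual code. The two-consecutive-failures rule is an assumption chosen so that a confirmed alert still lands inside the 120-second window, and notify is a hypothetical callback standing in for the Slack integration.

```python
import http.client
import time
import urllib.request
from typing import Callable

POLL_INTERVAL_S = 60        # polling cadence from the text
FAILURES_BEFORE_ALERT = 2   # assumption: one confirming check, still under 120 s


def probe(url: str, timeout_s: float = 10.0) -> bool:
    """Return True only if the endpoint answers HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers DNS errors, refused connections, timeouts, and non-2xx
        # responses (urllib raises HTTPError, a subclass of OSError).
        return False


class FailureTracker:
    """Counts consecutive failures and fires one alert per outage."""

    def __init__(self, notify: Callable[[str], None],
                 threshold: int = FAILURES_BEFORE_ALERT) -> None:
        self.notify = notify
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, ok: bool) -> None:
        if ok:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        # Fire exactly when the threshold is reached, not on every later failure.
        if self.consecutive_failures == self.threshold:
            self.notify(f"/health failed {self.threshold} consecutive checks")


def run_forever(url: str, tracker: FailureTracker) -> None:
    """Poll on a fixed cadence, compensating for the time each probe takes."""
    while True:
        started = time.monotonic()
        tracker.record(probe(url))
        time.sleep(max(0.0, POLL_INTERVAL_S - (time.monotonic() - started)))
```

Because the probe runs outside the hosting platform, a DNS or routing failure surfaces as a failed check rather than going unnoticed.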

Methodology

Computed as: (total 60s intervals with successful health check) / (total 60s intervals in period) * 100. The health check verifies: PostgreSQL connection (SELECT 1), Redis PING, and a lightweight LLM gateway probe (model list endpoint, not an inference call). Monitoring runs from the gabagool-ams VPS (136.244.98.211) using a custom Sentinel agent that logs results to a local SQLite database. Monthly reports are generated from this database. Scheduled maintenance windows are excluded from the calculation only if announced 24 hours in advance via the /status page.
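
The calculation above could be reproduced from the SQLite log along these lines. This is a sketch only: the table and column names (checks, maintenance, ts, ok, start_ts, end_ts, announced_ts) are hypothetical, since the real schema is not documented here. It also assumes that intervals with no logged result count as failed, because the formula divides by all intervals in the period.

```python
import sqlite3

INTERVAL_S = 60            # one health check per 60-second interval
MIN_NOTICE_S = 24 * 3600   # maintenance must be announced 24 hours ahead

# Hypothetical schema; timestamps are Unix seconds.
SCHEMA = """
CREATE TABLE checks (ts INTEGER PRIMARY KEY, ok INTEGER NOT NULL);
CREATE TABLE maintenance (start_ts INTEGER, end_ts INTEGER, announced_ts INTEGER);
"""


def uptime_percent(conn: sqlite3.Connection, period_start: int, period_end: int) -> float:
    """Percentage of 60 s intervals in [period_start, period_end) with a passing check."""
    # Interval indices that have at least one successful check.
    ok_slots = {
        (ts - period_start) // INTERVAL_S
        for (ts,) in conn.execute(
            "SELECT ts FROM checks WHERE ok = 1 AND ts >= ? AND ts < ?",
            (period_start, period_end),
        )
    }
    # Only windows announced at least 24 hours in advance are excluded.
    windows = conn.execute(
        "SELECT start_ts, end_ts FROM maintenance WHERE start_ts - announced_ts >= ?",
        (MIN_NOTICE_S,),
    ).fetchall()

    total = up = 0
    for slot in range((period_end - period_start) // INTERVAL_S):
        slot_start = period_start + slot * INTERVAL_S
        if any(start <= slot_start < end for start, end in windows):
            continue  # excluded maintenance interval
        total += 1
        up += slot in ok_slots
    return 100.0 * up / total if total else 0.0
```

A maintenance window announced with less than 24 hours of notice stays in the denominator, so any failed checks inside it reduce the reported figure.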
