
Live Demo

20 Hours of SRE Work in 5 Minutes

Define your service once. NthLayer generates SLOs, alerts, dashboards, recording rules, and runbooks—with technology-specific best practices built in.

  • 99.6% Time Savings

  • Tech-Aware (PostgreSQL, Redis, Kafka)

  • One Command for Everything

  • Real-Time Error Budget Tracking


Live Grafana Dashboards

See auto-generated dashboards for 6 production services. Each dashboard includes SLO metrics, service health, and technology-specific panels.

Dashboard Structure

Each dashboard is organized into three sections: SLO Metrics, Service Health, and Dependencies.


Generated Alerts

118 production-ready Prometheus alerts across all services, sourced from awesome-prometheus-alerts.

  • payment-api · 15 PostgreSQL alerts

    PostgresqlDown, PostgresqlRestarted, SlowQueries...

    View on GitHub

  • checkout-service · 26 MySQL + Redis alerts

    MysqlDown, RedisMemoryHigh, ReplicationLag...

    View on GitHub

  • notification-worker · 12 Redis alerts

    RedisDown, RedisMemoryHigh, TooManyConnections...

    View on GitHub

  • analytics-stream · 19 MongoDB + Redis alerts

    MongodbDown, CursorsTimeouts, ReplicasetLag...

    View on GitHub

  • identity-service · 27 PostgreSQL + Redis alerts

    PostgresqlDown, DeadLocks, RedisRejected...

    View on GitHub

  • search-api · 19 Elasticsearch alerts

    ClusterRed, JvmHeapHigh, DiskSpaceLow...

    View on GitHub
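
Attaching service labels to an upstream rule can be pictured as a small transformation on the rule definition. The following is a minimal sketch, not NthLayer's implementation: the `with_service_labels` helper and the `service` and `tier` label names are assumptions, and the `PostgresqlDown` rule body is shown only for illustration.

```python
import copy
import json

# Illustrative rule in the shape Prometheus expects for an alerting rule.
# The expression is an example; check the upstream rule before relying on it.
POSTGRESQL_DOWN = {
    "alert": "PostgresqlDown",
    "expr": "pg_up == 0",
    "for": "0m",
    "labels": {"severity": "critical"},
}


def with_service_labels(rule, service, tier):
    """Return a copy of the rule with service routing labels added.

    The label names `service` and `tier` are assumptions for this sketch.
    """
    labeled = copy.deepcopy(rule)  # leave the shared upstream rule untouched
    labeled.setdefault("labels", {}).update({"service": service, "tier": tier})
    return labeled


if __name__ == "__main__":
    rule = with_service_labels(POSTGRESQL_DOWN, "payment-api", "critical")
    print(json.dumps(rule, indent=2))
```

Copying the rule before labeling it keeps one upstream definition reusable across several services.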


SLO Portfolio

Track org-wide reliability with tier-based health scoring:

================================================================================
  NthLayer SLO Portfolio
================================================================================

Organization Health: 78% (14/18 services meeting SLOs)

By Tier:
  Critical:  ████████░░  83% (5/6 services)
  Standard:  ███████░░░  75% (6/8 services)
  Low:       ███████░░░  75% (3/4 services)

--------------------------------------------------------------------------------
Services Needing Attention:
--------------------------------------------------------------------------------

  payment-api (Tier 1)
    availability: 156% budget burned - EXHAUSTED
    Remaining: -12.5 hours

  search-api (Tier 2)
    latency-p99: 95% budget burned - WARNING
    Remaining: 1.2 hours

--------------------------------------------------------------------------------
Total: 18 services, 16 with SLOs, 45 SLOs
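
The percentages and bars in a report like this follow from simple counting. Here is a minimal sketch, assuming each service carries a tier and a flag for whether it meets all of its SLOs. The `ServiceStatus` class and the helper names are hypothetical and not part of NthLayer.

```python
from dataclasses import dataclass


@dataclass
class ServiceStatus:
    name: str
    tier: str           # "critical", "standard", or "low"
    meeting_slos: bool  # True when every SLO for the service is within budget


def tally_by_tier(services):
    """Return {tier: (services_meeting_slos, total_services)}."""
    counts = {}
    for svc in services:
        meeting, total = counts.get(svc.tier, (0, 0))
        counts[svc.tier] = (meeting + int(svc.meeting_slos), total + 1)
    return counts


def percent(meeting, total):
    """Whole-number percentage of services meeting their SLOs."""
    return round(100 * meeting / total)


def bar(meeting, total, width=10):
    """Fixed-width bar; the filled portion is rounded down."""
    filled = meeting * width // total
    return "█" * filled + "░" * (width - filled)


def sample_services():
    """Placeholder data matching the tier counts in the report above."""
    shape = {"critical": (5, 6), "standard": (6, 8), "low": (3, 4)}
    return [
        ServiceStatus(f"{tier}-{i}", tier, i < meeting)
        for tier, (meeting, total) in shape.items()
        for i in range(total)
    ]


if __name__ == "__main__":
    tiers = tally_by_tier(sample_services())
    meeting = sum(m for m, _ in tiers.values())
    total = sum(t for _, t in tiers.values())
    print(f"Organization Health: {percent(meeting, total)}% ({meeting}/{total})")
    for tier, (m, t) in tiers.items():
        print(f"  {tier.capitalize():<10} {bar(m, t)}  {percent(m, t)}% ({m}/{t} services)")
```

With the placeholder data, the tier counts add up to 14 of 18 services, which rounds to the 78% shown in the report.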

Cross-Vendor Aggregation

Why this matters: PagerDuty can't give you this view—they want you locked into their ecosystem. NthLayer aggregates SLOs across any backend (Prometheus, Datadog, etc.) in a single, vendor-neutral portfolio.
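
One way to picture vendor-neutral aggregation is to normalize each backend's results into a common record and then merge them. The sketch below uses stub fetchers with hard-coded values taken from the sample report; the `SloResult` record, the `aggregate` function, and the fetcher names are all hypothetical, and no real Prometheus or Datadog API is called.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class SloResult:
    service: str
    slo: str
    backend: str              # where the measurement came from
    budget_burned_pct: float  # 100.0 means the error budget is fully spent


# A fetcher is any callable that returns normalized results for one backend.
Fetcher = Callable[[], Iterable[SloResult]]


def aggregate(fetchers: Dict[str, Fetcher]) -> List[SloResult]:
    """Merge results from every backend, worst budget burn first."""
    results: List[SloResult] = []
    for fetch in fetchers.values():
        results.extend(fetch())
    return sorted(results, key=lambda r: r.budget_burned_pct, reverse=True)


def fetch_from_prometheus_stub():
    """Stub standing in for a real Prometheus query."""
    return [SloResult("search-api", "latency-p99", "prometheus", 95.0)]


def fetch_from_datadog_stub():
    """Stub standing in for a real Datadog query."""
    return [SloResult("payment-api", "availability", "datadog", 156.0)]


if __name__ == "__main__":
    portfolio = aggregate({
        "prometheus": fetch_from_prometheus_stub,
        "datadog": fetch_from_datadog_stub,
    })
    for r in portfolio:
        print(f"{r.service} ({r.backend}) {r.slo}: {r.budget_burned_pct:.0f}% burned")
```

Keeping the backend name on each record means the merged view can still show where a measurement originated.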


PagerDuty Integration

Complete incident response setup with tier-based escalation policies.

  • Team Management

    Auto-creates teams with manager roles assigned to the API key owner

  • On-Call Schedules

    Primary, secondary, and manager schedules with weekly rotation

  • Tier-Based Timing

    Critical: 5→15→30min | High: 15→30→60min | Low: 60min only

  • Service Linking

    Services linked to escalation policies with urgency settings
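
The tier-based timings above can be expressed as a small lookup table. This is a sketch under stated assumptions: each number is read as the minutes to wait before escalating at that step, and the steps are mapped in order to the primary, secondary, and manager schedules. The `escalation_rules` helper and the dictionary shape are hypothetical and do not mirror the PagerDuty API payload.

```python
# Minutes before escalation at each step, taken from the tier list above.
ESCALATION_MINUTES = {
    "critical": [5, 15, 30],
    "high": [15, 30, 60],
    "low": [60],
}

# Assumed order of on-call schedules used for successive escalation steps.
LEVELS = ["primary", "secondary", "manager"]


def escalation_rules(tier):
    """Build an ordered list of escalation steps for a service tier."""
    if tier not in ESCALATION_MINUTES:
        raise ValueError(f"unknown tier: {tier!r}")
    delays = ESCALATION_MINUTES[tier]
    # zip stops at the shorter list, so "low" gets a single primary step.
    return [
        {"level": level, "escalate_after_minutes": delay}
        for level, delay in zip(LEVELS, delays)
    ]


if __name__ == "__main__":
    for tier in ESCALATION_MINUTES:
        print(tier, escalation_rules(tier))
```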

Support Models

Model            Description
self             Team handles all alerts 24/7
shared           Team (day) + SRE (off-hours)
sre              SRE handles all alerts
business_hours   Team (9-5) + low-priority queue
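
The four support models amount to a routing decision based on the time of day. The sketch below assumes that "day" means 09:00 to 17:00 for both `shared` and `business_hours`, that the low-priority queue applies outside those hours, and that weekends are ignored. The `responder` function and its return values are hypothetical.

```python
from datetime import time
from enum import Enum


class SupportModel(Enum):
    SELF = "self"
    SHARED = "shared"
    SRE = "sre"
    BUSINESS_HOURS = "business_hours"


# Assumed working hours; the table only states "9-5" for business_hours.
DAY_START = time(9, 0)
DAY_END = time(17, 0)


def responder(model, now):
    """Return who receives an alert at local time `now` under a support model."""
    in_hours = DAY_START <= now < DAY_END
    if model is SupportModel.SELF:
        return "team"
    if model is SupportModel.SRE:
        return "sre"
    if model is SupportModel.SHARED:
        return "team" if in_hours else "sre"
    # BUSINESS_HOURS: team during the day, otherwise park the alert.
    return "team" if in_hours else "low-priority-queue"


if __name__ == "__main__":
    for model in SupportModel:
        print(model.value, responder(model, time(10, 30)), responder(model, time(23, 0)))
```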

What Gets Generated

From a single service.yaml, NthLayer generates:

Output            Description                                                     Example
Dashboard         22 panels: health, SLOs, latency, errors, dependencies          View JSON
SLOs              3 SLOs with 30-day error budgets and burn rate calculations     View YAML
Alerts            15 PostgreSQL alerts with service labels and severity routing   View YAML
Recording Rules   21 pre-aggregated metrics for 10x faster dashboard queries      View YAML
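
The 30-day error budgets and burn rates mentioned in the table follow standard SLO arithmetic. Here is a minimal sketch, assuming a time-based availability SLO; the function names are illustrative and are not NthLayer's API.

```python
WINDOW_HOURS = 30 * 24  # 30-day SLO window, as in the table above


def error_budget_hours(target, window_hours=WINDOW_HOURS):
    """Allowed 'bad' time for an SLO target such as 0.999."""
    return (1.0 - target) * window_hours


def budget_status(target, bad_hours, window_hours=WINDOW_HOURS):
    """Percent of budget burned and hours remaining (negative once exhausted)."""
    budget = error_budget_hours(target, window_hours)
    return {
        "budget_hours": budget,
        "burned_pct": 100.0 * bad_hours / budget,
        "remaining_hours": budget - bad_hours,
    }


def burn_rate(error_ratio, target):
    """Speed of budget consumption; 1.0 spends exactly the budget over the window."""
    return error_ratio / (1.0 - target)


if __name__ == "__main__":
    # A 99.9% target over 30 days allows about 0.72 hours of bad time.
    print(budget_status(target=0.999, bad_hours=1.0))
    # A 1% error ratio against a 99.9% target burns budget roughly 10x too fast.
    print(burn_rate(error_ratio=0.01, target=0.999))
```

A budget burned above 100% produces a negative remaining value, which is how a status such as "EXHAUSTED" with negative hours can arise.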

Try It Yourself

# Install NthLayer
pipx install nthlayer

# Interactive setup (configures Prometheus, Grafana, PagerDuty)
nthlayer setup

# Generate configs for your service
nthlayer apply payment-api.yaml

# View org-wide SLO health
nthlayer portfolio