
Developer Kit

Outage Response Playbook

Outage Response Playbook generates complete, scenario-specific runbooks rather than generic templates. Each playbook includes severity tiers with measurable criteria, role assignments for every step, step-by-step response procedures, escalation trees with observable triggers, communication templates, resolution checklists, and blameless post-mortem templates. Engineering managers, platform teams, and SRE leads use it to document failure modes before they happen instead of improvising under pressure. A playbook built from this skill is usable immediately; it is not a starting point that needs another hour of editing before it is safe to hand to an on-call engineer.

What makes it production-grade is specificity: every step has an explicit role owner, severity tiers use measurable thresholds, escalation contacts are role-based, and post-mortems are blameless by construction. Playbooks are written for 3 AM, not for ideal conditions.

Nexus Certified · Claude Code · Codex · OpenClaw
outage-response · reliability · runbooks · on-call · operations

One-Time Purchase

$19.99

Sample Output

Outage Playbook: Database Connection Pool Exhaustion

Classification

Type: Infrastructure (Database)

Severity Tiers:

  • P1: >50% of API requests returning 500 errors for >5 minutes; customer-facing impact confirmed
  • P2: >20% of API requests returning 500 errors; degraded performance but partial functionality
  • P3: Connection pool warnings in logs; no customer-facing impact yet; preventive investigation
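
Because the tiers above are stated as measurable thresholds, they can be checked mechanically. The following is a minimal sketch of that mapping, not part of the playbook itself; the function name and its parameters are hypothetical, and signal combinations the tiers do not define (for example, a low error rate with no warnings) return `None` and are left to the Incident Commander.

```python
from typing import Optional


def classify_severity(
    error_rate: float,
    minutes_elevated: float,
    customer_impact_confirmed: bool,
    pool_warnings_in_logs: bool,
) -> Optional[str]:
    """Map observed signals to the P1/P2/P3 tiers.

    error_rate is the fraction of API requests returning 500 (0.0 to 1.0).
    All parameter names are illustrative assumptions.
    """
    # P1: >50% errors for >5 minutes with confirmed customer-facing impact.
    if error_rate > 0.50 and minutes_elevated > 5 and customer_impact_confirmed:
        return "P1"
    # P2: >20% errors; degraded performance but partial functionality.
    if error_rate > 0.20:
        return "P2"
    # P3: pool warnings in logs, below the P2 error threshold.
    if pool_warnings_in_logs:
        return "P3"
    # Nothing matched: the tiers do not cover this case.
    return None
```

For example, a 60% error rate that has lasted only 3 minutes does not yet meet the P1 duration criterion, so the sketch reports P2 until the 5-minute mark passes.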

Trigger Conditions: Alert "db-pool-exhaustion" fires when available connections drop below 10% of max pool size for >60 seconds
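
The trigger condition combines a threshold with a hold time, which is easy to get subtly wrong. The sketch below shows one way to evaluate it over a series of samples; it is an illustration under assumptions (the function name, the sample format, and the choice to reset the timer on any recovery are all hypothetical) and does not describe how the actual "db-pool-exhaustion" alert is implemented.

```python
from typing import Iterable, Tuple


def pool_alert_should_fire(
    samples: Iterable[Tuple[float, int]],
    max_pool_size: int,
    threshold_fraction: float = 0.10,
    hold_seconds: float = 60.0,
) -> bool:
    """Return True once available connections stay below the threshold
    for more than hold_seconds without recovering.

    samples: (timestamp_seconds, available_connections) pairs in time order.
    """
    below_since = None  # timestamp when the current low streak began
    for timestamp, available in samples:
        if available < threshold_fraction * max_pool_size:
            if below_since is None:
                below_since = timestamp
            if timestamp - below_since > hold_seconds:
                return True
        else:
            # Any recovery above the threshold resets the hold timer.
            below_since = None
    return False
```

With a pool of 100 connections, samples of 5, 4, and 3 available connections at 0, 30, and 61 seconds would fire the alert, while a brief recovery in the middle would restart the 60-second window.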

Roles

| Role | Responsibility | Default Owner |
|------|----------------|---------------|
| Incident Commander | Coordinates response, makes severity calls, owns communication | On-call Engineering Manager |
| Technical Lead | Diagnoses root cause, executes remediation steps | On-call Backend Engineer |
| Communications Lead | Updates status page, notifies stakeholders | On-call Engineering Manager (doubles) |

Detection & Triage (0–5 min)

  1. [IC] Acknowledge the PagerDuty alert within 2 minutes. Open #incidents Slack channel.
  2. [Tech Lead] Verify the alert is real — check Grafana dashboard "DB Pool Health": https://grafana.internal/d/db-pool
  3. [IC] Classify severity using the tier definitions above. Declare incident in #incidents.

Response Steps

  1. [Tech Lead] Check active connections: SELECT count(*) FROM pg_stat_activity WHERE state = 'active'; — if >90% of max_connections, confirm pool exhaustion.
  2. [Tech Lead] Identify long-running queries: SELECT pid, now() - pg_stat_activity.query_start AS duration, query FROM pg_stat_activity WHERE state != 'idle' ORDER BY duration DESC LIMIT 10;
  3. [Tech Lead] Kill queries running >5 minutes that are not critical batch jobs, substituting each pid returned in step 2: SELECT pg_terminate_backend(<pid>);
  4. [IC] If killing queries restores pool within 5 minutes, monitor for 15 minutes and proceed to resolution checklist.
  5. [Tech Lead] If pool remains exhausted, restart the application service: kubectl rollout restart deployment/api-server -n production
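
Step 3 asks the Tech Lead to separate long-running queries from critical batch jobs under time pressure. The sketch below shows one way that selection could be made explicit before anything is terminated. It is an assumption-laden illustration: the `ActiveQuery` type, the helper names, and the idea of recognising batch jobs by `application_name` are hypothetical (the query in step 2 does not select that column), and the code only renders statements for review rather than executing them.

```python
from datetime import timedelta
from typing import List, NamedTuple, Set


class ActiveQuery(NamedTuple):
    pid: int
    duration: timedelta
    application_name: str  # hypothetical marker used to spot batch jobs


def pids_to_terminate(
    queries: List[ActiveQuery],
    protected_apps: Set[str],
    max_duration: timedelta = timedelta(minutes=5),
) -> List[int]:
    """Pick pids running longer than max_duration that are not protected batch jobs."""
    return [
        q.pid
        for q in queries
        if q.duration > max_duration and q.application_name not in protected_apps
    ]


def terminate_statements(pids: List[int]) -> List[str]:
    """Render one pg_terminate_backend call per pid for the Tech Lead to review."""
    return [f"SELECT pg_terminate_backend({pid});" for pid in pids]


if __name__ == "__main__":
    observed = [
        ActiveQuery(101, timedelta(minutes=12), "api-server"),
        ActiveQuery(102, timedelta(minutes=40), "nightly-billing"),
        ActiveQuery(103, timedelta(minutes=2), "api-server"),
    ]
    for statement in terminate_statements(pids_to_terminate(observed, {"nightly-billing"})):
        print(statement)
```

Keeping the selection separate from execution means the list of pids can be pasted into #incidents for a second pair of eyes before any backend is terminated.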

Resolution Checklist

  • [ ] Connection pool utilization below 60%
  • [ ] Error rate returned to baseline (<0.1%)
  • [ ] Status page updated: Resolved
  • [ ] PagerDuty incident resolved
  • [ ] Post-mortem scheduled within 48 hours


All sales final. No refunds on digital products.

Includes support for Claude Code, Codex, and OpenClaw in the same license.

What You Get With This Skill

Generates structured, role-clear incident response playbooks for specific failure scenarios. Covers detection through resolution and post-mortem — ready to use when an incident actually happens.

All ClearPoint Nexus Skills Include

  • Production-ready workflow packaging for three supported platforms.
  • Reusable structure designed for repeatable operator tasks.
  • Clear deliverable format, not just raw prompt output.
