check-production - Claude MCP Skill

Check production health: Sentry errors, Vercel logs, health endpoints, GitHub CI/CD. Outputs structured findings. Use log-production-issues to create issues. Invoke for: production diagnostics, error audit, health status, CI failures.

SKILL.md

# /check-production

Audit production health. Output findings as a structured report.

## What This Does

1. Query Sentry for unresolved issues
2. Check Vercel logs for recent errors
3. Test health endpoints
4. Check GitHub Actions for CI/CD failures
5. Output prioritized findings (P0-P3)

**This is a primitive.** It only investigates and reports. Use `/log-production-issues` to create GitHub issues or `/triage` to fix.

## Process

### 1. Sentry Check

```bash
# Run triage script if available
~/.claude/skills/triage/scripts/check_sentry.sh 2>/dev/null || echo "Sentry check unavailable"
```

Alternatively, query Sentry through its MCP server if one is configured.

### 2. Vercel Logs Check

```bash
# Check for recent errors
~/.claude/skills/triage/scripts/check_vercel_logs.sh 2>/dev/null || vercel logs --output json 2>/dev/null | head -50
```

### 3. Health Endpoints

```bash
# Test health endpoint (fallback reads the app URL from .env.local; -f2- keeps any "=" inside the value)
~/.claude/skills/triage/scripts/check_health_endpoints.sh 2>/dev/null || curl -sf "$(grep -m1 '^NEXT_PUBLIC_APP_URL=' .env.local 2>/dev/null | cut -d= -f2-)/api/health" | jq .
```

### 4. GitHub CI/CD Check

```bash
# Check for failed workflow runs on default branch
gh run list --branch main --status failure --limit 5 2>/dev/null || \
gh run list --branch master --status failure --limit 5 2>/dev/null

# Get details on most recent failure
gh run list --status failure --limit 1 --json databaseId,name,conclusion,createdAt,headBranch 2>/dev/null

# Check for stale/stuck workflows
gh run list --status in_progress --json databaseId,name,createdAt 2>/dev/null
```

**What to look for:**
- Failed runs on main/master branch (broken CI)
- Failed runs on feature branches blocking PRs
- Stuck/in-progress runs that should have completed
- Patterns in failure types (tests, lint, build, deploy)
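
The checklist above can be applied mechanically to the JSON that the `gh run list --json ...` commands return. The following is a minimal sketch, not part of the skill itself: the function name `summarizeCi`, the result shape, and the 60-minute "stuck" threshold are assumptions chosen for illustration. Only the field names come from the commands above.

```typescript
// Fields requested by the `gh run list --json ...` commands above.
interface FailedRun {
  databaseId: number;
  name: string;
  conclusion: string;
  createdAt: string; // ISO 8601 timestamp
  headBranch: string;
}

interface InProgressRun {
  databaseId: number;
  name: string;
  createdAt: string; // ISO 8601 timestamp
}

interface CiSummary {
  brokenDefaultBranch: FailedRun[];
  failingFeatureBranches: FailedRun[];
  stuckRuns: InProgressRun[];
  failuresByWorkflow: Record<string, number>;
}

// Assumption: a run still in progress after 60 minutes is treated as stuck.
const STUCK_AFTER_MINUTES = 60;

function summarizeCi(failed: FailedRun[], inProgress: InProgressRun[], now: Date): CiSummary {
  const summary: CiSummary = {
    brokenDefaultBranch: [],
    failingFeatureBranches: [],
    stuckRuns: [],
    failuresByWorkflow: {},
  };
  for (const run of failed) {
    // Failures on main/master mean broken CI; anything else is a feature branch.
    const isDefault = run.headBranch === "main" || run.headBranch === "master";
    (isDefault ? summary.brokenDefaultBranch : summary.failingFeatureBranches).push(run);
    // Count failures per workflow name to expose patterns (tests, lint, build, deploy).
    summary.failuresByWorkflow[run.name] = (summary.failuresByWorkflow[run.name] || 0) + 1;
  }
  for (const run of inProgress) {
    const ageMinutes = (now.getTime() - new Date(run.createdAt).getTime()) / 60000;
    if (ageMinutes > STUCK_AFTER_MINUTES) summary.stuckRuns.push(run);
  }
  return summary;
}
```

Passing `now` as a parameter keeps the function deterministic, so the same input always yields the same summary.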

### 5. Quick Application Checks

```bash
# Check for error handling gaps
grep -rE "catch\s*(\([^)]*\))?\s*\{\s*\}" --include="*.ts" --include="*.tsx" src/ app/ 2>/dev/null | head -5
# Empty catch blocks = silent failures (matches single-line `catch (e) {}` and `catch {}`)
```
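
A line-oriented grep only sees catch blocks that open and close on the same line. As a rough complement, the sketch below scans a whole source string and also finds empty blocks whose braces sit on different lines. `findEmptyCatchLines` is a hypothetical helper, not part of the skill. It is regex-based rather than a real parser, so it treats a block that contains only a comment as non-empty.

```typescript
// Returns the 1-based line numbers where an empty catch block starts.
// Matches `catch (e) {}` and `catch {}`, with any whitespace or newlines between the braces.
function findEmptyCatchLines(source: string): number[] {
  const pattern = /\bcatch\s*(\([^)]*\))?\s*\{\s*\}/g;
  const lines: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    // Count the newlines before the match to turn an offset into a line number.
    lines.push(source.slice(0, match.index).split("\n").length);
  }
  return lines;
}
```

Promise handlers such as `p.catch(() => {})` are not matched, because the pattern requires the braces to follow the `catch` clause directly.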

## Output Format

```markdown
## Production Health Check

### P0: Critical (Active Production Issues)
- [SENTRY-123] PaymentIntent failed - 23 users affected (Score: 147)
  Location: api/checkout.ts:45
  First seen: 2h ago

### P1: High (Degraded Performance / Broken CI)
- Health endpoint slow: /api/health responding in 2.3s (should be <500ms)
- Vercel logs show 5xx errors in last hour (count: 12)
- [CI] Main branch failing: "Build" workflow (run #1234)
  Failed step: "Type check"
  Error: Type 'string' is not assignable to type 'number'

### P2: Medium (Warnings)
- 3 empty catch blocks found (silent failures)
- Health endpoint missing database connectivity check
- [CI] 3 feature branch workflows failing (blocking PRs)

### P3: Low (Improvements)
- Consider adding Sentry performance monitoring
- Health endpoint could include more service checks

## Summary
- P0: 1 | P1: 3 | P2: 3 | P3: 2
- Recommendation: Fix P0 immediately, then fix main branch CI
```
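
If findings are collected programmatically, a report in this layout could be assembled roughly as follows. This is a sketch under assumptions: the `Finding` shape and `renderReport` name are hypothetical, and the free-text recommendation line is left out because it requires judgment.

```typescript
type Priority = "P0" | "P1" | "P2" | "P3";

interface Finding {
  priority: Priority;
  title: string;
  details?: string[]; // indented lines under the bullet, e.g. "Location: ..."
}

// Section headings copied from the report layout above.
const SECTION_TITLES: Record<Priority, string> = {
  P0: "Critical (Active Production Issues)",
  P1: "High (Degraded Performance / Broken CI)",
  P2: "Medium (Warnings)",
  P3: "Low (Improvements)",
};

function renderReport(findings: Finding[]): string {
  const order: Priority[] = ["P0", "P1", "P2", "P3"];
  const out: string[] = ["## Production Health Check", ""];
  for (const p of order) {
    out.push(`### ${p}: ${SECTION_TITLES[p]}`);
    for (const f of findings.filter((x) => x.priority === p)) {
      out.push(`- ${f.title}`);
      for (const d of f.details || []) out.push(`  ${d}`);
    }
    out.push("");
  }
  // Summary line with one count per priority, in the form "P0: 1 | P1: 3 | ...".
  const counts = order.map((p) => `${p}: ${findings.filter((x) => x.priority === p).length}`);
  out.push("## Summary", "- " + counts.join(" | "));
  return out.join("\n");
}
```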

## Priority Mapping

| Signal | Priority |
|--------|----------|
| Active errors affecting users | P0 |
| 5xx errors, slow responses | P1 |
| Main branch CI/CD failing | P1 |
| Feature branch CI blocking PRs | P2 |
| Silent failures, missing checks | P2 |
| Missing monitoring, improvements | P3 |
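
The table can be encoded as a lookup so that priorities are assigned consistently. In this sketch the signal identifiers and the `mostUrgent` helper are hypothetical names. Only the signal-to-priority pairs come from the table.

```typescript
type PriorityLevel = "P0" | "P1" | "P2" | "P3";

// Hypothetical identifiers, one or more per row of the table above.
type Signal =
  | "active_user_errors"
  | "server_5xx"
  | "slow_response"
  | "main_ci_failing"
  | "feature_ci_failing"
  | "silent_failure"
  | "missing_check"
  | "missing_monitoring"
  | "improvement";

const PRIORITY_BY_SIGNAL: Record<Signal, PriorityLevel> = {
  active_user_errors: "P0",
  server_5xx: "P1",
  slow_response: "P1",
  main_ci_failing: "P1",
  feature_ci_failing: "P2",
  silent_failure: "P2",
  missing_check: "P2",
  missing_monitoring: "P3",
  improvement: "P3",
};

// Returns the most urgent priority among the observed signals, or null if there are none.
function mostUrgent(signals: Signal[]): PriorityLevel | null {
  let best: PriorityLevel | null = null;
  for (const signal of signals) {
    const priority = PRIORITY_BY_SIGNAL[signal];
    // String comparison works here because "P0" < "P1" < "P2" < "P3".
    if (best === null || priority < best) best = priority;
  }
  return best;
}
```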

## Health Endpoint Anti-Pattern

**Health checks that lie are worse than no health check.** Example:

```typescript
// ❌ BAD: Reports "ok" without checking
return { status: "ok", services: { database: "ok" } };

// ✅ GOOD: Honest liveness probe (no fake service status)
return { status: "ok", timestamp: new Date().toISOString() };

// ✅ BETTER: Real readiness probe
const dbStatus = await checkDatabase() ? "ok" : "error";
return { status: dbStatus === "ok" ? "ok" : "degraded", services: { database: dbStatus } };
```

If you can't verify a service, don't report on it. False "ok" status masks outages.
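
One way to enforce this rule is to build the response body only from probe outcomes, and to omit any service that was not actually probed. The sketch below assumes a hypothetical `ProbeOutcome` type and `buildHealthBody` function. How each probe runs (for example the `checkDatabase()` call above) is outside its scope.

```typescript
type ProbeOutcome = "passed" | "failed" | "not_checked";

interface HealthBody {
  status: "ok" | "degraded";
  timestamp: string;
  services: Record<string, "ok" | "error">;
}

function buildHealthBody(probes: Record<string, ProbeOutcome>, now: Date): HealthBody {
  const services: Record<string, "ok" | "error"> = {};
  let degraded = false;
  for (const name of Object.keys(probes)) {
    const outcome = probes[name];
    // Never report on a service that was not verified.
    if (outcome === undefined || outcome === "not_checked") continue;
    services[name] = outcome === "passed" ? "ok" : "error";
    if (outcome === "failed") degraded = true;
  }
  return { status: degraded ? "degraded" : "ok", timestamp: now.toISOString(), services };
}
```

With no probes at all, the result is the honest liveness response: `status: "ok"`, a timestamp, and an empty `services` object.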

## Analytics Note

This skill checks production health (errors, logs, endpoints), not product analytics.

For analytics auditing, see `/check-observability`. Note:
- **PostHog** is REQUIRED for product analytics (has MCP server)
- **Vercel Analytics** is NOT acceptable (no CLI/API/MCP - unusable for our workflow)

If you need to investigate user behavior or funnels during incident response, query PostHog via MCP.

### 6. E2E Smoke Check

If Playwright is configured in the project:

```bash
# Run smoke tests against production
PLAYWRIGHT_BASE_URL="$PROD_URL" npx playwright test e2e/smoke.spec.ts --reporter=list 2>&1 | head -30
```

Critical paths to verify:
- Landing page loads (anonymous)
- Dashboard loads (authenticated) — the #1 incident class
- Subscribe page renders
- Session page loads
- No error boundaries triggered on any route

### 7. Post-Deploy Health Check

```bash
# Verify health endpoint
curl -sf "$PROD_URL/api/health" -w "\nHTTP %{http_code} in %{time_total}s\n" | head -5

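# Verify no error boundary on dashboard (check for error text in HTML).
# A failed request is reported separately so it cannot pass as "Dashboard OK".
if ! html=$(curl -sf "$PROD_URL/dashboard" 2>/dev/null); then echo "DASHBOARD REQUEST FAILED"
elif grep -q "Something went wrong" <<<"$html"; then echo "ERROR BOUNDARY DETECTED"
else echo "Dashboard OK"; fi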
```
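
The same post-deploy checks can be written as a pure function that keeps a failed request, a rendered error boundary, and a slow response apart. This is an illustrative sketch. `classifyPage` and `PageProbe` are hypothetical names, the marker text is the string searched for above, and using the 500 ms figure from the sample report as a budget for every page is an assumption.

```typescript
type PageVerdict = "ok" | "slow" | "error_boundary" | "request_failed";

interface PageProbe {
  httpStatus: number; // use 0 when the request never completed
  elapsedMs: number;
  body: string;
}

const ERROR_BOUNDARY_TEXT = "Something went wrong";
const LATENCY_BUDGET_MS = 500; // assumption: taken from the "<500ms" note in the sample report

function classifyPage(probe: PageProbe): PageVerdict {
  // A non-2xx/3xx response is a failure, regardless of what the body contains.
  if (probe.httpStatus < 200 || probe.httpStatus >= 400) return "request_failed";
  if (probe.body.indexOf(ERROR_BOUNDARY_TEXT) !== -1) return "error_boundary";
  if (probe.elapsedMs > LATENCY_BUDGET_MS) return "slow";
  return "ok";
}
```

Checking the status first matters: an empty body from a failed request contains no error text, so a text search alone would pass it.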

## Related

- `/log-production-issues` - Create GitHub issues from findings
- `/triage` - Fix production issues
- `/observability` - Set up monitoring infrastructure
- `/flywheel-qa` - Agentic QA for preview deployments

Information

Repository: phrazzld/claude-config
Author: phrazzld
Last Sync: 3/2/2026
Repo Updated: 3/1/2026
Created: 1/25/2026
