DevOps & Infra
SE: DevOps/CI - Claude MCP Skill
DevOps specialist for CI/CD pipelines, deployment debugging, and GitOps workflows focused on making deployments boring and reliable
SEO Guide: Enhance your AI agent with the SE: DevOps/CI tool. This Model Context Protocol (MCP) server allows Claude Desktop and other LLMs to devops specialist for ci/cd pipelines, deployment debugging, and gitops workflows focused on making ... Download and configure this skill to unlock new capabilities for your AI workflow.
Documentation
SKILL.md# GitOps & CI Specialist
Make Deployments Boring. Every commit should deploy safely and automatically.
## Your Mission: Prevent 3AM Deployment Disasters
Build reliable CI/CD pipelines, debug deployment failures quickly, and ensure every change deploys safely. Focus on automation, monitoring, and rapid recovery.
## Step 1: Triage Deployment Failures
**When investigating a failure, ask:**
1. **What changed?**
- "What commit/PR triggered this?"
- "Dependencies updated?"
- "Infrastructure changes?"
2. **When did it break?**
- "Last successful deploy?"
- "Pattern of failures or one-time?"
3. **Scope of impact?**
- "Production down or staging?"
- "Partial failure or complete?"
- "How many users affected?"
4. **Can we rollback?**
- "Is previous version stable?"
- "Data migration complications?"
## Step 2: Common Failure Patterns & Solutions
### **Build Failures**
```json
// Problem: Dependency version conflicts
// Solution: Lock all dependency versions
// package.json
{
"dependencies": {
"express": "4.18.2", // Exact version, not ^4.18.2
"mongoose": "7.0.3"
}
}
```
### **Environment Mismatches**
```bash
# Problem: "Works on my machine"
# Solution: Match CI environment exactly
# .node-version (for CI and local)
18.16.0
# CI config (.github/workflows/deploy.yml)
- uses: actions/setup-node@v3
with:
node-version-file: '.node-version'
```
### **Deployment Timeouts**
```yaml
# Problem: Health check fails, deployment rolls back
# Solution: Proper readiness checks
# kubernetes deployment.yaml
readinessProbe:
httpGet:
path: /health
port: 3000
initialDelaySeconds: 30 # Give app time to start
periodSeconds: 10
```
## Step 3: Security & Reliability Standards
### **Secrets Management**
```bash
# NEVER commit secrets
# .env.example (commit this)
DATABASE_URL=postgresql://localhost/myapp
API_KEY=your_key_here
# .env (DO NOT commit - add to .gitignore)
DATABASE_URL=postgresql://prod-server/myapp
API_KEY=actual_secret_key_12345
```
### **Branch Protection**
```yaml
# GitHub branch protection rules
main:
require_pull_request: true
required_reviews: 1
require_status_checks: true
checks:
- "build"
- "test"
- "security-scan"
```
### **Automated Security Scanning**
```yaml
# .github/workflows/security.yml
- name: Dependency audit
run: npm audit --audit-level=high
- name: Secret scanning
uses: trufflesecurity/trufflehog@main
```
## Step 4: Debugging Methodology
**Systematic investigation:**
1. **Check recent changes**
```bash
git log --oneline -10
git diff HEAD~1 HEAD
```
2. **Examine build logs**
- Look for error messages
- Check timing (timeout vs crash)
- Environment variables set correctly?
3. **Verify environment configuration**
```bash
# Compare staging vs production
kubectl get configmap -o yaml
kubectl get secrets -o yaml
```
4. **Test locally using production methods**
```bash
# Use same Docker image CI uses
docker build -t myapp:test .
docker run -p 3000:3000 myapp:test
```
## Step 5: Monitoring & Alerting
### **Health Check Endpoints**
```javascript
// /health endpoint for monitoring
app.get('/health', async (req, res) => {
const health = {
uptime: process.uptime(),
timestamp: Date.now(),
status: 'healthy'
};
try {
// Check database connection
await db.ping();
health.database = 'connected';
} catch (error) {
health.status = 'unhealthy';
health.database = 'disconnected';
return res.status(503).json(health);
}
res.status(200).json(health);
});
```
### **Performance Thresholds**
```yaml
# monitor these metrics
response_time: <500ms (p95)
error_rate: <1%
uptime: >99.9%
deployment_frequency: daily
```
### **Alert Channels**
- Critical: Page on-call engineer
- High: Slack notification
- Medium: Email digest
- Low: Dashboard only
## Step 6: Escalation Criteria
**Escalate to human when:**
- Production outage >15 minutes
- Security incident detected
- Unexpected cost spike
- Compliance violation
- Data loss risk
## CI/CD Best Practices
### **Pipeline Structure**
```yaml
# .github/workflows/deploy.yml
name: Deploy
on:
push:
branches: [main]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- run: npm ci
- run: npm test
build:
needs: test
runs-on: ubuntu-latest
steps:
- run: docker build -t app:${{ github.sha }} .
deploy:
needs: build
runs-on: ubuntu-latest
environment: production
steps:
- run: kubectl set image deployment/app app=app:${{ github.sha }}
- run: kubectl rollout status deployment/app
```
### **Deployment Strategies**
- **Blue-Green**: Zero downtime, instant rollback
- **Rolling**: Gradual replacement
- **Canary**: Test with small percentage first
### **Rollback Plan**
```bash
# Always know how to rollback
kubectl rollout undo deployment/myapp
# OR
git revert HEAD && git push
```
Remember: The best deployment is one nobody notices. Automation, monitoring, and quick recovery are key.Signals
Information
- Repository
- github/awesome-copilot
- Author
- github
- Last Sync
- 3/12/2026
- Repo Updated
- 3/12/2026
- Created
- 1/15/2026
Reviews (0)
No reviews yet. Be the first to review this skill!
Related Skills
upgrade-webkit
Upgrade Bun's Webkit fork to the latest upstream version of Webkit.
upgrade-nodejs
Upgrading Bun's Self-Reported Node.js Version
cursorrules
CrewAI Development Rules
cn-check
Install and run the Continue CLI (`cn`) to execute AI agent checks on local code changes. Use when asked to "run checks", "lint with AI", "review my changes with cn", or set up Continue CI locally.
Related Guides
Bear Notes Claude Skill: Your AI-Powered Note-Taking Assistant
Learn how to use the bear-notes Claude skill. Complete guide with installation instructions and examples.
Mastering tmux with Claude: A Complete Guide to the tmux Claude Skill
Learn how to use the tmux Claude skill. Complete guide with installation instructions and examples.
OpenAI Whisper API Claude Skill: Complete Guide to AI-Powered Audio Transcription
Learn how to use the openai-whisper-api Claude skill. Complete guide with installation instructions and examples.