pg-slow-query-diagnosis - Claude MCP Skill
Diagnoses slow PostgreSQL queries by analyzing pg_stat_statements, execution plans, index opportunities, and table statistics. Guides the agent through a systematic investigation workflow using pgtuner-mcp tools to produce actionable optimization recommendations.
Documentation
SKILL.md

# PostgreSQL Slow Query Diagnosis
This skill guides systematic diagnosis and optimization of slow queries in PostgreSQL databases using the pgtuner-mcp server tools.
## When to Use This Skill
Use this skill when the user:
- Reports slow queries or poor query performance
- Asks "why is my query slow?" or "how can I speed up this query?"
- Wants to find and fix the top resource-consuming queries
- Mentions high query latency, timeouts, or slow API responses
- Asks to investigate `pg_stat_statements` data
## MCP Resources Available
Before calling tools, the agent can read lightweight resources for quick context:
- `pgtuner://docs/tools` -- Full tool reference documentation
- `pgtuner://docs/workflows` -- Recommended workflow patterns
- `pgtuner://query/{query_hash}/stats` -- Detailed stats for a specific query by `queryid` (from `get_slow_queries` output)
- `pgtuner://table/{schema}/{table_name}/stats` -- Quick table statistics without running the full tool
- `pgtuner://table/{schema}/{table_name}/indexes` -- Indexes on a specific table
## Related MCP Prompt
This skill corresponds to the **`diagnose_slow_queries`** MCP Prompt and partially to **`query_tuning`**. If the user triggers either prompt, follow this skill's workflow.
## Prerequisites
- The pgtuner-mcp server must be connected with a valid `DATABASE_URI`
- Required: `pg_stat_statements` extension (check availability early -- if it is missing, the tool will return an error)
- Optional: `hypopg` extension for hypothetical index testing
- Optional: `pglast` library on the server for SQL parsing in index recommendations
## Pre-Flight Checks
Before starting the workflow, verify extension availability:
- If `get_slow_queries` returns an error about `pg_stat_statements`, inform the user it must be enabled and provide setup instructions.
- If the user wants hypothetical index testing, use `manage_hypothetical_indexes` with `action: "check"` to verify HypoPG is available. If not, skip Step 4 and rely on heuristic recommendations.
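As a manual fallback, a quick SQL sketch for these checks (the setup commands assume superuser access and, for `pg_stat_statements`, a server restart):

```
-- Check which relevant extensions are already installed
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('pg_stat_statements', 'hypopg');

-- Check what is available to install on this server
SELECT name, default_version
FROM pg_available_extensions
WHERE name IN ('pg_stat_statements', 'hypopg');

-- Typical pg_stat_statements setup (superuser; restart required):
--   ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
--   -- restart PostgreSQL, then:
--   CREATE EXTENSION pg_stat_statements;
```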
## Agent Decision Tree
```
User reports slow queries
|
+--> User has a SPECIFIC query?
| YES --> Skip Step 1, go directly to Step 2 (analyze_query)
| NO --> Start with Step 1 (get_slow_queries)
|
+--> Step 1 returns 0 results?
| --> Check if pg_stat_statements was recently reset
| --> ASK user: "pg_stat_statements has limited data.
| Can you provide a specific query, or should we wait
| for more workload data to accumulate?"
|
+--> Step 2 shows temp_blks_written > 0?
| YES --> work_mem issue. Go to Step 6 (I/O) before indexes.
| NO --> Continue to Step 3 (index recommendations)
|
+--> Step 3 shows < 10% improvement from indexes?
| YES --> Query structure may be the issue.
| Redirect to pg-query-rewrite skill.
| NO --> Continue to Step 4 (verify with HypoPG)
|
+--> Step 2 shows row estimate mismatch > 10x?
YES --> Run ANALYZE on the table FIRST, then re-check
NO --> Continue with index recommendations
```
## Diagnosis Workflow
Follow these steps in order. At each step, analyze the output before proceeding.
### Step 1: Identify the Slowest Queries
Call the `get_slow_queries` tool to retrieve the most expensive queries:
```
Tool: get_slow_queries
Parameters:
limit: 10 # Start with top 10
min_calls: 5 # Focus on queries called frequently enough to matter
order_by: mean_time # Sort by average execution time
```
Also check for over-fetching queries (returning too many rows):
```
Tool: get_slow_queries
Parameters:
limit: 10
min_calls: 5
order_by: rows # Find queries returning the most rows
```
**What to look for:**
- Queries with high `mean_time` (> 100ms warrants investigation)
- Queries with high `total_time` (large cumulative cost even when individual calls are fast)
- High `rows` returned vs actual need (over-fetching -- use `order_by: rows`)
- Queries with low `shared_blks_hit / (shared_blks_hit + shared_blks_read)` cache hit ratio
- Queries with `temp_blks_written > 0` (spilling to disk -- `work_mem` issue)
**Decision point:** If no slow queries are found, check if `pg_stat_statements` has been recently reset. Suggest the user wait for workload data to accumulate, or ask them to provide a specific query.
**Cross-reference with resources:** For any interesting `queryid` found, read `pgtuner://query/{queryid}/stats` for deeper per-query statistics without re-running the tool.
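For orientation, this is roughly the shape of query that `get_slow_queries` runs against `pg_stat_statements` (a sketch assuming PG13+ column names; the tool's actual query may differ):

```
-- Top queries by mean execution time, with cache hit ratio and temp spill
SELECT queryid,
       calls,
       mean_exec_time,
       total_exec_time,
       rows,
       round(shared_blks_hit::numeric
             / NULLIF(shared_blks_hit + shared_blks_read, 0), 4) AS cache_hit_ratio,
       temp_blks_written
FROM pg_stat_statements
WHERE calls >= 5
ORDER BY mean_exec_time DESC
LIMIT 10;
```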
### Step 2: Analyze Execution Plans
For each problematic query identified in Step 1, run EXPLAIN ANALYZE:
```
Tool: analyze_query
Parameters:
query: "<the slow query>"
analyze: true # Execute the query to get actual timing
buffers: true # Show buffer usage (I/O information)
settings: true # Show GUC settings that affect the plan
verbose: false # Keep output readable
format: "text" # Human-readable indented plan
```
The `settings: true` parameter reveals whether non-default settings (e.g., a modified `work_mem`) affect the plan. `format: "text"` produces the classic readable plan; use `format: "json"` for structured, programmatic analysis when needed.
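The underlying statement is a standard `EXPLAIN` invocation; a sketch with a hypothetical slow query (the `orders` table, columns, and predicate are illustrative):

```
-- Equivalent manual EXPLAIN; the SETTINGS option requires PG12+
EXPLAIN (ANALYZE, BUFFERS, SETTINGS, FORMAT TEXT)
SELECT o.id, o.total
FROM orders o
WHERE o.customer_id = 42
  AND o.created_at > now() - interval '30 days';
```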
**What to look for in the execution plan:**
- **Sequential scans on large tables** (> 10,000 rows): Indicates missing indexes
- **Row estimate mismatches**: When `actual rows` differs from `estimated rows` by 10x or more, statistics are stale (run `ANALYZE`)
- **Nested Loop with high iterations**: Consider if a Hash Join or Merge Join would be better
- **Sort operations using disk** (`external merge`): Increase `work_mem` or add an index for the sort key
- **Hash Batches > 1**: Hash table spilling to disk, consider increasing `work_mem`
- **Bitmap Heap Scan with many recheck cond rows**: The index is not selective enough
**Present findings** as a table:
| Issue | Location in Plan | Impact | Recommended Fix |
|-------|-----------------|--------|-----------------|
| Sequential scan | Seq Scan on orders | High - 2.3M rows | Create index on filter columns |
| Row estimate mismatch | ... | Medium | Run ANALYZE on table |
### Step 3: Get Index Recommendations
Use the index advisor to find optimization opportunities:
```
Tool: get_index_recommendations
Parameters:
workload_queries: ["<query1>", "<query2>"] # The slow queries from Step 1
max_recommendations: 10
min_improvement_percent: 10
include_hypothetical_testing: true
```
**What to look for:**
- Indexes with > 50% estimated improvement should be prioritized
- Check if recommended indexes overlap with existing ones
- Consider the write overhead: each index slows INSERT/UPDATE/DELETE
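If a recommendation is accepted, the remediation is a plain `CREATE INDEX`; for example (index and column names illustrative), using `CONCURRENTLY` to avoid blocking writes on a busy table:

```
-- CONCURRENTLY avoids long write locks but cannot run inside a transaction block
CREATE INDEX CONCURRENTLY idx_orders_customer_created
    ON orders (customer_id, created_at);

-- Check for overlap with existing indexes first
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'orders';
```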
### Step 4: Verify with Hypothetical Indexes
For the most promising index recommendations, verify using HypoPG:
```
Tool: explain_with_indexes
Parameters:
query: "<the slow query>"
hypothetical_indexes:
- table: "orders"
columns: ["customer_id", "created_at"]
index_type: "btree"
analyze: false # Use false for hypothetical testing
```
**What to look for:**
- Cost reduction percentage (aim for > 30% to justify the index)
- Plan changes: Sequential Scan -> Index Scan or Bitmap Index Scan
- If improvement is minimal, the index may not be the right solution
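Under the hood this relies on HypoPG; a manual equivalent, if you want to reproduce it in a plain SQL session (assumes the `hypopg` extension is installed; table and columns are illustrative):

```
-- Create a hypothetical index (no disk writes, session-local only)
SELECT * FROM hypopg_create_index(
    'CREATE INDEX ON orders (customer_id, created_at)');

-- Plain EXPLAIN (no ANALYZE) can now consider the hypothetical index
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Discard all hypothetical indexes when done
SELECT hypopg_reset();
```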
### Step 5: Check Table Health
For tables involved in slow queries, check their maintenance status:
```
Tool: get_table_stats
Parameters:
table_name: "<table_name>"
schema_name: "public"
include_indexes: true
```
**What to look for:**
- **High dead tuple ratio** (> 10%): Table needs `VACUUM`
- **Stale statistics** (`last_analyze` is old): Run `ANALYZE`
- **Sequential scan ratio** (seq_scan >> idx_scan): Missing indexes
- **Large table size** with no indexes: Index candidates
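The same signals are visible directly in `pg_stat_user_tables`; a sketch (the `orders` table is illustrative):

```
-- Dead tuple ratio, maintenance timestamps, and scan balance for one table
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0), 3) AS dead_ratio,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze,
       seq_scan, idx_scan
FROM pg_stat_user_tables
WHERE relname = 'orders';
```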
### Step 6: Check Disk I/O (Optional, for complex cases)
If buffer/IO issues were found in the execution plan:
```
Tool: analyze_disk_io_patterns
Parameters:
analysis_type: "all"
top_n: 20
```
**What to look for:**
- Low buffer cache hit ratio (< 99% is a warning for OLTP)
- Tables with high heap_blks_read (cold data)
- Excessive temp file usage (increase `work_mem`)
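A manual approximation of the heap-level signals, using `pg_statio_user_tables`:

```
-- Tables doing the most physical reads, with per-table heap hit ratio
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       round(heap_blks_hit::numeric
             / NULLIF(heap_blks_hit + heap_blks_read, 0), 4) AS heap_hit_ratio
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 20;
```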
## Output Format
Present the final diagnosis as:
### Summary
Brief overview of findings (2-3 sentences).
### Critical Issues (Fix Immediately)
Numbered list of urgent issues with specific remediation SQL.
### Recommended Optimizations
Prioritized list with:
- **Issue**: What was found
- **Impact**: Estimated improvement
- **Fix**: Exact SQL command (CREATE INDEX, ANALYZE, VACUUM, config change)
### Configuration Suggestions
Any `postgresql.conf` changes that would help (e.g., `work_mem`, `effective_cache_size`).
## Common Patterns and Solutions
| Symptom | Likely Cause | Solution |
|---------|-------------|----------|
| Seq Scan on large table | Missing index | CREATE INDEX on WHERE/JOIN columns |
| Row estimate 1 vs actual 50000 | Stale statistics | ANALYZE table_name |
| Sort Method: external merge | work_mem too low | SET work_mem = '256MB' (session) |
| Nested Loop (actual loops=10000) | Bad join strategy | Check join column indexes, statistics |
| Hash Batches: 16 | work_mem too low for hash | Increase work_mem |
| Bitmap Heap Scan (lossy) | Index not selective | Consider composite index |
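For the `work_mem` rows in particular, test the change at session scope before touching `postgresql.conf` (query and value are illustrative):

```
-- Session-scoped experiment: does the sort stop spilling to disk?
SET work_mem = '256MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, sum(total)
FROM orders
GROUP BY customer_id
ORDER BY sum(total) DESC;
-- Look for "Sort Method: quicksort" instead of "external merge" in the output
RESET work_mem;
```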
## Iterative Verification
After applying any fix, always verify:
1. **After creating an index**: Re-run `analyze_query` on the slow query to confirm the plan now uses the new index and timing improved.
2. **After running ANALYZE**: Re-run `analyze_query` to confirm row estimates are now accurate.
3. **After increasing work_mem**: Re-run the query to confirm temp files are eliminated (check `buffers: true` output).
4. **After query rewrite**: Compare before/after execution plans and timing.
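Index usage can also be confirmed from the statistics views after some workload has run (table name illustrative):

```
-- idx_scan should climb if queries are actually using the new index
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'orders'
ORDER BY idx_scan DESC;
```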
## When to Stop and Ask the User
- **Before `analyze_query` with `analyze: true` on write queries**: "This will execute the query. For INSERT/UPDATE/DELETE, the tool wraps it in READ ONLY/ROLLBACK for safety, but please confirm you want to proceed."
- **If no clear performance issue is found**: "The database-side query performance appears normal. The issue may be in the application layer (N+1 queries, missing caching, network latency). Would you like me to investigate further?"
- **If recommendations require superuser**: "Creating this extension requires superuser privileges. Do you have access, or should I provide instructions for your DBA?"
- **If index recommendations have high write overhead**: "This table has X inserts/sec. Adding Y new indexes will increase write latency. Is this acceptable?"
## Cross-References to Other Skills
- **Query needs structural rewrite** (subqueries, CTEs, OR patterns): Use `pg-query-rewrite` skill
- **Index optimization beyond this query**: Use `pg-index-optimization` skill
- **work_mem / configuration tuning needed**: Use `pg-config-tuning` skill
- **I/O patterns need investigation**: Use `pg-io-deep-dive` skill
- **Table bloat causing slow scans**: Use `pg-bloat-analysis` skill
## PostgreSQL Version Notes
| Feature | Version | Impact on This Workflow |
|---------|---------|------------------------|
| `pg_stat_statements` `total_exec_time` | PG13+ (renamed from `total_time`) | Column name may differ |
| `compute_query_id` setting | PG14+ | Required for `queryid` in `pg_stat_statements` |
| `pg_stat_statements.track_planning` | PG13+ | Enables planning time tracking |
| CTE inlining | PG12+ | Non-recursive CTEs may be optimized differently |
| Memoize plan node | PG14+ | Improves nested loop performance |
| Incremental sort | PG13+ | Optimizer may choose different sort strategies |
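A version probe can select the right column names before querying `pg_stat_statements` directly (a sketch; the MCP tools handle this internally):

```
-- PG13 renamed total_time/mean_time to total_exec_time/mean_exec_time
SELECT current_setting('server_version_num')::int >= 130000 AS has_exec_time_columns;
```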
## Important Notes
- `analyze_query` with `analyze: true` actually executes the query. Use caution with INSERT/UPDATE/DELETE statements (they are wrapped in READ ONLY transaction for safety).
- Always verify index recommendations with `explain_with_indexes` before applying in production.
- Consider write amplification: each new index adds overhead to every write operation on the table.
- If `pg_stat_statements` data is sparse, ask the user to collect more workload data first.
- The `PGTUNER_EXCLUDE_USERIDS` environment variable can filter out monitoring user queries from `get_slow_queries` results. Suggest configuring it if internal monitoring queries pollute the results.
## Production Safety
- All diagnostic tools (`get_slow_queries`, `get_table_stats`, `analyze_disk_io_patterns`) are read-only.
- `analyze_query` with `analyze: true` executes the query but wraps it in `BEGIN TRANSACTION READ ONLY` / `ROLLBACK`.
- `explain_with_indexes` creates temporary HypoPG indexes that exist only in the session and are never persisted.
- No tool in this workflow modifies data or schema. All CREATE INDEX / ANALYZE / VACUUM commands are recommendations for the user to execute.
Information
- Repository: isdaniel/pgtuner_mcp
- Author: isdaniel