Finance
pg-query-rewrite - Claude MCP Skill
Guides SQL query optimization through execution plan analysis and common rewrite patterns. Covers subquery-to-JOIN conversion, CTE materialization control, OR-to-UNION, NOT IN-to-NOT EXISTS, pagination optimization, and SELECT * elimination. Uses analyze_query with format and settings parameters for deep plan analysis.
SEO Guide: Enhance your AI agent with the pg-query-rewrite tool. This Model Context Protocol (MCP) server allows Claude Desktop and other LLMs to guides sql query optimization through execution plan analysis and common rewrite patterns. covers su... Download and configure this skill to unlock new capabilities for your AI workflow.
Documentation
SKILL.md# PostgreSQL Query Rewrite Patterns
This skill guides the agent through analyzing SQL queries and applying common restructuring patterns to improve performance, using pgtuner-mcp execution plan analysis tools.
## When to Use This Skill
Use this skill when the user:
- Has a specific slow query and wants it rewritten for better performance
- Asks "how can I make this query faster?"
- Reports that adding indexes didn't help (query structure is the issue)
- Has queries with subqueries, CTEs, OR conditions, or NOT IN that could be improved
- Asks about query optimization patterns or best practices
- Has queries with `SELECT *` or inefficient pagination
## MCP Resources Available
- `pgtuner://docs/tools` -- Reference for `analyze_query` parameters and options
- `pgtuner://query/{query_hash}/stats` -- Get statistics for a specific query by its `queryid`
- `pgtuner://table/{schema}/{table_name}/indexes` -- Check available indexes on involved tables
## Related MCP Prompt
This skill is closely related to the **`query_tuning`** MCP Prompt, which provides a structured query optimization workflow.
## Prerequisites
- The pgtuner-mcp server must be connected with a valid `DATABASE_URI`
- Required: `pg_stat_statements` extension (for identifying slow queries to rewrite)
- The user should provide the specific SQL query to optimize
## Agent Decision Tree
```
Start: User provides a slow query
|
+--> Step 1: Analyze the execution plan
|
+--> Look at the plan for these patterns:
|
+--> SubPlan / Correlated Subquery?
| --> Apply Pattern 1: Subquery to JOIN
|
+--> CTE Scan with many rows materialized?
| --> Apply Pattern 2: CTE Materialization Control
|
+--> BitmapOr / multiple OR conditions?
| --> Apply Pattern 3: OR to UNION ALL
|
+--> NOT IN with possible NULLs?
| --> Apply Pattern 4: NOT IN to NOT EXISTS
|
+--> Seq Scan on query with OFFSET + LIMIT?
| --> Apply Pattern 5: Keyset Pagination
|
+--> Many columns fetched but few used?
| --> Apply Pattern 6: Eliminate SELECT *
|
+--> Sort on non-indexed column?
| --> Recommend index (redirect to pg-index-optimization)
|
+--> No obvious rewrite opportunity?
--> Check statistics freshness (ANALYZE table)
--> Check index recommendations
--> Consider configuration tuning (work_mem, etc.)
```
## Query Rewrite Workflow
### Step 1: Analyze the Original Query
First, get the execution plan in text format for readability:
```
Tool: analyze_query
Parameters:
query: "<the user's slow query>"
analyze: true
buffers: true
settings: true
format: "text"
```
**Why `settings: true`:** Reveals if session-level settings (e.g., `SET work_mem`) affect the plan. Important when the same query behaves differently in different contexts.
**Why `format: "text"`:** The text format shows the indented plan tree that is easiest to reason about visually. For programmatic analysis, also run with `format: "json"` if needed.
**Key plan nodes to identify:**
| Plan Node | Indicates | Potential Rewrite |
|-----------|-----------|-------------------|
| `SubPlan` | Correlated subquery (runs per row) | Convert to JOIN |
| `CTE Scan` | CTE materialized to temp storage | Add `NOT MATERIALIZED` (PG12+) |
| `BitmapOr` | Multiple OR conditions | UNION ALL |
| `Nested Loop` with high loops | Cross-join-like behavior | Review join conditions |
| `Sort` (external merge) | Disk-based sort | Index or increase work_mem |
| `Seq Scan` on large table | Missing index or SELECT * | Add index or limit columns |
| `Hash Join` (many batches) | Hash spilling to disk | Increase work_mem |
### Step 2: Check Table Context
For tables involved in the query, check their indexes and statistics:
Read resource: `pgtuner://table/{schema}/{table_name}/indexes`
```
Tool: get_table_stats
Parameters:
table_name: "<table_name>"
schema_name: "public"
include_indexes: true
```
This reveals:
- Available indexes that the query optimizer could use
- Whether statistics are stale (`last_analyze` timestamp)
- Table size (affects plan choices)
### Step 3: Apply Rewrite Patterns
Based on the execution plan analysis, apply the appropriate patterns below.
---
## Rewrite Patterns
### Pattern 1: Correlated Subquery to JOIN
**Symptom in plan:** `SubPlan` node executing once per row of the outer query.
**Before (slow):**
```sql
SELECT o.id, o.total,
(SELECT c.name FROM customers c WHERE c.id = o.customer_id)
FROM orders o
WHERE o.created_at > '2024-01-01';
```
**After (fast):**
```sql
SELECT o.id, o.total, c.name
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > '2024-01-01';
```
**Why:** The subquery executes once per order row. The JOIN lets PostgreSQL choose the optimal join strategy (hash join, merge join) based on data size.
**Verify:** Re-run `analyze_query` on the rewritten query. The `SubPlan` node should be gone.
### Pattern 2: CTE Materialization Control (PG12+)
**Symptom in plan:** `CTE Scan` materializing many rows into temp storage, then filtering.
**Before (materializes everything):**
```sql
WITH recent_orders AS (
SELECT * FROM orders WHERE created_at > '2024-01-01'
)
SELECT * FROM recent_orders WHERE customer_id = 42;
```
**After (allows predicate pushdown):**
```sql
WITH recent_orders AS NOT MATERIALIZED (
SELECT * FROM orders WHERE created_at > '2024-01-01'
)
SELECT * FROM recent_orders WHERE customer_id = 42;
```
**Or simply inline the CTE:**
```sql
SELECT * FROM orders
WHERE created_at > '2024-01-01' AND customer_id = 42;
```
**Why:** In PG11 and earlier, CTEs are always materialized (an optimization fence). In PG12+, non-recursive CTEs referenced once are inlined by default, but explicit `MATERIALIZED` / `NOT MATERIALIZED` gives control.
**When to KEEP materialization:**
- The CTE is referenced multiple times (avoids re-execution)
- The CTE acts as an intentional optimization fence (rare but valid)
### Pattern 3: OR Conditions to UNION ALL
**Symptom in plan:** `BitmapOr` with multiple `BitmapAnd` children, or sequential scan when OR prevents index use.
**Before (can't use single index efficiently):**
```sql
SELECT * FROM events
WHERE user_id = 42 OR event_type = 'critical';
```
**After (each branch uses its own index):**
```sql
SELECT * FROM events WHERE user_id = 42
UNION ALL
SELECT * FROM events WHERE event_type = 'critical' AND user_id != 42;
```
**Why:** PostgreSQL can use a different index for each UNION branch. The `AND user_id != 42` in the second branch prevents duplicates (alternative: use `UNION` but it adds a sort/dedup step).
**When NOT to apply:**
- If `BitmapOr` is already efficient (small result sets from each condition)
- If the OR conditions share the same indexed column
### Pattern 4: NOT IN to NOT EXISTS
**Symptom in plan:** Anti-join not chosen, or unexpected behavior with NULLs.
**Before (NULL-unsafe, can return wrong results):**
```sql
SELECT * FROM customers
WHERE id NOT IN (SELECT customer_id FROM blacklist);
```
**After (NULL-safe, often faster):**
```sql
SELECT * FROM customers c
WHERE NOT EXISTS (
SELECT 1 FROM blacklist b WHERE b.customer_id = c.id
);
```
**Why:**
- If `blacklist.customer_id` contains any NULL, `NOT IN` returns no rows at all (SQL three-valued logic)
- `NOT EXISTS` handles NULLs correctly
- PostgreSQL can use an Anti Join for `NOT EXISTS`, which is typically faster
**Alternative (also good):**
```sql
SELECT c.* FROM customers c
LEFT JOIN blacklist b ON b.customer_id = c.id
WHERE b.customer_id IS NULL;
```
### Pattern 5: OFFSET/LIMIT to Keyset Pagination
**Symptom in plan:** `Sort` + `Limit` with large OFFSET (e.g., `OFFSET 10000`). PostgreSQL must fetch and discard all rows up to the offset.
**Before (slow at high offsets):**
```sql
SELECT * FROM events
ORDER BY created_at DESC
LIMIT 20 OFFSET 10000;
```
**After (constant time regardless of page depth):**
```sql
SELECT * FROM events
WHERE created_at < '2024-06-15T10:30:00' -- last value from previous page
ORDER BY created_at DESC
LIMIT 20;
```
**Why:** Keyset pagination uses an index range scan starting from the last seen value. Performance is O(1) per page instead of O(N) where N is the offset.
**Requirements:**
- An index on the ORDER BY column(s)
- The ordering must be deterministic (add PK as tiebreaker if needed)
- The application must track the last value from the previous page
### Pattern 6: Eliminate SELECT * and Over-Fetching
**Symptom in plan:** Large `width` in plan nodes, or `Seq Scan` when an `Index Only Scan` could be used.
**Before:**
```sql
SELECT * FROM orders WHERE customer_id = 42;
-- Fetches all 25 columns including large text/json fields
```
**After:**
```sql
SELECT id, order_date, total, status FROM orders WHERE customer_id = 42;
-- Fetches only the 4 needed columns
```
**Why:**
- Reduces I/O (especially for wide tables with TOAST columns)
- Enables `Index Only Scan` if all requested columns are in the index
- Reduces network transfer to the application
**To enable Index Only Scan, create a covering index:**
```sql
CREATE INDEX idx_orders_cust_covering
ON orders (customer_id) INCLUDE (id, order_date, total, status);
```
Then verify its potential impact using `manage_hypothetical_indexes`:
**Step 1: Create the hypothetical covering index**
```
Tool: manage_hypothetical_indexes
Parameters:
action: "create"
table: "orders"
columns: ["customer_id"]
include: ["id", "order_date", "total", "status"]
```
**Step 2: Analyze the query plan with the hypothetical index**
```
Tool: analyze_query
Parameters:
query: "SELECT id, order_date, total, status FROM orders WHERE customer_id = 42"
analyze: false
format: "text"
```
**Step 3: Clean up**
```
Tool: manage_hypothetical_indexes
Parameters:
action: "reset"
```
## Step 4: Verify the Rewrite
After applying any rewrite pattern, ALWAYS verify improvement:
```
Tool: analyze_query
Parameters:
query: "<rewritten query>"
analyze: true
buffers: true
format: "text"
```
**Compare before vs after:**
- Execution time (ms)
- Buffer hits vs reads
- Plan structure changes (SubPlan gone? Index scan instead of Seq scan?)
- Rows estimates vs actuals (did the rewrite help the planner?)
IF the rewrite did NOT improve performance:
- The original plan may already be optimal for the data distribution
- Check if `ANALYZE <table>` is needed (stale statistics)
- Consider the `pg-index-optimization` skill instead
- Consider the `pg-config-tuning` skill for `work_mem` or planner settings
## When to Stop and Ask the User
- **If the query involves business logic**: "This query uses a correlated subquery that may have been intentional for correctness. Would you like me to verify that the rewritten version produces identical results?"
- **If the rewrite changes semantics (NOT IN vs NOT EXISTS with NULLs)**: "The NOT IN may behave differently if the subquery contains NULLs. Do you want me to explain the difference?"
- **If no rewrite helps**: "The query structure appears optimal. The performance issue may be due to missing indexes, stale statistics, or configuration. Should I investigate those areas?"
- **If the query is auto-generated (ORM)**: "This appears to be ORM-generated SQL. Optimizing it may require changes to the ORM query builder or using raw SQL for this specific case."
## PostgreSQL Version Notes
| Feature | Version |
|---------|---------|
| CTE inlining (automatic) | PG12+ (non-recursive, single-reference CTEs) |
| `NOT MATERIALIZED` / `MATERIALIZED` hints | PG12+ |
| `INCLUDE` columns in indexes | PG11+ |
| Parallel query for subqueries | PG12+ (improved in PG14+) |
| Memoize node for nested loops | PG14+ |
| Incremental sort | PG13+ |
## Production Safety
- `analyze_query` with `analyze: true` executes the query. For INSERT/UPDATE/DELETE, the tool wraps in `BEGIN READ ONLY` / `ROLLBACK` for safety, but always use caution.
- Query rewrites should be tested against a representative dataset before deploying to production.
- Verify that the rewritten query returns identical results, especially for NULLs, duplicates, and edge cases.
- `explain_with_indexes` with hypothetical indexes is safe (HypoPG indexes are session-only and never persisted).Signals
Information
- Repository
- isdaniel/pgtuner_mcp
- Author
- isdaniel
- Last Sync
- 5/10/2026
- Repo Updated
- 4/29/2026
- Created
- 3/26/2026
Reviews (0)
No reviews yet. Be the first to review this skill!
Related Skills
cursorrules
CrewAI Development Rules
cn-check
Install and run the Continue CLI (`cn`) to execute AI agent checks on local code changes. Use when asked to "run checks", "lint with AI", "review my changes with cn", or set up Continue CI locally.
CLAUDE
CLAUDE.md
browser-use
Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, or extract information from web pages.
Related Guides
Mastering the Oracle CLI: A Complete Guide to the Claude Skill for Database Professionals
Learn how to use the oracle Claude skill. Complete guide with installation instructions and examples.
Python Django Best Practices: A Comprehensive Guide to the Claude Skill
Learn how to use the python django best practices Claude skill. Complete guide with installation instructions and examples.
Optimize Rell Blockchain Code: A Comprehensive Guide to the Claude Skill
Learn how to use the optimize rell blockchain code Claude skill. Complete guide with installation instructions and examples.