pg-bloat-analysis - Claude MCP Skill

Detects and remediates table and index bloat in PostgreSQL using pgstattuple analysis. Identifies fragmentation, dead tuples, and wasted space with prioritized maintenance actions including VACUUM, REINDEX, and pg_repack recommendations.

Documentation

SKILL.md
# PostgreSQL Bloat Analysis

This skill guides detection, analysis, and remediation of table and index bloat in PostgreSQL databases using pgtuner-mcp tools.

## When to Use This Skill

Use this skill when the user:

- Reports tables or indexes growing unexpectedly large
- Asks about database bloat, dead tuples, or fragmentation
- Mentions disk space running low
- Sees slow sequential scans on tables that should be small
- Asks "why is my table so large?" or "how do I reclaim disk space?"
- Notices `VACUUM` not keeping up or autovacuum running constantly
- Reports degraded performance after bulk DELETE or UPDATE operations

## MCP Resources Available

- `pgtuner://health/bloat` -- Quick bloat health assessment (lightweight)
- `pgtuner://table/{schema}/{table_name}/stats` -- Table statistics including dead tuple counts
- `pgtuner://settings/autovacuum` -- Autovacuum configuration (relevant for root cause analysis)

## Related MCP Prompt

This skill is referenced by the **`health_check`** MCP Prompt (Step 6: Quick Bloat Scan). For dedicated bloat investigation, follow this skill's full workflow.

## Prerequisites

- The pgtuner-mcp server must be connected with a valid `DATABASE_URI`
- Recommended: the `pgstattuple` extension (for accurate bloat measurement)
- Without `pgstattuple`, the tools fall back to statistical estimation (less accurate)

## Pre-Flight Checks

- If `analyze_table_bloat` or `get_bloat_summary` returns an error about `pgstattuple`, inform the user: "The pgstattuple extension is not installed. Bloat measurements will be estimated from `pg_stat_user_tables` statistics, which is less accurate. For precise measurements, install pgstattuple: `CREATE EXTENSION pgstattuple;`"
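
To verify manually before running the tools, a quick catalog check works (installing the extension requires appropriate privileges):

```sql
-- Check whether pgstattuple is already installed
SELECT extname, extversion FROM pg_extension WHERE extname = 'pgstattuple';

-- Install it if missing
CREATE EXTENSION IF NOT EXISTS pgstattuple;
```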

## Agent Decision Tree

```
User reports bloat concerns
  |
  +--> General "how bloated is my database?"
  |    --> Start with Step 1 (get_bloat_summary)
  |
  +--> Specific table is too large
  |    --> Skip to Step 2 (analyze_table_bloat) for that table
  |
  +--> Disk space running low
  |    --> Start with Step 1, then prioritize high-bloat tables
  |
  +--> After Step 1, triage results:
       |
       +--> Dead tuple ratio > 20% on any table?
       |    --> Deep-dive Step 2, then investigate root cause (Step 5)
       |
       +--> XID age > 800M on any table?
       |    --> URGENT: redirect to pg-vacuum-tuning emergency procedures
       |
       +--> Index bloat > 40%?
       |    --> Step 3 (index bloat analysis), recommend REINDEX
       |
       +--> Autovacuum not keeping up?
            --> Step 4 (vacuum status), then pg-vacuum-tuning skill
```

## Bloat Analysis Workflow

### Step 1: Database-Wide Bloat Survey

Start with a high-level overview:

```
Tool: get_bloat_summary
Parameters:
  schema_name: "public"
  top_n: 20
  min_size_gb: 0.01    # Lower threshold to catch smaller tables too
```

**What this provides:**
- Combined table and index bloat data
- Prioritized maintenance action list (high/medium/low priority)
- Total wasted space estimate

**Triage the results:**
- **High priority**: Tables with dead tuple ratio > 20% or approaching wraparound
- **Medium priority**: Tables with dead tuple ratio 10-20% or significant index bloat
- **Low priority**: Minor bloat that autovacuum should handle

### Step 2: Deep Table Bloat Analysis

For tables flagged in the survey, check dead tuple density:

```
Tool: get_table_stats
Parameters:
  schema_name: "public"
  include_indexes: false
  order_by: "dead_tuples"
```

Then do a deep analysis on the worst offenders:

```
Tool: analyze_table_bloat
Parameters:
  table_name: "<flagged_table>"
  schema_name: "public"
  use_approx: false      # Use exact measurement for important tables
  include_toast: true     # Check TOAST tables too (large text/json columns)
```

For very large tables (> 100 GB), use approximate mode to avoid long scans:
```
Tool: analyze_table_bloat
Parameters:
  table_name: "<large_table>"
  use_approx: true
  min_table_size_gb: 0.1
```

**Interpret the results:**

| Metric | Healthy | Warning | Critical |
|--------|---------|---------|----------|
| dead_tuple_percent | < 5% | 5-20% | > 20% |
| free_space_percent | < 10% | 10-30% | > 30% |
| tuple_percent (live data density) | > 80% | 50-80% | < 50% |

**Severity scoring (from the tool):**
- Score >= 6: Critical bloat
- Score >= 4: High bloat
- Score >= 2: Moderate bloat
- Score >= 1: Low bloat
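
The tool's numbers can be cross-checked against pgstattuple directly. A sketch, using the `orders` table from the example output below; the exact form scans the whole table, so prefer the approximate form on large tables:

```sql
-- Exact measurement: scans the entire table
SELECT table_len, tuple_percent, dead_tuple_percent, free_percent
FROM pgstattuple('public.orders');

-- Approximate measurement (PG9.5+): uses the visibility map, much cheaper
SELECT table_len, approx_tuple_percent, dead_tuple_percent, approx_free_percent
FROM pgstattuple_approx('public.orders');
```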

### Step 3: Index Bloat Analysis

For indexes on bloated tables:

```
Tool: analyze_index_bloat
Parameters:
  table_name: "<table_name>"
  schema_name: "public"
  min_index_size_gb: 0.01
  min_bloat_percent: 15
```

**Interpret the results:**

| Index Type | Metric | Healthy | Bloated |
|-----------|--------|---------|---------|
| B-tree | avg_leaf_density | > 80% | < 70% |
| B-tree | estimated_bloat | < 20% | > 30% |
| GIN | pending_pages | 0 | > 100 |

**Common causes of index bloat:**
- HOT updates not possible (indexed columns being updated)
- Long-running transactions preventing dead tuple cleanup
- Monotonically increasing keys with deletions (B-tree right-growth pattern)
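
These metrics can also be read straight from pgstattuple's index functions (the GIN index name here is a hypothetical placeholder):

```sql
-- B-tree density and fragmentation
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('public.idx_orders_date');

-- GIN pending-list size
SELECT pending_pages, pending_tuples
FROM pgstatginindex('public.some_gin_index');
```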

### Step 4: Vacuum Status Check

Understand why bloat accumulated:

```
Tool: monitor_vacuum_progress
Parameters:
  action: "needs_vacuum"
  schema_name: "public"
  min_dead_tuples: 100
```

Check autovacuum configuration:
```
Tool: monitor_vacuum_progress
Parameters:
  action: "autovacuum_status"
```

Check recent vacuum history:
```
Tool: monitor_vacuum_progress
Parameters:
  action: "recent_activity"
```

**What to look for:**
- Tables that need vacuum but haven't been vacuumed
- Autovacuum workers at capacity (all slots busy)
- Tables with aggressive thresholds that never get vacuumed
- Transaction wraparound risk (XID age approaching `autovacuum_freeze_max_age`)
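
Wraparound risk in particular can be confirmed with a direct catalog query:

```sql
-- Tables with the oldest frozen XIDs (higher age = closer to wraparound)
SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY age(c.relfrozenxid) DESC
LIMIT 10;
```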

### Step 5: Root Cause Analysis

Based on the data collected, identify why bloat is accumulating:

| Symptom | Root Cause | Evidence |
|---------|-----------|----------|
| High dead tuples, recent vacuum | Update/delete rate exceeds vacuum throughput | `n_dead_tup` still high after vacuum |
| High dead tuples, no recent vacuum | Autovacuum not running on this table | `last_autovacuum` is NULL or very old |
| High dead tuples, vacuum running | Long-running transaction holding back cleanup | `backend_xmin` in `pg_stat_activity` |
| Index bloat but table is OK | HOT updates not possible | Updated columns are indexed |
| Free space high, dead tuples low | Space not returned to OS after mass delete | Table was VACUUMed but not reclaimed |
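
For the long-running-transaction case, the sessions holding back cleanup can be found in `pg_stat_activity`:

```sql
-- Sessions whose backend_xmin prevents vacuum from removing dead tuples, oldest first
SELECT pid, state, xact_start, age(backend_xmin) AS xmin_age, left(query, 60) AS query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC
LIMIT 5;
```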

## Remediation Plan

### For Table Bloat

**Mild bloat (dead tuples 5-20%):**
```sql
-- Standard VACUUM (reclaims dead tuples, does NOT return space to OS)
VACUUM VERBOSE schema.table_name;
```

**Severe bloat (dead tuples > 20% or free space > 30%):**
```sql
-- VACUUM FULL rewrites the table (LOCKS TABLE - schedule during maintenance window)
VACUUM FULL VERBOSE schema.table_name;
-- Then update statistics
ANALYZE schema.table_name;
```

**Alternative for production (no downtime):**
```bash
# pg_repack (requires the pg_repack extension and client binary)
# Rewrites the table online without holding an exclusive lock
pg_repack --table schema.table_name --no-superuser-check -d dbname
```

### For Index Bloat

**Moderate index bloat (20-40%):**
```sql
-- REINDEX CONCURRENTLY (PostgreSQL 12+, no blocking)
REINDEX INDEX CONCURRENTLY schema.index_name;
```

**Severe index bloat (> 40%):**
```sql
-- For all indexes on a table
REINDEX TABLE CONCURRENTLY schema.table_name;
```

### Autovacuum Tuning

If autovacuum is not keeping up:

```sql
-- Per-table autovacuum tuning for high-churn tables
ALTER TABLE schema.table_name SET (
  autovacuum_vacuum_scale_factor = 0.01,      -- Vacuum at 1% dead tuples (default 20%)
  autovacuum_vacuum_threshold = 1000,          -- Minimum dead tuples before vacuum
  autovacuum_analyze_scale_factor = 0.005,     -- Analyze at 0.5% changes
  autovacuum_vacuum_cost_delay = '2ms'           -- Faster vacuum (default is 20ms on PG<12, 2ms on PG12+)
);
```
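
After applying per-table settings, they can be verified in `pg_class.reloptions` (placeholder names as above):

```sql
-- Show storage parameters set on the table
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'table_name'
  AND relnamespace = 'schema'::regnamespace;
```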

Global autovacuum tuning:
```sql
-- Make autovacuum more aggressive globally
ALTER SYSTEM SET autovacuum_max_workers = 6;           -- Default 3
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms'; -- Default 2ms on PG12+ (20ms before)
ALTER SYSTEM SET autovacuum_naptime = '15s';            -- Default 1min
-- Requires reload: SELECT pg_reload_conf();
```
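
The effective autovacuum settings, and where each comes from (config file vs. `ALTER SYSTEM`), can be checked in `pg_settings`:

```sql
SELECT name, setting, unit, source
FROM pg_settings
WHERE name LIKE 'autovacuum%'
ORDER BY name;
```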

## Output Format

### Bloat Summary

| Object | Type | Size | Bloat % | Wasted Space | Priority | Action |
|--------|------|------|---------|-------------|----------|--------|
| orders | table | 12 GB | 35% | 4.2 GB | High | VACUUM FULL or pg_repack |
| idx_orders_date | index | 3.2 GB | 42% | 1.3 GB | High | REINDEX CONCURRENTLY |

### Root Cause
Brief explanation of why bloat accumulated.

### Maintenance Script
Ordered SQL commands to resolve the issues, with timing and locking notes.

### Prevention Recommendations
Autovacuum tuning and application-level changes to prevent recurrence.

## Important Notes

- `analyze_table_bloat` with `use_approx: false` reads the entire table (I/O intensive on large tables)
- `VACUUM FULL` requires an exclusive lock on the table (blocks all reads and writes)
- `REINDEX CONCURRENTLY` requires PostgreSQL 12+
- Always run `ANALYZE` after `VACUUM FULL` or `REINDEX` to update statistics
- TOAST table bloat is often overlooked: use `include_toast: true` for tables with large text/json columns
- `pgstattuple` extension must be installed for accurate measurements. Without it, results are estimated.

## Iterative Verification

After applying remediation:

1. **After VACUUM**: Re-run `analyze_table_bloat` to confirm dead tuples were cleaned up. Note: `VACUUM` does not reduce table size on disk (only marks space as reusable).
2. **After VACUUM FULL**: Re-run `analyze_table_bloat` to confirm both dead tuples and free space were reclaimed.
3. **After REINDEX**: Re-run `analyze_index_bloat` to confirm avg_leaf_density improved.
4. **After autovacuum tuning**: Monitor with `monitor_vacuum_progress` `action: "progress"` over the next hours/days to confirm vacuum is keeping up.
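
Size comparisons before and after remediation are easiest with the built-in size functions (recorded before, compared after `VACUUM FULL`, `REINDEX`, or pg_repack; plain `VACUUM` will not shrink these):

```sql
SELECT pg_size_pretty(pg_relation_size('public.orders'))       AS heap,
       pg_size_pretty(pg_indexes_size('public.orders'))        AS indexes,
       pg_size_pretty(pg_total_relation_size('public.orders')) AS total_incl_toast;
```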

## When to Stop and Ask the User

- **Before recommending VACUUM FULL**: "VACUUM FULL requires an exclusive lock on the table, blocking all reads and writes for the duration. For a X GB table, this could take Y minutes. When is your maintenance window? Alternatively, pg_repack can do this online."
- **If pgstattuple is not installed**: "Bloat measurements are estimated. For accurate results, install pgstattuple. Would you like to proceed with estimates?"
- **If bloat is moderate (< 20%)**: "The bloat level is within normal range and autovacuum should handle it. No immediate action needed. Would you like me to tune autovacuum for better ongoing maintenance?"
- **If root cause is long-running transactions**: "Bloat is accumulating because of idle-in-transaction sessions blocking vacuum. This requires application-level fixes. Would you like me to investigate the blocking sessions?"

## Cross-References to Other Skills

- **Autovacuum tuning for bloat prevention**: Use `pg-vacuum-tuning` skill
- **Index-specific bloat and optimization**: Use `pg-index-optimization` skill
- **I/O impact of bloated tables**: Use `pg-io-deep-dive` skill
- **Lock issues blocking vacuum**: Use `pg-lock-diagnosis` skill

## PostgreSQL Version Notes

| Feature | Version |
|---------|---------|
| `pgstattuple_approx()` | PG9.5+ |
| `REINDEX CONCURRENTLY` | PG12+ |
| `pg_stat_progress_vacuum` | PG9.6+ |
| Improved autovacuum cost delay default (2ms) | PG12+ |

## Production Safety

- `analyze_table_bloat` and `analyze_index_bloat` are read-only but I/O intensive (especially with `use_approx: false` on large tables)
- `get_bloat_summary` is read-only and safe
- All VACUUM, REINDEX, and ALTER TABLE commands are recommendations, not executed by the tools
- For large tables (> 50 GB), always use `use_approx: true` to avoid long scan times


Information

- Repository: isdaniel/pgtuner_mcp
- Author: isdaniel
- Repo Updated: 4/29/2026
- Created: 3/26/2026
