The Core Problem: Operating Without Visibility

Anyone managing a PostgreSQL database has faced the same recurring question during a long-running operation: "Is it done yet?" Index creation, vacuuming, bulk data loads, and base backups can run for minutes or hours, and without proper visibility they behave like black boxes. PostgreSQL's progress reporting system addresses this by exposing the internal state of these operations through live, queryable system views, with no log parsing and no guesswork.

In production environments spanning fintech, SaaS, and e-commerce stacks, progress visibility is often among the first tools DBAs reach for during maintenance windows and live migrations.

What Is PostgreSQL Progress Reporting?

Progress reporting in PostgreSQL refers to a collection of dynamic system views that reflect the real-time status of long-running internal operations. These are in-memory, live views — they show what PostgreSQL is doing right now, updated continuously as operations proceed. No additional configuration or logging is required to use them.

Why It Matters: Operational Benefits

Before these views existed, DBAs had limited options: parse logs, use pg_stat_activity for rough signals, or simply wait. This created real uncertainty around maintenance windows, disaster recovery tests, and bulk operations. Progress reporting addresses this across several dimensions:

Bottleneck Detection — Identify exactly which phase of an index build or vacuum is consuming the most time, rather than guessing from logs.

Automation-Ready Metrics — These views are standard SQL-queryable, making them easy to integrate into monitoring scripts, alerting pipelines, and auto-scaling triggers.

Better Planning — Completion percentages derived from fields like blocks_done vs. blocks_total allow teams to schedule follow-up tasks and communicate reliable timelines.

Stuck Operation Detection — When an operation stalls due to lock contention, I/O saturation, or waiting transactions, the phase column makes it immediately visible rather than requiring deep investigation.

Confident Maintenance Windows — Live monitoring of operations like VACUUM and CLUSTER makes it easier to decide whether to let an operation continue or intervene before it overruns a scheduled window.

Reliable ETAs for Stakeholders — Instead of vague estimates, teams can share data-backed completion percentages, which is particularly important when coordinating across teams during migrations or upgrades.

Crucially, these views are lightweight and read from in-memory statistics, so querying them does not meaningfully impact database performance.
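The completion percentages and ETAs described above come down to simple arithmetic on two samples of a counter pair such as blocks_done and blocks_total. The following is a minimal Python sketch: the function names and sample numbers are illustrative assumptions, and the ETA is a linear extrapolation that assumes a constant processing rate.

```python
from typing import Optional

def percent_done(done: int, total: int) -> Optional[float]:
    """Completion percentage, or None when the total is not known yet (0)."""
    if total <= 0:
        return None
    return round(100.0 * done / total, 1)

def eta_seconds(done_then: int, done_now: int, total: int,
                elapsed_s: float) -> Optional[float]:
    """Linear ETA from two samples of a 'done' counter taken elapsed_s apart."""
    if elapsed_s <= 0:
        return None
    rate = (done_now - done_then) / elapsed_s  # units processed per second
    if rate <= 0:
        return None  # no progress between samples: the operation may be stalled
    return (total - done_now) / rate

# Hypothetical samples: 100 blocks done, then 161 blocks done 10 seconds later.
print(percent_done(161, 2616))
print(eta_seconds(100, 161, 2616, 10.0))
```

Returning None instead of a number when the counter did not advance keeps "no estimate available" distinct from "finished", which matters if the value feeds an alert.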

The Complete List of Progress Views (PostgreSQL 18)

PostgreSQL provides six dedicated progress-reporting views, each targeting a specific operation:

  • pg_stat_progress_vacuum: tracks VACUUM, both manual and autovacuum (VACUUM FULL is reported in pg_stat_progress_cluster instead)
  • pg_stat_progress_analyze: tracks ANALYZE
  • pg_stat_progress_create_index: monitors CREATE INDEX and REINDEX
  • pg_stat_progress_cluster: tracks table rewrites by CLUSTER and VACUUM FULL
  • pg_stat_progress_copy: monitors COPY FROM/TO operations
  • pg_stat_progress_basebackup: tracks base backup progress
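Every one of these views has a pid column, so any of them can be joined to pg_stat_activity to see the statement and wait state behind a progress row. The sketch below builds such a query in Python. The dictionary keys and the progress_query helper are hypothetical names chosen for illustration; the view names and the pg_stat_activity columns are real.

```python
# Command keyword -> progress view (the six views listed above).
PROGRESS_VIEWS = {
    "VACUUM": "pg_stat_progress_vacuum",
    "ANALYZE": "pg_stat_progress_analyze",
    "CREATE INDEX": "pg_stat_progress_create_index",
    "CLUSTER": "pg_stat_progress_cluster",
    "COPY": "pg_stat_progress_copy",
    "BASE BACKUP": "pg_stat_progress_basebackup",
}

def progress_query(command: str) -> str:
    """Build a query joining one progress view to pg_stat_activity on pid."""
    view = PROGRESS_VIEWS[command.upper()]  # KeyError for unknown commands
    return (
        "SELECT a.state, a.wait_event_type, a.wait_event, a.query, p.* "
        f"FROM {view} AS p JOIN pg_stat_activity AS a USING (pid)"
    )

print(progress_query("copy"))
```

The view name comes from a fixed dictionary rather than from user input, so interpolating it into the SQL string is safe here.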

1. Monitoring VACUUM: pg_stat_progress_vacuum

When to use it: Query this view whenever autovacuum or a manual VACUUM is running on a large table — especially during post-bulk-load cleanup or when autovacuum appears to be running unusually slowly.

A sample output from the blog shows a VACUUM in the "scanning heap" phase on a table with 73,334 heap blocks total, with scanning just beginning.

Key fields to monitor:

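  • phase: one of initializing, scanning heap, vacuuming indexes, vacuuming heap, cleaning up indexes, truncating heap, or performing final cleanup; the scanning heap, vacuuming indexes, and vacuuming heap phases can repeat on tables with many dead tuples
  • heap_blks_scanned / heap_blks_total: use these to derive a completion percentage for the heap scan
  • num_dead_item_ids (named num_dead_tuples before PostgreSQL 17): dead item identifiers collected since the last index vacuum cycle
  • index_vacuum_count: the number of index vacuum cycles completed so far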
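As a rough illustration of how these fields combine, the sketch below pairs a query against pg_stat_progress_vacuum with a small Python formatter. The formatter name and the table name "orders" are hypothetical, and the sample row reuses the 73,334-block figure from the output described above. Note that relid::regclass only resolves to a table name for tables in the database you are connected to.

```python
# Query text to run with any PostgreSQL driver; columns are from the view itself.
VACUUM_PROGRESS_SQL = """
SELECT pid,
       relid::regclass AS table_name,
       phase,
       heap_blks_total,
       heap_blks_scanned,
       heap_blks_vacuumed,
       index_vacuum_count
FROM pg_stat_progress_vacuum
"""

def summarize_vacuum(row: dict) -> str:
    """Format one result row (as a dict) into a one-line status string."""
    total = row["heap_blks_total"]
    if total == 0:
        return f'{row["table_name"]}: {row["phase"]} (no heap blocks)'
    scanned = 100.0 * row["heap_blks_scanned"] / total
    vacuumed = 100.0 * row["heap_blks_vacuumed"] / total
    return (f'{row["table_name"]}: {row["phase"]}, '
            f'{scanned:.1f}% scanned, {vacuumed:.1f}% vacuumed, '
            f'{row["index_vacuum_count"]} index cycles')

sample = {"table_name": "orders", "phase": "scanning heap",
          "heap_blks_total": 73334, "heap_blks_scanned": 0,
          "heap_blks_vacuumed": 0, "index_vacuum_count": 0}
print(summarize_vacuum(sample))
```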

2. Monitoring ANALYZE: pg_stat_progress_analyze

When to use it: Most useful when large tables are being analyzed after bulk loads, or when autoanalyze is running longer than expected and you want to understand how far along it is.

A sample output shows an ANALYZE in the "acquiring sample rows" phase, with 517 out of 2,616 sample blocks already scanned.

Key fields to monitor:

  • phase: one of initializing, acquiring sample rows, acquiring inherited sample rows, computing statistics, computing extended statistics, or finalizing analyze
  • sample_blks_scanned / sample_blks_total: gives sampling completion percentage
  • ext_stats_computed / ext_stats_total: tracks progress on extended statistics
  • child_tables_done / child_tables_total: relevant when analyzing partitioned tables or inheritance parents
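Which counter pair is worth watching depends on the phase. One way to express that is a small selector, sketched here in Python. The function name and the mapping from phase to counter pair are an assumption made for illustration; the phase strings and column names are those of pg_stat_progress_analyze.

```python
def analyze_progress(row: dict):
    """Return (label, done, total) for the counters to watch in this phase."""
    phase = row["phase"]
    if phase == "acquiring sample rows":
        return ("sample blocks",
                row["sample_blks_scanned"], row["sample_blks_total"])
    if phase == "acquiring inherited sample rows":
        return ("child tables",
                row["child_tables_done"], row["child_tables_total"])
    if phase == "computing extended statistics":
        return ("extended statistics",
                row["ext_stats_computed"], row["ext_stats_total"])
    # Other phases expose no counter pair that tracks completion.
    return (phase, None, None)

# Sample values from the output described above: 517 of 2,616 blocks scanned.
sample = {"phase": "acquiring sample rows",
          "sample_blks_scanned": 517, "sample_blks_total": 2616,
          "ext_stats_computed": 0, "ext_stats_total": 0,
          "child_tables_done": 0, "child_tables_total": 0}
label, done, total = analyze_progress(sample)
print(label, round(100.0 * done / total, 1))
```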

3. Monitoring Index Builds: pg_stat_progress_create_index

When to use it: Index creation on large tables can take considerable time, especially in CONCURRENTLY mode. This view shows exactly which build phase is underway, making it far easier to estimate completion and diagnose slowdowns.

The blog shows two phases captured in sequence — first the initializing phase (where all block and tuple counts are zero), then the "building index: scanning table" phase where 161 of 2,616 blocks have been processed.

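All phases in order (the "waiting for" and "index validation" phases occur only in CONCURRENTLY mode):

  1. initializing
  2. waiting for writers before build
  3. building index (B-tree builds report the sub-phases scanning table, sorting live tuples, and loading tuples in tree, for example "building index: scanning table")
  4. waiting for writers before validation
  5. index validation: scanning index
  6. index validation: sorting tuples
  7. index validation: scanning table
  8. waiting for old snapshots
  9. waiting for readers before marking dead
  10. waiting for readers before dropping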

Key fields to monitor:

  • phase — identifies exactly which build stage is in progress
  • blocks_done / blocks_total — compute completion percentage during the scan phase
  • tuples_done / tuples_total — relevant during the sorting phase
  • partitions_done — useful for CREATE INDEX on partitioned tables

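If an index build appears stuck, the phase column shows whether it is in one of the "waiting for" phases. In that case lockers_done / lockers_total and current_locker_pid identify the sessions it is waiting on. A stall inside a building phase, for example from I/O saturation, shows up as counters that stop advancing rather than as a distinct phase.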
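A hedged sketch of that diagnosis in Python follows. The function name and sample rows are hypothetical; lockers_total, lockers_done, and current_locker_pid are columns of pg_stat_progress_create_index that describe the sessions a waiting build depends on.

```python
def diagnose_index_build(row: dict) -> str:
    """Classify a pg_stat_progress_create_index row as blocked or working."""
    phase = row["phase"]
    if phase.startswith("waiting for"):
        remaining = row["lockers_total"] - row["lockers_done"]
        return (f"blocked in '{phase}': {remaining} locker(s) remaining, "
                f"currently waiting on pid {row['current_locker_pid']}")
    if row["blocks_total"] > 0:
        pct = 100.0 * row["blocks_done"] / row["blocks_total"]
        return f"working in '{phase}': {pct:.1f}% of blocks done"
    return f"working in '{phase}'"

scanning = {"phase": "building index: scanning table",
            "lockers_total": 0, "lockers_done": 0, "current_locker_pid": 0,
            "blocks_total": 2616, "blocks_done": 161}
waiting = {"phase": "waiting for old snapshots",
           "lockers_total": 3, "lockers_done": 1, "current_locker_pid": 4242,
           "blocks_total": 0, "blocks_done": 0}
print(diagnose_index_build(scanning))
print(diagnose_index_build(waiting))
```

The pid reported for a blocked build can then be looked up in pg_stat_activity to see which session is holding things up.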

4. Monitoring CLUSTER: pg_stat_progress_cluster

When to use it: The CLUSTER command physically rewrites an entire table in index order, a heavy operation that holds an ACCESS EXCLUSIVE lock on the table for its duration. This view, which also reports VACUUM FULL, lets DBAs track progress and plan maintenance windows accordingly, since a CLUSTER that overruns its window can cause significant disruption.

A sample output shows a CLUSTER in the "writing new heap" phase, having scanned all 2,630 heap blocks and written 1,303 tuples so far.

Key fields to monitor:

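  • phase: one of initializing, seq scanning heap, index scanning heap, sorting tuples, writing new heap, swapping relation files, rebuilding index, or performing final cleanup
  • heap_tuples_written / heap_tuples_scanned: row-level rewrite progress
  • heap_blks_scanned / heap_blks_total: block-level scan progress, which advances only during the seq scanning heap phase
  • index_rebuild_count: how many indexes have been rebuilt so far during the operation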

5. Monitoring COPY Operations: pg_stat_progress_copy

When to use it: COPY is the standard mechanism for bulk data loads and exports. This view is invaluable during ETL jobs and migrations, allowing teams to calculate load speed and estimate when a large import will finish.

A sample output shows a COPY FROM FILE operation with 100,073,472 bytes processed out of 137,777,792 bytes total, with 3,652,000 tuples loaded — working out to approximately 72.6% completion.

Key fields to monitor:

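  • bytes_processed / bytes_total: direct completion ratio (multiply by 100); bytes_total is the source file size for COPY FROM and is 0 when the size is not available
  • tuples_processed: total rows processed so far
  • tuples_excluded / tuples_skipped: tuples_excluded counts rows filtered out by the WHERE clause of COPY; tuples_skipped (PostgreSQL 17 and later) counts rows skipped because they contained malformed data, so a rising value flags data quality issues mid-load
  • type: identifies whether the source or destination is FILE, PROGRAM, PIPE (used for COPY FROM STDIN and COPY TO STDOUT), or CALLBACK (used, for example, during initial table synchronization in logical replication)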
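The 72.6% figure in the sample above can be reproduced directly, and the same row supports a rough projection of the final row count. In this Python sketch the function names are hypothetical, and the projection assumes the average row size seen so far holds for the rest of the file.

```python
from typing import Optional

def copy_percent(bytes_processed: int, bytes_total: int) -> Optional[float]:
    """Completion percentage, or None when bytes_total is reported as 0."""
    if bytes_total == 0:
        return None  # size unknown, e.g. data arriving over a pipe
    return round(100.0 * bytes_processed / bytes_total, 1)

def projected_total_tuples(bytes_processed: int, bytes_total: int,
                           tuples_processed: int) -> Optional[int]:
    """Estimate the final row count from the average bytes per row so far."""
    if bytes_total == 0 or bytes_processed == 0 or tuples_processed == 0:
        return None
    bytes_per_tuple = bytes_processed / tuples_processed
    return round(bytes_total / bytes_per_tuple)

# Values from the sample output described above.
print(copy_percent(100_073_472, 137_777_792))  # 72.6
print(projected_total_tuples(100_073_472, 137_777_792, 3_652_000))
```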

6. Monitoring Base Backups: pg_stat_progress_basebackup

When to use it: Base backups can run for a long time on large databases or slow storage. This view tells you exactly which phase the backup is in and how much data has been streamed, removing uncertainty from a critical operational process.

A sample output shows a backup in the "waiting for checkpoint to finish" phase, with no data streamed yet.

All phases in order:

  1. Initializing
  2. Waiting for checkpoint to finish
  3. Estimating backup size
  4. Streaming database files
  5. Waiting for WAL archiving to finish
  6. Transferring WAL files

Key fields to monitor:

  • phase: a prolonged pause on "waiting for checkpoint to finish" may indicate checkpoint pressure on the server
  • backup_streamed / backup_total: bytes transferred vs. estimated total (note: backup_total remains NULL until the size estimation phase completes, and stays NULL if estimation is disabled with pg_basebackup --no-estimate-size)
  • tablespaces_streamed / tablespaces_total: relevant for databases using multiple tablespaces
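Because backup_total can be NULL, any percentage calculation needs to handle a missing total explicitly. A minimal Python sketch, assuming the row has been fetched into plain values with NULL mapped to None (the function name is hypothetical):

```python
from typing import Optional

def backup_status(phase: str, backup_streamed: int,
                  backup_total: Optional[int]) -> str:
    """Describe base backup progress, tolerating an unknown total size."""
    if backup_total is None or backup_total == 0:
        return f"{phase}: {backup_streamed} bytes streamed, total not estimated"
    # The total is only an estimate, so cap the percentage at 100.
    pct = min(100.0, 100.0 * backup_streamed / backup_total)
    return f"{phase}: {pct:.1f}% of estimated size streamed"

print(backup_status("waiting for checkpoint to finish", 0, None))
print(backup_status("streaming database files", 250, 1000))
```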

The Bigger Picture

Taken together, PostgreSQL's progress reporting views transform long-running maintenance operations from opaque, anxiety-inducing processes into transparent, monitorable workflows. DBAs gain phase-level insight into what PostgreSQL is doing at any moment. This enables faster troubleshooting, more confident maintenance planning, accurate stakeholder communication, and more robust monitoring automation, with no additional configuration and negligible overhead from querying the views.