PostgreSQL 18 introduces native "Skip Scan" for multicolumn B-tree indexes, a major optimizer enhancement. However, a common misconception is that pre-v18 versions purely resort to sequential scans when the leading column isn't filtered. In reality, the cost-based planner has long been capable of leveraging such indexes when efficient.
How it works under the hood:
In pre-v18 versions, if an index is significantly smaller than the table (the heap), scanning the entire index to find matching rows—even without utilizing the tree structure to "skip"—can be cheaper than a sequential table scan. This is sometimes also referred as a Full-Index Scan. While this lacks v18’s ability to hop between distinct leading keys, it effectively still speeds up queries on non-leading columns for many common workloads.
Notably, we would see why this v18 improvement is not a game-changer for all workloads, and why you shouldn't assume speed-up for all kinds of datasets.
Benchmarks: v17 vs v18
To understand when the new v18 Skip Scan helps, we tested two distinct scenarios.
Scenario A: Low Cardinality (The "Success" Case)
The Setup:
We created an index on (bid, abalance) and ran the following query:
SELECT COUNT(*) FROM pgbench_accounts WHERE abalance = -2187;
Note: We did not run a read-write workload, so abalance is 0 for all rows. The query returns 0 rows.
Data Statistics:
- Table Rows: 1,000,000
- Leading Column (
bid): ~10 distinct values (Low Cardinality). - Filter Column (
abalance): 1 distinct value (Uniform).
Results:
| Version | Strategy | TPS | Avg Latency | Buffers Hit |
|---|---|---|---|---|
| v17 | Index Only Scan (Full) | ~2,511 | 0.40 ms | 845 |
| v18 | Index Only Scan (Skip Scan) | ~14,382 | 0.07 ms | 39 |
Why the massive 5.7x gain?
Pre-v18 Postgres sees that bid is not constrained. It decides its only option is to read the entire index (all leaf pages) to find rows where abalance = -2187. It scans 1,000,000 items (845 pages).
Postgres v18 uses Skip Scan. It knows bid comes first. instead of scanning sequentially, it:
- Seeks to the first
bid(e.g., 1). Checks forabalance = -2187. - "Skips" to the next unique
bid(e.g., 2). Checks forabalance = -2187. - Repeats for all 10 branches.
It effectively turns one giant scan into 10 small seeks. The EXPLAIN output confirms this with Index Searches: 11 (10 branches + 1 stop condition) and only 39 buffer hits.
v18 Skip Scan Plan (Scenario A):
Index Only Scan using pgbench_accounts_bid_abalance_idx on public.pgbench_accounts ...
Index Cond: (pgbench_accounts.abalance = '-2187'::integer)
Heap Fetches: 0
Index Searches: 11
Buffers: shared hit=39
Scenario B: High Cardinality (The "Failure" Case)
The Setup:
We used the same query, but against an index on (aid, abalance).
Data Statistics:
- Leading Column (
aid): 1,000,000 distinct values (100% Unique / High Cardinality).
Results:
| Version | Strategy | TPS | Avg Latency | Buffers Hit |
|---|---|---|---|---|
| v17 | Index Only Scan (Full) | ~55.8 | 17.92 ms | 2735 |
| v18 | Index Only Scan (Full) | ~51.2 | 19.52 ms | 2741 |
Why no improvement?
If v18 attempted to "Skip Scan" here, it would have to skip 1,000,000 times (once for every unique aid). Performing 1 million seeks is significantly heavier than simply reading the 1 million index entries linearly (which benefits from sequential I/O and page pre-fetching). The planner correctly estimated this cost and fell back to the standard "Index Only Scan" used in v17.
v18 Plan (Identical to v17):
Index Only Scan using pgbench_accounts_aid_abalance_idx on public.pgbench_accounts ...
Index Cond: (pgbench_accounts.abalance = '-2187'::integer)
Heap Fetches: 0
Index Searches: 1
Buffers: shared hit=2741
How to Identify a Skip Scan
You might notice that the node type in the EXPLAIN output remains Index Scan or Index Only Scan in both cases. PostgreSQL 18 does not introduce a special "Skip Scan" node.
Instead, you must look at the Index Searches line (visible with EXPLAIN (ANALYZE)):
- v17 (and older): The
Index Searchesrow does not appear. - v18 (Standard Scan):
Index Searches: 1. The scan started at one point and read sequentially (as seen in Scenario B). - v18 (Skip Scan):
Index Searches: > 1. The engine performed multiple seeks (as seen in Scenario A which had 11).
In our Scenario A (Success), Index Searches: 11 tells us it performed ~11 hops, confirming the feature was used.
How to reproduce:
Note: We use PostgreSQL 13 in this section to meaningfully demonstrate that even older versions (long before v18) could efficiently utilize multicolumn indexes for these types of queries, to clarify that older Postgres versions were capable of avoiding a Full Table Scan (and instead fallback to Full Index Scan) in such scenarios.
- PostgreSQL 13 (a running cluster)
- A pgbench-initialized database (run
pgbench -ito createpgbench_accounts,pgbench_branches,pgbench_tellers, andpgbench_history) - Reasonable data loaded (default
pgbench -i -s 10or similar)
Steps to reproduce the example:
Follow these steps in your terminal to set up the test environment. This assumes you have PostgreSQL 13 (or a similar version) installed and its command-line tools are in your PATH.
Step 1: Create and Initialize the Database
createdb testdb # Create a new database
pgbench -i -s 10 testdb # Initialize it with 1 million rows
Step 2: Run a Quick Read-Write Test
To simulate some database activity, run a standard 10-second read-write benchmark.
pgbench -T 10 testdb
Step 3: Connect and Run the Test Query
Now, connect to the database with psql to run the SQL commands that demonstrate the index scan behavior.
SQL steps (run in psql):
psql testdb
-- create a multicolumn index (first column aid, second column abalance)
testdb=# CREATE INDEX CONCURRENTLY IF NOT EXISTS pgbench_accounts_aid_abalance_idx
ON pgbench_accounts(aid, abalance);
testdb=# select abalance, count(*) from pgbench_accounts group by abalance order by count(*) desc limit 5;
abalance | count
----------+--------
0 | 995151
-2187 | 4
-4621 | 4
-4030 | 4
-2953 | 4
(5 rows)
-- run an explain on a selection that filters only on the second column (abalance)
testdb=# EXPLAIN (ANALYSE, VERBOSE, COSTS)
SELECT COUNT(*) FROM pgbench_accounts WHERE abalance = -2187;
Aggregate (cost=18480.77..18480.78 rows=1 width=8) (actual time=18.672..18.674 rows=1 loops=1)
Output: count(*)
-> Index Only Scan using pgbench_accounts_aid_abalance_idx on public.pgbench_accounts (cost=0.42..18480.69 rows=33 width=0) (actual time=8.583..18.659 rows=4 loops=1)
Output: aid, abalance
Index Cond: (pgbench_accounts.abalance = '-2187'::integer)
Heap Fetches: 0
Planning Time: 0.328 ms
Execution Time: 18.732 ms
(8 rows)
Notes on the output above:
- The important line is
Index Only Scan using pgbench_accounts_aid_abalance_idxalong withIndex Cond: (pgbench_accounts.abalance = '-2187'::integer)— this demonstrates the planner chose an index scan that probes the multicolumn btree for entries where the second column matches the predicate.
Why this is not the v18 "skip-scan" feature in full:
- The v18 skip-scan adds optimizer logic to efficiently iterate distinct values of leading columns and probe the remainder of the index; it's a targeted algorithmic addition. What we show above is the planner choosing an index scan over a multicolumn index and applying the
Index Condto the second column. That can be effective for many queries and data layouts, but it lacks the specialized skip-scan internals that v18 adds for certain other cases.
Production Tips
- If you need queries on a non-leading column to be fast on older Postgres versions, create the right index and keep statistics current (ANALYZE). The planner may prefer an index scan over a seq scan when selectivity and costs align.
- Consider partial or expression indexes when appropriate; they let you make an index that directly serves the filter you need.
- When portability across versions is important, test on the earliest Postgres version you need to support; planner behavior can vary by release, statistics, and data distribution.
Conclusion
Postgres v18's documented skip-scan addition for B-tree indexes is a welcome and useful optimizer enhancement, specifically for low-cardinality leading columns. However, for high-cardinality leading columns (like our first example), the standard Full Index Scan remains optimal, and pre-v18 versions handle them just fine.
References
- PostgreSQL 18 release notes (skip-scan, indexes): https://www.postgresql.org/docs/18/release-18.html
- nbtree skip-scan optimization: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=92fe23d93
- Further optimize nbtree search scan key comparisons: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8a510275d