This page covers the exact scenario where a working ST_DWithin radius search — correct and fast in development — degrades under production load: p95 latency climbs past 100ms, plan cache misses accumulate, and the query planner silently falls back to sequential scans. Solving this requires aligning four separate layers: index structure, query shape, driver configuration, and PostgreSQL server parameters. None of these layers is sufficient alone.

The primary bottleneck at high QPS is rarely the distance calculation itself. It is sequential geometry scans triggered by stale statistics, implicit type coercion between geometry and geography, missing work_mem for spatial sorts, and query plan recompilation when prepared statements are not reused across connections.

Why the Naive Approach Fails

The most common starting point looks like this:

python
# Naive implementation — correct but breaks under load
async def find_nearby_naive(conn, lat, lon, radius_m):
    query = f"""
        SELECT id, name,
               ST_Distance(geom::geography,
                           ST_SetSRID(ST_MakePoint({lon}, {lat}), 4326)::geography) AS dist_m
        FROM locations
        WHERE ST_DWithin(geom::geography,
                         ST_SetSRID(ST_MakePoint({lon}, {lat}), 4326)::geography,
                         {radius_m})
        ORDER BY dist_m
        LIMIT 50
    """
    return await conn.fetch(query)

This breaks in three ways under load:

  1. No plan reuse. String interpolation produces a unique query string for every call. PostgreSQL parses and plans from scratch each time — ~2–5ms planning overhead per request at 1,000 QPS adds up to 2–5 seconds of wasted CPU per second of traffic.
  2. No bounding-box pre-filter. PostGIS injects a bounding-box check internally, but under skewed data distributions or high concurrency, the planner may misestimate selectivity and abandon the GiST index entirely.
  3. SQL injection risk. Interpolating float values directly is exploitable; parameters must always be passed as bind variables.

Query Execution Flow

The diagram below shows how an optimised radius search moves through the two-phase index filtering pipeline: the bounding-box && check prunes candidates cheaply via the GiST index, then ST_DWithin on the geography cast applies the exact spheroidal distance check to the surviving rows only.

Two-phase ST_DWithin execution pipeline Diagram showing query flow: parameterised query enters the GiST index bounding-box scan, candidate rows pass to the ST_DWithin geography cast predicate, then the result set is sorted and limited. Parameterised Query Phase 1 GiST Index Bounding-Box && operator prunes ~95% of rows Phase 2 ST_DWithin geography cast exact metres spheroidal distance ORDER BY dist_m LIMIT n sorted result set returned to Python

Production-Ready Implementation

The following is a complete, copy-paste-ready implementation. It combines all four fixes: parameterised bind variables (plan reuse), explicit && pre-filter (deterministic index use), geography cast (metre-accurate distance), and asyncpg with statement_cache_size (plan cache preservation across the pool).

Step 1 — GiST Covering Index

ST_DWithin uses the spatial index only when the predicate matches the indexed column’s type and SRID. For APIs that always return id, name, and status, a covering index eliminates expensive heap fetches by storing those columns in the index leaf nodes:

sql
-- Run once, then ANALYZE immediately
CREATE INDEX IF NOT EXISTS idx_locations_geom_covering
ON locations USING GIST (geom)
INCLUDE (id, name, status, updated_at);

ANALYZE locations;

Verify the index exists and is used:

sql
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'locations'
  AND indexdef ILIKE '%gist%';

For advice on more advanced index configurations — including partial indexes on active regions and composite geometry+timestamp indexes — see Advanced GiST Indexing & Optimization.

Step 2 — Bounding-Box Pre-Filter + Exact Predicate

When the geometry column stores EPSG:4326 data and the radius is in metres, convert the metre radius to degrees for the && pre-filter (equatorial approximation: 111,320 metres per degree). The && check is resolved entirely from the GiST index; the exact spheroidal distance is then validated by ST_DWithin on the geography cast:

sql
-- Optimised ST_DWithin query for EPSG:4326 geometry columns
SELECT id, name,
       ST_Distance(
           geom::geography,
           ST_SetSRID(ST_MakePoint($2, $3), 4326)::geography
       ) AS dist_m
FROM locations
WHERE geom && ST_Expand(
          ST_SetSRID(ST_MakePoint($2, $3), 4326),
          $4 / 111320.0   -- convert metres to degrees for the bbox pre-filter
      )
  AND ST_DWithin(
          geom::geography,
          ST_SetSRID(ST_MakePoint($2, $3), 4326)::geography,
          $4              -- exact metre radius on geography type
      )
ORDER BY dist_m
LIMIT $5;

The ST_Expand on SRID 4326 geometry expects degrees. Dividing by 111,320 gives a degree-based bounding box that is slightly over-inclusive — the && pre-filter must not exclude valid results. The exact metre-based distance is validated by ST_DWithin in the second predicate. For how this pattern fits into broader bounding-box filtering strategies, see that cluster.

Step 3 — asyncpg Engine with Prepared Statement Cache

python
import asyncpg
from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
from sqlalchemy.orm import sessionmaker

DATABASE_URL = "postgresql+asyncpg://api_user:secure_pass@db-primary:5432/gis_db"

# statement_cache_size is the critical setting: asyncpg caches parsed+planned
# statements per connection. Without it, every query re-parses on each call.
engine = create_async_engine(
    DATABASE_URL,
    pool_size=25,           # (2 x CPU cores) on the DB host is a good starting point
    max_overflow=15,        # headroom for brief traffic spikes only
    pool_timeout=30,
    pool_recycle=1800,      # recycle connections before TCP keepalive timeouts
    pool_pre_ping=True,     # detect stale connections before checkout
    connect_args={"statement_cache_size": 200},  # plan reuse across all pool slots
)

_RADIUS_QUERY = text("""
    SELECT id, name,
           ST_Distance(
               geom::geography,
               ST_SetSRID(ST_MakePoint(:lon, :lat), 4326)::geography
           ) AS dist_m
    FROM locations
    WHERE geom && ST_Expand(
              ST_SetSRID(ST_MakePoint(:lon, :lat), 4326),
              :radius / 111320.0
          )
      AND ST_DWithin(
              geom::geography,
              ST_SetSRID(ST_MakePoint(:lon, :lat), 4326)::geography,
              :radius
          )
    ORDER BY dist_m
    LIMIT :limit
""")

async def query_radius_locations(
    lat: float,
    lon: float,
    radius_m: float,
    limit: int = 50,
) -> list[dict]:
    """Return locations within radius_m metres of (lat, lon).

    Uses named bind variables so SQLAlchemy Core passes them as asyncpg
    prepared statement parameters — plan reuse is automatic once the cache
    warms after the first call per connection slot.
    """
    async with engine.begin() as conn:
        result = await conn.execute(
            _RADIUS_QUERY,
            {"lat": lat, "lon": lon, "radius": radius_m, "limit": limit},
        )
        return result.mappings().all()

Never use f-strings or % formatting for spatial parameters. They produce a unique query string per call, bypass the statement cache, and expose the database to plan cache bloat. Always use named bind variables as shown above.

If your ORM layer uses GeoAlchemy2 column mapping for the geom column, the same parameterisation principle applies — pass coordinates as scalars, not as embedded string literals.

Configuration and Tuning Knobs

Query structure alone does not sustain more than 1,000 QPS. These four GUC settings directly affect PostGIS planner behaviour and must be set at the database level so they apply to every new connection:

sql
-- Apply at database level (not session level) so every connection inherits them
ALTER DATABASE gis_db SET work_mem = '128MB';
ALTER DATABASE gis_db SET effective_cache_size = '12GB';  -- 50-75% of total RAM
ALTER DATABASE gis_db SET random_page_cost = 1.1;
ALTER DATABASE gis_db SET jit = off;
Parameter Recommended value Why it matters for ST_DWithin
work_mem 64MB–256MB Spatial sorts spill to disk when this is too low, spiking tail latency
effective_cache_size 50–75% of RAM Guides the planner toward index scans without consuming actual memory
random_page_cost 1.1 (SSD) GiST traversal is random-I/O-heavy; default 4.0 underweights index paths
jit off JIT compilation overhead exceeds benefit for sub-100ms spatial queries

The work_mem setting is per-sort per-connection — at 25 connections, 128MB work_mem could allocate up to 3.2GB in the worst case. Monitor pg_stat_activity and total memory usage. If you are running mixed workloads, set work_mem at the role level instead:

sql
ALTER ROLE api_user SET work_mem = '128MB';

Verification Steps

After deploying the index and configuration changes, run this verification sequence to confirm the pipeline behaves as expected.

1. Confirm the Index Scan path:

sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT id, name,
       ST_Distance(geom::geography,
                   ST_SetSRID(ST_MakePoint(-0.1276, 51.5074), 4326)::geography) AS dist_m
FROM locations
WHERE geom && ST_Expand(ST_SetSRID(ST_MakePoint(-0.1276, 51.5074), 4326), 5000 / 111320.0)
  AND ST_DWithin(geom::geography,
                 ST_SetSRID(ST_MakePoint(-0.1276, 51.5074), 4326)::geography, 5000)
ORDER BY dist_m
LIMIT 50;

Expected output markers:

  • Index Scan using idx_locations_geom_covering (not Seq Scan)
  • Buffers: shared hit=... with a hit ratio above 90%
  • Planning Time: < 0.5 ms (confirms prepared statement plan reuse)

2. Verify plan cache is populated:

sql
SELECT query, calls, mean_exec_time, stddev_exec_time
FROM pg_stat_statements
WHERE query ILIKE '%ST_DWithin%'
ORDER BY calls DESC
LIMIT 5;

A calls count climbing with stable mean_exec_time confirms plan reuse. High stddev_exec_time spikes indicate planner regressions — trigger ANALYZE and inspect for index bloat.

3. Python timing assertion:

python
import asyncio, time

async def verify_latency(runs: int = 100):
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        rows = await query_radius_locations(51.5074, -0.1276, 5000, limit=50)
        times.append((time.perf_counter() - t0) * 1000)
    p50 = sorted(times)[runs // 2]
    p95 = sorted(times)[int(runs * 0.95)]
    print(f"p50={p50:.1f}ms  p95={p95:.1f}ms  rows={len(rows)}")
    assert p95 < 50, f"p95 latency {p95:.1f}ms exceeds 50ms budget"

asyncio.run(verify_latency())

If the planner still chooses a sequential scan after these steps, temporarily set SET enable_seqscan = off; in a diagnostic session to force the index path and compare actual execution times. If the index path is genuinely slower, your dataset may have a pathological skew (most points clustered in one region) — consider a partial GiST index covering only the dense region.

For a deeper walkthrough of reading EXPLAIN (ANALYZE, BUFFERS) output for spatial queries, see Reading EXPLAIN ANALYZE Output for Spatial Joins.

Gotchas Checklist

  • work_mem is per-sort per-connection, not per-database total. With 25 pool connections, 128MB work_mem can peak at 3.2GB. Monitor total memory usage and reduce if the host has less than 16GB RAM.
  • The && degree conversion breaks near the poles. 111,320 metres/degree is the equatorial approximation. Above ~60° latitude, it over-expands the bounding box by up to 50%, reducing selectivity. Use ST_DWithin on geography only (drop the && pre-filter) for polar data, or apply a latitude-adjusted conversion.
  • statement_cache_size is per asyncpg connection, not per pool. With pool_size=25, you get 25 independent caches of 200 plans each. The pool warm-up period (first 25 calls) still incurs full planning time.
  • GiST index bloat after bulk inserts. High-volume INSERT workloads fragment the GiST tree. Run REINDEX CONCURRENTLY idx_locations_geom_covering in a maintenance window, or use pg_repack to avoid table locks. Monitor pg_stat_user_indexes for index size growth.
  • JIT cannot be disabled per-query with SQLAlchemy Core (it requires SET jit = off inside the transaction). Set it at the database or role level as shown above to ensure all connections inherit the setting without per-query overhead.