Effective model mapping is the entry point for every spatial backend built on top of SQLAlchemy and GeoAlchemy2 integration workflows. Without deliberate SRID declarations, correctly placed GiST indexes, and careful session boundary management, geometry columns behave unpredictably in production: inserts silently reject coordinate systems, queries fall back to sequential scans, and API serialization breaks on raw WKBElement objects.

This guide walks through a production-tested workflow for defining PostGIS geometry columns in SQLAlchemy 2.0, applying spatial indexes, managing type coercion, and verifying performance with EXPLAIN ANALYZE.


Mapping flow: ORM model to PostGIS table

The diagram below shows how a SQLAlchemy model definition travels through GeoAlchemy2 and the psycopg2 driver before landing in PostGIS as a GiST-indexed geometry column.

ORM to PostGIS mapping flow A left-to-right flow diagram showing: SQLAlchemy Model (Mapped columns) to GeoAlchemy2 Type Layer (Geometry type, WKB/WKT serialisation, SRID enforcement) to psycopg2 Driver (binary protocol) to PostGIS Table (geometry column + GiST index). SQLAlchemy Model Mapped[Geometry] mapped_column(…) GeoAlchemy2 Type Layer Geometry(type, srid) WKB/WKT coercion SRID enforcement ST_ function compile psycopg2 Driver binary protocol parameterised query PostGIS Table geometry column CHECK constraint GiST index

Prerequisites and infrastructure validation

Before defining spatial models, confirm that PostGIS is enabled in the target database and that your Python environment has the required packages.

Verify PostGIS extension:

sql
-- Run in the target database; must return a version string
SELECT PostGIS_Version();

-- If the extension is missing, enable it:
CREATE EXTENSION IF NOT EXISTS postgis;

Python packages — pin exact versions to prevent silent type-coercion regressions during CI/CD:

bash
pip install "sqlalchemy>=2.0,<3" "geoalchemy2>=0.14" "psycopg2-binary>=2.9" alembic

Verify a GiST index can be created (useful as a pre-migration smoke test):

sql
-- Create a temporary table and confirm the GiST index type is available
CREATE TEMP TABLE _gist_check (geom geometry(POINT, 4326));
CREATE INDEX ON _gist_check USING gist (geom);
DROP TABLE _gist_check;
-- No error means the PostGIS operator classes are installed correctly

GeoAlchemy2 releases track SQLAlchemy minor versions. Misaligned packages break geometry column introspection during Alembic autogeneration, producing spurious ALTER COLUMN migrations on every run.


Step 1 — Engine and declarative base

Configure the engine with a connection pool sized for spatial workloads and set expire_on_commit=False at the session factory level. SQLAlchemy’s default behaviour expires all mapped attributes after each commit, forcing unnecessary round-trips when geometry columns are read in the same request cycle.

python
from sqlalchemy import create_engine, URL
from sqlalchemy.orm import DeclarativeBase, sessionmaker

engine = create_engine(
    URL.create(
        drivername="postgresql+psycopg2",
        username="app_user",
        password="secure_password",
        host="localhost",
        port=5432,
        database="spatial_db",
    ),
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,   # validates connections; avoids stale-socket errors
    echo=False,           # set True during development to inspect compiled SQL
)

class Base(DeclarativeBase):
    pass

SessionLocal = sessionmaker(
    bind=engine,
    expire_on_commit=False,  # geometry columns remain accessible after commit
)

For async workloads using asyncpg, swap the driver to postgresql+asyncpg and replace sessionmaker with async_sessionmaker. Session boundary patterns for both sync and async scenarios are covered in session management for spatial data.


Step 2 — Spatial model definition and SRID enforcement

Use SQLAlchemy 2.0’s Mapped annotations. GeoAlchemy2’s Geometry type handles WKB serialization and PostGIS type casting automatically, but an explicit srid parameter is mandatory — omitting it causes silent SRID=0 storage and breaks every spatial join that assumes coordinate-system consistency.

python
from sqlalchemy import String, Integer, Index
from sqlalchemy.orm import Mapped, mapped_column
from geoalchemy2 import Geometry

class Facility(Base):
    __tablename__ = "facilities"
    __table_args__ = (
        # GiST index is required for ST_DWithin, ST_Intersects, and
        # bounding-box operators to use index scans rather than seq scans
        Index("idx_facilities_location", "location", postgresql_using="gist"),
        {"schema": "public"},
    )

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    name: Mapped[str] = mapped_column(String(255), nullable=False)
    category: Mapped[str] = mapped_column(String(50), index=True)

    # Explicit SRID=4326 (WGS84). PostGIS stores this in geometry_columns
    # and enforces it via a CHECK constraint on insert.
    location: Mapped[Geometry] = mapped_column(
        Geometry(geometry_type="POINT", srid=4326),
        nullable=False,
    )

Why geometry_type matters. Specifying geometry_type="POINT" constrains the column at the PostGIS level — inserting a LINESTRING raises a constraint violation rather than storing corrupt data. For heterogeneous geometry stores (mixed point/polygon), use geometry_type="GEOMETRY" and enforce type at the application layer.

Multi-geometry models — declare each geometry column separately with its own Geometry() definition, SRID, and corresponding GiST index entry in __table_args__:

python
class Zone(Base):
    __tablename__ = "zones"
    __table_args__ = (
        Index("idx_zones_boundary", "boundary", postgresql_using="gist"),
        Index("idx_zones_centroid", "centroid", postgresql_using="gist"),
    )

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    boundary: Mapped[Geometry] = mapped_column(
        Geometry(geometry_type="POLYGON", srid=4326), nullable=False
    )
    centroid: Mapped[Geometry] = mapped_column(
        Geometry(geometry_type="POINT", srid=4326), nullable=True
    )

Step 3 — Inserting spatial records

Pass geometry as WKTElement with an explicit SRID to guarantee the coordinate system is preserved through the psycopg2 binary protocol. GeoAlchemy2 converts the value to PostGIS’s internal binary format during flush.

python
from geoalchemy2.elements import WKTElement
from sqlalchemy.orm import Session

def create_facility(
    name: str, category: str, lat: float, lon: float
) -> Facility:
    # WKTElement carries SRID alongside the coordinate string
    geom = WKTElement(f"POINT({lon} {lat})", srid=4326)

    with Session(engine) as session:
        facility = Facility(name=name, category=category, location=geom)
        session.add(facility)
        session.commit()
        session.refresh(facility)  # loads server-generated id and defaults
        return facility

Using Shapely geometries directly. GeoAlchemy2 ≥ 0.14 registers a type adapter so Shapely Point, Polygon, and other geometry objects can be assigned to mapped columns. The SRID from the column definition is used automatically:

python
from shapely.geometry import Point

facility.location = Point(-73.9857, 40.7484)
# GeoAlchemy2 converts to WKBElement with the column's srid=4326 on flush

Test this in your specific GeoAlchemy2 version — the Shapely adapter behaviour changed across 0.12→0.14. When in doubt, use WKTElement with an explicit SRID for deterministic results.


Step 4 — Querying with spatial filters

Spatial filter expressions compile to PostGIS functions. Use func.ST_DWithin, func.ST_Intersects, and the && bounding-box operator through SQLAlchemy’s func namespace. Pair radius searches with ST_DWithin radius searches for the full proximity-query pattern.

python
from sqlalchemy import select, func
from geoalchemy2.elements import WKTElement

def facilities_within_km(lat: float, lon: float, km: float) -> list[Facility]:
    origin = WKTElement(f"POINT({lon} {lat})", srid=4326)

    stmt = (
        select(Facility)
        .where(
            # ST_DWithin on geography uses metres; cast geometry → geography
            func.ST_DWithin(
                Facility.location.cast("geography"),
                func.ST_GeogFromText(f"POINT({lon} {lat})"),
                km * 1000,  # metres
            )
        )
        .order_by(
            func.ST_Distance(
                Facility.location.cast("geography"),
                func.ST_GeogFromText(f"POINT({lon} {lat})"),
            )
        )
        .limit(50)
    )

    with Session(engine) as session:
        return session.scalars(stmt).all()

For bulk spatial joins — for example matching thousands of delivery points to neighbourhood polygons — avoid loading all rows into Python and then filtering in memory. Push the join to PostGIS and stream results via server-side cursors. The batch processing spatial joins in Python page covers this pattern end to end.


Step 5 — Serialization: WKBElement to GeoJSON

Raw WKBElement objects returned by SQLAlchemy queries are not JSON-serializable. Every API layer needs a conversion step. Use geoalchemy2.shape.to_shape() to obtain a Shapely geometry, then shapely.geometry.mapping() for a GeoJSON-compatible dict. For type coercion in more complex scenarios — including Pydantic model validators and custom JSON encoders — see type coercion and serialization.

python
from geoalchemy2.shape import to_shape
from shapely.geometry import mapping

def serialize_facility(facility: Facility) -> dict:
    geom = to_shape(facility.location)  # WKBElement → Shapely geometry
    return {
        "id": facility.id,
        "name": facility.name,
        "category": facility.category,
        "coordinates": list(geom.coords),  # [(lon, lat)]
        "geojson": mapping(geom),           # {"type": "Point", "coordinates": [...]}
    }

For FastAPI, configure a custom JSONResponse encoder or use a Pydantic model with a field_serializer that calls to_shape internally. Framework-specific patterns are detailed in configuring GeoAlchemy2 geometry columns in FastAPI.


Step 6 — Hybrid geometry properties

Frequently accessed spatial calculations — centroid, bounding box, area — should be expressed as @hybrid_property so they work identically on in-memory ORM instances and inside SQLAlchemy filter expressions. For the full set of hybrid property patterns, including distance thresholds and spatial aggregation, see hybrid properties for geometry.

python
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy import func
from geoalchemy2.shape import to_shape

class Facility(Base):
    # ... columns as above ...

    @hybrid_property
    def centroid(self):
        """Python-side: returns a Shapely Point."""
        if self.location is None:
            return None
        return to_shape(self.location).centroid

    @centroid.expression
    def centroid(cls):
        """SQL-side: compiles to ST_Centroid(location)."""
        return func.ST_Centroid(cls.location)

    @hybrid_property
    def longitude(self):
        if self.location is None:
            return None
        return to_shape(self.location).x

    @longitude.expression
    def longitude(cls):
        return func.ST_X(cls.location)

The dual-context pattern lets you write session.scalars(select(Facility).where(Facility.longitude > -74)) without breaking when the same property is accessed on a loaded instance.


Performance considerations

Expected EXPLAIN ANALYZE output

After creating the GiST index and running ANALYZE facilities;, a radius query using ST_DWithin should show an Index Scan on the GiST index rather than a Seq Scan:

Index Scan using idx_facilities_location on facilities
  (cost=0.28..8.54 rows=1 width=300)
  (actual time=0.312..0.318 rows=3 loops=1)
  Index Cond: (location && st_expand(...))
  Filter: st_dwithin(location, ..., 5000)
Buffers: shared hit=4
Planning Time: 0.8 ms
Execution Time: 0.4 ms

A Seq Scan here means the index is missing, the SRID in the query does not match the column SRID, or the planner’s row-count estimate is so low that it considers the index overhead not worthwhile. Run ANALYZE facilities; after bulk inserts to refresh statistics.

For deep EXPLAIN (ANALYZE, BUFFERS) output interpretation — including reading bitmap index scans on composite indexes and diagnosing recheck conditions — see reading EXPLAIN ANALYZE output for spatial joins.

GUC settings that affect spatial query plans

Setting Recommended value Effect
random_page_cost 1.1 (SSD) / 4.0 (HDD) Lower values make the planner prefer index scans
effective_cache_size 50–75% of RAM Tells the planner how much OS page cache is available
work_mem 16MB64MB Affects sort and hash operations in spatial joins
jit off for OLTP JIT compilation overhead can slow short spatial queries

Apply these as session-level settings for spatial-heavy endpoints rather than globally:

python
from sqlalchemy import event, text

@event.listens_for(engine, "connect")
def set_spatial_gucs(dbapi_conn, _):
    with dbapi_conn.cursor() as cur:
        cur.execute("SET random_page_cost = 1.1")
        cur.execute("SET work_mem = '32MB'")

Common failure modes and fixes

SRID mismatch on insert

Symptom: ERROR: Geometry SRID (0) does not match column SRID (4326)

Diagnosis:

sql
-- Check what SRID the column expects
SELECT type, srid
FROM geometry_columns
WHERE f_table_name = 'facilities' AND f_geometry_column = 'location';

Fix: Always pass srid=4326 (or the column’s declared SRID) in WKTElement:

python
# Wrong — SRID defaults to 0
location = WKTElement("POINT(-73.98 40.74)")

# Correct
location = WKTElement("POINT(-73.98 40.74)", srid=4326)

Missing GiST index causes sequential scan

Symptom: EXPLAIN ANALYZE shows Seq Scan even on large tables.

Diagnosis:

sql
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'facilities'
  AND indexdef ILIKE '%gist%';
-- Empty result means no GiST index exists

Fix: Create the index if missing or if the migration was not applied:

sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_facilities_location
ON facilities USING gist (location);

-- Then refresh statistics
ANALYZE facilities;

For partial GiST indexes targeting a filtered subset — for example indexing only rows where active = true — see creating partial indexes for active map regions.

Geometry attribute is None after session.commit()

Symptom: facility.location is None immediately after a commit that set it.

Root cause: expire_on_commit=True (SQLAlchemy default) expires all attributes.

Fix option 1 — set expire_on_commit=False at the session factory:

python
SessionLocal = sessionmaker(bind=engine, expire_on_commit=False)

Fix option 2 — call session.refresh() immediately after commit to reload server-generated values:

python
session.commit()
session.refresh(facility)   # reloads id, location, and any server defaults
print(facility.location)    # now populated

OOM on large geometry result sets

Symptom: worker process killed with out-of-memory when fetching many geometry rows.

Fix: Use yield_per to stream results in batches rather than loading everything into memory at once:

python
stmt = select(Facility).execution_options(yield_per=500)
with Session(engine) as session:
    for batch in session.scalars(stmt).partitions():
        process(batch)

For server-side cursor patterns with psycopg2 named_cursor and asyncpg streaming, see session management for spatial data.


Verification

After applying the model definition and running migrations, confirm that the table, index, and column metadata are in order:

sql
-- 1. Confirm the geometry column is registered with correct type and SRID
SELECT f_table_name, f_geometry_column, type, srid
FROM geometry_columns
WHERE f_table_name = 'facilities';
-- Expected: facilities | location | POINT | 4326

-- 2. Confirm the GiST index exists
SELECT indexname
FROM pg_indexes
WHERE tablename = 'facilities' AND indexdef ILIKE '%gist%';
-- Expected: idx_facilities_location

-- 3. Insert a test row and confirm it round-trips correctly
INSERT INTO facilities (name, category, location)
VALUES ('Test', 'qa', ST_GeomFromText('POINT(-73.98 40.74)', 4326));

SELECT id, name, ST_AsText(location) AS wkt, ST_SRID(location) AS srid
FROM facilities
WHERE name = 'Test';
-- Expected: wkt=POINT(-73.98 40.74), srid=4326

-- 4. Check the query uses the index
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM facilities
WHERE ST_DWithin(location::geography, ST_GeogFromText('POINT(-73.98 40.74)'), 5000);
-- Expected: "Index Scan using idx_facilities_location"

-- Cleanup
DELETE FROM facilities WHERE name = 'Test';

Common pitfalls — quick reference

Symptom Root cause Fix
type "geometry" does not exist PostGIS extension not enabled CREATE EXTENSION IF NOT EXISTS postgis;
SRID mismatch on insert Missing srid in Geometry() or WKTElement Declare srid=4326 in both the column and the element
WKBElement not serializable Raw ORM object passed to JSON encoder to_shape() then mapping() before response
Seq scan despite GiST index Index missing or stale statistics CREATE INDEX … USING gist + ANALYZE
None geometry after commit expire_on_commit=True default Set expire_on_commit=False or call session.refresh()
Alembic generates spurious geometry migrations GeoAlchemy2/SQLAlchemy version mismatch Pin both packages; check geoalchemy2.alembic render hooks

FAQ: Model mapping with GeoAlchemy2

Q: Do I need to call Base.metadata.create_all() or use Alembic?

For anything beyond local development, use Alembic. create_all() cannot detect column changes; it only creates missing tables. GeoAlchemy2 ships Alembic render hooks (geoalchemy2.alembic) that emit correct op.create_geospatial_table() and op.add_geospatial_column() operations.

Q: Can I use asyncpg instead of psycopg2?

Yes. Replace postgresql+psycopg2 with postgresql+asyncpg in the create_engine URL and use create_async_engine from sqlalchemy.ext.asyncio. GeoAlchemy2’s type handling works identically through the async session interface. Be aware that asyncpg uses binary protocol by default, which requires the PostGIS binary codec to be registered — GeoAlchemy2 ≥ 0.14 handles this automatically.

Q: What happens if I store geometries without a GiST index?

Every spatial predicate (ST_DWithin, ST_Intersects, &&) falls back to a sequential scan. On a table with 100,000 rows a radius query that should take milliseconds can take several seconds. Always declare the index in __table_args__ and apply it via migration before the table receives data.

Q: How do I handle coordinate systems other than WGS84?

Use the appropriate EPSG code as the srid parameter — for example srid=32632 for UTM Zone 32N. Distance calculations in projected CRS use metres natively without the ::geography cast, which can be faster for large datasets in a single UTM zone. Use ST_Transform in queries when mixing CRS.