Effective model mapping is the entry point for every spatial backend built on top of SQLAlchemy and GeoAlchemy2 integration workflows. Without deliberate SRID declarations, correctly placed GiST indexes, and careful session boundary management, geometry columns behave unpredictably in production: inserts silently reject coordinate systems, queries fall back to sequential scans, and API serialization breaks on raw WKBElement objects.
This guide walks through a production-tested workflow for defining PostGIS geometry columns in SQLAlchemy 2.0, applying spatial indexes, managing type coercion, and verifying performance with EXPLAIN ANALYZE.
Mapping flow: ORM model to PostGIS table
The diagram below shows how a SQLAlchemy model definition travels through GeoAlchemy2 and the psycopg2 driver before landing in PostGIS as a GiST-indexed geometry column.
Prerequisites and infrastructure validation
Before defining spatial models, confirm that PostGIS is enabled in the target database and that your Python environment has the required packages.
Verify PostGIS extension:
-- Run in the target database; must return a version string
SELECT PostGIS_Version();
-- If the extension is missing, enable it:
CREATE EXTENSION IF NOT EXISTS postgis;Python packages — pin exact versions to prevent silent type-coercion regressions during CI/CD:
pip install "sqlalchemy>=2.0,<3" "geoalchemy2>=0.14" "psycopg2-binary>=2.9" alembicVerify a GiST index can be created (useful as a pre-migration smoke test):
-- Create a temporary table and confirm the GiST index type is available
CREATE TEMP TABLE _gist_check (geom geometry(POINT, 4326));
CREATE INDEX ON _gist_check USING gist (geom);
DROP TABLE _gist_check;
-- No error means the PostGIS operator classes are installed correctlyGeoAlchemy2 releases track SQLAlchemy minor versions. Misaligned packages break geometry column introspection during Alembic autogeneration, producing spurious ALTER COLUMN migrations on every run.
Step 1 — Engine and declarative base
Configure the engine with a connection pool sized for spatial workloads and set expire_on_commit=False at the session factory level. SQLAlchemy’s default behaviour expires all mapped attributes after each commit, forcing unnecessary round-trips when geometry columns are read in the same request cycle.
from sqlalchemy import create_engine, URL
from sqlalchemy.orm import DeclarativeBase, sessionmaker
engine = create_engine(
URL.create(
drivername="postgresql+psycopg2",
username="app_user",
password="secure_password",
host="localhost",
port=5432,
database="spatial_db",
),
pool_size=10,
max_overflow=20,
pool_pre_ping=True, # validates connections; avoids stale-socket errors
echo=False, # set True during development to inspect compiled SQL
)
class Base(DeclarativeBase):
pass
SessionLocal = sessionmaker(
bind=engine,
expire_on_commit=False, # geometry columns remain accessible after commit
)For async workloads using asyncpg, swap the driver to postgresql+asyncpg and replace sessionmaker with async_sessionmaker. Session boundary patterns for both sync and async scenarios are covered in session management for spatial data.
Step 2 — Spatial model definition and SRID enforcement
Use SQLAlchemy 2.0’s Mapped annotations. GeoAlchemy2’s Geometry type handles WKB serialization and PostGIS type casting automatically, but an explicit srid parameter is mandatory — omitting it causes silent SRID=0 storage and breaks every spatial join that assumes coordinate-system consistency.
from sqlalchemy import String, Integer, Index
from sqlalchemy.orm import Mapped, mapped_column
from geoalchemy2 import Geometry
class Facility(Base):
__tablename__ = "facilities"
__table_args__ = (
# GiST index is required for ST_DWithin, ST_Intersects, and
# bounding-box operators to use index scans rather than seq scans
Index("idx_facilities_location", "location", postgresql_using="gist"),
{"schema": "public"},
)
id: Mapped[int] = mapped_column(Integer, primary_key=True)
name: Mapped[str] = mapped_column(String(255), nullable=False)
category: Mapped[str] = mapped_column(String(50), index=True)
# Explicit SRID=4326 (WGS84). PostGIS stores this in geometry_columns
# and enforces it via a CHECK constraint on insert.
location: Mapped[Geometry] = mapped_column(
Geometry(geometry_type="POINT", srid=4326),
nullable=False,
)Why geometry_type matters. Specifying geometry_type="POINT" constrains the column at the PostGIS level — inserting a LINESTRING raises a constraint violation rather than storing corrupt data. For heterogeneous geometry stores (mixed point/polygon), use geometry_type="GEOMETRY" and enforce type at the application layer.
Multi-geometry models — declare each geometry column separately with its own Geometry() definition, SRID, and corresponding GiST index entry in __table_args__:
class Zone(Base):
__tablename__ = "zones"
__table_args__ = (
Index("idx_zones_boundary", "boundary", postgresql_using="gist"),
Index("idx_zones_centroid", "centroid", postgresql_using="gist"),
)
id: Mapped[int] = mapped_column(Integer, primary_key=True)
boundary: Mapped[Geometry] = mapped_column(
Geometry(geometry_type="POLYGON", srid=4326), nullable=False
)
centroid: Mapped[Geometry] = mapped_column(
Geometry(geometry_type="POINT", srid=4326), nullable=True
)Step 3 — Inserting spatial records
Pass geometry as WKTElement with an explicit SRID to guarantee the coordinate system is preserved through the psycopg2 binary protocol. GeoAlchemy2 converts the value to PostGIS’s internal binary format during flush.
from geoalchemy2.elements import WKTElement
from sqlalchemy.orm import Session
def create_facility(
name: str, category: str, lat: float, lon: float
) -> Facility:
# WKTElement carries SRID alongside the coordinate string
geom = WKTElement(f"POINT({lon} {lat})", srid=4326)
with Session(engine) as session:
facility = Facility(name=name, category=category, location=geom)
session.add(facility)
session.commit()
session.refresh(facility) # loads server-generated id and defaults
return facilityUsing Shapely geometries directly. GeoAlchemy2 ≥ 0.14 registers a type adapter so Shapely Point, Polygon, and other geometry objects can be assigned to mapped columns. The SRID from the column definition is used automatically:
from shapely.geometry import Point
facility.location = Point(-73.9857, 40.7484)
# GeoAlchemy2 converts to WKBElement with the column's srid=4326 on flushTest this in your specific GeoAlchemy2 version — the Shapely adapter behaviour changed across 0.12→0.14. When in doubt, use WKTElement with an explicit SRID for deterministic results.
Step 4 — Querying with spatial filters
Spatial filter expressions compile to PostGIS functions. Use func.ST_DWithin, func.ST_Intersects, and the && bounding-box operator through SQLAlchemy’s func namespace. Pair radius searches with ST_DWithin radius searches for the full proximity-query pattern.
from sqlalchemy import select, func
from geoalchemy2.elements import WKTElement
def facilities_within_km(lat: float, lon: float, km: float) -> list[Facility]:
origin = WKTElement(f"POINT({lon} {lat})", srid=4326)
stmt = (
select(Facility)
.where(
# ST_DWithin on geography uses metres; cast geometry → geography
func.ST_DWithin(
Facility.location.cast("geography"),
func.ST_GeogFromText(f"POINT({lon} {lat})"),
km * 1000, # metres
)
)
.order_by(
func.ST_Distance(
Facility.location.cast("geography"),
func.ST_GeogFromText(f"POINT({lon} {lat})"),
)
)
.limit(50)
)
with Session(engine) as session:
return session.scalars(stmt).all()For bulk spatial joins — for example matching thousands of delivery points to neighbourhood polygons — avoid loading all rows into Python and then filtering in memory. Push the join to PostGIS and stream results via server-side cursors. The batch processing spatial joins in Python page covers this pattern end to end.
Step 5 — Serialization: WKBElement to GeoJSON
Raw WKBElement objects returned by SQLAlchemy queries are not JSON-serializable. Every API layer needs a conversion step. Use geoalchemy2.shape.to_shape() to obtain a Shapely geometry, then shapely.geometry.mapping() for a GeoJSON-compatible dict. For type coercion in more complex scenarios — including Pydantic model validators and custom JSON encoders — see type coercion and serialization.
from geoalchemy2.shape import to_shape
from shapely.geometry import mapping
def serialize_facility(facility: Facility) -> dict:
geom = to_shape(facility.location) # WKBElement → Shapely geometry
return {
"id": facility.id,
"name": facility.name,
"category": facility.category,
"coordinates": list(geom.coords), # [(lon, lat)]
"geojson": mapping(geom), # {"type": "Point", "coordinates": [...]}
}For FastAPI, configure a custom JSONResponse encoder or use a Pydantic model with a field_serializer that calls to_shape internally. Framework-specific patterns are detailed in configuring GeoAlchemy2 geometry columns in FastAPI.
Step 6 — Hybrid geometry properties
Frequently accessed spatial calculations — centroid, bounding box, area — should be expressed as @hybrid_property so they work identically on in-memory ORM instances and inside SQLAlchemy filter expressions. For the full set of hybrid property patterns, including distance thresholds and spatial aggregation, see hybrid properties for geometry.
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy import func
from geoalchemy2.shape import to_shape
class Facility(Base):
# ... columns as above ...
@hybrid_property
def centroid(self):
"""Python-side: returns a Shapely Point."""
if self.location is None:
return None
return to_shape(self.location).centroid
@centroid.expression
def centroid(cls):
"""SQL-side: compiles to ST_Centroid(location)."""
return func.ST_Centroid(cls.location)
@hybrid_property
def longitude(self):
if self.location is None:
return None
return to_shape(self.location).x
@longitude.expression
def longitude(cls):
return func.ST_X(cls.location)The dual-context pattern lets you write session.scalars(select(Facility).where(Facility.longitude > -74)) without breaking when the same property is accessed on a loaded instance.
Performance considerations
Expected EXPLAIN ANALYZE output
After creating the GiST index and running ANALYZE facilities;, a radius query using ST_DWithin should show an Index Scan on the GiST index rather than a Seq Scan:
Index Scan using idx_facilities_location on facilities
(cost=0.28..8.54 rows=1 width=300)
(actual time=0.312..0.318 rows=3 loops=1)
Index Cond: (location && st_expand(...))
Filter: st_dwithin(location, ..., 5000)
Buffers: shared hit=4
Planning Time: 0.8 ms
Execution Time: 0.4 ms
A Seq Scan here means the index is missing, the SRID in the query does not match the column SRID, or the planner’s row-count estimate is so low that it considers the index overhead not worthwhile. Run ANALYZE facilities; after bulk inserts to refresh statistics.
For deep EXPLAIN (ANALYZE, BUFFERS) output interpretation — including reading bitmap index scans on composite indexes and diagnosing recheck conditions — see reading EXPLAIN ANALYZE output for spatial joins.
GUC settings that affect spatial query plans
| Setting | Recommended value | Effect |
|---|---|---|
random_page_cost |
1.1 (SSD) / 4.0 (HDD) |
Lower values make the planner prefer index scans |
effective_cache_size |
50–75% of RAM | Tells the planner how much OS page cache is available |
work_mem |
16MB–64MB |
Affects sort and hash operations in spatial joins |
jit |
off for OLTP |
JIT compilation overhead can slow short spatial queries |
Apply these as session-level settings for spatial-heavy endpoints rather than globally:
from sqlalchemy import event, text
@event.listens_for(engine, "connect")
def set_spatial_gucs(dbapi_conn, _):
with dbapi_conn.cursor() as cur:
cur.execute("SET random_page_cost = 1.1")
cur.execute("SET work_mem = '32MB'")Common failure modes and fixes
SRID mismatch on insert
Symptom: ERROR: Geometry SRID (0) does not match column SRID (4326)
Diagnosis:
-- Check what SRID the column expects
SELECT type, srid
FROM geometry_columns
WHERE f_table_name = 'facilities' AND f_geometry_column = 'location';Fix: Always pass srid=4326 (or the column’s declared SRID) in WKTElement:
# Wrong — SRID defaults to 0
location = WKTElement("POINT(-73.98 40.74)")
# Correct
location = WKTElement("POINT(-73.98 40.74)", srid=4326)Missing GiST index causes sequential scan
Symptom: EXPLAIN ANALYZE shows Seq Scan even on large tables.
Diagnosis:
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'facilities'
AND indexdef ILIKE '%gist%';
-- Empty result means no GiST index existsFix: Create the index if missing or if the migration was not applied:
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_facilities_location
ON facilities USING gist (location);
-- Then refresh statistics
ANALYZE facilities;For partial GiST indexes targeting a filtered subset — for example indexing only rows where active = true — see creating partial indexes for active map regions.
Geometry attribute is None after session.commit()
Symptom: facility.location is None immediately after a commit that set it.
Root cause: expire_on_commit=True (SQLAlchemy default) expires all attributes.
Fix option 1 — set expire_on_commit=False at the session factory:
SessionLocal = sessionmaker(bind=engine, expire_on_commit=False)Fix option 2 — call session.refresh() immediately after commit to reload server-generated values:
session.commit()
session.refresh(facility) # reloads id, location, and any server defaults
print(facility.location) # now populatedOOM on large geometry result sets
Symptom: worker process killed with out-of-memory when fetching many geometry rows.
Fix: Use yield_per to stream results in batches rather than loading everything into memory at once:
stmt = select(Facility).execution_options(yield_per=500)
with Session(engine) as session:
for batch in session.scalars(stmt).partitions():
process(batch)For server-side cursor patterns with psycopg2 named_cursor and asyncpg streaming, see session management for spatial data.
Verification
After applying the model definition and running migrations, confirm that the table, index, and column metadata are in order:
-- 1. Confirm the geometry column is registered with correct type and SRID
SELECT f_table_name, f_geometry_column, type, srid
FROM geometry_columns
WHERE f_table_name = 'facilities';
-- Expected: facilities | location | POINT | 4326
-- 2. Confirm the GiST index exists
SELECT indexname
FROM pg_indexes
WHERE tablename = 'facilities' AND indexdef ILIKE '%gist%';
-- Expected: idx_facilities_location
-- 3. Insert a test row and confirm it round-trips correctly
INSERT INTO facilities (name, category, location)
VALUES ('Test', 'qa', ST_GeomFromText('POINT(-73.98 40.74)', 4326));
SELECT id, name, ST_AsText(location) AS wkt, ST_SRID(location) AS srid
FROM facilities
WHERE name = 'Test';
-- Expected: wkt=POINT(-73.98 40.74), srid=4326
-- 4. Check the query uses the index
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM facilities
WHERE ST_DWithin(location::geography, ST_GeogFromText('POINT(-73.98 40.74)'), 5000);
-- Expected: "Index Scan using idx_facilities_location"
-- Cleanup
DELETE FROM facilities WHERE name = 'Test';Common pitfalls — quick reference
| Symptom | Root cause | Fix |
|---|---|---|
type "geometry" does not exist |
PostGIS extension not enabled | CREATE EXTENSION IF NOT EXISTS postgis; |
| SRID mismatch on insert | Missing srid in Geometry() or WKTElement |
Declare srid=4326 in both the column and the element |
WKBElement not serializable |
Raw ORM object passed to JSON encoder | to_shape() then mapping() before response |
| Seq scan despite GiST index | Index missing or stale statistics | CREATE INDEX … USING gist + ANALYZE |
None geometry after commit |
expire_on_commit=True default |
Set expire_on_commit=False or call session.refresh() |
| Alembic generates spurious geometry migrations | GeoAlchemy2/SQLAlchemy version mismatch | Pin both packages; check geoalchemy2.alembic render hooks |
FAQ: Model mapping with GeoAlchemy2
Q: Do I need to call Base.metadata.create_all() or use Alembic?
For anything beyond local development, use Alembic. create_all() cannot detect column changes; it only creates missing tables. GeoAlchemy2 ships Alembic render hooks (geoalchemy2.alembic) that emit correct op.create_geospatial_table() and op.add_geospatial_column() operations.
Q: Can I use asyncpg instead of psycopg2?
Yes. Replace postgresql+psycopg2 with postgresql+asyncpg in the create_engine URL and use create_async_engine from sqlalchemy.ext.asyncio. GeoAlchemy2’s type handling works identically through the async session interface. Be aware that asyncpg uses binary protocol by default, which requires the PostGIS binary codec to be registered — GeoAlchemy2 ≥ 0.14 handles this automatically.
Q: What happens if I store geometries without a GiST index?
Every spatial predicate (ST_DWithin, ST_Intersects, &&) falls back to a sequential scan. On a table with 100,000 rows a radius query that should take milliseconds can take several seconds. Always declare the index in __table_args__ and apply it via migration before the table receives data.
Q: How do I handle coordinate systems other than WGS84?
Use the appropriate EPSG code as the srid parameter — for example srid=32632 for UTM Zone 32N. Distance calculations in projected CRS use metres natively without the ::geography cast, which can be faster for large datasets in a single UTM zone. Use ST_Transform in queries when mixing CRS.
Related topics
- SQLAlchemy and GeoAlchemy2 integration workflows — parent section covering the full ORM integration landscape
- Session management for spatial data — isolation levels, scoped sessions, and deadlock mitigation
- Type coercion and serialization — WKB/WKT conversions, Pydantic integration, and custom JSON encoders
- Hybrid properties for geometry — dual Python/SQL geometry expressions for computed attributes
- Advanced GiST indexing and optimization — partial indexes, composite indexes, index-only scans, and EXPLAIN plan analysis