How to Reproject Spatial Data in Python (GeoPandas)

This guide shows how to reproject spatial data in Python with GeoPandas to_crs(), with examples for common coordinate systems.

Problem statement

A common GIS task is changing the coordinate reference system of vector data so it matches another layer, works with web maps, or produces correct distance and area results.

Typical cases include:

  • a shapefile in EPSG:4326 needs to display with web tiles in EPSG:3857
  • you need a projected CRS before calculating area, length, or buffers
  • two datasets do not align because they use different CRS values
  • a file has valid coordinates, but the CRS metadata is missing

If you need to reproject a GeoDataFrame in Python, the main issue is knowing whether the CRS is already defined correctly. If it is, you transform it. If it is missing but known from external metadata or documentation, you assign it first.

Quick answer

Use GeoPandas .to_crs() to reproject geometries.

Basic workflow:

  1. check the current CRS with gdf.crs
  2. if the CRS is missing but known, assign it with gdf.set_crs()
  3. reproject with gdf.to_crs(...)
  4. save the result or continue analysis

Example:

import geopandas as gpd

gdf = gpd.read_file("data/roads.shp")
print(gdf.crs)

gdf_3857 = gdf.to_crs(epsg=3857)
print(gdf_3857.crs)

Important: .to_crs() only works correctly if the current CRS is already defined correctly.

Step-by-step solution

Check the current CRS

Before changing anything, inspect the CRS.

import geopandas as gpd

gdf = gpd.read_file("data/parcels.shp")
print(gdf.crs)

Example output:

EPSG:4326

This tells you what coordinate system the current geometry coordinates use.

You need to distinguish between:

  • missing CRS: gdf.crs is None
  • incorrect CRS: a CRS is present, but it does not match the actual coordinates

A missing CRS blocks reprojection: to_crs() raises an error until a CRS is assigned. An incorrect CRS is worse because reprojection runs without complaint, but the output is wrong.
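
One way to catch an obviously wrong label is to compare the layer's bounds with what the declared CRS allows. The sketch below uses a hypothetical helper, looks_mislabeled(), which is not a GeoPandas function, and made-up sample coordinates. It only detects one failure mode (a geographic CRS attached to values that cannot be degrees) and cannot confirm that a CRS is right.

```python
import geopandas as gpd
from shapely.geometry import Point


def looks_mislabeled(gdf):
    """Return True if a geographic CRS is declared but the bounds are not valid degrees.

    Hypothetical helper for illustration; False does not prove the CRS is correct.
    """
    if gdf.crs is None or not gdf.crs.is_geographic:
        return False
    minx, miny, maxx, maxy = gdf.total_bounds
    return bool(minx < -180 or maxx > 180 or miny < -90 or maxy > 90)


# Meter-sized values (made up for this example) wrongly labeled as EPSG:4326
bad = gpd.GeoDataFrame(geometry=[Point(583960.0, 4507523.0)], crs="EPSG:4326")
# Real longitude/latitude values labeled as EPSG:4326
good = gpd.GeoDataFrame(geometry=[Point(-73.9857, 40.7484)], crs="EPSG:4326")

print("bad layer flagged: ", looks_mislabeled(bad))
print("good layer flagged:", looks_mislabeled(good))
```

A check like this is a tripwire, not a way to infer the CRS; the true CRS still has to come from metadata or documentation.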

Code example: read a shapefile and inspect its CRS

import geopandas as gpd

gdf = gpd.read_file("data/city_boundary.shp")

print("Current CRS:", gdf.crs)
print(gdf.head())

Use this as the first check any time you work with a new shapefile or GeoJSON file.

Set the CRS if it is missing

Use .set_crs() only when the coordinates are already in a known CRS but the metadata is missing.

For example, if coordinates are longitude and latitude in WGS84, but gdf.crs is None, assign EPSG:4326.

if gdf.crs is None:
    gdf = gdf.set_crs(epsg=4326)

This does not change coordinate values. It only labels what the existing coordinates mean.

Code example: assign a missing CRS with set_crs()

import geopandas as gpd
from shapely.geometry import Point

gdf = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[Point(-73.9857, 40.7484), Point(-73.9819, 40.7681)]
)

print("Before:", gdf.crs)

gdf = gdf.set_crs(epsg=4326)

print("After:", gdf.crs)

Use this only if you know the source coordinates are in EPSG:4326. Do not guess the CRS from coordinates alone.
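
A related case is a layer that already carries the wrong label. By default set_crs() refuses to replace an existing, different CRS; passing allow_override=True swaps the label without touching the coordinates. The sketch below assumes a made-up scenario in which longitude/latitude values were labeled EPSG:3857 and the true CRS is known to be EPSG:4326.

```python
import geopandas as gpd
from shapely.geometry import Point

# Longitude/latitude coordinates that were given the wrong label (assumed scenario)
gdf = gpd.GeoDataFrame(
    {"name": ["A"]},
    geometry=[Point(-73.9857, 40.7484)],
    crs="EPSG:3857",
)

try:
    gdf.set_crs(epsg=4326)  # refuses to overwrite an existing, different CRS
except ValueError as err:
    print("set_crs refused:", err)

# Replace the label only; coordinate values stay the same
fixed = gdf.set_crs(epsg=4326, allow_override=True)

print(fixed.crs)
print(fixed.geometry.iloc[0])
```

Use the override only to correct a label you know is wrong. If the label is right and you want different coordinates, that is a job for to_crs().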

Reproject the GeoDataFrame with to_crs()

Once the source CRS is correct, use .to_crs() to transform the coordinates into a new system.

Code example: reproject to Web Mercator with to_crs()

A common case is converting from EPSG:4326 to EPSG:3857 for web map display.

import geopandas as gpd

gdf = gpd.read_file("data/points.geojson")

print("Source CRS:", gdf.crs)

gdf_web = gdf.to_crs(epsg=3857)

print("Reprojected CRS:", gdf_web.crs)
print(gdf_web.geometry.head())

This transforms the geometry coordinates from geographic coordinates (typically degrees in EPSG:4326) to projected coordinates in meters in Web Mercator (EPSG:3857).
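
To see the change in units concretely, the following sketch reprojects a single point and prints both sets of coordinates. The point is the same sample location used earlier; the comparison against the spherical Web Mercator formula is added here only as an illustration.

```python
import math

import geopandas as gpd
from shapely.geometry import Point

lon, lat = -73.9857, 40.7484
gdf = gpd.GeoDataFrame(geometry=[Point(lon, lat)], crs="EPSG:4326")
gdf_web = gdf.to_crs(epsg=3857)

pt = gdf_web.geometry.iloc[0]
print("Degrees:", lon, lat)
print("Meters: ", round(pt.x, 2), round(pt.y, 2))

# Web Mercator treats the Earth as a sphere of radius 6378137 m,
# so the easting is the radius times the longitude in radians.
expected_x = 6378137.0 * math.radians(lon)
print("Easting matches the formula:", math.isclose(pt.x, expected_x, abs_tol=0.01))
```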

Use EPSG:3857 for display, not for accurate area or distance calculations.
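
The size of the distortion is easy to underestimate. As a rough sketch, the code below measures the same small rectangle (made-up coordinates near New York City) in Web Mercator and in UTM zone 18N. Web Mercator inflates areas by roughly 1 / cos²(latitude), so at this latitude the two results should differ by a large factor rather than a few percent.

```python
import geopandas as gpd
from shapely.geometry import box

# A small rectangle defined in longitude/latitude (illustrative values)
rect = gpd.GeoDataFrame(
    geometry=[box(-74.00, 40.74, -73.99, 40.75)], crs="EPSG:4326"
)

area_web = rect.to_crs(epsg=3857).area.iloc[0]   # Web Mercator
area_utm = rect.to_crs(epsg=32618).area.iloc[0]  # UTM zone 18N

print("Web Mercator area (sq m):", round(area_web))
print("UTM 18N area (sq m):     ", round(area_utm))
print("Ratio:", round(area_web / area_utm, 2))
```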

Common target CRS choices:

  • EPSG:3857 for web mapping
  • local UTM zones for measurement
  • national projected CRS for local or regional analysis

Code example: reproject to a projected CRS for area or distance analysis

For area and distance work, use a projected CRS instead of EPSG:4326.

import geopandas as gpd

gdf = gpd.read_file("data/parcels.geojson")

# Example: UTM zone 18N
gdf_utm = gdf.to_crs(epsg=32618)

gdf_utm["area_sqm"] = gdf_utm.area
print(gdf_utm[["area_sqm"]].head())

If your data is in a different location, choose the appropriate UTM zone or another suitable local projected CRS.
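
If you do not know the UTM zone, recent GeoPandas versions can suggest one from the layer's extent with estimate_utm_crs(). The sketch below assumes a GeoPandas version that provides that method and reuses the two sample points from earlier; treat the suggestion as a starting point, since a national or regional CRS may suit your project better.

```python
import geopandas as gpd
from shapely.geometry import Point

gdf = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[Point(-73.9857, 40.7484), Point(-73.9819, 40.7681)],
    crs="EPSG:4326",
)

# Ask GeoPandas for a UTM CRS that fits the layer's extent
utm_crs = gdf.estimate_utm_crs()
print("Estimated EPSG code:", utm_crs.to_epsg())

gdf_utm = gdf.to_crs(utm_crs)

# Distance between the two points, in meters
a, b = gdf_utm.geometry.iloc[0], gdf_utm.geometry.iloc[1]
print("Distance (m):", round(a.distance(b)))
```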

Save the reprojected data

After reprojection, save the output to a new file.

Code example: save the reprojected output

import geopandas as gpd

gdf = gpd.read_file("data/roads.shp")
gdf_3857 = gdf.to_crs(epsg=3857)

gdf_3857.to_file("output/roads_3857.shp")

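You can also save to GeoJSON or GeoPackage. Note that the current GeoJSON specification (RFC 7946) expects WGS 84 longitude/latitude, so many tools assume EPSG:4326 when reading GeoJSON; GeoPackage is the safer choice for projected output: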

gdf_3857.to_file("output/roads_3857.geojson", driver="GeoJSON")
gdf_3857.to_file("output/roads_3857.gpkg", driver="GPKG")

Verify the saved CRS by reading the file again:

check = gpd.read_file("output/roads_3857.gpkg")
print(check.crs)

Code examples

Reproject in one short workflow

import geopandas as gpd

gdf = gpd.read_file("data/buildings.geojson")

if gdf.crs is None:
    gdf = gdf.set_crs(epsg=4326)

gdf_projected = gdf.to_crs(epsg=32618)
gdf_projected.to_file("output/buildings_32618.gpkg", driver="GPKG")

Match one layer to another layer's CRS

This is common before overlay, clipping, or spatial joins.

import geopandas as gpd

parcels = gpd.read_file("data/parcels.gpkg")
zoning = gpd.read_file("data/zoning.gpkg")

zoning = zoning.to_crs(parcels.crs)

print("Parcels CRS:", parcels.crs)
print("Zoning CRS:", zoning.crs)

Explanation

set_crs() vs to_crs()

This is the most important distinction in GeoPandas CRS workflows.

set_crs()

Use this when the coordinates are already correct, but the CRS label is missing.

It answers: what do these existing coordinates mean?

gdf = gdf.set_crs(epsg=4326)

to_crs()

Use this when you want to transform coordinates into a different coordinate system.

It answers: what would these coordinates be in another CRS?

gdf_projected = gdf.to_crs(epsg=3857)

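If you use set_crs() when you meant to_crs(), the coordinates stay exactly as they were and only the label changes, so the layer does not move into the new coordinate system. If you use to_crs() on data with a wrong CRS, the code runs but the output is incorrect; if the CRS is missing, to_crs() raises an error.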
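
The difference is easiest to see side by side. The sketch below starts from a point with no CRS, shows that to_crs() fails on it, then labels it with set_crs() and transforms it with to_crs(). It assumes the sample point really is longitude/latitude in WGS84.

```python
import geopandas as gpd
from shapely.geometry import Point

raw = gpd.GeoDataFrame(geometry=[Point(-73.9857, 40.7484)])  # no CRS yet

# to_crs() cannot work without a source CRS
try:
    raw.to_crs(epsg=3857)
except ValueError as err:
    print("to_crs failed:", err)

labeled = raw.set_crs(epsg=4326)       # label only
projected = labeled.to_crs(epsg=3857)  # coordinates are recomputed

print("raw:      ", raw.crs, raw.geometry.iloc[0])
print("labeled:  ", labeled.crs, labeled.geometry.iloc[0])
print("projected:", projected.crs, projected.geometry.iloc[0])
```

The labeled layer prints the same numbers as the raw one; only the projected layer has new coordinate values.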

Choosing the right target CRS

EPSG:4326 is common for storage and exchange, especially with GeoJSON, but it is not a good choice for measurement: coordinates are in degrees, so planar distances and areas come out in degrees and square degrees rather than meters.

Use a projected CRS when you need:

  • distance
  • area
  • length
  • buffers
  • overlay analysis that depends on local accuracy

Examples:

  • EPSG:3857 for web display
  • UTM CRS such as EPSG:32618 for local metric analysis
  • a national CRS for country-specific workflows

The correct CRS depends on where the data is located and what you need to do with it.
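
One way to keep measurement code from running on degrees by accident is to check the CRS before measuring. The helper below, area_in_crs_units(), is a hypothetical name used for illustration and is not part of GeoPandas; it relies on the is_projected and axis_info attributes of the CRS object that GeoPandas exposes as gdf.crs.

```python
import geopandas as gpd
from shapely.geometry import box


def area_in_crs_units(gdf):
    """Return areas, refusing to measure in a missing or geographic CRS.

    Hypothetical helper for illustration, not part of GeoPandas.
    """
    if gdf.crs is None:
        raise ValueError("Layer has no CRS; assign one before measuring.")
    if not gdf.crs.is_projected:
        raise ValueError(f"{gdf.crs.to_string()} is not projected; reproject first.")
    return gdf.area


# Illustrative rectangle in longitude/latitude
parcels = gpd.GeoDataFrame(
    geometry=[box(-74.00, 40.74, -73.99, 40.75)], crs="EPSG:4326"
)

try:
    area_in_crs_units(parcels)
except ValueError as err:
    print("Refused:", err)

parcels_utm = parcels.to_crs(epsg=32618)
print("Linear unit:", parcels_utm.crs.axis_info[0].unit_name)
print(area_in_crs_units(parcels_utm).round())
```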

Why layers fail to align

A mismatched CRS is one of the most common reasons layers do not line up.

Other causes include:

  • one layer has no CRS metadata
  • the wrong CRS was assigned with set_crs()
  • data was exported without correct CRS information
  • axis order confusion in some external workflows

Before overlay, join, clip, or measurement, make sure both layers use the same CRS and that it suits the task.
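
On the axis order point in the list above: the confusion usually shows up when coordinates pass through pyproj or another tool directly rather than through GeoPandas. The sketch below uses pyproj's Transformer to show both conventions for EPSG:4326; the sample point is the one used earlier, and the example is only meant to illustrate the ordering.

```python
from pyproj import Transformer

lon, lat = -73.9857, 40.7484

# always_xy=True forces (x, y) = (longitude, latitude) order for input and output
xy = Transformer.from_crs("EPSG:4326", "EPSG:3857", always_xy=True)
x, y = xy.transform(lon, lat)
print("always_xy=True: ", round(x), round(y))

# Without it, pyproj follows the axis order declared by the CRS authority,
# which for EPSG:4326 is (latitude, longitude)
authority = Transformer.from_crs("EPSG:4326", "EPSG:3857")
x2, y2 = authority.transform(lat, lon)
print("authority order:", round(x2), round(y2))
```

Both calls describe the same location. Passing (lon, lat) to the second transformer, or (lat, lon) to the first, would silently give a different place, which is a common source of misaligned layers.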

Edge cases / notes

Notes and common mistakes

  • reprojection fails if gdf.crs is None
  • assigning the wrong CRS can produce bad output even when the code runs
  • GeoJSON is commonly used with EPSG:4326
  • GeoPackage usually preserves CRS metadata more reliably than shapefile
  • shapefiles have format limitations, so GeoPackage is often a better output format for practical workflows
  • large datasets can take longer to transform
  • make sure both layers share the same CRS before spatial joins, overlays, or clipping, and use a projected CRS for measurements
  • invalid geometries can cause problems in later analysis, even if reprojection itself succeeds

A simple geometry check:

invalid = ~gdf.is_valid
print(gdf[invalid])

If needed, repair invalid geometries before more complex processing.
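
As a minimal sketch of one repair option, the code below builds a deliberately invalid "bowtie" polygon and fixes it with GeoSeries.make_valid(). It assumes a GeoPandas version that provides that method; the geometries and the CRS are made up for the example. Repairs can change the geometry type, so inspect the result before relying on it.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A "bowtie" polygon whose edges cross, which makes it invalid
bowtie = Polygon([(0, 0), (1, 1), (1, 0), (0, 1)])
square = Polygon([(2, 0), (3, 0), (3, 1), (2, 1)])

gdf = gpd.GeoDataFrame(
    {"name": ["bowtie", "square"]},
    geometry=[bowtie, square],
    crs="EPSG:32618",
)

print("Valid before:", gdf.is_valid.tolist())

# Repair on a copy so the original layer stays available for comparison
repaired = gdf.copy()
repaired["geometry"] = gdf.geometry.make_valid()

print("Valid after: ", repaired.is_valid.tolist())
print("Types after: ", repaired.geom_type.tolist())
```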

If you need background on coordinate systems, see Coordinate Reference Systems (CRS) in Python GIS.

For related tasks, see:

If you need to export the result, see How to Export GeoJSON in Python with GeoPandas.

If your data still does not line up, check Why GeoPandas Layers Do Not Align on a Map.

FAQ

How do I reproject a GeoDataFrame in Python with GeoPandas?

Use gdf.to_crs(...) after confirming that gdf.crs is already set correctly.

gdf = gdf.to_crs(epsg=3857)

What is the difference between set_crs() and to_crs() in GeoPandas?

set_crs() assigns CRS metadata without changing coordinates.
to_crs() transforms coordinates into a different coordinate system.

Why does reprojection fail when my GeoDataFrame has no CRS?

GeoPandas cannot transform coordinates if it does not know the source coordinate system. Set the CRS first if it is known:

gdf = gdf.set_crs(epsg=4326)

Which CRS should I use for distance or area calculations in GeoPandas?

Use a projected CRS, such as a local UTM zone or national projected CRS. Do not use EPSG:4326 for area or distance calculations.