How to Clip Spatial Data in Python with GeoPandas
Problem statement
A common GIS task is clipping one vector dataset to the boundary of another. For example:
- limiting a roads layer to a city boundary
- clipping land parcels to a project area
- trimming environmental polygons to an administrative boundary
In Python, this usually means taking a shapefile or GeoJSON layer and keeping only the features, or feature parts, that fall inside a polygon boundary.
This page shows how to clip spatial data in Python with GeoPandas. The same workflow works for shapefiles, GeoJSON, and other vector formats supported by GeoPandas.
Quick answer
Use geopandas.clip() to clip one GeoDataFrame with a polygon boundary.
Two requirements matter:
- both layers should use the same CRS
- the clip layer should contain polygon or multipolygon geometry
import geopandas as gpd
target = gpd.read_file("data/roads.shp")
boundary = gpd.read_file("data/city_boundary.shp")
if target.crs != boundary.crs:
    boundary = boundary.to_crs(target.crs)
clipped = gpd.clip(target, boundary)
clipped.to_file("output/roads_clipped.shp")
The result keeps only features, or parts of features, that fall inside the boundary.
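To try the same call without any files on disk, here is a minimal sketch that builds both layers in memory. The road, the square boundary, and the EPSG:3857 CRS are made-up values for illustration only.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# Hypothetical data: one road crossing a 10 x 10 square boundary.
target = gpd.GeoDataFrame(
    {"name": ["Main St"]},
    geometry=[LineString([(-5, 5), (15, 5)])],
    crs="EPSG:3857",
)
boundary = gpd.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs="EPSG:3857")

clipped = gpd.clip(target, boundary)

print(target.length.iloc[0])    # length before clipping
print(clipped.length.iloc[0])   # length of the part inside the boundary
print(clipped["name"].iloc[0])  # attribute carried over from the input layer
```

The road is 20 units long and only the 10 units inside the square should remain, with the name attribute intact.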
Step-by-step solution
Load the input spatial layers
First read the dataset you want to clip and the polygon layer that will act as the clip boundary.
import geopandas as gpd
roads = gpd.read_file("data/roads.shp")
city_boundary = gpd.read_file("data/city_boundary.shp")
GeoPandas supports common vector formats including:
- ESRI Shapefile
- GeoJSON
- GeoPackage
Check that both layers use the same CRS
Before clipping, both layers should be in the same coordinate reference system.
print(roads.crs)
print(city_boundary.crs)
If they differ, reproject one layer to match the other:
if roads.crs != city_boundary.crs:
    city_boundary = city_boundary.to_crs(roads.crs)
A CRS mismatch is one of the most common causes of empty or incorrect clip results.
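The effect of a mismatch can be reproduced with a small in-memory sketch. The point, the box, and the two CRS codes below are illustrative assumptions. GeoPandas may emit a warning about the mismatch, which is silenced here to keep the output short.

```python
import warnings

import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical layers: a point in degrees and a boundary around it.
points = gpd.GeoDataFrame({"id": [1]}, geometry=[Point(10.5, 10.5)], crs="EPSG:4326")
boundary_4326 = gpd.GeoDataFrame(geometry=[box(10, 10, 11, 11)], crs="EPSG:4326")

# The same boundary stored in a metre-based CRS, as often happens in practice.
boundary_3857 = boundary_4326.to_crs("EPSG:3857")

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the CRS mismatch warning
    mismatched = gpd.clip(points, boundary_3857)

# Reproject the boundary to the CRS of the layer being clipped, then clip again.
matched = gpd.clip(points, boundary_3857.to_crs(points.crs))

print(len(mismatched), len(matched))
```

The first clip compares degree coordinates with metre coordinates, so the layers never overlap. After reprojection the point falls inside the boundary.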
Confirm the clip layer is polygon-based
The clip boundary should usually be a polygon or multipolygon layer.
You can inspect geometry types like this:
print(city_boundary.geom_type.unique())
Typical valid clip geometry types are:
- Polygon
- MultiPolygon
If your boundary layer is made of points or lines, it is not the right input for a standard vector clip workflow.
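If you want this check as a reusable guard, one possible sketch is below. The helper name is_polygon_layer is hypothetical and not part of GeoPandas.

```python
import geopandas as gpd
from shapely.geometry import Point, box


def is_polygon_layer(layer):
    """Return True if every geometry is a Polygon or MultiPolygon."""
    return bool(layer.geom_type.isin(["Polygon", "MultiPolygon"]).all())


# Hypothetical layers for illustration.
polygons = gpd.GeoDataFrame(geometry=[box(0, 0, 1, 1)], crs="EPSG:4326")
points = gpd.GeoDataFrame(geometry=[Point(0, 0)], crs="EPSG:4326")

print(is_polygon_layer(polygons))
print(is_polygon_layer(points))
```

Note that an empty layer would also pass this check, so you may want to test for len(layer) > 0 separately.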
Run the clip operation with GeoPandas
Use gpd.clip() to clip the input data by the boundary layer.
roads_clipped = gpd.clip(roads, city_boundary)
This returns:
- features fully inside the boundary unchanged
- line and polygon features crossing the boundary cut to the boundary edge
- point features outside the boundary removed
- features fully outside removed
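The behaviour listed above can be checked on a few made-up features. The parcels, wells, and boundary below are hypothetical, chosen so that one parcel sits inside, one crosses the edge, and one lies outside.

```python
import geopandas as gpd
from shapely.geometry import Point, box

boundary = gpd.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs="EPSG:3857")

parcels = gpd.GeoDataFrame(
    {"parcel": ["inside", "crossing", "outside"]},
    geometry=[box(1, 1, 3, 3), box(8, 8, 12, 12), box(20, 20, 22, 22)],
    crs="EPSG:3857",
)
wells = gpd.GeoDataFrame(
    {"well": ["in", "out"]},
    geometry=[Point(5, 5), Point(50, 50)],
    crs="EPSG:3857",
)

parcels_clipped = gpd.clip(parcels, boundary)
wells_clipped = gpd.clip(wells, boundary)

# Map each surviving parcel to its area after clipping.
areas = dict(zip(parcels_clipped["parcel"], parcels_clipped.area))
print(areas)
print(list(wells_clipped["well"]))
```

The crossing parcel starts with an area of 16 and keeps only the part inside the boundary, while the outside parcel and the outside well are dropped.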
Save the clipped result to a new file
Write the output to a new dataset so the original files stay unchanged.
Save as a shapefile:
roads_clipped.to_file("output/roads_clipped.shp")
Or save as GeoJSON:
roads_clipped.to_file("output/roads_clipped.geojson", driver="GeoJSON")
Code examples
Example 1: Clip a shapefile by an administrative boundary
This example clips a roads shapefile to a city boundary shapefile.
import geopandas as gpd
roads = gpd.read_file("data/roads.shp")
city_boundary = gpd.read_file("data/city_boundary.shp")
if roads.crs != city_boundary.crs:
    city_boundary = city_boundary.to_crs(roads.crs)
roads_clipped = gpd.clip(roads, city_boundary)
roads_clipped.to_file("output/roads_within_city.shp")
This is a typical workflow when you need road features only for one municipality or study area.
Example 2: Clip GeoJSON data with a polygon layer
The same process works for GeoJSON.
import geopandas as gpd
parcels = gpd.read_file("data/parcels.geojson")
project_area = gpd.read_file("data/project_area.geojson")
if parcels.crs != project_area.crs:
    project_area = project_area.to_crs(parcels.crs)
parcels_clipped = gpd.clip(parcels, project_area)
parcels_clipped.to_file("output/parcels_project_area.geojson", driver="GeoJSON")
Example 3: Clip using a single polygon from a larger boundary dataset
Sometimes your boundary file contains many areas and you only want one of them.
import geopandas as gpd
land_use = gpd.read_file("data/land_use.shp")
districts = gpd.read_file("data/districts.shp")
selected_district = districts[districts["NAME"] == "Central District"]
if land_use.crs != selected_district.crs:
    selected_district = selected_district.to_crs(land_use.crs)
land_use_clipped = gpd.clip(land_use, selected_district)
land_use_clipped.to_file("output/land_use_central_district.shp")
This is useful when clipping by one administrative unit, project area, or management zone from a larger boundary dataset.
Explanation
Clipping is a vector operation that uses polygon geometry as a mask.
When you clip spatial data in Python with GeoPandas:
- line and polygon features crossing the boundary are cut
- point features are filtered to those within the boundary
- features outside the boundary are removed
- features inside the boundary are kept
- attribute fields from the input layer are preserved
This is different from a few related operations.
Clip vs bounding box filter
A bounding box filter only checks whether features fall inside a rectangle. It does not cut features to the true polygon boundary.
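To make the difference concrete, here is a small sketch using the .cx coordinate indexer as the bounding box filter. The triangular boundary and the diagonal road are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import LineString, Polygon

# Triangular boundary: its bounding box is the square from (0, 0) to (10, 10).
boundary = gpd.GeoDataFrame(
    geometry=[Polygon([(0, 0), (10, 0), (0, 10)])], crs="EPSG:3857"
)
roads = gpd.GeoDataFrame(
    {"name": ["diagonal"]},
    geometry=[LineString([(0, 0), (10, 10)])],
    crs="EPSG:3857",
)

minx, miny, maxx, maxy = boundary.total_bounds
bbox_filtered = roads.cx[minx:maxx, miny:maxy]  # selects by rectangle, no cutting
clipped = gpd.clip(roads, boundary)             # cuts to the triangle itself

print(bbox_filtered.length.iloc[0])
print(clipped.length.iloc[0])
```

The bounding box filter returns the road at full length, while the clip keeps only the half that lies inside the triangle.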
Clip vs spatial join
A spatial join adds attributes based on spatial relationships. It does not trim geometry.
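A short sketch with gpd.sjoin() shows this. The road, the district, and the column names are made up, and the predicate argument assumes a reasonably recent GeoPandas version.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

roads = gpd.GeoDataFrame(
    {"name": ["Main St"]},
    geometry=[LineString([(-5, 5), (15, 5)])],
    crs="EPSG:3857",
)
districts = gpd.GeoDataFrame(
    {"district": ["Central"]}, geometry=[box(0, 0, 10, 10)], crs="EPSG:3857"
)

# Attach district attributes to every road that intersects a district.
joined = gpd.sjoin(roads, districts, how="inner", predicate="intersects")

print(list(joined.columns))
print(joined.length.iloc[0])  # geometry is not trimmed
```

The joined road gains the district column but keeps its full original length, including the parts outside the district.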
Clip vs overlay intersection
overlay(..., how="intersection") is more general and combines attributes from both layers. clip() is simpler when you only want to trim one layer by a boundary.
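The two calls can be compared side by side on the same inputs. The land use polygon, the zone polygon, and the column names below are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

land_use = gpd.GeoDataFrame(
    {"land_use": ["forest"]}, geometry=[box(5, 5, 15, 15)], crs="EPSG:3857"
)
zones = gpd.GeoDataFrame(
    {"zone": ["A"]}, geometry=[box(0, 0, 10, 10)], crs="EPSG:3857"
)

clipped = gpd.clip(land_use, zones)
intersected = gpd.overlay(land_use, zones, how="intersection")

print(list(clipped.columns))      # attributes from land_use only
print(list(intersected.columns))  # attributes from both layers
print(clipped.area.iloc[0], intersected.area.iloc[0])
```

Both results cover the same overlapping area. The difference is in the attribute table, where only the overlay result carries the zone column.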
In many GIS workflows, clipping is used to reduce a larger dataset to a study area before analysis, mapping, or export.
Edge cases or notes
CRS mismatch causes incorrect or empty results
If the layers use different CRS values, they may appear not to overlap even when they represent the same area.
Always check before clipping:
print(target.crs, boundary.crs)
Then reproject one layer if needed:
boundary = boundary.to_crs(target.crs)
Invalid geometries can break the clip process
Self-intersections or other invalid geometry problems can cause clipping errors.
Check validity:
print(boundary.is_valid.value_counts())
A common repair approach is:
boundary["geometry"] = boundary.buffer(0)
Use this carefully and inspect results, especially for production data.
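As an alternative to buffer(0), recent GeoPandas versions offer make_valid() on a GeoSeries. The sketch below assumes that method is available in your installed version and uses a made-up self-intersecting "bowtie" polygon as the invalid boundary.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A self-intersecting "bowtie" polygon, used here as a hypothetical invalid boundary.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
boundary = gpd.GeoDataFrame(geometry=[bowtie], crs="EPSG:3857")

print(boundary.is_valid.iloc[0])  # validity before repair

# Repair a copy so the original layer stays available for comparison.
repaired = boundary.copy()
repaired["geometry"] = boundary.geometry.make_valid()

print(repaired.is_valid.iloc[0])   # validity after repair
print(repaired.geom_type.iloc[0])  # geometry type after repair
print(repaired.area.iloc[0])       # area after repair
```

Comparing area and geometry type before and after a repair is a quick way to spot cases where the fix changed the shape more than expected.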
Multi-part boundaries may affect output size
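gpd.clip() clips against the combined area of all polygons in the clip layer, so adjacent boundary polygons do not split features along their shared edges. If the clip layer is made of many separate boundary parts, though, the result can become fragmented: a single road, parcel, or land cover feature may come back as several disconnected pieces. This is normal.
If you want one explicit boundary geometry to inspect or reuse, dissolve the boundary first: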
boundary_dissolved = boundary.dissolve()
clipped = gpd.clip(target, boundary_dissolved)
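Here is the same dissolve step on made-up data: two adjacent boundary squares and one road running through both. All names and coordinates are illustrative.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# Hypothetical boundary made of two adjacent parts.
boundary = gpd.GeoDataFrame(
    {"part": ["west", "east"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:3857",
)
target = gpd.GeoDataFrame(
    {"name": ["road"]},
    geometry=[LineString([(-1, 0.5), (3, 0.5)])],
    crs="EPSG:3857",
)

# With no "by" column, dissolve() merges every row into a single geometry.
boundary_dissolved = boundary.dissolve()
clipped = gpd.clip(target, boundary_dissolved)

print(len(boundary), len(boundary_dissolved))
print(clipped.length.sum())
```

The dissolved layer has a single row, and the road is trimmed to the 2 units that fall inside the merged boundary.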
Attribute data is preserved from the input layer
The clipped output keeps attributes from the layer being clipped. It does not automatically add fields from the boundary layer.
If you need attributes from both layers, use an overlay or spatial join instead.
Performance considerations for large datasets
Clipping very large layers can be slow, especially with complex polygons.
Practical ways to improve performance:
- keep only necessary columns before clipping
- dissolve many boundary polygons into one geometry if appropriate
- use a projected CRS for regional workflows
- prefilter by extent before full clipping when datasets are very large
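Two of these tips, dropping unused columns and prefiltering by extent, can be sketched as follows. The data and column names are hypothetical. gpd.clip() may already narrow down candidates internally, so the gain from a manual prefilter can be small; it is worth timing on your own data.

```python
import geopandas as gpd
from shapely.geometry import Point, box

target = gpd.GeoDataFrame(
    {
        "id": [1, 2, 3],
        "notes": ["a", "b", "c"],  # column not needed downstream
    },
    geometry=[Point(1, 1), Point(5, 5), Point(500, 500)],
    crs="EPSG:3857",
)
boundary = gpd.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs="EPSG:3857")

# Keep only the columns needed after clipping.
slim = target[["id", "geometry"]]

# Prefilter to features that touch the boundary's bounding box.
minx, miny, maxx, maxy = boundary.total_bounds
candidates = slim.cx[minx:maxx, miny:maxy]

# Run the exact clip on the reduced candidate set.
clipped = gpd.clip(candidates, boundary)
print(len(target), len(candidates), len(clipped))
```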
Internal links
For broader context, see Vector operations in GeoPandas.
Related task guides:
- How to Reproject Spatial Data in Python (GeoPandas)
- How to Read a Shapefile in Python with GeoPandas
For output formats after clipping, see How to Export GeoJSON in Python with GeoPandas.
FAQ
How do I clip a shapefile in Python with GeoPandas?
Read both layers with gpd.read_file(), make sure they share the same CRS, then run gpd.clip().
clipped = gpd.clip(input_layer, boundary_layer)
Save the result with to_file().
Why does geopandas.clip() return an empty result?
The most common reasons are:
- the layers use different CRS values
- the datasets do not actually overlap
- the clip layer is not polygon-based
- one of the layers has invalid geometry
Check CRS, geometry type, and validity first.
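These checks can be bundled into one small helper. The function diagnose_clip_inputs below is a hypothetical sketch, not a GeoPandas function, and the sample layers are made up.

```python
import geopandas as gpd
from shapely.geometry import Point, box


def diagnose_clip_inputs(target, boundary):
    """Return a list of likely reasons a clip would come back empty."""
    problems = []
    if target.crs != boundary.crs:
        problems.append("CRS mismatch")
    if not boundary.geom_type.isin(["Polygon", "MultiPolygon"]).all():
        problems.append("clip layer is not polygon-based")
    if not (target.is_valid.all() and boundary.is_valid.all()):
        problems.append("invalid geometry")
    # Extents are only comparable when both layers share a CRS.
    if target.crs == boundary.crs:
        tminx, tminy, tmaxx, tmaxy = target.total_bounds
        bminx, bminy, bmaxx, bmaxy = boundary.total_bounds
        if tmaxx < bminx or bmaxx < tminx or tmaxy < bminy or bmaxy < tminy:
            problems.append("extents do not overlap")
    return problems


points = gpd.GeoDataFrame(geometry=[Point(10.5, 10.5)], crs="EPSG:4326")
far_boundary = gpd.GeoDataFrame(geometry=[box(0, 0, 1, 1)], crs="EPSG:4326")
other_crs_boundary = gpd.GeoDataFrame(geometry=[box(0, 0, 1, 1)], crs="EPSG:3857")

print(diagnose_clip_inputs(points, far_boundary))
print(diagnose_clip_inputs(points, other_crs_boundary))
```

An empty list means none of these common problems were detected, although overlapping extents do not guarantee that the geometries themselves intersect.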
Do both layers need the same CRS before clipping?
Yes. In practice, they should be in the same CRS before running gpd.clip(). If not, results may be empty or spatially wrong.
What is the difference between clip and overlay in GeoPandas?
clip() trims one layer by a polygon boundary and keeps attributes from the input layer. overlay() performs more general spatial operations like intersection and can combine attributes from both layers.