How to Fix Invalid Geometries in Python (GeoPandas)
How to find and fix invalid geometries in GeoPandas using buffer(0), make_valid, and geometry validation checks.
Problem statement
A common GIS data problem is a vector layer that loads successfully but fails during later processing because some features have invalid geometry.
This usually shows up when you try to run:
- overlays
- spatial joins
- dissolves
- clipping
- exports to file
- imports into PostGIS or another database
Typical signs include:
- TopologyException errors
- self-intersecting polygons
- empty or broken output
- failed writes to shapefile or GeoJSON
- unexpected spatial operation results
If you are working with a shapefile or GeoJSON in GeoPandas, the practical fix is to detect invalid features, inspect the reason, repair them, and verify the result before continuing.
Quick answer
The shortest reliable workflow to fix invalid geometries in GeoPandas is:
import geopandas as gpd
from shapely import make_valid
gdf = gpd.read_file("data/parcels.shp")
invalid_mask = ~gdf.geometry.is_valid
gdf.loc[invalid_mask, "geometry"] = gdf.loc[invalid_mask, "geometry"].apply(make_valid)
gdf = gdf[gdf.geometry.notna()]
gdf = gdf[~gdf.geometry.is_empty]
gdf = gdf[gdf.geometry.is_valid]
gdf.to_file("data/parcels_cleaned.gpkg", driver="GPKG")
Use make_valid() first. Use buffer(0) only as a fallback for simple polygon issues. Always re-check validity before saving.
Step-by-step solution
Load the vector dataset into GeoPandas
Start by reading the source file into a GeoDataFrame.
import geopandas as gpd
gdf = gpd.read_file("data/parcels.shp")
print(gdf.crs)
print(gdf.geometry.name)
print(len(gdf))
This works the same way for GeoJSON:
gdf = gpd.read_file("data/parcels.geojson")
If you plan to compare or combine this layer with others later, keep the CRS unchanged or reproject explicitly with to_crs().
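As a minimal sketch of that explicit reprojection, the snippet below builds two tiny in-memory layers (made-up data, with hypothetical names `parcels` and `roads`) and reprojects one only when the CRSs differ. With real data you would read both layers with read_file() instead.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative in-memory layers; the coordinates and CRS choices are assumptions.
parcels = gpd.GeoDataFrame(
    {"name": ["a"]}, geometry=[Point(-0.1276, 51.5072)], crs="EPSG:4326"
)
roads = gpd.GeoDataFrame(
    {"name": ["b"]}, geometry=[Point(0, 0)], crs="EPSG:3857"
)

# Reproject explicitly, and only when the two layers disagree.
if parcels.crs != roads.crs:
    parcels = parcels.to_crs(roads.crs)

print(parcels.crs)
```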
Find invalid geometries
Use .is_valid on the geometry column to identify invalid features.
gdf["is_valid"] = gdf.geometry.is_valid
invalid_gdf = gdf[~gdf["is_valid"]]
print("Total features:", len(gdf))
print("Invalid features:", len(invalid_gdf))
If you only want the bad rows for review:
print(invalid_gdf[["is_valid", "geometry"]].head())
This is usually the first check to run before overlay, dissolve, or export operations.
Check why each geometry is invalid
Finding invalid rows is useful, but you often also need the reason. Shapely provides explain_validity() for this.
from shapely.validation import explain_validity
invalid_gdf = invalid_gdf.copy()
invalid_gdf["validity_reason"] = invalid_gdf.geometry.apply(explain_validity)
print(invalid_gdf[["validity_reason"]].head(10))
Typical messages include:
- Self-intersection
- Ring Self-intersection
- Too few points in geometry component
This helps you understand whether you are dealing with a few isolated errors or a broader data-quality problem.
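One way to get that overview is to count features per reason type. The sketch below assumes the messages carry the offending coordinate in square brackets after the reason text, so it keeps only the part before the first bracket. The two geometries are made-up test data.

```python
import geopandas as gpd
from shapely.geometry import Polygon
from shapely.validation import explain_validity

# Illustrative data: one self-intersecting "bowtie" and one valid square.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
geoms = gpd.GeoSeries([bowtie, square])

invalid = geoms[~geoms.is_valid]

# Strip the trailing "[x y]" coordinate so identical problems group together.
reasons = invalid.apply(explain_validity)
reason_type = reasons.apply(lambda text: text.split("[")[0].strip())

print(reason_type.value_counts())
```

A single dominant reason across many rows usually points to a systematic problem in how the data was produced rather than a few digitizing slips.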
Repair geometries with make_valid()
The preferred repair method is make_valid() from Shapely.
from shapely import make_valid
This import requires Shapely 2.0 or later. In Shapely 1.8 the same function is available as shapely.validation.make_valid. In older environments, use buffer(0) only as a limited fallback and consider upgrading Shapely.
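If the same script has to run against more than one Shapely release, a guarded import is one option. This is a sketch that assumes the top-level import is missing only because the installed Shapely predates 2.0.

```python
from shapely.geometry import Polygon

try:
    # Top-level function, available in Shapely 2.0 and later.
    from shapely import make_valid
except ImportError:
    # Same function under its Shapely 1.8 location.
    from shapely.validation import make_valid

# Quick smoke test on a self-intersecting "bowtie" (made-up data).
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
fixed = make_valid(bowtie)

print(fixed.geom_type, fixed.is_valid)
```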
Apply it only to invalid rows:
gdf_repaired = gdf.copy()
invalid_mask = ~gdf_repaired.geometry.is_valid
gdf_repaired.loc[invalid_mask, "geometry"] = (
    gdf_repaired.loc[invalid_mask, "geometry"].apply(make_valid)
)
Then check the result:
gdf_repaired["is_valid_after"] = gdf_repaired.geometry.is_valid
print("Still invalid:", (~gdf_repaired["is_valid_after"]).sum())
print(gdf_repaired.geometry.geom_type.value_counts())
This is the best default option in current GeoPandas and Shapely workflows.
Use buffer(0) as a fallback for simple polygon issues
If make_valid() is not available in your environment, or if you are testing a simple self-intersection fix, buffer(0) can help.
gdf_buffered = gdf.copy()
invalid_mask = ~gdf_buffered.geometry.is_valid
gdf_buffered.loc[invalid_mask, "geometry"] = (
    gdf_buffered.loc[invalid_mask, "geometry"].buffer(0)
)
Re-check validity:
print("Still invalid after buffer(0):", (~gdf_buffered.geometry.is_valid).sum())
Use this carefully. It can change geometry shape or remove small artifacts. It is not the best default repair method.
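To see how the two repairs can differ, it helps to run both on the same small shape and compare the results. The sketch below uses a made-up self-intersecting "bowtie" polygon. The exact buffer(0) output may depend on the GEOS version, so it prints the results rather than assuming them.

```python
from shapely.geometry import Polygon
from shapely.validation import make_valid

# A bowtie: the ring crosses itself at (1, 1), forming two triangular lobes.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])

repaired = make_valid(bowtie)
buffered = bowtie.buffer(0)

# Compare type and area; a smaller buffer(0) area means part of the shape was lost.
print("make_valid:", repaired.geom_type, repaired.area)
print("buffer(0): ", buffered.geom_type, buffered.area)
```

If the buffer(0) area is noticeably smaller than the make_valid() area for your features, that is a sign buffer(0) discarded part of the original shape.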
Remove empty or still-invalid results
Some repaired geometries may become empty, null, or remain invalid.
cleaned = gdf_repaired.copy()
cleaned = cleaned[cleaned.geometry.notna()]
cleaned = cleaned[~cleaned.geometry.is_empty]
cleaned = cleaned[cleaned.geometry.is_valid]
print("Remaining features:", len(cleaned))
If downstream tools require polygon-only output, inspect geometry types after repair:
print(cleaned.geometry.geom_type.value_counts())
A simple polygon-only filter is:
polygon_types = ["Polygon", "MultiPolygon"]
cleaned_polygons = cleaned[cleaned.geometry.geom_type.isin(polygon_types)].copy()
print(cleaned_polygons.geometry.geom_type.value_counts())
This is useful when make_valid() returns GeometryCollection or other non-polygon results.
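Filtering by type drops the whole feature, including any polygon parts inside a GeometryCollection. An alternative is to keep only the polygonal parts of each geometry. The helper below is a hypothetical sketch (`polygonal_part` is not a GeoPandas or Shapely function), shown on made-up data.

```python
from shapely.geometry import GeometryCollection, LineString, MultiPolygon, Polygon


def polygonal_part(geom):
    """Return only the polygon parts of geom, or None if it has none."""
    if geom is None or geom.is_empty:
        return None
    if geom.geom_type in ("Polygon", "MultiPolygon"):
        return geom
    if geom.geom_type == "GeometryCollection":
        polygons = []
        for part in geom.geoms:
            if part.geom_type == "Polygon":
                polygons.append(part)
            elif part.geom_type == "MultiPolygon":
                polygons.extend(part.geoms)
        if len(polygons) == 1:
            return polygons[0]
        if polygons:
            # Assumes the parts do not overlap, as expected from a repair result.
            return MultiPolygon(polygons)
    return None


# Illustrative repair result: a square plus a leftover line fragment.
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
mixed = GeometryCollection([square, LineString([(1, 1), (2, 2)])])

result = polygonal_part(mixed)
print(result.geom_type, result.area)
```

Applied to a layer, this would look like `cleaned["geometry"] = cleaned.geometry.apply(polygonal_part)`, followed by dropping rows where the geometry is now null.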
Verify repaired geometries before continuing
Before running overlay, dissolve, or export steps, confirm the repair worked.
print("Original feature count:", len(gdf))
print("Cleaned feature count:", len(cleaned))
print("Invalid after repair:", (~cleaned.geometry.is_valid).sum())
print(cleaned.geometry.geom_type.value_counts())
This matters because make_valid() may split one invalid polygon into multiple valid parts or return a different geometry type.
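A simple way to find the features affected by this is to record geometry types before the repair and compare them afterwards. The sketch below uses a small in-memory layer with a hypothetical `parcel_id` column.

```python
import geopandas as gpd
from shapely.geometry import Polygon
from shapely.validation import make_valid

# Illustrative layer: one self-intersecting bowtie and one valid square.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
gdf = gpd.GeoDataFrame({"parcel_id": [1, 2]}, geometry=[bowtie, square])

# Remember the geometry type of every feature before repair.
before_type = gdf.geometry.geom_type

repaired = gdf.copy()
mask = ~repaired.geometry.is_valid
repaired.loc[mask, "geometry"] = repaired.loc[mask, "geometry"].apply(make_valid)

# Features whose geometry type changed during repair deserve a closer look.
changed = repaired[repaired.geometry.geom_type != before_type]
print(changed.geometry.geom_type)
```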
Save the cleaned dataset
Write the cleaned layer to an output format after checking the geometry types you now have.
cleaned.to_file("data/parcels_cleaned.gpkg", driver="GPKG")
You can also save to GeoJSON:
cleaned.to_file("data/parcels_cleaned.geojson", driver="GeoJSON")
If you save to shapefile, be aware that it is more restrictive:
cleaned.to_file("data/parcels_cleaned.shp")
GeoPackage is often a safer choice than shapefile when repaired features become multipart. If repair produces mixed geometry types such as GeometryCollection, inspect and filter geometry types before saving.
Code examples
Example 1: Detect invalid geometries in a shapefile
import geopandas as gpd
gdf = gpd.read_file("data/buildings.shp")
gdf["is_valid"] = gdf.geometry.is_valid
invalid_gdf = gdf[~gdf["is_valid"]]
print("Invalid features:", len(invalid_gdf))
print(invalid_gdf[["is_valid"]].head())
Example 2: Print validation reasons for bad features
from shapely.validation import explain_validity
invalid_gdf = invalid_gdf.copy()
invalid_gdf["reason"] = invalid_gdf.geometry.apply(explain_validity)
for idx, row in invalid_gdf[["reason"]].head(10).iterrows():
    print(idx, row["reason"])
Example 3: Repair geometries with make_valid()
from shapely import make_valid
gdf_fixed = gdf.copy()
mask = ~gdf_fixed.geometry.is_valid
gdf_fixed.loc[mask, "geometry"] = gdf_fixed.loc[mask, "geometry"].apply(make_valid)
print("Invalid after repair:", (~gdf_fixed.geometry.is_valid).sum())
print(gdf_fixed.geometry.geom_type.value_counts())
Example 4: Keep only polygon results after repair
polygon_types = ["Polygon", "MultiPolygon"]
gdf_polygons = gdf_fixed[gdf_fixed.geometry.geom_type.isin(polygon_types)].copy()
print(gdf_polygons.geometry.geom_type.value_counts())
Example 5: Drop empty geometries and save cleaned output
cleaned = gdf_fixed.copy()
cleaned = cleaned[cleaned.geometry.notna()]
cleaned = cleaned[~cleaned.geometry.is_empty]
cleaned = cleaned[cleaned.geometry.is_valid]
cleaned.to_file("data/buildings_cleaned.gpkg", driver="GPKG")
Explanation
In practical GIS work, an invalid geometry is a feature that does not meet the structural rules geometry libraries expect. For polygons, common problems include rings that cross themselves, holes that extend outside the outer ring, and rings with too few points.
These problems matter because invalid features often break spatial processing before export, especially during:
- overlays
- clipping
- dissolves
- spatial joins
The repair workflow has three separate steps:
- Check validity with .is_valid
- Diagnose the reason with explain_validity()
- Repair with make_valid() or, if necessary, buffer(0)
GeoPandas relies on Shapely for these geometry operations, so the repair behavior comes from Shapely.
A key point is that repair can change the geometry structure. One invalid polygon may become:
- a MultiPolygon
- a GeometryCollection
- a different valid shape than the original
That is why checking validity is not enough. You should also inspect geom_type after repair and filter the output if your workflow expects only polygons, only lines, or a single geometry type for export.
Edge cases or notes
make_valid() may return different geometry types
A repaired polygon layer may no longer contain only polygons. If your downstream workflow expects polygon-only features, inspect geom_type and filter before saving.
Shapefiles are restrictive after repair
A shapefile stores one geometry type per layer. Multipart geometries are usually fine if they match the layer type, but mixed geometry types after repair can cause problems. Save to GeoPackage first and inspect geom_type before exporting to stricter formats.
Some invalid geometries will not repair cleanly
A few features may remain invalid or become empty after repair. In those cases, inspect them manually or remove them from the dataset.
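Rather than discarding those rows outright, you can split the layer into a cleaned set and a rejected set, then review the rejects separately. This sketch uses made-up data with a hypothetical `parcel_id` column. It also skips null geometries when repairing, on the assumption that there is nothing to repair in a missing geometry.

```python
import geopandas as gpd
from shapely.geometry import Polygon
from shapely.validation import make_valid

# Illustrative layer: valid square, bowtie, missing geometry, empty polygon.
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
gdf = gpd.GeoDataFrame(
    {"parcel_id": [1, 2, 3, 4]},
    geometry=[square, bowtie, None, Polygon()],
)

repaired = gdf.copy()

# Repair only rows that have a geometry and fail the validity check.
needs_repair = repaired.geometry.notna() & ~repaired.geometry.is_valid
repaired.loc[needs_repair, "geometry"] = (
    repaired.loc[needs_repair, "geometry"].apply(make_valid)
)

# Keep rows that are present, non-empty, and valid; set the rest aside.
geoms = repaired.geometry
keep = geoms.notna() & ~geoms.is_empty & geoms.is_valid
cleaned = repaired[keep]
rejected = repaired[~keep]

print("Cleaned ids:", list(cleaned["parcel_id"]))
print("Rejected ids:", list(rejected["parcel_id"]))
```

The identifiers in the rejected set can then be used to look up the original features for manual inspection.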
buffer(0) is not a universal fix
It can work for simple polygon self-intersections, but it can also alter geometry shape. Do not use it as the default method if make_valid() is available.
CRS still matters
CRS does not determine whether a geometry is valid, but it still matters for the rest of your workflow. Before overlay or spatial join operations, make sure all layers use the same CRS.
Valid does not always mean correct
A geometry can be technically valid and still be wrong for the dataset. Geometry repair fixes structural issues, not mapping mistakes or attribute problems.
Internal links
For background, see Geometry validity in GIS: what makes a feature invalid?
If your processing fails before repair, see Why GeoPandas Overlay Fails with TopologyException
FAQ
What is the best way to fix invalid geometries in GeoPandas?
Use make_valid() from Shapely when available. It is the preferred method for most cases. Use buffer(0) only as a fallback for simple polygon issues.
Why does make_valid() change my polygon into multiple parts?
Because the original shape may be structurally broken. Repairing it can split overlapping or self-intersecting areas into separate valid parts, returned as a MultiPolygon or GeometryCollection.
Can I use buffer(0) to repair all invalid geometries?
No. It sometimes works for simple self-intersections, but it can change geometry shape and does not fix every invalid case reliably.
What should I do if repaired geometries become empty or still invalid?
Filter out null, empty, or still-invalid rows and inspect those features manually if they matter to the project. Save the cleaned output in a flexible format such as GeoPackage.