GeoPandas Basics: Working with Spatial Data in Python
A practical introduction to GeoPandas for reading, inspecting, filtering, and exporting vector spatial data in Python.
Problem statement
If you need to work with shapefiles or GeoJSON in Python, the main problem is usually not the file format itself. The problem is getting spatial data into a form where you can inspect attributes, check geometry, confirm the coordinate reference system, filter features, and save the result without handling low-level GIS details manually.
This is where GeoPandas basics matter. GeoPandas gives you a practical way to work with vector spatial data in Python using a table-like structure that feels similar to pandas, but with geometry support built in.
This page shows how to:
- read spatial files
- inspect geometry and attributes
- view the CRS
- filter features
- preview data with a quick plot
- save results to a new file
Quick answer
GeoPandas is a Python library for working with vector spatial data in a GeoDataFrame, which is similar to a pandas DataFrame but includes a geometry column.
A basic workflow looks like this:
- Load a file with gpd.read_file()
- Inspect columns, rows, and geometry
- Check the CRS with .crs
- Filter records by attribute or geometry presence
- Save the output with .to_file()
import geopandas as gpd
gdf = gpd.read_file("data/parcels.shp")
print(gdf.head())
print(gdf.crs)
filtered = gdf[gdf["zone"] == "RESIDENTIAL"]
filtered.to_file("output/residential_parcels.geojson", driver="GeoJSON")
Step-by-step solution
Install GeoPandas
A common installation method is:
pip install geopandas
If you use Conda, install GeoPandas in your environment before running the examples:
conda install geopandas
If installation fails, fix the Python environment before debugging your code.
Import GeoPandas
Most vector workflows start with one import:
import geopandas as gpd
Using the gpd alias is standard and keeps code readable.
Read a shapefile or GeoJSON into a GeoDataFrame
Use gpd.read_file() to load common vector formats.
Read a parcel shapefile
import geopandas as gpd
parcels = gpd.read_file("data/parcels/parcels.shp")
print(parcels.head())
Read a neighborhoods GeoJSON
import geopandas as gpd
neighborhoods = gpd.read_file("data/city_neighborhoods.geojson")
print(neighborhoods.head())
This returns a GeoDataFrame with attribute columns and one active geometry column.
Inspect the GeoDataFrame structure
Before doing any analysis, inspect the dataset structure.
print(parcels.head())
print(parcels.columns)
print(parcels.dtypes)
print(parcels.geometry.name)
print(len(parcels))
This helps you confirm:
- available fields
- geometry column name
- data types
- number of records
If you are reading unfamiliar data, this step prevents many later errors.
Check the coordinate reference system
CRS matters in every GIS workflow. If the CRS is wrong or missing, maps, overlays, and distance-based analysis can be misleading.
print(parcels.crs)
print(neighborhoods.crs)
Typical output may look like:
EPSG:26917
or:
EPSG:4326
Check CRS before plotting, combining layers, or measuring anything.
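As a minimal sketch of that check, the snippet below builds two tiny in-memory layers and compares their CRS before they would be combined. The layer names, coordinates, and EPSG codes are made up for illustration and are not taken from a real dataset.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical layers built in memory so the example runs without any files.
parcels_demo = gpd.GeoDataFrame(
    {"parcel_id": [1, 2]},
    geometry=[Point(500000, 4649776), Point(500100, 4649876)],
    crs="EPSG:26917",
)
neighborhoods_demo = gpd.GeoDataFrame(
    {"name": ["Downtown"]},
    geometry=[Point(-81.0, 42.0)],
    crs="EPSG:4326",
)

# CRS objects can be compared directly with ==.
same_crs = parcels_demo.crs == neighborhoods_demo.crs
print("Same CRS:", same_crs)

# to_epsg() returns the EPSG code as an integer when one can be identified.
print(parcels_demo.crs.to_epsg(), neighborhoods_demo.crs.to_epsg())
```

If the comparison is False, the layers should not be overlaid or measured against each other until the mismatch is resolved.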
Filter spatial records
A common task is filtering features based on an attribute field.
Example: keep only residential parcels
residential = parcels[parcels["zone"] == "RESIDENTIAL"]
print(residential.head())
print(len(residential))
You can also remove rows with missing geometry:
parcels_with_geometry = parcels[parcels.geometry.notna()]
This is a simple but useful quality check before plotting or exporting.
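The two filters can also be combined. The sketch below uses a small in-memory GeoDataFrame (the parcel IDs and zone values are hypothetical sample data) to show an attribute filter with isin() followed by a geometry-presence check.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical sample data; the third row deliberately has no geometry.
sample = gpd.GeoDataFrame(
    {
        "parcel_id": [1, 2, 3, 4],
        "zone": ["RESIDENTIAL", "COMMERCIAL", "RESIDENTIAL", "INDUSTRIAL"],
    },
    geometry=[Point(0, 0), Point(1, 1), None, Point(3, 3)],
    crs="EPSG:4326",
)

# Keep rows whose zone is one of several accepted values.
selected = sample[sample["zone"].isin(["RESIDENTIAL", "COMMERCIAL"])]

# Then drop the rows that have no geometry at all.
selected_with_geometry = selected[selected.geometry.notna()]

print(len(sample), len(selected), len(selected_with_geometry))
```

Because the result of each step is still a GeoDataFrame, the filters can be chained in any order.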
Check geometry types
GeoPandas makes it easy to inspect geometry types.
print(parcels.geometry.geom_type.head())
print(parcels.geometry.geom_type.value_counts())
Common geometry types include:
- Point
- LineString
- Polygon
If you expected polygons but see mixed geometry types, verify the source data before continuing.
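One way to make that verification concrete is to list the rows whose geometry type is not in an expected set. This is a sketch with made-up features; the expected_types set is an assumption you would adjust to your own layer.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical layer that should contain polygons but includes a stray point.
mixed = gpd.GeoDataFrame(
    {"feature_id": [1, 2, 3]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(2, 2), (3, 2), (3, 3), (2, 3)]),
        Point(5, 5),
    ],
    crs="EPSG:4326",
)

# Assumption: a parcel layer may hold both Polygon and MultiPolygon features.
expected_types = {"Polygon", "MultiPolygon"}

geom_types = mixed.geometry.geom_type
unexpected = mixed[~geom_types.isin(expected_types)]

print(geom_types.value_counts())
print("Unexpected rows:", unexpected["feature_id"].tolist())
```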
Plot the data for a quick visual check
A quick plot helps confirm that the layer loaded correctly. This is not advanced cartography. It is a fast validation step.
residential.plot()
For a slightly clearer result:
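import matplotlib.pyplot as plt
ax = residential.plot(figsize=(8, 6), edgecolor="black")
ax.set_title("Residential Parcels")
plt.show()  # needed in a plain script; notebooks usually render the plot automatically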
This can help you spot empty layers, unexpected extents, or geometry problems.
Save the result to a new file
After filtering or cleaning data, write it back to disk.
Save to GeoJSON
residential.to_file("output/residential_parcels.geojson", driver="GeoJSON")
Save to shapefile
residential.to_file("output/residential_parcels.shp")
GeoJSON is often easier for exchange and web workflows. Shapefile is still common in older GIS systems.
Code examples
Example 1: Load a shapefile into GeoPandas
import geopandas as gpd
parcels = gpd.read_file("data/parcels/parcels.shp")
print(parcels.head())
Example 2: Load a GeoJSON file and inspect columns
import geopandas as gpd
neighborhoods = gpd.read_file("data/city_neighborhoods.geojson")
print(neighborhoods.columns)
print(neighborhoods.head())
print(neighborhoods.geometry.name)
Example 3: Check CRS and geometry types
import geopandas as gpd
neighborhoods = gpd.read_file("data/city_neighborhoods.geojson")
print("CRS:", neighborhoods.crs)
print(neighborhoods.geometry.geom_type.value_counts())
Example 4: Filter features by attribute
import geopandas as gpd
parcels = gpd.read_file("data/parcels/parcels.shp")
commercial = parcels[parcels["land_use"] == "COMMERCIAL"]
print(commercial[["parcel_id", "land_use"]].head())
Example 5: Plot and export filtered data
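import geopandas as gpd
import matplotlib.pyplot as plt
parcels = gpd.read_file("data/parcels/parcels.shp")
# Keep residential parcels that actually have a geometry
residential = parcels[(parcels["zone"] == "RESIDENTIAL") & (parcels.geometry.notna())]
residential.to_file("output/residential_parcels.geojson", driver="GeoJSON")
ax = residential.plot(figsize=(8, 6), edgecolor="black")
ax.set_title("Residential Parcels")
plt.show()  # needed in a plain script; notebooks usually render the plot automatically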
Explanation
A GeoDataFrame is the core object in GeoPandas. It extends a pandas DataFrame by adding support for spatial geometry. That means you still get rows and columns like a normal table, but one column stores shapes such as points, lines, or polygons.
The geometry column is what makes spatial operations possible. Without it, you just have attribute data. With it, GeoPandas can plot features, inspect geometry types, and support later GIS tasks such as clipping or spatial joins.
This is the main difference between pandas and GeoPandas:
- pandas works with tabular data
- GeoPandas works with tabular data plus spatial geometry
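The relationship can be seen directly by turning an ordinary pandas table into a GeoDataFrame. This is a small sketch; the station names and coordinates are invented for the example.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical attribute table with plain coordinate columns.
stations = pd.DataFrame(
    {"name": ["A", "B"], "lon": [-81.0, -80.5], "lat": [42.0, 42.5]}
)

# points_from_xy() builds Point geometries from the coordinate columns.
stations_gdf = gpd.GeoDataFrame(
    stations,
    geometry=gpd.points_from_xy(stations["lon"], stations["lat"]),
    crs="EPSG:4326",
)

print(isinstance(stations_gdf, pd.DataFrame))  # a GeoDataFrame is still a DataFrame
print(type(stations_gdf["name"]).__name__)     # attribute columns are pandas Series
print(type(stations_gdf.geometry).__name__)    # the geometry column is a GeoSeries
```

Because the GeoDataFrame is still a DataFrame, the usual pandas operations such as boolean filtering and column selection keep working.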
CRS is also a required part of a reliable spatial data workflow. Two datasets can both look valid but still fail to align if their CRS values differ. Even in a basic workflow, checking .crs should be standard.
This page is a foundation for common vector tasks in Python. It covers loading, inspecting, filtering, plotting, and exporting data without going into reprojection, spatial joins, or geoprocessing.
Edge cases or notes
Missing or invalid geometry
Some rows may have null geometry values or invalid shapes. Check for missing geometry before plotting or analysis:
gdf = gdf[gdf.geometry.notna()]
If you need to check geometry validity, use:
print(gdf.geometry.is_valid.value_counts())
Invalid geometry can cause errors in later operations.
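To see what an invalid shape looks like, the sketch below builds one valid square and one self-intersecting "bowtie" polygon (both made up for illustration) and selects the invalid rows.

```python
import geopandas as gpd
from shapely.geometry import Polygon

shapes = gpd.GeoDataFrame(
    {"shape_id": ["square", "bowtie"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        # Self-intersecting ring: two of its edges cross at (0.5, 0.5).
        Polygon([(0, 0), (1, 1), (1, 0), (0, 1)]),
    ],
    crs="EPSG:4326",
)

# is_valid is True or False per row, so ~ selects the invalid ones.
invalid = shapes[~shapes.geometry.is_valid]
print(invalid["shape_id"].tolist())
```

Listing the invalid rows first makes it easier to decide whether to repair them or go back to the data provider.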
CRS may be missing
Some files do not contain CRS metadata. If .crs returns None, GeoPandas does not know which coordinate system the coordinates use, so do not assume one. An unknown CRS makes mapping and spatial analysis unreliable.
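If the data provider documents the CRS, you can attach it with set_crs(). The sketch below assumes, purely for illustration, that the provider stated the data is in EPSG:4326. Note that set_crs() only labels the existing coordinates and does not transform them.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical layer created without CRS metadata.
no_crs = gpd.GeoDataFrame({"site": ["A"]}, geometry=[Point(-81.0, 42.0)])
print(no_crs.crs)

if no_crs.crs is None:
    # Assumption for this sketch: the provider documented the data as EPSG:4326.
    # set_crs() returns a labeled copy; the coordinate values stay unchanged.
    labeled = no_crs.set_crs("EPSG:4326")
    print(labeled.crs)
    print(labeled.geometry.iloc[0])
```

Only do this when the CRS is known from documentation or the source system. Guessing a CRS produces a layer that looks valid but sits in the wrong place.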
Shapefile field limitations
Shapefiles have older format constraints, including:
- field names truncated to 10 characters
- multiple sidecar files such as .shp, .shx, .dbf, and .prj
- weaker support for long text and metadata
If possible, consider GeoJSON or GeoPackage for newer workflows.
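Before exporting to a shapefile, it can help to list the columns whose names exceed the format's 10-character field name limit. This is a sketch with invented column names; SHAPEFILE_FIELD_NAME_LIMIT is a constant defined here, not part of GeoPandas.

```python
import geopandas as gpd
from shapely.geometry import Point

# Shapefile attribute tables (dBASE) store field names of at most 10 characters.
SHAPEFILE_FIELD_NAME_LIMIT = 10

# Hypothetical layer with one column name that is too long for a shapefile.
gdf_demo = gpd.GeoDataFrame(
    {"parcel_id": [1], "assessed_value_2023": [250000], "zone": ["RESIDENTIAL"]},
    geometry=[Point(0, 0)],
    crs="EPSG:4326",
)

# The geometry column is not written as an attribute field, so skip it.
geometry_name = gdf_demo.geometry.name
too_long = [
    col
    for col in gdf_demo.columns
    if col != geometry_name and len(col) > SHAPEFILE_FIELD_NAME_LIMIT
]
print("Columns that would be truncated:", too_long)
```

Renaming those columns yourself before export gives you control over the shortened names instead of leaving it to the writer.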
Large files can be slow
Large parcel or boundary datasets may load slowly and be expensive to plot. Start by inspecting structure, columns, and row count before doing heavier operations.
Internal links
For the broader concept, see Python for GIS: What It Is and When to Use It.
For related tasks, see How to Read a Shapefile with GeoPandas and How to Read GeoJSON in Python with GeoPandas.
If your data does not line up or the CRS is unclear, see Coordinate Reference Systems (CRS) Explained for Python GIS.
FAQ
What is GeoPandas used for?
GeoPandas is used for working with vector spatial data in Python. Common tasks include reading shapefiles and GeoJSON, filtering features, checking CRS, plotting data, and exporting results.
What is the difference between pandas and GeoPandas?
pandas handles standard tabular data. GeoPandas adds a geometry column so the table can store and work with spatial features such as points, lines, and polygons.
Can GeoPandas read shapefiles and GeoJSON?
Yes. gpd.read_file() can read both shapefiles and GeoJSON, along with other vector formats supported by the installed GIS libraries.
How do I check the CRS of a GeoDataFrame?
Use the .crs attribute:
print(gdf.crs)
This shows the coordinate reference system if it is stored with the dataset.