Residential Flooding Activity
Flooding is the most common and damaging natural disaster in the United States, so knowing how many people are at risk of flooding is critical for planning. In this activity, we will perform some basic spatial analysis using the geopandas package to investigate the number of people exposed to flooding in Orange County, NC.
The data can be downloaded from here. There are two shapefiles in this directory. The first dataset, orange-county-structures.shp, is an inventory of all structures larger than 450 square feet produced by the Federal Emergency Management Agency (FEMA). More information about this dataset can be found here. The second dataset, S_FLD_HAZ_AR.shp, is the FEMA Special Flood Hazard Area (SFHA) designation, a regulatory product produced using 1-dimensional flood modeling.
Activate the .gds Python environment by opening an Anaconda Prompt (miniconda3) (Windows) or Terminal (macOS). Then, on Windows:
.gds\Scripts\activate
Or, on macOS:
source .gds/bin/activate
Note
Make sure you run this command from the same directory as the .gds environment folder.
Open a Jupyter Notebook by running:
jupyter notebook
Tip
If you run this command from your course folder, your .ipynb assignment will automatically be saved there.
Task 1 (5 points)
Import the geopandas package and read the orange-county-structures.shp shapefile. Write some code that prints the following information from the GeoDataFrame:
Number of rows and columns
Coordinate reference system as an EPSG code
Hint
One way of getting the EPSG code from a Coordinate Reference System (CRS) object is to use the to_epsg() method.
The number of different categories in the OCC_CLS column
The percentage of buildings classified as Residential in the OCC_CLS column
The total number of people living in Orange County (i.e. using the POP_MEDIAN column)
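The kinds of summaries asked for in Task 1 can be sketched on a small made-up GeoDataFrame. This is only an illustration, not the solution: the four points, the non-Residential category labels, and the population values are invented, and in the activity you would load the real data with gpd.read_file("orange-county-structures.shp") instead.

```python
import geopandas as gpd
from shapely.geometry import Point

# Toy stand-in for the structures inventory (values are invented).
# In the activity: structures = gpd.read_file("orange-county-structures.shp")
structures = gpd.GeoDataFrame(
    {
        "OCC_CLS": ["Residential", "Residential", "Commercial", "Education"],
        "POP_MEDIAN": [3, 2, 0, 0],
    },
    geometry=[
        Point(-79.05, 35.91),
        Point(-79.06, 35.92),
        Point(-79.10, 36.00),
        Point(-79.08, 36.05),
    ],
    crs="EPSG:4326",
)

# Number of rows and columns (the geometry column counts as a column)
n_rows, n_cols = structures.shape

# Coordinate reference system as an EPSG code
epsg = structures.crs.to_epsg()

# Number of distinct categories in OCC_CLS
n_classes = structures["OCC_CLS"].nunique()

# Percentage of rows classified as Residential
pct_residential = (structures["OCC_CLS"] == "Residential").mean() * 100

# Total population, summing the POP_MEDIAN column
total_pop = structures["POP_MEDIAN"].sum()

print(n_rows, n_cols, epsg, n_classes, pct_residential, total_pop)
```

Taking the mean of a boolean Series gives the fraction of True values, which avoids computing the counts separately.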
Task 2 (5 points)
Now read the S_FLD_HAZ_AR.shp shapefile. Write some code that prints the following information from the GeoDataFrame:
Number of rows and columns
Coordinate reference system as an EPSG code
Reproject the GeoDataFrame to a projected coordinate system (i.e. spatial units are in meters) and answer the following questions:
Hint
Consider using a UTM Zone e.g. https://epsg.io/32617.
What is the total area of Orange County according to this dataset (in km²)?
What percentage of the county (by area) has been designated as a Special Flood Hazard Area (i.e. where column SFHA_TF == T)?
Note
More information about the attributes of the columns can be found here.
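As a rough sketch of the reprojection and area steps in Task 2, the example below uses two invented rectangles in place of the real flood hazard polygons. It assumes that SFHA_TF stores the strings "T" and "F", and that the polygons tile the county without overlapping, so that summing their areas gives the county area. Check both assumptions against the real S_FLD_HAZ_AR.shp data.

```python
import geopandas as gpd
from shapely.geometry import box

# Two invented rectangles standing in for flood hazard polygons.
# In the activity: zones = gpd.read_file("S_FLD_HAZ_AR.shp")
zones = gpd.GeoDataFrame(
    {"SFHA_TF": ["T", "F"]},  # assumption: values are the strings "T"/"F"
    geometry=[
        box(-79.2, 35.9, -79.1, 36.0),
        box(-79.1, 35.9, -79.0, 36.0),
    ],
    crs="EPSG:4326",
)

# Reproject to UTM Zone 17N so that lengths are in meters
zones_utm = zones.to_crs(epsg=32617)

# Polygon areas come back in square meters; convert to square kilometers
area_km2 = zones_utm.geometry.area / 1e6
total_km2 = area_km2.sum()

# Share of the total area flagged as Special Flood Hazard Area
is_sfha = zones_utm["SFHA_TF"] == "T"
pct_sfha = area_km2[is_sfha].sum() / total_km2 * 100

print(round(total_km2, 1), round(pct_sfha, 1))
```

Computing areas before reprojecting would give values in squared degrees, which is why the to_crs step comes first.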
Task 3 (5 points)
Make sure that both datasets have a common projected coordinate system and answer the following questions.
How many structures intersect designated Special Flood Hazard Areas (i.e. where column SFHA_TF == T)?
Hint
The sjoin function should work well for this.
How many of these structures are classified as Residential?
How many people in Orange County live in a Special Flood Hazard Area?
Find the name of the school that is located in a Special Flood Hazard Area (2 points)
Hint
The PRIM_OCC column describes the primary occupancy for each structure. Consider finding a lat/lon first and then use Google Maps.
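The spatial join suggested in the hints might look roughly like the sketch below, which runs on invented data in UTM Zone 17N. The coordinates, the PRIM_OCC labels, and the population values are hypothetical, and the real PRIM_OCC categories will differ. It also assumes a geopandas version whose sjoin accepts the predicate keyword (older releases used op).

```python
import geopandas as gpd
from shapely.geometry import Point, box

crs = "EPSG:32617"            # both layers share one projected CRS
x0, y0 = 670_000, 3_980_000   # arbitrary origin, roughly in central NC

# Invented structures; PRIM_OCC labels here are hypothetical
structures = gpd.GeoDataFrame(
    {
        "OCC_CLS": ["Residential", "Residential", "Education", "Commercial"],
        "PRIM_OCC": ["Single Family", "Single Family", "School", "Retail"],
        "POP_MEDIAN": [3, 2, 0, 0],
    },
    geometry=[
        Point(x0 + 50, y0 + 50),
        Point(x0 + 500, y0 + 500),
        Point(x0 + 80, y0 + 20),
        Point(x0 + 900, y0 + 900),
    ],
    crs=crs,
)

# Invented flood zones; assumes SFHA_TF holds the strings "T"/"F"
zones = gpd.GeoDataFrame(
    {"SFHA_TF": ["T", "F"]},
    geometry=[
        box(x0, y0, x0 + 100, y0 + 100),
        box(x0 + 100, y0, x0 + 1000, y0 + 1000),
    ],
    crs=crs,
)

# Keep only the Special Flood Hazard Area polygons, then join
sfha = zones[zones["SFHA_TF"] == "T"]
in_sfha = gpd.sjoin(structures, sfha, how="inner", predicate="intersects")

# A structure touching several polygons appears once per match;
# keep one row per structure so counts are not inflated
in_sfha = in_sfha[~in_sfha.index.duplicated()]

n_in_sfha = len(in_sfha)
n_residential = (in_sfha["OCC_CLS"] == "Residential").sum()
pop_in_sfha = in_sfha["POP_MEDIAN"].sum()

# Lat/lon of matching schools: take centroids in the projected CRS,
# then convert those points to EPSG:4326
schools = in_sfha[in_sfha["PRIM_OCC"].str.contains("School")]
school_pts = schools.geometry.centroid.to_crs(epsg=4326)
school_latlon = list(zip(school_pts.y, school_pts.x))

print(n_in_sfha, n_residential, pop_in_sfha, school_latlon)
```

Filtering the flood polygons before the join keeps the result limited to structures inside hazard areas, so no further filtering on SFHA_TF is needed afterwards.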
Important
Save your notebook locally in both .ipynb and .pdf formats, but submit only the PDF to Canvas.
Acknowledgements
This activity was inspired by Gold and Steinberg-McElroy (2025).
Gold, A.C., Steinberg-McElroy, I. High-resolution estimates of the US population in fluvial or coastal flood hazard areas. Sci Data 12, 1377 (2025). https://doi.org/10.1038/s41597-025-05717-y