Residential Flooding Activity

Flooding is the most common and damaging natural disaster in the United States, so understanding how many people are at risk is critical for planning. In this activity, we will perform some basic spatial analysis using the geopandas package to investigate the number of people exposed to flooding in Orange County, NC.

The data can be downloaded from here. There are two shapefiles in this directory. The first dataset, orange-county-structures.shp, is an inventory of all structures larger than 450 square feet produced by the Federal Emergency Management Agency (FEMA). More information about this dataset can be found here. The second dataset, S_FLD_HAZ_AR.shp, is the FEMA Special Flood Hazard Area (SFHA) designation, a regulatory product produced using 1-dimensional flood modeling.

[Figure: flooding.png]

  • Open an Anaconda Prompt (miniconda3) on Windows or a Terminal on macOS, then activate the .gds Python environment. On Windows:

.gds\Scripts\activate

Or, on macOS:

source .gds/bin/activate

Note

Make sure you run this command from the same directory as the .gds environment folder.

  • Open a Jupyter Notebook by running:

jupyter notebook

Tip

If you run this command from your course folder, your .ipynb assignment will automatically be saved there.


Task 1 (5 points)

Import the geopandas package and read the orange-county-structures.shp shapefile. Write some code that prints the following information from the GeoDataFrame:

  • Number of rows and columns

  • Coordinate reference system as an EPSG code

  • The number of different categories in the OCC_CLS column

  • The percentage of buildings classified as Residential in the OCC_CLS column

  • The total number of people living in Orange County (i.e. using the POP_MEDIAN column)
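The kinds of calls involved can be sketched on a small synthetic GeoDataFrame. This is only an illustration under stated assumptions, not a solution: the four rows below are made up, and in the activity the table would instead come from something like gpd.read_file("orange-county-structures.shp"). Only the column names OCC_CLS and POP_MEDIAN and the label Residential are taken from the task description.

```python
import geopandas as gpd
from shapely.geometry import Point

# Synthetic stand-in for the structures shapefile (values are invented).
# In the activity: structures = gpd.read_file("orange-county-structures.shp")
structures = gpd.GeoDataFrame(
    {
        "OCC_CLS": ["Residential", "Residential", "Commercial", "Education"],
        "POP_MEDIAN": [3, 2, 0, 0],
    },
    geometry=[
        Point(-79.10, 36.00),
        Point(-79.00, 36.10),
        Point(-79.05, 35.95),
        Point(-79.08, 36.05),
    ],
    crs="EPSG:4326",
)

# Number of rows and columns (the geometry column counts as a column)
n_rows, n_cols = structures.shape

# Coordinate reference system as an EPSG code
epsg = structures.crs.to_epsg()

# Number of distinct categories in OCC_CLS
n_classes = structures["OCC_CLS"].nunique()

# Percentage of rows whose class is Residential
pct_residential = (structures["OCC_CLS"] == "Residential").mean() * 100

# Total population, summing the per-structure estimate
total_pop = structures["POP_MEDIAN"].sum()

print(n_rows, n_cols, epsg, n_classes, pct_residential, total_pop)
```

On the real data, structures["OCC_CLS"].value_counts() is a quick way to check the exact spelling of the category labels before filtering on them.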


Task 2 (5 points)

Now read the S_FLD_HAZ_AR.shp shapefile. Write some code that prints the following information from the GeoDataFrame:

  • Number of rows and columns

  • Coordinate reference system as an EPSG code

Reproject the GeoDataFrame to a projected coordinate system (i.e. one whose spatial units are meters) and answer the following questions:

  • What is the total area of Orange County according to this dataset (in km²)?

  • What percentage of the county (by area) has been designated as a Special Flood Hazard Area (i.e. where column SFHA_TF == T)?

Note

More information about the attributes of the columns can be found here.
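The reprojection and area steps might look like the following minimal sketch. It makes several assumptions: the two rectangles are invented stand-ins for the flood hazard polygons, SFHA_TF is assumed to be stored as the strings "T" and "F", and EPSG:32617 (WGS 84 / UTM zone 17N) is just one possible meter-based CRS that covers this part of North Carolina. Summing polygon areas gives the total area only if the polygons tile the county without overlapping.

```python
import geopandas as gpd
from shapely.geometry import box

# Synthetic stand-in for S_FLD_HAZ_AR.shp: two adjacent lon/lat rectangles.
# In the activity: flood = gpd.read_file("S_FLD_HAZ_AR.shp")
flood = gpd.GeoDataFrame(
    {"SFHA_TF": ["T", "F"]},  # assumed to be stored as strings
    geometry=[
        box(-79.2, 36.0, -79.1, 36.1),  # flagged as SFHA
        box(-79.1, 36.0, -78.8, 36.1),  # not flagged
    ],
    crs="EPSG:4326",
)

# Reproject to a CRS with meter units (UTM zone 17N is an assumption here)
flood_m = flood.to_crs(epsg=32617)

# Polygon areas come back in square meters; convert to square kilometers
area_km2 = flood_m.geometry.area / 1e6

total_area_km2 = area_km2.sum()
sfha_area_km2 = area_km2[flood_m["SFHA_TF"] == "T"].sum()
pct_sfha = sfha_area_km2 / total_area_km2 * 100

print(f"Total area: {total_area_km2:.1f} km2")
print(f"SFHA share: {pct_sfha:.1f} %")
```

Computing areas before reprojecting would return values in squared degrees, which are not meaningful as areas, so the to_crs step has to come first.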


Task 3 (5 points)

Make sure that both datasets have a common projected coordinate system and answer the following questions.

  • How many structures intersect designated Special Flood Hazard Areas (i.e. where column SFHA_TF == T)?

  • How many of these structures are classified as Residential?

  • How many people in Orange County live in a Special Flood Hazard Area?

  • Find the name of the school that is located in a Special Flood Hazard Area (2 points)
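One way to approach these questions is a spatial join, sketched below on invented data. Everything here is an assumption made for illustration: the coordinates are arbitrary planar values, the NAME column and the "Education" class label are hypothetical (check the real attribute table for the actual names), and SFHA_TF is assumed to hold the string "T". The sketch also shows why duplicates need handling: a structure that touches two hazard polygons appears twice in the join result.

```python
import geopandas as gpd
from shapely.geometry import Point, box

crs = "EPSG:32617"  # both layers must share the same projected CRS

# Invented structures; NAME and the "Education" label are hypothetical
structures = gpd.GeoDataFrame(
    {
        "OCC_CLS": ["Residential", "Residential", "Education", "Commercial"],
        "POP_MEDIAN": [4, 2, 0, 0],
        "NAME": ["", "", "Example School", ""],
    },
    geometry=[Point(50, 50), Point(500, 500), Point(50, 100), Point(900, 100)],
    crs=crs,
)

# Invented hazard polygons; the two "T" boxes share the edge y = 100
flood = gpd.GeoDataFrame(
    {"SFHA_TF": ["T", "T", "F"]},
    geometry=[box(0, 0, 100, 100), box(0, 100, 100, 200), box(100, 0, 1000, 1000)],
    crs=crs,
)

# Keep only Special Flood Hazard Areas, then join structures to them
sfha = flood[flood["SFHA_TF"] == "T"]
joined = gpd.sjoin(structures, sfha, how="inner", predicate="intersects")

# A structure matching several polygons is repeated in `joined`,
# so select each matching structure once using its original index
exposed = structures.loc[joined.index.unique()]

n_exposed = len(exposed)
n_residential = (exposed["OCC_CLS"] == "Residential").sum()
pop_exposed = exposed["POP_MEDIAN"].sum()
schools = exposed[exposed["OCC_CLS"] == "Education"]

print(n_exposed, n_residential, pop_exposed, list(schools["NAME"]))
```

Counting rows of the raw join result instead of the de-duplicated selection would overstate both the number of structures and the exposed population whenever hazard polygons share boundaries or overlap.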


Important

Save your notebook locally in both .ipynb and .pdf formats, but submit only the PDF to Canvas.

Acknowledgements

This activity was inspired by Gold and Steinberg-McElroy (2025).

Gold, A.C., Steinberg-McElroy, I. High-resolution estimates of the US population in fluvial or coastal flood hazard areas. Sci Data 12, 1377 (2025). https://doi.org/10.1038/s41597-025-05717-y