Rivers of the World Activity
In this activity, we will practice working with table data in Python. Download the data for the assignment from here. This dataset contains attributes for some of the largest rivers in the world. We will use the pandas package to do some basic analysis on this dataset.
Activate the .gds Python environment by opening an Anaconda Prompt (miniconda3) (Windows) or Terminal (macOS). Then, on Windows:
.gds\Scripts\activate
Or, on macOS:
source .gds/bin/activate
Note
Make sure you run this command from the same directory as the .gds environment folder.
Open a Jupyter Notebook by running:
jupyter notebook
Note
If you run this command from your course folder, your .ipynb assignment will automatically be saved there.
Task 1 (5 points)
Import the pandas package (i.e. import pandas as pd) and read the data (i.e. pd.read_csv).
Write some code that prints the following information:
Number of rows and columns
The maximum Average discharge (m3/s) value
The minimum Drainage area (km2) value
The mean Length (km) value
What is the name of the shortest river?
Compute the ratio of discharge to drainage area (m³/s per km²) for each river. Which river is most “efficient” at draining water relative to its basin size?
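The operations in this task can be sketched on a tiny made-up table. This is only an illustration, not the solution: the values are invented placeholders, the Name column is an assumption about how the real file labels rivers, and the text is read from an in-memory string rather than the downloaded file.

```python
import io
import pandas as pd

# Invented placeholder values, NOT taken from the assignment dataset.
# The "Name" column is an assumed label; check the real file's headers.
csv_text = """Name,Length (km),Average discharge (m3/s),Drainage area (km2)
River A,1200,9000,150000
River B,800,4500,90000
River C,2500,1000,400000
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO.
toy = pd.read_csv(io.StringIO(csv_text))

# .shape is a (rows, columns) tuple.
n_rows, n_cols = toy.shape

# Column-wise summary statistics.
max_discharge = toy["Average discharge (m3/s)"].max()
min_area = toy["Drainage area (km2)"].min()
mean_length = toy["Length (km)"].mean()

# idxmin returns the index label of the smallest value; .loc looks up that row.
shortest_name = toy.loc[toy["Length (km)"].idxmin(), "Name"]

# Dividing one column by another works element-wise and gives a new column.
toy["Discharge per area"] = (
    toy["Average discharge (m3/s)"] / toy["Drainage area (km2)"]
)
highest_ratio_name = toy.loc[toy["Discharge per area"].idxmax(), "Name"]

print(n_rows, n_cols)
print(max_discharge, min_area, mean_length)
print(shortest_name, highest_ratio_name)
```

With the real data, the same pattern applies once the in-memory string is swapped for the path to the downloaded file.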
Important
We recommend presenting your numerical answers in a readable way using string formatting. See this guide for more info.
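As a brief illustration of string formatting, the sketch below uses Python f-strings with a hypothetical number. The variable name and value are made up for the example.

```python
# Hypothetical value used only to demonstrate formatting.
mean_length_km = 1234.56789

# ".1f" rounds to one decimal place.
one_decimal = f"Mean length: {mean_length_km:.1f} km"

# "," adds thousands separators; it can be combined with a precision.
with_commas = f"Mean length: {mean_length_km:,.2f} km"

# ".0f" rounds to a whole number.
whole_number = f"{mean_length_km:,.0f}"

print(one_decimal)
print(with_commas)
print(whole_number)
```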
Task 2 (5 points)
Answer the following questions:
How many of these rivers are located in North America?
What are the names of the rivers that flow into the Atlantic Ocean?
Which continent contains the most rivers?
Which continent has the longest rivers (on average)?
If the Mississippi and Missouri were combined into a single river, what would their combined discharge, length, and drainage area be? How would it rank globally?
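Questions like these usually come down to filtering rows, counting categories, and grouping. The sketch below shows those patterns on an invented table. The Name and Continent column names are assumptions, and none of the values come from the assignment dataset.

```python
import pandas as pd

# Invented placeholder rows; the column names "Name" and "Continent" are assumed.
toy = pd.DataFrame({
    "Name": ["River A", "River B", "River C", "River D"],
    "Continent": ["Asia", "Africa", "Asia", "Europe"],
    "Length (km)": [1200, 800, 2500, 600],
})

# A comparison gives a boolean Series; summing it counts the True values.
is_asia = toy["Continent"] == "Asia"
n_asia = is_asia.sum()

# The same boolean Series can select rows, here keeping only the names.
asia_names = toy.loc[is_asia, "Name"].tolist()

# value_counts tallies each category; idxmax gives the most frequent one.
most_common = toy["Continent"].value_counts().idxmax()

# groupby + mean gives one average per category.
mean_by_continent = toy.groupby("Continent")["Length (km)"].mean()
longest_on_average = mean_by_continent.idxmax()

# Combining two rows: select them with isin, then sum the column.
pair_mask = toy["Name"].isin(["River A", "River B"])
combined_length = toy.loc[pair_mask, "Length (km)"].sum()

# Rank of the combined value among the remaining rows (1 = largest).
others = toy.loc[~pair_mask, "Length (km)"]
rank = (others > combined_length).sum() + 1

print(n_asia, asia_names, most_common, longest_on_average)
print(combined_length, rank)
```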
Task 3 (5 points)
Add a column called Primary that has a value of 1 if the river Type is Primary River and 0 if the river is a Tributary River.
Make a new DataFrame of just the Primary rivers.
Write a for loop that prints the name of each river in this new DataFrame.
Write another for loop that only prints the name of the river if it starts with the letter M.
Modify the for loop so it saves the names of these rivers as a list.
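The general pattern behind these steps (a flag column, a filtered DataFrame, and a loop with a condition) can be sketched on made-up rows. The names below are invented, and the Name column is an assumption about the real file. Only the Type values mirror the ones mentioned above.

```python
import pandas as pd

# Invented placeholder rows; "Name" is an assumed column label.
toy = pd.DataFrame({
    "Name": ["Maple", "Birch", "Moss", "Cedar"],
    "Type": ["Primary River", "Tributary River", "Primary River", "Primary River"],
})

# The comparison gives True/False; astype(int) converts that to 1/0.
toy["Primary"] = (toy["Type"] == "Primary River").astype(int)

# Boolean indexing keeps only the rows where the flag is 1.
primary = toy[toy["Primary"] == 1]

# Iterating over a column yields its values one at a time.
m_names = []
for name in primary["Name"]:
    if name.startswith("M"):
        m_names.append(name)  # collect matches instead of only printing them

print(m_names)
```

Starting with an empty list and appending inside the loop is one simple way to keep the results for later use.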
Important
Save your notebook locally in both .ipynb and .pdf formats, but submit only the PDF to Canvas.