Two data sets

Below you will find a link to a zip file containing two large data sets. The first data set is the population data JSON file that the author used in chapter 16. Its entries let you look up population figures for different years by standard country code. The second data set is a large CSV file I downloaded from the World Bank. It gives historical national GDP figures for a large number of countries over a span of many years. The second data set identifies countries with the same country codes as the first, so you can use those codes to combine population and GDP data for a large number of countries.
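One possible way to combine the two files by country code is sketched below. The small inline samples stand in for the real files, and their numbers are invented. The JSON field names ("Country Code", "Year", "Value") and the one-column-per-year CSV layout are assumptions, so open the actual files and adjust the names to match what you find there. The helper names are hypothetical as well.

```python
import csv
import io
import json

# Made-up samples standing in for the real files. The field names and the
# CSV layout are ASSUMPTIONS; check them against the actual data.
POPULATION_JSON = """[
  {"Country Code": "AAA", "Year": "1995", "Value": "1000"},
  {"Country Code": "AAA", "Year": "2005", "Value": "1200"},
  {"Country Code": "BBB", "Year": "1995", "Value": "500"},
  {"Country Code": "BBB", "Year": "2005", "Value": "450"},
  {"Country Code": "CCC", "Year": "1995", "Value": "300"}
]"""

GDP_CSV = """Country Code,1995,2005
AAA,20000,36000
BBB,10000,
DDD,5000,7000
"""


def load_population(text, years):
    """Return {code: {year: population}} for the requested years."""
    table = {}
    for row in json.loads(text):
        year = int(row["Year"])
        if year in years:
            table.setdefault(row["Country Code"], {})[year] = float(row["Value"])
    return table


def load_gdp(text, years):
    """Return {code: {year: gdp}}, skipping cells that are empty."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        values = {}
        for year in years:
            cell = row.get(str(year), "")
            if cell:  # empty or missing cells mean no data for that year
                values[year] = float(cell)
        table[row["Country Code"]] = values
    return table


def join_by_code(population, gdp, years):
    """Keep only countries that have both figures for every requested year."""
    joined = {}
    for code in population.keys() & gdp.keys():
        if all(y in population[code] and y in gdp[code] for y in years):
            joined[code] = {y: (population[code][y], gdp[code][y]) for y in years}
    return joined


YEARS = (1995, 2005)
joined = join_by_code(
    load_population(POPULATION_JSON, YEARS),
    load_gdp(GDP_CSV, YEARS),
    YEARS,
)
print(joined)
```

In this sample only country AAA survives the join, because each of the other codes is missing a figure in one of the two sources. Expect the real data to have similar gaps, so decide early how you want to handle countries with incomplete records.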

The data sets

A question

Is there a correlation between the rate of a country's population growth and the rate of growth of its economy? In this exercise you will aggregate and plot some relevant data to help answer this question.

To do this analysis you will want to start by collecting population and GDP data for a pair of years, 1995 and 2005. For each country in your analysis compute the percentage change in the country's population over that period and the percentage change in that country's GDP per capita. (Note: to compute GDP per capita you will need to divide the GDP of each country by that country's population.)
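The arithmetic described above can be sketched as follows. The function names are hypothetical and the input figures are invented for illustration. Percentage change is taken here as (new - old) / old * 100, measured relative to the 1995 value.

```python
def percent_change(old, new):
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100.0


def growth_rates(pop_1995, gdp_1995, pop_2005, gdp_2005):
    """Return (population change %, GDP per capita change %) for one country."""
    # GDP per capita is total GDP divided by population, year by year.
    per_capita_1995 = gdp_1995 / pop_1995
    per_capita_2005 = gdp_2005 / pop_2005
    return (
        percent_change(pop_1995, pop_2005),
        percent_change(per_capita_1995, per_capita_2005),
    )


# Invented figures: population 1000 -> 1200, GDP 20000 -> 36000.
pop_growth, gdp_pc_growth = growth_rates(1000.0, 20000.0, 1200.0, 36000.0)
print(f"population: {pop_growth:.1f}%  GDP per capita: {gdp_pc_growth:.1f}%")
```

Note that total GDP grew by 80 percent in this example while GDP per capita grew by less, because the population grew over the same period. That is why the per capita figure has to be computed separately for each year before taking the percentage change.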

Next, use pyplot to construct a plot of population change against GDP per capita change. If you want, you can make this plot more interesting by using color to show more information: for example, you can group countries by size and use a different color for each size category.
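A minimal sketch of such a plot is shown below. It assumes that "size" means 1995 population and uses arbitrary cutoffs of 5 million and 50 million. Both choices are mine, not part of the assignment, so feel free to pick your own categories. The three data points are invented.

```python
import matplotlib.pyplot as plt

# Hypothetical size categories and colors; the cutoffs are arbitrary.
SIZE_COLORS = {"small": "tab:blue", "medium": "tab:orange", "large": "tab:red"}


def size_category(population):
    """Classify a country by population (assumed meaning of 'size')."""
    if population < 5_000_000:
        return "small"
    if population < 50_000_000:
        return "medium"
    return "large"


def plot_growth(records):
    """Scatter-plot records of (population, pop change %, GDP per capita change %)."""
    fig, ax = plt.subplots()
    # One scatter call per category so each group gets its own legend entry.
    for label, color in SIZE_COLORS.items():
        group = [r for r in records if size_category(r[0]) == label]
        if group:
            ax.scatter(
                [r[1] for r in group],
                [r[2] for r in group],
                color=color,
                label=label,
                alpha=0.7,
            )
    ax.set_xlabel("Population change, 1995 to 2005 (%)")
    ax.set_ylabel("GDP per capita change, 1995 to 2005 (%)")
    ax.set_title("Population growth vs. GDP per capita growth")
    ax.legend(title="Size category")
    return fig, ax


# Invented points for illustration only.
records = [
    (2_000_000, 5.0, 40.0),
    (30_000_000, 20.0, 50.0),
    (200_000_000, 12.0, 25.0),
]
fig, ax = plot_growth(records)
```

Call plt.show() afterwards to display the figure, or fig.savefig with a file name of your choice to write it to disk.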

If there is any correlation at all between population growth and GDP per capita growth, it should show up in the shape of the data cloud that you plot.
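The assignment only asks for the plot, but if you want a number to go with your visual impression, one optional check is the Pearson correlation coefficient. The sketch below computes it by hand; the function name is hypothetical and the sample values are invented. A result near +1 or -1 corresponds to a cloud that hugs a rising or falling line, and a result near 0 corresponds to a cloud with no linear trend.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Invented percentage changes for four countries.
pop_changes = [5.0, 12.0, 20.0, 30.0]
gdp_pc_changes = [40.0, 25.0, 50.0, 10.0]
print(f"r = {pearson_r(pop_changes, gdp_pc_changes):.2f}")
```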

Due date

This assignment is due by the start of class on Thursday, November 2.