Introduction to pandas

pandas is a Python library designed to make common tasks in data science easier to perform.

Today's lecture on pandas will be delivered in the form of two Jupyter notebooks that you will find in the zip file I have linked below. To read these notebook files, expand the zip file in a location that is easy to find (such as your home directory). Open a terminal window or an Anaconda prompt window, navigate to that directory, and then type jupyter notebook to launch the Jupyter server. This will open a welcome page in your browser that shows a list of available files. Click on the first notebook file, pandas.ipynb, to see some introductory material on pandas.

After you have read the introductory material, open the notebook plot.ipynb. This notebook will take you through a solution to the GDP vs. population plotting problem that uses pandas to do the necessary data manipulations. Finally, take a look at the source code file plot.py, which contains a full solution to the plotting problem.
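
To give a rough idea of the kind of data manipulation involved before you open the notebooks, here is a minimal sketch. It is not taken from plot.ipynb or plot.py. The country names, the numbers, and the column names (country, gdp, population) are all made up for illustration, and the real notebooks may organize the data quite differently.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch.
# The lecture's actual data files are not reproduced here.
gdp = pd.DataFrame({
    "country": ["A", "B", "C"],
    "gdp": [100.0, 250.0, 60.0],
})
population = pd.DataFrame({
    "country": ["A", "B", "D"],
    "population": [10.0, 50.0, 5.0],
})

# An inner join keeps only the countries that appear in both tables.
merged = gdp.merge(population, on="country", how="inner")

# Column arithmetic is applied row by row, with no explicit loop.
merged["gdp_per_capita"] = merged["gdp"] / merged["population"]

print(merged)
```

If matplotlib is installed, a call such as merged.plot.scatter(x="population", y="gdp") would then draw one point per country from the merged table.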

Files for the lecture