Jupyter Notebooks

An introduction to pandas

The main subject of today's lecture is an introduction to the pandas Python package. pandas is one of the most popular Python packages for data science work, including cleaning and organizing data. The archive I have linked to above contains two Jupyter notebooks. One introduces the basics of pandas, and the other walks you through an extended example that uses pandas to manage a data set.

Before you can work with pandas, you will need to use Anaconda Navigator to install the pandas package in your CMSC210 environment.
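Once the install finishes, a quick way to confirm that it worked is to run a short check in a notebook cell inside that environment. The following is only a minimal sketch: the helper name package_available is made up for this illustration, and the check relies only on Python's standard importlib module, so it runs whether or not pandas is present.

```python
import importlib.util


def package_available(name: str) -> bool:
    """Return True if a top-level package with this name can be found.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    return importlib.util.find_spec(name) is not None


if package_available("pandas"):
    # Import only after confirming the package is present.
    import pandas as pd

    print("pandas version:", pd.__version__)
else:
    print("pandas is not installed in this environment yet")
```

If the second message appears, the most likely cause is that the notebook is running in a different environment from the one where pandas was installed.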

Online documentation for pandas, including a quick start guide, is available at pandas.pydata.org.