Course Info

  • Class Meeting: 8:30 – 9:45 AM US Eastern Time Wednesdays & Fridays on Zoom (accessible via Sakai).
  • Instructors: Professors Brandon Fain & Kristin Stephens-Martinez
  • Contact: (Please use Piazza for technical/content questions)
  • Graduate TAs: Emily Kim & Nirav Patel
  • Undergraduate TAs: Kaya Celebi, Annie Hirsch, Billy Luqiu, Margaret Reed, Bhrij Patel, Anshul Shah, Sona Suryadevara, Zack Thomas, & Siyi Xu
  • Course Box Folder. All assignments and data will be made available in the course Box folder.
  • Piazza. We will use Piazza (accessible via Sakai) for questions and forum discussion around course content.
  • Gradescope. We will use Gradescope (accessible via Sakai) for submitting and grading assignments.

Course Description

Data is the new currency. In every walk of life, people leave digital traces, which are stored and analyzed at both individual and population levels, by businesses for improving products and services, by governments for policy-making and national security, and by scientists for advancing the frontiers of human knowledge.

This course serves as an introduction to various aspects of working with data (acquisition, integration, querying, analysis, and visualization) and to data of different types, from unstructured text to structured databases. Through lectures and hands-on labs, the course covers both fundamental concepts and computational tools for working with data and applies them to real datasets in a capstone team project.

This course is open to students from both inside and outside computer science. Dealing with data requires more than just computer programming: What do we know about the processes underlying the data? What are the interesting questions to ask about the data? What practical impacts can arise from the data? What constitutes ethical use? Therefore, we also welcome students with analytical backgrounds (e.g., statistics, math) or knowledge in fields that would benefit from data analysis (e.g., social and life sciences, public policy).


This course requires basic knowledge of programming (the equivalent of CompSci 101) and statistics. Additionally, each student should have taken at least one of the following (or their equivalent):

  • a 200-level (or above) computer science course;
  • a 100-level (or above) statistics course;
  • a 200-level (or above) math course.

If the prerequisites are not met, students must obtain the consent of the instructor to enroll.