Module 04: Data Wrangling

  1. Prepare (due M 1/31)
    1. Content below
    2. Sakai quizzes
  2. Peer Instruction – see the class forum
  3. Homework (due Su 2/6)
  4. Worked Example

Content (Slides in the Box folder)

4.A – What Is Wrangling?

  1. Data sources, formats, and importing (26 min.)
  2. Common data cleaning problems (16 min.)
  3. Read Section 3.4, "Handling Missing Data," from the Python Data Science Handbook
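
As a preview of the missing-data reading, here is a minimal sketch of three common pandas operations: counting missing values, dropping incomplete rows, and filling gaps. The DataFrame, its column names, and the choice to fill temperatures with the column mean are all assumptions made up for illustration; they do not come from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical readings (invented for this sketch).
# Both None and np.nan are treated by pandas as missing values.
df = pd.DataFrame({
    "city": ["Durham", "Raleigh", None, "Cary"],
    "temp_f": [71.0, np.nan, 68.5, np.nan],
})

# Count the missing entries in each column.
missing_per_column = df.isnull().sum()

# Keep only the rows that have no missing values at all.
complete_rows = df.dropna()

# Fill gaps column by column: a placeholder label for text,
# the column mean for numbers (one possible choice among many).
filled = df.fillna({"city": "unknown", "temp_f": df["temp_f"].mean()})

print(missing_per_column)
print(complete_rows)
print(filled)
```

Whether to drop or fill depends on the analysis; dropping here discards three of the four rows, which is why filling is often worth considering first.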

4.B – Wrangling Text

  1. Python string operations (16 min.)
  2. Introduction to regular expressions (18 min.)
  3. Read Section 3.10, "Vectorized String Operations," from the Python Data Science Handbook
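
The two text-wrangling topics above can be combined in a small sketch: built-in string methods tidy up a name, and a regular expression pulls out a phone number written in several formats. The records, the `PHONE` pattern, and the `clean` helper are hypothetical, written only for this illustration, and the pattern assumes ten-digit US-style numbers.

```python
import re

# Hypothetical messy records (invented for this sketch).
raw = [
    "  Alice SMITH, 919-555-0134 ",
    "bob jones,(919) 555 0199",
    "Carol Lee , 9195550111",
]

# Three digit groups, with optional parentheses around the first
# and an optional space or hyphen between groups.
PHONE = re.compile(r"\(?(\d{3})\)?[\s-]?(\d{3})[\s-]?(\d{4})")

def clean(record):
    """Return (name, phone) with consistent spacing, case, and format."""
    # Split at the first comma: name on the left, everything else on the right.
    name, _, rest = record.partition(",")
    # Collapse runs of whitespace, then capitalize each word.
    name = " ".join(name.split()).title()
    # Rebuild the phone number as ddd-ddd-dddd, or None if no match.
    match = PHONE.search(rest)
    phone = "-".join(match.groups()) if match else None
    return name, phone

cleaned = [clean(r) for r in raw]
for name, phone in cleaned:
    print(name, phone)
```

The same string methods and patterns can be applied to a whole pandas column at once through the `.str` accessor, which is the subject of the handbook reading.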

Optional Supplements