- Prepare (due Mon 9/23)
- Content below
- Canvas quizzes
- Class engagement (see the class forum)
- Homework (due Sun 9/29) [LINK]
- Worked Example [LINK]
Content (slides are in the Box folder)
04.A – What Is Wrangling?
- Data sources, formats, and importing (26 min.)
- Common data cleaning problems (16 min.)
- Read Section 3.4, "Handling Missing Data," from the Python Data Science Handbook
04.B – Wrangling Text
- Python string operations (16 min.)
- Introduction to regular expressions (18 min.)
- Read Section 3.10, "Vectorized String Operations," from the Python Data Science Handbook
Optional Supplements
- pandas IO tools documentation
- pandas user guide: Working with missing data
- Python Regular Expression HOWTO
- pandas user guide: Working with text data
- Why is data wrangling sometimes hard? See this case study: [The Maddening Mess of Airport Codes!]