Instructors: Sam Wiseman (swiseman@cs.duke.edu), Bhuwan Dhingra (bdhingra@cs.duke.edu)
Lectures: Monday and Wednesday, 3:30–4:45pm in LSRC A247
Large-scale neural language models have enabled significant advances in many areas of natural language processing (NLP). As a result, the field has shifted its focus from task-specific feature engineering and model design to task-agnostic methods for pretraining, transfer learning, domain adaptation, knowledge integration, model interpretation, and robustness (to name a few). This is a seminar course in which we will read and discuss recent papers on these topics, with a particular focus on NLP applications. The class is intended for graduate students in computing fields who have prior experience in NLP or deep learning. By the end of the course, students should have a deep understanding of the state of the art in NLP research on the topics above.
The course will follow the “Role-Playing Paper-Reading Seminar” format previously used by Alec Jacobson and Colin Raffel in their courses; a detailed description of the format is available in this blog post. There will also be a final project requiring students to pursue original research that extends one or more of the papers discussed in class.
Prerequisites: Students are expected to have completed either an NLP course at the level of CS590.03 Intro to NLP or a machine learning course at the level of CMU’s 10-701. If most of the topics in those courses seem unfamiliar to you, you should probably take the Intro to NLP course in the fall before taking this seminar. You should also be able to read a recent paper from an NLP conference (e.g., ACL 2021) and understand the basic concepts and ideas in it. Lastly, you should be familiar enough with a deep learning package (e.g., PyTorch or TensorFlow) to implement simple neural network models for NLP.
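As a rough yardstick for the last point, you should be comfortable writing something like the short PyTorch sketch below on your own. It is purely illustrative (the model, dimensions, and toy batch are placeholders, not course material): a bag-of-embeddings text classifier and one gradient step.

```python
# Illustrative sketch only: a bag-of-embeddings text classifier in PyTorch.
# Vocabulary size, embedding dimension, and the toy batch are arbitrary.
import torch
import torch.nn as nn

class BagOfEmbeddingsClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_classes, pad_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) tensor of token indices
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        pooled = embedded.mean(dim=1)          # average embeddings over the sequence
        return self.classifier(pooled)         # (batch, num_classes) logits

model = BagOfEmbeddingsClassifier(vocab_size=10_000, embed_dim=64, num_classes=2)
tokens = torch.randint(1, 10_000, (8, 20))     # toy batch: 8 sequences of length 20
labels = torch.randint(0, 2, (8,))
loss = nn.CrossEntropyLoss()(model(tokens), labels)
loss.backward()                                # standard training-step mechanics
```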