Instructor: Bhuwan Dhingra (bdhingra@cs.duke.edu)
Lectures: Wednesday and Friday, 1:25pm – 2:40pm in Perkins 217
Grad TAs: Flora Jia (flora.jia@duke.edu), Xunjian Yin (xunjian.yin@duke.edu)
UTAs: James Cai (hongyi.cai@duke.edu), Raymond Xiong (raymond.xiong@duke.edu)

This class is a graduate-level introduction to the methodologies underlying modern natural language processing (NLP), the study of computing systems which process human languages. The class is intended for graduate students and upper-level CS undergraduates.

The course will cover a wide range of NLP tasks and methods for solving them, including both classical and modern deep learning techniques. The second half of the course will focus on the training, evaluation, and deployment of Large Language Models (LLMs). The lectures will cover the mathematics behind these methods, and the assignments will require students to implement some of them using a standard machine learning library (PyTorch). By the end of the course, students should be familiar with the main applications of NLP. They should be able to identify appropriate techniques for tackling them, read research papers about those techniques, and (with some effort) implement them in Python. Topics include text classification, language modeling, generative and discriminative models of sequences, pretraining and post-training of LLMs, and reinforcement learning. The final project will involve post-training open-source LLMs on competitive benchmarks.

Prerequisites: [Undergraduate machine learning (COMPSCI 370 or 371) OR statistical inference (STA 250D / MATH 342D) + probability (MATH 230 / STA 230) + linear algebra (MATH 221, 218, or 216)] AND comfort with programming in Python.