- Class Meeting. Mondays and Wednesdays 10:05 – 11:20 am in Gross Hall 103 (recordings available on Canvas).
- Instructor. Brandon Fain. Office hours:
  - 11:20 – 11:35 am Mondays and Wednesdays (after class) in the Gross Hall 1st floor lobby,
  - 3 – 5 pm Wednesdays in LSRC D104,
  - or by appointment, in person or on Zoom (email in advance to set up a time).
- Graduate Teaching Assistant. Minxing (Matt) Zhang. Office hours:
  - 4 – 5:15 pm Mondays in LSRC D301,
  - 11 am – 12:15 pm Fridays in LSRC D301,
  - or contact by email (minxing.zhang@duke.edu) to make an appointment or to request to join online at duke.zoom.us/j/5904612206.
- Course Platforms:
  - Canvas learning management system
  - Ed Discussion forum for questions (accessible from Canvas)
  - Gradescope submission and grading (accessible from Canvas)
Course Description
Machine Learning (ML) studies techniques to automatically learn patterns from data rather than explicitly programming a behavior. This course explores applications of machine learning to tabular data, computer vision, human language, and reinforcement learning. Linear models, logistic models, and deep artificial neural networks of different architectures, including perceptrons, convolutional neural networks, and transformers, will be used. Students will apply all techniques to real data using modern software. Societal and ethical considerations of ML will be examined.
References
The course utilizes several reference texts. All readings are freely (and legally) available online; you are not required to purchase anything for this course.
- BB: Deep Learning Foundations and Concepts by Christopher M. Bishop with Hugh Bishop. [BB online access link].
- GBC: Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. [GBC online access link].
- JM: Speech and Language Processing (3rd edition) by Dan Jurafsky and James H. Martin. [JM online access link].
- SB: Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto. [SB online access link].
- RLM: Machine Learning with PyTorch and Scikit-Learn by Sebastian Raschka, Yuxi (Hayden) Liu, and Vahid Mirjalili. You can [freely view RLM code examples here] or [optionally purchase RLM full text here].
Learning Objectives
- Program in Python using NumPy, Scikit-Learn, and PyTorch
- Train, validate, tune, and test predictive models, including linear and neural models
- Understand, measure, and explain common machine learning performance metrics, including accuracy, precision, recall, the receiver operating characteristic (ROC) curve, and cross entropy
- Design and train an artificial neural network (ANN) using backpropagation and stochastic gradient descent
- Recognize and classify images using convolutional neural networks (CNNs)
- Apply transformer architectures to generate text using large language models
- Plan behavior using reinforcement learning with deep neural network function approximation
- Interpret and explain recent research advances in machine learning
- Consider the integrated questions of interpretation, bias, transparency, and fairness in machine learning
Background and Prerequisites
Computer Science 201 (Data Structures and Algorithms) is a required prerequisite. You are expected to have experience programming small to medium-sized software projects, to be familiar with standard data structures such as arrays, lists, strings, and maps, to be able to read code and documentation, and to be able to debug a program.
Previous experience with the Python programming language will be helpful but is not required. No previous machine learning experience is required. Some mathematics will be necessary to correctly describe the concepts and algorithms we study. To get the most out of the course, you should be comfortable with introductory college-level mathematics (at the level of Calculus 1) and willing to learn and ask for help when you encounter new concepts and notation. However, this is an applied course without higher math prerequisites, and you will not be expected to write proofs, complex derivations, etc. in order to succeed in the course.