Chair: Yuan Wu, PhD (Duke University)
Instructor: Anru Zhang, PhD (Duke University)
Course Description:
Generative modeling is a rapidly evolving area in data science that enables the creation, analysis, and interpretation of complex data. This short course introduces the statistical foundations of generative models and their growing impact on biomedical research.
We will begin with background material and then cover variational autoencoders, generative adversarial networks, and normalizing flows, before moving to state-of-the-art approaches such as diffusion models and flow matching. Throughout, we will emphasize how these models relate to fundamental statistical ideas, including estimation, uncertainty quantification, and bias correction.
Applications will focus on biomedical science, showcasing how generative methods enable protein structure prediction, design, and simulation. Case studies will demonstrate how these approaches address practical challenges such as small sample sizes, irregular time series, and sampling bias, enabling data augmentation and accelerating discovery.
The course is designed for a broad audience across statistics, machine learning, and the biomedical sciences. By integrating theory, methodology, and applications, it aims to equip participants with a principled understanding of generative modeling and inspire future research.
Prerequisite: Basic statistics, probability, and linear algebra
Instructor:
Anru Zhang, PhD
Eugene Anson Stead, Jr. M.D. Associate Professor
Department of Biostatistics & Bioinformatics
Duke University
Dr. Anru Zhang is the Eugene Anson Stead, Jr. M.D. Associate Professor at Duke University and a primary faculty member jointly appointed in the Department of Biostatistics & Bioinformatics and the Department of Computer Science. He also serves as the Associate Chair for Research in the Department of Biostatistics & Bioinformatics. He obtained his bachelor’s degree from Peking University in 2010 and his Ph.D. from the University of Pennsylvania in 2015. His work focuses on high-dimensional statistics, tensor learning, generative models, and applications in electronic health records and microbiome data analysis. He has won the COPSS Emerging Leader Award, the IMS Tweedie Award, and the ASA Gottfried E. Noether Junior Award. Two of his PhD students have won the IMS Lawrence D. Brown Award. His research is currently supported by two NIH R01 grants (as PI and MPI) and an NSF CAREER Award.