+Data Science (+DS) is a Duke-wide program, operating in partnership with departments, schools, and institutes to enable faculty, students, and staff to employ data science at a level tailored to their needs, level of expertise, and interests. For more information, please visit our website at https://plus.datascience.duke.edu
Upcoming In-Person Learning Experiences (IPLEs)
Ten +DS learning experiences will be available in March and April, delivered as virtual sessions in keeping with Duke’s measures to respond to the coronavirus global health crisis.
These sessions offer the opportunity to dive deeper into topics and target diverse units at Duke: from those that desire a broad understanding of what is possible with data science, and those who wish to use data-science tools (software) without a need for deep understanding of the underlying methodology, to those who desire rigorous technical proficiency in the details and methodology of data science. Anyone in the Duke community is welcome to join, there is no fee to attend, and no prior experience is necessary. Learn more about IPLEs on the +DS website: https://plus.datascience.duke.edu/learn-ds#iple
Tuesday, March 24 | 4:30-5:30 PM virtual session
Overview of Ethical Issues with Emerging AI
Nita Farahany
Artificial intelligence (AI) can reduce costs, improve efficiency, and potentially improve accuracy in many critical areas of life that impact humans. And yet, many of the tools of AI lack transparency, have inherent biases, and are difficult to govern. Where does this leave society, and what are some of the known and unknown risks of what has come to be known as the “Fourth Industrial Revolution” underwritten by AI? This discussion will focus on AI and its implications for changes in humanity, the need for greater transparency, the growing use of AI in critical areas of decision-making, and the importance of safeguarding against biases. It will also explore issues of privacy, safety and security, and the future of work. Register at https://training.oit.duke.edu/enroll/common/show/21/174920
Wednesday, March 25 | 4:30-6:30 PM virtual session
PyTorch for Computer Vision
Kevin Liang
The goal of computer vision is for computers to be able to understand visual content (e.g. images, videos, 3D, stereo), usually for the purpose of making predictions (classification, detection, captioning, generation, etc.). Modern computer vision models are almost universally based on convolutional neural networks (CNNs), whose recent developments have led to increasing adoption and deployment of deep learning models in a wide range of fields. In this hands-on session, we’ll introduce how to build CNNs in PyTorch, as well as how to load datasets and pre-trained models using PyTorch’s vision library, Torchvision. Register at https://training.oit.duke.edu/enroll/common/show/21/174973
Thursday, March 26 | 4:30-6:30 PM virtual session
Attention Networks for Natural Language Processing
Lawrence Carin
Neural-network-based methods for natural language processing (NLP) constitute an area of significant recent technical progress, with many interesting real-world applications. The Transformer Network is one of the newest and most powerful approaches of this type. This algorithm is based on repeated application of attention networks in an encoder-decoder framework. In this presentation, the basics of all-attention models (the Transformer) for NLP will be described, with applications in areas like text synthesis (e.g., suggesting email text) and language translation. Register at https://training.oit.duke.edu/enroll/common/show/21/174974
Tuesday, March 31 | 4:30-6:30 PM virtual session
AI for the Digital Humanities
Matthew Kenney
Artificial intelligence (AI) is playing an increasingly large role in the Digital Humanities. The use of AI throughout the humanities can accelerate research, open up new forms of investigation, and create novel approaches to interacting with data. In this IPLE we will give an overview of how digital humanities researchers can leverage AI in their work. First, we will survey approaches, models, and applications common to AI and Digital Humanities research. Next, we’ll review successful projects at the intersection of DH and AI, and discuss how these projects are shifting the landscape of what is possible within the digital humanities. Lastly, we’ll review several models and architectures through case studies, to better understand how we can use AI in our own research. This session is part of the Humanities/Social Sciences +DS track, which focuses on engagements with AI for the Digital Humanities and Social Sciences. Register at https://training.oit.duke.edu/enroll/common/show/21/174919
Wednesday, April 1 | 4:30-6:30 PM virtual session
Convolutional Neural Networks for Image Analysis
Timothy Dunn
The convolutional neural network (CNN) represents the current state of the art for image and video analysis, and is increasingly used for analyzing time series and other data with spatial or sequential structure. This session will provide an intuitive introduction to the fundamentals of CNNs, with an emphasis on hierarchical feature extraction and the convolution operation itself. Model training and transfer learning will also be discussed. Register at https://training.oit.duke.edu/enroll/common/show/21/175013
Thursday, April 2 | 4:30-6:30 PM virtual session
Introduction to PyTorch
Serge Assaad
PyTorch is an open source machine learning framework popular for building neural networks. In this hands-on session, we’ll walk through building and training a neural network, introducing the basic mechanics of PyTorch. Register at https://training.oit.duke.edu/enroll/common/show/21/175014
Thursday, April 8 | 4:30-6:30 PM virtual session
Molecular (Omics) Data Analysis
Ricardo Henao
Omics aims to understand biological processes by leveraging high-throughput technologies and data science. Aided by subject matter expertise, this combination has resulted in accelerated discoveries in health and disease. In this session we will go through the characteristics of the molecular data generated by some of these technologies and the fundamental processing and statistical analysis tools (including machine learning methods) that can be used to generate knowledge from these complex, high-dimensional data. Use cases include analysis of gene expression, microbiome, and proteomics data. Register at https://training.oit.duke.edu/enroll/common/show/21/174978
Wednesday, April 8 | 4:30-6:30 PM virtual session (1 of 2)
Thursday, April 9 | 4:30-6:30 PM virtual session (2 of 2)
Introduction to Data Science in Health Care
Matthew Hirschey
The ability to make data-driven decisions is redefining the future of patient care. This two-part series provides an introduction to the emerging field of health data science using the R software language, including data analysis and visualization, with a particular focus on its utility for insight in healthcare. No prior knowledge of data science or computer programming is assumed; laptops are required. Attendees will be provided with healthcare dataset examples, and introduced to R packages and code used to examine data. Particular attention will be paid to code interpretation and data provenance methods by learning to generate reproducible data output files. Although specific datasets will be used for analysis in class, this workshop will provide broadly applicable tools to reproducibly analyze and visualize data across the healthcare continuum. Register for Wednesday (https://training.oit.duke.edu/enroll/common/show/21/175015) and Thursday (https://training.oit.duke.edu/enroll/common/show/21/175016)
Thursday, April 16 | 4:30-6:30 PM virtual session
Machine Learning in Neuroimaging
Andrew Michael
This training will consist of two main sections: (1) application of machine learning (ML) to brain images from a clinical archive to detect brain disorders and (2) extraction of brain features from a large publicly available dataset to better understand mental health. After a brief introduction to the fundamentals of brain imaging, the first part of the class will focus on using structural brain MRI to diagnose and predict autism. Next, a deep learning technique will be applied to estimate brain volume from head CT (computed tomography) images that have poor image contrast. This technique’s potential for early detection and tracking of Alzheimer’s disease will be presented. In the second part of the class, resting-state functional MRI (rsfMRI) data will be used to identify brain markers that may help to better understand the gender disparity in mental health. The class will conclude with evidence suggesting that rsfMRI has individually unique patterns that may serve as brain markers of certain behavioral characteristics. Register at https://training.oit.duke.edu/enroll/common/show/21/174975