Resources
Technology Install Directions
This class uses Anaconda’s Individual Edition, a free, open-source distribution containing Python, Jupyter Notebook, and (nearly) everything else needed for data science in Python. Go to Anaconda’s Individual Edition and download the data science toolkit for your operating system with Python version >= 3.7. If you have trouble installing, check the Anaconda documentation. When you are done, we recommend opening a Jupyter Notebook (enter “jupyter notebook” at a command line terminal, run Jupyter Notebook like a regular Windows program, or run the Anaconda Navigator program and select Jupyter Notebook) and familiarizing yourself with the Jupyter Notebook documentation.
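As a quick sanity check after installing, something like the following sketch can be pasted into a notebook cell to confirm that the Python version meets the >= 3.7 requirement. The helper name `meets_minimum` and the constant `MIN_VERSION` are made up for this illustration and are not part of Anaconda or the course materials.

```python
import sys

# Minimum (major, minor) version required by the course (from the text above).
MIN_VERSION = (3, 7)


def meets_minimum(version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) part of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the interpreter that is actually running this cell.
print("Running Python", sys.version.split()[0])
if meets_minimum(sys.version_info):
    print("Version requirement satisfied.")
else:
    print("Python is older than 3.7; consider reinstalling Anaconda.")
```

If the reported version is older than expected, the notebook may be running a different Python installation than the one Anaconda provided.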
Gradescope
You will submit labs and projects through Gradescope. If you are unfamiliar with Gradescope or aren’t sure how to submit your assignment, see the Gradescope help document.
Jupyter Notebook Container
If something happens to your computer or you cannot install Anaconda on it, we’ve reserved containers for you through OIT. Go to the container manager and look for “JupyterLab with Pytorch for Data Science and Machine Learning”. Click on the button to reserve your instance of the notebook. Once your instance is reserved you can click on “Pytorch” among your reserved containers, start the server, and upload any necessary files.
Getting Help
Office Hours
All times are listed for US Eastern Time (i.e., local time at Duke).
- Prof. Stephens-Martinez
  - Monday, 3-4 pm on Zoom
  - Friday, 1-2 in LSRC D224 (her office)
TA Office Hours
All times are in ET (Duke timezone). Note that these are not always 7 – 11 pm. If they are on Zoom, check Sakai for the link.
- Sunday
  - 7 – 9 pm: UTAs (BioSci 113)
  - 9 – 11 pm: UTAs (Zoom)
- Monday
  - 4-5 pm: Han Gong
  - 7 – 9 pm: UTAs (BioSci 113)
  - 9 – 11 pm: UTAs (Zoom)
- Tuesday
  - 5 – 7 pm: Chang Xu
  - 7 – 9 pm: UTAs (BioSci 113)
  - 9 – 11 pm: UTAs (Zoom)
- Wednesday
  - 4:45-5:45 pm (right after class): Han Gong
  - 7 – 9 pm: UTAs (BioSci 113)
  - 9 – 11 pm: UTAs (Zoom)
- Thursday
  - 7 – 9 pm: UTAs (BioSci 113)
  - 9 – 11 pm: UTAs (Zoom)
- Friday
  - 12:30-1:30 pm: Yunzhou (David) Liu
There are a few steps to get office hours support:
- If attending remotely, go to the Zoom link available in Sakai.
- Log into My Digital Hand Beta and navigate to the Get Help Page.
- Click the Get Help Now button and answer the prompted questions.
- You will be added to a waitlist. Wait in person or on Zoom. If you are in person, a TA will call for you. If you are on Zoom, a TA will connect with you via a Zoom breakout room when it is your turn on the waitlist.
- After you have been helped, return to My Digital Hand Beta. You will again be prompted to fill out some questions regarding the help you received.
My Digital Hand Beta Student Instructions
To sign up, go to My Digital Hand Beta and use entry code S7YRC5U. The most important step is to make sure you sign up with your Duke email in the form netid@duke.edu.
Class Forum
We will use Ed Discussion for online discussion. You access it from Sakai. You can post questions (anonymously to classmates), as well as message the instructors. Please use it for technical questions first instead of email. In particular, we encourage you to use it so that (a) you can get a faster response (multiple instructors or students can reply), (b) your questions don’t get lost in anyone’s email, and (c) other students can benefit from your questions or comments.
Python for Data Science
If you are new to programming in Python, there are a lot of good tutorials available. The official documentation has one: https://docs.python.org/3/tutorial/index.html and Google also hosts a good tutorial with videos: https://developers.google.com/edu/python/. If you want a guide that is specific to transitioning to Python from Java, try http://python4java.necaiseweb.org/Main/TableOfContents. Note that if you are new to programming altogether, you do not meet the prerequisites for the class and should consider CS 101 or CS 116 instead; we are assuming that you have the background to pick up basic syntax and functionality of Python on your own.
If you are new to scientific programming with Python, you may find this NumPy tutorial helpful, along with the NumPy documentation. The Python Data Science Handbook is also a very useful reference on the use of Python for data science, including helpful information on commonly used libraries like NumPy, Pandas, Matplotlib, and Scikit-Learn. For a gentler introduction, try this online data8 book developed for U.C. Berkeley’s Foundations of Data Science course and used in CS 116 at Duke.
To get all of the data science libraries you need together with a Python distribution on your local device, look at the Anaconda distribution, available for free. The Anaconda distribution of Python contains everything that you need to be successful in data science with Python, including all Python resources you should need for this course. It includes Python 3 itself, all of the crucial libraries for data science (NumPy, Pandas, Matplotlib, scikit-learn, etc.), and development environments (notably the Spyder scientific computing IDE and Jupyter notebooks).
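One way to confirm that the libraries named above are visible to your Python installation is the short sketch below. It only checks whether each package can be found, without importing it. The helper `missing_packages` and the list `COURSE_PACKAGES` are illustrative names, not part of Anaconda; note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names of the libraries mentioned above (scikit-learn imports as "sklearn").
COURSE_PACKAGES = ["numpy", "pandas", "matplotlib", "sklearn"]


def missing_packages(names):
    """Return the subset of top-level package names that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(COURSE_PACKAGES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

If anything is reported as not found, the notebook is probably running a Python other than the Anaconda one, or the installation did not complete.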
Duke Co-Lab
The Innovation Co-Lab hosts a variety of trainings, projects, and programming that might be interesting to an aspiring data scientist. The Co-Lab also hosts regular office hours (and you can make an appointment) on a variety of technical subjects.
Terminal
A Unix/Linux terminal (or bash shell) is the basic non-graphical interface with which all computer scientists (and likely all data scientists) need to be familiar. We may occasionally need to use terminals in the class to install packages, connect remotely, or execute code. If you have never worked with a terminal before, see a brief introduction to Shell Basics.
Academic Resource Center
Want expert consultation about study habits, learning, time management, and more? Check out the Academic Resource Center (ARC) at Duke.
Counseling and Psychological Services
Your thriving is about more than this class. If you need to talk to someone, consider Duke Counseling and Psychological Services (CAPS).