Patrick Green and Eleanor Caves

We are all data scientists these days, to one degree or another. The ability to explore and analyze data helps us make sense of our world.

Duke’s Data Expeditions program aims to introduce more undergraduates to data science early in their college education. The Information Initiative at Duke (iiD), in partnership with the Social Science Research Institute (SSRI), supports pairs of graduate students who prepare a dataset for use in an existing undergraduate course.

Patrick Green teaching Data Expeditions

In one Data Expedition project, Exploring Cleaner Shrimp Color Vision Capabilities Using R, Biology doctoral students Eleanor Caves and Patrick Green teamed up with Professor Sönke Johnsen to pilot their approach in an introductory summer course called Sensory Systems. Green and his advisor Sheila Patek then adapted it for use in an upper-level lab course, Principles of Animal Physiology.

“Especially if classes have a lab component, getting students some experience with importing, analyzing, and plotting data can be invaluable,” said Caves. “I remember struggling with Excel to write my own lab reports in college, and if someone had just given me the tools to code, and then inspired me to use those tools for a couple of reports, I would have been so much more comfortable with different aspects of data analysis.”

“This is a critical tool for students to learn,” Green added, “whether they use data in their future careers or whether they’re just trying to understand the world around them as they, for example, vote and raise families.”

Cleaner shrimp working on a fish

Cleaner shrimp are crustaceans that provide handy cleaning services to reef fish by removing ectoparasites. The project’s aim was to investigate how cleaner shrimp perceive the color patterns of other cleaner shrimp and fish. Caves collected the data as part of her doctoral dissertation.

In the class, she and Green introduced the ecology of cleaner shrimp, asked the students to make predictions about color vision capability, and taught coding sessions in R.

Along the way, both the undergraduates and the instructors faced challenges.

“What makes coding frustrating on an individual level translates into the classroom,” said Caves. “Typos and minor errors that can send coding errors back at you occur on the students’ computers too, and you have to be ready to troubleshoot on your feet.”

Patrick Green working with Data Expeditions students

“Similar to Eleanor, I learned that these activities move more slowly than we might expect,” noted Green. “It was incredibly useful to have ‘teachable moments’ when students hit error messages. Even if these errors were caused by simple misspellings, it allowed us to show students that this is normal and fixable – not an impassible roadblock. Because we coded in real-time along with the students, we were also able to showcase our own mistakes and humanize the process, something I think is useful for students to see.”

The students soon learned how to subset, index, plot, change the color and shape of data points, add best fit lines, change line width and type, and create smooth spectral sensitivity curves (which show how sensitive photoreceptors are across the visible spectrum of light).
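For readers curious what these first steps look like in practice, here is a minimal sketch in base R. It is not the course material: the data frame `dat` and its columns (`individual`, `intensity`, `response`) are invented for illustration, and the numbers are randomly generated rather than taken from the cleaner shrimp dataset.

```r
# Hypothetical data, invented for illustration only.
set.seed(42)
dat <- data.frame(
  individual = rep(c("A", "B"), each = 10),
  intensity  = rep(1:10, times = 2)
)
dat$response <- 2 * dat$intensity + rnorm(nrow(dat))

# Indexing: the first three rows, and a single column by name.
first_rows <- dat[1:3, ]
responses  <- dat$response

# Subsetting: keep only the rows for individual A.
ind_a <- subset(dat, individual == "A")

# Scatter plot with a chosen point color (col) and shape (pch).
plot(ind_a$intensity, ind_a$response,
     col = "blue", pch = 17,
     xlab = "Stimulus intensity (arbitrary units)",
     ylab = "Response (arbitrary units)")

# Best-fit line from a linear model, with custom width (lwd) and type (lty).
fit <- lm(response ~ intensity, data = ind_a)
abline(fit, lwd = 2, lty = "dashed")
```

Changing `col`, `pch`, `lwd`, or `lty` and re-running the plot is a quick way to see what each argument controls.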

Figure from Data Expeditions

At the end, they created a figure of spectral sensitivity for several individuals of the same species. They compared their results to their predictions and discussed how they might use their new skills to analyze data they’ll collect in future lab-based courses.
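A figure of this kind might be built roughly as follows. This is a sketch under stated assumptions, not the students' actual code: the sensitivity values are simulated, the peak wavelength of 520 nm is an arbitrary choice, the individual names are placeholders, and a smoothing spline (`smooth.spline`) stands in for whatever smoothing method the course used, which the article does not specify.

```r
# Hypothetical data, invented for illustration: noisy sensitivity
# measurements for three individuals, peaking at an arbitrary wavelength.
set.seed(7)
wavelength  <- seq(400, 700, by = 20)
individuals <- c("ind1", "ind2", "ind3")
sens <- sapply(individuals, function(id) {
  exp(-((wavelength - 520) / 60)^2) + rnorm(length(wavelength), sd = 0.05)
})

# Empty plot frame (type = "n") sized to cover all individuals.
plot(wavelength, sens[, 1], type = "n", ylim = range(sens),
     xlab = "Wavelength (nm)", ylab = "Relative sensitivity")

cols <- c("red", "darkgreen", "blue")
fine_grid <- seq(400, 700, by = 1)
for (i in seq_along(individuals)) {
  # Raw measurements as points.
  points(wavelength, sens[, i], col = cols[i], pch = 16)
  # Smoothed curve through the measurements (smoothing spline).
  sm <- smooth.spline(wavelength, sens[, i])
  lines(predict(sm, fine_grid), col = cols[i], lwd = 2)
}
legend("topright", legend = individuals, col = cols, lwd = 2)
```

Drawing each individual in its own color on shared axes makes it easy to compare curves against a prediction made beforehand.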

And they seemed to enjoy the process. Caves noted, “I’ve been pleasantly surprised at how attentive the students remain and how engaged they seem the whole time.”

“It never occurred to me that I would need to learn how to code,” wrote one student in an end-of-class reflection, “but I am glad that I get to learn this.” Another student wrote, “It was actually easier than I expected, since coding seems so out of reach when you don’t know what is happening or what the terms mean. I could definitely use R in the future for projects where I am required to use data.”

Ultimately, learning to code gives students a deeper understanding of how data can be used to solve real-world problems. “It gives students, even those who won’t go on to do research of their own, a respect for the scientific process, how we analyze our data, and where results come from, so that hopefully they can be more informed citizens and interpreters of the overwhelming number of facts they’re exposed to every day,” said Caves.

Eleanor Caves and Patrick Green with their advisors

Both Caves and Green received the Dean’s Award for Excellence in Mentoring from The Graduate School. They graduated this spring and are now postdoctoral researchers in Duke’s Biology Department with the Nowicki Lab.

“I have been surprised to learn during my Ph.D. that I can code, and that I am somewhat good at it,” Green reflected. “This has taken lots of trial and error, but I am motivated to continue learning and developing these skills in my research. Being able to use the same skills in my teaching is something that expands my teaching abilities and, I hope, will improve my ability to reach new generations of students.”

See other Data Expeditions projects and learn about a new program at Duke called Archival Expeditions. Photos at top and bottom courtesy of The Graduate School; other photos courtesy of Eleanor Caves and Patrick Green.