Performing computational research is a double-edged sword. On one hand, you can work anywhere you want: all you need is an internet connection that lets you remotely use a workstation, which houses the computing power needed for computational tasks. On the other hand, it means that a lot of time is spent in front of a computer.
My lab day always begins with a coffee order from Twinnies, as I work at CIEMAS, which is directly adjacent to the café. After getting caffeinated, I go to my “cubicle,” which has a dual-monitor setup and powerful GPUs and CPUs that can efficiently run data analysis tasks on my lab’s dataset. However, depending on the weather, my mood, or my plan for the day, I might instead go downtown and find a new spot to work, which keeps me energetic and creative.
Before I begin work, I write out my major goal for the workday and list the minor tasks that I would like to accomplish. I always plan out the day to align myself and ensure that each day is productive and mindful. With the day’s targets laid out, I begin with whatever I believe requires the most mental energy, saving tasks that take time but are relatively easy for later, when I can do them while listening to music.
Publications, readings, and articles are my best resources during my data analysis tasks. Sometimes I need a refresher, or I need to learn something new that is directly related to my specific task, particularly for certain statistical techniques or probabilistic modeling methods.
A popular data science workflow is OSEMN (pronounced “awesome”), which stands for Obtaining, Scrubbing (cleaning), Exploring, Modeling, and iNterpreting. At this point, I have already handled the difficult task of scrubbing the data (the most time-consuming step), so most of the day is spent transforming the data to explore it and plotting the features I extracted.
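To make the scrubbing and exploring steps a little more concrete, here is a minimal Python sketch. It is only an illustration, not my lab’s actual pipeline: the reads are made-up toy data, and the helper names (`scrub`, `gc_content`, `explore`) and the choice of GC content as a feature are assumptions picked for the example.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of bases in a sequence that are G or C (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def scrub(reads):
    """Scrub: drop empty reads and reads containing anything other than A, C, G, T."""
    valid = set("ACGT")
    return [r.upper() for r in reads if r and set(r.upper()) <= valid]


def explore(reads):
    """Explore: summarize the cleaned reads with a few simple statistics."""
    features = [gc_content(r) for r in reads]
    return {
        "n_reads": len(reads),
        "mean_gc": sum(features) / len(features) if features else 0.0,
        "base_counts": Counter("".join(reads)),
    }


# Hypothetical toy reads; a real dataset would be loaded from sequencing files.
raw_reads = ["ACGT", "ggcc", "ACNT", "", "ATAT"]

clean_reads = scrub(raw_reads)   # removes "ACNT" (ambiguous base) and the empty read
summary = explore(clean_reads)
print(summary)
```

In practice, summaries like these would feed into plots (for example, a histogram of per-read GC content), and those plots are where most of the exploring happens.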
Aside from talking with grad students and postdocs and scheduling meetings with my mentor whenever I have questions, the day is filled with logically working through problems and trying to create representations of the dataset that will be usable in an ML model for solving the problem. I love taking a problem, working it out on my iPad with the Apple Pencil, and then coding it up on my computer. Though computer science may seem boring to most, I like to think of my computational research as identifying problems, solving them theoretically, and then applying the solution with coding as a toolkit.
Some days are frustratingly filled with debugging long lines of code, mostly because handling DNA sequencing data, and my lab’s dataset in particular, is so niche. Even so, it is particularly rewarding when a beautiful, purposeful plot is generated that provides insight into the data and brings me a step closer to a practical, effective solution.