Just another Duke WordPress Sites site

For those who don’t have a Computer Science background, computer programs might seem like far-fetched wizardry. When I took my first Duke CS course, Data Structures & Algorithms, I also had a difficult time wrapping my head around the numerous ideas introduced throughout the course. At the time, data structures (e.g. linked lists, binary trees, stacks, queues) seemed abstract and confusing to me. Looking back, data structures (and CS in general) are so much more relatable to our daily lives than I used to think! As a programming instructor this summer, I want to make sure that my students can realize this sooner. Given this goal, I began to think: How should I teach data structures?

“How do we access an element in a sorted array?”

A few hands go up hesitantly…

“How do we look up a page in a book?”

Most students are willing to share!

When the vocabulary is technical, students are less likely to trust that they know the answer, even when they actually do. Putting technical ideas into familiar, real-life settings effectively reduces the fear associated with classroom discussion. Thus, when I’m introducing new data structures, I make sure to draw on plenty of analogies to simplify abstract concepts and encourage further discussion. Since students are more likely to personally connect with these everyday examples, I find that they have a much easier time comprehending and implementing data structures in code later on.

In the following weeks, I will be introducing several important data structures with the examples I frequently use in class. Let’s start with arrays!

Arrays

Introduction (Beginner):

Arrays are the simplest and most widely used data structure. Other data structures, such as stacks and queues (which I’ll explain later) are derived from arrays. Because arrays are so prevalently used in programming, they often are amongst the first data structures students learn in an introductory CS course.

Below is a simple array of size 4, containing elements 1, 2, 3, and 4.

You can think of an array as a row of boxes with each box capable of storing a single element. The type of element that is stored in each box within the array needs to be the same. In other words, if the one box stores a number (i.e. int, double), the other boxes cannot store words (i.e. Strings) because all the boxes within the array need to be consistent in type.

Each element is assigned a positive numerical value called an index, which corresponds to the position of that particular element in the array. Note that array indices typically start at a value of 0 (starting from 0 instead of 1 is a recurring theme in CS).

There are two major types of arrays:

• One Dimensional Arrays (as illustrated previously)
• Multi-Dimensional Arrays (arrays within arrays, or nested arrays)

You can picture a two-dimensional array as a table made of rows and columns, such as a spreadsheet. Since two-dimensional arrays can be a little confusing for beginners, I like using an empty egg carton to demonstrate how a two-dimensional array is traversed.

In a two-dimensional array, each row represents a distinct array of elements. As pictured above, the carton (array) holds two rows of eggs (elements), so it contains two distinct arrays, specifically {0, 1, 2, 3, 4, 5} and {6, 7, 8, 9, 10, 11}. In class, I have students place pieces of candies into different spaces of an egg carton to simulate a piece of array-accessing code step by step. And yes, they get to eat the candies afterwards!

Memory (Intermediate):

To understand how arrays work on a low level in memory, you can think of an array as a group of apartments (elements) side by side in an apartment complex (the array). Each apartment has an associated address number (memory address) at its front door. For instance, if the first apartment has an address number of 1000, then the second apartment will have an address number of 1001, and so on. Following this logic, we know that the fifth apartment will have an address number of 1004 and the tenth apartment will have an address number of 1009. A similar phenomenon happens in arrays. Since the elements of an array are stored in contiguous memory locations, if the memory address of the first element of an integer array is 1024, then the memory address of the fifth element will be 1024 + 4*4 ( the size of an integer element is 4 bytes) = 1040. This means that as long as we keep track of the memory location of the first element of the array, we can find any element in the array with a quick calculation (see below).

Unlike static arrays (what we’ve covered so far), which are initialized with a fixed memory allocation (e.g. 10 integer elements; int[10]), dynamic arrays automatically grow when the programmer tries to make an insertion and there is no more space left for a new element.

How does this iterative resizing process work? Whenever there is no more space to add a new element, a new array is allocated and each element from the original array is copied to the new array. To avoid incurring the cost of resizing many times, dynamic arrays resize by a large amount, such as doubling in size (as illustrated above), and use that space for future expansion. The old array, which is now useless since everything is copied to the new array already, is deleted (memory is freed). As you can see this is a relatively costly process, but resizing doesn’t happen every time you append an element to the original array. Resizing only occurs when the original array is full of elements and thus out of memory.  For those who understand Big-O Time Complexity, appending in dynamic arrays is O(n) instead of O(1) (appending in static arrays) because of its additional infrequent resizing process.

Interestingly, the resizing process in dynamic arrays can be compared to the process in which hermit crabs find new shells. As a hermit crab grows bigger, its shell becomes an ever-tighter fit so eventually the crabs need to move into a bigger one. As more elements are appended to a dynamic array, the array’s pre-allocated memory gets increasingly filled until the last space eventually gets filled, prompting it to resize. This fun analogy helps students better understand the logic behind resizing.

Most modern programming languages have built-in dynamic arrays (e.g. ArrayLists in Java and lists in Python). However, to illustrate how a dynamic array is specifically implemented, here’s code for a dynamic array I wrote from scratch using Python:

Conclusion:

Whether you’re a beginner, intermediate, or advanced programmer, I hope you gained new perspectives on arrays after reading this post. Stay tuned for next week’s post on stacks!

## 1 Comment

1. Joyce Huang

Hey Eric!
This was such an informative blog post! I took CS101 my sophomore year and nothing against CS, but I personally really did not enjoy it. A large part of the reason for my dislike though was because I was being introduced to so many brand new topics and ideas and itwas all taught using vocabulary I could not understand in the slightest. The way you describe and break down these complicated topics really makes them much easier to understand, your students are lucky to have you as a teacher!