LO13 AI Prompting

If you have not yet done so, read the Step-by-Step Guide to Learning Effectively with AI. Then learn about the CLEAR Framework (you can access it through the Duke Library, and the PDF is in the class Box folder) by reading up to and including the section “The CLEAR Framework’s five components.” Finally, learn about different types of prompts by reading Effective Prompts for AI: The Essentials.

The goal of reading these resources is to understand both a process for using chatbots that supports learning and how to prompt chatbots effectively. In the end, however, this is a skill, not something you can memorize or pick up simply by reading others’ chat logs and examples. Therefore, the best thing to do is to start using chatbots to help you learn. If you plan to use one to learn something for a course, make sure the course allows it. You can also use one to learn about things outside any particular course. For example, have you ever wondered how the Doppler effect changes the pitch of a passing ambulance? Or what the difference is between affect and effect? Or why the immune system is actually a brute-force system, given how it creates millions of different receptors to match foreign antigens?

LO12 Learning Illusions Day 2 – Motivated Reasoning

It is easy to learn about the idea of learning illusions while also assuming this can’t happen to you. However, to be human means to be fallible. Moreover, we all have a desire to be right. And when the motivation to reach a correct conclusion conflicts with the motivation to reach a desired conclusion, we have a recipe for motivated reasoning to lead us astray.

Read Motivated Reasoning and Angel Hernandez on Psychology Today to learn about the general idea of motivated reasoning using a specific case study as an example. Then read The Collision Among Goals and Accuracy to learn about the framework for how motivated reasoning comes about and ways to counteract it.

Note: This is not using the definition of motivation we learned in LO6 Motivation, which focuses on self-determination theory and the self-determination continuum from amotivation to extrinsic to intrinsic motivation.

LO12 Learning Illusions Day 1

Read “How Are Students Really Using AI? Here’s What the Data Tell Us” by Derek O’Connell. Focus on the latter part, which discusses the impact of AI use on learning.

We can connect the article’s discussion of cognitive offloading with what we learned in LO5 Cognitive Load. Using the framework of intrinsic and extraneous cognitive load, we can see how cognitive offloading helps learning when the load being offloaded is extraneous. The more extraneous load is removed, the greater the share of the remaining cognitive load that is germane to the task, while the total cognitive load is lightened.

On the other hand, what the article calls cognitive debt we should instead call scaffolding dependence: the learner is not being set up to eventually succeed without the scaffold. The term cognitive debt is confusing within cognitive load theory, since the idea the author is describing is really about learning, not cognitive load.

In our course, we would use the term ‘help germaneness’ to differentiate between what the article calls facilitating, supplementing, and replacing learning. We should be wary of the article’s terms, though, since it is an odd anthropomorphic framing to say an AI can replace a person’s learning; in reality, it is likely just generating what the student needs to complete a learning task. As a reminder, in LO1 Zone of Proximal Development, we learned to determine whether help is scaffolding by focusing on the germaneness of the help: how much of that help/support needs to be faded away for the learning objective to be achieved. AI “facilitating” learning makes sense, since that is when it gives help that is not germane to the learning objective. “Replacing” learning, however, is really just scaffolding dependence where the help is so germane to the learning objective that the learner does not and will not achieve it; the learner is in the Zone of Learner Cannot Do. As for the murky middle of “supplementing” learning, we can think of it within the framework of help germaneness, which requires knowing the learning objective and identifying whether the AI help is scaffolded such that it can be properly faded so the learner will eventually be independent and in the Zone of Can Do Unaided.

LO11 Probability in the Real World Day 2

Now that we’ve learned the basic probability concepts (in LO10) and the different schools of probability thought (in LO11 Day 1), we have enough of a shared language to talk about probability in the real world. Or, more specifically…

How and why probability goes wrong in the real world.

Or, to put it less negatively…

How and why human instincts, intuitions, and judgments about probability contradict what the laws of probability suggest.

As a start, watch this 3blue1brown video on Bayes’ Theorem (15 mins). If you don’t like watching videos and prefer readings, you may alternatively read this interactive webpage that contains the same material (but with less animation/visualization). As a side practice, try to identify the school of probability thought this video belongs to, and any hidden probability assumption(s) it discusses. The answers can be found below.


Spoiler prevention space

 

 

 

 

 

 

 

 


Answer: The video falls under the Bayesian school of thought. Towards the end, when it discusses whether you should assume Steve is a randomly sampled American, it gets at the assumption of equally likely outcomes.

Conjunction Fallacy

Aside from an introduction to Bayes’ theorem, the video also introduces the conjunction fallacy along the way, without using the term explicitly. The conjunction fallacy is illustrated by the Linda experiment, in which almost all participants chose the less likely event. The name comes from the fact that the second event (Linda is a bank teller AND is active in the feminist movement) is a conjunction (“AND”) of two logical statements.
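
To see the fallacy in numbers, here is a minimal sketch with made-up probabilities (the 0.05 and 0.95 below are hypothetical, not from the original experiment). The product rule guarantees the conjunction can never be more likely than either event alone:

```python
# Conjunction fallacy: P(A and B) can never exceed P(A).
# The probabilities below are hypothetical, chosen only for illustration.
p_teller = 0.05                  # P(Linda is a bank teller)
p_feminist_given_teller = 0.95   # P(active in feminist movement | bank teller)

# Product rule: P(teller AND feminist) = P(teller) * P(feminist | teller)
p_both = p_teller * p_feminist_given_teller

print(f"P(teller)              = {p_teller:.4f}")   # 0.0500
print(f"P(teller AND feminist) = {p_both:.4f}")     # 0.0475, always <= P(teller)
```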

Base Rate Fallacy (a.k.a. Medical Test Paradox)

Now, watch (at least the first part of) this 3blue1brown video on this more advanced concept in a very similar context (21 mins in total). You should watch at least the first 6 minutes, which focus on introducing the fallacy/paradox; the later parts are about “fixing” it and are optional, since this class focuses more on identifying the fallacy/paradox. (There is also a longer article by The Decision Lab on the same topic; consider it optional.)
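
If you want to check the arithmetic behind the paradox yourself, here is a minimal Python sketch of the Bayes’ theorem calculation. The prevalence, sensitivity, and false-positive rate below are assumptions chosen for illustration, not the video’s exact numbers:

```python
# Base rate fallacy: even an accurate test can yield mostly false positives
# when the condition is rare. All numbers below are hypothetical.
prevalence = 0.01        # P(disease): 1% of the population
sensitivity = 0.90       # P(positive | disease)
false_positive = 0.09    # P(positive | no disease)

# Bayes' theorem: P(disease | positive)
p_positive = prevalence * sensitivity + (1 - prevalence) * false_positive
p_disease_given_positive = prevalence * sensitivity / p_positive

print(f"P(disease | positive test) = {p_disease_given_positive:.2%}")
# ~9%: most positives are false positives, despite a "90% accurate" test.
```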

Gambler’s Fallacy and Hot Hand Fallacy

Read this article by The Decision Lab on these two fallacies. They are two sides of the same coin: in repeated experiments (compound experiments in which the same experiment is repeated again and again) where the assumption of independence among all the experiments is appropriate, the Gambler’s Fallacy happens when humans incorrectly bias their beliefs about future trials against the recent outcomes, and the Hot Hand Fallacy happens when they bias them toward the recent outcomes.
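
A quick way to convince yourself that both fallacies conflict with independence is to simulate it. Here is a minimal sketch (a simulated fair coin, purely for illustration) showing that the flip after a streak of five heads is still heads about half the time:

```python
import random

# Under independence, a streak tells you nothing about the next flip.
# Simulate fair coin flips and look at what follows five heads in a row.
random.seed(0)
flips = [random.random() < 0.5 for _ in range(1_000_000)]  # True = heads

next_after_streak = [
    flips[i + 5]
    for i in range(len(flips) - 5)
    if all(flips[i:i + 5])  # the previous five flips were all heads
]

print(f"streaks found: {len(next_after_streak)}")
print(f"P(heads after 5 heads) ~ {sum(next_after_streak) / len(next_after_streak):.3f}")
# ~0.5: the coin has no memory, contrary to both fallacies' intuitions.
```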

Simpson’s Paradox

Lastly, watch this short video by minutephysics on Simpson’s paradox (6 mins), which occurs when categorized data exhibit opposite trends when analyzed in aggregate versus broken down by category. Aside from introducing the paradox, the video rightly cautions against taking statistics out of context (the part where it says “more money makes you a cat”).
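
Here is a minimal numeric sketch of the reversal, with counts patterned after the classic kidney-stone treatment example (the exact numbers here are for illustration only): treatment A has the higher success rate within both subgroups, yet treatment B looks better in aggregate.

```python
# Simpson's paradox: A beats B in every subgroup, B beats A overall.
# (success, total) counts below are illustrative.
groups = {
    "mild cases":   {"A": (81, 87),   "B": (234, 270)},
    "severe cases": {"A": (192, 263), "B": (55, 80)},
}

totals = {"A": [0, 0], "B": [0, 0]}
for group, arms in groups.items():
    for arm, (success, n) in arms.items():
        totals[arm][0] += success
        totals[arm][1] += n
        print(f"{group}, treatment {arm}: {success}/{n} = {success/n:.1%}")

for arm, (success, n) in totals.items():
    print(f"overall, treatment {arm}: {success}/{n} = {success/n:.1%}")
# A wins in both subgroups (93% vs 87%, 73% vs 69%),
# yet B wins overall (83% vs 78%) because of unequal group sizes.
```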

There are more fallacies and paradoxes out there (Shao-Heng’s favorite is Bertrand’s paradox, if you crave more), but the ones above are what this class can reasonably cover.

LO11 Probability in the Real World Day 1

Schools of Probability Thought

Strictly speaking, LO10 was all math definitions and operations. We did contextualize the concepts using real-world examples, but we did not really touch the following fundamental question:

What do people actually mean when talking about probabilities? 

It turns out there are many different schools of thought when it comes to probability, just like there are different schools of thought when it comes to what AI is or how to categorize kinds of AI systems.

Read the following articles:

The first article is philosophical, whereas the second is from a data scientist’s perspective, making them complement each other.

After reading the two articles, you may realize that they talk about related but seemingly different concepts, and it can be hard to synthesize the philosophical and data-scientific perspectives. Below is how we will think of the three schools of probability thought in this class (these are the names we will use):

"School of thought" (we will use these terms)Closely related (or equivalent) toTheoretical or Empirical?Objective or Subjective?What is probability/uncertainty in this school of thought?Notes
PureAbstract; Classical; Logical TheoreticalObjectiveProbability is just a mathematical concept. It may or may not model the real world, and we might not care.May or may not assume equally likely outcomes (also called the Principle of Indifference in the 1000wordphilosophy article)
FrequentistPropensityEmpiricalObjectiveProbability is just the long-run frequency that some event we care about occurs. There is a true underlying frequency. The more data/evidence we get, the closer we are to that truth.Everything that EXCLUSIVELY draws from data falls under this school of thought.
BayesianBothSubjectiveProbability is just someone's belief about things. Data and evidence can lead to changes/updates in beliefs, but there is no such thing of a truth.Can be purely theoretical, but since it is subjective, most of the time it is used in an empirical context.

In other words, when asked to identify the school of probability thought something falls under, you should answer one of Pure, Frequentist, or Bayesian, justified by evidence provided in the scenario.

Hidden/Implicit Assumptions in Probabilistic Reasoning

1. Assumption of equally likely outcomes

As already mentioned in the 1000wordphilosophy article, this assumption is frequently used in pure/classical probability. You should already be quite familiar with this, as our entire LO10 operates under this assumption.

As a ridiculous example of misusing the assumption, suppose we are interested in whether it will rain tomorrow in Durham, NC. We can define the two outcomes as {rain, no rain}. With the assumption of equally likely outcomes, the probability that it rains tomorrow is clearly one half, or 50%. If we instead get more granular and define three outcomes, {no rain, light rain, heavy rain}, the same assumption now tells us the probability that it rains tomorrow is two thirds, or about 66%. Clearly, at least one of these models is “wrong”. (As you’ll see below, both models are very far off empirically.)

2. Assumption of independent events

Another common hidden/implicit assumption in probabilistic reasoning is the assumption of independence between two or more events. This is especially prevalent when people reason about compound experiments involving many parallel or sequential experiments of the same kind. From LO10, we have learned that this assumption leads to very nice simplifications when calculating the probability of an event. That is usually fine and inconsequential if we stay in pure/classical probability land, but if we also care about modeling the real world reasonably well, we need to be vigilant about holding this assumption, especially when it is not explicitly stated.

As an example, our lovely city of Durham, NC, has about 108 days of precipitation annually (note: this is a frequentist claim). We can simplify a bit and call it 1 out of 3 days (although 108/365 is closer to 30%). What, then, is the probability that it rains two days in a row in Durham? If your gut feeling is to say about 1 out of 9, you just successfully applied the assumption of independence. But precipitation events cannot really be assumed independent (think about multi-day storms or hurricanes).
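
Here is a minimal simulation sketch contrasting the two modeling choices. The “correlated” model below is a hypothetical two-state Markov chain; its 0.6 and 0.2 transition probabilities are invented, chosen so that about 1/3 of days are still rainy in the long run:

```python
import random

# Two models of Durham rain, both with ~1/3 of days rainy on average.
# Model 1 assumes independence; Model 2 lets rain persist across days.
random.seed(0)
DAYS = 1_000_000

def two_in_a_row(rain_days):
    """Fraction of consecutive day-pairs where it rained both days."""
    pairs = sum(rain_days[i] and rain_days[i + 1] for i in range(len(rain_days) - 1))
    return pairs / (len(rain_days) - 1)

# Model 1: each day rains with probability 1/3, independently.
independent = [random.random() < 1/3 for _ in range(DAYS)]

# Model 2 (hypothetical Markov chain): rain tends to continue.
# P(rain | rained yesterday) = 0.6, P(rain | dry yesterday) = 0.2,
# so the long-run fraction of rainy days is still 0.2/(0.2+0.4) = 1/3.
correlated, raining = [], False
for _ in range(DAYS):
    raining = random.random() < (0.6 if raining else 0.2)
    correlated.append(raining)

print(f"independent model: P(rain two days in a row) ~ {two_in_a_row(independent):.3f}")  # ~1/9
print(f"correlated model:  P(rain two days in a row) ~ {two_in_a_row(correlated):.3f}")   # ~0.2
```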

3. Assumption of stationarity

A third common one is the assumption of stationarity, which means the probabilistic model does not change with time (or any other unaccounted factors that may change with time).

Continuing our discussion of rainy days in Durham: is it accurate to say that the probability it rains on a typical summer day in Durham and the probability for a typical winter day are both 30%? With data and some common knowledge about how seasons work*, we know the probability should be slightly higher in summer than in winter.

*Do not take this for granted. Many parts of the world do not have four distinguishable seasons. Some parts of the world do not even have two seasons.
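
As a minimal sketch of what non-stationarity looks like, here is a toy model with invented seasonal rain probabilities whose average still comes out to the single annual figure of 30%:

```python
# Stationarity check: a single annual "30% chance of rain" can hide
# seasonal variation. The seasonal probabilities below are hypothetical.
seasonal_p_rain = {
    "winter (Dec-Feb)": 0.25,
    "spring (Mar-May)": 0.30,
    "summer (Jun-Aug)": 0.37,
    "fall (Sep-Nov)":   0.28,
}

# Treating the four seasons as equal-length for simplicity.
annual_average = sum(seasonal_p_rain.values()) / len(seasonal_p_rain)
print(f"annual average: {annual_average:.2f}")  # 0.30, yet no single season matches it

for season, p in seasonal_p_rain.items():
    print(f"{season}: P(rain) = {p:.2f}")
```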

Note that these assumptions are not always wrong/inappropriate. They can be reasonable if applied in appropriate contexts. We should use our best judgment in determining whether an assumption is appropriate in a context. Some of this is subjective.

There are more hidden/implicit assumptions out there that involve more complicated probability theory concepts beyond the scope of this not-a-math class, but you are encouraged to explore them yourself (potentially with AI, if you so choose).

LO10 Introduction to Probability

For the probability LO, there is some built-in autonomy in the reference material to (hopefully) boost your motivation.

The concepts to learn in this LO are the following:

  • Definitions of basic probability concepts (outcome, experiment, event, sample space, compound experiment)
  • Basic probability axioms
  • Disjoint vs. joint events
  • Independent vs. not independent events
  • Conditional probability

There are at least four different ways to learn these concepts, as we outline below. You should choose at least one route based on the following considerations:

  1. your preferred style of the material – see the media form and “mathyness” information below
  2. your future plans of study – if you have aspirations of taking data science, discrete math, majoring in CompSci/Statistics/Math, etc., it might make sense to do the version that is closest to your future study
  3. whether you decided to use AI in your learning for this LO. The AI option is obviously only available to those of you who decided to use AI.
  • CS216 (Everything Data) reference videos:
    • Content: two short videos (26 mins in total) designed for a data science class with running examples of application in medicine/vaccine.
    • Media form: video with text-based slides (no visualization)
    • “Mathyness”: medium. Prof. Fain talks in natural language, while the slides do use corresponding math notations.
    • Links: Video 1 (Outcomes, Events, Probabilities); Video 2 (Joint and Conditional Probability)
    • Note: no need to worry about the reference to a dataframe around 13:15 of Video 1, as that is a data science-specific reference.
  • CS230 (Discrete Math) reference readings:
    • Content: two subsections of a textbook chapter (16 pages in total) designed for a CompSci discrete math class.
    • Media form: pure text with a few diagrams, with Shao-Heng’s PDF annotations marking what paragraphs to skip
    • “Mathyness”: extremely rigorous math notations
    • Link: In class Box folder
  • Static Web Resource (MathIsFun.com):
    • Content: several interactive webpages
    • Media form: text-based but with illustrated examples and hyperlinks to related concepts
    • “Mathyness”: extremely lay language, does not use set notations, etc.
    • Links: Probability, Independent Events, Conditional Probability
    • Caveat: These few pages are somewhat repetitive and rely heavily on the assumption of equally-likely outcomes, which is not always true (we will discuss this in LO11). They also use terms such as dependent events that are not mathematically rigorous. What they are good at is avoiding math notations and providing links to related concepts.
  • Generative AI with Internet Search capability (so for example, use Duke-accessible/paid ChatGPT instead of DukeGPT):
    • Content: you decide. For example, generative AI models can find all the materials above with accurate prompting.
    • Media form: you decide. You can explicitly instruct the AI to find material in your favorite media form.
    • “Mathyness”: you decide. You can explicitly instruct the AI to find material at the level of “mathyness” you prefer.
    • Note for prompting AI: be sure to give AI the following:
      • Context: this is for our class, CS/Edu171, which is a first-year class on learning theories, AI literacy, etc. You can even feed it this very website.
      • Task: the task here is to find appropriate resources for you to learn the concepts outlined above. So, give the AI the list of concepts.
      • Directions: what exactly do you want from the AI? Likely at least the links to whatever websites they are referring to.
      • Other important directions, like the media form and mathyness you prefer.
    • Notes: it is your own responsibility to check carefully that the material you got collectively covers all of the concepts. All the caveats we have discussed about using AI to learn in this class still apply. Finally, remember that the task here for you is to learn these concepts. It is not just to find the appropriate resources; that is the task for the AI.
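
Whichever route you choose, here is an optional minimal sketch (assuming a little Python, which is not required for this class) that grounds the concepts listed above in a concrete two-dice example; the dice scenario is our own illustration, not from any of the routes:

```python
from itertools import product
from fractions import Fraction

# Experiment: roll two fair dice. Each (die1, die2) pair is an outcome;
# the set of all 36 outcomes is the sample space. (Rolling two dice is a
# compound experiment built from two single-die experiments.)
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a subset of the sample space),
    assuming all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# Events are sets of outcomes.
sum_is_7   = {o for o in sample_space if o[0] + o[1] == 7}
first_is_6 = {o for o in sample_space if o[0] == 6}

print(f"P(sum is 7)           = {prob(sum_is_7)}")               # 1/6
print(f"P(first die is 6)     = {prob(first_is_6)}")             # 1/6
print(f"P(both)               = {prob(sum_is_7 & first_is_6)}")  # 1/36, so not disjoint
# Conditional probability: P(sum is 7 | first die is 6)
print(f"P(sum 7 | first 6)    = {prob(sum_is_7 & first_is_6) / prob(first_is_6)}")  # 1/6
# Equal to the unconditional P(sum is 7), so these two events are independent.
```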

 

LO9 Building Visualizations Day 2

For Day 2 of LO9, we are going to have a workshop on building charts in Excel. To do this, you must first create the Online Excel file that has all of your learning log data in it. Do the following:

  1. Go to https://forms.office.com/Pages/DesignPagev2.aspx
  2. Open your Microsoft Form Learning Log
  3. Click on “View Responses” in the top right corner
  4. Click on “Open results in Excel” to the right of the results summary. This will take you to the Online Excel file that your form is creating entries into
    1. Do not click to download a csv or open on your Desktop. We want you to create an Online Excel file so that it updates automatically as you make more learning log entries.
  5. To confirm that you have completed all these steps, you will create a simple histogram to share for Discussion Prep. Do the following:
    1. Select the column for the question “How much time did you spend, in minutes?”
    2. Pick the “Insert” tab of the menu
    3. Click on the drop-down menu for the charts:
    4. Choose Histogram from the drop-down.

    5. You should get something like this (note this is synthetic data):
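
Purely as an optional aside for the curious (we will not be coding in this class), the same kind of histogram could be sketched programmatically. Here is a minimal Python example, assuming you have matplotlib installed; the minutes data below is made up:

```python
import matplotlib.pyplot as plt

# Hypothetical "How much time did you spend, in minutes?" entries.
minutes = [20, 35, 45, 30, 60, 25, 40, 50, 30, 45, 90, 35, 40, 55, 30]

# Bin the entries and draw the histogram, mirroring what Excel produces.
plt.hist(minutes, bins=8, edgecolor="black")
plt.xlabel("Time spent (minutes)")
plt.ylabel("Number of learning log entries")
plt.title("Learning log time per entry (synthetic data)")
plt.show()
```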

LO9 Building Visualizations Day 1

Now that you know the basics of data visualizations and charts, and the last class covered (some) ways charts can be done poorly, watch the following videos to learn about more kinds of charts and visualization best practices.

LO8 Intro to Data Visualization

Watch the following free videos from Data Literacy Level 1. Videos 1.1 and 1.6 are optional. These are the foundational ideas around data visualization.

Then watch the CompSci216 Basic Plot Types video about basic plots and how to explain a spot on a graph. You can skip slide 2 (the one after the title slide) and start the video at slide 3, at timestamp 0:38. We will not be coding in this class, so slide 2 is not relevant. You can stop watching at 10:44, after slide 21, since I do not expect you to know about kernel density estimation plots.

LO7 Different kinds of AI

Imagine a world where someone got their hands on Thanos’s Infinity Gauntlet and, with a single snap, replaced all the words referring to transportation with the word “vehicle.” No one has a word for “car,” “bike,” “ship,” “airplane,” or “spaceship.”

It would be absolute chaos. One person would say, “Vehicles are great for the environment,” when talking about bicycles, while another may respond, “No! They are destroying our planet,” because they are thinking of heavy-fuel-oil-powered cargo ships. Meanwhile, news of a recently designed, highly efficient, electric-powered, small airplane has people thinking their next car might fly.

It would take years, maybe generations, to rebuild the lost vocabulary. It takes time to articulate the differences between two different vehicles, create a word to differentiate them, and then get others to adopt the word until there’s a shared understanding. And where there’s ambiguity, there’s opportunity to take advantage of the confusion for your own gain.

So, in this world, salespeople lean into this confusion. They advertise a scooter with a basket as a “smart personal vehicle with space for cargo” and point out how it is sleek and eco-friendly. From the ad, it sounds like a car. In reality, it doesn’t go far and barely holds anything. But when the word vehicle could just as easily mean a cargo ship or a skateboard, who’s to say what you’re supposed to expect?

This is what it’s like with the word “AI” right now. So, in this learning objective, we are going to start building our vocabulary so that we see “AI” as just as generic as the word “vehicle” and become suspicious when no other details are provided. With a greater vocabulary, we can be skeptical in a more precise way.

Terminology

AI is a very large umbrella term. Overall, it can be broken down into two main categories and a hodgepodge of other things. One category is predictive AI, which refers to a system that uses data to estimate future outcomes or classify current states. How it does this depends on the data it has and the mathematical model built from that data. The other main category is generative AI, which refers to a system that creates new content based on the data it is trained on. Rather than predicting an outcome or state, it produces something that resembles its training data but is not (usually) exactly like it. Due to the umbrella nature of the term AI, there are other things labeled AI that do not fall into these two categories. We list a few of them below as well.

Note: This is not an exhaustive list. AI is a large and rich research field, with new applications emerging continually. In addition, the categorization between predictive and generative AI is arguable for some of these terms. Some AI systems could really be both. For example, it’s not uncommon for generative AI to rely on predictive AI in some way under the hood. However, for the purposes of this class, we do not have the time to go into all the shades of gray. Therefore, this is how we will characterize things and what we will use for this class.

  • Classification (Predictive)
    • Purpose/goal: assigns a known label to its input
    • Output: a textual label like "cat" for images, or yes/no, depending on the model
    • Example: classifying whether an email is spam or not spam based on the email content and where it came from
  • Recommendation (Predictive)
    • Purpose/goal: suggests relevant content or items based on a set of input data
    • Output: a ranked list of items, content, or actions
    • Example: a ranked list of recommendations of what video to watch next based on a particular user's watch history, video likes/dislikes, and the data from other users that are similar to this user
  • Decision Making (Predictive)
    • Purpose/goal: guides or automates decisions based on a specific context and predicted outcome
    • Output: a recommended action to take
    • Example: a tool recommending whether to approve a car loan
  • Translation (Predictive)
    • Purpose/goal: converts text from one language to another
    • Output: text in the target language
    • Example: translating from English to Spanish
  • Synthetic text generator (Generative)
    • Purpose/goal: produces new text based on a given prompt
    • Output: text
    • Example: producing text to describe a product for marketing
  • Chatbot (Generative)
    • Purpose/goal: a subclass of synthetic text generators that focuses on turn-taking conversations
    • Output: multi-round conversational text from two or more entities
    • Example: a customer service chatbot that tries to help the customer without the need for a human
  • Synthetic image generator (Generative)
    • Purpose/goal: produces an image based on a text prompt
    • Output: image
    • Example: producing an image of a space alien from the prompt "draw me a space alien"
  • Synthetic audio generator (Generative)
    • Purpose/goal: produces audio based on text or structured input
    • Output: audio, such as music, speech, or sound effects
    • Example: an app that turns text into a short audio clip of part of a song
  • Synthetic video generator (Generative)
    • Purpose/goal: produces a video based on (likely) a sequence of prompts
    • Output: video
    • Example: producing a short video clip of a dancing cat alien
  • Automation (Other)
    • Purpose/goal: performs repetitive tasks or a set of predefined tasks without human intervention
    • Output: a completed task or process
    • Example: a machine that installs the door of a car at a manufacturing plant without aid from a human
  • Robotics (Other)
    • Purpose/goal: a physical machine that senses the world around it and interacts with it
    • Output: physical actions such as movement, manipulation, and sensing
    • Example: a robot that plays soccer
  • Artificial General Intelligence (AGI) (Other)
    • Purpose/goal: a computer that replicates human-level reasoning across any task or domain
    • Output: problem-solving and reasoning ability like a human
    • Example: this does not exist, and there is no agreed-upon definition or test for it. If someone ever claims this, they made up a definition and claimed it is true.

On the Anthropomorphization of AI

Anthropomorphism is the human tendency to attribute human traits, like intention, emotion, or consciousness, to non-human things. In the context of AI, this becomes especially problematic with synthetic text generators like chatbots. Language is central to how we understand and relate to one another. Linguistics research shows that when we encounter coherent language, we instinctively imagine a mind behind it, a person who is thinking, feeling, and trying to communicate. This is how we evolved to interpret language, and it works well for human relationships. Synthetic text generators have no mind, no goals, no moral judgment, and no understanding. They do not care about us because there is nothing there that can care. They are simply remixing patterns from massive datasets to produce plausible-sounding responses.

A person may anthropomorphize their car and say it “takes care of them on road trips,” but we don’t actually believe the car has emotions or intentions. And yet many of us still treat synthetic text generators as if they had empathy, insight, feelings to be hurt, etc. That’s because language itself triggers social and emotional instincts. We imagine a mind behind the text.

This problem is exacerbated by a common bias that leads people to believe computers are more objective, neutral, and trustworthy than humans. As a result, we are more likely to place undue trust in AI-generated outputs, excuse harmful output as accidental “mistakes,” or assume good intentions where there are none because computers do not have a mind. Synthetic image generators rarely evoke this illusion of sentience, but synthetic text generators routinely do because of how closely human language is tied to our understanding of thought and emotion.

This false perception of humanity in a synthetic text generator and our bias to believe computers are neutral have serious implications. When an AI causes harm, we risk blaming the AI instead of its creators. The creators are the ones who designed it. They decided what data to train it on, what outputs were reasonable, where to use it, and how to profit from it. If a car had a fundamental manufacturing flaw, we would not blame the car. We would blame the automaker and hold them accountable. We should do the same for the creators of the AI.

Learning A/B Test

Now that we have finished the themes “What is learning?” and “How does learning work?” and entered the “What is AI?” theme of the course, you will apply what you have learned to observe and analyze your own learning. You will do this by running an A/B test as you learn LO8 through LO11. An A/B test is a user-experience research method in which two variants (A and B) are experienced by either the same or different users to determine which variant is more effective. Of course, by this point you know that learning is much more complicated than what you can learn from a simple A/B test. The purpose of this experience is to start deliberately and carefully exploring how AI impacts your learning.

You will do this A/B test as follows:

  1. There are two learning units in the “What is AI?” theme:
    1. Data visualization: LO8 and LO9
    2. Probability: LO10 and LO11
  2. For one unit, you will commit to not using AI in any way, shape, or form to help you learn the LOs for that unit until after the first LO checkpoints that test those LOs.
  3. For the other unit, you will plan to use AI to help you learn the material.

The units are as follows:

  • Data Visualization
    • LO8 – Basics of data types, chart types, what kind of data types best match a chart type, how to read a chart, and common ways charts can be ineffective or misleading.
    • LO9 – Create charts using Excel and convert/transform data in a spreadsheet into a format that enables you to create the chart you want.
  • Probability
    • LO10 – Basics of probability, such that given a scenario, you can calculate the probability of that scenario happening.
    • LO11 – Conditional probability, such that given a conditional probabilistic scenario, you can calculate the probability of that scenario happening.

Additional Optional Readings if you are interested

References

The text above was written based on the following materials:

  1. Narayanan, A., & Kapoor, S. (2024). AI Snake Oil. Penguin Press.
    1. Vehicle metaphor came from this book
  2. Bender, E. M., & Hanna, A. (2025). The AI Con: How to Fight Big Tech’s Hype and Create the Future We Want. Harper.
    1. Much of the terminology, as well as the discussion of anthropomorphization, comes from this book
  3. Bender, E. M., & Hanna, A. (Hosts). (2023–present). Mystery AI Hype Theater 3000 [Podcast]. Distributed AI Research Institute.