
AI Study Guide – A Cautionary Tale

By: Stephen Toback

Dall-e 3/Bing Image Creation – Prompt: Draw me a picture in a classical Italian style of a scribe working at a computer with a quill pen in a city that would appear in ancient Rome

I was asked to give a talk to students here at Duke about some “non-controversial” use cases for ChatGPT. One of the ideas I had was to take a complex reading and have ChatGPT build a study guide – create several questions and answers to help the students test their knowledge. I’m a great fan of the work of Mark Goodacre, who teaches here at Duke in the Department of Religious Studies on historical aspects of the New Testament. The first article I found was not a great fit, but the faculty member pointed me to an article that might be better for my testing.

As part of my talk, I created a guide to important aspects of creating a prompt, and I used this to create the prompt for this study guide:

Since I’m paying for ChatGPT Pro and it is now connected to the internet, I didn’t think linking to a PDF would be a problem. I was immediately proven wrong.

The fact that it said, “I can attempt to help…” is actually a negative in my opinion. Students, and all of us, should understand that, left to its own devices, ChatGPT is not an expert on anything. Its broad knowledge is actually an issue when trying to get specific information – especially specific information that is provided by an instructor for a course.

My next step was to try Bing Chat, since it has internet access and uses ChatGPT 4.0.

Note this is not a screenshot of the Bing interface as I’m having problems screencapturing it, but a copy and paste of the text into Word. At first I was delighted that it was able to read the PDF, but my Spidey-Senses™ were tingling because the response was almost instantaneous. I immediately said to myself, “there’s no way it read that entire PDF that quickly.” (Foreshadowing…)

I sent the result to the instructor to review (which I think would be great for students to do as well) and received this reply:

“This is really terrible! … What it has produced misunderstands a fairly fundamental element of the article (it thinks that the article supports the existence of Q when in fact it argues against it), and the examples it provides of “fatigue” are all bogus, and none are taken from the article. It’s a sort of hodge-podge of materials in the general area covered by the article, but mis-understood and mis-applied, and drawn from all over the internet rather than the article itself. I tried the same thing in BARD, and it was far, far better than Bing (perhaps a B- rather than a D-), but still had misunderstandings and errors.”

Well, there you go. It seems that my “guess” was correct: it didn’t really read the PDF, but said that it did. Horrifying, sincerely.

I then received a link to an HTML version and tried ChatGPT 4.0’s web interface again and the results were way better.

“…this is far and away the best. If a student had produced this, I’d probably give it a B, or maybe if I was feeling generous, a B+. So Chat GPT wins! It is free of errors, and on the whole it is pretty lucid. What would give me pause if it were a human are a couple of things that would show an incomplete understanding of the article. In particular, (4) “docile reproduction”. This is just another term for “editorial fatigue”; it is me varying the terminology, which Chat GPT has not picked up. But also, Chat GPT has not grasped the importance of the conclusion of the article, in which I argued that “fatigue” helps us with the most controversial question in Synoptic Problem studies, viz. the existence of Q. Interestingly, none of the three picked up that the last two to three pages of the article are in fact the most significant.
But the most interesting finding, to me, is that Chat GPT has done a good job of elucidating certain key elements in the article, and it has done so clearly, even if it has failed to get a really good sense of the total argument of the piece.”

Lessons learned… Stuff we already know:
  1. Trust but verify. If you are using ChatGPT to develop study materials or any materials for a course, check with your instructor.
  2. This is a quickly evolving technology that is amazing but very far from trustworthy or perfect.
  3. The more you know about a subject, the better your results will be (because you can keep the chatbot honest).
