Why I'm Not Worried About ChatGPT

ChatGPT has been all over my newsfeed lately, with a considerable amount of hype. In particular, many are wondering or even worrying whether the emergence of this technology will threaten jobs with moderate to high education requirements. See for example “How ChatGPT Will Destabilize White-Collar Work” (The Atlantic), where Annie Lowrey leads with “In the next five years, it is likely that AI will begin to reduce employment for college-educated workers.” I do not share these views. In fact, I am somewhat underwhelmed by the threat of ChatGPT for a number of reasons. Since this topic has come up a few times for me lately, I will write down my thoughts here so I can reference them more easily.

ChatGPT Cannot Think

The first issue I take with many of the AI hype articles is that despite what the news coverage may imply, ChatGPT cannot think. To be honest, when I see articles talking about ChatGPT as “intelligent” or “thinking,” the first thing that comes to mind is this SMBC from 2011-08-17:

In my view, ChatGPT is a lot like this parrot - though I do think it is fundamentally different in one respect: ChatGPT is not “conscious” and does not “think” in a meaningful way. Despite the many advances made, artificial intelligence (AI) functions differently than natural intelligence (NI), and in any case ChatGPT is not designed to “think.”

Before giving a big picture view of how a large language model like ChatGPT works, I want to illustrate the limited flexibility of AI with an example from image recognition. An NI can readily distinguish between what is in the foreground and background of an image. Think of an image like this:

A human will have no problem distinguishing between the insect in the image, the flower it is on, and the foliage in the background as distinct objects in different planes. This holds true even if the individual is not familiar with the particular plant or insect in the image. If given additional images of either the insect on a different background or the same background without the insect, we would not mistake the plant for the insect or the other way around.

Now consider an AI model trained to recognize insects. The algorithm doesn’t have a concept of “insect” or “plant,” per se. Rather, it notices patterns in images that are labeled “insect” or labeled with a particular insect. The pattern it learns does not depend on it having a concept of “insect.” What that means in practice is that our model might learn that the background is equally or even more important than the foreground. If we train our data set with bees on flowers, but not flowers without bees, we may end up with a model that declares flower photos “bees.” This phenomenon is well known in image recognition, and people are actively working on methods to address it. But it nicely illustrates how AI is not “smart,” and humans need to do a lot of heavy lifting to get the AI algorithm to perform as intended, even if the application domain is relatively limited. For more information on this application, see this article from GradientScience.
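To make this concrete, here is a toy sketch of the effect (my own illustration, not the experiment from the GradientScience article). The “images” are reduced to two numeric features, one for the insect and one for the flower background; because the two are perfectly correlated in the training data and the background cue happens to be cleaner, the classifier leans on the background and labels empty flowers as “bees”:

```python
# Toy illustration of background bias in a classifier (synthetic data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Feature 0: "insect present" signal; Feature 1: "flower background" signal.
insect = rng.integers(0, 2, n)         # 1 = a bee is actually present
background = insect.copy()             # in training, bees only ever appear on flowers
X_train = np.column_stack([insect + rng.normal(0, 0.3, n),      # noisier insect cue
                           background + rng.normal(0, 0.1, n)])  # cleaner background cue
y_train = insect

model = LogisticRegression().fit(X_train, y_train)
print("learned weights (insect, background):", model.coef_)

# Test on flowers *without* bees: the background cue alone fools the model.
X_test = np.column_stack([np.zeros(200) + rng.normal(0, 0.3, 200),
                          np.ones(200) + rng.normal(0, 0.1, 200)])
print("fraction of empty flowers labeled 'bee':", model.predict(X_test).mean())
```

Nothing in the model “knows” what a bee is; it only ever sees correlations between features and labels, so a spurious background correlation is just as good to it as the real thing.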

How Does It Work?

With this background, let’s get an overview of how models like ChatGPT work. A good summary of the techniques involved is detailed in this post by AssemblyAI. In simple terms, a model is exposed to large amounts of data in order to learn about the structure of words and how they are arranged in sentences. In principle, this is not too different from the text prediction feature on your phone while texting. But on its own, this methodology only works to help produce coherent, or seemingly coherent, sentences by completion. Masked language modeling is a method used to help the model learn about syntax as well, to improve the output.
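As a small illustration of the masked-language-modeling idea, the following sketch uses the Hugging Face transformers library with the off-the-shelf bert-base-uncased model (both are my choices for illustration; ChatGPT itself is a different and much larger model). The model fills in a blanked-out word purely from the surrounding context:

```python
# Minimal masked-language-modeling sketch (assumes the `transformers`
# package is installed; bert-base-uncased is just a small example model).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# The model predicts the hidden word from context alone.
for candidate in fill("The parrot [MASK] on my shoulder."):
    print(f"{candidate['token_str']:>10}  score={candidate['score']:.3f}")
```

Each candidate word comes with a score for how well it fits the context; nothing in the procedure asks whether the completed sentence is true.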

What is new with ChatGPT is that in addition to labeled training material, it utilizes human feedback to improve its output. Deep down, AI models can be thought of as optimizing some (very complicated) function. This goal function need not necessarily be written down explicitly. OpenAI uses a method where a model gives two possible outputs for a prompt, and then a human judges which is “better,” somewhat similar to when an optometrist asks you whether “1” or “2” is better. It then uses this feedback to improve its output iteratively. See this blog post from OpenAI where they use this methodology to teach a simulated figure to do a backflip.
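To sketch the idea behind those pairwise comparisons (my own simplified illustration, not OpenAI’s actual code): the human judgments are commonly distilled into a reward model trained to score the preferred answer higher than the rejected one, for example with a Bradley-Terry style loss:

```python
# Hedged sketch of a pairwise-preference ("reward model") loss in PyTorch.
# This illustrates the principle behind learning from human comparisons,
# not OpenAI's implementation.
import torch

def preference_loss(score_preferred: torch.Tensor,
                    score_rejected: torch.Tensor) -> torch.Tensor:
    # -log sigmoid(r_preferred - r_rejected): minimized when the model
    # scores the human-preferred answer above the rejected one.
    return -torch.nn.functional.logsigmoid(score_preferred - score_rejected).mean()

# Toy usage: the "reward model" here is just a linear layer over made-up features.
reward_model = torch.nn.Linear(8, 1)
features_preferred = torch.randn(4, 8)   # stand-ins for embedded answers
features_rejected = torch.randn(4, 8)

loss = preference_loss(reward_model(features_preferred),
                       reward_model(features_rejected))
loss.backward()
print(loss.item())
```

The language model is then tuned to produce answers this learned reward model scores highly.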

ChatGPT uses a three-step process for reinforcement learning from human feedback. You can already imagine some of the issues that can arise from using this method. For one, if human feedback is used to train the model, then we can expect the model to reflect the thoughts and opinions of the labelers to some degree. Labelers may be mistaken and might not be experts in whatever topic they are reviewing. They may be fundamentally mistaken or biased about what we would consider high school-level knowledge.1 This is on top of the issues with the large amount of source text used in the initial training phase. These source texts may vary wildly in style and accuracy. Even humans reviewing an article may not be able to distinguish fact from opinion, let alone a language model using many source texts as input. Which leads us to what I see as a main problem for ChatGPT.

Factual Inaccuracies

Despite the confidence exuded by ChatGPT’s output, it will readily produce a number of factual inaccuracies or give bad advice when explaining how to do tasks. See for example Avram Piltch’s “I Asked ChatGPT How to Build a PC. It Told Me to Break My CPU” (Tom’s Hardware), where ChatGPT gives computer assembly instructions that are potentially damaging to the hardware.

Or this article (ToolGuyd), where Stuart asked ChatGPT to recommend a cordless power drill. ChatGPT made three recommendations. In explaining its recommendations, it gave several tech specs for the recommended products. The only problem is that it got several of these items wrong. It made mistakes about what type of drill a particular model was, whether the battery is included in the particular SKU it listed or not, and how many BPM the model delivers. It also recommended a discontinued model.

As a third example, consider this post where economics professor Bryan Caplan lets ChatGPT take one of his more recent midterms. It’s quite detailed and includes the questions, answers, and grading rubric Bryan used. He gave ChatGPT a D on this exam, substantially below the average grade human students in the class received.

I would like to highlight that my argument isn’t that ChatGPT gets everything wrong - it doesn’t. It can even perform exceptionally well at certain tasks. See this white paper by Christian Terwiesch grading ChatGPT’s attempt at the final exam of a Wharton Business School MBA core course for just one example. A little googling will quickly lead to other examples, such as it passing law school exams or giving decent answers to tech sector interview questions.

My concern is that it sounds very confident in its answers, but it is not always trivial for the average person to verify whether or not ChatGPT’s output is trustworthy. As Rupert Goodwins put it, ChatGPT is “a Dunning-Kruger effect knowledge simulator par excellence.” And that’s a problem if people decide to just trust it to produce truth, when ChatGPT has no idea what “truth” is. It’s important to know that OpenAI is aware of this and even says so on its FAQ page:

  1. Can I trust that the AI is telling me the truth?

a. ChatGPT is not connected to the internet, and it can occasionally produce incorrect answers. It has limited knowledge of world and events after 2021 and may also occasionally produce harmful instructions or biased content.

We’d recommend checking whether responses from the model are accurate or not. If you find an answer is incorrect, please provide that feedback by using the “Thumbs Down” button.

In my opinion this is reasonable and to be expected. I think some people may get too excited and feel too confident in this technology when it just isn’t as reliable as many would wish at this stage. And for those reasons, I don’t think it’s coming for our jobs any time soon.

Note: If you use ChatGPT, be careful to not give it any sensitive information. OpenAI isn’t making this very expensive model available to you for free out of the goodness of their hearts. They’re using your interaction with it to further train the model.

Update 3/21: There is a good article in the New Yorker regarding my point that ChatGPT doesn’t “think.” This is contra Daniel Miessler’s argument that ChatGPT and similar models exhibit “understanding.”

Update 4/4: And here is a good post by Michael Huemer on this issue.


  1. For a good review of the many ways in which typical adults are uninformed and mistaken about issues contra accepted expert opinion, see: B. Caplan, The Myth of the Rational Voter: Why Democracies Choose Bad Policies, Princeton University Press, Princeton, NJ, 2007. And B. Caplan, The Case against Education: Why the Education System Is a Waste of Time and Money, Princeton University Press, Princeton, NJ, 2019. ↩︎
