ChatGPT in education: What is ChatGPT capable of, and what are its limitations?

A robot hand and a human hand reaching toward each other

Much has been said and written about OpenAI's tool ChatGPT. The tool has both fans and opponents, and opinions vary. But there is no doubt that it will have consequences for education. To understand what these consequences are (both positive and negative), it is useful to understand how ChatGPT works, what it can do, and what it cannot.

What is ChatGPT?

ChatGPT is a chatbot based on a Large Language Model (LLM). That means you can enter a question or instruction (a prompt) and ChatGPT will write a text in response. This ranges from writing a limerick to writing scientific articles. To generate text, ChatGPT does not need to understand the prompt (or the answer). Instead, the prompt gives the chatbot a context within which it uses probability to determine which words are most likely to follow one another, forming sentences. It generates new, unique text and does not show existing texts (as ordinary search engines do). It is therefore not possible to check the authenticity of these texts with plagiarism software.
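The idea of choosing words by probability can be illustrated with a very small sketch. It is not how ChatGPT is implemented: the training sentences, the function names, and the restriction to looking only at the previous word are all assumptions made for illustration, whereas a real LLM works on a far larger scale and takes much more context into account.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up training text (an assumption for this sketch);
# a real LLM is trained on vastly more data.
CORPUS = (
    "the student writes an essay . "
    "the student reads a book . "
    "the teacher reads an essay . "
    "the teacher grades the essay ."
)

def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    follow_counts = defaultdict(Counter)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts

def next_word_probabilities(model, word):
    """Turn the follow-up counts for one word into probabilities."""
    counts = model[word]
    total = sum(counts.values())
    return {candidate: count / total for candidate, count in counts.items()}

def generate(model, start_word, max_words=8, seed=0):
    """Extend a prompt one word at a time by sampling from the probabilities."""
    rng = random.Random(seed)
    words = [start_word]
    for _ in range(max_words - 1):
        probabilities = next_word_probabilities(model, words[-1])
        if not probabilities:  # no known continuation for this word
            break
        candidates = list(probabilities)
        weights = list(probabilities.values())
        words.append(rng.choices(candidates, weights=weights)[0])
    return " ".join(words)

model = train_bigram_model(CORPUS)
print(next_word_probabilities(model, "student"))  # {'writes': 0.5, 'reads': 0.5}
print(generate(model, "the"))
```

The generated sentence looks grammatical, yet it can combine fragments into a sentence that never occurred in the training text. The sketch has no notion of meaning; it only tracks which word tends to follow which.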

What is ChatGPT capable of?

ChatGPT's capabilities can be roughly divided into three categories.   

  1. You can ask a question, and ChatGPT answers it. This usage is similar to entering a query in Google. The difference is that Google returns sources from which you have to extract the information yourself. ChatGPT is more efficient and produces a written-out answer; the disadvantage is that it is not easy to verify whether the answer is true and what it is based on.  
  2. You can ask it to generate a piece of text, for example an outline or a reflection on a particular topic. The request can specify the length and writing style of the piece. You can also ask for sources to be cited. If you simply ask for sources, it may include fabricated ones, but you can specify that you want only 'real sources'. In that case, the chatbot will usually provide existing sources, but be aware that these will still not always be accurate.  
  3. You can enter pieces of text and ask the chatbot to summarise, paraphrase, translate, remove spelling mistakes, give feedback on them, etc.   

ChatGPT predicts well what kind of answer we want to get. So even though the chatbot does not fully understand the questions and the generated answers, it is still capable of creating a logical structure, as is shown in this study, in which ChatGPT scores a 7.8 on a Dutch secondary school (vwo) exam on reading comprehension in English.

Moreover, text generation is only one of the software's capabilities. Within the OpenAI playground it is possible to convert a prompt into software code or to have an illustration generated. Conversely, you can enter code and ask what it does. See this example for an illustration.  

Additionally, part of the source code is publicly available, so various add-ons are being created, for example to link ChatGPT to Excel, Gmail, R, YouTube or WhatsApp. This gives an indication of the broad implications we can expect.   

What can ChatGPT not do?   

Incorrect referencing

ChatGPT uses probability theory to compose an answer based on numerous sources. It does this when compiling pieces of text, as well as when compiling a simple factual answer to a question. This means you can never identify all of the sources that were used for the output. This makes ChatGPT's answers non-transparent and difficult to verify, and makes proper and complete source citation (and compliance with copyright) impossible. If you ask ChatGPT to provide the sources used to generate an answer, the chatbot often fabricates non-existent sources. If you ask for "real" and "peer-reviewed" sources, you will usually get existing and applicable sources, but they will not contain all the data on which the answer is based. These are obviously major drawbacks within the scientific community.  

Falsehoods and biases

Another major drawback of ChatGPT lies in the training data used to develop the model. These data contain biases, and because the chatbot's output depends on its training data (the old "garbage in, garbage out" principle), there is a good chance that ChatGPT's answers will reflect them. For example, ChatGPT's predecessor (GPT-3) showed gender-stereotypical associations in its answers. OpenAI, the company behind ChatGPT, uses filters and human verification to remove obvious falsehoods and the most severe biases. However, this remains an inherent limitation.  

Outdated data

The current version of ChatGPT (January 2023) has only been trained on data up to September 2021. This means that the chatbot has no knowledge of more recent events or information. This limitation will probably be short-lived, as developments are rapid. 

Calculation errors

Finally, in addition to content errors, the chatbot also makes computational and logic errors. Again, this has to do with how ChatGPT generates answers. It is a language model, not a calculator: it generates text in response to mathematical questions, so the outcome is a number that looks plausible given the patterns in human-written text, not the result of an actual calculation.
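The difference between predicting a number and calculating one can be made concrete with a deliberately crude sketch. The example sums and the function names are assumptions for illustration only, and ChatGPT is far more sophisticated than this frequency-based predictor, but the failure has the same shape: the answer comes from patterns in text that was seen before, not from arithmetic.

```python
from collections import Counter

# Made-up examples of sums as they might appear in written text
# (an assumption for this sketch).
SEEN_TEXT = ["2 + 2 = 4", "2 + 2 = 4", "2 + 3 = 5", "10 + 10 = 20"]

def predict_answer(question, seen_text=SEEN_TEXT):
    """Answer by pattern matching: pick the answer seen most often."""
    exact = Counter()    # answers seen after exactly this question
    overall = Counter()  # answers seen after any question
    for line in seen_text:
        left, right = line.split(" = ")
        overall[right] += 1
        if left == question:
            exact[right] += 1
    # Fall back to the most common answer overall for an unseen question.
    source = exact if exact else overall
    return source.most_common(1)[0][0]

def calculate(question):
    """Answer by actually adding the two numbers."""
    left, right = question.split(" + ")
    return str(int(left) + int(right))

print(predict_answer("2 + 2"), calculate("2 + 2"))      # 4 4
print(predict_answer("17 + 26"), calculate("17 + 26"))  # 4 43
```

For a sum it has seen before, the predictor happens to be right. For an unseen sum it still returns a confident answer, which is simply the number it encountered most often.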

Contact and advice

Do you have questions concerning this topic? Feel free to contact us.

With special thanks to: Alex Lodder and Marijne Wijnker (Educate-it)

Sources

We have been inspired by the following sources when writing this text: