What is ChatGPT capable of, and what are its limitations?
Much is said and written about the OpenAI tool ChatGPT. The tool has both fans and opponents, with varying opinions. There is no doubt, however, that it will have consequences for education. To understand what these consequences are (both positive and negative), it is useful to understand how ChatGPT works, what it can do, and what it cannot.
What is ChatGPT?
ChatGPT is a chatbot based on a Large Language Model (LLM). That means you can ask a question (a prompt) and ChatGPT will generate a text in response. This ranges from writing a limerick to writing scientific articles. To generate text, ChatGPT does not need to understand the prompt (or the answer). Instead, the prompt gives the chatbot a context within which it uses probability to determine which words are most likely to follow one another, forming sentences. It generates new, unique text and does not return existing texts (as ordinary search engines do). It is therefore not possible to check the authenticity of its texts with plagiarism software.
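The principle of probability-based text generation can be illustrated with a toy model. The word table and probabilities below are invented purely for illustration; a real LLM works with billions of learned parameters rather than a small lookup table, but the core idea of repeatedly sampling a likely next word is the same.

```python
import random

# Invented toy "language model": for each word, the probabilities of
# possible next words. A real LLM learns these from vast training data.
bigram_probs = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "exam": 0.2},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.4, "ran": 0.6},
    "exam": {"ended": 1.0},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
    "ended": {"<end>": 1.0},
}

def generate(seed=0):
    """Build a sentence by repeatedly sampling the next word
    according to the model's probabilities."""
    rng = random.Random(seed)
    word, sentence = "<start>", []
    while True:
        options = bigram_probs[word]
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == "<end>":
            return " ".join(sentence)
        sentence.append(nxt)
        word = nxt

print(generate())
```

Note that the model never "understands" the sentence it produces; every word is chosen only because it is a probable continuation of the previous one, which is also why plausible-sounding but false output can emerge.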
What is ChatGPT capable of?
ChatGPT's capabilities can be roughly divided into three categories.
- You can ask a question, and ChatGPT answers it. This usage is similar to entering a query into Google. The difference is that Google returns sources from which you have to extract the information yourself. ChatGPT is more efficient and produces a written-out answer; the disadvantage is that it is not easy to verify whether the answer is true and what it is based on.
- You can ask it to generate a piece of text, for example an outline or a reflection on a particular topic. The prompt can specify the desired length and writing style. You can also ask for sources to be cited. If you simply ask for sources, the chatbot may include fabricated ones, but you can specify that you want only 'real' sources. In that case, it will provide real sources, though be aware that these will still not always be accurate.
- You can enter pieces of text and ask the chatbot to summarise, paraphrase, translate, remove spelling mistakes, give feedback on them, etc.
ChatGPT predicts well what kind of answer we want to get. So even though the chatbot does not fully understand the questions and the generated answers, it is still capable of creating a logical structure, as is shown in this study in which ChatGPT scored a 7.8 on a Dutch secondary school (vwo) exam on English reading comprehension.
Moreover, text generation is only one of the software's capabilities. Within OpenAI's Playground it is possible to convert a prompt into software code or to have an illustration generated. Conversely, you can enter code and ask what it does. See this example for an illustration.
Additionally, part of the source code is publicly available, which is why various add-ons have been created, for example to link ChatGPT to Excel, Gmail, R, YouTube or WhatsApp. This gives an indication of the broad implications we can expect.
What can ChatGPT not do?
ChatGPT uses probability theory to compose an answer based on numerous sources. It does this when compiling pieces of text, as well as when compiling a simple factual answer to a question. This means you can never identify all the sources that were used for the output. This makes ChatGPT's answers non-transparent and difficult to verify, and makes proper and complete source citation (and compliance with copyright) impossible. If you ask ChatGPT to provide the sources used to generate an answer, the chatbot often fabricates non-existent sources. If you ask for 'real' and 'peer-reviewed' sources, you will usually get existing and relevant sources, but these will not cover all the data on which the answer is based. These are obviously major drawbacks within the scientific community.
Falsehoods and biases
Another major drawback of ChatGPT lies in the training data used to develop the model. There are biases in the data, and following the old "garbage in, garbage out" principle, the chatbot's output depends on that training data, so there is a good chance that ChatGPT's answers will reflect these biases. For example, ChatGPT's predecessor (GPT-3) was shown to produce gender-stereotypical associations in its answers. OpenAI, the company behind ChatGPT, uses filters and human verification to remove obvious falsehoods and the most severe biases. However, this remains an inherent limitation.
The current version of ChatGPT (January 2023) is trained only on data through September 2021. This means that the chatbot cannot interpret more recent information. This limitation will probably be short-lived; developments are rapid.
Finally, the chatbot also makes computational and logical errors, in addition to content errors. Again, this has to do with how ChatGPT generates answers. It is a language model, not a calculator: in response to mathematical questions it generates text, and the numerical outcome is a plausible-sounding number shaped by patterns in its training data rather than the result of an actual calculation.
- Email: firstname.lastname@example.org
- Phone: 06 28 801 161
With special thanks to: Alex Lodder and Marijne Wijnker (Educate-it)
When writing this text, we were inspired by the following sources:
- M. Fouque, A. Calabrese, M. Diallo, N. Vey, J.-N. Monette, J.-P. Chevallet, & L. Besacier. (2021). HAL-03913837: The GPT-3 OpenAI Large Model: An Overview and Some Experiments. HAL. https://hal.science/hal-03913837/
- S. Naushad, R. Kumar, & J. De Winter. (2022). ChatGPT for Next Generation Science Learning. ResearchGate. https://www.researchgate.net/publication/367281552_ChatGPT_for_Next_Gen…
- J. De Winter & M. Farooq. (2022). Can ChatGPT pass high school exams on English Language Comprehension? ResearchGate. https://www.researchgate.net/profile/Joost-De-Winter/publication/366659…
- J. De Winter. (2022). ChatGPT: Unlocking the Future of NLP in Finance. ResearchGate. https://www.researchgate.net/publication/367129318_ChatGPT_Unlocking_th…
- H. Ismail, & M. Y. Kadhum. (2021). The study of the effect of using a language model in teaching writing skills. Iraqi Journal of Curriculum and Management Sciences, 8, 258-266. https://journal.esj.edu.iq/index.php/IJCM/article/download/539/258
- M. Farooq & J. De Winter. (2021). The Future of Natural Language Processing: A Study of OpenAI GPT-3 Model. Social Science Research Network. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4312418
- M. Reagan. (2023). Not All Rainbows and Sunshine: The Darker Side of ChatGPT. https://towardsdatascience.com/not-all-rainbows-and-sunshine-the-darker…