ChatGPT in education: can you still use take-home exams and essays?

Student taking an exam at home

The new chatbot ChatGPT has been in the news frequently in recent months. Its impact on education is also much discussed, especially the consequences for assessment. Can you still have students write essays? What do we do with take-home exams that have been used before? Can we still reliably assess writing skills? Assessment and educational experts of Utrecht University (UU) answer these questions in this article. They also make suggestions for the design of take-home exams and writing assignments, and discuss the (im)possibility of checking for ChatGPT use after the exam has been administered.

Which type of assessment are we discussing?

If you want to use ChatGPT, you need internet access. This rules out any influence on the results of regular exams on location; the laptops in an exam hall have no internet connection. Therefore, this article focuses on take-home exams and writing assignments that students are expected to do at home.

For the immediate future: what can you as a lecturer do with the planned assessment for the coming period?

Prior to the exam: the design

Currently, ChatGPT has some limitations related to the functionality of the chatbot and the way it was trained. You can make use of these limitations when designing your take-home exam or essay assignment; you can find some suggestions below. But beware: the application is being developed and updated continuously, and there is no guarantee that these limitations will still apply in a week, a month or half a year.

Focus on content instead of structure and writing style when formulating questions.

ChatGPT makes mistakes in content, but presents them convincingly. You can therefore weigh content more heavily in your assessment, and structure and writing style (things the chatbot does very well) less heavily. This has a few disadvantages: firstly, you are tampering with your learning goals; secondly, the level of the assessment may change, i.e. become more difficult. This is evidently not desirable.

Have students reflect on current events in the exam. 

Currently, ChatGPT only has access to information up to 2021. If you design an essay assignment in which students must relate a theory to current events, the chatbot will not be able to produce a correct answer. Beware: this is circumventable by copy-pasting a recent article into the chatbot and asking it to take that article into account in its answer.

Focus your exam questions on recent articles or articles behind a paywall. 

The same limitation applies to recent articles, i.e. those written after 2021. The chatbot also does not use articles behind a paywall, provided they are not available elsewhere on the web. It is therefore worth checking whether ChatGPT can reproduce information from a given article. Unfortunately, this method too can be circumvented by copy-pasting the article into the chatbot.

Ask for personal reflection in your exam, or ask questions on the writing process. 

ChatGPT can generate personal reflections, but these remain generic and will evidently not fit the person submitting them.

State on the exam (and in the course manual and lecture slides) that students must submit their own work. 

Indicate that they must properly reference their sources. Also indicate which tools they may use for the assessment (e.g. statistical tools such as SPSS) and which they may not (e.g. ChatGPT).

Tip: Enter the assignment/take-home exam into ChatGPT and see what the result is. But beware: the chatbot ‘constructs’ a unique answer every time, so students will not all generate the same result. If you click ‘regenerate response’ several times, you will get different variants. This should give you an idea of the chatbot’s capabilities. These generated examples can also serve as useful input for a calibration session.
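
If you would like to generate several variants in one go rather than clicking ‘regenerate response’ by hand, this can also be scripted against the OpenAI API. The sketch below is purely illustrative and not part of the article’s advice: it assumes you have the openai Python package installed and an API key in the OPENAI_API_KEY environment variable; the model name and the example exam question are placeholders, and answers from the API may differ somewhat from those of the ChatGPT web interface.

```python
# Minimal sketch: generate several answer variants to one exam question.
# Assumptions: the `openai` Python package (v1.x) is installed and the
# OPENAI_API_KEY environment variable is set. The model name and the
# exam question below are placeholders, not part of the original article.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

exam_question = (
    "Explain how attachment theory relates to a recent policy debate "
    "on childcare."  # hypothetical example question
)

# Generate three independent answers; like 'regenerate response', each run
# differs because sampling (temperature > 0) introduces variation.
for i in range(3):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model name; substitute your own
        messages=[{"role": "user", "content": exam_question}],
        temperature=0.8,
    )
    print(f"--- Variant {i + 1} ---")
    print(response.choices[0].message.content)
```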

After the assessment: checking?

Can you check whether students used ChatGPT? The short answer is: no, you currently cannot. Plagiarism checkers work by recognising exact passages of text, whereas ChatGPT generates new text. Besides plagiarism checkers, you could use proctoring software, but this means ending up in a cat-and-mouse game in which the proctoring software will likely lag behind. As such, there is currently no known way to prove the use of ChatGPT in an assessment.

You should also consider what you, as an examiner, could formally do with a mere probability that ChatGPT was used. During COVID it was suggested to complement take-home exams with spot-check oral exams, which we could also do now. However, then as now, it is difficult to determine what the consequences of a below-average performance on such an oral exam should be. Is it truly evidence that the student has not handed in their own work? We therefore currently advise lecturers to focus on the design phase of assessment when trying to counter the adverse effects of ChatGPT’s existence.

For the teaching of writing (theses and essays) there are some options for monitoring the writing process more closely, to see whether a student has handed in their own work. In supervision meetings you can discuss the line of reasoning in the submitted text and the steps the student took in their learning process, alongside the usual discussion of supervisor feedback. This can show whether the student has gained knowledge of the subject matter and has actually improved. It is also likely to benefit the learning process. Indicate in advance that this is how you will structure the supervision meetings; this can have a preventative effect.

Conclusion 

Exam committees and examiners of course want to know whether students have submitted their own work. Unreported use of a chatbot to generate texts is fraud, and students should be well aware of this. The design phase of assessment offers the most options for limiting the effects of the use of ChatGPT.

Should you change the EER?

The Education and Examination Regulations (EER, at least those of Utrecht University) do not need to be changed, as the current EER suffices. It states: "Fraud and plagiarism are defined as an action or omission on the part of students, which produces an incorrect representation of their own performance as regards their knowledge, skills, and understanding, which may result in the examiner no longer being able to assess the knowledge or ability of the students in a proper and fair manner." At most, you could mention by way of example that unreported use of ChatGPT will be marked as fraud/plagiarism.

Long term: what can you do in the design of assessment if you want to change your approach next year?

If you are going to change the design of your assessment, it is important to start with your learning goals. The type of learning goal determines which options you have when changing your assessment. Within this context we can distinguish between two types of learning goals:

1. Learning goals concerning writing skills, argumentative skills, etc., i.e. learning goals covering those aspects that ChatGPT does well.
2. All other learning goals.

1. Learning goals concerning aspects that ChatGPT is good at

By this we mean writing assignments given to test how a student constructs sentences, summarises, finds the key points of a text, creates structure, etc. These are exactly the aspects that ChatGPT excels at. Should a student commit fraud here, it is hard to detect and you cannot be sure that the student has mastered these learning goals.

Suppose that your course revolves around these learning goals and you want to be sure that a student masters these skills. Which options are available to you? You can have students write sections in a ‘controlled’ environment, as is the case when taking exams on location. Be sure to plan this well in advance. It is also possible to have them write sections during your class.

You can also weigh the content more heavily, as ChatGPT makes mistakes when it comes to content. Note: you are then changing the learning goals, and thus the level of difficulty of your assessment.

A few points for perspective:

• Students have always been able to use a ghost writer for take-home exams, essays and theses (for instance parents, brothers/sisters, etc.) to write or improve their texts for them. From this perspective, ChatGPT is perhaps just a more easily accessible form of ghost writer and, in the end, not that novel a concept.
• Most students understand that handing in work written by ChatGPT is considered fraud, and that they learn very little by doing so. Automatically assuming that all students will commit fraud when possible leads to a controlling strategy and eventually a cat-and-mouse game between plagiarism software and chatbots.
• Within higher education, skills such as constructing sentences, creating structure, and summarising are important building blocks that lead to advanced writing skills. For their thesis, students should be able to write texts that relate directly to their research, or in which information is integrated and critically reviewed. In the end, students will have to be able to write.

Tip: Have a discussion with your students on this topic. Make clear why they need to learn certain skills, and what they will miss out on if they rely purely on ChatGPT.

2. Learning goals that assess skills/knowledge other than those that ChatGPT excels at

Here we can consider learning goals that cover acquired knowledge and skills. The question you can ask is: is a written text the best method to assess this learning goal?

The report is a much-used form in higher education. It may be time that we collectively start thinking about different forms. If you want to know whether a student has mastered argumentation skills with regard to a specific topic, you could have them submit an argumentation scheme. When it comes to integration of and relations between concepts, a concept map is a good form. Oral presentations or vlogs are also good forms. It remains important to make proper assessment models for these forms (as it is for reports), for instance a rubric.

Tip: Check whether a written form is the best assessment form for your learning goals. If not, there are other forms you can use! Be sure to keep constructive alignment in mind, and consider what the changes may do to the level of difficulty of your assessment.

Discussion: Which questions have not yet been answered?

When discussing the use of ChatGPT in the article above, we are talking about generating a text and handing it in as your own work. This is of course not acceptable. However, a different use of ChatGPT is also possible: ChatGPT can be used as a tool, in the way that Google or Wikipedia are currently being used. The questions this raises are: which use do we find acceptable? Which use is unacceptable? Is less use acceptable in assessment than during learning?

And if we do allow its use (to a greater or lesser extent): what will students learn less well? In other words: which skills are we outsourcing? And is that a problem? Should students still learn to write? And what does this mean for the final attainment goals of our programmes?

There are also major ethical and moral objections to the use of ChatGPT, which you can read about in this article. To what extent do we facilitate or even encourage the use of this tool?

Finally, read how ChatGPT responds to these varied concerns.

Contact and advice

Do you have questions concerning this topic? Feel free to contact us.

With special thanks to: Davitze Könning, Alex Lodder and Marijne Wijnker (Educate-it)