Exam papers and tests: Can we detect texts generated by ChatGPT?

· 15 min read

Whenever we give presentations, we are asked the same question: "How do we find out if students have turned in an assignment written by ChatGPT?" In particular, people want to know what to do about this summer's written exams.

Briefly about the challenge

Artificial intelligence is currently being built into many different tools, including Microsoft Copilot, Google Workspace, and Windows 11. Individual students will soon find it difficult to avoid. Although its use is forbidden in exam assignments, some students may not even know that they have used it, because artificial intelligence is everywhere.

In upper secondary schools, the rules from the Ministry of Education are quite clear: students have to submit their exams, and it is up to the schools to enforce the rules. Whether a school hands out computers with closed-off internet access to all students, buys a monitoring program, or places an exam invigilator behind every student is entirely up to the individual school.

However, education is not the only area where it is a problem that we cannot tell whether texts, images, and videos come from artificial intelligence. The technology can also be abused to generate fake news and spam. A lot of effort is therefore being spent on working out what has been produced by artificial intelligence and what has not.

Tools to find texts written by ChatGPT

We previously wrote about this topic in the article "Plagiarism and ChatGPT – How to detect the use of AI in written assignments?". Since we wrote the article, a lot has happened with the language models. They have become much more effective at writing human-like texts, and the detection tools have difficulty keeping up.

When Turnitin writes that it can detect texts written by ChatGPT with 98 percent certainty, one should be wary. It may well be that the program is good at catching English texts copied directly from ChatGPT. However, as we will show in this article, there are many ways in which texts can be rewritten. It is also telling that OpenAI's own tool, AI Text Classifier, according to OpenAI's own studies, correctly identifies only 26% of AI-written texts (true positives), while 9% of human-written texts are incorrectly labeled as AI-written (false positives).
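To get a feel for what the 26% and 9% figures mean in a classroom, here is a minimal sketch of the arithmetic in Python. The function name `flagged_breakdown`, the batch of 1,000 texts, and the 10% share of AI-written texts are our own assumptions for illustration; only the two rates come from OpenAI's figures, and it is also an assumption that those rates carry over unchanged to student assignments.

```python
def flagged_breakdown(n_texts, ai_share, tpr=0.26, fpr=0.09):
    """Estimate what a detector's flags consist of.

    n_texts  -- number of submitted texts (hypothetical)
    ai_share -- fraction of texts that really are AI-written (hypothetical)
    tpr      -- share of AI-written texts the tool flags (26% per OpenAI)
    fpr      -- share of human-written texts the tool flags (9% per OpenAI)
    """
    ai_texts = n_texts * ai_share
    human_texts = n_texts - ai_texts

    ai_flagged = ai_texts * tpr        # correctly flagged (true positives)
    human_flagged = human_texts * fpr  # wrongly flagged (false positives)

    flagged = ai_flagged + human_flagged
    # Share of flagged texts that really are AI-written.
    precision = ai_flagged / flagged if flagged else 0.0
    return ai_flagged, human_flagged, precision


# Assumed scenario: 1,000 texts, of which 10% are AI-written.
ai_flagged, human_flagged, precision = flagged_breakdown(1000, 0.10)
print(f"AI-written texts flagged:     {ai_flagged:.0f}")
print(f"Human-written texts flagged:  {human_flagged:.0f}")
print(f"Share of flags that are right: {precision:.0%}")
```

Under these assumptions, the tool flags about 26 AI-written texts and about 81 human-written ones, so only around one in four flagged texts was actually written by AI. A different share of AI-written texts changes the numbers, but as long as most students write their own texts, the false positives make up a large part of the flags.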

The Ministry of Children and Education has also written that you should not use these tools:

"The agency is not aware that there is a screening tool that can provide a reliable statement about the use of AI. This is because there is a high proportion of so-called 'false positive' outcomes in the known screening tools. The teacher should therefore not use such tools, as they compromise the legal certainty of examinees." (BUVM 2023)

At the same time, teachers can become biased if they rely too much on the results of the various tools, and they may then falsely accuse students of plagiarism. We can easily come to suspect pupils and students on a flimsy basis, without conclusive proof.