Researchers at the University of Reading have tested how well AI performs in real exams by secretly submitting AI-written answers. The AI answers scored better than those of human students, and the professors marking them were almost never able to tell AI answers from human ones.
Researchers created 33 fake student identities and used them to submit unedited AI-generated answers. The answers were generated with ChatGPT-4 and submitted to online assessments for undergraduate psychology students. In total, researchers submitted AI answers to 63 questions in short-answer and essay form, and the professors marking the papers were not told about the research.
Institutes won’t go back to handwritten exams
The professors, who were unaware of the study, identified only one of the 33 exam submissions as AI-written. The other 32 went unspotted, and the AI submissions tended to outscore real students: 83% of them received higher marks than human students' work.
The study was led by Professor Etienne Roesch and Associate Professor Peter Scarfe. Scarfe said the findings are of international importance for understanding how AI will affect the integrity of academic assessments. He said,
“We won’t necessarily go back fully to handwritten exams – but the global education sector will need to evolve in the face of AI.”
Scarfe noted that most institutions have moved away from traditional assessment procedures in order to make assessments more equitable. Both professors described their findings as a “wake-up call for educators.”
AI essays have a low detection rate
The researchers revealed that AI essays were almost undetectable, as 94% of them did not raise concerns with the markers. The study, published in the journal PLOS One, said that even this low detection rate is likely to be an overestimate. It said,
“This is particularly worrying as AI submissions robustly gained higher grades than real student submissions.”
The study also noted that students can cheat using AI and get away with it. They might also get higher marks than honest students who did not use AI. For the study, AI-generated answers were submitted for first- to third-year modules through fake identities. AI outscored human students in the first- and second-year modules.
However, humans scored better in the third-year exams. Researchers said this is consistent with the belief that AI is not good at “abstract reasoning,” at least in its current state.
AI still lacks the ability to reason
The study also noted that AI’s ability to reason will improve with time, and that its output will become harder to detect. This will make it more difficult to maintain academic integrity. The researchers said that the findings could spell the end of take-home or unsupervised exams.
Prof Roesch said that the education sector needs to agree on how students can use AI in their work. He added that the same applies to the use of AI in other areas, in order to maintain trust across society.
Pro Vice Chancellor of Education at Reading, Professor Elizabeth McCrum, said that the university is limiting take-home exams. She said that the university is developing alternative assessments that require students to apply knowledge in real-life situations, “often workplace-related scenarios.”
McCrum clarified that students will be allowed to use AI for some assessments so that they know how to use it ethically. However, other assignments will not require the use of artificial intelligence tools. The professor said this will help increase students’ AI literacy and prepare them for modern workplace requirements.
Cryptopolitan reporting by Aamir Sheikh