Looking for ChatGPT
Turnitin, a company that looks for similar word sequences in student assignments, says that it can now detect ChatGPT writing. (Stuff, RNZ)
The company is 98% confident it can spot when students use ChatGPT and other AI writing tools in their work, Turnitin’s Asia Pacific vice president James Thorley said.
“We’re under 1% in terms of false positive rate,” he said.
It’s worth looking at what that 1% actually means. It appears to mean that, of the material they tested that was genuinely written by students, only 1% was classified as written by ChatGPT. That sounds pretty good, and it’s a substantial achievement if it’s true. But it doesn’t mean that only 1% of accusations from the system are wrong. The proportion of false accusations depends on how many students are really using ChatGPT: if none of them are, 100% of accusations will be false; if all of them are, 100% of accusations will be true.
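To see how the share of false accusations depends on usage, here’s a minimal Python sketch. The 1% false positive rate is Turnitin’s claim; the 98% detection rate for genuine ChatGPT use is an assumption loosely based on their “98% confident” figure, and the usage proportions are hypothetical:

```python
# Share of accusations that are false, as a function of how many
# students actually use ChatGPT (Bayes' rule / base-rate effect).
fpr = 0.01   # P(flagged | human-written): Turnitin's claimed rate
tpr = 0.98   # P(flagged | ChatGPT-written): an assumption

for prevalence in [0.0, 0.05, 0.2, 0.5, 1.0]:
    flagged = prevalence * tpr + (1 - prevalence) * fpr
    false_share = (1 - prevalence) * fpr / flagged
    print(f"{prevalence:4.0%} using ChatGPT -> "
          f"{false_share:5.1%} of accusations are false")
```

With no students using ChatGPT the false share is 100%, and with all of them using it the false share is 0%, matching the two endpoints above; in between, the answer depends entirely on the prevalence.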
What does the 1% rate mean for a typical student? An average student might hand in 4 assignments per course, for 4 courses per semester, over two semesters per year: 96 assignments in a three-year degree. A false positive rate of 1 in 100 then means an average of about one false accusation for each innocent student, which doesn’t sound quite as satisfactory.
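A minimal Python sketch of that arithmetic, assuming the 1% rate applies independently to each assignment:

```python
# Accumulated false positives for one innocent student over a degree,
# assuming flags on different assignments are independent.
fpr = 0.01
assignments = 4 * 4 * 2 * 3   # per course x courses x semesters x years = 96

expected_false_flags = assignments * fpr
p_at_least_one = 1 - (1 - fpr) ** assignments
print(f"Expected false flags: {expected_false_flags:.2f}")   # about 0.96
print(f"P(at least one false flag): {p_at_least_one:.1%}")   # about 62%
```

So under these assumptions a majority of innocent students would be flagged at least once during their degree.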
The average is likely to be misleading, though. Some people will be more likely than others to be accused. In addition to knowing the overall false positive rate, we’d want to know the false positive rate for important groups of students. Does using a translator app on text you wrote in another language make you more likely to get flagged? Using a grammar checker? Speaking Kiwi? Are people who use semicolons safe?
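A simulation sketch of why the average misleads. The subgroup sizes and rates below are entirely hypothetical; they are chosen only so the overall average stays at 1%, with one small group (say, students writing through a translator app) facing a much higher per-assignment rate:

```python
import random

# Hypothetical: 90% of students face a 0.2% per-assignment false positive
# rate, 10% face 8.2%; the weighted average is still 1%.
random.seed(1)
assignments = 96
low, high = 0.002, 0.082

def flags(rate):
    """Number of false flags across one student's assignments."""
    return sum(random.random() < rate for _ in range(assignments))

students = [flags(low) for _ in range(9000)] + [flags(high) for _ in range(1000)]
accused = sum(n > 0 for n in students)
multiple = sum(n > 1 for n in students)
print(f"Accused at least once: {accused / len(students):.1%}")
print(f"Accused more than once: {multiple / len(students):.1%}")
```

Under these made-up numbers, nearly everyone in the high-rate group is accused, most of them repeatedly, while the overall average rate is unchanged; the burden of the 1% falls on a few groups rather than evenly.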
Turnitin emphasize, as they do with plagiarism, that they don’t want to be blamed for any mistakes — that all their tools do is raise questions. For plagiarism, that’s a reasonable argument. The tool shows you which words match, and you can then look at other evidence for or against copying. Maybe the words are largely boilerplate. Maybe they are properly attributed, so there is copying but not plagiarism. In the other direction, maybe there are text similarities beyond exact word matches, or there are matching errors — both papers think Louis Armstrong was the first man on the moon, or something. With ChatGPT there’s none of this. It’s hard to look for additional evidence in the text, since there is no real way to know whether something you see is additional or is part of the evidence that Turnitin already used.