The race is on to develop tools and techniques that can detect AI-generated copy. Educators and institutions of higher ed are especially exposed to the flood of machine-generated content. Now, a group of researchers at Stanford may be one step closer to a useful solution.
The use of large language models (LLMs) is skyrocketing, and with good reason: the text they produce is really good …
Recently, a team of researchers at Stanford proposed a new method called DetectGPT, which aims to be among the first tools to combat generated text in higher education. The method is built on the observation that text generated by LLMs typically sits in regions of negative curvature of the model’s log probability function; in other words, machine-generated passages tend to lie near local maxima, where slightly rewording the text reliably lowers its log probability. From this insight, the team developed a new criterion for judging whether text is machine-generated, one that doesn’t rely on training an AI or collecting large datasets to compare the text against. We can only guess this means human-written text doesn’t cluster in those negative-curvature regions, but the source is not clear on this.
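To make the curvature idea concrete, here’s a minimal Python sketch of the statistic it implies: score the original passage, score a handful of perturbed copies, and look at the gap. This is our own illustration, not the authors’ code; in particular, the paper perturbs text with T5 mask-filling, while the `perturb` function below is a crude word-dropout stand-in.

```python
# Minimal sketch of the curvature statistic behind DetectGPT; NOT the
# authors' implementation. GPT-2 stands in for the scoring model, and
# word dropout stands in for the paper's T5 mask-filling perturbations.
import random
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def log_likelihood(text: str) -> float:
    """Average per-token log probability of the text under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return -out.loss.item()  # loss is mean negative log-likelihood

def perturb(text: str, drop: float = 0.15) -> str:
    """Crude stand-in for T5 mask-filling: randomly drop some words."""
    words = text.split()
    kept = [w for w in words if random.random() > drop]
    return " ".join(kept) if kept else text

def perturbation_discrepancy(text: str, n: int = 20) -> float:
    """log p(x) minus the mean log p of n perturbed neighbors.
    A large positive value suggests x sits near a local maximum
    (negative curvature) of the model's log probability, which is
    the signal DetectGPT looks for."""
    base = log_likelihood(text)
    neighbors = [log_likelihood(perturb(text)) for _ in range(n)]
    return base - sum(neighbors) / n
```

Machine-generated text should yield a noticeably larger discrepancy than human-written text, because it sits closer to a peak of the model’s log probability.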
This method is “zero-shot” in the sense that DetectGPT can flag machine-written text without being trained on examples of it. That stands in stark contrast to other approaches, which require training classifiers on large datasets of real and generated passages.
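In practice, the zero-shot decision reduces to thresholding the discrepancy score from the sketch above. The snippet below is hypothetical usage; the threshold value is invented for illustration and is not from the paper.

```python
# Hypothetical usage of the sketch above; the 0.9 threshold is made up
# for illustration, not a value from the DetectGPT paper.
passage = "Some passage of text whose provenance we want to check."
score = perturbation_discrepancy(passage)
print(f"discrepancy = {score:.3f}")
if score > 0.9:
    print("Likely machine-generated")
else:
    print("Likely human-written")
```

Note that no classifier is trained anywhere: the only moving parts are the scoring model and a cutoff.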
The team tested DetectGPT on a dataset of fake news articles (presumably anything that came out of CNET over the last year), and it outperformed other zero-shot methods for detecting machine-generated text. Specifically, they found that DetectGPT improved detection of fake news articles generated by the 20B-parameter GPT-NeoX from 0.81 AUROC for the strongest zero-shot baseline to 0.95 AUROC. Honestly, that was all Greek to us at first, so a quick translation: AUROC is the area under the receiver operating characteristic curve, where 0.5 is coin-flip guessing and 1.0 is perfect detection. That’s a substantial improvement in detection performance, and it suggests DetectGPT may be a promising way to scrutinize machine-generated text going forward.
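If you want to produce that kind of number for your own detector, the evaluation boils down to scoring a labeled mix of human and machine passages and handing the results to an AUROC routine. The snippet below is a sketch using scikit-learn; the labels and scores are made-up toy values, not data from the DetectGPT experiments.

```python
# Sketch of AUROC evaluation with scikit-learn; labels and scores here
# are toy values for illustration, not data from the paper.
from sklearn.metrics import roc_auc_score

# 1 = machine-generated, 0 = human-written
labels = [1, 1, 1, 0, 0, 0]
# Perturbation-discrepancy scores per passage (higher = more machine-like)
scores = [1.3, 0.9, 1.1, 0.2, -0.1, 0.4]

# 1.0 means perfect separation of the two classes; 0.5 means chance
print(roc_auc_score(labels, scores))
```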