Why detecting AI-generated text is so difficult (and what to do about it)

ByContributor February 7, 2023

This tool is OpenAI’s response to the heat it’s gotten from educators, journalists, and others for launching ChatGPT without any ways to detect text it has generated. However, it is still very much a work in progress, and it is woefully unreliable. OpenAI says its AI text detector correctly identifies 26% of AI-written text as “likely AI-written.”

While OpenAI clearly has a lot more work to do to refine its tool, there’s a limit to just how good it can make it. We’re extremely unlikely to ever get a tool that can spot AI-generated text with 100% certainty. It’s really hard to detect AI-generated text because the whole point of AI language models is to generate fluent and human-seeming text, and the model is mimicking text created by humans, says Muhammad Abdul-Mageed, a professor who oversees research in natural-language processing and machine learning at the University of British Columbia

We are in an arms race to build detection methods that can match the latest, most powerful models, Abdul-Mageed adds. New AI language models are more powerful and better at generating even more fluent language, which quickly makes our existing detection tool kit outdated.

OpenAI built its detector by creating a whole new AI language model akin to ChatGPT that is specifically trained to detect outputs from models like itself. Although details are sparse, the company apparently trained the model with examples of AI-generated text and examples of human-generated text, and then asked it to spot the AI-generated text. We asked for more information, but OpenAI did not respond.

Last month, I wrote about another method for detecting text generated by an AI: watermarks. These act as a sort of secret signal in AI-produced text that allows computer programs to detect it as such.

Researchers at the University of Maryland have developed a neat way of applying watermarks to text generated by AI language models, and they have made it freely available. These watermarks would allow us to tell with almost complete certainty when AI-generated text has been used.

The trouble is that this method requires AI companies to embed watermarking in their chatbots right from the start. OpenAI is developing these systems but has yet to roll them out in any of its products. Why the delay? One reason might be that it’s not always desirable to have AI-generated text watermarked.

One of the most promising ways ChatGPT could be integrated into products is as a tool to help people write emails or as an enhanced spell-checker in a word processor. That’s not exactly cheating. But watermarking all AI-generated text would automatically flag these outputs and could lead to wrongful accusations.