After reportedly developing text provenance solutions for ChatGPT-generated content, OpenAI faces a dilemma: release its artificial intelligence (AI) detection tools, or risk driving away potential users.
Tech news site TechCrunch first spotted the company’s quiet May blog post, which introduced a text watermarking tool to help researchers study content authenticity. The tool has reportedly been ready for release for about a year, according to The Wall Street Journal.
OpenAI has since confirmed these reports and updated its initial blog post, saying, “We have also done extensive research on the area of text provenance and have explored a range of solutions, including classifiers, watermarking, and metadata.”
OpenAI’s efforts toward AI detection
The text watermarking system works by embedding a hidden watermark in the text ChatGPT generates, allowing a companion detector to identify the content as AI-made. OpenAI claimed the system is “99.9% effective” and resistant to “localized tampering, such as paraphrasing.”
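OpenAI has not disclosed how its watermark actually works. A common scheme in the research literature, though, subtly biases the model’s token choices toward a keyed pseudorandom “green list,” so a detector holding the secret key can test whether a text uses green tokens far more often than chance. The sketch below is purely illustrative, with a made-up key and simplified tokenization:

```python
import hashlib
import random

SECRET_KEY = b"demo-key"  # hypothetical secret shared by generator and detector

def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    """Deterministically mark a pseudorandom half of the vocabulary 'green',
    keyed on the previous token, so only a key holder can recompute it."""
    seed = hashlib.sha256(SECRET_KEY + prev_token.encode()).digest()
    rng = random.Random(seed)
    shuffled = sorted(vocab)
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])

def green_fraction(tokens: list[str], vocab: list[str]) -> float:
    """Detector side: measure how often the text uses 'green' tokens."""
    hits = sum(tokens[i] in green_list(tokens[i - 1], vocab)
               for i in range(1, len(tokens)))
    return hits / max(len(tokens) - 1, 1)

# Ordinary text scores near the ~0.5 expected by chance; a watermark-aware
# generator that favors green tokens pushes the score well above that.
```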
However, the detector carries a false positive rate, albeit a low one, and even a small per-document rate adds up to a significant number of wrongly flagged texts when applied to large volumes.
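To see why even a low rate matters, consider a back-of-the-envelope calculation with an assumed rate; OpenAI has not published the actual figure:

```python
# Hypothetical numbers for illustration only.
false_positive_rate = 0.0001      # assume 1 in 10,000 human texts is misflagged
documents_scanned = 10_000_000    # e.g., essays screened across many schools

expected_false_flags = false_positive_rate * documents_scanned
print(f"Human-written documents wrongly flagged: {expected_false_flags:,.0f}")
# -> 1,000 wrongful accusations, despite a "low" per-document rate
```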
As such, the company is also investigating embedding cryptographically signed metadata to eliminate the risk of false positives. The AI company is still in the early phase of this research and has yet to measure its effectiveness, but it is confident that several characteristics of metadata make the approach particularly promising.
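OpenAI has not described its metadata design, but the reason signed metadata avoids false positives is straightforward: a signature either verifies or it does not, so unsigned human-written text can never be misattributed. A minimal sketch of the general idea, using symmetric HMAC and made-up field names for brevity (a real deployment would likely use asymmetric signatures so that verifiers cannot forge them):

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"provider-secret"  # illustrative only, not OpenAI's scheme

def sign(text: str, metadata: dict) -> str:
    payload = json.dumps({"text": text, "meta": metadata}, sort_keys=True)
    return hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()

def verify(text: str, metadata: dict, signature: str) -> bool:
    return hmac.compare_digest(sign(text, metadata), signature)

meta = {"model": "gpt-4o", "created": "2024-08-05T12:00:00Z"}  # made-up fields
sig = sign("Example model output.", meta)
assert verify("Example model output.", meta, sig)      # untouched text verifies
assert not verify("Edited model output.", meta, sig)   # any change breaks the signature
# Human-written text simply carries no valid signature, so it can never be
# falsely "detected" -- hence the appeal over statistical watermarks.
```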
The company is further investigating the use of classifiers: models trained to flag AI-generated text, much as email apps automatically sort spam and unimportant messages away from the main inbox.
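As a rough analogy in code, such a classifier could be trained like any other text categorizer. The sketch below is a toy stand-in using scikit-learn with made-up training examples, not OpenAI’s system:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus: 0 = human-written, 1 = AI-generated.
texts = [
    "hey can u send me the notes from class lol",
    "Certainly! Here is a structured overview of the key considerations.",
    "dinner at 7? traffic is brutal today",
    "As an AI language model, I can provide a concise summary of the topic.",
]
labels = [0, 1, 0, 1]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

# Like a spam filter, the model scores unseen text rather than altering it.
print(clf.predict_proba(["Here is an overview of the considerations."])[0][1])
```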
While OpenAI is working on tools to detect AI in the text ChatGPT generates, it revealed that audiovisual content provenance solutions are its priority. These primarily involve using C2PA metadata to label when an image is created or retouched using AI tools.
“As users can now edit DALL-E 3 generated images in ChatGPT, we wanted to ensure that provenance information continues to be demonstrated along with those edits. If a user edits an image, we’ve built in a means for our C2PA credential to show that the image was edited and how,” the company wrote in its update.
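As a hedged illustration of what such provenance looks like in practice, an image’s C2PA manifest can be inspected with the Content Authenticity Initiative’s open-source c2patool; the snippet below assumes the tool is installed and on PATH, and that its default JSON report format applies (both details may vary by version):

```python
import json
import subprocess

# Inspect the C2PA provenance manifest embedded in an image.
# The file name is hypothetical.
result = subprocess.run(
    ["c2patool", "dalle3_image.png"],
    capture_output=True, text=True, check=True,
)
manifest = json.loads(result.stdout)

# A manifest's assertions typically record which tool generated the image
# and what edit actions were applied afterward (schema varies by version).
print(json.dumps(manifest, indent=2))
```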
Useful for teachers
One main concern about ChatGPT and similar AI chatbots is cheating, particularly among students. Yet existing tools that claim to detect AI in essays and papers are often unreliable and far from foolproof.
Now that OpenAI has developed an AI detection system for its chatbot, many people, teachers in particular, favor its release. The Journal reported that a survey commissioned by the company found four out of five people worldwide support the idea of an AI detection tool.
The report also disclosed that watermarking does not affect the quality of the text responses in ChatGPT.
But might hurt ChatGPT users
Despite this positive feedback, OpenAI is still hesitant to roll out its provenance tools, worrying that the potential risks may outweigh the benefits.
Watermarking, although highly effective against localized tampering, does not perform as well against “globalized tampering.” This means that “using translation systems, rewording with another generative model, or asking the model to insert a special character in between every word and then deleting that character” can defeat the watermark.
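The last of those workarounds is trivially cheap, as a sketch shows: once the model has re-emitted its answer with a marker character between words, that regeneration follows different token statistics, and stripping the marker leaves fluent, watermark-free text:

```python
# Illustration of the "insert a character, then delete it" workaround.
# Hypothetical model output, re-generated with "@" between every word:
marked = "The@quick@brown@fox@jumps@over@the@lazy@dog."

# Deleting the marker yields clean prose whose token pattern no longer
# matches what the watermarked model originally produced.
laundered = marked.replace("@", " ")
print(laundered)  # "The quick brown fox jumps over the lazy dog."
```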
OpenAI also points to the risk of losing subscribers: 30% of its survey respondents said they would use ChatGPT less if watermarking were enforced.
Moreover, the company fears that its AI detection tools may affect non-native English speakers more than others. Since they often rely on the chatbot to compose English text, flagging that text could stigmatize AI use and ultimately limit the technology’s accessibility as a global education tool.