OpenAI takes steps to boost AI-generated content transparency

09.05.2024

OpenAI, now part of the Coalition for Content Provenance and Authenticity (C2PA) steering committee, will incorporate the open standard’s metadata into its generative AI models to enhance transparency regarding generated content.

The C2PA standard certifies digital content with metadata proving its origins, whether entirely AI-created, AI-enhanced, or traditional. OpenAI has begun adding C2PA metadata to images generated by its latest image model, DALL-E 3, in ChatGPT and the OpenAI API. The metadata will also be integrated into Sora, OpenAI’s upcoming video generation model, when it is released more broadly.

“Although individuals can still produce deceptive content without this information (or potentially remove it), they cannot easily forge or alter this information, making it a valuable resource for building trust,” explained OpenAI.

This move responds to increasing concerns about AI-generated content potentially misleading voters before major elections in the US, UK, and elsewhere this year. Authenticating AI-created media could help counteract deepfakes and other manipulated content used in disinformation campaigns.

While technical measures are crucial, OpenAI acknowledges that achieving content authenticity in practice requires collective action from platforms, creators, and content handlers to preserve metadata for end consumers.

In addition to C2PA integration, OpenAI is developing new provenance methods, including tamper-resistant watermarking for audio and image detection classifiers that identify AI-generated visuals.

OpenAI has opened applications for accessing its DALL-E 3 image detection classifier through its Researcher Access Program. This tool predicts the likelihood that an image originated from one of OpenAI’s models.

“Our aim is to enable independent research assessing the classifier’s effectiveness, analyzing its real-world application, raising relevant considerations for such use, and exploring the characteristics of AI-generated content,” the company stated.

Internal testing indicates high accuracy in distinguishing non-AI images from DALL-E 3 visuals, with approximately 98% of DALL-E 3 images correctly identified and less than 0.5% of non-AI images mistakenly flagged. However, the classifier is less reliable at telling DALL-E 3 images apart from those produced by other generative AI models.
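
To put those two rates in context, the share of flagged images that really come from DALL-E 3 also depends on how common such images are in the pool being checked. The short Python sketch below applies Bayes' rule to the reported figures. It is an illustration, not OpenAI's evaluation method: the function name and the prevalence values are hypothetical, the 0.5% false-positive rate is treated as an upper bound (so the results are lower bounds), and the pool is assumed to contain only DALL-E 3 images and non-AI images.

```python
def flag_precision(prevalence, true_positive_rate=0.98, false_positive_rate=0.005):
    """Share of flagged images that really are DALL-E 3 output (Bayes' rule).

    prevalence: assumed fraction of DALL-E 3 images in the pool (hypothetical).
    The default rates are the figures quoted in the article; 0.005 is the
    reported upper bound on the false-positive rate.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    true_flags = true_positive_rate * prevalence            # DALL-E 3 images flagged
    false_flags = false_positive_rate * (1.0 - prevalence)  # non-AI images flagged
    return true_flags / (true_flags + false_flags)


if __name__ == "__main__":
    # Hypothetical mixes of DALL-E 3 and non-AI images.
    for prevalence in (0.5, 0.1, 0.01):
        share = flag_precision(prevalence)
        print(f"DALL-E 3 share of pool {prevalence:.0%}: "
              f"at least {share:.1%} of flags are correct")
```

Under these assumptions, the rarer DALL-E 3 images are in the pool, the larger the fraction of flags that land on non-AI images, even with a false-positive rate below 0.5%.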

OpenAI has also integrated watermarking into its Voice Engine custom voice model, currently available in limited preview.

The company believes that increased adoption of provenance standards will result in metadata accompanying content throughout its lifecycle, addressing “a crucial gap in digital content authenticity practices.”

OpenAI, alongside Microsoft, is launching a $2 million societal resilience fund to support AI education and understanding, including through organizations such as AARP, International IDEA, and the Partnership on AI.

“While technical solutions like those mentioned provide active tools for defense, effectively establishing content authenticity in practice will necessitate collective action,” stated OpenAI.

“Our efforts regarding provenance are just one aspect of a broader industry endeavor—many of our peer research labs and generative AI companies are also advancing research in this domain. We commend these efforts—the industry must collaborate and share insights to enhance our understanding and promote online transparency.”