AI Copyright Infringement: GPT-4 Performs Poorly in Copyrighted Content Generation Test
Patronus AI, a company founded by ex-Meta researchers, recently evaluated how often leading AI models generate copyrighted content. The study covered OpenAI's GPT-4, Anthropic's Claude 2, Meta's Llama 2, and Mistral AI's Mixtral. The goal was to determine how often these models reproduced text from popular books protected by U.S. copyright law.
OpenAI's GPT-4 performed the worst among the tested models, producing copyrighted content in response to 44% of the prompts. Popular titles such as "The Perks of Being a Wallflower," "The Fault in Our Stars," and "New Moon" were among the books whose text the models reproduced.
During the evaluation, researchers asked the models either to provide the first passage of a specific copyrighted book or to complete a given excerpt from well-known titles like "Gone Girl" or "Becoming." GPT-4 showed little caution on the completion task, continuing the book's text 60% of the time when asked to do so.
By contrast, Anthropic's Claude 2 was more cautious, reproducing copyrighted content only 16% of the time when asked to complete a book's text. Meta's Llama 2 responded with copyrighted content on 10% of the prompts, while Mistral AI's Mixtral completed a book's first passage 38% of the time.
The study found that every evaluated model produced copyrighted content to some degree, raising concerns about potential copyright infringement. The findings point to the difficulty AI developers face in ensuring compliance with copyright law, especially when training large language models like GPT-4.
As the debate intensifies between AI developers and content creators over the use of copyrighted material for AI training data, cases like the lawsuit between The New York Times and OpenAI underscore the need for clearer guidelines and regulations in the AI industry.