OpenAI is facing accusations of deleting search data gathered by the plaintiffs in the copyright infringement lawsuits brought by The New York Times and the Daily News over the training of its ChatGPT model. The deletion, which OpenAI says was accidental, raises concerns about evidence retention in legal cases involving AI.
The publishers claim ChatGPT was trained using their copyrighted content. OpenAI had granted them access to virtual machines to search for this content within the training data. However, according to a letter filed with the U.S. District Court for the Southern District of New York, OpenAI engineers erased the publishers' search data from one of the virtual machines.
TechCrunch's Kyle Wiggers reported:
Earlier this fall, OpenAI agreed to provide two virtual machines...But on November 14, OpenAI engineers erased all the publishers’ search data...which was filed in the U.S. District Court...late Wednesday.
OpenAI claims the deletion was accidental and that the data was recovered, but the recovered data is allegedly in a format that is unusable for legal purposes. This raises questions about how the publishers will proceed with their claims.
The incident highlights the complex legal challenges surrounding the use of copyrighted material in training AI models. The published letter provides further details about the case. The outcome of this legal battle could have significant implications for the future of AI development and copyright law.