A group of authors has sued Meta for using pirated books to train its AI models. The allegations are backed by internal documents disclosed in the case, which the authors say show that the company knew of and approved the use of pirated content.
Allegations Against Meta
Citing newly disclosed documents, a group of authors claims that Meta knew the books used to train its AI systems were pirated. In a filing in California federal court, authors including Ta-Nehisi Coates and Sarah Silverman allege that the company, led by Mark Zuckerberg, used pirated books to develop its Llama model.
Use of LibGen Dataset
The authors claim that Meta used the LibGen dataset, a large online library containing millions of works and known to be a pirate collection, and that it continued to use the archive despite internal warnings. This raises questions about the use of content without copyright holders' consent, a practice that could endanger traditional business models in the publishing industry.
Responses and Court Rulings
In 2023, District Judge Vince Chhabria dismissed claims that text generated by Meta's AI infringed the authors' copyrights, but he allowed the authors to amend their claims. He later permitted them to submit an updated complaint, despite expressing skepticism about some of the arguments presented, which could strengthen their position in the case.
The case highlights the copyright challenges posed by AI and has prompted debate over the ethics and legality of using intellectual property to train the technology.