Unsealed court documents reveal internal discussions at Meta about using copyrighted materials to train its AI models.
The Heart of the Matter: Meta’s AI Training Data Dilemma
The lawsuit *Kadrey v. Meta* is a landmark case in AI copyright law. The plaintiffs, who include prominent authors, are challenging Meta’s claim that training AI models on copyrighted books qualifies as 'fair use.' Newly unsealed documents describe how Meta sought to incorporate copyrighted data into models such as those in the Llama family.
Libgen and the Pursuit of State-of-the-Art AI Models
One concerning revelation is that Meta staff discussed using Libgen, a known aggregator of links to copyrighted works. Internal communications show that some Meta executives considered Libgen essential to reaching state-of-the-art performance for their AI models.
IP Risks and Data Scarcity
Documents suggest Meta tuned its models to avoid 'IP risky prompts,' programming them to refuse requests that could directly reveal training data sources.
Developments in the *Kadrey v. Meta* case could significantly affect the legal frameworks governing the use of copyrighted material in AI training. They also raise ethical questions about transparency and the responsibilities of AI developers.