Apple Faces Lawsuit Over Alleged Use of Pirated Books for AI Training

Two professors from the State University of New York (SUNY) College of Health Sciences have filed a class-action lawsuit against Apple Inc., alleging unauthorized use of their copyrighted works in training artificial intelligence systems. The complaint marks another escalation in the growing legal battles over AI training data sources.

The Allegations

Professors Susana Martinez-Conde and Stephen Macknik claim Apple used texts from Books3, a controversial dataset containing approximately 186,640 books sourced from pirated materials, to train its Apple Intelligence and OpenELM language models. Their books Champions of Illusion and Sleights of Mind were allegedly included without permission.

The lawsuit asserts Apple not only used the materials for model training but also employed them to test performance and filter copyrighted content from user-facing outputs. This follows Apple's April 2024 admission that it utilized The Pile dataset, which incorporated Books3 content.

Background on Books3

Books3 operated as a shadow library, obtaining materials primarily through the private BitTorrent tracker Bibliotik. The collection gained notoriety among AI researchers before being taken down in October 2023 following copyright complaints.

The dataset became particularly controversial because it:

  • Contained clearly copyrighted material
  • Was widely distributed among tech companies
  • Lacked proper attribution or compensation mechanisms

The case presents complex questions about:

  1. Whether AI training constitutes fair use
  2. How to compensate creators when works are used algorithmically
  3. What constitutes willful infringement in machine learning contexts

The plaintiffs seek:

  • A jury trial
  • Financial compensation
  • An injunction preventing future use of their works

If found liable for willful infringement, Apple could face statutory damages of up to $150,000 per infringed work.

The lawsuit arrives amid growing scrutiny of tech companies' data practices:

"This isn't just about compensation - it's about establishing ethical boundaries for how creative works are used in the AI era," said intellectual property attorney Mark Lemley.

The case follows similar disputes involving Midjourney and Anthropic, where courts have struggled with applying traditional copyright frameworks to AI development.

Market Context

While the complaint notes Apple's market value increased $200 billion following its AI announcement, analysts caution against attributing this solely to disputed training methods:

  • Apple's valuation grew consistently over five years
  • Multiple factors influence stock performance
  • Actual impact remains unclear pending legal outcomes

Apple has not yet issued a substantive response to the allegations.

Key Points:

  • Legal action: SUNY professors allege unauthorized use of their books in Apple's AI training
  • Controversial source: Books3 dataset contained pirated materials before takedown
  • High stakes: Potential penalties could reach $150,000 per infringed work
  • Broader implications: Case tests copyright boundaries in AI development

