Publishers and Authors Sue Meta Over AI Training Data Use

Major publishers and an author have filed a class action complaint in the Southern District of New York against Meta Platforms Inc. and Chief Executive Mark Zuckerberg, alleging that Meta copied books and literary works without authorization or compensation to train its Llama series of large language models ^[1]. The plaintiffs include Elsevier Inc., Cengage Learning, Hachette Book Group, Macmillan Publishing Group, and McGraw Hill LLC, joined by author and former Authors Guild president Scott Turow as a named class representative ^[1]. The complaint, docketed as Case 1:26-cv-03689, was filed May 5, 2026 ^[1].

The complaint alleges violations of the Copyright Act, asserting that Meta ingested protected works, including scientific and academic texts, trade books, and literary fiction, as training data without obtaining licenses or making any payment to rights holders ^[1]. The inclusion of Zuckerberg as a named defendant is notable; plaintiffs allege he had direct knowledge of and authority over the decisions to acquire and use copyrighted material at scale ^[1]. The Copyright Act grants copyright owners exclusive rights to reproduce their works and to prepare derivative works. Appellate courts have not yet issued definitive guidance on whether large-scale ingestion of protected text for AI model training constitutes infringement or qualifies as fair use.

The filing enters a crowded but still-evolving litigation landscape. Multiple other AI copyright suits are pending in federal courts, including actions brought by news organizations and individual authors against companies such as OpenAI and Google. The publisher plaintiffs here are among the most commercially significant rights holders in academic and trade publishing, and the proposed class would extend to all similarly situated authors and publishers whose works were used without consent. The Southern District of New York has become a primary forum for AI copyright disputes, given its concentration of publishing industry plaintiffs and established copyright docket.

Meta has not yet filed a responsive pleading, and no scheduling order has been entered in the public record as of the filing date ^[1]. The court will likely set briefing schedules on class certification and any motion to dismiss in the coming months. The outcome of threshold questions, particularly whether AI training constitutes fair use and whether a CEO can be held personally liable for corporate copyright decisions, will carry significant consequences for how the broader AI industry structures its data acquisition practices.

References

[1] Association of American Publishers. (2026, May 5). Elsevier v. Meta Complaint (Case 1:26-cv-03689). https://publishers.org/wp-content/uploads/2026/05/2026-05-05-Complaint.pdf
