Meta Halts Book Licensing for AI Training Amid Publisher Resistance and Legal Challenges

Meta has “paused” its efforts to license books for training its large language models, according to new court filings in the ongoing copyright case Kadrey v. Meta Platforms, Inc. This follows earlier reports of difficulties securing agreements with publishers, throwing Meta’s AI strategy into question amidst mounting lawsuits alleging copyright infringement.
The filings are part of a broader legal battle that pits AI companies against authors and copyright holders. At issue is the “fair use” doctrine, which AI companies claim allows training on copyrighted content; copyright holders vehemently disagree.

The new filings, submitted on Friday, include partial deposition transcripts from Meta employees. These reveal a largely unsuccessful attempt to negotiate licensing deals for AI training data.
Sy Choudhury, head of Meta’s AI partnership initiatives, testified that outreach to publishers saw “very slow uptake in engagement and interest.”
“I don’t recall the entire list, but I remember we had made a long list from initially scouring the Internet of top publishers, et cetera,” Choudhury said, according to the transcript. “And we didn’t get contact and feedback from — from a lot of our cold call outreaches to try to establish contact.”
He added, “There were a few, like, that did, you know, engage, but not many.” This publisher hesitation, which Choudhury described as slow uptake, presented an immediate challenge.
The transcripts indicate Meta paused some AI-related book licensing efforts in early April 2023, citing “timing” and logistical issues. A major problem, Choudhury revealed, was that many fiction publishers did not hold the necessary licensing rights.
“I’d like to point out that the — in the fiction category, we quickly learned from the business development team that most of the publishers we were talking to, they themselves were representing that they did not have, actually, the rights to license the data to us,” Choudhury said. “And so it would take a long time to engage with all their authors.” This discovery, also confirmed in supporting documentation, rendered the publisher-centric approach unworkable.
Choudhury also noted that Meta has paused other AI-related licensing efforts in the past.
“I am aware of licensing efforts such, for example, we tried to license 3D worlds from different game engine and game manufacturers for our AI research team,” he said. “And in the same way that I’m describing here for fiction and textbook data, we got very little engagement to even have a conversation […] We decided to — in that case, we decided to build our own solution.”
Plaintiffs in Kadrey v. Meta Platforms, including authors Sarah Silverman and Ta-Nehisi Coates, have amended their complaint multiple times since 2023. The latest version accuses Meta of copyright infringement, including allegedly using pirated books from “shadow libraries” to train its Llama AI models.
The complaint further alleges that Meta cross-referenced pirated books with copyrighted books to assess the value of licensing agreements. It also claims Meta may have used torrenting to obtain materials, with the “seeding” aspect of torrenting constituting further copyright infringement, as it facilitates distribution. As reported by TechHQ, “BitTorrent client software typically has a default setting to automatically ‘seed’ downloaded material.”
Meta maintains its use of publicly available data falls under “fair use.” However, applying fair use to AI training is a highly contested area; one expert quoted by MSK stated that “there are ‘hundreds or thousands of interpretations and applications of the fair use doctrine.’”
The outcome of these cases will significantly impact AI development and copyright law. A ruling against AI companies could lead to substantial licensing fees or require explicit permission for every copyrighted work used, dramatically increasing costs. The projected growth of the AI training dataset market, estimated to reach USD 14.67 billion by 2032, underscores what is at stake.
Beyond legal issues, ethical concerns arise. AI researchers emphasize responsible AI development, respecting intellectual property and ensuring fair compensation. Data diversity and bias are also crucial, as biased training data can perpetuate societal biases.
One solution is prioritizing publicly available resources, like works in the public domain or those under open licenses, as noted by AInvest. However, this may not suffice for advanced AI models, necessitating collaboration and innovative licensing models. Some publishers are engaging; Shutterstock has deals with Meta, OpenAI, Amazon, and Apple to license its image library.
The publishing industry already faces digital piracy challenges, with one study showing a 26.6% increase in popularity of pirated book sites, and AI exacerbates these concerns. Statistics and trends for 2024 also fuel fears that AI text generation will compete with human authors. A sustainable path requires balancing AI innovation with creator rights, which in turn calls for clear guidelines in this rapidly evolving field.