
Welcome to The Journal!
The Journal component of the University of Wisconsin Pre-Law Journal (UWPLJ) showcases research-driven articles that explore legal issues, judicial decisions, and the law’s influence on society, politics, and culture. Unlike our blog, The Journal features contributions from pre-law–focused students selected through an application process. Each article undergoes a multi-stage peer review and editorial process, with writers collaborating closely with their editors to ensure clarity and depth.
To read our published journal articles, please select one of our issues.
Applications for writers and editors open to UW–Madison Pre-Law Society members at the start of each fall and spring semester.
Bartz v. Anthropic PBC: Where Do We Draw the Line?
Written by Leo Zhu, Edited by Misha Beggs
Vol. 2, Issue 1 – January 2026
Abstract
Large language models (LLMs) have become a dominant presence in modern society. Their ability to perform a wide range of tasks efficiently can be attributed to the “Transformer” architecture that most of today’s AI systems follow, which allows a model to identify the most relevant parts of its input and act on those findings. To fully harness the potential of these models, however, AI companies must source vast amounts of information to feed them. Anthropic, the developer behind Claude, sits at the forefront of AI development, meaning it must constantly seek out new training data to stay ahead of the curve. In 2024, the company was caught using pirated material belonging to a group of authors (Bartz et al.), who then sued for copyright infringement.
The court ruled that both Anthropic’s use of copyrighted material in AI training and its method of digitizing books were fair uses, but that its possession of pirated material breached copyright law. The parties then reached a settlement of $3,000 for each of the estimated 500,000 Class works, amounting to at least $1.5 billion (works in excess of the estimated total will each receive the same $3,000 payout). In addition, Anthropic was required to delete all remaining copies of pirated works from its digital library. Despite the magnitude of this settlement, there is little guarantee that other major AI companies will abandon their current sourcing methods, as the financial risk of a lawsuit is far smaller than the profit they stand to gain from leading perhaps the most influential and rapidly growing industry in the world.
I. Introduction
Over the past decade, large language models (LLMs) have ingrained themselves into human society. From menial tasks such as data processing to highly skilled tasks such as coding and pattern analysis, these chatbots are proficient in many areas and widely sought after. The system that now serves as the backbone for this new wave of high-functioning bots was first introduced in 2017 by Vaswani et al. in their paper, “Attention Is All You Need” [1]. The “Transformer” model, as they called it, is now utilized in today’s most cutting-edge bots. What makes this model unique is its attention mechanism, which allows it to “attend to the most relevant parts of input data” [2] (a simplified sketch of this mechanism follows this section’s notes); this gives it a significant advantage over older models, which take much longer to produce results because they attend to every input equally. In the years since Vaswani et al. published their findings, a new generation of Transformer-based bots has taken the world by storm. These bots are able to process large quantities of data with incredible efficiency, a helpful but problematic ability. To train big LLMs such as ChatGPT, Claude, Gemini, and Meta’s Llama, huge amounts of information must be poured into them on a regular basis. Given that only so much material can be obtained from legal sources, AI developers may find themselves faced with a moral dilemma when attempting to keep up with competitors: to pirate or not to pirate.
[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser & Illia Polosukhin, Attention Is All You Need, ARXIV (2017), https://doi.org/10.48550/arXiv.1706.03762.
[2] Dave Bergmann & Cole Stryker, What is an attention mechanism?, IBM (October 21, 2025), https://www.ibm.com/think/topics/attention-mechanism.
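For readers curious about the mechanics behind the “attend to the most relevant parts” language above, the following is a minimal, illustrative sketch of scaled dot-product attention, the core operation of the Transformer model introduced by Vaswani et al. [1]. The variable names and toy data are ours; real systems like Claude stack many such layers with learned projections and further optimizations on top of this basic operation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention, the core Transformer operation [1].

    Each query row is scored against every key row; softmax turns the
    scores into weights, and the output is a weighted average of the
    values, i.e., the model "attending to the most relevant parts of
    input data" [2].
    """
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # query-key similarity, scaled
    scores -= scores.max(axis=-1, keepdims=True)  # subtract max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: weights sum to 1
    return weights @ V                            # weighted average of values

# Toy run: 3 tokens, each a 4-dimensional vector, attending to one another.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))
output = scaled_dot_product_attention(tokens, tokens, tokens)
print(output.shape)  # (3, 4): one context-aware vector per token
```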
II. Background
In August 2024, a class action lawsuit was filed against Anthropic, the company behind Claude AI. The plaintiffs, Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson (Bartz et al.), were all published authors who accused Anthropic of pirating several of their works. Dario Amodei, co-founder and CEO of Anthropic, did not deny this claim, stating that there were many sources from which the company could have legally purchased the books, but that it chose to pirate them to avoid “legal/practice/business slog” [3]. Ben Mann, another Anthropic co-founder, admitted to sourcing millions of digitized books from Books3 and Library Genesis (LibGen), both well-known sources of pirated material.
A year after Mann’s pirating spree, in July 2022, Anthropic found yet another illegal source in Pirate Library Mirror (PiLiMi), from which it downloaded roughly two million more copies of pirated material [4]. Aware of the legal issues associated with these sourcing methods, Anthropic hired Google’s former head of partnerships, Tom Turvey, in February 2024, tasking him with acquiring as many books as possible while staying within legal boundaries [5]. Turvey began by emailing two major publishers to ask whether Anthropic could license some of their books to train AI. To clarify, “training” an AI means using outside information to teach the model certain tasks; for Claude, these tasks include developing its writing style, improving its understanding of language patterns, and building on its knowledge of real-world events. What could have been a fair and legal agreement between the parties fell through when Turvey ended all ongoing discussions with the publishers, opting instead for a different approach: contacting major book distributors and retailers with the intent of buying in bulk.
Anthropic purchased millions of physical books for the express purpose of adding them to its research library. Among those purchases, however, were a handful of works written by Bartz et al. Once the books were acquired, Anthropic put them through a four-step digitization process that involved, among other things, cleaning out unnecessary text (headers, footers, etc.), converting words into their corresponding number sequences (in accordance with Anthropic’s dictionary), and allowing Claude models to store these newly formatted texts in their databases (a simplified sketch of this conversion step follows this section’s notes). In response to the plaintiffs’ suit, Anthropic moved for summary judgment on fair use.
[3] Bartz v. Anthropic PBC, No. 3:24-cv-05417 (N.D. Cal. 2024)
[4] Id.
[5] Id.
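The “converting words into their corresponding number sequences” step described above is what AI practitioners call tokenization. The sketch below illustrates the basic idea only: Anthropic’s actual dictionary and tokenizer are proprietary and far more sophisticated (modern tokenizers operate on subword pieces rather than whole words), and all names here are our own.

```python
# Toy illustration of tokenization: mapping words to numeric IDs so a
# model can store and process text as number sequences. This is a
# conceptual sketch, not Anthropic's actual method.

def build_vocab(texts):
    """Assign each unique word a stable integer ID (the 'dictionary')."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def tokenize(text, vocab):
    """Convert a string into its corresponding number sequence."""
    return [vocab[word] for word in text.lower().split()]

library = ["the quick brown fox", "the lazy dog"]
vocab = build_vocab(library)             # {'the': 0, 'quick': 1, ..., 'dog': 5}
print(tokenize("the quick dog", vocab))  # [0, 1, 5]
```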
III. Ruling
On June 23, 2025, Judge William Alsup of the U.S. District Court for the Northern District of California ruled in favor of the authors on the piracy claim. This ruling rested on two findings: first, that Anthropic possessed upwards of seven million pirated book copies; and second, that Anthropic’s use of pirated books was not sufficiently transformative, meaning that not enough additional content was introduced, nor enough copyrighted content removed, to creatively distinguish the digitized copies from the originals. On the other hand, the court deemed both Anthropic’s use of books in AI training and its digitization of legally purchased books fair use. The former was argued to be unlawful by Bartz et al. primarily because they believed Claude would imitate a writing style identical to their own when communicating with users. Anthropic rebutted this point, arguing that its models’ writing styles are an amalgamation of thousands of works from countless sources, so no single author’s technique, or even a few authors’ techniques, could prevail in its outputs. The latter was deemed fair use because the conversion from physical to digital format was “exceedingly transformative” [6], meaning there was enough modification to legally distinguish the newly digitized versions from the original copies. Additionally, Anthropic refrained from copying and redistributing its physically purchased books, keeping the digitization process fully within the bounds of the law.
Following the ruling on Anthropic’s piracy, the parties moved toward a settlement. On September 5, 2025, the details of the class action payout were announced: Anthropic would pay the Class “at least $1.5 billion dollars, plus interest”; with an estimated 500,000 Class-relevant works, “this amounts to an estimated gross recovery of $3,000 per class work,” to be paid within two years of final approval [7]. In addition, any Class works in excess of the expected 500,000 will receive the same $3,000 per-unit payout. According to the plaintiffs, the settlement will be “the largest publicly reported copyright recovery in history, larger than any other class action settlement or any individual copyright case litigated to final judgment” [8]. Along with the $1.5 billion payout, Anthropic is also obligated to “destroy the LibGen and PiLiMi datasets after the expiration of any litigation preservation or other court orders” [9], effectively preventing it from making use of any remaining pirated materials.
[6] Bartz v. Anthropic PBC, No. 3:24-cv-05417 (N.D. Cal. 2024)
[7] Eileen McDermott, Anthropic to pay largest publicly reported copyright settlement in history, IP Wᴀᴛᴄʜᴅᴏɢ, September 5, 2025, https://ipwatchdog.com/2025/09/05/anthropic-pay-largest-publicly-reported-copyright-settlement-history/.
[8] Id.
[9] Bartz v. Anthropic PBC, No. 3:24-cv-05417 (N.D. Cal. 2024)
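The arithmetic behind the headline figure, using the numbers reported above, is straightforward:

\[
500{,}000 \text{ Class works} \times \$3{,}000 \text{ per work} = \$1{,}500{,}000{,}000 = \$1.5 \text{ billion}
\]

Any works identified beyond the 500,000 estimate raise the total above that floor, which is why the settlement is framed as “at least” $1.5 billion.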
IV. Implications
Despite the historic results of Bartz v. Anthropic PBC, there is no guarantee that the case will directly restrict or regulate the current sourcing practices of other AI developers. To understand why, we must put the scale of companies like Anthropic into perspective. Founded in 2021 by former OpenAI employees, Anthropic has become one of the biggest names in AI in just four years. As of September 2, 2025, according to Anthropic’s own report, the company is valued at $183 billion following its most recent fundraising round [10]. The $1.5 billion settlement, though crowned the largest publicly reported copyright infringement payout in history, seems little more than a costly slap on the wrist in the context of Anthropic’s 12-figure valuation. Other big AI companies, such as OpenAI, Google, and Meta, are valued even higher, some reaching into the trillions. Having witnessed the suit against Anthropic, these companies are now aware of the billions they stand to lose if their sourcing methods are deemed illegal. Even so, big AI companies will likely continue their illegal sourcing methods, as the profit and growth to be had from creating the next big LLM far outweigh a minor legal battle in the present. Should other authors be enticed by the prospect of securing a similarly lucrative settlement, there may be an influx of intellectual property (IP) suits against AI companies, making for a heated climate in IP law for the foreseeable future.
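A rough back-of-the-envelope comparison of the settlement and valuation figures above makes the “slap on the wrist” point concrete:

\[
\frac{\$1.5 \text{ billion (settlement)}}{\$183 \text{ billion (valuation)}} \approx 0.8\%
\]

In other words, the largest copyright payout in history amounts to less than one percent of Anthropic’s post-money valuation, paid out over two years.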
Since the start of 2025, there have been four major copyright infringement cases concerning the fair use of AI. Of these, two have gone in favor of the plaintiffs (including Bartz v. Anthropic PBC) and the other two in favor of the AI companies [11]. In three of the four cases, at least one ruling turned on whether the company’s use of material was transformative, illustrating the concept’s importance in discussions of fair use [12]. For now, it seems that an AI company wishing to avoid a successful copyright infringement suit must consider whether the content it feeds its bots can be decisively distinguished from the authors’ original work; judging from the results of Bartz v. Anthropic PBC, however, only a minimal amount of difference between an original and an edited work appears necessary to avoid infringement, as evidenced by the limited changes made to the three authors’ books during the digitization process. This could lead to rising tension among authors whose works are being used to train AI models, as they receive no compensation from the companies making billions off their work.
[10] Anthropic raises $13B Series F at $183B post-money valuation, Aɴᴛʜʀᴏᴘɪᴄ, September 2, 2025, https://www.anthropic.com/news/anthropic-raises-series-f-at-usd183b-post-money-valuation.
[11] Sarah C. Reasoner et al., The Art (and Legality) of Imitation: Navigating the Murky Waters of Fair Use in AI Training, THE NATIONAL LAW REVIEW, July 15, 2025, https://natlawreview.com/article/art-and-legality-imitation-navigating-murky-waters-fair-use-ai-training.
[12] Id.

