from the blind-judges-and-the-ai-copyright-elephant dept
Within days of each other, two federal judges in the same district reached completely opposite conclusions about AI training on copyrighted works. Judge William Alsup said it’s likely fair use because it’s transformative. Judge Vince Chhabria said it’s likely infringing because of the supposed impact on the market. Both rulings came out of the Northern District of California, both involve thoughtful judges with solid copyright track records, and they can’t both be right.
The disconnect reveals something important: we’re watching judges fixate on their personal bugbears rather than grappling with the fundamental questions about how copyright should work in the age of AI. It’s a classic case of blind men and an elephant, with each judge touching one part of the problem and declaring that’s the whole animal.
I just wrote about Judge Alsup’s careful analysis, which found that training AI was likely protected as fair use, but that building an internal digital library of unlicensed downloaded works was probably not. Before that piece was even published, Judge Vince Chhabria came out with a ruling that disagrees.
The summary: AI training is likely infringing. But here, the plaintiff authors failed to present evidence, and thus Meta wins summary judgment. Ironically, Alsup’s ruling was probably a win for AI innovation but a loss for Anthropic. Chhabria’s is the opposite: a clear win for Meta, but potentially devastating for AI innovation generally.
Chhabria’s Flawed Market Harm Analysis
Chhabria’s ruling seems to overweight the “effect on the market” factor of the fair use analysis (and, I think, to predict that effect incorrectly):
Because the performance of a generative AI model depends on the amount and quality of data it absorbs as part of its training, companies have been unable to resist the temptation to feed copyright-protected materials into their models—without getting permission from the copyright holders or paying them for the right to use their works for this purpose. This case presents the question whether such conduct is illegal.
Although the devil is in the details, in most cases the answer will likely be yes. What copyright law cares about, above all else, is preserving the incentive for human beings to create artistic and scientific works. Therefore, it is generally illegal to copy protected works without permission. And the doctrine of “fair use,” which provides a defense to certain claims of copyright infringement, typically doesn’t apply to copying that will significantly diminish the ability of copyright holders to make money from their works (thus significantly diminishing the incentive to create in the future). Generative AI has the potential to flood the market with endless amounts of images, songs, articles, books, and more. People can prompt generative AI models to produce these outputs using a tiny fraction of the time and creativity that would otherwise be required. So by training generative AI models with copyrighted works, companies are creating something that often will dramatically undermine the market for those works, and thus dramatically undermine the incentive for human beings to create things the old-fashioned way.
I find this entire reasoning extremely problematic, and it’s why I mentioned in the Alsup piece that I don’t think the “effect of the use upon the market” should really be a part of the fair use calculation. Because any type of competition can lead to fewer people buying a given work. Or it can inspire people to buy more works by generating more interest. Chhabria’s example here seems particularly… weird:
Take, for example, biographies. If a company uses copyrighted biographies to train a model, and if the model is thus capable of generating endless amounts of biographies, the market for many of the copied biographies could be severely harmed. Perhaps not the market for Robert Caro’s Master of the Senate, because that book is at the top of so many people’s lists of biographies to read. But you can bet that the market for lesser-known biographies of Lyndon B. Johnson will be affected. And this, in turn, will diminish the incentive to write biographies in the future.
This is where Chhabria’s reasoning completely falls apart. He admits in his own example that Robert Caro’s biography would be fine because “that book is at the top of so many people’s lists.” But that admission destroys his entire argument: people recognize that a good biography is a good biography, and AI slop—even AI slop generated from reading other good biographies—is not a credible substitute.
More fundamentally, his logic would make any learning from existing works potentially infringing.
If you go to Ford’s Theatre in DC, where Lincoln was shot and killed, you can actually see a very cool tower of every book they could find written about Lincoln. Under Chhabria’s reasoning, this abundance should have killed the market for Lincoln biographies decades ago. Instead, new ones keep getting published and finding audiences.
If any of the authors of any of those books read any of the other books, learned from them, and then wrote their own take which did not copy any of the protectable expression of the other books, would that be infringing? Of course not. Yet Chhabria’s analysis seems to argue that it would likely be so.
Or take magazine articles. If a company uses copyrighted magazine articles to train a model capable of generating similar articles, it’s easy to imagine the market for the copied articles diminishing substantially. Especially if the AI-generated articles are made available for free. And again, how will this affect the incentive for human beings to put in the effort necessary to produce high-quality magazine articles?
This argument would be more compelling if the internet hadn’t already been flooded with free content for decades. Plenty of the internet (including this very site) consists of freely available articles based on our reading and analysis of magazine articles. This hasn’t destroyed the market for original journalism—it’s just competition. And, indeed, some of that competition can actually increase the market for the original works as well. If I read a short summary of a magazine article, that may make me even more likely to want to read the original, professionally written one.
So I don’t find either of these examples particularly compelling, and am a bit surprised that Chhabria does. He does admit that other kinds of works are “murkier”:
With some types of works, the picture is a bit murkier. For example, it’s not clear how generative AI would affect the market for memoirs or autobiographies, since by definition people read those works because of who wrote them. With fiction, it might depend on the type of book. Perhaps classic works of literature like The Catcher in the Rye would not see their markets diminished. But the market for the typical human-created romance or spy novel could be diminished substantially by the proliferation of similar AI-created works. And again, the proliferation of such works would presumably diminish the incentive for human beings to write romance or spy novels in the first place.
Again, even his murkier claims seem weird. There are so many romance and spy novels out there, with more coming out all the time, and the fact that the market is flooded with such books doesn’t seem to diminish the demand for new ones.
This all feels suspiciously like the debunked arguments during the big internet piracy wars about how downloading music for free would magically make it so that no one wanted to make music ever again. The reality was quite different: because the tools for production and distribution became much cheaper and more accessible, more music than ever before was produced, released, distributed… and monetized in some form.
So the entire premise of Chhabria’s argument just seems… wrong.
The Alsup vs. Chhabria Split
Chhabria also takes a fairly dismissive tone on the question of transformativeness. And even though he likely wrote most of this opinion before Alsup’s became public, he adds in a short paragraph addressing Alsup’s ruling:
Speaking of which, in a recent ruling on this topic, Judge Alsup focused heavily on the transformative nature of generative AI while brushing aside concerns about the harm it can inflict on the market for the works it gets trained on. Such harm would be no different, he reasoned, than the harm caused by using the works for “training schoolchildren to write well,” which could “result in an explosion of competing works.” Order on Fair Use at 28, Bartz v. Anthropic PBC, No. 24-cv-5417 (N.D. Cal. June 23, 2025), Dkt. No. 231. According to Judge Alsup, this “is not the kind of competitive or creative displacement that concerns the Copyright Act.” Id. But when it comes to market effects, using books to teach children to write is not remotely like using books to create a product that a single individual could employ to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take. This inapt analogy is not a basis for blowing off the most important factor in the fair use analysis.
Here we see the fundamental disagreement: Alsup thinks transformativeness is the key factor; Chhabria thinks market impact trumps everything else. They can’t both be right, and the fair use four-factor test gives judges enough wiggle room to justify either conclusion.
Chhabria does agree that training LLMs is transformative:
This factor favors Meta. There is no serious question that Meta’s use of the plaintiffs’ books had a “further purpose” and “different character” than the books—that it was highly transformative. The purpose of Meta’s copying was to train its LLMs, which are innovative tools that can be used to generate diverse text and perform a wide range of functions. Cf. Oracle, 593 U.S. at 30 (transformative to use copyrighted computer code “to create a new platform that could be readily used by programmers”). Users can ask Llama to edit an email they have written, translate an excerpt from or into a foreign language, write a skit based on a hypothetical scenario, or do any number of other tasks. The purpose of the plaintiffs’ books, by contrast, is to be read for entertainment or education.
But he thinks market harm is more important—a conclusion that would gut much of fair use doctrine if applied consistently.
Also, while Alsup focused heavily on the unauthorized works that Anthropic downloaded and stored in an internal “library,” Chhabria, despite going into great detail about how Meta used BitTorrent to download similar (and in some cases identical) copies of books, leaves for another day the question of whether that downloading is infringing.
Indeed, in some ways, these two cases illustrate the old claim that the fair use four-factor test is just an excuse for a judge to do whatever they want, and then work backwards to justify the outcome in more legalistic terms using those four factors.
The Plaintiffs’ Spectacular Failure
Given all this, you might think that Chhabria ruled against Meta, but he did not, mainly because the crux of his opinion—that these AI tools will flood the market and diminish the incentives for new authors—is so ludicrous that the plaintiffs in this case barely even raised it as an issue and presented no evidence in support.
In connection with these fair use arguments, the plaintiffs offer two primary theories for how the markets for their works are affected by Meta’s copying. They contend that Llama is capable of reproducing small snippets of text from their books. And they contend that Meta, by using their works for training without permission, has diminished the authors’ ability to license their works for the purpose of training large language models. As explained below, both of these arguments are clear losers. Llama is not capable of generating enough text from the plaintiffs’ books to matter, and the plaintiffs are not entitled to the market for licensing their works as AI training data. As for the potentially winning argument—that Meta has copied their works to create a product that will likely flood the market with similar works, causing market dilution—the plaintiffs barely give this issue lip service, and they present no evidence about how the current or expected outputs from Meta’s models would dilute the market for their own works.
Given the state of the record, the Court has no choice but to grant summary judgment to Meta on the plaintiffs’ claim that the company violated copyright law by training its models with their books.
In short, the court’s ruling is that the winning argument would have been the impact on the market, while the plaintiffs instead focused on the claim that the outputs of AI tools trained on their works were infringing. But, Chhabria notes, that argument is silly.
The irony is delicious: Chhabria essentially handed the authors a roadmap for how to beat AI companies in future cases, but these particular authors were too focused on their other weak theories to follow it. It’s a clear win for Meta, but potentially devastating precedent for AI development generally.
What we’re watching is how the fair use four-factor test can be manipulated to justify almost any conclusion a judge wants to reach. Alsup prioritized transformativeness and found for fair use. Chhabria prioritized market harm and found against it (even while ruling for Meta because these plaintiffs failed to offer evidence of that harm). Both wrote lengthy, seemingly reasoned opinions reaching opposite conclusions from largely similar facts.
This case isn’t settled. Neither is the broader question of AI training and copyright. We’re still years away from definitive answers, and in the meantime, companies and developers are left navigating a legal minefield where identical conduct might be fair use in one courtroom and infringement in another.
Filed Under: competition, copyright, effect on the market, fair use, generative ai, llms, transformative, transformativeness, vince chhabria, william alsup
Companies: anthropic, meta