The Fatal Failures of the Copyright Office’s Report on AI

At long last, the Copyright Office released its third report on fair use and AI – but its analysis contained serious missteps.

Back in 2023, the U.S. Copyright Office was given the unenviable task of advising Congress on whether using copyrighted works to train artificial intelligence was a fair use. It took them nearly two years and a Librarian of Congress, but in May 2025, the Office released the long-awaited third installment of the report.

Its top-line conclusions were generally correct: Copyright law is extremely robust, with decades of relevant case law behind it, and is actually quite well-equipped to deal with the question of training on copyrighted works. Fair use is a case-by-case analysis, and the outcome will depend largely on the specific facts of each case. Congress should wait for the courts to suss out these questions, and see whether licensing markets develop organically in the meantime. If they don’t develop, or take an anticompetitive turn, then Congress should consider a kind of mass licensing known as “extended collective licensing.” The Report also contained an extremely useful (and accurate) description of the training process, including unique use cases such as “retrieval-augmented generation” (or RAG) models, which search for answers in real time to user queries. 

But the reasoning the Office used to arrive at those conclusions is largely a mess. Three missteps in particular undermine the Office’s analysis: endorsing an unsupported theory of market dilution; making up a nonexistent right for rightsholders to control the manner in which their works are accessed; and sloppily applying the recent Warhol decision to a general-purpose technology.

Market Dilution Is Not A Thing

Probably the most-discussed failure of the Report is the Copyright Office’s decision to endorse a brand new, fringe theory of market harm: that the existence of too much creative work is bad. This stems from a recurring fear of rightsholders that AI-generated works will flood the market in a way that will disadvantage human-created works and drive down their value.

This theory, known as “market dilution,” is based on a tortured reading of the fourth fair use factor – the effect of the use on the market for the original. Although the Office “acknowledge[s] this is uncharted territory,” it nevertheless argues that “any effect upon the potential market” for a copyrighted work – up to and including the introduction of competing works – can weigh against fair use. This becomes doubly tortured when applied to situations where the work being “disadvantaged” was used solely to train the person or tool that assists in the creation of a new, competing work. Under “market dilution” theory, copyright stops being about copies and starts being about competition – a problem, because humans learn from existing works before creating their own. Assigning novels to a college creative writing class? If any of those student works ends up getting published, that’s a new work competing against the books the students read in class; that’s market dilution. DSLR (digital single-lens reflex) cameras, which allow everyday users to take professional-looking photos that they can license to news organizations? Market dilution. Recording your own music with GarageBand and releasing it online? Market dilution, and you can expect a call from the RIAA (once they’re done suing your internet service provider).

Copyright law doesn’t protect creators from the threat of too much competition, machine-made or otherwise. If there’s a through-line at all in the history of copyright policy, it’s that technology allows for more works, by more people, with less friction. Everything from word processing to Photoshop has led to explosions in the number of works being created – and licensed. Copyright law does not protect a right to be free of competitive market pressure; it simply protects the right to control (some) uses of your specific work. As Professor Edward Lee puts it, market dilution theory “turns copyright from a limited monopoly to a general monopoly against competition posed by noninfringing works.” And, if that’s not enough to convince you, the U.S. Supreme Court also had a few things to say, namely that “Creative works can compete with other creative works for the same market, even if their appeal is overlapping,” and “No presumption or inference of market harm … is applicable to a case involving something beyond mere duplication for commercial purposes.”

In short, if you want to address the market threat to creative labor, copyright law isn’t the avenue to do it.

Publishers Do Not Have an Absolute Right To Control the Means of Access

Another nonsense assertion in the report is the idea that whether or not a source work was initially accessed with the copyright owner’s permission has a bearing on whether or not the use is fair. For one thing, this is self-defeating logic; if you had obtained the rightsholder’s permission, you wouldn’t have to be in court arguing that your use was fair. Rightsholders take this argument a step further, saying that any training data obtained via “unlawful access” poisons the entire model. 

It’s a nonsense argument, but it’s worth unpacking what “access” means in the context of digital media. Training sets are digital; any material included in the set needs to either be converted from analog to digital (say, by scanning) or obtained in a native digital format. While methods vary – Anthropic used a combination of book-scanning and scraped digital sources – most developers are naturally going to want works that are digital from the start.

Accessing these works, however, requires users to agree to a minefield of contracts. These contracts frequently contain clauses that (purport to) bar users from making legally allowable fair uses of the works they’re accessing. And even if there’s no contract, there’s likely to be a digital access lock, which starkly limits what you can do with the file – even if all you want to do is make a fair use. Under the rightsholder view of the world, there is no “lawful access” of a digital work that isn’t entirely on their terms. 

Luckily, that’s not how the law works. The Digital Millennium Copyright Act (DMCA) itself provides exemptions that permit circumventing digital locks when the purpose is to make an approved fair use. “Lawful access” as a factor in fair use is not backed up by statute or case law, and judges have rejected it both times it was argued in recent AI copyright cases: in both Kadrey v. Meta and Bartz v. Anthropic, the courts held that even using pirated material – material whose acquisition or retention might itself be infringing – did not necessarily make the training itself unfair.

Warhol (Derogatory)

All of this is further complicated by a longstanding problem in fair use: What, exactly, is the “use” the courts are supposed to be evaluating? 

Let’s use a hypothetical. Say you’re a collage artist; you cut out images from magazines and incorporate them into your work. The resulting piece is a large, 3’ x 5’ poster, which you license to your favorite band as album cover art. What is the relevant “use” in this case? Is it the act of incorporating that picture into the resulting collage? Is it the act of licensing the collage as cover art? Or is “fair use” a status held by the work itself, that then travels with it regardless of what the artist does later down the line?  

In its recent Warhol v. Goldsmith decision, the Supreme Court faced a version of this problem – what is the relevant “use,” and once that’s defined, how do we assess its fairness? In 1981, photographer Lynn Goldsmith photographed singer-songwriter Prince, then an up-and-coming artist. In 1984, Vanity Fair licensed that photograph and commissioned Andy Warhol to turn it into a silk screen print. Warhol made the one print allowed by the license, but he liked the photograph so much that, over the remaining three years of his life (1984–87), he made another fifteen for his personal collection. He died in 1987, and his estate displayed the Prince Series in galleries around the world. In 2016 – nearly three decades after his death – his estate then licensed one of the unlicensed prints back to Vanity Fair as a cover image for a special retrospective issue. Goldsmith, who had for more than three decades been unaware of the existence of the other fifteen prints, sued.

The Court ultimately decided that commercial use of the work (by a separate party, three decades after its creation!) retroactively rendered the prints themselves less transformative. The Court, through this miracle of judicial time travel, created a “Schrödinger’s cat” of fairness: whether or not a work itself is transformative may depend wholly on what the artist (or someone else!) chooses to do with it commercially at a later, unknown date. This is like saying that my cat is now a fish because I gave him a bath.

The only way this line of argument – commerciality can change how inherently transformative a use is – makes any logical sense is if you assume that the creator, when she sat down to make the secondary work, had commercial intent. Something made for the purpose of commercial use is less transformative; something made for the purpose of educational use may be more transformative. If the work is commercialized by its artist later, that speaks to the artist’s intent at the time of creation. 

The idea that the actions of Warhol’s estate evidenced that Warhol himself had “commercial intent” only works thanks to the legal fiction that an estate acts, for all intents and purposes, as the deceased individual himself. Andy Warhol made the unauthorized prints, and “Andy Warhol” licensed them to Vanity Fair in 2016. But this principle starts breaking down rapidly when you try to apply it to something with as many intervening actors as AI. 

How Does This Affect AI?

AI is a general-purpose technology. It can be used (and marketed) for any number of ends that are fundamentally beyond the decision power of the original developer. The initial training, fine-tuning, and end use can be, and often are, performed by entirely different people and organizations. The whole point of most AI models is to be able to re-implement and repurpose what’s been trained for new, custom ends. Meta’s Llama model is a great example: it’s an open-source large language model that has been fine-tuned and reshaped for everything from identifying biomarkers in cancerous tumors to helping conservators analyze centuries-old artworks and strategize how to best preserve them.

The Copyright Office report largely discarded this fact, and instead attempted to apply Warhol’s reasoning as if the entire process, from training to output prompting, were under the conscious control of a single entity. There are cases where this might make sense: systems developed entirely for in-house use, or agentic systems explicitly marketed as digital replacements for human workers. But when that decision-making chain breaks, so does the Report’s logic. A model or its components may pass through the control of multiple entities. Does the fact that an end user decides to use a re-implemented model in a way that competes with something in the training set retroactively render that training not a fair use? If the training itself was not fair, then is it unfair as applied to all subsequent implementations? Does the cancer biomarker detector model now violate copyright law?

The Copyright Office’s sloppy assumption of singular decision-making results in a vision of fair use where training exists inside that Schrödinger’s box, both fair and unfair, depending on whether someone out there in the universe decides to point the model toward an unauthorized end.

Practical Impacts

So far, the Report has had little to no impact on the political (or legal) trajectory of AI disputes. First, because the Report was rushed out the door when it became clear President Trump was preparing to fire the Register of Copyrights, it is only available in a “pre-publication version.” Given that the Office still lacks an acting Register – and the administration appears to have turned its attention elsewhere – it will likely be a while before we see a formal, finalized version… or the rest of the reports that the Office is scheduled to release in its AI and Copyright series.

Second, the court decisions that have happened since the Report’s release have completely ignored it. A report from the Copyright Office doesn’t have any binding legal weight; in lawyer speak, it’s considered persuasive, but not entitled to any special deference. (It’s unclear what impact, if any, the Report’s “pre-publication” status has on its persuasiveness.) 

Finally, the Report was almost immediately superseded by the American Law Institute’s issuance of the long-awaited Restatement of the Law, Copyright. ALI Restatements are summaries of complex areas of law, designed to bring judges up to speed in areas where they have little experience. The Restatement was a monumental effort, and its (much more even-handed) treatment of fair use is going to be much closer to hand and more readily consulted by judges than the Copyright Office’s pre-publication report.

Conclusion

The Copyright Office’s report was an ambitious but flawed attempt to provide categorical answers to fact-specific questions. In doing so, the Office dangled some crazy bait – but mercifully, the courts don’t seem to be biting.