The COPIED Act Is an End Run around Copyright Law

This is the wrong bill, at the wrong time, from the wrong policymakers, to address complex questions of copyright and generative artificial intelligence.

Over the past week, there has been a flurry of activity related to the Content Origin Protection and Integrity from Edited and Deepfaked Media (COPIED) Act. While superficially focused on helping people understand when they are looking at content that has been created or altered using artificial intelligence (AI) tools, this overly broad bill makes an end run around copyright law and restricts how everyone – not just huge AI developers – can use copyrighted work as the basis of new creative expression. 

The COPIED Act was introduced in the Senate two weeks ago by Senators Maria Cantwell (D-WA, and Chair of the Commerce Committee); Marsha Blackburn (R-TN); and Martin Heinrich (D-NM). By the end of last week, we learned there may be a hearing and markup on the bill within days or weeks. The bill directs agency action on standards for detecting and labeling synthetic content; requires AI developers to allow the inclusion of these standards on content; and prohibits the use of such content to generate new content or train AI models without consent and compensation from creators. It allows for enforcement by the Federal Trade Commission and state attorneys general, and for private rights of action. 

We want to say unequivocally that this is the wrong bill, at the wrong time, from the wrong policymakers, to address complex questions of copyright and generative artificial intelligence. 

If you squint, some of the provisions in COPIED are compatible with the White House’s Executive Order on AI and the Roadmap for AI Policy in the Senate (of which Senator Heinrich is an author). Both documents refer to the development of standards related to authenticating content, tracking its provenance, and using technical methods such as watermarking to label synthetic content. The Senate Roadmap also referred to the AI-related concerns, including consent and disclosure, of professional content creators and publishers. It’s the role of the relevant Senate committees to translate these documents into legislation, but we believe the introduced language in this bill is overbroad, scoping in anodyne tools like basic office software, your cell phone’s camera app, and Photoshop.

But that’s not the biggest problem. This bill leapfrogs from content provenance to copyright, skipping over complex questions already under consideration in the courts and the Copyright Office to simply declare that attaching content provenance information to copyrighted content prohibits anyone from either using that content to create new or modified works with AI-enabled tools, or training AI models on it without permission. Giving content creators such control over how others use copyrighted work as the basis of new creative expression overrides generations of copyright law and the right of fair use. As written, for example, the bill could make it unlawful to quote excerpts from a book if the book review used an AI-based spelling and grammar checker, or was translated from one language to another using AI. It could also prohibit the use of watermarked content in documentary films, television shows, podcasts, educational works, research, critique, commentary, news reporting, or parody – all examples of fair use under copyright law.

We also believe that the processes involved in the development of generative AI models – web scraping, copying data sets, and training models on copyrighted content for non-infringing purposes – are fair use under copyright law. Fair use fuels many uses that have nothing to do with commercial generative AI, like academic and scientific research; it supports open source projects, small startups, AI researchers, and other competitive entrants in the tech marketplace; and it leads to more powerful and inclusive models built on better, more diverse sources of data. (The output of AI tools may violate copyright, and strong legal remedies are already in place to address that. But it is a separate question from how the models are trained.)

But if an official determination that AI training is fair use – or not – needs to be made, there are already two processes in motion to do so. First, multiple lawsuits have been filed against the major AI companies by news organizations, publishers, and record labels. (Notably, major trade associations from each of these industries have endorsed the COPIED Act, implying they’re not confident which way the lawsuits will go.) These suits will allow for decisions based on the specific facts at issue in each case – for example, in the case of news, are the AI models training on the facts of news articles (which cannot be copyrighted), or on publishers’ specific expression of them? Second, there is an investigation, including an open comment period, at the Copyright Office that will result in a report promised by the end of the year. The report will specifically address “the legal implications of training AI models on copyrighted works as well as the allocation of potential liability for AI-generated outputs that may infringe.”

Either or both of these processes are more appropriate and comprehensive methods of resolving these complex issues. So are the House and Senate Judiciary Committees and their Intellectual Property subcommittees, which include several senators and representatives who have built their careers in part on copyright and intellectual property issues.

There are other issues with this bill. Again, its language is far too broad, including in its definition of “covered content” (which could sweep in works that are in the public domain). It presumes government agencies can quickly solve the complex technical issues associated with watermarking and labeling, and develop “standards” for detecting synthetic content – a worthy task, though some researchers think it’s impossible. And it creates an “optional opt-in” system that will make it even more challenging for users to understand what they’re looking at online. We urge Congress to reject this bill.