Today's news that Google shut down music blogs that were accused of copyright infringement is rightfully getting plenty of coverage. Mostly, it is being held up as another in a long line of examples of problems with the DMCA notice-and-takedown system.
This is a great example of a problem with the DMCA because, at least according to The Guardian, the notices that Google relied on to delete the blogs were woefully incomplete. Google should not have acted until it had proper notices from rights holders, including the name of the actual work allegedly infringed. Since many of the notices did not even include this information, there was no way for the bloggers to file a DMCA counternotice. For an update on the DMCA part of this story, check out Wired and Google's own post. Of course, the DMCA confusion does a great job of illustrating the points about filtering below.
It is important that this story is being used to point out problems with the DMCA and with Google's policies for dealing with DMCA complaints, and to show how complicated DMCA implementation can be. What is equally important, if less commented on, is what it can tell us about copyright filtering.
One of Public Knowledge's often-stated objections to copyright filtering is that, while a machine can (sometimes) identify a work, it cannot tell if the copy of the work violates copyright. Usually we talk about this as a so-called “judge in a box” problem related to fair use. Although fair use famously has four guiding factors, applying the factors to a real-world situation is far from an automatic process. It requires a judge to actually balance the factors and come to a (hopefully) reasoned and nuanced decision. Since you can't pack a judge into a box and plug it into the network, there is no way to build a filter that reliably recognizes fair uses.
Fortunately, the Google takedown story is much less nuanced than any judge-in-a-box problem. While the bloggers might be able to make a fair use argument, at least one does not have to. I Rock Cleveland, a music blog, was asked by artists and labels to post tracks on the blog. In the Blogger help forums, I Rock Cleveland notes that they work with bands and labels and have permission to post the files. I Rock Cleveland posted a list documenting all of the permissions to host files. However, Google erased I Rock Cleveland anyway.
This highlights a critical flaw with filtering technologies. At best, a filter may be able to identify a file (assuming that it is not encrypted). However, identification is only part of enforcing copyright. Copyright is not merely a question of “is there a copy?” Rather, it is a question of “is there an unlawful copy?” Part of answering that question is determining if the copier has permission. Until someone develops a single database with every copyrighted file and every permission ever granted for every one of those files, the second question is almost impossible to answer. A filter may be able to tell if someone is sending me a music file, but it is a lot harder to know if the person sending me that file has permission.
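To make the gap between identifying a file and judging its legality concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the catalog, the permission records, the function names, and the use of an exact SHA-256 hash as a stand-in for whatever fingerprinting a real filter might use. It does not describe any actual filtering product.

```python
import hashlib

# Hypothetical catalog mapping a file fingerprint to a known work.
KNOWN_WORKS = {hashlib.sha256(b"track-bytes").hexdigest(): "Example Track"}

# Hypothetical permission records as (work, poster) pairs. In practice this
# set is always incomplete, because no single database of permissions exists.
KNOWN_PERMISSIONS = set()


def identify(data):
    """Answer the first question: is this a copy of a known work?"""
    return KNOWN_WORKS.get(hashlib.sha256(data).hexdigest())


def classify(data, poster):
    """Try to answer the second question: is this copy unlawful?"""
    work = identify(data)
    if work is None:
        return "not identified"
    if (work, poster) in KNOWN_PERMISSIONS:
        return "permitted"
    # A missing record does not mean permission was never granted.
    # The filter simply cannot tell.
    return "unknown"
```

The interesting branch is the last one. A filter that treats "unknown" as "infringing" will block a blogger who has permission that the filter never heard about.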
In this case, someone associated with the files appears to have asked Google to take down the blogs. However, as I Rock Cleveland's list makes clear, someone associated with the files also appears to have given permission to post the files. If record labels and artists cannot figure out who they allow to post their stuff, how can some automated filter in an ISP's basement be expected to? More importantly, why should blogs like I Rock Cleveland, which appears to have played by the rules, have their creative work wiped from the face of the earth because rights holders cannot get their act together?