The Increasing Importance of Digital Fair Use: Reaction to the HathiTrust Decision

Yesterday, a federal district court in New York decided that five universities' digitizing of their library collections was a fair use, rejecting the Authors Guild's attempt to halt the efforts. For a good, brief rundown of the decision, see James Grimmelmann's early post here. Also, see Nancy Sims here, and Kenneth Crews here.

Basically, HathiTrust, a non-profit digital library partnership, and five universities (Michigan, the UC system, Wisconsin, Indiana, and Cornell) partnered with Google to scan books in the university libraries. The digitized books would be used to create a searchable index, so that users could conduct text searches of entire library collections. Digital copies of the books would also be made available to print-disabled patrons for use on accessible devices, and plans were in the works for potential orphan works to be identified, and, if the authors could not be found, made available to researchers in the universities. The Authors Guild sued, insisting that the making of the scans was copyright infringement not allowed under fair use.

The reactions I've linked to above can probably give you a better synopsis of the case and the decision than I could, but I wanted to touch upon one particular aspect of the fair use ruling—the issue of available licenses for things like scanning, and the fourth fair use factor.

People typically will say that there are four factors to be considered in fair use. This is a bit of an oversimplification. There's an infinitude of facts that a court can consider in making a fair use determination, and lots of things it should consider. But the oft-quoted four factors are the things a court is required to consider: (1) the purpose and character of the use; (2) the nature of the original copyrighted work; (3) the amount and substantiality of the portion used in relation to the work as a whole; and (4) the effect of the use on the potential market for the original work.

As to this last point, courts have been clear that it's not just a matter of straight one-to-one substitution—you can still be liable for infringement even if your use of the work (say, adapting a novel into a comic book) won't directly substitute for the original. If you're making that comic book adaptation, you're cannibalizing the market for licensed comic book versions of the novel.

But this can be taken to absurd extremes. Given how creative lawyers can get with licensing, it's possible to slice and dice any particular use of a work into some sort of a license. But the fact that someone can craft a license for a particular work doesn't automatically mean that there's a market that the new use harms.

That's what happened in this case. The Authors Guild was arguing that someone like the Copyright Clearance Center could have issued licenses to libraries for the scanning of books, and that, by scanning without permission, HathiTrust and the universities were undercutting the market for book-scanning licenses. The court, though, didn't think much of this argument.

Were a court automatically to conclude in every case that potential licensing revenue were impermissibly impaired simply because the secondary user did not pay a fee for the right to engage in the use, the fourth factor would always favor the copyright owner…A copyright holder cannot preempt a transformative market.

(internal citations and quotations omitted)

This makes sense. Movie producers could just as easily (actually, probably more easily) draft up licenses for film critics to use clips of their movies in reviews. CNN could draft licenses that it could issue to Jon Stewart for use of clips where he mocks them. Yet the fact that those licenses could exist—or even if they did exist—doesn't somehow move those forms of criticism and parody outside the scope of fair use. Nor does the fact that TV studios could issue licenses for people to record their shows mean that we have to get a license every time we boot up the TiVo.

In getting so license-crazy, people sometimes seem to forget that copyright law doesn't give a creator the right to prevent all uses of her work—it only specifies certain particular things she can prevent others from doing, most notably making reproductions of it, and, in the case of things like movies and music, performing the works publicly.

Uses of copyrighted works that don't fit those specific categories (there are six in total, in section 106) just aren't infringements. Those sorts of uses have always existed. No one can prevent you from reading a book however you like, or watching a movie or listening to an album privately. A publisher could draft and issue (or withhold) all the reading licenses it wants, but no one has to pay for them. Reading aloud to your kids doesn't implicate copyright, nor does arranging your books on a shelf.

We've gone for centuries without such idiotic licenses being bandied about, because they would be laughed out of court in most cases. But with everything being digitized, we're facing a significant change. Everything digital must be, in some sense, reproduced and copied in order for it to work. Just to view an ebook, just to watch a streamed video, you have to make digital copies along the way. That, at least on its face, implicates those section 106 rights.

And so out come the license demands and lawsuits. While I never had to get a license to read a book aloud in private, we have the Authors Guild threatening Amazon with a lawsuit because the Kindle could read books aloud. Or here, an accusation that the digitizing of books to make, among other things, a searchable index and accessible books requires some sort of license.

This is not what the reproduction right was intended to do: convert unrestricted uses of copyrighted works into for-pay events. But that's exactly what would happen were it not for the flexibility of fair use. Mike Madison has a post detailing how, increasingly, judges and courts are coming to understand how computers and digital technology work. The making of reproductions, a use that used to be the clear-cut action on which infringement hung, can now, under many circumstances, be routine and perfectly normal, especially when the reproductions are made simply in service of a known legitimate use:

The “computer” version of analog activities used to be the hard cases; it was often assumed that “analog” uses (indexing, annotating, creating access to reserve copies) was likely to be fair use. If the “computer” version is now likely to be deemed fair use, then working backward leads me to the probable conclusion that the fair use case for analog equivalents has gotten even stronger. “Computers” is becoming the new copyright normal. And if these fair use cases turn out to generalize even a bit beyond the education context, then the “copying” that “computers” do will turn out to be less fraught than ever as a presumptive basis for finding infringement.

This goes back to why we have the 106 rights in the first place. They're to protect the limited monopoly that the law gladly grants to copyright holders. The prohibitions on reproduction and public performance aren't meant to serve as snares for normal usage, but to prevent the unauthorized multiplication of works that would undercut the copyright holder's monopoly. And while specific parts of the law have baked in this idea of incidental digital copying as the new normal (see section 117, for instance), the rest of the Copyright Act has some catching up to do. Until then, fair use is what keeps every digital device—your phone, computer, tablet, or DVR—from automatically being ruled an infringement engine just for existing. Fair use preserves copyright law as something that actually works, not just some arcane set of rules that everyone ends up violating anyway just to get through the day. And by reconciling the necessary technology of our everyday lives with copyright law, fair use ensures that copyright law remains relevant in its protection of authors.