Jeff Goldenson at Harvard Law Library’s Digital Lab (Disclosure: I’ve just started consulting there) has been thinking about the benefits and pitfalls of embedding metadata into JPG images. That happens already, and some of it can be quite useful, although some can be a little creepy.

He and I were talking and began to wonder if there’d be utility in embedding Creative Commons license info into JPGs. So, let’s say you post a snapshot and you want to make it available under a Creative Commons license that allows people to reuse it so long as they attribute it to you and agree to let others reuse it under the same license. That information — including your preferred attribution and a link to the page you want it linked to — would be hidden within the JPG file.

Why bother? Because it would mean that the license info travels with the image. Otherwise, the chain of licenses and attributions can too easily be lost as B republishes a snap posted by A, and C republishes B, etc. The game of License Gossip just about ensures the chain of license info will not be unbroken.

At least as important, if this metadata were inserted in a standardized form, applications could begin using it, making the CC license both more useful and more visible, thus encouraging more people to use it. For example, someone could write a Firefox extension that would insert under any CC’ed image a line such as: “Share this image. Just be sure to include this attribution: (cc) [name] [license],” etc.

For this idea to have any effect, someone (Creative Commons?) would have to promulgate the standardized format for the embedded info, someone would have to write a metadata editor/inserter, and apps would have to add features that take advantage of it. It’d help, for example, if Flickr were to let us set a preference for embedding the metadata into any photo we post there under a CC license.
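To make the "metadata editor/inserter" part concrete, here is a minimal sketch in Python of how license info could ride along inside a JPG. It assumes the info is carried as an XMP-style packet in an APP1 segment, which is one existing way JPGs hold metadata. The packet layout and the helper names (`build_license_packet`, `embed_packet`, `read_packet`) are made up for illustration; whatever format actually got standardized could look quite different.

```python
import struct
from xml.sax.saxutils import quoteattr

# Signature that marks an APP1 segment as holding an XMP packet.
XMP_SIGNATURE = b"http://ns.adobe.com/xap/1.0/\x00"


def build_license_packet(license_url, attribution_name, attribution_url):
    """Build a small RDF/XML packet. The field layout is an assumption,
    not a published standard."""
    xml = (
        '<x:xmpmeta xmlns:x="adobe:ns:meta/">'
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
        '<rdf:Description rdf:about="" '
        'xmlns:cc="http://creativecommons.org/ns#" '
        f"cc:license={quoteattr(license_url)} "
        f"cc:attributionName={quoteattr(attribution_name)} "
        f"cc:attributionURL={quoteattr(attribution_url)}/>"
        "</rdf:RDF></x:xmpmeta>"
    )
    return xml.encode("utf-8")


def embed_packet(jpeg_bytes, packet):
    """Return a copy of the JPG with the packet added as an APP1 segment."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPG: missing start-of-image marker")
    payload = XMP_SIGNATURE + packet
    if len(payload) + 2 > 0xFFFF:
        raise ValueError("packet too large for a single APP1 segment")
    # The 2-byte segment length counts itself plus the payload.
    segment = b"\xff\xe1" + struct.pack(">H", len(payload) + 2) + payload
    pos = 2
    # If a JFIF APP0 header directly follows the start marker, keep it first.
    if jpeg_bytes[pos:pos + 2] == b"\xff\xe0":
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        pos += 2 + length
    return jpeg_bytes[:pos] + segment + jpeg_bytes[pos:]


def read_packet(jpeg_bytes):
    """Return the embedded packet bytes, or None if the JPG carries none."""
    pos = 2
    while pos + 4 <= len(jpeg_bytes) and jpeg_bytes[pos] == 0xFF:
        marker = jpeg_bytes[pos + 1]
        if marker in (0xDA, 0xD9):  # image data or end of image: stop looking
            break
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        body = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and body.startswith(XMP_SIGNATURE):
            return body[len(XMP_SIGNATURE):]
        pos += 2 + length
    return None


if __name__ == "__main__":
    # Hypothetical file name and URLs, for illustration only.
    with open("snapshot.jpg", "rb") as f:
        original = f.read()
    packet = build_license_packet(
        "http://creativecommons.org/licenses/by-sa/3.0/",
        "A. Photographer",
        "http://example.com/snapshot",
    )
    with open("snapshot-cc.jpg", "wb") as f:
        f.write(embed_packet(original, packet))
```

An app on the reading side, such as the browser extension imagined above, would call something like `read_packet` and build its attribution line from whatever it finds. Note that republishing tools that strip metadata would still break the chain, so this only helps where the file is passed along intact.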

Downsides? Well, the idea is unlikely to take off. And I suppose there’s a chance that the Big Content industry would start to insert their copyright info using the same mechanism, and thus would have something like a “broadcast flag” with which they could try to beat up browser makers and others who create apps that display images: “Whenever your app displays images with copyright metadata, we insist you turn off the Copy entry on the context menu.” (IANAL, and I believe such a demand would have no legal basis, but since when does that have anything to do with it?)

Care to punch holes in this idea? Point to people who have already done it?

Here’s a summary of the summary Google provides [pdf], although IANAL and I encourage you to read Google’s summary itself, which is written in non-legal language and is only 2 pages long:

1. The agreement now has been narrowed to books registered for copyright in the US, or published in the UK, Australia or Canada.

2. There have been changes to the terms of how “orphaned works” (books under copyright whose rightsholders can’t be found) are handled. The revenue generated by selling orphaned works no longer will get divvied up among the authors, publishers and Google, none of whom actually have any right to that money. Instead it will go to fund active searching for the rightsholders. (At the press call covered by Danny Sullivan [see below], the Authors Guild rep said that with money, about 90% of missing rightsholders can be found.) After holding those revenues in escrow (maybe I’m using the wrong legal term) for ten years (up from five in the first settlement), the Book Rights Registry established by the settlement can ask the court to disburse the funds to “nonprofits benefiting rightsholders and the reading public”; I believe in the original, the Registry decided who got the money. So, in ten years there may be a windfall for public libraries, literacy programs, and maybe even competing digital libraries. (The Registry may also (determined by what?) give the money to states under abandoned property laws. (No, I don’t understand that either.))

The new settlement creates a new entity: A “Court-approved fiduciary” who represents the rightsholders who can’t be found. (James Grimmelmann [below] speculates interestingly on what that might mean.)

3. The settlement now explicitly states that any book retailer can sell online access to the out-of-print books Google has scanned, including orphaned works. The revenue split will be the same (63% to the rightsholder, “the majority of” 37% to the retailer).

4. The settlement clarifies that the Registry can decide to let public libraries have more than a pitiful single terminal for public access to the scanned books. The new agreement also explicitly acknowledges that rightsholders can maintain their Creative Commons licenses for books in the collection, so you could buy digital access and be given the right to re-use much or all of the book. Rightsholders also get more control over how much Google can display of their books without requiring a license.

5. The initial version said Google would establish “market prices” for out-of-print books, which seemed vague: what counts as the market for out-of-print books? The new agreement clarifies the algorithm, aiming to price them as if in a competitive market. And, quite importantly, the new agreement removes the egregious “most favored nation” clause that prevented more competitive deals from being made with other potential book digitizers.

From my non-legal point of view, this addresses many of the issues. But not all of them.

I’m particularly happy about the elements that increase competition and access. It’s big that Amazon and others will be able to sell access to the out-of-print books Google has scanned, and sell access on the same terms as Google. As I understand it, there won’t be price competition, because prices will be set by the Registry. Further, I’m not sure whether retailers will be allowed to cut their margins and compete on price: If the Registry prices an out-of-print book at $10, which means that $6.30 goes to the escrow account, will Amazon be allowed to sell it to customers for, say, $8, reducing its profit margin? If so, then how long before some public-spirited entity decides to sell these books to the public at their cost, eschewing entirely the $3.70 (or the majority of that split, which is what they’re entitled to)? I don’t know.
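To spell out the arithmetic in that example, here is a small Python sketch. It assumes the rightsholder’s 63% is always computed on the Registry’s price, even when a retailer discounts. Google’s summary doesn’t say that, and it is exactly the open question, so treat the function names and the discounting rule as hypothetical.

```python
from decimal import Decimal

# 63% goes to the rightsholder (or to escrow for orphaned works).
RIGHTSHOLDER_SHARE = Decimal("0.63")


def split(registry_price):
    """Return (rightsholder amount, remainder) for a Registry-set price.

    The remainder is an upper bound on the retailer's cut, since the
    retailer gets only "the majority of" the 37%.
    """
    price = Decimal(registry_price)
    rightsholder = (price * RIGHTSHOLDER_SHARE).quantize(Decimal("0.01"))
    return rightsholder, price - rightsholder


def retailer_margin(registry_price, sale_price):
    """Margin left if the retailer sells below the Registry price but still
    owes the full rightsholder amount (an assumption, not a settlement term)."""
    rightsholder, _ = split(registry_price)
    return Decimal(sale_price) - rightsholder


print(split("10.00"))                    # the $10 book from the example
print(retailer_margin("10.00", "8.00"))  # selling at $8
print(retailer_margin("10.00", "6.30"))  # selling "at cost"
```

Under that assumption, the floor for a public-spirited retailer would be the rightsholder’s $6.30, not zero.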

I also like the inclusion of Creative Commons licensing. That’s a big deal since it will let authors both sell their books and loosen up the rights of reuse.

As far as getting rid of the most favored nation clause: Once the Dept. of Justice spoke up, it’s hard to imagine it could have survived more than a single meeting at Google HQ.

The Open Book Alliance (basically an everyone-but-Google consortium) is not even a little amused, because the new agreement doesn’t do enough to keep Google from establishing a de facto monopoly over digital books. The Electronic Frontier Foundation is not satisfied because no reader privacy protections were added. Says the ACLU: “No Settlement should be approved that allows reading records to be disclosed without a properly-issued warrant from law enforcement and court orders from third parties.”

Danny Sullivan live-blogged the press call where Google and the other parties to the settlement discussed the changes. It includes a response to Open Book Alliance’s charges.

By the way, if Ellen DeGeneres wants to respond in a reasonable and constructive way to the lawsuits over her use of song snippets to dance to, she could always start using Creative Commons-licensed music, with a nice plug for the open-hearted musicians making our lives more tuney.