LexisNexis: Riding the Patent Pony

April 25, 2015

Need patent information? Lots of folks believed that making sense of the public documents available from the USPTO was the road to riches. Before I kicked back to enjoy the sylvan life in rural Kentucky, I did some work on Fancy Dan patent systems. There was a brush with the IBM Intelligent Patent Miner system. For those who do not recall their search history, you can find a chunk of information in “Information Mining with the IBM Intelligent Miner Family.” Keep in mind that the write up is about 20 years old. (Please notice that the LexisNexis system discussed below uses many of the same time-worn techniques.)

Patented dog coat.

Then there was the Manning & Napier “smart” patent analysis system, which displayed the output of its analyses in three-D visualizations. I bumped into Derwent (now Intellectual Property & Science) and other Thomson Corp. solutions as well. And, of course, there was my work for an unnamed, mostly clueless multi billion dollar outfit related to Google’s patent documents. I summarized the results of this analysis in my Google Version 2.0 monograph, portions of which were published by BearStearns before it met its thrilling end seven years ago. (Was my boss the fellow carrying a box out of the Midtown BearStearns’ building?)

Why the history?

Well, patents are expensive to litigate. For some companies, intellectual property is a revenue stream.

There is a knot in the headphone cable. Law firms are not the go go business they were 15 or 20 years ago. Law school grads are running gyms; some are Uber drivers. Like many modern post Reagan businesses, concentration is the name of the game. For the big firms with the big buck clients, money is no object.

The problem in the legal information business is that smaller shops, including the one and two person outfits operating in Dixie Highway type of real estate, do not want to pay the $200 and up per search that commercial online services charge. Even when I was working for some high rollers, the notion of a five or six figure online charge elicited what I would diplomatically describe as gentle push back.

I read “LexisNexis TotalPatent Keeps Patent Research out of the Black Box with Improved Version of Semantic Search.” For those out of touch with online history, I worked for a company in the 1980s which provided commercial databases to LexisNexis. I knew one of the founders (Don Wilson). I even had reasonably functional working relationships with Dan Prickett and people named “Jim” and “Sharon.” In one bizarre incident, a big wheel from LexisNexis wanted to meet with me in the Cherry Hill Mall’s parking lot across from the old Bell Labs’ facility where I was a consultant at the time. Err, no thanks. I was okay with the wonky environs of Bell Labs. I was not okay with the lash up of a Dutch and British company.

Snippet of code from a Ramanathan Guha invention. Guha used to be at IBM Almaden and he is a bright fellow. See US7593939 B2.

What does LexisNexis TotalPatent deliver for a fee? According to the write up:

TotalPatent, a web-based patent research, retrieval and analysis solution powered by the world’s biggest assortment of searchable full-text and bibliographic patent authorities, allows researchers to enter as much as 32,000 characters (comparable to more than 10 pages of text)—much over along a whole patent abstract—into its search industry. The newly enhanced semantic brains, pioneered by LexisNexis during 2009 and continually improved upon utilizing contextual information supplied by the useful patent data offered to the machine, current results in the form of a user-adjustable term cloud, where the weighting and positioning of terms may be managed for lots more precise results. And countless full-text patent documents, TotalPatent in addition utilizes systematic, technical also non-patent literature to go back the deepest, most comprehensive serp’s.

I found this explanation of the benefits of the system interesting:

Semantic sort through TotalPatent frees internet protocol address professionals from the need to learn complicated Boolean search providers while also employed in combination with standard Boolean search. Through its patent-pending “Visualize & Compare” device, TotalPatent shows outcomes between Boolean, semantic along with other online searches, permitting scientists contrast the value of results between various search types.

I knew from my experiences and from research I did for various clueless big outfits that there were some painful items of information stuck in the Reed Elsevier bespoke brogues. I am digging into my memory which is less and less reliable (like many online services) for these recollections so enjoy them or not:

Patent searching used to be a money spinner for law firms. Patent searching is still a big business, but it is increasingly concentrated. As a result, for the paying customer, patent information is a boutique business. Like an Aston Martin, patent information from commercial services is often eye wateringly expensive.

Patent searching is no longer the exclusive domain of the commercial online services. There are low cost services which obviate much of the pain and hassle associated with accessing the US government service at www.uspto.gov. If you have not looked at SumoBrain’s FreePatentsOnline.com, check out the service. Be sure to note the advanced search functions. “Regular searches” are not too useful in my opinion. You can see why by running queries for “Alon Halevy” on Google’s free patent search system at https://www.google.com/?tbm=pts&gws_rd=ssl.

Fancy content processing provides some useful insights, but these “outputs” must be checked by billable humans or enthusiastic, but often uninformed inventors, which does little to ameliorate the time honored way to figure out patent information—a human has to read, review, think, and synthesize information.

Patents are the legal version of a sonnet. The rules are strict. Within those rules, let metaphors flower in the rich soil of ambiguity. Google described an important software invention as a “janitor.” Clever, eh.

Some patents contain the name of an inventor who then files other patents using variants of the original name. I have written about this characteristic, citing the Babak Amir Parviz documents. I documented Parviz, Parvis, Amir Parviz, Babak Parviz, and other variations. Most patent analysis systems do not do industrial strength entity resolution particularly well.
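To make the name variant problem concrete, here is a minimal sketch of one way to group inventor name variants. It is an illustration under stated assumptions, not how any commercial patent system works: the matching rule, the 0.8 threshold, and the function names (`normalize`, `same_inventor`, `cluster`) are all hypothetical. It uses only `difflib` from the Python standard library and assumes the surname is the last token, so "Parviz, Babak" style records would need extra handling.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase the name, replace punctuation with spaces, split into tokens."""
    cleaned = "".join(
        ch if ch.isalpha() or ch.isspace() else " " for ch in name.lower()
    )
    return cleaned.split()


def similar(a, b, threshold=0.8):
    """Fuzzy string match; the 0.8 threshold is an arbitrary assumption."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def same_inventor(name_a, name_b):
    """Hypothetical rule: the surnames (last tokens) nearly match, and every
    given-name token in the shorter form appears, fuzzily, in the longer form."""
    a, b = normalize(name_a), normalize(name_b)
    if not a or not b:
        return False
    if not similar(a[-1], b[-1]):
        return False
    short, long_ = sorted((a[:-1], b[:-1]), key=len)
    return all(any(similar(t, u) for u in long_) for t in short)


def cluster(names):
    """Greedy grouping: compare each name with the first member of each group."""
    groups = []
    for name in names:
        for group in groups:
            if same_inventor(name, group[0]):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


if __name__ == "__main__":
    variants = [
        "Babak Amir Parviz", "Parviz", "Parvis",
        "Amir Parviz", "Babak Parviz", "Alon Halevy",
    ]
    for group in cluster(variants):
        print(group)
```

A rule this loose will happily merge two different people who share a surname, which is the point: the output is a list of candidates that a human still has to check against assignees, dates, and co-inventors.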

Is it feasible to do away with fielded searching and Boolean? Even the impressive but now low, low profile Brainware trigram system required a skilled human to make sense of the pattern matching outputs. IHS’s Invention Machine, now rebranded as Goldfire, assumes that a specialist will work the old fashioned way to make sense of the outputs of that system.
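For readers who have not met trigram matching, the general idea can be sketched in a few lines. This is a generic character trigram comparison scored with Jaccard overlap, offered as an assumption about the family of techniques; it is not Brainware's actual method, and the function names and sample titles are made up for illustration.

```python
def trigrams(text):
    """Return the set of three-character windows in the padded, lowercased text."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap of the two trigram sets: 0.0 (disjoint) to 1.0 (identical)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def rank(query, candidates):
    """Sort candidates by descending trigram similarity to the query."""
    return sorted(
        candidates,
        key=lambda c: trigram_similarity(query, c),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical titles, not real patent records.
    titles = [
        "Method for patent searching",
        "Patent search system",
        "Protective coat for a dog",
    ]
    for title in rank("patent search", titles):
        print(round(trigram_similarity("patent search", title), 2), title)
```

The scores reward shared character sequences and nothing else. A title that describes the same invention in different words scores low, so someone still has to read the ranked list and decide what matters.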

The “new” LexisNexis solution pivots on semantic search. Here’s what LexisNexis’ new service delivers:

The inclusion greater than three million complete text applications and provided patents from the European Patent workplace, in addition to full text journals and summit procedures from Elsevier, to keep to broaden the comprehensive database offered by TotalPatent. Changes towards user interface including to your logic behind the weighting and placement associated with the suggested terms to produce more precision and accuracy for patent search queries.

I am supportive of the systems and methods used by some semantic solutions. On the other hand, semantic methods are often one subsystem or one component in a quite complex content processing system.

Even the NGIA (next generation information access) systems analyzed for my recent monograph CyberOSINT have some severe semantic limitations. Patents, like other compound documents, contain images. Semantic methods struggle with the type of images in patent documents published in Europe, Japan, and the US. Entity extraction can be difficult when the content is specifically processed to parse and put in context the names of people, affiliations / companies, and named technologies. The challenge with any patent system that is available online is that a patent is often the tip of the filing iceberg. There are versions; there are emendations; there are wild and crazy documents with different numbers, titles, and other important information used intentionally or inadvertently. In short, patent documents are a very big headache.

I don’t want to be skeptical, but IBM gave patent analysis the old college try. Go, Big Blue. The Manning & Napier organization invested significant sums and person years of effort in its promising patent system. Thomson continues to labor in this vineyard, and it too is making an attempt to convert content processing marketers’ assertions into sustainable revenue from NGIA technology.

Net net: Those with a budget or a tenured teaching position in a law school or information science program will test LexisNexis’ latest offering. A small number of law firms will kick the tires. But substantial, sustainable revenues from enhanced search of patent documents are likely to fall short of the mark.

When will the versions, variants, and images be cracked?

When lawyers write claims that make sense. Until then, when I have to deal with patents, I print out some patents and applications and start reading. I then research details, read, and analyze. I do use Fancy Dan content processing technology from multiple vendors.

But I still have to kill many trees and breathe cubic meters of fumes from high light pens. The process often seems to have no end. Come to think of it, some law firms and expert witnesses like the process. Billable hours are often a benefit.

Stephen E. Arnold monitors search, content processing, text mining
and related topics from his high-tech nerve center in rural Kentucky.
He tries to winnow the goose feathers from the giblets. He works with colleagues
worldwide to make this Web log useful to those who want to go
"beyond search". Contact him at sa [at] arnoldit.com. His Web site
with additional information about search is arnoldit.com.