Beyond Search, by Stephen E. Arnold (http://arnoldit.com/wordpress/feed/atom/, updated 2015-08-02)

I read “The Analytics Journey Leading to the Business Data Lake.” Data lake is one of the terms floating around (pun definitely intended!) to stimulate sales. If one has a great deal of water, one needs a place to put it. Water may be dammed, piped, used, recycled, and dumped, but storage is the key.

Enter EMC, a company which is in the business of helping those with water store it and make use of that substance.

The write up reflects effort. I assume there was a PowerPoint slide deck in the mix. There are some snazzy graphics. Here’s one that caught my eye:

Instead of enterprise search being the go-to enterprise software solution, EMC has slugged in the following umbrella terms:

Advanced analytics (obviously because regular analytics just are not zippy enough)

Knowledge layer (I remain puzzled about knowledge because I have a tough time defining it. In fact, I resigned from my for-fee knowledge management column because I just don’t know what the heck “knowledge” means.)

The unfathomable data lake (yep, pun intended). What’s wrong with the word “storage” or “database” by the way?

Master data which is also baffling. Is there servant data too?

Machine data. Again I have no clue what this means.

The chart scatters undefined and fuzzy buzzwords like a crazed Jethro Tull, a water soluble blend of Jethro Tull (inventor of the seed drill) and Jethro Tull (the commercially successful and eccentric rock band).

The write up is important because EMC has sucked in the jargon and assertions once associated with enterprise search and applied them to the dark and mysterious data lake.

I highlighted:

Our data lake is one logical data platform with multiple tiers of performance and storage levels to optimally serve various data needs based on Service Level Agreements (SLA). It will provide a vast amount of structured and unstructured data at the Hadoop and Greenplum layers to data scientists for advanced analytics innovation. The higher performance levels powered by Greenplum and in-memory caching databases will serve mission-critical and real-time analytics and application solutions. With more robust data governance and data quality management, we can ensure authoritative, high-quality data driving all of EMC business insights and analytics driven applications using data services from the lake.

Ah, the Mariana Trench of enterprise information: Governance. Like “knowledge” and “advanced analytics”, governance has euphony. I think of the water lapping against the shore of Lake Paseco.

So what? Several observations:

This type of “suggest lots” marketing ended poorly for a number of companies that used the same rhetoric when marketing search.

The folks who swallow this bait are likely to find themselves in a most uncomfortable spot.

The problems associated with making use of information to improve decision making by reducing risk are not going to be solved by crazy diagrams and unsupported assertions.

EMC has been able to return to revenue growth. But the company’s profit margin has flatlined.

I am not sure that increasing the buzzword density in marketing write ups will help angle the red lines to low earth orbit. With better margins, it is much easier to check out the topographic view and see where lakes meet land.

the competitive perspective is almost always the least important aspect in managerial decision-making. Internal operational issues including execution, budgets, and deadlines are paramount in a company’s deliberation, but what other players will do is hardly ever in focus. This “island mentality” is surprisingly prevalent among talented, seasoned managers.

What’s the fix?

Gilad seems to realize the magnitude of the challenge. He states:

a company can’t force its managers to use information optimally. It can, however, ensure they at least consider it. In many areas of the corporation, mandatory reviews are routine- regulatory, legal, financial reviews are considered the norm. Ironically, competitive reviews are not, even though the cost of missing out on understanding the competitive environment can be enormous.

In short, MBAs talk the way they learned in Harvard-type business schools. The walk, on the other hand, is different.

From my point of view, biased by my work at Booz, Allen & Hamilton before it became the two separate outfits Booz and Booz, Allen, I hear a different drum cadence.

Managers are unable to deal effectively with available information. As a result, many are emulating the leatherback sea turtle. Shutting down and making decisions based on what other turtles say is the preferred course of action.

A number of MBAs shift the discussion to data. The notion that competitive insights may be based on inputs which are tough to quantify is sufficient evidence to accept the outputs of an Excel spreadsheet or some canned analysis ginned up by an intern at a mid tier consulting firm.

Quite a few senior managers, in my experience, live in a state of fear. The happy attitude and rah rah, go team approach is like a coat of drive through car wax. Beneath the surface, there is real concern about keeping a job, dealing with life’s little challenges, and being able to pull off another Board meeting.

Competitive intelligence, like business intelligence and military intelligence, gets quite a bit of marketing attention. But in today’s business environment, turtles, data addicts, and cheerleaders stumble with basics.

The evidence falls readily to hand: Security woes at government agencies, fumbling with immigrants in Calais, automobiles which can be hacked, and enterprise search systems which cannot locate information.

From my point of view, the problem is cross cultural and deeper than competitive intelligence. Executives struggle with strategy, planning, and personal conduct too.

Perhaps business schools and management experts are not symptoms but triggers?

Stephen E Arnold, August 2, 2015

Want to shake free of the proprietary search and retrieval systems? I don’t blame you. Irregular and slow bug fixes and licensing handcuffs are two good reasons. Remember: The cost of search is not the licensing fee. The cost is a collection of fees, purchases, and expenses with which every search system I am familiar with is burdened.

Elasticsearch is the go-to solution at this time in my opinion. If you want a useful overview of Elasticsearch, check out the Slideshare presentation “Introduction to ElasticSearch.” You may have to “join” LinkedIn / Slideshare to do anything useful, however.

The deck was prepared / delivered in the spring of 2015 by Roy Russo who is affiliated with or is “DevNexus.” The information is jargon free, an approach which the whiz kids at LucidWorks (Really?) may want to imitate. The presentation does contain a couple of buzzwords like NGram, but no MBA speak.

Stephen E Arnold, August 1, 2015

I am no specialist in the arcane art of legal eagle spotting. I did notice some references to a dust up between an outfit called Speedtrack and licensees of Endeca’s ageing search technology.

The Speedtrack outfit seems to have rights to an invention called “Method for Accessing Computer Files and Data, Using Linked Categories Assigned to Each Data File Record on Entry of the Data File Record.” This is explained brilliantly in US5544360, filed in February 1995.

Here’s a diagram showing how the user can click on categories to locate information. No typing required.

Compare this to Endeca’s invention, “Hierarchical Data Driven Navigation System and Method for Information Retrieval.” This is US7062483, filed in 2001. You may find US7035864 and US7325201 interesting as well.

First, unlike res judicata, which is a defense that is personal to the parties in a prior litigation, the Kessler Doctrine “attaches to the [accused] product itself” and precludes a patentee from reasserting the same patent against the same (or “essentially the same”) product in a subsequent action.

Then I noted:

Second, the Federal Circuit ruled that the Kessler doctrine may be raised by customers as well as the product manufacturer or supplier.

What I found fascinating was this infringement related statement attributed to the presiding legal eagle:

Third, the Federal Circuit held that the Kessler doctrine applied to Speedtrack’s claim even though the Endeca software allegedly infringed only when combined with the customer’s own computer hardware.

I recall that Endeca’s faceted navigation burst upon the scene in the late 1990s. Who knew that Jerzy Lewak (co founder of Speedtrack), Slawek Grzechnik, and Jon Matousek seemed to be trying to figure out a way around the problem of keyword search before Endeca?

I wonder if Oracle was surprised too. I have a hunch Speedtrack was.

Stephen E Arnold, August 1, 2015

Many years ago I loaded a software application from Autonomy. The application watched what I was “doing” and automatically displayed search results sort of relevant to what the software thought I was writing.

Darktrace monitors digital flows for signals. Instead of displaying search results, the system alerts security officers of a probable issue. Maybe Kenjin is not the influencer of the system. No matter. The company is “valued at more than $100 million.”

Dr. Lynch is once again moving into a market sector in which some of the competitors are likely to be unaware of Dr. Lynch’s electric powered kitchen appliance taking over their coffee machine.

Hewlett Packard may want to ask and answer: “Why did we lose this fellow?”

My hunch is that HP won’t ask the question and may not admit that the answer is not just technology. The murky world of management spoils an otherwise pristine cup of java. That’s a $100 million cup of joe.

Stephen E Arnold, August 1, 2015

I am now getting interested in the marketing efforts of IBM Watson’s professionals. I have written about some of the items which my Overflight system snags.

I have gathered a handful of gems from the past week or so. As you peruse these items, remember several facts:

Watson is Lucene, home brew scripts, and acquired search utilities like Vivisimo’s clustering and de-duplicating technology

IBM said that Watson would be a multi billion dollar business and then dropped that target from 10 or 12 Autonomy scale operations to something more modest. How modest the company won’t say.

IBM has tallied a baker’s dozen of quarterly reports with declining revenues

IBM’s reallocation of employee resources continues as IBM is starting to run out of easy ways to trim expenses

The good old mainframe is still a technology wonder, and it produces something Watson only dreams about: Profits.

Here we go. Remember high school English class and the “willing suspension of disbelief.” Keep that in mind, please.

ITEM 1: “IBM Watson to Help Cities Run Smarter.” The main assertion, which comes from unicorn land, is: “Purple Forge’s “Powered by IBM Watson” solution uses Watson’s question answering and natural language processing capabilities to let users ask questions and get evidence-based answers using a website, smartphone or wearable devices such as the Apple Watch, without having to wait for a call agent or a reply to an email.” There you go. Better customer service. Aren’t governments supposed to serve their citizens? Does the project suggest that city governments are not performing this basic duty? Smarter? Hmm.

ITEM 2: “Why I’m So Excited about Watson, IBM’s Answer Man.” In this remarkable essay, an “expert” explains that the president of IBM explained to a TV interviewer that IBM was being “reinvented.” Here’s the quote that I found amusing: “IBM invented almost everything about data,” Rometty insisted. “Our research lab was the first one ever in Silicon Valley. Creating Watson made perfect sense for us. Now he’s ready to help everyone.” Now the author is probably unaware that I was, lo, these many years ago, involved with an IBMer, Herb Noble, who was struggling to make IBM’s own and much loved STAIRS III work. I wish to point out that Silicon Valley research did not have its hands on the steering wheel when it came to the STAIRS system. In fact, the job of making this puppy work fell to IBM folks in Germany as I recall.

ITEM 3: “IBM Watson, CVS Deal: How the Smartest Computer on Earth Could Shake Up Health Care for 70m Pharmacy Customers.” Now this is an astounding chunk of public relations output. I am confident that the author is confident that “real journalism” was involved. You know: Interviewing, researching, analyzing, using Watson, talking to customers, etc. Here’s the passage I highlighted: “One of the most frustrating things for patients can be a lack of access to their health or prescription history and the ability to share it. This is one of the things both IBM and CVS officials have said they hope to solve.” Yes, hope. It springs eternal as my mother used to say.

If you find these fact-filled romps through the market activating technology of Watson convincing, you may be qualified to become a Watson believer. For me, I am reminded of Charles Bukowski’s alleged quip:

The problem with the world is that the intelligent people are full of doubts while the stupid ones are full of confidence.

Stephen E Arnold, July 31, 2015

I love write ups like “Don’t Settle When It Comes to Enterprise Search Platforms.” These articles are designed to provide consulting firms with the marketing flim flam which positions each as an “expert” in enterprise information access. I would not be surprised to find copies of this article in the peddler kit of search sales professionals.

The main point of the write up is that enterprise search is a “platform.” Because there are options, no self respecting company will try to implement search without the equivalent of the F Troop in mid tier or below consultants.

I noted:

Let’s look at two very common workarounds some have tried, and then we will talk about why you must go with a reputable developer when you make your final decision.

When I read this, I wondered if the “expert” were familiar with the Maxxcat line of enterprise search systems or the Blossom hosted solution.

The write up dismisses an open source solution, apparently unaware of research by Diomidis Spinellis and Vaggelis Giannikas published in the Journal of Systems and Software, March 2012, pages 666 to 682. That’s okay. My hunch is that those finding the “Don’t Settle” article compelling are not likely to be interested in researchy type stuff.

One of the more interesting segments in the write up is the assertion that scalability is a “given.” Hmmm. In my experience, there are some ongoing enterprise search challenges: Scalability is one facet of a nest of vipers which includes my favorite reptile, indexing latency.

The article states:

Open source platforms are only as scalable as their code allows, so if the person who first made it didn’t have your company’s needs in mind, you’ll be in trouble. Even if they did, you could run into a problem where you find out that scaling up actually reveals some issues you hadn’t encountered before. This is the exact kind of event you want to avoid at all costs.

I don’t want to rain on this parade of “information,” but every enterprise search system which I have had the pleasure of procuring, managing, investigating, and analyzing has scalability problems.

The reason is simple: The volume of changed information and the flow of new information go up. Whatever one starts with is rather rapidly choked. The solutions are painful: Spend more or index less.
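To put rough numbers on that squeeze, here is a minimal Python sketch. It assumes simple compound monthly growth of the indexed corpus, which is an illustrative model and not a measurement from any real deployment. The function name `months_until_choked` and every figure in it are hypothetical.

```python
def months_until_choked(current_docs, monthly_growth, capacity_docs):
    """Count the months until the corpus first exceeds what the system can index.

    Assumes the corpus grows by a fixed fraction each month (hypothetical model).
    Returns None when the corpus is not growing at all.
    """
    if monthly_growth <= 0:
        return None  # no growth, so this model never hits the wall
    months = 0
    docs = current_docs
    while docs <= capacity_docs:
        docs *= 1 + monthly_growth  # new plus changed content piles up
        months += 1
    return months


# Made-up numbers: one million documents today, growing 5 percent a month.
print(months_until_choked(1_000_000, 0.05, 2_000_000))  # 15
# "Spend more": double the capacity the system can handle.
print(months_until_choked(1_000_000, 0.05, 4_000_000))  # 29
```

Under these invented numbers, doubling capacity pushes the wall out from month 15 to month 29, roughly one more year. The other lever is to shrink `current_docs`, which is the "index less" option.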

I am not confident that one who follows the advice of certain experts will find his or her enterprise search journey pleasant. On the other hand, there are opportunities as Uber drivers one can pursue.

Stephen E Arnold, July 31, 2015

I read “Google Says Non to French Demand to Expand Right to Be Forgotten Worldwide.” When third parties want the GOOG to do something, those suggestions face headwinds. It is okay for the Google to terminate unused Gmail accounts. It is okay for the Google to nuke APIs. It is okay for the Google to deliver “relevant” results which are beyond the statistical embrace of precision and recall analyses.

But when a third party wants to be forgotten? According to the write up from the increasingly anti Google folks in the UK, I learned:

Google has rejected the French data protection authority’s demand that it censor search results worldwide in order to comply with the European Court of Justice’s so-called right to be forgotten ruling. The company’s rejection of the ruling could see its French subsidiary facing daily fines, although no explicit sanction has yet been declared.

The write up also reminded me of Google’s official view of third party requests to be forgotten:

In a blog post, Peter Fleischer, Google’s Global Privacy Counsel, said: “We believe this order is disproportionate and unnecessary, given that the overwhelming majority of French internet users – currently around 97% – access a European version of Google’s search engine like Google.fr, rather than Google.com or any other version of Google.” Additionally, Fleischer added, the company is concerned that complying with the French courts could potentially set a precedent that one country’s laws can control access to content globally.

My hunch is that Google wants its policies and procedures applied globally. Google has suggested that some nation states alter their behavior to better mesh with the Googley universe.

“Some who don’t like the direction in which Google is going say that Bing is the search engine they prefer, especially since Microsoft has honed Bing’s ability to deliver relevant results. Others, however, look at Bing as one of many products from Microsoft, which is still seen as the “Evil Empire” in some quarters and a search platform that’s incapable of delivering the results that compare favorably with Google. Bing, introduced six years ago in 2009, is still a remarkably controversial product in Microsoft’s lineup. But it’s one that plays an important role in so many of the company’s Internet services.”

Microsoft is ramping up Bing to become a valuable part of its software services. It continues its partnerships with Yahoo and Apple, and it will also power AOL’s web advertising and search. Bing is becoming a more respected search engine, but what does it have to offer?

Bing has many features it is using to entice people to stop using Google. When searching a person’s name, search results display a bio of the person (only if they are affluent, however). Bing has a loyalty program, seriously, called Bing Rewards: the more you search on Bing, the more points you earn, and the points are redeemable for gift cards, movie rentals, and other items.

Bing is already a big component in Microsoft software, including Windows 10 and Office 365. It serves as the backbone for not only a system search, but searching the entire Internet. Think Apple’s Spotlight, except for Windows. It also supports a bevy of useful applications, and do not forget about Cortana, which is Microsoft’s answer to Siri.

Bing is very important to Microsoft because of the ad revenue. It is just a guess, but you can always ask Cortana for the answer.

There are many services that offer companies the ability to increase their content discovery. One of these services is Leiki, which offers intelligent user profiling, context-based intelligence, and semantic SaaS solutions. Rather than having humans adapt their content to get to the top of search engine results, the machine is altered to fit a human’s needs. Leiki pushes relevant content to a user’s search query. Leiki recently released “Case Study: Leiki Smart Services Increase Customer Flow Significantly At Alma Media.”

Alma Media is one of the largest media companies in Finland, owning many well-known Finnish brands. These include Finland’s most popular Web site, classified ads, and a tabloid newspaper. Alma Media employed two of Leiki’s services to grow its traffic:

“Leiki’s Smart Services are adept at understanding textual content across various content types: articles, video, images, classifieds, etc. Each content item is analyzed with our semantic engine Leiki Focus to create a very detailed “fingerprint” or content profile of topics associated with the content.

SmartContext is the market leading service for contextual content recommendations. It’s uniquely able to recommend content across content types and sites and does this by finding related content using the meaning of content – not keyword frequency.

SmartPersonal stands for behavioral content recommendations. As it also uses Leiki’s unique analysis of the meaning in content, it can recommend content from any other site and content type based on usage of one site.”
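The case study does not say how Leiki Focus builds or compares its content profiles, so the following Python is only a generic sketch of the idea of matching on topics rather than keyword frequency. It assumes each item already has a "fingerprint" expressed as topic weights and ranks candidates by cosine similarity. The topics, weights, item names, and the functions `cosine` and `recommend` are all hypothetical stand-ins, not Leiki's method.

```python
import math


def cosine(a, b):
    """Cosine similarity between two topic-weight profiles (dicts)."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(current, candidates, top_n=2):
    """Rank candidate items of any content type by topical overlap with the current item."""
    scored = [(cosine(current, profile), name) for name, profile in candidates.items()]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_n] if score > 0]


# Hypothetical fingerprints: topic -> weight.
article = {"ice hockey": 0.8, "helsinki": 0.4}
catalog = {
    "video: hockey highlights": {"ice hockey": 0.9, "sports": 0.3},
    "classified: skates for sale": {"ice hockey": 0.5, "retail": 0.6},
    "article: election results": {"politics": 1.0},
}

print(recommend(article, catalog))
# ['video: hockey highlights', 'classified: skates for sale']
```

Because the comparison runs on topic profiles, a video and a classified ad can both be matched to a news article even if they share no keywords with it. A behavioral variant would use a profile aggregated from a user's clicks in place of `article`.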

The case study runs down how Leiki’s services improved traffic and encouraged more users to consume its content. Leiki’s main selling point in the case study is that it offers users personal recommendations based on content they clicked on at Alma Media Web sites. Leiki wants to be a part of developing Web 3.0 and the research shows that personalization is the way for it to go.