Short honk: My hunch is that the University of Maryland has come up with a nifty method to deal with some cumbersome, computationally intensive calculations. Navigate to “Scientists Develop World’s Fastest Program to Find Patterns in Social Networks” and read about fancy math and chopping big data into chunks. With the technique, figuring out patterns gets easier. I will resist a pun about cozying up to big data. Here’s the passage that caught my attention in the write up:

In a paper that has been accepted for presentation at the 2010 Advances in Social Network Analysis and Mining conference to be held in Denmark in August, Broecheler, Pugliese and Subrahmanian [University of Maryland wizards] leveraged a key insight – it is possible to split the social network into a set of almost independent, relatively small sub-networks, each of which is stored on a computer in a cloud computing cluster in such a way that the probability that a query pattern will need to access two nodes is kept as small as possible. Using knowledge of past queries and a complex set of calculations to compute these probabilities, their paper reports algorithms and experiments to answer social network subgraph pattern matching queries on real-world social network data with 778 million edges (which may denote relationships or connections between individuals) in less than one second. More recent results not contained in the paper are able to efficiently answer queries to social network databases containing over a billion edges.
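The chopping idea in that passage can be illustrated with a toy sketch. This is not the Maryland team's algorithm: the write up says their method uses knowledge of past queries to compute probabilities, while the hypothetical `partition_nodes` below just uses a simple greedy rule on the edges. The tiny graph, the names, and the capacity limit are all assumptions made for illustration.

```python
from collections import defaultdict


def partition_nodes(edges, num_parts, capacity):
    """Greedily place each node in the partition that already holds most
    of its neighbors, subject to a per-partition capacity.

    Assumes num_parts * capacity is at least the number of nodes.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    assignment = {}
    sizes = [0] * num_parts
    for node in sorted(neighbors):
        # Count this node's already-placed neighbors in each partition.
        scores = [0] * num_parts
        for other in neighbors[node]:
            if other in assignment:
                scores[assignment[other]] += 1
        # Only partitions with room qualify; prefer the one with the most
        # neighbors, breaking ties in favor of the less loaded partition.
        open_parts = [p for p in range(num_parts) if sizes[p] < capacity]
        best = max(open_parts, key=lambda p: (scores[p], -sizes[p]))
        assignment[node] = best
        sizes[best] += 1
    return assignment


def cross_partition_fraction(edges, assignment):
    """Fraction of edges whose endpoints live on different partitions."""
    crossing = sum(1 for a, b in edges if assignment[a] != assignment[b])
    return crossing / len(edges)


# Hypothetical six-person network: two tight triangles joined by one edge.
edges = [
    ("ann", "bob"), ("bob", "cal"), ("ann", "cal"),
    ("dee", "eli"), ("eli", "fay"), ("dee", "fay"),
    ("cal", "dee"),
]
assignment = partition_nodes(edges, num_parts=2, capacity=3)
fraction = cross_partition_fraction(edges, assignment)
```

In this toy case each triangle lands on its own partition and only the cal to dee edge crosses machines. A pattern query that stays inside one triangle touches one machine, which is the property the passage says the researchers try to maximize.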

Navigate to “Cloudera Goes Enterprise with New Hadoop Offering.” Is this another cloud thing? Not in my little goose pond. Cloudera is positioning itself as a platform vendor. And a platform vendor offers lots of hand holding and neck rubbing services. My view is that Cloudera has made a smart move. For me, the most important point in the write up is:

For Cloudera’s part, Olson [Cloudera wizard] said it believes companies want–and are willing to pay for–user and group authentication and roles, visual tools for managing large numbers of data feeds into a cluster, and other management tools that are important for conventional IT staff with critical applications who need easy-to-use dashboards. But Cloudera isn’t straying from its open-source roots.

Clever stuff and important because Cloudera can slap on other functions and features. The opportunity to disrupt high end vendors is a big one. That’s one reason why I am helping to organize the presentations at the Lucene Revolution conference which will focus on Lucene and Solr. I want to see what the momentum is and try to figure out where the shock waves will impact.

We have a new advertiser, the Lucene Revolution Conference. I have been paying more attention to open source search because it is becoming an option for some of the organizations with whom I speak. I read an interesting article with a snappy title: “Has Oracle Been a Disaster for Sun’s Open Source?” Oracle has become a player in open source. The company has MySQL, Java, and other properties. Oracle is a publicly traded company, and I am not surprised that open source may not be getting the attention the open source community wants to see. For me, the most interesting comment in the write up was:

It would probably be unfair to characterize Oracle’s running of Sun’s open source projects as a disaster – at least, for the moment; but as the above shows, there are plenty of grounds for concern, both in terms of how the code is being developed, and the happiness or otherwise of developers and users. Whether buying Sun will prove to be a smart move in the long term depends critically on how smartly Larry Ellison and his managers can address these issues. They also need to start to think more seriously about how Oracle can contribute to Sun’s open source products, and not just the other way around.

My view is that some companies (maybe even Oracle?) might be squeezing whatever marketing and sales value they can from their open source properties. The commercial imperative can be quite different from other organizations’ and developers’ motives. I have come across some instances of commercial outfits using “open source” as a marketing angle. A lousy economy can create some difficult situations when good intent collides with making money.

I read two stories by publications that bump up against one another for readers. eWeek, once a Ziff flagship in terms of ad pages, is now an online publication. The story is really a slideshow with comments next to each graphic. Navigate to “10 Reasons to Stop Using Google.” The idea is to call attention to Google weaknesses and services that out Google Google. The example that sticks in my mind is Zoho, an online version of Microsoft Office. I understand the need for page views, but I wondered why pick on Google? The analysis is okay, but nothing spectacular.

The second Google kicker is “Why Do We Trust Google?”, which appeared in an InfoWorld online publication. Like the eWeek “story”, this write up dances around the “Google is evil” angle. Nothing wrong with that, but the Google has been chugging along in the same mode for more than a decade. Worrying about Google makes it possible to mention lots of Google services and maybe get some traffic.

The more interesting question for me is, “Why are these outfits snapping at Google’s heels?” Like the identical covers that popped up once in a while on Time and Newsweek paper issues, the coincidence is interesting. My opinion is that Google is not an advertiser and writing about Google produces traffic. Google is a juicy target and it is great sport. Substantive articles? It is summer time and the SEO is easy.

“Once we do the migration–this fall, hopefully–there will be two options. Everyone will still buy Google, but there will be a clear second chance to buy, with roughly 30 percent market share,” Mehdi continued, according to the article. “Just being able to be a credible choice, for us, is a huge step forward. And that’s what we’ll accomplish with Yahoo.”

Short honk: You know my view about the baloney that marketing and public relations types create. If you take a look at my summary of my lecture about real time search, you know that slapping the phrase “real time” on a search engine is about as meaningless as yip yapping about “social search”. I just want to capture the story “Technical Jargon ‘Confuses Shoppers’, says Which?” from the always-clear UK newspaper, The Telegraph. “Which” is a “consumer watchdog”. It is a consumer reports type of publication. For me, the loud bark in the write up was:

Information about megapixels, contrast ratios, resolutions and refresh rates are baffling shoppers and not helping them to make informed purchasing decisions, warns Which?, the consumer watchdog. The group found that product labels on items such as televisions was not consistent across brands and often contained meaningless figures that could be misconstrued by shoppers.

I am more concerned with the why, not the fact that Mad Ave and MBA baloney makers confuse people. Heck, these folks confuse me, but I am addled and growing somewhat fatigued with the jargon of the world’s smartest people. Almost everyone I meet under the age of 40 is one of the world’s smartest people. Whatever happened to the good old segmentation of people by a yardstick other than ribbons awarded to everyone for T Ball participation? One high school class had so many A+ students, the graduation featured a play of these all-equal wonders.

Here’s my take:

The people writing about products don’t know what the heck the products do, so engineers’ quips are recycled as Talmudic statements.

The process for group approval almost guarantees a slow devolution to popular jargon. This helps me understand why one vendor’s statement becomes a feature on every competitor’s product check list. “Assisted navigation” and “real time search” are two examples.

The financial pressure ensures that whoever is working on a project has too little time to work on one activity. Now attention deficit disorder plays a role as well. That’s why when I run meetings, we use facilities where phone reception sucks, Internet connections are lousy, and distractions are curtailed by the remoteness of the location. Even then, it takes effort to stick to an agenda and figure out a method appropriate for a specific problem.

Search and content processing vendors are suffering because of their marketing. The only groups that have more electrode burns on sensitive bits are content management vendors, outfits pitching fully automated anything systems, and azure chip consultants who think that working as a newspaper reporter gives them technical expertise and financial acumen. No wonder customers are confused.

Apple operates with a controlling idea: make lots of money and control as much as possible. If I remember the Star Wars film series, Darth Vader operated in a similar way. I recall one scene in which he choked some hapless underling for not doing what Mr. Vader wanted. Google, on the other hand, operates in an iterative fashion. The company’s approach relies on pushing out products, services, and technologies and then adapting. The two methods are fascinating to watch, and I am not sure which is more effective. When it comes to control, Apple has the upper hand. When it comes to doing lots of things and making changes in near real time, Google is the clear winner.

When I read “Google’s Mismanagement of the Android Market,” I thought of the differences in management methods at these two companies. Whatever Google learned when it was pals with Apple did not spill over into marketing in my opinion. The write up in Nanocr.eu said:

Earlier this week, CNET ran an article critical of the permission model of the Android Market. Google’s response to the criticism was that “each Android app must get users’ permission to access sensitive information”. While this is technically true, one should not need a PhD in Computer Science to use a smartphone. How is a consumer supposed to know exactly what the permission “act as an account authenticator” means? The CNET opinion piece “Is Google far too much in love with engineering?” is quite relevant here.

Developers and users are getting fed up and it’s time for Google to clean up the house.

No one is more surprised than I at the strong uptake for the iPad and now the iPhone 4. The message is that consumers are looking for products that are easy to use and pretty much do a few things well. Darth Vader may not have been the homecoming king, but he sure seems to know how to move product in a way that is understandable to consumers. The Math Club may have to rethink its iterative approach to products and services if the Darth Vader approach continues to work despite its flaws.

Apple’s search is now Bing. The Darth Vader approach may be good enough for Apple and a real boost for Microsoft.

Hong Kong-listed Alibaba.com, the business-to-business unit of Alibaba Group, is one of China’s okay Internet companies. I don’t think too much about China because it is a long, long way from the pond filled with mine run off here in Harrod’s Creek. My hunch is that the eCommerce companies don’t think too much about Alibaba either. That may change. Navigate to “Alibaba.com to Acquire Vendio,” and you will learn that Alibaba is going to acquire “a multi-channel e-commerce company providing a one-stop solution for small businesses that are selling online across multiple channels.” If this is the same Alibaba with which I am familiar, the tie up with Vendio could give those looking for a scalable eCommerce solution another option. The news story said:

The company says that from the Vendio Platform, merchants can source products from its supplier network and sell through channels such as eBay, Amazon, and their own Vendio-supported store. It adds that this platform is offered on Software as a Service (SaaS) cloud-computing model to help businesses increase their sales while managing costs to enhance their profit margin.

The implications of this deal range from price competition to more integrated back end functions. Google may be able to move forward without much concern. Google’s eCommerce solution is a recent innovation. Google’s ability to glue components together may allow the juggernaut to push Alibaba and Vendio to the curb. Endeca, working with somewhat different methods, may find itself having to deal with Amazon and the Alibaba Vendio duo. Yahoo remains a potential player in this eCommerce sector. My recollection is that Yahoo owns or owned a stake in the company. Amazon is an aggressive player, and it may have to adapt to Alibaba Vendio as well.

Alibaba has search technology, so this deal, if it goes through, will deliver what I call “search enabled processes.” I know that process is one of the words that put people to sleep. But despite the notion’s lack of sizzle, SEPs are going to be increasingly important in the world beyond search. In fact, at the October Lucene Revolution, there will be some interesting sessions on this very topic.

Booz, Allen – the outfit where I worked after my years at Halliburton NUS (Nuclear Utility Services) – has been booking business big time in Washington, DC. I have heard that Booz, Allen has been explaining the challenges of cyber warfare. Now this is not a new topic. A number of analysts have pointed out that systems connected to a public network can be compromised by a range of methods. I recall hearing a lecture by Winn Schwartau a number of years ago. Now the blue chip crowd has caught up with Mr. Schwartau, the author of Information Warfare, and some of his ideas which date from the late 1990s.

One azure chip consulting firm advocated slashing security budgets. I wrote about that odd approach at a time of risk in “Cut That Security Budget, Says Azure Chip Consultancy.” I know about marching out of step, but it is a good idea to be on the same parade ground.

I received an email from one of my two or three readers pointing me to the online defense magazine, Defense Update. The April story “Hackers, Terrorists or Cyber Warriors?” is an interesting one. The key idea is that “cyber warfare is here and now.” In that write up are some useful ideas and facts. For me, the key passage was:

Shai Blitzbau, technical director at Magelan information defense and intelligence services describes typical attacks simulated by his company, providing threat assessment audit for government, security and commercial organizations. In recent exercises Magelan performed a threat simulation, that targeted an essential national infrastructure network responsible for the production and distribution of a vital product, considered as basic necessity for the entire population. The simulation demonstrated how, after 96 hour preparation, the team could bring a network, producing and distributing critical goods to a standstill, and keep it idle for at least two weeks. The aggressor team that started with zero access to, or knowledge of the target, managed to study the target, write malicious code, penetrate the network and execute his attack in less than four days.

I wanted to point out that there are extremely fast, effective search systems that can index and make searchable content “sucked” out of a secure system. You can learn about the Gaviri pocket search technology at www.gaviri.com.

Search is one component in the warrior’s arsenal. Booz, Allen is right in forcing governmental entities to be aware of risks. Within the last 14 days, I have been in a facility. I had in my back pocket a small USB drive equipped with a “pocket search” technology. The screening did not flag this device. I did not realize I had the USB in my pocket until I emptied my pockets at the hotel after the meeting.

The blue chip crowd is correct in focusing attention on cyber warfare. Slashing security budgets is ill considered in my opinion.

I heard about Vivisimo’s Federal Day from a contact in Washington, DC. As MarkLogic and many other organizations have found, a company sponsored conference can be more effective than a general purpose trade show. The vendors need qualified prospects, and I think that customer conferences with an open door policy for prospects are an important marketing angle for search and content processing vendors.

Vivisimo has not been on my radar. There has been executive churn which is often a sign that a company is in some flux. You can read about the event in the effusive write up in Vivisimo’s Web log Information Optimized. The story is “Vivisimo’s Federal Day 2010.” The line up of speakers struck me as eclectic, and I am not sure how much search and content processing focus the presentations had. The notion of “information optimization” strikes me as azure chip consultant speak. The phrase is ambiguous. I am not sure what information is, so it is tough for me to know how to optimize something I don’t understand. But I was not there, so hopefully Vivisimo will post the PowerPoint decks or PDF versions of the notes.

Like other companies with roots in a search function, Vivisimo is working hard to find a way to get customers without falling into the “search is dead” quagmire. For me, the most telling comment in the article was:

By the end of the day, with the help of our customers and partners, we had explored the need, the theory and the practice behind Information Optimization. As Director of Product Management I have the benefit of hearing our customer stories daily, but many in attendance don’t have this luxury so it was a great pleasure to see their eyes light up with possibilities when hearing each other’s stories. One of my favorite quotes of the day was when an analyst explained the value of their application as “finally it was like the lights were turned on.” The diversity of solutions shown by our customers drove home the enormous potential of this discipline, and the feedback we received will help drive the evolution of Vivisimo’s product and service offerings in the future. What a home run!

A home run is great. Winning for customers and stakeholders is the real yard stick in my opinion.

Stephen E. Arnold monitors search, content processing, text mining
and related topics from his high-tech nerve center in rural Kentucky.
He tries to winnow the goose feathers from the giblets. He works with colleagues
worldwide to make this Web log useful to those who want to go
"beyond search". Contact him at sa [at] arnoldit.com. His Web site
with additional information about search is arnoldit.com.