
Are banks predicting divorces? Well, if there’s data to help them predict such things, they may very well use it to optimize their business.

Forbes has a couple of posts that peek into businesses’ use of “big data”. The first article talks about the race to build new analytics to solve the challenges of large volumes of data. Here’s a snippet from Tom Groenfeldt’s post, quoting Scott Gnau, head of research and development at Teradata:

Thought leaders in a number of industries are starting to leverage the additional analytic content from big data and combine it with what they have in large volume data stores as well. It is interesting to understand social media and consumer sentiment, but when that information is analyzed in combination with traditional consumer data it provides new, rich intelligence helping companies to identify trends and react to immediate business conditions.

According to another Forbes article, a number of studies show that companies that characterize themselves as “data driven” are the best corporate performers. Now, when we’re talking “data driven”, we mean how the company operates, not necessarily what it produces as technology. Top performing companies are determined to use the data that they have (especially about themselves) to improve what they do and how they do it.

Also, banks are on the lookout for changes that could affect how they do business with their customers, and of course, their bottom line:

Banks, for example, worry about their customers divorcing, because divorce causes a change in credit-worthiness. No problem. They can now see a divorce coming before the couple does. All from the data.

This week at Techonomy 2011 in Tucson, AZ, the “Computer Science or Data Science” panel explored how data science has taken its place next to computer science as a fundamental element of information technology. New technologies are coming out seemingly every day, not only to handle big data, but to extract relevant information from the ocean of data we’re swimming in.

A company in Silicon Valley, ai-one, announced today that they have “a breakthrough method to graphically represent knowledge enables software developers to easily build intelligent agents such as Apple’s SIRI and IBM Watson”. The technology, ai-Fingerprint, is geared toward natural language programming, allowing developers to create new technologies that use natural language as input data.

Apple’s Siri and IBM’s Watson are definitely heading in the right direction for this type of technology. I just bought an iPhone 4S and I’ve tested Siri out a number of times. While Siri doesn’t get everything right (it keeps thinking my name is “Nick” when I say “Mic”), it does get more right than I expected. I was able to send texts and e-mails to people without keystrokes, and I took some notes using the voice feature, getting nearly every word correct. Pretty amazing stuff!…

Watson is the supercomputer that beat two longtime Jeopardy! champions, and it uses a technology approach that looks for the best answer for the questions being asked (or in this case, the best question for the answer being presented – it is Jeopardy! after all…). These are definitely the models that should be emulated. That said, ai-one’s announcement is a press release, so until we see the results, let’s chalk this up as good marketing…

Here is a really good insight from Dumbill on how data science applies to business:

Why is the scientific method applicable to business and data?

Every company’s business is complex in itself, and they operate in a complex world. The financial, economic and societal structures we live and do business in are complex. Because of this complexity and interactions, businesses can be viewed in the same light as organic, biological systems. They are complex entities within a complex system.

This is where science comes into play. Even assuming you could come up with a top-down mathematical model of your business, there’s too much interaction and randomness with complex systems for your model to be practical. Thus, the exploratory approach of science becomes useful to a business.

Your world and business is a giant laboratory, and ever more so as the world becomes more networked. By employing data scientists you can discover better how your business works, how it can be improved, and find new things you can do that you didn’t know of before. To do this, you must connect up three kinds of people: the business folk, the data scientists and the data engineers.

Ultimately, data doesn’t mean anything without trying to answer questions. To get actionable information, you need data and you need to be asking the right questions. That’s why the scientific method is so important – it’s all about posing a hypothesis or asking a question, and then squeezing the right information out of the data in order to answer it.
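To make the “pose a hypothesis, then squeeze the answer out of the data” idea a little more concrete, here is a minimal sketch of one way to do it: a permutation test that asks whether a change made a real difference or whether the gap could easily be chance. The function name and the order counts are invented for illustration, and a permutation test is just one of many reasonable choices here.

```python
import random

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often a difference in means at least as large as the
    observed one would show up if the group labels made no difference."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = sum(group_b) / n_b - sum(group_a) / n_a
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the labels: under the "no effect" hypothesis any split is as good as another.
        rng.shuffle(pooled)
        diff = sum(pooled[n_a:]) / n_b - sum(pooled[:n_a]) / n_a
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations

# Hypothetical daily order counts before and after a website change (made-up numbers).
before = [102, 98, 110, 95, 101, 99, 104]
after = [108, 112, 105, 115, 109, 111, 107]

difference, p_value = permutation_test(before, after)
print(f"observed difference in means: {difference:.1f}")
print(f"share of shuffles at least as extreme: {p_value:.4f}")
```

A small share of extreme shuffles suggests the change is unlikely to be noise. A large share means “our numbers are higher this month” is not yet evidence of anything.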

Based upon their Magic Quadrant analysis of data integration tools, Gartner rates Informatica Corp. and IBM as the top software vendors in the space.

Gartner uses a Magic Quadrant to rate companies as leaders, challengers, niche players and visionaries based on several criteria including “completeness of vision” and “ability to execute.” From Gartner’s website:

Leaders execute well against their current vision and are well positioned for tomorrow.

Visionaries understand where the market is going or have a vision for changing market rules, but do not yet execute well.

Niche Players focus successfully on a small segment, or are unfocused and do not out-innovate or outperform others.

Challengers execute well today or may dominate a large segment, but do not demonstrate an understanding of market direction.

A post by Mark Brunelli, Senior News Editor at SearchDataManagement, has a more detailed analysis of the Gartner report. Here’s what Brunelli wrote, detailing some of the thoughts of Ted Friedman, a Gartner vice president and information management analyst and co-author of the report:

“You’re hearing a lot about big data and analytics around big data,” Friedman said. “To do that kind of stuff you’ve got to collect the data that you want to analyze and put it somewhere. [That] in effect is a job for data integration tools.”

It does seem that the main focus right now in this space is on data handling and data management. A lot of work is being done by companies to create data visualization tools to gain insight from the data, but as the problems get much harder, better analytics approaches will need to be brought to bear. The real key over the next few years will be the smart analysis of all this data, turning the data into reliable, actionable information.

A couple of interesting notes today…. On PRNewswire, Kontagent, a leading enterprise user analytics company, announced that it has closed $12 million in Series B financing with a consortium of investors including Battery Ventures, Altos Ventures, and Maverick Capital. Kontagent focuses on social and mobile web applications with their kSuite product, which combines a proprietary database with customized analytics and real-time monitoring to help customers identify and react to usage patterns in real time. This continues the pattern of heavy early-stage funding into analytics and big data startups – data science applications are becoming the next big technology boom…

There’s another interesting article snippet at Bloomberg Businessweek about how the oncoming avalanche of data could change the nature of astronomy and physics. According to the article, Johns Hopkins will be building a 100 gigabit-per-second network to shuttle data from the campus to other large computing centers at national labs and even to Google. Here’s what Dr. Alex Szalay, Alumni Centennial Professor at Johns Hopkins and head of the network project, thinks about what this could mean for the future of science itself:

In his mind, the new way of using massive processing power to filter through petabytes of data is an entirely new type of computing that will lead to advances in astronomy and physics, much as how the microscope’s creation in the 17th century led to advances in biology and chemistry. In that light, the creation of a 100 gigabit-per-second research network at Johns Hopkins becomes not just a fast network but also an essential tool for research and discovery, a basic component of the 21st century microscope.

You can read about the Kontagent financing deal here, and Dr. Szalay’s effort to build big data networks here…

I am a big fan of Eric Ries’ book The Lean Startup, where he advocates treating every new entrepreneurial venture, whether inside an existing company or as its own startup company, as a startup. And further, since this is a startup venture, the uncertainty about whether this venture will succeed or fail is very high.

So, rather than put together detailed plans about building the product that the business will be based upon, a startup venture should be building the “minimum viable product” and getting feedback from customers quickly to see whether you’re on the right track. The faster you get feedback, the better you’ll be able to build a business that is sustainable and meets the needs of your customers.

Effectively, Ries argues that you should treat the startup venture, every aspect of it, as a series of scientific experiments designed to inform you whether you are building a sustainable business consistent with your company’s vision. It’s basically applying the scientific method to your business.

For one, I say, “Absolutely dead on!” Most business activities, whether marketing or sales or even less-than-disciplined engineering, are performed via rules of thumb (“here’s what’s worked before…”) – there is no true “validated learning”, as Ries put it. Generally, many businesses and engineering teams operate with the approach of “we made a number of changes last month, and our customers seem to like them, and our overall numbers are higher this month, so we must be on the right track”. This might make a company feel good, but it gets them no closer to understanding why they might be succeeding, and what to do if things turn south.

And what is worse, the internal workings of the business may be driven by managers more motivated by preserving the current business enterprise than by creating a new one. This puts entrepreneurial ventures at risk of never getting started in the first place, or at least of not starting with the greatest possible chance for success.

In the twenty-first century we can build almost anything that can be imagined. The challenge is not to build more stuff. It’s to build the right stuff. Most startups fail, says Ries, because they make the wrong things. The key activity of a startup should therefore be learning, not building. What creates value for a startup is it determining whether or not it’s on the path to a sustainable business.

If you’re interested in the Lean Startup approach to business (which, again, I highly recommend), you can find out more at Eric Ries’ website here, and you can buy his book The Lean Startup here… Also, you can read more of Stephen Samild’s post here…

You might not think of a Brad Pitt movie and human resources as fitting together, but stay with me…

Josh Bersin of Bersin and Associates makes the case that human resources professionals need to exploit data science to support their businesses better. Just as analyzing the statistics of baseball helped Billy Beane’s Oakland A’s beat their better-financed competition, so too can HR departments win the battles to attract and keep top talent for their organization through data science.

According to Bersin, in a survey of 711 HR departments, “attracting and selecting the right talent” rates highest among HR skills and capabilities. Yet even though better data analysis can help achieve this top goal, these HR departments rate “developing workforce analytics for management” and “measuring HR program effectiveness” at the bottom. Basically, measurement, analytics, and assessing performance objectively are ranked as the least important skills for HR departments.

Bersin is advocating that HR departments can raise the bar for their effectiveness by embracing data science (I agree!…). Bersin’s presentation is on SlideShare and can be found here… Whether you are an HR professional or a data science weenie, you’ll likely find the slides interesting.

On the Wall Street Journal website today, Ben Rooney posts an interview with Hortonworks CEO Eric Baldeschwieler, co-creator of Hadoop. For all those in the big data space, the Hadoop project develops open-source software for reliable, scalable, distributed computing, and Hortonworks is focused on accelerating the development and adoption of Hadoop.

In Rooney’s interview, Baldeschwieler describes the problem Hadoop is designed to solve:

At its base, it is just a way to take bulk data and storage in a way that is cheap and replicated and can pull up data very, very fast.

Hadoop is at one level much simpler than other databases. It has two primary components; a storage layer that lets you combine the local disks of a whole bunch of commodity computers, cheap computers. It lets you combine that into a shared file system, a way to store data without worrying which computer it is on. What that means is you can use cheap computers. That lets you strip a lot of cost out of the hardware layer.

The thing that people don’t appreciate when you drop a lower price point is that it is not about saving money, it is about being able to do an order of magnitude more on the same budget. That is revolutionary. You can store five to 10 times more data and you can process it in ways that you can’t imagine. A lot of the innovation it opens up is just the speed of innovation. You get to an answer faster, you move into production faster, you make revenue faster.
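For readers who like to see an idea in code, here is a toy sketch of the storage layer Baldeschwieler describes: several cheap machines pooled into one shared file store, with every block of data kept on more than one machine. This is not Hadoop’s actual API or block placement policy. The class name, the node names, and the hash-based placement are all assumptions made purely for illustration.

```python
import hashlib

class ToyClusterStore:
    """Toy model of a replicated, shared file store spread over several machines."""

    def __init__(self, node_names, replication=3, block_size=8):
        if replication > len(node_names):
            raise ValueError("replication cannot exceed the number of nodes")
        self.replication = replication
        self.block_size = block_size
        # Each "node" is a dict standing in for one machine's local disk.
        self.nodes = {name: {} for name in node_names}
        # Index: file name -> list of (block_id, [nodes holding a copy]).
        self.index = {}

    def _pick_nodes(self, block_id):
        # Deterministic placement by hashing; an assumption for this sketch only.
        names = sorted(self.nodes)
        start = int(hashlib.md5(block_id.encode()).hexdigest(), 16) % len(names)
        return [names[(start + i) % len(names)] for i in range(self.replication)]

    def write(self, filename, data):
        blocks = []
        for offset in range(0, len(data), self.block_size):
            block_id = f"{filename}#{offset // self.block_size}"
            holders = self._pick_nodes(block_id)
            for name in holders:
                self.nodes[name][block_id] = data[offset:offset + self.block_size]
            blocks.append((block_id, holders))
        self.index[filename] = blocks

    def read(self, filename, failed=()):
        out = b""
        for block_id, holders in self.index[filename]:
            # Use the first replica that lives on a machine that is still up.
            alive = [name for name in holders if name not in failed]
            if not alive:
                raise IOError(f"all replicas of {block_id} are unavailable")
            out += self.nodes[alive[0]][block_id]
        return out


store = ToyClusterStore(["node-a", "node-b", "node-c", "node-d"], replication=2)
payload = b"user=1 page=home\nuser=2 page=cart\n"
store.write("clicks.log", payload)

# The caller asks for a file by name and never says which machine holds it.
assert store.read("clicks.log") == payload
# Losing one machine does not lose the file, because each block has a second copy.
assert store.read("clicks.log", failed={"node-a"}) == payload
```

The point of the sketch is the trade Baldeschwieler mentions: no single machine needs to be reliable or expensive, because the file survives as long as one copy of each block is reachable.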

Shvetank Shah is executive director of the Corporate Executive Board, and recently wrote an article for InformationWeek about the need for scrutiny when launching into big data initiatives. Here’s the first paragraph from Shah’s post, which sums things up pretty well:

Even as companies invest eight- and nine-figure sums to analyze the information streaming in from suppliers and customers, fewer than 40% of employees have the right processes and skills to make good use of the analysis. Think of this as a company’s insight deficit. To overcome it, “big data” needs to be complemented by “big judgment.”

Just because we now have lots of data to work with doesn’t mean that we will now get better decisions. How we turn data into actionable information – the methods, the tools, the techniques – is incredibly important. Also, as Shah points out, sometimes the data itself is just plain bad, and people don’t always trust it. Gaining insight from data is the important factor, not merely getting more data.

You can read Shah’s post on InformationWeek here, and more from CEB on this topic is available at insightdeficit.com.

David Smith is an author, blogger and R evangelist for Revolution Analytics, where he serves as VP of Marketing, and he blogs daily at blog.revolutionanalytics.com. This weekend, he posted an article about the use of data science in the winemaking business (yes, data science is getting to be everywhere!…)

Here’s a neat little interview conducted by Internet Evolution’s Todd Watson of Michael Lewis and Billy Beane. Watson was attending the Information OnDemand event this past month, where one of the key themes of the event was the idea of putting business analytics into practice to help improve business outcomes. Watson felt that Beane did a great job of this in the business of baseball, and Lewis did a great job of writing about this, so he got both together for this interview.

Billy Beane is the general manager of the Oakland Athletics, and he changed the way that major league baseball teams use data to build their rosters. Michael Lewis is the author of Moneyball, which documents Beane’s efforts to build a winning baseball franchise with a payroll that is dwarfed by his competition’s.

Lewis’ book was recently turned into a major motion picture featuring Brad Pitt as Beane and Jonah Hill as the statistics whiz who helps Beane turn the A’s around.

Here’s just a little bit from Watson’s interview on how Lewis got turned on to writing about Beane and the A’s:

Todd Watson: One of the key themes of the IOD event has been “turning insight into action,” and that seems to be a theme prevalent in some of your books — most notably Moneyball and The Big Short. I’m curious, in terms of baseball managers who are using sabermetrics to make more informed decisions, I’m really interested in how you got turned on to that topic and also just how that came to be and what inspired you to write the book?

Michael Lewis: It was really simple. I was living in Billy’s backyard in Berkeley so I was paying attention to the A’s. I didn’t know… I wasn’t a baseball fanatic, but I did know there was this payroll issue and I got interested in that.

I got interested in that in the first place, because at first I thought I was going to write a piece about the A’s. I think it was when Jose Canseco got this giant deal, and he was being paid something like $8 million, and the right fielder and left fielder were being paid something like $150,000, and I wanted to know if the outfielders were pissed!

And, how they felt when Jose Canseco dropped a fly ball. (Laughter) And I was going to come out and write about that, and then I started thinking about it, and I realized there were these huge discrepancies from team to team. And then I wondered, so how does the whole team feel about being poor?

I enjoyed Moneyball, both the movie and the book. I have mentioned before that Lewis is a really great author – I wrote another post about Michael Lewis’ book on the 2008 global financial meltdown called The Big Short…