October 24, 2011

NYT on Big Data and R

Big Data is really about ... the benefits we will gain by cleverly sifting through it to find and exploit new patterns and relationships. You see it now in things like Facebook ads, which are put in front of you because the posts you have read and contributed to (which Facebook’s algorithms get to examine as the price of this “free” service) indicate you might be ready to buy the advertised good.

The article includes applications of big data analytics at various companies: ad placement at Google; credit card transaction analysis (according to the CEO of TrialPay, the value of the transaction data exceeds the transaction fee that credit card companies like Visa charge merchants); and inferring information from the semantic web at search start-up Domo. (By the way, here's a great presentation on using R to mine the semantic web, from Chris Davis and Alfredas Chmieliauskas of the Amsterdam R Users Group.)

Speaking of R, the NYT article ends with a mention of new tools for Big Data analytics: MapReduce, NoSQL and R:

There are an uncountable number of data-mining start-ups in the field: MapReduce and NoSQL for managing the stuff; and the open-source R statistical programming language, for making predictions about what is likely to happen next, based on what has happened before.

For example, R is used at social networking sites like Foursquare and OMGPOP to make predictions based on user transaction data.