Microsoft earlier this week launched Teams, a new collaboration offering designed to fend off competitors to its Office 365 platform. The new chat-based workspace product brings together people, conversations and content, along with some familiar productivity tools, to facilitate collaboration on projects.

You can read my thoughts on this news in an interview with the E-Commerce Times here.

Spotify has raised US$526 million in its latest round of funding, according to reports that surfaced Wednesday. Here is what I told the E-Commerce Times about this news: “The combination of Spotify’s free service, which carries ads, and a premium paid version that’s ad-free, has let the company gain significant traction. On the other hand, Apple is leveraging its brand recognition to simply convert its global customer base into paying subscribers of its new streaming service.”

This week Twitter began indexing every public tweet posted since the service launched in 2006. “Our long-standing goal has been to let people search through every tweet ever published,” said Yi Zhuang, who led the team working on the project. Here is what I told TechNewsWorld about the milestone: “Twitter’s rollout is an enormous achievement in computer science. Making hundreds of billions of tweets searchable with a latency of under 100 ms is like combining the capacity of CNN providing the latest news with the thoroughness and completeness of the Library of Congress in real time.”

We had a lot to celebrate recently. Last year marked the 300th anniversary of Jacob Bernoulli’s Ars Conjectandi, the book in which he consolidated central ideas in probability theory, including the very first version of the law of large numbers. It was also the 250th anniversary of Bayes’ theorem, named after Thomas Bayes (1701–1761), who first suggested using the theorem to update beliefs.
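For readers who want the two results in modern notation (my summary, not part of the original anniversaries), they can be written as:

```latex
% Weak law of large numbers: the average of n i.i.d. draws X_1, ..., X_n
% with mean \mu concentrates around \mu as n grows.
\lim_{n \to \infty} \Pr\!\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i - \mu \right| > \varepsilon \right) = 0
\quad \text{for every } \varepsilon > 0

% Bayes' theorem: update a prior belief P(A) in light of evidence B.
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```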

Fast forward.

Thomas Bayes

The enthusiasm around Big Data hinges on using statistics to provide relevant and meaningful analysis of ever-larger data sets. Statistical science has produced excellent machine learning tools and methods that go beyond mere classification and ranking of data sets. Today, we try to explicitly quantify the uncertainty of what can be concluded from a data set, be it a prediction or a scientific inference. Our work today rests firmly on the shoulders of what Bernoulli and Bayes did a few hundred years ago.

Big Data will give us increasingly precise answers; that is a direct consequence of Bernoulli’s work. However, we have to be aware of issues such as ‘Selection Bias’, ‘Regression to the Mean’, and ‘Over-interpretation of Associations’. In our field especially, it is important to examine very carefully the underlying science that explains the data.
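Regression to the mean is easy to demonstrate with a small simulation. The sketch below is my own toy example (not from the post): each hypothetical “student” has a fixed true ability, and each test score is that ability plus random noise, so the top scorers on one test tend to look closer to average on a retest.

```python
import random
import statistics

random.seed(42)

# Each student has a fixed true ability; each test score adds fresh noise.
abilities = [random.gauss(50, 10) for _ in range(10_000)]
test1 = [a + random.gauss(0, 10) for a in abilities]
test2 = [a + random.gauss(0, 10) for a in abilities]

# Select the top 100 scorers on test 1, then look at the same people on test 2.
cutoff = sorted(test1)[-100]
top = [i for i, score in enumerate(test1) if score >= cutoff]

mean1 = statistics.mean(test1[i] for i in top)
mean2 = statistics.mean(test2[i] for i in top)

print(f"top group, test 1: {mean1:.1f}")
print(f"top group, test 2: {mean2:.1f}")  # closer to the overall mean of 50
```

Nothing about the students changed between the two tests; the apparent “decline” of the top group is pure selection on noise, which is exactly why naive before/after comparisons of extreme cases mislead.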

As data sets continue to get bigger, the problem of potential false findings grows exponentially. To protect against that, we will have to use statistical methods to quantify the uncertainty associated with our results. A careful and considered application of those methods is the best way out of this dilemma: every variable in a study must be examined for completeness and consistency in how it is coded, and the assumptions of each statistical routine must be validated.
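One generic way to quantify the uncertainty of a result is the bootstrap: resample the data with replacement many times and look at the spread of the recomputed statistic. This is a minimal sketch with made-up data, not any specific study mentioned above:

```python
import random
import statistics

random.seed(0)

# Hypothetical measurements (e.g., 200 noisy readings around 100).
data = [random.gauss(100, 15) for _ in range(200)]

# Recompute the sample mean on 2,000 resamples drawn with replacement.
boot_means = []
for _ in range(2000):
    resample = random.choices(data, k=len(data))
    boot_means.append(statistics.mean(resample))

# Percentile interval: take the middle ~95% of the bootstrap distribution.
boot_means.sort()
lo, hi = boot_means[49], boot_means[1949]

print(f"sample mean: {statistics.mean(data):.1f}")
print(f"approx. 95% bootstrap interval: ({lo:.1f}, {hi:.1f})")
```

Reporting the interval alongside the point estimate makes the uncertainty explicit, which is precisely the discipline the paragraph above argues for.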