[Blog Reads] October 2017

Ready for Round 2? There are a lot of interesting articles in this roundup.

Data Science Methods

When we have a huge sample set to analyze, correlation between variables can be a tricky concept to reason about. Here are seven ways to view correlation, compiled by SAS:

Graphically

The sum of crossproducts

The inner product of standardized vectors

Angle between two vectors

The standardized covariance

Slope of the regression line between two standardized variables

Geometric mean of regression slopes
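To make the list above concrete, here is a minimal NumPy sketch of six of the seven views (everything except the graphical one). The function name `correlation_views` and the synthetic data are my own assumptions for illustration, not code from the SAS article. Each entry should agree with the Pearson correlation up to floating-point error.

```python
import numpy as np

def correlation_views(x, y):
    """Compute Pearson's r in several equivalent ways (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xc, yc = x - x.mean(), y - y.mean()      # centered variables
    sx, sy = x.std(ddof=1), y.std(ddof=1)    # sample standard deviations
    zx, zy = xc / sx, yc / sy                # z-scores

    # Sum of crossproducts of the z-scores, divided by n - 1
    r_crossprod = (zx * zy).sum() / (n - 1)

    # Inner product of the centered vectors scaled to unit length
    ux, uy = xc / np.linalg.norm(xc), yc / np.linalg.norm(yc)
    r_inner = float(ux @ uy)

    # Cosine of the angle between the two centered vectors
    angle = np.arccos(np.clip(r_inner, -1.0, 1.0))
    r_angle = float(np.cos(angle))

    # Covariance divided by the product of standard deviations
    cov = (xc * yc).sum() / (n - 1)
    r_cov = cov / (sx * sy)

    # Slope of the least-squares line fitted to the z-scores
    r_slope = float(np.polyfit(zx, zy, 1)[0])

    # Signed geometric mean of the two regression slopes (y on x, x on y)
    b_yx = (xc * yc).sum() / (xc ** 2).sum()
    b_xy = (xc * yc).sum() / (yc ** 2).sum()
    r_geo = float(np.sign(b_yx) * np.sqrt(b_yx * b_xy))

    return {
        "crossproducts": float(r_crossprod),
        "inner_product": r_inner,
        "angle": r_angle,
        "standardized_covariance": float(r_cov),
        "standardized_slope": r_slope,
        "geometric_mean_of_slopes": r_geo,
    }

# Synthetic data, purely for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.6 * x + rng.normal(size=200)
views = correlation_views(x, y)
reference = float(np.corrcoef(x, y)[0, 1])
```

Comparing every value in `views` against `reference` is a quick way to convince yourself that these really are the same number seen from different angles.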

Another study comes from Storybench, which has been interviewing data specialists for three years and drew the following conclusion from those interviews: “Based on the interviews analyzed, three recurring themes emerged: Collaborative, open and mobile.” – link.

Collaborative, team-based story-building

The new open-source ethos

Mobile-focused ideation

Spotify’s Discover Weekly algorithm is very complex and based on three machine learning models (link):

Collaborative filtering models: analyze your behavior and that of others (if two people listen to many of the same songs, chances are each will enjoy the other’s playlist)

Natural language processing: analyzes text (lyrics, blogs, articles) to determine which adjectives are used to describe a song and how to categorize it

Audio models: take the raw audio, transform it into a matrix, then use a CNN to categorize the song.
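The collaborative filtering idea from the list above can be sketched in a few lines. This is a toy user-based version using cosine similarity, and it is not Spotify's actual implementation; the `recommend` function and the tiny listening matrix are hypothetical and only there to illustrate the principle.

```python
import numpy as np

def recommend(plays, user, top_n=2):
    """Suggest songs for `user` based on listeners with similar taste.

    `plays` is a users x songs matrix (1 = listened, 0 = not listened).
    """
    plays = np.asarray(plays, dtype=float)

    # Scale each user's row to unit length so dot products are cosine similarities
    norms = np.linalg.norm(plays, axis=1, keepdims=True)
    norms[norms == 0] = 1.0           # avoid dividing by zero for empty profiles
    unit = plays / norms

    sims = unit @ unit[user]          # similarity of every user to the target user
    sims[user] = 0.0                  # ignore the user's similarity to themselves

    # Score each song by the similarity-weighted listens of the other users
    scores = sims @ plays
    scores[plays[user] > 0] = -np.inf  # never recommend songs already heard

    ranked = np.argsort(scores)[::-1]
    return [int(s) for s in ranked[:top_n] if np.isfinite(scores[s])]

# Hypothetical listening history: rows are users, columns are songs
plays = [
    [1, 1, 1, 0, 0],  # user 0
    [1, 1, 1, 1, 0],  # user 1 shares three songs with user 0
    [0, 0, 0, 0, 1],  # user 2 has nothing in common with user 0
]
suggestions = recommend(plays, user=0, top_n=1)
```

In this toy matrix, user 1 is the closest match to user 0, so the one song user 1 has heard that user 0 has not is ranked first.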

Financial Prediction

A computer was asked to predict which start-ups would be successful. The results were astonishing: This article looks back at an eight-year-old, computer-generated prediction that accurately spotted start-ups like Evernote, Spotify, Etsy, Zynga, Palantir and more as future tech giants. The authors are now looking to create a new list. The methodology this time will use data from up to 50,000 private companies and categorize them by investor and theme. (link) Some interesting findings:

Augmented reality will be far more significant than virtual reality because it will shape the way we look at and interact with the world around us.

Image recognition and mapping technologies will be deployed across the auto industry as traditional car manufacturers adapt to self-driving vehicles.

There continue to be major market opportunities in e-commerce as fashion becomes increasingly mobile and social.

Learning Things We Already Know About Stocks: This project, done in R, groups stocks together in a network that highlights associations within and between the groups, using only historical price data. As expected, the stocks end up grouped by business sector.

“We downloaded daily closing stock prices for 100 stocks from the S&P 500, and, using basic tools of statistics and analysis like correlation and regularization, we grouped the stocks together in a network that highlights associations within and between the groups. The structure teased out of the stock price data is reasonably intuitive.” – (link)
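For a feel of how grouping by correlation works, here is a rough Python sketch. It is not the article's R code: it swaps the regularization step for a plain correlation threshold, and it runs on synthetic returns rather than real S&P 500 prices. The function name `correlation_groups` and the threshold of 0.5 are assumptions of mine.

```python
import numpy as np

def correlation_groups(returns, threshold=0.5):
    """Group columns of `returns` whose pairwise correlation is strong.

    Builds a graph with an edge wherever |correlation| >= threshold and
    returns its connected components as sorted lists of column indices.
    """
    corr = np.corrcoef(returns, rowvar=False)
    n = corr.shape[0]
    adjacent = (np.abs(corr) >= threshold) & ~np.eye(n, dtype=bool)

    groups, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search to collect everything reachable from `start`
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbor in np.flatnonzero(adjacent[node]):
                neighbor = int(neighbor)
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(group))
    return groups

# Synthetic daily returns: two made-up "sectors" of three stocks each,
# where every stock follows its sector's factor plus its own noise
rng = np.random.default_rng(0)
days = 500
sector_a = rng.normal(size=(days, 1))
sector_b = rng.normal(size=(days, 1))
noise = 0.5 * rng.normal(size=(days, 6))
returns = np.hstack([sector_a + noise[:, :3], sector_b + noise[:, 3:]])

groups = correlation_groups(returns, threshold=0.5)
```

With data built this way, the stocks that share a sector factor should land in the same group, which mirrors the "things we already know" result of the original project.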

Deep Reinforcement Learning

Deep reinforcement learning (DRL) keeps progressing; here is some new work I found especially interesting:

Unity Machine Learning Agents: Unity introduces new agents that simulate a game-playing environment for RL research. (link)

Data Visualization

Machine Visions: Exploring Visual Motifs in Wes Anderson Films. I love a well-made presentation that comes with interactive data visualization. This Machine Visions webpage does exactly that, and it is fun to go through again and again. (link)

A quick explanation of what the open-source project Semiotic can do: transform data plots into a “sketchy version”! (link)