Month: May 2017

In 2016, all I was reading about was big data, deep learning, artificial intelligence, machine learning, etc. Soon I realized I needed to do more than just read about it. So for 2017, I decided it was time to take a deep dive into Machine Learning and see what all the buzz was about.

I haven’t programmed in 20 years but figured now would be a great time to restart. From all the reading I did in 2016, it was clear that the programming language of choice for Machine Learning was Python. I didn’t want to take a bunch of disconnected courses on Coursera and Udacity to learn about Machine Learning; instead, I had a project in mind. When I moved to India 12 years ago, it was to launch an algorithm/quant hedge fund. I was the guy tasked with getting all the technology infrastructure (servers, data feeds, leased lines, datacenter access, etc.) in place, and over time I would learn to build trading algorithms. One thing led to another and I never got around to building those models. Over the years, I felt the algo/quant space was overdone and that it would be tough to get back into it. However, there has been a resurgence with all of the new technologies involving Artificial Intelligence entering the space. So that was my goal: learn Machine Learning to trade the stock market.

I spent the first couple of weeks of the new year putting together a plan to accomplish the end goal. The first thing was to take an introductory course on Python from Coursera. In parallel, I was researching the algo/quant side and understanding what goes into building models, trading models and risk management. Not only did I want to learn about Machine Learning, but whatever I did, I wanted to build it as if it were going to be a billion dollar asset management company: highly redundant architecture, quality data feeds and top-notch risk management. It soon became clear this was not something that was going to get built over a weekend!

I was able to break down the work into 3 stages:
1. Infrastructure – cloud provider, servers, databases, data feeds, trade execution
2. Research trading models – researching and designing algorithms to produce “alpha”
3. Risk management – once the trade is made, constantly monitoring the position and making sure it fits within the risk model that has been designed or, as they say within the industry, Value at Risk (VaR).
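For readers who have not met VaR before: it estimates the loss that should not be exceeded over a given horizon at a chosen confidence level. The post does not say how the risk model was built, so the following is only a minimal sketch of one common approach (historical simulation). The function name `historical_var` and the sample returns are hypothetical, made up purely for illustration.

```python
import math


def historical_var(returns, confidence=0.95):
    """Historical-simulation VaR as a positive fraction of position value.

    `returns` is a list of past period returns (e.g. 0.01 for +1%).
    Hypothetical helper for illustration only, not the author's risk model.
    """
    if not returns:
        raise ValueError("need at least one return")
    # Turn returns into losses (a -3% return is a 0.03 loss), smallest first.
    losses = sorted(-r for r in returns)
    # Pick the loss sitting at the requested confidence quantile.
    idx = min(len(losses) - 1, math.ceil(confidence * len(losses)) - 1)
    # A negative "loss" is a gain, so VaR is floored at zero.
    return max(losses[idx], 0.0)


# Made-up daily returns, purely illustrative.
sample_returns = [0.01, -0.02, 0.005, -0.01, 0.03,
                  -0.05, 0.02, 0.0, -0.03, 0.015]

var_90 = historical_var(sample_returns, confidence=0.90)
position_value = 1_000_000  # assumed position size
print(f"1-day 90% VaR: {var_90:.2%} of position, "
      f"about {position_value * var_90:,.0f} in currency terms")
```

Monitoring a position then amounts to recomputing a figure like this as new returns arrive and checking it against whatever limit the risk model sets.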

This blog post will talk about the infrastructure and some of the technology I learned along the way.

It quickly became apparent that many of the Machine Learning experts were using something called Jupyter, an open-source platform for sharing notebooks and running live Python code. It’s like an online version of the IDE (integrated development environment) that programmers use to build applications.

The next thing was to start getting data, and lots of it, onto the platform that I had built. For all the crap I talk about Yahoo, they have a pretty good finance section for downloading historical stock data for Indian stocks. Using pandas, a Python data analysis library, I was able to pull down all the price data I needed.
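The post does not show the actual download code, so here is only a rough sketch of what working with this kind of data in pandas can look like. It assumes the historical prices have already been saved as a CSV with Date, Open, High, Low, Close and Volume columns (an assumed layout), and the rows below are made-up placeholders rather than real quotes. `load_prices` is a hypothetical helper name.

```python
import io

import pandas as pd

# Placeholder rows in an assumed Date/Open/High/Low/Close/Volume layout.
# The numbers are invented for illustration, not real market data.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume
2017-01-02,100.0,102.0,99.0,101.0,1000
2017-01-03,101.0,103.0,100.0,102.0,1200
2017-01-04,102.0,104.0,101.0,100.0,900
"""


def load_prices(source):
    """Read daily prices from a CSV path or file-like object, indexed by date."""
    df = pd.read_csv(source, parse_dates=["Date"], index_col="Date")
    return df.sort_index()  # make sure rows are in chronological order


prices = load_prices(io.StringIO(SAMPLE_CSV))

# Daily percentage change in the closing price; the first row has no
# previous day to compare against, so it is NaN.
prices["Return"] = prices["Close"].pct_change()

print(prices[["Close", "Return"]])
```

In a Jupyter notebook, passing a file path instead of the `io.StringIO` object would load a saved download the same way.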