Data Science Immersive Program

What does Divergence Academy Teach

Are you interested in taking a course with us? Learn more on our programs page or contact us.

The course runs from September 11 to December 8, 2017.

This 12-week course blends Business Analysis, Machine Learning, Data Engineering, and Software Development, with the goal of building a Data Product. It is well suited for:

Developers who want to transition into a Data Scientist role

Entrepreneurs who want to launch new products covering IoT and analytics

PhD students who want to transition to the business world

PREREQUISITES

If it’s been a long time since you used any linear algebra, this is a good time for a refresher. Here are the relevant concepts on Metacademy. Each one gives a number of pointers, but the Khan Academy links are especially useful since they have auto-graded exercises you can use to check your understanding.

If you haven’t taken a programming class, you may need to spend some time learning the basics. You should watch Lectures 2, 3, 4, and 6 of MIT 6.001 on EdX. (Lectures 7 and 11 are also helpful.)

If you have programming experience but not in Python, read Learn X in Y Minutes for a concise summary of the language. You can probably pick up Python quickly if you are familiar with another general-purpose language (C, Java, Matlab, etc.).
Read this tutorial on NumPy, the library we’ll use for array manipulation and linear algebra.
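
As a rough preview of what "array manipulation and linear algebra" means in practice, here is a minimal sketch. The matrix A and vector b are made-up values chosen only for illustration; they are not taken from the course material.

```python
import numpy as np

# Illustrative 2x2 system A @ x = b (values are assumptions for this sketch).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Array manipulation: transpose and element-wise scaling.
A_t = A.T
doubled = 2 * A

# Linear algebra: solve the system, then check the result
# with a matrix-vector product.
x = np.linalg.solve(A, b)
residual = A @ x - b

print("solution:", x)
print("residual:", residual)
```

If the residual is not close to zero, the solve step did not satisfy the system.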

Spark Streaming, Timeseries Analysis in the context of Internet of Things

Collaborative Filtering

Model Deployment

Model Operations

WEEK #9 - PROJECT CAPSTONE

WEEK #10 - PROJECT CAPSTONE

WEEK #11 - COURSE REVIEW & INTERVIEW PREP

WEEK #12 - PROJECT PRESENTATIONS

INDUSTRY TOOLS

Beyond the guest lectures, students also apply the concepts and algorithms using current industry tools. Some of the tools are listed below:

WEEK    TOOL

1       Trifacta, Import.io
2       BayesiaLab
3       H2O
4       MongoDB
5       Azure ML
6       Hortonworks HDP, PrestoDB
7       Tableau
8       Dato, Azure Remote Monitoring

ONLINE RESOURCES FOR GETTING STARTED WITH DATA SCIENCE AND MACHINE LEARNING

For someone trying to get started with ML, here is a resource where the complexity is just right. It introduces a lot of the essential Mathematics without going too deep into it, and it is equivalent to the Applied ML course at Stanford.

In brief, these are the basic ML topics that will help you solve a wide range of problems:

Regression: single variable, multiple variables, logistic regression

Overfitting and underfitting issues: 'bias' and 'variance'

Simple clustering algorithms: K-Means

Applying basic linear algebra: Principal Component Analysis

Recommendation systems and large-scale systems
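
To give a feel for the simplest item in the list above, here is a minimal sketch of single-variable linear regression using the closed-form least-squares solution. The function name fit_line and the sample data are hypothetical and chosen for illustration; the courses mentioned here may present regression differently, for example with gradient descent.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ~ slope * x + intercept for one variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of x-y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With noisy real-world data the fitted line will not pass through every point, which is where the bias and variance discussion becomes relevant.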

Many people have gone on to become top contestants on Kaggle (a popular data science contest portal) after doing this course. These introductory algorithms can be extremely useful.

Apart from this we’d also recommend learning a bit about text processing such as regular expressions, string functions and language models. You might find them in the first few lectures and tutorials on this Natural Language Processing Course.
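
As a small taste of the text processing mentioned above, the sketch below uses Python's standard re module and basic string functions. The sample text, the email pattern, and the variable names are assumptions made for illustration; the pattern is deliberately simple and is not a full email validator.

```python
import re
from collections import Counter

# Made-up sample text for illustration.
text = "The cat saw the cat. Write to info@example.com for details."

# Regular expression: find things that look like email addresses.
email_pattern = re.compile(r"[\w.+-]+@[\w-]+\.[a-zA-Z]+")
emails = email_pattern.findall(text)

# String functions plus a regex: lowercase the text, drop the emails,
# and split it into alphabetic word tokens.
cleaned = email_pattern.sub(" ", text.lower())
tokens = re.findall(r"[a-z]+", cleaned)

# Word counts are the starting point for a simple unigram language model.
counts = Counter(tokens)

print(emails)
print(counts.most_common(2))
```

Dividing each count by the total number of tokens gives the unigram probabilities that introductory language-model lectures usually begin with.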

We'd like to emphasize that much of the Mathematics involved doesn't require more than an introductory statistics course.

For a quick and general introduction to Data Science, the course material from this Coursera course is great, and introduces R, Python, Map-Reduce and Data Visualization techniques.

At a later stage, those looking for more academically challenging and abstract courses might be interested in the Learning from Data course from Caltech and the Probabilistic Graphical Models course from Stanford. The first gives more theoretical insight into the foundations of Machine Learning and Statistical Learning Theory. The second is about combining Data Structures with Statistics to build Bayesian Networks and Hidden Markov Models, powerful tools that are used in medical diagnostics, speech recognition engines, and Kinect, and that have been found to be significant improvements on traditional Machine Learning techniques.

Credit: Our motivation to deliver something in Dallas, home to many Enterprise Customers, comes from pioneers who came before us, like Zipfian Academy and Datasciencemasters.org. Our course structure differs in that it has been blended to suit the needs of customers in Midwest and Heartland states.