PASS Summit Notes for my AzureML, R and Power BI Presentation

I’m going to have fun with my AzureML session today at PASS Summit! More will follow on this post later; I am racing off to the keynote so I don’t have long 🙂

I heard some folks weren’t sure whether to attend my session or Chris Webb’s session. I’m honestly flattered, but I’m not in the same league as Chris! I’ve posted my notes here so that folks who are stuck between the two can go off and attend Chris’ session.

Here’s some sample R code. I know it is simple, and there are better ways of doing this. However, remember that this is for instructional purposes in front of roughly 500 people, so I want to be sure everyone has a grounding before we talk about more complicated things.

# Let’s check that the columns were renamed correctly
# What is the maximum age of the adult?
# How much data is missing?
summary(adult.data)

# How many rows do we have?
# 32561 rows, 15 columns
dim(adult.data)

# There are lots of different ways to deal with missing data
# That would be a session in itself!
# For demo purposes, we are simply going to replace question marks, and remove rows which have anything missing.
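
Here is a minimal sketch of that idea on a tiny made-up data frame, assuming the missing values are coded as "?" (sometimes with a stray leading space). The data frame adult.sample and the names is.char, missing.per.column and adult.clean are hypothetical and only here for illustration; treat this as one simple way of doing it, not necessarily the exact code from the session.

```r
# Toy stand-in for adult.data (hypothetical values, just three columns).
adult.sample <- data.frame(
  age = c(39, 50, 38, 53),
  workclass = c("State-gov", "?", "Private", "Private"),
  occupation = c("Adm-clerical", "?", "Handlers-cleaners", " ?"),
  stringsAsFactors = FALSE
)

# Trim stray whitespace in character columns so " ?" and "?" are treated alike.
is.char <- sapply(adult.sample, is.character)
adult.sample[is.char] <- lapply(adult.sample[is.char], trimws)

# Replace question marks with NA.
adult.sample[adult.sample == "?"] <- NA

# How many values are missing in each column?
missing.per.column <- colSums(is.na(adult.sample))

# Remove rows which have anything missing.
adult.clean <- na.omit(adult.sample)
dim(adult.clean)
```

Dropping rows with na.omit is the bluntest option: it is fine for a demo, but on real data it is worth checking how many rows you lose before committing to it.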