Solar Radiation Prediction

Scikit-learn is a fantastic set of tools for machine learning in Python. It is built on NumPy, SciPy, and matplotlib, which were introduced in the first py-guy post, and it makes data analysis and visualization simple and intuitive. Scikit-learn provides classification, regression, clustering, dimensionality reduction, model selection, and preprocessing algorithms, making data analysis in Python accessible to everyone. In this week’s post we will cover an example of linear regression, exploring solar radiation data from a NASA hackathon.

First, after importing packages, let’s read in the SolarPrediction.csv data set. The link to the data set is given in a comment in the code block.
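A minimal sketch of this loading step, assuming pandas is installed. The `load_solar` helper is a hypothetical name, and the two in-memory rows are made-up placeholders with assumed column names, used only so the snippet runs without the download.

```python
import io

import pandas as pd


def load_solar(source):
    """Read the solar radiation CSV from a file path or a file-like object."""
    return pd.read_csv(source)


# Made-up stand-in rows with assumed column names, NOT values from the real data set.
sample_csv = io.StringIO(
    "UNIXTime,Date,Radiation,Temperature,Pressure,Humidity\n"
    "1475193600,9/30/2016,1.2,48,30.46,59\n"
    "1475193900,9/30/2016,1.3,49,30.47,58\n"
)
df = load_solar(sample_csv)

# With the real file in the working directory this would instead be:
# df = load_solar("SolarPrediction.csv")
print(df.head())
```

Passing a path or a buffer to the same helper keeps the rest of the analysis unchanged when you switch from the placeholder rows to the real file.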

Taking a first look at the data set, note that the UNIXTime and Date columns are not yet parsed into a proper date type, so we will come back to them later.

df.shape
df.describe()

Calling the describe method on the data frame returns summary statistics for the data set and suggests there might be a relationship between radiation, humidity, and/or temperature.

So let’s look at a correlation plot to get a better feel for any possible relationships.

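# Pairwise correlations of the numeric columns only (numeric_only needs pandas 1.5+)
truthmat = df.corr(numeric_only=True)
sns.heatmap(truthmat, vmax=.8, square=True)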

There is a strong relationship between radiation and temperature (perhaps unsurprisingly), so let’s choose two features whose relationship is less obvious. Pressure and Temperature will do fine. We will use seaborn, a statistical visualization library based on matplotlib, to explore the relationship between the two features.
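One way to look at two features together is a scatter plot with a fitted line. The sketch below uses seaborn’s `regplot`, which is my choice for illustration and not necessarily the plot used in the original notebook. The data is synthetic (an assumption), generated only so the snippet runs on its own; swap in the real data frame to see the actual relationship.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display

import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption), NOT the real SolarPrediction values.
rng = np.random.default_rng(0)
pressure = rng.normal(30.4, 0.05, size=200)
temperature = 50 + 40 * (pressure - 30.4) + rng.normal(0, 3, size=200)
demo = pd.DataFrame({"Pressure": pressure, "Temperature": temperature})

# Scatter plot of the two columns with a least-squares line drawn on top.
ax = sns.regplot(x="Pressure", y="Temperature", data=demo, scatter_kws={"s": 10})
```

In an interactive session, drop the `matplotlib.use("Agg")` line and call `matplotlib.pyplot.show()` to display the figure.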

First we will convert the timestamps to datetime so we can manipulate them later, then add hour, month, and year columns for finer granularity. Much better!
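A minimal sketch of that conversion, assuming the UNIXTime column holds seconds since the epoch. The `DateTime`, `Hour`, `Month`, and `Year` column names are my own choices for illustration, and the values are read as UTC, so a local-time conversion would be an extra step.

```python
import pandas as pd

# Two example Unix timestamps in seconds, standing in for the UNIXTime column.
df = pd.DataFrame({"UNIXTime": [1475193600, 1475229326]})

# Interpret the integers as seconds since 1970-01-01; the result is timezone-naive UTC.
df["DateTime"] = pd.to_datetime(df["UNIXTime"], unit="s")

# Pull out calendar parts with the .dt accessor.
df["Hour"] = df["DateTime"].dt.hour
df["Month"] = df["DateTime"].dt.month
df["Year"] = df["DateTime"].dt.year
print(df)
```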

With sklearn’s linear regression we can fit a model to the data and then test the model for its accuracy. We will drop the Temperature column from the independent variables (the features) because it is the target we want to predict.
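The train-and-test workflow described above can be sketched as follows. The data is synthetic (an assumption) so the snippet is self-contained, and the 80/20 split and feature choice are illustrative, not the settings from the original notebook.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption), NOT the real SolarPrediction values.
rng = np.random.default_rng(42)
n = 500
demo = pd.DataFrame({
    "Pressure": rng.normal(30.4, 0.05, n),
    "Humidity": rng.uniform(10, 100, n),
})
demo["Temperature"] = (
    50 + 40 * (demo["Pressure"] - 30.4) - 0.1 * demo["Humidity"]
    + rng.normal(0, 0.5, n)
)

# Features are everything except the target; Temperature is what we predict.
X = demo.drop(columns=["Temperature"])
y = demo["Temperature"]

# Hold out 20% of the rows to check the model on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)

# score() returns R^2, the coefficient of determination, on the held-out rows.
r2 = model.score(X_test, y_test)
print(f"R^2 on held-out data: {r2:.3f}")
```

Scoring on held-out rows instead of the training rows gives a more honest estimate of how the model would do on new measurements.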

If you like these blog posts, or want to comment or share something, do so below and follow py-guy!

Note: I referenced Kaggler Sarah VCH’s notebook in making today’s blog post, specifically the feature engineering code in the fifth code block. If you want to see her notebook, I’ve listed the link below.
