Python and Bokeh

Introducing Part 2 of my “let’s mess about with Python and Bokeh” series.

Scatter

So last time, I created a simple bar chart showing the top Pokemon by Combat Power (CP). I used this Pokemon Go dataset from Kaggle. I want to keep it fairly simple again, but still learn something and show the data in a different way. I thought a scatter graph might be a good shout.

The Data

Scatter graphs show trends in two numbers, right? We already have CP from last time, so that’ll do for the Y axis again. For the X axis, I think it might be worthwhile using Hit/Health Points (HP). I would imagine there’ll be a positive correlation between the two, but it’d be cool to confirm, and any outliers might be interesting to note.
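If you want a number to go with that hunch, a quick Pearson correlation is one way to get it. This is only a minimal sketch: the helper name hp_cp_correlation is my own invention, and it assumes the columns are still called 'Max HP' and 'Max CP' as they are in the CSV.

```python
import pandas as pd


def hp_cp_correlation(data, hp_col='Max HP', cp_col='Max CP'):
    """Return the Pearson correlation coefficient between the HP and CP columns.

    hp_col and cp_col default to the column names used in this post;
    the function name itself is hypothetical, not part of pandas.
    """
    # Series.corr() uses the Pearson method by default
    return data[hp_col].corr(data[cp_col])
```

Calling hp_cp_correlation(pd.read_csv('pokemonGo.csv')) would give a value between -1 and 1, where anything well above 0 backs up the positive correlation idea. It won't point out the outliers though, which is where the graph earns its keep.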

Looking at the dataset again, it seems like there are a few other variables we should include:

Pokemon Name – scatter points are kind of useless unless you know what they refer to.

Type 1 – this is the Pokemon’s main type. I figure we might get some trends out of it.

The Code

Imports

Just using pandas and bokeh again:
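# bokeh.charts ships with older Bokeh releases; it was later split out into the separate bkcharts package
from bokeh.charts import Scatter, output_file, show
import pandas as pd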

Visualise

We’ve used pandas’ read_csv() function again to grab the data and specify our columns (including our new ones). Notice I’ve also used the rename() method to get rid of spaces in some of the column names. This is because the tooltips parameter in the graph builder needs it. NOTE: you’ll need the latest Bokeh to get the tooltips in the builder to even work. Older versions vomit. If we were to use the lower level Bokeh components this wouldn’t be an issue because we wouldn’t be using chart builders.
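# Load only the columns we need (header=0: the first row holds the column names)
data = pd.read_csv('pokemonGo.csv', header=0, usecols=['Name', 'Max CP', 'Max HP', 'Type 1'])
# Rename so the tooltip fields contain no spaces
data = data.rename(columns={'Name': 'name', 'Type 1': 'type1'})
# Colour and marker both vary by primary type; hovering shows name and type
scatter = Scatter(data, x='Max HP', y='Max CP', color='type1', marker='type1', tooltips=['name', 'type1'])
# Write the chart to scatter.html and open it in the browser
output_file('scatter.html')
show(scatter)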
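For comparison, here is roughly what the lower level route could look like with bokeh.plotting instead of the chart builders. Treat it as a sketch rather than the code behind this post: the function names build_scatter and main and the renamed columns max_hp and max_cp are my own, and it assumes a newer Bokeh release where figure() accepts tooltips and scatter() accepts legend_field.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.palettes import Category20
from bokeh.plotting import figure, output_file, show
from bokeh.transform import factor_cmap


def build_scatter(data):
    """Build an HP vs CP scatter from a frame with name, type1, max_hp and max_cp columns."""
    source = ColumnDataSource(data)
    # One colour per primary type (assumes no more than 20 distinct types)
    types = sorted(data['type1'].unique())
    plot = figure(
        title='Max CP vs Max HP',
        x_axis_label='Max HP',
        y_axis_label='Max CP',
        # '@column' pulls the value for the hovered point from the data source
        tooltips=[('Name', '@name'), ('Type', '@type1')],
    )
    plot.scatter(
        x='max_hp',
        y='max_cp',
        source=source,
        size=8,
        color=factor_cmap('type1', palette=Category20[20], factors=types),
        legend_field='type1',
    )
    return plot


def main(csv_path='pokemonGo.csv', html_path='scatter.html'):
    """Load the CSV, rename the columns (hypothetical names) and show the plot."""
    data = pd.read_csv(csv_path, usecols=['Name', 'Max CP', 'Max HP', 'Type 1'])
    data = data.rename(columns={
        'Name': 'name', 'Type 1': 'type1', 'Max HP': 'max_hp', 'Max CP': 'max_cp',
    })
    output_file(html_path)
    show(build_scatter(data))
```

Calling main() should write scatter.html and open it. Unlike the builder version, this one colours by type but keeps a single marker shape, which keeps the sketch short.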

Results

What did we find out? Well the graph confirms a positive correlation between HP and CP:

Luckily the code all worked too. We’ve got a key showing the primary types and if we hover on a point we can see the name and type appearing as they should. Great! Using the scatter has definitely added a bit more value to the previous post. We can see that normal Pokemon lead the field when it comes to HP.

We can also see that bug Pokemon don’t have great stats:
But anyone who has played Pokemon knows that doesn’t tell the whole story. Maybe I’ve got an excuse to create more parts to this series!
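The per-type observations above come from eyeballing the graph, but a groupby can cross-check them. Again, just a sketch: mean_stats_by_type is a made-up helper name, it expects the renamed type1 column from earlier, and averaging is only one choice (taking the max per type would tell a different story if one Pokemon is carrying its type).

```python
import pandas as pd


def mean_stats_by_type(data):
    """Average Max HP and Max CP per primary type, highest average HP first.

    Expects the 'type1', 'Max HP' and 'Max CP' columns used in this post.
    """
    return (
        data.groupby('type1')[['Max HP', 'Max CP']]
        .mean()
        .sort_values('Max HP', ascending=False)
    )
```

Running mean_stats_by_type(data) on the frame loaded earlier returns one row per type, so you can see where normal and bug actually land relative to the rest.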
