
1. Load Data

Whenever we work with machine learning algorithms that use a stochastic process (e.g. random numbers), it is a good idea to set the random number seed.

This is so that you can run the same code again and again and get the same result. This is useful if you need to demonstrate a result, compare algorithms using the same source of randomness, or debug a part of your code.

You can initialize the random number generator with any seed you like, for example:

from keras.models import Sequential
from keras.layers import Dense
import numpy
# fix random seed for reproducibility
numpy.random.seed(7)
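As a small illustration of what seeding buys you, the sketch below (NumPy only, no Keras) resets the seed and draws the same numbers twice. The variable names are illustrative. Note that this fixes NumPy's generator only; whether a given Keras backend takes all of its randomness from that generator is an assumption worth checking on your own setup.

```python
import numpy

# Seed, then draw three random numbers.
numpy.random.seed(7)
first_draw = numpy.random.rand(3)

# Re-seed with the same value and draw again.
numpy.random.seed(7)
second_draw = numpy.random.rand(3)

# The two draws match because the generator restarted from the same state.
same = numpy.array_equal(first_draw, second_draw)
print(same)
```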

Now we can load our data.

In this tutorial, we are going to use the Pima Indians onset of diabetes dataset. This is a standard machine learning dataset from the UCI Machine Learning repository. It describes patient medical record data for Pima Indians and whether they had an onset of diabetes within five years.

As such, it is a binary classification problem (onset of diabetes as 1 or not as 0). All of the input variables that describe each patient are numerical. This makes it easy to use directly with neural networks that expect numerical input and output values, and ideal for our first neural network in Keras.

Download the dataset and place it in your local working directory, the same location as your Python file. Save it with the file name:

pima-indians-diabetes.csv

You can now load the file directly using the NumPy function loadtxt(). There are eight input variables and one output variable (the last column). Once loaded we can split the dataset into input variables (X) and the output class variable (Y).

# load pima indians dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]

We have initialized our random number generator to ensure our results are reproducible and loaded our data. We are now ready to define our neural network model.

Note that the dataset has 9 columns, and the slice 0:8 selects columns 0 through 7, stopping before index 8. If this is new to you, it is worth reading up on NumPy array slicing and ranges.
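The slicing can be checked without the real file. The sketch below feeds loadtxt() two illustrative rows in the same 9-column layout through an in-memory buffer; the rows and variable names are stand-ins for demonstration, not the dataset itself.

```python
import io
import numpy

# Two illustrative rows in the 9-column layout (8 inputs, then the class label).
csv_text = "6,148,72,35,0,33.6,0.627,50,1\n1,85,66,29,0,26.6,0.351,31,0\n"

# loadtxt accepts a file-like object, so no file on disk is needed here.
dataset = numpy.loadtxt(io.StringIO(csv_text), delimiter=",")

X = dataset[:, 0:8]  # columns 0..7: the eight input variables
Y = dataset[:, 8]    # column 8: the output class

print(dataset.shape, X.shape, Y.shape)
```

The slice end is exclusive, which is why 0:8 yields eight columns and index 8 is left for the output.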

2. Define Model

Models in Keras are defined as a sequence of layers.

We create a Sequential model and add layers one at a time until we are happy with our network topology.

The first thing to get right is to ensure the input layer has the right number of inputs. This can be specified when creating the first layer with the input_dim argument and setting it to 8 for the 8 input variables.

How do we know the number of layers and their types?

This is a very hard question. There are heuristics that we can use and often the best network structure is found through a process of trial and error experimentation. Generally, you need a network large enough to capture the structure of the problem if that helps at all.

In this example, we will use a fully-connected network structure with three layers.

Fully connected layers are defined using the Dense class. We can specify the number of neurons in the layer as the first argument, the weight initialization method with the init argument (renamed kernel_initializer in Keras 2), and the activation function with the activation argument.

A traditional choice is to initialize the network weights to small random numbers drawn from a uniform distribution ('uniform'), which in Keras means values between -0.05 and 0.05. Another traditional alternative is 'normal', for small random numbers drawn from a Gaussian distribution. The listing below does not pass an initializer, so each layer falls back to the Keras default.

We will use the rectifier ('relu') activation function on the first two layers and the sigmoid function in the output layer. It used to be the case that sigmoid and tanh activation functions were preferred for all layers. These days, better performance is achieved using the rectifier activation function. We use a sigmoid on the output layer to ensure our network output is between 0 and 1 and easy to map to either a probability of class 1 or snap to a hard classification of either class with a default threshold of 0.5.
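To make the two activation functions concrete, here is a minimal NumPy sketch of the standard rectifier and sigmoid formulas, plus the 0.5 threshold. The function names and sample values are illustrative, and this is not how Keras implements them internally.

```python
import numpy

def relu(x):
    """Rectifier: negative inputs become 0, positive inputs pass through."""
    return numpy.maximum(0.0, x)

def sigmoid(x):
    """Logistic sigmoid: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + numpy.exp(-x))

values = numpy.array([-2.0, 0.0, 3.0])
hidden = relu(values)
probabilities = sigmoid(values)

# Snap probabilities to a hard class with the 0.5 threshold.
classes = (probabilities > 0.5).astype(int)
print(hidden, probabilities, classes)
```

How a probability of exactly 0.5 is treated is a convention; the strict comparison here assigns it to class 0.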

We can piece it all together by adding each layer. The first layer has 12 neurons and expects 8 input variables. The second hidden layer has 8 neurons and finally, the output layer has 1 neuron to predict the class (onset of diabetes or not).

# create model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
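One way to get a feel for the size of this network is to count its trainable parameters by hand. The sketch below assumes each fully connected layer has one weight per input per neuron plus one bias per neuron (the usual Dense layout); the helper name is illustrative.

```python
# Layer sizes as described in the text: 8 inputs -> 12 -> 8 -> 1.
layer_sizes = [8, 12, 8, 1]

def dense_parameters(n_inputs, n_neurons):
    """Weights (one per input per neuron) plus one bias per neuron."""
    return n_inputs * n_neurons + n_neurons

per_layer = [dense_parameters(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)
```

Under that assumption the three layers hold 108, 104 and 9 parameters, 221 in total, which is tiny by deep learning standards.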

3. Compile Model

Now that the model is defined, we can compile it.

Compiling the model uses the efficient numerical libraries under the covers (the so-called backend) such as Theano or TensorFlow. The backend automatically chooses the best way to represent the network for training and making predictions to run on your hardware, such as CPU or GPU or even distributed.

When compiling, we must specify some additional properties required when training the network. Remember, training a network means finding the best set of weights to make predictions for this problem.

We must specify the loss function to use to evaluate a set of weights, the optimizer used to search through different weights for the network and any optional metrics we would like to collect and report during training.

In this case, we will use logarithmic loss, which for a binary classification problem is defined in Keras as "binary_crossentropy". We will also use the efficient gradient descent algorithm "adam" for no other reason than that it is an efficient default. Learn more about the Adam optimization algorithm in the paper "Adam: A Method for Stochastic Optimization".

Finally, because it is a classification problem, we will collect and report the classification accuracy as the metric.
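In Keras, the settings described in this section are passed to compile(), for example model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']). To make the loss itself less abstract, here is a NumPy sketch of logarithmic loss for binary labels. The function name and the clipping constant are illustrative choices, not Keras internals.

```python
import numpy

def binary_log_loss(y_true, y_pred, eps=1e-7):
    """Mean logarithmic loss for 0/1 labels and predicted probabilities."""
    y_true = numpy.asarray(y_true, dtype=float)
    # Clip so that log(0) never occurs.
    y_pred = numpy.clip(numpy.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * numpy.log(y_pred)
               + (1.0 - y_true) * numpy.log(1.0 - y_pred))
    return float(numpy.mean(losses))

confident_and_right = binary_log_loss([1, 0], [0.9, 0.1])
unsure = binary_log_loss([1, 0], [0.5, 0.5])
confident_and_wrong = binary_log_loss([1, 0], [0.1, 0.9])
print(confident_and_right, unsure, confident_and_wrong)
```

The loss grows as predictions move away from the true labels, which is what the optimizer tries to drive down while it searches over the weights.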

4. Fit Model

We have defined our model and compiled it ready for efficient computation.

Now it is time to execute the model on some data.

We can train or fit our model on our loaded data by calling the fit() function on the model.

The training process will run for a fixed number of iterations through the dataset, called epochs, which we must specify using the epochs argument. We can also set the number of instances that are evaluated before a weight update in the network is performed, called the batch size and set using the batch_size argument.

For this problem, we will run for a small number of iterations (150) and use a relatively small batch size of 10. Again, these can be chosen experimentally by trial and error.

# Fit the model
model.fit(X, Y, epochs=150, batch_size=10)

This is where the work happens on your CPU or GPU.
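To see roughly how much work that is, the arithmetic below counts weight updates. It assumes the dataset has 768 rows (the figure a reader quotes in the comments) and that the final, smaller batch of each epoch is still used.

```python
import math

n_samples = 768   # assumed dataset size, taken from a reader comment
batch_size = 10
epochs = 150

# One weight update per batch; the last batch of an epoch may be smaller.
updates_per_epoch = math.ceil(n_samples / batch_size)
last_batch_size = n_samples - (updates_per_epoch - 1) * batch_size
total_updates = updates_per_epoch * epochs
print(updates_per_epoch, last_batch_size, total_updates)
```

Under those assumptions each epoch performs 77 updates (the last on 8 rows), for 11,550 updates over the whole run.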

No GPU is required for this example, although large models can be run cheaply on GPU hardware in the cloud if you are interested.

5. Evaluate Model

We have trained our neural network on the entire dataset and we can evaluate the performance of the network on the same dataset.

This will only give us an idea of how well we have modeled the dataset (e.g. train accuracy), but no idea of how well the algorithm might perform on new data. We have done this for simplicity, but ideally, you could separate your data into train and test datasets for training and evaluation of your model.

You can evaluate your model on your training dataset using the evaluate() function on your model and pass it the same input and output used to train the model.

This will generate a prediction for each input and output pair and collect scores, including the average loss and any metrics you have configured, such as accuracy.

# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))
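If you would rather estimate performance on unseen data, one simple option is to hold back part of the rows before training. The sketch below is a plain NumPy split; the helper name, the 33% test fraction, and the stand-in arrays are all assumptions for illustration.

```python
import numpy

def train_test_split_arrays(X, Y, test_fraction=0.33, seed=7):
    """Shuffle row indices once, then cut them into train and test parts."""
    rng = numpy.random.RandomState(seed)
    indices = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], Y[train_idx], Y[test_idx]

# Stand-in data with the tutorial's shape: 8 input columns and a 0/1 label.
X_demo = numpy.arange(80, dtype=float).reshape(10, 8)
Y_demo = numpy.array([0, 1] * 5, dtype=float)

X_train, X_test, Y_train, Y_test = train_test_split_arrays(X_demo, Y_demo)
print(X_train.shape, X_test.shape)
```

With the real data you would then fit on X_train and Y_train and call evaluate() on X_test and Y_test, so the reported accuracy reflects rows the model never saw.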

6. Tie It All Together

You have just seen how you can easily create your first neural network model in Keras.

Note: If you try running this example in an IPython or Jupyter notebook you may get an error. The reason is the output progress bars during training. You can easily turn these off by setting verbose=0 in the call to model.fit().

Note, the skill of your model may vary.

Neural networks are a stochastic algorithm, meaning that the same algorithm on the same data can train a different model with different skill. This is a feature, not a bug.

7. Bonus: Make Predictions

After I train my model, how can I use it to make predictions on new data?

Great question.

We can adapt the above example and use it to generate predictions on the training dataset, pretending it is a new dataset we have not seen before.

Making predictions is as easy as calling model.predict(). We are using a sigmoid activation function on the output layer, so the predictions will be in the range between 0 and 1. We can easily convert them into a crisp binary prediction for this classification task by rounding them.
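The rounding step can be sketched on its own. The array below is a made-up stand-in for what model.predict(X) returns for this network (one probability per input row), so no trained model is needed to follow along.

```python
import numpy

# Stand-in for the output of model.predict(X): one probability per row, shape (n, 1).
predictions = numpy.array([[0.91], [0.08], [0.63], [0.37]])

# Threshold at 0.5 and flatten to a plain list of 0/1 class labels.
rounded = (predictions > 0.5).astype(int).ravel().tolist()
print(rounded)
```

Predictions come back in the same order as the input rows, so the first value is the prediction for the first row, the second for the second row, and so on.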

The complete example that makes predictions for each record in the training data is listed below.

I am interested in deep learning and machine learning. You mentioned “It defines a hidden layer with 12 neurons, connected to the input layer that use relu activation function.” I wonder how can we determine the number of neurons in order to achieve a high accuracy rate of the model?

Sir, thanks for your tutorial. Would you be willing to make a tutorial on stock data prediction with a neural network model, trained on any stock data? If you already have one, please share the link. Thanks

We may be maxing out on this problem, but here is some general advice for lifting performance.
– data prep – try lots of different views of the problem and see which is best at exposing the structure of the problem to the learning algorithm (data transforms, feature engineering, etc.)
– algorithm selection – try lots of algorithms and see which one or few are best on the problem (try on all views)
– algorithm tuning – tune well performing algorithms to get the most out of them (grid search or random search hyperparameter tuning)
– ensembles – combine predictions from multiple algorithms (stacking, boosting, bagging, etc.)

For neural nets, there are a lot of things to tune. I think there are big gains in trying different network topologies (layers and number of neurons per layer) in concert with training epochs and learning rate (bigger nets need more training).

Jason, I’m not quite understanding how the predicted values ([1.0, 0.0, 1.0, 0.0, 1.0,…) map to the real world problem. For instance, what does that first “1.0” in the results indicate?

I get that it’s a prediction of ‘true’ for diabetes…but to which patient is it predicting that—the first in the list? So then the second result, “0.0,” is the prediction for the second patient/row in the dataset?

Hey, Jason! Thank you for the awesome tutorial! I’ve used your tutorial to learn about CNNs. I have one question for you… Suppose I want to use Keras to classify images and I have 3 or more classes. How could my algorithm know about these classes? You know, I have to code what is a cat, a dog and a horse. Is there any way to code this? I’ve tried it:

This is an example of a multi-class classification problem. You must use a one hot encoding on the output variable to be able to model it with a neural network and specify the number of classes as the number of outputs on the final layer of your network.
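A one hot encoding can be sketched in a couple of lines of NumPy. The class numbering below (0 = cat, 1 = dog, 2 = horse) is a hypothetical mapping chosen for illustration.

```python
import numpy

# Hypothetical integer class labels: 0 = cat, 1 = dog, 2 = horse.
labels = numpy.array([0, 2, 1, 2])
n_classes = 3

# Row i of the identity matrix is the one hot vector for class i.
one_hot = numpy.eye(n_classes)[labels]
print(one_hot)

# Taking the argmax of each row recovers the original integer labels.
recovered = one_hot.argmax(axis=1)
```

The output layer would then have 3 neurons, one per class. A softmax activation with a categorical cross-entropy loss is a common pairing for that setup, and Keras also ships a to_categorical() utility that performs this encoding.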

I’m using keras (with CNNs) for sentiment classification of documents and I’d like to improve the performance, but I’m completely at a loss when it comes to tuning the parameters in a non-arbitrary way. Could you maybe point me somewhere that will help me go about this in a more systematic fashion? There must be some heuristics or rules-of-thumb that could guide me.

I have a tutorial coming out soon (next week) that provides lots of examples of tuning the hyperparameters of a neural network in Keras, but limited to MLPs.

For CNNs, I would advise tuning the number of repeating layers (conv + max pool), the number of filters in repeating block, and the number and size of dense layers at the predicting part of your network. Also consider using some fixed layers from pre-trained models as the start of your network (e.g. VGG) and try just training some input and output layers around it for your problem.

Hi Jason,
I’m a student conducting research on how to use artificial neural networks to predict the business viability of potential software projects.
I intend to use Python as the programming language. The application of ANNs fascinates me, but I’m new to machine learning and Python. Can you suggest how to go about this?
Many thanks

Dear Jason, this is a great tutorial for beginners. It will satisfy the need of many students who are looking for initial help. But I have a question. Could you please shed light on a few things: i) how to test the trained model using a test dataset (i.e., loading the test dataset and applying the model, supposing the test file name is test.csv), ii) how to print the accuracy obtained on the test dataset, and iii) what to do when the output has more than 2 classes (suppose a 4-class classification problem).
Please show the whole program to overcome any confusion.
Thanks a lot.

I could print a diagram of the network, but what I basically want is for each neuron in the current time frame to know only its own previous output, and not the output of all the neurons in the output layer.

Hi Jason
Thanks for this great tutorial. I am new to machine learning; I went through your basic tutorial on Keras and also the one on handwritten digit recognition. I would like to understand how I can train on a set of image data, e.g. images of shapes such as a square, circle, or pyramid.
Please let me know how the input data needs to be fed to the program and how we need to export the model.

Are there any inbuilt functions in keras that can give me the feature importance for the ANN model?

If not, can you suggest a technique I can use to extract variable importance from the loss function? I am considering an approach similar to that used in RF which involves permuting the values of the selected variable and calculating the relative increase in loss.

Dear Jason, I am new to Deep learning. Being a novice, I am asking you a technical question which may seem silly. My question is that- can we use features (for example length of the sentence etc.) of a sentence while classifying a sentence ( suppose the o/p are +ve sentence and -ve sentence) using deep neural network?

Hi Jason,
The tutorial looks really good but unfortunately I keep getting an error when importing Dense from keras.layers. I get the error: AttributeError: module 'theano' has no attribute 'gof'
I have tried reinstalling Theano but it has not fixed the issue.

Can you please make a tutorial on how to add additional train data into the already trained model? This will be helpful for the bigger data sets. I read that warm start is used for random forest. But not sure how to implement as algorithm. A generalised version of how to implement would be good. Thank You!

Hi Jason,
first of all congratulations for this amazing work that you have done!
Here is my question:
What about if my .csv file includes also both nominal and numerical attributes?
Should I change my nominal values to numerical?

Hello Jason,
You are using model.predict at the end to predict the results. Is it possible to save the model somewhere on the hard disk and transfer it to another machine (a TurtleBot running ROS in my case) and then use the model directly on the TurtleBot to predict the results?
Please tell me how
Thanking you
Homagni Saha

The problem has 8 input variables and the first hidden layer has 12 neurons. Inputs are the columns of data, these are fixed. The Hidden layers in general are whatever we design based on whatever capacity we think we need to represent the complexity of the problem. In this case, we have chosen 12 neurons for the first hidden layer.

Hi,
I have data, IRIS-like data but with more columns.
I want to use MLP and DBN/CNNClassifier (or any other deep learning classification algorithm) on my data to see how correctly it is classified into 6 groups.

I previously used DEEP LEARNING FOR J; today is the first time I have seen Keras.
Does Keras have examples (code examples) of DL classification algorithms?

I have installed Theano, but it gives me a TensorFlow error. Is it mandatory to install both packages? TensorFlow is not supported on Windows; the only way to get it on Windows is to install a virtual machine.

Hello Rumesa!
Have you solved your problem? I have the same one. Everywhere I find the same answer involving the keras.json file or an environment variable, but it doesn’t work. Can you tell me what worked for you?

First off, thanks so much for creating these resources, I have been keeping an eye on your newsletter for a while now, and I finally have the free time to start learning more about it myself, so your work has been really appreciated.

My question is: How can I set/get the weights of each hidden node?

I am planning to create several arrays of randomized weights, then use a genetic algorithm to see which weight array performs the best and improve over generations. What would be the best way to go about this, and if I use a “relu” activation function, am I right in thinking these randomly generated weights should be between 0 and 0.05?

I have a question, how can I represent a character as a vector that could be an input for the neural network to predict the word meaning and trained using LSTM

For instance, I have bf to predict boy friend or best friend and similarly I have 2mor to predict tomorrow. I need to encode all the input as a character represented as vector, so that it can be train with RNN/LSTM to predict the output.

Thank you, Jason, if I map characters to integers value to get vectors representation of the informal text using English Alphabets, numbers and special characters

The question is how the LSTM will predict the characters or words that are close in meaning to the input value. Please explain in more detail. I understand how RNN/LSTM work based on your tutorial example, but the logic of designing the processing is what I am struggling with.

numpy.random.seed(seed)
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=100, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1] * 100))
this method above does not work and does not give me any error message.
could you help me with this please?

Hi Jason
Great work. I have another question. How can we apply this to text mining? I have a CSV file containing review documents and labels. I want to classify the documents based on the text available. Could you help with this?

First of all a special thanks to you for providing such a great tutorial. I am very new to machine learning and truly speaking i had no background in data science. The concept of ML overwhelmed me and now i have a desire to be an expert of this field. I need your advice to start from a scratch. Also i am a PhD student in Computer Engineering ( computer hardware )and i want to apply it as a tool for fault detection and testing for ICs.Can you provide me some references on this field?

Hi, Jason, thank you for your amazing examples.
I run the same code on my laptop. But I did not get the same results. What could be the possible reasons?
I am using windows 8.1 64bit+eclipse+anaconda 4.2+theano 0.9.4+CUDA7.5
I got results like follows.

Hello Jason Brownlee,Thx for sharing~
I’m new to deep learning, and I am wondering whether what you discussed here, Keras, can be used to build a CNN in TensorFlow and train on some CSV files for classification. Maybe this is a stupid question, but I am waiting for your reply. I’m working on my graduation project on word sense disambiguation with a CNN, and just can’t move on. Hope for your help~ Best wishes!

I’ve just installed Anaconda with Keras and am using python 3.5.
It seems there’s an error with the rounding using Py3 as opposed to Py2. I think it’s because of this change: https://github.com/numpy/numpy/issues/5700

I removed the rounding and just used print(predictions) and it seemed to work outputting floats instead.

X = dataset[:,1:17]
Y = dataset[:,0]
but I get an error (something about strings not being recognized).
I tried replacing each letter with its ASCII code (A became 65 and so on). The string error disappeared.
The program compiles now, but the output looks like this:

Since epochs is set to 150 and batch size is 10, does the training algorithm pick 10 training examples at random in each iteration, given that we had only 768 total in X? Or does it sample randomly after it has finished covering all of them?

Hi Jason
Thanks a lot for this blog. It really helps me to start learning deep learning, which had been in the planning stage for the last few months. Your simple, rich blog posts are awesome. No questions from my side before completing all the tutorials.
One question regarding availability of your book. How can I buy those books from India ?

Hi Jason, firstly your work here is a fantastic resource and I am very thankful for the effort you put in.
I am a slightly-better-than-beginner at python and an absolute novice at ML, I wonder if you could help me classify my problem and find an angle to work at it from.

I want to find the percentage chance of each Column Names category being the Result based off the configuration of all the values present from 1-15. Then if need be compare the configuration of Values with another row of values to find the same, Resulting in the total needed calculation as:

Dear Jason, Thanks for sharing this article.
I am a novice in deep learning, and I apologize if my question is not clear. My question is: could we call all these functions and programs from any .php, .aspx, or .html web page? I mean, I would load the variables and other file selections from the user interface and then pass them as input to these functions.

I used tensorflow as backend, and implemented the procedures using Jupyter.
I did “source activate tensorflow” -> “ipython notebook”.
I can successfully use Keras and import tensorflow.

However, it seems that such environment doesn’t support pandas and sklearn.
Do you have any way to incorporate pandas, sklearn and keras?
(I wish to use sklearn to revisit the classification problem and compare the accuracy with the deep learning method. But I also wish to put the works together in the same interface.)

Thanks, Jason!
Actually the problem is not on notebooks. Even I used the terminal mode, i.e. doing “source activate tensorflow” only. It failed to import sklearn. Does that mean tensorflow library is not compatible with sklearn? Thanks again!

hello sir,
A very informative post indeed. I know my question is a very trivial one, but can you please show me how to predict on an explicitly given data tuple, say v=[6,148,72,35,0,33.6,0.627,50]?
thanks for the tutorial anyway

Excuse me sir, I want to ask you a question about this line: "dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=',')". I used a Mac and downloaded the dataset, then I converted the text into a CSV file. Running the program

Great tutorial! Amazing amount of work you’ve put in and great marketing skills (I also have an email list, ebooks and sequence, etc). I ran this in Jupyter notebook… I noticed the 144th epoch (acc .7982) had more accuracy than at 150. Why is that?

P.S. i did this for the print: print(numpy.round(predictions))
It seems to avoid a list of arrays which when printing includes the dtype (messy)

Hi Jason, I’m just starting deep learning in Python using Keras and Theano. I have followed the installation instructions without a hitch. I tested some examples, but when I run this one line by line I get a lot of exceptions and errors once I run the "model.fit(X,Y, nb_epochs=150, batch_size=10"

I have a 32 core machine with 64 GB RAM and it does not converge even in more than an hour. I can see all the cores busy, so it is using all the cores for training. However, if I change the input neurons to 3 then it converges in around 2 minutes.

hello sir
Could you please tell me what exactly the role of the optimizer and binary_crossentropy is? It is written that the optimizer is used to search through the weights of the network. Which weights are we talking about exactly?

Batch size is how many patterns to show to the network before the weights are updated with the accumulated errors. The smaller the batch, the faster the learning, but also the more noisy the learning (higher variance).

Try exploring different batch sizes and see the effect on the train and test performance over each epoch.
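As a rough picture of what one epoch looks like, the sketch below shuffles the row indices once and then walks through them in batches, so every row is shown to the network exactly once per epoch. The helper name is illustrative and the 768-row size is the figure quoted by a reader above. As far as I know, Keras shuffles the training rows before each epoch by default, but treat that as something to confirm for your version.

```python
import numpy

def iterate_minibatches(n_samples, batch_size, rng):
    """Yield index arrays that together cover every sample once, in shuffled order."""
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

rng = numpy.random.RandomState(7)
batches = list(iterate_minibatches(n_samples=768, batch_size=10, rng=rng))

sizes = [len(batch) for batch in batches]
seen = numpy.sort(numpy.concatenate(batches))
print(len(batches), sizes[0], sizes[-1])
```

Each yielded batch corresponds to one weight update, so a smaller batch size means more, noisier updates per epoch.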

I want a neural network that can predict sine values. Further, from a given data set I need to determine the function (for example, if the data is from tan or cos, how to determine whether it is from tan only or cos only).

I am trying to use two odd frames of a video to predict the even one. Thus I need to give two images as input to the network and get one image as output. Can you help me with the syntax for the first model.add()? I have X_train of dimension (190, 2, 240, 320, 3) where 190 are the number of odd pairs, 2 are the two odd images, and (240,320,3) are the (height, width, depth) of each image.

Hi Jason,
Thanks for this awesome post.
I ran your code with tensorflow back end, just out of curiosity. The accuracy returned was different every time I ran the code. That didn’t happen with Theano. Can you tell me why?

Hi Jason,
If I want to use the diabetes dataset (NOT Pima) https://archive.ics.uci.edu/ml/datasets/Diabetes to predict Blood Glucose which tutorials and e-books of yours would I need to start with…. Also, the data in its current format with time, code and value is it usable as is or do I need to convert the data in another format to be able to use it.

Dr. Jason,
The data is time series(time based data) with categorical(20) with two numbers one for insulin level and another for blood sugar level… Each time series data does not have every categorical data… For example one category is blood sugar before breakfast, another category is blood sugar after breakfast, before lunch and after lunch… Some times some of these category data is missing… I read through the above link, but does not talk about time series, categorical data with some category of data missing what to do in those cases…. Please let me know if any of your books will help clarify these points?

Is it compulsory to normalize the data before using ANN model. I read it somewhere I which the author insisted that each attribute be comparable on the scale of [0,1] for a meaningful model. What is your take on that sir. Kind regards.

Hi Jason, You are simply awesome. I’m one of the many who got benefited from your book “machine learning mastery with python”. I’m working with a medical image classification problem. I have two classes of medical images (each class having 1000 images of 32*32) to be worked upon by the convolutional neural networks. Could you guide me how to load this data to the keras dataset? Or how to use my data while following your simple steps? kindly help.

I adapted your code with the cross validation pipelined with ANN (Keras) for my model. It gave me 100% still. I got the data from UCI ( Chronic Kidney Disease). It was 400 instances, 24 input attributes and 1 binary attribute. When I removed the rows with missing data I was left with 170 instances. Is my dataset too small for (24 input layer, 24 hidden layer and 1 output layer ANN, using adam and kernel initializer as uniform )?

Hi Jason,
I am currently working with the IMDB sentiment analysis problem as mentioned in your book. Am using Anaconda 3 with Python 3.5.2. In an attempt to summarize the review length as you have mentioned in your book, When i try to execute the command:

Hello, quite new to Python, Numpy and Keras(background in PHP, MYSQL etc). If there are 8 input variables and 1 output varable(9 total), and the Array indexing starts from zero(from what I’ve gathered it’s a Numpy Array, which is built on Python lists) and the order is [rows, columns], then shouldn’t our input variable(X) be X = dataset[:,0:7] (where we select from the 1st to 8th columns, ie. 0th to 7th indices) and output variable(Y) be Y = dataset[:,8] (where we the 9th column, ie. 8th index)?

Hi,
Would you mind if I use this code as an example of a simple network in a school project of mine?
Need to ask before using it, since I cannot find anywhere in this tutorial that you are OK with anyone using the code, and the ethics moment of my course requires me to ask (and of course give credit where credit is due).
Kind regards
Eric T

Can you give deep CNN code which includes 25 layers? In the first conv layer the filter size should be 39×39 with a total of 64 filters, in the 2nd conv layer 21×21 with 32 filters, in the 3rd conv layer 11×11 with 64 filters, and in the 4th conv layer 7×7 with 32 layers. The input image size is 256×256. I’m completely new to this deep learning thing, but if you can code that for me it would be a great help. Thanks

Some ideas:
– Consider trying the theano backend and see if that makes a difference.
– Try searching/posting on the keras user group and slack channel.
– Try searching/posting on stackoverflow or cross validated.

3. The batch size means how many training data are used in one epoch, am I right?
I have thought we have to use the whole training data set for the training. In this case I would determine the batch size as the number of training data pairs I have achieved through experiments etc.. In your example, does the batch (sized 10) means that the computer always uses the same 10 training data in every epoch or are the 10 training data randomly chosen among all training data before every epoch?

4. When evaluating the model what does the loss means (e.g. in loss: 0.5105 – acc: 0.7396)?
Is it the sum of values of the error function (e.g. mean_squared_error) of the output neurons?

I want to study the change in weights and predictions between each epoch run.
Have tried to use the model.train_on_batch method and the model.fit method with epoch=1 and batch_size equal all the samples.

But it seems like the model doesn’t save the new updated weights.
I print predictions before and after, and I don’t see a change in the evaluation scores.

Hi Dr Jason,
This is probably a stupid question but I cannot find out how to do it … and I am beginner on Neural Network.
I have relatively same number of inputs (7) and one output. This output can take numbers between -3000 and +3000.
I want to build a neural network model in python but I don’t know how to do it.
Do you have an example with outputs different from 0-1.
Thanks in advance

Thanks for your tutorial, I found it very useful to get me started with Keras. I’ve previously tried TensorFlow, but found it very difficult to work with. I do have a question for you though. I have both Theano and TensorFlow installed, how do I know which back-end Keras is using? Thanks again

If I have trained prediction models or neural network function scripts. How can I use them to make predictions in an application that will be used by end users? I want to use python but it seems I will have to redo the training in Python again. Is there a way I can rewrite the scripts in Python without retraining and just call the function of predicting?

Jason, I used your tutorial to install everything needed to run this tutorial. I followed your tutorial and ran the resulting program successfully. Can you please describe what the output means? I would like to thank you for your very informative tutorials.

…is redundant i.e. the Keras output in a loop running the same model with the same configuration will yield a similar variety of results regardless if it’s set at all, or which number it is set to. Or am I missing something?

I totally get what it should do, but as I had pointed out, it does not do it. If you run the codes you have provided above in a loop for say 10 times. First 10 with random seed set and the other 10 times without that line of code all together. Then compare the result. At least the result I’m getting, is suggesting the effect is not there i.e. both sets of 10 times will have similar variation in the result.

Thanks for sharing it. Been lately thinking about the aspect of accuracy a lot, it seems that at the moment it’s a “hot mess” in terms of the way common tools do it out of the box. I think a lot of non PhD / non expert crowd (most people) will at least initially be easily confused and make the kinds of mistakes you point out in your post.

Thanks for all the amazing contributions you are making in this field!

I have a question about your code:
Is the argument metrics=['accuracy'] necessary in the code, and does it change the results of the neural network, or is it just for showing me the accuracy during compiling?
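
In Keras, the loss given to `compile()` is what the optimizer minimizes, while entries in `metrics` are only computed and reported. The NumPy sketch below illustrates why the two are separate quantities: two sets of made-up predictions have the same accuracy but different loss. The function names and the example numbers are assumptions for illustration, not Keras internals.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean log loss: the kind of quantity training tries to reduce."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(y_prob)
                           + (1 - y_true) * np.log(1 - y_prob))))

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of rounded predictions that match the labels (reporting only)."""
    return float(np.mean((y_prob >= threshold).astype(int) == y_true))

y_true = np.array([1, 0, 1, 0])
confident = np.array([0.9, 0.1, 0.8, 0.2])  # made-up predictions
hesitant = np.array([0.6, 0.4, 0.6, 0.4])   # made-up predictions

print(accuracy(y_true, confident), binary_crossentropy(y_true, confident))
print(accuracy(y_true, hesitant), binary_crossentropy(y_true, hesitant))
```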

Your work here is really great. It helped me a lot.
I recently stumbled upon one thing I cannot understand:

For the pimas dataset you state:
<>
When I look at the table of the pimas dataset, the examples are in rows and the features in columns, so your input dimension is the number of columns. As far as I can see, you don’t change the table.

For neural networks, isn’t the input normally: examples = columns, features = rows?
Is this different for Keras? Or can I use both shapes? And if yes, what’s the difference in the construction of the net?

Thanks! 🙂
I had a lot of discussions because of that.
In Andrew Ng’s new Coursera course it’s explained as examples = columns, features = rows, but of course he doesn’t use Keras; he programs the neural networks from scratch.

That’s what I thought, but I looked it up in the notation for the new Coursera course (deeplearning.ai) and there it says: m is the number of examples in the dataset and n is the input size, where X superscript n x m is the input matrix …
But either way, you helped me! Thank you. 🙂
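
The two conventions hold the same numbers and differ only by a transpose. The tutorial's `X = dataset[:,0:8]` stores one example per row, so the input dimension is the number of columns; the n x m notation stores one example per column. A small NumPy sketch with made-up data (the sizes m = 5 and n = 8 are chosen for illustration):

```python
import numpy as np

# Hypothetical toy data: m = 5 examples, n = 8 features.
m, n = 5, 8
X_rows = np.arange(m * n, dtype=float).reshape(m, n)  # shape (m, n): examples as rows
X_cols = X_rows.T                                     # shape (n, m): examples as columns

# The third example is the same vector in both layouts.
third_from_rows = X_rows[2]
third_from_cols = X_cols[:, 2]

print(X_rows.shape, X_cols.shape)
```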

Hi Jason, thank you so much for your tutorial, it helps me a lot. I need your help for the question below:
I copy the code and run it. Although I got the classification results, there were some warning messages in the process. As follows:

Hi Jason,
Great article, thumbs up for that. I am getting this error when I try to run the file from the command prompt. Any suggestions? Thanks for your response.

#######################################################################
C:\Work\ML>python keras_first_network.py
Using TensorFlow backend.
2017-09-22 10:11:11.189829: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-22 10:11:11.190829: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
32/768 [>………………………..] – ETA: 0s
acc: 78.52%
#######################################################################
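
The lines marked `W` in the log above are warnings, not errors: the run still finishes and reports `acc: 78.52%`. If the messages are distracting, TensorFlow's C++ log level can be raised through the `TF_CPP_MIN_LOG_LEVEL` environment variable before the library is imported. A minimal sketch, assuming it is placed at the very top of the script:

```python
import os

# Must be set before TensorFlow (or Keras with the TensorFlow backend) is imported.
# "0" shows everything, "1" hides INFO, "2" also hides WARNING, "3" also hides ERROR.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

# The Keras imports from the tutorial would follow here, for example:
# from keras.models import Sequential
```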

I tried to train this model on my laptop and it works fine. But when I tried to train it on Google Cloud with the same instructions as in your example-5, it fails.
Can you let me know which changes are required for the model so that I can train it on the cloud?

Hi Jason, thanks for the great tutorials. I just learnt and repeated the program in your “Your First Machine Learning Project in Python Step-By-Step” without problem. Now I am trying this one and getting stuck at the line “model = Sequential()”, where the Interactive window throws: NameError: name 'Sequential' is not defined. I tried to Google it but can’t find a solution. I did import Sequential from keras.models as in your example code, copied and pasted as it is. Thanks in advance for your help.

I want to use your code to predict the classification (1 or 0) of unknown samples. Should I create one common CSV file containing the training (known) as well as the test (unknown) data? The ‘classification’ column for the known data will have a known value, 1 or 0; for the unknown data, should I leave the column empty (and let the code decide the outcome)?

This is really cool! I am blown away! Thanks so much for making it so simple for a beginner to have some hands on. I have a couple questions:

1) where are the weights, can I save and/or retrieve them?

2) if I want to train images with dogs and cats and later ask the neural network whether a new image has a cat or a dog, how do I get my input image to pass as an array and my output result to be “cat” or “dog”?

Do you have or else could you recommend a beginner’s level image segmentation approach that uses deep learning? For example, I want to train some neural net to automatically “find” a particular feature out of an image.

I just started my DL training a few weeks ago. According to what I learned in the course, in order to train the parameters for the NN, we need to run forward and backward propagation; however, looking at your Keras example, I don’t find any of these propagation processes. Does it mean that Keras has its own mechanism to find the parameters instead of using forward and backward propagation?
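
Keras does not replace forward and backward propagation; calling `model.fit()` runs those steps through the backend so they never appear in the script. As a rough illustration of what one such training step involves, here is a NumPy sketch for a single sigmoid unit on made-up data. It is a simplified stand-in and not Keras's actual implementation; the function names, learning rate, and data are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(X, y, w, b, lr=0.1):
    """One forward pass, one backward pass, one weight update."""
    # Forward propagation: inputs -> predicted probabilities -> loss.
    p = sigmoid(X @ w + b)
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    # Backward propagation: gradient of the loss for each parameter.
    grad_z = (p - y) / len(y)
    grad_w = X.T @ grad_z
    grad_b = grad_z.sum()
    # Gradient descent update.
    return w - lr * grad_w, b - lr * grad_b, loss

rng = np.random.RandomState(7)
X = rng.rand(20, 3)                       # made-up inputs
y = (X.sum(axis=1) > 1.5).astype(float)   # made-up 0/1 labels
w, b = np.zeros(3), 0.0

losses = []
for _ in range(50):
    w, b, loss = train_step(X, y, w, b)
    losses.append(loss)

print(losses[0], losses[-1])
```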

Hi Jason, thanks for your short tutorial, helps a lot to actually get your hands dirty with a simple example.
I have tried 5 different parameters to see what would happen and got some interesting results. Unfortunately, I didn’t record the running time.

Hello,
I have a somewhat general question.
I have to do forecasting for restaurant sales (meaning that I have to predict 4 meals based on historical daily sales data), weather conditions (such as temperature, rain, etc.), official holidays and in/off season. I have to perform that forecasting using neural networks.
Unfortunately I am not very skilled in Python. On my computer I have Python 2.7 and I have installed Anaconda. I am trying to learn by practising with your code, Mr. Brownlee, but somehow I cannot run the code at all (in Spyder). Can you tell me which versions of Python and Anaconda I have to install on my computer, and in which environment (JupyterLab, Notebook, Qt Console, Spyder, etc.) I can run the code so that it works and does not give errors from the very beginning?
I will be very thankful for your response
KG
Tanya

I looked over the tutorial and I had a question regarding reading the data from a binary file. For instance, I am working on solving the sliding tile n-puzzle using neural networks, but I am having trouble loading my data, which is in a binary file and gives the number of moves required to solve the n-puzzle. I am not sure if you have dealt with this before, but any help would be appreciated.
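
How to read such a file depends entirely on its byte layout, which is not given here. As a sketch under a stated assumption, suppose each record is ten unsigned 8-bit integers: nine tile values followed by the move count. NumPy's `fromfile()` can then read the raw bytes, and a reshape gives one row per puzzle. The record layout, file name, and sample records are all hypothetical.

```python
import os
import tempfile
import numpy as np

# Hypothetical layout: 9 tile values then the move count, each an unsigned byte.
RECORD_LEN = 10

# Write a tiny example file so the sketch is self-contained.
records = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 0, 0],
                    [1, 2, 3, 4, 5, 6, 7, 0, 8, 1]], dtype=np.uint8)
path = os.path.join(tempfile.mkdtemp(), "puzzles.bin")
records.tofile(path)  # raw bytes, no header

raw = np.fromfile(path, dtype=np.uint8)  # flat array of bytes
data = raw.reshape(-1, RECORD_LEN)       # one row per puzzle
X = data[:, :9].astype(float)            # tile positions (inputs)
y = data[:, 9].astype(float)             # number of moves (output)

print(X.shape, y)
```

If the real file uses a different element type or has a header, the `dtype` and the slicing would need to change to match.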

Thanks for sharing such nice tutorials; they helped me a lot. I want to print the confusion matrix for the above example. One more question:
if I have
20 input variables
1 class label (binary)
and 400 instances,
how would I know how to set the Dense layer parameters for the first layer, hidden layer and output layer, like the 12, 8, 1 you used in the above example?
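
For the confusion matrix part, one option is to threshold the predicted probabilities at 0.5 and count the four outcomes. The sketch below does this in plain NumPy on made-up values; in practice `y_prob` would come from the trained model's predictions. The function name and the sample numbers are assumptions for illustration.

```python
import numpy as np

def confusion_matrix_binary(y_true, y_prob, threshold=0.5):
    """Rows are actual classes (0, 1); columns are predicted classes (0, 1)."""
    y_true = np.asarray(y_true).astype(int).ravel()
    y_pred = (np.asarray(y_prob).ravel() >= threshold).astype(int)
    matrix = np.zeros((2, 2), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual, predicted] += 1
    return matrix

y_true = [0, 0, 1, 1, 1]            # made-up labels
y_prob = [0.2, 0.7, 0.9, 0.4, 0.6]  # made-up predicted probabilities
cm = confusion_matrix_binary(y_true, y_prob)
print(cm)
```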

Hello Mr.Janson
After installing Anaconda and the deep learning libraries, I read your free mini-course and tried to write the code for handwritten digit recognition.
I wrote the code in a Jupyter notebook; is that right?
If not, where should I write the code?
And if I want to use another dataset (my own dataset), how can I use it in the code?
And how can I see the result, for example the accuracy percentage?
I am really sorry for my simple questions! I have written a lot of code in “Matlab” but I am really a beginner in Python and Anaconda, and my teacher requires me to use Python and Keras for my project.

Hello,
please tell me how I can find out whether TensorFlow and Keras are correctly installed on my system.
Maybe that is the problem, because no code runs in my Jupyter notebook and no “import” works (for example, import pandas).
Thank you
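
A quick way to see which libraries the current Python environment can find is to ask the import system directly, without actually importing them. The sketch below uses only the standard library; the helper name and the list of modules are just examples. It should be run inside the same Jupyter kernel that is giving trouble, since a notebook can be attached to a different environment from the one where the packages were installed.

```python
import importlib.util

def check_installed(names):
    """Return {module_name: True/False} without importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

status = check_installed(["numpy", "pandas", "tensorflow", "keras"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```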

Hi. I’m totally new to machine learning and I’m trying to wrap my head around it.
I have a problem I can’t quite solve yet. And don’t know where to start actually.
I have a dictionary with a few key:value pairs. The key is a random 4-digit number from 0000 to 9999. The value for each key is set as follows: if a digit in the number is 0, 6 or 9 then its weight is 1; if a digit is 8 then its weight is 2; any other digit has a weight of 0. All the weights are then summed to give the value for the key (example: {'0000': 4, '1234': 0, '1692': 2, '8800': 6} – and so on).

Now I’m trying to build a model that will predict the correct value of a given key (i.e. if I give it 2222 the answer is 0, if I give it 9011 it’s 2). What I did first was create a CSV file with 5 columns: the first four are the key from my dictionary split into single digits, and the fifth column is the value for each key. Next I created a dataset and defined a model (like this tutorial but with input_dim=4). Now when I train the model the accuracy won’t go higher than ~30%. Also, your model is based on binary output, whereas mine should output an integer from 0 to 8. Where do I go from here?
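
One possible reason for the low accuracy is that the value depends on which digit appears and not on how large it is, so feeding the digits in as plain numbers gives the network little to work with. A common alternative is to one-hot encode each digit, and to treat the 0 to 8 output either as nine classes or as a single regression value instead of a binary sigmoid output. The sketch below only builds such a dataset in plain Python; the helper names and the choice of encoding are suggestions, not part of the tutorial.

```python
# Weights per digit as described in the question; all other digits weigh 0.
DIGIT_WEIGHTS = {"0": 1, "6": 1, "9": 1, "8": 2}

def key_value(key):
    """Sum of the per-digit weights for a 4-digit key string."""
    return sum(DIGIT_WEIGHTS.get(digit, 0) for digit in key)

def one_hot_digits(key):
    """Encode a 4-digit key as 40 zeros/ones (10 slots per digit position)."""
    row = [0] * 40
    for position, digit in enumerate(key):
        row[position * 10 + int(digit)] = 1
    return row

keys = [f"{i:04d}" for i in range(10000)]
X = [one_hot_digits(k) for k in keys]  # 10000 rows of 40 inputs
y = [key_value(k) for k in keys]       # integer targets from 0 to 8

print(key_value("9011"), len(X[0]), min(y), max(y))
```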

I guess the number at the end indicates whether the person has diabetes (1) or does not (0), but what I don’t understand is how I know the ‘prediction’ is about that 0 or 1. There are a lot of other variables in the data, and I don’t see ‘diabetes’ being a label for any of them.

So, how do I know, or how do I set, which variable (number) I want to predict?
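
The CSV file has no header row, so nothing in the file names the label. The target is chosen purely by which column is sliced into `Y`: the tutorial uses `X = dataset[:,0:8]` for the inputs and `Y = dataset[:,8]` for the output, and the model is then fitted to predict `Y` from `X`. A small sketch using two illustrative rows in the same nine-column layout, loaded from a string instead of the real file:

```python
import io
import numpy as np

# Two illustrative rows: eight input columns followed by the class column.
sample = io.StringIO(
    "6,148,72,35,0,33.6,0.627,50,1\n"
    "1,85,66,29,0,26.6,0.351,31,0\n"
)
dataset = np.loadtxt(sample, delimiter=",")

X = dataset[:, 0:8]  # inputs: columns 0 to 7
Y = dataset[:, 8]    # output to predict: the last column

print(X.shape, Y)
```

To predict a different variable, a different column would be sliced into `Y` and the remaining columns into `X`.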