Data Science Stack Exchange is a question and answer site for Data science professionals, Machine Learning specialists, and those interested in learning more about the field.

$\begingroup$Typically we experiment, using our intuition; consider it a hyperparameter. There are ways of learning the architecture but I don't know how practical they are: blog.acolyer.org/2017/05/10/…$\endgroup$
– Emre, Jul 6 '17 at 19:12


$\begingroup$I looked for a duplicate to this, because I am sure it has cropped up many times before on this site. However, I could not find a pure version that wasn't attached to some dataset or problem. Maybe this could be the generic question we point others to? Sadly there isn't a great "how to" answer to be had in general, but it's a common question when faced with so much choice.$\endgroup$
– Neil Slater, Jul 6 '17 at 19:23

$\begingroup$This is a very interesting question to answer (researchers have started working on it): what would be the optimal architecture for dataset A versus dataset B? Please read the paper below, which tries to answer your question. Welcome to the world of Neural Architecture Search (NAS): arxiv.org/abs/1611.01578$\endgroup$
– iDeepVision, Mar 17 '19 at 0:28

1 Answer

Sadly there is no generic way to determine a priori the best number of neurons and number of layers for a neural network, given just a problem description. There isn't even much guidance to be had for determining good values to try as a starting point.

The most common approach seems to be to start with a rough guess based on prior experience with networks used on similar problems. This could be your own experience, or second/third-hand experience you have picked up from a training course, blog or research paper. Then try some variations, and check the performance carefully before picking the best one.
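The "guess, then try variations" loop above can be automated as a small hyperparameter search. A minimal sketch, assuming scikit-learn's `MLPClassifier` and a toy synthetic dataset (the candidate layer sizes are illustrative, not recommendations):

```python
# Sketch: cross-validated search over a few candidate architectures.
# Assumes scikit-learn; the dataset is a synthetic stand-in for your problem.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Toy data standing in for a real problem.
X, y = make_classification(n_samples=400, n_features=20, random_state=0)

# Candidate widths/depths drawn from a rough initial guess.
param_grid = {"hidden_layer_sizes": [(16,), (64,), (16, 16), (64, 64)]}

search = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0),
    param_grid,
    cv=3,  # cross-validation so one lucky split doesn't decide the winner
)
search.fit(X, y)
print(search.best_params_["hidden_layer_sizes"], round(search.best_score_, 3))
```

For larger grids, a randomized search (`RandomizedSearchCV`) usually covers the space more cheaply than an exhaustive grid.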

The size and depth of neural networks interact with other hyperparameters too, so changing one thing elsewhere can affect where the best values are. So it is not possible to isolate a "best" size and depth for a network and then continue to tune other parameters in isolation. For instance, a very deep network may work efficiently with the ReLU activation function, but not so well with sigmoid - if you found the best size/shape of network and then tried an experiment with varying activation functions, you may come to the wrong conclusion about what works best.
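Because of this interaction, it is safer to search depth and the other hyperparameter jointly rather than fixing one first. A hedged sketch (same scikit-learn assumption as above), searching depth and activation together:

```python
# Sketch: search architecture and activation jointly, since they interact.
# Assumes scikit-learn; data is a synthetic stand-in.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=20, random_state=0)

param_grid = {
    "hidden_layer_sizes": [(32,), (32, 32, 32)],   # shallow vs deeper
    "activation": ["relu", "logistic"],            # 'logistic' is sklearn's sigmoid
}
search = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0), param_grid, cv=3
)
search.fit(X, y)
print(search.best_params_)
```

Tuning the two in one grid means the winning depth is the one that works with its own best activation, not with whichever activation happened to be fixed first.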

You may sometimes read about "rules of thumb" that researchers use when starting a neural network design from scratch. These things might work for your problems or not, but they at least have the advantage of making a start on the problem. The variations I have seen are:

Create a network with hidden layers of a similar size order to the input, and all the same size, on the grounds that there is no particular reason to vary the size (unless perhaps you are creating an autoencoder).

Start simple and build up complexity to see what improves a simple network.

Try varying depths of network if you expect the output to be explained well by the input data, but with a complex relationship (as opposed to just inherently noisy).

If you read these or anything like them in any text, then take them with a pinch of salt. However, at worst they help you get past the blank page effect, and write some kind of network, and get you to start the testing and refinement process.
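The "start simple and build up" rule of thumb above can be sketched as a loop that adds depth only while it keeps paying off. This is an illustrative sketch assuming scikit-learn and synthetic data; the width, depth cap, and improvement threshold are all arbitrary choices:

```python
# Sketch: grow network depth until cross-validated score stops improving.
# Assumes scikit-learn; width 32 and the 0.005 threshold are arbitrary.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=20, random_state=0)

best_score, best_depth = 0.0, 0
for depth in range(1, 5):  # try 1 to 4 hidden layers of width 32
    model = MLPClassifier(
        hidden_layer_sizes=(32,) * depth, max_iter=500, random_state=0
    )
    score = cross_val_score(model, X, y, cv=3).mean()
    if score <= best_score + 0.005:  # extra depth stopped paying off
        break
    best_score, best_depth = score, depth
print(best_depth, round(best_score, 3))
```

The greedy stopping rule is crude - a real search would also vary width and regularization - but it gets you past the blank page in exactly the way the answer describes.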

As an aside, try not to get too lost in tuning a neural network when some other approach might be better and save you lots of time. Do consider and use other machine learning and data science approaches. Explore the data, maybe make some plots. Try some simple linear approaches first to get benchmarks to beat: linear regression, logistic regression or softmax regression, depending on your problem. Consider using a different ML algorithm than NNs - decision-tree-based approaches such as XGBoost can be faster and more effective than deep learning on many problems.
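Getting a simple benchmark to beat can be a few lines. A sketch, again assuming scikit-learn, comparing a logistic regression baseline against a small network on the same folds:

```python
# Sketch: establish a linear baseline before investing in NN tuning.
# Assumes scikit-learn; data is a synthetic stand-in for your problem.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=20, random_state=0)

baseline = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=3).mean()
nn = cross_val_score(
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
    X, y, cv=3,
).mean()
print(round(baseline, 3), round(nn, 3))
```

If the network cannot clearly beat the linear baseline, the time is probably better spent on features or on a different algorithm than on deeper architecture tuning.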

$\begingroup$It's a great explanation. Thanks. I also wonder if there is a good way to decide which ML approach to use? You mentioned that there might be a better way than a neural network, but how do we determine that easily?$\endgroup$
– user7677413, Jul 7 '17 at 7:05

$\begingroup$@user7677413: The same thing applies. You have to try and see, although experience may give you a guide on familiar problems.$\endgroup$
– Neil Slater, Jul 7 '17 at 7:07

$\begingroup$Neural networks are rarely necessary. However, they are better at some problems. They excel at signal processing tasks such as audio and image recognition, and also have capacity to learn subtle differences from large amounts of data where simpler algorithms may reach a limit. However, whether a NN is the right tool for whatever problem you face on a particular day, no-one can predict.$\endgroup$
– Neil Slater, Jul 7 '17 at 7:32


$\begingroup$@user7677413 I think you're making the assumption that there isn't 40 years of deep and insightful machine learning research. It sounds like you're just scratching the surface. I recommend finding a textbook and seeing how it all ties together, that would help build your intuition for the many machine learning algorithms.$\endgroup$
– Alex L, Mar 17 '19 at 3:32