Data Science Stack Exchange is a question and answer site for data science professionals, machine learning specialists, and those interested in learning more about the field.

1 Answer

It depends on the type of input pattern, but as a general rule I suggest not applying dropout to the input. There are several reasons for that. First of all, you are damaging your input signal. If you are familiar with information theory, think of it in terms of the signal-to-noise ratio: zeroing out input values makes that ratio very small, and you are left with a signal that is far from your real signal.
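To make the signal-to-noise argument concrete, here is a minimal NumPy sketch. It assumes a synthetic random array as a stand-in for a real image, and the helper names `input_dropout` and `snr_db` are hypothetical, chosen only for this illustration. The dropped pixels are treated as the noise that dropout adds to the clean signal.

```python
import numpy as np

def input_dropout(x, rate, rng):
    """Zero each input value independently with probability `rate` (no rescaling)."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask

def snr_db(clean, corrupted):
    """Signal-to-noise ratio in decibels, treating (corrupted - clean) as noise."""
    noise = corrupted - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

rng = np.random.default_rng(0)
# Synthetic stand-in for a 224 x 224 grayscale image (assumption, not real data).
image = rng.uniform(0.2, 1.0, size=(224, 224))

dropped = input_dropout(image, rate=0.5, rng=rng)
snr = snr_db(image, dropped)
print(f"SNR after dropping half of the input: {snr:.2f} dB")
```

With a rate of 0.5, roughly half of the signal energy is removed, so the ratio of signal energy to noise energy is about 2, which is about 3 dB.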

Moreover, it is also not a good idea to add dropout to the convolutional layers. They are feature extractors, and the features they produce are significant for classification problems. Dropping them means losing more information than usual. Keep in mind that the input to the network has already been resized to a smaller shape than its original shape: the input shape of a typical CNN is 224 × 224, while the original image may be ten times bigger or more in each direction. At ten times per side, the network already sees only about 1% of the original pixels.

You may have seen that the AlexNet authors used a data-augmentation technique that alters the intensities of the input's color channels by small random amounts. The point there is that they did not change the locality of the input signal. Moreover, the signal-to-noise ratio does not become too small, because they did not set any input features to zero. They only changed them slightly.
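The contrast with a mild color perturbation can be sketched the same way. The code below is a simplified stand-in, not the augmentation from any particular paper: it assumes one small uniform offset per RGB channel, a synthetic image, and hypothetical helper names (`color_jitter`, `snr_db`).

```python
import numpy as np

def color_jitter(image, max_shift, rng):
    """Add one small random offset per RGB channel; no pixel is moved or zeroed."""
    shift = rng.uniform(-max_shift, max_shift, size=(1, 1, 3))
    return image + shift

def snr_db(clean, corrupted):
    """Signal-to-noise ratio in decibels, treating (corrupted - clean) as noise."""
    noise = corrupted - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

rng = np.random.default_rng(0)
# Synthetic stand-in for a 224 x 224 RGB image (assumption, not real data).
image = rng.uniform(0.2, 1.0, size=(224, 224, 3))

jittered = color_jitter(image, max_shift=0.05, rng=rng)
jitter_snr = snr_db(image, jittered)
print(f"SNR after a small per-channel shift: {jitter_snr:.2f} dB")
```

Because every pixel keeps its position and is only shifted slightly, the SNR stays far higher than it would if a large fraction of the inputs were set to zero.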

Finally, the last layer should not employ dropout, because the output has to satisfy specific constraints: for a softmax classifier, the outputs must sum to one. Dense layers, which have a large number of weights and consequently a large number of activations, are good places to apply dropout.
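The placement advice can be illustrated with a small NumPy sketch. It assumes a toy two-layer network with random weights, and the names `dropout`, `softmax`, and `forward` are hypothetical helpers written for this example. Dropout is applied to the dense hidden layer only, and the last lines show what goes wrong if it is applied to the softmax output instead.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max subtraction for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dropout(x, rate, rng):
    """Inverted dropout: zero with probability `rate`, rescale survivors by 1 / (1 - rate)."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

def forward(x, w_hidden, w_out, rate, rng, training=True):
    hidden = np.maximum(x @ w_hidden, 0.0)   # dense layer followed by ReLU
    if training:
        hidden = dropout(hidden, rate, rng)  # dropout on the dense hidden layer
    return softmax(hidden @ w_out)           # no dropout on the output layer

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                  # 4 toy samples with 8 features each
w_hidden = rng.normal(size=(8, 16))
w_out = rng.normal(size=(16, 3))

probs = forward(x, w_hidden, w_out, rate=0.5, rng=rng)
print(probs.sum(axis=-1))                    # each row sums to one

# Applying dropout to the output breaks the sum-to-one property.
dropped_probs = dropout(probs, rate=0.5, rng=rng)
print(dropped_probs.sum(axis=-1))
```

With dropout confined to the hidden layer, every output row is still a valid probability distribution. Once dropout is applied after the softmax, the rows no longer sum to one.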