Is there a rule of thumb for choosing the following when designing a neural network or an autoencoder?

(a) Number of hidden neurons

(b) Number of hidden layers

(c) More generally, before applying a machine learning algorithm, is there a statistical method for choosing how many features to use, or which features are most relevant? Intuitively, a feature with high entropy carries a lot of information, so it should be selected. However, I do not know how to calculate the entropy of a single continuous-valued feature. Are there other ways to determine which features are most relevant out of a pool of several?
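
To make part (c) concrete, here is a minimal sketch of the kind of calculation I have in mind. It assumes a simple histogram (binning) estimator of differential entropy. The function name `histogram_entropy`, the bin count of 30, and the synthetic Gaussian data are illustrative choices, not part of any library. The estimate depends on the scale of the feature: multiplying a feature by 5 adds log(5) to its entropy without making it any more informative.

```python
import math
import random


def histogram_entropy(values, bins=30):
    """Estimate the differential entropy (in nats) of a 1-D sample.

    Assumption: equal-width histogram bins; this is one simple estimator
    among several and is sensitive to the choice of `bins`.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant feature: entropy estimate is undefined")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that the maximum value falls in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    # Entropy of the binned (discrete) distribution ...
    discrete = -sum((c / n) * math.log(c / n) for c in counts if c > 0)
    # ... plus log(bin width) to convert it to a density-based estimate.
    return discrete + math.log(width)


# Synthetic example data (hypothetical features, for illustration only).
rng = random.Random(0)
feature = [rng.gauss(0.0, 1.0) for _ in range(5000)]
rescaled = [5.0 * v for v in feature]  # same feature in different units

h_feature = histogram_entropy(feature)
h_rescaled = histogram_entropy(rescaled)
print(h_feature, h_rescaled, h_rescaled - h_feature)
```

For a standard normal feature the exact value is 0.5 * log(2 * pi * e), roughly 1.42 nats, so the estimate can be checked against it. The scale dependence is part of why I suspect entropy of a single feature, taken on its own, may not be the right relevance measure.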