Input: it is not common to use activation functions in the input layer. Just rescale your data to the $[-1, 1]$ interval (a sketch follows below);
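
As a reference, here is a minimal NumPy sketch of this kind of min-max rescaling (the function name is my own choice, not a standard API):

```python
import numpy as np

def rescale_to_minus_one_one(x):
    """Min-max rescale each feature (column) to the [-1, 1] interval."""
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

# Example: three samples, two features on different scales
X = np.array([[0.0, 10.0],
              [5.0, 20.0],
              [10.0, 30.0]])
print(rescale_to_minus_one_one(X))
# [[-1. -1.]
#  [ 0.  0.]
#  [ 1.  1.]]
```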

Hidden: the $\tanh$ activation used to be the most popular. This role has now been taken by the $\mathrm{relu}$ (rectified linear unit) activation, which produces sparse activations and better preserves gradients (see the sketch after this item);
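
A small sketch to illustrate the contrast: $\tanh$ saturates toward $-1$ and $1$ for large inputs (where its gradient vanishes), while $\mathrm{relu}$ simply zeroes out negative inputs, which yields sparse activations and a gradient of exactly 1 on the positive side:

```python
import numpy as np

def relu(z):
    # Zero for negative inputs, identity otherwise -> sparse activations
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(np.tanh(z))  # saturates toward -1/1 as |z| grows
print(relu(z))     # [0.  0.  0.  0.5 2. ]
```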

Output: here is where you need to choose the activation function according to your problem type. For classification, use the softmax activation (the multiclass generalization of the logistic sigmoid). For regression problems, you may use linear outputs (the identity activation function).
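
For concreteness, a minimal sketch of both output choices (the max-subtraction in softmax is a standard numerical-stability trick; it does not change the result):

```python
import numpy as np

def softmax(z):
    # Shift by the max before exponentiating for numerical stability;
    # the output is a probability distribution (non-negative, sums to 1)
    e = np.exp(z - np.max(z))
    return e / e.sum()

def identity(z):
    # Linear output for regression: pass the pre-activation through unchanged
    return z

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))   # ~[0.659 0.242 0.099], class probabilities
print(identity(logits))  # [2.  1.  0.1], raw values for regression
```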