It has three major steps:
1. Network training
2. Network pruning
3. Rule extraction

I'm still on the first step, which is a kind of backpropagation. The journal explains that two activation functions are used: the log-sigmoid and the hyperbolic tangent (tanh) sigmoid. They confused me at first, because they produce continuous values, so the output isn't clearly 0 or 1. After reading more about activation functions, I found that training with them requires their derivatives (for the backpropagation weight updates).
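To illustrate where the derivatives come in, here is a minimal sketch of one backpropagation update on a single neuron using NumPy. The functions, weights, and learning rate are my own toy assumptions, not values from the journal:

```python
import numpy as np

# Log-sigmoid squashes inputs into (0, 1); tanh squashes into (-1, 1).
def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))

def logsig_deriv(x):
    s = logsig(x)
    return s * (1.0 - s)  # derivative of the log-sigmoid

def tanh_deriv(x):
    return 1.0 - np.tanh(x) ** 2  # derivative of tanh

# Toy single-neuron update (assumed example, not the paper's setup):
x = np.array([0.5, -0.3])   # inputs
w = np.array([0.1, 0.2])    # weights
target = 1.0                # desired output
lr = 0.5                    # learning rate

net = w @ x
out = logsig(net)           # a continuous value, not a clean 0 or 1

# The error signal is scaled by the derivative -- this is why
# backpropagation needs the derivative of the activation function.
delta = (target - out) * logsig_deriv(net)
w = w + lr * delta * x      # gradient-descent weight update
```

This also shows why the outputs "aren't really clear to be 0 or 1": the sigmoid only approaches those values asymptotically, so in practice you threshold the output (e.g. treat `out >= 0.5` as 1) to get a binary decision.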

But the journal gives no explanation of this. Should I add the derivative step or not, and does adding it have any explicit impact?