This initializer is designed to keep the scale of gradients roughly the same
in all layers.

By default, rnd_type is 'uniform' and factor_type is 'avg'. In this case
the initializer fills the weights with random numbers in the range
\([-c, c]\), where \(c = \sqrt{\frac{3.}{0.5 * (n_{in} + n_{out})}}\).
\(n_{in}\) is the number of neurons feeding into the weights, and \(n_{out}\) is
the number of neurons the result is fed to.

If rnd_type is 'uniform' and factor_type is 'in',
then \(c = \sqrt{\frac{3.}{n_{in}}}\).
Similarly, when factor_type is 'out', \(c = \sqrt{\frac{3.}{n_{out}}}\).
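
As an illustration, take a hypothetical layer with \(n_{in} = 256\) and
\(n_{out} = 128\) (sizes chosen only to make the arithmetic easy). The three
factor types then give:

\[
\begin{aligned}
\text{'avg'}: \quad c &= \sqrt{\frac{3}{0.5 \cdot (256 + 128)}} = \sqrt{\frac{3}{192}} = 0.125 \\
\text{'in'}: \quad c &= \sqrt{\frac{3}{256}} \approx 0.1083 \\
\text{'out'}: \quad c &= \sqrt{\frac{3}{128}} \approx 0.1531
\end{aligned}
\]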

If rnd_type is 'gaussian' and factor_type is 'avg',
the initializer fills the weights with random numbers drawn from a normal
distribution with a standard deviation of
\(\sqrt{\frac{3.}{0.5 * (n_{in} + n_{out})}}\).
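
The rules above can be sketched in plain Python. This is a minimal
illustration and not the library's implementation: the helper names
`xavier_scale` and `xavier_fill` are hypothetical, the weight is assumed to be
a 2-D matrix of shape `(n_out, n_in)`, and the gaussian case is assumed to be
centered at zero.

```python
import math
import random


def xavier_scale(n_in, n_out, factor_type="avg"):
    """Return c for the given fan-in and fan-out (hypothetical helper)."""
    if factor_type == "avg":
        factor = 0.5 * (n_in + n_out)
    elif factor_type == "in":
        factor = n_in
    elif factor_type == "out":
        factor = n_out
    else:
        raise ValueError("factor_type must be 'avg', 'in' or 'out'")
    return math.sqrt(3.0 / factor)


def xavier_fill(n_out, n_in, rnd_type="uniform", factor_type="avg", seed=None):
    """Build an n_out x n_in weight matrix as a list of rows.

    Assumption: rows index output neurons and columns index input neurons.
    """
    c = xavier_scale(n_in, n_out, factor_type)
    rng = random.Random(seed)
    if rnd_type == "uniform":
        # Uniform samples in the range [-c, c].
        draw = lambda: rng.uniform(-c, c)
    elif rnd_type == "gaussian":
        # Assumption: zero mean; c is used as the standard deviation.
        draw = lambda: rng.gauss(0.0, c)
    else:
        raise ValueError("rnd_type must be 'uniform' or 'gaussian'")
    return [[draw() for _ in range(n_in)] for _ in range(n_out)]


if __name__ == "__main__":
    weights = xavier_fill(128, 256, rnd_type="uniform", factor_type="avg", seed=0)
    largest = max(abs(w) for row in weights for w in row)
    print("c =", xavier_scale(256, 128), "largest |w| =", largest)
```

Note that in the uniform case \(c\) is the half-width of the range, whereas in
the gaussian case the same value is used as the standard deviation, so the two
settings do not produce weights with the same variance.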