Abstract

Product units provide a method of automatically learning the
higher-order input combinations required for the efficient synthesis of
Boolean logic functions by neural networks. Product units also have a
higher information capacity than sigmoidal networks. However, this
activation function has not received much attention in the literature. A
possible reason for this is that one encounters some problems when
using standard backpropagation to train networks containing these
units. This report examines these problems, and evaluates the
performance of three training algorithms on networks of this
type. Empirical results indicate that the error surface of networks
containing product units has more local minima than that of corresponding
networks with summation units. For this reason, a combination of local
and global training algorithms was found to provide the most reliable
convergence.
We then investigate how `hints' can be added to the training algorithm.
By extracting a common frequency from the input weights,
and training this frequency separately, we show that convergence can
be accelerated.
A constructive algorithm is then introduced which adds product units
to a network as required by the problem. Simulations show that
for the same problems this method creates a network with significantly
fewer neurons than those constructed by the tiling and upstart algorithms.
In order to compare their performance with other transfer functions,
product units were implemented as candidate units in the Cascade
Correlation (CC) \cite{Fahlman90} system. Using these candidate units
resulted in smaller networks that trained faster than when any of
the standard (three sigmoidal types and one Gaussian) transfer
functions were used. This superiority was confirmed when a pool of
candidate units with four different nonlinear activation functions was
used, in which the units compete for addition to the network. Extensive
simulations showed that for the problem of implementing random Boolean
logic functions, product units are always chosen over any of
the other transfer functions.
(Also cross-referenced as UMIACS-TR-95-80)
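
As an illustration of why product units suit Boolean logic synthesis, the following minimal sketch assumes the usual definition of a product unit, y = prod_i x_i^(w_i), in contrast to a summation unit, which computes sum_i w_i x_i. The function names, the bipolar input encoding (0 -> +1, 1 -> -1), and the choice of unit weights are illustrative assumptions and are not taken from the report. Under these assumptions a single product unit computes the parity of its inputs, a function that a single summation unit cannot represent.

```python
import math
from itertools import product as all_combinations


def product_unit(inputs, weights):
    """Product unit: multiply the inputs, each raised to its weight.

    Integer weights are assumed here so that negative inputs stay real.
    """
    return math.prod(x ** w for x, w in zip(inputs, weights))


def summation_unit(inputs, weights):
    """Summation unit (before any squashing function): weighted sum."""
    return sum(x * w for x, w in zip(inputs, weights))


def parity_via_product_unit(bits):
    """Parity of a tuple of 0/1 bits using one product unit.

    Assumed encoding: bit 0 -> +1, bit 1 -> -1. With all weights equal
    to 1 the unit outputs (-1) ** (number of ones), so an output of -1
    means odd parity.
    """
    bipolar = [1 - 2 * b for b in bits]
    weights = [1] * len(bits)
    return 1 if product_unit(bipolar, weights) == -1 else 0


if __name__ == "__main__":
    # Print the truth table for 3-input parity.
    for bits in all_combinations([0, 1], repeat=3):
        print(bits, parity_via_product_unit(bits))
```

The exponents play the role of weights, so setting a weight to zero removes the corresponding input from the product; this is one way to read the claim that product units learn which higher-order input combinations are needed.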