Chen et al. presented a system which uses a SOM to cluster states.
After learning, the SOM units are extended with a histogram keeping the number of times the unit was BMU and the input belonged to each of a number of known states $$C={c_1,c_2,\dots,c_n}$$.

The system is used in robot soccer.
Each class is connected to an action.
Actions are chosen by finding the BMU in the net and selecting the action connected to its most likely class.

In an unsupervised, online phase, these histograms are updated in a reinforcement-learning fashion: whenever the action selected lead to success, the bin in the BMU's histogram which was the most likely class is increased.
It is decreased otherwise.⇒