13
Experiments: Affect as additional reinforcement
Three social settings
– Moderate social reinforcement (setting a)
   r_human is small
   Long period of training with r_human (trials 20-30)
– Strong social reinforcement (setting b)
   r_human is large
   Short period of training with r_human (trials 20-25)
– Learned social reinforcement (setting c)
   r_human is used as above and also to train R_social(s) (an MLP)
   Period using r_human is between trials 29 and 45
   After that, R_social(s) is used
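The three settings can be read as a reward schedule over trials. The following is a minimal sketch of that reading, not the authors' implementation. The names `SocialSetting`, `SETTINGS` and `total_reward` are hypothetical, the numeric scales 0.25 and 1.0 stand in for "small" and "large" (the slide gives no values), the trial windows are assumed to be inclusive, and setting c is assumed to use the large scale. Training the MLP for R_social(s) is left out; any callable can be passed in its place.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class SocialSetting:
    first_trial: int   # first trial in which r_human is applied (assumed inclusive)
    last_trial: int    # last trial in which r_human is applied (assumed inclusive)
    scale: float       # hypothetical magnitude of r_human; not given on the slide
    learn_model: bool  # True if R_social(s) replaces r_human after the window


# Trial windows come from the slide; the scales are placeholder assumptions.
SETTINGS = {
    "a": SocialSetting(20, 30, 0.25, False),  # moderate: small r_human, long window
    "b": SocialSetting(20, 25, 1.0, False),   # strong: large r_human, short window
    "c": SocialSetting(29, 45, 1.0, True),    # learned: R_social(s) takes over later
}


def total_reward(
    setting: str,
    trial: int,
    r_env: float,
    human_signal: float,
    r_social: Optional[Callable[[object], float]] = None,
    state: object = None,
) -> float:
    """Return the environment reward plus whichever social term applies."""
    s = SETTINGS[setting]
    if s.first_trial <= trial <= s.last_trial:
        # Inside the window: the human's affective signal is added.
        return r_env + s.scale * human_signal
    if s.learn_model and trial > s.last_trial and r_social is not None:
        # After the window (setting c only): the learned predictor is added.
        return r_env + r_social(state)
    # Outside the window: plain environment reward.
    return r_env
```

For example, `total_reward("a", 25, 1.0, 1.0)` adds the scaled human signal because trial 25 lies inside setting a's window, while `total_reward("a", 31, 1.0, 1.0)` returns only the environment reward.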

17
Conclusion
A critical learning period can be used to influence robot learning with affective signals, in real time, in a non-trivial learning environment.
This benefits learning, most clearly when the robot learns to predict the social feedback by training a reward function R_social(s).

18
Further work
Use affect/emotion as a metaparameter to control
– Learning rate
– Exploration/exploitation
Differentiate between the meanings of negative and positive emotions
– Anger: negative feedback due to an action of the agent
– Fear: negative anticipatory feedback
– Surprise: strong positive feedback due to an action of the agent
– Frustration: connect to exploration/exploitation rate?
Affective robot-robot interaction?
Use robot-to-human signals such as hesitation