Forward models play a key role in cognitive agents by providing predictions of the sensory consequences of motor commands, also known as sensorimotor contingencies (SMCs). In continuously evolving environments, the ability to anticipate is fundamental in distinguishing cognitive from reactive agents, and it is particularly relevant for autonomous robots, which must be able to adapt their models online. Online learning, highly accurate forward models and multiple-step-ahead predictions are needed to enhance the robots' anticipation capabilities. We propose an online heterogeneous ensemble learning method for building accurate forward models of SMCs that relate motor commands to their effects in the robots' sensorimotor system. We demonstrate that training a heterogeneous ensemble provides better performance than the alternatives, and validate the method on two different robots: the iCub and the Baxter.
From motor babbling, robots can explore their sensorimotor contingencies and learn the mappings between the commands issued to the joint motors and their effects on sensorimotor perception, in particular proprioception and vision. To identify parts of the arms in the visual frames, we use a segmentation algorithm that automatically clusters limb parts according to their motion. Ensemble predictor models are learnt online to build the robots' forward models. The proposed heterogeneous online ensemble learning algorithm combines predictors of different natures, namely Echo State Networks, Online Infinite Echo State Gaussian Processes, Locally Weighted Projection Regression and recursive ARX models. The ensemble prediction is computed online at each time step as the weighted sum of the base model predictions, following a Bayesian model combination approach. To give the robots a longer prediction horizon, k-step-ahead prediction models are realised by chaining single-step-ahead predictors.
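The weighted combination and the chained k-step-ahead prediction described above can be illustrated with a minimal sketch. Everything named here is hypothetical: the toy one-dimensional "gain" models stand in for the real base learners (Echo State Networks, LWPR and so on), and the Gaussian-likelihood weight update is an assumed, simplified rule for illustration, not necessarily the Bayesian model combination scheme used in this work.

```python
import math
from typing import Callable, List, Sequence

# A base forward model maps (state, motor command) -> predicted next state.
ForwardModel = Callable[[float, float], float]


def make_gain_model(gain: float) -> ForwardModel:
    """Hypothetical toy base model: next_state = state + gain * command."""
    return lambda state, command: state + gain * command


class WeightedEnsemble:
    """Online ensemble whose output is a weighted sum of base predictions."""

    def __init__(self, models: Sequence[ForwardModel], sigma: float = 0.1) -> None:
        self.models = list(models)
        self.sigma = sigma  # assumed observation-noise scale
        self.weights = [1.0 / len(self.models)] * len(self.models)

    def predict(self, state: float, command: float) -> float:
        # Single-step-ahead prediction: weighted sum of base model outputs.
        return sum(w * m(state, command) for w, m in zip(self.weights, self.models))

    def update(self, state: float, command: float, observed: float) -> None:
        # Assumed rule: scale each weight by a Gaussian likelihood of the
        # observed next state under that model, then renormalise.
        scaled = []
        for w, m in zip(self.weights, self.models):
            err = observed - m(state, command)
            scaled.append(w * math.exp(-(err ** 2) / (2.0 * self.sigma ** 2)))
        total = sum(scaled)
        if total > 0.0:  # keep the old weights if everything underflows
            self.weights = [s / total for s in scaled]

    def predict_k_steps(self, state: float, commands: Sequence[float]) -> List[float]:
        # k-step-ahead prediction by chaining: each prediction is fed back
        # as the input state for the next single-step prediction.
        trajectory = []
        for command in commands:
            state = self.predict(state, command)
            trajectory.append(state)
        return trajectory


def run_demo() -> WeightedEnsemble:
    """Train online on a toy system whose true gain is 0.5."""
    true_system = make_gain_model(0.5)
    ensemble = WeightedEnsemble([make_gain_model(g) for g in (0.2, 0.5, 0.9)])
    state = 0.0
    for t in range(50):
        command = math.sin(0.3 * t)  # stand-in for motor babbling
        next_state = true_system(state, command)
        ensemble.update(state, command, next_state)
        state = next_state
    return ensemble


if __name__ == "__main__":
    trained = run_demo()
    print("weights:", trained.weights)
    print("3-step prediction:", trained.predict_k_steps(0.0, [1.0, 1.0, 1.0]))
```

In this toy setting the weight of the model that matches the true system grows towards one as observations arrive. Note that in a chained prediction any single-step error is fed back into later steps, which is why single-step accuracy matters for long horizons.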
The ensemble predictors are compared against the single base models, against an offline tree-based bagging ensemble for regression, and against homogeneous online ensembles. Single- and multiple-step-ahead predictions obtained with the proposed online heterogeneous ensemble learner are more accurate than those of all the alternative solutions, achieving up to 85% and up to 98% better accuracy, respectively. To demonstrate that the learnt predictors generalise beyond the training data, we have evaluated their performance on gestures that take longer to complete, e.g. waving and pointing. The proposed ensemble method obtains highly accurate predictions in both joint and vision space. The experimental results demonstrate that the proposed heterogeneous ensemble benefits the learning of forward models, providing better performance than homogeneous solutions and achieving highly accurate short- and long-term predictions in an online manner.