Interacting embodied agents, be they groups of adult humans engaged in a coordinated task, autonomous robots acting in an environment, or a mother teaching a child, must seamlessly coordinate their actions to achieve a collaborative goal. Some forms of human collaborative and coordinated behavior (such as maintaining a conversation, or jointly solving a complex problem) appear to happen effortlessly, as if the participants could read each other's minds and understand each other's communicative intent. At an elementary level, inter-agent coordination depends crucially on external, observable behaviors through which one participant's actions organize those of the other - behaviors such as eye movements, head turns, and hand gestures.

In the present study, we have developed a multimodal sensing system which consists of multiple video cameras, audio recording devices, a motion tracking system, and eye tracking systems. We have conducted several experiments on child-parent interaction and human-robot interaction in which participants were engaged in naturalistic everyday tasks. Multimodal data streams were collected and analyzed to discover how two agents coordinate their behaviors to maintain a smooth interaction. Our approach is based on the generating partition - a nonlinear time series analysis technique - which symbolizes a time series so that the symbolic sequence preserves the maximum information about the intrinsic nonlinear dynamical system. Unlike most previous approaches, which attempt to approximate the generating partition through various deterministic symbolization processes, our algorithm maintains and estimates a probability distribution over a symbol set for each data point in a time series. To do so, we develop a Bayesian framework for probabilistic symbolization and demonstrate that this approach can successfully discover irrelevant dimensions in a time series and perform the probabilistic symbolization process only on the relevant dimensions. After symbolization, information-theoretic measures are applied to the symbolic sequences to further quantify information flows within an agent and across two agents. Various sensorimotor patterns are found to be predictive of successful interaction. Our results also suggest the importance of studying and understanding real-time adaptive behaviors in human-human and human-robot interactions.
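The core idea of probabilistic symbolization - keeping a distribution over symbols for each data point rather than a hard label - can be illustrated with a minimal sketch. The Gaussian emission model, the function name `symbol_posteriors`, and all parameter values below are illustrative assumptions, not the paper's actual Bayesian framework:

```python
import math


def symbol_posteriors(series, means, sigmas, priors):
    """Soft symbolization: for each sample, return a posterior distribution
    over symbols via Bayes' rule, assuming (hypothetically) that each symbol
    emits values from its own Gaussian."""
    out = []
    for x in series:
        # Unnormalized posterior: prior * Gaussian likelihood per symbol.
        likes = [
            p * math.exp(-((x - m) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))
            for m, s, p in zip(means, sigmas, priors)
        ]
        z = sum(likes)
        out.append([l / z for l in likes])
    return out


# Two hypothetical symbols centered at 0.0 and 1.0; each sample gets a
# probability over both symbols instead of a single deterministic label.
posts = symbol_posteriors([0.1, 0.9, 0.5],
                          means=[0.0, 1.0], sigmas=[0.3, 0.3], priors=[0.5, 0.5])
```

A sample near 0.0 leans toward the first symbol, a sample near 1.0 toward the second, and an ambiguous sample at 0.5 keeps roughly equal mass on both - the uncertainty a deterministic partition would discard.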
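The paper does not specify which information-theoretic measures are applied to the symbolic sequences; transfer entropy is one standard measure of directed information flow between two symbol streams, sketched here with a plug-in estimator and history length 1 (both choices are assumptions for illustration):

```python
import math
from collections import Counter


def transfer_entropy(x, y):
    """Estimate transfer entropy TE(Y -> X) in bits, with history length 1,
    using plug-in probabilities over symbol triples (x_{t+1}, x_t, y_t)."""
    n = len(x) - 1
    triples = Counter(zip(x[1:], x[:-1], y[:-1]))   # (x_{t+1}, x_t, y_t)
    pairs_xx = Counter(zip(x[1:], x[:-1]))          # (x_{t+1}, x_t)
    pairs_xy = Counter(zip(x[:-1], y[:-1]))         # (x_t, y_t)
    singles_x = Counter(x[:-1])                     # x_t
    te = 0.0
    for (x1, x0, y0), c in triples.items():
        # p(x1,x0,y0) * log2[ p(x1|x0,y0) / p(x1|x0) ], in count form.
        te += (c / n) * math.log2(
            (c * singles_x[x0]) / (pairs_xx[(x1, x0)] * pairs_xy[(x0, y0)])
        )
    return te


# Toy coupled pair: x copies y with a one-step lag, so information flows Y -> X.
y = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
x = [0] + y[:-1]
te_coupled = transfer_entropy(x, y)

# With a constant source there is nothing to transfer.
te_null = transfer_entropy([0, 1, 0, 1, 0, 1], [1, 1, 1, 1, 1, 1])
```

In this toy case the coupled pair yields strictly positive transfer entropy while the constant source yields zero, mirroring how directed measures of this kind could quantify how one agent's behavior (e.g., gaze) organizes the other's.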